
Elephas Not Loaded In Pyspark: No Module Named Elephas.spark_model

I am trying to distribute Keras training on a cluster using Elephas. But when I run the basic example from the Elephas docs (https://github.com/maxpumperla/elephas), the job fails with the error from the title: No module named elephas.spark_model.

Solution 1:

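I found a way to load a virtual environment on the master and all the workers: zip the environment (with elephas and its dependencies already installed in it), ship it with --archives so that YARN unpacks it on every node under the alias SP, and point PYSPARK_PYTHON at the interpreter inside it: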

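# Make the existing environment (with elephas installed) relocatable, then zip its contents
virtualenv venv --relocatable
cd venv
zip -qr ../venv.zip *
cd ..

# Ship venv.zip to every node; it is unpacked under the alias SP
PYSPARK_PYTHON=./SP/bin/python spark-submit --master yarn --deploy-mode cluster --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./SP/bin/python --driver-memory 4G --archives venv.zip#SP filename.py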

More details in the GitHub Issue: https://github.com/maxpumperla/elephas/issues/80#issuecomment-371073492
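
To check that elephas is visible on the executors and not only on the driver, a small probe can be run inside the job. This is a minimal sketch and not part of the original answer: probe_module and probe_cluster are hypothetical helper names, and sc is assumed to be an existing SparkContext.

```python
import importlib.util
import socket
import sys


def probe_module(name="elephas.spark_model"):
    """Return (hostname, interpreter path, importable?) for the current process."""
    try:
        found = importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # ImportError covers a missing parent package (e.g. elephas itself)
        # or a dependency that fails while the parent package is imported.
        found = False
    return socket.gethostname(), sys.executable, found


def probe_cluster(sc, name="elephas.spark_model", partitions=8):
    """Run probe_module inside executor tasks and collect the distinct results.

    `sc` is assumed to be a live pyspark SparkContext. `partitions` is an
    arbitrary task count and does not guarantee that every node is reached.
    """
    return (
        sc.parallelize(range(partitions), partitions)
        .map(lambda _: probe_module(name))
        .distinct()
        .collect()
    )
```

Calling probe_cluster(sc) at the top of filename.py returns one tuple per distinct executor process that ran a task. If the archive was shipped correctly, the interpreter path in each tuple should point inside the unpacked SP directory and the last field should be True.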

Solution 2:

Pass the elephas library to your spark-submit command with --py-files so that it is distributed to the workers together with your application.

Citing the official guide:

For Python, you can use the --py-files argument of spark-submit to add .py, .zip or .egg files to be distributed with your application. If you depend on multiple Python files we recommend packaging them into a .zip or .egg.

Official guide
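
If elephas is only installed on the machine you submit from, one way to get a file for --py-files is to zip the installed package directory. The following is a minimal sketch, assuming the package is pure Python; zip_package is a hypothetical helper, not part of Spark or elephas. It ships only the package itself, so its dependencies (Keras and so on) still have to be available on the workers, for example through the environment from Solution 1.

```python
import shutil
from pathlib import Path


def zip_package(package_dir, out_dir="."):
    """Zip a package directory so the package folder is the archive's top-level entry.

    Keeping the folder name at the top level is what lets Python import the
    package directly from the zip once the zip is on sys.path.
    Returns the path of the created archive.
    """
    package_dir = Path(package_dir).resolve()
    base_name = str(Path(out_dir).resolve() / package_dir.name)
    return shutil.make_archive(
        base_name,
        "zip",
        root_dir=str(package_dir.parent),  # archive paths are relative to this
        base_dir=package_dir.name,         # only this folder is included
    )
```

For example, after import elephas on the submitting machine, zip_package(Path(elephas.__file__).parent) would write elephas.zip to the current directory, which can then be passed as spark-submit --py-files elephas.zip filename.py.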
