Cannot import a custom package into Google Cloud ML Engine

I created my own package to use in a Google Cloud ML Engine job.
To use the package, I followed the instructions in the official Google Cloud documentation.
That is, I archived the package as a tar.gz file and uploaded it to a Cloud Storage bucket.
When I start the job, I get the following error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.5/tokenize.py", line 454, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-req-build-iprehs_0/setup.py'
Please describe the correct procedure for importing a custom package, or share a link to an example where this is done.
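The traceback shows pip failing to find setup.py in its build directory, which suggests (though the post does not confirm it) that the uploaded archive is not laid out as a source distribution. The sketch below checks an archive before upload. The function name archive_has_setup_py is hypothetical, and the layout rule it encodes (setup.py at the archive root, or inside a single top-level directory as in a standard sdist) is an assumption about what pip expects.

```python
import posixpath
import tarfile


def archive_has_setup_py(archive_path):
    """Return True if setup.py is where pip is assumed to look for it.

    Assumption: pip accepts setup.py either at the archive root or inside
    a single top-level directory (the usual sdist layout).
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        # Normalise names such as "./pkg/setup.py" to "pkg/setup.py".
        names = [posixpath.normpath(m.name) for m in tar.getmembers() if m.isfile()]

    if "setup.py" in names:
        return True

    top_level = {name.split("/")[0] for name in names}
    if len(top_level) == 1:
        only_dir = next(iter(top_level))
        return posixpath.join(only_dir, "setup.py") in names
    return False
```

If the check fails, building the archive with python setup.py sdist from a directory that contains setup.py, rather than compressing the package folder by hand, is the usual way to get the expected layout.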

Related

Submit a pyspark job with a config file on Dataproc

I'm new to GCP and I'm struggling to submit a PySpark job to Dataproc.
I have a Python script that depends on a config.yaml file, and I noticed that when I submit the job, everything is executed under /tmp/.
How can I make that config file available in the /tmp/ folder?
At the moment, I get this error:
12/22/2020 10:12:27 AM root INFO Read config file.
Traceback (most recent call last):
File "/tmp/job-test4/train.py", line 252, in <module>
run_training(args)
File "/tmp/job-test4/train.py", line 205, in run_training
with open(args.configfile, "r") as cf:
FileNotFoundError: [Errno 2] No such file or directory: 'gs://network-spark-migrate/model/demo-config.yml'
Thanks in advance
The snippet below worked for me:
gcloud dataproc jobs submit pyspark gs://network-spark-migrate/model/train.py --cluster train-spark-demo --region europe-west6 --files=gs://network-spark-migrate/model/demo-config.yml -- --configfile ./demo-config.yml
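The original failure comes from passing a gs:// URI to Python's built-in open(), which only understands local paths. Going by the command above, a file passed with --files ends up in the job's working directory under its base name. A minimal sketch of that mapping follows. The helper name staged_local_path is hypothetical, and it assumes the file really was staged with --files.

```python
import posixpath
from urllib.parse import urlparse


def staged_local_path(uri):
    """Map a gs:// URI to the relative path the staged copy is assumed to have.

    Assumption: the file was passed via --files, so a copy named after the
    object's base name sits in the job's working directory.
    """
    parsed = urlparse(uri)
    if parsed.scheme == "gs":
        return "./" + posixpath.basename(parsed.path)
    # Anything else is treated as an ordinary local path.
    return uri
```

With this, train.py could call open(staged_local_path(args.configfile), "r") and accept either form of the argument.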

"ImportError: No module named idlelib" when running Google Dataflow worker

I have a Python 2.7 script that I run locally to launch an Apache Beam / Google Dataflow job (SDK 2.12.0). The job takes a CSV file from a Google Storage bucket, processes it, and then creates an entity in Google Datastore for each row. The script ran fine for years, but now it is failing:
INFO:root:2019-05-15T22:07:11.481Z: JOB_MESSAGE_DETAILED: Workers have started successfully.
INFO:root:2019-05-15T21:47:13.370Z: JOB_MESSAGE_ERROR: Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 773, in run
self._load_main_session(self.local_staging_directory)
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
pickler.load_session(session_file)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 280, in load_session
return dill.load_session(file_path)
File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 410, in load_session
module = unpickler.load()
File "/usr/lib/python2.7/pickle.py", line 864, in load
dispatch[key](self)
File "/usr/lib/python2.7/pickle.py", line 1139, in load_reduce
value = func(*args)
File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 827, in _import_module
return __import__(import_name)
ImportError: No module named idlelib
I believe this error is happening at the worker level (not locally). I don't reference idlelib anywhere in my script. To make sure the problem wasn't on my side, I installed updates for all google-cloud packages, apache-beam[gcp], etc. locally, just in case. I also tried importing idlelib in my script, but I get the same error. Any suggestions?
It had been fine for years and started failing with the SDK 2.12.0 release.
What was the last release that this script succeeded on? 2.11?
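One possible reading of the traceback: dill fails inside _import_module while restoring the pickled main session, so something in the launching script's global namespace may refer to idlelib even though the script never imports it explicitly. The diagnostic sketch below lists every global name bound to a module object. The helper name module_references is hypothetical, and nothing in the thread confirms this is the cause.

```python
import types


def module_references(namespace):
    """Map each name bound to a module object to that module's real name."""
    return {
        name: value.__name__
        for name, value in namespace.items()
        if isinstance(value, types.ModuleType)
    }
```

Calling module_references(globals()) in the launching script just before the pipeline is built would show whether an idlelib entry is present in the session that gets pickled.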

AWS DataPipeline Machine Learning AMI tensorflow issues

I'm running the AWS Machine Learning AMI on an EC2 instance. I've confirmed that, from the terminal, both python and jupyter can run
import tensorflow as tf
along with
python pytest.py
from the terminal (which contains the above tensorflow import), with no issues.
I'm now trying to automate my script using DataPipeline along with TaskRunner. The bash command in DataPipeline is again, just:
python pytest.py
However, I immediately get the following error:
Traceback (most recent call last):
File "pytest.py", line 1, in <module>
import tensorflow as tf
File "/usr/lib/python2.7/dist-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import *
File "/usr/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 72, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 61, in <module>
from tensorflow.python import pywrap_tensorflow
File "/usr/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
_pywrap_tensorflow = swig_import_helper()
File "/usr/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
ImportError: libcudart.so.7.5: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md#import_error
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
It seems that AWS DataPipeline (or TaskRunner?) uses a different environment setup, because, again, I have no issues running the script through an SSH terminal on the instance. I found a few posts that suggested adding CUDA to LD_LIBRARY_PATH, but the AMI instance already has it:
echo $LD_LIBRARY_PATH
/home/ec2-user/src/torch/install/lib:/home/ec2-user/src/cntk/bindings/python/cntk/libs:/usr/local/cuda/lib64:/usr/local/lib:/usr/lib:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/mpi/lib:/home/ec2-user/src/mxnet/mklml_lnx_2017.0.1.20161005/lib:
which clearly contains the CUDA library path that TensorFlow needs.
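The echo above only shows what the interactive SSH shell sees. Whether the process started by TaskRunner inherits the same value is an open question in this post. A small sketch like the following could be called at the top of pytest.py in both settings to compare. The helper name find_shared_library is hypothetical.

```python
import os


def find_shared_library(filename, search_path):
    """Return every path where `filename` exists along a colon-separated search path."""
    hits = []
    for directory in search_path.split(":"):
        if not directory:
            continue  # skip empty entries such as a trailing colon
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


# Example use inside the script, before importing tensorflow:
#   path = os.environ.get("LD_LIBRARY_PATH", "")
#   print(repr(path), find_shared_library("libcudart.so.7.5", path))
```

An empty result under DataPipeline but not under SSH would point to the environment differing between the two, rather than to the TensorFlow installation itself.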

AWS command line tools broken : (

I tried to install awscli after ebcli, and they both broke. Currently, if I type aws s3 ls, it just hangs with no response, and if I try to use eb, I get this error:
Traceback (most recent call last):
File "/usr/local/bin/eb", line 11, in <module>
load_entry_point('awsebcli==3.8.4', 'console_scripts', 'eb')()
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 565, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2631, in load_entry_point
return ep.load()
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2291, in load
return self.resolve()
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2297, in resolve
module = __import__(self.module_name, fromlist=['__name__'], level=0)
File "/usr/local/lib/python2.7/dist-packages/ebcli/core/ebcore.py", line 43, in <module>
from . import ebglobals, base, io, hooks
File "/usr/local/lib/python2.7/dist-packages/ebcli/core/base.py", line 19, in <module>
from ebcli import __version__
ImportError: cannot import name __version__
I basically need to have command line tools for s3 and elastic beanstalk, but I apparently have no luck, and will be spending my entire day googling the universe, and combing through error codes to try and fix this : (
I'm on Ubuntu 14.04 on a Thinkpad.
It is quite common for different Python libraries to install over each other, causing problems like this.
A popular fix is to use the virtualenv tool to create isolated Python environments.
The AWS documentation for awsebcli has a page showing how: Install the EB CLI in a Virtual Environment
Alternatively, keep using the AWS Command-Line Interface (CLI) since it works across all AWS services, rather than using service-specific command sets like awsebcli (which pre-date the CLI).

boto3 throws an error when packaged as an RPM

I am using boto3 in my project, and when I package it as an RPM it raises an error while initializing the EC2 client.
<class 'botocore.exceptions.DataNotFoundError'>:Unable to load data for: _endpoints. Traceback -Traceback (most recent call last):
File "roboClientLib/boto/awsDRLib.py", line 186, in _get_ec2_client
File "boto3/__init__.py", line 79, in client
File "boto3/session.py", line 200, in client
File "botocore/session.py", line 789, in create_client
File "botocore/session.py", line 682, in get_component
File "botocore/session.py", line 809, in get_component
File "botocore/session.py", line 179, in <lambda>
File "botocore/session.py", line 475, in get_data
File "botocore/loaders.py", line 119, in _wrapper
File "botocore/loaders.py", line 377, in load_data
DataNotFoundError: Unable to load data for: _endpoints
Can anyone help me here? boto3 probably needs to resolve some resources at run time, which it is unable to do when installed from the RPM.
I tried setting LD_LIBRARY_PATH in /etc/environment, but that did not work.
export LD_LIBRARY_PATH="/usr/lib/python2.6/site-packages/boto3:/usr/lib/python2.6/site-packages/boto3-1.2.3.dist-info:/usr/lib/python2.6/site-packages/botocore:
I faced the same issue:
botocore.exceptions.DataNotFoundError: Unable to load data for: ec2/2016-04-01/service-2
I found that the corresponding data directory was missing. Updating botocore with the following command solved my issue:
pip install --upgrade botocore
Botocore depends on a set of service definition files that it uses to generate clients on the fly. Boto3 further depends on another set of files that it uses to generate resource clients. You will need to include these in any installs of boto3 or botocore. The files will need to be located in the 'data' folder of the root of the respective library.
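A quick way to verify that an installed package still has its data folder is to look for it next to the package's __init__.py. The sketch below does only that. The helper name missing_data_dirs is hypothetical, and it does not check that the folder's contents are complete.

```python
import os


def missing_data_dirs(package_roots):
    """Return the package roots that do not contain a 'data' directory."""
    return [
        root
        for root in package_roots
        if not os.path.isdir(os.path.join(root, "data"))
    ]


# Example use on the machine where the RPM is installed:
#   import boto3, botocore
#   roots = [os.path.dirname(boto3.__file__), os.path.dirname(botocore.__file__)]
#   print(missing_data_dirs(roots))
```

Any path it reports would suggest that the RPM spec left out the non-Python files when packaging.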
I faced a similar issue, which was caused by an old version of botocore. Once I updated it, everything started working.
Please consider using the command below.
pip install --upgrade botocore
Also, please make sure you have set up a boto configuration profile.
Boto3 searches for credentials in the following order:
1. Passing credentials as parameters in the boto3.client() method
2. Passing credentials as parameters when creating a Session object
3. Environment variables
4. Shared credential file (~/.aws/credentials)
5. AWS config file (~/.aws/config)
6. Assume Role provider
7. Boto2 config file (/etc/boto.cfg and ~/.boto)
8. Instance metadata service on an Amazon EC2 instance that has an IAM role configured.
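To see which of the file-based and environment-based sources from the list above are actually present on a machine, a sketch along these lines may help. The helper name present_credential_sources is hypothetical. It covers only three of the sources, and it checks for presence only, not for whether the credentials are valid.

```python
import os


def present_credential_sources(environ, home):
    """Report which of three common credential sources exist, in lookup order."""
    found = []
    if "AWS_ACCESS_KEY_ID" in environ and "AWS_SECRET_ACCESS_KEY" in environ:
        found.append("environment variables")
    if os.path.isfile(os.path.join(home, ".aws", "credentials")):
        found.append("shared credential file")
    if os.path.isfile(os.path.join(home, ".aws", "config")):
        found.append("AWS config file")
    return found


# Example use:
#   print(present_credential_sources(os.environ, os.path.expanduser("~")))
```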