GCP Composer Airflow - unable to install packages using PyPI

I have created a Composer environment with image version composer-2.0.13-airflow-2.2.5.
When I try to install packages using PyPI, it fails.
Details below:
Command:
gcloud composer environments update $AIRFLOW --location us-east1 --update-pypi-packages-from-file requirements.txt
requirements.txt
---------------
google-api-core
google-auth
google-auth-oauthlib
google-cloud-bigquery
google-cloud-core
google-cloud-storage
google-crc32c
google-resumable-media
googleapis-common-protos
google-endpoints
joblib
json5
jsonschema
pandas
requests
requests-oauthlib
Error:
Karans-MacBook-Pro:composer_dags karanalang$ gcloud composer environments update $AIRFLOW --location us-east1 --update-pypi-packages-from-file requirements.txt
Waiting for [projects/versa-sml-googl/locations/us-east1/environments/versa-composer3] to be updated with [projects/versa-sml-googl/locations/us-east1/operations/c23b77a9-f46b-4222-bafd-62527bf27239]..
.failed.
ERROR: (gcloud.composer.environments.update) Error updating [projects/versa-sml-googl/locations/us-east1/environments/versa-composer3]: Operation [projects/versa-sml-googl/locations/us-east1/operations/c23b77a9-f46b-4222-bafd-62527bf27239] failed: Failed to install PyPI packages. looker-sdk 22.4.0 has requirement attrs>=20.1.0; python_version >= "3.7", but you have attrs 17.4.0.
Check the Cloud Build log at https://console.cloud.google.com/cloud-build/builds/60ac972a-8f5e-4b4f-a4a7-d81049fb19a3?project=939354532596 for details. For detailed instructions see https://cloud.google.com/composer/docs/troubleshooting-package-installation
Please note:
I have an older Composer cluster (Composer version 1.16.8, Airflow version 1.10.15), where the above command works fine.
However, it is not working with the new cluster.
What needs to be done to debug/fix this?
Thanks in advance!

I was able to get this working using the following code:
path = "gs://dataproc-spark-configs/pip_install.sh"
CLUSTER_GENERATOR_CONFIG = ClusterGenerator(
project_id=PROJECT_ID,
zone="us-east1-b",
master_machine_type="n1-standard-4",
worker_machine_type="n1-standard-4",
num_workers=4,
storage_bucket="dataproc-spark-logs",
init_actions_uris=[path],
metadata={'PIP_PACKAGES': 'pyyaml requests pandas openpyxl kafka-python'},
).make()
with models.DAG(
'Versa-Alarm-Insights-UsingComposer2',
# Continue to run DAG twice per day
default_args=default_dag_args,
schedule_interval='0 0/12 * * *',
catchup=False,
) as dag:
create_dataproc_cluster = DataprocCreateClusterOperator(
task_id="create_dataproc_cluster",
cluster_name="versa-composer2",
region=REGION,
cluster_config=CLUSTER_GENERATOR_CONFIG
)
The earlier approach, installing packages by reading them from a requirements file, was working in Composer 1 (Airflow 1.x) but fails with Composer 2.x (Airflow 2.x).

From the error, it is clear that you are running an old version of the attrs package.
Run the command below and try again:
pip install attrs==20.3.0
or
pip install attrs==20.1.0
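As a quick sanity check (a sketch, not an official Composer diagnostic), you can print which attrs version a given Python environment actually resolves; note that the attrs distribution is imported as attr:

import attr  # the "attrs" distribution is imported as "attr"
print(attr.__version__)  # looker-sdk 22.4.0 requires attrs >= 20.1.0

In a Composer environment itself, the pin (for example attrs>=20.1.0) would normally go into the same requirements.txt passed to --update-pypi-packages-from-file, since packages on the workers are managed through the environment update rather than a direct pip install.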

Related

What is the correct configuration of AWS SageMaker-Python-SDK to achieve local debugging/training with an Apple M1 Pro?

I want to run an RL training job on AWS SageMaker (script given below). But since the project is complex, I was hoping to do a test run using SageMaker Local Mode (on my M1 MacBook Pro) before submitting to paid instances. However, I am struggling to make this local run succeed even with a simple training task.
I did use tensorflow-metal and tensorflow-macos when running local training jobs (without SageMaker), but I did not see anywhere to specify this in framework_version, nor am I sure that "local_gpu", which is the correct argument for a normal Linux machine with a GPU, applies to Apple Silicon (M1 Pro).
I searched all over but cannot find a case where this is addressed. (Very odd, am I doing something wrong? If so, please correct me.) If not, and anyone knows of a configuration, a Docker image, or an example properly done with an M1 Pro, please share.
I tried to run the following code, which hangs after logging in. (If you are trying to run the code, try any simple training script as entry_point, and make sure to log in with a command matching your region using the AWS CLI, like the following.)
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-east-1.amazonaws.com
##main.py
import boto3
import sagemaker
import os
import keras
import numpy as np
from keras.datasets import fashion_mnist
from sagemaker.tensorflow import TensorFlow

sess = sagemaker.Session()
#role = <'arn:aws:iam::0000000000000:role/CFN-SM-IM-Lambda-Catalog-sk-SageMakerExecutionRole-BlaBlaBla'> #KINDLY ADD YOUR ROLE HERE

(x_train, y_train), (x_val, y_val) = fashion_mnist.load_data()
os.makedirs("./data", exist_ok=True)
np.savez('./data/training', image=x_train, label=y_train)
np.savez('./data/validation', image=x_val, label=y_val)

# Train on local data. S3 URIs would work too.
training_input_path = 'file://data/training.npz'
validation_input_path = 'file://data/validation.npz'

# Store model locally. An S3 URI would work too.
output_path = 'file:///tmp/model/'

tf_estimator = TensorFlow(entry_point='mnist_keras_tf.py',
                          role=role,
                          instance_count=1,
                          instance_type='local_gpu',  # 'local' trains on the local CPU; 'local_gpu' expects a GPU
                          framework_version='2.1.0',
                          py_version='py3',
                          hyperparameters={'epochs': 1},
                          output_path=output_path)

tf_estimator.fit({'training': training_input_path, 'validation': validation_input_path})
The prebuilt SageMaker Docker images for deep learning don't have Arm-based support yet.
You can see the available Deep Learning Containers images here.
The solution is to build your own Docker image and use it with SageMaker.
This is an example Dockerfile that uses Miniconda to install the TensorFlow dependencies:
FROM arm64v8/ubuntu

RUN apt-get -y update && apt-get install -y --no-install-recommends \
    wget \
    nginx \
    ca-certificates \
    gcc \
    linux-headers-generic \
    libc-dev

RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.9.2-Linux-aarch64.sh
RUN chmod a+x Miniconda3-py38_4.9.2-Linux-aarch64.sh
RUN bash Miniconda3-py38_4.9.2-Linux-aarch64.sh -b

ENV PATH /root/miniconda3/bin/:$PATH

COPY ml-dependencies.yml ./
RUN conda env create -f ml-dependencies.yml

ENV PATH /root/miniconda3/envs/ml-dependencies/bin:$PATH
This is the ml-dependencies.yml:
name: ml-dependencies
dependencies:
  - python=3.8
  - numpy
  - pandas
  - scikit-learn
  - tensorflow==2.8.2
  - pip
  - pip:
    - sagemaker-training
And this is how you'll run the training using SageMaker script mode:
image = 'sagemaker-tensorflow2-graviton-training-toolkit-local'

california_housing_estimator = Estimator(
    image_uri=image,
    entry_point='california_housing_tf2.py',
    source_dir='code',
    role=DUMMY_IAM_ROLE,
    instance_count=1,
    instance_type='local',
    hyperparameters={'epochs': 10,
                     'batch_size': 64,
                     'learning_rate': 0.1})

inputs = {'train': 'file://./data/train', 'test': 'file://./data/test'}
california_housing_estimator.fit(inputs, logs=True)
You can find the full working sample code on the Amazon SageMaker Local Mode Examples GitHub repository here.

How to use pyhive in a Lambda function?

I've written a function that uses pyhive to read from Hive. Running it locally, it works fine. However, when trying to use it in a Lambda function I got the error:
"Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found'"
I've tried to use the guidelines in this link:
https://github.com/cloudera/impyla/issues/201
However, I wasn't able to run the suggested command:
yum install cyrus-sasl-lib cyrus-sasl-gssapi cyrus-sasl-md5
since the system I was building on is Ubuntu, which doesn't support yum.
I tried to install these packages (using apt-get):
sasl2-bin libsasl2-2 libsasl2-dev libsasl2-modules libsasl2-modules-gssapi-mit
as described in:
python cannot connect hiveserver2
But still no luck. Any ideas?
Thanks,
Nir.
You can follow this GitHub issue. I am able to connect to HiveServer2 with LDAP authentication using the pyhive library in AWS Lambda with Python 2.7. What I did to make it work:
Launch an EC2 instance or container with the AMI used by Lambda.
Run the following commands to install the required dependencies:
yum upgrade
yum install gcc
yum install gcc-g++
sudo yum install cyrus-sasl cyrus-sasl-devel cyrus-sasl-ldap  # include the cyrus-sasl dependency for the authentication mechanism you use to connect to Hive
pip install six==1.12.0
Bundle up /usr/lib64/sasl2/ into the Lambda package and set os.environ['SASL_PATH'] = os.path.join(os.getcwd(), '/path/to/sasl2'). Verify that the .so files are present on the os.environ['SASL_PATH'] path.
My Lambda code looks like:
from pyhive import hive
import logging
import os
os.environ['SASL_PATH'] = os.path.join(os.getcwd(), 'lib/sasl2')
log = logging.getLogger()
log.setLevel(logging.INFO)
log.info('Path: %s',os.environ['SASL_PATH'])
def lambda_handler(event, context):
    cursor = hive.connect(host='hiveServer2Ip', port=10000, username='userName', auth='LDAP', password='password').cursor()
    SHOW_TABLE_QUERY = "show tables"
    cursor.execute(SHOW_TABLE_QUERY)
    tables = cursor.fetchall()
    log.info('tables: %s', tables)
    log.info('done')

Installing Scrapy in Apache Airflow causes INVALID_ARGUMENT

I'm trying to install Scrapy from PyPI using the command below.
gcloud composer environments update $(AIRFLOW_ENVIRONMENT_NAME) \
--update-pypi-packages-from-file requirements.txt \
--location $(AIRFLOW_LOCATION)
requirements.txt looks like this:
google-api-python-client==1.7.*
google-cloud-datastore==1.7.*
Scrapy==2.0.0
After running the gcloud command, it fails with an invalid argument error, although the installation runs successfully in my local environment.
gcloud composer environments update xxxx \
--update-pypi-packages-from-file requirements.txt \
--location asia-northeast1
ERROR: (gcloud.composer.environments.update) INVALID_ARGUMENT: Found 1 problem:
1) Error validating key Scrapy. PyPi dependency name is not formatted properly. It must be lowercase and follow the format of 'identifier' specified in PEP-508.
Is there any way to install it?
As the previous answer stated, the error that you are receiving is quite clear and is caused by the wrong formatting of the dependency. It should be scrapy==2.0.0 instead of Scrapy==2.0.0 inside requirements.txt.
I would like to add that, to avoid the installation error once you fix the formatting, you should add one more dependency to your list: attrs==19.2.0. I was able to install your requirements in my environment by specifying the following list:
google-api-python-client==1.7.*
google-cloud-datastore==1.7.*
scrapy==2.0.0
attrs==19.2.0
Even if you adjust the package name in the requirements.txt file according to the PEP-508 prerequisites, formatting the package name in lowercase as scrapy==2.0.0, the issue will most probably remain the same and the update process will get stuck with the error:
Failed to install PyPI packages
Generally, this kind of error appears when the source PyPI package has external dependencies, or when the package is sensitive to system-level libraries that GCP Composer doesn't support.
In this case the vendor recommends two ways: either use KubernetesPodOperator with your own custom image running in a dedicated Kubernetes pod, or deploy the PyPI package as a local Python library, uploading the shared object libraries for the PyPI dependency to the Airflow /plugins directory; find more info here.
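For illustration only, here is a minimal sketch of the KubernetesPodOperator route; the task id, namespace, image name, and spider command are hypothetical, and it assumes a custom image with Scrapy already built and pushed:

from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

run_scrapy = KubernetesPodOperator(
    task_id='run_scrapy_in_pod',                     # hypothetical task id
    name='scrapy-pod',
    namespace='default',                             # assumption: adjust to your Composer/GKE setup
    image='gcr.io/my-project/scrapy-runner:latest',  # hypothetical image with Scrapy baked in
    cmds=['scrapy', 'crawl', 'my_spider'],           # hypothetical spider name
    dag=dag,
)

This keeps Scrapy and its system-level dependencies inside the container image, so the Composer environment's own package set is left untouched.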

RuntimeError: This command is using a remote connection in offline mode. [CondaError]

When I created an environment in conda, I got this error just after the Proceed prompt:
[root#MyServer]#conda create -n py26 python=2.6 anaconda --offline
Fetching package metadata .........
Solving package specifications: ..............
......................
....
...
Proceed ([y]/n)? y
CondaError: RuntimeError(u'EnforceUnusedAdapter called with url https://repo.continuum.io/pkgs/free/linux-64/jpeg-8d-0.tar.bz2\nThis command is using a remote connection in offline mode.\n',)
CondaError: RuntimeError(u'EnforceUnusedAdapter called with url https://repo.continuum.io/pkgs/free/linux-64/jpeg-8d-0.tar.bz2\nThis command is using a remote connection in offline mode.\n',)
CondaError: RuntimeError(u'EnforceUnusedAdapter called with url https://repo.continuum.io/pkgs/free/linux-64/jpeg-8d-0.tar.bz2\nThis command is using a remote connection in offline mode.\n',)
even though I can see that my env was created successfully:
[root#MyServer]# conda env list
# conda environments:
#
py26 /opt/Anaconda/Anaconda2-4.4.0/envs/py26
py27 /opt/Anaconda/Anaconda2-4.4.0/envs/py27
root * /opt/Anaconda/Anaconda2-4.4.0
Does that error affect the environment I have created?
After a few hours of searching, I found out that this issue comes from a bug in Conda version 4.3.x: GitHub
To fix this issue, you will have to install Conda 4.4.x.
Also, you have to check out the UPDATE on this version to enable conda in your shell.
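If it helps, here is a quick sketch for confirming which Conda line you are on before and after the upgrade (run with the base installation's Python; the conda package exposes its version):

import conda
print(conda.__version__)  # the EnforceUnusedAdapter error is reported against 4.3.x; 4.4.x contains the fix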

Using tensorflow.contrib.data.Dataset in Cloud ML

Recently I changed my data pipeline in TensorFlow from threading to the new Dataset API, which is pretty convenient when you want to validate your model each epoch.
I've noticed that the current runtime version of TensorFlow in Cloud ML is 1.2. Nevertheless, I've tried to use a nightly build of TensorFlow v1.3, but the pip installation fails with:
AssertionError: tensorflow==1.3.0 .dist-info directory not found
Command '['pip', 'install', '--user', '--upgrade', '--force-reinstall', '--no-deps', u'tensorflow-1.3.0-cp27-none-linux_x86_64.whl']' returned non-zero exit status 2
Has anyone succeeded in using tensorflow.contrib.data.Dataset with Cloud ML Engine?
This worked for me: create a setup.py file with the following content:
from setuptools import find_packages
from setuptools import setup
REQUIRED_PACKAGES = ['tensorflow==1.3.0']
setup(
    name='trainer',
    version='0.1',
    install_requires=REQUIRED_PACKAGES,
    packages=find_packages(),
    include_package_data=True,
    description='Upgrading tf to 1.3')
More info on the setup.py file is available at: Packaging a Training Application.
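Once TensorFlow 1.3 is pulled in through setup.py, the kind of input pipeline the question refers to looks roughly like the sketch below (TF 1.3-era contrib API; the feature/label arrays and batch size are placeholders):

import tensorflow as tf

def input_fn(features, labels, batch_size=32):
    # Build a Dataset from in-memory arrays (placeholders for real training data).
    dataset = tf.contrib.data.Dataset.from_tensor_slices((features, labels))
    # Shuffle, repeat over epochs, and batch.
    dataset = dataset.shuffle(buffer_size=1000).repeat().batch(batch_size)
    # Return (features, labels) tensors via a one-shot iterator.
    return dataset.make_one_shot_iterator().get_next()

With tf.estimator, a function like this is what you would pass as the input_fn for training and evaluation.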