I'm trying to get NLTK and Wordnet working on a lambda via CodeBuild.
It looks like it installs fine in CloudFormation, but I get the following error in the Lambda:
START RequestId: c660c446-e1c4-11e8-8047-15f59f1e002c Version: $LATEST
Unable to import module 'index': No module named 'nltk'
END RequestId: c660c446-e1c4-11e8-8047-15f59f1e002c
REPORT RequestId: c660c446-e1c4-11e8-8047-15f59f1e002c Duration: 2.10 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 21 MB
However when I check, it installed fine in CodeBuild:
[Container] 2018/11/06 12:45:06 Running command pip install -U nltk
Collecting nltk
Downloading https://files.pythonhosted.org/packages/50/09/3b1755d528ad9156ee7243d52aa5cd2b809ef053a0f31b53d92853dd653a/nltk-3.3.0.zip (1.4MB)
Requirement already up-to-date: six in /usr/local/lib/python2.7/site-packages (from nltk)
Building wheels for collected packages: nltk
Running setup.py bdist_wheel for nltk: started
Running setup.py bdist_wheel for nltk: finished with status 'done'
Stored in directory: /root/.cache/pip/wheels/d1/ab/40/3bceea46922767e42986aef7606a600538ca80de6062dc266c
Successfully built nltk
Installing collected packages: nltk
Successfully installed nltk-3.3
Here is the actual python code:
import json
import datetime
import nltk
from nltk.corpus import wordnet as wn
And here is the YML file:
version: 0.2
phases:
  install:
    commands:
      # Upgrade AWS CLI to the latest version
      - pip install --upgrade awscli
      # Install nltk & WordNet
      - pip install -U nltk
      - python -m nltk.downloader wordnet
  pre_build:
    commands:
      # Discover and run unit tests in the 'tests' directory. For more information, see <https://docs.python.org/3/library/unittest.html#test-discovery>
      # - python -m unittest discover tests
  build:
    commands:
      # Use AWS SAM to package the application by using AWS CloudFormation
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml
artifacts:
  type: zip
  files:
    - template-export.yml
Any idea why it installs fine in CodeBuild but can't access the module NLTK in the Lambda? For reference the code runs fine in the lambda if you just remove NLTK.
I have a feeling this is a YML file issue, but I'm not sure what, given that NLTK installs fine.
NLTK was installed only locally, on the machine where the CodeBuild job was running, so it never reached the package that is uploaded for the Lambda. You need to copy NLTK into the CloudFormation deployment package. Your buildspec.yml will then look something like this:
phases:
  install:
    commands:
      # Upgrade AWS CLI to the latest version
      - pip install --upgrade awscli
  pre_build:
    commands:
      - virtualenv /venv
      # Activate the virtualenv so that pip installs into /venv
      - . /venv/bin/activate
      # Install nltk & WordNet
      - pip install -U nltk
      - python -m nltk.downloader wordnet
  build:
    commands:
      # Copy the installed packages next to the Lambda code (adjust python3.6 to your Python version)
      - cp -r /venv/lib/python3.6/site-packages/. ./
      # Use AWS SAM to package the application by using AWS CloudFormation
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml
Additional reading:
Create Deployment Package Using a Python Environment Created with Virtualenv
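To confirm that the copy step worked before aws cloudformation package runs, you can check that the package directory really sits next to the handler. This is only a minimal sketch, assuming the Lambda code lives in the build's working directory and that the dependencies are plain (non-namespace) packages; missing_packages is a hypothetical helper name, not part of NLTK or the AWS CLI.

```python
import os


def missing_packages(build_dir, packages):
    """Return the names in `packages` that are not bundled in build_dir.

    A name counts as bundled if build_dir/<name>/__init__.py or
    build_dir/<name>.py exists (assumption: no namespace packages).
    """
    missing = []
    for name in packages:
        as_package = os.path.join(build_dir, name, "__init__.py")
        as_module = os.path.join(build_dir, name + ".py")
        if not (os.path.isfile(as_package) or os.path.isfile(as_module)):
            missing.append(name)
    return missing


# Hypothetical usage from the build phase, run in the directory being packaged:
#   absent = missing_packages(os.getcwd(), ["nltk", "six"])
#   if absent: raise SystemExit("Not bundled: " + ", ".join(absent))
```

If the list comes back non-empty, the Lambda will fail with the same "No module named ..." error shown in the question.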
Ok, so thanks to laika for pointing me in the right direction.
This is a working deployment of NLTK & Wordnet to Lambda via CodeStar / CodeBuild. Some things to keep in mind:
1) You cannot use source venv/bin/activate as it is not POSIX compliant. Use . venv/bin/activate as below instead.
2) You must set the path for NLTK as shown in the define directories section.
buildspec.yml
version: 0.2
phases:
  install:
    commands:
      # Upgrade AWS CLI & PIP to the latest version
      - pip install --upgrade awscli
      - pip install --upgrade pip
      # Define Directories
      - export HOME_DIR=`pwd`
      - export NLTK_DATA=$HOME_DIR/nltk_data
  pre_build:
    commands:
      - cd $HOME_DIR
      # Create VirtualEnv to package for lambda
      - virtualenv venv
      - . venv/bin/activate
      # Install Supporting Libraries
      - pip install -U requests
      # Install WordNet
      - pip install -U nltk
      - python -m nltk.downloader -d $NLTK_DATA wordnet
      # Output Requirements
      - pip freeze > requirements.txt
      # Unit Tests
      # - python -m unittest discover tests
  build:
    commands:
      - cd $HOME_DIR
      - mv $VIRTUAL_ENV/lib/python3.6/site-packages/* .
      # Use AWS SAM to package the application by using AWS CloudFormation
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml
artifacts:
  type: zip
  files:
    - template-export.yml
If anyone has any improvements LMK. It's working for me.
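One thing to watch: NLTK_DATA is exported in the build environment, so at runtime the Lambda may still need to be told where the bundled corpus lives. Below is a minimal sketch, assuming the nltk_data directory ends up at the root of the deployment zip. LAMBDA_TASK_ROOT is the variable Lambda sets to the code directory; bundled_nltk_data_dir and configure_nltk are hypothetical helper names.

```python
import os


def bundled_nltk_data_dir(task_root=None):
    """Return the path where nltk_data is assumed to be bundled.

    Falls back to the current directory when not running inside Lambda.
    """
    root = task_root or os.environ.get("LAMBDA_TASK_ROOT", os.getcwd())
    return os.path.join(root, "nltk_data")


def configure_nltk():
    """Add the bundled data directory to NLTK's search path (idempotent)."""
    import nltk  # imported lazily; it must be bundled in the deployment package

    data_dir = bundled_nltk_data_dir()
    if data_dir not in nltk.data.path:
        nltk.data.path.append(data_dir)
```

Calling configure_nltk() at the top of index.py, before from nltk.corpus import wordnet is used, should be enough. Setting an NLTK_DATA environment variable on the function itself is an alternative.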
I am trying to do a pip install from CodeArtifact from within a docker build in AWS CodeBuild.
This article does not quite solve my problem: https://docs.aws.amazon.com/codeartifact/latest/ug/using-python-packages-in-codebuild.html
The login to AWS CodeArtifact is in the pre_build phase, outside of the Docker context.
But my pip install is inside my Dockerfile (we pull from a private PyPI registry).
How do I do this without doing something horrible, like setting an env variable to the password derived from reading ~/.config/pip/pip.conf after running the login command in pre_build?
You can use the environment variable PIP_INDEX_URL[1].
Below is an AWS CodeBuild buildspec.yml file where we construct the PIP_INDEX_URL for CodeArtifact by using this example from the AWS documentation.
buildspec.yml
pre_build:
  commands:
    - echo Getting CodeArtifact authorization...
    - export CODEARTIFACT_AUTH_TOKEN=$(aws codeartifact get-authorization-token --domain "${CODEARTIFACT_DOMAIN}" --domain-owner "${AWS_ACCOUNT_ID}" --query authorizationToken --output text)
    - export PIP_INDEX_URL="https://aws:${CODEARTIFACT_AUTH_TOKEN}@${CODEARTIFACT_DOMAIN}-${AWS_ACCOUNT_ID}.d.codeartifact.${AWS_DEFAULT_REGION}.amazonaws.com/pypi/${CODEARTIFACT_REPO}/simple/"
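For reference, the index URL has the shape https://aws:TOKEN@DOMAIN-ACCOUNT.d.codeartifact.REGION.amazonaws.com/pypi/REPO/simple/, with the token in the password position before the @. The same string can be assembled in Python, for example in a helper script. This is only a sketch: codeartifact_index_url is a hypothetical function name, and the values in the usage comment are placeholders.

```python
from urllib.parse import quote


def codeartifact_index_url(token, domain, account_id, region, repo):
    """Build a pip index URL with the auth token embedded as the password."""
    host = f"{domain}-{account_id}.d.codeartifact.{region}.amazonaws.com"
    # Percent-encode the token so it cannot break the URL's userinfo part.
    return f"https://aws:{quote(token, safe='')}@{host}/pypi/{repo}/simple/"


# Placeholder values, for illustration only:
#   codeartifact_index_url("TOKEN", "my-domain", "111122223333",
#                          "us-east-1", "my-repo")
```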
In your Dockerfile, add an ARG PIP_INDEX_URL line just above your RUN pip install -r requirements.txt so it can become an environment variable during the build process:
Dockerfile
# this needs to be added before your pip install line!
ARG PIP_INDEX_URL
RUN pip install -r requirements.txt
Finally, we build the image with the PIP_INDEX_URL build-arg.
buildspec.yml
build:
  commands:
    - echo Building the Docker image...
    - docker build -t "${IMAGE_REPO_NAME}" --build-arg PIP_INDEX_URL .
As an aside, adding ARG PIP_INDEX_URL to your Dockerfile shouldn't break any existing CI or workflows. If --build-arg PIP_INDEX_URL is omitted when building an image, pip will still use the default PyPI index.
Specifying --build-arg PIP_INDEX_URL=${PIP_INDEX_URL} is valid, but unnecessary. Specifying the argument name with no value will make Docker take its value from the environment variable of the same name[2].
Security note: If someone runs docker history ${IMAGE_REPO_NAME}, they can see the value of ${PIP_INDEX_URL}[3]. The token is only good for a maximum of 12 hours though, and you can shorten it to as little as 15 minutes with the --duration-seconds parameter of aws codeartifact get-authorization-token[4], so maybe that's acceptable. If your Dockerfile is a multi-stage build, then it shouldn't be an issue if you're not using ARG PIP_INDEX_URL in your target stage. docker build --secret does not seem to be supported in CodeBuild at this time.
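Because the token is part of the URL, it is also easy to leak it into build logs by echoing PIP_INDEX_URL. If a build script needs to print the URL, masking the password first is a cheap precaution. A minimal sketch using only the standard library; redact_credentials is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_credentials(url):
    """Return `url` with any embedded password replaced by '***'."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username or ''}:***@{host}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

This only protects log output; it does nothing about the value recorded in the image history.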
So, here is how I solved this for now. Seems kinda hacky, but it works. (EDIT: we have since switched to @phistrom's answer)
In the prebuild, I run the command and copy ~/.config/pip/pip.conf to the current build directory:
pre_build:
  commands:
    - echo Logging in to Amazon ECR...
    ...
    - echo Fetching pip.conf for PYPI
    - aws codeartifact --region us-east-1 login --tool pip --repository ....
    - cp ~/.config/pip/pip.conf .
build:
  commands:
    - docker build -t $IMAGE_REPO_NAME:$IMAGE_TAG .
    - docker tag $IMAGE_REPO_NAME:$IMAGE_TAG $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG
Then in the Dockerfile, I COPY that file in, do the pip install, then rm it
COPY requirements.txt pkg/
COPY --chown=myuser:myuser pip.conf /home/myuser/.config/pip/pip.conf
RUN pip install -r ./pkg/requirements.txt
RUN pip install ./pkg
RUN rm /home/myuser/.config/pip/pip.conf
I'm trying to create a trigger that tests a function before deploying it to Cloud Functions. So far I have managed to install requirements.txt and execute pytest, but I get the following error:
/usr/local/lib/python3.7/site-packages/ghostscript/__init__.py:35: in <module>
from . import _gsprint as gs
/usr/local/lib/python3.7/site-packages/ghostscript/_gsprint.py:515: in <module>
raise RuntimeError('Can not find Ghostscript library (libgs)')
E RuntimeError: Can not find Ghostscript library (libgs)
I have ghostscript in my requirements.txt file :
[...]
ghostscript==0.6
[...]
pytest==6.0.1
pytest-mock==3.3.1
Here is my deploy.yaml
steps:
  - name: 'docker.io/library/python:3.7'
    id: Test
    entrypoint: /bin/sh
    dir: 'My_Project/'
    args:
      - -c
      - 'pip install -r requirements.txt && pytest pytest/test_mainpytest.py -v'
From the traceback, I understand that I don't have ghostscript installed on the cloud build, which is true.
Is there a way to install ghostscript on a step of my deploy.yaml?
Edit-1:
So I tried to install ghostscript using commands in a step. I tried apt-get gs and apt-get ghostscript, but unfortunately neither worked.
The real problem is that you are missing a C library (libgs); the Python package itself was installed by pip. You should install that library with your package manager. This is an example for Ubuntu-based containers:
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'bash'
  args:
    - '-c'
    - |
      apt update
      apt install ghostscript -y
      pip install -r requirements.txt
      pytest pytest/test_mainpytest.py -v
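To see whether the system library is visible before pytest imports ghostscript, a small preflight check can help. This sketch assumes the binding looks for the library in roughly the way ctypes.util.find_library does, which the traceback suggests but does not prove; has_shared_library is a hypothetical helper name.

```python
import ctypes.util


def has_shared_library(name):
    """Return True if the dynamic loader can locate lib<name>."""
    return ctypes.util.find_library(name) is not None


# Hypothetical usage in a build step, before running the tests:
#   if not has_shared_library("gs"):
#       raise SystemExit("libgs not found; install the ghostscript system package")
```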
AWS beginner here
I have a repo in GitLab which has a python script and a requirements.txt file, and the python script has to be deployed in the EC2 ubuntu instance (and the script has to be triggered only once a day) via Gitlab CI. I am creating a deployment package of the repo using CI and through this, I am deploying the zipped package in the S3 bucket. My .gitlab-ci.yml file:
image: ubuntu:18.04
variables:
  AWS_DEFAULT_REGION: eu-central-1
  GIT_SUBMODULE_STRATEGY: recursive
  S3_TEST_BUCKET: $BUCKET_UNPACK
stages:
  - deploy
TestJob:
  stage: deploy
  script:
    - apt-get -y update
    - apt-get -y install python3-pip python3.7 zip
    - python3.7 -m pip install --upgrade pip
    - python3.7 -V
    - pip3.7 install virtualenv
    - mv iso_forest_ad.py ~ # This is the python script
    - mv requirements.txt ~
    # Setup virtual environment
    - mkdir ~/forEC2
    - cd ~/forEC2
    - virtualenv -p python3 venv
    - source venv/bin/activate
    - pip3.7 install -r ~/requirements.txt -t ~/forEC2/venv/lib/python3.7/site-packages/
    # Package environment and dependencies
    - cd ~/forEC2/venv/lib/python3.7/site-packages/
    - zip -r9 ~/forEC2/archive.zip .
    - cd ~
    - zip -g ~/forEC2/archive.zip iso_forest_ad.py
    - pip install awscli --upgrade
    - export PATH=$PATH:~/.local/bin
    - aws configure set aws_access_key_id $AWS_TEST_ACCESS_KEY_ID
    - aws configure set aws_secret_access_key $AWS_TEST_SECRET_ACCESS_KEY
    - aws configure set default.region $AWS_DEFAULT_REGION
    - aws s3 cp ~/forEC2/archive.zip $BUCKET_UNPACK/anomaly-detection-deployment.zip
Contents of requirements.txt
-i https://pypi.org/simple
joblib==0.16.0; python_version >= '3.6'
numpy==1.19.0
pandas==1.0.5
psycopg2-binary==2.8.5
python-dateutil==2.8.1; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'
pytz==2020.1
scikit-learn==0.23.1
scipy==1.5.1; python_version >= '3.6'
six==1.15.0; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'
sqlalchemy==1.3.18
threadpoolctl==2.1.0; python_version >= '3.5'
Now, I would like to transfer the script and install the dependencies in the ubuntu EC2 instance and run the script.
I know one way would be to connect to the EC2 instance and do
aws s3 sync s3://s3-bucket-name/folder /home/ubuntu
as suggested in the post: Moving files from s3 to EC2 instance. But doing this, I was not able to install the dependencies from the requirements.txt file.
I would like to know if there is an alternate way (perhaps maybe by using shell script or some other way?) for achieving this. Since I am using ubuntu locally too, using putty is not an option for me.
The link you've posted already shows one way of doing this, namely by using UserData.
You would have to write a bash script that not only downloads the zip file as shown in the link, but also unpacks it and installs the packages from requirements.txt, alongside any other dependencies or configuration you require.
So the UserData for your instance would be something like this (pseudo-code, this is only a rough example):
#!/bin/bash
apt update
apt install -y unzip awscli python3-pip # awscli is not normally on ubuntu
# cp rather than sync, because this is a single object
aws s3 cp s3://optimal-aws-nz-play-config/package.zip .
unzip package.zip -d package
cd package
pip3 install -r ./requirements.txt
If this is something you do often, you could create a launch template with the instance settings and the UserData to automatically execute these steps for each instance launched from the template.
There are also other possibilities, involving CodeDeploy or CodePipeline, but plain old UserData would be a good start.
An alternative would be to use Run Command from AWS Systems Manager. The execution of the command would be triggered from GitLab following the upload of the new S3 package.
An example of how to invoke the run-command is in the docs:
aws ssm send-command \
--document-name "AWS-RunPowerShellScript" \
--parameters commands=["echo helloWorld"] \
--targets Key=tag:Env,Values=Dev,Test
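Instead of echo helloWorld you would have to write your own bash commands to be executed. For a Linux instance such as yours, use the AWS-RunShellScript document rather than AWS-RunPowerShellScript.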
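If the GitLab job assembles this call from a script, building the argument list in Python sidesteps most shell-quoting problems with --parameters. This is only a sketch under assumptions: the tag key and values are placeholders, the commands to run are your own, and build_send_command is a hypothetical helper name. It only constructs the command line; it does not call AWS.

```python
import json
import shlex


def build_send_command(commands, tag_key, tag_values):
    """Return the argv for `aws ssm send-command` targeting tagged instances."""
    # --parameters accepts a JSON object mapping parameter names to lists.
    parameters = json.dumps({"commands": list(commands)})
    targets = f"Key=tag:{tag_key},Values={','.join(tag_values)}"
    return [
        "aws", "ssm", "send-command",
        "--document-name", "AWS-RunShellScript",  # shell document for Linux
        "--parameters", parameters,
        "--targets", targets,
    ]


# Placeholder values; print the command instead of running it.
argv = build_send_command(["echo helloWorld"], "Env", ["Dev", "Test"])
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) once the AWS CLI and credentials are available in the job.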
I'm working on a GitLab CI project where I have to push the APK to our AWS S3 bucket. For that, I have specified the keys as environment variables in the project settings of our repository. Here is my gitlab-ci.yml file:
stages:
  - build
  - deploy
variables:
  AWS_DEFAULT_REGION: us-east-2 # The region of our S3 bucket
  BUCKET_NAME: abc.bycket.info # bucket name
  FILE_NAME: ConfuRefac.apk
assembleDebug:
  stage: build
  script:
    - export ANDROID_HOME=/home/bitnami/android-sdk-linux
    - export ANDROID_NDK_HOME=/opt/android-ndk
    - export PATH=$PATH:/home/bitnami/android-sdk-linux/platform-tools/
    - export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
    - export PATH=/usr/lib/jvm/java-1.8.0-openjdk-amd64/bin:$PATH
    - chmod +x ./gradlew
    - ./gradlew assembleDebug
    - cd app/build/outputs/apk/debug
    - mv app-debug.apk ${FILE_NAME}
  artifacts:
    paths:
      - app/build/outputs/apk/debug/${FILE_NAME}
deploys3:
  image: "python:latest" # We use python because there is a well-working AWS Sdk
  stage: deploy
  dependencies:
    - assembleDebug
  script:
    - pip install awscli
    - cd app/build/outputs/apk/debug/
    - ls && pwd
    - aws s3 cp ${FILE_NAME} s3://${BUCKET_NAME}/${FILE_NAME} --recursive
So when the deploy stage starts kicking in it cannot find the file even though in ls you can clearly see that the file is indeed there.
Collecting futures<4.0.0,>=2.2.0; python_version == "2.7" (from s3transfer<0.4.0,>=0.3.0->awscli)
Using cached https://files.pythonhosted.org/packages/d8/a6/f46ae3f1da0cd4361c344888f59ec2f5785e69c872e175a748ef6071cdb5/futures-3.3.0-py2-none-any.whl
Collecting six>=1.5 (from python-dateutil<3.0.0,>=2.1->botocore==1.16.13->awscli)
Using cached https://files.pythonhosted.org/packages/65/eb/1f97cb97bfc2390a276969c6fae16075da282f5058082d4cb10c6c5c1dba/six-1.14.0-py2.py3-none-any.whl
Installing collected packages: urllib3, docutils, jmespath, six, python-dateutil, botocore, pyasn1, rsa, futures, s3transfer, PyYAML, colorama, awscli
Successfully installed PyYAML-5.3.1 awscli-1.18.63 botocore-1.16.13 colorama-0.4.3 docutils-0.15.2 futures-3.3.0 jmespath-0.10.0 pyasn1-0.4.8 python-dateutil-2.8.1 rsa-3.4.2 s3transfer-0.3.3 six-1.14.0 urllib3-1.25.9
You are using pip version 8.1.1, however version 20.1.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
$ cd ${PWD}/app/build/outputs/apk/debug/
$ ls && pwd
ConfuRefac.apk
/home/gitlab-runner/builds/CeGhSYCJ/0/root/confu-android/app/build/outputs/apk/debug
$ aws s3 cp ${PWD}/${FILE_NAME} s3://${BUCKET_NAME}/${FILE_NAME} --recursive
warning: Skipping file /home/gitlab-runner/builds/CeGhSYCJ/0/root/confu-android/app/build/outputs/apk/debug/ConfuRefac.apk/. File does not exist.
Running after_script
00:00
Uploading artifacts for failed job
00:01
ERROR: Job failed: exit status 1
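As I indicated in the comments, the issue is that --recursive makes aws s3 cp treat ${FILE_NAME} as a directory, not a file. That is why the warning shows the path ConfuRefac.apk/. with a trailing slash.
This makes sense, because one can't recursively copy a single file. Dropping --recursive from the command fixes the upload.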
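If the same job sometimes uploads a single file and sometimes a whole directory, the flag can be chosen from the source path. A minimal sketch, assuming the AWS CLI is invoked from a Python helper; s3_cp_args is a hypothetical function name, and it only builds the argument list.

```python
import os


def s3_cp_args(source, destination):
    """Return argv for `aws s3 cp`, adding --recursive only for directories."""
    args = ["aws", "s3", "cp", source, destination]
    if os.path.isdir(source):
        args.append("--recursive")
    return args


# Hypothetical usage:
#   subprocess.run(s3_cp_args("ConfuRefac.apk", "s3://my-bucket/ConfuRefac.apk"),
#                  check=True)
```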
I'm using Zappa to deploy on AWS, and I want to implement CI/CD on AWS.
So I created a pipeline and successfully set up AWS CodeCommit and AWS CodeBuild.
I'm unable to deploy the same using AWS CodeDeploy.
The Buildspec.yaml looks like this:
version: 0.2
phases:
  install:
    commands:
      - echo Setting up virtualenv
      - python -m venv venv
      - source venv/bin/activate
      - echo Installing requirements from file
      - pip install -r requirements.txt
  build:
    commands:
      - echo Build started on `date`
      - echo Building and running tests
      - python tests.py
      - flask db upgrade
  post_build:
    commands:
      - echo Build completed on `date`
      - echo Starting deployment
      - zappa update dev
      - echo Deployment completed
How should I execute zappa deploy or zappa update on AWS?
I'm not sure how to create the appspec.yaml file.
Please HELP! Stuck!!
Here's a buildspec.yml file that I use. You could adjust this to suit your needs (for example, including the DB upgrade command).
version: 0.2
phases:
  install:
    commands:
      - mkdir /tmp/src/
      - mv $CODEBUILD_SRC_DIR/* /tmp/src/
      - cd /tmp/src/
      - python3 -m venv docker_env && source docker_env/bin/activate && pip install --upgrade pip==9.0.3 && pip install -r requirements.txt && zappa update production && deactivate && rm -rf docker_env
  post_build:
    commands:
      - cd $CODEBUILD_SRC_DIR
      - rm -rf /tmp/src/
      - echo Build completed on `date`
Note that this is using the Docker image danielwhatmuff/zappa:python3.6 in CodeBuild. I use this image as it's based on AWS Lambda and has been tuned for Zappa.
Zappa update to Code Deploy:
Your Buildspec.yaml looks fairly good, but there is one important point to consider.
The post_build phase runs even if the build phase fails, so zappa update dev would still run after failing tests. Debug information can be pulled from the log of a failed build.
Either check the reason for the failure in the build log, or modify your yml to look like the one below (caution: this is only a draft change, test it before using it in real systems):
version: 0.2
phases:
  install:
    commands:
      - yum -y groupinstall development
      - yum -y install zlib-devel
      - yum -y install openssl-devel
      - wget https://www.python.org/ftp/python/3.6.0/Python-3.6.0.tar.xz
      - tar xJf Python-3.6.0.tar.xz
      - cd Python-3.6.0
      - ./configure
      - make
      - make install
      - ln -s /usr/local/bin/python3.6 /usr/bin/python3
      - curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py"
      - python3 get-pip.py
      - pip3 install virtualenv
      - virtualenv -p /usr/bin/python3 venv
      - source venv/bin/activate
      - pip3 install -r requirements.txt
  build:
    commands:
      - echo Build started on `date`
      - echo Building and running tests
      - python3 tests.py
      - flask db upgrade
  post_build:
    commands:
      - if [ $CODEBUILD_BUILD_SUCCEEDING = 1 ]; then echo Build completed on `date`; echo Starting deployment; zappa update dev; else echo Build failed ignoring deployment; fi
      - echo Deployment completed
Hope it answers.
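The same gate can live in a small Python helper if the inline shell test becomes hard to read. This is only a sketch: CodeBuild sets CODEBUILD_BUILD_SUCCEEDING to 1 while the build is succeeding and 0 otherwise, and should_deploy is a hypothetical function name.

```python
import os


def should_deploy(environ=None):
    """Return True only when CodeBuild reports the build as succeeding."""
    env = os.environ if environ is None else environ
    return env.get("CODEBUILD_BUILD_SUCCEEDING") == "1"


# Hypothetical usage in a post_build helper script:
#   if should_deploy():
#       subprocess.run(["zappa", "update", "dev"], check=True)
```

A missing variable is treated as a failed build, so running the helper outside CodeBuild never deploys by accident.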
Zappa update to AWS
Below are the steps to do a Zappa update on AWS:
1) Configure AWS with an IAM user.
2) Configure the AWS CLI on the local host using the commands:
a. pip install awscli
b. aws configure
3) Run "zappa init"; it will generate zappa_settings.json based on the details provided.
4) Run zappa deploy <name provided for environment in step 3>
Now your application will be deployed to AWS. Whenever you need to update, call
zappa update <name provided for environment in step 3>
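If you are unsure which environment name to pass, it can be read back from the generated settings file, since the top-level keys of zappa_settings.json are the stage names. A minimal sketch, assuming the default JSON settings file in the project root; zappa_stages is a hypothetical helper name.

```python
import json


def zappa_stages(settings_path="zappa_settings.json"):
    """Return the stage names defined in a Zappa settings file, sorted."""
    with open(settings_path, encoding="utf-8") as handle:
        settings = json.load(handle)
    return sorted(settings)


# Hypothetical usage:
#   for stage in zappa_stages():
#       print("zappa update", stage)
```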