How to run a Lambda function on a schedule on my localhost? - amazon-web-services

I have a task that needs to be scheduled as an AWS Lambda function. I wrote a SAM template as below and it works when deployed to the AWS environment (my function gets triggered at the configured interval).
But we want to test in a dev environment first before deploying. I use sam local start-api [OPTIONS] to run our functions in the dev environment. The problem is that every function configured as a REST API works, but the scheduled task does not. I'm not sure whether this is possible in a local/dev environment or not. If not, please suggest a solution (is it possible?). Thank you.
This is the template:
aRestApi:
  ...
  ...
sendMonthlyReport:
  Type: AWS::Serverless::Function
  Properties:
    Handler: src.monthlyReport
    Runtime: nodejs16.x
    Events:
      ScheduledEvent:
        Type: Schedule
        Properties:
          Schedule: "cron(* * * * *)"

If you search for local testing of Lambda functions before deployment, you will probably be fed this resource by the quote-unquote "Google": https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-using-debugging.html
However, there are other ways to do it.
I personally use the public Docker images from AWS.
Here is an example of a Docker image created for a specific use case of mine.
FROM public.ecr.aws/lambda/python:3.8
RUN yum -y install tar gzip zlib freetype-devel \
gcc \
ghostscript \
lcms2-devel \
libffi-devel \
libimagequant-devel \
# ... and some more dependencies ...
&& yum clean all
COPY requirements.txt ./
RUN python3.8 -m pip install -r requirements.txt
# Replace Pillow with Pillow-SIMD to take advantage of AVX2
RUN pip uninstall -y pillow && CC="cc -mavx2" pip install -U --force-reinstall pillow-simd
COPY <handler>.py ./<handler>.py
# Set the CMD to your handler
ENTRYPOINT [ "python3.8","app5.py" ]
In your case, follow the instructions for Node, build the image, and run it locally. If it works, you can then continue with the AWS Lambda creation/update.
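As a quick smoke test of the container itself, you can POST a scheduled-event-style payload to the Runtime Interface Emulator that the public AWS Lambda base images ship with. A minimal sketch, assuming you started the container with docker run -p 9000:8080 <image> (the port mapping and payload shape here are illustrative, not from the original answer):
// invoke-local.ts -- smoke-test a Lambda container running locally via the
// Runtime Interface Emulator (assumes `docker run -p 9000:8080 <image>`).
const url = "http://localhost:9000/2015-03-31/functions/function/invocations";

// A minimal EventBridge-style scheduled event payload (illustrative).
const scheduledEvent = {
  "detail-type": "Scheduled Event",
  source: "aws.events",
  detail: {},
};

async function main(): Promise<void> {
  // Node 18+ has a global fetch; older versions would need node-fetch or axios.
  const res = await fetch(url, {
    method: "POST",
    body: JSON.stringify(scheduledEvent),
  });
  console.log(await res.text()); // the handler's return value
}

main().catch(console.error);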
I see that you also have a cron job; why not use the cron job to invoke this Lambda function separately instead of defining it in your SAM template?
There are a number of ways you can invoke a Lambda function based on an event.
For example, to invoke using the CLI (for AWS CLI v2):
Make sure you have configured the AWS CLI first.
#!/bin/bash
export AWS_PROFILE=<your aws profile>
export AWS_REGION=<aws region>
aws lambda invoke --function-name <function name> \
  --cli-binary-format raw-in-base64-out \
  --log-type Tail \
  --payload <your json payload> \
  <output filename>
This keeps it loosely coupled.
Then you can use Carlo's node-cron suggestion to invoke it as many times as you like, free of charge.
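To tie the two ideas together: sam local start-lambda exposes a local Lambda endpoint (port 3001 by default) that the AWS SDK can call, so a small node-cron script can stand in for the EventBridge schedule during development. A rough TypeScript sketch, assuming the function's logical ID sendMonthlyReport from the question's template (the payload, region, and credentials are placeholders):
import cron from "node-cron";
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

// Point the SDK at the emulator started with `sam local start-lambda`.
const lambda = new LambdaClient({
  endpoint: "http://127.0.0.1:3001",
  region: "us-east-1",
  credentials: { accessKeyId: "dummy", secretAccessKey: "dummy" },
});

// Fire every minute, mirroring the cron expression in the SAM template.
cron.schedule("* * * * *", async () => {
  const out = await lambda.send(
    new InvokeCommand({
      FunctionName: "sendMonthlyReport", // logical ID from template.yaml
      Payload: Buffer.from(JSON.stringify({ source: "aws.events" })),
    })
  );
  console.log("invoked, status:", out.StatusCode);
});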

I used LocalStack for the demonstration. LocalStack is a cloud service emulator that runs in a single container on your laptop or in your CI environment. You can see more detail at this link: https://github.com/localstack/localstack
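For example, once the stack has been deployed into LocalStack (e.g. with samlocal deploy or the awslocal CLI), you can point the AWS SDK at LocalStack's edge port to check that the schedule was created as an EventBridge rule. A rough sketch, assuming the default edge port 4566 (region and dummy credentials are placeholders):
import { EventBridgeClient, ListRulesCommand } from "@aws-sdk/client-eventbridge";

// LocalStack exposes all services on a single edge port (4566 by default).
const events = new EventBridgeClient({
  endpoint: "http://localhost:4566",
  region: "us-east-1",
  credentials: { accessKeyId: "test", secretAccessKey: "test" },
});

async function main(): Promise<void> {
  // The SAM Schedule event should appear here as a rule with a cron expression.
  const { Rules } = await events.send(new ListRulesCommand({}));
  console.log(Rules?.map((r) => `${r.Name}: ${r.ScheduleExpression}`));
}

main().catch(console.error);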

I would use node-cron to set up a scheduler in a Node file.
npm install --save node-cron
var cron = require('node-cron');

cron.schedule('* * * * *', () => {
  console.log('running a task every minute');
});
https://www.npmjs.com/package/node-cron
You can also check this DigitalOcean tutorial!
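If you only need the business logic to run on a schedule during development, you can also skip the Lambda runtime entirely and call the handler straight from the cron callback. A rough sketch, assuming the handler is exported as monthlyReport from the src module as in the question's template (the stub event and context are illustrative):
import cron from "node-cron";
// Handler "src.monthlyReport" => module "src", exported function "monthlyReport".
import { monthlyReport } from "./src";

cron.schedule("* * * * *", async () => {
  // Pass a stub event/context; fill in whatever your handler actually reads.
  await monthlyReport({ source: "aws.events", detail: {} }, {} as any);
  console.log("monthlyReport finished at", new Date().toISOString());
});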

Related

AWS Glue 3.0 container not working for Jupyter notebook local development

I am working on Glue in AWS and trying to test and debug it in local dev. I follow the instructions here https://aws.amazon.com/blogs/big-data/developing-aws-glue-etl-jobs-locally-using-a-container/ to develop the Glue job locally. In that post they use the Glue 1.0 image for testing and it works as it should. However, when I load and try to develop with the Glue 3.0 version, I follow the guidance steps but I can't open a Jupyter notebook on :8888 like the post says, even though every step seems correct.
Here is my command to start a Jupyter notebook in the Glue 3.0 container:
docker run -itd -p 8888:8888 -p 4040:4040 -v ~/.aws:/root/.aws:ro --name glue3_jupyter amazon/aws-glue-libs:glue_libs_3.0.0_image_01 /home/jupyter/jupyter_start.sh
Nothing shows on http://localhost:8888.
I still have no idea why! I understand the differences between Glue versions; I just want to develop and test on the latest version. Has anybody had the same issue?
Thanks.
It seems that the Glue 3.0 image has some issues with SSL. A workaround for working locally is to disable SSL (you also have to change the script paths, as the documentation is not updated).
$ docker run -it -p 8888:8888 -p 4040:4040 -e DISABLE_SSL="true" \
-e AWS_ACCESS_KEY_ID=$(aws --profile default configure get aws_access_key_id) \
-e AWS_SECRET_ACCESS_KEY=$(aws --profile default configure get aws_secret_access_key) \
-e AWS_DEFAULT_REGION=$(aws --profile default configure get region) \
--name glue_jupyter amazon/aws-glue-libs:glue_libs_3.0.0_image_01 \
/home/glue_user/jupyter/jupyter_start.sh
After a few seconds you should have a working Jupyter notebook instance running at http://127.0.0.1:8888.

AWS CDK CodePipeline deploying app and CDK

I'm using the AWS CDK with TypeScript and I'd like to automate my CDK and code package deployments.
I have 2 github repos: app-cdk and app-website.
I have set up a CodePipeline as follows:
const pipeline = new CodePipeline(this, 'MyAppPipeline', {
  pipelineName: 'MyAppPipeline',
  synth: new ShellStep('Synth', {
    input: CodePipelineSource.gitHub(`${ORG_NAME}/app-cdk`, BRANCH_NAME, {
    }),
    commands: ['npm ci', 'npm run build', 'npx cdk synth']
  })
});
and added a beta stage as follows:
pipeline.addStage(new MyAppStage(this, 'Beta', {
  env: { account: 'XXXXXXXXX', region: 'us-east-2' }
}))
This works fine when I push code to my CDK package, and it deploys new resources. How can I add my website repo as a source to kick off this pipeline, build it in a different manner, and deploy the assets to the necessary resources? Shouldn't that be part of the CodePipeline's source and build stages?
I have encountered a similar scenario, where I had to create a CDK Pipeline for multiple static S3 sites in a repository.
Soon it became evident that this had to be done using two stacks, as the Pipeline requires the step to be of type Stage and does not support Construct,
whereas my static S3 websites were a construct (BucketDeployment).
The way in which I handled this integration is as follows.
deployment_code_build = cb.Project(self, 'PartnerS3deployment',
    project_name='PartnerStaticS3deployment',
    source=cb.Source.git_hub(owner='<github-org>',
        repo='<repo-name>', clone_depth=1,
        webhook_filters=[
            cb.FilterGroup.in_event_of(
                cb.EventAction.PUSH).and_branch_is(
                branch_name="main")]),
    environment=cb.BuildEnvironment(
        build_image=cb.LinuxBuildImage.STANDARD_5_0
    ))
This added/provisioned a CodeBuild project which would dynamically deploy the changesets of cdk ls.
The above CodeBuild project will need a buildspec file in the root of your repo with the following content (for reference):
version: 0.2
phases:
  install:
    commands:
      - echo Entered in install phase...
      - npm install -g aws-cdk
      - cdk --version
  build:
    commands:
      - pwd
      - cd cdk_pipeline_static_websites
      - ls -lah
      - python -m pip install -r requirements.txt
      - nohup ./parallel_deploy.sh & echo $! > pidfile && wait $(cat pidfile)
    finally:
      - echo Build completed on `date`
The contents of parallel_deploy.sh are as follows
#!/bin/bash
for stack in $(cdk list);
do
cdk deploy $stack --require-approval=never &
done;
While this works great, there has to be a simpler alternative which can directly import other stacks/constructs into the CDK Pipeline class.
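One simpler alternative, if the goal is just to pull a second repository into the same pipeline, may be the additionalInputs property of ShellStep, which mounts another source next to the primary input during the synth step. A hedged TypeScript sketch against the question's setup (the repo names, mount path, and build commands are illustrative):
import { CodePipeline, CodePipelineSource, ShellStep } from 'aws-cdk-lib/pipelines';

const pipeline = new CodePipeline(this, 'MyAppPipeline', {
  pipelineName: 'MyAppPipeline',
  synth: new ShellStep('Synth', {
    input: CodePipelineSource.gitHub(`${ORG_NAME}/app-cdk`, BRANCH_NAME),
    // Make the website repo available at ../app-website during synth.
    additionalInputs: {
      '../app-website': CodePipelineSource.gitHub(`${ORG_NAME}/app-website`, BRANCH_NAME),
    },
    commands: [
      // Build the website first so the CDK app can bundle its assets,
      // e.g. via s3deploy.Source.asset('../app-website/build').
      'cd ../app-website && npm ci && npm run build && cd -',
      'npm ci',
      'npm run build',
      'npx cdk synth',
    ],
  }),
});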

How to solve an AWS Lambda function deployment problem?

.. aaaand me again :)
This time with a very interesting problem.
Again an AWS Lambda function: Node.js 12, JavaScript, Ubuntu 18.04 for local development, AWS CLI/AWS SAM/Docker/IntelliJ. Everything is working perfectly locally and it is time to deploy.
So I set up an AWS account for tests, created and assigned an access key/secret, and finally tried to deploy.
Almost at the end, an error pops up, aborting the deployment.
I'm showing the SAM CLI version from a terminal, but the same happens with IntelliJ.
(Of course I mask/change some names.)
From a terminal I go to where I have my local sandbox with the project, and then:
$ sam deploy --guided
Configuring SAM deploy
======================
Looking for config file [samconfig.toml] : Not found
Setting default arguments for 'sam deploy'
=========================================
Stack Name [sam-app]: MyActualProjectName
AWS Region [us-east-1]: us-east-2
#Shows you resources changes to be deployed and require a 'Y' to initiate deploy
Confirm changes before deploy [y/N]: y
#SAM needs permission to be able to create roles to connect to the resources in your template
Allow SAM CLI IAM role creation [Y/n]: y
Save arguments to configuration file [Y/n]: y
SAM configuration file [samconfig.toml]: y
SAM configuration environment [default]:
Looking for resources needed for deployment: Not found.
Creating the required resources...
Successfully created!
Managed S3 bucket: aws-sam-cli-managed-default-samclisourcebucket-7qo1hy7mdu9z
A different default S3 bucket can be set in samconfig.toml
Saved arguments to config file
Running 'sam deploy' for future deployments will use the parameters saved above.
The above parameters can be changed by modifying samconfig.toml
Learn more about samconfig.toml syntax at
https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-config.html
Error: Unable to upload artifact MyFunctionName referenced by CodeUri parameter of MyFunctionName resource.
ZIP does not support timestamps before 1980
$
I spent quite some time looking around for this problem, but I found only some old threads.
In theory this problem was solved in 2018... but probably some npm libraries I had to use contain something old... How in the world do I fix this stuff?
In one thread I found a kind of workaround.
In the buildspec.yml file somebody suggested adding, AFTER the npm install:
ls $CODEBUILD_SRC_DIR
find $CODEBUILD_SRC_DIR/node_modules -mtime +10950 -exec touch {} \;
Basically the idea is to touch all the files installed by npm install, but the error still happens.
This is my buildspec.yml file after the modification:
version: 0.2
phases:
  install:
    commands:
      # Install all dependencies (including dependencies for running tests)
      - npm install
      - ls $CODEBUILD_SRC_DIR
      - find $CODEBUILD_SRC_DIR/node_modules -mtime +10950 -exec touch {} \;
  pre_build:
    commands:
      # Discover and run unit tests in the '__tests__' directory
      - npm run test
      # Remove all unit tests to reduce the size of the package that will be ultimately uploaded to Lambda
      - rm -rf ./__tests__
      # Remove all dependencies not needed for the Lambda deployment package (the packages from devDependencies in package.json)
      - npm prune --production
  build:
    commands:
      # Use AWS SAM to package the application by using AWS CloudFormation
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml
artifacts:
  type: zip
  files:
    - template-export.yml
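For completeness, the same timestamp normalization that the find/touch trick attempts can be done from Node itself right before packaging. A hypothetical sketch (not from the original thread) that walks node_modules and resets any pre-1980 modification times:
import { readdirSync, statSync, utimesSync } from "fs";
import { join } from "path";

// ZIP cannot store timestamps before 1980, so bump anything older to "now".
const CUTOFF = new Date("1980-01-02").getTime();

function fixTimestamps(dir: string): void {
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) {
      fixTimestamps(full);
      continue;
    }
    const { atime, mtimeMs } = statSync(full);
    if (mtimeMs < CUTOFF) {
      utimesSync(full, atime, new Date()); // keep atime, reset mtime
    }
  }
}

fixTimestamps("node_modules");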
I will continue to search, but again I wonder if somebody here has had this kind of problem and has some suggestions/methodology about how to solve it.
Many, many thanks!
Steve

supply token in npmrc during build

I am using AWS CodeArtifact within my project as a private npm registry (and proxy, of course) and I have some issues getting the perfect workflow. Right now I have a .sh script which generates the auth token for AWS and generates a project-local .npmrc file. It pretty much looks like this:
#!/bin/sh
export CODEARTIFACT_AUTH_TOKEN=`aws codeartifact get-authorization-token --domain xxxxx \
--domain-owner XXXXXX --query authorizationToken --output text --profile XXXXX`
export REPOSITORY_ENDPOINT=`aws codeartifact get-repository-endpoint --domain xxxxx \
--repository xxxx --format npm --query repositoryEndpoint --output text --profile xxxx`
cat << EOF > .npmrc
registry=$REPOSITORY_ENDPOINT
${REPOSITORY_ENDPOINT#https:}:always-auth=true
${REPOSITORY_ENDPOINT#https:}:_authToken=\${CODEARTIFACT_AUTH_TOKEN}
EOF
Now I don't want to run this script manually, of course; it should be part of my npm build process, so I started with things like this in package.json:
"scripts": {
"build": "tsc",
"prepublish": "./scriptabove.sh"
}
When running "npm publish" (for example) the .npmrc is created nicely but i assume since NPM is already running, any changes to npmrc wont get picked up. When i run "npm publish" the second time, it works of course.
My question: Is there any way to hook into the build process to apply the token? I dont want to say to my users "please call the scriptabove.sh first before doing any NPM commands. And i dont like "scriptabove.sh && npm publish" either.
You could create a script like this (the publish-package command can be called whatever you want):
"scripts": {
"build": "tsc",
"prepublish": "./scriptabove.sh",
"publish-package": "npm run prepublish && npm publish"
}
Explanation:
Use & (single ampersand) for parallel execution.
Use && (double ampersand) for sequential execution.
publish-package will then run the prepublish command first and afterwards run npm publish. This method is a great way to chain npm commands that need to run in sequential order.
For more information, here's a Stack Overflow post about it:
Running NPM scripts sequentially

AWS Lambda Error: Unzipped size must be smaller than 262144000 bytes

I am developing a Lambda function which uses the ResumeParser library written in Python 2.7. But when I deploy this function, including the library, to AWS, it throws me the following error:
Unzipped size must be smaller than 262144000 bytes
Perhaps you did not exclude development packages, which made your file grow that big.
In my case (for Node.js), I was missing the following in my serverless.yml:
package:
  exclude:
    - node_modules/**
    - venv/**
See if there is something similar for Python or your case.
This is a hard limit which cannot be changed:
AWS Lambda Limit Errors
Functions that exceed any of the limits listed in the previous limits tables will fail with an exceeded limits exception. These limits are fixed and cannot be changed at this time. For example, if you receive the exception CodeStorageExceededException or an error message similar to "Code storage limit exceeded" from AWS Lambda, you need to reduce the size of your code storage.
You need to reduce the size of your package. If you have large binaries, place them in S3 and download them on bootstrap. Likewise for dependencies: you can pip install or easy_install them from an S3 location, which will be faster than pulling from pip repos.
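As an illustration of the "download on bootstrap" idea (shown in TypeScript to match the Node examples elsewhere on this page; the bucket, key, and local path are placeholders): do the download outside the handler so it runs once per cold start, and write to /tmp, the only writable location in Lambda.
import { createWriteStream } from "fs";
import { pipeline } from "stream/promises";
import { Readable } from "stream";
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({});
const LOCAL_PATH = "/tmp/large-model.bin"; // /tmp is the only writable path in Lambda

// Kicked off at module load, i.e. once per cold start, not on every invocation.
const ready = (async () => {
  const { Body } = await s3.send(
    new GetObjectCommand({ Bucket: "my-artifacts-bucket", Key: "large-model.bin" })
  );
  await pipeline(Body as Readable, createWriteStream(LOCAL_PATH));
})();

export const handler = async (event: unknown) => {
  await ready; // make sure the bootstrap download has finished
  // ... use the file at LOCAL_PATH here ...
  return { ok: true };
};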
The best solution to this problem is to deploy your Lambda function using a Docker container that you've built and pushed to AWS ECR. Lambda container images have a limit of 10 GB.
Here's an example using Python-flavored AWS CDK:
from aws_cdk import aws_lambda as _lambda

self.lambda_from_image = _lambda.DockerImageFunction(
    scope=self,
    id="LambdaImageExample",
    function_name="LambdaImageExample",
    code=_lambda.DockerImageCode.from_image_asset(
        directory="lambda_funcs/LambdaImageExample"
    ),
)
An example Dockerfile contained in the directory lambda_funcs/LambdaImageExample alongside my lambda_func.py and requirements.txt:
FROM amazon/aws-lambda-python:latest
LABEL maintainer="Wesley Cheek"
RUN yum update -y && \
yum install -y python3 python3-dev python3-pip gcc && \
rm -Rf /var/cache/yum
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY lambda_func.py ./
CMD ["lambda_func.handler"]
Run cdk deploy and the Lambda function will be automagically bundled into an image along with its dependencies specified in requirements.txt, pushed to an AWS ECR repository, and deployed.
This Medium post was my main inspiration
Edit:
(More details about this solution can be found in my Dev.to post here)
A workaround that worked for me:
Install pyminifier:
pip install pyminifier
Go to the library folder that you want to zip. In my case I wanted to zip the site-packages folder in my virtual env, so I created a site-packages-min folder at the same level as site-packages. Run the following shell script to minify the Python files and create an identical structure in the site-packages-min folder. Then zip and upload these files to S3.
#!/bin/bash
for f in $(find site-packages -name '*.py')
do
  ori=$f
  res=${f/site-packages/site-packages-min}
  filename=$(echo $res | awk -F"/" '{print $NF}')
  echo "$filename"
  path=${res%$filename}
  mkdir -p $path
  touch $res
  pyminifier --destdir=$path $ori >> $res || cp $ori $res
done
HTH
As stated by Greg Wozniak, you may just have imported useless directories like venv and node_modules.
package.exclude is now deprecated and removed in Serverless 4; you should now use package.patterns instead:
package:
  patterns:
    - '!node_modules/**'
    - '!venv/**'
In case you're using CloudFormation, in your template YAML file make sure your 'CodeUri' property includes only your necessary code files and does not contain stuff like the .aws-sam directory (which is big), etc.