How to upload a zip to S3 with CDK - amazon-web-services

I'm working on building a CDK library and am trying to upload a zip folder to S3 that I can then use for a Lambda deployment later. I've found a lot of direction online to use aws_s3_deployment.
The problem with that construct is that it loads the contents of a zip rather than a zip itself. I've tried to zip a zip inside a zip and that doesn't work. I've also tried to zip a folder and that doesn't work either. The behavior I see is that nothing shows up in S3 and there are no errors from CDK. Is there another way to load a zip to S3?

What you're looking for is the aws-s3-assets module. It allows you to define either directories (which will be zipped) or regular files as assets that the CDK will upload to S3 for you. You can then use the asset's attributes (its bucket and object key) to refer to the uploaded file elsewhere in your stack, for example to point a Lambda function at it, as sketched after the documentation example below.
The documentation has this example for it:
import { Asset } from 'aws-cdk-lib/aws-s3-assets';

// Archived and uploaded to Amazon S3 as a .zip file
const directoryAsset = new Asset(this, "SampleZippedDirAsset", {
  path: path.join(__dirname, "sample-asset-directory")
});

// Uploaded to Amazon S3 as-is
const fileAsset = new Asset(this, 'SampleSingleFileAsset', {
  path: path.join(__dirname, 'file-asset.txt')
});
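For example, a minimal CDK v2 Python sketch (the zip path, handler, and runtime are illustrative) of handing the uploaded zip straight to a Lambda function through the asset's bucket and s3_object_key attributes:

from aws_cdk import aws_lambda as _lambda
from aws_cdk.aws_s3_assets import Asset

# Inside a Stack: a single .zip file asset is uploaded to the CDK asset bucket as-is.
zip_asset = Asset(self, "LambdaZipAsset", path="lambda/deployment.zip")

# Reference the uploaded object through the asset's attributes.
_lambda.Function(self, "MyFunction",
    runtime=_lambda.Runtime.PYTHON_3_9,
    handler="index.handler",
    code=_lambda.Code.from_bucket(zip_asset.bucket, zip_asset.s3_object_key),
)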

In order to upload the zip file to a given bucket I ended up using BucketDeployment with a custom ILocalBundling. The custom bundler compresses the files and puts them in an assets directory for CDK to upload. The important part is to set output_type=BundlingOutput.NOT_ARCHIVED; that way CDK will not try to unzip the file.
import pathlib
import subprocess

import jsii
from aws_cdk import BundlingOptions, BundlingOutput, DockerImage, ILocalBundling
from aws_cdk import aws_s3_deployment as s3_deployment


@jsii.implements(ILocalBundling)
class LocalBundling:
    @jsii.member(jsii_name="tryBundle")
    def try_bundle(self, output_dir: str, image: DockerImage) -> bool:
        cwd = pathlib.Path.cwd()
        print(f"bundling to {output_dir}...")
        # Zip the "zip" directory into the asset staging directory;
        # CDK uploads whatever ends up in output_dir.
        build_dir = f"{cwd}/directory/to"
        command = ["zip", "-r", f"{output_dir}/python.zip", "zip"]
        print(command)
        output = subprocess.run(command, capture_output=True, check=True, cwd=build_dir)
        # print(output.stdout.decode("utf-8"))
        return True


local_bundling = LocalBundling()
s3_deployment.BucketDeployment(
    self,
    "SomeIdForBucketDeployment",
    sources=[
        s3_deployment.Source.asset(
            "directory/to/zip",
            bundling=BundlingOptions(
                command=["none"],  # never used, because local bundling succeeds
                image=DockerImage.from_registry("lm"),
                local=local_bundling,
                output_type=BundlingOutput.NOT_ARCHIVED,  # keep the zip as a single object
            ),
        )
    ],
    destination_bucket=some_bucket,
    destination_key_prefix=some_key_prefix,
)

Related

How to give the local zip path in AWS CloudFormation YAML CodeUri?

I have exported a Lambda YAML from its export function using "Download AWS SAM file".
I have also downloaded the code zip file from "Download deployment package".
In the YAML file we need to give the CodeUri; in the downloaded YAML it is just a dot (.).
So when I upload it to AWS CloudFormation it says:
'CodeUri' is not a valid S3 Uri of the form 's3://bucket/key' with optional versionId query parameter.
I need to know whether there is a way to give the zip file in the CodeUri from the local file path rather than uploading it to S3.
I have also tried with the name of the zip file I downloaded, and I still get the same error.
You have to first run the package command (aws cloudformation package or sam package), which uploads your code to an S3 bucket and rewrites CodeUri to the resulting S3 URI. It may not work with the zip itself, so you may need to try it with the unpacked source code.

No such file or directory when downloading a file using boto3 from an AWS S3 bucket

I want to download the whole bucket "my-bucket" from aws s3 storage, and save it in the path "./resources/my-bucket". So, I'm doing the following, importing boto3, creating a resource with the access_key_id and secret_access_key, then iterating over all the objects in the bucket and downloading them using the download_file api call:
import boto3

s3 = boto3.resource(
    service_name="s3",
    region_name="us-west-1",
    aws_access_key_id="AWUHWIEDFB",
    aws_secret_access_key="AOIERJGOERWHJO"
)

for o in s3.Bucket("my-bucket").objects.all():
    filename = o.key
    s3.Bucket("my-bucket").download_file(Key=filename, Filename="./resources/my-bucket/" + filename)
The s3 bucket my-bucket itself has a folder called "input" with only png and jpg files, and also has another .pt file in the root path.
But I'm getting the error:
FileNotFoundError: [Errno 2] No such file or directory: './resources/my-bucket/input/.F89bdcAc'
I don't know what .F89bdcAc means, and why boto3 is trying to find that file.
How to fix this?
See the answer to this post: Boto3 to download all files from a S3 Bucket.
It separates the processing of the directories in your bucket from the processing of the files (keys). Directories on your system are created on the fly with os.makedirs, so you won't get this error anymore. And you ought to change your access key / secret!
It seems the directory structure you are passing in the Filename param does not exist locally.
Try creating the provided directory structure, i.e. './resources/my-bucket/input/', and then put the name of the file (with its extension) at the end.
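For reference, a minimal sketch of that fix (bucket name and local path taken from the question; credentials are assumed to come from the environment rather than being hard-coded):

import os

import boto3

s3 = boto3.resource("s3")  # credentials from environment/shared config
bucket = s3.Bucket("my-bucket")

for obj in bucket.objects.all():
    # Skip "directory" placeholder keys that end with a slash.
    if obj.key.endswith("/"):
        continue
    target = os.path.join("./resources/my-bucket", obj.key)
    # Create the local directory structure (e.g. ./resources/my-bucket/input) on the fly.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    bucket.download_file(obj.key, target)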

AWS CDK: run external build command in CDK sequence?

Is it possible to run an external build command as part of a CDK stack sequence? The intention: 1) create a REST API, 2) write the REST URL to a config file, 3) build and deploy a React app:
import apigateway = require('@aws-cdk/aws-apigateway');
import cdk = require('@aws-cdk/core');
import fs = require('fs');
import s3 = require('@aws-cdk/aws-s3');
import s3deployment = require('@aws-cdk/aws-s3-deployment');

export class MyStack extends cdk.Stack {
  const restApi = new apigateway.RestApi(this, ..);

  fs.writeFile('src/app-config.json',
    JSON.stringify({ "api": restApi.deploymentStage.urlForPath('/myResource') }))

  // TODO locally run 'npm run build', create 'build' folder incl rest api config

  const websiteBucket = new s3.Bucket(this, ..)
  new s3deployment.BucketDeployment(this, .. {
    sources: [s3deployment.Source.asset('build')],
    destinationBucket: websiteBucket
  })
}
Unfortunately, it is not possible, as the necessary references are only available after deploy and therefore after you try to write the file (the file will contain cdk tokens).
I personally have solved this problem by telling cdk to output the API Gateway URLs to a file and then parsing it after the deploy to upload a configuration file to an S3 bucket. To do it you need to:
deploy with the output file options, for example:
cdk deploy -O ./cdk.out/deploy-output.json
In ./cdk.out/deploy-output.json you will find a JSON object with a key for each stack that produced an output (e.g. your stack that contains an API gateway)
manually parse that JSON to get your apigateway url
create your configuration file and upload it to S3 (you can do it via aws-sdk)
Of course, you would put the last steps in a custom script, which means that you have to wrap your cdk deploy. I suggest doing so with a Node.js script, so that you can leverage aws-sdk to upload your file to S3 easily.
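For illustration, a minimal sketch of that wrapper step (the answer suggests a Node.js script with aws-sdk; this shows the same idea in Python with boto3, where the stack name, output key, and bucket are hypothetical):

import json

import boto3

# Parse the outputs file produced by `cdk deploy -O ./cdk.out/deploy-output.json`.
# "MyApiStack" and "MyRestURL" are hypothetical names; use your stack/output names.
with open("./cdk.out/deploy-output.json") as f:
    outputs = json.load(f)
api_url = outputs["MyApiStack"]["MyRestURL"]

# Write the app configuration and upload it to the website bucket.
config = {"api": {"invokeUrl": api_url}}
s3 = boto3.client("s3")
s3.put_object(
    Bucket="my-website-bucket",  # hypothetical bucket name
    Key="app-config.json",
    Body=json.dumps(config),
    ContentType="application/json",
)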
Accepting that cdk doesn't support this, I split the logic into two cdk apps, accessed the API Gateway URL as a cdk output via the CLI, and then wrapped everything in a bash script.
AWS CDK:
// API gateway
const api = new apigateway.RestApi(this, 'my-api', ..)
// output url
const myResourceURL = api.deploymentStage.urlForPath('/myResource');
new cdk.CfnOutput(this, 'MyRestURL', { value: myResourceURL });
Bash:
# deploy api gw
cdk deploy --app (..)
# read url via cli with --query
export rest_url=`aws cloudformation describe-stacks --stack-name (..) --query "Stacks[0].Outputs[?OutputKey=='MyRestURL'].OutputValue" --output text`
# configure React app
echo "{ \"api\" : { \"invokeUrl\" : \"$rest_url\" } }" > src/app-config.json
# build React app with url
npm run build
# run second cdk app to deploy React built output folder
cdk deploy --app (..)
Is there a better way?
I solved a similar issue:
I needed to build and upload a react-app as well.
I supported dynamic configuration reading from the react-app.
I released my react-app with a specific version (in a separate flow).
Then, during CDK deployment of my app, it took a specific version of my react-app (the version was retrieved from local configuration) and uploaded its zip file to an S3 bucket using CDK BucketDeployment.
Then, using AwsCustomResource, I generated a configuration file with references to Cognito and API-GW and uploaded this file to S3 as well:
// create s3 bucket for react-app
const uiBucket = new Bucket(this, "ui", {
  bucketName: this.stackName + "-s3-react-app",
  blockPublicAccess: BlockPublicAccess.BLOCK_ALL
});

let confObj = {
  "myjsonobj": {
    "region": `${this.region}`,
    "identity_pool_id": `${props.CognitoIdentityPool.ref}`,
    "myBackend": `${apiGw.deploymentStage.urlForPath("/")}`
  }
};
const dataString = JSON.stringify(confObj, null, 4);

const bucketDeployment = new BucketDeployment(this, this.stackName + "-app", {
  destinationBucket: uiBucket,
  sources: [Source.asset(`reactapp-v1.zip`)]
});
bucketDeployment.node.addDependency(uiBucket);

const s3Upload = new custom.AwsCustomResource(this, 'config-json', {
  policy: custom.AwsCustomResourcePolicy.fromSdkCalls({ resources: custom.AwsCustomResourcePolicy.ANY_RESOURCE }),
  onCreate: {
    service: "S3",
    action: "putObject",
    parameters: {
      Body: dataString,
      Bucket: `${uiBucket.bucketName}`,
      Key: "app-config.json",
    },
    physicalResourceId: PhysicalResourceId.of(`${uiBucket.bucketName}`)
  }
});
s3Upload.node.addDependency(bucketDeployment);
As others have mentioned, this isn't supported within CDK. So this is how we solved it in SST: https://github.com/serverless-stack/serverless-stack
On the CDK side, allow defining React environment variables using the outputs of other constructs.
// Create a React.js app
const site = new sst.ReactStaticSite(this, "Site", {
  path: "frontend",
  environment: {
    // Pass in the API endpoint to our app
    REACT_APP_API_URL: api.url,
  },
});
Spit out a config file while starting the local environment for the backend.
Then start React using sst-env -- react-scripts start, where we have a simple CLI that reads from the config file and loads them as build-time environment variables in React.
While deploying, replace these environment variables inside a custom resource based on the outputs.
We wrote about it here: https://serverless-stack.com/chapters/setting-serverless-environments-variables-in-a-react-app.html
And here's the source for the ReactStaticSite and StaticSite constructs for reference.
In my case, I'm using the Python language for CDK. I have a Makefile which I invoke directly from my app.py like this: os.system("make"). I use make to build up a layer zip file per the AWS docs. Technically you can invoke whatever you'd like. You must import the os package, of course. Hope this helps.
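A minimal sketch of that pattern in app.py (the "layer" Makefile target and the stack wiring are hypothetical; subprocess.run with check=True is used here instead of os.system so a failed build stops the synth):

import subprocess

import aws_cdk as cdk

# Build the Lambda layer zip before synthesizing the app; check=True aborts on failure.
# "layer" is a hypothetical Makefile target.
subprocess.run(["make", "layer"], check=True)

app = cdk.App()
# ... define your stacks here ...
app.synth()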

jenkinsfile - copy files to s3 and make public

I am uploading a website to an S3 bucket for hosting; I upload from a Jenkins build job using this in the Jenkinsfile:
withAWS(credentials: 'aws-cred') {
    sh 'npm install'
    sh 'ng build --prod'
    s3Upload(
        file: 'dist/topic-creation',
        bucket: 'bucketName',
        acl: 'PublicRead'
    )
}
After this step I go to the S3 bucket and get the URL (I have configured the bucket for hosting), but when I go to the endpoint URL I get a 403 error. When I go back to the bucket and give all the uploaded items public access, the URL brings me to my website.
I don't want to make the bucket public; I want to give the files public access. I thought adding the line acl:'PublicRead', which can be seen above, would do this, but it does not.
Can anyone tell me how I can upload the files and give them public access from a Jenkinsfile?
Thanks
Install the S3 publisher plugin on your Jenkins instance: https://plugins.jenkins.io/s3/
In order to upload the local artifacts with public access onto your S3 bucket, use the following command (you can also use the Jenkins Pipeline Syntax):
def identity=awsIdentity();
s3Upload acl: 'PublicRead', bucket: 'NAME_OF_S3_BUCKET', file: 'THE_ARTIFACT_TO_BE_UPLOADED_FROM_JENKINS', path: "PATH_ON_S3_BUCKET", workingDir: '.'
In the case of a Freestyle build, the same upload can be configured in the job settings (the original answer included a sample screenshot).

How do I unzip a .zip file in google cloud storage?

How do I unzip a .zip file in a Google Cloud Storage bucket? (If there is some tool like 'CloudBerry Explorer' for AWS, that would be great.)
You can use Python, e.g. from a Cloud Function:
from google.cloud import storage
from zipfile import ZipFile
from zipfile import is_zipfile
import io

def zipextract(bucketname, zipfilename_with_path):
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucketname)

    destination_blob_pathname = zipfilename_with_path
    blob = bucket.blob(destination_blob_pathname)
    zipbytes = io.BytesIO(blob.download_as_string())

    if is_zipfile(zipbytes):
        with ZipFile(zipbytes, 'r') as myzip:
            for contentfilename in myzip.namelist():
                contentfile = myzip.read(contentfilename)
                blob = bucket.blob(zipfilename_with_path + "/" + contentfilename)
                blob.upload_from_string(contentfile)

zipextract("mybucket", "path/file.zip")  # if the file is gs://mybucket/path/file.zip
If you ended up having a zip file in your Google Cloud Storage bucket because you had to move large files from another server with the gsutil cp command, you could instead gzip when copying; the data will be transferred in compressed format and unzipped when it arrives in the bucket.
This is built into gsutil cp via the -Z argument.
E.g.
gsutil cp -Z largefile.txt gs://bucket/largefile.txt
Here is some code I created to run as a Firebase Cloud Function. It is designed to listen to files loaded into a bucket with the content-type 'application/zip' and extract them in place.
const functions = require('firebase-functions');
const admin = require("firebase-admin");
const path = require('path');
const fs = require('fs');
const os = require('os');
const unzip = require('unzipper')

admin.initializeApp();
const storage = admin.storage();

const runtimeOpts = {
  timeoutSeconds: 540,
  memory: '2GB'
}

exports.unzip = functions.runWith(runtimeOpts).storage.object().onFinalize((object) => {
  return new Promise((resolve, reject) => {
    //console.log(object)
    if (object.contentType !== 'application/zip') {
      reject();
    } else {
      // admin.storage() gives access to the bucket that triggered the function
      const bucket = storage.bucket(object.bucket)
      const remoteFile = bucket.file(object.name)
      const remoteDir = object.name.replace('.zip', '')

      console.log(`Downloading ${remoteFile}`)

      remoteFile.createReadStream()
        .on('error', err => {
          console.error(err)
          reject(err);
        })
        .on('response', response => {
          // Server connected and responded with the specified status and headers.
          //console.log(response)
        })
        .on('end', () => {
          // The file is fully downloaded.
          console.log("Finished downloading.")
          resolve();
        })
        .pipe(unzip.Parse())
        .on('entry', entry => {
          const file = bucket.file(`${remoteDir}/${entry.path}`)
          entry.pipe(file.createWriteStream())
            .on('error', err => {
              console.log(err)
              reject(err);
            })
            .on('finish', () => {
              console.log(`Finished extracting ${remoteDir}/${entry.path}`)
            });
          entry.autodrain();
        });
    }
  })
});
In shell, you can use the below command to unzip a compressed file
gsutil cat gs://bucket/obj.csv.gz | zcat | gsutil cp - gs://bucket/obj.csv
There is no built-in mechanism in GCS to unzip files. A feature request regarding this has already been forwarded to the Google development team.
As an alternative, you can upload the ZIP files to the GCS bucket, download them to a persistent disk attached to a VM instance, unzip them there, and upload the unzipped files using the gsutil tool.
There are Dataflow templates in Google Cloud Dataflow that help to zip/unzip files in Cloud Storage; the relevant one here is the Bulk Decompress Cloud Storage Files template.
This template stages a batch pipeline that decompresses files on Cloud Storage to a specified location. This functionality is useful when you want to use compressed data to minimize network bandwidth costs.
The pipeline automatically handles multiple compression modes during a single execution and determines the decompression mode to use based on the file extension (.bzip2, .deflate, .gz, .zip).
Pipeline requirements
The files to decompress must be in one of the following formats: Bzip2, Deflate, Gzip, Zip.
The output directory must exist prior to pipeline execution.
I'm afraid that by default there is no program in Google Cloud that can do this, but you can get this functionality, for example, using Python.
Universal method available on any machine where Python is installed (so also on Google Cloud):
You need to enter the following commands:
python
or if you need administrator rights:
sudo python
and then in the Python Interpreter:
>>> from zipfile import ZipFile
>>> zip_file = ZipFile('path_to_file/t.zip', 'r')
>>> zip_file.extractall('path_to_extract_folder')
and finally, press Ctrl+D to exit the Python Interpreter.
The unpacked files will be located in the location you specify (of course, if you had the appropriate permissions for these locations).
The above method works identically for Python 2 and Python 3.
Enjoy it to the fullest! :)
Enable the Dataflow API in your Google Cloud console.
Create a temp dir in your bucket (you can't use the root).
Replace YOUR_REGION (e.g. europe-west6) and YOUR_BUCKET in the command below and run it with the gcloud CLI (the presumption is that the gz file is at the root - change if not):
gcloud dataflow jobs run unzip \
--gcs-location gs://dataflow-templates-YOUR_REGION/latest/Bulk_Decompress_GCS_Files \
--region YOUR_REGION \
--num-workers 1 \
--staging-location gs://YOUR_BUCKET/temp \
--parameters inputFilePattern=gs://YOUR_BUCKET/*.gz,outputDirectory=gs://YOUR_BUCKET/,outputFailureFile=gs://YOUR_BUCKET/decomperror.txt
Another fast way to do it using Python in version 3.2 or higher:
import shutil
shutil.unpack_archive('filename')
The method also allows you to indicate the destination folder:
shutil.unpack_archive('filename', 'extract_dir')
The above method works not only for zip archives, but also for tar, gztar, bztar, or xztar archives.
If you need more options, look into the documentation of the shutil module: shutil.unpack_archive