Is it possible to download the contents of a public Lambda layer from AWS given the ARN?

I want to download the contents of the public Lambda layer below, a more compact build of spaCy from this GitHub repository:
"arn:aws:lambda:us-west-2:113088814899:layer:Klayers-python37-spacy:27"
How can I achieve this?

You can get it from an ARN using the get-layer-version-by-arn command in the AWS CLI.
Run the command below to get the source of the Lambda layer version you referenced.
aws lambda get-layer-version-by-arn \
--arn "arn:aws:lambda:us-west-2:113088814899:layer:Klayers-python37-spacy:27"
An example of the response you will receive is below:
{
    "LayerVersionArn": "arn:aws:lambda:us-west-2:123456789012:layer:AWSLambda-Python37-SciPy1x:2",
    "Description": "AWS Lambda SciPy layer for Python 3.7 (scipy-1.1.0, numpy-1.15.4) https://github.com/scipy/scipy/releases/tag/v1.1.0 https://github.com/numpy/numpy/releases/tag/v1.15.4",
    "CreatedDate": "2018-11-12T10:09:38.398+0000",
    "LayerArn": "arn:aws:lambda:us-west-2:123456789012:layer:AWSLambda-Python37-SciPy1x",
    "Content": {
        "CodeSize": 41784542,
        "CodeSha256": "GGmv8ocUw4cly0T8HL0Vx/f5V4RmSCGNjDIslY4VskM=",
        "Location": "https://awslambda-us-west-2-layers.s3.us-west-2.amazonaws.com/snapshots/123456789012/..."
    },
    "Version": 2,
    "CompatibleRuntimes": [
        "python3.7"
    ],
    "LicenseInfo": "SciPy: https://github.com/scipy/scipy/blob/master/LICENSE.txt, NumPy: https://github.com/numpy/numpy/blob/master/LICENSE.txt"
}
Once you run this, the response contains a "Content" key whose "Location" subkey is a pre-signed S3 URL for the layer contents.
You can download the archive from that URL and, after removing any dependencies you don't want, configure it as a Lambda layer again.
Please ensure in this process that you only remove unnecessary dependencies.
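If you want to script the download, here is a minimal sketch using boto3 (the region, variable names, and output filename are assumptions; the ARN is the one from the question):
import urllib.request

import boto3

# Public layer ARN from the question; the client region must match the layer's region.
LAYER_ARN = "arn:aws:lambda:us-west-2:113088814899:layer:Klayers-python37-spacy:27"

lambda_client = boto3.client("lambda", region_name="us-west-2")
layer = lambda_client.get_layer_version_by_arn(Arn=LAYER_ARN)

# Content.Location is a pre-signed, time-limited S3 URL pointing at the layer zip.
url = layer["Content"]["Location"]
urllib.request.urlretrieve(url, "spacy-layer.zip")
print("Saved spacy-layer.zip ({} bytes reported)".format(layer["Content"]["CodeSize"]))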


What are SageMaker pipelines actually?

SageMaker Pipelines are rather unclear to me. I'm not experienced in ML, but I'm working on figuring out the pipeline definitions.
I have a few questions:
Are SageMaker Pipelines a stand-alone service/feature? I don't see any option to create them through the console, though I do see CloudFormation and CDK resources.
Is a SageMaker pipeline essentially CodePipeline? How do these integrate, and how do they differ?
There's also a Python SDK; how does this differ from the CDK and CloudFormation?
I can't seem to find any examples besides Python SDK usage; how come?
The docs and workshops only seem to properly describe the Python SDK usage. It would be really helpful if someone could clear this up for me!
SageMaker has two things called Pipelines: Model Building Pipelines and Serial Inference Pipelines. I believe you're referring to the former.
A model building pipeline defines steps in a machine learning workflow, such as pre-processing, hyperparameter tuning, batch transformations, and setting up endpoints.
A serial inference pipeline is two or more SageMaker models run one after the other.
A model building pipeline is defined in JSON and is hosted/run in some sort of proprietary, serverless fashion by SageMaker.
Are SageMaker Pipelines a stand-alone service/feature? I don't see any option to create them through the console, though I do see CloudFormation and CDK resources.
You can create/modify them using the API, which can also be called via the CLI, Python SDK, or CloudFormation. These all use the AWS API under the hood.
You can start/stop/view them in SageMaker Studio:
Left-side Navigation bar > SageMaker resources > Drop-down menu > Pipelines
Is a SageMaker pipeline essentially CodePipeline? How do these integrate, and how do they differ?
Not really. CodePipeline is more for building and deploying code and is not specific to SageMaker. There is no direct integration as far as I can tell, other than that you can start a SageMaker pipeline from CodePipeline.
There's also a Python SDK; how does this differ from the CDK and CloudFormation?
The Python SDK is a stand-alone library for interacting with SageMaker in a developer-friendly fashion. It's more dynamic than CloudFormation: it lets you build pipelines using code, whereas CloudFormation takes a static JSON definition.
A very simple example of Python SageMaker SDK usage:
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep

processor = SKLearnProcessor(
    framework_version="0.23-1",
    instance_count=1,
    instance_type="ml.m5.large",
    role="role-arn",
)
processing_step = ProcessingStep(
    name="processing",
    processor=processor,
    code="preprocessor.py",
)
pipeline = Pipeline(name="foo", steps=[processing_step])
pipeline.upsert(role_arn=...)
pipeline.start()
pipeline.definition() produces rather verbose JSON like this:
{
    "Version": "2020-12-01",
    "Metadata": {},
    "Parameters": [],
    "PipelineExperimentConfig": {
        "ExperimentName": {
            "Get": "Execution.PipelineName"
        },
        "TrialName": {
            "Get": "Execution.PipelineExecutionId"
        }
    },
    "Steps": [
        {
            "Name": "processing",
            "Type": "Processing",
            "Arguments": {
                "ProcessingResources": {
                    "ClusterConfig": {
                        "InstanceType": "ml.m5.large",
                        "InstanceCount": 1,
                        "VolumeSizeInGB": 30
                    }
                },
                "AppSpecification": {
                    "ImageUri": "246618743249.dkr.ecr.us-west-2.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3",
                    "ContainerEntrypoint": [
                        "python3",
                        "/opt/ml/processing/input/code/preprocessor.py"
                    ]
                },
                "RoleArn": "arn:aws:iam::123456789012:role/foo",
                "ProcessingInputs": [
                    {
                        "InputName": "code",
                        "AppManaged": false,
                        "S3Input": {
                            "S3Uri": "s3://bucket/preprocessor.py",
                            "LocalPath": "/opt/ml/processing/input/code",
                            "S3DataType": "S3Prefix",
                            "S3InputMode": "File",
                            "S3DataDistributionType": "FullyReplicated",
                            "S3CompressionType": "None"
                        }
                    }
                ]
            }
        }
    ]
}
You could use the above JSON with CloudFormation/CDK, but you still build the JSON with the SageMaker SDK; a sketch of feeding it to the low-level API follows below.
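For illustration, here is a hedged sketch of registering such a definition through the low-level API with boto3, which is roughly what CloudFormation does for you (the file name, pipeline name, and role ARN are placeholders):
import boto3

sm = boto3.client("sagemaker")

# Assume the JSON produced by pipeline.definition() was saved to this file.
with open("pipeline-definition.json") as f:
    definition_json = f.read()

sm.create_pipeline(
    PipelineName="foo",                            # placeholder pipeline name
    PipelineDefinition=definition_json,            # the JSON document shown above
    RoleArn="arn:aws:iam::123456789012:role/foo",  # placeholder execution role
)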
You can also define model building workflows using Step Functions state machines (via the Data Science SDK) or Airflow.

Using CloudWatch Events: How to Pass a JSON Object to CodeBuild as an Environment Variable

Summary: I can't specify a JSON object using the CloudWatch target Input Transformer in order to pass the object's contents as an environment variable to a CodeBuild project.
Background:
I trigger an AWS CodeBuild job when an S3 bucket receives any new object. I have enabled CloudTrail for S3 operations so that I can use a CloudWatch rule that has my S3 bucket as an Event Source, with the CodeBuild project as a Target.
If I set up the 'Configure input' part of the Target using Input Transformer, I can get single 'primitive' values from the event using the format below:
Input path textbox:
{"zip_file":"$.detail.requestParameters.key"}
Input template textbox:
{"environmentVariablesOverride": [ {"name":"ZIP_FILE", "value":<zip_file>}]}
And this works fine if I use 'simple' single strings.
However, if I wish to obtain the entire 'resources' key, which is a JSON object, I need to know each of the keys within it and the object structure, and manually recreate that structure for each key/value pair.
For example, the resources element in the Event is:
"resources": [
{
"type": "AWS::S3::Object",
"ARN": "arn:aws:s3:::mybucket/myfile.zip"
},
{
"accountId": "1122334455667799",
"type": "AWS::S3::Bucket",
"ARN": "arn:aws:s3:::mybucket"
}
],
I want the code in the buildspec in CodeBuild to do the heavy lifting and parse the JSON data.
If I specify in the input path textbox:
{"zip_file":"$.detail.resources"}
Then the CodeBuild project never gets triggered.
Is there a way to get the entire JSON object, identified by a specific key, as an environment variable?
CodeBuild targets support all the parameters allowed by the StartBuild API. You need to use environmentVariablesOverride in your input template JSON string.
{"environmentVariablesOverride": [ {"name":"ZIPFILE", "value":<zip_file>}]}
Please avoid using '_' in the environment variable name.
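If you do get the whole resources array through as a single JSON string, the parsing can then happen inside the build rather than in the input template. A minimal Python sketch, assuming the transformer placed the JSON in an environment variable named RESOURCES (that variable name is hypothetical):
import json
import os

# RESOURCES is assumed to carry the JSON string injected via environmentVariablesOverride.
resources = json.loads(os.environ.get("RESOURCES", "[]"))

# Pull out the S3 object ARN(s) without hard-coding the structure in the event rule.
object_arns = [r["ARN"] for r in resources if r.get("type") == "AWS::S3::Object"]
print(object_arns)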

Cloud Recording issue

I am trying the Agora Cloud Recording API, recording into an AWS S3 bucket. The calls appear to go through fine. When I stop the recording, I get a success message; I have reproduced part of it here:
{
    insertId: "5d66423d00012ad9d6d02f2b"
    labels: {
        clone_id: "00c61b117c803f45c35dbd46759dc85f8607177c3234b870987ba6be86fec0380c162a"
    }
    textPayload: "Stop cloud recording success. FileList : 01ce51a4a640ecrrrrhxabd9e9d823f08_tdeaon_20197121758.m3u8, uploading status: backuped"
    timestamp: "2019-08-28T08:58:37.076505Z"
}
It shows the status 'backuped'. As per the Agora documentation, this means the files were uploaded to Agora's cloud; within 5 minutes they are then supposed to be uploaded to my AWS S3 bucket.
I am not seeing the file in my AWS bucket. I have tested the bucket secret key; the same key works fine for another application. I have also verified the CORS settings.
Please suggest how I can debug further.
Make sure you are entering your S3 credentials correctly within the storageConfig settings:
"storageConfig":{
"vendor":{{StorageVendor}},
"region":{{StorageRegion}},
"bucket":"{{Bucket}}",
"accessKey":"{{AccessKey}}",
"secretKey":"{{SecretKey}}"
}
Agora offers a Postman collection to make testing easier: https://documenter.getpostman.com/view/6319646/SVSLr9AM?version=latest
I had faced this issue due to a wrong uid.
The recording uid needs to be a random id, which is used for 'joining' by the recording client that 'resides' in the cloud. I had passed my main client's id.
Two other causes I have run into:
S3 credentials
S3 CORS settings: go to the AWS S3 permissions and set the allowed CORS headers.
EDIT:
It could be something like this on the S3 side:
[
    {
        "AllowedHeaders": [
            "Authorization",
            "*"
        ],
        "AllowedMethods": [
            "HEAD",
            "POST"
        ],
        "AllowedOrigins": [
            "*"
        ],
        "ExposeHeaders": [
            "ETag",
            "x-amz-meta-custom-header",
            "x-amz-storage-class"
        ],
        "MaxAgeSeconds": 5000
    }
]
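If you prefer to apply that CORS configuration with code instead of the console, here is a sketch using boto3 (the bucket name is a placeholder):
import boto3

s3 = boto3.client("s3")

s3.put_bucket_cors(
    Bucket="mybucket",  # placeholder bucket name
    CORSConfiguration={
        "CORSRules": [
            {
                "AllowedHeaders": ["Authorization", "*"],
                "AllowedMethods": ["HEAD", "POST"],
                "AllowedOrigins": ["*"],
                "ExposeHeaders": ["ETag", "x-amz-meta-custom-header", "x-amz-storage-class"],
                "MaxAgeSeconds": 5000,
            }
        ]
    },
)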

How to get the Initialization Vector (IV) from the AWS Encryption CLI?

I'm encrypting a file with the AWS Encryption CLI using a command like so:
aws-encryption-cli --encrypt --input test.mp4 --master-keys key=arn:aws:kms:us-west-2:123456789012:key/example-key-id --output . --metadata-output -
From the output of the command, I can clearly see that it's using an Initialization Vector (IV) of length 12, which is great, but how do I actually view the IV? In order to pass the encrypted file to another service, like AWS Elastic Transcoder, where it'll do the decryption itself, I need to know which IV was used to encrypt the file.
{
    "header": {
        "algorithm": "AES_256_GCM_IV12_TAG16_HKDF_SHA384_ECDSA_P384",
        "content_type": 2,
        "encrypted_data_keys": [{
            "encrypted_data_key": "...............",
            "key_provider": {
                "key_info": "............",
                "provider_id": "..........."
            }
        }],
        "encryption_context": {
            "aws-crypto-public-key": "..............."
        },
        "frame_length": 4096,
        "header_iv_length": 12,
        "message_id": "..........",
        "type": 128,
        "version": "1.0"
    },
    "input": "/home/test.mp4",
    "mode": "encrypt",
    "output": "/home/test.mp4.encrypted"
}
Unfortunately, you won't be able to use the AWS Encryption SDK CLI to encrypt data for Amazon Elastic Transcoder's consumption.
One of the primary benefits of the AWS Encryption SDK is the message format[1] which packages all necessary information about the encrypted message into a binary blob and provides a more scalable way of handling large messages. Extracting the data primitives from that blob is not recommended and even if you did, they may or may not be directly compatible with another system, depending on how you used the AWS Encryption SDK and what that other system expects.
In the case of Elastic Transcoder, they expect the raw ciphertext encrypted using the specified AES mode[2]. This is not compatible with the AWS Encryption SDK format.
[1] https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/message-format.html
[2] https://docs.aws.amazon.com/elastictranscoder/latest/developerguide/create-job.html#create-job-request-inputs-encryption

API Gateway not importing exported definition

I am testing my backup procedure for an API in API Gateway.
I export my API from the API Gateway console within my AWS account, then go into API Gateway and create a new API via 'Import from Swagger'.
I paste my exported definition in and create it, which throws tons of errors.
From my reading, it seems this is a known issue / pain point.
I suspect the reason for the error(s) is that I use a custom authorizer:
"security" : [ {
"TestAuthorizer" : [ ]
}, {
"api_key" : [ ]
} ]
I use this on each method, hence I get a lot of errors.
The weird thing is that I can clone this API perfectly fine, so I assumed that I could take an exported definition and re-import it without issues.
Any ideas on how I can correct these errors (preferably within API Gateway, so that I can export/import with no issues)?
An example of one of my GET methods using this authorizer is:
"/api/example" : {
"get" : {
"produces" : [ "application/json" ],
"parameters" : [ {
"name" : "Authorization",
"in" : "header",
"required" : true,
"type" : "string"
} ],
"responses" : {
"200" : {
"description" : "200 response",
"schema" : {
"$ref" : "#/definitions/exampleModel"
},
"headers" : {
"Access-Control-Allow-Origin" : {
"type" : "string"
}
}
}
},
"security" : [ {
"TestAuthorizer" : [ ]
}, {
"api_key" : [ ]
} ]
}
Thanks in advance
UPDATE
The error(s) that I get when importing a definition I had just exported are:
Your API was not imported due to errors in the Swagger file.
Unable to put method 'GET' on resource at path '/api/v1/MethodName': Invalid authorizer ID specified. Setting the authorization type to CUSTOM or COGNITO_USER_POOLS requires a valid authorizer.
I get the message for each method in my API - so there is a lot.
Additionally, right at the end of the message, I get this:
Additionally, these warnings were found:
Unable to create authorizer from security definition: 'TestAuthorizer'. Extension x-amazon-apigateway-authorizer is required. Any methods with security: 'TestAuthorizer' will not be created. If this security definition is not a configured authorizer, remove the x-amazon-apigateway-authtype extension and it will be ignored.
I have tried ignoring the errors; same result.
Make sure you are exporting your Swagger with both the integrations and authorizers extensions.
Try exporting your Swagger using the AWS CLI:
aws apigateway get-export \
--parameters '{"extensions":"integrations,authorizers"}' \
--rest-api-id {api_id} \
--stage-name {stage_name} \
--export-type swagger swagger.json
The output will be written to the swagger.json file.
For more details about Swagger custom extensions, see this.
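If the CLI parameter quoting is troublesome, the same export can also be requested with boto3; a sketch assuming a Swagger export with both extensions (the API ID and stage name are placeholders):
import boto3

apigw = boto3.client("apigateway")

response = apigw.get_export(
    restApiId="abc123",   # placeholder API ID
    stageName="dev",      # placeholder stage name
    exportType="swagger",
    parameters={"extensions": "integrations,authorizers"},
    accepts="application/json",
)

# The export body is returned as a stream; write it to a local file.
with open("swagger.json", "wb") as f:
    f.write(response["body"].read())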
For anyone who may come across this issue:
After LOTS of troubleshooting and eventually involving the AWS Support Team, this has been resolved and identified as an AWS CLI client bug (confirmed by the AWS Support Team).
Final response:
Thank you for providing the details requested. After going through the AWS CLI version and error details, I can confirm the error is because of a known issue with the PowerShell AWS CLI. I apologize for the inconvenience caused by the error. To get around the error, I recommend going through the following steps:
Create a file named data.json in the current directory where the PowerShell command is to be executed.
Save the following contents to the file: {"extensions":"authorizers,integrations"}
In the PowerShell console, ensure the current working directory is the same as the location where data.json is present.
Execute the following command: aws apigateway get-export --parameters file://data.json --rest-api-id APIID --stage-name dev --export-type swagger C:\temp\export.json
Using this finally resolved my issue. I look forward to the fix in one of the upcoming versions.
PS: this is currently on the latest version:
aws --version
aws-cli/1.11.44 Python/2.7.9 Windows/8 botocore/1.5.7