AWS Lambda function logs are not getting displayed in CloudWatch - amazon-web-services

I have the below setup which I am trying to run.
I have a Python app running locally on my Linux host.
I am using boto3 to connect to AWS with my user's access key ID and secret access key.
My user has full access to EC2, CloudWatch, S3, and Config.
My application invokes a Lambda function called mylambda.
The execution role for mylambda also has all the required permissions.
Now, if I call my Lambda function from the AWS console, it works fine and I can see the execution logs in CloudWatch. But if I invoke it from my Linux box, from my custom application, I don't see any execution logs, and I am not getting an error either.
Is there anything I am missing?
Any help is really appreciated.
I don't see it getting invoked, but surprisingly I am getting the response below.
gaurav@random:~/lambda_s3$ python main.py
{u'Payload': <botocore.response.StreamingBody object at 0x7f74cb7f5550>, u'ExecutedVersion': '$LATEST', 'ResponseMetadata': {'RetryAttempts': 0, 'HTTPStatusCode': 200, 'RequestId': '7417534c-6263-11e8-xxx-afab1667510a', 'HTTPHeaders': {'x-amzn-requestid': '7417534c-xxx-11e8-8a24-afab1667510a', 'content-length': '4', 'x-amz-executed-version': '$LATEST', 'x-amzn-trace-id': 'root=1-5b0bdc78-7559e68acd668476bxxxx754;sampled=0', 'x-amzn-remapped-content-length': '0', 'connection': 'keep-alive', 'date': 'Mon, 28 May 2018 10:39:52 GMT', 'content-type': 'application/json'}}, u'StatusCode': 200}
{u'CreationDate': datetime.datetime(2018, 5, 27, 9, 50, 9, tzinfo=tzutc()), u'Name': 'bucketname'}
gaurav@random:~/lambda_s3$
My sample app is as follows:
#!/usr/bin/python
import base64
import json

import boto3

d = {'key': 10, 'key2': 20}

client = boto3.client('lambda')
response = client.invoke(
    FunctionName='mylambda',
    InvocationType='RequestResponse',
    # LogType='None',
    ClientContext=base64.b64encode(
        b'{"custom": {"foo": "bar", "fuzzy": "wuzzy"}}'
    ).decode('utf-8'),
    Payload=json.dumps(d),
)
print(response)

Make sure that you're actually invoking the Lambda correctly. Lambda error handling can be a bit tricky: with boto3, the invoke method doesn't necessarily throw even if the invocation fails, so you have to check the StatusCode property in the response.
You mentioned that your user has full access to EC2, CloudWatch, S3, and Config. For your use case, you need to add lambda:InvokeFunction to your user's permissions.
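For reference, a minimal sketch of that check (only the function name is taken from the question; the LogType and payload values are illustrative):
import base64
import json

import boto3

client = boto3.client('lambda')
response = client.invoke(
    FunctionName='mylambda',
    InvocationType='RequestResponse',
    LogType='Tail',  # ask Lambda to also return the last ~4 KB of the execution log
    Payload=json.dumps({'key': 10, 'key2': 20}),
)

print(response['StatusCode'])         # 200 even if the function itself raised an error
print(response.get('FunctionError'))  # 'Handled' or 'Unhandled' when the function failed
print(response['Payload'].read())     # the function's return value (or error document)
print(base64.b64decode(response['LogResult']).decode())  # tail of the execution log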

Related

CloudFront API Distribution Status

I'm trying to retrieve the status of my CloudFront distributions through boto3, but it seems like get_distribution and list_distributions only return the deployment state instead (Deployed or InProgress).
{
  "ResponseMetadata": {
    "RequestId": "bacc7917-90b4-4f91-8915-5dc7201b179a",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "x-amzn-requestid": "bacc7917-90b4-4f91-8915-5dc7201b179a",
      "etag": "ECIVXNE16EKWC",
      "content-type": "text/xml",
      "content-length": "3102",
      "date": "Thu 02 Feb 2023 21:33:47 GMT"
    },
    "RetryAttempts": 0
  },
  "ETag": "ECIVXNE16EKWC",
  "Distribution": {
    "Id": "E2H2PR2OHJ17TC",
    "ARN": "arn:aws:cloudfront::556730911179:distribution/E2H2PR2OHJ17TC",
    "Status": "Deployed",
    "LastModifiedTime": "datetime.datetime(2023, 2, 2, 21, 29, 35, 959000, tzinfo=tzutc())",
    "InProgressInvalidationBatches": 0,
    "DomainName": "dx4o38vn878h1.cloudfront.net",
    "ActiveTrustedSigners": {
      "Enabled": false,
      "Quantity": 0
Does anyone know a way to return the status (enabled/disabled) of a CloudFront distribution through boto3?
I tried inspecting the output of get_distribution and list_distributions.
The documentation & API for the distribution config mention an 'Enabled' field and say it is used to enable/disable the distribution. It might be worth inspecting the output of get_distribution_config to see if Enabled comes back in the results.
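For example, a quick sketch (reusing the distribution ID from the output above):
import boto3

cf = boto3.client('cloudfront')
resp = cf.get_distribution_config(Id='E2H2PR2OHJ17TC')
print(resp['DistributionConfig']['Enabled'])  # True when enabled, False when disabled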

DevStack listing S3 bucket error "The AWS Access Key Id you provided does not exist in our records"

I followed the installation guide for DevStack here: https://docs.openstack.org/devstack/latest/ and then followed this to configure the keystoneauth middleware: https://docs.openstack.org/swift/latest/overview_auth.html#keystone-auth
But when I tried to list buckets using boto3 with the credentials I generated from openstack ec2 credential create, I got the error "The AWS Access Key Id you provided does not exist in our records".
I would appreciate any help.
My boto3 code is:
import boto3

s3 = boto3.resource(
    's3',
    aws_access_key_id='5d14869948294bb48f9bfe684b8892ca',
    aws_secret_access_key='ffcbcec69fb54622a0185a5848d7d0d2',
)
for bucket in s3.buckets.all():
    print(bucket)
Where the two keys are from the output below:
| Field      | Value                                          |
| access     | 5d14869948294bb48f9bfe684b8892ca               |
| links      | {'self': '10.180.205.202/identity/v3/users/…'} |
| project_id | c128ad4f9a154a04832e41a43756f47d               |
| secret     | ffcbcec69fb54622a0185a5848d7d0d2               |
| trust_id   | None                                           |
| user_id    | 2abd57c56867482ca6cae5a9a2afda29               |
After running the commands #larsks provided, I got public: http://10.180.205.202:8080/v1/AUTH_ed6bbefe5ab44f32b4891fc5e3e55f1f for my Swift endpoint. And just to make sure: my EC2 credential is under the user admin and also the project admin.
When I followed the boto3 code and removed everything starting from v1 in my endpoint, I got the error botocore.exceptions.ClientError: An error occurred () when calling the ListBuckets operation:
And when I kept the AUTH part, I got botocore.exceptions.ClientError: An error occurred (412) when calling the ListBuckets operation: Precondition Failed
The previous problem was resolved by adding enable_service s3api to local.conf and stacking again. This is likely because OpenStack needs to know it's using s3api; the documentation says Swift will be configured to act as an S3 endpoint for Keystone, effectively replacing the nova-objectstore.
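For reference, the change is roughly this excerpt in local.conf (the rest of the file is unchanged):
[[local|localrc]]
enable_service s3api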
Your problem is probably that nowhere are you telling boto3 how to connect to your OpenStack environment, so by default it is trying to connect to Amazon's S3 service (in your example you're also not passing in your access key and secret key, but I'm assuming this was just a typo when creating your example).
If you want to connect to the OpenStack object storage service, you'll need to first get the endpoint for that service from the catalog. You can get this from the command line by running openstack catalog list; you can also retrieve it programmatically if you make use of the openstack Python module.
You can just inspect the output of openstack catalog list and look for the swift service, or you can parse it out using e.g. jq:
$ openstack catalog list -f json |
jq -r '.[]|select(.Name == "swift")|.Endpoints[]|select(.interface == "public")|.url'
https://someurl.example.com/swift/v1
In any case, you need to pass the endpoint to boto3:
>>> import boto3
>>> session = boto3.session.Session()
>>> s3 = session.client(service_name='s3',
... aws_access_key_id='access_key_id_goes_here',
... aws_secret_access_key='secret_key_goes_here',
... endpoint_url='endpoint_url_goes_here')
>>> s3.list_buckets()
{'ResponseMetadata': {'RequestId': 'tx0000000000000000d6a8c-0060de01e2-cff1383c-default', 'HostId': '', 'HTTPStatusCode': 200, 'HTTPHeaders': {'transfer-encoding': 'chunked', 'x-amz-request-id': 'tx0000000000000000d6a8c-0060de01e2-cff1383c-default', 'content-type': 'application/xml', 'date': 'Thu, 01 Jul 2021 17:56:51 GMT', 'connection': 'close', 'strict-transport-security': 'max-age=16000000; includeSubDomains; preload;'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 'larstest', 'CreationDate': datetime.datetime(2018, 12, 5, 0, 20, 19, 4000, tzinfo=tzutc())}, {'Name': 'larstest2', 'CreationDate': datetime.datetime(2019, 3, 7, 21, 4, 12, 628000, tzinfo=tzutc())}, {'Name': 'larstest4', 'CreationDate': datetime.datetime(2021, 5, 12, 18, 47, 54, 510000, tzinfo=tzutc())}], 'Owner': {'DisplayName': 'lars', 'ID': '4bb09e3a56cd451b9d260ad6c111fd96'}}
>>>
Note that if the endpoint url from openstack catalog list includes a version (e.g., .../v1), you will probably want to drop that.
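For example (a hedged illustration using the placeholder URL from above):
catalog_url = 'https://someurl.example.com/swift/v1'  # from `openstack catalog list`
endpoint_url = catalog_url.rsplit('/v1', 1)[0]        # pass this to boto3, without the version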

boto3 - region names and service names

How can I get the list of all the regions, e.g. us-east-1, using boto3? (I am not trying to get the available regions for a particular service, as has been asked already.)
Also, how can I get the names of all the services that AWS provides using boto3, so that I can use them later when creating resources or clients?
I am asking this because when creating sessions, resources, or clients I have to specify these values, and I don't know how to find the exact value to pass.
For the regions, the closest you can get is describe_regions:
ec2 = boto3.client('ec2')
response = [region['RegionName'] for region in ec2.describe_regions(AllRegions=True)['Regions']]
print(response)
which gives:
['af-south-1', 'eu-north-1', 'ap-south-1', 'eu-west-3', 'eu-west-2', 'eu-south-1', 'eu-west-1', 'ap-northeast-3', 'ap-northeast-2', 'me-south-1', 'ap-northeast-1', 'sa-east-1', 'ca-central-1', 'ap-east-1', 'ap-southeast-1', 'ap-southeast-2', 'eu-central-1', 'us-east-1', 'us-east-2', 'us-west-1', 'us-west-2']
For services - I don't think there is any API call for that. You could scrape them from here, but this would not involve boto3.
To get the list of services in AWS, you can call get_available_services on a boto3 Session:
import boto3
boto_session = boto3.session.Session()
list_of_services = boto_session.get_available_services()
print(list_of_services)
which gives you a list of all available services in AWS
['accessanalyzer', 'account', 'acm', 'acm-pca', 'alexaforbusiness', 'amp', 'amplify', 'amplifybackend', 'apigateway', 'apigatewaymanagementapi', 'apigatewayv2', 'appconfig', 'appflow', 'appintegrations', 'application-autoscaling', 'application-insights', 'applicationcostprofiler', 'appmesh', 'apprunner', 'appstream', 'appsync', 'athena', 'auditmanager', 'autoscaling', 'autoscaling-plans', 'backup', 'batch', 'braket', 'budgets', 'ce', 'chime', 'chime-sdk-identity', 'chime-sdk-meetings', 'chime-sdk-messaging', 'cloud9', 'cloudcontrol', 'clouddirectory', 'cloudformation', 'cloudfront', 'cloudhsm', 'cloudhsmv2', 'cloudsearch', 'cloudsearchdomain', 'cloudtrail', 'cloudwatch', 'codeartifact', 'codebuild', 'codecommit', 'codedeploy', 'codeguru-reviewer', 'codeguruprofiler', 'codepipeline', 'codestar', 'codestar-connections', 'codestar-notifications', 'cognito-identity', 'cognito-idp', 'cognito-sync', 'comprehend', 'comprehendmedical', 'compute-optimizer', 'config', 'connect', 'connect-contact-lens', 'connectparticipant', 'cur', 'customer-profiles', 'databrew', 'dataexchange', 'datapipeline', 'datasync', 'dax', 'detective', 'devicefarm', 'devops-guru', 'directconnect', 'discovery', 'dlm', 'dms', 'docdb', 'ds', 'dynamodb', 'dynamodbstreams', 'ebs', 'ec2', 'ec2-instance-connect', 'ecr', 'ecr-public', 'ecs', 'efs', 'eks', 'elastic-inference', 'elasticache', 'elasticbeanstalk', 'elastictranscoder', 'elb', 'elbv2', 'emr', 'emr-containers', 'es', 'events', 'finspace', 'finspace-data', 'firehose', 'fis', 'fms', 'forecast', 'forecastquery', 'frauddetector', 'fsx', 'gamelift', 'glacier', 'globalaccelerator', 'glue', 'grafana', 'greengrass', 'greengrassv2', 'groundstation', 'guardduty', 'health', 'healthlake', 'honeycode', 'iam', 'identitystore', 'imagebuilder', 'importexport', 'inspector', 'iot', 'iot-data', 'iot-jobs-data', 'iot1click-devices', 'iot1click-projects', 'iotanalytics', 'iotdeviceadvisor', 'iotevents', 'iotevents-data', 'iotfleethub', 'iotsecuretunneling', 'iotsitewise', 'iotthingsgraph', 'iotwireless', 'ivs', 'kafka', 'kafkaconnect', 'kendra', 'kinesis', 'kinesis-video-archived-media', 'kinesis-video-media', 'kinesis-video-signaling', 'kinesisanalytics', 'kinesisanalyticsv2', 'kinesisvideo', 'kms', 'lakeformation', 'lambda', 'lex-models', 'lex-runtime', 'lexv2-models', 'lexv2-runtime', 'license-manager', 'lightsail', 'location', 'logs', 'lookoutequipment', 'lookoutmetrics', 'lookoutvision', 'machinelearning', 'macie', 'macie2', 'managedblockchain', 'marketplace-catalog', 'marketplace-entitlement', 'marketplacecommerceanalytics', 'mediaconnect', 'mediaconvert', 'medialive', 'mediapackage', 'mediapackage-vod', 'mediastore', 'mediastore-data', 'mediatailor', 'memorydb', 'meteringmarketplace', 'mgh', 'mgn', 'migrationhub-config', 'mobile', 'mq', 'mturk', 'mwaa', 'neptune', 'network-firewall', 'networkmanager', 'nimble', 'opensearch', 'opsworks', 'opsworkscm', 'organizations', 'outposts', 'panorama', 'personalize', 'personalize-events', 'personalize-runtime', 'pi', 'pinpoint', 'pinpoint-email', 'pinpoint-sms-voice', 'polly', 'pricing', 'proton', 'qldb', 'qldb-session', 'quicksight', 'ram', 'rds', 'rds-data', 'redshift', 'redshift-data', 'rekognition', 'resource-groups', 'resourcegroupstaggingapi', 'robomaker', 'route53', 'route53-recovery-cluster', 'route53-recovery-control-config', 'route53-recovery-readiness', 'route53domains', 'route53resolver', 's3', 's3control', 's3outposts', 'sagemaker', 'sagemaker-a2i-runtime', 'sagemaker-edge', 'sagemaker-featurestore-runtime', 'sagemaker-runtime', 
'savingsplans', 'schemas', 'sdb', 'secretsmanager', 'securityhub', 'serverlessrepo', 'service-quotas', 'servicecatalog', 'servicecatalog-appregistry', 'servicediscovery', 'ses', 'sesv2', 'shield', 'signer', 'sms', 'sms-voice', 'snow-device-management', 'snowball', 'sns', 'sqs', 'ssm', 'ssm-contacts', 'ssm-incidents', 'sso', 'sso-admin', 'sso-oidc', 'stepfunctions', 'storagegateway', 'sts', 'support', 'swf', 'synthetics', 'textract', 'timestream-query', 'timestream-write', 'transcribe', 'transfer', 'translate', 'voice-id', 'waf', 'waf-regional', 'wafv2', 'wellarchitected', 'wisdom', 'workdocs', 'worklink', 'workmail', 'workmailmessageflow', 'workspaces', 'xray']
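These strings are exactly what you pass when building sessions, clients, or resources, for example (the region and service names here are just illustrations):
import boto3

session = boto3.session.Session(region_name='us-east-1')  # a region name from describe_regions()
lambda_client = session.client('lambda')                  # a service name from get_available_services()
s3 = session.resource('s3')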

How to use Runner v2 for an Apache Beam Dataflow job?

My Python code for the Dataflow job looks like this:
import apache_beam as beam
from apache_beam.io.external.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

topic1 = "topic1"
conf = {'bootstrap.servers': 'gcp_instance_public_ip:9092'}

pipeline = beam.Pipeline(options=PipelineOptions())
(pipeline
 | ReadFromKafka(consumer_config=conf, topics=['topic1'])
)
pipeline.run()
As I am using KafkaIO in Python code, someone suggested I use Dataflow Runner v2 (I think v1 doesn't support Python).
As per the Dataflow documentation, I am using this parameter to use Runner v2: --experiments=use_runner_v2. (I have not made any code-level change for switching from v1 to v2.)
I am getting the error below:
http_response, method_config=method_config, request=request)
apitools.base.py.exceptions.HttpBadRequestError: HttpError accessing <https://dataflow.googleapis.com/v1b3/projects/metal-voyaasfger-23424/locations/us-central1/jobs?alt=json>: response: <{'vary': 'Origin, X-Origin, Referer', 'content-type': 'application/json; charset=UTF-8', 'date': 'Wed, 08 Jul 2020 07:23:21 GMT', 'server': 'ESF', 'cache-control': 'private', 'x-xss-protection': '0', 'x-frame-options': 'SAMEORIGIN', 'x-content-type-options': 'nosniff', 'transfer-encoding': 'chunked', 'status': '400', 'content-length': '544', '-content-encoding': 'gzip'}>, content <{
"error": {
"code": 400,
"message": "(5fd1bf4d41e8b7e): The workflow could not be created. Causes: (5fd1bf4d41e8018): The workflow could not be created due to misconfiguration. If you are trying any experimental feature, make sure your project and the specified region support that feature. Contact Google Cloud Support for further help. Experiments enabled for project: [enable_streaming_engine, enable_windmill_service, shuffle_mode=service], experiments requested for job: [use_runner_v2]",
"status": "INVALID_ARGUMENT"
}
}
I have already added a service account (with project owner permission) using the export GOOGLE_APPLICATION_CREDENTIALS command.
Can someone help me find my mistake? Am I using Runner v2 incorrectly?
I would also be really thankful if someone could briefly explain the difference between Runner v1 and Runner v2.
Thanks ... :)
I was able to reproduce your issue. The error message is complaining about the use_runner_v2 experiment because Runner v2 is not enabled for batch jobs.
Experiments enabled for project: [enable_streaming_engine, enable_windmill_service, shuffle_mode=service], experiments requested for job: [use_runner_v2]",
Please try running your job with the --streaming flag added.
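If it helps, a hedged sketch of the pipeline options with streaming enabled (project, region, and bucket values are placeholders):
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=DataflowRunner',
    '--project=your-gcp-project',
    '--region=us-central1',
    '--temp_location=gs://your-bucket/tmp',
    '--streaming',                  # run as a streaming job
    '--experiments=use_runner_v2',  # request Runner v2
])
pipeline = beam.Pipeline(options=options)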

S3A hadoop-aws jar always returns AccessDeniedException

Could anyone please help me figure out why I get the exception below? All I'm trying to do is read some data from a local file in my Spark program and write it to S3. I have the correct secret key and access key specified like this -
Do you think it's related to a version mismatch of some library?
SparkConf conf = new SparkConf();
// add more spark related properties
AWSCredentials credentials = DefaultAWSCredentialsProviderChain.getInstance().getCredentials();
conf.set("spark.hadoop.fs.s3a.access.key", credentials.getAWSAccessKeyId());
conf.set("spark.hadoop.fs.s3a.secret.key", credentials.getAWSSecretKey());
The Java code is plain vanilla -
protected void process() throws JobException {
    JavaRDD<String> linesRDD = _sparkContext.textFile(_jArgs.getFileLocation());
    linesRDD.saveAsTextFile("s3a://my.bucket/" + Math.random() + "final.txt");
}
This is my code and Gradle configuration.
Gradle
ext.libs = [
    aws: [
        lambda: 'com.amazonaws:aws-lambda-java-core:1.2.0',
        // The AWS SDK will dynamically import the X-Ray SDK to emit subsegments for downstream calls
        // made by your function
        //recorderCore: 'com.amazonaws:aws-xray-recorder-sdk-core:1.1.2',
        //recorderCoreAwsSdk: 'com.amazonaws:aws-xray-recorder-sdk-aws-sdk:1.1.2',
        //recorderCoreAwsSdkInstrumentor: 'com.amazonaws:aws-xray-recorder-sdk-aws-sdk-instrumentor:1.1.2',
        // https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk
        javaSDK: 'com.amazonaws:aws-java-sdk:1.11.311',
        recorderSDK: 'com.amazonaws:aws-java-sdk-dynamodb:1.11.311',
        // https://mvnrepository.com/artifact/com.amazonaws/aws-lambda-java-events
        lambdaEvents: 'com.amazonaws:aws-lambda-java-events:2.0.2',
        snsSDK: 'com.amazonaws:aws-java-sdk-sns:1.11.311',
        // https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-emr
        emr: 'com.amazonaws:aws-java-sdk-emr:1.11.311'
    ],
    //jodaTime: 'joda-time:joda-time:2.7',
    //guava: 'com.google.guava:guava:18.0',
    jCommander: 'com.beust:jcommander:1.71',
    //jackson: 'com.fasterxml.jackson.module:jackson-module-scala_2.11:2.8.8',
    jackson: 'com.fasterxml.jackson.core:jackson-databind:2.8.0',
    apacheCommons: [
        lang3: "org.apache.commons:commons-lang3:3.3.2",
    ],
    spark: [
        core: 'org.apache.spark:spark-core_2.11:2.3.0',
        hadoopAws: 'org.apache.hadoop:hadoop-aws:2.8.1',
        //hadoopClient: 'org.apache.hadoop:hadoop-client:2.8.1',
        //hadoopCommon: 'org.apache.hadoop:hadoop-common:2.8.1',
        jackson: 'com.fasterxml.jackson.module:jackson-module-scala_2.11:2.8.8'
    ],
]
Exception
2018-04-10 22:14:22.270 | ERROR | | | |c.f.d.p.s.SparkJobEntry-46
Exception found in job for file type : EMAIL
java.nio.file.AccessDeniedException: s3a://my.bucket/0.253592564392344final.txt: getFileStatus on
s3a://my.bucket/0.253592564392344final.txt:
com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service:
Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID:
62622F7F27793DBA; S3 Extended Request ID: BHCZT6BSUP39CdFOLz0uxkJGPH1tPsChYl40a32bYglLImC6PQo+LFtBClnWLWbtArV/z1SOt68=), S3 Extended Request ID: BHCZT6BSUP39CdFOLz0uxkJGPH1tPsChYl40a32bYglLImC6PQo+LFtBClnWLWbtArV/z1SOt68=
at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158) ~[hadoop-aws-2.8.1.jar:na]
at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101) ~[hadoop-aws-2.8.1.jar:na]
at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1568) ~[hadoop-aws-2.8.1.jar:na]
at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117) ~[hadoop-aws-2.8.1.jar:na]
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1436) ~[hadoop-common-2.8.1.jar:na]
at org.apache.hadoop.fs.s3a.S3AFileSystem.exists(S3AFileSystem.java:2040) ~[hadoop-aws-2.8.1.jar:na]
at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:131) ~[hadoop-mapreduce-client-core-2.6.5.jar:na]
at org.apache.spark.internal.io.HadoopMapRedWriteConfigUtil.assertConf(SparkHadoopWriter.scala:283) ~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:71) ~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1096) ~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1094) ~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1094) ~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) ~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) ~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363) ~[spark-core_2.11-2.3.0.jar:2.3.0]
Once you are working with the Hadoop Configuration classes directly, you need to strip out the spark.hadoop prefix, so just use fs.s3a.access.key, etc.
All the options are defined in the class org.apache.hadoop.fs.s3a.Constants; if you reference them, you'll avoid typos too.
One thing to consider: all the source for Spark and Hadoop is public, so there's nothing to stop you taking that stack trace, setting some breakpoints, and trying to run this in your IDE. It's what we normally do ourselves when things get bad.