Connecting DMS to S3

We are trying to get DMS set up with an S3 source; however, we are unable to connect the replication instance to the source S3 endpoint.
When we run a connection test on the source endpoint, the error we receive is:
Error Details: [errType=ERROR_RESPONSE, status=1020414, errMessage= Failed to connect to database., errDetails=]
We have followed the documentation; however, we are still unable to get the connection to work. The bucket is within the VPC that the replication instance has access to, and the IAM role has the s3:GetObject, s3:ListBucket, and dms:* permissions. I'm 95% sure that the JSON mapping file is set up correctly, with schema and table names pointing to the right place.
Due to the lack of error messages or detailed reasons why we can't connect to the source database (the S3 bucket/CSV file), debugging this feels a tad hit and miss. We are using the Amazon Console and not the CLI, if that makes much of a difference.

I had this same error.
Check this troubleshooting guide. It covers the basic configuration problems you might run into.
My answer wasn't there, though, and I couldn't find it anywhere, not even by asking in the official forums.
In my case, for some reason I thought I should use the full bucket ARN in the "Bucket Name" field, like "arn:aws:s3:::my-bucket", probably because I had to use an ARN for the role in the previous field.
The error message you get when you test the connection will not be clear either; it only says it couldn't connect to the bucket. Anyway, you don't need to provide an ARN, just the bucket's name, as in "my-bucket".
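For reference, here is a rough sketch of that endpoint creation with boto3; the identifier, role ARN, bucket, folder, and the external-table-definition file name are all placeholders. Note that only the service access role takes an ARN, while BucketName is the plain bucket name.

import boto3

dms = boto3.client("dms")

# The DMS "external table definition" JSON describing the schema/table/columns
# of the CSV files (hypothetical local file name).
with open("external-table-definition.json") as f:
    table_definition = f.read()

dms.create_endpoint(
    EndpointIdentifier="s3-source-endpoint",   # placeholder identifier
    EndpointType="source",
    EngineName="s3",
    S3Settings={
        "ServiceAccessRoleArn": "arn:aws:iam::123456789012:role/dms-s3-access",  # the role takes an ARN
        "BucketName": "my-bucket",             # just the name, NOT arn:aws:s3:::my-bucket
        "BucketFolder": "csv/",                # optional prefix
        "ExternalTableDefinition": table_definition,
    },
)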

Related

Jenkins job trigger when s3 bucket updated

I am looking for a way to trigger the Jenkins job whenever the s3 bucket is updated with a particular file format.
I have tried the Lambda function approach with "Add trigger -> S3 bucket PUT method", following this article, but it's not working. I have also found out that AWS SNS and AWS SQS can be used for this, but some say that approach is outdated. So which is the simplest way to trigger the Jenkins job when the S3 bucket is updated?
I just want a trigger whenever the zip file is uploaded by job A in Jenkins 1 to the S3 bucket called 'testbucket' in AWS environment 2. The two Jenkins instances are in different AWS accounts, each in its own private VPC. I have attached my Jenkins workflow as a picture; please refer to the picture below.
The approach you are using seems solid and a good way to go. I'm not sure what specific issue you are having, so I'll list a couple of things that could explain why this might not be working for you:
Permissions issue - Check that the Lambda function can be invoked by the S3 service. If you are doing this in the console (manually), you probably don't have to worry about it, since the permission should be set up automatically. If you're doing this through infrastructure as code, it's something you need to add.
Lambda VPC config - Your Lambda function will need to run in the same VPC/subnet that the Jenkins instance runs in. By default, Lambda is not associated with a VPC and will not have access to the Jenkins instance (unless it's publicly available over the internet).
If you want to continue down the SNS/SQS path, I found another Stack Overflow post that describes that setup: Trigger Jenkins job when a S3 file is updated
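If it helps, here is a minimal sketch of the Lambda side of this approach. The environment variables, job name, and parameter name are hypothetical; it assumes the Jenkins job has "Trigger builds remotely" enabled and that the Lambda's VPC config can reach Jenkins (your Jenkins may additionally require API credentials).

import os
import urllib.parse
import urllib.request

JENKINS_URL = os.environ["JENKINS_URL"]   # e.g. http://jenkins.internal:8080 (placeholder)
JOB_NAME = os.environ["JOB_NAME"]         # downstream job to trigger (placeholder)
JOB_TOKEN = os.environ["JOB_TOKEN"]       # the job's remote-trigger token (placeholder)

def handler(event, context):
    # Invoked by the bucket's s3:ObjectCreated:Put notification.
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        if not key.endswith(".zip"):      # only react to the zip uploads
            continue
        params = urllib.parse.urlencode({"token": JOB_TOKEN, "S3_KEY": key})
        url = f"{JENKINS_URL}/job/{JOB_NAME}/buildWithParameters?{params}"
        with urllib.request.urlopen(url, timeout=10) as resp:
            print(f"Triggered {JOB_NAME} for {key}: HTTP {resp.status}")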

MediaConvert Network Failure

My AWS Elemental MediaConvert job is showing:
undefined (undefined)
Network Failure: {"message":"Network Failure","code":"NetworkingError","time":"xxx","hostname":"mediaconvert-eu-west-1.amazonaws.com","retryable":true}
The undefined network failure error is a sort of catch-all, so a closer look at the job configuration and at access permissions on any referenced resources (S3 buckets, IAM roles, etc.) would be required to identify the exact cause. I would recommend opening a ticket with AWS Support for additional help.
When I've seen this error, the most common reason was because the user was specifying a MediaConvert queue that belonged to a different AWS account. Check to make sure you aren't referencing any resources that your account/role don't have access to, and make sure you're running the job in the correct region (if any referenced resources are region-specific).
A future update to MediaConvert will address this generalized error to make it more specific and actionable by the user.
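As a quick sanity check, a boto3 sketch like the following can list the queues that actually exist in your account and region, so you can compare them against whatever the job references (the region is an assumption taken from the hostname in the error above):

import boto3

region = "eu-west-1"   # from the hostname in the error above

# MediaConvert uses an account-specific endpoint, hence the two-step client creation.
endpoint = boto3.client("mediaconvert", region_name=region).describe_endpoints()["Endpoints"][0]["Url"]
mc = boto3.client("mediaconvert", region_name=region, endpoint_url=endpoint)

for queue in mc.list_queues()["Queues"]:
    print(queue["Name"], queue["Arn"])   # compare against the queue ARN in your job settings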

AWS SageMaker GroundTruth permissions issue (can't read manifest)

I'm trying to run a simple GroundTruth labeling job with a public workforce. I upload my images to S3, start creating the labeling job, generate the manifest automatically using their tool, and explicitly specify a role that most certainly has permissions on both S3 buckets (input and output) as well as full access to SageMaker. Then I create the job (the standard rest of the setup -- I just wanted to be clear that I'm doing all of that).
At first, everything looks fine. All green lights, it says it's in progress, and the images are properly showing up at the bottom where the dataset is. However, after a few minutes, the status changes to Failure, and the reason for failure reads: ClientError: Access Denied. Cannot access manifest file: arn:aws:sagemaker:us-east-1:<account number>:labeling-job/<job name> using roleArn: null.
I also get the error underneath (where there used to be images but now there are none):
The specified key <job name>/manifests/output/output.manifest isn't present in the S3 bucket <output bucket>.
I'm very confused for a couple of reasons. First of all, this is a super simple job. I'm just trying to do the most basic bounding box example I can think of. So this should be a very well-tested path. Second, I'm explicitly specifying a role arn, so I have no idea why it's saying it's null in the error message. Is this an Amazon glitch or could I be doing something wrong?
The role must include SageMakerFullAccess and access to the S3 bucket, so it looks like you've got that covered :)
Please check that (see the sketch after this list):
the user creating the labeling job has Cognito permissions: https://docs.aws.amazon.com/sagemaker/latest/dg/sms-getting-started-step1.html
the manifest exists and is at the right S3 location.
the bucket is in the same region as SageMaker.
the bucket doesn't have any bucket policy restricting access.
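A rough boto3 sketch of the S3-side checks above (the bucket and manifest key are placeholders):

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
bucket, manifest_key = "my-groundtruth-bucket", "manifests/output/output.manifest"

# Does the manifest object exist where the job expects it? (raises a 404 if not)
s3.head_object(Bucket=bucket, Key=manifest_key)

# Is the bucket in the same region as the labeling job? (us-east-1 is reported as None)
print(s3.get_bucket_location(Bucket=bucket)["LocationConstraint"])

# Is there a bucket policy that could be restricting access?
try:
    print(s3.get_bucket_policy(Bucket=bucket)["Policy"])
except ClientError as e:
    if e.response["Error"]["Code"] != "NoSuchBucketPolicy":
        raise
    print("no bucket policy")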
If that still doesn't fix it, I'd recommend opening a support ticket with the labeling job id, etc.
Julien (AWS)
There's a bug whereby the console will sometimes say something like 401 ValidationException: The specified key s3prefix/smgt-out/yourjobname/manifests/output/output.manifest isn't present in the S3 bucket yourbucket. Request ID: a08f656a-ee9a-4c9b-b412-eb609d8ce194, but that's not the actual problem; for some reason the console is displaying the wrong error message. If you use the API (or AWS CLI) to call DescribeLabelingJob, like
aws sagemaker describe-labeling-job --labeling-job-name yourjobname
you will see the actual problem. In my case, one of the S3 files that define the UI instructions was missing.
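The same check with boto3, if that's handier (the job name is a placeholder); the FailureReason field usually carries the error the console hides:

import boto3

sm = boto3.client("sagemaker")
job = sm.describe_labeling_job(LabelingJobName="yourjobname")
print(job["LabelingJobStatus"])
print(job.get("FailureReason", "no failure reason reported"))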
I had the same issue when I tried to write to a different bucket from the one that had been used successfully before.
Apparently the IAM role can be granted permissions for a particular bucket only.
I would suggest referring to CloudWatch Logs: look under CloudWatch >> CloudWatch Logs >> Log groups for the /aws/sagemaker/LabelingJobs group. I had ticked off every point from another answer, but my pre-processing Lambda function had the wrong id for my region, and the error was obvious in the logs.
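A rough sketch for pulling those logs with boto3 (the filter pattern is just an example):

import boto3

logs = boto3.client("logs")
resp = logs.filter_log_events(
    logGroupName="/aws/sagemaker/LabelingJobs",
    filterPattern="?Error ?Exception ?denied",   # example pattern
    limit=50,
)
for event in resp["events"]:
    print(event["message"])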

Writing parquet file from transient EMR Spark cluster fails on S3 credentials

I am creating a transient EMR Spark cluster programmatically, reading a vanilla S3 object, converting it to a Dataframe and writing a parquet file.
Running on a local cluster (with S3 credentials provided) everything works.
Spinning up a transient cluster and submitting the job fails on the write to S3 with the error:
AmazonS3Exception: The AWS Access Key Id you provided does not exist in our records.
But my job is able to read the vanilla object from S3, and it logs to S3 correctly. Additionally, I see that EMR_EC2_DefaultRole is set as the EC2 instance profile, that EMR_EC2_DefaultRole has the proper S3 permissions, and that my bucket has a policy set for EMR_EC2_DefaultRole.
I gather that the 'filesystem' I am trying to write the parquet file to is special, but I cannot figure out what needs to be set for this to work.
Arrrrgggghh! Basically as soon as I had posted my question, the light bulb went off.
In my Spark job I had
import com.amazonaws.auth.{AWSCredentials, DefaultAWSCredentialsProviderChain}

// Resolve credentials locally and force them into the s3n connector's configuration.
val cred: AWSCredentials = new DefaultAWSCredentialsProviderChain().getCredentials
session.sparkContext.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", cred.getAWSAccessKeyId)
session.sparkContext.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", cred.getAWSSecretKey)
which were necessary when running locally in a test cluster, but clobbered the good values when running on EMR. I changed the block to
// overrideCredentials (e.g. an Option[AWSCredentials]) is only populated by the test
// harness when running locally, so nothing is overridden on EMR.
overrideCredentials.foreach { cred =>
  session.sparkContext.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", cred.getAWSAccessKeyId)
  session.sparkContext.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", cred.getAWSSecretKey)
}
and pushed the credential retrieval into my test harness (which is, of course, where it should have been all the time.)
If you are running on EC2 with the open-source Hadoop code (not EMR), use the S3A connector, as it falls back to the EC2 IAM instance credential provider as the last of the credential providers it tries by default.
The IAM credentials are short-lived and include a session key: if you are copying them, you'll need to refresh them at least every hour and set all three items: access key, secret key, and session key.
Like I said, S3A handles this, with the IAM credential provider issuing a new GET against the instance metadata HTTP server whenever the previous key expires.
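For what it's worth, a PySpark sketch of that setup (the bucket paths are placeholders, and it assumes hadoop-aws/S3A is on the classpath and the nodes have an instance profile); the same properties can be set from Scala:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3a-instance-profile-sketch").getOrCreate()
conf = spark.sparkContext._jsc.hadoopConfiguration()

# Leave fs.s3a.access.key / fs.s3a.secret.key unset so the S3A credential chain can fall
# back to the instance profile; pinning the provider explicitly is optional.
conf.set("fs.s3a.aws.credentials.provider",
         "com.amazonaws.auth.InstanceProfileCredentialsProvider")

df = spark.read.text("s3a://my-input-bucket/vanilla-object.txt")   # placeholder paths
df.write.parquet("s3a://my-output-bucket/output.parquet")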

All my AWS datapipelines have stopped working with Validation error

I use AWS data pipelines to automatically back up dynamodb tables to S3 on a weekly basis.
All of my data pipelines stopped working two weeks ago.
After some investigation, I see that EMR fails with "validation error" and "Terminated with errors No active keys found for user account". As a result, all the jobs time out.
Any ideas what this means?
I ruled out changes to the list of instance types that are allowed to be used with EMR.
I also tried to read the EMR logs, but it looks like it doesn't even get to the point of creating logs (or I am looking for them in the wrong place).
The AWS account used to launch EMR has keys (an access key and a secret key). Could you check whether those keys have been deleted? You need to log in to the AWS console and check that keys exist for your account.
If not, re-create the keys and use them in the code that launches EMR.
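A quick way to check this (sketch; the user name is a placeholder, and for root-account keys you'd look under "My Security Credentials" in the console instead):

import boto3

iam = boto3.client("iam")
for key in iam.list_access_keys(UserName="pipeline-user")["AccessKeyMetadata"]:
    print(key["AccessKeyId"], key["Status"], key["CreateDate"])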
Basically, @Sandesh Deshmane answered my question correctly.
For future reference and clarity I explain the situation here too:
What happened was that originally I used the root account and console to create the pipelines. Later I decided to follow the best practices and removed my root account keys.
A few days later (my pipelines are scheduled to run weekly), when they all failed, I did not make the connection and suspected other problems.
I think one good way to avoid this (if you want to use the console) would be to log in to the console as an IAM user and create the pipelines.
Or you can use the command-line tools to create them with IAM credentials.
The real solution now (I think it was not available when the console was first introduced) is to assign the correct IAM roles on the first page when you are creating your pipeline in the console: in the "security/access" section, change it from default to custom and select the correct roles there.
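If you go the CLI/SDK route instead, here is a rough sketch of where those roles live in a pipeline definition (the pipeline name is a placeholder and the role names are the AWS defaults); the role and resourceRole fields on the Default object are what that console section sets:

import boto3

dp = boto3.client("datapipeline")
pipeline_id = dp.create_pipeline(name="weekly-ddb-backup", uniqueId="weekly-ddb-backup-1")["pipelineId"]

dp.put_pipeline_definition(
    pipelineId=pipeline_id,
    pipelineObjects=[
        {
            "id": "Default",
            "name": "Default",
            "fields": [
                {"key": "role", "stringValue": "DataPipelineDefaultRole"},
                {"key": "resourceRole", "stringValue": "DataPipelineDefaultResourceRole"},
                {"key": "scheduleType", "stringValue": "cron"},
                # ...the rest of the DynamoDB-export definition goes here
            ],
        },
    ],
)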