AWS CloudSearch: cs-configure-from-batches Resource not found

I am attempting to add index fields to a CloudSearch domain I have just created, but I am getting errors doing so.
I create the domain using:
aws cloudsearch create-domain --domain-name test-run
If I view information about the domain using the command below, the results indicate that the domain has been created.
aws cloudsearch describe-domains --domain-names test-run
I can also view this information using the CloudSearch console.
However, when I try to run
cs-configure-from-batches --domain-name test-run --source scripts/SeedFile.json -c ~/.aws/credentials
I get the error below:
Domain not found: test-run (Service: AmazonCloudSearchv2; Status Code: 409; Error Code: ResourceNotFound;

Solution:
The tool's endpoint defaults to a US East endpoint. I needed to point it at my us-west domain by passing --endpoint on the command.
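For illustration only, a sketch of the fixed invocation; the exact endpoint depends on your domain's region (it is shown in the describe-domains output), and the us-west-2 URL below is just an assumed example:
cs-configure-from-batches --domain-name test-run --source scripts/SeedFile.json -c ~/.aws/credentials --endpoint https://cloudsearch.us-west-2.amazonaws.com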

Related

AWS cannot delete RestApi

When deleting an AWS REST API through the AWS console UI or from the terminal with the command:
aws apigateway delete-rest-api --rest-api-id 1234123412
(as mentioned in the AWS docs)
I got an error telling me to delete the base path mappings related to the RestApi in my domain.
I tried deleting them with the following command given in the AWS docs:
aws apigateway delete-base-path-mapping --domain-name 'api.domain.tld' --base-path 'dev'
I got the error: An error occurred (NotFoundException) when calling the DeleteBasePathMapping operation: Invalid base path mapping identifier specified
Delete the corresponding domain name from the UI (under 'Custom domain names').
After that, the RestApi can be deleted.
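If you prefer to stay in the terminal, roughly the same cleanup should be possible with the CLI; 'api.domain.tld' and the REST API id below are the values from the question, and deleting the custom domain name removes its base path mappings along with it:
# list the base path mappings attached to the custom domain
aws apigateway get-base-path-mappings --domain-name 'api.domain.tld'
# remove the custom domain itself (the CLI equivalent of deleting it in the console)
aws apigateway delete-domain-name --domain-name 'api.domain.tld'
# now the REST API delete should go through
aws apigateway delete-rest-api --rest-api-id 1234123412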

AWS credentials required for Common Crawl S3 buckets

I'm trying to access the Common Crawl news S3 bucket, but I keep getting a "fatal error: Unable to locate credentials" message. Any suggestions for how to get around this? As far as I was aware, Common Crawl doesn't even require credentials.
From News Dataset Available – Common Crawl:
You can access the data even without an AWS account by adding the command-line option --no-sign-request.
I tested this by launching a new Amazon EC2 instance (without an IAM role) and issuing the command:
aws s3 ls s3://commoncrawl/crawl-data/CC-NEWS/
It gave me the error: Unable to locate credentials
I then ran it with the additional parameter:
aws s3 ls s3://commoncrawl/crawl-data/CC-NEWS/ --no-sign-request
It successfully listed the directories.
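The same option works for copying objects down; as a small usage sketch (the key below is a placeholder, not a real file from the listing):
aws s3 cp s3://commoncrawl/crawl-data/CC-NEWS/path/to/file.warc.gz . --no-sign-request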

aws access s3 from spark using IAM role

I want to access S3 from Spark without configuring any access and secret keys; I want to use the IAM role instead, so I followed the steps given in s3-spark.
But it still does not work from my EC2 instance (which is running standalone Spark).
It works when I test with the AWS CLI:
[ec2-user@ip-172-31-17-146 bin]$ aws s3 ls s3://testmys3/
2019-01-16 17:32:38 130 e.json
but it did not work when I tried the following:
scala> val df = spark.read.json("s3a://testmys3/*")
I get the error below:
19/01/16 18:23:06 WARN FileStreamSink: Error while looking for metadata directory.
com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: E295957C21AFAC37, AWS Error Code: null, AWS Error Message: Bad Request
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031)
at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:994)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:297)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.spark.sql.execution.datasources.DataSource$.org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary(DataSource.scala:616)
This config worked:
./spark-shell \
--packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.3 \
--conf spark.hadoop.fs.s3a.endpoint=s3.us-east-2.amazonaws.com \
--conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.InstanceProfileCredentialsProvider \
--conf spark.executor.extraJavaOptions=-Dcom.amazonaws.services.s3.enableV4=true \
--conf spark.driver.extraJavaOptions=-Dcom.amazonaws.services.s3.enableV4=true
"400 Bad Request" is fairly unhelpful, and not only does S3 not provide much, the S3A connector doesn't date print much related to auth either. There's a big section on troubleshooting the error
The fact it got as far as making a request means that it has some credentials, only the far end doesn't like them
Possibilities:
Your IAM role doesn't have permission for s3:ListBucket. See IAM role permissions for working with s3a (a policy sketch follows this list).
Your bucket name is wrong.
There are settings in fs.s3a or the AWS_ environment variables which take priority over the IAM role, and they are wrong.
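As a rough illustration of the first possibility only (not the answerer's own policy), the instance role needs at least list access on the bucket and read access on its objects; the role name and policy name below are made up for the example, and the bucket is the one from the question:
# attach an inline read-only S3 policy to the (hypothetical) instance role
aws iam put-role-policy \
  --role-name my-ec2-instance-role \
  --policy-name s3a-read-testmys3 \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {"Effect": "Allow",
       "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
       "Resource": "arn:aws:s3:::testmys3"},
      {"Effect": "Allow",
       "Action": ["s3:GetObject"],
       "Resource": "arn:aws:s3:::testmys3/*"}
    ]
  }'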
You should automatically have IAM auth as an authentication mechanism with the S3A connector; it's the one which is checked last, after config and env vars.
Have a look at what is set in fs.s3a.aws.credentials.provider - it must be unset or contain the option com.amazonaws.auth.InstanceProfileCredentialsProvider.
Assuming you also have the hadoop command available, grab storediag:
hadoop jar cloudstore-0.1-SNAPSHOT.jar storediag s3a://testmys3/
It should dump what it is up to regarding authentication.
Update
As the original poster has commented, the problem was that v4 authentication is required on that specific S3 endpoint. This can be enabled on the 2.7.x version of the S3A client, but only via Java system properties. For 2.8+ there are fs.s3a. options you can set instead.
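Purely as a sketch of the 2.8+ route (not taken from the answer), and assuming a matching hadoop-aws 2.8+ JAR and AWS SDK are already on the classpath, pointing the connector at the regional endpoint is normally enough to get V4 signing without the enableV4 system properties:
./spark-shell \
--conf spark.hadoop.fs.s3a.endpoint=s3.us-east-2.amazonaws.com \
--conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.InstanceProfileCredentialsProvider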
Step 1. Configure these properties in the cluster framework's core-site.xml (e.g. for YARN), then restart YARN:
fs.s3a.aws.credentials.provider = com.cloudera.com.amazonaws.auth.InstanceProfileCredentialsProvider
fs.s3a.endpoint = s3-ap-northeast-2.amazonaws.com
fs.s3.impl = org.apache.hadoop.fs.s3a.S3AFileSystem
Step 2. Test from the Spark shell as follows:
val rdd=sc.textFile("s3a://path/file")
rdd.count()
rdd.take(10).foreach(println)
It works for me

HTTP 403 when sending metrics to CloudWatch in Frankfurt, works in Ireland

We use mon-put-instance-data.pl to send custom metrics (RAM and disk usage) to CloudWatch.
I set this up following the AWS documentation. We use instance roles to give the instances the right to call CloudWatch; we do not use access keys.
This works like a charm for our Ireland (eu-west-1) instances but fails for our Frankfurt (eu-central-1) instances, where I get this error message:
$ /home/ec2-user/aws-scripts-mon/mon-put-instance-data.pl --mem-util --mem-used --mem-avail --swap-util --swap-used --disk-path=/ --disk-space-util --disk-space-used --disk-space-avail --aws-iam-role=instancerole
ERROR: Failed to call CloudWatch: HTTP 403. Message: The security token included in the request is invalid
For more information, run 'mon-put-instance-data.pl --help'
Note that the role instancerole is correctly configured on EC2 instances in both Ireland and Frankfurt.
What can I do to fix this?
It turns out that because Frankfurt is a new region, it does not support the old version of the CloudWatch scripts. I was running version 1.1.0; updating to 1.2.1 fixed the issue.
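Once the newer scripts are in place, a quick way to confirm the Frankfurt endpoint accepts the call without actually publishing data is the script's --verify flag; this sketch assumes the same install path and IAM role as in the question:
/home/ec2-user/aws-scripts-mon/mon-put-instance-data.pl --mem-util --aws-iam-role=instancerole --verify --verbose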

What's wrong with my AWS CLI configuration?

I'm attempting to use the AWS CLI tool to upload a file to Amazon Glacier. I installed awscli using pip:
sudo pip install awscli
I created a new AWS IAM group example with AmazonGlacierFullAccess permissions.
I created a new AWS IAM user example and added the user to the example group. I also created a new access key for the user.
I created a new AWS Glacier Vault example and edited the policy document to allow the example user to allow actions glacier:* with no conditions.
I then ran aws configure and added the "Access Key ID" and "Secret Access Key" as well as the default region.
When I run:
aws glacier list-vaults
I get the error:
aws: error: argument --account-id is required
If I add the account ID:
aws --account-id=[example user account ID] glacier list-vaults
I get this error:
A client error (UnrecognizedClientException) occurred when calling the ListVaults operation: No account found for the given parameters
I figured I might have gotten something in the group assignment wrong, so I added the AdministratorAccess policy directly to the example user. Now I can run commands such as aws s3 ls, but I still cannot run aws glacier list-vaults without getting the aws: error: argument --account-id is required error.
Have I missed something in my AWS configuration? How can I further troubleshoot this issue?
It looks like for AWS Glacier you need to supply the account ID; see List Vaults (GET vaults).
You can get your account ID (12 digits) from the Support page, at the top right of your AWS dashboard.
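As a usage note taken from the Glacier documentation rather than this answer, the --account-id argument also accepts a single hyphen, which tells Glacier to use the account that owns the credentials making the request:
aws glacier list-vaults --account-id -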