AWS EFS - Script to create mount target after creating the file system

I am writing a script that will create an EFS file system with a name from input. I am using the AWS SDK for PHP Version 3.
I am able to create the file system using the createFileSystem command. This new file system is not usable until it has a mount target created. If I run the CreateMountTarget command after the createFileSystem command then I receive an error that the file system's life cycle state is not in the 'available' state.
I have tried using createFileSystemAsync to create a promise and calling the wait function on that promise to force the script to run synchronously. However, the promise is always fulfilled while the file system is still in the 'creating' life cycle state.
Is there a way to force the script to wait for the file system to be in the available state using the AWS SDK?

One way is to check the status of the file system with the DescribeFileSystems API. In the response, look at the LifeCycleState; once it is 'available', fire the CreateMountTarget API. You can keep calling DescribeFileSystems in a loop with a few seconds' delay until the LifeCycleState is 'available'.
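A minimal polling sketch with the AWS SDK for PHP v3 (the region, CreationToken, and $subnetId are placeholders I've chosen for illustration; a real script should also cap the number of attempts):
<?php
require 'vendor/autoload.php';

use Aws\Efs\EfsClient;

$efs = new EfsClient(['region' => 'us-east-1', 'version' => '2015-02-01']);

// Kick off creation; the response comes back while the file system
// is still in the 'creating' life cycle state.
$created = $efs->createFileSystem(['CreationToken' => uniqid('my-fs-')]);
$fileSystemId = $created['FileSystemId'];

// Poll DescribeFileSystems until the state flips to 'available'.
do {
    sleep(5);
    $result = $efs->describeFileSystems(['FileSystemId' => $fileSystemId]);
    $state = $result['FileSystems'][0]['LifeCycleState'];
} while ($state !== 'available');

// Now the mount target can be created without the life-cycle error.
$efs->createMountTarget([
    'FileSystemId' => $fileSystemId,
    'SubnetId' => $subnetId, // placeholder: the subnet to expose the mount target in
]);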

It looks like you want a waiter for FileSystemAvailable, but the elasticfilesystem service description doesn't specify one. I'd file an issue on GitHub asking for one. You'd need to wait for DescribeFileSystems to report a LifeCycleState of available.
In the meantime, you can probably write your own with something like the following, per the waiters guide.
{
    "version": 2,
    "waiters": {
        "FileSystemAvailable": {
            "delay": 15,
            "operation": "DescribeFileSystems",
            "maxAttempts": 40,
            "acceptors": [
                {
                    "expected": "available",
                    "matcher": "pathAll",
                    "state": "success",
                    "argument": "FileSystems[].LifeCycleState"
                },
                {
                    "expected": "deleted",
                    "matcher": "pathAny",
                    "state": "failure",
                    "argument": "FileSystems[].LifeCycleState"
                },
                {
                    "expected": "deleting",
                    "matcher": "pathAny",
                    "state": "failure",
                    "argument": "FileSystems[].LifeCycleState"
                }
            ]
        }
    }
}
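I'd expect something like the following to work in the meantime, since the SDK's Aws\Waiter class accepts a complete waiter configuration at construction time (a sketch, assuming an EfsClient instance $efs and the new file system's ID in $fileSystemId):
use Aws\Waiter;

$waiter = new Waiter($efs, 'FileSystemAvailable', ['FileSystemId' => $fileSystemId], [
    'delay' => 15,
    'operation' => 'DescribeFileSystems',
    'maxAttempts' => 40,
    'acceptors' => [
        ['expected' => 'available', 'matcher' => 'pathAll', 'state' => 'success',
         'argument' => 'FileSystems[].LifeCycleState'],
        ['expected' => 'deleted', 'matcher' => 'pathAny', 'state' => 'failure',
         'argument' => 'FileSystems[].LifeCycleState'],
        ['expected' => 'deleting', 'matcher' => 'pathAny', 'state' => 'failure',
         'argument' => 'FileSystems[].LifeCycleState'],
    ],
]);

// Blocks until an acceptor matches; throws if a failure state is hit
// or maxAttempts is exhausted.
$waiter->promise()->wait();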
Promises in the AWS SDK for PHP are for making HTTP requests concurrently. They don't help in this case because the API call returns as soon as EFS has accepted the request and started its own asynchronous creation task.
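To make that concrete, here is why waiting on the promise isn't enough (a sketch; the field access assumes the documented shape of the CreateFileSystem response):
// The promise is fulfilled when the HTTP response arrives, not when
// the file system finishes creating.
$promise = $efs->createFileSystemAsync(['CreationToken' => uniqid('my-fs-')]);
$result = $promise->wait();

echo $result['LifeCycleState']; // typically still 'creating' at this point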

Related

How to add shareIdentifier to an AWS EventBridge rule for scheduled execution of an AWS Batch job

I configured an AWS EventBridge rule (via the web GUI) for running an AWS Batch job. The rule is triggered, but I am getting the following error after invocation:
shareIdentifier must be specified. (Service: AWSBatch; Status Code: 400; Error Code: ClientException; Request ID: 07da124b-bf1d-4103-892c-2af2af4e5496; Proxy: null)
My job uses a scheduling policy and needs shareIdentifier to be set, but I don't know how to set it. Here is a screenshot from the configuration of the rule:
There are no additional settings for subsequent arguments/parameters of the job; the only thing I can configure is retries. I also checked the aws-cli command for putting a rule (https://awscli.amazonaws.com/v2/documentation/api/latest/reference/events/put-rule.html), but it doesn't seem to have any additional settings. Any suggestions on how to solve it? Or working examples?
Edited:
I ended up using the Java SDK for AWS Batch: https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-batch. I have a scheduled method that periodically spawns jobs with the following piece of code:
import com.amazonaws.services.batch.AWSBatch;
import com.amazonaws.services.batch.AWSBatchClientBuilder;
import com.amazonaws.services.batch.model.SubmitJobRequest;
import com.amazonaws.services.batch.model.SubmitJobResult;

AWSBatch client = AWSBatchClientBuilder.standard().withRegion("eu-central-1").build();

// shareIdentifier can be passed directly on the SubmitJob request.
SubmitJobRequest request = new SubmitJobRequest()
        .withJobName("example-test-job-java-sdk")
        .withJobQueue("job-queue")
        .withShareIdentifier("default")
        .withJobDefinition("job-type");

SubmitJobResult response = client.submitJob(request);
log.info("job spawn response: {}", response);
Have you tried providing additional settings to your target via the input transformer, as referenced in the AWS docs (AWS Batch Jobs as EventBridge Targets)?
FWIW I'm running into the same problem.
I had a similar issue: from the CLI and the GUI, I just couldn't find a way to pass ShareIdentifier from an EventBridge rule. In the end I had to use a state machine (Step Function) instead:
"States": {
"Batch SubmitJob": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobName": <name>,
"JobDefinition": <Arn>,
"JobQueue": <QueueName>,
"ShareIdentifier": <Share>
},
...
As you can see, it handles ShareIdentifier fine.

EC2 Instance Vanished

We have a peculiar situation today: one of our EC2 instances has disappeared from the console, and we aren't sure what caused this. CloudTrail doesn't have any termination event against this instance ID.
The last noted CloudTrail event for the instance ID that went down goes something like this:
{
    "eventVersion": "1.08",
    "userIdentity": {
        "type": "AWSService",
        "invokedBy": "ec2.amazonaws.com"
    },
    "eventTime": "2022-03-23T05:46:40Z",
    "eventSource": "sts.amazonaws.com",
    "eventName": "AssumeRole",
    "awsRegion": "ap-south-1",
    "sourceIPAddress": "ec2.amazonaws.com",
    "userAgent": "ec2.amazonaws.com",
    "requestParameters": {
        "roleArn": "arn:aws:iam::2************:role/ec2-instance-***********",
        "roleSessionName": "i-06135ad01bb90****"
    },
    "responseElements": {
        "credentials": {
            "accessKeyId": "<redacted>",
            "sessionToken": "<redacted>",
            "expiration": "Mar 23, 2022, 12:01:34 PM"
        }
    },
    "requestID": "d9882911-39e7-449b-9701-***********",
    "eventID": "0fa1b79b-08aa-48e6-8232-***********",
    "readOnly": true,
    "resources": [
        {
            "accountId": "2************",
            "type": "AWS::IAM::Role",
            "ARN": "arn:aws:iam::2************:role/ec2-instance-***********"
        }
    ],
    "eventType": "AwsApiCall",
    "managementEvent": true,
    "recipientAccountId": "2************",
    "sharedEventID": "4b842373-e89d-438b-be3b-*********",
    "eventCategory": "Management"
}
The only thing I can think of is either a hardware failure on the AWS side or some crude command run within the OS by a user that took the instance down. Unfortunately, we don't have AWS Developer Support, as that's quite costly.
Has anyone faced anything similar? Any leads on how i can go ahead to find the root cause?
For anyone who is interested or may face this in the future: we had to opt for AWS Developer Support to get an answer. Here is what they had to say:
From the case notes, I understood that the instance 'i-06135ad01bb******' was missing from yesterday. However, you tried to check CloudTrail and could not find any traces of termination. Please correct me if I misunderstood your query.
Upon reviewing the case description, I started checking the instance using our internal tools and observed that the instance got terminated on '2022-03-23 06:33 UTC' with the reason 'INSTANCE-INITIATED-SHUTDOWN'.
This means that the shutdown was initiated from the OS. Please allow me to inform you that AWS engineers do not have access or visibility to the customer's instance/OS level due to data privacy[1] and the shared responsibility model[2]. Hence, I will not be in a position to check how the shutdown call got initiated.
To further investigate this, I checked using our internal tools and could see that you have selected the 'termination' option on shutdown, which means that when an instance gets shut down it will be automatically terminated. So, I would request you to change the shutdown behavior from 'termination' to 'stop'. With this, if the OS initiates a shutdown by any chance, the instance will be stopped instead of getting terminated.
As per my analysis, I can confirm that the AWS infrastructure was healthy and there weren't any issues from our end. However, for the future, I would request you to consider the following best practices, which you may already be aware of; I am mentioning them here for the sake of completeness:
Enable termination protection
Regularly back up your data
Preserve the root volume after termination
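The shutdown-behavior change the engineer describes can also be scripted. A sketch with the AWS SDK for PHP (the instance ID is a placeholder; ModifyInstanceAttribute accepts one attribute per call):
use Aws\Ec2\Ec2Client;

$ec2 = new Ec2Client(['region' => 'ap-south-1', 'version' => '2016-11-15']);

// Stop, rather than terminate, when the OS shuts itself down.
$ec2->modifyInstanceAttribute([
    'InstanceId' => 'i-0123456789abcdef0', // placeholder
    'InstanceInitiatedShutdownBehavior' => ['Value' => 'stop'],
]);

// Termination protection: block TerminateInstances API calls as well.
$ec2->modifyInstanceAttribute([
    'InstanceId' => 'i-0123456789abcdef0', // placeholder
    'DisableApiTermination' => ['Value' => true],
]);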
Thank you fannymug. I think this was the case for me as well.
Just to add:
in CloudTrail, search for the instance ID and select the RunInstances eventName
here it is possible to check the event details
double-check the deleteOnTermination value; if it is set to true, termination protection is not enabled
To avoid this during the EC2 creation process, look in Advanced details > Termination Protection > Enable.
The Shutdown behavior option can also help and can be set to Stop.
I hope it helps.

Debugging "read time out" for AWS lambda function in Alexa Skill

I am using an AWS lambda function to serve my NodeJS codebase for an Alexa Skill.
The skill makes external API calls to a custom API as well as the Amazon GameOn API; it also uses URLs that serve audio files and images from an S3 bucket.
The issue I am having is intermittent and affects about 20% of users. At random points in the skill, the user request will produce an invalid response from the skill, with the following error:
{
    "Request": {
        "type": "System.ExceptionEncountered",
        "requestId": "amzn1.echo-api.request.ab35c3f1-b8e6-4478-945c-16f644359556",
        "timestamp": "2020-05-16T19:54:24Z",
        "locale": "en-US",
        "error": {
            "type": "INVALID_RESPONSE",
            "message": "Read timed out for requestId amzn1.echo-api.request.323b1fbb-b4e8-4cdf-8f31-30c9b67e4a5d"
        },
        "cause": {
            "requestId": "amzn1.echo-api.request.323b1fbb-b4e8-4cdf-8f31-30c9b67e4a5d"
        }
    },
I have looked into this issue; I believe something is wrong with the Lambda function configuration, but I can't figure out where!
I've tried increasing the memory the function uses (now 256 MB).
It should be noted that the function timeout is 8000 ms, since this is the maximum time allowed for an Alexa response.
What causes this Read timeout issue, and what measures can I take to debug and resolve it?
Take a look at AWS X-Ray. By using it with your Lambda you should be able to identify the source of these timeouts.
This link should help you understand how to apply it.
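As a side note, Active tracing can be enabled without redeploying the code; a sketch with the AWS SDK for PHP (the function name is a placeholder):
use Aws\Lambda\LambdaClient;

$lambda = new LambdaClient(['region' => 'us-east-1', 'version' => '2015-03-31']);

// Turn on X-Ray tracing so slow or timed-out downstream calls show up as segments.
$lambda->updateFunctionConfiguration([
    'FunctionName' => 'my-alexa-skill-handler', // placeholder
    'TracingConfig' => ['Mode' => 'Active'],
]);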
We found that this was occurring when the skill tried to access a resource stored on our Azure website.
The CPU and memory allocation for the Azure site was too low, and it would fail when facing a large number of requests.
To fix it, we upgraded the plan the App Service was running on.

Cloud Tasks Not Triggering HTTP Request Endpoints

I have one simple Cloud Tasks queue and have successfully submitted a task to it. The task is supposed to deliver a JSON payload to my API to perform a basic database update. It is created at the end of a process in a .NET Core 3.1 app running locally on my desktop (triggered via Postman), and the API is a Go app running in Cloud Run. However, the task never seems to fire and never registers an error.
The tasks-in-queue count is always 0 and the tasks-running count is always blank. I have hit the "Run Now" button dozens of times, but it never changes anything, and no log entries or failed attempts are ever registered.
The task is created with an OIDCToken, using a service account, with the audience set for the service account that has authorization to create tokens and invoke the Cloud Run instance.
Screen Shot of Tasks Queue in Google Cloud Console
Task creation log entry shows that it was created OK:
{
    "insertId": "efq7sxb14",
    "jsonPayload": {
        "taskCreationLog": {
            "targetAddress": "PUT https://{redacted}",
            "targetType": "HTTP",
            "scheduleTime": "2020-04-25T01:15:48.434808Z",
            "status": "OK"
        },
        "@type": "type.googleapis.com/google.cloud.tasks.logging.v1.TaskActivityLog",
        "task": "projects/{redacted}/locations/us-central1/queues/database-updates/tasks/0998892809207251757"
    },
    "resource": {
        "type": "cloud_tasks_queue",
        "labels": {
            "target_type": "HTTP",
            "project_id": "{redacted}",
            "queue_id": "database-updates"
        }
    },
    "timestamp": "2020-04-25T01:15:48.435878120Z",
    "severity": "INFO",
    "logName": "projects/{redacted}/logs/cloudtasks.googleapis.com%2Ftask_operations_log",
    "receiveTimestamp": "2020-04-25T01:15:49.469544393Z"
}
Any ideas as to why the tasks are not running? This is my first time using Cloud Tasks so don't rule out the idiot between the keyboard and the chair.
Thanks!
You might be using a non-default service; see Configuring Cloud Tasks queues.
Try creating a task from the command line and watch the logs, e.g.:
gcloud tasks create-app-engine-task --queue=default \
--method=POST --relative-uri=/update_counter --routing=service:worker \
--body-content=10
In my own case, I used --routing=service:api and it worked straight away. Then I added AppEngineRouting to the AppEngineHttpRequest.
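For an HTTP target like Cloud Run (no App Engine routing involved), task creation would look roughly like this with the google/cloud-tasks PHP client (pre-2.x surface); all names and URLs are placeholders, and note that for Cloud Run the OIDC audience is usually the service URL itself rather than the service account:
use Google\Cloud\Tasks\V2\CloudTasksClient;
use Google\Cloud\Tasks\V2\HttpMethod;
use Google\Cloud\Tasks\V2\HttpRequest;
use Google\Cloud\Tasks\V2\OidcToken;
use Google\Cloud\Tasks\V2\Task;

$client = new CloudTasksClient();
$parent = CloudTasksClient::queueName('my-project', 'us-central1', 'database-updates');

// OIDC token for authenticating the task's request to Cloud Run.
$oidc = (new OidcToken())
    ->setServiceAccountEmail('tasks-invoker@my-project.iam.gserviceaccount.com') // placeholder
    ->setAudience('https://my-service-abc123-uc.a.run.app');                     // placeholder

$httpRequest = (new HttpRequest())
    ->setUrl('https://my-service-abc123-uc.a.run.app/update')                    // placeholder
    ->setHttpMethod(HttpMethod::PUT)
    ->setBody(json_encode(['id' => 42]))
    ->setOidcToken($oidc);

$task = (new Task())->setHttpRequest($httpRequest);
$response = $client->createTask($parent, $task);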

Lambda function not working upon Alexa Skill invocation

I've just created my first (custom) skill. I've set the function up in Lambda by uploading a zip file containing my index.js and all the necessary code, including node_modules and the base Alexa skill that mine is a child of (as per the tutorials). I made sure I zipped up the files and sub-folders, not the folder itself (as I can see this is a common cause of similar errors), but when I create the skill and test in the web harness with a sample utterance I get:
remote endpoint could not be called, or the response it returned was
invalid.
I'm not sure how to debug this as there's nothing logged in CloudWatch.
I can see in the Lambda request that my slot value is translated/parsed successfully and the intent name is correct.
In AWS Lambda I can invoke the function successfully with both a LaunchRequest and another named intent. From the developer console, though, I get nothing.
I've tried copying the JSON from the Lambda test (that works) to the developer portal and I get the same error. Here is a sample of the JSON I'm putting in the dev portal (that works in Lambda):
{
    "session": {
        "new": true,
        "sessionId": "session1234",
        "attributes": {},
        "user": {
            "userId": null
        },
        "application": {
            "applicationId": "amzn1.echo-sdk-ams.app.149e75a3-9a64-4224-8bcq-30666e8fd464"
        }
    },
    "version": "1.0",
    "request": {
        "type": "LaunchRequest",
        "requestId": "request5678"
    }
}
The first step in pursuing this problem is probably to test your Lambda separately from your skill configuration.
When looking at your Lambda function in the AWS console, note the 'Test' button at the top; next to it is a drop-down with an option to configure a test event. If you select that option you will find there are preset test events for Alexa. Choose 'Alexa Start Session' and then choose the 'Save and test' button.
This will give you more detailed feedback about the execution of your Lambda.
If your Lambda works fine here, then the problem probably lies in your skill configuration, so I would go back through whatever tutorial and documentation you were using to configure your skill and make sure you did it right.
When you write that the Lambda request looks fine, I assume you are talking about the Service Simulator, so that's a good start, but there could still be a problem on the configuration tab.
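The same check can be scripted by sending the JSON request from the question straight to the function via the Invoke API; a sketch with the AWS SDK for PHP (the function name and file path are placeholders):
use Aws\Lambda\LambdaClient;

$lambda = new LambdaClient(['region' => 'us-east-1', 'version' => '2015-03-31']);

// Invoke the function directly with the LaunchRequest JSON, bypassing Alexa.
$result = $lambda->invoke([
    'FunctionName' => 'my-alexa-skill-handler',             // placeholder
    'Payload' => file_get_contents('launch_request.json'),  // the JSON above
]);

echo $result['Payload']; // the skill's raw response, or an error document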
We built a tool for local skill development and testing: BST Tools.
Requests and responses from Alexa will be sent directly to your local server, so you can quickly code and debug without having to do any deployments. I have found this to be very useful for our own development.
It's open source: https://github.com/bespoken/bst
Let me know if you have any questions.