Amazon Personalize - how to delete Batch Inference Job "Create in progress"? - amazon-web-services

I started with Amazon Personalize yesterday with the help of this tutorial. Since it took more time than expected, in the middle of the notebook I decided to postpone it and deleted all resources (CloudFormation stack, Jupyter notebook, S3 bucket). Evidently, something went wrong: I still have a Dataset Group with status 'Active'.
I cannot delete it, because there is one Batch Inference job with status 'Create in progress'. It has had this status since yesterday, now for more than 12 hours.
How can I delete all of this? What charges should I expect?

There is no option to stop 'Create in progress' for any of the Personalize resources. This is the downside of a 'black box' service.
I believe the best option would be to contact the AWS Support team; they should be able to terminate it manually.
I have had cases where resource creation took more than 12 hours; the time depends mostly on the dataset size and the type of job. If you don't want to contact Support, there is no option other than waiting for it to complete.
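If you do decide to wait it out, here is a minimal sketch for polling the job's status with boto3; the job ARN below is a placeholder for your own:

    import time
    import boto3

    personalize = boto3.client("personalize")
    # Placeholder ARN - substitute the ARN of your stuck batch inference job
    JOB_ARN = "arn:aws:personalize:us-east-1:123456789012:batch-inference-job/my-job"

    while True:
        job = personalize.describe_batch_inference_job(batchInferenceJobArn=JOB_ARN)
        status = job["batchInferenceJob"]["status"]
        print("Batch inference job status:", status)
        # ACTIVE and CREATE FAILED are the terminal states; once the job settles,
        # you can retry deleting the dataset group's resources.
        if status in ("ACTIVE", "CREATE FAILED"):
            break
        time.sleep(60)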

Related

AWS Glue Job Alerting on Long Run Time

I'm hoping to configure some form of alerting for AWS Glue Jobs when they run longer than a configurable amount of time. These Glue jobs can be triggered at any time of day and usually take less than 2 hours to complete. However, if a run exceeds the 2-hour threshold, I want to get a notification (via SNS).
Usually I can configure run time alerting in CloudWatch Metrics, but I am struggling to do this for a Glue Job. The only metric I can see that could be useful is
glue.driver.aggregate.elapsedTime, but it doesn't appear to help. Any advice would be appreciated.
You could use the AWS SDK for that. You just need the job run ID and then call GetJobRun to get the execution time. Based on that, you can notify someone or some other service.
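For example, a rough sketch of a scheduled Lambda handler using boto3; the job name, SNS topic ARN and 2-hour threshold are placeholders, and it computes the elapsed time of running runs from StartedOn (GetJobRun's ExecutionTime works similarly if you already know the run ID):

    from datetime import datetime, timezone
    import boto3

    glue = boto3.client("glue")
    sns = boto3.client("sns")

    JOB_NAME = "my-glue-job"                                      # placeholder
    TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:glue-alerts"  # placeholder
    THRESHOLD_SECONDS = 2 * 60 * 60                               # 2 hours

    def handler(event, context):
        # Look at recent runs of the job and alert on any that are still RUNNING
        # past the threshold. Pagination is omitted for brevity.
        runs = glue.get_job_runs(JobName=JOB_NAME)["JobRuns"]
        for run in runs:
            if run["JobRunState"] != "RUNNING":
                continue
            elapsed = (datetime.now(timezone.utc) - run["StartedOn"]).total_seconds()
            if elapsed > THRESHOLD_SECONDS:
                sns.publish(
                    TopicArn=TOPIC_ARN,
                    Subject=f"Glue job {JOB_NAME} running too long",
                    Message=f"Run {run['Id']} has been running for {int(elapsed)} seconds.",
                )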

AWS ElasticSearch - Automating manual snapshots

The requirement - A customer requires an automated mechanism that takes a manual snapshot of an AWS ElasticSearch domain (production) on a daily basis. The target of the snapshot is an AWS S3 bucket.
Expected flow
Schedule Daily # 2am --> start process --> take snapshot --> wait 5 min --> check snapshot status (success/in_progress/failed)
if state==IN_PROGRESS, check snapshot status again, up to 10 times, interval of 5 mins
state==SUCCESS - end process (success)
state==IN_PROGRESS - when reaching 10 retries (50 mins), end process (failed)
state==FAILED - end process (failed)
If previous step failed, send push notification (Slack/Teams/Email/etc.)
Motivation - The automated snapshots taken by AWS can be used for disaster recovery or after a failed upgrade, but they cannot be used if someone accidentally (yes, it happened) deletes the whole ElasticSearch cluster.
I haven't found an out-of-the-box Lambda/mechanism that meets these requirements. Suggestions? Thoughts?
P.S. - I did a POC with AWS Step Functions + Lambda in a VPC, which seems to be working, but I'd rather use a managed service or a living open-source project.
In case you accidentally delete your AWS Elasticsearch domain, AWS Support can help you recover the domain along with its latest snapshot on a best-effort basis. This is not listed in the documentation since it shouldn't ideally be your first resort.
Assuming this will be a rare scenario, you should be fine. However, if you think there is a fair chance of your AWS ES cluster being deleted again and again, you are better off setting up a Lambda function that saves the latest snapshot in your own S3 bucket; a rough sketch of such a function follows. This also saves you from depending on AWS Support.
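The sketch below assumes a snapshot repository backed by your S3 bucket has already been registered on the domain, and follows the signed-request pattern from the AWS ES documentation; the endpoint, region and repository name are placeholders:

    import datetime
    import boto3
    import requests
    from requests_aws4auth import AWS4Auth

    HOST = "https://my-es-domain.us-east-1.es.amazonaws.com"  # placeholder endpoint
    REGION = "us-east-1"                                      # placeholder
    REPOSITORY = "my-manual-snapshots"                        # placeholder, already registered

    credentials = boto3.Session().get_credentials()
    awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                       REGION, "es", session_token=credentials.token)

    def handler(event, context):
        # Start a manual snapshot named after today's date
        name = "daily-" + datetime.datetime.utcnow().strftime("%Y-%m-%d")
        r = requests.put(f"{HOST}/_snapshot/{REPOSITORY}/{name}", auth=awsauth)
        r.raise_for_status()
        # Return the current state (SUCCESS / IN_PROGRESS / FAILED); a Step Functions
        # wait-and-retry loop, as in the expected flow above, can poll this until it settles.
        status = requests.get(f"{HOST}/_snapshot/{REPOSITORY}/{name}", auth=awsauth)
        return status.json()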
AWS Elasticsearch has accidental-delete protection. In case you delete your domain by mistake, AWS Elasticsearch can recover it within 14 days.
Hope this solves your purpose.

How to delete a Sagemaker Ground Truth Labeling Job?

How can I delete an Amazon Sagemaker Ground Truth Labeling Job?
Can't find that option on the console.
Unfortunately, it is not possible to delete a Labeling Job.
I've also had a similar issue and was wondering if there was a way to do it through the console, SDK or CLI. I spoke to a cloud support engineer and this is in his own words (a boto3 sketch for stopping a job follows at the end of this answer):
At the moment it is not possible to delete a Labeling Job from console, CLI or SDK. If the Job is "In Progress" you can stop the job to avoid further labeling and charges associated with it.
If you are concerned about charges, the engineer did assure me that failed and completed jobs will not be charged.
You will only be charged for the following:
1- Worker charges if you are using a Public or Vendor workforce.
2- S3 storage charges for the input and output data.
3- A fixed price per labeled object. https://aws.amazon.com/sagemaker/groundtruth/pricing/
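If you only need to stop an in-progress job programmatically rather than from the console, a minimal boto3 sketch; the job name is a placeholder:

    import boto3

    sagemaker = boto3.client("sagemaker")

    # Placeholder job name - stopping halts further labeling and the associated charges
    sagemaker.stop_labeling_job(LabelingJobName="my-labeling-job")

    # Optionally confirm the job is now stopping/stopped
    status = sagemaker.describe_labeling_job(LabelingJobName="my-labeling-job")
    print(status["LabelingJobStatus"])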

CloudWatch to delete old backups

I am currently using AWS CloudWatch to create backups of a particular EBS volume every 12 hours and would like to delete old snapshots every so often so I don't end up with a crazy amount of backups. Taking the simpler route, I'd like to either replace the existing backup with a new one every time the rule triggers OR delete backups older than 2 days. Any idea how to accomplish this?
I tried searching the Target actions in the AWS CloudWatch console for something like "EC2 DeleteSnapshot API call" or similar, with no success.
You could create a Lambda function that does this and then invoke it from a scheduled CloudWatch Event. Beware the maximum execution time of Lambda, though. Alternatively, you could run an instance and cron a script that does this. Whichever way you go, you'll need to script it.
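A minimal sketch of such a Lambda handler with boto3, assuming the volume ID is known and a 2-day retention (both are placeholders):

    from datetime import datetime, timedelta, timezone
    import boto3

    ec2 = boto3.client("ec2")

    VOLUME_ID = "vol-0123456789abcdef0"   # placeholder
    RETENTION = timedelta(days=2)

    def handler(event, context):
        cutoff = datetime.now(timezone.utc) - RETENTION
        # Only look at snapshots you own for this particular volume
        snapshots = ec2.describe_snapshots(
            OwnerIds=["self"],
            Filters=[{"Name": "volume-id", "Values": [VOLUME_ID]}],
        )["Snapshots"]
        for snap in snapshots:
            if snap["StartTime"] < cutoff:
                ec2.delete_snapshot(SnapshotId=snap["SnapshotId"])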

Delete AWS codeDeploy Revisions from S3 after successfull deployment

I am using the CodeDeploy add-on for Bitbucket to deploy my code directly from a Bitbucket Git repository to my EC2 instances via AWS CodeDeploy. However, after a while, I have a lot of revisions in my CodeDeploy console, all stored in one S3 bucket. What should I do to keep old CodeDeploy revisions from eating up my S3 storage?
Is it possible to delete these revisions automatically after a successful deployment?
Is it possible to delete them automatically once there are X successful revisions? For example, delete an old revision once we have three newer successful revisions.
CodeDeploy keeps every revision from Bitbucket because the service always needs the last successful revision for features like automatic rollback. So the previous revision can't simply be overwritten when doing a deployment, but all revisions older than the last successful one can be deleted.
Unfortunately, CodeDeploy doesn't have a good/elegant way to handle those obsolete revisions at the moment. It'd be great if there were an overwrite option when Bitbucket pushes to S3.
CodeDeploy is purely a deployment tool; it cannot manage the revisions in the S3 bucket.
I would recommend you look into "lifecycle management" for S3. Since you are using a version-controlled bucket (I assume), there is always one latest version and zero to many obsolete versions. You can set a lifecycle configuration of type "NoncurrentVersionExpiration" so that the obsolete versions are deleted after some number of days.
This method still cannot maintain a fixed number of deployments, since AWS only allows specifying lifecycle rules in days, but it's probably the best fit for your use case; a sketch follows after the links below.
[1] http://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-set-lifecycle-configuration-intro.html
[2] http://docs.aws.amazon.com/AmazonS3/latest/dev/intro-lifecycle-rules.html
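For illustration, a sketch of setting such a rule with boto3; the bucket name and the 30-day retention are assumptions:

    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_lifecycle_configuration(
        Bucket="my-codedeploy-revisions",    # placeholder bucket
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "expire-old-revisions",
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},   # apply to the whole bucket
                    # Delete noncurrent (obsolete) versions some days after they
                    # are superseded; the number of days is an assumption.
                    "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
                }
            ]
        },
    )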
CodeDeploy does not offer a feature like Jenkins' "keep the last X [successful or not] runs".
However, with S3 Lifecycle, you can expire (delete) the S3 objects automatically after, for example, 3 months.
On one hand, this solution is a nice FinOps action when there is constant activity during the expiration window (at least 3 deployments): it preserves CodeDeploy's automatic rollback process while reducing the S3 cost.
On the other hand, this solution is less effective when activity is spiky or, worse, there is no deployment at all during the specified S3 expiration window: if a deployment happens 12 months after the last one and fails, CodeDeploy will not be able to roll back, since the previous artifacts are no longer available in S3.
As a mitigation, I recommend using S3 Intelligent-Tiering; it can divide the S3 cost by 4 without interfering with CodeDeploy's capabilities. You can also set an expiration of 12 months to delete the oldest artifacts.
One last solution is to code a Lambda, scheduled by a weekly CloudWatch Events rule, that will (see the sketch after this list):
List deployments using your own criteria (success/fail status)
Get the deployment details for each
Filter these deployments again using your criteria (date, user, ...)
Delete the S3 objects using the deployment details
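A rough sketch of that Lambda with boto3; the application name, deployment group and the "keep the newest 3 successful revisions" rule are placeholders, and pagination is omitted for brevity:

    import boto3

    codedeploy = boto3.client("codedeploy")
    s3 = boto3.client("s3")

    APPLICATION = "my-app"           # placeholder
    DEPLOYMENT_GROUP = "my-group"    # placeholder
    KEEP = 3                         # keep the newest N successful revisions

    def handler(event, context):
        # 1. List deployments matching your criteria (here: succeeded only)
        deployment_ids = codedeploy.list_deployments(
            applicationName=APPLICATION,
            deploymentGroupName=DEPLOYMENT_GROUP,
            includeOnlyStatuses=["Succeeded"],
        )["deployments"]

        # 2. Get the details for each, then 3. filter again (here: sort by
        #    create time and keep the newest KEEP deployments)
        infos = [codedeploy.get_deployment(deploymentId=d)["deploymentInfo"]
                 for d in deployment_ids]
        infos.sort(key=lambda d: d["createTime"], reverse=True)

        # 4. Delete the S3 objects referenced by the remaining (old) revisions
        for info in infos[KEEP:]:
            revision = info.get("revision", {})
            if revision.get("revisionType") == "S3":
                location = revision["s3Location"]
                s3.delete_object(Bucket=location["bucket"], Key=location["key"])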