I would like to know at what point in time a given CloudFront invalidation completed.
When browsing invalidations in the AWS Console, I can only see the Date created timestamp, as per the attached image.
I would like to know the date of completion. I couldn't find any info in the docs, and the AWS CLI get-invalidation API doesn't say anything about it either.
Is it possible at all?
Basically, what I'm trying to achieve is to measure how much time my invalidations take, so I can ascertain whether this is the cause of test failures in my project.
Cheers!
As per the AWS documentation (Invalidate the Amazon S3 objects):
Object invalidations typically take from 10 to 100 seconds to complete. You can check the status of an invalidation by viewing your distribution from the CloudFront console.
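Since the invalidation record reports a status and a creation time but no completion time, one way to approximate the duration is to poll the status yourself. The following is a minimal sketch, assuming `client` is a boto3 CloudFront client; the function name, polling interval, and timeout are illustrative choices, and the result overestimates the real duration by up to one polling interval.

```python
import time
from datetime import datetime, timezone


def measure_invalidation(client, distribution_id, invalidation_id,
                         poll_seconds=5.0, timeout_seconds=900.0):
    """Poll until the invalidation reports "Completed"; return elapsed seconds.

    `client` is assumed to be a boto3 CloudFront client. The duration runs
    from the invalidation's CreateTime to the first poll that sees
    "Completed", so it is an upper bound, not an exact completion time.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        invalidation = client.get_invalidation(
            DistributionId=distribution_id, Id=invalidation_id
        )["Invalidation"]
        if invalidation["Status"] == "Completed":
            observed = datetime.now(timezone.utc)
            return (observed - invalidation["CreateTime"]).total_seconds()
        if time.monotonic() >= deadline:
            raise TimeoutError(f"invalidation {invalidation_id} still in progress")
        time.sleep(poll_seconds)
```

With boto3 installed this could be called as `measure_invalidation(boto3.client("cloudfront"), distribution_id, invalidation_id)` right after creating the invalidation, and the returned seconds logged next to the test run.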
Over the past week, the trigger of our AWS Lambda function has been deleted at irregular intervals.
We would like to find out exactly when this happened to determine the cause of the deletion. We tried looking for the entries in CloudTrail, but we are not sure what exactly to look for.
How can we find the root cause of the deletion?
Thanks, Marcin and ydaetskcoR. We found the problem. The Lambda trigger is a property of the S3 bucket. We had different Lambda triggers in different projects (with different Terraform states), so every time one Terraform project was applied, it overwrote the trigger of the other project, because its state was not aware of it. We saw PutBucketNotifications in CloudTrail but didn't recognize the connection.
You can troubleshoot operational and security incidents over the past 90 days in the CloudTrail console by viewing Event history. You can look up events related to the creation, modification, or deletion of resources. To view events in the log files, see:
https://docs.aws.amazon.com/awscloudtrail/latest/userguide/get-and-view-cloudtrail-log-files.html
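The same Event history can be queried programmatically. Below is a rough sketch, assuming `client` is a boto3 CloudTrail client; `find_events` is a hypothetical helper, and the event name you pass in (for example the bucket notification call mentioned in the comment above) must match the spelling that actually appears in your trail.

```python
from datetime import datetime, timedelta, timezone


def find_events(client, event_name, days_back=7):
    """Return (time, user, event id) tuples for one event name, oldest first.

    `client` is assumed to be a boto3 CloudTrail client.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days_back)
    found = []
    token = None
    while True:
        kwargs = {
            "LookupAttributes": [
                {"AttributeKey": "EventName", "AttributeValue": event_name}
            ],
            "StartTime": start,
            "EndTime": end,
        }
        if token:
            kwargs["NextToken"] = token  # continue from the previous page
        page = client.lookup_events(**kwargs)
        for event in page.get("Events", []):
            found.append((event["EventTime"], event.get("Username"), event["EventId"]))
        token = page.get("NextToken")
        if not token:
            return sorted(found, key=lambda row: row[0])
```

Comparing the returned timestamps and user names with deployment times is usually enough to see which pipeline or person made each change.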
I have been using the Amazon S3 service to store some files.
I have uploaded 4 videos and they are public. I'm using a third-party video player for those videos (JW Player). As a new user on the AWS Free Tier, I have almost used up my 2,000 free PUT, POST and LIST requests, and for four videos that seems ridiculous.
Am I missing something? Shouldn't one upload be one PUT request? I don't understand how I've hit that limit already.
The AWS Free Tier for Amazon S3 includes:
5GB of standard storage (normally $0.023 per GB)
20,000 GET requests (normally $0.0004 per 1,000 requests)
2,000 PUT requests (normally $0.005 per 1,000 requests)
In total, it is worth up to 13.3 cents every month!
So, don't be too worried about your current level of usage, but do keep an eye on charges so you don't get too many surprises. You can always Create a Billing Alarm to Monitor Your Estimated AWS Charges.
The AWS Free Tier is provided to explore AWS services. It is not intended for production usage.
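The 13.3-cent total quoted above can be reproduced from the listed allowances and prices. A small worked calculation, using only the figures given in this answer (which may not match current pricing):

```python
# Free Tier allowances and list prices as quoted in the answer above.
storage_gb, price_per_gb = 5, 0.023
get_requests, price_per_1000_get = 20_000, 0.0004
put_requests, price_per_1000_put = 2_000, 0.005

storage_value = storage_gb * price_per_gb              # 5 GB of storage
get_value = get_requests / 1000 * price_per_1000_get   # 20 blocks of 1,000 GETs
put_value = put_requests / 1000 * price_per_1000_put   # 2 blocks of 1,000 PUTs

total_cents = round((storage_value + get_value + put_value) * 100, 1)
print(f"Free Tier S3 value: {total_cents} cents per month")
```

At these prices the whole PUT allowance is worth about one cent, so exceeding it by a few thousand requests costs very little.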
It would be very hard to find the reason for this without debugging a bit, so I would suggest you try the following:
See if you have CloudTrail enabled. If yes, then you can track the API calls to S3 to see if anything is wrong there.
If you have CloudTrail enabled, then it also writes its own data into an S3 bucket, which might take up some of the requests.
See if you have logging enabled at the bucket level; that might give you more insight into which requests are reaching your bucket.
Your videos are public, and that is the biggest concern here, as you don't know who is accessing them.
Set up CloudWatch alarms to avoid any surprises, and look at the logs to find the issue.
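As a starting point for the bucket-level logging check above, here is a minimal sketch, assuming `s3_client` is a boto3 S3 client; `bucket_logging_target` is a hypothetical helper name.

```python
def bucket_logging_target(s3_client, bucket):
    """Return (target bucket, prefix) if server access logging is on, else None.

    `s3_client` is assumed to be a boto3 S3 client.
    """
    response = s3_client.get_bucket_logging(Bucket=bucket)
    enabled = response.get("LoggingEnabled")  # key is absent when logging is off
    if not enabled:
        return None
    return enabled["TargetBucket"], enabled.get("TargetPrefix", "")
```

If this returns `None`, access logging would need to be turned on before the logs can show which requests are reaching the bucket.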
I'd like to use AWS access logs for processing website impressions, using an existing batch-oriented ETL pipeline that grabs the last finished hour of impressions and applies a lot of further transformations to them.
The problem with access logs, though, is this:
Note, however, that some or all log file entries for a time period can
sometimes be delayed by up to 24 hours
So I would never know when all the logs for a particular hour are complete.
I unfortunately cannot use any streaming solution, I need to use existing pipeline that grabs hourly batches of data.
So my question is: is there any way to be notified that all logs have been delivered to S3 for a particular hour?
You have asked about S3, but your pull-quote is from the documentation for CloudFront.
Either way, though, it doesn't matter. This is just a caveat, saying that log delivery might sometimes be delayed, and that if it's delayed, this is not a bug -- it's a side effect of a massive, distributed system.
Both services operate at an incomprehensibly large scale, so periodically, things go wrong with small parts of the system, and eventually some stranded or backlogged logs may be found and delivered. Rarely, they can even arrive days or weeks later.
There is no event that signifies that all of the logs are finished, because there's no single point within such a system that is aware of this.
But here is the takeaway concept: the majority of logs will arrive within minutes, but this isn't guaranteed. Once you start running traffic and observing how the logging works, you'll see what I am referring to. Delayed logs are the exception, and you should be able to develop a sense, fairly rapidly, of how long you need to wait before processing the logs for a given wall clock hour. As long as you track what you processed, you can audit this against the bucket later, to ensure that your process is capturing a sufficient proportion of the logs.
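The audit described above can be as simple as comparing the keys you processed against what is in the bucket now. A sketch under assumptions: log keys are assumed to follow the CloudFront naming pattern `<prefix>/<distribution-id>.YYYY-MM-DD-HH.<unique-id>.gz`, the helper name `late_arrivals` is made up for illustration, and listing the bucket is left to the caller.

```python
import re
from collections import defaultdict

# Assumed key layout: <prefix>/<distribution-id>.YYYY-MM-DD-HH.<unique-id>.gz
LOG_KEY = re.compile(r"\.(\d{4}-\d{2}-\d{2}-\d{2})\.[^./]+\.gz$")


def late_arrivals(bucket_keys, processed_keys):
    """Group log keys that are in the bucket but were never processed, by hour."""
    processed = set(processed_keys)
    missed = defaultdict(list)
    for key in bucket_keys:
        match = LOG_KEY.search(key)
        if match and key not in processed:
            missed[match.group(1)].append(key)  # keyed by "YYYY-MM-DD-HH"
    return dict(missed)
```

Running this a day or two after each batch shows how often files turn up after the hour was processed, which in turn suggests how long the pipeline should wait.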
Since the days before CloudFront had SNI support, I have been routing traffic to some of my S3 buckets using HAProxy in EC2 in the same region as the bucket. This gave me the ability to use custom hostnames and SNI, but also gave me real-time logging of all the bucket traffic using HAProxy, which can stream copies of its logs to a log collector for real-time analysis over UDP, as well as writing them to syslog. There is no measurable difference in performance with this solution, and HAProxy runs extremely well on t2-class servers, so it is cost-effective. You do, of course, introduce more costs and more to maintain, but you can even deploy HAProxy between CloudFront and S3 as long as you are not using an origin access identity. One of my larger services does exactly this, a holdover from the days before Lambda@Edge.
How do you route AWS Web Application Firewall (WAF) logs to an S3 bucket? Is this something I can quickly do through the AWS Console? Or would I have to use a Lambda function (invoked by a CloudWatch timer event) to query the WAF logs every n minutes?
UPDATE:
I'm interested in the ACL logs (Source IP, URI, Matches rule, Request Headers, Action, Time, etc).
UPDATE (05/15/2017)
AWS doesn't provide an easy way to view or parse these logs. You can get a "random sample" via the get-sampled-requests command, which isn't acceptable:
Gets detailed information about a specified number of requests--a
sample--that AWS WAF randomly selects from among the first 5,000
requests that your AWS resource received during a time range that you
choose. You can specify a sample size of up to 500 requests, and you
can specify any time range in the previous three hours.
http://docs.aws.amazon.com/cli/latest/reference/waf/get-sampled-requests.html
Also, I'm not the only one experiencing this issue either:
https://forums.aws.amazon.com/thread.jspa?threadID=220202
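For reference, the polling approach from the question could look roughly like this. It is only a sketch, assuming `waf_client` is a boto3 client for the classic `waf` API; the function name and output fields are illustrative, and the result is still the random sample described above, not a complete log.

```python
import json
from datetime import datetime, timedelta, timezone


def sampled_requests_as_json_lines(waf_client, web_acl_id, rule_id, minutes=15):
    """Fetch one sample window and serialize it as JSON lines.

    `waf_client` is assumed to be a boto3 client for the classic WAF API.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    response = waf_client.get_sampled_requests(
        WebAclId=web_acl_id,
        RuleId=rule_id,
        TimeWindow={"StartTime": start, "EndTime": end},
        MaxItems=500,  # the documented maximum sample size
    )
    lines = []
    for sample in response.get("SampledRequests", []):
        request = sample["Request"]
        lines.append(json.dumps({
            "time": sample["Timestamp"].isoformat(),
            "action": sample["Action"],
            "client_ip": request["ClientIP"],
            "uri": request["URI"],
            "headers": {h["Name"]: h["Value"] for h in request.get("Headers", [])},
        }, sort_keys=True))
    return "\n".join(lines)
```

A scheduled function could write the returned text to an S3 object per window, with the caveat that anything outside the sample is lost.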
I was looking for this functionality today and stumbled across the referenced thread. It was, coincidentally, updated today:
Hello,
Thanks for your input. I have submitted a feature request on your
behalf to export WAF events to S3 for long term analysis.
Best Regards, albertpataws
The lack of this feature strikes me as being almost as odd as the fact that I can't change timezones for graphs.
How do I integrate Amazon CloudFront and S3 in a photo sharing application?
I currently upload to S3 and return the CloudFront URL, but this has not been very successful, because there appears to be a delay between S3 and CloudFront such that the returned URL is not immediately valid.
Does anyone know how I can work around this?
Facebook uses Akamai and if I upload an image it is immediately available.
Would appreciate some ideas on this.
You must be trying to fetch the object immediately through CloudFront. I'm unsure why that might be, but you are hitting the limits of S3's eventual consistency model.
When you upload an object, the message takes a tiny amount of time to propagate across the S3 service. Generally this is well under one second and is hard to detect. (In a previous job, we found we could reasonably guarantee all files arrived within 10 seconds, and 99.9% within 1 second.)
Here's the official word from AWS; it's worth reading the whole page:
A process writes a new object to Amazon S3 and immediately lists keys within its bucket. Until the change is fully propagated, the object might not appear in the list.
There's a much longer discussion on this Stack Overflow question; assuming your bucket is in the US Standard region, you need to change your endpoint slightly to take advantage of the read-after-write model.
Further reading:
* Instrumental: Why you should stop using the us-standard Region in S3. Right Now™
* Read-After-Write Consistency in Amazon S3 (from 2009, contains dated info)
One way you can debug/prove this is by calling getObjectMetadata right before your CloudFront call. It should fail in this case.
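The metadata check suggested above can be sketched as follows. This assumes `s3_client` is a boto3 S3 client and uses `head_object` as the Python counterpart of `getObjectMetadata`; the function name, attempt count, and delay are arbitrary illustrative choices.

```python
import time


def wait_until_visible(s3_client, bucket, key, attempts=5, delay_seconds=0.5):
    """Return True once head_object succeeds, False if every attempt is a 404.

    `s3_client` is assumed to be a boto3 S3 client.
    """
    for attempt in range(attempts):
        try:
            s3_client.head_object(Bucket=bucket, Key=key)
            return True
        except s3_client.exceptions.ClientError as error:
            code = error.response["Error"]["Code"]
            if code not in ("404", "NoSuchKey", "NotFound"):
                raise  # permission or other errors are not a propagation delay
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

Calling this right after the upload, and only returning the CloudFront URL once it reports `True`, would show whether the failures line up with objects that are not yet visible in S3.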