npm:youtube-dl and Lambda HTTP Error 429: Too Many Requests - amazon-web-services

I am running an npm package, youtube-dl, through a Lambda function, as I want to create an online converter.
I have suddenly started to run into the following error message:
{
"errorMessage": "Command failed: /var/task/node_modules/youtube-dl/bin/youtube-dl --dump-json --format=best[ext=mp4] https://www.youtube.com/watch?v=MfTbHITdhEI\nERROR: Unable to download webpage: HTTP Error 429: Too Many Requests (caused by HTTPError()); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.\n",
"errorType": "Error",
"stackTrace": ["ERROR: Unable to download webpage: HTTP Error 429: Too Many Requests (caused by HTTPError()); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.", "", "ChildProcess.exithandler (child_process.js:275:12)", "emitTwo (events.js:126:13)", "ChildProcess.emit (events.js:214:7)", "maybeClose (internal/child_process.js:925:16)", "Process.ChildProcess._handle.onexit (internal/child_process.js:209:5)"]
}
Edit: I have run this a few times when I was testing the other day, but today I only ran it once.
I think that the IP address used by my Lambda function has now been blacklisted. I'm unsure how to proceed as I am a junior and very new to all this.
Is there a way to resolve this? Can I get a new IP address? Is this going to be super costly?

youtube-dl lacks a delay option (a limit on requests per unit of time).
(See the suggestion at the bottom of my post.)
NEVER download more than one video with youtube-dl.
You can look up the youtube-dl author's contact details (e-mail, etc.) and write to them directly, and also open an issue about this on the GitHub page. The more requests they get, the sooner they may be pleased to fix it.
They currently have plenty of duplicate requests about this issue on GitHub, but they tend to lock the discussions and close the tickets.
This is some sort of misbehaviour, I believe.
I also found that the developer suggests using a proxy instead of introducing a delay option in the code, which is extremely funny.
OK, regarding the proxy: it does not actually solve the problem, since this is a flaw in the program's design, and whether you use a proxy or not, YouTube's limits are still there.
Please note:
This causes not only the error in question but also YouTube blocking your IP.
Once you hit this situation, YouTube will block your IP as suspicious again and again, even with a small number of requests. This causes tremendous problems, since the IP stays marked as suspicious.
Without an option to limit requests per unit of time (with a safe default value), I consider youtube-dl dangerous software that is bound to cause problems, and I have stopped using it until this option is introduced.
RECOMMENDATIONS:
Use Ctrl+S (suspend) and Ctrl+Q (resume) when youtube-dl is collecting the digest for many videos (when you have already downloaded many videos of a channel but new ones are still there). I suspend it for a few minutes after every 10.
And use --limit-rate 150K (or as low as is sane); this may help you avoid hitting the limit, since the whole transmission is shaped.

Ok, so I found this response: https://stackoverflow.com/a/45339683/9793169
I am wondering if it's possible that, because our volume is low, we just always end up using the same container, and hence the same IP address?
Yes, that is exactly the reason. A container is only spawned if no containers are already available. After a few minutes of no further demand, excess/unneeded containers are destroyed.
If so, is there any way to prevent this?
No, this behavior is by design.
SOLUTION:
I logged out for 20 minutes, then went back to the function and ran it again. It worked.

This is not my solution; it took me a while to understand what he meant (reading is an art). It worked for me.
(see: https://askubuntu.com/questions/1220266/youtube-dl-do-not-working-http-error-429-too-many-requests-how-can-i-solve-this)
You have to use the option --cookies in combination with a current/correct cookie file.
Here are the steps I followed:
1. if you use Firefox, install the add-on cookies.txt and enable it
2. clear your browser cache and clear your browser cookies (privacy reasons)
3. go to google.com, and log in with your google account
4. go to youtube.com
5. click on the cookies.txt add-on and export the cookies, saving them as cookies.txt (in the same directory from which you are going to run youtube-dl)
6. this worked for me ... youtube-dl --cookies cookies.txt https://www.youtube.com/watch?v=....
Hope it helps.

Use the --force-ipv4 option in the command:
youtube-dl --force-ipv4 ...

What you should do is handle that error by retrying the requests that are throttled.
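For example, something along these lines (a minimal sketch, not a tested implementation: retryWithBackoff and runYoutubeDl are hypothetical helpers, and the youtube-dl invocation just mirrors the command from the error message above):

const { execFile } = require('child_process');

// Run youtube-dl as a child process and resolve with its JSON output.
function runYoutubeDl(url) {
  return new Promise((resolve, reject) => {
    execFile('/var/task/node_modules/youtube-dl/bin/youtube-dl',
      ['--dump-json', '--format=best[ext=mp4]', url],
      (err, stdout) => (err ? reject(err) : resolve(stdout)));
  });
}

// Retry a promise-returning function with exponential backoff,
// but only when the failure looks like HTTP 429 throttling.
async function retryWithBackoff(fn, retries = 3, delayMs = 1000) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries || !/429/.test(String(err))) throw err;
      await new Promise((r) => setTimeout(r, delayMs * 2 ** attempt));
    }
  }
}

// Usage: retryWithBackoff(() => runYoutubeDl('https://www.youtube.com/watch?v=...'))

Note that backing off only helps with transient throttling; if the Lambda container's IP is already flagged, the other answers (cookies, waiting for a fresh container) are more likely to help.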

Related

How to send post from Zapier python in less than 1.00 seconds?

Is there a way to send a POST from a "Code by Zapier" Zap to MailChimp to add a subscriber to a list and have it reliably complete in less than 1.00 second?
I spent the weekend at a volunteer hackathon for non-profit organizations. My non-profit client needs some data parsed out of an email and used to add a subscriber to a list in MailChimp (the Commerce portion of SquareSpace emails the data but doesn't allow setting storage on the purchase form to MailChimp -- even though that works in SquareSpace if you're not in the Commerce area). We found we could do that with Zapier -- except we ran up to the limits of what one can do with a free account on Zapier and the non-profit couldn't purchase a paid account right now (the Zapier discount for non-profits is a 15% reduction).
The first limitation was that we couldn't do a 3-step zap (maximum 2 steps for free accounts) to go from (1) a Gmail trigger to (2) "Code by Zapier" to parse the email contents and then (3) to MailChimp. The workaround we came up with was to delete step #3 and send to MailChimp directly via an HTTP POST to the MailChimp API from a Python script in "Code by Zapier". This worked in test mode in Zapier.
But once the Zap was turned on and we ran an end-to-end test with the site, the Zap failed. There is a 1.00 second runtime limitation to free Zaps: after that Zapier kills the job. The POST to MailChimp took long enough that the Zap timed out.
I used "Code by Zapier" with Python to send the post. They use Python 2.7.10. I was able to import requests to do the post, and I found several other modules worked too, such as json, httplib, and urllib.
What I'm wondering is whether there's a way to get the POST to happen reliably in under 1 second. For example, is there a way to use an async send and then not wait for the response? And I'm constrained to Python 2.7.10 and the Zapier environment. Zapier also allows JavaScript as an alternative to Python, so that might be another path to investigate if there's no solution in Python.
David here, from the Zapier Platform team.
I can't speak to the speed of Python specifically, but I know that JavaScript can fire off requests without waiting for a response. We've got a basic example here, which you'd modify to send the request and then immediately end execution (by calling the callback function). This won't be a great experience because errors will happen silently, but it'll almost certainly fit in the 1 second window.
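Roughly like this (a sketch only, assuming the fetch global and callback convention available in Code by Zapier's JavaScript environment; the MailChimp URL, list ID, and API key are placeholders):

// Fire the POST without awaiting the response, then end the zap
// immediately by calling callback(). Failures happen silently.
fetch('https://usX.api.mailchimp.com/3.0/lists/LIST_ID/members', {
  method: 'POST',
  headers: {
    'Authorization': 'apikey YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ email_address: inputData.email, status: 'subscribed' }),
});
callback(null, { sent: true });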
Separately, the whole Python stdlib is available, as well as the requests module (docs).

Google Places API error 502 - The server encountered a temporary error

We run a website that obtains location data through the Google Places API. We have 150k daily searches available, which we haven't reached yet, as the website has been live for only a few weeks. We have suddenly received a 502 error. A notification in the Console says: "The server encountered a temporary error and could not complete your request." Is this a temporary error? Are there any suggestions on what we can do? The website hasn't been available for 40 minutes.
When you receive a 5xx status or UNKNOWN_ERROR in the response, you should implement retrying logic. Google has the following recommendation in their web services documentation:
In rare cases something may go wrong serving your request; you may receive a 4XX or 5XX HTTP response code, or the TCP connection may simply fail somewhere between your client and Google's server. Often it is worthwhile re-trying the request as the followup request may succeed when the original failed. However, it is important not to simply loop repeatedly making requests to Google's servers. This looping behavior can overload the network between your client and Google causing problems for many parties.
A better approach is to retry with increasing delays between attempts. Usually the delay is increased by a multiplicative factor with each attempt, an approach known as Exponential Backoff.
https://developers.google.com/maps/documentation/directions/web-service-best-practices#exponential-backoff
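In code, the pattern looks roughly like this (a hedged JavaScript sketch; doRequest stands in for whatever function issues your Places API call and resolves with the HTTP response):

// Retry on 5xx with multiplicatively increasing delays (exponential backoff).
async function requestWithBackoff(doRequest, maxAttempts = 5) {
  let delayMs = 500;
  for (let attempt = 1; ; attempt++) {
    const response = await doRequest();
    // Anything below 500 (success or a client error) goes back to the caller.
    if (response.status < 500 || attempt === maxAttempts) return response;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    delayMs *= 2; // double the wait before each retry
  }
}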
However, if retrying logic with Exponential Backoff doesn't help and the error persists for a long time, you should file a bug in the Google issue tracker.
I hope this addresses your doubt!
UPDATE
There was an issue on Google side yesterday (November 6, 2017), you can refer to the following bug that explains the issue:
https://issuetracker.google.com/issues/68938173

Signature expired: is now earlier than error : InvalidSignatureException

I am trying a small example with AWS API Gateway and IAM authorization. The AWS API Gateway generated the below Endpoint :
https://xyz1234.execute-api.us-east-2.amazonaws.com/Users/users
with POST action and no parameters.
Initially I had turned off IAM for this POST method, and I verified using Postman that it works.
Then I created a new IAM user and attached the AmazonAPIGatewayInvokeFullAccess policy to the user, thereby giving permission to invoke any APIs. I then enabled IAM for the POST method.
I then went to Postman, added authorization with the access key, secret key, AWS region as us-east-2, and service name as execute-api, and tried to execute the request, but I got an InvalidSignatureException error with 403 as the return code.
The body contains the following message:
Signature expired: 20170517T062414Z is now earlier than 20170517T062840Z (20170517T063340Z - 5 min.)"
What am I missing?
A request signed with AWS sigV4 includes a timestamp for when the signature was created. Signatures are only valid for a short amount of time after they are created. (This limits the amount of time that a replay attack can be attempted.)
When the signature is validated the timestamp is compared to the current time. If this indicates that the signature was not created recently, then signature validation fails with the error message you mentioned.
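To illustrate with the numbers from the error message above (a toy recreation of the check, not AWS's actual validation code):

const WINDOW_MS = 5 * 60 * 1000;                       // ~5-minute tolerance
const signedAt  = Date.parse('2017-05-17T06:24:14Z');  // timestamp in the request
const serverNow = Date.parse('2017-05-17T06:33:40Z');  // AWS server clock
// Rejected because 06:24:14 is earlier than 06:28:40 (serverNow minus 5 min),
// i.e. the client's clock was more than five minutes behind the server's.
console.log(signedAt < serverNow - WINDOW_MS); // true -> Signature expired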
If you get this in a Docker container on Windows that uses WSL, it may help to fix the WSL time by running wsl -d docker-desktop -e /sbin/hwclock -s in PowerShell. You can verify this is the cause beforehand by logging into the container, typing date in the terminal, and comparing it with your host machine's time.
A common cause of this is when the local clock on the host generating the signature is off by more than a couple of minutes.
You need to synchronize your machine's local clock with NTP.
For example, on an Ubuntu machine:
sudo ntpdate pool.ntp.org
System time goes out of sync quite often. You need to keep them in sync periodically.
You can run a daily CRON job to keep your system time in sync as mentioned at this link: Periodically synchronize time in Linux
Create a bash script to sync time, called ntpdate, and put the below into it:
#!/bin/sh
# sync server time
/usr/sbin/ntpdate pool.ntp.org >> /tmp/ntpdate.log
You can place this script anywhere you like and then set up a cron job. I will be putting it into the daily cron directory so that it runs once every day. So my ntpdate script is now in /etc/cron.daily/ntpdate and it will run every day.
Make this script executable
chmod +x /etc/cron.daily/ntpdate
Test it by running the script once and look for some output in /tmp/ntpdate.log:
/etc/cron.daily/ntpdate
In your log file you should see something like
26 Aug 12:19:06 ntpdate[2191]: adjust time server 206.108.0.131 offset 0.272120 sec
I faced a similar issue when I used the timedatectl command to change the datetime of the underlying machine... The explanations given by MikeD and others are really informative for fixing the issue.
sudo apt install ntp
sudo apt install ntpdate
sudo ntpdate ntp.ubuntu.com
After synchronizing the time with the correct current datetime, this issue will be resolved.
For me, the issue happened while using WSL. The date in WSL was out of sync.
The solution was to run the command wsl --shutdown and then restart Docker.
This one command did the trick:
sudo ntpdate pool.ntp.org
Make sure your PC's clock is set correctly. I faced the same issue and then realized my clock wasn't showing the right time for some reason. As soon as I corrected the time, it started working fine again! Hope this helped.
I was also facing this issue. I added
correctClockSkew: true
and the issue was fixed for me:
const nodemailer = require('nodemailer');
const ses = require('nodemailer-ses-transport');

let transporter = nodemailer.createTransport(ses({
    correctClockSkew: true,
    accessKeyId: **,
    secretAccessKey: **,
    region: **
}));
If you are on an AWS EC2 Ubuntu server and somehow not able to fix the time with the NTP approach:
sudo date -s "$(wget -qSO- --max-redirect=0 google.com 2>&1 | grep Date: | cut -d' ' -f5-8)Z"
Source: https://askubuntu.com/a/655528
Had this problem on Windows. The current time got out of sync after a power outage. Solved it via: Settings -> Date and time -> Sync now.
For those who face this issue while running Lambda functions (that use other AWS services like DynamoDB) locally with sam local invoke:
The time in the Docker container used by SAM may not be in sync with the host. Restarting Docker on the host (Docker Desktop on Windows) should resolve this issue.
I was making AWS API requests from a VM on my local machine. I checked the date was correct and was syncing, but I was still getting the error above. I halted and re-upped my VM and the error went away. I never figured out the exact cause, but "turning it off and back on again" fixed it.
Complementing what @miked-at-aws posted about AWS sigV4, there are at least 2 main possible root causes for the clock skew:
Your CPU is overloaded (reaching 99% usage, or on EC2 instances with CPU limits that run out of CPU credits).
Why would this generate time skew? Because between the moment the Amazon SDK creates the timestamp and the moment the request is sent there should normally be no more than a few nano- or microseconds, but if your CPU is overwhelmed it may take several seconds or even minutes to process. For this root cause you won't experience 100% of events lost, just some x% that may not be too big.
For the second root cause, which is simply that your machine's clock isn't adjusted, probably 100% of your events are being lost, and you just have to make sure that your machine's clock is being set and adjusted correctly.
I tried all the solutions related to time sync, but nothing worked out. What I did was set the correctClockSkew option to true while creating the service client. This solved my problem.
For instance:
let dynamodb = new AWS.DynamoDB({correctClockSkew: true});
Hope this sorts it out.
Reference: https://github.com/aws/aws-sdk-js/issues/527
I faced this same problem while fetching video from Amazon Kinesis to my local website. To solve it, I installed chrony on my computer, and that solved my problem. You can see the Amazon chrony installation instructions at the following link:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/set-time.html
What worked for me was to change the time on my computer. I am in the UK, so I put it forward one hour to put it on a European time zone. Then it worked. This is not the best fix, but it worked for me to move forward.
I had set the region to eu-west-2, which is London, so I am not sure why it only worked when I put the time on my computer forward an hour. I need to look into that.
Just try updating the system date and time, as they might be outdated; synchronize your clock and reload your console. This worked for me.
This is a question asking for any recent updates/suggestions: can this problem be solved using AWS Amplify, the AWS Cognito SDK, or a service worker? Time synchronization is not working.

NATS Error while developing echo service

I'm trying to develop a system service, so I use the echo service as a test.
I developed the service by following the directions in the CF docs.
Now the echo node is running, but the echo gateway fails with the error "echo_gateway - pid=15040 tid=9321 fid=290e ERROR -- Exiting due to NATS error: Could not connect to server on nats://localhost:4222/".
I got into this issue and was stuck for almost a week; finally someone helped me resolve it. The underlying problem is something else, and since errors are not trapped properly, it gives a wrong message. You need to go to GitHub and get the latest code base. The fix for this issue is http://reviews.cloudfoundry.org/#/c/8891. Once you fix this issue, you will most likely encounter a timeout field issue; the solution for that is to define the timeout field in gateway.yml.
A few additional properties became required in the echo_gateway.yml.erb file - specifically, the latest were default_plan and timeout, under the service group. The properties have been added to the appropriate file in the vcap-services-sample-release repo.
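For reference, the addition would look something like this (a hedged sketch of the relevant echo_gateway.yml.erb fragment; the values are made-up examples, so check the vcap-services-sample-release repo for the real ones):

service:
  # default_plan and timeout became required under the service group
  default_plan: "free"
  timeout: 15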
Looks like the fix for the misleading error has been merged on GitHub. I haven't updated and verified this myself just yet, but the Gerrit comments indicate the solution is the same as what the Node base has had for some time. I did previously run into that error handling, and it was far more helpful.

Django returns blank pages

Suddenly I started getting HTTP 200 with zero byte content for every request handled by Django.
This problem has appeared in past, too, and seemed to randomly disappear.
I see a debug view when I make syntax errors, but if the code executes fine, I get a blank page.
I tried restarting Apache, moving the project directory, and removing .pyc files. What next?
This mistake scores highest on the stupidity * impact measure among all I've ever made.
I upload changes to our server via SFTP, and there was a short connection outage during the last round of changes. Apparently, it happened exactly at the moment I was uploading base.html, the base template for all the others. The file was overwritten as a zero-byte empty file, and Django was correctly serving it.
Two things I've learned:
to never trust SFTP clients;
to inspect diff with HEAD when a problem occurs.
This just happened to me again. (I guess I'm lucky!)
I haven't found the cause but was able to recover by stopping and then starting Apache:
sudo apache2ctl stop
sudo apache2ctl start
Apparently, this isn't the same as restarting (sudo apache2ctl restart), which didn't help at all.