How to get the uploader name of a downloaded video with youtube-dl?

I have downloaded some videos from YouTube using youtube-dl, from many different playlists. Now I want every video's title to include its uploader's name or channel name, without downloading all the videos again. Which command do I need? I am using Windows 10.

You can extract the uploader name from the JSON metadata printed by -j. For example:
youtube-dl.exe -j https://www.youtube.com/watch?v=YOUR-URL | python.exe -c "import sys, json; print(json.load(sys.stdin)['uploader'])"
The -j option doesn't download the whole video.
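If the videos are already on disk, one option (just a sketch, assuming you still have each video's original URL) is to let youtube-dl build an uploader-aware filename for you and then rename the existing file to match:
# Print the filename an uploader-aware output template would produce
# (--get-filename only evaluates the template, it does not download anything)
youtube-dl.exe --get-filename -o "%(uploader)s - %(title)s.%(ext)s" https://www.youtube.com/watch?v=YOUR-URL
# Then rename the already-downloaded file to that name, e.g. on Windows:
ren "old title.mp4" "Uploader Name - old title.mp4"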

Related

Save output file into AWS S3 Bucket

Using FFmpeg, I'm trying to output a file to an S3 bucket.
ffmpeg -i myfile.mp4 -an -crf 20 -vf crop=200:200 -s 800x600 -f mp4 pipe:1 | aws s3 cp - s3://my.test.bucket
I've already been advised that this cannot be done, since creating an MP4 file requires seeking and piping doesn't allow seeking. If I change the command to store the file on the local disk:
ffmpeg -i myfile.mp4 -an -crf 20 -vf crop=200:200 -s 800x600 myfile.mp4
it stores the file locally under the project root folder, which is fine.
But since I'm running my app in a container, and ffmpeg itself is installed via the Dockerfile, I'm trying to figure out what the possible options are here (if an MP4 cannot be written to S3 directly from the ffmpeg command).
If I need to save the output file myfile.mp4 to a server path using IWebHostEnvironment, where would it actually be saved? Is it inside the container? Can I mount an S3 bucket folder into the container and use it from the ffmpeg command?
Since my input file is in an S3 bucket and I want my output file in the same bucket, is there a solution where I wouldn't need to download the ffmpeg output and upload it again?
I guess these are a lot of questions, but I feel like I've run into a rabbit hole here.
There are really a lot of questions. :D
To make it fair, a few questions from me, to see if I understand everything.
Where is your Docker container running? Lambda, an EC2 machine, a Kubernetes cluster?
If it is on EC2, you can use https://aws.amazon.com/efs/ but...
Can you simply save the file in /tmp and then run an aws s3 cp command from the /tmp folder?
In some environments (for example Lambda), /tmp was the only place where I had programmatic access to the file system.
Although if I understand correctly, you have write rights in your environment, because you download the original file from the S3 bucket. So can you do something like this?
download the source file from S3
create the new file with ffmpeg
upload the file to S3
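A rough sketch of those three steps from inside the container (the bucket, file names and ffmpeg options are just the ones from the question; adjust to your setup):
# 1. download the source file from S3 into a writable location such as /tmp
aws s3 cp s3://my.test.bucket/myfile.mp4 /tmp/myfile.mp4
# 2. create the new file with ffmpeg, writing to local disk so it can seek
ffmpeg -i /tmp/myfile.mp4 -an -crf 20 -vf crop=200:200 -s 800x600 /tmp/output.mp4
# 3. upload the result back to the same bucket
aws s3 cp /tmp/output.mp4 s3://my.test.bucket/output.mp4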

Does youtube-dl still work (newest version youtube-dl-2020.2.16)?

Used command:
youtube-dl --max-filesize 30m -f m4a -o "/home/dc3014b3c6a1a23bba88b2a1fbcc1447.m4a" "https://www.youtube.com/watch?v=_Xa0ydtx8PM"
youtube-dl doesn't work for me at all. Errors like these occur:
ERROR: Unable to download webpage: <urlopen error EOF occurred in violation of protocol (_ssl.c:618)> (caused by URLError(SSLEOFError(8, u'EOF occurred in violation of protocol (_ssl.c:618)'),))
OR
ERROR: Unable to download webpage: HTTP Error 429: Too Many Requests (caused by HTTPError()); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
But when I use a curl command to fetch the URL content, it works fine.
curl -v https://www.youtube.com/watch?v=_Xa0ydtx8PM
How can I resolve it?
From the error message, you should make sure you are using the latest version of youtube-dl; you might want to update it. I am assuming you are using a *nix system. Depending on how you first installed it, there are several options for updating. Here are a few:
For manual installations, you can simply run youtube-dl -U or, on Linux, sudo youtube-dl -U.
If you are already running an updated version, you may want to consider the commands below as good ways to download videos. Mind you that with the new version of youtube-dl, it automatically downloads the best format for you, so you do not need to specify one, although you still can to be sure.
# Download best mp4 format available or any other best if no mp4 available
$ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
# Download best format available but no better than 480p
$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
# Download best video only format but no bigger than 50 MB
$ youtube-dl -f 'best[filesize<50M]'
# Download best format available via direct link over HTTP/HTTPS protocol
$ youtube-dl -f '(bestvideo+bestaudio/best)[protocol^=http]'
# Download the best video format and the best audio format without merging them
$ youtube-dl -f 'bestvideo,bestaudio' -o '%(title)s.f%(format_id)s.%(ext)s'
Here is a link for reference and further instructions/support.
I hope this helps. If not, let me know; I'm glad to help further.
Just wanted to share a youtube-dl alternative (it is just a fork) that for now (September 2022) works fine and supports almost all youtube-dl features: yt-dlp.
I switched to this project and I am now using the same scripts that I had previously.
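For example, a minimal sketch (yt-dlp keeps youtube-dl's command-line options, so existing commands usually run unchanged):
# install or update yt-dlp
python3 -m pip install -U yt-dlp
# reuse an existing youtube-dl style invocation as-is
yt-dlp -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best' "https://www.youtube.com/watch?v=YOUR-URL"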

youtube-dl download stops on playlist with video missing

I am using youtube-dl to download from a playlist for offline viewing. The operators of the playlist have started putting a scheduled video in the playlist that causes the downloads to fail. When youtube-dl tries to download a video that isn't available yet (the scheduled video), it fails and the download aborts.
How can I have the playlist download continue when there is a missing video?
My command:
/share/Multimedia/temp/youtube-dl -f 'best[ext=mp4]' -o "/share/Multimedia/YouTube/TheNational/%(upload_date)s.%(title)s.%(ext)s" --restrict-filenames --dateafter today-3day --no-mtime --download-archive "/share/Multimedia/temp/dllist-thenational.txt" --playlist-end 10 https://www.youtube.com/playlist?list=PLvntPLkd9IMcbAHH-x19G85v_RE-ScYjk
The download results from today:
[youtube:playlist] PLvntPLkd9IMcbAHH-x19G85v_RE-ScYjk: Downloading webpage
[download] Downloading playlist: The National | Full Show | Live Streaming Nightly at 9PM ET
[youtube:playlist] playlist The National | Full Show | Live Streaming Nightly at 9PM ET: Downloading 10 videos
[download] Downloading video 1 of 10
[youtube] pZ2AG5roG-A: Downloading webpage
[youtube] pZ2AG5roG-A: Downloading video info webpage
ERROR: This video is unavailable.
I want the playlist download to ignore the missing video and continue to the next available one.
Thanks.
I would add these options before -f:
-i, --ignore-errors
Continue on download errors, for example to skip unavailable videos in a playlist
-c, --continue
Force resume of partially downloaded files. By default, youtube-dl will resume downloads if possible.
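Applied to the command from the question, that would look like this (same paths, archive file and playlist URL as above):
/share/Multimedia/temp/youtube-dl -i -c -f 'best[ext=mp4]' -o "/share/Multimedia/YouTube/TheNational/%(upload_date)s.%(title)s.%(ext)s" --restrict-filenames --dateafter today-3day --no-mtime --download-archive "/share/Multimedia/temp/dllist-thenational.txt" --playlist-end 10 https://www.youtube.com/playlist?list=PLvntPLkd9IMcbAHH-x19G85v_RE-ScYjk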

How to get kaggle competition data via command line on virtual machine?

I am looking for the easiest way to download the Kaggle competition data (train and test) onto the virtual machine using bash, so that I can train there without uploading the data to git.
Fast-forward three years and you can use Kaggle's API via the CLI, for example:
kaggle competitions download favorita-grocery-sales-forecasting
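The one-time setup is roughly this (a sketch; it assumes you have created an API token from your Kaggle account settings page, which downloads a kaggle.json file):
# install the official Kaggle CLI
pip install kaggle
# place the API token where the CLI expects it
mkdir -p ~/.kaggle
cp kaggle.json ~/.kaggle/kaggle.json
chmod 600 ~/.kaggle/kaggle.json
# download the competition data and unzip it (the zip name may vary by CLI version)
kaggle competitions download favorita-grocery-sales-forecasting
unzip favorita-grocery-sales-forecasting.zip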
First you need to copy your cookie information for the Kaggle site into a text file. There is a Chrome extension which will help you do this.
Copy the cookie information and save it as cookies.txt.
Now transfer the file to the EC2 instance using the command:
scp -i /path/my-key-pair.pem /path/cookies.txt user-name@ec2-xxx-xx-xxx-x.compute-1.amazonaws.com:~
Accept the competition rules and copy the URLs of the datasets you want to download from kaggle.com. For example, the URL to download the sample_submission.csv file of the Intel & MobileODT Cervical Cancer Screening competition is: https://kaggle.com/c/intel-mobileodt-cervical-cancer-screening/download/sample_submission.csv.zip
Now, from the terminal use the following command to download the dataset into the instance.
wget -x --load-cookies cookies.txt https://kaggle.com/c/intel-mobileodt-cervical-cancer-screening/download/sample_submission.csv.zip
Install the CurlWget Chrome extension.
Start downloading your Kaggle dataset; CurlWget will give you the full wget command. Paste this command into the terminal with sudo.
Job done.
Install the cookies.txt extension on Chrome and enable it.
Log in to Kaggle.
Go to the challenge page that you want the data from.
Click on the cookies.txt extension at the top right; it will download the current page's cookies into a cookies.txt file.
Transfer the file to the remote server using scp or other methods.
Copy the data link shown on the Kaggle page (right click and copy the link address).
run wget -x --load-cookies cookies.txt <datalink>

Process stops when one URL in file causes error

I use youtube-dl -a filename to download the videos. However, when one URL in the list of URLs fails, the process exits. Is there a way to skip the failing URL and proceed with the remaining URLs?
The man page of youtube-dl says:
-i, --ignore-errors Continue on download errors, for example to skip unavailable
videos in a playlist
Thus:
youtube-dl -i -a filename
Edit: I strongly advise you to run
youtube-dl -U
prior to any download, as the world of online videos is fast changing and updates often fix download errors. Moreover, some errors are due to content restriction and can be solved by adding login and password to the tool:
youtube-dl -u USERNAME -p PASSWORD
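Putting it together, a typical run might look like this (USERNAME, PASSWORD and filename are placeholders, as above):
youtube-dl -U
youtube-dl -i -u USERNAME -p PASSWORD -a filename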