youtube-dl and aria2c - batch download based on a list of YouTube links, saving each file according to a list of file names

There are some ways of naming files downloaded with youtube-dl, like this:
youtube-dl -o "1-%(uploader)s%(title)s.%(ext)s"
What I'm trying to do: just as I have a URL.TXT containing one YouTube link per line, I would like to name each downloaded video file according to a list of file names, line by line.
Example:
URL.TXT
https://www.youtube.com/watch?v=video1
https://www.youtube.com/watch?v=video2
https://www.youtube.com/watch?v=video3
https://www.youtube.com/watch?v=video4
FILES_NAMES.TXT
VIDEO_FILE_NAME1
VIDEO_FILE_NAME2
VIDEO_FILE_NAME3
VIDEO_FILE_NAME4
So, in my directory, instead of using YouTube's video title, each downloaded file would be written with the corresponding name from my list. Example:
VIDEO_FILE_NAME1.MP4
VIDEO_FILE_NAME2.MP4
VIDEO_FILE_NAME3.MP4
VIDEO_FILE_NAME4.MP4
I'm using youtube-dl with aria2c for downloading. The aria2c documentation says:
"
These options have exactly same meaning of the ones in the command-line options, but it just applies to the URIs it belongs to. Please note that for options in input file -- prefix must be stripped.
For example, the content of uri.txt is:
http://server/file.iso http://mirror/file.iso
dir=/iso_images
out=file.img
http://foo/bar
If aria2 is executed with -i uri.txt -d /tmp options, then file.iso is saved as /iso_images/file.img and it is downloaded from http://server/file.iso and http://mirror/file.iso. The file bar is downloaded from http://foo/bar and saved as /tmp/bar.
In some cases, out parameter has no effect. See note of --out option for the restrictions.
"
What about this syntax:
.\youtube-dl_2021-04-17.exe --format best --write-all-thumbnails --newline --external-downloader aria2c --external-downloader-args "-x 16 -s 10 -k 1M -i URL.TXT" -a .\URL.TXT
Where URL.TXT contains this:
https://youtu.be/video1
out=Video_File_Name_01
https://youtu.be/video2
out=Video_File_Name_02
https://youtu.be/video3
out=Video_File_Name_03
https://youtu.be/video4
out=Video_File_Name_04
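A note on that approach: when youtube-dl drives aria2c as an external downloader, it generally invokes aria2c once per video and supplies its own output name, so the out= lines in an input file passed via --external-downloader-args may not take effect. A minimal bash sketch of an alternative, assuming URL.TXT and FILES_NAMES.TXT from the example above have matching line counts (bash rather than the Windows prompt used in the question; the url/name variables are illustrative):
paste URL.TXT FILES_NAMES.TXT | while IFS=$'\t' read -r url name; do
    # youtube-dl's -o template sets the final file name; aria2c still handles the transfer
    youtube-dl --format best --external-downloader aria2c \
        --external-downloader-args "-x 16 -s 10 -k 1M" \
        -o "${name}.%(ext)s" "$url"
done
With the lists above, this should produce files such as VIDEO_FILE_NAME1.mp4 (the extension depends on the format youtube-dl selects), while aria2c still provides the multi-connection download.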

Related

How to use youtube-dl to download multiple videos in one txt file on mac?

I've got a txt file with one video URL per line, and there are 50 URLs in total. I know youtube-dl has a feature that lets you download multiple videos with youtube-dl -a sample.txt.
But I need another way to do this because I'm also using a download tool called you-get, which works better on some sites but doesn't support downloading from a txt file. Last week I found a method to convert multiple files with ffmpeg using the command for i in *.m4a; do ffmpeg -i "$i" "${i%.*}.mp3"; done. I'm wondering whether there is a similar one-line command that can read URLs from the txt file and download them with youtube-dl. I'm using a Mac, by the way.
More generally, xargs is the specific tool for that approach.
Example:
cat sample.txt | xargs -n 1 you-get
takes each line from sample.txt and passes it as a single argument to you-get (-n 1 is equivalent to GNU xargs's --max-args=1 and also works with the BSD xargs that ships with macOS). This ends up as 50 command invocations:
you-get url1
you-get url2
you-get url[...]
and so on. Here is a good tutorial from TutorialsPoint: https://www.tutorialspoint.com/unix_commands/xargs.htm
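A plain shell loop works too and avoids xargs's whitespace splitting; a minimal sketch, assuming sample.txt holds one URL per line (the file name is the one from the question):
while IFS= read -r url; do
    # call you-get (or youtube-dl) once for each line of the file
    you-get "$url"
done < sample.txt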

Wget returns images in an unknown format (jpg#f=)

I am running the following line:
wget -P "C:\My Web Sites\REGEX" -r --no-parent -A jpg,jpeg https://www.mywebsite.com/directory1/directory2/
and it stops (no errors) after returning only a small part of the website (two files). I then run this:
wget -P "C:\My Web Sites\REGEX" https://www.mywebsite.com/directory1/directory2/ -m
and expecting to get data only from that directory. Instead, I found that wget downloaded everything from the website as if I had given it the https://www.mywebsite.com/ URL. Also, the images are saved with an extra string in the extension (e.g. instead of .jpg I get something like .jpg#f=l=q).
Is there anything wrong in my command that causes this? I only want to get the images from the links shown in the directory given initially.
If there is nothing I can change, then I want to download only the files that contain .jpg in their names; I already have a Python script that can rename the files back to the original extension. Worst case, I can try page scraping in Python instead of wget from the Windows command prompt.
Note that --no-parent doesn't work in this case because the images are saved in a different directory. --accept-regex can be used if there is no way to get the correct extension.
PS: I do this thing in order to learn more about the wget options and protect my future hobby website.
UPD: Any suggestions regarding a Python script are welcome.
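Building on the --accept-regex note above, a minimal sketch, assuming wget 1.14 or newer and the site/directories from the question (the regex is illustrative: it accepts URLs whose path ends in .jpg or .jpeg, optionally followed by a query or fragment):
wget -P "C:\My Web Sites\REGEX" -r --accept-regex ".*\.jpe?g([?#].*)?$" "https://www.mywebsite.com/directory1/directory2/"
Unlike -A, which looks only at the file name suffix, --accept-regex is matched against the complete URL, so it can also accept links where extra characters follow the .jpg part.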

I need a bash script to download a tar file from a website; the site has multiple files which need to be filtered

I have a situation where I need to use curl/wget to download a tar file from a website, based on user input. If the user specifies a build, I need to download the tar file for that release version. I already have the logic to switch between builds; the question is how I can filter out a particular tar file from among multiple files.
curl -s https://somewebsite/somerepo-folder/os-based.tar | grep os-based* > sample.txt
curl -s https://somewebsite/somerepo-folder/os-based2.tar
The first curl downloads all the files. A regex would help here; how can I use one together with curl?
If there is a mapping between the user input and the tar file name that you can think of, you can do something like this:
userInput=1
# some logic to map the user input to the tar filename to download
tarFileName="os-based$userInput.tar"
wget "https://somewebsite/somerepo-folder/$tarFileName"

Youtube-dl - want to save only the playlist (list of file names) to a text file

I want to save only the file list (the file names in a playlist) to an output text file using youtube-dl.
There is an easy method; use this command:
youtube-dl -i -o "%(playlist_index)s-%(title)s" --get-filename --skip-download "https://www.youtube.com/playlist?list=PLpQLftDPSfXzLsww0KvSRerT00G4frW7L" > log.txt
Running this command writes the file names, one per line, into log.txt.
-i or --ignore-errors => continue on download errors, skipping unavailable videos.
After -o you can build the output template for the file names however you want.
If you only need the title, use %(title)s
If you need the playlist index and the title, use %(playlist_index)s-%(title)s
Add the extension if needed: %(playlist_index)s-%(title)s.%(ext)s
Add the uploader if needed: %(playlist_index)s-%(uploader)s%(title)s.%(ext)s
This way you can customize the template. The full list of output template fields is in the youtube-dl README.
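If you also want each video's URL next to its name (for example, to build the URL/file-name pair of lists from the first question above), the -o template accepts literal text and the %(id)s field; a minimal sketch, reusing the playlist URL from the answer (the line layout itself is just an example):
youtube-dl -i --get-filename --skip-download -o "%(playlist_index)s-%(title)s https://www.youtube.com/watch?v=%(id)s" "https://www.youtube.com/playlist?list=PLpQLftDPSfXzLsww0KvSRerT00G4frW7L" > log.txt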

GeoIPCity.dat file - where do I find it?

I have a file GeoIPCity.dat on one of my sites. The original developer told me it was a free file from maxmind.com and that I should update it every month. I have looked on maxmind.com and haven't been able to find a file with the exact same name. Any idea which file I should use as the update? Here is the list of files I was able to find on the website: http://dev.maxmind.com/geoip/legacy/geolite/#Downloads
The Legacy "dat" format DBs are discontinued:
https://support.maxmind.com/geolite-legacy-discontinuation-notice/
The free version of GeoIP City was renamed from GeoIPCity.dat to GeoLiteCity.dat.
Download the data; this should work on CentOS 7/7.1:
wget -N http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz -O /usr/share/GeoIP/GeoLiteCity.dat.gz && gunzip --force /usr/share/GeoIP/GeoLiteCity.dat.gz
This maintains a backward-compatible name:
ln -s /usr/share/GeoIP/GeoLiteCity.dat /usr/share/GeoIP/GeoIPCity.dat
The old files can be downloaded from the following mirror: https://mirrors-cdn.liferay.com/geolite.maxmind.com/download/geoip/database/
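Since the original geolite.maxmind.com URL above no longer serves the file, the same two commands can be pointed at that mirror; a minimal sketch, assuming the mirror keeps the original path layout (the path below is just the mirrored form of the original download URL):
wget -N https://mirrors-cdn.liferay.com/geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz -O /usr/share/GeoIP/GeoLiteCity.dat.gz && gunzip --force /usr/share/GeoIP/GeoLiteCity.dat.gz
ln -s /usr/share/GeoIP/GeoLiteCity.dat /usr/share/GeoIP/GeoIPCity.dat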
A copy of the file GeoLiteCity.dat is also available from GitHub at the following URL:
https://github.com/mbcc2006/GeoLiteCity-data
Click the Code button, then download.
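If you prefer the command line to the web UI, cloning the repository should also work (a sketch; the repository URL is the one above, and the assumption that GeoLiteCity.dat sits at the repository root is mine):
git clone https://github.com/mbcc2006/GeoLiteCity-data.git
# assumes the .dat file is at the repository root; adjust the path if the layout differs
cp GeoLiteCity-data/GeoLiteCity.dat /usr/share/GeoIP/GeoLiteCity.dat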