Batch file: evaluating command output with regex?

I have made a Windows batch file based on some examples I have found elsewhere.
What it does is it parses a folder for a specific file type (.mkv) and then runs mkvmerge.exe from the MKVToolNix folder.
The command produces an output, listing the different tracks in the container.
The core of the file is
set rootfolder="Z:\Movies"
for /r %rootfolder% %%a in (*.mkv) do (
for /f %%b in ('mkvmerge -i "%%a" ^| find /c /i "chapters"') do (
if [%%b]==[0] (
echo "%%a" has no chapters
) else (
echo Doing some interesting stuff!
)
)
)
The above example is just part of the file, rootfolder is set to the folder I want parsed, of course, and upon finding a file with chapters in it, it will run additional commands.
It all works beautifully but I also want to check for subtitles at the same time. The find command doesn't take regular expressions or I could just have added "chapters subtitles". My efforts using other commands, like findstr, haven't really worked.
How do I go about using RegEx here?
This is an example output, running mkvmerge.exe on an .mkv file
mkvmerge.exe -i "Snow White and the Seven Dwarfs (1937) [tt0029583].mkv"
File 'Snow White and the Seven Dwarfs (1937) [tt0029583].mkv': container: Matroska
Track ID 0: video (MPEG-4p10/AVC/h.264)
Track ID 1: audio (DTS)
Track ID 2: audio (AC-3)
Track ID 3: subtitles (HDMV PGS)
Track ID 4: subtitles (HDMV PGS)
Chapters: 26 entries
This example has both subtitle and chapter tracks and the batch file will find the keyword "chapters" (it's set to ignore case). I also want to catch the files that contain the keyword "subtitles" even when there are no chapters.
To clarify my intent here, I want the code to:
Parse through the given folder
For all .mkv files, run mkvmerge -i, which outputs (as text) the streams in that file
Look at that output and if it contains the word(s) "chapters" and/or "subtitles" trigger some action.

Since you appear to make no distinction between whether one string or the other (or both) is detected, then
@Echo off
set rootfolder="Z:\Movies"
for /r %rootfolder% %%a in (*.mkv) do (
mkvmerge -i "%%a" | findstr /i "^Chapters subtitles" >nul
if errorlevel 1 (
echo Neither Chapters nor subs found in "%%a"
) else (
echo Chapters or subs found in "%%a"
)
)
would likely be easier.
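As a rough illustration of how findstr treats the space-separated words as alternatives, the sketch below writes two lines of the sample output from the question into a scratch file and counts how many of them match. The scratch file name and the `hits` variable are made up for this example, and `/r` is added only to make the regex mode explicit.

```batch
@echo off
setlocal
rem Hypothetical scratch file holding two lines of the sample mkvmerge output.
set "sample=%temp%\mkvmerge_sample_%random%.txt"
>"%sample%" (
  echo Track ID 3: subtitles ^(HDMV PGS^)
  echo Chapters: 26 entries
)
rem The quoted search text holds two search strings: ^Chapters OR subtitles.
rem find /c /v "" counts the lines that findstr lets through.
for /f %%n in ('findstr /i /r "^Chapters subtitles" "%sample%" ^| find /c /v ""') do set "hits=%%n"
echo Matching lines: %hits%
del "%sample%"
```

Both sample lines pass the filter, so the count is 2. A file with neither keyword would give 0, which corresponds to the errorlevel 1 branch above.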

Thanks LotPings for your effort. I learned something about tokens that will be very useful. Your script also got me on the right track (learning a few more commands on the way).
Your script ended up looking like this:
:: Q:\Test\2018\05\27\SO_50555308.cmd
@Echo off
set rootfolder="Z:\Movies"
for /r %rootfolder% %%a in (*.mkv) do (
set "found="
for /f "tokens=1-4* delims=: " %%b in (
'mkvmerge -i "%%a" ^| findstr /i "^Chapters subtitles"'
) do (
if /i "%%b"=="Chapters" set found=1
if /i "%%e"=="subtitles" set found=1
)
if defined found (
echo Chapters or subs found in "%%a"
) else (
echo.
)
)
All I needed was to check whether one of my keywords was present in any of the tokens, set a flag accordingly, and after the loop do the appropriate action, resetting the flag for the next file.

Still unclear what a line with Subtitles would look like.
Presuming the values Chapters and Subtitles appear in the 1st column.
The for /f splits each line at colons and spaces (adjacent delims count as one) into tokens 1-4 (%%b to %%e), with any unsplit remainder in %%f.
The search words are separated by a space, which findstr treats as OR; the ^ anchors Chapters to the start of the line, while subtitles can match anywhere in it.
:: Q:\Test\2018\05\27\SO_50555308.cmd
@Echo off
set rootfolder="Z:\Movies"
for /r %rootfolder% %%a in (*.mkv) do (
for /f "tokens=1-4* delims=: " %%b in (
'mkvmerge -i "%%a" ^| findstr /i "^Chapters subtitles"'
) do (
if /i "%%b"=="Chapters" if "%%c"=="0" (
echo "%%a" has no chapters
) else (
echo "%%a" has %%c chapters
echo Doing some interesting stuff!
)
if /i "%%e"=="subtitles" echo "%%a" %%b %%c %%d: %%e %%f
)
)
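To see what the tokens hold, here is a minimal sketch that feeds two lines from the sample output in the question straight into for /f as strings, with the same tokens and delims options as above. The variable names `track_kind` and `chapter_count` are made up for this illustration.

```batch
@echo off
setlocal
rem Two lines copied from the sample mkvmerge output in the question.
for /f "tokens=1-4* delims=: " %%b in ("Track ID 3: subtitles (HDMV PGS)") do (
  echo b=%%b c=%%c d=%%d e=%%e f=%%f
  set "track_kind=%%e"
)
for /f "tokens=1-4* delims=: " %%b in ("Chapters: 26 entries") do (
  echo b=%%b c=%%c d=%%d e=%%e f=%%f
  set "chapter_count=%%c"
)
```

For the track line the keyword lands in the fourth token (%%e), while for the chapters line the keyword is the first token (%%b) and the count is the second (%%c). That is why the script tests %%b against Chapters and %%e against subtitles.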

Related

How to subtract string and non-null value entries from txt file?

I have a script that extracts lines such as :
THIS_IS_A_LINE:=
THIS_IS_A_LINE2:=
and outputs all of the same kind into another .txt file as:
THIS_IS_A_LINE
THIS_IS_A_LINE2
The script is the following:
set "file=%cd%/Config.mak"
set /a i=0
set "regexp=.*:=$"
setlocal enableDelayedExpansion
IF EXIST Source_List.txt del /F Source_List.txt
for /f "usebackq delims=" %%a in ("%file%") do (
set /a i+=1
call set Feature[!i!]=%%a
)
cd .. && cd ..
rem call echo.!Feature[%i%]!
for /L %%N in (1,1,%i%) do (
echo(!Feature[%%N]!|findstr /R /C:"%regexp%" >nul && (
call echo FOUND
call set /a j+=1
call set Feature_Disabled[%j%]=!Feature[%%N]:~0,-2!
call echo.!Feature_Disabled[%j%]!>>Source_List.txt
) || (
call echo NOT FOUND
)
)
endlocal
I also have another script that extracts lines such as:
THIS_IS_ANOTHER_LINE:=true
THIS_IS_ANOTHER_LINE2:=true
...
and outputs all of the same kind into another .txt file as:
THIS_IS_ANOTHER_LINE
THIS_IS_ANOTHER_LINE2
...
The script is the following:
set "file=%cd%/Config.mak"
set /a i=0
set "regexp=.*:=true$"
setlocal enableDelayedExpansion
IF EXIST Source_List2.txt del /F Source_List2.txt
for /f "usebackq delims=" %%a in ("%file%") do (
set /a i+=1
call set Feature[!i!]=%%a
)
cd .. && cd ..
rem call echo.!Feature[%i%]!
for /L %%N in (1,1,%i%) do (
echo(!Feature[%%N]!|findstr /R /C:"%regexp%" >nul && (
call echo FOUND
call set /a j+=1
call set Feature_Disabled[%j%]=!Feature[%%N]:~0,-6!
call echo.!Feature_Disabled[%j%]!>>Source_List2.txt
) || (
call echo NOT FOUND
)
)
endlocal
However, there is a third kind of line, which contains numeric values (some of them hexadecimal) or other text, such as:
THIS_IS_AN_UNPROCESSED_LINE:=0xA303
THIS_IS_AN_UNPROCESSED_LINE2:=1943
THIS_IS_AN_UNPROCESSED_LINE3:=HELLO_DOOD_CAN_YOU_PARSE_ME?
So I also need a way to extract those kinds of lines into another .txt file, such as:
THIS_IS_AN_UNPROCESSED_LINE:=0xA303
THIS_IS_AN_UNPROCESSED_LINE2:=1943
THIS_IS_AN_UNPROCESSED_LINE3:=HELLO_DOOD_CAN_YOU_PARSE_ME?
So basically extract lines which are not of the kind:
THIS_IS_AN_UNPROCESSED_LINE:=
or
THIS_IS_AN_UNPROCESSED_LINE:=true
but keeping both sides of the line entry.
I know there must be some trick with the regular expression but I just can't figure it out.
You have made your code much more complicated than it needs to be. There is no need to create an array of every line in the file.
If there are no other : or = before the first :=, then you can use FINDSTR to print out all lines that contain a string, followed by :=. FOR /F can capture and parse each matching line into the parts before and after :=, and then IF statements can classify the three different types of lines.
I use n> to open all three output files outside the main code block for improved performance, and then I use the &n> syntax to direct each output to the appropriate, already opened file. I use high numbered file handles to avoid problems described at Why doesn't my stderr redirection end after command finishes? And how do I fix it?.
@echo off
setlocal
set "file=Config.mak"
set /a "empty=7, true=8, unprocessed=9"
%empty%>empty.txt %true%>true.txt %unprocessed%>unprocessed.txt (
for /f "delims=:= tokens=1*" %%A in ('findstr /r "^[^:=][^:=]*:=" "%file%"') do (
if "%%B" equ "" (
>&%empty% (echo %%A)
) else if "%%B" equ "true" (
>&%true% (echo %%A)
) else (
>&%unprocessed% (echo %%A:=%%B)
)
)
)
The above will ignore lines that contain : or = before :=, and it will not work properly if the first character after := is : or =. I'm assuming that should not be a problem.
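The effect of using : and = as delimiters can be checked with a small sketch like the one below. The four sample entries are made up in the style of the Config.mak lines from the question, and `last_value` is a hypothetical variable used only to keep the final result.

```batch
@echo off
setlocal
rem Made-up sample lines in the style of the Config.mak entries from the question.
for %%L in ("NAME_A:=" "NAME_B:=true" "NAME_C:=0xA303" "NAME_D:==5") do (
  for /f "delims=:= tokens=1*" %%A in ("%%~L") do (
    echo name=[%%A] value=[%%B]
    set "last_value=%%B"
  )
)
```

The first three entries split into a name and an empty, true, or other value. The last entry shows the limitation: its value starts with =, which is swallowed as a delimiter, so %%B holds 5 rather than =5.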
It should be relatively easy to write a very efficient solution using PowerShell, VBScript, or JScript that eliminates the limitations.
You could also use JREPL.BAT - a powerful and efficient regular expression text processing command line utility. JREPL.BAT is pure script (hybrid batch/JScript) that runs natively on any Windows machine from XP onward, no 3rd party exe required. And JREPL is much faster than any pure batch solution, especially if the files are large.
Here is one JREPL solution
@echo off
setlocal
set repl=^
$txt=false;^
if ($2=='') stdout.WriteLine($1);^
else if ($2=='true') stderr.WriteLine($1);^
else $txt=$0;
call jrepl "^(.+):=(.*)$" "%repl%" /jmatchq^
/f Config.mak /o unprocessed.txt >empty.txt 2>true.txt
If all you have to do is classify the lines into three different files, without worrying about stripping off the :=true and := parts for the empty and true lines, then there is a very simple pure batch solution using nothing but FINDSTR.
@echo off
set "file=Config.mak"
findstr /r ".:=$" "%file%" >empty.txt
findstr /r ".:=true$" "%file%" >true.txt
findstr /r ".:=" "%file%" | findstr /r /v ":=$ :=true$" >unprocessed.txt

Batch Splitting a line of text into multiple lines, delimited by quotation space quotation

Thanks in Advance.
Using a DOS batch file, I am trying to read a text file that contains several full paths with quotes, separated by a space and write a new file containing one path per line.
For example, I want to turn this file:
"C:\path\filename.doc" "C:\path\filename.doc" "C:\path\filename.doc" "C:\path\filename.doc"
into this:
"C:\path\filename.doc"
"C:\path\filename.doc"
"C:\path\filename.doc"
"C:\path\filename.doc"
I have had some success using the wonderful repl.bat (by dbenham).
type "files.txt" | repl " " "\r\n" x l >"newfile.txt"
But when there are spaces in the filenames or path it breaks a new line in the middle of the path and wrecks it.
I've tried passing as the search variable into repl using the escape character ^, i.e. repl "^" ^"" and other ways with no joy.
At the end of the day, I simply need to move all the files into another directory, and so was going to then pass the resulting text file to another bulk delete batch file for processing, but perhaps there is a better way I'm missing?
This has a limitation in the length of the line, of around 8 KB.
Less than that and it will move the files to your new folder.
@echo off
for /f "usebackq delims=" %%a in ("c:\folder\file.txt") do (
for %%b in (%%a) do move "%%~b" "d:\existing\new\folder"
)
The code below should work to move all files except the ones in the list.
It adds a hidden attribute to the files in the list, moves all the other files, then removes the hidden attributes again.
@echo off
for /f "usebackq delims=" %%a in ("c:\folder\file.txt") do (
for %%b in (%%a) do attrib +h "%%~b"
)
cd /d "c:\folder"
move *.* "d:\already\existing folder"
for /f "usebackq delims=" %%a in ("c:\folder\file.txt") do (
for %%b in (%%a) do attrib -h "%%~b"
)
Test code for Windows 2012 as mentioned in the comments
@echo off
(echo "c:\widget1\test 1.txt" "2:\widget2\test 2.doc")>"file.txt"
for /f "usebackq delims=" %%a in ("file.txt") do (
for %%b in (%%a) do echo move "%%~b" "d:\existing\new\folder"
)
pause
You could use the following batch file split.bat and call it redirecting the content of your text file into it and redirecting the output into another file like split.bat < files.txt > newfiles.txt:
@echo off
set /P INFILE=
call :SPLIT %INFILE%
exit /B
:SPLIT
shift
if "%~0"=="" exit /B
echo "%~0"
goto :SPLIT
If you do not provide an input file (< files.txt), the script prompts you for a space-separated list.
If no output file is given (> newfiles.txt), the created new-line-separated list is shown on screen.
Notice that this does not verify whether your input file fulfills the described formatting.
This method is limited to a list length of 1021 bytes (characters), everything after will be truncated!
Assuming you can guarantee that each file path is enclosed within double quotes, then you just need to tweak your REPL.BAT command a bit:
type "files.txt" | repl "(\q.*?\q) *" "$1\r\n" x >"newfile.txt"
But REPL.BAT has been superseded by JREPL.BAT - it has even more functionality, and a slightly different syntax.
A JREPL solution can be as simple as:
jrepl "\q.*?\q" $0 /x /jmatch /f file.txt /o newfile.txt
If you want, you can overwrite the original file with the result by specifying - as the output file.
jrepl "\q.*?\q" $0 /x /jmatch /f file.txt /o -
If each line in the original file is <8k, then the following pure batch script should work, and it is pretty simple:
@echo off
>newfile.txt (
for /f "delims=" %%A in (files.txt) do for %%B in (%%A) do echo %%B
)

Windows batch script to search for specific files to delete

I need help converting the following Linux/Mac commands into a Windows batch script.
find / -regex "^.*/Excel_[a-z]*.xls" -delete 2>/dev/null
find / -regex "^.*/presentation[0-9]*[a-z]*.ppt" -delete 2>/dev/null
Basically using regular expressions, I need to find any of the .xls/.ppt files (in the format above) in a given Windows box and delete them.
Sorry, I'm new to Windows batch files.
You really don't explain what your hieroglyphics mean.
In a batch file,
@echo off
setlocal
dir /s "c:\local root\Excel_*.xls"
would show all of the files whose names start with Excel_ and have the extension .xls
(and if that's what you want, simply replacing dir with del would delete them; adding >nul would suppress messages; adding 2>nul suppresses error messages)
If you want files starting Excel_ then followed by any alphas-only, then
for /f "delims=" %%a in ('dir /b /s /a-d Excel_*.xls ^| findstr /E /R "\\Excel_[a-z]*\.xls" ') do echo "%%a"
The dir produces a directory list in /b (bare) form (i.e. filename only); /s includes subdirectories (which attaches the full directory name) and /a-d suppresses directory names. This is piped to findstr to filter out the required filenames. The result is assigned to %%a line by line, and the delims= means "don't tokenise the data".
This should show you all the files matching that criterion, but it would be case-sensitive; add /i to the findstr switches to make it case-insensitive. (/r means "regex" within findstr's restricted implementation; /e means "ends"; I tend to use these over $.) The \\ is intended to implement an escaped \ to ensure the filename match is complete (i.e. don't find "some_prefixExcel_whatever.xls"); I'll leave what \. is intended to do to your imagination...
(again, change echo to del to delete and add in the redirection palaver if required)
And the other regex - well, follow the bouncing ball. It would appear you want .ppt files with names starting presentation followed by a possible series of numerics then by a series of alphabetics. I'd suggest
findstr /e /r "\\presentation[0-9]*[a-z]*\.ppt" for that task.
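Before pointing such a filter at del, it can be tried on a list of names. The sketch below is only an illustration: the three paths and the scratch file name are made up, and `kept` is a hypothetical variable that counts the lines the filter lets through.

```batch
@echo off
setlocal
rem Hypothetical scratch file with made-up full paths, one per line.
set "list=%temp%\ppt_names_%random%.txt"
>"%list%" (
  echo C:\docs\presentation12ab.ppt
  echo C:\docs\my_presentation1.ppt
  echo C:\docs\presentation7.pptx
)
rem /e ties the pattern to the end of the line; \\ stands for one literal backslash.
rem find /c /v "" counts the lines that survive the filter.
for /f %%n in ('findstr /i /e /r "\\presentation[0-9]*[a-z]*\.ppt" "%list%" ^| find /c /v ""') do set "kept=%%n"
echo Paths kept by the filter: %kept%
del "%list%"
```

Only the first path survives: the second has a prefix between the backslash and presentation, and the third ends in .pptx rather than .ppt.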
Use PowerShell.
get-childitem | where-object { $_.Name -match '<put a regex here>' } | remove-item
get-childitem returns file system objects, and the where-object filter selects only those file system objects whose name property matches a regular expression. These filtered items are then passed through the pipeline to remove-item.
There is good information about the PowerShell pipeline in the about_pipelines help topic, which you can read using the following command:
help about_pipelines
Plain Batch can't do this task. But you could make use of tools like findstr which come with Windows and support Regex.
This line can be executed from CMD and deletes all files in the current folder whose names match the RegEx (the file names come from dir /b, since findstr on its own would wait for input):
for /f "eol=: delims=" %F in ('dir /b /a-d ^| findstr /r "MY_REGEX_HERE"') do del "%F"
So try to get your expected results by playing around with this command. If you're fine with the output/results, you can embed this line in your batch script. (Be careful: when embedding this line in a batch script you have to double the percent signs!)
for /f "eol=: delims=" %%F in ('dir /b /a-d ^| findstr /r "MY_REGEX_HERE"') do del "%%F"
Here is a small batch file to do what you are looking for:
@echo off
set /p dirpath=Where are your files ?
:: set dirpath=%~dp0 :: if you want to use the directory where the batch file is
set /p pattern=Which pattern do you wanna search (use wildcards, e.g. *.xml) :
:: combination /s /b for fullpath+filename, /b for filename
for /f %%A in ('dir /s /b "%dirpath%\%pattern%" ^| find /v /c ""') do set cnt=%%A
echo File count = %cnt%
call :MsgBox "Do you want to delete all %pattern% in %dirpath%? %cnt% files found" "VBYesNo+VBQuestion" "Click yes to delete the %pattern%"
if errorlevel 7 (
echo NO - quit the batch file
) else if errorlevel 6 (
echo YES - delete the files
rem set your pattern: %%i in batch file, % in cmd
for /f "delims=" %%a in ('dir /s /b "%dirpath%\%pattern%"') do del "%%a"
)
:: avoid little window to popup
exit /b
:: VBS code for the yesNo popup
:MsgBox prompt type title
setlocal enableextensions
set "tempFile=%temp%\%~nx0.%random%%random%%random%vbs.tmp"
>"%tempFile%" echo(WScript.Quit msgBox("%~1",%~2,"%~3") & cscript //nologo //e:vbscript "%tempFile%"
set "exitCode=%errorlevel%" & del "%tempFile%" >nul 2>nul
endlocal & exit /b %exitCode%

Extract number from string in batch file

From a batch file I want to extract the number 643456 from the following string:
C:\Users\testing\AppData\Local\Test\abc123\643456\VSALBT81_COM
The number will change, however it will always be just digits.
My current theory is to search for something that fits \alldigits\, then replace the two \s with white space, but I can’t quite get it.
Assuming the number is always the parent folder (the folder before the end):
@echo off
set "str=C:\Users\testing\AppData\Local\Test\abc123\643456\VSALBT81_COM"
for %%F in ("%str%\..") do set "number=%%~nxF"
EDIT - Code sample adapted to correct errors shown in comments
set d=C:\Users\testing\AppData\Local\Test\abc123\643456\VSALBT81_COM
for %%f in ("%d:\=" "%") do for /f %%n in ('echo %%f^|findstr /b /e /r "\"[0-9]*\""') do (
echo %%~n
)
Just precede the path with a quote, replace each backslash with a quote, a space and a quote, and append a quote, so we have a list of quoted elements to iterate; then, for each element, check whether it consists only of digits.
@echo off
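The same splitting idea can also be sketched without the pipe to findstr. The version below is an assumption-laden variant, not the code above: it relies on for /f producing no token at all when an element consists only of delimiter characters, and the names `digits_only` and `number` are made up for this illustration.

```batch
@echo off
setlocal
set "d=C:\Users\testing\AppData\Local\Test\abc123\643456\VSALBT81_COM"
set "number="
rem Replacing every backslash with quote-space-quote gives a list of quoted elements.
rem With the ten digits as delimiters, a digits-only element yields no token,
rem so the inner loop body only runs for elements that contain something else.
for %%i in ("%d:\=" "%") do (
  set "digits_only=1"
  for /f "delims=0123456789" %%x in ("%%~i") do set "digits_only="
  if defined digits_only set "number=%%~i"
)
echo Number: %number%
```

For the path from the question this prints Number: 643456. The element abc123 is rejected because its letters survive as a token.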
setlocal EnableDelayedExpansion
set "string=C:\Users\testing\AppData\Local\Test\abc123\643456\VSALBT81_COM"
for /L %%d in (0,1,9) do set "string=!string:\%%d=\ %%d!"
for /F "tokens=2" %%a in ("%string%") do for /F "delims=\" %%b in ("%%a") do echo Number: [%%b]
This uses a helper batch file called repl.bat from - https://www.dropbox.com/s/qidqwztmetbvklt/repl.bat
@echo off
set "string=C:\Users\testing\AppData\Local\Test\abc123\643456\VSALBT81_COM"
echo "%string%"|repl ".*\\([0-9]*)\\.*" "$1"
Here is how I stripped numbers from a string in batch (not a file path; it should work generically for any string)
@ECHO OFF
::set mystring=Microsoft Office 64-bit Components 2013
set mystring=Microsoft 365 Apps for enterprise - en-us
echo mystring = %mystring%
for /f "tokens=1-20 delims=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$&*()-= " %%a in ("%mystring%") do (
IF %%a == 64 (
set ONum=%%b
GoTo varset
)
IF %%a == 32 (
set ONum=%%b
GoTo varset
)
set ONum=%%a
)
:varset
echo numfromalphanumstr = %ONum%
pause
https://www.dostips.com/forum/viewtopic.php?t=3499
https://superuser.com/questions/1065531/filter-only-numbers-0-9-in-output-in-classic-windows-cmd
Extract number from string in batch file
How to extract number from string in BATCH

Batch .BAT script to rename files

I'm looking for a batch script to (recursively) rename a folder of files.
Example of the rename:
34354563.randomname_newname.png to newname.png
I already dug up the RegEx for matching from the beginning of the string to the first underscore (it's ^(.*?)_ ), but can't convince Windows Batch to let me copy or rename using RegEx.
From command-line prompt - without regex:
FOR /R %f IN (*.*) DO FOR /F "DELIMS=_ TOKENS=1,*" %m IN ("%~nxf") DO @IF NOT "%n" == "" REN "%f" "%n"
In batch file, double %:
FOR /R %%f IN (*.*) DO FOR /F "DELIMS=_ TOKENS=1,*" %%m IN ("%%~nxf") DO @IF NOT "%%n" == "" REN "%%f" "%%n"
EDIT: A new pure batch solution handling the following cases:
Path\File_name.ext => name.ext
Path\none.ext (does nothing)
Path\Some_file_name.ext => file_name.ext
Path\name.some_ext (does nothing)
Path\Some_file_name.some_ext => name.some_ext
Batch (remove ECHO to make it functional):
FOR /R %%f IN (*.*) DO CALL :UseLast "%%~f" "%%~nf"
GOTO :EOF
:UseLast
FOR /F "DELIMS=_ TOKENS=1,*" %%m IN (%2) DO IF "%%n"=="" (
IF NOT "%~2"=="%~n1" ECHO REN %1 "%~2%~x1"
) ELSE CALL :UseLast %1 "%%n"
GOTO :EOF
avoid _ in the extension:
@ECHO OFF
FOR /F "DELIMS=" %%A IN ('DIR /S /B /A-D *_*.*') DO FOR /F "TOKENS=1*DELIMS=_" %%B IN ("%%~NA") DO IF "%%~C" NEQ "" ECHO REN "%%~A" "%%~C%%~XA"
Look at the output and remove ECHO if it looks good.
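To get a feel for how the underscore split behaves before touching real files, a dry run on a few made-up names can help. The sketch below only echoes what the new name would be; the three file names are invented for this example, `newname` is a hypothetical variable, and nothing is renamed.

```batch
@echo off
setlocal
rem Dry run on made-up names; nothing is renamed, the result is only echoed.
for %%I in ("34354563.randomname_newname.png" "a_b_c.png" "plain.png") do (
  for /f "tokens=1* delims=_" %%B in ("%%~nI") do (
    if "%%C"=="" (
      echo %%~nxI: no underscore, left alone
    ) else (
      echo %%~nxI -^> %%C%%~xI
      set "newname=%%C%%~xI"
    )
  )
)
```

With tokens=1* everything after the first underscore is kept, so a_b_c.png would become b_c.png, and a name without an underscore produces an empty second token and is skipped.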
The standard Windows shell doesn't have regex capabilities; in fact, it's extremely basic. But you can use PowerShell to do this.
I'm not very familiar with PowerShell, but this other question explains how to filter files: Using powershell to find files that match two seperate regular expressions
New Solution
There is an extremely simple solution using JREN.BAT, my new hybrid JScript/batch command line utility for renaming files and folders via regular expression search and replace.
Only rename files with one underscore:
jren "^[^_]+_([^_]+\.png)$" "$1" /s /i
Preserve everything after the first underscore:
jren "^.*?_" "" /s /fm "*.png"
Preserve everything after the last underscore:
jren "^.*_" "" /s /fm "*.png"
=========================================================
Original Answer
This can be done using a hybrid JScript/batch utility called REPL.BAT that performs regex search and replace on stdin and writes the result to stdout. All the solutions below assume REPL.BAT is somewhere within your PATH.
I intentionally put ECHO ON so that you get a log of the executed rename commands. This is especially important if you get a name collision: two different starting names could both collapse to the same new name. Only the first rename will succeed - the second will fail with an error message.
I have three solutions that differ only in how they handle names that contain more than 1 _ character.
This solution will only rename files that have exactly one _ in the name (disregarding extension). A name like a_b_c_d.txt would be ignored.
@echo on&@for /f "tokens=1,2 delims=*" %%A in (
'2^>nul dir /b /s /a-d *_* ^| repl ".*\\[^_\\]*_([^_\\]*\.[^.\\]*)$" "$&*$1" a'
) do ren "%%A" "%%B"
The next solution will preserve the name after the last _. A name like a_b_c_d.txt would become d.txt
@echo on&@for /f "tokens=1,2 delims=*" %%A in (
'2^>nul dir /b /s /a-d *_* ^| repl ".*_([^\\_.]*\.[^.\\]*)$" "$&*$1" a'
) do echo ren "%%A" "%%B"
This last solution will preserve the name after the first _. A name like a_b_c_d.txt would become b_c_d.txt. This should give the same result as the Endoro answer.
@echo on&@for /f "tokens=1,2 delims=*" %%A in (
'2^>nul dir /b /s /a-d *_* ^| repl ".*\\[^\\]*?_([^\\]*\.[^.\\]*)$" "$&*$1" a'
) do echo ren "%%A" "%%B"