Writing .txt files from lines 1 to i - regex

I am very close to my answer, but I can't seem to find it.
I am using the findstr command in a batch file to narrow down an entire directory to just one file.
cd ...
findstr /s /m "Desktop" *class.asasm >results1.txt
findstr /m /f:results1.txt "Production" *class.asasm >results2.txt
findstr /n /f:results2.txt "Capabilities" *class.asasm >results3.txt
TASK 1: I need a way to make findstr search backwards for a fourth string, starting from the line number on which the third string was found.
TASK 2: I need to write lines 1 through the line found in Task 1 of the file listed in results2.txt, then insert a .txt file, then write the rest of the original lines.
I am writing an application in VB.Net with Visual Studio and I am having a difficult time figuring out how to complete this process. So far I am having better luck having the application run batch files that are written from within the application.

The correct solution is to find a tool that does this properly. batch/CMD does not.
Here's a script that tells you the line numbers of the 3rd and 4th match. It's probably not exactly what you want, but it is a demonstration of how one can effectively work with line numbers.
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET "FILE=TestFile.txt"
SET _LINENO=1
SET _MATCHNO=0
SET _THIRDLINENUM=
SET _FOURTHLINENUM=
REM "delims=" keeps each line whole; note that FOR /F skips empty lines.
FOR /F "usebackq delims=" %%l IN ("%FILE%") DO (
    ECHO %%l | FINDSTR "Target" >NUL
    IF NOT ERRORLEVEL 1 (
        SET /A _MATCHNO=!_MATCHNO!+1
        IF !_MATCHNO!==3 SET _THIRDLINENUM=!_LINENO!
        IF !_MATCHNO!==4 SET _FOURTHLINENUM=!_LINENO!
    )
    SET /A _LINENO=!_LINENO!+1
)
@ECHO %_THIRDLINENUM% : %_FOURTHLINENUM%
Here's what's in TestFile.txt
abcdefg
bcdefgh
Target 1
cdefghi
defghij
fghijkl
Target 2
ghijklm
hijklmn
ijklmno
jklmnop
klmnopq
lmnopqr
mnopqrs
Target 3
nopqrst
Target 4
opqrstu
pqrstuv
qrstuvw
rstuvwx
stuvwxy
tuvwxyz
If you insist on using batch/CMD (and I sometimes do when nothing else is available), and you need to get the text on line #n (otherwise, head and tail would do just fine), you could produce a similar loop but replace the code from FINDSTR down to the end of the IF statement with something that compares _LINENO with some other variable, ECHO'ing the line if it is between the two values. IF has no AND operator, so you have to chain the IF statements, like
IF !_LINENO! GEQ %START_LINE% IF !_LINENO! LEQ %END_LINE% @ECHO %%l
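Put together, a minimal sketch of such a range loop might look like the following. START_LINE and END_LINE are hypothetical values chosen only to fit the TestFile.txt shown above; with them the loop should print lines 3 through 7 (Target 1 through Target 2). Keep in mind that FOR /F skips empty lines and lines starting with a semicolon, so the count drifts if the file contains any.

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
REM Hypothetical file name and range, chosen for illustration only.
SET "FILE=TestFile.txt"
SET START_LINE=3
SET END_LINE=7
SET _LINENO=0
FOR /F "usebackq delims=" %%l IN ("%FILE%") DO (
    SET /A _LINENO+=1
    REM Two chained IFs behave like a logical AND.
    IF !_LINENO! GEQ %START_LINE% IF !_LINENO! LEQ %END_LINE% ECHO(%%l
)
```

Because delayed expansion is enabled, lines containing exclamation marks will not be echoed faithfully; that is one more reason to prefer a real text tool when one is available.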

assuming you need this (from your first comment):
I still have not found a way to search starting at line xx rather than 1 or to search in reverse order
you can try this (from the command line):
for /r %i in ("file pattern") do @more "%~i" +starting_line |findstr "search string"
for /r = recursively (if you mean really reverse, please explain)
"file pattern" = files to find, eg. "*class.asasm"
starting_line = search starting line, eg. 7 (more +6)
"search string" = your search pattern, eg. "Desktop"
OR search "Desktop Production Capabilities"
AND search |findstr "Desktop"|findstr "Production"|findstr "Capabilities"
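The difference between the OR and the AND form can be sketched in a small batch file. The file name sample.txt is a placeholder, not something from the question.

```batch
@echo off
rem sample.txt is a hypothetical input file.

rem OR: print lines containing at least one of the three words.
findstr "Desktop Production Capabilities" "sample.txt"

rem AND: each findstr only passes on lines that matched so far,
rem so only lines containing all three words survive.
findstr "Desktop" "sample.txt" | findstr "Production" | findstr "Capabilities"
```

If a search string itself contains a space, the /C:"..." option makes findstr treat it as one literal string instead of a list of alternatives.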

Related

How to replace + append with jrepl via cmd?

In Windows, I have a batch file for processing the text file C:\BBB\CCC\list.txt for deciding which files to move. It should move all the files that are in a folder (%folder%) and its subfolders, but only if:
in the name of the folder there is not the year set in input (%excludeName%)
that file is not listed in a text file (%excludeFile%).
In that list.txt I have millions of rows like:
C:\AAA\XXX\ZZZ\image_1.jpg
C:\AAA\XXX\KKK\image_2.jpg
C:\AAA\XXX\ZZZ\pdf_1.pdf
This is the batch file and it's working fine for that purpose:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "folder=C:\AAA"
set "excludeFile=C:\BBB\CCC\list.txt"
set /p excludeName="Year not to delete: "
echo:
set /p rootPath="Backup folder path: "
FOR /f %%a in ('WMIC OS GET LocalDateTime ^| find "."') DO Set _DTS=%%a
Set _date=%_DTS:~0,4%%_DTS:~4,2%%_DTS:~6,2%
if "%rootPath:~-1%"=="\" (set rootPath= %rootPath:~0,-1%)
set localPath=%rootPath%\backup_deleted_media_%_date%
echo:
rem // Change to the root directory:
pushd "%folder%" && (
rem // Loop through all files but exclude those listed in the list file:
for /F "delims= eol=|" %%F in ('
dir /B /S /A:-D "*.*" ^| findstr /V /L /I /X /G:"%excludeFile%"') do (
for /D %%I in ("%%F\..") do (
echo.%%~nxI|findstr /C:"%excludeFile%" >nul 2>&1
if not errorlevel 1 (
echo Found
) else (
if not exist %localPath%\%%~pF md %localPath%\%%~pF
move %%F %localPath%\%%~pF
)
)
)
)
rem // Return from currently iterated directory to root directory:
endlocal
cmd /k
What I need now is another batch file for doing more or less the same but:
folder is not C:\AAA, but is C:\EEE\AAA
I have to change the path of the files in list.txt replacing C:\AAA by C:\EEE\AAA and I have to add .jpg to every single row of that list.txt (because, by mistake, all the files in C:\EEE\AAA and its subfolders have that extension, so like image_1.jpg.jpg, pdf_1.pdf.jpg, ...) before doing the same move. And I want these changes to be in a new file (%newExcludeFile%) instead of the original list.txt.
So I've added:
set "newExcludeFile=C:\BBB\CCC\list_new.txt"
set "SEARCHTEXT=\AAA\"
set "REPLACETEXT=\EEE\AAA\"
and I was doing this for deleting and creating the file %newExcludeFile% from the file %excludeFile%
if exist "%newExcludeFile%" del "%newExcludeFile%"
call jrepl "%SEARCHTEXT%" "%REPLACETEXT%" /x /m /l /f "%excludeFile%" /o "%newExcludeFile%"
Now I'm missing the part for appending .jpg at the end of every record in the file %newExcludeFile% and I was thinking if there is a way for doing it without iterating all the rows again after that replace.
I recommend reading first the Stack Overflow page with the question:
How does the Windows Command Interpreter (CMD.EXE) parse scripts?
Here is a simple solution for creating the output file C:\BBB\CCC\list_new.txt from the input file C:\BBB\CCC\list.txt, replacing C:\AAA at the beginning of each line with C:\EEE\AAA and appending .jpg to the end of each line, so that the lines in C:\BBB\CCC\list.txt
C:\AAA\XXX\ZZZ\image_1.jpg
C:\AAA\XXX\KKK\image_2.jpg
C:\AAA\XXX\ZZZ\pdf_1.pdf
become in file C:\BBB\CCC\list_new.txt
C:\EEE\AAA\XXX\ZZZ\image_1.jpg.jpg
C:\EEE\AAA\XXX\KKK\image_2.jpg.jpg
C:\EEE\AAA\XXX\ZZZ\pdf_1.pdf.jpg
This task can be done with:
@echo off
set "excludeFile=C:\BBB\CCC\list.txt"
set "newExcludeFile=C:\BBB\CCC\list_new.txt"
(for /F "usebackq tokens=2* delims=\" %%I in ("%excludeFile%") do echo C:\EEE\AAA\%%J.jpg)>"%newExcludeFile%"
That's really all.
FOR with option /F reads one line after the other from file C:\BBB\CCC\list.txt.
Each non-empty line is split up into substrings using backslash as string delimiters because of option delims=\.
The first substring is the drive letter and colon, which is ignored because of option tokens=2*. It still matters for one check: a line whose first substring starts with the end of line character would be ignored entirely.
The first substring is always C:, so the default eol=; can be kept in this use case. No line is ignored because of the end of line character, as no line starts with a semicolon.
The second substring is AAA on each line and is assigned to the specified loop variable I according to tokens=2.
Of real interest is the remaining part after C:\AAA\ on each line, which is assigned without further splitting to the next loop variable J because of the * after tokens=2.
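The splitting can be checked on a single line without touching any file. This is only a sketch: it feeds one of the sample lines to FOR /F as a string (double quotes without usebackq mean "process this string") and prints both loop variables.

```batch
@echo off
rem Split one sample line on backslashes: token 2 goes to I, the rest to J.
for /F "tokens=2* delims=\" %%I in ("C:\AAA\XXX\ZZZ\image_1.jpg") do (
    echo I=%%I
    echo J=%%J
)
```

This should print I=AAA followed by J=XXX\ZZZ\image_1.jpg, which is why echo C:\EEE\AAA\%%J.jpg produces the wanted output line.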
It would be also possible to use the FOR command line:
(for /F "usebackq tokens=1,2* delims=\" %%I in ("%excludeFile%") do echo %%I\EEE\AAA\%%K.jpg)>"%newExcludeFile%"
This variant copies drive letter and colon (first substring) from source to destination file.
I am a fan of JREPL.BAT, but this batch/JScript hybrid is not really necessary for this task.
However, here is a jrepl.bat command line that does the same as the command line with for /F.
call jrepl.bat "(\\AAA\\.*)$" "\EEE$1.jpg" /F "%excludeFile%" /O "%newExcludeFile%"
It runs a regular expression search for
the string \AAA\ whereby each backslash must be escaped with one more backslash as the backslash is the escape character in a search regular expression and
with .*$ for 0 or more characters up to end of the line
within a marking group defined with ( and )
with replacing each found string with
the string \EEE (with backslash not escaped by JScript exception) and
with back-referencing with $1 the found string to keep it and
with .jpg appended.
Next, I want to point out what is not well coded in the batch file posted in the question, and why.
It is recommended to modify
if "%rootPath:~-1%"=="\" (set rootPath= %rootPath:~0,-1%)
set localPath=%rootPath%\backup_deleted_media_%_date%
to
if "%rootPath:~-1%" == "\" set "rootPath=%rootPath:~0,-1%"
set "localPath=%rootPath%\backup_deleted_media_%_date%"
The arguments for command IF are specified in this case with 100% correct syntax for a string comparison as described extensively by my answer on Symbol equivalent to NEQ, LSS, GTR, etc. in Windows batch files with
first argument being the first string "%rootPath:~-1%";
space as argument separator;
second argument being the comparison operator ==;
space as argument separator;
third argument being the second string "\".
It can be seen on debugging the batch file that Windows command processor corrects automatically if "%rootPath:~-1%"=="\" with the missing spaces around == to if "%rootPath:~-1%" == "\" with spaces before executing the command IF. Therefore it is best to write the string comparison condition 100% correct in the batch file with spaces around ==.
The space right of = is removed in argument string of command SET in improved command line as rootPath should not be redefined with a space at beginning as described in detail by my answer on Why is no string output with 'echo %var%' after using 'set var = text' on command line?
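The effect of the stray space is easy to reproduce. The following is a minimal sketch with a made-up path; the square brackets only make the leading space visible.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem C:\Backup is a hypothetical value used for illustration.
set rootPath= C:\Backup
echo [%rootPath%]
set "rootPath=C:\Backup"
echo [%rootPath%]
endlocal
```

The first echo should print [ C:\Backup] with the unwanted space, the second [C:\Backup].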
The argument string of the two SET commands are additionally enclosed in " to work also for paths containing an ampersand character as otherwise & in path would be interpreted as AND operator for an additional command to execute after command SET. See my answer on single line with multiple commands using Windows batch file for meaning of & which is not within a double quoted argument string.
See also my answer on syntax error in one of two almost-identical batch scripts: ")" cannot be processed syntactically here which describes several very common syntax issues. Issue 1 is not enclosing file/folder argument strings in double quotes as required on file/folder argument string containing a space or one of these characters &()[]{}^=;!'+,`~ as described by help of Windows command processor output on running cmd /? in a command prompt window.
The two command lines
if not exist %localPath%\%%~pF md %localPath%\%%~pF
move %%F %localPath%\%%~pF
are also very problematic if localPath is for example defined with the string C:\Temp\Test & Development.
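A small sketch of the difference, using the example path from above. The folder is assumed not to exist, so the message is printed.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem Example path containing spaces and an ampersand.
set "localPath=C:\Temp\Test & Development"
rem Quoted: the whole path is one argument and the ampersand is literal.
if not exist "%localPath%\" echo Missing: "%localPath%"
rem Unquoted, cmd would split the line at the ampersand and try to run
rem "Development" as a separate command.
endlocal
```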
The command IF is designed to run one command if the condition is true. There is no need for ( and ) when just a single command has to be executed on a true condition, although it is always possible to define a command block with just one command. This is the second common syntax issue in batch file coding.
Many characters in the ASCII table have no special meaning for cmd.exe processing a batch file or for its internal command FOR. Yet beginners in batch file writing tend to pick loop variables from the small set of letters that do have a special meaning, like a or F, which double as modifiers in references such as %%~fI and must therefore be used very carefully as loop variables.
The command popd should be used always after a successful execution of pushd, especially if pushd assigns a network resource access with a UNC path to a drive letter. Otherwise it could happen on repeated execution of a batch file that all drive letters are used finally.
It is very good practice to use the fully qualified file name of an executable wherever possible to make a batch file independent on the environment variables PATH and PATHEXT and avoid unnecessary file system accesses by Windows command processor to find the files which are usually specified only with its file name like find or findstr or wmic.
Here is the batch file code as posted in question with all more or less small issues fixed:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "folder=C:\AAA"
set "excludeFile=C:\BBB\CCC\list.txt"
set /P "excludeName=Year not to delete: "
echo:
set /P "rootPath=Backup folder path: "
for /F %%I in ('%SystemRoot%\System32\wbem\wmic.exe OS GET LocalDateTime ^| %SystemRoot%\System32\find.exe "."') do set "_DTS=%%I"
set "_date=%_DTS:~0,4%%_DTS:~4,2%%_DTS:~6,2%"
if "%rootPath:~-1%" == "\" set "rootPath=%rootPath:~0,-1%"
set "localPath=%rootPath%\backup_deleted_media_%_date%"
echo:
rem // Change to the root directory:
pushd "%folder%" && (
rem // Loop through all files but exclude those listed in the list file:
for /F "eol=| delims=" %%I in ('dir /B /S /A:-D "*.*" ^| %SystemRoot%\System32\findstr.exe /V /L /I /X /G:"%excludeFile%"') do (
for /D %%J in ("%%I\..") do (
echo.%%~nxJ|%SystemRoot%\System32\findstr.exe /C:"%excludeName%" >nul
if errorlevel 1 (
if not exist "%localPath%\%%~pI" md "%localPath%\%%~pI"
move "%%I" "%localPath%\%%~pI"
) else echo Found
)
)
popd
)
rem // Return from currently iterated directory to root directory:
endlocal
%ComSpec% /K
Additionally, /C:"%excludeFile%" is corrected to /C:"%excludeName%" in the innermost FOR loop.
Note: This batch file was not tested by me as I have never executed it!

Batch Script: Extract number from a string

I'm writing a batch script where I need to extract numbers from a string (which indicates the version of a file), so that I can compare it with another number.
Below is my script written so far
:: Over here I'm trying to extract number from the string , this isn't working
for %%F in ("!name!\.." ) do (
::set "number=%%~nxF" |findstr /b /e /r "\"[0-9]*\""
set res=%name:findstr /r "^[1-9][0-9]*$"
echo !res!
)
In the last two for loop I have implemented the extract number logic, but it just prints The system cannot find the drive specified.
If anybody could help me with this issue that would be a great help.

Windows Batch - check if string starts with ... in a loop

This grabs the output of a remote branch list with git:
for /f "delims=\" %r in ('git branch -r') do (
::then in here I want to get rid of the origin/HEAD -> line of the output
::and do a few git ops on the other lines, which are the names of branches
)
anyhow, I'm finding this frustrating as apparently batch doesn't have regex
here's the code I'm using to do this in bash
for remote in `git branch -r | grep -v '\->'`;
do echo $remote; #...git stuff
done;
the grep removes the "origin/HEAD -> origin/master" line of the output of git branch -r
So I'm hoping to ask how to implement the 'contains' verb
for /f "delims=\" %r in ('git branch -r') do (
if not %r contains HEAD (
::...git stuff
)
)
on a loop variable
this stackoverflow question appears to answer a similar question, although in my attempts to implement as such, I became confused by % symbols and no permutation of them yielded function
EDIT FOR FUTURE READERS: there is some regex with findstr /r piped onto git branch -r
for /f "delims=\" %%r in ('git branch -r^|findstr "HEAD"') do (
echo ...git stuff %%r
)
should give you a start.
Note: %%r, not %r within a batch file - %r would work directly from the prompt.
Your delims=\ filter will produce that portion up to the first \ of any line from git branch -r which contains HEAD - sorry, I don't talk bash-ish; you'd need to say precisely what the HEAD string you want to locate is.
Use "delims=" for the entire line. Omitting the delims option sets the delimiters to the default set (space and tab).
Don't use ::-comments within a block (parenthesised statement-sequence) as it's actually a broken label and cmd doesn't appreciate labels within a block. Use REM comments here instead.
The resultant strings output from the findstr (which acts on a brain-dead version of regex) will be processed through to the echo (or whatever statement you may substitute here) - if there are none, the for will appear to be skipped.
Quite what your target string would be for findstr I can't tell. From the prompt, findstr /? may reveal. You may also be able to use find (find /?) - but if you are using cygwin the *nix version of find overrides windows-native.
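If the goal is the one from the question, to drop the "origin/HEAD -> origin/master" line and loop over the remaining branches, one possible sketch inverts the match with /V. It assumes git is on the PATH and that only the HEAD line contains the literal text ->.

```batch
@echo off
rem /V prints only lines that do NOT contain the literal string "->".
rem "tokens=*" strips the leading spaces that git puts before each name.
for /f "tokens=*" %%r in ('git branch -r ^| findstr /v /c:"->"') do (
    rem REM comments are safe inside a parenthesised block.
    echo ...git stuff %%r
)
```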
I don't know what the git branch output looks like, but with a test case of
test 1
HEAD test \-> 2
test 3
test 4
the following prints all the text lines except the one containing \->
@setlocal enableextensions enabledelayedexpansion
@echo off
for /f "tokens=*" %%r in (d:\test2.txt) do (
set str1=%%r
if "!str1:\->=!"=="!str1!" (
echo %%r
)
)
The if test is fundamentally doing this test: string1.replace("\->", "") == string1.
Your loop variable needs to be %r if used directly in the command prompt, but %%r if in a batch file.
String replacement is a feature of environment variables, not loop variables, so the value needs to be put into a holding variable (str1) to work with. It also requires the command extensions to be enabled (enableextensions).
And because environment variables in a block are expanded when the block is parsed, not when it runs, you need to override that with enabledelayedexpansion and use !str1! instead of %str1%; otherwise the value of str1 won't appear to change from one iteration to the next.
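The difference between the two expansion styles shows up in a tiny loop. This is only a sketch; the variable name n and the loop values are arbitrary.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion
set "n=0"
for %%i in (a b c) do (
    set /a n+=1
    rem %n% was replaced by 0 when the whole block was parsed;
    rem !n! is looked up each time the line actually runs.
    echo percent: %n%  delayed: !n!
)
endlocal
```

The percent column should stay at 0 on all three lines while the delayed column counts 1, 2, 3.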
(PS. Use PowerShell instead. Get-Content D:\test2.txt | Select-String "\->" -NotMatch ).

Test if a Reg Query is a specific value in a batch script

So without getting too convoluted, the gist of what I am trying to accomplish is that I currently am listing the results from a reg query by display name of programs, assigning a number to each, and then calling it later by number. When it lists the results it then uses findstr to filter specific programs (such as anything with Microsoft in it) from the list because I don't want them to even be an option for uninstalling. Right now it basically works, except it returns like this:
Let's say the programs in the Registry are:
Microsoft Update (should be filtered)
Notepad
Java
Microsoft Word (should be filtered)
Yahoo Toolbar
When I run this:
: progList64
cls
set regVar=HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall
set opt=64
echo _______________________________________________________
echo.
echo Please wait while I compile a list of known programs...
echo _______________________________________________________
echo.
echo %tab%64bit Programs
echo Index%tab%Name
set count=0
for /f "tokens=2,*" %%a in ('Reg Query %regVar% /S^|find " DisplayName"') do (
set /a count+=1
setlocal EnableDelayedExpansion
for %%n in (!count!) do (
endlocal
set product[%%n]=%%b
echo %%n.%tab%%%b | findstr /V /C:"Microsoft" | findstr /V /C:"Dell" | findstr /V /C:"MDOP" | findstr /V /C:"MED"
)
)
echo _______________________________________________________
echo.
echo ============ PRESS 'M' TO GO TO MAIN MENU =============
echo.
goto uninstallerMenu
I get this:
2. Notepad
3. Java
5. Yahoo Toolbar
So later when I call from the array you can actually put in 1 or 4 and select that product even though it's not displayed. I'm trying to filter it before it prints that it only prints what I want, resulting in this:
1. Notepad
2. Java
3. Yahoo Toolbar
I've tried using various IF statements, and tried putting the entire for %%n in (!count!) part in an IF statement that tests whether Microsoft, Dell, etc. are in the DisplayName and then only displaying and increasing the counter if it fits, but that's not working either. I'm at my wits' end here, any ideas?
And unrelated and not really important, but does anyone know a better way of filtering than daisy-chaining an entire row of findstr statements? Like an exclude list or something?
At a quick guess, try, before the FOR loop, (say after the SET COUNT...)
set excluded=Microsoft Dell MDOP MED
Then cascade
|findstr /v "%excluded%"
after the FIND " DisplayName"
This should filter out any line containing one of the space-separated words in excluded
Your existing cascaded findstr can then be removed as those names are removed before your inner for loop and thus also won't acquire a number.
see
findstr /?
from the prompt for docco...
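Applied to the loop from the question, the suggestion might look roughly like this. It is an untested sketch: the filter is moved into the FOR /F command so that excluded entries never reach the counter, and the plain ". " separator stands in for the %tab% variable, which is defined elsewhere in the original script.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "regVar=HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall"
rem Space-separated words: findstr treats them as alternatives (OR).
set "excluded=Microsoft Dell MDOP MED"
set "count=0"
for /f "tokens=2,*" %%a in ('reg query "%regVar%" /s ^| find " DisplayName" ^| findstr /v "%excluded%"') do (
    set /a count+=1
    setlocal EnableDelayedExpansion
    for %%n in (!count!) do (
        endlocal
        set "product[%%n]=%%b"
        echo %%n. %%b
    )
)
```

Since only the surviving names are numbered, the indexes run 1, 2, 3 without gaps and the hidden entries can no longer be selected. findstr is case-sensitive by default; adding /I would also exclude lower-case spellings.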

Regular Expression with findstr (ms-dos)

I am trying to use ms-dos command findstr to find a string and eliminate it from the file.
At the moment I can find an explicit string but I am really struggling with regular expressions.
The file looks something like the below:
PLs - TULIP Report
Output_Format, PLS - TULIP REPORT
NUMLINES, 110907
VARIABLE_TYPES,T1,T8,I,T9,T2,N,N,N
[[data below]]
The file is an export from some system and annoyingly has that header in it - so I would like to clean it before using SQL Loader to bring it into an Oracle database.
There's more than just the one file and all would have the same type of header but ever so slightly different in every file.
Although I am happy to first remove the first 2 lines using hardcoded values, e.g.:
findstr /v "PLs - TULIP Report" "c:\myfiles\file1.PRO" > "c:\myfiles\file1.csv"
findstr /v "Output_Format, PLS - TULIP REPORT" "c:\myfiles\file1.csv" > "c:\myfiles\file2.csv"
(note how I do that in 2 steps - any suggestions to make this happen in a single step would be massively appreciated)
The third line is more complicated for me; it will always be in this format:
NUMLINES, 110907
except that the number at the end would be different for each file. So how do I get to find this entire line using a regular expression? I have tried:
findstr /v /b /r "\D+ \s+ \d+"
but without any luck.
FYI, the data in [[data below]] looks like
*,"00000161",456823,"017896532","FU",23.95,3.34,20.61
etc ..
Obviously, I do not want to modify the data area.
I hope the above makes sense,
Thanks
You must exclude single lines, findstr cannot match multiple lines. Just separate the different regexes with a space
findstr /r /b /v "NUMLINES PLs Output_Format" *.txt
                  ^regex1  ^2  ^3
Specifying /b allows you to find matches only at the beginning of the lines and /v excludes those lines.
EDIT:
Of course the usage is
findstr /r /b /v "NUMLINES PLs Output_Format" yourfile > yourtarget
And in yourtarget you will find the data of yourfile except the lines excluded by the regex.
EDIT 2:
Based on your comments you need just to add VARIABLE_TYPES to your regex making it
findstr /r /b /v "NUMLINES PLs Output_Format VARIABLE_TYPES" yourfile > yourtarget
This is the way to complete the whole operation in one single instruction.
Here is a one liner using regex that will exclude all four lines. (I used line continuation so that the code looks better.) Each line must match exactly. I allow for each line to end in any number of spaces because I wasn't sure of your format. Note - FINDSTR regex support is very limited and non-standard. There are many other FINDSTR quirks and bugs. See What are the undocumented features and limitations of the Windows FINDSTR command? for more info.
findstr /vrx /c:"PLs - TULIP Report *"^
/c:"Output_Format, PLS - TULIP REPORT *"^
/c:"NUMLINES, *[0-9]* *"^
/c:"VARIABLE_TYPES,T1,T8,I,T9,T2,N,N,N *"^
"c:\myfiles\file1.PRO" >"c:\myfiles\file1.csv"
If all you need to do is skip the first 4 lines, then you normally should be able to use MORE. But there are some circumstances with large files where MORE can hang, but I can't remember the specifics. Also MORE will convert tabs into a series of spaces.
more +4 "c:\myfiles\file1.PRO" >"c:\myfiles\file1.csv"
Another option is to use a FOR /F loop. The FOR /F skips empty lines, but I don't think that is a concern for you.
>"c:\myfiles\file1.csv" (
for /f "usebackq skip=4 delims=" %%A in ("c:\myfiles\file1.PRO") do echo(%%A
)
If any of your data can begin with a ; then the code gets a bit uglier. You would then want to disable the EOL option by setting it to a line feed character.
set LF=^


::above 2 blank lines are critical - do not remove
>"c:\myfiles\file1.csv" (
for /f usebackq^ skip^=4^ eol^=^%LF%%LF%^ delims^= %%A in ("c:\myfiles\file1.PRO") do echo(%%A
)