FindStr with regex - regex

I have a system log file like the following:
<\t>Processed 8 rows.<LF>
<\t>Success: 8<LF>
<\t>Skip: 0<LF>
<\t>Error: 0<LF>
<\t>Exceptions: 0<LF>
// other log details
<\t>Processed 8 rows.<LF>
<\t>Success: 6<LF>
<\t>Skip: 1<LF>
<\t>Error: 1<LF>
<\t>Exceptions: 0<LF>
<\t> is a tab character, <LF> is a line feed character.
My job is to create a DOS batch file that examines these files and takes action if any Skip, Error or Exceptions are found.
My idea is to use findstr with a regular expression to locate any line that reports a failure. I have tested this regex:
// Should be one line here
\t+Skip\:\s+([1-9]|[1-9][0-9])\n|
\s+Error\:\s+([1-9]|[1-9][0-9])\n|
\s+Exceptions\:\s+([1-9]|[1-9][0-9])\n
However, findstr does not accept normal regular expression syntax (\t, \s, \n, ...), so I split it into 6 commands:
findstr /rc:"Skip\:[ ]*[1-9]" %file%
findstr /rc:"Skip\:[ ]*[1-9][0-9]" %file%
findstr /rc:"Error\:[ ]*[1-9]" %file%
findstr /rc:"Error\:[ ]*[1-9][0-9]" %file%
findstr /rc:"Exceptions\:[ ]*[1-9]" %file%
findstr /rc:"Exceptions\:[ ]*[1-9][0-9]" %file%
This job has to be done in DOS batch only (it's sad, but I can't change that). Is there any way to simplify the findstr syntax? Thanks

Only one findstr with all the matching cases
findstr /r /c:"Skip: *[1-9]" /c:"Error: *[1-9]" /c:"Exceptions: *[1-9]" input.txt
Two findstr commands piped together: the first one extracts the relevant lines and the second one searches them for problem conditions
findstr /l "Skip: Error: Exceptions:" input.txt | findstr /r /c:": *[1-9].*"
The first option is faster as it involves only one command. The second option is less redundant and less prone to typing errors.
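To take action on the result, the exit code of findstr can be tested: it is 0 when at least one line matched and 1 when none did. Below is a minimal sketch built on the first option; the file name input.txt and the echoed messages are placeholders, not part of the original question.

```batch
@echo off
rem Sketch only: "input.txt" is a placeholder for the log file to examine.
set "file=input.txt"

rem One findstr call with three search strings; the output is discarded
rem because only the exit code matters here.
findstr /r /c:"Skip: *[1-9]" /c:"Error: *[1-9]" /c:"Exceptions: *[1-9]" "%file%" >nul

rem "if errorlevel 1" is true when findstr found no matching line.
if errorlevel 1 (
    echo No problems found in %file%
) else (
    echo Problems found in %file%
)
```

Because [1-9] only has to match the first digit of the count, the same search strings also catch two-digit values such as 10 or 25.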

Related

How to use Findstr regex to match a dash (-)?

Trying to use findstr to match text that follows the format below:
PTB-14
AIR-217
The problem I'm having is that I just can't seem to get findstr to match on the dash, -. I've created the script below in a batch file:
set dash=-
echo.%dash%
echo !dash! | findstr /i /r /C:- >nul
if errorlevel 1 (
echo "ERROR!" >&2
)
I've tried the regex with /C:-, /C:"-", /C:"\-" and /C:"\\-". I just can't seem to get it matched. Anyone know what I am doing wrong?
Actually, there is no need for a regular expression (/R); you can use the literal search string - instead. Case-insensitivity (/I) also makes little sense with non-letter characters.
Anyway, I think the problem in your code is that you do not have delayed expansion enabled, although you are trying to use it, so echo !dash! actually echoes !dash! literally.
To solve that, there are a few options:
Place setlocal EnableDelayedExpansion before your code and (optionally) place endlocal after it in order to enable delayed expansion in the parent cmd instance that executes your batch file, like this:
setlocal EnableDelayedExpansion
set "dash=-"
echo(%dash%
echo(!dash!| > nul findstr /C:"-" || >&2 echo ERROR^^!
endlocal
A pipe | initiates two new cmd instances, one for each side, which do not have delayed expansion enabled, even if the parent cmd instance does. However, you can explicitly initiate another cmd instance on the left side of the pipe with delayed expansion enabled (/V):
set "dash=-"
echo(%dash%
cmd /V /C echo(^!dash^!| > nul findstr /C:"-" || >&2 echo ERROR!
The exclamation marks are escaped (^!) so that they are not consumed by the parent cmd instance in case delayed expansion is enabled there; if it is not, the escaping does no harm.
In the above code fragments, I additionally changed the following:
set dash=- has become set "dash=-", as this is the most secure syntax that prevents unintended trailing white-spaces and accepts even special characters like ^, &, (, ), <, > and |;
echo. has become echo(, which is the only reliable syntax, although it looks odd;
the SPACE in front of the pipe | has disappeared, because it would also have been echoed;
if ErrorLevel 1 has been replaced by the conditional command concatenation operator ||, which lets the following command execute only in case an error occurred, or, technically speaking, in case the exit code of the previous command was non-zero;
echo "ERROR!" >&2 has been changed to >&2 echo ERROR^^! or >&2 echo ERROR! in order to prevent the quotation marks "" and the SPACE before >&2 from being echoed as well; the double-escaping of the ! is needed to display it in case delayed expansion is enabled;
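For the original goal of recognising values such as PTB-14 or AIR-217, the dash can simply appear as itself in a findstr pattern. Below is a minimal sketch, assuming the format is "letters, a dash, digits"; the variable name id is made up for this example, and findstr character ranges such as [A-Z] are known to behave loosely with respect to letter case, so treat the letter check as approximate.

```batch
@echo off
rem Sketch only: "id" is a hypothetical variable holding the value to test.
set "id=PTB-14"

rem ^ and $ anchor the pattern to the whole line; [A-Z][A-Z]* means
rem "one or more letters" because findstr has no + quantifier.
echo(%id%| findstr /r /c:"^[A-Z][A-Z]*-[0-9][0-9]*$" >nul && echo match || echo no match
```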

FINDSTR and RegEx issue

I have a batch file that asks for input, stores this input in a var and then uses the var in a ping. I need to make sure that the input matches one of several naming conventions
Naming conventions:
PCX1 can be as high as 100
GENPRT1 can be as high as 100
NETPRT1 can be as high as 100
FAXPRT1 can be as high as 100
So if I enter 12 it will not work, but if I enter PCX12 it will.
Everything in the script works except the regex. How can I get this to work?
if "%sta%" == "findstr %sta% ^PCX[0-9]*[0-9]*[0-9]$ \i" (
echo The syntax is correct
goto PING
) else (
set errmsg=The syntax is wrong
goto START
)
This should help:
^(PCX|GENPRT|NETPRT|FAXPRT)([\d]|0[\d]|00[\d]|0[\d][\d]|[\d][\d]|100)$
FINDSTR's regex flavor is extremely limited. It supports neither alternation (|) nor the optional quantifier (?), so even very simple problems are going to have very messy solutions. Here's the most compact expression I can come up with:
FINDSTR /R /I "^PCX[1-9]$ ^PCX[1-9][0-9]$ ^PCX100$ ^GENPRT[1-9]$ ^GENPRT[1-9][0-9]$ ^GENPRT100$ ^NETPRT[1-9]$ ^NETPRT[1-9][0-9]$ ^NETPRT100$ ^FAXPRT[1-9]$ ^FAXPRT[1-9][0-9]$ ^FAXPRT100$"
Each space-separated sequence is treated as a separate regex, so this tries to perform up to twelve matches on each string it tests. That's not to say it's slow, but it's a pain in the butt to use when you're used to real regexes.
For reference, here's how I would have written that in a serious regex flavor:
^(PCX|((GEN|NET|FAX)PRT))([1-9][0-9]?|100)$
If you have the option of using a different tool (like PowerShell, which uses .NET's very powerful and feature-rich regex flavor), I strongly recommend you do so.
@echo off
setlocal disabledelayedexpansion
:start
set /p "sta=What ? "
cmd /v /d /q /c "(echo(!sta!)" ^
| findstr /i /r /b /e "PCX[0-9]* GENPRT[0-9]* NETPRT[0-9]* FAXPRT[0-9]*" ^
| findstr /r /e "[^0-9][1-9] [^0-9][1-9][0-9] [^0-9]100" > nul
if errorlevel 1 (
echo The syntax is wrong
goto :start
)
echo The syntax is correct
A new cmd instance is used to ensure the tested string will not include any parser added space at the end. The output of the echo command is tested to see if it matches any of the starting strings followed by numbers up to the end. Then it is tested again for a valid number range.
If errorlevel is set, the value does not match the condition and a new value is requested.
If errorlevel is not set, the value is correct.

Windows Batch - check if string starts with ... in a loop

this grabs the output from a remote branch list with git::
for /f "delims=\" %r in ('git branch -r') do (
::then in here I want to get rid of the origin/HEAD -> line of the output
::and do a few git ops on the other lines, which are the names of branches
)
anyhow, I'm finding this frustrating as apparently batch doesn't have regex
here's the code I'm using to do this in bash
for remote in `git branch -r | grep -v '\->'`;
do echo $remote; #...git stuff
done;
the grep removes the "origin/HEAD -> origin/master" line of the output of git branch -r
So I'm hoping to ask how to implement the 'contains' verb
for /f "delims=\" %r in ('git branch -r') do (
if not %r contains HEAD (
::...git stuff
)
)
on a loop variable
this stackoverflow question appears to answer a similar question, although in my attempts to implement as such, I became confused by % symbols and no permutation of them yielded function
EDIT FOR FUTURE READERS: there is some regex with findstr /r piped onto git branch -r
for /f "delims=\" %%r in ('git branch -r^|findstr "HEAD"') do (
echo ...git stuff %%r
)
should give you a start.
Note: %%r, not %r within a batch file - %r would work directly from the prompt.
Your delims=\ filter will produce that portion up to the first \ of any line from git branch -r which contains HEAD - sorry, I don't talk bash-ish; you'd need to say precisely what the HEAD string you want to locate is.
Use "delims=" for the entire line - omitting the delims option sets the delimiters to the default set (space and tab).
Don't use ::-comments within a block (parenthesised statement sequence), as :: is actually a broken label and cmd doesn't appreciate labels within a block. Use REM comments there instead.
The strings output by findstr (which implements a brain-dead version of regex) will be processed through to the echo (or whatever statement you may substitute here) - if there are none, the for will appear to be skipped.
Quite what your target string would be for findstr I can't tell. From the prompt, findstr /? may reveal. You may also be able to use find (find /?) - but if you are using cygwin the *nix version of find overrides windows-native.
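If the aim is to drop the origin/HEAD -> line rather than select it, the /v switch of findstr inverts the match, which is roughly what grep -v does in the bash version. Below is a minimal sketch, assuming the arrow text -> appears only on the line to be skipped; the echo stands in for the real git operations.

```batch
@echo off
rem Sketch only: loop over remote branches, skipping the "origin/HEAD -> ..." line.
rem /v inverts the match, /l with /c:"->" searches for the literal arrow text.
rem "tokens=*" strips the leading spaces that git branch -r prints.
for /f "tokens=*" %%r in ('git branch -r ^| findstr /v /l /c:"->"') do (
    rem Replace the echo with the git operations to run per branch.
    echo %%r
)
```

The pipe is escaped as ^| so that it is passed on to the command run by for /f instead of being interpreted when the for line itself is parsed.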
I don't know what the git branch output looks like, but with a test case of
test 1
HEAD test \-> 2
test 3
test 4
the following prints all the text lines except the one containing \->
@setlocal enableextensions enabledelayedexpansion
@echo off
for /f "tokens=*" %%r in (d:\test2.txt) do (
set str1=%%r
if "!str1:\->=!"=="!str1!" (
echo %%r
)
)
The if test is fundamentally doing this: string1.replace("\->", "") == string1, i.e. the line is echoed only if removing \-> leaves it unchanged.
Your loop variable needs to be %r if used directly in the command prompt, but %%r if in a batch file.
String replacement is a feature of environment variables, not loop variables, so the value has to be copied into a holding variable (str1) first. This requires command extensions to be enabled (enableextensions).
And because %var% references inside a parenthesised block are expanded once, when the block is parsed, you need enabledelayedexpansion and !str1! instead of %str1%; otherwise the value of str1 would not appear to change from one iteration to the next.
(PS. Use PowerShell instead. Get-Content D:\test2.txt | Select-String "\->" -NotMatch ).

Writing .txt files from lines 1 to i

I am very close to my answer, but I can't seem to find it.
I am using the findstr command in a batch file to narrow down an entire directory to just one file.
cd ...
findstr /s /m "Desktop" *class.asasm >results1.txt
findstr /m /f:results1.txt "Production" *class.asasm >results2.txt
findstr /n /f:results2.txt "Capabilities" *class.asasm >results3.txt
TASK 1: I need to figure out a way to make findstr search backwards for a fourth string, starting from the line number on which the third string was found.
TASK 2: I need to write the lines of the file in results2.txt from line 1 up to the line we arrive at, then insert a .txt file, then write the rest of the original lines.
I am writing an application in VB.Net with Visual Studio and I am having a difficult time figuring out how to complete this process. I am currently having better luck with having the application run batch files that are written within the application.
The correct solution is to find a tool that does this properly. batch/CMD does not.
Here's a script that tells you the line numbers of the 3rd and 4th match. It's probably not exactly what you want, but it is a demonstration of how one can effectively work with line numbers.
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET "FILE=TestFile.txt"
SET _LINENO=1
SET _MATCHNO=0
SET _THIRDLINENUM=
SET _FOURTHLINENUM=
REM "delims=" keeps each whole line; note that FOR /F skips empty lines.
FOR /F "usebackq delims=" %%l IN ("%FILE%") DO (
    REM FINDSTR sets ERRORLEVEL 0 when the echoed line contains "Target".
    ECHO %%l | FINDSTR "Target" >NUL
    IF NOT ERRORLEVEL 1 (
        SET /A _MATCHNO+=1
        IF !_MATCHNO!==3 SET _THIRDLINENUM=!_LINENO!
        IF !_MATCHNO!==4 SET _FOURTHLINENUM=!_LINENO!
    )
    SET /A _LINENO+=1
)
@ECHO %_THIRDLINENUM% : %_FOURTHLINENUM%
Here's what's in TestFile.txt
abcdefg
bcdefgh
Target 1
cdefghi
defghij
fghijkl
Target 2
ghijklm
hijklmn
ijklmno
jklmnop
klmnopq
lmnopqr
mnopqrs
Target 3
nopqrst
Target 4
opqrstu
pqrstuv
qrstuvw
rstuvwx
stuvwxy
tuvwxyz
If you insist on using batch/CMD (and I sometimes do when nothing else is available), and you need to get the text on line #n (otherwise, head and tail would do just fine), you could produce a similar loop but replace the code from FINDSTR down to the end of the IF statement with something that compares _LINENO with some other variable, ECHO'ing the line if it is between the two values. IF has no AND/OR operators, so you have to chain or nest the IF statements, like
IF !_LINENO! GEQ %START_LINE% IF !_LINENO! LEQ %END_LINE% @ECHO %%l
assuming you need this (from your first comment):
I still have not found a way to search starting at line xx rather than 1 or to search in reverse order
you can try this (from the command line):
for /r %i in ("file pattern") do @more "%~i" +starting_line |findstr "search string"
for /r = recursively (if you mean really reverse, please explain)
"file pattern" = files to find, eg. "*class.asasm"
starting_line = search starting line, eg. 7 (more +6)
"search string" = your search pattern, eg. "Desktop"
OR search "Desktop Production Capabilities"
AND search |findstr "Desktop"|findstr "Production"|findstr "Capabilities"

Regular Expression with findstr (ms-dos)

I am trying to use ms-dos command findstr to find a string and eliminate it from the file.
At the moment I can find an explicit string but I am really struggling with regular expressions.
The file looks something like the below:
PLs - TULIP Report
Output_Format, PLS - TULIP REPORT
NUMLINES, 110907
VARIABLE_TYPES,T1,T8,I,T9,T2,N,N,N
[[data below]]
The file is an export from some system and annoyingly has that header in it - so I would like to clean it before using SQL Loader to bring it into an Oracle database.
There's more than just the one file and all would have the same type of header but ever so slightly different in every file.
Although I am happy to first remove the first 2 lines using hardcoded values, e.g.:
findstr /v "PLs - TULIP Report" "c:\myfiles\file1.PRO" > "c:\myfiles\file1.csv"
findstr /v "Output_Format, PLS - TULIP REPORT" "c:\myfiles\file1.csv" > "c:\myfiles\file2.csv"
(note how I do that in 2 steps - any suggestions to make this happen in a single step would be massively appreciated)
The third line is more complicated for me; it will always be in this format:
NUMLINES, 110907
except that the number at the end would be different for each file. So how do I get to find this entire line using a regular expression? I have tried:
findstr /v /b /r "\D+ \s+ \d+"
but without any luck.
FYI, the data in [[data below]] looks like
*,"00000161",456823,"017896532","FU",23.95,3.34,20.61
etc ..
Obviously, I do not want to modify the data area.
I hope the above makes sense,
Thanks
You must exclude the lines one by one, because findstr cannot match across multiple lines. Just separate the different regexes with a space
findstr /r /b /v "NUMLINES PLs Output_Format" *.txt
                  ^regex1  ^2  ^3
Specifying /b allows you to find matches only at the beginning of the lines and /v excludes those lines.
EDIT:
Of course the usage is
findstr /r /b /v "NUMLINES PLs Output_Format" yourfile > yourtarget
And in yourtarget you will find the data of yourfile except the lines excluded by the regex.
EDIT 2:
Based on your comments you need just to add VARIABLE_TYPES to your regex making it
findstr /r /b /v "NUMLINES PLs Output_Format VARIABLE_TYPES" yourfile > yourtarget
This is the way to complete the whole operation in one single instruction.
Here is a one liner using regex that will exclude all four lines. (I used line continuation so that the code looks better.) Each line must match exactly. I allow for each line to end in any number of spaces because I wasn't sure of your format. Note - FINDSTR regex support is very limited and non-standard. There are many other FINDSTR quirks and bugs. See What are the undocumented features and limitations of the Windows FINDSTR command? for more info.
findstr /vrx /c:"PLs - TULIP Report *"^
/c:"Output_Format, PLS - TULIP REPORT *"^
/c:"NUMLINES, *[0-9]* *"^
/c:"VARIABLE_TYPES,T1,T8,I,T9,T2,N,N,N *"^
"c:\myfiles\file1.PRO" >"c:\myfiles\file1.csv"
If all you need to do is skip the first 4 lines, then you normally should be able to use MORE. But there are some circumstances with large files where MORE can hang, but I can't remember the specifics. Also MORE will convert tabs into a series of spaces.
more +4 "c:\myfiles\file1.PRO" >"c:\myfiles\file1.csv"
Another option is to use a FOR /F loop. The FOR /F skips empty lines, but I don't think that is a concern for you.
>"c:\myfiles\file1.csv" (
for /f "usebackq skip=4 delims=" %%A in ("c:\myfiles\file1.PRO") do echo(%%A
)
If any of your data can begin with a ; then the code gets a bit uglier. You would then want to disable the EOL option by setting it to a line feed character.
set LF=^


::above 2 blank lines are critical - do not remove
>"c:\myfiles\file1.csv" (
for /f usebackq^ skip^=4^ eol^=^%LF%%LF%^ delims^= %%A in ("c:\myfiles\file1.PRO") do echo(%%A
)