Regex to match a variable in Batch scripting - regex

@echo off
SET /p var=Enter:
echo %var% | findstr /r "^[a-z]{2,3}$">nul 2>&1
if errorlevel 1 (echo does not contain) else (echo contains)
pause
I'm trying to validate an input that should contain 2 or 3 letters. Whatever I enter, the script always takes the errorlevel 1 branch (echo does not contain).
Can someone help me, please? Thanks a lot.

findstr does not support full regular expressions. In particular, it has no {count} quantifier. You have to use a workaround:
echo %var%|findstr /r "^[a-z][a-z]$ ^[a-z][a-z][a-z]$"
which searches for ^[a-z][a-z]$ OR ^[a-z][a-z][a-z]$
(Note: there is no space between %var% and | - it would be part of the string)

Since the other answers all stay with findstr, how about running cscript instead? This lets us use a proper JavaScript (JScript) regex engine.
@echo off
SET /p var=Enter:
cscript //nologo match.js "^[a-z]{2,3}$" "%var%"
if errorlevel 1 (echo does not contain) else (echo contains)
pause
Where match.js is defined as:
if (WScript.Arguments.Count() !== 2) {
WScript.Echo("Syntax: match.js regex string");
WScript.Quit(1);
}
var rx = new RegExp(WScript.Arguments(0), "i");
var str = WScript.Arguments(1);
WScript.Quit(str.match(rx) ? 0 : 1);
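The matching step of match.js can also be tried outside Windows Script Host. The following is only a sketch under stated assumptions: matches and exitCode are hypothetical helper names, and the ignoreCase parameter is added purely to show the effect of the "i" flag, which makes the script above accept upper-case input as well.

```javascript
// Hypothetical helper mirroring the core of match.js, without the WScript wiring.
// ignoreCase is an illustrative parameter; it is not part of the original script.
function matches(pattern, str, ignoreCase) {
    var rx = new RegExp(pattern, ignoreCase ? "i" : "");
    return rx.test(str);
}

// Exit code the script would report: 0 for a match, 1 otherwise.
function exitCode(pattern, str, ignoreCase) {
    return matches(pattern, str, ignoreCase) ? 0 : 1;
}
```

With the pattern "^[a-z]{2,3}$", the inputs "ab" and "abc" match while "a" and "abcd" do not. "AB" matches only when ignoreCase is true, so drop the "i" flag if strictly lower-case input is required.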

Stephan's answer is correct about what findstr supports in regular expressions. However, it does not account for a findstr bug concerning character class ranges like [a-z] -- see this answer by dbenham.
To work around it, you need to list every character explicitly (I know it looks terrible):
echo %var%|findstr /R "^[abcdefghijklmnopqrstuvwxyz][abcdefghijklmnopqrstuvwxyz]$ ^[abcdefghijklmnopqrstuvwxyz][abcdefghijklmnopqrstuvwxyz][abcdefghijklmnopqrstuvwxyz]$"
This truly matches only strings consisting of two or three lower-case letters. Using the range [a-z] would match both lower- and upper-case letters, except upper-case Z.
For a complete list of bugs and features of findstr, refer to this post by dbenham.

if errorlevel N is true when the exit code is N OR HIGHER.
To test for exactly 1, use the following:
if errorlevel 1 if not errorlevel 2 echo It's just one.
See this console session:
Microsoft Windows [Version 10.0.10240]
(c) 2015 Microsoft Corporation. All rights reserved.
C:\Windows\system32>if errorlevel 1 if not errorlevel 2 echo It's just one.
C:\Windows\system32>if errorlevel 0 if not errorlevel 1 echo It's just ohh.
It's just ohh.
C:\Windows\system32>
That is: the exit code is at least n and not at least n+1 (2 in this case).

Related

How can I see if a string is four letters long? – Windows Batch

So I'm working on a Windows Batch script and I want to know if an input string (the name of a file) is exactly four letters long. I want to do it with regular expressions or string matching.
I tried the following but it didn't work...
for /R "%windir%\system32" %%f in (*) do (
set filename=%%~nf
if not "!filename!"=="!filename:[a-z][a-z][a-z][a-z]=!" (
echo %%~nf
)
)
So my code loops through all the files in \system32. Files like mode.com should be echoed, but they are not.
This works:
dir /B "%windir%\system32" | findstr "^[a-z][a-z][a-z][a-z]\."
Tested on Windows 10
Aacini's answer is the best when no recursion is required.
Just in case you need something more flexible (but way slower):
@echo off
for /R "%windir%\system32" %%f in (*) do (
echo %%~nf|findstr /rix "[a-z][a-z][a-z][a-z]" >nul && (
echo %%~ff has a 4 letter filename: %%~nf and a size of %%~zf Bytes
)
)
As implied in my comment, and assuming four characters, not four alphabetic characters:
@For /R "%__AppDir__%" %%A In (*)Do @(Set "FN=%%~nA"
SetLocal EnableDelayedExpansion
If Not "%%~nA"=="!FN:~,3!" If "%%~nA"=="!FN:~,4!" Echo %%~nA
EndLocal)
And here's a possible alternative, for four alphabetic characters. Run it 'As administrator' if you're really trying to parse all files inside \Windows\System32\, (not essential but may pick up more files):
@Dir /B/S/A-D "%__AppDir__%" 2>NUL|"%__AppDir__%findstr.exe" "\\[a-Z][a-Z][a-Z][a-Z]\.[^\.]*$ \\[a-Z][a-Z][a-Z][a-Z]$"
You could put that inside a for-loop if, for some inexplicable reason, you want only the basenames:
@For /F "EOL=?Tokens=*" %%A In ('Dir /B/S/A-D "%__AppDir__%" 2^>NUL^|"%__AppDir__%findstr.exe" "\\[a-Z][a-Z][a-Z][a-Z]\.[^\.]*$ \\[a-Z][a-Z][a-Z][a-Z]$"')Do @Echo(%%~nA
Try this:
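dir /b C:\Windows\system32 | findstr /r "^[a-z][a-z][a-z][a-z]\."
The problem in your code is that variable substring substitution (!var:search=replace!) treats the search text literally; it does not interpret regular expressions. You need findstr for that. The pattern is anchored with ^ and \. so that only names with exactly four letters before the extension match.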

Trying to extract a GUID from a text, using batch (findstr + regexp)

I want to isolate a specific string from a text provided in a variable, using batch, but it doesn't seem to work as intended. I may do the regexp wrong, or maybe I misunderstood the way "findstr" works.
The specific string that I need to isolate is a GUID (which has a standard format of alphanumeric characters, arranged in groups separated by a "-", like this: 8-4-4-4-12)
@echo off
setlocal enabledelayedexpansion
SET str="This is a string that has a long uuid: (UUID: 359f975d-2649-4e20-b7c0-b452aaaca4b2)"
SET rx=[a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12}
FOR %%u IN ('FINDSTR /r "!rx!" "!str!"') DO ECHO %%u
endlocal
Basically, what I need is to store the GUID in a separate variable, so I can use it later on. If that can be achieved in a different manner, I'm happy to learn!
Thanks!
@ECHO Off
SETLOCAL
SET "str=This is a string that has a long uuid: (UUID: 359f975d-2649-4e20-b7c0-b452aaaca4b2)"
:: Theoretical
SET "hn=[a-f0-9]"
SET "hn4=%hn%%hn%%hn%%hn%"
SET "hn8=%hn4%%hn4%"
SET "wrx=%hn8%-%hn4%-%hn4%-%hn4%-%hn8%%hn4%"
:again
IF NOT DEFINED str ECHO notfound&GOTO done
ECHO %str%|FINDSTR /b /r /i "%wrx%">NUL
IF ERRORLEVEL 1 (
REM did not find string
SET "str=%str:~1%"
GOTO again
)
SET "str=%str:~0,36%"
ECHO found "%str%"
:done
:: BFI method
SET "str=This is a string that has a long uuid: (UUID: 359f975d-2649-4e20-b7c0-b452aaaca4b2)"
SET "hn=[a-f0-9]"
SET "hn4=%hn%%hn%%hn%%hn%"
SET "hn8=%hn4%%hn4%"
:bfiagain
IF NOT DEFINED str ECHO notfound&GOTO donebfi
:: "regex" using brute-force and ignorance
ECHO %str:~0,9%|FINDSTR /b /i /r "%hn8%-">NUL
IF ERRORLEVEL 1 GOTO bfino
ECHO %str:~9,5%|FINDSTR /b /i /r "%hn4%-">NUL
IF ERRORLEVEL 1 GOTO bfino
ECHO %str:~14,10%|FINDSTR /b /i /r "%hn4%-%hn4%-">NUL
IF ERRORLEVEL 1 GOTO bfino
ECHO %str:~24,12%|FINDSTR /b /i /r "%hn4%%hn8%">NUL
:bfino
IF ERRORLEVEL 1 (
SET "str=%str:~1%"
GOTO bfiagain
)
SET "str=%str:~0,36%"
ECHO found "%str%"
:donebfi
GOTO :EOF
Well, not so squeezy...
Fundamentally, findstr implements a very small subset of regex. It's intended to locate a character-string in a file.
Theoretically, you could string [a-f0-9] together the requisite number of times and add in the - separators for use as the "regex", then see whether the subject string /b (begins) with such a pattern; lop off the start character if not and repeat until found or subject-string is empty.
Notes here: I believe GUID uses HEX digits only, not alphanumerics. findstr supports /i to have the comparison made case-insensitively (which shortens the individual "character-match" string). Yes - I know ^ can be used in a regex (even one from Uncle Bill's little programmers' toolset) but I prefer /b.
The only small problem with this is that it yielded an out of memory error...
So, feed it small chunks at a time, and it appears happy...
I've done no further testing, and predict stormy weather if your text-string contains characters which cmd regards as specials - the usual suspects like redirectors, % and rabbit's ears.
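Since the question is open to other approaches, and another answer on this page already hands regex work to cscript, here is a minimal JavaScript sketch of the same extraction. It assumes a regex engine with counted quantifiers and hex-only GUIDs, as argued above. GUID_RX and extractGuid are hypothetical names chosen for illustration, and the WScript argument handling is left out.

```javascript
// Hex-only GUID in the 8-4-4-4-12 layout; the "i" flag also accepts upper-case digits.
var GUID_RX = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/i;

// Hypothetical helper: returns the first GUID found in text, or null if there is none.
function extractGuid(text) {
    var m = GUID_RX.exec(text);
    return m ? m[0] : null;
}
```

Because the engine scans the whole string itself, there is no need for the character-by-character loop that findstr requires.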

Syntax for specific RegEx in command line FINDSTR call

I am writing a batch script that takes in various arguments before starting another process. In the example below I am checking the case where the first argument was 1, and the second argument is in the form of "any number of digits 0-9, followed by the letter k, m, or g" (I am specifying the amount of memory the process should start with i.e. 10g = 10 Gb memory).
If I just want a number this will suffice:
IF [%1] EQU [1] ECHO %2|findstr /r "[^0-9]" > nul
IF [%1] EQU [1] IF errorlevel 1 echo starting test number %1 with %2 of memory
What I thought would be an obvious segue to adding the letters k, m, or g led me to this (I've tried with and without the '*'):
IF [%1] EQU [1] ECHO %2|findstr /r "[^0-9]*[kmg]" > nul
IF [%1] EQU [1] IF errorlevel 1 echo starting test number %1 with %2 of memory
However I have been unable to match any string to this FINDSTR pattern. Basically I am looking for a FINDSTR that matches [0-9][0-9]*[kmg]. I am fairly certain I am close but am having trouble working out the correct syntax.
Even the first code you posted does not work. [^0-9] looks for any non-digit. I think you wanted ^[0-9], which means any string that starts with a digit. Your logic is also wrong: FINDSTR sets errorlevel to 0 if found, and 1 if not found. I prefer to use the conditional && and || operators to test the result instead of IF.
I recommend the following for what you are attempting. I've thrown in the /I switch to make it case insensitive. I add the /X switch to prevent the string from matching if there are extra characters before or after the number with suffix.
@echo off
if "%~1" equ "1" echo(%~2|findstr /rix "[0-9][0-9]*[kgm]" >nul && (
echo starting test number %~1 with %~2 of memory
)
Unfortunately, FINDSTR does not support the ? meta-character. So the solution is slightly more complicated if the suffix is optional (if you want to support bytes, kilobytes, megabytes, and gigabytes). You would need to search for either of 2 strings, one with the suffix, and one without. FINDSTR breaks the search string into multiple search strings at spaces.
@echo off
if "%~1" equ "1" echo(%~2|findstr /rix "[0-9][0-9]*[kgm] [0-9][0-9]*" >nul && (
echo starting test number %~1 with %~2 of memory
)
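For comparison, a regex engine that does support the ? meta-character can express the optional suffix in a single pattern. The JavaScript sketch below is illustrative only: SIZE_RX and isMemorySize are hypothetical names, and it would have to be wired to the batch file separately (for example via cscript).

```javascript
// Digits followed by an optional k, m or g suffix, anchored at both ends.
// The anchors play the role of findstr's /X switch, and "i" the role of /I.
var SIZE_RX = /^[0-9]+[kmg]?$/i;

// Hypothetical helper: true when arg looks like "512", "10g", "4M", and so on.
function isMemorySize(arg) {
    return SIZE_RX.test(arg);
}
```

Here "10g", "10G" and "512" are accepted, while "g", "10gb" and the empty string are rejected, which is the same set the two-pattern FINDSTR search accepts.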

Use subpatterns in FINDSTR

I must check the validity of a string stored in a variable. I cannot use external CLI utilities (grep, awk, etc.), so I chose FINDSTR.
The string has this format (in regexp):
([1-9][0-9]*:".*"(|".*")*)
I do not know how to check the subpattern (|".*")*.
Currently my code is:
((ECHO.) | (SET /P "=(11:"a"|"b"|"c")") | (FINDSTR /R /C:"^([1-9][0-9]*:".*")$"))
Regards.
Mat M is correct about the limitation of FINDSTR. The FINDSTR regex support is very primitive and non-standard. Type HELP FINDSTR or FINDSTR /? from the command line to get a brief synopsis of what is supported. For an in depth explanation, refer to What are the undocumented features and limitations of the Windows FINDSTR command?
I like Harry Johnston's comment - It would be quite easy to create a solution using VBScript or JavaScript. I think that would be a much better choice.
But, here is a native batch solution. I've incorporated the extra rule about the number of subpatterns that the OP stated in the comment to Mat M's answer.
The solution is surprisingly tricky. Special characters can cause problems when piping the ECHO output to FINDSTR because of the way pipes work. Each side of the pipe is executed in its own CMD session. The special characters must either be quoted, escaped twice, or only exposed via delayed expansion. I chose to use delayed expansion, but the ! characters must be escaped twice to make sure the delayed expansion occurs at the correct time.
The easiest way to parse a variable number of subpatterns is to replace the delimiter with a newline and use FOR /F to iterate each subpattern.
The top half of my code is a brittle coding harness to conveniently iterate and test a set of strings. It will not work properly with any of <space> ; , = <tab> * or ? in the string. Also, the quotes must be balanced in each string.
But the more important validate routine can handle any string in the var variable.
@echo off
setlocal
set LF=^


::Above 2 blank lines are critical for creating a linefeed variable. Do not remove
set test=a
for %%S in (
"(3:"a"|"c"|"c")"
"(11:"a"|"b"|"c"|"d"|"esdf"|"f"|"g"|"h"|"i"|"j"|"k")"
"(4:"a"|"b"|"c")"
"(10:"a"|"b"|"c"|"d"|"esdf"|"f"|"g"|"h"|"i"|"j"|"k")"
"(3:"a"|"b"|"c""
"(3:"a"|"b^|c")"
"(3:"a"|"b"|c)"
"(3:"a"|"b"||"c")"
"(3:"a"|"b"|;|"c")"
) do (
set "var=%%~S"
call :validate
)
exit /b
:validate
setlocal enableDelayedExpansion
cmd /v:on /c echo ^^^!var^^^!|findstr /r /c:"^([1-9][0-9]*:.*)$" >nul || (call :invalid FINDSTR fail& exit /b)
if "!var:||=!" neq "!var!" (call :invalid double pipe fail& exit /b)
for /f "delims=(:" %%N in ("!var!") do set "expectedCount=%%N"
set "str=!var:*:=!"
set "str=!str:~0,-1!"
set foundCount=0
for %%A in ("!LF!") do for /f eol^=^%LF%%LF%^ delims^= %%B in ("!str:|=%%~A!") do (
if %%B neq "%%~B" (call :invalid sub-pattern fail& exit /b)
set /a foundCount+=1
)
if %foundCount% neq %expectedCount% (call :invalid count fail& exit /b)
echo Valid: !var!
exit /b
:invalid
echo Invalid - %*: !var!
exit /b
Here are the results after running the batch script
Valid: (3:"a"|"c"|"c")
Valid: (11:"a"|"b"|"c"|"d"|"esdf"|"f"|"g"|"h"|"i"|"j"|"k")
Invalid - count fail: (4:"a"|"b"|"c")
Invalid - count fail: (10:"a"|"b"|"c"|"d"|"esdf"|"f"|"g"|"h"|"i"|"j"|"k")
Invalid - FINDSTR fail: (3:"a"|"b"|"c"
Invalid - sub-pattern fail: (3:"a"|"b|c")
Invalid - sub-pattern fail: (3:"a"|"b"|c)
Invalid - double pipe fail: (3:"a"|"b"||"c")
Invalid - sub-pattern fail: (3:"a"|"b"|;|"c")
Update
The :validate routine can be simplified a bit by postponing the enablement of delayed expansion until after the CMD /V:ON pipe. This means I no longer have to worry about double escaping the ! on the left side of the pipe.
:validate
cmd /v:on /c echo !var!|findstr /r /c:"^([1-9][0-9]*:.*)$" >nul || (call :invalid FINDSTR fail& exit /b)
setlocal enableDelayedExpansion
... remainder unchanged
As far as I know, findstr is not able to group regexps, so (|".*")* is a no-no. If you know how many blocks you have, you can repeat the block pattern that many times, like this:
FINDSTR /R /C:"^([1-9][0-9]*:\"..*\"|\"..*\"|\"..*\")$"
This way, if you are sure the number of blocks is constant, having empty ones "" if required, then you can check for it.
The double quotes inside the expression are ignored unless you prefix them with \.
The ..* construct is meant to replace .+ : one or more characters.
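As the comment quoted above suggests, the same checks are short in JavaScript. The sketch below is not the batch routine ported line for line. It assumes the rules exercised by the test strings above: the overall (N:...) shape, every |-separated item wrapped in double quotes, and exactly N items. The name validate and the returned labels are hypothetical. A doubled pipe is reported here as a sub-pattern failure rather than under its own label.

```javascript
// Hypothetical validator for strings of the form (N:"a"|"b"|...).
// Returns "valid" or a short label naming the first rule that failed.
function validate(s) {
    // Overall shape: "(" + positive count + ":" + items + ")".
    var m = /^\(([1-9][0-9]*):(.*)\)$/.exec(s);
    if (!m) { return "format fail"; }

    // Every item between pipes must be a double-quoted string.
    var parts = m[2].split("|");
    for (var i = 0; i < parts.length; i++) {
        if (!/^"[^"]*"$/.test(parts[i])) { return "sub-pattern fail"; }
    }

    // The declared count must equal the number of items.
    if (parts.length !== parseInt(m[1], 10)) { return "count fail"; }
    return "valid";
}
```

Splitting on every pipe means a pipe inside quotes, as in "b|c", is rejected, which matches the behaviour of the batch routine.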

Check a string for a substring in a batch file (Windows)?

Let's say I have some text in a variable called $1. Now I want to check if that $1 contains a certain string. If it contains a certain string I want to print a message. The printing is not the problem, the problem is the check. Any ideas how to do that?
The easiest way in my opinion is this :
set YourString=This is a test
If NOT "%YourString%"=="%YourString:test=%" (
echo Yes
) else (
echo No
)
Basically, the text after the ':' is the string you are looking for. The NOT is in front of the comparison because %YourString:test=% removes every occurrence of test from the string, so the two sides differ exactly when the substring is present.
The SET search and replace trick works in many cases, but it does not support case sensitive or regular expression searches.
If you need a case sensitive search or limited regular expression support, you can use FINDSTR.
To avoid complications of escaping special characters, it is best if the search string is in a variable and both search and target are accessed via delayed expansion.
You can pipe $1 into the FINDSTR command with the ECHO command. Use ECHO( in case $1 is undefined, and be careful not to add extra spaces. ECHO !$1! will echo ECHO is off. (or on) if $1 is undefined, whereas ECHO(!$1! will echo a blank line if undefined.
FINDSTR will echo $1 if it finds the search string - you don't want that so you redirect output to nul. FINDSTR sets ERRORLEVEL to 0 if the search string is found, and 1 if it is not found. That is what is used to check if the string was found. The && and || is a convenient syntax to use to test for match (ERRORLEVEL 0) or no match (ERRORLEVEL not 0)
The regular expression support is rudimentary, but still useful.
See FINDSTR /? for more info.
This regular expression example will search $1 for "BEGIN" at start of string, "MID" anywhere in middle, and "END" at end. The search is case sensitive by default.
set "search=^BEGIN.*MID.*END$"
setlocal enableDelayedExpansion
echo(!$1!|findstr /r /c:"!search!" >nul && (
echo FOUND
rem any commands can go here
) || (
echo NOT FOUND
rem any commands can go here
)
As far as I know cmd.exe has no built-in function which answers your question directly. But it does support replace operation. So the trick is: in your $1 replace the substring you need to test the presence of with an empty string, then check if $1 has changed. If it has then it did contain the substring (otherwise the replace operation would have had nothing to replace in the first place!). See the code below:
set longString=the variable containing (or not containing) some text
@rem replace xxxxxx with the string you are looking for
set tempStr=%longString:xxxxxx=%
if "%longString%"=="%tempStr%" goto notFound
echo Substring found!
goto end
:notFound
echo Substring not found
:end