Why does FINDSTR behave differently in powershell and cmd?

The following command pipes the output of echo to findstr and tries to match a regular expression with it. I use it to check if the echoed line only consists of (one or more) digits:
echo 123 | findstr /r /c:"^[0-9][0-9]*$"
The expected output of findstr is 123, which means that the expression could be matched with this string. The output is correct when I execute the command with powershell.exe.
Executing the command in cmd.exe however does not give a match. It only outputs an empty line and sets %ERRORLEVEL% to 1, which means that no match was found.
What causes the different behavior? Is there a way to make this command run correctly on cmd as well?
My OS is Windows 7 Professional, 64 Bit.

In Powershell the command echoes the string 123 to the pipeline and that matches your regular expression.
In cmd, your command echoes 123<space> to the pipeline. Your regular expression makes no allowance for the trailing space, so you don't get a match.
Try:
echo 123| findstr /r /c:"^[0-9][0-9]*$"
and it will work just fine. Or just switch entirely to Powershell and stop having to worry about the vagaries of cmd.exe.
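The effect of the trailing space can be reproduced outside Windows too. This is only a sketch: it assumes a POSIX shell and uses grep as a stand-in for findstr (the two tools are not identical, but the pattern used here means the same thing to both), and the variable names are made up for the illustration.

```shell
#!/bin/sh
# grep stands in for findstr here; that substitution is an assumption
# made purely for illustration.
pattern='^[0-9][0-9]*$'

# Simulates cmd's "echo 123 | ...": the space before the pipe is echoed too.
if printf '123 \n' | grep "$pattern" >/dev/null; then
    with_space=match
else
    with_space=no-match
fi

# Simulates "echo 123| ...": nothing between the digits and the line end.
if printf '123\n' | grep "$pattern" >/dev/null; then
    without_space=match
else
    without_space=no-match
fi

echo "with trailing space:    $with_space"
echo "without trailing space: $without_space"
```

The only difference between the two inputs is the single space before the line end, which is exactly what the $ anchor refuses to skip over.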
Edit:
Yes, cmd and powershell handle parameters very differently.
With cmd, all programs are passed a simple text command line. The processing that cmd performs is pretty minimal: it terminates the command at | or &, removes I/O redirection, and substitutes in any variables. It also, of course, identifies the command and executes it. Any argument processing is done by the command itself, so a command can choose whether spaces separate arguments or what " characters mean. Most commands share a fairly common interpretation of these things, but they can do their own thing with the string they were given. echo does its own thing.
Powershell, on the other hand, has a complex syntax for arguments. All of the argument parsing is done by Powershell. The parsed arguments are then passed to Powershell functions or cmdlets as a sequence of .Net objects: that means you aren't limited to just passing simple strings around. If the command turns out not to be a Powershell command and runs externally, it will attempt to convert the objects into strings and put quotes round any arguments that contain a space. Sometimes the conversion can be a bit confusing, but it does mean that something like this:
echo (1+1)
will echo 2 in Powershell where cmd would just echo the input string.
It is worth always remembering that with Powershell you are working with objects, so for example:
PS C:\> echo Today is (get-date)
Today
is
17 April 2014 20:03:15
PS C:\> echo "Today is $(get-date)"
Today is 04/17/2014 20:03:20
In the first case echo gets 3 objects, two strings and a date. It outputs each object on a separate line (and a blank line when the type changes). In the second case it gets a single object which is a string (and unlike the cmd echo it never sees the quote marks).

Related

error in grep using a regex expression

I think I have uncovered an error in grep. If I run this grep statement against a db log on the command line it runs fine.
grep "Query Executed in [[:digit:]]\{5\}.\?" db.log
I get this result:
Query Executed in 19699.188 ms;"select distinct * from /xyztable.....
when I run it in a script
LONG_QUERY=`grep "Query Executed in [[:digit:]]\{5\}.\?" db.log`
the asterisk in the result is replaced with a list of all files in the current directory.
echo $LONG_QUERY
Result:
Query Executed in 19699.188 ms; "select distinct <list of files in
current directory> from /xyztable.....
Has anyone seen this behavior?
This is not an error in grep. This is an error in your understanding of how scripts are interpreted.
If I write in a script:
echo *
I will get a list of filenames, because an unquoted, unescaped, asterisk is interpreted by the shell (not grep, but /bin/bash or /bin/sh or whatever shell you use) as a request to substitute filenames matching the pattern '*', which is to say all of them.
If I write in a script:
echo "*"
I will get a single '*', because it was in a quoted string.
If I write:
STAR="*"
echo $STAR
I will get filenames again, because I quoted the star while assigning it to a variable, but then when I substituted the variable into the command it became unquoted.
If I write:
STAR="*"
echo "$STAR"
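I will get a single star, because double quotes allow variable interpolation but suppress filename expansion of the result.
You are using backquotes - that is, ` characters - around a command. That is command substitution: the output of the command is captured, and your assignment stores it in a variable.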
I would suggest that if you are going to be echoing the results of the command, and little else, you should just redirect the results into a file. (After all, what are you going to do when your LONG_QUERY contains 10,000 lines of output because your log file got really full?)
Barring that, at the very least do echo "$LONG_QUERY" (in double quotes).
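The difference between the quoted and unquoted echo can be checked in a scratch directory. This is a minimal sketch: the directory, the two file names, and the RESULT string are hypothetical stand-ins for the real log output.

```shell
#!/bin/sh
# Work in an empty scratch directory so the file list is predictable.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.txt b.txt

# Hypothetical captured output containing a bare asterisk.
RESULT='select distinct * from t'

unquoted=$(echo $RESULT)    # the shell expands the bare * before echo runs
quoted=$(echo "$RESULT")    # double quotes keep the * literal

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

In the unquoted case the asterisk is replaced by the names of the two files in the directory, which is the same effect reported in the question.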

What is the difference between the two sed commands below?

Information about the environment I am working in:
$ uname -a
AIX prd231 1 6 00C6B1F74C00
$ oslevel -s
6100-03-10-1119
Code Block A
( grep schdCycCleanup $DCCS_LOG_FILE | sed 's/[~]/ \
/g' | grep 'Move(s) Exist for cycle' | sed 's/[^0-9]*//g' ) > cycleA.txt
Code Block B
( grep schdCycCleanup $DCCS_LOG_FILE | sed 's/[~]/ \n/g' | grep 'Move(s) Exist for cycle' | sed 's/[^0-9]*//g' ) > cycleB.txt
I have two code blocks (shown above) that use sed to trim the input down to 6 digits, but one command is behaving differently than I expected.
Sample of input for the two code blocks
Mar 25 14:06:16 prd231 ajbtux[33423660]: 20160325140616:~schd_cem_svr:1:0:SCHD-MSG-MOVEEXISTCYCLE:200705008:AUDIT:~schdCycCleanup - /apps/dccs/ajbtux/source/SCHD/schd_cycle_cleanup.c - line 341~ SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210~
I get the following output when the sample input above goes through the two code blocks.
cycleA.txt content
389210
cycleB.txt content
25140616231334236602016032514061610200705008341389210
I understand that my last piped sed command (sed 's/[^0-9]*//g') is deleting all characters other than numbers, so I omitted it from the code blocks and placed the output in two additional files. I get the following output.
cycleA1.txt content
SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210
cycleB1.txt content
Mar 25 15:27:58 prd231 ajbtux[33423660]: 20160325152758: nschd_cem_svr:1:0:SCHD-MSG-MOVEEXISTCYCLE:200705008:AUDIT: nschdCycCleanup - /apps/dccs/ajbtux/source/SCHD/schd_cycle_cleanup.c - line 341 n SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210 n
I can see that the first code block is removing everything other than (SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210) and is using the tilde, but the second code block is just replacing the tildes with the character n. I can also see that the first code block needs a line break after this (sed 's/[~]/ ), which is why I thought having \n would simulate a line break, but that is not the case. I think my different output results are because of the way regular expressions are being used. I have tried to look into regular expressions and searched about them on Stack Overflow, but did not find what I was looking for. Could someone explain how I can achieve the same result from code block B as from code block A without having part of my code on a second line?
Thank you in advance
This is an example of the XY problem (http://xyproblem.info/). You're asking for help to implement something that is the wrong solution to your problem. Why are you changing ~s to newlines, etc when all you need given your posted sample input and expected output is:
$ sed -n 's/.*schdCycCleanup.* \([0-9]*\).*/\1/p' file
389210
or:
$ awk -F'[ ~]' '/schdCycCleanup/{print $(NF-1)}' file
389210
If that's not all you need then please edit your question to clarify your requirements for WHAT you are trying to do (as opposed to HOW you are trying to do it) as your current approach is just wrong.
Etan Reisner's helpful answer explains the problem and offers a single-line solution based on an ANSI C-quoted string ($'...'), which is appropriate, given that you originally tagged your question bash.
(Ed Morton's helpful answer shows you how to bypass your problem altogether with a different approach that is both simpler and more efficient.)
However, it sounds like your shell is actually something different - presumably ksh88, an older version of the Korn shell that is the default sh on AIX 6.1 - in which such strings are not supported[1]
(ANSI C-quoted strings were introduced in ksh93, and are also supported not only in bash, but in zsh as well).
Thus, you have the following options:
With your current shell, you must stick with a two-line solution that contains an (\-escaped) actual newline, as in your code block A.
Note that $(printf '\n') to create a newline does not work, because command substitutions invariably trim all trailing newlines, resulting in the empty string in this case.
Use a more modern shell that supports ANSI C-quoted strings, and use Etan's answer. http://www.ibm.com/support/knowledgecenter/ssw_aix_61/com.ibm.aix.cmds3/ksh.htm tells me that ksh93 is available as an alternative shell on AIX 6.1, as /usr/bin/ksh93.
If feasible: install GNU sed, which natively understands escape sequences such as \n in replacement strings.
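The trimming behavior mentioned in the first option, and one common way around it, can be sketched as follows. This assumes only POSIX-style parameter expansion; the sentinel character x and the variable names are arbitrary choices for the illustration, and I have not verified this on AIX itself.

```shell
#!/bin/sh
# Command substitution strips all trailing newlines, so this ends up empty.
trimmed=$(printf '\n')

# Workaround: emit a sentinel character after the newline, then strip it.
nl=$(printf '\nx')
nl=${nl%x}

# sed wants backslash + newline in the replacement, so build exactly that:
# inside double quotes, \\ becomes one backslash and $nl the real newline.
result=$(printf 'foo~bar~baz\n' | sed "s/~/\\$nl/g")

echo "length of trimmed: ${#trimmed}"
printf '%s\n' "$result"
```

This keeps the sed command on one line of the script while still handing sed the escaped literal newline that code block A relies on.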
[1] As for what actually happens when you try echo 'foo~bar~baz' | sed $'s/[~]/\\\n/g' in a POSIX-like shell that does not support $'...': the $ is left as-is, because what follows is not a valid variable name, and sed ends up seeing literal $s/[~]/\\\n/g, where the $ is interpreted as a context address applying to the last input line - which doesn't make a difference here, because there is only 1 line. \\ is interpreted as plain \, and \n as plain n, effectively replacing ~ instances with literal \n sequences.
GNU sed handles \n in the replacement the way you expect.
OS X (and presumably BSD) sed does not. It treats it as a normal escaped character and just unescapes it to n. (Though I don't see this in the manual anywhere at the moment.)
You can use $'' quoting to use \n as a literal newline if you want though.
echo 'foo~bar~baz' | sed $'s/[~]/\\\n/g'
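If the goal is only to turn each ~ into a line break, a possible single-line alternative that does not depend on $'...' support is tr, which interprets \n by itself. This is a sketch, not a drop-in replacement: unlike the sed command in the question, it does not insert a space before the newline.

```shell
#!/bin/sh
# tr handles the \n escape itself, so the shell's quoting rules do not matter.
converted=$(printf 'foo~bar~baz\n' | tr '~' '\n')
printf '%s\n' "$converted"
```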

Sed isn't replacing all occurrences of string in file

EDIT: I am using Cygwin. I am unsure whether this is of relevance and it was a detail I missed during writing this question.
EDIT2: Have tried replacing the "TAB" char people pointed out with the RegEx \s which covers spacing chars (spaces and tabs primarily) and this did not affect the expression at all, meaning that it is not the tabs causing the issue, especially since the expression runs once without errors anyway.
So far this script has been causing me a ton of trouble.
I DID have an issue before but I resolved that while I was writing a question here (lucky imo) but this one I've been stuck on for at least an hour now and I've tried varying solutions, none of which actually work or told me something I didn't already try.
I have a rather cool seeming FTP log fetcher script and part of this script replaces the 600MB of errors in this logfile to nothing, essentially removing them. Unfortunately this script also gets rid of parts of other errors too, so I've had to edit it. This is where I'm getting stuck.
Through base research I managed to find out that sed could do what I want, and through three hours of playing so far it does most of what I tell it to, minus one thing. One, and ONLY one, of the sed statements I have built only replaces the first instance of the string I've given it despite having the g modifier attached to the end.
I am working with a test script right now as to avoid potential permanent damage to my original FTP script, and the test script copies over an example file with a few of the errors I need replacing.
Walkthrough of the script's INTENDED behaviour before showing:
1. Sets a prefix which happens on ALL lines in the file, pretty important part of the script.
2. Copies the example file to a file named test2.log
3. Replace all instances of the UNIX newline char \n with [loll] (first thing that came to my mind)
4. Remove all instances of battle error type 1 and 2.
5. Return all [loll] strings with the UNIX \n for newlines, therefore returning the logfile to its original state minus the errors.
Script:
#DTP="\[([0-9]+-[0-9]+-[0-9]+-[0-9]+|latest)\.log\] \[[0-9]+:[0-9]+:[0-9]+\] \[Server thread/(INFO|WARN)\]: "
echo "${DTP}"
DTP1="\[[0-9]*:[0-9]*:[0-9]*\]\s\[Server\sThread\/\(WARN\|INFO\)\]:\s"
DTP="\[loll\]\[[0-9]*:[0-9]*:[0-9]*\]\s\[Server\sThread\/\(WARN\|INFO\)\]:\s"
echo "${DTP}"
echo "1"
cp test.log test2.log
#cat test.log >test2.log
sed -i ':a;N;$!ba;s/\n/\[loll\]/g' test2.log #| egrep -i "" >test2.log
sed -i 's/'${DTP1}'Caught error in battle. Continuing...'${DTP}'java.lang.NullPointerException'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.participants.PixelmonWrapper.useAttack(PixelmonWrapper.java:173)'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.participants.PixelmonWrapper.takeTurn(PixelmonWrapper.java:330)'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.BattleControllerBase.takeTurn(BattleControllerBase.java:276)'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.BattleControllerBase.update(BattleControllerBase.java:157)'${DTP}' at com.pixelmonmod.pixelmon.battles.BattleRegistry.updateBattles(BattleRegistry.java:63)'${DTP}' at com.pixelmonmod.pixelmon.battles.BattleTickHandler.tickStart(BattleTickHandler.java:12)'${DTP}' at cpw.mods.fml.common.eventhandler.ASMEventHandler_20_BattleTickHandler_tickStart_WorldTickEvent.invoke(.dynamic)'${DTP}' at cpw.mods.fml.common.eventhandler.ASMEventHandler.invoke(ASMEventHandler.java:51)'${DTP}' at cpw.mods.fml.common.eventhandler.EventBus.post(EventBus.java:122)'${DTP}' at cpw.mods.fml.common.FMLCommonHandler.onPostWorldTick(FMLCommonHandler.java:255)'${DTP}' at net.minecraft.server.MinecraftServer.func_71190_q(MinecraftServer.java:929)'${DTP}' at net.minecraft.server.dedicated.DedicatedServer.func_71190_q(DedicatedServer.java:429)'${DTP}' at net.minecraft.server.MinecraftServer.func_71217_p(MinecraftServer.java:776)'${DTP}' at net.minecraft.server.MinecraftServer.run(MinecraftServer.java:639)'${DTP}' at java.lang.Thread.run(Thread.java:745)//gI' test2.log
echo "2"
sed -i 's/'${DTP1}'Caught error in battle. Continuing...'${DTP}'java.lang.NullPointerException\[loll\]//gI' test2.log
echo "3"
sed -i 's/\[loll\]/\n/g' test2.log
I've set them to also run case insensitive checks on the provided strings as sometimes I write with all lower case, however for most of this I copied and pasted it directly.
Sample input:
http://pastebin.com/3KPB33X2
Outputs:
Expected:
meow
Test message
WOOF MEOWLOL
Actual: http://pastebin.com/pnvDwkxz
It's been killing my mind for a while now because I had this issue even before the other one, except I barely noticed it. I can't find any predictable behaviour in the script, and as far as I am aware it SHOULD be working perfectly fine and giving me the output I expect.
Any help would be appreciated, because as soon as I can get this bug sorted out I'll be able to enter in the rest of the script and replace this with the existing battle-error replacement script in my log-fetcher.
Knowing me it's something small and stupid but I've tried literally everything I came across, including adding the :a;N;$!ba; to the start of the bit which isn't working properly (and realising that failed horribly).
Thanks.
~BAI1
Are you looking for something like this:
sed -n ':a;s/\[.*Server thread\/\(INFO\|WARN\).*//i;/^$/!p;n;b a' battle.log
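For reference, the join-and-restore round trip described in steps 3 and 5 of the question can be sketched in isolation. This assumes GNU sed (for \n in the replacement and for labels followed by semicolons); the three sample lines are made up, and [loll] is the marker chosen in the question.

```shell
#!/bin/sh
# Hypothetical three-line input standing in for the log file.
original=$(printf 'line one\nline two\nline three')

# Join: keep appending the next line (N) until the last one, then
# replace every embedded newline with the marker.
joined=$(printf '%s\n' "$original" | sed ':a;N;$!ba;s/\n/[loll]/g')

# Restore: turn each marker back into a newline (GNU sed extension).
restored=$(printf '%s\n' "$joined" | sed 's/\[loll\]/\n/g')

printf 'joined:   %s\n' "$joined"
printf 'restored:\n%s\n' "$restored"
```

Running the two steps on a small input like this makes it easier to see whether a problem lies in the joining or in the substitutions applied in between.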

Unpredictable behavior in sed interpreters output from multiple expressions

Why does GNU sed sometimes handle substitution with piped output into another sed instance differently than when multiple expressions are used with the same one?
Specifically, for msys/mingw sessions, in the /etc/profile script I have a series of manipulations that "rearrange" the order of the environment variable PATH and removes duplicate entries.
Take note that while sed normally treats each line of input separately (and therefore can't easily substitute '\n' in the input stream), this sed statement substitutes ':' with '\n', so it still handles the entire input stream as one line (with '\n' characters in it). This behavior stays true for all sed expressions in the same instance of sed (basically until you redirect or pipe the output into another program).
Here's the obligatory specs:
Windows 7 Professional Service Pack 1
HP Pavilion dv7-6b78us
16 GB DDR3 RAM
MinGW-w64 (x86_64-w64-mingw32-gcc-4.7.1.2-release-win64-rubenvb) mounted on /mingw/
MSYS (20111123) mounted on / and on /usr/
$ uname -a="MINGW32_NT-6.1 CHRIV-L09 1.0.17(0.48/3/2) 2011-04-24 23:39 i686 Msys"
$ which sed="/bin/sed.exe" (it's part of MSYS)
$ sed --version="GNU sed version 4.2.1"
This is the contents of PATH before manipulation:
PATH='.:/usr/local/bin:/mingw/bin:/bin:/c/PHP:/c/Program Files (x86)/HP SimplePass 2011/x64:/c/Program Files (x86)/HP SimplePass 2011:/c/Windows/system32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0:/c/si:/c/android-sdk:/c/android-sdk/tools:/c/android-sdk/platform-tools:/c/Program Files (x86)/WinMerge:/c/ntp/bin:/c/GnuWin32/bin:/c/Program Files/MySQL/MySQL Server5.5/bin:/c/Program Files (x86)/WinSCP:/c/Program Files (x86)/Overlook Fing 2.1/bin:/c/Program Files/7-zip:.:/c/Program Files/TortoiseGit/bin:/c/Program Files (x86)/Git/bin:/c/VS10/VC/bin/x86_amd64:/c/VS10/VC/bin/amd64:/c/VS10/VC/bin'
This is an excerpt of /etc/profile (where I have begun the PATH manipulation):
set | grep --color=never ^PATH= | sed -e "s#^PATH=##" -e "s#'##g" \
-e "s/:/\n/g" -e "s#\n\(/[^\n]*tortoisegit[^\n]*\)#\nZ95-\1#ig" \
-e "s#\n\(/[a-z]/win\)#\nZ90-\1#ig" -e "s#\n\(/[a-z]/p\)#\nZ70-\1#ig" \
-e "s#\.\n#A10-.\n#g" -e "s#\n\(/usr/local/bin\)#\nA15-\1#ig" \
-e "s#\n\(/bin\)#\nA20-\1#ig" -e "s#\n\(/mingw/bin\)#\nA25-\1#ig" \
-e "s#\n\(/[a-z]/vs10/vc/bin\)#\nA40-\1#ig"
The last sed expression in that line basically looks for lines that begin with "/c/VS10/VC/bin" and prepends 'A40-' to them, like this:
...
/c/si
A40-/c/VS10/VC/bin
A40-/c/VS10/VC/bin/amd64
A40-/c/VS10/VC/bin/x86_amd64
/c/GnuWin32/bin
...
I like my sed expressions to be flexible (path structures change), but I don't want it to match the lines that end with amd64 or x86_amd64 (those are going to have a different string prepended). So I change the last expression to:
-e "s#\n\(/[a-z]/vs10/vc/bin\)\n#\nA40-\1\n#ig"
This works:
...
/c/si
A40-/c/VS10/VC/bin
/c/VS10/VC/bin/amd64
/c/VS10/VC/bin/x86_amd64
/c/GnuWin32/bin
...
Then, (to match any "line" matching the pseudocode "/x/.../bin") I change the last expression to:
-e "s#\n\(/[a-z]/.*/bin\)\n#\nA40-\1\n#ig"
Which produces:
...
/c/si
/c/VS10/VC/bin
/c/VS10/VC/bin/amd64
/c/VS10/VC/bin/x86_amd64
/c/GnuWin32/bin
...
??? - sed didn't match any character ('.') any number of times ('*') in the middle of the line ???
But, if I pipe the output into a different instance of sed (and compensate for sed handling each "line" separately) like this:
| sed -e "s#^\(/[a-z]/.*/bin\)$#A40-\1#ig"
I get:
sed: -e expression #1, char 30: unterminated `s' command
??? How is that unterminated? It's got all three '#' characters after the s, has the modifiers 'i' and 'g' after the third '#', and the entire expression is in double quotes ('"'). Also, there are no escapes ('\') immediately preceding the delimiters, and the delimiter is not a part of either the search or the replacement. Let's try a different delimiter than '#', like '~':
I use:
| sed -e "s~^\(/[a-z]/.*/bin\)$~A40-\1~ig"
and, I get:
...
/c/si
A40-/c/VS10/VC/bin
/c/VS10/VC/bin/amd64
/c/VS10/VC/bin/x86_amd64
A40-/c/GnuWin32/bin
...
And, that is correct! The only thing I changed was the delimiter from '#' to '~' and it worked ???
This is not (even close to) the first time that sed has produced unexplainable results for me.
Why, oh, why, is sed NOT matching syntax in an expression in the same instance, but IS matching when piped into another instance of sed?
And, why, oh, why, do I have to use a different delimiter when I do this (in order not to get an "unterminated 's' command")?
And the real reason I'm asking: Is this a bug in sed, OR, is it correct behavior that I don't understand (and if so, can someone explain why this behavior is correct)? I want to know if I'm doing it wrong, or if I need a different/better tool (or both, they don't have to be mutually exclusive).
I'll mark a response as the answer if someone can either prove why this behavior is correct or prove why it is a bug. I'll gladly accept any advice about other tools or different methods of using sed, but those won't answer the question.
I'm going to have to get better at other text processors (like awk, tr, etc.) because sed is costing me too much time with its unexplainable results.
P.S. This is not the complete logic of my PATH manipulation. The complete logic also finishes prepending all the lines with values from 'A00-' to 'Z99-', then pipes that output into 'sort -u -f' and back into sed to remove those same prefixes on each line and to convert the lines ('\n') back into colons (':'). Then "export PATH='" is prepended to the single line and "'" is appended to it. Then that output is redirected into a temporary file. Next, that temporary file is sourced. And, finally, that temporary file is removed.
The /etc/profile script also displays the contents of PATH before and after sorting (in case it screwed up the path).
P.P.S. I'm sure there is a much better way to do this. It started as some very simple sed manipulations, and grew into the monster you see here. Even if there is a better way, I still need to know why sed is giving me these results.
sed -e "s#^\(/[a-z]/.*/bin\)$#A40-\1#ig"
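is unterminated because, inside double quotes, the shell expands $# (the number of positional parameters) before sed ever sees the expression, which removes the second # delimiter. Put your expressions in single quotes to avoid this.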
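What sed actually receives in each case can be printed directly. In this sketch, show_script is a hypothetical helper that is called with no arguments, so $# expands to 0 inside it.

```shell
#!/bin/sh
# show_script is a made-up helper; calling it without arguments makes
# $# expand to 0 inside the function body.
show_script() {
    double_quoted="s#^\(/[a-z]/.*/bin\)$#A40-\1#ig"
    single_quoted='s#^\(/[a-z]/.*/bin\)$#A40-\1#ig'
}
show_script

# printf avoids any backslash processing that echo might apply.
printf 'double quotes: %s\n' "$double_quoted"
printf 'single quotes: %s\n' "$single_quoted"
printf 'length of double-quoted form: %s\n' "${#double_quoted}"
```

With $# expanded to 0, the double-quoted form is 30 characters long and contains only two # delimiters, which is consistent with the "char 30: unterminated `s' command" message in the question.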
The expression
-e "s#\n\(/[a-z]/.*/bin\)\n#\nA40-\1\n#ig"
fails, or doesn't do what you expect, because . matches the newline in a multi-line expression. Check your whole output, the A40- is at the very beginning. Change it to
-e "s#\n\(/[a-z]/[^\n]*/bin\)\n#\nA40-\1\n#ig"
and it might be more what you expect. This may very well be the case with most of your issues with multi-line modifications.
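The difference between .* and [^\n]* on a pattern space with embedded newlines can be seen on a shortened input. This sketch assumes GNU sed, and the four-entry PATH-like string is hypothetical.

```shell
#!/bin/sh
# Assumes GNU sed, which understands \n in replacements and in brackets.
input='a:/c/x/lib:/c/y/bin:b'   # hypothetical shortened PATH

# ".*" may run across the embedded newlines, so the match starts at
# /c/x/lib and the prefix lands on the wrong entry.
greedy=$(printf '%s\n' "$input" |
    sed -e 's/:/\n/g' -e 's#\n\(/[a-z]/.*/bin\)\n#\nA40-\1\n#')

# "[^\n]*" cannot leave the current entry, so only /c/y/bin matches.
bounded=$(printf '%s\n' "$input" |
    sed -e 's/:/\n/g' -e 's#\n\(/[a-z]/[^\n]*/bin\)\n#\nA40-\1\n#')

printf 'with .*:\n%s\n\nwith [^\\n]*:\n%s\n' "$greedy" "$bounded"
```

In the first result the A40- prefix is attached to /c/x/lib rather than /c/y/bin, which is the kind of misplaced prefix that is easy to overlook in a long PATH.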
You can also put the statements, one per line, into a standalone file and invoke sed with sed -f editscript. It might make maintenance of this a bit easier.

Piped Variable Into FINDSTR w/ Regular Expressions and Escaped Double Quotes

I am trying to understand a batch file that was sent to me in order to work around a bug in a third party program while they resolve the issue. Basically they are running a findstr regular expression command in order to determine whether or not the string matches. If it does, then the special characters that should not be stripped out are being added back in manually before it is passed off to the original commandline program.
As best I can tell though, what has been provided does not work or I do not understand it. I am pasting the relevant section of code below.
@echo off
setlocal
set username=%1
shift
echo %username% | findstr /r "^\"[0-9][0-9]*\"" >nul
if not errorlevel 1 (set username=";%username:~0,9%=%username:~10,4%?")
echo %username%
The three pieces I really have questions about are as follows:
I believe the unescaped interpretation of the regular expression above is ^"[0-9][0-9]*" which I think means that the string must begin with a numeric character and then must consist of zero or more additional numeric-only characters in order for a match to be found. Well, FINDSTR seems to be doing something weird with the escaped quotes and I cannot get it to match anything I have tried. If I remove the \" around [0-9][0-9]* then I can get it to work, but it does not properly reject non-numeric characters such as an input string of 123456789O1234 (there is a letter O instead of a zero in that sample string).
What is the point of the >nul
Wouldn't it be better to check for an errorlevel equal to 0 instead of "not errorlevel 1" since it could possibly return an error level of 2?
Anyway, the following code works, but it is not as precise as I would like. I am just looking to understand why the quotes in the regex string are not working. Perhaps this is a limitation of FINDSTR, but I have not come across anything definitive yet.
@echo off
setlocal
set username=%1
shift
echo %username% | findstr /r "^[0-9][0-9]*" >nul
if not errorlevel 1 (set username=";%username:~0,9%=%username:~10,4%?")
echo %username%
I can work around the problem by repeating the class 14 times since that is the number of characters in my situation (more than 15 classes will cause it to crash - scroll to the bottom). I am still curious as to how this could be achieved more simply, and of course about the remaining 2 questions.
EDIT / WORKING SOLUTION
@echo off
setlocal enableDelayedExpansion
set username=%~1
shift
echo !username!|findstr /r /c:"^[0-9][0-9]*$" >nul
if not errorlevel 1 (set username=";!username:~0,9!=!username:~10,4!?")
echo !username!
NOTES:
When I first ran it after modifying my existing code to more closely resemble dbenham's, enableDelayedExpansion gave an error as did the quotes around setting the username (see below). I can't replicate what I did wrong, but it is all working now (this is in case someone else comes across the same issue).
I had tried the $ for the EOL marker (which is the key to forcing it match numeric content only), but I think that the other problems were getting in the way which made me think it was not the solution. Also, to ensure the $ works don't miss this part of dbenham's answer "...you must also make sure there are no spaces between your echoed value and the pipe symbol."
In short it pretty much seems that trying to put double quotes inside a regex for findstr is wrong syntax/does not work/etc... unless you are actually looking to match " in the string/files you are parsing through. See dbenham's answer for clarity here. As he noted, you can use %~1 to strip the quotes from the argument instead of adding it to your regex (and programmatically add them back in if needed).
Error Message
C:\>sample.bat 123456789
'enableDelayedExpansion' is not recognized as an internal or external command,
operable program or batch file.
'"' is not recognized as an internal or external command,
operable program or batch file.
!username!
Reference Links:
Undocumented features and limitations of the Windows FINDSTR command
Case sensitive anomalies with findstr (not handling case properly in some circumstances)
http://ss64.com/nt/findstr.html
http://www.robvanderwoude.com/findstr.php
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/findstr.mspx
Answering your questions in reverse order:
3) if not errorlevel 1 is probably the same as if %errorlevel%==0 because IF ERRORLEVEL 1 means if ERRORLEVEL is greater than or equal to 1. So putting a NOT in front means if ERRORLEVEL is less than 1. I believe FINDSTR never returns a negative ERRORLEVEL, so the syntax should be OK.
2) The >nul redirects the stdout output of FINDSTR to the nul device, meaning it disables the output. Normally any matching line would be printed. You are only interested in the return code - you don't want to see the output.
1) The original regex will match any input string that starts with a quote, followed by at least one digit, followed by another quote. It ignores any characters that may appear after the 2nd quote.
So the following strings (quotes included) will match:
"0"
"01234"
"0"a
"01234"a
The following strings will not match:
0
01234
""
"0a"
The original code has problems if the number of digits in the matching string reaches a certain length because the ending quote gets stripped causing the closing ) to be quoted and so the rest of the script fails.
I don't understand your requirements so I don't know how to fix the code.
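The two lists of matching and non-matching strings above can be checked mechanically. This is only a sketch: grep stands in for findstr (an assumption for illustration), and the pattern is the unescaped form of the original regex.

```shell
#!/bin/sh
# grep is a stand-in for findstr; the pattern is the original regex with
# the batch-level escaping of the quotes removed.
pattern='^"[0-9][0-9]*"'

matched=''
rejected=''
for candidate in '"0"' '"01234"' '"0"a' '"01234"a' '0' '01234' '""' '"0a"'; do
    if printf '%s\n' "$candidate" | grep "$pattern" >/dev/null; then
        matched="$matched $candidate"
    else
        rejected="$rejected $candidate"
    fi
done

echo "matched: $matched"
echo "rejected:$rejected"
```

Note that the pattern has no $ at the end, which is why trailing characters after the closing quote do not prevent a match.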
It sounds like you don't want to match strings that have non digits. That means you need to include the end of line marker $ at the end of the regex. But you must also make sure there are no spaces between your echoed value and the pipe symbol.
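The effect of the end-of-line marker can be seen with the sample value from the question. Again, grep is used here as a stand-in for findstr, which is an assumption made for the sake of a runnable sketch.

```shell
#!/bin/sh
# The sample value from the question: a capital letter O hides among digits.
value='123456789O1234'

# Without $, matching the leading run of digits is enough.
if printf '%s\n' "$value" | grep '^[0-9][0-9]*' >/dev/null; then
    unanchored=match
else
    unanchored=no-match
fi

# With $, every character up to the end of the line must be a digit.
if printf '%s\n' "$value" | grep '^[0-9][0-9]*$' >/dev/null; then
    anchored=match
else
    anchored=no-match
fi

echo "without \$: $unanchored"
echo "with \$:    $anchored"
```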
I believe you probably don't want quotes in your value (or else you should programmatically add them at the very end). You can use %~1 to strip any enclosing quotes from the supplied argument.
If you are looking to check if argument 1 consists of nothing but numeric digits, then you can use:
setlocal enableDelayedExpansion
set "username=%~1"
echo !username!|findstr /r "^[0-9][0-9]*$" >nul
I used delayed expansion because you have no control over what characters are in %1, and if it contains special characters like & or | it will cause problems if you use normal expansion. The syntax I have given is not bullet proof, but it handles most "normal" situations.
It is not necessary in your case, but I prefer to use the /c option, just in case your search string contains spaces. So the above could be written as
echo !username!|findstr /r /c:"^[0-9][0-9]*$" >nul
It seems odd to me that both the original and your modified code simply pass through the username if it does not match your regex. Maybe that is your intent, maybe not.