Batch File Command - Remove Lines With Phrases From File - regex

I am trying to cut out unnecessary lines from a list of installed programs on devices.
Currently using:
type "original.txt" | findstr /v "Click-to-Run" | findstr /v "Visual C++" | findstr /v "Windows*SDK*" > "example_new.txt"
I need it to remove lines such as "Windows Desktop SDK Tools" but KEEP lines such as ".Net Framework 4.0.0 SDK".
How can I get this to only remove the lines that contain the entire phrases specified?
Is it possible to do that, while also using wildcard in the phrases?
Thanks so much!

You can make your life easier (especially if your list is long) by using the /g switch (see findstr /? for details).
type "original.txt" | findstr /vrg:"exclude.txt" > "example_new.txt"
with exclude.txt containing your "to-ignore" list (REGEX allowed):
Click-to-Run
Visual C++
Windows.*SDK
(the /g includes /c, so spaces are no problem)
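For comparison, here is a minimal Python sketch of the same "drop every line that matches any exclusion pattern" idea. The function name `filter_lines` and the sample program names (other than the two quoted in the question) are made up for illustration; this is not what findstr does internally.

```python
import re

def filter_lines(lines, exclude_patterns):
    """Keep only the lines that match none of the exclusion regexes."""
    compiled = [re.compile(p) for p in exclude_patterns]
    return [line for line in lines if not any(rx.search(line) for rx in compiled)]

# Patterns mirror exclude.txt. "+" is a quantifier in Python's re module,
# so it is escaped here to match a literal "C++".
patterns = [r"Click-to-Run", r"Visual C\+\+", r"Windows.*SDK"]

# Hypothetical sample data; only the two SDK lines come from the question.
installed = [
    "Microsoft Office Click-to-Run",
    "Microsoft Visual C++ Redistributable",
    "Windows Desktop SDK Tools",
    ".Net Framework 4.0.0 SDK",
]

for line in filter_lines(installed, patterns):
    print(line)
```

Because `Windows.*SDK` requires "Windows" to appear before "SDK" on the line, ".Net Framework 4.0.0 SDK" is kept while "Windows Desktop SDK Tools" is dropped.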

Related

Find and replace pattern in large number of files

I want to replace text in about 80,000 log files using a regex. I love VSCode's batch search and replace, but I was unable to use it here because it did not seem to handle this amount of data well. Any suggestions for how I could do this with VSCode, or for alternatives?
Instead of depending on a GUI-based tool, it might be easier to use a CLI tool for this.
If you're using Linux, or you're on Windows and willing to install tools like sed and find, it should be relatively simple.
sed is a command-line tool available on all (or at least most) Linux distributions, and it can be installed on Windows.
Usage (for this use case):
sed -i 's/{pattern}/{replacement}/g' {file}
This uses sed to replace each match of the pattern with the replacement. The g flag replaces every match on a line rather than only the first, and -i edits the file in place, overwriting it.
To target all files in a directory you can do:
find . -type f -name "*.log" -exec sed -i 's/{pattern}/{replacement}/g' {} \;
This finds items recursively, starting from the current directory, whose type is file and whose name ends with .log. It then runs sed on each matched file to replace the pattern with the replacement you want.
You can find how to get tools like sed and find for Windows on the following question:
https://stackoverflow.com/a/127567/6277798
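If installing sed and find is not an option, the same recursive replace can be sketched in Python. This is only an illustration under assumptions: the function name `replace_in_logs` is hypothetical, files are assumed to be UTF-8 compatible, and each file is read fully into memory, which may not suit very large logs.

```python
import re
from pathlib import Path

def replace_in_logs(root, pattern, replacement, glob="*.log"):
    """Apply a regex substitution to every matching file under root.

    Returns the number of files that were actually modified.
    """
    rx = re.compile(pattern)
    changed = 0
    for path in Path(root).rglob(glob):
        if not path.is_file():
            continue
        # newline="" keeps the original line endings untouched.
        with path.open("r", encoding="utf-8", errors="surrogateescape", newline="") as handle:
            text = handle.read()
        new_text, count = rx.subn(replacement, text)
        if count:
            # Only rewrite files in which something was replaced.
            with path.open("w", encoding="utf-8", errors="surrogateescape", newline="") as handle:
                handle.write(new_text)
            changed += 1
    return changed
```

A call such as `replace_in_logs(".", r"{pattern}", r"{replacement}")` would play the role of the find/sed pipeline above. As with `sed -i`, files are overwritten, so it is worth trying it on a copy first.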

How should I find all the files that contains two strings?

My problem is to create a batch script file for Windows that iterates through a lot of files and finds every file which has a line that contains two specified strings. It is not good enough if the strings merely appear somewhere in the file; they must be on the same line.
For example, I have 5 files which contain the following:
1st: apple:green
2nd: apple
green
3rd: green
apple
4th: apple: yellowgreen
5th: apple: green
It should return the filenames of the first, fourth and fifth file.
Here is what I have:
FINDSTR /s /i /m "apple green" *.txt | FINDSTR "\MyDirectory" >> results.txt
How should I modify this to make it work?
findstr /i /s /m /r /c:"apple.*green" /c:"green.*apple" *.txt
EDITED TO WORK WITH FINDSTR
This regex worked for me:
"apple.*green green.*apple"
Also, your write to file command with the pipe did not work for me (perhaps I'm missing something). If it doesn't work for you, perhaps this will:
FINDSTR /s /i /m "apple.*green green.*apple" *.txt >> results.txt
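To make the "both strings on the same line" rule explicit, here is a small Python sketch of the same search. The function name `files_with_both` is hypothetical, and the matching is a plain case-insensitive substring test in either order, which is what the two findstr patterns above amount to.

```python
from pathlib import Path

def files_with_both(root, first, second, glob="*.txt"):
    """Return files under root that have at least one line containing both strings."""
    first, second = first.lower(), second.lower()
    matches = []
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                lowered = line.lower()
                # Both strings must occur on the same line, in any order.
                if first in lowered and second in lowered:
                    matches.append(path)
                    break  # one matching line is enough for this file
    return matches
```

Calling `files_with_both(".", "apple", "green")` on the five example files would be expected to report the first, fourth and fifth, since only those have both words on one line.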

Windows scripting: list files not matching a pattern

In the Windows 7 command prompt, I'd like to list all files in a folder whose names do not start with abc. I have tried:
forfiles /P C:\myFolder\ /M ^[abc]* /S /C "CMD /C echo @file"
Where is my error?
Many thanks.
Looking at forfiles /?:
/M searchmask Searches files according to a searchmask.
The default searchmask is '*' .
which strongly suggests forfiles doesn't support regular expressions, just normal Cmd/Windows wildcards.
On Windows 7 this can easily be achieved in PowerShell:
dir c:\myFolder | ?{ -not($_.Name -match '^abc') } | select Name
(That performs a case-insensitive regular expression match, which doesn't matter in the case of Windows filenames.)
NB. This assumes you want files not starting with abc, which isn't what your (attempted) regular expression says: read as a regex, ^[abc]* matches zero or more leading a, b or c characters, so it would match every filename.
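The same filter can also be sketched in Python, for readers who want to see the logic spelled out. The function name `files_not_starting_with` is hypothetical; like the PowerShell one-liner it compares names case-insensitively and does not recurse into subfolders, but unlike `dir` it skips directories.

```python
from pathlib import Path

def files_not_starting_with(folder, prefix):
    """Return the names of files in folder whose name does not start with prefix."""
    prefix = prefix.lower()
    return sorted(
        entry.name
        for entry in Path(folder).iterdir()
        # Compare in lower case so ABC.txt is treated like abc.txt.
        if entry.is_file() and not entry.name.lower().startswith(prefix)
    )
```

For example, `files_not_starting_with(r"C:\myFolder", "abc")` would return the names that the PowerShell pipeline above selects, minus any subfolders.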
Where is my error?
Your error is thinking that the forfiles command would support regular expressions.
It does not. It supports file name matching with * and ?.
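An alternative, if you are running xcopy instead of echo, is xcopy's /exclude option. Note that /exclude expects the name of a file listing the strings to exclude, one per line, and a file is skipped when any of those strings matches part of its absolute path; it does not take a wildcard directly. For instance, with a (hypothetical) file exclude.txt containing the line \abc, which also excludes anything inside folders whose names start with abc:
forfiles /P C:\myFolder\ /M * /S /C "CMD /C xcopy @path %myDestinationFolder% /exclude:exclude.txt"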
Also, if you're using PowerShell, another option is the -match operator.

How can I use regex to chop apart xcopy statements embedded in .csproj files?

I'm working with a bunch (~2000) of .csproj files, and on this development staff there's a historical precedent for embedding xcopy in the post-build events to move things around during the build process. In order to get build knowledge into one place, I'm working towards eradicating these xcopy calls in favor of declarative build actions in our automated build process.
With that in mind, I'm trying to come up with a regex I can use to chop out the path arguments supplied to xcopy. The statements come in a couple flavors:
xcopy /F /I /R /E /Y "..\..\..\Microsoft\Enterprise Library\3.1\bin"
xcopy /F /I /R /E /Y ..\Crm\*.* .\
xcopy ..\NUnit ..\..\..\output\debug /I /Y
specifically:
unpredictable placement of switches
destination path argument not always supplied
path arguments sometimes wrapped in quotes
I'm no regex wizard, but this is what I've got so far (the excessive use of parentheses is for match saving in PowerShell):
(.*x?copy.* '"?)([^ /'"]+)('"/.* '"?)([^ /'"]+)('"?.*)
The ([^ /'"]+) sections are the parts that I intend to be the path arguments, defined as strings containing no quotes, spaces, or forward slashes, but I have a feeling I'll have to apply two regexes (one for quote-wrapped paths with spaces and one for unquoted paths).
Unfortunately, when I run this regex it seems to give me the same match for both the first and second path arguments. Most frustrating.
How would I change this to correct it?
In cases like this, I like to leverage PowerShell's argument parsing system. Use a simple regex to grab the whole xcopy line and then run it through a function.
$samples = 'xcopy /F /I /R /E /Y "..\..\..\Microsoft\Enterprise Library\3.1\bin"',
'xcopy /F /I /R /E /Y ..\Crm\*.* .\',
'xcopy ..\NUnit ..\..\..\output\debug /I /Y'
function argumentgrinder {
$args | Where-Object {($_ -notlike "/*") -and ($_ -ne "xcopy")}
}
$samples | foreach { Invoke-Expression "argumentgrinder $_"}
You do have to be careful of anything that looks like a PowerShell variable in the paths though ($, # and parentheses).
I don't think you need two different patterns to match the paths.
The following pattern should match each single statement in all three cases you have provided:
\A(xcopy)\s+([\/A-Z\s]*)\s*((".*?")|([^\s]*))\s*((".*?")|([^\s]*))\s*([\/A-Z\s]*)
I've used or (|) to match paths in the various combinations.
NOTE: Because I don't have Windows available at the moment, I tested this pattern with Ruby on Linux, but the syntax should not be different, or at least it should give you an idea.
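The "split into arguments first, then discard the switches" approach from the first answer can also be sketched in Python with a very small tokenizing regex. This is an illustration only: the function name `xcopy_paths` is hypothetical, and it assumes switches always start with / and never carry a quoted value.

```python
import re

# A token is either a double-quoted run (which may contain spaces)
# or a run of non-whitespace characters.
TOKEN = re.compile(r'"[^"]*"|\S+')

def xcopy_paths(statement):
    """Return the path arguments of an xcopy statement, without switches or quotes."""
    paths = []
    for token in TOKEN.findall(statement):
        if token.lower() == "xcopy" or token.startswith("/"):
            continue  # skip the command name and switches such as /Y
        paths.append(token.strip('"'))
    return paths

# The three statements from the question (backslashes doubled for Python).
samples = [
    'xcopy /F /I /R /E /Y "..\\..\\..\\Microsoft\\Enterprise Library\\3.1\\bin"',
    "xcopy /F /I /R /E /Y ..\\Crm\\*.* .\\",
    "xcopy ..\\NUnit ..\\..\\..\\output\\debug /I /Y",
]

for sample in samples:
    print(xcopy_paths(sample))
```

Tokenizing first means the position of the switches no longer matters, and a missing destination simply results in a one-element list.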

Windows Batch: How remove all blank (or empty) lines

I am trying to remove all blank lines from a text file using a Windows batch program.
I know the simplest way of achieving this in bash is via regular expressions and the sed command:
sed -i "/^$/d" test.txt
Question: Does Windows batch have a similarly simple method for removing all blank lines from a text file? Otherwise, what is the simplest method of achieving this?
Note: I'm running this batch script to set up new Windows computers for customers to use, so preferably no additional programs need to be installed (and then uninstalled) to achieve this. Ideally, I'll just be using the "standard" batch library.
For /f does not process empty lines:
for /f "usebackq tokens=* delims=" %%a in ("test.txt") do (echo(%%a)>>~.txt
move /y ~.txt "test.txt"
You could also use FINDSTR:
findstr /v "^$" C:\text_with_blank_lines.txt > C:\text_without_blank_lines.txt
/V -- Print only lines that do NOT contain a match.
^ --- Line position: beginning of line
$ --- Line position: end of line
I usually pipe command output to it:
dir | findstr /v "^$"
You also might find these answers to a similar question helpful, since some 'blank lines' may include spaces or tabs.
https://stackoverflow.com/a/45021815/5651418
https://stackoverflow.com/a/16062125/5651418
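To illustrate the difference between strictly empty lines (what "^$" matches) and lines that only contain spaces or tabs, here is a short Python sketch. It is meant as an explanation of the logic rather than a replacement for the batch solutions, since Python is not part of a stock Windows install; the function name `drop_blank_lines` is hypothetical.

```python
def drop_blank_lines(text, whitespace_only_counts=True):
    """Remove blank lines from text, keeping the original line endings.

    With whitespace_only_counts=False only truly empty lines are removed,
    which corresponds to matching "^$".
    """
    kept = []
    for line in text.splitlines(keepends=True):
        content = line.rstrip("\r\n")
        if whitespace_only_counts:
            is_blank = content.strip() == ""
        else:
            is_blank = content == ""
        if not is_blank:
            kept.append(line)
    return "".join(kept)
```

With the default setting a line holding only a tab is removed as well; with `whitespace_only_counts=False` it is kept, which is why a "^$" filter can appear to leave blank lines behind.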