Equivalent of `grep -o` for findstr command in Windows - regex

In Unix, the command grep -o prints only the matched string. This is very helpful when you are searching for a pattern with a regular expression and are interested only in the exact match, not the entire line.
However, I'm not able to find anything similar for the Windows command findstr. Is there any substitute for printing only the matched string on Windows?
For example:
grep -o "10\.[0-9]+\.[0-9]+\.[0-9]+" myfile.txt
The above command prints only the IP addresses in myfile.txt of the form 10.*.*.*, not the entire lines that contain such IP addresses.

PowerShell:
Select-String '10\.[0-9]+\.[0-9]+\.[0-9]+' myfile.txt | ForEach-Object {
    $_.Matches[0].Groups[0].Value
}

Just use your familiar grep and the other great Linux commands by downloading UnxUtils (ready-made .exe binaries). Add it to your PATH environment variable for convenience.

Related

Find and replace a string that includes quotes within files in multiple directories - unix aix

So here's the scenario. I'd like to change the following value from true to false in hundreds of files in an installation, but I can't figure out the command and have been working on this for a few days now. What I have is a simple script that looks for all instances of a file and stores the results in a file. I'm using this command to find the files I need to modify:
find /directory -type f \( -name 'filename' \) > file_instances.txt
Now what I'd like to do is run the following command, or a variation of it, to modify that value:
sed 's/directoryBrowsingEnabled="false"/directoryBrowsingEnabled="true"/g' $i > $i
When I tested the above command, it blanked out the file when it attempted to replace the string, but if I run the command against a single file, the change is made correctly.
Can someone please shed some light on to this?
Thank you in advance
What has semi-worked for me is the following:
You can call sed with the -i option instead of doing > $i. You can even keep a backup of the old file, in case you have a problem, by adding a suffix:
sed -e 'command' -i.backup myfile.txt
This will execute the command in place on myfile.txt and save the old file as myfile.txt.backup.
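As a minimal sketch of that in-place edit (demo.txt here is a throwaway stand-in for one of your real config files, and the -i.backup spelling assumes GNU sed):

```shell
# Create a throwaway file standing in for one of the real config files
printf 'directoryBrowsingEnabled="false"\n' > demo.txt

# Edit in place; the original content is preserved as demo.txt.backup
sed -i.backup 's/directoryBrowsingEnabled="false"/directoryBrowsingEnabled="true"/g' demo.txt

cat demo.txt         # directoryBrowsingEnabled="true"
cat demo.txt.backup  # directoryBrowsingEnabled="false"
```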
EDIT:
Not using -i may indeed result in blank files: the shell truncates the output file before sed ever reads it, so reading and writing the same file through redirection destroys its contents.
You can convince yourself of this by some simple cat commands:
$ echo "This is a test" > test.txt
$ cat test.txt > test.txt # This returns an error; cat detects that input and output are the same file
$ cat <test.txt >test.txt # This blanks the file; cat cannot detect it through the redirections
On AIX you might be missing the -i option of sed. Sad. You could make a script that moves each file to a tmp file and redirects (with sed) to the original file or try using a here-construction with vi:
cat file_instances.txt | while read file; do
vi ${file}<<END >/dev/null 2>&1
:1,$ s/directoryBrowsingEnabled="false"/directoryBrowsingEnabled="true"/g
:wq
END
done
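The tmp-file approach mentioned above can be sketched like this (conf.txt and the substitution are the hypothetical examples from the question; the real list would come from your find command):

```shell
# For each file listed in file_instances.txt, write the edited version
# to a temporary file, then move it back over the original.
while read -r file; do
    sed 's/directoryBrowsingEnabled="false"/directoryBrowsingEnabled="true"/g' "$file" > "$file.tmp" \
        && mv "$file.tmp" "$file"
done < file_instances.txt
```

Because sed writes to a different file than it reads, this avoids the blanking problem entirely, at the cost of a rename per file.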

Sed command - order of option flags matters? (-ir vs -ri)

Imagine the following data stored in file data.txt
1, StringString, AnotherString 545
I want to replace "StringString" with "Strung" using the following command:
sed -ir 's/String+/Strung/g' data.txt
But it won't work. This works though:
sed -ri 's/String+/Strung/g' data.txt
I don't see any reason why the order of option flags would matter. Is it a bug or is there an explanation?
Please note that I'm not looking for a workaround but rather why the order of -ir and -ri matters.
Sidenotes: The switch -i "edits the file in place" while -r allows "extended regular expression" (allowing the + operator). I'm running sed 4.2.1 Dec. 2010 on Ubuntu 12.10.
When doing -ir you are specifying that "r" should be the suffix for the backup file; sed never sees an -r flag, so the script is parsed as a basic regular expression in which + is a literal character.
You should be able to do -i -r if you need them in that order.
Did you check sed --help or man sed?
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if extension supplied).
The default operation mode is to break symbolic and hard links.
This can be changed with --follow-symlinks and --copy.
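A quick demonstration with GNU sed, using the data.txt from the question, shows both effects at once: -ir leaves a backup named data.txtr, and since -r was swallowed as the backup suffix, the + stays literal and nothing is replaced:

```shell
printf '1, StringString, AnotherString 545\n' > data.txt

# "-ir": in-place edit with backup suffix "r"; the script is a basic
# regex, so "+" is literal and no substitution happens.
sed -ir 's/String+/Strung/g' data.txt
ls data.txtr   # the backup created by the "r" suffix
cat data.txt   # 1, StringString, AnotherString 545 (unchanged)

# Separate flags work as intended: extended regex, in place, no backup.
sed -i -r 's/String+/Strung/g' data.txt
cat data.txt   # 1, StrungStrung, AnotherStrung 545
```

Note that String+ matches each "String" run separately, so "StringString" becomes "StrungStrung"; (String)+ would be needed to collapse the repetition into a single "Strung".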

How can I use sed to output the current and previous directory while including a tilde if $HOME is one directory further?

Currently my zsh prompt utilizes $'%2~ %%' to output the current and previous directory before just displaying % as my input prompt. For example, if I'm in /Users/david/Documents/Code/project, my prompt will display:
Code/project %
However, if I back up into the Code directory, a tilde is shown:
~Documents/Code %
I'm trying to reproduce this in the fish shell by replacing the regex provided in their prompt_pwd function, which gets passed to sed. By default, that function looks like:
function prompt_pwd --description 'Print the current working directory, shortened to fit the prompt'
    echo $PWD | sed -e "s|^$HOME|~|" -e "s|^/private||" -e 's-\([^/]\)[^/]*/-\1/-g'
end
Currently, this outputs the full name of the current directory, but truncates all other directories to one character (and replaces $HOME with a tilde). I'm trying to figure out what regular expression I can provide that function to duplicate what I had going on in zsh.
I suggest:
echo $PWD | sed -e "s|^$HOME/|~|" -e 's-.*/\([^/]*/[^/]*\)-\1/-'
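For example, substituting a literal home directory of /Users/david (hypothetical) for $HOME, the two cases from the question behave like this:

```shell
# Deep inside $HOME: only the last two components are kept
# (the trailing slash comes from the \1/ in the replacement)
echo /Users/david/Documents/Code/project \
    | sed -e "s|^/Users/david/|~|" -e 's-.*/\([^/]*/[^/]*\)-\1/-'
# -> Code/project/

# One level above: the tilde-prefixed form survives, as in the zsh prompt
echo /Users/david/Documents/Code \
    | sed -e "s|^/Users/david/|~|" -e 's-.*/\([^/]*/[^/]*\)-\1/-'
# -> ~Documents/Code
```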

Howto: Searching for a string in a file from the Windows command line?

Is there a way to search a directory and its subdirectories' files for a string? The string is rather unique. I want to return the name of the file and, hopefully, the line that the string is on. Is there anything built into Windows for doing this?
You're looking for the built-in findstr command.
The /S option performs a recursive search.
There is the find.exe command, but it's pretty limited in its capabilities. You could install Cygwin or UnxUtils and use a pipeline with its Unix-style find and grep:
find . -type f | xargs grep unique-string
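A minimal sketch of that pipeline on a throwaway directory tree (the demo paths and search string are made up; -Hn is a GNU grep extension that forces the file name and adds the line number, covering the "name and line" part of the question):

```shell
mkdir -p demo/sub
echo "the unique-string lives here" > demo/sub/notes.txt

# Recursively find regular files and grep them for the string
find demo -type f | xargs grep -Hn "unique-string"
# -> demo/sub/notes.txt:1:the unique-string lives here
```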

Regex to find external links from the html file using grep

For the past few days I've been trying to develop a regex that fetches all the external links from the web pages given to it, using grep.
Here is my grep command
grep -h -o -e "\(\(mailto:\|\(\(ht\|f\)tp\(s\?\)\)\)\://\)\{1\}\(.*\?\)" "/mnt/websites_folder/folder_to_search" -r
Now grep seems to return everything after the external links on the given line.
Example
if an html file contains something like this on the same line
<a href="http://www.google.com">Google</a><p><a href='https://yahoo.com'>Yahoo</a></p>
then the given grep command return the following result
http://www.google.com">Google</a><p><a href='https://yahoo.com'>Yahoo</a></p>
The idea here is that if an html file contains more than one link (whether in a, img, etc.) on the same line, the regex should fetch only the links and not all the content of that line.
I managed to develop the same in rubular.com.
The regex is as follows:
("|')(\b((ht|f)tps?:\/\/)(.*?)\b)("|')
which works with the above input,
but I am not able to replicate it in grep.
Can anyone help?
I can't modify the html file, so don't ask me to do that; nor can I look for each specific tag and check its attributes to get the external links, as that adds processing time and my application doesn't demand it.
Thank you
Try this:
egrep -o "(mailto|ftp|http(s)?://){1}[^'\"]+" /path/to/file
It outputs one link per line and assumes every link is inside single or double quotes. To exclude links from certain domains, use -v:
egrep -o "(mailto|ftp|http(s)?://){1}[^'\"]+" /path/to/file | egrep -v "yahoo.com"
By default grep prints the entire line a match was found on. The -o switch selects only the matched parts of a line. See the man page.
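Applied to the example line from the question (grep -E is the modern spelling of egrep, which newer GNU grep flags as obsolescent), the pattern extracts just the two URLs:

```shell
line="<a href=\"http://www.google.com\">Google</a><p><a href='https://yahoo.com'>Yahoo</a></p>"

printf '%s\n' "$line" | grep -Eo "(mailto|ftp|http(s)?://){1}[^'\"]+"
# -> http://www.google.com
# -> https://yahoo.com
```

Each match starts at a scheme and stops at the next quote character, so the surrounding markup never makes it into the output.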