I have this regex that works well enough for my purposes to identify email addresses in the CSV files in a directory, using grep on Mac OS X:
grep --no-filename -E -o "\b[a-zA-Z0-9.-]+#[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" *
I've tried to get this working with sed so that I can replace the emails with foo#bar.baz:
sed -E -i '' -- 's/\b[a-zA-Z0-9.-]+#[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b/foo#bar.baz/g' *
However, I can't seem to get it to work. Admittedly, sed and regex are not my strong points. Any ideas?
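The sed bundled with OS X is BSD sed, which does not treat \b as a word boundary the way the GNU tools do. One option is to install GNU sed with Homebrew and use it instead of the bundled one. Install it with this command (Homebrew should not be run with sudo):
brew install gnu-sed
and use this for the substitution (Homebrew installs GNU sed as gsed unless you add its gnubin directory to your PATH):
gsed -E -i 's/\b[a-zA-Z0-9.-]+#[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b/foo#bar.baz/g' *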
You seem to assume that grep and sed support the same regex dialect, but that is not necessarily, or even usually, the case.
If you want a portable solution, you could easily use Perl for this, which however supports yet another regex dialect...
perl -i -p -e 's/\b[a-zA-Z0-9.-]+#[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b/foo#bar.baz/g' *
For a bit of an overview of regex dialects, see https://stackoverflow.com/a/11857890/874188
Your regex kind of sucks, but I understand that is sort of beside the point here.
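If installing extra tools is not an option, one possible workaround is to drop \b altogether, since that is the part of the pattern the sed bundled with OS X most likely trips over. The sketch below is an illustration only: the sample CSV content is made up, and it writes to standard output instead of editing files in place. Without \b, a trailing dot or hyphen next to an address becomes part of the match, which may or may not matter for your data.

```shell
# Made-up sample data standing in for one of the CSV files.
sample='id,contact
1,alice#example.com
2,none'

# Same pattern as in the question, minus the two \b anchors.
# -E selects extended regular expressions in both BSD and GNU sed.
result=$(printf '%s\n' "$sample" |
  sed -E 's/[a-zA-Z0-9.-]+#[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+/foo#bar.baz/g')

printf '%s\n' "$result"
```

Once the output looks right, the same expression can be used with the in-place flag of whichever sed you have.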
How can I remove newlines using Perl and/or sed at the bash command line, except after a specific set of characters?
The closest I have come is:
perl -C -i -p -e 's/[^.:]\n//' ~/Desktop/bak2
The above code correctly leaves alone the lines that end with a dot or a colon, but it fails in another way: when it removes a newline, it also erases the last character of the line. I also need each removed \n to be replaced by a space.
It would be great, if possible, to have a solution in Perl and also in sed.
I've searched for a similar solution in Perl or sed and haven't found one; sorry if it already exists.
Examples:
Existing content:
Violets are blue and
Buda has great teachings.
Programming can be easy because:
Stackoverflow exists,
and the community always helps
a lot.
Desired output:
Violets are blue and Buda has great teachings.
Programming can be easy because:
Stackoverflow exists, and the community always helps a lot.
With sed
sed -e ':A;/[^.:]$/{N;bA' -e '};y/\n/ /' ~/Desktop/bak2
or gnu sed
sed -z 's/\([^.:]\)\n/\1 /g' ~/Desktop/bak2
You can preserve the character before the newline by capturing it (I also added handling for empty lines):
perl -C -i -p -e 's/(^|[^.:])\n/$1 /' ~/Desktop/bak2
or use a positive lookbehind:
perl -C -i -p -e 's/(?<=[^.:])\n/ /' ~/Desktop/bak2
or use \K to keep everything matched before the newline:
perl -i -pe 's/[^.:]\K\n/ /' ~/Desktop/bak2
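For the sed side, here is a sketch of a variant that should not depend on GNU extensions. It is my own rearrangement of the loop idea rather than one of the commands above; it splits the script into separate -e pieces and only tries to join when the current line is not the last one. The sample text is the one from the question.

```shell
# The sample text from the question.
input='Violets are blue and
Buda has great teachings.
Programming can be easy because:
Stackoverflow exists,
and the community always helps
a lot.'

# :a        label to loop back to
# $!{...}   only try to join when this is not the last input line
# /[^.:]$/  the text collected so far does not end in a dot or a colon
# N; ba     append the next line and test again
# s/\n/ /g  turn the embedded newlines into spaces before printing
joined=$(printf '%s\n' "$input" |
  sed -e ':a' -e '$!{' -e '/[^.:]$/{' -e 'N' -e 'ba' -e '}' -e '}' -e 's/\n/ /g')

printf '%s\n' "$joined"
```

This prints to standard output; redirect it to a new file, or add your sed's in-place option, once the result is what you expect.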
I'm trying to write a script that does a simple regex replace in php.ini: I want to replace the line ;cgi.fix_pathinfo=1 with cgi.fix_pathinfo=0.
Ideally I want to avoid installing any additional packages, so sed seems a logical choice since it is bundled with FreeBSD. I have tried the following, but it doesn't seem to work:
sed 's/;cgi\.fix_pathinfo=1/cgi\.fix_pathinfo=0/' /usr/local/etc/php.ini
To change the content of a file in place with BSD sed, you can do this:
sed -i.bak -e 's/;cgi\.fix_pathinfo=1/cgi.fix_pathinfo=0/;' /usr/local/etc/php.ini
That creates a copy of the old file with a .bak extension.
Or without creating a copy:
sed -i '' -e 's/;cgi\.fix_pathinfo=1/cgi.fix_pathinfo=0/;' /usr/local/etc/php.ini
Note that in this case, a space and an empty string enclosed between quotes are mandatory. You can't simply write sed -i -e '... like with GNU sed.
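If the same script has to run against both BSD and GNU sed, one commonly used trick is to attach the backup suffix directly to -i, which both accept as far as I know, and delete the backup afterwards. A minimal sketch, using a throwaway file instead of the real /usr/local/etc/php.ini; the temporary directory and the sample contents are made up for the illustration:

```shell
# Throwaway copy so the real php.ini is not touched.
tmpdir=$(mktemp -d)
ini="$tmpdir/php.ini"
printf '%s\n' '; sample settings' ';cgi.fix_pathinfo=1' > "$ini"

# -i.bak (suffix attached, no space) edits in place and keeps a backup.
sed -i.bak -e 's/;cgi\.fix_pathinfo=1/cgi.fix_pathinfo=0/' "$ini"
rm -f "$ini.bak"   # drop the backup once the edit has been made

edited=$(cat "$ini")
printf '%s\n' "$edited"
rm -rf "$tmpdir"
```

Keeping the backup until you have checked the result is the safer habit when the target is a real configuration file.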
Imagine the following data stored in file data.txt
1, StringString, AnotherString 545
I want to replace "StringString" with "Strung" with the following code
sed -ir 's/String+/Strung/g' data.txt
But it won't work. This works though:
sed -ri 's/String+/Strung/g' data.txt
I don't see any reason why the order of option flags would matter. Is it a bug or is there an explanation?
Please note that I'm not looking for a workaround but rather why the order of -ir and -ri matters.
Sidenotes: The switch -i "edits the file in place" while -r allows "extended regular expression" (allowing the + operator). I'm running sed 4.2.1 Dec. 2010 on Ubuntu 12.10.
When you write -ir, you are specifying that "r" should be the suffix for the backup file; the -r option itself is never seen.
You should be able to write -i -r if you need them in that order.
Did you check sed --help or man sed?
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if extension supplied).
The default operation mode is to break symbolic and hard links.
This can be changed with --follow-symlinks and --copy.
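To see the effect for yourself, a small experiment along these lines should do. It is only a sketch: the temporary directory is made up, and the file mirrors the data.txt from the question.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' '1, StringString, AnotherString 545' > "$tmpdir/data.txt"

# -ir is read as -i with the backup suffix "r"; -r itself is never enabled.
sed -ir 's/String+/Strung/g' "$tmpdir/data.txt"

# The directory now also holds a backup named data.txtr ...
listing=$(ls "$tmpdir")
# ... and data.txt is unchanged, because without -r the + is a literal plus.
content=$(cat "$tmpdir/data.txt")

printf '%s\n' "$listing" "$content"
rm -rf "$tmpdir"
```

The stray data.txtr file is usually the quickest clue that the option letters were glued together in the wrong order.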
I want to select some files that are matching a regular expression.
Files are for example:
4510-88aid-50048-INA.txt
4510-88nid-50048-INA.txt
xxxx-05xxx-xxxxx-INA.txt
I want all files that match this regex:
.*[\w]{4}-05(?!aid)[\w]{3}-[\w]{5}-INA\.txt
As far as I can tell, in the case above this should match only xxxx-05xxx-xxxxx-INA.txt.
Using a tool like RegexTester, everything works perfectly.
Using find -regex from bash doesn't seem to work for me.
My question is, why?
I can't figure it out, I am using:
find /some/path -regex ".*[\w]{4}-05(?!aid)[\w]{3}-[\w]{5}-INA\.txt" -exec echo {} \;
But nothing is printed... Any ideas?
$ uname -a
Linux debmu838 2.6.5-7.321-smp #1 SMP Mon Nov 9 14:29:56 UTC 2009 x86_64 x86_64 x86_64 GNU/Linux
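With bash 4+ and perl (ls prints the full path, so the pattern is anchored after the last slash rather than at the start of the line):
ls /some/path/**/*.txt | perl -nle 'print if m{(?:^|/)[\w]{4}-05(?!aid)[\w]{3}-[\w]{5}-INA\.txt$}'
For ** to recurse you need shopt -s globstar, for example in your .profile.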
According to the find man page, find's -regex uses Emacs regular expressions by default. And according to http://www.regular-expressions.info/refflavors.html, Emacs is GNU ERE, which does not support lookarounds.
You can try a different -regextype as @l0b0 suggested, but the POSIX flavours do not seem to support this feature either.
I pretty much ditto the other answers: find's -regex switch can't emulate everything in Perl's regex. However, here's something you can try...
Take a look at the find2perl command. That program can take a typical find statement, and give you a Perl program equivalent for it. I don't believe -regex is recognized by find2perl (It's not in the standard Unix find, but only in the GNU find), but you can simply use -name, and then see the program it generates. From there, you can modify the program to use the Perl expressions you want in your regex. In the end, you'll get a small Perl script that will do the file directory find you want.
Otherwise, try using -regextype posix-extended, which covers most of what you would write in a Perl regex. You can't use lookarounds, but you can probably find something that does work.
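One thing that does work in this particular case is to split the lookahead into two tests: one that accepts the general shape of the name and one that rejects the aid variant. The following is only a sketch under a few assumptions: the temporary directory and the extra 05aid file name are invented for the demonstration, and the ? glob matches any character, not just \w, so it is looser than the original regex.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/4510-88aid-50048-INA.txt" \
      "$tmpdir/4510-88nid-50048-INA.txt" \
      "$tmpdir/xxxx-05xxx-xxxxx-INA.txt" \
      "$tmpdir/abcd-05aid-12345-INA.txt"

# First -name: four characters, "-05", three characters, five characters.
# Second test: reject names where "05" is followed by "aid",
# which is what (?!aid) expressed in the Perl-style regex.
found=$(find "$tmpdir" -type f \
  -name '????-05???-?????-INA.txt' \
  ! -name '????-05aid-*')

# Print just the file names, without the temporary directory prefix.
names=$(printf '%s\n' "$found" | sed 's|.*/||')
printf '%s\n' "$names"
rm -rf "$tmpdir"
```

Because it only uses -name and negation, this does not need -regex or -regextype at all.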
What you've got looks like a Perl regex. Try with a different -regextype, and tweak the regex accordingly:
Changes the regular expression syntax understood by -regex and -iregex tests which occur later on the command line. Currently-implemented types are emacs (this is the default), posix-awk, posix-basic, posix-egrep and posix-extended.
Try this:
ls ????-??aid-?????-INA.txt
Try simple script like this:
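#!/bin/bash
# Print every *INA.txt file whose name has the shape wwww-wwwww-wwwww-INA.txt.
for file in *INA.txt
do
  # sed replaces the whole prefix with "found" only when it matches the pattern.
  match=$(echo "${file%INA.txt}" | sed -r 's/^\w{4}-\w{5}-\w{5}-$/found/')
  # Quote $match so the test does not break on names containing spaces.
  [ "$match" = "found" ] && echo "$file"
done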
I'm working on some old code and I found that I used to use
sed -E 's/findText/replaceWith/g' #findText would contain a regex
but I now try
sed -e 's/findText/replaceWith/g'
It seems to do the same thing, or does it?
I kinda remember there being a reason I did it, but I can't remember what it was, and "man sed" doesn't help: it has nothing about -E, only -e, and that description doesn't make much sense to me either.
-e, --expression=script
Append the editing commands in script to the end of
the editing command script. script may contain more
than one newline separated command.
I thought -e meant it would match with a regex...
GNU sed version 4.2.1
From the source code, -E is an undocumented option for compatibility with BSD sed.
    /* Undocumented, for compatibility with BSD sed. */
    case 'E':
    case 'r':
      if (extended_regexp_flags)
        usage(4);
      extended_regexp_flags = REG_EXTENDED;
      break;
And according to the manual, -E in BSD sed enables extended regular expressions.
From sed's documentation:
-E
-r
--regexp-extended
Use extended regular expressions rather than basic regular expressions. Extended regexps are those that egrep accepts; they can be clearer because they usually have fewer backslashes. Historically this was a GNU extension, but the -E extension has since been added to the POSIX standard (http://austingroupbugs.net/view.php?id=528), so use -E for portability. GNU sed has accepted -E as an undocumented option for years, and *BSD seds have accepted -E for years as well, but scripts that use -E might not port to other older systems. See Extended regular expressions.
Therefore it seems that -E should be the preferred way to declare that you are going to use (E)xtended regular expressions, rather than -r.
By contrast, -e just specifies that what follows is the script that you want to execute with sed (something like 's/bla/abl/g').
Also from the documentation:
Without -e or -f options, sed uses the first non-option parameter as the script, and the following non-option parameters as input files.
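The difference is easy to check on a throwaway string. This is just an illustrative sketch; the input text and the variable names are made up.

```shell
# -e only introduces a script; the + is still a literal plus in basic regex.
basic=$(printf '%s\n' 'aaa bbb' | sed -e 's/a+/X/')

# -E switches to extended regex; each -e adds one more command to the script.
extended=$(printf '%s\n' 'aaa bbb' | sed -E -e 's/a+/X/' -e 's/b+/Y/')

printf '%s\n' "$basic" "$extended"
```

The first line printed is the unchanged aaa bbb, because nothing in the input contains a literal "a+"; the second is X Y.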