How to toggle cases with sed command? - regex

I want to replace kAbcdef with abcdef in a file using sed. According to some suggestions, if I run this command
echo 'abc'|sed 's/^../\u&/'
the result will be
Abc
but on macOS (using zsh), the result is
uabc
Does anyone know the correct way to toggle cases with the sed command?
How can I search for kAbcd and replace that string with abcd, i.e., remove the k and lowercase the letter that follows it?

$ # with GNU sed
$ echo 'kAbcd' | sed -E 's/k(.)/\l\1/'
abcd
$ # with perl if GNU sed is not available
$ echo 'kAbcd' | perl -pe 's/k(.)/\l$1/'
abcd
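\l lowercases the first character of what follows it (here, the captured group \1 or $1).
Use the g modifier to replace all such occurrences in a line.
Note that \u, \l, \U and \L are GNU sed extensions; the BSD sed shipped with macOS does not support them, which is why \u came out as a literal u in the question.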
As mentioned by Wiktor Stribiżew in comments, see also: How to use GNU sed on Mac OS X
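If neither GNU sed nor perl is at hand, the same substitution can be sketched with POSIX awk. This is only a minimal sketch: the function name strip_k is hypothetical, and it assumes the letter after k is an ASCII uppercase letter.

```shell
# strip_k (hypothetical name): remove every "k" that is directly followed by
# an ASCII uppercase letter, and lowercase that letter.
# Only POSIX awk features are used (match, RSTART, substr, tolower).
strip_k() {
  awk '{
    while (match($0, /k[A-Z]/)) {
      $0 = substr($0, 1, RSTART - 1) tolower(substr($0, RSTART + 1, 1)) substr($0, RSTART + 2)
    }
    print
  }'
}

echo 'kAbcd' | strip_k
```

The loop terminates because each pass removes one k, and it handles several occurrences per line, much like the g modifier does in sed.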

Related

Extract version using grep/regex in bash

I have a file that has a line stating
version = "12.0.08-SNAPSHOT"
The word version and quoted strings can occur on multiple lines in that file.
I am looking for a single line bash statement that can output the following string:
12.0.08-SNAPSHOT
The version can have RELEASE tag too instead of SNAPSHOT.
So to summarize, given
version = "12.0.08-SNAPSHOT"
expected output: 12.0.08-SNAPSHOT
And given
version = "12.0.08-RELEASE"
expected output: 12.0.08-RELEASE
The following command prints strings enquoted in version = "...":
grep -Po '\bversion\s*=\s*"\K.*?(?=")' yourFile
-P enables perl regexes, which allow us to use features like \K and so on.
-o only prints matched parts instead of the whole lines.
\b ensures that version starts at a word boundary and we do not match things like abcversion.
\s stands for any kind of whitespace.
\K lets grep forget that it matched the part before \K. The forgotten part will not be printed.
.*? matches as few characters as possible (the matching part will be printed) ...
(?=") ... until we see a ", which won't be included in the match either (this is called a lookahead).
Not all grep implementations support the -P option. Alternatively, you can use perl, as described in this answer:
perl -nle 'print $& if m{\bversion\s*=\s*"\K.*?(?=")}' yourFile
Seems like a job for cut:
$ echo 'version = "12.0.08-SNAPSHOT"' | cut -d'"' -f2
12.0.08-SNAPSHOT
$ echo 'version = "12.0.08-RELEASE"' | cut -d'"' -f2
12.0.08-RELEASE
Portable solution:
$ echo 'version = "12.0.08-RELEASE"' |sed -E 's/.*"(.*)"/\1/g'
12.0.08-RELEASE
or even:
$ perl -pe 's/.*"(.*)"/$1/g'
$ awk -F"\"" '{print $2}'
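Since the question says that quoted strings can occur on other lines too, the sed variant can be restricted to version lines. A minimal sketch, assuming each version line starts with the word version (optionally indented); the function name extract_version is hypothetical:

```shell
# extract_version (hypothetical name): print the quoted value of every
# `version = "..."` line. -n together with the p flag suppresses all lines
# on which the substitution did not match.
extract_version() {
  sed -n 's/^[[:space:]]*version[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$@"
}

printf '%s\n' 'name = "demo"' 'version = "12.0.08-SNAPSHOT"' | extract_version
```

Unlike the \b in the grep -P answer, this anchors version at the start of the line, so a key such as other.version would be skipped.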

Ignore all letters except for capitals

I have an output like Johny-Smith, Juarez-Hugo, etc. and I need instead S, H, etc. Basically, I need the last uppercase letter in a string and that's it. If this is possible in any built in Linux tools (ex awk, sed, grep, etc.) it would be greatly appreciated.
Is this what you need?
echo "Johny-Smith" | sed 's/^.*\([A-Z]\)[^A-Z]*$/\1/g'
Test:
$ echo "Johny-Smith-Hello Johny-Smith" | sed 's/.*\([A-Z]\)[^A-Z]*/\1/g'
S
With GNU grep and if PCRE option is available
$ echo 'Johny-Smith' | grep -oP '.*\K[A-Z]'
S
$ echo 'Juarez-Hugo' | grep -oP '.*\K[A-Z]'
H
-o prints only matched portion
-P Perl regular expression
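.*\K greedily consumes everything up to the last uppercase letter; \K then drops that part from the output (it behaves like a variable-length lookbehind)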
[A-Z] any uppercase character
with perl, see perldoc for command line options explanation
$ # prints the string within captured group
$ echo 'Johny-Smith' | perl -lne 'print /.*([A-Z])/'
S
$ echo 'Juarez-Hugo' | perl -lne 'print /.*([A-Z])/'
H
In Bash:
$ var="Johny-Smith-Hello Johny-Smith"; var="${var//[^[:upper:]]/}";echo "${var: -1}"
S
${var//[^[:upper:]]/} remove all non-upper case letter chars
echo ${var: -1} output the last one
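For several names at once, the sed approach can be wrapped so that lines without any uppercase letter print nothing instead of being echoed back unchanged. A minimal sketch; the function name last_upper is hypothetical:

```shell
# last_upper (hypothetical name): for each input line, print only its last
# uppercase letter. With -n and the p flag, lines that contain no uppercase
# letter produce no output at all.
last_upper() {
  sed -n 's/.*\([[:upper:]]\)[^[:upper:]]*$/\1/p'
}

printf '%s\n' 'Johny-Smith' 'Juarez-Hugo' 'no capitals here' | last_upper
```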

Extract few matching strings from matching lines in file using sed

I have a file with strings similar to this:
abcd u'current_count': u'2', u'total_count': u'3', u'order_id': u'90'
I have to find current_count and total_count for each line of the file. I am trying the command below, but it's not working. Please help.
grep current_count file | sed "s/.*\('current_count': u'\d+'\).*/\1/"
It is outputting the whole line but I want something like this:
'current_count': u'2', 'total_count': u'3'
It's printing the whole line because the pattern in the s command doesn't match, so no substitution happens.
sed regexes don't support \d for digits, or x+ for xx*. GNU sed has a -r option to enable extended-regex support so + will be a meta-character, but \d still doesn't work. GNU sed also allows \+ as a meta-character in basic regex mode, but that's not POSIX standard.
So anyway, this will work:
echo -e "foo\nabcd u'current_count': u'2', u'total_count': u'3', u'order_id': u'90'" |
sed -nr "s/.*('current_count': u'[0-9]+').*/\1/p"
# output: 'current_count': u'2'
Notice that I skip the grep by using sed -n s///p. I could also have used /current_count/ as an address:
sed -r -e '/current_count/!d' -e "s/.*('current_count': u'[0-9]+').*/\1/"
Or with just grep printing only the matching part of the pattern, instead of the whole line:
grep -E -o "'current_count': u'[[:digit:]]+'"
(or egrep instead of grep -E). I forget if grep -o is POSIX-required behaviour.
For me this looks like some sort of serialized Python data. Basically I would try to find out the origin of that data and parse it properly.
However, while hackish, sed can also be used here:
sed "s/.*current_count': [a-z]'\([0-9]\+\).*/\1/" input.txt
sed "s/.*total_count': [a-z]'\([0-9]\+\).*/\1/" input.txt
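The question asked for both fields on one output line. A minimal sketch that captures them in a single substitution, assuming current_count always appears before total_count on the line and that sed accepts -E (both GNU and BSD sed do); the function name extract_counts is hypothetical:

```shell
# extract_counts (hypothetical name): print the current_count and total_count
# pairs of each matching line, separated by ", ".
# Assumes current_count precedes total_count on the line.
extract_counts() {
  sed -n -E "s/.*('current_count': u'[0-9]+').*('total_count': u'[0-9]+').*/\1, \2/p" "$@"
}

echo "abcd u'current_count': u'2', u'total_count': u'3', u'order_id': u'90'" | extract_counts
```

Lines that lack either field are not printed, so the separate grep is not needed.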

sed one-liner to convert all uppercase to lowercase?

I have a textfile in which some words are printed in ALL CAPS. I want to be able to just convert everything in the textfile to lowercase, using sed. That means that the first sentence would then read, 'i have a textfile in which some words are printed in all caps.'
With tr:
# Converts upper to lower case
$ tr '[:upper:]' '[:lower:]' < input.txt > output.txt
# Converts lower to upper case
$ tr '[:lower:]' '[:upper:]' < input.txt > output.txt
Or, sed on GNU (but not BSD or Mac as they don't support \L or \U):
# Converts upper to lower case
$ sed -e 's/\(.*\)/\L\1/' input.txt > output.txt
# Converts lower to upper case
$ sed -e 's/\(.*\)/\U\1/' input.txt > output.txt
If you have GNU extensions, you can use sed's \L (lowercase the entire match, or until \U [upper] or \E [end, turn case conversion off] is reached), like so:
sed 's/.*/\L&/' <input >output
Note: '&' means the full match pattern.
As a side note, GNU extensions include \U (upper), \u (upper next character of match), \l (lower next character of match). For example, if you wanted to capitalize every word of a sentence:
$ sed -E 's/\w+/\u&/g' <<< "Now is the time for all good men..." # Capitalize Each Word
Now Is The Time For All Good Men...
Note: Since the assumption is we have GNU extensions, we can use sequences such as \w (match a word character) and the -E (extended regex) option, which relieves you of having to escape the one-or-more quantifier (+) and certain other special regex characters.
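To illustrate how \E and the one-character forms combine with \U and \L, here is a small sketch. It assumes GNU sed (none of these escapes work in BSD sed), and the function names are hypothetical:

```shell
# Requires GNU sed: \U, \L, \u, \E and \w are GNU extensions.

# upper_first_word (hypothetical name): \U switches uppercasing on for \1,
# \E switches it off again, so the second word is left untouched.
upper_first_word() {
  sed -E 's/(\w+) (\w+)/\U\1\E \2/'
}

# capitalize_words (hypothetical name): \u uppercases only the next character
# (the first letter), \L lowercases the rest of the word.
capitalize_words() {
  sed -E 's/(\w)(\w*)/\u\1\L\2/g'
}

echo 'hello world' | upper_first_word    # HELLO world
echo 'hELLO wORLD' | capitalize_words    # Hello World
```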
You also can do this very easily with awk, if you're willing to consider a different tool:
echo "UPPER" | awk '{print tolower($0)}'
Here are many solutions:
To uppercase with perl, tr, sed and awk:
perl -ne 'print uc'
perl -npe '$_=uc'
perl -npe 'tr/[a-z]/[A-Z]/'
perl -npe 'tr/a-z/A-Z/'
tr '[a-z]' '[A-Z]'
sed y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
sed 's/\([a-z]\)/\U\1/g'
sed 's/.*/\U&/'
awk '{print toupper($0)}'
To lowercase with perl, tr, sed and awk
perl -ne 'print lc'
perl -npe '$_=lc'
perl -npe 'tr/[A-Z]/[a-z]/'
perl -npe 'tr/A-Z/a-z/'
tr '[A-Z]' '[a-z]'
sed y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
sed 's/\([A-Z]\)/\L\1/g'
sed 's/.*/\L&/'
awk '{print tolower($0)}'
Complicated bash to lowercase :
while read v;do v=${v//A/a};v=${v//B/b};v=${v//C/c};v=${v//D/d};v=${v//E/e};v=${v//F/f};v=${v//G/g};v=${v//H/h};v=${v//I/i};v=${v//J/j};v=${v//K/k};v=${v//L/l};v=${v//M/m};v=${v//N/n};v=${v//O/o};v=${v//P/p};v=${v//Q/q};v=${v//R/r};v=${v//S/s};v=${v//T/t};v=${v//U/u};v=${v//V/v};v=${v//W/w};v=${v//X/x};v=${v//Y/y};v=${v//Z/z};echo "$v";done
Complicated bash to uppercase :
while read v;do v=${v//a/A};v=${v//b/B};v=${v//c/C};v=${v//d/D};v=${v//e/E};v=${v//f/F};v=${v//g/G};v=${v//h/H};v=${v//i/I};v=${v//j/J};v=${v//k/K};v=${v//l/L};v=${v//m/M};v=${v//n/N};v=${v//o/O};v=${v//p/P};v=${v//q/Q};v=${v//r/R};v=${v//s/S};v=${v//t/T};v=${v//u/U};v=${v//v/V};v=${v//w/W};v=${v//x/X};v=${v//y/Y};v=${v//z/Z};echo "$v";done
Simple bash to lowercase :
while read v;do echo "${v,,}"; done
Simple bash to uppercase :
while read v;do echo "${v^^}"; done
Note that ${v,} and ${v^} only change the first letter.
Use it this way, with the loop reading from a file and writing to another:
(while read v;do echo "${v,,}"; done) < input_file.txt > output_file.txt
I like some of the answers here, but there is a sed command that should do the trick on any platform:
sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/'
Anyway, it's easy to understand. And knowing about the y command can come in handy sometimes.
If you have GNU sed (likely on Linux, but not on *BSD or macOS):
echo "Hello MY name is SUJIT " | sed 's/./\L&/g'
Output:
hello my name is sujit
If you are using POSIX sed:
To match a pattern in any case, first convert the search pattern with this sed (each letter becomes a bracket expression holding both cases), then use the converted pattern as the regex in the command you want:
MyNewPattern=$(echo "${MyOrgPattern}" | sed "s/[aA]/[aA]/g;s/[bB]/[bB]/g;s/[cC]/[cC]/g;s/[dD]/[dD]/g;s/[eE]/[eE]/g;s/[fF]/[fF]/g;s/[gG]/[gG]/g;s/[hH]/[hH]/g;s/[iI]/[iI]/g;s/[jJ]/[jJ]/g;s/[kK]/[kK]/g;s/[lL]/[lL]/g;s/[mM]/[mM]/g;s/[nN]/[nN]/g;s/[oO]/[oO]/g;s/[pP]/[pP]/g;s/[qQ]/[qQ]/g;s/[rR]/[rR]/g;s/[sS]/[sS]/g;s/[tT]/[tT]/g;s/[uU]/[uU]/g;s/[vV]/[vV]/g;s/[wW]/[wW]/g;s/[xX]/[xX]/g;s/[yY]/[yY]/g;s/[zZ]/[zZ]/g")
YourInputStreamCommand | egrep "${MyNewPattern}"
To convert to lowercase:
sed "s/[aA]/a/g;s/[bB]/b/g;s/[cC]/c/g;s/[dD]/d/g;s/[eE]/e/g;s/[fF]/f/g;s/[gG]/g/g;s/[hH]/h/g;s/[iI]/i/g;s/[jJ]/j/g;s/[kK]/k/g;s/[lL]/l/g;s/[mM]/m/g;s/[nN]/n/g;s/[oO]/o/g;s/[pP]/p/g;s/[qQ]/q/g;s/[rR]/r/g;s/[sS]/s/g;s/[tT]/t/g;s/[uU]/u/g;s/[vV]/v/g;s/[wW]/w/g;s/[xX]/x/g;s/[yY]/y/g;s/[zZ]/z/g"
The same works for uppercase: replace the lowercase letter in each replacement part by its uppercase equivalent.
Have fun
short, sweet and you don't even need redirection :-)
perl -p -i -e 'tr/A-Z/a-z/' file
Instead of typing this long expression:
sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/' input
One could use this:
sed 'y/'$(printf "%s" {A..Z} "/" {a..z} )'/' input

having a regex replacing across lines, retain the newlines?

I'd like to have a substitute or print style command with a regex working across lines. And lines retained.
$ echo -e 'a\nb\nc\nd\ne\nf\ng' | tr -d '\n' | grep -o 'b.*f'
bcdef
or
$ echo -e 'a\nb\nc\nd\ne\nf\ng' | tr -d '\n' | sed -r 's|b(.*)f|y\1z|'
aycdezg
I'd like to use grep or sed because I'd like to know what people would have done before awk or perl.
Would they not have done this? Was .* not available? Did they have no other equivalent?
The goal is to modify some input with a regex that spans lines and print the result to stdout or write it to a file, retaining the line breaks.
This should do what you're looking for:
$ echo -e 'a\nb\nc\nd\ne\nf\ng' | sed ':a;$s/b\([^f]*\)f/y\1z/;N;ba'
a
y
c
d
e
z
g
It accumulates all the lines, then does the replacement. It stops at the first "f". If you want it to match up to the last "f", change [^f] to . (a single dot).
Note that this may make use of features added to sed after AWK or Perl became available (AWK has been around a looong time).
Edit:
To do a multi-line grep requires only a little modification:
$ echo -e 'a\nb\nc\nd\ne\nf\ng' | sed ':a;$s/^[^b]*\(b[^f]*f\)[^f]*$/\1/;N;ba'
b
c
d
e
f
sed can match across newlines through the use of its N command. For example, the following sed command will replace bar followed by a newline followed by baz with ###:
$ echo -e "foo\nbar\nbaz\nqux" | sed 'N;s/bar\nbaz/###/;P;D'
foo
###
qux
The N command will append the next input line to the current pattern space separated by an embedded newline (\n)
The P command will print the current pattern space up to and including the first embedded newline.
The D command will delete up to and including the first embedded newline in the pattern space. It will also start next cycle but skip reading from the input if there is still data in the pattern space.
Through the use of these 3 commands, you can essentially do any sort of s command replacement looking across N-lines.
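One portability caveat: POSIX allows N to quit without printing the pattern space when there is no next input line, whereas GNU sed prints it first. A common way around this is to write $!N, so that N is simply skipped on the last line. A minimal sketch of the same replacement in that form; the function name replace_pair is hypothetical:

```shell
# replace_pair (hypothetical name): sliding two-line window.
#   $!N  append the next line, except on the last line
#   s    replace "bar<newline>baz" inside the window
#   P    print up to the first embedded newline
#   D    delete what P printed and restart the cycle without reading input
replace_pair() {
  sed '$!N;s/bar\nbaz/###/;P;D'
}

printf '%s\n' foo bar baz qux | replace_pair
```

This prints foo, ### and qux on three lines, the same result as the command above.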
Edit
If your question is how can I remove the need for tr in the two examples above and just use sed then here you go:
$ echo -e 'a\nb\nc\nd\ne\nf\ng' | sed ':a;N;$!ba;s/\n//g;y/ag/yz/'
ybcdefz
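The :a;N;$!ba loop is written with semicolons, which GNU sed accepts. Some other seds may read everything after the colon up to the end of the line as the label name, so a more cautious form passes each command as its own -e argument. A minimal sketch; the function name join_lines is hypothetical, and it joins with commas rather than deleting the newlines so the joins stay visible:

```shell
# join_lines (hypothetical name): collect all input lines in the pattern
# space, then replace every embedded newline with a comma.
#   :a    label
#   $!N   append the next line unless this is the last one
#   $!ba  loop back to :a until the last line has been read
join_lines() {
  sed -e ':a' -e '$!N' -e '$!ba' -e 's/\n/,/g'
}

printf '%s\n' a b c | join_lines
```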
Proven tools to the rescue.
echo -e "foo\nbar\nbaz\nqux" | perl -lpe 'BEGIN{$/=""}s/foo\nbar/###/'