unexpected result by cutting the last column with sed


echo '60 test' | sed -r 's/(.*)\s+[^\s]+$/\1/'
Result:
60 test
The last column is not cut. But the same command works as expected with:
echo '60 home' | sed -r 's/(.*)\s+[^\s]+$/\1/'
Result:
60
Why?

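[^\s]+ does not mean "one or more non-space characters". Inside a bracket expression \s is not a character class, so [^\s]+ means "one or more characters that are neither a backslash nor s". test contains an s while home does not, so the latter matches the regexp and the former doesn't.
Use either of these instead to match non-space characters: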
$ echo '60 test' | sed -r 's/(.*)\s+\S+$/\1/'
60
$ echo '60 test' | sed -r 's/(.*)\s+[^[:space:]]+$/\1/'
60
As @potong suggested in a comment, to remove the last column with sed all you really need is:
sed -E 's/\s+\S+$//'
I switched from -r to -E because -r is GNU sed only, while -E works in both GNU and OSX/BSD sed, so it's generally the better option. However, OSX/BSD sed won't recognize \s or \S, so switching from -r to -E doesn't make the script more portable in this case. You'd have to use this instead:
sed -E 's/[[:space:]]+[^[:space:]]+$//'
and then to be completely portable to all POSIX seds it'd be:
sed 's/[[:space:]]\{1,\}[^[:space:]]\{1,\}$//'
or this would behave the same if there are always 2 or more fields:
sed 's/[[:space:]]*[^[:space:]]*$//'
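To check both points on a line with more than two fields, here is a minimal sketch that uses only POSIX sed syntax. The function name drop_last_field is made up for illustration. The first command shows that a bracket expression treats \s as the two literal characters backslash and s.

```shell
# Inside [...] the sequence \s is just a backslash and an "s", so this
# deletes every character except those two.
printf '%s\n' 'test\home' | sed 's/[^\s]//g'

# drop_last_field is a hypothetical helper name, not part of sed.
# The trailing $ anchors the match to the end of the line, so only the
# last whitespace-separated field is removed.
drop_last_field() {
  printf '%s\n' "$1" | sed 's/[[:space:]]\{1,\}[^[:space:]]\{1,\}$//'
}

drop_last_field '60 test'
drop_last_field '60 home test'
```

Without the $ anchor the same substitution would remove the first whitespace-plus-word pair it finds, which only looks right when the line has exactly two fields.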

If you just want to print the part of the string before the space, without any other modification, you can simply use cut:
echo '60 test' | cut -d' ' -f1
60
where you define your delimiter (-d) and the field (-f) you want to select.
No need to go for a complex solution using sed and doing some replacement operations.
With awk you can also print the first field:
echo '60 test' | awk '{print $1}'
60
or use grep in Perl mode (-P), which does understand \s:
echo '60 test' | grep -oP '^.*?(?=\s)'
60
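If the string is already in a shell variable, a parameter expansion can drop the last word without starting another process. This is only a sketch: drop_last_word is a made-up name, and it assumes the fields are separated by single spaces.

```shell
# drop_last_word is a hypothetical helper name.
# ${1% *} removes the shortest suffix matching " *", that is, the last
# space and everything after it. A string without a space is unchanged.
drop_last_word() {
  printf '%s\n' "${1% *}"
}

drop_last_word '60 test'
drop_last_word '60 home test'
```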

Related

Linux sed - Delete words do not start with a specific character

How to remove words that do not start with a specific character by sed?
Sample:
echo "--foo imhere -abc anotherone" | sed ...
Result must be:
"--foo -abc"

echo "--foo imhere -abc anotherone" |\
sed -e 's/^/ /g' -e 's/ [^-][^ ]*//g' -e 's/^ *//g'
The first and last -e commands are needed only if the first word may also have to be removed.

gnu sed with -r:
kent$ echo "--foo imhere -abc anotherone" | sed -r 's/^|\s[^-]\S*//g'
--foo -abc
However, I prefer awk for this; it is more straightforward:
awk '{for(i=1;i<=NF;i++)$i=($i~/^-/?$i:"")}7'
output:
--foo -abc
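A variant of the awk idea that builds the output string itself, so removed words do not leave extra separators behind. This is a sketch under the assumption that words are whitespace-separated; keep_dash_words is a made-up name.

```shell
# keep_dash_words is a hypothetical helper name.
# Walk over the fields and append only those that start with "-",
# joining the kept words with a single space.
keep_dash_words() {
  printf '%s\n' "$1" | awk '{
    out = ""
    for (i = 1; i <= NF; i++)
      if ($i ~ /^-/) out = (out == "" ? $i : out " " $i)
    print out
  }'
}

keep_dash_words '--foo imhere -abc anotherone'
keep_dash_words 'imhere --foo'
```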

You can use ssed with -R to enable PCRE regexes, and then you can use this one:
(?<!-)\b\w+
echo "--foo imhere -abc anotherone" | ssed -R 's/(?<!-)\b\w+//g'

Bash shave a first and/or last character from string, but only if it is a certain character

In bash I need to shave the first and/or last character from a string, but only if it is a certain character.
If I have | I need
/foo/bar/hah/ => foo/bar/hah
foo/bar/hah => foo/bar/hah
You can downvote me for not listing everything I've tried, but the fact is I've tried at least 35 different sed strings and bash character manipulations, many of which were from Stack Overflow. I simply cannot get this to work.

what's the problem with the simple one?
sed "s/^\///;s/\/$//"
Output is
foo/bar/hah
foo/bar/hah

In pure bash:
$ var=/foo/bar/hah/
$ var=${var%/}
$ echo "${var#/}"
foo/bar/hah
See "Parameter Expansion" in the bash manual.
Or with sed:
$ sed -r 's#(^/|/$)##g' file

How about simply this:
echo "$x" | sed -e 's:^/::' -e 's:/$::'

Further to @sputnick's answer, here's a function that would do it:
STR="/foo/bar/etc/"
STRB="foo/bar/etc"
function trimslashes {
local s="$1"   # work on a local copy so the caller's STR is not overwritten
s=${s#"/"}     # drop one leading slash, if present
s=${s%"/"}     # drop one trailing slash, if present
echo "$s"
}
trimslashes "$STR"
trimslashes "$STRB"
# foo/bar/etc
# foo/bar/etc

echo '/foo/bar/hah/' | sed 's#^/##' | sed 's#/$##'

Assuming the / character is the only one you're trying to remove, sed -E 's_^[/](.*)_\1_' should do the job for a leading slash:
$ echo "$var1"; echo "$var2"
/foo/bar/hah
foo/bar/hah
$ echo "$var1" | sed -E 's_^[/](.*)_\1_'
foo/bar/hah
$ echo "$var2" | sed -E 's_^[/](.*)_\1_'
foo/bar/hah
If you also need to remove other characters at the start of the line, add them to the [/] class. For example, to remove a leading / or -, it would be sed -E 's_^[/-](.*)_\1_'
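The same character class can be applied to both ends of the string. A minimal sketch, with trim_edges as a made-up name and / and - as the assumed set of characters to strip:

```shell
# trim_edges is a hypothetical helper name.
# First expression: remove one leading / or -.
# Second expression: remove one trailing / or -.
# The - is placed last in the class so it is taken literally.
trim_edges() {
  printf '%s\n' "$1" | sed -e 's_^[/-]__' -e 's_[/-]$__'
}

trim_edges '/foo/bar/hah/'
trim_edges 'foo/bar/hah'
trim_edges '-foo/bar'
```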

Here is an awk version:
echo "/foo/bar/hah/" | awk '{gsub(/^\/|\/$/,"")}1'
foo/bar/hah

How to use sed to identify a string in brackets?

I want to find the string that is placed within the brackets. How do I use sed to pull it out?
# cat /sys/block/sdb/queue/scheduler
noop anticipatory deadline [cfq]
I'm not getting the exact result
# cat /sys/block/sdb/queue/scheduler | sed 's/\[*\]//'
noop anticipatory deadline [cfq
I'm expecting an output
cfq

It can be easier with grep, especially if the position of the bracketed text changes from line to line:
$ grep -Po '(?<=\[)[^]]*' file
cfq
This uses a look-behind: whenever a [ is found, it fetches all the characters that follow, up to (but not including) the next ].
See another example:
$ cat a
noop anticipatory deadline [cfq]
hello this [is something] we want to [enclose] yeah
$ grep -Po '(?<=\[)[^]]*' a
cfq
is something
enclose
You can also use awk for this, in case it is always in the same position:
$ awk -F'[][]' '{print $2}' file
cfq
This sets the field separators to [ and ], and then prints the second field.
And with sed:
$ sed 's/[^[]*\[\([^]]*\).*/\1/g' file
cfq
It is a bit messy, but basically it looks for the block of text in between [] and prints it back.
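One caveat with a plain s command is that lines without brackets are printed unchanged. A sketch that prints only the lines where the substitution succeeded, using sed -n and the p flag (bracket_text is a made-up name and reads from standard input):

```shell
# bracket_text is a hypothetical helper name.
# -n suppresses automatic printing; the p flag prints a line only when
# the substitution matched, so lines without [...] produce no output.
bracket_text() {
  sed -n 's/^[^[]*\[\([^]]*\)\].*/\1/p'
}

printf '%s\n' 'noop anticipatory deadline [cfq]' 'no brackets here' | bracket_text
```

Like the sed command above, this returns only the first bracketed string on each line.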

I found one possible solution-
cut -d "[" -f2 | cut -d "]" -f1
so the exact solution is
# cat /sys/block/sdb/queue/scheduler | cut -d "[" -f2 | cut -d "]" -f1

Another potential solution is awk:
s='noop anticipatory deadline [cfq]'
awk -F'[][]' '{print $2}' <<< "$s"
cfq

Another way, with GNU grep:
grep -Po "\[\K[^]]*" file
Or in pure bash ([[ =~ ]] and BASH_REMATCH are bash features):
while read -r line; do [[ "$line" =~ \[([^]]*)\] ]] && echo "${BASH_REMATCH[1]}"; done < file
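For shells without [[ =~ ]], the same extraction can be sketched with POSIX parameter expansion only. first_bracketed and the variable rest are made-up names; the function handles one string at a time and returns the first bracketed text.

```shell
# first_bracketed is a hypothetical helper name.
first_bracketed() {
  case $1 in
    *\[*\]*) ;;           # there is a [ with a ] somewhere after it
    *) return 1 ;;        # nothing to extract
  esac
  rest=${1#*\[}           # remove everything up to and including the first [
  printf '%s\n' "${rest%%\]*}"   # remove the first ] and everything after it
}

first_bracketed 'noop anticipatory deadline [cfq]'
```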

Another awk
echo 'noop anticipatory deadline [cfq]' | awk '{gsub(/.*\[|\].*/,x)}8'
cfq

perl -lne 'print $1 if(/\[([^\]]*)\]/)'

(GNU)Sed: how to replace any character from nth character to nth+10?

I need to replace characters from 10th to 20th in the string which looks like that:
123456789012345678901234567890
So far I've tried:
a)
Works for the 10th character ONLY:
echo "123456789012345678901234567890" | sed 's/./X/10'
b)
Doesn't work on the range:
echo "123456789012345678901234567890" | sed 's/./X/10,20'
echo "123456789012345678901234567890" | sed 's/./X/10\,20'
echo "123456789012345678901234567890" | sed 's/./X/\{10,20\}'
echo "123456789012345678901234567890" | sed 's/./X/\{10\,20\}'
Does not work and I get error
unknown option to `s'
So - the question is - how do I make this to work:
echo "123456789012345678901234567890" | sed 's/./X/10,20'

Try:
$ sed -r "s/^(.{9})(.{11})/\1XXXXXXXXXXX/" <<< 123456789012345678901234567890
123456789XXXXXXXXXXX1234567890

It is a complex problem for sed; I could only find this solution:
$ sed 's/^\(.\{9\}\)\(.\{11\}\)/\1XXXXXXXXXXX/' <<< 123456789012345678901234567890
123456789XXXXXXXXXXX1234567890
With awk it looks nicer:
$ awk 'BEGIN{FS=OFS=""} {for (i=10;i<=20;i++) $i="X"} {print}' <<< 123456789012345678901234567890
123456789XXXXXXXXXXX1234567890
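Splitting on an empty FS is not guaranteed by POSIX awk, so here is a sketch of the same idea using substr, with the range passed in as parameters. mask_range and its argument order are made up for illustration; positions are 1-based and inclusive.

```shell
# mask_range is a hypothetical helper name.
# Usage: mask_range STRING FIRST LAST
mask_range() {
  printf '%s\n' "$1" | awk -v a="$2" -v b="$3" '{
    mid = substr($0, a, b - a + 1)    # the characters to be masked
    gsub(/./, "X", mid)               # turn each of them into X
    print substr($0, 1, a - 1) mid substr($0, b + 1)
  }'
}

mask_range 123456789012345678901234567890 10 20
```

The 10th to 20th characters inclusive are 11 characters, so the result contains 11 X and keeps its original length of 30.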

You can do it with bash parameter expansion (substring extraction) like this:
#!/bin/bash
s="123456789012345678901234567890"
l=${s:0:9}    # Left part: characters 1-9
m=${s:9:11}   # Middle part: characters 10-20 (offsets are zero-based)
r=${s:20}     # Right part: character 21 onwards
# Diddle with the middle part to your heart's content, then re-assemble
m=$(sed 's/./X/g' <<< "$m")
echo "$l$m$r"
Or, you can do this:
transform the row of letters into a column so each is on its own line
apply your edits to LINES 10 through 20 (as opposed to characters 10 through 20)
transform column of letters back into a row (by deleting linefeeds)
as shown in the one-liner below:
$ echo "123456789012345678901234567890" | sed "s/\(.\)/\1\n/g" | sed "10,20s/./X/" | tr -d "\n"
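The \n in the replacement of the first sed is a GNU extension. The three steps above can also be sketched with fold, which is a swapped-in tool rather than part of the original one-liner; mask_by_lines is a made-up name, and the range 10,20 is hard-coded as in the example.

```shell
# mask_by_lines is a hypothetical helper name.
mask_by_lines() {
  printf '%s\n' "$1" |
    fold -w 1 |             # 1. one character per line
    sed '10,20s/./X/' |     # 2. edit LINES 10 through 20
    tr -d '\n'              # 3. join the lines back into one row
  echo                      # tr removed the final newline too; add it back
}

mask_by_lines 123456789012345678901234567890
```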

I know that it looks ugly, but:
echo "123456789012345678901234567890" | \
sed 's/^\(.\{9\}\).\{11\}\(.*\)/\1XXXXXXXXXXX\2/'

Without writing out multiple X in the sed command (GNU sed):
sed -r 's/^(.{9})(.{11})(.*)$/\1\n\2\n\3/' | sed '2s/./X/g' | tr -d '\n'

To replace the 10th to 20th characters, inclusive, try:
echo 123456789012345678901234567890 | sed 's/\(.\{9\}\).\{11\}/\1XXXXXXXXXXX/'
123456789XXXXXXXXXXX1234567890
With GNU sed, you can use the -r switch to remove most of the backslashes:
echo 123456789012345678901234567890 | sed -r 's/(.{9}).{11}/\1XXXXXXXXXXX/'
Or the naive approach also works here:
echo 123456789012345678901234567890 | sed 's/\(.........\).........../\1XXXXXXXXXXX/'

This might work for you (GNU sed):
sed ':a;/.\{9\}X\{11\}/!s/\(.\{9\}X*\)./\1X/;ta' file
or with a bit of syntactic sugar:
sed -r ':a;/.{9}X{11}/!s/(.{9}X*)./\1X/;ta' file

Regular expression to extract a percentage

I have strings like the following: blabla a13724bla-bla244 35%
Notice that there is always a space before the percentage. I would like to extract the percentage number (so, without the %) from these strings using the Linux shell.

Assuming you have GNU grep:
$ grep -oP '\d+(?=%)' <<< "blabla a13724bla-bla244 35%"
35

Using sed:
echo blabla a13724bla-bla244 35% | sed 's/.*[ \t][ \t]*\([0-9][0-9]*\)%.*/\1/'
If you expect to have multiple percentages in a line then:
echo blabla 20% a13724bla-bla244 35% | \
sed -e 's/[^%0-9 ]*//g;s/  */\n/g' | sed -n '/%/p'
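A sketch for the multiple-percentage case that also drops the % sign and avoids \n in the sed replacement. percentages is a made-up name, and it assumes each percentage is a separate space-delimited word such as 35%.

```shell
# percentages is a hypothetical helper name.
# Put every word on its own line, then print only the words that consist
# of digits followed by %, without the % itself.
percentages() {
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^\([0-9][0-9]*\)%$/\1/p'
}

percentages 'blabla 20% a13724bla-bla244 35%'
```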

You can try this
echo "blabla a13724bla-bla244 35%" | cut -d' ' -f3 | sed 's/\%//g'
NOTE: Assumption is the input is always in this format and percentage is 3rd token separated by space.

You may try this regular expression:
/\s(\d+%)/

Use this regular expression:
\s(\d{1,3})%
If you need it in shell, you can use sed or this perl one-liner:
echo "blah 35%" | perl -pe 's/.*\s(\d{1,3})%/$1/'
35

If the percentage is always in the same column, maybe you should try awk instead of a regular expression:
awk '{print $3}' file.txt | cut -d "%" -f 1
This code extracts the third column and then strips the % sign.

We Keep Coding
