Regex doesn't work with sed command as expected

I have a text file containing :
A 25 27 50
B 35 75
C 75 78
D 99 88 76
I want to delete the lines that do not have a fourth field (a third pair of digits).
Expected output :
A 25 27 50
D 99 88 76
I know that awk would be the best option for such a task, but I'm wondering what's wrong with my sed command, since I expected it to work:
sed -E '/^[ABCD] ([0-9][0-9]) \1$/d' text.txt
It uses ERE with a back-reference (\1) to refer to the previous pattern enclosed in parentheses.
I have tried this command instead :
sed -E '/^[ABCD] ([0-9][0-9]) [0-9][0-9]$/d' text.txt
But it seems to delete only the first occurrence of what I want.
I would appreciate further explanation of:
why the back-reference doesn't work as expected;
what the matter is with the first occurrence in the second attempt. Should I include the global option and, if so, how? I already tried adding it at the end alongside /d (for delete), but it didn't work.

Much much easier with awk:
awk 'NF == 4' file
A 25 27 50
D 99 88 76
This awk command uses the default field separator (spaces or tabs) and checks the condition NF == 4, so only lines with exactly 4 fields are printed.
With sed it would be (assuming no leading or trailing spaces on any line):
sed -nE '/^[^[:blank:]]+([[:blank:]]+[^[:blank:]]+){3}$/p' file
A 25 27 50
D 99 88 76

With your shown samples, you could try the following sed program. Written and tested in GNU sed.
sed -nE '/^([^[:space:]]+[[:space:]]+){3}[^[:space:]]+$/p' Input_file
Explanation: the -n option suppresses sed's automatic printing, and -E enables ERE. The regex matches, from the start of the line, one or more non-space characters followed by one or more spaces, repeated 3 times (i.e. the first 3 fields), followed by one or more non-space characters up to the end of the line. If the line matches, it is printed.

This might work for you (GNU sed):
sed -En 's/\S+/&/4p' file
Turn off implicit printing (-n) and turn on extended regexps (-E).
Substitute the 4th field with itself and print the line only if that substitution succeeds, i.e. only if a 4th field exists.
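On the back-reference question: \1 matches the exact text that the first group captured, not "another match of the same pattern". So /^[ABCD] ([0-9][0-9]) \1$/ only matches lines where the same two digits appear twice. A minimal sketch, assuming GNU sed (back-references with -E are an extension) and a hypothetical extra line E 42 42 added to the sample data to make the difference visible:

```shell
# Sample data from the question plus one hypothetical line, 'E 42 42'.
tmp=$(mktemp)
printf '%s\n' 'A 25 27 50' 'B 35 75' 'C 75 78' 'D 99 88 76' 'E 42 42' > "$tmp"

# \1 repeats the captured text, so only 'E 42 42' is deleted here.
backref=$(sed -E '/^[ABCDE] ([0-9][0-9]) \1$/d' "$tmp")

# Repeating the sub-pattern itself deletes every three-field line.
fixed=$(sed -E '/^[ABCDE]( [0-9][0-9]){2}$/d' "$tmp")

printf '%s\n' "$fixed"
rm -f "$tmp"
```

No global option is needed: /pattern/d is tested against every input line, and the g flag only exists for the s command.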

Related

How to get the last 2 characters of a string with last character being A or B and second to the last character being 1-360? (REGEX GREP)

I don't really use regex on a daily basis and I'm still new to this.
For example, I have these strings and this is the format of the strings:
APPLE20B50A
APPLE30A60B
APPLE12B5B
APPLE360A360B
APPLE56B
Basically, I want to get the last letter (A or B) together with the digits immediately before it. Some strings, like APPLE56B, have no digit+letter group in the middle.
Expected Output:
50A
60B
5B
360B
56B
I tried grep -o '.\{2\}$' but it only outputs the last 2 characters:
0A
0B
5B
0B
6B
Obviously, that doesn't adapt to the number of digits. Any help would be appreciated.
grep -o would indeed work with the correct pattern:
grep -oP '[0-9]+[AB]$'
With Perl,
perl -nle'print $& if /[0-9]+[AB]$/'
perl -nle'print for /([0-9]+[AB])$/'
In all cases, you can provide the input via STDIN or by passing a file name to read as an argument.
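To see what the $ anchor contributes, here is a small sketch that feeds the sample strings from the question through STDIN. It assumes a grep that supports -o and -E (GNU and BSD grep both do); the variable names are just for illustration.

```shell
# Sample strings from the question.
input=$(printf '%s\n' APPLE20B50A APPLE30A60B APPLE12B5B APPLE360A360B APPLE56B)

# Without the anchor, every digits+letter group is printed.
all_pairs=$(printf '%s\n' "$input" | grep -oE '[0-9]+[AB]')

# With the anchor, only the group at the end of each line is printed.
last_pair=$(printf '%s\n' "$input" | grep -oE '[0-9]+[AB]$')

printf '%s\n' "$last_pair"
```

Note that -P is not required for this pattern; plain ERE is enough.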
Try this:
cat input-file | perl -ne 'print "$1\n" if (m/([0-9]+[AB])$/)'
This might work for you (GNU grep):
grep -o '\(360\|3[0-5][0-9]\|[1-2][0-9][0-9]\|[1-9][0-9]\|[1-9]\)[AB]\>' file
This will print each value on a separate line from 1A/1B to 360A/360B.
To space separate these values use:
grep -o '\(360\|3[0-5][0-9]\|[1-2][0-9][0-9]\|[1-9][0-9]\|[1-9]\)[AB]\>' file |
paste -sd' '

How to grep the second number only in one line?

Given the contents of test.txt as follows:
Hello 10 love 20 haha 30
Hello Hello 11 love love 21 haha 31
41 Hello Hello 42 love love 43 haha 44
I want some kind of grep expression so that after saying:
$ cat test.txt | grep ???
I get this output:
20
21
42
How to implement this function?
Seems like you're trying to get the second number.
grep -oP '^\D*\d+\D*\K\d+' file
Or use sed:
sed 's/^[^[:digit:]]*[[:digit:]]\+[^[:digit:]]*\([[:digit:]]\+\).*/\1/' file
An alternative you might like to consider, using awk:
awk -F'[^[:digit:]]+' '{ print /^[[:digit:]]/ ? $2 : $3 }' file
This sets the field separator to one or more non-digit characters, which means that the field you're interested in is either the second or the third field, depending on whether the line starts with a digit or not.
For brevity you may prefer to use the range [0-9] instead of [[:digit:]]:
awk -F'[^0-9]+' '{ print /^[0-9]/ ? $2 : $3 }' file
Or you could use perl to capture the part of the line you're interested in:
perl -lne 'print $1 if /\d\D+(\d+)/' file
\d matches digits and \D matches non-digits, so this captures the second set of digits found on the line. If a second set of digits isn't found, nothing is printed (this differs from the behaviour of the awk script, which prints an empty line).
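The approaches differ on lines that contain no second number. A rough sketch of that edge case, using a made-up input line and a portable spelling of the sed pattern ([0-9][0-9]* instead of the GNU-only \+):

```shell
# Hypothetical input with a single number only.
line='only 7 here'

# awk: $3 is empty, so an empty line is printed.
awk_lines=$(printf '%s\n' "$line" | awk -F'[^0-9]+' '{ print /^[0-9]/ ? $2 : $3 }' | grep -c '')

# sed without -n: the substitution fails and the line is printed unchanged.
sed_out=$(printf '%s\n' "$line" | sed 's/^[^0-9]*[0-9][0-9]*[^0-9]*\([0-9][0-9]*\).*/\1/')

# sed -n with the p flag: nothing is printed when there is no match.
sed_n_out=$(printf '%s\n' "$line" | sed -n 's/^[^0-9]*[0-9][0-9]*[^0-9]*\([0-9][0-9]*\).*/\1/p')

printf 'awk lines: %s\nsed: %s\nsed -n: %s\n' "$awk_lines" "$sed_out" "$sed_n_out"
```

If such lines can occur in the input, the sed -n ... p form is probably the closest match to what grep -o would do.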

Is it possible to substitute a number using sed matching multiple regexp?

I'm trying to replace a number in a file using sed. This number can be found using \b<NUMBER>\b. However, there are comments in the file I'm parsing that sometimes have the same number and I would like to leave them unchanged.
All the lines that need to be replaced are similar to:
some_text <1 4 35 314 359>
And the complete file could be something like:
# This is not to be replaced: 314
some_text <1 4 35 314 359>
So, if I wanted to replace 314, how could I do it with sed?
I can find it with the following grep:
grep -P "^[^#].*some_text <[ 0-9]*>" "<FILE>" | grep -e "\b314\b"
But I can't seem to figure out a way to do it with sed. The old line I had would replace all the entries for that number:
sed -i "s/\b *314\b//" <FILE>
Any clarifications or help would be most welcome!
Thank you for your help!
/G
You can use sed like this:
sed '/some_text/s/\b314\b/789/' file
# This is not to be replaced: 314
some_text <1 4 35 789 359>
You could use awk instead, skipping any lines that are comments:
awk '!/^#/{sub(/\y314\y/,789)}1' file
As you've used word boundaries in your example, I'm assuming that you have GNU awk installed and I've used \y, which is a word boundary.
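The same "skip the comments" idea can also be written in sed by negating an address. This is only a sketch: it assumes GNU sed (for \b), and the third sample line is invented to show that 3141 is left alone.

```shell
# Sample file: a comment, the line from the question, and a hypothetical
# line containing 3141, which must not be touched.
tmp=$(mktemp)
printf '%s\n' '# This is not to be replaced: 314' \
              'some_text <1 4 35 314 359>' \
              'some_text <3141 314>' > "$tmp"

# /^#/! applies the substitution only to lines that do NOT start with '#'.
result=$(sed '/^#/!s/\b314\b/789/g' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Adding -i would edit the file in place, as in the original attempt.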

How to replace space with comma using sed?

I would like to replace the empty space between each and every field with a comma delimiter. Could someone let me know how I can do this? I tried the command below but it doesn't work. Thanks.
My command:
:%s//,/
53 51097 310780 1
56 260 1925 1
68 51282 278770 1
77 46903 281485 1
82 475 2600 1
84 433 3395 1
96 212 1545 1
163 373819 1006375 1
204 36917 117195 1
If you are talking about sed, this works:
sed -e "s/ /,/g" < a.txt
In vim, use same regex to replace:
s/ /,/g
Inside vim, you want to type when in normal (command) mode:
:%s/ /,/g
On the terminal prompt, you can use sed to perform this on a file:
sed -i 's/\ /,/g' input_file
Note: the -i option to sed means "in-place edit", as in that it will modify the input file.
I know it's not exactly what you're asking, but, for replacing a comma with a newline, this works great:
tr , '\n' < file
Try the following command and it should work out for you.
sed "s/\s/,/g" orignalFive.csv > editedFinal.csv
If your data includes arbitrary sequences of blank characters (tab, space), and you want to replace each sequence with one comma, use the following (the + quantifier needs extended regex, hence -r):
sed -r 's/[\t ]+/,/g' input_file
or
sed -r 's/[[:blank:]]+/,/g' input_file
If you want to replace sequences of whitespace characters, which also include characters such as carriage return and form feed, then use the following:
sed -r 's/[[:space:]]+/,/g' input_file
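The difference between replacing each space and replacing each run of blanks only shows up when fields are separated by more than one character. A small sketch with a made-up line (three spaces, one space, then a tab), assuming a sed that accepts -E:

```shell
# Hypothetical line: three spaces, a single space, then a tab between fields.
line=$(printf '53   51097 310780\t1')

# One comma per space; the tab is left untouched.
each=$(printf '%s\n' "$line" | sed 's/ /,/g')

# One comma per run of spaces and/or tabs.
squeezed=$(printf '%s\n' "$line" | sed -E 's/[[:blank:]]+/,/g')

printf '%s\n%s\n' "$each" "$squeezed"
```

For well-formed CSV the second form is usually what is wanted, since the first produces empty fields wherever two spaces are adjacent.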
If you want the output on the terminal:
$ sed 's/ /,/g' filename.txt
But if you want to edit the file itself, i.e. replace the spaces with commas in the file:
$ sed -i 's/ /,/g' filename.txt
I just confirmed that:
cat file.txt | sed "s/\s/,/g"
successfully replaces spaces with commas in Cygwin terminals (mintty 2.9.0). None of the other samples worked for me.
On Linux, use the command below to test (it replaces each whitespace character with a comma):
sed 's/\s/,/g' /tmp/test.txt | head
Later you can write the output to a file using the command below:
sed 's/\s/,/g' /tmp/test.txt > /tmp/test_final.txt
PS: /tmp/test.txt is the file you want to convert.

Changing values with grep or sed

Can I increase some numbers in txt files with grep/sed?
I want to find all numbers in a file and increase them by 5. Is that possible with grep and sed, or do I need to write an app for that?
EDIT:
The file has n lines, each beginning with number - number and then some text, like a title for a movie.
example line:
34 - 36 : Some text.
You can use perl as:
perl -i -pe 's/(\d+)/$1+5/eg' filename
Probably awk. Change the record separator to whitespace (assuming this is what you want to do); then, if a record matches the regex ^[0-9]+$, convert it to a number, add 5 and print it; otherwise print it unchanged.
This is a pretty complete solution but "left as exercise" to code up.
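One way the exercise could be coded up, as a sketch only: instead of changing the record separator, this variant loops over the whitespace-separated fields of each line, which keeps the line structure intact. The function name add_five is made up for the example.

```shell
# Add 5 to every field that consists solely of digits.
add_five() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9]+$/) $i = $i + 5   # numeric fields only
    print
  }'
}

printf '%s\n' '34 - 36 : Some text.' | add_five
```

Assigning to a field makes awk rebuild the line with single spaces between fields, so runs of whitespace in the input are collapsed. Numbers glued to other characters (for example v2) are left as they are.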
I believe you should use awk (see "Changing the Contents of a Field" in the awk manual):
$ cat 1.txt
34 - 36 : Some text.
$ cat 1.txt | awk '{ $1=$1+5; $3=$3+5; print $0; }'
39 - 41 : Some text.
This might work for you (GNU sed & Bash):
sed 's/[0-9]\+/$((&+5))/g;s/.*/echo "&"/e' file