I have a file with the following format:
name 3 4
name -4 3
name -5 4
name 2 -4
I want to compute the subtraction $2-$3 and add an extra column at the beginning of each line with a -/+ sign based on the sign of that difference, to obtain the following format:
- name -1 3 4
- name -7 -4 3
- name -9 -5 4
+ name 6 2 -4
I used this command
awk '{print $1,$2-$3,$2,$3}' FILE |if ($2 < 0 ) then awk '{print "-",$0}' ; else awk '{print "+",$0}'; fi
Which gives:
- name -1 3 4
- name -7 -4 3
- name -9 -5 4
- name 6 2 -4
I tried to "play" with curly brackets, but it seems my condition stops after the first awk. What did I do wrong in my command?
Could you please try the following:
awk '{$1=($2>$3?"+":"-") OFS $1 OFS $2-$3} 1' Input_file
In case you want TAB-separated output, try the following:
awk 'BEGIN{OFS="\t"} {$1=($2>$3?"+":"-") OFS $1 OFS $2-$3} 1' Input_file
Explanation: a detailed, commented version of the code above.
awk ' ##Starting awk program from here.
{
$1=($2>$3?"+":"-") OFS $1 OFS $2-$3 ##Rebuild the 1st field: "+" if the 2nd field is greater than the 3rd, otherwise "-"; then the original $1; then the 2nd field minus the 3rd.
}
1 ##1 will print the lines.
' Input_file ##Mentioning Input_file name here.
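As for why the original pipeline prints - on every line: the if after the pipe is evaluated once by the shell, not once per input line, and $2 there is a shell parameter rather than an awk field, so a single branch handles the whole stream. A minimal sketch that keeps the decision inside one awk program is shown here. The function name add_sign is only illustrative, and unlike the one-liner above it prints + when the difference is exactly zero, which is an assumption about the intended behavior.

```shell
# Illustrative helper (the name is hypothetical): reads "name x y" lines on stdin.
add_sign() {
    awk '{
        diff = $2 - $3                     # the subtraction from the question
        sign = (diff < 0) ? "-" : "+"      # assumption: zero counts as "+"
        print sign, $1, diff, $2, $3
    }'
}

# Usage with the sample data from the question
printf 'name 3 4\nname -4 3\nname -5 4\nname 2 -4\n' | add_sign
```

Because the test runs per record inside awk, each line gets its own sign.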
Say I have this file data.txt:
a=0,b=3,c=5
a=2,b=0,c=4
a=3,b=6,c=7
I want to use grep to extract 2 columns corresponding to the values of a and c:
0 5
2 4
3 7
I know how to extract each column separately:
grep -oP 'a=\K([0-9]+)' data.txt
0
2
3
And:
grep -oP 'c=\K([0-9]+)' data.txt
5
4
7
But I can't figure out how to extract the two groups. I tried the following, which didn't work:
grep -oP 'a=\K([0-9]+),.+c=\K([0-9]+)' data.txt
5
4
7
I am also curious whether grep can do this. \K discards everything matched so far, so if you use it twice in the same expression, only the text after the last \K is shown. Hence, it has to be done differently.
In the meantime, I would use sed:
sed -r 's/^a=([0-9]+).*c=([0-9]+)$/\1 \2/' file
It captures the digits after a= and c=, on lines that start with a= and contain nothing after the c= digits.
For your input, it returns:
0 5
2 4
3 7
You could try the grep command below. But note that grep prints each match on a separate line, so you won't get the format you mentioned in the question.
$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file
0
5
2
4
3
7
To get the format you mentioned, you need to pipe the output of grep to paste or a similar command.
$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file | paste -d' ' - -
0 5
2 4
3 7
Use this:
awk -F'[=,]' '{print $2" "$6}' data.txt
I am using = and , as the separators and splitting each line on them. The bracket expression is quoted so that the shell does not treat it as a glob pattern.
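To see why the values of a and c end up in $2 and $6, a small sketch that prints every field with its index may help. The function name show_fields is made up for this illustration.

```shell
# Illustrative helper (hypothetical name): split on "=" or "," and
# print each field together with its position.
show_fields() {
    awk -F'[=,]' '{ for (i = 1; i <= NF; i++) print i ": " $i }'
}

printf 'a=0,b=3,c=5\n' | show_fields
```

The keys land in the odd-numbered fields and the values in the even-numbered ones, so this approach depends on the keys always appearing in the same order.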
I have the following script to remove all lines before a line which matches with a word:
str='
1
2
3
banana
4
5
6
banana
8
9
10
'
echo "$str" | awk -v pattern=banana '
print_it {print}
$0 ~ pattern {print_it = 1}
'
It returns:
4
5
6
banana
8
9
10
But I want to include the first match too. This is the desired output:
banana
4
5
6
banana
8
9
10
How could I do this? Do you have any better idea with another command?
I've also tried sed '0,/^banana$/d', but seems it only works with files, and I want to use it with a variable.
And how could I get all lines before a match using awk?
I mean, with banana as the regex, this would be the output:
1
2
3
This awk should do:
echo "$str" | awk '/banana/ {f=1} f'
banana
4
5
6
banana
8
9
10
sed -n '/^banana$/,$p'
This should do what you want. -n tells sed to print nothing by default, and the p command prints every addressed line. It works on a stream. It differs from the awk solution in that the whole line must match 'banana' exactly, whereas your awk solution only requires 'banana' to appear somewhere in the line; I'm copying your sed example here. I'm not sure what you mean by "use it with a variable". If you mean that the string 'banana' should come from a variable, you can use sed -n "/$variable/,\$p" (note the double quotes and the escaped $), or sed -n "/^$variable\$/,\$p", or sed -n "/^$variable"'$/,$p'. You can also run echo "$str" | sed -n '/banana/,$p', just as you do with awk.
Just invert the commands in the awk:
echo "$str" | awk -v pattern=banana '
$0 ~ pattern {print_it = 1}   # if the line matches, activate the flag
print_it {print}              # if the flag is active, print the line
'
The print_it flag is activated when the pattern is found. From that moment on (including that line), lines are printed while the flag is on. Previously, the print was done before the check.
cat in.txt | awk "/banana/,0"
If you don't want to keep the matched line, you can use the following (the 0,/regex/ address form is a GNU sed extension):
cat in.txt | sed "0,/banana/d"
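For the follow-up in the question (printing only the lines before the first match), one possible sketch is to exit as soon as the pattern is seen. The function name before_match is hypothetical, and its argument is treated as an awk regex.

```shell
# Illustrative helper (hypothetical name): print lines until the first
# line matching the regex given as $1, then stop reading.
before_match() {
    awk -v pattern="$1" '$0 ~ pattern { exit } 1'
}

printf '1\n2\n3\nbanana\n4\nbanana\n5\n' | before_match banana
```

The trailing 1 prints every line that is reached; because exit runs first on the matching line, that line and everything after it are skipped.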
I have a text file, and in some lines the first gap from the left is 2 spaces long; I want it to be 1 space long. What's the script for this in bash?
123  2 5//problem
1 2 5
1 2 5
1  32 5//problem
What I want:
123 2 5
1 2 5
1 2 5
1 32 5
tr way:
cat test.txt | tr -s ' '
Using sed:
sed 's/^\([^ ][^ ]*[ ]\)[ ]*/\1/' input
Starting from the left
^
match and capture non-space characters and a space
\([^ ][^ ]*[ ]\)
and any number of additional spaces:
[ ]* # remove the star if you only care about exactly 2 spaces
and replace these with the captured part:
\1
Edit: I realized that David's answer was almost right.
You can use sed.
cat x | sed -e 's/ \+/ /'
This replaces the first occurrence of one or more spaces with a single space.
But you can do it purely in bash as well:
cat x | while read a b ; do echo "$a" "$b" ; done
This splits each line at the first word, and echos back the first word and the rest of the line. The result is that there is only one space between the first word and the rest of the line.
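The tr and sed answers above do not behave the same when a line has more than one run of spaces: tr -s squeezes every run, while a sed substitution without the g flag only touches the first one. A short sketch of the difference follows. The function names are made up, and the sed pattern is written in portable form (two spaces followed by a star) instead of the GNU \+ operator.

```shell
# Illustrative helpers (hypothetical names).
squeeze_first() { sed 's/  */ /'; }   # first run of one or more spaces -> one space
squeeze_all()   { tr -s ' '; }        # every run of spaces -> one space

printf '1  32  5\n' | squeeze_first
printf '1  32  5\n' | squeeze_all
```

If only the first gap should be normalized, as the question asks, the sed form is the closer match.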
I'm trying to use awk to check the second column of a three column set of data and replace its value if it's not zero. I've found this regex to find the non-zero numbers, but I can't figure out how to combine gsub with print to replace the contents and output it to a new file. I only want to run the gsub on the second column, not the first or third. Is there a simple awk one-liner to do this? Or am I looking at doing something more complex? I've even tried doing an expression to check for zero, but I'm not sure how to do an if/else statement in awk.
The command that I had semi-success with was:
awk '$2 != 0 {print $1, 1, $3}' input > output
The problem is that it didn't print out the row if the second column was zero. This is where I thought either gsub or an if/else statement would work, but I can't figure out the awk syntax. Any guidance on this would be appreciated.
Remember that in awk, a number is true when it is not 0, and a string is true when it is not empty; a field that looks like a number, such as $2 here, is tested as a number. So:
awk '$2 { $2 = 1; print }' input > output
The $2 evaluates to true if it's not 0. The rest is obvious. This replicates your script.
If you want to print all lines, including the ones with a zero in $2, I'd go with this:
awk '$2 { $2 = 1 } 1' input > output
This does the same replacement as above, but the 1 at the end is short-hand for "true". And without a statement, the default statement of {print} is run.
Is this what you're looking for?
In action, it looks like this:
[ghoti@pc ~]$ printf 'none 0 nada\none 1 uno\ntwo 2 tvo\n'
none 0 nada
one 1 uno
two 2 tvo
[ghoti@pc ~]$ printf 'none 0 nada\none 1 uno\ntwo 2 tvo\n' | awk '$2 { $2 = 1 } 1'
none 0 nada
one 1 uno
two 1 tvo
[ghoti@pc ~]$
Is this what you want?
awk '$2 != 0 {print $1, 1, $3} $2 == 0 {print}' input > output
or with sed:
sed 's/\([^ ]*\) [0-9]*[1-9][0-9]* /\1 1 /' input > output
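Since the question also asks what an if/else statement looks like in awk, here is a minimal sketch of the same logic written that way. The function name flag_nonzero is hypothetical, and three whitespace-separated columns are assumed, as in the question.

```shell
# Illustrative helper (hypothetical name): set column 2 to 1 when it is
# non-zero, and pass the line through unchanged otherwise.
flag_nonzero() {
    awk '{
        if ($2 != 0) {
            print $1, 1, $3     # non-zero second column: replace it with 1
        } else {
            print               # zero: print the line as it is
        }
    }'
}

printf 'none 0 nada\none 1 uno\ntwo 2 tvo\n' | flag_nonzero
```

This is longer than the pattern-action one-liners above but may be easier to extend when each branch needs several statements.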
I have a file with this format:
It has two columns of numbers at the beginning, two columns of numbers at the end, and one column in the middle holding the name. The name itself can contain spaces, which messes things up.
Is there a regex that extracts the name column correctly? Or is there any way to use sed to replace (or remove) the spaces in that column so that I can extract it easily?
Example:
1 2 name 3 4
12 12 name1 name2 3 4
12 12 name1 name2 name3 name4 3 4
3 4 name 3 4
--
The output that I want to have is:
name
name1_name2
name1_name2_name3_name4
name
Thanks,
Amir,
One solution using awk is:
awk '{ for (i = 3; i <= NF-3; i++) printf "%s_", $i; printf "%s\n", $i }' foo
Here is the same thing using sed:
cat foo | sed -e 's/^[0-9 ]*//g' -e 's/ [0-9 ]*$//g' -e 's/ /_/g'
POSIX compliant for clarity:
cat foo | sed -e 's/^[[:digit:][:space:]]*//g' -e 's/[[:space:]]*[[:digit:][:space:]]*$//g' -e 's/ /_/g'
sed 's/^[0-9]\+ [0-9]\+ \(.*\) [0-9]\+ [0-9]\+$/\1/;s/ /_/g'
Another awk way, without looping:
awk 'BEGIN{OFS="_"}{$1=$2=$NF=$(NF-1)="";gsub(/__/,"")}1' yourFile
test:
kent$ cat t
1 2 name 3 4
12 12 name1 name2 3 4
12 12 name1 name2 name3 name4 3 4
3 4 name 3 4
kent$ awk 'BEGIN{OFS="_"}{$1=$2=$NF=$(NF-1)="";gsub(/__/,"")}1' t
name
name1_name2
name1_name2_name3_name4
name
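To make the non-looping version easier to follow, this sketch stops before the gsub step and shows the intermediate record. The function name blank_edges is made up for the illustration.

```shell
# Illustrative helper (hypothetical name): blank the first two and last two
# fields; the assignment makes awk rebuild the record with OFS="_".
blank_edges() {
    awk 'BEGIN { OFS = "_" } { $1 = $2 = $NF = $(NF-1) = "" } 1'
}

printf '12 12 name1 name2 3 4\n' | blank_edges
```

The emptied fields leave a double underscore at each end of the record, which is what gsub(/__/,"") then removes. A name that itself contains a double underscore would be altered too, so this relies on the names not containing one.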
A couple of Perl options:
perl -lne '/\d+ \d+ (.+) \d+ \d+/ and do {($_ = $1) =~ s/ /_/g; print}'
perl -lape 'for (1..2) {shift @F; pop @F}; $_ = join "_", @F'