I am trying to replace a word in awk using an if else statement.
Below is what I have already attempted. Basically, if the third word is "Normal", change it to 0, else change it to 1.
awk -F "," '{(if $3=='Normal') {$3=0}; else {$3=1} }' filename
You need a proper if-else statement, followed by something that triggers printing of the line:
awk '{if ($3=="Normal") {$3=0} else {$3=1}}1' file
Here, 1 acts as a true condition, which triggers awk's default action: printing the current line.
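In other words, a bare 1 pattern with no action block is equivalent to printing every record:
awk '1' file        # same as: awk '{ print }' file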
Alternatively, you can use a ternary operator:
awk '{$3=($3=="Normal") ? 0 : 1}1' file
Note also that you are dealing with comma-delimited fields, so you need to preserve the separator in the output as well. For that, set OFS (the output field separator). All together:
awk 'BEGIN{FS=OFS=","} {$3=($3=="Normal") ? 0 : 1}1' file
Given a sample file like this:
$ cat a
hello,how,are,you
i,am,Normal,buuu
This would return:
$ awk 'BEGIN{FS=OFS=","} {$3=($3=="Normal") ? 0 : 1}1' a
hello,how,1,you
i,am,0,buuu
What was wrong with your approach?
awk -F "," '{(if $3=='Normal') {$3=0}; else {$3=1} }' filename
You are using {(if condition) {action}; else {action}}. This has several problems:
The condition goes inside the parentheses after the if keyword: if (condition), not (if condition).
You write (if condition) {action} and then ;. That semicolon terminates the if statement, so the else no longer belongs to it. Drop it and write if (condition) {...} else {...} with no ; before else.
You are using single quotes in $3=='Normal'. These close the shell's quoting of the awk program, so the string is not interpreted as you intend; use double quotes instead: $3=="Normal".
The program never prints anything; you need a print statement (or a trailing 1 pattern) to output the line. A corrected version is shown below.
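Putting those fixes together, a corrected version of your original command (a minimal sketch keeping the if/else form, and also preserving the commas on output) would be:
awk -F',' -v OFS=',' '{if ($3=="Normal") {$3=0} else {$3=1}; print}' filename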
Related
I am having trouble getting this to work:
awk -F ";" '{for(i=1; i < NF;i++) $i ~ /^_.*/ {print $i}}'
I want to iterate over all the fields (records can have 7-9) and print only those that start with an _. However, the line above gives me a syntax error at the print statement, and if I omit the {print $i} I don't get any output.
What is the correct way to do this?
You're missing an if:
awk -F ";" '{for(i=1; i < NF;i++) if($i ~ /^_.*/) {print $i}}'
The structure of an awk program is condition { action } but what you have currently is all within an action block (the condition is true by default). Within the action block the if isn't implicit.
As an aside, the .* in the pattern is redundant; you may as well use /^_/ to match any string starting with _.
Note: since fields are 1-indexed, the right loop condition is likely i <= NF. If you are sure the last field is not needed, the condition i < NF will do the job.
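For example, on a hypothetical one-line input, the corrected loop behaves like this:
$ echo '_foo;bar;_baz' | awk -F ";" '{for(i=1; i<=NF; i++) if ($i ~ /^_/) print $i}'
_foo
_baz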
Pretty new to AWK programming. I have a file1 with entries as:
15>000000513609200>000000513609200>B>I>0011>>238/PLMN/000100>File Ef141109.txt>0100-75607-16156-14 09-11-2014
15>000000513609200>000000513609200>B>I>0011>Danske Politi>238/PLMN/000200>>0100-75607-16156-14 09-11-2014
15>000050354428060>000050354428060>B>I>0011>Danske Politi>238/PLMN/000200>>4100-75607-01302-14 31-10-2014
I want to write an awk script where, if the 2nd field subtracted from the 3rd field is 0, it prints field 2. Else, if the difference is > 0, it prints every number from the 2nd field up to the 3rd field, incrementing by 1. There will be no scenario where the 3rd field is less than the 2nd, so that condition can be ignored.
I was doing something like:
awk 'NR > 2 { print p } { p = $0 }' file1 | awk -F">" '{if ($($3 - $2) == 0) print $2; else l = $($3 - $2); for(i=0;i<l;i++) print $2++; }'
(( Someone told me awk is close to C in terms of syntax ))
But from the output it looks to me like the string-to-numeric or numeric-to-string conversions are not taking place at the right place at the right time. Shouldn't that be taken care of by awk automatically?
The OUTPUT that I get:
513609200
513609201
513609200
Which is not quite as expected. One evident issue is that it ignores the leading 0s.
Kindly help me modify the AWK script to get the desired result.
NOTE:
awk 'NR > 2 { print p } { p = $0 }' file1 is just to remove the 1st and last entry in my original file1. So the part that needs to be fixed is:
awk -F">" '{if ($($3 - $2) == 0) print $2; else l = $($3 - $2); for(i=0;i<l;i++) print $2++; }'
In awk, think of $ as an operator that retrieves the value of the numbered field ($0 being a special case):
$1 is the value of field 1
$NF is the value of the field given in the NF variable
So, $($3 - $2) will try to get the value of the field number given by the expression ($3 - $2).
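As a hypothetical illustration, $ can be applied to any numeric expression, not just a literal field number:
$ echo 'a b c' | awk '{n = 2; print $n, $(n+1)}'
b c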
You need fewer $ signs
awk -F">" '{
if ($3 == $2)
print $2
else {
v=$2
while (v < $3)
print v++
}
}'
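A sketch of how this would slot into your pipeline, reusing the wrapper that strips the first and last lines:
awk 'NR > 2 { print p } { p = $0 }' file1 |
awk -F">" '{ if ($3 == $2) print $2; else { v = $2; while (v < $3) print v++ } }'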
Normally, the simple loop below would work, but your numbers are beyond awk's integer bounds, so you need another solution to handle them. I'm posting this to prompt other solutions and to better illustrate your specification.
$ awk -F'>' '{for(i=$2;i<=$3;i++) print i}' file
note that this will skip the rows that you say are impossible to happen (where the 3rd field is less than the 2nd)
A small scale example
$ cat file_0
x>1000>1000>etc
x>2000>2003>etc
x>3000>2999>etc
$ awk -F'>' '{for(i=$2;i<=$3;i++) print i}' file_0
1000
2000
2001
2002
2003
Apparently, newer versions of gawk have a --bignum (-M) option for arbitrary-precision integers. If you have a compatible version, that may solve your problem, but I don't have access to one to verify.
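I can't verify it here, but the invocation would presumably look something like this (the same range loop as above, just with arbitrary-precision arithmetic enabled):
gawk -M -F'>' '{for(i=$2;i<=$3;i++) print i}' file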
For anyone who does not have ready access to gawk with bigint support, it may be simpler to consider other options if some kind of "big integer" support is required. Since ruby has an awk-like mode of operation,
let's consider ruby here.
To get started, there are just four things to remember:
invoke ruby with the -n and -a options (-n for the awk-like loop; -a for automatic parsing of lines into fields ($F[i]));
awk's $n becomes $F[n-1];
explicit conversion of numeric strings to integers is required;
To specify the lines to be executed on the command line, use the '-e TEXT' option.
Thus a direct translation of:
awk -F'>' '{for(i=$2;i<=$3;i++) print i}' file
would be:
ruby -an -F'>' -e '($F[1].to_i .. $F[2].to_i).each {|i| puts i }' file
To guard against empty lines, the following script would be slightly better:
($F[1].to_i .. $F[2].to_i).each {|i| puts i } if $F.length > 2
This could be called as above, or if the script is in a file (say script.rb) using the incantation:
ruby -an -F'>' script.rb file
Given the OP input data, the output is:
513609200
513609200
50354428060
The left-padding can be accomplished in several ways -- see for example this SO page.
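Staying with awk for comparison, one sketch of a zero-padded variant of the range loop (it assumes the desired width is simply the width of field 2 as it appears in the file, and it is still subject to awk's ordinary numeric precision):
awk -F'>' '{w = length($2); for(i=$2; i<=$3; i++) printf("%0" w "d\n", i)}' file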
Is there a way to do this kind of substitution in awk, sed, ...?
I have a text file with sections divided by two blank lines:
section1_name_x
dklfjsdklfjsldfjsl
section2_name_x
dlskfjsdklfjsldkjflkj
section_name_X
dfsdjfksdfsdf
I would like to replace every "section_name_x" with "#section_name_x"; that is, how do I replace the next string after matching (every) two blank lines?
Thanks,
Steve,
awk '
(NR==1 || blank==2) && $1 ~ /^section/ {sub(/section/, "#&")}
{
print
if (length)
blank = 0
else
blank ++
}
' file
#section1_name_x
dklfjsdklfjsldfjsl
#section2_name_x
dlskfjsdklfjsldkjflkj
#section_name_X
dfsdjfksdfsdf
hm....
Given your example data why not just
sed 's/^section[0-9]*_name.*/#&/' file > newFile && mv newFile file
Some seds support sed -i OR sed -i"" to overwrite the existing file in place, avoiding the && mv ... shown above.
The regex says: section must be at the beginning of the line, optionally followed by a number (or no number at all), then _name and the rest of the line. The & in the replacement stands for the whole match, so the effect is simply to prefix the line with #.
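For instance, a hypothetical in-place run with a sed that supports -i would be:
sed -i 's/^section[0-9]*_name.*/#&/' file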
IHTH
In gawk you can use the RT builtin variable:
gawk '{$1="#"$1; print $0 RT}' RS='\n\n' file
* Update *
Thanks to @EdMorton I realized that my first version was incorrect.
What happens:
Assigning to $1 causes the record to be rebuilt, which is not good in this case since any sequence of whitespace between fields is replaced by a single space, and leading and trailing whitespace is removed from the record.
Using print adds an additional newline to the output.
The correct version:
gawk '{printf "%s", "#" $0 RT}' RS='\n\n\n' file
I want to print all the lines of a file where the first element of each line begins with a number using awk. Below are the details on the data contained in the file and command used:
filename contents:
12.44.4444goad ABCDEF/END
LMNOP/START joker
98.0 kites
command used:
awk '{ $1 ~ /^\d[a-zA-Z0-9]*/ }' filename
After running the above command, no results are displayed on the prompt.
Please let me know if there is any correction that needs to be made to the above command.
To print the lines starting with a digit, you can try the following:
awk '/^[[:digit:]]+/' file
As pointed out by @HenkLangeveld, your syntax is incorrect. Also, the regex \d is not available in awk.
If you only need to match at least one digit at the start of the line, all you need is ^ to match the start of a line and [0-9] to match a digit.
You can use curly brackets with an if statement:
awk '{if($1 ~ /^[0-9]/) print $0}' filename
But that would just be longhand for this:
awk '$1 ~ /^[0-9]/' filename
From your attempted solution, it looks like you want:
awk 'NF>1 && $1 ~ /^[0-9.]*$/' filename
You need to explicitly match the . if you want to include the decimal point, and you need the $ anchor to make the * meaningful. This will miss lines in which the first column looks like 5e39 or -2.3. You can try to catch those cases with:
awk 'NF>1 && $1 ~ /^-?[0-9.]*(e[0-9]+)?$/' filename
but at this point I would tell you to use perl and stop trying to be more robust with awk.
Perhaps (this will print blank lines...not sure which behavior you want):
perl -lane 'use POSIX qw(strtod); my ($num, $end) = strtod($F[0]);
print unless $end;' filename
This uses strtod to parse the number and tells you the number of characters at the end of the string that are not part of it.
Drop the braces and the \d, like this:
awk ' $1 ~ /^[0-9]/ ' filename
Awk programs come in chunks. A chunk is a pattern block pair, where the block
defaults to { print }. (An empty pattern defaults to true.)
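So the pattern-only command above is just shorthand for spelling out the default block:
awk '$1 ~ /^[0-9]/ { print }' filename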
The /\d/ is a perl-ism and might work in some versions of awk, but not in those that I tried*. You need either the traditional /^[0-9]/ or the POSIX /^[[:digit:]]/ notation.
*
gnu and ast
I am using the following statement in awk with text piped to it from another command:
awk 'match($0,/(QUOTATION|TAX INVOICE|ADJUSTMENT NOTE|DELIVERY DOCKET|PICKING SLIP|REMITTANCE ADVICE|PURCHASE ORDER|STATEMENT)/) && NR<11 {print substr($0,RSTART,RLENGTH)}'
which is almost working for what I need (find one of the words in the regex within the first 10 lines of the input and print that word). The main thing I need to do is to output something if there is no match. For instance, if none of those words are found in the first ten lines it would output UNKNOWN.
I also need to limit the output to the first match, as I need to ensure a single line of output per input file. I can do this with head or ask another question if needs be, I only include it here in case it affects how to output the no-match text.
I am also not tied to awk as a tool - if there is a simpler way to do this with sed or something else I am open to it.
You just need to exit at the first match, or on line 11 if there is no match:
awk '
match($0,/(QUOTATION|TAX ... ORDER|STATEMENT)/) {
print substr($0,RSTART,RLENGTH)
exit
}
NR == 11 {print "UNKNOWN"; exit}
'
I like glenn jackman's answer; however, if you wish to print matches for all 10 lines, you can try something like this:
awk '
match($0,/(QUOTATION|TAX ... ORDER|STATEMENT)/) {
print NR " ---> " substr($0,RSTART,RLENGTH)
flag=1
}
flag==0 && NR==11 {
print "UNKNOWN"
exit
}'
You can do this:
( head -10 | egrep -o '(QUOTATION|TAX INVOICE|ADJUSTMENT NOTE|DELIVERY DOCKET|PICKING SLIP|REMITTANCE ADVICE|PURCHASE ORDER|STATEMENT)' ||
  echo "UNKNOWN" ) | head -1