I have a config file with param=option[,option...]. Using standard bash utilities, perhaps with the help of sed, I want to remove one option from the list.
#
param=aa,bb,cc
param=aa,bb
param=bb,cc
param=bb
#
in this example, I want to remove 'bb' (and the separator) from all lines, and in the last case, because 'bb' was the sole option, remove the complete line, so the final result will be
#
param=aa,cc
param=aa
param=cc
#
option 'bb' can be alone, or at the start, middle or end of the list. Obviously, 'bb' embedded in another option name (i.e. xxbb, bbxx, etc.) should not be considered.
edit: fix typo, addn'l example
Here is a sed version to remove bb parameter from any position and delete the line if bb is the only parameter:
First the input file:
#
param=aa,bb,cc
param=aa,bb
param=bb,cc
param=bb
#
Now run this sed:
sed -E '/^param=/{/=bb$/d; s/,bb(,|$)/\1/; s/=bb,/=/;}' file
This will give:
#
param=aa,cc
param=aa
param=cc
#
For in-place editing, use:
sed -i.bak -E '/^param=/{/=bb$/d; s/,bb(,|$)/\1/; s/=bb,/=/;}' file
Note: The solutions below do not address updating the input file; a simple (though not fully robust) approach is to use
awk '...' file > file.$$ && mv file.$$ file
A POSIX-compliant awk solution that should work robustly:
awk -F'=' '
$1 != "param" { print; next }
{
sub(/,bb,/, ",", $2)
sub(/^bb,/, "", $2)
sub(/(^|,)bb$/, "", $2)
if ($2 != "") print $1 FS $2
}
' file
GNU awk allows for a simpler solution, using its (nonstandard) gensub() function:
awk -F'=' '
$1 != "param" { print; next }
{
newList = gensub(/(^|,)bb(,|$)/, "\\2", 1, $2)
sub(/^,/, "", newList)  # drop the leading comma left behind when bb was the first option
if (newList != "") print $1 FS newList
}
' file
A (POSIX-compliant) field-based alternative (more verbose, but perhaps easier to generalize):
awk -F'=' '
$1 != "param" { print; next }
{
n = split($2, opts, ","); optList = ""
for (i=1; i<=n; ++i) {
if (opts[i] != "bb") {
optList = optList (optList == "" ? "" : ",") opts[i]
}
}
if (optList != "") print $1 FS optList
}
' file
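For instance, feeding the sample lines plus a non-param line (to exercise the pass-through branch) into the field-based variant:

```shell
printf 'param=aa,bb,cc\nparam=aa,bb\nparam=bb,cc\nparam=bb\nother=bb,x\n' |
awk -F'=' '
$1 != "param" { print; next }
{
  n = split($2, opts, ","); optList = ""
  for (i=1; i<=n; ++i)
    if (opts[i] != "bb")
      optList = optList (optList == "" ? "" : ",") opts[i]
  if (optList != "") print $1 FS optList
}'
```

The bb-only line is dropped, and the other= line passes through untouched.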
Let's say your Input_file is as follows:
param=aa,bb,cc
param=aa,bb
param=bb
Then the following code (note that the simple sub(/,bb/,"") assumes bb is never the first option in the list and that no other option name starts with bb):
awk -F"=" '$2=="bb"{next} {sub(/,bb/,"");print}' Input_file
outputs:
param=aa,cc
param=aa
I'd use a temporary format to make the occurrences easier to find. And to remove lines I would suggest using grep:
sed 's/=/=,/;s/$/,/;s/,bb,/,/;s/=,/=/;s/,$//;/=$/d'
the s/=/=,/ converts it to:
param=,aa,bb,cc
param=,aa,bb
param=,bb
then s/$/,/ to:
param=,aa,bb,cc,
param=,aa,bb,
param=,bb,
then s/,bb,/,/
param=,aa,cc,
param=,aa,
param=,
and s/=,/=/;s/,$// will remove the commas at the beginning and end again
removing empty options can be done with grep -v '=$', or some more advanced sed magic (so it can still be used with sed -i)
EDIT:
the "sed magic" is just appending '/=$/d'
tested this one, and it works fine:
sed -i 's/=/=,/;s/$/,/;s/,bb,/,/;s/=,/=/;s/,$//;/=$/d' filename
or
sed 's/=/=,/;s/$/,/;s/,bb,/,/;s/=,/=/;s/,$//;/=$/d' filename_in > filename_out
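Putting it together on the sample input from the question:

```shell
printf 'param=aa,bb,cc\nparam=aa,bb\nparam=bb,cc\nparam=bb\n' |
sed 's/=/=,/;s/$/,/;s/,bb,/,/;s/=,/=/;s/,$//;/=$/d'
```

This prints param=aa,cc, param=aa and param=cc, deleting the bb-only line.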
I'm trying to emulate GNU grep -Eo with a standard awk call.
What the man says about the -o option is:
-o --only-matching
Print only the matched (non-empty) parts of matching lines, with each such part on a separate output line.
For now I have this code:
#!/bin/sh
regextract() {
[ "$#" -ge 2 ] || return 1
__regextract_ere=$1
shift
awk -v FS='^$' -v ERE="$__regextract_ere" '
{
while ( match($0,ERE) && RLENGTH > 0 ) {
print substr($0,RSTART,RLENGTH)
$0 = substr($0,RSTART+1)
}
}
' "$@"
}
My question is: In the case that the matching part is 0-length, do I need to continue trying to match the rest of the line or should I move to the next line (like I already do)? I can't find a sample of input+regex that would need the former but I feel like it might exist. Any idea?
Here's a POSIX awk version, which works with a* (or any POSIX awk regex):
echo abcaaaca |
awk -v regex='a*' '
{
while (match($0, regex)) {
if (RLENGTH) print substr($0, RSTART, RLENGTH)
$0 = substr($0, RSTART + (RLENGTH > 0 ? RLENGTH : 1))
if ($0 == "") break
}
}'
Prints:
a
aaa
a
POSIX awk and grep -E use POSIX extended regular expressions, except that awk allows C escapes (like \t) but grep -E does not. If you wanted strict compatibility you'd have to deal with that.
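As a toy illustration (not from the original question), awk's translation of C escapes can be seen directly:

```shell
# awk's regex engine turns the two characters \t into a real tab,
# so this matches a line containing a tab character:
printf 'a\tb\n' | awk '/\t/ { print "tab matched" }'
```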
If you can consider a gnu-awk solution, then using RS and RT may give behavior identical to grep -Eo.
# input data
cat file
FOO:TEST3:11
BAR:TEST2:39
BAZ:TEST0:20
Using grep -Eo:
grep -Eo '[[:alnum:]]+' file
FOO
TEST3
11
BAR
TEST2
39
BAZ
TEST0
20
Using gnu-awk with RS and RT using same regex:
awk -v RS='[[:alnum:]]+' 'RT != "" {print RT}' file
FOO
TEST3
11
BAR
TEST2
39
BAZ
TEST0
20
More examples:
grep -Eo '\<[[:digit:]]+' file
11
39
20
awk -v RS='\\<[[:digit:]]+' 'RT != "" {print RT}' file
11
39
20
Thanks to the various comments and answers, I think I now have working, robust, and (maybe) efficient code:
tested on AIX/Solaris/FreeBSD/macOS/Linux
#!/bin/sh
regextract() {
[ "$#" -ge 1 ] || return 1
[ "$#" -eq 1 ] && set -- "$1" -
awk -v FS='^$' '
BEGIN {
ere = ARGV[1]
delete ARGV[1]
}
{
tail = $0
while ( tail != "" && match(tail,ere) ) {
if (RLENGTH) {
print substr(tail,RSTART,RLENGTH)
tail = substr(tail,RSTART+RLENGTH)
} else
tail = substr(tail,RSTART+1)
}
}
' "$@"
}
regextract "$@"
notes:
I pass the ERE string along with the file arguments so that awk doesn't pre-process it (thanks @anubhava for pointing that out); C-style escape sequences will still be translated by awk's regex engine, though (thanks @dan for pointing that out).
Because assigning to $0 re-splits the record into fields, I chose FS = '^$' to limit that overhead.
Working on a copy of $0 in a separate variable avoids re-assigning $0 inside the while loop altogether (thanks @EdMorton for pointing that out).
a few examples:
# Multiple matches in a single line:
echo XfooXXbarXXX | regextract 'X*'
X
XX
XXX
# Passing the regex string to awk as a parameter versus a file argument:
echo '[a]' | regextract_as_awk_param '\[a]'
a
echo '[a]' | regextract '\[a]'
[a]
# The regex engine of awk translates C-style escape sequences:
printf '%s\n' '\t' | regextract '\t'
printf '%s\n' '\t' | regextract '\\t'
\t
Your code will malfunction for patterns that can match zero characters. Consider the following simple example: let file.txt's content be
1A2A3
then
grep -Eo 'A*' file.txt
gives output
A
A
your while condition is match($0,ERE) && RLENGTH > 0; here the former part is true, but the latter is false because the match found before the first character is zero-length (RSTART was set to 1), so the body of the while loop is executed zero times.
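For instance, the advance-by-one loop from the POSIX answer above handles this same input correctly:

```shell
printf '1A2A3\n' | awk -v regex='A*' '{
  while (match($0, regex)) {
    if (RLENGTH) print substr($0, RSTART, RLENGTH)
    $0 = substr($0, RSTART + (RLENGTH > 0 ? RLENGTH : 1))
    if ($0 == "") break
  }
}'
```

This prints A twice, matching grep -Eo 'A*'.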
Using awk or sed in a bash script, I need to remove the comma delimiters that are located between the first and last field. The problem is that values end up in the wrong columns, where only 3 columns are desired.
For example, I want to turn this:
2020/11/04,Test Account,569.00
2020/11/05,Test,Account,250.00
2020/11/05,More,Test,Accounts,225.00
Into this:
2020/11/04,Test Account,569.00
2020/11/05,Test Account,250.00
2020/11/05,More Test Accounts,225.00
I've tried a few things, testing regexes, but I cannot find a solution that selects only the inner commas for removal.
awk -F, '{ printf "%s,",$1;for (i=2;i<=NF-2;i++) { printf "%s ",$i };printf "%s,%s\n",$(NF-1),$NF }' file
Using awk, print the first comma-delimited field, then loop through the remaining fields up to the third-from-last, printing each followed by a space. Then for the last 2 fields, print the last-but-one field, a comma, and the last field.
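For example, on the line with two inner commas:

```shell
printf '2020/11/05,More,Test,Accounts,225.00\n' |
awk -F, '{ printf "%s,",$1; for (i=2;i<=NF-2;i++) printf "%s ",$i; printf "%s,%s\n",$(NF-1),$NF }'
```

The inner fields More and Test are printed space-separated, followed by Accounts, a comma, and the amount.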
With GNU awk for the 3rd arg to match():
$ awk -v OFS=, '{
match($0,/([^,]*),(.*),([^,]*)/,a)
gsub(/,/," ",a[2])
print a[1], a[2], a[3]
}' file
2020/11/04,Test Account,569.00
2020/11/05,Test Account,250.00
2020/11/05,More Test Accounts,225.00
or with any awk:
$ awk '
BEGIN { FS=OFS="," }
{
n = split($0,a)
gsub(/^[^,]*,|,[^,]*$/,"")
gsub(/,/," ")
print a[1], $0, a[n]
}
' file
2020/11/04,Test Account,569.00
2020/11/05,Test Account,250.00
2020/11/05,More Test Accounts,225.00
Use this Perl one-liner:
perl -F',' -lane 'print join ",", $F[0], "@F[1 .. ($#F-1)]", $F[-1];' in.csv
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-a : Split $_ into array @F on whitespace or on the regex specified in -F option.
-F',' : Split into @F on comma, rather than on whitespace.
$F[0] : first element of the array @F (= first comma-delimited value).
$F[-1] : last element of @F.
@F[1 .. ($#F-1)] : elements of @F between the second from the start and the second from the end, inclusive.
"@F[1 .. ($#F-1)]" : the above elements, joined on blanks into a string.
join ",", ... : join the LIST "..." on a comma, and return the resulting string.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perl -pe 's{,\K.*(?=,)}{$& =~ y/,/ /r}e' file
sed -e ':a' -e 's/\(,[^,]*\),\([^,]*,\)/\1 \2/; t a' file
awk '{$1=$1","; $NF=","$NF; gsub(/ *, */,","); print}' FS=, file
awk '{for (i=2; i<=NF; ++i) $i=(i>2 && i<NF ? " " : ",") $i} 1' FS=, OFS= file
awk doesn't support lookarounds, but we can emulate them using awk's match function. With that, you could try the following, written and tested on the shown samples in GNU awk:
awk '
match($0,/,.*,/){
val=substr($0,RSTART+1,RLENGTH-2)
gsub(/,/," ",val)
print substr($0,1,RSTART) val substr($0,RSTART+RLENGTH-1)
}
' Input_file
Yet another perl
$ perl -pe 's/(?:^[^,]*,|,[^,]*$)(*SKIP)(*F)|,/ /g' ip.txt
2020/11/04,Test Account,569.00
2020/11/05,Test Account,250.00
2020/11/05,More Test Accounts,225.00
(?:^[^,]*,|,[^,]*$) matches the first/last field along with its comma
(*SKIP)(*F) discards such a match, protecting those commas from modification
|, provides , as the alternate regexp that actually gets replaced
With sed (assuming \n is supported by the implementation, otherwise, you'll have to find a character that cannot be present in the input)
sed -E 's/,/\n/; s/,([^,]*)$/\n\1/; y/,/ /; y/\n/,/'
s/,/\n/; s/,([^,]*)$/\n\1/ replace first and last comma with newline character
y/,/ / replace all comma with space
y/\n/,/ change newlines back to comma
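For example, with GNU sed (which supports \n in the replacement and in y):

```shell
echo '2020/11/05,More,Test,Accounts,225.00' |
sed -E 's/,/\n/; s/,([^,]*)$/\n\1/; y/,/ /; y/\n/,/'
```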
A similar answer to Timur's, in awk
awk '
BEGIN { FS = OFS = "," }
function join(start, stop, sep, str, i) {
str = $start
for (i = start + 1; i <= stop; i++) {
str = str sep $i
}
return str
}
{ print $1, join(2, NF-1, " "), $NF }
' file.csv
It's a shame awk doesn't ship with a builtin join function.
I managed to extract the following response and comma-separate it. It's a comma-separated string and I'm only interested in the comma-separated values of the account_id's. How do you pattern match using sed?
Input: ACCOUNT_ID,711111111119,ENVIRONMENT,dev,ACCOUNT_ID,111111111115,dev
Expected Output: 711111111119, 111111111115
My $input variable stores the input
I tried the below, but it joins all the numbers and I would like to preserve the comma ','
echo "$input" | sed -e "s/[^0-9]//g"
I think you're better served with awk:
awk -v FS=, '{for(i=1;i<=NF;i++)if($i~/[0-9]/){printf "%s%s",sep,$i;sep=","}} END{print ""}'
If you really want sed, you can go for
sed -e "s/[^0-9]/,/g" -e "s/,,*/,/g" -e "s/^,\|,$//g"
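For example (GNU sed, because of the \| alternation):

```shell
input='ACCOUNT_ID,711111111119,ENVIRONMENT,dev,ACCOUNT_ID,111111111115,dev'
echo "$input" | sed -e "s/[^0-9]/,/g" -e "s/,,*/,/g" -e "s/^,\|,$//g"
```

Every non-digit becomes a comma, runs of commas are squeezed to one, and the leading/trailing commas are stripped.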
$ awk '
BEGIN {
FS = OFS = ","
}
{
c = 0
for (i = 1; i <= NF; i++) {
if ($i == "ACCOUNT_ID") {
printf "%s%s", (c++ ? OFS : ""), $(i + 1)
}
}
print ""
}' file
711111111119,111111111115
I would like to reverse the complete text from the file.
Say if the file contains:
com.e.h/float
I want to get output as:
float/h.e.com
I have tried the command:
rev file.txt
but that gave me the fully reversed output: taolf/h.e.moc
Is there a way I can get the desired output? Do let me know. Thank you.
Here is the link to the sample file: Sample Text
You can use sed and tac:
str=$(echo 'com.e.h/float' | sed -E 's/(\W+)/\n\1\n/g' | tac | tr -d '\n')
echo "$str"
float/h.e.com
Using sed we insert \n before and after all non-word characters.
Using tac we reverse the output lines.
Using tr we strip all new lines.
If you have gnu-awk then you can do all this in a single awk command using the 4-argument split() function, which populates the split strings and the separators separately:
awk '{
s = ""
split($0, arr, /\W+/, seps)
for (i=length(arr); i>=1; i--)
s = s seps[i] arr[i]
print s
}' file
For non-gnu awk, you can use:
awk '{
r = $0
i = 0
while (match(r, /[^a-zA-Z0-9_]+/)) {
a[++i] = substr(r, RSTART, RLENGTH) substr(r, 1, RSTART-1)
r = substr(r, RSTART+RLENGTH)
}
s = r
for (j=i; j>=1; j--)
s = s a[j]
print s
}' file
Is it possible to use Perl?
perl -nlE 'say reverse(split("([/.])",$_))' f
This one-liner reverses all the lines of f, according to the OP's criteria.
If you prefer a version with fewer parentheses:
perl -nlE 'say reverse split "([/.])"' f
For portability, this can be done using any awk (not just GNU) using substrings:
$ awk '{
while (match($0,/[[:alnum:]]+/)) {
s=substr($0,RLENGTH+1,1) substr($0,1,RLENGTH) s;
$0=substr($0,RLENGTH+2)
} print s
}' <<<"com.e.h/float"
This steps through the string grabbing alphanumeric strings plus the following character, reversing the order of those two captured pieces, and prepending them to an output string.
Using GNU awk's split(), splitting on the separators . and /; define more if you wish.
$ cat program.awk
{
for(n=split($0,a,"[./]",s); n>=1; n--) # split to a and s, use n from split
printf "%s%s", a[n], (n==1?ORS:s[(n-1)]) # printf it pretty
}
Run it:
$ echo com.e.h/float | awk -f program.awk
float/h.e.com
EDIT:
If you want to run it as one-liner:
awk '{for(n=split($0,a,"[./]",s); n>=1; n--) printf "%s%s", a[n], (n==1?ORS:s[(n-1)])}' foo.txt
I have a text file in this format:
abacası Abaca[Noun]+[Prop]+[A3sg]+SH[P3sg]+[Nom] : 20.1748046875
abacı Abaç[Noun]+[Prop]+[A3sg]+SH[P3sg]+[Nom] : 16.3037109375 Aba[Noun]+[Prop]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+[A3sg]+[Pnon]+[Nom] : 23.0185546875
abacılarla Aba[Noun]+[Prop]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+lAr[A3pl]+[Pnon]+YlA[Ins] : 27.8974609375 aba[Noun]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+lAr[A3pl]+[Pnon]+YlA[Ins] : 23.3427734375 abacı[Noun]+lAr[A3pl]+[Pnon]+YlA[Ins] : 19.556640625
Here I call the first string before the first space the word (for example abacası)
The string that starts after the first space and ends with a number is a definition (for example Abaca[Noun]+[Prop]+[A3sg]+SH[P3sg]+[Nom] : 20.1748046875)
I want to do this: If a line includes more than one definition (first line has one, second line has two, third line has three), apply newline and put the first string (word) into the beginning of the new line. Expected output:
abacası Abaca[Noun]+[Prop]+[A3sg]+SH[P3sg]+[Nom] : 20.1748046875
abacı Abaç[Noun]+[Prop]+[A3sg]+SH[P3sg]+[Nom] : 16.3037109375
abacı Aba[Noun]+[Prop]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+[A3sg]+[Pnon]+[Nom] : 23.0185546875
abacılarla Aba[Noun]+[Prop]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+lAr[A3pl]+[Pnon]+YlA[Ins] : 27.8974609375
abacılarla aba[Noun]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+lAr[A3pl]+[Pnon]+YlA[Ins] : 23.3427734375
abacılarla abacı[Noun]+lAr[A3pl]+[Pnon]+YlA[Ins] : 19.556640625
I have almost 1,500,000 lines in my text file and the number of definitions per line is not fixed; it can be 1 to 5.
A small Python script does the job. Input is expected in input.txt; output goes to output.txt.
import re
rf = re.compile(r'(\S+\s).+')
r = re.compile(r'(\S+\s:\s\d+\.\d+)')
with open("input.txt", "r") as f:
    text = f.read()
with open("output.txt", "w") as f:
    for l in text.split('\n'):
        offset = 0
        first = ""
        match = re.search(rf, l)
        if match:
            first = match.group(1)
            offset = len(first)
        while True:
            match = re.search(r, l[offset:])
            if not match:
                break
            # advance past the entire match, not just past group(1)
            offset += match.end()
            f.write(first + match.group(1) + "\n")
I am assuming the following format:
word definitionkey : definitionvalue [definitionkey : definitionvalue …]
None of those elements may contain a space and they are always delimited by a single space.
The following code should work:
awk '{ for (i=2; i<=NF; i+=3) print $1, $i, $(i+1), $(i+2) }' file
Explanation (this is the same code but with comments and more spaces):
awk '
# match any line
{
# iterate over each "key : value"
for (i=2; i<=NF; i+=3)
print $1, $i, $(i+1), $(i+2) # prints each "word key : value"
}
' file
awk has some tricks that you may not be familiar with. It works on a line-by-line basis. Each stanza has an optional conditional before it (awk 'NF >=4 {…}' would make sense here since we'll have an error given fewer than four fields). NF is the number of fields and a dollar sign ($) indicates we want the value of the given field, so $1 is the value of the first field, $NF is the value of the last field, and $(i+1) is the value of the third field (assuming i=2). print will default to using spaces between its arguments and adds a line break at the end (otherwise, we'd need printf "%s %s %s %s\n", $1, $i, $(i+1), $(i+2), which is a bit harder to read).
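A toy input makes the field arithmetic concrete (w, k1, k2 are made-up placeholders, not from the question's data):

```shell
printf 'w k1 : 1.5 k2 : 2.5\n' |
awk '{ for (i=2; i<=NF; i+=3) print $1, $i, $(i+1), $(i+2) }'
```

Each iteration prints the word ($1) followed by one three-field "key : value" group.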
With perl:
perl -a -F'[^]:]\K\h' -ne 'chomp(@F);$p=shift(@F);print "$p ",shift(@F),"\n" while(@F);' yourfile.txt
With bash:
while read -r line
do
pre=${line%% *}
echo "$line" | sed 's/\([0-9]\) /\1\n'$pre' /g'
done < "yourfile.txt"
This script read the file line by line. For each line, the prefix is extracted with a parameter expansion (all until the first space) and spaces preceded by a digit are replaced with a newline and the prefix using sed.
edit: as tripleee suggested, it's much faster to do it all with sed:
sed -i.bak ':a;s/^\(\([^ ]*\).*[0-9]\) /\1\n\2 /;ta' yourfile.txt
Assuming there are always 4 space-separated words for each definition:
awk '{for (i=1; i<NF; i+=4) print $i, $(i+1), $(i+2), $(i+3)}' file
Or if the split should occur after that floating point number
perl -pe 's/\b\d+\.\d+\K\s+(?=\S)/\n/g' file
(This is the perl equivalent of Avinash's answer)
Bash and grep:
#!/bin/bash
while IFS=' ' read -r in1 in2 in3 in4; do
if [[ -n $in4 ]]; then
prepend="$in1"
echo "$in1 $in2 $in3 $in4"
else
echo "$prepend $in1 $in2 $in3"
fi
done < <(grep -o '[[:alnum:]][^:]\+ : [[:digit:].]\+' "$1")
The output of grep -o puts every definition on a separate line, but definitions originating from the same input line are missing the "word" at the beginning:
abacası Abaca[Noun]+[Prop]+[A3sg]+SH[P3sg]+[Nom] : 20.1748046875
abacı Abaç[Noun]+[Prop]+[A3sg]+SH[P3sg]+[Nom] : 16.3037109375
Aba[Noun]+[Prop]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+[A3sg]+[Pnon]+[Nom] : 23.0185546875
abacılarla Aba[Noun]+[Prop]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+lAr[A3pl]+[Pnon]+YlA[Ins] : 27.8974609375
aba[Noun]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+lAr[A3pl]+[Pnon]+YlA[Ins] : 23.3427734375
abacı[Noun]+lAr[A3pl]+[Pnon]+YlA[Ins] : 19.556640625
The for loop now loops over this, using a space as the input file separator. If in4 is a zero length string, we're on a line where the "word" is missing, so we prepend it.
The script takes the input file name as its argument, and saving output to an output file can be done with simple redirection:
./script inputfile > outputfile
Using perl:
$ perl -nE 'm/([^ ]*) (.*)/; my $word=$1; $_=$2; say $word . " " . $_ for / *(.*?[0-9]+\.[0-9]+)/g;' < input.log
Output:
abacası Abaca[Noun]+[Prop]+[A3sg]+SH[P3sg]+[Nom] : 20.1748046875
abacı Abaç[Noun]+[Prop]+[A3sg]+SH[P3sg]+[Nom] : 16.3037109375
abacı Aba[Noun]+[Prop]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+[A3sg]+[Pnon]+[Nom] : 23.0185546875
abacılarla Aba[Noun]+[Prop]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+lAr[A3pl]+[Pnon]+YlA[Ins] : 27.8974609375
abacılarla aba[Noun]+[A3sg]+[Pnon]+[Nom]-CH[Noun+Agt]+lAr[A3pl]+[Pnon]+YlA[Ins] : 23.3427734375
abacılarla abacı[Noun]+lAr[A3pl]+[Pnon]+YlA[Ins] : 19.556640625
Explanation:
Split the line to separate first field as word.
Then split the remaining line using the regex .*?[0-9]+\.[0-9]+.
Print word concatenated with every match of above regex.
I would approach this with one of the excellent Awk answers here; but I'm posting a Python solution to point to some oddities and problems with the currently accepted answer:
It reads the entire input file into memory before processing it. This is harmless for small inputs, but the OP mentions that the real-world input is kind of big.
It needlessly uses re when simple whitespace tokenization appears to be sufficient.
I would also prefer a tool which prints to standard output, so that I can redirect it where I want it from the shell; but to keep this compatible with the earlier solution, this hard-codes output.txt as the destination file.
with open('input.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        for line in infile:
            tokens = line.split()
            if not tokens:
                continue
            word = tokens[0]
            for idx in range(1, len(tokens), 3):
                print(word, ' '.join(tokens[idx:idx + 3]), file=outfile)
If you really, really wanted to do this in pure Bash, I suppose you could:
while read -r word analyses; do
set -- $analyses
while [ $# -gt 0 ]; do
printf "%s %s %s %s\n" "$word" "$1" "$2" "$3"
shift; shift; shift
done
done <input.txt >output.txt
Please find the following bash code
#!/bin/bash
# read.sh
while read variable
do
for i in "$variable"
do
var=`echo "$i" |wc -w`
array_1=( $i )
counter=0
for((j=1 ; j < $var ; j++))
do
if [ $counter = 0 ] #1
then
echo -ne ${array_1[0]}' '
fi #1
echo -ne ${array_1[$j]}' '
counter=$(expr $counter + 1)
if [ $counter = 3 ] #2
then
counter=0
echo
fi #2
done
done
done
I have tested and it is working.
To test
On bash shell prompt give the following command
$ ./read.sh < input.txt > output.txt
where read.sh is the script, input.txt is the input file, and output.txt is where the output is generated
Here is sed in action:
sed -r '/^indirger(ken|di)/{s/([0-9]+[.][0-9]+ )(indirge)/\1\n\2/g}' my_file
output
indirgerdi indirge[Verb]+[Pos]+Hr[Aor]+[A3sg]+YDH[Past] : 22.2626953125
indirge[Verb]+[Pos]+Hr[Aor]+YDH[Past]+[A3sg] : 18.720703125
indirgerken indirge[Verb]+[Pos]+Hr[Aor]+[A3sg]-Yken[Adv+While] : 19.6201171875