SED to replace text inside quotes - regex

Assuming I have a text file where some lines are of this format:
%attr(750,user1,group1) "/mydrive/mypath1/mypath2\winpath1\winpath2\file.xml"
What I want to achieve is:
touch only those lines which start with %attr
on each such line, find the last occurrence of ".*" (including the quotes)
inside that last occurrence, replace every \ with /
What is the proper syntax for sed utility?

awk can do the job easily:
awk -F '"' '/^%attr/ {gsub(/\\/, "/", $(NF-1))} 1' OFS='"' file
To change the original file:
awk -F '"' '/^%attr/ {gsub(/\\/, "/", $(NF-1))} 1' OFS='"' file > _tmp && mv _tmp file

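You can try this sed, which replaces every backslash on lines that start with %attr (not only those inside the last quoted string):
sed '/^%attr/ s|\\|/|g' file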
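For the sed syntax the question asks about, one possible sketch is a substitute-and-loop: on lines starting with %attr, replace one backslash at a time inside the last quoted string until none remain. The function name fix_attr_paths is made up for illustration, and the sketch assumes the double quotes on each line are balanced.

```shell
# Hypothetical helper: convert \ to / only inside the last "..." on %attr lines.
# Lines that do not start with %attr branch straight to the end (b) untouched.
# The s command rewrites the last backslash that sits between the final pair
# of double quotes; "ta" jumps back to :a for as long as a substitution happened.
fix_attr_paths() {
    sed -e '/^%attr/!b' \
        -e ':a' \
        -e 's|\("[^"]*\)\\\([^"]*"[^"]*\)$|\1/\2|' \
        -e 'ta' "$@"
}

# Usage: fix_attr_paths file
```

Because the pattern requires that no further double quote follows the closing one, an earlier quoted string on the same line is left alone.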


Extract all numbers from a text file and store them in another file

I have a text file which have lots of lines. I want to extract all the numbers from that file.
File contains text and number and each line contains only one number.
How can I do it using sed or awk in a bash script?
I tried
#! /bin/bash
sed 's/\([0-9.0-9]*\).*/\1/' <myfile.txt >output.txt
but this didn't work.
grep can handle this:
grep -Eo '[0-9.]+' myfile.txt
-o tells grep to print only the matches, and [0-9.]+ is a regular expression that matches runs of digits and periods. (Inside a bracket expression the period needs no backslash; a backslash there would be matched literally.)
To put all numbers on one line and save them in output.txt:
echo $(grep -Eo '[0-9.]+' myfile.txt) >output.txt
Text files should normally end with a newline character. The use of echo above ensures that this happens.
Non-GNU grep:
If your grep does not support the -o flag, try:
echo $(tr ' ' '\n' <myfile.txt | grep -E '[0-9.]+') >output.txt
This uses tr to replace all spaces with newlines (so each word appears on its own line) and then uses grep to keep only the words that contain a number.
tr -sc '0-9.' ' ' <"$file"
This will transform every run of non-digit-or-period characters into a single space. (tr reads only standard input, so the file has to be redirected in.)
You can also use Bash:
while IFS= read -r line; do
if [[ $line =~ [0-9.]+ ]]; then
echo "${BASH_REMATCH[0]}"
fi
done <myfile.txt >output.txt
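As a portable sketch of the tr idea above (no grep -o needed), every run of characters that is not a digit or period can be squeezed into a newline instead of a space, so each number lands on its own line. The function name extract_numbers is hypothetical.

```shell
# Hypothetical helper: print each run of digits/periods on its own line.
extract_numbers() {
    # -c complements the set, -s squeezes each run of replaced characters
    # into a single newline; grep then drops pieces that contain no digit
    # (for example a lone period left over from the end of a sentence).
    tr -cs '0-9.' '\n' | grep '[0-9]'
}

# Usage: extract_numbers <myfile.txt >output.txt
```

A period that directly follows a number (as in "costs 7.") stays attached to it, so this is only a rough filter.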

Removing spaces after and before commas (not between words) using command line

I have a CSV file that is created using pull_data $a $b > $id.csv.
An e.g. of a line from the output is:
1041894,30/01/2013,31/01/2013,A Customer Limited , 2, 1,PR14, PR14 , 104 An Item ,247 An Item
Here is my condition:
I need to remove all spaces after or before the commas but exclude spaces that are between words.
I need each line to look like the below and would like to use sed if possible:
1041894,30/01/2013,31/01/2013,A Customer Limited,2,1,PR14,PR14,104 An Item,247 An Item
I have tried sed 's/, \+\| \+,/,/g' but this does not remove all the spaces after or before a comma, it removes only some.
Try this sed:
sed -i.bak -e 's/,[[:space:]]\+/,/g' -e 's/[[:space:]]\+,/,/g' file
OR this awk:
awk -F '[[:space:]]*,[[:space:]]*' '{$1=$1}1' OFS=, file
UPDATE:
The OP's data might contain other whitespace characters, not just spaces.
Try this sed as well:
sed -i.bak 's/[[:space:]]*,[[:space:]]*/,/g' file
This might work for you (GNU sed):
sed 's/\s*,\s*/,/g' file
Using awk
awk 'BEGIN{FS=OFS=","} {for (i=1;i<=NF;i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)}1' file
Something like this:
awk '{gsub(/[[:space:]]*,[[:space:]]*/,",")}1' file
echo "1041894,30/01/2013,31/01/2013,A Customer Limited , 2, 1,PR14 ,PR14 ,104 An Item ,247 An Item" | awk '{gsub(/[[:space:]]*,[[:space:]]*/,",")}1'
1041894,30/01/2013,31/01/2013,A Customer Limited,2,1,PR14,PR14,104 An Item,247 An Item
It removes any whitespace before or after a comma.
Edit: added [[:space:]] to remove all types of spaces.
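The original attempt, s/, \+\| \+,/,/g, matches either the spaces before a comma or the spaces after it, but never both in one match: once " ," has been replaced, the comma is already consumed and the spaces that follow it are no longer adjacent to anything the pattern can match. Matching the whitespace on both sides in a single pattern avoids this. A minimal sketch with the sample line from the question (the function name trim_commas is made up):

```shell
# Hypothetical helper: strip whitespace on both sides of every comma.
trim_commas() {
    sed 's/[[:space:]]*,[[:space:]]*/,/g' "$@"
}

line='1041894,30/01/2013,31/01/2013,A Customer Limited , 2, 1,PR14, PR14 , 104 An Item ,247 An Item'
printf '%s\n' "$line" | trim_commas
```

Spaces between words are untouched because they are never next to a comma.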

Bash - how to put each line within quotation

I want to put each line within quotation marks, such as:
abcdefg
hijklmn
opqrst
convert to:
"abcdefg"
"hijklmn"
"opqrst"
How to do this in Bash shell script?
Using awk
awk '{ print "\""$0"\""}' inputfile
Using pure bash
while IFS= read -r FOO; do
printf '"%s"\n' "$FOO"
done < inputfile
where inputfile would be a file containing the lines without quotes.
If your file has empty lines, awk is definitely the way to go:
awk 'NF { print "\""$0"\""}' inputfile
NF tells awk to only execute the print command when the Number of Fields is more than zero (line is not empty).
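Note that with NF as the pattern, blank lines are dropped from the output rather than passed through. If they should be kept (unquoted), one possible variation is to rewrite only the non-empty lines and print everything. The function name quote_nonblank is hypothetical.

```shell
# Hypothetical helper: quote non-empty lines, pass blank lines through as-is.
quote_nonblank() {
    # NF is zero for empty or whitespace-only lines, so only lines with
    # content are wrapped in quotes; the trailing 1 prints every line.
    awk 'NF { $0 = "\"" $0 "\"" } 1' "$@"
}

# Usage: quote_nonblank inputfile
```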
I use the following command:
xargs -I{lin} echo \"{lin}\" < your_filename
xargs takes standard input (redirected from your file), passes one line at a time to the {lin} placeholder, and then executes the command that follows, in this case an echo with escaped double quotes.
With GNU xargs you can use the -i option to omit the name of the placeholder (it defaults to {}), although -i is deprecated in favor of -I{}:
xargs -i echo \"{}\" < your_filename
In both cases, your IFS must be at default value or with '\n' at least.
This sed should work for ignoring empty lines as well:
sed -i.bak 's/^..*$/"&"/' inFile
or
sed 's/^.\{1,\}$/"&"/' inFile
Use sed:
sed -e 's/^\|$/"/g' file
More effort needed if the file contains empty lines.
I think sed and awk are the best solutions, but if you want to use just the shell, here is a small script for you.
#!/bin/bash
chr="\""
file="file.txt"
# keep a backup copy before the original is overwritten
cp "$file" "$file._backup"
while IFS= read -r line
do
printf '%s\n' "${chr}$line${chr}"
done <"$file" > newfile
mv newfile "$file"
paste -d\" /dev/null your-file /dev/null
(not the nicest looking, but probably the fastest)
Now, if the input may contain quotes, you may need to escape them with backslashes (and then escape backslashes as well) like:
sed 's/["\]/\\&/g; s/.*/"&"/' your-file
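To see what the escaping step does, here is a small sketch that wraps the command above in a function (the name quote_escaped is made up) and feeds it one line with quotes and one with a backslash:

```shell
# Hypothetical helper: backslash-escape " and \ first, then wrap the line in quotes.
quote_escaped() {
    sed 's/["\]/\\&/g; s/.*/"&"/' "$@"
}

printf '%s\n' 'say "hi"' 'a\b' | quote_escaped
# prints:
# "say \"hi\""
# "a\\b"
```

The order matters: escaping must happen before the outer quotes are added, otherwise those would be escaped too.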
This answer worked for me in mac terminal.
$ awk '{ printf "\"%s\",\n", $0 }' your_file_name
It should be noted that the text in double quotes and commas was printed out in terminal, the file itself was unaffected.
I used sed with two expressions to replace start and end of line, since in my particular use case I wanted to place HTML tags around only lines that contained particular words.
So I searched for the lines containing the word held in the bla variable within the text file inputfile, and replaced the beginning with <P> and the end with </P> (actually I did some longer HTML tagging in the real thing, but this will serve fine as an example).
Similar to:
$ bla=foo
$ sed -e "/${bla}/s#^#<P>#" -e "/${bla}/s#\$#</P>#" inputfile
<P>foo</P>
bar
$

Perl, sed, or awk one-liner to change the format of the file

I need advice on how to change the file formatted following way
file1:
A 504688
B jobnameA
A 504690
B jobnameB
A 504691
B jobnameC
...
into file2:
A B
504688 jobnameA
504690 jobnameB
504691 jobnameC
...
One solution I could think of is:
cat file1 | perl -0777 -p -e 's/\s+B/\t/' | awk '{print $2"\t"$3}'.
But I am wondering if there is more efficient way or already known practice that does this job.
perl -nawe 'print "@F[1 .. $#F]", $F[0] eq "A" ? "\t" : "\n"' < /tmp/ab
Look up the options in perlrun.
Another useful one to add is -l (append newline to print), but not in this case.
Assuming your input file is tab separated:
echo $'A\tB'
cut -f2 filename | paste - -
Should be pretty quick because this is exactly what cut and paste were written to do.
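Put together as a small sketch (the function name pair_columns is hypothetical, and the input is assumed to be tab separated with strictly alternating A and B lines):

```shell
# Hypothetical helper: print the header, then pair up the second column
# of every two consecutive lines.
pair_columns() {
    # fixed header, as in the desired file2
    printf 'A\tB\n'
    # cut keeps the second tab-separated field; paste - - reads two lines
    # from standard input at a time and joins them with a tab
    cut -f2 "$@" | paste - -
}

# Usage: pair_columns file1 > file2
```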
awk '/^A/{num=$2}/^B/{print num,$2}' file
Or, alternately,
awk '{num=$2;getline;print num,$2}' file
Here is an sed solution:
sed -e 'N' -e 's/A\s*\(.*\)\nB\s*\(.*\)/\1\t\2/' file
This version will also print the header at the top:
sed '1{h;s/.*/A\tB/p;g};N;s/A\s*\(.*\)\nB\s*\(.*\)/\1\t\2/' file
Or an alternative:
sed -n '/^A\s*/{s///;h};/^B\s*/{s///;H;g;s/\n/\t/p}' file
If your sed does not support semicolons as a command separator for the alternative:
sed -n '
/^A\s*/{ # if the line starts with "A"
s/// # remove the "A" and the whitespace
h # copy the remainder into the hold space
} # end if
/^B\s*/{ # if the line starts with "B"
s/// # remove the "B" and the whitespace
H # append pattern space to hold space
g # copy hold space to pattern space
s/\n/\t/p # replace newline with tab and print
}' file
This version will also print the header at the top:
sed -n '/^A\s*/{s///;h;1s/.*/A\tB/p};/^B\s*/{s///;H;g;s/\n/\t/p}' file
This will work with any header text, not just fixed A and B:
awk '{a=$1;b=$2;getline;if(c!=1){print a,$1;c=1};print b,$2}' file1 >file2
...and it will also print the header row.
If you need \t separator, then use:
awk '{a=$1;b=$2;getline;if(c!=1){print a"\t"$1;c=1};print b"\t"$2}' file1 >file2
This might work for you:
sed -e '1i\A\tB' -e 'N;s/A\s*\(\S*\).*\nB\s*\(\S*\).*/\1\t\2/' file

With sed or awk, how do I match from the end of the current line back to a specified character?

I have a list of file locations in a text file. For example:
/var/lib/mlocate
/var/lib/dpkg/info/mlocate.conffiles
/var/lib/dpkg/info/mlocate.list
/var/lib/dpkg/info/mlocate.md5sums
/var/lib/dpkg/info/mlocate.postinst
/var/lib/dpkg/info/mlocate.postrm
/var/lib/dpkg/info/mlocate.prerm
What I want to do is use sed or awk to read from the end of each line until the first forward slash (i.e., pick the actual file name from each file address).
I'm a bit shaky on syntax for both sed and awk. Can anyone help?
$ sed -e 's!^.*/!!' locations.txt
mlocate
mlocate.conffiles
mlocate.list
mlocate.md5sums
mlocate.postinst
mlocate.postrm
mlocate.prerm
Regular-expression quantifiers are greedy, which means .* matches as much of the input as possible. Read a pattern of the form .*X as "the last X in the string." In this case, we're deleting everything up through the final / in each line.
I used bangs rather than the usual forward-slash delimiters to avoid a need for escaping the literal forward slash we want to match. Otherwise, an equivalent albeit less readable command is
$ sed -e 's/^.*\///' locations.txt
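A quick way to see the greedy behavior is to compare it with a pattern that cannot cross a slash. This is only an illustrative sketch; the variable names are made up.

```shell
path=/var/lib/dpkg/info/mlocate.list

# Greedy .* runs to the LAST slash, leaving only the file name.
last=$(printf '%s\n' "$path" | sed -e 's!^.*/!!')

# [^/]* cannot cross a slash, so this stops at the FIRST one.
first=$(printf '%s\n' "$path" | sed -e 's!^[^/]*/!!')

printf '%s\n%s\n' "$last" "$first"
# prints:
# mlocate.list
# var/lib/dpkg/info/mlocate.list
```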
Use the basename command:
$ basename /var/lib/mlocate
mlocate
I am for "basename" too, but for the sake of completeness, here is an awk one-liner:
awk -F/ 'NF>0{print $NF}' <file.txt
There's really no need to use sed or awk here; simply use basename:
while IFS= read -r file; do
basename "$file"
done < filelist
If you want the directory part instead use dirname.
Pure Bash:
while read -r line
do
[[ ${#line} != 0 ]] && echo "${line##*/}"
done < files.txt
Edit: Excludes blank lines.
This would do the trick too, if file contains the list of paths:
$ xargs -d '\n' -n 1 -a file basename
This is a less-clever, plodding version of gbacon's:
sed -e 's/^.*\/\([^\/]*\)$/\1/'
@OP, you can use awk
awk -F"/" 'NF{ print $NF }' file
NF means the number of fields; $NF gets the value of the last field.
or with the shell
while read -r line
do
line=${line##*/} # removes the longest match of */ from the front, i.e. everything up to the last "/"
[ -n "$line" ] && echo "$line"
done <"file"
NB: if you have big files, use awk.
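For reference, the parameter expansions used in the shell answers come in pairs: ## strips from the front and % strips from the back. A minimal sketch (variable names are made up for illustration):

```shell
path=/var/lib/dpkg/info/mlocate.list

# strip the longest prefix matching */ : what is left is the file name
file_name=${path##*/}
# strip the shortest suffix matching /* : what is left is the directory
dir_name=${path%/*}

printf '%s\n%s\n' "$file_name" "$dir_name"
# prints:
# mlocate.list
# /var/lib/dpkg/info
```

These are not exact replacements for basename and dirname: for a path with no slash at all, ${path%/*} returns the path unchanged, whereas dirname prints a single dot.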