The document I would like to transform looks like this:
name=foo
name=bar
thing, attribute1=foo, attribute2=data1
thing, attribute3=bar, attribute4=data2
What I would like to do is find the strings foo and bar (by searching for "name=(.*)", for example) and then replace all occurrences by adding a prefix.
The document would then become
name=prefix_foo
name=prefix_bar
thing, attribute=prefix_foo
thing, attribute=prefix_bar
I imagine this could be done purely with grep and sed?
Working line by line the transformation would be:
gsed -i -E 's/name=(.*)/name=prefix_\1/g' test.txt
However, how can I reuse the match for other substitutions (recursively)?
You can indeed reuse the match for other substitutions. By using the GNU grep options -P (Perl-compatible regex) and -o (print only the matched part), together with \K, you can select only the names you want to replace, and then prefix them with sed. Here's a bash script that does what you want.
# get the filename and prefix
echo "input filename?"
read -r fname
echo "prefix?"
read -r prefix
# if it's a file...
if [ -f "$fname" ]
then
    # grep for the names to change (-P enables \K, -o prints only the match)
    result=$(grep -P -o "name=\K.*" "$fname")
    # get the names in an array (splits on whitespace)
    arrRes=($result)
    # loop through and sed each name
    for name in "${arrRes[@]}"; do
        # name now holds a name to substitute
        echo "replacing $name with $prefix$name"
        # substitute the name everywhere in the file, in place
        sed -i "s/$name/$prefix$name/g" "$fname"
    done
fi
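One caveat with the loop above: each name is spliced into the sed pattern unescaped, and sed rewrites the file once per name. As a rough alternative, here is a minimal two-pass awk sketch of the same idea. It assumes names contain only letters, digits, and underscores, and that every value to rewrite directly follows an = sign. The sample lines and the prefix_ value come from the question; the temporary file is only there to make the example self-contained.

```shell
# Sample input from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'name=foo' 'name=bar' \
  'thing, attribute1=foo, attribute2=data1' \
  'thing, attribute3=bar, attribute4=data2' > "$tmp"

prefix='prefix_'   # assumed prefix, as in the question

# The file is read twice: pass 1 (NR == FNR) collects the names,
# pass 2 prefixes every "=name" occurrence and prints the line.
out=$(awk -v p="$prefix" '
  NR == FNR { if ($0 ~ /^name=/) names[substr($0, 6)] = 1; next }
  {
    for (n in names) gsub("=" n, "=" p n)
    print
  }' "$tmp" "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

Because the result is printed rather than written back, the original file is left untouched; redirect the output to a new file if you want to keep it.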
How can I use sed to add a dynamic prefix to each number in an integer list?
For example:
I have a string "A-1,2,3,4,5", I want to transform it to string "A-1,A-2,A-3,A-4,A-5" - which means I want to add prefix of first integer i.e. "A-" to each number of the list.
If I have string like "B-1,20,300" then I want to transform it to string "B-1,B-20,B-300".
I am not able to use RegEx Capturing Groups because for global match they do not retain their value in subsequent matches.
When it comes to looping constructs in sed, I like to use newlines as markers for the places I have yet to process. This makes matching much simpler, and I know they're not in the input because my input is a text line.
For example:
$ echo A-1,2,3,4,5 | sed 's/,/\n/g;:a s/^\([^0-9]*\)\([^\n]*\)\n/\1\2,\1/; ta'
A-1,A-2,A-3,A-4,A-5
This works as follows:
s/,/\n/g # replace all commas with newlines (insert markers)
:a # label for looping
s/^\([^0-9]*\)\([^\n]*\)\n/\1\2,\1/ # replace the next marker with a comma followed
# by the prefix
ta # loop unless there's nothing more to do.
The approach is similar to @potong's, but I find the regex much more readable -- \([^0-9]*\) captures the prefix, \([^\n]*\) captures everything up to the next marker (i.e. everything that's already been processed), and then it's just a matter of reassembling it in the substitution.
Don't use sed, just use the other standard UNIX text manipulation tool, awk:
$ echo 'A-1,2,3,4,5' | awk '{p=substr($0,1,2); gsub(/,/,"&"p)}1'
A-1,A-2,A-3,A-4,A-5
$ echo 'B-1,20,300' | awk '{p=substr($0,1,2); gsub(/,/,"&"p)}1'
B-1,B-20,B-300
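Note that substr($0,1,2) assumes the prefix is exactly two characters long. A small sketch that instead takes the prefix to be the leading run of non-digit characters might look like this; the function name add_prefix and the AB-1,2,3 input are made up for illustration, and the prefix is assumed not to contain & or backslashes.

```shell
# add_prefix: hypothetical helper, reads lines like "A-1,2,3" on stdin.
add_prefix() {
  awk '{
    match($0, /^[^0-9]*/)        # leading run of non-digits
    p = substr($0, 1, RLENGTH)   # that run is taken as the prefix
    gsub(/,/, "&" p)             # "&" stands for the matched comma
    print
  }'
}

out1=$(printf '%s\n' 'A-1,2,3,4,5' | add_prefix)
out2=$(printf '%s\n' 'AB-1,2,3' | add_prefix)
printf '%s\n' "$out1" "$out2"
```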
This might work for you (GNU sed):
sed -E ':a;s/^((([^-]+-)[^,]+,)+)([0-9])/\1\3\4/;ta' file
Uses pattern matching and a loop to replace a number following a comma by the first column prefix and that number.
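For comparison, the same "loop until no comma is followed by a digit" idea can be sketched with only basic regular expressions, so it does not depend on -E. This is an illustrative variant, not the command above: it assumes the prefix is non-empty and does not start with a digit (otherwise the loop would have nothing to stop it), and the function name prefix_list is made up.

```shell
# prefix_list: hypothetical helper, reads lines like "B-1,20,300" on stdin.
prefix_list() {
  # \1 = non-empty prefix, \2 = everything up to the last comma that is
  # still followed by a digit, \3 = that digit. Each pass fixes one number.
  sed -e ':a' \
      -e 's/^\([^0-9][^0-9]*\)\(.*,\)\([0-9]\)/\1\2\1\3/' \
      -e 'ta'
}

out=$(printf '%s\n' 'B-1,20,300' | prefix_list)
printf '%s\n' "$out"
```

The loop ends on its own because every processed comma is then followed by the prefix rather than by a digit.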
Assuming this is for shell scripting (the set syntax below is csh/tcsh), you can do so with two seds:
set string = "A-1,2,3,4,5"
set prefix = `echo $string | sed 's/^\([A-Z]\).*/\1/'`
echo $string | sed 's/,\([0-9]\)/,'$prefix'-\1/g'
Output is
A-1,A-2,A-3,A-4,A-5
With
set string = "B-1,20,300"
Output is
B-1,B-20,B-300
Could you please try the following (if you are OK with awk)? Note that it hard-codes the A- prefix.
awk '
BEGIN{
FS=OFS=","
}
{
for(i=1;i<=NF;i++){
if($i !~ /^A/&&$i !~ /\"A/){
$i="A-"$i
}
}
}
1' Input_file
If your data is in a file named d, this works (tested with GNU sed):
sed -E 'h;s/^(\w-).+/\1/;x;G;:s s/,([0-9]+)(.*\n(.+))/,\3\1\2/;ts; s/\n.+//' d
How can I use perl, awk, or sed to search for all occurrences of text wrapped in quotes within a file, and print the result of deleting those occurrences from the file? I do not want to actually alter the file, but simply print the result of altering the file like sed does.
For example, say the file contains the following :
data|more data|"not important"|"more unimportant stuff"
I need it to print out:
data|more data||
But I want to leave the file intact. I tried using sed, but I could not get it to accept regexes.
I have tried something like this:
sed -e 's/\<["]+[^"]*["]+\>//g' file.txt
but it does nothing and prints the original file.
Any Thoughts?
Using a perl one-liner:
perl -pe 's/".*?"//g' file
Explanation:
Switches:
-p: Creates a while(<>){...; print} loop for each line in your input file.
-e: Tells perl to execute the code on command line.
You seem to have a few extra characters in your sed command.
sed -e 's/"[^"]*"//g' file.txt
Input:
"quoted text is here" but not quoted there
never more
"hello world" foo bar
data|more data|"not important"|"more unimportant stuff"
Output:
but not quoted there
never more
foo bar
data|more data||
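Since the question stresses that the file must stay intact, here is a small sketch that checks this: without -i, sed only writes the edited text to standard output. The temporary file and variable names are just for illustration.

```shell
# Put the sample line from the question into a temporary file.
tmp=$(mktemp)
printf '%s\n' 'data|more data|"not important"|"more unimportant stuff"' > "$tmp"

before=$(cat "$tmp")
out=$(sed 's/"[^"]*"//g' "$tmp")   # no -i, so the file is only read
after=$(cat "$tmp")

printf '%s\n' "$out"
[ "$before" = "$after" ] && echo "file unchanged"
rm -f "$tmp"
```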
echo 'data|more data|"not important"|"more unimportant stuff"' | sed -E 's/"[^"]*"//g'
You don't need to declare a character class (brackets) for only one character...
my $cnt=qq(data|more data|"not important"|"more unimportant stuff");
my @arr = $cnt =~ m{(?:^|\|)([^"][^\|]*[^"])(?=\||$)}ig;
print "@arr";
This code might help you.
I have the following csv file:
hd1,100
hd2,200
I'd like to change it so it reads like this:
hard1drive,100
hard2drive,200
I thought sed could help:
sed 's/hd[0-9]/hard[0-9]drive/' < infile.csv
but instead of the desired output I get:
hard[0-9]drive,100
hard[0-9]drive,200
Is there any way I can 'capture' the number from the search parameter and insert it within the replace parameter within sed, or am I going to have to use another command?
Use capturing groups
sed 's/hd\([0-9]\)/hard\1drive/'
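The group above captures a single digit. If the numbers can have several digits, a slightly stricter sketch could anchor the match to the first column and capture the whole run of digits. The hd12 row is a made-up example, not part of the question's data.

```shell
# \([0-9][0-9]*\) captures one or more digits; ^ and the trailing comma
# restrict the match to the first CSV column.
out=$(printf '%s\n' 'hd1,100' 'hd2,200' 'hd12,300' |
  sed 's/^hd\([0-9][0-9]*\),/hard\1drive,/')
printf '%s\n' "$out"
```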
option without grouping:
kent$ echo "hd1,100
hd2,200"|sed 's/d[0-9]/ar&drive/'
hard1drive,100
hard2drive,200
I would like to print part between regex match like this:
echo "this is foo and another foo quux" | sed 's/this\(.*\)another.*/\1/'
which prints
is foo and
which is perfectly OK, as I want the part between this and another printed.
But, If I want to parse my source code and use:
cat source_code | sed 's/.*AdulterateFood\(.*\)DangerousFood.*/\1/'
and I know that AdulterateFood and DangerousFood each occur only once in the source code, it still prints everything, the whole file. I am wondering why. AdulterateFood and DangerousFood are on different lines.
Thank you for your suggestions.
sed prints each input line by default. If you don't want that behavior you need to add the -n option. If you then want it to print the lines that match your RE you have to add a "p" to the end of the substitution command to tell sed TO print that line. So this:
sed -n 's/.*AdulterateFood\(.*\)DangerousFood.*/\1/p' source_code
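seems to be what you're asking for, but since you didn't provide any sample input and expected output, it's just a guess. Note that sed applies the substitution to one line at a time, so it only matches when both markers are on the same line.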
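To illustrate the effect of -n together with the p flag, here is a small sketch using the sentence from the question plus one made-up line that does not match. Only the line on which the substitution succeeded is printed.

```shell
out=$(printf '%s\n' 'this is foo and another foo quux' 'unrelated line' |
  sed -n 's/this\(.*\)another.*/\1/p')
# The captured text keeps its surrounding spaces, so show it in brackets.
printf '[%s]\n' "$out"
```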
To print all lines between AdulterateFood and DangerousFood:
sed -n '/AdulterateFood/,/DangerousFood/p' file
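The range above prints whole lines, markers included. If only the text between the two markers is wanted, one possible sketch is to strip the markers inside the range. The sample lines are invented, and the sketch assumes the two markers sit on different lines, as described in the question.

```shell
out=$(printf '%s\n' 'int a;' 'x AdulterateFood first' 'middle' \
                    'last DangerousFood y' 'int b;' |
  sed -n '/AdulterateFood/,/DangerousFood/{
    s/.*AdulterateFood//
    s/DangerousFood.*//
    p
  }')
printf '%s\n' "$out"
```

The first substitution removes everything up to and including the opening marker, the second removes the closing marker and whatever follows it, and lines in between are printed as they are.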