Inserting text with many newlines with gnu sed - regex

I have a mainfile.txt that contains
*
* Some text
*
Using the command
while read file; do gsed -i '1i'"$(cat mainfile.txt)" "$file";
I insert the text from mainfile.txt into the beginning of every file that matches some criteria. However, it seems like the different lines in mainfile.txt are causing trouble. The error says gsed: can't find label for jump to `o'. This error does not occur when mainfile.txt contains one line only. When trying to find a solution I only found out how to insert new lines in sed, which is not exactly what I am looking for.

i requires that each line to insert end in a backslash, except the last. If that's not the case for your file, it won't work.
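As a rough sketch of that rule, the backslashes can be added on the fly instead of being stored in mainfile.txt. The scratch directory, target.txt, and the variable names below are made up for illustration, and the result goes through a temporary file rather than -i so the example does not depend on GNU-only options.

```shell
# Illustrative setup: a scratch directory with sample files (hypothetical names).
workdir=$(mktemp -d)
printf '%s\n' '*' '* Some text' '*' > "$workdir/mainfile.txt"
printf '%s\n' 'first line' 'second line' > "$workdir/target.txt"

# Double any existing backslashes, then end every line but the last with one.
text=$(sed -e 's/\\/\\\\/g' -e '$!s/$/\\/' "$workdir/mainfile.txt")

# "1i\" followed by a newline and the prepared text inserts it before line 1.
sed "1i\\
$text" "$workdir/target.txt" > "$workdir/target.new" &&
  mv "$workdir/target.new" "$workdir/target.txt"

cat "$workdir/target.txt"
```

The extra sed pass is the price of using i here; the r-based and cat-based approaches avoid it because they never put the text inside a sed script.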
ed is a better choice for editing files than the non-standard sed -i, though if you're restricting yourself to GNU sed/gsed instead of whatever the OS-provided system sed is, that's less of an issue.
With either command, the best solution to insert the contents of one file in another is to use the r command instead to read the contents of a file into the buffer after the addressed line (It acts more like a than i that way):
printf "%s\n" "0r mainfile.txt" w | ed -s "$file"
Unfortunately, sed doesn't take an address of 0 to mean "before the first line" like ed does so it's harder to use r here in it.
Of course, just prepending one file to another can easily be done without either command:
cat mainfile.txt "$file" > temp.txt && mv -f temp.txt "$file"
or using sponge(1) from the moreutils package:
cat mainfile.txt "$file" | sponge "$file"
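Applied to the loop from the question, the temp-file approach might look like the sketch below. The names a.txt and b.txt stand in for whatever files the asker's criteria would select, and the scratch directory exists only to make the example self-contained.

```shell
# Illustrative setup (hypothetical file names).
workdir=$(mktemp -d)
printf '%s\n' '*' '* Some text' '*' > "$workdir/mainfile.txt"
printf '%s\n' 'alpha' > "$workdir/a.txt"
printf '%s\n' 'beta' > "$workdir/b.txt"

# Prepend mainfile.txt to every file named on standard input, one per line.
printf '%s\n' "$workdir/a.txt" "$workdir/b.txt" |
while IFS= read -r file; do
  tmp=$(mktemp) || continue
  # Replace the original only if the concatenation succeeded.
  cat "$workdir/mainfile.txt" "$file" > "$tmp" && mv -f "$tmp" "$file"
done

cat "$workdir/a.txt"
```

One thing to keep in mind with this design: the replaced file takes on the permissions of the temporary file, so they may need to be restored afterwards.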

This might work for you (GNU sed):
sed -i '1ecat mainfile.txt' "$file"
On the first line only of $file, run the command cat mainfile.txt and emit its output, then print all lines as normal.
$file will be updated with the lines of mainfile.txt prepended.
Alternative if the $file has at least 2 lines:
sed -i -e '1h;1r mainfile.txt' -e '1d;2H;2g' "$file"


How to pass a variable line number in sed substitute command

I am trying to do a sed operation like this
sed -i '100s/abc/xyz/' filename.txt
I wanted 100 in a variable say $var from a perl script. So, I am trying like this
system("sed -i "${vars}s/abc/xyz/" filename.txt").
This is throwing some error.
Again when I am doing like this putting system command in single quotes:
system('sed -i "${vars}s/abc/xyz/" filename.txt')
this is substituting wrongly. What can be done?
It is better and safer to use the LIST variant of system, because it avoids unsafe shell command line parsing. The command, sed in your case, will receive the command line arguments unaltered and without the need to quote them.
NOTE: I added -MO=Deparse just to illustrate what the one-liner compiles to.
NOTE: I added -e to be on the safe side as you have -i on the command line which expects a parameter.
$ perl -MO=Deparse -e 'system(qw{sed -i -e}, "${vars}s/abc/xyz/", qw{filename.txt})'
system(('sed', '-i', '-e'), "${vars}s/abc/xyz/", 'filename.txt');
-e syntax OK
Of course in reality it would be easier just to do the processing in Perl itself instead of calling sed...
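The quoting problem in the question can be reproduced in the shell alone. In this sketch (sample.txt and the shell variable vars are stand-ins), the first command has the line number filled in before sed sees it. The second mimics what the single-quoted Perl string hands to the shell: Perl leaves ${vars} untouched, the shell finds no such variable, and the address silently disappears.

```shell
# Illustrative setup: three identical lines in a scratch file.
workdir=$(mktemp -d)
printf '%s\n' 'abc' 'abc' 'abc' > "$workdir/sample.txt"

# Line number known to the shell: only line 2 is changed.
vars=2
one_line=$(sed "${vars}s/abc/xyz/" "$workdir/sample.txt")

# Variable unset, as when Perl passes the literal text ${vars} to the shell:
# the expression collapses to s/abc/xyz/ and every line is changed.
unset vars
all_lines=$(sed "${vars}s/abc/xyz/" "$workdir/sample.txt")

printf '%s\n' "$one_line" '---' "$all_lines"
```

That is most likely the "substituting wrongly" behaviour described in the question, and it is one more reason the LIST form of system is attractive: no shell is involved, so only Perl's own interpolation applies.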
Shelling out to sed from within perl is a road to unnecessary pain. You're introducing additional quoting and variable expansion layers, and that's at best making your code less clear, and at worst introducing bugs accidentally.
Why not just do it in native perl, which is considerably more effective? Perl even allows you to do in place editing if you want.
But it's as simple as:
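open ( my $input, '<', 'filename.txt' ) or die "Cannot read filename.txt: $!";
open ( my $output, '>', 'filename.txt.new' ) or die "Cannot write filename.txt.new: $!";
select $output;    # make the bare print below write to the new file
while ( <$input> ) {
    if ( $. == $vars ) {    # $. is the current input line number
        s/abc/xyz/;
    }
    print;
}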
Or if you're really keen on the in place edit, you can look into setting $^I:
Perl in place editing within a script (rather than one liner)
But I'd suggest 'just' renaming the file after you're done is as easy.

combine two sed commands [duplicate]

How can I combine the following two sed commands.
One loops all files in a directory and removes the first line from them.
The other removes any double quotes " from the start of file lines.
Remove first line of each file
for each in `/bin/ls -1`;do sed -i 1d $each;done
Beginning of line
for each in `/bin/ls -1`;do sed -i 's/^"//g' $each;done
You can do:
for each in *; do sed -i.bak 's/^"//g; 1d' "$each"; done
You can put them into the same invocation of sed like this:
for f in *; do sed -i '1d;s/^"//' "$f"; done
As well as combining the two sed commands, I have also used a glob * rather than attempting to parse ls, which is never a good idea.
Also, your substitution needn't be global, as it can by definition only apply to each line once, so I removed the g modifier as well.
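A quick way to check that the combined script does both jobs is to run it without -i on a throwaway file. The sample data below is invented.

```shell
# Illustrative setup: a header line, a line with a leading quote, a plain line.
workdir=$(mktemp -d)
printf '%s\n' 'header to drop' '"quoted" line' 'plain line' > "$workdir/sample.txt"

# 1d deletes the first line; s/^"// strips one leading double quote elsewhere.
result=$(sed '1d;s/^"//' "$workdir/sample.txt")
printf '%s\n' "$result"
```

Only the quote at the start of the line is removed; the one after "quoted" stays, which matches the stated goal of stripping quotes from the start of lines.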
You can do it with a single command in this way:
sed -i.bak --separate '1d ; s/^"//' *
Explanation
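With --separate you're telling sed to treat the files separately. The default is to process them as one long stream, which doesn't work here because you are using addresses (1 in the first command): only the first line of the first file would be deleted. Note that in GNU sed, -i already implies --separate, so spelling it out is harmless but not strictly required.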
'1d ; s/^"//' just combines the two commands (separated by ;).
You can use -e to perform different sed commands in the same line: sed -e 'command_1' -e 'command_2' ... -e 'command_n'. You can also use sed 'command_1; command_2; ...; command_n'.
Let's use the first option and loop through the files:
for file in *
do
sed -i.bak -e '1d' -e 's/^"//' "$file"
done
Note also that I use for file in *, so that * expands to the files in the current directory. This is better than parsing the output of ls (Why you shouldn't parse the output of ls(1) is a good read).
Finally, it is good practice to create a backup file when using -i, as anubhava suggests. This way, you are always on the safe side :)

batch renaming of files with perl expressions

This should be a basic question for a lot of people, but I am a biologist with no programming background, so please excuse my question.
What I am trying to do is rename about 100,000 gzipped data files that have existing name of a code (example: XG453834.fasta.gz). I'd like to name them to something easily readable and parseable by me (example: Xanthomonas_galactus_str_453.fasta.gz).
I've tried to use sed, rename, and mmv, to no avail. If I run any of those commands as a one-off they work fine; it's only when I try to incorporate variables into a shell script that I run into problems. I'm not getting any errors, just no names are changed, so I suspect it's an I/O error.
Here's what my files look like:
#! /bin/bash
# change a bunch of file names
file=names.txt
while IFS=' ' read -r r1 r2;
do
mmv ''$r1'.fasta.gz' ''$r2'.fasta.gz'
# or I tried many versions of: sed -i 's/"$r1"/"$r2"/' *.gz
# and I tried many versions of: rename -i 's/$r1/$r2/' *.gz
done < "$file"
...and here's the first lines of my txt file with single space delimiter:
cat names.txt
#find #replace
code1 name1
code2 name2
code3 name3
I know I can do this with python or perl, but since I'm stuck here working on this particular script I want to find a simple solution to fixing this bash script and figure out what I am doing wrong. Thanks so much for any help possible.
Also, I tried to cat the names file (see comment from Ashoka Lella below) and then use awk to move/rename. Some of the files have variable names (but will always start with the code), so I am looking for a find & replace option to just replace the "code" with the "name" and preserve the file name structure.
I suspect I am not escaping the variable within the single tick of the perl expression, but I have pored over a lot of manuals and I can't find the way to do this.
If you're absolutely sure that the filenames don't contain spaces or tabs, you can try the following:
xargs -n2 < names.txt echo mv
This is a dry run (it only prints what it would do). If you are satisfied with the result, remove the echo.
If you want to be prompted when the target already exists, use
xargs -n2 < names.txt echo mv -i
If you NEVER want to allow overwriting of the target, use
xargs -n2 < names.txt echo mv -n
Again, remove the echo if you're satisfied.
I don't think that you need to be using mmv, a simple mv will do. Also, there's no need to specify the IFS, the default will work for you:
while read -r src dest; do mv "$src" "$dest"; done < names.txt
I have double quoted the variable names as it is generally considered good practice but in this case, a space in either of the filenames will result in read not working as you expect.
You can put an echo before the mv inside the loop to ensure that the correct command will be executed.
Note that in your file names.txt, the .fasta.gz suffix is already included, so you shouldn't be adding it inside the loop as well. Perhaps that was your problem?
This should rename every file named in column 1 to the name in column 2 of names.txt, provided the files are in the same folder as names.txt:
cat names.txt| awk '{print "mv "$1" "$2}'|sh
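None of the loops above skip the "#find #replace" header line or handle the case, mentioned in the question, where a file name merely starts with the code. A minimal sketch covering both follows. It assumes column 1 holds a code prefix and column 2 the replacement, with no spaces in either; the scratch directory and the _contigs file are invented for the example.

```shell
# Illustrative setup: two fake data files and a mapping file (names invented).
workdir=$(mktemp -d)
: > "$workdir/XG453834.fasta.gz"
: > "$workdir/XG453834_contigs.fasta.gz"
printf '%s\n' '#find #replace' 'XG453834 Xanthomonas_galactus_str_453' \
  > "$workdir/names.txt"

while read -r code name; do
  # Skip the header line and blank lines.
  case $code in ''|'#'*) continue ;; esac
  for path in "$workdir/$code"*; do
    [ -e "$path" ] || continue          # no file matched this code
    rest=${path#"$workdir/$code"}       # whatever follows the code in the name
    # -n refuses to overwrite an existing target; prefix with echo for a dry run.
    mv -n "$path" "$workdir/$name$rest"
  done
done < "$workdir/names.txt"

ls "$workdir"
```

Because the rest of each name is carried over unchanged, a single mapping line covers every file that begins with that code.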

How to line break in csh script

Hi, I am trying to create a script in csh where I have to cut the name field and print it to the screen. When I run the command ( cut /etc/passwd -f5 -d":" ) by itself it works fine and all the names are on different lines, but this doesn't happen when I put it in a script like this:
#!bin/csh
set name=`cut /etc/passwd -f5 -d":"`
echo $name
They all appear one after the other. I have tried many things but none work. What am I doing wrong?
Thanks
It's possible to do this -- but in my opinion it's not worth doing.
You can set a variable to a value that contains newlines, but the only way I know of to do so is to use a set command with a multi-line string, with backslashes to join the lines.
Here's how I did it:
% ( echo -n "set name = '" ; \
cut /etc/passwd -f5 -d":" | sed 's/$/\\\/' ; echo "'" ) \
>! tmp
% source tmp
% echo $name:q
I had to use $name:q rather than "$name"; when I type echo "$name" I get an Unmatched ". error.
As GigaWatt said in a comment, if all you want to do is display the result, you're better off just executing the cut command; there's no point in saving it in a variable.
If you need to use the output of the cut command more than once, you can save it to a file:
% cut /etc/passwd -f5 -d":" > tmp
and use the contents of the file -- or you can just re-run the cut command if you don't want to create a temporary file.
It's also worth noting that the /etc/passwd file doesn't necessarily contain information about all the accounts on a system. Some systems supplement it with NIS or LDAP. getent passwd accounts for all that (unless you have an old or limited system that doesn't have the getent command).
Bourne-based shells, including bash, tend to handle this kind of thing more cleanly. In bash:
$ name="$(cut /etc/passwd -f5 -d":")"
$ echo "$name"
Only the final newline at the end of the very last line is lost; the echo command will add it. Consider using a shell other than csh, or another scripting language like Python or Perl. csh is widely considered to be a poor scripting language.
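The difference the double quotes make can be seen in a small sketch. Here a printf with three made-up names stands in for the cut command, so the example does not depend on the contents of /etc/passwd.

```shell
# Stand-in for the cut command: three invented names, one per line.
name="$(printf '%s\n' 'Alice Example' 'Bob Example' 'Carol Example')"

# Quoted: the embedded newlines survive.
quoted=$(echo "$name")
# Unquoted: word splitting turns each newline into a single space.
unquoted=$(echo $name)

printf '%s\n' "$quoted"
printf '%s\n' "$unquoted"
```

The unquoted form reproduces the symptom from the question, with every name on one line.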
I found a Similar discussion. Basically, the new line characters are not saved in the variable when cut is run in a csh script. So we add a newline character using awk.
echo in csh doesn't understand the \n character, even with -e option. So we can just use printf. Here's the answer:
#!/bin/csh
set name=`cut /etc/passwd -f5 -d":" | awk '{printf("%s\\n", $0)}'`
printf "%b\n" "$name"
Harshad

a simple sed script displaying only changed lines

How could I make a separate sed script (let's call it script.sed) that would display only the changed lines without having to use the -n option while executing it? (Sorry for my English)
I have a file called data2.txt with digits and I need to change the lines ending with ".5" and print those changed lines out in the console.
I know how to do it with a single command (sed -n 's/.5$//gp' data2.txt), however our university professor requires us to do the same using sed -f script.sed data2.txt command.
Any ideas?
The following should work for your sed script:
s/.5$//gp
d
The -n option suppresses automatic printing of the line; another way to do that is to use the d command. From the man page:
d Delete pattern space. Start next cycle.
This works because the automatic printing of the line happens at the end of a cycle, and using the d command means you never reach the end of a cycle so no lines are printed automatically.
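A self-contained way to try this out is sketched below. The sample numbers are invented, and the dot is escaped as \. on the assumption that only a literal ".5" should match; with the unescaped form from the question, a line such as 15 would also match and be printed as an empty line.

```shell
# Illustrative setup: sample data, two lines of which end in ".5".
workdir=$(mktemp -d)
printf '%s\n' '12.5' '7' '3.5' '15' > "$workdir/data2.txt"

# script.sed: print lines where the substitution succeeded, then delete
# the pattern space so nothing is printed automatically.
cat > "$workdir/script.sed" <<'EOF'
s/\.5$//p
d
EOF

result=$(sed -f "$workdir/script.sed" "$workdir/data2.txt")
printf '%s\n' "$result"
```

The g flag is left out because a pattern anchored with $ can match at most once per line.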
This might work for you (GNU sed):
#n
s/.5$//p
Save this to a file and run as:
sed -f file.sed file.txt