combine two sed commands [duplicate]

How can I combine the following two sed commands?
One loops over all files in a directory and removes the first line from each.
The other removes a double quote (") from the start of each line.
Remove first line of each file
for each in `/bin/ls -1`;do sed -i 1d $each;done
Beginning of line
for each in `/bin/ls -1`;do sed -i 's/^"//g' $each;done

You can do:
for each in *; do sed -i.bak 's/^"//g; 1d' "$each"; done

You can put them into the same invocation of sed like this:
for f in *; do sed -i '1d;s/^"//' "$f"; done
As well as combining the two sed commands, I have also used a glob * rather than attempting to parse ls, which is never a good idea.
Also, your substitution needn't be global, as it can by definition only apply to each line once, so I removed the g modifier as well.
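Both points can be checked on sample input without touching any files. This is only a sketch: the `sample` function and its four lines are made up for illustration.

```shell
# Hypothetical sample: a header line, then lines that may start with a quote.
sample() {
    printf '%s\n' 'header' '"alpha"' '""beta' 'gamma'
}

# Combined script: drop line 1, then strip one leading double quote.
sample | sed '1d;s/^"//'

# Adding g changes nothing: ^ only matches at the start of the line,
# so the pattern can match at most once per line either way.
sample | sed '1d;s/^"//g'
```

Both invocations print the same three lines; the second quote in `""beta` survives because it is no longer at the start of the line.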

You can do it with a single command in this way:
sed -i.bak --separate '1d ; s/^"//' *
Explanation
With --separate you're telling sed to treat the files separately. The default is to process them as one long stream, but you are using addresses (1 in the first command), so the default doesn't work.
'1d ; s/^"//' just combines the two commands (separated by ;).
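The difference between the default and --separate is easy to see without -i. The following is a minimal sketch assuming GNU sed; the file names a.txt and b.txt are placeholders created in a throwaway directory.

```shell
# Work in a throwaway directory so nothing real is modified.
dir=$(mktemp -d)
printf '%s\n' 'a1' 'a2' > "$dir/a.txt"
printf '%s\n' 'b1' 'b2' > "$dir/b.txt"

# Default: both files form one stream, so 1d removes only a1.
joined=$(sed '1d' "$dir/a.txt" "$dir/b.txt")

# --separate (-s): line numbers restart per file, so a1 and b1 both go.
separate=$(sed --separate '1d' "$dir/a.txt" "$dir/b.txt")

printf 'joined:\n%s\nseparate:\n%s\n' "$joined" "$separate"
rm -rf "$dir"
```

The GNU sed manual describes -i as implying -s, so with in-place editing the extra flag should be redundant there, though it does no harm.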

You can use -e to perform different sed commands in the same invocation: sed -e 'command_1' -e 'command_2' ... -e 'command_n'. You can also use sed 'command_1; command_2; ...; command_n'.
Let's use the first option and loop through the files:
for file in *
do
sed -i.bak -e '1d' -e 's/^"//' "$file"
done
Note also that I use for file in *, so that * expands to the files in the current directory. This is better than parsing the output of ls (Why you shouldn't parse the output of ls(1) is a good read).
Finally, it is good practice to create a backup file when using -i, as anubhava suggests. This way, you are always on the safe side :)
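To see what the backup suffix buys you, the loop can be tried in a scratch directory first. A sketch assuming GNU sed (BSD sed spells the option differently); sample.txt and its contents are invented for the demonstration.

```shell
dir=$(mktemp -d)
printf '%s\n' 'header' '"quoted' 'plain' > "$dir/sample.txt"

for file in "$dir"/*
do
    # -i.bak edits in place and keeps the original as sample.txt.bak
    sed -i.bak -e '1d' -e 's/^"//' "$file"
done

edited=$(cat "$dir/sample.txt")       # header gone, leading quote stripped
backup=$(cat "$dir/sample.txt.bak")   # untouched original
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"
rm -rf "$dir"
```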

Related

Inserting text with many newlines with gnu sed

I have a mainfile.txt that contains
*
* Some text
*
Using the command
while read file; do gsed -i '1i'"$(cat mainfile.txt)" "$file";
I insert the text from mainfile.txt into the beginning of every file that matches some criteria. However, it seems like the different lines in mainfile.txt are causing trouble. The error says gsed: can't find label for jump to `o'. This error does not occur when mainfile.txt contains one line only. When trying to find a solution I only found out how to insert new lines in sed, which is not exactly what I am looking for.

i requires that each line to insert end in a backslash, except the last. If that's not the case for your file, it won't work.
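For reference, this is roughly what the continuation-line form looks like when the three lines are written into the script by hand. It is a sketch only: `insert_banner` is a hypothetical wrapper name, and the text lines are deliberately left unindented because some sed implementations strip leading whitespace from inserted text.

```shell
# insert_banner [FILE...]: print the input with the three banner lines
# inserted before line 1. Every text line except the last ends in a backslash.
insert_banner() {
sed '1i\
*\
* Some text\
*' "$@"
}

printf '%s\n' 'body' | insert_banner
```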
ed is a better choice for editing files than the non-standard sed -i, though if you're restricting yourself to GNU sed/gsed instead of whatever the OS-provided system sed is, that's less of an issue.
With either command, the best solution to insert the contents of one file in another is to use the r command instead to read the contents of a file into the buffer after the addressed line (It acts more like a than i that way):
printf "%s\n" "0r mainfile.txt" w | ed -s "$file"
Unfortunately, sed doesn't take an address of 0 to mean "before the first line" like ed does so it's harder to use r here in it.
Of course, just prepending one file to another can easily be done without either command:
cat mainfile.txt "$file" > temp.txt && mv -f temp.txt "$file"
or using sponge(1) from the moreutils package:
cat mainfile.txt "$file" | sponge "$file"
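The cat-and-mv approach can be wrapped so that the temporary file gets a unique name and is cleaned up on failure. A minimal sketch, assuming mktemp is available; `prepend_file` is a hypothetical helper name, not a standard command.

```shell
# prepend_file HEADER TARGET: put HEADER's contents at the top of TARGET.
prepend_file() {
    header=$1
    target=$2
    # Create the temp file next to the target so mv stays on one filesystem.
    tmp=$(mktemp "$target.XXXXXX") || return 1
    if cat "$header" "$target" > "$tmp"; then
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would be `prepend_file mainfile.txt "$file"` inside the loop. Note that the replaced file ends up with the permissions mktemp gave the temporary file, which may differ from the original's.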

This might work for you (GNU sed):
sed -i '1ecat mainfile.txt' "$file"
On first line only of the file $file, evaluate the command cat mainfile.txt, then print all lines as normal.
$file will be updated with the lines of mainfile.txt prepended.
Alternative if the $file has at least 2 lines:
sed -i -e '1h;1r mainfile.txt' -e '1d;2H;2g' "$file"

Converting LaTeX pmatrix command to amsmath pmatrix environment using sed

I have an old LaTeX document (with a lot of formatting commands) that I want to convert to the more modern LaTeX (I want to do the update for several reasons, not the least of which is to reduce the coupling between content and formatting). At any rate, the document has a lot of calls to the deprecated command \pmatrix{ .... } which I would like to replace with the new amsmath command \begin{pmatrix} ... \end{pmatrix}. I have been trying to use sed to do this conversion but I have never used it before and I am having trouble.
Here is a MWE
LaTeX input string
\pmatrix{0&0\cr \frac{1}{2}&0\cr 0&0\cr}\pmatrix{1&1\cr 1&1\cr 1&1\cr}
with the expected output
\begin{pmatrix}0&0\\ \frac{1}{2}&0\\ 0&0\end{pmatrix}\begin{pmatrix}1&1\\ 1&1\\ 1&1\end{pmatrix}
The commands that I have been trying to use are variants of the following
sed 's/\\pmatrix{\(.*\cr[ ]*\)}/\\begin{pmatrix}\1 \\end{pmatrix}/g' <$WORKING_FILE >$OUTPUT_FILE
but the closest output that I have been able to achieve is
\begin{pmatrix}0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}
I am pretty sure that the problem is related to having two calls to pmatrix side by side, but I am not sure how to modify the regex to make this work.
I have searched google, but being so new to regex, I just got confused by all of the variations out there and which to use, and how to properly format such a thing.

The following might work for you:
sed -re 's/(\\pmatrix)\{([^}]*)}/\\begin{pmatrix}\2\\end{pmatrix}/g' -e 's/\\cr/\\\\/g' -e 's/\\\\\\end/\\end/g' inputfile
This works by:
substituting \pmatrix{...} with \begin{pmatrix}...\end{pmatrix}
substituting \cr with \\
handling \\\end to make it \end
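The three steps can be tried on a small input read from standard input. This sketch uses -E in place of -r (both select extended regular expressions in GNU sed); `convert_pmatrix` is a hypothetical wrapper name and the one-matrix input is made up.

```shell
# convert_pmatrix: apply the three substitutions to standard input.
convert_pmatrix() {
    sed -E \
        -e 's/(\\pmatrix)\{([^}]*)}/\\begin{pmatrix}\2\\end{pmatrix}/g' \
        -e 's/\\cr/\\\\/g' \
        -e 's/\\\\\\end/\\end/g'
}

# Step 1 rewrites the wrapper, step 2 turns each \cr into \\,
# step 3 drops the \\ that ends up directly before \end.
printf '%s\n' '\pmatrix{1&1\cr 1&1\cr}' | convert_pmatrix
```

Because `[^}]*` stops at the first closing brace, an entry such as \frac{1}{2} inside the matrix ends the match too early, which is the limitation the edit that follows works around.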
EDIT: As per your update, you might be better off splitting the relevant parts using grep before piping to sed:
grep -oP '\\pmatrix.*?\\cr}' inputfile | sed -re 's/\\pmatrix\{(.*)}/\\begin{pmatrix}\1\\end{pmatrix}/g;s/\\cr/\\\\/g;s/\\\\\\end/\\end/g'

This might work for you (GNU sed):
sed -r 's/\\cr/\n/g;s/\\(pmatrix)\{([^\n]*)\n([^\n]*)\n([^\n]*)\n\}/\\begin{\1}\2\\\\ \3\\\\ \4\\end{\1}/g;s/\n/\\cr/g' file
Convert each \cr to a newline. Do a global substitution command. Then convert any newlines that are left back to \cr.

a simple sed script displaying only changed lines

How could I make a separate sed script (let's call it script.sed) that would display only the changed lines without having to use the -n option while executing it? (Sorry for my English)
I have a file called data2.txt with digits and I need to change the lines ending with ".5" and print those changed lines out in the console.
I know how to do it with a single command (sed -n 's/.5$//gp' data2.txt), however our university professor requires us to do the same using sed -f script.sed data2.txt command.
Any ideas?

The following should work for your sed script:
s/.5$//gp
d
The -n option suppresses automatic printing of the line; another way to do that is to use the d command. From the man page:
d Delete pattern space. Start next cycle.
This works because the automatic printing of the line happens at the end of a cycle, and using the d command means you never reach the end of a cycle so no lines are printed automatically.
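A small end-to-end run of this idea, as a sketch: the data values are invented, and the dot is escaped here (an adjustment, not part of the script above) so that only a literal ".5" matches rather than any character followed by 5.

```shell
dir=$(mktemp -d)
printf '%s\n' '1.5' '2.0' '13.5' > "$dir/data2.txt"

# script.sed: print a line only when the substitution succeeded, then
# delete the pattern space so the end-of-cycle autoprint never happens.
cat > "$dir/script.sed" <<'EOF'
s/\.5$//p
d
EOF

changed=$(sed -f "$dir/script.sed" "$dir/data2.txt")
printf '%s\n' "$changed"
rm -rf "$dir"
```

Only the two changed lines are printed; 2.0 is deleted silently.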

This might work for you (GNU sed):
#n
s/.5$//p
Save this to a file and run as:
sed -f file.sed file.txt
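The #n marker only has this effect as the first two characters of the script file, alone on the first line; anywhere else it is an ordinary comment. A sketch comparing the two placements (the data values are made up, the dot is escaped as a precaution, and late.sed is a hypothetical second script):

```shell
dir=$(mktemp -d)
printf '%s\n' '1.5' '2.0' > "$dir/file.txt"

# "#n" on line 1: behaves like running sed with -n.
printf '%s\n' '#n' 's/\.5$//p' > "$dir/file.sed"
quiet=$(sed -f "$dir/file.sed" "$dir/file.txt")

# "#n" on line 2: just a comment, so autoprint stays on and the
# changed line shows up twice (once from p, once from autoprint).
printf '%s\n' 's/\.5$//p' '#n' > "$dir/late.sed"
noisy=$(sed -f "$dir/late.sed" "$dir/file.txt")

printf 'quiet:\n%s\nnoisy:\n%s\n' "$quiet" "$noisy"
rm -rf "$dir"
```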

Unpredictable behavior in sed interpreters output from multiple expressions

Why does GNU sed sometimes handle substitution with piped output into another sed instance differently than when multiple expressions are used with the same one?
Specifically, for msys/mingw sessions, in the /etc/profile script I have a series of manipulations that "rearrange" the order of the environment variable PATH and removes duplicate entries.
Take note that while sed normally treats each line of input separately (and therefore can't easily substitute '\n' in the input stream), this sed statement substitutes ':' with '\n', so it still handles the entire input stream as one line (with '\n' characters in it). This behavior holds for all sed expressions in the same instance of sed (basically until you redirect or pipe the output into another program).
Here's the obligatory specs:
Windows 7 Professional Service Pack 1
HP Pavilion dv7-6b78us
16 GB DDR3 RAM
MinGW-w64 (x86_64-w64-mingw32-gcc-4.7.1.2-release-win64-rubenvb) mounted on /mingw/
MSYS (20111123) mounted on / and on /usr/
$ uname -a="MINGW32_NT-6.1 CHRIV-L09 1.0.17(0.48/3/2) 2011-04-24 23:39 i686 Msys"
$ which sed="/bin/sed.exe" (it's part of MSYS)
$ sed --version="GNU sed version 4.2.1"
This is the contents of PATH before manipulation:
PATH='.:/usr/local/bin:/mingw/bin:/bin:/c/PHP:/c/Program Files (x86)/HP SimplePass 2011/x64:/c/Program Files (x86)/HP SimplePass 2011:/c/Windows/system32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0:/c/si:/c/android-sdk:/c/android-sdk/tools:/c/android-sdk/platform-tools:/c/Program Files (x86)/WinMerge:/c/ntp/bin:/c/GnuWin32/bin:/c/Program Files/MySQL/MySQL Server5.5/bin:/c/Program Files (x86)/WinSCP:/c/Program Files (x86)/Overlook Fing 2.1/bin:/c/Program Files/7-zip:.:/c/Program Files/TortoiseGit/bin:/c/Program Files (x86)/Git/bin:/c/VS10/VC/bin/x86_amd64:/c/VS10/VC/bin/amd64:/c/VS10/VC/bin'
This is an excerpt of /etc/profile (where I have begun the PATH manipulation):
set | grep --color=never ^PATH= | sed -e "s#^PATH=##" -e "s#'##g" \
-e "s/:/\n/g" -e "s#\n\(/[^\n]*tortoisegit[^\n]*\)#\nZ95-\1#ig" \
-e "s#\n\(/[a-z]/win\)#\nZ90-\1#ig" -e "s#\n\(/[a-z]/p\)#\nZ70-\1#ig" \
-e "s#\.\n#A10-.\n#g" -e "s#\n\(/usr/local/bin\)#\nA15-\1#ig" \
-e "s#\n\(/bin\)#\nA20-\1#ig" -e "s#\n\(/mingw/bin\)#\nA25-\1#ig" \
-e "s#\n\(/[a-z]/vs10/vc/bin\)#\nA40-\1#ig"
The last sed expression in that line basically looks for lines that begin with "/c/VS10/VC/bin" and prepends 'A40-' to them, like this:
...
/c/si
A40-/c/VS10/VC/bin
A40-/c/VS10/VC/bin/amd64
A40-/c/VS10/VC/bin/x86_amd64
/c/GnuWin32/bin
...
I like my sed expressions to be flexible (path structures change), but I don't want it to match the lines that end with amd64 or x86_amd64 (those are going to have a different string prepended). So I change the last expression to:
-e "s#\n\(/[a-z]/vs10/vc/bin\)\n#\nA40-\1\n#ig"
This works:
...
/c/si
A40-/c/VS10/VC/bin
/c/VS10/VC/bin/amd64
/c/VS10/VC/bin/x86_amd64
/c/GnuWin32/bin
...
Then, (to match any "line" matching the pseudocode "/x/.../bin") I change the last expression to:
-e "s#\n\(/[a-z]/.*/bin\)\n#\nA40-\1\n#ig"
Which produces:
...
/c/si
/c/VS10/VC/bin
/c/VS10/VC/bin/amd64
/c/VS10/VC/bin/x86_amd64
/c/GnuWin32/bin
...
??? - sed didn't match any character ('.') any number of times ('*') in the middle of the line ???
But, if I pipe the output into a different instance of sed (and compensate for sed handling each "line" separately) like this:
| sed -e "s#^\(/[a-z]/.*/bin\)$#A40-\1#ig"
I get:
sed: -e expression #1, char 30: unterminated `s' command
??? How is that unterminated? It's got all three '#' characters after the s, has the modifiers 'i' and 'g' after the third '#', and the entire expression is in double quotes ('"'). Also, there are no escapes ('\') immediately preceding the delimiters, and the delimiter is not a part of either the search or the replacement. Let's try a different delimiter than '#', like '~':
I use:
| sed -e "s~^\(/[a-z]/.*/bin\)$~A40-\1~ig"
and, I get:
...
/c/si
A40-/c/VS10/VC/bin
/c/VS10/VC/bin/amd64
/c/VS10/VC/bin/x86_amd64
A40-/c/GnuWin32/bin
...
And, that is correct! The only thing I changed was the delimiter from '#' to '~' and it worked ???
This is not (even close to) the first time that sed has produced unexplainable results for me.
Why, oh, why, is sed NOT matching syntax in an expression in the same instance, but IS matching when piped into another instance of sed?
And, why, oh, why, do I have to use a different delimiter when I do this (in order not to get an "unterminated 's' command")?
And the real reason I'm asking: Is this a bug in sed, OR, is it correct behavior that I don't understand (and if so, can someone explain why this behavior is correct)? I want to know if I'm doing it wrong, or if I need a different/better tool (or both, they don't have to be mutually exclusive).
I'll mark a response as the answer if someone can prove either why this behavior is correct or why it is a bug. I'll gladly accept any advice about other tools or different methods of using sed, but those won't answer the question.
I'm going to have to get better at other text processors (like awk, tr, etc.) because sed is costing me too much time with its unexplainable results.
P.S. This is not the complete logic of my PATH manipulation. The complete logic also finishes prepending all the lines with values from 'A00-' to 'Z99-', then pipes that output into 'sort -u -f' and back into sed to remove those same prefixes on each line and to convert the lines ('\n') back into colons (':'). Then "export PATH='" is prepended to the single line and "'" is appended to it. Then that output is redirected into a temporary file. Next, that temporary file is sourced. And, finally, that temporary file is removed.
The /etc/profile script also displays the contents of PATH before and after sorting (in case it screwed up the path).
P.P.S. I'm sure there is a much better way to do this. It started as some very simple sed manipulations, and grew into the monster you see here. Even if there is a better way, I still need to know why sed is giving me these results.

sed -e "s#^\(/[a-z]/.*/bin\)$#A40-\1#ig"
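is unterminated because, inside double quotes, the shell expands "$#" (the number of positional parameters) in "$#A", which swallows the second # delimiter before sed ever sees it. Put your expressions in single quotes to avoid this.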
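The expansion is easy to observe by printing the script text instead of handing it to sed. A sketch with a shortened expression; the two function names are hypothetical, and functions are used only so that $# has a known value (0 when called without arguments).

```shell
# Double quotes: the shell replaces $# with the argument count.
double_quoted() { printf '%s\n' "s#/bin$#A40-#"; }

# Single quotes: the text reaches sed exactly as written.
single_quoted() { printf '%s\n' 's#/bin$#A40-#'; }

double_quoted   # prints s#/bin0A40-#   (only two # left: unterminated)
single_quoted   # prints s#/bin$#A40-#  (all three delimiters intact)
```

This also fits the observation that switching to ~ helped: `$~` is not a special parameter in bash, so it is left alone.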
The expression
-e "s#\n\(/[a-z]/.*/bin\)\n#\nA40-\1\n#ig"
fails, or doesn't do what you expect, because . also matches the newline characters embedded in the pattern space, so .* can run across several "lines". Check your whole output: the A40- is at the very beginning. Change it to
-e "s#\n\(/[a-z]/[^\n]*/bin\)\n#\nA40-\1\n#ig"
and it might be more what you expect. This may very well be the case with most of your issues with multi-line modifications.
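The effect can be reproduced on a much shorter PATH. A sketch assuming GNU sed (for \n in the replacement and inside the bracket expression); the four-entry path is invented, and only /c/b/bin is meant to be tagged.

```shell
path='/x:/c/a/lib:/c/b/bin:/c/d'

# .* can cross the embedded newlines, so the match starts at /c/a/lib
# and the prefix lands on the wrong entry.
greedy=$(printf '%s\n' "$path" |
    sed -e 's/:/\n/g' -e 's#\n\(/[a-z]/.*/bin\)\n#\nA40-\1\n#')

# [^\n]* stays within one entry, so only /c/b/bin is tagged.
bounded=$(printf '%s\n' "$path" |
    sed -e 's/:/\n/g' -e 's#\n\(/[a-z]/[^\n]*/bin\)\n#\nA40-\1\n#')

printf 'greedy:\n%s\nbounded:\n%s\n' "$greedy" "$bounded"
```

One caveat with this style of pattern: each match consumes its trailing newline, so with the g flag two consecutive matching entries would not both be tagged.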
You can also put the statements, one per line, into a standalone file and invoke sed with sed -f editscript. It might make maintenance of this a bit easier.

find and replace within file

I have a requirement to search for a pattern which is something like :
timeouts = {default = 3.0; };
and replace it with
timeouts = {default = 3000.0;.... };
i.e., multiply the timeout by a factor of 1000.
Is there any way to do this for all files in a directory?
EDIT:
Please note that some of the files in the directory are symlinks. Is there any way to get this done for symlinks as well?
Please note that "timeouts" also occurs as a substring elsewhere in the files, so I want to make sure that only this line gets replaced. Any solution using sed, awk, or perl is acceptable.

Give this a try:
for f in *
do
sed -i 's/\(timeouts = {default = [0-9]\+\)\(\.[0-9]\+;\)\( };\)/\1000\2....\3/' "$f"
done
It will make the replacements in place for each file in the current directory. Some versions of sed require a backup extension after the -i option. You can supply one like this:
sed -i .bak ...
Some versions don't support in-place editing. You can do this:
sed '...' "$f" > tmpfile && mv tmpfile "$f"
Note that this is obviously not actually multiplying by 1000, so if the number is 3.1 it would become "3000.1" instead of 3100.0.
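The caveat shows up as soon as the fractional part is not zero. A sketch that runs the same substitution on standard input instead of in place, assuming GNU sed for \+; `timeout_sed` is a hypothetical wrapper name.

```shell
# timeout_sed: the substitution above, applied to standard input.
timeout_sed() {
    sed 's/\(timeouts = {default = [0-9]\+\)\(\.[0-9]\+;\)\( };\)/\1000\2....\3/'
}

# Three zeros are appended to the integer part; nothing is multiplied.
printf '%s\n' 'timeouts = {default = 3.0; };' | timeout_sed   # 3000.0
printf '%s\n' 'timeouts = {default = 3.1; };' | timeout_sed   # 3000.1, not 3100.0
```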

You can do this:
perl -pi -e 's/(timeouts\s*=\s*\{default\s*=\s*)([0-9.-]+)/$1 . $2*1000/e' *

One suggestion for whichever solution above you decide to use - it may be worth it to think through how you could refactor to avoid having to modify all of these files for a change like this again.
Do all of these scripts have similar functionality?
Can you create a module that they would all use for shared subroutines?
In the module, could you have a single line that would allow you to have a multiplier?
For me, anytime I need to make similar changes in more than one file, it's the perfect time to be lazy to save myself time and maintenance issues later.

$ perl -pi.bak -e 's/\w+\s*=\s*\{\s*\w+\s*=\s*\K(-?[0-9.]+)/sprintf "%0.1f", 1000 * $1/eg' *
Notes:
The regex matches just the number (see \K in perlre)
The /e means the replacement is evaluated
I include a sprintf in the replacement just in case you need finer control over the formatting
Perl's -i can operate on a bunch of files
EDIT
It has been pointed out that some of the files are symbolic links. Given that this process is not idempotent (running it twice on the same file is bad), you had better generate a unique list of files in case one of the links points to a file that appears elsewhere in the list. Here is an example with find, though the code for a pre-existing list should be obvious.
$ find -L . -type f -exec realpath {} \; | sort -u | xargs -d '\n' perl ...
(Assumes none of your filenames contain a newline!)
