Recursive multi-line replace: changing copyright headers

I'm trying to replace all of the copyright headers in my project (100+ files) with a new version. Currently I have something like this at the start of each file:
<?php
/**
* Project name
*
* #copyright Apache 2.0
* #author FooBar
*/
And I want all my files to start like this:
<?php
/**
* Copyright 2014 FooBar
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
I've already looked at:
this thread, which I can't get working. It does a partial replacement, keeping certain lines of the original text in the new text. I want a complete replacement.
this script, which similarly doesn't work for my use case. It replaces the very start of each file with a new header, which causes the existing content (<?php /** */) to be appended to the new comment, thereby causing parse errors.
Does anybody know how I can do a recursive multi-line file replace? Do I need to use sed/awk?
SOLUTION:
I just need to execute this bash script:
INPUT=../path
find "$INPUT" -name "*.php" -exec sed -i -e '2,/\*\//d; 1r copyright.txt' {} \;
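The same idea can be tried on a scratch tree first. The sketch below is only an illustration: the directory layout, the file names a.php and b.php, and the shortened header are made up. It writes to a temporary file and moves it into place instead of using sed -i, because the in-place option behaves differently in GNU and BSD sed.

```shell
#!/bin/sh
# Hypothetical scratch tree; directory and file names are made up for the example.
root=$(mktemp -d)
mkdir -p "$root/src/lib"
for f in "$root/src/a.php" "$root/src/lib/b.php"; do
  printf '%s\n' '<?php' '/**' '* Project name' '*/' 'echo 1;' > "$f"
done
printf '%s\n' '/**' '* Copyright 2014 FooBar' '*/' > "$root/copyright.txt"

# For every .php file: write the edited text to a temp file, then move it over the original.
find "$root/src" -name '*.php' -exec sh -c '
  hdr=$1; shift
  for f in "$@"; do
    sed -e "2,/\*\//d" -e "1r $hdr" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  done
' sh "$root/copyright.txt" {} +

# Keep the results for inspection, then clean up.
first=$(cat "$root/src/a.php")
second=$(cat "$root/src/lib/b.php")
printf '%s\n' "$first"
rm -rf "$root"
```

Running it against a copy of the real project before the real run is a cheap way to confirm that every file really starts with the expected two lines.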

Is it safe to assume all your files start with
<?php
/**
If so, you can use
sed '2,/\*\//d; 1r newSig.txt' input.txt
The first sed command deletes the old signature: everything from line 2 through the first line that contains */. You could use a dynamic (regex-to-regex) range instead, but that would also delete any other multi-line comments in the file. The second command reads the file newSig.txt, which holds your new signature, and appends it after line 1.
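A minimal sketch of those two commands on a throwaway file, assuming a sample named sample.php and a shortened newSig.txt (both are placeholders invented for the example):

```shell
#!/bin/sh
# Hypothetical dry run in a scratch directory; nothing outside it is modified.
tmpdir=$(mktemp -d)

# A sample file with the old header (placeholder content).
printf '%s\n' '<?php' '/**' '* Project name' '*' '* #author FooBar' '*/' 'echo "hi";' \
  > "$tmpdir/sample.php"

# The replacement header, shortened for the example.
printf '%s\n' '/**' '* Copyright 2014 FooBar' '*/' > "$tmpdir/newSig.txt"

# Delete line 2 through the first line containing */, and append the new header after line 1.
result=$(sed -e '2,/\*\//d' -e "1r $tmpdir/newSig.txt" "$tmpdir/sample.php")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The two commands are passed as separate -e expressions here so that the file name given to r is unambiguously the last thing in its expression.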

With GNU awk for a multi-char RS to read the whole file as a single string:
$ gawk -v RS='^$' -v hdr="\
/**
* Copyright 2014 FooBar
*
* Licensed under the blah blah blah
*/\
" '{sub(/\/\*[^/]+\*\//,hdr)}1' file
<?php
/**
* Copyright 2014 FooBar
*
* Licensed under the blah blah blah
*/

NOTE: you should read Ed Morton's comment too. If that is a problem, you can check the files and pass only the readable ones to your for loop before running the awk script.
If your files always start like this, one way to solve it with gawk is
awk 'FNR==1 { print $0
  print "INSERT YOUR new header here, using several print statements if it spans multiple lines."
  # if you don't mind keeping your old header, stop here and skip the rules below
}
FNR==2 && /^\/\*\*/ {
  # skip the old header, up to and including its closing */ line
  while ((getline) > 0) {
    if ($0 == "*/") { getline ; break }
  }
}
FNR>2 { print $0 }' INPUTFILE
And you can wrap it in a for loop, like
for file in *php ; do
  awk ... "$file" > "$file.new"
done

My approach does not depend on the file starting with the fixed lines
<?php
/**
It replaces the first comment block, from the first /** to the next */.
1) Save the replacement content in a file, update.txt (do not give it a .php suffix).
2) Then run this command on a single PHP file (abc.php) first, to confirm the result:
sed ':a;$!{N;ba};s!/[^/]*/!########!1' abc.php|sed -e '/########/{r update.txt' -e 'd}'
3) If the result looks right, run the script on all PHP files:
for file in *.php
do
  sed ':a;$!{N;ba};s!/[^/]*/!########!1' "$file"|sed -e '/########/{r update.txt' -e 'd}' > temp
  mv temp "$file"
done

Related

Read file until regex in changelog file using bash

I want to read the content of a file up to the point where a regex matches. I need this to report my latest changelog entry.
Example:
## <small>0.27.3 (2019-03-18)</small>
* Fix integration tests
* Change log message
## <small>0.27.2 (2019-03-18)</small>
* Change find to filter
* Fix bug in typo
* Format message in request
I want a regex that returns only the content of my latest version. Example:
## <small>0.27.3 (2019-03-18)</small>
* Fix integration tests
* Change log message
How can I do this using sed, grep or awk?
Thanks for this
Edit:
I managed to do it:
CHANGELOG_MESSAGE=$(head -n 1 CHANGELOG.md)$(head -n 20 CHANGELOG.md | tail -n 19 | sed '/##/q' | awk '!/##/')
I think this solution is a bit complex, but it works.
Try this:
sed '1p;1,/^##/!d;/##/d' CHANGELOG.md
explanation
1p # print the first line
;
1,/^##/!d # delete everything after the second ## line
;
/##/d # delete all lines that contain ##
output
## <small>0.27.3 (2019-03-18)</small>
* Fix integration tests
* Change log message
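Since the question also mentions awk, here is an alternative sketch (not the approach from the answer above) that prints lines until a second heading appears. It assumes every version heading starts with "## "; the temporary file simply stands in for CHANGELOG.md.

```shell
#!/bin/sh
# Sample changelog (entries taken from the question), written to a temp file.
changelog=$(mktemp)
printf '%s\n' \
  '## <small>0.27.3 (2019-03-18)</small>' \
  '* Fix integration tests' \
  '* Change log message' \
  '## <small>0.27.2 (2019-03-18)</small>' \
  '* Change find to filter' > "$changelog"

# Print every line until a second line starting with "## " is seen, then stop reading.
latest=$(awk '/^## / { if (seen++) exit } { print }' "$changelog")
printf '%s\n' "$latest"

rm -f "$changelog"
```

Because awk exits at the second heading, the rest of the file is never read, which keeps this cheap on long changelogs.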

Insert a text before slash "/" character in bash scripts

I have a line in a file as below:
2 14 * * * /run/opt/server/autoi.sh
And I want to insert "root" before /run/opt/server/autoi.sh as below:
2 14 * * * root /run/opt/server/autoi.sh
I have tried the following command
sed '//run/i root' filename
but it gives the following error:
sed: -e expression #1, char 0: no previous regular expression
Could you please help me to find a fix for it?
Use a different separator character, and you need to use the s command to substitute, e.g:
sed 's#/run#root /run#' filename
or without repeating /run:
sed 's#/run#root &#' filename
You are using the wrong sed command. The i command will insert a new line before the matching line, and of course, //run is not a valid regex at all.
The general form of a sed command is
<address> <action>
where address could be a regex or a line number, and action is a command.
In fact, you want an action without an address, which means it will be applied to every input line; and the action you want to perform is a substitution.
sed 's%/run%root &%' filename
We are using the & convenience shorthand to repeat the string which matched the first regex, and an alternate regex separator instead of / so that / does not itself get interpreted as a regex separator (equivalently, you could backslash-escape it, but here, that produces something called leaning toothpick syndrome).
This will print the results to standard output, not modify the file. Once you have verified that you get the results you want, you might want to add an -i option to modify the input file. (On some platforms, such as *BSD -- which includes MacOS -- you need -i '' with an empty argument.)
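As a quick check of the substitution described above, this small sketch feeds the sample line from the question through sed on standard input, so no file is touched. The variable names are arbitrary.

```shell
#!/bin/sh
# Sample crontab-style line from the question.
line='2 14 * * * /run/opt/server/autoi.sh'

# "#" as the s-command delimiter avoids escaping "/"; "&" repeats the matched text.
fixed=$(printf '%s\n' "$line" | sed 's#/run#root &#')
printf '%s\n' "$fixed"
```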
Simply substitute the very first / with root / as follows.
sed 's/\//root \//' Input_file
sed -e 's/run/root\/run/' abc
Example:
[root#myIP tmp]# cat abc
2 14 * * * /run/opt/server/autoi.sh
[root#myIP tmp]# sed -e 's/run/root\/run/' abc
2 14 * * * /root/run/opt/server/autoi.sh
Edited: to add a username, this should do the trick:
[root#myIP tmp]# sed -e 's/\//root \//' abc
2 14 * * * root /run/opt/server/autoi.sh

perl $1 not getting set

I need to go through some Java files and pull out the authors after every #author tag. I started out looking at awk, but awk can't remove the unneeded parts, so I came across this.
What I'm running
perl -n -e'/author (.*)/ && print $1' *.java
This prints nothing. If I do
perl -n -e'/author (.*)/ && print $_' *.java
it will (correctly) print the entire line.
I can do this, and it does accomplish my goal, but I still want to know why my capture group isn't working.
perl -n -e"/\#author / && print $'" *.java
Example input:
/* HelloWorld.java
* #author Partner of Winning
* #author Robert LastName
*/
public class HelloWorld{
public static void main(String[] args) {
System.out.println("Hello World!");
}
}
You must have a long prompt. A shorter prompt would have revealed the problem.
$ perl -n -e'/author (.*)/ && print $1' *.java
$ bert LastNameing
Your file has Windows line endings (carriage return + line feed), and you are outputting the carriage return without the line feed, causing lines to be overwritten.
You can convert the file to a unix file using dos2unix, or you could change your program to handle CRLF line endings. There are a couple of shortcuts you can take here.
Add newlines, effectively neutralizing the CR.
$ perl -nle'/author (.*)/ && print $1' *.java
Partner of Winning
Robert LastName
But that can output text that causes problems, since it still contains the input's CRs. The following avoids matching them:
$ perl -nle'/author ([^\r\n]*)/ && print $1' *.java
Partner of Winning
Robert LastName

Excluding lines that contain a pattern before replacing in Perl

I have the following code to replace the version string in a set of files:
ack --ignore-file=is:HISTORY.md -l --print0 '1\.1\.1' | xargs -0 perl -pi -e 's/1\.1\.1/1\.1\.2/g'
Now, I realized there are some lines in the doxygen comment that also have the version string like this.
/**
* Generate Tag id from Tag name
*
* #since 1.1.1
* #static
* #access public
*
*/
How can I modify the above snippet so that lines that contain #since will be excluded?
To exclude lines with #since you could try this instead of your current perl replace code:
!/\#since/ && s/1\.1\.1/1.1.2/g
or even
/\#since/ || s/1\.1\.1/1.1.2/g

Using multiple sed commands

Hi, I'm looking to search through a file and output the values of the lines that match the following regexes, with the matching text removed. I don't need the output written to a file. This is what I am currently using; it outputs the required text, but multiple times:
#!/bin/sh
for file in *; do
sed -e 's/^owner //g;p;!d ; s/^admin //g;p;!d ; s/^loc //g;p;!d ; s/^ser //g;p;!d' $file
done
The preferred format would be something like this, so I could have control over what happens in between:
for file in *; do
sed 's/^owner //g;p' $file | head -1
sed 's/^admin //g;p' $file | head -1
sed 's/^loc //g;p' $file | head -1
sed 's/^ser //g;p' $file | head -1
done
An example input file would be the following:
owner sys group
admin guy
loc Q-30934
ser 18r9723
comment noisy fan is something
and the required output is the following:
sys group
guy
Q-30934
18r9723
You're giving sed the p (for Print) command several times. It prints the entire line each time. And unless you tell it not to with the -n option, sed will print the line at the end anyway.
You also give the !d command multiple times.
Edited after you added the multiple-sed version: instead of using head -1, just use -n to avoid printing lines you don't want. Or even use q (Quit) to stop processing after printing the bit you do want.
For instance:
sed -n '/^owner / { s///gp; q; }' $file
The {} group the substitution and quit commands together, so that they are both executed if and only if the pattern is matched. Having used the pattern in the address at the beginning, you can leave it out of the s command. So that command is short for:
sed -n '/^owner / { s/^owner //gp; q; }' $file
I'd suggest:
sed -n -e '/^owner / { s///; p; }' \
-e '/^admin / { s///; p; }' \
-e '/^loc / { s///; p; }' \
-e '/^ser / { s///; p; }' \
*
sed is perfectly capable of reading many files, so the loop control is unnecessary (you aren't doing per-file I/O redirection, for example) and it's reasonable to list the files after the rest of the sed command (that's the * on its own). If you've got a more modern version of sed (e.g. GNU sed), you can combine the patterns into a single line:
sed -r -n -e '/^(owner|admin|loc|ser) / { s///; p; }' *
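For a quick check against the sample input from the question, the sketch below writes that input to a temporary file and runs the multi-expression form on it. The temporary file and variable names are placeholders.

```shell
#!/bin/sh
# Sample input file from the question, written to a temp file.
input=$(mktemp)
printf '%s\n' 'owner sys group' 'admin guy' 'loc Q-30934' 'ser 18r9723' \
  'comment noisy fan is something' > "$input"

# -n suppresses automatic printing; the empty regex in s/// reuses the address pattern.
values=$(sed -n -e '/^owner / { s///; p; }' \
                -e '/^admin / { s///; p; }' \
                -e '/^loc / { s///; p; }' \
                -e '/^ser / { s///; p; }' "$input")
printf '%s\n' "$values"

rm -f "$input"
```

The comment line matches none of the four addresses, so with -n it produces no output at all.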
This might work for you (GNU sed):
sed '0,/^owner /{//s///p};0,/^admin /{//s///p};0,/^loc /{//s///p};0,/^ser /{//s///p}' file
This creates a series of toggle switches, one for each of the desired strings. Each switch applies only once throughout the file, i.e. only the first occurrence of each string is printed.
An alternative method that, depending on file sizes, may be quicker:
sed -rn '1{x;s/^/owner admin loc ser /;x};/^(owner |admin |loc |ser )/{G;/^(owner |admin |loc |ser )(.*\n.*)\1/!b;s//\2/;P;/\n$/q;s/.*\n//;h}' file
This preps the hold space with the desired strings. For only those lines that contain the desired strings, append the hold space and check if the current line needs to be amended. Match the desired string with the same string in the hold space. If the line has already appeared the match will fail and the line can be disregarded. If the line is yet to be amended, the desired string is removed from the current line and then the first half of the line is printed. If no strings appear in the remaining half of the line the process is over and can be quit. Otherwise remove the first half of the string and replace the hold space with the desired string removed.