Replacing string in linux using sed/awk based - regex

I want to replace this
#!/usr/bin/env bash
with this
#!/bin/bash
I have tried two approaches
Approach 1
original_str="#!/usr/bin/env bash"
replace_str="#!/bin/bash"
sed s~${original_str}~${replace_str}~ filename
Approach 2
line=`grep -n "/usr/bin" filename`
awk NR==${line} {sub("#!/usr/bin/env bash"," #!/bin/bash")}
But both of them are not working.

In an interactive Bash shell you cannot use ! inside double quotes, otherwise history expansion will take place.
You can just do:
original_str='/usr/bin/env bash'
replace_str='/bin/bash'
sed "s~$original_str~$replace_str~" file
#!/bin/bash
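As a minimal sketch of the same idea, the following runs the substitution against a throwaway file created with mktemp (the file and its contents are assumptions for illustration). The strings are single-quoted so that ! is left alone even in an interactive shell, and the 1 address restricts the edit to the first line.

```bash
# Hypothetical demo file; mktemp keeps the example away from real scripts.
demo=$(mktemp)
printf '%s\n' '#!/usr/bin/env bash' 'echo hello' > "$demo"

original_str='#!/usr/bin/env bash'   # single quotes: "!" is not history-expanded
replace_str='#!/bin/bash'

# "1" limits the substitution to line 1; "~" as delimiter avoids escaping "/".
result=$(sed "1s~^${original_str}\$~${replace_str}~" "$demo")
printf '%s\n' "$result"

rm -f "$demo"
```

Only the shebang line changes; the second line of the demo file is printed untouched.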

Using escape characters :
$ cat z.sh
#!/usr/bin/env bash
$ sed -i "s/\/usr\/bin\/env bash/\/bin\/bash/g" z.sh
$ cat z.sh
#!/bin/bash

Try this out in the terminal:
echo "#!/usr/bin/env bash" | sed 's:#!/usr/bin/env bash:#!/bin/bash:g'
In cases like this I use : because sed gets confused by the different slashes and can no longer tell which one is a separator and which one is part of the text.
Plus it looks really clean.
The cool thing is that you can use almost any character you want as a separator.
For example a semicolon ; or the pipe symbol | .
By using the escape character \ I think that the code would look too messy and wouldn't be very readable, considering the fact that you have to put it before every forward slash in the command.
The command above will just print out the replaced line, but if you want to modify the file, then you need to specify the input and output file, like this:
sed 's:#!/usr/bin/env bash:#!/bin/bash:g' <inputfile >outputfile-new
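One way to get the effect of an in-place edit with this redirect form, sketched here with hypothetical temporary files: write the result to a separate file and move it over the original only if sed succeeds. The file names come from mktemp and are not part of the original answer.

```bash
# Hypothetical script file to edit.
script=$(mktemp)
printf '%s\n' '#!/usr/bin/env bash' 'echo hello' > "$script"

tmp=$(mktemp)
# The original is replaced only when sed exits successfully.
if sed '1s:^#!/usr/bin/env bash$:#!/bin/bash:' "$script" > "$tmp"; then
    mv "$tmp" "$script"
else
    rm -f "$tmp"
fi

first_line=$(head -n 1 "$script")
printf '%s\n' "$first_line"
rm -f "$script"
```

Because the output never goes to the file that is being read, the input cannot be truncated before sed has seen it.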
Remember to add that -new if the input file and the output file would otherwise have the same name, because without it your original file will be cleared completely: this happened to me in the past, and it's not a pleasant experience. For example:
<test.txt >test-new.txt

Related

Replace placeholders with corresponding env variables

I have a configuration file with placeholders like this (stored in /tmp/var for this example)
ldap_bind_dn='${bind_dn}'
Now I'd like to replace ${bind_dn} with the environment variable of the same name (which is set inside Docker).
export bind_dn=CN=my-user,CN=Users,DC=example,DC=com
The expected result after processing the above test file would be
ldap_bind_dn='CN=my-user,CN=Users,DC=example,DC=com'
I tried sed but it doesn't replace it to the value of the env variable
$ sed "s#\${\(.*\)}#$\1#" /tmp/var
bind_dn='$bind_dn'
Why does sed replace with $bind_dn instead of the value? I'd expect the variable to be processed because I haven't escaped the $ sign.
The expression itself works, only the substitution doesn't:
$ sed "s#\${\(.*\)}#test123#" /tmp/var
bind_dn='test123'
The replacement is also done correctly when the target variable is hardcoded
$ sed "s#\${\(.*\)}#$bind_dn#" /tmp/var
bind_dn='CN=my-user,CN=Users,DC=example,DC=com'
But since we have a bunch of configuration variables, I'd like to replace all env variables in the format ${NAME} automatically.
Not the most elegant solution, but this one works in bash:
sed 's#\${\(.*\)}#`{echo,"$\1"}`#' /tmp/var | xargs -n1 -I{} echo echo "{}" | bash -s
It is a little bit tricky because you need bash execution for the variable replacement, that's why I'm piping it to bash -s
While the env variable could be parsed by executing eval with mapfile, this does not seem suitable for me because it's an ini configuration file. The sections marked with brackets like [general] would throw errors. I also have concerns about executing the WHOLE line, which allows executing any command.
This is fixed by the following awk:
awk '{while(match($0,"[$]{[^}]*}")) {var=substr($0,RSTART+2,RLENGTH -3);gsub("[$]{"var"}",ENVIRON[var])}}1' /tmp/var > /tmp/var.new && mv /tmp/var.new /tmp/var
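A variant of the awk approach, as a sketch under stated assumptions: it reads a hypothetical template created with mktemp, looks up each ${NAME} in ENVIRON, and builds the output from left to right so that a value containing & or another ${...} is copied literally and never re-scanned. The variable names out and name are illustrative only.

```bash
# Hypothetical template file with an ini section and one placeholder.
template=$(mktemp)
cat > "$template" <<'EOF'
[general]
ldap_bind_dn='${bind_dn}'
EOF

export bind_dn='CN=my-user,CN=Users,DC=example,DC=com'

rendered=$(awk '{
    out = ""
    # Find the next ${NAME}; copy the text before it, then the env value.
    while (match($0, /[$][{][A-Za-z_][A-Za-z0-9_]*[}]/)) {
        name = substr($0, RSTART + 2, RLENGTH - 3)
        out = out substr($0, 1, RSTART - 1) ENVIRON[name]
        $0 = substr($0, RSTART + RLENGTH)
    }
    print out $0
}' "$template")

printf '%s\n' "$rendered"
rm -f "$template"
```

Nothing from the file is executed, so section headers such as [general] pass through unchanged. Placeholders whose variable is unset are replaced by an empty string in this sketch.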
Why sed replace with $bind_dn instead of the value?
Because sed is not supposed to do shell parameter expansion in its pattern space. That's the job of the shell mainly.
Using GNU sed:
~> cat /tmp/var
ldap_bind_dn='${bind_dn}'
~> export bind_dn=CN=my-user,CN=Users,DC=example,DC=com
~> sed -E 's/^\w+=.*/echo "&"/e' /tmp/var
ldap_bind_dn='CN=my-user,CN=Users,DC=example,DC=com'
The e flag of the s command (a GNU extension) in
sed -E 's/^\w+=.*/echo "&"/e'
executes the command that is found in pattern space and replaces the pattern space with the output. For this example, pattern space is ldap_bind_dn='${bind_dn}', and is replaced with the output of echo "ldap_bind_dn='${bind_dn}'" (& references the whole matched portion of the pattern space in echo "&"). Since the argument of echo is in double quotes, it is subject to parameter expansion when it is executed by the shell.
Caveat: Make sure that the file (/tmp/var in this example) comes from a trusted source. Otherwise it may contain lines like foo='$(some_nasty_command)', which is executed when the sed command above runs.

Compress working directory purely using bash regex

I want to know the best way to compress the current working directory so that only the last directory's full name is visible. Let me give an example:
$ echo $PWD
/Users/mac/workshop/project1/src
I want to be able to do bash regex replacement operations on it such that I can get ~/w/p/src
I can obtain the first part of getting the leading ~ by doing ${PWD/#$HOME/\~}
$ echo ${PWD/#$HOME/\~}
~/workshop/project1/src
What other regex operations can I do (is it possible to chain the regex operators?) so that I get the following
$ echo ${PWD/#$HOME/\~} ...
~/w/p/src
Note that I need to do only using bash i.e. no sed, awk, grep etc.
The intention for this is so that, I can set the PROMPT value based on bash i.e.
in my .bashrc, I want to:
export PROMPT=${PWD/#$HOME/\~}...
Do-able in just bash, but not as simple as you'd like:
$ squashPWD() {
local pwd parts part
IFS=/ read -ra parts <<< "${PWD/#$HOME/\~}"
for part in "${parts[@]:0:${#parts[@]}-1}"; do
pwd+="${part:0:1}/"
done
echo "$pwd${parts[-1]}"
}
$ pwd
/home/jackman/tmp/adir/foo
$ squashPWD
~/t/a/foo
$ cd /usr/local/share/doc/fish/
$ squashPWD
/u/l/s/d/fish
If you don't need bash:
squashPWD() { perl -pe 's/^$ENV{HOME}/~/; s{([^/])[^/]*(?=/)}{$1}g' <<<"$PWD"; }
Either way, your prompt can be something like:
PS1='\u@\h:$(squashPWD) \$ '
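For experimenting outside a prompt, here is a sketch of the same pure-bash idea that takes the path (and optionally the home directory) as arguments instead of reading $PWD directly. The function name squash_path and its second parameter are hypothetical, not from the answer above.

```bash
# squash_path PATH [HOME_DIR]: hypothetical helper, pure bash, no external tools.
squash_path() {
    local path=$1 home=${2:-$HOME} out='' i last
    local -a parts

    # Replace a leading home directory with a literal "~".
    if [[ $path == "$home" || $path == "$home"/* ]]; then
        path="~${path#"$home"}"
    fi
    # The root directory has no components to shorten.
    if [[ $path == / ]]; then
        printf '/\n'
        return
    fi

    IFS=/ read -ra parts <<< "$path"
    last=$(( ${#parts[@]} - 1 ))
    # Keep only the first character of every component except the last.
    for (( i = 0; i < last; i++ )); do
        out+="${parts[i]:0:1}/"
    done
    printf '%s\n' "$out${parts[last]}"
}

squash_path /Users/mac/workshop/project1/src /Users/mac
squash_path /usr/local/share/doc/fish /Users/mac
```

In a prompt it could be called as PS1='\u@\h:$(squash_path "$PWD") \$ ', which leaves the home directory to default to $HOME.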
It doesn't need to be all bash, you can use a function in your bashrc or bash_profile and thus use sed or awk. You can put something like this in your bashrc:
short() {
local short_path=$(echo "$PWD" | sed -E 's!/(.)[^/]*!/\1!g')
local last_dir=${PWD##*/}
echo "${short_path::-1}${last_dir}" # remove the last character (the first letter of the last directory), then append the full last directory name
}
PS1='$(short) '
Keep in mind that this does not replace your $HOME directory with ~, but you know how to do that :)

How to batch rename files based off a pattern in bash or linux command line [duplicate]

Objective
Change these filenames:
F00001-0708-RG-biasliuyda
F00001-0708-CS-akgdlaul
F00001-0708-VF-hioulgigl
to these filenames:
F0001-0708-RG-biasliuyda
F0001-0708-CS-akgdlaul
F0001-0708-VF-hioulgigl
Shell Code
To test:
ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/'
To perform:
ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/' | sh
My Question
I don't understand the sed code. I understand what the substitution
command
$ sed 's/something/mv'
means. And I understand regular expressions somewhat. But I don't
understand what's happening here:
\(.\).\(.*\)
or here:
& \1\2/
The former, to me, just looks like it means: "a single character,
followed by a single character, followed by any length sequence of a
single character"--but surely there's more to it than that. As far as
the latter part:
& \1\2/
I have no idea.
First, I should say that the easiest way to do this is to use the
prename or rename commands.
On Ubuntu, OSX (Homebrew package rename, MacPorts package p5-file-rename), or other systems with perl rename (prename):
rename s/0000/000/ F0000*
or on systems with rename from util-linux-ng, such as RHEL:
rename 0000 000 F0000*
That's a lot more understandable than the equivalent sed command.
But as for understanding the sed command, the sed manpage is helpful. If
you run man sed and search for & (using the / command to search),
you'll find it's a special character in s/foo/bar/ replacements.
s/regexp/replacement/
Attempt to match regexp against the pattern space. If successful,
replace that portion matched with replacement. The
replacement may contain the special character & to refer to that
portion of the pattern space which matched, and the special
escapes \1 through \9 to refer to the corresponding matching
sub-expressions in the regexp.
Therefore, \(.\) matches the first character, which can be referenced by \1.
Then . matches the next character, which is always 0.
Then \(.*\) matches the rest of the filename, which can be referenced by \2.
The replacement string puts it all together using & (the original
filename) and \1\2 which is every part of the filename except the 2nd
character, which was a 0.
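To see the pieces in action without renaming anything, the sed expression can be fed a single filename and its output inspected. This is only a sketch; the variable names are illustrative.

```bash
name='F00001-0708-RG-biasliuyda'

# \1 = "F", the bare "." consumes the first "0", \2 = the rest; & = whole match.
generated=$(printf '%s\n' "$name" | sed 's/\(.\).\(.*\)/mv & \1\2/')
printf '%s\n' "$generated"
# prints: mv F00001-0708-RG-biasliuyda F0001-0708-RG-biasliuyda
```

Nothing is executed here; the mv command is only text until it is piped to sh.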
This is a pretty cryptic way to do this, IMHO. If for
some reason the rename command was not available and you wanted to use
sed to do the rename (or perhaps you were doing something too complex
for rename?), being more explicit in your regex would make it much
more readable. Perhaps something like:
ls F00001-0708-*|sed 's/F0000\(.*\)/mv & F000\1/' | sh
Being able to see what's actually changing in the
s/search/replacement/ makes it much more readable. Also it won't keep
sucking characters out of your filename if you accidentally run it
twice or something.
You've had your sed explanation; now you can use just the shell, with no need for external commands:
for file in F0000*
do
echo mv "$file" "${file/#F0000/F000}"
# ${file/#F0000/F000} means replace the pattern that starts at beginning of string
done
I wrote a small post with examples on batch renaming using sed couple of years ago:
http://www.guyrutenberg.com/2009/01/12/batch-renaming-using-sed/
For example:
for i in *; do
mv "$i" "$(echo "$i" | sed 's/regex/replace_text/')"
done
If the regex contains groups (e.g. \(subregex\)), then you can use them in the replacement text as \1, \2, etc.
The easiest way would be:
for i in F00001*; do mv "$i" "${i/F00001/F0001}"; done
or, portably,
for i in F00001*; do mv "$i" "F0001${i#F00001}"; done
This replaces the F00001 prefix in the filenames with F0001.
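A sketch for trying the portable loop safely: it creates the three example files in a scratch directory made with mktemp -d (an assumption for illustration) and renames them there.

```bash
# Scratch directory so the experiment cannot touch real files.
workdir=$(mktemp -d)
touch "$workdir/F00001-0708-RG-biasliuyda" \
      "$workdir/F00001-0708-CS-akgdlaul" \
      "$workdir/F00001-0708-VF-hioulgigl"

for path in "$workdir"/F00001*; do
    [ -e "$path" ] || continue          # glob matched nothing
    name=${path##*/}                    # strip the directory part
    mv -- "$path" "$workdir/F0001${name#F00001}"
done

renamed=$(ls "$workdir")
printf '%s\n' "$renamed"
rm -rf "$workdir"
```

The ${name#F00001} expansion removes the old prefix, and the new prefix F0001 is put in front of what remains.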
credits to mahesh here: http://www.debian-administration.org/articles/150
The sed command
s/\(.\).\(.*\)/mv & \1\2/
means to replace:
\(.\).\(.*\)
with:
mv & \1\2
just like a regular sed command. However, the parentheses, & and \n markers change it a little.
The search string matches (and remembers as pattern 1) the single character at the start, followed by a single character, followed by the rest of the string (remembered as pattern 2).
In the replacement string, you can refer to these matched patterns to use them as part of the replacement. You can also refer to the whole matched portion as &.
So what that sed command is doing is creating a mv command based on the original file (for the source) and character 1 and 3 onwards, effectively removing character 2 (for the destination). It will give you a series of lines in the following format:
mv F00001-0708-RG-biasliuyda F0001-0708-RG-biasliuyda
mv abcdef acdef
and so on.
Using perl rename (a must have in the toolbox):
rename -n 's/0000/000/' F0000*
Remove -n switch when the output looks good to rename for real.
There are other tools with the same name which may or may not be able to do this, so be careful.
The rename command that is part of the util-linux package won't accept a Perl expression like this.
If you run the following command (GNU)
$ rename
and you see perlexpr, then this seems to be the right tool.
If not, to make it the default (usually already the case) on Debian and derivatives like Ubuntu:
$ sudo apt install rename
$ sudo update-alternatives --set rename /usr/bin/file-rename
For archlinux:
pacman -S perl-rename
For RedHat-family distros:
yum install prename
The 'prename' package is in the EPEL repository.
For Gentoo:
emerge dev-perl/rename
For *BSD:
pkg install gprename
or p5-File-Rename
For Mac users:
brew install rename
If you don't have this command with another distro, search your package manager to install it or do it manually:
cpan -i File::Rename
Old standalone version can be found here
man rename
This tool was originally written by Larry Wall, the creator of Perl.
The backslash-paren stuff means, "while matching the pattern, hold on to the stuff that matches in here." Later, on the replacement text side, you can get those remembered fragments back with "\1" (first parenthesized block), "\2" (second block), and so on.
If all you're really doing is removing the second character, regardless of what it is, you can do this:
s/.//2
but your command is building a mv command and piping it to the shell for execution.
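As a quick check of the numeric flag, this sketch applies s/.//2 to one of the example names. The trailing 2 tells sed to act on the second match of the pattern, and since . matches one character at a time, that is the second character.

```bash
shortened=$(printf '%s\n' 'F00001-0708-RG-biasliuyda' | sed 's/.//2')
printf '%s\n' "$shortened"
# prints: F0001-0708-RG-biasliuyda
```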
This is no more readable than your version:
find -type f | sed -n 'h;s/.//4;x;s/^/mv /;G;s/\n/ /g;p' | sh
The fourth character is removed because find is prepending each filename with "./".
Here's what I would do:
for file in *.[Jj][Pp][Gg] ;do
echo mv -vi \"$file\" `jhead $file|
grep Date|
cut -b 16-|
sed -e 's/:/-/g' -e 's/ /_/g' -e 's/$/.jpg/g'` ;
done
Then if that looks ok, add | sh to the end. So:
for file in *.[Jj][Pp][Gg] ;do
echo mv -vi \"$file\" `jhead $file|
grep Date|
cut -b 16-|
sed -e 's/:/-/g' -e 's/ /_/g' -e 's/$/.jpg/g'` ;
done | sh
for i in *; do mv "$i" "$(echo "$i" | sed 's/AAA/BBB/')"; done
The parentheses capture particular strings for use by the backslashed numbers.
ls F00001-0708-*|sed 's|^F0000\(.*\)|mv & F000\1|' | bash
Some examples that work for me:
$ tree -L 1 -F .
.
├── A.Show.2020.1400MB.txt
└── Some Show S01E01 the Loreming.txt
0 directories, 2 files
## remove "1400MB" (I: ignore case) ...
$ for f in *; do mv 2>/dev/null -v "$f" "`echo $f | sed -r 's/.[0-9]{1,}mb//I'`"; done;
renamed 'A.Show.2020.1400MB.txt' -> 'A.Show.2020.txt'
## change "S01E01 the" to "S01E01 The"
## \U& : change (here: regex-selected) text to uppercase;
## note also: no need here for `\1` in that regex expression
$ for f in *; do mv 2>/dev/null "$f" "`echo $f | sed -r "s/([0-9] [a-z])/\U&/"`"; done
$ tree -L 1 -F .
.
├── A.Show.2020.txt
└── Some Show S01E01 The Loreming.txt
0 directories, 2 files
$
2>/dev/null suppresses extraneous output (warnings ...)
reference [this thread]: https://stackoverflow.com/a/2372808/1904943
change case: https://www.networkworld.com/article/3529409/converting-between-uppercase-and-lowercase-on-the-linux-command-line.html

How to extract a substring using sed on OS X?

I'm trying to iterate over each file and folder inside a directory and extract part of the file name into a variable, but I can't make sed work correctly. I either get all of the file name or none of it.
This version of the script should capture the entire file name:
#!/bin/bash
for f in *
do
substring=`echo $f | sed -E -n 's/(.*)/\1/'`
echo "sub: $substring"
done
But instead I get nothing:
sub:
sub:
sub:
sub:
...
This version should give me just the first character in the filename:
#!/bin/bash
for f in *
do
substring=`echo $f | sed -E 's/^([a-zA-Z])/\1/'`
echo "sub: $substring"
done
But instead I get the whole file name:
sub: Adlm
sub: Applications
sub: Applications (Parallels)
sub: Desktop
...
I've tried numerous iterations of it and what it basically boils down to is that if I use -n I get nothing and if I don't I get the whole file name.
Can someone show me how to get just the first character?
Or, my overall goal is to be able to extract a substring and store it into a variable, if anybody has a better approach to it, that would be appreciated as well.
Thanks in advance.
If you want to modify a shell parameter, you probably want to use a parameter expansion.
for f in *; do
# This version should expand to the whole parameter
echo "$f"
# This version should expand to the first character in the filename
echo "${f::1}"
done
Parameter expansions are not as powerful as sed, but they are built in to the shell (no launching a separate process or subshell necessary) and there are expansions for:
Substrings (as above)
Replacing and substituting characters
Altering the case of strings (bash 4+)
and more.
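A short sketch of the expansions listed above, using one of the names from the question as a sample value. The variable names are illustrative.

```bash
f='Applications (Parallels)'

first_char=${f::1}        # substring: offset 0 (implied), length 1
word=${f:0:12}            # substring: the first 12 characters
underscored=${f// /_}     # replace every space with an underscore

printf '%s\n' "$first_char" "$word" "$underscored"

# Case conversion needs bash 4 or newer (macOS ships bash 3.2 as /bin/bash).
if (( BASH_VERSINFO[0] >= 4 )); then
    printf '%s\n' "${f^^}"
fi
```

None of these start an external process, which matters when the loop runs over many files.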
This version of the script should capture the entire file name:
sed -E -n 's/(.*)/\1/'
But instead I get nothing.
You used -n so naturally it won't yield anything. Perhaps you should remove -n or add p:
sed -E -n 's/(.*)/\1/p'
This version should give me just the first character in the filename:
sed -E 's/^([a-zA-Z])/\1/'
But instead I get the whole file name,
You didn't replace anything there. Perhaps what you wanted was
sed -E 's/^([a-zA-Z]).*/\1/'
Also I suggest quoting your arguments well:
substring=`echo "$f" | sed ...`
Finally the simpler method is to use substring expansion if you're using Bash as suggested by kojiro.
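The two sed points can be checked side by side. This is a sketch with sample names taken from the question's output: the first command consumes the whole line and keeps only the captured first character; the second uses -n together with the p flag so that only lines where the substitution succeeded are printed.

```bash
f='Applications (Parallels)'

# Match the entire line, keep only the first character.
first=$(printf '%s\n' "$f" | sed -E 's/^(.).*$/\1/')
printf 'sub: %s\n' "$first"

# -n suppresses automatic printing; p prints only lines that were substituted.
inside=$(printf '%s\n' "$f" 'Desktop' | sed -E -n 's/^.*\((.*)\).*$/\1/p')
printf 'sub: %s\n' "$inside"
```

The line Desktop has no parentheses, so the second command prints nothing for it.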
You forgot to add .* after the capturing group in sed,
$ for i in *; do substring=`echo $i | sed -E 's/^(.).*$/\1/'`; echo "sub: $substring"; done
It's better to use . instead of [a-zA-Z], because [a-zA-Z] fails to match when the filename starts with a digit or a special character.
I prefer awk to sed. It seems to be easier for me to understand.
#!/bin/bash
#set -x
for f in *
do
substring=`echo $f | awk '{print substr($1,1,1)}'`
echo "sub: $substring"
done

Grep regex contained in a file (not grep -f option!)

I am reading some equipment configuration output and check if the configuration is correct, according to the HW configuration. The template configurations are stored as files with all the params, and the lines contain regular expressions (basically just to account for variable number of spaces between "object", "param" and "value" in the output, also some index variance)
First of all, I cannot use grep -f $template $output, since I have to process each line of the template separately. I have something like this running
while read line
do
attempt=`grep -E "$line" $file`
# ...etc
done < $template
Which works just fine if the template doesn't contain regex.
Problem: grep interprets the search pattern literally when it is read from a file. I tested the regexes themselves, and they work fine from the command line.
With this background, the question is:
How to read regexes from a file (line by line) and have grep not interpret them literally?
Using the following script:
#!/usr/bin/env bash
# multi-grep
regexes="$1"
file="$2"
while IFS= read -r rx ; do
result="$(grep -E "$rx" "$file")"
grep -q -E "$rx" "$file" && printf 'Look ma, a match: %s!\n' "$result"
done < "$regexes"
And files with the following contents:
$ cat regexes
RbsLocalCell=S.C1.+eulMaxOwnUuLoad.+100
$ cat data
RbsLocalCell=S1C1 eulMaxOwnUuLoad 100
I get this result:
$ ./multi-grep regexes data
Look ma, a match: RbsLocalCell=S1C1 eulMaxOwnUuLoad 100!
This works for different spacing as well
$ cat data
RbsLocalCell=S1C1 eulMaxOwnUuLoad 100
$ ./multi-grep regexes data
Look ma, a match: RbsLocalCell=S1C1 eulMaxOwnUuLoad 100!
Seems okay to me.
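One difference between this script and the loop in the question is the use of IFS= read -r. Whether that is what broke the original templates is an assumption, but the effect is easy to demonstrate: a plain read treats backslashes as escapes and strips them, which changes the regex before grep ever sees it. The pattern and sample line below are hypothetical.

```bash
# Hypothetical template line containing a backslash escape.
pattern_file=$(mktemp)
printf '%s\n' 'param[[:space:]]+value\.[0-9]+' > "$pattern_file"

read plain < "$pattern_file"         # backslash is consumed: "\." becomes "."
IFS= read -r raw < "$pattern_file"   # line is kept byte for byte

printf 'plain: %s\n' "$plain"
printf 'raw:   %s\n' "$raw"

# The unmodified pattern still matches a line with variable spacing.
if printf '%s\n' 'param   value.12' | grep -Eq "$raw"; then
    matched=yes
else
    matched=no
fi
printf 'matched: %s\n' "$matched"
rm -f "$pattern_file"
```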
Use the -F option, or fgrep.
What's more, you seem to want to match full lines: add the -x option as well.
Another point: make sure the pattern is not interpreted in some wrong way by the shell by putting "$line" in quotes.
All in all, it looks like you would be better off writing this in Perl than as a shell script.