Compress working directory purely using bash regex

I want to know the best way to compress the current working directory so that only the last directory's full name is visible. Let me give an example:
$ echo $PWD
/Users/mac/workshop/project1/src
I want to be able to do bash regex replacement operations on it such that I can get ~/w/p/src
I can obtain the first part of getting the leading ~ by doing ${PWD/#$HOME/\~}
$ echo ${PWD/#$HOME/\~}
~/workshop/project1/src
What other regex operations can I do (is it possible to chain the regex operators?) so that I get the following
$ echo ${PWD/#$HOME/\~} ...
~/w/p/src
Note that I need to do this using only bash, i.e. no sed, awk, grep etc.
The intention is to set the PROMPT value using bash alone, i.e.
in my .bashrc, I want to:
export PROMPT=${PWD/#$HOME/\~}...

Do-able in just bash, but not as simple as you'd like:
$ squashPWD() {
local pwd parts part
IFS=/ read -ra parts <<< "${PWD/#$HOME/\~}"
for part in "${parts[@]:0:${#parts[@]}-1}"; do
pwd+="${part:0:1}/"
done
echo "$pwd${parts[-1]}"
}
$ pwd
/home/jackman/tmp/adir/foo
$ squashPWD
~/t/a/foo
$ cd /usr/local/share/doc/fish/
$ squashPWD
/u/l/s/d/fish
If you don't need bash:
squashPWD() { perl -pe 's/^$ENV{HOME}/~/; s{([^/])[^/]*(?=/)}{$1}g' <<<"$PWD"; }
Either way, your prompt can be something like:
PS1='\u@\h:$(squashPWD) \$ '
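For testing outside a prompt, the same idea can be sketched as a function that takes the path as an argument. This is only a sketch: squash_path and its optional second argument (the home directory) are hypothetical names, not part of the answer above, and it assumes a bash recent enough for arrays and [[ ]].

```bash
# squash_path PATH [HOME_DIR]
# Hypothetical helper: same logic as squashPWD, but the path is passed in.
squash_path() {
    local path=$1 home=${2:-$HOME} out='' i last
    local -a parts
    # The root directory has no components to shorten.
    if [[ $path == / ]]; then
        printf '/\n'
        return
    fi
    # Replace a leading home directory with a literal tilde.
    if [[ $path == "$home" || $path == "$home"/* ]]; then
        path="~${path#"$home"}"
    fi
    # Split on "/" and keep only the first character of every
    # component except the last one.
    IFS=/ read -ra parts <<< "$path"
    last=$(( ${#parts[@]} - 1 ))
    for (( i = 0; i < last; i++ )); do
        out+="${parts[i]:0:1}/"
    done
    printf '%s\n' "$out${parts[last]}"
}

squash_path /Users/mac/workshop/project1/src /Users/mac   # prints ~/w/p/src
```

Comparing against "$home"/* rather than "$home"* avoids shortening a sibling directory such as /Users/machine.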

It doesn't need to be all bash, you can use a function in your bashrc or bash_profile and thus use sed or awk. You can put something like this in your bashrc:
short() {
local short_path=$(echo "$PWD" | sed -E 's!/(.)[^/]*!/\1!g')
local last_dir=${PWD##*/}
echo "${short_path::-1}${last_dir}" # remove the last character (the 1st character of the last directory) and append the full last directory
}
PS1='$(short) '
Keep in mind that this does not replace your $HOME directory with ~, but you know how to do that :)

Related

Replace placeholders with corresponding env variables

I have a configuration file with placeholders like this (stored in /tmp/var for this example)
ldap_bind_dn='${bind_dn}'
Now I'd like to replace ${bind_dn} with the environment variable of the same name (which is set inside Docker).
export bind_dn=CN=my-user,CN=Users,DC=example,DC=com
The expected result after processing the above test file would be
ldap_bind_dn='CN=my-user,CN=Users,DC=example,DC=com'
I tried sed but it doesn't replace it to the value of the env variable
$ sed "s#\${\(.*\)}#$\1#" /tmp/var
ldap_bind_dn='$bind_dn'
Why does sed replace it with $bind_dn instead of the value? I'd expect the variable to be processed because I haven't escaped the $ sign.
The expression itself works, only the substitution doesn't:
$ sed "s#\${\(.*\)}#test123#" /tmp/var
ldap_bind_dn='test123'
The replacement is also done correctly when the target variable is hardcoded
$ sed "s#\${\(.*\)}#$bind_dn#" /tmp/var
ldap_bind_dn='CN=my-user,CN=Users,DC=example,DC=com'
But since we have a bunch of configuration variables, I'd like to automatically replace all env variables in the format ${NAME}.
Not the most elegant solution, but this one works in bash:
sed 's#\${\(.*\)}#`{echo,"$\1"}`#' /tmp/var | xargs -n1 -I{} echo echo "{}" | bash -s
It is a little bit tricky because you need bash execution for the variable replacement; that's why I'm piping it to bash -s
While the env variable could be parsed by executing eval with mapfile, this does not seem suitable for me because it's an ini configuration file. The sections marked with brackets like [general] would throw errors. And I also have concerns about executing the WHOLE line, which allows executing any command.
This is fixed by the following awk (written to a temporary file first, because redirecting to the file being read would truncate it before awk reads it):
awk '{while(match($0,"[$]{[^}]*}")) {var=substr($0,RSTART+2,RLENGTH -3);gsub("[$]{"var"}",ENVIRON[var])}}1' < /tmp/var > /tmp/var.new && mv /tmp/var.new /tmp/var
Why does sed replace it with $bind_dn instead of the value?
Because sed is not supposed to do shell parameter expansion in its pattern space. That is the shell's job.
Using GNU sed:
~> cat /tmp/var
ldap_bind_dn='${bind_dn}'
~> export bind_dn=CN=my-user,CN=Users,DC=example,DC=com
~> sed -E 's/^\w+=.*/echo "&"/e' /tmp/var
ldap_bind_dn='CN=my-user,CN=Users,DC=example,DC=com'
The e flag of the s command (a GNU extension) in
sed -E 's/^\w+=.*/echo "&"/e'
executes the command that is found in pattern space and replaces the pattern space with the output. For this example, pattern space is ldap_bind_dn='${bind_dn}', and is replaced with the output of echo "ldap_bind_dn='${bind_dn}'" (& references the whole matched portion of the pattern space in echo "&"). Since the argument of echo is in double quotes, it is subject to parameter expansion when it is executed by the shell.
Caveat: Make sure that the file (/tmp/var in this example) comes from a trusted source. Otherwise it may contain lines like foo='$(some_nasty_command)', which is executed when the sed command above runs.
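If executing lines from the file is a concern, a pure-bash sketch along the following lines avoids running any of the file's content. It is only an illustration under a few assumptions: expand_placeholders is a hypothetical name, placeholders are limited to the ${NAME} form with ordinary variable names, and unset variables expand to an empty string.

```bash
# expand_placeholders: read lines on stdin, replace each ${NAME} with the
# value of the shell/environment variable NAME, write the result to stdout.
# Nothing from the input is ever executed.
expand_placeholders() {
    local line out match name
    local re='\$\{([A-Za-z_][A-Za-z0-9_]*)\}'
    while IFS= read -r line || [[ -n $line ]]; do
        out=''
        # Consume the line left to right so substituted values are
        # never scanned again for placeholders.
        while [[ $line =~ $re ]]; do
            match=${BASH_REMATCH[0]}
            name=${BASH_REMATCH[1]}
            out+=${line%%"$match"*}${!name:-}   # text before the match + value
            line=${line#*"$match"}              # continue after the match
        done
        printf '%s\n' "$out$line"
    done
}

# Usage (writes to a new file instead of overwriting the input):
#   expand_placeholders < /tmp/var > /tmp/var.new
```

Lines without placeholders, such as [general] section headers, pass through unchanged.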

Stop escaping forward slash in bash variables

I am having trouble with expanding variables without losing their backslashes.
I have written a simple script that finds text in my git repository and replaces it with other text. This works fine, but now I want to expand it using regex. This shouldn't be too much of a problem since both git grep and sed support regex. However, when I try to use regex in my input variables the backslashes are removed, which ruins the script.
If I run git grep "\bPoint" in the terminal I will get many results. However, I can't figure out how to get the same results when I use user input in my script. The script will change my input to bPoint instead of \bPoint, and git grep won't find any results to give to sed.
#!/bin/bash
# This script allows you to replace text in the git repository without damaging
# .git files.
read -p "Text to replace: " toReplace
read -p "Replace with: " replaceWith
git grep -l ${toReplace}
# The command I want to run
#git grep -l "${toReplace}" | xargs sed -i "s,${toReplace},${replaceWith},g"
I've tried a lot of different combinations of quotations, but nothing seems to work for me.
You must use read -r. As per help read:
-r do not allow backslashes to escape any characters
Examples:
# without -r
read -p "Text to replace: " toReplace && echo "$toReplace"
Text to replace: \bPoint
bPoint
# with -r
read -rp "Text to replace: " toReplace && echo "$toReplace"
Text to replace: \bPoint
\bPoint

How to batch rename files based off a pattern in bash or linux command line [duplicate]

Objective
Change these filenames:
F00001-0708-RG-biasliuyda
F00001-0708-CS-akgdlaul
F00001-0708-VF-hioulgigl
to these filenames:
F0001-0708-RG-biasliuyda
F0001-0708-CS-akgdlaul
F0001-0708-VF-hioulgigl
Shell Code
To test:
ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/'
To perform:
ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/' | sh
My Question
I don't understand the sed code. I understand what the substitution
command
$ sed 's/something/mv'
means. And I understand regular expressions somewhat. But I don't
understand what's happening here:
\(.\).\(.*\)
or here:
& \1\2/
The former, to me, just looks like it means: "a single character,
followed by a single character, followed by any length sequence of a
single character"--but surely there's more to it than that. As far as
the latter part:
& \1\2/
I have no idea.
First, I should say that the easiest way to do this is to use the
prename or rename commands.
On Ubuntu, OSX (Homebrew package rename, MacPorts package p5-file-rename), or other systems with perl rename (prename):
rename s/0000/000/ F0000*
or on systems with rename from util-linux-ng, such as RHEL:
rename 0000 000 F0000*
That's a lot more understandable than the equivalent sed command.
But as for understanding the sed command, the sed manpage is helpful. If
you run man sed and search for & (using the / command to search),
you'll find it's a special character in s/foo/bar/ replacements.
s/regexp/replacement/
Attempt to match regexp against the pattern space. If success‐
ful, replace that portion matched with replacement. The
replacement may contain the special character & to refer to that
portion of the pattern space which matched, and the special
escapes \1 through \9 to refer to the corresponding matching
sub-expressions in the regexp.
Therefore, \(.\) matches the first character, which can be referenced by \1.
Then . matches the next character, which is always 0.
Then \(.*\) matches the rest of the filename, which can be referenced by \2.
The replacement string puts it all together using & (the original
filename) and \1\2 which is every part of the filename except the 2nd
character, which was a 0.
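To make the two capture groups concrete, here is a rough pure-bash equivalent of what that sed expression builds for one filename. It is only a sketch; build_mv is a hypothetical name used for illustration.

```bash
# build_mv NAME: print the mv command that the sed expression would generate.
build_mv() {
    local name=$1
    local first=${name:0:1}   # what \(.\) captures, i.e. \1
    local rest=${name:2}      # what \(.*\) captures, i.e. \2 (index 1 is skipped)
    printf 'mv %s %s\n' "$name" "$first$rest"
}

build_mv F00001-0708-RG-biasliuyda
# prints: mv F00001-0708-RG-biasliuyda F0001-0708-RG-biasliuyda
```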
This is a pretty cryptic way to do this, IMHO. If for
some reason the rename command was not available and you wanted to use
sed to do the rename (or perhaps you were doing something too complex
for rename?), being more explicit in your regex would make it much
more readable. Perhaps something like:
ls F00001-0708-*|sed 's/F0000\(.*\)/mv & F000\1/' | sh
Being able to see what's actually changing in the
s/search/replacement/ makes it much more readable. Also it won't keep
sucking characters out of your filename if you accidentally run it
twice or something.
You've had your sed explanation; now you can use just the shell, with no need for external commands:
for file in F0000*
do
echo mv "$file" "${file/#F0000/F000}"
# ${file/#F0000/F000} replaces F0000 only when it occurs at the beginning of the string
done
I wrote a small post with examples on batch renaming using sed a couple of years ago:
http://www.guyrutenberg.com/2009/01/12/batch-renaming-using-sed/
For example:
for i in *; do
mv "$i" "$(echo "$i" | sed "s/regex/replace_text/")";
done
If the regex contains groups (e.g. \(subregex\)) then you can use them in the replacement text as \1, \2 etc.
The easiest way would be:
for i in F00001*; do mv "$i" "${i/F00001/F0001}"; done
or, portably,
for i in F00001*; do mv "$i" "F0001${i#F00001}"; done
This replaces the F00001 prefix in the filenames with F0001.
credits to mahesh here: http://www.debian-administration.org/articles/150
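The prefix-stripping loop can be wrapped in a small function with a couple of guards. This is a sketch only: rename_prefix is a hypothetical name, and the guards (skipping an unmatched glob, passing -- to mv) are additions that the one-liners above do not have.

```bash
# rename_prefix OLD NEW: rename every file in the current directory whose
# name starts with OLD so that it starts with NEW instead.
rename_prefix() {
    local old=$1 new=$2 f
    for f in "$old"*; do
        # If the glob matched nothing, bash leaves the pattern as-is; skip it.
        [[ -e $f ]] || continue
        # ${f#"$old"} strips the old prefix; -- protects names starting with "-".
        mv -- "$f" "$new${f#"$old"}"
    done
}

# Usage:
#   rename_prefix F00001 F0001
```

Running it a second time does nothing, because no file starts with the old prefix any more.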
The sed command
s/\(.\).\(.*\)/mv & \1\2/
means to replace:
\(.\).\(.*\)
with:
mv & \1\2
just like a regular sed command. However, the parentheses, & and the \1, \2 markers change it a little.
The search string matches (and remembers as pattern 1) the single character at the start, followed by a single character, followed by the rest of the string (remembered as pattern 2).
In the replacement string, you can refer to these matched patterns to use them as part of the replacement. You can also refer to the whole matched portion as &.
So what that sed command is doing is creating a mv command based on the original file (for the source) and character 1 and 3 onwards, effectively removing character 2 (for the destination). It will give you a series of lines along the following format:
mv F00001-0708-RG-biasliuyda F0001-0708-RG-biasliuyda
mv abcdef acdef
and so on.
Using perl rename (a must have in the toolbox):
rename -n 's/0000/000/' F0000*
Remove -n switch when the output looks good to rename for real.
There are other tools with the same name which may or may not be able to do this, so be careful.
The rename command that is part of the util-linux package, won't.
If you run the following command (GNU)
$ rename
and you see perlexpr, then this seems to be the right tool.
If not, to make it the default (usually already the case) on Debian and derivatives like Ubuntu:
$ sudo apt install rename
$ sudo update-alternatives --set rename /usr/bin/file-rename
For archlinux:
pacman -S perl-rename
For RedHat-family distros:
yum install prename
The 'prename' package is in the EPEL repository.
For Gentoo:
emerge dev-perl/rename
For *BSD:
pkg install gprename
or p5-File-Rename
For Mac users:
brew install rename
If you don't have this command with another distro, search your package manager to install it or do it manually:
cpan -i File::Rename
Old standalone version can be found here
man rename
This tool was originally written by Larry Wall, the creator of Perl.
The backslash-paren stuff means, "while matching the pattern, hold on to the stuff that matches in here." Later, on the replacement text side, you can get those remembered fragments back with "\1" (first parenthesized block), "\2" (second block), and so on.
If all you're really doing is removing the second character, regardless of what it is, you can do this:
s/.//2
but your command is building a mv command and piping it to the shell for execution.
This is no more readable than your version:
find -type f | sed -n 'h;s/.//4;x;s/^/mv /;G;s/\n/ /g;p' | sh
The fourth character is removed because find is prepending each filename with "./".
Here's what I would do:
for file in *.[Jj][Pp][Gg] ;do
echo mv -vi \"$file\" `jhead $file|
grep Date|
cut -b 16-|
sed -e 's/:/-/g' -e 's/ /_/g' -e 's/$/.jpg/g'` ;
done
Then if that looks ok, add | sh to the end. So:
for file in *.[Jj][Pp][Gg] ;do
echo mv -vi \"$file\" `jhead $file|
grep Date|
cut -b 16-|
sed -e 's/:/-/g' -e 's/ /_/g' -e 's/$/.jpg/g'` ;
done | sh
for i in *; do mv "$i" "$(echo "$i"|sed 's/AAA/BBB/')"; done
The parentheses capture particular strings for use by the backslashed numbers.
ls F00001-0708-*|sed 's|^F0000\(.*\)|mv & F000\1|' | bash
Some examples that work for me:
$ tree -L 1 -F .
.
├── A.Show.2020.1400MB.txt
└── Some Show S01E01 the Loreming.txt
0 directories, 2 files
## remove "1400MB" (I: ignore case) ...
$ for f in *; do mv 2>/dev/null -v "$f" "`echo $f | sed -r 's/.[0-9]{1,}mb//I'`"; done;
renamed 'A.Show.2020.1400MB.txt' -> 'A.Show.2020.txt'
## change "S01E01 the" to "S01E01 The"
## \U& : change (here: regex-selected) text to uppercase;
## note also: no need here for `\1` in that regex expression
$ for f in *; do mv 2>/dev/null "$f" "`echo $f | sed -r "s/([0-9] [a-z])/\U&/"`"; done
$ tree -L 1 -F .
.
├── A.Show.2020.txt
└── Some Show S01E01 The Loreming.txt
0 directories, 2 files
$
2>/dev/null suppresses extraneous output (warnings ...)
reference [this thread]: https://stackoverflow.com/a/2372808/1904943
change case: https://www.networkworld.com/article/3529409/converting-between-uppercase-and-lowercase-on-the-linux-command-line.html

Replacing string in linux using sed/awk based

I want to replace this
#!/usr/bin/env bash
with this
#!/bin/bash
I have tried two approaches
Approach 1
original_str="#!/usr/bin/env bash"
replace_str="#!/bin/bash"
sed s~${original_str}~${replace_str}~ filename
Approach 2
line=`grep -n "/usr/bin" filename`
awk NR==${line} {sub("#!/usr/bin/env bash"," #!/bin/bash")}
But both of them are not working.
You cannot use ! inside double quotes in an interactive bash session, otherwise history expansion will take place.
You can just do:
original_str='/usr/bin/env bash'
replace_str='/bin/bash'
sed "s~$original_str~$replace_str~" file
#!/bin/bash
Using escape characters :
$ cat z.sh
#!/usr/bin/env bash
$ sed -i "s/\/usr\/bin\/env bash/\/bin\/bash/g" z.sh
$ cat z.sh
#!/bin/bash
Try this out in the terminal:
echo "#!/usr/bin/env bash" | sed 's:#!/usr/bin/env bash:#!/bin/bash:g'
In this case I use : because sed gets confused between the different slashes and can no longer tell which one separates and which one is part of the text.
Plus it looks really clean.
The cool thing is that you can use almost any character you want as a separator.
For example a semicolon ; or the pipe symbol | .
By using the escape character \ I think that the code would look too messy and wouldn't be very readable, considering the fact that you have to put it before every forward slash in the command.
The command above will just print out the replaced line, but if you want to modify the file, then you need to specify the input and output file, like this:
sed 's:#!/usr/bin/env bash:#!/bin/bash:g' <inputfile >outputfile-new
Remember to give the output file a different name (here with the -new suffix): if the input file and the output file have the same name, your original will be cleared completely. This happened to me in the past, and it's not the best thing at all. For example:
<test.txt >test-new.txt
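The temporary-file step can be wrapped in a function so the original is only overwritten after sed has succeeded. This is a sketch under assumptions: replace_shebang is a hypothetical name, and it only touches line 1 and only when that line is exactly #!/usr/bin/env bash.

```bash
# replace_shebang FILE: rewrite "#!/usr/bin/env bash" on line 1 to "#!/bin/bash".
replace_shebang() {
    local file=$1 tmp status=0
    tmp=$(mktemp) || return 1
    # Single quotes keep "!" away from history expansion; ":" as the
    # delimiter avoids escaping the slashes.
    sed '1s:^#!/usr/bin/env bash$:#!/bin/bash:' "$file" > "$tmp" &&
        cat "$tmp" > "$file" || status=$?
    rm -f -- "$tmp"
    return "$status"
}

# Usage:
#   replace_shebang z.sh
```

Copying the result back with cat, rather than moving the temporary file over the original, keeps the original file's permissions (for example the executable bit).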

replace string to asterisk bash

I am trying to get from user a path as an input.
The user will enter a specific path for specific application:
script.sh /var/log/dbhome_1/md5
I want to convert the directory's number (in this case, 1) to * (asterisk). Later on, the script will do some logic on this path.
When I try sed on the input, I'm stuck with the number:
echo "/var/log/dbhome_1/md5" | sed "s/dbhome_*/dbhome_\*/g"
and the output will be:
/var/log/dbhome_*1/md5
I know that I have some problems with the asterisk as a wildcard versus as a literal character...
Maybe regex will help here?
Code for GNU sed:
sed "s#1/#\*/#"
$ echo "/var/log/dbhome_1/md5" | sed "s#1/#\*/#"
/var/log/dbhome_*/md5
Or more general:
sed "s#[0-9]\+/#\*/#"
$ echo "/var/log/dbhome_1234567890/md5" | sed "s#[0-9]\+/#\*/#"
/var/log/dbhome_*/md5
Use this instead:
echo "/var/log/dbhome_1/md5" | sed "s/dbhome_[0-9]\+/dbhome_\*/g"
[0-9] is a character class that contains all digits
Thus [0-9]\+ matches one or more digits
If your script is in bash (which I assume when I see the tag, but I also doubt it when I see its name script.sh which seems to have the wrong extension for a bash script), you might as well use pure bash stuff: /var/log/dbhome_1/md5 will very likely be in positional parameter $1, and what you want will be achieved by:
echo "${1//dbhome_+([[:digit:]])/dbhome_*}"
If this seems to fail, it's probably because your extglob shell optional behavior is turned off. In this case, just turn it on with
shopt -s extglob
Demo:
$ shopt -s extglob
$ a=/var/log/dbhome_1234567/md5
$ echo "${a//dbhome_+([[:digit:]])/dbhome_*}"
/var/log/dbhome_*/md5
$
Done!
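If turning on extglob is not an option, roughly the same result can be had with bash's =~ operator. This is a sketch, not a drop-in replacement: mask_dbhome is a hypothetical name, and unlike the // expansion above it rewrites only one dbhome_<digits> occurrence (the last one in the path).

```bash
# mask_dbhome PATH: replace the digits after "dbhome_" with a literal "*".
mask_dbhome() {
    local path=$1
    # Group 1: everything up to and including "dbhome_"
    # Group 2: the remainder, which must not start with a digit
    local re='^(.*dbhome_)[0-9]+([^0-9].*)?$'
    if [[ $path =~ $re ]]; then
        path="${BASH_REMATCH[1]}*${BASH_REMATCH[2]}"
    fi
    printf '%s\n' "$path"
}

mask_dbhome /var/log/dbhome_1/md5   # prints /var/log/dbhome_*/md5
```

Keep the result quoted wherever it is used later, otherwise the shell will try to expand the * as a glob.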