replace string to asterisk bash - regex

I am trying to get a path from the user as input.
The user will enter a specific path for specific application:
script.sh /var/log/dbhome_1/md5
I want to convert the directory number (in this case, 1) to * (an asterisk). Later on, the script will do some logic on this path.
When I try sed on the input, I'm stuck with the number:
echo "/var/log/dbhome_1/md5" | sed "s/dbhome_*/dbhome_\*/g"
and the output will be:
/var/log/dbhome_*1/md5
I know that I have some problem with the asterisk acting as a wildcard rather than as a literal character.
Maybe a regex will help here?

Code for GNU sed:
sed "s#1/#\*/#"
$ echo "/var/log/dbhome_1/md5" | sed "s#1/#\*/#"
/var/log/dbhome_*/md5
Or more general:
sed "s#[0-9]\+/#\*/#"
$ echo "/var/log/dbhome_1234567890/md5" | sed "s#[0-9]\+/#\*/#"
/var/log/dbhome_*/md5

Use this instead:
echo "/var/log/dbhome_1/md5" | sed "s/dbhome_[0-9]\+/dbhome_\*/g"
[0-9] is a character class that contains all digits.
Thus [0-9]\+ matches one or more digits (\+ is a GNU sed extension to basic regular expressions).
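The original attempt failed because `_*` means "zero or more underscores", so sed matched only `dbhome_` and left the digit in place. For a sed without the GNU `\+` extension, the same idea can be spelled as "one digit followed by zero or more digits". A minimal sketch, where `mask_dbhome` is a hypothetical helper name and not something from the question:

```shell
# mask_dbhome (hypothetical name): replace the digits after "dbhome_" with a literal *.
# [0-9][0-9]* is the POSIX spelling of "one or more digits".
mask_dbhome() {
    printf '%s\n' "$1" | sed 's/dbhome_[0-9][0-9]*/dbhome_*/g'
}

masked=$(mask_dbhome "/var/log/dbhome_1/md5")
printf '%s\n' "$masked"
```

In the replacement half of `s///` the asterisk has no special meaning, so it does not need a backslash there.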

If your script is in bash (which I assume when I see the tag, but I also doubt it when I see its name script.sh which seems to have the wrong extension for a bash script), you might as well use pure bash stuff: /var/log/dbhome_1/md5 will very likely be in positional parameter $1, and what you want will be achieved by:
echo "${1//dbhome_+([[:digit:]])/dbhome_*}"
If this seems to fail, it's probably because the extglob shell option is turned off. In that case, just turn it on with
shopt -s extglob
Demo:
$ shopt -s extglob
$ a=/var/log/dbhome_1234567/md5
$ echo "${a//dbhome_+([[:digit:]])/dbhome_*}"
/var/log/dbhome_*/md5
$
Done!

Related

replace string with underscore and dots using sed or awk

I have a bunch of files with filenames composed of underscore and dots, here is one example:
META_ALL_whrAdjBMI_GLOBAL_August2016.bed.nodup.sortedbed.roadmap.sort.fgwas.gz.r0-ADRL.GLND.FET-EnhA.out.params
I want to remove the part that contains .bed.nodup.sortedbed.roadmap.sort.fgwas.gz. so the expected filename output would be META_ALL_whrAdjBMI_GLOBAL_August2016.r0-ADRL.GLND.FET-EnhA.out.params
I am using these sed commands but neither one works:
stringZ=META_ALL_whrAdjBMI_GLOBAL_August2016.bed.nodup.sortedbed.roadmap.sort.fgwas.gz.r0-ADRL.GLND.FET-EnhA.out.params
echo $stringZ | sed -e 's/\([[:lower:]]\.[[:lower:]]\.[[:lower:]]\.[[:lower:]]\.[[:lower:]]\.[[:lower:]]\.[[:lower:]]\.\)//g'
echo $stringZ | sed -e 's/\[[:lower:]]\.[[:lower:]]\.[[:lower:]]\.[[:lower:]]\.[[:lower:]]\.[[:lower:]]\.[[:lower:]]\.//g'
Any solution in sed or awk would help a lot.
Don't use external utilities and regexes for such a simple task! Use parameter expansions instead.
stringZ=META_ALL_whrAdjBMI_GLOBAL_August2016.bed.nodup.sortedbed.roadmap.sort.fgwas.gz.r0-ADRL.GLND.FET-EnhA.out.params
echo "${stringZ/.bed.nodup.sortedbed.roadmap.sort.fgwas.gz}"
To perform the renaming of all the files containing .bed.nodup.sortedbed.roadmap.sort.fgwas.gz, use this:
shopt -s nullglob
substring=.bed.nodup.sortedbed.roadmap.sort.fgwas.gz
for file in *"$substring"*; do
echo mv -- "$file" "${file/"$substring"}"
done
Note. I left echo in front of mv so that nothing is going to be renamed; the commands will only be displayed on your terminal. Remove echo if you're satisfied with what you see.
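If the script has to run under plain sh, where `${var/pattern}` is not available, the same removal can be approximated with the prefix and suffix operators. This is only a sketch under that assumption; `strip_substring` is a hypothetical helper name:

```shell
# strip_substring (hypothetical name): remove the first occurrence of a
# literal substring using only POSIX parameter expansion.
strip_substring() {
    # $1 = string, $2 = literal substring to remove
    case $1 in
        *"$2"*)
            before=${1%%"$2"*}   # text before the first occurrence
            after=${1#*"$2"}     # text after the first occurrence
            printf '%s\n' "$before$after"
            ;;
        *)
            printf '%s\n' "$1"   # substring absent: return the input unchanged
            ;;
    esac
}

stringZ=META_ALL_whrAdjBMI_GLOBAL_August2016.bed.nodup.sortedbed.roadmap.sort.fgwas.gz.r0-ADRL.GLND.FET-EnhA.out.params
result=$(strip_substring "$stringZ" ".bed.nodup.sortedbed.roadmap.sort.fgwas.gz")
printf '%s\n' "$result"
```

Quoting `"$2"` inside the expansion makes the dots literal, so no regex escaping is involved.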
Your regex doesn't really feel too much more general than the fixed pattern would be, but if you want to make it work, you need to allow for more than one lower case character between each dot. Right now you're looking for exactly one, but you can fix it with \+ after each [[:lower:]] like
printf '%s' "$stringZ" | sed -e 's/\([[:lower:]]\+\.[[:lower:]]\+\.[[:lower:]]\+\.[[:lower:]]\+\.[[:lower:]]\+\.[[:lower:]]\+\.[[:lower:]]\+\.\)//g'
which, with
stringZ="META_ALL_whrAdjBMI_GLOBAL_August2016.bed.nodup.sortedbed.roadmap.sort.fgwas.gz.r0-ADRL.GLND.FET-EnhA.out.params"
gives me the output
META_ALL_whrAdjBMI_GLOBAL_August2016.r0-ADRL.GLND.FET-EnhA.out.params
Try this:
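#!/bin/bash
# Glob directly instead of parsing ls, and quote every expansion.
for line in META*
do
# Escape the dots so they match literal dots only.
f2=$(printf '%s\n' "$line" | sed 's/\.bed\.nodup\.sortedbed\.roadmap\.sort\.fgwas\.gz//')
mv -- "$line" "$f2"
done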

Using sed with regex to find and replace a string

So I have the following string in my config.fish, and init.vim:
Fish: eval sh ~/.config/fish/colors/base16-monokai.dark.sh
Vim: colorscheme base16-monokai
Vim: let g:airline_theme='base16_monokai'
And I have the following shell script:
#!/bin/sh
theme=$1
background=$2
if [ -z '$theme' ]; then
echo "Please provide a theme name."
else
if [ -z '$background' ]; then
$background = 'dark'
fi
base16-builder -s $theme -t vim -b $background > ~/.config/nvim/colors/base16-$theme.vim &&
base16-builder -s $theme -t shell -b $background > ~/.config/fish/colors/base16-$theme.$background.sh &&
base16-builder -s $theme -t vim-airline -b $background > ~/.vim/plugged/vim-airline-themes/autoload/airline/themes/base16_$theme.vim
sed -i -e 's/foo/eval sh ~/.config/fish/colors/base16-$theme.$background.sh/g' ~/Developer/dotfiles/config.fish
sed -i -e 's/foo/colorscheme base16-$theme/g' ~/Developer/dotfiles/init.vim
sed -i -e 's/foo/let g:airline_theme='base16_$theme'/g' ~/Developer/dotfiles/init.vim
fi
Basically the idea is the script will generate whichever theme is passed through using this builder.
I have tried referring this documentation but I am not very skilled at regex so if anybody could give me a hand I would appreciate it.
What I need to happen is once the script is generated sed will look for the above strings and replace theme with the newly generated theme ones.
Try this:
sed -i "s|\(eval sh ~/\.config/fish/colors/base16-\)\([^.]*\)\.\([^.]*\)\(.*\)|\1$theme.$background\4|" ~/Developer/dotfiles/config.fish
sed -i "s/\(base16\)\([-_]\)\([a-zA-Z]*\)/\1\2$theme/g" ~/Developer/dotfiles/init.vim
This assumes, in the second sed command, that the theme name is purely alphabetic. If it is not, extend the character class with the extra characters (e.g. [a-zA-Z0-9]).
You can replace something in sed using this syntax: sed "s#regex#replacement#g". Because your strings contain both / and ', using # as the delimiter and double quotes around the script is the easiest way to avoid escaping them.
Some characters need to be escaped to match literally: . and $ need a \ in the regex. Inside double quotes the shell expands $ as well, so a literal $ in the replacement string needs to be escaped too.
If you want to capture a certain part of the match, it's easiest to use character classes. For example, eval sh ~/\.config/fish/colors/base16-([^.]+)\.dark\.sh (extended syntax, so it needs sed -E) would be the regex to use if you want your replacement to be airline_theme='\1_base16_\$theme'. In that case, the \1 in the replacement is the thing captured in the regex; sed writes backreferences as \1, not $1.
[^.]+ will capture everything up to the next .
I hope this helps you to better understand regexes! This should be detailed enough to show you how to write your own.
You need to use double quotes for parameter expansion not single quotes.
You need to escape the single quotes: 'hello'\''world'
I will make one line for you and leave it as an exercise to fix the other lines
sed -i -e 's~\(let g:airline_theme='\''\)[^'\'']*\('\''\)~\1base16_'"$theme"'\2~' ~/Developer/dotfiles/init.vim
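The two quoting rules above (double quotes for expansion, `'\''` for a literal single quote) can be checked on a single line of text before touching any file. A minimal sketch, in which the theme name `solarized` and the sample line are assumptions made for illustration:

```shell
theme=solarized   # hypothetical theme name
line="let g:airline_theme='base16_monokai'"

# '\'' closes the single-quoted string, adds an escaped quote, and reopens it;
# "$theme" is spliced in between two single-quoted pieces so the shell expands it.
updated=$(printf '%s\n' "$line" |
    sed 's~\(let g:airline_theme='\''\)[^'\'']*\('\''\)~\1base16_'"$theme"'\2~')

printf '%s\n' "$updated"
```

Once the printed line looks right, the same expression can be handed to `sed -i` with the real file path.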
The first character after the s in the sed expression string is used as the pattern separator, so by putting / first you have specified / as the separator.
Additionally, single quotes tell the shell not to expand any variables, so you are going to want to use double quotes instead.
Try something like
sed -i -e "s#foo#eval sh ~/.config/fish/colors/base16-$theme.$background.sh#g" ~/Developer/dotfiles/config.fish
As you've now commented that you need to find the previous theme string instead of foo (sed has no non-greedy .*?, so [^.]* is used to stop at the next dot):
sed -i -e "s#eval sh ~/\.config/fish/colors/base16-[^.]*\.[^.]*\.sh#eval sh ~/.config/fish/colors/base16-$theme.$background.sh#g" ~/Developer/dotfiles/config.fish
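sed's regex dialects have no lazy quantifier such as `.*?`, so a negated bracket expression is the usual way to stop at the next dot. The following sketch tries that on a sample line instead of the real config file; the values `ocean` and `light` are made up for the example:

```shell
theme=ocean        # hypothetical new theme
background=light   # hypothetical new background

line='eval sh ~/.config/fish/colors/base16-monokai.dark.sh'

# [^.]* matches up to, but not including, the next dot.
updated=$(printf '%s\n' "$line" |
    sed "s#base16-[^.]*\.[^.]*\.sh#base16-$theme.$background.sh#")

printf '%s\n' "$updated"
```

The tilde is left alone here because the shell does not expand `~` inside quotes.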

How to batch rename files based off a pattern in bash or linux command line [duplicate]

Objective
Change these filenames:
F00001-0708-RG-biasliuyda
F00001-0708-CS-akgdlaul
F00001-0708-VF-hioulgigl
to these filenames:
F0001-0708-RG-biasliuyda
F0001-0708-CS-akgdlaul
F0001-0708-VF-hioulgigl
Shell Code
To test:
ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/'
To perform:
ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/' | sh
My Question
I don't understand the sed code. I understand what the substitution
command
$ sed 's/something/mv'
means. And I understand regular expressions somewhat. But I don't
understand what's happening here:
\(.\).\(.*\)
or here:
& \1\2/
The former, to me, just looks like it means: "a single character,
followed by a single character, followed by any length sequence of a
single character"--but surely there's more to it than that. As far as
the latter part:
& \1\2/
I have no idea.
First, I should say that the easiest way to do this is to use the
prename or rename commands.
On Ubuntu, OSX (Homebrew package rename, MacPorts package p5-file-rename), or other systems with perl rename (prename):
rename s/0000/000/ F0000*
or on systems with rename from util-linux-ng, such as RHEL:
rename 0000 000 F0000*
That's a lot more understandable than the equivalent sed command.
But as for understanding the sed command, the sed manpage is helpful. If
you run man sed and search for & (using the / command to search),
you'll find it's a special character in s/foo/bar/ replacements.
s/regexp/replacement/
Attempt to match regexp against the pattern space. If success‐
ful, replace that portion matched with replacement. The
replacement may contain the special character & to refer to that
portion of the pattern space which matched, and the special
escapes \1 through \9 to refer to the corresponding matching
sub-expressions in the regexp.
Therefore, \(.\) matches the first character, which can be referenced by \1.
Then . matches the next character, which is always 0.
Then \(.*\) matches the rest of the filename, which can be referenced by \2.
The replacement string puts it all together using & (the original
filename) and \1\2 which is every part of the filename except the 2nd
character, which was a 0.
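To see the pieces in action without renaming anything, the substitution can be run on a single name and the generated command inspected. A small sketch using one of the filenames from the question:

```shell
name=F00001-0708-RG-biasliuyda

# \1 = first character, the bare . swallows the second character,
# \2 = everything else, & = the whole matched line.
generated=$(printf '%s\n' "$name" | sed 's/\(.\).\(.*\)/mv & \1\2/')

printf '%s\n' "$generated"
```

This prints the `mv` command that would be piped to `sh`, which makes it easy to confirm the source and destination names first.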
This is a pretty cryptic way to do this, IMHO. If for
some reason the rename command was not available and you wanted to use
sed to do the rename (or perhaps you were doing something too complex
for rename?), being more explicit in your regex would make it much
more readable. Perhaps something like:
ls F00001-0708-*|sed 's/F0000\(.*\)/mv & F000\1/' | sh
Being able to see what's actually changing in the
s/search/replacement/ makes it much more readable. Also it won't keep
sucking characters out of your filename if you accidentally run it
twice or something.
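The difference when a command is accidentally run twice can be shown on the name alone, with no files involved. A sketch comparing the positional pattern with the explicit one (anchored with `^` here, as an illustrative choice):

```shell
name=F00001-0708-RG-biasliuyda

# Positional version: drops the second character every time it runs.
once=$(printf '%s\n' "$name" | sed 's/\(.\).\(.*\)/\1\2/')
twice=$(printf '%s\n' "$once" | sed 's/\(.\).\(.*\)/\1\2/')

# Explicit version: only acts while the name still starts with F0000.
e_once=$(printf '%s\n' "$name" | sed 's/^F0000/F000/')
e_twice=$(printf '%s\n' "$e_once" | sed 's/^F0000/F000/')

printf 'positional: %s -> %s\n' "$once" "$twice"
printf 'explicit:   %s -> %s\n' "$e_once" "$e_twice"
```

The second positional pass eats another character, while the explicit pattern no longer matches and leaves the name alone.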
You've had your sed explanation; now you can use just the shell, with no need for external commands:
for file in F0000*
do
echo mv "$file" "${file/#F0000/F000}"
# ${file/#F0000/F000} means replace the pattern that starts at beginning of string
done
I wrote a small post with examples on batch renaming using sed couple of years ago:
http://www.guyrutenberg.com/2009/01/12/batch-renaming-using-sed/
For example:
for i in *; do
mv -- "$i" "$(printf '%s\n' "$i" | sed "s/regex/replace_text/")"
done
If the regex contains groups (e.g. \(subregex\)), then you can use them in the replacement text as \1, \2, etc.
The easiest way would be:
for i in F00001*; do mv "$i" "${i/F00001/F0001}"; done
or, portably,
for i in F00001*; do mv "$i" "F0001${i#F00001}"; done
This replaces the F00001 prefix in the filenames with F0001.
credits to mahesh here: http://www.debian-administration.org/articles/150
The sed command
s/\(.\).\(.*\)/mv & \1\2/
means to replace:
\(.\).\(.*\)
with:
mv & \1\2
just like a regular sed command. However, the parentheses, & and \N markers (\1, \2, and so on) change it a little.
The search string matches (and remembers as pattern 1) the single character at the start, followed by a single character, followed by the rest of the string (remembered as pattern 2).
In the replacement string, you can refer to these matched patterns to use them as part of the replacement. You can also refer to the whole matched portion as &.
So what that sed command is doing is creating a mv command based on the original file (for the source) and character 1 and 3 onwards, effectively removing character 2 (for the destination). It will give you a series of lines along the following format:
mv F00001-0708-RG-biasliuyda F0001-0708-RG-biasliuyda
mv abcdef acdef
and so on.
Using perl rename (a must have in the toolbox):
rename -n 's/0000/000/' F0000*
Remove -n switch when the output looks good to rename for real.
There are other tools with the same name which may or may not be able to do this, so be careful.
The rename command that is part of the util-linux package won't.
If you run the following command (GNU)
$ rename
and you see perlexpr, then this seems to be the right tool.
If not, to make it the default (usually already the case) on Debian and derivatives like Ubuntu:
$ sudo apt install rename
$ sudo update-alternatives --set rename /usr/bin/file-rename
For archlinux:
pacman -S perl-rename
For RedHat-family distros:
yum install prename
The 'prename' package is in the EPEL repository.
For Gentoo:
emerge dev-perl/rename
For *BSD:
pkg install gprename
or p5-File-Rename
For Mac users:
brew install rename
If you don't have this command with another distro, search your package manager to install it or do it manually:
cpan -i File::Rename
Old standalone version can be found here
man rename
This tool was originally written by Larry Wall, the father of Perl.
The backslash-paren stuff means, "while matching the pattern, hold on to the stuff that matches in here." Later, on the replacement text side, you can get those remembered fragments back with "\1" (first parenthesized block), "\2" (second block), and so on.
If all you're really doing is removing the second character, regardless of what it is, you can do this:
s/.//2
but your command is building a mv command and piping it to the shell for execution.
This is no more readable than your version:
find -type f | sed -n 'h;s/.//4;x;s/^/mv /;G;s/\n/ /g;p' | sh
The fourth character is removed because find is prepending each filename with "./".
Here's what I would do:
for file in *.[Jj][Pp][Gg] ;do
echo mv -vi \"$file\" `jhead $file|
grep Date|
cut -b 16-|
sed -e 's/:/-/g' -e 's/ /_/g' -e 's/$/.jpg/g'` ;
done
Then if that looks ok, add | sh to the end. So:
for file in *.[Jj][Pp][Gg] ;do
echo mv -vi \"$file\" `jhead $file|
grep Date|
cut -b 16-|
sed -e 's/:/-/g' -e 's/ /_/g' -e 's/$/.jpg/g'` ;
done | sh
for i in *; do mv -- "$i" "$(printf '%s\n' "$i" | sed 's/AAA/BBB/')"; done
The parentheses capture particular strings for use by the backslashed numbers.
ls F00001-0708-*|sed 's|^F0000\(.*\)|mv & F000\1|' | bash
Some examples that work for me:
$ tree -L 1 -F .
.
├── A.Show.2020.1400MB.txt
└── Some Show S01E01 the Loreming.txt
0 directories, 2 files
## remove "1400MB" (I: ignore case) ...
$ for f in *; do mv 2>/dev/null -v "$f" "`echo $f | sed -r 's/.[0-9]{1,}mb//I'`"; done;
renamed 'A.Show.2020.1400MB.txt' -> 'A.Show.2020.txt'
## change "S01E01 the" to "S01E01 The"
## \U& : change (here: regex-selected) text to uppercase;
## note also: no need here for `\1` in that regex expression
$ for f in *; do mv 2>/dev/null "$f" "`echo $f | sed -r "s/([0-9] [a-z])/\U&/"`"; done
$ tree -L 1 -F .
.
├── A.Show.2020.txt
└── Some Show S01E01 The Loreming.txt
0 directories, 2 files
$
2>/dev/null suppresses extraneous output (warnings ...)
reference [this thread]: https://stackoverflow.com/a/2372808/1904943
change case: https://www.networkworld.com/article/3529409/converting-between-uppercase-and-lowercase-on-the-linux-command-line.html

How to extract a substring using sed on OS X?

I'm trying to iterate over each file and folder inside a directory and extract part of the file name into a variable, but I can't make sed work correctly. I either get all of the file name or none of it.
This version of the script should capture the entire file name:
#!/bin/bash
for f in *
do
substring=`echo $f | sed -E -n 's/(.*)/\1/'`
echo "sub: $substring"
done
But instead I get nothing:
sub:
sub:
sub:
sub:
...
This version should give me just the first character in the filename:
#!/bin/bash
for f in *
do
substring=`echo $f | sed -E 's/^([a-zA-Z])/\1/'`
echo "sub: $substring"
done
But instead I get the whole file name:
sub: Adlm
sub: Applications
sub: Applications (Parallels)
sub: Desktop
...
I've tried numerous iterations of it and what it basically boils down to is that if I use -n I get nothing and if I don't I get the whole file name.
Can someone show me how to get just the first character?
Or, my overall goal is to be able to extract a substring and store it into a variable, if anybody has a better approach to it, that would be appreciated as well.
Thanks in advance.
If you want to modify a shell parameter, you probably want to use a parameter expansion.
for f in *; do
# This version should expand to the whole parameter
echo "$f"
# This version should expand to the first character in the filename
echo "${f::1}"
done
Parameter expansions are not as powerful as sed, but they are built in to the shell (no launching a separate process or subshell necessary) and there are expansions for:
Substrings (as above)
Replacing and substituting characters
Altering the case of strings (bash 4+)
and more.
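`${f::1}` is a bash feature. If the loop ever has to run under plain sh, the first character can still be obtained with the POSIX prefix and suffix operators. A sketch under that assumption, with `first_char` as a hypothetical helper name:

```shell
# first_char (hypothetical name): print the first character of $1
# using only POSIX parameter expansion.
first_char() {
    rest=${1#?}          # everything after the first character
    first=${1%"$rest"}   # strip that tail, leaving only the first character
    printf '%s\n' "$first"
}

first_char "Desktop"
first_char "Applications (Parallels)"
```

Quoting `"$rest"` keeps characters such as `(` or `*` in the filename from being treated as pattern syntax.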
This version of the script should capture the entire file name:
sed -E -n 's/(.*)/\1/'
But instead I get nothing.
You used -n so naturally it won't yield anything. Perhaps you should remove -n or add p:
sed -E -n 's/(.*)/\1/p'
This version should give me just the first character in the filename:
sed -E 's/^([a-zA-Z])/\1/'
But instead I get the whole file name,
You didn't replace anything there. Perhaps what you wanted was
sed -E 's/^([a-zA-Z]).*/\1/'
Also I suggest quoting your arguments well:
substring=`echo "$f" | sed ...`
Finally the simpler method is to use substring expansion if you're using Bash as suggested by kojiro.
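The three behaviours discussed above can be compared side by side on one sample name. A small sketch, written with basic regular expressions so that it does not depend on `-E`:

```shell
name="Applications (Parallels)"

# -n suppresses automatic printing, so without the p flag nothing comes out.
silent=$(printf '%s\n' "$name" | sed -n 's/\(.*\)/\1/')

# Adding p prints the line when the substitution succeeds.
printed=$(printf '%s\n' "$name" | sed -n 's/\(.*\)/\1/p')

# Matching the rest of the line with .* lets the replacement discard it.
first=$(printf '%s\n' "$name" | sed 's/^\(.\).*/\1/')

printf 'silent=[%s] printed=[%s] first=[%s]\n' "$silent" "$printed" "$first"
```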
You forgot to add .* after the capturing group in sed:
$ for i in *; do substring=`echo $i | sed -E 's/^(.).*$/\1/'`; echo "sub: $substring"; done
It's better to use . instead of [a-zA-Z], because [a-zA-Z] fails to match when the filename starts with a digit or any other non-letter character.
I prefer awk to sed. It seems to be easier for me to understand.
#!/bin/bash
#set -x
for f in *
do
substring=`echo $f | awk '{print substr($1,1,1)}'`
echo "sub: $substring"
done

sed plus sign doesn't work

I'm trying to replace /./ or /././ or /./././ to / only in bash script. I've managed to create regex for sed but it doesn't work.
variable="something/./././"
variable=$(echo $variable | sed "s/\/(\.\/)+/\//g")
echo $variable # this should output "something/"
When I tried to replace only the /./ substring, it worked with the sed regex \/\.\/. Does sed require more flags to repeat a group with + or *?
Use the -r option to make sed use extended regular expressions:
$ variable="something/./././"
$ echo $variable | sed -r "s/\/(\.\/)+/\//g"
something/
Any sed:
sed 's|/\(\./\)\{1,\}|/|g'
But a + or \{1,\} would not even be required in this case, a * would do nicely, so
sed 's|/\(\./\)*|/|g'
should suffice
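In basic regular expressions an unescaped `(`, `)` or `+` is an ordinary character, which is why the original pattern never matched. The portable `\{1,\}` form can be checked on a couple of inputs; the second path below is an extra example that is not from the question:

```shell
variable="something/./././"

# A slash followed by one or more "./" groups collapses to a single slash.
collapsed=$(printf '%s\n' "$variable" | sed 's|/\(\./\)\{1,\}|/|g')
printf '%s\n' "$collapsed"

# The g flag handles several separate runs on the same line.
nested=$(printf '%s\n' "a/./b/././c" | sed 's|/\(\./\)\{1,\}|/|g')
printf '%s\n' "$nested"
```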
Two things to make it simple:
$ variable="something/./././"
$ sed -r 's#(\./){1,}##' <<< "$variable"
something/
Use {1,} to indicate one or more repetitions of the pattern. You won't need g with this.
Use a different delimiter (# in the case above) to make it readable.
+ is ERE syntax, so you need the -E or -r option to use it.
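You can also do this with bash's built-in parameter substitution. This doesn't require sed, whose BSD version on a Mac under OS X doesn't accept -r (it uses -E instead). Note that this particular expansion cuts at the first slash, so it only suits values like the one below, where the first slash is the one that starts the /./ run: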
variable="something/./././"
a=${variable/\/*/}/ # Remove slash and everything after it, then re-apply slash afterwards
echo $a
something/
See here for explanation and other examples.
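For values where the `/./` runs are not at the first slash, a loop over POSIX prefix and suffix removal can collapse them without sed. This is only a sketch; `collapse_dot_segments` is a hypothetical name and the second test path is made up for illustration:

```shell
# collapse_dot_segments (hypothetical name): turn every "/./" into "/"
# until none is left, using only POSIX parameter expansion.
collapse_dot_segments() {
    s=$1
    while :; do
        case $s in
            */./*)
                # Keep the text before the first "/./", add one slash,
                # then append the text after that "/./".
                s=${s%%/./*}/${s#*/./}
                ;;
            *)
                break
                ;;
        esac
    done
    printf '%s\n' "$s"
}

collapse_dot_segments "something/./././"
collapse_dot_segments "a/./b/././c"
```

Each pass removes one `/./`, so the loop runs once per occurrence and stops as soon as the string no longer contains one.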