Using multicase regex with sed on Jenkins

OK, so I'm making a choice-parameterized Jenkins job. The choices for the parameter are DEV, STAGING, QA and PROD, and they are stored in ${ENV}.
I need to change the variable ${ENV} to match a string in a URL. I'm trying to do this with a sed command using regex. Is it possible?
I tested PROD|ING|(?<!Q)A as the regex in Expresso, and it finds the necessary portions, (A,ING,PROD) which would leave me with either DEV QA STG or `` as my variable value if I replaced them with '', then I'll add something onto the end of it.
When I try to run echo "DEVSTAGINGQAPROD" | sed "s/PROD|ING|(?<!Q)A//g" to remove those chars on CentOS it returns -bash: !Q: event not found. I want it to return DEVSTGQA
echo "DEVSTAGINGQAPROD" | sed "s/PROD|ING//g" returns DEVSTAGQA as it should. The problem I seem to be having is the lookbehind: I only want to remove the A if it doesn't have a Q before it.
Any ideas how to make this work?

One problem here is that sed doesn't understand negative lookbehind. Another is your choice of quotes. History expansion is enabled by default in interactive shells, so ! has a special meaning and must be escaped inside double quotes.
To deal with the first problem, I'd suggest using Perl instead of sed, as it has a much more advanced regular expression engine. For the second, just use single quotes, within which the ! will be interpreted literally:
$ echo "DEVSTAGINGQAPROD" | perl -pe 's/PROD|ING|(?<!Q)A//g'
DEVSTGQA
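If Perl is not available on the build machine, the lookbehind can also be avoided in sed itself by capturing the character in front of the A and putting it back. This is only a sketch: it assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do), and the variable name stripped is made up for illustration.

```shell
# Step 1 removes PROD and ING. Step 2 removes an A only when the character
# before it is not Q (or when the A starts the line), keeping that character.
stripped=$(echo "DEVSTAGINGQAPROD" | sed -E 's/PROD|ING//g; s/(^|[^Q])A/\1/g')
echo "$stripped"
```

Two adjacent As would need a second pass, because the first A is consumed as the kept character; none of the four choices in the question has that shape.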

Related

Expand environment variable inside Perl regex

I am having trouble with a short bash script. It seems like all forward slashes needs to be escaped. How can required characters in expanded (environment) variables be escaped before perl reads them? Or some other method that perl understands.
This is what I am trying to do, but this will not work properly.
eval "perl -pi -e 's/$HOME\/_TV_rips\///g'" '*$videoID.info.json'
That is part of a longer script where videoID=$1. (And for some reason perl expands variables both within single and double quotes.)
This simple workaround with no forward slash in the expanded environment variable $USER works. But I would like to not have /Users/ hard coded:
eval "perl -pi -e 's/\/Users\/$USER\/_TV_rips\///g'" '*$videoID.info.json'
This is probably solvable in some better way fetching home dir for files or something else. The goal is to remove the folder name in youtube-dl's json data.
I am using perl just because it can handle extended regex. But perl is not required. Any better substitute for extended regex on macOS is welcome.
You are building the following Perl program:
s//home/username\/_TV_rips\///g
That's quite wrong.
You shouldn't be attempting to build Perl code from the shell in the first place. There are a few ways you could pass values to the Perl code instead of generating Perl code. Since the value is conveniently in the environment, we can use
perl -i -pe's/\Q$ENV{HOME}\E\/_TV_rips\///' *"$videoID.info.json"
or better yet
perl -i -pe's{\Q$ENV{HOME}\E/_TV_rips/}{}' *"$videoID.info.json"
(Also note the lack of eval and the fixed quoting on the glob.)
Just assembling the ideas from the comments, this should achieve what you expected:
perl -pi -e 's{$ENV{HOME}/_TV_rips/}{}g' *$videoID.info.json
@ikegami thanks for your comment! It is indeed safer with \Q...\E, in case $HOME contains characters like $.
All regex delimiters must of course be escaped in the input string.
But as Stefen stated, you can use other delimiters in Perl, like % or §.
Special characters:
#: starts a Perl comment, so don't use it as a delimiter.
?, [], {}, $, ^, .: regex control characters, which must be escaped in the regex.
Choosing a different delimiter makes things easier if you have many slashes in your string.
You should always write a comment to make clear you are using different delimiters, because this makes your regex hard to read for inexperienced users.
Try out your RegEx here: https://regex101.com/r/cIWk1o/1

Where is this Regex expression not closed in sed (apostrophe parenthesis)?

I'm trying to update some setting for wordpress and I need to use sed. When I run the below command, it seems to think the line is not finished. What am I doing wrong?
$ sed -i 's/define\( \'DB_NAME\', \'database_name_here\' \);/define\( \'DB_NAME\', \'wordpress\' \);/g' /usr/share/nginx/wordpress/wp-settings.php
> ^C
Thanks.
Single quotes in most shells don't support any escaping. If you want to include a single quote, you need to close the single quotes and add the single quote - either in double quotes, or backslashed:
sed 's/define\( '\''DB_NAME'\'', '\''database_name_here'\'' \);/define\( '\''DB_NAME'\'', '\''wordpress'\'' \);/g'
I fear it still wouldn't work for you, as \( is special in sed. You probably want just a simple ( instead.
sed 's/define( '\''DB_NAME'\'', '\''database_name_here'\'' );/define( '\''DB_NAME'\'', '\''wordpress'\'' );/g'
or
sed 's/define( '"'"'DB_NAME'"'"', '"'"'database_name_here'"'"' );/define( '"'"'DB_NAME'"'"', '"'"'wordpress'"'"' );/g'
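The quoting forms used above can be compared directly in the shell. This is a small sketch with a shortened string; the variable names a, b and c are only for illustration.

```shell
# Three spellings of the same string, each containing literal single quotes.
a='define( '\''DB_NAME'\'' );'    # close quote, backslash-escaped ', reopen
b='define( '"'"'DB_NAME'"'"' );'  # close quote, double-quoted ', reopen
c="define( 'DB_NAME' );"          # double quotes around the whole string
printf '%s\n' "$a" "$b" "$c"
```

All three lines print identically, which shows that the shell joins the adjacent quoted pieces into one argument before sed ever sees it.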
Normally, using single quotes around a sed script is sensible. This is a case where double quotes would be a better choice, because there are no shell metacharacters other than single quotes in the sed script:
sed -e "s/define( 'DB_NAME', 'database_name_here' );/define( 'DB_NAME', 'wordpress' );/g" /usr/share/nginx/wordpress/wp-settings.php
or:
sed -e "s/\(define( 'DB_NAME', '\)database_name_here' );/\1wordpress' );/g" /usr/share/nginx/wordpress/wp-settings.php
or even:
sed -e "/define( 'DB_NAME', 'database_name_here' );/s/database_name_here/wordpress/g" /usr/share/nginx/wordpress/wp-settings.php
One other option to consider is using sed's -f option to provide the script as a file. That saves you from having to escape the script contents from the shell. The downside may be that you have to create the file, run sed using it, and then remove the file. It is likely that's too painful for the current task, but it can be sensible — it can certainly make life easier when you don't have to worry about shell escapes.
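A minimal sketch of the -f approach, assuming a temporary script file created with mktemp; the variable names script and updated are illustrative only.

```shell
# Write the sed script to a temporary file. The quoted heredoc delimiter
# ('EOF') stops the shell from interpreting anything inside it.
script=$(mktemp)
cat > "$script" <<'EOF'
s/define( 'DB_NAME', 'database_name_here' );/define( 'DB_NAME', 'wordpress' );/
EOF

updated=$(echo "define( 'DB_NAME', 'database_name_here' );" | sed -f "$script")
echo "$updated"
rm -f "$script"
```

The script file contains the single quotes exactly as they appear in the PHP source, with no shell escaping at all.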
I'm not convinced the g (global replace) option is relevant; how many single lines are you going to find in the settings file containing two independent define DB_NAME operations with the default value?
You can add the -i option when you've got the basic code working. Do note that if you might ever work on macOS or a BSD-based system, you'll need to provide a suffix as an extra argument to the -i option (e.g. -i '' for a null suffix, meaning no backup), or use -i.bak to be able to work reliably on both Linux (or, more accurately, with GNU sed) and macOS and BSD (or, more accurately, with BSD sed). Appealing to POSIX is no help; it doesn't support an overwrite option.
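As a sketch of the portable form, the following edits a throwaway file in place with -i.bak. The file name settings.php is a hypothetical stand-in for the real WordPress file.

```shell
workdir=$(mktemp -d)
settings="$workdir/settings.php"   # hypothetical stand-in for the real file
echo "define( 'DB_NAME', 'database_name_here' );" > "$settings"

# -i.bak (suffix attached, no space) is accepted by both GNU and BSD sed
# and leaves the original content in settings.php.bak.
sed -i.bak -e "/define( 'DB_NAME', /s/database_name_here/wordpress/" "$settings"

edited=$(cat "$settings")
backup=$(cat "$settings.bak")
echo "$edited"
rm -rf "$workdir"
```

Keeping the backup costs one extra file, but it gives a way back if the substitution matched more than intended.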
Test case (second example):
$ echo "define( 'DB_NAME', 'database_name_here' );" |
> sed -e "s/\(define( 'DB_NAME', '\)database_name_here' );/\1wordpress' );/g"
define( 'DB_NAME', 'wordpress' );
$
If the spacing around 'DB_NAME' is not consistent, then you'd end up with more verbose regular expressions, using [[:space:]]* in lieu of blanks, and you'd find the third alternative better than the others, but the second could capture both the leading and trailing contexts and use both captures in the replacement.
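A sketch of the whitespace-tolerant variant, capturing both the leading and the trailing context. The irregular spacing in the sample input is invented for the demonstration.

```shell
# \1 holds everything up to the opening quote of the value;
# \2 holds the closing quote through the semicolon.
flexible=$(echo "define('DB_NAME' ,   'database_name_here'  );" |
  sed -e "s/\(define([[:space:]]*'DB_NAME'[[:space:]]*,[[:space:]]*'\)database_name_here\('[[:space:]]*);\)/\1wordpress\2/")
echo "$flexible"
```

Whatever spacing the input used is carried over unchanged through the two captures; only the database name itself is rewritten.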
Parting words: this technique works this time because the patterns don't involve shell metacharacters like $ or `. Very often, the script does need to match those, and then using mainly single quotes around the script argument is sensible. Tackling a different task, namely replacing $DB_NAME in the input with the value of the shell variable $DB_NAME (leaving $DB_NAMEORHOST unchanged):
sed -e 's/$DB_NAME\([^[:alnum:]]\)/'"$DB_NAME"'\1/'
There are three separate shell strings, all concatenated with no spaces. The first is single-quoted and contains the s/…/ part of a s/…/…/ command; the second is "$DB_NAME", the value of the shell variable, double-quoted so that if the value of $DB_NAME is 'autonomous vehicle recording', you still have a single argument to sed; the third is the '\1/' part, which puts back whatever character followed $DB_NAME in the input text (with the observation that if $DB_NAME could appear at the end of an input line, this would not match it).
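The three-part concatenation can be tried out as follows. The input line and the variable names line and replaced are made up for the example.

```shell
DB_NAME='autonomous vehicle recording'
line='name=$DB_NAME; host=$DB_NAMEORHOST;'

# The single-quoted pieces keep $DB_NAME literal for sed; the double-quoted
# piece in the middle is expanded by the shell before sed runs.
replaced=$(printf '%s\n' "$line" | sed -e 's/$DB_NAME\([^[:alnum:]]\)/'"$DB_NAME"'\1/')
echo "$replaced"
```

One caveat with this sketch: if the value of the shell variable contains /, & or a backslash, sed will interpret those characters in the replacement, so such values would need escaping first.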
Most regexes do fuzzy matching; you have to consider variations on what might be in the input to determine how hard your regular expressions have to work to identify the material accurately.

Trying to remove version number from a string using sed in OSX

I have what I hope is a simple issue which is stumping me. I need to take an installer file with a name like:
installer_v0.29_linux.run
installer_v10.22_linux_x64.run
installer_v1.1_osx.app
installer_v5.6_windows.exe
and zip it up into a file with the format
installer_linux.zip
installer_linux_x64.zip
installer_osx.zip
installer_windows.zip
I already have a bash script running on OSX which does almost everything else I need in the build chain, and was certain I could achieve this with sed using something like:
ZIP_NAME=`echo "$OUTPUT_NAME" | sed -E 's/_(?:\d*\.)?\d+//g'`
That is, replacing the regex _(?:\d*\.)?\d+ with a blank - the regex should match any decimal number preceded by an underscore.
However, I get the error RE error: repetition-operator operand invalid when I try to run this. At this stage I am stumped - I have Googled around this and can't see what I am doing wrong. The regex I wrote works correctly at Regexr, but clearly some element of it is not supported by the sed implementation in OSX. Does anyone know what I am doing wrong?
You can try this sed:
sed 's/_v[^_]*//; s/\.[[:alnum:]]\+$/.zip/' file
installer_linux.zip
installer_linux_x64.zip
installer_osx.zip
installer_windows.zip
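As for why the original attempt failed: (?:...) and \d are Perl-style syntax that is not part of POSIX extended regular expressions, which is what sed -E works with. The \+ in the command above is a GNU extension to basic regular expressions and may not be accepted by a BSD sed. A sketch that sticks to POSIX ERE, assuming the version always starts with _v as in the sample names (zip_names is an illustrative variable):

```shell
# [0-9] replaces \d, a plain group replaces (?:...), and + is native in ERE.
zip_names=$(
  for name in installer_v0.29_linux.run installer_v10.22_linux_x64.run \
              installer_v1.1_osx.app installer_v5.6_windows.exe; do
    echo "$name" | sed -E 's/_v[0-9]+(\.[0-9]+)*//; s/\.[[:alnum:]]+$/.zip/'
  done
)
echo "$zip_names"
```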
You don't need sed, just some parameter expansion magic with an extended pattern.
shopt -s extglob
zip_name=${OUTPUT_NAME/_v+([^_])/}
The pattern _v+([^_]) matches a string starting with _v and all characters up to the next _. The extglob option enables the use of the +(...) pattern to match one or more occurrences of the enclosed pattern (in this case, a non-_ character). The parameter expansion ${var/pattern/} removes the first occurrence of the given pattern from the expansion of $var.
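Putting the expansion together with the extension change gives something like the following sketch. It assumes bash (extglob is not available in plain sh), and the intermediate variable stem is introduced only for illustration.

```shell
shopt -s extglob   # bash-only: enables +(...) in patterns

OUTPUT_NAME=installer_v10.22_linux_x64.run
stem=${OUTPUT_NAME/_v+([^_])/}   # version removed: installer_linux_x64.run
zip_name=${stem%.*}.zip          # drop the last extension, add .zip
echo "$zip_name"
```

No external process is started, which can matter when the expansion runs inside a loop over many files.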
You can also try it this way:
sed 's/_[^_]\+//' FileName
Output:
installer_linux.run
installer_linux_x64.run
installer_osx.app
installer_windows.exe
If you want to add a .zip extension as well, use the method below:
sed 's/\([^_]\+\).*\(_.*\).*/\1\2.zip/' Filename
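Output (note that .zip is appended rather than replacing the original extension, and that the second name loses its _linux part):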
installer_linux.run.zip
installer_x64.run.zip
installer_osx.app.zip
installer_windows.exe.zip

Convert last hyphen in filename using BASH

I've been tasked with a major file rename project. Some of these files that I'll be renaming contain multiple hyphens. I need to swap the last hyphen in the name to an underscore in order for the files to be renamed to our new naming convention.
Can anyone explain to me why the last hyphen is not being replaced with an underscore in the test code below?
#!/bin/bash
image_name="i-need-the-last-hyphen-removed.psd"
echo -e "Normal: ${image_name}"
echo "Changed: ${image_name/%-/_}"
The output I am looking for should mimic the following:
Normal: i-need-the-last-hyphen-removed.psd
Changed: i-need-the-last-hyphen_removed.psd
The script logic was created by following documentation found here: http://tldp.org/LDP/abs/html/string-manipulation.html
I've tried escaping the hyphen but that was not fruitful. I've given up; this would prove to be the most elegant solution compared with the sed and/or BASH_REMATCH solutions I was working with in the past.
Any help would be great. Thank you in advance.
I'd suggest using the rename tool (the Perl-based version) for this kind of task. Its expressions are similar to sed patterns.
rename 's/(.*)-/$1_/' *.psd
Since .* is greedy, the last '-' is the one that gets matched, and everything before it is captured in the group (.*). The part to the right of it is not changed.
With *.psd you match all psd files in the current folder.
Huge thanks to @alex-p for the following suggestion. As I originally stated, I didn't want to use sed or BASH_REMATCH or any other complex regex. This worked flawlessly.
echo "${image_name%-*}_${image_name##*-}"
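The suffix and prefix stripping can be wrapped in a small function. This is a sketch, not the poster's code: the function name swap_last_hyphen is hypothetical, and the case guard is an addition, because without it a name containing no hyphen would come out doubled (name_name).

```shell
swap_last_hyphen() {
  case $1 in
    # ${1%-*} strips the shortest suffix starting at a hyphen;
    # ${1##*-} strips the longest prefix ending at a hyphen.
    *-*) printf '%s_%s\n' "${1%-*}" "${1##*-}" ;;
    *)   printf '%s\n' "$1" ;;
  esac
}

swap_last_hyphen "i-need-the-last-hyphen-removed.psd"
swap_last_hyphen "plain.psd"
```

Both expansions are plain POSIX parameter expansion, so the function should also work in shells other than bash.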
You can do it using sed as:
sed -r "s/(.*)-(.*)/\1_\2/"
This uses two capture groups (1: everything before the last -, 2: everything after it), which are joined with _.
Or
sed -r "s/-([^-]*$)/_\1/"
This uses one capture group: the last - is replaced with _, and the captured text that followed it is appended.
${image_name/%-/_} would only work if the - were the very end (the suffix) of $image_name, as in mystring-.
Try using sed:
$> echo i-need-the-last-hyphen-removed.psd | sed -r 's/-([^-]*$)/_\1/'
i-need-the-last-hyphen_removed.psd
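For an actual batch rename, the substitution could be driven from a loop like the sketch below. It runs in a temporary directory with invented file names, and it uses -E rather than -r on the assumption that the script may need to run with BSD sed as well as GNU sed.

```shell
workdir=$(mktemp -d)
touch "$workdir/i-need-the-last-hyphen-removed.psd" "$workdir/plain.psd"

for path in "$workdir"/*.psd; do
  file=${path##*/}   # work on the file name only, not the directory part
  new=$(printf '%s\n' "$file" | sed -E 's/-([^-]*)$/_\1/')
  # Skip names without a hyphen so mv is not asked to rename a file to itself.
  [ "$file" = "$new" ] || mv -- "$path" "$workdir/$new"
done

renamed=$(ls "$workdir")
echo "$renamed"
rm -rf "$workdir"
```

On real data it is worth replacing mv with echo for a first dry run and checking the proposed names before renaming anything.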

Period wildcard not working in Bash pattern replacement

Given this Bash code:
TEMP="1_2"
echo ${TEMP/_.*/}
why does it print out 1_2 instead of 1?
I've also tried these, but they don't work:
echo ${TEMP/_\.*/}
echo ${TEMP/_\\.*/}
This does work:
echo ${TEMP/_[0-9]*/}
but I want to know:
Why isn't the period acting as a wildcard?
What should I use instead?
A question mark is the single-character wildcard. However, it doesn't work like regular expressions where the asterisk is a quantifier. In Bash, in parameter expansions, an asterisk is a multicharacter wildcard.
$ temp=1_2
$ echo "${temp/_*}"
1
The following also work in this particular situation. See Parameter Expansion in man bash for more information regarding the differences.
echo "${temp%_*}"
echo "${temp%%_*}"
I recommend against using all-caps variable names in order to reduce the chance of name collision with shell or environment variables.
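The wildcard behaviour described above can be compared side by side. This sketch assumes bash (the ${var/pattern/} form is not POSIX) and uses a three-part value so that the differences become visible; the variable names are illustrative.

```shell
temp=1_2_3
one_char=${temp/_?/}     # ? matches exactly one character      -> 1_3
any_run=${temp/_*/}      # * matches any run of characters      -> 1
shortest=${temp%_*}      # shortest matching suffix is removed  -> 1_2
longest=${temp%%_*}      # longest matching suffix is removed   -> 1
literal_dot=${temp/_.*/} # . is an ordinary character, no match -> 1_2_3
printf '%s\n' "$one_char" "$any_run" "$shortest" "$longest" "$literal_dot"
```

With the two-part value from the question, the second, third and fourth forms all give 1, which is why they appear interchangeable there.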