Transform a dynamic alphanumeric string - regex

I have a build called 700-I20190808-0201. I need to convert it to 7.0.0-I20190808-0201. I can do that with a regular expression:
sed 's/\([0-9]\)\([0-9]\)\([0-9]\)\(.\)/\1.\2.\3\4/' abc.txt
But the solution does not work when the build ID is 7001-I20190809-0201. Can we make the regular expression dynamic so that it works for both (700 and 7001)?

Could you please try the following:
awk 'BEGIN{FS=OFS="-"}{gsub(/[0-9]/,"&.",$1);sub(/\.$/,"",$1)} 1' Input_file

If you have Perl available, lookahead regular expressions make this straightforward:
$ cat foo.txt
700-I20190808-0201
7001-I20190809-0201
$ perl -ple 's/(\d)(?=\d+-I)/$1./g' foo.txt
7.0.0-I20190808-0201
7.0.0.1-I20190809-0201

You can implement a simple loop in sed using a label and a conditional branch:
$ echo '7001-I20190809-0201' | sed ':1; s/^\([0-9]\{1,\}\)\([0-9][-.]\)/\1.\2/; t1'
7.0.0.1-I20190809-0201
$ echo '700-I20190809-0201' | sed ':1; s/^\([0-9]\{1,\}\)\([0-9][-.]\)/\1.\2/; t1'
7.0.0-I20190809-0201
If your sed supports the -E flag:
sed -E ':1; s/^([0-9]+)([0-9][-.])/\1.\2/; t1'
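
To make the loop easier to follow, here is a minimal sketch that wraps the same substitution in a shell function. The name dot_build_id is made up for this example, and the label and branch are passed as separate -e expressions, which should also suit sed implementations that do not accept a label followed by a semicolon. Each pass moves one more dot into the leading run of digits (7001- becomes 700.1-, then 70.0.1-, then 7.0.0.1-), and t jumps back to the label as long as the last substitution matched.

```shell
# Hypothetical helper: put a dot between the leading digits of a build ID.
dot_build_id() {
    printf '%s\n' "$1" |
        sed -e ':again' \
            -e 's/^\([0-9]\{1,\}\)\([0-9][-.]\)/\1.\2/' \
            -e 't again'   # branch back while the substitution still matched
}

dot_build_id 700-I20190808-0201
dot_build_id 7001-I20190809-0201
```

The loop stops on its own once the first digit is directly followed by a dot, because the pattern needs at least two adjacent digits at the start of the line.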

sed -e 's/\([0-9]\)\([0-9]\)\([0-9]\)\(.\)/\1.\2.\3.\4/' -e 's/\.\-/\-/' abc.txt
This worked for me and is very simple. I just needed to use it in my Ant script with a replaceregex pattern.

Related

sed not performing expected substitution

I have a bash variable, some file path (with spaces) and filename, e.g:
$ echo $tmp
/home/xyz/some/path/with spaces/AlbumArt_{random-number-sequence}_Large.jpg
When I attempt to identify the filename part with grep, e.g:
$ echo "$tmp" | egrep 'AlbumArt.*Large.jpe?g$'
/home/xyz/some/path/with spaces/**AlbumArt_{random-number-sequence}_Large.jpg**
The filename part appears to be identified correctly, but when I attempt to convert this to a sed substitution expression, e.g:
$ echo "$tmp" | sed 's#AlbumArt.*Large.jpe?g$#NewString#'
/home/xyz/some/path/with spaces/AlbumArt_{random-number-sequence}_Large.jpg
The expected substitution isn't happening. Thanks in advance for any help.

egrep is equivalent to grep -E, which enables extended regular expressions (see https://en.wikipedia.org/wiki/Regular_expression#Standards). By default, sed uses basic regular expressions, in which ? is an ordinary character, so your pattern never matches.
You just need to use the same option with sed:
echo "$tmp" | sed -E 's#AlbumArt.*Large.jpe?g$#NewString#'
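
The difference can be checked side by side. This is only a sketch: the value 12345 is a made-up stand-in for {random-number-sequence}, and the variable names are chosen for the example. The last command shows one way to stay in basic regular expressions by using an interval expression in place of ?.

```shell
# Sample path; "12345" is a made-up stand-in for {random-number-sequence}.
tmp='/home/xyz/some/path/with spaces/AlbumArt_12345_Large.jpg'

# Basic regex (sed default): "?" is a literal character, so nothing is replaced.
bre=$(printf '%s\n' "$tmp" | sed 's#AlbumArt.*Large.jpe?g$#NewString#')

# Extended regex: "e?" means an optional "e", so the filename is replaced.
ere=$(printf '%s\n' "$tmp" | sed -E 's#AlbumArt.*Large.jpe?g$#NewString#')

# Basic regex with an interval expression instead of "?".
bre_interval=$(printf '%s\n' "$tmp" | sed 's#AlbumArt.*Large\.jpe\{0,1\}g$#NewString#')

printf '%s\n' "$bre" "$ere" "$bre_interval"
```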

Swap columns in bash using SED without using loop

I'm new to sed and I'm trying to learn some patterns with it.
I have a filename.txt that contains the following entry:
ppp/jjj qqq/kkk rrr/lll
My goal is to swap the word before the slash and the word after the slash in each of the three word1/word2 columns:
jjj/ppp kkk/qqq lll/rrr
I tried using sed -re 's!(.*)(/)(.*)!\1\2\!' filename.txt, but it didn't work. Any idea how I can go about it?

$ echo "ppp/jjj qqq/kkk rrr/lll" | sed -e 's/$/ /' -e 's!\([^/]*\)/\([^ ]*\) !\2/\1 !g'
jjj/ppp kkk/qqq lll/rrr

Using a substitution on the Perl command line is a lot more straightforward:
perl -pe 's/(\w+)\/(\w+)/$2\/$1/g' file
jjj/ppp kkk/qqq lll/rrr

$ sed 's#\([^ ]*\)/\([^ ]*\)#\2/\1#g' file
jjj/ppp kkk/qqq lll/rrr
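
One likely reason the attempt in the question fails is that (.*) is greedy and matches across all three columns. A minimal sketch of an extended-regex variant follows, assuming the columns are separated by spaces and each contains exactly one slash; swap_pairs is a made-up name for this example.

```shell
# Hypothetical helper: swap the two words around each slash.
swap_pairs() {
    # [^ /]+ cannot cross a space or a slash, so every match stays in one column.
    printf '%s\n' "$1" | sed -E 's#([^ /]+)/([^ /]+)#\2/\1#g'
}

swap_pairs 'ppp/jjj qqq/kkk rrr/lll'
```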

BASH: replacing PERL with SED for in-place substitution

I would like to replace this Perl statement:
perl -pe "s|(?<=://).+?(?=/)|$2:80|"
with
sed -e "s|<regex>|$2:80|"
Since sed has a much less powerful regex engine (for example, it does not support look-arounds), the task boils down to writing a sed-compatible regex that matches only the domain name in a fully qualified URL. Examples:
http://php2-mindaugasb.c9.io/Testing/JS/displayName.js
http://php2-mindaugasb.c9.io?a=Testing.js
http://www.google.com?a=Testing.js
Should become:
http://$2:80/Testing/JS/displayName.js
http://$2:80?a=Testing.js
http://$2:80?a=Testing.js
A solution like this would be ok:
sed -e "s|<regex>|http://$2:80|"
Thanks :)

Use the below sed command.
$ sed "s~//[^/?]\+\([?/]\)~//\$2:80\1~g" file
http://$2:80/Testing/JS/displayName.js
http://$2:80?a=Testing.js
http://$2:80?a=Testing.js
You need to escape the $ in the replacement part: the expression is in double quotes, so the shell would otherwise expand $2.

sed 's|http://[^/?]*|http://$2:80|' file
Output:
http://$2:80/Testing/JS/displayName.js
http://$2:80?a=Testing.js
http://$2:80?a=Testing.js
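
The outputs above keep a literal $2. If the intent is instead to insert the value of a shell parameter, the expression has to be double-quoted so the shell expands it. A minimal sketch under that assumption; set_host and example.org are made up for illustration, and the value must not contain characters that are special in this sed expression.

```shell
# Hypothetical helper: replace the host of each http:// URL read from stdin.
# $1 is the new host; it must not contain "|", "&" or "\" (special to sed here).
set_host() {
    sed "s|http://[^/?]*|http://$1:80|"
}

printf '%s\n' \
    'http://php2-mindaugasb.c9.io/Testing/JS/displayName.js' \
    'http://www.google.com?a=Testing.js' |
    set_host example.org
```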

How to cut a string from a string

My script gets this string for example:
/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file
Let's say I don't know how long the part of the string before /importance is.
I want a new variable that will keep only the /importance/lib1/lib2/lib3/file from the full string.
I tried to use sed 's/.*importance//' but it's giving me the path without the importance....
Here is the command in my code:
find <main_path> -name file | sed 's/.*importance//'
I am not familiar with regex, so I need your help please :)
Sorry, my friends, I got my question wrong:
I don't need the output /importance/lib1/lib2/lib3/file but /importance/lib1/lib2/lib3 with no /file in the output.
Can you help me?

I would use awk:
$ echo "/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file" | awk -F"/importance/" '{print FS$2}'
/importance/lib1/lib2/lib3/file
Which is the same as:
$ awk -F"/importance/" '{print FS$2}' <<< "/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file"
/importance/lib1/lib2/lib3/file
That is, we set the field separator to /importance/, so that the first field is what comes before it and the 2nd one is what comes after. To print /importance/ itself, we use FS!
All together, and to save it into a variable, use:
var=$(find <main_path> -name file | awk -F"/importance/" '{print FS$2}')
Update
I don't need the output /importance/lib1/lib2/lib3/file but
/importance/lib1/lib2/lib3 with no /file in the output.
Then you can use something like dirname to get the path without the name itself:
$ dirname "$(awk -F"/importance/" '{print FS$2}' <<< "/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file")"
/importance/lib1/lib2/lib3
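
When find prints several paths, the last component can also be dropped inside awk itself, one line at a time. This is a sketch, not the answer's original method: importance_dir is a made-up name, and it assumes /importance/ occurs only once in each path.

```shell
# Hypothetical helper: read paths on stdin and print "/importance/..."
# without the last path component. Lines without /importance/ are skipped.
importance_dir() {
    awk -F'/importance/' 'NF > 1 {
        p = FS $2                 # put the separator back in front
        sub("/[^/]*$", "", p)     # drop the final /name
        print p
    }'
}

printf '%s\n' '/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file' | importance_dir
```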

Instead of substituting all until importance with nothing, replace with /importance:
~$ echo $var
/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file
~$ sed 's:.*importance:/importance:' <<< $var
/importance/lib1/lib2/lib3/file
As noted by @lurker, if importance can also appear as part of another directory name, you can add the surrounding slashes to be safe:
~$ sed 's:.*/importance/:/importance/:' <<< "/dir1/dirimportance/importancedir/..../importance/lib1/lib2/lib3/file"
/importance/lib1/lib2/lib3/file

With GNU sed:
echo '/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file' | sed -E 's#.*(/importance.*)#\1#'
Output:
/importance/lib1/lib2/lib3/file

pure bash
kent$ a="/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file"
kent$ echo ${a/*\/importance/\/importance}
/importance/lib1/lib2/lib3/file
external tool: grep
kent$ grep -o '/importance/.*' <<<$a
/importance/lib1/lib2/lib3/file

I tried to use sed 's/.*importance//' but it's giving me the path without the importance....
You were very close. All you had to do was put /importance back in the replacement:
sed 's|.*/importance|/importance|'
However, I would use Bash's built-in parameter expansion, which avoids starting an external process.
The expansion ${foo##pattern} takes the value of the shell variable foo and removes the longest match of the glob pattern from its beginning. Since that also removes importance itself, prepend it again:
file_name="/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file"
file_name="/importance${file_name##*/importance}"
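
The same kind of expansion can also drop the trailing /file that the updated question asks about. A minimal sketch; the variable names rest, kept and parent are made up. Note that ## removes the matched prefix including the word importance, so it is put back explicitly.

```shell
path='/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file'

rest=${path##*/importance}      # longest prefix ending in /importance removed
kept="/importance$rest"         # put the directory name back
parent=${kept%/*}               # shortest suffix matching /* removed

printf '%s\n' "$kept" "$parent"
```

Both expansions are handled by the shell itself, so no external command is started.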

Removing the /file at the end, as you asked:
echo '<path>' | sed -r 's#.*(/importance.*)/[^/]*#\1#'
Input /dir1/dir2/dir3.../importance/lib1/lib2/lib3/file
Returns: /importance/lib1/lib2/lib3
See this "Match groups" tutorial.

Remove substring till first Token using regexp

I have the Path:
GarbageContainingSlashesAndDots/TOKEN/xyz/TOKEN/abc
How could I remove GarbageContainingSlashesAndDots?
I know it comes before TOKEN, but unfortunately the string contains TOKEN twice.
Using sed s/.*TOKEN// reduces my string to /abc,
but I need /TOKEN/xyz/TOKEN/abc
Thank You!!!

Divide and conquer:
$ echo 'Garbage.Containing/Slashes/And.Dots/TOKEN/xyz/TOKEN/abc' |
sed -n 's|/TOKEN/|\n&|;s/.*\n//;p'
/TOKEN/xyz/TOKEN/abc

Is perl instead of sed allowed?
perl -pe 's!.*?(?=/TOKEN)!!'
echo 'GarbageContainingSlashesAndDots/TOKEN/xyz/TOKEN/abc' | perl -pe 's!.*?(?=/TOKEN)!!'
# returns:
/TOKEN/xyz/TOKEN/abc
Sed does not support non-greedy matching. Perl does.
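
If Perl is not available, the effect of a non-greedy match can be approximated without a regex by locating the first occurrence of the token as a plain string. The sketch below swaps in awk's index() and substr() for that purpose; from_first_token is a made-up name.

```shell
# Hypothetical helper: print $1 starting at the first occurrence of /TOKEN/.
from_first_token() {
    printf '%s\n' "$1" |
        awk -v tok='/TOKEN/' '{
            i = index($0, tok)              # 1-based position, 0 if absent
            print (i ? substr($0, i) : $0)  # unchanged when the token is missing
        }'
}

from_first_token 'Garbage.Containing/Slashes/And.Dots/TOKEN/xyz/TOKEN/abc'
```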

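I think you have bash, so it can be as simple as this (assuming the garbage part itself contains no slashes; note that the leading slash is removed too):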
$ s="GarbageContainingSlashesAndDots/TOKEN/xyz/TOKEN/abc"
$ echo ${s#*/}
TOKEN/xyz/TOKEN/abc
or if you have Ruby(1.9+)
echo $s | ruby -e 'print gets.split("/",2)[-1]'
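
When the garbage part does contain slashes, the expansion can be anchored on the token instead of on the first slash. A minimal sketch, assuming /TOKEN/ occurs at least once in the string; the variable name result is made up.

```shell
s='Garbage.Containing/Slashes/And.Dots/TOKEN/xyz/TOKEN/abc'

# "#" removes the shortest prefix matching */TOKEN/, i.e. everything up to
# and including the first /TOKEN/, which is then put back in front.
result="/TOKEN/${s#*/TOKEN/}"

printf '%s\n' "$result"
```

If the string contains no /TOKEN/ at all, the expansion leaves it unchanged and the prefix would be added wrongly, so a real script would need to check for that case first.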

Thank you for all suggestions, I've learnt something new.
Finally I was able to reach my goal using grep -o
echo "GarbageContainingSlashesAndDots/TOKEN/xyz/TOKEN/abc" | grep -o "/TOKEN/.*/TOKEN/.*"

Using grep:
word='GarbageContainingSlashesAndDots/TOKEN/xyz/TOKEN/abc'
echo $word | grep -o '/.*'

echo "./a//...b/TOKEN/abc/TOKEN/xyz"|sed 's#.*\(/TOKEN/.*/TOKEN/.*\)#\1#'

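UPDATE 2: have you tried this?
s!.*\(/TOKEN.*TOKEN.*\)!\1!
UPDATE: sorry, non-greedy matches are not supported by sed, so my original suggestion below does not work there.
Original suggestion:
s/.*?TOKEN//
In engines that support non-greedy quantifiers, .*? matches only up to the first occurrence of TOKEN.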
