Sed Search And Replace Question - regex

What can I do if I want sed to give me this ID as the final result: b76f5e86
The input might be http://www.uploading.com/files/b76f5e86/hcadssoto720.part1.rar
or http://uploading.com/files/b76f5e86/hcadssoto720.part1.rar, and I want the output to be the ID from the link.
For instance:
echo http://uploading.com/files/b76f5e86/hcadssoto720.part1.rar | sed 's/SOMETHING/SOMETHING/g'
OUTPUT: b76f5e86
echo http://www.uploading.com/files/b76f5e86/hcadssoto720.part1.rar | sed 's/SOMETHING/SOMETHING/g'
OUTPUT: b76f5e86
I hope to get the help of regular expression experts.
Cheers

cut -d'/' -f5
Should do it afaik.

try:
echo "http://uploading.com/files/b76f5e86/hcadssoto720.part1.rar" | sed 's|\([^/]*/\)\{4\}\([0-9a-f]*\).*|\2|'
Note that the '/' delimiter is replaced by '|' to avoid "leaning toothpick syndrome", that is, having to escape every slash in the URL pattern.
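Both answers can be checked against the two URL forms from the question. The sketch below wraps each one in a helper; the names id_with_cut and id_with_sed are made up for this illustration. It assumes the ID is always the fifth '/'-separated field, which holds with or without the "www." prefix.

```shell
# Hypothetical helpers wrapping the two answers above.
id_with_cut() {
    # Fields split on '/': "http:", "", host, "files", ID, file name
    printf '%s\n' "$1" | cut -d'/' -f5
}

id_with_sed() {
    # Skip four "text up to and including a slash" groups, keep the hex run.
    printf '%s\n' "$1" | sed 's|\([^/]*/\)\{4\}\([0-9a-f]*\).*|\2|'
}

for url in \
    "http://uploading.com/files/b76f5e86/hcadssoto720.part1.rar" \
    "http://www.uploading.com/files/b76f5e86/hcadssoto720.part1.rar"
do
    id_with_cut "$url"
    id_with_sed "$url"
done
```

The cut version is shorter; the sed version additionally stops at the first character that is not a lowercase hex digit.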

Related

Get first set of 8 numbers only with Sed

I'm using sed on Windows to extract the first set of eight digits from a file name, but it keeps giving me the second set, and I cannot figure out what I'm doing wrong.
My Code:
echo JiggySauce_20161208_21325005_Meat.txt | sed -r "s/.*_([0-9]*)_.*/\1/g"
Additional example (so a regex that counts underscore delimiters won't always work):
echo JiggySauce_Mustard_Mayo_20161208_21325005_Meat.txt | sed -r "s/.*_([0-9]*)_.*/\1/g"
I keep getting this wrong result (at least not what I need):
21325005
My expected result:
20161208
I could even live with (preferably not, but I suppose I could work with it):
20161208_21325005
Please help me with this if you have an answer as I'm at a standstill looking dumb and stumped over here like UHHH....
With GNU sed:
echo JiggySauce_20161208_21325005_Meat.txt | sed -r 's/^[^_]*_([^_]*).*/\1/'
Output:
20161208
Update after the initial answer (suggested by Cyrus):
sed -r 's/[^0-9]*([0-9]{8}).*/\1/'
Output:
20161208
See: The Stack Overflow Regular Expressions FAQ
Using grep:
echo JiggySauce_20161208_21325005_Meat.txt | grep -Eo '[0-9]+' | head -1
or
echo JiggySauce_20161208_21325005_Meat.txt | tr '_' '\n' | grep -m1 -Eo '[0-9]+'
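The command in the question returns the second number because the leading .* is greedy: it swallows as much of the line as it can, so the capture group lands on the last underscore-delimited run of digits. The sketch below shows that behaviour next to the suggested fix. The helper name first_date is made up for this illustration, the code assumes no digits appear before the date, and it uses -E, the spelling of -r that both GNU and BSD sed accept.

```shell
# Greedy .* pushes the capture to the LAST _digits_ group.
printf '%s\n' "JiggySauce_20161208_21325005_Meat.txt" |
    sed -E 's/.*_([0-9]*)_.*/\1/'

# Hypothetical helper: skip leading non-digits, keep the first 8 digits.
first_date() {
    printf '%s\n' "$1" | sed -E 's/[^0-9]*([0-9]{8}).*/\1/'
}

first_date "JiggySauce_20161208_21325005_Meat.txt"
first_date "JiggySauce_Mustard_Mayo_20161208_21325005_Meat.txt"
```

The first command prints the unwanted 21325005; both first_date calls print 20161208, regardless of how many underscore-separated words precede the date.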

Cygwin Sed match not working when "/" in string

testLine="This is a test line: Asia/Pacific Australia"
expr="Asia\/Pacific Australia"
This works:
echo "$testLine" | sed 's/Asia\/Pacific Australia/TEST/g'
This DOES NOT:
echo "$testLine" | sed 's/$expr/TEST/g'
I've tried everything: multiple escapes, different quote marks, and the -r and -re sed switches. Nothing seems to work.
If anyone has a working solution, please post it along with the Cygwin output. Many thanks!
First, change your variable to
expr="Asia/Pacific Australia"
Then this should work:
echo "$testLine" | sed 's_'"$expr"'_TEST_g'
Note that sed accepts other characters as the delimiter, here _.
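The failing command in the question never sees the variable's value: inside single quotes the shell passes the text $expr to sed unchanged. The answer's form works because "$expr" sits outside the single-quoted pieces. A minimal sketch of the same idea with a single double-quoted script follows; it assumes the value contains neither the delimiter | nor regex metacharacters that are meant literally.

```shell
testLine="This is a test line: Asia/Pacific Australia"
expr="Asia/Pacific Australia"

# Single quotes: sed receives the literal text $expr, so nothing matches
# and the line is printed unchanged.
echo "$testLine" | sed 's/$expr/TEST/g'

# Double quotes let the shell expand $expr; with '|' as the delimiter the
# slash in the value needs no escaping.
echo "$testLine" | sed "s|$expr|TEST|g"
```

The second command prints "This is a test line: TEST".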

capture special character from line

I would like to capture the special character from a line:
var=`echo "#this is comment" | grep "[^a-zA-Z0-9 \t]"`
echo $var
Expected Output: #
But getting: #this is comment
Can someone help me out?
It seems like you want something more like:
var=`echo "#this is comment" | sed 's/[a-zA-Z0-9 \t]//g'`
sed replaces (here, deletes) the matching characters; grep only selects lines that contain a match and prints them whole.
Edit: Note that the \t construct is not guaranteed to be portable to all systems or locales; I believe if your sed supports POSIX regular expressions, using [:space:] may work better. (thanks @ghoti!)
string="#this is comment"
var=$(echo "$string" | sed 's/[a-zA-Z0-9 ]//g')
echo "$var"
I've removed \t as it's not portable.
If you want to do this with awk, as your tag suggests, you can use something like:
var=$(echo "$string" | awk '{gsub(/[a-zA-Z0-9 ]/, "")} 1')
Note that these are probably not good ways to achieve whatever it is that you're trying to do. If you post more of your code, showing us some context, we can help you avoid an XY problem.
Of course, you can also do substitutions like this directly in bash, if you want.
var=${string//[A-Za-z0-9 ]}
You'll save CPU and time by avoiding the call to an extra program when you don't really need it.
sed can be used for this, but tr is a better choice:
echo "#this is comment" | tr -d 'a-zA-Z0-9 \t'
tr also supports character classes such as [:space:] and [:alpha:]
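As a sketch of the character-class form mentioned above, the following deletes every alphanumeric and whitespace character and keeps the rest. The variable names follow the earlier answer; the second variant, which keeps only punctuation, assumes that "special character" means a character in the [:punct:] class.

```shell
string="#this is comment"

# Delete letters, digits and whitespace. [:space:] also removes the
# trailing newline, which command substitution would strip anyway.
var=$(printf '%s\n' "$string" | tr -d '[:alnum:][:space:]')
printf '%s\n' "$var"

# Alternative: -c complements the set, so everything that is NOT
# punctuation is deleted.
punct=$(printf '%s\n' "$string" | tr -cd '[:punct:]')
printf '%s\n' "$punct"
```

Both variants print # for this input.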

Using sed and regex to capture last part of url

I'm trying to make sed match the last part of a url and output just that. For example:
echo "http://randomurl/suburl/file.mp3" | sed (expression)
should give the output:
file.mp3
So far I've tried sed 's|\([^/]+mp3\)$|\1|g' but it just outputs the whole url. Maybe there's something I'm not seeing here but anyways, help would be much appreciated!
this works:
echo "http://randomurl/suburl/file.mp3" | sed 's#.*/##'
basename is your good friend.
> basename "http://randomurl/suburl/file.mp3"
=> file.mp3
This should do the job:
$ echo "http://randomurl/suburl/file.mp3" | sed -r 's|.*/(.*)$|\1|'
file.mp3
where:
| has been used instead of / to separate the arguments of the s command.
Everything is matched and replaced with whatever is found after the last /.
Edit: You could also use bash parameter substitution capabilities:
$ url="http://randomurl/suburl/file.mp3"
$ echo "${url##*/}"
file.mp3
echo 'http://randomurl/suburl/file.mp3' | grep -oP '[^/\n]+$'
Here's another solution using grep.
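The attempt in the question prints the whole URL for two reasons. Without -r, GNU sed reads + in [^/]+mp3 as a literal plus sign, so the pattern never matches. Even with extended syntax, the expression would replace file.mp3 with itself and leave the rest of the line in place. A short sketch of two of the approaches above, run on the question's URL:

```shell
url="http://randomurl/suburl/file.mp3"

# Greedy .*/ consumes everything through the last slash; deleting that
# match leaves only the final path component.
printf '%s\n' "$url" | sed 's#.*/##'

# Same result without starting another process: ## strips the longest
# prefix matching the glob */ (POSIX parameter expansion).
printf '%s\n' "${url##*/}"
```

Both lines print file.mp3. The parameter expansion is usually the cheapest choice inside a loop, since it runs no external command.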

bash script regex matching

In my bash script, I have an array of filenames like
files=( "site_hello.xml" "site_test.xml" "site_live.xml" )
I need to extract the characters between the underscore and the .xml extension so that I can loop through them for use in a function.
If this were python, I might use something like
re.match(r"site_(.*)\.xml", filename)
and then extract the first matched group.
Unfortunately this project needs to be in bash, so -- How can I do this kind of thing in a bash script? I'm not very good with grep or sed or awk.
Something like the following should work
files2=( "${files[@]#site_}" ) # Strip the leading site_ from each element
files3=( "${files2[@]%.xml}" ) # Strip the trailing .xml
EDIT: After correcting those two typos, it does seem to work :)
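For something closer to the Python re.match idea, bash has a built-in regex operator, =~, which stores capture groups in the BASH_REMATCH array. The following is a sketch only: it requires bash (not plain sh), and the names pattern and names are made up for this illustration.

```shell
#!/usr/bin/env bash
files=( "site_hello.xml" "site_test.xml" "site_live.xml" )

# Keeping the regex in a variable avoids quoting problems on the
# right-hand side of =~.
pattern='^site_(.*)\.xml$'

names=()
for f in "${files[@]}"; do
    if [[ $f =~ $pattern ]]; then
        # BASH_REMATCH[1] holds the first parenthesised group.
        names+=( "${BASH_REMATCH[1]}" )
    fi
done

printf '%s\n' "${names[@]}"
```

This prints hello, test and live on separate lines. File names that do not match the pattern are skipped rather than passed through unchanged.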
xbraer@NO01601 ~
$ VAR=`echo "site_hello.xml" | sed -e 's/.*_\(.*\)\.xml/\1/g'`
xbraer@NO01601 ~
$ echo $VAR
hello
Does this answer your question?
Just run the variables through sed in backticks (``)
I don't remember the array syntax in bash, but I guess you know that well enough yourself, if you're programming bash ;)
If it's unclear, don't hesitate to ask again. :)
I'd use cut to split the string.
for i in site_hello.xml site_test.xml site_live.xml; do echo $i | cut -d'.' -f1 | cut -d'_' -f2; done
This can also be done in awk:
for i in site_hello.xml site_test.xml site_live.xml; do echo $i | awk -F'.' '{print $1}' | awk -F'_' '{print $2}'; done
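The two-stage pipelines above can be collapsed into one awk call, because -F accepts a regular expression as the field separator. A sketch, assuming each name contains exactly one underscore before the part of interest:

```shell
for i in site_hello.xml site_test.xml site_live.xml; do
    # Split on either '_' or '.': fields are "site", the name, "xml".
    printf '%s\n' "$i" | awk -F'[_.]' '{print $2}'
done
```

This prints hello, test and live. A name such as site_my_page.xml would yield only my, so the sed or parameter-expansion approaches are safer when extra underscores are possible.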
If you're using arrays, you probably should not be using bash.
A more appropriate example would be
ls site_*.xml | sed 's/^site_//' | sed 's/\.xml$//'
This produces output consisting of the parts you wanted. Backtick or redirect as needed.