Regex matches only the last occurrence instead of all occurrences

I have this pattern
.*;\/wp-content\/uploads\/(.*)\.
and this in string format
<div class="js-aem-gallery" data-images="{"omniture":{"type":"","hasImpression":false},"slides":[{"title":"","omniture":null,"description":"","path":"/wp-content/uploads/auto-shows/frankfurt/2017/2019-porsche-cayenne/02-2019-porsche-cayenne-turbo-NP.jpg","thumbnail":"/wp-content/uploads/auto-shows/frankfurt/2017/2019-porsche-cayenne/02-2019-porsche-cayenne-turbo-NP.jpg?w=320&h=240&crop=1"},{"title":"","omniture":null,"description":"","path":"/wp-content/uploads/make/porsche/cayenne/2019/oem/21-2019-porsche-cayenne.jpg","thumbnail":"/wp-content/uploads/make/porsche/cayenne/2019/oem/21-2019-porsche-cayenne.jpg?w=320&h=240&crop=1"},{"title":"","omniture":null,"description":"","path":"/wp-content/uploads/make/porsche/cayenne/2019/oem/23-2019-porsche-cayenne.jpg","thumbnail":"/wp-content/uploads/make/porsche/cayenne/2019/oem/23-2019-porsche-cayenne.jpg?w=320&h=240&crop=1"},{"title":"","omniture":null,"description":"","path":"/wp-content/uploads/make/porsche/cayenne/2019/oem/30-2019-porsche-cayenne.jpg","thumbnail":"/wp-content/uploads/make/porsche/cayenne/2019/oem/30-2019-porsche-cayenne.jpg?w=320&h=240&crop=1"}]}"></div>
I want to match every occurrence that appears between ;/wp-content/uploads/ and .jpg.
I tried the pattern above, but it returns only the last match.
I am new to regex, so please correct my implementation. Thank you.

You can make use of lookarounds to do clean extractions:
(?<=;\/wp-content\/uploads\/)(.*?)(?=\.jpg)
https://regex101.com/r/vzJ0FU/1
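
For what it's worth, the original pattern returns only the last image because both .* parts are greedy: the first one swallows everything up to the last ;/wp-content/uploads/ and the second one runs to the last dot. Here is a minimal Python sketch of the difference. It assumes the raw markup contains &quot; entities (which is where the ; in the pattern would come from) and uses a shortened, made-up sample string rather than the full gallery markup:

```python
import re

# Hypothetical, shortened sample; assumes quotes appear as &quot; in the raw HTML.
html = (
    '&quot;path&quot;:&quot;/wp-content/uploads/a/01-car.jpg&quot;,'
    '&quot;thumbnail&quot;:&quot;/wp-content/uploads/a/01-car.jpg?w=320&quot;,'
    '&quot;path&quot;:&quot;/wp-content/uploads/b/02-car.jpg&quot;'
)

# Greedy version from the question: a single match, anchored on the LAST occurrence.
greedy = re.findall(r".*;/wp-content/uploads/(.*)\.", html)

# Lookaround version from the answer: the lazy .*? stops at the first ".jpg".
lazy = re.findall(r"(?<=;/wp-content/uploads/)(.*?)(?=\.jpg)", html)

print(greedy)  # ['b/02-car']
print(lazy)    # ['a/01-car', 'a/01-car', 'b/02-car']
```

Note that the thumbnail entries match too, so each image shows up twice unless you deduplicate the result.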

Related

Regex to match different characters at same position in string

Let's say I have the text a123456. I want a string of b123456 to match. So essentially, 'match if all characters are the same except for the first character'. Am I asking for the impossible with regex?

Use the dot (.) to match any character. So, a possible Regex would be:
/^.123456$/
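
A quick way to check this, sketched in Python (the candidate strings are made up for illustration):

```python
import re

# "." stands for exactly one arbitrary character in the first position.
pattern = re.compile(r"^.123456$")

candidates = ["a123456", "b123456", "123456", "ab123456"]
matches = [s for s in candidates if pattern.search(s)]

print(matches)  # ['a123456', 'b123456']
```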

If you want to use a zero-length assertion, you can take a lookbehind approach as follows:
(?<=\w)your_value$ // your_value is the text you want to check

I think you can figure it out on your own. This ain't tough, just needs some understanding between you and Regex. Why don't you go through the following links and try to make a regex on your own.
https://www.talentcookie.com/2015/07/regular-expressions/
https://www.talentcookie.com/2015/07/lets-practice-regular-expression/
https://www.talentcookie.com/2016/01/some-useful-regular-expression-terminologies/

How to exclude specific words using regex?

I have a problem here. I have the following string:
#Novriiiiii yauda busana muslim #nencor haha. wa'alaikumsalam noperi☺
Then I use this regex pattern to select all the words:
\w+
However, I need to select all the words except those prefixed with #, like #Novriiiiii or #nencor. In other words, the #word ones have to be excluded.
How do I do that?
P.S. I am using regexpal to test the regex, and I want to apply the pattern in the Yahoo Pipes regex module. Thank you.

You can use a negative lookbehind so that if a word is preceded by # it is excluded. You also need a word boundary before the word or else the lookbehind will only affect the first character.
(?<!#)\b\w+
http://rubular.com/r/ONEl70Am5Q
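
As a rough check, here is the same pattern in Python, run against the sample text from the question (minus the emoji). Python's re module is an assumption here; the question targets Yahoo Pipes, whose regex dialect may differ:

```python
import re

text = "#Novriiiiii yauda busana muslim #nencor haha. wa'alaikumsalam noperi"

# (?<!#) rejects a word directly preceded by "#";
# \b stops the match from starting in the middle of a tagged word.
words = re.findall(r"(?<!#)\b\w+", text)

print(words)
# ['yauda', 'busana', 'muslim', 'haha', 'wa', 'alaikumsalam', 'noperi']
```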

Does this suit your needs?
http://rubular.com/r/uuXvNrUiGJ
[^#\w+]\w+

This would solve your problem indeed:
[^#\w+][\w.]+
Check this link: http://regexr.com?34tq7

If you cannot use a negative lookbehind as other answers have already suggested, here's a workaround.
\w already doesn't match the # character, so you'd want something like this:
[^#]\w+
But this will (a) not work at the beginning of the string, and (b) include the character before the word in the match. To fix (a), we can do:
(^|[^#])\w+
To fix (b), we parenthesize the part we want:
(^|[^#])(\w+)
Then use $2 or \2 (depending on regex dialect) to refer to the matched word.
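
One caveat that seems worth checking: as far as I can tell, [^#] can itself match the first letter of a tagged word, so the rest of that word still gets captured. The Python sketch below shows this, together with a slightly stricter variant, (^|[^#\w])(\w+), which is my own assumption and not part of the answer above:

```python
import re

text = "#Novriiiiii yauda busana muslim #nencor haha."

# Pattern from the answer: group 2 holds the word.
original = [m.group(2) for m in re.finditer(r"(^|[^#])(\w+)", text)]

# Hypothetical stricter variant: the preceding character must be
# neither "#" nor a word character.
stricter = [m.group(2) for m in re.finditer(r"(^|[^#\w])(\w+)", text)]

print(original)  # ['ovriiiiii', 'yauda', 'busana', 'muslim', 'encor', 'haha']
print(stricter)  # ['yauda', 'busana', 'muslim', 'haha']
```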

Another option is to include the # symbol in the word:
[\w#]+
And then add another step in your Pipe to filter out all words that start with a #.

A way to do that is to remove words that you don't want. Example:
find: #\w+
replace: empty string
You obtain the text without the #abcdef-style words.
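
A small sketch of this remove-then-match idea in Python (the shortened sample string is only for illustration):

```python
import re

text = "#Novriiiiii yauda busana muslim #nencor haha."

# Step 1: replace every #word with an empty string.
stripped = re.sub(r"#\w+", "", text)

# Step 2: the plain \w+ pattern from the question now sees only untagged words.
words = re.findall(r"\w+", stripped)

print(words)  # ['yauda', 'busana', 'muslim', 'haha']
```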

Regex for deleting characters before a certain character?

I'm very new at regex, and to be completely honest it confounds me. I need to grab the string after a certain character is reached in said string. I figured the easiest way to do this would be using regex, however like I said I'm very new to it. Can anyone help me with this or point me in the right direction?
For instance:
I need to check the string "23444:thisstring" and save "thisstring" to a new string.

If this is your string:
I'm very new at regex, and to be completely honest it confounds me
and you want to grab everything after the first "c", then this regular expression will work:
/c(.*)/s
It will return this match in the first matched group:
"ompletely honest it confounds me"
Explanation:
The c is the character you are looking for
.* (in combination with /s) matches everything left
(.*) captures what .* matched, making it available in $1 and returned in list context.
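
The same idea as a minimal Python sketch, where re.DOTALL plays the role of the /s modifier. The second call applies it to the "23444:thisstring" example from the question:

```python
import re

sentence = "I'm very new at regex, and to be completely honest it confounds me"

# Everything after the first "c" ends up in group 1.
m = re.search(r"c(.*)", sentence, flags=re.DOTALL)
after_c = m.group(1) if m else None

# Same technique with ":" as the delimiter.
m2 = re.search(r":(.*)", "23444:thisstring", flags=re.DOTALL)
after_colon = m2.group(1) if m2 else None

print(after_c)      # ompletely honest it confounds me
print(after_colon)  # thisstring
```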

Regex for deleting characters before a certain character!
You can use lookahead like this
.*(?=x)
where x is a particular character, word, or string. Characters such as ., $, ^, *, and + have special meaning in regex, so remember to escape them when they appear in x.
EDIT
for your sample string it would be
.*(?=thisstring)
.* matches zero or more characters up to thisstring
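
A hedged Python sketch of the lookahead approach. The helper name strip_before is made up for this example, and re.escape takes care of the special characters mentioned above. Because .* is greedy, everything up to the last occurrence of the marker is removed:

```python
import re

def strip_before(text, marker):
    """Delete everything before the last occurrence of marker (hypothetical helper)."""
    # re.escape neutralises characters such as . $ ^ * + inside the marker.
    pattern = r".*(?=" + re.escape(marker) + r")"
    return re.sub(pattern, "", text, count=1)

print(strip_before("23444:thisstring", "thisstring"))  # thisstring
print(strip_before("price: $5 and $7", "$"))           # $7
print(strip_before("abc", "x"))                        # abc (marker not found)
```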

Here is a one-line solution for matching everything after "before"
print $1."\n" if "beforeafter" =~ m/before(.*)/;
Edit:
While using lookbehind is possible, it's not required. Grouping provides an easier solution.

To get the string after : in your example, you can use [^:][^:]*:\(.*\) (this looks like basic regex syntax, as used by sed, where \( and \) mark the capture group). Notice that it requires at least one [^:], followed by any number of [^:]s, followed by an actual :, the character you are searching for.

Remove repeated characters between words

I am trying out the quiz from Regex 101
In Task 6, the question is
Oh no! It seems my friends spilled beer all over my keyboard last night and my keys are super sticky now. Some of the time when I press a key, I get two duplicates. Can you pppllleaaaseee help me fix this? Content in bold should be removed.
I have tried this regex
([a-z])(\1{2})
But couldn't get the solution.

The solution for the riddle on that website is:
/(.)\1{2}/g
Since any key on the keyboard can get stuck, we need to use . (the dot) rather than [a-z].
\1 in the regex means match whatever the 1st capturing group (.) matches.
Replacement is $1 or \1.
The rest of your regex is correct, just that there are unnecessary capturing groups.

Your regex is correct if you want to match exactly three characters. If you want to match at least three, that is
([a-z])(\1{2,})
or
([a-z])(\1\1+)
Since you don't need to capture anything but the first occurrence, these are slightly better:
([a-z])\1{2} # your original regex (exactly three occurrences)
([a-z])\1{2,}
([a-z])\1\1+
Now, the replacement should be exactly one occurrence of the character, and nothing more:
\1

Replace:
(.)\1+
with:
\1
This of course requires that your regex engine supports backreferences. Also, depending on the regex engine, \1 in the replacement part may have to be written as $1.
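
To see how the "exactly three" and "one or more repeats" variants differ, here is a small Python sketch on a made-up sentence. The greedy (.)\1+ form also collapses legitimate double letters such as the "ll" in "all":

```python
import re

text = "all my keys are pppllleaaaseee sticky"

# Exactly three identical characters collapse into one.
exact = re.sub(r"(.)\1{2}", r"\1", text)

# Any run of two or more identical characters collapses into one.
greedy = re.sub(r"(.)\1+", r"\1", text)

print(exact)   # all my keys are please sticky
print(greedy)  # al my keys are please sticky
```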

I'd do it with (\w)(\1+)? but can't find out how to "remove" within the given site...
The best way would be to replace the results of the second group with empty strings.

Antimatch with Regex

I am looking for a regex pattern that should match everything except a particular group.
The following regex pattern basically works:
index\.php\?page=(?:.*)&tagID=([0-9]+)$
But the .* should not match TaggedObjects.
Thanks for any advice.

(?:.*) is unnecessary - you're not grouping anything, so .* means exactly the same. But that's not the answer to your question.
To match any string that does not contain another predefined string (say TaggedObjects), use
(?:(?!TaggedObjects).)*
In your example,
index\.php\?page=(?:(?!TaggedObjects).)*&tagID=([0-9]+)$
will match
index.php?page=blahblah&tagID=1234
and will not match
index.php?page=blahTaggedObjectsblah&tagID=1234
If you do want to allow that match and only exclude the exact string TaggedObjects, then use
index\.php\?page=(?!TaggedObjects&tagID=([0-9]+)$).*&tagID=([0-9]+)$
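
Both patterns can be tried out with a short Python sketch like the one below (the helper tag_id is made up for the example). In the second pattern the first pair of parentheses sits inside a negative lookahead, so on a successful match the tag ID should be read from group 2, not group 1:

```python
import re

# Pattern 1: the page value may not contain "TaggedObjects" anywhere.
no_substring = re.compile(
    r"index\.php\?page=(?:(?!TaggedObjects).)*&tagID=([0-9]+)$"
)

# Pattern 2: only the exact page value "TaggedObjects" is rejected.
not_exact = re.compile(
    r"index\.php\?page=(?!TaggedObjects&tagID=([0-9]+)$).*&tagID=([0-9]+)$"
)

def tag_id(pattern, url, group):
    """Return the captured tag ID, or None if the URL does not match (hypothetical helper)."""
    m = pattern.search(url)
    return m.group(group) if m else None

print(tag_id(no_substring, "index.php?page=blahblah&tagID=1234", 1))               # 1234
print(tag_id(no_substring, "index.php?page=blahTaggedObjectsblah&tagID=1234", 1))  # None
print(tag_id(not_exact, "index.php?page=blahTaggedObjectsblah&tagID=1234", 2))     # 1234
print(tag_id(not_exact, "index.php?page=TaggedObjects&tagID=1234", 2))             # None
```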

Try this. I think you mean you want the match to fail if the string contains an occurrence of 'TaggedObjects':
index\.php\?page=(?!.*TaggedObjects).*&tagID=([0-9]+)$
