I'm trying to write a python regular expression that will match both these URLs:
http://www.waymarking.com/waymarks/WM6N3G_Battle_Mountain_State_Park
http://www.waymarking.com/waymarks/WM6N3G
and for both will capture:
http://www.waymarking.com/waymarks/WM6N3G
This is what I have:
(http://www.waymarking.com/waymarks/.*?)_?.*?
But it only matches:
http://www.waymarking.com/waymarks/
Thanks!
(http://www.waymarking.com/waymarks/[^_]*).*
How about
(http://www.waymarking.com/waymarks/[^_]+)
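For what it's worth, here is a minimal sketch of the [^_]+ idea run against both URLs from the question. The helper name waymark_base is made up for illustration, and the dots are escaped here, which the patterns above do not do.

```python
import re

# Capture everything up to (but not including) the first underscore.
# Dots are escaped so they only match a literal "." (an addition in this sketch).
WAYMARK_RE = re.compile(r"(http://www\.waymarking\.com/waymarks/[^_]+)")

def waymark_base(url):
    """Return the URL up to the waymark code, or None if it does not match."""
    m = WAYMARK_RE.match(url)
    return m.group(1) if m else None

urls = [
    "http://www.waymarking.com/waymarks/WM6N3G_Battle_Mountain_State_Park",
    "http://www.waymarking.com/waymarks/WM6N3G",
]
for u in urls:
    print(waymark_base(u))
```

Both lines print http://www.waymarking.com/waymarks/WM6N3G.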
non regex way
url="http://www.waymarking.com/waymarks/WM6N3G_Battle_Mountain_State_Park"
s = url.split("_")
print(s[0])
*? is lazy: it matches as few characters as possible, so whatever it quantifies is left out of the match unless the rest of the pattern needs it.
(http://www.waymarking.com/waymarks/[^_]+)(_.*)?
What about this:
(http://www.waymarking.com/waymarks/[a-zA-Z0-9]*)_?.*?
.*(http://www.waymarking.com/waymarks/WM6N3G).* if it is inline
.*? is non-greedy, so it matches as little as possible. Nothing after it in your pattern is required, so in this case it matches zero characters.
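To see the lazy behaviour for yourself, here is a small sketch. The first pattern is the one from the question, unchanged. The second is only one possible fix, not something posted above: it adds a $ anchor so the lazy group has to grow until the rest of the pattern can match.

```python
import re

url = "http://www.waymarking.com/waymarks/WM6N3G_Battle_Mountain_State_Park"

# The pattern from the question: every part after "waymarks/" may match nothing.
lazy = re.match(r"(http://www.waymarking.com/waymarks/.*?)_?.*?", url)

# Hypothetical variant: "$" forces the lazy group to expand up to the underscore.
anchored = re.match(r"(http://www.waymarking.com/waymarks/.*?)(?:_.*)?$", url)

print(lazy.group(1))      # stops right after "waymarks/"
print(anchored.group(1))  # includes the waymark code
```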
I have a string like this:
abcabcdeabc...STRING INSIDE...xyz
I want to find "...STRING INSIDE..." so I'm using the regex below to match it:
(?<=abc).*(?=xyz)
The issue is there are duplicated "abc" in the string so it returns "abcdeabc...STRING INSIDE..." but I only want to match anything between the last "abc" and "xyz". Is this possible? And if yes, how can I achieve this? Thank you.
Try it here:
https://regex101.com/r/gS9Xso/3
Try this pattern:
.*(?<=abc)(.*)(?=xyz)
The leading .* is greedy, so it consumes everything up to the last abc; the text between that abc and xyz is then captured in group 1.
We can also try using the following pattern:
.*abc(.*?)xyz
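A quick way to check this in Python, using the sample string from the question. This is just a sketch comparing the original lookbehind pattern with the greedy-prefix pattern.

```python
import re

text = "abcabcdeabc...STRING INSIDE...xyz"

# Original attempt: the search succeeds right after the FIRST "abc".
first = re.search(r"(?<=abc).*(?=xyz)", text).group(0)

# Greedy ".*" prefix: backtracks from the end, so it settles on the LAST "abc".
last = re.search(r".*abc(.*?)xyz", text).group(1)

print(first)
print(last)
```

The first print shows abcdeabc...STRING INSIDE... and the second shows ...STRING INSIDE...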
This should work well.
[^\d]*abc(\d+)xyz[^\d]*
I have lines like this:
example.com/p/stuff/...
example.com/page/thing/...
example.com/page/stuff/...
example.com/page/other-stuff/...
etc
where the dots represent continuing URL paths. I want to select URLs that contain /page/ and are NOT followed by thing/. So from the above list we would select:
example.com/page/stuff/...
example.com/page/other-stuff/...
.*?\/page\/[^(thing)].*
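This matches a line containing /page/ whose next character is not one of (, t, h, i, n, g or ). Note that [^(thing)] is a character class, not a negated word, so it also rejects paths such as /page/test/.
The lazy .*? is suggested so that the engine advances one character at a time until it finds /page/, which may perform better.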
You need to use negative lookahead:
example\.com\/page\/(?!thing\/).*
Use the following regex pattern:
.*?\/page\/(?!thing\/).*
https://regex101.com/r/19wh1w/2
(?!thing\/) - negative lookahead assertion ensures that page/ section is not followed by thing/
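Here is a rough Python sketch comparing the negative lookahead with the [^(thing)] character class suggested earlier. The last sample line (/page/test/) is not from the question; it is added only to show where the two differ.

```python
import re

lines = [
    "example.com/p/stuff/...",
    "example.com/page/thing/...",
    "example.com/page/stuff/...",
    "example.com/page/other-stuff/...",
    "example.com/page/test/...",  # hypothetical extra line, not in the question
]

lookahead = re.compile(r".*?/page/(?!thing/).*")
char_class = re.compile(r".*?/page/[^(thing)].*")

kept_lookahead = [s for s in lines if lookahead.fullmatch(s)]
kept_class = [s for s in lines if char_class.fullmatch(s)]

print(kept_lookahead)  # stuff, other-stuff and test
print(kept_class)      # stuff and other-stuff only: "test" starts with "t"
```

The character class only looks at a single character, so it drops /page/test/ as well, which is probably not what you want.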
I've been trying to get the correct filter for:
{0}{1/2}{R/G}{X}{Y}{Z}{R}{R}
I've tried this on rubular.com (http://rubular.com/r/niCiKoUfmN):
\{([0-Z])\}
I get:
{0}{X}{Y}{Z}{R}{R}
But I do not get:
{1/2}{R/G}
How can I write the regular expression so it gets all of it?
\{(\w)(?:\/(\w))?\}
A more radical approach is to use a negated character class that excludes the character you want to stop at:
\{([^}]*)\}
[^}] means all characters except }
* means zero or more times
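As a sketch, this is what the negated character class gives in Python for the string from the question (the question used Rubular, so the language here is a swap):

```python
import re

text = "{0}{1/2}{R/G}{X}{Y}{Z}{R}{R}"

# Capture whatever sits between each pair of braces.
tokens = re.findall(r"\{([^}]*)\}", text)
print(tokens)  # ['0', '1/2', 'R/G', 'X', 'Y', 'Z', 'R', 'R']
```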
You don't have the slash (/) in your character class. You also have to add a quantifier to tell the parser that more than one character is allowed between the braces:
\{([0-Z/]+)\}
You can do so by adding an optional /[0-Z]
Which will give you:
\{([0-Z](\/[0-Z])?)\}
Rubular: http://rubular.com/r/3D0VPCaJX7
This should do it:
\{[0-Z\/]+\}
You don't need the parentheses unless you're wanting to use a subset of the match for something else.
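One thing worth knowing about [0-Z]: in ASCII the range from 0 to Z also covers : ; < = > ? and @, while / sits just below 0, which is why it was not matched. The sketch below compares it with a stricter class. The strict pattern and the {?} sample are my own assumptions about what is wanted, not something from the question.

```python
import re

sample = "{0}{1/2}{R/G}{?}"  # "{?}" is a made-up token to show the difference

loose = re.compile(r"\{[0-Z/]+\}")     # 0-Z spans digits, some punctuation, A-Z
strict = re.compile(r"\{[0-9A-Z/]+\}")  # digits, uppercase letters and "/" only

print(loose.findall(sample))   # ['{0}', '{1/2}', '{R/G}', '{?}']
print(strict.findall(sample))  # ['{0}', '{1/2}', '{R/G}']
```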
You need to allow zero or more occurrences of the / part.
\{([0-Z][\/0-Z]*)\}
I want to negate the string
*.INFO
How can I do this?
I have tried
^(?!.*\*\.INFO).*$
but it is not working.
Based on your recent comment, this matches anything starting with *. except *.INFO:
\*\.(?!INFO\b)\S+
Note that because of the \b after INFO, this still matches strings that start with *.INFO but continue with more word characters, e.g. *.INFOXYZ
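A small Python check of that behaviour, as a sketch. The sample *.DEBUG is invented for illustration.

```python
import re

pattern = re.compile(r"\*\.(?!INFO\b)\S+")

samples = ["*.INFO", "*.INFOXYZ", "*.DEBUG"]  # "*.DEBUG" is a made-up sample
matched = [s for s in samples if pattern.fullmatch(s)]

print(matched)  # ['*.INFOXYZ', '*.DEBUG']
```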
You are nearly correct
^(?![*][.]INFO).*$
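To show how the two lookaheads differ, here is a rough sketch. The one from the question rejects any line that contains *.INFO anywhere, while the one above only rejects lines that start with it. All sample strings except *.INFO are made up.

```python
import re

starts_with = re.compile(r"^(?![*][.]INFO).*$")  # pattern from this answer
contains = re.compile(r"^(?!.*\*\.INFO).*$")     # pattern from the question

samples = ["*.INFO", "*.INFOXYZ", "*.DEBUG", "kern.*.INFO"]

ok_starts_with = [s for s in samples if starts_with.match(s)]
ok_contains = [s for s in samples if contains.match(s)]

print(ok_starts_with)  # ['*.DEBUG', 'kern.*.INFO']
print(ok_contains)     # ['*.DEBUG']
```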
I am trying to find everything in a paragraph that is not abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 and not space / /gi
/[^a-zA-Z0-9]|[^ ]/gi
the above doesn't work!
You could also try with:
/[^a-zA-Z0-9\s]/gi
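The reason the original doesn't work is that every character satisfies at least one side of the |: a space is "not alphanumeric" and a letter is "not a space". A quick Python sketch with a made-up sentence:

```python
import re

text = "Hello, world! 42"  # made-up sample text

# Alternation: each character matches one branch or the other.
either = re.findall(r"[^a-zA-Z0-9]|[^ ]", text)

# Single negated class: excludes letters, digits AND whitespace together.
both = re.findall(r"[^a-zA-Z0-9\s]", text)

print(len(either) == len(text))  # True, every character matched
print(both)                      # [',', '!']
```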
If you only want to exclude spaces use:
[^ ]*
Try ^[\d\w]*$ or simply ^[\w]*$
As a regular expression this means: from ^ (start) to $ (end), match only 0-9, a-z, A-Z and the underscore (\w includes _).
For C++, as an AnsiString: "^[\\d\\w]*$";
You can use this to remove everything that is not a letter, a digit or whitespace:
replaceAll("[^A-Za-z0-9\\s]", "")
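The replaceAll call above looks like Java's String.replaceAll (an assumption on my part). A rough Python equivalent would be re.sub; the function name strip_non_alnum is made up for this sketch.

```python
import re

def strip_non_alnum(text):
    """Remove every character that is not a letter, a digit or whitespace."""
    return re.sub(r"[^A-Za-z0-9\s]", "", text)

print(strip_non_alnum("Hello, world! 42"))  # Hello world 42
```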