What is the regex for a URL like this?

I don't really know regex, but I would like a quick way to search and replace links. I want to use the Search Regex WordPress plugin to remove links from my posts. How do I write a regex for a link like this:
http://website.com/index.php?id=934&title=item name
Edit: the number in the id and the item name vary.
Thank you in advance!

Try this one out: http://regexr.com?2vjq6
Depending on whether or not you need whitespace in your "title" parameter, the regex I provided may need to be altered. Best practice would be to not have whitespace in your URLs (use URL encoding instead, where a space = %20).
http://website\.com/index\.php\?id=[0-9]*&title=[a-zA-Z0-9\-]*
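As a rough sketch of how such a pattern behaves, here is a Python version that removes matching URLs from a piece of text. The literal dots are escaped, the id is assumed to be one or more digits, and the title is assumed to contain only letters, digits, hyphens, and URL-encoded spaces (%20). The names LINK and strip_links are made up for this illustration, and the Search Regex plugin uses PHP regex syntax, so delimiters may differ there.

```python
import re

# Assumption: the id is one or more digits and the title uses only
# letters, digits, hyphens, and URL-encoded spaces (%20).
LINK = re.compile(
    r"http://website\.com/index\.php\?id=[0-9]+&title=(?:[a-zA-Z0-9\-]|%20)*"
)


def strip_links(text: str) -> str:
    """Remove every URL matching LINK from the text."""
    return LINK.sub("", text)


sample = "Before http://website.com/index.php?id=934&title=item%20name after"
print(strip_links(sample))
```

If the title contains a raw space, the match stops at the space, which is one more reason to prefer %20 in the URL.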

Try this pattern
(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,#?^=%&:/~\+#]*[\w\-\#?^=%&/~\+#])?

Related

How to cut url down correctly by regex?

I have a question about regex and would appreciate some help. I have tons of URLs, and I need to find all the unique ones that contain the word promo.
For instance, I have a bunch of URLs like these:
/promo/vygoda-do-20-na-samsung?from=hb
/promo/antikrizisnaya-rasprodazha-skidki-do-50-mark164615151?from=hb
/promo/antikrizisnaya-rasprodazha-skidki-do-50-mark164615151
but I need to get this:
/promo/vygoda-do-20-na-samsung
/promo/antikrizisnaya-rasprodazha-skidki-do-50
/promo/antikrizisnaya-rasprodazha-skidki-do-50
All I could come up with is
https://regex101.com/r/Ot8xzV/1
I have only just started learning regex and don't know it well, so please help me with this. I'll be very grateful.
Use
(.*/promo/[^?]+?)(?:-mark\d+|\?).*
Replace with $1 if your tool supports replacement. Otherwise, the first capturing group already contains the part you need.
See proof.
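A minimal Python sketch of that replacement, using the three sample URLs from the question. Python writes the group reference as \1 instead of $1; the names PROMO and clean_promo_url are made up for this illustration.

```python
import re

# The pattern from the answer: keep everything up to "-mark<digits>" or "?".
PROMO = re.compile(r"(.*/promo/[^?]+?)(?:-mark\d+|\?).*")


def clean_promo_url(url: str) -> str:
    """Cut off the -mark suffix and the query string, if present."""
    return PROMO.sub(r"\1", url)


urls = [
    "/promo/vygoda-do-20-na-samsung?from=hb",
    "/promo/antikrizisnaya-rasprodazha-skidki-do-50-mark164615151?from=hb",
    "/promo/antikrizisnaya-rasprodazha-skidki-do-50-mark164615151",
]

# A set removes duplicates; sorting just makes the output stable.
unique = sorted({clean_promo_url(u) for u in urls})
print(unique)
```

URLs that contain neither -mark nor a query string do not match and are returned unchanged.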

How to write a regex to validate a specific URL format?

My URL will look something like this :
"/eshop/products/SMART+TV+SAMSUNG+UE55H6500SLXXH+3D/productDetail/ger_20295028/"
Where product names can keep changing here
SMART+TV+SAMSUNG+UE55H6500SLXXH+3D and product id here ger_20295028. I tried writing a regex which is wrong.
How can I correct it for the above URL?
Regex:
.*/products/[^/]*?/productDetail/[^/]*?/([^/].*?)/[^/]*?/([^/]*)(/.*?)*$
The ? after each * only makes that quantifier lazy, which you don't need here, and your pattern also expects many more path segments at the end than the example you've given. Try something like this:
.*/products/[^/]*/productDetail/[^/]*/
You should read up on quantifiers (on its own, ? means zero or one time, and after * it makes the match lazy; you need neither here). This regex might work for you:
/^.*\/products\/[^\/]+\/productDetail\/[^\/]+\/$/
Try it online here.
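For a quick check outside the browser, the same idea can be sketched in Python. The slashes need no escaping there, and fullmatch takes the place of the ^ and $ anchors. The names PRODUCT_URL and is_product_url are made up for this illustration.

```python
import re

# One non-empty segment for the product name, one for the product id,
# and a trailing slash at the end.
PRODUCT_URL = re.compile(r".*/products/[^/]+/productDetail/[^/]+/")


def is_product_url(path: str) -> bool:
    """Return True if the whole path has the expected product-detail shape."""
    return PRODUCT_URL.fullmatch(path) is not None


good = "/eshop/products/SMART+TV+SAMSUNG+UE55H6500SLXXH+3D/productDetail/ger_20295028/"
print(is_product_url(good))
print(is_product_url("/eshop/products/X/productDetail/"))
```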

URL pattern to exclude globally in Zap

I am having trouble with regex syntax in OWASP ZAP. I want to exclude from all scans all URLs that contain "web/lib". I've tried to add
^*web/lib*$
under Global Exclude URL option, but it didn't work. Please help - thanks a lot.
The field takes a regular expression, not a glob pattern, so a wildcard is generally written as period asterisk (.*). You also probably need to escape the slash.
Eg: https://regex101.com/r/XLPF85/1
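To see what "period asterisk" changes, here is a small Python sketch of the corrected pattern. ZAP itself uses Java regular expressions, so this only illustrates the pattern semantics; in Python the slash needs no escaping. The name is_excluded and the sample URLs are made up for this illustration.

```python
import re

# ".*" means "any characters", which is what the "*" in the original
# attempt was meant to express.
EXCLUDE = re.compile(r"^.*web/lib.*$")


def is_excluded(url: str) -> bool:
    """Return True if the URL contains web/lib anywhere."""
    return EXCLUDE.match(url) is not None


print(is_excluded("http://example.com/web/lib/jquery.js"))
print(is_excluded("http://example.com/web/app.js"))
```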

Perl/lighttpd regex

I'm using regex in lighttpd to rewrite URLs, but I can't write an expression that does what I want (which I thought was pretty basic, apparently not, I'm probably missing something).
Say I have this URL: /page/variable_to_pass/ OR /page/variable_to_pass/
I want to rewrite the URL to this: /page.php?var=variable_to_pass
I've already got rules like ^/login/(.*?)$ to handle specific pages, but I wanted to make one that can match any page without needing one expression per page.
I tried this: ^/([^.?]*) but it matches the whole /page/variable_to_pass/ instead of just page.
Any help is appreciated, thanks!
This regexp should do what you need
/([^\/]+)/(.+)
The first group captures the page name, and the second captures the variable value.
Try:
/([^/.?]+)/([^/.?]+)/
That should give you two capture groups.
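The two-group idea can be tried out in Python before putting it into the server configuration. This is only a sketch of the matching logic, not lighttpd configuration syntax; it assumes page names and values contain no slash, dot, or question mark, and the trailing slash is treated as optional. The names REWRITE and rewrite are made up for this illustration.

```python
import re

# Group 1 is the page name, group 2 the value; neither may contain
# "/", "." or "?". The trailing slash is optional.
REWRITE = re.compile(r"^/([^/.?]+)/([^/.?]+)/?$")


def rewrite(path: str) -> str:
    """Map /page/value/ to /page.php?var=value; leave other paths alone."""
    return REWRITE.sub(r"/\1.php?var=\2", path)


print(rewrite("/page/variable_to_pass/"))
print(rewrite("/page/variable_to_pass"))
```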

Regex to find bad URLs in a database field

We had an issue with the text editor on our website that was doubling up the URL. So, for example, the text field may contain:
This is a description for a media item, and here is a link.
So pretty much I need a regex to detect any string that begins with http and has another http before a closing quote, as in "http://www.example.com/apage.htmlhttp://www.example.com/apage.html"
"http[^"]+http
http://www.example.com/apage.htmlhttp://www.example.com/apage.html
This is actually a valid URL! So you'd want to be a bit careful not to munge any other URLs that happen to have ‘http://’ in the middle of them. To detect only a ‘doubled’ URL you could use backreferences:
"(https?://[^"]*)\1"
(This is a non-standard regex feature, but most modern implementations have it.)
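A short Python sketch of the backreference approach, which also collapses the doubled URL back to a single copy. The sample HTML and the names DOUBLED and fix_doubled are made up for this illustration.

```python
import re

# \1 must repeat exactly what group 1 captured, so only a URL that is
# followed by an identical copy of itself inside the quotes matches.
DOUBLED = re.compile(r'"(https?://[^"]*)\1"')


def fix_doubled(html: str) -> str:
    """Replace a doubled quoted URL with a single copy."""
    return DOUBLED.sub(r'"\1"', html)


bad = '<a href="http://www.example.com/apage.htmlhttp://www.example.com/apage.html">a link</a>'
print(fix_doubled(bad))
```

A quoted URL that merely contains another http:// in the middle, without being an exact repetition, is left untouched.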
Using regex to process HTML is a bad idea. HTML cannot reliably be parsed by regex.
If you can use the .*? syntax, you can just look for the following:
http(.*?)http
and if it's present, reject the URL.
The string that begins with http and has another http before a quote is:
^http[^"]*http
But, although this answers exactly your question I suspect you may want Uh Clem's answer instead ;-)
You will probably want something like this:
("http[^"]+)(http)
Then compare the two and if \1 === " + \2 then replace them.
One thought: do you have any query strings in any of your URLs? If you do, are any of them like this "http://someurl.com?http=somemoredatahttp://someurl.com?http=somemoredata"?
If so, you will want something far more complicated.