Extract text string from URL using regex

Basically, I have a list of URLs that look like this:
http://auctionnetwork.com.my/auctiondate_img.php?id=244003
and I want to extract auctiondate_244003. How would I do that with regex?
I want the output to be "auctiondate_244003".

You didn't define exactly what you are looking for.
How about something like: \/([^\/]*)img\.php\?id=([0-9]+)?
You'll have to concatenate the first and second match groups to get what you need.
See https://regex101.com/r/dD2hW6/1
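
As a rough sketch of the concatenation step in JavaScript: the helper name extractAuctionKey is made up for illustration, and the pattern is used with the trailing ? left off so that the id group is required.

```javascript
// Hypothetical helper (not from the original answer): joins the two capture groups.
function extractAuctionKey(url) {
  // Group 1: the file-name part before "img.php" (e.g. "auctiondate_")
  // Group 2: the numeric id from the query string (e.g. "244003")
  const match = /\/([^\/]*)img\.php\?id=([0-9]+)/.exec(url);
  // Return null when the URL does not have the expected shape.
  return match ? match[1] + match[2] : null;
}

console.log(extractAuctionKey("http://auctionnetwork.com.my/auctiondate_img.php?id=244003"));
// prints "auctiondate_244003"
```

Because group 1 already ends with the underscore, no separator needs to be added between the two groups.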

Related

Add a word before each extracted String

I am capturing the session id from a string. I want to add a word (prefix) before the extracted session id.
Sample input: key=this is sample input; MySessionId=hhjsfd436763jhjhfdjs87787.hghht77f54; key7=jhu8787; type=raw; oldkey=jkjf8787;
I have formed the below regex to capture the MySessionId.
MySessionId=([^.]*)
I want to add a word before the extracted string like below.
Expected output:
ABCD-1234-hhjsfd436763jhjhfdjs87787
Is there any way to achieve this with a regular expression?

It really depends on what language you're using; you'll need to find a function that replaces text in a string (it's usually called replace). It looks like you're dealing with cookies, so I'll show you an example in JavaScript:
// $1 refers to the first group captured by the regex.
// I think other languages use $1 too, but you should probably check.
// Note the "=" after MySessionId, so that it is not captured as part of $1.
string = string.replace(/MySessionId=([^.]*)/, "ABCD-1234-$1")
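
Note that replace keeps the rest of the input around the match. If only the prefixed id is wanted, as in the expected output, one possible approach is to match first and then build the result by hand. This is a minimal sketch; prefixSessionId and its parameters are hypothetical names.

```javascript
// Hypothetical helper: returns only the prefixed session id, or null if absent.
function prefixSessionId(input, prefix) {
  // Capture everything after "MySessionId=" up to (not including) the first dot.
  const match = /MySessionId=([^.]*)/.exec(input);
  return match ? prefix + match[1] : null;
}

const input =
  "key=this is sample input; MySessionId=hhjsfd436763jhjhfdjs87787.hghht77f54; key7=jhu8787; type=raw; oldkey=jkjf8787;";
console.log(prefixSessionId(input, "ABCD-1234-"));
// prints "ABCD-1234-hhjsfd436763jhjhfdjs87787"
```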

Regex ignore first 12 characters from string

I'm trying to create a custom filter in Google Analytics to remove the query parts of the URL that I don't want to see. The URL has the following structure:
[domain]/?p=899:2000:15018702722302::NO:::
I would like to create a regex that skips the first 12 characters (that is, up to /?p=899:2000) and replaces whatever comes after that with nothing.
So I made this one: https://regex101.com/r/Xgbfqz/1 (which could be simplified to .{0,12}), but I would actually like to skip those characters and have the regex match only whatever comes after them, so that I can tell Google Analytics to replace it with "".
The part in the url that is always the same is
?p=[3numbers]:[0-4numbers]
Thank you

Your regular expression:
\/\?p=\d{3}\:\d{0,4}(.*)
Tested in Golang RegEx 2 and RegEx101
It searches for /?p= followed by three digits, a colon, and up to four digits, and captures the rest of the string.
(extra) JavaScript:
var paragraf = '[domain]/?p=899:2000:15018702722302::NO:::';
var regex = /\/\?p=\d{3}\:\d{0,4}(.*)/;
var match = regex.exec(paragraf);
// match[1] holds everything after the fixed "/?p=###:####" prefix
alert('The rest of the right side of the string: ' + match[1]);
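
Since the goal is to drop everything after the fixed prefix, one option in plain JavaScript is to capture the prefix and replace the whole match with it. This is only a sketch: stripQueryTail is a made-up name, and how Google Analytics filters apply a pattern is not covered here.

```javascript
// Hypothetical helper: keeps "/?p=###:####" and removes whatever follows it.
function stripQueryTail(url) {
  // Group 1 is the part to keep; ".*" consumes the tail that should disappear.
  return url.replace(/(\/\?p=\d{3}:\d{0,4}).*/, "$1");
}

console.log(stripQueryTail("[domain]/?p=899:2000:15018702722302::NO:::"));
// prints "[domain]/?p=899:2000"
```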

Or simply use "[domain]/?p=899:2000:15018702722302::NO:::".substr(12)

You can try this:
/\?p\=\d{3}:\d{0,4}
Which matches just this: ?p=[3numbers]:[0-4numbers]
Not sure about replacing though.
https://regex101.com/r/Xgbfqz/1

Using regex to filter a URL list

I'm trying to use regex to filter a list of sites so that it keeps only those that don't include a specific word.
For example, from the list below, I want to filter out all sites containing the word test, as well as empty strings, so that the final output is http://example.com. I tried ^((?!test).)* but that doesn't filter out empty strings. Maybe there is a better way to filter them? Thanks.
http://test1.com
http://test2.com
*empty string*
http://example.com

You need to use a negative lookahead and .+ in your regex, like this:
^(?!.*test).+
RegEx Demo
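
To illustrate how this pattern behaves on the list from the question, here is a small JavaScript sketch. The variable names are assumptions, and the empty string stands in for the "*empty string*" entry.

```javascript
// The list from the question, with "" for the empty entry.
const sites = ["http://test1.com", "http://test2.com", "", "http://example.com"];

// (?!.*test) rejects any line containing "test";
// .+ requires at least one character, which rejects empty strings.
const pattern = /^(?!.*test).+/;

const kept = sites.filter((site) => pattern.test(site));
console.log(kept); // kept contains only "http://example.com"
```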

Match on ID in URL with RegEx

I am looking to find a random alpha-numeric ID from a URL returned in some JSON. Using the example:
"responseURL" : "http://sutureself.com/userid/123abc"
... I want to just match on the 123abc. So far I have:
http://sutureself.com/userid/([a-z0-9]+)
... but this matches the whole URL. What do I need to add to only match this ID? Note, the length of the ID can differ.

How about this:
[^/]+(?="})
Working regex example:
http://regex101.com/r/pT3fG8

Looks like you need to set the template to "$1". See comments in this related post.
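
In languages that expose capture groups directly, the same idea as the "$1" template is to read group 1 instead of the whole match. This is a minimal JavaScript sketch: the surrounding JSON object and the variable names are assumptions made for illustration.

```javascript
// Assumed JSON wrapper around the responseURL field from the question.
const json = '{"responseURL" : "http://sutureself.com/userid/123abc"}';

// The whole match (index 0) is the full URL; group 1 is just the id.
const match = /http:\/\/sutureself\.com\/userid\/([a-z0-9]+)/.exec(json);
const userId = match ? match[1] : null;

console.log(userId); // prints "123abc"
```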

Regular Expression filter for Meta Fields

I want to parse text content to extract some parameters with Regular Expression.
My text looks like below:
//_META_FIELD{Parameter: S}
And I want to extract the content that starts with "//_META_FIELD{" and ends with "}".
So the extracted content will be: Parameter: S
Can anyone help?

This Regex will find what you are looking for:
#^//_META_FIELD{(.+?)}$#m
^ makes sure the match starts at the beginning of the line, and $ makes sure nothing follows the closing }. You can remove them if you don't need that.
Also you can see an example of that RegExp here

The regex should look something like this:
^//_META_FIELD\{(.*?)\}$
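
Here is a short JavaScript sketch of applying that pattern to multi-line text. The sample lines around the meta field are invented for illustration, and the g and m flags are added so that every matching line is found.

```javascript
// Invented sample text; only the _META_FIELD line comes from the question.
const source = [
  "int a = 1;",
  "//_META_FIELD{Parameter: S}",
  "// ordinary comment",
].join("\n");

// m: ^ and $ match at line boundaries; g: required by matchAll.
const metaPattern = /^\/\/_META_FIELD\{(.*?)\}$/gm;

// Collect group 1 (the text between the braces) from every match.
const fields = Array.from(source.matchAll(metaPattern), (m) => m[1]);
console.log(fields); // fields contains only "Parameter: S"
```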
