how to remove all website addresses in bulk using regex

how to remove all website addresses in bulk using regex - regex

I have a lot of sites and I want to delete all of the web site address.
example:
http://www.website1.com/product.php?id=
http://www.website2.net/list.php?cid=
http://www.website3.org/view.php?page=
once removed:
product.php? id =
list.php? cid =
view.php? page =
I want to remove them in bulk using regex101 or regex on notepad ++
I want to ask what the code regullar expression to remove all of it?

I find PHP Live Regex easier to use for that purpose since you see the replace results directly (choose preg_replace instead of preg_match):
You can use this regex and choose replace and only keep the first capturing group $1:
(?:[a-z]{4,5}://[a-z.0-9]*\/)?([a-z.\?_=]*)([0-9]*)
Result:
product.php?id=
list.php?cid=
view.php?page=
See: http://www.phpliveregex.com/p/g5q

Use the following regex to search in Notepad++:
.*\/ demo
Then use a space to replace.
Basically we are searching for the last / and removing everything from beginning to that position.

Related

Regex to remove everything after -i- (with -i-)

I was trying to find solution for my problem.
Input: prd-abcd-efgh-i-0dflnk55f5d45df
Output: prd-abcd-efgh
Tried Splunk Query : index=aws-* (host=prd-abcd-efgh*) | rex field=host "^(?<host>[^.]+)"| dedup host | stats count by host,methodPath
I want to remove everything comes after "-i-" using simple regex.I tried with regex "^(?[^.]+)" listed here
https://answers.splunk.com/answers/77101/extracting-selected-hosts-with-regex-regex-hosts-with-exceptions.html
Please help me to solve it.

replace(host, "(?<=-i-).*", "")
Example here: https://regex101.com/r/blcCcQ/2
This (?<=-i-) is a lookbehind

I have no knowledge of Splunk. but the normal way to do that would be to match the part you don't want and replace it with an empty string.
The regex for doing that could be:
-i-.*
Then replace the match with an empty string.

Something simple like this should work:
([a-z-]+)-i-.+
The first capture group will return only the part preceding -i-.

Regex ignore first 12 characters from string

I'm trying to create a custom filter in Google Analytic to remove the query parts of the url which I don't want to see. The url has the following structure
[domain]/?p=899:2000:15018702722302::NO:::
I would like to create a regex which skips the first 12 characters (that is until:/?p=899:2000), and what ever is going to be after that replace it with nothing.
So I made this one: https://regex101.com/r/Xgbfqz/1 (which could be simplified to .{0,12}) , but I actually would like to skip those and only let the regex match whatever is going to be after that, so that I'll be able to tell in Google Analytics to replace it with "".
The part in the url that is always the same is
?p=[3numbers]:[0-4numbers]
Thank you

Your regular expression:
\/\?p=\d{3}\:\d{0,4}(.*)
Tested in Golang RegEx 2 and RegEx101
It search for /p=###:[optional:####] and capture the rest of the right side string.
(extra) JavaScript:
paragraf='[domain]/?p=899:2000:15018702722302::NO:::'
var regex= /\/\?p=\d{3}\:\d{0,4}(.*)/;
var match = regex.exec(paragraf);
alert('The rest of the right side of the string: ' + match[1]);

Easily use "[domain]/?p=899:2000:15018702722302::NO:::".substr(12)

You can try this:
/\?p\=\d{3}:\d{0,4}
Which matches just this: ?p=[3numbers]:[0-4numbers]
Not sure about replacing though.
https://regex101.com/r/Xgbfqz/1

How to extract FirstName and LastName from html tags with regex?

I have response body which contains
"<h3 class="panel-title">Welcome
First Last </h3>"
I want to fetch 'First Last' as a output
The regular expression I have tried are
"Welcome(\s*([A-Za-z]+))(\s*([A-Za-z]+))"
"Welcome \s*([A-Za-z]+)\s*([A-Za-z]+)"
But not able to get the result. If I remove the newline and take it as
"<h3 class="panel-title">Welcome First Last </h3>" it is detecting in online regex maker.

I suspect your problem is the carriage return between "Welcome" and the user name. If you use the "single-line mode" flag (?s) in your regex, it will ignore newlines. Try these:
(?s)Welcome(\s*([A-Za-z]+))(\s*([A-Za-z]+))
(?s)Welcome \s*([A-Za-z]+)\s*([A-Za-z]+)
(this works in jMeter and any other java or php based regex, but not in javascript. In the comments on the question you say you're using javascript and also jMeter - if it is a jMeter question, then this will help. if javaScript, try one of the other answers)

Well, usually I don't recommend regex for this kind of work. DOM manipulation plays at its best.
but you can use following regex to yank text:
/(?:<h3.*?>)([^<]+)(?:<\/h3>)/i
See demo at https://regex101.com/r/wA2sZ9/1
This will extract First and Last names including extra spacing. I'm sure you can easily deal with spaces.

In jmeter reg exp extractor you can use:
<h3 class="panel-title">Welcome(.*?)</h3>
Then take value using $1$.
In the data you shown welcome is followed by enter.If actually its part of response then you have to use \n.
<h3 class="panel-title">Welcome\n(.*?)</h3>
Otherwise above one is enough.
First verify this in jmeter using regular expression tester of response body.

Welcome([\s\S]+?)<
Try this, it will definitely work.

Regular expressions are greedy by default, try this
Welcome\s*([A-Za-z]+)\s*([A-Za-z]+)
Groups 1 and 2 contain your data
Check it here

extract email address from Notepad++ using regex

I am trying to extract email addresses from notepad++ using RegEx.
I tried like this
Find and Replace
Find: (\b[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b)
Replace : .\1
I am loosing email address instead of text. I need remove all text and keep only email addresses in the file. How to do that?
Abilash Perumandla
hi Gunpreet, kindly share your thoughts to Abi#TEKperfekt.com
Pratap Aneel
15d
Pratap Aneel
please share your thoughts to Pratap.kumar#rsrit.com
naveen kumar
15d
naveen kumar

You need to match and capture the email with a (...) subpattern (so, you do that right), but you need to just match everything else (and that part is missing).
Use
Find what: (\b[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b)|.
Replace with: $1
Then, you might want to use Edit -> Blank Operations -> Remove Unnecessary Blank and EOL menu option.

Regex Assistance for a url filepath

Can someone assist in creating a Regex for the following situation:
I have about 2000 records for which I need to do a search/repleace where I need to make a replacement for a known item in each record that looks like this:
<li>View Product Information</li>
The FILEPATH and FILE are variable, but the surrounding HTML is always the same. Can someone assist with what kind of Regex I would substitute for the "FILEPATH/FILE" part of the search?

you may match the constant part and use grouping to put it back
(<li>View Product Information</li>)
then you should replace the string with $1your_replacement$2, where $1 is the first matching group and $2 the second (if using python for instance you should call Match.group(1) and Match.group(2))
You would have to escape \ chars if you're using Java instead.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

how to remove all website addresses in bulk using regex - regex

Use the following regex to search in Notepad++: .*\/ demo Then use a space to replace. Basically we are searching for the last / and removing everything from beginning to that position.

Related

Regex to remove everything after -i- (with -i-)

Regex ignore first 12 characters from string

How to extract FirstName and LastName from html tags with regex?

extract email address from Notepad++ using regex

Regex Assistance for a url filepath

Categories

Resources