Need to highlight particular pattern of text - regex

I have a file with some paragraphs. I want to highlight certain words or phrases occurring in the text file with a yellow background and black text.
pattern = ["enough", "too much"];
Text file = "text.txt";
The result should be shown on a web page, with every occurrence of "enough" and "too much" from the text file highlighted.
I want to use Perl for this task.
Please tell me how I can do this efficiently.

Make an array of all the words you want to highlight.
Read the input file into the $file variable.
Loop over that array with foreach and use a regular expression to wrap each word in an HTML tag, i.e.:
foreach my $word (@words)
{
    # \Q...\E quotes regex metacharacters; \b keeps the match on word boundaries
    $file =~ s/\b(\Q$word\E)\b/<span style="color:black; background-color:yellow">$1<\/span>/g;
}
Save $file again as a file with an .html or .htm extension.
This was more of a logic question than a technical one, I guess.
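The same idea can be sketched in Python (used here instead of Perl purely for illustration). The function name highlight and the sample sentence are hypothetical. The sketch assumes whole-word, case-insensitive matching is wanted and that the text should be HTML-escaped before the tags are added; all phrases are combined into one pattern so the text is scanned only once.

```python
import html
import re


def highlight(text, phrases):
    """Wrap each occurrence of the given phrases in a yellow-background span."""
    # Longest phrases first, so a longer phrase wins over a shorter one it contains.
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(html.escape(p)) for p in ordered) + r")\b",
        re.IGNORECASE,
    )
    # Escape the text first so any <, > or & in the file is displayed literally.
    escaped = html.escape(text)
    return pattern.sub(
        r'<span style="background-color: yellow; color: black;">\1</span>',
        escaped,
    )


sample = "This is enough. That was too much, more than enough."
print(highlight(sample, ["enough", "too much"]))
```

Doing the replacement in a single pass also avoids a later pattern accidentally matching inside a tag inserted by an earlier one.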

Related

Regex - return only first src url match

I'm trying to extract the first jpg URL from a page of text. There are multiple paragraphs with multiple URLs in each, but I only want the first URL/jpg; matching should stop after the first one is returned.
Sample page:
this is some text and a url src="https://www.someurl.jpg" more text, more text, more text.
more text, more text
more text, more text.
this is some text and a url src="https://www.anotherurl.jpg" more text, more text, more text.
more text, more text.
Current code:
(?<=src=")(.*?)(?=")
This code returns both URLs. I need the output to be just the first one it finds; matching should stop there.
Required output:
https://www.someurl.jpg
Any help appreciated.
Your regex is fine; just wrap it in slashes and leave out the g flag. With the g flag, match() returns every match; without it, the first element of the result is the first match only.
/(?<=src=")(.*?)(?=")/
console.log(`this is some text and a url src="https://www.someurl.jpg" more text, more text, more text.
more text, more text. this is some text and a url src="https://www.anotherurl.jpg" more text, more text, more text.`.match(/(?<=src=")(.*?)(?=")/i)[0]);
See the documentation on regexp flags for details.
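For comparison, a minimal Python sketch of "first match only". The variable names page and first_url are made up for this example, and a capture group is used in place of the lookarounds. re.search stops at the first match, whereas re.findall would return every match.

```python
import re

page = '''this is some text and a url src="https://www.someurl.jpg" more text.
this is some text and a url src="https://www.anotherurl.jpg" more text.'''

# re.search returns the first match only (or None if there is none).
match = re.search(r'src="(.*?)"', page)
first_url = match.group(1) if match else None
print(first_url)
```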

Using multiple Perl regular expressions to find and replace

I'm a Perl and regex newcomer in need of your expertise.
I need to process text files that include placeholder lines like Foo Bar1.jpg and replace those with corresponding URLs like https:/baz/qux/Foo_Bar1.jpg.
As you may have guessed, I'm working with HTML. The placeholder text refers to the filename, which is the only thing available when writing the document. That's why I have to use placeholder text. Ultimately, of course, I want to replace the filename with the URL (after I upload the file to my CMS to get the URL). At that point, I have all of the information at hand: the filename and the URL. Of course, I could just paste the URLs over the placeholder names in the HTML document. In fact, I've done that. But I'm certain that there's a better way.
In short, I have placeholder lines like this:
Foo Bar1.jpg
Foo Bar2.jpg
Foo Bar3.jpg
And I also have URL lines like this:
https:/baz/qux/Foo_Bar1.jpg
https:/baz/qux/Foo_Bar2.jpg
https:/baz/qux/Foo_Bar3.jpg
I want to find the placeholder string and capture a differentiator like Bar1 with a regex. Then I want to use the captured part like Bar1 to perform another regex search that matches part of the corresponding URL string, i.e. https:/baz/qux/Foo_Bar1.jpg. After a successful match, I want to replace the Foo Bar1.jpg line with https:/baz/qux/Foo_Bar1.jpg.
Ultimately, I want to do that for every permutation, so that https:/baz/qux/Foo_Bar2.jpg also replaces Foo Bar2.jpg and so on.
I've written regular expressions that match both the placeholder and the URL. That's not my problem, as far as I can tell. I can find the strings I need to process. For example, /[a-z]+\s([a-z0-9]+)\.jpg/ successfully matches what I'm calling the placeholder text and captures what I'm calling the differentiator.
However, though I've spent an embarrassing number of hours over the past week reading through Stack Overflow, various other sites and O'Reilly books on Perl and Perl regular expressions, I can't wrap my mind around how to process what I can find.
I think the piece you are missing is the idea of using Perl's built-in grep function to search a list of URL lines based on what you are calling your "differentiator".
Slurp your URL lines into a Perl array (assuming there is a manageable number of them, so that memory is not exhausted):
open(my $url_fh, '<', 'theUrlFile.txt') or die "Cannot open theUrlFile.txt: $!\n";
chomp(my @urls = <$url_fh>);
Then, within the loop over the lines of your file containing "placeholders" (with each line in $_):
# The i flag is needed because the sample names contain capital letters.
if (my ($key) = /[a-z]+\s([a-z0-9]+)\.jpg/i) {
    # Keep the URLs that contain the differentiator followed by .jpg
    my @matches = grep { /\Q$key\E\.jpg/ } @urls;
    if (@matches) {
        s/[a-z]+\s\Q$key\E\.jpg/$matches[0]/i;
    }
}
You may also want to emit an error or warning if @matches != 1.
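The lookup-and-replace logic can also be sketched in Python (not Perl, purely as an illustration). The function name replace_placeholders is hypothetical, and the sketch assumes every URL ends in an underscore, the differentiator and .jpg, as in the sample lines. A placeholder is replaced only when exactly one URL matches; otherwise the line is left unchanged.

```python
import re

# Same pattern as in the question, made case-insensitive for names like "Foo Bar1".
PLACEHOLDER = re.compile(r"[a-z]+\s([a-z0-9]+)\.jpg", re.IGNORECASE)


def replace_placeholders(lines, urls):
    """Return the lines with each placeholder swapped for its matching URL."""
    result = []
    for line in lines:
        m = PLACEHOLDER.search(line)
        if m:
            key = m.group(1)  # the differentiator, e.g. "Bar1"
            matches = [u for u in urls if u.endswith("_" + key + ".jpg")]
            if len(matches) == 1:  # skip ambiguous or missing URLs
                line = line[:m.start()] + matches[0] + line[m.end():]
        result.append(line)
    return result


urls = ["https:/baz/qux/Foo_Bar1.jpg", "https:/baz/qux/Foo_Bar2.jpg"]
print(replace_placeholders(["Foo Bar1.jpg", "some other text", "Foo Bar2.jpg"], urls))
```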

RegEX: Matching everything but a specific value

How do I match everything in an HTML response except this piece of text:
"signed_request" value="The signed_request is placed here"
The fast solution is:
^(.*?)"signed_request" value="The signed_request is placed here"(.*)$
If value can be random text you could do:
^(.*?)"signed_request" value="[^"]*"(.*)$
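This produces two groups: the text before the excluded piece and the text after it.
If the match fails, the text does not contain the piece.
If the text contains the piece more than once, only the first occurrence is excluded.
If you need to remove all occurrences, you can just as well use a string replace method.
Note that in most regex engines . does not match newlines by default, so enable the dot-matches-all option if the response spans several lines.
Keep in mind that using regex on HTML is usually a bad idea.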
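A small Python sketch of both approaches mentioned above: removing every occurrence with a replace, and splitting around the first occurrence. The sample response string and the names SIGNED and split_around are made up for this illustration. Because the pattern uses [^"]* rather than ., it needs no dot-matches-all flag.

```python
import re

# Matches the attribute pair whatever the value is.
SIGNED = re.compile(r'"signed_request" value="[^"]*"')

response = '<input name="signed_request" value="abc.def"><p>Hello</p>'

# Option 1: drop every occurrence and keep everything else.
print(SIGNED.sub("", response))


def split_around(text):
    """Return (before, after) around the first occurrence, or None if absent."""
    parts = SIGNED.split(text, maxsplit=1)
    return tuple(parts) if len(parts) == 2 else None


# Option 2: the two pieces around the first occurrence only.
print(split_around(response))
```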

How to extract a part of the url through regular expression in textwrangler?

My work involves manipulating lots of data. I use TextWrangler as my text editor, but I guess things would be much the same in any text editor.
So I have a url
http://example.com/swatches/frisk-watches/pr?p[]=sort%3Dpopularity&sid=812%2Cf13&offer=GsdOfferOnWatches07.&ref=4c83d65f-bfaf-4db6-b5f5-d733d7b1d2af
The above is a sample URL.
I want to capture the text GsdOfferOnWatches07., i.e. the text from offer= up to &ref, using a regular expression in TextWrangler's Find (Ctrl+F) feature.
How can I do that?
// PHP: capture everything between offer= and &ref
$link = 'http://example.com/swatches/frisk-watches/pr?p[]=sort%3Dpopularity&sid=812%2Cf13&offer=GsdOfferOnWatches07.&ref=4c83d65f-bfaf-4db6-b5f5-d733d7b1d2af';
preg_match('/offer=(.*?)&ref/', $link, $match);
echo $match[1];
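The same capture, sketched in Python with the pattern from the PHP snippet. The variable names are made up for this example. As an alternative that does not depend on offer being followed by &ref, the query string can be parsed with the standard library's urllib.parse.

```python
import re
from urllib.parse import parse_qs, urlsplit

link = ("http://example.com/swatches/frisk-watches/pr?p[]=sort%3Dpopularity"
        "&sid=812%2Cf13&offer=GsdOfferOnWatches07."
        "&ref=4c83d65f-bfaf-4db6-b5f5-d733d7b1d2af")

# Regex approach: lazily capture everything between "offer=" and "&ref".
m = re.search(r"offer=(.*?)&ref", link)
offer = m.group(1) if m else None
print(offer)

# Alternative: parse the query string, so parameter order no longer matters.
params = parse_qs(urlsplit(link).query)
offer_from_query = params["offer"][0]
print(offer_from_query)
```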

Regular Expression to find specific section of a text file

I want to find a section of a text file using a regular expression. I have a file like the one below:
This is general text section that I don't want.
HEADER:ABC1
This is section text under header
More text to follow
Additional text to follow
HEADER:XYZ2
This is section text under header
More text to follow
Additional text to follow
HEADER:KHJ3
This is section text under header
A match text will look like this A:86::ABC
Now I want to retrieve every section, including its HEADER line, whose text contains the match A:86::ABC. The resulting text will be
(HEADER:KHJ3
This is section text under header
A match text will look like this A:86::ABC).
I appreciate any help. I am using Python, and there can be more than one matching section in a file. Also, this is a multi-line file.
>>> import re
>>> regex = re.compile(r".*(HEADER.*$.*A:86::ABC)", re.MULTILINE | re.DOTALL)
>>> regex.findall(string)
[u'HEADER:KHJ3\nThis is section text under header \nA match text will look like this A:86::ABC']
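Hopefully this helps. Note that because of the greedy leading .*, this returns only the last matching section in the file.
For two captures, use r".*(HEADER.*$)(.*A:86::ABC)"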
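Since the question mentions that more than one section may match, here is a sketch of a different technique from the single regex above: split the file just before each HEADER: line, then keep the chunks that contain the marker. The function name sections_containing and the shortened sample text are hypothetical. Splitting on a zero-width pattern assumes Python 3.7 or later.

```python
import re


def sections_containing(text, marker="A:86::ABC"):
    """Return every HEADER section (header line included) that contains marker."""
    # Zero-width split just before each line that starts with "HEADER:".
    chunks = re.split(r"^(?=HEADER:)", text, flags=re.MULTILINE)
    return [
        chunk.rstrip("\n")
        for chunk in chunks
        if chunk.startswith("HEADER:") and marker in chunk
    ]


sample = (
    "General text that is not wanted.\n"
    "HEADER:ABC1\nSection text A:86::ABC\n"
    "HEADER:XYZ2\nMore text to follow\n"
    "HEADER:KHJ3\nA match text will look like this A:86::ABC\n"
)
for section in sections_containing(sample):
    print(section)
    print("---")
```

Text before the first header is discarded because that chunk does not start with HEADER:.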