Regular Expression to check for words - regex

Here is my regular expression:
#"\<\\{+(TITLE|BODY|DATE|CALENDARID|CALENDARSUBSCRIPTIONGUID|CALENDARURL|TIME|CONTACTS|LOCATION|URLINFO)\\}+\\>"
It needs to match words like {Title}. Have I written it correctly?

It looks OK to me. You might want to use the -i switch to make your expression case-insensitive, unless you only want to match "TITLE" and not "title", for example. Here is a pretty good regex checker: RegEx Pal. I noticed you didn't tag a language, and that link is a JavaScript regex tester, so here are some more resources: the Regex Lib resources page.
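As a rough illustration of the case-insensitivity point, here is a minimal Python sketch. The question does not name a language, so Python is an assumption, and `find_placeholders` is a hypothetical helper name. Python's `re` module has no `\<` / `\>` word-boundary anchors, so the sketch leaves them out; the braces already delimit the word.

```python
import re

# Placeholder names are taken from the question; the braces are escaped literals.
PLACEHOLDER = re.compile(
    r"\{(TITLE|BODY|DATE|CALENDARID|CALENDARSUBSCRIPTIONGUID|CALENDARURL"
    r"|TIME|CONTACTS|LOCATION|URLINFO)\}",
    re.IGNORECASE,  # plays the role of the -i switch mentioned above
)


def find_placeholders(text):
    """Return the placeholder names found in text, in order of appearance."""
    return [m.group(1) for m in PLACEHOLDER.finditer(text)]


print(find_placeholders("Meet at {Location} on {DATE}, see {Unknown}"))
```

Without `re.IGNORECASE`, only the upper-case spelling such as {TITLE} would match.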

Related

use regex to get both link and text associated with it (anchor tag)

I created a regex string that I hoped would get both the link and the associated text in an html page. For instance, if I had a link such as:
<a href='www.la.com/magic.htm'>magicians of los angeles</a>
Then the link I want is 'www.la.com/magic.htm' and the text I want is 'magicians of los angeles'.
I used the following regex expression:
strsearch = "\<a\s+(.*?)\>(.*?)\</a\s*?\>|"
But my VB program told me I was getting too many matches.
Is there something wrong with the regular expression?
The parentheses are meant to capture 'groups' that can be back-referenced.
Thanks
What about this one:
\<a href=.+\</a>
All that is left to do is to go over each match and extract the substrings using regular string manipulation.
You can check it on RegExr (although RegExr follows the JavaScript regex implementation, it is still useful in our scenario).
That said, I often see people stating that regexes are not suited for parsing HTML. You might need to use an HTML parser for this. The ones I know of and can recommend are HtmlAgilityPack, which is not maintained anymore, and AngleSharp.
I tried the following pattern, and it worked:
\<a href=(.*?)\>(.*?)\<\/a\s*?\>|
I also found two errors in your original string:
the / in /a was not escaped
the attribute name 'href' is captured in the first group
Finally, I would like to recommend a great site for testing regex strings. It will help you debug really fast. See this (it also demonstrates the result you want):
REGEX101
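Following the suggestion above that an HTML parser may be more robust than a regex, here is a minimal sketch using the `html.parser` module from Python's standard library. The question used VB, so Python is a swap made for illustration, and `LinkCollector` and `extract_links` are hypothetical names.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while inside an <a> that carried an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def extract_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


print(extract_links("<a href='www.la.com/magic.htm'>magicians of los angeles</a>"))
```

Each pair holds the link and its text, so no back-references or post-processing of matches are needed.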

Regular expression not working in google analytics

I'm trying to build a regular expression to capture URLs which contain a certain parameter, 7136D38A-AA70-434E-A705-0F5C6D072A3B.
I've set up a simple regex to capture a URL with anything before and anything after this parameter (that is, all URLs which contain this parameter). I've tested it on an online checker, http://scriptular.com/, and it seems to work fine. However, Google Analytics says it is invalid when I try to use it. Any idea what is causing this?
The URL will be in the format
/home/index?x=23908123890123&y=kjdfhjhsfd&z=7136D38A-AA70-434E-A705-0F5C6D072A3B&p=kljdaslkjasd
So I just want to capture URLs that contain that specific "z" parameter.
My regex:
^.+(?=7136D38A-AA70-434E-A705-0F5C6D072A3B).+$
You just need
^.+=7136D38A-AA70-434E-A705-0F5C6D072A3B.+$
Or (a bit safer):
^.+=7136D38A-AA70-434E-A705-0F5C6D072A3B($|&.+$)
And I think you can even use
=7136D38A-AA70-434E-A705-0F5C6D072A3B($|&)
See demo
Your regex is invalid because GA regex flavor does not support look-arounds (and you have a (?=...) positive look-ahead in yours).
Here is a good GA regex cheatsheet.
To match /home/index?x=23908123890123&y=kjdfhjhsfd&z=7136D38A-AA70-434E-A705-0F5C6D072A3B&p=kljdaslkjasd you can use:
\S*7136D38A-AA70-434E-A705-0F5C6D072A3B\S*
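To check the look-around-free alternative against the sample URL, here is a small Python sketch. Note the assumption: Python's `re` module does support look-aheads, so it cannot reproduce the Google Analytics error; it only shows that the simpler pattern matches the intended URLs. `has_guid_param` is a hypothetical helper name.

```python
import re

GUID = "7136D38A-AA70-434E-A705-0F5C6D072A3B"

# Look-around-free pattern from the answer above:
# "=" followed by the GUID, then either the end of the string or "&".
PATTERN = re.compile("=" + re.escape(GUID) + r"($|&)")


def has_guid_param(url):
    """Return True if some parameter value in url is exactly the GUID."""
    return PATTERN.search(url) is not None


url = ("/home/index?x=23908123890123&y=kjdfhjhsfd"
       "&z=7136D38A-AA70-434E-A705-0F5C6D072A3B&p=kljdaslkjasd")
print(has_guid_param(url))
```

Requiring `$` or `&` after the GUID avoids matching a longer value that merely starts with it.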

Regex Expression to Match URL and Exclude Other

I'm trying to write a regular expression to match anything of the form (.*)/feed/, with the exception of (.*)/author/feed/.
Currently, I have (.*)/feed/(.*), which works well to identify any string containing /feed/ to redirect. However, I want to exclude those that have /author/(.*)/feed/.
For example: match http://www.site.com/ANYTHING/feed/ but exclude site.com/author/ANYTHING/feed/
I should clarify that I'm not terribly familiar with regular expressions, but this is for use within the Redirection plugin for WordPress, which states "Full regular expression support."
Any help would be greatly appreciated. Thank you in advance.
Depending on the language, you may be able to use a negative look-behind assertion:
(.*)(?<!/author)/feed
The assertion, (?<!/author), ensures that the text immediately before /feed is not /author, without consuming any characters itself.
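A quick Python sketch of how the look-behind behaves on the URLs from the question. Python is used only for illustration and it is an assumption that the target engine supports look-arounds. The second pattern, `NO_AUTHOR_FEED`, is not from the answer; it is a hypothetical variant using a negative look-ahead for the case where something sits between /author/ and /feed/.

```python
import re

# Pattern from the answer: /feed must not be immediately preceded by /author.
FEED = re.compile(r"(.*)(?<!/author)/feed")

# Hypothetical variant: reject any URL that contains /author/ anywhere.
NO_AUTHOR_FEED = re.compile(r"^(?!.*/author/).*/feed/")

for url in (
    "http://www.site.com/ANYTHING/feed/",
    "http://www.site.com/author/feed/",
    "site.com/author/ANYTHING/feed/",
):
    print(url, bool(FEED.search(url)), bool(NO_AUTHOR_FEED.search(url)))
```

The look-behind only inspects the characters directly before /feed, so site.com/author/ANYTHING/feed/ still matches the first pattern; the look-ahead variant rejects it.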

Notepad++ replace with reg expression?

I have a big list with links and other data in it. I want to filter out everything else and end up with a list of just the links.
Example of the current list:
32,2012-01-04 06:44:44,http://link.com/link
33,2012-01-04 06:44:45,http://link.com/link,{Text|textext|text},http://link.com/link|http://link.com/link|http://link.com/link
Notepad++ offers find-and-replace functionality using regular expressions. You can access this feature with Ctrl+H.
If you're actually asking for a regular expression to do this, you can use something like this to match URLs:
\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))
which I found here.
Additionally you can test out changes to your regex easily at http://gskinner.com/RegExr/
Using the input you provided, here's a pattern you can use on http://www.regexr.com/
You'll need to make sure the global (/g) flag is on
Expression:
.*?(http.*?)[,|\n]
Input:
32,2012-01-04 06:44:44,http://link.com/link1
33,2012-01-04 06:44:45,http://link.com/link2,{Text|textext|text},http://link.com/link3|http://link.com/link4|http://link.com/link5
Substitution:
$1\n
Output:
http://link.com/link1
http://link.com/link2
http://link.com/link3
http://link.com/link4
http://link.com/link5
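Outside Notepad++, the same extraction can be sketched in a few lines of Python. This is an alternative approach rather than the substitution shown above, and it assumes a URL ends at whitespace, a comma, or a pipe, as in the sample data; URLs that themselves contain commas or pipes would be cut short. `extract_urls` is a hypothetical helper name.

```python
import re

# Assumption: a URL runs until whitespace, a comma, or a pipe
# (the separators that appear in the sample data).
URL = re.compile(r"https?://[^\s,|]+")


def extract_urls(text):
    """Return every URL in text, in order of appearance."""
    return URL.findall(text)


sample = (
    "32,2012-01-04 06:44:44,http://link.com/link1\n"
    "33,2012-01-04 06:44:45,http://link.com/link2,{Text|textext|text},"
    "http://link.com/link3|http://link.com/link4|http://link.com/link5"
)
print("\n".join(extract_urls(sample)))
```

Because the pattern matches the URLs directly instead of the text around them, the last URL is found even when the input has no trailing newline.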

Regex not returning 2 groups

I'm having a bit of trouble with my regex and was wondering if anyone could please shed some light on what to do.
Basically, I have this Regex:
\[(link='\d+') (type='\w+')](.*|)\[/link]
For example, when I pass it the string:
[link='8' type='gig']Blur[/link] are playing [link='19' type='venue']Hyde Park[/link]
It only returns a single match from the opening [link] tag to the last [/link] tag.
I'm just wondering if anyone could please help me with what to put in my (.*|) section to only select one [link][/link] section at a time.
Thanks!
You need to make the wildcard selection ungreedy with the "?" operator. I make it:
/\[(link='\d+')\s+(type='\w+')\](.*?)\[\/link\]/
Of course, this all falls down for any kind of nesting, in which case the language is no longer regular and regexes aren't suitable; find a parser.
Regular Expressions Info is a fantastic site. This page gives an example of dealing with HTML tags. There's also an Eclipse plugin that lets you develop expressions and see the matching in real time.
You need to make the .* in the middle of your regex non-greedy. Look up the syntax and/or flag for non-greedy mode in your flavor of regular expressions.
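The difference between the greedy and non-greedy forms can be seen with a short Python sketch on the string from the question. The question does not state its regex flavor, so Python is an assumption, and the names `GREEDY` and `LAZY` are illustrative.

```python
import re

text = ("[link='8' type='gig']Blur[/link] are playing "
        "[link='19' type='venue']Hyde Park[/link]")

# Greedy: .* runs to the LAST [/link], so the whole string is one match.
GREEDY = re.compile(r"\[(link='\d+')\s+(type='\w+')\](.*)\[/link\]")

# Non-greedy: .*? stops at the FIRST [/link] that lets the match succeed.
LAZY = re.compile(r"\[(link='\d+')\s+(type='\w+')\](.*?)\[/link\]")

greedy_matches = GREEDY.findall(text)
lazy_matches = LAZY.findall(text)

print(len(greedy_matches))  # one match spanning both tags
for link, kind, label in lazy_matches:
    print(link, kind, label)
```

As noted above, this still breaks down for nested [link] tags, where a parser is the better tool.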