Bounding Multiple Matches With Single Text - regex

I'm trying to parse out the properties of a type (eg. the words 'Cusip', 'Issuer', and 'Coupon') shown here:
Public Type GetPricesResponse
Cusip As String
Issuer As String
Coupon As String
End Type
The regex ([a-zA-Z0-9]+).+As works great for this code snippet (see http://regexr.com?300fl), but may not work when mixed with a larger body of code. So, I've tried to "bound" this regex with the words Public Type on the front, and End Type at the end to specifically identify what I need as follows:
Public\sType\s([a-zA-Z0-9]+).+As.+End\sType
...but of course it then doesn't match anything.
I have the MultiLine option set as well.

You've presented two different problems.
The first is, roughly, "can I write a regex to match this thing?" The answer is yes. For simplicity I've used \w instead of [a-zA-Z0-9]:
Public\s+Type\s+(\w+)\s+((\w+)\s+As\s+(\w+)\s*('.*\s*)?)+End\s+Type
The next is "how can I parse out the properties" and the answer to that is, as written in the comments: don't use a single regex. First, use a regex which captures only the definitions:
Public\s+Type\s+\w+\s+(.*?)End\s+Type
This uses the reluctant quantifier *? so that the regex won't gobble up End Type, and the DOTALL flag so that . can match across several lines. From this match, take group 1 and repeatedly find the following:
^\s*(\w+)\s+.*$
Group 1 from this match will be your property name.
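A minimal Python sketch of that two-step approach follows. It assumes Python's re module rather than the asker's regex engine, and the names BLOCK, PROP, and property_names are made up for illustration. The per-line pattern is a slight variation (it allows zero leading whitespace and anchors on As) so that the first property, whose indentation is consumed by the outer regex, is still found.

```python
import re

SOURCE = """\
Public Type GetPricesResponse
    Cusip As String
    Issuer As String
    Coupon As String
End Type
"""

# Step 1: capture only the body between "Public Type <name>" and "End Type".
# DOTALL lets "." cross line breaks; "*?" stops at the first "End Type".
BLOCK = re.compile(r"Public\s+Type\s+\w+\s+(.*?)End\s+Type", re.DOTALL)

# Step 2: within a body, take the first word of every "<name> As <type>" line.
# MULTILINE makes "^" match at the start of each line.
PROP = re.compile(r"^\s*(\w+)\s+As\b", re.MULTILINE)


def property_names(source):
    """Return the property names of every Public Type block in source."""
    names = []
    for block in BLOCK.finditer(source):
        names.extend(PROP.findall(block.group(1)))
    return names


print(property_names(SOURCE))
```

Because the outer regex limits the search to the type body, lines containing As elsewhere in a larger file are never considered.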

Use the following regexp to match the whole thing:
Public\s+Type\s+(?<tname>[\w]+)\s+((?<pname>[\w]+)\s+As\s+(?<ptype>[\w]+)\s+)+End\s+Type
Note that it uses named groups for easier access to matched content. After the whole content is matched, the group named tname holds the type name, the group named pname holds the property name, and the group named ptype holds the corresponding property's type.
Here's its live demo:
http://regexr.com?300l0

Related

Regular expression for specific part of url

I'm learning to write regular expressions and am trying to get a specific part of a url. For example,
https://domain/#/customerName/profile/profileName/jobs/jobName
where "customerName" is a variable but "domain" and "profile" are constants.
How would I obtain profileName given this string? I've come up with
^\/(?!https:\/\/domain\/#\/#\/*\/profile\/)
but don't really know how to cut off the rest of the string after "profileName"
customerName is a variable, but assuming it doesn't contain any slashes, you can match it with [^\/]+ (which means "repeat any character but a forward slash") after the /#/. The same pattern works for profileName. So, you can use:
https:\/\/domain\/#\/[^\/]+\/profile\/([^\/]+)
https://regex101.com/r/aToq8N/3
The profileName will be in the first captured group.
Note that because the URL doesn't contain /#/#/, but just /#/, use \/#\/ in the regular expression instead of \/#\/#\/.
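As a rough illustration, here is the same pattern used from Python. This is only a sketch: the question does not say which language is in use, the helper name profile_name is hypothetical, and the forward slashes are left unescaped because Python does not need them escaped.

```python
import re

# Same idea as the pattern above: [^/]+ matches one path segment.
PROFILE = re.compile(r"https://domain/#/[^/]+/profile/([^/]+)")


def profile_name(url):
    """Return the segment after /profile/, or None if the URL does not fit."""
    m = PROFILE.match(url)
    return m.group(1) if m else None


print(profile_name("https://domain/#/customerName/profile/profileName/jobs/jobName"))
```

Everything after profileName is simply left unmatched, so there is no need to "cut off" the rest of the string.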
Even if your domain is fixed, you may use this expression for the domain part too:
[^\/]+
This way it can contain dots in addition to alphabetic characters, among other things.

regex expression for selecting a value

I want to write a regexp for the SIP message below that extracts a number:
< sip:callpark#as1sip1.com:5060;user=callpark;service=callpark;preason=park;paction=park;ptoken=150009;pautortrv=180;nt_server_host=47.168.105.100:5060 >
(Actually there are "<" and ">" signs in the message, but the site does not let me write)
In this case, I want to select the ptoken value. I wrote an expression such as ptoken=(.*);p, but it returns ptoken=150009;p. I just need the number: 150009
How do I write a regexp for this case?
PS: I write this for XML script..
Thanks,
I solved the problem by using two regexes:
<ereg assign_to="token" check_it="true" header="Refer-To:" regexp="(ptoken=([\d]*))" search_in="hdr"/>
<ereg assign_to="callParkToken" search_in="var" variable="token" check_it="true" regexp="([\d].*)" />
You could use the following regex:
ptoken=(\d+)
# searches for ptoken= literally
# captures every digit found in the first group
Your wanted numbers are in the first group then. Take a look at this demo on regex101.com. Depending on your actual needs, there could be better approaches (Xpath? as tagged as XML) though.
You should use lookahead and lookbehind:
(?<=ptoken=)(.+?)(?=;)
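It captures any characters (.+?) that are preceded by ptoken= and followed by ;. The lookbehind and lookahead are not part of the match itself, so the whole match is just the value.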
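To compare the capture-group and lookaround suggestions side by side, here is a small Python sketch. It assumes Python's re module purely for illustration (the asker's tool uses its own XML syntax), the sample message is retyped from the question with the angle brackets restored, and the function names are hypothetical.

```python
import re

MESSAGE = (
    "<sip:callpark#as1sip1.com:5060;user=callpark;service=callpark;"
    "preason=park;paction=park;ptoken=150009;pautortrv=180;"
    "nt_server_host=47.168.105.100:5060>"
)


def token_by_group(message):
    """Capture-group approach: the value is group 1, not the whole match."""
    m = re.search(r"ptoken=(\d+)", message)
    return m.group(1) if m else None


def token_by_lookaround(message):
    """Lookaround approach: the whole match (group 0) is already the value."""
    m = re.search(r"(?<=ptoken=)(.+?)(?=;)", message)
    return m.group(0) if m else None


print(token_by_group(MESSAGE), token_by_lookaround(MESSAGE))
```

The lookaround form is handy when only the whole match can be read back; the capture-group form is usually simpler when groups are accessible.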
The <ereg ... > action has the assign_to parameter. In your case assign_to="token". In fact, the parameter can receive several variable names. The first is assigned the whole string matching the regular expression, and the following are assigned the "capture groups" of the regular expression.
If your regexp is ptoken=([\d]*), the whole match includes ptoken which is bad. The first capture group is ([\d]*) which is the required value. Thus, use <ereg regexp="ptoken=([\d]*)" assign_to="dummyvar,token" ..other parameters here.. >.
Is it working?

How to exclude a certain word in regex?

I'm using this expression and it's perfect for what I need:
.*(cq|conquest).*
It returns any word/phrase/sentence/etc. with the letters 'cq' or the word 'conquest' in it. However, from those matches I want to exclude all that contain the term 'conquest power'.
Examples:
some conquest here (should match)
another cq with some conquest here (should match)
too much cq or conquest power is bad (should not match)
How can I do that to the regex above? It has to be only one regex otherwise the program that I'm using (Advanced Combat Tracker) will create two different tabs.
If you want to match any string which contains "conquest" or "cq", but not if the string contains "conquest power", then the regex is
^(?!.*conquest power).*?(?:cq|conquest).*
The above will attempt to match from the start of the string to the end of the line. If you want to match from the start of each line, switch on multiline mode if available; adding (?m) to the start of the regex may do that.
If you want to match across newlines change . to [\s\S], or switch on singleline mode if available.
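The behaviour can be checked against the three example lines from the question. The sketch below uses Python's re module as a stand-in (Advanced Combat Tracker uses a different engine, so treat this only as a check of the pattern's logic); the names PATTERN, LINES, and matching_lines are made up for the example.

```python
import re

# Negative lookahead first rejects any line containing "conquest power",
# then the rest of the pattern requires "cq" or "conquest" somewhere.
PATTERN = re.compile(r"^(?!.*conquest power).*?(?:cq|conquest).*")

LINES = [
    "some conquest here",
    "another cq with some conquest here",
    "too much cq or conquest power is bad",
]


def matching_lines(lines):
    """Return only the lines the pattern accepts."""
    return [line for line in lines if PATTERN.match(line)]


print(matching_lines(LINES))
```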
You have confused people by stating "I want to match 'cq' or 'conquest'" but also "I want the regex to extract that line".
I assume you don't really want to match just "cq" or "conquest", you want to match strings/lines (?) containing "cq" or "conquest".
From your original question I got that you want to match all strings which contain "cq" or "conquest" but do not contain "power". For this case the following regexp works:
^([^p]|p(?!ower))*(cq|conquest)([^p]|p(?!ower))*$
(regexpal)

regex in as3 to ignore match with specific start to string

I have the following as3 function below which converts normal html with links so that the links have 'event:' prepended so that I can catch them with a TextEvent listener.
protected function convertLinks(str:String):String
{
var p1:RegExp = /href|HREF="(.[^"]*)"/gs;
str = str.replace(p1,'HREF="event:$1"');
return str;
}
For example
<a href="http://www.somedomain.com">
gets converted to
<a href="event:http://www.somedomain.com">
This works just fine, but i have a problem with links that have already been converted.
I need to exclude the situation where i have a string such as
<a href="event:http://www.somedomain.com">
put through the function, because at the moment this gets converted to
<a href="event:event:http://www.somedomain.com">
Which breaks the link.
How can i modify my function so that links with 'event:' at the start are NOT matched and are left unchanged?
First of all, trying to manipulate HTML with regex may not be a good idea.
That said, according to the flavor comparison chart on regular-expressions.info, ActionScript regex is based off of ECMA engine, which supports lookaheads.
Thus you can write this:
/(?:href|HREF)="(?!event:)(.[^"]*)"/
(?=…) is positive lookahead; it asserts that a given pattern can be matched. (?!…) is negative lookahead; it asserts that a given pattern can NOT be matched.
Note that the inclusion of the . is very peculiar. It's probably not intended to include the . there since it can match a closing doublequote.
Note also that I've fixed the alternation for the href/HREF by using a non-capturing group (?:…).
This is because:
this|that matches either "this" or "that"
this|that thing matches either "this" or "that thing"
(this|that) thing matches either "this thing" or "that thing"
Alternatively, you may also want to just turn on the case-insensitivity flag /i, which would handle things like hReF or eVeNt:.
Thus, perhaps your pattern should just be
/href="(?!event:)([^"]*)"/gsi
If lookahead was not supported, you can use an optional pattern that matches event: if it's there, excluding it from group 1, so that it doesn't get included when you substitute in $1.
/href="(?:event:)?([^"]*)"/gsi
       \________/ \_____/
    non-capturing group 1
       optional
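For readers who want to try the negative-lookahead version outside ActionScript, here is an equivalent sketch in Python. It is not the asker's AS3 function: convert_links and HREF are illustrative names, Python writes the backreference as \1 instead of $1, and the replacement normalises the attribute name to lowercase.

```python
import re

# (?!event:) refuses to match when the value already starts with "event:".
HREF = re.compile(r'href="(?!event:)([^"]*)"', re.IGNORECASE)


def convert_links(html):
    """Prefix every not-yet-converted href value with 'event:'."""
    return HREF.sub(r'href="event:\1"', html)


once = convert_links('<a href="http://www.somedomain.com">')
twice = convert_links(once)
print(once)
print(twice)
```

Running the function on its own output leaves the link unchanged, which is the property the question asks for.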

How to match a string that does not end in a certain substring?

How can I write a regular expression that matches strings that do not end with a certain substring?
In my project, all classes whose names don't end with certain strings, such as "controller" and "map", should inherit from a base class. How can I check this using a regular expression?
but using both
public*.class[a-zA-Z]*(?<!controller|map)$
public*.class*.(?<!controller)$
there isnt any match case!!!
Do a search for all filenames matching this:
(?<!controller|map|anythingelse)$
(Remove the |anythingelse if no other keywords, or append other keywords similarly.)
If you can't use negative lookbehinds (the (?<!..) bit), do a search for filenames that do not match this:
(?:controller|map)$
And if that still doesn't work (might not in some IDEs), remove the ?: part and it probably will - that just makes it a non-capturing group, but the difference here is fairly insignificant.
If you're using something where the full string must match, then you can just prefix either of the above with ^.* to do that.
Update:
In response to this:
but using both
public*.class[a-zA-Z]*(?<!controller|map)$
public*.class*.(?<!controller)$
there isnt any match case!!!
Not quite sure what you're attempting with the public/class stuff there, so try this:
public.*class.*(?<!controller|map)$
The . is a regex char that means "anything except newline", and the * means zero or more times.
If this isn't what you're after, edit the question with more details.
Depending on your regex implementation, you might be able to use a lookbehind for this task. This would look like
(?<!SomeText)$
This matches any lines NOT having "SomeText" at their end. If you cannot use that, the expression
^(?!.*SomeText$).*$
matches any line not ending with "SomeText" as well (including empty lines, since .* can match nothing).
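Which form is usable depends on the engine. As one example, Python's re module only accepts fixed-width lookbehind, so an alternation of different lengths such as (?<!controller|map) is rejected there, while the lookahead form works. The sketch below assumes Python, case-insensitive matching, and hypothetical class names; needs_base_class is an illustrative helper, not part of any tool mentioned in the question.

```python
import re

# Lookahead form: reject the name if "controller" or "map" sits at its end.
NOT_SUFFIXED = re.compile(r"^(?!.*(?:controller|map)$).*$", re.IGNORECASE)

NAMES = ["UserController", "UserMap", "UserService", "Mapper"]


def needs_base_class(names):
    """Return the names that do not end with 'controller' or 'map'."""
    return [name for name in names if NOT_SUFFIXED.match(name)]


print(needs_base_class(NAMES))
```

Note that Mapper is kept: it contains map but does not end with it.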
You could write a regex that contains two groups: the first matches one or more characters before controller or map, and the second matches controller or map and is optional. The first group has to be lazy (.+?); a greedy .+ would consume the whole name and leave the second group empty every time.
^(.+?)(controller|map)?$
With that you can match your string, and if the regex API you use has a group() method and group(2) is empty, the string does not end with controller or map.
Check if the name does not match [a-zA-Z]*controller or [a-zA-Z]*map.
Finally, I did it this way:
public.*class.*[^(controller|map|spec)]$
It worked.