Optional question mark - regexp - regex

I have a problem creating a correct regular expression.
Here is what I have so far:
https://regex101.com/r/d0epRo/2
I need to add one more parameter to these links, so I have to determine whether a question mark is already present. The ? should therefore be optional, but I can't get it to work.
None of these work: (\?|), (\?)?, (\??).
These links should be matched but aren't: http://www.polskieszlaki.pl and http://www.polskieszlaki.pl/wawel.htm
I have no further ideas. Please help.

I think what you want is this regex:
a[\s]+href="[^mailto][\S]+polskieszlaki\.pl(?:(.*))?(?:(\?)(.*))?\"
Here (?: ... ) is a non-capturing group: it groups without capturing.

If you are just trying to retrieve the query parameters try:
a[\s]+href="[^mailto][\S]+polskieszlaki\.pl(.*)(?:\?(?<param>.*))\"
You can then extract the param group.
Or, in a simpler form without the named and non-capturing groups:
a[\s]+href="[^mailto][\S]+polskieszlaki\.pl(.*)(\?(.*))\"
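As a minimal Python sketch of the optional question mark, assuming the goal is to append one parameter to every matching link: the pattern below is a simplified stand-in for the asker's regex (which is only available behind the link), and `add_param` and the `utm=1` parameter are hypothetical names. Note that `[^mailto]` in the patterns above is a character class that excludes a single character out of m, a, i, l, t, o; the sketch assumes a negative lookahead `(?!mailto:)` was intended.

```python
import re

# href="..." pointing at polskieszlaki.pl, with an OPTIONAL query string.
# Group 1: everything before the "?", group 2: the query (None if absent).
LINK = re.compile(
    r'href="(?!mailto:)([^"?]*polskieszlaki\.pl[^"?]*)(?:\?([^"]*))?"'
)

def add_param(html, param):
    """Append `param` to every matching link, adding "?" or "&" as needed."""
    def repl(m):
        base, query = m.group(1), m.group(2)
        new_query = f"{query}&{param}" if query else param
        return f'href="{base}?{new_query}"'
    return LINK.sub(repl, html)
```

The whole `(?:\?([^"]*))?` group is optional, so links with and without a question mark are both matched, and group 2 tells the replacement function which case it is in.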

Related

How to combine groups in Regex with non capturing groups to have all optional

What I'm trying to achieve: I want to match user entered sentence with my templates and to see which template matches better (as many groups out of all in template as possible).
Regex which I'm building to solve example:
^(\bMyCompany1\b)?(?:.+)?\s(\bestablishes\b)?(?:.+)?\s(\bAnotherCompany\b)?(?:.+)?$
Example sentences:
'MyCompany1 establishes AnotherCompany' - matches all 3 groups. This is OK.
'MyCompany1 establ AnotherCompany' - matches the first and last groups and ignores the typo in the middle. This is also OK.
'MyCompany1 establishes AnotherCompany ' - trailing space. Groups 2 and 3 are not identified, and I don't understand why.
'MyCompany1 establishes  AnotherCompany' - additional spaces after the word 'establishes'. For some reason the 2nd group is no longer detected.
This regex is just an example of one template. I will have one regex (built dynamically) per template, such as 'User1 sent a request to User2' or 'Company1 borrowed to Company2 $111'. My idea is to define each part of the template and see how many parts matched. In my example:
- I expect some company name from the list (MyCompany or MyCompany1), or a non-capturing group to ignore the rest (maybe the user made a typo or is still typing)
- I expect the groups to appear in the same order
Can you please explain what I'm doing wrong in my regex? Is regex even the right tool for this?
This covers all your test cases. It is based on three lookaheads, each containing an optional non-capturing group that wraps a capturing group for one of the keywords you're looking for.
^(?=(?:.*(\bMyCompany1\b))?)(?=(?:.*?(\bestablishes\b))?)(?=(?:.*(\bAnotherCompany\b))?).*$
You'll get regex explanation at the link below:
Demo
Or, if the order matters:
^(?:.*(\bMyCompany1\b))?(?:.*?(\bestablishes\b))?(?:.*(\bAnotherCompany\b))?.*$
Demo
Could you please try the regex below:
^(\bMyCompany1\b)?\s+(\bestablishes\b)?\s+(\bAnotherCompany\b)?(?:.+)?$
Hope it helps.
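To score a sentence against a template, one option is to count how many capturing groups took part in the match. The sketch below is Python and uses the order-sensitive pattern from the answer above; `matched_parts` is a hypothetical helper name, not something from the thread.

```python
import re

# Order-sensitive template: each keyword sits in its own optional
# non-capturing group, so a missing keyword leaves its group as None.
TEMPLATE = re.compile(
    r"^(?:.*(\bMyCompany1\b))?"
    r"(?:.*?(\bestablishes\b))?"
    r"(?:.*(\bAnotherCompany\b))?.*$"
)

def matched_parts(sentence):
    """Return how many of the template's keyword groups were found."""
    m = TEMPLATE.match(sentence)
    if m is None:
        return 0
    return sum(group is not None for group in m.groups())
```

Comparing `matched_parts` across several compiled templates would then give a rough "best template" ranking for a sentence.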

URL matching regex

I need some help with URL matching regex. I read the regex syntax documentation but it's so complex.
I'm trying to create a URL list for a checkout funnel, how would I set up regex for the following?
https://shop.mysite.ca/[unique ID]/checkouts/[unique ID 2]
OR
https://shop.mysite.ca/[unique ID]/checkouts/[unique ID 2]?step=contact_information
Here is what I have so far, though I'm not sure how to add the optional parameter "step=contact_information":
/^(https:\/\/shop.mysite.ca\/)([\da-z]+)(\/checkouts\/)([\da-z]+)$/
You can use a "?" to make a group either appear 0 or 1 times, making it optional.
/^(https:\/\/shop.mysite.ca\/)([\da-z]+)(\/checkouts\/)([\da-z]+)(\?step=contact_information)?$/
This should work:
^(https:\/\/shop.mysite.ca\/)([\da-z]+)(\/checkouts\/)([\da-z]+)((\?step=contact_information)*)$
Edit: I forgot the ? and used * instead. The other solution by #thomas is a bit better, I think.
Try it:
^(https:\/\/shop.mysite.ca\/)(\d+.)(\/checkouts)(\/)(\d+.)($|\?\w.+$)
If the unique ID contains non-numeric characters:
^(https:\/\/shop.mysite.ca\/)([\da-zA-Z]+.)(\/checkouts)(\/)([\da-zA-Z]+.)($|\?\w.+$)
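A short Python sketch of the optional-group approach, assuming the IDs consist of lowercase letters and digits as in the asker's pattern. Here the dots in the hostname are escaped so they match only a literal dot, and `is_checkout_url` and the sample IDs are made up for illustration.

```python
import re

# The trailing group is optional, so the URL matches with or without
# the "?step=contact_information" suffix, but not with any other query.
CHECKOUT_URL = re.compile(
    r"https://shop\.mysite\.ca/([\da-z]+)/checkouts/([\da-z]+)"
    r"(\?step=contact_information)?"
)

def is_checkout_url(url):
    # fullmatch anchors at both ends, like ^...$ in the original pattern.
    return CHECKOUT_URL.fullmatch(url) is not None
```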

Regular expression to match question mark except repeated or commented(--)

I would like to build a regular expression in C# that matches question marks, except repeated or commented-out ones.
For example, if I have the string below
--???
??
asdlfkj --?
asldfjl -?
aslfldkf --?
aslfkvlv --??
?
-?
dklsafdlafjd = ?
I want to match as shown below (matches are marked between * characters).
--???
??
asdlfkj --?
asldfjl -*?*
aslfldkf --?
aslfkvlv --??
*?*
-*?*
dklsafdlafjd = *?*
I'm developing a SQL binding method that takes 2 parameters.
The first one is the SQL, for example
select * from atable where id = ?.
The SQL can contain comments, so I want to ignore them.
The second one is an array of parameters for the SQL, to be matched sequentially.
Does anyone have a good idea for this?
If you can negate this regex it should work for you:
(\?{2,}|(?<=--)\?)
I don't know what language you're working in, but you should be able to filter by line. Apply this regex as a predicate and either negate it or use an exclude function.
I'll leave those implementation details up to you.
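One way to get the same effect without negating a match is to cut each line at the first `--` and then look for question marks that have no neighbouring question mark. The sketch below is in Python rather than C#, and rests on two assumptions: comments are only of the `--` line kind, and `--` never appears inside a string literal. `placeholder_positions` is a hypothetical name.

```python
import re

# A "?" that is neither preceded nor followed by another "?".
SINGLE_QMARK = re.compile(r"(?<!\?)\?(?!\?)")

def placeholder_positions(sql):
    """Return (line_number, column) pairs of bindable "?" placeholders.

    Line numbers are 1-based, columns are 0-based.
    """
    positions = []
    for line_no, line in enumerate(sql.splitlines(), start=1):
        code = line.split("--", 1)[0]  # drop a trailing "--" comment
        for m in SINGLE_QMARK.finditer(code):
            positions.append((line_no, m.start()))
    return positions
```

Both lookarounds are fixed-width, so the same pattern should carry over to .NET's regex engine unchanged.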

URL regex group catching

Hello, I'm trying to find a regex that captures the terms in a URL.
For example, given:
https://stackoverflow.com, it would catch "stackoverflow"
and given https://stackoverflow.com/questions/ask, it would catch "stackoverflow", "questions", "ask" and any potential terms in between the slash character after the domain name.
So far I have come up with the following regex, but it cannot repeat capturing groups:
https?:\/\/(?:www\.)?([\da-z-]*)(?:[\.a-z]*)(?:\/([\da-z]*)\/?)+
Do you have any way to resolve this issue? That would be great.
I tested Michal M's answer and it appears not to handle "www.", so I updated it:
/(?:\/(?:w{3}\.)?)\K([\w]+)/i
Edit: Since it's not important to match the "www.", I placed it inside a non-capturing group so it won't be captured. I also added the case-insensitive modifier so "WWW." is accepted too.
Try this one:
(?:(\/))\K(\w+)
Tested in Notepad++.
You may try using two separate regexes -- one for the hostname part and another for the terms in the path part. Then combine them with alternation construction and do global search:
https?:\/\/(?:\w+\.)*(\w+)\.\w+ # this would capture hostname "term"
|
\/(\w+) # this would capture path "terms"
(Note: requires /x modifier.)
Demo: https://regex101.com/r/nA8jT9/2
Thanks, I managed to rearrange it so that it works with the "www":
(?:\/(?:www\.)?)\K([\w\d]+)
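The `\K` patterns above rely on PCRE-style engines; Python's built-in `re` module does not support `\K`. As a rough Python sketch, the two-alternative approach from the earlier answer can be combined with `finditer`, which sidesteps the fact that a repeated capturing group keeps only its last match. `url_terms` is a hypothetical helper, and `\w` does not cover hyphens, so hyphenated hostnames would need a wider character class.

```python
import re

TERMS = re.compile(
    r"""
    https?://(?:\w+\.)*(\w+)\.\w+   # group 1: hostname term before the TLD
    |
    /(\w+)                          # group 2: one path term
    """,
    re.VERBOSE | re.IGNORECASE,
)

def url_terms(url):
    """Collect the hostname term followed by every path term."""
    return [m.group(1) or m.group(2) for m in TERMS.finditer(url)]
```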

Regex string with optional suffix in Reqtify

I try to filter the following lines in Reqtify:
Li.success tc_BT_Cancel_From_Pause_State_a4_2016-01-14_16h40m16s.log
Li.success tc_BT_Cancel_From_Pause_State_a5_2016-01-14_16h40m23s.log
Li.success tc_BT_Cancel_Init_BtControlStop_2016-01-14_16h40m23s.log
With a first regex, ^Li.\w+\stc_(\w+)_20, I manage to extract
BT_Cancel_From_Pause_State_a4
BT_Cancel_From_Pause_State_a5
BT_Cancel_Init_BtControlStop
But my goal is to strip off the _a* suffix.
I already tried (.+)(_a\d)? in an additional expression, but the result is unchanged.
Same for (.+)(_a\d|) .
Does anyone have an idea how to strip off this optional part?
The final list should be:
BT_Cancel_From_Pause_State
BT_Cancel_From_Pause_State
BT_Cancel_Init_BtControlStop
Thanks,
Chris
This should do the trick:
^Li.\w+\stc_(\w+?)(_a\d)?_20
Group 1 is still the extracted name, now without the suffix; group 2 will be the optional _aX.
If you don't want to capture the second part, you can change it this way:
^Li.\w+\stc_(\w+?)(?:_a\d)?_20
I propose this:
^Li.\w+\stc_(\w+?)(?:_a\d)?_20
With Live Demo