How to build this regular expression? - regex

Sample: AAAATGCCCTAAGGGATGTTTTAGAAA
I want to capture all substrings that meet these criteria:
Start: ATG
Followed by 3x characters from the set: A, C, G or T
End: TAA or TAG or TGA
Such as: ATGCCCTAA, ATGTTTTAG
I have this regular expression: /[ACGT]*((ATG)(([ACGT]){3})+(TAA|TAG|TGA))[ACGT]*/g, but it only matches the last one, ATGTTTTAG, and not ATGCCCTAA. I don't know why.
Please help me write a pattern that matches both ATGCCCTAA and ATGTTTTAG.
Here is online example:
https://regex101.com/r/iO8lF9/1

This regex works well /(ATG(?:A{3}|C{3}|G{3}|T{3})(?:TAA|TAG|TGA))/g
as you can see here: https://www.regex101.com/r/gZ0zA9/2
I hope it helps
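A minimal Python sketch of why the original pattern finds only the last candidate, and how the pattern above behaves on the sample. The variable names are illustrative only. The explanation in the comments is my reading of the original pattern: its leading [ACGT]* is greedy, so it swallows the first candidate before the group gets a chance to match.

```python
import re

sample = "AAAATGCCCTAAGGGATGTTTTAGAAA"

# Original pattern from the question: the leading greedy [ACGT]* consumes as
# much as it can, so one match covers the whole string and group 1 ends up
# holding only the last ATG...stop candidate.
original = re.compile(r"[ACGT]*((ATG)(([ACGT]){3})+(TAA|TAG|TGA))[ACGT]*")
original_hits = [m.group(1) for m in original.finditer(sample)]

# Pattern from the answer above, with no [ACGT]* wrappers around it.
# All groups are non-capturing, so findall returns the full matches.
fixed = re.compile(r"ATG(?:A{3}|C{3}|G{3}|T{3})(?:TAA|TAG|TGA)")
fixed_hits = fixed.findall(sample)

print(original_hits)  # ['ATGTTTTAG']
print(fixed_hits)     # ['ATGCCCTAA', 'ATGTTTTAG']
```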

Using a back-reference, you can shorten your regex to this:
ATG([AGCT])\1{2}(?:TGA|TA[AG])
RegEx Demo
It matches [AGCT] after ATG and stores it as capture group #1. Next we match \1{2} to make sure the same letter is repeated 3 times in total.
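As a rough illustration in Python (variable names are my own), the back-reference pattern can be run with finditer. Because the pattern contains a capture group, findall would return only the captured letter, so the sketch reads group(0) for the full match.

```python
import re

sample = "AAAATGCCCTAAGGGATGTTTTAGAAA"

# ([AGCT]) captures one letter; \1{2} requires that same letter twice more.
pattern = re.compile(r"ATG([AGCT])\1{2}(?:TGA|TA[AG])")

# group(0) is the whole match, group(1) is the repeated letter.
matches = [m.group(0) for m in pattern.finditer(sample)]
repeated = [m.group(1) for m in pattern.finditer(sample)]

print(matches)   # ['ATGCCCTAA', 'ATGTTTTAG']
print(repeated)  # ['C', 'T']
```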

try...
^ATG[AGCT]{3}(TAA|TAG|TGA)$

I used this pattern and it works. Thank you all for helping me.
/(ATG(?:A{3}|C{3}|G{3}|T{3})(?:TAA|TAG|TGA))/g

Related

Need regex help for matching names

Let's say I have these three names
John Doe (p45643)
Le'anne Frank
Molly-Mae Edwards
I want to match
1) John Doe
2) Le'anne Frank
3) Molly-Mae Edwards
The regex I have tried is
(^[a-zA-Z-'^\d]$)+
but it isn't working as I am expecting.
I would like help creating a regex pattern that:
Matches a name from start to finish, and cannot contain a number. The only characters each "name" may contain are [a-zA-Z'-], so a name such as J0hn shouldn't match.
If I understood your question correctly, then you have a few minor errors in your regex:
(^[a-zA-Z-'^\d]$)+
^-------^------Here
The - pointed above should be escaped or moved to the end since it works as a range character. The + is marking the group as repeated.
You can use this regex instead (following your previous pattern):
(^[a-zA-Z'^\d -]+$)
Regex demo
Update: for your comment. If you want to match separately, then you can use:
(\b[a-zA-Z'^\d-]+\b)
Regex demo
And if you only want to match names made of letters (no digits), then you can use:
(\b[a-zA-Z'-]+\b)
Regex demo
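A small Python sketch of the last pattern, using the names from the question plus J0hn. The helper dictionary is only for illustration. It relies on \b treating digits as word characters, which is why the fragments of p45643 and J0hn are rejected.

```python
import re

# Letters, apostrophes and hyphens only, delimited by word boundaries.
word = re.compile(r"\b[a-zA-Z'-]+\b")

names = ["John Doe (p45643)", "Le'anne Frank", "Molly-Mae Edwards", "J0hn"]

# Map each input to the list of name parts the pattern finds in it.
parts = {name: word.findall(name) for name in names}

for name, found in parts.items():
    print(name, "->", " ".join(found))
```

With these inputs the first three entries reduce to John Doe, Le'anne Frank and Molly-Mae Edwards, and J0hn yields no match at all.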
You are using the anchors incorrectly. Based on the modifier it can match the whole string or a single line.
Try
/^[a-zA-Z'-]+$/
Thanks to #Djory Krache
The query I was looking for was
(\b[a-zA-Z'-]+\b)

how to make a regular expression for this?

I want to apply a regular expression to the string "{{c1::tiger}} is a kind of {{c2::animal::something movable}}" to get the words "tiger" and "animal". I wrote the expression \{\{c\d+::((?P<value>.*?)(:{0,2})(.*?))\}\} and I want to use group('value') to retrieve them. The result "tiger" is exactly what I need, but for the second item I always get "animal::something movable" when I want "animal". Could anyone help me solve this problem? Thanks.
The pattern that you tried contains 4 capturing groups, and for the current example data group 2 (value) and group 3 are empty.
To get tiger and animal you could use a single capturing group with a lazy quantifier, followed by an alternation that stops at either :: or }}:
\{\{c\d+::(?P<value>.*?)(?:::|}})
Regex demo
If the closing }} have to be present, you could use:
\{\{c\d+::(?P<value>.*?)(?:::.*)?}}
Regex demo
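For illustration, here is how the second pattern could be used from Python with group('value'), as the question asks. This is a sketch against the example string only. The greedy .* inside the optional group happens to be safe here because the field that contains :: is the last one on the line.

```python
import re

text = "{{c1::tiger}} is a kind of {{c2::animal::something movable}}"

# value is matched lazily; an optional "::..." tail is skipped before the closing }}.
pattern = re.compile(r"\{\{c\d+::(?P<value>.*?)(?:::.*)?}}")

values = [m.group("value") for m in pattern.finditer(text)]

print(values)  # ['tiger', 'animal']
```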
This would work for your example String:
c1::(?P<valueTiger>[a-z]*).*?c2::(?P<valueAnimal>[a-z]*)
Regex101
\{\{c\d+::(?P<value>[^:}]+)(?::{0,2}([^}]+))?\}\}
Demo

Regex Pattern to extract url links from two string

I have two strings that contain shortened URLs. I want a regex pattern to extract them.
https://l.facebook.com/l.php?u=http%3A%2F%2Febay.to%2F2EyH7Nq&h=ATNHM5kACc4rh_z68Ytw__cNCzJ63_iPezd_whc0PjcsN4qj1PfdJgFXyrOKM3-biqPm7eAXTOc5LD6r-7JXhRsqsqEHUs0jaJkjvm_QWf6sqbHQmS63q6P0_NcQoUa86Az5EttNT9xJb_evKBaiFxW7e7v2afJQn2zNxz5lQ8xgxhMcEFuJ3hUiSYUMEemKFB2LSIgAZFibRv4GeRrTk8hxFaArkBuAhQaXQFd4jX-aQuUYhjD0ErV5FY-D4gFMpb0lFCU7SyBlRpkUuOcHVjwjxN-_g6reMYwo8loAJnJD
/redirect?q=http%3A%2F%2Fgoo.gl%2FIW7ct&redir_token=PV5sR8F7GuXT9PgPO_nkBFLABQx8MTUxNjA3OTY5MEAxNTE1OTkzMjkw&v=7wmIyD1fM4M&event=video_description
The expected output from the first and second links is:
http%3A%2F%2Febay.to%2F2EyH7Nq
http%3A%2F%2Fgoo.gl%2FIW7ct
Please help me out.
I have already tried:
(http|https).*?&
but it's not working on the first URL.
You can try this:
=(https?[^&]*)
Demo
If lookbehind is supported in your flavour of regex, you may also try this, which ensures that the equals sign is not part of the match:
(?<==)(https?[^&]*)
Demo 2
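A minimal Python sketch of the capture-group version. The first input is shortened after the h= parameter for readability, which is my own truncation and not the full string from the question. The unquote step is an optional extra that decodes the percent-encoding.

```python
import re
from urllib.parse import unquote

# First link is truncated after "h=" for brevity (assumption for this sketch).
links = [
    "https://l.facebook.com/l.php?u=http%3A%2F%2Febay.to%2F2EyH7Nq&h=ATNHM5kACc4rh_z68Ytw",
    "/redirect?q=http%3A%2F%2Fgoo.gl%2FIW7ct&redir_token=PV5sR8F7GuXT9PgPO_nkBFLABQx8MTUxNjA3OTY5MEAxNTE1OTkzMjkw&v=7wmIyD1fM4M&event=video_description",
]

# An "=" followed by http or https, then everything up to the next "&".
pattern = re.compile(r"=(https?[^&]*)")

encoded = [pattern.search(link).group(1) for link in links]
decoded = [unquote(url) for url in encoded]

print(encoded)  # ['http%3A%2F%2Febay.to%2F2EyH7Nq', 'http%3A%2F%2Fgoo.gl%2FIW7ct']
print(decoded)  # ['http://ebay.to/2EyH7Nq', 'http://goo.gl/IW7ct']
```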
Try this regex.
I have also attached the output of the regex from regex101.
http%3A%2F%2F(.*)%2F(.*[^&])(?=&)
You can use this pattern to only capture goo.gl and ebay.to links:
(http%3A%2F%2F(ebay\.to|goo\.gl)%2F[^&]*)&
Demo

Regex match last occurrence between 2 strings

I have a string like this:
abcabcdeabc...STRING INSIDE...xyz
I want to find "...STRING INSIDE..." so I'm using the regex below to match it:
(?<=abc).*(?=xyz)
The issue is there are duplicated "abc" in the string so it returns "abcdeabc...STRING INSIDE..." but I only want to match anything between the last "abc" and "xyz". Is this possible? And if yes, how can I achieve this? Thank you.
Try it here:
https://regex101.com/r/gS9Xso/3
Try this pattern:
.*(?<=abc)(.*)(?=xyz)
The leading .* will consume everything up until the last abc, then the text between it and xyz will be captured.
Demo
We can also try using the following pattern:
.*abc(.*?)xyz
Here is a demo for the second pattern:
Demo
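A short Python comparison of the lookaround pattern from the question with the second pattern above, run on the sample string. Variable names are illustrative only.

```python
import re

s = "abcabcdeabc...STRING INSIDE...xyz"

# Question's pattern: the lookbehind is satisfied right after the FIRST abc,
# so the match starts there and runs up to xyz.
too_long = re.search(r"(?<=abc).*(?=xyz)", s).group(0)

# A greedy leading .* backtracks only as far as the LAST abc,
# and the lazy group then captures up to xyz.
wanted = re.search(r".*abc(.*?)xyz", s).group(1)

print(too_long)  # abcdeabc...STRING INSIDE...
print(wanted)    # ...STRING INSIDE...
```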
This should work well.
[^\d]*abc(\d+)xyz[^\d]*
See it on Debuggex

Find first point with regex

I want a regex that returns only the characters before the first dot (period).
Ex :
T420_02.DOMAIN.LOCAL
I want only T420_02
Please help me.
You can use the following regex: ^(.*?)(?=\.)
The captured group contains what you need (T420_02 in your example).
This simple expression should do what you need, assuming you want to match it at the beginning of the string:
^(.+?)\.
The capture group contains the string before (but not including) the dot (.).
Here's a fiddle: http://www.rexfiddle.net/s8l0bn3
Use regex pattern ^[^.]+(?=[.])
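The three suggestions can be checked side by side in Python. This is only a sketch on the example hostname. Note that the first two patterns return the result in capture group 1, while the last one returns it as the whole match.

```python
import re

host = "T420_02.DOMAIN.LOCAL"

# Lazy capture followed by a lookahead for a literal dot.
a = re.match(r"^(.*?)(?=\.)", host).group(1)

# Lazy capture followed by a consumed literal dot.
b = re.match(r"^(.+?)\.", host).group(1)

# Negated character class: everything that is not a dot, then a dot must follow.
c = re.match(r"^[^.]+(?=[.])", host).group(0)

print(a, b, c)  # T420_02 T420_02 T420_02
```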