Regular Expression Pattern After Each x characters - regex

I need a Regex to match the following:
After each 6 characters of a string there is a ';'
Examples:
aaaaaa;z5z5z5;zdzzzt; (Valid)
aaadzdaaa;z5z5dzdzz5;zdzdzd; (Not Valid)
I've tried:
(([A-Za-z0-9]{6};$))
but it only validates according to the last sequence.

You should use
^(?:[A-Za-z0-9]{6};)*$
If there must be at least one sequence with a semi-colon, replace * with + quantifier:
^(?:[A-Za-z0-9]{6};)+$
You actually need both ^ start-of-string anchor and $ end-of-string anchor, and you should not have placed the $ anchor into the repeated group since there is only one end of string.
Here is the regex breakdown:
^ - start of string
(?:[A-Za-z0-9]{6};)* - 0 or more sequences of...
[A-Za-z0-9]{6} - exactly 6 ASCII letters or digits
; - a semi-colon
$ - end of string.
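As a quick check, the pattern can be run against the two examples from the question. This is a minimal Python sketch, not part of the original answer; the helper name is_valid is made up for illustration, and re.fullmatch is used so that the ^ and $ anchors are implied.

```python
import re

# Same group as in the answer; fullmatch() anchors at both ends,
# so the explicit ^ and $ are not needed here.
PATTERN = re.compile(r"(?:[A-Za-z0-9]{6};)+")

def is_valid(s: str) -> bool:
    """Return True when s is made only of 6-character groups, each followed by ';'."""
    return PATTERN.fullmatch(s) is not None

print(is_valid("aaaaaa;z5z5z5;zdzzzt;"))         # True
print(is_valid("aaadzdaaa;z5z5dzdzz5;zdzdzd;"))  # False
```

With the + quantifier an empty string is rejected; switching to * would accept it.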

I would use:
^(?:\w{6};)*$
With:
^ assert position at start of a line
(?:\w{6};)* Non-capturing group
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\w{6} match any word character [a-zA-Z0-9_]
Quantifier: {6} Exactly 6 times
; matches the character ; literally
$ assert position at end of a line
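One practical difference between \w and [A-Za-z0-9] is the underscore. The short Python sketch below compares the two patterns on a made-up sample string; note that in Python 3 \w also matches non-ASCII letters unless the re.ASCII flag is passed, which may or may not be what you want.

```python
import re

strict = re.compile(r"^(?:[A-Za-z0-9]{6};)*$")  # letters and digits only
loose = re.compile(r"^(?:\w{6};)*$")            # \w also allows '_'

sample = "ab_cde;"  # hypothetical input containing an underscore

print(bool(strict.search(sample)))  # False
print(bool(loose.search(sample)))   # True
```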

Related

RegExp - find 1,2,3,6,7,8 and 9th letter from the end of the string

I'm new to regular expressions and trying to figure out which expression would match 1,2,3 and 6,7,8,9th letter in the string, starting from the end of the string. It would also need to include \D (for non-digits), so if 3rd letter from the end is a number it will exclude it.
Example of a string is
Wsd-kaf_23psd_trees32rap
So the result should be:
reesrap
or for
Wsd-kaf_23psd_trees324ap
it would be
reesap
This
(?<=^.{9}).*
gives me last 9 chars, but that's not really what I want.
Does anyone know how I can do that?
Thanks.
You could use an alternation to match either all characters up to the position that is followed by exactly 9 characters, or any run of consecutive digits:
(?:^.*(?=.{9})|\d+)
Replace the matches with an empty string.
(?: - Open non-capture group;
^.* - Any 0+ characters (greedy), up to;
(?=.{9}) - A positive lookahead to assert position is followed by 9 characters;
| - Or;
\d+ - 1+ digits.
If, however, your intention was to match the characters separately, then try:
\D(?=.{0,8}$)
This matches any non-digit that is followed by 0 to 8 characters before the end of the line.
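Both approaches can be tried on the two strings from the question. The Python sketch below is only an illustration; the function names are invented here, and the first function removes the unwanted parts while the second collects the wanted characters one by one.

```python
import re

def strip_to_tail(s: str) -> str:
    # Remove everything before the last 9 characters, then remove digit runs.
    return re.sub(r"(?:^.*(?=.{9})|\d+)", "", s)

def collect_tail(s: str) -> str:
    # Find each non-digit with at most 8 characters after it, then join them.
    return "".join(re.findall(r"\D(?=.{0,8}$)", s))

for text in ["Wsd-kaf_23psd_trees32rap", "Wsd-kaf_23psd_trees324ap"]:
    print(strip_to_tail(text), collect_tail(text))
# reesrap reesrap
# reesap reesap
```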

REGEX: Exact match a word followed by 1 space followed by digits

I want to exact match a string followed by exactly one space and then numbers.
I.e. input would be Acc 1234, should evaluate to true.
Here is my regex that I've tried:
[^(\\WAcc\\W)[ ]{1}\\d]
This, however, fails for inputs such as AccA1234, Acc 1234 ACA, or Acc1234. How do I get my regex to match my input (Acc 1234) exactly?
\W is a non-word character. I'm guessing you were looking for \b, the word boundary matcher, instead.
I'd try this regex:
^\bAcc\b \d+$
^[^\s]+\s\d+$
^ - asserts beginning of line
[^\s]+ - match anything other than space, one or more characters
\s - match a single whitespace character (a space, but also a tab or similar)
\d+ - one or more digits
$ - marks the end of line
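The two patterns above are not equivalent: the first requires the literal word Acc, while the second accepts any run of non-whitespace characters before the number. A small Python sketch (the variable names and the extra input "Foo 1234" are made up for illustration) shows the difference on the inputs from the question.

```python
import re

literal = re.compile(r"^\bAcc\b \d+$")  # requires the literal word "Acc"
generic = re.compile(r"^[^\s]+\s\d+$")  # any non-whitespace run, one whitespace, digits

inputs = ["Acc 1234", "AccA1234", "Acc 1234 ACA", "Acc1234", "Foo 1234"]

for text in inputs:
    print(f"{text!r}: literal={bool(literal.search(text))}, "
          f"generic={bool(generic.search(text))}")
```

Only "Acc 1234" satisfies the first pattern; the second pattern additionally accepts "Foo 1234", so it is the looser of the two.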

Regular expression not working

I am trying to create a regular expression in javascript with the following rules:
At least 2 characters.
Should have at least 1 letter as a prefix, and then either end with a . or have a space or - followed by more letters.
The following strings should be legal - aa, aaaaa, a., a-a, a a.
These should not be legal - a (too short), aa.aa. (two dots), aa- (after - should be another letter).
I don't know what I'm doing wrong here but my regex doesn't seem to work, as it is legal yet no word matches it:
(?=^.{2,}$)^(([a-z][A-Z])+([.]|[ -][a-zA-Z]+){0,1}$)
Had to re-write it completely to cover op's comment. The new regex would be:
^[a-zA-Z][a-zA-Z]*[ -][a-zA-Z]*[a-zA-Z]$|^[a-zA-Z][a-zA-Z]*([a-zA-Z]|\.)$
Explanation
1st Alternative ^[a-zA-Z][a-zA-Z]*[ -][a-zA-Z]*[a-zA-Z]$
^ asserts position at start of a line
[a-zA-Z] Match a single character present in [a-zA-Z]
[a-zA-Z]* Quantifier *: matches between zero and unlimited times (greedy)
[ -] Match a single character - or a space
$ asserts position at the end of a line
2nd Alternative
^[a-zA-Z][a-zA-Z]*([a-zA-Z]|\.)$
^ asserts position at start of a line
[a-zA-Z] Match a single character present in [a-zA-Z]
[a-zA-Z]* Quantifier *: matches between zero and unlimited times (greedy)
([a-zA-Z]|\.) Match a single character: either a letter in [a-zA-Z] or a literal dot
$ asserts position at the end of a line
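The rewritten regex can be checked against the legal and illegal strings listed in the question. The question is about JavaScript, but the pattern uses only syntax that Python's re module shares, so this sketch uses Python purely as a test harness; the list names are invented for illustration.

```python
import re

PATTERN = re.compile(
    r"^[a-zA-Z][a-zA-Z]*[ -][a-zA-Z]*[a-zA-Z]$"   # letters, one space or '-', letters
    r"|^[a-zA-Z][a-zA-Z]*([a-zA-Z]|\.)$"          # letters, ending in a letter or '.'
)

legal = ["aa", "aaaaa", "a.", "a-a", "a a"]
illegal = ["a", "aa.aa.", "aa-"]

for s in legal + illegal:
    print(f"{s!r}: {bool(PATTERN.search(s))}")
```

All five legal strings match and all three illegal strings are rejected.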

Regex (Do not include digit.digit)

This is a sample text i'm running my regex on:
DuraFlexHose Water 1/2" hex 300mm 30.00
I want to include everything and stop at the 30.00
So what I have in mind is something like [^\d*\.\d*]* but that's not working. What is the query that would help me achieve this?
/.*(?=\d{2}\.\d{2})/
.* matches any character (except newline)
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?=\d{2}\.\d{2}) Positive Lookahead - Assert that the regex below can be matched
\d{2} match a digit [0-9]
Quantifier: {2} Exactly 2 times
\. matches the character . literally
\d{2} match a digit [0-9]
Quantifier: {2} Exactly 2 times
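A short Python sketch of the lookahead pattern on the sample text follows. It also shows a caveat: because .* is greedy and the lookahead only requires two digits before the dot, a price with three or more integer digits (a made-up variation of the sample) leaves the leading digit inside the match.

```python
import re

lookahead = re.compile(r".*(?=\d{2}\.\d{2})")

sample = 'DuraFlexHose Water 1/2" hex 300mm 30.00'
longer = 'DuraFlexHose Water 1/2" hex 300mm 300.00'  # hypothetical variation

print(repr(lookahead.match(sample).group()))
# 'DuraFlexHose Water 1/2" hex 300mm '
print(repr(lookahead.match(longer).group()))
# 'DuraFlexHose Water 1/2" hex 300mm 3'
```

Note that the match keeps the trailing space before the price, so a strip() may be needed afterwards.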
If you cannot use any CSV parser and are only limited to regex, I'd suggest 2 regexps.
This one can be used to grab every character from the beginning up to the first pattern of optional spaces + digits(s) + . + digit(s):
^([\s\S]*?)\s*\d+\.\d+
In case the float value is at the end of string, use a $ anchor (the end of string):
^([\s\S]*?)\s*\d+\.\d+$
Note that [\s\S] matches any symbol, even a newline.
Regex breakdown:
^ - Start of string
([\s\S]*?) - (Capture group 1) any symbols, 0 or more, as few as possible (otherwise, the 3 from 30.45 will be captured)
\s* - 0 or more whitespace, as many as possible (so as to trim Group 1)
\d+\.\d+ - 1 or more digits followed by a period followed by 1 or more digits
$ - end of string.
If you plan to match any floats, like -.05, you will need to replace \d+\.\d+ with [+-]?\d*\.?\d+.
Here is how it can be used:
var str = 'DuraFlexHose Water 1/2" hex 300mm 300.00';
var res = str.match(/^([\s\S]*?)\s*\d+\.\d+/);
if (res !== null) {
  document.write(res[1]);
}

How can I detect last digits in python string

I need to detect the last digits in a string, as they are indexes for my strings. They may be as large as 2^64, so it's not convenient to check only the last character of the string, then the second-to-last, and so on.
The string may look like asdgaf1_hsg534, i.e. there may be other digits in the string too, but they are somewhere in the middle and are not adjacent to the index I want to get.
Here is a method using re.sub:
import re
strings = ['asdgaf1_hsg534', 'asdfh23_hsjd12', 'dgshg_jhfsd86']
for s in strings:
    print(re.sub(r'.*?([0-9]*)$', r'\1', s))
Output:
534
12
86
Explanation:
The function takes a regular expression, a replacement string, and the string you want to do the replacement on: re.sub(regex,replace,string)
The regex '.*?([0-9]*)$' matches the whole string and captures the number that precedes the end of the string. Parentheses are used to capture the parts of the match we are interested in; \1 refers to the first capture group, \2 to the second, etc.
.*? # Matches anything (non-greedy)
([0-9]*) # Up to a run of zero or more digits (captured)
$ # Followed by the end-of-string identifier
So we are replacing the whole string with just the captured number we are interested in. In Python we need to use raw strings for this: r'\1'. If the string doesn't end with digits then a blank string will be returned.
twosixfour = "get_the_numb3r_2_^_64__18446744073709551615"
print(re.sub(r'.*?([0-9]*)$', r'\1', twosixfour))
# Output: 18446744073709551615
A simple regex can detect digits at the end of the string:
'\d+$'
$ matches the end of the string. \d+ matches one or more digits. The + operator is greedy by default, meaning it matches as many digits as possible. So this will match all of the digits at the end of the string.
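With re.search this pattern can be used directly, without any replacement step. The sketch below is a minimal illustration; the helper name trailing_index is made up, and converting to int is an assumption about how the index will be used (Python integers are arbitrary precision, so values around 2^64 are not a problem).

```python
import re

def trailing_index(s: str):
    """Return the digits at the end of s as an int, or None if there are none."""
    m = re.search(r"\d+$", s)
    return int(m.group()) if m else None

print(trailing_index("asdgaf1_hsg534"))                               # 534
print(trailing_index("get_the_numb3r_2_^_64__18446744073709551615"))  # 18446744073709551615
print(trailing_index("test"))                                         # None
```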
If you want to use re.sub and require at least one digit at the end of the line, use the + quantifier (\d+ matches 1 or more digits). That way the line is left unchanged, rather than replaced with an empty string, when it contains no digits or does not end with digits.
^.*?(\d+)$
^ Start of line
.*? Match any char except a newline, as few times as possible (non-greedy)
(\d+) Capture group 1, match 1+ digits
$ End of line
Or using a negative lookbehind
^.*(?<!\d)(\d+)$
^ Start of line
.* Match any char except a newline as much as possible
(?<!\d)(\d+) Assert no digits directly to the left, then capture 1+ digits in group 1
$ End of line
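The lookbehind matters because a greedy .* would otherwise consume all but the last digit. The following Python sketch compares the pattern with and without the lookbehind on one of the sample strings; the variable names are only illustrative.

```python
import re

s = "asdgaf1_hsg534"

# Without the lookbehind, the greedy .* leaves only one digit for the group.
greedy_only = re.match(r"^.*(\d+)$", s)
# With the lookbehind, the group must start right after a non-digit.
with_lookbehind = re.match(r"^.*(?<!\d)(\d+)$", s)

print(greedy_only.group(1))      # 4
print(with_lookbehind.group(1))  # 534
```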
When using re.match, you can omit the ^ anchor, and you might also use \A and \Z to assert the start and the end of the string.
import re
strings = ['asdgaf1_hsg534', 'asdfh23_hsjd12', 'dgshg_jhfsd86', 'test']
for s in strings:
    print(re.sub(r".*?(\d+)$", r'\1', s))
Output
534
12
86
test
If a non-digit must be present directly before the digits, you could use a negated character class with a single capture group.
^.*[^\d\r\n](\d+)
^ Start of line
.* Match any char except a newline as much as possible
[^\d\r\n] Negated character class, match any char except a digit or a newline
(\d+) Capture group 1, match 1+ digits
To get the last digits in the string (not necessarily at the end of the string)
^.*?(\d+)[^\r\n\d]*$
^ Start of line
.*? Match any char except a newline, as few times as possible (non-greedy)
(\d+) Capture group 1, match 1+ digits
[^\r\n\d]* Negated character class, match 0+ times any char except a newline or digit
$ End of line
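To illustrate this last pattern, here is a small Python sketch that extracts the final run of digits even when other characters follow it. The helper name last_number and the first and third sample strings are made up for the example.

```python
import re

LAST_DIGITS = re.compile(r"^.*?(\d+)[^\r\n\d]*$")

def last_number(s: str):
    """Return the last run of digits in s as a string, or None if s has no digits."""
    m = LAST_DIGITS.search(s)
    return m.group(1) if m else None

print(last_number("abc123def456ghi"))  # 456
print(last_number("asdgaf1_hsg534"))   # 534
print(last_number("no digits here"))   # None
```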