Replace code using Regex on Visual Studio Code [duplicate] - regex

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
I have a huge file and need to replace some strings. The problem is that they are dynamic, but they always follow a pattern:
year[4 digits number]/month[2 digits number]/timestamp[8 digits number]/file[random string ending with extension]
Some examples:
2017/07/24204301/a-4.png
2017/07/24204318/a-5-e1501986401369.png
2017/11/24211223/questao10branca-172x300.png
I need to remove the timestamp from all occurrences, so the examples above would become:
2017/07/a-4.png
2017/07/a-5-e1501986401369.png
2017/11/questao10branca-172x300.png
How can I achieve this using Regexp and Visual Studio Code?

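Given the examples you presented, there are a couple of regular expressions that will work for you. In both cases, enable regular expression mode in the Find and Replace widget and leave the replacement field empty.
Option 1: match the timestamp together with its leading slash.
/\d{8}(?=/)
/ Match this literally
\d{8} Match any digit exactly 8 times
(?=/) Positive lookahead ensuring what follows is a literal /
Option 2: match the timestamp together with its trailing slash, only when it follows year/month at the start of a line.
(?<=^\d{4}/\d{2}/)\d{8}/
(?<=^\d{4}/\d{2}/) Positive lookbehind ensuring what precedes is the following:
^ Assert position at the start of the line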
\d{4} Match any digit exactly 4 times
/ Match this literally
\d{2} Match any digit exactly twice
/ Match this literally
\d{8} Match any digit exactly 8 times
/ Match this literally
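To check both patterns outside the editor, here is a minimal Python sketch that applies them to the three example paths from the question. The names LOOKAHEAD, LOOKBEHIND and strip_timestamp are made up for this illustration, and Python's re module stands in for the editor's regex engine, so treat it as a sanity check rather than a statement about how Visual Studio Code behaves.

```python
import re

# The two patterns from the answer, without the surrounding delimiters.
# (Names are hypothetical, chosen for this sketch only.)
LOOKAHEAD = re.compile(r"/\d{8}(?=/)")
# MULTILINE makes ^ match at the start of every line, as it does in an editor.
LOOKBEHIND = re.compile(r"(?<=^\d{4}/\d{2}/)\d{8}/", re.MULTILINE)

paths = [
    "2017/07/24204301/a-4.png",
    "2017/07/24204318/a-5-e1501986401369.png",
    "2017/11/24211223/questao10branca-172x300.png",
]


def strip_timestamp(pattern, text):
    """Replace every match of pattern with the empty string."""
    return pattern.sub("", text)


with_lookahead = [strip_timestamp(LOOKAHEAD, p) for p in paths]
with_lookbehind = [strip_timestamp(LOOKBEHIND, p) for p in paths]

for before, after in zip(paths, with_lookahead):
    print(before, "->", after)
```

Both patterns give the same result on these inputs. The lookahead version is looser: it removes any slash followed by 8 digits and another slash, wherever it appears, while the lookbehind version only fires right after a year/month prefix at the start of a line.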


trying to understand what this regex means [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
Trying to understand what the below regex means.
/^[0-9]{2,3}[- ]{0,1}[0-9]{3}[- ]{0,1}[0-9]{3}$/
Sorry not exactly a coding question.
Let's break this regex into a few different parts:
^: asserts position at start of the string
[0-9]{2,3}: Match a digit (0 to 9) two or three times
[- ]{0,1}: Match a dash or a space zero or one time (optional separator)
[0-9]{3}: Match a digit (0 to 9) exactly 3 times
[- ]{0,1}: Match a dash or a space zero or one time (optional separator)
[0-9]{3}: Match a digit (0 to 9) exactly 3 times
$: asserts position at the end of the string, or before the line terminator right at the end of the string (if any)
Here are a few strings that would pass this regex:
123-123-123
123123123
12-123-123
12123123
Here's a good resource to learn/test regexes: regex101.com
It matches two or three digits followed by (optionally) a dash or space, then 3 digits, again optional dash or space and 3 digits. It seems to try to match a telephone number written in different formats.
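A quick way to confirm the breakdown is to run the pattern against a handful of strings. The sketch below uses Python's re module; the name PHONE and the extra sample strings beyond the four listed above are assumptions added for illustration.

```python
import re

# The pattern from the question, without the surrounding / delimiters.
PHONE = re.compile(r"^[0-9]{2,3}[- ]{0,1}[0-9]{3}[- ]{0,1}[0-9]{3}$")

samples = [
    "123-123-123",   # dashes as separators
    "123123123",     # no separators
    "12-123-123",    # two leading digits
    "12123123",      # two leading digits, no separators
    "12 123 123",    # spaces as separators
    "123- 123-123",  # dash AND space: only one separator character is allowed
    "1-123-123",     # only one leading digit
    "1234-123-123",  # four leading digits
]

# Map each sample to True/False depending on whether the pattern matches.
results = {s: bool(PHONE.match(s)) for s in samples}

for s, ok in results.items():
    print(f"{s!r:16} {'match' if ok else 'no match'}")
```

The first five samples match; the last three do not, because each separator position accepts at most one dash or space and the leading group must be two or three digits.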

Regex to match specific string + optional space + 8 digits [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
I need a regular expression to validate strings with the prefix 'CON' followed by an optional space followed by 8 digits.
I've tried various expressions, I got tangled up and now I'm lost.
^(CON+s\?d{8})$
\bCON\b\S?D{8}
Syntax is off a bit
^(CON\s?\d{8})
( starts a capturing group
CON is exactly matched
\s matches any white space character and the ? makes it optional
\d{8} matches 8 digits
) ends the capturing group
You were pretty close to start with. Hope this helps :)
Keeping in mind that if there is no space, there shouldn't be 8 more digits:
^CON(\ \d{8})?
If the string you are looking for can be part of a larger string (note that in this case it may be preceded or followed by anything, even other digits):
CON\s?\d{8}
If the string must match in full, use ^$ to designate that:
^CON\s?\d{8}$
You can add variations to it: if, say, you want it to begin or end at a word boundary, use \b to indicate that. If you want it to end in a non-digit, use \D+ at the end instead of $.
Finally, if you want the string to end with an EOL or a non-digit, you may use an expression like this:
CON\s?\d{8}(\D+|$) or the same with a non-capturing group: CON\s?\d{8}(?:\D+|$)
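The difference between the anchored and unanchored variants is easiest to see side by side. This is a small Python sketch; the names FULL, PARTIAL and NON_DIGIT_END and all sample strings are invented for the example.

```python
import re

# Hypothetical names for the three variants discussed above.
FULL = re.compile(r"^CON\s?\d{8}$")                  # whole string must match
PARTIAL = re.compile(r"CON\s?\d{8}")                 # may appear inside a larger string
NON_DIGIT_END = re.compile(r"CON\s?\d{8}(?:\D+|$)")  # must not be followed by a digit


def matches(pattern, text):
    """Return True if pattern is found anywhere in text."""
    return pattern.search(text) is not None


# Validation of a complete value.
print(matches(FULL, "CON12345678"))     # no space
print(matches(FULL, "CON 12345678"))    # one space
print(matches(FULL, "CON  12345678"))   # two spaces
print(matches(FULL, "CON123456789"))    # nine digits

# Searching inside a longer string.
print(matches(PARTIAL, "ref CON 123456789"))        # finds the first 8 digits
print(matches(NON_DIGIT_END, "ref CON 123456789"))  # rejected: a 9th digit follows
print(matches(NON_DIGIT_END, "ref CON 12345678, ok"))
```

For validating a field, the anchored form is usually the one you want; the unanchored form happily accepts a ninth digit because it only needs to find eight somewhere.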

Regex to match given amount of characters in undefined order [duplicate]

This question already has answers here:
Regex to match exactly n occurrences of letters and m occurrences of digits
(3 answers)
Closed 4 years ago.
I am looking for a regex that matches the following:
2 times the character 'a' and 3 times the character 'b'.
Additionally, the characters do not have to be consecutive, meaning that not only 'aabbb' and 'bbbaa' should be allowed, but also 'ababb', 'abbab' and so forth.
By the sound of it this should be an easy task, but atm I just can't wrap my head around it. Redirection to a good read is appreciated.
You need to use positive lookaheads. This is the same as the password validation problem described here.
Edit:
A positive lookahead allows you to check a pattern against the string without changing where the next part of the regex matches. This means that you can test multiple patterns at the current position of the string, and for the regex to match, all the positive lookaheads have to match.
In your case you are looking for 2 a's and 3 b's. The regex to match exactly 2 a's anywhere in the string is /^[^a]*a[^a]*a[^a]*$/, and for exactly 3 b's it is /^[^b]*b[^b]*b[^b]*b[^b]*$/. We now need to combine these so that both must match:
/^(?=[^a]*a[^a]*a[^a]*$)(?=[^b]*b[^b]*b[^b]*b[^b]*$).*$/
This starts at the beginning of the string with the ^ anchor, then the first lookahead checks for exactly 2 a's up to the end of the string. Because (?= ... ) is a positive lookahead, the position in the string does not move, so we are still at the start of the string when the second lookahead checks for exactly 3 b's. After both lookaheads succeed we are still at the beginning of the string, but now know that it contains 2 a's and 3 b's, so we match the whole string with .*$. Note that .* also accepts characters other than a and b; if the string may contain only those two characters, use [ab]* in its place.
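The combined pattern can be tried out with a short Python sketch. ANY_CHARS is the pattern from the answer; ONLY_AB is a variation assumed here (not part of the original answer) that swaps the final .* for [ab]* so that no other characters are accepted. Both names and all sample strings are made up for the illustration.

```python
import re

# Pattern from the answer: exactly two a's and exactly three b's, any other characters allowed.
ANY_CHARS = re.compile(
    r"^(?=[^a]*a[^a]*a[^a]*$)(?=[^b]*b[^b]*b[^b]*b[^b]*$).*$"
)
# Assumed variation: same lookaheads, but the string may consist of a and b only.
ONLY_AB = re.compile(
    r"^(?=[^a]*a[^a]*a[^a]*$)(?=[^b]*b[^b]*b[^b]*b[^b]*$)[ab]*$"
)

samples = ["aabbb", "bbbaa", "ababb", "abbab", "bbaaa", "aabb", "aabbbb", "aabbbc"]

# For each sample, record the result of both patterns as a (bool, bool) pair.
results = {
    s: (bool(ANY_CHARS.match(s)), bool(ONLY_AB.match(s)))
    for s in samples
}

for s, (any_chars, only_ab) in results.items():
    print(f"{s:8} any_chars={any_chars} only_ab={only_ab}")
```

The two patterns agree on every sample except 'aabbbc', which has the right counts but contains an extra character: the original pattern accepts it and the stricter variation rejects it.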

Regex quantifier not restricting match [duplicate]

This question already has an answer here:
Restricting character length in a regular expression
(1 answer)
Closed 4 years ago.
I would like to match 1 or more capital letters, [A-Z]+ followed by 0 or more numbers, [0-9]* but the entire string needs to be less than or equal to 8 characters in total.
No matter what regex I come up with the total length seems to be ignored. Here is what I've tried.
^[A-Z]+[0-9]*{1,8}$ //Range ignored, will not work on regex101.com but will on rubular.com/
^([A-Z]+[0-9]*){1,8}$ //Range ignored
^(([A-Z]+[0-9]*){1,8})$ //Range ignored
Is this not possible in regex? Do I just need to do the range check in the language I'm writing in? That's fine, but I thought it would be cleaner to keep it all in regex syntax. Thanks
The behaviour is expected. When you write the following pattern:
^([A-Z]+[0-9]*){1,8}$
The {1,8} quantifier tells the regex to repeat the preceding element, in this case the capturing group, between one and eight times. It limits the number of repetitions, not the number of characters: since [A-Z]+ and [0-9]* are themselves unbounded, a single repetition of the group can already match a string of any length.
You need to use a lookahead to obtain the desired behaviour:
^(?=.{1,8}$)[A-Z]+[0-9]*$
^ Assert beginning of string.
(?=.{1,8}$) Ensure that the string that follows is between one and eight characters in length.
[A-Z]+[0-9]* Match one or more upper case letters followed by zero or more digits.
$ Asserts position end of string.
The regex ^([A-Z]+[0-9]*){1,8}$ would match [A-Z]+[0-9]* 1 - 8 times. That would match, for example, eight repetitions of A1 (A1A1A1A1A1A1A1A1) but not nine (A1A1A1A1A1A1A1A1A1).
You might use a positive lookahead (?=[A-Z0-9]{1,8}$) to assert the length of the string:
^(?=[A-Z0-9]{1,8}$)[A-Z]+[0-9]*$
That would match
^ From the start of the string
(?=[A-Z0-9]{1,8}$) Positive lookahead to assert that what follows matches any of the characters in the character class [A-Z0-9] 1 - 8 times and assert the end of the string.
[A-Z]+[0-9]*$ Match an uppercase character one or more times, followed by a digit zero or more times, and assert the end of the string.
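To see the difference between quantifying the group and asserting the length with a lookahead, here is a small Python sketch. The names REPEATED_GROUP, DOT_LOOKAHEAD and CLASS_LOOKAHEAD and the sample strings are assumptions made for the example.

```python
import re

# One of the attempts from the question: {1,8} counts repetitions of the group.
REPEATED_GROUP = re.compile(r"^([A-Z]+[0-9]*){1,8}$")
# The two lookahead-based patterns from the answers: the lookahead caps total length.
DOT_LOOKAHEAD = re.compile(r"^(?=.{1,8}$)[A-Z]+[0-9]*$")
CLASS_LOOKAHEAD = re.compile(r"^(?=[A-Z0-9]{1,8}$)[A-Z]+[0-9]*$")


def check(text):
    """Return (repeated_group, dot_lookahead, class_lookahead) match results."""
    return (
        bool(REPEATED_GROUP.match(text)),
        bool(DOT_LOOKAHEAD.match(text)),
        bool(CLASS_LOOKAHEAD.match(text)),
    )


for sample in ["ABC123", "ABCDEFGH", "ABCDEFGH1", "ABCDEFGHIJ12345", "A1A1", "1ABC"]:
    print(f"{sample:16} {check(sample)}")

# Eight versus nine repetitions of "A1" with the quantified group.
print(bool(REPEATED_GROUP.match("A1" * 8)), bool(REPEATED_GROUP.match("A1" * 9)))
```

The quantified group accepts the 15-character 'ABCDEFGHIJ12345' in a single repetition, which is why the range looked ignored, while both lookahead patterns reject anything longer than 8 characters.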

noncapturing group explanation within a positive lookahead [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
Does this regular expression mean that at least one of the following that isn't a-z:
(?=.*(?:[a-z]))
It's part of the following expression:
/^(?=[A-Za-z0-9\'\s\d\.]{2,50}$)(?=.*(?:[a-z]))[a-zA-Z0-9]+[A-Za-z0-9\'\s\.]+$/m
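No, (?=.*(?:[a-z])) means that any characters may come first, but a lowercase letter must appear somewhere ahead on the line. In other words, the line must contain at least one lowercase letter. The (?: ) non-capturing group around [a-z] only groups the class without capturing it and is redundant here.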
This regex means:
/^(?=[A-Za-z0-9\'\s\d\.]{2,50}$)(?=.*(?:[a-z]))[a-zA-Z0-9]+[A-Za-z0-9\'\s\.]+$/m
Match a line that is 2 to 50 characters long and consists only of alphanumerics, single quotes, whitespace or dots; that contains at least one lowercase letter; and that starts with one or more alphanumerics followed by one or more alphanumerics, single quotes, whitespace or dots.
Actually, this can be simplified to:
/^(?=[A-Za-z\d'\s.]{2,50}$)(?=.*[a-z])[a-zA-Z\d]+[A-Za-z\d'\s.]+$/m
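As a rough check that the shorter form behaves like the original, the Python sketch below runs both on a few strings. The names ORIGINAL and SIMPLIFIED and the samples are invented for the illustration. The re.ASCII flag is an assumption added so that \d means exactly [0-9] in Python; without it \d would also match non-ASCII digits and the two patterns would not be strictly equivalent.

```python
import re

FLAGS = re.MULTILINE | re.ASCII  # /m from the original; ASCII so \d equals [0-9]

ORIGINAL = re.compile(
    r"^(?=[A-Za-z0-9\'\s\d\.]{2,50}$)(?=.*(?:[a-z]))[a-zA-Z0-9]+[A-Za-z0-9\'\s\.]+$",
    FLAGS,
)
SIMPLIFIED = re.compile(
    r"^(?=[A-Za-z\d'\s.]{2,50}$)(?=.*[a-z])[a-zA-Z\d]+[A-Za-z\d'\s.]+$",
    FLAGS,
)

samples = [
    "John's place",  # valid
    "dABC",          # lowercase letter at the start is enough
    "ABCd",          # lowercase letter at the end is enough
    "Ab",            # minimum length of 2
    "JOHN SMITH",    # no lowercase letter anywhere
    "a",             # too short
    "'abc",          # must start with an alphanumeric
    "abc!",          # '!' is not an allowed character
    "x" * 51,        # too long
]

# Record the result of each pattern as an (original, simplified) pair.
results = {
    s: (bool(ORIGINAL.search(s)), bool(SIMPLIFIED.search(s)))
    for s in samples
}

for s, pair in results.items():
    print(f"{s[:20]!r:24} {pair}")
```

On these samples both patterns agree, and 'dABC' shows that the (?=.*[a-z]) lookahead only requires a lowercase letter somewhere on the line, not at the end.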