regular expression for multiple filenames - regex

I have some files like this:
15.58.55.ser 16.22.20.ser 16.36.23.ser 16.40.13.ser 16.59.41.ser 17.05.08.ser 17.14.40.ser 18.14.40.ser 18.20.43.ser
I want to replace these filenames with the following format
image_1.ser image_2.ser ....
I don't know how to achieve this.
Please give me some advice.

The regex is quite simple:
(?:\d{2}\.){3}ser
It matches two digits \d{2} and a dot \. three times {3}, ending in ser.
You can see from RegExr that it matches all of your test cases.
However, in order to know how to do the replacement, you'd have to specify a language that you're working with.

Try this (if you need Java code):
String regex = "\\.ser";
String fileName = "15.58.55.ser";
// split(regex)[0] yields the part before ".ser"; that part is then replaced with the new name
System.out.println(fileName.replaceAll(fileName.split(regex)[0], "image_1"));
This handles only one entry. If you want to rename multiple files, do the same inside a for loop.
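For several files, the loop could be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the helper name plan_renames and the decision to number files in sorted order are hypothetical, and the function only computes the new names rather than touching the disk.

```python
import re

# Same pattern as in the first answer: three "two digits + dot" groups, then "ser".
SER_NAME = re.compile(r"(?:\d{2}\.){3}ser")


def plan_renames(names):
    """Return (old, new) pairs for every name that fully matches the pattern.

    Hypothetical helper: matching files are numbered in sorted order.
    """
    matching = sorted(n for n in names if SER_NAME.fullmatch(n))
    return [(old, f"image_{i}.ser") for i, old in enumerate(matching, start=1)]


if __name__ == "__main__":
    for old, new in plan_renames(["16.22.20.ser", "15.58.55.ser", "notes.txt"]):
        print(old, "->", new)
```

To actually rename the files, each pair could then be passed to os.rename(old, new).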

Related

Split complex string into multiple parts using regex

I've tried a lot to split this string into something I can work with, but my experience isn't enough to reach the goal. I went through the first 3 pages of Google results, which helped but still didn't give me an idea of how to do this properly:
I have a string which looks like this:
My Dogs,213,220#Gallery,635,210#Screenshot,219,530#Good Morning,412,408#
The result should be:
My Dogs
213,220
Gallery
635,210
Screenshot
219,530
Good Morning
412,408
Anyone have an idea how to use regex to split the string like shown above?
Given the shared patterns, it seems you're looking for a regex like the following:
[A-Za-z ]+|\d+,\d+
It matches two patterns:
[A-Za-z ]+: any combination of letters and spaces
\d+,\d+: any combination of digits + a comma + any combination of digits
Check the demo here.
If you want a stricter regex, you can wrap the previous pattern between a lookbehind and a lookahead, so that you're sure that every match is preceded by a comma, a # or the start of the string, and followed by a comma, a # or the end of the string.
(?<=^|,|#)([A-Za-z ]+|\d+,\d+)(?=,|#|$)
Check the demo here.
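The question does not name a language, so as an illustration only, here is how the simpler pattern behaves with Python's re module (the function name split_parts is hypothetical):

```python
import re

# Either a run of letters/spaces, or "digits,digits".
TOKEN = re.compile(r"[A-Za-z ]+|\d+,\d+")


def split_parts(text):
    """Return every name and every coordinate pair, in order of appearance."""
    return TOKEN.findall(text)


if __name__ == "__main__":
    sample = "My Dogs,213,220#Gallery,635,210#Screenshot,219,530#Good Morning,412,408#"
    for part in split_parts(sample):
        print(part)
```

The separators (the comma after each name and the #) match neither alternative, so findall simply skips them.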

Regex - Skip characters to match

I'm having an issue with Regex.
I'm trying to match T0000001 (2, 3 and so on).
However, some of the lines it searches have what I can describe as positioners. These are shown as a question mark, followed by 2 digits, such as ?21.
These positioners describe a new position if the document were to be printed off the website.
Example:
T123?214567
T?211234567
I need to disregard ?21 and match T1234567.
From what I can see, this is not possible.
I have looked everywhere and tried numerous attempts.
All we have to work off is the linked image. The creators can't even confirm the flavour of Regex it is - they believe it's Python but I'm unsure.
Regex Image
Update
Unfortunately, none of the patterns below have worked so far. I tested each one live (rather than only in a regex tester, thinking it might behave differently), but they still didn't work.
There is no replace feature, and as mentioned before I'm not sure if it is Python. Appreciate your help.
Do two regex operations
First do the regex replace to replace the positioners with an empty string.
(\?[0-9]{2})
Then do the regex match
T[0-9]{7}
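Assuming a replace operation is available and the engine behaves like Python's re (the question leaves both open), the two steps could be sketched like this. The function name find_code is hypothetical.

```python
import re

POSITIONER = re.compile(r"\?[0-9]{2}")  # step 1: the positioner to strip
TARGET = re.compile(r"T[0-9]{7}")       # step 2: the code to match


def find_code(line):
    """Strip positioners, then return the first T + 7 digits, or None."""
    cleaned = POSITIONER.sub("", line)
    match = TARGET.search(cleaned)
    return match.group(0) if match else None


if __name__ == "__main__":
    for line in ["T123?214567", "T?211234567", "no code here"]:
        print(line, "->", find_code(line))
```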
If there's only one occurrence of the 'positioners' in each match, something like this should work: (T.*?)\?\d{2}(.*)
This can be tested here: https://regex101.com/r/XhQXkh/2
Basically, match two capture groups before and after the '?21' sequence. You'll need to concatenate these two matches.
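As a rough illustration of concatenating the two groups, again assuming Python's re; the function name and the choice to return lines without a positioner unchanged are assumptions of this sketch:

```python
import re

# Group 1: "T" plus whatever precedes the positioner; group 2: whatever follows it.
SPLIT_AROUND = re.compile(r"(T.*?)\?\d{2}(.*)")


def join_around_positioner(line):
    """Concatenate the text before and after a single positioner."""
    m = SPLIT_AROUND.search(line)
    if m is None:
        return line  # no positioner found: leave the line as it is
    return m.group(1) + m.group(2)


if __name__ == "__main__":
    for line in ["T123?214567", "T?211234567", "T1234567"]:
        print(line, "->", join_around_positioner(line))
```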
First, match the ?21 and replace it with a distinctive character such as #:
\?21
Demo
Then you can try this regex to find what you want:
(T(?:\d{7}|[\#\d]{8}))\s
Demo, in which the target string is captured in group 1 (or \1).
Finally, replace # with ?21 or anything else you like.
A Python script may look like this:
import re
ss = """T123?214567
T?211234567
T1234567
T1234434?21
T5435433"""
rexpre = re.compile(r'\?21')
regx = re.compile(r'(T(?:\d{7}|[\#\d]{8}))\s')
# Print the matches with the positioner masked as '#'
for m in regx.findall(rexpre.sub('#', ss)):
    print(m)
print()
# Print the same matches with '#' restored to '?21'
for m in regx.findall(rexpre.sub('#', ss)):
    print(re.sub('#', r'?21', m))
Output is
T123#4567
T#1234567
T1234567
T1234434#
T123?214567
T?211234567
T1234567
T1234434?21
If using a replace functionality is an option for you then this might be an approach to match T0000001 or T123?214567:
Capture a T followed by zero or more digits before the optional part in group 1 (T\d*)
Make the question mark followed by 2 digits part optional (?:\?\d{2})?
Capture one or more digits after in group 2 (\d+).
Then in the replacement you could use group1group2 \1\2.
Using word boundaries \b (Or use assertions for the start and the end of the line ^ $) this could look like:
\b(T\d*)(?:\?\d{2})?(\d+)\b
Example Python
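A minimal sketch of this replacement with Python's re, assuming the engine in question really is Python-like; the function name strip_positioner is hypothetical:

```python
import re

# Group 1: "T" and any digits before the optional positioner.
# Group 2: the digits after it (at least one).
WITH_OPTIONAL_POSITIONER = re.compile(r"\b(T\d*)(?:\?\d{2})?(\d+)\b")


def strip_positioner(text):
    """Replace each match with group 1 followed by group 2."""
    return WITH_OPTIONAL_POSITIONER.sub(r"\1\2", text)


if __name__ == "__main__":
    print(strip_positioner("T123?214567\nT?211234567\nT0000001"))
```

When there is no positioner, the pattern still matches (group 1 gives back one digit to group 2), so the code is left unchanged by the replacement.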
Is the below what you want?
Use RegExReplace with multiline tag (m) and enable replace all occurrences!
Pattern = (T\d*)\?\d{2}(\d*)
replace = $1$2
Usage Example:

Regexp: replacing all [[??]] with {{param|??}}

I'm hoping some regexp guru to help me out with this:
I have strings such as [[AB]], [[ABC]] and [[BEC]], and I want to replace them with string {{param|AB}}, {{param|ABC}} and {{param|BEC}} respectively.
All source strings are inside [[]] and have 2 or 3 upper case letters. The idea is to transfer the letters inside brackets to the new format. It's fine if I need two different regexps for 2 and 3 letter long cases.
(if curious, this is for replacing large number of links with templates in a Mediawiki based page).
Thanks in advance!
You can replace the result of the following regex:
/\[\[([A-Z]{2,3})\]\]/
with:
{{param|\1}}
Note that some regex engines use $ for capture groups, so you may need to use {{param|$1}} instead.
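The question does not say which tool performs the replacement, so purely as an illustration, here is the same search and replace with Python's re (which uses \1 for the captured group). The function name to_template is hypothetical.

```python
import re

# Two or three upper-case letters inside double square brackets.
LINK = re.compile(r"\[\[([A-Z]{2,3})\]\]")


def to_template(text):
    """Rewrite [[XX]] / [[XXX]] links as {{param|XX}} templates."""
    return LINK.sub(r"{{param|\1}}", text)


if __name__ == "__main__":
    print(to_template("[[AB]], [[ABC]] and [[BEC]]"))
```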
If you want to exclude some words, you can use a negative lookahead:
/^\[\[((?!AAA|BBB|CCC)[A-Z]{2,3})]]$/gm
But note that since the preceding regex uses anchors, you need the m (multiline) flag if you are dealing with a multiline string.
See demo https://regex101.com/r/cR8zG6/1
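A sketch of the exclusion variant in Python, where re.MULTILINE plays the role of the m flag and substitution is global by default. AAA, BBB and CCC are the placeholder words from the answer; the function name is hypothetical.

```python
import re

# Anchored per line; the lookahead rejects the excluded words before any letters are consumed.
LINK_EXCEPT = re.compile(r"^\[\[((?!AAA|BBB|CCC)[A-Z]{2,3})]]$", re.MULTILINE)


def to_template_except(text):
    """Rewrite whole-line [[XX]] links unless the letters are an excluded word."""
    return LINK_EXCEPT.sub(r"{{param|\1}}", text)


if __name__ == "__main__":
    print(to_template_except("[[AB]]\n[[AAA]]\n[[BEC]]"))
```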
You can search using this regex:
\[\[(\w+)\]\]
and replace using:
{{param|$1}}
RegEx Demo

Interesting easy looking Regex

I am rephrasing my question to clear up the confusion!
I want to match if a string has certain letters; for this I use the character class:
[ACD]
and it works perfectly!
But I want to match only if the string has those letters 2 or more times, either the same letter repeated or 2 different letters from the set.
For example:
[AKL] should match:
ABCVL
AAGHF
KKUI
AKL
But the above should not match the following:
ABCD
KHID
LOVE
because a letter from the set is there, but only once!
that's why I was trying to use:
[ACD]{2,}
But it's not working; it's probably not the right regex. Can a regex guru help me solve this puzzle?
Thanks
PS: I will use it in MySQL. A different approach is also welcome, but I'd like to use a regex for a smarter and shorter query!
To ensure that a string contains at least two occurrences of letters from a set (let's say A, K, L as in your example), you can write something like this:
[AKL].*[AKL]
Since the MySQL regex engine is a DFA, there is no need to use a negated character class like [^AKL] in place of the dot to avoid backtracking, or a lazy quantifier that is not supported at all.
example:
SELECT 'KKUI' REGEXP '[AKL].*[AKL]';
will return 1
You can follow this link that speaks on the particular subject of the LIKE and the REGEXP features in MySQL.
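To check the pattern against the examples from the question outside the database, a quick sketch in Python (not MySQL; the function name is hypothetical) could look like this:

```python
import re

# One letter from the set, anything in between, then another letter from the set.
TWO_FROM_SET = re.compile(r"[AKL].*[AKL]")


def has_two_from_set(s):
    """True if s contains at least two letters from {A, K, L}."""
    return TWO_FROM_SET.search(s) is not None


if __name__ == "__main__":
    for word in ["ABCVL", "AAGHF", "KKUI", "AKL", "ABCD", "KHID", "LOVE"]:
        print(word, has_two_from_set(word))
```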
If I understood you correctly, this is quite simple:
[A-Z].*?[A-Z]
This looks for something in your set, [A-Z], and then lazily matches characters until it (potentially) comes across the set, [A-Z], again.
As @Enigmadan pointed out, a lazy match is not necessary here: [A-Z].*[A-Z]
The expression you are using searches for two or more consecutive characters from the set ACDFGHIJKMNOPQRSTUVWXZ.
However, your regex excludes Y (UVWXZ]), so Z cannot be matched, because it is not adjacent to another character from your set. The same principle applies to A, because B is also excluded ([ACD). For example, Z and A would match in a string like ZABCDEFGHIJKLMNOPQRSTUVWXYZA.
If those letters were not excluded on purpose, it would probably be better to use a range like [A-Z].
If you want 2 or more matches of [AKL], you can use just [AKL] and check that the number of matches is >= 2.
I am not good at SQL regex, but maybe something like this?
check (dbo.RegexMatch( ['ABCVL'], '[AKL]' ) >= 2)
To put it in plain English: use [AKL] as your regex and check that the number of matches in the string is at least 2. Here's how I would do it in Java:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
// Returns true if the string contains at least two characters from the set
private boolean search2orMore(String string) {
    Matcher matcher = Pattern.compile("[ACD]").matcher(string);
    int counter = 0;
    while (matcher.find()) {
        counter++;
    }
    return counter >= 2;
}
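The same counting idea can be sketched in Python. The function names and the letters parameter (defaulting to the AKL set from the question) are assumptions made for illustration.

```python
import re


def count_from_set(s, letters="AKL"):
    """Count how many characters of s belong to the given set of letters."""
    char_class = "[" + re.escape(letters) + "]"
    return len(re.findall(char_class, s))


def has_at_least_two(s, letters="AKL"):
    """True if s contains two or more characters from the set, adjacent or not."""
    return count_from_set(s, letters) >= 2


if __name__ == "__main__":
    for word in ["ABCVL", "KKUI", "ABCD", "LOVE"]:
        print(word, count_from_set(word), has_at_least_two(word))
```

Counting single-character matches avoids any reliance on quantifiers, so it does not matter whether the letters are next to each other.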
You can't use [ACD]{2,} because it requires 2 or more characters from the set to appear consecutively; it fails when the matching characters are separated by other characters.
Your question is not very clear, but here is my attempt at a pattern:
\b(\S*[AKL]\S*[AKL]\S*)\b
Demo
I'm pretty sure this should work in any case:
(?<l>[^AKL\n]*[AKL]+[^AKL\n]*[AKL]+[^AKL\n]*)[\n\r]
Replacing AKL with the letters you need can easily be done dynamically; tell me if you need it.
Is this what you are looking for?
".*(.*[AKL].*){2,}.*" (without quotes)
It matches if there are at least two occurrences of your characters surrounded by anything.
It is a .NET regex, but it should be the same for anything else.
Edit
Overall, MySQL regular expression support is pretty weak.
If you only need to match your capture group a minimum of two times, then you can simply use:
select * from ... where ... regexp('([ACD].*){2,}') #could be `2,` or just `2`
If you need to match your capture group more than two times, then just change the number:
select * from ... where ... regexp('([ACD].*){3}')
#This number should match the number of matches you need
If you needed a minimum of 7 matches and you were using your previous capture group [ACDF-KM-XZ]
e.g.
select * from ... where ... regexp('([ACDF-KM-XZ].*){7,}')
Response before edit:
Your regex is trying to find at least two characters from the set[ACDFGHIJKMNOPQRSTUVWXZ].
([ACDFGHIJKMNOPQRSTUVWXZ]){2,}
The reason A and Z are not being matched in your example string (ABCDEFGHIJKLMNOPQRSTUVWXYZ) is because you are looking for two or more characters that are together that match your set. A is a single character followed by a character that does not match your set. Thus, A is not matched.
Similarly, Z is a single character preceded by a character that does not match your set. Thus, Z is not matched.
The characters B, E, L and Y in the string below do not match your set:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
If you were to do a global search in the string, only the runs CD, FGHIJK and MNOPQRSTUVWX would be matched.
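The behaviour described in this answer can be checked with a short Python sketch (Python's re rather than MySQL; the names below are chosen for illustration only):

```python
import re

# Two or more consecutive characters from the asker's set.
RUNS = re.compile(r"[ACDFGHIJKMNOPQRSTUVWXZ]{2,}")
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def consecutive_runs(s):
    """Return every run of two or more adjacent characters from the set."""
    return RUNS.findall(s)


if __name__ == "__main__":
    print(consecutive_runs(ALPHABET))
```

A and Z are absent from the result because each is isolated by a character outside the set (B and Y respectively).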

Pattern matching in Perl

I am doing pattern match for some names below:
ABCD123_HH1
ABCD123_HH1_K
Now, my code to grep above names is below:
($name, $kind) = $dirname =~ /ABCD(\d+)\w*_([\w\d]+)/;
Now, the problem I am facing is that $dirname receives both names, ABCD123_HH1 and ABCD123_HH1_K. However, the match does not work as expected for ABCD123_HH1_K, although it does for ABCD123_HH1.
I appreciate your time. Could you please tell me what can be done to capture the name with _K?
You need to add the _K part to the end of your regex and make it optional with ?:
/ABCD(\d+)_([\w\d]+(_K)?)/
I also erased the \w*, which is useless and keeps you from correctly getting the HH1_K.
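To see why the greedy \w* gets in the way, the two patterns can be compared in Python. This is a sketch that assumes Python's re treats these constructs the same way Perl does; the helper name name_and_kind is hypothetical.

```python
import re

ORIGINAL = re.compile(r"ABCD(\d+)\w*_([\w\d]+)")    # pattern from the question
FIXED = re.compile(r"ABCD(\d+)_([\w\d]+(_K)?)")     # pattern without \w*


def name_and_kind(pattern, dirname):
    """Return (group 1, group 2) for the first match, or None."""
    m = pattern.search(dirname)
    return (m.group(1), m.group(2)) if m else None


if __name__ == "__main__":
    for dirname in ["ABCD123_HH1", "ABCD123_HH1_K"]:
        print(dirname, name_and_kind(ORIGINAL, dirname), name_and_kind(FIXED, dirname))
```

Because \w also matches underscores, \w* first swallows "_HH1_K" and backtracks only as far as the last underscore, leaving just "K" for the second group.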
You should check for zero or more occurrences of _K.
* in Perl's regexp means zero or more times.
+ means one or more times.
Hence, append (_K)* to your regexp.
Finally, your regexp should be this:
/ABCD(\d+)\w*_([\w\d]+(_K)*)/
\w includes letters, numbers as well as underscores.
So you can use something as simple as this:
/ABCD\w+/