How to show numbers vertically instead of horizontally with a simple regex?

I have the following numbers as shown below:
1234567890
I would like to get the result as:
1
2
3
4
5
6
7
8
9
0
(Horizontal to vertical.) Please help me achieve this with a simple regex or in EditPlus.
Thanks in advance !!!

You don't need a regular expression for this; all you're trying to accomplish is to insert a newline character between each element in your string.
If you're using C#, you can use the following:
string s = "1234567890";
string result = string.Join(Environment.NewLine, s.ToCharArray());
Note that if your number is of a numeric data type (e.g., int), you'll likely need to convert it to a string. In C#, this is as simple as calling the .ToString() method, for example:
int x = 1234567890;
string s = x.ToString();
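For comparison, here is the same idea as a minimal Python sketch (the variable names are mine, not from the question): joining the characters of the string with a newline puts one digit on each line, with no regex involved.

```python
# Put each character of the string on its own line.
digits = "1234567890"
vertical = "\n".join(digits)  # a str is iterated character by character
print(vertical)
```

If the value starts out as an int, str(x) plays the role of .ToString() here.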

Sorry, I don't have EditPlus, but this should work (tested in Notepad++).
Find:
([0-9])
Replace:
\1\r\n
Make sure regular expression search is turned on (this may only pertain to Notepad++).
The () creates a regular expression group, which can then be back-referenced via "\1".
The "\r\n" is just CRLF.
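As a rough illustration of how the group and the back-reference work together, here is the same find/replace as a Python sketch. It only mirrors the Notepad++ pattern and replacement above; it is not an EditPlus recipe, and the variable names are illustrative.

```python
import re

text = "1234567890"
# ([0-9]) captures one digit; \1 in the replacement refers back to it,
# and \r\n appends a CRLF after every digit.
vertical = re.sub(r"([0-9])", r"\1\r\n", text)
print(vertical)
```

Note that this leaves a trailing CRLF after the last digit, just as the editor replacement would.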

In EditPlus, replace . with &\n.

Related

How to retrieve the targeted substring, if the number of characters can vary?

I want to retrieve from input similar to the following: code="XY85XXXX", the substring between "".
In case of a fixed number of 8 characters I can retrieve the value with (?<=code=").{8}.
But the targeted substring length varies, 7 or 9, or somewhere in the range between 3 and 11 (as in the examples below) and that is what I need to also handle.
Input can for example be code="XY85XXXX765" or code="123".
How must I adjust the regex to achieve that flexibility?
You can use a positive lookbehind to 'anchor' your matches to the fixed part (?<=code=") and a negated character class that allows any character but ", occurring one or more times:
(?<=code=")[^"]+
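A quick way to check this pattern against the inputs from the question is a small Python sketch (the sample list and variable names are assumptions for illustration only):

```python
import re

# The lookbehind anchors the match right after code=" and [^"]+ then takes
# everything up to, but not including, the closing quote.
pattern = re.compile(r'(?<=code=")[^"]+')

samples = ['code="XY85XXXX"', 'code="XY85XXXX765"', 'code="123"']
values = [pattern.search(s).group() for s in samples]
print(values)
```

Because the character class stops at the first ", the length of the value no longer matters.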
You can use a lookahead and lookbehind both searching for quotes:
(?<=").*(?=")
let rx = /(?<=").*(?=")/;
let extract = (txt) => console.log(txt.match(rx)[0]);
extract('code="XY85XXXX"');
extract('code="Y85XXXX"');
extract('code="ZXY85XXXXZ"');
I've copied the solution ((?<=code=")[^"]+) into https://regex101.com/ with the PHP flavor selected.
That gives me my result, but when I select .NET in the tool I get no result.
What should/must be changed?

Regex to find all and only 6-digit numeric strings, ignoring spaces, if any, in between [duplicate]

I have an HTML source page as a text file.
I need to read the file and find only those numeric strings that have 6 contiguous digits, optionally with a space in between those 6 digits.
E.g.:
209 016 - should come up in the search result as 209016 (space removed)
209016 - should also come up in the search, unaltered, as 209016
Any numeric string more than 6 digits long should not come up in the search, e.g. 20901677, 209016#223, 29016.
I think this can be achieved with a regex, but I was not able to do it.
A regex solution is preferable, but anything else is also welcome.
To match 6 digits with any number of spaces in between, you may use the following pattern:
\b(?:\d[ ]*?){6}\b
Or if you want to reject it when it's followed by an #, you may use:
\b(?:\d[ ]*?){6}\b(?!#)
Then, you can use the replace method to remove the space characters.
Python example:
import re
regex = r"\b(?:\d[ ]*?){6}\b(?!#)"
test_str = ("209016 \n"
"209 016\n"
"20901677','209016#223', '29016")
matches = re.finditer(regex, test_str, re.MULTILINE)
for match in matches:
    print(match.group().replace(" ", ""))
Output:
209016
209016
You can try the following regex:
\b(?<!#)\d(?:\s*\d){5}\b(?!#)
demo: https://regex101.com/r/ZCcDmF/2/
But note that you might have to modify your boundaries if you need to exclude more than the #. It will become something like:
\b(?<!#|other char I need to exclude|another one|...)\d(?:\s*\d){5}\b(?!#|other char I need to exclude|another one|...)
where you replace other char I need to exclude, another one, ... with the actual characters.
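To see how the lookbehind/lookahead version behaves on the examples from the question, here is a hedged Python sketch. The candidate list and variable names are made up for illustration, and each string is tested separately because \s* would also match a line break between digits.

```python
import re

# Six digits, optional whitespace between them, and no '#' directly
# before the first digit or directly after the last one.
pattern = re.compile(r"\b(?<!#)\d(?:\s*\d){5}\b(?!#)")

candidates = ["209 016", "209016", "20901677", "209016#223", "29016"]
found = []
for text in candidates:
    m = pattern.search(text)
    if m:
        # Strip the inner whitespace from the matched text.
        found.append(re.sub(r"\s+", "", m.group()))
print(found)
```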

RegEx Lookaround issue

I am using Powershell 2.0. I have file names like my_file_name_01012013_111546.xls. I am trying to get my_file_name.xls. I have tried:
.*(?=_.{8}_.{6})
which returns my_file_name. However, when I try
.*(?=_.{8}_.{6}).{3}
it returns my_file_name_01.
I can't figure out how to get the extension (which can be any 3 characters). The date/time part will always be _, 8 characters, _, 6 characters.
I've looked at a ton of examples and tried a bunch of things, but no luck.
If you just want to find the name and extension, you probably want something like this: ^(.*)_[0-9]{8}_[0-9]{6}(\..{3})$
my_file_name will be in backreference 1 and .xls in backreference 2.
If you want to remove everything else and return the answer, you want to substitute the "numbers" with nothing: 'my_file_name_01012013_111546.xls' -replace '_[0-9]{8}_[0-9]{6}', ''. You can't simply pull two bits (name and extension) of the string out as one match - regex patterns match contiguous chunks only.
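Both variants can be tried outside PowerShell as well. The following is a small Python sketch of the same two ideas (capture groups versus deleting the date/time part); the variable names are illustrative, and the patterns are the ones given in this answer.

```python
import re

name = "my_file_name_01012013_111546.xls"

# Group 1 is the base name, group 2 the dot plus a 3-character extension.
m = re.match(r"^(.*)_[0-9]{8}_[0-9]{6}(\..{3})$", name)
rebuilt = m.group(1) + m.group(2)

# Alternatively, delete the date/time part and keep everything else.
stripped = re.sub(r"_[0-9]{8}_[0-9]{6}", "", name)
print(rebuilt, stripped)
```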
Try this (not tested); it should work for a 'my_file_name' of any length, any number of digits, and any kind of extension.
"my_file_name_01012013_111546.xls" -replace '(?<=[\D_]*)(_[\d_]*)(\..*)','$2'
Non-regex solution:
$a = "my_file_name_01012013_111546.xls"
$a.replace( ($a.substring( ($a.LastIndexOf('.') - 16 ) , 16 )),"")
The original regex you specified returns the longest match that is followed by the 16 characters described by the lookahead (_, 8 characters, _, 6 characters).
Once you've changed it, it returns that longest match plus the next 3 characters. This is why you're getting this result.
The approach described by Inductiveload is probably better in case you can use backreferences. I'd use the following regex: (.*)[_\d]{16}\.(.*) Otherwise, I'd do it in two separate stages:
get the initial part
get the extension
The reason you get my_file_name_01 when you add that is because lookaheads are zero-width. This means that they do not consume characters in the string.
As you stated, .*(?=_.{8}_.{6}) matches my_file_name because that string is followed by something matching _.{8}_.{6}. However, once that match is found, you've only consumed my_file_name, so the addition of .{3} will then consume the next 3 characters, namely _01.
As for a regex that would fit your needs, others have posted viable alternatives.
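The zero-width behaviour is easy to reproduce. Here is a minimal Python sketch using the two patterns from the question (Python rather than PowerShell, purely for illustration; lookaheads behave the same way here):

```python
import re

name = "my_file_name_01012013_111546.xls"

# The lookahead only checks what follows; it consumes nothing.
before = re.match(r".*(?=_.{8}_.{6})", name).group()
# So .{3} starts consuming exactly where the lookahead began: at "_01".
after = re.match(r".*(?=_.{8}_.{6}).{3}", name).group()
print(before)
print(after)
```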

Python: RE only captures first and last match

I'm trying to make a Regular Expression that captures the following:
- XX or XX:XX, up to 6 repetitions (XX:XX:XX:XX:XX:XX), where X is a hexadecimal digit.
In other words, I'm trying to capture MAC addresses that can range from 1 to 6 bytes.
regex = re.compile("^([0-9a-fA-F]{2})(?:(?:\:([0-9a-fA-F]{2})){0,5})$")
The problem is that if I enter for example "11:22:33", it only captures the first match and the last, which results in ["11", "22"].
The question: is there any way to make the {0,5} quantifier capture all repetitions, and not just the last one?
Thanks!
Not in Python, no. But you can first check the correct format with your regex, and then simply split the string at the colons:
result = s.split(':')
Also note that you should always write regular expressions as raw strings (otherwise you get problems with escaping). And your outer non-capturing group does nothing.
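Put together, the validate-then-split approach might look like the sketch below. The function name split_mac and the constant name are hypothetical, and the pattern is the one from the question written as a raw string with the capturing groups dropped, since the split does the extraction.

```python
import re

# Validate the overall shape first: 1 to 6 colon-separated hex bytes.
MAC_PART = re.compile(r"^[0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){0,5}$")

def split_mac(s):
    """Return the list of hex bytes, or None if s is not well formed."""
    if MAC_PART.match(s) is None:
        return None
    # The format is known to be good, so a plain split is enough.
    return s.split(":")

print(split_mac("11:22:33"))
```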
Technically there is a way to do it with regex only, but the regex is quite horrible:
r"^([0-9a-fA-F]{2})(?::([0-9a-fA-F]{2}))?(?::([0-9a-fA-F]{2}))?(?::([0-9a-fA-F]{2}))?(?::([0-9a-fA-F]{2}))?(?::([0-9a-fA-F]{2}))?$"
But here you would always get six captures, just that some might be empty.
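To make the "six captures" point concrete, here is a sketch that builds the unrolled pattern programmatically, assuming each optional part is a colon followed by one captured byte. The helper names HEX and unrolled are mine.

```python
import re

HEX = r"[0-9a-fA-F]{2}"
# One mandatory byte, then five optional ":byte" parts, each in its own group.
unrolled = re.compile(r"^(" + HEX + r")" + (r"(?::(" + HEX + r"))?") * 5 + r"$")

groups = unrolled.match("11:22:33").groups()
print(groups)
```

In Python the groups that did not take part in the match come back as None, so they need to be filtered out afterwards.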

regular expression: extract last 2 characters

What is the best way to extract the last 2 characters of a string using a regular expression?
For example, I want to extract the state code from the following:
"A_IL"
I want to extract IL as a string.
Please provide C# code showing how to get it.
string fullexpression = "A_IL";
string StateCode = some regular expression code....
thanks
Use the regex:
..$
This will return the two characters just before the end anchor.
Since you're using C#, this would be simpler and probably faster:
string fullexpression = "A_IL";
string StateCode = fullexpression.Substring(fullexpression.Length - 2);
Use /(..)$/, then pull group 1 (.groups(1), $1, \1, etc.).
As for the best way, I'd say it's .{2}$
It is more elegant and self-descriptive.
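For a side-by-side look at the regex and non-regex routes, here is a small Python sketch (not C#; the variable names are adapted from the question, and the slice is only an analogue of the Substring call above):

```python
import re

full_expression = "A_IL"

# Regex version: the two characters right before the end of the string.
state_code = re.search(r".{2}$", full_expression).group()

# Slice version, analogous to Substring(Length - 2).
state_code_slice = full_expression[-2:]
print(state_code, state_code_slice)
```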