Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
Improve this question
I am trying to write a regular expression to extract only the number 120068018
!false!|!!|!!|!!|!!|!120068018!|!!|!false!
I am rather new to regular expressions and am finding this a daunting task.
I have tried using the pattern as the numbers always start with 1200
'/^1200+!\$/'
but that does not seem to work.
If your number always starts with 1200, you can do
/1200\d*/
This matches the literal 1200 plus however many digits follow it. Depending on the 'dialect' of regex that you use, you might find
/1200[0-9]*/
more robust; or
/1200[[:digit:]]*/
which is more "readable"
Any of these can be "captured" with parentheses. Again, depending on the dialect, you may or may not need to escape these (add \ in front). Example
echo '||!!||!!||!!120012345||##!!##!!||' | sed 's/^.*\(1200[0-9]*\).*$/\1/'
produces
120012345
just as you wanted. Explanation:
sed stream editor command
s substitute
^.* everything from start of string until
\( start capture group
1200 the number 1200
[0-9]* followed by any number of digits
\) end capture
.*$ everything from here to end of line
/\1/ and replace with the contents of the first capture group
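For comparison, the same substitution can be sketched with Python's re module, where capture-group parentheses are written unescaped. This only illustrates the sed command above; the variable names are made up for the example.

```python
import re

# Same sample input as the sed example above.
line = "||!!||!!||!!120012345||##!!##!!||"

# ^.* and .*$ consume everything around the number;
# \1 in the replacement keeps only the first capture group.
extracted = re.sub(r"^.*(1200[0-9]*).*$", r"\1", line)

print(extracted)
```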
Use this:
/1200\d+/
\d is a meta-character that matches any single digit.
Your regular expression didn't work because:
^ matches start of a string. You didn't have your number there.
$ matches end of the string. Again, your number is in the middle of the string.
+ applies to the immediately preceding character, meta-character or group. So 1200+ means 120 followed by 1 or more zeroes.
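A short Python sketch of these three points, using the sample line from the question. The anchored pattern is written here with $ as an end-of-string anchor, which is how this answer reads it; that reading is an assumption about the original attempt.

```python
import re

text = "!false!|!!|!!|!!|!!|!120068018!|!!|!false!"

# The anchored attempt finds nothing: the line starts with "!" rather than "1200".
anchored = re.search(r"^1200+!$", text)

# Without anchors, 1200 followed by one or more digits is found mid-string.
match = re.search(r"1200\d+", text)
number = match.group() if match else None

# "+" binds only to the last 0, so 1200+ accepts 120 plus one or more zeroes.
only_zeroes = re.fullmatch(r"1200+", "120000") is not None
with_other_digits = re.fullmatch(r"1200+", "120068018") is not None

print(anchored, number, only_zeroes, with_other_digits)
```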
This is the regular expression you need:
/[0-9]+/
[0-9] matches any single digit from 0 to 9.
The + sign means 1 or more times.
If you want to get any number starting with 1200, then it would be
/1200[0-9]*/
Here I am using * because it allows zero or more digits after 1200. With + instead, the number 1200 on its own wouldn't be matched.
If you want to capture (extract) the string, surround it with parentheses:
/(1200[0-9]*)/
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
The regular expression below works with 012, 201, 102, etc. I am trying to change it so that it matches 002, 200, 020 within a 4-digit number. I tried various methods, but the regular expression also matches other patterns. Can someone give me some direction on how to resolve this issue? Thank you.
Working:
RegEx012 = re.compile(r'\b(?=[1-9]*0)(?=[02-9]*1)(?=[013-9]*2)\d+\b')
Not Working:
RegEx002 = re.compile(r'\b(?=[1-9]*0)(?=[1-9]*0)(?=[013-9]*2)\d+\b')
Results:
0250(good)
0260(good)
2052(bad)
2062(bad)
If you want to match a string with exactly two zeros and at least 3 digits, you could use a positive lookahead:
\b(?=[1-9]*0[1-9]*0[1-9]*\b)\d{3,}\b
Explanation
\b Word boundary
(?= Positive lookahead, assert what is on the right contains
[1-9]*0[1-9]*0[1-9]*\b Match 2 times a zero between optional digits 1-9
) Close lookahead
\d{3,} Match 3 or more digits
\b Word boundary
Or the other way around, assert 3 digits and match 2 times a zero between optional digits 1-9
\b(?=\d{3})[1-9]*0[1-9]*0[1-9]*\b
To match when the third character is a 3 (or use a character class [03] to match either a 0 or a 3):
\b(?=[1-9]*0[1-9]*0[1-9]*\b)\d{2}3\d*\b
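A quick way to check the lookahead pattern is to run it next to the question's attempt in Python. This is a sketch; the sample string adds a few extra numbers (012, 1000) that are not in the question, purely to show the edge cases.

```python
import re

# Pattern from the answer: exactly two zeros, at least three digits.
two_zeros = re.compile(r"\b(?=[1-9]*0[1-9]*0[1-9]*\b)\d{3,}\b")

# The question's attempt: repeating (?=[1-9]*0) tests the same first zero
# twice, so a single zero is enough to satisfy both lookaheads.
attempt = re.compile(r"\b(?=[1-9]*0)(?=[1-9]*0)(?=[013-9]*2)\d+\b")

samples = "0250 0260 2052 2062 002 200 020 012 1000"

two_zero_hits = two_zeros.findall(samples)
attempt_hits = attempt.findall(samples)

print(two_zero_hits)
print(attempt_hits)
```

Each lookahead starts from the same position, which is why stacking two identical lookaheads adds no new constraint; counting both zeros inside one lookahead does.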
This question already has an answer here:
Add a new line after a matched pattern in Notepad++
(1 answer)
Closed 2 years ago.
I'm trying to solve the following problem without success:
I have a document with 100 questions on one line.
In Notepad++ I want to replace each "space | question number | dot | space" and add a linebreak after this, so for example:
This is question one 2. This is question two 3. This is question three
To:
This is question one
This is question two
This is question three
I'm new to regex, I managed to create the following: [\s][1-9][0-9][.][\s] but then I'm missing the single digit numbers...
This regex matches only the numbers 1 through 100:
\s*([1-9][0-9]?|100)\.\s+
while the following will match whatever number of digits regardless of the value:
\s*\d+\.\s+
the \s* will match zero or more spaces before the number.
\d+ matches one or more digits
\. matches the dot, we use the \ to escape it since it's a special character in regex
\s+ matches one or more spaces
([1-9][0-9]?|100) matches a digit from 1 to 9 optionally followed by a digit from 0 to 9, so 1 to 99, and we use the | as "OR" to include 100
Type your preferred regex in the "Find what" box and \n (new line) in the "Replace with" box, with the search mode set to "Regular expression".
You can keep the question number using:
\s*(([1-9][0-9]?|100)\.)\s+
OR
\s*(\d+\.)\s+
and replacing it with:
\n$1
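Outside Notepad++, the same two replacements can be sketched in Python, which writes the group reference as \1 rather than $1. The input line below is an assumption (it includes a leading "1. "), and the trailing space in the second replacement is an addition so that the number and the text stay separated.

```python
import re

# Assumed input: all questions on one line, each preceded by "N. ".
text = "1. This is question one 2. This is question two 3. This is question three"

# Drop the numbers: each "spaces, number, dot, spaces" run becomes a newline.
without_numbers = re.sub(r"\s*\d+\.\s+", "\n", text).strip().splitlines()

# Keep the numbers: group 1 holds "N." and is written back after the newline.
# The space after \1 is not in the original replacement; it is added here.
with_numbers = re.sub(r"\s*(\d+\.)\s+", r"\n\1 ", text).strip().splitlines()

print(without_numbers)
print(with_numbers)
```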
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have a set of results that I would like to parse out using Regex and I can't seem to find an expression that works. On each line in a txt file, there are 2 entries each containing a quantity up to 100 followed by an item name of varying lengths and spaces.
Example:
7 BALLS OF STRING 13 CARDBOARD BOXES
14 ROCKS 12 PENCILS
I would like to match the 1st entry with its quantity in group 1, and the 2nd entry with its quantity in group 2.
You can use the following regular expression pattern and use it while reading the file, line per line:
^(\d*\s[A-Z\s]*)\s(\d*\s[A-Z\s]*)$
Here is a live example: https://regex101.com/r/18dege/1
Here are a few details:
^ matches the beginning of the string, $ the end of it
\d* greedily matches any number (0 or more) of numeric characters (equal to [0-9]*)
\s matches a white space character (e.g. tab, space, etc.)
[A-Z\s]* greedily matches any number (0 or more) of uppercase characters and whitespace
() creates a matching group (to extract some parts of the string)
According to the comment below, uppercase letters can be followed by lowercase letters, which should not be matched. An example for this would be:
7 BALLS OF STRING 13 CARDBOARD BOXES
14 ROCKS 12 PENCILS
18 TABLES 3 BLANKETS sewn with patches
To match this pattern, you can use the following regular expression:
^(\d*\s[A-Z\s]*?)[a-z\s]*\s(\d*\s[A-Z\s]*?)[a-z\s]*$
As an update to the above pattern, I've added the following:
[a-z\s]* between the statements (not in the group) and after the second statement, to match a lowercase string
(\d*\s[A-Z\s]*?) I've added a question mark ?, to make the matching non-greedy. This prevents adding the white space between the uppercase and the lowercase part to the matching group. It is now required to have an end of string character $ at the end of the pattern, otherwise, the second group would not match enough characters.
Here is a live example: https://regex101.com/r/18dege/2
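Reading the file line by line could look roughly like the Python sketch below. The sample lines are hard-coded instead of read from a file, and the names simple, lazy and parsed are only illustrative.

```python
import re

# First pattern: uppercase names only.
simple = re.compile(r"^(\d*\s[A-Z\s]*)\s(\d*\s[A-Z\s]*)$")

# Updated pattern: lazy groups, with trailing lowercase text left outside them.
lazy = re.compile(r"^(\d*\s[A-Z\s]*?)[a-z\s]*\s(\d*\s[A-Z\s]*?)[a-z\s]*$")

lines = [
    "7 BALLS OF STRING 13 CARDBOARD BOXES",
    "14 ROCKS 12 PENCILS",
    "18 TABLES 3 BLANKETS sewn with patches",
]

# Each result is a (first entry, second entry) tuple.
parsed = [lazy.match(line).groups() for line in lines]

# The first pattern cannot consume the lowercase tail, so it finds no match.
simple_on_last = simple.match(lines[2])

for entry in parsed:
    print(entry)
print(simple_on_last)
```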
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
What regex will match only the rows whose value falls in a given range (e.g. 20-25 days) in raw text data (sample below):
[product-1][arbitrary-text][expiry-17days]
[product-2][arbitrary-text][expiry-22days]
[product-3][arbitrary-text][expiry-29days]
[product-4][arbitrary-text][expiry-25days]
[product-5][arbitrary-text][expiry-10days]
[product-6][arbitrary-text][expiry-12days]
[product-7][arbitrary-text][expiry-20days]
[product-8][arbitrary-text][expiry-26days]
'product' and 'expiry' text is static (doesn't change), while their corresponding values change.
'arbitrary-text' is also different for each line/product. So in the sample above, the regex should only match/return lines which have the expiry between 20-25 days.
Expected regex matches:
[product-2][arbitrary-text][expiry-22days]
[product-4][arbitrary-text][expiry-25days]
[product-7][arbitrary-text][expiry-20days]
Thanks.
Please check the following regex:
/(.*-2[0-5]days\]$)/gm
( # start capturing group
.* # matches any character (except newline), zero or more times
- # matches hyphen character literally
2 # matches digit 2 literally
[0-5] # matches any digit between 0 to 5
days # matches the characters days literally
\] # matches the character ] literally
$ # assert position at end of a line
) # end of the capturing group
Do note the use of -2[0-5]days to make sure that it doesn't match:
[product-7][arbitrary-text][expiry-222days] # won't match this
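As a rough Python equivalent of the /gm flags, re.MULTILINE makes $ match at the end of every line and findall returns every match. The sketch below runs the pattern over the sample data from the question.

```python
import re

data = "\n".join([
    "[product-1][arbitrary-text][expiry-17days]",
    "[product-2][arbitrary-text][expiry-22days]",
    "[product-3][arbitrary-text][expiry-29days]",
    "[product-4][arbitrary-text][expiry-25days]",
    "[product-5][arbitrary-text][expiry-10days]",
    "[product-6][arbitrary-text][expiry-12days]",
    "[product-7][arbitrary-text][expiry-20days]",
    "[product-8][arbitrary-text][expiry-26days]",
])

# With one capture group, findall returns the text of that group per match.
pattern = re.compile(r"(.*-2[0-5]days\]$)", re.MULTILINE)
matches = pattern.findall(data)

for line in matches:
    print(line)
```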
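Tested this one against the sample data and it picks out the expected lines:
/[2-2]+[0-5]/g
[2-2] is a character class containing only 2, so it is equivalent to a literal 2; it keeps the first digit in the 20s range.
[0-5] requires the second digit to be between 0 and 5.
Note that the + lets the 2 repeat, so the pattern also matches 222, and because nothing anchors it to the expiry field it can match digits anywhere in the line, including the product number.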
Edit: to match the entire line character for character, this should do it for you.
\[\w*\-\d*\]\s*\[\w*\-[2-2]+[0-5]\w*\]
Edit2: updated to account for Arbitrary text ...
\[(\w*-\d*)\]+\s*\[(\w*\-\w*)\]\s*\[(\w*\-[2-2]+[0-5]\w*)\]
edit3: Updated to match any character for the arbitrary-text.
\[(\w*-\d*)\]\s*\[(.*)\]\s*\[(\w*\-[2-2][0-5]\w*)\]
.*\D2[0-5]d.*
.* matches everything.
\D prevents numbers like 123 and 222 from being valid matches.
2[0-5] covers the range.
d so it doesn't match the product number.
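To see how the patterns suggested so far differ, it can help to try them on lines that should not match. The two lines below are hypothetical (they are not in the question's sample): one has a product number in the 20s, the other a three-digit expiry.

```python
import re

patterns = {
    "anchored": r".*-2[0-5]days\]$",
    "bare": r"[2-2]+[0-5]",
    "non_digit": r".*\D2[0-5]d.*",
}

# Hypothetical lines; neither has an expiry between 20 and 25 days.
tricky = [
    "[product-20][arbitrary-text][expiry-17days]",
    "[product-7][arbitrary-text][expiry-222days]",
]

# For each pattern, record whether it finds a match in each tricky line.
hits = {
    name: [bool(re.search(pattern, line)) for line in tricky]
    for name, pattern in patterns.items()
}

for name, result in hits.items():
    print(name, result)
```

A True here is a false positive, so the patterns that tie the digits to the surrounding text (the hyphen or a non-digit before, the d after) are the safer choice.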
I pasted your sample text into http://regexr.com
It's a useful tool for building regular expressions.
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
How do I write a regular expression for validating Indian names?
Indian names mostly don't contain a surname or last name. They contain only an initial followed by the name, or vice versa.
The name may contain upper case letters, lower case letters, or both. White spaces and full stops are also part of the name.
The name should start with an upper or lower case letter.
I need a regular expression that validates the names listed below.
List of Test Case Valid Names:
B. Bala Manigandan
B.Balamanigandan
B. Balamanigandan
G.S. Sakthivel
G. S. Sakthivel
Sakthivel M.G.
GS Sakthivel
G.S. SAKTHIVEL
BALA MANIGANDAN .B
Sakthivel .M.G
If you want to catch all strings containing only upper and lowercase characters, periods and spaces, you can use
^[a-zA-Z\. ]+$
Here's a breakdown of how the regex works
^ matches the beginning of the string
[a-zA-Z\. ] matches any character a-z, A-Z, or a period or space
+ makes the above match any string with 1 or more characters
$ matches the end of the string
You can test this regex here, at regex101.com.
The regex could probably be tightened, though, to ensure that the string does not start with a period or a space and does not end with a space:
^(?![\. ])[a-zA-Z\. ]+(?<! )$
This is essentially the same as the above regex, except it uses a negative lookahead and a negative lookbehind to make sure the string does not start with a period or space or end with a space
( Beginning of the group
?! makes the group a Negative Lookahead
[\. ] Matches either a period or space (because this is a negative lookahead, it makes sure the string does not have one at this position)
) End of the group
( Beginning of the second group
?<! makes the group a Negative Lookbehind
The space matches a literal space (because this is a negative lookbehind, it makes sure the string does not have one just before this position)
) End of the second group
You can test this regex here on regex101.com
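Both patterns can also be checked in Python against the names from the question. This is a sketch: the strings in rejected are made-up counterexamples, not part of the question.

```python
import re

basic = re.compile(r"^[a-zA-Z\. ]+$")
strict = re.compile(r"^(?![\. ])[a-zA-Z\. ]+(?<! )$")

valid_names = [
    "B. Bala Manigandan", "B.Balamanigandan", "B. Balamanigandan",
    "G.S. Sakthivel", "G. S. Sakthivel", "Sakthivel M.G.",
    "GS Sakthivel", "G.S. SAKTHIVEL", "BALA MANIGANDAN .B",
    "Sakthivel .M.G",
]

# Hypothetical inputs: leading space, leading period, trailing space, digits.
rejected = [" Bala", ".Bala", "Bala ", "Bala123"]

all_valid_pass = all(strict.match(name) for name in valid_names)
basic_results = [bool(basic.match(name)) for name in rejected]
strict_results = [bool(strict.match(name)) for name in rejected]

print(all_valid_pass)
print(basic_results)
print(strict_results)
```

The basic pattern only rejects the name with digits, while the stricter one also rejects the leading and trailing separators.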