I want to find numbers that contain 5 or more digits and replace each one with its first 4 digits.
I used the regex below to find numbers that contain 5 or more digits.
[0-9]{5,}
How can I achieve the output below?
99999999 -> this will replace with 9999
12345.66 -> this will replace with 1234.66
1234 -> Remains unchanged
This one should do it:
The regex
([0-9]{4})[0-9]+
takes the first four digits as the first (and only) group
requires at least one more digit after them
the replacement then swaps the complete match for the first (and only) group (\1 or $1, depending on the tool)
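To check the behaviour outside an editor, here is a minimal Python sketch of the same find-and-replace. The function name `truncate_long_numbers` is just an illustrative name, and `\1` is Python's way of writing the group reference.

```python
import re

# Capture the first four digits, then consume the rest of the digit run.
PATTERN = re.compile(r"([0-9]{4})[0-9]+")

def truncate_long_numbers(text: str) -> str:
    """Replace every run of 5 or more digits with its first four digits."""
    return PATTERN.sub(r"\1", text)

for sample in ("99999999", "12345.66", "1234"):
    print(sample, "->", truncate_long_numbers(sample))
```

A run of exactly four digits is left alone because `[0-9]+` needs at least one extra digit to match.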
Using Notepad++, you can match 4 digits, use \K to drop them from the reported match, and then match 1 or more digits.
\d{4}\K\d+
See a regex demo.
In the replacement use an empty string.
If you don't want partial matches, you can add word boundaries \b around the pattern.
\b\d{4}\K\d+\b
See another regex demo
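If you want to try the same idea in Python, note that its built-in `re` module does not support `\K`. A rough equivalent, assuming a fixed-width lookbehind is acceptable, is sketched below; `drop_extra_digits` is a made-up helper name.

```python
import re

# Python's re has no \K, so a fixed-width lookbehind stands in for it:
# only the digits AFTER the first four are part of the match.
EXTRA_DIGITS = re.compile(r"(?<=\b\d{4})\d+\b")

def drop_extra_digits(text: str) -> str:
    """Delete everything after the first four digits of a longer digit run."""
    return EXTRA_DIGITS.sub("", text)

for sample in ("99999999", "12345.66", "1234", "x12345"):
    print(sample, "->", drop_extra_digits(sample))
```

Because of the word boundaries, a digit run glued to a letter (such as `x12345`) is left untouched, which mirrors the "no partial matches" variant above.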
Related
I'm trying to adjust KODI's search filter with regex so the scrapers recognize tv shows from their original file names.
They either come in this pattern:
"TV show name S04E01 some extra info" or this "TV show name 01 some extra info"
The first is not recognized, because "S04" scrambles the search in a number of ways, this needs to go.
The second is not recognized, because it needs an 'e' before numbers, otherwise, it won't be recognized as an episode number.
So I see two approaches.
Make the filter ignore s01-99
Prepend an 'e' to any freestanding two-digit number, but I'm not sure regex can even do that.
I have no experience with regex, but I've been playing around and came up with this, which unsurprisingly doesn't do the trick
^(?!s{00,99})\d{2}$
You may either find \b([0-9]{2})\b regex matches and replace with E$1, or match \bs(0[1-9]|[1-9][0-9])\b pattern in an ignore filter.
Details
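\b([0-9]{2})\b - matches and captures into Group 1 any two digits that are not adjacent to letters, digits or _. The E$1 replacement means that the matched text (two digits) is replaced with itself (since $1 refers to the Group 1 value) with E prepended to the value.
\bs(0[1-9]|[1-9][0-9])\b - matches an s followed by a number between 01 and 99, because (0[1-9]|[1-9][0-9]) is a capturing group matching either 0 and then any digit from 1 to 9 ([1-9]), or (|) any digit from 1 to 9 ([1-9]) and then any digit ([0-9]). Note that the trailing \b needs a non-word character (or the end of the string) after the number, so it does not match the S04 inside S04E01; drop the trailing \b if the season and episode markers are fused, and enable case-insensitive matching if the filter is case-sensitive.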
NOTE: If you need to generate a number range regex, you may use this JSFiddle of mine.
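Both approaches can be tried in Python before touching the KODI settings. This is only a sketch: the helper names, the `SEASON_FUSED` variant without the trailing `\b`, and the use of `re.IGNORECASE` are my assumptions, not something KODI is known to require.

```python
import re

# Approach 2: prefix freestanding two-digit numbers with "E".
EPISODE = re.compile(r"\b([0-9]{2})\b")

# Approach 1: the season marker, as given (needs a word boundary after the digits).
SEASON_STRICT = re.compile(r"\bs(0[1-9]|[1-9][0-9])\b", re.IGNORECASE)

# Assumed variant without the trailing \b, so that "S04" in "S04E01" is found.
# Caveat: it would also match the "S04" at the start of something like "S040".
SEASON_FUSED = re.compile(r"\bs(0[1-9]|[1-9][0-9])", re.IGNORECASE)

def mark_episode(name: str) -> str:
    return EPISODE.sub(r"E\1", name)

def strip_season(name: str) -> str:
    return SEASON_FUSED.sub("", name)

print(mark_episode("TV show name 01 some extra info"))
print(strip_season("TV show name S04E01 some extra info"))
```

The strict pattern finds `s04` in `TV show name s04 e01` but not in `S04E01`, because `4` and `E` are both word characters and there is no boundary between them.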
I'm trying to create a regex that retrieves the last number in a string, or the only number if there is just one.
Examples:
6 łyżek stopionego masła -> 6
5 łyżek blabla, 6 łyżek masła -> 6
5 łyżek mąki lub masła -> 5
I'm matching only on masła (a variable that changes), so it has to be included in the regex
EDIT:
I cannot explain what I actually need:
Here is regex101 example: https://regex101.com/r/pEeRk3/1
EDIT2:
Emma's solution works great, but I would need to parse decimals and multiple-digit numbers as well, meaning that those would match as well:
https://regex101.com/r/pEeRk3/3 - I added examples with answers in the link
If you want to match the last occurrence of a number with an optional decimal part, and your word has to follow this value, you might use lookarounds:
(?<!\S)\d+(?:\.\d+)?(?!\S)(?!.*\d)(?=.*masła)
(?<!\S)\d+(?:\.\d+)?(?!\S) Match 1+ digits with an optional part that matches a dot and 1+ digits, not surrounded by non-whitespace characters
(?!.*\d) Assert that there are no more digits following
(?=.*masła) Assert that your word occurs somewhere to the right
Regex demo
Or you might use a capturing group:
(?<!\S)(\d+(?:\.\d+)?)[^\d\n]* masła(?!\S)[^\d\n]*$
Regex demo
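Both patterns can be checked against the question's examples with a short Python sketch. The helper names are made up, and building the pattern from a `WORD` variable is my assumption about how the "changing variable" would be plugged in.

```python
import re

WORD = "masła"  # the word that changes; escaped before being put in the pattern

# Lookaround version: the whole match is the number.
LOOKAROUND = re.compile(
    rf"(?<!\S)\d+(?:\.\d+)?(?!\S)(?!.*\d)(?=.*{re.escape(WORD)})"
)
# Capturing-group version: the number is in group 1.
GROUPED = re.compile(
    rf"(?<!\S)(\d+(?:\.\d+)?)[^\d\n]* {re.escape(WORD)}(?!\S)[^\d\n]*$"
)

def last_number(text: str):
    m = LOOKAROUND.search(text)
    return m.group(0) if m else None

def last_number_grouped(text: str):
    m = GROUPED.search(text)
    return m.group(1) if m else None

for s in ("6 łyżek stopionego masła",
          "5 łyżek blabla, 6 łyżek masła",
          "5 łyżek mąki lub masła",
          "2.5 łyżki masła"):
    print(s, "->", last_number(s), last_number_grouped(s))
```

Strings that do not contain the word return `None` from both helpers.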
This expression might simply suffice:
.*([0-9])
if we are interested in one digit only, or
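.*(?<![0-9])([0-9]+)
if multiple digits might be desired. The lookbehind is needed because the greedy .* would otherwise leave only the final digit for the group.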
Demo 1
If those strings with masła are desired, we can expand our expression to:
(?=.*masła).*([0-9])
Demo 2
If we are not validating our numbers, and they may contain commas or dots, then this expression would likely return our desired output:
(?=.*masła)([0-9,.]+)(\D*)$
Demo 3
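One thing worth checking with patterns of the form `.*` followed by a digit group: the greedy `.*` gives back as little as possible, so a plain `([0-9]+)` after it ends up holding only the final digit. A minimal Python sketch; the sample string and the lookbehind guard are my own illustration.

```python
import re

text = "5 łyżek blabla, 16 łyżek masła"

# Greedy .* backtracks just far enough for [0-9]+ to match one digit.
greedy = re.search(r".*([0-9]+)", text).group(1)

# A lookbehind forces the group to start at the beginning of the digit run.
guarded = re.search(r".*(?<![0-9])([0-9]+)", text).group(1)

print(greedy)   # only the last digit of the last number
print(guarded)  # the whole last number
```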
The string to be evaluated will be either a 10-digit number or a 4-digit number.
5551119900 (10 Digit)
9999 (4 Digit)
I need a regex to test for a specific list of 10-digit or 4-digit numbers. I have the following regex, which almost works:
55511199(00|01|02|10|20|30)|(0000|9901|9902|9903|9999)
Above is checking for
5551119900
5551119901
5551119902
5551119910
5551119920
5551119930
0000
9901
9902
9903
9999
ISSUE:
(1) Need match to be exactly 10 digits or 4 digits only.
(2) Pattern match (see link below) is showing an exact match and also a "Group 1". I'm not sure what the group match means or if that is a good thing.
Sample: https://regex101.com/r/BbplFG/1/
Try this version of your regex:
^(?:55511199(?:00|01|02|10|20|30)|(?:0000|9901|9902|9903|9999))$
Demo
I have made several changes here:
Used ?: inside terms in parentheses, to turn off group capturing
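Placed the entire pattern inside (non-capturing) parentheses, so that the anchors apply to every alternative; without the group, ^ would bind only to the first alternative and $ only to the last
Added starting (^) and ending ($) anchors around the entire pattern, so that only a string consisting of exactly one of the listed 10-digit or 4-digit numbers matches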
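To see the difference the anchors make, here is a small Python sketch comparing the original pattern with the anchored one. `is_allowed` is a hypothetical helper name; in Python, `re.fullmatch` would be a slightly stricter alternative because `$` also matches just before a trailing newline.

```python
import re

# Original pattern from the question: unanchored, with capturing groups.
ORIGINAL = re.compile(r"55511199(00|01|02|10|20|30)|(0000|9901|9902|9903|9999)")

# Anchored version with non-capturing groups.
ANCHORED = re.compile(
    r"^(?:55511199(?:00|01|02|10|20|30)|(?:0000|9901|9902|9903|9999))$"
)

def is_allowed(value: str) -> bool:
    """True only if the whole string is one of the listed numbers."""
    return ANCHORED.match(value) is not None

for value in ("5551119900", "9999", "5551119903", "55511199000", "99990"):
    print(value, is_allowed(value), ORIGINAL.search(value) is not None)
```

The unanchored original still reports a match for `55511199000` and `99990`, because it only needs to find a listed number somewhere inside the input.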
How can I delete lines that contain more than 8 but fewer than 11 digits in Notepad++? The digits are separated from each other by letters, spaces, etc.
Your requirement says to remove lines having 9 or 10 digits, no more and no fewer. You may try using lookaheads to handle this. In regex mode, try finding the following pattern:
^(?!.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d)(?=.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d).*
Then just replace that with an empty string (nothing). Follow the demo below to see that the pattern correctly flags the appropriate lines.
Demo
Edit:
Here is another pattern you may use, without lookaheads, which is a bit easier on the eyes:
^\D*\d\D*\d\D*\d\D*\d\D*\d\D*\d\D*\d\D*\d\D*\d\D*\d?\D*$
This again says to match any line which contains either 9 or 10 digits, but not more or less than this.
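The same "9 or 10 digits" rule can be written more compactly with a counted group. Below is a minimal Python sketch that filters a block of text line by line; the `{9,10}` form, the function name `drop_lines` and the sample lines are my own assumptions, not part of the patterns above.

```python
import re

# Each repetition is "any non-digits, then one digit"; 9 or 10 repetitions,
# followed by optional trailing non-digits, must cover the whole line.
NINE_OR_TEN_DIGITS = re.compile(r"(?:\D*\d){9,10}\D*")

def drop_lines(text: str) -> str:
    """Remove every line that contains exactly 9 or 10 digits."""
    kept = [line for line in text.splitlines()
            if not NINE_OR_TEN_DIGITS.fullmatch(line)]
    return "\n".join(kept)

sample = "\n".join([
    "12345678",            # 8 digits: kept
    "123456789",           # 9 digits: removed
    "a1b2c3d4e5f6g7h8i9",  # 9 digits: removed
    "1234567890",          # 10 digits: removed
    "12345678901",         # 11 digits: kept
])
print(drop_lines(sample))
```

Matching each line separately also avoids the case where `\D*` runs across a line break, which can happen when a pattern like this is applied to a whole file at once.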
Ctrl+H
Find what: ^(?:\D*\d){8}(?:\D*\d){0,3}(?:\R|$)
Replace with: LEAVE EMPTY
check Wrap around
check Regular expression
Replace all
Explanation:
^ # beginning of line
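(?:\D*\d){8} # non-capturing group: 0 or more non-digits followed by 1 digit, repeated exactly 8 times
(?:\D*\d){0,3} # non-capturing group: 0 or more non-digits followed by 1 digit, repeated 0 to 3 times
(?:\R|$) # non-capturing group: a line break or the end of the file
Note that this treats the limits as inclusive and removes lines with 8 to 11 digits; for strictly 9 or 10 digits, use {9} and {0,1} instead.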
Given:
1234567
12345678
123456789
1234567890
12345678901
123456789012
a1b2c3d4e5f6g7
a1b2c3d4e5f6g7h8
a1b2c3d4e5f6g7h8i9
a1b2c3d4e5f6g7h8i9j0k1l2
Result for given example:
1234567
123456789012
a1b2c3d4e5f6g7
a1b2c3d4e5f6g7h8i9j0k1l2
I have multiple 24-hour time strings through several files. For example, 1234, which I wish to replace with 12:34.
Finding them is easy, just \d\d\d\d; that I understand, and it works. However, what replace string do I need? In other words, given xx:xx, what do I put in place of each x?
I've tried a number of things to no avail. I'm obviously not understanding how to get it to remember the digits it found and recall them in the replace string.
If the 4 digits in your example data represent 24-hour time strings, you could match 2 capturing groups between word boundaries to prevent a match within a run of more than 4 digits. You can adjust the word boundaries to your requirements.
Match
\b(\d{2})(\d{2})\b
Replace
\1:\2 (group 1, a colon, then group 2)
Explanation
\b Match a word boundary
(\d{2}) Capture in a group 2 digits
(\d{2}) Capture in a group 2 digits
\b Match a word boundary
Note
Matching 4 digits does not verify a valid 24 hour time. You could match that using for example \b([01][0-9]|2[0-3])([0-5][0-9])\b and replace with \1:\2
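For a quick test outside the editor, both variants can be run in Python, which also uses `\1:\2` in the replacement. This is only a sketch; `add_colon` and its `validate` flag are illustrative names.

```python
import re

# Any four digits between word boundaries.
SIMPLE = re.compile(r"\b(\d{2})(\d{2})\b")
# Only 00-23 for the hour and 00-59 for the minutes.
VALID = re.compile(r"\b([01][0-9]|2[0-3])([0-5][0-9])\b")

def add_colon(text: str, validate: bool = False) -> str:
    """Insert a colon between the two captured digit pairs."""
    pattern = VALID if validate else SIMPLE
    return pattern.sub(r"\1:\2", text)

print(add_colon("Open 0930 to 1234"))
print(add_colon("code 2561"))                  # not a real time, still changed
print(add_colon("code 2561", validate=True))   # left alone
```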