I can see partial answers to my question but nothing that entirely answers my problem.
I'm looking for some script to run that will find and replace specifically only 15 and 16 digit numbers in a file.
Rather than replacing each match with one fixed string, I want to keep the first 10 digits of the number and replace the last 6 with six 'X's.
For example:
1234567890123456 would become: 1234567890XXXXXX
Help greatly appreciated.
P.S. The same question is raised here, but only the question in the subject title is addressed and not the detail in the text (the chap wanted not only to find 15 and 16 digit numbers, but also to replace the last digits with 'X'):
PHP Find a 15 or 16-digit number in a long string
I guess you're using php.
With preg_replace you could do:
$str = '1234567890123456';
$res = preg_replace('/\b(\d{10})\d{5,6}\b/', '$1XXXXXX', $str);
Explanation:
/ : regex delimiter
\b : word boundary, make sure we don't have digits before
( : begin capture group
\d{10} : 10 digits
) : end capture group
\d{5,6} : 5 or 6 digits
\b : word boundary, make sure we don't have digits after
/ : regex delimiter
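To try the pattern described above outside PHP, here is a minimal sketch in Python. It is not part of the original answer: the name mask_numbers is made up for illustration, and like the PHP version it writes six X's even when the number has 15 digits.

```python
import re

# Same pattern as described above: keep 10 digits, then 5 or 6 more digits,
# with a word boundary on both sides so longer or shorter runs are skipped.
# re.ASCII keeps \d and \b limited to ASCII characters.
PATTERN = re.compile(r"\b(\d{10})\d{5,6}\b", re.ASCII)

def mask_numbers(text):  # hypothetical helper name
    """Replace the tail of every standalone 15 or 16 digit number with XXXXXX."""
    return PATTERN.sub(r"\1XXXXXX", text)

print(mask_numbers("card 1234567890123456 ok"))  # card 1234567890XXXXXX ok
```

Numbers with 14 or 17 digits are left untouched because the closing \b cannot match between two digits.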
Related
I have an Url formatted as follow : https://www.mywebsite.com/subdomain/123456789.htm. I know that the webpage number is built with exactly 9 or 10 digits. I would like to extract this number using a Regex.
The Regex I use to perform this operation is :
^https://www.mywebsite.com/[A-Za-z0-9_.-~/]+([0-9]{9,10}).htm$
The problem is that when the number is 10 digits long, I get a match which is good but only the last 9 digits are captured. For example : https://www.mywebsite.com/subdomain/1234567890.htm captures 234567890 only.
I could easily create two regexes (one with 9 digits and one with 10) and take the longest number if both matches, but is there any elegant way to solve this problem using Regex?
EDIT
Following remarks which have been made below, there is actually a mistake in my original Regex : the first character group matches the first digit of the 10, and leaves only the 9 others for the capturing group. I've added a screenshot below. Adding a forward slash to the Regex before the capturing group solved the issue, thanks!
As per @TheFourthBird, you are missing a match on the forward slash. Maybe a slightly different approach to yours would be a non-capturing group:
^https://www\.mywebsite\.com/(?:[^/]+/)+(\d{9,10})\.htm$
The character class [A-Za-z0-9_.-~/]+ is greedy and first matches all the characters that follow, up to the end of the line.
The engine then backtracks one character at a time until ([0-9]{9,10}). can match the remaining digits. That first succeeds when 9 digits are left, so only those 9 digits end up in the capturing group.
Note that you should either escape the hyphen \- or place it at the start or end of the character class, or else it could possibly match a range. Here .-~ is read as the range from . to ~.
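The backtracking behaviour is easy to reproduce. The following is a small Python sketch (my own illustration, using the URL and the pattern from the question unchanged) that shows the character class taking the first digit:

```python
import re

url = "https://www.mywebsite.com/subdomain/1234567890.htm"

# Pattern from the question, unchanged. The class also matches digits,
# so it keeps the first digit and leaves only 9 for the capturing group.
original = re.compile(
    r"^https://www.mywebsite.com/[A-Za-z0-9_.-~/]+([0-9]{9,10}).htm$"
)

captured = original.match(url).group(1)
print(captured)  # 234567890
```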
One option is to use a word boundary \b before matching the digits
^https://www\.mywebsite\.com/[A-Za-z0-9_.~/-]+\b([0-9]{9,10})\.htm$
Regex demo
Another way could be matching the / right before the digits.
^https://www\.mywebsite\.com/[A-Za-z0-9_.~/-]+/([0-9]{9,10})\.htm$
Regex demo
If there can also be chars a-zA-Z or an underscore before the digits and a lookbehind is supported, you could also assert that there is no digit before the group with (?<!\d)
^https://www\.mywebsite\.com/[A-Za-z0-9_.~/-]+(?<!\d)([0-9]{9,10})\.htm$
Regex demo
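To compare the three variants side by side, here is a hedged Python sketch. The dictionary keys and the helper extract are names I made up, and the page_ URL in the usage note is an invented test case, not one from the question.

```python
import re

# Shared pieces of the three patterns shown above.
PREFIX = r"^https://www\.mywebsite\.com/[A-Za-z0-9_.~/-]+"
SUFFIX = r"([0-9]{9,10})\.htm$"

variants = {
    "word_boundary": re.compile(PREFIX + r"\b" + SUFFIX),
    "slash": re.compile(PREFIX + "/" + SUFFIX),
    "lookbehind": re.compile(PREFIX + r"(?<!\d)" + SUFFIX),
}

def extract(name, url):  # hypothetical helper name
    """Return the captured page number for the named variant, or None."""
    m = variants[name].match(url)
    return m.group(1) if m else None

for name in variants:
    print(name, extract(name, "https://www.mywebsite.com/subdomain/1234567890.htm"))
```

All three capture the full 1234567890 for that URL. They differ when the digits follow a letter or underscore: for a made-up URL such as https://www.mywebsite.com/page_1234567890.htm only the lookbehind variant matches.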
One more approach. This gets all the numbers between / and htm
(\d+)(?=\.htm)
RegexDemo
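A short Python sketch of this lookahead approach follows. The helper name page_number is hypothetical, and note that this pattern does not check that the number has 9 or 10 digits.

```python
import re

def page_number(url):  # hypothetical helper name
    """Return the run of digits directly in front of '.htm', or None."""
    m = re.search(r"(\d+)(?=\.htm)", url)
    return m.group(1) if m else None

print(page_number("https://www.mywebsite.com/subdomain/1234567890.htm"))  # 1234567890
```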
I'd like to build a regex, but I'm stuck.
This is the format I'm looking for:
x;y => 7 times, separated by -
where x is a number from 1 to 7
and y is a number from 1 to 4
Here's what I've done so far:
^([0-7;&-]*)$
example :
1;1-2;3-3;1-4;4-5;2-6;2-7;4
Could you help me?
Thank you
Your current pattern is a broad match as repeating your character class does not take any structure into account or different ranges for the digits.
You could match a digit 1-7, then ; and a digit 1-4. Then repeat the same pattern 6 times, each time preceded by a hyphen.
^[1-7];[1-4](?:-[1-7];[1-4]){6}$
Regex demo
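As a quick check, the pattern can be exercised with a few strings. This is a minimal Python sketch; the function name is_valid and the failing inputs are my own.

```python
import re

# One x;y pair, then exactly six more pairs, each preceded by a hyphen.
PAIRS = re.compile(r"^[1-7];[1-4](?:-[1-7];[1-4]){6}$")

def is_valid(s):  # hypothetical helper name
    return PAIRS.match(s) is not None

print(is_valid("1;1-2;3-3;1-4;4-5;2-6;2-7;4"))  # True
print(is_valid("1;5-2;3-3;1-4;4-5;2-6;2-7;4"))  # False, y is out of range
```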
I have a specific pattern I'm trying to get. The pattern I'm looking for is the following: 13 digits with a possible dot for a total of min 3 and max 13 digits (including the dot if present) and ending with "/" and number from 1 to 6.
for now I have this pattern
^(\d*|\d*\.?\d*)\/[1-6]$
but this matches 1234/1 or 123456.890123456778/2
but it's not what I need
I tried a few things but I think I'm missing something
^(\d*|\d*\.?\d*){3-13}\/[1-6]$
Possible match:
1.3/1
123456./2
123456.890123/3
1234567890123/4
123/5
How do I solve this problem?
Your wording is a little confusing, but if I understood you correctly, you can use this regex:
^(?=.{5,15}$)\d+\.?\d*\/[1-6]$
Explanation:
^ - Start of string
(?=.{5,15}$) - This positive look ahead ensures that the minimum length is 5 and max length is 15 (adding two for last slash and number)
\d+\.?\d* - Matches one or more digits, followed by an optional dot . and then zero or more further digits
\/[1-6] - Matches a slash and a single digit from 1 to 6
$ - End of string
Regex Demo
Let me know if this works fine for you else list the case for which it doesn't work.
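The lookahead pattern can be checked against the sample strings from the question. This is a minimal Python sketch; the name matches is made up, and the slash is left unescaped because Python has no regex delimiters.

```python
import re

# Lookahead limits the whole string to 5..15 characters (3..13 plus "/n").
PATTERN = re.compile(r"^(?=.{5,15}$)\d+\.?\d*/[1-6]$")

def matches(s):  # hypothetical helper name
    return PATTERN.match(s) is not None

for sample in ["1.3/1", "123456./2", "123456.890123/3", "1234567890123/4", "123/5"]:
    print(sample, matches(sample))  # all True

print(matches("123456.890123456778/2"))  # False, too long
```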
I have multiple 24-hour time strings through several files. For example, 1234, which I wish to replace with 12:34.
Finding them is easy, just \d\d\d\d, that I understand and it works. However, what replace string do I need. In other words, say xx:xx, what do I put in place of each x.
I've tried numbers of things to no avail. I'm obviously not understanding how I get it to remember the digits it found and to recall them in the replace string.
If the 4 digits in your example data represent 24-hour time strings, you could match 2 capturing groups between word boundaries to prevent a match within a run of more than 4 digits. You can adjust the word boundaries to your requirements.
Match
\b(\d{2})(\d{2})\b
Replace
\1:\2 (group 1, a colon, then group 2)
Explanation
\b Match a word boundary
(\d{2}) Capture in a group 2 digits
(\d{2}) Capture in a group 2 digits
\b Match a word boundary
Note
Matching 4 digits does not verify a valid 24 hour time. You could match that using for example \b([01][0-9]|2[0-3])([0-5][0-9])\b and replace with \1:\2
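The question does not say which editor or tool is used, so here is a hedged sketch of both patterns in Python, where \1 and \2 recall the captured groups in the same way. The function name add_colon and its strict flag are my own.

```python
import re

LOOSE = re.compile(r"\b(\d{2})(\d{2})\b")
STRICT = re.compile(r"\b([01][0-9]|2[0-3])([0-5][0-9])\b")

def add_colon(text, strict=False):  # hypothetical helper name
    """Insert a colon into standalone 4-digit runs; strict checks 00:00-23:59."""
    pattern = STRICT if strict else LOOSE
    return pattern.sub(r"\1:\2", text)

print(add_colon("Depart 1234, arrive 0715"))  # Depart 12:34, arrive 07:15
print(add_colon("2561", strict=True))         # 2561
```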
I'm trying to find all single numbers (with the use of vim):
numbers at start of line
numbers at end of line
the number has to be followed and preceded by a non-number
but may not be followed or preceded by a "dot" and a number or a "," and a number.
this is correct
7
word7
7word
7.
.7
a,7
word7word
word 7 word
7-7
but not this
7.7
7,7
77
Can anyone help me and explain the regex?
EDIT:
Maybe I've found it with the help of an answer below about atomic grouping. Vim does support it:
\(\d\.\|\d\,\|\d\)\@<!\d\(\.\d\|\,\d\|\d\)\@!
You can try this:
\v%(\d+%(\.|,))@<!\d@<!\d+@>%(%(\.|,)\d)@!
Explanation:
\v turns on very magic : no need for many backslashes
the % signs are optional (they make the groups in parentheses non-capturing)
(\d+(\.|,))@<! : not preceded by digits followed by . or ,
\d@<! : not preceded by a digit (be sure we are at the first digit)
\d+@> : consume all digits (@> ensures that, see :help /\@>)
((\.|,)\d)@! : after that, no dot or comma followed by a digit.
Give this a whirl:
^(?!\d(\.|\,)?\d)(((\D*?)\d(\D*?))|(\d(\D*?)\d))$
And let me know if you'd like an explanation.
Try this one:
\(\d[\.,]\)\@<!\d\@<!\d\d\@!\([\.,]\d\)\@!
Explanation:
It looks for digits (\d) that are not preceded by a digit followed by '.' or ',' (\(\d[\.,]\)\@<!) or by a single digit (\d\@<!), and are not followed by '.' or ',' followed by a digit (\([\.,]\d\)\@!) or by a single digit (\d\@!).
This one is straight from my vim so it should work in yours.
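To check the logic against the examples in the question outside Vim, the last pattern can be translated into lookarounds. This is my own translation to Python, not Vim syntax and not something from the answers; the name count_single is made up.

```python
import re

# Translation of the Vim pattern: a digit that is not preceded by
# "digit + . or ," or by a digit, and not followed by a digit or
# by ". or , + digit".
SINGLE_DIGIT = re.compile(r"(?<!\d[.,])(?<!\d)\d(?!\d)(?![.,]\d)")

def count_single(s):  # hypothetical helper name
    return len(SINGLE_DIGIT.findall(s))

for s in ["7", "word7", "7.", ".7", "a,7", "7-7", "7.7", "7,7", "77"]:
    print(repr(s), count_single(s))
```

The first six strings give at least one match (7-7 gives two), and 7.7, 7,7 and 77 give none.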