Regex expression if number string contains specific numbers

I need some help with creating a regex string. I have this long list of numbers:
7001 7002 7003 7004 7005 7006 7007 7008 7009 7010 7011 7012 7013 7014
7015 7016 7017 7018 7019 7020 7021 7022 7023 7024 7025 7026 7027 7028
7029 7030 7031 7032 7033 7034 7035 7036 7037 7038 7039 7040 7041 7042
7043 7044 7045 7046 7047 7048 7049 7050 7051 7052 7053 7054 7055 7056
7057 7058 7059 7060 7061 7062 7063 7064 7065 7066 7067 7068 7069 7070
7071 7072 7073 7074 7075 7076 7077 7078 7079 7080 7081 7082 7083 7084
7085 7086 7087 7088 7089 7090 7091 7092 7093 7094 7095 7096 7097 7098
7099 7100 7101 7102 7103 7104 7105 7106 7107 7108 7109 7110 7111 7112
7113 7114 7115 7116 7117 7118 7119 7120 7121 7122 7123 7124 7125 7126
7127 7128 7129 7130 7131 7132 7133 7134 7135 7136 7137 7138 7139 7140
7141 7142 7143 7144 7145 7146 7147 7148 7149 7150 7151 7152 7153 7154
7155 7156 7157 7158 7159 7160 7161 7162 7163 7164 7165 7166 7167 7168
7169 7170 7171 7172 7173 7174 7175 7176 7177
Basically, I need to find the numbers that contain the digit 8 or 9 so I can remove them from the list.
I tried this regex: ([0-7][0-7][8-9]{2}) but that only matches numbers whose last two digits are both 8 or 9.

How about you just write some simple code rather than trying to cram everything into a regex?
#!/usr/bin/perl -i -p
# Process the file in place
@n = split / /;            # Split on spaces into array @n
@n = grep { !/[89]/ } @n;  # @n now contains only those numbers NOT containing 8 or 9
$_ = join( ' ', @n );      # Rebuild the line
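For comparison, the same split/filter/join idea might look like this in Python. This is only a sketch: the sample string and the function name drop_8_and_9 are made up for illustration and are not part of the answer above.

```python
def drop_8_and_9(line: str) -> str:
    """Keep only the whitespace-separated tokens that contain neither 8 nor 9."""
    kept = [tok for tok in line.split() if "8" not in tok and "9" not in tok]
    return " ".join(kept)


# Hypothetical excerpt of the list from the question.
sample = "7007 7008 7009 7010 7018 7077 7090 7100"
filtered = drop_8_and_9(sample)
print(filtered)
```

No regex is needed here at all; a plain substring test per token is enough.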

Dalorzo's answer would work, but I suggest a different approach:
/\b(?=\d{4}\b)(\d*[89]\d*)\b/g
Assuming you are only looking for 4-digit numbers, this uses a positive lookahead to ensure the number has exactly four digits (so it won't match, say, 3- or 5-digit numbers) and then checks that at least one of the digits is 8 or 9.
http://regex101.com/r/hW4vQ3
If you need to catch all numbers, not just four digit ones, then
/\b(?=\d+\b)(\d*[89]\d*)\b/g
See it in action:
http://regex101.com/r/bW2gH3
And as a bonus, the regex also captures the numbers, so you can do a replace afterwards if you wish.
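As a rough illustration, the four-digit pattern above can be tried from Python's re module. The sample string is a made-up excerpt, and the variable names are only for this sketch.

```python
import re

# Hypothetical excerpt of the list from the question.
numbers = "7001 7008 7019 7020 7077 7089 7100 7198"

# Pattern from the answer: a four-digit number with at least one 8 or 9.
pattern = re.compile(r"\b(?=\d{4}\b)(\d*[89]\d*)\b")

# Numbers the pattern captures (the ones to be removed).
matched = pattern.findall(numbers)

# Rebuild the list without them.
kept = " ".join(n for n in numbers.split() if not pattern.fullmatch(n))

print(matched)
print(kept)
```

Filtering token by token avoids leaving double spaces behind, which a plain replace of the match with an empty string would do.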

This is a bit long-winded, but easier to decipher:
/\b([89]\d{3}|\d[89]\d{2}|\d{2}[89]\d|\d{3}[89])\b/g
It also restricts the search to 4-digit groups.

How about:
/\b((?:[\d]+)?[89](?:[\d]+)?)\b/g
Online Demo
\b matches at the beginning and at the end of each number.
(?:[\d]+)? is an optional non-capturing group of digits; it is optional so that [89] can appear at the beginning, at the end, or in the middle of the number.
?: makes the group non-capturing. It could be left out in this expression, but there was no need to capture the sub-groups.

You can use this pattern:
[0-7]*(?:8[0-8]*9|9[0-9]*8)[0-9]*
or with a backreference:
(?:[0-9]*(?!\1)([89])){2}[0-9]*

re.findall(r"(\d\d[0-7][89])|(\d\d[89][0-7])|(\d\d[89][89])",x)
Works for the input given.

Slightly simpler regex with lookahead:
(?=\d*[89])\d+
Demo

Related

Regex with one open and close bracket within an number

For a few days I have been fighting with a regular expression without any success.
My first expression, what I want:
brackets only once, no matter where
text or numbers before and after the brackets are optional
only numbers within the brackets
Example what is allowed:
[32] text1
text1 [5]
text1 [103] text2
text1
[123]
[some value [33]] (maybe too complicated; this case is less important)
My second expression is similar, but with just numbers before and after the brackets instead of text:
[32] 11
11 [5]
11 [103] 22
11
[123]
no match:
[12] xxx [5] (brackets appear more than once)
[aa] xxx (no number within brackets)
This is what I tried, but it does not work because I don't know how to allow the brackets only once:
^.*\{?[0-9]*\}.*$
In another answer I also found the following, which looks good, but I need it to accept only numbers inside the brackets:
^[^\{\}]*\{[^\{\}]*\}[^\{\}]*$
Later I want to use the number in the brackets and replace it with other values, in case that matters.
Hope someone can help me. Thanks in advance!

This is what you want:
^([^\]\n]*\[\d+\])?[^[\n]*$
Live example
Update: For just numbers:
^[\d ]*(\[\d+\])?[\d ]*$
Explanation:
^ Start of line
[^...] Negative character set --> [^\]] Any character except ]
* Zero or more length of the Class/Character set
\d 0-9
+ One or more length of the Class/Character set
(...)? 0 or 1 of the group
$ End of line
Note: These RegExs can return empty matches.
Thanks to @MMMahdy-PAPION! He improved the answer.
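A quick way to check both patterns against the examples from the question is a small Python sketch like the one below. The constant names are invented for this example, and the [ inside the second character class is escaped, which should be equivalent to the unescaped form in the answer.

```python
import re

# Patterns from the answer above.
TEXT_PATTERN = re.compile(r"^([^\]\n]*\[\d+\])?[^\[\n]*$")
NUM_PATTERN = re.compile(r"^[\d ]*(\[\d+\])?[\d ]*$")

text_allowed = ["[32] text1", "text1 [5]", "text1 [103] text2", "text1", "[123]"]
text_rejected = ["[12] xxx [5]", "[aa] xxx"]
num_allowed = ["[32] 11", "11 [5]", "11 [103] 22", "11", "[123]"]

for s in text_allowed + text_rejected:
    print(repr(s), bool(TEXT_PATTERN.match(s)))

for s in num_allowed:
    print(repr(s), bool(NUM_PATTERN.match(s)))

# As the note says, both patterns also accept the empty string.
print(bool(TEXT_PATTERN.match("")), bool(NUM_PATTERN.match("")))
```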

Limiting parts and total of numbers in a string (regex)

I'm trying to use regex to find tax numbers with the formats:
nnn-nnn-nnn | nn-nnn-nnn
nnn nnn nnn | nn nnn nnn
nnnnnnnnn | nnnnnnnn
EDIT: Some samples are 062-225-505, 62-225-505, 062 225 505, 62 225 505, 062225502, 62225505. The numbers should not be longer than 9 digits in total.
So far I have ([0-9]{2,3}(\s|-|)+[0-9]{3,8}(\s|-|)+[0-9]{3,9})
This works, BUT it is also finding 050821862257111 which is too long for what I'm trying to find. How do I limit the total string as well as each part being limited?
Thanks!

Try ^\d{1,9}(?:(?:-| )\d{1,9})*$
Explanation:
^ - match beginning of a string
\d{1,9} - match between 1 and 9 digits
(?:...) - non-capturing group
(?:-| ) - alternation: match either - or a space
* - match zero or more times
$ - match end of a string
Demo

With a small change to your regex, you can limit the length to eight or nine numbers, although this would still allow a mix and match of the delimiters:
([0-9]{2,3}[\s-]?[0-9]{3}[\s-]?[0-9]{3})
If the actual number of delimiters is not important, then you could just remove them and check the length of the remaining digits.
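The "remove the delimiters, then check the length" idea could be sketched in Python as follows. The function name is_tax_number is hypothetical, and the 8-or-9-digit rule is taken from the formats listed in the question.

```python
import re


def is_tax_number(text: str) -> bool:
    """Strip spaces and dashes, then require exactly 8 or 9 digits."""
    digits = re.sub(r"[\s-]", "", text)
    return re.fullmatch(r"[0-9]{8,9}", digits) is not None


for candidate in ["062-225-505", "62 225 505", "62225505", "050821862257111"]:
    print(candidate, is_tax_number(candidate))
```

Note that this accepts delimiters in any position, so it only suits the case where their placement really does not matter.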

^\d{2}\d?(?:-|\s)?\d{3}(?:-| )?\d{3}$
demo at regex101
This regex will only match if the spaces and dashes are in the right place.
This will match: 062-225-505
This will not match: 062-2255-05 or 062225--505
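To see how this stricter pattern behaves on the samples from the question, here is a small Python check. It is only a sketch; the constant name TAX_RE is made up.

```python
import re

# Pattern from the answer above: delimiters are optional but must sit
# between the digit groups.
TAX_RE = re.compile(r"^\d{2}\d?(?:-|\s)?\d{3}(?:-| )?\d{3}$")

should_match = ["062-225-505", "62-225-505", "062 225 505",
                "62 225 505", "062225502", "62225505"]
should_fail = ["062-2255-05", "062225--505", "050821862257111"]

for s in should_match + should_fail:
    print(s, bool(TAX_RE.match(s)))
```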

Found with a combination of all of your help! :)
\s\d{2,3}\d?(-|\s)?\d{3}(\1)?\d{3}(?!\d)
Found 62-225-505, 62225505, 062 225 505, and did not find 060821067254101
Thanks all :)

Find a String from a varying number block to the end

I have nearly 8000 lines of the following text:
DIL 2 M 006 SC SCHÜTZ 083 1 Stck
25215-1 BIN-SORT 2152310251724-1 BIN-SORT getestet 048 133 Stck
RBBE60-T3dsg 21S003 SEALING 6X8.9X2.4 MM 082 3 Stck
I am only interested in the 3 digit block at the end and the number behind.
So this should be the output:
083 1
048 133
082 3
The same number, e.g. 048, could also appear at the beginning of the line; that should not count as a hit.
Unfortunately, I have no idea how to extract these strings with Notepad++.

This expression,
.*(\d{3}\s+\d+).*
with a replacement of $1 is likely to work here.
The expression is explained on the top right panel of this demo if you wish to explore/simplify/modify it.

You may try the following find and replace, in regex mode:
Find: ^.*?(\d+ \d+) \S*$
Replace: $1
The logic here is to use .*? to consume everything up to the last pair of space-separated numbers in the line, which must be followed by one final word. We then replace the whole line with only the two captured numbers.
Demo
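Outside Notepad++, the same find-and-replace can be tried in Python, which makes it easy to test against the three sample lines. This is a sketch under the assumption that every line ends with a single word such as Stck; the variable names are illustrative.

```python
import re

lines = (
    "DIL 2 M 006 SC SCHÜTZ 083 1 Stck\n"
    "25215-1 BIN-SORT 2152310251724-1 BIN-SORT getestet 048 133 Stck\n"
    "RBBE60-T3dsg 21S003 SEALING 6X8.9X2.4 MM 082 3 Stck"
)

# Same pattern as in the answer; MULTILINE makes ^ and $ work per line.
result = re.sub(r"^.*?(\d+ \d+) \S*$", r"\1", lines, flags=re.MULTILINE)
print(result)
```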

Regex to match blocks of text with key phrases in the middle

VB2010: I have text that consists of blocks of text that start with day and time DD HHMM and end only at the next day/time.
Here is my sample text:
18 2131 Z50000 ZZ-AAA
PR
PR
AGM TPS P773QQ 1500 DCA 22FEB
21,77,23,M10,F,26,3100,2
OK
18 2134 Z50000 ZZ-AAA
PR
QU HMKKDBB
.DDVZAZC 182134
ARR
FI US1500/AN P773QQ/DA KDCA/AD KMIA/IN 2026/FB 152/LA /LR
DT DDL DCAV 182134 M33A
- OS KMIA /GNO6541/R200RR
18 2134 Z50000 ZZ-AAA
PR
PR
ARR OPN P773QQ 1500 DCA 22FEB
0757
OK
18 2135 Z50000 ZZ-AAA
PR
PR
ARR M58 P773QQ 1500 DCA 22FEB
212
UNKNOWN POL/SPOL
QU HMKKDBB
.DDVZAZC 182134
ARR
FI US1500/AN P773QQ/DA KDCA/AD KMIA/IN 2026/FB 152/LA /LR
DT DDL DCAV 182134 M33A
- OS KMIA /GNO6541/R200RR
18 2136 Z50000 ZZ-AAA
PRF 1500/18 MIA IN 0152 333
18 2137 Z50000 ZZ-AAA
PR
PRZ 1500/18 MIA IN 2026 N/A 333
My goal is to get only the blocks of text that have key phrases ^FI and ^DT in the middle. The matching groups should contain only two blocks. The one from 18 2134 and end at M33A and then from 18 2135 to M33A.
I have tried:
This works for the most part except it starts the match at the prior block.
RegexOptions.Singleline Or RegexOptions.Multiline Or RegexOptions.IgnoreCase
^\d\d \d{4}(.*?)^FI US(.*?)^DT DDL(.*?)\r
This one I took from another post, but I can't seem to wrap my head around it. It matches only the first part of every block.
RegexOptions.Multiline Or RegexOptions.IgnoreCase
^\d\d \d{4}.*\r[\s\S]*?(?=(?:^\d\d \d{4}|$))
I haven't used regex in a while, so any help is appreciated.

You may use
(?ms)^\d\d +\d{4}\b(?:(?!^(?:\d\d +\d{4}\b|FI|DT)).)*?^(?:FI|DT).*?(?=^\d\d +\d{4}\b|\Z)
See the regex demo (Though it is a PCRE regex test, it will work the same in .NET).
Pattern details
(?ms) - multiline and singleline options
^ - start of a line
\d\d +\d{4}\b - 2 digits, 1 or more spaces and 4 digits as a whole word
(?:(?!^(?:\d\d +\d{4}\b|FI|DT)).)*? - any char, 0+ repetitions, as few as possible, that does not start the sequence: start of a line, 2 digits, 1 or more spaces and 4 digits as a whole word, or FI or DT
^(?:FI|DT) - FI or DT at the start of a line
.*? - any 0+ chars, as few as possible
(?=^\d\d +\d{4}\b|\Z) - a positive lookahead that requires ^\d\d +\d{4}\b (start of a line, 2 digits, 1 or more spaces and 4 digits as a whole word) or \Z (end of string) to match immediately to the right of the current location.
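The pattern can also be exercised outside .NET. Below is a minimal Python sketch on a shortened, made-up version of the sample text. Python's re accepts the same inline (?ms) flags, but its \Z matches only at the very end of the string, so results on input with trailing newlines may differ slightly from .NET.

```python
import re

BLOCK = re.compile(
    r"(?ms)^\d\d +\d{4}\b"                     # block header: DD HHMM
    r"(?:(?!^(?:\d\d +\d{4}\b|FI|DT)).)*?"     # anything up to FI/DT, without entering a new block
    r"^(?:FI|DT).*?"                           # the key line and the rest of the block
    r"(?=^\d\d +\d{4}\b|\Z)"                   # stop before the next header or at the end
)

# Shortened sample: only the middle block contains FI/DT lines.
sample = (
    "18 2131 Z50000 ZZ-AAA\n"
    "PR\n"
    "OK\n"
    "18 2134 Z50000 ZZ-AAA\n"
    "QU HMKKDBB\n"
    "FI US1500/AN P773QQ\n"
    "DT DDL DCAV 182134 M33A\n"
    "- OS KMIA /GNO6541/R200RR\n"
    "18 2136 Z50000 ZZ-AAA\n"
    "PRF 1500/18 MIA IN 0152 333\n"
)

blocks = BLOCK.findall(sample)
for block in blocks:
    print(block, end="---\n")
```

Each match runs to the start of the next header, so trailing lines after the DT line are included in the block.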

This regex should find what you need, if single line enabled
[0-3]\d\s+[0-2]\d[0-5]\d.*?(FI.*?)\n(DT.*?)\n
Explanation:
[0-3]\d\s+[0-2]\d[0-5]\d day hour and minute check
.*? non-greedy match; with single-line mode enabled, . also matches newlines
(FI.*?)\n first group, FI line, until line break
(DT.*?)\n second group, same deal

Is it possible to increment numbers using regex substitution?

Is it possible to increment numbers using regex substitution? Not using evaluated/function-based substitution, of course.
This question was inspired by another one, where the asker wanted to increment numbers in a text editor. There are probably more text editors that support regex substitution than ones that support full-on scripting, so a regex might be convenient to float around, if one exists.
Also, often I've learned neat things from clever solutions to practically useless problems, so I'm curious.
Assume we're only talking about non-negative decimal integers, i.e. \d+.
Is it possible in a single substitution? Or, a finite number of substitutions?
If not, is it at least possible given an upper bound, e.g. numbers up to 9999?
Of course it's doable given a while-loop (substituting while matched), but we're going for a loopless solution here.

This question's topic amused me because of one particular implementation I did earlier. My solution happens to take two substitutions, so I'll post it.
My implementation environment is Solaris; full example:
echo "0 1 2 3 7 8 9 10 19 99 109 199 909 999 1099 1909" |
perl -pe 's/\b([0-9]+)\b/0$1~01234567890/g' |
perl -pe 's/\b0(?!9*~)|([0-9])(?=9*~[0-9]*?\1([0-9]))|~[0-9]*/$2/g'
1 2 3 4 8 9 10 11 20 100 110 200 910 1000 1100 1910
Pulling it apart for explanation:
s/\b([0-9]+)\b/0$1~01234567890/g
For each number (#) replace it with 0#~01234567890. The first 0 is in case rounding 9 to 10 is needed. The 01234567890 block is for incrementing. The example text for "9 10" is:
09~01234567890 010~01234567890
The individual pieces of the next regex can be described separately; they are joined via pipes to reduce the substitution count:
s/\b0(?!9*~)/$2/g
Select the "0" digit in front of all numbers that do not need rounding and discard it.
s/([0-9])(?=9*~[0-9]*?\1([0-9]))/$2/g
(?=) is positive lookahead, \1 is match group #1. So this means match all digits that are followed by 9s until the '~' mark then go to the lookup table and find the digit following this number. Replace with the next digit in the lookup table. Thus "09~" becomes "19~" then "10~" as the regex engine parses the number.
s/~[0-9]*/$2/g
This regex deletes the ~ lookup table.
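For readers without Perl at hand, the two substitutions can be ported to Python's re module almost verbatim. This is a sketch rather than part of the original answer: the function name increment_all is made up, and it assumes Python 3.5 or later, where an unmatched group in the replacement expands to an empty string.

```python
import re


def increment_all(text: str) -> str:
    """Increment every non-negative integer in text using two substitutions."""
    # Step 1: turn each number # into 0#~01234567890
    # (leading 0 for a possible carry, trailing lookup table).
    prepared = re.sub(r"\b([0-9]+)\b", r"0\g<1>~01234567890", text)

    # Step 2: three alternatives joined with |, all replaced by group 2:
    #   - drop the leading 0 when no carry reaches it,
    #   - replace a digit followed only by 9s with its successor from the table,
    #   - delete the ~ marker together with the lookup table.
    return re.sub(
        r"\b0(?!9*~)|([0-9])(?=9*~[0-9]*?\1([0-9]))|~[0-9]*",
        r"\2",
        prepared,
    )


print(increment_all("0 1 2 3 7 8 9 10 19 99 109 199 909 999 1099 1909"))
```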

Wow, turns out it is possible (albeit ugly)!
In case you do not have the time or cannot be bothered to read through the whole explanation, here is the code that does it:
$str = '0 1 2 3 4 5 6 7 8 9 10 11 12 13 19 20 29 99 100 139';
$str = preg_replace("/\d+/", "$0~", $str);
$str = preg_replace("/$/", "#123456789~0", $str);
do
{
$str = preg_replace(
"/(?|0~(.*#.*(1))|1~(.*#.*(2))|2~(.*#.*(3))|3~(.*#.*(4))|4~(.*#.*(5))|5~(.*#.*(6))|6~(.*#.*(7))|7~(.*#.*(8))|8~(.*#.*(9))|9~(.*#.*(~0))|~(.*#.*(1)))/s",
"$2$1",
$str, -1, $count);
} while($count);
$str = preg_replace("/#123456789~0$/", "", $str);
echo $str;
Now let's get started.
So first of all, as the others mentioned, it is not possible in a single replacement, even if you loop it (because how would you insert the corresponding increment to a single digit). But if you prepare the string first, there is a single replacement that can be looped. Here is my demo implementation using PHP.
I used this test string:
$str = '0 1 2 3 4 5 6 7 8 9 10 11 12 13 19 20 29 99 100 139';
First of all, let's mark all numbers we want to increment by appending a marker character (I use ~, but you should probably use some crazy Unicode character or ASCII character sequence that definitely will not occur in your target string).
$str = preg_replace("/\d+/", "$0~", $str);
Since we will be replacing one digit per number at a time (from right to left), we will just add that marking character after every full number.
Now here comes the main hack. We add a little 'lookup' to the end of our string (also delimited with a unique character that does not occur in your string; for simplicity I used #).
$str = preg_replace("/$/", "#123456789~0", $str);
We will use this to replace digits by their corresponding successors.
Now comes the loop:
do
{
$str = preg_replace(
"/(?|0~(.*#.*(1))|1~(.*#.*(2))|2~(.*#.*(3))|3~(.*#.*(4))|4~(.*#.*(5))|5~(.*#.*(6))|6~(.*#.*(7))|7~(.*#.*(8))|8~(.*#.*(9))|9~(.*#.*(~0))|(?<!\d)~(.*#.*(1)))/s",
"$2$1",
$str, -1, $count);
} while($count);
Okay, what is going on? The matching pattern has one alternative for every possible digit. This maps digits to successors. Take the first alternative for example:
0~(.*#.*(1))
This will match any 0 followed by our increment marker ~, then it matches everything up to our cheat-delimiter and the corresponding successor (that is why we put every digit there). If you glance at the replacement, this will get replaced by $2$1 (which will then be 1 and then everything we matched after the ~ to put it back in place). Note that we drop the ~ in the process. Incrementing a digit from 0 to 1 is enough. The number was successfully incremented, there is no carry-over.
The next 8 alternatives are exactly the same for the digits 1 to 8. Then we take care of two special cases.
9~(.*#.*(~0))
When we replace the 9, we do not drop the increment marker, but place it to the left of the resulting 0 instead. This (combined with the surrounding loop) is enough to implement carry-over propagation. Now there is one special case left. For all numbers consisting solely of 9s we will end up with the ~ in front of the number. That is what the last alternative is for:
(?<!\d)~(.*#.*(1))
If we encounter a ~ that is not preceded by a digit (therefore the negative lookbehind), it must have been carried all the way through a number, and thus we simply replace it with a 1. I think we do not even need the negative lookbehind (because this is the last alternative that is checked), but it feels safer this way.
A short note on the (?|...) around the whole pattern. This makes sure that we always find the two matches of an alternative in the same references $1 and $2 (instead of ever larger numbers down the string).
Lastly, we add the DOTALL modifier (s), to make this work with strings that contain line breaks (otherwise, only numbers in the last line will be incremented).
That makes for a fairly simple replacement string. We simply first write $2 (in which we captured the successor, and possibly the carry-over marker), and then we put everything else we matched back in place with $1.
That's it! We just need to remove our hack from the end of the string, and we're done:
$str = preg_replace("/#123456789~0$/", "", $str);
echo $str;
> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 20 21 30 100 101 140
So we can do this entirely in regular expressions. And the only loop we have always uses the same regex. I believe this is as close as we can get without using preg_replace_callback().
Of course, this will do horrible things if we have numbers with decimal points in our string. But that could probably be taken care of by the very first preparation-replacement.
Update: I just realised, that this approach immediately extends to arbitrary increments (not just +1). Simply change the first replacement. The number of ~ you append equals the increment you apply to all numbers. So
$str = preg_replace("/\d+/", "$0~~~", $str);
would increment every integer in the string by 3.

I managed to get it working in 3 substitutions (no loops).
tl;dr
s/$/ ~0123456789/
s/(?=\d)(?:([0-8])(?=.*\1(\d)\d*$)|(?=.*(1)))(?:(9+)(?=.*(~))|)(?!\d)/$2$3$4$5/g
s/9(?=9*~)(?=.*(0))|~| ~0123456789$/$1/g
Explanation
Let ~ be a special character not expected to appear anywhere in the text.
If a character is nowhere to be found in the text, then there's no way to make it appear magically. So first we insert the characters we care about at the very end.
s/$/ ~0123456789/
For example,
0 1 2 3 7 8 9 10 19 99 109 199 909 999 1099 1909
becomes:
0 1 2 3 7 8 9 10 19 99 109 199 909 999 1099 1909 ~0123456789
Next, for each number, we (1) increment the last non-9 (or prepend a 1 if all are 9s), and (2) "mark" each trailing group of 9s.
s/(?=\d)(?:([0-8])(?=.*\1(\d)\d*$)|(?=.*(1)))(?:(9+)(?=.*(~))|)(?!\d)/$2$3$4$5/g
For example, our example becomes:
1 2 3 4 8 9 19~ 11 29~ 199~ 119~ 299~ 919~ 1999~ 1199~ 1919~ ~0123456789
Finally, we (1) replace each "marked" group of 9s with 0s, (2) remove the ~s, and (3) remove the character set at the end.
s/9(?=9*~)(?=.*(0))|~| ~0123456789$/$1/g
For example, our example becomes:
1 2 3 4 8 9 10 11 20 100 110 200 910 1000 1100 1910
PHP Example
$str = '0 1 2 3 7 8 9 10 19 99 109 199 909 999 1099 1909';
echo $str . '<br/>';
$str = preg_replace('/$/', ' ~0123456789', $str);
echo $str . '<br/>';
$str = preg_replace('/(?=\d)(?:([0-8])(?=.*\1(\d)\d*$)|(?=.*(1)))(?:(9+)(?=.*(~))|)(?!\d)/', '$2$3$4$5', $str);
echo $str . '<br/>';
$str = preg_replace('/9(?=9*~)(?=.*(0))|~| ~0123456789$/', '$1', $str);
echo $str . '<br/>';
Output:
0 1 2 3 7 8 9 10 19 99 109 199 909 999 1099 1909
0 1 2 3 7 8 9 10 19 99 109 199 909 999 1099 1909 ~0123456789
1 2 3 4 8 9 19~ 11 29~ 199~ 119~ 299~ 919~ 1999~ 1199~ 1919~ ~0123456789
1 2 3 4 8 9 10 11 20 100 110 200 910 1000 1100 1910
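The same three steps can be reproduced in Python. This is a sketch, not the author's code: the function name is invented, the first step uses plain concatenation (equivalent to s/$/ ~0123456789/ for a single-line string), and it assumes Python 3.5 or later so that unmatched groups in the replacement expand to nothing.

```python
import re


def increment_all(text: str) -> str:
    """Increment every non-negative integer in a single-line string."""
    # Step 1: append the characters we need to the end of the text.
    s = text + " ~0123456789"

    # Step 2: bump the last non-9 digit (or prepend 1 if all digits are 9)
    # and mark each trailing run of 9s with ~.
    s = re.sub(
        r"(?=\d)(?:([0-8])(?=.*\1(\d)\d*$)|(?=.*(1)))(?:(9+)(?=.*(~))|)(?!\d)",
        r"\2\3\4\5",
        s,
    )

    # Step 3: turn marked 9s into 0s, drop the markers, remove the table.
    return re.sub(r"9(?=9*~)(?=.*(0))|~| ~0123456789$", r"\1", s)


print(increment_all("0 1 2 3 7 8 9 10 19 99 109 199 909 999 1099 1909"))
```

As in the answer, ~ must not occur anywhere in the input for this to work.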

Is it possible in a single substitution?
No.
If not, is it at least possible in a single substitution given an upper bound, e.g. numbers up to 9999?
No.
You can't even replace the numbers between 0 and 8 with their respective successor. Once you have matched, and grouped this number:
/([0-8])/
you need to replace it. However, regex doesn't operate on numbers, but on strings. So you can replace the "number" (or better: digit) with twice this digit, but the regex engine does not know it is duplicating a string that holds a numerical value.
Even if you'd do something (silly) as this:
/(0)|(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)/
so that the regex engine "knows" that if group 1 is matched, the digit '0' is matched, it still cannot do a replacement. You can't instruct the regex engine to replace group 1 with the digit '1', group '2' with the digit '2', etc. Sure, some tools like PHP will let you define a couple of different patterns with corresponding replacement strings, but I get the impression that is not what you were thinking about.

It is not possible by regular expression search and substitution alone.
You have to use something else to help achieve that. You have to use the programming language at hand to increment the number.
Edit:
The regular expressions definition in the Single Unix Specification doesn't mention regular expressions supporting the evaluation of arithmetic expressions or the ability to perform arithmetic operations.
Nonetheless, I know some flavors (TextPad, an editor for Windows) allow you to use \i as a substitution term, which is an incremental counter of how many times the search string has been found, but it doesn't evaluate or parse the found strings into numbers, nor does it allow adding a number to them.

I have found a solution in two steps (Javascript) but it relies on indefinite lookaheads, which some regex engines reject:
const incrementAll = s =>
s.replaceAll(/(.+)/gm, "$1\n101234567890")
.replaceAll(/(?:([0-8]|(?<=\d)9)(?=9*[^\d])(?=.*\n\d*\1(\d)\d*$))|(?<!\d)9(?=9*[^\d])(?=(?:.|\n)*(10))|\n101234567890$/gm, "$2$3");
The key thing is to add a list of digits in order at the end of the string in the first step and, in the second, to find the relevant digit and capture the digit to its right in the list via a lookahead. There are two other branches in the second step, one for dealing with initial nines, and the other for removing the digit sequence.
Edit: I just tested it in Safari and it throws an error, but it definitely works in Firefox.

I needed to increment indices of output files by one from a pipeline I can't modify. After some searches I got a hit on this page. While the readings are meaningful, they really don't give a readable solution to the problem. Yes it is possible to do it with only regex; no it is not as comprehensible.
Here I would like to give a readable solution using Python, so that others don't need to reinvent the wheel. I can imagine many of you may have ended up with a similar solution.
The idea is to partition file name into three groups, and format your match string so that the incremented index is the middle group. Then it is possible to only increment the middle group, after which we piece the three groups together again.
import re
import sys
import argparse
from os import listdir
from os.path import isfile, join
def main():
    parser = argparse.ArgumentParser(description='index shift of input')
    parser.add_argument('-r', '--regex', type=str,
                        help='regex match string for the index to be shift')
    parser.add_argument('-i', '--indir', type=str,
                        help='input directory')
    parser.add_argument('-o', '--outdir', type=str,
                        help='output directory')
    args = parser.parse_args()
    # parse input regex string
    regex_str = args.regex
    regex = re.compile(regex_str)
    # target directories
    indir = args.indir
    outdir = args.outdir
    try:
        for input_fname in listdir(indir):
            input_fpath = join(indir, input_fname)
            if not isfile(input_fpath): # not a file
                continue
            matched = regex.match(input_fname)
            if matched is None: # not our target file
                continue
            # middle group is the index and we increment it
            index = int(matched.group(2)) + 1
            # reconstruct output
            output_fname = '{prev}{index}{after}'.format(**{
                'prev' : matched.group(1),
                'index' : str(index),
                'after' : matched.group(3)
            })
            output_fpath = join(outdir, output_fname)
            # write the command required to stdout
            print('mv {i} {o}'.format(i=input_fpath, o=output_fpath))
    except BrokenPipeError:
        pass
if __name__ == '__main__': main()
I have this script named index_shift.py. To give an example of the usage, my files are named k0_run0.csv, for bootstrap runs of machine learning models using parameter k. The parameter k starts from zero, and the desired index map starts at one. First we prepare input and output directories to avoid overwriting files:
$ ls -1 test_in/ | head -n 5
k0_run0.csv
k0_run10.csv
k0_run11.csv
k0_run12.csv
k0_run13.csv
$ ls -1 test_out/
To see how the script works, just print its output:
$ python3 -u index_shift.py -r '(^k)(\d+?)(_run.+)' -i test_in -o test_out | head -n5
mv test_in/k6_run26.csv test_out/k7_run26.csv
mv test_in/k25_run11.csv test_out/k26_run11.csv
mv test_in/k7_run14.csv test_out/k8_run14.csv
mv test_in/k4_run25.csv test_out/k5_run25.csv
mv test_in/k1_run28.csv test_out/k2_run28.csv
It generates bash mv commands to rename the files. Now we pipe the lines directly into bash.
$ python3 -u index_shift.py -r '(^k)(\d+?)(_run.+)' -i test_in -o test_out | bash
Checking the output, we have successfully shifted the index by one.
$ ls test_out/k0_run0.csv
ls: cannot access 'test_out/k0_run0.csv': No such file or directory
$ ls test_out/k1_run0.csv
test_out/k1_run0.csv
You can also use cp instead of mv. My files are kinda big, so I wanted to avoid duplicating them. You can also refactor how many you shift as input argument. I didn't bother, cause shift by one is most of my use cases.
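If the shift amount should be configurable, the core of the three-group idea fits in a small helper. This is a sketch only: shift_index and its step parameter are hypothetical and not part of the script above, and the default pattern is the one used in the example invocation.

```python
import re
from typing import Optional


def shift_index(fname: str,
                pattern: str = r"(^k)(\d+?)(_run.+)",
                step: int = 1) -> Optional[str]:
    """Return fname with its middle (index) group shifted by step, or None if no match."""
    matched = re.match(pattern, fname)
    if matched is None:  # not a target file name
        return None
    index = int(matched.group(2)) + step
    return "{}{}{}".format(matched.group(1), index, matched.group(3))


print(shift_index("k0_run10.csv"))
print(shift_index("k25_run11.csv", step=3))
print(shift_index("notes.txt"))
```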
