Replace version in yaml structure with regexp - regex

I'm trying to replace the value of the version property in the following yaml structure.
My reason for using regex rather than parsing the YAML is that I need to write it back again. If I parse it and then write it back, it'll lose its existing formatting.
environments:
  local:
    values:
      - kubeContext: default
      - surfScreenshotter:
          installed: false
          version: 0
      - whoamiMn:
          installed: false
          version: 0
  dev:
    values:
      - kubeContext: nuc
      - surfScreenshotter:
          installed: false
          version: 0
      - whoamiMn:
          installed: false
          version: 0
My kotlin code
val regex = """environments:
    |.*
    |  $environment:
    |    values:
    |.*
    |      - $projectName:
    |.*
    | {10}version: (\S+)
""".trimMargin().toRegex(MULTILINE)
val updatedHelmfile = regex.replaceFirst(helmfileContent, version)
$environment can be either "local" or "dev" and projectName can be "surfScreenshotter" or "whoamiMn".
Nothing is matched. Anyone got an idea how to make this work?

You can rely on indentation to make sure you are in the right section of your text block and capture the whole part before the version into a capturing group:
val regex = """(environments:
|(?:\R\h{2}.*)*?\R\h{2}$environment:
|(?:\R\h{4}.*)*?\R\h{6}-\h*$projectName:
|(?:\R\h{10}.*)*?\R\h{10}
|version:\h*)\S+
""".trimMargin().toRegex(RegexOption.COMMENTS)
Then, you need to make sure to restore Group 1 contents with $1 in the replacement pattern:
val updatedHelmfile = regex.replaceFirst(helmfileContent, "$1" + version)
See the regex demo and the Kotlin demo.
Details
(environments: - Group 1 start and environments: string
(?:\R\h{2}.*)*?\R\h{2}dev: - zero or more occurrences (as few as possible) of a line break followed with two horizontal whitespace and then the rest of the line, then a line break, two horizontal whitespace and dev: string
(?:\R\h{4}.*)*?\R\h{6}-\h*whoamiMn: - zero or more occurrences (as few as possible) of a line break followed with four horizontal whitespace and then the rest of the line, then a line break, six horizontal whitespace and - + 0 or more spaces, and then whoamiMn: string
(?:\R\h{10}.*)*?\R\h{10} - zero or more occurrences (as few as possible) of a line break followed with ten horizontal whitespace and then the rest of the line, then a line break, ten horizontal whitespace
version:\h*) - version:, 0 or more spaces, end of Group 1
\S+ - one or more non-whitespace chars.
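For comparison, here is a minimal Python sketch of the same "capture the prefix, then put it back" idea. It is not the Kotlin code above: the function name set_version is hypothetical, and because Python's re module has neither \R nor \h, the sketch assumes \n line endings and uses [ \t] for horizontal whitespace. The replacement is built in a function rather than with "\1" + version, since a version starting with a digit would otherwise turn \1 into a different group reference.

```python
import re

def set_version(helmfile: str, environment: str, project: str, version: str) -> str:
    """Replace one project's version inside one environment, leaving all other text untouched."""
    parts = [
        r"(environments:",                                  # Group 1 starts here
        r"(?:\n[ \t]{2}.*)*?\n[ \t]{2}", re.escape(environment), ":",
        r"(?:\n[ \t]{4}.*)*?\n[ \t]{6}-[ \t]*", re.escape(project), ":",
        r"(?:\n[ \t]{10}.*)*?\n[ \t]{10}version:[ \t]*)",   # Group 1 ends here
        r"\S+",                                             # the old version value
    ]
    pattern = re.compile("".join(parts))
    # Re-emit Group 1 unchanged and append the new version.
    return pattern.sub(lambda m: m.group(1) + version, helmfile, count=1)
```

Calling set_version(helmfile_content, "dev", "whoamiMn", "1.2.3") should change only the version line under dev / whoamiMn, provided the file is indented as in the question.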

Related

Snippets VS Code Regex

I need your help. I am building a snippet, and I need to transform the path of the file, which looks like this:
D:\Project\test\src\EnsLib\File\aaa\bbb
and I need it to be like this:
EnsLib\File\aaa\bbb
That is, keep only the part after "src" and replace each \ with a dot.
Example: D:\Project\test\src\EnsLib\File\aaa\bbb
Result: EnsLib.File.aaa.bbb
The part right after the src folder is always the starting point.
The regexes I have tried are these:
"${TM_DIRECTORY/(.*\\\\{4})/$1/}",
"${TM_DIRECTORY/.*src\\\\(.*)\\\\(.*)$/.$2/}.${TM_FILENAME_BASE}",
// "${TM_DIRECTORY/.*\\\\(.*)\\\\(.*)$/$1.$2/}.${TM_FILENAME_BASE}",
// "${RELATIVE_FILEPATH/\\D{4}(\\W)\\..+$/$1/g}",
// "${TM_DIRECTORY/(.*src\\\\)//g}.${TM_FILENAME_BASE}",
// "${RELATIVE_FILEPATH/(\\D{3})\\W|(\\..+$)/$1.$2/g}",
// "${RELATIVE_FILEPATH/\\W/./g}",
It seems you want
"${TM_DIRECTORY/^.*?\\\\src\\\\|(\\\\)/${1:+.}/g}"
The regex is ^.*?\\src\\|(\\), it matches
^ - start of string
.*? - any zero or more chars other than line break chars, as few as possible
\\src\\ - \src\ string
| - or
(\\) - Group 1 ($1): a \ char.
If Group 1 matches, the replacement is a ., else, the replacement is an empty string, i.e. the text from the start of string till \src\ is simply removed.
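The same "match what you drop, capture what you convert" trick can be tried outside VS Code. The following is only a Python sketch of the regex logic, not snippet syntax; the function name dotted_path_after_src is made up for illustration. The replacement function plays the role of ${1:+.}.

```python
import re

def dotted_path_after_src(directory: str) -> str:
    # First alternative: everything up to and including "\src\" (Group 1 is unset, so it is dropped).
    # Second alternative: any remaining backslash (Group 1 is set, so it becomes a dot).
    return re.sub(r"^.*?\\src\\|(\\)", lambda m: "." if m.group(1) else "", directory)

print(dotted_path_after_src(r"D:\Project\test\src\EnsLib\File\aaa\bbb"))  # EnsLib.File.aaa.bbb
```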

How to delete duplicate numbers in notepad ++?

I've been trying to use ^(.*?)$\s+?^(?=.*^\1$) but it doesn't work.
I have this scenario:
9993990487 - 9993990487
9993990553 - 9993990553
9993990554 - 9993990559
9993990570 - 9993990570
9993990593 - 9993990596
9993990594 - 9993990594
I would like to collapse the lines whose two numbers are "duplicates" and expect the following:
9993990487
9993990553
9993990554 - 9993990559
9993990570
9993990593 - 9993990596
9993990594
I would really appreciate some help since it's 20k+ numbers I have to filter. Another program might work too, but Notepad++ is the only one I have available on this PC.
Thanks,
Josue
You may use
^(\d+)\h+-\h+\1$
Replace with $1.
See the regex demo.
Details
^ - start of a line
(\d+) - Group 1: one or more digits
\h+-\h+ - a - char enclosed with 1+ horizontal whitespaces
\1 - an inline backreference to Group 1 value
$ - end of a line.
The replacement is a $1 placeholder that replaces the match with the Group 1 value.
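If a scripting language is available, the same backreference idea can be checked there. This is a hedged Python sketch, not Notepad++ syntax: Python's re has no \h, so [ \t] is used instead, and the function name collapse_equal_ranges is hypothetical.

```python
import re

def collapse_equal_ranges(text: str) -> str:
    # \1 must repeat exactly what Group 1 captured, so only "N - N" lines match;
    # re.MULTILINE makes ^ and $ work on every line.
    return re.sub(r"^(\d+)[ \t]+-[ \t]+\1$", r"\1", text, flags=re.MULTILINE)
```

Lines such as 9993990554 - 9993990559 are left alone because the second number differs from the captured one.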

Python - how to add a new line every time there is a pattern is found in a string?

How can I add a new line every time there is a pattern of a regex-list found in a string ?
I am using python 3.6.
I got the following input:
12.13.14 Here is supposed to start a new line.
12.13.15 Here is supposed to start a new line.
Here is some text. It is written in one lines. 12.13. Here is some more text. 2.12.14. Here is even more text.
I wish to have the following output:
12.13.14
Here is supposed to start a new line.
12.13.15
Here is supposed to start a new line.
Here is some text. It is written in one lines.
12.13.
Here is some more text.
2.12.14.
Here is even more text.
My first try returns as the output the same as the input:
in_file2 = 'work1-T1.txt'
out_file2 = 'work2-T1.txt'
start_rx = re.compile('|'.join(
    ['\d\d\.\d\d\.', '\d\.\d\d\.\d\d','\d\d\.\d\d\.\d\d']))
with open(in_file2,'r', encoding='utf-8') as fin2, open(out_file2, 'w', encoding='utf-8') as fout2:
    text_list = fin2.read().split()
    fin2.seek(0)
    for string in fin2:
        if re.match(start_rx, string):
            string = str.replace(start_rx, '\n\n' + start_rx + '\n')
        fout2.write(string)
My second try returns an error 'TypeError: unsupported operand type(s) for +: '_sre.SRE_Pattern' and 'str''
in_file2 = 'work1-T1.txt'
out_file2 = 'work2-T1.txt'
start_rx = re.compile('|'.join(
    ['\d\d\.\d\d\.', '\d\.\d\d\.\d\d','\d\d\.\d\d\.\d\d']))
with open(in_file2,"r") as fin2, open(out_file2, 'w') as fout3:
    for line in fin2:
        start = False
        if re.match(start_rx, line):
            start = True
        if start == False:
            print ('do something')
        if start == True:
            line = '\n' + line ## blank line before the item number
            line = line.replace(start_rx, start_rx + '\n')
        fout3.write(line)
First of all, to search and replace with a regex, you need to use re.sub, not str.replace.
Second, if you use re.sub, you can't use the regex pattern inside a replacement pattern, you need to group the parts of the regex you want to keep and use backreferences in the replacement (or, if you just want to refer to the whole match, use the \g<0> backreference, no capturing groups are required).
Third, when you build an unanchored alternation pattern, make sure longer alternatives come first, i.e. start_rx = re.compile('|'.join([r'\d\d\.\d\d\.\d\d', r'\d\.\d\d\.\d\d', r'\d\d\.\d\d\.'])). However, a single, more precise pattern written by hand works better here.
Here is how your code can be fixed:
import re

with open(in_file2,'r', encoding='utf-8') as fin2, open(out_file2, 'w', encoding='utf-8') as fout2:
    text = fin2.read()
    # Put every item number on its own line, preceded by a blank line
    fout2.write(re.sub(r'\s*(\d+(?:\.\d+)+\.?)\s*', r'\n\n\1\n', text))
See the Python demo
The pattern is
\s*(\d+(?:\.\d+)+\.?)\s*
See the regex demo
Details
\s* - 0+ whitespaces
(\d+(?:\.\d+)+\.?) - Group 1 (\1 in the replacement pattern):
    \d+ - 1+ digits
    (?:\.\d+)+ - 1 or more repetitions of . and 1+ digits
    \.? - an optional .
\s* - 0+ whitespaces
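The second and third points of this answer (the \g<0> backreference and the order of alternatives) can be seen in a small sketch. The variable names short_first, long_first and sample are only illustrative; the two patterns are the asker's alternatives in the original order and in the longest-first order.

```python
import re

# Same three alternatives, in the asker's order and in longest-first order.
short_first = re.compile(r"\d\d\.\d\d\.|\d\.\d\d\.\d\d|\d\d\.\d\d\.\d\d")
long_first = re.compile(r"\d\d\.\d\d\.\d\d|\d\.\d\d\.\d\d|\d\d\.\d\d\.")

sample = "12.13.14 Here is supposed to start a new line."

# The first alternative that matches wins, so the short one cuts the number off.
print(short_first.match(sample).group())  # 12.13.
print(long_first.match(sample).group())   # 12.13.14

# \g<0> re-inserts the whole match, so no capturing group is needed.
print(long_first.sub(r"\g<0>\n", sample, count=1))
```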
Try this
# text holds the contents read from in_file2
text = re.sub(r'(\d+) ', r'\1\n', text)
text = re.sub(r'(\w+)\.', r'\1.\n', text)

extracting certain values from a text file in python

I have a text file in the below format and I have to extract all range of motion and Location values. In some files, the value is given in the next line and in some, it is not given
File1.txt:
Functional Assessment: Patient currently displays the following functional
limitations and would benefit from treatment to maximize functional use and
pain reduction: Range of Motion: limited . ADLs: limited . Gait: limited .
Stairs: limited . Squatting: limited . Work participation status: limited .
Current Status: The patient's current status is improving.
Location: Right side
Expected output: limited | Right side
File2.txt:
Functional Assessment: Patient currently displays the following functional
limitations and would benefit from treatment to maximize functional use and
pain reduction:
Range of Motion:
painful
and
limited
Strength:
limited
Expected output: painful and limited | Not given
This is the code which I am trying:
if "Functional Assessment:" in line:
    result=str(line.rsplit('Functional Assessment:'))
    romvalue = result.rsplit('Range of Motion:')[-1].split()[0]
    outputfile.write(romvalue)
    partofbody = result.rsplit('Location:')[-1].split()[0]
    outputfile.write(partofbody)
I am not getting the output which I want with this code. Can someone please help.
You may collect all lines after a line that starts with Functional Assessment:, join them and use the following regex:
(?sm)\b(Location|Range of Motion):\s*([^\W_].*?)\s*(?=(?:\.\s*)?[^\W\d_]+:|\Z)
See the regex demo.
Details
(?sm) - re.S and re.M modifiers
\b - word boundary
(Location|Range of Motion) - Group 1: either Location or Range of Motion
:\s* - a colon and 0+ whitespaces
([^\W_].*?) - Group 2: a letter or digit ([^\W_]) followed by any 0+ chars, line breaks included because of re.S, as few as possible
\s* - 0+ whitespaces
(?=(?:\.\s*)?[^\W\d_]+:|\Z) - a positive lookahead that, immediately to the right of the current location, requires
    (?:\.\s*)? - an optional sequence of . and 0+ whitespaces
    [^\W\d_]+: - 1+ letters followed with :
    | - or
    \Z - end of string.
Here is a Python demo:
import re

reg = re.compile(r'\b(Location|Range of Motion):\s*([^\W_].*?)\s*(?=(?:\.\s*)?[^\W\d_]+:|\Z)', re.S | re.M)
for file in files:  # files: a list of strings, each holding the text of one file
    flag = False
    tmp = ""
    for line in file.splitlines():
        if line.startswith("Functional Assessment:"):
            tmp = tmp + line + "\n"
            flag = not flag
        elif flag:
            tmp = tmp + line + "\n"
    print(dict(list(reg.findall(tmp))))
Output (for the two texts you posted):
{'Location': 'Right side', 'Range of Motion': 'limited'}
{'Range of Motion': 'painful \nand\nlimited'}
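To get from these dictionaries to the "value | value" lines shown in the question, something like the following sketch may do. The function name summarize and the whitespace collapsing are my assumptions; the regex is the one from this answer, and "Not given" is the fallback text the question asks for.

```python
import re

REG = re.compile(
    r'\b(Location|Range of Motion):\s*([^\W_].*?)\s*(?=(?:\.\s*)?[^\W\d_]+:|\Z)',
    re.S | re.M,
)

def summarize(text: str) -> str:
    """Return 'range of motion | location', using 'Not given' for a missing key."""
    found = dict(REG.findall(text))
    # split() + join() folds the line breaks inside a multi-line value into single spaces.
    rom = " ".join(found.get("Range of Motion", "Not given").split())
    loc = " ".join(found.get("Location", "Not given").split())
    return f"{rom} | {loc}"
```

Here text is expected to be the part of a file that starts at the Functional Assessment: line, collected as in the demo above.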

Notepad++ add Suffix after each 5 lines

I have a text file containing a list of usernames (100,000+ lines), and I'd like to add a suffix after every 5 lines.
Example:
Username1
Username2
Username3
Username4
Username5 SUFFIX HERE!
Username6
Username7
Username8
Username9
Username10 SUFFIX HERE!
Username11
Username12
Username13
Username14
Username15 SUFFIX here!
Username16
... etc.
I've tried searching for ^(.+)$ and replacing with \1 suffixtext!, but that changes every line, while I only need every fifth line changed.
I also want to add a random number after the suffix.
Thank you,
regards.
You may use
^.*(?:\R.*){4}
And replace with $& SUFFIX 0.
Details:
^ - start of a line
.* - any 0+ chars other than line break chars
(?:\R.*){4} - exactly 4 occurrences of a line break (any style, \R) followed with any 0+ chars other than line break chars (.*).
The replacement contains a backreference to the whole match ($&) and then a number.
To later increment the numbers after SUFFIX, use a Python Script
cnt = 0
def incrementnum(match):
    global cnt
    cnt = cnt + 1
    return "{0}{1}".format(match.group(1), str(int(match.group(2))+cnt))
editor.rereplace(r'(SUFFIX )(\d+)$', incrementnum)
Just follow these instructions to use it in your NPP.
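Outside Notepad++, both steps (matching blocks of five lines and numbering the suffix) can be combined in one pass. This is only a plain-Python sketch that does not use the Notepad++ editor object: the function name add_suffix and its parameters are hypothetical, \n line endings are assumed because Python's re has no \R, and a running counter stands in for the number after the suffix.

```python
import re
from itertools import count

def add_suffix(text: str, suffix: str = "SUFFIX", group_size: int = 5) -> str:
    counter = count(1)  # running number appended after the suffix
    # One line plus (group_size - 1) further lines, i.e. ^.*(?:\n.*){4} for groups of five.
    pattern = re.compile(r"^.*(?:\n.*){%d}" % (group_size - 1), re.MULTILINE)
    # m.group(0) is the whole block of lines, the equivalent of $& in Notepad++.
    return pattern.sub(lambda m: f"{m.group(0)} {suffix} {next(counter)}", text)
```

A trailing block with fewer than five lines does not match and is left as it is.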