Regex: yaml finding match - regex

I am trying to write a regex that matches all root-level YAML entries together with their values. Only entries that have a value should match, and nested entries should not match at all. I have been experimenting with it, but to no avail. Thanks!
So with this example:
metadata:
  url: "https://www.google.com"
booleanvalue: 'false'
tls:
  host:
    google_net: "google.net"
  secret:
    big_secert_net: "cert"
API_HOST: 'https://api.test.com'
DOMAIN: 'api.domain'
METRIC_ENVIRONMENT: 'test'
Regex would return this match:
booleanvalue: 'false'
API_HOST: 'https://api.test.com'
DOMAIN: 'api.domain'
METRIC_ENVIRONMENT: 'test'

grep -P '^[^\r\n]+:[^\S\r\n]+[^[{\r\n][^\r\n]*$' test.yaml
What I'm doing (grep needs -P here, because -E does not understand \r, \n or \S inside a bracket expression):
^[^\r\n]+: Match the key at the beginning of the line.
[^\S\r\n]+ Match inline whitespace (some implementations provide \h for this). There must be at least one whitespace character after the colon.
[^[{\r\n][^\r\n]*$ Match the content. Ensure it starts with something that is not [ or { on the same line (those would start nested YAML objects). Then, match everything until the end of the line.
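The pieces above can be tried out in Python, whose re module accepts the same escapes. This is only a sketch: the sample is a shortened copy of the question's YAML with assumed two-space indentation, and ROOT_ONLY is a hypothetical variant (not part of the answer) that additionally requires the key to start in the first column, since the pattern as quoted also accepts indented lines.

```python
import re

# Pattern from the answer: key, colon, inline whitespace, then a scalar value.
# The "[" inside the character class is escaped to keep Python's re module quiet.
ANY_LEVEL = re.compile(r'^[^\r\n]+:[^\S\r\n]+[^\[{\r\n][^\r\n]*$', re.MULTILINE)

# Assumed variant: the key may not start with whitespace (or "#"),
# so indented, nested entries are skipped.
ROOT_ONLY = re.compile(r'^[^\s#][^:\r\n]*:[^\S\r\n]+[^\[{\r\n][^\r\n]*$', re.MULTILINE)

SAMPLE = """\
metadata:
  url: "https://www.google.com"
booleanvalue: 'false'
tls:
  host:
    google_net: "google.net"
API_HOST: 'https://api.test.com'
"""


def root_entries(text):
    """Return every root-level 'key: value' line found in text."""
    return ROOT_ONLY.findall(text)


if __name__ == "__main__":
    for entry in root_entries(SAMPLE):
        print(entry)
```

With the indented sample, ANY_LEVEL also returns the url and google_net lines, while ROOT_ONLY returns only the booleanvalue and API_HOST lines.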

Related

How to replace all occurrences of one word with another using Regular Expression

I am trying to replace a tag value in a web.config file using a regex in an Ansible playbook.
This is my sample file.
TXWebSocketHandler="Data =localhost;Catalog Name=catalogname;User =user;key=key;
TXWebSocketHandler="Data =localhost;Catalog Name=catalogname;User =user;key=key;
TXWebSocketHandler="Data =localhost;Catalog Name=catalogname;User =user;key=key;
My Desired output should be
TXWebSocketHandler="Data =127.0.0.1;Catalog Name=catalogname;User =user;key=key;
TXWebSocketHandler="Data =127.0.0.1;Catalog Name=catalogname;User =user;key=key;
TXWebSocketHandler="Data =127.0.0.1;Catalog Name=catalogname;User =user;key=key;
Every occurrence of localhost should be replaced with 127.0.0.1.
The playbook I have used is:
- name: replace_config
  community.windows.win_lineinfile:
    path: 'D:\Apps\project\web.config'
    regexp: /localhost/g
    line: 127.0.0.1
With this I get the following output:
TXWebSocketHandler="Data =localhost;Socket Name=Socketname;User =user;key=key;
TXWebSocketHandler="Data =localhost;Socket Name=Socketname;User =user;key=key;
TXWebSocketHandler="Data =localhost;Socket Name=Socketname;User =user;key=key;
127.0.0.1
The substitution value does not replace localhost; instead it is appended at the end of the file. When I try the same expression in a regex tester, it works the way I want.
Is it an issue with the regex, or am I missing an argument in this Ansible playbook? Please suggest a way to replace all occurrences of one word with another.
Single-line restriction
The module community.windows.win_lineinfile seems to work on single lines only:
Ensure a particular line is in a file, or replace an existing line using a back-referenced regular expression
From Synopsis:
This is primarily useful when you want to change a single line in a file only.
Still, you could use backrefs: true to replace a single line, see the docs (note the restriction to the last matching line):
If the regexp does match, the last matching line will be replaced by the expanded line parameter.
See also
Ansible lineinfile module syntax error when using multiple variables in string.
In your playbook:
- name: replace_config
  community.windows.win_lineinfile:
    path: 'D:\Apps\project\web.config'
    backrefs: true
    regexp: '^(.*)localhost(.*)$'
    line: '$1 127.0.0.1 $2'
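This should replace the matched localhost with the loopback IP 127.0.0.1 and put the captured parts before it (backreferenced by $1) and after it (backreferenced by $2) back in.
Note: The spaces around the IP address end up in the output. Once the task works, remove them and write the references as ${1} and ${2}, so that the group number cannot run into the digits of the address.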
Replace as built-in module
As seshadri_c suggested:
ansible.builtin.replace module can be used:
Replace all instances of a particular string in a file using a back-referenced regular expression
See Synopsis:
This module will replace all instances of a pattern within a file.
From the docs, adapt the first example "Replace old hostname with new hostname" to your use-case:
- name: Replace old hostname with new hostname (requires Ansible >= 2.4)
  ansible.builtin.replace:
    path: D:/Apps/project/web.config
    regexp: '(.*)localhost(.*)$'
    replace: '\1 127.0.0.1 \2'
Note: I replaced the backslashes (the Windows path separator) with forward slashes (safer with Python). The spaces after \1 and before \2 are there because \1 followed directly by the digits of the IP address would be read as a different group number. To drop them, use the unambiguous form \g<1>127.0.0.1\g<2>, or see the other example in the docs, "Explicitly specifying named matched groups", for another way.
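Since the replace module is documented as using Python regular expressions in MULTILINE mode, the substitution can be tried in plain Python first. The following is a minimal sketch under that assumption; CONFIG is a shortened copy of the sample file and replace_host is a hypothetical helper name.

```python
import re

# Two lines copied from the sample web.config in the question.
CONFIG = (
    'TXWebSocketHandler="Data =localhost;Catalog Name=catalogname;User =user;key=key;\n'
    'TXWebSocketHandler="Data =localhost;Catalog Name=catalogname;User =user;key=key;\n'
)


def replace_host(text, new_host="127.0.0.1"):
    """Replace localhost on every line, keeping the text around it."""
    # \g<1> and \g<2> cannot be confused with the digits of the address,
    # whereas "\1" directly followed by "127" would name another group.
    replacement = r'\g<1>' + new_host + r'\g<2>'
    return re.sub(r'(.*)localhost(.*)$', replacement, text, flags=re.MULTILINE)


if __name__ == "__main__":
    print(replace_host(CONFIG), end="")
```

Because (.*) is greedy, this form touches only the last localhost on each line. If a line could contain several, matching the bare word localhost without any groups would replace all of them.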

Get multiple lines using regex with Ansible

I have been trying to modify some files with Ansible but I do not have the right regex.
The goal is to modify a set of files and change everything between <Factory /> and </Factory> as "not register". As an example
I want to change this:
<Factory />
Replacement set
Madrid
</Factory>
to this:
<Factory />
Not register
</Factory>
What I have is the following:
- hosts: all
  tasks:
    - name: replace factory registration
      ansible.builtin.replace:
        path: /home/clientDatabase.xml
        regex: {'(?<=<Factory />.*?(?=</Factory>)', multiline = True}
        replace: 'Not register'
I have tried several expressions and this is the closest I have got. It works perfectly in Notepad++ if you enable regular expressions and check the ". matches newline" box, but it does not do anything in Ansible.
My understanding is that from (?<=) to (?=) it gets everything in between (.*), zero or one time (?), and checks multiple lines to cover the whole structure (multiline = True).
I have also tried \R for carriage returns and line breaks, as well as ^ and $, but none of my attempts worked and I am running out of ideas.
Could someone give me any hints here?
Here are some resources I think helped me the most:
https://docs.ansible.com/ansible/latest/collections/ansible/builtin/replace_module.html
https://w3.unpocodetodo.info/utiles/regex.php
Update:
In the end I followed your suggestion and used [^<]* to match everything except the "<" character, and it worked perfectly. The parenthesis was a typo, sorry.
The final result is:
- hosts: all
  tasks:
    - name: replace factory registration
      ansible.builtin.replace:
        path: /home/clientDatabase.xml
        regexp: '(?<=<Factory />)[^<]*'
        replace: 'Not register'
As I understand it, this replaces all content after <Factory /> up to the first <. With this pattern, neither the multiline nor the dotall flag is needed.
I suggest the following regex which allows words, spaces and newline between the tags <Factory /> and </Factory>.
Please note that the string is on 2 lines because it contains a newline as part of the pattern.
(?<=<Factory \/>)[\w\s
]*(?=<\/Factory>)
This could also be written on one line:
(?<=<Factory \/>)[\w\s\n\r]*(?=<\/Factory>)
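A quick way to check the lookaround pattern outside Ansible is a few lines of Python. This sketch uses the character class [\w\s] from the answer (in Python, \s already covers \n and \r); the XML string is the example from the question, and mark_unregistered is a hypothetical helper name.

```python
import re

# Example content from the question.
XML = "<Factory />\nReplacement set\nMadrid\n</Factory>\n"

# The lookbehind and lookahead keep both tags out of the match.
# The character class spans newlines, so no DOTALL flag is required.
BETWEEN_TAGS = re.compile(r'(?<=<Factory />)[\w\s]*(?=</Factory>)')


def mark_unregistered(text):
    """Replace whatever sits between the two tags with 'Not register'."""
    return BETWEEN_TAGS.sub("\nNot register\n", text)


if __name__ == "__main__":
    print(mark_unregistered(XML), end="")
```

Running the function a second time leaves the text unchanged, because the replacement text is matched and replaced by itself.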

Ansible: Insert word in GRUB cmdline

I'd like to use Ansible's lineinfile or replace module in order to add the word splash to the cmdline in GRUB.
It should work for all the following examples:
Example 1:
Before: GRUB_CMDLINE_DEFAULT=""
After: GRUB_CMDLINE_DEFAULT="splash"
Example 2:
Before: GRUB_CMDLINE_DEFAULT="quiet"
After: GRUB_CMDLINE_DEFAULT="quiet splash"
Example 3:
Before: GRUB_CMDLINE_DEFAULT="quiet nomodeset"
After: GRUB_CMDLINE_DEFAULT="quiet nomodeset splash"
The post Ansible: insert a single word on an existing line in a file explained well how this could be done without quotes. However, I can't get it to insert the word within the quotes.
What is the required entry in the Ansible role or playbook in order to add the word splash to the cmdline as shown?
You can do this without shelling out, using two lineinfile tasks.
In your example you're searching for splash:
- name: check if splash is configured in the boot command
  lineinfile:
    backup: true
    path: /etc/default/grub
    regexp: '^GRUB_CMDLINE_LINUX=".*splash'
    state: absent
  check_mode: true
  register: grub_cmdline_check
  changed_when: false
- name: insert splash if missing
  lineinfile:
    backrefs: true
    path: /etc/default/grub
    regexp: "^(GRUB_CMDLINE_LINUX=\".*)\"$"
    line: '\1 splash"'
  when: grub_cmdline_check.found == 0
  notify: update grub
The trick is to ask lineinfile to remove the line if it contains splash, but only as a dry run (check_mode: true), so nothing is actually deleted. If the term was found (found > 0), the line does not need updating. If it was not found, we need to insert it, so we append it at the end of the value using the backreference.
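The same check-then-append logic can be sketched in Python to see how the two regular expressions interact. This only illustrates the idea and is not how lineinfile is implemented; ensure_word and its parameters are hypothetical names.

```python
import re


def ensure_word(text, word="splash", key="GRUB_CMDLINE_LINUX"):
    """Append word inside the quoted value of key unless it is already there."""
    # Step 1: the check (the dry-run task in the playbook).
    already_there = re.search(
        rf'^{re.escape(key)}=".*{re.escape(word)}', text, re.MULTILINE
    )
    if already_there:
        return text
    # Step 2: capture everything up to the closing quote and re-emit it
    # with the word appended (the backrefs task in the playbook).
    return re.sub(
        rf'^({re.escape(key)}=".*)"$', rf'\1 {word}"', text, flags=re.MULTILINE
    )


if __name__ == "__main__":
    print(ensure_word('GRUB_CMDLINE_LINUX="quiet"\n'), end="")
```

For an empty value the result has a leading space inside the quotes, just as the playbook version would produce.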
Inspired by Adam's answer, I use this one to enable IOMMU:
- name: Enable IOMMU
  ansible.builtin.lineinfile:
    path: /etc/default/grub
    regexp: '^GRUB_CMDLINE_LINUX_DEFAULT="((?:(?!intel_iommu=on).)*?)"$'
    line: 'GRUB_CMDLINE_LINUX_DEFAULT="\1 intel_iommu=on"'
    backup: true
    backrefs: true
  notify: update-grub
Please note that I had to set backrefs to true for the \1 reference to work; otherwise the captured group was not substituted.
Idempotency works fine as well.
EDIT: Please note that this snippet only works with an Intel CPU and might need to be adapted to fit your platform.
A possible solution is the definition of two entries as follows:
- name: "Checking GRUB cmdline"
  shell: "grep 'GRUB_CMDLINE_LINUX_DEFAULT=.*splash.*' /etc/default/grub"
  register: grub_cfg_grep
  changed_when: false
  failed_when: false
- name: "Configuring GRUB cmdline"
  replace:
    path: '/etc/default/grub'
    regexp: '^GRUB_CMDLINE_LINUX_DEFAULT="((\w.?)*)"$'
    replace: 'GRUB_CMDLINE_LINUX_DEFAULT="\1 splash"'
  when: '"splash" not in grub_cfg_grep.stdout'
Explanation: We first check whether the splash keyword is present in the required line using grep. Since grep exits with a non-zero return code when the string is not found, we suppress the error using failed_when: false. The output of grep is saved to the grub_cfg_grep variable.
Next, we bind the replace module to the condition that the keyword splash is not in the standard output of grep. The regular expression takes the old content in the quotes and adds the splash keyword behind it.
Note: In the case of an empty string before the execution, the result reads " splash" (with a space in front) but it is still a valid cmdline.
The difficulty is this line in the replace module page: "It is up to the user to maintain idempotence by ensuring that the same pattern would never match any replacements made." (https://docs.ansible.com/ansible/latest/modules/replace_module.html#id4) It's easy to insert the item but actually quite tricky to make it idempotent, so the target file doesn't grow every time you run the task.
I found a way to do it in one shot with the replace module. You should be able to adapt this. My task checks the GRUB_CMDLINE_LINUX_DEFAULT line for "vt.default_red" and inserts some colour codes if not found.
My method was to copy-and-paste various nearly-there examples into the regex tester website and fiddle until it worked. I still don't grok the result, but it worked in my tests at https://www.regextester.com/ and it works in my playbook.
One problem I had was that Ansible's regex implementation apparently doesn't support conditionals, which gave me odd errors for a while.
- name: colours | configured grub command
  replace:
    path: /etc/default/grub
    regexp: '^GRUB_CMDLINE_LINUX_DEFAULT="((?:(?!vt\.default_red).)*?)"$'
    replace: 'GRUB_CMDLINE_LINUX_DEFAULT="\1 vt.default_red=0xee,..."'
The regex matches the literal string (GRUB_CMDLINE_LINUX_DEFAULT= and a double quote mark) at the start and the double quote mark at the end. Deconstructing the rest...
( - open capture group #1 (creates backref #1)
(?: - open a non-capturing group (the ?: right after the parenthesis is what makes it non-capturing)
(?! - negative lookahead (i.e. don't match if the following string comes next)
vt\.default_red - the string to look for, literal dot is escaped
) - close negative lookahead
.) - match a single character, which is safe to consume because the lookahead has just confirmed that the forbidden string does not start here, and close the non-capturing group
* - try to match the non-capturing group zero or more times
? - ... lazily (i.e. get the smallest possible match)
) - close capture group #1
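The breakdown above can be checked in Python, which is the regex engine the replace module relies on. This is a small sketch using splash as the word to insert; PATTERN and add_splash are hypothetical names, and the pattern uses the non-capturing form (?:...).

```python
import re

# "Check, then consume one character": (?!splash) refuses to step onto a
# position where the word starts, and the dot then consumes one character.
# Writing "(:?" instead of "(?:" would open a capturing group that merely
# starts with an optional colon.
PATTERN = re.compile(
    r'^GRUB_CMDLINE_LINUX_DEFAULT="((?:(?!splash).)*?)"$', re.MULTILINE
)


def add_splash(text):
    """Append splash to the quoted value unless the value already contains it."""
    return PATTERN.sub(r'GRUB_CMDLINE_LINUX_DEFAULT="\1 splash"', text)


if __name__ == "__main__":
    line = 'GRUB_CMDLINE_LINUX_DEFAULT="quiet nomodeset"\n'
    print(add_splash(line), end="")
    print(add_splash(add_splash(line)), end="")
```

The second call changes nothing: once splash is inside the quotes, the lookahead prevents the group from reaching the closing quote, so the pattern no longer matches. That is what makes the single task idempotent.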
Alternatively, you can call perl from Ansible:
- name: Change items in the file
  ansible.builtin.command:
    cmd: perl -pi -e 's/DEFAULT="/DEFAULT="splash /' /etc/default/grub
Another way of looking at it. This is an old conversation, but it is still relevant.

regex filter in replace module for ansible

I am trying to get a regex replace working in an Ansible role for updating autoscale settings.
In my CFT I have the following mapping:
DevRegionSettings:
  us-east-1:
    primaryZone: us-east-1a
    # secondaryZone: us-east-1b
    # autoscale is wrong at point of instantiation
    amiAutoscale: ami-234sefsrwerwer21
    amiDB: ami-12313123
    amiCoord: ami-12312312
    amiWeb: ami-13123123
    amiWorker: ami-12312312
I want to replace just the value of amiAutoscale with the latest ami that I find earlier on in the role.
I'm a regex noob and cannot figure it out for the life of me.
Been playing around with some of the regex from this thread:
Regex to match key in YAML
But still cant get it to do what I want :(
Any help would be appreciated!
The ansible task I had running was as follows:
- name: Replacing ami in the Dev Cloudformation Template
  replace:
    regexp: '(^\s*(?P<key>\w+_amiAutoscale):\s*(?P<value>\d+))'
    replace: "{{ latest_ami.image_id }}"
    path: "$path_to_cft.yaml"
So a couple of issues with your regex:
\w+_amiAutoscale - The line amiAutoscale: ami-234sefsrwerwer21 does not have an _ before amiAutoscale
(?P<value>\d+) - ami-234sefsrwerwer21 is not a sequence of digits.
This worked for me, but may be too open of a pattern: (^\s*(?P<key>amiAutoscale):\s*(?P<value>.+))
Example: https://regex101.com/r/76VGlJ/1
- name: Replacing ami in the Dev Cloudformation Template
  replace:
    regexp: '(^\s*(?P<key>amiAutoscale):\s*(?P<value>.+))'
    replace: "{{ latest_ami.image_id }}"
    path: "$path_to_cft.yaml"
Your regex does not match because it expects one or more word characters followed by an underscore (\w+_) right after the leading whitespace (^\s*), and the data does not contain those.
Also, the named capturing group (?P<value>\d+) matches one or more digits, which does not match ami-234sefsrwerwer21.
For your example data, you could also use just two capturing groups, with a character class in the second group to specify what you allow as a value:
^\s*(?P<key>amiAutoscale)\s*:\s*(?P<value>[\w-]+)
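To see the effect of such a pattern on the mapping, here is a small Python sketch. It assumes the key and indentation should survive the substitution, so the key group is widened to include the indentation and the colon and is re-emitted in the replacement; replacing the whole match with only the new ID would drop the key. The function name, the shortened CFT sample and the new AMI ID are made up for illustration.

```python
import re

# Shortened copy of the mapping from the question (indentation assumed).
CFT = """\
DevRegionSettings:
  us-east-1:
    primaryZone: us-east-1a
    amiAutoscale: ami-234sefsrwerwer21
    amiDB: ami-12313123
"""

# key: indentation, key name and colon; value: the AMI ID to swap out.
AMI_LINE = re.compile(
    r'^(?P<key>[ \t]*amiAutoscale[ \t]*:[ \t]*)(?P<value>[\w-]+)', re.MULTILINE
)


def set_autoscale_ami(text, new_ami):
    """Swap only the value of amiAutoscale, keeping key and indentation."""
    # A function replacement avoids any escaping issues in new_ami.
    return AMI_LINE.sub(lambda m: m.group('key') + new_ami, text)


if __name__ == "__main__":
    print(set_autoscale_ami(CFT, "ami-0example"), end="")
```

In an Ansible replace task the same idea would be expressed with a group reference such as \g<key> in the replace string, written in single quotes so that YAML leaves the backslash alone.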

Ansible lineinfile duplication using insertafter

I am trying to add an entry to my /etc/hosts file using Ansible's lineinfile. The logic I want is: if the entry 127.0.0.1 mysite.local is found, do nothing; otherwise insert it after the line starting with 127.0.1.1.
127.0.0.1 localhost
127.0.1.1 mypc
127.0.0.1 mysite.local
I have the insert after part working but it appears the actual regex search is failing to find the existing entry so I keep getting duplication of the insertion of 127.0.0.1 mysite.local
The docs do say;
When modifying a line the regexp should typically match both the initial state of the line as well as its state after replacement by line to ensure idempotence.
But I'm not sure how that applies to my regex. Currently my play is;
- name: Add the site to hosts
  lineinfile:
    path: /etc/hosts
    # Escape special chars
    regex: "^{{ domain|regex_escape() }}"
    line: "127.0.0.1 {{ domain }}"
    insertafter: '127\.0\.1\.1'
    firstmatch: yes
  become: yes
where domain is mysite.local.
I have looked at this answer but I'm pretty sure I cannot use backrefs since the docs state;
This flag changes the operation of the module slightly; insertbefore and insertafter will be ignored, and if the regexp doesn't match anywhere in the file, the file will be left unchanged.
I have tried;
regex: '127\.0\.0\.1\s+?{{ domain|regex_escape() }}'
With no luck either
It seems that firstmatch: yes was breaking things. It worked for me with the following task (I replaced the space with a tab for a tidier look, but spaces work as well):
- name: Add the site to hosts
  lineinfile:
    path: /etc/hosts
    # Escape special chars
    regexp: "{{ domain|regex_escape() }}"
    line: "127.0.0.1{{ '\t' }}{{ domain }}"
    insertafter: '127\.0\.1\.1'
According to this link, lineinfile scans the file and applies the regex one line at a time, meaning you cannot use a regex that looks through the whole file. I am unfamiliar with the lineinfile tool, but if you can use the "replace" tool used in the link above then you can use the following Python regex to match as you need:
\A((?:(?!127\.0\.0\.1\s)[\s\S])*?)(?:\Z|127\.0\.0\.1\s+(?!{{ domain|regex_escape() }})\S+\n|(127\.0\.1\.1\s+\S+(?![\s\S]*\n127\.0\.0\.1\s)\n))
With the substitution: "\1\2127.0.0.1 {{ domain }}\n"
The non-capturing group handles three distinct cases:
Case 1: 127.0.1.1 and 127.0.0.1 don't exist so insert at end
Case 2: 127.0.0.1 exists with a different host so replace the entry
Case 3: 127.0.1.1 exists so insert after it
It is the second case that tackles idempotence by avoiding matching an entry for "127.0.0.1" if one already exists.
The doc says:
insertafter: ... If regular expressions are passed to both regexp and insertafter, insertafter is only honored if no match for regexp is found.
The regex in the task expands to
regex: ^mysite\.local
This regex is not found because no line begins with "mysite.local". Hence insertafter is honored and "line" is inserted after 127.0.1.1.
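The difference is easy to reproduce in Python. The sketch below assumes that the regexp is tested against each line separately, as described above, and uses re.escape as a stand-in for the regex_escape filter; line_matches and the two pattern variables are hypothetical names.

```python
import re

# The three lines of /etc/hosts from the question.
HOSTS = ["127.0.0.1 localhost", "127.0.1.1 mypc", "127.0.0.1 mysite.local"]
domain = "mysite.local"


def line_matches(pattern, lines):
    """Return the lines in which the pattern is found."""
    return [line for line in lines if re.search(pattern, line)]


# What the task in the question expands to: anchored at the start of the
# line, where the IP address sits, so it can never match.
anchored_on_domain = "^" + re.escape(domain)

# Anchoring on the IP address first matches the existing entry.
anchored_on_ip = r"^127\.0\.0\.1\s+" + re.escape(domain) + "$"

if __name__ == "__main__":
    print(line_matches(anchored_on_domain, HOSTS))
    print(line_matches(anchored_on_ip, HOSTS))
```

The first call returns an empty list, which is the situation where insertafter takes over and a duplicate is added on every run. The second call finds the existing entry, so the line would be updated in place instead.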