Is there a way to extract only a regex capture subgroup directly in the Promtail config file?
For example, I have a label that captures the hostname from a log line, but the label value is delivered with the literal "hostname" prefix included:
- match:
    selector: '{job="WAF"}'
    stages:
      - regex:
          expression: "(?P<HOSTNAME>hostname \"([^\"]+)\")"
      - labels:
          HOSTNAME:
Example of log line:
[:error] [pid 1331:tid 140180245612288][client XXXXXXXXXXXXXXXXXXX] ModSecurity: Access denied with code 403 (phase 2). Operator GE matched 5 at TX:anomaly_score. [file "XXXXXXXXXXXXXXXXXXX"] [line "XXXXXX"] [id "949110"] [msg "Inbound Anomaly Score Exceeded (Total Score: 5)"] [severity "XXXXXXXXXXXXX"][hostname "website2.lab.com"] [uri "/"] [unique_id "XXXXXXXXXXXXX"]
Example of Label:
[Image: labels example in Grafana]
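A likely cause is that the named group HOSTNAME wraps the whole hostname "..." text, so the label receives the prefix as well. The sketch below reproduces this with Python's re module, which shares the (?P<name>...) syntax with the Go RE2 engine Promtail uses. The adjusted pattern is an assumption and has not been run against Promtail itself; in the config it would correspond to expression: "hostname \"(?P<HOSTNAME>[^\"]+)\"".

```python
import re

# Sample line shortened from the log above.
line = '[severity "XXXXXXXXXXXXX"][hostname "website2.lab.com"] [uri "/"]'

# Original pattern: the named group wraps the whole `hostname "..."` text.
original = re.compile(r'(?P<HOSTNAME>hostname "([^"]+)")')

# Adjusted pattern (an assumption, not tested against Promtail itself):
# the literal prefix stays outside and only the quoted value is named.
adjusted = re.compile(r'hostname "(?P<HOSTNAME>[^"]+)"')

original_value = original.search(line).group("HOSTNAME")
adjusted_value = adjusted.search(line).group("HOSTNAME")

print(original_value)  # hostname "website2.lab.com"
print(adjusted_value)  # website2.lab.com
```

Moving the name onto the inner group means the label stage only ever sees the quoted value.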
I have two roughly formatted YAML files with key/value pairs in them. I imported the values of both files successfully into a running playbook using the include_vars module.
Now I want to compare the value of a key/value pair from file/list 1 against all of the keys of file/list 2. When there is a match, I want to print, and preferably save/register, the value of the matching key from file/list 2.
Essentially I am comparing a machine name to an IP list to try to grab the IP the machine needs out of that list. The name is "dynamic" and is different each time the playbook is run, as file/list 1 is always dynamically populated on each run.
Examples:
file/list 1 contents
machine_serial: m60
s_iteration: a
site_name: dud
t_number: '001'
file/list 2 contents
m51: 10.2.5.201
m52: 10.2.5.202
m53: 10.2.5.203
m54: 10.2.5.204
m55: 10.2.5.205
m56: 10.2.5.206
m57: 10.2.5.207
m58: 10.2.5.208
m59: 10.2.5.209
m60: 10.2.5.210
m61: 10.2.5.211
In a nutshell, I want to take the machine_serial key from file/list 1, whose value is currently m60, find the matching key in file/list 2, and then print and/or preferably register its value of 10.2.5.210.
What I've tried so far:
Playbook:
- name: IP gleaning comparison.
  hosts: localhost
  remote_user: ansible
  become: yes
  become_method: sudo
  vars:
    ansible_ssh_pipelining: yes
  tasks:
    - name: Try to do a variable import of the file1 file.
      include_vars:
        file: ~/active_ct-scanner-vars.yml
        name: ctfile1_vars
      become: no
    - name: Try to do an import of file2 file for lookup comparison to get an IP match.
      include_vars:
        file: ~/machine-ip-conversion.yml
        name: ip_vars
      become: no
    - name: Best, but failing attempt to get the value of the match-up IP.
      debug:
        msg: "{{ item }}"
      when: ctfile1_vars.machine_serial == ip_vars
      with_items:
        - "{{ ip_vars }}"
Every task except the final one works perfectly.
My failed output final task:
TASK [Best, but failing attempt to get the value of the match-up IP.] ***********************************************************************************
skipping: [localhost] => (item={'m51': '10.200.5.201', 'm52': '10.200.5.202', 'm53': '10.200.5.203', 'm54': '10.200.5.204', 'm55': '10.200.5.205', 'm56': '10.200.5.206', 'm57': '10.200.5.207', 'm58': '10.200.5.208', 'm59': '10.200.5.209', 'm60': '10.200.5.210', 'm61': '10.200.5.211'})
skipping: [localhost]
What I hoped for hasn't happened: the task is simply skipped and doesn't iterate over the list as I expected, so there must be a problem somewhere. Hopefully there is an easy solution that I just missed. What is the correct approach?
Given the files
shell> cat active_ct-scanner-vars.yml
machine_serial: m60
s_iteration: a
site_name: dud
t_number: '001'
shell> cat machine-ip-conversion.yml
m58: 10.2.5.208
m59: 10.2.5.209
m60: 10.2.5.210
m61: 10.2.5.211
Read the files
- include_vars:
    file: active_ct-scanner-vars.yml
    name: ctfile1_vars
- include_vars:
    file: machine-ip-conversion.yml
    name: ip_vars
Q: "Compare the machine name to an IP list and grab the IP."
A: Both variables ip_vars and ctfile1_vars are dictionaries. Use ctfile1_vars.machine_serial as the key into ip_vars:
match_up_IP: "{{ ip_vars[ctfile1_vars.machine_serial] }}"
gives
match_up_IP: 10.2.5.210
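The same idea in plain Python, as a minimal sketch: the two dictionaries below stand in for what include_vars loads, and the .get() fallback value "unknown" is a hypothetical addition, not part of the answer. It also shows why the condition in the question never holds: a string is being compared with an entire dictionary.

```python
# Stand-ins for the two dictionaries loaded by include_vars.
ctfile1_vars = {"machine_serial": "m60", "s_iteration": "a",
                "site_name": "dud", "t_number": "001"}
ip_vars = {"m58": "10.2.5.208", "m59": "10.2.5.209",
           "m60": "10.2.5.210", "m61": "10.2.5.211"}

# The question's condition compares a string with a whole dictionary,
# which can never be equal, so the task is skipped.
condition = ctfile1_vars["machine_serial"] == ip_vars

# Using the serial as a key returns the matching IP directly.
match_up_ip = ip_vars[ctfile1_vars["machine_serial"]]

# .get() avoids a KeyError when the serial is missing from the list
# (the "unknown" default is a hypothetical choice).
missing = ip_vars.get("m99", "unknown")

print(condition)    # False
print(match_up_ip)  # 10.2.5.210
print(missing)      # unknown
```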
Example of a complete playbook for testing
- hosts: localhost
  gather_facts: false
  vars:
    match_up_IP: "{{ ip_vars[ctfile1_vars.machine_serial] }}"
  tasks:
    - include_vars:
        file: active_ct-scanner-vars.yml
        name: ctfile1_vars
    - include_vars:
        file: machine-ip-conversion.yml
        name: ip_vars
    - debug:
        var: match_up_IP
I'm facing 2 problems:
Problem 1.
I'm trying to filter a list with the Jinja2 regex_search filter, but I also get None for items that don't match.
Problem 2.
Each element of the new list seems to be a list of one element (sigh!).
My code.
- name: Regex_Search Test
  hosts: localhost
  vars:
    my_list:
      - app-be-dev01-2
      - app-be-dev02-2
      - app-be-dev02-3
      - app-be-dev03-2
      - app-foo-2
      - app-be-dev04-1
      - app-be-dev04-2
  tasks:
    - name: Varsmng
      set_fact:
        customer_instances: >-
          {% for instance in my_list -%} {{ customer_instances | default([]) + [ instance | string | regex_search('app-be-(.*)-([0-9]*)', '\1' ) ] }}
          {%- endfor %}
    - name: Debug
      debug:
        msg:
          - "customer_instances: {{ customer_instances }}"
My output.
TASK [Varsmng] ****************************************************************************************************************************************
task path: /home/cin0633a/progetti/ansible/testenv/test.yml:19
ok: [localhost] => {
"ansible_facts": {
"customer_instances": "[[u'dev01']][[u'dev02']][[u'dev02']][[u'dev03']][None][[u'dev04']][[u'dev04']] "
},
"changed": false
As you can see, each element has double square brackets. Also, can I avoid the None values?
You get a list for each element because regex_search returns a list when you ask it for capture groups (the '\1' argument) instead of the whole match.
$ ansible localhost -m debug -e toto=bla-bli-blo -a "msg={{ toto | regex_search('(bla).*') }}"
localhost | SUCCESS => {
"msg": "bla-bli-blo"
}
$ ansible localhost -m debug -e toto=bla-bli-blo -a "msg={{ toto | regex_search('(bla).*', '\\1') }}"
localhost | SUCCESS => {
"msg": [
"bla"
]
}
And you get None values because some items do not match your regex.
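Both behaviours can be mirrored in plain Python as a rough sketch. The helper search_group is hypothetical and only imitates how regex_search behaves when given a group reference; it is not Ansible's implementation.

```python
import re

pattern = re.compile(r"app-be-(.*)-([0-9]*)")


def search_group(value, compiled, group=1):
    """Hypothetical stand-in: a one-item list on a match, None otherwise."""
    match = compiled.search(value)
    if match is None:
        # No match at all, which is where the None entries come from.
        return None
    # A group was requested, so the result is wrapped in a list.
    return [match.group(group)]


hit = search_group("app-be-dev01-2", pattern)
miss = search_group("app-foo-2", pattern)

print(hit)   # ['dev01']
print(miss)  # None
```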
IMO a better approach is to use dedicated filters rather than a complex Jinja2 template. The following playbook:
- name: Regex_Search Test
  hosts: localhost
  gather_facts: false
  vars:
    my_list:
      - app-be-dev01-2
      - app-be-dev02-2
      - app-be-dev02-3
      - app-be-dev03-2
      - app-foo-2
      - app-be-dev04-1
      - app-be-dev04-2
    searchreg: >-
      app-be-(.*)-([0-9]*)
    my_filtered_list: >-
      {{
        my_list |
        select('regex', searchreg) |
        map('regex_replace', searchreg, '\1')
      }}
  tasks:
    - debug:
        var: my_filtered_list
Gives:
PLAY [Regex_Search Test] *****************************************************************************
TASK [debug] *****************************************************************************
ok: [localhost] => {
"my_filtered_list": [
"dev01",
"dev02",
"dev02",
"dev03",
"dev04",
"dev04"
]
}
PLAY RECAP *****************************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Note 1: you can append the unique filter to the chain if you want each environment to appear only once.
Note 2: if you are using Ansible < 2.10, you will have to add the list filter at the end of the chain to get the actual result.
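For comparison, here is a minimal Python sketch of what the select / regex_replace chain does, including the de-duplication from Note 1. It imitates the filters with the re module and is not how Ansible implements them.

```python
import re

my_list = [
    "app-be-dev01-2", "app-be-dev02-2", "app-be-dev02-3", "app-be-dev03-2",
    "app-foo-2", "app-be-dev04-1", "app-be-dev04-2",
]
searchreg = re.compile(r"app-be-(.*)-([0-9]*)")

# select('regex', ...) keeps only the items the pattern matches,
# so non-matching entries never produce a None.
matching = [item for item in my_list if searchreg.search(item)]

# map('regex_replace', ..., '\1') rewrites each item to its first group.
my_filtered_list = [searchreg.sub(r"\1", item) for item in matching]

# Order-preserving de-duplication, comparable to piping through unique.
unique_envs = list(dict.fromkeys(my_filtered_list))

print(my_filtered_list)  # ['dev01', 'dev02', 'dev02', 'dev03', 'dev04', 'dev04']
print(unique_envs)       # ['dev01', 'dev02', 'dev03', 'dev04']
```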
I'm trying to read the contents of value.txt, which I think works well in Ansible. The file looks exactly like this:
[ec2-user#ip-192-168-1-45]$ cat value.txt
this-is-the-value
I would like to use that value as the replacement for the "CHANGE_ME" keywords in file.txt:
[ec2-user#ip-192-168-1-45]$ cat file.txt
asdasdh kajsdlkjasdlk CHANGE_ME ajsdlkjasdlkjasd
asdkjhakjsd: CHANGE_ME
jasdlkjadsl{
aksldjlkasd: CHANGE_ME
}
I'm using this Ansible playbook to combine the two steps, but it doesn't seem to replace "CHANGE_ME" when I verify file.txt afterwards:
- name: check output
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: cat the file
      shell: "cat value.txt"
      register: cat_value
    - debug: var=cat_value.stdout
    - name: modify file.txt
      replace:
        regexp: "{{ cat_value.stdout }}"
        replace: "CHANGE_ME"
        path: "{{ playbook_dir }}/file.txt"
The OUTPUT goes like this
[ec2-user#ip-192-168-1-45]$ ansible-playbook ansible.yml
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
PLAY [check output] ******************************************************************************************************************************************************************************
TASK [cat the file] ******************************************************************************************************************************************************************************
changed: [localhost]
TASK [debug] *************************************************************************************************************************************************************************************
ok: [localhost] => {
"cat_value.stdout": "this-is-the-value"
}
TASK [modify file.txt] ***************************************************************************************************************************************************************************
ok: [localhost]
PLAY RECAP ***************************************************************************************************************************************************************************************
localhost : ok=3 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
[ec2-user#ip-192-168-1-45]$ cat file.txt
asdasdh kajsdlkjasdlk CHANGE_ME ajsdlkjasdlkjasd
asdkjhakjsd: CHANGE_ME
jasdlkjadsl{
aksldjlkasd: CHANGE_ME
}
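One probable explanation is that the two arguments are swapped: the replace module searches for regexp and writes replace in its place, so the playbook looks for "this-is-the-value" (which is not in the file) and would insert "CHANGE_ME". The sketch below imitates that with Python's re.sub on a shortened copy of file.txt; it is an illustration of the search/replace direction, not the module's code.

```python
import re

value = "this-is-the-value"
file_text = (
    "asdasdh kajsdlkjasdlk CHANGE_ME ajsdlkjasdlkjasd\n"
    "asdkjhakjsd: CHANGE_ME\n"
)

# As written in the playbook: search for the value, insert the placeholder.
# The value never occurs in the file, so nothing changes ("ok", not "changed").
as_written = re.sub(re.escape(value), "CHANGE_ME", file_text)

# Swapped: search for the placeholder, insert the value.
swapped = re.sub("CHANGE_ME", value, file_text)

print(as_written == file_text)  # True
print(swapped)
```

In the playbook this would mean setting regexp to "CHANGE_ME" and replace to "{{ cat_value.stdout }}".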
Context
I have a list of files and directories to be deleted.
This is obtained from the lines starting with "*deleting" in an rsync stdout.
rsync stdout_lines: [
"building file list ... done",
"*deleting Lab02/Ex2/Doc1.txt",
"*deleting Lab02/Ex2/",
"*deleting Lab02/Ex1/Doc1.txt",
"*deleting Lab02/Ex1/",
"*deleting Lab02/",
".d..t...... ./",
"*deleting Lab01/Ex2/Doc1.txt",
"*deleting Lab01/Ex2/",
"*deleting Lab01/Ex1/Doc2.txt",
"*deleting Lab01/Ex1/Doc1.txt",
".d..t...... Lab01/",
".d..t...... Lab01/Ex1/",
"sent 350 bytes received 191 bytes 360.67 bytes/sec",
"total size is 614 speedup is 1.13 (DRY RUN)"
]
Formatted using:
'{{ sync_return.stdout_lines | select("regex", "^[*]deleting") | map("regex_replace", "^[*]deleting", "") | map("regex_replace", " ", "") | list }}'
A primitive example of the format of this list is as follows:
formatted list: [
"Lab02/Ex2/Doc1.txt",
"Lab02/Ex2/",
"Lab02/Ex1/Doc1.txt",
"Lab02/Ex1/",
"Lab02/",
"Lab01/Ex2/Doc1.txt",
"Lab01/Ex2/",
"Lab01/Ex1/Doc2.txt",
"Lab01/Ex1/Doc1.txt"
]
In an attempt to hasten the process of deleting (by reducing the number of elements to iterate over) - I separated the list into 2 sub-lists:
A list of directories. (elements of the main list that end in '/')
'{{ items_to_delete | select("regex", "/$") | list }}'
A list of file paths. (elements whose containing directory does not get deleted)
'{{ items_to_delete | reject("match", item) | list }}'
The sub-lists for the example above would be...
directories to delete: [
"Lab02/Ex2/",
"Lab02/Ex1/",
"Lab02/",
"Lab01/Ex2/"
]
files to delete: [
"Lab01/Ex1/Doc2.txt",
"Lab01/Ex1/Doc1.txt"
]
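The split described above can be sketched in plain Python as follows. This only mirrors the intent of the two expressions (directories end in '/', files are kept when no deleted directory contains them); it is not the Jinja2 code itself.

```python
items_to_delete = [
    "Lab02/Ex2/Doc1.txt", "Lab02/Ex2/", "Lab02/Ex1/Doc1.txt", "Lab02/Ex1/",
    "Lab02/", "Lab01/Ex2/Doc1.txt", "Lab01/Ex2/",
    "Lab01/Ex1/Doc2.txt", "Lab01/Ex1/Doc1.txt",
]

# Directories are the entries that end in '/'.
directories = [p for p in items_to_delete if p.endswith("/")]

# Files are kept only when no deleted directory is a prefix of their path.
files = [
    p for p in items_to_delete
    if not p.endswith("/") and not any(p.startswith(d) for d in directories)
]

print(directories)  # ['Lab02/Ex2/', 'Lab02/Ex1/', 'Lab02/', 'Lab01/Ex2/']
print(files)        # ['Lab01/Ex1/Doc2.txt', 'Lab01/Ex1/Doc1.txt']
```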
The Problem
Whilst the current solution works, it's not the best it could be. The dream is to have a solution where the "directories to delete" list only contains the highest level directories possible. i.e. Since we know the directory "Lab02/" is being deleted, "directories to delete" will NOT contain "Lab02/Ex2/" or "Lab02/Ex1/".
I believe my goal is somewhat similar to the os.path.commonprefix Python function, except that it must be applied across the many different file paths within the list.
I'm relatively new to Ansible, so any guidance/help with this matter would be greatly appreciated.
I won't ask why you want to implement that, and I'll take it as an exercise.
The idea is to sort the directories alphabetically and then, while looping over the paths, drop any path that starts with the last one you kept.
You can write your filter like this (put it in the filter_plugins directory):
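def common_paths(paths=None):
    """Return only the top-level paths, dropping any path nested under another."""
    result = []
    # After sorting, every nested path directly follows its parent, so it is
    # enough to compare each path with the last one that was kept.
    for path in sorted(paths or []):
        # Plain prefix test; directories are expected to end in '/'.
        if result and path.startswith(result[-1]):
            continue
        result.append(path)
    return result


class FilterModule(object):
    def filters(self):
        return {'common_paths': common_paths}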
then:
- name: Filter
  set_fact:
    bar: "{{ foo | common_paths }}"
Test locally with:
---
- hosts: localhost
  tasks:
    - name: Test data
      set_fact:
        foo:
          - 'Lab01/'
          - 'Lab01/Ex5/'
          - 'Ex2/foo3/'
          - 'Ex2/foo2/'
          - 'Ex2/'
          - 'Lab03/Ex5/e/'
          - 'Lab02/y/z/Lab01/1/'
          - 'Lab02/y/z/Lab01/3/'
          - 'Lab01/Ex5/Lab02/'
          - 'Lab03/Ex5/d/1/'
    - name: Filter
      set_fact:
        bar: "{{ foo | common_paths }}"
Output:
$ ansible-playbook common_paths.yml -vvv
ansible-playbook 2.10.4
PLAYBOOK: common_paths.yml *********************************************************************************
1 plays in common_paths.yml
PLAY [localhost] *********************************************************************************
TASK [Gathering Facts] *********************************************************************************
ok: [localhost]
TASK [Test data] *********************************************************************************
task path: /home/guido/Development/git/ansible-local/common_paths.yml:5
ok: [localhost] => {
"ansible_facts": {
"foo": [
"Lab01/",
"Lab01/Ex5/",
"Ex2/foo3/",
"Ex2/foo2/",
"Ex2/",
"Lab03/Ex5/e/",
"Lab02/y/z/Lab01/1/",
"Lab02/y/z/Lab01/3/",
"Lab01/Ex5/Lab02/",
"Lab03/Ex5/d/1/"
]
},
"changed": false
}
TASK [Filter] *********************************************************************************
task path: /home/guido/Development/git/ansible-local/common_paths.yml:19
ok: [localhost] => {
"ansible_facts": {
"bar": [
"Ex2/",
"Lab01/",
"Lab02/y/z/Lab01/1/",
"Lab02/y/z/Lab01/3/",
"Lab03/Ex5/d/1/",
"Lab03/Ex5/e/"
]
},
"changed": false
}
PLAY RECAP *********************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
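One caveat: the filter above compares raw string prefixes, so an entry such as 'Lab0' would be treated as a parent of 'Lab01/'. If that matters, a variation that compares whole path components could look like the sketch below. The function name top_level_paths is hypothetical and this is a different technique from the answer's filter, shown standalone so it can be tried outside Ansible.

```python
from pathlib import PurePosixPath


def top_level_paths(paths):
    """Keep only paths that are not nested under another path in the list."""
    kept = []  # (components, original string) for each path retained so far
    # Sorting by components places every parent before its children.
    for path in sorted(paths, key=lambda p: PurePosixPath(p).parts):
        parts = PurePosixPath(path).parts
        # Compare whole components, so 'Lab0' is not a parent of 'Lab01/'.
        if any(parts[:len(kept_parts)] == kept_parts for kept_parts, _ in kept):
            continue
        kept.append((parts, path))
    return [original for _, original in kept]


foo = [
    "Lab01/", "Lab01/Ex5/", "Ex2/foo3/", "Ex2/foo2/", "Ex2/",
    "Lab03/Ex5/e/", "Lab02/y/z/Lab01/1/", "Lab02/y/z/Lab01/3/",
    "Lab01/Ex5/Lab02/", "Lab03/Ex5/d/1/",
]
bar = top_level_paths(foo)
print(bar)
```

On the test data this keeps the same six directories as the output shown above.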