augeas & puppet & whitespace - regex

I try to edit the file limits.com with puppet. I need to add lines this files. I copy the example from:
http://docs.puppetlabs.com/guides/augeas.html
# /etc/puppet/modules/limits/manifests/conf.pp
define limits::conf (
$domain = "root",
$type = "soft",
$item = "nofile",
$value = "10000"
) {
# guid of this entry
$key = "$domain/$type/$item"
# augtool> match /files/etc/security/limits.conf/domain[.="root"][./type="hard" and ./item="nofile" and ./value="10000"]
$context = "/files/etc/security/limits.conf"
$path_list = "domain[.=\"$domain\"][./type=\"$type\" and ./item=\"$item\"]"
$path_exact = "domain[.=\"$domain\"][./type=\"$type\" and ./item=\"$item\" and ./value=\"$value\"]"
augeas { "limits_conf/$key":
context => "$context",
onlyif => "match $path_exact size != 1",
changes => [
# remove all matching to the $domain, $type, $item, for any $value
"rm $path_list",
# insert new node at the end of tree
"set domain[last()+1] $domain",
# assign values to the new node
"set domain[last()]/type $type",
"set domain[last()]/item $item",
"set domain[last()]/value $value",
],
}
and
use case:
limits::conf {
# maximum number of open files/sockets for root
"root-soft": domain => root, type => soft, item => nofile, value => 9999;
"root-hard": domain => root, type => hard, item => nofile, value => 9999;
}
The result inside of limits.conf is:
##student - maxlogins 4
root hard nofile 9999
root soft nofile 9999
How I can put whitespace or tabs bettwen the values "root hard nofile & value"
I try put the whitespace inside "", try with regex but doesn't works.
Thanks

You can't. Augeas manages spaces automatically, and there is no way to control them manually.

Elaborating on Raphink's correct answer: While augeas works with files in a semantic fashion (it knows what the content represents), there are alternatives if you care for the appearance more.
In your case, it might make more sense to manage your file through a template and take fine grained control of the result.

Related

How to update values dynamically for the individual match sections within sshd config file using puppet

i am able to update the value to the sections "User foo" and "Host *.example.net" by passing the index. If i pass index 1 or 2 the respective value is getting updated.
my code:
$sections = ['Host *.example.net', 'User foo']
$sections.each |String $section| {
sshd_config_match { "${section}":
ensure => present,
}
}
$settings = [['User foo', 'X11Forwarding yes', 'banner none'],['Host *.example.net', 'X11Forwarding no', 'banner none']]
$settings.each |Array $setting| {
$setting_array = split($setting[1],/ /)
sshd_config { "${setting_array[0]} ${setting[0]}":
ensure => present,
key => "${setting_array[0]}",
condition => "${setting[0]}",
value => "${setting_array[1]}",
}
}
current result:
Match Host *.example.net
# Created by Puppet
X11Forwarding no
Match User foo
# Created by Puppet
X11Forwarding yes
Expected results:
Match Host *.example.net
# Created by Puppet
X11Forwarding no
Banner none
Match User foo
# Created by Puppet
X11Forwarding yes
Banner none
i am able to update only one value mentioned in the index but am looking a way to update more or all the values mentioned in the list.
It's not clear what module is providing your sshd_config_match and sshd_config resource types, nor, therefore, exactly what they do. Nevertheless, if we consider this code ...
$settings = [['User foo', 'X11Forwarding yes', 'banner none'],['Host *.example.net', 'X11Forwarding no', 'banner none']]
$settings.each |Array $setting| {
$setting_array = split($setting[1],/ /)
sshd_config { "${setting_array[0]} ${setting[0]}":
ensure => present,
key => "${setting_array[0]}",
condition => "${setting[0]}",
value => "${setting_array[1]}",
}
}
... we can see that each element of $settings is a three-element array, of which the each call accesses only those at indexes 0 and 1. That seems to match up with the result you see, which does not contain anything corresponding to the data from the elements at index 2.
You could iterate over the inner $setting elements, starting at index 1, instead of considering that element only, but I would suggest instead restructuring the data more naturally, and writing code suited to the restructured data. You have data of mixed significance in your arrays, and you are needlessly jamming keys and values together such that you need to spend effort to break them back apart. Structuring the data as a hash of hashes instead of an array of arrays could be a good start:
$settings = {
'User foo' => { 'X11Forwarding' => 'yes', 'banner' => 'none'},
'Host *.example.net' => { 'X11Forwarding' => 'no', 'banner' => 'none'},
}
Not only does that give you much enhanced readability (mostly from formatting), but it also affords much greater usability. To wit, although I'm guessing a bit here, you should be able to do something similar to the following:
$settings.each |String $condition, Hash $properties| {
$properties.each |String $key, String $value| {
sshd_config { "${condition} ${key}":
ensure => 'present',
condition => $condition,
key => $key,
value => $value,
}
}
}
Again, greater readability, this time largely from a helpful choice of names, and along with it greater clarity that something like this is in fact the right structure for the code (supposing that I have correctly inferred enough about the types you are using).

Concatenate strings in file_line resource

I'm trying to concatenate a string in a puppet manifest like so:
file_line {'Append to /etc/hosts':
ensure => present,
line => "${networking['ip'] + '\t'}${networking['fqdn'] + '\t'}${networking['hostname']}",
match => "${'^#?'+ networking['ip'] + '\s+' + networking['fqdn'] + '\s+' + networking['hostname']}",
path => '/etc/hosts'
}
I either get syntax errors or in the case above:
The value '' cannot be converted to Numeric
Which I'm guessing means that it doesn't like the plus operator.
So how do I interpolate the strings in the match and line attributes?
The problem here is that the operator + is restricted to only Numeric types (documentation). It cannot be used with String types. However, the spacing and regular expressions can still be used as normal without attempting a string concatenate. These merely need to be placed outside the variable interpolation. Therefore:
file_line { 'Append to /etc/hosts':
ensure => present,
line => "${networking['ip']}\t${networking['fqdn']}\t${networking['hostname']}",
match => "^#?${networking['ip']}\s+${networking['fqdn']}\s+${networking['hostname']}",
path => '/etc/hosts'
}
should resolve your issues with the type mismatch and the + operator.
i know you already got your answer, but you can make life easier using host to manage /etc/hosts:
host {
"localhost": ip => "127.0.0.1";
"$::puppet_server":
ip => "1.2.3.4",
host_aliases => [ "puppetserver" ],
;
"dns-01.tld":
ip => "1.2.3.5",
host_aliases => [ "dns-01" ],
;
"dns-02.tld":
ip => "1.2.3.6",
host_aliases => [ "dns-02" ],
;
}

How to process multiline log entry with logstash filter?

Background:
I have a custom generated log file that has the following pattern :
[2014-03-02 17:34:20] - 127.0.0.1|ERROR| E:\xampp\htdocs\test.php|123|subject|The error message goes here ; array (
'create' =>
array (
'key1' => 'value1',
'key2' => 'value2',
'key3' => 'value3'
),
)
[2014-03-02 17:34:20] - 127.0.0.1|DEBUG| flush_multi_line
The second entry [2014-03-02 17:34:20] - 127.0.0.1|DEBUG| flush_multi_line Is a dummy line, just to let logstash know that the multi line event is over, this line is dropped later on.
My config file is the following :
input {
stdin{}
}
filter{
multiline{
pattern => "^\["
what => "previous"
negate=> true
}
grok{
match => ['message',"\[.+\] - %{IP:ip}\|%{LOGLEVEL:loglevel}"]
}
if [loglevel] == "DEBUG"{ # the event flush line
drop{}
}else if [loglevel] == "ERROR" { # the first line of multievent
grok{
match => ['message',".+\|.+\| %{PATH:file}\|%{NUMBER:line}\|%{WORD:tag}\|%{GREEDYDATA:content}"]
}
}else{ # its a new line (from the multi line event)
mutate{
replace => ["content", "%{content} %{message}"] # Supposing each new line will override the message field
}
}
}
output {
stdout{ debug=>true }
}
The output for content field is : The error message goes here ; array (
Problem:
My problem is that I want to store the rest of the multiline to content field :
The error message goes here ; array (
'create' =>
array (
'key1' => 'value1',
'key2' => 'value2',
'key3' => 'value3'
),
)
So i can remove the message field later.
The #message field contains the whole multiline event so I tried the mutate filter, with the replace function on that, but I'm just unable to get it working :( .
I don't understand the Multiline filter's way of working, if someone could shed some light on this, it would be really appreciated.
Thanks,
Abdou.
I went through the source code and found out that :
The multiline filter will cancel all the events that are considered to be a follow up of a pending event, then append that line to the original message field, meaning any filters that are after the multiline filter won't apply in this case
The only event that will ever pass the filter, is one that is considered to be a new one ( something that start with [ in my case )
Here is the working code :
input {
stdin{}
}
filter{
if "|ERROR|" in [message]{ #if this is the 1st message in many lines message
grok{
match => ['message',"\[.+\] - %{IP:ip}\|%{LOGLEVEL:loglevel}\| %{PATH:file}\|%{NUMBER:line}\|%{WORD:tag}\|%{GREEDYDATA:content}"]
}
mutate {
replace => [ "message", "%{content}" ] #replace the message field with the content field ( so it auto append later in it )
remove_field => ["content"] # we no longer need this field
}
}
multiline{ #Nothing will pass this filter unless it is a new event ( new [2014-03-02 1.... )
pattern => "^\["
what => "previous"
negate=> true
}
if "|DEBUG| flush_multi_line" in [message]{
drop{} # We don't need the dummy line so drop it
}
}
output {
stdout{ debug=>true }
}
Cheers,
Abdou
grok and multiline handling is mentioned in this issue https://logstash.jira.com/browse/LOGSTASH-509
Simply add "(?m)" in front of your grok regex and you won't need mutation. Example from issue:
pattern => "(?m)<%{POSINT:syslog_pri}>(?:%{SPACE})%{GREEDYDATA:message_remainder}"
The multiline filter will add the "\n" to the message. For example:
"[2014-03-02 17:34:20] - 127.0.0.1|ERROR| E:\\xampp\\htdocs\\test.php|123|subject|The error message goes here ; array (\n 'create' => \n array (\n 'key1' => 'value1',\n 'key2' => 'value2',\n 'key3' => 'value3'\n ),\n)"
However, the grok filter can't parse the "\n". Therefore you need to substitute the \n to another character, says, blank space.
mutate {
gsub => ['message', "\n", " "]
}
Then, grok pattern can parse the message. For example:
"content" => "The error message goes here ; array ( 'create' => array ( 'key1' => 'value1', 'key2' => 'value2', 'key3' => 'value3' ), )"
Isn't the issue simply the ordering of the filters. Order is very important to log stash. You don't need another line to indicate that you've finished outputting multiline log line. Just ensure multiline filter appears first before the grok (see below)
P.s. I've managed to parse a multiline log line fine where xml was appended to end of log line and it spanned multiple lines and still I got a nice clean xml object into my content equivalent variable (named xmlrequest below). Before you say anything about logging xml in logs... I know... its not ideal... but that's for another debate :)):
filter {
multiline{
pattern => "^\["
what => "previous"
negate=> true
}
mutate {
gsub => ['message', "\n", " "]
}
mutate {
gsub => ['message', "\r", " "]
}
grok{
match => ['message',"\[%{WORD:ONE}\] \[%{WORD:TWO}\] \[%{WORD:THREE}\] %{GREEDYDATA:xmlrequest}"]
}
xml {
source => xmlrequest
remove_field => xmlrequest
target => "request"
}
}

Perl taint mode with domain name input for CGI resulting in “Insecure dependency in eval”

Given the following in a CGI script with Perl and taint mode I have not been able to get past the following.
tail /etc/httpd/logs/error_log
/usr/local/share/perl5/Net/DNS/Dig.pm line 906 (#1)
(F) You tried to do something that the tainting mechanism didn't like.
The tainting mechanism is turned on when you're running setuid or
setgid, or when you specify -T to turn it on explicitly. The
tainting mechanism labels all data that's derived directly or indirectly
from the user, who is considered to be unworthy of your trust. If any
such data is used in a "dangerous" operation, you get this error. See
perlsec for more information.
[Mon Jan 6 16:24:21 2014] dig.cgi: Insecure dependency in eval while running with -T switch at /usr/local/share/perl5/Net/DNS/Dig.pm line 906.
Code:
#!/usr/bin/perl -wT
use warnings;
use strict;
use IO::Socket::INET;
use Net::DNS::Dig;
use CGI;
$ENV{"PATH"} = ""; # Latest attempted fix
my $q = CGI->new;
my $domain = $q->param('domain');
if ( $domain =~ /(^\w+)\.(\w+\.?\w+\.?\w+)$/ ) {
$domain = "$1\.$2";
}
else {
warn("TAINTED DATA SENT BY $ENV{'REMOTE_ADDR'}: $domain: $!");
$domain = ""; # successful match did not occur
}
my $dig = new Net::DNS::Dig(
Timeout => 15, # default
Class => 'IN', # default
PeerAddr => $domain,
PeerPort => 53, # default
Proto => 'UDP', # default
Recursion => 1, # default
);
my #result = $dig->for( $domain, 'NS' )->to_text->rdata();
#result = sort #result;
print #result;
I normally use Data::Validate::Domain to do checking for a “valid” domain name, but could not deploy it in a way in which the tainted variable error would not occur.
I read that in order to untaint a variable you have to pass it through a regex with capture groups and then join the capture groups to sanitize it. So I deployed $domain =~ /(^\w+)\.(\w+\.?\w+\.?\w+)$/. As shown here it is not the best regex for the purpose of untainting a domain name and covering all possible domains but it meets my needs. Unfortunately my script is still producing tainted failures and I can not figure out how.
Regexp-Common does not provide a domain regex and modules don’t seem to work with untainting variable so I am at a loss now.
How to get this thing to pass taint checking?
$domain is not tainted
I verified that your $domain is not tainted. This is the only variable you use that could be tainted, in my opinion.
perl -T <(cat <<'EOF'
use Scalar::Util qw(tainted);
sub p_t($) {
if (tainted $_[0]) {
print "Tainted\n";
} else {
print "Not tainted\n";
}
}
my $domain = shift;
p_t($domain);
if ($domain =~ /(^\w+)\.(\w+\.?\w+\.?\w+)$/) {
$domain = "$1\.$2";
} else {
warn("$domain\n");
$domain = "";
}
p_t($domain);
EOF
) abc.def
It prints
Tainted
Not tainted
What Net::DNS::Dig does
See Net::DNS::Dig line 906. It is the beginning of to_text method.
sub to_text {
my $self = shift;
my $d = Data::Dumper->new([$self],['tobj']);
$d->Purity(1)->Deepcopy(1)->Indent(1);
my $tobj;
eval $d->Dump; # line 906
…
From new definition I know that $self is just hashref containing values from new parameters and several other filled in the constructor. The evaled code produced by $d->Dump is setting $tobj to a deep copy of $self (Deepcopy(1)), with correctly set self-references (Purity(1)) and basic pretty-printing (Indent(1)).
Where is the problem, how to debug
From what I found out about &Net::DNS::Dig::to_text, it is clear that the problem is at least one tainted item inside $self. So you have a straightforward way to debug your problem further: after constructing the $dig object in your script, check which of its items is tainted. You can dump the whole structure to stdout using print Data::Dumper::Dump($dig);, which is roughly the same as the evaled code, and check suspicious items using &Scalar::Util::tainted.
I have no idea how far this is from making Net::DNS::Dig work in taint mode. I do not use it, I was just curious and wanted to find out, where the problem is. As you managed to solve your problem otherwise, I leave it at this stage, allowing others to continue debugging the issue.
As resolution to this question if anyone comes across it in the future it was indeed the module I was using which caused the taint checks to fail. Teaching me an important lesson on trusting modules in a CGI environment. I switched to Net::DNS as I figured it would not encounter this issue and sure enough it does not. My code is provided below for reference in case anyone wants to accomplish the same thing I set out to do which is: locate the nameservers defined for a domain within its own zone file.
#!/usr/bin/perl -wT
use warnings;
use strict;
use IO::Socket::INET;
use Net::DNS;
use CGI;
$ENV{"PATH"} = ""; // Latest attempted fix
my $q = CGI->new;
my $domain = $q->param('domain');
my #result;
if ( $domain =~ /(^\w+)\.(\w+\.?\w+\.?\w+)$/ ) {
$domain = "$1\.$2";
}
else {
warn("TAINTED DATA SENT BY $ENV{'REMOTE_ADDR'}: $domain: $!");
$domain = ""; # successful match did not occur
}
my $ip = inet_ntoa(inet_aton($domain));
my $res = Net::DNS::Resolver->new(
nameservers => [($ip)],
);
my $query = $res->query($domain, "NS");
if ($query) {
foreach my $rr (grep { $_->type eq 'NS' } $query->answer) {
push(#result, $rr->nsdname);
}
}
else {
warn "query failed: ", $res->errorstring, "\n";
}
#result = sort #result;
print #result;
Thanks for the comments assisting me in this matter, and SO for teaching more then any other resource I have come across.

Turn set of urls in to a regex pattern (optional patterns)

Using an arbitrary set of urls (eg: http://api.longurl.org/v2/services) what is the best way to turn this list into a regex?
Is this appropriate regex?
(((easyuri|eepurl|eweri)\.com)|((migre|mke|myloc)\.me)|etc...)'
Can you do multiple levels of optional patterns like that?
I see different ways to accomplish this.
Use XPath and try to select a node given the current URL.
Parse the xml into a dictionary and test your current URL if it exists as a key.
Store the domains of the XML in a database, index the url field and query your current URL.
If performance is not an issue: Match the current URL against the entire XML file as text.
Perhaps there are more ideas.
Building a regex from the XML does not seem to me a good idea since all the other solutions appear to me far more easy to develop.
OP'S ANSWER:
Well it turns out that this does work:
/((?:easyuri|eepurl|eweri)\.com)|((?:migre|mke|myloc)\.me)/
Run against this:
easyuri.com eepurl.comer eweri.us migre.me mke.memo myloc.em
You get this:
[0] => Array
(
[0] => easyuri.com
[1] => eepurl.com
[2] => migre.me
[3] => mke.me
)
But the easiest way would just be something like this:
/0rz\.tw|1link\.in|1url\.com|2\.gp|2big\.at|etc\.\.\./
Regex helps you complicate things more than is possible with other methods. ;P
Here's the PHP I eventually used to create the regex:
Assumes that you have cURL'd http://api.longurl.org/v2/services and converted the xml to an array called $urlShorteners like: $urlShorteners = array('0rz.tw', '1link.in', 'etc...');
foreach($urlShorteners as $url) {
$urls[] = array_reverse(explode('.', $url));
}
foreach($urls as $url) {
$tldKeys[array_shift($url)][] = $url;
}
foreach($tldKeys as $tld => $doms) {
if($tld != '') {
$subPattern = array();
foreach($doms as $subDomain) {
$subPattern[] = implode("\.", array_reverse($subDomain));
}
if (count($subPattern) > 1) $optionPattern[] = "((?:" . implode("|", $subPattern) . ")\." . $tld . ")";
else $optionPattern[] = "(" . $subPattern[0] . "\." . $tld . ")";
}
}
$regex = '/' . implode('|', $optionPattern) . '/';
echo $regex . "\n";