How to replace a key value string with specific string in fluentd - replace

I am new to fluentd. I want to use record_modifier to replace a string whenever a specific string occurs in a key's value.
Example
1) input:
{"message":"how are you"}
output:
{"message":"who are you"}
When the value of the key contains "how", I want to replace that string with "who".
2) input:
{"message":"he is a bad boy"}
output:
{"message":"he is a good boy"}
When the value of the key contains "bad", I want to replace that string with "good".
#type record_modifier
#auto_typecast true
key message
expression ?????
replace ?????
Thanks in advance.

You can use gsub in a record_transformer filter with enable_ruby. The example below uses the message key from the question:
<filter yourTag>
  @type record_transformer
  enable_ruby
  <record>
    message ${record["message"].gsub('how', 'who')}
  </record>
</filter>

Related

How to extract value using xslt 1.0

How to extract number 500 using xslt 1.0 from a large text
"o:errorCode" : "500"
Perhaps you're looking for:
<xsl:value-of select="substring-before(substring-after($largeText, '"o:errorCode" : "'), '"')"/>
This extracts the substring between the first occurrence of "o:errorCode" : " in $largeText and the first " that follows it.
Note that this will fail if the formatting of the input changes.
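For comparison, the same substring-after / substring-before logic can be sketched outside XSLT. The Python helper below is only an illustration: the function name between and the sample text are assumptions, not part of the original answer. Like the XPath expression, it returns an empty string when either marker is missing.

```python
def between(text: str, start: str, end: str) -> str:
    """Return the text between the first `start` and the next `end`, or ''."""
    _, found_start, rest = text.partition(start)
    if not found_start:
        return ""
    value, found_end, _ = rest.partition(end)
    return value if found_end else ""


# Hypothetical sample input; the real $largeText is not shown in the question.
large_text = '{ "o:errorCode" : "500", "title" : "Server error" }'
error_code = between(large_text, '"o:errorCode" : "', '"')
print(error_code)
```

As with the XSLT version, this depends on the exact spacing around the colon.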

Regex in XML using PHP: find certain value of a certain xml tag

I am trying to get the value of a certain attribute in a certain XML tag with a regex but can't get it right. Does anyone have an idea how to do it?
The XML looks like this:
<OTA_PingRQ>
<Errors>
<Error Code="101" Type="4" Status="NotProcessed" ShortText="Authentication refused">Authentication : login failed</Error>
</Errors>
</OTA_PingRQ>
I'd like to match only the value of ShortText inside the Error tag.
In the end it should give me "Authentication refused" back.
What I've tried so far is a lookbehind combined with a lookahead, but the lookbehind doesn't let me use quantifiers of non-fixed width. Something like this: (?<=<Error .).*?(?=>).
Can someone tell me how to match only the value of ShortText (inside the Error tag)?
You didn't specify the language you're using, so I'll give the solution in PHP; the pattern itself carries over to most other regex engines with little or no change.
Here is the regex you're looking for:
#\<Error Code\=\"[0-9]+\" Type\=\"[0-9]+\" Status\=\"NotProcessed\" ShortText\=\"([a-z 0-9]+)\"\>#is
Concrete PHP usage:
$yourOriginalString = '
<OTA_PingRQ>
<Errors>
<Error Code="101" Type="4" Status="NotProcessed" ShortText="Authentication refused">Authentication : login failed</Error>
</Errors>
</OTA_PingRQ>' ;
preg_match_all('#\<Error Code\=\"[0-9]+\" Type\=\"[0-9]+\" Status\=\"NotProcessed\" ShortText\=\"([a-z 0-9]+)\"\>#is', $yourOriginalString, $result) ;
print_r($result) ;
The regex function fills $result with an array containing:
[0] => Array
(
[0] => <Error Code="101" Type="4" Status="NotProcessed" ShortText="Authentication refused">
)
[1] => Array
(
[0] => Authentication refused
)
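[0] is the full match.
[1] lists the content of the first capturing group; each () pair in the regex adds one such entry.
Some explanation of the regex:
Type\=\"[0-9]+\"
This assumes the value of Type can change but is always a number.
ShortText\=\"([a-z 0-9]+)\"
This captures a string of letters, digits and spaces. If you need other characters, extend the character class, for example:
[a-z 0-9\!\-]+
which also matches ! and -.
#is
# is the pattern delimiter; i and s are flags. i makes the match case-insensitive, and s lets . match line breaks (this pattern contains no ., so only i has an effect here).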
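As a cross-check, the same capture-group idea can be sketched in Python. This is an illustrative variant rather than the answer's code: it assumes the attribute value never contains a double quote, and it uses [^"]* so that any other character is accepted without listing it in a character class.

```python
import re

xml_text = '''<OTA_PingRQ>
<Errors>
<Error Code="101" Type="4" Status="NotProcessed" ShortText="Authentication refused">Authentication : login failed</Error>
</Errors>
</OTA_PingRQ>'''

# Find an <Error ...> start tag and capture the ShortText attribute value.
# [^>]* keeps the search inside the start tag; [^"]* stops at the closing quote.
pattern = re.compile(r'<Error\b[^>]*\bShortText="([^"]*)"')

match = pattern.search(xml_text)
short_text = match.group(1) if match else None
print(short_text)
```

For anything beyond a quick extraction from predictable input, an XML parser is usually more robust than a regex.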

How do I match a string which contains a specific string using regex lazily?

I would like to use a regex in Python to match the smallest string that contains a specific string (a lazy match), but I haven't figured out how to do so.
For instance, in the following example, how do I return just '<tag1>some text<tag2>some other text</tag2></tag1>'
and not the whole string?
#!/bin/python3
import re
pattern = r'(<([a-zA-Z0-9]+?)\b[^>]*>.*?<tag2>some other text</tag2>.*?</\2>)'
text = '<root> <tag1>some text<tag2>some other text</tag2></tag1> </root>'
print(re.search(pattern, text, re.DOTALL).group(1))
The code above prints <root> <tag1>some text<tag2>some other text</tag2></tag1> </root>, but I want it to print <tag1>some text<tag2>some other text</tag2></tag1>.
Of course, all of this assumes that any tag can appear in place of tag1.
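It turns out the solution is quite simple. Put a greedy .* in front of the group: it pushes the start of the group as far to the right as possible, so group 1 holds the smallest enclosing element. Here is the regex that works: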
.*(<([a-zA-Z0-9]+?)\b[^>]*>.*?<tag2>some other text</tag2>.*?</\2>).*
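A short sketch of how this pattern behaves on the sample text from the question may help. The variable names are illustrative; the pattern and the text are the ones shown above, split across two string literals only for readability.

```python
import re

text = '<root> <tag1>some text<tag2>some other text</tag2></tag1> </root>'

# The leading greedy .* first swallows the whole string and then backtracks,
# so the engine tries the rightmost possible start for group 1 first.
pattern = (
    r'.*(<([a-zA-Z0-9]+?)\b[^>]*>.*?'
    r'<tag2>some other text</tag2>.*?</\2>).*'
)

match = re.search(pattern, text, re.DOTALL)
innermost = match.group(1) if match else None
print(innermost)
```

Note that the wanted text is in group 1; group 0 is still the whole string because of the surrounding .* parts.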

Xpath search for duplicate

I have the following xml:
<log>
<logentry revision="11956">
<author>avijendran</author>
<date>2013-05-20T10:25:19.678089Z</date>
<msg>
JIRA-1263 - did something
</msg>
</logentry>
<logentry revision="11956">
<author>avijendran</author>
<date>2013-05-20T10:25:19.678089Z</date>
<msg>
JIRA-1263 - did something 22 again
</msg>
</logentry>
</log>
I want to ignore any occurrence of JIRA-1263 after the first one.
The XPath I am trying is shown below. It works if the duplicate nodes directly follow each other, but duplicates that appear elsewhere (further down) are not detected:
<xsl:variable name="uniqueList" select="//msg[not(normalize-space(substring-before(., '&#10;')) = normalize-space(substring-before(following::msg, '&#10;')))]" />
If you want every matching msg, use //msg[starts-with(normalize-space(.), 'JIRA-1263')], which returns both "JIRA-1263 - did something" and "JIRA-1263 - did something 22 again".
If you want any element that satisfies the same condition, use //*[starts-with(normalize-space(.), 'JIRA-1263')], which gives the same result as the previous expression for this input.
Finally, if you want only the first msg that satisfies the condition, use //logentry/msg[starts-with(normalize-space(.), 'JIRA-1263')][not(preceding::msg)], which returns "JIRA-1263 - did something".
You can define a key at the top level of your stylesheet that groups log entries by their first word:
<xsl:key name="logentryByCode" match="logentry"
use="substring-before(normalize-space(msg), ' ')" />
Now you need to select all logentry elements where either
the msg does not start with JIRA-nnnn (where nnnn is a number), or
this entry is the first one whose msg starts with this word (i.e. the first occurrence of "JIRA-1234 - anything" for each ticket number)
(note that these two conditions need not be mutually exclusive):
<xsl:variable name="uniqueList" select="log/logentry[
(
not(
starts-with(normalize-space(msg), 'JIRA-') and
boolean(number(substring-before(substring(normalize-space(msg), 6), ' ')))
)
)
or
(
generate-id() = generate-id(key('logentryByCode',
substring-before(normalize-space(msg), ' '))[1])
)
]/msg" />
The boolean(number(...)) part checks whether a string of text can be parsed as a valid non-zero number (the text in this case being the part of the first word of the message that follows JIRA-), and the generate-id trick is a special case of the technique known as Muenchian grouping.
Equally, you could group the msg elements instead of the logentry elements, using match="msg" in the key definition and normalize-space(.) instead of normalize-space(msg).
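To make the selection rule concrete, here is a minimal Python sketch of the same idea: keep every msg that is not a ticket message, plus the first msg for each ticket code. It is not a translation of the stylesheet. The function name unique_messages and the regular expression are assumptions made for illustration, and edge cases differ slightly (for example, the XPath test treats JIRA-0 as a non-ticket because number('0') is zero).

```python
import re
import xml.etree.ElementTree as ET

LOG_XML = """<log>
<logentry revision="11956">
<author>avijendran</author>
<date>2013-05-20T10:25:19.678089Z</date>
<msg>
JIRA-1263 - did something
</msg>
</logentry>
<logentry revision="11956">
<author>avijendran</author>
<date>2013-05-20T10:25:19.678089Z</date>
<msg>
JIRA-1263 - did something 22 again
</msg>
</logentry>
</log>"""

# Assumed shape of a ticket code: "JIRA-" followed by digits.
TICKET = re.compile(r"JIRA-\d+")


def unique_messages(xml_text):
    """Keep non-ticket messages and the first message per ticket code."""
    seen = set()
    kept = []
    for msg in ET.fromstring(xml_text).iter("msg"):
        text = " ".join((msg.text or "").split())  # like normalize-space()
        first_word = text.split(" ", 1)[0]
        if TICKET.fullmatch(first_word):
            if first_word in seen:
                continue  # a later occurrence of the same ticket
            seen.add(first_word)
        kept.append(text)
    return kept


print(unique_messages(LOG_XML))
```

The seen set plays the role of the key lookup: an entry is kept only if it is the first one recorded for its code, wherever the duplicates appear in the document.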
Here is another interpretation of what you are trying to do:
find the first logentry for each JIRA-XXXX code.
If this is right, try this:
log/logentry[
starts-with(normalize-space(msg), 'JIRA-') and
not
(
substring-before( normalize-space(msg), ' ')= substring-before( normalize-space(preceding::msg), ' ')
)]
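This finds any logentry whose msg starts with JIRA- and that has no preceding entry with the same substring before the first space (JIRA-XXXX in your example). Be aware that in XPath 1.0, normalize-space(preceding::msg) converts the node-set to a string using only its first node in document order, so each entry is compared with that one msg rather than with every earlier msg.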

Regex to fetch xml node string value

I have some output from which I'd like to fetch the value of the CMEngine node, i.e. everything inside the CMEngine node. Please help me with a regex. I already have Java code in place that uses the regex, so I just need the regex. Thanks.
My XML:
<General>
<LanguageID>en_US</LanguageID>
<CMEngine>
<CMServer/> <!-- starting here -->
<DaysToKeepHistory>4</DaysToKeepHistory>
<PreprocessorMaxBuf>5000000</PreprocessorMaxBuf>
<ServiceRefreshInterval>30</ServiceRefreshInterval>
<ReuseMemoryBetweenRequests>true</ReuseMemoryBetweenRequests>
<Trace Enabled="false">
<ActiveCategories>
<Category>ENVIRONMENT</Category>
<Category>EXEC</Category>
<Category>EXTERNALS</Category>
<Category>FILESYSTEM</Category>
<Category>INPUT_DOC</Category>
<Category>INTERFACES</Category>
<Category>NETWORKING</Category>
<Category>OUTPUT_DOC</Category>
<Category>PREPROCESSOR_INPUT</Category>
<Category>REQUEST</Category>
<Category>SYSTEMRESOURCES</Category>
<Category>VIEWIO</Category>
</ActiveCategories>
<SeverityLevel>ERROR</SeverityLevel>
<MessageInfo>
<ProcessAndThreadIds>true</ProcessAndThreadIds>
<TimeStamp>true</TimeStamp>
</MessageInfo>
<TraceFile>
<FileName>CMEngine_log.txt</FileName>
<MaxFileSize>1000000</MaxFileSize>
<RecyclingMethod>Restart</RecyclingMethod>
</TraceFile>
</Trace>
<JVMLocation>C:\Informatica\9.1.0\java\jre\bin\server</JVMLocation>
<JVMInitParamList/> <!-- Ending here -->
</CMEngine>
</General>
If it has to be a regex, and if there is only one CMEngine tag per string:
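import java.util.regex.Matcher;
import java.util.regex.Pattern;

// subjectString is assumed to hold the XML text shown above.
Pattern regex = Pattern.compile("(?<=<CMEngine>)(?:(?!</CMEngine>).)*", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(subjectString);
String resultString = null; // stays null if no <CMEngine> element is found
if (regexMatcher.find()) {
    resultString = regexMatcher.group();
}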
Since that output appears to be machine-generated and is unlikely to contain comments or other stuff that might confuse the regex, this should work quite reliably.
It starts at a position right after a <CMEngine> tag, (?<=<CMEngine>), and matches all characters up to the next </CMEngine> tag, (?:(?!</CMEngine>).)*.
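The two parts of the pattern can be tried out on a shortened input. The sketch below uses Python instead of Java purely for illustration; the pattern is the same, and the trimmed-down XML is an assumption standing in for the full document.

```python
import re

# Shortened stand-in for the XML from the question.
subject = """<General>
<LanguageID>en_US</LanguageID>
<CMEngine>
<CMServer/>
<DaysToKeepHistory>4</DaysToKeepHistory>
<JVMInitParamList/>
</CMEngine>
</General>"""

# (?<=<CMEngine>)         start right after the opening tag
# (?:(?!</CMEngine>).)*   consume one character at a time, as long as the
#                         closing tag does not begin at that position
pattern = re.compile(r"(?<=<CMEngine>)(?:(?!</CMEngine>).)*", re.DOTALL)

match = pattern.search(subject)
inner = match.group() if match else None
print(inner)
```

The match includes the line breaks directly after <CMEngine> and before </CMEngine>; strip the result if that surrounding whitespace is not wanted.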