Logstash Grok Regex: get each line in each block - regex

I need a custom Logstash grok regex pattern.
Some sample data:
abc blabla
[BLOCK]
START=1
END=2
[/BLOCK]
more blabla
[BLOCK]
START=3
END=4
[/BLOCK]
Note: each line ends in a newline character.
How do I capture all START and END values?
The desired result is:
{ "BLOCK1": { "START": "1", "END": "2" }, "BLOCK2": { "START": "3", "END": "4" } }
I tried
START \bSTART=(?<start>\d*)
END \bEND=(?<end>\d*)
but the result is the values of only the first block:
{ "start": "1", "end": "2" }
I also tried putting the multiline flag (?m) in front of the grok pattern, but that doesn't work either.
Any help is appreciated.
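To make the difference between "first match only" and "every block" concrete, here is a minimal sketch in plain Python. It is not a grok configuration and says nothing about what grok itself can do; the combined block pattern and the BLOCK1/BLOCK2 numbering are assumptions made for illustration.

```python
import re

log = (
    "abc blabla\n"
    "[BLOCK]\n"
    "START=1\n"
    "END=2\n"
    "[/BLOCK]\n"
    "more blabla\n"
    "[BLOCK]\n"
    "START=3\n"
    "END=4\n"
    "[/BLOCK]\n"
)

# One pattern per block; \s+ crosses the newline between the lines.
block_re = re.compile(
    r"\[BLOCK\]\s+START=(?P<start>\d*)\s+END=(?P<end>\d*)\s+\[/BLOCK\]"
)

# search() stops at the first block, like the single-match result in the question.
first = block_re.search(log).groupdict()

# finditer() visits every block; the BLOCKn keys are generated here.
blocks = {
    f"BLOCK{i}": {"START": m["start"], "END": m["end"]}
    for i, m in enumerate(block_re.finditer(log), start=1)
}

print(first)
print(blocks)
```

The point of the sketch is only that the desired result needs a match that is repeated over the input, not a different START or END pattern.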

Related

Regex to capture between two characters that matches a word inbetween

I am trying to find the best way to extract all the text between two characters (ignoring line breaks), but only when a given word appears between them.
In the example below, I want to find the entry with zip 22222 and extract its block from { to }, that is:
{
"zip":"22222",
"total":2
}
Example :
{
"zip":"11111",
"total":1
},
{
"zip":"22222",
"total":2
},
{
"zip":"33333",
"total":3
}
I want to extract/capture/group the block {...} for zip 22222, as below:
{
"zip":"22222",
"total":2
}
I tried the pattern below, but it captures the blocks for all zip codes:
(?s)(?<={)(.*?)(?=})
https://regex101.com/r/0wTDyj/1
The regex below worked for me:
(?s)(?={)(?:(?:(?!"zip").)*?"zip"\s*:\s*"22222".*?)(?<=})
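A quick check of this idea in Python, as a sketch only. The quantifiers are written as `*?` and `\s*`, which is assumed to be the intended form of the pattern, and the braces are escaped for safety. The tempered dot `(?:(?!"zip").)*?` is what stops a match from starting at an earlier `{` and running across another entry's "zip" key.

```python
import re

text = (
    '{\n"zip":"11111",\n"total":1\n},\n'
    '{\n"zip":"22222",\n"total":2\n},\n'
    '{\n"zip":"33333",\n"total":3\n}'
)

# (?s) lets "." match newlines; the match must start at "{" and end after "}".
zip_block_re = re.compile(
    r'(?s)(?=\{)(?:(?:(?!"zip").)*?"zip"\s*:\s*"22222".*?)(?<=\})'
)

match = zip_block_re.search(text)
block = match.group(0) if match else None
print(block)
```

If the input is real JSON, parsing it and filtering on the zip field is usually more robust than a regex; the sketch only illustrates the pattern from the thread.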

error while applying regex to numeric values to replace values in nifi

Hi, I have data as below:
[{
s1 = 98493456645
s2 = 0000000000
102 = 93234,
12 =
15 = rahdeshfui
16 = 2343432,234234
},
{
s1 = 435234235
s2 = 01
102 = 45336
12 =
15 = vjsfrh#gmail.com
16 = 2415454
}
]
Now I need to convert this to JSON format using a regular expression. I tried this:
regexp: ([^\s]+?.*)=((.*(?=,$))+|.*).*
replacement value: "$1":"$2",
With these values I get the output below:
[{
"s1":"98493456645",
"s2":"0000000000",
"102":"93234,",
"12":"",
"15":"rahdeshfui",
"16":"2343432,234234",
},
{
"s1":"435234235",
"s2":"01",
"102":"45336",
"12":"",
"15":"vjsfrh#gmail.com",
"16":"2415454"
}
]
but my expected output is as below:
[{
"s1":98493456645,
"s2":0,
"102":93234,
"12":"",
"15":"rahdeshfui",
"16":"2343432,234234",
},
{
"s1":435234235,
"s2":01,
"102":45336,
"12":"",
"15":"vjsfrh#gmail.com",
"16":"2415454"
}
]
Numeric values should not be wrapped in "". If a value consists of more than one 0, I need to replace it with a single 0. Some values have a , at the end, which I need to drop when it is present.
It might be a bit cumbersome, but you want to replace multiple things so one option might be to use multiple replacements.
Note that these patterns only handle the key-value lines of the example data; they do not take the opening [{ and closing }] or any nesting into account.
1.) Wrap the keys and values in double quotes, matching the equals sign together with its surrounding spaces and leaving any comma at the end of the line out of the captured value:
(\S+) = (\S*?),?(?=\n) and replace with "$1":"$2",
2.) Remove the double quotes around values that consist only of digits, except for those with leading zeros followed by a non-zero digit (such as 01):
("[^"]+":)"(?!0+[1-9])(\d+)" and replace with $1$2
3.) Remove the comma after the last key value:
("[^"]+":)(\S+),(?!\n *"\w+") and replace with $1$2
4.) Replace a run of two or more zeros with a single zero:
("[^"]+":)0{2,} and replace with $10
That will result in:
[{
"s1":98493456645,
"s2":0,
"102":93234,
"12":"",
"15":"rahdeshfui",
"16":"2343432,234234"
},
{
"s1":435234235,
"s2":"01",
"102":45336,
"12":"",
"15":"vjsfrh#gmail.com",
"16":2415454
}
]
I assume the last value "16":"2415454" should be "16":2415454, as the value contains digits only.
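The four replacements can be chained and checked end to end. The sketch below uses Python's re module rather than NiFi, so the replacement syntax is `\1` and `\g<1>0` instead of `$1` and `$10`. Step 2 is written with a single closing quote after the digits, which is assumed to be the intent, and the empty `12 = ` lines are assumed to keep a trailing space so that ` = ` still matches.

```python
import json
import re

raw = "\n".join([
    "[{",
    "s1 = 98493456645",
    "s2 = 0000000000",
    "102 = 93234,",
    "12 = ",  # trailing space assumed, otherwise " = " cannot match
    "15 = rahdeshfui",
    "16 = 2343432,234234",
    "},",
    "{",
    "s1 = 435234235",
    "s2 = 01",
    "102 = 45336",
    "12 = ",
    "15 = vjsfrh#gmail.com",
    "16 = 2415454",
    "}",
    "]",
])

steps = [
    (r'(\S+) = (\S*?),?(?=\n)', r'"\1":"\2",'),        # 1. quote keys and values
    (r'("[^"]+":)"(?!0+[1-9])(\d+)"', r'\1\2'),        # 2. unquote plain numbers
    (r'("[^"]+":)(\S+),(?!\n *"\w+")', r'\1\2'),       # 3. drop comma after last pair
    (r'("[^"]+":)0{2,}', r'\g<1>0'),                   # 4. collapse runs of zeros
]

result = raw
for pattern, replacement in steps:
    result = re.sub(pattern, replacement, result)

parsed = json.loads(result)  # the final text is valid JSON
print(result)
```

The order matters: step 3 relies on the quoting from step 1, and step 4 only sees bare zeros after step 2 has removed their quotes.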

Golang Regex extract text between 2 delimiters - including delimiters

As stated in the title, I have a program in Go with a string that contains a recurring pattern. I have beginning and end delimiters for this pattern, and I would like to extract every occurrence, delimiters included, from the string. The following is pseudocode:
string := "... This is preceding text
PATTERN BEGINS HERE (
pattern can continue for any number of lines...
);
this is trailing text that is not part of the pattern"
In short, I am trying to extract all occurrences of the pattern that begins with "PATTERN BEGINS HERE" and ends with ");", and I need help figuring out what the regex for this looks like.
Please let me know if any additional info or context is needed.
The regex is:
(?s)PATTERN BEGINS HERE.*?\);
where (?s) is a flag that lets . match newlines, so .*? can span multiple lines (see Go regex syntax).
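The pattern can be tried quickly outside Go; this sketch uses Python's re module, where the same syntax applies. The second occurrence in the sample text is made up for illustration. In Go, the corresponding call would be FindAllString on a compiled regexp with n set to -1.

```python
import re

text = (
    "... This is preceding text\n"
    "PATTERN BEGINS HERE (\n"
    "pattern can continue for any number of lines...\n"
    ");\n"
    "this is trailing text that is not part of the pattern\n"
    "PATTERN BEGINS HERE (\n"  # hypothetical second occurrence
    "second occurrence\n"
    ");\n"
)

# The lazy .*? stops at the first ");" after each start delimiter.
matches = re.findall(r"(?s)PATTERN BEGINS HERE.*?\);", text)

for m in matches:
    print(repr(m))
```

Because the quantifier is lazy, each match ends at the nearest ");" and both delimiters are part of the returned text.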
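Not a regex, but this works for the first occurrence (it returns the text between the delimiters, without the delimiters themselves):
import (
    "errors"
    "strings"
)
// findInString returns the bytes between the first occurrence of start
// and the next occurrence of end.
func findInString(str, start, end string) ([]byte, error) {
    var match []byte
    index := strings.Index(str, start)
    if index == -1 {
        return match, errors.New("start delimiter not found")
    }
    index += len(start)
    // Walk forward byte by byte until end begins at the current position.
    for index < len(str) {
        if strings.HasPrefix(str[index:], end) {
            return match, nil
        }
        match = append(match, str[index])
        index++
    }
    return nil, errors.New("end delimiter not found")
}
EDIT: It is best to handle the individual characters as bytes and return a byte slice.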

validator.addMethod for checking before and end whitespaces

I want to validate that a field has no whitespace before or after the text. Spaces in the middle of the string are allowed.
Here is my code:
$.validator.addMethod("trimLookup", function(value, element) {
regex = "^[^\s]+(\s+[^\s]+)*$";
regex = new RegExp( regex );
return this.optional( element ) || regex.test( value );
}, $.validator.format("Cannot contains any spaces at beginning or end"));
I tested the regex at https://regex101.com/ and it works fine. I also tested this code with other regexes and it works. But if I enter " " or " abc ", it doesn't work.
Any suggestions?
Thank you for your time!
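One likely cause, offered as an assumption since the thread shows no answer: inside a JavaScript string literal, `\s` is not a recognised escape, so the backslash is dropped and new RegExp receives `^[^s]+(s+[^s]+)*$`. Writing `\\s` in the string, or using a regex literal, would avoid that. The Python sketch below compares the intended pattern with the collapsed one on a few sample inputs (the sample list is made up).

```python
import re

samples = ["abc", "abc def", " ", " abc ", "abc "]

# The pattern as intended (and as tested on regex101).
intended_re = re.compile(r"^[^\s]+(\s+[^\s]+)*$")

# What the pattern becomes once every "\s" has collapsed to a plain "s".
collapsed_re = re.compile(r"^[^s]+(s+[^s]+)*$")

accepted_intended = [s for s in samples if intended_re.search(s)]
accepted_collapsed = [s for s in samples if collapsed_re.search(s)]

print(accepted_intended)
print(accepted_collapsed)
```

The intended pattern rejects the inputs with leading or trailing spaces, while the collapsed pattern accepts all of them, which matches the behaviour described in the question.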

Looking for a string with regex and delete the whole line

I am trying to find a character with a regex in Textpad (for example "#"), and if it is found, the whole line should be deleted. The # is neither at the beginning nor at the end of the line but somewhere in between, and it is not attached to another word, number or character: it stands alone with whitespace to its left and right, while the rest of the line contains words and numbers.
Example:
My first line
My second line with # hash
My third line# with hash
Result:
My first line
My third line# with hash
How could I accomplish that?
Let's break it down:
^ # Start of line
.* # any number of characters (except newline)
[ \t] # whitespace (tab or space)
\# # hashmark
[ \t] # whitespace (tab or space)
.* # any number of characters (except newline)
or, in a single line: ^.*[ \t]#[ \t].*
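As a rough check of this pattern outside Textpad, here is a small Python sketch. The trailing `\n?` is an addition made here so that the line break disappears together with the line; it is not part of the pattern above.

```python
import re

text = (
    "My first line\n"
    "My second line with # hash\n"
    "My third line# with hash\n"
)

# MULTILINE makes ^ match at the start of every line.
standalone_hash_line = re.compile(r"^.*[ \t]#[ \t].*\n?", re.MULTILINE)

result = standalone_hash_line.sub("", text)
print(result)
```

Only the second line has a # with whitespace on both sides, so it is the only line removed.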
Try this:
^(.*[#].*)$
or maybe
(?<=[\r\n^])(.*[#].*)(?=[\r\n$])
EDIT: Changed to reflect the point by Tim
This
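import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class HashLineDemo {
    public static void main(String[] args) {
        // MULTILINE makes ^ and $ match at the start and end of every line.
        Pattern p = Pattern.compile("^.*\\s+#\\s+.*$", Pattern.MULTILINE);
        String[] values = {
            "",
            "###",
            "a#",
            "#a",
            "ab",
            "a#b",
            "a # b\r\na b c"
        };
        for (String input : values) {
            Matcher m = p.matcher(input);
            // Print every line that contains a # surrounded by whitespace.
            while (m.find()) {
                System.out.println(input.substring(m.start(), m.end()));
            }
        }
    }
}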
gives the output
a # b