Replace numbers one by one with regex

I have strings such as this:
"Query_string" : [ 1345.6423, 5656.5, 346.324, 880.0 ],
"Query_string" : [ 1345.6423, 5656.5, 346.324, 880.0 ],
"Query_string" : [ 1345.6423, 5656.5, 346.324, 880.0 ],
Random code 124253
String.....
I need to replace every digit on the lines that start with "Query_string" with zero, like so:
"Query_string" : [ 0000.0000, 0000.0, 000.000, 000.0 ],
But other stuff should stay in place, eg:
Random code 124253
I tried this:
(^\"Query\_string\"\s\:\s\[\s)|\d|(\s\]\,)
But it matches all digits, including those in "Random code 124253".

sed ": loop
s/\("Query_string".*\)[1-9]/\10/
t loop" YourFile

This sed expression replaces all non-zero digits on Query lines with 0:
sed '/^"Query_string"/{s/[1-9]/0/g}' input
Another version:
sed '/^"Query_string"/!b;s/[1-9]/0/g' input
Still another:
sed '/^"Query_string"/s/[1-9]/0/g' input
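For comparison, here is a minimal Python sketch of the same idea as the sed one-liners: pick out the lines that start with "Query_string" first, then replace digits only on those lines. The function name zero_query_digits and the sample text are made up for illustration. It replaces every digit (\d) rather than [1-9], which gives the same output because a 0 replaced by 0 is unchanged.

```python
import re

def zero_query_digits(text: str) -> str:
    """Replace every digit with 0, but only on lines starting with "Query_string"."""
    out = []
    for line in text.splitlines():
        # Address the line first (like sed's /^"Query_string"/), then substitute.
        if line.startswith('"Query_string"'):
            line = re.sub(r"\d", "0", line)
        out.append(line)
    return "\n".join(out)

sample = (
    '"Query_string" : [ 1345.6423, 5656.5, 346.324, 880.0 ],\n'
    "Random code 124253"
)
print(zero_query_digits(sample))
```

Splitting the work into "select the line" and "replace on that line" is what the original single regex was missing: an alternation containing a bare \d matches digits anywhere in the input.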

Related

Logstash Grok Regex: get each line in each block

I need a custom logstash-grok regex pattern
Some sample data:
abc blabla
[BLOCK]
START=1
END=2
[/BLOCK]
more blabla
[BLOCK]
START=3
END=4
[/BLOCK]
Note: each line ends in a newline character.
How do I capture all START and END values?
The desired result is:
{ "BLOCK1": { "START":"1", "END":"2" }, "BLOCK2": { "START":"3", "END":"4" } }
I tried
START \bSTART=(?<start>\d*)
END \bEND=(?<end>\d*)
but the result is the values of only the first block:
{ "start": "1", "end": "2" }
I also tried putting the multiline flag (?m) in front of the grok pattern, but that doesn't work either.
Any help is appreciated.

Ubuntu 16 sed not working with parenthesis

I can't get past this sed regex. The line "entrytimestamp" : ISODate("2020-09-09T16:07:34.526Z") in the first record should also be transformed, but it is not, because it has no comma after the closing parenthesis. I simply want to remove "ISODate(" and the closing parenthesis ")", whether or not the field is the last element. I have double- and triple-checked the regex, but I am missing something. Does anybody have any idea?
root## cat inar.json
[
{
"_id" : ObjectId("5f58fdc632e4de001621c1ca"),
"USER" : null,
"entrytimestamp" : ISODate("2020-09-09T16:07:34.526Z")
},
{
"_id" : ObjectId("5f590118c205630016dcafb4"),
"entrytimestamp" : ISODate("2020-09-09T16:21:44.346Z"),
"USER" : null
}
]
sed -E "s/(.+\"entrytimestamp\"\s:\s)ISODate\((\"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{1,3}Z\")\)(.+)/\1\2\3/" inar.json
[
{
"_id" : ObjectId("5f58fdc632e4de001621c1ca"),
"USER" : null,
"entrytimestamp" : ISODate("2020-09-09T16:07:34.526Z")
},
{
"_id" : ObjectId("5f590118c205630016dcafb4"),
"entrytimestamp" : "2020-09-09T16:21:44.346Z",
"USER" : null
}
]
You may use this sed:
sed -E 's/("entrytimestamp" *: *)ISODate\(([^)]+)\)/\1\2/' file
[
{
"_id" : ObjectId("5f58fdc632e4de001621c1ca"),
"USER" : null,
"entrytimestamp" : "2020-09-09T16:07:34.526Z"
},
{
"_id" : ObjectId("5f590118c205630016dcafb4"),
"entrytimestamp" : "2020-09-09T16:21:44.346Z",
"USER" : null
}
]
Command Details
("entrytimestamp" *: *): Match starting "entrytimestamp" : part with optional spaces around :. Capture this part in group #1
ISODate\(: Match ISODate(
([^)]+): Match 1+ of any character that is not ). Capture this part in group #2
\): Match closing )
/\1\2: Put back-references #1 and #2 back in substitution
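As a rough cross-check of the breakdown above, the same pattern can be tried in Python. This is only a sketch: the function name strip_isodate is made up, and Python writes the back-references as \1\2 in the replacement string just as sed does.

```python
import re

# Same pattern as the sed -E command: group 1 is the key part,
# group 2 is whatever sits between "ISODate(" and the next ")".
PATTERN = re.compile(r'("entrytimestamp" *: *)ISODate\(([^)]+)\)')

def strip_isodate(line: str) -> str:
    """Drop the ISODate( ... ) wrapper, keeping the quoted timestamp."""
    return PATTERN.sub(r"\1\2", line, count=1)

print(strip_isodate('"entrytimestamp" : ISODate("2020-09-09T16:07:34.526Z")'))
print(strip_isodate('"entrytimestamp" : ISODate("2020-09-09T16:21:44.346Z"),'))
print(strip_isodate('"_id" : ObjectId("5f58fdc632e4de001621c1ca"),'))
```

Because the pattern does not try to match anything after the closing parenthesis, it behaves the same with or without a trailing comma, and lines without "entrytimestamp" pass through untouched.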
Your regex does not match the first line you intend to match because the final (.+) requires at least one character. Since nothing follows the closing ) on that line, the pattern fails.
Use (.*) to match any zero or more characters:
sed -E "s/(.+\"entrytimestamp\"\s:\s)ISODate\((\"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{1,3}Z\")\)(.*)/\1\2\3/" inar.json
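The difference between the two quantifiers can be reproduced in Python with a small sketch. The two sample lines are assumptions: they carry leading indentation, which the initial (.+) of the original pattern also needs in order to match. The variable names are made up for illustration.

```python
import re

# Timestamp portion shared by both variants (same as in the sed -E pattern).
STAMP = r'"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{1,3}Z"'
HEAD = r'(.+"entrytimestamp"\s:\s)ISODate\((' + STAMP + r')\)'

WITH_PLUS = re.compile(HEAD + r'(.+)')  # needs at least one trailing character
WITH_STAR = re.compile(HEAD + r'(.*)')  # trailing text is optional

# Assumed sample lines; the indentation satisfies the leading (.+).
last_field = '    "entrytimestamp" : ISODate("2020-09-09T16:07:34.526Z")'
middle_field = '    "entrytimestamp" : ISODate("2020-09-09T16:21:44.346Z"),'

print(WITH_PLUS.search(last_field))            # None: nothing follows ")"
print(WITH_STAR.sub(r'\1\2\3', last_field))    # wrapper removed
print(WITH_STAR.sub(r'\1\2\3', middle_field))  # wrapper removed, comma kept
```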

How to match and replace the repeated group patterns and align the result?

I have a code snippet like below
[ "sortBy", "String", "sort by method" ],
[ "sortOrder", "String", "sort order includes ascend and descend" ],
[ "count", "Int", "The number of results to return." ],
[ "names", "Array<String>", "array of strings represents name" ]
I want to use a regular expression to match, replace, and align so that the result looks like this:
{ Name = "sortBy"; Ref = "String"; Description = Some "sort by method" }
{ Name = "sortOrder"; Ref = "String"; Description = Some "sort order includes ascend and descend" }
{ Name = "count"; Ref = "Int"; Description = Some "The number of results to return." }
{ Name = "names"; Ref = "Array<String>"; Description = Some "array of strings represents name" }
and each column should be aligned. I am stuck at the start: I don't know how to match the groups and align the result. My search is this
*\[ *"(.*)", *"(.*)", *"(.*)" *\],
in Visual Studio Code, but it only matches the first row. Instead, I want to match all rows at once, replace them, and then align the result.
The point here is to match and capture only the parts you need to keep, and just match other parts.
You may use
^( *)\[( *)(".*?"),( *)(".*?"),( *)(".*?" *)\],?$
Replace with $1{$2Name = $3;$4Ref = $5;$6Description = Some $7}.
Details
^ - start of line
( *) - Group 1 ($1): leading spaces
\[ - a [ char (will be replaced with {)
( *) - Group 2 ($2): spaces after [
(".*?") - Group 3 ($3): "..." substring
, - a comma (will be replaced with ;)
( *) - Group 4 ($4): spaces after the first ,
(".*?") - Group 5 ($5): "..." substring
, - a comma (will be replaced with ;)
( *) - Group 6 ($6): spaces after the second ,
(".*?" *) - Group 7 ($7): "..." substring and 0+ spaces after
\],?$ - ], an optional , and end of line.
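The group-by-group replacement described above can be tried outside the editor with a short Python sketch. The names ROW, REPLACEMENT, and to_records are made up; Python writes $1 as \1, and re.MULTILINE makes ^ and $ work per line the way the editor's search does. Since the spacing is carried over from the captured groups, the output is only as aligned as the input.

```python
import re

# Same seven groups as described above; MULTILINE anchors ^ and $ at each line.
ROW = re.compile(
    r'^( *)\[( *)(".*?"),( *)(".*?"),( *)(".*?" *)\],?$',
    re.MULTILINE,
)
# [ becomes {, each comma becomes ;, captured spacing is put back unchanged.
REPLACEMENT = r'\1{\2Name = \3;\4Ref = \5;\6Description = Some \7}'

def to_records(text: str) -> str:
    return ROW.sub(REPLACEMENT, text)

rows = (
    '[ "sortBy", "String", "sort by method" ],\n'
    '[ "names", "Array<String>", "array of strings represents name" ]'
)
print(to_records(rows))
```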
Here is an answer using a macro extension, because you need to run two separate regexes (although the second one is very simple). The demo shows your original text first, some badly formatted text second, and your desired results last.
Select your text first and then trigger the macro. I am using alt+r as the keybinding but you can choose whatever you want.
Using the macro extension multi-command put this into your settings.json:
"multiCommand.commands": [
{
"command": "multiCommand.insertAlignRows",
"sequence": [
"editor.action.insertCursorAtEndOfEachLineSelected",
"cursorHomeSelect",
{
"command": "editor.action.insertSnippet",
"args": {
"snippet": "${TM_SELECTED_TEXT/^(\\s*)\\[\\s*(.{12})\\s*(.{18})\\s*([^\\]]*)\\],?/$1{ Name = $2 Ref = $3Description = Some $4}/g}",
}
},
"cursorHomeSelect",
{
"command": "editor.action.insertSnippet",
"args": {
"snippet": "${TM_SELECTED_TEXT/,/;/g}",
}
},
]
}
]
In keybindings.json:
{
"key": "alt+r", // choose whatever keybinding you want
"command": "extension.multiCommand.execute",
"args": { "command": "multiCommand.insertAlignRows" },
"when": "editorTextFocus"
},
The regex that is doing almost all of the work is:
^(\s*)\[\s*(.{12})\s*(.{18})\s*([^\]]*)\],?
I removed the double escapes, which are necessary in snippets but not in the find/replace widget, so you could just use this regex in your find input (and skip the macro entirely) and
$1{ Name = $2 Ref = $3Description = Some $4}
in the replace field. And then just replace , with ; after that.
Back to that regex: ^(\s*)\[\s*(.{12})\s*(.{18})\s*([^\]]*)\],? looks brittle because of the "magic numbers" 12 and 18 derived from your sample text. But it isn't as bad as it first seems, as the demo with the badly formatted original shows. The numbers just count characters, and as long as your input is reasonably close to what you presented, it will work.
The 12 can actually be from 12-16, with the 12 being the length of your longest first item (like "sortOrder",) and the 16 being the minimum number from the beginning of the first items to where the second items (like "String") begin.
Likewise, the 18 could be 17-24 given your input and where you want the final column to start. Play with the numbers; that is easy to do in a regex101 demo.
I think the only restriction is that your input not look like this:
[ "names", "Array<String>", "array of strings represents name" ]
[ "sortOrder","String", "sort order includes ascend and descend" ],
where a later column starts before the end of the previous column - as in column 3 starts before all the column 2's end. Likewise for some column 2 item starting before all the column 1 items have ended like
[ "sortOrder", "String", "sort order includes ascend and descend" ],
[ "names", "Array<String>", "array of strings represents name" ]
If your input is that bad you could fix it first with some simple regex's.
Remember you can also adjust where the columns start in your replace by adding/subtracting spaces, as between the $2 Ref in my example above or $3Description - you can add space(s) after the $3 if you wish.
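If the fixed widths turn out to be too fragile, a different technique (not the one used in this answer) is to compute the column widths from the data and pad each field. Below is a minimal Python sketch under that assumption; the names ROW and align_rows are made up, and it assumes at least one row matches and that every row has exactly three quoted fields.

```python
import re

# Three quoted fields inside [ ... ]; spacing around the commas is ignored.
ROW = re.compile(r'\[\s*(".*?")\s*,\s*(".*?")\s*,\s*(".*?")\s*\]')

def align_rows(text: str) -> str:
    rows = ROW.findall(text)
    # Width of each padded column: longest field plus one for the ';'.
    name_w = max(len(name) for name, _, _ in rows) + 1
    ref_w = max(len(ref) for _, ref, _ in rows) + 1
    out = []
    for name, ref, desc in rows:
        out.append(
            "{ Name = " + (name + ";").ljust(name_w)
            + " Ref = " + (ref + ";").ljust(ref_w)
            + " Description = Some " + desc + " }"
        )
    return "\n".join(out)

badly_spaced = (
    '[ "sortOrder","String", "sort order includes ascend and descend" ],\n'
    '[ "names",   "Array<String>", "array of strings represents name" ]'
)
print(align_rows(badly_spaced))
```

This trades the single find/replace for a small script, but it does not depend on where the columns start in the input.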

Regexp_replace for JSON

How can I use regexp_replace so that phone_num and phone_ext contain only digits, with all other characters removed?
[ {
"phone_type":"HOME",
"phone_num":"(+1)123-456-7890",
"phone_ext":"-85254-",
"phone_status":"Y",
},
{
"phone_type":"HOME",
"phone_num":"+001-123-456-7890",
"phone_ext":"85-254",
"phone_status":"N",
}
]
should be displayed as
[ {
"phone_type":"HOME",
"phone_num":"11234567890",
"phone_ext":"85254",
"phone_status":"Y",
},
{
"phone_type":"HOME",
"phone_num":"0011234567890",
"phone_ext":"85254",
"phone_status":"N",
}
]
Well, finding the text is fairly easy.
/phone_(num|ext)"\s*:\s*"([^"]*)",/gmi
The next part is taking the second group, ([^"]*), inside your match function and stripping all non-numeric characters from it. How you do this will vary by application.
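As one possible illustration of that second step, here is a minimal Python sketch that uses a replacement function. The names FIELD, digits_only, and clean_phones are made up. Unlike the pattern above, this variant does not require a trailing comma, and it captures the surrounding text in its own groups so it can be put back unchanged.

```python
import re

# Group 1: key and opening quote, group 2: the value, group 3: closing quote.
FIELD = re.compile(r'("phone_(?:num|ext)"\s*:\s*")([^"]*)(")', re.IGNORECASE)

def digits_only(match: re.Match) -> str:
    # Strip every non-digit character from the captured value only.
    return match.group(1) + re.sub(r"\D", "", match.group(2)) + match.group(3)

def clean_phones(text: str) -> str:
    return FIELD.sub(digits_only, text)

sample = (
    '"phone_type":"HOME",\n'
    '"phone_num":"(+1)123-456-7890",\n'
    '"phone_ext":"-85254-",'
)
print(clean_phones(sample))
```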

Error while applying regex to numeric values to replace values in NiFi

Hi, I have data as below:
[{
s1 = 98493456645
s2 = 0000000000
102 = 93234,
12 =
15 = rahdeshfui
16 = 2343432,234234
},
{
s1 = 435234235
s2 = 01
102 = 45336
12 =
15 = vjsfrh#gmail.com
16 = 2415454
}
]
Now, using a regular expression, I need to convert it to JSON format. I have tried this:
regexp:- ([^\s]+?.*)=((.*(?=,$))+|.*).*
replace value:- "$1":"$2",
For these values I am getting the output below:
[{
"s1":"98493456645",
"s2":"0000000000",
"102":"93234,",
"12":"",
"15":"rahdeshfui",
"16":"2343432,234234",
},
{
"s1":"435234235",
"s2":"01",
"102":"45336",
"12":"",
"15":"vjsfrh#gmail.com",
"16":"2415454"
}
]
But my expected output is as below:
[{
"s1":98493456645,
"s2":0,
"102":93234,
"12":"",
"15":"rahdeshfui",
"16":"2343432,234234",
},
{
"s1":435234235,
"s2":01,
"102":45336,
"12":"",
"15":"vjsfrh#gmail.com",
"16":"2415454"
}
]
Numeric values should not be wrapped in "". If a value consists of more than one 0, I need to replace it with a single 0. Some values have a , at the end, and I need to drop that trailing , when it is present.
It might be a bit cumbersome, but you want to replace multiple things so one option might be to use multiple replacements.
Note that these patterns do not take the opening [{ and closing }] into account, or any nesting; they handle only the key-value part, as the pattern you posted does for the example data.
1.) Wrap the keys and values in double quotes while not capturing the comma at the end, and match the equals sign including the surrounding spaces:
(\S+) = (\S*?),?(?=\n) and replace with "$1":"$2",
2.) Remove the double quotes around the digit-only values, except for those with leading zeros followed by a non-zero digit (such as 01):
("[^"]+":)"(?!0+[1-9])(\d+)" and replace with $1$2
3.) Remove the comma after the last key value:
("[^"]+":)(\S+),(?!\n *"\w+") and replace with $1$2
4.) Replace 2 or more times a zero with a single zero:
("[^"]+":)0{2,} and replace with $10
That will result in:
[{
"s1":98493456645,
"s2":0,
"102":93234,
"12":"",
"15":"rahdeshfui",
"16":"2343432,234234"
},
{
"s1":435234235,
"s2":"01",
"102":45336,
"12":"",
"15":"vjsfrh#gmail.com",
"16":2415454
}
]
I assume the last value, "16":"2415454", should be "16":2415454, as the value contains digits only.
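To check how the four replacements interact, they can be chained in a small Python sketch. This is not NiFi configuration. The names STEPS and to_json_like are made up, the sample is a shortened version of the first record, and the empty "12 = " line is assumed to keep a space after the equals sign, which step 1 needs. The step 2 pattern is written with a single closing quote. Python writes $1 as \1, and \g<1>0 makes "group 1 followed by a literal 0" unambiguous in step 4.

```python
import re

# (pattern, replacement) pairs in the order of the four steps above.
STEPS = [
    (r'(\S+) = (\S*?),?(?=\n)', r'"\1":"\2",'),       # 1. quote keys and values
    (r'("[^"]+":)"(?!0+[1-9])(\d+)"', r'\1\2'),       # 2. unquote digit-only values
    (r'("[^"]+":)(\S+),(?!\n *"\w+")', r'\1\2'),      # 3. drop comma after last pair
    (r'("[^"]+":)0{2,}', r'\g<1>0'),                  # 4. collapse runs of zeros
]

def to_json_like(text: str) -> str:
    for pattern, replacement in STEPS:
        text = re.sub(pattern, replacement, text)
    return text

sample = "\n".join([
    "[{",
    "s1 = 98493456645",
    "s2 = 0000000000",
    "102 = 93234,",
    "12 = ",  # assumed trailing space after "="
    "15 = rahdeshfui",
    "16 = 2343432,234234",
    "}]",
])
print(to_json_like(sample))
```

The order matters: step 2 relies on the quotes added by step 1, and step 4 only sees unquoted values, so a quoted value such as "01" is left alone.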