How can I find the records whose value ends with "A" in Notepad++ (regex)?

I have a huge JSON file in Notepad++. One of my fields is product. I want to find all the products whose value ends with the character A.
This is my data:
{
"ID": 689,
"product": "GIPA",
"JobID": 66349,
"FriendlyName": "Android",
},
{
"ID": 689,
"product": "TKNA",
"JobID": 66350,
"FriendlyName": "Android",
},
{
"ID": 689,
"product": "TNRG",
"JobID": 66351,
"FriendlyName": "Android",
},
{
"ID": 689,
"product": "GAJT",
"JobID": 66352,
"FriendlyName": " Android",
},
I have tried two patterns, but neither of them works:
"product": "^[a-z|A-Z|0-9]+[^A]\s?I{1}$"
And
"product": ".*(\A)$"
How can I find first two records?

Note the major issues with your regexps:
"product": "^[a-z|A-Z|0-9]+[^A]\s?I{1}$" contains ^ in the middle of the pattern. ^ only matches at the start of a line (or of the string), and nothing before it in the pattern matches a line break, so the pattern can never match.
[a-z|A-Z] matches letters AND also |. Do not use | in a character class unless you mean to match a literal | char (it loses its "alternation" meaning inside [...]).
[^A] matches any char but A
{1} is always redundant, remove it. Every pattern in an expression is matched exactly once by default.
"product": ".*(\A)$" contains \A, the start-of-string anchor, which also invalidates the pattern: it can no longer match any string
You can use
"product": "[^"]*A"
It matches
"product": " - a literal string
[^"]* - 0 or more chars other than "
A" - A" string.
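As a quick sanity check outside the editor, the suggested pattern can be tried in Python. This is only a sketch: Python's re module stands in for the Notepad++ regex engine (an assumption that should hold for these simple constructs), and the sample string is a shortened copy of the data from the question.

```python
import re

# Shortened copy of the "product" lines from the question.
sample = '''"product": "GIPA",
"product": "TKNA",
"product": "TNRG",
"product": "GAJT",'''

# Literal prefix, then any run of non-quote chars, then A right before the closing quote.
ends_with_a = re.compile(r'"product": "[^"]*A"')
matches = ends_with_a.findall(sample)

# Inside a character class, | is a literal character, not alternation.
pipe_is_literal = re.fullmatch(r"[a-z|A-Z]", "|") is not None

print(matches)
print(pipe_is_literal)
```

Only the GIPA and TKNA lines are reported, because [^"]* cannot run past the closing quote of the value.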

Related

How to add character after a line

I'm trying to perform a few regex steps, and I'd like to add a quotation mark and a comma (",) at the end of these lines without altering any of the rest of the characters in the line.
How would I keep things intact but add the ", after the words: device1, device2, device3 ?
Example of lines I'm working with:
object network device1
host 192.168.1.11
object network device2
host 192.168.1.12
object network device 3
host 192.168.1.13
After my first step of regex, I have modified my first line to include the curly bracket and some formatting with the words "category" and "name" as shown below. However, I don't want to change the word device1, but want to include a quotation and comma after the word device1
{
"category": "network",
"name": "device1
host 192.168.1.11
{
"category": "network",
"name": "device2
host 192.168.1.11
{
"category": "network",
"name": "device3
host 192.168.1.13
I can't figure out how to include the ", with my first step in my regex replace sequence?
I'm using both regexr.com and Notepad++.
You can use this regex to match each entity in your input data:
object\s+(\w+)\s+([^\r\n]+)[\r\n]+host\s+([\d.]+)
This matches:
object\s+ : the word "object" followed by a number of spaces
(\w+) : some number of word (alphanumeric plus _) characters, captured in group 1
\s+ : a number of spaces
([^\r\n]+) : some number of non-end-of-line characters, captured in group 2
[\r\n]+ : some number of end-of-line characters
host\s+ : the word "host" followed by a number of spaces
([\d.]+) : some number of digit and period characters, captured in group 3
This can then be replaced by:
{\n "category": "$1",\n "name": "$2",\n "host": "$3"\n},
To give output (for your sample data) of:
{
"category": "network",
"name": "device1",
"host": "192.168.1.11"
},
{
"category": "network",
"name": "device2",
"host": "192.168.1.12"
},
{
"category": "network",
"name": "device 3",
"host": "192.168.1.13"
},
Regex demo on regex101
Now you can simply add [ at the beginning of the file and replace the last , with a ] to make a valid JSON file.
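If you prefer to script the conversion, the same regex and replacement can be sketched in Python. The assumptions here: Python writes group references as \g<1> instead of $1, the sample input is a shortened copy of the question's data, and the final wrapping step is just one way to add the [ and drop the trailing comma.

```python
import json
import re

# Shortened copy of the input from the question.
text = (
    "object network device1\n"
    "host 192.168.1.11\n"
    "object network device 3\n"
    "host 192.168.1.13"
)

# Same regex as above: category in group 1, name in group 2, host in group 3.
pattern = re.compile(r"object\s+(\w+)\s+([^\r\n]+)[\r\n]+host\s+([\d.]+)")

# Same replacement as above, with Python's \g<N> syntax instead of $N.
replacement = r'{\n "category": "\g<1>",\n "name": "\g<2>",\n "host": "\g<3>"\n},'

converted = pattern.sub(replacement, text)

# Wrap in [ ] and drop the last comma so the result is valid JSON.
as_json = "[" + converted.rstrip().rstrip(",") + "]"
records = json.loads(as_json)

print(as_json)
```

Parsing the result with json.loads is a cheap way to confirm that the output really is valid JSON.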
Another option is the regex "name": "(device\d+). Since you have not mentioned a programming language, you may have to adapt the pattern to the language you are using. For example, in a Java string literal the " characters and the backslash need an escape character, so in Java you would write:
\"name\": \"(device\\d+)
Then keep the matched text and append ", right after the captured group (device\d+).
In Java, for example, you can do this with String.replaceAll.
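The capture-and-append idea can be sketched in Python as well (Python is used here only for illustration, and the sample text is taken from the question). The whole "name": "deviceN prefix is captured so that it can be written back unchanged, followed by the missing ",.

```python
import re

# Intermediate text from the question, after the first regex step.
text = '{\n"category": "network",\n"name": "device1\nhost 192.168.1.11'

# Capture the unfinished name line, then write it back followed by ",
fixed = re.sub(r'("name": "device\d+)', r'\1",', text)

print(fixed)
```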

How do I remove a substring from a value in an elasticsearch document using their devtools?

If each document has a value that is similar to:
https://test.com/MODIF-RRS/D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/test/code.cs
and I want to remove the D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/ part so I am left with https://test.com/MODIF-RRS/test/code.cs how would I do that?
I have a regex that works using an online tester
(D:/([a-zA-Z0-9_-]+)/_work/([a-zA-Z0-9_-]+)/s/)
but it gave me an error: invalid range: from (95) cannot be > to (93)
I used a char filter with your regex:
POST _analyze
{
"char_filter": {
"type":"pattern_replace",
"pattern":"(D:/([a-zA-Z0-9_-]+)/_work/([a-zA-Z0-9_-]+)/s/)"
},
"text": "https://test.com/MODIF-RRS/D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/test/code.cs"
}
Token
{
"tokens": [
{
"token": "https://test.com/MODIF-RRS/test/code.cs",
"start_offset": 0,
"end_offset": 85,
"type": "word",
"position": 0
}
]
}
(D:/([a-zA-Z0-9_-]+)/_work/([a-zA-Z0-9_-]+)/s/)
> invalid range: from (95) cannot be > to (93)
ASCII character 95 is _ and ASCII character 93 is ].
The parser thinks _-] is supposed to be a range of characters (similar to A-Z) and is confused because the ASCII values left and right of - are not in ascending order.
As you do not want to specify a range there at all, try escaping the - characters with a leading \, so that the parser knows you mean a literal -, not a range of characters:
(D:/([a-zA-Z0-9_\-]+)/_work/([a-zA-Z0-9_\-]+)/s/)
Note: Depending on how you specify your regex (in JSON?), you may have to escape the \ itself as well, so you'd have to write \\- instead of \-.
Alternatively it's usually possible to specify - as first character in the set, then the parser realizes it cannot be a range.
(D:/([-a-zA-Z0-9_]+)/_work/([-a-zA-Z0-9_]+)/s/)
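Both rewritten patterns can be checked against the sample URL in Python. This is only a sketch: Python's re module is not the regex engine Elasticsearch uses, so it cannot reproduce the original error, but it does show that the two variants match the same text and how the backslash gets doubled once the pattern is serialized to JSON.

```python
import json
import re

url = "https://test.com/MODIF-RRS/D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/test/code.cs"

# Variant 1: hyphen escaped with a backslash.
escaped = r"(D:/([a-zA-Z0-9_\-]+)/_work/([a-zA-Z0-9_\-]+)/s/)"
# Variant 2: hyphen placed first in the set, so it cannot start a range.
leading = r"(D:/([-a-zA-Z0-9_]+)/_work/([-a-zA-Z0-9_]+)/s/)"

cleaned_escaped = re.sub(escaped, "", url)
cleaned_leading = re.sub(leading, "", url)

# In a JSON request body the backslash itself must be escaped: \- becomes \\-.
json_body = json.dumps({"pattern": escaped})

print(cleaned_escaped)
print(json_body)
```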

How to find and replace in Notepad++ only before an exact text?

I have a file that contains thousands of lines (json columns) like this:
"128",
"drugName_en": "Ampy 500mg Capsules",
"drugName_ar": "امبي 500 مجم كبسول",
"scientificDrugId": "959",
"proxyDrugId": "01",
"prodCompDrugId": "06",
"pack": "2x10Capsules",
"unit": "باكت",
"price": "0",
"discount": "00",
"categoryId": "15",
"image": "http://icons.iconarchive.com/icons/graphicloads/medical-health/256/medicine-box-icon.png"
},
"129",
"drugName_en": "Ampy C 10*10 C 500mg Capsules",
"drugName_ar": "امبي سي 500 مجم 10*10 كبسول",
"scientificDrugId": "36",
"proxyDrugId": "01",
"prodCompDrugId": "06",
"pack": "10x10Capsules",
"unit": "باكت",
"price": "2267",
"discount": "00",
"categoryId": "15",
"image": "http://icons.iconarchive.com/icons/graphicloads/medical-health/256/medicine-box-icon.png"
},
I need to replace each , which is before "drugName_en" with :{
Could this be done in Notepad++?
Thank you in advance.
This does the job, preserving the linebreak (whatever it is):
Ctrl+H
Find what: ,(\R)(?=\s+"drugName_en")
Replace with: :{$1
CHECK Match case
CHECK Wrap around
CHECK Regular expression
Replace all
Explanation:
, # a comma
(\R) # group 1, any kind of linebreak (i.e. \r, \n, \r\n)
(?= # positive lookahead, make sure we have after:
\s+ # 1 or more white spaces
"drugName_en" # literally
) # end lookahead
Replacement:
:{ # literally
$1 # content of group 1, the linebreak
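The same find/replace can be sketched in Python for a quick test. Two assumptions: Python's re module has no \R, so the alternation (\r\n|\r|\n) is used in its place, and the sample text is a shortened, indented version of the data that mixes line endings on purpose.

```python
import re

# Shortened sample; the first record uses \r\n and the second uses \n.
sample = (
    '"128",\r\n'
    '  "drugName_en": "Ampy 500mg Capsules",\r\n'
    '  "price": "0"\r\n'
    '},\r\n'
    '"129",\n'
    '  "drugName_en": "Ampy C 10*10 C 500mg Capsules",\n'
)

# Stand-in for \R: try \r\n first so a Windows line break is captured whole.
linebreak = r"(\r\n|\r|\n)"
pattern = re.compile("," + linebreak + r'(?=\s+"drugName_en")')

# Replace the comma with :{ and put the captured line break back.
result = pattern.sub(r":{\g<1>", sample)

print(result)
```

Only the commas directly before a drugName_en line change; every other comma and each original line break is left as it was.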
I could not do this in Notepad++, but it is possible in the Sublime Text editor. I have tried it there and it works, so just try Sublime Text.
Download Sublime Text

Notepad ++: how to remove all text before and after a string

I want to just keep the code for each line in this text, what is the regular expression for this
{"name": "Canada", "countryCd": "CA", "code": 393},
{"name": "Syria", "countryCd": "SR", "code": 3535},
{"name": "Germany", "countryCd": "GR", "code": 3213}
The expected result would be
CA
SR
GR
Kind of a hack (see @Toto's comment) but works for your requirements:
.*"([A-Z]{2})".*
This needs to be replaced by $1, see a demo on regex101.com (side note: isn't Germany usually GER?)
In notepad++ I would do a find and replace like:
.*?"countryCd": "([^"]+)".*
And replace that with:
\1
That way, if for some reason your country code is not exactly 2 letters, it is still captured correctly. [^"] is a negated character class, meaning any character that isn't ", and the + requires at least 1 character. I find that negated character classes usually do what is actually intended.
In this case you want to capture whatever is inside the quotes after countryCd, and this will do the trick.
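Both patterns can be tried quickly in Python. This is a sketch with Python's re module standing in for the Notepad++ engine; as in Notepad++ with ". matches newline" unchecked, the dot does not cross line breaks, so each line is replaced on its own.

```python
import re

text = (
    '{"name": "Canada", "countryCd": "CA", "code": 393},\n'
    '{"name": "Syria", "countryCd": "SR", "code": 3535},\n'
    '{"name": "Germany", "countryCd": "GR", "code": 3213}'
)

# The "hack": any two uppercase letters between quotes.
by_shape = re.sub(r'.*"([A-Z]{2})".*', r"\1", text)

# Keyed on the field name, so codes of any length are captured.
by_field = re.sub(r'.*?"countryCd": "([^"]+)".*', r"\1", text)

print(by_field)
```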

grok parsing issue

I have an input line that looks like this:
localhost_9999.kafka.server:type=SessionExpireListener,name=ZooKeeperSyncConnectsPerSec.OneMinuteRate
and I can use this pattern to parse it:
%{DATA:kafka_node}:type=%{DATA:kafka_metric_type},name=%{JAVACLASS:kafka_metric_name}
which gives me this:
{
"kafka_node": [
[
"localhost_9999.kafka.server"
]
],
"kafka_metric_type": [
[
"SessionExpireListener"
]
],
"kafka_metric_name": [
[
"ZooKeeperSyncConnectsPerSec.OneMinuteRate"
]
]
}
I want to split the OneMinuteRate into a separate field but can't seem to get it to work. I've tried this:
%{DATA:kafka_node}:type=%{DATA:kafka_metric_type},name=%{WORD:kafka_metric_name}.%{WORD:attr_type}"
but get nothing back then.
I'm also using https://grokdebug.herokuapp.com/ to test these out...
You can either use your last regex with an escaped . (note that a . matches any char but newline, while \. matches a literal dot char), or use the DATA type for the last but one field and GREEDYDATA for the last field:
%{DATA:kafka_node}:type=%{DATA:kafka_metric_type},name=%{DATA:kafka_metric_name}\.%{GREEDYDATA:attr_type}
Since %{DATA:name} translates to (?<name>.*?) and %{GREEDYDATA:name} translates to (?<name>.*), the kafka_metric_name part will match any chars, 0 or more occurrences, as few as possible, up to the first ., and the .* pattern of attr_type will greedily "eat up" the rest of the line up to its end.
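To see the lazy and greedy parts at work without a grok debugger, the pattern can be expanded by hand into a plain regex and run in Python. This is a sketch, not grok itself: it relies on the DATA and GREEDYDATA translations given above, and Python spells named groups as (?P<name>...).

```python
import re

line = (
    "localhost_9999.kafka.server:type=SessionExpireListener,"
    "name=ZooKeeperSyncConnectsPerSec.OneMinuteRate"
)

# DATA -> lazy .*? and GREEDYDATA -> greedy .*, with an escaped literal dot between them.
pattern = re.compile(
    r"(?P<kafka_node>.*?):type=(?P<kafka_metric_type>.*?),"
    r"name=(?P<kafka_metric_name>.*?)\.(?P<attr_type>.*)"
)

match = pattern.match(line)
fields = match.groupdict() if match else {}

print(fields)
```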