I need to capture the structured data (SD) elements in RFC 5424 log messages. I wrote a regex for this, but some of the values are JSON and contain '[' and ']', which breaks the match. Can someone help me write a regex that captures this data correctly?
Example:
import re
text = """[exampleSDID#32473 iut="3" eventSource="Application" eventID="1011"][examplePriority#32473 class="{"fruits": [{ "kiwis": 3,"mangos": 4,"apple": null},{ "bag": true }],"vegetables": {"patatoes": "amandine","peas": false},"meat": ["fish","chicken","beef"]}"]"""
regex = r"""\[(\S+#[^\]]+)\]"""
matchAllSD = re.findall(regex, text)
for SD in matchAllSD:
    print(SD)
$ python main.py
exampleSDID#32473 iut="3" eventSource="Application" eventID="1011"
examplePriority#32473 class="{"fruits": [{ "kiwis": 3,"mangos": 4,"apple": null},{ "bag": true }
Thanks!
Maybe this is what you are looking for:
\[([^\]]*?(?:class=".*")?)\]
Demo
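As a rough check, the suggested pattern can be run against the sample text from the question. This is only a sketch: the variable names pattern and sd_elements are illustrative, and it assumes the class="..." element is the last SD element in the message, because the greedy .* runs to the last "] it can find.

```python
import re

# Sample text copied from the question.
text = '''[exampleSDID#32473 iut="3" eventSource="Application" eventID="1011"][examplePriority#32473 class="{"fruits": [{ "kiwis": 3,"mangos": 4,"apple": null},{ "bag": true }],"vegetables": {"patatoes": "amandine","peas": false},"meat": ["fish","chicken","beef"]}"]'''

# Lazy prefix up to an optional class="..." value; the greedy .* inside the
# quotes lets the JSON contain ']' characters.
pattern = r'\[([^\]]*?(?:class=".*")?)\]'

sd_elements = re.findall(pattern, text)
for sd in sd_elements:
    print(sd)
```

If another SD element followed the class="..." one, the greedy .* would swallow it as well, so the pattern would need tightening for that case.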
I have a log file with a specific format and I want to extract a field using a pattern, but I am still not able to retrieve the correct value.
This is a line from my log file:
2021-02-08 09:09:38,111 INFO [stdout] (default task-26) Payload: {"rqUID":"12345678-abdcABCD-09876543-abcdefgh","addUsrIdentif":{"userId":"string","requestDate":"string","userLanguage":"string","financialInstitution":"string","products":["string"],"providerLogin":"string","providerPasswd":"string"},"customerInformation":{"customerId":"string","orgId":"string","orgDepth":12,"contractId":"string"},"merchantInformation":{"chainId":"string","merchantId":"string","posId":"string"},"agentInformation":{"oedInstitutionId":"string","branchId":"string","agentId":"string"},"caseReference":12}
I want to extract the value of the rqUID field. Using the pattern below, I get this result:
[[inputs.logparser]]
files = ["D:\\server.log"]
from_beginning = true
[inputs.logparser.grok]
measurement = "PWCAPI"
patterns = ["%{LOG_PATTERN}"]
custom_patterns = '''
LOG_PATTERN %{WORD:rqUID}
'''
Results :
{
"rqUID": [
[
"2021"
]
]
}
My goal is to get: rqUID = 12345678-abdcABCD-09876543-abcdefgh
I tested the pattern using: https://grokdebug.herokuapp.com/
Can someone help? Thanks in advance :)
You can use a named capturing group here with a customized pattern:
"rqUID":"(?<rqUID>[^"]+)
Details:
"rqUID":" - a literal substring
(?<rqUID>[^"]+) - a named capturing group rqUID that captures any one or more chars other than a " char.
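For comparison, the same idea can be tried outside grok. The sketch below uses Python, where a named group is written (?P<name>...) rather than (?<name>...); the log line is an abbreviated version of the one in the question, and the variable names are illustrative.

```python
import re

# Abbreviated version of the log line from the question.
line = ('2021-02-08 09:09:38,111 INFO [stdout] (default task-26) Payload: '
        '{"rqUID":"12345678-abdcABCD-09876543-abcdefgh","caseReference":12}')

# Literal "rqUID":" followed by a named group that stops at the next quote.
pattern = re.compile(r'"rqUID":"(?P<rqUID>[^"]+)')

match = pattern.search(line)
rq_uid = match.group("rqUID") if match else None
print(rq_uid)
```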
I need to parse a text file, testresults.txt, capture the serial number, and then write the captured serial number to a separate text file called serialno.txt using Groovy in a JMeter JSR223 PostProcessor.
The code below is not working: it never enters the while loop. Kindly help.
import java.util.regex.Pattern
import java.util.regex.Matcher
String filecontent = new File("C:/device/resources/testresults.txt").text
def regex = "SerialNumber\" value=\"(.+)\""
java.util.regex.Pattern p = java.util.regex.Pattern.compile(regex)
java.util.regex.Matcher m = p.matcher(filecontent)
File SN = new File("C:/device/resources/serialno.txt")
while(m.find()) {
    SN.write m.group(1)
}
If your code doesn't enter the loop, it means that there are no matches, so you need to amend your regular expression. You can use, for example, the Regex101 website for experiments.
Given the following content of the testresults.txt file:
SerialNumber" value="foo"
SerialNumber" value="bar"
SerialNumber" value="baz"
your code works fine.
For the time being I can only suggest using the find operator (=~) to make your code more "groovy":
def source = new File('C:/device/resources/testresults.txt').text
def matches = (source =~ 'SerialNumber" value="(.+?)"')
matches.each { match ->
    new File('C:/device/resources/serialno.txt') << match[1] << System.getProperty('line.separator')
}
More information: Apache Groovy - Why and How You Should Use It
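The suggested code also switches the capture from (.+) to the lazy (.+?). A small Python sketch of why that can matter is shown here; the unit="x" attribute on the first line is a made-up addition to the sample content, used only to show the difference.

```python
import re

# Sample content; the extra unit="x" attribute is hypothetical.
content = 'SerialNumber" value="foo" unit="x"\nSerialNumber" value="bar"\n'

# Greedy: .+ runs to the last quote on the line.
greedy = re.findall(r'SerialNumber" value="(.+)"', content)

# Lazy: .+? stops at the first closing quote.
lazy = re.findall(r'SerialNumber" value="(.+?)"', content)

# One serial number per line, as the file would be written.
output_text = "\n".join(lazy)
print(greedy)
print(lazy)
```

When each line holds only one quoted value, both forms capture the same text.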
I have a question about regular expressions (regex), and I am really new to this. I found a tutorial that uses a regex in Python to find the labels in the data and replace them with an empty string.
This is the code from Python:
import re
def extract_identity(data, context):
    """Background Cloud Function to be triggered by Pub/Sub.
    Args:
        data (dict): The dictionary with data specific to this type of event.
        context (google.cloud.functions.Context): The Cloud Functions event
            metadata.
    """
    import base64
    import json
    import urllib.parse
    import urllib.request
    if 'data' in data:
        strjson = base64.b64decode(data['data']).decode('utf-8')
        text = json.loads(strjson)
        text = text['data']['results'][0]['description']
        lines = text.split("\n")
        res = []
        for line in lines:
            line = re.sub('gol. darah|nik|kewarganegaraan|nama|status perkawinan|berlaku hingga|alamat|agama|tempat/tgl lahir|jenis kelamin|gol darah|rt/rw|kel|desa|kecamatan', '', line, flags=re.IGNORECASE)
            line = line.replace(":","").strip()
            if line != "":
                res.append(line)
        p = {
            "province": res[0],
            "city": res[1],
            "id": res[2],
            "name": res[3],
            "birthdate": res[4],
        }
        print('Information extracted:{}'.format(p))
In the above function, information extraction is done by removing all e-KTP labels with regular expressions.
This is the sample of e-KTP:
And this is the result after scanning that e-KTP using the python code:
Information extracted:{'province': 'PROVINSI JAWA TIMUR', 'city': 'KABUPATEN BANYUWANGI', 'id': '351024300b730004', 'name': 'TUHAN', 'birthdate': 'BANYUWANGI, 30-06-1973'}
This is the full tutorial from the above code.
My question is: can we use a Regex in Kotlin to remove the labels from the e-KTP scan result, as in the Python code? The logic I tried does not remove the e-KTP labels. My Kotlin code looks like this:
....
val lines = result.text.split("\n")
val res = mutableListOf<String>()
Log.e("TAG LIST STRING", lines.toString())
for (line in lines) {
    Log.e("TAG STRING", line)
    line.matches(Regex("gol. darah|nik|kewarganegaraan|nama|status perkawinan|berlaku hingga|alamat|agama|tempat/tgl lahir|jenis kelamin|gol darah|rt/rw|kel|desa|kecamatan"))
    line.replace(":","")
    if (line != "") {
        res.add(line)
    }
    Log.e("TAG RES", res.toString())
}
Log.e("TAG INSERT", res.toString())
tvProvinsi.text = res[0]
tvKota.text = res[1]
tvNIK.text = res[2]
tvNama.text = res[3]
tvTgl.text = res[4]
....
And this is the result of my code:
TAG LIST STRING: [PROVINSI JAWA BARAP, KABUPATEN TASIKMALAYA, NIK 320625XXXXXXXXXX, BRiEAFAUZEROMARA, Nama, TempatTgiLahir, Jenis keiamir, etc]
TAG INSERT: [PROVINSI JAWA BARAP, KABUPATEN TASIKMALAYA, NIK 320625XXXXXXXXXX, BRiEAFAUZEROMARA, Nama, TempatTgiLahir, Jenis keiamir, etc]
The labels are still there. Is it possible to remove them using a Regex or something similar in Kotlin, as in the Python code?
The point is to use kotlin.text.replace with a Regex as the search argument. For example:
text = text.replace(Regex("""<REGEX_PATTERN_HERE>"""), "<REPLACEMENT_STRING_HERE>")
You may use
line = line.replace(Regex("""(?i)gol\. darah|nik|kewarganegaraan|nama|status perkawinan|berlaku hingga|alamat|agama|tempat/tgl lahir|jenis kelamin|gol darah|rt/rw|kel|desa|kecamatan"""), "")
Note that (?i) at the start of the pattern is a quick way to make the whole pattern case insensitive.
Also, to match a literal . in a regex you need to escape it. A backslash can be written in several ways inside string literals and is easy to get wrong, so it is best to define regex patterns in raw string literals. In Kotlin, use triple-quoted string literals, i.e. """...""", where each \ is treated as a literal backslash that forms regex escapes.
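The effect of the (?i) flag and of escaping the dot can be checked with a short Python sketch. The sample lines and the helper name strip_labels are made up for illustration; they are not taken from the tutorial.

```python
import re

# Same alternation as above, with (?i) for case-insensitive matching
# and the dot in "gol. darah" escaped.
LABELS = re.compile(
    r"(?i)gol\. darah|nik|kewarganegaraan|nama|status perkawinan|berlaku hingga"
    r"|alamat|agama|tempat/tgl lahir|jenis kelamin|gol darah|rt/rw|kel|desa|kecamatan"
)

def strip_labels(lines):
    """Remove known labels and colons; drop lines that end up empty."""
    res = []
    for line in lines:
        cleaned = LABELS.sub("", line).replace(":", "").strip()
        if cleaned:
            res.append(cleaned)
    return res

# Hypothetical OCR lines, not real data.
sample = ["PROVINSI JAWA TIMUR", "NIK : 1234567890123456", "Nama : TUHAN", "Alamat"]
result = strip_labels(sample)
print(result)

# An unescaped dot matches any character; an escaped dot matches only ".".
unescaped = re.sub(r"gol. darah", "", "golX darah")
escaped = re.sub(r"gol\. darah", "", "golX darah")
```

Short labels such as kel can also match inside longer words, so real OCR output may need word boundaries or a stricter pattern.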
I'm working on importing a CSV dataset into a Google Sheet from my Drive. The script works, but whenever the data is imported it looks like this:
After Import
var file = DriveApp.getFileById(url);
var csvString = file.getBlob().getDataAsString('UTF-8').replace(/\uFFFD/g, '');
var csvData = Utilities.parseCsv(csvString);
var sheet = SpreadsheetApp.openById(sheetid);
var s = sheet.getSheetByName('Data');
s.getRange(1, 1, csvData.length, csvData[0].length).setValues(csvData);
I've tried a number of different regex expressions to replace the unknown characters but after a few days trying to figure it out, I figured I'd post it on here and get a bit of help. (I didn't include the .replace() in the code because I couldn't get it to work. This is the code that is working to only paste it to my sheet)
Edit* Here is the Expected Output - I've whited out the email addresses and usernames to keep the information private.
Expected Output
I am writing a web application in Go. I am using a regular expression to validate the URL path, but I cannot get it to accept an image file name (abc.png).
var validPath = regexp.MustCompile("^/$|/(home|about|badge)/(|[a-zA-Z0-9]+)$")
The above regex accepts /home/ and /about/, but it does not match /abc.png. The . itself does not seem to work.
I tried the following regex, but it didn't help
var validPath = regexp.MustCompile("^/$|/(home|about|badge|.)/(|[a-zA-Z0-9]+)$")
var validPath = regexp.MustCompile("^/$|/(home|about|badge)(/|.)(|[a-zA-Z0-9]+)$")
And I am trying to match http://localhost:8080/badge.png
Could anyone please help me on this?
It appears
^/$|^(?:/(home|about|badge))?/((?:badge|abc)\.png|[a-zA-Z0-9]*)$
should work for you. See the regex demo.
The pattern breakdown:
^/$ - a / as a whole string
| - or...
^ - start of string
(?:/(home|about|badge))? - an optional sequence of / followed by home, about or badge (captured in Group 1)
/ - a / symbol followed with
((?:badge|abc)\.png|[a-zA-Z0-9]*) - Group 2 capturing:
(?:badge|abc)\.png - badge or abc followed with .png
| - or...
[a-zA-Z0-9]* - zero or more alphanumerics
$ - end of string
And here is the Go playground demo.
package main
import "fmt"
import "regexp"
func main() {
	//var validPath = regexp.MustCompile("^/((home|about)(/[a-zA-Z0-9]*)?|[a-zA-Z0-9]+\\.[a-z]+)?$")
	var validPath = regexp.MustCompile(`^/$|^(?:/(home|about|badge))?/((?:badge|abc)\.png|[a-zA-Z0-9]*)$`)
	fmt.Println(validPath.MatchString("/"), validPath.MatchString("/home/"), validPath.MatchString("/about/"), validPath.MatchString("/home/13jia0"), validPath.MatchString("/about/1jnmjan"), validPath.MatchString("/badge.png"), validPath.MatchString("/abc.png"))
	fmt.Println(validPath.MatchString("/nope/"), validPath.MatchString("/invalid.png"), validPath.MatchString("/test/test"))
	m := validPath.FindStringSubmatch("/about/global")
	// Check for a failed match before indexing into m to avoid a panic.
	if m == nil || m[2] != "global" {
		fmt.Println("Not valid")
		return
	}
	fmt.Println("validate() :: URL validation path m[1] : ", m[1])
	fmt.Println("validate() :: URL validation path m[2] : ", m[2])
}
What you are looking for is the following (based off the example paths you posted):
var validPath = regexp.MustCompile("^/((home|about)(/[a-zA-Z0-9]*)?|[a-zA-Z0-9]+\\.[a-z]+)?$")
Playground with examples
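The behaviour of this pattern can be sketched in Python, whose re module accepts the same syntax for this particular expression. The list of test paths is illustrative, and results from Go's regexp package should be confirmed in Go itself.

```python
import re

# Same pattern as above, written as a Python raw string.
valid_path = re.compile(r"^/((home|about)(/[a-zA-Z0-9]*)?|[a-zA-Z0-9]+\.[a-z]+)?$")

# Illustrative paths: the first five should be accepted, the last two rejected.
paths = ["/", "/home/", "/about/1jnmjan", "/badge.png", "/abc.png", "/nope/", "/test/test"]

results = {p: bool(valid_path.match(p)) for p in paths}
for path, ok in results.items():
    print(path, ok)
```

Unlike the first answer, this pattern accepts any alphanumeric file name with a lowercase extension, not only badge.png and abc.png.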
You can validate with the following Regex:
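var validPath = regexp.MustCompile(`^/(home|about|badge)/[a-zA-Z0-9]+[.][a-z]+$`)
P.S. The regex is flexible about the file extension, so it accepts many image formats: png, jpg, jpeg and so on. Note that it expects the file under /home/, /about/ or /badge/ (for example /home/abc.png), so it does not match /badge.png directly. The pattern is written as a raw string literal because \/ is not a valid escape in a double-quoted Go string.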
You can test it here: Regex