I need to transform a field such as
<ADDRESS>Rieglerhütte|27|22|~|~</ADDRESS> into Rieglerhütte 27/22.
That is, remove every ~ and replace every | with /, except the first |, which should become a space.
I tried the translate function:
<?substring-before(ADDRESS,'|')?>
<?translate(translate(substring-after(ADDRESS,'|'),'~|','/'),'|','/')?>
which gives: Rieglerhütte 27///
I also tried replace:
<?xdofx:replace(replace(replace(ADDRESS,'~|',''),'|~',''),'|','/')?>
which works, except that the first | does not become a space: Rieglerhütte/27.
I tried combining substring-before with replace, but that raises an error in BIP (xdofx and xdoxslt can't be used together).
The output I am after is Rieglerhütte 27, or Rieglerhütte 27/11/22/33 for the input Rieglerhütte|27|11|22|33.
I think you were very close. Assuming ~ is your character for "blank", a | separates fields, and you don't want to show blank fields...
Try this:
<?substring-before(ADDRESS,'|')?>
<?translate(translate(substring-after(ADDRESS,'|'),'|~',''),'|','/')?>
It first removes the blank values, and then changes the pipe separator to a forward slash.
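For reference, the intended result can be sketched outside BI Publisher. The Python snippet below is only an illustration: `format_address` and the sample values are made up for this example, and it assumes that `~` marks a blank field and `|` separates fields. Keep in mind that the XPath `translate` function maps individual characters rather than substrings, so it may be worth checking the template output against a reference like this one.

```python
def format_address(raw: str) -> str:
    """Join the non-blank '|'-separated fields: first a space, then slashes."""
    # Assumption: '~' (or an empty string) marks a blank field.
    fields = [f for f in raw.split("|") if f and f != "~"]
    if not fields:
        return ""
    head, rest = fields[0], fields[1:]
    # The first separator becomes a space, the remaining ones become '/'.
    return head if not rest else f"{head} {'/'.join(rest)}"


print(format_address("Street|27|22|~|~"))    # Street 27/22
print(format_address("Street|27|11|22|33"))  # Street 27/11/22/33
```

Splitting into fields first avoids having to special-case the trailing blank markers, which is where the character-level replacements tend to leave stray slashes.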
I have a scenario in which I want to replace a period when it is surrounded by letters, but not when it is surrounded by digits. I figured out a regular expression that matches only the periods in key names, but the pattern does not work in SQL:
SELECT REGEXP_REPLACE("Amount.fee:0.75,Amount.tot:645.55","(?<!\d)(\.)(?!\d)","_","ig");
Expected output: Amount_fee:0.75,Amount_tot:645.55
Note: I am trying this because in MemSQL I couldn't access a JSON key that has a period in it.
I also verified the pattern "(?<!\d)(\.)(?!\d)" at https://coding.tools/regex-replace, where it works fine, but it does not work in SQL. I am using MemSQL 7.1.9, where POSIX enhanced regular expressions are supposed to work. Any help is much appreciated.
Since it looks like you are trying to work around accessing a JSON key with a period in it, I will show you how to do that directly.
This can be done either by surrounding the JSON key name with backticks in the shorthand JSON extract syntax:
select col::%`Amount.fee` from (select '{"Amount.fee":0.75,"Amount.tot":645.55}' col);
+--------------------+
| col::%`Amount.fee` |
+--------------------+
| 0.75 |
+--------------------+
or by using the json_extract_ builtins directly:
select json_extract_double('{"Amount.fee":0.75,"Amount.tot":645.55}', 'Amount.fee');
+------------------------------------------------------------------------------+
| json_extract_double('{"Amount.fee":0.75,"Amount.tot":645.55}', 'Amount.fee') |
+------------------------------------------------------------------------------+
| 0.75 |
+------------------------------------------------------------------------------+
Assuming you only want to target dots that sit between two non-digit characters, where the dot is not the first or last character in the string, you may match on ([^\d])\.([^\d]) and replace with \1_\2:
SELECT REGEXP_REPLACE("Amount.fee:0.75,Amount.tot:645.55", "([^\d])\.([^\d])", "\1_\2", "ig");
Here is a regex demo showing that the replacement is working. Note that you might have to use $1_$2 instead of \1_\2 as the replacement, depending on the regex flavor of your SQL tool.
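The two patterns discussed in this thread can be compared quickly in Python. This is only a sketch using Python's `re` module, not MemSQL's regex engine, so it says nothing about which syntax MemSQL accepts; it just shows that both patterns give the expected output on the sample string.

```python
import re

text = "Amount.fee:0.75,Amount.tot:645.55"

# Lookaround version: a dot with no digit directly before or after it.
with_lookaround = re.sub(r"(?<!\d)\.(?!\d)", "_", text)

# Capture-group version: keep the non-digit neighbours, swap only the dot.
with_groups = re.sub(r"(\D)\.(\D)", r"\1_\2", text)

print(with_lookaround)  # Amount_fee:0.75,Amount_tot:645.55
print(with_groups)      # Amount_fee:0.75,Amount_tot:645.55
```

The two versions differ at the edges: the capture-group pattern needs a character on both sides of the dot, while the lookaround pattern also matches a dot at the very start or end of the string.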
I have the following value in a field which needs to be split into multiple fields,
Classname:
abc.TestAutomation.NNNN.Specs.Prod/NDisableTransactionalAccessUsers.#()::TestAssembly:abc.TestAutomation
Required output:
Productname : abc.TestAutomation.NNNN.Specs.Prod
Feature name : NDisableTransactionalAccessUsers
Project : TestAssembly:abc.TestAutomation
I have been trying to extract the values into my fields using REX command, but I am failing.
source="Reports.csv" index="prod_reports_data" sourcetype="ReportsData"
| rex "classname(?<Productname>/*)\.(?<Featurename>#*)\.(?<Project>.*)"
| table classname Productname Featurename Project
When I run this command, there are no results. I am very new to Splunk; can someone guide me?
Thanks.
I almost always use multiple rex statements to get what I want, but if you "know" the data is consistent, this will work (tried on regex101.com):
| rex field=_raw "(?<classname>[^\/]+)\/(?<featurename>[^\.]+)\.[[:punct:]]+(?<project>[\w].+)"
What this regular expression does:
<classname> :: everything from the start of the event up to a forward slash (/)
<featurename> :: whatever follows the forward slash (/) up to a literal dot (.)
then discard the run of punctuation that follows
<project> :: whatever is left on the line
According to regex101.com, this is likely the most efficient rex you can use (14 steps total)
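To check the captures outside Splunk, the same expression can be adapted to Python. This is a sketch rather than the Splunk syntax: Python spells named groups `(?P<name>...)` and has no `[[:punct:]]` class, so `[^\w\s]+` is used as a stand-in for the punctuation run, which is an assumption of this example.

```python
import re

# Python spells named groups (?P<name>...) and has no [[:punct:]] class,
# so [^\w\s]+ stands in for "a run of punctuation" here (an assumption).
PATTERN = re.compile(
    r"(?P<classname>[^/]+)/(?P<featurename>[^.]+)\.[^\w\s]+(?P<project>\w.+)"
)

value = (
    "abc.TestAutomation.NNNN.Specs.Prod/"
    "NDisableTransactionalAccessUsers.#()::TestAssembly:abc.TestAutomation"
)

match = PATTERN.match(value)
fields = match.groupdict() if match else {}
for name, captured in fields.items():
    print(f"{name}: {captured}")
```

With the sample value from the question, the three groups come out as the product name, the feature name, and the project, in that order.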
I have a comma-separated file in which I need to remove the leading zeroes from the string in the first column. The text file looks like this:
ABC-0001,ab,0001
ABC-0010,bc,0010
I need the data to look like this:
ABC-1,ab,0001
ABC-10,bc,0010
I tried a command-line replacement:
sed 's/ABC-0*[1-9]/ABC-[1-9]/g' file
but ended up with this output:
ABC-[1-9],ab,0001
ABC-[1-9]0,bc,0010
Can you please tell me what I am missing here?
Alternatively, I tried to apply the formatting in the SQL that generates this file:
select regexp_replace(key,'((0+)|1-9|0+)','(1-9|0+)') from file where key in ('ABC-0001','ABC-0010')
which gives output as
ABC-(1-9|0+)1
ABC-(1-9|0+)1(1-9|0+)
Help with either solution would be much appreciated!
Try this:
sed -E 's/ABC-0*([1-9])/ABC-\1/g' file
                -------     --
                   |        |
            capturing group |
                     captured group
To do it in the query using Oracle, assuming the key value with the zeroes you want to remove is in a column called "key" in a table called "file", the statement would look like this:
select regexp_replace(key, '(-)(0+)(.*)', '\1\3')
from file;
You need to capture the dash because it is "consumed" by the regex when it is matched. The second group matches one or more 0's, and the third group matches the rest of the field. Replacing the match with captured groups 1 and 3 leaves out the 0's between them.
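Both fixes rely on the same idea: capture the part you want to keep and refer back to it in the replacement. A small Python sketch of the two approaches, using the sample rows from the question (the variable names are made up for illustration):

```python
import re

lines = ["ABC-0001,ab,0001", "ABC-0010,bc,0010"]

# Same idea as the sed command: capture the first non-zero digit and
# put it back with \1, dropping the zeroes matched by 0*.
sed_style = [re.sub(r"ABC-0*([1-9])", r"ABC-\1", line, count=1) for line in lines]

# Same idea as the Oracle query, applied to the key column only.
keys = [line.split(",")[0] for line in lines]
oracle_style = [re.sub(r"(-)(0+)(.*)", r"\1\3", key) for key in keys]

print(sed_style)     # ['ABC-1,ab,0001', 'ABC-10,bc,0010']
print(oracle_style)  # ['ABC-1', 'ABC-10']
```

In the original sed attempt, `[1-9]` in the replacement was taken literally, which is why it showed up in the output; a replacement string can only reuse matched text through a back-reference such as `\1`.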
I am trying to come up with a (POSIX-like) regex in a vendor application that returns data like the sample below. It presents a single line of data at a time, so I do not need to account for multiple rows and only need to match each row individually.
The string result can contain one or more values.
Unfortunately, the application doesn't let me just use "\d+\.\d+" to capture the number out of the string. I need to map every component of a row to a variable, even the ones I am going to discard; otherwise it returns a negative match result.
My data looks like the following, with the weird underscore padding.
USER | ___________ 3.58625 | ___________ 7.02235 |
USER | ___________ 10.02625 | ___________ 15.23625 |
The syntax it supports is
Matches REGEX "(Var1 Regex), (Var2 Regex), (Var3 Regex), (Var 4 regex), (Var 5 regex)", and the entire string must match the aggregation of the regex components; a single character off and you get nothing.
The "|" characters are field separators for the data.
So for the data above I need a regex that captures everything up to the start of the first number into Var1, the first decimal number into Var2, everything up to the next number into Var3, the second decimal number into Var4, and the trailing space and closing | field separator into Var5. Only Var2 and Var4 will be useful, but I have to capture the entire string.
I have mainly tried capturing between the bars "|" using ^.*\|(.*).\|*$ from this question.
I have also tried the multiple variable ([0-9]+k?[.,]?[0-9]+)\s*-\s*.*?([0-9]+k?[.,]?[0-9]+) mentioned in this question.
When I try these in RegExr, I feel like I am missing something pretty simple.
I never get more than one part of the string: either just the number or the entire string in a single variable, and neither accomplishes the goal in this context.
The only example the documentation provides is the following, for something like a syslog entry, which I'm consolidating here as "Fault with Resource Name: Disk Specific Problem: Offline":
WHERE value matches regex "(.)Resource Name: (.), Specific Problem: ([^,]),(.)"
SET _Rrsc = var02
SET _Prob = var03
I've spun my wheels on this for several hours so would appreciate any guidance / help to get me over this hump.
Something like this should work:
(\D+)([\d.]+)(\D+)([\d.]+)(.*)
Or in normal words: Capture everything but numbers, capture a decimal number, capture everything but numbers, capture a decimal number, capture everything.
Using USER | ___________ 10.02625 | ___________ 15.23625 |
$1 = USER | ___________
$2 = 10.02625
$3 = | ___________
$4 = 15.23625
$5 = |
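The five captures can be reproduced with a short Python check. This is a sketch using Python's `re` module rather than the vendor application's engine; `fullmatch` is used here to mimic the requirement that the whole row must match.

```python
import re

ROW = re.compile(r"(\D+)([\d.]+)(\D+)([\d.]+)(.*)")

line = "USER | ___________ 10.02625 | ___________ 15.23625 |"

match = ROW.fullmatch(line)  # the whole row has to match, as in the vendor tool
groups = match.groups() if match else ()
for number, value in enumerate(groups, start=1):
    print(f"var{number} = {value!r}")
```

Because the five groups sit back to back with nothing between them, joining the captures gives back the original row, which is a handy way to confirm that nothing was skipped.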
I'm trying to substitute all non matching characters in a single line between certain columns (after a search).
Example:
The search pattern can be anything.
In the example below, the search pattern is test.
The substitute for non-matching characters is a space.
I want to substitute all characters that are not part of "test" between columns 10 and 30.
Columns 10 and 30 are indicated with |
before: djd<aj.testjal.kjetestjaja testlala ratesttsuvtesta !<-a-
| |
after: djd<aj.test test testlala ratesttsuvtesta !<-a-
How can I realize this?
Use the following substitution command on that line.
:s/\(test\)\zs\|\%>9v\%<31v./\=submatch(1)!=''?'':' '/g
If the range of columns is specified using visual selection, run
:'<,'>s/\(test\)\zs\|\%V./\=submatch(1)!=''?'':' '/g
One method may be to select the appropriate column range using Visual mode (Ctrl+V).
Once selected, the search and replace can be done using (see this question)
%s/\%Vfoo/bar/g
A regular expression for not test can be found here: Regular expression to match a line that doesn't contain a word?
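The logic behind the first substitution command (try to match the word first and leave it alone, otherwise blank a single character if it lies in the column range) can be sketched in Python. The function name `blank_non_matches` is made up for this illustration, and the sketch assumes one character per column, so no tabs or wide characters.

```python
import re


def blank_non_matches(line: str, word: str, first_col: int, last_col: int) -> str:
    """Blank characters in columns first_col..last_col that are not part of word."""
    # Alternation order matters: the word is tried before the single character.
    pattern = re.compile(f"({re.escape(word)})|(.)")

    def replace(match: re.Match) -> str:
        if match.group(1) is not None:
            return match.group(1)  # an occurrence of the word is kept as-is
        column = match.start() + 1  # columns are 1-based, as in Vim
        inside = first_col <= column <= last_col
        return " " if inside else match.group(2)

    return pattern.sub(replace, line)


print(repr(blank_non_matches("xxtestxxtestxx", "test", 1, 8)))
```

As in the Vim command, an occurrence of the word that straddles the edge of the column range is kept whole, and characters outside the range are never touched.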