I'm trying to write a regex query in Solr that uses lookaheads to match terms in any order, but it's not working.
I'm doing the following query:
q=(itemKeyword:/(?=.*REDE).*(?=.*ENLACE).*/)
To get the document containing:
"INTERFACE GERENCIAMENTO RADIO ENLACE - APLICACAO: MICROONDAS, TIPO ACESSO: REMOTO TELEACIONAMENTO, ARMAZENAMENTO DADOS: HD, CANAL SERVICO: N/A, MONITOR: N/A, FUNCAO: GERENCIAMENTO DE REDE"
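Solr uses Lucene's regular-expression syntax, which does not support lookaheads. You can use the & (intersection) operator instead: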
itemKeyword:/.*REDE.*&.*ENLACE.*/
The pattern matches a string that contains both REDE and ENLACE.
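To illustrate what the intersection expresses, here is a minimal Python sketch. Python's re module has no & operator, so the sketch emulates it in two ways: by requiring every term to be found, and by building an alternation that lists both orders (a form that also works in engines without intersection). The helper names contains_all and any_order_pattern are hypothetical, and the sketch assumes the field is indexed as one untokenized term, since Lucene regex queries match against whole indexed terms.

```python
import re

def contains_all(text, terms):
    """Return True when every term occurs somewhere in text, in any order."""
    return all(re.search(re.escape(term), text) for term in terms)

def any_order_pattern(a, b):
    """Build a pattern without intersection: a before b, or b before a."""
    a, b = re.escape(a), re.escape(b)
    return f".*{a}.*{b}.*|.*{b}.*{a}.*"

# Shortened version of the document from the question
doc = ("INTERFACE GERENCIAMENTO RADIO ENLACE - APLICACAO: MICROONDAS, "
       "FUNCAO: GERENCIAMENTO DE REDE")

both_terms = contains_all(doc, ["REDE", "ENLACE"])
either_order = re.fullmatch(any_order_pattern("REDE", "ENLACE"), doc) is not None
```

The alternation grows quickly with the number of terms (n terms need n! orderings), which is why the intersection operator is the more practical choice when the engine offers it.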
Hope you're doing well.
Imagine I have the following Sheet:
5:20:58 xxxx: entro con el mismo xxxx
5:21:08 xxxx: xxxx
5:21:58 xxxxx: Perfecto, te pido de 5 a 10 minutos mientras
reviso la configuración de las etiquetas. ¿De acuerdo?
5:22:04 xxxxx: ok
I need to remove the timestamp from all of those rows. The expected result:
xxxx: entro con el mismo xxxx
xxxx: xxxx
xxxxx: Perfecto, te pido de 5 a 10 minutos mientras
reviso la configuración de las etiquetas. ¿De acuerdo?
xxxxx: ok
Is there a formula in Google Sheets to do this?
I tried REPLACE and SPLIT, but neither works for all the rows in the sheet.
(The real sheet has too many rows, I extracted a part from the sheet to give an example)
EDIT
(following OP's comment)
...sometimes the data does not start with a timestamp. ... How can I adjust the formula to make it work?
Please use the following altered formula
=INDEX(IFERROR(REGEXEXTRACT(B1:B;" (.*)");B1:B))
OR (for an even more robust formula)
=INDEX(IFERROR(REGEXEXTRACT(B1:B;"^\d+:\d+:\d+ (.+)");B1:B))
Original answer
Please use the following formula (adjust range to your needs)
=INDEX(IFERROR(REGEXEXTRACT(B1:B;" (.*)")))
OR (depending on your locale)
=INDEX(IFERROR(REGEXEXTRACT(B1:B," (.*)")))
Functions used:
INDEX
IFERROR
REGEXEXTRACT
Let's say your raw data is in A2:A. Place this in the second cell (e.g., B2) of an otherwise empty column:
=ArrayFormula(IF(A2:A="",,TRIM(REGEXREPLACE(A2:A,"\d+:\d+:\d+",""))))
ADDENDUM:
Version for some international locales (where semicolon is used in place of a comma within formulas):
=ArrayFormula(IF(A2:A="";;TRIM(REGEXREPLACE(A2:A;"\d+:\d+:\d+";""))))
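For readers who want to test the pattern outside Google Sheets, the same idea can be sketched in Python. This is only an illustration under stated assumptions: the helper name strip_timestamp is hypothetical, and the pattern is anchored with ^ so that only a leading timestamp is removed, mirroring the more robust REGEXEXTRACT variant above.

```python
import re

# A leading h:mm:ss timestamp followed by whitespace
TIMESTAMP_PREFIX = re.compile(r'^\d+:\d+:\d+\s+')

def strip_timestamp(row: str) -> str:
    """Remove a leading timestamp; rows without one are returned unchanged."""
    return TIMESTAMP_PREFIX.sub('', row, count=1)

rows = [
    "5:20:58 xxxx: entro con el mismo xxxx",
    "reviso la configuración de las etiquetas. ¿De acuerdo?",
    "5:22:04 xxxxx: ok",
]
cleaned = [strip_timestamp(r) for r in rows]
```

Anchoring matters here: the unanchored REGEXREPLACE pattern would also delete any time that appears in the middle of a message, while the anchored form leaves the message body alone.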
I have Dell servers with iDrac 8. Monitoring: Prometheus+snmp_exporter+Grafana.
MIB: iDRAC-SMIv2
OID: 1.3.6.1.4.1.674.10892.5.4.300.40.1.8
From SNMP I get eventLogDateName in the format: 20201222152131.000000+120
How can I use a regex to convert 20201222152131.000000+120 to 12/22/20 15:21:31? I don't know where I need to insert my regex.
P.S.
pattern = '^(?P<YYYY>\d{4})(?P<MM>\d{2})(?P<DD>\d{2})(?P<HH>\d{2})(?P<mm>\d{2})(?P<ss>\d{2})\.(?P<SSSSSS>\d{6})(?P<ZZ>[-+]\d{3,4})$'
replacement = "${YYYY}-${MM}-${DD} ${HH}:${mm}:${ss}"
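The pattern above can be checked in isolation. The following Python sketch reuses it unchanged and builds the MM/DD/YY HH:mm:ss form asked for in the question. It only demonstrates the regex itself; it says nothing about where snmp_exporter or Grafana would apply it, and the function name format_event_date is hypothetical.

```python
import re

# Same named groups as the pattern in the question
PATTERN = re.compile(
    r'^(?P<YYYY>\d{4})(?P<MM>\d{2})(?P<DD>\d{2})'
    r'(?P<HH>\d{2})(?P<mm>\d{2})(?P<ss>\d{2})'
    r'\.(?P<SSSSSS>\d{6})(?P<ZZ>[-+]\d{3,4})$'
)

def format_event_date(raw: str) -> str:
    """Convert e.g. 20201222152131.000000+120 to 12/22/20 15:21:31."""
    m = PATTERN.match(raw)
    if m is None:
        return raw  # leave values that do not fit the format untouched
    g = m.groupdict()
    # A two-digit year cannot be produced by a plain group reference,
    # so the last two digits of YYYY are sliced out here.
    return f"{g['MM']}/{g['DD']}/{g['YYYY'][2:]} {g['HH']}:{g['mm']}:{g['ss']}"

formatted = format_event_date("20201222152131.000000+120")
```

If the target tool only accepts a pattern plus a replacement string, one option is to split the year into two groups, for example (?P<CC>\d{2})(?P<YY>\d{2}), so the replacement can reference the two-digit part directly.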
This question is related to RegEx find all XML tags but I'm trying to do it in Windows PowerShell.
I have an XML file that contains many different XML tags, and the file is huge, so I want to use RegEx to parse the file and output the names of all the tags as a list. The file is not a valid XML document, even though it contains XML tags and elements, so PowerShell's XML functions won't work: I get many errors when trying to load it as XML, hence the need for RegEx.
I've determined that the following RegEx identifies the tags (thanks to the related question mentioned above): (?<=<)([^\/]*?)((?= \/>)|(?=>))
Here's a very small snippet of the file I'm parsing:
<data><bp_year /><bp_make>John Deere</bp_make><bp_model>650</bp_model><bp_price>3000.00</bp_price><bp_txtDayPhone>555-555-5555</bp_txtDayPhone><bp_bestPrice>3000.0000</bp_bestPrice><bp_txtComments>Best price available?</bp_txtComments><bp_url>https://www.example.com</bp_url></data>
<data><receiveOffers /><link>http://example.com/inventory.htm?id=2217405&used=1</link><itemName>2007 Yamaha RHINO 660</itemName></data>
<data><vehicleYear>2008</vehicleYear><vehicleMake>Buick</vehicleMake><vehicleModel>Enclave</vehicleModel><vehicleStyle>CX</vehicleStyle><vehicleInformation /><vehicleMileage /><phone>555-555-5555</phone><timeOfDay>Morning</timeOfDay><message /></data>
<data><mo_year>2009</mo_year><mo_make>Webasto</mo_make><mo_model>Air Top 2000</mo_model><mo_price /><mo_txtDayPhone>555-555-5555</mo_txtDayPhone><mo_txtOffer>700</mo_txtOffer><mo_txtTrade /><mo_txtComments /></data>
I really don't have much experience with Powershell, but from my understanding, you can do Grep stuff with it. After searching around on the internet, I found some resources that helped point me towards my solution, via using the powershell Select-String command.
I've attempted the following powershell command, but it gives me way too much feedback. I just want a master "Matches" list.
Select-String -Path '.\dataXML stuff - Copy.xml' -Pattern "(?<=<)([^\/]*?)((?= \/>)|(?=>))" -AllMatches | Format-List -Property Matches
Sample of Output generated:
Matches : {data, vehicleYear, vehicleMake, vehicleModel...}
Matches : {data, address, city, region...}
Matches : {data, vehicleYear, vehicleMake, vehicleModel...}
Matches : {data, vehicleYear, vehicleMake, vehicleModel...}
Matches : {data, address, city, region...}
Matches : {data, vehicleYear, vehicleMake, vehicleModel...}
Matches : {data, vehicleYear, vehicleMake, vehicleModel...}
Matches : {data, mo_year, mo_make, mo_model...}
Basically, I want something like:
data
vehicleYear
vehicleMake
vehicleModel
address
city
region
mo_year
mo_make
mo_model
and so on and on....
Where only the matched strings are returned and listed, rather than telling me what matched on each line of the XML file. I prefer the list format because then I can pump this into Excel and get a distinct list of tag names, and then start actually doing what I need to accomplish, but the overwhelming number of different XML tags and not knowing what they are is holding me up.
Maybe Select-String isn't the best method to use, but I feel like I'm close to my solution after finding this Microsoft post:
https://social.technet.microsoft.com/Forums/windowsserver/en-US/d5bbd2fb-c8fa-43ed-b432-79ebfeee82ea/return-only-matches-from-selectstring?forum=winserverpowershell
Basically, here's the solution modified to fit my needs:
Gc 'C:\Documents\dataXML stuff - Copy.xml'|Select-String -Pattern "(?<=<)([^\/]*?)((?= \/>)|(?=>))"|foreach {$_.matches}|select value
It provides a list of all the xml tags, just like I wanted, except it only returns the first XML tag of that line, so I get a lot of:
data
data
data
but no vehicleYear, vehicleMake, vehicleModel, etc., which would have been the 2nd or 3rd or 11th xml tag of that line.
As for ...
Like I mentioned earlier in the post, I do not use PowerShell at all
Reading is a good thing, but seeing it in action is better. There are many free video resources for learning PowerShell from the beginning, and tons of references. Then there are the MS TechNet virtual labs to leverage.
See this post for folks providing some paths for learning PowerShell.
Does anyone have any experience teaching others powershell?
https://www.reddit.com/r/PowerShell/comments/7oir35/help_with_teaching_others_powershell
Sure you could do it with RegEx, but it is best to handle it natively.
In PowerShell, XML is a big deal, as is JSON. All the help files are just XML files. There are built-in cmdlets to deal with it.
# Get parameters, examples, full and Online help for a cmdlet or function
Get-Command -Name '*xml*' | Format-Table -AutoSize
(Get-Command -Name Select-Xml).Parameters
Get-help -Name Select-Xml -Examples
Get-help -Name Select-Xml -Full
Get-help -Name Select-Xml -Online
Get-Help about_*
# Find all cmdlets / functions with a target parameter
Get-Help * -Parameter xml
# All Help topics locations
explorer "$pshome\$($Host.CurrentCulture.Name)"
And many sites that present articles on dealing with it.
PowerShell Data Basics: XML
To master PowerShell, you must know how to use XML. XML is an essential data interchange format because it remains the most reliable way of ensuring that an object's data is preserved. Fortunately, PowerShell makes it all easy, as Michael Sorens demonstrates.
https://www.red-gate.com/simple-talk/sysadmin/powershell/powershell-data-basics-xml
Converting XML to PowerShell PSObject
Recently, I was working on some code (of course) and had a need to convert some XML to PowerShell PSObjects. I found some snippets out there that sort of did this, but not the way that I needed for this exercise. In this case I’m converting XML meta data from Plex.
https://consciouscipher.wordpress.com/2015/06/05/converting-xml-to-powershell-psobject
Mastering everyday XML tasks in PowerShell
PowerShell has awesome XML support. It is not obvious at first, but with a little help from your friends here at PowerShellMagazine.com, you’ll soon solve every-day XML tasks – even pretty complex ones – in no time.
So let’s check out how you put very simple PowerShell code to work to get the things done that used to be so mind-blowingly complex in the pre-PowerShell era.
http://www.powershellmagazine.com/2013/08/19/mastering-everyday-xml-tasks-in-powershell
For all intents and purposes, if I just take one row from your sample and cast it using the .NET XML namespace...
($MyXmlData = [xml]'<data><bp_year /><bp_make>John Deere</bp_make><bp_model>650</bp_model><bp_price>3000.00</bp_price><bp_txtDayPhone>555-555-5555</bp_txtDayPhone><bp_bestPrice>3000.0000</bp_bestPrice><bp_txtComments>Best price available?</bp_txtComments><bp_url>https://www.example.com</bp_url></data>')
data
----
data
You get results like this...
$MyXmlData.data
bp_year :
bp_make : John Deere
bp_model : 650
bp_price : 3000.00
bp_txtDayPhone : 555-555-5555
bp_bestPrice : 3000.0000
bp_txtComments : Best price available?
bp_url : https://www.example.com
with IntelliSense / autocomplete of the nodes / elements...
$MyXmlData.data.bp_year
Another view...
$MyXmlData.data | Format-Table -AutoSize
bp_year bp_make bp_model bp_price bp_txtDayPhone bp_bestPrice bp_txtComments bp_url
------- ------- -------- -------- -------------- ------------ -------------- ------
John Deere 650 3000.00 555-555-5555 3000.0000 Best price available? https://www.example.com
And from that, just getting the tags / names
$MyXmlData.data.ChildNodes.Name
bp_year
bp_make
bp_model
bp_price
bp_txtDayPhone
bp_bestPrice
bp_txtComments
bp_url
So, armed with the above approaches / notes, it just becomes a matter of looping through your file to get all you are after.
So, just taking your sample and dumping it into a file with no changes, one can do this.
$MyXmlData = (Get-Content -Path 'D:\Scripts\MyXmlData.xml')
$MyXmlData | Format-List -Force
ForEach($DataRow in $MyXmlData)
{
($DataObject = [xml]$DataRow).Data | Format-Table -AutoSize
}
bp_year bp_make bp_model bp_price bp_txtDayPhone bp_bestPrice bp_txtComments bp_url
------- ------- -------- -------- -------------- ------------ -------------- ------
John Deere 650 3000.00 555-555-5555 3000.0000 Best price available? https://www.example.com
receiveOffers link itemName
------------- ---- --------
http://example.com/inventory.htm?id=2217405&used=1 2007 Yamaha RHINO 660
vehicleYear vehicleMake vehicleModel vehicleStyle vehicleInformation vehicleMileage phone timeOfDay message
----------- ----------- ------------ ------------ ------------------ -------------- ----- --------- -------
2008 Buick Enclave CX 555-555-5555 Morning
mo_year mo_make mo_model mo_price mo_txtDayPhone mo_txtOffer mo_txtTrade mo_txtComments
------- ------- -------- -------- -------------- ----------- ----------- --------------
2009 Webasto Air Top 2000 555-555-5555 700
ForEach($DataRow in $MyXmlData)
{
($DataObject = [xml]$DataRow).Data.ChildNodes.Name
}
bp_year
bp_make
bp_model
bp_price
bp_txtDayPhone
bp_bestPrice
bp_txtComments
bp_url
receiveOffers
link
itemName
vehicleYear
vehicleMake
vehicleModel
vehicleStyle
vehicleInformation
vehicleMileage
phone
timeOfDay
message
mo_year
mo_make
mo_model
mo_price
mo_txtDayPhone
mo_txtOffer
mo_txtTrade
mo_txtComments
Yet, note, this is not the only way to do this.
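As one such alternative, a regex-based pass also works when rows are not well-formed XML (for instance, a row containing an unescaped & in a URL). The sketch below is in Python rather than PowerShell and rests on assumptions: the helper name tag_names and the pattern are illustrative, and the pattern is a simplified one that captures the names of opening and self-closing tags only. It collects every match on each line, which is the behaviour the -AllMatches switch of Select-String provides.

```python
import re

# "<" followed by a name that does not start with "/", then anything up to ">"
TAG_NAME = re.compile(r'<([^/<>\s]+)[^<>]*>')

def tag_names(lines):
    """Return distinct tag names in first-seen order across all lines."""
    seen = {}
    for line in lines:
        for name in TAG_NAME.findall(line):  # every match on the line, not just the first
            seen.setdefault(name, None)
    return list(seen)

sample = [
    '<data><bp_year /><bp_make>John Deere</bp_make></data>',
    '<data><receiveOffers /><link>http://example.com/inventory.htm?id=2217405&used=1</link></data>',
]
names = tag_names(sample)
```

Because a dict keeps insertion order, the result is already de-duplicated and can be pasted into a spreadsheet as a single column.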
Looking for help on building a regex that captures a 1-line string after a specific word.
The challenge I'm running into is that the program where I need to build this regex uses a single line format, in other words dot matches new line. So the formula I created isn't working. See more details below. Any advice or tips?
More specific regex task:
I'm trying to grab the line that comes after the word Details from entries like the ones below. The goal is to pull out 100% Silk or 100% Velvet, which is the material of the product and always comes after Details.
Raw data:
<p>Loose fitted blouse green/yellow lily print.
V-neck opening with a closure string.
Small tie string on left side of top.</p>
<h3>Details</h3> <p>100% Silk.</p>
<p>Made in Portugal.</p> <h3>Fit</h3>
<p>Model is 5‰Ûª10,‰Û size 2 wearing size 34.</p> <p>Size 34 measurements</p>
OR
<p>The velvet version of this dress. High waist fit with hook and zipper closure.
Seams run along edges of pants to create a box-like.</p>
<h3>Details</h3> <p>100% Velvet.</p>
<p>Made in the United States.</p>
<h3>Fit</h3> <p>Model is 5‰Ûª10‰Û, size 2 and wearing size M pants.</p> <p>Size M measurements Length: 37.5"åÊ</p>
<p>These pants run small. We recommend sizing up.</p>
Here is the current formula I created that's not working:
Replace (.)(\bDetails\s+(.)) with $3
The output gives the below:
<p>100% Silk.</p>
<p>Made in Portugal.</p>
<h3>Fit</h3>
<p>Model is 5‰Ûª10,‰Û size 2 wearing size 34.</p>
<p>Size 34 measurements</p>
OR
<p>100% Velvet.</p>
<p>Made in the United States.</p>
<h3>Fit</h3> <p>Model is 5‰Ûª10‰Û, size 2 and wearing size M pants.</p> <p>Size M measurements Length: 37.5"åÊ</p>
<p>These pants run small. We recommend sizing up.</p>
How do I capture just the desired string? Let me know if you have any tips! Thank you!
It is difficult to provide a working solution in your situation, as you mention your program has "limited regex features" but don't explain what the limitations are.
Here is a Regex you can try to work with to capture the target string
^(?:<h3>Details<\/h3>)(.*)$
I would personally use BeautifulSoup for something like this, but here are two solutions you could use:
Match the line after "Details", then pull out the data.
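import re
# re.MULTILINE lets $ match at the end of each line, not only at the end of the text
matches = re.findall(r'(?<=Details<).*$', text, re.MULTILINE)
matches = [i.strip('<>') for i in matches]
matches = [i.split('<')[0] for i in [j.split('>')[-1] for j in matches]]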
Replace "Details<...>data" with "Detailsdata", then find the data.
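# Keep the word "Details" and drop only the two tags that follow it (lazy .*? stops at the first ">")
text = re.sub(r'Details<.*?<.*?>', 'Details', text)
matches = re.findall(r'(?<=Details).*?(?=<)', text)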
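A third option, sketched here under the assumption that the material always sits in the first <p> element right after the Details heading, is to capture it directly with a lazy group. Because the question's tool lets the dot match newlines, the sketch turns on re.DOTALL to mimic that behaviour; the lazy (.*?) is what stops the capture at the first closing </p>. The function name material is hypothetical.

```python
import re

# DOTALL mimics a tool where "." also matches newlines
DETAILS = re.compile(r'<h3>\s*Details\s*</h3>\s*<p>(.*?)</p>', re.DOTALL)

def material(html):
    """Return the text of the first paragraph after the Details heading, or None."""
    m = DETAILS.search(html)
    return m.group(1).strip() if m else None

sample = (
    "<p>Small tie string on left side of top.</p>\n"
    "<h3>Details</h3> <p>100% Silk.</p>\n"
    "<p>Made in Portugal.</p> <h3>Fit</h3>"
)
result = material(sample)
```

With a greedy (.*) in place of (.*?), the capture would run on to the last </p> in the text, which matches the oversized output shown in the question.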
2011-12-01T00:43:51.251871+05:18 Dec 01 2011 00:41:32 KOC-TEJ-AMEX-ASA-5510-6 : %ASA-4-106023: Deny icmp src TCS:172.26.40.1 dst AMEX:172.26.40.187 (type 5, code 0) by access-group "TCS_access_in" [0x953d065b, 0x0]
Need to extract 2011-12-01T00:43:51.251871+05:18
My code
create view standardLogTime as
extract regex /(\d{4}\-\d{2}\-\d+\w+\:\d{2}\:\d+\.\d+\+\d+\:\d+)/ on D.text as testValue
from Document D;
-- Extracting standard log generation time.
create view standardLogTime as
extract regex /\d{4}(-\d{2}){2}T(\d{2}:){2}\d{2}\.\d+?\+\d{2}:\d{2}/ on D.text as testValue
from Document D;
output view standardLogTime;
-- Extracting incoming request Date.
create view dateView as
extract regex /(\s+\w+\s\d+\s\d{4})/ on Date.text as testDate from Document Date;
--output view dateView;
-- Extracting incoming request Time.
create view timeView as
extract regex /\s+(\d{1,2}\:\d{1,2}\:\d{1,2})/ on Time.text
as requestTime from Document Time;
--output view timeView;
-- Extracting the firewall device name.
create view deviceName as
extract regex /(\w+\-\w+\-\w+\-\w+\-\d+\-\d+)/ on Device.text
as deviceName from Document Device;
--output view deviceName;
create view combinedView as
extract pattern (<S.testValue>) (<D.testDate>) (<T.requestTime>) (<Div.deviceName>)
return group 0 as logTime and
group 1 as date and
group 2 as time and
group 3 as deviceName
from standardLogTime S,dateView D ,timeView T,deviceName Div;
output view combinedView;
I don't know what language that is, but in Python I would do
date = line.split()[0]
or, if I were forced to use an RE, it'd be
^(\S+)\s
\d{4}(-\d{2}){2}T(\d{2}:){2}\d{2}\.\d+?\+\d{2}:\d{2}
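Both suggestions can be checked against the sample log line with a short Python sketch. The regex is the one above with its groups made non-capturing, so the whole match is returned as a single string; the function name extract_log_time is hypothetical.

```python
import re

# Same pattern as above, with (?:...) so that group(0) is the full timestamp
LOG_TIME = re.compile(r'\d{4}(?:-\d{2}){2}T(?:\d{2}:){2}\d{2}\.\d+?\+\d{2}:\d{2}')

def extract_log_time(line):
    """Return the first ISO-style timestamp in the line, or None."""
    m = LOG_TIME.search(line)
    return m.group(0) if m else None

line = ('2011-12-01T00:43:51.251871+05:18 Dec 01 2011 00:41:32 '
        'KOC-TEJ-AMEX-ASA-5510-6 : %ASA-4-106023: Deny icmp src TCS:172.26.40.1 '
        'dst AMEX:172.26.40.187 (type 5, code 0) by access-group '
        '"TCS_access_in" [0x953d065b, 0x0]')

by_regex = extract_log_time(line)
by_split = line.split()[0]  # the timestamp is the first whitespace-separated field
```

The split approach is simpler and faster when the timestamp is always the first field; the regex is safer when lines may carry a prefix or when a malformed first field should be rejected. Note that the pattern only accepts a + offset, so a negative time zone would need [+-] in place of \+.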