How do you extract text matching a pattern in XPATH?

I have data that looks like this:
<value>v13772 #FBst0451145:w<up>1118</up>; P{GD3649}v13772#
v13773 #FBst0451146:w<up>1118</up>; P{GD3649}v13773#</value>
How can I process this string in XPATH to extract any and all #FBst####### numbers?
I know of the xpath matches() function... but that only returns true or false. No good if I want the matching string. I've searched around but cannot find a satisfactory answer to this problem, which is probably really common.
Thanks!

In addition to the good answer by Michael Kay, if you want to use only the replace() function, then use:
replace(.,'.*?(#FBst\d+).*','$1')
The result is:
#FBst0451145
#FBst0451146
And if you only want the numbers from the above result, use:
replace(replace(.,'.*?(#FBst\d+).*','$1'),
'[^0-9]+', ' ')
This produces:
0451145 0451146
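For comparison outside XPath, here is a minimal Python sketch of the same extraction. It assumes the text content of the value element has already been pulled out as a plain string (the variable name `value` is only illustrative), and it collects the matches directly instead of deleting the text around them.

```python
import re

# Text content of the <value> element from the question (the <up> tags are dropped).
value = (
    "v13772 #FBst0451145:w1118; P{GD3649}v13772#\n"
    "v13773 #FBst0451146:w1118; P{GD3649}v13773#"
)

# Collect every "#FBst" followed by one or more digits.
ids = re.findall(r"#FBst\d+", value)

# Capture only the digits when the prefix is not wanted.
numbers = re.findall(r"#FBst(\d+)", value)

print(ids)
print(numbers)
```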

I assume you can also use XQuery. The get-matches() function from the FunctX module should work for you. Download the file that supports your version of XQuery, then import the module whenever you need its functionality.
import module namespace functx = "http://www.functx.com" at "functx-1.0-doc-2007-01.xq";
functx:get-matches(string-join(//text()),'xyz')

Try
tokenize(value, '[^0-9]+')
which should return the sequence of tokens separated by sequences of non-digits.
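To see what splitting on non-digits yields for the sample data, here is a rough Python sketch using re.split as a stand-in for tokenize (an assumption for illustration, not the XPath function itself). Every run of digits comes back, not only the ones that follow #FBst, so the result would still need filtering.

```python
import re

# Text content of the <value> element from the question (the <up> tags are dropped).
value = (
    "v13772 #FBst0451145:w1118; P{GD3649}v13772#\n"
    "v13773 #FBst0451146:w1118; P{GD3649}v13773#"
)

# Split on runs of non-digits and drop the empty strings produced
# when the input starts or ends with a separator.
tokens = [t for t in re.split(r"[^0-9]+", value) if t]

print(tokens)
```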

With help from Dimitre, a working regex is:
replace(.,'.*?(#FBst\d+).*','$1 ','m')
Although it doesn't work unless a newline separates each target string, it will do for now.
Thanks everyone!

Related

Extract strings between special characters

Please help me extract strings like
id1:value1,id2:value2
id1:value1a,id2:value2a
from
"[{id1:value1,id2:value2},{id1:value1a,id2:value2a}]"

{([^}]+)} will find and capture anything that is inside { }
You should include more detail in your question though and show that you have made an attempt at it yourself.

Working code (R):
unlist(regmatches(text, gregexpr("\\{.*?\\}", text)))
unlist(regmatches(text, gregexpr("\\{([^}]+)\\}", text)))
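The capture-group pattern can also be sketched in Python. This is only an illustration of the same regex, assuming the input is the literal string from the question; the capture group returns the contents without the surrounding braces.

```python
import re

text = "[{id1:value1,id2:value2},{id1:value1a,id2:value2a}]"

# \{ and \} match the literal braces; ([^}]+) captures everything between them.
parts = re.findall(r"\{([^}]+)\}", text)

for part in parts:
    print(part)
```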

XSLT - Replace a number of chars in a string

I have the following string,
';#6;#'
The above string could be anything, E.g.:
';#1;#' or ';#2;#' , or ';#3;#' ...
I need to be able to replace the contents between the ' and '
Is this possible using something like translate in XSLT 1.0?

This kind of thing is quite difficult in XSLT 1.0. Take a look at the library of string-handling functions available at www.exslt.org - some of them come with XSLT implementations that you can copy into your stylesheet and call (typically as xsl:call-template).

Use substring and concat functions.
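As a rough illustration of the substring-and-concat idea, here is a Python sketch rather than XSLT. It assumes the part to replace sits between a two-character ";#" prefix and a two-character ";#" suffix, which is a guess at what the question intends; the function name `replace_between` is hypothetical.

```python
def replace_between(s, new, prefix_len=2, suffix_len=2):
    """Keep the leading and trailing delimiters and swap out what lies between."""
    # Same idea as concat(substring(...), new, substring(...)) in XSLT 1.0.
    return s[:prefix_len] + new + s[len(s) - suffix_len:]


print(replace_between(";#6;#", "42"))
```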

How do I substitute a parameter of an XML tag... using Regex?

Help: how do I use a regex to replace the value of param below?
<?xml version="1.0" encoding="UTF-8" ?>
<games>
<game id="1001" path="C:\Program Files\Warcraft III\war3.exe" param="" display="1" priority="0"/>
</games>
The value of param is empty, and I want to add something to it using a regex,
or replace the whole param="" with param="something".
It has to be the first param after id="1001".
I'm also using AutoHotkey, so if you can provide code to edit XML with AutoHotkey, that would work too, but a regex would do for this.
Somebody provided me with this code:
RegExReplace(xml,"s)id=""1001"".*?param=""\K[^""]+","HELLO WORLD!")
It works if param has a value, but not if it is empty.
How do I make it work?

You could use something like this, but you should consider using a proper XML parser instead, since this regex will easily fail in many cases:
s/(id="1001" [^>]*param=").*?"/$1something"/

You might be better off looking at an XML/HTML parsing engine here, assuming you are talking about XML/HTML params. Such engines are made for parsing and modifying this kind of content - regexes are not at all ideal for such work.
But it would help to know more about what you are dealing with, too; what's the environment? Is this HTML/XML data? Where are you modifying it? (client? server?) etc.
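For reference, the parser-based approach might look like this in Python with the standard library's xml.etree.ElementTree. This is a sketch only: the asker is using AutoHotkey, the XML declaration is left out for brevity, and "something" is the placeholder value from the question.

```python
import xml.etree.ElementTree as ET

# Sample document from the question (raw string so the backslashes stay literal).
xml_text = r'''<games>
<game id="1001" path="C:\Program Files\Warcraft III\war3.exe" param="" display="1" priority="0"/>
</games>'''

root = ET.fromstring(xml_text)

# Select the <game> element by its id attribute, then set param directly.
game = root.find("game[@id='1001']")
if game is not None:
    game.set("param", "something")

updated = ET.tostring(root, encoding="unicode")
print(updated)
```

Because the attribute is addressed by name, it does not matter whether param was empty or where it appears within the tag.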

If the command you posted works as you said, then all you need to do is change the + to a *, like so:
RegExReplace(xml,"s)id=""1001"".*?param=""\K[^""]*","HELLO WORLD!")
+ means "one or more"; * means "zero or more".
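The difference between the two quantifiers can be checked with a small Python sketch. Python's re module has no \K, so a capture group stands in for it here; that substitution is an assumption of this example and not part of the AutoHotkey code.

```python
import re

xml = r'<game id="1001" path="C:\Program Files\Warcraft III\war3.exe" param="" display="1" priority="0"/>'

# Group 1 keeps everything up to and including param=" so only the value is replaced.
plus_pattern = r'(id="1001".*?param=")[^"]+'   # needs at least one character in the value
star_pattern = r'(id="1001".*?param=")[^"]*'   # also matches an empty value

with_plus = re.sub(plus_pattern, r"\g<1>HELLO WORLD!", xml, count=1, flags=re.S)
with_star = re.sub(star_pattern, r"\g<1>HELLO WORLD!", xml, count=1, flags=re.S)

print(with_plus == xml)   # the + version finds nothing to replace
print(with_star)
```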

Regular Expression DateTime Javascript

I need an expression that matches a date in DD/MM/YYYY format. I've already found one.
However, it only works for dates like (1/6/2009) or (1/5/2010); it doesn't support (01/06/2009) or (01/05/2010).
How can I check whether a string is a date in JavaScript?

You can check out this nifty library: Date.js

Try this (which I found here):
(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d
In order to make this match dates without leading zeros on the month and the day you will need to change it up a bit:
(0?[1-9]|1[012])[- /.](0?[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d
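Note that in both patterns above the first group is the month (01 to 12) and the second is the day, whereas the question asks for DD/MM/YYYY. A day-first variant, sketched in Python rather than JavaScript, might look like the following. It assumes "/" is the only separator, and the function name `is_valid_date` is hypothetical. The regex only checks the shape, so a calendar check is added for dates such as 31/02/2009.

```python
import re
from datetime import datetime

# Day first, then month, each with an optional leading zero, then a 19xx/20xx year.
DATE_RE = re.compile(r"(0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[012])/(19|20)\d\d")


def is_valid_date(text):
    """Return True if text looks like DD/MM/YYYY and is a real calendar date."""
    if DATE_RE.fullmatch(text) is None:
        return False
    try:
        datetime.strptime(text, "%d/%m/%Y")
    except ValueError:
        return False
    return True


for sample in ["1/6/2009", "01/06/2009", "31/02/2009", "32/01/2009"]:
    print(sample, is_valid_date(sample))
```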

How about using a datejs library? It has no problem with those patterns.

How to use the matched text in the replacement text in Vim

I have a block of code with a timestamp in front of each line, like this:
12/02/2010 12:20:12 function myFun()
12/02/2010 12:20:13 {....
The first column is a date time value. I would like to comment them out by using Vim, thus:
/*12/02/2010 12:20:12*/ function myFun()
/*12/02/2010 12:20:13*/ {....
I tried to search for date first:
/\d\d\/\d\d\/\d\d\d\d \d\d:\d\d:\d\d
I got all the timestamps marked correctly. However, when I tried to replace them with the command:
%s/\d\d\/\d\d\/\d\d\d\d \d\d:\d\d:\d\d/\/*\d\d\/\d\d\/\d\d\d\d \d\d:\d\d:\d\d*\//
I got the following result:
/*dd/dd/dddd dd:dd:dd*/ function myFun()
/*dd/dd/dddd dd:dd:dd*/ {....
I think I need to name the search part and put it back in the replacement part. How can I do it?

I suppose I would just do something like:
:%s-^../../.... ..:..:..-/* & */-

I would actually not use a regex to do this. It takes too long to enter the correct formatting. I would instead use a Visual Block. The sequence works out to be something like this:
<C-V>}I/* <ESC>
3f\s
<C-V>I */
I love regex and don't want to knock the regex solutions, but with pre-formatted blocks I find this easier. It also requires less of a diversion from the real task, which isn't figuring out how to write a regex.

%s/\d\d\/\d\d\/\d\d\d\d \d\d:\d\d:\d\d/\/*&*\//

:%s/^\([0-9/]* [0-9:]* \)\(.*\)/\/*\1*\/ \2/
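The idea behind & in the Vim answers (reuse the whole matched text in the replacement) carries over to other regex engines. As a sketch in Python, where \g<0> plays the role that & plays in Vim, and with the sample lines copied from the question:

```python
import re

log = (
    "12/02/2010 12:20:12 function myFun()\n"
    "12/02/2010 12:20:13 {...."
)

# Timestamp at the start of each line; re.MULTILINE makes ^ match after every newline.
stamp = r"^\d\d/\d\d/\d{4} \d\d:\d\d:\d\d"

# \g<0> inserts the entire match, so the timestamp is wrapped instead of retyped.
commented = re.sub(stamp, r"/*\g<0>*/", log, flags=re.MULTILINE)

print(commented)
```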

We Keep Coding
