Regex without brackets - regex

I have the following tag from an XML file:
<msg><![CDATA[Method=GET URL=http://test.de:80/cn?OP=gtm&Reset=1(Clat=[400441379], Clon=[-1335259914], Decoding_Feat=[], Dlat=[0], Dlon=[0], Accept-Encoding=gzip, Accept=*/*) Result(Content-Encoding=[gzip], Content-Length=[7363], ntCoent-Length=[15783], Content-Type=[text/xml; charset=utf-8]) Status=200 Times=TISP:270/CSI:-/Me:1/Total:271]]>
Now I am trying to extract Clon, Dlat, Dlon and Clat from this message.
I have already created the following regex:
(?<=Clat=)[\[\(\d+\)\n\n][^)n]+]
The problem is that I would like to get only the numbers, without the brackets. I tried some other expressions as well.
Do you know how I can extend this expression so that it returns only the values without the brackets?
Thank you very much in advance.
Best regards

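The regex
(clon|dlat|dlon|clat)=\[(-?\d+)\]
captures the field name in group 1 and the number, without the square brackets, in group 2. It has to be applied case-insensitively, because the names in the message are capitalized (Clat, Clon, ...).
As I stated before, using this regex to extract the information from the content of this CDATA element is fine, but you really should get to the content of that element with an XML parser.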
Example usage
// Requires: using System.Text.RegularExpressions;
Regex r = new Regex(@"(clon|dlat|dlon|clat)=\[(-?\d+)\]", RegexOptions.IgnoreCase);
string s = ".. here's your cdata content .. ";
foreach (Match match in r.Matches(s))
{
    var name = match.Groups[1].Value; // will contain the matched name, e.g. "Clon", "Dlat", "Dlon" or "Clat"
    var inner_value = match.Groups[2].Value; // will contain the value inside the square brackets, e.g. "400441379"
    // Do something with the matches
}
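The advice above (read the CDATA content with an XML parser, then apply the regex to that text) can be sketched as follows. This is a minimal Python illustration rather than the C# from the answer; the XML string is a shortened, hypothetical version of the message with a closing </msg> tag added, and the helper name extract_fields is made up for the example.

```python
import re
import xml.etree.ElementTree as ET

# Shortened, hypothetical version of the message from the question,
# with the closing </msg> tag added so that it is well-formed XML.
XML = (
    "<msg><![CDATA[Method=GET URL=http://test.de:80/cn?OP=gtm&Reset=1"
    "(Clat=[400441379], Clon=[-1335259914], Decoding_Feat=[], "
    "Dlat=[0], Dlon=[0]) Status=200]]></msg>"
)

# Group 1 is the field name, group 2 the number without the brackets.
FIELD = re.compile(r"(clon|dlat|dlon|clat)=\[(-?\d+)\]", re.IGNORECASE)


def extract_fields(xml_text):
    """Return {name: value} for the fields found in the element's text."""
    # The parser hands back the CDATA content as ordinary element text.
    cdata = ET.fromstring(xml_text).text or ""
    return {name: value for name, value in FIELD.findall(cdata)}


fields = extract_fields(XML)
print(fields)
```

The values stay strings here; converting them with int() is left to the caller, since the question does not say what the numbers are used for.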

Related

How to use Regex expression to modify my variable value?

I have a variable rawvalue:
let rawvalue = {abc-def-qwe}
I want to use regex to remove the { and }; I can simply do this by truncating the first and last characters. I built the regex:
^.(.*.).$
How do I apply this regex to my variable to get the desired output?
The syntax you're looking for is like this:
let input = "{abc-def-qwe}";
let re = /^.(.*.).$/;
let fixed = re.exec(input)[1]; // Get the first match group "abc-def-qwe"
This regex might be a better choice: it matches the braces explicitly and creates a single capturing group, which you can refer to as $1 when replacing your string:
^\{(.+)\}$
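As a rough illustration of the brace-matching pattern, here is a minimal Python sketch (not the JavaScript from the thread). The function name strip_braces is hypothetical, and re.fullmatch is used in place of the ^ and $ anchors.

```python
import re


def strip_braces(raw):
    """Return the text between a leading '{' and a trailing '}'.

    If the input is not wrapped in braces, it is returned unchanged.
    """
    match = re.fullmatch(r"\{(.+)\}", raw)
    return match.group(1) if match else raw


print(strip_braces("{abc-def-qwe}"))
```

Unlike the ^.(.*.).$ version, this only strips the first and last characters when they really are braces.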

Match return substring between two substrings using regexp

I have a list of records that are character vectors. Here's an example:
'1mil_0,1_1_1_lb200_ks_drivers_sorted.csv'
'1mil_0_1_lb100_ks_drivers_sorted.csv'
'1mil_1_1_lb2_100_100_ks_drivers_sorted.csv'
'1mil_1_1_lb100_ks_drivers_sorted.csv'
From these names I would like to extract whatever's between the two substrings 1mil_ and _ks_drivers_sorted.csv.
So in this case the output would be:
0,1_1_1_lb200
0_1_lb100
1_1_lb2_100_100
1_1_lb100
I'm using MATLAB so I thought to use regexp to do this, but I can't understand what kind of regular expression would be correct.
Or are there some other ways to do this without using regexp?
Let the data be:
x = {'1mil_0,1_1_1_lb200_ks_drivers_sorted.csv'
'1mil_0_1_lb100_ks_drivers_sorted.csv'
'1mil_1_1_lb2_100_100_ks_drivers_sorted.csv'
'1mil_1_1_lb100_ks_drivers_sorted.csv'};
You can use lookbehind and lookahead to find the two limiting substrings, and match everything in between:
result = cellfun(@(c) regexp(c, '(?<=1mil_).*(?=_ks_drivers_sorted\.csv)', 'match'), x);
Or, since the regular expression only produces one match, the following simpler alternative can be used (thanks @excaza for noticing):
result = regexp(x, '(?<=1mil_).*(?=_ks_drivers_sorted\.csv)', 'match', 'once');
In your example, either of the above gives
result =
4×1 cell array
'0,1_1_1_lb200'
'0_1_lb100'
'1_1_lb2_100_100'
'1_1_lb100'
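For readers who want to try the same lookbehind/lookahead pattern outside MATLAB, a minimal Python sketch follows. The names NAMES, BETWEEN and middle_part are made up for the example; the pattern itself is the one from the answer.

```python
import re

# The file names from the question.
NAMES = [
    "1mil_0,1_1_1_lb200_ks_drivers_sorted.csv",
    "1mil_0_1_lb100_ks_drivers_sorted.csv",
    "1mil_1_1_lb2_100_100_ks_drivers_sorted.csv",
    "1mil_1_1_lb100_ks_drivers_sorted.csv",
]

# Lookbehind for the prefix, lookahead for the suffix; only the text in
# between becomes part of the match.
BETWEEN = re.compile(r"(?<=1mil_).*(?=_ks_drivers_sorted\.csv)")


def middle_part(name):
    """Return the text between the two fixed substrings, or None."""
    match = BETWEEN.search(name)
    return match.group(0) if match else None


result = [middle_part(name) for name in NAMES]
print(result)
```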
For me, the easy way to do this is to replace the parts of the string you don't need with an empty string; what remains is what you need.
If you have a list, you can use a loop to do this.
Example that replaces "1mil_" with "" and "_ks_drivers_sorted.csv" with "":
newChr = strrep(chr,'1mil_','')
newChr = strrep(newChr,'_ks_drivers_sorted.csv','')

Split line at commas, only if commas not contained between quotes

Is there any way to use the split function in scala so that it splits a line at commas but doesn't at commas contained within 2 double quotes?
For example, I have the following:
x: String = """"??", "hamburger", "ketchup, mayo, mustard", "pizza""""
and I tried this:
x.split(',') but it didn't work. I then thought about removing all double quotes but that still doesn't solve my problem.
Any help would be greatly appreciated!
EDIT:
Here's a snippet of my code to see how I can incorporate this:
val data1 = noheader1.map { line =>
val values = line._1.split(',') //This is what I am trying to change
val name = values(2).replaceAll("\"", "")
I am a bit new to scala and even more so to regex, so could someone clarify how to write that weird regex expression in my code so that I can obtain an ARRAY of the comma separated words of the line?
Try this!
(?>"(?>\\.|[^"])*?"|(,))
Instead of split() you can use a regular expression and findAllIn(), like such:
val x = """"??", "hamburger", "ketchup, mayo, mustard", "pizza""""
""""[^"]+"""".r.findAllIn(x).toList
This results in List("??", "hamburger", "ketchup, mayo, mustard", "pizza"). Each element still includes its surrounding double quotes, because the pattern matches them.
Note: I am using triple-quotes (""") in the example.
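Perhaps not as elegant as the regexes already suggested: treat the separator between items as the closing quote, the comma, the whitespace and the next opening quote, and split on that:
x.split("\",\\s+\"")
The result prints as
Array("??, hamburger, ketchup, mayo, mustard, pizza")
Read this as four elements, not as one string: the first element, "??, still has its leading double quote and the last element, pizza", still has its trailing one.
So in the resulting array, apply stripPrefix("\"") to the first element and stripSuffix("\"") to the last.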
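A different technique from the answers above, shown only as a sketch: split on a comma only when it is followed by an even number of double quotes, which means the comma is outside any quoted field. The example is in Python rather than Scala, the names SEPARATOR and split_outside_quotes are hypothetical, and it assumes the fields contain no escaped quotes.

```python
import re

# A comma (plus any following whitespace) counts as a separator only if the
# rest of the line contains an even number of double quotes, i.e. the comma
# is not inside a quoted field. Assumes there are no escaped quotes.
SEPARATOR = re.compile(r',\s*(?=(?:[^"]*"[^"]*")*[^"]*$)')


def split_outside_quotes(line):
    """Split at commas outside quotes and strip the surrounding quotes."""
    return [field.strip('"') for field in SEPARATOR.split(line)]


x = '"??", "hamburger", "ketchup, mayo, mustard", "pizza"'
print(split_outside_quotes(x))
```

The lookahead rescans the rest of the line for every comma, so for very long lines a real CSV parser is likely the better tool.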

regular expressions and vba

Does anyone know how to extract matches as strings from a RegExp.Execute() function?
Let me show you what I've gotten to so far:
Regex.Pattern = "^[^*]*[*]+"
Set myMatches = Regex.Execute(temp)
I want the object "myMatches" which is holding the matches, to be converted to a string. I know that there is only going to be one match per execution.
Does anyone know how to extract the matches from the object as strings, so that they can be displayed, let's say, via a MsgBox?
Try this:
Dim sResult As String
'// Your expression code here...
sResult = myMatches.Item(0)
'// or
sResult = myMatches(0)
Msgbox("The matching text was: " & sResult)
The Execute method returns a match collection, and you can use the Item property to retrieve a match by its index.
Since you only ever have one match, the index is zero. If you have more than one match, you can pass the index of the match you need or loop over the entire collection.
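The same idea (take the single match by index, or loop over all matches) looks roughly like this in Python. This is only an analogy to the VBA match collection, not VBA code; the sample string and the helper names first_match and all_matches are invented for the illustration.

```python
import re

# Same pattern as in the question: everything up to and including the
# first run of asterisks.
PATTERN = re.compile(r"^[^*]*[*]+")


def first_match(text):
    """Return the matched text as a string, or None if nothing matches."""
    match = PATTERN.search(text)
    return match.group(0) if match else None


def all_matches(text):
    """Return every match as a string by looping over the match objects."""
    return [match.group(0) for match in PATTERN.finditer(text)]


temp = "abc***def"  # hypothetical input
print("The matching text was: " + str(first_match(temp)))
```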
This page has a lot of information on regex and seems to have what you want.
http://www.regular-expressions.info/vbscript.html

looking for a regular expression to extract all text outputs to user from js file

I have some huge JS files containing texts, messages and so on that are output for a human being. The problem is that they don't all go through the same method,
but I want to find them all in order to refactor the code.
So I am looking for a regular expression that finds those messages.
...his.submit_register = function(){
if(!this.agb_accept.checked) {
out_message("This is a Messge tot the User in English." , "And the Title of the Box. In English as well");
return false;
}
this.valida...
What I want to find is all the strings that are not source code.
In this case I want to get back:
This is a Messge tot the User in English.
And the Title of the Box. In English as well
I tried something like /\"(\S+\s{1})+\S\"/, but this won't work.
Thanks for your help.
It's not possible to parse Javascript source code using regular expressions because Javascript is not a regular language. You can write a regular expression that works most of the time:
/"(.*?)"/
The ? means that the match is not greedy.
Note: this will not correctly handle strings that contain escaped quotes.
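One common way to get past the escaped-quote limitation is to let the pattern consume a backslash together with the character that follows it. The sketch below is in Python rather than JavaScript, and the names STRING_LITERAL and double_quoted_strings are hypothetical. It is still only a heuristic: it ignores single-quoted strings, comments and regex literals.

```python
import re

# Inside the quotes, accept either a backslash followed by any character
# (an escape such as \") or any character that is neither a quote nor a
# backslash.
STRING_LITERAL = re.compile(r'"((?:\\.|[^"\\])*)"')


def double_quoted_strings(source):
    """Return the contents of all double-quoted strings in the source text."""
    return STRING_LITERAL.findall(source)


line = (
    'out_message("This is a Messge tot the User in English." , '
    '"And the Title of the Box. In English as well");'
)
for text in double_quoted_strings(line):
    print(text)
```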
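A simple Java regex that solves your problem (assuming that the message doesn't contain a " character):
Pattern p = Pattern.compile("\"(.+?)\"");
The extraction code (it needs java.util.regex.Pattern and java.util.regex.Matcher; lines is assumed to hold the lines of the JS file):
Matcher m;
for (String line : lines) {
    m = p.matcher(line);
    while (m.find()) {
        System.out.println(m.group(1)); // text between the quotes
    }
}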