Regex: Obtain ID(s) from URL

Getting my feet wet in Regular Expressions, and I'm having a difficult time getting this one to work.
I have a url as such:
/800-Flowers-inc-4124/18-roses-3123
Where 4124 is the business ID, and 3123 is the product ID.
The hard part for me is creating the capturing groups. Currently, my regex is as follows:
/(\d+)(?=/|$)/g
Unfortunately, that only selects the business ID, and doesn't return the product ID.
Any help is greatly appreciated, and if you provide a regex, I would love it if you could include a little explanation.
Thanks!

Your regex is fine, except since you've used the / as the regex delimiter you need to escape it in the expression:
/(\d+)(?=\/|$)/g
Or, you can just use a different delimiter (e.g. #):
#(\d+)(?=/|$)#g
Depending on the language you're using it'll probably return the results in some sort of array, or there could be a 'findAll'-type method instead of just 'find'.

mathematical.coffee is correct:
var data = '/800-Flowers-inc-4124/18-roses-3123';
var myregexp = /(\d+)(?=\/|$)/g;
var match = myregexp.exec(data);
var result = "Matches:\n";
while (match != null) {
result += "match:" + match[0] + ',\n';
match = myregexp.exec(data);
}
alert(result);
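As a more compact alternative to the exec loop, here is a minimal sketch that uses String.prototype.match with the same pattern and the g flag. The helper name extractIds is made up for illustration, and the sketch assumes the IDs are always the trailing digits of a path segment, as in the question.

```javascript
// Hypothetical helper (not from the thread): returns the trailing numeric ID
// of every path segment, in order of appearance.
function extractIds(url) {
  // \d+ must be followed by "/" or the end of the string, so digits in the
  // middle of a segment ("800-Flowers", "18-roses") are skipped.
  const matches = url.match(/\d+(?=\/|$)/g) || [];
  return matches.map(Number);
}

const [businessId, productId] = extractIds('/800-Flowers-inc-4124/18-roses-3123');
console.log(businessId, productId); // 4124 3123
```

With the g flag, match returns all matches as an array (or null when there are none), which is why the fallback to an empty array is there.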

Related

Regex match everything not between a pair of characters

Suppose a string (representing elapsed time in the format HH:MM:ss) like this:
"123:59:00"
I want to match everything except the numbers for the minutes, i.e. the regex should match the "123:" and ":00" parts and not the number between the colons.
In the example, the 59 should be the only part unmatched.
Is there any way to accomplish this with a js regex?
EDIT: I'm asking explicitly for a regex, because I'm using the Notion Formula API and can only use JS regex here.
You don't necessarily need to use RegEx for this. Use split() instead.
const timeString = "12:59:00";
const [hours, _, seconds] = timeString.split(":");
console.log(hours, seconds);
If you want to use Regex you can use the following:
const timeString = "12:59:00";
const matches = timeString.match(/(?<hours>^\d{2}(?=:\d{2}:))|(?<seconds>(?<=:\d{2}:)\d{2}$)/g);
console.log(matches);
// if you want to include the colons use this
const matchesWithColons = timeString.match(/(?<hours>^\d{2}:(?=\d{2}:))|(?<seconds>(?<=:\d{2}):\d{2}$)/g);
console.log(matchesWithColons);
You can drop the named groups ?<hours> and ?<seconds>.
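The patterns above assume two-digit hours, while the question's example has three. A minimal sketch of the same lookaround idea with \d+ for the hours might look like this; the function name hoursAndSeconds is hypothetical, and it assumes a runtime that supports lookbehind assertions.

```javascript
// Hypothetical helper: returns the matched hour and second parts as strings
// (an empty array when nothing matches).
function hoursAndSeconds(elapsed) {
  // First alternative: leading digits followed by ":MM:".
  // Second alternative: two trailing digits preceded by ":MM:".
  return elapsed.match(/^\d+(?=:\d{2}:)|(?<=:\d{2}:)\d{2}$/g) || [];
}

console.log(hoursAndSeconds("123:59:00")); // [ '123', '00' ]
```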
Using split() might be the most canonical way to go, but here is a regex approach using match():
var input = "123:59:00";
var parts = input.match(/^[^:]+|[^:]+$/g);
console.log(parts);
If you want to also capture the trailing/leading colons, then use this version:
var input = "123:59:00";
var parts = input.match(/^[^:]+:|:[^:]+$/g);
console.log(parts);
This could also work for two-digit hours (capture groups 1 and 2 hold the hours and the seconds):
/^([0-9]{2})\:[0-9]{2}\:([0-9]{2})$/mg

Find and replace between second and third slash

I have urls with following formats ...
/category1/1rwr23/item
/category2/3werwe4/item
/category3/123wewe23/item
/category4/132werw3/item
/category5/12werw33/item
I would like to replace the second path segment (the part between the second and third slash) with {id} for further processing.
/category1/{id}/item
How do I replace that segment with {id}? I have spent the last 4 hours on this without reaching a proper solution.
Assuming you'll be running regex in JavaScript, your regex will be.
/^(\/.*?\/)([^/]+)/gm
and replacement string should look like $1whatever
var str = "your url strings ..."
var replStr = 'replacement';
var re = /^(\/.*?\/)([^/]+)/gm;
var result = str.replace(re, '$1'+replStr);
console.log(result);
based on your input, it should print.
/category1/replacement/item
/category2/replacement/item
/category3/replacement/item
/category4/replacement/item
/category5/replacement/item
We divide it into 3 groups:
1. the part before the replacement
2. the part to be replaced
3. the part after the replacement
yourString.replace(/(\/[^/]+\/)([^/]+)(\/[^/]+)/g, '$1' + replacement + '$3');
Here is the demo: https://jsfiddle.net/9sL1qj87/
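For a single path, the same idea can be sketched with a replacer function, which avoids any surprises if the placeholder itself contains a "$". The function name maskSecondSegment and the default placeholder are assumptions for illustration.

```javascript
// Hypothetical helper: replaces the second path segment with a placeholder.
function maskSecondSegment(path, placeholder = '{id}') {
  // Group 1 is the first segment including both slashes; the rest of the
  // pattern consumes the second segment, which is dropped.
  return path.replace(/^(\/[^/]+\/)[^/]+/, (_match, prefix) => prefix + placeholder);
}

console.log(maskSecondSegment('/category1/1rwr23/item')); // /category1/{id}/item
```

A path with fewer than two segments does not match the pattern and is returned unchanged.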

Regex without brackets

I have the following tag from an XML file:
<msg><![CDATA[Method=GET URL=http://test.de:80/cn?OP=gtm&Reset=1(Clat=[400441379], Clon=[-1335259914], Decoding_Feat=[], Dlat=[0], Dlon=[0], Accept-Encoding=gzip, Accept=*/*) Result(Content-Encoding=[gzip], Content-Length=[7363], ntCoent-Length=[15783], Content-Type=[text/xml; charset=utf-8]) Status=200 Times=TISP:270/CSI:-/Me:1/Total:271]]>
I am trying to extract the values of Clon, Dlat, Dlon and Clat from this message.
So far I have created the following regex:
(?<=Clat=)[\[\(\d+\)\n\n][^)n]+]
The problem is that I would like to get only the numbers, without the brackets. I have tried some other expressions as well.
Do you know how I can extend this expression so that it returns only the values without the brackets?
Thank you very much in advance.
Best regards
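The regex
(clon|dlat|dlon|clat)=\[(-?\d+)\]
captures the name in group 1 and the number, without the brackets, in group 2 (when matched case-insensitively).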
As I stated before, if you use this regex to extract the information out of this CDATA element, that's okay. But you really want to get to the contents of that element using an XML parser.
Example usage
string pattern = @"(clon|dlat|dlon|clat)=\[(-?\d+)\]";
string input = ".. here's your cdata content .. ";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
{
var name = match.Groups[1].Value; //will contain the matched name, e.g. "Clon", "Dlat", "Dlon" or "Clat"
var inner_value = match.Groups[2].Value; //will contain the value inside the square brackets, e.g. "400441379"
//Do something with the matches
}
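For comparison, the same pattern can be sketched in JavaScript, collecting the values into an object keyed by name. The function name extractCoordinates and the shortened sample string are assumptions for illustration; String.prototype.matchAll needs a reasonably recent runtime.

```javascript
// Hypothetical helper: maps each matched name (lower-cased) to its numeric value.
function extractCoordinates(text) {
  const pattern = /(clon|dlat|dlon|clat)=\[(-?\d+)\]/gi;
  const values = {};
  for (const match of text.matchAll(pattern)) {
    // match[1] is the name as written in the input, match[2] the digits
    // (with an optional minus sign) without the square brackets.
    values[match[1].toLowerCase()] = Number(match[2]);
  }
  return values;
}

// Shortened excerpt of the message from the question.
const sample = 'Reset=1(Clat=[400441379], Clon=[-1335259914], Decoding_Feat=[], Dlat=[0], Dlon=[0])';
console.log(extractCoordinates(sample));
```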

Using a Variable in an AS3, Regexp

Using Actionscript 3.0 (Within Flash CS5)
A standard regex to match any digit is:
var myRegexPattern:RegExp = /\d/g;
What would the regex look like to incorporate a string variable to match?
(this example is an 'IDEAL' not a 'WORKING' snippet) ie:
var myString:String = "MatchThisText";
var myRegexPattern_WithString:RegExp = /\d[myString]/g;
I've seen some workarounds which involve creating multiple regex instances, then combine them by source, with the variable in question, which seems wrong. OR using the flash string to regex creator, but it's just plain sloppy with all the double and triple escape sequences required.
There must be some pain free way that I can't find in the live docs or on google. Does AS3 hold this functionality even? If not, it really should.
Or am I missing a much easier way of avoiding this task altogether, one that I'm simply naive to because I'm new to regex?
I've actually blogged about this, so I'll just point you there: http://tyleregeto.com/using-vars-in-regular-expressions-as3 It talks about the possible solutions, but there is no ideal one like you mention.
EDIT
Here is a copy of the important parts of that blog entry:
Here is a regex to strip the tags from a block of text.
/<("[^"]*"|'[^']*'|[^'">])*>/ig
This nifty expression works like a charm. But I wanted to update it so the developer could limit which tags it stripped to those specified in an array. Pretty straightforward stuff: to use a variable value in a regex, you first need to build it as a string and then convert it. Something like the following:
var exp:String = 'start-exp' + someVar + 'more-exp';
var regex:RegExp = new RegExp(exp);
Pretty straight forward. So when approaching this small upgrade, that's what I did. Of course one big problem was pretty clear.
var exp:String = '/<' + tag + '("[^"]*"|'[^']*'|[^'">])*>/';
Guess what, invalid string! Better escape those quotes in the string. Whoops, that will break the regex! I was stumped. So I opened up the language reference to see what I could find. The "source" parameter, (which I've never used before,) caught my eye. It returns a String described as "the pattern portion of the regular expression." It did the trick perfectly. Here is the solution:
var start:RegExp = /</;
var end:RegExp = /("[^"]*"|'[^']*'|[^'">])*>/ig;
var complete:RegExp = new RegExp(start.source + tag + end.source);
You can reduce it down to this for convenience:
var complete:RegExp = new RegExp(/</.source + tag + /("[^"]*"|'[^']*'|[^'">])*>/.source, 'ig');
As Tyler correctly points out (and his answer works just fine), you can assemble your regex as a string and then pass this string to the RegExp constructor with the new RegExp("pattern", "flags") syntax.
function assembleRegex(myString) {
var re = new RegExp('\\d' + myString, "i");
return re;
}
Note that when using a string to store a regex pattern, you do need to add some extra backslashes to get it to work right (e.g. to get a \d in the regex, you need to specify \\d in the string). Note also that the string pattern does not use the forward slash delimiters. In other words, the following two statements are equivalent:
var re1 = /\d/ig;
var re2 = new RegExp("\\d", "ig");
Additional note: You may need to process the myString variable to escape any backslashes it might contain (if they are to be interpreted as literal). If this is the case the function becomes:
function assembleRegex(myString) {
myString = myString.replace(/\\/g, '\\\\'); // g flag: escape every backslash, not just the first
var re = new RegExp('\\d' + myString);
return re;
}
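If the variable should be matched as literal text, backslashes are not the only characters that need escaping: ".", "*", "(" and friends are special too. A minimal sketch of a more general version, written in JavaScript (whose RegExp constructor takes the same pattern and flags arguments), could look like this. The names escapeRegExp and digitFollowedBy are made up for illustration.

```javascript
// Hypothetical helper: escapes every regex metacharacter so the text is
// matched literally when embedded in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds a regex for "a digit followed by the literal text".
function digitFollowedBy(literal, flags = '') {
  // '\\d' in a string literal becomes \d in the pattern.
  return new RegExp('\\d' + escapeRegExp(literal), flags);
}

console.log(digitFollowedBy('a.b').test('1a.b')); // true
console.log(digitFollowedBy('a.b').test('1axb')); // false, the dot is literal
```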

GWT - 2.1 RegEx class to parse freetext

I'm struggling with the com.google.gwt.regexp.shared.RegExp class. I simply want to parse the phone numbers from a string and get ALL occurrences of a number, but I only seem to be able to get the first occurrence. I know there are subtle differences in regex handling between Java (where it works) and GWT.
String freeText = "Theo Powell<5643321309>, Robert Roberts<9653768972>, Betty Wilson<6268281885>, Brandon Anderson<703203115>";
MatchResult matchResult = RegExp.compile("[\+]?[0-9." "-]{8,}").exec(freeText);
int groupCount = matchResult.getGroupCount(); // result = 1
String s = matchResult.getGroup(0); //result = 5643321309
Thanks in advance.
Ian..
You'll have to loop, applying the pattern again until it returns nothing. For that, you first have to use the "global" flag:
ArrayList<String> matches = new ArrayList<String>();
RegExp pattern = RegExp.compile("[\\+]?[0-9. -]{8,}", "g");
for (MatchResult result = pattern.exec(freeText); result != null; result = pattern.exec(freeText)) {
matches.add(result.getGroup(0));
}
If you think it's a bit "magic" or "kludgy" (which it kind of is), I'd suggest reading docs about the JavaScript RegExp object, as the RegExp class in GWT is a direct mapping of this: https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/RegExp/exec (with sample code in JS very similar to the one above).
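Since the GWT class mirrors the JavaScript RegExp object, the loop above can be sketched in plain JavaScript as follows. The helper name findAll is hypothetical; the guard on the global flag is there because without "g" exec keeps returning the first match and the loop would never end.

```javascript
// Hypothetical helper: collects every match of a global regex by calling
// exec repeatedly until it returns null.
function findAll(pattern, text) {
  if (!pattern.global) {
    throw new Error('findAll needs a regex with the "g" flag');
  }
  const results = [];
  let match;
  // With the "g" flag, each exec call resumes at pattern.lastIndex.
  while ((match = pattern.exec(text)) !== null) {
    results.push(match[0]);
  }
  return results;
}

const freeText = 'Theo Powell<5643321309>, Robert Roberts<9653768972>, Betty Wilson<6268281885>, Brandon Anderson<703203115>';
const numbers = findAll(/[+]?[0-9. -]{8,}/g, freeText);
console.log(numbers);
```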
Change the regex from
[\+]?[0-9." "-]{8,}
to
([\+]?[0-9." "-]{8,})
See Capturing Groups for further details.