Capture a substring which is not present in the source string - regex

I want a regular expression search and replace that works as follows:
If the input string ends with images, videos, or friends, the output string contains the matched suffix.
Otherwise, the output string contains the suffix profile.
Example input/output:
/john-smith-images -> /user-images
/john-smith-videos -> /user-videos
/john-smith -> /user-profile
I tried this regex which captures the suffix, if present:
/.+?(images|videos|friends)?$/
I am restricted to a single regular expression and a regex-only solution. I need to use this in mod_rewrite/IIRF/IIS URL Rewrite.

Instead of using .replace(), consider using .match() or .test() inside a conditional, and handling the non-matching case separately.

Give this a try:
text.replace(/[\w-]+(?=(images|videos|friends))/, 'user-').replace(/[\w-]+-(?!(images|videos|friends))\w*/, 'user-profile')

Use String#replace with callback like this:
var regexp = /.+?(images|videos|friends|)$/;
function cb ($0, $1) {
  // $1 is "" when no known suffix matched, so fall back to 'profile'
  var r = $1 ? $1 : 'profile';
  return '/user-' + r;
}
console.log("/john-smith-images".replace(regexp, cb));
console.log("/john-smith-videos".replace(regexp, cb));
console.log("/john-smith".replace(regexp, cb));
Output:
/user-images
/user-videos
/user-profile
Live Demo: http://ideone.com/MVRcku
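For comparison, here is a minimal Python sketch of the same callback idea. The names SUFFIX_RE and rewrite are illustrative and do not come from the answers. Like the JavaScript version, it relies on a replacement function, so it does not satisfy the regex-only restriction in the question.

```python
import re

# Same idea as the JavaScript pattern above: the last alternative is empty,
# so the group always participates and is "" when no known suffix is present.
SUFFIX_RE = re.compile(r"^.+?(images|videos|friends|)$")

def rewrite(path):
    """Return /user-<suffix>, falling back to 'profile' (hypothetical helper)."""
    def pick_suffix(match):
        # An empty capture means none of the known suffixes matched.
        return "/user-" + (match.group(1) or "profile")
    return SUFFIX_RE.sub(pick_suffix, path)

for path in ("/john-smith-images", "/john-smith-videos", "/john-smith"):
    print(path, "->", rewrite(path))
```

The empty alternative is what makes the group "capture" something that is not in the input: the fallback text itself still has to be supplied outside the pattern.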

Related

Get result String RegEx

I am trying to get string using RegEx; here is the string:
window.runParams = {};
window.runParams = {blablabla};
How to get the second string {blablabla}? I am using REGEX:
(?<=window.runParams = ").*(?=;)
But that gets the first string {}.
If you want the string with braces, e.g. {blablabla}:
window\.runParams = (\{\w+\})
If you want only the string inside the braces, e.g. blablabla:
window\.runParams = \{(\w+)\}
The value of group 1 is your result.
The following pattern matches only curly brackets that contain word characters:
(?<=window.runParams = ){\w+}(?=;)
and will only match:
{blablabla}
when run against the text:
window.runParams = {};
window.runParams = {blablabla};
See results here:
https://regex101.com/r/mTwA64/1
Try modifying your regex so that it only accepts matches with non-empty curly brackets, \{.+\}, such as:
(?<=window\.runParams = )(\{.+\})(?=;)
There are probably ways to simplify the regex further, depending on your problem. My guess is that you don't need the lookahead/lookbehind; in the example given, \{.+\} alone will do just fine (it returns {blablabla}), but it really depends on the format and content of your file. Also remember that braces, dots, etc. have a special meaning in regexes, so you probably want to escape them.
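The "non-empty braces" idea from the answers above can be sketched in Python as follows. This is only an illustration: the name NON_EMPTY_RE is made up here, and [^{}]+ is a variant swapped in for the \w+ and .+ used in the answers so that the match cannot run past the closing brace.

```python
import re

text = "window.runParams = {};\nwindow.runParams = {blablabla};"

# Escape the literal dot and braces; [^{}]+ requires at least one character
# inside the braces, so the empty "{}" assignment is skipped.
NON_EMPTY_RE = re.compile(r"window\.runParams = (\{[^{}]+\});")

match = NON_EMPTY_RE.search(text)
with_braces = match.group(1) if match else None   # value including braces
inner = with_braces[1:-1] if with_braces else None  # value without braces
print(with_braces, inner)
```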

How to write regular expression in powershell

I need a regular expression in PowerShell to split a string on ## and remove the text up to the next semicolon (;).
I have the following string.
$temp = "admin#test.com## deliver, expand;user1#test.com## deliver, expand;group1#test.com## deliver, expand;"
Now I want to split this string and collect only the email IDs into a new array. My expected output looks like this:
admin#test.com
user1#test.com
group1#test.com
To get the above output, I need to split the string on ## and remove the substring up to the semicolon (;).
Can anyone help me write a regex to achieve this in PowerShell?
If you want to use regex-based splitting with your approach, you can use ##[^;]*; regex and this code that will also remove all the empty values (with | ? { $_ }):
$res = [regex]::Split($temp, '##[^;]*;') | ? { $_ }
The ##[^;]*; matches:
## - double #
[^;]* - zero or more characters other than ;
; - a literal ;.
Use [regex]::Matches to get all occurrences of your regular expression. You probably don't need to split the string at all if this pattern suits you:
\b\w+#[^#]*
PowerShell code:
[regex]::Matches($temp, '\b\w+#[^#]*') | ForEach-Object { $_.Groups[0].Value }
Output:
admin#test.com
user1#test.com
group1#test.com
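Both approaches (splitting on the unwanted part, or matching the wanted part directly) carry over to other regex engines. A small Python sketch, using the same two patterns as the answers and the sample string from the question; the variable names by_split and by_match are illustrative:

```python
import re

temp = ("admin#test.com## deliver, expand;user1#test.com## deliver, expand;"
        "group1#test.com## deliver, expand;")

# Approach 1: split on "##...;" and drop the empty piece left after the
# final separator (the counterpart of "| ? { $_ }" in the PowerShell answer).
by_split = [part for part in re.split(r"##[^;]*;", temp) if part]

# Approach 2: match the addresses directly, no splitting needed.
by_match = re.findall(r"\b\w+#[^#]*", temp)

print(by_split)
print(by_match)
```

Splitting needs the extra filtering step because the input ends with a separator; matching avoids that but depends on the addresses having a recognisable shape.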

Trying to match a string in the format of domain\username using Lua and then mask the pattern with '#'

I am trying to match a string in the format of domain\username using Lua and then mask the pattern with #.
So if the input is sample.com\admin; the output should be ######.###\#####;. The string can end with either a ;, ,, . or whitespace.
More examples:
sample.net\user1,hello -> ######.###\#####,hello
test.org\testuser. Next -> ####.###\########. Next
I tried ([a-zA-Z][a-zA-Z0-9.-]+)\.?([a-zA-Z0-9]+)\\([a-zA-Z0-9 ]+)\b which works perfectly with http://regexr.com/. But with Lua demo it doesn't. What is wrong with the pattern?
Below is the code I used to check in Lua:
test_text="I have the 123 name as domain.com\admin as 172.19.202.52 the credentials"
pattern="([a-zA-Z][a-zA-Z0-9.-]+).?([a-zA-Z0-9]+)\\([a-zA-Z0-9 ]+)\b"
res=string.match(test_text,pattern)
print (res)
It is printing nil.
Lua patterns aren't regular expressions, which is why your regex doesn't work:
\b isn't supported; you can use the more powerful %f frontier pattern if needed.
In the string test_text, the \ isn't escaped, so \a is interpreted as an escape sequence (the bell character).
. is a magic character in patterns, so it needs to be escaped.
This code isn't exactly equivalent to your pattern; you can tweak it if needed:
test_text = "I have the 123 name as domain.com\\admin as 172.19.202.52 the credentials"
pattern = "(%a%w+)%.?(%w+)\\([%w]+)"
print(string.match(test_text,pattern))
Output: domain com admin
After fixing the pattern, the task of replacing them with # is easy, you might need string.sub or string.gsub.
As already mentioned, pure Lua does not have regex, only patterns.
Your regex can, however, be matched with the following code and pattern:
--[[
sample.net\user1,hello -> ######.###\#####,hello
test.org\testuser. Next -> ####.###\########. Next
]]
s1 = [[sample.net\user1,hello]]
s2 = [[test.org\testuser. Next]]
s3 = [[abc.domain.org\user1]]
function mask_domain(s)
  s = s:gsub('(%a[%a%d%.%-]-)%.?([%a%d]+)\\([%a%d]+)([%;%,%.%s]?)',
    function(a,b,c,d)
      return ('#'):rep(#a)..'.'..('#'):rep(#b)..'\\'..('#'):rep(#c)..d
    end)
  return s
end
print(s1,'=>',mask_domain(s1))
print(s2,'=>',mask_domain(s2))
print(s3,'=>',mask_domain(s3))
The last example does not end with ;, ,, . or whitespace. If one of these terminators is required, simply remove the final ? from the pattern.
UPDATE: If you also need to reveal any dots in the domain before the last one (e.g. abc.domain.org), you can replace the above function with this one:
function mask_domain(s)
  s = s:gsub('(%a[%a%d%.%-]-)%.?([%a%d]+)\\([%a%d]+)([%;%,%.%s]?)',
    function(a,b,c,d)
      a = a:gsub('[^%.]','#')
      return a..'.'..('#'):rep(#b)..'\\'..('#'):rep(#c)..d
    end)
  return s
end
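For readers who want to compare against a regular expression engine, here is a rough Python sketch of the same masking. It is not a translation of the Lua code: the pattern DOMAIN_USER_RE and its character classes are assumptions made for illustration, it requires at least one dot in the domain, and it leaves dots, hyphens and the backslash visible.

```python
import re

# Assumed token shape: <domain>.<tld>\<user>. The character classes are
# illustrative, not taken from the question or the answers.
DOMAIN_USER_RE = re.compile(
    r"[A-Za-z][A-Za-z0-9.-]*\.[A-Za-z0-9]+\\[A-Za-z0-9]+"
)

def mask_domain(text):
    """Mask the letters and digits of every domain/user token with '#'."""
    def mask(match):
        # Only alphanumerics are replaced; separators stay readable.
        return re.sub(r"[A-Za-z0-9]", "#", match.group(0))
    return DOMAIN_USER_RE.sub(mask, text)

print(mask_domain(r"sample.net\user1,hello"))
print(mask_domain(r"test.org\testuser. Next"))
print(mask_domain(r"abc.domain.org\user1"))
```

Because the token must start with a letter and contain a backslash, plain IP addresses such as 172.19.202.52 in the surrounding text are left untouched.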

Dart: RegExp by example

I'm trying to get my Dart web app to: (1) determine if a particular string matches a given regex, and (2) if it does, extract a group/segment out of the string.
Specifically, I want to make sure that a given string is of the following form:
http://myapp.example.com/#<string-of-1-or-more-chars>[?param1=1&param2=2]
Where <string-of-1-or-more-chars> is just that: any string of 1+ chars, and where the query string ([?param1=1&param2=2]) is optional.
So:
Decide if the string matches the regex; and if so
Extract the <string-of-1-or-more-chars> group/segment out of the string
Here's my best attempt:
String testURL = "http://myapp.example.com/#fizz?a=1";
String regex = "^http://myapp.example.com/#.+(\?)+\$";
RegExp regexp= new RegExp(regex);
Iterable<Match> matches = regexp.allMatches(regex);
String viewName = null;
if(matches.length == 0) {
// testURL didn't match regex; throw error.
} else {
// It matched, now extract "fizz" from testURL...
viewName = ??? // (ex: matches.group(2)), etc.
}
In the above code, I know I'm using the RegExp API incorrectly (I'm not even using testURL anywhere), and on top of that, I have no clue how to use the RegExp API to extract (in this case) the "fizz" segment/group out of the URL.
The RegExp class comes with a convenience method for a single match:
RegExp regExp = new RegExp(r"^http://myapp.example.com/#([^?]+)");
var match = regExp.firstMatch("http://myapp.example.com/#fizz?a=1");
print(match[1]);
Note: I used anubhava's regular expression (yours was not escaping the ? correctly).
Note 2: even though it's not necessary here, it is usually a good idea to use raw strings for regular expressions, since you don't need to escape $ and \ in them. Sometimes triple-quoted raw strings are convenient too: new RegExp(r"""some'weird"regexp\$""").
Try this regex:
String regex = "^http://myapp.example.com/#([^?]+)";
Run it against testURL (not against the regex string itself), then read group 1 of the first match:
Iterable<Match> matches = regexp.allMatches(testURL);
var match = matches.elementAt(0);
print("${match.group(1)}"); // output : fizz
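The two steps from the question (decide whether the URL matches, then extract the segment) can be sketched in Python like this. The names VIEW_RE and view_name are hypothetical, and the dots in the host name are escaped here, which the answers above do not do.

```python
import re

# Group 1 is everything after "#" up to (but not including) an optional "?".
VIEW_RE = re.compile(r"^http://myapp\.example\.com/#([^?]+)")

def view_name(url):
    """Return the segment after '#', or raise if the URL has the wrong form."""
    match = VIEW_RE.match(url)
    if match is None:
        raise ValueError("URL does not match the expected form: " + url)
    return match.group(1)

print(view_name("http://myapp.example.com/#fizz?a=1"))
```

A single match call answers both questions at once: a missing match means the URL is invalid, and a present one already carries the captured group.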

Simple Regular Expression matching

I'm new to regular expressions and I'm trying to use RegExp on the GWT client side. I want to do simple * wildcard matching (say, if the user enters 006*, I want to match 006...). I'm having trouble writing this. What I have is:
input = (006*)
input = input.replaceAll("\\*", "(" + "\\" + "\\" + "S\\*" + ")");
RegExp regExp = RegExp.compile(input);
It returns true for strings like BKLFD006* too. What am I doing wrong?
Put a ^ at the start of the regex you're generating.
The ^ character means to match at the start of the source string only.
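To illustrate the effect of the anchor, here is a small Python sketch that turns a user-supplied wildcard into an anchored pattern. The helper name wildcard_to_regex is made up, and escaping the non-wildcard parts with re.escape is an addition that the question's code does not perform.

```python
import re

def wildcard_to_regex(user_input):
    """Build an anchored regex from input such as '006*' (hypothetical helper)."""
    # Escape every literal piece, then join the pieces with \S* so that each
    # "*" stands for any run of non-whitespace characters.
    parts = [re.escape(part) for part in user_input.split("*")]
    return re.compile("^" + r"\S*".join(parts))

pattern = wildcard_to_regex("006*")
print(bool(pattern.search("006123")))     # matches at the start of the string
print(bool(pattern.search("BKLFD006*")))  # rejected because of the ^ anchor
```

Without the leading ^, a search would find 006 anywhere in the string, which is the behaviour described in the question.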
I think you are mixing two things here, namely replacement and matching.
Matching is used when you want to extract the part of the input string that matches a specific pattern. That seems to be what you want here. To get one or more digits at the very start of the string that are followed by a star, you can use the following regex:
^[0-9]+(?=\*)
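and here is a Java snippet:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
String subjectString = "006*";
String resultString = null;
// Match leading digits only when a literal * follows them.
Pattern regex = Pattern.compile("^[0-9]+(?=\\*)");
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
    resultString = regexMatcher.group(); // "006"
}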
Replacement, on the other hand, is used when you want to replace each occurrence of a pattern in the input string with something else.
For example, to replace leading digits followed by a star with the same digits surrounded by parentheses, you can do this:
String input = "006*";
String result = input.replaceAll("^([0-9]+)\\*", "($1)");
Notice the use of $1 to reference the digits that were captured by the capture group ([0-9]+) in the regex pattern.