regex to pick folders from network path - regex

I'm trying to get a regex for selecting part of a network path
\\server.env.com\Target\Test1\Test2\final1\final2\final3\final4\final5
I need to skip two folders after Target and get the rest of the path from the above. So the regex should give me final1\final2\final3\final4\final5 in this case. The path can have more levels of folders after final5, so the regex should work for any number of folders.
When I use lookbehind, the browser says it's not supported, so I cannot use it.

Using regex...
var str1="path \\server.env.com\\Target\\Test1\\Test2\\final1\\final2\\final3\\final4\\final5"
// "path \server.env.com\Target\Test1\Test2\final1\final2\final3\final4\final5"
str1.match( /([^\\]*\\){5}(.*)/ )[2]
// "final1\final2\final3\final4\final5"
This works by skipping the first five backslash-terminated segments that precede the 'finals'.
Or, using split:
var arr = str1.split("\\")
arr.splice(0,5)
var result = arr.join("\\")
result // "final1\final2\final3\final4\final5"
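Counting five segments from the start only works if the prefix before Target always has the same number of parts. Here is a minimal sketch of the split approach that locates the Target folder by name instead. The function name foldersAfter and its parameters are made up for illustration:

```javascript
// Hypothetical helper: return the part of `path` that follows `target`
// plus `skip` more folders, or null if `target` is not a folder in the path.
function foldersAfter(path, target, skip) {
  var parts = path.split('\\');
  var i = parts.indexOf(target);
  if (i === -1) {
    return null;
  }
  // i + 1 is the first folder after target; skip `skip` more of them.
  return parts.slice(i + 1 + skip).join('\\');
}

var unc = '\\\\server.env.com\\Target\\Test1\\Test2\\final1\\final2\\final3\\final4\\final5';
console.log(foldersAfter(unc, 'Target', 2));
// "final1\final2\final3\final4\final5"
```

This needs neither a regex nor lookbehind, and it gives the same result whether or not the string has a leading "path " prefix.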

Following this post (How do you use a variable in a regular expression?), create a regex that removes everything up to Target plus two more sub-directories:
var path = "\\server.env.com\\Target\\Test1\\Test2\\final1\\final2\\final3\\final4\\final5"
console.log("Path is: " + path)
var target = "Target"
var regex = "^.*" + target + "\\\\(?:[^\\\\]+\\\\){2}"
console.log("Regex is: "+ regex)
var re = new RegExp(regex, "mg")
var extracted = path.replace(re, "")
console.log("Extraction is: " + extracted)
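Because target is concatenated straight into the pattern, any regex metacharacter in it (a dot in a folder name, for example) would be read as regex syntax. Here is a sketch that escapes the name first. escapeRegExp and pathAfter are hypothetical helper names, not part of the code above:

```javascript
// Hypothetical helper: escape characters that are special inside a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: drop everything up to `\target\` plus `skip` more folders.
// If the pattern does not match, the path is returned unchanged.
function pathAfter(path, target, skip) {
  var re = new RegExp(
    '^.*?\\\\' + escapeRegExp(target) + '\\\\(?:[^\\\\]+\\\\){' + skip + '}'
  );
  return path.replace(re, '');
}

var uncPath = '\\\\server.env.com\\Target\\Test1\\Test2\\final1\\final2\\final3\\final4\\final5';
console.log(pathAfter(uncPath, 'Target', 2));
// "final1\final2\final3\final4\final5"
```

Requiring a backslash on both sides of the name also keeps a folder such as MyTarget from being mistaken for Target.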

Related

remove "?show=false" using regex [duplicate]

I'm looking for a regular expression to use in my JavaScript code that gives me the last part of a URL without its parameters, if there are any. Here is an example, with and without parameters:
https://scontent-fra3-1.xx.fbcdn.net/v/t1.0-9/14238253_132683573850463_7287992614234853254_n.jpg?oh=fdbf6800f33876a86ed17835cfce8e3b&oe=599548AC
https://scontent-fra3-1.xx.fbcdn.net/v/t1.0-9/14238253_132683573850463_7287992614234853254_n.jpg
In both cases I want to get:
14238253_132683573850463_7287992614234853254_n.jpg
Here is my regexp:
.*\/([^?]+)
and the JS code:
let lastUrlPart = /.*\/([^?]+)/.exec(url)[1];
The same thing wrapped in a function, so it can be tested:
let lastUrlPart = url => /.*\/([^?]+)/.exec(url)[1];
// TEST
let t1 = "https://scontent-fra3-1.xx.fbcdn.net/v/t1.0-9/14238253_132683573850463_7287992614234853254_n.jpg?oh=fdbf6800f33876a86ed17835cfce8e3b&oe=599548AC"
let t2 = "https://scontent-fra3-1.xx.fbcdn.net/v/t1.0-9/14238253_132683573850463_7287992614234853254_n.jpg"
console.log(lastUrlPart(t1));
console.log(lastUrlPart(t2));
Maybe there are better alternatives?
You could always try doing it without regex. Split the URL by "/" and then parse out the last part of the URL.
var urlPart = url.split("/");
var img = urlPart[urlPart.length-1].split("?")[0];
That should get everything after the last "/" and before the first "?".
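Another option is to let the built-in URL parser separate the path from the query string. This is a minimal sketch, assuming absolute URLs and an environment where the global URL constructor is available (modern browsers and Node.js). The function name lastPathSegment is made up for illustration:

```javascript
// Hypothetical helper: return the last path segment of an absolute URL.
// The pathname never includes the "?query" or "#hash" part.
function lastPathSegment(url) {
  var pathname = new URL(url).pathname;
  return pathname.slice(pathname.lastIndexOf('/') + 1);
}

var withQuery = 'https://scontent-fra3-1.xx.fbcdn.net/v/t1.0-9/14238253_132683573850463_7287992614234853254_n.jpg?oh=fdbf6800f33876a86ed17835cfce8e3b&oe=599548AC';
var withoutQuery = 'https://scontent-fra3-1.xx.fbcdn.net/v/t1.0-9/14238253_132683573850463_7287992614234853254_n.jpg';
console.log(lastPathSegment(withQuery));
console.log(lastPathSegment(withoutQuery));
// both print "14238253_132683573850463_7287992614234853254_n.jpg"
```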

Google App Script findText regex not working for new line character

I'm trying to locate/modify text in my Google Document where the text has been broken across a full line break. My regular expression below works when I manually find text in the Google document (CTRL+F) and then search via the regular expression dialog. What is baffling is why the exact same regex doesn't work in the code below on full line breaks, i.e. "\n" (note: soft line breaks, "\v", are OK).
The second approach finds the text, but I'm unable to do anything with it, as I need the element object in order to manipulate the text.
//Test document 1Q6v8ipqA81LoPtpk71NdqTaIEqMjki1KIJbrm0bILBg contains the following text:
//
//This Agreement shall not be assigned by either party without the prior\n
//written consent of the parties hereto
var doc = DocumentApp.openById('1Q6v8ipqA81LoPtpk71NdqTaIEqMjki1KIJbrm0bILBg');
//Method 1 - does NOT locate the text
var body = doc.getBody();
var pattern = "prior[\\s]*written"; // backslash doubled: inside a string literal "\s" collapses to a plain "s"
var foundElement = body.findText(pattern);
while (foundElement != null) {
  var foundText = foundElement.getElement().asText();
  var start = foundElement.getStartOffset();
  var end = foundElement.getEndOffsetInclusive();
  foundElement = body.findText(pattern, foundElement);
}
//Method 2 - locates the text, but I cannot acquire the element object
var body2 = doc.getBody().getText();
var pattern2 = /prior[\s]*written/g; // "g" flag so exec() advances instead of returning the first match forever
var m;
while ((m = pattern2.exec(body2)) !== null) {
  Logger.log(m[0]);
}
If this were ever going to work, you would need the regex to be in s (single line) mode. Per https://developers.google.com/apps-script/reference/document/body#findtextsearchpattern,
A subset of the JavaScript regular expression features are not fully supported, such as capture groups and mode modifiers.
So it looks like they have in fact chosen not to support multi-line matches in any way.
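One possible workaround is to run the regex over the joined text (as in Method 2) and then map the match offsets back to the paragraphs they fall in. The sketch below shows only that mapping on a plain array of strings. In Apps Script the strings might come from body.getParagraphs() and each paragraph's getText(), but that wiring is an assumption and is left out here. The function name locateAcrossParagraphs is made up for illustration:

```javascript
// Hypothetical helper: find the first match of `regex` in the paragraphs joined
// with "\n" and report which paragraphs (and offsets inside them) it spans.
function locateAcrossParagraphs(paragraphTexts, regex) {
  var joined = paragraphTexts.join('\n');
  var m = regex.exec(joined);
  if (!m) {
    return null;
  }
  var start = m.index;
  var end = m.index + m[0].length - 1; // inclusive, like getEndOffsetInclusive()
  var result = { text: m[0], startParagraph: -1, startOffset: -1, endParagraph: -1, endOffset: -1 };
  var offset = 0; // position of the current paragraph's first character in `joined`
  for (var i = 0; i < paragraphTexts.length; i++) {
    var len = paragraphTexts[i].length;
    if (result.startParagraph === -1 && start <= offset + len) {
      result.startParagraph = i;
      result.startOffset = start - offset;
    }
    if (end <= offset + len) {
      result.endParagraph = i;
      result.endOffset = end - offset;
      break;
    }
    offset += len + 1; // +1 for the "\n" separator
  }
  return result;
}

var paragraphs = [
  'This Agreement shall not be assigned by either party without the prior',
  'written consent of the parties hereto'
];
var hit = locateAcrossParagraphs(paragraphs, /prior\s*written/);
console.log(hit.startParagraph, hit.startOffset, hit.endParagraph, hit.endOffset);
```

With the paragraph indexes and offsets in hand, each affected paragraph can then be edited separately through its own element object.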

Multiple regex to match file extension with version

My current regex is like so
/\.(jpe?g|png|gif|svg)$/i
I'm trying to modify it to also match when the extension is followed by GET parameters, so that all of the formats below would match:
../fonts/fontawesome-webfont.svg
../fonts/fontawesome-webfont.svg?v=4.3.0
../fonts/fontawesome-webfont.svg?v=4.3.0#fontawesomeregular
How can I modify it to support these?
Assuming the URLs to be parsed follow proper formatting (where only one '?' delimiter can be used to signify the start of the query) you could do:
/\.(jpe?g|png|gif|svg)(?:\?.*|)$/i
var urls = [
'../fonts/fontawesome-webfont.svg',
'../fonts/fontawesome-webfont.svg?v=4.3.0',
'../fonts/fontawesome-webfont.svg?v=4.3.0#fontawesomeregular'
];
var matches = urls.map(function(url) { return url.match(/\.(jpe?g|png|gif|svg)(?:\?.*|)$/i); });
document.write('<pre>' + JSON.stringify(matches, null, 2) + '</pre>');
Alternatively you could use Node's url.parse():
var url = require('url');
var urlObj = url.parse(URL_STRING);
var matches = urlObj.pathname.match(/\.(jpe?g|png|gif|svg)$/i);
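The same idea also works without Node: cut the string at the first "?" or "#" and test only what is left. This is a minimal sketch, and the function name hasImageExtension is made up for illustration:

```javascript
// Hypothetical helper: true if the path part of `url` ends in an image extension.
function hasImageExtension(url) {
  var path = url.split(/[?#]/)[0]; // drop the query string and fragment
  return /\.(jpe?g|png|gif|svg)$/i.test(path);
}

console.log(hasImageExtension('../fonts/fontawesome-webfont.svg'));                            // true
console.log(hasImageExtension('../fonts/fontawesome-webfont.svg?v=4.3.0#fontawesomeregular')); // true
console.log(hasImageExtension('../fonts/fontawesome-webfont.svg#iefix'));                      // true
console.log(hasImageExtension('../fonts/fontawesome-webfont.woff?v=1.svg'));                   // false
```

The last two cases are where this differs from the single-regex version: a fragment with no query string is still accepted, and an extension that appears only inside the query string is not.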

Javascript regex replace

I have a language dropdown, and a JavaScript function which changes the page to the corresponding language selected. I need help with my regex replace.
For example, I would like the first URL below to turn into the second:
http://localhost:7007/en/Product/Detail/1038
http://localhost:7007/fr/Product/Detail/1038
function languageChange(sender) {
  var lang = $(sender).val();
  var target = window.location.href;
  target = target.replace(/(http:\/\/.*?)([a-zA-Z]{2})(.*$)/gim, '$1' + lang + '$3');
  window.location = target;
}
Does your URL always have the same structure? If so, you may not need a regex at all. Split the URL at each "/", replace index 3, then join your array back together with "/".
Here is a code sample:
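function changeLanguage(url, newLang) {
  // 'http://host/en/...' splits into ['http:', '', 'host', 'en', ...], so index 3 is the language
  var parts = url.split('/');
  parts[3] = newLang;
  return parts.join('/');
}
changeLanguage('http://localhost:7007/en/Product/Detail/1038', 'fr');
// "http://localhost:7007/fr/Product/Detail/1038"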
Note: I originally wrote "splice" instead of "join" in my response. Join is the correct method.
Here is a function that processes any number of URLs within a string and replaces the language part (the first part of the path), but only if it exists and is 2 to 4 letters long:
function changeLanguage(text, lang) {
  return text.replace(
    /\b(\w+:\/\/[^\/]+\/)[A-Z]{2,4}(?=[\/\s]|$)/gim,
    '$1' + lang);
}
Edit: Converted to function format.
Use this regex:
target = target.replace(/(https?:\/\/[^/]+)\/?([^/]*)(.*)/gi, '$1/' + lang + '$3');
If, for example, lang = 'fr', then target ends up as http://localhost:7007/fr/Product/Detail/1038.
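If the page URL can also carry a query string or a fragment, the built-in URL parser keeps those parts out of the way. This is a minimal sketch, assuming the language is always the first path segment and that the global URL constructor is available. The function name withLanguage is made up for illustration:

```javascript
// Hypothetical helper: replace the first path segment of an absolute URL with `lang`.
function withLanguage(href, lang) {
  var u = new URL(href);
  var segments = u.pathname.split('/'); // '/en/Product' -> ['', 'en', 'Product']
  segments[1] = lang;
  u.pathname = segments.join('/');
  return u.toString();
}

console.log(withLanguage('http://localhost:7007/en/Product/Detail/1038', 'fr'));
// "http://localhost:7007/fr/Product/Detail/1038"
```

In the dropdown handler this could be called as window.location = withLanguage(window.location.href, lang).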

Regex to replace domain within links that are not images

Need to replace a domain name on all the links on the page that are not images or pdf files.
This would be a full html page received through a proxy service.
Example:
test<img src="http://www.test.com" /><a href="http://www.test.com/test.pdf">pdf
test1
Result:
test<img src="http://www.test.com" /><a href="http://www.test.com/test.pdf">pdf
test1
If you are using .NET, I strongly suggest using HTML Agility Pack.
Parsing HTML directly with a regex can be very error-prone. This question is also similar to the post below:
What regex should I use to remove links from HTML code in C#?
If the domain is http://www.example.com, the following should do the trick:
/http:\/\/www\.example\.com\S*(?!pdf|jpg|png|gif)\s/
This uses a negative lookahead to ensure that the regex matches a string only if the string does not contain pdf,png,jpg or gif at the specified position.
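One caveat with the pattern above: the lookahead is only checked where \S* stops, which is right in front of the whitespace, so it never gets to see the extension. Here is a sketch of a variant that looks ahead before consuming the URL, assuming links are delimited by whitespace as in the pattern above. The variable names and the www.example.org replacement domain are placeholders:

```javascript
// The pattern from above: the lookahead runs after \S* has consumed the whole URL.
var original = /http:\/\/www\.example\.com\S*(?!pdf|jpg|png|gif)\s/;

// Variant (an assumption, not the pattern above): check the extension first,
// and reject the URL when it ends in one of the listed extensions.
var nonFileLink = /http:\/\/www\.example\.com(?!\S*\.(?:pdf|jpg|png|gif)(?!\S))\S*/g;

var text = 'see http://www.example.com/page.html and http://www.example.com/test.pdf here';
var replaced = text.replace(nonFileLink, function (url) {
  return url.replace('www.example.com', 'www.example.org');
});

console.log(original.test('http://www.example.com/test.pdf ')); // true: the PDF link still matches
console.log(replaced);
// "see http://www.example.org/page.html and http://www.example.com/test.pdf here"
```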
If none of your PDF URLs have query parameters (like a.pdf?asd=12), the following code will work. It replaces only absolute and root-relative URLs.
var links = document.getElementsByTagName("a");
var len = links.length;
var newDomain = "http://mydomain.com";
/**
* Match absolute urls (starting with http)
* and root relative urls (starting with a `/`)
* Does not match relative urls like "subfolder/anotherpage.html"
* */
var regex = new RegExp("^(?:https?://[^/]+)?(/.*)$", "i");
//uncomment next line if you want to replace only absolute urls
//regex = new RegExp("^https?://[^/]+(/.*)$", "i");
for (var i = 0; i < len; i++)
{
  var link = links.item(i);
  var href = link.getAttribute("href");
  if (!href) // in case of named anchors
    continue;
  if (href.match(/\.pdf$/i)) // if pdf
    continue;
  href = href.replace(regex, newDomain + "$1");
  link.setAttribute("href", href);
}
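To also cover PDF and image links that do carry query parameters, the check can be made on the path alone. This is a sketch of the per-link logic as a plain function, so it can be tried outside the browser. The function name rewriteHref is made up, and the list of skipped extensions is an assumption based on the question:

```javascript
// Hypothetical helper: return `href` moved to `newDomain`, unless it points to a
// PDF or image file, or is a relative URL (those are returned unchanged).
function rewriteHref(href, newDomain) {
  var path = href.split(/[?#]/)[0]; // ignore the query string and fragment
  if (/\.(?:pdf|jpe?g|png|gif)$/i.test(path)) {
    return href;
  }
  // Same pattern as above: absolute and root-relative URLs only.
  return href.replace(/^(?:https?:\/\/[^\/]+)?(\/.*)$/i, newDomain + '$1');
}

console.log(rewriteHref('http://www.test.com/page?x=1', 'http://mydomain.com'));
// "http://mydomain.com/page?x=1"
console.log(rewriteHref('http://www.test.com/a.pdf?asd=12', 'http://mydomain.com'));
// "http://www.test.com/a.pdf?asd=12"
```

Inside the loop above, this would replace the two checks and the replace call with link.setAttribute("href", rewriteHref(href, newDomain)).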