Regular Expression for finding JavaScript accessing custom attributes - regex

I'm fixing our web application to be compatible with Internet Explorer 10 (non-compatibility mode) and have run into a couple of issues. There is a lot of JavaScript that accesses a custom attribute of an element, which does not work in Internet Explorer 10 (regular mode). I've fixed most cases by using element.getAttribute("customattribute"). The problem is that there is quite a bit of JavaScript, and I do not know all the places where a custom attribute is being read. I'm working on finding all occurrences with a regular expression. Basically, I want to find any word, followed by a dot (.), followed by any word except attributes like id, name, checked, etc., followed by a space or equal sign. This is what I've come up with so far.
(\w)\.(?!attr|index|all|id|value|className)(\w)([ \t]|=)
The words attr, index, all, id, value and className are all being returned though. Is there a better way (or correct way) to achieve this?

I used the following modification to obtain the things you are asking for:
(\w*)\.(?!attr|index|all|id|value|className|getElementById)(\w*)
However there are a lot of dot phrases caught (e.g. "document.getElementById", "xmlhttp.open") which you don't want. So whitelisting things you do want may be helpful as well:
style\.(?!attr|index|all|id|value|className|getElementById)(\w*)
Tested at: http://gskinner.com/RegExr/ with a sample of JavaScript code. Without more information about the JavaScript code itself, I can't tell whether it is easier to exclude the standard properties or, if there are too many of those, to do the opposite and list the custom ones you want to find.
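One gap in the patterns above is that the lookahead rejects any property that merely starts with a listed word, so a custom attribute named identifier would be skipped because of id. Here is a minimal sketch of a variant, written in Python only so that it can be run as-is. The \b inside the lookahead, the sample string, and the names STANDARD_PROPS and find_custom_accesses are assumptions for illustration, not part of the original answer.

```python
import re

# Standard properties to ignore; extend this list for your own code base.
STANDARD_PROPS = ["attr", "index", "all", "id", "value", "className"]

# The \b after the alternation makes the lookahead reject whole names only,
# so "id" is skipped but "identifier" is still reported.
# The trailing lookahead requires a space, tab, or "=" after the property.
PATTERN = re.compile(
    r"\b(\w+)\.(?!(?:" + "|".join(STANDARD_PROPS) + r")\b)(\w+)(?=[ \t=])"
)

def find_custom_accesses(source):
    """Return (object, property) pairs for each non-standard property access."""
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(source)]

sample = 'el.id = "a"; el.customattribute = "x"; row.identifier = 3; box.value= 1;'
print(find_custom_accesses(sample))
# [('el', 'customattribute'), ('row', 'identifier')]
```

The same pattern should carry over to JavaScript regex syntax unchanged, since it only uses lookaheads and word boundaries.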

Related

How to exclude the last part of a variable string using regex

I am currently making a bunch of landing pages that use similar URL structure, but each URL varies in number of words.
So it's something like:
http://landingpage.xyz/page-number-five
http://landingpage.xyz/page-number-fifty-four
http://landingpage.xyz/page-for-a-different-topic
For the sent page I just append -sent, like this. The reason I am not adding it as /sent is that the platform I am using handles URLs this way.
http://landingpage.xyz/page-number-five-sent
http://landingpage.xyz/page-number-fifty-four-sent
http://landingpage.xyz/page-for-a-different-topic-sent
I found it easy to write a regular expression that identifies all the sent pages, for example:
\/([a-z0-9\-]*)-sent
The thing is that I am not sure how to identify the ones that are not sent. I tried using a similar regular expression using something like this, but it's not working as expected:
\/([a-z0-9\-]*)(?!-sent)
What's the best way to design the regex for this? Or am I approaching it the wrong way?
A lookahead is only useful at a position where there are still characters left to check. At the end of the regex, after the greedy group has already consumed -sent, it has nothing left to reject. Since I'm not sure whether your environment supports lookbehinds, this should work as a workaround:
\/(?!.*-sent\b)([a-z0-9\-]*)
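To see the difference between the two patterns, here is a small sketch in Python (chosen only for convenience; the variable names are made up for illustration). It runs both regexes from this thread against the six example URLs.

```python
import re

# The asker's attempt: the group has already consumed "-sent" (or can
# backtrack to a shorter match), so the trailing lookahead still passes.
ATTEMPT = re.compile(r"/([a-z0-9\-]*)(?!-sent)")

# The workaround: the lookahead scans the rest of the URL before the
# group consumes anything, so any later "-sent" blocks the match.
NOT_SENT = re.compile(r"/(?!.*-sent\b)([a-z0-9\-]*)")

urls = [
    "http://landingpage.xyz/page-number-five",
    "http://landingpage.xyz/page-number-fifty-four",
    "http://landingpage.xyz/page-for-a-different-topic",
    "http://landingpage.xyz/page-number-five-sent",
    "http://landingpage.xyz/page-number-fifty-four-sent",
    "http://landingpage.xyz/page-for-a-different-topic-sent",
]

attempt_hits = [u for u in urls if ATTEMPT.search(u)]
unsent = [u for u in urls if NOT_SENT.search(u)]

print(len(attempt_hits))  # 6: the attempt matches every URL
print(unsent)             # only the three URLs without "-sent"
```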

Chrome dev tools: any way to exclude requests whose URL matches a regex?

Unfortunately, in recent versions of Chrome the negative network filter no longer works. I used this filter to exclude every HTTP call containing a particular string. I asked for a solution in the Chrome DevTools forum, but so far nobody has answered.
So I would like to know if there is a way to resolve this problem (and exclude for example each call containing the string 'loadMess') with regex syntax.
Update (2018):
This is an update to my old answer to clarify that both bugs have been fixed for some time now.
Negate or exclude filtering is working as expected now. That means you can filter request paths with my.com/path (show requests matching this), or -my.com/path (show requests not matching this).
The regex solution also works now that my PR fix has made it into production. That means you can also filter with /my.com.path/ and /^((?!my.com/path).)*$/, which achieve the same results as the plain and negated filters above, respectively.
I have left the old answer here for reference, and it also explains the negative lookahead solution.
The pre-defined negative filters do work, but Chrome stable currently doesn't allow NOT filters on names, only CONTAINS. This is a bug that has been fixed in Chrome Canary.
Once the change has been pushed to Chrome stable, you should be able to do loadMess to filter only for that name, and -loadMess to filter out that name and leave the rest, as it was previously.
Workaround: Regex for matching a string not containing a string
^((?!YOUR_STRING).)*$
Example:
^((?!loadMess).)*$
Explanation:
^ - Start of string
(?!loadMess) - Negative lookahead (at this cursor, do not match the next bit, without capturing)
. - Match any character (except line breaks)
()* - 0 or more of the preceding group
$ - End of string
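The pattern can be tried outside of DevTools as well. Here is a minimal sketch in Python; the helper not_containing and the sample request names are hypothetical and only there to exercise the regex explained above.

```python
import re

def not_containing(literal):
    """Build a regex that matches only strings that do not contain `literal`."""
    # re.escape keeps characters such as "." or "?" in the literal from
    # being interpreted as regex syntax.
    return re.compile(r"^((?!" + re.escape(literal) + r").)*$")

pattern = not_containing("loadMess")

names = ["loadMess?id=1", "api/users", "xloadMessy", "static/app.js"]
kept = [n for n in names if pattern.search(n)]

print(kept)  # ['api/users', 'static/app.js']
```

The lookahead is re-checked before every single character, which is why a match anywhere in the string, not just at the start, is enough to reject it.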
Update (2016):
I discovered that there is actually a bug with how DevTools deals with Regex in the Network panel. This means the workaround above doesn't work, despite it being valid.
The Network panel filters on Name and Path (as discovered from the source code), but it does two tests that are OR'ed. In the case above, if you have loadMess in the Name, but not in the Path (e.g. not the domain or directory), it's going to match on either. To clarify, true || false === true, which means it will only filter out loadMess if it's found in both the Name and Path.
I have created an issue in Chromium and have since pushed a fix for review, which has now been merged.
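The effect of the OR'ed tests can be reproduced with a simplified model. This is only a sketch of the behaviour described above, not the DevTools source: the function names and the sample name and path are made up, and the AND variant is a hypothetical contrast rather than a description of how the merged fix works.

```python
import re

NOT_LOADMESS = re.compile(r"^((?!loadMess).)*$")

def shown_with_or(name, path):
    """Model of the bug: the request is shown if either field passes the regex."""
    return bool(NOT_LOADMESS.search(name)) or bool(NOT_LOADMESS.search(path))

def shown_with_and(name, path):
    """Hypothetical stricter check: both fields must pass the negative regex."""
    return bool(NOT_LOADMESS.search(name)) and bool(NOT_LOADMESS.search(path))

# "loadMess" appears in the name but not in the path.
print(shown_with_or("loadMess?id=1", "example.com/api"))   # True: still shown
print(shown_with_and("loadMess?id=1", "example.com/api"))  # False: filtered out
```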
This is answered here - for latest Chrome 58.0.3029.110 (Official Build) (64-bit)
https://stackoverflow.com/a/27770139/4772631
E.g.: If I want to exclude all gifs then just type -gif
Negative lookahead is recommended everywhere, but it does not work.
Instead, "-myregex" does work for me. Like this: -/(Violation|HMR)/.
Chrome's dev tools do not support regex filters very well.
When I want to hide some requests, the approaches shown above do not work. But you can use -hide1 -hide2 to hide the requests you want.
Just leave a space between the conditions. This does not behave like a regex; I guess it uses plain string matching rather than regex under the hood.
Filtering multiple different urls
You can use the negation symbol (-) to filter out network calls.
E.g.: -lab.com would filter out lab.com URLs.
To filter out multiple URLs, use the | symbol in a regex.
E.g.: -/lab.com|mini.com/ filters out both lab.com and mini.com. You can use the same approach to filter out many different websites or URLs.
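The logic of a negated regex filter can be emulated outside the browser. The following Python sketch is an approximation for illustration only (the function apply_negated_filter and the sample URLs are assumptions, and DevTools itself may differ in detail). It also shows that an unescaped . matches any character, so escaping it as \. is slightly stricter.

```python
import re

def apply_negated_filter(urls, pattern):
    """Keep only the URLs that the regex does NOT match (emulating -/pattern/)."""
    regex = re.compile(pattern)
    return [u for u in urls if not regex.search(u)]

urls = [
    "https://lab.com/api/items",
    "https://mini.com/logo.png",
    "https://example.org/index.html",
]

print(apply_negated_filter(urls, r"lab.com|mini.com"))
# ['https://example.org/index.html']

# With an unescaped dot, "labxcom" is also treated as a match and hidden.
print(apply_negated_filter(["https://labxcom.net/"], r"lab.com"))   # []
print(apply_negated_filter(["https://labxcom.net/"], r"lab\.com"))  # kept
```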
You can use "Invert" option to exclude the APIs matching a string in the Filter text box.
On the latest Chrome version (62) you have to use:
-mime-type:image/gif

How to use regex for URL-targeting

As a disclaimer, I must say that my experience with regular expressions is very limited. I am using Optimizely for A/B testing and have run into a problem. I only want my experiment to run on one page, however, this page's URL-structure is somewhat complicated. The URL-structure of the page where I want to run my experiment looks like this:
https://mywebsite.co/term/public_id/edit/pricing
The problem is the public_id that changes dynamically, whenever a new user goes through the signup flow. How can I use regex to target this page exclusively? I have been trying to figure it out these past days but without any luck. Optimizely regex docs can be found here. I can't just use a simple match because /term/ appears in the URL of several pages on my site.
You could use this regular expression:
mywebsite\.co/term/.*?/edit/pricing
The .* part means any character can occur here any number of times. The additional ? makes it lazy, meaning the rest of the regular expression will kick in as soon as possible.
Note that a literal . needs to be escaped with a backslash, like \.
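A quick way to check such a pattern before pasting it into Optimizely is to run it against a few sample URLs. The sketch below uses Python, with /term/ taken from the question's URL. The sample public IDs and the stricter [^/]+ variant are assumptions for illustration, and Optimizely's own regex engine may behave slightly differently.

```python
import re

# The lazy pattern from the answer, using the /term/ prefix from the question.
LAZY = re.compile(r"mywebsite\.co/term/.*?/edit/pricing")

# A stricter variant (an assumption, not from the answer): exactly one path
# segment for the public ID, and nothing after /edit/pricing.
STRICT = re.compile(r"mywebsite\.co/term/[^/]+/edit/pricing$")

samples = [
    "https://mywebsite.co/term/abc123/edit/pricing",
    "https://mywebsite.co/term/abc123/edit/profile",
    "https://mywebsite.co/term/a/b/c/edit/pricing",
]

for url in samples:
    print(url, bool(LAZY.search(url)), bool(STRICT.search(url)))
```

The lazy .*? still crosses slashes when it has to, which is why the third sample matches the first pattern but not the stricter one.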

Get URL-prefix like matching, using RegEx in Stylish?

I am coding custom CSS for Facebook using Stylish.
Everything goes well except that I need to have some custom values under the condition of URL-suffix. The only thing that comes close is URL-prefix which is the exact opposite.
So I was wondering if I could do something like:
Detect if URL is like either:
www.facebook.com/*/posts or just */post
where * could be any value.
Is it possible to do this through RegEx?
I googled it but I couldn't make anything out of it.
I want to apply some CSS code only when viewing some individual Facebook posts, and the URLbar shows:
www.facebook.com/User/Posts/PostID.php
Therefore, I would only like to detect if Post or post/postID.php exists and apply the style.
The regex below matches links that contain the string /posts:
(?=.*?\/posts).*
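The pattern is case-sensitive, so it would not match the capitalized /Posts/ in the URL from the question. Here is a small Python sketch that checks this. The [Pp] variant and the sample URLs are assumptions added for illustration, and fullmatch is used because, as far as I know, a regexp() rule in a userstyle has to match the entire URL (which is also why the pattern ends in .*).

```python
import re

# The pattern from the answer (the escaped slash is written as a plain "/").
POSTS = re.compile(r"(?=.*?/posts).*")

# Hypothetical variant that also accepts a capital P.
POSTS_ANY_CASE = re.compile(r"(?=.*?/[Pp]osts).*")

lower = "https://www.facebook.com/user/posts/123"
upper = "https://www.facebook.com/User/Posts/PostID.php"
other = "https://www.facebook.com/User/photos"

print(bool(POSTS.fullmatch(lower)))           # True
print(bool(POSTS.fullmatch(upper)))           # False: "/Posts" != "/posts"
print(bool(POSTS_ANY_CASE.fullmatch(upper)))  # True
print(bool(POSTS_ANY_CASE.fullmatch(other)))  # False
```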

Writing Regular Expression for URL in Google Analytics

I have a huge list of URL's, in the format:
http://www.example.com/dest/uk/bath/
http://www.example.com/dest/aus/sydney/
http://www.example.com/dest/aus/
http://www.example.com/dest/uk/
http://www.example.com/dest/nor/
What RegEx could I use to get the last three URL's, but miss the first two, so that every URL without a city attached is given, but the ones with cities are denied?
Note: I am using Google Analytics, so I need to use RegEx's to monitor my URL's with their advanced feature. As of right now Google is rejecting each regular expression.
Generally, the best suggestion I can make for parsing URLs with a regex is: don't.
Your time is much, much better spent finding a library for your language that is dedicated to the task of processing URLs.
It will have worked out all the edge cases, be fully RFC compliant, be bug free, secure, and have a great user interface so you can just suck out the bits you really want.
In your case, the suggested way to process it would be to use your URL library to extract the elements and then work explicitly on them.
That way, at most you'll have to deal with the path on its own, and not have to worry so much about whether it's
http://site.com/
https://site.com/
http://site.com:80/
http://www.site.com/
Unless you really want to.
For the "Path" you might even wish to use a splitter ( or a dedicated path parser ) to tokenise the path into elements first just to be sure.
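As an illustration of the library approach, here is a minimal sketch using Python's standard urllib.parse. The helper names path_segments and is_country_only are made up, and the rule "exactly two segments, the first being dest" is an assumption derived from the example URLs in the question.

```python
from urllib.parse import urlsplit

def path_segments(url):
    """Split the URL's path into its non-empty segments."""
    return [segment for segment in urlsplit(url).path.split("/") if segment]

def is_country_only(url):
    """True for /dest/<country>/ URLs, False when a city segment follows."""
    segments = path_segments(url)
    return len(segments) == 2 and segments[0] == "dest"

urls = [
    "http://www.example.com/dest/uk/bath/",
    "http://www.example.com/dest/aus/sydney/",
    "http://www.example.com/dest/aus/",
    "http://www.example.com/dest/uk/",
    "http://www.example.com/dest/nor/",
]

print([u for u in urls if is_country_only(u)])  # the last three URLs
```

Because only the parsed path is inspected, the scheme, port, and host variations listed above do not affect the result.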
tj111's current solution doesn't work - it matches all your urls.
Here's one that works (and I checked with your values). It also matches, no matter if there is a trailing slash or not:
http:\/\/.*dest\/\w+/?$
/http:\/\/www\.site\.com\/dest\/\w+\/?$/i
This matches if they're all on the same site with "dest" in the path. You could also do this:
/\w+:\/\/[^/]+\/dest\/\w+\/?$/i
This matches any protocol (http, ftp) and any site, with /dest/country at the end and an optional trailing /.
Note that this only works for a subset of what the URLs could legitimately be.
Try this regular expression:
^http://www\.example\.com/dest/[^/]+/$
This would only match the last three URLs.
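The claim can be checked against the five URLs from the question with a short Python sketch. The PATH_ONLY variant is an assumption: it is meant for the case where the Google Analytics field being filtered holds only the page path (for example /dest/uk/) rather than the full URL, which is worth verifying for your own setup.

```python
import re

# The full-URL pattern from the answer above.
COUNTRY_ONLY = re.compile(r"^http://www\.example\.com/dest/[^/]+/$")

# Hypothetical path-only variant, for fields that exclude the hostname.
PATH_ONLY = re.compile(r"^/dest/[^/]+/$")

urls = [
    "http://www.example.com/dest/uk/bath/",
    "http://www.example.com/dest/aus/sydney/",
    "http://www.example.com/dest/aus/",
    "http://www.example.com/dest/uk/",
    "http://www.example.com/dest/nor/",
]

matched = [u for u in urls if COUNTRY_ONLY.search(u)]
print(matched)  # the last three URLs

paths = ["/dest/uk/bath/", "/dest/aus/", "/dest/nor/"]
print([p for p in paths if PATH_ONLY.search(p)])  # ['/dest/aus/', '/dest/nor/']
```

The [^/]+ is what excludes the city URLs: it cannot cross a slash, so only a single segment may follow /dest/.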