Google Analytics Filter Set of Pages - regex

I need to create a filter on Google Analytics to include only a set of pages, for example, the view will have a filter to collect data only from
www.example.com/page1.html
www.example.com/page2.html
www.example.com/page3.html
I am trying to achieve this by using a Custom Filter to Include - > Request URI and using a Regex on the Filter Pattern.
My problem is that the regex exceeds the 255-character limit, even after I tried to optimize it to be as small as possible.
Creating more than one Include filter does not work, because then no data would be collected. How could I achieve this? Thank you.
This is the original regex
/es/investigacion/lace\.html|/en/research/lace\.html|/es/investigacion/lace/programa-maestrias\.html|/es/investigacion/lace/alumni/latin-american-forum-entrepreneurs\.html|/es/investigacion/lace/alumni/fondo-angel-investment\.html|/es/investigacion/lace/alumni\.html|/es/investigacion/lace/fondo-inversion\.html|/es/investigacion/lace/investigacion\.html|/es/investigacion/lace/acerca-del-centro\.html|/es/investigacion/lace/alumni/estudiantes-del-pais\.html|/en/research/investigation\.html|/en/research/about-the-center\.html|/es/investigacion/lace/alumni/mentoring\.html|/es/investigacion/lace/alumni/reatu-entrepreneur-award\.html|/en/research/lace/master-program\.html|/en/research/lace/alumni\.html|/en/research/investment-fund\.html
Edit: first try to compress the regex
/es/investigacion/lace/(programa-maestrias|alumni|investigacion|programa-maestrias|alumni/latin-american-forum-entrepreneurs|alumni/fondo-angel-investment|fondo-inversion|investigacion|acerca-del-centro|alumni/estudiantes-del-pais|alumni/mentoring|alumni/incae-entrepreneur-award)\.html|
Edit: the reason for this is that I need to create a new user profile on GA, and this new profile should have access to the information for a set of URLs only. What occurred to me is to create a new View that captures only the information for this set of URLs, and then assign the profile to this view with "Read/Analyze" permissions.

There are definitely more ways to optimise the regex. For example, since all string options end with .html, you could do something like this:
/(es/investigacion/lace|en/research/lace)\.html
which factors the shared .html suffix out of the alternatives.
You could also take out
/es/investigacion/lace
as a shared prefix and list the variable parts after it using |, e.g.
/es/investigacion/lace/(programa-maestrias|alumni|investigacion)\.html
Try a few of these optimisation techniques and you should be able to fit more in.
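A compressed pattern can be checked offline before it is pasted into GA. The following is a minimal Python sketch, assuming GA applies the filter pattern as a partial (unanchored) match. The pattern in COMPRESSED and the helper name uncovered are illustrative only, and only a subset of the pages from the question is listed.

```python
import re

# A subset of the pages from the original pattern in the question.
PAGES = [
    "/es/investigacion/lace.html",
    "/en/research/lace.html",
    "/es/investigacion/lace/programa-maestrias.html",
    "/es/investigacion/lace/alumni.html",
    "/es/investigacion/lace/alumni/mentoring.html",
    "/en/research/lace/master-program.html",
]

# Illustrative compressed pattern: shared prefixes and the .html suffix
# are factored out so that each literal appears only once.
COMPRESSED = (
    r"/(es/investigacion/lace(/(programa-maestrias|alumni(/mentoring)?))?"
    r"|en/research/lace(/master-program)?)\.html"
)

GA_PATTERN_LIMIT = 255  # character limit mentioned in the question


def uncovered(pattern, uris):
    """Return the URIs that the pattern does not match anywhere."""
    rx = re.compile(pattern)
    return [uri for uri in uris if rx.search(uri) is None]


print("pattern length:", len(COMPRESSED), "of", GA_PATTERN_LIMIT)
print("pages not covered:", uncovered(COMPRESSED, PAGES))
```

Running the same check with the full page list makes it easy to see whether a further round of factoring drops a page by accident.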

Related

Google Analytics search and replace filter URI with Regex with an exception

Currently our registration form tracks UTM and SEM codes, and social sign-ins add a very long string. I end up with roughly 4k enrollment URL variations, which are very hard to track outside of goals.
To troubleshoot channels better, I've created a separate view where I want to combine everything into just /enrollment while excluding the thank-you page. So I would have a list like this:
www.mysite/enrollment
www.mysite/enrollment/
www.mysite/enrollment/sem01
www.mysite/enrollment/sem02
www.mysite/enrollment?adsforefacebook
www.mysite/enrollment?utmforemail
www.mysite/enrollment/thank-you
I've tried using this filter which works in the goal section, but I can't get it to work under filters.
Find
www\.mysite\.com\/enrollment(?!/thank\-you)
Replace
www.mysite.com/enrollment
Theoretically, this should catch everything with enrollment except the thank-you pages and replace it with the new string.
I've tried several variations that include .*, but no go.
Oops. Never mind, I think I figured it out right after I posted this. I don't think the normal search and replace filter works with exclusion patterns; you have to use the Advanced filter, which worked exactly as expected with the above code.
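The behaviour of the negative lookahead can be illustrated outside GA. This is a rough Python sketch, not the GA filter itself: it uses the pattern from the question with a trailing .* added (an assumption of this sketch, so that the rest of the URL is dropped), and the function name normalize is made up for illustration. Whether GA's regex engine treats lookaheads the same way as Python's re module is not verified here.

```python
import re

# Lookahead pattern from the question, plus a trailing ".*" (added for
# this sketch) so that the remainder of the URL is replaced as well.
ENROLLMENT = re.compile(r"www\.mysite\.com/enrollment(?!/thank-you).*")


def normalize(url):
    """Collapse every enrollment variation except the thank-you page."""
    return ENROLLMENT.sub("www.mysite.com/enrollment", url)


for url in [
    "www.mysite.com/enrollment/",
    "www.mysite.com/enrollment/sem01",
    "www.mysite.com/enrollment?utmforemail",
    "www.mysite.com/enrollment/thank-you",
]:
    print(url, "->", normalize(url))
```

The thank-you URL is left untouched because the lookahead rejects the only position where the literal prefix could match.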

Google Analytics: View Filter by Request URI - does this work?

I think this should be pretty straightforward, but if I have URLs with request URIs such as:
/en/my-hometown/92-winston
/en/my-hometown/92-winston/backyard
and I want to set up a GA view which only includes this page and any subpages, then does the following Filter work in Analytics?
Will that basically filter against any URI which specifically contains 92-winston, or do I need to wrap it in a fancy regex? I'm not that great with RegEx.
Thanks! Apologies in advance if this is ridiculously easy.
You may want to try this regex, if you know exactly that the URI will start with "/en/my-hometown/92-winston":
^\/en\/my\-hometown\/92\-winston.*
This captures URI strings that start exactly with "/en/my-hometown/92-winston" and also include anything that may come after that, for example
/en/my-hometown/92-winston
/en/my-hometown/92-winston/backyard
/en/my-hometown/92-winston/some_other_page
Just beware that if you have more than one "Include" filter in GA, then you need to make sure you aren't actually excluding more data, as multiple include filters are "and"ed. Always test your new filters in your test view as well.
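The pattern and the warning about multiple Include filters can be tried out in Python before touching a view. This is a small sketch under the assumption that a hit is kept only when every Include pattern matches; the helper name passes_includes and the extra /fr/ pattern are invented for illustration.

```python
import re

# The Include pattern suggested in the answer above.
INCLUDE_PATTERNS = [r"^\/en\/my\-hometown\/92\-winston.*"]


def passes_includes(uri, patterns):
    """Keep a hit only if every Include pattern matches (AND logic)."""
    return all(re.search(p, uri) is not None for p in patterns)


print(passes_includes("/en/my-hometown/92-winston/backyard", INCLUDE_PATTERNS))

# A second, unrelated Include filter (hypothetical) removes the same hit,
# because both patterns would have to match the same URI.
two_filters = INCLUDE_PATTERNS + [r"^/fr/"]
print(passes_includes("/en/my-hometown/92-winston/backyard", two_filters))
```

Note that the pattern is a prefix match, so a URI such as /en/my-hometown/92-winstonville would also pass.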

Hierarchical URLs in Django

Is there a way to implement a hierarchical query pattern in Django? As far as I know, the framework only allows routing to views by parsing URLs of a specific format, like:
/customers/{order} -> customer.views.show_orders(order)
But what if I need something like this:
/book1/chapter1/section1/paragraph1/note5 -> notes.view.show(note_id)
where note_id is the id of the last part of the URL, but the URL could have different number of components:
/book1/chapter1
/book1/chapter1/section1
etc.
Each time, it would point to the relevant part of the book, depending on the depth of the URL. Is this doable?
I know there is this: https://github.com/MrKesn/django-mptt-urls, but I am wondering if there is another solution. This isn't ideal for me.
Django URLs are just regular expressions, so the simplest way would be to just ignore everything prior to the "note" section of the URL. For example:
url(r'^.*/note(?P<note_id>[0-9]+)$', 'notes.view.show'),
However, this would ignore the book, chapter, and paragraph components, which would mean your notes need ids that are unique across the whole system, not just within a book. If you needed to capture any number of the interim parts, it would be more complicated.
I can't confirm this will work right now, but using non-capture groups in regular expressions, you should be able to capture an optional book and chapter like so:
url(r'^(?:book(?P<book_id>[0-9]+)/)?(?:chapter(?P<chapter_id>[0-9]+)/)?note(?P<note_id>[0-9]+)$', 'notes.view.show'),
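The optional-group pattern can be checked with Python's re module without starting a Django project. A minimal sketch follows, assuming the pattern is matched against the path without its leading slash (as Django does for regex URL patterns); the helper name parse is hypothetical.

```python
import re

# Same regular expression as in the url() call above, split for readability.
NOTE_URL = re.compile(
    r"^(?:book(?P<book_id>[0-9]+)/)?"
    r"(?:chapter(?P<chapter_id>[0-9]+)/)?"
    r"note(?P<note_id>[0-9]+)$"
)


def parse(path):
    """Return the named groups for a matching path, or None otherwise."""
    match = NOTE_URL.match(path)
    return match.groupdict() if match else None


print(parse("book1/chapter2/note5"))
print(parse("book1/note5"))
print(parse("note5"))
```

Groups that do not take part in the match come back as None, so the view would need defaults for book_id and chapter_id.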
Use named groups to accomplish this: https://docs.djangoproject.com/en/dev/topics/http/urls/#named-groups
url(r'^book(?P<book_id>\d+)/chapter(?P<chapter_id>\d+)/section(?P<section_id>\d+)/paragraph(?P<paragraph_id>\d+)/note(?P<note_id>\d+)$', notes.view.show),
For those who really need a variable-depth URL structure and need the URL to consist strictly of slugs, not IDs, knowing all the components of the URL is critical to retrieve the correct record from the database. Then, the only solution I can think of is using:
url(r'^.*/$', notes.views.show, name='show')
and then parsing the content of the URL to get the individual components after retrieving the URL in the view using the request.path call. This doesn't sound ideal, but it is a way to accomplish it.
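The parsing step could look roughly like the sketch below. It is plain Python so it runs without Django; inside a view the argument would be request.path. The level names in LEVELS and the function names are assumptions made for illustration, based on the book/chapter/section/paragraph/note example in the question.

```python
# Hypothetical names for each depth of the hierarchy, outermost first.
LEVELS = ["book", "chapter", "section", "paragraph", "note"]


def split_slugs(path):
    """Split a URL path into its non-empty components."""
    return [part for part in path.split("/") if part]


def describe(path):
    """Map each slug in the path to the level it occupies."""
    slugs = split_slugs(path)
    if len(slugs) > len(LEVELS):
        raise ValueError("path is deeper than the known hierarchy")
    return dict(zip(LEVELS, slugs))


print(describe("/book1/chapter1/"))
print(describe("/book1/chapter1/section1/paragraph1/note5/"))
```

The last key in the returned mapping tells the view which kind of record to look up, and the earlier keys identify its parents.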

Regular expression for URL filter that makes it hard to read the filtered URL

I want to add a URL filter, and I would like the filtered URL to be hard to decipher.
For example, .*porn\.* could perhaps be written so that it uses the ASCII codes of the letters in hex form.
Of course, that example is obvious and I will definitely leave that one as it is ;)
But for the others I would like them to be hard to read!
Thx!
You can use the $_GET superglobal in PHP to pull an ID out of the URL and display the page that way, similar to YouTube with its "watch?v=". I recently did one using "?id=49" (I only have a few pages at the moment; I will have about 70 soon). What I did is use a database with a song_id to index the information. I use the same basic layout, but you can use the ID to access information wrapped in PHP so that it doesn't get sent to the browser but will still display the page you want.
Or, if you really want it to look crazy, you could have the database hash the ID using the SHA() or MD5() function.
Your URL will then look like /page.php?id=21a57f2fe765e1ae4a8bf15d73fc1bf2a533f547f2343d12a499d9c0592044d4.
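The hashing idea can be sketched in Python with the standard hashlib module. This is only an illustration of the technique, not the PHP/database setup described above: the function name opaque_id is made up, and SHA-256 is chosen here as an assumption. A hash is one-way, so the application would have to store the digest next to each record to look it up again.

```python
import hashlib


def opaque_id(song_id):
    """Return a hex digest that can stand in for a numeric ID in a URL."""
    return hashlib.sha256(str(song_id).encode("utf-8")).hexdigest()


# Build an obscured URL for the hypothetical record with song_id 49.
print("/page.php?id=" + opaque_id(49))
```

A SHA-256 hex digest is always 64 characters long, regardless of the input.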

Google Analytics Reg Ex isn't tracking goal URL

Alright, I've played around with this for over a week now and I can't get it to work. Using a regular expression match:
My GOAL URL:
category=thanks
This is not tracking correctly
My only goal step:
/s.nl\?c=1025622&n=5&sc=[0-9]+&ext=T&add=[0-9]&whence=
This is tracking correctly but it is saying everybody exits on this step and does not go to my goal URL
Upon looking at pages that contain category=thanks, I found the following tracked URLs
/s.nl?c=1025622&sc=44&category=thanks&whence=&n=5
/s.nl?c=1025622&sc=44&category=thanks&n=5
/s.nl?c=1025622&sc=44&category=thanks&whence=&n=5&redirect_count=1&did_javascript_redirect=T
/s.nl?c=1025622&n=5&sc=44&category=thanks&it=A&login=T
along with a bunch of others containing category=thanks. As I obviously can't compensate for all these changing URLs, I figured just having "category=thanks" would work, but apparently not?
Your category=thanks statement is not a regular expression that matches the URLs you mentioned. It would apply for filters and segments where there's a 'URI contains' option, but not a full match.
I believe you have 2 options:
1. Go to the Content report in GA, choose a larger time period (like a year) and add a category=thanks filter. You'll get all possible goal URLs, and you'll have to write a regular expression that covers them. From what you describe, your URL structure is a mess (too many parameters), so take a look at option 2.
2. Add a small script to your goal page that redirects your visitors to a page with a clean URL. Something like a conditional statement saying if (URL contains category=thanks) {redirect to /thankyou.html}, then use this thankyou.html as a goal in GA.
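For the first option, a candidate expression can be tested against the tracked URLs before it is entered as a goal. The following is a minimal Python sketch; the pattern in GOAL is one possible choice rather than a verified GA setting, and the name is_goal is hypothetical. The URLs are the ones listed in the question.

```python
import re

# Candidate goal pattern: any /s.nl request whose query string
# contains category=thanks, wherever the parameter appears.
GOAL = re.compile(r"^/s\.nl\?.*category=thanks")

# Tracked URLs copied from the question.
TRACKED = [
    "/s.nl?c=1025622&sc=44&category=thanks&whence=&n=5",
    "/s.nl?c=1025622&sc=44&category=thanks&n=5",
    "/s.nl?c=1025622&sc=44&category=thanks&whence=&n=5"
    "&redirect_count=1&did_javascript_redirect=T",
    "/s.nl?c=1025622&n=5&sc=44&category=thanks&it=A&login=T",
]


def is_goal(uri):
    """Return True if the URI matches the candidate goal pattern."""
    return GOAL.search(uri) is not None


for uri in TRACKED:
    print(is_goal(uri), uri)
```

Adding the URLs from the Content report to TRACKED shows quickly whether any variation falls outside the pattern.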