Google Analytics search and replace according to a list

Scenario:
In Google Analytics, I noticed that it is possible to replace certain URI parameters with words of your choice by using a search-and-replace filter, as in the example below.
e.g. www.example.com/abc/product_id=3 -----> www.example.com/abc/product_name=shampoo
Problems:
I currently have a list of over 1,000 products. Instead of creating 1,000 search-and-replace filters, what would be the most efficient and maintainable way to solve this problem?
I've done some digging and noticed that custom dimensions could be a solution; however, that would require me to modify the JS code on the FTP server, which I don't have permission to access. What other options do I have?
If it is not possible to show it here, is there any kind of tutorial that I could follow?
I really appreciate the help. Many thanks!

This is not a complete answer, but it's certainly more than a comment.
Besides the tedium of writing this out by hand, I can think of two options available to you.
Firstly, you could use the Google Analytics Management API (https://developers.google.com/analytics/devguides/config/mgmt/v3/). By constructing a set of commands, you could quickly iterate through your list and create the required 1,000 search and replace filters.
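For illustration, the bulk-creation loop might look roughly like this in Python (a sketch only: it assumes an authorized googleapiclient service object for the Management API v3, and the account ID and product list are placeholders):

# Sketch: bulk-create search-and-replace filters via the GA Management API v3.
# 'service' is assumed to be an authorized googleapiclient analytics service.
products = {'3': 'shampoo', '4': 'conditioner'}  # placeholder for your 1,000 products

for product_id, product_name in products.items():
    service.management().filters().insert(
        accountId='123456',  # placeholder account ID
        body={
            'name': 'Rewrite product %s' % product_id,
            'type': 'SEARCH_AND_REPLACE',
            'searchAndReplaceDetails': {
                'field': 'PAGE_REQUEST_URI',
                'searchString': 'product_id=%s' % product_id,
                'replaceString': 'product_name=%s' % product_name,
            },
        },
    ).execute()

Note that each filter would still have to be linked to the relevant view afterwards, via the profileFilterLinks resource.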
Secondly, if you were to use Google Tag Manager you would be able to create a Custom JavaScript Variable that takes the page path and compares it to your list. This variable could then replace the Page field before the hit data is sent to Google Analytics. This may sound more complicated, but it would allow you to pull your solution out of Google Analytics and into the flexible world of JavaScript.

Note that if you rewrite the product_id to a product_name once, you will have to maintain that cross-reference every day and keep it in sync with what appears on the website; make sure you have an automated solution, or it will quickly get out of sync and become more of a mess than before.
An alternative is to do the search-and-replace on the reporting side.
I know products like Analytics Edge or Analytics Canvas could do this easily, or you could just download the data into Excel or Google Sheets and apply a series of lookup formulas.
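If you go the spreadsheet route, the same join is also easy to script; here is a minimal pandas sketch (the file names, column names, and URL pattern are placeholder assumptions):

import pandas as pd

# Exported GA pages report plus your product list (product_id, product_name).
ga = pd.read_csv('ga_export.csv')          # assumed to contain a 'page' column
lookup = pd.read_csv('products.csv', dtype=str)

# Pull the product_id out of each page path, then join the names on.
ga['product_id'] = ga['page'].str.extract(r'product_id=(\d+)', expand=False)
report = ga.merge(lookup, on='product_id', how='left')
report.to_csv('report.csv', index=False)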

Related

How can I list all licenses for a single user with Google's Enterprise License Manager API?

So far, all the other API calls I've worked with in the Admin SDK seem to accept the parameters needed to filter for exactly what I do and don't need. With this one, however, it seems I need to dump all licenses and then search through the results for the one I need. I also have to keep track of the pageToken and make multiple calls, as I won't get all the results in the first response.
I would really like to add userId= to the call, but I don't see the option. Has anyone else found a workaround? Is there a better approach to this?
results = licservice.licenseAssignments().listForProduct(productId='Google-Apps', customerId='admarketplace.com', maxRecords=1000).execute()
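For reference, the pagination loop looks roughly like this (a hedged sketch; parameter names follow the Licensing API v1 as I understand it, where the list parameter is maxResults, and the per-user filtering happens client-side):

# Sketch: page through every license assignment for a product and keep
# only the rows for one user. 'licservice' is an authorized Licensing
# API v1 service object, as in the question above.
def assignments_for_user(licservice, product_id, customer_id, user_id):
    matches = []
    kwargs = {'productId': product_id, 'customerId': customer_id,
              'maxResults': 1000}
    while True:
        response = licservice.licenseAssignments().listForProduct(**kwargs).execute()
        matches += [a for a in response.get('items', [])
                    if a.get('userId') == user_id]
        token = response.get('nextPageToken')
        if not token:
            return matches
        kwargs['pageToken'] = token

(If you already know the SKU, I believe licenseAssignments().get() accepts productId, skuId and userId directly, which would avoid the dump-and-filter approach entirely.)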

Facebook style like system in modx cms (php)

I'm trying to build a simple like system in MODX (which uses PHP snippets of code). I just need a button that logged-in users can press, which adds a 'like' to a resource.
Would it be best to update a custom table or a TV? My thought is that if it is a template variable, I can use getResources to sort by the number of likes.
Any thoughts on the best way to approach or build this would help. My PHP knowledge is limited.
It depends on how you are going to use it afterwards, and whether you are storing more data than just a 'like' count. TVs are expensive on resources (even more so if you are going to whip through the entire resource set with getResources), so if you are going to do a lot of processing after the fact, I would either look at a custom table or explore using property sets on your pages (I think it should be pretty easy to write a plugin that updates a page property).
I'd definitely go for a custom table.
While you could simply increment a numeric TV to count the number of likes, you would end up in a situation where anyone could keep on liking a resource without limit; while you didn't specify the exact concept, that can hardly be desired. Using a custom table, you could throw in a relational alias to the user ID that liked the resource, add a timestamp so you know when it happened, and let your imagination run wild on the additional features that are now open to you.
While not a hard requirement for custom tables, you will probably want to take the time to learn xPDO, which is the database abstraction layer MODX is based on. There's a great tutorial on the RTFM which walks you through it.

Sitecore: return "Popular searches" while using Lucene search?

I have a request to return a list of the most popular search terms used when searching a Sitecore site.
I have no idea how to implement this sort of function using Sitecore, or whether Sitecore has this kind of functionality already. I can't find any documentation detailing this.
I am currently using search based on the LuceneSearch module (http://trac.sitecore.net/LuceneSearch), but altered to bind to a ListView for easy pagination.
At the moment I am probably just going to build a standalone function/class to update an XML file or something, unless someone is able to point me in the right direction...?
Frankly, I would use OMS for that; this is what it is designed to do. There is no need for a separate database: just register the search events with OMS via the API. There is an out-of-the-box Search report. It may require some tweaking, but this seems to be the most out-of-the-box solution.
Take a look here for more details.
I don't know of any standard functionality in Sitecore that would help you achieve this, so you will probably have to approach this from the ground up - unless someone else in here is able to point to a package deal somewhere :-)
Solving this really breaks down into two tasks:
1) Collecting search term information. Whenever a user enters a search term in the search box (which I assume you have), normalise it and store it in a SQL table (essentially a [term] [count] type table). Update the counter on terms you already store.
By normalising, I mean lowercasing it and so on, possibly breaking each search term (word) down and storing the words one for one if that is what your solution calls for (probably not the route I would go).
2) Retrieving information from the table in real time, based on what the user is typing in the search box. I assume you want some sort of Amazon-like autocompletion, as found on almost all major search engines nowadays. I normally implement these in a web service that then gets called via Ajax, jQuery or whatever rich-client implementation you prefer.
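To make both steps concrete, here is a minimal sketch in Python with SQLite (the table and column names are my own invention; in a Sitecore solution the same logic would sit behind a C# service, but the shape is identical):

import sqlite3

conn = sqlite3.connect('search_terms.db')
conn.execute('''CREATE TABLE IF NOT EXISTS search_terms (
                    term  TEXT PRIMARY KEY,
                    count INTEGER NOT NULL DEFAULT 1)''')

def record_search(term):
    # Step 1: normalise, then insert the term or bump its counter.
    normalised = term.strip().lower()
    conn.execute('''INSERT INTO search_terms (term, count) VALUES (?, 1)
                    ON CONFLICT(term) DO UPDATE SET count = count + 1''',
                 (normalised,))
    conn.commit()

def popular_terms(prefix='', limit=10):
    # Step 2: most popular terms, optionally filtered by a typed prefix.
    rows = conn.execute('''SELECT term FROM search_terms WHERE term LIKE ?
                           ORDER BY count DESC LIMIT ?''',
                        (prefix + '%', limit))
    return [row[0] for row in rows]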
As for updating an XML file, I think locking issues and performance would kill that solution, though it could perhaps be made to work on a very small scale.
Sorry that I can't be more specific in my response, but your question is very open-ended.
Very interesting question. One thing you could do is have another database to store these search queries. An insert into this DB would not be very difficult and would get around the issue of locking on an XML file. You could insert each search query into a DB table, and then, to get the top results, just pull the top x rows ordered by how often each query occurs. As Mark Cassidy said before, normalize the data before inserting it.
You could isolate this work on your search layout (or sublayout) so it runs on a specific part of the site, not on every page.
Sitecore has an out-of-the-box "site search" report in the Executive Insight dashboard; this will give you an indication of which search terms are driving the most visits and, of course, engagement value.
You just need to configure it by registering a page event on the search page and passing the query; otherwise Sitecore wouldn't know which form field constitutes a search. See this post, which explains it in more detail. For more information, you can download the analytics configuration reference document from SDN: http://sdn.sitecore.net/upload/sitecore6/65/engagement_analytics_configuration_reference_sc65-usletter.pdf
And don't forget that, for performance, Sitecore caches the reports at various levels, so during development it may be handy to know how to force a cache update. I talk about this in the following blog post:
http://andytsitecore.blogspot.co.uk/2013/10/sitecore-dms-and-analytics.html

Retrieve a list of the most popular GET param variations for a given URL?

I'm working on building intelligence around link propagation, and because I need to deal with many short URL services where a reverse-lookup from an exact URL address is required, I need to be able to resolve multiple approximate versions of the same URL.
An example would be a URL like http://www.example.com?ref=affil&hl=en&ct=0
Of course, changing GET params in certain circumstances can refer to a completely different page, especially if the GET params in question refer to a profile or content ID.
But a quick parse of the pages would determine how similar they were to each other. With a bit of machine learning, it could become clear which GET params don't affect the content of the pages returned for a given site.
I'm assuming a service to send a URL and get a list of very similar URLs could only be offered by the likes of Google or Yahoo (or Twitter), but they don't seem to offer this feature, and I haven't found any other services that do.
If you know of any services that do cluster together groups of almost identical URLs in the aforementioned way, please let me know.
My bounty is a hug.
Every URL is akin to an "address" for a location of data on the internet. The "host" part of the URL (in your example, "www.example.com") is a web server, or a set of web servers somewhere in the world. If we think of a URL as an "address", then the host could be a "country".
The country itself might keep track of every piece of mail that enters it. Some do, some don't. I'm talking about web servers! Of course real countries don't make note of every piece of mail you get! :-)
But even if that "country" keeps track of every piece of mail, I really doubt it has any mechanism in place to send that list to you.
As for organizations that might do that harvesting themselves, I think the best bet would be Google, but even there the situation is rather grim. You see, because Google isn't the owner of every web server ("country") in the world, it cannot know of every URL that accesses that web server.
But they can do the reverse. Since they index every page they encounter, they can get a pretty good idea of every URL that appears in public HTML pages on the web. Of course, this won't include URLs people send to each other in chats, SMSs, or e-mails. But still, they can get a pretty good idea of what URLs exist.
I guess what I'm trying to say is that what you're looking for doesn't really exist. The only way you can get all the URLs used to access a single website is to be the owner of that website.
Sorry, mate.
It sounds like you need to create some sort of discrete similarity rank between pages. This could be done by finding the number of words two pages share, normalizing the value to a bounded range, and then mapping portions of the range to different similarity ranks.
You would also need to know, for each pair that you compare, which GET parameters they had in common or how close they were. This information would become the attributes that define each of your instances (stored alongside the rank mentioned above). After you have amassed a few hundred pairs of comparisons, you could perhaps do some feature subset selection to identify the GET parameters that best predict how similar two pages are.
Of course, this could end up not finding anything useful at all, as this dataset is likely to contain a great deal of noise.
If you are interested in this approach, you should look into Infogain (information gain) and feature subset selection in general. Here is a link to my professor's lecture notes, which may come in handy: http://stuff.ttoy.net/cs591o/FSS.html
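As a rough sketch of that similarity rank (the tokenisation, thresholds, and rank labels below are arbitrary placeholder choices):

import re

def words(page_html):
    # Crude word extraction from a fetched page body.
    return set(re.findall(r'[a-z0-9]+', page_html.lower()))

def similarity(page_a, page_b):
    # Jaccard similarity of the two pages' word sets, bounded to [0, 1].
    a, b = words(page_a), words(page_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

def rank(score):
    # Map the bounded score onto discrete similarity ranks.
    if score > 0.9:
        return 'near-duplicate'  # the differing GET params likely don't matter
    if score > 0.5:
        return 'related'
    return 'different'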

Connecting to IMDB

Has anyone done this before? It would seem to me that there should be a web service, but I can't find one. I am writing an application for personal use that would just show basic info from IMDb.
The libraries for IMDb seem quite unreliable at present, and highly inefficient. I really wish IMDb would just create a web service.
After a bit of searching, I found a reasonable alternative to IMDb that provides all the basic information such as overview, year, ratings, posters, trailers, etc.:
The Movie Database (TMDb).
It provides a web service with wrappers for several languages and seems reliable so far. The search results have also been more accurate for me.
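For instance, a minimal search against TMDb's v3 API might look like this (a sketch only: you need your own API key from themoviedb.org, and the endpoint and field names are from the v3 documentation as I recall it):

import requests

API_KEY = 'your-tmdb-api-key'  # placeholder; register at themoviedb.org
resp = requests.get('https://api.themoviedb.org/3/search/movie',
                    params={'api_key': API_KEY, 'query': 'The Matrix'})
for movie in resp.json().get('results', []):
    print(movie['title'], movie.get('release_date', ''))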
There is no web service available.
But there are enough HTML scrapers written in every language to suit your needs!
I've used the .NET 3.5 IMDb Services open-source project in a few personal projects.
One minute of Google results:
Perl: IMDB-Film
Ruby: libimdb-ruby
Python: IMDbPY
The only "API" the IMDb publishes is a set of plain-text data files containing formatted lists of actors, directors, movies, etc. You would likely need to write your own parser unless somebody has released one for your language. Try Google searches like "imdb api" and "imdb parser".
A screen scraper might be useful, but they specifically prohibit scrapers in their terms of use.
Though this was posted over two years ago, here is a simple python code
import urllib2

# Python 2: prompt for an IMDb title ID (e.g. tt0120737) and fetch the
# record from imdbapi.com as JSON.
movie_id = raw_input('Enter the ID of the movie: ')
response = urllib2.urlopen('http://imdbapi.com/?i=' + movie_id + '&r=json')
print response.read()
Save it as imdb.py and then run it in a shell or terminal, or wherever (note that this is Python 2 code: raw_input and urllib2).
If you want XML data, just replace json with xml in the r= parameter.
Please note that this uses the imdbapi.com website to return a JSON result; visit that website to see more options.
Here is my own solution using RegEx:
// Captures section headings (Cast, Director:, etc.), genre/name/character
// links, and heading text from IMDb's old HTML page markup.
private const string UglyMovieRegex = "(?<=5>|3>)(Cast|Director:|Fun\\sStuff|Genre:|Plot:|Runtime:|Tagline:|Writers:)"
    + "|href=\"[\\w\\d/]+?(Genres|name|character)/([\\w]+?)/\".*?>([.\\-\\s\\w]+)</a>"
    + "|(?<=h\\d>)([.\\w\\s'\\-\"]+)(?=<a\\sc|</d|\\|)";
Regex MovieData = new Regex(UglyMovieRegex, RegexOptions.Compiled | RegexOptions.Multiline | RegexOptions.Singleline);
IMDb prohibits scrapers and changes the page layout every once in a while, so parsing HTML is an option, but be prepared to adjust your code two or three times a year (been there, done that, given up). They do have a fee-based service giving full access to the data, but you'll also need to explain what it is for and convince them you are not building a competing website (I had a link to that, but it seems to have changed and I can't find it now).
Another alternative is to run the IMDb database on your local machine. Java Movie Database imports the IMDb database files, converts them, and provides a locally accessible copy of IMDb. IMDb has some functionality that Java Movie Database does not have and vice versa, but if what you're looking for is quick access to all the data, it might be worth giving this a try.
Now there is an (undocumented) API, e.g. http://www.imdb.com/xml/find?json=1&q=Harry+Potter. See "Does IMDB provide an API?"
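A quick hedged example of calling that endpoint (it is undocumented, so it may change or disappear at any time; the URL is exactly the one above):

import requests

# Query IMDb's undocumented find endpoint and print the raw JSON result.
resp = requests.get('http://www.imdb.com/xml/find',
                    params={'json': 1, 'q': 'Harry Potter'})
print(resp.json())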
TRYNT Heavy Technologies provides (for free) a web service for retrieving basic IMDb data -- check out their site at http://www.trynt.com/trynt-movie-imdb-api/. They also have a separate service for Television data.
There is at least one unofficial IMDb API called IMDb8. It has about 31 endpoints, including:
actors/list-born-today
actors/get-awards-summary
title/get-plots
title/get-top-crew
etc. Like any other API, it is very straightforward to use. I used this API to build a fun trivia project. You can find a tutorial on how to get started here.