Google Analytics URL Percent-encoding Issue with Cross Domain Cookie - regex

Google Analytics is not decoding a double percent-encoded URL, so I cannot track users across my "source" domain and my "destination" domain. I'm using Google Tag Manager with the new Universal Analytics.
Is there a macro or rule I can create in Google Tag Manager to help Google Analytics interpret %2526 (for &) and %253d (for =) correctly? Any help with this issue would be appreciated.
Here is an example URL (not real):
http://subdomain.example.com/adfs/ls/?wa=wsignin1.0&wtrealm=https%3a%2f%2fsub.domain.com%2fwebsite%2f&wctx=rm%3d0%26id%3dpassive%26ru%3d%252fwebsite%252fsite%252fexample%252f%253fstuff%253dtypeofuser%2526_ga%253d1.244536837.1471787898.1397850931&wct=2014-04-18T20%3a14%3a54Z
As you can see, the tail end of the URL contains my _ga cookie value, which originated on my "source" domain and is being passed to my "destination" domain. This is a good thing; however, GA is not able to read it because of the percent-encoding shown below:
%2526_ga%253d1.244536837.1471787898.1397850931
%2526 is the double-encoded form of & (it decodes to %26, then to &)
%253d is the double-encoded form of = (it decodes to %3d, then to =)
Since Google Analytics is not able to decode %2526 and %253d, it writes a brand new cookie instead, as I can see in the Firebug > Cookies tab when debugging.
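To make the double encoding concrete, here is a minimal sketch using the fragment from the example URL. The helper name decodeTimes is hypothetical and is not part of Google Analytics or Google Tag Manager; it only shows that the value needs two decoding passes before & and = appear.

```javascript
// Value copied from the example URL in the question.
const encoded = '%2526_ga%253d1.244536837.1471787898.1397850931';

// Hypothetical helper: apply decodeURIComponent a fixed number of times.
function decodeTimes(value, times) {
  let result = value;
  for (let i = 0; i < times; i += 1) {
    result = decodeURIComponent(result);
  }
  return result;
}

// One pass only turns %25 into %, leaving %26 and %3d behind.
const once = decodeTimes(encoded, 1);
// A second pass yields the literal & and = characters.
const twice = decodeTimes(encoded, 2);
```

After one pass the string still contains %26 and %3d, which would explain why a reader that decodes only once never sees an _ga= parameter.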

The solution I found is to append the cookie value to the URL again on page load so that Google Analytics can read it.
The regex passed to .match can be customized for the URLs you need to filter.
// Capture the double-encoded _ga value from the query string
var gacookie = window.location.search.match('_ga%253d(.+)&wct=');
var url = window.location.href;
// Only proceed when the pattern matched; otherwise gacookie is null
if (gacookie) {
  url += '&_ga=' + gacookie[1];
  parent.location.hash = url;
  var hash = location.hash.replace('#', gacookie[1]);
  if (hash != '') {
    // Expose the value as &_ga=... in the URL fragment
    location.hash = '&_ga=' + gacookie[1];
  }
}
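The extraction step can also be isolated in a small pure function, which makes it easier to test outside the browser. This is only a sketch: extractGaValue is a hypothetical name, the query string is a shortened version of the example URL from the question, and the pattern assumes the value ends at the next & or % character.

```javascript
// Hypothetical helper: return the _ga value from a double-encoded
// query string, or null when the parameter is not present.
function extractGaValue(search) {
  // %253d is the double-encoded "="; the value runs until the next "&" or "%".
  const match = /_ga%253d([^&%]+)/i.exec(search);
  return match ? match[1] : null;
}

// Shortened query string based on the example URL in the question.
const search =
  '?wa=wsignin1.0&wctx=rm%3d0%26ru%3d%252fwebsite%252f%253fstuff%253dtypeofuser' +
  '%2526_ga%253d1.244536837.1471787898.1397850931&wct=2014-04-18T20%3a14%3a54Z';

const gaValue = extractGaValue(search);
// Fragment that could then be assigned to location.hash in a browser.
const fragment = gaValue === null ? '' : '&_ga=' + gaValue;
```

Returning null instead of throwing keeps the page working on URLs that carry no _ga parameter at all.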

Unable to refresh a dynamic data source in the Power BI service when using anonymous web API access?

This relates to my earlier question, "How to iterate/loop through next pages in an API request in PowerQuery/PowerBI?", which was resolved using the code below:
//Declare base variables
let
BaseURL = "https://api.aaaaaa.com",
Entity = "/api/v1/user?&limit=1000",
Token = "zzzzzzzzzzzzzzzzzzzzzzzzzzzz",
Options = [Headers=[APITOKEN=Token]],
URL = BaseURL & Entity,
//Define a function that would take step/page as parameter and return results
GetData=(page as number) =>
let
Source = Json.Document(Web.Contents(URL & "&step=" & Number.ToText(page), Options)),
Data = try Source[results] otherwise null
in
Data,
//Iterate over GetData () to return all the records until last page i.e. until no "result" is retrieved from the API call
GeneratePageList =
List.Generate( ()=>
[Result = try GetData(1) otherwise null, Page=1],
each [Result] <> null,
each [Result = try GetData([Page]+1) otherwise null, Page=[Page]+1],
each [Result]
)
in
GeneratePageList
However, once this code is published to the Power BI service, we cannot schedule a refresh for it, because it gives the following error:
This dataset includes a dynamic data source. Since dynamic data sources aren't refreshed in the Power BI service, this dataset won't be refreshed. Learn more: https://aka.ms/dynamic-data-sources.
• Data source for Query1
I tried the RelativePath and Query options as suggested here: https://blog.crossjoin.co.uk/2016/08/16/using-the-relativepath-and-query-options-with-web-contents-in-power-query-and-power-bi-m-code/
and here: https://blog.crossjoin.co.uk/2019/04/25/skip-test-connection-power-bi-refresh-failures/
But without any luck. Here is how I am using them:
let
BaseURL = "https://api.crewhu.com",
Entity = "/api/v1/user?&limit=1000&step=",
Token = "60afbdaf5d7d584762771f36",
Options = [Headers=[X_CREWHU_APITOKEN=Token]],
URL = BaseURL & Entity,
//Define a function that would take step/page as parameter and return results
GetData=(page as number) =>
let
Source = Json.Document(Web.Contents(BaseURL & [RelativePath = Entity, Query=[page]], Options)),
The BaseURL is reachable but redirects to the login page, where our admin credentials (username and password) for the vendor site work well. However, the same credentials do not work with the "Basic" connection method when accessing Web content. I therefore tried adding #Authorization = Basic to the headers along with the API key, like [Headers=[Authorization = Basic, X_CREWHU_APITOKEN=Token]], but this didn't work either.
We only have an Open API token/key from the vendor, but even that token/key doesn't work when provided in the "Web API" section while connecting to Web content. It gives the error "a web api key can only be specified when a web api key name is provided". The same key/token works well from within Power Query (M) code using the anonymous web API call method.
I have tried multiple combinations of providing the key/token in the username/password fields, as suggested on some sites, but still no luck.

How to avoid user having to re-authorize Evernote every time?

I'm building a Python web app with the Evernote API. When users log in they're redirected to a page on the Evernote site to authorize the application. When they come back everything works fine (can see and edit notes etc.)
The challenge now is to avoid having to redirect the user to the Evernote site every time they log on.
I read on the Evernote forums that I need to save the access token and the notestore URL to achieve this. I now save these to the users' accounts after the first successful authorization.
But how do I use the access token and notestore url to authorize?
I found this sample code on the Evernote website that's supposed to achieve this, but it's in Java and I can't seem to make it work in Python.
// Retrieved during authentication:
String authToken = ...
String noteStoreUrl = ...
String userAgent = myCompanyName + " " + myAppName + "/" + myAppVersion;
THttpClient noteStoreTrans = new THttpClient(noteStoreUrl);
noteStoreTrans.setCustomHeader("User-Agent", userAgent);
TBinaryProtocol noteStoreProt = new TBinaryProtocol(noteStoreTrans);
NoteStore.Client noteStore = new NoteStore.Client(noteStoreProt, noteStoreProt);
Basically, if you got the notestore url and access token from a previous authorization, how do you use them to re-authorize?
If you have the access token, you will use that as a constructor argument for the EvernoteClient class.
For example:
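from evernote.api.client import EvernoteClient
# your_access_token is the token saved after the first authorization
client = EvernoteClient(token=your_access_token)
note_store = client.get_note_store()
notebooks = note_store.listNotebooks()
for n in notebooks:
    print(n.name)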
For more examples, check out the Python Quick-start Guide.

Google Analytics missing __utmz cookie

I have universal analytics installed on my website, and want to parse the __utmz cookie to get referral info. However, I never see this cookie set.
Has something changed? Any reason this isn't set?
I do see the _ga cookie when I browse my site, and I see the __utmz cookie in my browser cache if I go to other sites.
I checked out the docs, and don't see any reference to this changing recently, so a bit stumped.
Universal Analytics doesn't create any __utm* cookies.
However, you can use Universal Analytics code (analytics.js) AND the traditional code (ga.js) simultaneously on your site. This will allow you to populate your UA profile and scrape the values from __utmz.
It seems like with Universal Analytics, this cookie has disappeared, and you only get a single _ga cookie.
Source: https://developers.google.com/analytics/devguides/collection/analyticsjs/cookie-usage
Also mentioned here: How to get the referrer, paid/natural and keywords for the current visitor in PHP with new Google Analytics?
Also, given that Analytics is primarily a tool for collecting aggregated information, I couldn't find (and I doubt there is) any way to query GA to get this info back from the _ga cookie.
You can create your own cookie and store the query string parameters that Google Analytics uses (utm_campaign, etc.).
See this project for an example:
https://github.com/dm-guy/utm-alternative
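As a rough illustration of the "store the parameters in your own cookie" idea, here is a minimal sketch. It is not taken from the linked project: the helper names, the cookie name campaign_info, and the 180-day lifetime are all assumptions chosen for the example.

```javascript
// Hypothetical helper: collect utm_* parameters from a query string.
function collectUtmParams(search) {
  const params = new URLSearchParams(search);
  const utm = {};
  for (const [key, value] of params) {
    if (key.startsWith('utm_')) {
      utm[key] = value;
    }
  }
  return utm;
}

// Hypothetical helper: serialize the collected values into a cookie string.
// The cookie name and the default lifetime are assumptions.
function buildCampaignCookie(utm, maxAgeDays = 180) {
  const value = encodeURIComponent(JSON.stringify(utm));
  return `campaign_info=${value}; path=/; max-age=${maxAgeDays * 86400}`;
}

const utm = collectUtmParams(
  '?utm_source=newsletter&utm_medium=email&utm_campaign=spring&page=2'
);
const cookieString = buildCampaignCookie(utm);
// In a browser, the cookie would be written with: document.cookie = cookieString;
```

On the server side, the same cookie can then be read and JSON-decoded to recover the campaign values that __utmz used to provide.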
Use the code below, alongside your Universal Analytics JS code, to get the __utmz cookie:
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>

what are the values in _ga cookie?

I am using Universal Analytics, which creates a first-party cookie named _ga:
_ga=1.2.286403989.1366364567;
286403989 is clientId
1366364567 is timestamp
What are the 1 and 2 in the _ga cookie?
_ga=1.2.286403989.1366364567;
1st Field
This is a version number, in case the cookie format changes in the future. It seems to be fixed at 1 at the moment. The cookie above uses an old format; newer cookies have this value set to "GA1".
2nd Field
This field is used to pick the correct cookie when multiple cookies are set on different paths or domains.
By default, the cookie is set at path / and at the domain in document.location.hostname (with the www. prefix removed).
You could have a _ga cookie set at sub.example.com and another set at example.com. Because of the way the browser cookie API works, there is no way to tell which cookie is the correct one to use.
So the second number is the number of dot-separated components in the domain.
For sub.example.com the number would be 3.
For example.com the number would be 2.
The path defaults to / but you can change it by passing the cookiePath option to the ga create command. If you pass it, this field becomes two numbers separated by a dash, and the second number is the number of slashes in the path.
Using these numbers the analytics.js script can correctly identify the cookie to be used in case there are multiple cookies set.
For example:
Imagine that you have a site that lives at sub1.sub2.example.com/folder1. If you want to store the cookie only on your site and not make it visible to other subdomains or folders, you can use the following config:
ga('create', 'UA-XXXX-Y', {
'cookiePath': '/folder1/',
'cookieDomain': 'sub1.sub2.example.com'
});
In this case the cookie will look something like this:
_ga=1.4-2.XXXXXXXX.YYYYYYY
3rd Field
This is a randomly generated user ID, used to identify different users.
4th Field
It's a timestamp of the first time the cookie was set for that user.
new Date(1366364567*1000)
> Fri Apr 19 2013 06:42:47 GMT-0300 (BRT)
This is also used to uniquely identify users in case of userId collisions.
Worth mentioning that a cookie is not an API. In the future it may completely change. Google doesn't recommend reading/writing the _ga cookie directly. You should interact with Google Analytics through one of the tracking libraries such as analytics.js. There's not a lot of use for this information other than curiosity.
If you are reading or writing the cookie directly, you are doing it wrong.
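Purely to illustrate the four fields described above (and with the same caveat that the cookie format is not an API), here is a small sketch that splits a _ga value apart. The function name parseGaCookie and the returned property names are hypothetical.

```javascript
// Hypothetical helper: split a _ga cookie value into the four fields
// described above. Returns null for anything that is not four fields.
function parseGaCookie(value) {
  const parts = value.split('.');
  if (parts.length !== 4) {
    return null;
  }
  const [version, depth, randomId, timestamp] = parts;
  // The second field is "domainDepth" or "domainDepth-pathDepth".
  const [domainDepth, pathDepth] = depth.split('-').map(Number);
  return {
    version,
    domainDepth,
    pathDepth: pathDepth === undefined ? null : pathDepth,
    randomId,
    // The fourth field is a Unix timestamp in seconds.
    firstSeen: new Date(Number(timestamp) * 1000),
  };
}

const parsed = parseGaCookie('1.2.286403989.1366364567');
const withPath = parseGaCookie('GA1.4-2.111111111.1366364567');
```

For the cookie from the question, domainDepth is 2 and pathDepth is null, while the cookiePath variant yields a domainDepth of 4 and a pathDepth of 2.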
I think this would be helpful.
/**
 * Get Google Analytics UID
 * @return int
 */
public function getGAUID() {
    $uid = 0;
    if (!empty($_COOKIE['__utma'])) {
        // Classic ga.js cookie: domain hash, user ID, then visit timestamps and visit count
        list($hash_domain, $uid, $first_visit, $prev_visit, $time_start, $num_visits) = sscanf($_COOKIE['__utma'], '%d.%d.%d.%d.%d.%d');
    } elseif (!empty($_COOKIE['_ga'])) {
        // Universal Analytics cookie, e.g. GA1.2.494614159.1574329064
        list($c_format, $c_domain, $uid, $first_visit) = sscanf($_COOKIE['_ga'], 'GA%d.%d.%d.%d');
    }
    return $uid;
}
Written in NodeJS with ES6 Syntax. Might help someone?
// Example: GA1.2.494614159.1574329064
const gaCookieGeneration = ({ version = 1, domain, rootpath = '/' }) => {
  // Count the dot-separated labels in the domain, ignoring a leading dot
  const subdomains = domain.split('.').filter(Boolean).length;
  // Count every slash in the path (the g flag is needed to match them all)
  const rootpathDirs = (rootpath.match(/\//g) || []).length;
  const cookiePath = rootpathDirs > 1 ? `-${rootpathDirs}` : '';
  const uniqueId = Math.random().toString().slice(2, 11);
  const timeStamp = Date.now().toString().slice(0, 10);
  return `GA${version}.${subdomains}${cookiePath}.${uniqueId}.${timeStamp}`;
};
const gaCookie = gaCookieGeneration({
  domain: '.example.com',
});

WSO2 ESB Proxy stop replacing %26 with &

I need to send these parameters to a domain:
domain/page?param1=xxx&param2=yyy%26zzz
I am using a proxy in WSO2 for the domain:
localhost:8280/services/proxyfordomain/page?param1=xxx&param2=yyy%26zzz
The endpoint of proxyfordomain is domain.
The proxy is replacing %26 with &.
The URL I expect to see logged in the console is:
To domain/page?param1=xxx&param2=yyy%26zzz
But the URL actually logged in the console is:
To domain/page?param1=xxx&param2=yyy&zzz
Here param2 should carry its value in the form yyy%26zzz, not yyy&zzz.
How can I stop WSO2 from replacing it?
Thanks for spending your valuable time
You can use the following script mediator configuration to replace 'yyy&zzz' with 'yyy%26zzz'.
<script language="js">var url = mc.getTo().toString();
var newURL = url.replace("yyy&zzz","yyy%26zzz");
mc.setTo(newURL);</script>
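To see why the encoding matters to the receiving endpoint, here is a small standalone sketch in plain JavaScript (it runs in Node or a browser, not inside the ESB, and is not WSO2 API code). It parses the two query strings from the question and shows that the decoded ampersand splits param2 into two separate parameters.

```javascript
// Query strings copied from the question: encoded vs. decoded ampersand.
const expected = new URLSearchParams('param1=xxx&param2=yyy%26zzz');
const rewritten = new URLSearchParams('param1=xxx&param2=yyy&zzz');

// With %26 intact, param2 keeps its full value.
const expectedParam2 = expected.get('param2');
// With a literal &, param2 is cut short and "zzz" becomes its own key.
const rewrittenParam2 = rewritten.get('param2');
const strayKey = rewritten.has('zzz');

// encodeURIComponent produces the escaped form the endpoint needs.
const reEncoded = 'param2=' + encodeURIComponent('yyy&zzz');
```

Because the two forms cannot be told apart once the value has been decoded, a hard-coded replacement like the one above only works when the value is known in advance.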