Anomaly between Google Analytics and server hits - web-services

I am using Google Analytics on my single-page HTML5 application. Google Analytics shows 16k visits, but the number of hits in the server log is around 3 lac.
I am using the following tracking code in the head section of my page:
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-xxxxxxxx-x', 'example.com');
ga('send', 'pageview');
</script>
What could be the reason for such an anomaly, and how can I check whether the Google Analytics results are correct (using server logs etc.)?

I assume that by "3 lac" you mean three hundred thousand.
First of all, visits in GA and hits in a server log file are not comparable, since a "visit" is an aggregate number that usually comprises multiple hits.
Secondly, the server log records every request to the server (including requests for assets like images, CSS files, and JS files, as well as error pages). Since a page comprises multiple files (HTML plus assets), every pageview results in multiple hits to the server. The server log also records bots and crawlers.
Google, on the other hand, tracks only the request to the page itself (the part that includes the tracking code) and not the assets, and it will not track (in the standard implementation) user agents that do not execute JavaScript. It also won't track users who have opted out of tracking.
Since by now the majority of web traffic comes from such user agents (search engine crawlers etc.), server logs will show significantly more traffic than Google Analytics.
To compare the two, you first need to remove requests for asset files and all crawler traffic from the log file. Then you need to compare the correct metric (i.e. Analytics pageviews vs. filtered server log page requests, not visits).
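To make that filtering concrete, here is a rough sketch in Python (the log path, the regular expressions, and the assumption of a combined-format access log are illustrative, not from the question):
import re

LOG_PATH = "access.log"  # hypothetical log file name
ASSET_RE = re.compile(r"\.(css|js|png|jpe?g|gif|svg|ico|woff2?)(\?|$)", re.I)
BOT_RE = re.compile(r"bot|crawler|spider|slurp", re.I)

pageviews = 0
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        # Combined log format: ... "GET /path HTTP/1.1" status size "referer" "user-agent"
        match = re.search(r'"(?:GET|POST) ([^ ]+) HTTP/[^"]*" (\d{3})', line)
        if not match:
            continue
        path, status = match.groups()
        user_agent = line.rsplit('"', 2)[-2]  # last quoted field in the line
        if status != "200" or ASSET_RE.search(path) or BOT_RE.search(user_agent):
            continue  # skip assets, errors/redirects and obvious bots
        pageviews += 1

print("approximate pageviews:", pageviews)
The resulting count still won't match GA exactly (GA misses users who block the script, and the patterns above are crude), but it puts the two numbers on comparable footing.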

Related

Cookieless analytics headache. How do I get GDPR compliant analytics to work?

Keep in mind, this is only the second site I've ever built. I'm not a programmer, I'm just good at googling things and following instructions.
The gist of it.
I don't want to have to ask for cookie consent just to know how many visitors my site has. I'm not working for a huge corporate entity which needs the most precise data in the world. All I need is a rough estimate of traffic and maybe to know how many people pushed the "contact us" buttons.
There has to be a way to do this without cookies, but my research has led to two dead ends:
Following "Setup Google Analytics without cookies", I tried this code:
<!-- Google Analytics -->
<script>
window.ga=window.ga||function(){(ga.q=ga.q||[]).push(arguments)};ga.l=+new Date;
ga('create', 'G-C88PM0YJP2', {
'storage': 'none'
});
ga('send', 'pageview');
</script>
<script async src='https://www.google-analytics.com/analytics.js'></script>
<!-- End Google Analytics -->
That got me nowhere. My dashboard registered no data, and the Chrome analytics debugger extension showed no data. (Just for reference, Google Analytics with the standard script that uses cookies did work when checked with the debugger.)
Fed up with that, I tried Tinyanalytics in their cookieless mode.
<!-- Pixel Code for http://app.tinyanalytics.io/ -->
<script defer src="http://app.tinyanalytics.io/pixel/oaFj0HuEqsS9uMW9"></script>
<!-- END Pixel Code -->
But when I go to look at the data, I'm again getting: "No data available. Did you install the Tracking Code?"
I did install it and have no idea why it isn't working.
Their "Help" section says "step 6. Click on the Verify installation tab and then click the button. If you see an alert box saying that the pixel is installed, then you finished the installation process."
Except I can't find a "Verify installation Tab" anywhere in their interface.
Can you tell why my two attempts have failed and help me get something like tinyanalytics to work? Or can you point me towards better alternatives for free cookieless analytics?
Please note, I'm aware that my host Netlify.com does offer an analytics package for $10/month. I consider that a last resort. But in my research it appears as though third-party cookies are getting deprecated anyway. Is Google Analytics going cookieless soon? Should I just pay Netlify.com and wait for that?
I wish I could find more alternatives on my own, but if you type the word "analytics" into any search engine, all you get is Google Analytics because they are so huge.
EDIT: I seem to have unearthed some error messages from my website
Mixed Content: The page at 'https://teatrulcomoara.ro/' was loaded
over HTTPS, but requested an insecure script
'http://app.tinyanalytics.io/pixel/oaFj0HuEqsS9uMW9'. This request has
been blocked; the content must be served over HTTPS. analytics.js:1
Failed to load resource: net::ERR_BLOCKED_BY_CLIENT
But this also doesn't check out for me. Tinyanalytics asks whether my URL is https or http, and I selected https, so why is Tinyanalytics giving me an http tracking code?
I've since added the "s" in the script. It makes the error go away, but I still don't have analytics:
<!-- Pixel Code for http://app.tinyanalytics.io/ -->
<script defer src="https://app.tinyanalytics.io/pixel/oaFj0HuEqsS9uMW9"></script>
<!-- END Pixel Code -->
Thank you in advance for your help.
Here's a link to my site so you can view source in case that helps.

Transaction based monitoring service for rails

I am looking for a service for my Rails application to monitor and alert/notify based on transaction-based tests. I have already looked at Pingdom, but it doesn't support uploading files, and that's crucial to my use case. What other services can I use to test and monitor my transactions? My use case is as follows:
User logs in to their iTunesConnect account via our login form
User selects multiple image(s) to upload to iTunesConnect
The images get uploaded locally
User sees the preview screen to review the images, which must pass validations
Final upload, which runs through a background service that uploads the images to the user's iTunesConnect account
WDT.io can be used to monitor transactions with an "on-demand timer". At the beginning of your transaction, your code makes an HTTP request to wdt.io, and then again at the end of the transaction. If the second request doesn't come in within a predefined time, you'll get an alert.
I'm not familiar with iTunesConnect, so this may not help. Also, if you have multiple transactions running concurrently, this won't be of help either.
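Just to illustrate the start/finish pattern such an on-demand timer expects, here is a sketch in Python (the endpoint URLs are placeholders, not the real wdt.io API, and the upload callable stands in for your background job):
import requests

# Placeholder endpoints; the actual URLs for your wdt.io timer will differ.
MONITOR_START_URL = "https://example.invalid/timers/itc-upload/start"
MONITOR_FINISH_URL = "https://example.invalid/timers/itc-upload/finish"

def run_upload_transaction(upload):
    requests.get(MONITOR_START_URL, timeout=5)   # arm the timer at transaction start
    upload()                                     # the actual iTunesConnect upload job
    # If upload() raises, the finish ping never fires and the timer raises an alert.
    requests.get(MONITOR_FINISH_URL, timeout=5)  # disarm the timer at transaction end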

Django Cache Implementation

Well, I'm designing a web application using Django. The application allows users to select photos from their computer, which keep populating the user's timeline. The timeline view has a list/grid of all the photos the user has uploaded, sorted chronologically, showing 50 photos and then a pull-to-refresh to fetch the next 50 photos on the timeline. The implementation works for multiple users.
Now, for a fast user experience, I'm considering caching. Like most sites, I want to store each user's timeline in a cache so that it is the first place checked when the user logs in; the request is served out of the cache, and only if the information is not available there do you go to the DB to query for it.
In short: I'm trying to cache all the timelines of the different users for now.
I'm done building the web app minus the cache part. So my question is: how do I cache the timelines of the different users?
There is a big difference between public caching and the caching of private data. I feel your data is private and thus needs a different strategy. There is a nice overview of the different ways to implement caching and, more importantly, the different things you need to take into account: The Server Side (Tom Eastman). It has a part on speed and caching (16:20 onward) and explains how to use etag and last_modified headers with Django.
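For the per-user timeline case, a minimal sketch using Django's low-level cache API could look like this (the Photo model, its owner/uploaded_at fields, the key scheme and the TTL are assumptions for illustration, not from the question):
from django.core.cache import cache

from myapp.models import Photo  # hypothetical app and model

TIMELINE_TTL = 60 * 5  # seconds; how stale a cached timeline may get (assumption)

def timeline_cache_key(user_id, page):
    return f"timeline:{user_id}:{page}"

def get_timeline(user, page=0, page_size=50):
    key = timeline_cache_key(user.id, page)
    photos = cache.get(key)
    if photos is None:
        # Cache miss: query the DB, then store the result for the next request.
        photos = list(
            Photo.objects.filter(owner=user)
            .order_by("-uploaded_at")[page * page_size:(page + 1) * page_size]
        )
        cache.set(key, photos, TIMELINE_TTL)
    return photos

def invalidate_timeline(user):
    # Call this after an upload so the first page is rebuilt on the next request.
    cache.delete(timeline_cache_key(user.id, 0))
With this pattern the configured cache backend (e.g. Memcached or Redis via your CACHES setting) holds one entry per user per page, and an upload only has to invalidate that user's entries.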

Image Storage in S3 and serving securely

I am building a photo site where users can upload photos and view them later. These photos are private, not public. I am storing the photos and the thumbnails in S3. Currently, when a user comes to the page, I serve signed URLs for the thumbnails, which are then loaded from S3 (though I am also thinking about using signed URLs from CloudFront).
The issues now are:
On each request a different URL is served for each thumbnail, so the browser cache can't be used. This makes the browser load every image again when the user refreshes the site, which makes the page slow.
It also creates another problem: if someone snoops into the page source, they can find the signed URLs of the photos and distribute them to others for viewing (though a signed URL is only valid for 10 minutes). What I would prefer is that the URLs be served by my application so that I can decide whether the user should be allowed access or not.
Please help me with what approach I should take; I want the page to load fast while also addressing the security concern. I would also like to know whether serving from CloudFront will be faster than the browser cache (I have read that somewhere), even with a different signed URL every time.
Feel free to be descriptive in your answer.
I don't think there is a perfect answer to what you want. Some random ideas/tradeoffs:
1) Switch to HTTPS. That way you can ignore people sniffing URLs. But HTTPS items cannot be cached in the browser for very long.
2) If you are giving out signed URLs, don't set expires = "time + 10m"; instead do "time + 20m, rounded to the nearest 10m". This way, the URLs stay constant for at least 10 minutes, and the browser can cache them. (Be sure to also set the Expires headers on the files in S3 so the browser knows they can be cached.)
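As a sketch of that rounding trick (the 10/20-minute window is just the answer's example; the helper name is mine):
import time

WINDOW = 10 * 60  # 10-minute rounding window, in seconds

def rounded_expiry(now=None):
    """Absolute expiry that is always 10-20 minutes away and only changes every 10 minutes."""
    now = int(now if now is not None else time.time())
    return (now // WINDOW) * WINDOW + 2 * WINDOW

# Every URL signed within the same 10-minute window embeds the same expiry,
# so a signing scheme that uses an absolute Expires value yields an identical
# (and therefore browser-cacheable) URL for that window.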
3) You could proxy all the URLs. Have the browser request the photo from your server, then write a web proxy to forward the request to the photo in S3. Along the way, you can check the user's auth, generate a signed URL for S3, and even cache the photo locally. This seems "less efficient" for you, but it lets the browser cache your URLs for as long as it wants. It's also convenient for your users, since they can bookmark a photo URL and it always works. Even if they move to a different computer, they hit your server, which can ask them to sign in before showing the photo.
Make sure to use an "evented" server like Python Twisted, or Node.js. That way, you can be proxying thousands of photos at the same time without using a lot of memory/CPU on your server. (You will use a lot of bandwidth, since all data goes thru your server. But you can "scale out" easily by running multiple servers.)
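A minimal sketch of such a proxy endpoint, written here with Flask and boto3 purely to show the control flow (the answer recommends an evented server instead; the bucket name, auth check and URL scheme are assumptions):
import boto3
from flask import Flask, Response, abort, session

app = Flask(__name__)
app.secret_key = "change-me"          # needed for sessions; placeholder value
s3 = boto3.client("s3")
BUCKET = "my-private-photos"          # hypothetical bucket name

def current_user_may_view(key):
    # Placeholder auth check: logged-in users may only view keys under their own prefix.
    user_id = session.get("user_id")
    return user_id is not None and key.startswith(f"{user_id}/")

@app.route("/photos/<path:key>")
def photo(key):
    if not current_user_may_view(key):
        abort(403)
    obj = s3.get_object(Bucket=BUCKET, Key=key)
    # Stream the body back; a private Cache-Control header lets the browser cache it.
    return Response(
        obj["Body"].iter_chunks(),
        content_type=obj.get("ContentType", "application/octet-stream"),
        headers={"Cache-Control": "private, max-age=86400"},
    )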
4) CloudFront is a cache. It will be slower (by a few hundred ms) the first time a resource is requested from a CF server. But don't expect the second request to be cached! Each CF location has ~20 different servers, and you'll hit a random one each time. So requesting a photo 10 times will likely generate 10 cache misses, and you still only have a 50% chance of getting a cache hit on the next request. CF is only useful for popular content that is going to be requested hundreds of times. CF is somewhat useful for foreign users because the private CF-to-S3 connection can be better than the normal internet.
I'm not sure exactly how you would have CF do your security checking for you. But if you pass through the S3 auth (not the default), then you could use the "mod 10 minutes" trick to make URLs that can be cached for 10 minutes.
It is impossible for CF to be "faster than a browser cache". But if you are NOT using your browser cache, CF can be faster than S3, but mostly in foreign locations.
Take a look at what other people do (e.g. SmugMug uses S3, I think).

Google Analytics in footer file

I have a question about how Google Analytics tracks pages in a WordPress site, or any other site that uses a template file to include the Google Analytics code in the footer or header. Since the file is generated and used on all the pages, does that mean the analytics code counts all the pages that are viewed? Also, is it possible to see which pages are getting hits and get a more detailed report in Google Analytics? I just have a feeling that the page I'm tracking is showing inaccurate reports since the same code is used on every page. Can anyone help clear this up and educate me a bit on this topic?
The code is always the same; it loads in the footer so you don't have to put it on every single page.
The code contains a unique ID for your website, so Analytics knows which Analytics account should receive the data.
The code doesn't need to be changed per page.
You can see the pageviews like this:
Google Analytics --> Content --> Site Content --> All Pages
Now you get a list with URLs and the pageviews for every URL.
You can sort the list by pageviews (how many times the page was loaded) and unique pageviews (how many unique visitors viewed the page).
You can also find the bounce rate, which shows what percentage of users left your site from that page.