Tracking users on the internet - cookies

We want to track users when they visit our competitors' websites.
Suppose a user visits a.com (our site) and then visits b.com and c.com (our competitors); we want to know that the user has visited those sites. Is there a way to track this without using plugins like Chrome extensions?
I know there are limitations with cookies. Is there a way to do this?

Thankfully, no.
From time to time, accidental mechanisms that leak such information become known (for example, at one time it was possible to inspect link styles to see whether a link had been visited). Such things are regarded as serious security breaches and are swiftly eliminated by browser vendors.
Basic web security principles isolate interactions with unrelated domains unless both web pages willingly try to talk to each other. Furthermore, if a user visits your site first and then closes it, their interaction with you is finished - in no sane scenario would you be able to know what they do afterwards.
Not to mention the obvious fact that what you're asking for is highly unethical.

Unless you can get your competitors to give you access to their websites - either directly through included scripts, or indirectly if you manage to track users through ad networks or Facebook pings - it's not possible. You cannot (and should not be able to, in any case) access cookies set by another domain.
Some browsers expose the previously visited URL through document.referrer, but that is not guaranteed to work, nor is it reliable. In short, there is no targeted way of doing this.
Not being able to easily track people is a good thing, but it makes constructions like the one you want a lot harder.

Related

Sharing among all the browsers of a user

I'm building a website www.example.com. I want to implement an affiliate program: when a non-signed-in user enters https://www.example.com/?af-code=affiliateA in his browser (e.g., Chrome), we plan to write affiliateA into, say, the browser's localStorage, so that we always know this user came from affiliateA, no matter which URL of our website he visits later.
But it seems that Chrome's localStorage is not accessible to IE on the same machine, and vice versa.
My question is whether there is a mechanism (e.g., localStorage, sessionStorage, cookies) that permits sharing among all of a user's browsers.
It's quite difficult to prove that something doesn't exist. But if we take this as a complete list of Web APIs (which I hope it is), you'll see that there are various APIs for dealing with data storage, even a File and Directory Entries API, but all of them are local to a single browser.
But your task is actually not sharing data between browsers; it is identifying a user. And that can be done - but only with some probability and a lot of effort.
For that you can gather different pieces of information about the user's device (like screen size, DPI, network conditions, etc.), which will be more or less the same on the same device in a different browser. You can also use other pieces of information on the server (client IP, simultaneous sessions when both browsers are open, action patterns, etc.). But it's not an easy task.
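To make that concrete, here is a minimal sketch of combining such signals into a fuzzy match score (the Signals fields, the weights, and the threshold are illustrative assumptions, not a standard fingerprinting scheme):

    # Hypothetical sketch: score how likely two visits come from the same device.
    from dataclasses import dataclass

    @dataclass
    class Signals:
        ip: str            # client IP as seen by the server
        screen: str        # e.g. "1920x1080", reported by client-side code
        dpi: int           # devicePixelRatio or similar, reported by the client
        timezone: str      # e.g. "Europe/Berlin"

    WEIGHTS = {"ip": 0.4, "screen": 0.3, "dpi": 0.2, "timezone": 0.1}

    def similarity(a: Signals, b: Signals) -> float:
        """Return a 0..1 score; higher means 'probably the same device'."""
        score = 0.0
        for field, weight in WEIGHTS.items():
            if getattr(a, field) == getattr(b, field):
                score += weight
        return score

    # Usage: treat visits scoring above some threshold (say 0.7) as the same
    # device, remembering that this is probabilistic and will misfire.
    chrome_visit = Signals("203.0.113.7", "1920x1080", 2, "Europe/Berlin")
    ie_visit     = Signals("203.0.113.7", "1920x1080", 2, "Europe/Berlin")
    assert similarity(chrome_visit, ie_visit) >= 0.7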
And to add to the complexity of it:
One device can be used by many users.
One user usually has many devices.
And there is no way you can identify a user across different devices without resorting to full-blown spying techniques.
But there are trackers out there, like Google Analytics, and you can even find open-source projects. So setting one up on your site can be a (still probabilistic) answer to your question.
I can't recommend any - I've never used them.

Remembering users of websockets

Good Afternoon,
I am hoping someone might be able to help me with a concept. I have a WebSocket server which pushes JSON messages out to users. I have coded a number of admin functions for pushing broadcasts out to users, as well as disconnecting users if needed.
One of the things I would like to be able to do, though, is come up with a 'near' foolproof way of 'banning' users from connecting to the server if required. This is where I am a bit lost: if I go the cookie route, it is possible that the cookies get cleared and it no longer works; I can't use the session ID either, as once they disconnect they get a new session ID; and the IP address is also problematic, as many users would be on dynamic mobile connections.
I'd appreciate any tips on how best to remember users, so that if I ban them I can prevent them from reconnecting.
The server I am running is supersocketserver, whilst the client is HTML5.
Given your current constraints, there is no "foolproof" way to ban a person from accessing your web site.
For privacy reasons, there is no permanent way to identify a given browser. There are cookies, there are IP addresses, there are even some evil "perma-cookies" that attempt to store little pieces of identifying information in lots of places (such as flash cookies and other plug-in data) to try to make it difficult (but not impossible) for users to clear them. As you're already aware, IP addresses are not permanent and are not always tied to just one user either.
And, of course, a user can simply use a different browser, computer, or mobile device.
So the usual way to control access is to require users to create an account in your system before they can use your service. Then, if you want to ban a user, you ban that account. Since you will want to prevent the user from simply creating a new account, you can collect other identifying information upon registration. The more info you require and can verify, the harder it is for users to create more and more accounts. This gets to be quite a bit of work if you really want to make it difficult, because you need to require pieces of identifying information that you can verify and that are hard for a rogue user to duplicate (credit cards, email addresses, home addresses, etc...). How far you go here, and how much effort you put in, depends on how badly you want to keep a banned user out.
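As a rough illustration of that account-based ban, here is a minimal, framework-agnostic sketch in Python (this is not the API of the server mentioned in the question; lookup_account, the token format, and the in-memory ban list are all assumptions):

    # Sketch of an account-based ban check, run when a client opens a connection.
    # lookup_account() and banned_account_ids are stand-ins for your own storage.
    banned_account_ids = {"user-4711"}          # persisted in a real system

    def lookup_account(auth_token: str):
        """Map the token the client presents to an account id, or None."""
        # In practice this would validate a signed token or a session record.
        fake_db = {"token-abc": "user-4711", "token-def": "user-1002"}
        return fake_db.get(auth_token)

    def on_connect(auth_token: str) -> bool:
        """Return True to accept the connection, False to reject it."""
        account_id = lookup_account(auth_token)
        if account_id is None:
            return False                        # unauthenticated: refuse or limit
        if account_id in banned_account_ids:
            return False                        # banned account: refuse
        return True

    assert on_connect("token-abc") is False     # banned
    assert on_connect("token-def") is True      # allowed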

In a website with no users, are cookies the only way to prevent people from repeating actions?

I'm creating a website and I don't want to have users/membership.
However, people using the site can do things like vote and flag questionable content. Is using cookies the only way to identify users and prevent them from doing this repeatedly?
I realize a user can just clear their cookies, but I can't think of another way.
Suggestions for this scenario?
Well, you could map a cookie + IP address to a record in your database to identify the user. If the IP exists in the database, you simply add the cookie - but check the cookie first to avoid unnecessary database calls.
This is not optimal though, since schools etc. might have the same IPs across a lot of computers.
You can always just adopt OpenID!
Marko & Visage are correct,
Just to add though, you might want to store each vote with a timestamp, IP, etc., so that if someone does try to "game" your site, you'd at least be able to roll back sets of votes made from the same location or within a very short amount of time (i.e. from a bot).
+1 To all that others have already said. Here's another middle-way idea:
Use cookies as primary means of limiting voting. If a cookie is not found, check the IP address. Allow no more than, say, 1 vote per 5 minutes from the same IP.
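A minimal sketch of that cookie-first, IP-fallback throttle (the in-memory dictionaries and the helper name may_vote are assumptions; a real site would persist this in a database or cache):

    # Sketch of the cookie-first / IP-fallback throttle described above.
    import time

    VOTE_WINDOW_SECONDS = 5 * 60
    votes_by_cookie: dict[str, float] = {}   # cookie token -> last vote time
    votes_by_ip: dict[str, float] = {}       # ip -> last vote time

    def may_vote(cookie_token: str | None, ip: str, now: float | None = None) -> bool:
        now = time.time() if now is None else now
        if cookie_token is not None:
            last = votes_by_cookie.get(cookie_token)
            if last is not None and now - last < VOTE_WINDOW_SECONDS:
                return False
            votes_by_cookie[cookie_token] = now
            return True
        # No cookie: fall back to the (much coarser) per-IP limit.
        last = votes_by_ip.get(ip)
        if last is not None and now - last < VOTE_WINDOW_SECONDS:
            return False
        votes_by_ip[ip] = now
        return True

    assert may_vote("c1", "198.51.100.9", now=0.0) is True
    assert may_vote("c1", "198.51.100.9", now=60.0) is False   # same cookie, too soon
    assert may_vote(None, "198.51.100.9", now=60.0) is True    # no cookie, first IP vote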
Cookies are not enough; as you said, they can be cleared or expire.
IP address tracking is also not an option because of DHCP and firewalls.
The only thing that is ~100% reliable is user accounts, but then again, one person can register multiple accounts.
I'd go with cookies, as the simplest and least obtrusive way. If someone really wants to game the system, they will find a way whatever you do to prevent it.
Even with membership, a user can register multiple times and vote.
Cookies are one way to handle this, but people who know they can delete cookies can vote again.
You can capture the IP of the voter and restrict based on that, but many people will share the same IP.
Sadly, there is no other way.
Yes, you are right.
HTTP is stateless, so there is no way of determining whether the origin of a request you receive now is the same as the origin of a request you received, say, 5 minutes ago.
Cookies are the only way around this. Even server-side sessions rely on cookies to maintain session identity across requests (ignoring the security nightmare of passing the session ID in the URL, which anyone with malicious intent can sidestep trivially).
There will always be people gaming the system if it suits them. Moreover, if you make it so that you don't need cookies at all, you'd be open to very simple attacks.
I think you'll want to consider ways to increase the economic cost of users operating under a cloud of suspicion.
For example, if a user with the same cookie tries to re-submit the vote, that can obviously be stopped easily.
If a user with a different cookie but from the same IP does the same thing, it could be coming from a proxy/firewall so you may want to be cautious and force them to do something extra, like a simple CAPTCHA. Once they've done this, if they behave properly nothing new is required as long as their new cookie stays with them.
This implies that people without cookies can still participate, but they have to re-enter the letter sequence (or whatever) each time. A hassle, but they're likely used to sites not working without cookies, and they'd be able to make an exception if required.
You really won't be able to deal with users sitting on a pool of IPs (real or otherwise) and exploiting new and dynamic attack vectors on your site. In that case, their economic investment will be greater than yours and, frankly, you'll lose. At that point, you're just competing to maintain the rules of your system. That's when you should explore requiring signup/email/mobile/SMS confirmation to up the ante.
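Putting the escalation described above into a rough sketch (the names handle_vote and captcha_passed, and the in-memory sets, are illustrative assumptions):

    # Sketch: same cookie -> reject; new cookie from a known IP -> CAPTCHA challenge.
    seen_cookie_votes: set[str] = set()      # cookie tokens that already voted
    trusted_cookies: set[str] = set()        # cookies that passed a CAPTCHA once
    voting_ips: set[str] = set()             # IPs that have voted before

    def handle_vote(cookie: str, ip: str, captcha_passed: bool) -> str:
        if cookie in seen_cookie_votes:
            return "reject"                  # same cookie re-submitting
        if cookie not in trusted_cookies and ip in voting_ips and not captcha_passed:
            return "challenge"               # new cookie, known IP: ask for a CAPTCHA
        trusted_cookies.add(cookie)          # behaved properly; trust this cookie now
        seen_cookie_votes.add(cookie)
        voting_ips.add(ip)
        return "accept"

    assert handle_vote("c1", "10.0.0.1", captcha_passed=False) == "accept"
    assert handle_vote("c1", "10.0.0.1", captcha_passed=False) == "reject"     # same cookie
    assert handle_vote("c2", "10.0.0.1", captcha_passed=False) == "challenge"  # new cookie, same IP
    assert handle_vote("c2", "10.0.0.1", captcha_passed=True) == "accept"      # passed CAPTCHA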
You can add GET variables and URL parts to serve as cookies - some sites do that to allow logins and/or tracking when cookies are disabled. Generate the part using the source IP and user agent string, for example:
site.com/vote?cookie=123456
site.com/vote/cookie123456
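A minimal sketch of deriving such a URL part from the source IP and user agent (the salt and the helper name are assumptions; the token changes as soon as the IP or browser changes, so treat it as a weak identifier):

    # Sketch: derive a pseudo-cookie token for vote URLs from IP + user agent.
    # SALT is an assumption; keep it server-side so tokens are harder to guess.
    import hashlib

    SALT = "replace-with-a-real-secret"

    def url_token(ip: str, user_agent: str) -> str:
        raw = f"{SALT}|{ip}|{user_agent}".encode()
        return hashlib.sha256(raw).hexdigest()[:16]

    token = url_token("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)")
    vote_url = f"https://site.com/vote?cookie={token}"
    # On the next request, recompute the token from the same inputs and compare;
    # a match suggests (but does not prove) the same user behind the same IP.
    print(vote_url)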

How can I protect my web-based game against cheaters?

I just wrote one of my first web applications (Linux, Apache, MySQL, Django), and would like to launch it publicly. It's a webform-based task disguised as a game; I intend to eventually put it on Amazon Mechanical Turk and give small bonuses to people who achieve certain scores.
Even though this app does not have a tremendously high security risk, I need to safeguard it against manipulation and reverse engineering. However, I have little formal training in testing/security. Given that there are tangible prizes to be won, I know people will have an incentive to cheat, whether by altering POST data, pressing "back" and re-submitting data until they win, etc. So far, I have been dealing with these issues on an ad-hoc basis by putting in security tests as I think of possible exploits. However, I realize there are probably lots of forms of manipulation that I haven't thought of yet.
Can anybody recommend some reading materials from which I can learn how to protect my website against manipulation and reverse engineering?
A very good place to read up is OWASP; see http://www.owasp.org/index.php/Main_Page. They have extensive documentation regarding website security.
Edit: For a quick overview, check the "Top Ten."
The Google Browser Security Handbook has a lot of information about potential vulnerabilities in the web architecture, in particular the details that are affected by the behavior of web browsers (as opposed to server based vulnerabilities, like SQL injection attacks and the like). It is a good starting point for learning about how browsers work in ways that impact security, like how they handle cookies, cross domain requests, images and MIME types, etc.
SQL Injection
Prevent malicious users from altering SQL queries via URL query strings (a parameterized-query sketch follows this list).
DoS Attacks
Prevent users from the same IP address from accessing your site an excessive number of times in a small space of time.
Password Strength
When allowing users to create their own passwords, show a password strength indicator which encourages users to enter stronger passwords.
Captcha
Stop non-human users from submitting to forms by presenting a captcha image. You may also want to use this if password authentication is failed multiple times, to prevent robots from guessing passwords.
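As a concrete illustration of the SQL Injection point above, here is a minimal sketch using parameterized queries with Python's built-in sqlite3 (the table and data are made up):

    # Sketch of the parameterized-query idea, using Python's built-in sqlite3.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scores (player TEXT, score INTEGER)")
    conn.execute("INSERT INTO scores VALUES ('alice', 10), ('bob', 7)")

    def scores_for(player: str):
        # The "?" placeholder keeps user input out of the SQL text entirely,
        # so input like "alice' OR '1'='1" is treated as a literal string.
        return conn.execute(
            "SELECT score FROM scores WHERE player = ?", (player,)
        ).fetchall()

    assert scores_for("alice") == [(10,)]
    assert scores_for("alice' OR '1'='1") == []   # injection attempt finds nothing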
One book I might recommend is "Security Engineering" by Ross Anderson. It's fairly detailed and it gives a good overview of many different topics relating to computer security, although not all of it is relevant for securing a website.

comparison of ways to maintain state

There are various ways to maintain user state in web development.
These are the ones that I can think of right now:
Query String
Cookies
Form Methods (Get and Post)
Viewstate (ASP.NET only I guess)
Session (InProc Web server)
Session (Dedicated web server)
Session (Database)
Local Persistence (Google Gears) (thanks Steve Moyer)
etc.
I know that each method has its own advantages and disadvantages like cookies not being secure and QueryString having a length limit and being plain ugly to look at! ;)
But, when designing a web application I am always confused as to what methods to use for what application or what methods to avoid.
What I would like to know is what method(s) do you generally use and would recommend or more interestingly which of these methods would you like to avoid in certain scenarios and why?
While this is a very complicated question to answer, I have a few quick-bite things I think about when considering implementing state.
Query string state is only useful for the most basic tasks -- e.g., maintaining the position of a user within a wizard, perhaps, or providing a path to redirect the user to after they complete a given task (e.g., logging in). Otherwise, query string state is horribly insecure, difficult to implement, and in order to do it justice, it needs to be tied to some server-side state machine by containing a key to tie the client to the server's maintained state for that client.
Cookie state is more or less the same -- it's just fancier than query string state. But it's still totally maintained on the client side unless the data in the cookie is a key to tie the client to some server-side state machine.
Form method state is again similar -- it's useful for hiding fields that tie a given form to some bit of data on the back end (e.g., "this user is editing record #512, so the form will contain a hidden input with the value 512"). It's not useful for much else, and again, is just another implementation of the same idea behind query string and cookie state.
Session state (in any of the ways you describe) is great, since it's infinitely extensible and can handle anything your chosen programming language can handle. The first caveat is that there needs to be a key in the client's hand to tie that client to its state stored on the server; this is where most web frameworks provide either a cookie-based or query string-based key back to the client. (Almost every modern one uses cookies, but falls back on query strings if cookies aren't enabled.) The second caveat is that you need to put some thought into how you're storing your state... will you put it in a database? Does your web framework handle it entirely for you? Again, most modern web frameworks take the work out of this, and for me to go about implementing my own state machine, I need a very good reason... otherwise, I'm likely to create security holes and functionality breakage that's been hashed out over time in any of the mature frameworks.
So I guess I can't really imagine not wanting to use session-based state for anything but the most trivial reason.
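To illustrate the "key in the client's hand" idea, here is a minimal, framework-independent sketch (real frameworks add expiry, signing, and persistent storage; the helper names are assumptions):

    # Sketch: server-side session store keyed by a random token carried in a cookie.
    import secrets

    sessions: dict[str, dict] = {}           # token -> per-user state on the server

    def create_session() -> str:
        token = secrets.token_urlsafe(32)    # value to send back in a Set-Cookie header
        sessions[token] = {}
        return token

    def get_session(token: str | None) -> dict | None:
        return sessions.get(token) if token else None

    # Usage: on the first request create a session and set the cookie; on later
    # requests read the cookie and look the state up server-side.
    token = create_session()
    get_session(token)["cart"] = ["book"]
    assert get_session(token) == {"cart": ["book"]}
    assert get_session("forged-token") is None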
Security is also an issue; values in the query string or form fields can be trivially changed by the user. User authentication should be saved either in an encrypted or tamper-evident cookie or in the server-side session. Keeping track of values passed in a form as a user completes a process, like a site sign-up, well, that can probably be kept in hidden form fields.
The nice (and sometimes dangerous) thing, though, about the query string is that the state can be picked up by anyone who clicks on a link. As mentioned above, this is dangerous if it gives the user some authorization they shouldn't have. It's nice, though, for showing your friends something you found on the site.
With the increasing use of Web 2.0, I think there are two important methods missing from your list:
8 AJAX applications - since the page doesn't reload and there is no page to page navigation, state isn't an issue (but persisting user data must use the asynchronous XML calls).
9 Local persistence - Browser-based applications can persist their user data and state to the local hard drive using libraries such as Google Gears.
As for which one is best, I think they all have their place, but the Query String method is problematic for search engines.
Personally, since almost all of my web development is in PHP, I use PHP's session handlers.
Sessions are the most flexible, in my experience: they're normally faster than db accesses, and the cookies they generate die when the browser closes (by default).
Avoid InProc if you plan to host your website on a cheap-n-cheerful host like webhost4life. I've learnt the hard way that because their systems are oversubscribed, they recycle the applications very frequently, which causes your session to get lost. Very annoying.
Their suggestion is to use StateServer, which is fine except that you have to serialise/deserialise the session on each postback. I love objects and my web app is full of them. I'm concerned about performance when switching to StateServer. I need to refactor to only put the stuff I really need in the session.
Wish I'd know that before I started...
Cheers, Rob.
Be careful what state you store client side (query strings, form fields, cookies). Anything security-related should not be stored client-side, except maybe a session identifier if it is reasonably obscured and hard to guess. There are too many websites that have settings like "authenticated=true" and store those in a cookie or query string or hidden form field. It is trivial for a user to bypass something like that. Remember that ANY input coming from a client could have been tampered with and should not be trusted.
Signed cookies linked to some sort of database store when you need to grab data. There's no reason to be storing data on the client side if you have a connected back end; you're just looking for trouble if this is a public-facing website.
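As a rough sketch of what a tamper-evident (signed) cookie value can look like (the secret, the payload format, and the helper names are assumptions; web frameworks ship this built in):

    # Sketch of a tamper-evident cookie value: payload + HMAC signature.
    # SECRET stays on the server; the "payload.signature" format is an assumption.
    import hashlib, hmac

    SECRET = b"replace-with-a-real-secret"

    def sign(payload: str) -> str:
        sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        return f"{payload}.{sig}"

    def verify(cookie_value: str) -> str | None:
        payload, _, sig = cookie_value.rpartition(".")
        expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        return payload if hmac.compare_digest(sig, expected) else None

    cookie = sign("user_id=42")
    assert verify(cookie) == "user_id=42"
    assert verify("user_id=1." + "0" * 64) is None   # tampered value is rejected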
It's not so much a question of what to use and what to avoid, but when to use which. Each has particular circumstances in which it is the best, and different circumstances in which it's the worst.
The deciding factor is generally the lifetime of the data. Session state lives longer than form fields, and so on.