Why are cookies the only way to know the number of users?

I was reading HowStuffWorks and this is what is written there:
http://computer.howstuffworks.com/cookie3.htm
" It turns out that because of proxy servers, caching, concentrators and so on, the only way for a site to accurately count visitors is to set a cookie with a unique ID for each visitor."
I couldn't work out from this why a cookie is the only way.

Thirty users might come from the same IP address (think of an ISP block or a corporate network).
Thirty different users might retrieve content from various caches rather than making unique requests that make it all the way to the app server (local cache, ISP cache, etc.).
Without tracking individual sessions, traffic may be misestimated or misinterpreted.
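As a rough illustration of the cookie approach, here is a minimal sketch (the Flask route, cookie name and in-memory set are assumptions for the example, not anything from the article): each new visitor gets a random ID in a cookie, and the site counts distinct IDs rather than distinct IP addresses.

```python
# Minimal sketch (assumed Flask app and in-memory storage):
# count distinct visitor IDs carried in a cookie instead of distinct IPs.
import uuid
from flask import Flask, request, make_response

app = Flask(__name__)
seen_visitor_ids = set()  # in production this would live in a database

@app.route("/")
def index():
    visitor_id = request.cookies.get("visitor_id")
    if visitor_id is None:
        visitor_id = uuid.uuid4().hex      # new unique ID for a first-time visitor
    seen_visitor_ids.add(visitor_id)       # 30 users behind one proxy still count as 30

    resp = make_response(f"Unique visitors so far: {len(seen_visitor_ids)}")
    resp.set_cookie("visitor_id", visitor_id, max_age=60 * 60 * 24 * 365)  # 1-year TTL
    return resp
```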

Related

Does HATEOAS increase the number of calls to server?

I have never used HATEOAS with REST APIs. What I understand is that with HATEOAS one doesn't need to store URIs; the server sends the URIs in the response, and these can be used to fetch other resources or related resources.
But with HATEOAS, aren't we increasing the number of calls?
If I want to fetch customer-order information, and I first fetch the customer information and get the URI for its orders dynamically, isn't that an extra call?
Loose coupling I can understand, but I do not understand the exact use of this maturity level of REST.
Why should HATEOAS increase the number of required requests? Without the service returning URIs the client can use to perform a state transition (gather further information, invoke some tasks, ...), the client has to have some knowledge of how to build a URI itself (and is hence tightly coupled to the service), yet it still needs to invoke the endpoint on the server side. So HATEOAS just moves the knowledge of how to generate the URI from the client to the server.
Usually a further request sent to the server isn't really an issue, as each call should be stateless anyway. If you have a load-balanced server structure, the additional request does not have a noticeable performance impact on the server.
If you do care about the number of requests issued by a client to the server (for whatever reason), you might have a look at e.g. HAL JSON, where you can embed the content of sub-resources. In the case of customer orders, though, this can also have a significant performance impact: if users have plenty of orders stored, the response may be quite huge and the client has to hold all of that data even though it might not use it. Usually, instead of embedding lots of list items within a response, the service will point the client to a URI where the client can learn how to retrieve that information if needed. Often this kind of URI provides a pageable view of the data (like the orders placed by a customer).
While a pageable request certainly increases the number of requests handled by the service, overall performance will improve, as the service does not have to return the whole order data to the client; this reduces the load on the backing DB as well as shrinking the actual response content length.
To sum up my post, HATEOAS is intended to move the logic of creating the URIs to invoke from clients to servers and therefore decouple clients from services further. The number of actual requests clients have to issue isn't tied to HATEOAS but to the overall API design and the requirements of the client.
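For illustration, here is a hedged sketch of what a HAL-style customer representation might look like (the field names, URIs and the small embedded preview are invented for this example): the client follows the "orders" link it is given instead of constructing /customers/{id}/orders itself, and the orders collection behind that link would expose its own paging links.

```python
# Hypothetical HAL-style customer representation (field names and URIs are
# invented for illustration): the client follows "orders" instead of building
# the URI itself; the orders collection would expose paging links in turn.
customer_response = {
    "id": "12345",
    "name": "Jane Doe",
    "_links": {
        "self":   {"href": "/customers/12345"},
        "orders": {"href": "/customers/12345/orders?page=1"},
    },
    # Optionally embed a small preview so a second request isn't always needed;
    # embedding *all* orders would bloat the response, as noted above.
    "_embedded": {
        "recent-orders": [
            {"id": "987", "_links": {"self": {"href": "/orders/987"}}},
        ]
    },
}
```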

How much overhead would a DNS call add to the response time of my API?

I am working on a cross-platform application that runs on iOS, Android and the web. Currently there is an API server which interacts with all the clients. Each request to the API is made via the IP address (e.g. http://1.1.1.1/customers). This prevents me from moving the backend to another cloud VPS quickly whenever I want, because I need to update the iOS and Android versions of the app through a painful migration process.
I thought the solution would be introducing a subdomain (e.g. http://api.example.com/customers). How much would an additional DNS call affect the response times?
The thing to remember about DNS queries is that, as long as you have configured your DNS sensibly, clients will only ever make a single lookup the first time communication is needed.
A DNS lookup will typically involve three queries: one to a root server, one to the .com (or other TLD) server, and a final one to the example.com name server. Each of these takes milliseconds and is performed once, probably every hour or so, whenever the TTL expires.
The TL;DR is basically that it is a no-brainer: you get far, far more advantages from using a domain name than you will ever get from an IP address. The time is minimal, and the packet size is tiny.
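If you want to see the cost on your own network, you can time the lookup yourself. A minimal sketch (the host name is a placeholder; swap in your real API host, e.g. api.example.com): the first call pays the resolution cost, and repeats are usually answered from the OS or resolver cache until the record's TTL expires.

```python
# Rough sketch: time a DNS lookup for a host.
# The first call pays the resolution cost; repeats are usually served
# from a resolver/OS cache until the record's TTL expires.
import socket
import time

HOST = "example.com"  # substitute your real API host, e.g. api.example.com

def time_lookup(host: str) -> float:
    start = time.perf_counter()
    socket.getaddrinfo(host, 443)                 # triggers DNS resolution
    return (time.perf_counter() - start) * 1000   # milliseconds

for attempt in range(3):
    print(f"lookup {attempt + 1}: {time_lookup(HOST):.1f} ms")
```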

Blocking IP addresses, preventing DoS attacks

So this is more of a general question on best practice for preventing DoS attacks. I'm just trying to get a grasp on how most people handle malicious requests from the same IP address, which is the problem we are currently having.
I figure it's better to block a truly malicious IP as high up the stack as possible, to avoid spending more resources, especially when it comes to loading your application.
Thoughts?
You can prevent DoS attacks from occurring in various ways.
Limit the number of queries per second from a particular IP address. Once the limit is reached, you can send a redirect to a cached error page to limit any further processing. You might also be able to get these IP addresses firewalled so that you don't have to process their requests at all. Limiting requests per IP address won't work very well, though, if the attacker forges the source IP address in the packets they are sending.
I'd also try to build some smarts into your application to help deal with a DoS. Take Google Maps as an example. Each individual site has to have its own API key, which I believe is limited to 50,000 requests per day. If your application worked in a similar way, then you'd want to validate this key very early on in the request so that you don't use too many resources for the request. Once the 50,000 requests for that key are used, you can send appropriate proxy headers such that all future requests for that key (for the next hour, for example) are handled by the reverse proxy. It's not foolproof, though. If each request has a different URL, then the reverse proxy will have to pass the request through to the backend server. You would also run into a problem if the DDoS used lots of different API keys.
Depending on the target audience for your application, you might be able to blacklist large IP ranges that contribute significantly to the DDoS. For example, if your web service is for Australians only, but you were getting a lot of DDoS requests from some networks in Korea, then you could firewall the Korean networks. If you want your service to be accessible by anyone, then you're out of luck on this one.
Another approach to dealing with a DDoS is to close up shop and wait it out. If you've got your own IP address or IP range, then you, your hosting company or the data centre can null route the traffic so that it goes into a black hole.
Referenced from here. There are other solutions on the same thread too.
iptables -I INPUT -p tcp -s 1.2.3.4 -m statistic --probability 0.5 -j DROP
iptables -I INPUT n -p tcp -s 1.2.3.4 -m rpfilter --loose -j ACCEPT
# n would be a numeric index into the INPUT chain -- the default is to append to the INPUT chain
more at...
Can't Access Plesk Admin Because Of DOS Attack, Block IP Address Through SSH?
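As an application-level counterpart to the firewall rules above, here is a minimal per-IP rate-limit sketch (the thresholds and the in-memory store are assumptions for illustration; as the quoted answer suggests, real deployments usually push this into the reverse proxy, load balancer or firewall so the backend never sees the traffic).

```python
# Minimal per-IP sliding-window rate limiter (thresholds and in-memory store
# are assumptions; real setups push this into the proxy/firewall layer).
import time
from collections import defaultdict, deque

MAX_REQUESTS = 100      # allowed requests...
WINDOW_SECONDS = 1.0    # ...per sliding window, per IP

_recent = defaultdict(deque)  # ip -> timestamps of recent requests

def allow_request(ip: str) -> bool:
    now = time.monotonic()
    window = _recent[ip]
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()              # drop timestamps outside the window
    if len(window) >= MAX_REQUESTS:
        return False                  # caller answers with 429 or a cached error page
    window.append(now)
    return True
```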

HTTP Cookies: Any way to make sure it wasn't copied?

Is there any way that a website can read a cookie in a way that the cookie is locked to that particular computer and that it wasn't somehow copied to another computer?
Assuming you don't trust the end point — no.
If you don't trust the user, then you can't be sure.
If you don't trust the computer (e.g. it might have malware installed), then you can't be sure.
If you don't trust the connection (i.e. it isn't secured with SSL), then you can't be sure.
You can't be sure by linking the cookie to an IP address either, since:
Multiple computers can share an IP (e.g. via NAT)
One computer can cycle through multiple IPs (e.g. some large ISPs use a bank of proxy servers)
You could include a bunch of data gathered from the browser (e.g. the user agent string) as a hashed value in the cookie, but that would break if anything changed the data you were checking against, and it wouldn't help if the cookie was copied to another machine with identical data. While user agent strings can vary a lot, two computers can be configured identically, and there are plenty of circumstances where they are likely to be (e.g. in a company with a standard desktop install that includes standard versions of browsers and plugins).
The only thing you can do is to try to put as much data as possible in the cookie (browser user agent, OS, screen resolution, ...). You have to scramble/encrypt the data.
If you read the cookie again some time later, you can check whether the values still match. Of course, this is not a 100% safe solution, since all of this data can be faked by a malicious user (if they know exactly what they need to change).
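A hedged sketch of that "bind the cookie to browser data" idea: sign a hash of a few request attributes with a server-side secret and reject the cookie if the recomputed value no longer matches. The secret, the chosen fields and the cookie format here are all assumptions, and as noted above this only detects a copy when the two machines actually differ.

```python
# Sketch of binding a cookie to browser data via an HMAC (secret, fields and
# cookie format are invented for illustration; a copied cookie is only
# detected if the copying machine's fingerprint differs).
import hashlib
import hmac

SERVER_SECRET = b"change-me"  # placeholder secret, held only on the server

def fingerprint(user_agent: str, accept_language: str) -> str:
    data = f"{user_agent}|{accept_language}".encode()
    return hmac.new(SERVER_SECRET, data, hashlib.sha256).hexdigest()

def make_cookie_value(session_id: str, user_agent: str, accept_language: str) -> str:
    return f"{session_id}.{fingerprint(user_agent, accept_language)}"

def cookie_matches(cookie_value: str, user_agent: str, accept_language: str) -> bool:
    _session_id, _, fp = cookie_value.rpartition(".")
    expected = fingerprint(user_agent, accept_language)
    return hmac.compare_digest(fp, expected)
```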

Tracking and logging anonymous users

If you let anonymous users vote for any post on a site just one time and you log that vote by the user's IP, what's the likelihood that you'd be banning other users from voting and that the original user would be able to vote again after a certain amount of time because their IP address has changed? I'm guessing almost certainly.
Client-side cookies can be deleted, and server-side cookies again give you no way to reliably map the cookie back to the anonymous user.
Does this mean there is no reliable way of tracking anonymous users indefinitely?
Using only IP addresses for user authentication/identification is extremely unreliable. There might be many hundreds or even thousands of users behind one IP (e.g. a corporate network), and for most of those on home connections their IPs are likely to be dynamic and regularly changing.
You have to use cookies for more reliable tracking. You can specify just about any time-to-live for a cookie, so that when an anonymous user returns, you can identify them.
Of course cookies can be deleted by users, so they could delete their cookies and vote again. However, is this likely to be a big problem? If someone really wants to game your poll, they could write a script. You could, however, add a few basic security features: only allow some maximum number of votes per IP per day, and allow only so many votes per IP per second.
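A minimal sketch of that "maximum votes per IP per day" idea, combined with the long-lived cookie check (the limit, cookie flag and in-memory counters are assumptions for illustration; a real site would persist this):

```python
# Sketch of capping anonymous votes per IP per day (limit and in-memory
# counters are assumptions; a real site would persist this and pair it
# with a long-lived "already voted" cookie).
import datetime
from collections import defaultdict

MAX_VOTES_PER_IP_PER_DAY = 20

_votes_today = defaultdict(int)   # (ip, date) -> number of votes counted

def try_record_vote(ip: str, post_id: str, already_voted_cookie: bool) -> bool:
    if already_voted_cookie:
        return False                          # cookie says this browser already voted on the post
    key = (ip, datetime.date.today())
    if _votes_today[key] >= MAX_VOTES_PER_IP_PER_DAY:
        return False                          # per-IP daily cap reached (blunts simple script abuse)
    _votes_today[key] += 1
    return True
```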
If you let anonymous users vote for any post on a site just one time and you log that vote by the user's IP, what's the likelihood that you'd be banning other users from voting
Unless that page is extremely popular, it's very unlikely that someone else being assigned the same IP address by the ISP would also visit it.
Edit: Users sharing the same IP address due to NAT are a much bigger problem and probably a deal-breaker for using the IP address. I'd be less worried about corporate networks than about private home networks: they are very common, and having two people in the same household wanting to visit and vote on the same site is rather more likely than two random strangers doing so.
and that the original user would be able to vote again after a certain amount of time because their IP address has changed? I'm guessing almost certainly.
It's not just a question of time; most ISPs assign IP addresses upon connect, so all someone has to do to get a new one is to reinitialize their DSL connection (or whatever they use).
Does this mean there is no reliable way of tracking anonymous users indefinitely?
Correct.
Yes, there is no certainty in tracking IP addresses or using cookies.