Continuously decreasing accuracy of MaxMind GeoLite City - GeoIP

We've been using awstats together with MaxMind's GeoLite City (now GeoLite Legacy City) database for website statistics of a non-profit organization.
The percentage of unknown cities (those not found in the database) had long been around 15-20%. But since approximately August-September 2013 it has jumped to 40-50%, and these days (April 2014) it is around 60%, which is so high that the data are becoming unusable.
We download the up-to-date GeoLite City database every month.
Does anybody know why MaxMind GeoLite Legacy City has degraded that much in accuracy over the past months? Is it because of the new GeoLite2 City database format? We would consider moving to GeoLite2 City, but the current awstats lacks a plugin for that database.

If I were to take a guess, I would imagine that as IPv4 is exhausted, we are starting to see more "weird" IP addresses that until recently have been marked as unused, and generally seeing IP allocations shaken up, so databases may just decide that they have no information about that IP range.

Related

Do the server and database need to be in the same region for efficiency?

I have a social network website where users are able to upload their content. 80% of the users are in Malaysia and 15% in the USA. Should I place the server in Taiwan (midway between the USA & Malaysia) or Singapore (closer)? GCP doesn't have a database in Singapore, so should I place the server in SG and the DB in Taiwan?
Actually the answer to your question may vary. In your case, since most users are in a specific region, that is where the server should go. The number of hops and the bandwidth available to your users may only make a difference of milliseconds, or perhaps a second, but it will usually be faster.
Where this may vary is with the actual infrastructure in the country or place where the server farm is. However, since you are dealing with AWS or Google, I doubt you have anything to worry about.
So if the majority of your customers come from region X, it makes sense to use servers as close to that region as possible.

Legacy GeoLite City provides incorrect country

(I'm using MaxMind's databases)
I'm using SourceMod which is a game server management addon for Source Engine, which provides plugin support.
Two months ago, I added a plugin that does the following:
Check if the connecting user is from Israel. (Where I host the server, you're required to speak Hebrew in order to play)
If the user is not from Israel, check if his IP address is whitelisted. If it isn't, disallow the connection.
Now recently, one of the Israeli ISPs got new IP addresses (all starting with 77.138.*.*) which GeoLite shows as if they're French. So far I've got to whitelist 12 IP addresses because of this.
SourceMod has a third-party API that lets me use GeoLite2 City, GeoLite City (free ones) and GeoIPCity/GeoIPISP if I own them. For now I've tried using both of the free databases, sadly, no success.
I know MaxMind will update GeoLite again in 6 days. What I'm afraid of, is that I'm not going to achieve anything because MaxMind doesn't offer support for GeoLite, so I might be stuck with outdated databases that provide incorrect information.
What is a recently updated alternative that uses the same format as GeoLite (either legacy or new), or, is it 100% confirmed that those incorrect locations for IPs will be updated soon for GeoLite? I really don't feel like spending cash monthly just for GeoIP.
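The connection check described above can be sketched as follows. This is only an illustration in Python (a real SourceMod plugin would be written in SourcePawn), and `is_allowed`, `fake_lookup` and the `WHITELIST` name are hypothetical. The country lookup is passed in as a function, so the sketch does not assume any particular GeoIP library. It whitelists the whole `77.138.*.*` range as one CIDR block instead of twelve individual addresses, on the assumption that the entire range belongs to the Israeli ISP.

```python
import ipaddress

ALLOWED_COUNTRY = "IL"  # ISO 3166-1 alpha-2 code for Israel

# Hypothetical whitelist: the misclassified range from the question,
# written as a single network rather than individual addresses.
WHITELIST = [ipaddress.ip_network("77.138.0.0/16")]


def is_allowed(ip, lookup_country, whitelist=WHITELIST):
    """Return True if a connection from `ip` should be accepted.

    `lookup_country` is any callable that maps an IP string to an ISO
    country code, or None when the database has no entry for it.
    """
    if lookup_country(ip) == ALLOWED_COUNTRY:
        return True
    addr = ipaddress.ip_address(ip)
    # Fall back to the whitelist when the database says "not Israel".
    return any(addr in net for net in whitelist)


def fake_lookup(ip):
    """Stand-in for a real GeoIP query; mimics the wrong answer described."""
    return "FR" if ip.startswith("77.138.") else None


if __name__ == "__main__":
    for candidate in ("77.138.10.20", "8.8.8.8"):
        print(candidate, is_allowed(candidate, fake_lookup))
```

Keeping the whitelist as networks means a database error affecting a whole allocation needs one entry, and the entry can be dropped once the database is corrected.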

Is switching over to Amazon S3 for Drupal 7 image hosting worth it?

So I just have a quick question with regards to using amazon s3.
I have a small Drupal 7 site hosted on a VPS with not too much storage space. I put together the site for members of my School's Photographic Arts Committee to upload photos of School events and projects.
The full-quality photos are stored in a private folder on the server, and the images displayed on the site are watermarked 2048px width ones stored publicly.
I'm worried that I'm going to blow through my storage space very fast, and I fear that I'm going to blow my not-really-existent budget on using Amazon S3 with the module in Drupal.
So, I would like to know whether Amazon S3 is a worthwhile investment; I'd be willing to spend about $5 on it.
My monthly usage will include 3 GB worth of uploads and probably 20 GB max of downloads, obviously slowly increasing.
Also, I'm a bit confused about storage billing: do I have to pay for, say, my 50 GB worth of storage from uploads in previous months, or just the 3 GB of storage I used this month?
PS: I live in South Africa and will probably use the Ireland S3 servers as they have the best latency.
Any feedback much appreciated!
Thanks.
S3 may be a good option in your case, given your limited storage space.
You can calculate things fairly easily. The 'requests' charge is tiny at low volume, but for completeness here's the formula for Ireland:
(GB of storage * 0.03) + (avg image size in GB * requests * 0.09) + (requests * 0.004/10000)
There are some volume discounts and some "first N transfer free", but this is a good ceiling, especially for a low-volume site as you mention. Also note that storing the full-size images (and not downloading them) means only the first term of the formula matters for them. As an example, suppose you have 5 GB in full-size images plus another 1 GB in 350 KB "2048px" images that sum to 10,000 image views per month:
full-size: 5*.03=.15
2048 hosting/downloads: (1*.03)+(0.00033*10000*.09)+(10000*.004/10000)=0.331
So, your monthly costs are about 50 cents.
What happens if your site is slashdotted? Imagine you get 10 million hits:
full-size: 5*.03=.15
2048 hosting/downloads: (1*.03)+(0.00033*10000000*.09)+(10000000*.004/10000)=301.03
So, your monthly cost is now over $300. (this is why billing alarms are important!)
Now, let's imagine you put CloudFront in front of S3 (which is a really good idea for several reasons) and look at the pricing in this scenario. (I've simplified the pricing here a little bit, and assumed nothing is loaded twice by the same browser, so no caching.)
full-size: 5*.03=.15
2048 hosting/downloads: (1*.03)+(0.00033*10000000*.085)+(10000000*.009/10000)=289.53
So it saved about $11.50 and gave you better performance.
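The arithmetic in these scenarios can be reproduced with a small script. This is a minimal sketch, not an official calculator: the rates are the ones used in the worked examples above (with $0.004 per 10,000 S3 requests, as in the 10-million-hit scenario), they are not current AWS pricing, and `monthly_cost` is a hypothetical helper name.

```python
# Rates taken from the worked examples above (USD); illustrative only.
STORAGE_PER_GB = 0.03         # per GB stored per month
S3_TRANSFER_PER_GB = 0.09     # per GB downloaded directly from S3
S3_PER_10K_REQUESTS = 0.004   # per 10,000 requests to S3
CF_TRANSFER_PER_GB = 0.085    # per GB downloaded through CloudFront
CF_PER_10K_REQUESTS = 0.009   # per 10,000 requests through CloudFront


def monthly_cost(stored_gb, avg_image_gb, requests,
                 transfer_per_gb=S3_TRANSFER_PER_GB,
                 per_10k_requests=S3_PER_10K_REQUESTS):
    """Estimate one month's bill as storage + transfer + request charges."""
    storage = stored_gb * STORAGE_PER_GB
    transfer = avg_image_gb * requests * transfer_per_gb
    request_fee = requests * per_10k_requests / 10_000
    return storage + transfer + request_fee


if __name__ == "__main__":
    stored = 5 + 1        # GB: full-size originals plus 2048px copies
    image_gb = 0.00033    # roughly 350 KB per served image

    print(f"quiet month:     ${monthly_cost(stored, image_gb, 10_000):.2f}")
    print(f"10M hits, S3:    ${monthly_cost(stored, image_gb, 10_000_000):.2f}")
    print("10M hits, CDN:   $%.2f" % monthly_cost(
        stored, image_gb, 10_000_000,
        transfer_per_gb=CF_TRANSFER_PER_GB,
        per_10k_requests=CF_PER_10K_REQUESTS))
```

Each total includes the $0.15 for the full-size images, so the results are the per-line figures above plus that storage charge.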
If you need more features (image resizing, for instance), you may want to consider a photo host like Flickr or Smugmug. They pay for bandwidth, which makes your costs more predictable.

MaxMind's GeoIP database

I found that the accuracy of MaxMind's GeoIP database is 99.5% (free) or 99.8% (commercial), as published on their website. Does anybody know what accounts for the remaining 0.5% and 0.2%?
Are they newly assigned IP addresses, or actual addresses that change their countries?
I feel that my question is not very clear but any answers are welcome.
A bit late, but the MaxMind GeoIP Country page spells this out a bit.
Basically the paid version corrects for AOL, and "some" anonymous proxies / satellite providers.
I get the sense that AOL is the biggest issue. For the free version, all AOL users show up in the US, while the paid version correctly identifies AOL users in Great Britain, France, Germany, and Brazil. I don't know how many people use AOL in Europe these days, so I'm not sure how big of an issue this is.
Regarding what "some" means regarding the anonymous proxies or satellite providers, I have no idea.

UK Royal Mail PAF address finder via postcode alternatives? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
We need an address finder (premise level) based on postcode. We have a budget of 40k for this, but I have been asked to find cheaper alternatives to the Royal Mail PAF database. Is Google any good at finding premise-level addresses when you send a full postcode? Are there any recommendations over the Royal Mail PAF file, or any web services out there that can accomplish this? Please share your knowledge.
Cheers,
Naren
We use products from AFD for this, they work well for us.
Edit: just saw "Best way to geocode UK postcode with Google Maps API?" on the front page.
In the UK the government has said that PAF data should be made free[1]. I'm painfully aware of the almost extortionate way in which Royal Mail operates.
Having worked with the Royal Mail PAF API, I know a 'friend' (wink wink) who created a class wrapper around the APIs. This 'friend' of mine built a custom importer that automatically ripped all the PAF data into an MS SQL database. Since the data import, he no longer needs to renew his licences because he is no longer using PAF data.
This may be something you could do as well: buy the data once and import it.
As for data changes, you could buy the data again every few years (e.g. every 2-3 years) and update your existing data.
[1] Damn It! guess I was wrong, http://www.guardian.co.uk/technology/2010/jan/22/postcode-petition-fails-blocked-number-ten
I work as the integrations specialist for Postcode Anywhere (we are one of the leading Royal Mail PAF resellers). Address capture doesn’t have to be expensive – and you don’t have to sacrifice reliability for an affordable service. Postcode Anywhere can be licensed either on a simple credit pack based system or on an annual basis, and you can be up and running in 10 minutes using our JavaScript client. If you are looking to create a more bespoke integration we also have an array of web services and code samples to help you.
If you want to have a play around with the service to see what you think we will be more than happy to provide you with a free trial. A full run-down of all of our products and services can be found here: http://www.postcodeanywhere.co.uk/products.
I work for CraftyClicks.
There are a few PAF resellers around. The data is all the same, prices can vary significantly. Best to spend a few minutes browsing the various sites.
At CraftyClicks our focus is on uptime/availability and keeping the price of PAF data reasonable - at high volumes the price falls to well below 1 penny a click.
Our address lookup web service can be integrated client side via JavaScript or server side via XML.
Let us know your requirements (adam at craftyclicks.co.uk) - you shouldn't be spending anywhere near 40k for this!
Adam.
The base PAF data is the same, but a lot of value is put into adding information that is not included in PAF to help with real-time and batch address matching in products based on PAF. We have a lot of locality information that is not included within PAF but that people tend to use within their address.
As to updates, there are thousands of changes every month, so it's vital that you use a source that has regular updates to the PAF data and also to associated files, such as business and consumer names data, that also help in the matching process.
Have a look at our site www.capscan.com for both UK and International data quality with services delivered either installed or as a web service.
You can also contact us on 0207 428 1255