What data do the areaCode and metroCode columns contain in MaxMind? - geoip

As I understand it, the area code is the regional telephone code, right?
What does the metro code column stand for, then?

The metro code is a DMA (Designated Market Area) code, such as "803" for Los Angeles, as listed in the Google AdWords documentation. The area code is, indeed, the telephone prefix that follows the country code.
https://developers.google.com/adwords/api/docs/appendix/cities-DMAregions

A metro code is a zip code like in the states, e.g. 24100

Related

How do I match a group of text under a title that changes

I'm new to regex, and everyone here has been an awesome resource for help, but I'm running up against a wall: no matter what I try, I can't seem to get the grouping to work.
I'm looking to match the name of the room and the products and services that belong to that room. The number of rooms can vary, as can their names. The description of the product or service may change, but the line will always start with "Product" or "Service".
If anyone can point me in the right direction, it would be truly appreciated.
Master Bedroom
Product description of the product
Product description of the product
Service description of the service
Kitchen
Product description of the product
Services description of the service
You will probably get better results if you can use a regex alongside a bit of postprocessing. For example, the following regex will match all of the service/product lines:
(Product|Service[s]?)(.*)
But you will still need to get the name of the header. You could perhaps start with something like this:
(.*)\n((Product|Service[s]?)(.*)\n)+
In which case your capturing groups will include the name of the heading and then ALL of the lines in that section; you can then split and process each with the first regex I provided.
If you're able to share which programming language/tool you're using to run this processing, I can help you write the code to split the data correctly from the first regex.
You can look at this regex in action at regexr:
For the input string:
Master Bedroom
Product Bedknobs, cheap
Product Beautiful carpet polish
Service Free pillow sharpening
Kitchen
Product Sink grease
Services Inexpensive cucumber delivery
You will get the following groups:
Master Bedroom
Product Bedknobs, cheap
Product Beautiful carpet polish
Service Free pillow sharpening
and
Kitchen
Product Sink grease
Services Inexpensive cucumber delivery
[edit] note that this regex WILL capture the "Product/Service" string as its own group... Figured you could always throw it away if you didn't need it, but didn't hurt to have access to it after parsing :)
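Since the question does not name a language, here is a minimal sketch of the "regex plus postprocessing" idea in Python (the choice of Python is an assumption). It uses a slightly stricter variant of the section regex, so that a heading line cannot itself start with "Product" or "Service". The names SECTION_RE, ITEM_RE and parse_rooms are made up for illustration.

```python
import re

# A heading line (anything not starting with Product/Service/Services),
# followed by one or more Product/Service(s) lines.
SECTION_RE = re.compile(
    r"^(?!(?:Product|Services?)\b)(.+)\n"
    r"((?:(?:Product|Services?)\b.*(?:\n|$))+)",
    re.MULTILINE,
)

# One item line: the kind ("Product", "Service" or "Services") and its description.
ITEM_RE = re.compile(r"^(Product|Services?)[ \t]+(.*)$", re.MULTILINE)


def parse_rooms(text):
    """Return {room name: [(kind, description), ...]} in document order."""
    rooms = {}
    for section in SECTION_RE.finditer(text):
        room = section.group(1).strip()
        body = section.group(2)
        # Second pass: split the captured block into individual items.
        rooms[room] = [(kind, desc.strip()) for kind, desc in ITEM_RE.findall(body)]
    return rooms


if __name__ == "__main__":
    sample = (
        "Master Bedroom\n"
        "Product Bedknobs, cheap\n"
        "Service Free pillow sharpening\n"
        "Kitchen\n"
        "Product Sink grease\n"
    )
    for room, items in parse_rooms(sample).items():
        print(room, items)
```

The second pass is needed because a repeated capturing group only keeps its last repetition in most regex engines, so the item lines have to be pulled out of the captured block separately.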

Excel: how to iterate and add a counter to duplicate entries

I have an Excel list that contains a number of duplicate entries, e.g.:
[image: example list with duplicate entries]
What I want to do is find each duplicate and append a counter to it, so that the result looks something like this:
[image: the same list with a counter appended to each duplicate]
I have absolutely no idea where to start with this.
Any ideas!?
Put this in B2:
=IF(A2=A1,A2&"."&COUNTIF($A$1:A2,A2)-1,A2)
And drag down. Let me know if it helps...
Thanks a lot to Black.Jack! You were right all along!
My own stupidity was in my way. ;) I am running a German version of LibreOffice, which brings two subtle but important changes. First, function names such as IF must be written in German (as a colleague pointed out ;)).
Second, the parameters must be separated by ; and not by , in LibreOffice.
So the working formula for a German LibreOffice version is this one:
=WENN(A2=A1;A1&"."&ZÄHLENWENN($A$1:A2;A2)-1;A2)
again: Thank you very much Black.Jack. Wouldn't have figured it out without your help!!!
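To check the logic outside a spreadsheet, the formula can be mirrored in a few lines of Python. This is only a sketch, and number_duplicates is a hypothetical helper name. Like the formula, it compares each value with the one directly above it, so the list should be sorted so that duplicates sit next to each other.

```python
def number_duplicates(values):
    """Mimic =IF(A2=A1, A2 & "." & COUNTIF($A$1:A2, A2)-1, A2) for each row."""
    seen = {}        # running COUNTIF: occurrences of each value so far
    previous = None  # value in the row above
    result = []
    for value in values:
        seen[value] = seen.get(value, 0) + 1
        if value == previous:
            # Same as the row above: append ".<occurrences so far - 1>".
            result.append(f"{value}.{seen[value] - 1}")
        else:
            result.append(value)
        previous = value
    return result


if __name__ == "__main__":
    print(number_duplicates(["a", "a", "a", "b", "c", "c"]))
```

The first occurrence of each value stays unchanged, the second gets ".1", the third ".2", and so on.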

MaxMind's GeoIPCity for a single country only?

Recently I have stumbled upon a problem - MaxMind's GeoIPCity file is way too big for our needs and contains A LOT of data we don't need and won't need.
The question is: is there a way to limit the City database to a single country? Let's say, Canadian cities only?
You cannot download a database for Canadian cities only, but you can certainly prune the full database once you have downloaded and loaded it. This is true whether you use the MaxMind DB format or the CSV format. In the CSV case, just trim out the lines whose country code or geoname_id (depending on whether you use v1 or v2 of the dataset) does not represent Canada.
If you identify your specific coding environment and language, I'm certain someone can help you write a few lines of code that chops out all the fat.
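As a rough illustration of the CSV pruning described above, here is a sketch in Python. It assumes the v2 layout, where a locations file maps geoname_id to country_iso_code and a blocks file maps networks to geoname_id. Those column names are assumptions, so check them against the header row of the files you actually downloaded. The function names are hypothetical.

```python
import csv


def country_geoname_ids(locations_file, iso_code="CA"):
    """Collect the geoname_id of every location row in the given country."""
    reader = csv.DictReader(locations_file)
    # Column names below are assumed; adjust them to your file's header.
    return {row["geoname_id"] for row in reader if row["country_iso_code"] == iso_code}


def filter_blocks(blocks_file, out_file, keep_ids):
    """Copy only the network rows whose geoname_id is in keep_ids."""
    reader = csv.DictReader(blocks_file)
    writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:
        if row["geoname_id"] in keep_ids:
            writer.writerow(row)
            kept += 1
    return kept


# Hypothetical usage (file names are placeholders, not real download names):
# with open("locations.csv", newline="") as loc:
#     ids = country_geoname_ids(loc, "CA")
# with open("blocks.csv", newline="") as src, open("blocks-ca.csv", "w", newline="") as dst:
#     print(filter_blocks(src, dst, ids), "rows kept")
```

Both functions take open file objects rather than paths, which keeps the sketch easy to try on small in-memory samples before running it on the full download.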

Matching self reported and official titles

I am currently racking my brain over a problem in SAS: matching a dataset of self-reported job titles (which are messy, incoherent and full of spelling errors) against a standardised list of official titles.
In condensed form, let's say I have the following list of standardised titles:
machine engineer
machine assistant
machine mechanic
machine operator
And a snapshot of the self-reported titles with some of the common problems:
Machine engineer
machine engineer at ABC machine company
mechanic for agricultural machines
mchaine assistant
machine operator/conductor
Often these self-reported titles contain a company name or a sector that is not of interest, as well as spelling errors. At first I was thinking of fuzzy matching, but with the COMPGED function the edit distance would be very high for entries that, for example, contain the company name. The full problem has approximately 1,400 standardised job titles and more than 170,000 distinct self-reported titles, coming from a total of roughly 1 million entries. Obviously it would be too optimistic to expect that all titles could be matched, but any method of getting closer would be of great help. What I am aiming for would essentially look like this:
ID  self_reported                              standardised
1   Machine engineer                           machine engineer
2   machine engineer at ABC machine company    machine engineer
3   mechanic for agricultural machines         machine mechanic
4   mchaine assistant                          machine assistant
5   machine operator/conductor                 machine operator
Is there any method that is able to somewhat deal with the multitude of problems that can arise in this matching problem?
Thank you!
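One possible direction is to compare titles word by word instead of as whole strings, so that extra words such as a company name do not inflate the distance. The sketch below shows the idea in Python rather than SAS, using only the standard library. The scoring rule, the 0.8 threshold and all function names are assumptions made for illustration, not a tested solution for the full dataset.

```python
import re
from difflib import SequenceMatcher


def tokens(title):
    """Lower-case the title and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", title.lower())


def score(self_reported, standard):
    """Average, over the words of the standard title, of the best
    similarity to any word in the self-reported title (0.0 to 1.0)."""
    reported = tokens(self_reported)
    wanted = tokens(standard)
    if not reported or not wanted:
        return 0.0
    best = [max(SequenceMatcher(None, w, r).ratio() for r in reported) for w in wanted]
    return sum(best) / len(best)


def best_match(self_reported, standards, threshold=0.8):
    """Return the highest-scoring standard title, or None if below threshold."""
    top = max(standards, key=lambda s: score(self_reported, s))
    return top if score(self_reported, top) >= threshold else None


if __name__ == "__main__":
    standards = ["machine engineer", "machine assistant",
                 "machine mechanic", "machine operator"]
    for title in ["mchaine assistant", "mechanic for agricultural machines"]:
        print(title, "->", best_match(title, standards))
```

Because only the words of the standard title are scored, unrelated words in the self-reported title are ignored, and word order does not matter. Comparing every self-reported title against every standard title is slow at the scale described, so some pre-filtering, for example exact matches first, would probably be needed.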

How to distinguish a NY "Queens-style" street address from a ranged address, and an address with a unit#

I need to distinguish among a Queens-style address, a valid ranged address, and an address with a unit#. For example:
1) Queens style: 123-125 Some Street, NY
2) Ranged address: 6414-6418 37th Ln SE, Olympia, WA 98503
3) Address with unit#: 1990-A Gildersleeve Ave, Bronx, NY.
In the case of #3, A is a unit# at street address 1990. The unit# might be a number as well, for example 1990-12. A ranged address identifies a range of addresses on a street, not a unique deliverable address.
So, the question is, is there an easy way to identify the Queens style address from the other cases?
---- UPDATE ---
Thanks, all. From your answers, it seems that there is no easy way to do this. I basically need to know if a street address in the form ABCD-WXYZ is a Queens-style address pointing to a single property, or if it is a ranged address.
How about some followup questions:
1) Are all addresses in NY City of the form ABCD-WXYZ?
2) Are there any other places in the US where this style of addressing is used? Wikipedia seems to imply that there are, but does not give any examples.
This is from the memory of growing up there, so beware:
An address like
198-16 100th Avenue, Hollis, NY, 11423
Can be deciphered first by deciding whether the 11423 zip code is in Queens. If not, then punt.
Next, it says "100th Avenue". That implies that the "198" is referring to "198th Street": Streets always run North to South, and Avenues always run East to West. You get some interesting things with "Road" and "Place" and such, but "Place" is a "Street", and I believe that "Road" is an "Avenue".
To find the building, start at 198th Street, on the South side (even numbers), and start counting. You'll find that 198-16 is on the corner of 199th Street and 100th Avenue, just like it was when I lived there, because if it was on the other side of 199th street, it would have been 200-something.
As to how to distinguish, you could start by applying the above rules, and seeing if you come up with something that makes sense. Maybe the Street never intersects the Avenue? Maybe the numbers don't go up that high (I don't believe there is a 300th Street, and I'm not sure about a 300th Avenue). Maybe the building number is too high (you'd live on a very long street if you lived at 198-200 100th Avenue, especially because the distance from 198th Street to 199th Street on 100th Avenue isn't very great: it's a short block in that direction).
Unfortunately addresses don't have enough to "verify themselves" like a mod 10 checksum on a credit card. This means that without external information, there really is no way to know for sure how the address is supposed to look in a standardized format as compared to the original, unprocessed input format.
This is where something like an address verification web service would come into play. For a few dollars a month (usually about $20) you can verify your address database and clean it up, and also prevent bad or duplicate addresses from getting into your system and spreading through it like a cancer. Most address validation web services will standardize the format of the address and expose its component parts, so you can do additional processing or inspection.
Just so you are aware, I'm the founder of SmartyStreets. We offer an address verification web service API called LiveAddress. You're more than welcome to contact me personally with questions about addresses whether you're a customer or not. I'm more than happy to help.
Well, you would know that the second address isn't in Queens, because the X-Y format is based on the streets and avenues of the borough. There aren't 6414 avenues or streets in Queens (there are fewer than 280 of each). The house number shouldn't go much over 100, because the numbers reset at every numbered cross street or avenue, so X and Y would rarely have the same number of digits. Ultimately, though, a valid address would include the house number, street name, city, state/province if available, zip code or address code, and country if international; otherwise the mail won't be sent. So you would rarely be given just the house number unless the other information were clearly implied.
The system was created to avoid confusion with the other boroughs before we had the ZIP code system. There are some locations in Astoria for which, if you don't give a zip code or a neighborhood name (or use the dash system), Google Maps will point to Long Island City, and that's within the same borough. No other borough in the city uses this system; it's just a Queens gem. Outside of the state, however, I believe (don't quote me on this) that Philadelphia uses this system. I know this is an old post, but I just saw it and wanted to give my two cents.
Generally, you can't distinguish between these address styles without additional information. Fortunately, the remainder of the address provides some clues as to which style is in use.
Your first example is a Queens style address. Knowing that the address is in NY, and knowing that it has a specific street name, you might be able to infer that it's in Queens, and treat accordingly. If you had the ZIP code, that would be even better, because then you could restrict treatment of Queens style addresses to only those that have specific ZIP codes.
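The clues mentioned in these answers (a letter after the hyphen, the ZIP code, the number of digits on each side) can be combined into a rough first-pass classifier. The Python sketch below is heuristic only. The ZIP prefixes listed for Queens are an illustrative assumption and should be replaced with an authoritative list, and the function and label names are hypothetical.

```python
import re

# Leading "X-Y" house number, e.g. "198-16", "6414-6418" or "1990-A".
HOUSE_RE = re.compile(r"^\s*(\d+)-(\w+)\b")

# ASSUMPTION: illustrative ZIP prefixes for Queens; verify before relying on them.
QUEENS_ZIP_PREFIXES = ("111", "113", "114", "116")


def classify(address, zip_code=None, queens_prefixes=QUEENS_ZIP_PREFIXES):
    """Return 'plain', 'unit', 'queens', 'range' or 'unknown' (best guess)."""
    match = HOUSE_RE.match(address)
    if not match:
        return "plain"      # no hyphenated house number at all
    first, second = match.groups()
    if not second.isdigit():
        return "unit"       # e.g. 1990-A: a letter can only be a unit
    if zip_code is None:
        return "unknown"    # number-number with no external information
    if zip_code.startswith(queens_prefixes):
        return "queens"
    # Outside Queens: an ascending pair with equal digit counts looks like a range.
    if len(first) == len(second) and int(second) > int(first):
        return "range"
    return "unit"           # e.g. 1990-12


if __name__ == "__main__":
    print(classify("198-16 100th Avenue, Hollis, NY, 11423", "11423"))
    print(classify("6414-6418 37th Ln SE, Olympia, WA 98503", "98503"))
    print(classify("1990-A Gildersleeve Ave, Bronx, NY"))
```

A number-number address with no ZIP code is deliberately reported as "unknown", which matches the point above that the house number alone cannot settle the question.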