Bulk geocoding for associating addresses to shapefiles [here-api] - geocoding

I was wondering if I can use the here-api to geocode addresses (bulk way) and store the results. I need to store them on a python data frame to afterwards merge them with a shape file. However, in the TOS HERE says that you cannot store the results from geocoding.

I've been in your situation. Let me list the options that we've considered.
You could fetch address information dynamically using their Geocoding API. Advantage is that you'll always have up-to-date information, and you don't need to query places that you'll never use anyway.
I'm assuming that your shape file is based on HERE data. You could still try to use OpenStreetMaps. You'll get some inaccuracies if the map data differs, but it's free, and not as restrictive as HERE.
You could opt to buy the HERE map data, to work around the TOS restrictions
By the way, we went for an option 4: switching to OpenStreetMaps entirely, but it sounds like you don't have that luxery, because you have to work with existing shapefiles that you want to "enrich".

Related

what's the efficiency of qsqlite data base in Qt?

I am very new to database and I am trying to implement a offline map viewer. What would be the efficiency of the qsqldatabase?
To make it extreme, for example, is it possible to download all satellite image of all the detail levels of US from the google's map server and store it in a local sqlite database and still perform real time query based on my current gps location?
The Qt Database driver for SQLite uses SQLite internally (surprise!). So the question is more like: Is SQLite the right database to use? My answer: I would not use it to store geographical data, consider to look for a database which is optimized for this task.
If this is not an option; SQLite is really efficient. First check if your data is within the limits. Do not forget to create indexes and analyze the database. Then it should be able to handle your task. Here I assume you just want to get an image by its geographical position (but other solutions can be a lot faster because your data is sortable — if I remember correctly SQLite is not optimized for that).
As you will store large blobs, you may want to have a look at the Internal Versus External BLOBs in SQLite document. Maybe this gives you the answer already.

Updating a field in all records in elasticsearch

I'm new to ElasticSearch, so this is probably something quite trivial, but I haven't figured out anything better that fetching everything, processing with a script and updating the registers one by one.
I want to make something like a simple SQL update:
UPDATE RECORD SET SOMEFIELD = SOMEXPRESSION
My intent is to replace the actual bogus data with some data that makes more sense (so the expression is basically randomly choosing from a pool of valid values).
There are a couple of open issues about making possible to update documents by query.
The technical challenge is that lucene (the text search engine library that elasticsearch uses under the hood) segments are read only. You can never modify an existing document. What you need to do is delete the old version of the document (which by the way will only be marked as deleted till a segment merge happens) and index the new one. That's what the existing update api does. Therefore, an update by query might take a long time and lead to issues, that's why it's not released yet. A mechanism that allows to interrupt running queries would be a nice to have too for this case.
But there's the update by query plugin that exposes exactly that feature. Just beware of the potential risks before using it.

YAHOO! Place Finder API

Is there a way to get all of the POI (Points of Interest) and AOI (Areas of Interest) lets say for a specific state.
I would like to be able to auto-complete while someone is typing if it contains any of that data, but I don't know how to get a list of all that data.
I would even be good with storing it my database if need be, because I can't see it changing very often.
No, PlaceFinder itself does not provide a way to download its places database. It's only intended for individual lookups.
You might look at the GeoPlanet data downloads which make available under CC license the entire WOEID (Where on Earth ID) geo database.

Looking for Ideas: How would you start to write a geo-coder?

Because the open source geo-coders cannot begin to compare to Google's or even Yahoo's, I would like to start a project to create a good open source geo-coder. Just to clarify, a geo-coder takes some text (usually with some constraints) and returns one or more lat/lon pairs.
I realize that this is a difficult and garguntuan task, so I am wondering how you might get started. What would you read? What algorithms would you familiarize yourself with? What code would you review?
And also, assuming you were going to develop this very agilely, what would you want the first prototype to be able to do?
EDIT: Let's set aside the data question for now. I am going to use OpenStreetMap data, along with a database of waypoints that I have. I would later plan to include other data sets as well, and I realize the geo-coder would be inherently limited by the quality of the original data.
The first (and probably blocking) problem would be: where do you get your data from? (unless you are willing to pay thousands of dollars for proprietary sets).
You could build a geocoding-api on top of OpenStreetMap (they publish their data in dumps on a regular basis) I guess, but that one was still very incomplete last time I checked.
Algorithms are easy. Good mapping data, however, is expensive. Very expensive.
Google drove their cars all over the world, collecting this data among other things.
From a .NET point of view these articles might be interesting for you:
Writing Your Own GPS Applications: Part I
Writing Your Own GPS Applications: Part 2
Writing GIS and Mapping Software for .NET
I've only glanced at the articles but they've been on CodeProject's 'Most Popular' list for a long time.
And maybe this CodePlex project which the author of the articles above made available.
I would start at the absolute beginning by figuring out how you're going to get the data that matches a street address with a geocode. Either Google had people going around with GPS units, OR they got the information from some existing source. That existing source may have been... (all guesses)
The Postal Service
Some existing maps(printed)
A bunch of enthusiastic users that were early adopters of GPS technology who ere more than willing to enter in street addresses and GPS coordinates
Some government entity (or entities)
Their own satellites
etc
I guess what I'm getting at is the information was either imported from somewhere or was input by someone via some interface. As my starting point I would look at how to get that information. In an open source situation, you may be able to get a bunch of enthusiastic people to enter information.
So for my first prototype, boring as it would be, I would create a form for entering information.
Then you need to know the math for figuring out the closest distance (as the crow flies). From there, try to figure out how to include roads. (My guess is you would have to have data point for each and every curve, where you hold the geocode location of the curve, and the angle of the road on a north/south and east/west vector. You'd probably need to take incline into account, too to get accurate road measurements.)
That's just where I'd start.
But in all honesty, I wouldn't even start on this. Other programmers have done it already, I'm more interested in what hasn't already been done.
get my free raw data from somewhere like http://ipinfodb.com/ip_database.php
load it into a database, denormalizing for fast lookups
design my API
build it out as a RESTful web service
return results in varying formats: JSON, XML, CSV, raw text
The first prototype should accept a ZIP code and return lat/lon in raw text.

Web Service to return complex object with optional parts

I'm trying to think of the correct design for a web service. Essentially, this service is going to perform a client search in a number of disparate systems, and return the results.
Now, a client can have various pieces of information attached - e.g. various pieces of contact information, their address(es), personal information. Some of this information may be complex to retrieve from some systems, so if the consumer isn't going to use it, I'd like them to have some way of indicating that to the web service.
One obvious approach would be to have different methods for different combinations of wanted detail - but as the combinations grow, so too do the number of methods. Another approach I've looked at is to add two string array parameters to the method call, where one array is a list of required items (e.g. I require contact information), and the other is optional items (e.g. if you're going to pull in their names anyway, you might as well return that to me).
A third approach would be to add additional methods to retrieve the detail. But that's going to explode the number of round trips if I need all the details for potentially hundreds of clients who make up the result.
To be honest, I'm not sure I like any of the above approaches. So how would you design such a generic client search service?
(Considered CW since there might not be a single "right" answer, but I'll wait and see what sort of answers arrive)
Create a "criteria" object and use that as a parameter. Such an object should have a bunch of properties to indicate the information you want. For example "IncludeAddresses" or "IncludeFullContactInformation".
The consumer is then responsible to set the right properties to true, and all combinations are possible. This will also make the code in the service easier to do. You can simply write if(criteria.IncludeAddresses){response.Addresses = GetAddresses;}
Any non-structured or semi-structured data is best handled by XML. You might pass XML data via a string or wrap it up in a class adding some functionality to it. Use XPathNavigator to go through XML. You can also use XMLDocument class although it is not too friendly to use. Anyway, you will need some kind of class to handle XML content of course.
That's why XML was invented - to handle data which structure is not clearly defined.
Regards,
Maciej