what is a shapefile's measure value? - shapefile

I'm trying to write a GIS and am using shapefiles from kortforsyningen.dk.
My problem is that I can't find out what the m (measure) value of a vertex is.
I know the x value is east/west,
y is north/south,
z is the height (elevation),
but m, what's that? In physics it would be time or a fourth dimension, but neither of those fits the word "measure".
The documentation doesn't say; the first time the word is used, it just says "plus a m (measure) value" (page 10).
EDIT:
When I wrote "the documentation" I meant the shapefile documentation, this one:
http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf

m seems to be any value that you can assign to a point. E.g. you measure the temperature at specific measuring points; then x,y contain the geographic coordinates and m the temperature. Then there is the PointZ type, which contains x,y,z,m, which I understand as a 3D point with an assigned measure, e.g. temperature or air pressure.
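To make the role of m concrete, here is a minimal Python sketch of decoding the contents of a single PointM or PointZ record. It assumes the layout given in the ESRI shapefile description (a little-endian integer shape type followed by little-endian doubles, with shape type 21 for PointM and 11 for PointZ, and values below -1e38 meaning "no data"). The function name and the returned dictionary are hypothetical, and the record header that precedes the contents in a real .shp file is not handled.

```python
import struct

POINT_Z = 11           # shape type code for PointZ (x, y, z, m)
POINT_M = 21           # shape type code for PointM (x, y, m)
NO_DATA_LIMIT = -1e38  # values below this are treated as "no data"

def parse_point_record(content: bytes) -> dict:
    """Decode the contents of one PointM/PointZ record (hypothetical helper)."""
    (shape_type,) = struct.unpack_from("<i", content, 0)
    if shape_type == POINT_M:
        x, y, m = struct.unpack_from("<3d", content, 4)
        z = None
    elif shape_type == POINT_Z:
        x, y, z, m = struct.unpack_from("<4d", content, 4)
    else:
        raise ValueError(f"not a PointM/PointZ record: type {shape_type}")
    if m < NO_DATA_LIMIT:
        m = None  # the measure is optional; this vertex carries none
    return {"x": x, "y": y, "z": z, "m": m}

# Example: a station at (12.5, 55.7) with a measured temperature of 18.3
record = struct.pack("<i3d", POINT_M, 12.5, 55.7, 18.3)
print(parse_point_record(record))
```

The format itself attaches no meaning to m; what it stands for (temperature, distance along a route, and so on) is up to whoever produced the data.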


fortran beginner - writing variable to output file

I am starting to work with a CFD fortran program, and want to update the variables that it writes to an output file.
I want to output several columns, I and J coordinates(IL and JL), Water Surface Elevation (SURFEL), Bottom Elevation of coordinate (BELV), Depth of Water (HP) and finally, and this is where I have the question, the Maximum Water Surface Elevation of the coordinate during the simulation (SURFELMAX). L refers to a specific I,J coordinate, LA is the last coordinate in the simulation
So far I have:
DO L=2,LA
  SURFEL=BELV(L)+HP(L)
  IF (SURFEL.GT.SURFELMAX) THEN
    SURFELMAX=SURFEL
  ELSE IF (SURFELMAX.GT.SURFEL) THEN
    SURFELMAX=SURFELMAX
    WRITE(10,200)IL(L),JL(L),SURFEL,SURFELMAX
  ENDIF
ENDDO
Everything works ok other than the SURFELMAX, in which the highest recorded surface elevation that occurred in any coordinate in the whole domain is written for each coordinate, i.e. the column is filled with the same value, the highest experienced in the whole domain during the simulation.
Would I need to first allocate an array for SURFELMAX, and have SURFEL checked against it each time to see if it has increased? If so could somebody point me in the right direction for this?
If I understand the requirements correctly, then you want to calculate SURFELMAX before you start writing out. This could simply be:
SURFELMAX = MAXVAL(BELV(2:LA)+HP(2:LA))
WRITE(10,200) (IL(L), JL(L), BELV(L)+HP(L), SURFELMAX, L=2,LA)
(or even as a single line).
It appears I didn't understand correctly; I'll try again - keeping the above as a warning to others.
It seems that you do indeed want SURFELMAX(2:LA) where each element is the highest in a given cell to date.
do L=2, LA
  SURFELMAX(L) = MAX(SURFELMAX(L), BELV(L)+HP(L)) ! Store the historical maximum
  WRITE (10,200) IL(L), JL(L), BELV(L)+HP(L), SURFELMAX(L)
end do
where, initially, SURFELMAX has been set to a sufficiently small value. You could also explicitly calculate SURFEL if that is needed.
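The per-cell running maximum can also be illustrated outside Fortran. The following Python sketch uses made-up data (three cells, three time steps); the function name and sample values are hypothetical, and only the update rule mirrors the loop above.

```python
def update_running_max(surfel_max, belv, hp):
    """Update the per-cell historical maximum in place for one time step."""
    for cell in range(len(surfel_max)):
        surfel = belv[cell] + hp[cell]          # water surface elevation now
        surfel_max[cell] = max(surfel_max[cell], surfel)
    return surfel_max

belv = [1.0, 2.0, 0.5]                          # bottom elevation per cell
hp_per_step = [                                 # water depth per cell, per step
    [0.5, 0.5, 0.5],
    [1.0, 0.25, 0.5],
    [0.75, 0.25, 1.5],
]

# Start from a value smaller than any possible elevation
surfel_max = [float("-inf")] * len(belv)
for hp in hp_per_step:
    update_running_max(surfel_max, belv, hp)

print(surfel_max)
```

Each cell keeps its own maximum, so the output column no longer collapses to the single highest value in the domain.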
If this is time dependent, then you will have to define a 2-D array SURFELMAX of size (1:LA,1:T) (T = number of time steps, LA = number of active coordinates).
Then increment the time step (say, the iterator is called I_T) outside of the loop through the domain.
Finally, assign the maximum value at each coordinate to SURFELMAX(L,I_T).

PCA in Matlab - Are the Principal Components re-arranged?

I am trying to do a PCA on some volatility data, and let's just say I can propose a model as the following:
volatility = beta0 + beta1*x + beta2*x^2
where x are some observations, for example moneyness and so on.
So in Matlab, what I did was to set Y = [ones(size(x)) x x.^2] and then call pca(Y),
and for some reason the first row of my coefficient matrix is always something like 0 0 1, i.e., 0 everywhere except the last column, and the latent output always shows the highest value in the first row as well, no matter how I change the model.
Obviously, it cannot be the case that every single model is explained well by just the last term in the equation. And if I remove the constant term from Y (i.e., Y = [x x.^2]), then the first row of the coefficient matrix becomes something more normal (i.e., non-zero values everywhere).
So my questions are:
is my way of doing PCA right?
Does PCA automatically rearrange the principal component and hence the first row in the coefficient matrix with all zeros except 1 at the last column may not necessarily represent the last term in the equation and
if it is wrong, what is the correct way of doing it?
From Matlab's documentation for princomp:
COEFF = princomp(X) performs principal components analysis (PCA) on
the n-by-p data matrix X, and returns the principal component
coefficients, also known as loadings. Rows of X correspond to
observations, columns to variables. COEFF is a p-by-p matrix, each
column containing coefficients for one principal component. The
columns are in order of decreasing component variance.
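One likely explanation for the 0 0 1 row, offered as a sketch rather than a diagnosis: PCA centers each column before computing components, so a constant column has zero variance. The unit vector along that column is then a component with zero variance, and since columns of COEFF are ordered by decreasing variance it lands in the last column. The first row of COEFF belongs to the first variable (the constant), hence 0 0 1. The Python below only checks the zero-variance part with made-up data; the helper name is hypothetical.

```python
def covariance_matrix(columns):
    """Sample covariance of a list of equally long data columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centered = [[v - m for v in col] for col, m in zip(columns, means)]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
        for ci in centered
    ]

x = [0.8, 0.9, 1.0, 1.1, 1.2]                    # made-up observations
columns = [[1.0] * len(x), x, [v * v for v in x]]  # [ones, x, x^2]
cov = covariance_matrix(columns)

# The row and column of the constant term are all zeros, so (1, 0, 0) is an
# eigenvector of the covariance matrix with eigenvalue 0.
print(cov[0])
```

Dropping the constant column before calling pca, as the question already tried, avoids the degenerate component.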

How do you judge the (real world) distance of an object in a picture?

I am building a recognition program in C++ and to make it more robust, I need to be able to find the distance of an object in an image.
Say I have an image that was taken 22.3 inches away of an 8.5 x 11 picture. The system correctly identifies that picture in a box with the dimensions 319 pixels by 409 pixels.
What is an effective way for relating the actual Height and width (AH and AW) and the pixel Height and width (PH and PW) to the distance (D)?
I am assuming that when I actually go to use the equation, PH and PW will be inversely proportional to D and AH and AW are constants (as the recognized object will always be an object where the user can indicate width and height).
I don't know if you changed your question at some point, but my first answer is quite complicated for what you want. You can probably do something simpler.
1) Long and complicated solution (more general problems)
First you need to know the size of the object.
You can look at computer vision algorithms. If you know the object (its dimensions and shape), your main problem is pose estimation (that is, finding the position of the object relative to the camera); from this you can find the distance. You can look at [1] and [2] (for example; you can find other articles on it if you are interested) or search for POSIT and SoftPOSIT. You can formulate the problem as an optimization problem: find the pose that minimizes the "difference" between the real image and the expected image (the projection of the object given the estimated pose). This difference is usually the sum of the (squared) distances between each image point Ni and the projection P(Mi) of the corresponding object (3D) point Mi for the current parameters.
From this you can extract the distance.
For this you need to calibrate your camera (roughly, find the relation between the pixel position and the viewing angle).
Now, you may not want to code all of this yourself; you can use computer vision libraries such as OpenCV or Gandalf [3].
You may also want to do something simpler (and approximate). If you can find the image distance between two points at the same "depth" (Z) from the camera, you can relate the image distance d to the real distance D with d = a D/Z (where a is a parameter of the camera, related to the focal length and the number of pixels, that you can find using camera calibration).
2) Short solution (for your simple problem)
But here is the (simple, short) answer: if your picture is on a plane parallel to the "camera plane" (i.e. it is perfectly facing the camera), you can use:
PH = a AH / Z
PW = a AW / Z
where Z is the depth of the plane of the picture and a is an intrinsic parameter of the camera.
For reference, the pinhole camera model relates image coordinates m=(u,v) to world coordinates M=(X,Y,Z) with:
m ~ K M
[u]   [ au  as  u0 ] [X]
[v] ~ [ 0   av  v0 ] [Y]
[1]   [ 0   0   1  ] [Z]
which gives
u = au X/Z + as Y/Z + u0
v = av Y/Z + v0
where "~" means "proportional to" and K is the matrix of intrinsic parameters of the camera. You need to do camera calibration to find the K parameters. Here I assumed au=av=a and as=0.
You can recover the Z parameter from either of those equations (or take the average of both). Note that the Z parameter is not the distance to the object (which varies over the different points of the object) but the depth of the object (the distance between the camera plane and the object plane). But I guess that is what you want anyway.
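As a rough illustration of the short solution, the Python sketch below estimates a from the reference shot in the question (an 8.5 x 11 inch picture, 22.3 inches away, detected as 319 x 409 pixels) and then inverts PH = a AH / Z for a new detection. The function names are hypothetical, and the sketch assumes the picture faces the camera squarely, as stated above.

```python
def calibrate_scale(pixel_size, actual_size, depth):
    """Solve pixel_size = a * actual_size / depth for the camera parameter a."""
    return pixel_size * depth / actual_size

def depth_from_size(pixel_size, actual_size, a):
    """Solve pixel_size = a * actual_size / Z for the depth Z."""
    return a * actual_size / pixel_size

# Reference shot from the question: sizes in inches, detections in pixels
a_h = calibrate_scale(409, 11.0, 22.3)   # from the height
a_w = calibrate_scale(319, 8.5, 22.3)    # from the width
a = (a_h + a_w) / 2                      # average the two estimates

# A later shot in which the same picture appears half as tall
z = depth_from_size(204.5, 11.0, a_h)
print(round(a_h, 1), round(a_w, 1), round(z, 1))
```

The two estimates of a differ slightly, which is expected from detection noise and from the au=av assumption; averaging them is one simple way to reconcile the two.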
[1] Linear N-Point Camera Pose Determination, Long Quan and Zhongdan Lan
[2] A Complete Linear 4-Point Algorithm for Camera Pose Determination, Lihong Zhi and Jianliang Tang
[3] http://gandalf-library.sourceforge.net/
If you know the size of the real-world object and the angle of view of the camera, you can derive the distance with some trigonometry. Let alpha(*) be the horizontal half-angle of view and xres the horizontal resolution of the image. Then the distance dw to an object in the middle of the image that is xp pixels wide in the image and xw meters wide in the real world can be derived as follows (how is your trigonometry?):
# Distance in "pixel space" relates to distance in the real world
# (we take half of xres, xw and xp because we use the half angle of view):
(xp/2)/dp = (xw/2)/dw
dw = ((xw/2)/(xp/2))*dp = (xw/xp)*dp (1)
# we know xp and xw, we're looking for dw, so we need to calculate dp:
# we can do this because we know xres and alpha
# (remember, tangent = opposite/adjacent):
tan(alpha) = (xres/2)/dp
dp = (xres/2)/tan(alpha) (2)
# combine (1) and (2):
dw = ((xw/xp)*(xres/2))/tan(alpha)
# pretty print:
dw = (xw*xres)/(xp*2*tan(alpha))
(*) alpha = The angle between the camera axis and a line going through the leftmost point on the middle row of the image that is just visible.
Link to your variables:
dw = D, xw = AW, xp = PW
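The final formula translates directly into code. This Python sketch uses made-up numbers, since the question does not state the camera's angle of view; the function name is hypothetical, and alpha is the half-angle defined in the footnote, in radians.

```python
import math

def distance_from_fov(xw, xp, xres, alpha):
    """dw = (xw * xres) / (xp * 2 * tan(alpha)), with alpha in radians."""
    return (xw * xres) / (xp * 2 * math.tan(alpha))

# Made-up example: a 2 m wide object spanning 100 of 1000 pixels,
# seen by a camera with a 45 degree half-angle of view
dw = distance_from_fov(xw=2.0, xp=100, xres=1000, alpha=math.radians(45))
print(round(dw, 3))
```

The result is in the same unit as xw, so passing AW in inches yields D in inches.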
This may not be a complete answer but may push you in the right direction. Ever seen how NASA does it on those pictures from space, the way they have those tiny crosses all over the images? That's how they get a fair idea of the depth and size of the object, as far as I know. The solution might be to have an object whose size and depth you know in the picture and then calculate the others relative to that. Time for you to do some research. If that's the way NASA does it, then it should be worth checking out.
I have got to say this is one of the most interesting questions I have seen for a long time on Stack Overflow :D. I just noticed you have only two tags attached to this question. Adding something more related to images might help you.

How to create data from image like "Letter Image Recognition Dataset" from UCI

I am using the letter_recog example from OpenCV; it uses a dataset from UCI which has a structure like this:
Attribute Information:
1. lettr capital letter (26 values from A to Z)
2. x-box horizontal position of box (integer)
3. y-box vertical position of box (integer)
4. width width of box (integer)
5. high height of box (integer)
6. onpix total # on pixels (integer)
7. x-bar mean x of on pixels in box (integer)
8. y-bar mean y of on pixels in box (integer)
9. x2bar mean x variance (integer)
10. y2bar mean y variance (integer)
11. xybar mean x y correlation (integer)
12. x2ybr mean of x * x * y (integer)
13. xy2br mean of x * y * y (integer)
14. x-ege mean edge count left to right (integer)
15. xegvy correlation of x-ege with y (integer)
16. y-ege mean edge count bottom to top (integer)
17. yegvx correlation of y-ege with x (integer)
example:
T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8
I,5,12,3,7,2,10,5,5,4,13,3,9,2,8,4,10
Now I have a segmented image of a letter and want to transform it into data like this in order to recognize it, but I don't understand the meaning of all the values, e.g. "6. onpix total # on pixels". What does it mean? Can you please explain the meaning of these values? Thanks.
I am not familiar with OpenCV's letter_recog example, but this appears to be a feature vector, or set of statistics about the image of a letter that is used to classify the future occurrences of the letter. The results of your segmentation should leave you with a binary mask with 1's on the letter and 0's everywhere else. onpix is simply the total count of pixels that fall on the letter, or in other words, the sum of your binary mask.
Most of the remaining values in the list need to be calculated from the set of pixels with a value of 1 in your binary mask. x and y are just the position of the pixel. For instance, x-bar is just the sample mean of the x positions of all pixels that have a 1 in the mask. You should be able to easily find references on the web for the mathematical definitions of mean, variance, covariance and correlation.
14-17 are a little different since they are based on edge pixels, but the calculations should be similar, just over a different set of pixels.
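As a sketch of the first few statistics, the Python below computes raw, unscaled values from a binary mask given as a list of rows. It does not reproduce the UCI dataset's exact definitions or its scaling to integers; the function name and the returned keys are hypothetical.

```python
def mask_features(mask):
    """Raw statistics of the 'on' pixels of a binary mask (list of rows)."""
    on = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not on:
        raise ValueError("mask contains no 'on' pixels")
    onpix = len(on)                              # total number of on pixels
    x_bar = sum(x for x, _ in on) / onpix        # mean x position
    y_bar = sum(y for _, y in on) / onpix        # mean y position
    x_var = sum((x - x_bar) ** 2 for x, _ in on) / onpix
    y_var = sum((y - y_bar) ** 2 for _, y in on) / onpix
    xy_cov = sum((x - x_bar) * (y - y_bar) for x, y in on) / onpix
    return {"onpix": onpix, "x_bar": x_bar, "y_bar": y_bar,
            "x_var": x_var, "y_var": y_var, "xy_cov": xy_cov}

# A tiny upside-down "T" as a made-up example
mask = [
    [0, 1, 0],
    [0, 1, 0],
    [1, 1, 1],
]
features = mask_features(mask)
print(features)
```

Positions here are pixel indices within the mask, with y growing downward; a real pipeline would first crop the mask to the letter's bounding box.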
My name is Antonio Bernal.
On page 3 of this article you will find a good description of each value.
Letter Recognition Using Holland-Style Adaptive Classifiers.
If you have any doubt let me know.
I am trying to make this algorithm work, but my problem is that I do not know how to scale the values to fit them to the range 0-15.
Do you have any idea how to do this?
Another Link from Google scholar -> Letter Recognition Using Holland-Style Adaptive Classifiers

Designing a grid overlay based on longitudes and latitudes

I'm trying to figure out the best way to approach the following:
Say I have a flat representation of the earth. I would like to create a grid that overlays this with each square on the grid corresponding to about 3 square kilometers. Each square would have a unique region id. This grid would just be stored in a database table that would have a region id and then probably the long/lat coordinates of the four corners of the region, right? Any suggestions on how to generate this table easily? I know I would first need to find out the width and height of this "flattened earth" in kms, calculate the number of regions, and then somehow assign the long/lats to each intersection of vertical/horizontal line; however, this sounds like a lot of manual work.
Secondly, once I have that grid table created, I need to design a function that takes a long/lat pair and determines which logical "region" it is in. I'm not sure how to go about this.
Any help would be appreciated.
Thanks.
Assume the Earth is a sphere with radius R = 6371 km.
Start at (lat, long) = (0, 0) deg. Around the equator, 3km corresponds to a change in longitude of
dlong = 3 / (2 * pi * R) * 360
= 0.0269796482 degrees
If we walk around the equator and put a marker every 3km, there will be about (2 * pi * R) / 3 = 13343.3912 of them. "About" because it's your decision how to handle the extra 0.3912.
From (0, 0), we walk north 3 km to (lat, long) = (0.0269796482, 0). We will walk around the Earth again on a path that is locally parallel to the first path we walked. Because it is a little closer to the North Pole, the radius of this circle is a bit smaller than that of the first circle we walked. Let's use lower case r for this radius (lat is in degrees, so convert it to radians if your cos expects radians)
r = R * cos(lat)
= 6371 * cos(0.0269796482 deg)
= 6370.99929 km
We calculate dlong again using the smaller radius,
dlong = 3 / (2 * pi * r) * 360
= 0.0269796512 deg
We put down the second set of flags. This time there are about (2 * pi * r) / 3 = 13343.3897 of them, barely fewer than the 13343.3912 before. The loss per row grows as we move north, though: it is roughly 2 * pi * sin(lat) flags per row, so around latitude 53 degrees each line of flags has about five fewer than the line below it.
How do we draw a ribbon of squares when there are five fewer corners in the top line? In fact, as we walked around the Earth along such a ribbon, we'd find that we started off with pretty good squares, but that the shape of the regions sheared out into pretty extreme parallelograms.
We need a different strategy that gives us the same number of corners above and below. If the lower boundary (SW-SE) is 3 km long, then the top should be a little shorter, to make a ribbon of trapeziums.
There are many ways to craft a compromise that approximates your ideal square grid. The Wikipedia article on map projections that preserve a metric property links to several dozen such strategies.
The specifics of your app may allow you to simplify things considerably, especially if you don't really need to map the entire globe.
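One possible compromise, sketched in Python under stated assumptions: split the sphere into latitude bands 3 km tall and give each band its own whole number of cells, so every cell is a near-trapezoid about 3 km wide. This assumes cells with 3 km edges as in the walk-through above, a spherical Earth with R = 6371 km, and a (row, column) pair as the region id; all names are hypothetical. It also covers the second part of the question, mapping a lat/long pair to its region without storing the grid.

```python
import math

R_KM = 6371.0     # spherical Earth radius used above
CELL_KM = 3.0     # target edge length of a cell

DLAT_DEG = math.degrees(CELL_KM / R_KM)   # height of one band in degrees
N_ROWS = math.ceil(180.0 / DLAT_DEG)      # number of latitude bands

def cells_in_row(row):
    """Whole number of ~3 km cells that fit around the band's middle latitude."""
    lat_mid = -90.0 + (row + 0.5) * DLAT_DEG
    circumference = 2 * math.pi * R_KM * math.cos(math.radians(lat_mid))
    return max(1, round(circumference / CELL_KM))

def region_of(lat, lon):
    """Map latitude/longitude in degrees to a (row, column) region id."""
    row = min(int((lat + 90.0) / DLAT_DEG), N_ROWS - 1)
    n = cells_in_row(row)
    col = min(int((lon + 180.0) / 360.0 * n), n - 1)
    return row, col

print(cells_in_row(N_ROWS // 2))   # a band next to the equator
print(region_of(60.0, 10.0))
```

Because the cell count is rounded per band, corners of adjacent bands do not line up, which is the price for keeping all cells close to the same size.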
Microsoft has been investing in spatial data types in their SQL Server 2008 offering. It could help you out here, because it has data types to represent your flattened earth regions, operators to determine when a set of coordinates is inside a geometry, etc. Even if you choose not to use this, consider checking out the following links. The second one in particular has a lot of good background information on the problem and a discussion of some of the industry-standard data formats for spatial data.
http://www.microsoft.com/sqlserver/2008/en/us/spatial-data.aspx
http://jasonfollas.com/blog/archive/2008/03/14/sql-server-2008-spatial-data-part-1.aspx
First, Paul is right. Unfortunately the earth is round which really complicates the heck out of this stuff.
I created a grid similar to this for a topographical mapping server many years ago. I just recorded the coordinates of the upper left corner of each region. I also used UTM coordinates instead of lat/long. If you know that each region covers 3 square kilometers, and since UTM is based on meters, it is straightforward to do a range query to discover the right region.
You do realize that because the earth is a sphere that "3 square km" is going to be a different number of degrees near the poles than near the equator, right? And that at the top and bottom of the map your grid squares will actually represent pie-shaped parts of the world, right?
I've done something similar with my database - I've broken it up into quad cells. So what I did was divide the earth into four quarters (-180,-90)-(0,0), (-180,0)-(0,90) and so on. As I added point entities to my database, if the "cell" got more than X entries, I split the cell into 4. That means that in areas of the world with lots of point entities, I have a lot of quad cells, but in other parts of the world I have very few.
My database for the quad tree looks like:
\d areaids;
Table "public.areaids"
Column | Type | Modifiers
--------------+-----------------------------+-----------
areaid | integer | not null
supercededon | timestamp without time zone |
supercedes | integer |
numpoints | integer | not null
rectangle | geometry |
Indexes:
"areaids_pk" PRIMARY KEY, btree (areaid)
"areaids_rect_idx" gist (rectangle)
Check constraints:
"enforce_dims_rectangle" CHECK (ndims(rectangle) = 2)
"enforce_geotype_rectangle" CHECK (geometrytype(rectangle) = 'POLYGON'::text OR rectangle IS NULL)
"enforce_srid_rectangle" CHECK (srid(rectangle) = 4326)
I'm using PostGIS to help find points in a cell. If I look at a cell, I can tell if it's been split because supercededon is not null. I can find its children by looking for ones that have supercedes equal to its id. And I can dig down from top to bottom until I find the ones that cover the area I'm concerned about by looking for ones with supercededon null and whose rectangle overlaps my area of interest (using the PostGIS '&&' operator).
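The splitting rule described in this answer can be sketched in memory. The Python below is not the answerer's database code; it is a minimal point quadtree with hypothetical names, a made-up capacity, and a minimum cell width that stops endless splitting when many points coincide.

```python
class QuadCell:
    """One cell of a point quadtree over (lon, lat); bounds are half-open."""

    def __init__(self, west, south, east, north, capacity=4, min_size=1e-6):
        self.bounds = (west, south, east, north)
        self.capacity = capacity      # split once a cell holds more than this
        self.min_size = min_size      # never split cells narrower than this
        self.points = []
        self.children = None          # set once the cell has been split

    def contains(self, lon, lat):
        west, south, east, north = self.bounds
        return west <= lon < east and south <= lat < north

    def insert(self, lon, lat):
        if not self.contains(lon, lat):
            return False
        if self.children is not None:
            return any(child.insert(lon, lat) for child in self.children)
        self.points.append((lon, lat))
        west, _, east, _ = self.bounds
        if len(self.points) > self.capacity and (east - west) > self.min_size:
            self._split()
        return True

    def _split(self):
        west, south, east, north = self.bounds
        mid_lon, mid_lat = (west + east) / 2, (south + north) / 2
        quads = [(west, south, mid_lon, mid_lat), (mid_lon, south, east, mid_lat),
                 (west, mid_lat, mid_lon, north), (mid_lon, mid_lat, east, north)]
        self.children = [QuadCell(*q, capacity=self.capacity,
                                  min_size=self.min_size) for q in quads]
        pending, self.points = self.points, []
        for lon, lat in pending:      # push the stored points down one level
            any(child.insert(lon, lat) for child in self.children)

    def leaf_for(self, lon, lat):
        """Return the unsplit cell covering the point, or None if outside."""
        if not self.contains(lon, lat):
            return None
        if self.children is None:
            return self
        for child in self.children:
            leaf = child.leaf_for(lon, lat)
            if leaf is not None:
                return leaf
        return None

root = QuadCell(-180, -90, 180, 90, capacity=2)
for point in [(10, 10), (-10, -10), (100, 50)]:
    root.insert(*point)
print(root.leaf_for(10, 10).bounds)
```

Half-open bounds mean a point at exactly longitude 180 or latitude 90 falls outside the root cell; a real implementation would need to decide how to treat that edge.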
There's no way you'll be able to do this with rectangular cells, but I've just finished an R package dggridR which would make this easy to do using a grid of hexagonal cells. However, the 3km cell requirement might yield so many cells as to overload your machine.
You can use R to generate the grid:
install.packages('devtools')
install.packages('rgdal')
library(devtools)
install_github('r-barnes/dggridR')
library(dggridR)
library(rgdal)
#Construct a discrete global grid (geodesic); 'area' sets the target cell area, so adjust it to get cells of ~3 km^2
dggs <- dgconstruct(area=100000, metric=FALSE, resround='nearest')
#Get a hexagonal grid for the whole earth based on this dggs
grid <- dgearthgrid(dggs,frame=FALSE)
#Save the grid
writeOGR(grid, "grid_3km_cells.kml", "cells", "KML")
The KML file then contains the ids and edge vertex coordinates of every cell.
My package is based on Kevin Sahr's DGGRID which can generate this same grid to KML directly, though you'll need to figure out how to compile it yourself.