Python normalise data when number of rows are not equal

I was wondering if anyone has a way of normalising a column when the row counts are not equal (see attached picture). For example, in my data I need to divide S1 by S2. However, S1 includes data for multiple points: (0,0), (0,1), (1,0) and (2,0).
If I just divide S1/S2, I will get normalised data only for (0,0), which is S1(0,0)/S2(0,0). I would like to normalise each data point to S2:
S1(0,0)/S2(0,0)
S1(0,1)/S2(0,0)
S1(1,0)/S2(0,0)
S1(2,0)/S2(0,0)
Sorry if it is too simple but I have just started to learn Python.
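The four divisions listed above can be sketched in plain Python. The layout of the real data is only visible in the missing picture, so the dictionaries and values below are hypothetical; the point is simply that every S1 entry is divided by the single S2 reference value.

```python
# Hypothetical data: S1 is measured at several (row, col) points,
# S2 only at the reference point (0, 0).
s1 = {(0, 0): 8.0, (0, 1): 6.0, (1, 0): 4.0, (2, 0): 2.0}
s2 = {(0, 0): 2.0}

# Divide every S1 value by the one S2 reference value.
reference = s2[(0, 0)]
normalised = {point: value / reference for point, value in s1.items()}

for point, value in normalised.items():
    print(point, value)
```

If the data lives in a table library instead, the same idea applies: pick out the single S2 value first, then divide the whole S1 column by that scalar.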

Related

What does correlation filtering actually do?

In the slide, $G(i, j)$ is the sum of the values of all these different colours. But what does $F(u, v)*I(i, j)$ represent? And what is $G(i, j)$ as well?

G is the output image (on the right side)
F is the filter kernel (the 5x5 image hovering over I)
I is the input image (the image on the left)
So every output pixel (i,j) is set to the value G(i,j), which is calculated by the given formula.
u,v are coordinates within F, so F(u,v) is a value of the filter kernel.
You basically sum up the pixel-wise products of the values of your input and your filter array.
The filter is moved across the image, and for every pixel you calculate G(i,j) using only the pixels of I that lie under F. At the end you have a new image G that consists of those calculated values.
Read this for further info:
http://www.cs.umd.edu/~djacobs/CMSC426/Convolution.pdf
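The sliding-window sum described above can be written out directly. The slide's formula is not reproduced here, so this sketch assumes the usual correlation definition G(i,j) = sum over u,v of F(u,v) * I(i+u, j+v), with u and v running from -k to k for a (2k+1)x(2k+1) kernel. Pixels outside the image are treated as zero, which is also an assumption; the function name is illustrative.

```python
def correlate2d(image, kernel):
    """Correlate a 2D image (list of rows) with an odd-sized square kernel.

    Positions where the kernel hangs over the image border contribute zero.
    """
    k = len(kernel) // 2
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            # Sum the pixel-wise products under the kernel centred on (i, j).
            for u in range(-k, k + 1):
                for v in range(-k, k + 1):
                    r, c = i + u, j + v
                    if 0 <= r < rows and 0 <= c < cols:
                        total += kernel[u + k][v + k] * image[r][c]
            out[i][j] = total
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
box = [[1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]]
result = correlate2d(image, box)
```

With this all-ones kernel each output pixel is the sum of its neighbourhood; dividing the kernel entries by 9 would turn it into a mean (blur) filter.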

Fortran - Point in STL

I am trying to fill an STL file with points in Fortran. I have written some basic code but it is not working.
My method has been to use a random number generator to generate a point. I then scale this point to the dimensions of the STL bounding box.
I then ignore the "z" coordinate of the first triangle in the STL and check whether the random point lies within the max and min values of the "x" and "y" coordinates of that triangle. If so, I project the random point vertically onto the triangle's plane and calculate the "z" value at which it intersects the plane. I then check whether the z value of the random point is less than the z value of the projected point (ray casting). If yes, I increase a counter, which is initially set to zero, by one.
I do this for every triangle in the STL. If the counter is even the random point is outside the volume, if it is odd the random point is inside the volume and the point is stored.
I then generate a new random point and start again. I have included the important code below. Apologies for the length (lots of comments and blank lines for readability).
! Set initial counter for validated points
k = 1
! Do for all randomly generated points
DO i=1,100000
! Create a random point with coordinates x, y and z.
CALL RANDOM_NUMBER(rand)
! Normalise the random coordinates to the bounding box.
rand(1:3) = (rand(1:3) * (cord(1:3) - cord(4:6))) + cord(4:6)
! Set the initial counter for the vertices
j = 1
! Set the number of intersections with the random point and the triangle
no_insect = 0
! Do for all triangles in STL
DO num = 1, notri
! Get the maximum "x" value for the current triangle
maxtempx = MAXVAL(vertices(1,j:j+2))
! Get the minimum "x" value for the current triangle
mintempx = MINVAL(vertices(1,j:j+2))
! If the random point is within the bounds continue
IF (rand(1)>=mintempx .AND. rand(1)<=maxtempx) THEN
! Get the maximum "y" value for the current triangle
maxtempy = MAXVAL(vertices(2,j:j+2))
! Get the minimum "y" value for the current triangle
mintempy = MINVAL(vertices(2,j:j+2))
! If the random point is within the bounds continue
IF (rand(2)>=mintempy .AND. rand(2)<=maxtempy) THEN
! Find the "z" value of the point as projected onto the triangle plane
tempz = ((norm(1,num)*(rand(1)-vertices(1,j))) &
+(norm(2,num)*(rand(2)-vertices(2,j))) &
- (norm(3,num)*vertices(3,j))) / (-norm(3,num))
! If the "z" value of the randomly generated point goes vertically up
! through the projected point then increase the number of plane intersections
! by one. (Ray casting vertically up, could go down also).
IF (rand(3)<= tempz) THEN
no_insect = no_insect + 1
END IF
END IF
END IF
! Go to the start of the next triangle
j = j + 3
END DO
! If there is an odd number of triangle intersections not
! including 0 intersections then store the point
IF (MOD(no_insect,2)/=0 .AND. no_insect/=0) THEN
point(k,1:3) = rand(1:3)
WRITE(1,"(1X, 3(F10.8, 3X))") point(k,1), point(k,2), point(k,3)
k = k + 1
END IF
END DO
My results have been complete rubbish (see images).
Image 1 - Test STL file (ship taken from here). Part of the program (code not shown) reads in binary STL files and stores the surface normals of each triangle and the vertices which make up each triangle. I then write the vertices to a text file and call GNUPLOT to connect the vertices of each triangle, as shown above. This plot is just a test to ensure that the STL files are being read and stored correctly. It does not use the surface normals.
Image 2 - This is a plot of the candidate points which were accepted as being inside the STL volume (stored in the final IF block shown in the code above). These accepted points are later written to a text file and plotted with GNUPLOT (not shown). Had the algorithm worked, this plot should be a point cloud filling the triangulated mesh shown above. (It also plots the 8 bounding box coordinates to ensure that the random particles are generated in the correct range.)
I appreciate that this does not account for points generated on vertices, or for rays which run parallel to and intersect with edges. I just wanted to start with rough code. Could you please advise if there is a problem with my methodology or code? Let me know if the question is too broad and I will delete it and try to be more specific.

I realized my code could be handy for others. I placed it at https://github.com/LadaF/Fortran---CGAL-polyhedra under the GNU GPL v3 license.
You can query whether a point is inside a polyhedron or not. First, you read the file with cgal_polyhedron_read. You must store the type(c_ptr) :: ptree that is created and use it in your next calls.
The function cgal_polyhedron_inside returns whether a point is inside a polyhedron, or not. It requires one reference point, which must be known to be outside.
When you are finished call cgal_polyhedron_finalize.
You must have the file as a purely triangular manifold mesh in an OFF file. You can create it from the STL file using http://www.cs.princeton.edu/~min/meshconv/ .
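For readers who want to stay with the ray-casting idea from the question rather than use a library, here is a minimal Python sketch of the vertical-ray parity test. It is not the code from the repository above. Note one difference from the Fortran in the question: that code accepts any point inside a triangle's x/y bounding rectangle, whereas the sketch tests against the projected triangle itself using barycentric coordinates. The function name and the tetrahedron are illustrative, and rays that pass exactly through an edge or vertex are still not handled.

```python
def point_in_mesh(p, triangles):
    """Cast a ray from p in the +z direction and count triangle crossings.

    triangles is a list of (a, b, c) vertex triples, each vertex an (x, y, z)
    tuple. An odd number of crossings means p is inside a closed mesh.
    """
    px, py, pz = p
    crossings = 0
    for a, b, c in triangles:
        # Twice the signed area of the triangle projected onto the xy-plane.
        det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
        if abs(det) < 1e-12:
            continue  # triangle is edge-on to the vertical ray; skip it
        # Barycentric coordinates of (px, py) within the projected triangle.
        w1 = ((px - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (py - a[1])) / det
        w2 = ((b[0] - a[0]) * (py - a[1]) - (px - a[0]) * (b[1] - a[1])) / det
        if w1 < 0 or w2 < 0 or w1 + w2 > 1:
            continue  # the ray misses this triangle
        # Height at which the ray meets the triangle's plane.
        z_hit = a[2] + w1 * (b[2] - a[2]) + w2 * (c[2] - a[2])
        if z_hit > pz:
            crossings += 1
    return crossings % 2 == 1


# Unit tetrahedron used as a small closed test mesh.
O, X, Y, Z = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
tetra = [(O, X, Y), (O, X, Z), (O, Y, Z), (X, Y, Z)]
print(point_in_mesh((0.2, 0.2, 0.2), tetra))
print(point_in_mesh((0.2, 0.2, 0.9), tetra))
```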

C++ alternative algorithm for solution

I need some help with an algorithm for a program I am writing.
The user inputs the coordinates of 3 points and the coefficients of a linear function that crosses the triangle formed by those 3 points, and I need to compare the areas of the shapes created where the function cuts the triangle.
I would paste the code here, but parts of it are in my native language, and I mainly want to know what algorithm you would use, because mine works only if the points are entered in an exact sequence and I can't get a handle on that.
http://pastebin.com/vNzGuqX4 - code
For example, I use this: http://goo.gl/j18Ch0
The code is not finished. I just noticed that it fails if the points are entered in a different sequence: entering " 1 1 2 5 4 4 0.5 1 5 " works but " 4 4 1 1 2 5 0.5 1 5 " does not.

The line must cross at least 2 edges of the triangle, so you can find these 2 crossing points first. These 2 points, together with one of the 3 vertices, make a small triangle. Use Heron's formula to calculate the area of a triangle: S = sqrt(l * (l-a) * (l-b) * (l-c)), where l = (a+b+c)/2 and a, b, c are the lengths of the edges. It should be easy to get the length of an edge given the coordinates of its vertices. One of the two areas is that of the small triangle; the other is the area of the big triangle minus the small one.
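The area formula above translates directly into a small helper. This is a sketch with an illustrative function name; the points are (x, y) tuples.

```python
import math


def heron_area(p, q, r):
    """Area of the triangle pqr from its three edge lengths."""
    a = math.dist(q, r)
    b = math.dist(p, r)
    c = math.dist(p, q)
    l = (a + b + c) / 2  # half the perimeter
    # max(..., 0.0) guards against tiny negative values from rounding
    # when the three points are almost collinear.
    return math.sqrt(max(l * (l - a) * (l - b) * (l - c), 0.0))


print(heron_area((0, 0), (4, 0), (0, 3)))
```

The result does not depend on the order in which the three points are given, since only edge lengths enter the formula.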

If your triangle is ABC, a good approach would be the following:
Find lines that go through points A and B, B and C, and C and A.
Find the intersection of your line with these three lines.
Check which two intersections lie on the triangle sides.
Depending on the intersections, calculate the area of the new small triangle.
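The steps above can be sketched in a way that does not depend on the order of the input points. The sketch assumes the line is given as y = k*x + m, which is a guess because the question does not say what its coefficients mean. Instead of intersecting with all three infinite lines, it finds the vertex that lies alone on one side of the line, so the two crossing points are automatically on the triangle's sides. It covers only the generic case where the line passes through no vertex.

```python
def triangle_area(p, q, r):
    """Area of triangle pqr via the cross product (shoelace formula)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2


def split_areas(tri, k, m):
    """Return (small, rest): the areas on each side of the line y = k*x + m.

    Returns None if the line does not pass strictly between the vertices.
    """
    # Signed vertical offset of each vertex from the line.
    side = [y - (k * x + m) for x, y in tri]
    for i in range(3):
        j, n = (i + 1) % 3, (i + 2) % 3
        # Vertex i is alone if both other vertices lie strictly on the other side.
        if side[i] * side[j] < 0 and side[i] * side[n] < 0:
            def cut(a, b, sa, sb):
                # Point on edge ab where the signed offset reaches zero.
                t = sa / (sa - sb)
                return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))

            p1 = cut(tri[i], tri[j], side[i], side[j])
            p2 = cut(tri[i], tri[n], side[i], side[n])
            small = triangle_area(tri[i], p1, p2)
            return small, triangle_area(*tri) - small
    return None


print(split_areas([(0, 0), (4, 0), (0, 4)], 0.0, 2.0))
print(split_areas([(4, 0), (0, 4), (0, 0)], 0.0, 2.0))
```

Both calls describe the same triangle with the vertices listed in a different order and give the same pair of areas.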

How to create data from image like "Letter Image Recognition Dataset" from UCI

I am using the letter_recog example from OpenCV. It uses a dataset from UCI which has a structure like this:
Attribute Information:
1. lettr capital letter (26 values from A to Z)
2. x-box horizontal position of box (integer)
3. y-box vertical position of box (integer)
4. width width of box (integer)
5. high height of box (integer)
6. onpix total # on pixels (integer)
7. x-bar mean x of on pixels in box (integer)
8. y-bar mean y of on pixels in box (integer)
9. x2bar mean x variance (integer)
10. y2bar mean y variance (integer)
11. xybar mean x y correlation (integer)
12. x2ybr mean of x * x * y (integer)
13. xy2br mean of x * y * y (integer)
14. x-ege mean edge count left to right (integer)
15. xegvy correlation of x-ege with y (integer)
16. y-ege mean edge count bottom to top (integer)
17. yegvx correlation of y-ege with x (integer)
example:
T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8
I,5,12,3,7,2,10,5,5,4,13,3,9,2,8,4,10
Now I have a segmented image of a letter and want to transform it into data like this so that I can recognize it, but I don't understand the meaning of all the values. For example, what does "6. onpix total # on pixels" mean? Can you please explain the meaning of these values? Thanks.

I am not familiar with OpenCV's letter_recog example, but this appears to be a feature vector, or set of statistics about the image of a letter that is used to classify the future occurrences of the letter. The results of your segmentation should leave you with a binary mask with 1's on the letter and 0's everywhere else. onpix is simply the total count of pixels that fall on the letter, or in other words, the sum of your binary mask.
Most of the remaining values in the list need to be calculated from the set of pixels with a value of 1 in your binary mask. x and y are just the position of the pixel. For instance, x-bar is just the sample mean of the x positions of all pixels that have a 1 in the mask. You should be able to easily find references on the web for the mathematical definitions of mean, variance, covariance and correlation.
14-17 are a little different since they are based on edge pixels, but the calculations should be similar, just over a different set of pixels.
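As a rough illustration of the first few statistics, here is a sketch that computes raw values from a binary mask given as a list of rows of 0/1. The function and key names are made up for the example. The values in the UCI file are integers in a fixed small range, so the dataset applies further normalisation and scaling that this sketch does not attempt.

```python
def letter_features(mask):
    """Raw statistics of the 'on' pixels in a binary mask (list of rows)."""
    # Collect the (x, y) position of every pixel that is set.
    on = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not on:
        raise ValueError("mask contains no on pixels")
    n = len(on)
    xs = [x for x, _ in on]
    ys = [y for _, y in on]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return {
        "onpix": n,                          # total number of on pixels
        "width": max(xs) - min(xs) + 1,      # bounding box width
        "height": max(ys) - min(ys) + 1,     # bounding box height
        "x_mean": x_mean,                    # unscaled counterpart of x-bar
        "y_mean": y_mean,                    # unscaled counterpart of y-bar
        "x_var": sum((x - x_mean) ** 2 for x in xs) / n,
        "y_var": sum((y - y_mean) ** 2 for y in ys) / n,
    }


mask = [[0, 1, 1],
        [0, 1, 0]]
features = letter_features(mask)
print(features["onpix"], features["width"], features["height"])
```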

My name is Antonio Bernal.
On page 3 of this article you will find a good description of each value.
Letter Recognition Using Holland-Style Adaptive Classifiers.
If you have any doubt let me know.
I am trying to make this algorithm work, but my problem is that I do not know how to scale the values to fit them to the range 0-15.
Do you have any idea how to do this?
Another Link from Google scholar -> Letter Recognition Using Holland-Style Adaptive Classifiers

Designing a grid overlay based on longitudes and latitudes

I'm trying to figure out the best way to approach the following:
Say I have a flat representation of the earth. I would like to create a grid that overlays this with each square on the grid corresponding to about 3 square kilometers. Each square would have a unique region id. This grid would just be stored in a database table that would have a region id and then probably the long/lat coordinates of the four corners of the region, right? Any suggestions on how to generate this table easily? I know I would first need to find out the width and height of this "flattened earth" in kms, calculate the number of regions, and then somehow assign the long/lats to each intersection of vertical/horizontal line; however, this sounds like a lot of manual work.
Secondly, once I have that grid table created, I need to design a function that takes a long/lat pair and then determines which logical "region" it is in. I'm not sure how to go about this.
Any help would be appreciated.
Thanks.

Assume the Earth is a sphere with radius R = 6371 km.
Start at (lat, long) = (0, 0) deg. Around the equator, 3km corresponds to a change in longitude of
dlong = 3 / (2 * pi * R) * 360
= 0.0269796482 degrees
If we walk around the equator and put a marker every 3km, there will be about (2 * pi * R) / 3 = 13343.3912 of them. "About" because it's your decision how to handle the extra 0.3912.
From (0, 0), we walk North 3 km to (lat, long) = (0.0269796482, 0). We will walk around the Earth again on a path that is locally parallel to the first path we walked. Because it is a little closer to the N Pole, the radius of this circle is a bit smaller than that of the first circle we walked. Let's use lower case r for this radius (the latitude is in degrees, so the cosine must be taken in degrees too)
r = R * cos(lat)
= 6371 * cos(0.0269796482 deg)
= 6370.99929 km
We calculate dlong again using the smaller radius,
dlong = 3 / (2 * pi * r) * 360
= 0.0269796512 deg
We put down the second set of flags. This time there are about (2 * pi * r) / 3 = 13343.3897 of them, so after one step the count has barely moved. But the shortfall grows with every step north: at latitude 60 deg, r = R * cos(60 deg) = R / 2 and only about 6671.7 flags fit, half as many as at the equator.
How do we draw a ribbon of squares when the top line has fewer corners than the bottom line? In fact, away from the equator we'd find that each ribbon starts off with pretty good squares, but that the shape of the regions shears out into pretty extreme parallelograms as we walk around the Earth.
We need a different strategy that gives us the same number of corners above and below. If the lower boundary (SW-SE) is 3 km long, then the top should be a little shorter, to make a ribbon of trapeziums.
There are many ways to craft a compromise that approximates your ideal square grid. This Wikipedia article on map projections that preserve a metric property links to several dozen such strategies.
The specifics of your app may allow you to simplify things considerably, especially if you don't really need to map the entire globe.
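To make the arithmetic concrete, and to show one possible compromise, here is a hedged Python sketch. It keeps the spherical Earth of radius 6371 km, cuts the globe into latitude bands 3 km tall, and gives each band its own whole number of columns, so cells are close to 3 km wide everywhere but do not line up from band to band. This banding scheme and the function names are assumptions made for illustration, not something the answer prescribes. Note that math.cos expects radians, so latitudes in degrees are converted first.

```python
import math

R_KM = 6371.0  # spherical Earth radius, as assumed in the answer


def markers_at_latitude(lat_deg, step_km=3.0):
    """How many step_km-long segments fit around the parallel at lat_deg."""
    r = R_KM * math.cos(math.radians(lat_deg))  # radius of that parallel
    return 2 * math.pi * r / step_km


def region_id(lat_deg, lon_deg, step_km=3.0):
    """Map a lat/long pair to a (row, column) region identifier."""
    dlat = math.degrees(step_km / R_KM)  # height of one band in degrees
    n_rows = math.ceil(180.0 / dlat)
    row = min(int((lat_deg + 90.0) / dlat), n_rows - 1)
    # Size this band's columns from the latitude of its centre line.
    centre = min(-90.0 + (row + 0.5) * dlat, 90.0)
    n_cols = max(1, round(markers_at_latitude(centre, step_km)))
    col = min(int((lon_deg + 180.0) / 360.0 * n_cols), n_cols - 1)
    return row, col


print(markers_at_latitude(0.0))
print(markers_at_latitude(60.0))
print(region_id(0.0, 0.0))
```

Because the region is computed from the coordinates, the table of corner coordinates does not have to be generated up front; a (row, column) pair can be turned back into corner coordinates by reversing the same arithmetic.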

Microsoft has been investing in spatial data types in SQL Server 2008, which could help you out here: it has data types to represent your flattened earth regions, operators to determine when a set of coordinates is inside a geometry, etc. Even if you choose not to use this, consider checking out the following links. The second one in particular has a lot of good background information on the problem and a discussion of some of the industry standard data formats for spatial data.
http://www.microsoft.com/sqlserver/2008/en/us/spatial-data.aspx
http://jasonfollas.com/blog/archive/2008/03/14/sql-server-2008-spatial-data-part-1.aspx

First, Paul is right. Unfortunately the earth is round which really complicates the heck out of this stuff.
I created a grid similar to this for a topographical mapping server many years ago. I just recorded the coordinates of the upper left corner of each region. I also used UTM coordinates instead of lat/long. If you know that each region covers 3 square kilometers, then since UTM is based on meters, it is straightforward to do a range query to discover the right region.
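Because UTM coordinates are in metres, the range query reduces to integer division within a single UTM zone. The sketch below assumes square cells of 3 square kilometres (side sqrt(3) km) aligned to the zone's origin. The function names are illustrative, and converting lat/long to UTM is left to a projection library.

```python
import math

CELL_M = math.sqrt(3.0) * 1000.0  # side of a 3 km^2 square cell, in metres


def utm_cell(easting_m, northing_m, cell_m=CELL_M):
    """Return (column, row) of the cell containing a UTM coordinate."""
    return int(easting_m // cell_m), int(northing_m // cell_m)


def cell_upper_left(col, row, cell_m=CELL_M):
    """Upper-left corner (easting, northing) of a cell."""
    return col * cell_m, (row + 1) * cell_m


col, row = utm_cell(500000.0, 4649776.0)
print(col, row, cell_upper_left(col, row))
```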

You do realize that because the earth is a sphere that "3 square km" is going to be a different number of degrees near the poles than near the equator, right? And that at the top and bottom of the map your grid squares will actually represent pie-shaped parts of the world, right?
I've done something similar with my database - I've broken it up into quad cells. So what I did was divide the earth into four quarters (-180,-90)-(0,0), (-180,0)-(0,90) and so on. As I added point entities to my database, if the "cell" got more than X entries, I split the cell into 4. That means that in areas of the world with lots of point entities, I have a lot of quad cells, but in other parts of the world I have very few.
My database for the quad tree looks like:
\d areaids;
Table "public.areaids"
Column | Type | Modifiers
--------------+-----------------------------+-----------
areaid | integer | not null
supercededon | timestamp without time zone |
supercedes | integer |
numpoints | integer | not null
rectangle | geometry |
Indexes:
"areaids_pk" PRIMARY KEY, btree (areaid)
"areaids_rect_idx" gist (rectangle)
Check constraints:
"enforce_dims_rectangle" CHECK (ndims(rectangle) = 2)
"enforce_geotype_rectangle" CHECK (geometrytype(rectangle) = 'POLYGON'::text OR rectangle IS NULL)
"enforce_srid_rectangle" CHECK (srid(rectangle) = 4326)
I'm using PostGIS to help find points in a cell. If I look at a cell, I can tell if it's been split because supercededon is not null. I can find its children by looking for ones that have supercedes equal to its id. And I can dig down from top to bottom until I find the ones that cover the area I'm concerned about by looking for ones with supercededon null and whose rectangle overlaps my area of interest (using the PostGIS '&&' operator).
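The splitting rule described above (split a cell into four once it holds more than X points) can be sketched as an in-memory quadtree. This is an illustration of the idea only, not the PostGIS-backed implementation from the answer. The class name, the capacity parameter and the max_depth guard are assumptions.

```python
class QuadCell:
    """One cell of a point quadtree over (west, south, east, north) bounds."""

    def __init__(self, west, south, east, north, capacity=4, depth=0):
        self.bounds = (west, south, east, north)
        self.capacity = capacity  # the "X entries" threshold
        self.depth = depth
        self.points = []
        self.children = None      # stays None until the cell is split

    def _child_for(self, lon, lat):
        w, s, e, n = self.bounds
        mid_lon, mid_lat = (w + e) / 2, (s + n) / 2
        # Index 0 = south-west, 1 = south-east, 2 = north-west, 3 = north-east.
        return self.children[(lon >= mid_lon) + 2 * (lat >= mid_lat)]

    def insert(self, lon, lat, max_depth=20):
        if self.children is not None:
            self._child_for(lon, lat).insert(lon, lat, max_depth)
            return
        self.points.append((lon, lat))
        # max_depth stops endless splitting when many points coincide.
        if len(self.points) > self.capacity and self.depth < max_depth:
            self._split(max_depth)

    def _split(self, max_depth):
        w, s, e, n = self.bounds
        mid_lon, mid_lat = (w + e) / 2, (s + n) / 2
        d, cap = self.depth + 1, self.capacity
        self.children = [
            QuadCell(w, s, mid_lon, mid_lat, cap, d),
            QuadCell(mid_lon, s, e, mid_lat, cap, d),
            QuadCell(w, mid_lat, mid_lon, n, cap, d),
            QuadCell(mid_lon, mid_lat, e, n, cap, d),
        ]
        # Hand the stored points down to the new child cells.
        old_points, self.points = self.points, []
        for lon, lat in old_points:
            self._child_for(lon, lat).insert(lon, lat, max_depth)

    def leaf_for(self, lon, lat):
        """Dig down from this cell to the unsplit cell covering (lon, lat)."""
        cell = self
        while cell.children is not None:
            cell = cell._child_for(lon, lat)
        return cell


root = QuadCell(-180, -90, 180, 90, capacity=2)
for lon, lat in [(10, 10), (20, 20), (-50, -30)]:
    root.insert(lon, lat)
print(root.leaf_for(15, 15).bounds)
```

Here children being set plays the role of supercededon being non-null, and leaf_for mirrors the top-to-bottom search for the live cell covering a point.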

There's no way you'll be able to do this with rectangular cells, but I've just finished an R package dggridR which would make this easy to do using a grid of hexagonal cells. However, the 3km cell requirement might yield so many cells as to overload your machine.
You can use R to generate the grid:
install.packages('devtools')
install.packages('rgdal')
library(devtools)
devtools::install_github('r-barnes/dggridR')
library(dggridR)
library(rgdal)
#Construct a discrete global grid (geodesic) with cells of ~3 km^2
dggs <- dgconstruct(area=100000, metric=FALSE, resround='nearest')
#Get a hexagonal grid for the whole earth based on this dggs
grid <- dgearthgrid(dggs,frame=FALSE)
#Save the grid
writeOGR(grid, "grid_3km_cells.kml", "cells", "KML")
The KML file then contains the ids and edge vertex coordinates of every cell.
The grid looks a little like this:
My package is based on Kevin Sahr's DGGRID which can generate this same grid to KML directly, though you'll need to figure out how to compile it yourself.
