I am trying to implement an affine transformation on two images.
First I find the matching pairs in both images. One of them is a zoomed image and the other is a reference image. The pairs gave me these coefficients:
    | 1 0 |        | x |        |  1  |
A = |     |    X = |   |    B = |     |
    | 0 0 |        | y |        | 221 |
The equation formed is X' = AX + B.
// Solve x' = a0 + a1*x + a2*y from the three matching pairs
// (xCordinate[i], yCordinate[i]) -> (x_new_Cordinate[i], y_new_Cordinate[i]).
x_co_efficients[2] = (((x_new_Cordinate[2]-x_new_Cordinate[0])*(xCordinate[1]-xCordinate[0])) - ((x_new_Cordinate[1]-x_new_Cordinate[0])*(xCordinate[2]-xCordinate[0]))) /
                     (((xCordinate[1]-xCordinate[0])*(yCordinate[2]-yCordinate[0])) - ((xCordinate[2]-xCordinate[0])*(yCordinate[1]-yCordinate[0])));
x_co_efficients[1] = ((x_new_Cordinate[1]-x_new_Cordinate[0]) - (yCordinate[1]-yCordinate[0])*x_co_efficients[2]) / (xCordinate[1]-xCordinate[0]);
x_co_efficients[0] = x_new_Cordinate[0] - (x_co_efficients[1]*xCordinate[0] + x_co_efficients[2]*yCordinate[0]);

// Same elimination for y' = b0 + b1*x + b2*y.
y_co_efficients[2] = (((y_new_Cordinate[2]-y_new_Cordinate[0])*(xCordinate[1]-xCordinate[0])) - ((y_new_Cordinate[1]-y_new_Cordinate[0])*(xCordinate[2]-xCordinate[0]))) /
                     (((xCordinate[1]-xCordinate[0])*(yCordinate[2]-yCordinate[0])) - ((xCordinate[2]-xCordinate[0])*(yCordinate[1]-yCordinate[0])));
y_co_efficients[1] = ((y_new_Cordinate[1]-y_new_Cordinate[0]) - (yCordinate[1]-yCordinate[0])*y_co_efficients[2]) / (xCordinate[1]-xCordinate[0]);
y_co_efficients[0] = y_new_Cordinate[0] - (y_co_efficients[1]*xCordinate[0] + y_co_efficients[2]*yCordinate[0]);
These are the equations I use to find the coefficients from the matching pairs. They work fine for identical images, and for the zoomed image they give me the coefficients above. Now the problem: I have a 24-bit binary image, and I want to apply the affine transformation to it with respect to the reference. When I compute the new coordinates of that image and move each pixel value to its new coordinate, I get a very distorted image, which should not happen if the transformation is right.
Could someone please have a look at the equations and also explain a little how to apply them to the second image?
My code is in C++. Thank you.
My reference image is above, and my comparison image is below.
The result I am getting is a distorted image with lines only.
Edit 1
I have now changed the solving method to matrices. I am getting the right output, but the image I get after registration looks like this.
I also had to limit the new coordinates to the 320×240 image bounds to get the pixel values. Now my result looks somewhat like this.
EDIT 2
I have changed the code and am now getting this result without any black pixels. I am getting a little tilting, though I have removed the zoom effect in the given image.
Your transformation matrix A is problematic. It destroys the y coordinate value and assigns 221 to all y coordinates.
You can make the element at (2,2) in A just 1 and the problem should be solved.
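In case a concrete example helps, here is a minimal sketch (not the asker's actual code) of applying the six coefficients to a whole image. It assumes a hypothetical 8-bit grayscale Image struct, nearest-neighbour sampling, and that the coefficients map coordinates of the output (reference-sized) grid into the source image; iterating over the destination grid and sampling the source is what avoids the black holes mentioned in Edit 1.

#include <vector>

// Hypothetical image type used only for this sketch.
struct Image {
    int width, height;
    std::vector<unsigned char> pixels;               // row-major, 8-bit grayscale
    unsigned char at(int x, int y) const { return pixels[y * width + x]; }
    void set(int x, int y, unsigned char v)          { pixels[y * width + x] = v; }
};

// Warp 'src' using x' = a[0] + a[1]*x + a[2]*y and y' = b[0] + b[1]*x + b[2]*y,
// where (x, y) runs over the destination grid and (x', y') is looked up in the source.
Image warpAffine(const Image& src, const double a[3], const double b[3],
                 int outWidth, int outHeight)
{
    Image dst{outWidth, outHeight,
              std::vector<unsigned char>(static_cast<size_t>(outWidth) * outHeight, 0)};
    for (int y = 0; y < outHeight; ++y) {
        for (int x = 0; x < outWidth; ++x) {
            int sx = static_cast<int>(a[0] + a[1] * x + a[2] * y + 0.5);   // nearest neighbour
            int sy = static_cast<int>(b[0] + b[1] * x + b[2] * y + 0.5);
            if (sx >= 0 && sx < src.width && sy >= 0 && sy < src.height)
                dst.set(x, y, src.at(sx, sy));
        }
    }
    return dst;
}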
I am familiar with the numpy.transpose command and know it is used to swap axes, but I am not familiar with mirror images: what they are and how numpy.transpose is used to generate a mirror image. The following link says that when we swap the last two axes we get mirror images. So what is meant by mirror images here? I would be really thankful if someone could explain this with a picture.
a = np.arange(2*2*4).reshape(2,2,4)
b = np.transpose(a, (1, 0, 2))
Please look at https://imgur.com/gallery/v6z7ah0
https://www.reddit.com/r/learnpython/comments/734lcl/complicated_numpy_transpose_question/?st=jij0av7a&sh=754dfd45
In [53]: a = np.arange(2*3*4).reshape(3,2,4)
                                      #  | | |
                                      # axes 0 1 2

# new shape obtained by moving the axes
In [54]: b = np.transpose(a, (1, 0, 2))

In [55]: a.shape
Out[55]: (3, 2, 4)

# the first two axes are swapped
In [56]: b.shape
Out[56]: (2, 3, 4)
By default, np.transpose() reverses the order of the axes. When an axes argument is passed, the axes of the array are permuted into the requested order.
Explanation:
In the above example, np.transpose(a, (1, 0, 2)) means that in the returned array b the zeroth and first axes are swapped.
Specifically, the tuple passed to np.transpose() is the order in which we want the axes of the original array to appear in the result.
Plotting the image before (left) and after transposing (right):
I have a small problem calculating normals for my heightmap. They behave strangely: at the higher and lower points the normals are fine, but in the middle they seem wrong. The terrain is lit by a point light.
UNFIXED SOURCE REMOVED
EDIT:
Tried 2 new approaches:
This is a per-face normal. It looks fine, but you can see the individual faces.
Position normal = crossP(vectorize(pOL, pUR), vectorize(pOR, pUL));
I also tried to do it per vertex this way, but also with strange output.
This is the suggestion Nico made:
It also looks rather odd. Maybe there is a mistake in how I calculate the helper points.
UNFIXED SOURCE REMOVED
EDIT 2:
Definition of my points:
OL,OR,UL,UR are the corner vertices of the plane that is to be drawn.
               postVertPosZ1   postVertPosZ2
preVertPosX1       pOL             pOR        postVertPosX1
preVertPosX2       pUL             pUR        postVertPosX2
               preVertPosZ1    preVertPosZ2
EDIT3:
I have solved it now. It was a stupid mistake:
I forgot to multiply the y value of the helper vertices by the height multiplier, and I had to change some values.
It is beautiful now.
There are lots of ways to solve this problem; I haven't encountered yours before. I suggest using central differences to estimate the partial derivatives of the height field, and then using the cross product to get the normal.
Each vertex normal can be calculated from its four neighbors; you don't need the plane plus its neighbors:
T
L O R
B
O is the vertex for which you want to calculate the normal. The other vertices (top, right, bottom, left) are its neighbors. Then we want to calculate the central differences in the horizontal and vertical directions:
             /           2           \
horizontal = | height(R) - height(L) |
             \           0           /

             /           0           \
vertical   = | height(B) - height(T) |
             \           2           /

The normal is the cross product of these tangents:

normal = normalize(cross(vertical, horizontal))

                   / height(L) - height(R) \
       = normalize |           2           |
                   \ height(T) - height(B) /
Note that these calculations assume that your x-axis is aligned to the right and the z-axis down.
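For reference, here is a compact C++ sketch of the central-difference normal above. The Heightmap struct and its at() accessor are illustrative assumptions, not code from the question; the stored heights are assumed to already include the height multiplier mentioned in EDIT3.

#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// Row-major grid of heights; heights[z * width + x] is the y value at grid position (x, z).
struct Heightmap {
    int width, depth;
    std::vector<float> heights;
    float at(int x, int z) const { return heights[z * width + x]; }
};

// Central-difference normal at an interior vertex (x, z), assuming +x to the right and
// +z down the grid; this is exactly normalize(height(L) - height(R), 2, height(T) - height(B)).
Vec3 vertexNormal(const Heightmap& hm, int x, int z) {
    float hL = hm.at(x - 1, z);
    float hR = hm.at(x + 1, z);
    float hT = hm.at(x, z - 1);
    float hB = hm.at(x, z + 1);
    return normalize({hL - hR, 2.0f, hT - hB});
}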
Problem: I have a large number of scanned documents that are linked to the wrong records in a database. Each image has the correct ID on it somewhere that says where it belongs in the db.
E.g., a DB row could be:
| user_id | img_id | img_loc  |
|    1    |   1    | /img.jpg |
img.jpg would have the user_id (1) on the image somewhere.
Method/Solution: Loop through the database. Pull the image text into a variable with OCR and check whether the user_id is found anywhere in the variable. If not, flag the record/image in a log; if so, do nothing and move on.
My example is simple; in the real world I have a guarantee that a user_id wouldn't accidentally show up on the wrong form (it has a specific format with its own significance).
Right now it is working. However, it is incredibly strict. If you've worked with OCR you understand how fickle it can be. Sometimes a 7 = 1 or a 9 = 7, etc. The result is a large number of false positives. Especially among images with low quality scans.
I've addressed some of the image quality issues with processing on my side (increasing the image size, adjusting the black/white threshold) and had satisfying results. I'd like to add the ability for the program to recognize, for example, that "81723103" is not very far from "81923103".
The only way I know how to do that is to check strings whose length is >= the length of what I'm looking for, calculate the distance between each pair of characters, take the average, and set a limit on what counts as a good average.
Some examples:
Ex 1
81723103 - Looking for this
81923103 - Found this
--------
00200000 - distances between characters
0 + 0 + 2 + 0 + 0 + 0 + 0 + 0 = 2
2/8 = .25 (pretty good match. 0 = perfect)
Ex 2
81723103 - Looking
81158988 - Found
--------
00635885 - distances
0 + 0 + 6 + 3 + 5 + 8 + 8 + 5 = 35
35/8 = 4.375 (Not a very good match. 9 = worst)
This way I can tell it "Flag the bottom 30% only" and dump anything with an average distance > 6.
I figure I'm reinventing the wheel and wanted to share this for feedback. I see a huge increase in run time and a performance hit doing all these string operations over what I'm currently doing.
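Since the scoring rule above is easy to misread, here is a minimal C++ sketch of it (function and variable names are illustrative, not from the original program). It reproduces the two worked examples: 0.25 for 81723103 vs 81923103, and 4.375 for 81723103 vs 81158988.

#include <cstdlib>
#include <string>

// Average per-digit distance between the ID being searched for and an OCR candidate
// of the same length (0 = perfect match, 9 = worst possible).
double averageDigitDistance(const std::string& wanted, const std::string& found)
{
    if (wanted.size() != found.size() || wanted.empty())
        return 9.0;                                   // treat a length mismatch as the worst case
    int total = 0;
    for (std::size_t i = 0; i < wanted.size(); ++i)
        total += std::abs(wanted[i] - found[i]);      // works because both are decimal digits
    return static_cast<double>(total) / wanted.size();
}

To scan a page of OCR text, you would slide a window of wanted.size() characters over every candidate digit run, keep the best (lowest) score, and flag the record only if that score exceeds the chosen threshold.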
I am trying to draw the intersection of one blue rectangle and one yellow rectangle
,-------------------,
| |
| Blue |
| ,-------+---------,
| | Green | |
'-----------+-------, Yellow |
|_________________|
using the methods CDC::Polygon and CDC::SetBkMode(TRANSPARENT)
but all I get is this:
,-------------------,
| |
| Blue |
| ,-------+---------,
| | |
'-----------+ Yellow |
|_________________|
Please give me a simple solution that sticks with MFC.
Thanks.
You cannot do this, regardless of whether SetBkMode is TRANSPARENT or OPAQUE, since Polygon uses the currently selected brush to fill the polygon's interior. Instead, what you should do is this:
Paint one rectangle first, paint the other rectangle next, and then calculate the intersection of the two rectangles using CRect::IntersectRect (see http://msdn.microsoft.com/en-us/library/262w7389(v=vs.100).aspx).
If the intersection is non-empty, calculate the resulting "color blend", create the appropriate brush, and use it to draw a third rectangle over the intersection.
For more information on how to blend the colors, check out Algorithm for Additive Color Mixing for RGB Values right here on StackOverflow.
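A minimal sketch of that recipe might look like the following (the colors, names, and blend formula are placeholders; the linked question discusses better blends):

#include <afxwin.h>

static int Clamp255(int v) { return v > 255 ? 255 : v; }

void DrawBlendedRects(CDC* pDC, const CRect& rcBlue, const CRect& rcYellow)
{
    const COLORREF blue   = RGB(0, 0, 255);
    const COLORREF yellow = RGB(255, 255, 0);

    CBrush blueBrush(blue), yellowBrush(yellow);
    pDC->FillRect(&rcBlue, &blueBrush);              // paint one rectangle first
    pDC->FillRect(&rcYellow, &yellowBrush);          // then the other

    CRect rcOverlap;
    if (rcOverlap.IntersectRect(&rcBlue, &rcYellow)) // non-empty intersection?
    {
        // Placeholder blend: clamped additive mix of the two colors.
        CBrush mixBrush(RGB(Clamp255(GetRValue(blue) + GetRValue(yellow)),
                            Clamp255(GetGValue(blue) + GetGValue(yellow)),
                            Clamp255(GetBValue(blue) + GetBValue(yellow))));
        pDC->FillRect(&rcOverlap, &mixBrush);        // third rectangle over the overlap
    }
}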
I'm working on a system for skeletal animation where each bone's angle is based on its parent's. I have to rotate a bone about the end of the parent joint for that angle to be accurate, as illustrated in the first part of this illustration:
What I need to do is the second part of the illustration, because my drawing API only supports rotating around the center of the bitmap.
Thanks
Combine the rotation with a translation. Rotate the figure about the center, then move it to where it should be.
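To make that concrete, here is a small sketch (plain math, not any particular drawing API): if the API rotates a sprite only about its center C, you can still make it appear to rotate about a pivot P by rotating about C as usual and then translating by (P - C) - R(P - C), where R is the rotation.

#include <cmath>

struct Vec2 { float x, y; };

// Extra translation to apply after a rotate-about-center call so the sprite ends up
// where a rotation about 'pivot' (e.g. the end of the parent joint) would have put it.
// 'center' and 'pivot' are in the same coordinate space; 'angle' is in radians.
Vec2 pivotCorrection(Vec2 center, Vec2 pivot, float angle)
{
    float dx = pivot.x - center.x;
    float dy = pivot.y - center.y;
    float c = std::cos(angle), s = std::sin(angle);
    // (P - C) - R * (P - C)
    return { dx - (c * dx - s * dy),
             dy - (s * dx + c * dy) };
}

Usage: rotate the bitmap about its center with the drawing API, then offset its draw position by pivotCorrection(center, parentJointEnd, angle).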
One option is to introduce extra blank pixels into your bitmap. If you can only rotate around the center of the bitmap, consider what happens if you double the width of your bitmap and then translate the image you want to rotate so that it's flush up against the right.
For example, suppose your image is
+-------+
X image |
+-------+
where the X is the point you want to rotate around. Now, construct this image:
+-------+-------+
| blank X image |
+-------+-------+
If you rotate around the center of this image, notice that you're rotating right on top of the X, which is what you wanted to do in the first place. The resulting rotated image looks like this:
+---+
| b |
| l |
| a |
| n |
| k |
+-X-+
| i |
| m |
| a |
| g |
| e |
+---+
Now, you just extract the bottom half of the image and you've got your original image, rotated 90 degrees around the indicated X point.
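As a concrete sketch of the padding idea (the Bitmap type here is a made-up example, not a specific API), the new canvas just has to be big enough that the pivot lands exactly in its middle:

#include <algorithm>
#include <vector>

// Toy RGBA bitmap used only for this sketch; 0 is treated as blank/transparent.
struct Bitmap {
    int w, h;
    std::vector<unsigned int> px;   // row-major, w * h pixels
};

// Return a padded copy of 'src' in which the point (pivotX, pivotY) lies at the center
// of the canvas, so a rotate-about-center call effectively rotates about the pivot.
Bitmap padSoPivotIsCenter(const Bitmap& src, int pivotX, int pivotY)
{
    int newW = 2 * std::max(pivotX, src.w - pivotX);
    int newH = 2 * std::max(pivotY, src.h - pivotY);
    int offX = newW / 2 - pivotX;                    // where the original image starts
    int offY = newH / 2 - pivotY;

    Bitmap out{newW, newH, std::vector<unsigned int>(static_cast<size_t>(newW) * newH, 0)};
    for (int y = 0; y < src.h; ++y)
        for (int x = 0; x < src.w; ++x)
            out.px[(y + offY) * newW + (x + offX)] = src.px[y * src.w + x];
    return out;
}

After rotating the padded bitmap about its center, extract the half that contains the image, as described above.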
Hope this helps!