I'm trying to solve an IK problem by applying Jacobian matrices and their pseudoinverses to my human skeleton model, but I'm getting really awkward results. The strange thing is, the answer is right! (That is, close enough to the target position and orientation.) For example, here is one of the awkward results.
This is the IK result when I apply it to the entire upper body above the pelvis, with the pelvis as the fixed point. He leans his upper body backward when he doesn't need to.
However, this is what happens when I fix his shoulder location and apply IK only to his arm, with the very same target position.
As you can see, there is no need to tilt the spine all the way back!
Is this a known problem with the Jacobian IK method? If so, is there any way I can make the results more human-like?
Just in case it matters, I calculated my Jacobian matrix like this:
I'm writing to ask about homography and perspective projection.
I'm trying to write a piece of code that will "warp" my image so that its corners align with 4 reference points in 3D space. However, the game engine I'm running it in already lets me get their screen positions, so I have the screen-space coordinates of both xi,yi and ui,vi, normalized to values between 0 and 1.
I have to mention that I don't have a degree in mathematics, which seems to be a prerequisite in the posts I've seen on this topic so far, but I'm hoping there is a solution to this problem that one can actually comprehend. I never had a chance to take classes in computer vision.
The reason I came here is that, in all the posts I've seen online, the simplest explanation is that each point must be written as a 3x1 homogeneous vector and multiplied by a 3x3 homography matrix, which consists of 9 components h1, h2, h3, ..., h9, and this transformation matrix maps each point to the correct perspective. And that's where I'm hitting a brick wall: how do I calculate the transformation matrix? It feels like it should be a relatively simple algebraic task, but apparently it's not.
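Written out in plain terms, the mapping those posts describe is:

u = (h1*x + h2*y + h3) / (h7*x + h8*y + h9)
v = (h4*x + h5*y + h6) / (h7*x + h8*y + h9)

so, as far as I can tell, each of my four corner correspondences gives two equations in the nine unknowns h1...h9, and it's actually solving that system that I'm stuck on.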
At this point I have spent days reading on the topic. The solutions I've come across are either based on MATLAB (which has a ton of mathematical functions built in), or include elaborations and discussions that don't really explain much; sometimes they suggest lots of different parameters and simplifications but rarely explain why, or what their purpose is, and sometimes they reference books and papers that have since been removed from the web. I found myself more confused than I was at the beginning. Most of the resources I managed to find online also come from a different context, such as image stitching and 3D engine development.
I also want to mention that I need to run this code every frame on the CPU, and I'm fairly concerned about the cost of running too many matrix transformations and solving a ton of linear-algebra equations.
I apologize for not asking about any specific code, but my general question is: can anyone point me in the right direction with this issue?
Limit the problem you are dealing with.
For example, if you always warp the entire rectangular image, you can treat the coordinates of the image corners as {(0,0), (1,0), (0,1), (1,1)}.
This simplifies the equations enough that you can solve them by hand, and then implement the answer directly.
Note: a homography is scale invariant, so you can reduce the degrees of freedom to 8 (e.g., you can solve the equations under the constraint h9 = 1).
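To make that concrete, here is a minimal C++ sketch of the unit-square special case. It follows the classic square-to-quad construction from Paul Heckbert's texture-mapping notes, with h9 fixed to 1 as suggested above; the names (squareToQuad, Homography) are mine, and note that my corner ordering goes around the square, (0,0), (1,0), (1,1), (0,1):

// 3x3 homography with h9 fixed to 1, stored as its 8 free coefficients.
// Maps (u,v) in the unit square to (x,y):
//   x = (a*u + b*v + c) / (g*u + h*v + 1)
//   y = (d*u + e*v + f) / (g*u + h*v + 1)
struct Homography { double a, b, c, d, e, f, g, h; };

// x[i], y[i] are the target corners for (u,v) = (0,0), (1,0), (1,1), (0,1),
// in that order (going around the square).
Homography squareToQuad(const double x[4], const double y[4])
{
    Homography H;
    const double sx = x[0] - x[1] + x[2] - x[3];
    const double sy = y[0] - y[1] + y[2] - y[3];

    if (sx == 0.0 && sy == 0.0) {
        // The target quad is a parallelogram: the mapping is affine.
        H.a = x[1] - x[0]; H.b = x[2] - x[1]; H.c = x[0];
        H.d = y[1] - y[0]; H.e = y[2] - y[1]; H.f = y[0];
        H.g = 0.0;         H.h = 0.0;
    } else {
        // Full projective case.
        const double dx1 = x[1] - x[2], dx2 = x[3] - x[2];
        const double dy1 = y[1] - y[2], dy2 = y[3] - y[2];
        const double den = dx1 * dy2 - dx2 * dy1;
        H.g = (sx * dy2 - dx2 * sy) / den;
        H.h = (dx1 * sy - sx * dy1) / den;
        H.a = x[1] - x[0] + H.g * x[1];
        H.b = x[3] - x[0] + H.h * x[3];
        H.c = x[0];
        H.d = y[1] - y[0] + H.g * y[1];
        H.e = y[3] - y[0] + H.h * y[3];
        H.f = y[0];
    }
    return H;
}

// Applying the homography to one point is a few multiplies and one divide.
void apply(const Homography& H, double u, double v, double* xOut, double* yOut)
{
    const double w = H.g * u + H.h * v + 1.0;
    *xOut = (H.a * u + H.b * v + H.c) / w;
    *yOut = (H.d * u + H.e * v + H.f) / w;
}

This also addresses the per-frame CPU worry: building H only needs to happen when the four reference points move, and applying it is a handful of multiplies and one division per point.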
Best advice I can give: read a good book on the subject, for example "Multiple View Geometry" by Hartley and Zisserman.
I paint in my spare time, and that means I have a truly massive collection of reference images. Folders full of buildings, people, animals, cars, etc. It's gotten to the point where it'd be great to tag the objects by their pose, so I can find the right object at the right angle. CVAT, an image annotating tool for machine learning, allows you to mark images with cuboids, as you can see in this picture.
But suddenly I'm wondering... is it even possible for a computer to estimate the rotation of a cuboid based on a single image, when all I can feed it are the eight (x,y) pairs that define the image of said cuboid?
My thinking is that I need to somehow invert the transformation matrix so that this cuboid looks like a rectangle. That would mean that we're looking at it "on-axis", and I'm imagining that this inversion could furnish me with those XYZ rotations I'm looking for.
My best lead right now is OpenCV's getPerspectiveTransform function, which can create a matrix that will warp an image, but that transformation seems to be purely two-dimensional.
Wikipedia does mention the idea of using an "augmented matrix" to perform transformations in an extra dimension, which seems apropos here, since I want to go from a 2D representation to a 3D one.
A couple of constraints and advantages that might clarify the feasibility here:
The cuboids are rendered in a parallel projection. They don't match the perspective of the image, and that's okay! I just need a rough sense of their pose; a margin of error of 10 degrees on any given axis of rotation is fine by me, in case there are some inexact solutions that could work.
In the case of multiple cuboids in the scene, I don't care at all about their interrelations; each case can be treated separately.
I always have a sense of the "rear wall" of the cuboid, because I'm careful in how I make these annotations, in case that symmetry-breaking helps.
The lengths of the edges are irrelevant; I'm not trying to measure the "aspect ratio" of these bounding cuboids.
Thank you for any advice or hints!
I am currently writing a simple 3D sound system, but I have gotten stuck.
The problem is:
I have a point in space where the sound comes from, and of course I have the listener, with a position and an orientation.
Distance was not the problem; it works perfectly. But when I want to calculate the pan of the sound (how far left/right it should be), it is a total disaster.
I searched the internet but couldn't find any usable solution, and then I tried to work it out myself with triangles and such, but you don't want to know what the result was.
I won't show code, because I have written it three times and each version was unusable.
You don't necessarily have to give me code; I will be happy with a mathematical solution. I would like to know how far left or right of the camera the sound is.
I work in C++11, and the sound library is Audiere.
Edit:
Thanks to willywonkadailyblah, I figured out a solution. You can see it here:
Get x position of sound from camera
With the dot product I can get the cos(alpha) of the triangle; then, with the simple Pythagorean theorem, I can get the distance (A).
In the end, I divide pandistance by heardistance and multiply by the sound distance.
You could calculate the dot product of the camera's (normalized) direction vector and the normalized vector from the camera to the sound source. This gives the cosine of the angle between the vectors: if the dot product is closer to zero than to one, the sound source is more "to the side" than "in front".
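To get a pan value directly, you can dot against the listener's right vector (via a cross product) instead of the forward vector; the resulting cosine is then already usable as a pan. A sketch with hand-rolled vector math; computePan and the helpers are mine, not part of Audiere, and I'm assuming your sound API wants a pan in [-1, 1]:

#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}
static Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}

// Returns a pan in [-1, +1]: -1 = fully left, 0 = centered, +1 = fully right.
// Depending on the handedness of your coordinate system you may need to
// flip the sign of the result.
float computePan(Vec3 listenerPos, Vec3 listenerForward, Vec3 listenerUp,
                 Vec3 soundPos)
{
    const Vec3 right   = normalize(cross(listenerForward, listenerUp));
    const Vec3 toSound = normalize(sub(soundPos, listenerPos));
    return dot(toSound, right);  // cosine of the angle to the "right" axis
}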
I am having trouble incorporating transformations. For whatever reason, everything is not going the way I think it should, and to be honest, all the transformations back and forth make me quite dizzy.
As I have read everywhere (although explicit explanations are rare, imho), the basic algorithm for transformations is as follows (sketched in code after the list):
transform the ray (origin and direction) with the inverse of the transformation matrix
transform the resulting intersection point with the transformation matrix
transform the object's normal at the intersection point with the transpose of the inverse
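Assuming 4x4 matrices and homogeneous coordinates, a minimal sketch of those three steps; I'm using GLM types just for brevity (any vec/mat types work), and intersectUnitObject stands in for your obj.getIntersection():

#include <glm/glm.hpp>

struct Ray { glm::vec3 origin; glm::vec3 direction; };
struct Hit { bool hit; float t; glm::vec3 point; glm::vec3 normal; };

Hit intersectUnitObject(const Ray& localRay); // your untransformed test

Hit intersectTransformed(const Ray& worldRay,
                         const glm::mat4& M,     // object-to-world matrix
                         const glm::mat4& invM)  // its precomputed inverse
{
    // Step 1: transform the ray into object space with the inverse matrix.
    // The origin is a point (w = 1), the direction a vector (w = 0).
    Ray local;
    local.origin    = glm::vec3(invM * glm::vec4(worldRay.origin, 1.0f));
    local.direction = glm::vec3(invM * glm::vec4(worldRay.direction, 0.0f));
    // Deliberately NOT normalizing local.direction keeps the hit's t value
    // directly comparable with t values measured in world space.

    Hit h = intersectUnitObject(local);
    if (h.hit) {
        // Step 2: transform the intersection point back with M (w = 1).
        h.point = glm::vec3(M * glm::vec4(h.point, 1.0f));
        // Step 3: transform the normal with the transpose of the inverse
        // (w = 0), then renormalize, since scaling changes its length.
        h.normal = glm::normalize(
            glm::vec3(glm::transpose(invM) * glm::vec4(h.normal, 0.0f)));
    }
    return h;
}

The two classic bugs with exactly these symptoms are using w = 1 for the direction (which wrongly applies the translation to it) and forgetting to renormalize the normal after step 3, which then breaks the lighting.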
From what I understood, that should do the trick. I'm pretty sure my problem arises when I calculate the lighting, since both the initial intersection and the lighting algorithm use the same function (obj.getIntersection()). But then again, I have no idea. :(
You can read parts of my code here:
main.cpp, scene.cpp, sphere.cpp, sdf-loader.cpp
Please let me know if you need more info to help me - and please help me! ;)
EDIT:
I rendered some results; maybe someone can "see" from them where I may be wrong:
untransformed scene:
sphere scaled (2,4,2):
box translated (0,-200,0):
sphere translated (-300,0,0):
sphere x-rotated (45°):
Generally, for transformations in computer graphics, I would recommend having a look at scratchapixel.com, and particularly this lesson:
http://scratchapixel.com/lessons/3d-basic-lessons/lesson-4-geometry/
and this one, where you can see how transformations (matrices) are used to transform rays and objects:
http://scratchapixel.com/lessons/3d-basic-lessons/lesson-8-putting-it-all-together-our-first-ray-tracer/
If you don't know this amazing resource yet, I would advise using it, and maybe spread the word at your university. Your teacher should have pointed it out to you.
I am trying to create a simple matrix library in C++ that I will hopefully be able to use in game development afterwards.
I have the basic implementation done, but I have just realized a problem with storing only one matrix per object: the rotation order will get mixed up fairly quickly.
To the best of my knowledge: AB != BA
Therefore, if I am continually multiplying arbitrary rotations onto my matrix, then the rotations will get mixed up, correct? In my case, I need to rotate globally on the Y axis and locally on the X axis (and locally on the Z axis would be nice as well). These seem like the qualities of the average first-person shooter. By "mixed up", I mean that if I go to rotate on the Y axis (or Z axis), it will start rotating around the local X axis instead of the intended axis (if that makes any sense).
So, these are the solutions I came up with:
Keep 3 Euler angles, and rebuild the matrix in the correct order when one angle changes
Keep 3 Matrices, one for each axis
Somehow destruct the matrix during multiplication, and reconstruct it properly afterwards (?)
Or am I worrying about nothing? Are my qualms unfounded, and will the order somehow magically sort itself out?
You are correct that the order of rotation matrices can be an issue here.
Especially if you use Euler angles, you can suffer from gimbal lock: say your first rotation is +90° of pitch, so that you're looking straight up; if the next rotation is +45° of roll, you're still just looking straight up. But if you apply those same rotations in the opposite order, you end up looking somewhere different altogether. (See the Wikipedia article on gimbal lock for an illustration that makes this clearer.)
One common answer in game development is what you've got in (1): store the Euler angles independently, and then build the rotation matrix out of all three of them at once every time you want the object's orientation in world space, as sketched below.
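In code, that looks something like this; a sketch with hand-rolled 3x3 matrices (no particular library assumed, all names here are mine):

#include <cmath>

struct Mat3 { double m[3][3]; };

Mat3 mul(const Mat3& A, const Mat3& B) {
    Mat3 C{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                C.m[i][j] += A.m[i][k] * B.m[k][j];
    return C;
}

Mat3 rotX(double a) { const double c = std::cos(a), s = std::sin(a);
    return {{{1, 0, 0}, {0, c, -s}, {0, s, c}}}; }
Mat3 rotY(double a) { const double c = std::cos(a), s = std::sin(a);
    return {{{c, 0, s}, {0, 1, 0}, {-s, 0, c}}}; }
Mat3 rotZ(double a) { const double c = std::cos(a), s = std::sin(a);
    return {{{c, -s, 0}, {s, c, 0}, {0, 0, 1}}}; }

struct Orientation {
    double yaw = 0.0, pitch = 0.0, roll = 0.0; // stored independently

    // For column vectors, M = Ry(yaw) * Rx(pitch) * Rz(roll): roll is
    // applied first (most local), yaw last (in world space). That is the
    // "global Y, local X" FPS behavior from the question, and nothing can
    // ever drift or get mixed up, because no matrix is ever accumulated.
    Mat3 matrix() const {
        return mul(rotY(yaw), mul(rotX(pitch), rotZ(roll)));
    }
};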
Another common solution is to store rotation as an angle around a single axis, rather than as Euler angles. (That is often less convenient for animators and player motion.)
We also often use quaternions as a more efficient way of storing and combining rotations.
Each of the links above should take you to an article illustrating the relevant math. I also like Eric Lengyel's Mathematics for 3D Game Programming and Computer Graphics book, which explains this whole subject very well.
I don't know how other people usually do this, but I generally just store the angles, and then reconstruct a matrix if necessary.
You are right that if you kept one matrix and continually multiplied rotations onto it, you would end up messing things up, so I don't think that is the route you want to take.
I don't know what sort of graphics system you want to use, but with OpenGL you don't even have to worry about the matrix representation (unless you're doing something super performance-critical), and can simply use a few calls to glRotate and the like.
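For example, with the legacy fixed-function pipeline (deprecated in modern OpenGL, but matching the suggestion above), positioning an object from stored angles looks roughly like this each frame; the pos/angle variables are placeholders:

// Rebuild the object's transform from scratch every frame, from stored
// angles, instead of accumulating rotations into a kept matrix.
glLoadIdentity();
glTranslatef(posX, posY, posZ);
glRotatef(yawDegrees,   0.0f, 1.0f, 0.0f); // global yaw first (outermost)
glRotatef(pitchDegrees, 1.0f, 0.0f, 0.0f); // then local pitch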