I am currently writing a simple 3D sound system, but I got stuck.
The problem is:
I have a point in the space, where the sound comes from, and of course I have the listener, with his point and orientation.
Distance is not the problem; that works perfectly. But calculating the pan of the sound (how far left or right it is) has been a total disaster.
I searched the internet but couldn't find a usable solution. Then I tried to work it out myself with triangles and such, but you don't want to know what the result was.
I won't show code, because I have written it three times and each version was unusable.
You don't necessarily have to give me code; I will be happy with a mathematical solution. I would like to know how far left or right of the camera the sound is.
I work in C++11 and the sound library is Audiere.
Edit:
Thanks to willywonkadailyblah I figured out some solution. You can watch it here:
Get x position of sound from camera
With the dot product I can get the cosine of the angle alpha in the triangle; then, with the Pythagorean theorem, I can get the distance (A).
Finally, I divide the pandistance by the heardistance and multiply by the sound distance.
You could calculate the dot product of the camera direction and the vector joining the camera and the sound source. This gives the cosine of the angle between the vectors. If the dot product is closer to zero than one, the sound source is more "to the side" than "in front".
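A minimal sketch of that idea in C++11, extended so the result is signed. The dot product with the forward vector only says how far "to the side" the source is, not which side, so the sketch takes the dot product with a "right" vector instead, built from a cross product. The names Vec3 and computePan are made up for illustration, and a right-handed coordinate system with a known "up" vector is assumed. The result lies in [-1, 1] (negative is left), which I believe matches the range Audiere's setPan expects, but check that against your version.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}
inline Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    if (len == 0.0f) return Vec3{0.0f, 0.0f, 0.0f};  // avoid dividing by zero
    return Vec3{v.x / len, v.y / len, v.z / len};
}

// Hypothetical helper: returns -1 (fully left) .. +1 (fully right).
// It is 0 when the source is straight ahead, straight behind,
// directly above/below, or at the listener's position.
float computePan(Vec3 listenerPos, Vec3 forward, Vec3 up, Vec3 sourcePos) {
    // Assumes a right-handed system: right = forward x up.
    Vec3 right = normalize(cross(forward, up));
    Vec3 toSource = normalize(sub(sourcePos, listenerPos));
    // Component of the unit direction to the source along the right axis.
    return dot(right, toSource);
}
```

With the listener at the origin looking down -Z and +Y as up, a source at (5, 0, 0) gives a pan of 1 and a source straight ahead gives 0.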
I'm writing to ask about homography and perspective projection.
I'm trying to write a piece of code, that will "warp" my image so that its corners align with 4 reference points that are in the 3D space - however, the game engine that I'm running it in, already allows me to get the screen position of them, so I already have their screen-space coordinates of both xi,yi and ui,vi, normalized to values between 0 and 1.
I have to mention that I don't have a degree in mathematics, which seems to be a requirement in the posts I've seen on this topic so far, but I'm hoping there is actually a solution to this problem that one can comprehend. I never had a chance to take classes in Computer Vision.
The reason I came here is that in all the posts I've seen online, the simple explanation that I came across is that each point must be put into a 1x3 matrix and multiplied by a 3x3 homography, which consists of 9 components h1,h2,h3...h9, and this transformation matrix will transform each point to the correct perspective. And that's where I'm hitting a brick wall - how do I calculate the transformation matrix? It feels like it should be a relatively simple algebraic task, but apparently it's not.
At this point I have spent days reading on the topic. The solutions I've come across are either based on MATLAB (which has a ton of mathematical functions built in), or include elaborations and discussions that don't really explain much. Sometimes they suggest tons of different parameters and simplifications but rarely explain why or what their purpose is, or they reference books and studies that have since been removed from the web, and I found myself more confused than I was at the beginning. Most of the resources I managed to find online were also written for a different context: image stitching and 3D engine development.
I also want to mention that I need to run this code each frame on the CPU, and I'm fairly concerned about the effect of having to run too many matrix transformations and solving a ton of linear algebra equations.
I apologize for not asking about any specific code, but my general question is - can anyone point me in the right direction with this issue?
Limit the problem you deal with.
For example, if you always warp the entire rectangular image, you can treat the coordinates of the image corners as {(0,0), (1,0), (0,1), (1,1)}.
This simplifies the equations enough that you can solve them by hand.
Then you'll be able to implement the answer yourself.
Note: a homography is only defined up to scale, so you can reduce the degrees of freedom to 8 (e.g. you can solve the equations with h9 = 1).
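To make that concrete, here is a sketch of one closed form you get when the source corners are the unit square and h9 is fixed to 1. The eight corner equations then collapse to a 2x2 linear system for h7 and h8, and the rest follows by substitution. The names Vec2, Homography, unitSquareToQuad and applyHomography are hypothetical, the corner order follows the one given above, and the code assumes the four target points form a non-degenerate quad.

```cpp
#include <cmath>

struct Vec2 { double x, y; };

struct Homography {
    double h[9];  // row-major h1..h9, with h9 fixed to 1
    bool valid;   // false if the target quad is degenerate
};

// Maps (0,0)->p00, (1,0)->p10, (0,1)->p01, (1,1)->p11.
Homography unitSquareToQuad(Vec2 p00, Vec2 p10, Vec2 p01, Vec2 p11) {
    Homography H = {};  // zero-initialised, valid == false
    double dx1 = p10.x - p11.x, dy1 = p10.y - p11.y;
    double dx2 = p01.x - p11.x, dy2 = p01.y - p11.y;
    double sx = p00.x - p10.x + p11.x - p01.x;
    double sy = p00.y - p10.y + p11.y - p01.y;

    double det = dx1 * dy2 - dy1 * dx2;
    if (std::fabs(det) < 1e-12) return H;  // three corners are collinear

    // Solve  g*dx1 + h*dx2 = sx,  g*dy1 + h*dy2 = sy  (Cramer's rule).
    double g = (sx * dy2 - sy * dx2) / det;
    double h = (dx1 * sy - dy1 * sx) / det;

    H.h[0] = p10.x - p00.x + g * p10.x;
    H.h[1] = p01.x - p00.x + h * p01.x;
    H.h[2] = p00.x;
    H.h[3] = p10.y - p00.y + g * p10.y;
    H.h[4] = p01.y - p00.y + h * p01.y;
    H.h[5] = p00.y;
    H.h[6] = g;
    H.h[7] = h;
    H.h[8] = 1.0;
    H.valid = true;
    return H;
}

// Applies H to the point (u, v, 1) and divides by the third component.
Vec2 applyHomography(const Homography& H, double u, double v) {
    double w = H.h[6] * u + H.h[7] * v + H.h[8];
    return Vec2{(H.h[0] * u + H.h[1] * v + H.h[2]) / w,
                (H.h[3] * u + H.h[4] * v + H.h[5]) / w};
}
```

This is a handful of multiplications and two divisions per frame, so running it on the CPU every frame should not be a concern. Mapping in the other direction (screen back to image) would use the inverse of the same 3x3 matrix.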
Best advice I can give: read a good book on the subject, for example "Multiple View Geometry" by Hartley and Zisserman.
I paint in my spare time, and that means I have a truly massive collection of reference images. Folders full of buildings, people, animals, cars, etc. It's gotten to the point where it'd be great to tag the objects by their pose, so I can find the right object at the right angle. CVAT, an image annotating tool for machine learning, allows you to mark images with cuboids, as you can see in this picture.
But suddenly I'm wondering... is it even possible for a computer to estimate the rotation of a cuboid based on a single image, when all I can feed it are the eight (x,y) pairs that define the image of said cuboid?
My thinking is that I need to somehow invert the transformation matrix so that this cuboid looks like a rectangle. That would mean that we're looking at it "on-axis", and I'm imagining that this inversion could furnish me with those XYZ rotations I'm looking for.
My best lead right now is OpenCV's getPerspectiveTransform function, which can create a matrix that will warp an image, but that transformation seems to be purely two-dimensional.
Wikipedia does mention the idea of using an "augmented matrix" to perform transformations in an extra dimension, which seems apropos here, since I want to go from a 2D representation to a 3D one.
A couple constraints & advantages that might clarify the feasibility, here:
The cuboids are rendered in a parallel projection. They don't match the perspective of the image, and that's okay! Just need a rough sense of their pose -- a margin of error of 10 degrees on any given axis of rotation is fine by me, in case there are some inexact solutions that could work.
In the case of multiple cuboids in the scene, I don't care at all about their interrelations -- each case can be treated separately.
I always have a sense of the "rear wall" of the cuboid, because I'm careful in how I make these annotations, in case that symmetry-breaking helps.
The lengths of edges are irrelevant, I'm not trying to measure the "aspect ratio" of these bounding cuboids.
Thank you for any advice or hints!
I'm trying to write a 2D game with SFML. For that game I need a light engine and some code that can give me the area of the world that is visible to the player. As both problems fit together very well (they are practically the same), I would like to solve both at once.
My world will be loaded from files in which the hitboxes of objects are represented as polygons.
I have now written some code that takes a list of polygons and the direction of a ray that follows the mouse, and finds the closest intersection with any of these polygons.
The next step would be to cast rays from the player's or light's position towards the edges of the polygons, as well as rays offset by +-0.000001 radians, to determine the visible area and return it as a polygon.
The problem, though, is that my algorithm (it calculates the intersection between two lines with vector mathematics) is too slow.
On my very good PC I get 100 fps with 300 edges and one ray.
I have read many articles online but couldn't find one best solution. As far as I have read, it should be much faster to calculate intersections with triangles.
My question now: would it be meaningfully faster to triangulate the polygons once while loading the map and then use ray-triangle intersection, or is there a better way that you know of to solve my problem?
I have also heard of bounding volume hierarchies, but I don't know how much impact they would have.
I'm a bit surprised by how much power my algorithm consumes, as it only has to calculate some two-dimensional intersections...
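For comparison, a plain 2D ray-versus-segment test with 2D cross products can be sketched as below. The names Vec2, Segment, raySegment and closestHit are hypothetical, and this is not the code from the question. Each test is a few multiplications and two divisions, so if 300 edges and one ray cost a noticeable amount of frame time, it may be worth profiling the rest of the loop (allocations, drawing, a debug build) before triangulating anything.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Vec2 { float x, y; };
struct Segment { Vec2 a, b; };

// Solves origin + t*dir = a + u*(b - a).
// Returns t (distance along the ray in units of |dir|), or -1 on a miss.
float raySegment(Vec2 origin, Vec2 dir, const Segment& s) {
    Vec2 e{s.b.x - s.a.x, s.b.y - s.a.y};
    float denom = dir.x * e.y - dir.y * e.x;       // cross(dir, e)
    if (std::fabs(denom) < 1e-8f) return -1.0f;    // ray parallel to segment

    Vec2 ao{s.a.x - origin.x, s.a.y - origin.y};
    float t = (ao.x * e.y - ao.y * e.x) / denom;     // position along the ray
    float u = (ao.x * dir.y - ao.y * dir.x) / denom; // position along the segment
    if (t < 0.0f || u < 0.0f || u > 1.0f) return -1.0f;
    return t;
}

// Smallest hit distance over all segments, or +infinity if nothing is hit.
float closestHit(Vec2 origin, Vec2 dir, const std::vector<Segment>& segments) {
    float best = std::numeric_limits<float>::infinity();
    for (std::size_t i = 0; i < segments.size(); ++i) {
        float t = raySegment(origin, dir, segments[i]);
        if (t >= 0.0f && t < best) best = t;
    }
    return best;
}
```

A ray from (0, 0) in direction (1, 0) against vertical segments at x = 5 and x = 2 returns 2, the nearer one.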
For everyone looking for the solution I finally went with:
I discovered the Box2D Physics Engine and I am now using the b2World::RayCast(...) function to determine whether and where a ray hits an object in my scene.
So far everything runs fine and smoothly (no exact benchmark yet) :)
I got it to work with the help of this site:
http://www.iforce2d.net/b2dtut/world-querying
Have a nice day! :)
To give you an idea of where I'm coming from, this started as a teaching exercise to get a 12-year-old video game addict into coding. The 2D games I did in SDL with him, and that was fine because I wasn't planning on going into 3D. Yeah, right! So now I'm in at the deep end in OpenGL and mainly trying to figure out exactly what it can and cannot do. I understand the theory (still working on Béziers and NURBS, if the truth be told) and could code the whole thing by hand in calculated triangular vertices, but I'd hate to spend days on that only to be told that there's a built-in function/library that does the whole thing faster and easier.
Quadrics seem to be extremely powerful but not terribly flexible. Consider the human head (roughly speaking, a 3x4x3 sphere) or a torso as a truncated cone that's taller than it is wide and wider than it is thick. Again, a quadric shape with independent x, y and z radii. Since only one radius is provided, am I right in thinking that I would have to generate it around the origin and then apply a scaling matrix to adjust them? Furthermore, if this is so, am I also correct in thinking that saving the results into a vertex array rather than a frame list results in the system neither knowing nor caring how they got there?
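As a rough illustration of the "generate around the origin, then scale" idea, the sketch below builds unit-sphere points and multiplies each axis by its own radius on the CPU, which gives the same positions a scaling matrix would. Vertex and makeEllipsoid are hypothetical names and only positions are produced. Once the numbers sit in a vertex array, nothing records how they were computed. One caveat: normals need separate handling under non-uniform scaling, since scaling them the same way as positions does not keep them perpendicular to the surface.

```cpp
#include <cmath>
#include <vector>

struct Vertex { float x, y, z; };

// Builds (stacks + 1) * (slices + 1) points on an ellipsoid centred on the
// origin, with independent radii rx, ry, rz. Y is the pole axis.
std::vector<Vertex> makeEllipsoid(float rx, float ry, float rz,
                                  int stacks, int slices) {
    const float pi = 3.14159265358979f;
    std::vector<Vertex> verts;
    verts.reserve((stacks + 1) * (slices + 1));
    for (int i = 0; i <= stacks; ++i) {
        float phi = pi * i / stacks;              // 0 at the top pole, pi at the bottom
        for (int j = 0; j <= slices; ++j) {
            float theta = 2.0f * pi * j / slices; // angle around the Y axis
            // Point on the unit sphere...
            float ux = std::sin(phi) * std::cos(theta);
            float uy = std::cos(phi);
            float uz = std::sin(phi) * std::sin(theta);
            // ...scaled per axis, the same result a scaling matrix would give.
            verts.push_back(Vertex{rx * ux, ry * uy, rz * uz});
        }
    }
    return verts;
}
```

For the head example, makeEllipsoid(3.0f, 4.0f, 3.0f, 8, 12) would give the 3x4x3 shape described above.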
Transformations: I'm familiar with the basic transformations but, again, consider the torso. It can achieve, maybe, a 45 degree twist from the hips to the shoulders that is distributed linearly across the entire length, or a sideways lean. These are applied around the Y or Z axis respectively, but I've obviously missed something about applying transformations that are based on an independent value (e.g. rot = dist x (max_rot/max_dist)). Again, I could do this by hand (and will probably have to in order to apply the correct physics), but does OpenGL have this functionality built in somewhere?
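As far as I know, the fixed-function matrix stack applies one matrix to every vertex it draws, so an angle that varies per vertex has to be computed either on the CPU or in a vertex shader. A CPU-side sketch of the rot = dist x (max_rot/max_dist) idea for a twist around the Y axis might look like this. Vertex and twistAroundY are hypothetical names, and angles are in radians.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };

// Rotates each vertex around the Y axis by an angle that grows linearly
// with its height: 0 at yBottom, maxRotRadians at yTop.
void twistAroundY(std::vector<Vertex>& verts,
                  float yBottom, float yTop, float maxRotRadians) {
    float maxDist = yTop - yBottom;
    if (maxDist == 0.0f) return;  // nothing to distribute the twist over
    for (std::size_t i = 0; i < verts.size(); ++i) {
        Vertex& v = verts[i];
        float dist = v.y - yBottom;
        float rot = dist * (maxRotRadians / maxDist);
        float c = std::cos(rot);
        float s = std::sin(rot);
        // Standard rotation about Y; the y coordinate is unchanged.
        float x = v.x * c + v.z * s;
        float z = -v.x * s + v.z * c;
        v.x = x;
        v.z = z;
    }
}
```

A sideways lean could be done the same way with the rotation taken around Z instead of Y.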
Any other areas of research I need to put in would be appreciated in the notes.
I have made several attempts to fix this and read all I could find here, on the forum, and via Google. I used a CCD threshold much lower than my object's movement speed and a CCD radius much smaller than the object's half radius. The only thing this does is make the multisphere get stuck on seams. I also tried setting ERP/ERP2 to 0.9/1.0.
[EDIT] OK, so after some more reading: CCD will not work if the sphere is already touching the ground, and ERP only affects objects with joints, if I understand correctly.
The ground is a trimesh made in Blender and using the obtainStaticNodeShape to get the shape. I have tried to scale the mesh to get smaller polygons but even the smallest (for the game acceptable) size does not work, about 32k indices with 11k polys, 500x500 units, the multisphere has a radius of 0.45 units.
[EDIT] the multi-sphere is two spheres on top of each other and they are restricted to angular movement around the Y-axis only, so no rolling.
The sphere gets "sucked" through the ground fast; it does not sink slowly. Making the fixed timestep smaller (1/420 with 64 substeps) did not give any better results. This happens most often while ascending or descending a slope. My ground is gently sloped, but an incline of 20% seems to be enough for it to fall through a lot. It can happen on level ground too, just not as often.
When I did my first test I used a big stretched out cube as ground and it worked well.
So my problem now is that I don't even know why this is happening, so I have no idea what to try next. Can anyone please give me a solution or some pointers?
Is there any use in increasing the multi-sphere size (for the game I cannot increase it by more than 25-30%)? I have not explicitly set any collision margins, but I think this would just make my sphere float over the ground? Is there any benefit in changing the ground from a static object to a kinematic one?
Would it work to use a raytest from the sphere straight down and push it up if it is lower than the ground? I think not, why would it fall through if it could detect the ground in the first place..?
[EDIT: additional info]
There are quite a few reports of similar problems floating around on forums and also here on Stack Overflow. Most seem to be about very small objects. Small objects (<0.2 m) are clearly not a good option for Bullet unless you want to increase the number of simulation steps quite a lot. My problem does not seem to fall under this category, since my smallest object is 0.9 m in diameter.
I have now also done a debug draw to see the normals of the trimesh that I use as ground. I can not find any errors with the normals.
I also tried to increase the collision margins of the spheres, but to no avail.
I further tried to use suggested settings:
((btDefaultCollisionConfiguration)world.collisionConfiguration).setPlaneConvexMultipointIterations(3, 3);
((btDefaultCollisionConfiguration)world.collisionConfiguration).setConvexConvexMultipointIterations(3, 3);
No difference.
I did, however, read about big trimeshes not working very well for raycasting. My mesh is big (512x512 units), but I am not sure if this could cause my object to fall through the mesh?
I also read that sphere shapes have problems with trimeshes, but again I am not sure if this applies to my case? The sphere I am using is locked for rotation on all axes.
I have also tried using a btCapsule, but it gave the same results. Would a cylinder work better?
[EDIT]
I have tried using a cylinder instead, since sphere and capsule did not work. The cylinder works a lot better, though I have still got it to fall through once. The cylinder was jerking around a lot before it went through, whereas the sphere/capsule would just go through really fast and easily. Maybe this could be a clue to what the underlying problem is? A cylinder is not the best character shape, though.
Another possible reason could be a triangle in the mesh with sides that are too long or a large ratio between its sides. I found a few of those on a slope where my sphere always falls through. If this is indeed the problem, can I do anything about it except manually editing the mesh in Blender?
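One way to test that suspicion without inspecting the mesh by eye is to scan the index list at load time and flag triangles whose longest edge or longest/shortest edge ratio exceeds some limit. The sketch below does only that. The names Vec3, edgeLength and findSuspectTriangles are hypothetical, both thresholds are guesses to tune against the triangles where the sphere falls through, and it does not confirm that such triangles are the actual cause.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

float edgeLength(const Vec3& a, const Vec3& b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Returns the triangle numbers (index / 3) of triangles that are degenerate,
// have an edge longer than maxEdge, or have longest/shortest edge > maxRatio.
std::vector<std::size_t> findSuspectTriangles(const std::vector<Vec3>& vertices,
                                              const std::vector<unsigned>& indices,
                                              float maxEdge, float maxRatio) {
    std::vector<std::size_t> suspects;
    for (std::size_t i = 0; i + 2 < indices.size(); i += 3) {
        const Vec3& a = vertices[indices[i]];
        const Vec3& b = vertices[indices[i + 1]];
        const Vec3& c = vertices[indices[i + 2]];
        float ab = edgeLength(a, b);
        float bc = edgeLength(b, c);
        float ca = edgeLength(c, a);
        float longest = std::max(ab, std::max(bc, ca));
        float shortest = std::min(ab, std::min(bc, ca));
        bool degenerate = shortest <= 0.0f;  // two corners coincide
        if (degenerate || longest > maxEdge || longest / shortest > maxRatio) {
            suspects.push_back(i / 3);
        }
    }
    return suspects;
}
```

If the flagged triangles line up with the spots where the sphere falls through, subdividing just those faces would be a narrower fix than rescaling the whole mesh.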
As you can see, there are a lot of these questions and a lot of possible answers, and I have no idea which one corresponds to my case. Some pointers from someone with better insight would mean a lot, thanks!