OpenGL rubiks cube - face rotation with mouse - c++

I am working on my first real OpenGL Project. It is a 3x3x3 Rubiks Cube. Here is a link to a simple screenshot of what i have so far(my rubiks cube)
Rotating the cube is done with dragging the mouse while holding the right mouse button. This works using the example of a arcball from NeHe Tutorials(NeHe Arcball)
I have the class singleCubes which represents one cube via 6 actual quads, stored in a display list that can be used in it´s draw method.
Class ComplexCube has an array of 3x3x3 singleCubes and is used as interface when interacting with the complete rubiks cube.
Now i want to rotate each specific face according to the mousedragging with left mouse button down. I use picking to get the id of the corresponding side of the single cube the user clicked on. This works also. So i click on a side of one cube on a face and depending on the direction of the dragging i set a rotation and offset factor of the cubes that get affected. (i also want to implement that u actually see the face rotate instead of just changing the color)
Now my Problem is that when i rotate the rubiks cube in any direction with right mouse dragging, it becomes upside down for example. So when i click on a side and want to rotate the face to the right, it´s going the wrong direction because i can´t keep track if the cube is upside down or whatever. Due to the use of the arcball rotation i dont have a x- or y-rotation angle which i could use to determine this.
Question 1: How can i keep track or later on get the information if the cube is upside down, tilted etc in order to translate the mouse dragging information(when rotating one face) when using the arcball example linked above?
// In render function
glPushMatrix();
{
glMultMatrixf(Transform.M); // Rotation applied by arcball object
complCube.draw(); // Draw all the cubes using display lists
}
glPopMatrix();
Setup: C++ with Microsoft Visual Studio 2008, GLEW, freeglut

You could use gluUnProject to convert mouse coordinates to 3d space and get a vector (difference between two points). This vector could then be used to apply a "force" to the selected face. Since gluUnProject uses the projection matrix, it would automatically deal with the orientation of the camera.
Basically, once you get your "force" vector, you project it onto the three axes (so onto (1,0,0), (0,1,0), (0,0,1)). Then choose the one with the largest magnitude. Then you have to convert this direction into a rotation axis as in the diagram below (sorry for the bad paint skills):
So what we have is the "force" vector in black and the selected rubiks face in grey. To get the rotation axis, just take the cross product the "force" vector with the normal of the selected face. This gives the red arrow. From that, you should be able to rotate your cubes in the right direction.
Edit to answer the question in more detail
So continuing from my explanation, I will give an example of how this will help you. Let's first assume your screen is 800x800 pixels and your rubiks cube is always centred. Now lets also assume that, as per your drawings in the comments, that we are in the case on the left.
We drag the mouse and get two positions which using gluUnProject are transformed into world coordinates (the numbers were chosen to show my point, not by any calculation):
p1 : (600, 600) -> (1, -0.5, 0)
p2 : (630, 605) -> (1.3, -0.505, 0)
Now we get the difference vector: p2 - p1 = v = (0.3, -0.05, 0). The reason that I was saying to "project onto the three axes" is so that you extract your major movement (which in this case is 0.3 in the x axis) (since the rubiks cube can't rotate along diagonals). To do the "projection" you just have to take the x, y, z axes individually and create vectors from them so you wind up with:
v1 = (0.3, 0, 0)
v2 = (0, -0.05, 0)
v3 = (0, 0, 0)
Now take the magnitudes and discard the smallest vectors, so we are left with the vector v1 = (0.3, 0, 0). This is your movement vector in world space. Now you take the cross product of that vector, with the normal vector of the selected face (which in this case would be (0, 0, 1)). This gives you a vector which points down (0, 1, 0) (after normalization) (in this step you will probably also have to extract the largest component only (0.02, 1.2, 0.8) -> (0, 1, 0) otherwise you would get bizarre rotations if your camera was not pointing directly along the main axes). You can now use that vector as the rotation axis and use 0.3 as your rotation amount (if it rotates in the opposite direciton to that expected, just put a -).
Now how does this help if your cube is upside down? Suppose we click on the screen in the same way. We now get:
p1 : (600, 600) -> (-1, 0.5, 0)
p2 : (630, 605) -> (-1.3, 0.505, 0)
See the difference in the world coordinates? They are inverted! So when you take the difference vector p2 - p1 = v = (-0.3, 0.05, 0). Extracting the largest component vector gives (-0.3, 0, 0). Doing the cross product once again gives you the rotation axis, but now the rotation is in the opposite direction, which is what you want.
Another reason for the cross product with the normal of the face is that if you were to select the faces on the top (in our drawings), then it would either give a rotation axis along the x or z axes (to the left, or into the screen) which is what you want for the top faces.

Like most of us, you will encounter the famous problem called Gimbal Lock.
see: http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=208925
This problem is extremely well documented so there is not much point for me to go into details here. I am sure you will find a ton of information about it.

Related

What are we trying to achieve by creating a right and down vectors in this ray tracing depiction?

Often, we see the following picture when talking about ray tracing.
Here, I see the Z axis as the sort of direction if the camera pointed straight ahead, and the XY grid as the grid that the camera is seeing here. From the camera's point of view, we see the usual Cartesian grid me and my classmates are used to.
Recently I was examining code that simulates this. One thing that is not obvious from this picture to me is the requirement for the "right" and "down" vectors. Obviously we have look_at, which shows where the camera is looking. And campos is where the camera is located. But why do we need camright and camdown? What are we trying to construct?
Vect X (1, 0, 0);
Vect Y (0, 1, 0);
Vect Z (0, 0, 1);
Vect campos (3, 1.5, -4);
Vect look_at (0, 0, 0);
Vect diff_btw (
campos.getVectX() - look_at.getVectX(),
campos.getVectY() - look_at.getVectY(),
campos.getVectZ() - look_at.getVectZ()
);
Vect camdir = diff_btw.negative().normalize();
Vect camright = Y.crossProduct(camdir);
Vect camdown = camright.crossProduct(camdir);
Camera scene_cam (campos, camdir, camright, camdown);
I was searching about this question recently and found this post as well: Setting the up vector in the camera setup in a ray tracer
Here, the answerer says this: "My answer assumes that the XY plane is the "ground" in world space and that Z is the altitude. Imagine standing on the floor holding a camera. It's position has a positive Z component, and it's view vector is nearly parallel to the ground. (In my diagram, it's pointing slightly upwards.) The plane of the camera's film (the uv grid) is perpendicular to the view grid. A common approach is to transform everything until the film plane coincides with the XY plane and the camera points along the Z axis. That's equivalent, but not what I'm describing here."
I'm not entirely sure why "transformations" are necessary.. How is this point of view different from the picture at the top? Here they also say that they need an "up" vector and "right" vector to "construct an image plane". I'm not sure what an image plane is..
Could someone explain better the relationship between the physical representation and code representation?
How do you know that you always want the camera's "up" to be aligned with the vertical lines in the grid in your image?
Trying to explain it another way: The grid is not really there. That imaginary grid is the result of the calculations of camera's directional vectors and the resolution you are rendering in. The grid is not what decides the camera angle.
When you are holding a physical camera in your hand, like the camera in the cell phone, don't you ever rotate the camera little bit for effect? Or when filming, you may want to slowly rotate the camera? Have you not seen any movies where the camera is rotated?
In the same way, you may want to rotate the "camera" in your ray traced world. And rotating only the camera is much easier than rotating all your objects in the scene(may be millions!)
Check out the example of rotating the camera from the movie Ice Age here:
https://youtu.be/22qniGtZhZ8?t=61
The (up or down) and right vectors constructs the plane you project the scene onto. Since the scene is in 3D you need to project the scene onto a 2D scene in order to render a picture to display on your screen.
If you have the camera position and direction you still don't know whether you're holding the camera right-side up, upside down, or tilted to the left and right.
Using camera position, lookat, up (down) and right vectors we can uniquely define the 3D scene is projected into a 2D picture.
Concretely, if you look at the code and the picture. The 3D scene are the objects displayed. The image/projection plane is the grid infront of the camera. It's orientation is defined by the the camright and camdir vectors (because we are assuming the cameras line of sight is perpendicular to camdir, camdown is uniquely defined by the other two).
The placement of the grid is based on the camera's position and intrinsic properties (it's not being displayed here, but the camera will have a specific field of view).

C++ OpenGL dragging multiple objects with mouse

just wondering how someone would go about dragging 4 different
objects in openGL. I have very simple code to draw these objects:
glPushMatrix();
glTranslatef(mouse_x, mouse_y, 0);
glutSolidIcosahedron();
glPopMatrix();
glPushMatrix();
glTranslatef(mouse_x2, mouse_y2, 0);
glutSolidIcosahedron();
glPopMatrix();
glPushMatrix();
glTranslatef(mouse_x3, mouse_y3, 0);
glutSolidIcosahedron();
glPopMatrix();
glPushMatrix();
glTranslatef(mouse_x4, mouse_y4, 0);
glutSolidIcosahedron();
glPopMatrix();
I understand how to move an object, but I want to learn how to drag and drop any one of these objects.
I've been researching about the Name Stack and Selection Mode, but it just confused the hell out of me.
And I also know about having to have some sort of glutMouseFunc.
It's just the selection of each shape I'm puzzled on.
First thing that you need to do is capturing the position of mouse on the screen when the button is clicked. There are plenty of ways to do it but I believe it's outside of the scope of this question. When you have screen X,Y coords you must detect if any object is selected and which one it is. There are two possible approaches. You can either keep track of a bounding rectangle positions of each object (in screen space) and the test if the cursor is inside any of those rectangles will be quite simple. Or you can cast a ray from eye through cursor position in world space and check intersection of this ray with each object.
The second approach is more versatile for 3D graphics but you seem to be using only X and Y coords so you don't need to worry about Z order of objects.
In case of the first solution the main problem is: how to know how big is your object on the screen. glutSolidIcosahedron() renders an object of radius 1. To calculate it's screen radius you can either use some matrix math or in that case a simple trigonometry. You will need to know the distance from camera to the drawing plane (I believe you're using some glTranslatef(0,0,X) before you render. X is your distance) You also need to know the view angle of the camera. You set it in projection matrix. Now take a piece of paper, draw a cone of angle alpha, intersecting a plane at distance X and knowing that an object has radius 1 you can easily calculate how large area of the screen it occupies. (I'll leave this calculation for you)
Now if you know the radius on screen, simply test the distance from your click position to each object. if the distance is below radius it's selected. If more than one object passes this test just select first one of them.

gluLookAt explanation?

Trying to understand gluLookAt, especially the last 3 parameters.
Can someone please explain ?
gluLookAt(camera[0], camera[1], camera[2], /* look from camera XYZ */
0, 0, 0, /* look at the origin */
0, 1, 0); /* positive Y up vector */
What exactly does it mean by "positive Y up vector" ?
Is it possible to have the last up-vector 3 parameters as all 1s, e.g. 1, 1, 1 ? And, if it is possible, what exactly does it mean ?
Is it possible for the up vector to have value more than 1, e.g. 2, 3, 4 ?
Thanks.
Sketchup to the rescue!
Your image has an 'up' to it that can be separate from the world's up. The blue window in this image can be thought of as the 'near-plane' that your imagery is drawn on: your monitor, if you will. If all you supply is the eye-point and the at-point, that window is free to spin around. You need to give an extra 'up' direction to pin it down. OpenGL will normalize the vector that you supply if it isn't unit length. OpenGL will also project it down so that it forms a 90 degree angle with the 'z' vector defined by eye and at (unless you give an 'up' vector that is in exactly the same direction as the line from 'eye' to 'at'). Once 'in' (z) and 'up' (y) directions are defined, it's easy to calculate the 'right' or (x) direction from those two.
In this figure, the 'supplied' up vector is (0,1,0) if the blue axis is in the y direction. If you were to give (1,1,1), it would most likely rotate the image by 45 degrees because that's saying that the top of the blue window should be pointed toward that direction. Consequently the image of the guy would appear to be tipped (in the opposite direction).
The "up vector" of gluLookAt is just the way the camera is oriented. If you have a camera at a position, looking directly at an object, there is one source of freedom still remaining: rotation. Imagine a camera pointed directly at an object, fixed in place. But, your camera can still rotate, spinning your image. The way OpenGL locks this rotation in place is with the "up vector."
Imagine (0, 0, 0) is directly at your camera. Now, the "up vector" is merely a coordinate around your camera. Once you have the "up vector," though, OpenGL will spin your camera around until directly the top of your camera is facing the coordinate of the "up vector".
If the "up vector" is at (0, 1, 0), then your camera will point normally, as the up vector is directly above the camera, so the top of the camera will be at the top, so your camera is oriented correctly. Move the up vector to (1, 1, 0), though, and in order to point the top of the camera to the "up vector," your camera will need to rotate be 45 degrees, rotating the entire image by 45 degrees.
This answer was not meant as an in-depth tutorial, but rather as a way to grasp the concept of the "up vector," to help better understand the other excellent answers to your question.
first 3 parameters are camera position
next 3 parameters are target position
the last 3 parameters represent the rolling of camera.
Very important thing use gluLookAt after "glMatrixMode(GL_MODELVIEW);"
Other useful hint always specify it 0,0,1 for last 3 parameters.
In general co-ordinates are written as x,y,z. z is up vector.
the 0,1,0 leads in confusion as x,z,y.
so best thing leave it to 0,0,1.
the last vector, also known as the cameras up-vector defines the orientation of the camera.
imagine a stick attached to the top of a "real" camera. the stick's direction is the up-vector.
by changing it from (0,1,0) you can do sideways rolling.

OpenGL: Lighting Inside of Cube

I am creating a scene where I use a box to represent a room and different models within that box. When I enable lighting, my models light up fine but the room itself (the inside of the box) does not light up, or rather it is darker than it should be. Is it because I am trying to light up the inside of a cube? I am sure the normals are correct. Please let me know what you think!
I suppose the normals aren't correct but how do I go about finding the correct normals for inside of the cube. Currently, I am only passing the center point of each face into the normalf function.
If you pass in the center points your normals will be facing the wrong way.
For example, if your cube is 2 units in size and centered on the origin, the center point of the face on the positive X axis will be (1, 0, 0), and that would also happen to be the correct normal for the outward facing side of that face.
However the face pointing inwards will have a normal that's the inverse of that, i.e. (-1, 0, 0).

How does zooming, panning and rotating work?

Using OpenGL I'm attempting to draw a primitive map of my campus.
Can anyone explain to me how panning, zooming and rotating is usually implemented?
For example, with panning and zooming, is that simply me adjusting my viewport? So I plot and draw all my lines that compose my map, and then as the user clicks and drags it adjusts my viewport?
For panning, does it shift the x/y values of my viewport and for zooming does it increase/decrease my viewport by some amount? What about for rotation?
For rotation, do I have to do affine transforms for each polyline that represents my campus map? Won't this be expensive to do on the fly on a decent sized map?
Or, is the viewport left the same and panning/zooming/rotation is done in some otherway?
For example, if you go to this link you'll see him describe panning and zooming exactly how I have above, by modifying the viewport.
Is this not correct?
They're achieved by applying a series of glTranslate, glRotate commands (that represent camera position and orientation) before drawing the scene. (technically, you're rotating the whole scene!)
There are utility functions like gluLookAt which sorta abstract some details about this.
To simplyify things, assume you have two vectors representing your camera: position and direction.
gluLookAt takes the position, destination, and up vector.
If you implement a vector class, destinaion = position + direction should give you a destination point.
Again to make things simple, you can assume the up vector to always be (0,1,0)
Then, before rendering anything in your scene, load the identity matrix and call gluLookAt
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt( source.x, source.y, source.z, destination.x, destination.y, destination.z, 0, 1, 0 );
Then start drawing your objects
You can let the user span by changing the position slightly to the right or to the left. Rotation is a bit more complicated as you have to rotate the direction vector. Assuming that what you're rotating is the camera, not some object in the scene.
One problem is, if you only have a direction vector "forward" how do you move it? where is the right and left?
My approach in this case is to just take the cross product of "direction" and (0,1,0).
Now you can move the camera to the left and to the right using something like:
position = position + right * amount; //amount < 0 moves to the left
You can move forward using the "direction vector", but IMO it's better to restrict movement to a horizontal plane, so get the forward vector the same way we got the right vector:
forward = cross( up, right )
To be honest, this is somewhat of a hackish approach.
The proper approach is to use a more "sophisticated" data structure to represent the "orientation" of the camera, not just the forward direction. However, since you're just starting out, it's good to take things one step at a time.
All of these "actions" can be achieved using model-view matrix transformation functions. You should read about glTranslatef (panning), glScalef (zoom), glRotatef (rotation). You also should need to read some basic tutorial about OpenGL, you might find this link useful.
Generally there are three steps that are applied whenever you reference any point in 3d space within opengl.
Given a Local point
Local -> World Transform
World -> Camera Transform
Camera -> Screen Transform (usually a projection. depends on if you're using perspective or orthogonal)
Each of these transforms is taking your 3d point, and multiplying by a matrix.
When you are rotating the camera, it is generally changing the world -> camera transform by multiplying the transform matrix by your rotation/pan/zoom affine transformation. Since all of your points are re-rendered each frame, the new matrix gets applied to your points, and it gives the appearance of a rotation.