Detection of head nod pose from pitch angle - c++

I am making an android application that will detect driver's fatigue from head nod pose.
What I did:
I have detected 68 facial landmarks from image using dlib library.
Used solvePnP to find out rotation matrix and from that matrix I have got roll, pitch and yaw angles.
Now, I have to detect head nod from pitch angle.
My Problem:
How to set a threshold value so that an angle below or above the threshold can be said as a head nod?
It results some negative angles. What does a negative angle mean in 3 dimensional axis?
My code:
void getEulerAngles(Mat &rotCamerMatrix,Vec3d &eulerAngles)
{
Mat cameraMatrix,rotMatrix,transVect,rotMatrixX,rotMatrixY,rotMatrixZ;
double* _r = rotCamerMatrix.ptr<double>();
double projMatrix[12] = {_r[0],_r[1],_r[2],0,
_r[3],_r[4],_r[5],0,
_r[6],_r[7],_r[8],0};
decomposeProjectionMatrix( Mat(3,4,CV_64FC1,projMatrix),
cameraMatrix,
rotMatrix,
transVect,
rotMatrixX,
rotMatrixY,
rotMatrixZ,
eulerAngles);
}
int renderToMat(std::vector<full_object_detection>& dets, Mat& dst)
{
Scalar color;
std::vector<cv::Point2d> image_points;
std::vector<cv::Point3d> model_points;
string disp;
int sz = 3,l;
color = Scalar(0,255,0);
double p1,p2,p3,leftear,rightear,ear=0,yawn=0.00,yaw=0.00,pitch=0.00,roll=0.00;
l=dets.size();
//I am calculating only for one face.. so assuming l=1
for(unsigned long idx = 0; idx < l; idx++)
{
image_points.push_back(
Point2d(dets[idx].part(30).x(),dets[idx].part(30).x() ) );
image_points.push_back(Point2d(dets[idx].part(8).x(),dets[idx].part(8).x() ) );
image_points.push_back(Point2d(dets[idx].part(36).x(),dets[idx].part(36).x() ) );
image_points.push_back(Point2d(dets[idx].part(45).x(),dets[idx].part(45).x() ) );
image_points.push_back(Point2d(dets[idx].part(48).x(),dets[idx].part(48).x() ) );
image_points.push_back(Point2d(dets[idx].part(54).x(),dets[idx].part(54).x() ) );
}
double focal_length = dst.cols;
Point2d center = cv::Point2d(dst.cols/2.00,dst.rows/2.00);
cv::Mat camera_matrix = (cv::Mat_<double>(3.00,3.00) << focal_length, 0, center.x, 0, focal_length, center.y, 0,
0, 1);
cv::Mat dist_coeffs = cv::Mat::zeros(4,1,cv::DataType<double>::type);
cv::Mat rotation_vector; //s Rotation in axis-angle form
cv::Mat translation_vector;
cv::Mat rotCamerMatrix1;
if(l!=0)
{
model_points.push_back(cv::Point3d(0.0f, 0.0f, 0.0f));
model_points.push_back(cv::Point3d(0.0f, -330.0f, -65.0f));
model_points.push_back(cv::Point3d(-225.0f, 170.0f, -135.0f));
model_points.push_back(cv::Point3d(225.0f, 170.0f, -135.0f));
model_points.push_back(cv::Point3d(-150.0f, -150.0f, -125.0f));
model_points.push_back(cv::Point3d(150.0f, -150.0f, -125.0f));
cv::solvePnP(model_points, image_points, camera_matrix, dist_coeffs,rotation_vector, translation_vector);
Rodrigues(rotation_vector,rotCamerMatrix1);
Vec3d eulerAngles;
getEulerAngles(rotCamerMatrix1,eulerAngles);
yaw = eulerAngles[1];
pitch = eulerAngles[0];
roll = eulerAngles[2];
/*My problem begins here. I don't know how to set a threshold value for pitch so that I can say a value below or
above the pitch is a head nod!*/
}
return 0;
}

First you have to understand how are the 3 angles used in a 3D context. Basically they rotate an object in the 3D space with respect to an origin (the origin can change depending on the context), but how are the 3 angles applied?.
This question can be express as: in which order do they rotate the object. If you apply yaw, then pitch and then roll it may give you a different orientation of the object that if you do it in the opposite order. Having said that, you must understand what are those values representing, to understand what to do with them.
Now, you ask what would be a good threshold, well it depends. On what? well, in the order they are applied. For instance if you apply pitch first with 45 degrees so it looks down and then you apply a roll 180 degrees, then it is looking up, so it is a little hard to define a threshold.
Since you have your model points, you can create a 3D rotation matrix and apply it to them with different pitch angles (the rest of the angles will be 0 so the order will not be important here) and visualize them and choose the one that you consider is nodding. This is a little bit subjective, so you should be the one doing it.
Now, to the second question. The answer again is, it depends. On what this time? you may ask. Simple, is your system left handed or right handed? in one this means that the rotation is apply clockwise and the other counter clockwise, the negative sign changes the direction. So, in a left handed system it will be clockwise and with the negative sign will be counterclockwise. The right handed system will be counterclockwise and the negative sign will make it clockwise.
As a suggestion, you can make a vector (0,0,-1) assuming your z axis looks towards the back. Then apply the rotation to it and proyect it to a 2D plane parallel to the z axis, here take the tip of your vector and see what is the angle here. This way you are sure what are you getting.

Related

Determining rotation matrix about an axis for a given angle

I've been trying to understand matrices and vectors and implemented Rodrigue's rotation formula to determine the rotation matrix about an axis for a given angle. I've got function Transform which calls out to function Rotate.
// initial values of eye ={0,0,7}
//initial values of up={0,1,0}
void Transform(float degrees, vec3& eye, vec3& up) {
vec3 axis = glm::cross(glm::normalize(eye), glm::normalize(up));
glm::normalize(axis);
mat3 resultRotate = rotate(degrees, axis);
eye = eye * resultRotate;
glm::normalize(eye);
up = up * resultRotate;`enter code here`
glm::normalize(up);
}
mat3 rotate(const float degrees, const vec3& axis) {
//Implement Rodrigue's axis-angle rotation formula
float radDegree = glm::radians(degrees);
float cosValue = cosf(radDegree);
float minusCos = 1 - cosValue;
float sinValue = sinf(radDegree);
float cartesianX = axis.x;
float cartesianY = axis.y;
float cartesianZ = axis.z;
mat3 myFinalResult = mat3(cosValue +(cartesianX*cartesianX*minusCos), ((cartesianX*cartesianY*minusCos)-(cartesianZ*sinValue)),((cartesianX*cartesianZ*minusCos)+(cartesianY*sinValue)),
((cartesianX*cartesianY*minusCos)+(cartesianZ*sinValue)), (cosValue+(cartesianY*cartesianY*minusCos)), ((cartesianY*cartesianZ*minusCos) - (cartesianX*sinValue)),
((cartesianX*cartesianZ*minusCos)-(cartesianY*sinValue)), ((cartesianY*cartesianZ*minusCos) + (cartesianX*sinValue)), ((cartesianZ*cartesianZ*minusCos) + cosValue));
return myFinalResult;
}
All the values, resultant rotation matrix and the changed vectors are as expected for +angle of rotation, but wrong for negative angles and from then on, has cascading effect until the all the vectors are re-initialised. Can someone please help me figure out the problem? I cannot use any inbuilt functions like glm::rotate.
I do not use Rodrigues_rotation_formula because it needs to compute a system of equation on runtime and gets very complicated in higher dimensions.
Instead I am using axis aligned incremental rotations along with 4x4 homogenous transform matrices which are really easily portable to higher dimensions like 4D rotors.
Now there are local and global rotations. Local rotations will rotate around your matrix coordiante system local axises and global ones will rotate around world (or main coordinate system)
What you want is create a transform matrix around some point,axis and angle. To do that just:
create a transform matrix A
that has one axis aligned to axis of rotation and origin is center of rotation. To construct such matrix you need 2 perpendicular vectors which are easily obtainable from cross product.
rotate A around its local axis aligned to axis of rotation by angle
by simple multiplication of A by axis aligned incremental rotation R so
A*R;
revert the original transform of A before rotation
by simply multiplying inverse of A to the result so
A*R*Inverse(A);
apply this on matrix M you want to rotate
also by simply multiplying this to M so:
M*=A*R*Inverse(A);
And that is it... Here 3D OBB approximation you can find function :
template <class T> _mat4<T> rotate(_mat4<T> &m,T ang,_vec3<T> p0,_vec3<T> dp)
{
int i;
T c=cos(ang),s=sin(ang);
_vec3<T> x,y,z;
_mat4<T> a,_a,r=mat4(
1, 0, 0, 0,
0, c, s, 0,
0,-s, c, 0,
0, 0, 0, 1);
// basis vectors
x=normalize(dp); // axis of rotation
y=_vec3<T>(1,0,0); // any vector non parallel to x
if (fabs(dot(x,y))>0.75) y=_vec3<T>(0,1,0);
z=cross(x,y); // z is perpendicular to x,y
y=cross(z,x); // y is perpendicular to x,z
y=normalize(y);
z=normalize(z);
// feed the matrix
for (i=0;i<3;i++)
{
a[0][i]= x[i];
a[1][i]= y[i];
a[2][i]= z[i];
a[3][i]=p0[i];
a[i][3]=0;
} a[3][3]=1;
_a=inverse(a);
r=m*a*r*_a;
return r;
};
That does exactly that. Where m is original matrix to transform (and returns the rotated one), ang is signed angle in [rad], p0 is center of rotation and dp is axis of rotation direction vector.
This approach does not have any singularities nor problems to rotate by negative angles ...
If you want to use this with glm or any other GLSL like math just change the templates to what you use so float,vec3,mat4 instead of T,_vec3<T>,mat4<T>.

Arcball camera locked when parallel to up vector

I'm currently in the process of finishing the implementation for a camera that functions in the same way as the camera in Maya. The part I'm stuck in the tumble functionality.
The problem is the following: the tumble feature works fine so long as the position of the camera is not parallel with the up vector (currently defined to be (0, 1, 0)). As soon as the camera becomes parallel with this vector (so it is looking straight up or down), the camera locks in place and will only rotate around the up vector instead of continuing to roll.
This question has already been asked here, unfortunately there is no actual solution to the problem. For reference, I also tried updating the up vector as I rotated the camera, but the resulting behaviour is not what I require (the view rolls as a result of the new orientation).
Here's the code for my camera:
using namespace glm;
// point is the position of the cursor in screen coordinates from GLFW
float deltaX = point.x - mImpl->lastPos.x;
float deltaY = point.y - mImpl->lastPos.y;
// Transform from screen coordinates into camera coordinates
Vector4 tumbleVector = Vector4(-deltaX, deltaY, 0, 0);
Matrix4 cameraMatrix = lookAt(mImpl->eye, mImpl->centre, mImpl->up);
Vector4 transformedTumble = inverse(cameraMatrix) * tumbleVector;
// Now compute the two vectors to determine the angle and axis of rotation.
Vector p1 = normalize(mImpl->eye - mImpl->centre);
Vector p2 = normalize((mImpl->eye + Vector(transformedTumble)) - mImpl->centre);
// Get the angle and axis
float theta = 0.1f * acos(dot(p1, p2));
Vector axis = cross(p1, p2);
// Rotate the eye.
mImpl->eye = Vector(rotate(Matrix4(1.0f), theta, axis) * Vector4(mImpl->eye, 0));
The vector library I'm using is GLM. Here's a quick reference on the custom types used here:
typedef glm::vec3 Vector;
typedef glm::vec4 Vector4;
typedef glm::mat4 Matrix4;
typedef glm::vec2 Point2;
mImpl is a PIMPL that contains the following members:
Vector eye, centre, up;
Point2 lastPoint;
Here is what I think. It has something to do with the gimbal lock, that occurs with euler angles (and thus spherical coordinates).
If you exceed your minimal(0, -zoom,0) or maxima(0, zoom,0) you have to toggle a boolean. This boolean will tell you if you must treat deltaY positive or not.
It could also just be caused by a singularity, therefore just limit your polar angle values between 89.99° and -89.99°.
Your problem could be solved like this.
So if your camera is exactly above (0, zoom,0) or beneath (0, -zoom,0) of your object, than the camera only rolls.
(I am also assuming your object is at (0,0,0) and the up-vector is set to (0,1,0).)
There might be some mathematical trick to resolve this, I would do it with linear algebra though.
You need to introduce a new right-vector. If you make a cross product, you will get the camera-vector. Camera-vector = up-vector x camera-vector. Imagine these vectors start at (0,0,0), then easily, to get your camera position just do this subtraction (0,0,0)-(camera-vector).
So if you get some deltaX, you rotate towards the right-vector(around the up-vector) and update it.
Any influence of deltaX should not change your up-vector.
If you get some deltaY you rotate towards the up-vector(around the right-vector) and update it. (This has no influence on the right-vector).
https://en.wikipedia.org/wiki/Rotation_matrix at Rotation matrix from axis and angle you can find a important formula.
You say u is your vector you want to rotate around and theta is the amount you want to pivot. The size of theta is proportional to deltaX/Y.
For example: We got an input from deltaX, so we rotate around the up-vector.
up-vector:= (0,1,0)
right-vector:= (0,0,-1)
cam-vector:= (0,1,0)
theta:=-1*30° // -1 due to the positive mathematical direction of rotation
R={[cos(-30°),0,-sin(-30°)],[0,1,0],[sin(-30°),0,cos(-30°)]}
new-cam-vector=R*cam-vector // normal matrix multiplication
One thing is left to be done: Update the right-vector.
right-vector=camera-vector x up-vector .

Quaternion-Based-Camera unwanted roll

I'm trying to implement a quaternion-based camera, but when moving around the X and Y axis, the camera produces an unwanted roll on the Z axis. I want to be able to look around freely on all axis.
I've read other topics about this problem (for example: http://www.flipcode.com/forums/thread/6525 ), but I'm not getting what is meant by "Fix this by continuously rebuilding the rotation matrix by rotating around the WORLD axis, i.e around <1,0,0>, <0,1,0>, <0,0,1>, not your local coordinates, whatever they might be."
//Camera.hpp
glm::quat rotation;
//Camera.cpp
void Camera::rotate(glm::vec3 vec)
{
glm::quat paramQuat = glm::quat(vec);
rotation = paramQuat * rotation;
}
I call the rotate function like this:
cam->rotate(glm::vec3(0, 0.5, 0));
This must have to do with local/world coordinates, right? I'm just not getting it, since I thought quaternions are always based on each other (thus a quaternion can't be in "world" or "local" space?).
Also, should i use more than one quaternion for a camera?
As far as I understand it, and from looking at the code you provided, what they mean is that you shouldn't store and apply the rotation incrementally by applying rotate on the rotation quat all the time, but instead keeping track of two quaternions for each axis (X and Y in world space) and calculating the rotation vector as the product of those.
[edit: some added (pseudo)code]
// Camera.cpp
void Camera::SetRotation(glm::quat value)
{
rotation = value;
}
// controller.cpp --> probably a place where you'd want to translate user input and store your rotational state
xAngle += deltaX;
yAngle += deltaY;
glm::quat rotationX = QuatAxisAngle(X_AXIS, xAngle);
glm::quat rotationY = QuatAxisAngle(Y_AXIS, yAngle);
camera.SetRotation(rotationX * rotationY);

How to properly move the camera in the direction it's facing

I'm trying to figure out how to make the camera in directx move based on the direction it's facing.
Right now the way I move the camera is by passing the camera's current position and rotation to a class called PositionClass. PositionClass takes keyboard input from another class called InputClass and then updates the position and rotation values for the camera, which is then passed back to the camera class.
I've written some code that seems to work great for me, using the cameras pitch and yaw I'm able to get it to go in the direction I've pointed the camera.
However, when the camera is looking straight up (pitch=90) or straight down (pitch=-90), it still changes the cameras X and Z position (depending on the yaw).
The expected behavior is while looking straight up or down it will only move along the Y axis, not along the X or Z axis.
Here's the code that calculates the new camera position
void PositionClass::MoveForward(bool keydown)
{
float radiansY, radiansX;
// Update the forward speed movement based on the frame time
// and whether the user is holding the key down or not.
if(keydown)
{
m_forwardSpeed += m_frameTime * m_acceleration;
if(m_forwardSpeed > (m_frameTime * m_maxSpeed))
{
m_forwardSpeed = m_frameTime * m_maxSpeed;
}
}
else
{
m_forwardSpeed -= m_frameTime * m_friction;
if(m_forwardSpeed < 0.0f)
{
m_forwardSpeed = 0.0f;
}
}
// ToRadians() just multiplies degrees by 0.0174532925f
radiansY = ToRadians(m_rotationY); //yaw
radiansX = ToRadians(m_rotationX); //pitch
// Update the position.
m_positionX += sinf(radiansY) * m_forwardSpeed;
m_positionY += -sinf(radiansX) * m_forwardSpeed;
m_positionZ += cosf(radiansY) * m_forwardSpeed;
return;
}
The significant portion is where the position is updated at the end.
So far I've only been able to deduce that I have horrible math skills.
So, can anyone help me with this dilemma? I've created a fiddle to help test out the math.
Edit: The fiddle uses the same math I used in my MoveForward function, if you set pitch to 90 you can see that the Z axis is still being modified
Thanks to Chaosed0's answer, I was able to figure out the correct formula to calculate movement in a specific direction.
The fixed code below is basically the same as above but now simplified and expanded to make it easier to understand.
First we determine the amount by which the camera will move, in my case this was m_forwardSpeed, but here I will define it as offset.
float offset = 1.0f;
Next you will need to get the camera's X and Y rotation values (in degrees!)
float pitch = camera_rotationX;
float yaw = camera_rotationY;
Then we convert those values into radians
float pitchRadian = pitch * (PI / 180); // X rotation
float yawRadian = yaw * (PI / 180); // Y rotation
Now here is where we determine the new position:
float newPosX = offset * sinf( yawRadian ) * cosf( pitchRadian );
float newPosY = offset * -sinf( pitchRadian );
float newPosZ = offset * cosf( yawRadian ) * cosf( pitchRadian );
Notice that we only multiply the X and Z positions by the cosine of pitchRadian, this is to negate the direction and offset of your camera's yaw when it's looking straight up (90) or straight down (-90).
And finally, you need to tell your camera the new position, which I won't cover because it largely depends on how you've implemented your camera. Apparently doing it this way is out of the norm, and possibly inefficient. However, as Chaosed0 said, it's what makes the most sense to me!
To be honest, I'm not entirely sure I understand your code, so let me try to provide a different perspective.
The way I like to think about this problem is in spherical coordinates, basically just polar in 3D. Spherical coordinates are defined by three numbers: a radius and two angles. One of the angles is yaw, and the other should be pitch, assuming you have no roll (I believe there's a way to get phi if you have roll, but I can't think of how currently). In conventional mathematics notation, theta is your yaw and phi is your pitch, with radius being your move speed, as shown below.
Note that phi and theta are defined differently, depending on where you look.
Basically, the problem is to obtain a point m_forwardSpeed away from your camera, with the right pitch and yaw. To do this, we set the "origin" to your camera position, obtain a spherical coordinate, convert it to cartesian, and then add it to your camera position:
float radius = m_forwardSpeed;
float theta = m_rotationY;
float phi = m_rotationX
//These equations are from the wikipedia page, linked above
float xMove = radius*sinf(phi)*cosf(theta);
float yMove = radius*sinf(phi)*sinf(theta);
float zMove = radius*cosf(phi);
m_positionX += xMove;
m_positionY += yMove;
m_positionZ += zMove;
Of course, you can condense a lot of this code, but I expanded it for clarity.
You can think about this like drawing a sphere around your camera. Each of the points on the sphere is a potential position in the next timestep, depending on the camera's rotation.
This is probably not the most efficient way to do it, but in my opinion it's certainly the easiest way to think about it. It actually looks like this is nearly exactly what you're trying to do in your code, but the operations on the angles are just a little bit off.

"Looking At" an object with a Quaternion

So I am currently trying to create a function that will take two 3D points A and B, and provide me with the quaternion representing the rotation required of point A to be "looking at" point B (such that point A's local Z axis passes through point B, if you will).
I originally found this post, the top answer of which seemed to provide me with a good starting point. I went on to implement the following code; instead of assuming a default (0, 0, -1) orientation, as the original answer suggests, I try to extract a unit vector representing the actual orientation of the camera.
void Camera::LookAt(sf::Vector3<float> Target)
{
///Derived from pseudocode found here:
///https://stackoverflow.com/questions/13014973/quaternion-rotate-to
//Get the normalized vector from the camera position to Target
sf::Vector3<float> VectorTo(Target.x - m_Position.x,
Target.y - m_Position.y,
Target.z - m_Position.z);
//Get the length of VectorTo
float VectorLength = sqrt(VectorTo.x*VectorTo.x +
VectorTo.y*VectorTo.y +
VectorTo.z*VectorTo.z);
//Normalize VectorTo
VectorTo.x /= VectorLength;
VectorTo.y /= VectorLength;
VectorTo.z /= VectorLength;
//Straight-ahead vector
sf::Vector3<float> LocalVector = m_Orientation.MultVect(sf::Vector3<float>(0, 0, -1));
//Get the cross product as the axis of rotation
sf::Vector3<float> Axis(VectorTo.y*LocalVector.z - VectorTo.z*LocalVector.y,
VectorTo.z*LocalVector.x - VectorTo.x*LocalVector.z,
VectorTo.x*LocalVector.y - VectorTo.y*LocalVector.x);
//Get the dot product to find the angle
float Angle = acos(VectorTo.x*LocalVector.x +
VectorTo.y*LocalVector.y +
VectorTo.z*LocalVector.z);
//Determine whether or not the angle is positive
//Get the cross product of the axis and the local vector
sf::Vector3<float> ThirdVect(Axis.y*LocalVector.z - Axis.z*LocalVector.y,
Axis.z*LocalVector.x - Axis.x*LocalVector.z,
Axis.x*LocalVector.y - Axis.y*LocalVector.x);
//If the dot product of that and the local vector is negative, so is the angle
if (ThirdVect.x*VectorTo.x + ThirdVect.y*VectorTo.y + ThirdVect.z*VectorTo.z < 0)
{
Angle = -Angle;
}
//Finally, create a quaternion
Quaternion AxisAngle;
AxisAngle.FromAxisAngle(Angle, Axis.x, Axis.y, Axis.z);
//And multiply it into the current orientation
m_Orientation = AxisAngle * m_Orientation;
}
This almost works. What happens is that the camera seems to rotate half the distance towards the Target point. If I attempt the rotation again, it performs half the remaining rotation, ad infinitum, such that if I hold down the "Look-At-Button", the camera's orientation gets closer and closer to looking directly at the target, but is also constantly slowing down in its rotation, such that it never quite gets there.
Note that I don't want to resort to gluLookAt(), as I will also eventually need this code to point objects other than the camera at one another, and my objects already use quaternions for their orientations. For example, I might want to create an eyeball that tracks the position of something moving around in front of it, or a projectile that updates its orientation to seek out its target.
Normalize Axis vector before passing it to FromAxisAngle.
Why are you using a quaternion? You're just making things more complex and requiring more computation in this instance. To set up a matrix:-
calculate vector from observer to observed (which you're doing already)
normalise it (again, doing it already) = at
cross product this with the observer's up direction = right
normalise right
cross product at and right to get up
and you're done. The right, up and at vectors are the first, second and third row (or column, depending on how you set things up) of your matrix. The final row/column is the objects position.
But it looks like you want to transform an existing matrix to this new matrix over several frames. SLERPs do this to matricies as well as quaternions (which isn't surprising when you look into the maths). For the transformation, store the initial and target matricies and then SLERP between them, changing the amount to SLERP by each frame (e.g. 0, 0.25, 0.5, 0.75, 1.0 - although a non-linear progression would look nicer).
Don't forget that you're converting a quaternion back into a matrix in order to pass it to the rendering pipeline (unless there's some new features in the shaders to handle quaternions natively). So any efficencies due to quaternion use has to take into account the conversion process as well.