I am trying to write each frame from a camera into a video file. That part works fine. However, I also want the video to include the shape_predictor output on each frame, so that when it is played back the landmarks appear on the image. So far I have this... Any ideas? Thank you
cap >> frame;
cv::VideoWriter oVideoWriter;
// . . .
cv_image<bgr_pixel> cimg(frame); //Mat to something dlib can deal with
frontal_face_detector detector = get_frontal_face_detector();
std::vector<rectangle> faces = detector(cimg);
pose_model(cimg, faces[0]);
oVideoWriter.write(dlib::toMat(cimg)); //Turn it into an Opencv Mat
The shape predictor is not the face detector. You have to call the face detector first, then the shape predictor.
See this example program: http://dlib.net/face_landmark_detection_ex.cpp.html
You initialized the face detector properly; next you have to initialize the shape predictor. Something like this:
shape_predictor sp;
deserialize("shape_predictor_68_face_landmarks.dat") >> sp;
The model can be found here: http://sourceforge.net/projects/dclib/files/dlib/v18.10/shape_predictor_68_face_landmarks.dat.bz2
For the rest, you can just follow the example program I linked above. Here's the portion where the shape predictor is run. You have to pass it the output (bounding box) returned by the detector for it to work. The code below iterates through all the boxes returned by the detector.
// Now tell the face detector to give us a list of bounding boxes
// around all the faces in the image.
std::vector<rectangle> dets = detector(img);
cout << "Number of faces detected: " << dets.size() << endl;
// Now we will go ask the shape_predictor to tell us the pose of
// each face we detected.
std::vector<full_object_detection> shapes;
for (unsigned long j = 0; j < dets.size(); ++j)
{
full_object_detection shape = sp(img, dets[j]);
cout << "number of parts: "<< shape.num_parts() << endl;
cout << "pixel position of first part: " << shape.part(0) << endl;
cout << "pixel position of second part: " << shape.part(1) << endl;
// You get the idea, you can get all the face part locations if
// you want them. Here we just store them in shapes so we can
// put them on the screen.
shapes.push_back(shape);
}
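To get the landmarks into the written video, one option is to draw them directly onto the cv::Mat before handing it to the VideoWriter, since cv_image only wraps the Mat's memory. Below is a rough sketch, assuming the frame, oVideoWriter and detector from the question and the deserialized shape_predictor sp from above; drawing with cv::circle is just one illustrative way to render the parts.

cv_image<bgr_pixel> cimg(frame);                     // wraps `frame`, no copy is made
std::vector<rectangle> faces = detector(cimg);       // face detection first
for (unsigned long j = 0; j < faces.size(); ++j)
{
    // Shape predictor second, fed with the detector's bounding box.
    full_object_detection shape = sp(cimg, faces[j]);
    for (unsigned long k = 0; k < shape.num_parts(); ++k)
    {
        // Draw each landmark straight onto the OpenCV frame.
        cv::circle(frame,
                   cv::Point(shape.part(k).x(), shape.part(k).y()),
                   2, cv::Scalar(0, 255, 0), -1);
    }
}
oVideoWriter.write(frame);                           // the landmarks are now baked into the frame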
I have tried the following code to find the corners of the square boxes in the attached chessboard picture, but unfortunately it could not find them. Can you please tell me what I can do to detect the chessboard corners in this case? Many thanks. :)
#include <opencv2/opencv.hpp>
#include <iostream>

using namespace cv;
using namespace std;

int main() {
    cv::Mat imgOriginal; // input image
    Size boardSizeTopChessBoard;
    boardSizeTopChessBoard.width = 144;
    boardSizeTopChessBoard.height = 3;
    vector<Point2f> pointBufTopChessBoard;
    bool topChessBoardCornersFound = false;
    imgOriginal = cv::imread("topChessBoard.jpg");
    imshow("Original Image", imgOriginal);
    topChessBoardCornersFound = findChessboardCornersSB(imgOriginal, boardSizeTopChessBoard, pointBufTopChessBoard, 0);
    if (topChessBoardCornersFound)
    {
        cout << "Corners found in top chess board" << endl;
    }
    else
    {
        cout << "Corners not found in top chess board" << endl;
    }
    waitKey(0);
    return(0);
}
There are a number of reasons why it doesn't work.
First of all, the image resolution appears to be too small for this number of corners, which makes them very difficult to detect.
Secondly, the contrast at the edges of the image is lower, which makes detection harder; the darker the image, the harder it is.
And finally, try to capture a sharper image. This one is a little blurry.
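If you want to attack those issues in code, a minimal sketch of the kind of preprocessing that can help is below. The 2x scale factor is only a guess, and the CALIB_CB_EXHAUSTIVE / CALIB_CB_ACCURACY flags make findChessboardCornersSB slower but more robust on difficult boards.

// Preprocess to compensate for low resolution, low contrast and blur.
cv::Mat gray, enlarged, equalized;
cv::cvtColor(imgOriginal, gray, cv::COLOR_BGR2GRAY);
cv::resize(gray, enlarged, cv::Size(), 2.0, 2.0, cv::INTER_CUBIC); // upscale (factor is illustrative)
cv::equalizeHist(enlarged, equalized);                             // stretch the contrast

std::vector<cv::Point2f> corners;
bool found = cv::findChessboardCornersSB(
    equalized, boardSizeTopChessBoard, corners,
    cv::CALIB_CB_EXHAUSTIVE | cv::CALIB_CB_ACCURACY);

// If detection is run on the upscaled image, remember to divide the
// corner coordinates by the scale factor before using them.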
I need help with OpenCV in C++.
Basically I have a camera with some radial distortion, and I am able to undistort it using the provided examples/samples in OpenCV.
But currently I have to recalibrate the camera each time the program is run. The example generates an XML file for a reason, right? To make use of those values...
My problem is that I'm not sure which values from the XML file to use, and how to use them, to undistort the camera without going through the entire calibration again.
I tried to find examples of this online, but for some reason nothing related to my problem came up.
Supposedly we should be able to take the values from the output XML file and use them directly in the program so that we don't have to recalibrate the camera each time.
But currently that's exactly what my program is doing :/
I really hope someone can help me with this.
Thanks a lot :)
First, you have to create the camera matrix from the camera_matrix values in the XML file.
Mat cameraMatrix = new Mat(new Size(3,3), CvType.CV_64FC1);
cameraMatrix.put(0,0,3275.907);
cameraMatrix.put(0,1,0);
cameraMatrix.put(0,2,2069.153);
cameraMatrix.put(1,0,0);
cameraMatrix.put(1,1,3270.752);
cameraMatrix.put(1,2,1139.271);
cameraMatrix.put(2,0,0);
cameraMatrix.put(2,1,0);
cameraMatrix.put(2,2,1);
Second, create the distortion matrix from the distortion_coefficients in the XML file.
Mat distortionMatrix = new Mat(new Size(4,1), CvType.CV_64FC1);
distortionMatrix.put(0,0,-0.006934);
distortionMatrix.put(0,1,-0.047680);
distortionMatrix.put(0,2,0.002173);
distortionMatrix.put(0,3,0.002580);
Finally, use the OpenCV method:
Mat map1 = new Mat();
Mat map2 = new Mat();
Mat temp = new Mat();
Imgproc.initUndistortRectifyMap(cameraMatrix, distortionMatrix, temp, cameraMatrix, src.size(), CvType.CV_32FC1, map1, map2);
This gives you the two maps, map1 and map2, which are used for the undistortion.
Once you have these two maps, you don't have to re-calibrate every time.
Just use remap and the undistortion is done:
Imgproc.remap(mat, undistortPicture, map1, map2, Imgproc.INTER_LINEAR);
Refer to this link.
Alright, so I was able to extract the four things I think were necessary from the output XML file. Essentially I made a new class that I named CalibSet and just extracted the data from the XML file via the "tfs[""] >> xxx;" lines at the bottom of the code.
class CalibSet
{
public:
Size Boardsize; // The size of the board -> Number of items by width and height
Size image; // image size
String calibtime;
Mat CamMat; // camera matrix
Mat DistCoeff; // distortion coefficient
Mat PViewReprojErr; // per view reprojection error
float SqSize; // The size of a square in your defined unit (point, millimeter,etc).
float avg_reproj_error;
int NrFrames; // The number of frames to use from the input for calibration
int Flags;
bool imagePoints; // Write detected feature points
bool ExtrinsicParams; // Write extrinsic parameters
bool GridPoints; // Write refined 3D target grid points
bool fisheyemodel; // use fisheye camera model for calibration
void write(FileStorage& fs) const //Write serialization for this class
{
fs << "{"
<<"nr_of_frames" << NrFrames
<<"image_width" << image.width
<<"image_height" << image.height
<<"board_width" << Boardsize.width
<<"board_height" << Boardsize.height
<<"square_size" << SqSize
<<"flags" << Flags
<<"fisheye_model" << fisheyemodel
<<"camera_matrix" << CamMat
<<"distortion_coefficients" << DistCoeff
<<"avg_reprojection_error" << avg_reproj_error
<<"per_view_reprojection_errors" << PViewReprojErr
<<"extrinsic_parameters" << ExtrinsicParams
<< "}";
}
void read(const FileNode& node) //Read serialization for this class
{
node["calibration_time"] >> calibtime;
node["nr_of_frames"] >> NrFrames;
node["image_width"] >> image.width;
node["image_height"] >> image.height;
node["board_width"] >> Boardsize.width;
node["board_height"] >> Boardsize.height;
node["square_size"] >> SqSize;
node["flags"] >> Flags;
node["fisheye_model"] >> fisheyemodel;
node["camera_matrix"] >> CamMat;
node["distortion_coefficients"] >> DistCoeff;
node["avg_reprojection_error"] >> avg_reproj_error;
node["per_view_reprojection_errors"] >> PViewReprojErr;
node["extrinsic_parameters"] >> ExtrinsicParams;
}
};
CalibSet CS;
FileStorage tfs(inputCalibFile, FileStorage::READ); // Read the settings
if (!tfs.isOpened())
{
cout << "Could not open the calibration file: \"" << inputCalibFile << "\"" << endl;
return -1;
}
tfs["camera_matrix"] >> CS.CamMat;
tfs["distortion_coefficients"] >> CS.DistCoeff;
tfs["image_width"] >> CS.image.width;
tfs["image_height"] >> CS.image.height;
tfs.release(); // close Settings file
After this I use the function "undistort" to correct the live camera frames, which I store in frame, and put the corrected image in rframe:
flip(frame, frame, -1); // flip image vertically so that it's not upside down
cv::undistort(frame, rframe, CS.CamMat, CS.DistCoeff);
flip(rframe, rframe, +1); // flip image horizontally
It's important to make sure that the orientation of the photos taken for calibration is exactly the same as the one used later on (including vertical or horizontal mirroring), otherwise the image will still be distorted after using "undistort".
After this I can get an undistorted image as intended, BUT the frame rate is extremely low (around 10-20 FPS). I'd appreciate any help in optimizing the process, if possible, to allow a higher frame rate from the live camera feed.
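On the frame-rate question: cv::undistort rebuilds the undistortion maps internally on every call, so for a live feed it is usually much faster to compute the maps once with initUndistortRectifyMap and then only call remap per frame, as the Java answer above does. A rough sketch, assuming the CalibSet CS and the frame/rframe variables from the code above:

// One-time setup, after reading the calibration file.
cv::Size imageSize(CS.image.width, CS.image.height);
cv::Mat map1, map2;
cv::initUndistortRectifyMap(CS.CamMat, CS.DistCoeff,
                            cv::Mat(),        // no rectification
                            CS.CamMat,        // keep the same camera matrix
                            imageSize, CV_32FC1, map1, map2);

// Per frame: remap is just a table lookup, so it is much cheaper than undistort.
flip(frame, frame, -1);
cv::remap(frame, rframe, map1, map2, cv::INTER_LINEAR);
flip(rframe, rframe, +1);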
I've been over and over various cv::KeyPoint resources and followed many tips to no avail so far.
cv::KeyPoint Class Reference
// --- BLOB DETECTION --- //
// Storage for blobs
vector<KeyPoint> keypoints;
// Set up detector with params
Ptr<SimpleBlobDetector> detector = SimpleBlobDetector::create(params);
// Detect blobs
detector->detect( camThresh, keypoints);
// Draw keypoints
drawKeypoints( camRaw, keypoints, camBlobs, Scalar(0,0,255), DrawMatchesFlags::DRAW_RICH_KEYPOINTS );
I'm very glad I got this to work up to this point, but I really need to get the X/Y and size values of those keypoints. The closest suggestion I'd almost gotten to work (see below) throws a segmentation fault.
float x = keypoints[i].pt.x;
and
Point2f p = keypoints[i].pt;
lead me to the same outcome. Someone in the suggestions linked above mentioned the same problem. Does anyone have any tips? Thanks!
I found out why the program was crashing.
I was not checking whether there were any active keypoints at all.
So, if the program starts without anything in the camera's view, it crashes.
It's completely valid to use, for example:
if(keypoints.size()>0) {
if((char)keyboard == 'c') {
float x0 = keypoints[0].pt.x;
float y0 = keypoints[0].pt.y;
cout << "Point0 Xpos = " << x0 << "\n";
cout << "Point0 Ypos = " << y0 << "\n";
}
}
The unguarded access only works as long as there are some objects being tracked; once nothing is in the camera's view, the application crashes, which is exactly what the size check above prevents.
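With that guard in place, getting the X/Y and size of every keypoint is just a loop over the vector. A minimal sketch, using the keypoints vector from the blob-detection code above:

for (size_t i = 0; i < keypoints.size(); ++i)
{
    float x    = keypoints[i].pt.x;   // horizontal position of the blob centre
    float y    = keypoints[i].pt.y;   // vertical position of the blob centre
    float size = keypoints[i].size;   // blob diameter reported by the detector
    cout << "Keypoint " << i << ": x=" << x << " y=" << y << " size=" << size << "\n";
}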
This is my code for face recognition in videos. It runs without any error, but its prediction
is wrong most of the time. I am using the LBPH face recognizer to recognize the faces.
I tried using Haar cascades but the cascade did not load, so I switched to the LBP cascade. Please help me improve the prediction.
I am using grayscale cropped images of size 500 x 500 pixels for training the recognizer.
#include <opencv2/core/core.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/objdetect/objdetect.hpp>
#include <iostream>
#include <fstream>
#include <sstream>
using namespace cv;
using namespace std;
static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
CV_Error(CV_StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
stringstream liness(line);
getline(liness, path, separator);
getline(liness, classlabel);
if(!path.empty() && !classlabel.empty()) {
images.push_back(imread(path, 0));
labels.push_back(atoi(classlabel.c_str()));
}
}
}
string g_listname_t[]=
{
"ajay","Aasai","famiz"
};
int main(int argc, const char *argv[]) {
// Check for valid command line arguments, print usage
// if no arguments were given.
//if (argc != 4) {
// cout << "usage: " << argv[0] << " </path/to/haar_cascade> </path/to/csv.ext> </path/to/device id>"<<endl;
// cout << "\t </path/to/haar_cascade> -- Path to the Haar Cascade for face detection." << endl;
// cout << "\t </path/to/csv.ext> -- Path to the CSV file with the face database." << endl;
// cout << "\t <device id> -- The webcam device id to grab frames from." << endl;
// exit(1);
//}
//// Get the path to your CSV:
//string fn_haar = string(argv[1]);
//string fn_csv = string(argv[2]);
//int deviceId = atoi(argv[3]);
//// Get the path to your CSV:
// please set the correct path based on your folder
string fn_haar = "lbpcascade_frontalface.xml";
string fn_csv = "reader.ext ";
int deviceId = 0; // here is my webcam Id.
// These vectors hold the images and corresponding labels:
vector<Mat> images;
vector<int> labels;
// Read in the data (fails if no valid input filename is given, but you'll get an error message):
try {
read_csv(fn_csv, images, labels);
} catch (cv::Exception& e) {
cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
// nothing more we can do
exit(1);
}
// Get the height from the first image. We'll need this
// later in code to reshape the images to their original
// size AND we need to reshape incoming faces to this size:
int im_width = images[0].cols;
int im_height = images[0].rows;
// Create a FaceRecognizer and train it on the given images:
Ptr<FaceRecognizer> model = createLBPHFaceRecognizer();
model->train(images, labels);
cout<<("Facerecognizer created");
// That's it for learning the Face Recognition model. You now
// need to create the classifier for the task of Face Detection.
// We are going to use the haar cascade you have specified in the
// command line arguments:
CascadeClassifier lbp_cascade;
if ( ! lbp_cascade.load(fn_haar) )
{
cout<<("\nlbp cascade not loaded");
}
else
{
cout<<("\nlbp cascade loaded");
}
// Get a handle to the Video device:
VideoCapture cap(deviceId);
cout<<("\nvideo device is opened");
// Check if we can use this device at all:
if(!cap.isOpened()) {
cerr << "Capture Device ID " << deviceId << "cannot be opened." << endl;
return -1;
}
// Holds the current frame from the Video device:
Mat frame;
for(;;) {
cap >> frame;
// Clone the current frame:
Mat original = frame.clone();
cout<<("\nframe is cloned");
// Convert the current frame to grayscale:
Mat gray;
//gray = imread("G:\Picture\003.jpg",0);
cvtColor(original, gray, CV_BGR2GRAY);
imshow("gray image", gray);
// And display it:
char key1 = (char) waitKey(50);
// Find the faces in the frame:
cout<<("\ncolor converted");
vector< Rect_<int> > faces;
cout<<("\ndetecting faces");
lbp_cascade.detectMultiScale(gray, faces);
// At this point you have the position of the faces in
// faces. Now we'll get the faces, make a prediction and
// annotate it in the video. Cool or what?
cout<<("\nfaces detected\n");
cout<<faces.size();
for(int i = 0; i < faces.size(); i++)
{
// Process face by face:
cout<<("\nprocessing faces");
Rect face_i = faces[i];
// Crop the face from the image. So simple with OpenCV C++:
Mat face = gray(face_i);
// Resizing the face is necessary for Eigenfaces and Fisherfaces. You can easily
// verify this, by reading through the face recognition tutorial coming with OpenCV.
// Resizing IS NOT NEEDED for Local Binary Patterns Histograms, so preparing the
// input data really depends on the algorithm used.
//
// I strongly encourage you to play around with the algorithms. See which work best
// in your scenario, LBPH should always be a contender for robust face recognition.
//
// Since I am showing the Fisherfaces algorithm here, I also show how to resize the
// face you have just found:
/*Mat face_resized;
cv::resize(face, face_resized, Size(im_width, im_height), 1.0, 1.0, INTER_CUBIC);
// Now perform the prediction, see how easy that is:
cout<<("\nface resized");
imshow("resized face image", face_resized);*/
int prediction = model->predict(face);
cout<<("\nface predicted");
// And finally write all we've found out to the original image!
// First of all draw a green rectangle around the detected face:
cout<<("\nnow writing to original");
rectangle(original, face_i, CV_RGB(0, 255,0), 1);
// Create the text we will annotate the box with:
string box_text;
box_text = format("Prediction = %d ", prediction);
// Get stringname
if ( prediction >= 0 && prediction <= 2 ) // three names in g_listname_t, labels 0..2
{
box_text.append( g_listname_t[prediction] );
}
else box_text.append( "Unknown" );
// Calculate the position for annotated text (make sure we don't
// put illegal values in there):
int pos_x = std::max(face_i.tl().x - 10, 0);
int pos_y = std::max(face_i.tl().y - 10, 0);
// And now put it into the image:
putText(original, box_text, Point(pos_x, pos_y), FONT_HERSHEY_PLAIN, 1.0, CV_RGB(0,255,0), 2.0);
}
// Show the result:
imshow("face_recognizer", original);
// And display it:
char key = (char) waitKey(50);
// Exit this loop on escape:
if(key == 27)
break;
}
return 0;
}
That is an expected result if you ask me. The code you showed is the basic recognition pipeline; there are some pitfalls we need to take care of before implementing it.
1) The quality of the training images: how did you crop them?
Do they contain any extra information apart from the face? If you used the Haar classifier from the OpenCV data to crop faces, the images tend to contain extra information beyond the face, since the returned rectangles are a bit larger than the face itself.
2) There is a chance that rotated faces were included in the training set, and it is tough to classify with the features of rotated faces.
3) How many images did you train the recognizer with? That plays a crucial role.
The answer to the first point is mostly outside OpenCV; we can't do much about it, as there is little chance of finding a face detector that is as good and as simple as the Haar detector, so we can accept this as a limitation if we can live with an accuracy of around 70%.
The second problem can be addressed with some preprocessing on the training and testing datasets (see the sketch after this answer),
like aligning faces that are rotated.
Follow this link; very good suggestions for face alignment are given there:
How to align face images c++ opencv
The third problem is solved with a good number of samples, which is not hard to achieve. Take care of alignment before training, so that the correct features can be extracted for classification.
There might be other factors that can improve the accuracy which I might have missed.
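To make the cropping and preprocessing points concrete, here is a minimal sketch using the variable names from the question's detection loop (gray, faces[i], model). The 10% shrink factor and the histogram equalization are illustrative choices, not something the recognizer requires:

// Tighten the detector's rectangle so less background ends up in the crop,
// then normalize the contrast before asking the recognizer for a prediction.
Rect face_i = faces[i];
int shrink_x = face_i.width  / 10;   // cut roughly 10 % from each side
int shrink_y = face_i.height / 10;
Rect tight(face_i.x + shrink_x, face_i.y + shrink_y,
           face_i.width  - 2 * shrink_x, face_i.height - 2 * shrink_y);
tight &= Rect(0, 0, gray.cols, gray.rows);   // clamp to the image bounds

Mat face = gray(tight).clone();
equalizeHist(face, face);                    // reduce lighting differences

// NOTE: whatever preprocessing is used here must also be applied to the
// training images, otherwise the histograms will not be comparable.
int prediction = model->predict(face);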
I am a beginner with OpenCV and I have read some tutorials and manuals but I couldn't quite make sense of some things.
Currently, I am trying to crop a binary image into two sections. I want to know which row has the most number of white pixels and then crop out the row and everything above it and then redraw the image with just the data below the row with the most number of white pixels.
What I've done so far is to find the coordinates of the white pixels using findNonZero and then store them in a Mat. The next step is where I get confused: I am unsure how to access the elements in the Mat and figure out which row occurs most often in the array.
I have used a test image with my code below. It gave me the pixel locations [2,0; 1,1; 2,1; 3,1; 0,2; 1,2; 2,2; 3,2; 4,2; 1,3; 2,3; 3,3; 2,4]. Each element holds the x and y coordinate of a white pixel. First of all, how do I access each element, and then poll only the y-coordinate of each element to determine which row occurs most often? I have tried using the at<>() method but I don't think I've been using it right.
Is this a good way of doing it, or is there a better and/or faster way? I have read about a different method here using the L1-norm, but I couldn't make sense of it; would that method be faster than mine?
Any help would be greatly appreciated.
Below is the code I have so far.
#include <opencv2/opencv.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>
using namespace cv;
using namespace std;
int main()
{
int Number_Of_Elements;
Mat Grayscale_Image, Binary_Image, NonZero_Locations;
Grayscale_Image = imread("Test Image 6 (640x480px).png", 0);
if(!Grayscale_Image.data)
{
cout << "Could not open or find the image" << endl;
return -1;
}
Binary_Image = Grayscale_Image > 128;
findNonZero(Binary_Image, NonZero_Locations);
cout << "Non-Zero Locations = " << NonZero_Locations << endl << endl;
Number_Of_Elements = NonZero_Locations.total();
cout << "Total Number Of Array Elements = " << Number_Of_Elements << endl << endl;
namedWindow("Test Image",CV_WINDOW_AUTOSIZE);
moveWindow("Test Image", 100, 100);
imshow ("Test Image", Binary_Image);
waitKey(0);
return(0);
}
I expect the following to work:
Point loc_i = NonZero_Locations.at<Point>(i);
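Building on that, here is a short sketch of the rest of the idea, using the Binary_Image, NonZero_Locations and Number_Of_Elements variables from the question's code (it needs <algorithm> for max_element). It counts the white pixels per row, finds the row with the most, and keeps only what lies below it:

// Count how many white pixels fall on each row.
vector<int> rowCounts(Binary_Image.rows, 0);
for (int i = 0; i < Number_Of_Elements; i++)
{
    Point loc_i = NonZero_Locations.at<Point>(i);
    rowCounts[loc_i.y]++;          // loc_i.y is the row index
}

// Row with the most white pixels.
int best_row = (int)(max_element(rowCounts.begin(), rowCounts.end()) - rowCounts.begin());

// Keep only the data below that row.
Mat cropped = Binary_Image(Range(best_row + 1, Binary_Image.rows), Range::all()).clone();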