I am trying to understand how to get the descriptor for a given KeyPoint in OpenCV. So far my code looks like follows:
#include <iostream>
#include "opencv2/opencv.hpp"
typedef cv::Mat Image;
int main(int argc, const char * argv[])
{
Image imgA = cv::imread("images/buddhamulticam_total100.png",
CV_LOAD_IMAGE_GRAYSCALE);
Image imgB = cv::imread("images/buddhamulticam_total101.png",
CV_LOAD_IMAGE_GRAYSCALE);
cv::Ptr<cv::FeatureDetector> detector =
cv::FeatureDetector::create("ORB");
cv::Ptr<cv::DescriptorExtractor> descriptor =
cv::DescriptorExtractor::create("ORB");
std::vector<cv::KeyPoint> keyPointsA, keyPointsB;
keyPointsA.push_back(cv::KeyPoint(0,0,5));
keyPointsB.push_back(cv::KeyPoint(10,10,5));
cv::Mat descriptorA, descriptorB;
descriptor->compute(imgA, keyPointsA, descriptorA);
descriptor->compute(imgB, keyPointsB, descriptorB);
std::cout << "DescriptorA (" << descriptorA.rows << "," <<
descriptorA.cols << ")" << std::endl;
std::cout << "DescriptorB (" << descriptorB.rows << ","
<< descriptorB.cols << ")" << std::endl;
return 0;
}
The problem is that I am getting no data in the descriptor. What am I missing?
Could you explain in more detail what are the params passed to the KeyPoint object? I am new to computer vision + OpenCV, so probably a better explanation (than OpenCV's documentation) could help.
You're trying to computer ORB on the points (0,0) and (10,10), but they are too close to the image border, so ORB can't compute descriptors in those locations. ORB (as well as the other binary descriptors) filters them out.
EDIT: since you asked about usage, I'm editing the answer. You should pass the whole image. I use it as:
Ptr<FeatureDetector> detector = FeatureDetector::create(detector_name);
Ptr<DescriptorExtractor> descriptor = DescriptorExtractor::create(descriptor_name);
detector->detect(imgK, kp);
descriptor->compute(imgK, kp, desc);
Related
Goal
Getting the same quality result when using OpenCV Mat as when using Leptonica Pix when doing OCR with Tesseract.
Environment
C++17, OpenCV 3.4.1, Tesseract 3.05.01, Leptonica 1.74.4, Visual Studio Community 2017, Windows 10 Pro 64-bit
Description
I'm working with Tesseract and OCR, and have found what I think is a peculiar behaviour.
This is my input image:
And this is my code:
#include "stdafx.h"
#include <iostream>
#include <opencv2/opencv.hpp>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#pragma comment(lib, "ws2_32.lib")
using namespace std;
using namespace cv;
using namespace tesseract;
void opencvVariant(string titleFile);
void leptonicaVariant(const char* titleFile);
int main()
{
cout << "Tesseract with OpenCV and Leptonica" << endl;
const char* titleFile = "raptor-companion-2.jpg";
opencvVariant(titleFile);
leptonicaVariant(titleFile);
cout << endl;
system("pause");
return 0;
}
void opencvVariant(string titleFile) {
cout << endl << "OpenCV variant..." << endl;
TessBaseAPI ocr;
ocr.Init(NULL, "eng");
Mat image = imread(titleFile);
ocr.SetImage(image.data, image.cols, image.rows, 1, image.step);
char* outText = ocr.GetUTF8Text();
int confidence = ocr.MeanTextConf();
cout << "Text: " << outText << endl;
cout << "Confidence: " << confidence << endl;
}
void leptonicaVariant(const char* titleFile) {
cout << endl << "Leptonica variant..." << endl;
TessBaseAPI ocr;
ocr.Init(NULL, "eng");
Pix *image = pixRead(titleFile);
ocr.SetImage(image);
char* outText = ocr.GetUTF8Text();
int confidence = ocr.MeanTextConf();
cout << "Text: " << outText << endl;
cout << "Confidence: " << confidence << endl;
}
The methods opencvVariant and leptonicaVariant is basically the same except that one is using the class Mat from OpenCV and the other Pix from Leptonica. Yet, the result is quite different.
OpenCV variant...
Text: Rapton
Confidence: 68
Leptonica variant...
Text: Raptor Companion
Confidence: 83
As one can see in the output above, the Pix variant gives a much better result than the Mat variant. Since my code relies heavily on OpenCV for the computer vision before the OCR its essential for me that the OCR works well with OpenCV and its' classes.
Questions
Why does Pix give a better result than Mat, and vice versa?
How could the algorithm be changed to make the Mat variant as efficient as the Pix variant?
OpenCV imread function by default reads image as colored, which means you get pixels as BGRBGRBGR....
In your example you are assuming opencv image is grayscale, so there are 2 ways of fixing that:
Change your SetImage line according to number of channels in opencv image
ocr.SetImage((uchar*)image.data, image.size().width, simageb.size().height, image.channels(), image.step1());
Convert your opencv image to grayscale with 1 channel
cv::cvtColor(image, image, CV_BGR2GRAY);
The highlighted code demonstrate openCV framework is loaded in my C code and it render Police watching. Which is just to demonstrate it works very smooth and very clean code to write.
Target: My webCAM is connected in to the USB port. I would like to capture the live webcam image and match from a local file (/tmp/myface.png), if live webcam match with local file myface.png, it will show the text "Police watching"
How can i now, capture my webCAM on this following code? 2) When the webCAM is captured, how can i load the file and find if it match, on match it shows that text only.
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <fstream>
#include <iostream>
#include <stdio.h>
using namespace std;
using namespace cv;
#include "opencv/cv.h"
void detectAndDisplay(Mat frame);
//*************
// Set Region of Interest
cv::Rect roi_b;
cv::Rect roi_c;
size_t ic = 0; // ic is index of current element
int ac = 0; // ac is area of current element
size_t ib = 0; // ib is index of biggest element
int ab = 0; // ab is area of biggest element
stringstream ssfn;
//*************
CascadeClassifier face_cascade;
string window_name = "Capture - Face detection";
int filenumber; // Number of file to be saved
string filename;
Mat frameread = imread("test.jpg");
int main(int argc, const char *argv[]){
if (argc != 4) {
cout << "usage: " << argv[0] << " </path/to/haar_cascade> </path/to/csv.ext> </path/to/device id>" << endl;
cout << "\t </path/to/haar_cascade> -- Path to the Haar Cascade for face detection." << endl;
cout << "\t </path/to/csv.ext> -- Path to the CSV file with the face database." << endl;
cout << "\t <device id> -- The webcam device id to grab frames from." << endl;
// exit(1);
}
CascadeClassifier face_cascade;
CascadeClassifier face_cascade1;
String fn="C:\\opencv\\sources\\data\\haarcascades\\haarcascade_frontalface_alt2.xml";
String fn1="C:\\opencv\\sources\\data\\haarcascades\\haarcascade_eye.xml";
face_cascade.load(fn);
face_cascade1.load(fn1);
VideoCapture input(0);
if(!input.isOpened()){return -1;}
namedWindow("Mezo",1);
Mat f2;
Mat frame;
std::vector<Rect> faces,faces1;
CvCapture* capture1;
IplImage* f1;
Mat crop;
cv::Rect r;
// detectAndDisplay(frameread);
while(1)
{
ic=0;
ib=0;
ab=0;
ac=0;
input >> frame;
waitKey(10);
//cvtColor(frame, frame, CV_BGR2GRAY);
//cv::equalizeHist(frame,frame);
face_cascade.detectMultiScale(frame, faces, 1.1, 10, CV_HAAR_SCALE_IMAGE | CV_HAAR_DO_CANNY_PRUNING, cvSize(0,0), cvSize(300,300));
for(int i=0; i < faces.size();i++)
{
Point pt1(faces[i].x+faces[i].width, faces[i].y+faces[i].height);
Point pt2(faces[i].x,faces[i].y);
Mat faceROI = frame(faces[i]);
face_cascade1.detectMultiScale(faceROI, faces1, 1.1, 2, 0 | CV_HAAR_SCALE_IMAGE, Size(30,30));
for(size_t j=0; j< faces1.size(); j++)
{
Point center(faces[i].x+faces1[j].x+faces1[j].width*0.5, faces[i].y+faces1[j].y+faces1[j].height*0.5);
int radius = cvRound((faces1[j].width+faces1[j].height)*0.25);
circle(frame, center, radius, Scalar(255,0,0), 2, 8, 0);
}
rectangle(frame, pt1, pt2, cvScalar(0,255,0), 2, 8, 0);
}
imshow("Result", frame);
waitKey(3);
char c = waitKey(3);
if(c == 27)
break;
}
return 0;
}
What you are asking about is probably the Face recognition. You should be more clear in your question.
Opencv has a class for doing recognition perfectly, not as you think to do.
Many approaches are available for this technology, Opencv has three algorithms. As well you need to prepare your database of images (labelled faces)
All this steps are described in opencv docs with some examples : http://docs.opencv.org/modules/contrib/doc/facerec/facerec_tutorial.html
Just you need to read and apply.
Here you can also find a good tutorial for beginners.
I am trying to read data from an industrial camera using the V4l linux driver and C++. I would like to display the result using the OpenCV. I read the buffer, create an Mat object, which actually contains values in range 0...255.
The problem seems to be the imshow() call. When commenting this line out, an actual window without an image is displayed. Once uncommented no window is diplayed and also no output in terminal after this line is shown. I am not able to find a solution on my own, all examples I found look the same as my code to me.
Here is the code:
#include <fcntl.h>
#include "opencv/cv.h"
#include "opencv/highgui.h"
#include <libv4l2.h>
#include <libv4l1.h>
#include <linux/videodev2.h>
#include <sys/ioctl.h>
#define BUFFERSIZE 357120 // 744 * 480
using namespace cv;
using namespace std;
int main(int argc, char **argv) {
int cameraHandle, i;
unsigned char pictureBuffer[BUFFERSIZE];
char cameraDevice[] = "/dev/video0";
struct v4l2_control V4L2_control;
/* open camera device */
if (( cameraHandle = v4l1_open(cameraDevice, O_RDONLY)) == -1 ){
printf("Unable to open the camera");
return -1;
}
// disable auto exposure
V4L2_control.id = V4L2_CID_EXPOSURE_AUTO;
V4L2_control.value = V4L2_EXPOSURE_SHUTTER_PRIORITY;
ioctl(cameraHandle, VIDIOC_S_CTRL, &V4L2_control);
// set exposure time
V4L2_control.id = V4L2_CID_EXPOSURE_ABSOLUTE;
V4L2_control.value = 2;
ioctl(cameraHandle, VIDIOC_S_CTRL, &V4L2_control);
// get 5 pictures to warm up the camera
for (i = 0; i <= 5; i++){
v4l1_read(cameraHandle, pictureBuffer, BUFFERSIZE);
}
// show pictures
Mat mat = Mat(744, 480, CV_8UC3, (void*)pictureBuffer);
cout << "M = " << endl << " " << mat << endl << endl; // display the image data
namedWindow("imagetest", CV_WINDOW_AUTOSIZE );
imshow("imagetest", mat);
waitKey(30);
cout << "test output" << endl;
//clenup
v4l1_close(cameraHandle);
destroyWindow("imagetest");
return 0;
}
EDIT:
Well, after running the code in terminal instead of ecipse I saw a segmentation fault Even commenting everything behind the
cout << "M = " << endl << " " << mat << endl << endl;
line gives me this error.
Solved. The problem lied in the wrong file format. CV_8UC1 or CV_8U instead of CV_8UC3 brought and an output. The difference between those formats is described here: In OpenCV, what's the difference between CV_8U and CV_8UC1?
The program '[5772] opencv3.exe' has exited with code 1 (0x1).
Other errors:
opencv_flann248.dll
opencv_features2d248.dll
opencv_calib3d248.dll
opencv_ml248.dll
opencv_video248.dll
opencv_contrib248.dll
opencv_objdetect248.dll
opencv_highgui248.dll
opencv_imgproc248.dll
opencv_core248.dll
- Cannot find or open the PDB file.
Code:
#include "opencv2/core/core.hpp"
#include "opencv2/contrib/contrib.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/objdetect/objdetect.hpp"
#include <iostream>
#include <fstream>
#include <sstream>
using namespace cv;
using namespace std;
static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
CV_Error(CV_StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
stringstream liness(line);
getline(liness, path, separator);
getline(liness, classlabel);
if(!path.empty() && !classlabel.empty()) {
images.push_back(imread(path, 0));
labels.push_back(atoi(classlabel.c_str()));
}
}
}
int main(int argc, const char *argv[]) {
// Check for valid command line arguments, print usage
// if no arguments were given.
if (argc != 4) {
cout << "usage: " << argv[0] << " </path/to/haar_cascade> </path/to/csv.ext> </path/to/device id>" << endl;
cout << "\t </path/to/haar_cascade> -- Path to the Haar Cascade for face detection." << endl;
cout << "\t </path/to/csv.ext> -- Path to the CSV file with the face database." << endl;
cout << "\t <device id> -- The webcam device id to grab frames from." << endl;
exit(1);
}
// Get the path to your CSV:
string fn_haar = string(argv[1]);
string fn_csv = string(argv[2]);
int deviceId = atoi(argv[3]);
// These vectors hold the images and corresponding labels:
vector<Mat> images;
vector<int> labels;
// Read in the data (fails if no valid input filename is given, but you'll get an error message):
try {
read_csv(fn_csv, images, labels);
} catch (cv::Exception& e) {
cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
// nothing more we can do
exit(1);
}
// Get the height from the first image. We'll need this
// later in code to reshape the images to their original
// size AND we need to reshape incoming faces to this size:
int im_width = images[0].cols;
int im_height = images[0].rows;
// Create a FaceRecognizer and train it on the given images:
Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
model->train(images, labels);
// That's it for learning the Face Recognition model. You now
// need to create the classifier for the task of Face Detection.
// We are going to use the haar cascade you have specified in the
// command line arguments:
//
CascadeClassifier haar_cascade;
haar_cascade.load(fn_haar);
// Get a handle to the Video device:
VideoCapture cap(deviceId);
// Check if we can use this device at all:
if(!cap.isOpened()) {
cerr << "Capture Device ID " << deviceId << "cannot be opened." << endl;
return -1;
}
// Holds the current frame from the Video device:
Mat frame;
for(;;) {
cap >> frame;
// Clone the current frame:
Mat original = frame.clone();
// Convert the current frame to grayscale:
Mat gray;
cvtColor(original, gray, CV_BGR2GRAY);
// Find the faces in the frame:
vector< Rect_<int> > faces;
haar_cascade.detectMultiScale(gray, faces);
// At this point you have the position of the faces in
// faces. Now we'll get the faces, make a prediction and
// annotate it in the video. Cool or what?
for(int i = 0; i < faces.size(); i++) {
// Process face by face:
Rect face_i = faces[i];
// Crop the face from the image. So simple with OpenCV C++:
Mat face = gray(face_i);
// Resizing the face is necessary for Eigenfaces and Fisherfaces. You can easily
// verify this, by reading through the face recognition tutorial coming with OpenCV.
// Resizing IS NOT NEEDED for Local Binary Patterns Histograms, so preparing the
// input data really depends on the algorithm used.
//
// I strongly encourage you to play around with the algorithms. See which work best
// in your scenario, LBPH should always be a contender for robust face recognition.
//
// Since I am showing the Fisherfaces algorithm here, I also show how to resize the
// face you have just found:
Mat face_resized;
cv::resize(face, face_resized, Size(im_width, im_height), 1.0, 1.0, INTER_CUBIC);
// Now perform the prediction, see how easy that is:
int prediction = model->predict(face_resized);
// And finally write all we've found out to the original image!
// First of all draw a green rectangle around the detected face:
rectangle(original, face_i, CV_RGB(0, 255,0), 1);
// Create the text we will annotate the box with:
string box_text = format("Prediction = %d", prediction);
// Calculate the position for annotated text (make sure we don't
// put illegal values in there):
int pos_x = std::max(face_i.tl().x - 10, 0);
int pos_y = std::max(face_i.tl().y - 10, 0);
// And now put it into the image:
putText(original, box_text, Point(pos_x, pos_y), FONT_HERSHEY_PLAIN, 1.0, CV_RGB(0,255,0), 2.0);
}
// Show the result:
imshow("face_recognizer", original);
// And display it:
char key = (char) waitKey(20);
// Exit this loop on escape:
if(key == 27)
break;
}
return 0;
}
look at the code, at the beginning of main():
cout << "usage: " << argv[0] << " </path/to/haar_cascade> </path/to/csv.ext> </path/to/device id>" << endl;
so, you have to pass 3 cmdline args to your prog here:
a cascade-file ( an xml file [either lbp or haar] from opencv/data for the face detection )
the csv (txt) file with the names and the labels of the training images
the camera device id used for the later prediction
i have tried out this file to compile with
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/nonfree/features2d.hpp>
#include <opencv2/legacy/legacy.hpp>
using namespace cv;
static void help( char** argv )
{
std::cout << "\nUsage: " << argv[0] << " [path/to/image1] [path/to/image2] \n"
<< "This is an example on how to use the keypoint descriptor presented in the following paper: \n"
<< "A. Alahi, R. Ortiz, and P. Vandergheynst. FREAK: Fast Retina Keypoint. \n"
<< "In IEEE Conference on Computer Vision and Pattern Recognition, 2012. CVPR 2012 Open Source Award winner \n"
<< std::endl;
}
int main( int argc, char** argv ) {
// check http://docs.opencv.org/doc/tutorials/features2d/table_of_content_features2d/table_of_content_features2d.html
// for OpenCV general detection/matching framework details
if( argc != 3 ) {
help(argv);
return -1;
}
// Load images
Mat imgA = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE );
if( !imgA.data ) {
std::cout<< " --(!) Error reading image " << argv[1] << std::endl;
return -1;
}
Mat imgB = imread(argv[2], CV_LOAD_IMAGE_GRAYSCALE );
if( !imgB.data ) {
std::cout << " --(!) Error reading image " << argv[2] << std::endl;
return -1;
}
std::vector<KeyPoint> keypointsA, keypointsB;
Mat descriptorsA, descriptorsB;
std::vector<DMatch> matches;
// DETECTION
// Any openCV detector such as
SurfFeatureDetector detector(2000,4);
// DESCRIPTOR
// Our proposed FREAK descriptor
// (roation invariance, scale invariance, pattern radius corresponding to SMALLEST_KP_SIZE,
// number of octaves, optional vector containing the selected pairs)
// FREAK extractor(true, true, 22, 4, std::vector<int>());
FREAK extractor;
// MATCHER
// The standard Hamming distance can be used such as
// BruteForceMatcher<Hamming> matcher;
// or the proposed cascade of hamming distance using SSSE3
BruteForceMatcher<Hamming> matcher;
// detect
double t = (double)getTickCount();
detector.detect( imgA, keypointsA );
detector.detect( imgB, keypointsB );
t = ((double)getTickCount() - t)/getTickFrequency();
std::cout << "detection time [s]: " << t/1.0 << std::endl;
// extract
t = (double)getTickCount();
extractor.compute( imgA, keypointsA, descriptorsA );
extractor.compute( imgB, keypointsB, descriptorsB );
t = ((double)getTickCount() - t)/getTickFrequency();
std::cout << "extraction time [s]: " << t << std::endl;
// match
t = (double)getTickCount();
matcher.match(descriptorsA, descriptorsB, matches);
t = ((double)getTickCount() - t)/getTickFrequency();
std::cout << "matching time [s]: " << t << std::endl;
// Draw matches
Mat imgMatch;
drawMatches(imgA, keypointsA, imgB, keypointsB, matches, imgMatch);
namedWindow("matches", CV_WINDOW_KEEPRATIO);
imshow("matches", imgMatch);
waitKey(0);
}
with this command
gcc freak_demo.cpp `pkg-config opencv --cflags --libs`
and received this error message
/usr/lib64/gcc/x86_64-suse-linux/4.8/../../../../x86_64-suse-linux/bin/ld: /tmp/ccwSmrle.o: undefined reference to symbol '_ZNSsD1Ev##GLIBCXX_3.4'
/usr/lib64/gcc/x86_64-suse-linux/4.8/../../../../x86_64-suse-linux/bin/ld: note: '_ZNSsD1Ev##GLIBCXX_3.4' is defined in DSO /usr/lib64/libstdc++.so.6 so try adding it to the linker command line
/usr/lib64/libstdc++.so.6: could not read symbols: Invalid operation
collect2: error: ld returned 1 exit status
I don't know what GLIBCXX is. Which package (OpenSuse 13.1, gcc 4.8) do I have to install? I don't know how to interpret the error message, any help is appreciated.
As Daniel Frey said in his comment, use g++ instead of gcc. This solved the exact same problem for me, but for the compilation of a totally different program.