How to save DLIB object detection chips properly? - c++

I am having trouble saving image chips generated by DLIB's face detection model. The code below details my workflow. I have attempted saving the whole image, d_image below, and that works just fine. However, when I try to save each chip I get distorted output (see example below). I'm using dlib 19.4 on Ubuntu 16.04.
// object to store raw image data
cv::Mat rawImage;
// initialize the detector
dlib::frontal_face_detector detector = dlib::get_frontal_face_detector();
// use a shape predictor object to create full_object_detections
dlib::shape_predictor sp;
dlib::deserialize(argv[1]) >> sp;
// for writing out images
int image_id = 1;
while (true) {
    // retrieve image size
    sockt.getData(&image_size, 4, NULL);
    if (image_size > 0) {
        rawImage.create(1, image_size, CV_8UC1);
        // load incoming data from a stream
        sockt.getData(rawImage.data, image_size, MSG_WAITALL);
        // reshape and correct orientation
        dlib::cv_image<dlib::bgr_pixel> d_image = utils::process_frame(rawImage);
        // find the faces!
        std::vector<dlib::rectangle> detections = detector(d_image);
        if (detections.size() > 0) {
            // generate additional detection data so we can use
            // dlib's extract_image_chips function
            std::vector<dlib::full_object_detection> shapes;
            for (int idx = 0; idx < detections.size(); idx++) {
                dlib::full_object_detection shape = sp(d_image, detections[idx]);
                shapes.push_back(shape);
            }
            // write each chip to disk
            dlib::array<dlib::array2d<dlib::bgr_pixel>> face_chips;
            dlib::extract_image_chips(d_image, dlib::get_face_chip_details(shapes), face_chips);
            for (int idx = 0; idx < face_chips.size(); idx++) {
                std::string fname = argv[2] + std::to_string(image_id) + ".jpg";
                dlib::save_jpeg(face_chips[idx], fname);
                image_id++;
            }
        }
    }
}
Example saved chip:
Edit: added a clarification for utils::process_frame. This function accepts a 1xN array and decodes it as a JPEG using OpenCV.

Something is wrong with the image formats you are using.
This is an OpenCV grayscale (1-channel) image:
rawImage.create(1, image_size, CV_8UC1);
This is a BGR (3-channel) image:
dlib::cv_image<dlib::bgr_pixel> d_image = utils::process_frame(rawImage);
dlib should throw an exception if the image has an incorrect number of channels, but it does not throw in your case. This means that somewhere inside utils::process_frame(rawImage) the image format is changed to 3 channels, so check the image formats first.
Also note that rawImage.create(1, image_size, CV_8UC1); constructs an image with 1 row and image_size columns, so something is inconsistent about the image size and format.
Please also note that dlib does not copy the image data into dlib::cv_image<dlib::bgr_pixel> d_image, so rawImage must remain unchanged (including by other threads) until processing is finished.
In any case, you can call dlib::toMat to get an OpenCV Mat and save it with OpenCV functions.
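For example, a minimal sketch of that approach, reusing the face_chips array from the question (the includes are assumptions about your build):
#include <dlib/opencv.h>
#include <opencv2/opencv.hpp>

for (unsigned long idx = 0; idx < face_chips.size(); idx++)
{
    // toMat only wraps the dlib image data without copying, so the
    // chip must stay alive until imwrite returns
    cv::Mat chip = dlib::toMat(face_chips[idx]);
    // dlib::bgr_pixel already matches OpenCV's BGR channel order,
    // so no color conversion is needed before writing
    cv::imwrite(argv[2] + std::to_string(image_id) + ".jpg", chip);
    image_id++;
}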
UPDATE:
one more thing here:
dlib::cv_image<dlib::bgr_pixel> d_image = utils::process_frame(rawImage);
It looks like utils::process_frame returns a temporary object that is destroyed after d_image is constructed. d_image does not own the returned data, so the data can be lost.
So I suggest changing your code like this:
cv::Mat uncompressed;
utils::process_frame(rawImage, uncompressed);
dlib::cv_image<dlib::bgr_pixel> d_image(uncompressed);
where process_frame takes a reference to a cv::Mat and saves its output into it.
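utils::process_frame is your own helper, so this is only a hedged sketch of what the suggested signature could look like, assuming (per the edit above) that it decodes a 1xN JPEG buffer with OpenCV:
namespace utils
{
    void process_frame(const cv::Mat& rawImage, cv::Mat& uncompressed)
    {
        // cv::imdecode accepts a 1xN CV_8UC1 buffer and produces a
        // 3-channel BGR image; writing it into the caller-owned Mat
        // keeps the pixel data alive after this call returns
        uncompressed = cv::imdecode(rawImage, cv::IMREAD_COLOR);
        // ... the original orientation correction would go here ...
    }
}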

Related

Absdiff in OpenCV compiles but shows a black image

I have been trying to use absdiff to find the motion in an image, but unfortunately it fails; I am new to OpenCV. The code is supposed to use absdiff to determine whether any motion is happening, but the output is pitch black for diff1, diff2, and motion. Meanwhile, next_mframe, current_mframe, and prev_mframe show grayscale images, while result shows a clear and normal image. I used this as my reference: http://manmade2.com/simple-home-surveillance-with-opencv-c-and-raspberry-pi/. I think all the image buffers are loaded with the same frame and then compared, which would explain why the output is pitch black. Is there any other method I am missing? I am using RTSP to pass the camera's raw image to ROS.
void imageCallback(const sensor_msgs::ImageConstPtr& msg_ptr)
{
    CvPoint center;
    int radius, posX, posY;
    cv_bridge::CvImagePtr cv_image; // to parse image_raw from RTSP
    try
    {
        cv_image = cv_bridge::toCvCopy(msg_ptr, enc::BGR8);
    }
    catch (cv_bridge::Exception& e)
    {
        ROS_ERROR("cv_bridge exception: %s", e.what());
        return;
    }
    frame = new IplImage(cv_image->image); // frame now holds the raw image
    frame1 = new IplImage(cv_image->image);
    frame2 = new IplImage(cv_image->image);
    frame3 = new IplImage(cv_image->image);
    matriximage = cvarrToMat(frame);
    cvtColor(matriximage, matriximage, CV_RGB2GRAY); // grayscale
    prev_mframe = cvarrToMat(frame1);
    cvtColor(prev_mframe, prev_mframe, CV_RGB2GRAY); // grayscale
    current_mframe = cvarrToMat(frame2);
    cvtColor(current_mframe, current_mframe, CV_RGB2GRAY); // grayscale
    next_mframe = cvarrToMat(frame3);
    cvtColor(next_mframe, next_mframe, CV_RGB2GRAY); // grayscale
    // maximum deviation of the image; the higher the value, the more motion is allowed
    int max_deviation = 20;
    result = matriximage;
    // relocate images in the right order
    prev_mframe = current_mframe;
    current_mframe = next_mframe;
    next_mframe = matriximage;
    //motion = diffImg(prev_mframe, current_mframe, next_mframe);
    absdiff(prev_mframe, next_mframe, diff1); // here it should show a black and white image
    absdiff(next_mframe, current_mframe, diff2);
    bitwise_and(diff1, diff2, motion);
    threshold(motion, motion, 35, 255, CV_THRESH_BINARY);
    erode(motion, motion, kernel_ero);
    imshow("Motion Detection", result);
    imshow("diff1", diff1); // I tried to output the image but it's all black
    imshow("diff2", diff2); // same here: all black
    imshow("diff1", motion);
    imshow("nextframe", next_mframe);
    imshow("motion", motion);
    char c = cvWaitKey(3);
}
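One thing worth checking, consistent with the suspicion above that every buffer holds the same frame: cv::Mat assignment only copies the header, not the pixels, so the frame-shuffling lines make prev_mframe, current_mframe, and next_mframe views of the same data, and absdiff of identical data is all zeros. A tiny sketch of the difference:
cv::Mat a = matriximage; // shares pixel data with matriximage
cv::Mat b;
matriximage.copyTo(b);   // deep copy: b owns its own pixels
This is also why the reworked version below uses copyTo when rotating the frames.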
I changed from the cv_bridge method to VideoCapture, and it seems to function well. cv_bridge just cannot save the image, even though I changed the IplImage to Mat format. Maybe there are other ways, but for now I will go with this method first.
VideoCapture cap(0);
Tracker(void)
{
    // check if the camera opened
    if (!cap.isOpened())
    {
        cout << "cannot open the video cam" << endl;
    }
    cout << "camera is opening" << endl;
    // capture 3 frames and convert them to grayscale
    cap >> prev_mframe;
    cvtColor(prev_mframe, prev_mframe, CV_RGB2GRAY);
    cap >> current_mframe;
    cvtColor(current_mframe, current_mframe, CV_RGB2GRAY);
    cap >> next_mframe;
    cvtColor(next_mframe, next_mframe, CV_RGB2GRAY);
    // relocate images in the right order
    current_mframe.copyTo(prev_mframe);
    next_mframe.copyTo(current_mframe);
    matriximage.copyTo(next_mframe);
    motion = diffImg(prev_mframe, current_mframe, next_mframe);
}

How to convert the Mat image to dlib's array2d<rgb_pixel>?

I created a DLL for dlib face landmark detection. In OpenCV we get the image as a Mat, but in dlib we get it as an array2d, so can anyone say how to convert the Mat to array2d?
The full code can be viewed here.
You can try to use cv_image, such as:
cv::Mat temp;
...
cv_image<bgr_pixel> image(temp);
// Detect faces
std::vector<rectangle> faces = detector(image);
Also note that this just wraps the cv::Mat object and doesn't copy anything. So image is only valid as long as temp is valid. This basically means you shouldn't modify temp while using image.
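If you need the dlib image to outlive temp, a minimal sketch is to copy the pixels into a dlib-owned array2d instead of only wrapping them (assuming temp is a CV_8UC3 BGR image, as above):
#include <dlib/opencv.h>
#include <dlib/image_transforms.h>

dlib::array2d<dlib::bgr_pixel> owned;
dlib::assign_image(owned, dlib::cv_image<dlib::bgr_pixel>(temp));
// owned holds its own copy of the pixels, so temp can now be
// modified or destroyed safely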

My application crashes at KNearest::find_nearest

I want to implement an OCR feature.
I have collected some samples and I want to use K-Nearest to implement it.
So I use the code below to load the data and initialize KNearest:
KNearest* knn = new KNearest;
Mat mData, mClass;
for (int i = 0; i <= 9; ++i)
{
    Mat mImage = imread( FILENAME ); // the filename format is '%d.bmp', presenting a 15x15 image
    Mat mFloat;
    if (mImage.empty()) break; // if the file doesn't exist
    mImage.convertTo(mFloat, CV_32FC1);
    mData.push_back(mFloat.reshape(1, 1));
    mClass.push_back( '0' + i );
}
knn->train(mData, mClass);
Then I call this code to find the best result:
for (vector<Mat>::iterator it = charset.begin(); it != charset.end(); ++it)
{
    Mat mFloat;
    it->convertTo(mFloat, CV_32FC1); // 'it' points to a 15x15 gray image
    float result = knn->find_nearest(mFloat.reshape(1, 1), knn->get_max_k());
}
But my application crashes at find_nearest.
Could anyone help me?
I seem to have found the problem...
My query images are grayscale (converted earlier with cvtColor), but the training images loaded by imread are 3-channel, so the training rows and the query rows end up with different lengths.
After I add
cvtColor(mImage, mImage, COLOR_BGR2GRAY);
between
if (mImage.empty()) break;
and
mImage.convertTo(mFloat, CV_32FC1);
find_nearest() returns a value and my application works fine.
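For reference, the corrected loading loop then looks like this (a direct restatement of the fix above, everything else unchanged):
Mat mImage = imread( FILENAME );          // '%d.bmp', a 15x15 image
Mat mFloat;
if (mImage.empty()) break;                // if the file doesn't exist
cvtColor(mImage, mImage, COLOR_BGR2GRAY); // match the grayscale query images
mImage.convertTo(mFloat, CV_32FC1);
mData.push_back(mFloat.reshape(1, 1));    // now 1x225 (15x15x1) instead of 1x675 (15x15x3)
mClass.push_back( '0' + i );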

Using Mat::at(i,j) in opencv for a 2-D Mat object

I am using Ubuntu 12.04 and OpenCV 2
I have written the following code:
IplImage* img = 0;
img = cvLoadImage("nature.jpg");
if (img != 0)
{
    Mat Img_mat(img);
    std::vector<Mat> RGB;
    split(Img_mat, RGB);
    int data = RGB[0].at<int>(i, j); /* where i, j are inside the bounds of the matrix size; I have checked this */
}
The problem is that I am getting negative values and very large values in the data variable. I think I have made a mistake somewhere; can you please point it out?
I have been reading the documentation (I have not finished it fully, it is quite large), and from what I have read, this should work. But it isn't. What is going wrong here?
Img_mat is a 3-channel image. Each channel consists of pixel values of type uchar.
So with split(Img_mat, BGR), Img_mat is split into 3 planes of blue, green, and red, which are collectively stored in a vector BGR. So BGR[0] is the first (blue) plane with uchar pixels, hence it will be:
int dataB = (int)BGR[0].at<uchar>(i,j);
int dataG = (int)BGR[1].at<uchar>(i,j);
and so on.
You have to specify the correct type for cv::Mat::at(i,j). You are accessing the pixel as int, while it should be a vector of uchar (Vec3b for a 3-channel image). Your code should look something like this:
IplImage* img = 0;
img = cvLoadImage("nature.jpg");
if (img != 0)
{
    Mat Img_mat(img);
    Vec3b data = Img_mat.at<Vec3b>(i, j);
    // data[0] -> blue
    // data[1] -> green
    // data[2] -> red
}
Why are you loading an IplImage first? You are mixing the C and C++ interfaces.
Loading a cv::Mat with imread directly would be more straightforward.
This way you can also specify the type and use the corresponding type in your at call.
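A minimal sketch of the imread-based version suggested above:
cv::Mat img = cv::imread("nature.jpg");
if (!img.empty())
{
    // the image is 3-channel BGR, so access pixels as Vec3b
    cv::Vec3b data = img.at<cv::Vec3b>(i, j);
    // data[0] -> blue, data[1] -> green, data[2] -> red
}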

PCA in OpenCV using the new C++ interface

As an aside: Apologies if I'm flooding SO with OpenCV questions :p
I'm currently trying to port my old C code over to the new C++ interface, and I've got to the point where I'm rebuilding my Eigenfaces face recogniser class.
Mat img = imread("1.jpg");
Mat img2 = imread("2.jpg");
FaceDetector* detect = new HaarDetector("haarcascade_frontalface_alt2.xml");
// convert to grey scale
Mat g_img, g_img2;
cvtColor(img, g_img, CV_BGR2GRAY);
cvtColor(img2, g_img2, CV_BGR2GRAY);
// find the faces in the images
Rect r = detect->getFace(g_img);
Mat img_roi = g_img(r);
r = detect->getFace(g_img2);
Mat img2_roi = g_img2(r);
// create the data matrix for PCA
Mat data;
data.create(2,1, img2_roi.type());
data.row(0) = img_roi;
data.row(1) = img2_roi;
// perform PCA
Mat averageFace;
PCA pca(data, averageFace, CV_PCA_DATA_AS_ROW, 2);
//namedWindow("avg",1); imshow("avg", averageFace); - causes segfault
//namedWindow("avg",1); imshow("avg", Mat(pca.mean)); - doesn't work
I'm trying to create the PCA space, and then see if it's working by displaying the computed average image. Are there any other steps to this?
Perhaps I need to project the images onto the PCA subspace first of all?
Your error is probably here:
Mat data;
data.create(2,1, img2_roi.type());
data.row(0) = img_roi;
data.row(1) = img2_roi;
PCA expects a matrix with the data vectors as rows. However, you never scale the images to the same size, so they do not have the same number of pixels (and therefore not the same dimension). Also, in data.create(2, 1, ...) the 1 needs to be the dimension of your vector, i.e. the number of pixels. Then copy the pixels from each crop into your data matrix, as in the sketch below.
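A hedged sketch of that fix (the common size is an assumption; any fixed size works as long as both crops use it):
// bring both face crops to the same fixed size so each flattened
// row of the data matrix has the same dimension
cv::Size common(100, 100); // hypothetical common size
cv::Mat a, b;
cv::resize(img_roi, a, common);
cv::resize(img2_roi, b, common);

// one row per sample, one column per pixel
cv::Mat data(2, common.area(), CV_32FC1);
a.reshape(1, 1).convertTo(data.row(0), CV_32FC1);
b.reshape(1, 1).convertTo(data.row(1), CV_32FC1);

cv::PCA pca(data, cv::Mat(), CV_PCA_DATA_AS_ROW, 2);
// the average face is then available as pca.mean; reshape it back
// to 'common' (and convert to 8-bit) before displaying it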