What I am trying to do is use some process to alter the contents of a bitmap image (dimensions and everything else are kept constant). While I run a process to perform transformations on it, I want to concurrently stream the image to my screen as it happens. So it's basically video, but not from a series of images; rather, it's the same image in different stages of a transformation.
I am trying to do this because the bitmap I want to do this with is large, so actually saving each state as its own image and then combining them is not possible due to memory constraints.
My question is about the "live image display": while another process modifies the image, is there any way something like this can be done with a bitmap image?
You could stream the images into a video encoder directly.
For example, you can feed ffmpeg raw PGM/PPM images as input and get the compressed video as output, without ever creating the actual images on disk.
Writing a PGM or PPM image just means generating a few header bytes followed by the raw pixel values, so it is trivial to do in any language without an image library. For example:
#include <stdio.h>
#include <math.h>
#include <algorithm>

int main(int argc, const char *argv[]) {
    int w = 1920, h = 1080;
    for (int frame=0; frame<100; frame++) {
        // output frame header
        printf("P5\n%i %i 255\n", w, h);
        for (int y=0; y<h; y++) {
            for (int x=0; x<w; x++) {
                double dx = (x - w/2),
                       dy = (y - h/2),
                       d = sqrt(dx*dx + dy*dy),
                       a = atan2(dy, dx) - frame*2*3.14159265359/100,
                       value = 127 + 127*sin(a+d/10);
                // output pixel
                putchar(std::max(0, std::min(255, int(value))));
            }
        }
    }
    return 0;
}
generates on standard output a sequence of frames that can be combined into a video directly. Running the program with
./video | ffmpeg -f image2pipe -vcodec pgm -framerate 60 -i - -y out.mp4
will generate a video named out.mp4. The frames are read as a stream of PGM images from standard input (-f image2pipe -vcodec pgm -i -) at 60 fps (-framerate 60), overwriting the output file if it is already present (-y).
This code was tested on Linux; for this approach to work on Windows you also need to set stdout to binary mode with _setmode(fileno(stdout), _O_BINARY); (declared in <io.h> and <fcntl.h>) or something similar (I didn't test on Windows).
A more sophisticated way to do basically the same thing would be to start the encoder as a child process instead of piping through the shell, thus leaving standard output usable by the program.
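For example, on Linux a minimal sketch of that child-process route could use popen() to hand the frames to ffmpeg's stdin; the command line and buffer handling here are only an illustration of the idea, not part of the original answer.
#include <cstdio>
#include <vector>

int main() {
    const int w = 1920, h = 1080;
    std::vector<unsigned char> pixels(w * h, 0);   // one grayscale frame
    // ffmpeg becomes a child process; our own stdout stays free for other output.
    FILE* ff = popen("ffmpeg -f image2pipe -vcodec pgm -framerate 60 -i - -y out.mp4", "w");
    if (!ff) return 1;
    for (int frame = 0; frame < 100; ++frame) {
        // ... update pixels for this frame ...
        fprintf(ff, "P5\n%i %i 255\n", w, h);        // PGM frame header
        fwrite(pixels.data(), 1, pixels.size(), ff); // PGM frame raster
    }
    return pclose(ff) == 0 ? 0 : 1;
}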
If you're just trying to see the results in real time, you can use something similar to page flipping / double buffering.
Basically: create two images. Image 1 is a "drawing buffer" while Image 2 is a "display buffer." Perform some portion of your drawing to image 1. Then, copy its complete contents to image 2. Continue drawing to image 1, copy, draw, copy, draw... etc. Insert a delay before each copy.
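A minimal sketch of that draw/copy loop, assuming a plain 8-bit grayscale buffer and a worker thread (all names here are illustrative, not from the question):
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    const int w = 1920, h = 1080;
    std::vector<unsigned char> drawBuffer(w * h);    // Image 1: the worker draws here
    std::vector<unsigned char> displayBuffer(w * h); // Image 2: the UI paints from here
    std::mutex displayMutex;

    std::thread worker([&] {
        for (int step = 0; step < 100; ++step) {
            // ... perform one slice of the transformation on drawBuffer ...
            std::this_thread::sleep_for(std::chrono::milliseconds(30)); // the delay before each copy
            std::lock_guard<std::mutex> lock(displayMutex);
            displayBuffer = drawBuffer; // publish a consistent snapshot for display
        }
    });

    // The display loop (GUI thread) would repeatedly lock displayMutex,
    // blit displayBuffer to the screen, unlock, and wait for the next refresh.
    worker.join();
    return 0;
}
The copy under the mutex is what keeps the display buffer consistent: the viewer never sees a half-finished drawing pass.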
I am trying to create a functional SVM. I have 114 training images (60 positive, 54 negative) and 386 testing images for the SVM to predict against.
I read the training image features into a float array like this:
trainingDataFloat[i][0] = trainFeatures.rows;
trainingDataFloat[i][1] = trainFeatures.cols;
And the same for the testing images too:
testDataFloat[i][0] = testFeatures.rows;
testDataFloat[i][2] = testFeatures.cols;
Then, using Micka's answer to this question, I turn testDataFloat into a one-dimensional array and feed it to a Mat like this, so the SVM can predict on it:
float* testData1D = (float*)testDataFloat;
Mat testDataMat1D(height*width, 1, CV_32FC1, testData1D);
float testPredict = SVMmodel.predict(testDataMat1D);
Once this was all in place, I get the following debug error:
Sizes of input arguments do not match (the sample size is different from what has been used for training) in cvPreparePredictData
Looking at this post, I found (thanks to berak) that:
"all images (used in training & prediction) have to be the same size"
So I included a resize function that resizes all the images to a chosen square size (100x100, 200x200, 1000x1000, etc.).
Running it again with the resized images saved to a new directory that the program now loads from, I get the exact same error as before:
Sizes of input arguments do not match (the sample size is different from what has been used for training) in cvPreparePredictData
I have no idea what to do anymore. Why is it still throwing that error?
EDIT
I changed
Mat testDataMat1D(TestDFheight*TestDFwidth, 1, CV_32FC1, testData1D);
to
Mat testDataMat1D(1, TestDFheight*TestDFwidth, CV_32FC1, testData1D);
and placed the .predict inside the loop where the features are written to the float array, so that each image is given to .predict individually, because of this question. With the two ints swapped so that .cols = 1 and .rows = TestDFheight*TestDFwidth, the program seems to actually run, but then stops on image 160 (.exe has stopped working)... So that's a new concern.
EDIT 2
Added a simple
std::cout << testPredict;
to view the determined output of the SVM, and it seems to predict a positive match for everything until image 160, where it stops running.
Please check your training and test feature vectors.
I'm assuming your feature data is some form of cv::Mat containing features on each row.
In which case you want your training matrix to be a concatenation of each feature matrix from each image.
These lines don't look right:
trainingDataFloat[i][0] = trainFeatures.rows;
trainingDataFloat[i][1] = trainFeatures.cols;
This sets elements of a 2D matrix to the number of rows and columns in trainFeatures, which has nothing to do with the actual data in the trainFeatures matrix.
What are you trying to detect? If each image is a positive or negative example, are you trying to detect something in an image? What are your features?
If you're trying to detect an object in the image on a per image basis, then you need a feature vector describing the whole image in one vector. In which case you'd do something like this with your training data:
int N;            // Set to the number of images you plan on using for training
int feature_size; // Set to the number of features extracted from each image. Should be constant across all images.
cv::Mat X = cv::Mat::zeros(N, feature_size, CV_32F); // Feature matrix
cv::Mat Y = cv::Mat::zeros(N, 1, CV_32F);            // Label vector
// Now use a for loop to copy data into X and Y; Y = +1 for positive examples and -1 for negative examples
for(int i = 0; i < trainImages.size(); ++i)
{
    // features is a 1 x feature_size cv::Mat row vector of the extracted features.
    // Use copyTo here: "X.row(i) = ..." would only re-point the row header and not copy the data.
    trainImages[i].features.copyTo(X.row(i));
    Y.row(i) = trainImages[i].isPositive ? 1 : -1;
}
// Now train your cv::SVM on X and Y.
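As a hedged illustration of that last step only: the question appears to use the OpenCV 2.4-era CvSVM API, so a minimal training call could look like the sketch below (the parameter values are placeholders, not a recommendation).
// Sketch: train a CvSVM on the X / Y matrices built above.
CvSVMParams params;
params.svm_type    = CvSVM::C_SVC;
params.kernel_type = CvSVM::LINEAR;
params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER, 1000, 1e-6);

CvSVM SVMmodel;
SVMmodel.train(X, Y, cv::Mat(), cv::Mat(), params);

// At prediction time each sample must be a single row of width feature_size,
// exactly like the training rows:
// float label = SVMmodel.predict(testFeatureRow); // testFeatureRow: 1 x feature_size, CV_32F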
I investigated and stripped down my previous question (Is there a way to avoid conversion from YUV to BGR?). I want to overlay a few images (in YUV format) on a resulting, bigger image (think of it as a canvas) and send it onward via a network library (OPAL) without converting it to BGR.
Here is the code:
Mat tYUV;
Mat tClonedYUV;
Mat tBGR;
Mat tMergedFrame;
int tMergedFrameWidth = 1000;
int tMergedFrameHeight = 800;
int tMergedFrameHalfWidth = tMergedFrameWidth / 2;
tYUV = Mat(tHeader->height * 1.5f, tHeader->width, CV_8UC1, OPAL_VIDEO_FRAME_DATA_PTR(tHeader));
tClonedYUV = tYUV.clone();
tMergedFrame = Mat(Size(tMergedFrameWidth, tMergedFrameHeight), tYUV.type(), cv::Scalar(0, 0, 0));
tYUV.copyTo(tMergedFrame(cv::Rect(0, 0, tYUV.cols > tMergedFrameWidth ? tMergedFrameWidth : tYUV.cols, tYUV.rows > tMergedFrameHeight ? tMergedFrameHeight : tYUV.rows)));
tClonedYUV.copyTo(tMergedFrame(cv::Rect(tMergedFrameHalfWidth, 0, tYUV.cols > tMergedFrameHalfWidth ? tMergedFrameHalfWidth : tYUV.cols, tYUV.rows > tMergedFrameHeight ? tMergedFrameHeight : tYUV.rows)));
namedWindow("merged frame", 1);
imshow("merged frame", tMergedFrame);
waitKey(10);
The result of the above code looks like this:
I guess the image is not correctly interpreted, so the pictures stay black and white (the Y component) and below them we can see the U and V components. There are images which describe the problem well (http://en.wikipedia.org/wiki/YUV):
and: http://upload.wikimedia.org/wikipedia/en/0/0d/Yuv420.svg
Is there a way for these values to be correctly read? I guess I should not copy the whole images (their Y, U, V components) straight to the calculated positions. The U and V components should be below them and in the proper order, am I right?
First, there are several YUV formats, so you need to be clear about which one you are using.
According to your image, it seems your YUV format is Y'UV420p.
Regardless, it is a lot simpler to convert to BGR, work there, and then convert back.
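A hedged sketch of that route, reusing the question's tYUV buffer and assuming it is Y'UV420p (I420); note the BGR-to-I420 constant may require a newer OpenCV build than the question's:
// Sketch only: tYUV is the single-channel I420 frame from the question.
cv::Mat tBGR, tBackToYUV;
cv::cvtColor(tYUV, tBGR, cv::COLOR_YUV2BGR_I420);        // unpack to BGR
// ... overlay / compose the frames in BGR here ...
cv::cvtColor(tBGR, tBackToYUV, cv::COLOR_BGR2YUV_I420);  // repack to I420 before sending via OPAL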
If that is not an option, you pretty much have to manage the ROIs yourself. YUV is commonly a planar format where the channels are not (completely) interleaved, and some planes have different sizes because of chroma subsampling. If you do not use the built-in color conversions, then you will have to know the exact YUV format and manage the pixel-copying ROIs yourself.
With a YUV image, the CV_8UC* format specifier does not mean much beyond the actual memory requirements. It certainly does not specify the pixel/channel muxing.
For example, if you wanted to use only the Y component, then Y is often the first plane in the image, so the first "half" of the whole image can just be treated as a monochrome 8UC1 image. In this case using ROIs is easy, as in the sketch below.
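A minimal sketch of that ROI approach, again assuming the question's I420 layout (the Y plane occupies the first height rows of the height*3/2-row buffer); variable names are taken from the question:
// Sketch only: treat the top `height` rows of the I420 buffer as the Y plane.
cv::Mat tYUV(tHeader->height * 3 / 2, tHeader->width, CV_8UC1,
             OPAL_VIDEO_FRAME_DATA_PTR(tHeader));
cv::Mat tY = tYUV(cv::Rect(0, 0, tHeader->width, tHeader->height)); // monochrome 8UC1 view
cv::imshow("Y plane", tY);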
My application problem is this: I have around 500 images, but 1 or 2 pairs of them may be completely identical, meaning the files' checksums are the same. My eventual goal is to find out which are the repeated image pairs.
However, I now have to apply a compression algorithm to these 500 images, because the uncompressed images occupy too much disk space. The compression breaks the checksums, so I cannot use the checksums of the compressed image files to find the repeated pairs.
Fortunately, my compression algorithm is lossless, which means the restored uncompressed images can still be hashed somehow. But I want to do this in memory, without much disk write access. So my problem is: how do I efficiently find repeated images among a large number of image files, in memory?
I use OpenCV often, but any answer is fine as long as it is efficient and does not save files to disk. Python/Bash code is also acceptable; C/C++ and OpenCV are preferred.
I can think of using OpenCV's Mat with std::hash, but std::hash won't work directly; I would have to specialize std::hash<cv::Mat> myself, and I don't know how to do that properly yet.
Of course I can do this:
For each 2 images in all my images:
    if ((cv::Mat)img1 == (cv::Mat)img2):
        print img1 and img2 are identical
But this is extremely inefficient, basically an n^4 algorithm.
Note that my problem is not an image-similarity problem; it is a hashing problem in memory.
The idea of a hash algorithm for an image:
Reduce the size of the original image (cvResize()), so that only the important structure remains in the picture (getting rid of the high frequencies).
Reduce the image to 8x8, so the total number of pixels is 64 and the hash fits all kinds of images, regardless of their size and aspect ratio.
Remove the color: convert the image obtained in the previous step to grayscale (cvCvtColor()). The hash thus shrinks from 192 values (64 values for each of the three channels - red, green, and blue) to 64 brightness values.
Find the average brightness of the resulting image (cvAvg()).
Binarize the image (cvThreshold()): keep only those pixels that are brighter than the average (count them as 1 and all others as 0).
Build the hash: pack the 64 ones and zeros into a single 64-bit hash value.
Next, if you need to compare two images, just build a hash for each of them and count the number of differing bits (the Hamming distance).
Hamming distance: the number of positions at which two binary words of the same length differ.
A distance of zero means the images are likely the same, and larger values characterize how much they differ from each other.
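A minimal sketch of this recipe using the C++ cv:: API (rather than the old cv* C calls named above); the function names are illustrative, and a 3-channel BGR input is assumed:
#include <opencv2/opencv.hpp>
#include <bitset>
#include <cstdint>

uint64_t averageHash(const cv::Mat& img) {
    cv::Mat small, gray;
    cv::resize(img, small, cv::Size(8, 8));          // steps 1-2: shrink to 8x8
    cv::cvtColor(small, gray, cv::COLOR_BGR2GRAY);   // step 3: drop color
    double mean = cv::mean(gray)[0];                 // step 4: average brightness
    uint64_t hash = 0;
    for (int y = 0; y < 8; ++y)                      // steps 5-6: threshold and pack bits
        for (int x = 0; x < 8; ++x)
            hash = (hash << 1) | (gray.at<uchar>(y, x) > mean ? 1 : 0);
    return hash;
}

int hammingDistance(uint64_t a, uint64_t b) {
    return (int)std::bitset<64>(a ^ b).count();      // number of differing bits
}
Two images would then be considered near-duplicates when hammingDistance(averageHash(a), averageHash(b)) is small (zero for likely-identical content).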
If it is exact copies of the images you want, you can start by comparing pixel 1,1 of all images and grouping them by the value at pixel 1,1. After that you know the groups (hopefully quite many groups?) and then, within each group, compare pixel 1,2. You continue pixel by pixel until you get a hundred groups or so. Then you just compare them in full, within each group. That way you still run your slow n^4 algorithm, but each time on groups of five images instead of on 500 images at a time. I am assuming you can read your images pixel by pixel; I know that's possible if they are in .fits with the pyfits module, but I guess alternatives exist for pretty much any image format?
So the idea behind this is that if pixel 1,1 is different, the entire image is different. This way you can make a list with the values of maybe the first 3 pixels or so. If there is enough variability in that list, you can do your full 1:1 image checks on much smaller groups of images instead of on 500 images at a time.
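A hedged sketch of that grouping step, assuming the images are already loaded as same-sized 8-bit single-channel cv::Mat objects (the names here are illustrative, not from the question):
#include <opencv2/opencv.hpp>
#include <map>
#include <vector>

// Group image indices by the value of pixel (0,0); only images sharing that
// value can possibly be identical, so the full comparisons stay small.
std::map<uchar, std::vector<size_t>> groupByFirstPixel(const std::vector<cv::Mat>& imgs) {
    std::map<uchar, std::vector<size_t>> groups;
    for (size_t i = 0; i < imgs.size(); ++i)
        groups[imgs[i].at<uchar>(0, 0)].push_back(i);
    return groups;
}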
Does this sound like it should do what you want?
OK, I worked out a solution myself; better solutions are welcome. I paste the code here.
#include "opencv2/core/core.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>
#include <cstring>
#include <functional>
#include <openssl/md5.h>
using namespace std;
using namespace cv;
static void help()
{
}
char *str2md5(const char *str, int length) {
    int n;
    MD5_CTX c;
    unsigned char digest[16];
    char *out = (char*)malloc(33);
    MD5_Init(&c);
    while (length > 0) {
        if (length > 512) {
            MD5_Update(&c, str, 512);
        } else {
            MD5_Update(&c, str, length);
        }
        length -= 512;
        str += 512;
    }
    MD5_Final(digest, &c);
    for (n = 0; n < 16; ++n) {
        // 3 = two hex digits plus the terminating NUL written for each byte
        snprintf(&(out[n*2]), 3, "%02x", (unsigned int)digest[n]);
    }
    return out;
}
int main(int argc, const char** argv)
{
    help();
    if (argc != 2)
    {
        return EXIT_FAILURE ;
    }
    string inputfile = argv[1] ;
    Mat src = imread (inputfile, -1) ;
    if (src.empty())
    {
        return EXIT_FAILURE ;
    }
    cout << str2md5((char*)src.data, (int)src.step[0] * src.rows) << " " << inputfile << endl ;
    return 0;
}
You'll have to install OpenSSL (libssl-dev) on your machine to compile this code. It loads the image into memory and calculates its MD5 value. To find the repeated image pairs, just write a simple bash/python script that runs the compiled program over the files and looks for duplicate MD5 values. Note that this MD5 check code won't work with huge image files.
When I use CImg to load a .BMP, how can I know whether it is a gray-scale or color image?
I have tried as follows, but failed:
cimg_library::CImg<unsigned char> img("lena_gray.bmp");
const int spectrum = img.spectrum();
img.save("lenaNew.bmp");
Contrary to my expectation, no matter what kind of .BMP I load, spectrum is always 3. As a result, when I load a gray-scale image and save it, the result is three times bigger than the original.
I just want to save the image exactly as it was loaded. How do I save it as gray-scale?
I guess the BMP format always stores images as RGB-coded data, so reading a BMP will always result in a color image.
If you know your image is scalar, all channels will be the same, so you can discard two of them (here keeping the first one).
img.channel(0);
If you want to check that it is a scalar image, you can test the equality between channels, as
// assuming: using namespace cimg_library;
const CImg<unsigned char> R = img.get_shared_channel(0),
                          G = img.get_shared_channel(1),
                          B = img.get_shared_channel(2);
if (R==G && R==B) {
    // ... your image is scalar (gray-scale).
} else {
    // ... your image is in color.
}
I am working on a school project with OpenCV. A major part of the program will be a comparison of histograms: there will be a database of histograms, and new histograms created from a live video feed will be compared to the histograms in the database. Right now I am just trying to get the histograms created correctly from the video feed. My problem is that the program crashes or slows down dramatically at random intervals. So my question is: how do I prevent the program from crashing or slowing down? OpenCV has always been somewhat flaky for me, so I'm not sure whether it is an issue with my code or just the nature of OpenCV. If it is my code, I think the issue might have something to do with the frame rates (a guess/gut feeling). I am using "cvWaitKey" to "pace" the loading of frames, but the "Learning OpenCV" book has this to say about "cvWaitKey":
c = cvWaitKey(33);
if( c == 27 ) break;
Once we have displayed the frame, we then wait for 33 ms. If the user hits a key, then c will be set to the ASCII value of that key; if not, then it will be set to –1. If the user hits the Esc key (ASCII 27), then we will exit the read loop. Otherwise, 33 ms will pass and we will just execute the loop again.
It is worth noting that, in this simple example, we are not explicitly controlling the speed of the video in any intelligent way. We are relying solely on the timer in cvWaitKey() to pace the loading of frames. In a more sophisticated application it would be wise to read the actual frame rate from the CvCapture structure (from the AVI) and behave accordingly!
You will see in my code below (modified from here) that my loop waits 10 ms before starting the next iteration. Often the program will run with no issues at all, but sometimes it will crash less than a minute in, or five minutes in; there really is no pattern that I can detect. Any suggestions on how this crash (or slowdown) can be prevented would be welcome. I should also add that I am using OpenCV 1.1 (I can never get OpenCV 2.0 to work right), I am using Visual Studio 2008, and I create an .MSI installer package every time I modify my code, that is, I do not debug in Visual Studio. Dependencies are cv110.dll, cxcore110.dll, and highgui110.dll. My code is below:
// SLC (Histogram).cpp : Defines the entry point for the console application.
#include "stdafx.h"
#include <cxcore.h>
#include <cv.h>
#include <cvaux.h>
#include <highgui.h>
#include <stdio.h>
#include <sstream>
#include <iostream>

using namespace std;

int main(){
    CvCapture* capture = cvCaptureFromCAM(0);
    if(!cvQueryFrame(capture)){
        cout<<"Video capture failed, please check the camera."<<endl;
    }
    else{
        cout<<"Video camera capture successful!"<<endl;
    }
    CvSize sz = cvGetSize(cvQueryFrame(capture));
    IplImage* image = cvCreateImage(sz, 8, 3);
    IplImage* imgHistogram = 0;
    IplImage* gray = 0;
    CvHistogram* hist;
    cvNamedWindow("Image Source",1);
    cvNamedWindow("Histogram",1);
    for(;;){
        image = cvQueryFrame(capture);
        //Size of the histogram -1D histogram
        int bins = 256;
        int hsize[] = {bins};
        //Max and min value of the histogram
        float max_value = 0, min_value = 0;
        //Value and normalized value
        float value;
        int normalized;
        //Ranges - grayscale 0 to 256
        float xranges[] = {0, 256};
        float* ranges[] = {xranges};
        //Create an 8 bit single channel image to hold a grayscale version of the original picture
        gray = cvCreateImage(cvGetSize(image), 8, 1);
        cvCvtColor(image, gray, CV_BGR2GRAY);
        //Planes to obtain the histogram, in this case just one
        IplImage* planes[] = {gray};
        //Get the histogram and some info about it
        hist = cvCreateHist(1, hsize, CV_HIST_ARRAY, ranges,1);
        cvCalcHist(planes, hist, 0, NULL);
        cvGetMinMaxHistValue(hist, &min_value, &max_value);
        printf("Minimum Histogram Value: %f, Maximum Histogram Value: %f\n", min_value, max_value);
        //Create an 8 bits single channel image to hold the histogram and paint it white
        imgHistogram = cvCreateImage(cvSize(bins, 50),8,3);
        cvRectangle(imgHistogram, cvPoint(0,0), cvPoint(256,50), CV_RGB(255,255,255),-1);
        //Draw the histogram
        for(int i=0; i < bins; i++){
            value = cvQueryHistValue_1D(hist, i);
            normalized = cvRound(value*50/max_value);
            cvLine(imgHistogram,cvPoint(i,50), cvPoint(i,50-normalized), CV_RGB(0,0,0));
        }
        cvFlip(image, NULL, 1);
        cvShowImage("Image Source", image);
        cvShowImage("Histogram", imgHistogram);
        //Page 19 paragraph 3 of "Learning OpenCV" tells us why we DO NOT use "cvReleaseImage(&image)" in this section
        cvReleaseImage(&imgHistogram);
        cvReleaseImage(&gray);
        cvReleaseHist(&hist);
        char c = cvWaitKey(10);
        //if ASCII key 27 (esc) is pressed then loop breaks
        if(c==27) break;
    }
    cvReleaseImage(&image);
    cvReleaseCapture(&capture);
    cvDestroyAllWindows();
}
Only a few things I can see or recommend:
Considering the build, make sure you're building in Release. Also, make sure the build of OpenCV you're using was built with OpenMP enabled; it makes an enormous difference.
Try moving your allocations outside the loop. Every iteration you're re-creating gray and the other images when they could be re-used, as in the sketch below.
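A hedged sketch of that "allocate once, reuse every frame" advice, reusing the question's variable names; this is only the changed fragment, not the full program:
CvCapture* capture = cvCaptureFromCAM(0);
int bins = 256;
int hsize[] = {bins};
float xranges[] = {0, 256};
float* ranges[] = {xranges};
CvSize sz = cvGetSize(cvQueryFrame(capture));
IplImage* gray = cvCreateImage(sz, 8, 1);                        // created once
IplImage* imgHistogram = cvCreateImage(cvSize(bins, 50), 8, 3);  // created once
CvHistogram* hist = cvCreateHist(1, hsize, CV_HIST_ARRAY, ranges, 1);
for (;;) {
    IplImage* image = cvQueryFrame(capture); // owned by the capture, never released here
    cvCvtColor(image, gray, CV_BGR2GRAY);
    cvCalcHist(&gray, hist, 0, NULL);
    // ... draw into imgHistogram and show both windows as before ...
    if (cvWaitKey(10) == 27) break;
}
cvReleaseHist(&hist);             // release once, after the loop
cvReleaseImage(&imgHistogram);
cvReleaseImage(&gray);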
The other thing is your style, which makes it difficult to give good recommendations easily. It's poor style to pre-declare a bunch of variables; this is C-style. Declare your variables just before their use, and the code will be easier to read.
Update: I found the issue; it was actually my hardware (well, the driver I think). I was using a PS3 Eye because of the amazing frame rates, but for some reason OpenCV does not like the PS3 Eye all the time. Sometimes it works great and other times not so great. I have verified this on three computers, all of which run my code fine with a standard webcam but randomly lock up when the PS3 Eye is used. Still, thank you for your suggestions GMan!