OpenCV: obtain certain pixel RGB values based on a mask - C++

My title may not be clear enough, so please read the following description carefully. Thanks in advance.
I have an RGB image and a binary mask image:
Mat img = imread("test.jpg");
Mat mask = Mat::zeros(img.rows, img.cols, CV_8U);
Set some entries of the mask to one, and assume the number of ones is N. The nonzero coordinates are then known, and based on these coordinates we can obtain the corresponding pixel RGB values of the original image. I know this can be accomplished by the following code:
Mat colors = Mat::zeros(N, 3, CV_8U);
int counter = 0;
for (int i = 0; i < mask.rows; i++)
{
for (int j = 0; j < mask.cols; j++)
{
if (mask.at<uchar>(i, j) == 1)
{
colors.at<uchar>(counter, 0) = img.at<Vec3b>(i, j)[0];
colors.at<uchar>(counter, 1) = img.at<Vec3b>(i, j)[1];
colors.at<uchar>(counter, 2) = img.at<Vec3b>(i, j)[2];
counter++;
}
}
}
And the coords will be as follows: [image not included]
However, this two-level for loop costs too much time. I was wondering if there is a faster method to obtain colors. I hope you can understand what I am trying to convey.
PS: If I could use Python, this could be done in a single statement:
colors = img[mask == 1]

The .at() method is the slowest way to access Mat values in C++. Fastest is to use pointers, but best practice is an iterator. See the OpenCV tutorial on scanning images.
Just a note, even though Python's syntax is nice for something like this, it still has to loop through all of the elements at the end of the day---and since it has some overhead before this, it's de-facto slower than C++ loops with pointers. You necessarily need to loop through all the elements regardless of your library, you're doing comparisons with the mask for every element.

If you are open to using another open source C++ library, try Armadillo. You can do all linear algebra operations with it, and you can also reduce the above code to one line (similar to your Python code snippet).
Or
Try the findNonZero() function to find all coordinates in the image that contain non-zero values. Check this: https://stackoverflow.com/a/19244484/7514664

Compile with optimization enabled, try profiling this version and tell us if it is faster:
vector<Vec3b> colors;
if (img.isContinuous() && mask.isContinuous()) {
    // Both buffers are contiguous: walk them as flat arrays.
    auto pimg = img.ptr<Vec3b>();
    for (auto pmask = mask.datastart; pmask < mask.dataend; ++pmask, ++pimg) {
        if (*pmask)
            colors.emplace_back(*pimg);
    }
}
else {
    // Rows may be padded: fetch a pointer to each row separately.
    for (int r = 0; r < img.rows; ++r) {
        auto prowimg = img.ptr<Vec3b>(r);
        auto prowmask = mask.ptr(r); // row of the mask, not of the image
        for (int c = 0; c < img.cols; ++c) {
            if (prowmask[c])
                colors.emplace_back(prowimg[c]);
        }
    }
}
If you know the size of colors, reserve the space for it beforehand.
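If the size is not known in advance, one option is a cheap first pass that counts the non-zero mask entries (in OpenCV, cv::countNonZero(mask) does this). The sketch below illustrates the count-then-reserve idea on plain std::vector buffers so that it compiles without OpenCV; the Pixel alias and the gatherMasked name are hypothetical stand-ins, not part of any library.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Pixel = std::array<std::uint8_t, 3>;  // hypothetical stand-in for cv::Vec3b (B, G, R)

// Collect the pixels whose mask entry is non-zero, in scan order.
std::vector<Pixel> gatherMasked(const std::vector<Pixel>& pixels,
                                const std::vector<std::uint8_t>& mask)
{
    assert(pixels.size() == mask.size());

    // First pass: count the selected pixels so the output is allocated once.
    const auto n = std::count_if(mask.begin(), mask.end(),
                                 [](std::uint8_t m) { return m != 0; });

    std::vector<Pixel> colors;
    colors.reserve(static_cast<std::size_t>(n));

    // Second pass: copy the selected pixels.
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (mask[i]) colors.push_back(pixels[i]);
    }
    return colors;
}
```

Whether the extra counting pass pays off depends on the image size and how many pixels are selected, so it is worth profiling both variants.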

Related

Fast method to access random image pixels and at most once

I'm learning OpenCV (C++) and, as a simple exercise, I designed a simple effect which makes some of the image pixels black or white. I want each pixel to be edited at most once, so I added the addresses of all pixels to a vector. But this made my code very slow, especially for large images or high amounts of the effect. Here is my code:
void effect1(Mat& img, float amount) // 100 ≥ amount ≥ 0
{
vector<uchar*> addresses;
int channels = img.channels();
uchar* lastAddress = img.ptr<uchar>(0) + img.total() * channels;
for (uchar* i = img.ptr<uchar>(0); i < lastAddress; i += channels) addresses.push_back(i); //Fast Enough
size_t count = img.total() * amount / 100 / 2;
for (size_t i = 0; i < count; i++)
{
size_t addressIndex = xor128() % addresses.size(); //Fast Enough, xor128() is a fast random number generator
for (size_t j = 0; j < channels; j++)
{
*(addresses[addressIndex] + j) = 255;
} //Fast Enough
addresses.erase(addresses.begin() + addressIndex); // MAKES CODE EXTREMELY SLOW
}
for (size_t i = 0; i < count; i++)
{
size_t addressIndex = xor128() % addresses.size(); //Fast Enough, xor128() is a fast random number generator
for (size_t j = 0; j < channels; j++)
{
*(addresses[addressIndex] + j) = 0;
} //Fast Enough
addresses.erase(addresses.begin() + addressIndex); // MAKES CODE EXTREMELY SLOW
}
}
I think rearranging vector items after erasing an item is what makes my code slow (if I remove addresses.erase, code will run fast).
Is there any fast method to select each random item from a collection (or a number range) at most once?
Also: I'm pretty sure such effect already exists. Does anyone know the name of it?
This answer assumes you have a uniform random bit generator, since std::shuffle requires one. I don't know how xor128 works, so I'll use the functionality of the <random> library.
If we have a population of N items, and we want to select groups of size j and k randomly from that population with no overlap, we can write down the index of each item on a card, shuffle the deck, draw j cards, and then draw k cards. Everything left over is discarded. We can achieve this with the <random> library. Answer pending on how to incorporate a custom PRNG like the one you implemented with xor128.
This assumes that random_device won't work on your system (some implementations always return the same sequence), so we seed the random generator with the current time, like the good old-fashioned srand our mother used to make.
Untested since I don't know how to use OpenCV. Anyone with a lick of experience with that please edit as appropriate.
#include <algorithm> // for std::shuffle, std::min
#include <cstddef>   // for std::size_t
#include <ctime>     // for std::time
#include <numeric>   // for std::iota
#include <random>
#include <vector>
#include <opencv2/core.hpp>
void effect1(cv::Mat& img, float amount, std::mt19937& g) // 0.0 ≤ amount ≤ 1.0
{
    std::vector<std::size_t> ind(img.total());
    std::iota(ind.begin(), ind.end(), std::size_t{0}); // fills with 0, 1, 2, ...
    std::shuffle(ind.begin(), ind.end(), g);
    // pixels per colour, clamped so we never run past the end of ind
    const std::size_t count = std::min(static_cast<std::size_t>(img.total() * amount), ind.size());
    const std::size_t last = std::min(2 * count, ind.size());
    // assumes an 8-bit, 3-channel image (CV_8UC3); adapt for other types
    const cv::Vec3b white(255, 255, 255);
    const cv::Vec3b black(0, 0, 0);
    const std::size_t cols = static_cast<std::size_t>(img.cols);
    for (std::size_t i = 0; i < count; ++i)
    {
        img.at<cv::Vec3b>(static_cast<int>(ind[i] / cols), static_cast<int>(ind[i] % cols)) = white;
    }
    for (std::size_t i = count; i < last; ++i)
    {
        img.at<cv::Vec3b>(static_cast<int>(ind[i] / cols), static_cast<int>(ind[i] % cols)) = black;
    }
}
int main()
{
    std::mt19937 g(static_cast<unsigned>(std::time(nullptr))); // you normally see this seeded with random_device
                                                               // but that's broken on some implementations
                                                               // adjust as necessary for your needs
    cv::Mat mat = ... // make your cv objects
    effect1(mat, 0.1f, g);
    // display it here
}
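If only a small fraction of the pixels is needed, shuffling the whole deck does more work than necessary. A partial Fisher-Yates shuffle, sketched below, draws just the first k cards by swapping each drawn index into place, which also avoids the costly vector::erase from the question. This is a minimal sketch using only the standard library; the pickDistinct name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Pick k distinct indices from [0, n) with a partial Fisher-Yates shuffle.
std::vector<std::size_t> pickDistinct(std::size_t n, std::size_t k, std::mt19937& rng)
{
    assert(k <= n);
    std::vector<std::size_t> pool(n);
    std::iota(pool.begin(), pool.end(), std::size_t{0});  // 0, 1, 2, ...

    for (std::size_t i = 0; i < k; ++i) {
        // Choose uniformly among the not-yet-picked entries pool[i..n-1].
        std::uniform_int_distribution<std::size_t> dist(i, n - 1);
        std::swap(pool[i], pool[dist(rng)]);  // O(1): nothing is shifted
    }
    pool.resize(k);  // the first k entries are the selection
    return pool;
}
```

The first half of the returned indices could be painted white and the second half black. The swap replaces erase, so each draw costs constant time instead of time proportional to the remaining vector length.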
Another approach
Instead of shuffling indices and drawing cards from a deck, assume each pixel has a random probability of switching to white, switching to black, or staying the same. If your amount is 0.4, then select a random number between 0.0 and 1.0: any result between 0.0 and 0.4 flips the pixel black, any result between 0.4 and 0.8 flips it white, and otherwise it stays the same.
General algorithm:
given probability of flipping -> f
for each pixel in image -> p:
    get next random float([0.0, 1.0)) -> r
    if r < f
        then p <- BLACK
    else if r < 2*f
        then p <- WHITE
You won't get the same number of white/black pixels each time, but that's randomness! We're generating a random number for each pixel anyway for the shuffling algorithm. This has the same complexity unless I'm mistaken.
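The general algorithm above could be sketched in C++ as follows. To keep it compilable without OpenCV, it works on a flat byte buffer with a given number of channels per pixel, which is an assumption standing in for a continuous 8-bit cv::Mat; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Flip each pixel to black with probability f, to white with probability f,
// and leave it unchanged otherwise. `channels` bytes make up one pixel.
void saltAndPepperBernoulli(std::vector<std::uint8_t>& data, std::size_t channels,
                            double f, std::mt19937& rng)
{
    assert(channels > 0 && data.size() % channels == 0);
    assert(f >= 0.0 && f <= 0.5);  // 2*f must not exceed 1

    std::uniform_real_distribution<double> dist(0.0, 1.0);  // r in [0, 1)
    for (std::size_t p = 0; p < data.size(); p += channels) {
        const double r = dist(rng);
        if (r < f) {
            for (std::size_t c = 0; c < channels; ++c) data[p + c] = 0;    // black
        } else if (r < 2 * f) {
            for (std::size_t c = 0; c < channels; ++c) data[p + c] = 255;  // white
        }
    }
}
```

No index vector is allocated at all here, so memory use stays constant, at the price of the black and white counts varying from run to run.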
Also: I'm pretty sure such effect already exists. Does anyone know the name of it?
The effect you're describing is called salt and pepper noise. There is no direct implementation in OpenCV that I know of though.
I think rearranging vector items after erasing an item is what makes
my code slow (if I remove addresses.erase, code will run fast).
I'm not sure why you add your pixels to a vector in your code; it would make much more sense, and also perform much better, to work on the Mat object and change the pixel values directly. You could use OpenCV's built-in Mat::at() function to set the pixel values to either 0 or 255.
I would create a single loop which generates random indexes within the range of your image dimensions and manipulates the image pixels directly. That way your noise addition is O(n). You could also just search for "OpenCV" and "salt and pepper noise"; I am sure there are already a lot of well-performing implementations.
I am also posting a simpler version of the code:
void saltAndPepper(Mat& img, float amount)
{
vector<size_t> pixels(img.total()); // one index per pixel
uchar channels = img.channels();
iota(pixels.begin(), pixels.end(), 0); // Fill vector with 0, 1, 2, ...
shuffle(pixels.begin(), pixels.end(), mt19937(time(nullptr))); // Shuffle the vector
size_t count = img.total() * amount / 100 / 2;
for (size_t i = 0; i < count; i++)
{
for (size_t j = 0; j < channels; j++) // Set all pixel channels (e.g. Grayscale with 1 channel or BGR with 3 channels) to 255
{
*(img.ptr<uchar>(0) + (pixels[i] * channels) + j) = 255;
}
}
for (size_t i = count; i < count*2; i++)
{
for (size_t j = 0; j < channels; j++) // Set all pixel channels (e.g. Grayscale with 1 channel or BGR with 3 channels) to 0
{
*(img.ptr<uchar>(0) + (pixels[i] * channels) + j) = 0;
}
}
}

OpenCV not recognizing Mat size

I'm trying to draw an image using OpenCV, defining a 400x400 Mat:
plot2 = cv::Mat(400,400, CV_8U, 255);
But when I try to draw points, something strange happens. The column coordinate only seems to accept the first 100 values. That is, if I draw the point (50,100), it does not appear a quarter (100/400) of the way across the columns, but at the end. Somehow, 400 columns have turned into 100.
For example, when running this:
for (int j = 0; j < 95; ++j){
plot2.at<int>(20, j) = 0;
}
cv::imshow("segunda pared", plot2);
Shows this (the underlined part is the part corresponding to the code above):
A line that goes to 95 almost occupies all of the 400 points when it should only occupy 95/400th of the screen.
What am I doing wrong?
When you defined your cv::Mat, you stated clearly that its type is CV_8U:
plot2 = cv::Mat(400,400, CV_8U, 255);
But when you write to it, you tell at() that the element type is int, which is usually a signed 32-bit type, not an unsigned 8-bit one. So the solution is:
for (int j = 0; j < 95; ++j){
plot2.at<uchar>(20, j) = 0;
}
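The arithmetic behind the symptom can be sketched as follows. It assumes each row of the 400-column CV_8U image occupies 400 bytes with no padding and that int is 4 bytes wide; both helper names are hypothetical and only illustrate how an element index turns into a byte offset.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of element (row, col) when a row starts every `step` bytes and
// each element is assumed to occupy `elemSize` bytes.
constexpr std::size_t byteOffset(std::size_t row, std::size_t col,
                                 std::size_t step, std::size_t elemSize)
{
    return row * step + col * elemSize;
}

// Index of the last byte in a row touched by writing elements 0..count-1.
std::size_t lastByteInRow(std::size_t count, std::size_t elemSize)
{
    assert(count > 0);
    return (count - 1) * elemSize + (elemSize - 1);
}
```

With a 4-byte element, writing 95 elements reaches byte 379 of the 400-byte row, which is why the line nearly spans the window; with a 1-byte element it stops at byte 94 as intended.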
Important note: Be aware that OpenCV uses the standard C++ types, not the fixed-width ones. So there is no need to use fixed-size types like uint16_t or similar, because when you compile OpenCV and your code on another platform, both will change together.
BTW, one good way to iterate through your cv::Mat is:
for (int row = 0; row < my_mat.rows; ++row){
    auto row_ptr = my_mat.ptr<uchar>(row);
    for (int col = 0; col < my_mat.cols; ++col){
        // do whatever you want with row_ptr[col] (read/write)
    }
}

Opencv Mat vector assignment to a row of a matrix, fastest way?

What is the fastest way of assigning a vector to a matrix row in a loop? I want to fill a data matrix along its rows with vectors. These vectors are computed in a loop, which runs until all entries of the data matrix have been filled.
Currently I am using the cv::Mat::at<>() method to access the elements of the matrix and fill them with the vector; however, this process seems quite slow. I have tried another way using X.row(index) = data_vector. It runs fast, but it fills my matrix X with garbage values, and I cannot understand why.
I read that there is another way, using pointers (the fastest way), but I have not been able to understand it. Can somebody explain how to use them, or suggest other methods?
Here is a part of my code:
#define OFFSET 2
cv::Mat im = cv::imread("001.png", CV_LOAD_IMAGE_GRAYSCALE);
cv::Mat X = cv::Mat((im.rows - 2*OFFSET)*(im.cols - 2*OFFSET), 25, CV_64FC1); // Holds the training data. Data contains image patches
cv::Mat patch = cv::Mat(5, 5, im.type()); // Holds a cropped image patch
typedef cv::Vec<float, 25> Vec25f;
int ind = 0;
for (int row = 0; row < (im.rows - 2*OFFSET); row++){
for (int col = 0; col < (im.cols - 2*OFFSET); col++){
cv::Mat temp_patch = im(cv::Rect(col, row, 5, 5)); // crop an image patch (5x5) at each pixel
patch = temp_patch.clone(); // Needs to do this because temp_patch is not continuous in memory
patch.convertTo(patch, CV_64FC1);
Vec25f data_vector = patch.reshape(0, 1); // make it row vector (1X25).
for (int i = 0; i < 25; i++)
{
X.at<float>(ind, i) = data_vector[i]; // Currently I am using this way (quite slow).
}
//X_train.row(ind) = patch.reshape(0, 1); // Tried this but it assigns some garbage values to the data matrix!
ind += 1;
}
}
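To do it the regular OpenCV way, copy into the row with copyTo (RowMat must have the same size and type as the row):
RowMat.copyTo(ImageMat.row(RowIndex));
Note that a plain assignment such as
ImageMat.row(RowIndex) = RowMat.clone();
does not copy anything into ImageMat. It only reassigns the temporary header returned by row(), which is likely why your X.row(index) attempt left garbage in the matrix.
Haven't tested for speed.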
Just a couple of edits in your code
double * xBuffer = X.ptr<double>(0);
for (int row = 0; row < (im.rows - 2*OFFSET); row++){
for (int col = 0; col < (im.cols - 2*OFFSET); col++){
cv::Mat temp_patch = im(cv::Rect(col, row, 5, 5)); // crop an image patch (5x5) at each pixel
patch = temp_patch.clone(); // Needs to do this because temp_patch is not continuous in memory
patch.convertTo(patch, CV_64FC1);
memcpy(xBuffer, patch.data, 25*sizeof(double));
xBuffer += 25;
}
}
Also, you don't seem to do any computation on patch; you just extract gray-level values. So you can create X with the same type as im and convert it to double at the end. That way you could memcpy each row of your patch directly, the address in memory being unsigned char* buffer = im.ptr(row) + col.
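A minimal sketch of that row-by-row memcpy is shown below. It assumes an 8-bit single-channel image stored as a raw buffer with `step` bytes per row, standing in for what im.ptr(row) would return in OpenCV; the copyPatch name and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy the patchSize x patchSize window whose top-left corner is (row, col)
// into `out` as one flat row of patchSize * patchSize bytes.
void copyPatch(const std::uint8_t* image, std::size_t step,
               std::size_t row, std::size_t col,
               std::size_t patchSize, std::uint8_t* out)
{
    assert(image != nullptr && out != nullptr);
    for (std::size_t r = 0; r < patchSize; ++r) {
        // Start of the patch's r-th row, like im.ptr(row + r) + col.
        const std::uint8_t* src = image + (row + r) * step + col;
        std::memcpy(out + r * patchSize, src, patchSize);
    }
}
```

Each patch then costs five small copies and no clone or convertTo call; the single conversion to double can happen once on the finished matrix.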
According to the docs:
if you need to process a whole row of matrix, the most efficient way is to get the pointer to the row first, and then just use plain C operator []:
// compute sum of positive matrix elements
// (assuming that M is double-precision matrix)
double sum=0;
for(int i = 0; i < M.rows; i++)
{
const double* Mi = M.ptr<double>(i);
for(int j = 0; j < M.cols; j++)
sum += std::max(Mi[j], 0.);
}

OpenCV Foreground Detection slow

I am trying to implement the codebook foreground detection algorithm outlined in the book Learning OpenCV.
The algorithm only describes a codebook-based approach for each pixel of the picture. So I took the simplest approach that came to mind: an array of codebooks, one for each pixel, much like the matrix structure underlying IplImage. The length of the array is equal to the number of pixels in the image.
I wrote the following two loops to learn the background and segment the foreground. It uses my limited understanding of the matrix structure inside the src image, and uses pointer arithmetic to traverse the pixels.
void foreground(IplImage* src, IplImage* dst, codeBook* c, int* minMod, int* maxMod){
int height = src->height;
int width = src->width;
uchar* srcCurrent = (uchar*) src->imageData;
uchar* srcRowHead = srcCurrent;
int srcChannels = src->nChannels;
int srcRowWidth = src->widthStep;
uchar* dstCurrent = (uchar*) dst->imageData;
uchar* dstRowHead = dstCurrent;
// dst has 1 channel
int dstRowWidth = dst->widthStep;
for(int row = 0; row < height; row++){
for(int column = 0; column < width; column++){
(*dstCurrent) = find_foreground(srcCurrent, (*c), srcChannels, minMod, maxMod);
dstCurrent++;
c++;
srcCurrent += srcChannels;
}
srcCurrent = srcRowHead + srcRowWidth;
srcRowHead = srcCurrent;
dstCurrent = dstRowHead + dstRowWidth;
dstRowHead = dstCurrent;
}
}
void background(IplImage* src, codeBook* c, unsigned* learnBounds){
int height = src->height;
int width = src->width;
uchar* srcCurrent = (uchar*) src->imageData;
uchar* srcRowHead = srcCurrent;
int srcChannels = src->nChannels;
int srcRowWidth = src->widthStep;
for(int row = 0; row < height; row++){
for(int column = 0; column < width; column++){
update_codebook(srcCurrent, c[row*column], learnBounds, srcChannels);
srcCurrent += srcChannels;
}
srcCurrent = srcRowHead + srcRowWidth;
srcRowHead = srcCurrent;
}
}
The program works, but is very sluggish. Is there something obvious that is slowing it down? Or is it an inherent problem in the simple implementation? Is there anything I can do to speed it up? Each code book is sorted in no specific order, so it does take linear time to process each pixel. So double the background samples, and the program runs slower by 2 for each pixel, which is then magnified by the number of pixels. But as the implementation stands, I don't see any clear, logical way to sort the code element entries.
I am aware that there is an example implementation of the same algorithm in the opencv samples. However, that structure seems to be much more complex. I am looking more to understand the reasoning behind this method, I am aware that I can just modify the sample for real life applications.
Thanks
Operating on every pixel in an image is going to be slow, regardless of how you implement it.

FFT of an image

I have an assignment about fftw and I was trying to write a small program to create an fft of an image. I am using CImg to read and write images. But all I get is a dark image with a single white dot :(
I'm most likely doing this the wrong way and I would appreciate if someone could explain how this should be done. I don't need the code, I just need to know what is the right way to do this.
Here is my code:
CImg<double> input("test3.bmp");
CImg<double> image_fft(input, false);
unsigned int nx = input.dimx(), ny = input.dimy();
size_t align = sizeof(Complex);
array2<Complex> in (nx, ny, align);
fft2d Forward(-1, in);
for (int i = 0; i < input.dimx(); ++i) {
for (int j = 0; j < input.dimy(); ++j) {
in(i,j) = input(i,j);
}
}
Forward.fft(in);
for (int i = 0; i < input.dimx(); ++i) {
for (int j = 0; j < input.dimy(); ++j) {
image_fft(i,j,0) = image_fft(i,j,1) = image_fft(i,j,2) = std::abs(in(i,j));
}
}
image_fft.normalize(0, 255);
image_fft.save("test.bmp");
You need to take the log of the magnitude. The single white dot is the base value (0 Hz, DC, whatever you want to call it), so it will almost ALWAYS be by far the largest component of any image you take (since pixel values cannot be negative, the DC value will always be positive and large).
What you need to do is calculate the log (ln or any other logarithm) of the magnitude at each point, after you've converted from complex to magnitude/phase form (phasor notation, iirc?) and before you normalize it.
Please note that the other values are there; they are just REALLY small compared to the DC value. Taking the log compresses the dynamic range, so the large DC value no longer swamps the smaller ones and the other frequencies become visible.
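A minimal sketch of that step, using only the standard library: it takes the magnitudes (for example the std::abs(in(i,j)) values from the question), applies log(1 + m) so that a zero magnitude is still defined, and then rescales to 0..255. The logScale name is hypothetical, and the choice of log1p rather than a plain log is an assumption made for robustness.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Map FFT magnitudes to 0..255 after logarithmic compression.
std::vector<std::uint8_t> logScale(const std::vector<double>& magnitude)
{
    assert(!magnitude.empty());

    std::vector<double> logged(magnitude.size());
    std::transform(magnitude.begin(), magnitude.end(), logged.begin(),
                   [](double m) { return std::log1p(m); });  // log(1 + m)

    const double lo = *std::min_element(logged.begin(), logged.end());
    const double hi = *std::max_element(logged.begin(), logged.end());
    const double range = hi - lo;

    std::vector<std::uint8_t> out(logged.size(), 0);
    if (range > 0.0) {  // a constant input stays all zero
        for (std::size_t i = 0; i < logged.size(); ++i) {
            out[i] = static_cast<std::uint8_t>(
                std::lround(255.0 * (logged[i] - lo) / range));
        }
    }
    return out;
}
```

For magnitudes 0, 9 and 99999, a linear normalization would map the middle value to 0, whereas the log-scaled version maps it to about a fifth of the output range.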