I'm using the following algorithm to calculate the histogram of a YUV420sp image. It seems to work, but the result is not 100% accurate for a fully dark image. When the image is dark I would expect a high peak on the left side of the histogram showing that the image is too dark, but in that scenario the algorithm instead shows a flat line, no peak at all. In other, brighter scenarios the histogram seems to be accurate.
void calculateHistogram(const unsigned char* yuv420sp, const int yuvWidth, const int yuvHeight, const int histogramControlHeight, int* outHistogramData)
{
    const int BINS = 256;

    // Clear the output
    memset(outHistogramData, 0, BINS * sizeof(int));

    // Get YUV brightness values
    const int totalPixels = yuvWidth * yuvHeight;
    for (int index = 0; index < totalPixels; index++)
    {
        char brightness = yuv420sp[index];
        outHistogramData[brightness]++;
    }

    // Get the maximum brightness
    int maxBrightness = 0;
    for (int index = 0; index < BINS; index++)
    {
        if (outHistogramData[index] > maxBrightness)
        {
            maxBrightness = outHistogramData[index];
        }
    }

    // Normalize to fit the UI control height
    const int maxNormalized = BINS * histogramControlHeight / maxBrightness;
    for (int index = 0; index < BINS; index++)
    {
        outHistogramData[index] = (outHistogramData[index] * maxNormalized) >> 8;
    }
}
[SOLVED by galop1n] Though galop1n's implementation is much nicer, I'm updating this one with the corrections in case it is of use to anyone.
Changes:
1) Reading brightness values into an unsigned char instead of a char.
2) Moved the UI normalization division into the normalization loop.
void calculateHistogram(const unsigned char* yuv420sp, const int yuvWidth, const int yuvHeight, const int histogramCanvasHeight, int* outHistogramData)
{
    const int BINS = 256;

    // Clear the output
    memset(outHistogramData, 0, BINS * sizeof(int));

    // Get YUV brightness values
    const int totalPixels = yuvWidth * yuvHeight;
    for (int index = 0; index < totalPixels; index++)
    {
        unsigned char brightness = yuv420sp[index];
        outHistogramData[brightness]++;
    }

    // Get the maximum brightness
    int maxBrightness = 0;
    for (int index = 0; index < BINS; index++)
    {
        if (outHistogramData[index] > maxBrightness)
        {
            maxBrightness = outHistogramData[index];
        }
    }

    // Normalize to fit the UI control height
    for (int index = 0; index < BINS; index++)
    {
        outHistogramData[index] = outHistogramData[index] * histogramCanvasHeight / maxBrightness;
    }
}
There are at least two bugs in your implementation.
The indexing by brightness is broken: because the temporary is a (signed) char, any value above 127 becomes a negative index.
The final normalization depends on both the control height and the maximum pixel count in a bin, and the integer division loses precision, so the division cannot really be pulled out of the loop.
I also recommend using a std::array (needs C++11) to store the result instead of a raw pointer, as there is a risk that the caller does not allocate enough space for what the function will write.
#include <algorithm>
#include <array>

void calculateHistogram(const unsigned char* yuv420sp, const int yuvWidth, const int yuvHeight, const int histogramControlHeight, std::array<int, 256>& outHistogramData) {
    outHistogramData.fill(0);
    std::for_each(yuv420sp, yuv420sp + yuvWidth * yuvHeight, [&](int e) {
        outHistogramData[e]++;
    });
    int maxCountInBins = *std::max_element(begin(outHistogramData), end(outHistogramData));
    for (int& bin : outHistogramData)
        bin = bin * histogramControlHeight / maxCountInBins;
}
If maxBrightness (the largest bin count) is zero, your calculation of maxNormalized becomes a division by zero. I suspect this is where your problem is.
Without better understanding what normalization conditions you are trying to establish, I am not sure what alternative to suggest to you right now.
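If that is indeed the case, a simple guard before computing maxNormalized avoids the division; a minimal sketch, reusing the names from the code above:
// Sketch only: skip normalization when every bin is empty, so we never divide by zero.
if (maxBrightness == 0)
{
    return; // histogram stays all zeros
}
const int maxNormalized = BINS * histogramControlHeight / maxBrightness;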
I am learning C++ at the moment and currently I am experimenting with pointers and structures. In the following code, I copy vector A into a buffer of size 100 bytes. Then I copy vector B into the same buffer at an offset, so that the two vectors sit right next to each other in the buffer. Afterwards, I want to find the vectors in the buffer again and calculate the dot product between them.
#include <iostream>

const short SIZE = 5;

typedef struct vector {
    float vals[SIZE];
} vector;

void vector_copy (vector* v, vector* target) {
    for (int i=0; i<SIZE; i++) {
        target->vals[i] = v->vals[i];
    }
}

float buffered_vector_product (char buffer[]) {
    float scalar_product = 0;
    int offset = SIZE * 4;
    for (int i=0; i<SIZE; i=i+4) {
        scalar_product += buffer[i] * buffer[i+offset];
    }
    return scalar_product;
}

int main() {
    char buffer[100] = {};
    vector A = {{1, 1.5, 2, 2.5, 3}};
    vector B = {{0.5, -1, 1.5, -2, 2.5}};
    vector_copy(&A, (vector*) buffer);
    vector_copy(&B, (vector*) (buffer + sizeof(vector)));
    float prod = buffered_vector_product(buffer);
    std::cout << prod << std::endl;
    return 0;
}
Unfortunately this doesn't work yet. The problem lies within the function buffered_vector_product. I am unable to get the float values back from the buffer. Each float value should need 4 bytes. I don't know, how to access these 4 bytes and convert them into a float value. Can anyone help me out? Thanks a lot!
In the function buffered_vector_product, change the lines
int offset = SIZE * 4;
for (int i=0; i<SIZE; i=i+4) {
    scalar_product += buffer[i] * buffer[i+offset];
}
to
for ( int i=0; i<SIZE; i++ ) {
    scalar_product += ((float*)buffer)[i] * ((float*)buffer)[i+SIZE];
}
If you want to calculate the offsets manually, you can instead replace it with the following:
size_t offset = SIZE * sizeof(float);
for ( int i=0; i<SIZE; i++ ) {
    scalar_product += *(float*)(buffer+i*sizeof(float)) * *(float*)(buffer+i*sizeof(float)+offset);
}
However, with both solutions, you should beware of both the alignment restrictions and the strict aliasing rule.
The problem with the alignment restrictions can be solved by changing the line
char buffer[100] = {};
to the following:
alignas(float) char buffer[100] = {};
The strict aliasing rule is a much more complex issue, because the exact rule has changed significantly between different C++ standards and is (or at least was) different from the strict aliasing rule in the C language. See the link in the comments section for further information on this issue.
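One way to sidestep the aliasing question entirely is to copy the bytes back into a float with memcpy. A minimal sketch of buffered_vector_product rewritten that way (SIZE is the constant from the question; this is just one option, not the only correct one):

#include <cstring> // std::memcpy

float buffered_vector_product(const char buffer[]) {
    float scalar_product = 0;
    for (int i = 0; i < SIZE; i++) {
        float a, b;
        // memcpy re-assembles the raw bytes into floats without violating aliasing rules
        std::memcpy(&a, buffer + i * sizeof(float), sizeof(float));
        std::memcpy(&b, buffer + (i + SIZE) * sizeof(float), sizeof(float));
        scalar_product += a * b;
    }
    return scalar_product;
}

Compilers typically turn these memcpy calls into plain loads, so there is usually no performance cost compared with the cast-based version.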
I came across this sample code in the OpenCV library. What does the line p[j] = table[p[j]] do? I have come across multi-dimensional arrays but not something like this before.
Mat& ScanImageAndReduceC(Mat& I, const uchar* const table)
{
    // accept only char type matrices
    CV_Assert(I.depth() == CV_8U);

    int channels = I.channels();
    int nRows = I.rows;
    int nCols = I.cols * channels;

    if (I.isContinuous())
    {
        nCols *= nRows;
        nRows = 1;
    }

    int i,j;
    uchar* p;
    for( i = 0; i < nRows; ++i)
    {
        p = I.ptr<uchar>(i);
        for ( j = 0; j < nCols; ++j)
        {
            p[j] = table[p[j]];
        }
    }
    return I;
}
It is doing color replacement by using a table where each pixel intensity maps to some other value. Commonly used for techniques like color grading, histogram adjustment, or even thresholding.
Here, the table contains unsigned char values and is indexed by the value of the pixel. The pixel's intensity p[j] is used as an index into the table, and the value at that index is then written back to that pixel, replacing its original value.
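As a small, hypothetical example of the idea (not from the tutorial itself), here is a table that thresholds an 8-bit single-channel image at 128 using the same p[j] = table[p[j]] replacement:

#include <opencv2/core/core.hpp>
using namespace cv;

int main()
{
    // Build a 256-entry lookup table: values below 128 become 0, the rest become 255.
    uchar table[256];
    for (int v = 0; v < 256; ++v)
        table[v] = (v < 128) ? 0 : 255;

    Mat img(4, 4, CV_8UC1, Scalar(100)); // dummy image, just for the sketch
    for (int i = 0; i < img.rows; ++i)
    {
        uchar* p = img.ptr<uchar>(i);
        for (int j = 0; j < img.cols; ++j)
            p[j] = table[p[j]];          // replace each pixel with its table entry
    }
    return 0;
}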
It is a lookup table conversion.
The pixels of the image (I) are converted by means of the table.
For example, the pixel with value 100 would be changed to 10 if table[100]=10.
Your sample code comes from an OpenCV tutorial, which explains well what the code does:
https://docs.opencv.org/master/db/da5/tutorial_how_to_scan_images.html
I'm trying to do very simple (LUT-like) operations on a 16-bit grayscale OpenCV Mat in a way that is efficient and doesn't slow down the debugger.
While there is a very detailed page in the documentation addressing exactly this issue, it fails to point out that most of those methods are only available on 8-bit images (including the perfect, optimized LUT function).
I tried the following methods:
uchar* p = mat_depth.data;
for (unsigned int i = 0; i < depth_width * depth_height * sizeof(unsigned short); ++i)
{
    *p = ...;
    *p++;
}
Really fast, unfortunately only supporting uchar (just like LUT).
int i = 0;
for (int row = 0; row < depth_height; row++)
{
    for (int col = 0; col < depth_width; col++)
    {
        i = mat_depth.at<short>(row, col);
        i = ..
        mat_depth.at<short>(row, col) = i;
    }
}
Adapted from this answer: https://stackoverflow.com/a/27225293/518169. Didn't work for me, and it was very slow.
cv::MatIterator_<ushort> it, end;
for (it = mat_depth.begin<ushort>(), end = mat_depth.end<ushort>(); it != end; ++it)
{
    *it = ...;
}
Works well, however it uses a lot of CPU and makes the debugger super slow.
This answer https://stackoverflow.com/a/27099697/518169 points to the source code of the built-in LUT function, but it only mentions advanced optimization techniques like IPP and OpenCL.
What I'm looking for is a very simple loop like the first code, but for ushorts.
What method do you recommend for solving this problem? I'm not looking for extreme optimization, just something on par with the performance of the single-for-loop on .data.
I implemented Michael's and Kornel's suggestions and benchmarked them both in release and debug mode.
Code:
cv::Mat LUT_16(cv::Mat &mat, ushort table[])
{
    int limit = mat.rows * mat.cols;
    ushort* p = mat.ptr<ushort>(0);
    for (int i = 0; i < limit; ++i)
    {
        p[i] = table[p[i]];
    }
    return mat;
}

cv::Mat LUT_16_reinterpret_cast(cv::Mat &mat, ushort table[])
{
    int limit = mat.rows * mat.cols;
    ushort* ptr = reinterpret_cast<ushort*>(mat.data);
    for (int i = 0; i < limit; i++, ptr++)
    {
        *ptr = table[*ptr];
    }
    return mat;
}

cv::Mat LUT_16_if(cv::Mat &mat)
{
    int limit = mat.rows * mat.cols;
    ushort* ptr = reinterpret_cast<ushort*>(mat.data);
    for (int i = 0; i < limit; i++, ptr++)
    {
        if (*ptr == 0) {
            *ptr = 65535;
        }
        else {
            *ptr *= 100;
        }
    }
    return mat;
}

ushort* tablegen_zero()
{
    static ushort table[65536];
    for (int i = 0; i < 65536; ++i)
    {
        if (i == 0)
        {
            table[i] = 65535;
        }
        else
        {
            table[i] = i;
        }
    }
    return table;
}
The results are the following (release/debug):
LUT_16: 0.202 ms / 0.773 ms
LUT_16_reinterpret_cast: 0.184 ms / 0.801 ms
LUT_16_if: 0.249 ms / 0.860 ms
So the conclusion is that the reinterpret_cast version is faster by about 9% in release mode, while the ptr<ushort> version is faster by about 4% in debug mode.
It's also interesting to see that doing the branching directly (LUT_16_if) instead of applying a LUT only makes it slower by 0.065 ms.
Specs: streaming 640x480x16-bit grayscale image, Visual Studio 2013, i7 4750HQ.
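For completeness, a minimal usage sketch of the functions above; the 640x480 CV_16UC1 Mat here is just a stand-in for the streamed depth frame:

#include <opencv2/core/core.hpp>

int main()
{
    cv::Mat depth(480, 640, CV_16UC1, cv::Scalar(0)); // dummy 16-bit frame
    ushort* table = tablegen_zero();                   // 0 -> 65535, everything else unchanged
    LUT_16(depth, table);                              // apply the lookup table in place
    return 0;
}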
The OpenCV implementation is based on polymorphism and runtime dispatching over templates. In OpenCV, the use of templates is limited to a fixed set of primitive data types. That is, array elements should have one of the following types:
8-bit unsigned integer (uchar)
8-bit signed integer (schar)
16-bit unsigned integer (ushort)
16-bit signed integer (short)
32-bit signed integer (int)
32-bit floating-point number (float)
64-bit floating-point number (double)
a tuple of several elements where all elements have the same type (one of the above).
If your cv::Mat is continuous, you can use pointer arithmetic to walk through the whole data pointer; you just have to use the pointer type appropriate to your cv::Mat.
Furthermore, keep in mind that cv::Mats are not always continuous (they can be a ROI, padded, or created from an external pixel pointer), and iterating over them with raw pointers will then crash.
An example loop:
cv::Mat cvmat16sc1 = cv::Mat::eye(10, 10, CV_16SC1);

if (cvmat16sc1.data)
{
    if (!cvmat16sc1.isContinuous())
    {
        cvmat16sc1 = cvmat16sc1.clone();
    }

    short* ptr = reinterpret_cast<short*>(cvmat16sc1.data);
    for (int i = 0; i < cvmat16sc1.cols * cvmat16sc1.rows; i++, ptr++)
    {
        if (*ptr == 1)
            std::cout << i << ": " << *ptr << std::endl;
    }
}
The best solution for your problem is already written in the tutorial that you mentioned, in the chapter named "The efficient way". All you need to do is replace every instance of uchar with ushort. No other changes are needed.
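A rough sketch of what that change looks like, modeled on the tutorial function quoted earlier (the function name is illustrative, and the table is assumed to hold 65536 ushort entries):

#include <opencv2/core/core.hpp>

cv::Mat& scanImageAndReduce16(cv::Mat& I, const ushort* const table)
{
    CV_Assert(I.depth() == CV_16U); // accept only 16-bit unsigned matrices

    int nRows = I.rows;
    int nCols = I.cols * I.channels();
    if (I.isContinuous())
    {
        nCols *= nRows;
        nRows = 1;
    }

    for (int i = 0; i < nRows; ++i)
    {
        ushort* p = I.ptr<ushort>(i); // row pointer, now typed as ushort
        for (int j = 0; j < nCols; ++j)
            p[j] = table[p[j]];
    }
    return I;
}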
I have a rather unexpected issue with one of my functions. Let me explain.
I'm writing a calibration algorithm and since I want to do some grid search (non-continuous optimization), I'm creating my own mesh - different combinations of probabilities.
The size of the grid and the grid itself are computed recursively (I know...).
So in order:
Get variables
Compute corresponding size recursively
Allocate memory for the grid
Pass the empty grid by reference and fill it recursively
The problem appears after step 4, once I try to retrieve the grid. During step 4, I 'print' the results to the console to check them and everything is fine. I computed several grids with several variables and they all match the results I'm expecting. However, as soon as the grid is taken out of the recursive function, the last column is filled with 0 (all the previous values are replaced in this column only).
I tried allocating one extra column for the grid in step 3, but this only made the problem worse (-3e303 etc. values). I also get the error no matter what size I compute the grid with (very small to very large), so I assume it isn't a memory error (or at least not a 'lack of memory' error). The two functions used and their call are listed below; this was programmed quickly, so some variables might seem kind of useless - I know. However, I'm always open to your comments (plus I'm no expert in C++ - hence this thread).
void size_Grid_Computation(int nVars, int endPoint, int consideredVariable, int * indexes, int &sum, int nChoices)
{
    /** Remember to initialize r at 1 !! - we exclude var_0 and var_(m-1) (first and last variables) in this algorithm **/
    int endPoint2 = 0;
    if (consideredVariable < nVars - 2)
    {
        for (indexes[consideredVariable] = 0; indexes[consideredVariable] < endPoint; indexes[consideredVariable]++)
        {
            endPoint2 = endPoint - indexes[consideredVariable];
            size_Grid_Computation(nVars, endPoint2, consideredVariable + 1, indexes, sum, nChoices);
        }
    }
    else
    {
        for (int i = 0; i < nVars - 2; i++)
        {
            sum -= indexes[i];
        }
        sum += nChoices;
        return;
    }
}
The function above computes the grid size. Below is the one that fills the grid itself:
void grid_Creation(double* choicesVector, double** varVector, int consideredVariable, int * indexes, int endPoint, int nVars, int &r)
{
    if (consideredVariable > nVars - 1)
        return;

    for (indexes[consideredVariable] = 0; indexes[consideredVariable] < endPoint; indexes[consideredVariable]++)
    {
        if (consideredVariable == nVars - 1)
        {
            double sum = 0.0;
            for (int j = 0; j <= consideredVariable; j++)
            {
                varVector[r][j] = choicesVector[indexes[j]];
                sum += varVector[r][j];
                printf("%lf\t", varVector[r][j]);
            }
            varVector[r][nVars - 1] = 1 - sum;
            printf("%lf row %d\n", varVector[r][nVars - 1], r + 1);
            r += 1;
        }
        grid_Creation(choicesVector, varVector, consideredVariable + 1, indexes, endPoint - indexes[consideredVariable], nVars, r);
    }
}
Finally, the call:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int nVars = 5;
    int gridPrecision = 3;
    int sum1 = 0;
    int r = 0;
    int size = 0;

    int * index, * indexes;
    index = (int *) calloc(nVars - 1, sizeof(int));
    indexes = (int *) calloc(nVars, sizeof(int));

    for (index[0] = 0; index[0] < gridPrecision + 1; index[0] ++)
    {
        size_Grid_Computation(nVars, gridPrecision + 1 - index[0], 1, index, size, gridPrecision + 1);
    }

    double * Y;
    Y = (double *) calloc(gridPrecision + 1, sizeof(double));
    for (int i = 0; i <= gridPrecision; i++)
    {
        Y[i] = (double) i / (double) gridPrecision;
    }

    double ** varVector;
    varVector = (double **) calloc(size, sizeof(double *));
    for (int i = 0; i < size; i++)
    {
        varVector[i] = (double *) calloc(nVars, sizeof(double *));
    }

    grid_Creation(Y, varVector, 0, indexes, gridPrecision + 1, nVars - 1, r);

    for (int i = 0; i < size; i++)
    {
        printf("%lf\n", varVector[i][nVars - 1]);
    }
}
I left in my barbarian 'printf's; they help narrow down the problem. Most likely I have forgotten or botched one memory allocation, but I can't see which one. Anyway, thanks for the help!
It seems to me that you have a fundamental design problem, namely your 2D array. What you are programming here is not a 2D array but an emulation of one. That only makes sense if you want a sort of sparse data structure where you may leave out parts. In your case it looks as if a plain old matrix is all you need.
Nowadays it is neither appropriate in C nor in C++ to program like this.
In C, since that seems to be what you are after, inside functions you can declare matrices even with dynamic bounds as
double A[n][m];
If you fear that this could smash your "stack", you may allocate it dynamically
double (*B)[m] = malloc(sizeof(double[n][m]));
You pass such beasts to functions by putting the bounds first in the parameter list
void toto(size_t n, size_t m, double X[n][m]) {
    ...
}
Once you have clean and readable code, you will find your bug much easier.
I have to get the scalar value of a lot of pixels in a grayscale image using OpenCV. It will be traversing hundreds of thousands of pixels, so I need the fastest possible method. Every other source I've found online has been very cryptic and hard to understand. Is there a simple line of code that just hands back an integer value representing the scalar value of the first channel (brightness) of the image?
// 'image' is assumed to be a cv::Mat; blue, green and red are declared elsewhere
for (int row = 0; row < image.rows; row++) {
    unsigned char *data = image.ptr(row);
    for (int col = 0; col < image.cols; col++) {
        // then use *data for the pixel value, assuming you know the order, RGB etc
        // Note 'rgb' is actually stored B,G,R
        blue = *data++;
        green = *data++;
        red = *data++;
    }
}
You need to get the data pointer on each new row because OpenCV will pad the data to a 32-bit boundary at the start of each row.
With regards to Martin's post, you can actually check if the memory is allocated continuously using the isContinuous() method in OpenCV's Mat object. The following is a common idiom for ensuring the outer loop only loops once if possible:
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp> // for imread
using namespace cv;

int main(void)
{
    Mat img = imread("test.jpg");

    int rows = img.rows;
    int cols = img.cols;

    if (img.isContinuous())
    {
        cols = rows * cols; // Loop over all pixels as 1D array.
        rows = 1;
    }

    for (int i = 0; i < rows; i++)
    {
        Vec3b *ptr = img.ptr<Vec3b>(i);
        for (int j = 0; j < cols; j++)
        {
            Vec3b pixel = ptr[j];
        }
    }
    return 0;
}