I'm trying to convert an HLS file to JPEG. First, I used OpenH264 to decode the HLS file to YUV, which gave me an array of three plane pointers holding the Y, U and V buffers (*pData[3]). After that, I tried to combine the three buffers into one so that I can pass it to CompressYUYV2JPEG.
Here is how I convert:
for(i = 0; i < l; i++) {
inbuf.push_back(yuvData[0][i]);
}
l = bufferInfo.UsrData.sSystemBuffer.iWidth*bufferInfo.UsrData.sSystemBuffer.iHeight/4;
for(i = 0; i < l; i++) {
inbuf.push_back(yuvData[1][i]);
}
l = bufferInfo.UsrData.sSystemBuffer.iWidth*bufferInfo.UsrData.sSystemBuffer.iHeight/4;
for(i = 0; i < l; i++) {
inbuf.push_back(yuvData[2][i]);
}
Unfortunately, it doesn't produce the expected result. What is the proper way to convert the two-dimensional YUV array into a one-dimensional array?
You need YUV422, which here means packed YUYV. The size of inbuf must be divisible by 4, because every two pixels take four bytes (Y, U, Y, V). You can use
for(i = 0; i < l/2; i++) {
inbuf.push_back(yuvData[0][2*i]);
inbuf.push_back((yuvData[1][2*i] + yuvData[1][2*i + 1])/2);
inbuf.push_back(yuvData[0][2*i + 1]);
inbuf.push_back((yuvData[2][2*i] + yuvData[2][2*i + 1])/2);
}
In this code snippet all Y values are used, but only the average of each pair of Cb values and each pair of Cr values is used. Of course the number of elements in each yuvData channel must be even; otherwise you have to handle the last element separately.
I just now saw that you use YUV420. Then you can use this snippet:
for(i = 0; i < l/2; i++) {
inbuf.push_back(yuvData[0][2*i]);
inbuf.push_back(yuvData[1][i/2]);
inbuf.push_back(yuvData[0][2*i + 1]);
inbuf.push_back(yuvData[2][i/2]);
}
In this code all Y values are used once and all Cb and Cr values are used twice.
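One thing the flat index in the snippet does not take into account is that in planar 4:2:0 data the chroma planes are subsampled vertically as well as horizontally, and that decoders often pad each row, so the row stride can be larger than the width. The following is a minimal sketch of a row-aware conversion under those assumptions. The helper name i420ToYuyv and its parameters are hypothetical; with OpenH264 the strides would presumably come from the iStride fields of sSystemBuffer, and the width and height are assumed to be even.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs planar 4:2:0 data (separate Y, U, V planes)
// into packed YUYV. Strides are given in bytes per row.
std::vector<std::uint8_t> i420ToYuyv(const std::uint8_t* y, const std::uint8_t* u,
                                     const std::uint8_t* v, int width, int height,
                                     int strideY, int strideUV) {
    assert(width % 2 == 0 && height % 2 == 0);  // assumption of this sketch
    std::vector<std::uint8_t> out;
    out.reserve(static_cast<std::size_t>(width) * height * 2);
    for (int r = 0; r < height; ++r) {
        const std::uint8_t* yRow = y + static_cast<std::size_t>(r) * strideY;
        // Two consecutive luma rows share one chroma row.
        const std::uint8_t* uRow = u + static_cast<std::size_t>(r / 2) * strideUV;
        const std::uint8_t* vRow = v + static_cast<std::size_t>(r / 2) * strideUV;
        for (int c = 0; c < width; c += 2) {
            out.push_back(yRow[c]);      // Y0
            out.push_back(uRow[c / 2]);  // U shared by both pixels
            out.push_back(yRow[c + 1]);  // Y1
            out.push_back(vRow[c / 2]);  // V shared by both pixels
        }
    }
    return out;
}
```

The returned vector holds width * height * 2 bytes, which is always divisible by 4 when the width is even.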
I want to convert the following code from Objective-C to C++.
In the class myClass, I have this attribute:
float tab[dim1][dim2][dim3];
In an Objective-C file, the multidimensional array is filled from a binary file:
NSData *dataTab=[NSData dataWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"pathOfMyTab" ofType:@""]];
[dataTab getBytes:myClass -> tab length:[dataTab length]];
How could I translate this part into C++?
I am assuming that your file contains the byte-representation of the array. If this is the case, then to mimic the behaviour of your Objective-C code using only C++ (the only thing that makes this C++ is the reinterpret_cast<>, otherwise it is just straight C), you could use the following code. I have not added any error checking, but left some comments where you might want to perform some.
float tab[dim1][dim2][dim3];
CFBundleRef mainBundle = CFBundleGetMainBundle();
CFURLRef dataTabURL = CFBundleCopyResourceURL(mainBundle, CFSTR("pathOfMyTab"), NULL, NULL);
CFReadStreamRef stream = CFReadStreamCreateWithFile(NULL, dataTabURL); // check for NULL return value
CFReadStreamOpen(stream); // check for errors here
CFReadStreamRead(stream, reinterpret_cast<UInt8 *>(tab), sizeof tab); // check that this function returns the number of bytes you were expecting (sizeof tab)
CFReadStreamClose(stream);
// we own "stream" and "dataTabURL" because we obtained these through functions
// with "create" in the name, therefore we must relinquish ownership with CFRelease
CFRelease(stream);
CFRelease(dataTabURL); // ditto
If you already have the path available in a std::string, then you can use the following C++ code to mimic the behaviour of your Objective-C code:
// make sure to include this header
#include <fstream>
// ... then elsewhere in your .cpp file ...
float tab[dim1][dim2][dim3];
std::string path = "path/to/mytab"; // obtain from somewhere
std::ifstream input(path, std::ios::binary); // check that the file was successfully opened
input.read(reinterpret_cast<char *>(tab), sizeof tab); // check that input.gcount() is the number of bytes you expected
I believe in this case we have to use reinterpret_cast<> because the file contains the actual representation of the array (assuming it was previously written to the file in a similar manner).
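The checks mentioned in the comments of the std::ifstream version could look roughly like the sketch below. The function name loadTable and the concrete dimensions are placeholders chosen for illustration; the real dim1, dim2 and dim3 come from the original class.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Placeholder dimensions, assumed to be compile-time constants.
constexpr std::size_t dim1 = 2, dim2 = 3, dim3 = 4;

// Reads exactly sizeof(tab) bytes from the file at `path` into `tab`.
// Returns false if the file cannot be opened or holds fewer bytes than expected.
bool loadTable(const std::string& path, float (&tab)[dim1][dim2][dim3]) {
    std::ifstream input(path, std::ios::binary);
    if (!input) {
        return false;  // file could not be opened
    }
    input.read(reinterpret_cast<char*>(tab), sizeof tab);
    // gcount() reports how many bytes the last read actually delivered.
    return input.gcount() == static_cast<std::streamsize>(sizeof tab);
}
```

Passing the array by reference keeps its full type, so sizeof tab inside the function is still the size of the whole array and not the size of a pointer.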
You can also use a hybrid approach: once you have the CFURLRef containing the path to the resource, you can obtain a file system representation of the URL using this function (providing a suitably sized output buffer to store the result), and from there you should be able to pass that to one of std::ifstream's constructors (although you may need to cast to the appropriate type).
C++ doesn't support variable-length arrays (the size of arrays must be known at compile time). There is also no matrix type provided by the standard library, so if the dimensions of your table vary at run time, then you will need a completely separate approach to the one in my answer. You could consider serialising the output from Objective-C (using e.g. JSON or another format) such that the dimensions of the matrix are also written to the output, making it easier to parse the file in C++.
Take a look at fstream, fread and read; all of them can read binary files, so pick the approach that suits you.
In my opinion, the simplest and fastest way is to use memcpy() to copy the NSData bytes into a target array with the same structure (dimensions) as the source one. See, for example:
https://github.com/Voldemarus/MultiDimensionalArrayDemo/tree/master
#import "DemoClass.h"
#define DIM1 3
#define DIM2 4
#define DIM3 2
@interface DemoClass() {
int src[DIM1][DIM2][DIM3]; // source (initial) array
int dst[DIM1][DIM2][DIM3]; // destination array
}
@end
@implementation DemoClass
- (instancetype) init
{
if (self = [super init]) {
for (int i = 0; i < DIM1; i++) {
for (int j = 0; j < DIM2; j++) {
for (int k = 0; k < DIM3; k++) {
int value = i*100 + j*10 + k;
src[i][j][k] = value;
}
}
}
}
return self;
}
// Flat (row-major) offset of element [i][j][k]; k must be part of the offset
int getIntFromArray(int *array, int i, int j, int k) {
    int offset = k + j*DIM3 + i*DIM2*DIM3;
    return array[offset];
}
void putIntToArray(int *array, int i, int j, int k, int value) {
    int offset = k + j*DIM3 + i*DIM2*DIM3;
    array[offset] = value;
}
- (void) run
{
// Step 1. Save array into NSData
NSInteger s = sizeof(int)*DIM1*DIM2*DIM3;
NSData *data = [[NSData alloc] initWithBytes:src length:s];
NSAssert(data, @"NSData should be created");
//Step2 - Create new array
int *bytes = (int *)[data bytes];
memcpy(dst,bytes,s);
// Step 3. Compare src and dst
for (int i = 0; i < DIM1; i++) {
for (int j = 0; j < DIM2; j++) {
for (int k = 0; k < DIM3; k++) {
int template = i*100 + j*10 + k;
int s = src[i][j][k];
int d = dst[i][j][k];
// NSLog(@"i %d j %d k %d -->s = %d d = %d",i,j,k,s,d);
NSAssert(s == template, @"Source array should have value from template");
NSAssert(d == s, @"Destination array should be identical to the source");
}
}
}
}
@end
float tab[dim1][dim2][dim3] looks like a three-dimensional array. The standard implementation is with three nested FOR loops.
So your C++ implementation can look like this:
Read dim1, dim2 and dim3 from somewhere, usually the first values in the file (for example 12 bytes, 4 bytes for each number).
Read the rest of the file in three nested for loops.
Something like:
for (size_t i = 0; i < dim1; ++i)
for (size_t j = 0; j < dim2; ++j)
for (size_t k = 0; k < dim3; ++k)
tab[i][j][k] = read_float_value(inputFile);
In Objective-C you can write the file in a similar way.
Here are some examples to get you started:
Three dimensional arrays of integers in C++
3D array C++ using int [] operator
I want to apply a simple derive/gradient filter, [-1, 0, 1], to an image from a .ppm file.
The raw binary data from the .ppm file is read into a one-dimensional array:
uint8_t* raw_image_data;
size_t n_rows, n_cols, depth;
// Open the file as an input binary file
std::ifstream file;
file.open("test_image.ppm", std::ios::in | std::ios::binary);
if (!file.is_open()) { /* error */ }
std::string temp_line;
// Check that it's a valid P6 file
if (!(std::getline(file, temp_line) && temp_line == "P6")) {}
// Then skip all the comments (lines that begin with a #)
while (std::getline(file, temp_line) && temp_line.at(0) == '#');
// Try read in the info about the number of rows and columns
try {
n_rows = std::stoi(temp_line.substr(0, temp_line.find(' ')));
n_cols = std::stoi(temp_line.substr(temp_line.find(' ')+1,temp_line.size()));
std::getline(file, temp_line);
depth = std::stoi(temp_line);
} catch (const std::invalid_argument & e) { /* stoi has failed */}
// Allocate memory and read in all image data from ppm
raw_image_data = new uint8_t[n_rows*n_cols*3];
file.read((char*)raw_image_data, n_rows*n_cols*3);
file.close();
I then read a grayscale image from the data into a two-dimensional array, called image_grayscale:
uint8_t** image_grayscale;
image_grayscale = new uint8_t*[n_rows];
for (size_t i = 0; i < n_rows; ++i) {
image_grayscale[i] = new uint8_t[n_cols];
}
// Convert linear array of raw image data to 2d grayscale image
size_t counter = 0;
for (size_t r = 0; r < n_rows; ++r) {
for (size_t c = 0; c < n_cols; ++c) {
image_grayscale[r][c] = 0.21*raw_image_data[counter]
+ 0.72*raw_image_data[counter+1]
+ 0.07*raw_image_data[counter+2];
counter += 3;
}
}
I want to write the resulting filtered image to another two-dimensional array, gradient_magnitude:
uint32_t** gradient_magnitude;
// Allocate memory
gradient_magnitude = new uint32_t*[n_rows];
for (size_t i = 0; i < n_rows; ++i) {
gradient_magnitude[i] = new uint32_t[n_cols];
}
// Filtering operation
int32_t grad_h, grad_v;
for (int r = 1; r < n_rows-1; ++r) {
for (int c = 1; c < n_cols-1; ++c) {
grad_h = image_grayscale[r][c+1] - image_grayscale[r][c-1];
grad_v = image_grayscale[r+1][c] - image_grayscale[r-1][c];
gradient_magnitude[r][c] = std::sqrt(pow(grad_h, 2) + pow(grad_v, 2));
}
}
Finally, I write the filtered image to a .ppm output.
std::ofstream out;
out.open("output.ppm", std::ios::out | std::ios::binary);
// ppm header
out << "P6\n" << n_rows << " " << n_cols << "\n" << "255\n";
// Write data to file
for (int r = 0; r < n_rows; ++r) {
for (int c = 0; c < n_cols; ++c) {
for (int i = 0; i < 3; ++i) {
out.write((char*) &gradient_magnitude[r][c],1);
}
}
}
out.close();
The output image, however, is a mess.
When I simply set grad_v = 0; in the loop (i.e. solely calculate the horizontal gradient), the output is seemingly correct:
When I instead set grad_h = 0; (i.e. solely calculate the vertical gradient), the output is strange:
It seems like part of the image has been circularly shifted, but I cannot understand why. Moreover, I have tried with many images and the same issue occurs.
Can anyone see any issues? Thanks so much!
Ok, first clue is that the image looks circularly shifted. This hints that strides are wrong. The core of your problem is simple:
n_rows = std::stoi(temp_line.substr(0, temp_line.find(' ')));
n_cols = std::stoi(temp_line.substr(temp_line.find(' ')+1,temp_line.size()));
but in the documentation you can read:
Each PPM image consists of the following:
A "magic number" for identifying the file type. A ppm image's magic number is the two
characters "P6".
Whitespace (blanks, TABs, CRs, LFs).
A width, formatted as ASCII characters in decimal.
Whitespace.
A height, again in ASCII decimal.
[...]
Width is columns, height is rows. So that's the classical error that you get when implementing image processing stuff: swapping rows and columns.
From a didactic point of view, why are you making this mistake? My guess: poor debugging tools. After making a working example from your question (effort that I would have been spared if you had provided an MCVE), I ran to the end of image loading and used Image Watch to see the content of your image with @mem(raw_image_data, UINT8, 3, n_cols, n_rows, n_cols*3). Result:
Ok, let's try to swap them: @mem(raw_image_data, UINT8, 3, n_rows, n_cols, n_rows*3). Result:
Much better. Unfortunately I don't know how to specify RGB instead of BGR in the Image Watch @mem pseudo command, hence the wrong colors.
Then we come back to your code: please compile with all warnings on. Then I'd use more of the std::istream features for parsing your input and less std::stoi() or find(). Avoid manual memory allocation by using std::vector, and make a (possibly template) class for images. Even if you stick to your pointer to pointer, don't call new once per row: make a single allocation for the pointer at row 0, and have the other row pointers point into it:
uint8_t** image_grayscale = new uint8_t*[n_rows];
image_grayscale[0] = new uint8_t[n_rows*n_cols];
for (size_t i = 1; i < n_rows; ++i) {
image_grayscale[i] = image_grayscale[i - 1] + n_cols;
}
Same effect, but easier to deallocate and to manage as a single piece of memory. For example, saving as a PGM becomes:
{
std::ofstream out("output.pgm", std::ios::binary);
out << "P5\n" << n_rows << " " << n_cols << "\n" << "255\n";
out.write(reinterpret_cast<char*>(image_grayscale[0]), n_rows*n_cols);
}
Fill your borders! Using the single allocation style I showed you you can do it as:
uint32_t** gradient_magnitude = new uint32_t*[n_rows];
gradient_magnitude[0] = new uint32_t[n_rows*n_cols];
for (size_t i = 1; i < n_rows; ++i) {
gradient_magnitude[i] = gradient_magnitude[i - 1] + n_cols;
}
std::fill_n(gradient_magnitude[0], n_rows*n_cols, 0);
Finally, the gradient magnitude is an integer value between 0 and 360 (you used a uint32_t), since each of the two differences lies in [-255, 255]. Then you save only the least significant byte of it! Of course it's wrong. You need to map from [0,360] to [0,255]. How? You can saturate (if greater than 255, set to 255) or apply a linear scaling (*255/360). You can of course do other things as well, but that's not important here.
Here you can see the result on a zoomed version of the three cases: saturate, scale, only LSB (wrong):
With the wrong version you see dark pixels where the value should be higher than 255.
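The two mappings can be sketched as small helpers. The function names are made up for this example, and the scaling uses the exact maximum 255*sqrt(2) (about 360.6) rather than the rounded 360.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Gradient magnitude of two 8-bit differences; the result lies in [0, ~360.6].
double gradientMagnitude(int gradH, int gradV) {
    return std::sqrt(static_cast<double>(gradH * gradH + gradV * gradV));
}

// Saturation: everything above 255 is clamped to 255.
std::uint8_t saturateToByte(double magnitude) {
    return static_cast<std::uint8_t>(std::min(magnitude, 255.0));
}

// Linear scaling: maps [0, 255*sqrt(2)] onto [0, 255].
std::uint8_t scaleToByte(double magnitude) {
    const double maxMagnitude = 255.0 * std::sqrt(2.0);
    const long scaled = std::lround(magnitude / maxMagnitude * 255.0);
    return static_cast<std::uint8_t>(std::min(scaled, 255L));
}
```

Saturation keeps small gradients at their original brightness but loses detail in strong edges, while scaling preserves the ordering of all values at the cost of a darker image overall.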
I have a 2048x2048 grayscale image matrix. I want to find the points whose value is > 0 and store their positions in an array with 2 columns and n rows (n being the number of points found). Here is my algorithm:
int icount;
icount = 0;
for (int i = 0; i < 2048; i++)
{
for (int j = 0; j < 2048; j++)
{
if (iout.at<double>(i, j) > 0)
{
icount++;
temp[icount][1] = i;
temp[icount][2] = j;
}
}
}
I have 2 problems:
temp is an array whose number of rows is unknown, because the number of rows grows with every match. How can I define the temp array? I need the exact number of rows for another implementation later, so I can't just pick an arbitrary size for it.
My algorithm above doesn't work. The result is
temp[1][1]=0 , temp[1][2]=0 , temp[2][1]=262 , temp[2][2]=655
which is completely wrong. The right one is:
temp[1][1]=1779 , temp[1][2]=149 , temp[2][1]=1780 , temp[2][2]=149
I know the right result because I implemented it in Matlab, where it is
[a,b]=find(iout>0);
How about a std::vector of std::pair:
std::vector<std::pair<int, int>> temp;
Then add (i, j) pairs to it using push_back. The size does not need to be known in advance:
temp.push_back(std::make_pair(i, j));
We'll need to know more about your problem and your code to be able to tell what's wrong with the algorithm.
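Put together, the idea looks roughly like the sketch below. To keep it self-contained it scans a plain row-major std::vector<double> instead of a cv::Mat, and findPositive is a hypothetical name. Indices are 0-based here, unlike in Matlab.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns the (row, column) position of every element
// greater than zero in a row-major image of the given size.
std::vector<std::pair<int, int>> findPositive(const std::vector<double>& image,
                                              int rows, int cols) {
    std::vector<std::pair<int, int>> positions;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (image[static_cast<std::size_t>(r) * cols + c] > 0) {
                positions.emplace_back(r, c);  // grows as needed
            }
        }
    }
    return positions;
}
```

The number of points found is then simply positions.size(). Note that this scan visits the image row by row, whereas Matlab's find reports matches in column-major order, so the same points may come out in a different order.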
When you define a variable of pointer type, you need to allocate memory and have the pointer point to that memory address. In your case, you have a multidimensional pointer so it requires multiple allocations. For example:
int **temp = new int *[100]; // room for 100 rows (each row is allocated below)
int icount = 0;
for(int i = 0; i < 2048; i++) {
    for(int j = 0; j < 2048; j++) {
        if(iout.at<double>(i, j) > 0) {
            temp[icount] = new int[2]; // only 2 values needed per row
            temp[icount][0] = i; // valid indices of a 2-element array are 0 and 1
            temp[icount][1] = j;
            icount++;
        }
    }
}
This will work for you, but it's only good if you know for sure you're not going to need any more than the pre-allocated array size (100 in this example). If you know exactly how much you need, this method is ok. If you know the maximum possible, it's also ok, but could be wasteful. If you have no idea what size you need in the first dimension, you have to use a dynamic collection, for example std::vector as suggested by IVlad. In case you do use the method I suggested, don't forget to free the allocated memory using delete []temp[i]; and delete []temp;
I am trying to make a fast image threshold function. Currently what I do is:
void threshold(const cv::Mat &input, cv::Mat &output, uchar threshold) {
int rows = input.rows;
int cols = input.cols;
// cv::Mat for result
output.create(rows, cols, CV_8U);
if(input.isContinuous()) { //we have to make sure that we are dealing with a continues memory chunk
const uchar* p;
for (int r = 0; r < rows; ++r) {
p = input.ptr<uchar>(r);
for (int c = 0; c < cols; ++c) {
if(p[c] >= threshold)
//how to access output faster??
output.at<uchar>(r,c) = 255;
else
output.at<uchar>(r,c) = 0;
}
}
}
}
I know that the at() function is quite slow. How can I set the output faster, or in other words how to relate the pointer which I get from the input to the output?
You are thinking of at as the C++ standard library documents it for a few containers, performing a range check and throwing if out of bounds, however this is not the standard library but OpenCV.
According to the cv::Mat::at documentation:
The template methods return a reference to the specified array element. For the sake of higher performance, the index range checks are only performed in the Debug configuration.
So there's no range check as you may be thinking.
Comparing both cv::Mat::at and cv::Mat::ptr in the source code we can see they are almost identical.
So cv::Mat::ptr<>(row) is as expensive as
return (_Tp*)(data + step.p[0] * y);
While cv::Mat::at<>(row, column) is as expensive as:
return ((_Tp*)(data + step.p[0] * i0))[i1];
You might want to take cv::Mat::ptr directly instead of calling cv::Mat::at every column to avoid further repetition of the data + step.p[0] * i0 operation, doing [i1] by yourself.
So you would do:
/* output.create and stuff */
const uchar* p; // read-only pointer into the input row
uchar* o;       // writable pointer into the output row
for (int r = 0; r < rows; ++r) {
p = input.ptr<uchar>(r);
o = output.ptr<uchar>(r); // <-----
for (int c = 0; c < cols; ++c) {
if(p[c] >= threshold)
o[c] = 255;
else
o[c] = 0;
}
}
As a side note, you don't need to (and shouldn't) check cv::Mat::isContinuous here: the gaps are between one row and the next, and since you take a pointer to each row separately, you never have to deal with them.
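The reason the continuity check is unnecessary can be illustrated without OpenCV. In the sketch below, raw buffers with an explicit step (bytes per row) stand in for cv::Mat, and thresholdRows is a hypothetical name; the row pointers are computed in the same way as in the ptr expression quoted above.

```cpp
#include <cstddef>
#include <cstdint>

// Thresholds an 8-bit image row by row. inStep and outStep are the number of
// bytes from the start of one row to the start of the next, and may be larger
// than cols when rows are padded.
void thresholdRows(const std::uint8_t* in, std::size_t inStep,
                   std::uint8_t* out, std::size_t outStep,
                   int rows, int cols, std::uint8_t threshold) {
    for (int r = 0; r < rows; ++r) {
        const std::uint8_t* p = in + r * inStep;  // start of input row r
        std::uint8_t* o = out + r * outStep;      // start of output row r
        for (int c = 0; c < cols; ++c) {
            o[c] = (p[c] >= threshold) ? 255 : 0;
        }
    }
}
```

Padding bytes at the end of each row are never touched, because every row starts from its own pointer.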
I have to translate from Matlab to C this code:
% take off the pads
x = (1 + padSize) : (rows - padSize);
y = (1 + padSize) : (cols - padSize);
rpad=rpad(x,y);
The 1st and 2nd lines create two arrays, but I don't know how to remove the padding from the rpad Mat object. It could be something like this (subtract every element):
for(int i=1+pad;i<=rows-pad;i++){
for(int j=1+pad;i<=cols-pad;j++){
subtract(rpad,x,rpad);
subtract(rpad,y,rpad);}}
Or something like this (delete the outer elements):
int a=(rows-pad)-(1+pad);
int b=(cols-pad)-(1+pad);
rpad.create(img.rows - a,img.cols - b,original.type());
img.copyTo(rpad);
Try
cv::Rect roi(padSize, padSize, rpad.cols-2*padSize, rpad.rows-2*padSize);
cv::Mat result = rpad(roi);
Depending on whether you want continuous memory, you can either use result directly (discontinuous, usually fine for most OpenCV functions) or copy it back to rpad (continuous), for example with rpad = result.clone().
Is it possible to multiply a Mat object by a two-dimensional array? Here imfft is the Mat object:
for (int i = 0; i < rows; i++){
for (int j = 0; j < cols; j++){
imfft=imfft*filter[i][j]
}
}