How to convert image storage order from channel-height-width to height-width-channel? - c++

I would like to know how to convert an image stored as a 1D std::vector<float> from CHW format (Channel, Height, Width) to HWC format (Height, Width, Channel) in C++. The format change is needed due to requirements of a neural network.
I used OpenCV to read and show the image as below:
cv::namedWindow("Screenshot", cv::WINDOW_AUTOSIZE );
cv::imshow("Screenshot", rgbImage);
Then I converted the cv::Mat rgbImage to a 1D std::vector<float> in format CHW:
size_t channels = 3;
std::vector<float> data(channels*ROS_IMAGE_HEIGHT*ROS_IMAGE_WIDTH);
for(size_t j=0; j<ROS_IMAGE_HEIGHT; j++){
    for(size_t k=0; k<ROS_IMAGE_WIDTH; k++){
        cv::Vec3b intensity = rgbImage.at<cv::Vec3b>(j, k);
        for(size_t i=0; i<channels; i++){
            data[i*ROS_IMAGE_HEIGHT*ROS_IMAGE_WIDTH + j*ROS_IMAGE_HEIGHT + k] = (float) intensity[i];
        }
    }
}
Now I want to convert the format of std::vector<float> data to HWC. How can I do this?

I found some description of the "CHW" and "HWC" formats here.
If the storage order is HWC, it means that
Each sample is stored as a column-major matrix (height, width) of float[numChannels] (r00, g00, b00, r10, g10, b10, r01, g01, b01, r11, g11, b11).
Thus a pixel (x, y, c) is found using
xStride = channels;
yStride = channels * width;
cStride = 1;
data[x*xStride + y*yStride + c*cStride]
If the storage order is CHW, it means that each channel is a different plane. A pixel (x, y, c) is found using
xStride = 1;
yStride = width;
cStride = width * height;
data[x*xStride + y*yStride + c*cStride]
Note that in the code in the question, data[i*ROS_IMAGE_HEIGHT*ROS_IMAGE_WIDTH + j*ROS_IMAGE_HEIGHT + k] is incorrect: j is the y-coordinate and should be multiplied by ROS_IMAGE_WIDTH, not ROS_IMAGE_HEIGHT.
The code in the question can be modified to yield a std::vector in the HWC format by replacing the line in the innermost loop with:
data[i + j*ROS_IMAGE_WIDTH*channels + k*channels] = (float) intensity[i];
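For reference, here is a minimal standalone sketch of converting an existing CHW buffer into a new HWC buffer using the stride formulas above (the function name and parameters are illustrative, not taken from the question):
#include <cstddef>
#include <vector>

// Reorders a planar CHW buffer into a new interleaved HWC buffer.
// Expects chw.size() == channels * height * width.
std::vector<float> chwToHwc(const std::vector<float>& chw,
                            std::size_t channels, std::size_t height, std::size_t width)
{
    std::vector<float> hwc(chw.size());
    for (std::size_t c = 0; c < channels; ++c)
        for (std::size_t y = 0; y < height; ++y)
            for (std::size_t x = 0; x < width; ++x)
                // CHW index: c*H*W + y*W + x  ->  HWC index: (y*W + x)*C + c
                hwc[(y * width + x) * channels + c] =
                    chw[c * height * width + y * width + x];
    return hwc;
}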

Related

Generating .pfm image in c++

I have a small program that outputs an RGB image, and I need it to be in .pfm format.
So I have some data in the range [0, 255]:
float * data;
data = new float[PixelWidth * PixelHeight * 3];
for (int i = 0; i < PixelWidth * PixelHeight * 3; i += 3) {
    int idx = i / 3;
    data[i] = img[idx].x;
    data[i + 1] = img[idx].y;
    data[i + 2] = img[idx].z;
}
(img[] here is Vec3[] of unsigned char)
Now I generate the image.
char sizes[256];
f = fopen("outputimage.pfm", "wb");
double scale = -1.0;
fprintf(f, "PF\n%d %d\n%lf\n", PixelWidth, PixelHeight, scale);
for (int i = 0; i < PixelWidth*PixelHeight*3; i++) {
    float d = data[i];
    fwrite((void *)&d, 1, 4, f);
}
fclose(f);
But somehow I get a grayscale image instead of RGB.
The data itself is fine; when I output it as .ppm it looks correct.
I guess the problem is with scaling, but I am not really sure how it should be done correctly.
To close the question: I just had to convert all the values from the [0, 255] range to [0.0, 1.0], so I divided each RGB value by 255.
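In code, the fill loop from the question could simply scale while copying; a minimal sketch, assuming img still holds the original 8-bit values:
for (int i = 0; i < PixelWidth * PixelHeight * 3; i += 3) {
    int idx = i / 3;
    data[i]     = img[idx].x / 255.0f;  // R, now in [0.0, 1.0]
    data[i + 1] = img[idx].y / 255.0f;  // G
    data[i + 2] = img[idx].z / 255.0f;  // B
}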

How do I pass an OpenCV Mat into a C++ Tensorflow graph?

In Tensorflow C++ I can load an image file into the graph using
tensorflow::Node* file_reader = tensorflow::ops::ReadFile(tensorflow::ops::Const(IMAGE_FILE_NAME, b.opts()),b.opts().WithName(input_name));
tensorflow::Node* image_reader = tensorflow::ops::DecodePng(file_reader, b.opts().WithAttr("channels", 3).WithName("png_reader"));
tensorflow::Node* float_caster = tensorflow::ops::Cast(image_reader, tensorflow::DT_FLOAT, b.opts().WithName("float_caster"));
tensorflow::Node* dims_expander = tensorflow::ops::ExpandDims(float_caster, tensorflow::ops::Const(0, b.opts()), b.opts());
tensorflow::Node* resized = tensorflow::ops::ResizeBilinear(dims_expander, tensorflow::ops::Const({input_height, input_width},b.opts().WithName("size")),b.opts());
For an embedded application I would like to instead pass an OpenCV Mat into this graph.
How would I convert the Mat to a tensor that could be used as input to tensorflow::ops::Cast or tensorflow::ops::ExpandDims?
It's not directly from a CvMat, but you can see an example of how to initialize a Tensor from an in-memory array in the TensorFlow Android example:
https://github.com/tensorflow/tensorflow/blob/0.6.0/tensorflow/examples/android/jni/tensorflow_jni.cc#L173
You would start off by creating a new tensorflow::Tensor object, with something like this (all code untested):
tensorflow::Tensor input_tensor(tensorflow::DT_FLOAT,
tensorflow::TensorShape({1, height, width, depth}));
This creates a Tensor object with float values, with a batch size of 1, a size of width x height, and depth channels. For example, a 128-wide by 64-high image with 3 channels would pass in a shape of {1, 64, 128, 3}. The batch size is only used when you need to pass in multiple images in a single call; for simple uses you can leave it as 1.
Then you would get the underlying array behind the tensor using a line like this:
auto input_tensor_mapped = input_tensor.tensor<float, 4>();
The input_tensor_mapped object is an interface to the data in your newly-created tensor, and you can then copy your own data into it. Here I'm assuming you've set source_data as a pointer to your source data, for example:
const float* source_data = some_structure.imageData;
You can then loop through your data and copy it over:
for (int y = 0; y < height; ++y) {
    const float* source_row = source_data + (y * width * depth);
    for (int x = 0; x < width; ++x) {
        const float* source_pixel = source_row + (x * depth);
        for (int c = 0; c < depth; ++c) {
            const float* source_value = source_pixel + c;
            input_tensor_mapped(0, y, x, c) = *source_value;
        }
    }
}
There are obvious opportunities to optimize this naive approach, and I don't have sample code on hand to show how to deal with the OpenCV side of getting the source data, but hopefully this is helpful to get you started.
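For instance, if your source buffer already happens to be contiguous float data in HWC order, the three nested loops above could be collapsed into a single copy; this is just a sketch under that assumption:
#include <cstring>
// Assumes source_data points to height * width * depth contiguous floats,
// already laid out in HWC order (the layout the tensor expects).
std::memcpy(input_tensor.flat<float>().data(),
            source_data,
            sizeof(float) * height * width * depth);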
Here is a complete example to read an image and feed it into the graph:
Mat image;
image = imread("flowers.jpg", CV_LOAD_IMAGE_COLOR);
cv::resize(image, image, cv::Size(input_height, input_width), 0, 0, CV_INTER_CUBIC);
int depth = 3;
tensorflow::Tensor input_tensor(tensorflow::DT_FLOAT,
                                tensorflow::TensorShape({1, image.rows, image.cols, depth}));
auto input_tensor_mapped = input_tensor.tensor<float, 4>();
for (int y = 0; y < image.rows; y++) {
    for (int x = 0; x < image.cols; x++) {
        Vec3b pixel = image.at<Vec3b>(y, x);
        input_tensor_mapped(0, y, x, 0) = pixel.val[2]; //R
        input_tensor_mapped(0, y, x, 1) = pixel.val[1]; //G
        input_tensor_mapped(0, y, x, 2) = pixel.val[0]; //B
    }
}
auto result = Sub(root.WithOpName("subtract_mean"), input_tensor, {input_mean});
ClientSession session(root);
TF_CHECK_OK(session.Run({result}, out_tensors));
I tried to run the Inception model on an OpenCV Mat and the following code worked for me: https://gist.github.com/kyrs/9adf86366e9e4f04addb. There are still some issues with the integration of OpenCV and TensorFlow, though: the code worked without any issue for .png files but failed to load .jpg and .jpeg. You can follow https://github.com/tensorflow/tensorflow/issues/1924 for more info.
Tensor convertMatToTensor(Mat &input)
{
    int height = input.rows;
    int width = input.cols;
    int depth = input.channels();
    // Allocate a float tensor of shape {height, width, depth} (no batch dimension).
    Tensor imgTensor(tensorflow::DT_FLOAT, tensorflow::TensorShape({height, width, depth}));
    // Wrap the tensor's buffer in a Mat header and let OpenCV convert/copy into it.
    float* p = imgTensor.flat<float>().data();
    Mat outputImg(height, width, CV_32FC3, p);
    input.convertTo(outputImg, CV_32FC3);
    return imgTensor;
}
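A hedged usage sketch: the placeholder and output node names "input" and "output" are only illustrative, and session is assumed to be a tensorflow::Session already created from your graph. Note that the tensor returned above has no batch dimension, so a graph expecting {1, height, width, depth} would need an extra reshape or ExpandDims step.
Mat image = imread("flowers.jpg");
Tensor t = convertMatToTensor(image);
std::vector<tensorflow::Tensor> outputs;
tensorflow::Status s = session->Run({{"input", t}}, {"output"}, {}, &outputs);
TF_CHECK_OK(s);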

Bilinear re-sizing with C++ and vector of RGBA pixels

I am trying to re-size an image by using the bilinear technique I found here, but I don't see anything but a black image.
So, first of all, I have my image decoded with LodePNG and the pixels go into a vector<unsigned char> variable. It says that they are stored as RGBARGBA, but when I tried to apply the image to an X11 window I realized they were stored as BGRABGRA. I don't know if it is the X11 API that changes the order or the LodePNG decoder. Anyway, before anything else, I convert the BGR to RGB:
// Here is where I have the pixels stored
vector<unsigned char> Image;
// Converting BGRA to RGBA, or vice-versa, I don't know, but it's how it is shown
// correctly on the window
unsigned char red, blue;
unsigned int i;
for(i=0; i<Image.size(); i+=4)
{
    red = Image[i + 2];
    blue = Image[i];
    Image[i] = red;
    Image[i + 2] = blue;
}
So, now I am trying to change the size of the image, before applying it to the window. The size would be the size of the window (stretch it).
First, I try to convert the RGBA values to int values, like this:
vector<int> IntImage;
for(unsigned i=0; i<Image.size(); i+=4)
{
    IData.push_back(256*256*this->Data[i+2] + 256*this->Data[i+1] + this->Data[i]);
}
Now I have this function from the link I specified above, which is supposed to do the interpolation:
vector<int> resizeBilinear(vector<int> pixels, int w, int h, int w2, int h2) {
    vector<int> temp(w2 * h2);
    int a, b, c, d, x, y, index;
    float x_ratio = ((float)(w-1))/w2;
    float y_ratio = ((float)(h-1))/h2;
    float x_diff, y_diff, blue, red, green;
    for (int i=0; i<h2; i++) {
        for (int j=0; j<w2; j++) {
            x = (int)(x_ratio * j);
            y = (int)(y_ratio * i);
            x_diff = (x_ratio * j) - x;
            y_diff = (y_ratio * i) - y;
            index = (y*w+x);
            a = pixels[index];
            b = pixels[index+1];
            c = pixels[index+w];
            d = pixels[index+w+1];
            // blue element
            // Yb = Ab(1-w)(1-h) + Bb(w)(1-h) + Cb(h)(1-w) + Db(wh)
            blue = (a&0xff)*(1-x_diff)*(1-y_diff) + (b&0xff)*(x_diff)*(1-y_diff) +
                   (c&0xff)*(y_diff)*(1-x_diff) + (d&0xff)*(x_diff*y_diff);
            // green element
            // Yg = Ag(1-w)(1-h) + Bg(w)(1-h) + Cg(h)(1-w) + Dg(wh)
            green = ((a>>8)&0xff)*(1-x_diff)*(1-y_diff) + ((b>>8)&0xff)*(x_diff)*(1-y_diff) +
                    ((c>>8)&0xff)*(y_diff)*(1-x_diff) + ((d>>8)&0xff)*(x_diff*y_diff);
            // red element
            // Yr = Ar(1-w)(1-h) + Br(w)(1-h) + Cr(h)(1-w) + Dr(wh)
            red = ((a>>16)&0xff)*(1-x_diff)*(1-y_diff) + ((b>>16)&0xff)*(x_diff)*(1-y_diff) +
                  ((c>>16)&0xff)*(y_diff)*(1-x_diff) + ((d>>16)&0xff)*(x_diff*y_diff);
            temp.push_back(
                ((((int)red)<<16)&0xff0000) |
                ((((int)green)<<8)&0xff00) |
                ((int)blue) |
                0xff); // hardcode alpha
        }
    }
    return temp;
}
and I use it like this:
vector<int> NewImage = resizeBilinear(IntData, image_width, image_height, window_width, window_height);
which is supposed to return the RGBA vector of the re-sized image. Now I convert back from int to RGBA:
Image.clear();
for(unsigned i=0; i<NewImage.size(); i++)
{
    Image.push_back(NewImage[i] & 255);
    Image.push_back((NewImage[i] >> 8) & 255);
    Image.push_back((NewImage[i] >> 16) & 255);
    Image.push_back(0xff);
}
and what I get is a black window (the default background color), so I don't know what I am missing. If I comment out the line where I get the new image and just convert IntImage back to RGBA, I get the correct values, so I don't think the problem is the RGBA/int and int/RGBA conversions. I'm just lost now. I know this can be optimized/simplified, but for now I just want to make it work.
The array access in your code is incorrect:
vector<int> temp(w2 * h2); // initializes the array to contain zeros
...
temp.push_back(...); // appends to the array, leaving the zeros unchanged
You should overwrite instead of appending; for that, calculate the array position:
temp[i * w2 + j] = ...;
Alternatively, initialize the array to an empty state, and append your stuff:
vector<int> temp;
temp.reserve(w2 * h2); // reserves some memory; array is still empty
...
temp.push_back(...); // appends to the array

proper visualization of warped image

I am trying to implement image warping in C++ and OpenCV. My code is as follows:
Mat input = imread("Lena.jpg",CV_LOAD_IMAGE_GRAYSCALE);
Mat out;
double xo, yo;
input.convertTo(input, CV_32FC1);
copyMakeBorder(input, input, 3, 3, 3, 3, 0);
int height = input.rows;
int width = input.cols;
out = Mat(height, width, input.type());
for(int j = 0; j < height; j++){
    for(int i = 0; i < width; i++){
        xo = (8.0 * sin(2.0 * PI * j / 128.0));
        yo = (8.0 * sin(2.0 * PI * i / 128.0));
        out.at<float>(j,i) = (float)input.at<float>(((int)(j+yo+height)%height),((int)(i+xo+width)%width));
    }
}
normalize(out, out,0,255,NORM_MINMAX,CV_8UC1);
imshow("output", out);
This produces the following image:
As is clearly visible, the values near the border are non-zero. Can anyone tell me how I can get a black border, as shown in the following image, instead of the artifacts I get from my code?
Only the black border of this image should be considered, i.e. the image should be wavy (sinusoidal) but without artifacts.
Thanks...
Here:
xo = (8.0 * sin(2.0 * PI * j / 128.0));
yo = (8.0 * sin(2.0 * PI * i / 128.0));
out.at<float>(j,i) = (float)input.at<float>(((int)(j+yo+height)%height),((int)(i+xo+width)%width));
You calculate the location of the source pixel, but you take the mod with width/height to ensure it's within the image. This results in pixels wrapping around at the edge. Instead you need to set any pixel outside of the image to black (or, if your source image has a black border, clamp to the edge).
As you have a border already, you could just clamp the coordinates, like this:
int ix = min(width-1, max(0, (int) (i + xo)));
int iy = min(height-1, max(0, (int) (j + yo)));
out.at<float>(j,i) = (float)input.at<float>(iy,ix);

Accessing certain pixel RGB value in openCV

I have searched internet and stackoverflow thoroughly, but I haven't found answer to my question:
How can I get/set (both) the RGB value of a certain pixel (given by x,y coordinates) in OpenCV? What's important: I'm writing in C++, and the image is stored in a cv::Mat variable. I know there is an IplImage() operator, but IplImage is not very comfortable to use; as far as I know it comes from the C API.
Yes, I'm aware that there was already this Pixel access in OpenCV 2.2 thread, but it was only about black and white bitmaps.
EDIT:
Thank you very much for all your answers. I see there are many ways to get/set the RGB value of a pixel. I got one more idea from my close friend (thanks, Benny!). It's very simple and effective. I think it's a matter of taste which one you choose.
Mat image;
(...)
Point3_<uchar>* p = image.ptr<Point3_<uchar> >(y,x);
And then you can read/write RGB values with:
p->x //B
p->y //G
p->z //R
Try the following:
cv::Mat image = ...do some stuff...;
image.at<cv::Vec3b>(y,x) gives you the pixel value as a cv::Vec3b of the RGB channels (it might be ordered as BGR). To set a new value:
image.at<cv::Vec3b>(y,x)[0] = newval[0];
image.at<cv::Vec3b>(y,x)[1] = newval[1];
image.at<cv::Vec3b>(y,x)[2] = newval[2];
The low-level way would be to access the matrix data directly. In an RGB image (which I believe OpenCV typically stores as BGR), and assuming your cv::Mat variable is called frame, you could get the blue value at location (x, y) (from the top left) this way:
frame.data[frame.channels()*(frame.cols*y + x)];
Likewise, to get B, G, and R:
uchar b = frame.data[frame.channels()*(frame.cols*y + x) + 0];
uchar g = frame.data[frame.channels()*(frame.cols*y + x) + 1];
uchar r = frame.data[frame.channels()*(frame.cols*y + x) + 2];
Note that this code assumes the stride is equal to the width of the image.
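If the Mat might have padded rows (for example a ROI into a larger image), it is safer not to assume the stride equals the width; a small sketch using Mat::ptr, which accounts for the real row stride:
const uchar* row = frame.ptr<uchar>(y);   // start of row y, honoring frame.step
uchar b = row[frame.channels() * x + 0];
uchar g = row[frame.channels() * x + 1];
uchar r = row[frame.channels() * x + 2];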
A piece of code is easier for people who have this problem, so I share my code here and you can use it directly. Please note that OpenCV stores pixels as BGR.
cv::Mat vImage_;
if(src_)
{
    cv::Vec3f vec_;
    for(int i = 0; i < vHeight_; i++)
        for(int j = 0; j < vWidth_; j++)
        {
            vec_ = cv::Vec3f((*src_)[0]/255.0, (*src_)[1]/255.0, (*src_)[2]/255.0); // Please note that OpenCV stores pixels as BGR.
            vImage_.at<cv::Vec3f>(vHeight_-1-i, j) = vec_;
            ++src_;
        }
}
if(! vImage_.data ) // Check for invalid input
    printf("failed to read image by OpenCV.");
else
{
    cv::namedWindow( windowName_, CV_WINDOW_AUTOSIZE);
    cv::imshow( windowName_, vImage_); // Show the image.
}
Current versions allow the cv::Mat::at function to handle 3 dimensions, so for a 3-dimensional Mat object m, m.at<uchar>(0,0,0) should work.
uchar * value = img2.data; // pointer to the first byte of the pixel data (a flat interleaved array)
int r = 2;
// Walk the flat byte array and write the repeating pattern 255, 0, 0 for every
// 3-byte pixel, i.e. set the first channel to 255 and the other two to 0
// (in BGR order this fills the whole image with blue).
for (size_t i = 0; i < img2.cols * (img2.rows * img2.channels()); i++)
{
    if (r > 2) r = 0;
    if (r == 0) value[i] = 0;
    if (r == 1) value[i] = 0;
    if (r == 2) value[i] = 255;
    r++;
}
const double pi = boost::math::constants::pi<double>();

// Marks every pixel that lies inside the given rotated ellipse by setting its
// green channel to 255 (the image is assumed to be a 3-channel BGR cv::Mat).
cv::Mat distance2ellipse(cv::Mat image, cv::RotatedRect ellipse){
    float distance = 2.0f;
    float angle = ellipse.angle;
    cv::Point ellipse_center = ellipse.center;
    float major_axis = ellipse.size.width/2;
    float minor_axis = ellipse.size.height/2;
    cv::Point pixel;
    float a,b,c,d;
    for(int x = 0; x < image.cols; x++)
    {
        for(int y = 0; y < image.rows; y++)
        {
            // Rotate the pixel into the ellipse's local coordinate frame.
            auto u = cos(angle*pi/180)*(x-ellipse_center.x) + sin(angle*pi/180)*(y-ellipse_center.y);
            auto v = -sin(angle*pi/180)*(x-ellipse_center.x) + cos(angle*pi/180)*(y-ellipse_center.y);
            // Normalized ellipse equation: a value <= 1 means the point is inside.
            distance = (u/major_axis)*(u/major_axis) + (v/minor_axis)*(v/minor_axis);
            if(distance <= 1)
            {
                image.at<cv::Vec3b>(y,x)[1] = 255;
            }
        }
    }
    return image;
}
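A minimal usage sketch; the file name and ellipse parameters are made up for illustration:
cv::Mat img = cv::imread("input.jpg");      // 3-channel BGR image
cv::RotatedRect e(cv::Point2f(100, 100),    // center
                  cv::Size2f(80, 40),       // full width and height of the ellipse
                  30.0f);                   // rotation angle in degrees
cv::Mat marked = distance2ellipse(img, e);
cv::imshow("ellipse", marked);
cv::waitKey(0);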