I have this MATLAB code to display the image of a spectrogram (STFT, a couple of PLCA passes, ...):
t = z2 *stft_options.hop/stft_options.sr;
f = stft_options.sr*[0:size(spec_t,1)-1]/stft_options.N/1000;
max_val = max(max(db(abs(spec_t))));
imagesc(t, f, db(abs(spec_t)),[max_val-60 max_val]);
And get this result:
I ported this to C++ successfully using the Armadillo library and got the mat results:
mat f,t,spec_t;
The problem is that I have no idea how to convert the matrix to a bitmap the way imagesc does in MATLAB.
I searched and found this answer, but it doesn't seem to work in my case because:
I use a double matrix instead of an integer matrix, which can't be mapped directly to bitmap colors
The imagesc call takes 4 parameters, including the axis bounds given by the vectors x and y
imagesc also supports scaling (I actually don't know how it works)
Does anyone have any suggestion?
Update: Here is the result of Armadillo's save method. It doesn't look like the spectrogram image above. Am I missing something?
spec_t.save("spec_t.png", pgm_binary);
Update 2: saving the spectrogram with db and abs applied:
mat mag_spec_t = db(abs(spec_t)); // where the db method is: m = 10 * log10(m);
mag_spec_t.save("mag_spec_t.png", pgm_binary);
And the result:
Armadillo is a linear algebra package; AFAIK it does not provide graphics routines. If you use something like OpenCV for those then it is really simple.
See this link about opencv's imshow(), and this link on how to use it in a program.
Note that opencv (like most other libraries) uses row-major indexing (x,y) and Armadillo uses column-major (row,column) indexing, as explained here.
For scaling, it's safest to convert to unsigned char yourself. In Armadillo that would be something like:
arma::Mat<unsigned char> mat2 = arma::conv_to<arma::Mat<unsigned char>>::from(255*(mat - mat.min())/(mat.max() - mat.min()));
The t and f variables are for setting the axes, they are not part of the bitmap.
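As a minimal sketch (assuming the rescaled mat2 from above; the helper name to_cv_image is mine), you could copy the matrix into an OpenCV image and display it. The transpose converts Armadillo's column-major storage into the row-major order OpenCV expects:
#include <armadillo>
#include <opencv2/opencv.hpp>
#include <cstring>

cv::Mat to_cv_image(const arma::Mat<unsigned char>& m)
{
    arma::Mat<unsigned char> mt = m.t();             // column-major -> row-major order
    cv::Mat img((int)m.n_rows, (int)m.n_cols, CV_8UC1);
    std::memcpy(img.data, mt.memptr(), mt.n_elem);   // contiguous byte copy
    return img;
}

// usage:
// cv::imshow("spectrogram", to_cv_image(mat2));
// cv::waitKey(0);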
For just writing an image you can use Armadillo. Here is a description on how to write portable grey map (PGM) and portable pixel map (PPM) images. PGM export is only possible for 2D matrices, PPM export only for 3D matrices, where the 3rd dimension (size 3) are the channels for red, green and blue.
The reason your MATLAB figure looks prettier is that it has a colour map: a mapping of every value 0..255 to a vector [R, G, B] specifying the relative intensity of red, green and blue. A photo has an RGB value at every point:
colormap(gray);
x=imread('onion.png');
imagesc(x);
size(x)
That's the 3rd dimension of the image.
Your matrix is a 2d image, so the most natural way to show it is as grey levels (as happened for your spectrum).
x=mean(x,3);
imagesc(x);
This means that the R, G and B intensities jointly increase with the values in mat. You can put a colour map of different R,G,B combinations in a variable and use that instead, i.e. y=colormap('hot');colormap(y);. The variable y shows the R,G,B combinations for the (rescaled) image values.
It's also possible to make your own colour map (in matlab you can specify 64 R, G, and B combinations with values between 0 and 1):
z = [63:-1:0; 1:2:63 63:-2:0; 0:63]'/63;
colormap(z);
Now for increasing image values, red intensities decrease (starting from the maximum level), green intensities quickly increase then decrease, and blue values increase from minimum to maximum.
Because PPM appears (I don't know the format well) not to support colour maps, you need to specify the R,G,B values in a 3D array. For a colour order similar to z you would need to make a Cube<unsigned char> c(ysize, xsize, 3) and then for every pixel y, x in mat2, do:
c(y,x,0) = 255 - mat2(y,x);
c(y,x,1) = 255 - abs(255 - 2*mat2(y,x));
c(y,x,2) = mat2(y,x);
or something very similar.
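A minimal sketch of that loop in C++/Armadillo (assuming the rescaled mat2 from earlier; the output filename is arbitrary, and std::abs comes from <cstdlib>):
arma::Cube<unsigned char> c(mat2.n_rows, mat2.n_cols, 3);
for (arma::uword y = 0; y < mat2.n_rows; ++y) {
    for (arma::uword x = 0; x < mat2.n_cols; ++x) {
        int v = mat2(y, x);
        c(y, x, 0) = 255 - v;                      // red decreases with value
        c(y, x, 1) = 255 - std::abs(255 - 2 * v);  // green peaks mid-range
        c(y, x, 2) = v;                            // blue increases with value
    }
}
c.save("spec_colour.ppm", arma::ppm_binary);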
You may use SigPack, a signal processing library on top of Armadillo. It has spectrogram support and you may save the plot to a lot of different formats (png, ps, eps, tex, pdf, svg, emf, gif). SigPack uses Gnuplot for the plotting.
Related
I am looking for an idiomatic and efficient solution for this problem:
Let's say I have a 3D tensor representing an image with 100*100 pixels on 3 color channels,
Eigen::Tensor<int, 3> input(3,100,100);
The output I would like to get could be stored in
Eigen::Tensor<int, 4> output(3,3,100,100);
I would like to project the 3D input into the 4D output such that each color channel of the input gets its own individual 3D tensor in the output, whose three channels all contain the same values, that is
tensor(0,0,42,42) = tensor(0,1,42,42) = tensor(0,2,42,42)
tensor(0,0,12,12) = tensor(0,1,12,12) = tensor(0,2,12,12)
Illustrated on a picture:
Originally I wanted to solve it this way:
Chip out the individual color channels.
Broadcast each color channel to the size I need.
Reshape the broadcast result into the desired format (this is just a 3D tensor at this point).
Concatenate the individual 3D tensors into a big 4D one.
I have two problems with this approach.
Firstly, I just cannot get the reshaping right: it always gives back a reshaped tensor with the dimensionality I want, but the coefficients get shuffled. I started to experiment with the layout of the tensors, but it did not seem to help.
Secondly, this seems to be very tedious, I just feel like there should be a more convenient way to achieve this but I could not find any cue about that in the documentation.
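For what it's worth, a reshape followed by a broadcast may sidestep the chip/concatenate dance entirely; here is a minimal sketch, assuming Eigen's default column-major tensor layout:
#include <unsupported/Eigen/CXX11/Tensor>

Eigen::Tensor<int, 3> input(3, 100, 100);
Eigen::Tensor<int, 4> output(3, 3, 100, 100);

// Insert a size-1 axis after the channel axis, then replicate it 3 times.
Eigen::array<Eigen::Index, 4> shape{{3, 1, 100, 100}};
Eigen::array<Eigen::Index, 4> bcast{{1, 3, 1, 1}};
output = input.reshape(shape).broadcast(bcast);
// Now output(c, k, y, x) == input(c, y, x) for k = 0, 1, 2.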
I have some raw images to debayer then apply colour corrections/transforms to. I use OpenCV and C++, and for the image sensor used the linear matrix coefficients are:
1.32 -0.46 0.14
-0.36 1.25 0.11
0.08 -1.96 1.88
I am not sure how to apply these to the image. It's not clear to me what I am supposed to do with them and why.
Can anyone explain what these colour reproduction or colour matrix values are, and how to use them to process an image?
Thank you!
Your question is not clear because it seems you also don't know what to do.
"what I am supposed to do with them"
The first thing that comes to mind: you can convolve the image with that matrix by using filter2D. According to the filter2D documentation:
Convolves an image with the kernel.
The function applies an arbitrary linear filter to an image. In-place
operation is supported. When the aperture is partially outside the
image, the function interpolates outlier pixel values according to the
specified border mode.
Here is an example code snippet showing how to use it:
Mat output;
Mat kernelMatrix = (Mat_<double>(3, 3) << 1.32, -0.46, 0.14,
-0.36, 1.25, 0.11,
0.08, -1.96, 1.88);
filter2D(rawImage, output, -1, kernelMatrix);
Before debayering you have an array B (-ayer) of MxN filtered "greylevel" values. They are physically filtered in the sense that the number of photons measured by each of them is affected by the color filter on top of each sensor site.
After debayering you have an array C (-olor) of MxNx3 BGR values, obtained by (essentially) reindexing the B array. However, the 3 values at a (row, col) image location represent 3 physical measurements. This is not the final image because we still need to "convert" the physical measurements to numbers that are representative of color channels as perceived by a human (or, more generally, by the intended user, which could also be some kind of image processing software). That is, the physical values need to be mapped to a color space.
The 3x3 "color correction" matrix you have represents one possible mapping - a simple linear one. You need to apply it in turn to each BGR triple at all (row, col) pixel locations. For example (in python/numpy/cv2):
import numpy as np

def colorCorrect(img, M):
    """Applies a color correction M to a BGR image img"""
    rows, cols, depth = img.shape
    assert depth == 3
    assert M.shape == (3, 3)
    img_corr = np.zeros((rows, cols, 3), dtype=img.dtype)
    for r in range(rows):
        for c in range(cols):
            img_corr[r, c, :] = M.dot(img[r, c, :])
    return img_corr
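For a C++ equivalent, OpenCV's cv::transform applies a matrix to every pixel's channel triple, which is exactly this per-pixel mapping; a minimal sketch (the name colorCorrect mirrors the Python above):
#include <opencv2/opencv.hpp>

cv::Mat colorCorrect(const cv::Mat& img, const cv::Mat& M)
{
    CV_Assert(img.channels() == 3 && M.rows == 3 && M.cols == 3);
    cv::Mat out;
    cv::transform(img, out, M);  // out(r, c) = M * img(r, c) for every pixel
    return out;
}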
I need to perform a threshold operation on an RGB image. The thresholding that I intend to do should behave as follows.
If the greyscale equivalent of a pixel (calculated as 0.299 * R' + 0.587 * G' + 0.114 * B') is Y, then the pixel value of the output image will be:
P = Threshold_color,   if Y < threshold_value
P = (R,G,B), the original value,   otherwise
where Threshold_color is an RGB color value.
I wanted to perform this operation using the Intel IPP library. There I found a few APIs related to thresholding of images (ippiThreshold_LTVal_8u_C3R).
But these methods seem to work on only one data point at a time, whereas the thresholding I want to do depends on the combination of 3 different values (R, G, B).
Is there a way to achieve this through IPP library?
Suggested approach:
Copy the image into a greyscale image
Create a binary mask 0/1 (same size as greyscale image) using the threshold
Multiply this mask with the replacement color you want to generate an overlay
Apply the overlay to the original image.
Note that you're generating images of different types here: first greyscale, then black&white, and finally color images again (although in step 3 it's a monochromatic image)
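A minimal sketch of those steps, shown with OpenCV rather than IPP for illustration (the function and parameter names are placeholders):
#include <opencv2/opencv.hpp>

cv::Mat thresholdColor(const cv::Mat& bgr, double threshold_value, cv::Scalar threshold_color)
{
    cv::Mat grey;
    cv::cvtColor(bgr, grey, cv::COLOR_BGR2GRAY);  // Y = 0.299 R + 0.587 G + 0.114 B
    cv::Mat mask = grey < threshold_value;        // 255 where Y < threshold, else 0
    cv::Mat out = bgr.clone();
    out.setTo(threshold_color, mask);             // replace masked pixels with the fill colour
    return out;
}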
Yes you can implement this using IPP but I'm not aware of any standard function that does what you want.
All IPP threshold operations I can find in the reference use a global threshold.
I am looking for a general algorithm to smoothly transition between two colors.
For example, this image is taken from Wikipedia and shows a transition from orange to blue.
When I try to do the same using my code (C++), the first idea that came to mind was to use the HSV color space, but annoying in-between colors show up.
What is a good way to achieve this? Does it relate to reducing contrast, or should I use a different color space?
I have done tons of these in the past. The smoothing can be performed many different ways, but the way they are probably doing it here is a simple linear approach. That is to say, for each R, G, and B component, they simply figure out the "y = m*x + b" equation that connects the two endpoints, and use that to compute the components in between.
m[RED] = (ColorRight[RED] - ColorLeft[RED]) / PixelsWidthAttemptingToFillIn
m[GREEN] = (ColorRight[GREEN] - ColorLeft[GREEN]) / PixelsWidthAttemptingToFillIn
m[BLUE] = (ColorRight[BLUE] - ColorLeft[BLUE]) / PixelsWidthAttemptingToFillIn
b[RED] = ColorLeft[RED]
b[GREEN] = ColorLeft[GREEN]
b[BLUE] = ColorLeft[BLUE]
Any new color in between is now:
NewCol[pixelXFromLeft][RED] = m[RED] * pixelXFromLeft + ColorLeft[RED]
NewCol[pixelXFromLeft][GREEN] = m[GREEN] * pixelXFromLeft + ColorLeft[GREEN]
NewCol[pixelXFromLeft][BLUE] = m[BLUE] * pixelXFromLeft + ColorLeft[BLUE]
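In C++ that might look like the following minimal sketch (the helper name, colour representation and width handling are placeholders):
#include <array>
#include <vector>

std::vector<std::array<unsigned char, 3>> linearGradient(
    std::array<unsigned char, 3> left,
    std::array<unsigned char, 3> right,
    int width)
{
    std::vector<std::array<unsigned char, 3>> row(width);
    for (int ch = 0; ch < 3; ++ch) {
        double m = (right[ch] - left[ch]) / double(width - 1);               // slope per pixel
        for (int x = 0; x < width; ++x)
            row[x][ch] = static_cast<unsigned char>(m * x + left[ch] + 0.5); // y = m*x + b
    }
    return row;
}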
There are many mathematical ways to create a transition; what we really want to know is what transition you actually want to see. If you want the exact transition from the above image, it is worth looking at its color values. I wrote a program a while back to inspect such images and output their values graphically. Here is the output of my program for the above pseudocolor scale.
Based on the graph, it IS more complex than linear, as I stated above. The blue component looks mostly linear, the red could be approximated as linear, but the green has a more rounded shape. We could perform a mathematical analysis of the green channel to better understand its function and use that instead. You may find that a linear interpolation with an increasing slope between 0 and ~70 pixels and a decreasing slope after pixel 70 is good enough.
If you look at the bottom of the screen, this program gives some statistical measures of each color component, such as min, max, and average, as well as how many pixels wide the image read was.
A simple linear interpolation of the R,G,B values will do it.
trumpetlicks has shown that the image you used is not a pure linear interpolation. But I think an interpolation gives you the effect you're looking for. Below I show an image with a linear interpolation on top and your original image on the bottom.
And here's the (Python) code that produced it:
for y in range(height/2):
    for x in range(width):
        p = x / float(width - 1)
        r = int((1.0-p) * r1 + p * r2 + 0.5)
        g = int((1.0-p) * g1 + p * g2 + 0.5)
        b = int((1.0-p) * b1 + p * b2 + 0.5)
        pix[x,y] = (r,g,b)
The HSV color space is not a very good color space to use for smooth transitions. This is because the h value, hue, is just used to arbitrarily define different colors around the 'color wheel'. That means if you go between two colors far apart on the wheel, you'll have to dip through a bunch of other colors. Not smooth at all.
It would make a lot more sense to use RGB (or CMYK). These 'component' color spaces are better defined to make smooth transitions because they represent how much of each 'component' a color needs.
A linear transition (see #trumpetlicks answer) for each component value, R, G and B, should look 'pretty good'. Anything more than 'pretty good' is going to require an actual human to tweak the values, because there are differences and asymmetries in how our eyes perceive color values in different color groups that aren't represented in either RGB or CMYK (or any standard).
The wikipedia image is using the algorithm that Photoshop uses. Unfortunately, that algorithm is not publicly available.
I've been researching into this to build an algorithm that takes a grayscale image as input and colorises it artificially according to a color palette:
(figure: grayscale input and colorised output)
Just like many of the other solutions, the algorithm uses linear interpolation to make the transition between colours. With your example, smooth_color_transition() should be invoked with the following arguments:
QImage input("gradient.jpg");
QVector<QColor> colors;
colors.push_back(QColor(242, 177, 103)); // orange
colors.push_back(QColor(124, 162, 248)); // blue-ish
QImage output = smooth_color_transition(input, colors);
output.save("output.jpg");
A comparison of the original image VS output from the algorithm can be seen below:
(output)
(original)
The visual artefacts that can be observed in the output are already present in the input (grayscale). The input image got these artefacts when it was resized to 189x51.
Here's another example that was created with a more complex color palette:
(figure: grayscale input and colorised output with the more complex palette)
Seems to me like it would be easier to create the gradient using RGB values. You should first calculate the change in color for each value based on the width of the gradient. The following pseudocode would need to be done for R, G, and B values.
redDifference = (redValue2 - redValue1) / widthOfGradient
You can then render each pixel with these values like so:
for (int i = 0; i < widthOfGradient; i++) {
    int r = round(redValue1 + i * redDifference)
    // ...repeat for green and blue
    drawLine(i, r, g, b)
}
I know you specified that you're using C++, but I created a JSFiddle demonstrating this working with your first gradient as an example: http://jsfiddle.net/eumf7/
I have a matrix (Mat) of doubles, in the range [0,1].
When I save it with the command imwrite, the resulting image is totally black.
I suppose the problem is a casting problem, but I don't know how to solve it.
Thanks
The only way for OpenCV to store an array of doubles without converting them to another format (and losing information) is by using FileStorage. imwrite is restricted to arrays of 'char' or 'short'.
You get a totally black image because all pixel values lie within [0,1] (effectively either 0 or 1 once saved), which is close to total black whether the image is grey-scale or colour.
To save the matrix as an image with a normal range of colour, first transform the double matrix to the range [0, 255] by multiplying each value by 255. Remember to transform back (divide each value by 255) if you later load the matrix from this image.
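A minimal sketch of that scaling in C++ with OpenCV, assuming mat is the double matrix from the question:
#include <opencv2/opencv.hpp>

cv::Mat img8u;
mat.convertTo(img8u, CV_8U, 255.0);  // multiply each value by 255 and cast to unsigned char
cv::imwrite("output.png", img8u);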