If I have a straight line that measures from 0 to 1, and I have colorA(255,0,0) at 0 on the line, then at 0.3 I have colorB(20,160,0), then at 1 on the line I have colorC(0,0,0). How could I find the color at 0.7?
Thanks
[Elaborating on my comments to Patrick -- it all got a bit out of hand!]
This is really an interesting question, and one of the reasons it's interesting is that there is no "right" answer. Colour representation and data interpolation can both be done in many different ways. You need to tailor your approach appropriately for your problem domain. Since we haven't been given much a priori information about that domain, we can only explore some of the possibilities.
The business of colours muddies the water somewhat, so let's put that aside temporarily and think first about interpolation of simple scalars.
Excursion into interpolation
Let's say we have some data points like this:
What we want to find is what the y value on this graph will be at points along the x axis other than the ones for which we have known values. That is, we're looking for a function
y = f(x)
that passes through these points.
Clearly, there are many different ways we could choose to join the dots. We might just link them up with straight line segments. Or we might want a smooth curve, and that curve could be simple or arbitrarily complicated:
Here, it's easy to understand where the red lines come from -- we're just drawing straight lines from one known point to the next. The green line also looks sort of reasonable, though we're adding some assumptions about how the curve should behave. The blue line, on the other hand, would be hard to justify on the basis of the data points alone, but there might be circumstances where we have reasons to model the system using such a shape -- as we'll see a bit later.
Note also the dotted lines heading off the sides of each curve. These are going beyond the known points, which is called extrapolation rather than interpolation. This is often a bit more questionable than interpolation. At least when you're going from one known point to another you have some evidence you're headed the right way. Whereas the further you get from the known points, the more likely it is you'll stray from the path. But still, it's a useful technique and it may well make sense.
OK, so the picture is pretty and all, but how do we generate these lines?
We need to find an f(x) that will give us the desired y. There are many different functions we could use depending on the circumstances, but by far the most common is to use a polynomial function, ie:
f(x) = a0 + a1 * x + a2 * x * x + a3 * x * x * x + ....
Now, given N distinct data points, it is always possible to find a perfectly-fitting curve using a polynomial of degree N-1 -- a straight line for two points, a parabola for three, a cubic for four, etc -- but unless the data are unusually well-behaved the curve tends to go haywire as the degree goes up. So unless there is a good reason to believe that the data's behaviour is well modelled by higher-degree polynomials, it's usual instead to interpolate the data piecewise, fitting a different curve segment between each successive pair of points.
Typically, each segment is modelled as a polynomial of degree 1 or 3. The former is just a straight line, and is used because it is really easy and requires no information other than the two data points themselves. The latter is a cubic spline, and is used because it is the simplest polynomial that gives you a smooth transition across each point. However, calculating it is a bit more complex and requires two extra pieces of information for each segment (exactly what those pieces are depends on the particular spline form you use).
Linear interpolation is easy enough to do in one line of code. If our first data point is (x1, y1) and our second is (x2, y2), then the linearly-interpolated y at any intermediate x is:
y = y1 + (x - x1) * (y2 - y1) / (x2 - x1)
(Variations on this appear in some of the other answers.)
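In code, that's a one-liner. A minimal C++ sketch (the function name is mine, purely for illustration):

    // y value at x on the straight line through (x1, y1) and (x2, y2)
    double lerp(double x, double x1, double y1, double x2, double y2)
    {
        return y1 + (x - x1) * (y2 - y1) / (x2 - x1);
    }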
Cubic splines are a bit too involved to go into here, but Google should turn up some decent references. They are arguably overkill in this case anyway, since you only have three points.
Back to the question
With all that under our belts, let's take a look at the problem as described:
We have a line (shown here in grey) whose colour is known only at three points, and we're asked to calculate what the colour will be at some other point. Since we don't have any knowledge of how the colours are distributed, we have to make some assumptions.
One key assumption is that colour changes along the line will be continuous, at least to some approximation. Clearly, if the line really looked like this:
then all that earlier stuff about interpolation would go out the window. We'd have no basis for deciding what colour any part should be and should just give up and go home. Let's assume that's not the case and we have something to interpolate.
The known colours are specified as RGB. In this representation, each channel is a separate scalar value and we can choose to treat it as completely independent of the others. So, one perfectly reasonable approach would be to do a piecewise linear interpolation of each channel and then recombine the results.
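As a rough C++ sketch of that approach, applied to the three colours from the question (the struct and function names are mine, and the hard-coded stop positions are just for illustration):

    #include <cstdio>

    struct Rgb { double r, g, b; };   // channels in 0..255

    // Linear interpolation of one channel between two known points.
    static double lerp(double x, double x1, double y1, double x2, double y2)
    {
        return y1 + (x - x1) * (y2 - y1) / (x2 - x1);
    }

    // Piecewise-linear interpolation along the line 0..1, with known colours
    // at t = 0.0, 0.3 and 1.0. Each RGB channel is treated independently.
    Rgb colourAt(double t)
    {
        const Rgb a{255, 0, 0};    // colour at 0.0
        const Rgb b{20, 160, 0};   // colour at 0.3
        const Rgb c{0, 0, 0};      // colour at 1.0

        const Rgb& lo = (t <= 0.3) ? a : b;
        const Rgb& hi = (t <= 0.3) ? b : c;
        const double x1 = (t <= 0.3) ? 0.0 : 0.3;
        const double x2 = (t <= 0.3) ? 0.3 : 1.0;

        return { lerp(t, x1, lo.r, x2, hi.r),
                 lerp(t, x1, lo.g, x2, hi.g),
                 lerp(t, x1, lo.b, x2, hi.b) };
    }

    int main()
    {
        Rgb col = colourAt(0.7);
        std::printf("%.1f %.1f %.1f\n", col.r, col.g, col.b);  // roughly 8.6 68.6 0.0
    }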
Doing so gives us something like this:
This is fine as far as it goes, but there are aspects of the result we might not like. One is that the transition from red to green passes through a pretty murky grey-brown. Another is that the green peak at 0.3 is a bit sharp.
It's important to note that, in the absence of a more comprehensive specification, these are really just aesthetic concerns. Our technique is perfectly sound, but it isn't giving quite the sort of results we might want. This sort of thing depends on our specific problem domain, and ultimately it's all a matter of choice.
Since we have just three data points -- and since Hans Passant suggested it -- perhaps we could instead try fitting a parabola to model the whole curve on each channel? It's true we don't have any reason to think that this is a good model, but it doesn't hurt to try:
The differences between this gradient and the last one are instructive. The quadratic has smoothed things out, but has also overshot dramatically. Remember, the green channel both starts and ends at 0. A parabola is symmetrical, so its maximum has to be in the middle. The only way it can fit a green rise towards 0.3 is by continuing to rise all the way to 0.5. (There's a similar effect in the red channel, but it's less obvious because in this case it's an undershoot and the value is clamped to 0.)
Do we have any evidence that this sort of shape is really there in our colour line? Nope: we've explicitly introduced it via our choice of model. That doesn't invalidate it -- we might have good reasons for wanting it to work that way -- once again it's a matter of choice.
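If you want to try it anyway, a quick C++ sketch of that per-channel quadratic might look like this, using the Lagrange form of the parabola through the three known points and clamping just as the red channel is clamped above (the function name and clamping range are mine):

    #include <algorithm>

    // Value at t of the unique parabola through (0.0, y0), (0.3, y1), (1.0, y2),
    // written in Lagrange form. Call once per RGB channel.
    double quadraticChannel(double t, double y0, double y1, double y2)
    {
        const double x0 = 0.0, x1 = 0.3, x2 = 1.0;
        double v = y0 * (t - x1) * (t - x2) / ((x0 - x1) * (x0 - x2))
                 + y1 * (t - x0) * (t - x2) / ((x1 - x0) * (x1 - x2))
                 + y2 * (t - x0) * (t - x1) / ((x2 - x0) * (x2 - x1));
        return std::clamp(v, 0.0, 255.0);   // clamp over- and undershoots
    }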
But what about HSV?
So far we've stuck to the original RGB colour space, but as various people rushed to point out this can be less than ideal for interpolation: colour and intensity are bound together in RGB, so interpolating between two different full-intensity colours typically takes you through some drab low-intensity intermediate.
In HSV representation, colour and intensity are on separate dimensions, so we don't get that problem. Why not convert and do the interpolation in that space instead?
This immediately gives rise to a difficulty -- or at any rate, another decision. The mapping between HSV and RGB is not bijective; in particular, black, which happens to be one of our three data points, is a single point in RGB space but occupies a whole plane in HSV. We cannot interpolate between a point and a plane, so we need to pick one specific HSV black point to go to.
This is the basis of Patrick's ingenious solution, in which H and S are specifically chosen to make the whole colour gradient linear. The result looks something like this:
This looks a lot prettier and more colourful than the previous attempts, but there are a few things we might quibble with.
One important issue is that in the case of V, where we still have definite data at all three points, those data are not actually linear, and so the linear fit is only an approximation. What this means is that the value we see here at 0.3 is not quite what it should be.
Another, to my mind bigger, quibble: where is all that blue coming from? All three of our known RGB data points have B=0. It seems a bit strange to suddenly introduce a whole bunch of blue for which we seem to have no evidence at all. Check out this graph of the RGB components in Patrick's HSV interpolation:
The reason for the blue is, of course, that in switching colour spaces we have specifically selected a model where if you keep on going from green you inevitably get to blue. At the same time, we have had to discard one of our hue data points, and have chosen to fill it in by linear extrapolation from the other two, which means we do just keep on going, from green to blue to over the hills and far away.
Once again, this isn't invalid, and we might have a good reason for doing it this way. But to my mind it's a little bit more out there than the previous piecewise-linear example, because of that extrapolation.
So is this the only way to do it in HSV? Of course not. There are always more choices.
For example, instead of choosing H and S values at 1.0 to maximise linearity, how about we instead choose them to minimise change in a piecewise linear interpolation? As it happens, for S the two strategies coincide: it is 100 at both points, so we make it 100 at the end too. The two segments are collinear. For H, though, we just keep it the same over the second segment. This gives a result like this:
This is not quite as cute as the previous one, but it seems slightly more plausible to me. Once again, though, that's largely an aesthetic judgement. That doesn't make this the "right" answer, any more than it makes Patrick's or any of the others "wrong". As I said way back at the start, there is no "right" answer. It's all a matter of making choices according to your needs in a particular problem -- and being aware that you have made those choices and how they affect your outcome.
Try to convert this to another color-representation, e.g. HSV (see http://en.wikipedia.org/wiki/HSL_and_HSV).
Color A has a Hue of 0, a Saturation of 1 and a Value of 1.
Color C has a Hue of ?, a Saturation of ? and a Value of 0.
? means that it actually doesn't matter (since color C is simply black).
Now also convert color B to HSV (I can't do this in my head, sorry), then choose nice values for the Hue and Saturation of color C so that the Hue, Saturation and Value all lie on one line in HSV space. Then derive the color at 0.7 from it.
EDIT: Using the RGB-HSV calculator at http://www.csgnetwork.com/csgcolorsel4.html I calculated the following:
Color A: H: 0, S: 100, V: 100
Color B: H: 113, S: 100, V: 63
Now we calculate the H and S of color C like this:
H: 113/0.3 = 376.6
S: 100 (since both A and B have an S of 100)
This gives us for the color at 0.7:
H = 376.66*0.7 = 263.66
S = 100
V = somewhere around 30
Unfortunately, this doesn't fit completely for the Value, but if you interpolate this way, you will get something that's very close to what you want.
In the abstract, you're asking for the value of f(0.7) given that f(0.0) = (255, 0, 0), f(0.3) = (20, 160, 0), and f(1.0) = (0, 0, 0). As stated, this isn't well-defined as f could be any of an infinite number of functions.
Some possible choices for f are:
A series of line segments in RGB space
A series of line segments after mapping to HSV or HSL space
A cubic spline in RGB space.
A cubic spline in HSV/L space.
But the particular choice of how f should be defined should depend on what you're doing, or how the color stops are defined for your application.
The correct way to interpolate RGB is to correct for gamma first, making the color components linear, perform an interpolation, and then convert back. Otherwise you will get errors due to the values being nonlinear. See http://www.4p8.com/eric.brasseur/gamma.html for more information (although the article is about scaling, scaling is usually done by interpolating pixel values).
If you have sRGB values, you can convert between linear and nonlinear using the formulas here: http://en.wikipedia.org/wiki/SRGB (use the Csrgb and Clinear formulas)
On the other hand, the errors are not huge, so it depends on your application if they matter.
For the interpolation itself, simple linear interpolation should suffice (as others here have noted). See http://en.wikipedia.org/wiki/Linear_interpolation for the specifics.
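As a minimal sketch of that round trip for one channel, using the standard sRGB formulas from the Wikipedia page linked above (the function names are mine):

    #include <cmath>

    // sRGB (0..1) -> linear light, per the piecewise sRGB transfer function.
    double srgbToLinear(double c)
    {
        return (c <= 0.04045) ? c / 12.92 : std::pow((c + 0.055) / 1.055, 2.4);
    }

    // Linear light -> sRGB (0..1), the inverse transform.
    double linearToSrgb(double c)
    {
        return (c <= 0.0031308) ? 12.92 * c : 1.055 * std::pow(c, 1.0 / 2.4) - 0.055;
    }

    // Gamma-correct linear interpolation of one 0..255 sRGB channel.
    double interpolateChannel(double a, double b, double t)
    {
        double la = srgbToLinear(a / 255.0);
        double lb = srgbToLinear(b / 255.0);
        return 255.0 * linearToSrgb(la + t * (lb - la));
    }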
It looks like you are trying to interpolate in the wrong color space. First convert to HSV; then your colors become:
RGB -> HSV
255,0,0 -> 0, 255,255
20,160,0 -> 80,255,160
0,0,0 -> 0,255,0
So it's still confusing, but at least we can see that the V value is interpolating to zero. The H value is confusing too, but after realizing that HSV is modeled as a cylinder, we can see that it is really just interpolating from 0 to 256 mod 256, so it just goes back to zero.
So your equations would be (in HSV):
nH = frac*256 mod 256
nS = 255
nV = (1-frac)*255
So if you substitute 0.7 for frac you will get the right calculation in HSV coordinates. If you need to go back to RGB, you should see if your libraries provide such a feature (I know Java's Color class supports this), but if not you can just grab the code from here. It shows how to do color space conversions in C.
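If your library doesn't provide one, here is a rough C++ sketch that evaluates the formulas above and then applies the standard sector-based HSV-to-RGB conversion (this is the textbook algorithm, not the code from the linked page; the function name is mine):

    #include <cmath>

    // Evaluate nH, nS, nV at frac (all on a 0..255 scale as above), then convert
    // back to RGB. H is rescaled to degrees and S, V to 0..1 for the conversion.
    void colourAtFrac(double frac, int& r, int& g, int& b)
    {
        double h = std::fmod(frac * 256.0, 256.0) * 360.0 / 256.0;
        double s = 1.0;               // nS = 255
        double v = 1.0 - frac;        // nV = (1 - frac) * 255

        double c = v * s;                                                // chroma
        double x = c * (1.0 - std::fabs(std::fmod(h / 60.0, 2.0) - 1.0));
        double m = v - c;
        double rp, gp, bp;
        if      (h <  60) { rp = c; gp = x; bp = 0; }
        else if (h < 120) { rp = x; gp = c; bp = 0; }
        else if (h < 180) { rp = 0; gp = c; bp = x; }
        else if (h < 240) { rp = 0; gp = x; bp = c; }
        else if (h < 300) { rp = x; gp = 0; bp = c; }
        else              { rp = c; gp = 0; bp = x; }
        r = static_cast<int>((rp + m) * 255.0 + 0.5);
        g = static_cast<int>((gp + m) * 255.0 + 0.5);
        b = static_cast<int>((bp + m) * 255.0 + 0.5);
    }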
Good luck
Related
I hope you are doing well. I am stuck at one part of a visual effect program in C++, and wanted to ask for help.
I have an array of colors at random positions on an image. There can be any number of these "subpixels" that fall over top of any given pixel. The subpixels that overlap a pixel can be at any position within the pixel, since they're distributed randomly throughout the image. All I have access to is their position on the image and their color, which represents what the color should be at that precise subpixel point on the image.
I need to determine what color to make each pixel of the image. In other words, I need to interpolate what the color should be at the centre of each pixel.
Here is a diagram with an example of this on a 5x5 image:
I need to go from this:
To this:
If it aids your understanding, you can think of the first image as a series of random points whose color values were calculated using bilinear interpolation on the second image.
I am writing this in C++, and ideally it will be as fast as possible, but I welcome contributions in any language or just explained with symbols or words. It should be as accurate as possible, but I also welcome solutions that are slightly inaccurate in favour of performance or simplicity.
Please let me know if you need clarification on the problem.
Thank you.
I ended up finding quite a decent solution which, while it doesn't find the absolutely 100% technically correct color for each pixel, was more than good enough and acceptably fast, especially when I added multithreading.
I first create a vector for each pixel/cell that contains pointers to subpixels (points with known colors). When I create a subpixel, I add a pointer to it to the vector representing the pixel/cell that it overlaps and to each of the vectors representing the pixels/cells directly adjacent to the one it overlaps.
Then, I split each pixel/cell into n sub-cells (I found 8 works well). This is not as expensive as you might imagine, because I only have to calculate & compare the distance for those subpixels that are in that pixel/cell's subpixel pointer vector. For each sub-cell, I calculate which subpixel is the closest to its centre. That subpixel's color then contributes 1/nth of the color for that pixel/cell.
I found it was important to add the subpixel pointers to adjacent cell/pixel vectors, so that each sub-cell can take into account subpixels from adjacent pixels/cells. This even makes it produce a reasonable color when there are pixels/cells that have no subpixels overlapping them (as long as the neighboring pixels/cells do).
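For reference, here is a rough C++ sketch of that scheme under some simplifying assumptions: a single-channel float colour, an n-by-n grid of sub-cells per pixel rather than exactly the 8 mentioned above, and my own type names. It is meant to show the bookkeeping rather than be drop-in code:

    #include <limits>
    #include <vector>

    struct Subpixel { float x, y, colour; };   // position in image space plus known colour

    // Interpolate a width*height image from randomly placed subpixels by splitting
    // each pixel into n*n sub-cells and averaging the nearest subpixel per sub-cell.
    std::vector<float> interpolate(const std::vector<Subpixel>& subpixels,
                                   int width, int height, int n = 3)
    {
        // 1) For each pixel/cell, collect pointers to subpixels falling in it
        //    and in its directly adjacent cells.
        std::vector<std::vector<const Subpixel*>> cells(width * height);
        for (const Subpixel& sp : subpixels) {
            int cx = static_cast<int>(sp.x), cy = static_cast<int>(sp.y);
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    int nx = cx + dx, ny = cy + dy;
                    if (nx >= 0 && nx < width && ny >= 0 && ny < height)
                        cells[ny * width + nx].push_back(&sp);
                }
        }

        // 2) For each sub-cell centre, take the colour of the nearest candidate
        //    subpixel; each sub-cell contributes 1/(n*n) of the pixel's colour.
        std::vector<float> out(width * height, 0.0f);
        for (int py = 0; py < height; ++py)
            for (int px = 0; px < width; ++px) {
                const auto& candidates = cells[py * width + px];
                if (candidates.empty()) continue;          // no subpixels nearby
                float sum = 0.0f;
                for (int sy = 0; sy < n; ++sy)
                    for (int sx = 0; sx < n; ++sx) {
                        float ccx = px + (sx + 0.5f) / n;  // sub-cell centre
                        float ccy = py + (sy + 0.5f) / n;
                        float best = std::numeric_limits<float>::max();
                        float colour = 0.0f;
                        for (const Subpixel* sp : candidates) {
                            float d = (sp->x - ccx) * (sp->x - ccx) + (sp->y - ccy) * (sp->y - ccy);
                            if (d < best) { best = d; colour = sp->colour; }
                        }
                        sum += colour;
                    }
                out[py * width + px] = sum / (n * n);
            }
        return out;
    }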
Thanks for all the comments so far; any ideas about how to speed this up would be appreciated as well.
I'm working on an application were I have a set of Contours(each one representing a Potential Line) and I wanna check "How straight" is that contour/shape.
The article I am using as a reference uses the following technique:
It matches a "segmented" line crossing the shape, like so:
Then grading how "straight" is the line.
Here's an example of the contours I am working on:
How would you go about implementing this technique?
Is there any other way of checking "how straight" a contour/shape is?
Regards!
My first guess would be to use a coefficient of determination. That is, fit a line to all your points, assuming some reasonable origin where you won't run into rounding errors, and calculate R^2.
A more advanced approach, if all contours are disconnected components, would be to calculate the structure model index (the link is for bone morphometry, but they explain the concept and cite the original paper.) This gives you a number that tells you how much your segment is "like a rod". This is just an idea, though. Anything that forms curves or has branches will be less and less like a rod.
I would say that it also depends on what you are using the metric for and whether your contours always run generally left to right.
An additional method would be to create the covariance matrix of your points, calculate the eigenvalues of that matrix, and take their ratio (where the ratio is greater than or equal to 1; otherwise, invert it). This is the basic principle behind PCA, apart from the final ratio. If you have a rather linear data set (the data set varies in only one direction) then you will have a very large ratio. As the data set becomes less and less linear (or more uncorrelated), the ratio approaches one. A perfectly linear data set would give a ratio of infinity and a perfect circle a ratio of one (I believe, but I would appreciate it if someone could verify this for me). Also, working in two dimensions means the calculation is computationally cheap and straightforward.
This would handle outliers very well and would be invariant to the rotation and shape of your contour. You also have a number which is always positive. The only issue would be preventing overflow when dividing the two eigenvalues. Then again you could always divide the smaller eigenvalue by the larger and your metric would be bound between zero and one, one being a circle and zero being a straight line.
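A minimal C++ sketch of that bounded ratio for a 2D point set, using the closed-form eigenvalues of the 2x2 covariance matrix (the names are mine):

    #include <cmath>
    #include <vector>

    struct Pt { double x, y; };

    // Ratio of the smaller to the larger eigenvalue of the points' covariance
    // matrix: near 0 for a straight contour, approaching 1 as it becomes round.
    double straightnessRatio(const std::vector<Pt>& pts)
    {
        if (pts.size() < 2) return 0.0;            // degenerate input, treat as straight
        double mx = 0, my = 0;
        for (const Pt& p : pts) { mx += p.x; my += p.y; }
        mx /= pts.size(); my /= pts.size();

        double cxx = 0, cyy = 0, cxy = 0;
        for (const Pt& p : pts) {
            cxx += (p.x - mx) * (p.x - mx);
            cyy += (p.y - my) * (p.y - my);
            cxy += (p.x - mx) * (p.y - my);
        }
        cxx /= pts.size(); cyy /= pts.size(); cxy /= pts.size();

        // Closed-form eigenvalues of [[cxx, cxy], [cxy, cyy]].
        double half = 0.5 * (cxx + cyy);
        double d = std::sqrt(0.25 * (cxx - cyy) * (cxx - cyy) + cxy * cxy);
        double lmax = half + d, lmin = half - d;
        return (lmax > 0.0) ? lmin / lmax : 0.0;
    }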
Either way, you would need to test if this parameter is sensitive enough for your application.
One example for a simple algorithm is using the dot product between two segments to determine the angle between them. The formula for dot product is:
A * B = ||A|| ||B|| cos(theta)
Solving the equation for cos(theta) yields
cos(theta) = (A * B / (||A|| ||B||))
Since cos(0) = 1, cos(pi) = -1.0 and you're checking for the "straightness" of the lines, a line whose normalization of cos(theta) angles is closest to -1.0 is the straightest.
straightness = SUM(cos(theta))/(number of line segments)
where a straight line is close to -1.0, and a non-straight line approaches 1.0. Keep in mind this is a cursory evaluation of this algorithm and it obviously has edge cases and caveats that would need to be addressed in an implementation.
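As a quick sketch of that scoring in C++, assuming the contour is an ordered list of points and both segment vectors at each vertex are taken pointing away from the shared vertex (names are mine):

    #include <cmath>
    #include <vector>

    struct Pt { double x, y; };

    // Average cos(theta) over consecutive segment pairs; close to -1 for a
    // straight contour, moving towards 1 as the contour folds back on itself.
    double straightness(const std::vector<Pt>& pts)
    {
        double sum = 0.0;
        int count = 0;
        for (std::size_t i = 1; i + 1 < pts.size(); ++i) {
            double ax = pts[i - 1].x - pts[i].x, ay = pts[i - 1].y - pts[i].y;
            double bx = pts[i + 1].x - pts[i].x, by = pts[i + 1].y - pts[i].y;
            double na = std::hypot(ax, ay), nb = std::hypot(bx, by);
            if (na == 0.0 || nb == 0.0) continue;       // skip degenerate segments
            sum += (ax * bx + ay * by) / (na * nb);     // cos(theta) at this vertex
            ++count;
        }
        return count ? sum / count : -1.0;
    }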
The trick is to use image moments. In short, you calculate the minimum inertia around an axis, the inertia around the axis perpendicular to it, and the ratio between them (which is always between 0 and 1, since the minimum inertia can be no larger than the inertia about any other axis).
For a straight line, the inertia along the line is zero, so the ratio is also zero. For a circle, the inertia is the same along all axes, so the ratio is one. Your segmented line will be 0.01 or so, as it's a fairly good match.
A simpler method is to compare the circumference of the convex polygon containing the shape with the circumference of the shape itself. For a line, they're trivially equal, and for a not too crooked shape they're still comparable.
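Assuming the contours are OpenCV point vectors, a sketch of that comparison might look like this (the function name is mine):

    #include <opencv2/imgproc.hpp>
    #include <vector>

    // Ratio of convex-hull perimeter to contour perimeter: close to 1 for a
    // straight (or convex) contour, smaller as the contour gets more crooked.
    double hullPerimeterRatio(const std::vector<cv::Point>& contour)
    {
        std::vector<cv::Point> hull;
        cv::convexHull(contour, hull);
        double contourLen = cv::arcLength(contour, true);
        double hullLen    = cv::arcLength(hull, true);
        return (contourLen > 0.0) ? hullLen / contourLen : 1.0;
    }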
I have to do camera calibration. I understand the general concept and I have it working, but in many guides it says to use many images or at the very least two with different orientation. Why exactly is this necessary? I seem to be getting reasonably good results with a single image of 14x14 points:
I find the points with cv::findCirclesGrid and use cv::calibrateCamera to find the extrinsic and intrinsic parameters. Intrinsic guess is set to false. Principal point and aspect ratio are not fixed while tangential distortion is fixed to zero.
I then use cv::getOptimalNewCameraMatrix, cv::initUndistortRectifyMap and cv::remap to restore the image.
It seems to me the result is pretty good, but am I missing something? Is it actually wrong and just waiting to cause problems for me later?
Also before you ask why I don't just use multiple images to be sure; the software I am writing will be used with a semi-fixed camera stand to calibrate several cameras one at a time. So first off the stand would need to be modified in order to position the pattern at an angle or off centre, as currently it can only be moved closer or further away. Secondly the process should not be unnecessarily slowed down by having to capture more images.
Edit: To Micka asking "what happens if your viewing angle isnt 90° on the pattern? Can you try to rotate the pattern away from the camera?". I get a somewhat similar result, although it finds less distortion. From looking at the borders with a ruler it seems that the calibration from 90° is better, but it is really hard to tell.
Having more patterns in different orientations is necessary to avoid the situation where the intrinsic parameters are very inaccurate but the pixel reprojection error of the undistortion is still low, because different errors compensate for each other.
To illustrate this point: if you only have one image taken at 90 degree viewing angle, then a change in horizontal focal length can be poorly distinguished from viewing the pattern a little bit from the side. The only clue that sets the two parameters apart is the tapering of the lines, but that measurement is very noisy. Hence you need multiple views at significant angles to separate this aspect of the pose from the intrinsic parameters.
If you know your image is viewed at 90 degrees, you can use this to your advantage but it requires modification of the opencv algorithm. If you are certain that all images will be captured from the same pose as your calibration image, then it does not really matter as the undistortion will be good even if the individual calibration parameters are inaccurate but compensating (i.e. they compensate well for this specific pose, but poorly for other poses).
As stated here, the circle pattern (in theory) gets along quite well with only a single image. The reason that you would need multiple images is the noise present in the input data.
My suggestion would be to compare the results from different input images. If the error is low, you will probably be able to get away with one sample.
I checked the paper that the OpenCV method is based on: it is Zhang Zhengyou's method.
With n = 1, you can only get the focal length.
In the paper, he says:
"If n ≥ 3, we will have in general a unique solution b defined up to a scale factor. If n = 2, we can impose the skewless constraint γ = 0, i.e., [0, 1, 0, 0, 0, 0]b = 0, which is added as an additional equation to (9). (If n = 1, we can only solve two camera intrinsic parameters, e.g., α and β, assuming u0 and v0 are known (e.g., at the image center) and γ = 0, and that is indeed what we did in [19] for head pose determination based on the fact that eyes and mouth are reasonably coplanar. In fact, Tsai [23] already mentions that focal length from one plane is possible, but incorrectly says that aspect ratio is not.)"
I was interested in how to draw a line with a specific width (or multiple lines) using a fragment shader. I stumbled on this post, which seems to explain it.
The challenge I have is understanding the logic behind it.
A couple of questions:
Our coordinate space in this example is (0.0-1.0,0.0-1.0), correct?
If so, what is the purpose of the "uv" variable? Since the thickness is 500, the "uv" variable will be very small. And therefore the distances from it to points 1 and 2 (stored in the a and b variables) will be very small too?
Finally, what is the logic behind the h variable?
I will try to answer all of your questions one by one:
1) Yes, this is in fact correct.
2) It is common in 3D computer graphics to express coordinates (within certain boundaries) as floating-point values between 0 and 1 (or between -1 and 1). First of all, this makes it quite easy to decide whether a given value crosses said boundary or not, and it abstracts away from the concept of a "pixel" as a discrete image unit; furthermore, this common practice can be found pretty much everywhere else (think of device coordinates or texture coordinates).
Don't be afraid that the values you are working with are less than one; in fact, in computer graphics you usually deal with floating-point arithmetic, and float types are quite good at expressing real values lying around 1.
3) The formula given for h consists of two parts: the square-root part and the 2/c coefficient. The square-root part should be well known from school math classes: it is Heron's formula for the area of a triangle with sides a, b, c. The 2/c factor then extracts the height of that triangle, which is stored in h and is also the distance between the point uv and the "ground line" of the triangle. This distance is then used to decide where uv lies in relation to the line p1-p2.
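To make the geometry concrete outside the shader, here is the same computation as a small C++ sketch; a fragment would then be treated as part of the line when this distance is below half the desired thickness (the names are mine):

    #include <algorithm>
    #include <cmath>

    struct Vec2 { float x, y; };

    static float dist(Vec2 p, Vec2 q) { return std::hypot(p.x - q.x, p.y - q.y); }

    // Distance from uv to the line through p1 and p2 via Heron's formula:
    // the triangle (uv, p1, p2) has area sqrt(s(s-a)(s-b)(s-c)), and h = 2*area/c.
    float distanceToLine(Vec2 uv, Vec2 p1, Vec2 p2)
    {
        float a = dist(uv, p1);
        float b = dist(uv, p2);
        float c = dist(p1, p2);
        float s = 0.5f * (a + b + c);                                  // semi-perimeter
        float area = std::sqrt(std::max(0.0f, s * (s - a) * (s - b) * (s - c)));
        return 2.0f * area / c;                                        // height over side c
    }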
I'm a student, and I've been tasked to optimize bilinear interpolation of images by invoking parallelism from CUDA.
The image is given as a 24-bit .bmp format. I already have a reader for the .bmp and have stored the pixels in an array.
Now I need to perform bilinear interpolation on the array. I do not understand the math behind it (even after going through the wiki article and other Google results). Because of this I'm unable to come up with an algorithm.
Is there anyone who can help me with a link to an existing bilinear interpolation algorithm on a 1-D array? Or perhaps link to an open source image processing library that utilizes bilinear and bicubic interpolation for scaling images?
The easiest way to understand bilinear interpolation is to understand linear interpolation in 1D.
This first figure should give you flashbacks to middle school math. Given some location a at which we want to know f(a), we take the neighboring "known" values and fit a line between them.
So we just used the old middle-school equations y=mx+b and y-y1=m(x-x1). Nothing fancy.
We basically carry over this concept to 2-D in order to get bilinear interpolation. We can attack the problem of finding f(a,b) for any a,b by doing three interpolations. Study the next figure carefully. Don't get intimidated by all the labels. It is actually pretty simple.
For bilinear interpolation, we again use the neighboring points. Now there are four of them, since we are in 2D. The trick is to attack the problem one dimension at a time.
We project our (a,b) to the sides and first compute two (one dimensional!) interpolating lines.
f(a,yj) where yj is held constant
f(a,yj+1) where yj+1 is held constant.
Now there is just one last step. You take the two points you calculated, f(a,yj) and f(a,yj+1), and fit a line between them. That's the blue one going left to right in the diagram, passing through f(a,b). Interpolating along this last line gives you the final answer.
I'll leave the math for the 2-D case for you. It's not hard if you work from the diagram. And going through it yourself will help you really learn what's going on.
One last little note: it doesn't matter which sides you pick for the first two interpolations. You could have picked the top and bottom, and then done the third interpolation line between those two instead. The answer would have been the same.
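Here is a compact C++ sketch of exactly those three interpolations, assuming a single-channel image stored row-major with known values at integer coordinates (names are mine):

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Bilinearly interpolate img (width*height, row-major) at real coordinates (a, b):
    // two 1D interpolations along x at rows yj and yj+1, then one along y between them.
    float bilinear(const std::vector<float>& img, int width, int height, float a, float b)
    {
        int xi = static_cast<int>(std::floor(a));
        int yj = static_cast<int>(std::floor(b));
        xi = std::max(0, std::min(xi, width  - 2));   // keep (xi+1, yj+1) inside the image
        yj = std::max(0, std::min(yj, height - 2));
        float tx = a - xi;                            // fractional offsets within the cell
        float ty = b - yj;

        float f00 = img[yj * width + xi],       f10 = img[yj * width + xi + 1];
        float f01 = img[(yj + 1) * width + xi], f11 = img[(yj + 1) * width + xi + 1];

        float fx0 = f00 + tx * (f10 - f00);           // f(a, yj)
        float fx1 = f01 + tx * (f11 - f01);           // f(a, yj+1)
        return fx0 + ty * (fx1 - fx0);                // f(a, b)
    }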
When you enlarge an image by scaling the sides by an integral factor, you may treat the result as the original image with extra pixels inserted between the original pixels.
See the pictures in IMAGE RESIZE EXAMPLE.
The f(x,y)=... formula in this article in Wikipedia gives you a method to compute the color f of an inserted pixel:
For every inserted pixel you combine the colors of the 4 original pixels (Q11, Q12, Q21, Q22) surrounding it. The combination depends on the distance between the inserted pixel and the surrounding original pixels, the closer it is to one of them, the closer their colors are:
The original pixels are shown as red. The inserted pixel is shown as green.
That's the idea.
If you scale the sides by a non-integral factor, the formulas still hold, but now you need to recalculate all pixel colors as you can't just take the original pixels and simply insert extra pixels between them.
Don't get hung up on the fact that 2D arrays in C are really 1D arrays. It's an implementation detail. Mathematically, you'll still need to think in terms of 2D arrays.
Think about linear interpolation on a 1D array. You know the value at 0, 1, 2, 3, ... Now suppose I ask you for the value at 1.4. You'd give me a weighted mix of the values at 1 and 2: (1 - 0.4)*A[1] + 0.4*A[2]. Simple, right?
Now you need to extend to 2D. No problem. 2D interpolation can be decomposed into two 1D interpolations, in the x-axis and then y-axis. Say you want (1.4, 2.8). Get the 1D interpolants between (1, 2)<->(2,2) and (1,3)<->(2,3). That's your x-axis step. Now 1D interpolate between them with the appropriate weights for y = 2.8.
This should be simple to make massively parallel. Just calculate each interpolated pixel separately. With shared memory access to the original image, you'll only be doing reads, so no synchronization issues.