(a) what I have, (b) what I get, (c) what I want
I have a simple vector graphic in Inkscape, which consists of a rectangle, filled points and stars. Since the axis ranges are not really nice (the height equals approximatly 3 times the width of the picture) for a publication, I want to rescale the picture. However, I do not have the raw data, such that I can plot it again. How can I rescale my graphic (see figure (a)), such that the x-range is more wide (see figure (c)) without getting distortions (see figure (b))? In the end I want to create a PDF file out of it.
Any ideas on that?
Thanks for your help.
You can try to do it in 2 steps, using the Object -> Transform tool (Shift-Ctrl-M).
First, select everything, and with the transform tool select the Scale tab, and scale horizontally by, say, 300%. All figures will be distorted.
Now, unselect the rectangle, and scale horizontally again by 33.3%, but first click on Apply to each object separately. This will undo the distortion (but not the translation) of each object.
Note that 300% followed by 33.3% should leave the individual objects with the same size.
Documentation here.
Related
I have a video of simple moving dots (that sometimes overlap) that is saved as a sequence of images. At each image I detect all the dots and save their coordinates:
(snapshot 1 -> snapshot 2)
I would like to infer the trajectory of each dot. The dots move smoothly and not too fast from one frame to the other, but if for each point of the first image I just find their closest point of the next image it often fails to reconstruct the trajectory.
I tried on opencv the multitrackers but the trackers very quickly lose their target by jumping on a different dot when the dots tend to overlap. The detection works very nicely though.
The video and the objects to track are simple. I do not want to believe that I need to implement something more technical to accurately track these dots. Which is why I decided to ask here, I am out of ideas. Any tip or advice is appreciated... Thanks.
I want to locate a service robot via infrared landmarks. The idea is to detect two landmarks, get the distance to the landmarks and calculate the robots position from these informations (the position of the landmarks are known).
For this I have built an artificial 2x3 matrix of IR LEDs, which are visible in the robots infrared camera image (shown in the image below).
As the first step, I want to detect a single landmark in a picture and get it's x-y coordinates. I can use these coordinates in the future to get the distance from the depth-image provided.
My first approach was to convert the image to a black and white image. Then I tried to filter out different cluster of points (which i dilated and contoured in the first place). I couldn't succeed with this method.
Now I wonder if there are any pattern recognition/computer vision methods which can help me to quite "easily" detect the pattern.
I've added a picture of the infrared image with the landmark in it and a converted black/white image.
a) Which method can help me to solve this problem?
b) Should I use a 3x3 Matrix or any other geometric form instead of the 2x3 Matrix ?
IR-Image
Black-White Image
A direct answer:
1) find all small circles in the image; 2) look among these small circles for ones that are the same size and close together, and, say, form parallel lines.
The reason for this approach is that you have coded the robot with a specific pattern of small objects. Therefore, look for the objects and then look for the pattern. (If the orientation and size wouldn't change, then you could just look for a sub-image within the larger image, but because it can, you need to look for elements of the pattern that remain consistent with motion in the 3D space, that is, the parallel lines.)
This will work in the example images, but to know whether this will work more generally, we need to know more than you told us: It depends on whether the variation in the images of the matrix and the variations in the background will let this be enough to distinguish between them. If not, maybe you need a more clever algorithm or maybe a different pattern of lights. In the extreme case, it's obvious that if you had another 2x3 matric around, it's not enough. It all depends on the variation of the object to be identified and the variations within the background scene, and because you don't tell us either of these things, it's hard to say the best way, what's good enough, what's a better way, etc.
If you have the choice, and here it sound like you do, good data is better than clever analysis. For this problem, I'd call good data to be anything that clearly distinguishes the object from the background. You need to think of it this way, and look at what the background is, and all the different perspectives on the lights that are possible, and make sure these can never be confused.
For example, if you have a lot of control over this, and enough time, temporal variations are often the easiest. Turning the lights (or a subset of the lights) on and off, etc, and then looking for the expected temporal variation is often the surest way to distinguish signal from noise — but really, this again is just making an assumption about the background and foreground (ie, that the background won't vary with some particular time pattern).
I would like to take a photo of some text and make the text easier to read. The tricky part is that the initial photo may have dark regions as well as light regions and I want the opengl function to enhance the text in all these regions.
Here is an example. On top is the original image. On bottom is the processed images.
[edited]
I have added in a better example picture of what is happening. I am able to enhance the text, but in areas where I have no text, this simple thresholding is creating speckled noise (image bottom left).
If I wind back the threshold, then I lose the text in the darker region (bottom right).
At the moment, the processed image only picks up some of the text, not all the text. The original algorithm I used was pretty simple:
- sample 8 pixels around the current pixel (pixels about 4-5 distant away seem to work best)
- figure out the lightest and darkest pixels from this sample
- if the current pixel is closer to the darkest threshold, then make black, and vice versa
This seemed to work very well for around text, but when it came to non-text, then it provided a very noisy image (even when I provided an initial rejection threshold)
I modified this algorithm to assume that text was always close to black. This provided the bottom image above, but once again I am not able to pull out all the text features I want.
Before implementing this as a program, you might want to take source photo and play with it in a GIMP or another editor to see what you can do.
One way to deal with shadows is to run high pass filter before thresolding.
This is how you do it in image editor (manually, without "highpass" filter plugin):
1. Convert image to grayscale and save it to "layer_A"
2. make a copy of "layer_A" into "Layer_B"
3. Invert colors in "Layer_B"
4. Gaussian blur "Layer_B" with radius that is larger than largest feature you want to preserve. (blur radius larger than letter)
5. Merge "Layer_A" with "Layer_B" where result = "Layer_A" * 0.5 + "Layer_B" * 0.5.
6. Increase contrast in resulting image.
7. Run thresold.
In opengl it'll be done in same fashion (and without multiple layers)
It won't work well with strong/crisp shadows (obviously), but it will exterminate huge smooth shadows that occurs due to page being bent, etc.
The technique (high pass filter) is frequently used for making seamless textures, and you should be able to find several such tutorials and additional info with google (GIMP seamless texture high pass or GIMP high pass).
By the way, if you want to improve "readability", then you might want to keep it grayscale (while improving contrast) instead of converting it to "black and white" (1 bit color). Sharp letter edges make text harder to read.
thanks for your help.
In the end I went for quite a basic approach.
Taking a sample of 8 nearby pixels, determining the max and min. Determined the local threshold (max - min). Then
smooth = dot(vec3(1.0/3.0), smoothstep(currentMin, currentMax, p11).rgb);
smooth = (localthreshold < threshold) ? 1.0 : smooth;
return vec4(smooth, smooth, smooth, 1);
This does not show me the text nicely in both the dark and light region, which is the ideal, but it nicely cleans up the text in the lighter region.
Mike
I'm relatively new to OpenCV, and I'm working on a project where I need to count the number of objects on a grid. the grid is the background of the image, and there's either an object in each space or there isn't; I need to count the number present, and I don't really know where to start. I've searched here and other places, but can't seem to find what I'm looking for. I will need to be tracking the space numbers of the grid in the future, so I will also eventually need to know whether each grid space is occupied or empty. I'm not going so far as to ask for a coded example, but does anybody know of any source or tutorials to accomplish this task or one similar to it? Thanks for your help!
Further Details: images will come from a stable-mounted camera, objects are of relatively uniform shape, but varying size and color.
I would first answer a few questions:
Will an object be completely enclosed in a grid cell? Or can it be placed on top of a grid line? (In other words, will the object hide a line from the camera?)
Will more than one object be in one cell?
Can an object occupy more than one cell? (closely related to question 1)
Given reasonable answers to those questions, I believe the problem can be broken into two parts: first, identify the centers of each grid space. To count objects, you can then sample that region to see if anything "not background" is there.
You can then assume that a grid space is defined by four strong, regularly-placed, corner features. (For the sake of discussion, I'll assume you've performed the initial image preparation as needed: histogram equalization, gaussian blur for noise reduction, etc.) From there, you might try some of OpenCV's methods for finding corners (Harris corner detector, cvGoodFeaturesToTrack, etc). It's likely that you can borrow some of the techniques found in OpenCV's square finding example (samples/c/square.c). For this task, it's probably sufficient to assume that the grid center is just the centroid of each set of "adjacent" (or sufficiently near) corners.
Alternatively, you might use the Hough transform to identify the principal horizontal and vertical lines in the image. You can then determine the intersection points to identify the extents of each grid cell. This implementation might be more challenging since inferring structure (or adjacency) from "nearby" vertices in order to find a grid center seems more difficult.
I'm trying to write an algorithm to generate the "ceiling panel" from a horiontally wrappable panoramic image like the one above. Images 1 to 4 are a straight cut out for the walls of the cube but the ceiling will be more complicated as I assume it needs to be composited from parts 5a to 5d. Does anyone know the solution in pseudocode?
my guess is that we need to iterate over the coordinates of the ceiling tile
i.e.
for y=0 to height
for x=0 to width
colorofsomecoordinateonoriginalimage = some function (poloar coords?)
set pixel(x,y) = colorofsomecoordinateonoriginalimage
next
next
Hum... I remember doing something like that for computer vision class one time back in grad school. It's not impossible but a LOT of work needs to be done. One way would be to degrade the entire product's quality. That's the easiest starting point. Once you degraded it enough (depending on how much you need to stretch the edges), you can start applying nonlinear transformations to the image. This is probably best done approximating by maybe cutting out sections of the cylinder by degrees and then applying one of the age old projections used in making flat maps (like Mercator or CADRG or something)... but you have to remember to interpolate the pixels, make sure you at least do an averaging of the pixels to approximate. That's the best I can think of.
You can't generate a panorama just by taking photos from a single location and stitch them. Well, you can for a single horizontal set, but it would look ugly (usually, you stitch many more than 4 photos to avoid distortions at the edges).
Here, you have even more data in the y-direction, which means even more pictures, and some sort fancy projection to generate the final image.
If you look at the panorama you have closely, you'll notice that the boundary of the region in sunlight is not straight. That is because your panorama was projected on a cylinder, not a cube. So I don't think 1/2/3/4 would look right directly mapped to a cube.
Bottom line, you really can't consider those 8 chunks as 8 pictures taken from a fixed point (If you need convincing, try yourself to take 8 pictures like that and try to stitch them together. You'll see how fun it is for the upper row, and even though it is easy for the bottom row, how ugly it looks on the stitched regions).
Now, why you need cube maps changes drastically what your options are. If you're only looking for a cube map to do cheap environment mapping effects, then the simplest is to find an arbitrary function that maps the edges where you want them to be, and simply linearly interpolate in between. It's completely the wrong projection, but ought to give a picture that looks good enough for the intended goal.
If you're looking for something more accurate, then you need to know how the projection was generated, so that you can unproject it before re-projecting it on the cube.
All that said, it's also a lot easier to just photograph cube maps rather than process a panorama to generate them, but that might not be possible for you.