My requirement is to perspective-transform a user-submitted image onto a pre-rendered background image (which is actually one image per frame of a video).
The easiest way was to use ImageMagick, so I wrote a very crude, simple bash script to achieve what I needed:
#!/bin/bash
# Author: neurosys
# Description: Perspective-transforms and projects an alpha image
#              onto a background image.
if [ $# -ne 11 ]
then
    echo 'Usage: ./map_image.sh background.jpg image.png output.jpg x1 y1 x2 y2 x3 y3 x4 y4'
    exit 1
fi
BG=$1
IMAGE=$2
DEST=$3
TEMP='temp.png'
BG_SIZE_W=$(convert $BG -print "%w\n" /dev/null)
BG_SIZE_H=$(convert $BG -print "%h\n" /dev/null)
IMAGE_W=$(convert $IMAGE -print "%w\n" /dev/null)
IMAGE_H=$(convert $IMAGE -print "%h\n" /dev/null)
X1=$4
Y1=$5
X2=$6
Y2=$7
X3=$8
Y3=$9
X4=${10}
Y4=${11}
OFFSET=15
TRANSFORM="$OFFSET,$OFFSET, $X1,$Y1 $(($IMAGE_W+$OFFSET)),$OFFSET $X2,$Y2 $OFFSET, $(($IMAGE_H+$OFFSET)) $X3,$Y3 $(($IMAGE_W+$OFFSET)), $(($IMAGE_H+$OFFSET)) $X4,$Y4"
echo "Transform matrix: $TRANSFORM"
convert $IMAGE -background transparent -extent $BG_SIZE_W\x$BG_SIZE_H-$OFFSET-$OFFSET $TEMP
convert $TEMP -background transparent -distort Perspective "$TRANSFORM" $TEMP
convert $BG $TEMP -composite $DEST
rm -f $TEMP
However, it takes about 4 seconds to produce the desired image on my computer:
[neuro#neuro-linux ~]$ time ./map_image.sh bg.png Hp-lovecraft.jpg output.jpg 494 108 579 120 494 183 576 196 && nomacs output.jpg
Transform matrix: 15,15, 494,108 195,15 579,120 15, 267 494,183 195, 267 576,196
real 0m3.852s
user 0m3.437s
sys 0m0.037s
[neuro#neuro-linux ~]$
The order of operations as well as the parameters I use in the above ImageMagick script might not be optimal, so any opinions or alternatives to achieve what I need are very welcome.
The images used for the above example are:
Background
User submitted image
Output
I am wondering if there is a way to speed this up enough that I can generate the frames for a one-minute video (25 fps * 60 sec = 1500 frames) in a few seconds.
If this approach fails, I may resort to writing a dedicated OpenGL program, which I believe will be much faster since it leverages the hardware.
Somewhat off-topic note: the background image is pre-rendered in animation software (3ds Max). If I do end up writing an OpenGL renderer, I can import the mesh and camera from 3ds Max for better perspective and lighting.
Thanks.
Edit:
With the help of the guys over at the ImageMagick forum, the bottleneck turned out to be the first convert call with -extent, which was unneeded.
I ended up combining all the commands into one:
convert image.png -background transparent +distort Perspective "1,1, 494,108 201,1 579,120 1, 201 494,183 201, 201 576,196" -compose DstOver -composite bg.png out.png
It runs in 0.6 seconds, but transparency somehow does not work, so the output ends up being only the distorted image with a black background all around.
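One likely cause is that bg.png is only read after -composite, so it never takes part in the compositing. A sketch of a corrected command (untested; it assumes the distorted layer should simply be flattened onto the background at the virtual-canvas offset produced by +distort):
convert bg.png \
  \( image.png -virtual-pixel transparent \
     +distort Perspective "1,1, 494,108 201,1 579,120 1, 201 494,183 201, 201 576,196" \) \
  -flatten out.png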
Edit:
Someone on ImageMagick forums wrote a very fast and clean script that reduced it to 0.13 seconds.
Here is the link, in case anyone needs it:
https://www.imagemagick.org/discourse-server/viewtopic.php?f=1&t=29495&p=132182#p132141
Try using MPC format as your $TEMP instead of PNG. Encoding of MPC is much, much faster; it is designed for use as a temporary file with ImageMagick.
MPC actually creates two files, *.mpc and *.cache, so you need to remove both. In your script, set TEMP=temp.mpc and TEMPCACHE=temp.cache, and then at the end of the script do rm $TEMP $TEMPCACHE.
See the MPC entry on the ImageMagick Formats page.
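Applied to your script, that change might look like this (only the affected lines are shown):
TEMP='temp.mpc'
TEMPCACHE='temp.cache'
# ... same convert commands as before, writing to $TEMP ...
rm -f $TEMP $TEMPCACHE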
If I get the dimensions of an image using your technique, it takes around 0.4 seconds for width and another 0.4 seconds for height. I mean like this:
BG_SIZE_W=$(convert $BG -print "%w\n" /dev/null) # 0.48 sec
BG_SIZE_H=$(convert $BG -print "%h\n" /dev/null) # 0.48 sec
If you get both the width and the height in one go like this, it takes 0.006 seconds altogether on my machine:
read BG_SIZE_W BG_SIZE_H < <(identify -ping -format "%w %h" bg.png)
I am still looking at the rest of your code.
Related
I am trying to automate image conversion using ImageMagick CLI. One of the biggest problems with my image set is with tiny artifacts that should be cut out.
My images are generally consistent, with big objects (ca. 50% of the image area) on a white background. Unfortunately, tiny artifacts sometimes just look bad and make trimming less effective.
E.g. something like this:
In reality the big object is not a solid color; that is just a simplified example. It is not necessarily a circle either; it can be a square, a rectangle, or something irregular.
I also cannot use any morphology such as opening, closing, or erosion. Filters like Gaussian or median are out of the question as well. I need to keep the big object untouched, since the highest possible quality is required.
An ideal solution would be something similar to the contours known from, for example, OpenCV, where I could find all the uniform objects and, if they don't meet certain rules (e.g. a size threshold of 5% of the whole image), fill them with white.
Is there any similar mechanism in ImageMagick CLI? I've gone through the docs and haven't found a suitable solution to my problem.
Thanks in advance!
EDIT (ImageMagick version):
Version: ImageMagick 7.1.0-47 Q16-HDRI x86_64 20393 https://imagemagick.org
Copyright: (C) 1999 ImageMagick Studio LLC
License: https://imagemagick.org/script/license.php
Features: Cipher DPC HDRI Modules OpenMP(5.0)
Delegates (built-in): bzlib fontconfig freetype gslib heic jng jp2 jpeg lcms lqr ltdl lzma openexr png ps raw tiff webp xml zlib
Compiler: gcc (4.2)
EDIT (Real-life example):
As requested, here is a real-life example. A picture of a coin on a white background, but with some artifacts:
noise under the coin (slightly on the left)
dot under the coin (slightly on the right)
gray irregular shape in the top right corner
The objects will not necessarily be circles like coins, but we may assume that there will always be one object with a strong border (no white gaps in the border, as here) and that the rest is noise.
Here is one way to do that in ImageMagick 7.

First, threshold the image so that the background is white and the object(s) black. The threshold will likely be image-dependent. NOTE: JPG is a lousy format for this, since solid colors are not really truly solid due to the compression. If you can save your images in some non-lossy compressed or uncompressed format, that would be better.

Then decide on the largest area you need to remove and use that with connected-components processing, so that you end up with only two regions: one white background and one black object. This will be a mask. If you have several objects, that is fine also, but they need to be black. I show the textual output listing the two regions. The mask is just the object with the noise removed.

Now use the original input, a white image and the mask, and composite the first two images so that where the mask is black the object is used, and where the mask is white the white image is used. Note that I create the white image by making a copy (clone) of the input and colorizing it 100% with white. The following is in Unix syntax.
Input:
magick coin.jpg -negate -threshold 2% -negate -type bilevel \
-define connected-components:verbose=true \
-define connected-components:area-threshold=1000 \
-define connected-components:mean-color=true \
-connected-components 4 mask.png
Objects (id: bounding-box centroid area mean-color):
0: 1000x1000+0+0 525.8,555.7 594824 gray(255)
44: 722x720+101+58 460.9,417.0 405176 gray(0)
magick coin.jpg \
\( +clone -fill white -colorize 100 \) \
mask.png \
-compose over -composite \
coin_result.png
Mask:
Result:
See https://imagemagick.org/script/connected-components.php
and https://imagemagick.org/Usage/compose/#compose and Composite Operator of Convert (-composite, -geometry) at https://imagemagick.org/Usage/layers/#convert
I am still a beginner in coding. I am currently working on a program in C/C++ that determines the pixel position of a defined mark (a black circle with white surroundings) in a photo.
I made a mask from the mark and a vector which contains every pixel value of the mask as its elements (using Magick++ I summed the values for red, green and blue). The vector contains approximately 10,000 values, since the mask is 100x100 px. I also used threshold functions to simplify the image.
Then I made a grid that does the same for the picture in which I want to find the coordinates of the mark. It is basically a loop that goes through the image, and once the program knows the pixel values at a grid position it immediately compares them with the mask. The main idea is to find the lowest difference between the mask and one of the grid positions.
The problem, however, is that evaluating every grid position takes a huge amount of time (the image is 1920x1080 px, so more than 2 million vectors of 10,000 values each). I decided to step the grid not by every pixel but, for example, by every 10th column and row, and then, for the best correlation from that pass, I selected an area where I used the every-pixel loop. But this still takes a lot of time.
I would like to ask whether there is some way of improving this method for better (faster) results, or whether the whole idea is not time-efficient and I should use a different approach.
Thanks for every advice!
Edit: The program will be used to process multiple images, all of the same size. This is the picture after thresholding; the mark is the big black dot.
Image
The idea that I find interesting is a pyramidal scheme, or progressive refinement: you find the spot in a smaller version of the image, then search only a small rectangle in the larger image.
If you reduce your image by 2 in each dimension, you reduce the time by a factor of 4, plus some search effort in the larger image.
This has some problems: the reduction will affect accuracy, I expect, and you might miss the spot.
You also have to scale the sample (template) down by the same factor, so in this case you create a half-size template. As you keep halving, the template gets blurred into the surrounding objects, so at some point it is no longer a valid template; after halving once, I guess the dot still has a couple of pixels around it. A command-line sketch of the idea follows.
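The coarse-to-fine idea can be prototyped with ImageMagick's compare -subimage-search before committing to C++ (a rough, untested sketch; the file names, the 25% factor and the 200x200 window are assumptions):
# Coarse pass: search a quarter-size copy of the image with a quarter-size template.
convert image.png    -resize 25% small_image.png
convert template.png -resize 25% small_template.png
compare -metric RMSE -subimage-search small_image.png small_template.png null: 2> coarse.txt

# compare reports the best match on stderr, e.g. "123.4 (0.0019) @ 117,63";
# extract the x,y offset and scale it back up by 4.
read X Y < <(sed 's/.*@ //; s/,/ /' coarse.txt)

# Fine pass: search only a 200x200 window around the scaled-up hit at full
# resolution (offsets should be clamped to the image bounds; omitted here).
convert image.png -crop 200x200+$((4*X-50))+$((4*Y-50)) +repage window.png
compare -metric RMSE -subimage-search window.png template.png null: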
As you haven't specified a tool or OS, I will choose ImageMagick which is installed on most Linux distros and is available for OSX and Windows. I am just using it at the command-line here but there are C, C++, Python, Perl, PHP, Ruby, Java and .Net bindings available.
I would use a "Connected Components Analysis" or "Blob Analysis" like this:
convert image.png -negate \
-define connected-components:area-threshold=1200 \
-define connected-components:verbose=true \
-connected-components 8 -auto-level result.png
I have inverted your image with -negate because in morphological operations, the foreground is usually white rather than black. I have excluded blobs smaller than 1200 pixels because your circles seem to have a radius of 22 pixels which makes for an area of 1520 pixels (Pi * 22^2).
That gives this output, which means 7 blobs - one per line - with the bounding box and area of each:
Objects (id: bounding-box centroid area mean-color):
0: 1358x1032+0+0 640.8,517.0 1296947 gray(0)
3: 341x350+1017+287 1206.5,468.9 90143 gray(255)
106: 64x424+848+608 892.2,829.3 6854 gray(255)
95: 38x101+44+565 61.5,619.1 2619 gray(255)
49: 17x145+1341+379 1350.3,446.7 2063 gray(0)
64: 43x43+843+443 864.2,464.1 1451 gray(255)
86: 225x11+358+546 484.7,551.9 1379 gray(255)
Note that, as your circle is 42x42 pixels, you will be looking for a blob that is square-ish and close to that size, so I am looking at the second-to-last line. I can draw that in red on your original image like this:
convert image.png -fill none -stroke red -draw "rectangle 843,443 886,486" result.png
Also, note that as you are looking for a circle, you would expect the area to be pi * r^2 or around 1500 pixels and you can check that in the penultimate column of the output.
That runs in 0.4 seconds on a reasonable spec iMac. Note that you could divide the image into 4 and run each quarter in parallel to speed things up. So, if you do something like this:
#!/bin/bash
# Split image into 4 (maybe should allow 23 pixels overlap)
convert image.png -crop 1x4@ tile-%02d.mpc
# Do Blob Analysis on 4 strips in parallel
for f in tile-*mpc; do
convert $f -negate \
-define connected-components:area-threshold=1200 \
-define connected-components:verbose=true \
-connected-components 8 info: &
done
# Wait for all 4 to finish
wait
That runs in around 0.14 seconds.
As I explained in my previous post here, I am trying to generate some cascade.xml files to recognize euro coins for use in my iOS app. Anyway, I am finding it very difficult to understand how to generate a .vec file to give as input to opencv_traincascade. This is because I have heard many dissenting views: someone told me that the vector file must include only positive images containing only the object to recognize; someone else instead (and also as I read in my tutorials) said that the vector file must include "sample" images, in other words random backgrounds to which the object to recognize has been added by opencv_createsamples. In other words, with:
opencv_createsamples -img positives/1.png -bg negatives.txt -info 1.txt -num 210 -maxxangle 0.0 -maxyangle 0.0 -maxzangle 0.9 -bgcolor 255 -bgthresh 8 -w 48 -h 48
which generated 12000 images.
Finally, I have created the .vec file with:
cat *.txt > positives.txt
opencv_createsamples -info positives.txt -bg negatives.txt -vec 2.vec -num 12600 -w 48 -h 48
So, I would like to ask which of the following two are the correct images to be contained in the vector file:
Moreover, what is the final command for launching the training? This is the one I have used up to now:
opencv_traincascade -data final -vec 2.vec -bg negatives.txt -numPos 12000 -numNeg 3000 -numStages 20 -featureType HAAR -precalcValBufSize 2048 -precalcIdxBufSize 2048 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -w 48 -h 48 -mode ALL
where the .vec file contains 12000 sample images (background + coin to recognize).
If the .vec file should contain only positive images (only coins), how do I tell opencv_traincascade to train using the sample images?
I really need to know how to do this correctly, because I have launched many training runs which led to no correct result, and since they take many hours or days to execute, I can't waste any more time.
Thanks to all for your attention.
UPDATE
I managed to create a cascade.xml file with LBP. See what happens if I give one of the images used as training samples to a simple OpenCV program:
while with an image like the following:
it does not work at all. I really don't know where I am making the mistake.
UPDATE
Maybe converting the positive images to grayscale first could help?
I've used the negative samples database of the INRIA training http://pascal.inrialpes.fr/data/human/
and this input (png with alpha transparency around the coin):
using it with this command:
opencv_createsamples -img pos_color.png -num 10 -bg neg.txt -info test.dat -maxxangle 0.6 -maxyangle 0 -maxzangle 0.3 -maxidev 100 -bgcolor 0 -bgthresh 0
produces output like this:
so the background color obviously didn't work. Converting to grayscale at the beginning, however, gives me this input:
and the same command produces output like this:
I know this is no answer to all of your questions, but maybe it still helps.
OpenCV cascades (Haar, LBP) can excellently detect objects which have permanent features. For example, all faces have a nose, eyes and a mouth in the same places. OpenCV cascades are trained to search for the features common to the required class of object and to ignore features which change from object to object. The problem is that the cascade uses a rectangular search window, but a coin has a round shape, so an image of the coin will always contain part of the background.
So the training images of the coin must include all possible backgrounds so that the classifiers can ignore them (otherwise it will detect the coin only on that specific background).
So all training samples must have the same aspect ratio, size and position of the coin (square images with the coin in the center, coin diameter = 0.8-0.9 of the image width), and different backgrounds!
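As a rough illustration of that geometry (the parameter values here are only guesses, not from this answer), opencv_createsamples can produce square samples with the coin filling most of the window and the near-white surround replaced by varied negative backgrounds:
opencv_createsamples -img coin_gray.png -bg negatives.txt -vec coins.vec \
    -num 2000 -maxxangle 0.1 -maxyangle 0.1 -maxzangle 0.3 \
    -bgcolor 255 -bgthresh 8 -w 48 -h 48
# -w 48 -h 48        : every sample is a 48x48 square with the coin filling most of it
# -bg negatives.txt  : the listed images supply the varied backgrounds
# -bgcolor/-bgthresh : pixels near white (255 +/- 8) are treated as transparent background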
I am new to OpenCV and I am looking to fuse two images (panchromatic and multispectral) using OpenCV with C++. Note that I have already registered the reference image, and now I just need to fuse the reference and the sensed image. I could not find any functions that could help me with this. Did I miss something, or is there no direct way to fuse two images?
Please suggest any simple way to proceed with the fusion process.
Since you are trying to fuse together the panchromatic and multispectral images, you would need to:
Convert the input images into a suitable format (YUV works for me, HSI might too).
Fuse the luminance or intensity values of the two images, leaving the color space untouched.
Combine the fused channel with the color information to produce the final image.
For example:
#include <opencv2/opencv.hpp>
using namespace cv;

// ref   : multispectral (colour) reference image
// trans : registered panchromatic image, same size as ref
Mat tmp1, tmp2, output;
cvtColor(ref, tmp1, CV_BGR2GRAY, 0);    // intensity of the multispectral image
cvtColor(trans, tmp2, CV_BGR2GRAY, 0);  // intensity of the panchromatic image

cv::Mat yuv;
cvtColor(ref, yuv, CV_BGR2YUV, 3);      // keep the colour information in YUV
std::vector<Mat> channels_ref;
split(yuv, channels_ref);

// Fuse the two intensity images into the Y channel
double alpha = 0.3;
double beta = 1 - alpha;
addWeighted(tmp1, alpha, tmp2, beta, 0.0, channels_ref[0]);

Mat fused[] = {channels_ref[0], channels_ref[1], channels_ref[2]};
cv::merge(fused, 3, output);            // recombine the fused Y with U and V
cvtColor(output, output, CV_YUV2BGR);
imshow("Linear Blend", output);
waitKey(0);
I revisited this question after a long time and decided to have a go at it as there was no sample imagery available before. In the meantime, I have generated some - see later.
So, let's say you have a hi-res, panchromatic image with 10m resolution something like this:
and a lo-res, multi-spectral image with 40m resolution of the same area, something like this:
Then, just using ImageMagick at the command-line for now (since it is installed on most Linux distros and is available for OSX and Windows), do what I was alluding to in the comments under your original question...
convert hi-res-panchromatic.tif \
\( lo-res-multispectral.tif -resize 400% -colorspace Lab -separate -delete 0 \) \
-set colorspace Lab -combine result.tif
So, that says... "Load up the hi-res image. Then, to one side, load the lo-res image and upsize it to 400% to account for the 40m resolution versus 10m resolution and convert it to Lab colorspace and separate the channels. Delete the Lightness (L) channel of the lo-res image. Now, returning to the main processing from the aside processing, we will have the hi-res image that we loaded first acting as the L channel along with the ab channels (i.e. colour information) of the lo-res image. Combine them from Lab back into RGB and save".
I see you haven't logged on in a year, so I will delay any OpenCV code-writing until anyone else expresses an interest in the question - but I hope the technique is understandable.
Note
As I don't happen to have any geo-registered panchromatic and multi-spectral imagery of the same place, I cheated somewhat... I took a single image and synthesised a panchromatic version using ImageMagick:
convert orig.tif -colorspace gray hi-res-panchromatic.tif
and I synthesised the lo-res multi-spectral image using:
convert orig.tif -resize 25% lo-res-multispectral.tif
Also, note that I just used Lab mode here to do the blending, because it is simpler, but in the comments I suggested using Principal Components Analysis. I may re-visit this again and implement that too...
I'm using computeprof to profile my CUDA program, and I found the width plot feature especially helpful. It seems, however, that computeprof offers no way to export or customize the plots it generates. Fortunately all the data is stored in CSV format, so I thought I could recreate the plots myself using gnuplot or something similar. So now to my question: I couldn't find an example of how to create such a plot of time blocks. Can you create such a plot using gnuplot, and if so, how?
Unfortunately, horizontal histograms (which is what this style of plot would be called in gnuplot) are not easy to create; in gnuplot, histograms are natively vertical. If you do feel the need for a horizontal histogram, check this blog entry.
For a vertical histogram you need to do the following:
With this data file Data.dat:
A B C D E F G H I J
0.41 0.03 0.74 0.97 0.15 0.05 0.11 0.60 0.25 0.76
and this little gnuplot script:
set style data histogram
set style histogram rowstacked
set style fill solid border -1
set key autotitle columnheader
plot for [i=1:10] "Data.dat" using i
you should be able to get the result you are looking for (vertically, however ;) ). If you still feel the need for a horizontal histogram, you can follow the tutorial in the blog. It is not 100% what you are looking for, but it does the vertical-to-horizontal trick.