What is the best way to get the hash of a QPixmap? - c++

I am developing a graphics application using Qt 4.5 and am putting images in the QPixmapCache. I wanted to optimise this so that if a user inserts an image which is already in the cache, it will use that.
Right now each image has a unique id, which helps optimise paint events. However, I realise that if I could calculate a hash of the image, I could look up the cache to see if it already exists and use that (it would help more for duplicate objects, of course).
My problem is: if it's a large QPixmap, will a hash calculation of it slow things down, or is there a quicker way?

A couple of comments on this:
If you're going to be generating a hash/cache key of a pixmap, then you may want to skip the QPixmapCache and use QCache directly. This would eliminate some overhead of using QStrings as keys (unless you also want to use the file path to locate the items).
As of Qt 4.4, QPixmap has a "hash" value associated with it (see QPixmap::cacheKey()). The documentation claims "Distinct QPixmap objects can only have the same cache key if they refer to the same contents." However, since Qt uses shared-data copying, this may only apply to copied pixmaps and not to two distinct pixmaps loaded from the same image. A bit of testing would tell you if it works (see the sketch after these comments), and if it does, it would let you easily get a hash value.
If you really want a good, fairly quick cache that removes duplicates, you might want to look at your own data structure that sorts according to size, color depth, image type, and things such as that. Then you would only need to hash the actual image data after you find another image of the same type with the same dimensions, bit depth, etc. Of course, if your users generally open a lot of images that share those properties, it wouldn't help at all.
Performance: Don't forget about the benchmarking stuff Qt added in 4.5, which would let you compare your various hashing ideas and see which one runs the fastest. I haven't checked it out yet, but it looks pretty neat.
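Along the lines of the second comment, here is a minimal, untested sketch of that cacheKey() experiment (the file name is a placeholder):
#include <QApplication>
#include <QPixmap>
#include <QDebug>

int main(int argc, char *argv[])
{
    QApplication app(argc, argv); // QPixmap needs a GUI application object

    QPixmap a("image.png"); // hypothetical path, loaded from disk
    QPixmap b("image.png"); // loaded again -- a distinct pixmap, not a copy
    QPixmap c = a;          // shared-data copy of a

    // If separate loads of the same file share a key, cacheKey() can stand
    // in for a content hash; if not, it only identifies shared copies.
    qDebug() << "separate loads match:" << (a.cacheKey() == b.cacheKey());
    qDebug() << "shared copy matches: " << (a.cacheKey() == c.cacheKey());
    return 0;
}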

Just in case anyone comes across this problem (and isn't too terribly experienced with hashing things, particularly something like an image), here's a VERY simple solution I used for hashing QPixmaps and entering them into a lookup table for later comparison:
qint32 HashClass::hashPixmap(const QPixmap& pix)
{
    // Convert to QImage so individual pixels can be read
    QImage image = pix.toImage();
    qint32 hash = 0;
    for (int y = 0; y < image.height(); y++)
    {
        for (int x = 0; x < image.width(); x++)
        {
            // Mix each pixel into the hash, one at a time
            QRgb pixel = image.pixel(x, y);
            hash += pixel;
            hash += (hash << 10);
            hash ^= (hash >> 6);
        }
    }
    return hash;
}
Here is the hashing function itself (you can have it hash into a qint64 if you want fewer collisions). As you can see, I convert the pixmap into a QImage and simply walk through its dimensions, performing a very simple one-at-a-time hash on each pixel, and return the final result. There are many ways to improve this implementation (see the other answers to this question), but this is the basic gist of what needs to be done.
The OP mentioned how he would use this hashing function to then construct a lookup table for later comparing images. This would require a very simple lookup initialization function -- something like this:
void HashClass::initializeImageLookupTable()
{
    imageTable.insert(hashPixmap(QPixmap(":/Image_Path1.png")), "ImageKey1");
    imageTable.insert(hashPixmap(QPixmap(":/Image_Path2.png")), "ImageKey2");
    imageTable.insert(hashPixmap(QPixmap(":/Image_Path3.png")), "ImageKey3");
    // Etc...
}
I'm using a QMap here called imageTable which would need to be declared in the class as such:
QMap<qint32, QString> imageTable;
Then, finally, when you want to compare an image to the images in your lookup table (i.e. "what image, out of the images I know it can be, is this particular image?"), you just call the hashing function on that image (which I'm assuming will also be a QPixmap) and the returned QString value will let you figure that out. Something like this would work:
void HashClass::compareImage(const QPixmap& pixmap)
{
    // QMap::value() looks up without inserting an empty entry the way operator[] would
    QString value = imageTable.value(hashPixmap(pixmap));
    // Do whatever needs to be done with the QString value and pixmap after this point.
}
That's it. I hope this helps someone -- it would have saved me some time, although I was happy to have the experience of figuring it out.

Hash calculations should be pretty quick (somewhere above 100 MB/s if no disk I/O is involved), depending on which algorithm you use. Before hashing, you could also do some quick tests to sort out potential candidates; e.g. images must have the same width and height, otherwise it's useless to compare their hash values.
Of course, you should also keep the hash values for inserted images, so that you only have to calculate a hash for new images and won't have to calculate it again for the cached ones.
If the images are different enough, it would perhaps be enough to hash not the whole image but a smaller thumbnail or a part of the image (e.g. the first and last 10 lines); this will be faster, but will lead to more collisions.

I'm assuming you're talking about actually calculating a hash over the data of the image rather than getting the unique id generated by Qt.
Depending on your images, you probably don't need to go over the whole image to calculate a hash. Maybe only read the first 10 pixels? The first scan line?
Maybe a pseudo-random selection of pixels from the entire image (with a known seed, so that you could repeat the sequence)? Don't forget to add the size of the image to the hash as well. A sketch of that idea follows.
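A minimal, untested sketch of that sampling idea (the sample count, seed, and mixing constants are arbitrary choices):
#include <QImage>
#include <QPixmap>

// Hash a fixed pseudo-random sample of pixels plus the image size. The
// generator seed is constant, so the same image always yields the same hash.
quint64 sampledHash(const QPixmap& pix)
{
    const QImage image = pix.toImage();
    const int w = image.width();
    const int h = image.height();
    if (w == 0 || h == 0)
        return 0;

    quint64 hash = (quint64(w) << 32) | quint64(h); // fold in the image size
    quint32 state = 12345u;                         // known, fixed seed
    const int samples = 64;

    for (int i = 0; i < samples; ++i) {
        // Linear congruential generator: repeatable pixel coordinates
        state = state * 1664525u + 1013904223u;
        const int x = state % w;
        state = state * 1664525u + 1013904223u;
        const int y = state % h;
        hash = (hash * 1099511628211ULL) ^ image.pixel(x, y); // FNV-style mix
    }
    return hash;
}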

Related

Where to alter reference code to extract motion vectors from HEVC encoded video

So this question has been asked a few times, but I think my C++ skills are too deficient to really appreciate the answers. What I need is a way to start with an HEVC encoded video and end with CSV that has all the motion vectors. So far, I've compiled and run the reference decoder, everything seems to be working fine. I'm not sure if this matters, but I'm interested in the motion vectors as a convenient way to analyze motion in a video. My plan at first is to average the MVs in each frame to just get a value expressing something about the average amount of movement in that frame.
The discussion here tells me about the TComDataCU class methods I need to interact with to get the MVs and talks about how to iterate over CTUs. But I still don't really understand the following:
1) What information is returned by these MV methods, and in what format? With my limited knowledge, I assume there will be something like 7 values associated with an MV: the frame number, an index identifying a macroblock in that frame, the size of the macroblock, the x coordinate of the macroblock (probably the top-left corner?), the y coordinate of the macroblock, the x coordinate of the vector, and the y coordinate of the vector.
2) Where in the code do I need to put new statements that save the data? I thought there must be some spot in TComDataCU.cpp where I can add lines that print the data I want to a file, but I'm confused about when the values are actually determined and what they are. The variable declarations look like this:
// create motion vector fields
m_pCtuAboveLeft = NULL;
m_pCtuAboveRight = NULL;
m_pCtuAbove = NULL;
m_pCtuLeft = NULL;
But I can't make much sense of those names. AboveLeft, AboveRight, Above, and Left seem like an asymmetric mix of directions?
Any help would be great! I think I would most benefit from seeing some example code. An explanation of the variables I need to pay attention to would also be very helpful.
In TEncSlice.cpp, you can access every CTU in a loop:
for( UInt ctuTsAddr = startCtuTsAddr; ctuTsAddr < boundingCtuTsAddr; ++ctuTsAddr )
Then you can choose the exact CTU by using the address of the CTU:
pCtu->getCtuRsAddr() (pCtu being of the TComDataCU class).
After that,
pCtu->getCUMvField()
will return the CTU's motion vector field, and you can extract the MVs of the CTU from that object. For example,
TComMvField->getMv(g_auiRasterToZscan[y * 16 + x])->getHor()
returns the horizontal element of a specific 4x4 block's MV.
You can save these data after m_pcCuEncoder->compressCtu( pCtu ), because compressCtu determines all the data of the CTU, such as the CU partition, motion estimation, etc.
I hope this information helps you and other people!
I hope this information helps you and other people!

Excluding fields with certain state from 2D array; Game of life

I have a 2D array (100 x 100 in this case) with some states limited within borders, as shown in this picture:
http://tinypic.com/view.php?pic=mimiw5&s=5#.UkK8WIamiBI
Each cell has its own id (color, for example green is id=1) and an isBorder flag (marked as white in the picture if true). What I am trying to do is extract each set of cells with one state that is limited by borders (a grain), so I could work on each grain separately, which means I would need to store all the indexes for each grain.
Anyone got an idea how to solve it?
Now that I've read your question again... The algorithm is essentially the same as filling a contiguous area with color. The most common way to do it is a BFS algorithm.
Simply start at some point you are sure lies inside the current area, then gradually move in every direction, marking the traversed fields and putting them into a vector; see the sketch right below.
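An untested sketch of that flood fill (the Cell struct and grid layout are assumptions about the asker's data):
#include <queue>
#include <utility>
#include <vector>

struct Cell {
    int id;        // state / colour id
    bool isBorder; // the white border cells from the question
};

// Collects the indexes of every cell in the grain containing (startX, startY).
std::vector< std::pair<int,int> > collectGrain(
        const std::vector< std::vector<Cell> >& grid,
        int startX, int startY,
        std::vector< std::vector<bool> >& visited)
{
    std::vector< std::pair<int,int> > grain;
    if (grid.empty())
        return grain;
    const int h = (int)grid.size();
    const int w = (int)grid[0].size();
    const int id = grid[startY][startX].id;
    const int dx[4] = { 1, -1, 0, 0 };
    const int dy[4] = { 0, 0, 1, -1 };

    std::queue< std::pair<int,int> > frontier;
    visited[startY][startX] = true;
    frontier.push(std::make_pair(startX, startY));
    while (!frontier.empty()) {
        std::pair<int,int> p = frontier.front();
        frontier.pop();
        grain.push_back(p);
        for (int d = 0; d < 4; ++d) {
            const int nx = p.first + dx[d];
            const int ny = p.second + dy[d];
            if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
            if (visited[ny][nx]) continue;
            // Stop at border cells and at cells belonging to another grain
            if (grid[ny][nx].isBorder || grid[ny][nx].id != id) continue;
            visited[ny][nx] = true;
            frontier.push(std::make_pair(nx, ny));
        }
    }
    return grain; // all cell indexes of this grain
}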
// Edit: A bunch of other insights, made before I understood the question.
I can possibly imagine an algorithm working like this:
vector<Coord2D> result = data.filter(DataType::Green);
for (const Coord2D& c : result) {
    // do some operations on data[c]
}
The implementation of filter in a simple, unoptimized way would be to scan the whole array and push_back matching fields to the vector, as in the sketch below.
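A minimal, untested sketch of such a filter, assuming the 100x100 grid is stored as rows of state ids (Coord2D is my own helper type here):
#include <cstddef>
#include <vector>

struct Coord2D { int x, y; };

// Scan the whole grid and collect the coordinates of fields matching the id
std::vector<Coord2D> filter(const std::vector< std::vector<int> >& data, int id)
{
    std::vector<Coord2D> result;
    for (std::size_t y = 0; y < data.size(); ++y)
        for (std::size_t x = 0; x < data[y].size(); ++x)
            if (data[y][x] == id) {
                Coord2D c = { (int)x, (int)y };
                result.push_back(c);
            }
    return result;
}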
If you shall need more complicated queries, lazily-evaluated proxy objects can work miracles:
data.filter(DataType::Green)
    .filter_having_neighbours(DataType::Red)
    .closest(/*first*/ 100, /*from*/ Coord2D(x, y))
    .apply([](DataField& field) {
        // processing here
    });

Finding the most efficient data structure to create an index file

I have a video file which consists of many successive frames of binary data. Each frame also has a unique timestamp (which is NOT its sequential number in the file, but rather a value provided by the camera at recording time). On the other hand, I've got an API function which retrieves a frame based on its sequential number. To make things a bit more complicated, I have a player which is provided with the timestamp and should get the binary data for that frame.
Another sad thing here: timestamps are NOT sequential. They can be sequential, but it is not guaranteed, as a wraparound may occur at the maximum unsigned short value.
So a sequence of timestamps could be either
54567, 54568, ... , 65535, 65536 , ... or
54567, 54568, ..., 65535, 0, 1, ...
So it might look like the following:
Frame 0
timestamp 54567
binary data
........
Frame 1
timestamp 54569
binary data
........
Frame 2
timestamp 54579
binary data
.
.
.
Frame n
timestamp m
binary data
0 <= n <= 65535 (MAX_UNSIGNED_SHORT)
0 <= m <= MAX_UNSIGNED_INT
The clip player API should be able to get the binary frame by the timestamp. However, internally, I can get the frame only by its frame sequential number. So if I am asked for timestamp m, I need to iterate over n frames, to find the frame with timestamp m.
To optimize it, I chose to create an index file, which would give me a match between timestamp and the frame sequential number. And here is my question:
Currently my index file is composed of binary pairs of size 2*sizeof(unsigned int), each containing a timestamp and a frame sequential number. The player later builds an STL map from that file, with key == timestamp and value == frame sequential number.
Is there any way to do it more efficiently? Should I create my index file as a dump of some data structure, so it could later be loaded into memory by the clip player while opening the clip, giving me O(1) access to frames? Do you have other suggestions?
UPD:
I have updated the names and requirements (timestamps are not necessarily sequential, and the number of frames is bounded by the MAX_UNSIGNED_SHORT value). I also wanted to thank everyone who has already taken the time to give an answer. The interpolation search is an interesting idea, although I have never tried it myself. I guess the question would be the delta between O(1) and O(log log N) in runtime.
It would seem that we should be able to make the following assumptions:
a) the video file itself will not be modified after it is created
b) the player may want to find successive frames i.e. when it is doing normal playback
c) the player may want to find random frames i.e. when it is doing FF, REW or skip by or to chapter
Given this, why not just build a HashMap associating the frame id and the frame index? You can create that once, and the player can read it and then do an easy, time-bounded lookup of the requested frame.
There are a series of tradeoffs to make here.
Your index file is already a dump of a data structure: an array. If you don't plan on often inserting or deleting frames, and keep this array in a sorted order, it's easy to do a binary search (using std::binary_search) on the array. Insertion and deletion take O(N), but searching is still O(log N). The array will occupy less space in memory, and will be faster to read and write from your index file.
If you're doing a lot of inserting and removing of frames, then converting to a std::map structure will give you better performance. If the number of frames is large, or you want to store more metadata with them, you might want to look at a B-tree structure, or just use an embedded database like SQLite or BerkeleyDB. Both of these implement B-tree indexing and are well-tested pieces of code.
Simply store the frame data in an array where indices represent frame numbers. Then create a hash map from camera indices to frame numbers. You can get the frame belonging to either a frame number or camera index in O(1) while barely using more memory than your current approach.
Alternatively, you can maintain an array, indexed by frame number, that stores a (camera index, data) pair and perform an O(log n) binary search on it when you need to access it by camera index. This takes advantage of the fact that the camera indices are sorted.
In C++'s standard library, hash maps are available as std::unordered_map (if your compiler/STL supports them, which might not be the case, since they have only recently been added to the C++ standard), although the tree-based std::map (with O(log n) lookup) is probably good enough for this purpose.
A binary search implementation is available as std::binary_search; if you need the position of the match rather than a yes/no answer, use std::lower_bound, as sketched below.
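For illustration, an untested sketch of that lookup over the sorted array; it assumes the index entries were written already sorted by timestamp (e.g. after handling the wraparound), and the names are made up:
#include <algorithm>
#include <vector>

struct IndexEntry {
    unsigned int timestamp;   // camera-provided timestamp
    unsigned int frameNumber; // sequential number used by the API
};

// Orders entries by timestamp, for both sorting and searching
static bool byTimestamp(const IndexEntry& a, const IndexEntry& b)
{
    return a.timestamp < b.timestamp;
}

// Returns true and fills frameNumber if a frame with this timestamp exists.
bool findFrame(const std::vector<IndexEntry>& index,
               unsigned int timestamp,
               unsigned int& frameNumber)
{
    IndexEntry key;
    key.timestamp = timestamp;
    std::vector<IndexEntry>::const_iterator it =
        std::lower_bound(index.begin(), index.end(), key, byTimestamp);
    if (it == index.end() || it->timestamp != timestamp)
        return false; // no frame recorded for this timestamp
    frameNumber = it->frameNumber;
    return true;
}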

Cropping an 8-bit bitmap by its palette information

I'm currently using C++ to read my 8-bit bitmap and save off its pixel data and colour table. I currently have my colour table stored in an array:
RGBQUAD* colours;
I was wondering how I would go about finding the nearest unique pixel colour in all directions and cropping the bitmap to that pixel. I'm using C++ without any external libraries.
I would recommend using readily available libraries, like ImageMagick, instead of trying to re-implement that particular wheel.
There are only two reasons why you would implement something already implemented that well elsewhere: 1) homework, or 2) you think you can actually do significantly better than existing code.
It cannot be 1) because there is no "homework" tag, and it cannot be 2) because you wouldn't have to ask, then...
"nearest unique pixel colour" means nearest in color space? In absolute terms (R/G/B) or human sense? So, given #0002FE wou may find #0000FF in your color table?
The "standard" simple C++ method is std::min_element(), which takes a range and a predicate. In your case, that range is your color table and the predicate is the close-ness to the color you want. E.g. [targetColor](RGBQUAD tableEntry) { return abs(RGBdiff(tableEntry, targetColor)); }

Confusion regarding Image compression algorithms

I had been reading a webpage on image compression (lossy and lossless).
Now this is my problem: I was successful in making a project on face detection using OpenCV. However, my project guide is not satisfied. My project simply captures frames from a capture device [webcam], passes the frames to a function that detects the faces in those frames, and outputs the detected frames to windows.
My project guide wants me to implement some algorithm, either image compression or morphing, etc., but was not happy about seeing such heavy usage of the library.
So what I would like to know is: is it possible to code image compression algorithms in C or C++? If yes, wouldn't the code size be huge? (My project is supposed to be a minor one.)
Please help me out. Suppose I want to use RLE compression in C++, how should I go about it?
Do you want to invent your own image compression or implement one of the standard ones?
(I assume this is for some sort of class/assignment; you wouldn't do this in the real world!)
You can compress simple images a little using something like run-length encoding, especially if you can reduce the number of colours, i.e. a cartoon or graphic, but for a real photo-style image it isn't going to work; that's why complex lossy techniques like JPEG or wavelets were invented.
It's very possible, and RLE compression is quite easy. If you want a relatively straightforward approach to RLE that won't use a lot of code, look at implementing a version of PackBits.
Here's another link as well: http://michael.dipperstein.com/rle/index.html (includes an implementation with source code for both traditional RLE and PackBits).
BTW, keep in mind that with noisy data you could actually end up with more data than the uncompressed original using RLE schemes. For most "real-world" images that have some form of low-pass filtering applied and a relatively good signal-to-noise ratio (i.e. above 40 dB), you should expect around 1.5:1 to 1.7:1 compression ratios.
Another option for lossless compression would be Huffman encoding... that algorithm is more tolerant of noisy images, in that it generally prevents the data expansion that can occur when such images are encoded with an RLE compression algorithm.
Finally, you didn't mention whether you are working with color or grayscale images... if it's a color image, remember that you will find much greater redundancy if you compress each color channel separately in a planar layout, rather than trying to compress contiguous RGB data.
RLE is the best way to go here. Even the "simplest" compression algorithms are non-trivial and require in-depth knowledge of color space transforms, discrete sine/cosine transforms, entropy, etc.
Back to RLE... to loop through pixels, use something like this:
cv::Mat img = cv::imread("lenna.png");
for (int i = 0; i < img.rows; i++)
{
    for (int j = 0; j < img.cols; j++) // note: the condition must test j, not i
    {
        // You can now access the pixel value through cv::Vec3b
        // (cast to int so the uchar channels print as numbers)
        std::cout << (int)img.at<cv::Vec3b>(i,j)[0] << " "
                  << (int)img.at<cv::Vec3b>(i,j)[1] << " "
                  << (int)img.at<cv::Vec3b>(i,j)[2] << std::endl;
    }
}
Count the number of identical pixels in a row and store them in any data structure (maybe a <#occurrences, Vec3b> pair in a vector?). Once you have your final vector, don't forget to store the size of your image somewhere with the aforementioned vector (maybe in a simple compressedImage struct), and voilà, you just compressed an image. To store it in a file, I suggest you use boost::serialization or something similar.
Your final struct may look something like this:
struct compressedImage {
    int height;
    int width;
    std::vector< std::pair<int, cv::Vec3b> > data;
};
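For what it's worth, here is an untested sketch of an encoder built around that struct; it scans the whole image, so a run may span row boundaries, and it relies on cv::Vec3b's element-wise operator==:
#include <opencv2/opencv.hpp>
#include <utility>
#include <vector>

struct compressedImage {
    int height;
    int width;
    std::vector< std::pair<int, cv::Vec3b> > data; // (run length, pixel value)
};

compressedImage rleEncode(const cv::Mat& img)
{
    compressedImage out;
    out.height = img.rows;
    out.width = img.cols;
    for (int i = 0; i < img.rows; i++) {
        for (int j = 0; j < img.cols; j++) {
            const cv::Vec3b px = img.at<cv::Vec3b>(i, j);
            // Extend the current run if the pixel repeats, otherwise start a new one
            if (!out.data.empty() && out.data.back().second == px)
                out.data.back().first++;
            else
                out.data.push_back(std::make_pair(1, px));
        }
    }
    return out;
}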
Happy coding!
You may want to implement compression based on colour reduction with a space-filling curve or a spatial index. A spatial index reduces the 2D complexity to a 1D complexity, and it looks like a quadtree and a bit like a fractal. You want to look for Nick's hilbert curve quadtree spatial index blog!
Here is another interesting RLE encoding idea: Lossless hierarchical run length encoding. Maybe that's something for you?
If you need to abstract the raster type, you can use the GDAL C++ library. Here is the list of raster formats supported by default or on request:
http://gdal.org/formats_list.html