Is there a way to have a matrix of user-defined type in OpenCV 2.x? Something like :
cv::Mat_<KalmanRGBPixel> backgroundModel;
I know cv::Mat<> is meant for images and mathematics, but I want to hold data in matrix form. I don't plan to use inverse, transpose, multiplication, etc.; it's only to store data. I want it in matrix form because pixel_ij of each frame of a video will be linked to backgroundModel_ij.
I know there is a DataType<_Tp> class in core.hpp that needs to be defined for my type but I'm not sure how to do it.
EDIT : KalmanRGBPixel is only a wrapper for cv::KalmanFilter class. As for now, it's the only member.
... some functions ...
private:
cv::KalmanFilter kalman;
Thanks for your help.
I have a longer-winded answer for anybody wanting to create a matrix of custom objects, of whatever size.
You will need to specialize the DataType template, but instead of having 1 channel, you set the number of channels to the size of your custom object in bytes. You may also need to override a few functions to get the expected functionality, but more on that later.
First, here is an example of my custom type template specialization:
typedef HOGFilter::Sample Sample;
namespace cv {
template<> class DataType<Sample>
{
public:
typedef HOGFilter::Sample value_type;
typedef HOGFilter::Sample channel_type;
typedef HOGFilter::Sample work_type;
typedef HOGFilter::Sample vec_type;
enum {
depth = CV_8U,
channels = sizeof(HOGFilter::Sample),
type = CV_MAKETYPE(depth, channels),
};
};
}
Second, you may want to override some functions to get the expected functionality:
// Special version of Mat: a matrix of Samples. Uses the power of OpenCV's
// matrix manipulation and multi-threading capabilities
class SampleMat : public cv::Mat_<Sample>
{
typedef cv::Mat_<Sample> super;
public:
SampleMat(int width = 0, int height = 0);
SampleMat &operator=(const SampleMat &mat);
const Sample& at(int x, int y = 0);
};
The typedef of super isn't required but helps with readability in the .cpp file.
Notice I have overridden the constructor with width/height parameters. This is because we have to instantiate the mat this way if we want a 2D matrix.
SampleMat::SampleMat(int width, int height)
{
int count = width * height;
for (int i = 0; i < count; ++i)
{
HOGFilter::Sample sample;
this->push_back(sample);
}
*dynamic_cast<Mat_*>(this) = super::reshape(channels(), height);
}
The at<_T>() override is just for cleaner code:
const Sample & SampleMat::at(int x, int y)
{
if (y == 0)
return super::at<Sample>(x);
return super::at<Sample>(cv::Point(x, y));
}
In the OpenCV documentation it is explained how to add custom types to OpenCV matrices. You need to define the corresponding cv::DataType.
https://docs.opencv.org/master/d0/d3a/classcv_1_1DataType.html
The DataType class is basically used to provide a description of such primitive data types without adding any fields or methods to the corresponding classes (and it is actually impossible to add anything to primitive C/C++ data types). This technique is known in C++ as class traits. It is not DataType itself that is used but its specialized versions […] The main purpose of this class is to convert compilation-time type information to an OpenCV-compatible data type identifier […]
(Yes, finally I answer the question itself in this thread!)
If you don't want to use the OpenCV functionality, then Mat is not the right type for you.
Use std::vector<std::vector<Type> > instead. You can give the size during initialization:
std::vector<std::vector<Type> > matrix(42, std::vector<Type>(23));
Then you can access with []-operator. No need to screw around with obscure cv::Mats here.
If you would really need to go for an OpenCV-Matrix, you are right in that you have to define the DataType. It is basically a bunch of traits. You can read about C++ Traits on the web.
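To make the std::vector suggestion above concrete, here is a minimal sketch. PixelModel, makeGrid and observe are hypothetical names used only for illustration; PixelModel stands in for a user-defined per-pixel type such as KalmanRGBPixel and keeps a running mean instead of a real Kalman filter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a user-defined per-pixel model such as KalmanRGBPixel.
struct PixelModel {
    float mean = 0.0f;
    int updates = 0;

    // Fold a new observation into a running mean.
    void update(float value) {
        ++updates;
        mean += (value - mean) / static_cast<float>(updates);
    }
};

// Build a rows x cols grid of default-constructed elements.
template <typename T>
std::vector<std::vector<T>> makeGrid(std::size_t rows, std::size_t cols) {
    return std::vector<std::vector<T>>(rows, std::vector<T>(cols));
}

// Update the model at (row, col) from the matching pixel of a new frame.
inline void observe(std::vector<std::vector<PixelModel>>& model,
                    std::size_t row, std::size_t col, float value) {
    assert(row < model.size() && col < model[row].size());
    model[row][col].update(value);
}
```

The element at model[i][j] plays the role of backgroundModel_ij, and no OpenCV type traits are involved.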
You can create a cv::Mat that uses your own allocated memory by passing the address to the constructor. If you also want the width and height to be correct, you will need to find an OpenCV pixel type that has the same number of bytes.
I want to have a Collider interface class with an overloaded -> operator that gives direct access to the derived BoxCollider class. I want to access the members of the box collider through the interface and change the type of collider at run time.
So I thought of using templates:
template<typename T>
class ColliderV2 {
public:
virtual T* operator ->() = 0;
};
class BoxColliderV2 : public ColliderV2<BoxColliderV2> {
public:
float width;
float height;
BoxColliderV2* operator ->() {
return this;
}
};
int main()
{
ColliderV2<BoxColliderV2>* col = new BoxColliderV2;
(*col)->width = 1;
}
This works. But templates, as far as I know, generate a brand new Collider class at compile time, filling in T with BoxCollider, correct? That's why it worked. But later it prevents me from changing the collider type. I also thought of just making a virtual Collider class with a Collider* operator->() that the derived class overrides as BoxCollider* operator->().
But if I tried :
Collider<BoxCollider>* col = new BoxCollider;
(*col)->width = 1; // won't work
it doesn't work, since Collider is not BoxCollider. And I don't want to dynamic_cast to every possible collider type I could have. So, what can be done here?
As you've already found out, this doesn't work. Templates and runtime behavior are kind of contradicting mechanics. You can't create a common base class and let it act like a generic pointer to give you access to its derived types' members.
An interface specifies a contract against which you can code. You don't code against a specific implementation but against the interface, so the interface has to provide all the members that you'd like to access. In your case this would result in width and height being part of ColliderV2 instead of BoxColliderV2. However, this defeats the logic you are trying to mimic.
There are a few approaches that you can take:
Either make your collider type a variant, like
using ColliderType = std::variant<BoxColliderV2, MyOtherCollider, ...>;
and check for the actual type when you want to access the member
ColliderType myCollider = /* generate */;
if (auto boxCollider = std::get_if<BoxColliderV2>(&myCollider); boxCollider)
boxCollider->width = 0;
Or, keep the base class that you have, remove the operator-> and the template and do a dynamic cast on it:
ColliderV2* col = new BoxColliderV2;
if (auto boxCollider = dynamic_cast<BoxColliderV2*>(col); boxCollider)
boxCollider->width = 0;
You can also hide details like width or height behind more generic functions that are part of the interface. For example:
class ColliderV2 {
public:
virtual void setBounds(float width, float height) = 0;
};
class BoxColliderV2 : public ColliderV2 {
public:
void setBounds(float width, float height) override {
this->width = width;
this->height = height;
}
private:
float width;
float height;
};
int main()
{
ColliderV2* col = new BoxColliderV2;
col->setBounds(1, 1);
}
What you are trying to do is discouraged by C++. What you are trying to do is to change the type of something based on the return value of a function. The type system is designed to stop you from writing code like this.
One important restriction of a function is that it can only return one type of thing. You can return one of a list of things if you wrap those possibilities in a class and return that. In C++17, a ready-made class for this is std::variant. The restriction is that the list of things must be fixed (a closed set). If you want an arbitrary set of return values (an open set), you must use a different approach: you must restate your problem in terms of a function that is applied to the return value.
class BoxColliderV2 : public MyBaseCollider {
public:
void SetWidth(float new_width) override;
};
You may find this video useful. The bit of interest starts at around 40 minutes (but watch the whole video if you can). If you are interested in advice, I would suggest starting with std::variant, and if it works, move to virtual functions. Problems like collision detection get really complicated really quickly, and you will almost certainly require double dispatch at some stage. Start simple, because it's only going to get more complicated.
These excerpts from the ISO-Guidelines may help
1. When you change the semantic meaning of an operator, you make it harder for other programmers to understand your code (guideline).
2. Dynamic casting is verbose and ugly, but deliberately so, because dynamic casting is dangerous and should stand out (guideline).
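As an illustration of the std::variant suggestion in this answer, here is a minimal C++17 sketch. CircleCollider and boundingWidth are hypothetical names added for the example; only the box collider with width and height comes from the question. The operation is stated as a function applied to the collider, so the caller never needs a cast.

```cpp
#include <cassert>
#include <type_traits>
#include <variant>

// Hypothetical collider types; only the box mirrors the question.
struct BoxCollider { float width = 0.0f; float height = 0.0f; };
struct CircleCollider { float radius = 0.0f; };

// The closed set of collider types, fixed at compile time.
using Collider = std::variant<BoxCollider, CircleCollider>;

// An operation restated as a function on the collider:
// each alternative answers in its own way.
inline float boundingWidth(const Collider& c) {
    return std::visit([](const auto& shape) -> float {
        using T = std::decay_t<decltype(shape)>;
        if constexpr (std::is_same_v<T, BoxCollider>) {
            return shape.width;
        } else {
            return 2.0f * shape.radius;
        }
    }, c);
}
```

Assigning a different alternative to the variant changes the collider type at run time, and std::get_if still gives typed access to one specific alternative when that is really needed.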
I think you are approaching the problem from the wrong direction. The purpose of an interface is that you don't have to know about the exact type or the implementation.
For example: You are using Axis-Aligned Bounding Boxes for collision detection. So, even if your CircleCollider uses a radius, you are still able to calculate its width and height from it. Now, you don't have to worry about if you are dealing with a BoxCollider or a CircleCollider, you have everything to make a Bounding Box.
class Collider
{
public:
virtual float x() const = 0;
virtual float y() const = 0;
virtual float width() const = 0;
virtual float height() const = 0;
};
class BoxCollider : public Collider
{
// Implementation...
};
class CircleCollider : public Collider
{
// Implementation...
};
Of course, you are maybe using something else, and not AABBs. I just wanted to demonstrate how you can use interfaces effectively.
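A possible way to fill in the two implementations, as a sketch only: the constructors, the member names and the overlaps function are assumptions added for illustration, and x()/y() are taken to mean the top-left corner of the bounding box.

```cpp
#include <cassert>

// Interface from the answer: every collider can describe its bounding box.
class Collider {
public:
    virtual ~Collider() = default;
    virtual float x() const = 0;       // left edge of the bounding box (assumed)
    virtual float y() const = 0;       // top edge of the bounding box (assumed)
    virtual float width() const = 0;
    virtual float height() const = 0;
};

class BoxCollider : public Collider {
public:
    BoxCollider(float x, float y, float w, float h) : x_(x), y_(y), w_(w), h_(h) {}
    float x() const override { return x_; }
    float y() const override { return y_; }
    float width() const override { return w_; }
    float height() const override { return h_; }
private:
    float x_, y_, w_, h_;
};

class CircleCollider : public Collider {
public:
    CircleCollider(float cx, float cy, float r) : cx_(cx), cy_(cy), r_(r) {
        assert(r >= 0.0f);
    }
    // The bounding box is derived from the centre and the radius.
    float x() const override { return cx_ - r_; }
    float y() const override { return cy_ - r_; }
    float width() const override { return 2.0f * r_; }
    float height() const override { return 2.0f * r_; }
private:
    float cx_, cy_, r_;
};

// Axis-aligned bounding-box overlap test, written only against the interface.
inline bool overlaps(const Collider& a, const Collider& b) {
    return a.x() < b.x() + b.width() && b.x() < a.x() + a.width() &&
           a.y() < b.y() + b.height() && b.y() < a.y() + a.height();
}
```

The overlap test never asks which concrete collider it received, which is the point the answer makes about interfaces.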
I'm making a matrix calculator as a project for my C++ class in college and I'm not sure how to design the classes for it. My problem is that one of the traits of this program has to be that sparse and dense matrices should be stored in different ways for memory efficiency (dense as typical 2D array or vector, sparse in CSR format for example), but I need to handle both of the types in same way.
So far I was thinking of having an abstract class 'MatrixWrapper', which would contain all the shared algorithms for adding, multiplying, GEM, and so on, and then classes 'MatrixDense' and 'MatrixSparse', which would both inherit from 'MatrixWrapper' and therefore have the same interface (shown in the code below). But that's where I got stuck: when I tried implementing the algorithms in 'MatrixWrapper', I didn't know which of the two matrix types I'd be working with. I'm not sure how to solve this, or even whether my approach is correct.
class MatrixWrapper {
public:
// shared algorithms
/* for example
void addMatrix ( const ??? &x ) {
...
}
*/
};
class MatrixDense : public MatrixWrapper {
public:
//constructor, destructor, ...
private:
vector< vector<double> > matrix;
};
class MatrixSparse : public MatrixWrapper {
public:
//constructor, destructor, ...
private:
struct CSR {
...
};
CSR matrix;
};
I was also thinking about adding a 2D array to 'MatrixWrapper' alongside an abstract method setValue(), then having 'MatrixSparse' and 'MatrixDense' fill this array through that method, and working only with that 2D array in 'MatrixWrapper'. But I'm not sure how to implement that, or even whether that's the right approach.
Implement all binary operators using non-member functions. Either global functions, or functions inside an unrelated class:
// Option 1
void add(
MatrixWrapper& result,
const MatrixWrapper& operand1,
const MatrixWrapper& operand2);
// Option 2
struct WrapperForMatrixOperations // I don't know why you might want this class to exist
{
static // or maybe not static
void add(
MatrixWrapper& result,
const MatrixWrapper& operand1,
const MatrixWrapper& operand2);
};
The reason is, your algorithm will probably return a "dense" matrix when adding a dense and a sparse matrix:
dense + sparse = dense
sparse + sparse = sparse
sparse + dense = dense <- problem!
dense + dense = dense
This cannot work if it is implemented as a non-const member function.
You should also decide how you want to create your matrices - maybe each binary operation should allocate a new matrix and return it by shared_ptr?
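A minimal sketch of the non-member approach, under stated assumptions: the class names, the get/set/isSparse members and the choice of std::unique_ptr for the result are illustrative, and a coordinate map stands in for CSR to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Minimal read-only interface shared by both storage schemes (illustrative).
class Matrix {
public:
    virtual ~Matrix() = default;
    virtual std::size_t rows() const = 0;
    virtual std::size_t cols() const = 0;
    virtual double get(std::size_t r, std::size_t c) const = 0;
    virtual bool isSparse() const = 0;
};

class DenseMatrix : public Matrix {
public:
    DenseMatrix(std::size_t r, std::size_t c) : data_(r, std::vector<double>(c, 0.0)) {}
    std::size_t rows() const override { return data_.size(); }
    std::size_t cols() const override { return data_.empty() ? 0 : data_[0].size(); }
    double get(std::size_t r, std::size_t c) const override { return data_[r][c]; }
    bool isSparse() const override { return false; }
    void set(std::size_t r, std::size_t c, double v) { data_[r][c] = v; }
private:
    std::vector<std::vector<double>> data_;
};

// Assumption: a coordinate map replaces CSR here; only non-zero entries are stored.
class SparseMatrix : public Matrix {
public:
    SparseMatrix(std::size_t r, std::size_t c) : rows_(r), cols_(c) {}
    std::size_t rows() const override { return rows_; }
    std::size_t cols() const override { return cols_; }
    double get(std::size_t r, std::size_t c) const override {
        const Key key{r, c};
        const auto it = data_.find(key);
        return it == data_.end() ? 0.0 : it->second;
    }
    bool isSparse() const override { return true; }
    void set(std::size_t r, std::size_t c, double v) {
        const Key key{r, c};
        if (v != 0.0) data_[key] = v; else data_.erase(key);
    }
private:
    using Key = std::pair<std::size_t, std::size_t>;
    std::size_t rows_, cols_;
    std::map<Key, double> data_;
};

// Non-member addition: the result type depends on both operands.
// A real CSR version would merge the non-zero entries instead of visiting every cell.
inline std::unique_ptr<Matrix> add(const Matrix& a, const Matrix& b) {
    assert(a.rows() == b.rows() && a.cols() == b.cols());
    if (a.isSparse() && b.isSparse()) {
        auto out = std::make_unique<SparseMatrix>(a.rows(), a.cols());
        for (std::size_t r = 0; r < a.rows(); ++r)
            for (std::size_t c = 0; c < a.cols(); ++c)
                out->set(r, c, a.get(r, c) + b.get(r, c));
        return out;
    }
    auto out = std::make_unique<DenseMatrix>(a.rows(), a.cols());
    for (std::size_t r = 0; r < a.rows(); ++r)
        for (std::size_t c = 0; c < a.cols(); ++c)
            out->set(r, c, a.get(r, c) + b.get(r, c));
    return out;
}
```

Because add is not a member, sparse + dense can produce a dense result without either operand having to change its own storage.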
I have the following use case.
I need to create an Image class. An image is defined by:
the number of pixels (width * height),
the pixel type (char, short, float, double)
the number of channels (single channel, 3 channels (RGB), 4 channels (RGBA))
All combinations of the above types shall be possible.
Furthermore,
I have some algorithms that operate over those images. These algorithms use templates for the pixel type.
I need to interface with totally generic file formats (e.g. TIFF). In these file formats, the pixel data is saved as a binary stream.
My question is the following: should I use a templated Image class, or a generic interface? Example:
// 'Generic' Image interface
class Image {
...
protected:
// Totally generic data container
uint8_t* data;
};
// Template Image interface
template <typename PixelType>
class Image {
...
protected:
// Template data container
PixelType* data;
};
Using Template Image Class
My problem now is that, if I use the templated Image class, my file input/output will be messy: when I open an image file, I don't know a priori what the image type will be, so I don't know what template type to return.
This would probably be the optimal solution, if I could figure out a way of creating a generic function that would read an Image from a file and return a generic object, something similar to
ImageType load(const char* filename);
but since ImageType would have to be a template, I don't know how and if I could do this.
Using Generic Image Class
However, if I use a generic Image class, all my algorithms will need a wrapper function with an if/switch statement like:
Image applyAlgorithmWrapper(const Image& source, Arguments args) {
if (source.channels() == 1) {
if (source.type() == IMAGE_TYPE_UCHAR) {
return FilterFunction<unsigned char>(source, args);
}
else if (source.type() == IMAGE_TYPE_FLOAT) {
return FilterFunction<float>(source, args);
} else if ...
} else if (source.channels() == 3) {
if (source.type() == IMAGE_TYPE_UCHAR) {
return FilterFunction<Vec3b>(source, args);
}
...
}
NOTE: Vec3b is a generic 3-byte structure like
struct Vec3b {
char r, g, b;
};
In my opinion a templated class is the preferred solution.
It will offer you all the advantages of templates which basically mean that your codebase would be cleaner and simpler to understand and maintain.
What you describe as a problem with a templated class is not much of a problem. A user who wants to read an image should know the data type in which they want to receive the contents of the image file. Hence, the user would do it like this:
Image<float> img;
LoadFromTIFF(img, <filename>);
This is very similar to the way it is done in libraries such as ITK. In the module you write to read TIFF files, you perform the type conversion needed to return the type that the user declared.
When manually creating an image, the user should do something like :
Image<float> img;
img.SetSize(<width>, <height>);
img.SetChannels(<enum_channel_type>);
It is all much simpler in the long run than having a non-templated class.
You could take a look at the source code of ITK to get an idea of how this can be implemented in the most generic sense, as ITK is a highly templated library.
EDIT (Addendum)
If you do not want the user to have a priori control over the image data type, you should consider using the SMinSampleValue and SMaxSampleValue tags in the TIFF header. These tags are defined in the TIFF 6.0 specification and are intended to have a field type that matches the sample data type in the TIFF file. That, I believe, would solve your problem.
To make the right decision (based on facts rather than opinion) about template versus non-template, my strategy is to measure and compare for both solutions (templates and non-templates). I like to measure the following indicators:
number of lines of code
performance
compilation time
as well as other more subjective measures such as:
ease of maintenance
how long it takes a newcomer to understand the code
I developed a fairly large piece of software [1], and based on these measures, my image class is not a template. I know another imaging library that offers both options [2] (but I do not know what mechanisms it uses for that, or whether the code remains legible). I also had some algorithms operating on points of various dimensions (2d, 3d, ... nd), and for those, making the algorithm a template resulted in a performance gain that made it worth it.
In short, to make the right decision, have clear criteria, clear way of measuring them, and try both options on a toy example.
[1] http://alice.loria.fr/software/graphite/doc/html/
[2] http://opencv.org/
Templates. And a variant. And an 'interface helper', if you don't yet have C++14. Let me explain.
Whenever you have a limited set of specializations for a given operation, you can model them as classes satisfying an interface or concept. If these can be expressed as one template class, then do so. It helps your users when they only want a given specialization and all you need is a factory when you read from untyped source (e.g. file). Note that you need a factory anyway, it's just that the return type is well-defined normally. And this is where we come to...
Variants. Whenever you don't know your return type, but you know at compile time the set of possible return types, use a variant. Typedef your variant so it 'looks like' a base class (note that there is no inheritance and there are no virtual functions involved), then use a visitor. A particularly easy way to write a visitor in C++14 is a generic lambda that captures everything by reference. In essence, from that point in your code, you have the specific type. Therefore, take the specific/templated classes as function arguments.
Now, a boost::variant<> (or std::variant<> if you have it) cannot have member functions. Either you resort to 'C-API style' generic functions (that possibly just delegate to the member functions) and symmetric operators, or you have a helper class that's created from your variant type. If your CR allows it, you might derive from variant. Note that some consider this terrible style, while others accept it as the library writers' intention (because, had the writers wanted to forbid inheritance, they would have written final).
Code sketch, do not try to compile:
enum PixelFormatEnum { eUChar, eVec3d, eDouble };
template<PixelFormatEnum>
struct PixelFormat;
template<>
struct PixelFormat<eUChar>
{
typedef unsigned char type;
};
// ...
template<PixelFormatEnum pf>
using PixelFormat_t = typename PixelFormat<pf>::type;
template<PixelFormatEnum pf>
struct Image
{
std::vector<std::vector<PixelFormat_t<pf> > > pixels; // or anything like that
// ...
};
typedef boost::variant< Image<eUChar>, Image<eVec3d>, Image<eDouble> > ImageVariant;
template<typename F>
struct WithImageV : boost::static_visitor<void>
{
// you could do this better, e.g. with compose(f, bsv<void>), but...
F f_;
template<PixelFormatEnum e>
void operator()(const Image<e>& img)
{
f_(img);
}
};
template<typename F>
void WithImage(const ImageVariant& imgv, F&& f)
{
WithImageV<F> v{f};
boost::apply_visitor(v, imgv);
}
std::experimental::optional<ImageVariant> ImageFactory(std::istream& is)
{
switch (read_pixel_format(is))
{
case eUChar: return Image<eUChar>(is);
// ...
default: return std::experimental::nullopt;
}
}
struct MyFavoritePixelOp : public boost::static_visitor<int>
{
template<PixelFormatEnum e>
int operator()(PixelFormat_t<e> pixel) { return pixel; }
template<>
int operator()(PixelFormat_t<eVec3d> pixel) { return pixel.r + pixel.g + pixel.b; }
};
int f_for_variant(const ImageVariant& imgv)
{
// this is slooooow. Use it only if you have to, e.g., for loading.
// Move the apply_visitor out of the loop whenever you can (here you could).
int sum = 0;
for (auto&& row : imgv.pixels)
for (auto&& pixel : row)
sum += boost::apply_visitor(MyFavoritePixelOp(), pixel);
return sum;
}
template<PixelFormatEnum e>
int f_for_type(const Image<e>& img)
{
// this is faster
int sum = 0;
for (auto&& row : img.pixels)
for (auto&& pixel : row)
sum += MyFavoritePixelOp()(pixel);
return sum;
}
int main() {
// ...
if (auto imgvOpt = ImageFactory(is))
{
// 1 - variant
int res = f_for_variant(*imgvOpt);
std::cout << res;
// 2 - template
WithImage(*imgvOpt, [&](auto&& img) {
int res2 = f_for_type(img);
std::cout << res2;
});
}
}
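Since the sketch above is explicitly not meant to compile, here is a smaller compilable variant of the same idea in C++17, with std::variant and std::optional swapped in for the Boost and experimental types. Rgb, AnyImage, the PixelFormat tag and the makeImage factory are hypothetical; the factory fills the image with a constant instead of reading a stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <variant>
#include <vector>

// Hypothetical three-channel pixel.
struct Rgb { std::uint8_t r, g, b; };

template <typename Pixel>
struct Image {
    int width = 0;
    int height = 0;
    std::vector<Pixel> pixels;  // row-major, width * height entries
};

// The closed set of image types a loader may produce.
using AnyImage = std::variant<Image<std::uint8_t>, Image<float>, Image<Rgb>>;

// Runtime tag as it might be read from a file header (hypothetical).
enum class PixelFormat { U8, F32, Rgb8, Unknown };

// Factory: maps the runtime tag to one of the static types.
inline std::optional<AnyImage> makeImage(PixelFormat fmt, int w, int h) {
    const auto n = static_cast<std::size_t>(w) * static_cast<std::size_t>(h);
    switch (fmt) {
        case PixelFormat::U8:
            return AnyImage{Image<std::uint8_t>{w, h, std::vector<std::uint8_t>(n, 1)}};
        case PixelFormat::F32:
            return AnyImage{Image<float>{w, h, std::vector<float>(n, 0.5f)}};
        case PixelFormat::Rgb8:
            return AnyImage{Image<Rgb>{w, h, std::vector<Rgb>(n, Rgb{1, 2, 3})}};
        default:
            return std::nullopt;
    }
}

// Per-pixel contribution, one overload per pixel type.
inline double intensity(std::uint8_t p) { return p; }
inline double intensity(float p) { return p; }
inline double intensity(const Rgb& p) { return p.r + p.g + p.b; }

// Templated algorithm: works on a concrete pixel type.
template <typename Pixel>
double sumIntensity(const Image<Pixel>& img) {
    double sum = 0.0;
    for (const Pixel& p : img.pixels) sum += intensity(p);
    return sum;
}

// Dispatch once per image, outside the pixel loop.
inline double sumIntensityAny(const AnyImage& any) {
    return std::visit([](const auto& img) { return sumIntensity(img); }, any);
}
```

The generic lambda passed to std::visit plays the role of WithImage: inside it the concrete Image type is known, so the templated algorithm is called directly.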
I have three classes: TImageProcessingEngine, TImage and TProcessing.
TImageProcessingEngine is the one I am using to expose all my methods to the world.
TImage is the one I plan to use for generic image read and image write functions.
TProcessing contains methods that will perform imaging operations.
class TImageProcessingEngine
{
public:
TImage* mpImageProcessingEngine;
};
class TImage
{
public:
int ReadImage();
int WriteImage();
private:
//a two dimensional array holding the pixel values
tImageMatrix* mpImageMatrix;
};
class TProcessing
{
public:
int ConvertToBinary();
int ConvertToGrayScale();
};
My question is: how do I access the object mpImageMatrix in class TProcessing, so that my calling application can use the following?
TImageProcessingEngine* vEngine = new TImageProcessingEngine;
//Converts an input grayscale image to a binary image
vEngine->ReadImage().ConvertToBinary();
//Write the converted image to disk
vEngine->WriteImage();
delete vEngine;
vEngine = NULL;
//During this whole processing internally,
//the image is read in to `mpImageMatrix`
//and will also be holding the binarised image data,
//till writing the image to disk.
Or do you recommend another approach to my class design?
I would certainly recommend a different implementation, but let's check the design first.
I don't really understand the added value of TImageProcessingEngine, it doesn't bring any functionality.
My advice would be quite simple in fact:
Image class, to hold the values
Processing class (interface), to apply operations
Encoder and Decoder classes (interfaces), to read and write to different formats
It makes sense for the Processing class to have access to the image's internals only if you can gain efficiency from it (which is likely). In that case you can simply make Processing a friend and have it unpack the values for its derived classes:
class Image
{
public:
Image();
void Accept(Processing& p);
void Encode(Encoder& e) const; // Image is not modified by encoding
void Decode(Decoder& d); // This actually resets the image content
private:
friend class Processing;
size_t mHeight;
size_t mWidth;
std::vector<Pixel> mPixels; // 2D array of Pixels
};
class Processing
{
public:
void apply(Image& image)
{
this->applyImpl(image.mHeight, image.mWidth, image.mPixels);
}
private:
virtual void applyImpl(size_t h, size_t w, std::vector<Pixel>& pixels) = 0;
};
Encoder and Decoder follow the same principle.
Note how I never needed an explicit pointer, and the guaranteed correctness that results from it.
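As a sketch of how a concrete operation could plug into this design: the Pixel alias, the Image constructor, the at/set accessors and the ConvertToBinary class below are assumptions added so the example is self-contained; Encoder and Decoder are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Pixel = std::uint8_t;  // assumption: one 8-bit gray value per pixel

class Image;

class Processing {
public:
    virtual ~Processing() = default;
    void apply(Image& image);  // defined after Image, whose internals it unpacks
private:
    virtual void applyImpl(std::size_t h, std::size_t w, std::vector<Pixel>& pixels) = 0;
};

class Image {
public:
    Image(std::size_t height, std::size_t width, Pixel fill)
        : mHeight(height), mWidth(width), mPixels(height * width, fill) {}

    void Accept(Processing& p) { p.apply(*this); }

    Pixel at(std::size_t row, std::size_t col) const {
        assert(row < mHeight && col < mWidth);
        return mPixels[row * mWidth + col];
    }
    void set(std::size_t row, std::size_t col, Pixel value) {
        assert(row < mHeight && col < mWidth);
        mPixels[row * mWidth + col] = value;
    }
private:
    friend class Processing;  // only the base class sees the internals
    std::size_t mHeight;
    std::size_t mWidth;
    std::vector<Pixel> mPixels;  // row-major storage
};

inline void Processing::apply(Image& image) {
    applyImpl(image.mHeight, image.mWidth, image.mPixels);
}

// Hypothetical concrete operation: threshold every pixel to black or white.
class ConvertToBinary : public Processing {
public:
    explicit ConvertToBinary(Pixel threshold) : mThreshold(threshold) {}
private:
    void applyImpl(std::size_t, std::size_t, std::vector<Pixel>& pixels) override {
        for (Pixel& p : pixels) p = static_cast<Pixel>(p >= mThreshold ? 255 : 0);
    }
    Pixel mThreshold;
};
```

Friendship is not inherited, so ConvertToBinary only ever sees the height, width and pixel vector that Processing::apply hands to it.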
First off, based on your provided code there are no ReadImage() & WriteImage() functions in the TImageProcessingEngine class, so the later code where you use such functionality is flawed.
As for the solution, you can make a getter function for the tImageMatrix pointer like this:
tImageMatrix* GetImageMatrix() { return mpImageMatrix; }
Then just pass that pointer (or a pointer to the whole TImage instance) to the TProcessing function you want to call.
Why do you want a separate TProcessing class when its functions only access mpImageMatrix?
In OOP, you bind data members and their operations together.
So, IMO, remove your TProcessing class and put both functions in TImage.
Your TImage would look like:
class TImage
{
public:
int ReadImage();
int WriteImage();
int ConvertToBinary();
int ConvertToGrayScale();
private:
//a two dimensional array holding the pixel values
tImageMatrix* mpImageMatrix;
};
You could create an accessor in the TImage class:
byte * pixelAt(unsigned x, unsigned y);
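One way such an accessor might be used, as a rough sketch: the byte alias, the row-major std::vector storage, the width()/height() getters and the free function convertToBinary are all assumptions, since the original tImageMatrix type is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using byte = unsigned char;  // assumption: one byte per pixel

class TImage {
public:
    TImage(unsigned width, unsigned height)
        : mWidth(width), mHeight(height),
          mData(static_cast<std::size_t>(width) * height, 0) {}

    unsigned width() const { return mWidth; }
    unsigned height() const { return mHeight; }

    // Pointer to the pixel at column x, row y (row-major storage).
    byte* pixelAt(unsigned x, unsigned y) {
        assert(x < mWidth && y < mHeight);
        return &mData[static_cast<std::size_t>(y) * mWidth + x];
    }
private:
    unsigned mWidth;
    unsigned mHeight;
    std::vector<byte> mData;
};

// Hypothetical processing routine that touches pixels only through the accessor.
inline void convertToBinary(TImage& image, byte threshold) {
    for (unsigned y = 0; y < image.height(); ++y) {
        for (unsigned x = 0; x < image.width(); ++x) {
            byte* p = image.pixelAt(x, y);
            *p = static_cast<byte>(*p >= threshold ? 255 : 0);
        }
    }
}
```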
I have searched the web but could not find an answer.
How do I set the base index of the matrix, so that indexes start from a value other than zero? For example:
A(-3:1) // Matlab/fortran equivalent
A.reindex(-3); // boost multi-array equivalent
thanks
Your search appears to be correct; uBLAS does not seem to have such a function.
If you really need this functionality, perhaps you could consider subclassing the matrix and overriding the operator() to fiddle with the indices for you. For example:
using namespace boost::numeric::ublas;
template<typename T>
class Reindexable : public matrix<T>
{
public:
Reindexable() : m_offset(0) {}
void reindex(int offset) { m_offset = offset; }
T& operator()(int i) { return matrix<T>::operator()(i + m_offset); }
/* Probably more implementation needed here ... */
private:
int m_offset;
};
I've been programming in VB.NET (ughh!) and C# lately, so I'm a little rusty on my C++ syntax and have probably made a few mistakes in the above, but the general idea should work. You subclass the matrix so that you can provide a reindex operation and override the parenthesis operator so that it is aware of the new index offset. Of course, in the actual implementation, you will need offsets for each dimension of the matrix.
Also, if you ever have a reference or pointer to your Reindexable, and the type of the reference/pointer is matrix<T>, then you will be using the old index operator, so be careful!
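To sidestep the pitfall just described, the same idea can be sketched with composition instead of inheritance. This is an illustrative one-dimensional version over std::vector, not uBLAS code; OffsetVector and its members are hypothetical names. Here the stored value is the base index itself (for example -3), so the storage position is i - base.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Vector whose valid indices start at an arbitrary base, e.g. -3..1.
template <typename T>
class OffsetVector {
public:
    OffsetVector(int base, std::size_t size) : base_(base), data_(size) {}

    // Change the index of the first element; the stored values stay in place.
    void reindex(int base) { base_ = base; }

    int lowerBound() const { return base_; }
    int upperBound() const { return base_ + static_cast<int>(data_.size()) - 1; }

    T& operator()(int i) { return data_[toStorage(i)]; }
    const T& operator()(int i) const { return data_[toStorage(i)]; }

private:
    // Map a user-facing index to a zero-based storage position.
    std::size_t toStorage(int i) const {
        assert(i >= lowerBound() && i <= upperBound());
        return static_cast<std::size_t>(i - base_);
    }

    int base_;
    std::vector<T> data_;
};
```

Because the wrapper owns its storage rather than deriving from it, there is no base-class operator() that could be called by accident with unshifted indices.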