Implementation help on database video matching in OpenCV - C++

I'm trying to develop a program to match query videos against videos in a database, something like Google image search but for video clips. I'm using OpenCV for this implementation.
I can successfully generate feature descriptors for selected frames in the video (which, to be honest, isn't that hard).
I have no clue how I should implement the database part. I was wondering if somebody could give me pointers to what already exists in OpenCV to ease my implementation: what classes to extend, how to structure it, any existing implementations/examples, and so on?
Basically I would have a set of descriptors from a query clip which I want to compare to a (large) database of precomputed descriptors, and then have some voting algorithm to return a best match.
So I'm not asking which algorithms to use; I just want to know the best practices in OpenCV for implementing the kind of thing I'm describing.
I am using OpenCV 2.4.8 with C++ on a Mac in Xcode, if that somehow matters.
EDIT
Let me make my question a little bit more specific.
If I use a SIFT or SURF detector/descriptor I get a lot of features. Doing this for a bunch of frames across a bunch of videos will result in a lot of data.
If I use FlannBasedMatcher (or something of that sort) I have to keep everything in memory... but this seems very unreasonable... So somehow there should be a way to implement this using a database. I'm looking for any hints on how to do this and what kind of database to use...
thanks

You could cluster the features into a smaller number of classes (a visual vocabulary), and store/search the class ID before comparing actual features.
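A minimal sketch of that idea, using the bag-of-visual-words support built into OpenCV 2.4.x (the Hessian threshold and cluster count below are illustrative assumptions, not tuned values; SURF lives in the nonfree module):

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/nonfree/features2d.hpp> // SURF (nonfree module)
#include <vector>

// Cluster SURF descriptors from sampled frames into a visual vocabulary.
cv::Mat buildVocabulary(const std::vector<cv::Mat>& frames, int clusterCount)
{
    cv::SurfFeatureDetector detector(400);      // Hessian threshold: an assumption
    cv::SurfDescriptorExtractor extractor;
    cv::BOWKMeansTrainer trainer(clusterCount); // k-means over all descriptors

    for (size_t i = 0; i < frames.size(); ++i) {
        std::vector<cv::KeyPoint> keypoints;
        cv::Mat descriptors;
        detector.detect(frames[i], keypoints);
        extractor.compute(frames[i], keypoints, descriptors);
        if (!descriptors.empty())
            trainer.add(descriptors);
    }
    return trainer.cluster(); // one row per visual word
}
```

Each frame (or clip) can then be reduced to a histogram over these words with cv::BOWImgDescriptorExtractor. Those histograms are compact enough to keep in an ordinary database (SQLite, for example) as an inverted index from word ID to video ID, so voting for the best match no longer needs every raw descriptor in memory.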

Related

Machine learning for cancer diagnosis on an image dataset with Python

I am working on this project assigned by my university as a final project, but the issue is that I am not getting any help from the internet, so I thought asking here might solve the issue. I have read many articles, but they had no code or guidance, and I am confused about what to do. Basically it is an image-processing task with machine learning. A dataset can be found easily, but the issue is the Python learning algorithm and code.
I presume that if it's your final project you have to create the program yourself rather than ripping it straight from the internet. If you want a good starting point that you can customise, TensorFlow from Google is very good. You'll want to understand how it works (i.e. how machine learning works), but as a first step there's a good example of image processing on the website in the form of number recognition (which is also the "Hello World" of machine learning).
https://www.tensorflow.org/get_started/mnist/beginners
This also provides a good intro to machine learning with neural nets: https://www.youtube.com/watch?v=uXt8qF2Zzfo
One note on TensorFlow: you'll probably have to use Python 3.5+, as in my experience it can be difficult to get it working on 2.7.
First of all, I need to know what type of data you are using, because depending on the data (MRI, PET scan, or CT) there could be different suggestions for using machine learning in Python for detection.
However, supposing your main dataset consists of MR images, I am attaching an article which I found to be a great overview of different methods:
This project compares four different machine learning algorithms: Decision Tree, Majority, Nearest Neighbors, and Best Z-Score (an algorithm of my own design that is a slight variant of the Naïve Bayes algorithm)
https://users.soe.ucsc.edu/~karplus/abe/Science_Fair_2012_report.pdf
Here, breast cancer and colorectal cancer have been considered and the algorithms that performed best (Best Z-Score and Nearest Neighbors) used all features in classifying a sample. Decision Tree used only 13 features for classifying a sample and gave mediocre results. Majority did not look at any features and did worst. All algorithms except Decision Tree were fast to train and test. Decision Tree was slow, because it had to look at each feature in turn, calculating the information gain of every possible choice of cutpoint.
My solution:
The Lung Image Database Consortium provides an open-access dataset of lung cancer images.
Download it, then apply any machine learning algorithm to classify images as containing tumor cells or not.
I attached a link to a reference paper; they applied a neural network to classify the images.
For the coding part, use Python's OpenCV bindings for image pre-processing and segmentation (see the sketch after this answer).
When it comes to the classification part, use whichever machine learning library you are comfortable with (TensorFlow, Keras, Torch, scikit-learn, and many more) and perform the classification with whichever well-performing algorithm you prefer.
That's it.
Link for Reference Journal
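To make the pre-processing/segmentation step concrete: the answer suggests the Python bindings, but the calls mirror the C++ API used elsewhere on this page, so here is a hedged C++ sketch (the blur size and Otsu thresholding are assumptions for illustration, not a validated medical pipeline):

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <vector>

// Basic pre-processing and segmentation: denoise, Otsu threshold,
// then extract candidate region contours to feed a later classifier.
std::vector<std::vector<cv::Point> > segmentRegions(const cv::Mat& scan)
{
    cv::Mat gray = scan, blurred, mask;
    if (scan.channels() > 1)
        cv::cvtColor(scan, gray, CV_BGR2GRAY);

    cv::GaussianBlur(gray, blurred, cv::Size(5, 5), 0);
    cv::threshold(blurred, mask, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);

    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(mask, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
    return contours;
}
```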

3D model files & creating triangle meshes from them

I'm new here. I'm still learning C++, so I don't have deep knowledge of it yet (well, I have a lot of programming experience, but with other languages; I also know how to work with a computer xD). I decided to start working on a game engine. I know it isn't easy, but that's one reason why I want to do it: trying things and gaining experience is the best way for me to learn.
So, I'm starting with the renderer, but I have no idea how I could make a 3D model file which, as the name says, would contain a 3D model that could be loaded into a triangle mesh in the C++ engine and used with Direct3D. I don't want you to do anything except give me an idea of how to start.
Thank you very much if you decided to read this & help.
This is a great question and rings so true because you are in exactly the same place that I was 3 years ago. I looked around at various applications that I could use to generate 3D content. I didn't want to spend time learning various applications because I wanted to get on with coding my game engine. I couldn't find anything quick enough to learn to make it worth my while (that was also free of charge), so in the end I put some scripts together to create the 3D models (vertex arrays). These developed into a GUI-driven interface, which developed into my own fully functioning 3D modelling tool. Now, after 3 years, my game engine is on the back burner and MeshCreator (check out http://www.MeshCreator.co.uk) is my ongoing project.
You need a way of loading in vertex arrays. OBJ (Wavefront) files are text-based and fairly easy to read. The output from MeshCreator is also ASCII-based, and there are details on the website on how to interpret the file format.
The file output from MeshCreator is designed for easy import into any 3D project. You could also look at .OBJ files, as they are supported by most 3D apps (including mine). OBJ files can get a little complex though.
In short, you need an easy-to-use tool. It took me 3 years to write my own, so I wouldn't recommend that. Try out MeshCreator, MilkShape 3D and maybe Google SketchUp. Also, there are many sites that have royalty-free 3D content that you can use in your projects, and much of it is available in OBJ format.
3D model file formats can be very complicated, mostly in the sheer variety of things they can contain (meshes, primitives, curved surfaces, materials, lighting, etc.). Which also means that it can be very hard to come up with your own format. So, there are really two options here: find an off-the-shelf library to handle the loading of models; or, pick a pre-existing format and write your own code to load those files (and use a standard editor to create them).
The main problem with using an off-the-shelf library to load the models is that these libraries are often intimately tied to a particular renderer and usually very complex in the (possible) data structures that are generated when loading the files. So, this is often a better solution only if you are prepared to adopt the renderer that the model-loading library is a part of or tied to. There are options like Irrlicht, Ogre3D, Coin3D, etc., which all have reasonable capabilities for loading fairly standard 3D model file formats.
If you are going to pick a pre-existing file format and create the loading code yourself (and thus tie it to your own renderer), then you should pick carefully. The .3ds file format is very widely used and its internal structure is fairly simple, with the advantage that you can easily skip elements that you can't support yet (this is useful when you are incrementally writing your rendering code). Most other formats have similar features (forward and backward compatibility, and the ability to skip "unsupported" elements). There are also important open-standard formats, VRML and X3D, that you might want to take a look at, as well as some simpler formats associated with open-source 3D editors like Blender. It is important to have a good 3D editor that can output a file format that you can load. That's why you shouldn't create your own file format: you would have a lot of extra work to do, either in creating custom "export" scripts (or plugins) for some 3D editor, or in creating your own editor (a monumental task).
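To make the "skip what you don't support" point concrete, here is a hedged sketch of walking .3ds chunks: every chunk starts with a 2-byte ID and a 4-byte total length (header included), so unknown chunks can be skipped wholesale. The two container IDs shown are the commonly documented ones; error handling and byte-order conversion (the format is little-endian) are omitted:

```cpp
#include <cstdint>
#include <fstream>
#include <ios>

// Recursively walk .3ds chunks, descending into known containers and
// skipping anything unrecognised. Assumes a little-endian host.
void walkChunks(std::ifstream& file, std::streampos end)
{
    while (file && file.tellg() < end) {
        uint16_t id = 0;
        uint32_t length = 0; // total chunk length, including this 6-byte header
        file.read(reinterpret_cast<char*>(&id), 2);
        file.read(reinterpret_cast<char*>(&length), 4);
        if (!file) break;

        switch (id) {
        case 0x4D4D:  // MAIN3DS: container, recurse into its sub-chunks
        case 0x3D3D:  // EDIT3DS: container, recurse into its sub-chunks
            walkChunks(file, file.tellg() + std::streamoff(length - 6));
            break;
        default:      // unsupported chunk: skip its payload entirely
            file.seekg(length - 6, std::ios::cur);
        }
    }
}
```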
Search around. If you want to base your engine on DirectX, I believe the SDK comes with code samples explaining the basics, like loading meshes.
If you're just learning C++, I'd recommend you start with a much simpler project like making a simple game with an engine like Irrlicht or OGRE, but that's up to you.
I think you need something like OBJ files. You can find free files online to play with. If you want to generate your own, you'll need some 3D modelling software. You can also search for how to parse them and load their content into your programs; a minimal parsing sketch follows.
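Since a couple of the answers point at OBJ, here is a minimal sketch that loads only the `v` (vertex) and `f` (face) records of an already-triangulated Wavefront OBJ file; texture coordinates, normals, negative indices and polygon faces are deliberately ignored:

```cpp
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

struct Vec3 { float x, y, z; };

// Loads 'v' and 'f' lines from a triangulated OBJ file.
// OBJ indices are 1-based, so 1 is subtracted on load.
bool loadObj(const std::string& path,
             std::vector<Vec3>& vertices,
             std::vector<unsigned>& indices)
{
    std::ifstream file(path.c_str());
    if (!file) return false;

    std::string line;
    while (std::getline(file, line)) {
        std::istringstream in(line);
        std::string tag;
        in >> tag;
        if (tag == "v") {                   // vertex position
            Vec3 v;
            in >> v.x >> v.y >> v.z;
            vertices.push_back(v);
        } else if (tag == "f") {            // triangular face
            for (int i = 0; i < 3; ++i) {
                std::string corner;         // may look like "7", "7//1" or "7/2/1"
                in >> corner;
                // keep only the vertex index before the first '/'
                const std::string idx = corner.substr(0, corner.find('/'));
                indices.push_back(static_cast<unsigned>(std::atoi(idx.c_str())) - 1);
            }
        }                                   // everything else is ignored
    }
    return true;
}
```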

Audio Subtitle Transcription - C++

I'm on a project that, among other video-related tasks, should eventually be capable of extracting the audio of a video, applying some kind of speech recognition to it, and getting a transcription of what's said in the video. Ideally it should output some kind of subtitle format, so that the text is linked to a certain point in the video.
I was thinking of using the Microsoft Speech API (aka SAPI), but from what I could see it is rather difficult to use. The very few examples that I found for speech recognition (most are for text-to-speech, which is much easier) didn't perform very well (they don't recognize a thing). For example this one: http://msdn.microsoft.com/en-us/library/ms717071%28v=vs.85%29.aspx
Some examples use so-called grammar files that are supposed to define the words the recognizer is waiting for, but since I haven't trained Windows Speech Recognition thoroughly, I think that might be skewing the results.
So my question is... what's the best tool for something like this? Could you provide both paid and free options? The best "free" option (as it comes with Windows) is, I believe, SAPI; all the rest would be paid, but if they are really good it might be worth it. Also, if you have any good tutorials for using SAPI (or another API) in a context similar to this, that would be great.
On the whole this is a big ask!
The issue with any speech recognition system is that it functions best after training. It needs context (what words to expect) and some kind of audio benchmark (what does each voice sound like). This might be possible in some cases, such as a TV series, if you wanted to churn through hours of speech (separated for each character) to train it. There's a lot of work there though. For something like a film there's probably no hope of training a recogniser unless you can get hold of the actors.
Most film and TV production companies just hire media companies to transcribe the subtitles based on either direct transcription using a human operator, or converting the script. The fact that they still need humans in the loop for these huge operations suggests that automated systems just aren't up to it yet.
In video you have a plethora of things that make your life difficult, pretty much spanning huge swathes of current speech technology research:
-> Multiple speakers -> "Speaker Identification" (can you tell characters apart? Also, subtitles normally have different coloured text for different speakers)
-> Multiple simultaneous speakers -> The "cocktail party problem" - can you separate the two voice components and transcribe both?
-> Background noise -> Can you pick the speech out from the soundtrack/foley/exploding helicopters?
The speech algorithm will need to be extremely robust as different characters can have different gender/accents/emotion. From what I understand of the current state of recognition you might be able to get a single speaker after some training, but asking a single program to nail all of them might be tough!
--
There is no "subtitle" format that I'm aware of. I would suggest saving an image of the text using a font like Tiresias Screenfont that's specifically designed for legibility in these circumstances, and use a lookup table to cross-reference images against video timecode (remembering NTSC/PAL/Cinema use different timing formats).
--
There's a bunch of proprietary speech recognition systems out there. If you want the best, you'll probably want to license a solution from one of the big boys like Nuance. If you want to keep things free, universities such as RWTH and CMU have put some solutions together. I have no idea how good they are or how well suited they might be to the problem.
--
The only solution I can think of similar to what you're aiming at is the subtitling you can get on news channels here in the UK "Live Closed Captioning". Since it's live, I assume they use some kind of speech recognition system trained to the reader (although it might not be trained, I'm not sure). It's got better over the past few years, but on the whole it's still pretty poor. The biggest thing it seems to struggle with is speed. Dialogue is normally really fast, so live subtitling has the extra issue of getting everything done in time. Live closed captions quite frequently get left behind and have to miss a lot of content out to catch up.
Whether you have to deal with this depends on whether you'll be subtitling "live" video or if you can pre-process it. To deal with all the additional complications above I assume you'll need to pre-process it.
--
As much as I hate citing the big W there's a goldmine of useful links here!
Good luck :)
This falls into the category of dictation, which is a very large vocabulary task. Products like Dragon NaturallySpeaking are amazingly good, and that has a SAPI interface for developers. But it's not such a simple problem.
Normally a dictation product is meant to be single speaker and the best products adapt automatically to that speaker, thereby improving the underlying acoustic model. They also have sophisticated language modeling which serves to constrain the problem at any given moment by limiting what is known as the perplexity of the vocabulary. That's a fancy way of saying the system is figuring out what you're talking about and therefore what types of words and phrases are likely or not likely to come next.
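For reference, a standard textbook definition of that term (not taken from the original answer): for a word sequence w_1, ..., w_N,

```latex
% Perplexity of a language model: lower values mean the model is less
% "surprised" by the text, i.e. the effective set of next words is smaller.
\mathrm{PP}(W) = P(w_1, \ldots, w_N)^{-1/N}
```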
It would be interesting, though, to apply a really good dictation system to your recordings and see how well it does. My suggestion for a paid system would be to get Dragon NaturallySpeaking from Nuance and get the developer API. I believe that provides a SAPI interface, which has the benefit of allowing you to swap in the Microsoft speech engine or any other ASR engine that supports SAPI. IBM would be another vendor to look at, but I don't think you will do much better than Dragon.
But it won't work well! After all the work of integrating the ASR engine, what you will probably find is that you get a pretty high error rate (maybe half). That would be due to a few major challenges in this task:
1) multiple speakers, which will degrade the acoustic model and adaptation.
2) background music and sound effects.
3) mixed speech - people talking over each other.
4) lack of a good language model for the task.
For 1), if you had a way of separating each actor onto a separate track, that would be ideal. But there's no reliable way of separating speakers automatically that would be good enough for a speech recognizer. If each speaker were at a distinctly different pitch, you could try pitch detection (there's some free software out there for that) and separate based on that, but this is a sophisticated and error-prone task. The best thing would be hand-editing the speakers apart, but you might as well just manually transcribe the speech at that point! If you could get the actors on separate tracks, you would need to run the ASR using different user profiles.
For music (2), you'd either have to hope for the best or try to filter it out. Speech is more band-limited than music, so you could try a band-pass filter that attenuates everything except the voice band. You would want to experiment with the cutoffs, but I would guess 100 Hz to 2-3 kHz would keep the speech intelligible.
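A hedged sketch of such a filter as a single biquad band-pass, using the widely published RBJ Audio EQ Cookbook coefficients; the mono float sample format and parameter choices are assumptions:

```cpp
#include <cmath>
#include <vector>

// In-place band-pass (constant 0 dB peak gain) biquad, RBJ cookbook form.
void bandpass(std::vector<float>& samples, float sampleRate,
              float centreHz, float q)
{
    const float pi    = 3.14159265358979f;
    const float w0    = 2.0f * pi * centreHz / sampleRate;
    const float alpha = std::sin(w0) / (2.0f * q);

    const float b0 =  alpha;
    const float b2 = -alpha;                // b1 is zero for this filter
    const float a0 =  1.0f + alpha;
    const float a1 = -2.0f * std::cos(w0);
    const float a2 =  1.0f - alpha;

    float x1 = 0, x2 = 0, y1 = 0, y2 = 0;   // two samples of filter state
    for (size_t i = 0; i < samples.size(); ++i) {
        const float x = samples[i];
        const float y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0;
        x2 = x1; x1 = x;
        y2 = y1; y1 = y;
        samples[i] = y;
    }
}
```

For the 100 Hz to 2-3 kHz guess above, a centre around the geometric mean (roughly 550 Hz) with a low Q is one starting point; cascading a high-pass and a low-pass would give steeper band edges.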
For (3), there's no solution. The ASR engine should return confidence scores so at best I would say if you can tag low scores, you could then go back and manually transcribe those bits of speech.
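A sketch of that tagging step; the RecognisedSegment struct is hypothetical, since each engine (SAPI, Dragon's SDK) exposes confidence through its own interface:

```cpp
#include <string>
#include <vector>

// Hypothetical per-segment result from an ASR engine.
struct RecognisedSegment {
    std::string text;
    double      startSeconds;
    double      endSeconds;
    float       confidence;   // typically 0.0 - 1.0
};

// Collect the segments whose confidence falls below a threshold,
// so a human can transcribe just those bits of speech by hand.
std::vector<RecognisedSegment> flagForReview(
    const std::vector<RecognisedSegment>& segments, float threshold)
{
    std::vector<RecognisedSegment> lowConfidence;
    for (size_t i = 0; i < segments.size(); ++i)
        if (segments[i].confidence < threshold)
            lowConfidence.push_back(segments[i]);
    return lowConfidence;
}
```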
(4) is a sophisticated task for a speech scientist. Your best bet would be to search for an existing language model made for the topic of the movie. Talk to Nuance or IBM, actually. Maybe they could point you in the right direction.
Hope this helps.

3ds Max integration with C++, Cal3D: where to start?

Okay, I'm making a game using C++ (for the engine) and OpenGL. I've had lots of trouble using the Cal3D library for importing my 3ds Max models into my C++ project.
As a matter of fact, I don't know where to even start. I can't find any decent guide, and their documentation is really poor. I've been searching and trying things for over a month, but so far I don't even understand the file structure it uses.
I really need some help. Are there any other libraries? Any decent guide I can use? I'm stuck.
Thanks a lot.
Rather than write your own exporter, consider using one of the built-in exporters for FBX, COLLADA, Crosswalk (.XSI), the Quake/Doom3 .MD3/.MD4 format, or even OBJ. It'll be much easier to parse the resulting file format on your end than to write and maintain a brand-new exporter.
Max is a complete pain for any kind of scripting or plugins. I'd suggest using Maya instead if at all possible. You'll get better results for animation and rigging, too. I know it's not a direct answer to your question, but part of the problem is that the info for stuff like this is not easy to come by.

Choosing a 3D game engine [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
I want to start developing a desktop game as a hobby. I have found several engines, but I am not sure whether they do the initial job I am looking for.
Initially I want to do the following:
Create a figure (avatar), and let the user dress the avatar
Load the avatar in the game
In later stages, I want to develop this as a multi-player game.
What should I do?
I also recommend Ogre. Ogre can do this: it provides everything needed in terms of mesh and animation support, but not as a drop-in solution. You have to write lots of code to get this done.
For our project we implemented something like what you describe. The main character and any other character can be dressed with different weapons and armor, and the visuals of the character avatar change accordingly.
As a starting hint for how to go about this: in your modeling tool (Blender, Maya, 3ds Max, etc.) you model your avatar and all the clothes you need, and rig them to the same skeleton. Then export everything individually to Ogre's mesh format.
At runtime you can then attach the clothing meshes the user chooses to the skeleton instance so that together they form the avatar. This is not hard to do via the Ogre API (a sketch follows below), but for even easier access you can use the meshmerge tool from the MeshMagick Ogre extension; it has been developed for exactly this purpose.
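A rough sketch of that attach step via the Ogre 1.x API; the mesh and animation names are placeholders, and both meshes are assumed to be exported rigged to an identical skeleton:

```cpp
#include <Ogre.h>

void dressAvatar(Ogre::SceneManager* sceneMgr, Ogre::SceneNode* node)
{
    Ogre::Entity* body  = sceneMgr->createEntity("Body",  "avatar_body.mesh");
    Ogre::Entity* armor = sceneMgr->createEntity("Armor", "avatar_armor.mesh");

    // Both meshes must be rigged to an identical skeleton for this to work.
    armor->shareSkeletonInstanceWith(body);

    node->attachObject(body);
    node->attachObject(armor);

    // Animations played on the body's skeleton now drive the armor too.
    Ogre::AnimationState* walk = body->getAnimationState("Walk");
    walk->setEnabled(true);
}
```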
If you want to change other characteristics like facial features, this is possible too, as Ogre supports vertex pose animations out of the box; you can prepare poses for certain characteristics of the face and let the user change the face with sliders or something like that (e.g. like in Oblivion).
One thing to be aware of regarding Ogre: it is a 3D graphics engine, not a game engine. You can draw stuff to the screen with it and animate and light and in any way change the visuals, but it doesn't do input or physics or sound. For those you have to use other libraries and integrate them. Several pre-bundled game engines based on Ogre are available though.
If you are good with C++ you should use Ogre. It's the best open-source engine, continuously updated by its creators, with a lot of tutorials and a very helpful community.
http://www.ogre3d.org/
It's more of a GFX engine, but it has all the prerequisites you desire.
Good luck!
No engine is likely to do this for you. What they do is generally allow you to load and render 3d models. But combining them, the way you'd need to do to "dress them" is up to you. And creating them, or letting the user do so, is ultimately up to you. The engine might offer a number of tools to make the task easier (for example, rendering the model while the user is designing it), but a game engine is not a magic "make a game" box where you just have to press a button, and your custom game comes out.
A couple of people have said Ogre3D, I'll offer up Irrlicht as an alternative.
You might want to take a look at http://www.crystalspace3d.org/ - I have to admit it was more of an exploratory matter for me, but it seemed like a pretty nice engine, with physics and scripting included. They have a project which shows an avatar walking in a space-station-like building with very smooth camera effects.
OTOH, depending on how far you want to push this, you might find yourself recreating a Second Life(tm)-like environment. If that's a fair assumption, then you might take a look at OpenSimulator and the associated open-source viewer projects and see if they are of interest to you - and work there with the existing team to develop the code further, rather than working on your own.
If you're good with C++, I suggest the C4 Engine. From my experience, existing game engines are either too rigid or nothing more than a collection of libraries.
Ogre is a good way to go if you are just interested in getting something to show. As some have already stated here, Ogre is a rendering engine. There are lots of add-ons and functions to complete common tasks like audio, input and whatnot. This is perfectly fine if you're just aiming to play around or create a prototype.
Should you want to start a long-term project that will be developed over a longer period of time (which is pretty likely, considering you are probably the only developer and games are complex applications), you should really start thinking about what it is that you want to do. Then, based on your goals, look for several engines that can tackle your needs (there's always some API to accomplish XYZ). Then it's up to you how you manage your game and where you use existing libraries - you'd basically tie together your own engine according to your needs.
It gets a bit more difficult if you start looking for a real game engine in terms of an "engine for all my game-dev needs". Check out the nice list of 3D game engines at DevMaster (http://www.devmaster.net/engines/); you'll find lots of alpha-status game engines trying to accomplish this, although you should keep in mind that support and documentation usually isn't first class in those cases.
I personally never used it, but I evaluated the open-source engine Delta3D (delta3d.org) for my project and was impressed by its cool architecture. It encapsulates a whole bunch of other quality open-source frameworks for things like graphics (OpenSceneGraph: openscenegraph.org) and physics (ODE: ode.org). That's probably as close as you'll get to a free and flexible game engine as far as I know. It was developed at an air force university and, due to its academic background, comes with lots of detailed documentation.
If you're good with C++ you can write your own engine :P
Ogre is the best of Ogre, Irrlicht and Crystal Space, and the reasoning behind that is simple: Ogre has actually been used in production pipelines by the game industry. It has a lot of weight behind it, whereas Irrlicht and Crystal Space are more or less applications that don't do a lot out of the box. Crystal Space, however, has a branch project that implements a game engine right inside Blender, allowing the artist to play the role of programmer without leaving the software.
I'm not very big on Irrlicht - there is a lot of sneakiness behind its motives. For an open-source project it branches out into many different derivatives that are either complete game engines or WYSIWYG editors, and they find ways to lock you into paying somehow.
Ogre excels at being a graphics engine rather than a library, and it has to be compiled to individual needs. The tradeoff is that you can implement Ogre into any design workflow or even create a new one. Where it takes the load off your shoulders is in having to write graphics code of any kind, which makes it a very slick rapid-prototyping engine in its basic form.
One engine that you could try is the Torque 3D game engine (www.garagegames.com) which, although not freeware, allows for out-of-the-box usability. While the functionality that you seek in terms of fully customising the character is not instantly available within the engine, if you are willing to create the models yourself it should not be too hard to add them into the game and then utilise the game engine to change the 'skins' of the avatar. One thing that I feel sets this apart from the other engines is the fact that it comes with networking functionality pre-installed (from what you have described in your question, I am guessing that you are attempting to make either an RTS or an MMO, and if so I wish you good luck).
While it may seem strange that the engine is based around a shooting game, there are guides within the Torque forums that show you how to add the code for sword-based combat and other things associated with a fantasy-based game (if that is what you are planning on making).
But anyway... good luck with your project. If you are attempting what I think you are attempting, it is no easy feat. But I'm sure you know what you are doing =)
Hope this helped.
If you are interested in using the Irrlicht 3D engine, you can find a number of tutorials that step you through the process of creating simple 3D applications here.
I also suggest Irrlicht. It's easier to begin with, though it does not have half of Ogre3D's support.