Grabbing frames with a VideoInfoHeader2 structure - C++

I'm working on an application that makes analysis on video files.
Being no expert on DirectShow I used simple code for the analysis
of all frames (SampleGrabber, Callback etc.).
This works fine for all media files, even when decoded with
VideoInfoHeader2 structure (although it shouldn't, as stated everywhere).
The problem is with grabbing a single frame.
For this I used IMediaDet, and it doesn't work if there's only a VideoInfoHeader2 and no VideoInfoHeader.
I tried modifications of my analysis code (OneShot, Seek), but that doesn't work either.
All the sources on the internet concerning this are not very helpful, as they point to SDK/DX examples that aren't accessible anymore, or they just say that the modification would be "easy".
Well, maybe for a DX expert ...
(But I need to use the car, not first to build it ... ;-)
As the matter became more and more important to me, my "workaround" is to re-encode all videos that come with VideoInfoHeader2, save them with VideoInfoHeader, and do the analysis/grabbing on those.
Very resource-consuming, and the opposite of smart ...
Any help appreciated.

You outlined the necessary steps yourself, and they are still the easiest solution (provided that you stay with the Windows API; using a third-party library might be easier in comparison, but that is beyond the scope of this question).
Sample Grabber and IMediaDet are parts of the deprecated DirectShow Editing Services, whose development was stopped long ago. If you are not satisfied with the stock API, you have to use a more flexible replacement. For example, you can grab the source of the similar Sample Grabber sample from an older DirectX or Platform SDK and extend it to support VIDEOINFOHEADER2.
IMediaDet is nothing but a COM class that builds its own graph internally and tries to decode the video. It is inflexible, and building your own graph is almost always the more reliable solution.
Microsoft's answer to this problem - since they abandoned DirectShow development - is the newer Media Foundation API. However, there are reasons why this "answer" is not so good: limited OS compatibility, limited support for codecs and formats, and a completely new API which has little in common with DirectShow, so you would need to redesign your application.
All together, you either have to build a Sample Grabber replacement using one of the popular and documented methods (even if they don't look very helpful), or switch to another API or a third-party library. Or, another possible solution is to use a different filter/codec which is capable of decoding into a VIDEOINFOHEADER-formatted media type.
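For illustration, here is a minimal sketch of the kind of change an extended Sample Grabber callback needs: check the connected media type's formattype and read the bitmap header from whichever structure is present. The helper function is hypothetical; only the DirectShow types and GUIDs are real.

    #include <dshow.h>
    #include <dvdmedia.h>   // VIDEOINFOHEADER2 is declared here

    // Hypothetical helper: extract frame dimensions from a media type that may
    // carry either VIDEOINFOHEADER or VIDEOINFOHEADER2.
    bool GetFrameSize(const AM_MEDIA_TYPE& mt, LONG& width, LONG& height)
    {
        if (mt.formattype == FORMAT_VideoInfo && mt.cbFormat >= sizeof(VIDEOINFOHEADER))
        {
            const VIDEOINFOHEADER* vih = reinterpret_cast<const VIDEOINFOHEADER*>(mt.pbFormat);
            width  = vih->bmiHeader.biWidth;
            height = vih->bmiHeader.biHeight;
            return true;
        }
        if (mt.formattype == FORMAT_VideoInfo2 && mt.cbFormat >= sizeof(VIDEOINFOHEADER2))
        {
            const VIDEOINFOHEADER2* vih2 = reinterpret_cast<const VIDEOINFOHEADER2*>(mt.pbFormat);
            width  = vih2->bmiHeader.biWidth;
            height = vih2->bmiHeader.biHeight;
            return true;
        }
        return false;   // some other format block (e.g. MPEG2VIDEOINFO)
    }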

Related

Simple C++ Sound API

My commercial embedded C++ Linux project requires playing wav files and tones at individual volume levels concurrently. A few examples of the sounds:
• “Click” sounds each time user presses screen played at a user-specified volume
• Warning sounds played at max-volume
• Warning tones requested by other applications at app-specified volume level (0-100%)
• Future support for MP3 player and/or video playback (with sound) at user-specified volume. All other sounds should continue while song/video is playing.
We're using Qt as our UI framework, which has QtMultimedia and Phonon support. However, I've heard the former has spotty sound support on Linux, and the latter is an older version that may be deprecated in an upcoming Qt release.
I've done some research and here are a few APIs I've come across:
KDE Phonon
SFML
PortAudio
SDL_Mixer
OpenAL Soft
FMOD (though I'd prefer to avoid license fees)
ALSA (perhaps a bit too low-level...)
Other considerations:
• Cross-platform isn't required but preferred.
• We'd like to limit dependencies as much as possible.
• There is no need for advanced features like 3D audio or special effects in the foreseeable future.
• My team doesn't have much audio experience, so ease of use is important.
Are any of these overkill for my application? Which seems like the best fit?
Update:
It turns out we were already dependent on SDL for other reasons, so we decided on SDL_Mixer. For other embedded applications, however, I'd take a long look at the PortAudio/libsndfile combo as well, due to their minimal dependencies.
libao is simple, cross-platform, Xiphy goodness.
There's documentation too!
Usage is outlined here - simple usage goes like this:
Initialize (ao_initialize())
Call ao_open_live() or ao_open_file()
Play sound using ao_play()
Close device/file using ao_close() and then ao_shutdown() to clean up.
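Put together, a minimal sketch of that flow might look like this (16-bit stereo PCM assumed, the buffer is just a second of silence, and error handling is omitted):

    #include <ao/ao.h>
    #include <cstring>
    #include <vector>

    int main()
    {
        ao_initialize();                                   // 1. initialize the library

        ao_sample_format fmt;
        std::memset(&fmt, 0, sizeof(fmt));
        fmt.bits        = 16;
        fmt.channels    = 2;
        fmt.rate        = 44100;
        fmt.byte_format = AO_FMT_LITTLE;

        ao_device* dev = ao_open_live(ao_default_driver_id(), &fmt, nullptr);  // 2. open a live device

        std::vector<char> pcm(44100 * 2 * 2, 0);           // one second of silence
        ao_play(dev, pcm.data(), pcm.size());              // 3. play

        ao_close(dev);                                     // 4. close the device...
        ao_shutdown();                                     //    ...and clean up
        return 0;
    }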
Go for PortAudio. For just plain audio without unneeded overhead such as complex streaming pipelines, or 3D, it is the best lib out there. In addition you have really nice cross-platform support. It is used by several professional audio programs and has really high quality.
I have used SDL_Mixer time and time again - lovely library. It should serve your needs well; the license is flexible and it's heavily documented. I have also experimented with SFML: while more modern and fairly well documented, I find it a bit bulky and cumbersome to work with, even though both libraries are very similar. In my opinion SDL_Mixer is the best.
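As a rough sketch of how SDL_Mixer handles concurrent sounds at individual volumes (file names and volume values are made up, error handling omitted):

    #include "SDL.h"
    #include "SDL_mixer.h"

    int main(int, char**)
    {
        SDL_Init(SDL_INIT_AUDIO);
        Mix_OpenAudio(44100, MIX_DEFAULT_FORMAT, 2, 1024); // 44.1 kHz, stereo
        Mix_AllocateChannels(8);                           // up to 8 simultaneous sounds

        Mix_Chunk* click   = Mix_LoadWAV("click.wav");     // hypothetical files
        Mix_Chunk* warning = Mix_LoadWAV("warning.wav");

        Mix_VolumeChunk(click,   MIX_MAX_VOLUME / 4);      // user-specified volume
        Mix_VolumeChunk(warning, MIX_MAX_VOLUME);          // warnings at full volume

        Mix_PlayChannel(-1, click,   0);                   // -1 = any free channel
        Mix_PlayChannel(-1, warning, 0);                   // both play concurrently

        SDL_Delay(2000);                                   // let them finish

        Mix_FreeChunk(click);
        Mix_FreeChunk(warning);
        Mix_CloseAudio();
        SDL_Quit();
        return 0;
    }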
However, you might also want to check out one I found a few weeks ago, http://www.mpg123.de/ - I haven't delved too much into it, but it is very lightweight and, again, the license is flexible.
There is a sound library called STK that would meet most of your requirements:
https://ccrma.stanford.edu/software/stk/faq.html
Don't forget about:
FFmpeg: is a complete, cross-platform solution to record, convert and stream audio and video.
GStreamer: is a library for constructing graphs of media-handling components. The applications it supports range from simple Ogg/Vorbis playback, audio/video streaming to complex audio (mixing) and video (non-linear editing) processing.

Sound output through M-Audio ProFire 610

I got an assignment at work to create a system that will be able to direct sound to different output channels of our sound card. We are using an M-Audio ProFire 610, which has 8-channel output and connects through FireWire. We are also using a Mac Mini as our host server, and I'll be working in Xcode.
This is the diagram of what I am building:
diagram http://img121.imageshack.us/img121/7865/diagramy.png
At first I thought that Java would be enough for this project; however, I later discovered that Java is not able to push sound to anything other than the default output channels of the sound card, so I decided to switch to C++. The problem is that I am a web developer and I don't have any experience in this language whatsoever - that is why I am looking for help from more experienced developers.
I found a Core Audio Primer for iOS 4, but I'm not sure how much of it I can use for my project. I find it a bit confusing, too.
What steps should I take to complete this assignment? What frameworks should I use? Any code examples? I am looking for any help, hints, tips - well anything that will help me complete this project.
If you're just looking for audio pass-through, you might want to look at something that's already been built, like Jack which creates a software audio device that looks and works just like a real one (you can set it as default output for your app) and then allows you to route each channel anywhere you want (including to other applications).
If you want/need to make your own, definitely go with C++, for which there are many many tutorials (I learned from cplusplus.com). CoreAudio is the low-level C/C++ interface as Justin mentioned, but it's really hard to learn and use. A much simpler API is provided by PortAudio, for which I've worked a bit on the Mac implementation. Look at the tutorials there, make something similar for default input and output, and then to do the channel mapping use PaMacCore_SetupChannelMap, which is described here. You'll need to call it twice, once for the input stream and once for the output stream. Join the mailing list for PortAudio if you need more advice! Good luck!
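To give an idea of what that looks like in practice, here is a hedged sketch of routing a stereo stream to two specific outputs of an 8-channel device. The device choice, map values and callback body are illustrative assumptions; the exact channel-map semantics are documented in pa_mac_core.h.

    #include "portaudio.h"
    #include "pa_mac_core.h"

    // Callback that would normally fill the output with your samples; here it
    // just writes silence.
    static int playCallback(const void*, void* output, unsigned long frames,
                            const PaStreamCallbackTimeInfo*, PaStreamCallbackFlags, void*)
    {
        float* out = static_cast<float*>(output);
        for (unsigned long i = 0; i < frames * 2; ++i)
            out[i] = 0.0f;
        return paContinue;
    }

    int main()
    {
        Pa_Initialize();

        // Send stream channel 0 to device output 3 and stream channel 1 to
        // device output 4 (0-based indices 2 and 3); -1 marks unused outputs.
        SInt32 channelMap[8] = { -1, -1, 0, 1, -1, -1, -1, -1 };

        PaMacCoreStreamInfo macInfo;
        PaMacCore_SetupStreamInfo(&macInfo, paMacCorePlayNice);
        PaMacCore_SetupChannelMap(&macInfo, channelMap, 8);

        PaStreamParameters outParams = {};
        outParams.device = Pa_GetDefaultOutputDevice();     // or the ProFire's device index
        outParams.channelCount = 2;
        outParams.sampleFormat = paFloat32;
        outParams.suggestedLatency = Pa_GetDeviceInfo(outParams.device)->defaultLowOutputLatency;
        outParams.hostApiSpecificStreamInfo = &macInfo;      // attach the channel map

        PaStream* stream = nullptr;
        Pa_OpenStream(&stream, nullptr, &outParams, 44100, 256, paNoFlag, playCallback, nullptr);
        Pa_StartStream(stream);
        Pa_Sleep(2000);
        Pa_StopStream(stream);
        Pa_CloseStream(stream);
        Pa_Terminate();
        return 0;
    }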
The primary APIs are in CoreAudio/AudioHardware.h.
Most of the samples/supporting code provided by Apple is in C++; however, the APIs themselves are plain C (I don't know if that helps you or not).
You'll want to access the Hardware Abstraction Layer (aka HAL); there are more details in this doc:
http://developer.apple.com/documentation/MusicAudio/Conceptual/CoreAudioOverview/CoreAudioOverview.pdf
For (a rather significant amount of) additional samples/usage, see $DEVELOPER_DIR/Extras/CoreAudio/
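For orientation, here is a hedged sketch of enumerating the HAL's devices with those AudioHardware.h calls - for example, to find the ProFire 610 by name before opening it (error handling omitted):

    #include <CoreAudio/CoreAudio.h>
    #include <CoreFoundation/CoreFoundation.h>
    #include <cstdio>
    #include <vector>

    int main()
    {
        // Ask the system object for the list of device IDs.
        AudioObjectPropertyAddress addr = {
            kAudioHardwarePropertyDevices,
            kAudioObjectPropertyScopeGlobal,
            kAudioObjectPropertyElementMaster
        };
        UInt32 size = 0;
        AudioObjectGetPropertyDataSize(kAudioObjectSystemObject, &addr, 0, nullptr, &size);

        std::vector<AudioDeviceID> devices(size / sizeof(AudioDeviceID));
        AudioObjectGetPropertyData(kAudioObjectSystemObject, &addr, 0, nullptr, &size, devices.data());

        // Print each device's name; the ProFire 610 should show up here.
        for (AudioDeviceID dev : devices) {
            AudioObjectPropertyAddress nameAddr = {
                kAudioDevicePropertyDeviceNameCFString,
                kAudioObjectPropertyScopeGlobal,
                kAudioObjectPropertyElementMaster
            };
            CFStringRef name = nullptr;
            UInt32 nameSize = sizeof(name);
            AudioObjectGetPropertyData(dev, &nameAddr, 0, nullptr, &nameSize, &name);

            char buf[256];
            CFStringGetCString(name, buf, sizeof(buf), kCFStringEncodingUTF8);
            std::printf("Device %u: %s\n", (unsigned)dev, buf);
            CFRelease(name);
        }
        return 0;
    }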

Is COLLADA a dead format?

I've been reading lots of musings on the net that COLLADA is a dead file format - that applications are not updating their support for it, etc. Is this true? It was originally designed to be a format that could be almost application-independent, so my question is in two parts: Is it a dead format? And if so, what is the currently accepted format for maximising inter-application development (and for use with OpenGL applications)?
Most applications are supporting COLLADA, and new support is announced all the time.
Follow COLLADA on Twitter to get daily updates... it is far from dead.
Hard to keep track of, in fact.
BTW, Khronos just released the COLLADA reference card, free at http://www.khronos.org/files/collada_reference_card_1_4.pdf, making it easier to implement.
The missing piece was still the conformance test, to make sure applications correctly follow the specification, and it has been released recently.
In short, expect improved support, more applications, and better interoperability. One thing for sure is that COLLADA is a published standard (as opposed to a proprietary technology), so it is there to stay and safe to invest in, as it is not impacted by mergers, bankruptcies, changes in company policies...
...
Also, we are in the process of rebuilding collada.org. There is an incomplete list of products, and a forum for your questions.
COLLADA was supposed to be an intermediate format used while producing content. That is why there are more plugins and libraries for modeling packages than there are for 3D engines and libraries.
A custom OpenGL graphics engine will tend to use its own model format so it can implement new features that are not in COLLADA.
There's a lot of middleware support for COLLADA, so I don't think you can call it dead. However, it hasn't become The One Format To Rule Them All, as some were hoping. Basically, it's the best common middle ground for 3D-asset exchange between different software packages, but it's not a very good fit for in-engine usage.
I am not really optimistic about the future of the COLLADA format. Nothing has happened since the publication of specification 1.5. There is no tool which supports the complete COLLADA functionality.
OpenCOLLADA is a nice library which helps keep the format alive for now, but it is not enough. The format itself should improve over time, too. I have tried to work with the physics part of COLLADA, and the only tool I could find to create a reasonable example was Maya - and not without extra effort to install plugins, etc. Most of the COLLADA importers do not support 1.5, and those that do support the version do not support some of the elements. The COLLADA model repository server has been down for ages, so it is difficult to find good examples with enough variety to work with.
The AutomationML format uses COLLADA as its graphics format, which brings the possibility of spreading into industrial usage. But there are already some strong competitors in this field, like JT.
An extra problem: COLLADA embeds MathML in one of its elements, but while MathML improves as a format, COLLADA stays put, so newer versions of MathML cannot be used.
I liked the idea of COLLADA, but it is far away from its goals because of the lack of support and applications.

Video encoding services

I need to be able to allow users to upload a wide variety of video files in various formats then clean them up and make them kosher for delivery to a dedicated content handler.
I've tried ffmpeg on-site, but it has some serious flaws with regard to H.264.
Then I tried flixcloud.com, which has a very good interface and API and was looking like the perfect solution, except it doesn't report the video frame rate correctly.
Moving on, I tried Ankoder.com, and it does work, but unfortunately its API is somewhat of a mess and has some quirks that are proving difficult to code around.
What other services are out there? I will only accept an answer from someone who has used a video transcoding service.
Update:
Just started looking at http://www.encoding.com/ - seems interesting.
I have worked with a number of transcoding services in the past and have found flaws with just about all of them. For the last 3 projects I have been involved with that required media encoding, I have used Expression Encoder, with great results. The application itself is a pleasure to use and makes it simple to achieve the results needed, and the SDK is one of the best out there. Microsoft have definitely done a great job.

Open Source sound engine

When I started using SoundEngine (from CrashLanding and TouchFighter), I had read that a few people recommended against using it because, according to them, it was not stable enough. Still, it was the only solution I knew of to play sounds with pitch and position control without learning C++ and OpenAL, so I ignored the warnings and went on with it.
But now I'm starting to worry. The 2.2 SDK introduced AVFoundation. Using both SoundEngine from CrashLanding (for sounds) and AVAudioPlayer (for music), I found out that SoundEngine behaves strangely when the only existing AVAudioPlayer is released (all sounds stop until a new AVAudioPlayer is initialized). Around the same time as the 2.2 SDK came out, the CrashLanding sample code was mysteriously removed from the ADC site. I'm worried there are more bad surprises to come.
My question is, is anyone aware of an Open Source alternative to SoundEngine? Maybe even a C++ library that uses OpenAL?
Look at this library, though I don't know if it is what you need.
The Kowalski project provides a data driven and portable sound engine that currently runs on iOS, OS X and Windows. The engine is released under the zlib license and provides positional audio, pitch control etc.
ObjectAL for iPhone
Clone it. Use it. Love it. Enjoy the freedom.
Why not just use AVFoundation? It's pretty simple to handle and nicely flexible - apart from cases where you need exact timing (says the Apple documentation, but I've been testing it fairly extensively and have yet to find any significant practical issues) - so I don't see any reason not to leverage it.
AVFoundation lacks sound placement. This makes me sad.
I've written a simple sound engine around OpenAL. There are no position controls (I didn't need them), but they would be trivial to add if you find the rest to your liking. There is also some experimental sound code in the Cocos2D engine; it has both pitch and position controls and looks quite usable.
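For reference, here is a rough sketch of what pitch and position control look like with plain OpenAL (a C API, callable from C++ or Objective-C); the buffer contents and parameter values are placeholders:

    #include <OpenAL/al.h>
    #include <OpenAL/alc.h>   // iPhone/Mac headers; use <AL/al.h> and <AL/alc.h> elsewhere
    #include <vector>

    int main()
    {
        // Open the default device and make a context current.
        ALCdevice*  device  = alcOpenDevice(nullptr);
        ALCcontext* context = alcCreateContext(device, nullptr);
        alcMakeContextCurrent(context);

        // Upload some 16-bit mono PCM (mono is required for 3D positioning);
        // here it is just one second of silence.
        std::vector<short> pcm(44100, 0);
        ALuint buffer = 0, source = 0;
        alGenBuffers(1, &buffer);
        alBufferData(buffer, AL_FORMAT_MONO16, pcm.data(),
                     (ALsizei)(pcm.size() * sizeof(short)), 44100);

        alGenSources(1, &source);
        alSourcei(source, AL_BUFFER, (ALint)buffer);
        alSourcef(source, AL_PITCH, 1.5f);                    // pitch control
        alSource3f(source, AL_POSITION, 2.0f, 0.0f, -1.0f);   // position control
        alSourcePlay(source);

        // ... run your loop, then tear down:
        alDeleteSources(1, &source);
        alDeleteBuffers(1, &buffer);
        alcMakeContextCurrent(nullptr);
        alcDestroyContext(context);
        alcCloseDevice(device);
        return 0;
    }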