I am using Windows Media Foundation in C++ to play audio and video files.
My application is pretty much based on the Media Foundation guide - http://msdn.microsoft.com/en-us/library/ms697062%28v=VS.85%29.aspx.
My problem is that when I play a media file, the audio is rendered only from the left speaker.
Some more info:
The problem happens for both audio and video files.
My topology is a classic Input-Node -> Transform-Node -> Output-Node.
The audio stream looks okay at the output of the Output-Node (it's a float32 stream, and it has no interleaved zeros for the right channel).
The Transform-Node in the topology is for a future equalizer, but currently it does nothing. Even if I remove it from the topology, the problem still occurs.
I suppose the problem might be caused by some misconfiguration of Media Foundation, but I haven't found anything out of the ordinary with respect to the Media Foundation guide.
Any idea what might be the problem?
I would be happy to share relevant code samples or give any other relevant info about my implementation.
Thanks.
It sounds like either your source node is providing a single-channel data stream, or the input media type for the output node is single-channel. If it's the latter, the media session is injecting a transform that downmixes the input stream to a single channel to conform with the media type.
I would check the media types of both nodes and see if this is the issue.
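For example, something like this (a minimal sketch, assuming you can get the IMFMediaTypeHandler for each stream, e.g. from the source node's stream descriptor; error handling omitted):

    #include <mfapi.h>
    #include <mfidl.h>

    // Returns the channel count on a stream's current media type;
    // 1 here would explain audio coming out of only one speaker.
    UINT32 GetChannelCount(IMFMediaTypeHandler* pHandler)
    {
        IMFMediaType* pType = nullptr;
        UINT32 channels = 0;
        if (SUCCEEDED(pHandler->GetCurrentMediaType(&pType)))
        {
            channels = MFGetAttributeUINT32(pType, MF_MT_AUDIO_NUM_CHANNELS, 0);
            pType->Release();
        }
        return channels;
    }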
I've found the problem.
It was a misuse of the waveOutSetVolume() function that muted my right speaker (I used it with the value 0xFFFF instead of 0xFFFFFFFF).
Somehow I missed it in the multiple code reviews I did while debugging this issue :(
So not related to Media Foundation at all.
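For anyone else who hits this: waveOutSetVolume() packs both channels into its DWORD argument, with the left channel in the low-order word and the right channel in the high-order word, so 0xFFFF is full left / muted right:

    #include <windows.h>
    #include <mmsystem.h>  // link with winmm.lib

    // hWaveOut stands in for whatever HWAVEOUT handle your app uses.
    waveOutSetVolume(hWaveOut, 0xFFFF);     // the bug: left = full, right = muted
    waveOutSetVolume(hWaveOut, 0xFFFFFFFF); // the fix: both channels at full volume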
So I'm trying to build a basic tool that outputs video/audio to Twitch. I'm new to this (A/V) side of programming, so I'm not even sure what to look for. I'm trying to use mainly Windows infrastructure, with third-party libraries where that's not available.
What are the steps for getting raw bitmap and wave data into a codec, then into an RTSP client, and finally showing up on Twitch? I'm not looking for code; I'm looking for concepts I can search for, as I'm not absolutely sure what to search for. I'd rather not go through the OBS source code to figure it out, and will use that only as a last resort.
So I capture the monitor via Output Duplication, the system sound as one wave, and the microphone as another wave. I'm trying to push this to Twitch. I know there's Media Foundation on Windows, but I don't know how far toward streaming it can get, as I assume there's no network code integrated into it? There's also the libav* collection from FFmpeg.
What are the basic steps for sending bitmap/wave data to Twitch via any of the above libraries, or even others, as long as they work on Windows? Please don't add code; I just need a not-very-long conceptual explanation and I'll take it from there. Try to also cover how bitrate and framerate get regulated (do I have to do it, or does the codec do it)?
Assume absolute noob level in this area (concept-wise not code-wise).
I'm sure this question has been asked before, but I've searched and can't find anything specific enough to help with a solution.
I'll start by outlining the initial concerns, and if more in-depth technical information is needed I can give it. Hopefully there is enough information for the initial question(s).
I'm writing an app using C++ and DirectShow in Visual Studio 2010. The main project specification is for a live preview and, at any chosen time, recording the video as MPEG-2 to the hard drive and then to DVD, to be played in a standard DVD player; the live preview must not be interrupted at any time.
The capturing seems a pretty standard, straightforward thing to do with DirectShow.
There are a couple of custom filters that I wrote. Nothing amazing, but we wanted our own custom on-screen overlay information (time and date, etc.), and this must appear in both the preview and the recorded file. I use the AVI Decompressor connected to the capture card's video out pin, and connect the AVI Decompressor to my filter to give me an RGB image that I can manipulate. The output from this filter is then split via an Infinite Pin Tee filter: one branch goes to the screen, the other goes into the MS MPEG-2 encoder. The audio goes from the capture card's audio out into the same MPEG-2 encoder. Output from the MPEG-2 encoder then goes to a file. That file then gets authored for DVD and burnt to DVD.
So my questions are...
Where and how would be the best place to allow starting and stopping of only the MPEG-2 file output, via user action?
I have tried using Smart Tee filters (one for video and one for audio) as the last filters before the MPEG-2 encoder, then using the IAMStreamControl interface to turn off the pins at the appropriate time (roughly the pattern in the sketch below). Should this cause any timing issues with the final MPEG-2? The output file plays via MPlayer, VLC, etc., but doesn't convert to DVD-compliant MPEG-2 (for testing, via any DVD-authoring software; it complains of a broken file and sometimes gives time references). Is it possible that the timestamps in the file are a problem and giving an error? If the file is captured from the first moment capture commences (as opposed to, say, after 5 minutes of streaming), then everything is OK.
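For reference, what I'm doing looks roughly like this (a sketch; pCapturePin is a placeholder for the smart tee's capture output pin, and error handling is omitted):

    #include <dshow.h>

    IAMStreamControl* pStreamCtrl = NULL;
    if (SUCCEEDED(pCapturePin->QueryInterface(IID_IAMStreamControl,
                                              (void**)&pStreamCtrl)))
    {
        pStreamCtrl->StopAt(NULL, FALSE, 0);  // NULL = stop delivering samples now
        // ... later, when the user starts recording:
        pStreamCtrl->StartAt(NULL, 0);        // NULL = start delivering samples now
        pStreamCtrl->Release();
    }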
I did think of using the Stream Buffer Engine route - http://msdn.microsoft.com/en-gb/library/windows/desktop/dd693041(v=vs.85).aspx - but I'm not sure of the best direction to take things. It seems there are possibly a few choices for the best direction.
Any help and tips would be greatly appreciated, especially websites/books/information on DirectShow filters, pins, graphs, and how they all flow together.
EDIT: I was thinking of making my own version of the Smart Tee filter, in which I would have two pins coming in (audio and video) and four pins out: two video (one preview and one capture) and two of the same for audio. But would I end up with the same issue? And what is the correct way to handle 'switching off' the capture pins of that custom filter? Would I be wasting my time working on something like this? Is it a simple case of overriding the Active/Inactive methods of the output pin(s) and either sending or not sending the sample? I feel it's not that easy.
Many thanks!
Where and how would be the best place to allow starting and stopping of only the MPEG-2 file output, via user action?
For this kind of action I would recommend GMFBridge. Creating your own filter is not easy. GMFBridge allows you to use two separate graphs with a dynamic connection: use the first graph for the preview and the second graph for the file output, and only connect the graphs after a user action.
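A rough sketch of the pattern (interface and method names as I recall them from the GMFBridge sample headers; treat the exact signatures as assumptions and check GMFBridge.h):

    #include "GMFBridge.h"  // from the GMFBridge distribution

    // One muxed video stream through the bridge; pController obtained via
    // CoCreateInstance of the GMFBridgeController class (error handling omitted).
    pController->AddStream(TRUE, eMuxInputs, TRUE);

    // Terminate the always-running preview graph in a bridge sink...
    IUnknown* pSinkFilter = NULL;
    pController->InsertSinkFilter(pPreviewGraph, &pSinkFilter);

    // ...and start the on-demand file-writing graph from a bridge source.
    IUnknown* pSourceFilter = NULL;
    pController->InsertSourceFilter(pSinkFilter, pFileGraph, &pSourceFilter);

    // On user action: connect to start writing the file, disconnect to stop.
    pController->BridgeGraphs(pSinkFilter, pSourceFilter);  // start
    pController->BridgeGraphs(NULL, NULL);                  // stop

The preview graph keeps running the whole time; only the file graph is connected and disconnected, so the preview is never interrupted.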
In my C++ application I have video image frames coming from a web camera.
I wish to send those image frames to an HTML5 video tag element for live video playback from the camera. How can I do this?
For a starting point, you are going to want to look into WebM and H.264/MPEG-4 AVC. Both of these technologies are used for HTML5 media streams. It used to be that Firefox only supported WebM, while Safari and Chrome both supported H.264. I am not sure about their current states, but you will probably have to implement both.
Your C++ will then have to implement a web server that can stream these formats on the fly, which may require significant work. If you choose this route, this Microsoft document may be of some use. Also, the WebM page has developer documentation. It is possible that H.264 must be licensed for a cost; WebM allows royalty-free usage.
If I am not mistaken, neither of these formats has to be completely downloaded in order to work, so you would just have to encode and flush the current frame you have over and over again.
Then, as far as the video tag in HTML5 goes, you just have to provide it the URLs your C++ server will respond to. Here is some documentation on that. You may want to see if there is some service to mirror these streams, though, so as not to overload your application.
An easier way to stream your webcam could be to simply use FFmpeg.
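For example, something along these lines (a hedged sketch of a single FFmpeg invocation; the DirectShow device name and segment settings are placeholders to adapt):

    ffmpeg -f dshow -i video="My Webcam" -c:v libx264 -preset veryfast -f hls -hls_time 2 stream.m3u8

That produces HLS segments a plain web server can host; note that some browsers need a helper library such as hls.js to play HLS in a video tag.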
Another useful document can be found at:
http://www.cecs.uci.edu/~papers/aspdac06/pdf/p736_7D-1.pdf
I am no expert, but I hope that at least helps you get started.
I'm looking for a video library for Qt 4 (C++/Windows) that has:
1) Basic video playback functionality
It should play all the common video formats, such as DVD VOB and MP4/MKV/AVI (H.264, Xvid, DivX). It should also be able to deinterlace the video automatically and display it at the display aspect ratio (DAR).
2) Cropping
It should have some basic functionality to remove black bars (with user-supplied arguments).
3) Snapshots
It should have functionality to take snapshots in memory.
4) Frame-by-frame seeking
It should have some basic functionality to do frame-by-frame seeking, e.g. prevFrame(), nextFrame(), jumpTo(frame) and getNumFrames().
I have tried the following; from what I could find, here is the functionality they support, numbered per the requirements above:
Qt Phonon:
1) Yes. Plays all the needed formats and displays them correctly.
2) No.
3) No. Not implemented (returns empty image).
4) No.
QtFFmpegWrapper:
1) Partial. Does not deinterlace DVD VOBs. Does not display DVD VOBs in DAR.
2) No.
3) Yes.
4) Partial. Broken for MKV (H.264).
Qt VLC:
1) Yes. Plays all the needed formats and displays them correctly.
2) Yes. Have not tried whether it works, though.
3) Partial. Only to disk. Edit: QPixmap::grabWindow(player->videoWidget()->winId()) works.
4) No. Only by milliseconds.
Now I'm looking at QVision, which seems to have all of those features except cropping. Maybe implementing cropping isn't that difficult. But I'm wondering whether there are any other libraries I should look into, or whether I've missed something and these features are possible with one of the libraries above. Thanks.
You could consider Movie Player Gold SDK ActiveX 3.6 from ViscomSoft. I don't see cropping mentioned on their site, but memory snapshots and frame-by-frame stepping are among the supported features.
I used their VideoEdit and Screen2Video SDKs in Windows Qt software; they worked quite well.
The Kinect OpenNI library uses a custom video file format to store videos that contain RGB+D information. These videos have the extension *.oni. I am unable to find any information or documentation whatsoever on the ONI video format.
I'm looking for a way to convert a conventional RGB video to a *.oni video. The depth channel can be left blank (i.e., zeroed out). For example purposes, I have an MPEG-4 encoded .mov file with audio and video channels.
There are no restrictions on how this conversion must be made; I just need to convert it somehow! ImageMagick, FFmpeg, and MEncoder are all OK, as is custom conversion code in C/C++, etc.
So far, all I can find is one C++ conversion utility in the OpenNI sources. From the looks of it, though, this converts from one *.oni file to another. I've also managed to find a C++ program by a PhD student that converts images from an academic database into a *.oni file. Unfortunately the code is in Spanish, which is not one of my native languages.
Any help or pointers much appreciated!
EDIT: As my use case is a little odd, some explanation may be in order. The OpenNI drivers (in my case I'm using the excellent Kinect for Matlab library) allow you to specify a *.oni file when creating the Kinect context. This lets you emulate having a real Kinect attached that is receiving video data, which is useful when you're testing/developing code (you don't need to have the Kinect attached to do this). In my particular case, we will be using a Kinect in the production environment (process control in a factory), but during development all I have is a video file :) Hence wanting to convert it to a *.oni file. We aren't using the depth channel at the moment, hence not caring about it.
I don't have a complete answer for you, but take a look at the NiRecordRaw and NiRecordSynthetic examples in OpenNI/Samples. They demonstrate how to create an ONI with arbitrary or modified data. See how MockDepthGenerator is used in NiRecordSynthetic; in your case you will need MockImageGenerator.
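A minimal sketch of that approach (OpenNI 1.x C++ API, with signatures as I recall them from the NiRecordSynthetic sample; adapt to your version, and note the output mode and codec here are assumptions):

    #include <XnCppWrapper.h>

    // Feed decoded RGB frames into a mock image generator and record
    // them to an .oni file; the depth channel is simply never added.
    xn::Context context;
    context.Init();

    xn::MockImageGenerator mockImage;
    mockImage.Create(context);
    XnMapOutputMode mode = { 640, 480, 30 };  // XRes, YRes, FPS (assumed)
    mockImage.SetMapOutputMode(mode);

    xn::Recorder recorder;
    recorder.Create(context);
    recorder.SetDestination(XN_RECORD_MEDIUM_FILE, "output.oni");
    recorder.AddNodeToRecording(mockImage, XN_CODEC_JPEG);

    // For each decoded video frame (all four arguments are placeholders:
    // frame id, timestamp, byte count, pointer to RGB24 pixels):
    mockImage.SetData(nFrameID, nTimestamp, nDataSize, pRgbData);
    recorder.Record();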
For more details you may want to ask in the openni-dev google group.
Did you look into this command and its associated documentation?
NiConvertXToONI --
NiConvertXToONI opens any recording, takes every node within it, and records it to a new ONI recording. It receives both the input file and the output file from the command line.
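Per that description, the invocation is just the input and output paths, roughly (run it without arguments for the exact syntax):

    NiConvertXToONI <input recording> <output.oni>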