I would like to open a small video file and map every frame into memory (to apply some custom filters). I don't want to handle the video codec; I would rather let the library handle that for me.
I've tried to use DirectShow with the SampleGrabber filter (using this sample http://msdn.microsoft.com/en-us/library/ms787867(VS.85).aspx), but I only managed to grab some frames (not every frame!). I'm quite new to video software programming; maybe I'm not using the best library, or I'm doing it wrong.
I've pasted a part of my code (mainly a modified copy/paste from the MSDN example). Unfortunately it doesn't grab the first 25 frames as expected...
[...]
hr = pGrabber->SetOneShot(TRUE);
hr = pGrabber->SetBufferSamples(TRUE);
pControl->Run(); // Run the graph.
pEvent->WaitForCompletion(INFINITE, &evCode); // Wait till it's done.
// Find the required buffer size.
long cbBuffer = 0;
hr = pGrabber->GetCurrentBuffer(&cbBuffer, NULL);
for (int i = 0; i < 25; ++i)
{
    pControl->Run(); // Run the graph.
    pEvent->WaitForCompletion(INFINITE, &evCode); // Wait till it's done.
    char *pBuffer = new char[cbBuffer];
    hr = pGrabber->GetCurrentBuffer(&cbBuffer, (long*)pBuffer);
    AM_MEDIA_TYPE mt;
    hr = pGrabber->GetConnectedMediaType(&mt);
    VIDEOINFOHEADER *pVih;
    pVih = (VIDEOINFOHEADER*)mt.pbFormat;
    [...]
}
[...]
Is there somebody with video software experience who can advise me about this code, or about another, simpler library?
Thanks
Edit:
The MSDN links seem not to work (see the bug).
Currently these are the most popular video frameworks available on Win32 platforms:
Video for Windows: an old Windows framework dating back to the Win95 era, but still widely used because it is very simple to use. Unfortunately it supports only AVI files for which the proper VFW codec has been installed.
DirectShow: the standard WinXP framework; it can basically load all the formats you can play with Windows Media Player. Rather difficult to use.
FFmpeg: more precisely libavcodec and libavformat, which come with the FFmpeg open-source multimedia utility. It is extremely powerful and can read a lot of formats (almost everything you can play with VLC), even if you don't have the codec installed on the system. It's quite complicated to use, but you can always get inspired by the code of ffplay that ships with it, or by other implementations in open-source software (see the short decoding sketch at the end of this answer). Anyway, I think it's still much easier to use than DS (and much faster). It needs to be compiled with MinGW on Windows, but all the steps are explained very well here (at the moment the link is down, hope it's not dead).
QuickTime: the Apple framework is not the best solution for the Windows platform, since it needs the QuickTime app to be installed, as well as the proper QuickTime codec for every format; it does not support many formats, but it's quite common in the professional field (so some codecs are actually QuickTime-only). It shouldn't be too difficult to implement.
GStreamer: the latest open-source framework. I don't know much about it; I guess it wraps over some of the other systems (but I'm not sure).
All of these frameworks have been implemented as backends in OpenCV's highgui, except for DirectShow. The default framework for Win32 OpenCV is VFW (and thus it can only open some AVI files); if you want to use the others you must download the CVS version instead of the official release and still do some hacking on the code, and it's not too complete anyway; for example, the FFmpeg backend doesn't allow seeking in the stream.
If you want to use QuickTime with OpenCV this can help you.
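Since the libavcodec/libavformat route is the one people find hardest to start with, here is a rough decoding sketch against FFmpeg's C API. It follows the newer codecpar and send/receive packet API, so older releases will differ (they used avcodec_decode_video instead), and error handling is omitted; treat it as an outline rather than a drop-in implementation.
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
}

// Rough sketch: open a file, find the first video stream, decode every frame.
// Assumes a recent FFmpeg (codecpar + send/receive API); error handling omitted.
void decode_all_frames(const char *path)
{
    AVFormatContext *fmt = nullptr;
    avformat_open_input(&fmt, path, nullptr, nullptr);
    avformat_find_stream_info(fmt, nullptr);

    int videoStream = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0);
    const AVCodec *codec = avcodec_find_decoder(fmt->streams[videoStream]->codecpar->codec_id);
    AVCodecContext *ctx = avcodec_alloc_context3(codec);
    avcodec_parameters_to_context(ctx, fmt->streams[videoStream]->codecpar);
    avcodec_open2(ctx, codec, nullptr);

    AVFrame  *frame  = av_frame_alloc();
    AVPacket *packet = av_packet_alloc();
    while (av_read_frame(fmt, packet) >= 0) {
        if (packet->stream_index == videoStream) {
            avcodec_send_packet(ctx, packet);
            while (avcodec_receive_frame(ctx, frame) == 0) {
                // frame->data[] / frame->linesize[] now hold one decoded picture
                // (usually planar YUV); apply the custom filter here.
            }
        }
        av_packet_unref(packet);
    }

    av_packet_free(&packet);
    av_frame_free(&frame);
    avcodec_free_context(&ctx);
    avformat_close_input(&fmt);
}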
I have used OpenCV to load video files and process them. It's also handy for many types of video processing including those useful for computer vision.
Using the "Callback" model of SampleGrabber may give you better results. See the example in Samples\C++\DirectShow\Editing\GrabBitmaps.
There's also a lot of info in Samples\C++\DirectShow\Filters\Grabber2\grabber_text.txt and readme.txt.
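For reference, the callback model boils down to implementing ISampleGrabberCB and registering it on the grabber. A stripped-down sketch (interface from qedit.h in the DirectShow SDK; reference counting is deliberately trivial and error handling is omitted):
#include <dshow.h>
#include <qedit.h>   // ISampleGrabber / ISampleGrabberCB (DirectShow SDK)

// Minimal callback: BufferCB is invoked once per sample with a copy of the bits.
class FrameGrabberCB : public ISampleGrabberCB
{
public:
    // IUnknown (trivial: the object outlives the graph on the stack)
    STDMETHODIMP_(ULONG) AddRef()  { return 2; }
    STDMETHODIMP_(ULONG) Release() { return 1; }
    STDMETHODIMP QueryInterface(REFIID riid, void **ppv)
    {
        if (riid == IID_IUnknown || riid == IID_ISampleGrabberCB) {
            *ppv = static_cast<ISampleGrabberCB*>(this);
            return S_OK;
        }
        return E_NOINTERFACE;
    }

    // Called for every sample when SetCallback(cb, 0) is used.
    STDMETHODIMP SampleCB(double SampleTime, IMediaSample *pSample) { return S_OK; }

    // Called for every sample when SetCallback(cb, 1) is used.
    STDMETHODIMP BufferCB(double SampleTime, BYTE *pBuffer, long BufferLen)
    {
        // pBuffer holds one decoded frame; process or copy it here.
        return S_OK;
    }
};

// Usage (pGrabber is the ISampleGrabber from the graph):
//   FrameGrabberCB cb;
//   pGrabber->SetOneShot(FALSE);        // keep running instead of stopping after one sample
//   pGrabber->SetBufferSamples(FALSE);  // no internal buffering needed with a callback
//   pGrabber->SetCallback(&cb, 1);      // 1 = use BufferCB
//   pControl->Run();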
I know it is very tempting in C++ to get a proper breakdown of the video files and just do it yourself. But although the information is out there, it is such a long-winded process building classes to handle each file format, and to make them easily alterable to take future structure changes into account, that frankly it just is not worth the effort.
Instead I recommend ffmpeg. It got a mention above, which says it is difficult; it isn't. There are a lot more options than most people would need, which makes it look more difficult than it is. For the majority of operations you can just let ffmpeg work it out for itself.
For example, a file conversion:
ffmpeg -i inputFile.mp4 outputFile.avi
Decide right from the start that you will run ffmpeg operations in a thread, or more precisely in a thread library. But wrap it in your own thread class so that you can have your own EventArgs and methods for checking that the thread is finished. Something like:
class ThreadLibManager
{
    List<MyThread> listOfActiveThreads = new List<MyThread>();
    public void AddThread(MyThread thread) { listOfActiveThreads.Add(thread); }
}
Your thread class is something like:
class MyThread
{
public Thread threadForThisInstance { get; set; }
public MyFFMpegTools mpegTools { get; set; }
}
MyFFMpegTools performs many different video operations, so you want your own event args to tell your parent code precisely what type of operation has just raised an event.
class MyFfmpegArgs : EventArgs
{
    public int thisThreadID { get; set; } // set as a new MyThread is added to the List<>
    public MyFfmpegType operationType { get; set; }
    // output paths etc. that the parent handler will need to find output files
}
enum MyFfmpegType
{
FF_CONVERTFILE = 0, FF_CREATETHUMBNAIL, FF_EXTRACTFRAMES ...
}
Here is a small snippet of my ffmpeg tool class; this part collects information about a video.
I put ffmpeg.exe in a particular location, and at the start of the software running it makes sure that it is there. For this version I have moved it to the Desktop; I am fairly sure I have written the path correctly for you (I really hate MS's special folders system, so I ignore it as much as I can).
Anyway, it is an example of running ffmpeg without showing a window.
// outputBuilder is a List<string> (or similar string collection) field declared
// elsewhere in the class; it collects ffmpeg's console output via the handler below.
public string GetVideoInfo(FileInfo fi)
{
    outputBuilder.Clear();
    string strCommand = string.Concat(" -i \"", fi.FullName, "\"");
    string ffPath =
        System.Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + "\\ffmpeg.exe";
    string oStr = "";
    try
    {
        Process build = new Process();
        //build.StartInfo.WorkingDirectory = @"dir";
        build.StartInfo.Arguments = strCommand;
        build.StartInfo.FileName = ffPath;
        build.StartInfo.UseShellExecute = false;
        build.StartInfo.RedirectStandardOutput = true;
        build.StartInfo.RedirectStandardError = true;
        build.StartInfo.CreateNoWindow = true;
        build.ErrorDataReceived += build_ErrorDataReceived;
        build.OutputDataReceived += build_ErrorDataReceived;
        build.EnableRaisingEvents = true;
        build.Start();
        build.BeginOutputReadLine();
        build.BeginErrorReadLine();
        build.WaitForExit();

        // ffmpeg prints a line such as "Duration: 00:00:30.00, start: 0.000000, ...";
        // keep only the part before "start".
        string findThis = "start";
        int offset = 0;
        foreach (string str in outputBuilder)
        {
            if (str.Contains("Duration"))
            {
                offset = str.IndexOf(findThis);
                oStr = str.Substring(0, offset);
            }
        }
    }
    catch
    {
        oStr = "Error collecting file information";
    }
    return oStr;
}
private void build_ErrorDataReceived(object sender, DataReceivedEventArgs e)
{
    string strMessage = e.Data;
    if (outputBuilder != null && strMessage != null)
    {
        outputBuilder.Add(string.Concat(strMessage, "\n"));
    }
}
Try using the OpenCV library. It definitely has the capabilities you require.
This guide has a section about accessing frames from a video file.
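As a quick illustration, the frame loop looks roughly like this with the C++ VideoCapture API (the older 1.x releases expose the same thing through the C functions cvCaptureFromFile/cvQueryFrame; the file name here is just a placeholder):
#include <opencv2/opencv.hpp>

// Sketch: iterate over every frame of a video file with OpenCV.
int main()
{
    cv::VideoCapture cap("input.avi");   // placeholder file name
    if (!cap.isOpened())
        return 1;

    cv::Mat frame;
    while (cap.read(frame)) {
        // frame is a BGR image in memory; apply your filter here, e.g.
        // cv::GaussianBlur(frame, frame, cv::Size(5, 5), 0);
    }
    return 0;
}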
If it's AVI files you're dealing with, I'd read the data from the AVI file myself and extract the frames, then use the Video Compression Manager to decompress them.
The AVI file format is very simple, see: http://msdn.microsoft.com/en-us/library/dd318187(VS.85).aspx (and use google).
Once you have the file open you just extract each frame and pass it to ICDecompress() to decompress it.
It seems like a lot of work but it's the most reliable way.
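To give an idea of the shape of that approach, here is a rough sketch using the VfW AVIFile functions to pull the compressed frames and ICDecompress to decode them; buffer sizing, palettes and error handling are glossed over, so treat it as an outline only:
#include <windows.h>
#include <vfw.h>
#include <vector>
#pragma comment(lib, "vfw32.lib")

// Sketch: decompress every frame of an AVI with the Video Compression Manager.
void DecompressAvi(LPCTSTR path)
{
    AVIFileInit();

    PAVIFILE   file   = NULL;
    PAVISTREAM stream = NULL;
    AVIFileOpen(&file, path, OF_READ, NULL);
    AVIFileGetStream(file, &stream, streamtypeVIDEO, 0);

    // Source format (carries the compression FOURCC).
    BITMAPINFOHEADER bihIn = {0};
    LONG cbFormat = sizeof(bihIn);
    AVIStreamReadFormat(stream, AVIStreamStart(stream), &bihIn, &cbFormat);

    // Target format: plain 24-bit RGB at the same size.
    BITMAPINFOHEADER bihOut = bihIn;
    bihOut.biCompression = BI_RGB;
    bihOut.biBitCount    = 24;
    bihOut.biSizeImage   = ((bihOut.biWidth * 3 + 3) & ~3) * bihOut.biHeight;

    HIC hic = ICOpen(ICTYPE_VIDEO, bihIn.biCompression, ICMODE_DECOMPRESS);
    ICDecompressBegin(hic, &bihIn, &bihOut);

    // Rough buffer sizing; real code should query the actual per-frame sample size.
    std::vector<BYTE> inBuf(bihIn.biSizeImage ? bihIn.biSizeImage : bihOut.biSizeImage);
    std::vector<BYTE> outBuf(bihOut.biSizeImage);

    LONG first = AVIStreamStart(stream);
    LONG count = AVIStreamLength(stream);
    for (LONG i = first; i < first + count; ++i) {
        LONG bytesRead = 0;
        AVIStreamRead(stream, i, 1, inBuf.data(), (LONG)inBuf.size(), &bytesRead, NULL);
        ICDecompress(hic, 0, &bihIn, inBuf.data(), &bihOut, outBuf.data());
        // outBuf now holds one uncompressed RGB frame.
    }

    ICDecompressEnd(hic);
    ICClose(hic);
    AVIStreamRelease(stream);
    AVIFileRelease(file);
    AVIFileExit();
}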
If that's too much work, or if you want more than AVI files then use ffmpeg.
OpenCV is the best solution if all you need is to turn video into a sequence of pictures. If you want to do real video processing (ViDeo as in "Visual Audio"), you need to keep track of the frameworks listed by "martjno". Newer Windows solutions, also for Win7, include three additional possibilities:
Windows Media Foundation: successor of DirectShow; cleaned-up interface.
Windows Media Encoder 9: it does not only include the program, it also ships libraries for encoding.
Expression Encoder 4: successor of the previous one.
The last two are commercial-only solutions, but the first one is free. To code against WMF, you need to install the Windows SDK.
I would recommend FFmpeg or GStreamer. Try to stay away from OpenCV unless you plan to use functionality beyond just streaming video. The library is a beefy build and a pain to install from source when configuring the FFmpeg/GStreamer options.
Is it possible to determine the duration of a media file?
When I say media (video) file I mean files of the following types: .wmv, .avi, .mp4, .flv, .mkv. And when I say duration I mean determining how long, in minutes and seconds, a video file is.
I understand each file is encoded/packed differently, but maybe each file stores its duration in the header? Are there native WinAPI functions that would allow me to read any of these files into memory, or at least inspect the header? I know that the native WinAPI doesn't provide any functions for .png files, so it's a long shot for movie files as well, but you never know.
If the native WinAPI doesn't have any functions able to do this, would you recommend a C++ video API/library, or would you just open the file and search the header for the duration manually (i.e., using fopen())?
If you want to do it using the pure Windows API (like Windows Explorer does), you should do it with the help of propsys.dll.
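Roughly like this with the shell property system (PKEY_Media_Duration reports the duration in 100-nanosecond units; error handling is kept to a minimum and COM must already be initialized):
#include <windows.h>
#include <shobjidl.h>
#include <propsys.h>
#include <propkey.h>
#include <propvarutil.h>
#pragma comment(lib, "shell32.lib")
#pragma comment(lib, "propsys.lib")
#pragma comment(lib, "ole32.lib")

// Sketch: read PKEY_Media_Duration (100 ns units) from a file's property store.
// Assumes COM has already been initialized with CoInitializeEx.
ULONGLONG GetDuration100ns(const wchar_t *path)
{
    ULONGLONG duration = 0;
    IPropertyStore *store = NULL;
    if (SUCCEEDED(SHGetPropertyStoreFromParsingName(path, NULL, GPS_DEFAULT,
                                                    IID_PPV_ARGS(&store))))
    {
        PROPVARIANT var;
        PropVariantInit(&var);
        if (SUCCEEDED(store->GetValue(PKEY_Media_Duration, &var)))
            duration = var.uhVal.QuadPart;   // VT_UI8, 100-nanosecond ticks
        PropVariantClear(&var);
        store->Release();
    }
    return duration;
}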
Also it can be done with DirectShow.
Like this:
REFERENCE_TIME GetMediaDuration(CString filePath)
{
    CComPtr<IGraphBuilder> graphBuilder;
    if (SUCCEEDED(CoCreateInstance(CLSID_FilterGraph, 0, CLSCTX_INPROC,
                                   IID_IGraphBuilder, reinterpret_cast<void**>(&graphBuilder))))
    {
        CComPtr<IBaseFilter> pSource;
        HRESULT hr = graphBuilder->AddSourceFilter(filePath, L"Source", &pSource);
        CComPtr<IPin> pPin;
        pSource->FindPin(L"Output", &pPin);
        if (SUCCEEDED(graphBuilder->Render(pPin)))
        {
            CComPtr<IMediaSeeking> mediaSeeking;
            hr = graphBuilder->QueryInterface(IID_IMediaSeeking, reinterpret_cast<void**>(&mediaSeeking));
            REFERENCE_TIME rtDur = 100;
            if (SUCCEEDED(mediaSeeking->GetDuration(&rtDur)))
                return rtDur;
        }
    }
    return 100;
}
There are many different APIs out there for video. It has been a while since I have looked into it, but I found this link from a Google search for "open source C++ video libraries".
As far as Windows APIs go, they seem to come and go, so I personally wouldn't rely on them. They are also very unlikely to be portable. If you must, you can take a look at something like Direct3D 11. I know a popular option for games is Bink.
Any of these libraries should provide the information you require, as many of the formats do contain it in a header of some kind.
Thanks for taking some time to read my question.
I'm developing a C++ application using Qt and the Windows API.
I'm recording the microphone output in small 10-second audio files in raw format, and I want to convert them to AAC format.
I have tried to read as many things as I could, and thought it would be a great idea to start from the Windows Media Foundation transcode API.
Problem is, I can't seem to use a .raw or .pcm file in the "CreateObjectFromUrl" function, so I'm pretty much stuck here for the moment. It keeps failing; the hr return code equals 3222091460. I have tried to pass an .mp3 file to the function and of course it works, so no URL-human-failure involved.
MF_OBJECT_TYPE ObjectType = MF_OBJECT_INVALID;
IMFSourceResolver* pSourceResolver = NULL;
IUnknown* pUnkSource = NULL;
// Create the source resolver.
hr = MFCreateSourceResolver(&pSourceResolver);
if (FAILED(hr))
{
qDebug() << "Failed !";
}
// Use the source resolver to create the media source.
hr = pSourceResolver->CreateObjectFromURL(
sURL, // URL of the source.
MF_RESOLUTION_MEDIASOURCE, // Create a source object.
NULL, // Optional property store.
&ObjectType, // Receives the created object type.
&pUnkSource // Receives a pointer to the media source.
);
The MFCreateSourceResolver works fine, but CreateObjectFromURL does not succeed :(
So I have two questions for you folks :
Is it possible to encode raw audio files to AAC using Windows Media Foundation?
If yes, what should I read to accomplish what I want?
I want to point out that I can't just use ffmpeg or libav, because I can't afford any license for my software and don't want it to be under the GPL license. But if there are alternatives to Windows Media Foundation for encoding raw audio files to AAC, I would be glad to hear them.
And finally, sorry for my bad English; this is obviously not my native language, and I'm sorry if I made your eyes bleed (and happy if I made you laugh).
Have a nice day
The hr return code equals 3222091460
Those are HRESULT codes. Use this "ShowHresult" tool to have them conveniently decoded for you. The code means 0xC00D36C4 MF_E_UNSUPPORTED_BYTESTREAM_TYPE "The byte stream type of the given URL is unsupported."
The problem is basically that there is no support for these raw files; .WAV is a good container for raw audio, since the file holds both a format descriptor and the payload.
You can obviously read data from the raw audio file yourself and compress it into AAC using Media Foundation's AAC Encoder via its IMFTransform interface. This is reasonably easy, and you get AAC data on the output to, for example, write into a raw .aac file.
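For orientation, here is a rough sketch of the overall shape using the Sink Writer, which drives the AAC encoder MFT for you, instead of calling IMFTransform directly. The sample rate, channel count, bitrate, output extension and function name are assumptions about your recordings, and error handling is omitted:
#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#include <cstring>
#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mfreadwrite.lib")
#pragma comment(lib, "mfuuid.lib")

// Sketch: wrap raw 16-bit PCM into an AAC file (.m4a/.mp4) with the Sink Writer.
// The 44.1 kHz / stereo / 16000 bytes-per-second values are assumptions; error
// handling is omitted and the PCM is fed as a single sample for brevity.
void EncodePcmToAac(const BYTE *pcm, DWORD cbPcm, const wchar_t *outPath)
{
    MFStartup(MF_VERSION);

    IMFSinkWriter *writer = NULL;
    MFCreateSinkWriterFromURL(outPath, NULL, NULL, &writer);

    // Output type: AAC.
    IMFMediaType *outType = NULL;
    MFCreateMediaType(&outType);
    outType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
    outType->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_AAC);
    outType->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, 44100);
    outType->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, 2);
    outType->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, 16);
    outType->SetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, 16000); // ~128 kbit/s
    DWORD stream = 0;
    writer->AddStream(outType, &stream);

    // Input type: the raw PCM that was recorded.
    IMFMediaType *inType = NULL;
    MFCreateMediaType(&inType);
    inType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
    inType->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_PCM);
    inType->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, 44100);
    inType->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, 2);
    inType->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, 16);
    writer->SetInputMediaType(stream, inType, NULL);

    writer->BeginWriting();

    // Real code would feed the data in chunks with proper timestamps/durations.
    IMFMediaBuffer *buffer = NULL;
    MFCreateMemoryBuffer(cbPcm, &buffer);
    BYTE *dst = NULL;
    buffer->Lock(&dst, NULL, NULL);
    memcpy(dst, pcm, cbPcm);
    buffer->Unlock();
    buffer->SetCurrentLength(cbPcm);

    IMFSample *sample = NULL;
    MFCreateSample(&sample);
    sample->AddBuffer(buffer);
    sample->SetSampleTime(0);
    writer->WriteSample(stream, sample);

    writer->Finalize();

    sample->Release();
    buffer->Release();
    inType->Release();
    outType->Release();
    writer->Release();
    MFShutdown();
}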
Alternate options to Media Foundation are DirectShow (there are suitable codecs, though I thought it might not be so easy to get started), libfaac, and FFmpeg's libavcodec (available under LGPL, not GPL).
For the implementation of a Windows-based page-flip application I need to be able to convert a large number of PDF pages into good-quality JPGs, not just thumbnails.
The aim is to achieve the best quality/file size ratio, similar to what Photoshop's "Save for Web" does.
Currently I'm using the Datalogics Adobe PDF Library SDK, which does not seem to be able to fulfill that task. I am thus looking for an alternative commercial C++ or Delphi library which provides good quality / size / speed.
After doing some searching here, I noticed that most posts are about GS & ImageMagick, which I have also tested, but I am not satisfied with the output or the speed.
The target is to import the PDFs at 300 dpi and convert them to JPG at quality 50, 1500 px height, and an output size of 300-500 KB.
If anyone could point out a good library for that task, I would be most grateful.
The Gnostice PDFtoolKit VCL may be a candidate. Convert to JPEG is one of the options.
I always recommend Graphics32 for all your image manipulation needs; you have several resamplers to choose from. However, I don't think it can read PDF files as images. But if you can generate the big image yourself it may be a good choice.
Atalasoft DotImage (with the PDF rasterizer add-on) will do that (I work on PDF technologies there). You'd be working in C# (or another .NET) language:
void ConvertToJpegs(string outFileStem, Stream pdf)
{
    JpegEncoder encoder = new JpegEncoder();
    encoder.Quality = 50;
    int page = 1;
    PdfImageSource source = new PdfImageSource(pdf);
    source.Resolution = 300; // sets the rendering resolution to 300 dpi
    // larger numbers mean better resolution in the image, but will cost in
    // terms of output file size - as resolution increases, memory used increases
    // as a function of the square of the resolution, whereas compression only
    // saves maybe a flat 30% of the total image size, depending on the Quality
    // setting on the encoder.
    while (source.HasMoreImages()) {
        AtalaImage image = source.AcquireNext();
        // this image will be in either 8 bit gray or 24 bit rgb depending
        // on the page contents.
        try {
            string path = String.Format("{0}{1}.jpg", outFileStem, page++);
            // if you need to resample the image, this is the place to do it
            image.Save(path, encoder, null);
        }
        finally {
            source.Release(image);
        }
    }
}
There is also Quick PDF Library
Have a look at DynaPDF. I know it's pretty expensive, but you can try the starter pack.
P.S.: before buying a product, please make sure it meets your needs.
I'm looking for a general compression library that supports random access during decompression. I want to compress wikipedia into a single compressed format and at the same time I want to decompress/extract individual articles from it.
Of course, I can compress each article individually, but then this won't give much of a compression ratio. I've heard that an LZO-compressed file consists of many chunks which can be decompressed separately, but I haven't found the API + documentation for that. I can also use the Z_FULL_FLUSH mode in zlib, but is there any other better alternative?
xz-format files support an index, though by default the index is not useful. My compressor, pixz, creates files that do contain a useful index. You can use the functions in the liblzma library to find which block of xz data corresponds to which location in the uncompressed data.
For seekable compression built on gzip, there is dictzip from the dict server and sgzip from The Sleuth Kit.
Note that you can't write to either of these, and seekable access is for reading anyway.
DotNetZip is a zip archive library for .NET.
Using DotNetZip, you can reference particular entries in the zip randomly, and can decompress them out of order, and can return a stream that decompresses as it extracts an entry.
With the benefit of those features, DotNetZip has been used within the implementation of a Virtual Path Provider for ASP.NET that does exactly what you describe: it serves all the content for a particular website from a compressed ZIP file. You can also serve websites with dynamic (ASP.NET) pages.
ASP.NET ZIP Virtual Path Provider, based on DotNetZip
The important code looks like this:
namespace Ionic.Zip.Web.VirtualPathProvider
{
    public class ZipFileVirtualPathProvider : System.Web.Hosting.VirtualPathProvider
    {
        ZipFile _zipFile;

        public ZipFileVirtualPathProvider (string zipFilename) : base () {
            _zipFile = ZipFile.Read(zipFilename);
        }

        ~ZipFileVirtualPathProvider () { _zipFile.Dispose (); }

        public override bool FileExists (string virtualPath)
        {
            string zipPath = Util.ConvertVirtualPathToZipPath (virtualPath, true);
            ZipEntry zipEntry = _zipFile[zipPath];
            if (zipEntry == null)
                return false;
            return !zipEntry.IsDirectory;
        }

        public override bool DirectoryExists (string virtualDir)
        {
            string zipPath = Util.ConvertVirtualPathToZipPath (virtualDir, false);
            ZipEntry zipEntry = _zipFile[zipPath];
            if (zipEntry == null)
                return false;
            return zipEntry.IsDirectory;
        }

        public override VirtualFile GetFile (string virtualPath)
        {
            return new ZipVirtualFile (virtualPath, _zipFile);
        }

        public override VirtualDirectory GetDirectory (string virtualDir)
        {
            return new ZipVirtualDirectory (virtualDir, _zipFile);
        }

        public override string GetFileHash(string virtualPath, System.Collections.IEnumerable virtualPathDependencies)
        {
            return null;
        }

        public override System.Web.Caching.CacheDependency GetCacheDependency(String virtualPath, System.Collections.IEnumerable virtualPathDependencies, DateTime utcStart)
        {
            return null;
        }
    }
}
And VirtualFile is defined like this:
namespace Ionic.Zip.Web.VirtualPathProvider
{
    class ZipVirtualFile : VirtualFile
    {
        ZipFile _zipFile;

        public ZipVirtualFile (String virtualPath, ZipFile zipFile) : base(virtualPath) {
            _zipFile = zipFile;
        }

        public override System.IO.Stream Open ()
        {
            ZipEntry entry = _zipFile[Util.ConvertVirtualPathToZipPath(base.VirtualPath, true)];
            return entry.OpenReader();
        }
    }
}
bgzf is the format used in genomics.
http://biopython.org/DIST/docs/api/Bio.bgzf-module.html
It is part of the samtools C library and is really just a simple hack around gzip. You can probably rewrite it yourself if you don't want to use the samtools C implementation or the Picard Java implementation. Biopython implements a Python variant.
You haven't specified your OS. Would it be possible to store your file in a compressed directory managed by the OS? Then you would have the "seekable" portion as well as the compression. The CPU overhead will be handled for you with unpredictable access times.
I'm using MS Windows Vista, unfortunately, and I can send File Explorer into zip files as if they were normal files. Presumably it still works on 7 (which I'd like to be on). I think I've done that with the corresponding utility on Ubuntu too, but I'm not sure. I could also test it on Mac OS X, I suppose.
If individual articles are too short to get a decent compression ratio, the next-simplest approach is to tar up a batch of Wikipedia articles -- say, 12 articles at a time, or however many articles it takes to fill up a megabyte.
Then compress each batch independently.
In principle, that gives better compression than compressing each article individually, but worse compression than solid compression of all the articles together.
Extracting article #12 from a compressed batch requires decompressing the entire batch (and then throwing the first 11 articles away), but that's still much, much faster than decompressing half of Wikipedia.
Many compression programs break up the input stream into a sequence of "blocks", and compress each block from scratch, independently of the other blocks.
You might as well pick a batch size about the size of a block -- larger batches won't get any better compression ratio, and will take longer to decompress.
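As a concrete illustration of the batch idea, here is a hedged sketch that compresses fixed-size batches independently with zlib and records an offset index so any batch can later be located and decompressed on its own; the batch size and file layout are arbitrary choices, not something the format requires:
#include <zlib.h>
#include <cstdio>
#include <cstdint>
#include <string>
#include <vector>

// One entry per batch: where its compressed bytes start and how long they are.
struct BatchIndexEntry {
    uint64_t offset;
    uint64_t compressedSize;
    uint64_t uncompressedSize;
};

// Compress `articles` in batches of `batchSize` articles, appending each
// compressed batch to `out` and returning the index needed for random access.
std::vector<BatchIndexEntry> CompressInBatches(const std::vector<std::string> &articles,
                                               size_t batchSize, std::FILE *out)
{
    std::vector<BatchIndexEntry> index;
    uint64_t offset = 0;

    for (size_t i = 0; i < articles.size(); i += batchSize) {
        // Concatenate one batch of articles into a single buffer.
        std::string batch;
        for (size_t j = i; j < i + batchSize && j < articles.size(); ++j)
            batch += articles[j];

        // Compress the batch from scratch, independently of all other batches.
        uLongf destLen = compressBound((uLong)batch.size());
        std::vector<Bytef> dest(destLen);
        compress2(dest.data(), &destLen,
                  reinterpret_cast<const Bytef*>(batch.data()), (uLong)batch.size(),
                  Z_BEST_COMPRESSION);

        std::fwrite(dest.data(), 1, destLen, out);
        index.push_back({offset, destLen, batch.size()});
        offset += destLen;
    }
    return index;   // store this index alongside the file to seek to any batch
}

// To read article N later: look up batch N / batchSize in the index, fseek to
// its offset, read compressedSize bytes, and call uncompress() on just that batch.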
I have experimented with several ways to make it easier to start decoding a compressed database in the middle.
Alas, so far the "clever" techniques I've applied still have worse compression ratio and take more operations to produce a decoded section than the much simpler "batch" approach.
For more sophisticated techniques, you might look at
MG4J: Managing Gigabytes for Java
"Managing Gigabytes: Compressing and Indexing Documents and Images" by Ian H. Witten, Alistair Moffat, and Timothy C. Bell
Can I convert a bitmap to PNG in memory (i.e. without writing to a file) using only the Platform SDK? (i.e. no libpng, etc.).
I also want to be able to define a transparent color (not alpha channel) for this image.
The GDI+ solution seems to be limited to images whose width is divisible by 4. Anything else fails during the call to Save(). Does anyone know the reason for this limitation and how/whether I can work around it?
Update: Bounty
I'm starting a bounty (I really want this to work). I implemented the GDI+ solution, but as I said, it's limited to images whose width is a multiple of four. The bounty will go to anyone who can solve this width issue (without changing the image dimensions), or can offer an alternative non-GDI+ solution that works.
LodePNG (GitHub) is a lib-less PNG encoder/decoder.
I read and write PNGs using libpng and it seems to deal with everything I throw at it (I've used it in unit tests with things like 257x255 images and they cause no trouble). I believe the API is flexible enough not to be tied to file I/O (or at least you can override its default behaviour, e.g. see png_set_write_fn in the section on customization).
In practice I always use it via the much cleaner boost::gil PNG IO extension, but unfortunately that takes char* filenames, and if you dig into the png_writer and file_mgr classes in its implementation they seem pretty tied to FILE* (although if you were on Linux, a version using fmemopen and in-memory buffers could probably be cooked up quite easily).
This page shows code that converts a bitmap to PNG by writing it to a file: http://dotnet-snippets.de/dns/gdi-speichern-eines-png-SID814.aspx. Instead of writing to a file, the Save method of Bitmap also supports writing to an IStream (http://msdn.microsoft.com/en-us/library/ms535406%28VS.85%29.aspx). You can create a stream backed by memory using the CreateStreamOnHGlobal API function (http://msdn.microsoft.com/en-us/library/aa378980%28VS.85%29.aspx). The library used, GDI+, is included in Windows from Windows XP on and works on versions from Windows 98 on. I've never done anything with it myself, just googled around, but it looks like you can use that.
The CImage class (ATL/MFC) supports saving into PNG format. Like the GDI+ solution, it also supports saving to a stream. Here's some code I use to save it to a CByteArray:
CByteArray baPicture;
IStream *pStream = NULL;
if (CreateStreamOnHGlobal(NULL, TRUE, &pStream) == S_OK)
{
if (image.Save(pStream, Gdiplus::ImageFormatPNG) == S_OK)
{
ULARGE_INTEGER ulnSize;
LARGE_INTEGER lnOffset;
lnOffset.QuadPart = 0;
if (pStream->Seek(lnOffset, STREAM_SEEK_END, &ulnSize) == S_OK)
{
if (pStream->Seek(lnOffset, STREAM_SEEK_SET, NULL) == S_OK)
{
baPicture.SetSize(ulnSize.QuadPart);
ULONG ulBytesRead;
pStream->Read(baPicture.GetData(), ulnSize.QuadPart, &ulBytesRead);
}
}
}
}
pStream->Release();
I don't know if you'd want to use ATL or MFC, though.
I've used GDI+ for saving a bitmap as a PNG to a file. You should probably check out the MSDN info about GDI+ here and in particular this function GdipSaveImageToStream.
This tutorial here will probably provide some help as well.
GDI (old school, non-plus) has a GetDIBits function that can be asked to output bits using PNG compression (BITMAPINFOHEADER::biCompression == BI_PNG). I wonder if this could be used to create a PNG file? Using GetDIBits to write standard bitmap files is complicated enough, so I suspect this would be even more difficult.
If you want to use only Windows APIs, WIC is the way to accomplish this, and it supports both bitmaps and PNGs.
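A rough sketch of the WIC route, encoding raw pixels to PNG into a memory-backed IStream; the 24bpp BGR pixel format and the stride are assumptions about your source bitmap, and error handling is omitted:
#include <windows.h>
#include <wincodec.h>
#pragma comment(lib, "windowscodecs.lib")
#pragma comment(lib, "ole32.lib")

// Sketch: encode a 24bpp BGR pixel buffer as PNG into a memory-backed IStream.
// Assumes CoInitialize has already been called.
IStream* EncodePngToMemory(BYTE *pixels, UINT width, UINT height, UINT stride)
{
    IStream *stream = NULL;
    CreateStreamOnHGlobal(NULL, TRUE, &stream);

    IWICImagingFactory *factory = NULL;
    CoCreateInstance(CLSID_WICImagingFactory, NULL, CLSCTX_INPROC_SERVER,
                     IID_PPV_ARGS(&factory));

    IWICBitmapEncoder *encoder = NULL;
    factory->CreateEncoder(GUID_ContainerFormatPng, NULL, &encoder);
    encoder->Initialize(stream, WICBitmapEncoderNoCache);

    IWICBitmapFrameEncode *frame = NULL;
    IPropertyBag2 *props = NULL;
    encoder->CreateNewFrame(&frame, &props);
    frame->Initialize(props);
    frame->SetSize(width, height);

    WICPixelFormatGUID format = GUID_WICPixelFormat24bppBGR;
    frame->SetPixelFormat(&format);

    frame->WritePixels(height, stride, stride * height, pixels);
    frame->Commit();
    encoder->Commit();

    frame->Release();
    if (props) props->Release();
    encoder->Release();
    factory->Release();
    return stream;   // the PNG bytes now live in the HGLOBAL behind this stream
}
Because you pass the stride explicitly, WIC has no trouble with widths that are not multiples of four; you just have to compute the stride to match however your source rows are actually laid out.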
It would probably be better to use a library instead of reinventing the wheel yourself.
Look into FreeImage.