Check for valid image - c++

I'm writing a program that downloads information from the web, and part of that is images.
At the moment I'm having a problem because the code that downloads the images is separate from the code that displays them (under MVC). If a 404 is issued or the image download fails in some way, the display code pops up a message prompt, which I would like to avoid.
Is there an easy way to check whether an image is valid? I'm only concerned about JPG, GIF and PNG.
Note: I don't care about reading the image data, just about checking that the file is in a valid image format.

Do you want to check whether the download would be successful? Or do you want to check that what is downloaded is, in fact, an image?
In the former case, the only way to check is to try to access it and see what kind of HTTP response code you get. You can send an HTTP HEAD request to get the response code without actually downloading the image, but if you're just going to go ahead and download the image anyway (if it's successful) then sending a separate HEAD request seems like a waste of time (and bandwidth).
Alternatively, if you really want to check that what you're downloading is a valid image file, you have to read the whole file to check it for corruption. But if you just want to check that the file extension is accurate, it should be enough to check the first few bytes of the file. All GIF images start with the ASCII text GIF87a or GIF89a, depending on which GIF specification is used. PNG images start with the byte 0x89 followed by the ASCII text PNG (the full signature is the eight bytes 89 50 4E 47 0D 0A 1A 0A). JPEG images start with the bytes FF D8 FF; the JPEGs I looked at begin FF D8 FF E0, which a hex dump of little-endian 16-bit words shows as d8ff e0ff. (You should do some research and check that; try Wikipedia for links.) Keep in mind, though, that to get even the first few bytes of the image, you will need to send an HTTP request, which could return a 404 (and in that case you don't have any image to check).

Thanks for the answers, guys. I have already downloaded the file, so I went with just checking the magic number, as the front end I use (wxWidgets) already has image libraries and I wanted something very light.
uint8 UTIL_isValidImage(const unsigned char h[5])
{
    // "GIF8" (covers both GIF87a and GIF89a)
    if (h[0] == 71 && h[1] == 73 && h[2] == 70 && h[3] == 56)
        return IMAGE_GIF;

    // 0x89 "PNG"
    if (h[0] == 137 && h[1] == 80 && h[2] == 78 && h[3] == 71)
        return IMAGE_PNG;

    // 0xFF 0xD8 (JPEG start-of-image marker)
    if (h[0] == 255 && h[1] == 216)
        return IMAGE_JPG;

    return IMAGE_VOID;
}
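For anyone who wants a slightly stricter version of the check above, here is a minimal sketch that compares the full signatures (GIF87a/GIF89a, the eight-byte PNG signature, and FF D8 FF for JPEG) and can read them straight from a file. The names ImageKind, sniffImage and sniffImageFile are made up for illustration and are not part of wxWidgets or any other library. Like the original, it only inspects the header, so it cannot tell whether the rest of the file decodes.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical names, for illustration only.
enum class ImageKind { Unknown, Gif, Png, Jpeg };

// Classify a buffer by its leading signature bytes.
ImageKind sniffImage(const unsigned char* data, std::size_t size)
{
    static const unsigned char kPng[8] = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    if (size >= 6 && (std::memcmp(data, "GIF87a", 6) == 0 ||
                      std::memcmp(data, "GIF89a", 6) == 0))
        return ImageKind::Gif;
    if (size >= 8 && std::memcmp(data, kPng, 8) == 0)
        return ImageKind::Png;
    if (size >= 3 && data[0] == 0xFF && data[1] == 0xD8 && data[2] == 0xFF)
        return ImageKind::Jpeg;
    return ImageKind::Unknown;
}

// Read at most the first 8 bytes of a file and classify them.
// A missing or unreadable file is reported as Unknown.
ImageKind sniffImageFile(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return ImageKind::Unknown;
    unsigned char header[8] = {};
    in.read(reinterpret_cast<char*>(header), sizeof header);
    return sniffImage(header, static_cast<std::size_t>(in.gcount()));
}
```

A 404 page saved to disk as HTML would come back as ImageKind::Unknown, so the display code can skip it without popping up a prompt.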

If you really want to know if an image file is valid, you actually have to decode it (although you don't need to store the bits). This is because the file might be the wrong size or might be corrupted.
If you're using an HTTP library to do the downloads, you should be able to examine the header and know that you're getting a 404 error and not a real payload. Look at the documentation for the library you're using.
If you're getting back a file and you want to see if it's probably an image without fully-decoding it, then you'll need to check at least the headers for validity. libpng and libjpeg offer pretty low-level access to png and jpeg files, respectively. You could also look at higher-level libraries like ImageMagick, Microsoft's MFC, or whatever library is most appropriate for your platform.

When you GET a resource through HTTP, you must use the Content-Type header to determine how to process the content. If you've already downloaded it to a local file, the information that a real web browser relies upon is already lost. In many cases, the URL will match the Content-Type (e.g. http://example.com/image.png is served up as Content-Type: image/png). However, you cannot rely on this.
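If you keep the Content-Type value around at download time, a check along these lines may be enough. This is only a sketch: isWantedImageType is a hypothetical helper, and how you obtain the header value depends on the HTTP library you use.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true if a Content-Type header value names one of the
// three image types the question cares about (JPEG, GIF, PNG).
bool isWantedImageType(std::string value)
{
    // Drop parameters such as "; charset=binary".
    const std::string::size_type semi = value.find(';');
    if (semi != std::string::npos)
        value.erase(semi);

    // Trim surrounding whitespace.
    const char* ws = " \t\r\n";
    const std::string::size_type first = value.find_first_not_of(ws);
    if (first == std::string::npos)
        return false;
    value = value.substr(first, value.find_last_not_of(ws) - first + 1);

    // Media types are case-insensitive, so compare in lower case.
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    return value == "image/jpeg" || value == "image/gif" || value == "image/png";
}
```

A server can still send the wrong Content-Type, so this complements a look at the bytes rather than replacing it.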

Related

Get raw buffer for in-memory dataset in GDAL C++ API

I have generated a GeoTiff dataset in-memory using GDALTranslate() with a /vsimem/ filepath. I need access to the buffer for the actual GeoTiff file to put it in a stream for an external API. My understanding is that this should be possible with VSIGetMemFileBuffer(), however I can't seem to get this to return anything other than nullptr.
My code is essentially as follows:
//^^ GDALDataset* srcDataset created somewhere up here ^^
//psOptions struct has "-b 4" and "-of GTiff" settings.
const char* filep = "/vsimem/foo.tif";
GDALDataset* gtiffData = GDALTranslate(filep, srcDataset, psOptions, nullptr);
vsi_l_offset size = 0;
GByte* buf = VSIGetMemFileBuffer(filep, &size, true); //<-- returns nullptr
gtiffData seems to be a real dataset on inspection, it has all the appropriate properties (number of bands, raster size, etc). When I provide a real filesystem location to GDALTranslate() rather than the /vsimem/ path and load it up in QGIS it renders correctly too.
Looking at the source for VSIGetMemFileBuffer(), this should really only be returning nullptr if the file can't be found. This suggests I'm using it incorrectly. Does anyone know what the correct usage is?
Bonus points: Is there a better way to do this (stream the file out)?
Thanks!
I don't know anything about the C++ API. But in Python, the snippet below is what I sometimes use to get the contents of an in-memory file. In my case these are mainly VRTs, but it shouldn't be any different for other formats.
But as said, I don't know whether the VSI API translates one-to-one to C++.
from osgeo import gdal
filep = "/vsimem/foo.tif"
# get the file size
stat = gdal.VSIStatL(filep, gdal.VSI_STAT_SIZE_FLAG)
# open file
vsifile = gdal.VSIFOpenL(filep, 'r')
# read entire contents
vsimem_content = gdal.VSIFReadL(1, stat.size, vsifile)
In the case of a VRT the content would be text, shown with something like print(vsimem_content.decode()). For a tiff it would of course be binary data.
I came back to this after putting in a workaround, and upon swapping things back over it seems to work fine. @mmomtchev suggested looking at the CPL_DEBUG output, which showed nothing unusual (and was silent during the actual VSIGetMemFileBuffer call).
In particular, for other reasons I had to put a GDALWarp call in between calling GDALTranslate and accessing the buffer, and it seems that this is what makes the difference. My guess is that GDALWarp is calling VSIFOpenL internally - although I can't find this in the source - and this does some kind of initialisation for VSIGetMemFileBuffer. Something to try for anyone else who encounters this.

C++ WinINet InternetReadFile function refresh

I am trying to get the content of a file using WinINet in C++. The file is an XML file and is generated by an executable on a server.
The code for init, connect and even read a file on the specified server address is working.
// Connect to internet.
m_hInternet = InternetOpen(L"HTTPRIP",INTERNET_OPEN_TYPE_PRECONFIG,NULL,NULL,0);
// Check if worked.
if( !m_hInternet )
return;
// Connect to selected URL.
m_hUrl = InternetOpenUrlA(m_hInternet, strUrl.c_str(), NULL, 0, INTERNET_FLAG_PRAGMA_NOCACHE | INTERNET_FLAG_RESYNCHRONIZE, 0);
// Check if worked.
if( !m_hUrl )
return;
if( InternetReadFile(m_hUrl, buf, BUFFER_SIZE, &bytesread) && bytesread != 0 )
{
// Put into std::string.
strData = std::string(buf,buf+bytesread);
}
Now I want to update the file (same address). The server updates the file at 50 Hz, and I want my code to read the file only if it has been updated by the server. Can InternetReadFile do that kind of thing? Maybe with a flag, but I didn't find anything on MSDN.
Thanks for your help.
There is no way in the HTTP protocol for you to do that directly, hence there is no such function in WinINet. If the file is relatively small, the easiest solution might be to download it and see whether it has changed. If the file is large, let the server that writes the file also write a timestamp, checksum or counter file next to it.
Then your code would download the checksum file, see if it's changed, and in that case download the original file.
Or another solution would be to put a timestamp or similar data in the beginning of the XML file, and stop downloading the file if the timestamp (or checksum) is not updated. (This comes with its own drawbacks of course, you may have to write your own parser.)
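The "small marker file" idea can be sketched as follows. ChangeTracker is a hypothetical helper, not part of WinINet; it only remembers the last marker text (timestamp, checksum or counter) and says whether a new one differs. Fetching the marker itself would use the same InternetOpenUrl/InternetReadFile calls as in the question.

```cpp
#include <string>

// Hypothetical helper: remembers the last marker (timestamp, checksum or
// counter text) published by the server and reports when it changes.
class ChangeTracker
{
public:
    // Returns true when `marker` differs from the previously seen one,
    // or when no marker has been seen yet.
    bool update(const std::string& marker)
    {
        if (hasMarker_ && marker == last_)
            return false;
        last_ = marker;
        hasMarker_ = true;
        return true;
    }

private:
    std::string last_;
    bool hasMarker_ = false;
};
```

The polling loop would download the marker, call update(), and fetch the full XML file only when it returns true.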
If the HTTP server has a page with info (e.g. a timestamp) about this file (it doesn't matter that the file is generated; the page may be generated too), you can examine that page.
Since you know that the server updates the file at a (nearly) constant rate, your app can just use a timer.
P.S. I doubt there's really any sense in reading a file 50 times every second.

iOS file size during write using only C/C++ APIs

Purpose: I am monitoring file writes in a particular directory on iOS using BSD kernel queues, and poll for file sizes to determine write ends (when the size stops changing). The basic idea is to refresh a folder only after any number of file copies coming from iTunes sync. I have a completely working Objective-C implementation for this but I have my reasons for needing to implement the same thing in C++ only.
Problem: The one thing stopping me is that I can't find a C or C++ API that will get the correct file size during a write. Presumably, one must exist because Objective-C's [NSFileManager attributesOfItemAtPath:] seems to work and we all know it is just calling a C API underneath.
Failed Solutions:
I have tried using stat() and lstat() to get st_size, and even st_blocks for the allocated block count. They return correct sizes for most files in a directory, but when a file write is happening, that file's size never changes between poll intervals, and every subsequent file iterated in that directory has a bad size.
I have tried using fseek and ftell but they are also resulting in a very similar issue.
I have also tried modified date instead of size using stat() and st_mtimespec, and the date doesn't appear to change during a write - not that I expected it to.
Going back to NSFileManager's ability to give me the right values, does anyone have an idea what C API call that [NSFileManager attributesOfItemAtPath:] is actually using underneath?
Thanks in advance.
Update:
It appears that this has less to do with in-progress write operations and more with specific files. After closer inspection there are some files which always return a size, and other files that never return a size when using the C API (but will work fine with the Objective-C API). Even creating a copy of the "good" files the C API does not want to give a size for the copy but works fine with the original "good" file. I have both failures and successes with text (xml) files and binary (zip) files. I am using iTunes to add these files to the iPad's app's Documents directory. It is an iPad Mini Retina.
Update 2 - Answer:
Probably any of the above file size methods will work, if your path isn't invisibly trashed, like mine was. See accepted answer on why the path was trashed.
Well this weird behavior turned out to be a problem with the paths, which result in strings that will print normally, but are likely trashed in memory enough that file descriptors sometimes didn't like it (thus only occurring in certain file paths). I was using the dirent API to iterate over the files in a directory and concatenating the dir path and file name erroneously.
Bad Path Concatenation: Obviously (or apparently not-so-obvious at runtime) str-copying over three times is not going to end well.
char* fullPath = (char*)malloc(strlen(dir) + strlen(file) + 2);
strcpy(fullPath, dir);
strcpy(fullPath, "/");
strcpy(fullPath, file);
long sizeBytes = getSize(fullPath);
free(fullPath);
Correct Path Concatenation: Use proper str-concatenation.
char* fullPath = (char*)malloc(strlen(dir) + strlen(file) + 2);
strcpy(fullPath, dir);
strcat(fullPath, "/");
strcat(fullPath, file);
long sizeBytes = getSize(fullPath);
free(fullPath);
Long story short, it was sloppy work on my part, via two typos.
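If C++ is allowed, letting std::string do the concatenation avoids this class of bug altogether. A minimal sketch, assuming POSIX stat() is available (as it is on iOS); joinPath and fileSize are hypothetical helper names:

```cpp
#include <string>
#include <sys/stat.h>

// Hypothetical helper: join a directory and a file name with exactly one '/'.
std::string joinPath(const std::string& dir, const std::string& file)
{
    if (dir.empty())
        return file;
    if (dir.back() == '/')
        return dir + file;
    return dir + "/" + file;
}

// Hypothetical helper: size of a file in bytes via stat(), or -1 on failure.
long long fileSize(const std::string& path)
{
    struct stat st;
    if (stat(path.c_str(), &st) != 0)
        return -1;
    return static_cast<long long>(st.st_size);
}
```

With the dirent loop from the answer, the call would be something like fileSize(joinPath(dir, entry->d_name)), with no malloc/free to get wrong.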

Streaming MP3 from Internet with FMOD

I thought this would be a relatively simple task with something like FMOD, but I can't get it to work. Even the example code netstream doesn't seem to do the trick. No matter what mp3 I try to play with the netstream example program, I get this error:
FMOD error! (20) Couldn't perform seek operation. This is a limitation of the medium (ie netstreams) or the file format.
I don't really understand what this means. Isn't this exactly what the netstream example program was for? to stream some file from the internet?
I can't get past the createSound method:
result = system->createSound(argv[1], FMOD_HARDWARE | FMOD_2D | FMOD_CREATESTREAM | FMOD_NONBLOCKING, 0, &sound);
EDIT:
This is what I modified after reading Mathew's answer
FMOD_CREATESOUNDEXINFO soundExInfo;
memset(&soundExInfo, 0, sizeof(FMOD_CREATESOUNDEXINFO));
soundExInfo.cbsize = sizeof(FMOD_CREATESOUNDEXINFO);
soundExInfo.suggestedsoundtype = FMOD_SOUND_TYPE_MPEG;
result = system->createSound(argv[1], FMOD_HARDWARE | FMOD_2D | FMOD_CREATESTREAM | FMOD_NONBLOCKING | FMOD_IGNORETAGS, &soundExInfo, &sound);
I get two different errors depending on which files I use.
Test 1 URL: http://kylegobel.com/test.mp3
Test 1 Error: (25) Unsupported file or audio format.
Test 2 URL: http://kylegobel.com/bullet.mp3
Test 2 Error: (20) Couldn't perform seek operation. This is a limitation of the medium (ie netstreams) or the file format.
Before I made the change, I could use netstream to play "C:\test.mp3" which is the same file named test.mp3 on the web, but that no longer works with the above changes. Maybe these files are just in the wrong formats or something? Sorry for my lack of knowledge in this area, I really don't know much about sound, but trying to figure it out.
Thanks,
Kyle
It's possible the MP3 has a large amount of tags at the start, so FMOD reads them then tries to seek back to the start (which it can't do because it's a net stream). Can you try using FMOD_IGNORETAGS and perhaps FMOD_CREATESOUNDEXINFO with suggestedsoundtype set to FMOD_SOUND_TYPE_MPEG?
If that doesn't work, could you post the URL of an MP3 stream that is known not to work?
EDIT:
The file in question has around 60KB of tag data, FMOD is happy to read over that stuff but for the MPEG codec to work it needs to do some small seeks. Since you cannot seek a netstream all the seeks must be contained inside the low level file buffer. If you tweak the file buffer size, make it a bit larger you can overcome this restriction. See System::setFileSystem "blockalign" parameter.

Edit the frame rate of an avi file

Is it possible to change the frame rate of an AVI file using the Video for Windows library? I tried the following steps but did not succeed.
AVIFileInit
AVIFileOpen(OF_READWRITE)
pavi1 = AVIFileGetStream
avi_info = AVIStreamInfo
avi_info.dwRate = 15
EditStreamSetInfo(dwRate) returns -2147467262.
I'm pretty sure the AVIFile* APIs don't support this. (Disclaimer: I was the one who defined those APIs, but it was over 15 years ago...)
You can't just call EditStreamSetInfo on a plain AVIStream, only on one returned from CreateEditableStream.
You could use AVISave, then, but that would obviously re-copy the whole file.
So, yes, you would probably want to do this by parsing the AVI file header enough to find the one DWORD you want to change. There are lots of documents on the RIFF and AVI file formats out there, such as http://www.opennet.ru/docs/formats/avi.txt.
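As a rough sketch of the "find the DWORD and change it" approach, working on the file contents loaded into memory. The assumptions here are mine: the frame rate of a stream is dwRate / dwScale in its 'strh' chunk, those two little-endian DWORDs sit 28 and 32 bytes after the start of the "strh" FOURCC (per the AVISTREAMHEADER layout), and the first 'strh' whose type is "vids" is the one you want. The function names are hypothetical, and a robust tool would walk the RIFF chunk tree instead of scanning for the tag. The main header's dwMicroSecPerFrame is left untouched by this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Read a 32-bit little-endian value at byte position `pos`.
std::uint32_t readLE32(const std::vector<unsigned char>& b, std::size_t pos)
{
    return static_cast<std::uint32_t>(b[pos]) |
           (static_cast<std::uint32_t>(b[pos + 1]) << 8) |
           (static_cast<std::uint32_t>(b[pos + 2]) << 16) |
           (static_cast<std::uint32_t>(b[pos + 3]) << 24);
}

// Write a 32-bit little-endian value at byte position `pos`.
void writeLE32(std::vector<unsigned char>& b, std::size_t pos, std::uint32_t v)
{
    for (int i = 0; i < 4; ++i)
        b[pos + i] = static_cast<unsigned char>((v >> (8 * i)) & 0xFF);
}

// Hypothetical helper: find the first video stream header by a naive scan
// and set its dwScale/dwRate pair. Returns false if none is found.
bool setAviVideoRate(std::vector<unsigned char>& avi,
                     std::uint32_t rate, std::uint32_t scale)
{
    const std::size_t kScaleOffset = 28; // measured from the 's' of "strh"
    const std::size_t kRateOffset = 32;
    const std::size_t kNeeded = kRateOffset + 4;

    for (std::size_t i = 0; i + kNeeded <= avi.size(); ++i) {
        if (std::memcmp(&avi[i], "strh", 4) == 0 &&
            std::memcmp(&avi[i + 8], "vids", 4) == 0) {
            writeLE32(avi, i + kScaleOffset, scale);
            writeLE32(avi, i + kRateOffset, rate);
            return true;
        }
    }
    return false;
}
```

For 15 frames per second you would call setAviVideoRate(bytes, 15, 1) and write the buffer back out. Try it on a copy of the file first.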
I don't know anything about VfW, but you could always try hex-editing the file. The framerate is probably a field somewhere in the header of the AVI file.
Otherwise, you can script some tool like mencoder[1] to copy the stream to a new file under a different framerate.
[1] http://www.mplayerhq.hu/
HRESULT: 0x80004002 (2147500034)
Name: E_NOINTERFACE
Description: The requested COM interface is not available
Severity code: Failed
Facility Code: FACILITY_NULL (0)
Error Code: 0x4002 (16386)
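The breakdown above can be reproduced from the signed return value with a few shifts and masks. A small portable sketch (HresultParts and decodeHresult are hypothetical names; on Windows you would normally use the FAILED, HRESULT_FACILITY and HRESULT_CODE macros from winerror.h instead):

```cpp
#include <cstdint>

// Fields packed into a 32-bit HRESULT.
struct HresultParts
{
    bool failed;            // bit 31: severity (1 = failure)
    std::uint32_t facility; // facility field, starting at bit 16
    std::uint32_t code;     // bits 0..15: error code
};

// Hypothetical helper: split a signed HRESULT as returned by the API.
HresultParts decodeHresult(std::int32_t hr)
{
    const std::uint32_t u = static_cast<std::uint32_t>(hr);
    HresultParts p;
    p.failed = (u >> 31) != 0;
    p.facility = (u >> 16) & 0x1FFF; // same mask as HRESULT_FACILITY
    p.code = u & 0xFFFF;
    return p;
}
```

Feeding it -2147467262, the value from the question, gives failed = true, facility 0 and code 0x4002, i.e. E_NOINTERFACE.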
Does it work if you DON'T call EditStreamSetInfo?
Can you post up the code you use to set the stream info?