How do I read text chunks more quickly with libpng? - c++

With libpng, I’m trying to extract the text chunks from a 44-megabyte PNG image (and preferably validate that the PNG data is not malformed, e.g. lacking IEND, etc.). I could do that with png_read_png and png_get_text, but it took way too long: 0.47 seconds, which I’m pretty sure is because of the massive number of IDAT chunks the image has. How do I do this more quickly?
I didn’t need the pixels, so I tried to make libpng ignore the IDAT chunks. To do that, I tried:
Calling png_read_info(p_png, p_png_information); png_read_image(p_png, nullptr); png_read_end(p_png, p_png_information); hoping to skip the IDAT chunks; it crashed.
Calling png_set_keep_unknown_chunks to make libpng treat IDAT as unknown, and png_set_read_user_chunk_fn(p_png, nullptr, discard_an_unknown_chunk) (discard_an_unknown_chunk is a function that just does return 1;) to discard the unknown chunks; this failed with a weird CRC error on the first IDAT chunk.
Neither approach worked.
Edit
Running as a Node.js C++ addon, mostly written in C++, on Windows 10, with an i9-9900K CPU @ 3.6 GHz and gigabytes of memory.
I read the image file from an SSD with fs.readFileSync, a Node.js method returning a Buffer, and passed it to libpng to process.
Yes, at first I blamed libpng for the prolonged computation. Now I see there might be other causes of the delay. (If that’s the case, this question would be a bad one with an XY problem.) Thank you for your comments; I’ll check my code again more thoroughly.
Edit 2
With every step for feeding the PNG data into the C++ addon kept the same, I ended up manually picking out and decoding only the text chunks, with some C pointer magic and some C++ magic, roughly along the lines of the sketch after the output below. The performance was impressive (0.0020829 seconds of processing), almost immediate, though I don’t yet know exactly why.
B:\__A2MSUB\image-processing-utility>npm run test
> image-processing-utility#1.0.0 test B:\__A2MSUB\image-processing-utility
> node tests/test.js
----- “read_png_text_chunks (manual decoding, not using libpng.)” -----
[
  {
    type: 'tEXt',
    keyword: 'date:create',
    language_tag: null,
    translated_keyword: null,
    content: '2020-12-13T22:01:22+09:00',
    the_content_is_compressed: false
  },
  {
    type: 'tEXt',
    keyword: 'date:modify',
    language_tag: null,
    translated_keyword: null,
    content: '2020-12-13T21:53:58+09:00',
    the_content_is_compressed: false
  }
]
----- “read_png_text_chunks (manual decoding, not using libpng.)” took 0.013713 seconds.
B:\__A2MSUB\image-processing-utility>
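The sketch I mean is something like this (not my exact addon code; the names are simplified, CRC validation is skipped, and it assumes the whole Buffer is already in memory):

#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct text_chunk { std::string keyword, content; };

// PNG stores chunk lengths big-endian.
static uint32_t read_u32_be(const uint8_t* p) {
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16)
         | (uint32_t(p[2]) << 8)  |  uint32_t(p[3]);
}

// Walk the chunk list; every non-tEXt chunk (including each IDAT) is
// skipped by jumping length + 12 bytes ahead, so no pixel data is touched.
std::vector<text_chunk> read_png_text_chunks(const uint8_t* data, size_t size) {
    std::vector<text_chunk> out;
    size_t offset = 8;                        // skip the 8-byte PNG signature
    while (offset + 12 <= size) {
        uint32_t length = read_u32_be(data + offset);
        const uint8_t* type = data + offset + 4;
        const uint8_t* payload = data + offset + 8;
        if (offset + 12 + length > size) break;      // truncated file
        if (std::memcmp(type, "IEND", 4) == 0) break;
        if (std::memcmp(type, "tEXt", 4) == 0) {
            // tEXt payload: keyword, a NUL separator, then Latin-1 text.
            const uint8_t* nul =
                static_cast<const uint8_t*>(std::memchr(payload, 0, length));
            if (nul)
                out.push_back({ std::string(payload, nul),
                                std::string(nul + 1, payload + length) });
        }
        offset += 8 + size_t(length) + 4;     // header + payload + CRC
    }
    return out;
}

The point is that every chunk’s length is known up front, so each IDAT chunk is skipped in constant time and only the few tiny tEXt payloads are ever touched.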

I had to do something similar, but I wanted libpng to do all of the metadata chunk parsing (e.g. the eXIf, gAMA, pHYs, zTXt, cHRM, etc. chunks). Some of these chunks can appear after the IDAT chunks, which means the metadata can't be read with just png_read_info. (The only way to get to them would be to do a full decode of the image, which is expensive, and then call png_read_end.)
My solution was to create a synthetic PNG byte stream that is fed to libpng via the read callback set using png_set_read_fn. In that callback, I skip all IDAT chunks in the source PNG file, and when I get to an IEND chunk, I instead emit a zero-length IDAT chunk.
Now I call png_read_info: it parses all of the metadata in all of the chunks it sees, stopping at the first IDAT, which in my synthetic PNG stream is really the end of the source PNG image. Now I have all of the metadata and can query libpng for it via the png_get_xxx functions.
The read callback that creates the synthetic PNG stream is a little complicated because libpng calls it many times, each time for a small section of the stream. I solved that with a simple state machine that processes the source PNG progressively, producing the synthetic PNG stream on the fly. You could avoid those complexities by producing the synthetic PNG stream up-front in memory before calling png_read_info: without any real IDATs, the full synthetic stream is bound to be small.
While I don't have benchmarks to share here, the final solution is fast because IDATs are skipped entirely and not decoded. (I use a file seek to skip each IDAT in the source PNG after reading the 32-bit chunk length.)
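As a rough illustration of the up-front variant, here is a sketch (untested here, and it assumes the source PNG is already in one memory buffer; my real code streams it through the read callback instead):

#include <cstdint>
#include <cstring>
#include <vector>
#include <zlib.h>   // only for crc32()

// Copy every chunk except the IDATs; when IEND is reached, splice in a
// zero-length IDAT first so png_read_info() has a valid stopping point.
std::vector<uint8_t> make_synthetic_png(const uint8_t* src, size_t size) {
    std::vector<uint8_t> out(src, src + 8);          // PNG signature
    size_t off = 8;
    while (off + 12 <= size) {
        uint32_t len = (uint32_t(src[off]) << 24) | (uint32_t(src[off + 1]) << 16)
                     | (uint32_t(src[off + 2]) << 8) | uint32_t(src[off + 3]);
        const uint8_t* type = src + off + 4;
        size_t chunk_size = 8 + size_t(len) + 4;     // header + data + CRC
        bool is_iend = std::memcmp(type, "IEND", 4) == 0;
        if (is_iend) {
            // Zero-length IDAT: length 0, the type bytes, and the CRC of
            // the type bytes alone (computed rather than hardcoded).
            const uint8_t idat_type[4] = { 'I', 'D', 'A', 'T' };
            uint32_t crc = (uint32_t)crc32(0L, idat_type, 4);
            const uint8_t empty_idat[12] = {
                0, 0, 0, 0, 'I', 'D', 'A', 'T',
                uint8_t(crc >> 24), uint8_t(crc >> 16),
                uint8_t(crc >> 8),  uint8_t(crc) };
            out.insert(out.end(), empty_idat, empty_idat + 12);
        }
        if (std::memcmp(type, "IDAT", 4) != 0)       // drop all real IDATs
            out.insert(out.end(), src + off, src + off + chunk_size);
        if (is_iend) break;
        off += chunk_size;
    }
    return out;
}

Feed the resulting buffer to libpng via png_set_read_fn (the read callback just hands out successive slices of it), call png_read_info, and query the metadata with the png_get_xxx functions as usual.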

You can check that all the correct PNG chunks are present in a file, in the correct order, not repeated, and with correct checksums using pngcheck. It is open source, so you could look at how it works.
If you add the parameter -7, you can not only check the structure but also extract the text:
pngcheck -7 a.png
Output
File: a.png (60041572 bytes)
date:create:
2020-12-24T13:22:41+00:00
date:modify:
2020-12-24T13:22:41+00:00
OK: a.png (10000x1000, 48-bit RGB, non-interlaced, -0.1%).
I generated a 60MB PNG and the above check takes 0.067s on my MacBook Pro.

Related

Why is a different zlib window bits value required for extraction, compared with compression?

I am trying to debug a problem with some code that uses zlib 1.2.8. The problem is that this larger project can make archives, but runs into Z_DATA_ERROR header problems when trying to extract that archive.
To do this, I wrote a small test program in C++ that compresses ("deflates") a specified regular file, writes the compressed data to a second regular file, and extracts ("inflates") to a third regular file, one line at a time. I then diff the first and third files to make sure I get the same bytes.
For reference, this test project is located at: https://github.com/alexpreynolds/zlib-test and compiles under Clang (and should also compile under GNU GCC).
My larger question is how to deal with header data correctly in my larger project.
In my first test scenario, I can set up compression machinery with the following code:
z_error = deflateInit(this->z_stream_ptr, ZLIB_TEST_COMPRESSION_LEVEL);
Here, ZLIB_TEST_COMPRESSION_LEVEL is 1, to provide best speed. I then run deflate() on the z_stream pointer until there is nothing left that comes out of compression.
To extract these bytes, I can use inflateInit():
int ret = inflateInit(this->z_stream_ptr);
So what is the header format, in this case?
In my second test scenario, I set up the deflate machinery like so:
z_error = deflateInit2(this->z_stream_ptr,
                       ZLIB_TEST_COMPRESSION_LEVEL,
                       ZLIB_TEST_COMPRESSION_METHOD,
                       ZLIB_TEST_COMPRESSION_WINDOW_BITS,
                       ZLIB_TEST_COMPRESSION_MEM_LEVEL,
                       ZLIB_TEST_COMPRESSION_STRATEGY);
These deflate constants are, respectively, 1 for level, Z_DEFLATED for method, 15+16 or 31 for window bits, 8 for memory level, and Z_DEFAULT_STRATEGY for strategy.
The former inflateInit() call does not work; instead, I must use inflateInit2() and specify a modified window bits value:
int ret = inflateInit2(this->z_stream_ptr, ZLIB_TEST_COMPRESSION_WINDOW_BITS + 16);
In this case, the window bits value is not 31 as in the deflateInit2() call, but 15+32 or 47.
If I use 31 (or any other value than 47), then I get a Z_DATA_ERROR on subsequent inflate() calls. That is, if I use the same window bits for the inflateInit2() call:
int ret = inflateInit2(this->z_stream_ptr, ZLIB_TEST_COMPRESSION_WINDOW_BITS);
Then I get the following error on attempting to inflate():
Error: inflate to stream failed [-3]
Here, -3 is the same as Z_DATA_ERROR.
According to the documentation, using 31 with deflateInit2() should write a gzip header and trailer. Thus, 31 on the following inflateInit2() call should be expected to be able to extract the header information.
Why is the modified value 47 working, but not 31?
My test project is mostly similar to the example code on the zlib site, with the exception of the extraction/inflation code, which inflates one z_stream chunk at a time and parses the output for newline characters.
Is there something special about running inflate() only when a new buffer of extracted data is asked for — like header information going missing between inflate() calls — as opposed to running the whole extraction in one pass, as in the zlib example code?
My larger debugging problem is looking for a robust way to extract a chunk of zlib-compressed data only on request, so that I can extract data one line at a time, as opposed to getting the whole extracted file. Something about the way I am handling the zlib format parameter seems to be messing me up, but I can't figure out why or how to fix this.
deflateInit() and inflateInit(), as well as deflateInit2() and inflateInit2() with windowBits in 0..15 all process zlib-wrapped deflate data. (See RFC 1950 and RFC 1951.)
deflateInit2() and inflateInit2() with negative windowBits in -1..-15 process raw deflate data with no header or trailer. deflateInit2() and inflateInit2() with windowBits in 16..31, i.e. 16 added to 0..15, process gzip-wrapped deflate data (RFC 1952). inflateInit2() with windowBits in 32..47 (32 added to 0..15) will automatically detect either a gzip or zlib header (but not raw deflate data), and decompress accordingly.
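For illustration, here is a minimal round trip (a sketch: error checking stripped, and the buffers assumed large enough for this toy input) showing that windowBits 31 produces a gzip stream, and that both 31 and 47 inflate it back:

#include <cassert>
#include <cstring>
#include <zlib.h>

int main() {
    const char* text = "hello, hello, hello, hello";
    unsigned char gz[256], plain[256];

    // Compress with windowBits 31 -> gzip-wrapped deflate stream.
    z_stream d = {};
    deflateInit2(&d, 1, Z_DEFLATED, 31, 8, Z_DEFAULT_STRATEGY);
    d.next_in = (Bytef*)text;  d.avail_in = (uInt)strlen(text) + 1;
    d.next_out = gz;           d.avail_out = sizeof gz;
    deflate(&d, Z_FINISH);
    uInt gz_len = (uInt)(sizeof gz - d.avail_out);
    deflateEnd(&d);

    // Decompress: windowBits 31 (gzip only) and 47 (auto-detect) both work.
    const int wbits_options[2] = { 31, 47 };
    for (int i = 0; i < 2; i++) {
        z_stream f = {};
        inflateInit2(&f, wbits_options[i]);
        f.next_in = gz;       f.avail_in = gz_len;
        f.next_out = plain;   f.avail_out = sizeof plain;
        int ret = inflate(&f, Z_FINISH);
        inflateEnd(&f);
        assert(ret == Z_STREAM_END && strcmp((char*)plain, text) == 0);
    }
    return 0;
}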
Why is the modified value 47 working, but not 31?
31 does work. I did not try to look at your code to debug it.
Is there something special about running inflate() only when a new buffer of extracted data is asked for — like header information going missing between inflate() calls — as opposed to running the whole extraction in one pass, as in the zlib example code?
I can't figure out what you're asking here. Perhaps a more explicit example would help. The whole point of inflate() is to decompress a chunk at a time.

Can't get CreateDDSTextureFromFile to work

So, I've been trying to figure out my problem for a few hours now, but I have no idea what I'm doing wrong. I'm a noob when it comes to DirectX programming, so I've been following some tutorials, and right now I'm trying to create an OBJ loader.
http://www.braynzarsoft.net/index.php?p=D3D11OBJMODEL
However, I can't get my texture to work.
This is how I try to load the DDS-texture:
ID3D11ShaderResourceView* tempMeshSRV = nullptr;
hr = CreateDDSTextureFromFile(gDevice, L"boxTexture.dds", NULL, &tempMeshSRV);
if (SUCCEEDED(hr))
{
    textureNameArray.push_back(L"boxTexture.dds");
    material[matCount - 1].texArrayIndex = meshSRV.size();
    meshSRV.push_back(tempMeshSRV);
    material[matCount - 1].hasTexture = true;
}
However, my HRESULT never succeeds, but it doesn't crash either. If I hover over the hr, it just says "HRESULT_FROM_WIN32(ERROR_NOT_SUPPORTED)". I also tried removing the if statement, but that just turns my box black.
Any idea on what I'm doing wrong? =/
Thanks in advance!
The most likely problem is that your "boxTexture.dds" is a 24 bit-per-pixel format file. In Direct3D 9, this was D3DFMT_R8G8B8 and was reasonably common. However, there is no DXGI equivalent format for 24 bits-per-pixel and it therefore requires format conversion to work.
The DDSTextureLoader module in DirectX Tool Kit is designed to be a minimum-overhead function, and therefore does no runtime conversions at all. If the data directly maps to a DXGI format, it loads. If it doesn't, it fails with HRESULT_FROM_WIN32(ERROR_NOT_SUPPORTED).
There are two different solutions depending on your usage scenario.
The ideal solution is to convert 'boxTexture.dds' to a supported format. You can do this with the texconv command-line tool provided with DirectXTex. This is by far the best option, so that the potentially expensive conversion operation is done once and not every single time your application runs and loads the data.
If you don't actually control the source of the .dds files you are trying to load (i.e. they are arbitrary files provided by a user, or you are building some kind of content tool that has to support legacy formats), then you should make use of the DirectXTex 'full-fat' LoadFromDDSFile function, which has extensive conversion code for handling legacy DDS file formats.
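Roughly, that fallback path looks like this (a minimal sketch; LoadLegacyDDS is just an illustrative helper name, and error handling is trimmed):

#include <d3d11.h>
#include "DirectXTex.h"

// Fall back to the full DirectXTex loader, which can convert legacy
// D3DFMT_* layouts that the lightweight DDSTextureLoader rejects.
HRESULT LoadLegacyDDS(ID3D11Device* device, const wchar_t* path,
                      ID3D11ShaderResourceView** srv)
{
    DirectX::TexMetadata meta;
    DirectX::ScratchImage image;
    HRESULT hr = DirectX::LoadFromDDSFile(path, DirectX::DDS_FLAGS_NONE,
                                          &meta, image);
    if (FAILED(hr))
        return hr;
    // Create the SRV from the (possibly converted) scratch image.
    return DirectX::CreateShaderResourceView(device, image.GetImages(),
                                             image.GetImageCount(), meta, srv);
}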
Note this situation can happen for a number of legacy-format DDS files, as listed in the CodePlex wiki documentation:
D3DFMT_R8G8B8 (24bpp RGB) - Use a 32bpp format
D3DFMT_X8B8G8R8 (32bpp RGBX) - Use BGRX, BGRA, or RGBA
D3DFMT_A2R10G10B10 (BGRA 10:10:10:2) - Use RGBA 10:10:10:2
D3DFMT_X1R5G5B5 (BGR 5:5:5) - Use BGRA 5:5:5:1 or BGR 5:6:5
D3DFMT_A8R3G3B2, D3DFMT_R3G3B2 (BGR 3:3:2) - Expand to a supported format
D3DFMT_P8, D3DFMT_A8P8 (8-bit palette) - Expand to a supported format
D3DFMT_A4L4 (Luminance 4:4) - Expand to a supported format
D3DFMT_UYVY (YUV 4:2:2 16bpp) - Swizzle to YUY2
See also Direct3D 11 Textures and Block Compression
If you look at the source code for CreateTextureFromDDS (which is called by CreateDDSTextureFromFile to do the main data processing) - http://directxtk.codeplex.com/SourceControl/latest#Src/DDSTextureLoader.cpp - you will see that there are a lot of reasons you could be getting "HRESULT_FROM_WIN32(ERROR_NOT_SUPPORTED)".
It's not likely a problem with opening or reading the file, since that would return a different error code. So most likely it's an unsupported DXGI_FORMAT, a malformed cubemap, an invalid mipmap count, or invalid image dimensions (i.e. larger than the limits found here: http://msdn.microsoft.com/en-us/library/ff819065(v=vs.85).aspx ).

Saving output frame as an image file CUDA decoder

I am trying to save the decoded image file back as a BMP image using the code in CUDA Decoder project.
if (g_bReadback && g_ReadbackSID)
{
    CUresult result = cuMemcpyDtoHAsync(g_bFrameData[active_field], pDecodedFrame[active_field], (nDecodedPitch * nHeight * 3 / 2), g_ReadbackSID);
    long padded_size = (nWidth * nHeight * 3);
    CString output_file;
    output_file.Format(_T("image/sample_45.BMP"));
    SaveBMP(g_bFrameData[active_field], nWidth, nHeight, padded_size, output_file);
    if (result != CUDA_SUCCESS)
    {
        printf("cuMemcpyDtoHAsync returned %d\n", (int)result);
    }
}
But the saved image looks like this
Can anybody help me out here? What am I doing wrong? Thank you.
After investigating further, there were several modifications I made to your approach.
pDecodedFrame is actually in some non-RGB format; I think it is NV12, which I believe is a particular YUV variant.
pDecodedFrame gets converted to an RGB format on the GPU using a particular CUDA kernel.
The target buffer for this conversion will either be a surface provided by OpenGL if g_bUseInterop is specified, or else an ordinary region allocated by the driver API equivalent of cudaMalloc if interop is not specified.
The target buffer mentioned above is pInteropFrame (even in the non-interop case). So to make an example for you, for simplicity I chose to only use the non-interop case, because it's much easier to grab the RGB buffer (pInteropFrame) in that case.
The method here copies pInteropFrame back to the host, after it has been populated with the appropriate RGB image by cudaPostProcessFrame. There is also a routine to save the image as a bitmap file. All of my modifications are delineated with comments that include RMC so search for that if you want to find all the changes/additions I made.
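In outline, the capture step amounts to something like this (a hedged sketch, not the actual replacement code; it assumes pInteropFrame is indexed per field and holds packed 32-bit RGBA after post-processing, and it reuses the SaveBMP routine from the question):

// Sketch of the copy-back step only: assumes cudaPostProcessFrame has
// already written RGBA pixels into pInteropFrame for this field.
size_t rgb_size = (size_t)nWidth * nHeight * 4;
unsigned char *h_frame = NULL;
cuMemAllocHost((void **)&h_frame, rgb_size);            // pinned host buffer
cuMemcpyDtoH(h_frame, pInteropFrame[active_field], rgb_size);
SaveBMP(h_frame, nWidth, nHeight, rgb_size, _T("framecap.bmp"));
cuMemFreeHost(h_frame);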
To use it, drop this file into the cudaDecodeGL project as a replacement for the videoDecodeGL.cpp source file, then rebuild the project. Run the executable normally to display the video. To capture a specific frame, run the executable with the nointerop command-line switch, e.g. cudaDecodeGL nointerop; the video will not display, but the decode operation and frame capture will take place, and the frame will be saved in a framecap.bmp file. If you want to change which frame number is captured, modify the g_FrameCapSelect = 37; variable to some other number and recompile.
Here is the replacement for videoDecodeGL.cpp. I used pastebin because SO has a limit on the number of characters that can be entered in a post body.
Note that my approach is independent of whether readback is specified. I would recommend not using readback for this sequence.

Streaming MP3 from Internet with FMOD

I thought this would be a relatively simple task with something like FMOD, but I can't get it to work. Even the example code netstream doesn't seem to do the trick. No matter what mp3 I try to play with the netstream example program, I get this error:
FMOD error! (20) Couldn't perform seek operation. This is a limitation of the medium (ie netstreams) or the file format.
I don't really understand what this means. Isn't this exactly what the netstream example program is for? To stream a file from the internet?
I can't get past the createSound method:
result = system->createSound(argv[1], FMOD_HARDWARE | FMOD_2D | FMOD_CREATESTREAM | FMOD_NONBLOCKING, 0, &sound);
EDIT:
This is what I modified after reading Mathew's answer
FMOD_CREATESOUNDEXINFO soundExInfo;
memset(&soundExInfo, 0, sizeof(FMOD_CREATESOUNDEXINFO));
soundExInfo.cbsize = sizeof(FMOD_CREATESOUNDEXINFO);
soundExInfo.suggestedsoundtype = FMOD_SOUND_TYPE_MPEG;
result = system->createSound(argv[1], FMOD_HARDWARE | FMOD_2D | FMOD_CREATESTREAM | FMOD_NONBLOCKING | FMOD_IGNORETAGS, &soundExInfo, &sound);
I get two different errors depending on which files I use.
Test 1
URL: http://kylegobel.com/test.mp3
Test 1 Error: (25) Unsupported file or audio format.
Test 2 URL: http://kylegobel.com/bullet.mp3
Test 2 Error: (20) Couldn't perform seek operation. This is a limitation of the medium (ie netstreams) or the file format.
Before I made the change, I could use netstream to play "C:\test.mp3", which is the same file named test.mp3 on the web, but that no longer works with the above changes. Maybe these files are just in the wrong format or something? Sorry for my lack of knowledge in this area; I really don't know much about sound, but I'm trying to figure it out.
Thanks,
Kyle
It's possible the MP3 has a large amount of tags at the start, so FMOD reads them then tries to seek back to the start (which it can't do because it's a net stream). Can you try using FMOD_IGNORETAGS and perhaps FMOD_CREATESOUNDEXINFO with suggestedsoundtype set to FMOD_SOUND_TYPE_MPEG?
If that doesn't work, could you post the URL of a known not-working MP3 stream?
EDIT:
The file in question has around 60 KB of tag data. FMOD is happy to read over that stuff, but for the MPEG codec to work it needs to do some small seeks. Since you cannot seek a netstream, all the seeks must be contained inside the low-level file buffer. If you tweak the file buffer size and make it a bit larger, you can overcome this restriction. See the System::setFileSystem "blockalign" parameter.
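For example, something along these lines before the createSound call (a sketch; the exact setFileSystem callback parameter list varies between FMOD versions, so match your header and pass 0 for any callbacks you don't override):

// Keep the default file callbacks but enlarge the low-level file buffer so
// the MPEG codec's small seeks stay inside it. 128 KB here; tune as needed.
result = system->setFileSystem(0, 0, 0, 0, 128 * 1024);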

how do I read a huge .gz file (more than 5 gig uncompressed) in c

I have some .gz compressed files which are around 5-7 GB uncompressed.
These are flat files.
I've written a program that takes an uncompressed file and reads it line by line, which works perfectly.
Now I want to be able to open the compressed files in memory and run my little program.
I've looked into zlib, but I can't find a good solution.
Loading the entire file is impossible using gzread(gzFile, void *, unsigned), because of the 32-bit unsigned int limitation.
I've tried gzgets, but this almost doubles the execution time vs. reading with gzread. (I tested on a 2 GB sample.)
I've also looked into "buffering", such as splitting the gzread process into multiple 2 GB chunks, finding the last newline using strrchr, and then using gzseek.
But gzseek will emulate a total decompression of the file, which is very slow.
I fail to see any sane solution to this problem.
I could always do some checking, such as whether or not the current line actually ends with a newline (it should only be missing in the last partially read line), and then read more data from the point in the program where this occurs.
But this could get very ugly.
Does anyone have any suggestions?
thanks
edit:
I don't need to have the entire file at once, just one line at a time, but I've got a fairly big machine, so if that were the easiest route I'd have no problem.
For all those suggesting piping through stdin: I've experienced extreme slowdowns compared to opening the file directly. Here is a small code snippet I made some months ago that illustrates it.
time ./a.out 59846/59846.txt
# 59846/59846.txt
18255221
real 0m4.321s
user 0m2.884s
sys 0m1.424s
time ./a.out <59846/59846.txt
18255221
real 1m56.544s
user 1m55.043s
sys 0m1.512s
And the source code
#include <iostream>
#include <fstream>
#include <cstdio>   // for printf
#define LENS 10000

int main(int argc, char **argv) {
    std::istream *pFile;
    if (argc == 2)                  // if a filename argument was supplied
        pFile = new std::ifstream(argv[1], std::ios::in);
    else                            // otherwise we read from stdin
        pFile = &std::cin;
    char line[LENS];
    if (argc == 2)                  // if we are using a filename, print it
        printf("#\t%s\n", argv[1]);
    if (!*pFile) {                  // check the stream state, not the pointer
        printf("Do you have permission to open the file?\n");
        return 0;
    }
    int numRow = 0;
    while (!pFile->eof()) {
        numRow++;
        pFile->getline(line, LENS);
    }
    if (argc == 2)
        delete pFile;
    printf("%d\n", numRow);
    return 0;
}
Thanks for your replies; I'm still waiting for the golden apple.
edit2:
Using C-style FILE pointers instead of C++ streams is much, much faster, so I think that is the way to go.
Thanks for all your input.
gzip -cd compressed.gz | yourprogram
Just go ahead and read it line by line from stdin as if it were uncompressed.
EDIT: Response to your remarks about performance. You're saying that reading STDIN line by line is slow compared to reading an uncompressed file directly. The difference lies in buffering. Normally a pipe yields to STDIN as soon as output becomes available (no, or very small, buffering there). You can do "buffered block reads" from STDIN and parse the read blocks yourself to gain performance.
You can achieve the same result, with possibly better performance, by using gzread() as well. (Read a big chunk, parse the chunk, read the next chunk, repeat.)
gzread only reads chunks of the file; you loop on it as you would with a normal read() call.
Do you need to read the entire file into memory?
If what you need is to read lines, you'd gzread() a sizable chunk (say 8192 bytes) into a buffer, loop through that buffer to find all the '\n' characters, and process those as individual lines. You have to save the last piece in case it is just part of a line, and prepend it to the data you read next time. Something like the sketch below.
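A sketch of that loop (error handling trimmed; it just counts lines where you would do your real processing):

#include <cstdio>
#include <cstring>
#include <string>
#include <zlib.h>

int main(int argc, char **argv) {
    if (argc != 2) return 1;
    gzFile f = gzopen(argv[1], "rb");
    if (!f) { std::fprintf(stderr, "cannot open %s\n", argv[1]); return 1; }

    char buf[8192];
    std::string carry;            // partial line left over from the last chunk
    long lines = 0;
    int n;
    while ((n = gzread(f, buf, sizeof buf)) > 0) {
        char *p = buf, *end = buf + n;
        for (char *nl; (nl = (char *)std::memchr(p, '\n', end - p)) != 0; p = nl + 1) {
            carry.append(p, nl);  // carry + this piece = one complete line
            ++lines;              // ...process the line in `carry` here...
            carry.clear();
        }
        carry.append(p, end);     // stash the trailing partial line
    }
    if (!carry.empty()) ++lines;  // final line with no trailing newline
    gzclose(f);
    std::printf("%ld\n", lines);
    return 0;
}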
You could also read from stdin and invoke your app like
zcat bigfile.gz | ./yourprogram
in which case you can use fgets and similar on stdin. This is also beneficial in that you'd run decompression on one processor and processing the data on another processor :-)
I don't know if this will be an answer to your question, but I believe it's more than a comment:
Some months ago I discovered that the contents of Wikipedia can be downloaded in much the same way as the StackOverflow data dump. Both decompress to XML.
I came across a description of how the multi-gigabyte compressed dump file could be parsed. It was done by Perl scripts, actually, but the relevant part for you was that Bzip2 compression was used.
Bzip2 is a block compression scheme, and the compressed file could be split into manageable pieces, and each part uncompressed individually.
Unfortunately, I don't have a link to share with you, and I can't suggest how you would search for it, except to say that it was described on a Wikipedia 'data dump' or 'blog' page.
EDIT: Actually, I do have a link