I'm trying to convert a video file (.mp4) to a DICOM file.
I have succeeded in doing it by storing single images (one per frame of the video) in the DICOM, but the resulting file is far too large, which is not acceptable for my use case.
Instead, I want to encapsulate the H.264 bitstream, as it is stored in the video file, into the DICOM file.
I've tried to read the bytes of the file as follows:
std::ifstream inFile(file_name, std::ifstream::binary);
inFile.seekg(0, inFile.end);
std::streampos length = inFile.tellg();
inFile.seekg(0, inFile.beg);
std::vector<unsigned char> bytes(length);
inFile.read(reinterpret_cast<char*>(bytes.data()), length);
but I think I am missing something, such as an encapsulation step for the bytes I read, because the resulting DICOM file showed only a black image.
In Python I would use the pydicom.encaps.encapsulate function for this purpose:
https://pydicom.github.io/pydicom/dev/reference/generated/pydicom.encaps.encapsulate.html
with open(videofile, 'rb') as f:
dataset.PixelData = encapsulate([f.read()])
Is there anything in C++ that is equivalent to the encapsulate function, or any different way to get the encapsulated pixel data of the video as one stream rather than frame by frame?
This is the code that initializes the DcmDataset using the extracted bytes:
VideoFileStream* vfs = new VideoFileStream();
vfs->setFilename(file_name);
if (!vfs->open())
    return false;
DcmDataset* dataset = new DcmDataset();
char uid[100]; // reusable buffer; avoids leaking the new char[100] previously passed in
dataset->putAndInsertOFStringArray(DCM_SeriesInstanceUID, dcmGenerateUniqueIdentifier(uid, SITE_SERIES_UID_ROOT));
dataset->putAndInsertOFStringArray(DCM_SOPInstanceUID, dcmGenerateUniqueIdentifier(uid, SITE_INSTANCE_UID_ROOT));
dataset->putAndInsertOFStringArray(DCM_StudyInstanceUID, dcmGenerateUniqueIdentifier(uid, SITE_STUDY_UID_ROOT));
dataset->putAndInsertOFStringArray(DCM_MediaStorageSOPInstanceUID, dcmGenerateUniqueIdentifier(uid, SITE_UID_ROOT));
dataset->putAndInsertString(DCM_MediaStorageSOPClassUID, UID_VideoPhotographicImageStorage);
dataset->putAndInsertString(DCM_SOPClassUID, UID_VideoPhotographicImageStorage);
dataset->putAndInsertOFStringArray(DCM_PatientID, "987655");
dataset->putAndInsertOFStringArray(DCM_StudyDate, "20050509");
dataset->putAndInsertOFStringArray(DCM_Modality, "ES");
dataset->putAndInsertOFStringArray(DCM_PhotometricInterpretation, "YBR_PARTIAL_420");
dataset->putAndInsertUint16(DCM_SamplesPerPixel, 3);
dataset->putAndInsertUint16(DCM_BitsAllocated, 8);
dataset->putAndInsertUint16(DCM_BitsStored, 8);
dataset->putAndInsertUint16(DCM_HighBit, 7);
dataset->putAndInsertUint16(DCM_Rows, vfs->height());
dataset->putAndInsertUint16(DCM_Columns, vfs->width());
dataset->putAndInsertString(DCM_CineRate, std::to_string(vfs->framerate()).c_str());
dataset->putAndInsertString(DCM_FrameTime, std::to_string(1000.0 / vfs->framerate()).c_str()); // Frame Time has VR DS, not US
const Uint16 frameTimeTag[] = { 0x0018, 0x1063 }; // Frame Increment Pointer: the attribute tag of Frame Time (0018,1063)
dataset->putAndInsertUint16Array(DCM_FrameIncrementPointer, frameTimeTag, 2);
dataset->putAndInsertString(DCM_NumberOfFrames, std::to_string(vfs->numFrames()).c_str());
dataset->putAndInsertOFStringArray(DCM_FrameOfReferenceUID, dcmGenerateUniqueIdentifier(uid, SITE_UID_ROOT));
dataset->putAndInsertUint16(DCM_PixelRepresentation, 0);
dataset->putAndInsertUint16(DCM_PlanarConfiguration, 0);
dataset->putAndInsertOFStringArray(DCM_ImageType, "ORIGINAL");
dataset->putAndInsertOFStringArray(DCM_LossyImageCompression, "01");
dataset->putAndInsertOFStringArray(DCM_LossyImageCompressionMethod, "ISO_14496_10");
dataset->putAndInsertString(DCM_LossyImageCompressionRatio, "30"); // VR DS
dataset->putAndInsertUint8Array(DCM_PixelData, (const Uint8*)bytes.data(), length);
// Note: the Transfer Syntax UID (0002,0010) belongs in the file meta header, not the
// dataset; DCMTK derives it from the transfer syntax chosen below.
DJ_RPLossy repParam;
dataset->chooseRepresentation(EXS_MPEG4HighProfileLevel4_1, &repParam);
dataset->updateOriginalXfer();
DcmFileFormat fileformat(dataset);
OFCondition status = fileformat.saveFile("C://temp//videoTest", EXS_LittleEndianExplicit);
The trick is to redirect the value of the PixelData attribute to a file stream. With this, the video is loaded in chunks and on demand (i.e. when the attribute is accessed).
But you have to create the whole structure explicitly, that is:
The Pixel Data element
The Pixel Sequence with...
...the offset table
...a single item containing the contents of the MPEG file
Code
// set length to the size of the video file
DcmInputFileStream dcmFileStream(videofile.c_str(), 0);
DcmPixelSequence* pixelSequence = new DcmPixelSequence(DCM_PixelSequenceTag);
DcmPixelItem* offsetTable = new DcmPixelItem(DCM_PixelItemTag);
pixelSequence->insert(offsetTable);
DcmPixelItem* frame = new DcmPixelItem(DCM_PixelItemTag);
frame->createValueFromTempFile(dcmFileStream.newFactory(), OFstatic_cast(Uint32, length), EBO_LittleEndian);
pixelSequence->insert(frame);
DcmPixelData* pixelData = new DcmPixelData(DCM_PixelData);
pixelData->putOriginalRepresentation(EXS_MPEG4HighProfileLevel4_1, nullptr, pixelSequence);
dataset->insert(pixelData, true);
DcmFileFormat fileformat(dataset);
OFCondition status = fileformat.saveFile("C://temp//videoTest");
Note that you "destroy" the compression if you save the file in the Implicit VR Little Endian transfer syntax.
As mentioned above, and as the code shows, the whole MPEG file is wrapped into a single item in the Pixel Data element. This is DICOM conformant, but you may want to encapsulate each frame in its own item instead.
Note: no error handling is shown here.
Related
I am trying to read CSV input using Apache Arrow. The example here mentions that the input should be an InputStream; however, in my case I just have a std::vector of unsigned chars. Is it possible to parse this using Apache Arrow? I have checked the I/O interface to see if there is an "in-memory" data structure, with no luck.
I copy-paste the example code here for convenience, along with my input data:
#include "arrow/csv/api.h"

{
    // ...
    std::vector<unsigned char> data;
    arrow::io::IOContext io_context = arrow::io::default_io_context();
    // how can I fit the std::vector to the input stream?
    std::shared_ptr<arrow::io::InputStream> input = ...;

    auto read_options = arrow::csv::ReadOptions::Defaults();
    auto parse_options = arrow::csv::ParseOptions::Defaults();
    auto convert_options = arrow::csv::ConvertOptions::Defaults();

    // Instantiate TableReader from input stream and options
    auto maybe_reader = arrow::csv::TableReader::Make(io_context,
                                                      input,
                                                      read_options,
                                                      parse_options,
                                                      convert_options);
    if (!maybe_reader.ok()) {
        // Handle TableReader instantiation error...
    }
    std::shared_ptr<arrow::csv::TableReader> reader = *maybe_reader;

    // Read table from CSV file
    auto maybe_table = reader->Read();
    if (!maybe_table.ok()) {
        // Handle CSV read error
        // (for example a CSV syntax error or failed type conversion)
    }
    std::shared_ptr<arrow::Table> table = *maybe_table;
}
Any help would be appreciated!
The I/O interface docs list BufferReader, which works as an in-memory input stream. While not shown in the docs, it can be constructed from a pointer and a size, which lets you use your std::vector<unsigned char>.
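A minimal sketch of what that could look like, assuming the Arrow C++ headers are available (`data` is the vector from the question, and the raw-pointer constructor is the one referred to above):

```cpp
#include <arrow/io/memory.h>
#include <memory>
#include <vector>

std::vector<unsigned char> data = /* ... your CSV bytes ... */;

// BufferReader acts as an in-memory InputStream over bytes you already own;
// note that `data` must stay alive for as long as the reader is in use.
auto input = std::make_shared<arrow::io::BufferReader>(
    data.data(), static_cast<int64_t>(data.size()));

// `input` can now be passed to arrow::csv::TableReader::Make as in the example.
```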
I am able to successfully get the decoded PCM data of an audio file using the Core Audio API. Below is the reduced code that shows how I do that:
CFStringRef urlStr = CFStringCreateWithCString(kCFAllocatorDefault, "file.m4a", kCFStringEncodingUTF8);
CFURLRef urlRef = CFURLCreateWithFileSystemPath(NULL, urlStr, kCFURLPOSIXPathStyle, false);
ExtAudioFileOpenURL(urlRef, &m_audioFile);
bzero(&m_outputFormat, sizeof(AudioStreamBasicDescription));
m_outputFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsPacked;
m_outputFormat.mSampleRate = m_inputFormat.mSampleRate;
m_outputFormat.mFormatID = kAudioFormatLinearPCM;
m_outputFormat.mChannelsPerFrame = m_inputFormat.mChannelsPerFrame;
m_outputFormat.mBytesPerFrame = sizeof(short) * m_outputFormat.mChannelsPerFrame;
m_outputFormat.mBitsPerChannel = sizeof(short) * 8;
m_outputFormat.mFramesPerPacket = 1;
m_outputFormat.mBytesPerPacket = m_outputFormat.mBytesPerFrame * m_outputFormat.mFramesPerPacket;
ExtAudioFileSetProperty(m_audioFile, kExtAudioFileProperty_ClientDataFormat, sizeof(m_outputFormat), &m_outputFormat);
short* transformData = new short[m_sampleCount];
AudioBufferList fillBufList;
fillBufList.mNumberBuffers = 1;
fillBufList.mBuffers[0].mNumberChannels = channels;
fillBufList.mBuffers[0].mDataByteSize = m_sampleCount * sizeof(short);
fillBufList.mBuffers[0].mData = (void*)(&transformData[0]);
ExtAudioFileRead(m_audioFile, &m_frameCount, &fillBufList);
What I would like to know is how I can specify which audio track to decode (supposing the media file contains more than one).
One method is to decode all tracks and then extract (copy) the desired track's data (every other sample for interleaved stereo, etc.) into another buffer, array, or file. Compared to the decode time, the extra copy time is insignificant.
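That per-channel copy can be sketched in plain C++ (a standalone illustration with no Core Audio types; `extractChannel` is a hypothetical helper name):

```cpp
#include <cstddef>
#include <vector>

// Extract one channel from interleaved PCM. For stereo, samples are laid out
// L R L R ...; channelIndex selects which channel to keep (0 = left, 1 = right).
std::vector<short> extractChannel(const std::vector<short>& interleaved,
                                  std::size_t channels,
                                  std::size_t channelIndex) {
    std::vector<short> out;
    out.reserve(interleaved.size() / channels);
    for (std::size_t i = channelIndex; i < interleaved.size(); i += channels)
        out.push_back(interleaved[i]);
    return out;
}
```

For example, with interleaved stereo `{1, 10, 2, 20, 3, 30}`, channel 0 yields `{1, 2, 3}` and channel 1 yields `{10, 20, 30}`.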
I want to send an image over a UDP network in small packets of size 1024 bytes.
I have two options:
imgBinaryFormatter->Serialize(memStream, objNewImage); // Sending an image object
OR
imgBinaryFormatter->Serialize(memStream, objNewImage->RawData); // Sending the raw data of the image
What is the difference in their content, and when should each be used?
For reference, the full function is given below:
Image^ objNewImage = Image::FromFile(fullPath); // fullpath is full path of an image
MemoryStream^ memStream = gcnew MemoryStream();
Formatters::Binary::BinaryFormatter^ imgBinaryFormatter = gcnew Formatters::Binary::BinaryFormatter(); // Binary formatter
imgBinaryFormatter->Serialize(memStream, objNewImage); // Or objNewImage->RawData ??
arrImgArray = memStream->ToArray(); // Convert stream to byte array
int iNoOfPackets = arrImgArray->Length / 1024;
int i;
for (i = 0; i < iNoOfPackets; i++){ // start at 0 so no chunk is skipped
    socket->SendTo(arrImgArray, 1024 * i, 1024, SocketFlags::None, receiversAdd);
}
int remainedBytes = arrImgArray->Length - 1024 * iNoOfPackets;
if (remainedBytes > 0)
    socket->SendTo(arrImgArray, 1024 * iNoOfPackets, remainedBytes, SocketFlags::None, receiversAdd);
If you find improvements in the code, feel free to edit it with a solution suitable for a memory-constrained application.
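As one possible sanity check, the packet offset arithmetic can be sketched in portable C++ (a standalone illustration, not the C++/CLI code above; `chunkRanges` is a hypothetical helper):

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Split a buffer of `total` bytes into (offset, length) chunks of at most
// `chunkSize` bytes, covering every byte exactly once. The last chunk may
// be shorter than chunkSize.
std::vector<std::pair<std::size_t, std::size_t>>
chunkRanges(std::size_t total, std::size_t chunkSize) {
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    for (std::size_t off = 0; off < total; off += chunkSize)
        ranges.emplace_back(off, std::min(chunkSize, total - off));
    return ranges;
}
```

For a 2500-byte image and 1024-byte packets this yields three ranges: two full packets and a 452-byte remainder, with no byte sent twice and none skipped.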
It's better to use
the Image.Save Method (Stream, ImageFormat) for serialization into a Stream
and
the Image.FromStream Method (Stream) for deserialization from a Stream
or one of their overloads
I am uploading a large file from my iOS app and the file is transferred in chunks. I am using the code below to initialise an NSInputStream for each chunk:
// for example
NSInteger chunkCount = 20;
for (int i = 0; i < chunkCount; i++) {
    NSFileHandle *handle = [NSFileHandle fileHandleForReadingAtPath:filePath];
    [handle seekToFileOffset:(unsigned long long)i * (chunkCount == 1 ? fileSize : chunkSize)];
    NSData *fileData = [handle readDataOfLength:chunkSize];
    NSInputStream *iStream = [[NSInputStream alloc] initWithData:fileData];
}
But I'd like to know whether there is a way to initialise the NSInputStream from a range of the file stream rather than from NSData.
Thanks
There is the NSStreamFileCurrentOffsetKey property for file streams, which specifies the read offset.
NSInputStream *s = [NSInputStream inputStreamWithFileAtPath:path];
[s setProperty:@(offset) forKey:NSStreamFileCurrentOffsetKey]; // the offset must be boxed as an NSNumber
I am using a MySQL database for my face recognition project. I store images in a table, where they are kept as BLOBs. When I select these images back, I currently write each BLOB to a separate image file (using a file pointer) in the folder where the program resides, and then, in order to use the images, I have to read them back from that folder before passing them to another function. What I need is to use the BLOB images directly from the database: when I select one from the DB, I want to pass it straight on to another function, so I can avoid writing it out and reading it in again. I think this requires converting its type somehow.
void SelectImage()
{
MYSQL *conn;
MYSQL_RES *result;
MYSQL_ROW row;
char temp[900];
char filename[50];
unsigned long *lengths;
FILE *fp;
my_ulonglong numRows;
unsigned int numFields;
int mysqlStatus = 0;
MYSQL_RES *mysqlResult = NULL;
conn = mysql_init(NULL);
mysql_real_connect(conn, "localhost", "root", "athira#iot", "Athira", 0, NULL, 0);
int state=mysql_query(conn, "SELECT * FROM ima1");
mysqlResult=mysql_store_result(conn);
if(mysqlResult)
{
numRows=mysql_num_rows(mysqlResult);
}
cout<<"rows:"<<numRows<<endl;
for(int i=1;i<=numRows;i++){
sprintf(temp,"SELECT data FROM ima1 WHERE id=%d",i);
sprintf(filename,"Gen_Image%d.jpeg",i);
fp = fopen(filename, "wb");// open a file for writing.
cout<<temp<<" to "<<filename<<endl;
mysql_query(conn, temp);//select an image with id
result = mysql_store_result(conn);
row = mysql_fetch_row(result);//row contains row data
lengths = mysql_fetch_lengths(result);//this is the length of th image
fwrite(row[0], lengths[0], 1, fp);//writing image to a file
cout<<"selected..."<<endl;
img.create(100, 100, CV_16UC1);
memcpy(img.data, row.data, lengths); // <-- this is the line the compiler rejects
mysql_free_result(result);
fclose(fp);
}
mysql_close(conn);
}
I tried to convert it to a Mat type but it shows an error:
error: request for member ‘data’ in ‘row’, which is of non-class type ‘MYSQL_ROW {aka char**}’
If your images are stored using some image format (JPEG, PNG, BMP, ...) and not as raw pixel data, then you should be looking at the imdecode function in OpenCV.
If your data is stored as raw pixels, then you'd have to make sure that the memory layout of your BLOB data can be expressed using the step/stride functionality OpenCV uses to lay out its matrices.
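For the first case, a minimal sketch of decoding the BLOB straight into a cv::Mat, assuming OpenCV is linked and using `row` and `lengths` exactly as they appear in the question's loop:

```cpp
#include <opencv2/imgcodecs.hpp>
#include <vector>

// row[0] points at the BLOB bytes and lengths[0] is their count
// (from mysql_fetch_row / mysql_fetch_lengths in the question).
std::vector<unsigned char> buf(row[0], row[0] + lengths[0]);
cv::Mat img = cv::imdecode(buf, cv::IMREAD_COLOR); // returns an empty Mat on failure

if (!img.empty()) {
    // pass img to the next function directly; no temporary file needed
}
```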