Is it possible for the PTS of a particular frame in a file to differ from the PTS of the same frame in the same file while it is being streamed?
When I read a frame using av_read_frame I store the video stream in an AVStream. After I decode the frame with avcodec_decode_video2, I store the time stamp of that frame in an int64_t using av_frame_get_best_effort_timestamp. Now if the program is getting its input from a file I get a different timestamp from when I stream the input (from the same file) to the program.
To change the input type I simply change the argv argument from "/path/to/file.mp4" to something like "udp://localhost:1234", then I stream the file with ffmpeg in command line: "ffmpeg -re -i /path/to/file.mp4 -f mpegts udp://localhost:1234". Can it be because the "-f mpegts" arguments change some characteristics of the media?
Below is my code (simplified). By reading the ffmpeg mailing list archives I realized that the time_base that I'm looking for is in the AVStream and not the AVCodecContext. Instead of using av_frame_get_best_effort_timestamp I have also tried using the packet.pts but the results don't change.
I need the time stamps to have a notion of frame number in a streaming video that is being received.
I would really appreciate any sort of help.
//..
//argv[1]="/file.mp4";
argv[1]="udp://localhost:7777";
// define AVFormatContext, AVFrame, etc.
// register av, avcodec, avformat_network_init(), etc.
avformat_open_input(&pFormatCtx, argv[1], NULL, NULL);
avformat_find_stream_info(pFormatCtx, NULL);
// find the video stream...
// pointer to the codec context...
// open codec...
pFrame=av_frame_alloc();
while(av_read_frame(pFormatCtx, &packet)>=0) {
AVStream *strem = pFormatCtx->streams[videoStream];
if(packet.stream_index==videoStream) {
avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);
if(frameFinished) {
int64_t perts = av_frame_get_best_effort_timestamp(pFrame);
if (isMyFrame(pFrame)){
cout << perts*av_q2d(strem->time_base) << "\n";
}
}
}
//free allocated space
}
//..
Timestamps are stored at the container level, so changing the container can change the timestamps. TS stores a timestamp for every frame (based on a 90 kHz clock). MP4 stores only the frame durations, with an assumed start time of 0 (this gets more complicated with B-frames, since the first PTS is zero and the first DTS is < 0), so a frame's timestamp is obtained by adding up the durations of all the frames before it. MP4 also allows the clock rate to be set. It is often 30000 ticks per second with a frame duration of 1001 ticks for 29.97 FPS, but it can be set to anything. So av_frame_get_best_effort_timestamp returns ticks in the stream's time_base units. For TS, the stream time_base is always 1/90000.
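Since the goal is a frame number, one option is to work relative to the first timestamp seen, because a TS stream typically does not start at zero the way the MP4 does. Below is a minimal sketch of that idea, assuming a constant frame rate. `Rational`, `toSeconds` and `frameIndex` are hypothetical helper names used for illustration, not FFmpeg API; in a real program the time base would come from AVStream::time_base and the first PTS from the first decoded frame (or AVStream::start_time).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for AVRational (numerator / denominator).
struct Rational {
    int num;
    int den;
};

// Convert a tick count to seconds in the given time base.
double toSeconds(std::int64_t ticks, Rational timeBase) {
    assert(timeBase.den != 0);
    return static_cast<double>(ticks) * timeBase.num / timeBase.den;
}

// Estimate a zero-based frame index from a PTS, measured relative to the
// first PTS of the stream. Assumes a constant frame rate.
std::int64_t frameIndex(std::int64_t pts, std::int64_t firstPts,
                        Rational timeBase, Rational frameRate) {
    assert(frameRate.den != 0);
    const double seconds = toSeconds(pts - firstPts, timeBase);
    const double fps = static_cast<double>(frameRate.num) / frameRate.den;
    return std::llround(seconds * fps);
}
```

With this approach the same frame should map to the same index whether the ticks are in a 1/90000 time base with a non-zero start or in an MP4 time base starting at 0, as long as the frame rate really is constant.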
Related
I searched a lot but I can't find a decent library to solve my problem. I receive an H.264 stream from the network, and I want to decode it from memory and display it in real time. Something like the pseudocode below:
cv::Mat frame;
while (true)
{
if (newDataArrived)
{
void* h264Buffer; // address with current h264 stream
size_t h264BufferSize; // size of the stream
LibraryINeed::UpdateFrame(frame, h264Buffer, h264BufferSize); // !?
}
...
imshow("display", frame); // show most recent frame
}
Is there any library that provides this functionality?
I have to use the Opus codec to encode and decode audio data in C++, and I have to encapsulate the functions.
So I send an array of floats to the Opus encoding function and then decode the result. Unfortunately, the result is not the same: I get an array that contains none of the values from the initial array.
Here is my code.
Encapsulation:
std::vector<float> codec::OpusPlugin::decode(packet_t &packet) {
std::vector<float> out(BUFFER_SIZE * NB_CHANNELS);
int ret = 0;
if (!this->decoder)
throw Exception("Can't decode since there is no decoder.");
ret = opus_decode_float(this->decoder, packet.data.data(), packet.size, reinterpret_cast<float*>(out.data()), FRAME_SIZE, 0);
if (ret < 0)
throw Exception("Error while decoding compressed data.");
return out;
}
// ENCODER
packet_t codec::OpusPlugin::encode(std::vector<float> to_encode) {
std::vector<unsigned char> data(BUFFER_SIZE * NB_CHANNELS * 2);
packet_t packet;
int ret = 0;
if (!this->encoder)
throw Exception("Can't encode since there is no encoder.");
ret = opus_encode_float(this->encoder, reinterpret_cast<float const*>(to_encode.data()), FRAME_SIZE, data.data(), data.size());
if (ret < 0)
throw Exception("Error while encoding data.");
packet.size = ret;
packet.data = data;
return packet;
}
And there is the call of the functions:
packet_t packet;
std::vector<float> floats = {0.23, 0, -0.312, 0.401230, 0.1234, -0.1543};
packet = CodecPlugin->encode(floats);
std::cout << "packet size: " << packet.size << std::endl;
std::vector<float> output = CodecPlugin->decode(packet);
for (int i = 0; i < 10; i++) {
std::cout << output.data()[i] << " ";
}
Here is the packet_t structure, where I store the return value of encode and the unsigned char array (encoded value)
typedef struct packet_s {
int size;
std::vector<unsigned char> data;
} packet_t;
The output of the program is
-1.44487e-15 9.3872e-16 -1.42993e-14 7.31834e-15 -5.09662e-14 1.53629e-14 -8.36825e-14 3.9531e-14 -8.72754e-14 1.0791e-13, which is not the array I initialized at the beginning.
I have read the documentation and code examples many times, but I don't know where I made a mistake.
I hope you will be able to help me.
Thanks :)
We don't see how you initialize your encoder and decoder so we don't know what their sample rate, complexity or number of channels is. No matter how you have initialized them you are still going to have the following problems:
First, Opus encoding doesn't support arbitrary frame sizes, only frames of 2.5 ms, 5 ms, 10 ms, 20 ms, 40 ms or 60 ms (see RFC 6716, "Definition of the Opus Audio Codec", section 2.1.4). Moreover, Opus supports only 8 kHz, 12 kHz, 16 kHz, 24 kHz or 48 kHz sample rates. No matter which of those you have chosen, your array of 6 elements doesn't correspond to any of the supported frame sizes.
Second, Opus is a lossy audio codec. This means that after you encode a signal you will never (except perhaps in some edge cases) be able to reconstruct the original signal exactly by decoding the encoded Opus frame. Opus preserves the perceptual quality of the audio, not the individual sample values. Therefore, if you test it with arbitrary data, you might not get the expected results back even if you implemented the encoding and decoding functions correctly. The best way to test whether your encoder and decoder work is with a real audio sample.
What you can easily do is generate a 2000 Hz sine wave (there are multiple examples on the internet) lasting 20 ms. That means 160 array elements if you use a sample rate of 8 kHz. A 2 kHz sine wave is within the human hearing range, so the encoder is going to preserve it. Then decode it back and check whether the elements of the input and output arrays are similar, since we've already established that they are unlikely to be identical.
I am not good at C++, so I can't help you with code examples, but the problems above hold true no matter what language is used.
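For readers who want something concrete, the sine-wave test described above could be set up roughly as follows. This is only a sketch of the signal generation, not of the Opus calls; `samplesPerFrame` and `makeSineFrame` are hypothetical helper names, and the amplitude of 0.5 is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Number of samples per channel in one frame of the given duration.
std::size_t samplesPerFrame(int sampleRateHz, double frameMs) {
    assert(sampleRateHz > 0);
    return static_cast<std::size_t>(std::lround(sampleRateHz * frameMs / 1000.0));
}

// Generate one mono frame of a sine tone with amplitude 0.5.
std::vector<float> makeSineFrame(double toneHz, int sampleRateHz, double frameMs) {
    const double pi = std::acos(-1.0);
    std::vector<float> frame(samplesPerFrame(sampleRateHz, frameMs));
    for (std::size_t n = 0; n < frame.size(); ++n) {
        const double phase = 2.0 * pi * toneHz * static_cast<double>(n) / sampleRateHz;
        frame[n] = static_cast<float>(0.5 * std::sin(phase));
    }
    return frame;
}
```

Calling `makeSineFrame(2000.0, 8000, 20.0)` yields the 160-sample frame mentioned above. Assuming the encoder was created for 8000 Hz mono, the frame's data pointer and a frame size of 160 samples per channel would then be what gets passed to opus_encode_float, and the decoded output should be compared for similarity rather than equality.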
I'm working on a project that involves processing PCM audio data through an FFT as it's being played, preferably in sync. I'm using g++ on Linux and currently reading and playing audio data using OpenAL.
My question is this: is there a better way to process PCM audio data with an FFT live as the audio is playing than using threads? If not, what threading library would be best for this purpose?
This is my function that loads the wave data into an array of bytes; these can later be cast to ints for processing. All I use to play the data is OpenAL.
char* loadWAV(const char* fn, int& chan, int& samplerate, int& bps, int& size){
char buffer[4];
ifstream in(fn, ios::binary);
in.read(buffer, 4); //ChunkID "RIFF"
if(strncmp(buffer, "RIFF", 4) != 0){
cerr << "this is not a valid wave file";
return NULL;
}
in.read(buffer,4); //ChunkSize
in.read(buffer,4); //Format "WAVE"
in.read(buffer,4); // "fmt "
in.read(buffer,4); // 16
in.read(buffer,2); // 1
in.read(buffer,2); // NUMBER OF CHANNELS
chan = convertToInt(buffer,2);
in.read(buffer,4); // SAMPLE RATE
samplerate = convertToInt(buffer,4);
in.read(buffer,4); // ByteRate
in.read(buffer,2); // BlockAlign
in.read(buffer,2); // bits per sample
bps = convertToInt(buffer,2);
in.read(buffer,4); // "data"
in.read(buffer,4);
size = convertToInt(buffer,4);
char * data = new char[size];
in.read(data,size);
return data;
}
thank you for any and all help.
Edit: for anyone who might be interested, I wrote the function using this as a reference for how a WAV file is formatted.
Are you hoping to perform the FFT using OpenAL? I don't know if that's possible. Your code will likely be performing the FFT.
You don't need to explicitly set up any threads. However, your audio output library will probably do so on your behalf. I'm not familiar with OpenAL, but the way that a lot of audio libraries operate is by letting you specify a callback that will feed more audio into the output. Thus, your main program will load audio from the audio file, stuff it into a buffer (likely protected by a mutex) for the audio callback to read, compute an FFT over the audio window, and perhaps visualize the data for the user.
Again, the audio library will probably be managing the threading so you don't need to worry about the exact threading implementation or library. But be sure to manage shared data correctly with a mutex.
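As a rough illustration of the mutex-protected buffer described above, here is a minimal sketch using only the C++11 standard library. `SampleQueue` is a hypothetical class, not part of OpenAL: one side pushes samples as they are handed to the audio output, and the other side pulls fixed-size windows to run an FFT over.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical queue shared between the playback code and the analysis code.
class SampleQueue {
public:
    // Producer side: append samples that were just queued for playback.
    void push(const std::vector<short>& samples) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.insert(queue_.end(), samples.begin(), samples.end());
    }

    // Consumer side: remove one window of samples if enough are available.
    // Returns false (and leaves `window` untouched) otherwise.
    bool popWindow(std::size_t windowSize, std::vector<short>& window) {
        assert(windowSize > 0);
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.size() < windowSize) return false;
        const auto end = queue_.begin() + static_cast<std::ptrdiff_t>(windowSize);
        window.assign(queue_.begin(), end);
        queue_.erase(queue_.begin(), end);
        return true;
    }

private:
    std::mutex mutex_;         // guards queue_
    std::deque<short> queue_;  // pending 16-bit PCM samples
};
```

The lock is held only while copying samples, so the FFT itself runs outside the critical section and should not stall the audio side.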
I have a binary file of roughly 11.1 GB that stores a series of depth frames from the Kinect. There are 19437 frames in this file. To read one frame at a time, I use ifstream from <fstream>, but it reaches EOF before the real end of the file. (I only get the first 20 frames, and the function stops because of the eof flag.)
However, all frames can be read by using fread from stdio instead.
Can anyone explain this situation? Thank you for your precious time on my question.
Here are my two functions:
// ifstream.read() - Does Not Work: the loop will stop after 20th frame because of the eof flag
ifstream depthStream("fileName.dat");
if(depthStream.is_open())
{
while(!depthStream.eof())
{
char* buffer = new char[640*480*2];
depthStream.read(buffer, 640*480*2);
// Store the buffer data in OpenCV Mat
delete[] buffer;
}
}
// fread() - Work: Get 19437 frames successfully
FILE* depthStream;
depthStream = fopen("fileName.dat", "rb");
if(depthStream != NULL)
{
while(!feof(depthStream))
{
char* buffer = new char[640*480*2];
fread(buffer, 1, 640*480*2, depthStream);
// Store the buffer data in OpenCV Mat
delete[] buffer;
}
}
Again, thank you for your precious time on my question.
You need to open the stream in binary mode, otherwise it will stop at the first byte it sees with a value of 26.
ifstream depthStream("fileName.dat", ios_base::in | ios_base::binary);
As for why 26 is special, it's the code for Ctrl-Z which was a convention used to mark the end of a text file. The history behind this was recorded in Raymond Chen's blog.
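Putting the fix into a complete loop, a frame reader might look like the sketch below. `countFrames` is a hypothetical helper; it opens the file in binary mode and also tests the result of read() directly instead of looping on eof(), which avoids treating a short final read as a full frame.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Count the complete fixed-size frames in a file opened in binary mode.
std::size_t countFrames(const std::string& path, std::size_t frameBytes) {
    assert(frameBytes > 0);
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::vector<char> buffer(frameBytes);  // allocated once, reused per frame
    std::size_t frames = 0;
    // read() puts the stream into a failed state on a short read, so the
    // loop ends as soon as no complete frame is left (or the file is missing).
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()))) {
        ++frames;  // buffer now holds one complete frame to process
    }
    return frames;
}
```

For the file in the question, the call would be `countFrames("fileName.dat", 640*480*2)`, with the OpenCV conversion placed where the counter is incremented.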
Hey all, I'm writing an application which records microphone input to a WAV file. Previously, I had written this to fill a buffer of a specified size and that worked fine. Now, I'd like to be able to record to an arbitrary length. Here's what I'm trying to do:
Set up 32 small audio buffers (circular buffering)
Start a WAV file with ofstream -- write the header with PCM length set to 0
Add a buffer to input
When a buffer completes, append its data to the WAV file and update the header; recycle the buffer
When the user hits "stop", write the remaining buffers to file and close
It kind of works in that the files are coming out at the correct length (the header and file size are correct). However, the data is wonky as hell. I can make out a semblance of what I said, and the timing is correct, but there's this repetitive block of distortion. It basically sounds like only half the data is getting into the file.
Here are some variables the code uses (in header)
// File writing
ofstream mFile;
WAVFILEHEADER mFileHeader;
int16_t * mPcmBuffer;
int32_t mPcmBufferPosition;
int32_t mPcmBufferSize;
uint32_t mPcmTotalSize;
bool mRecording;
Here is the code that prepares the file:
// Start recording audio
void CaptureApp::startRecording()
{
// Set flag
mRecording = true;
// Set size values
mPcmBufferPosition = 0;
mPcmTotalSize = 0;
// Open file for streaming
mFile.open("c:\\my.wav", ios::binary|ios::trunc);
}
Here's the code that receives the buffer. This assumes the incoming data is correct -- it should be, but I haven't ruled out that it isn't.
// Append file buffer to output WAV
void CaptureApp::writeData()
{
// Update header with new PCM length
mPcmBufferPosition *= sizeof(int16_t);
mPcmTotalSize += mPcmBufferPosition;
mFileHeader.bytes = mPcmTotalSize + sizeof(WAVFILEHEADER);
mFileHeader.pcmbytes = mPcmTotalSize;
mFile.seekp(0);
mFile.write(reinterpret_cast<char *>(&mFileHeader), sizeof(mFileHeader));
// Append PCM data
if (mPcmBufferPosition > 0)
{
mFile.seekp(mPcmTotalSize - mPcmBufferPosition + sizeof(WAVFILEHEADER));
mFile.write(reinterpret_cast<char *>(&mPcmBuffer), mPcmBufferPosition);
}
// Reset file buffer position
mPcmBufferPosition = 0;
}
And this is the code that closes the file:
// Stop recording
void CaptureApp::stopRecording()
{
// Save remaining data
if (mPcmBufferSize > 0)
writeData();
// Close file
if (mFile.is_open())
{
mFile.flush();
mFile.close();
}
// Turn off recording flag
mRecording = false;
}
If there's anything here that looks like it would result in bad data getting appended to the file, please let me know. If not, I'll triple check the input data (in the callback). This data should be good, because it works if I copy it to a larger buffer (eg, two minutes) and then save that out.
I am just wondering, how
void CaptureApp::writeData()
{
mPcmBufferPosition *= sizeof(int16_t); // mPcmBufferPosition = 0, so 0*2 = 0;
// (...)
mPcmBufferPosition = 0;
}
works (btw. sizeof int16_t is always 2). Are you setting mPcmBufferPosition somewhere else?
void CaptureApp::writeData()
{
// Update header with new PCM length
long pos = mFile.tellp();
mPcmBufferBytesToWrite *= 2;
mPcmTotalSize += mPcmBufferBytesToWrite;
mFileHeader.bytes = mPcmTotalSize + sizeof(WAVFILEHEADER);
mFileHeader.pcmbytes = mPcmTotalSize;
mFile.seekp(0);
mFile.write(reinterpret_cast<char *>(&mFileHeader), sizeof(mFileHeader));
mFile.seekp(pos);
// Append PCM data
if (mPcmBufferBytesToWrite > 0)
mFile.write(reinterpret_cast<char *>(mPcmBuffer), mPcmBufferBytesToWrite);
}
Also, mPcmBuffer is a pointer, so I don't know why you use & in write.
The most likely reason is that you're writing from the address of the pointer to your buffer, not from the buffer itself. Ditch the "&" in the final mFile.write. (It may have some good data in it if your buffer is allocated nearby and you happen to grab a chunk of it, but it's just luck if your write happens to overlap your buffer.)
In general, if you find yourself in this sort of situation, you could try to think how you can test this code in isolation from the recording code: Set up a buffer that has the values 0..255 in it, and then set your "chunk size" to 16 and see if it writes out a continuous sequence of 0..255 across 16 separate write operations. That will quickly verify if your buffering code is working or not.
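The isolation test suggested above could be sketched like this. `writeInChunks` and `chunkedWriteRoundTrips` are hypothetical names, and an in-memory stream stands in for the WAV file so the test has no dependency on the recording code. Note that the buffer pointer itself is passed to write(), not its address.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Write `sampleCount` samples from `buffer` to `out`, `chunkSize` at a time.
void writeInChunks(std::ostream& out, const std::int16_t* buffer,
                   std::size_t sampleCount, std::size_t chunkSize) {
    assert(chunkSize > 0);
    for (std::size_t pos = 0; pos < sampleCount; pos += chunkSize) {
        const std::size_t n = std::min(chunkSize, sampleCount - pos);
        // buffer + pos points at the samples; &buffer would point at the pointer.
        out.write(reinterpret_cast<const char*>(buffer + pos),
                  static_cast<std::streamsize>(n * sizeof(std::int16_t)));
    }
}

// Returns true if the values 0..255, written in 16 chunks of 16 samples,
// come back as one continuous, unchanged sequence.
bool chunkedWriteRoundTrips() {
    std::vector<std::int16_t> samples(256);
    for (std::size_t i = 0; i < samples.size(); ++i) {
        samples[i] = static_cast<std::int16_t>(i);
    }
    std::ostringstream out;
    writeInChunks(out, samples.data(), samples.size(), 16);
    const std::string bytes = out.str();
    if (bytes.size() != samples.size() * sizeof(std::int16_t)) return false;
    return std::memcmp(bytes.data(), samples.data(), bytes.size()) == 0;
}
```

If the same check fails once the stream is swapped for the real file-writing path, the problem is in the buffering code rather than in the audio input.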
I won't debug your code, but I will try to give you a checklist of things you can check to determine where the error is:
Always have a reference recorder or player handy. It can be something as simple as Windows Sound Recorder, or Audacity, or Adobe Audition. Have a recorder/player that you are CERTAIN will record and play files correctly.
Record the file with your app and try to play it with the reference player. Working?
Try to record the file with the reference recorder, and play it with your player. Working?
When you write SOUND data to the WAV file in your recorder, also write it to one extra file. Open that file in RAW mode with the player (Windows Sound Recorder won't be enough here). Does it play correctly?
When playing the file in your player and writing to the sound card, also write the output to a RAW file, to see whether you are playing the data correctly at all or you have sound card issues. Does it play correctly?
Try all this, and you'll have much better idea of where something went wrong.
Shoot, sorry -- had a late night of work and am a bit off today. I forgot to show y'all the actual callback. This is it:
// Called when buffer is full
void CaptureApp::onData(float * data, int32_t & size)
{
// Check recording flag and buffer size
if (mRecording && size <= BUFFER_LENGTH)
{
// Save the PCM data to file and reset the array if we
// don't have room for this buffer
if (mPcmBufferPosition + size >= mPcmBufferSize)
writeData();
// Copy PCM data to file buffer
copy(mAudioInput.getData(), mAudioInput.getData() + size, mPcmBuffer + mPcmBufferPosition);
// Update PCM position
mPcmBufferPosition += size;
}
}
Will try y'alls advice and report.