Create a 44-byte header with ffmpeg - c++

I made a program using ffmpeg libraries that converts an audio file to a wav file. Except the only problem is that it doesn't create a 44-byte header. When input the file into Kaldi Speech Recognition, it produces the error:
ERROR (online2-wav-nnet2-latgen-faster:Read4ByteTag():wave-reader.cc:74) WaveData: expected 4-byte chunk-name, got read errror
I ran the file thru shntool and it reports a 78-byte header. Is there anyway I can get the standard 44-byte header using ffmpeg libraries?

FFmpeg inserts some metadata about the encoder into the header file. Here is the hexdump of the header before the fix:
00000000 52 49 46 46 06 90 00 00 57 41 56 45 66 6d 74 20 |RIFF....WAVEfmt |
00000010 10 00 00 00 01 00 01 00 40 1f 00 00 80 3e 00 00 |........#....>..|
00000020 02 00 10 00 4c 49 53 54 1a 00 00 00 49 4e 46 4f |....LIST....INFO|
00000030 49 53 46 54 0e 00 00 00 4c 61 76 66 35 36 2e 33 |ISFT....Lavf56.3|
00000040 36 2e 31 30 30 00 64 61 74 61 c0 8f 00 00 00 00 |6.100.data......|
as you can see Lavf56.36.100 is the encoder in the header. Here is the portion of code that I used to get rid of it.
std::cout<<"------------------BEFORE-----------------------"<<std::endl;
std::cout<< av_dict_count ( (*ofmt_ctx)->metadata) <<std::endl;
std::cout<<"-------------------------------------------"<<std::endl;
if(av_dict_set(&(*ofmt_ctx)->metadata,"ISFT",NULL, AV_DICT_IGNORE_SUFFIX)){
std::cerr<<"Nope it, didn't work :("<<std::endl;
}
ret = avformat_write_header(*ofmt_ctx,&(*ofmt_ctx)->metadata );
if (ret < 0) {
std::cout<<"-------------------------------------------"<<std::endl;
av_log(NULL, AV_LOG_ERROR, "Error occurred when writing header to file\n");
return ret;
}
std::cout<<"------------------AFTER-----------------------"<<std::endl;
std::cout<< av_dict_count ( (*ofmt_ctx)->metadata) <<std::endl;
std::cout<<"-------------------------------------------"<<std::endl;
Here is the hexdump afterwards:
00000000 52 49 46 46 e4 8f 00 00 57 41 56 45 66 6d 74 20 |RIFF....WAVEfmt |
00000010 10 00 00 00 01 00 01 00 40 1f 00 00 80 3e 00 00 |........#....>..|
00000020 02 00 10 00 64 61 74 61 c0 8f 00 00 00 00 00 00 |....data........|
00000030 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 00 00 |................|
shntool now report 44-bytes
(NOTE:ofmt_ctx was a ** in this function that I made, hence why referencing the metadata dictionary as &(*ofmt_ctx)->metadata)

Related

File readable but saved in binary mode

I wrote a very simple function that saves a file in binary mode with the support of qt. The file is saved correctly and the data inside is correct, however if I open the file with a text editor I can read strings that I shouldn't be reading.
void Game::saveRanking() const
{
QFile file(".ranking.dat");
file.open(QFile::WriteOnly);
QJsonObject recordObject;
QJsonArray rankingNameArray;
for (int i = 0; i < 5; i++)
rankingNameArray.push_back(QJsonValue::fromVariant(QVariant(highscoreName[i])));
recordObject.insert("Ranking Name", rankingNameArray);
QJsonArray rankingScoreArray;
for (int i = 0; i < 5; i++)
rankingScoreArray.push_back(QJsonValue::fromVariant(QVariant(highscoreValue[i])));
recordObject.insert("Ranking Value", rankingScoreArray);
QJsonDocument doc(recordObject);
file.write(doc.toBinaryData());
}
I've filled the arrays like this, for debugging purposes
highscoreName[0] = "Pippo"; highscoreValue[0] = 100;
highscoreName[1] = "Franco"; highscoreValue[1] = 300;
highscoreName[2] = "Giovanni"; highscoreValue[2] = 200;
highscoreName[3] = "Andrea"; highscoreValue[3] = 4000;
highscoreName[4] = "AI"; highscoreValue[4] = 132400;
I tried to do a hexdump and the result is the following
0000-0010: 71 62 6a 73-01 00 00 00-a4 00 00 00-05 00 00 00 qbjs.... ........
0000-0020: 9c 00 00 00-14 04 00 00-0c 00 52 61-6e 6b 69 6e ........ ..Rankin
0000-0030: 67 20 4e 61-6d 65 00 00-48 00 00 00-0a 00 00 00 g.Name.. H.......
0000-0040: 34 00 00 00-05 00 50 69-70 70 6f 00-06 00 46 72 4.....Pi ppo...Fr
0000-0050: 61 6e 63 6f-08 00 47 69-6f 76 61 6e-6e 69 00 00 anco..Gi ovanni..
0000-0060: 06 00 41 6e-64 72 65 61-02 00 41 49-8b 01 00 00 ..Andrea ..AI....
0000-0070: 8b 02 00 00-8b 03 00 00-0b 05 00 00-0b 06 00 00 ........ ........
0000-0080: 94 0f 00 00-0d 00 52 61-6e 6b 69 6e-67 20 56 61 ......Ra nking.Va
0000-0090: 6c 75 65 00-20 00 00 00-0a 00 00 00-0c 00 00 00 lue..... ........
0000-00a0: 8a 0c 00 00-8a 25 00 00-0a 19 00 00-0a f4 01 00 .....%.. ........
0000-00ac: 0a a6 40 00-0c 00 00 00-68 00 00 00 ..#..... h...
The toBinaryData() gives the internal representation data, not text, so it is normal you have binary data in the file.
QByteArray QJsonDocument::toBinaryData() const
From docs:
Returns a binary representation of the document.
Also it is deprecated:
This function is obsolete. It is provided to keep old source code working. We strongly advise against using it in new code.
You should use toJson().

Formatting a file using regex

Spent some time trying to format a file with roughly 5,000 hex values, however with no luck. For example
1b 00 10 50 a3 bb 0e b7 ff ff 00 00 00 00 09 00
01 02 00 01 00 85 03 0e 00 00 00 55 0e 04 66 03
2a 38 32 80 00 0e 00 2f c2
1b 00 10 50 a3 bb 0e b7 ff ff 00 00 00 00 09 00
01 02 00 01 00 85 03 2b 00 00 00 55 2b 04 58 28
2a 39 32 80 00 01 00 12 57 4d 32 34 30 20 41 43
20 56 65 72 2e 41 00 00 23 06 00 0a 23 06 00 0a
01 00 00 c0 14 56
1b 00 30 a6 59 b8 0e b7 ff ff 00 00 00 00 09 00
00 02 00 01 00 04 03 0d 00 00 00 55 0d 04 33 2a
03 3a 32 40 00 0e be 40
1b 00 f0 01 f1 b6 0e b7 ff ff 00 00 00 00 09 00
00 02 00 01 00 04 03 0e 00 00 00 55 0e 04 66 2a
00 3b 32 40 00 01 05 c9 b1
and so on..
I need to format in the following matter:
1b 00 10 50 a3 bb 0e b7 ff ff 00 00 00 00 09 00 01 02 00 01 00 85 03 0e 00 00 00 55 0e 04 66 03 2a 38 32 80 00 0e 00 2f c2
1b 00 10 50 a3 bb 0e b7 ff ff 00 00 00 00 09 00 01 02 00 01 00 85 03 2b 00 00 00 55 2b 04 58 28 2a 39 32 80 00 01 00 12 57 4d 32 34 30 20 41 43 20 56 65 72 2e 41 00 00 23 06 00 0a 23 06 00 0a 01 00 00 c0 14 56
1b 00 30 a6 59 b8 0e b7 ff ff 00 00 00 00 09 00 00 02 00 01 00 04 03 0d 00 00 00 55 0d 04 33 2a 03 3a 32 40 00 0e be 40
1b 00 f0 01 f1 b6 0e b7 ff ff 00 00 00 00 09 00 00 02 00 01 00 04 03 0e 00 00 00 55 0e 04 66 2a 00 3b 32 40 00 01 05 c9 b1
Basically put each hex block into one line still separated by space. For the life of me, i cannot figure out the regular expression to format this for me. I have tried different expressions but everything i tried either removes the line that separates hex blocks or grabs a last character of the line instead of the actual \n.
Maybe there is a better way of formatting files other than using regex
Please try regex: (?!\n\n)\n
Demo

(VS15 C++) Got a Visual Leak Detector report, but what now?

because of some (strange) problems in my C++-project I used Visual Leak Detector (for the first time), to check the project on memory leaks.
So I got i.a. the follwoing reports:
WARNING: Visual Leak Detector detected memory leaks!
---------- Block 4 at 0x004D07B0: 200 bytes ----------
Leak Hash: 0xD2D1B4A0, Count: 1, Total 200 bytes
Call Stack (TID 8796):
ucrtbase.dll!malloc()
f:\dd\vctools\crt\vcstartup\src\heap\new_scalar.cpp (19): LASS.exe!operator new() + 0x8 bytes
clr.dll!0x72D616E5()
Data:
28 75 14 03 00 00 00 00 01 00 00 00 00 00 00 00 (u...... ........
9A 99 99 99 99 99 B9 3F 50 00 00 00 0A 00 00 00 .......? P.......
00 00 00 00 F4 01 00 00 00 00 00 00 01 00 00 00 ........ ........
7B 14 AE 47 E1 7A 74 3F 14 00 00 00 BA FF FF FF {..G.zt? ........
00 00 00 00 F4 01 00 00 00 00 00 00 01 00 00 00 ........ ........
7B 14 AE 47 E1 7A 84 3F 00 00 00 00 64 00 00 00 {..G.z.? ....d...
00 00 00 00 01 00 00 00 14 00 00 00 46 00 00 00 ........ ....F...
00 00 00 00 64 00 00 00 00 00 00 00 F4 01 00 00 ....d... ........
01 00 00 00 B8 E2 13 03 F0 AD 18 03 00 00 00 00 ........ ........
C8 E2 13 03 C8 AB 18 03 00 00 00 00 78 E3 13 03 ........ ....x...
B8 AC 18 03 00 00 00 00 68 E2 13 03 E8 AC 18 03 ........ h.......
00 00 00 00 14 00 00 00 01 00 00 00 64 00 00 00 ........ ....d...
01 00 00 00 00 00 00 00 ........ ........
---------- Block 20 at 0x004D0880: 200 bytes ----------
Leak Hash: 0xD2D1B4A0, Count: 1, Total 200 bytes
Call Stack (TID 8796):
ucrtbase.dll!malloc()
f:\dd\vctools\crt\vcstartup\src\heap\new_scalar.cpp (19): LASS.exe!operator new() + 0x8 bytes
clr.dll!0x72D616E5()
Data:
78 74 14 03 00 00 00 00 01 00 00 00 00 00 00 00 xt...... ........
9A 99 99 99 99 99 B9 3F 50 00 00 00 0A 00 00 00 .......? P.......
00 00 00 00 F4 01 00 00 00 00 00 00 01 00 00 00 ........ ........
7B 14 AE 47 E1 7A 74 3F 14 00 00 00 BA FF FF FF {..G.zt? ........
00 00 00 00 F4 01 00 00 00 00 00 00 01 00 00 00 ........ ........
7B 14 AE 47 E1 7A 84 3F 00 00 00 00 64 00 00 00 {..G.z.? ....d...
00 00 00 00 01 00 00 00 14 00 00 00 46 00 00 00 ........ ....F...
00 00 00 00 64 00 00 00 00 00 00 00 F4 01 00 00 ....d... ........
01 00 00 00 38 E2 13 03 00 F0 15 03 00 00 00 00 ....8... ........
B8 E1 13 03 88 00 7F 05 00 00 00 00 08 E2 13 03 ........ ........
20 FF 7E 05 00 00 00 00 E8 E1 13 03 80 FF 7E 05 ..~..... ......~.
00 00 00 00 14 00 00 00 01 00 00 00 64 00 00 00 ........ ....d...
01 00 00 00 00 00 00 00 ........ ........
---------- Block 31 at 0x0053E1B8: 72 bytes ----------
Leak Hash: 0x3F88029B, Count: 1, Total 72 bytes
Call Stack (TID 8796):
ucrtbase.dll!malloc()
f:\dd\vctools\crt\vcstartup\src\heap\new_scalar.cpp (19): LASS.exe!operator new() + 0x8 bytes
clr.dll!0x72D616E5()
Data:
60 BC 55 00 40 3E 80 05 A0 3F 80 05 A0 3F 80 05 `.U.#>.. .?...?..
60 BB 55 00 20 34 18 03 00 00 00 00 00 00 00 00 `.U..4.. ........
00 00 00 00 20 00 00 00 2F 00 00 00 80 BC 55 00 ........ /.....U.
00 2E 18 03 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........
20 00 00 00 2F 00 00 00 ..../... ........
---------- Block 33 at 0x0055BB60: 8 bytes ----------
Leak Hash: 0xA49C5AA6, Count: 1, Total 8 bytes
Call Stack (TID 8796):
ucrtbase.dll!malloc()
f:\dd\vctools\crt\vcstartup\src\heap\new_scalar.cpp (19): LASS.exe!operator new() + 0x8 bytes
clr.dll!0x72D616E5()
Data:
C8 E1 53 00 00 00 00 00
..S..... ........
//And many more...
Unfortunatly I do not understand, what VLD wants to say is the problem.
With a double-click on the "f:\dd..."-lines it should set my courser to the line with the problem, shouldn´t it? But it dosen´t.
My question is now: How do I get to the area of the problem or in other words "how do I read these reports"?
In addition:
I use Visual Studio 2015
The project is a C++ Windows Forms Project
I included the vld.h in the additional includes and the lib-directory to the additional libraries of the project
In the main() I use #include <vld.h> and _CrtDumpMemoryLeaks();
EDIT:
My Main (a reduced version, but gives similar reports):
//some class-includes
#include <vld.h>
using namespace System;
using namespace System::Windows::Forms;
using namespace std;
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>
[STAThread]
void Main()
{
Application::EnableVisualStyles();
Application::SetCompatibleTextRenderingDefault(false);
Experiment* experiment = new Experiment();
Experiment_List* running_experiments = new Experiment_List();
while(!experiment->end) {
experiment= new Experiment();
LASS::MainWindow form(experiment, running_experiments);
form.ShowDialog();
if(!experiment->end){
running_experiments->register_experiment(experiment);
}
}
running_experimente->end_all();
_CrtDumpMemoryLeaks();
exit(0);
}
Unfortunatley there are about 40 classes, that I do not want to post...
I don't know where the problem exact is.
For me, it helps to run the program in RELEASE Mode, instead of DEBUG mode.
I suppose, my problem is the handling of managed and unmanaged code together.
I have unmanaged code inside managed code.
It seams as if CLR use a different new operator in Debug mode. Not as conform as the c++ standard.
According to: Using push_back() for STL List in C++ causes Access Violation, Crash
If you malloc() a C++ class, no constructors will be called for any of
that class's fields
And the VS will step into a constructor in class new_scalar.cpp.
Folks say that is depending of the Visual Leak Detector (VLD). You use them in your includes.
In the End, try to distinguish your code with
#pragma managed
and
#pragma unmanaged
And run in RELEASE mode.

Localizing function body chunk in .o file

i got some simple code file
mangen.c:
///////////// begin of the file
void mangen(int* data)
{
for(int j=0; j<100; j++)
for(int i=0; i<100; i++)
data[j*100+i] = 111;
}
//////// end of the file
I compile it with mingw (on win32)
c:\mingw\bin\gcc -std=c99 -c mangen.c -fno-exceptions -march=core2 -mtune=generic -mfpmath=both -msse2
it yeilds to mangen.o file which is 400 bytes
00000000 4C 01 03 00 00 00 00 00-D8 00 00 00 0A 00 00 00 L...............
00000010 00 00 05 01 2E 74 65 78-74 00 00 00 00 00 00 00 .....text.......
00000020 00 00 00 00 4C 00 00 00-8C 00 00 00 00 00 00 00 ....L...........
00000030 00 00 00 00 00 00 00 00-20 00 30 60 2E 64 61 74 ........ .0`.dat
00000040 61 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 a...............
00000050 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 ................
00000060 40 00 30 C0 2E 62 73 73-00 00 00 00 00 00 00 00 #.0..bss........
00000070 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 ................
00000080 00 00 00 00 00 00 00 00-80 00 30 C0 55 89 E5 83 ..........0.U...
00000090 EC 10 C7 45 FC 00 00 00-00 EB 34 C7 45 F8 00 00 ...E......4.E...
000000A0 00 00 EB 21 8B 45 FC 6B-D0 64 8B 45 F8 01 D0 8D ...!.E.k.d.E....
000000B0 14 85 00 00 00 00 8B 45-08 01 D0 C7 00 6F 00 00 .......E.....o..
000000C0 00 83 45 F8 01 83 7D F8-63 7E D9 83 45 FC 01 83 ..E...}.c~..E...
000000D0 7D FC 63 7E C6 C9 C3 90-2E 66 69 6C 65 00 00 00 }.c~.....file...
000000E0 00 00 00 00 FE FF 00 00-67 01 6D 61 6E 67 65 6E ........g.mangen
000000F0 2E 63 00 00 00 00 00 00-00 00 00 00 5F 6D 61 6E .c.........._man
00000100 67 65 6E 00 00 00 00 00-01 00 20 00 02 01 00 00 gen....... .....
00000110 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 ................
00000120 2E 74 65 78 74 00 00 00-00 00 00 00 01 00 00 00 .text...........
00000130 03 01 4B 00 00 00 00 00-00 00 00 00 00 00 00 00 ..K.............
00000140 00 00 00 00 2E 64 61 74-61 00 00 00 00 00 00 00 .....data.......
00000150 02 00 00 00 03 01 00 00-00 00 00 00 00 00 00 00 ................
00000160 00 00 00 00 00 00 00 00-2E 62 73 73 00 00 00 00 .........bss....
00000170 00 00 00 00 03 00 00 00-03 01 00 00 00 00 00 00 ................
00000180 00 00 00 00 00 00 00 00-00 00 00 00 04 00 00 00 ................
Now I need to know where is the binary chunk containing
above function body in here
Could someone provide some simple code that will allow me to retrive
this boundaries ?
(assume that function body may be shorter or longer and also
there may be other functions or data in source fite added so
it will move in chunk but I suspect procedure to localise it
should be not very complex.
You can use objdump -Fd mangen.o to find out file offset and lenght of a function.
Alternatively, you can use readelf -s mangen.o to find out size of a function.
You may define something like int abc = 0x11223344; in the beginning and end of function and use the constants to locate the function body.
You can use objdump or nm.
For instance, try:
nm mangen.o
Or
objdump -t mangen.o
If you need to use your own code, have a look here:
http://www.rohitab.com/discuss/topic/38591-c-import-table-parser/
It will give you something to start with. You can find much more information about the format in MSDN.
If you are into Python, there is nice tool/library (including source code) that can be helpful:
https://code.google.com/p/pefile/

how to read a file from hex editor and find equivalent ASCII values in c++ [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 9 years ago.
Improve this question
i have a file in hex looking like below
9C CB CB 8D 13 75 D2 11 91 58 00 C0 4F 79 56 A4
60 00 00 00 92 02 00 00 40 1F 00 00 80 14 00 00
A4 08 00 00 90 02 00 00 A0 09 01 00 40 AE 00 00
E4 27 00 00 90 02 00 00 A0 09 01 00 FC 7A 00 00
84 31 01 00 CF 01 00 00 A0 09 01 00 14 A7 00 00
24 3B 02 00 75 02 00 00 A0 09 01 00 50 8D 00 00
C4 44 03 00 14 02 00 00 A0 09 01 00 20 35 00 00
64 4E 04 00 C8 00 00 00 90 02 00 00 E8 03 00 00
00 00 00 00 00 08 00 00 CA 01 00 00 A4 00 00 00
first sixteen bytes (9C CB CB 8D 13 75 D2 11 91 58 00 C0 4F 79 56 A4) are showing header of my file.
This will convert the hexadecimal input to characters, and print them out (though the numbers 80 and greater won't actually be ASCII, since they're outside the range defined by ASCII):
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
int main() {
std::transform(std::istream_iterator<std::string>(std::cin),
std::istream_iterator<std::string>(),
std::ostream_iterator<char>(std::cout),
[](std::string const &in) {
return (char) strtol(in.c_str(), NULL, 16);
});
}
I'm not sure how much good that'll do for you, but it seems to be what you're asking, and without more information on what you really want, it's about as good as I think anybody can probably do.