C-Style unsigned char parsing and manipulation in C/C++ - segmentation fault - c++

Note that I'm using a C++ compiler ( hence, the cast on the calloc function calls) to do this, but the code is essentially C.
Basically, I have a typedef to an unsigned char known as viByte, which I'm using to create a string buffer to parse a file from binary (a TGA file, to be exact - but, that's irrelevant).
I'm writing basic functions for it right now; append, prepend, new, etc.
The problem is that, on the first iteration of the first loop in viByteBuf_Prepend, I get a segmentation fault. I need to know why, exactly, as this is something which could keep me up all night without some pointers (pun intended).
I would also like to know whether my algorithm for prepending the viByte string to the buffer is correct. For example, I have a feeling that I'm using memset too much, and I doubt that my printf format for the unsigned char is correct (nothing is getting output to my console).
Compiling on GCC, Linux.
Ze Code
#ifdef VI_BYTEBUF_DEBUG
void viByteBuf_TestPrepend( void )
{
viByteBuf* buf = viByteBuf_New( 4 );
buf->str = ( viByte* ) 0x1;
printf(" Before viByteBuf_Prepend => %uc ", buf->str);
viByteBuf_Prepend( buf, 3, ( viByte* ) 0x2 );
printf(" After viByteBuf_Prepend => %uc ", buf->str);
}
#endif
viByteBuf* viByteBuf_New( unsigned int len )
{
viByteBuf* buf = ( viByteBuf* ) calloc( sizeof( viByteBuf ), 1 );
const int buflen = len + 1;
buf->str = ( viByte* ) calloc( sizeof( viByte ), buflen );
buf->len = buflen;
buf->str[ buflen ] = '\0';
return buf;
}
void viByteBuf_Prepend( viByteBuf* buf, unsigned int len, viByte* str )
{
unsigned int pos, i;
const unsigned int totallen = buf->len + len;
viByteBuf* tmp = viByteBuf_New( totallen );
viByte* strpos = buf->str;
memset( tmp->str, 0, tmp->len );
int index;
for( i = 0; i < buf->len; ++i )
{
index = ( buf->len - i ) - 1;
*strpos = buf->str[ 0 ];
++strpos;
}
memset( buf->str, 0, buf->len );
printf( "%uc\n", buf->str );
i = totallen;
for ( pos = 0; pos < len; ++pos )
{
tmp->str[ pos ] = str[ pos ];
tmp->str[ i ] = buf->str[ i ];
--i;
}
memset( buf->str, 0, buf->len );
buf->len = tmp->len;
memcpy( buf->str, tmp->str, tmp->len );
viByteBuf_Free( tmp );
//memset( )
//realloc( ( viByteBuf* ) buf, sizeof( viByteBuf ) * tmp->len );
}
Many thank yous.
Update
Sorry, I should have explicitly posted the code where the segmentation fault lies. It is right here:
for( i = 0; i < buf->len; ++i )
{
index = ( buf->len - i ) - 1;
*strpos = buf->str[ 0 ]; //<--segmentation fault.
++strpos;
}

In your code you have buf->str[ buflen ] = '\0';, but you only allocate buflen elements, so the last valid index is buflen - 1. I think you meant buf->str[ len ] = '\0';.
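A minimal sketch of how the buffer could be written so that the out-of-bounds write goes away. The struct layout is an assumption (the question does not show viByteBuf), and unlike the original, len here holds the payload length while the allocation is len + 1 bytes. Note also that the test in the question assigns the literal address 0x1 to buf->str, which discards the block returned by calloc; the first write through strpos then goes to address 1, which would explain a crash on the first iteration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

typedef unsigned char viByte;

// Assumed layout: the original definition is not shown in the question.
struct viByteBuf {
    viByte*      str;  // len + 1 bytes; str[len] is always 0
    unsigned int len;  // number of payload bytes (terminator not counted)
};

viByteBuf* viByteBuf_New(unsigned int len)
{
    viByteBuf* buf = static_cast<viByteBuf*>(calloc(1, sizeof(viByteBuf)));
    if (!buf) return NULL;
    // calloc zero-fills, so the terminator at str[len] is already in place.
    buf->str = static_cast<viByte*>(calloc(len + 1, sizeof(viByte)));
    if (!buf->str) { free(buf); return NULL; }
    buf->len = len;
    return buf;
}

void viByteBuf_Free(viByteBuf* buf)
{
    if (buf) { free(buf->str); free(buf); }
}

// Puts `len` bytes from `str` in front of the existing contents.
// Returns 0 on success, -1 if the buffer could not be grown.
int viByteBuf_Prepend(viByteBuf* buf, unsigned int len, const viByte* str)
{
    assert(buf && str);
    viByte* grown = static_cast<viByte*>(realloc(buf->str, buf->len + len + 1));
    if (!grown) return -1;                      // buf->str is still valid here
    memmove(grown + len, grown, buf->len + 1);  // shift old bytes and terminator right
    memcpy(grown, str, len);                    // copy the new bytes into the gap
    buf->str = grown;
    buf->len += len;
    return 0;
}
```

With this shape no temporary buffer and no memset calls are needed. As for printing: "%uc" is the "%u" conversion followed by a literal c, so it does not print an unsigned char; looping over the bytes and printing each one with "%02x" is one option.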

Related

Strange behavior with char pointer and char pointer returned by function in C/C++ with "cout"

I'm seeing strange behavior when I print, with cout, a char pointer initialized from a function's return value.
All my code is for an Arduino application, which is why I use char pointers, char arrays and string.h.
I created a class named FrameManager, with a function getDataFromFrame to extract data from a string (in fact a char array). See below:
`char * FrameManager::getDataFromFrame ( const char frame[], char key[] )
{
char *pValue = nullptr;
int frameLength = strlen ( frame );
int previousStartIndex = 0;
for ( int i=0; i<frameLength; i++ ) {
char c = frame[i];
if ( c == ',' ) {
int buffSize = i-previousStartIndex+1;
char subbuff[buffSize];
memset ( subbuff, 0, buffSize ); //clear buffer
memcpy ( subbuff, &frame[previousStartIndex], i-previousStartIndex );
subbuff[buffSize]='\0';
previousStartIndex = i+1;
int buffLength = strlen ( subbuff );
const char *ptr = strchr ( subbuff, ':' );
if ( ptr ) {
int index = ptr-subbuff;
char buffKey[index+1];
memset ( buffKey, 0, index+1 );
memcpy ( buffKey, &subbuff[0], index );
buffKey[index+1]='\0';
char buffValue[buffLength-index];
memset ( buffValue, 0, buffLength-index );
memcpy ( buffValue, &subbuff[index+1], buffLength-index );
buffValue[buffLength-index]='\0';
if ( strcmp ( key,buffKey ) == 0 ) {
pValue = &buffValue[0];
break;
}
}
} else if ( i+1 == frameLength ) {
int buffSize = i-previousStartIndex+1;
char subbuff[buffSize];
memcpy ( subbuff, &frame[previousStartIndex], frameLength-1 );
subbuff[buffSize]='\0';
int buffLength = strlen ( subbuff );
const char *ptr = strchr ( subbuff, ':' );
if ( ptr ) {
int index = ptr-subbuff;
char buffKey[index+1];
memset ( buffKey, 0, index+1 );
memcpy ( buffKey, &subbuff[0], index );
buffKey[index+1]='\0';
char buffValue[buffLength-index];
memset ( buffValue, 0, buffLength-index );
memcpy ( buffValue, &subbuff[index+1], buffLength-index );
buffValue[buffLength-index]='\0';
if ( strcmp ( key,buffKey ) == 0 ) {
pValue = &buffValue[0];
break;
}
}
}
}
return pValue;
}`
In main(), I wrote just a little code to test the returned value:
int main(int argc, char **argv) {
const char frame[] = "DEVICE:ARM,FUNC:MOVE_F,PARAM:12,SERVO_S:1";
FrameManager frameManager;
char key[] = "DEVICE";
char *value;
value = frameManager.getDataFromFrame(frame, &key[0]);
cout << "Retrieved value: " << value << endl;
cout << "Retrieved value: " << frameManager.getDataFromFrame(frame, &key[0]) << endl;
printf("%s",value);
return 0;
}
and here is the result:
Retrieved value: y%R
Retrieved value: ARM
ARM
The first "cout" doesn't display the expected value.
The second "cout" displays the expected value, and so does the printf.
I don't understand what the problem is with the first "cout".
Thanks
Jocelyn
pValue points into local arrays (buffValue), which go out of scope before the function returns. Using that pointer afterwards is undefined behavior. It might work, but your program might also crash, return wrong values (that's what you are experiencing), corrupt your data or do any other arbitrary thing.
Given that you're already using C++, consider returning a std::string instead, or point into the original frame (if possible).
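As a rough illustration of the std::string route, here is a sketch written as a free function rather than as a FrameManager member. The signature and the empty-string result for a missing key are assumptions, and it only applies on boards whose toolchain ships std::string.

```cpp
#include <cassert>
#include <string>

// Returns the value stored under `key` in a frame of the form
// "K1:V1,K2:V2,...", or an empty string if the key is not present.
// The result is a std::string that owns its characters, so nothing
// points into a local buffer once the function has returned.
std::string getDataFromFrame(const std::string& frame, const std::string& key)
{
    std::string::size_type start = 0;
    while (start <= frame.size()) {
        // Find the end of the current "KEY:VALUE" field.
        std::string::size_type end = frame.find(',', start);
        if (end == std::string::npos) end = frame.size();

        const std::string field = frame.substr(start, end - start);
        const std::string::size_type colon = field.find(':');

        // Compare the part before ':' with the requested key.
        if (colon != std::string::npos && field.compare(0, colon, key) == 0)
            return field.substr(colon + 1);

        start = end + 1;  // skip the comma
    }
    return std::string();
}
```

A caller would then write something like std::string value = getDataFromFrame(frame, "DEVICE"); and keep using value for as long as needed.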

C++ program opens file correctly on Linux but not on Windows

I compiled a Linux program on Windows via Mingw but the output is wrong.
Error description:
The output of the program looks different on Windows than on Linux. This is how it looks on Windows:
>tig_2
CAATCTTCAGAGTCCAGAGTGGGAGGCACAGACTACAGAAAATGAGCAGCGGGGCTGGTA
>cluster_1001_conTTGGTGAAGAGAATTTGGACATGGATGAAGGCTTGGGCTTGACCATGCGAAGG
Expected output:
>cluster_1001_contig2
CAATCTTCAGAGTCCAGAGTGGGAGGCACAGACTACAGAAAATGAGCAGCGGGGCTGGTA
>cluster_1001_contig1
TTGGTGAAGAGAATTTGGACATGGATGAAGGCTTGGGCTTGACCATGCGAAGG
(Note: the output is too large to paste here, so the examples above are pseudo-real.)
Possible cause:
I have observed that if I convert the line endings of the input file from Linux (LF) to Windows (CRLF), it almost works: only the first character (>) in the file is missing. The same code works perfectly on Linux without any line-ending conversion. So the problem must be in the function that parses the input, not in the one that writes the output:
seq_db.Read( db_in.c_str(), options );
Source code:
This is the piece that parses the input file. I might be wrong, though; the fault might be somewhere else. In case it is needed, the FULL source code is here :)
void SequenceDB::Read( const char *file, const Options & options )
{
Sequence one;
Sequence dummy;
Sequence des;
Sequence *last = NULL;
FILE *swap = NULL;
FILE *fin = fopen( file, "r" );
char *buffer = NULL;
char *res = NULL;
size_t swap_size = 0;
int option_l = options.min_length;
if( fin == NULL ) bomb_error( "Failed to open the database file" );
if( options.store_disk ) swap = OpenTempFile( temp_dir );
Clear();
dummy.swap = swap;
buffer = new char[ MAX_LINE_SIZE+1 ];
while (not feof( fin ) || one.size) { /* do not break when the last sequence is not handled */
buffer[0] = '>';
if ( (res=fgets( buffer, MAX_LINE_SIZE, fin )) == NULL && one.size == 0) break;
if( buffer[0] == '+' ){
int len = strlen( buffer );
int len2 = len;
while( len2 && buffer[len2-1] != '\n' ){
if ( (res=fgets( buffer, MAX_LINE_SIZE, fin )) == NULL ) break;
len2 = strlen( buffer );
len += len2;
}
one.des_length2 = len;
dummy.des_length2 = len;
fseek( fin, one.size, SEEK_CUR );
}else if (buffer[0] == '>' || buffer[0] == '#' || (res==NULL && one.size)) {
if ( one.size ) { // write previous record
one.dat_length = dummy.dat_length = one.size;
if( one.identifier == NULL || one.Format() ){
printf( "Warning: from file \"%s\",\n", file );
printf( "Discarding invalid sequence or sequence without identifier and description!\n\n" );
if( one.identifier ) printf( "%s\n", one.identifier );
printf( "%s\n", one.data );
one.size = 0;
}
one.index = dummy.index = sequences.size();
if( one.size > option_l ) {
if ( swap ) {
swap_size += one.size;
// so that size of file < MAX_BIN_SWAP about 2GB
if ( swap_size >= MAX_BIN_SWAP) {
dummy.swap = swap = OpenTempFile( temp_dir );
swap_size = one.size;
}
dummy.size = one.size;
dummy.offset = ftell( swap );
dummy.des_length = one.des_length;
sequences.Append( new Sequence( dummy ) );
one.ConvertBases();
fwrite( one.data, 1, one.size, swap );
}else{
//printf( "==================\n" );
sequences.Append( new Sequence( one ) );
//printf( "------------------\n" );
//if( sequences.size() > 10 ) break;
}
//if( sequences.size() >= 10000 ) break;
}
}
one.size = 0;
one.des_length2 = 0;
int len = strlen( buffer );
int len2 = len;
des.size = 0;
des += buffer;
while( len2 && buffer[len2-1] != '\n' ){
if ( (res=fgets( buffer, MAX_LINE_SIZE, fin )) == NULL ) break;
des += buffer;
len2 = strlen( buffer );
len += len2;
}
size_t offset = ftell( fin );
one.des_begin = dummy.des_begin = offset - len;
one.des_length = dummy.des_length = len;
int i = 0;
if( des.data[i] == '>' || des.data[i] == '#' || des.data[i] == '+' ) i += 1;
if( des.data[i] == ' ' or des.data[i] == '\t' ) i += 1;
if( options.des_len and options.des_len < des.size ) des.size = options.des_len;
while( i < des.size and ( des.data[i] != '\n') ) i += 1;
des.data[i] = 0;
one.identifier = dummy.identifier = des.data;
} else {
one += buffer;
}
}
#if 0
int i, n = 0;
for(i=0; i<sequences.size(); i++) n += sequences[i].bufsize + 4;
cout<<n<<"\t"<<sequences.capacity() * sizeof(Sequence)<<endl;
int i;
scanf( "%i", & i );
#endif
one.identifier = dummy.identifier = NULL;
delete[] buffer;
fclose( fin );
}
The format of the input file is like this:
> comment
ACGTACGTACGTACGTACGTACGTACGTACGT
> comment
ACGTACGTACGTACGTACGTACGTACGTACGT
> comment
ACGTACGTACGTACGTACGTACGTACGTACGT
etc
Most likely you need to open the file with the "rb" mode in the call to fopen. "rb" opens the file in binary mode, as opposed to "r", which opens it in "text" mode.
Since you're going back and forth between Linux and Windows, the end-of-line characters will be different. If you open the file as "text" on Windows, but the file was written with Linux line endings, you're lying to Windows that it is a Windows text file, so the runtime's CR/LF handling (and offsets from ftell/fseek that depend on it) will come out wrong.
Therefore you should open the file as binary, "rb", so that no CR/LF translation is done.
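Once the file is opened in binary mode, a CRLF input will leave a '\r' at the end of each line, so the parser may want to strip it. A small sketch of that idea follows; chomp and count_headers are hypothetical helpers, not part of the program in the question, and the fixed buffer assumes each line fits in it.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Removes any trailing '\n' and '\r' characters in place and
// returns the new length, so LF and CRLF files look the same.
size_t chomp(char* line)
{
    size_t n = strlen(line);
    while (n > 0 && (line[n - 1] == '\n' || line[n - 1] == '\r'))
        line[--n] = '\0';
    return n;
}

// Counts the lines that start with '>' (the record headers).
// Returns -1 if the file cannot be opened.
long count_headers(const char* path)
{
    FILE* fin = fopen(path, "rb");  // binary mode: no CR/LF translation
    if (fin == NULL) return -1;

    char buffer[4096];              // assumption: lines are shorter than this
    long headers = 0;
    while (fgets(buffer, sizeof buffer, fin) != NULL) {
        chomp(buffer);
        if (buffer[0] == '>') ++headers;
    }
    fclose(fin);
    return headers;
}
```

In binary mode the values returned by ftell are plain byte offsets on both platforms, which matters for code that later seeks back to the description lines.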

Why is ZLIB uncompress hanging?

I'm compiling ZLIB directly into the executable, which is being built with VS2010 (64-bit), and this simple program always hangs. Any ideas? More info below:
Call Stack:
zlibtest.exe!inflate(z_stream_s * strm, int flush) Line 607
zlibtest.exe!uncompress(unsigned char * dest, unsigned long * destLen, const unsigned char * source, unsigned long sourceLen) Line 44
zlibtest.exe!main(int argc, char * * argv) Line 137
It hangs here:
for (;;)
switch (state->mode) {
case HEAD:
if (state->wrap == 0) {
state->mode = TYPEDO;
break;
}
NEEDBITS(16); // <-- hangs here
Program:
#include "zlib.h"
int main( int argc, char * argv[] )
{
char * buffer = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
ULONG buffer_size = strlen( buffer ) + 1;
ULONG compressed_size = compressBound( buffer_size );
BYTE * compressed_buffer = new BYTE[ compressed_size ];
if ( Z_OK != compress2( compressed_buffer, &compressed_size, ( Bytef * ) buffer, buffer_size, Z_DEFAULT_COMPRESSION ) )
printf( "Failed to compress." );
ULONG uncompressed_size = buffer_size;
BYTE * uncompressed_buffer = new BYTE[ uncompressed_size ];
if ( Z_OK != uncompress( uncompressed_buffer, &uncompressed_size, compressed_buffer, compressed_size ) )
printf( "Failed to uncompress" );
printf( "Press <ENTER> to exit..." );
std::cin.ignore();
}

libvorbis audio decode from memory in C++

Given an encoded buffer in C++, what would be the steps using oggvorbis structs to decode the already in-memory data?
OggVorbis_File cannot be used, because assets are within compressed archives.
I'm trying to research the necessary structs and methods, but I'm fairly new to audio encoding and decoding.
Any resources that can help further my reading are appreciated as well!
I should clarify, I intend to use the decoded data to stream into OpenAL.
Thanks.
Answering my own question.
This can be done by providing custom callbacks to vorbis.
struct ogg_file
{
char* curPtr;
char* filePtr;
size_t fileSize;
};
size_t AR_readOgg(void* dst, size_t size1, size_t size2, void* fh)
{
ogg_file* of = reinterpret_cast<ogg_file*>(fh);
size_t len = size1 * size2;
if ( of->curPtr + len > of->filePtr + of->fileSize )
{
len = of->filePtr + of->fileSize - of->curPtr;
}
memcpy( dst, of->curPtr, len );
of->curPtr += len;
return len;
}
int AR_seekOgg( void *fh, ogg_int64_t to, int type ) {
ogg_file* of = reinterpret_cast<ogg_file*>(fh);
switch( type ) {
case SEEK_CUR:
of->curPtr += to;
break;
case SEEK_END:
of->curPtr = of->filePtr + of->fileSize + to; // as with fseek, the offset is added (negative values move backwards)
break;
case SEEK_SET:
of->curPtr = of->filePtr + to;
break;
default:
return -1;
}
if ( of->curPtr < of->filePtr ) {
of->curPtr = of->filePtr;
return -1;
}
if ( of->curPtr > of->filePtr + of->fileSize ) {
of->curPtr = of->filePtr + of->fileSize;
return -1;
}
return 0;
}
int AR_closeOgg(void* fh)
{
return 0;
}
long AR_tellOgg( void *fh )
{
ogg_file* of = reinterpret_cast<ogg_file*>(fh);
return (of->curPtr - of->filePtr);
}
Usage
ov_callbacks callbacks;
ogg_file t;
t.curPtr = t.filePtr = compressedData;
t.fileSize = compressedDataSize;
OggVorbis_File* ov = new OggVorbis_File;
mOggFile = ov;
memset( ov, 0, sizeof( OggVorbis_File ) );
callbacks.read_func = AR_readOgg;
callbacks.seek_func = AR_seekOgg;
callbacks.close_func = AR_closeOgg;
callbacks.tell_func = AR_tellOgg;
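/* note: t must stay alive for as long as ov is in use, because ov keeps the pointer to it */
int ret = ov_open_callbacks((void *)&t, ov, NULL, -1, callbacks);
assert(ret == 0); /* ov_open_callbacks returns 0 on success and a negative value on failure */
vorbis_info* vi = ov_info(ov, -1);
assert(vi);
/* the stream is now open; to decode it to PCM, look into ov_read */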
Special thanks to the Doom 3 GPL source, which helped with most of this.
You can also avoid reinventing the wheel and use fmemopen, like this:
FILE* memfile = fmemopen(data, len, "r");
Here data is a pointer to the beginning of the memory block and len is the length of your data. Then pass memfile to ov_open like a regular FILE object.
However, there is a downside: fmemopen is a POSIX function (POSIX.1-2008), not part of standard C, so you don't have it on every system, notably not in the Windows C runtime (it can also be found in Arduino, so I'm a bit confused about its status). There are some implementations for other platforms, though (check libconfuse for Windows, or for Apple OSes).
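To show that the stream returned by fmemopen behaves like an ordinary FILE, here is a small sketch that reads a memory block back through fread. read_via_stream is a hypothetical helper used only for illustration, and the code assumes a POSIX system where fmemopen is declared in stdio.h.

```cpp
#include <cassert>
#include <stdio.h>   // fmemopen is declared here on POSIX systems
#include <string.h>

// Opens `len` bytes at `data` as a read-only stream, copies up to
// `cap` bytes into `out` through fread, and returns the number of
// bytes actually read (0 if the stream could not be opened).
size_t read_via_stream(const void* data, size_t len, char* out, size_t cap)
{
    // fmemopen takes a non-const pointer; in "r" mode the buffer is only read.
    FILE* memfile = fmemopen(const_cast<void*>(data), len, "r");
    if (memfile == NULL) return 0;

    size_t got = fread(out, 1, cap, memfile);
    fclose(memfile);  // does not free `data`; the caller still owns it
    return got;
}
```

Any library function that accepts a FILE pointer can be handed such a stream in the same way; the memory block has to stay valid until the stream is closed.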

How to pass pointer and pointer to a function?

I'm implementing a function that acts like getline( .. ). My initial approach is:
#include <cstdio>
#include <cstdlib>
#include <cstring>
void getstr( char*& str, unsigned len ) {
char c;
size_t i = 0;
while( true ) {
c = getchar(); // get a character from keyboard
if( '\n' == c || EOF == c ) { // if encountering 'enter' or 'eof'
*( str + i ) = '\0'; // put the null terminate
break; // end while
}
*( str + i ) = c;
if( i == len - 1 ) { // buffer full
len = len + len; // double the len
str = ( char* )realloc( str, len ); // reallocate memory
}
++i;
}
}
int main() {
const unsigned DEFAULT_SIZE = 4;
char* str = ( char* )malloc( DEFAULT_SIZE * sizeof( char ) );
getstr( str, DEFAULT_SIZE );
printf( str );
free( str );
return 0;
}
Then, I think I should switch to pure C instead of using half C/C++. So I changed char*& to char**:
Pointer to Pointer version ( crashed )
#include <cstdio>
#include <cstdlib>
#include <cstring>
void getstr( char** str, unsigned len ) {
char c;
size_t i = 0;
while( true ) {
c = getchar(); // get a character from keyboard
if( '\n' == c || EOF == c ) { // if encountering 'enter' or 'eof'
*( *str + i ) = '\0'; // put the null terminate
break; // done input end while
}
*( *str + i ) = c;
if( i == len - 1 ) { // buffer full
len = len + len; // double the len
*str = ( char* )realloc( str, len ); // reallocate memory
}
++i;
}
}
int main() {
const unsigned DEFAULT_SIZE = 4;
char* str = ( char* )malloc( DEFAULT_SIZE * sizeof( char ) );
getstr( &str, DEFAULT_SIZE );
printf( str );
free( str );
return 0;
}
But this version crashes ( access violation ). I tried running the debugger, but I could not find where it crashed. I'm running Visual Studio 2010, so could you guys show me how to fix it?
Another weird thing I've encountered is that, if I leave the "&" out, it only works with Visual Studio, but not with g++. That is:
void getstr( char* str, unsigned len )
From my understanding, whenever we use a pointer to allocate or deallocate a block of memory, we actually modify where that pointer is pointing. So I think we have to use either ** or *& to modify the pointer. However, since it runs correctly in Visual Studio, is that just luck, or should it be OK either way?
Then, I think I should switch to pure C instead of using half C/C++.
I suggest the other direction. Go full-blown C++.
Your pointer crash is probably in the realloc
*str = ( char* )realloc( str, len )
Should be
*str = ( char* )realloc( *str, len )
As Steve points out, your code leaks the original block if realloc fails, so maybe change it to something like:
char* tmp = ( char* )realloc( *str, len );
if ( tmp ) {
*str = tmp;
} else {
// realloc failed; *str still points to the original block
}
Well, running it in a debugger highlights this line
*str = ( char* )realloc( str, len ); // reallocate memory
where there is a mismatch between str - the pointer to the variable - and *str - the pointer to the memory.
I'd be tempted to rewrite it so it returns the string, or zero on error, rather than having a void return and an in/out parameter ( like fgets does, which seems to be the function you're sort-of copying the behaviour of ). Or wrap such a function. That style doesn't let you get confused as you're only ever dealing with a pointer to char, rather than a pointer to pointer to char.
char* getstr_impl ( char* str, unsigned len ) {...}
void getstr( char** str, unsigned len ) {
*str = getstr_impl ( *str, len );
}
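A sketch of the "return the string" style suggested above. read_line and its FILE parameter are assumptions made for illustration (the original reads from the keyboard with getchar); passing stdin gives the same behaviour.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Reads one line from `in`, without the trailing newline.
// Returns a malloc'd string that the caller must free, or NULL on
// allocation failure or when end of file is hit before any character.
char* read_line(FILE* in)
{
    size_t cap = 16;  // current allocation size
    size_t len = 0;   // characters stored so far
    char* buf = static_cast<char*>(malloc(cap));
    if (buf == NULL) return NULL;

    int c;  // int, not char, so that EOF can be told apart from real data
    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (len + 1 == cap) {  // keep one slot free for the terminator
            char* tmp = static_cast<char*>(realloc(buf, cap * 2));
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = static_cast<char>(c);
    }

    if (c == EOF && len == 0) { free(buf); return NULL; }
    buf[len] = '\0';
    return buf;
}
```

Because the function only ever deals with a plain char pointer internally and hands the result back through the return value, the str versus *str mix-up cannot occur. When printing the result, printf( "%s", line ) is safer than printf( line ), since the input may contain % characters.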