I'm having trouble reading in a 16-bit .wav file. I have read in the header information; however, the conversion does not seem to work.
For example, in MATLAB, if I read in the wave file I get the following type of data:
-0.0064, -0.0047, -0.0051, -0.0036, -0.0046, -0.0059, -0.0051
However, in my C++ program the following is returned:
0.960938, -0.00390625, -0.949219, -0.00390625, -0.996094, -0.00390625
I need the data to be represented the same way. Now, for 8-bit .wav files I did the following:
uint8_t c;
for(unsigned i=0; (i < size); i++)
{
c = (unsigned)(unsigned char)(data[i]);
double t = (c-128)/128.0;
rawSignal.push_back(t);
}
This worked. However, the following 16-bit version:
uint16_t c;
for(unsigned i=0; (i < size); i++)
{
c = (signed)(signed char)(data[i]);
double t = (c-256)/256.0;
rawSignal.push_back(t);
}
does not work and produces the output shown above.
I'm following the standards found here.
Here data is a char array and rawSignal is a std::vector<double>. I'm probably just handling the conversion wrong but cannot seem to find out where. Does anyone have any suggestions?
Thanks
EDIT:
This is what it is now displaying (in a graph):
This is what it should be displaying:
There are a few problems here:
8-bit wavs are unsigned, but 16-bit wavs are signed. Therefore, the subtraction step given in the answers by Carl and Jay is unnecessary. I presume they just copied it from your code, but it is wrong here.
16-bit wavs have a range from -32,768 to 32,767, not from -256 to 255, which makes the scaling you are using incorrect anyway.
16-bit samples are 2 bytes wide, so you must read two bytes to make one sample, not one. You appear to be reading one character at a time. When you read the bytes, you may have to swap them if your machine's native byte order is not little-endian.
Assuming a little-endian architecture, your code would look more like this (very close to Carl's answer):
for (int i = 0; i < size; i += 2)
{
int c = (data[i + 1] << 8) | (data[i] & 0xFF); // mask so a negative low byte doesn't sign-extend into the high bits
double t = c/32768.0;
rawSignal.push_back(t);
}
For a big-endian architecture:
for (int i = 0; i < size; i += 2)
{
int c = (data[i] << 8) | (data[i + 1] & 0xFF); // mask the low byte to prevent sign extension
double t = c/32768.0;
rawSignal.push_back(t);
}
That code is untested, so please LMK if it doesn't work.
First of all, about little-endian/big-endian-ness: WAV is just a container format, and the data encoded in it can be in countless formats. Many of the codecs are lossy (MPEG Layer-3 aka MP3, which can indeed be "packaged" into a WAV, plus various CCITT and other codecs). You are assuming that you are dealing with some kind of PCM format, where you see the actual wave in raw form and no lossy transformation was done on it. The endianness depends on the codec that produced the stream.
See also: Is the endianness of format params guaranteed in RIFF WAV files?
It's also a question whether a PCM sample is a linearly scaled integer or whether some scaling (a log scale or another transformation) is behind it. The regular PCM wav files I have encountered were simple linear-scale samples, but I'm not working in the audio recording or producing industry.
So a path to your solution:
Make sure that you are dealing with a regular 16-bit PCM encoded RIFF WAV file.
While reading the stream, always read two bytes (chars) at a time and convert the two chars into a 16-bit short, as shown in the other answers.
The waveform you show clearly suggests that you either did not estimate the sampling rate well, or you have one mono channel instead of stereo. The sampling rate (44.1 kHz, 22 kHz, 11 kHz, 8 kHz, etc.) is just as important as the resolution (8-bit, 16-bit, 24-bit, etc.). Maybe in the first case you had stereo data; you can read it in as mono and not notice it. In the second case, if you have mono data, you'll run out of samples halfway into reading the data; that is what seems to happen according to your graphs. As for the other cause: lower sampling resolutions (and 16-bit is on the lower end) are often paired with lower sampling rates, so if you think you have 22 kHz data but it's actually 11 kHz, then again you'll run out of actual samples halfway through and read memory garbage. So it is either one of these.
Make sure that you interpret and treat your loop iterator variable and size correctly. It seems that size tells you how many bytes you have, so you'll have exactly half as many 16-bit samples. Notice that Bjorn's solution correctly increments i by 2 because of that.
My working code is
int8_t* buffer = new int8_t[size];
/*
HERE buffer IS FILLED
*/
for (int i = 0; i < size; i += 2)
{
int16_t c = ((unsigned char)buffer[i + 1] << 8) | (unsigned char)buffer[i]; // the unsigned char casts prevent sign extension of each byte
double t = c/32768.0;
rawSignal.push_back(t);
}
A 16-bit quantity gives you a range from -32,768 to 32,767, not from -256 to 255 (that's just 9 bits). Use:
for (int i = 0; i < size; i += 2)
{
c = (data[i + 1] << 8) + data[i]; // WAV files are little-endian
double t = (c - 32768)/32768.0;
rawSignal.push_back(t);
}
You might want something more like this:
uint16_t c;
for(unsigned i=0; (i < size); i++)
{
// get a 16 bit pointer to the array
uint16_t* p = (uint16_t*)data;
// get the i-th element
c = *( p + i );
// convert to signed? I'm guessing this is what you want
int16_t cs = (int16_t)c;
double t = (cs-256)/256.0;
rawSignal.push_back(t);
}
Your code converts the 8 bit value to a signed value then writes it into an unsigned variable. You should look at that and see if it's what you want.
Related
I am working on an undergrad project involving the Khepera IV mobile robot, and as I'm reading the files that came with it, I came across this line that confuses me:
for (i=0;i<5;i++) {
usvalues[i] = (short)(Buffer[i*2] | Buffer[i*2+1]<<8);
...
In the same file, usvalues is declared as usvalues[5], one element for each of the ultrasonic sensors on the robot, and Buffer[] is declared as Buffer[100], I assume for the sample data of the ultrasonic sensors. But I've never seen a variable set like this. Can someone help me understand it?
The code reads the Buffer[] array (which certainly has 8-bit elements) two successive bytes per iteration, in little-endian order (the lower-addressed byte is the least significant byte). It then forms a 16-bit value to save in usvalues[].
for (i=0;i<5;i++) {
usvalues[i] = (short)(Buffer[i*2] | Buffer[i*2+1]<<8);
The code should use uint8_t Buffer[100]; to prevent left-shifting a signed (possibly negative) value.
usvalues[] is better declared as some unsigned type like uint16_t or unsigned, so that the operations stay unsigned.
uint8_t Buffer[100];
uint16_t /* or unsigned */ usvalues[5 /* or more */];
for (i = 0; i < 5; i++) {
usvalues[i] = Buffer[i*2] | (unsigned)Buffer[i*2+1] << 8;
}
This question is about endianness.
The goal is to write 2 bytes to a file for a game. I want to make sure that people with different computers get the same result, whether their machines are little- or big-endian.
Which of these snippets do I use?
char a[2] = { 0x5c, 0x7B };
fout.write(a, 2);
or
int a = 0x7B5C;
fout.write((char*)&a, 2);
Thanks a bunch.
From wikipedia:
In its most common usage, endianness indicates the ordering of bytes within a multi-byte number.
So for char a[2] = { 0x5c, 0x7B };, a[1] will always be 0x7B.
However, for int a = 0x7B5C;, with char* oneByte = (char*)&a;, the value of oneByte[0] depends on the platform: 0x5C on a little-endian machine, but not on a big-endian one (for a 4-byte int it would be 0x00 there). As you can see, you have to play with casts and byte pointers (shown here only for explanation purposes).
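To make that concrete, here is a minimal sketch (my own illustration, not part of the original answer) that inspects the first byte of each representation; memcpy is used instead of pointer casts to sidestep any aliasing concerns:
#include <cstdio>
#include <cstring>

int main()
{
    char a[2] = { 0x5c, 0x7B };   // array order is fixed: a[1] is always 0x7B
    int  b    = 0x7B5C;           // byte order in memory depends on the machine

    unsigned char bytes[sizeof b];
    std::memcpy(bytes, &b, sizeof b);   // copy out the raw bytes

    std::printf("a[0]=%02x, first byte of b=%02x\n",
                (unsigned char)a[0], bytes[0]);
    // little-endian prints:           a[0]=5c, first byte of b=5c
    // big-endian (4-byte int) prints: a[0]=5c, first byte of b=00
}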
One way that is used quite often is to write a 'signature' or 'magic' number as the first data in the file - typically a 16-bit integer whose value, when read back, will depend on whether or not the reading platform has the same endianness as the writing platform. If you then detect a mismatch, all data (of more than one byte) read from the file will need to be byte swapped.
Here's some outline code:
void ByteSwap(void *buffer, size_t length)
{
unsigned char *p = static_cast<unsigned char *>(buffer);
for (size_t i = 0; i < length / 2; ++i) {
unsigned char tmp = *(p + i);
*(p + i) = *(p + length - i - 1);
*(p + length - i - 1) = tmp;
}
return;
}
bool WriteData(void *data, size_t size, size_t num, FILE *file)
{
uint16_t magic = 0xAB12; // Something that can be tested for byte-reversal
if (fwrite(&magic, sizeof(uint16_t), 1, file) != 1) return false;
if (fwrite(data, size, num, file) != num) return false;
return true;
}
bool ReadData(void *data, size_t size, size_t num, FILE *file)
{
uint16_t test_magic;
bool is_reversed;
if (fread(&test_magic, sizeof(uint16_t), 1, file) != 1) return false;
if (test_magic == 0xAB12) is_reversed = false;
else if (test_magic == 0x12AB) is_reversed = true;
else return false; // Error - needs handling!
if (fread(data, size, num, file) != num) return false;
if (is_reversed && (size > 1)) {
for (size_t i = 0; i < num; ++i) ByteSwap(static_cast<char *>(data) + (i*size), size);
}
return true;
}
Of course, in the real world, you wouldn't need to write/read the 'magic' number for every input/output operation - just once per file, and store the is_reversed flag for future use when reading data back.
Also, with proper use of C++ code, you would probably be using std::stream arguments, rather than the FILE* I have shown - but the sample I have posted has been extracted (with only very little modification) from code that I actually use in my projects (to do just this test). But conversion to better use of modern C++ should be straightforward.
Feel free to ask for further clarification and/or explanation.
NOTE: The ByteSwap function I have provided is not ideal! It almost certainly breaks strict aliasing rules and may well cause undefined behaviour on some platforms if used carelessly. Also, it is not the most efficient method for small data units (like int variables). One could (and should) provide one's own byte-reversal function(s) to handle specific types of variables; this is a good case for overloading the function with different argument types.
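As a sketch of that overloading idea (my own illustration, not from the original answer), one might wrap the raw routine in a type-safe template; copying through a byte buffer keeps it free of aliasing problems:
#include <cstddef>
#include <cstdint>
#include <cstring>

template <typename T>
T ByteSwapped(T value)
{
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));     // copy out the raw bytes
    for (std::size_t i = 0; i < sizeof(T) / 2; ++i) {
        unsigned char tmp = bytes[i];          // reverse in place
        bytes[i] = bytes[sizeof(T) - i - 1];
        bytes[sizeof(T) - i - 1] = tmp;
    }
    std::memcpy(&value, bytes, sizeof(T));     // copy the reversed bytes back
    return value;
}

// e.g. uint16_t m = ByteSwapped<uint16_t>(0xAB12); // yields 0x12AB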
Which of these snippets do I use?
The first one. It has same output regardless of native endianness.
But you'll find that if you need to interpret those bytes as some integer value, that is not so straightforward. char a[2] = { 0x5c, 0x7B } can represent either 0x5c7B (big endian) or 0x7B5c (little endian). So, which one did you intend?
The solution for cross platform interpretation of integers is to decide on particular byte order for the reading and writing. De-facto "standard" for cross platform data is to use big endian.
To write a number in big endian, start by bit-shifting the input value right so that the most significant byte is in the place of the least significant byte. Mask off all other bytes (technically redundant in the first iteration, but we'll loop back to this soon). Write this byte to the output. Repeat this for all other bytes in order of significance.
This algorithm produces same output regardless of the native endianness - it will even work on exotic "middle" endian systems if you ever encounter one. Writing to little endian is similar, but in reverse order.
To read a big endian value, read the first byte of input, shift it left so that it goes to the place of most significant byte. Combine the shifted byte with the result (initially zero) using bitwise-or. Repeat with the next byte by shifting to the second most significant place and so on.
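As a concrete sketch of that description (my own illustration; the function names are made up), for a 32-bit value:
#include <cstdint>
#include <cstdio>

// Write v in big-endian order, regardless of native endianness.
void write_u32_be(std::FILE *f, uint32_t v)
{
    unsigned char bytes[4];
    bytes[0] = (v >> 24) & 0xFF;   // most significant byte first
    bytes[1] = (v >> 16) & 0xFF;
    bytes[2] = (v >>  8) & 0xFF;
    bytes[3] =  v        & 0xFF;
    std::fwrite(bytes, 1, 4, f);
}

// Read it back: shift each byte into place and OR it into the result.
uint32_t read_u32_be(std::FILE *f)
{
    unsigned char bytes[4];
    if (std::fread(bytes, 1, 4, f) != 4) return 0;   // error handling elided
    return (uint32_t(bytes[0]) << 24) | (uint32_t(bytes[1]) << 16)
         | (uint32_t(bytes[2]) <<  8) |  uint32_t(bytes[3]);
}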
How do I know the endianness of a computer?
To know endianness of a system, you can use std::endian in the upcoming C++20. Prior to that, you can use implementation specific macros from endian.h header. Or you can do a simple calculation like you suggest.
But you never really need to know the endianness of a system. You can simply use the algorithms that I described, which work on systems of all endianness without having to know what that endianness is.
I want to write to a file a series of binary strings whose length is expressed in bits rather than bytes. Consider two strings s1 and s2 that in binary are respectively 011 and 01011. In this case the content of the output file has to be 01101011 (1 byte). I am trying to do this in the most efficient way possible, since I have several million strings to concatenate, for a total of several GB of output.
C++ has no way of working directly with bits because it aims at being a light layer above the hardware, and the hardware itself is not bit-oriented. The minimum amount of bits you can read/write in one operation is a byte (normally 8 bits).
Also, if you need to do disk I/O, it's better to write your data in blocks instead of one byte at a time. The library does some buffering, but the earlier things are buffered, the faster the code will be (less code is involved in passing the data around).
A simple approach could be
unsigned char iobuffer[4096];   // output block buffer
int bufsz;                      // how many bytes are present in the buffer
unsigned long long bit_accumulator;
int acc_bits;                   // how many bits are present in the accumulator

void writeCode(unsigned long long code, int bits) {
    // Append the new bits above the ones already pending
    // (requires acc_bits + bits <= 64).
    bit_accumulator |= code << acc_bits;
    acc_bits += bits;
    while (acc_bits >= 8) {
        iobuffer[bufsz++] = bit_accumulator & 255;   // emit the lowest byte
        bit_accumulator >>= 8;
        acc_bits -= 8;
        if (bufsz == sizeof(iobuffer)) {
            // Write the buffer to disk
            bufsz = 0;
        }
    }
}
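One thing the sketch above leaves open is the end of the stream: the accumulator may still hold up to 7 bits and the buffer may be partially full. A possible flush routine (my addition, reusing the same hypothetical globals and an assumed FILE handle, padding the final byte with zero bits):
void flushRemainder(FILE *f)
{
    if (acc_bits > 0) {                           // pad the last partial byte with zeros
        iobuffer[bufsz++] = bit_accumulator & 255;
        bit_accumulator = 0;
        acc_bits = 0;
    }
    if (bufsz > 0) {                              // write out whatever remains
        fwrite(iobuffer, 1, bufsz, f);
        bufsz = 0;
    }
}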
There is no optimal way to solve your problem per se, but you can use a few tricks to speed things up:
Experiment with the file I/O sync flag. It might be that set/unset is significantly faster that the other, because of buffering and caching.
Try to use architecture sized variables so that they fit into the registers directly: uint32_t for 32 bit machines and uint64_t for 64 bit machines ...
"Volatile" might help to, keep things in registers
Use pointer and references for large data and copy small data blobs (to avoid unnecessary copy of large data and much lookups and page touching for small data)
Use mmap of the file for direct access and align your output to the page size of your architecture and hard disk (usually 4 KiB = 4096 Bytes)
Try to reduce branching (instructions like "if", "for", "while", "() ? :") and linearize your code.
And if that is not enough and when the going gets rough: Use assembler (but I would not recommend that for beginners)
I think multithreading would be counterproductive in this case, because of the limited number of file writes that can be issued, and because the problem is not easily divisible into little tasks: each one would need to know how many bits after the other ones it has to start, and then you would have to join all the results together at the end.
I've used the following in the past, it might help a bit...
FileWriter.h:
#ifndef FILE_WRITER_H
#define FILE_WRITER_H
#include <stdio.h>
class FileWriter
{
public:
FileWriter(const char* pFileName);
virtual ~FileWriter();
void AddBit(int iBit);
private:
FILE* m_pFile;
unsigned char m_iBitSeq;
unsigned char m_iBitSeqLen;
};
#endif
FileWriter.cpp:
#include "FileWriter.h"
#include <limits.h>
FileWriter::FileWriter(const char* pFileName)
{
m_pFile = fopen(pFileName,"wb");
m_iBitSeq = 0;
m_iBitSeqLen = 0;
}
FileWriter::~FileWriter()
{
while (m_iBitSeqLen > 0)
AddBit(0);
fclose(m_pFile);
}
void FileWriter::AddBit(int iBit)
{
// Shift the pending bits down, then place the new bit in the top position;
// after CHAR_BIT calls, the first bit added ends up as the least significant bit.
m_iBitSeq >>= 1;
m_iBitSeq |= iBit << (CHAR_BIT - 1);
m_iBitSeqLen++;
if (m_iBitSeqLen == CHAR_BIT)
{
fwrite(&m_iBitSeq,1,1,m_pFile);
m_iBitSeqLen = 0;
}
}
You can further improve it by accumulating the data up to a certain amount before writing it into the file.
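For instance (my sketch, not part of the original answer), the two strings from the question could be fed through AddBit like this. Note that this writer packs the first bit added into the least significant bit of each byte; if you need the opposite bit order, adapt accordingly:
#include "FileWriter.h"
#include <string>

static void writeBits(FileWriter& w, const std::string& bits)
{
    for (char c : bits)
        w.AddBit(c - '0');        // '0' -> 0, '1' -> 1
}

int main()
{
    FileWriter writer("out.bin");
    writeBits(writer, "011");     // s1
    writeBits(writer, "01011");   // s2: 3 + 5 bits fill exactly one byte
}   // the destructor pads any final partial byte with zeros and closes the file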
I have a question regarding a sound synthesis app that I'm working on. I am trying to read in an audio file, create randomized 'grains' using granular synthesis techniques, place them into an output buffer and then be able to play that back to the user using OpenAL. For testing purposes, I am simply writing the output buffer to a file that I can then listen back to.
Judging by my results, I am on the right track but am getting some aliasing issues and playback sounds that just don't seem quite right. There is usually a rather loud pop in the middle of the output file and volume levels are VERY loud at times.
Here are the steps that I have taken to get the results I need, but I'm a little bit confused about a couple of things, namely formats that I am specifying for my AudioStreamBasicDescription.
Read in an audio file from my mainBundle, which is a mono file in .aiff format:
ExtAudioFileRef extAudioFile;
CheckError(ExtAudioFileOpenURL(loopFileURL,
&extAudioFile),
"couldn't open extaudiofile for reading");
memset(&player->dataFormat, 0, sizeof(player->dataFormat));
player->dataFormat.mFormatID = kAudioFormatLinearPCM;
player->dataFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
player->dataFormat.mSampleRate = S_RATE;
player->dataFormat.mChannelsPerFrame = 1;
player->dataFormat.mFramesPerPacket = 1;
player->dataFormat.mBitsPerChannel = 16;
player->dataFormat.mBytesPerFrame = 2;
player->dataFormat.mBytesPerPacket = 2;
// tell extaudiofile about our format
CheckError(ExtAudioFileSetProperty(extAudioFile,
kExtAudioFileProperty_ClientDataFormat,
sizeof(AudioStreamBasicDescription),
&player->dataFormat),
"couldnt set client format on extaudiofile");
SInt64 fileLengthFrames;
UInt32 propSize = sizeof(fileLengthFrames);
ExtAudioFileGetProperty(extAudioFile,
kExtAudioFileProperty_FileLengthFrames,
&propSize,
&fileLengthFrames);
player->bufferSizeBytes = fileLengthFrames * player->dataFormat.mBytesPerFrame;
Next I declare my AudioBufferList and set some more properties
AudioBufferList *buffers;
UInt32 ablSize = offsetof(AudioBufferList, mBuffers[0]) + (sizeof(AudioBuffer) * 1);
buffers = (AudioBufferList *)malloc(ablSize);
player->sampleBuffer = (SInt16 *)malloc(sizeof(SInt16) * player->bufferSizeBytes);
buffers->mNumberBuffers = 1;
buffers->mBuffers[0].mNumberChannels = 1;
buffers->mBuffers[0].mDataByteSize = player->bufferSizeBytes;
buffers->mBuffers[0].mData = player->sampleBuffer;
My understanding is that .mData will be whatever was specified in the format flags (in this case, type SInt16). Since it is of type (void *), I want to convert it to float data, which is desirable for audio manipulation. Previously I set up a for loop which just iterated through the buffer and cast each sample to a float. This seemed unnecessary, so now I pass my .mData buffer to a function I created, which then granularizes the audio:
float *theOutBuffer = [self granularizeWithData:(float *)buffers->mBuffers[0].mData with:framesRead];
In this function, I dynamically allocate some buffers, create random size grains, place them in my out buffer after windowing them using a hamming window and return that buffer (which is float data). Everything is cool up to this point.
Next I set up all my output file ASBD and such:
AudioStreamBasicDescription outputFileFormat;
bzero(&outputFileFormat, sizeof(AudioStreamBasicDescription));
outputFileFormat.mFormatID = kAudioFormatLinearPCM;
outputFileFormat.mSampleRate = 44100.0;
outputFileFormat.mChannelsPerFrame = numChannels;
outputFileFormat.mBytesPerPacket = 2 * numChannels;
outputFileFormat.mFramesPerPacket = 1;
outputFileFormat.mBytesPerFrame = 2 * numChannels;
outputFileFormat.mBitsPerChannel = 16;
outputFileFormat.mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked;
UInt32 flags = kAudioFileFlags_EraseFile;
ExtAudioFileRef outputAudioFileRef = NULL;
NSString *tmpDir = NSTemporaryDirectory();
NSString *outFilename = @"Decomp.caf";
NSString *outPath = [tmpDir stringByAppendingPathComponent:outFilename];
NSURL *outURL = [NSURL fileURLWithPath:outPath];
AudioBufferList *outBuff;
UInt32 abSize = offsetof(AudioBufferList, mBuffers[0]) + (sizeof(AudioBuffer) * 1);
outBuff = (AudioBufferList *)malloc(abSize);
outBuff->mNumberBuffers = 1;
outBuff->mBuffers[0].mNumberChannels = 1;
outBuff->mBuffers[0].mDataByteSize = abSize;
outBuff->mBuffers[0].mData = theOutBuffer;
CheckError(ExtAudioFileCreateWithURL((__bridge CFURLRef)outURL,
kAudioFileCAFType,
&outputFileFormat,
NULL,
flags,
&outputAudioFileRef),
"ErrorCreatingURL_For_EXTAUDIOFILE");
CheckError(ExtAudioFileSetProperty(outputAudioFileRef,
kExtAudioFileProperty_ClientDataFormat,
sizeof(outputFileFormat),
&outputFileFormat),
"ErrorSettingProperty_For_EXTAUDIOFILE");
CheckError(ExtAudioFileWrite(outputAudioFileRef,
framesRead,
outBuff),
"ErrorWritingFile");
The file is written correctly, in CAF format. My question is this: am I handling the .mData buffer correctly in that I am casting the samples to float data, manipulating (granulating) various window sizes, and then writing it to a file using ExtAudioFileWrite (in CAF format)? Is there a more elegant way to do this, such as declaring my ASBD formatFlags as kAudioFormatFlagIsFloat? My output CAF file has some clicks in it, and when I open it in Logic, it looks like there is a lot of aliasing. This makes sense if I am trying to send it float data, but there seems to be some kind of conversion happening which I am unaware of.
Thanks in advance for any advice on the matter! I have been an avid reader of pretty much all the source material online, including the Core Audio Book, various blogs, tutorials, etc. The ultimate goal of my app is to play the granularized audio in real time to a user with headphones so the writing to file thing is just being used for testing at the moment. Thanks!
What you say about step 3 suggests to me that you are interpreting an array of shorts as an array of floats. If that is so, we have found the reason for your trouble. Can you assign the short values one by one into an array of floats? That should fix it.
It looks like mData is a void * pointing to an array of shorts. Casting this pointer to a float * doesn't change the underlying data into float but your audio processing function will treat them as if they were. However, float and short values are stored in totally different ways, so the math you do in that function will operate on very different values that have nothing to do with your true input signal. To investigate this experimentally, try the following:
short data[4] = {-27158, 16825, 23024, 15};
void *pData = data;
The void pointer doesn't indicate what kind of data it points to, so one can falsely assume it points to float values. Note that a short is 2 bytes wide, but a float is 4 bytes wide; it is a coincidence that your code did not crash with an access violation. Interpreted as floats, the array above is only long enough for two values. Let's just look at the first value:
float *pfData = (float *)pData;
printf("%d == %f\n", data[0], pfData[0]);
The output of this will be -27158 == 23.198200, illustrating how instead of the expected -27158.0f you obtain roughly 23.2f. Two problematic things happened. First, sizeof(float) is not sizeof(short). Second, the "ones and zeros" of a floating-point number are stored very differently from those of an integer. See http://en.wikipedia.org/wiki/Single_precision_floating-point_format.
How to solve the problem? There are at least two simple solutions. First, you could convert each element of the array before you feed it into your audio processor:
int k;
float *pfBuf = (float *)malloc(n_data * sizeof(float));
short *psiBuf = (short *)buffers->mBuffers[0].mData;
for (k = 0; k < n_data; k++)
{
    pfBuf[k] = psiBuf[k];    // value conversion from short to float
}
[self granularizeWithData:pfBuf with:framesRead];
for (k = 0; k < n_data; k++)
{
    psiBuf[k] = pfBuf[k];    // convert back; the fractional part is truncated
}
free(pfBuf);
You see that you will most likely have to convert everything back to short after your call to granularizeWithData:with:. So the second solution would be to do all the processing in short, although from what you write, I imagine you would not like that approach.
Despite the fact that big-endian computers are not very widely used, I want to store the double datatype in an endianness-independent format.
For int, this is really simple, since bit shifts make it very convenient.
int number;
int size=sizeof(number);
char bytes[size];
for (int i=0; i<size; ++i)
bytes[size-1-i] = (number >> 8*i) & 0xFF;
This code snippet stores the number in big-endian format, regardless of the machine it is run on. What is the most elegant way to do this for double?
The best way, for portability while taking the format into account, is to serialize/deserialize the mantissa and the exponent separately. For that you can use the frexp()/ldexp() functions.
For example, to serialize:
int exp;
unsigned long long mant;
mant = (unsigned long long)(ULLONG_MAX * frexp(number, &exp));
// then serialize exp and mant.
And then to deserialize:
// deserialize to exp and mant.
double result = ldexp ((double)mant / ULLONG_MAX, exp);
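One caveat (my note, not part of the original answer): frexp() returns a negative fraction for negative inputs, and converting a negative double to unsigned long long is not well defined, so a fuller sketch might pull the sign out first:
#include <cmath>
#include <climits>

/* Sketch: serialize the sign, exponent and mantissa separately. */
void serialize_double(double number, int *neg, int *exp, unsigned long long *mant)
{
    *neg = (number < 0);
    *mant = (unsigned long long)(ULLONG_MAX * frexp(fabs(number), exp));
    /* then serialize *neg, *exp and *mant portably */
}

double deserialize_double(int neg, int exp, unsigned long long mant)
{
    double result = ldexp((double)mant / ULLONG_MAX, exp);
    return neg ? -result : result;
}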
The elegant thing to do is to limit the endianness problem to as small a scope as possible. That narrow scope is the I/O boundary between your program and the outside world. For example, the functions that send binary data to / receive binary data from some other application need to be aware of the endian problem, as do the functions that write binary data to / read binary data from some data file. Make those interfaces cognizant of the representation problem.
Make everything else blissfully ignorant of the problem. Use the local representation everywhere else. Represent a double precision floating point number as a double rather than an array of 8 bytes, represent a 32 bit integer as an int or int32_t rather than an array of 4 bytes, et cetera. Dealing with the endianness problem throughout your code is going to make your code bloated, error prone, and ugly.
The same way. Any numeric object, including double, is ultimately several bytes that are interpreted in a specific order according to endianness. So if you reverse the order of the bytes you'll get exactly the same value in the reversed endianness.
char *src_data;   // input: N doubles' worth of bytes
char *dst_data;   // destination buffer of the same size
for (i = 0; i < N * sizeof(double); i++) *dst_data++ = src_data[i ^ mask];
// where mask = 7 if the native order is little-endian,
//       mask = 0 if the native order is big-endian
The elegance lies in mask, which also handles short and integer types: it is sizeof(elem)-1 if the target and source endianness differ.
Not very portable and standards violating, but something like this:
#include <array>
#include <cstddef>
#include <cstdint>

std::array<unsigned char, 8> serialize_double( double const* d )
{
    std::array<unsigned char, 8> retval;
    char const* begin = reinterpret_cast<char const*>(d);
    union
    {
        uint8_t  i8s[8];
        uint16_t i16s[4];
        uint32_t i32s[2];
        uint64_t i64s;
    } u;
    u.i64s = 0x0001020304050607ull; // one byte order
    // u.i64s = 0x0706050403020100ull; // the other byte order
    for (std::size_t index = 0; index < 8; ++index)
    {
        retval[ u.i8s[index] ] = begin[index];
    }
    return retval;
}
This might handle a platform with 8-bit chars, 8-byte doubles, and any crazy byte ordering (e.g., big endian within words but little endian between words for 64-bit values).
Now, this doesn't cover the case where the endianness of doubles differs from that of 64-bit ints.
An easier approach might be to reinterpret the bits of your double as a 64-bit unsigned value (a plain value cast would round instead of preserving the bits), then output that as you would any other int.
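A minimal sketch of that approach (my illustration; it assumes sizeof(double) == 8, i.e. IEEE-754 binary64):
#include <cstdint>
#include <cstring>

void double_to_be_bytes(double d, unsigned char out[8])
{
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);        // bit-for-bit copy, no rounding
    for (int i = 0; i < 8; ++i)
        out[7 - i] = (bits >> (8 * i)) & 0xFF;  // most significant byte first
}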
void reverse_endian(double number, char (&bytes)[sizeof(double)])
{
const int size=sizeof(number);
memcpy(bytes, &number, size);
for (int i=0; i<size/2; ++i)
std::swap(bytes[i], bytes[size-i-1]);
}