Good evening, I am attempting to read some binary information from a .img file. I can retrieve 16-bit numbers (uint16_t) with ntohs(), but when I try to retrieve from the same position using ntohl(), it gives me 0 instead.
Here are the critical pieces of my program.
#include <iostream>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <arpa/inet.h>
#include <cmath>
int fd;
struct blockInfo {
long blockSize = 0;
long blockCount = 0;
long fatStart = 0;
long fatBlocks = 0;
long rootStart = 0;
long rootBlocks = 0;
long freeBlocks = 0;
long resBlocks = 0;
long alloBlocks = 0;
};
int main(int argc, char *argv[]) {
fd = open(argv[1], O_RDWR);
// Get file size
struct stat buf{};
stat(argv[1], &buf);
size_t size = buf.st_size;
// A struct to hold data retrieved from a big endian image.
blockInfo info;
auto mapPointer = (char*) mmap(nullptr, size,
(PROT_READ | PROT_WRITE), MAP_PRIVATE, fd, 0);
info.blockSize = ntohs((uint16_t) mapPointer[12]);
long anotherBlockSize = ntohl((uint32_t) mapPointer[11]);
printf("%ld", info.blockSize); // == 512, correct
printf("%ld", anotherBlockSize); // == 0, what?
}
I understand that blockSize and anotherBlockSize are not supposed to be equal, but anotherBlockSize should be non-zero at the least, right?
Something else: I go to access data at ntohs(pointer[16]), which should return 2, but it also returns 0. What is going on here? Any help would be appreciated.
No, anotherBlockSize will not necessarily be non-zero.
info.blockSize = ntohs((uint16_t) mapPointer[12]);
This code reads a char from offset 12 relative to mapPointer, casts it to uint16_t, and applies ntohs() to it.
long anotherBlockSize = ntohl((uint32_t) mapPointer[11]);
This code reads a char from offset 11 relative to mapPointer, casts it to uint32_t, and applies ntohl() to it.
Obviously, you are reading non-overlapping data (different chars) from the mapped memory, so you should not expect blockSize and anotherBlockSize to be connected.
If you are trying to read the same memory in different ways (as uint32_t and uint16_t), you must do some pointer casting:
info.blockSize = ntohs( *((uint16_t*)&mapPointer[12]));
Note that such code is generally platform dependent: a cast that works perfectly on x86 may fail on ARM because of alignment restrictions on wide loads.
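If alignment is a concern, a memcpy into a properly typed variable avoids the problem entirely. A minimal sketch (the helper name read_u16_be is mine, not part of the original code):
#include <arpa/inet.h>
#include <cstdint>
#include <cstring>
// Alignment-safe alternative: copy the two bytes into a uint16_t,
// then convert from big-endian (network order) to host order.
uint16_t read_u16_be(const char* base, std::size_t offset) {
    uint16_t raw;
    std::memcpy(&raw, base + offset, sizeof raw); // safe at any alignment
    return ntohs(raw);
}
// e.g. info.blockSize = read_u16_be(mapPointer, 12);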
auto mapPointer = (char*) ...
This declares mapPointer to be a char *.
... ntohl((uint32_t) mapPointer[11]);
Your obvious intent here is to use mapPointer to retrieve a 32 bit value, a four-byte value, from this location.
Unfortunately, because mapPointer is a plain, garden-variety char *, the expression mapPointer[11] evaluates to a single, lonely char value. One byte. That's what the code reads from the mmaped memory block, at the 11th offset from the start of the block. The (uint32_t) cast does not read a uint32_t from the address referenced by mapPointer+11. Because mapPointer is a pointer to char, mapPointer[11] reads a single char value from mapPointer+11; the cast then converts that one byte to a uint32_t and feeds it to ntohl().
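To actually pull a 32-bit value out of the buffer, all four bytes have to be read. A minimal sketch, using memcpy to stay alignment-safe (the helper name read_u32_be is my own):
#include <arpa/inet.h>
#include <cstdint>
#include <cstring>
// Read four bytes at the given offset and convert big-endian to host order.
uint32_t read_u32_be(const char* base, std::size_t offset) {
    uint32_t raw;
    std::memcpy(&raw, base + offset, sizeof raw);
    return ntohl(raw);
}
// e.g. long anotherBlockSize = read_u32_be(mapPointer, 11);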
Related
I have a long array of char (coming from a raster file via GDAL), all composed of 0 and 1. To compact the data, I want to convert it to an array of bits (thus dividing the size by 8), 4 bytes at a time, writing the result to a different file. This is what I have come up with so far:
uint32_t bytes2bits(char b[33]) {
b[32] = 0;
return strtoul(b,0,2);
}
const char data[36] = "00000000000000000000000010000000101"; // 101 is to be ignored
char word[33];
strncpy(word,data,32);
uint32_t byte = bytes2bits(word);
printf("Data: %d\n",byte); // 128
The code is working, and the result is going to be written in a separate file. What I'd like to know is: can I do that without copying the characters to a new array?
EDIT: I'm using a const variable here just to make a minimal, reproducible example. In my program it's a char *, which is continually changing value inside a loop.
Yes, you can, as long as you can modify the source string (in your example code you can't because it is a constant, but I assume in reality you have the string in writable memory):
uint32_t bytes2bits(const char* b) {
return strtoul(b,0,2);
}
void compress (char* data) {
// You would need to make sure that the `data` argument always has
// at least 33 characters in length (the null terminator at the end
// of the original string counts)
char temp = data[32];
data[32] = 0;
uint32_t byte = bytes2bits(data);
data[32] = temp;
printf("Data: %d\n",byte); // 128
}
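A hypothetical call with the questioner's sample data (now in writable memory):
char data[] = "00000000000000000000000010000000101";
compress(data); // prints "Data: 128"; the trailing "101" is ignored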
In this example, with a char* buffer holding that long data, there is no need to copy each part into a temporary buffer to convert it to a long.
Just use a variable to step through the buffer in 32-byte chunks; after each chunk, a temporary 0 terminator byte is needed.
So your code would look like:
uint32_t bytes2bits(const char* b) {
return strtoul(b,0,2);
}
void compress (char* data) {
int dataLen = strlen(data);
int periodLen = 32;
char* periodStr;
char tmp;
int periodPos = periodLen;
uint32_t byte;
periodStr = &data[0];
while(periodPos < dataLen)
{
tmp = data[periodPos];
data[periodPos] = 0;
byte = bytes2bits(periodStr);
printf("Data: %d\n",byte); // 128
data[periodPos] = tmp;
periodStr = &data[periodPos];
periodPos += periodLen;
}
if(periodPos - periodLen <= dataLen)
{
byte = bytes2bits(periodStr);
printf("Data: %d\n",byte); // 128
}
}
Note that the last chunk may be shorter than 32 bytes.
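A hypothetical call with two 32-bit chunks back to back (string literal concatenation keeps the example readable):
char data[] = "00000000000000000000000010000000"
              "00000000000000000000000010000001";
compress(data); // prints "Data: 128" then "Data: 129"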
const char data[36]
You are in violation of your contract with the compiler if you declare something as const and then modify it.
Generally speaking, the compiler won't let you modify it... so to even try to do so with a const declaration you'd have to cast the const away (but don't):
char *sneaky_ptr = (char*)data;
sneaky_ptr[0] = 'U'; /* the U is for "undefined behavior" */
See: Can we change the value of an object defined with const through pointers?
So if you wanted to do this, you'd have to be sure the data was legitimately non-const.
The right way to do this in modern C++ is by using std::string to hold your string and std::string_view to process parts of that string without copying it.
You can use string_view with the char array you have, though. It is commonly used to modernize code built around the classic null-terminated const char* string.
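If copying is the only concern, std::bitset can also parse the chunks in place once the data lives in a std::string. A minimal sketch, assuming the data is ASCII '0'/'1' as in the question (the constructor throws std::invalid_argument on any other character):
#include <bitset>
#include <cstdint>
#include <cstdio>
#include <string>
int main() {
    std::string data = "00000000000000000000000010000000101"; // "101" ignored
    for (std::size_t pos = 0; pos + 32 <= data.size(); pos += 32) {
        // The (string, pos, count) constructor reads exactly 32 characters
        // starting at pos, so no temporary copy or terminator is needed.
        uint32_t word =
            static_cast<uint32_t>(std::bitset<32>(data, pos, 32).to_ulong());
        std::printf("Data: %u\n", word); // 128
    }
}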
What is the most suitable type of vector to keep the bytes of a file?
I'm considering using the int type, because the bits "00000000" (1 byte) are interpreted as 0!
The goal is to save this data (bytes) to a file and retrieve from this file later.
NOTE: The files contain null bytes ("00000000" in bits)!
I'm a bit lost here. Help me! =D Thanks!
UPDATE I:
To read the file I'm using this function:
char* readFileBytes(const char *name){
std::ifstream fl(name);
fl.seekg( 0, std::ios::end );
size_t len = fl.tellg();
char *ret = new char[len];
fl.seekg(0, std::ios::beg);
fl.read(ret, len);
fl.close();
return ret;
}
NOTE I: I need to find a way to ensure that bits "00000000" can be recovered from the file!
NOTE II: Any suggestions for a safe way to save those bits "00000000" to a file?
NOTE III: When using char array I had problems converting bits "00000000" for that type.
Code Snippet:
int bit8Array[] = {0, 0, 0, 0, 0, 0, 0, 0};
char charByte = (bit8Array[7] ) |
(bit8Array[6] << 1) |
(bit8Array[5] << 2) |
(bit8Array[4] << 3) |
(bit8Array[3] << 4) |
(bit8Array[2] << 5) |
(bit8Array[1] << 6) |
(bit8Array[0] << 7);
UPDATE II:
Following #chqrlie's recommendations.
#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>
#include <algorithm>
#include <random>
#include <cstring>
#include <iterator>
std::vector<unsigned char> readFileBytes(const char* filename)
{
// Open the file.
std::ifstream file(filename, std::ios::binary);
// Stop eating new lines in binary mode!
file.unsetf(std::ios::skipws);
// Get its size
std::streampos fileSize;
file.seekg(0, std::ios::end);
fileSize = file.tellg();
file.seekg(0, std::ios::beg);
// Reserve capacity.
std::vector<unsigned char> unsignedCharVec;
unsignedCharVec.reserve(fileSize);
// Read the data.
unsignedCharVec.insert(unsignedCharVec.begin(),
std::istream_iterator<unsigned char>(file),
std::istream_iterator<unsigned char>());
return unsignedCharVec;
}
int main(){
std::vector<unsigned char> unsignedCharVec;
// txt file contents "xz"
unsignedCharVec=readFileBytes("xz.txt");
// Letters -> UTF8/HEX -> bits!
// x -> 78 -> 0111 1000
// z -> 7a -> 0111 1010
for(unsigned char c : unsignedCharVec){
printf("%c\n", c);
for(int o=7; o >= 0; o--){
printf("%i", ((c >> o) & 1));
}
printf("%s", "\n");
}
// Prints...
// x
// 01111000
// z
// 01111010
return 0;
}
UPDATE III:
This is the code I am using to write to a binary file:
void writeFileBytes(const char* filename, std::vector<unsigned char>& fileBytes){
std::ofstream file(filename, std::ios::out|std::ios::binary);
file.write(fileBytes.size() ? (char*)&fileBytes[0] : 0,
std::streamsize(fileBytes.size()));
}
writeFileBytes("xz.bin", fileBytesOutput);
UPDATE IV:
Further reading about UPDATE III:
c++ - Save the contents of a "std::vector<unsigned char>" to a file
CONCLUSION:
Definitely the solution to the problem of the "00000000" bits (1 byte) was to change the type that stores the bytes of the file to std::vector<unsigned char>, following the guidance of friends. std::vector<unsigned char> is a universal type (it exists in all environments) and will accept any octet (unlike the char* in "UPDATE I")!
In addition, changing from a char array to a vector of unsigned char was crucial for success! With the vector I manipulate my data more securely and completely independently of its content (with the char array I had problems with this).
Thanks a lot!
Use std::vector<unsigned char>. Don't use std::uint8_t: it won't exist on systems that don't have a native hardware type of exactly 8 bits. unsigned char will always exist; it will usually be the smallest addressable type that the hardware supports, and it's required to be at least 8 bits wide, so if you're trafficking in 8-bit bytes, it will handle the bits that you need.
If you really, really, really like the fixed-width types, you might consider std::uint_least8_t, which will always exist and has at least eight bits, or std::uint_fast8_t, which also has at least eight bits. But file I/O traffics in char types, and mixing char and its variants with the vaguely specified "least" and "fast" types may well get confusing.
There are 3 problems in your code:
You use the char type and return a char *. Yet the return value is not a proper C string as you do not allocate an extra byte for the '\0' terminator nor null terminate it.
If the file may contain null bytes, you should probably use type unsigned char or uint8_t to make it explicit that the array does not contain text.
You do not return the array size to the caller. The caller has no way to tell how long the array is. You should probably use a std::vector<uint8_t> or std::vector<unsigned char> instead of an array allocated with new.
uint8_t is the winner in my eyes:
it's exactly 8 bits, or 1 byte, long;
it's unsigned without requiring you to type unsigned every time;
it's exactly the same on all platforms;
it's a generic type that does not imply any specific use, unlike char / unsigned char, which is associated with characters of text even if it can technically be used for any purpose just the same as uint8_t.
Bottom line: uint8_t is functionally equivalent to unsigned char, but does a better job of saying this is some data of unspecified nature in the source code.
So use std::vector<uint8_t>.
#include <stdint.h> to make the uint8_t definition available.
P. S. As pointed out in the comments, the C++ standard defines char as 1 byte, and byte is not, strictly speaking, required to be the same as octet (8 bits). On such a hypothetical system, char will still exist and will be 1 byte long, but uint8_t is defined as 8 bits (octet) and thus may not exist (due to implementation difficulties / overhead). So char is more portable, theoretically speaking, but uint8_t is more strict and has wider guarantees of expected behavior.
I'm trying to pull out values from a uint8_t array.
But I'm having troubles understanding how these are represented in the memory.
#include <cstdio>
#include <cstring>
#include <stdint.h>
int main(){
uint8_t tmp1[2];
uint16_t tmp2 = 511;//0x01FF: high byte 0x01, low byte 0xFF
tmp1[0] = 255;//0xFF
tmp1[1] = 1;//0x01
fprintf(stderr,"memcmp = %d\n",memcmp(tmp1,&tmp2,2));
fprintf(stderr,"first elem in uint8 array = %u\n",(uint8_t) *(tmp1+0));
fprintf(stderr,"first elem in uint8 array = %u\n",(uint8_t) *(tmp1+1));
fprintf(stderr,"2xuint8_t as uint16_t = %u\n",(uint16_t) *tmp1);
return 0;
}
So I have a 2-element array of datatype uint8_t. And I have a single uint16_t variable.
So when I take the value 511 on my little endian machine, I would assume this is laid out in memory as
0000 0001 1111 1111
But when I use memcmp it looks like it is actually being represented as
1111 1111 0000 0001
So little endianness is only used "within" each byte?
And since the single bit that is set in tmp1[1] counts as 256 even though it is further "right" in my stream, are the values for each byte (not bit) therefore big-endian? I'm a bit confused about this.
Also, if I want to coerce fprintf to print out my 2xuint8_t as a single uint16_t, how do I do this? The code below doesn't work; it only prints out the first byte.
fprintf(stderr,"2x uint8_t as uint16_t = %u\n",(uint16_t) *tmp1);
Thanks in advance
Your assumption of what to expect is backwards. Your observation is consistent with little-endian representation. To answer your last question, it would look like this:
fprintf(stderr,"2x uint8_t as uint16_t = %u\n",*(uint16_t*)tmp1);
Don't think of endianness as "within bytes". Think of it as "byte ordering". (That is, the actual bit ordering never matters because humans typically read values in big-endian.) If it helps to imagine that the bits are reversed on a little-endian machine, you can imagine it that way. (in that case, your example would have looked like 1111 1111 1000 0000, but as I said, humans don't typically read numbers such that the most significant values are to the right...but you might want to imagine that's how the computer sees things, if it helps you understand little-endian.)
On a little endian machine, 0xAABBCCDD would be seen as 0xDD 0xCC 0xBB 0xAA in memory, just as you are seeing. On a big-endian machine (such as a PPC box) you'd see the same ordering in-memory as you see when you write out the 32-bit word.
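If the pointer cast above feels risky (it can run into alignment and strict-aliasing rules), a memcpy into the target type is the well-defined alternative; a minimal sketch:
#include <cstdint>
#include <cstdio>
#include <cstring>
int main() {
    uint8_t tmp1[2] = {255, 1}; // 0xFF, 0x01
    // memcpy sidesteps the aliasing/alignment questions raised by the
    // (uint16_t*) cast; compilers optimize it down to a plain load.
    uint16_t value;
    std::memcpy(&value, tmp1, sizeof value);
    std::printf("%u\n", value); // 511 on a little-endian machine
}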
First, if you want to be 100% sure that your variables are stored in the right order in memory, you should put them in a struct.
Then note that memcmp() treats the input you give it as a sequence of bytes, since it makes no assumptions regarding the nature of the data. Think, for example, of the following code:
#include <stdio.h>
#include <string.h> /* for memcmp */
#include <stdint.h>
int main(int argc, char** argv) {
int32_t a, b;
a = 1;
b = -1;
printf( "%i\n", memcmp( &a, &b, sizeof( int32_t ) ) );
}
It outputs -254 on my little-endian machine regardless of the fact that a > b. This is because memcmp has no idea what the memory actually holds, so it compares the values as arrays of uint8_t.
If you actually want to visualize how the data is represented on your machine, you may first use fwrite to write a struct into a file and then open it with your favorite hex editor (in my experience, wxHexEditor is great at telling you how the data looks when interpreted as an X-bit Y-endian integer). Here's the source:
#include <stdio.h>
#include <stdint.h>
typedef struct {
uint8_t tmp1[2];
uint16_t tmp2;
} mytmp;
int main(int argc, char** argv) {
mytmp tmp;
tmp.tmp1[0] = 255;
tmp.tmp1[1] = 1;
tmp.tmp2 = 511;
FILE* file = fopen( "struct-dump", "w" );
fwrite( &tmp, sizeof( mytmp ), 1, file );
fclose( file );
}
As for treating an array of uint8_t as uint16_t, you would probably want to declare a union or use pointer coercion.
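For completeness, here is a sketch of the union approach just mentioned. Note that reading a union member other than the one last written is well-defined in C but formally undefined in C++, where memcpy into the target type is the safe equivalent:
#include <cstdint>
#include <cstdio>
union pun {
    uint8_t  bytes[2];
    uint16_t word;
};
int main() {
    pun p;
    p.bytes[0] = 255; // 0xFF
    p.bytes[1] = 1;   // 0x01
    std::printf("%u\n", (unsigned)p.word); // 511 on a little-endian machine
}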
I'm following this tutorial for using OpenAL in C++: http://enigma-dev.org/forums/index.php?topic=730.0
As you can see in the tutorial, they leave a few methods unimplemented, and I am having trouble implementing file_read_int32_le(char*, FILE*) and file_read_int16_le(char*, FILE*). Apparently what it should do is load 4 bytes from the file (or 2 in the case of int16 I guess..), convert it from little-endian to big endian and then return it as an unsigned integer. Here's the code:
static unsigned int file_read_int32_le(char* buffer, FILE* file) {
size_t bytesRead = fread(buffer, 1, 4, file);
printf("%x\n",(unsigned int)*buffer);
unsigned int* newBuffer = (unsigned int*)malloc(4);
*newBuffer = ((*buffer << 24) & 0xFF000000U) | ((*buffer << 8) & 0x00FF0000U) | ((*buffer >> 8) & 0x0000FF00U) | ((*buffer >> 24) & 0x000000FFU);
printf("%x\n", *newBuffer);
return (unsigned int)*newBuffer;
}
When debugging (in XCode) it says that the hexadecimal value of *buffer is 0x72, which is only one byte. When I create newBuffer using malloc(4), I get a 4-byte buffer (*newBuffer is something like 0xC0000003) which then, after the operations, becomes 0x72000000. I assume the result I'm looking for is 0x00000027 (edit: actually 0x00000072), but how would I achieve this? Is it something to do with converting between the char* buffer and the unsigned int* newBuffer?
Yes, *buffer will read in Xcode's debugger as 0x72, because buffer is a pointer to a char.
If the first four bytes in the memory block pointed to by buffer are (hex) 72 00 00 00, then the return value should be 0x00000072, not 0x00000027. The bytes should get swapped, but not the two "nybbles" that make up each byte.
This code leaks the memory you malloc'd, and you don't need to malloc here anyway.
Your byte-swapping is correct on a PowerPC or 68K Mac, but not on an Intel Mac or ARM-based iOS. On those platforms, you don't have to do any byte-swapping because they're natively little-endian.
Core Foundation provides a way to do this all much more easily:
static uint32_t file_read_int32_le(char* buffer, FILE* file) {
fread(buffer, 1, 4, file); // Get four bytes from the file
uint32_t val = *(uint32_t*)buffer; // Turn them into a 32-bit integer
// Swap on a big-endian Mac, do nothing on a little-endian Mac or iOS
return CFSwapInt32LittleToHost(val);
}
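If Core Foundation isn't available, a portable sketch that assembles the value byte by byte gives the same result on any host endianness (same hypothetical signature as above, minus the now-unneeded buffer argument):
#include <cstdint>
#include <cstdio>
static uint32_t file_read_int32_le(FILE* file) {
    unsigned char b[4];
    if (fread(b, 1, 4, file) != 4) return 0; // real code should report errors
    // Little-endian on disk: b[0] is the least significant byte.
    return (uint32_t)b[0]         | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}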
There's a whole family of functions called htons/htonl/ntohs/ntohl whose sole purpose in life is to convert between "host" and "network" byte order.
http://beej.us/guide/bgnet/output/html/multipage/htonsman.html
Each function has a reciprocal that does the opposite.
Now, these functions won't necessarily help you, because they intrinsically convert from your host's specific byte order, so please just use this answer as a starting point to find what you need. Generally, code should never make assumptions about what architecture it's on.
Intel == "Little Endian".
Network == "Big Endian".
Hope this starts you out on the right track.
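A tiny round-trip sketch of those functions (on a big-endian host both calls are no-ops):
#include <arpa/inet.h>
#include <cstdint>
#include <cstdio>
int main() {
    uint32_t host = 0xAABBCCDDu;
    uint32_t net  = htonl(host); // host order -> network (big-endian) order
    std::printf("host %08X -> network %08X\n", (unsigned)host, (unsigned)net);
    std::printf("back to host: %08X\n", (unsigned)ntohl(net));
}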
I've used the following for integral types. On some platforms, it's not safe for non-integral types.
#include <algorithm> // for std::reverse_copy
template <typename T> T byte_reverse(T in) {
T out;
char* in_c = reinterpret_cast<char *>(&in);
char* out_c = reinterpret_cast<char *>(&out);
std::reverse_copy(in_c, in_c+sizeof(T), out_c);
return out;
};
So, to put that in your file reader (why are you passing the buffer in, since it appears that it could be a temporary?):
static unsigned int file_read_int32_le(FILE* file) {
unsigned int int_buffer;
size_t bytesRead = fread(&int_buffer, 1, sizeof(int_buffer), file);
/* Error or less than 4 bytes should be checked */
return byte_reverse(int_buffer);
}
Hello, I have a chunk of memory (allocated with malloc()) that contains bits (bit literals). I'd like to read it as an array of char, or, better, print out the ASCII value of each run of 8 consecutive bits of the memory.
I have allocated the memory as char *, but the only way I've found to take characters out is to evaluate each bit in a loop, adding its value to a char and shifting the char left. I was looking for a faster solution.
Thank you
What I've wrote for now is this:
for allocation:
char * bits = (char*) malloc(1);
for writing to mem:
ifstream cleartext;
cleartext.open(sometext);
while(cleartext.good())
{
c = cleartext.get();
for(int j = 0; j < 8; j++)
{ //set(index) and reset(index) set or reset the bit at bits[i]
(c & 0x80) ? (set(index)):(reset(index));//(*ptr++ = '1'):(*ptr++='0');
c = c << 1;
}..
}..
and so far I've not been able to get the characters back; I only get the bits printed out using:
printf("%s\n" bits);
An example of what I'm trying to do is:
input.txt contains the string "AAAB"
My program would have to write "AAAB" as "01000001010000010100000101000010" to memory
(those are the ASCII values of "AAAB", 65 65 65 66, written out in bits)
Then I would like it to have a function that rewrites the content of the memory to a file.
So if memory contains again "01000001010000010100000101000010" it would write to the output file "AAAB".
int numBytes = 512;
char *pChar = (char *)malloc(numBytes);
for( int i = 0; i < numBytes; i++ ){
pChar[i] = '8';
}
Since this is C++, you can also use "new":
int numBytes = 512;
char *pChar = new char[numBytes];
for( int i = 0; i < numBytes; i++ ){
pChar[i] = '8';
}
If you want to visit every bit in the memory chunk, it looks like you need std::bitset.
#include <bitset>
#include <cstdlib>
#include <iostream>
char* pChunk = (char*)malloc( n );
// read in pChunk data
// iterate over all the bits, most significant bit first
for( int i = 0; i != n; ++i ){
    // construct the bitset from the byte's value rather than casting
    // the pointer, which would run afoul of aliasing rules
    std::bitset<8> bits( static_cast<unsigned char>( pChunk[i] ) );
    for( int iBit = 7; iBit >= 0; --iBit ) {
        std::cout << bits[iBit];
    }
}
I'd like to print out the ASCII value of 8 consecutive bits of the memory.
The possible value for any bit is either 0 or 1. You probably want at least a byte.
char * bits = (char*) malloc(1);
Allocates 1 byte on the heap. A much more efficient and hassle-free thing would have been to create an object on the stack i.e.:
char bits; // a single character, has CHAR_BIT bits
ifstream cleartext;
cleartext.open(sometext);
The above doesn't write anything to mem. It tries to open a file in input mode.
It has ascii characters and common eof or \n, or things like this, the input would only be a textfile, so I think it should only contain ASCII characters, correct me if I'm wrong.
If your file only has ASCII data you don't have to worry. All you need to do is read in the file contents and write it out. The compiler manages how the data will be stored (i.e. which encoding to use for your characters and how to represent them in binary, the endianness of the system etc). The easiest way to read/write files will be:
// include these on as-needed basis
#include <algorithm>
#include <iostream>
#include <iterator>
#include <fstream>
using namespace std;
// ...
/* read from standard input and write to standard output */
copy((istream_iterator<char>(cin)), (istream_iterator<char>()),
(ostream_iterator<char>(cout)));
/*-------------------------------------------------------------*/
/* read from standard input and write to text file */
/* (stream temporaries can't bind to the iterators' non-const
   references, so the streams are named) */
ofstream outfile("output.txt");
copy(istream_iterator<char>(cin), istream_iterator<char>(),
     ostream_iterator<char>(outfile, "\n") );
/*-------------------------------------------------------------*/
/* read from text file and write to text file */
ifstream infile("input.txt");
ofstream outfile2("output.txt");
copy(istream_iterator<char>(infile), istream_iterator<char>(),
     ostream_iterator<char>(outfile2, "\n") );
/*-------------------------------------------------------------*/
The last remaining question is: Do you want to do something with the binary representation? If not, forget about it. Else, update your question one more time.
E.g: Processing the character array to encrypt it using a block cipher
/* a hash calculator */
struct hash_sha1 {
unsigned char operator()(unsigned char x) {
// process
return rc;
}
};
/* store house of characters, could've been a vector as well */
basic_string<unsigned char> line;
/* read from text file and write to a string of unsigned chars */
ifstream input("input.txt");
copy(istream_iterator<unsigned char>(input),
     istream_iterator<unsigned char>(),
     back_inserter(line) );
/* Calculate a SHA-1 hash of the input */
basic_string<unsigned char> hashmsg;
transform(line.begin(), line.end(), back_inserter(hashmsg), hash_sha1());
Something like this?
char *buffer = (char*)malloc(42);
// ... put something into the buffer ...
printf("%c\n", buffer[0]);
But, since you're using C++, I wonder why you bother with malloc and such...
char* ptr = pAddressOfMemoryToRead;
while(ptr < pAddressOfMemoryToRead + blockLength)
{
char tmp = *ptr;
// temp now has the char from this spot in memory
ptr++;
}
Is this what you are trying to achieve:
char* p = (char*)malloc(10 * sizeof(char));
char* p1 = p;
memcpy(p,"abcdefghij", 10);
for(int i = 0; i < 10; ++i)
{
char c = *p1;
cout<<c<<" ";
++p1;
}
cout<<"\n";
free(p);
Can you please explain in more detail, perhaps including code? What you're saying makes no sense unless I'm completely misreading your question. Are you doing something like this?
char * chunk = (char *)malloc(256);
If so, you can access any character's worth of data by treating chunk as an array: chunk[5] gives you the element at index 5, etc. Of course, these will be characters, which may be what you want, but I can't quite tell from your question... for instance, if chunk[5] is 65, when you print it like cout << chunk[5];, you'll get a letter 'A'.
However, you may be asking how to print out the actual number 65, in which case you want to do cout << int(chunk[5]);. Casting to int will make it print as an integer value instead of as a character. If you clarify your question, either I or someone else can help you further.
Are you asking how to copy the memory bytes of an arbitrary struct into a char* array? If so this should do the trick
SomeType t = GetSomeType();
char* ptr = (char*)malloc(sizeof(SomeType));
if ( !ptr ) {
// Handle no memory. Probably should just crash
}
memcpy(ptr,&t,sizeof(SomeType));
I'm not sure I entirely grok what you're trying to do, but a couple of suggestions:
1) use std::vector instead of malloc/free and new/delete. It's safer and doesn't have much overhead.
2) when processing, try doing chunks rather than bytes. Even though streams are buffered, it's usually more efficient grabbing a chunk at a time.
3) there are a lot of different ways to output bits, but again you don't want a stream output for each character. You might want to try something like the following:
void outputbits(char *dest, char source)
{
dest[8] = 0;
for(int i=0; i<8; ++i)
dest[i] = source & (1<<(7-i)) ? '1':'0';
}
Pass it a char[9] output buffer and a char input, and you get a printable bitstring back. Decent compilers produce OK output code for this... how much speed do you need?
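A hypothetical call, with a char[9] to hold the eight digits plus the terminator:
char buf[9];
outputbits(buf, 'A');
printf("%s\n", buf); // prints 01000001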