I'm a beginning user in C++ and I want to know how to do this:
How can I 'create' a byte from a string/int. So for example I've:
string some_byte = "202";
When I would save that byte to a file, I want that the file is 1 byte instead of 3 bytes.
How is that possible?
Thanks in advance,
Tim
I would use C++'s String Stream class <sstream> to convert the string to an unsigned char.
And write the unsigned char to a binary file.
so something like [not real code]
std::string some_byte = "202";
std::istringstream str(some_byte);
int val;
if( !(str >> val))
{
// bad conversion
}
if(val > 255)
{
// too big
}
unsigned char ch = static_cast<unsigned char>(val);
printByteToFile(ch); //print the byte to file.
The simple answer is...
int value = atoi( some_byte ) ;
There are a few other questions though.
1) What size is an int and is it important? (for almost all systems it's going to be more than a byte)
int size = sizeof(int) ;
2) Is the Endianness important? (if it is look in to the htons() / ntohs() functions)
In C++, casting to/from strings is best done using string streams:
#include <sstream>
// ...
std::istringstream iss(some_string);
unsigned int ui;
iss >> ui;
if(!iss) throw some_exception('"' + some_string + "\" isn't an integer!");
unsigned char byte = i;
To write to a file, you use file streams. However, streams usually write/read their data as strings. you will have to open the file in binary mode and write binary, too:
#include <fstream>
// ...
std::ofstream ofs("test.bin", std::ios::binary);
ofs.write( reinterpret_cast<const char*>(&byte), sizeof(byte)/sizeof(char) );
Use boost::lexical_cast
#include "boost/lexical_cast.hpp"
#include <iostream>
int main(int, char**)
{
int a = boost::lexical_cast<int>("42");
if(a < 256 && a > 0)
unsigned char c = static_cast<unsigned char>(a);
}
You'll find the documentation at http://www.boost.org/doc/libs/1_43_0/libs/conversion/lexical_cast.htm
However, if the goal is to save space in a file, I don't think it's the right way to go. How will your program behave if you want to convert "257" into a byte? Juste go for the simplest. You'll work out later any space use concern if it is relevant (thumb rule: always use "int" for integers and not other types unless there is a very specific reason other than early optimization)
EDIT
As the comments say it, this only works for integers, and switching to bytes won't (it will throw an exception).
So what will happen if you try to parse "267"?
IMHO, it should go through an int, and then do some bounds tests, and then only cast into a char. Going through atoi for example will result extreamly bugs prone.
Related
What is the most suitable type of vector to keep the bytes of a file?
I'm considering using the int type, because the bits "00000000" (1 byte) are interpreted to 0!
The goal is to save this data (bytes) to a file and retrieve from this file later.
NOTE: The files contain null bytes ("00000000" in bits)!
I'm a bit lost here. Help me! =D Thanks!
UPDATE I:
To read the file I'm using this function:
char* readFileBytes(const char *name){
std::ifstream fl(name);
fl.seekg( 0, std::ios::end );
size_t len = fl.tellg();
char *ret = new char[len];
fl.seekg(0, std::ios::beg);
fl.read(ret, len);
fl.close();
return ret;
}
NOTE I: I need to find a way to ensure that bits "00000000" can be recovered from the file!
NOTE II: Any suggestions for a safe way to save those bits "00000000" to a file?
NOTE III: When using char array I had problems converting bits "00000000" for that type.
Code Snippet:
int bit8Array[] = {0, 0, 0, 0, 0, 0, 0, 0};
char charByte = (bit8Array[7] ) |
(bit8Array[6] << 1) |
(bit8Array[5] << 2) |
(bit8Array[4] << 3) |
(bit8Array[3] << 4) |
(bit8Array[2] << 5) |
(bit8Array[1] << 6) |
(bit8Array[0] << 7);
UPDATE II:
Following the #chqrlie recommendations.
#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>
#include <algorithm>
#include <random>
#include <cstring>
#include <iterator>
std::vector<unsigned char> readFileBytes(const char* filename)
{
// Open the file.
std::ifstream file(filename, std::ios::binary);
// Stop eating new lines in binary mode!
file.unsetf(std::ios::skipws);
// Get its size
std::streampos fileSize;
file.seekg(0, std::ios::end);
fileSize = file.tellg();
file.seekg(0, std::ios::beg);
// Reserve capacity.
std::vector<unsigned char> unsignedCharVec;
unsignedCharVec.reserve(fileSize);
// Read the data.
unsignedCharVec.insert(unsignedCharVec.begin(),
std::istream_iterator<unsigned char>(file),
std::istream_iterator<unsigned char>());
return unsignedCharVec;
}
int main(){
std::vector<unsigned char> unsignedCharVec;
// txt file contents "xz"
unsignedCharVec=readFileBytes("xz.txt");
// Letters -> UTF8/HEX -> bits!
// x -> 78 -> 0111 1000
// z -> 7a -> 0111 1010
for(unsigned char c : unsignedCharVec){
printf("%c\n", c);
for(int o=7; o >= 0; o--){
printf("%i", ((c >> o) & 1));
}
printf("%s", "\n");
}
// Prints...
// x
// 01111000
// z
// 01111010
return 0;
}
UPDATE III:
This is the code I am using using to write to a binary file:
void writeFileBytes(const char* filename, std::vector<unsigned char>& fileBytes){
std::ofstream file(filename, std::ios::out|std::ios::binary);
file.write(fileBytes.size() ? (char*)&fileBytes[0] : 0,
std::streamsize(fileBytes.size()));
}
writeFileBytes("xz.bin", fileBytesOutput);
UPDATE IV:
Futher read about UPDATE III:
c++ - Save the contents of a "std::vector<unsigned char>" to a file
CONCLUSION:
Definitely the solution to the problem of the "00000000" bits (1 byte) was change the type that stores the bytes of the file to std::vector<unsigned char> as the guidance of friends. std::vector<unsigned char> is a universal type (exists in all environments) and will accept any octal (unlike char* in "UPDATE I")!
In addition, changing from array (char) to vector (unsigned char) was crucial for success! With vector I manipulate my data more securely and completely independent of its content (in char array I have problems with this).
Thanks a lot!
Use std::vector<unsigned char>. Don't use std::uint8_t: it's won't exist on systems that don't have a native hardware type of exactly 8 bits. unsigned char will always exist; it will usually be the smallest addressable type that the hardware supports, and it's required to be at least 8 bits wide, so if you're trafficking in 8-bit bytes, it will handle the bits that you need.
If you really, really, really like the fixed-width types, you might consider std::uint_least8_t, which will always exist, and has at least eight bits, or std::uint_fast8_t, which also has at least eight bits. But file I/O traffics in char types, and mixing char and it's variants with vaguely specified "least" and "fast" types may well get confusing.
There are 3 problems in your code:
You use the char type and return a char *. Yet the return value is not a proper C string as you do not allocate an extra byte for the '\0' terminator nor null terminate it.
If the file may contain null bytes, you should probably use type unsigned char or uint8_t to make it explicit that the array does not contain text.
You do not return the array size to the caller. The caller has no way to tell how long the array is. You should probably use a std::vector<uint8_t> or std::vector<unsigned char> instead of an array allocated with new.
uint8_t is the winner in my eyes:
it's exactly 8 bits, or 1 byte, long;
it's unsigned without requiring you to type unsigned every time;
it's exactly the same on all platforms;
it's a generic type that does not imply any specific use, unlike char / unsigned char, which is associated with characters of text even if it can technically be used for any purpose just the same as uint8_t.
Bottom line: uint8_t is functionally equivalent to unsigned char, but does a better job of saying this is some data of unspecified nature in the source code.
So use std::vector<uint8_t>.
#include <stdint.h> to make the uint8_t definition available.
P. S. As pointed out in the comments, the C++ standard defines char as 1 byte, and byte is not, strictly speaking, required to be the same as octet (8 bits). On such a hypothetical system, char will still exist and will be 1 byte long, but uint8_t is defined as 8 bits (octet) and thus may not exist (due to implementation difficulties / overhead). So char is more portable, theoretically speaking, but uint8_t is more strict and has wider guarantees of expected behavior.
I need to write 16-bit integers to a file. fstream only writes characters. Thus I need to convert the integers to char - the actual integer, not the character representing the integer (i.e. 0 should be 0x00, not 0x30) I tried the following:
char * chararray = (char*)(&the_int);
However this creates a backwards array of two characters. The individual characters are not flipped, but the order of the characters is. Thus I created this function:
char * inttochar(uint16_t input)
{
int input_size = sizeof(input);
char * chararray = (char*)(&input);
char * output;
output[0]='\0';
for (int i=0; i<input_size; i++)
{
output[i]=chararray[input_size-(i+1)];
}
return output;
}
This seems slow. Surely there is a more efficient, less hacky way to convert it?
It's a bit hard to understand what you're asking here (perhaps it's just me, although I gather the commentators thought so too).
You write
fstream only writes characters
That's true, but doesn't necessarily mean you need to create a character array explicitly.
E.g., if you have an fstream object f (opened in binary mode), you can use the write method:
uint16_t s;
...
f.write(static_cast<const char *>(&s), sizeof(uint16_t));
As others have noted, when you serialize numbers, it often pays to use a commonly-accepted ordering. Hence, use htons (refer to the documentation for your OS's library):
uint16_t s;
...
const uint16_t ns = htons(s);
f.write(static_cast<const char *>(&ns), sizeof(uint16_t));
Again I've got a little problem with my DLL:
I try to convert a number (in this case "20") to a char which I can write to the file.
It doesn't really matter in which way this is done (whether following the ascii-table or not), but I need a way to convert back as well.
This was my attempt:
file.write((char*)20,3);
But it's throwing an access violence error..
Could someone tell me how this is done and also how I can reverse the process?
I could also use a method which works with numbers larger than 255 so the result are for example two or three chars (two chars = 16-bit-number.
Anyone have an idea?
If you just want to write an arbitrary byte, you can do this:
file.put(20);
or
char ch = 20;
file.write(&ch, 1); // Note a higher digit than 1 here will mean "undefined behaviour".
To reverse the process, you'd use file.get() or file.read(&ch, 1).
For larger units than a single byte, you'll have to use file.write(...), but it gets less portable, since it now relies on the size of the value being the same between different plaforms, AND that the internal representation is the same. This is not a problem if you are always running this on the same type of machine (Windows on an x86 processor, for example), but it will be a problem if you start using the code on different types of machines (x86, Sparc, ARM, IBM mainframe, Mobile phone DSP, etc) and possibly also between different OS's.
Something like this will work with the above restrictions:
int value = 4711;
file.write((char *)&value, sizeof(value));
It is much more portable to write this value to a file in text-form, which can be read by any other computer than recognises the same character encoding.
This will convert an unsigned long long into multiple characters depending on how big the number is, and output them to a file.
#include <fstream>
int main() {
unsigned long long number = 2098798987879879999;
std::ofstream out("out.txt");
while (number) { // While number != 0.
unsigned long long n = number & 255; // Copy the 8 rightmost bits.
number >>= 8; // Shift the original number 8 bits right.
out << static_cast<unsigned char>(n); // Cast to char and output.
}
out << std::endl; // Append line break for every number.
}
You can read it back from a file using something like this
#include <iostream>
#include <fstream>
#include <algorithm>
#include <string>
int main() {
std::ifstream in("out.txt");
unsigned long long number = 0;
std::string s;
std::getline(in, s); // Read line of characters.
std::reverse(begin(s), end(s)); // To account for little-endian order.
for (unsigned char c : s) {
number <<= 8;
number |= c;
}
std::cout << number << std::endl;
}
This outputs
2098798987879879999
So I need a little help, I've currently got a text file with following data in it:
myfile.txt
-----------
b801000000
What I want to do is read that b801 etc.. data as bits so I could get values for
0xb8 0x01 0x00 0x00 0x00.
Current I'm reading that line into a unsigned string using the following typedef.
typedef std::basic_string <unsigned char> ustring;
ustring blah = reinterpret_cast<const unsigned char*>(buffer[1].c_str());
Where I keep falling down is trying to now get each char {'b', '8' etc...} to really be { '0xb8', '0x01' etc...}
Any help is appreciated.
Thanks.
I see two ways:
Open the file as std::ios::binary and use std::ifstream::operator>> to extract hexadecimal double bytes after using the flag std::ios_base::hex and extracting to a type that is two bytes large (like stdint.h's (C++0x/C99) uint16_t or equivalent). See #neuro's comment to your question for an example using std::stringstreams. std::ifstream would work nearly identically.
Access the stream iterators directly and perform the conversion manually. Harder and more error-prone, not necessarily faster either, but still quite possible.
strtol does string (needs a nullterminated C string) to int with a specified base
Kind of a dirty way to do it:
#include <stdio.h>
int main ()
{
int num;
char* value = "b801000000";
while (*value) {
sscanf (value, "%2x", &num);
printf ("New number: %d\n", num);
value += 2;
}
return 0;
}
Running this, I get:
New number: 184
New number: 1
New number: 0
New number: 0
New number: 0
Hello I have a chunk of memory (allocated with malloc()) that contains bits (bit literal), I'd like to read it as an array of char, or, better, I'd like to printout the ASCII value of 8 consecutively bits of the memory.
I have allocated he memory as char *, but I've not been able to take characters out in a better way than evaluating each bit, adding the value to a char and shifting left the value of the char, in a loop, but I was looking for a faster solution.
Thank you
What I've wrote for now is this:
for allocation:
char * bits = (char*) malloc(1);
for writing to mem:
ifstream cleartext;
cleartext.open(sometext);
while(cleartext.good())
{
c = cleartext.get();
for(int j = 0; j < 8; j++)
{ //set(index) and reset(index) set or reset the bit at bits[i]
(c & 0x80) ? (set(index)):(reset(index));//(*ptr++ = '1'):(*ptr++='0');
c = c << 1;
}..
}..
and until now I've not been able to get character back, I only get the bits printed out using:
printf("%s\n" bits);
An example of what I'm trying to do is:
input.txt contains the string "AAAB"
My program would have to write "AAAB" as "01000001010000010100000101000010" to memory
(it's the ASCII values in bit of AAAB that are 65656566 in bits)
Then I would like that it have a function to rewrite the content of the memory to a file.
So if memory contains again "01000001010000010100000101000010" it would write to the output file "AAAB".
int numBytes = 512;
char *pChar = (char *)malloc(numBytes);
for( int i = 0; i < numBytes; i++ ){
pChar[i] = '8';
}
Since this is C++, you can also use "new":
int numBytes = 512;
char *pChar = new char[numBytes];
for( int i = 0; i < numBytes; i++ ){
pChar[i] = '8';
}
If you want to visit every bit in the memory chunk, it looks like you need std::bitset.
char* pChunk = malloc( n );
// read in pChunk data
// iterate over all the bits.
for( int i = 0; i != n; ++i ){
std::bitset<8>& bits = *reinterpret_cast< std::bitset<8>* >( pByte );
for( int iBit = 0; iBit != 8; ++iBit ) {
std::cout << bits[i];
}
}
I'd like to printout the ASCII value of 8 consecutively bits of the memory.
The possible value for any bit is either 0 or 1. You probably want at least a byte.
char * bits = (char*) malloc(1);
Allocates 1 byte on the heap. A much more efficient and hassle-free thing would have been to create an object on the stack i.e.:
char bits; // a single character, has CHAR_BIT bits
ifstream cleartext;
cleartext.open(sometext);
The above doesn't write anything to mem. It tries to open a file in input mode.
It has ascii characters and common eof or \n, or things like this, the input would only be a textfile, so I think it should only contain ASCII characters, correct me if I'm wrong.
If your file only has ASCII data you don't have to worry. All you need to do is read in the file contents and write it out. The compiler manages how the data will be stored (i.e. which encoding to use for your characters and how to represent them in binary, the endianness of the system etc). The easiest way to read/write files will be:
// include these on as-needed basis
#include <algorithm>
#include <iostream>
#include <iterator>
#include <fstream>
using namespace std;
// ...
/* read from standard input and write to standard output */
copy((istream_iterator<char>(cin)), (istream_iterator<char>()),
(ostream_iterator<char>(cout)));
/*-------------------------------------------------------------*/
/* read from standard input and write to text file */
copy(istream_iterator<char>(cin), istream_iterator<char>(),
ostream_iterator<char>(ofstream("output.txt"), "\n") );
/*-------------------------------------------------------------*/
/* read from text file and write to text file */
copy(istream_iterator<char>(ifstream("input.txt")), istream_iterator<char>(),
ostream_iterator<char>(ofstream("output.txt"), "\n") );
/*-------------------------------------------------------------*/
The last remaining question is: Do you want to do something with the binary representation? If not, forget about it. Else, update your question one more time.
E.g: Processing the character array to encrypt it using a block cipher
/* a hash calculator */
struct hash_sha1 {
unsigned char operator()(unsigned char x) {
// process
return rc;
}
};
/* store house of characters, could've been a vector as well */
basic_string<unsigned char> line;
/* read from text file and write to a string of unsigned chars */
copy(istream_iterator<unsigned char>(ifstream("input.txt")),
istream_iterator<char>(),
back_inserter(line) );
/* Calculate a SHA-1 hash of the input */
basic_string<unsigned char> hashmsg;
transform(line.begin(), line.end(), back_inserter(hashmsg), hash_sha1());
Something like this?
char *buffer = (char*)malloc(42);
// ... put something into the buffer ...
printf("%c\n", buffer[0]);
But, since you're using C++, I wonder why you bother with malloc and such...
char* ptr = pAddressOfMemoryToRead;
while(ptr < pAddressOfMemoryToRead + blockLength)
{
char tmp = *ptr;
// temp now has the char from this spot in memory
ptr++;
}
Is this what you are trying to achieve:
char* p = (char*)malloc(10 * sizeof(char));
char* p1 = p;
memcpy(p,"abcdefghij", 10);
for(int i = 0; i < 10; ++i)
{
char c = *p1;
cout<<c<<" ";
++p1;
}
cout<<"\n";
free(p);
Can you please explain in more detail, perhaps including code? What you're saying makes no sense unless I'm completely misreading your question. Are you doing something like this?
char * chunk = (char *)malloc(256);
If so, you can access any character's worth of data by treating chunk as an array: chunk[5] gives you the 5th element, etc. Of course, these will be characters, which may be what you want, but I can't quite tell from your question... for instance, if chunk[5] is 65, when you print it like cout << chunk[5];, you'll get a letter 'A'.
However, you may be asking how to print out the actual number 65, in which case you want to do cout << int(chunk[5]);. Casting to int will make it print as an integer value instead of as a character. If you clarify your question, either I or someone else can help you further.
Are you asking how to copy the memory bytes of an arbitrary struct into a char* array? If so this should do the trick
SomeType t = GetSomeType();
char* ptr = malloc(sizeof(SomeType));
if ( !ptr ) {
// Handle no memory. Probably should just crash
}
memcpy(ptr,&t,sizeof(SomeType));
I'm not sure I entirely grok what you're trying to do, but a couple of suggestions:
1) use std::vector instead of malloc/free and new/delete. It's safer and doesn't have much overhead.
2) when processing, try doing chunks rather than bytes. Even though streams are buffered, it's usually more efficient grabbing a chunk at a time.
3) there's a lot of different ways to output bits, but again you don't want a stream output for each character. You might want to try something like the following:
void outputbits(char *dest, char source)
{
dest[8] = 0;
for(int i=0; i<8; ++i)
dest[i] = source & (1<<(7-i)) ? '1':'0';
}
Pass it a char[9] output buffer and a char input, and you get a printable bitstring back. Decent compilers produce OK output code for this... how much speed do you need?