Code numbers to chars and write to a file - c++

Again I've got a little problem with my DLL:
I'm trying to convert a number (in this case 20) to a char that I can write to a file.
It doesn't really matter how this is done (whether it follows the ASCII table or not), but I need a way to convert back as well.
This was my attempt:
file.write((char*)20,3);
But it's throwing an access violation error...
Could someone tell me how this is done and also how I can reverse the process?
I could also use a method which works with numbers larger than 255, so that the result is, for example, two or three chars (two chars = a 16-bit number).
Anyone have an idea?

If you just want to write an arbitrary byte, you can do this:
file.put(20);
or
char ch = 20;
file.write(&ch, 1); // Note: a count higher than 1 here would mean undefined behaviour, since only one char is available.
To reverse the process, you'd use file.get() or file.read(&ch, 1).
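For example, a minimal read-back sketch (assuming file here is a std::ifstream opened on the same data):
char ch;
if (file.get(ch)) {                              // or: file.read(&ch, 1)
    int value = static_cast<unsigned char>(ch);  // value is 20 again
}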
For larger units than a single byte, you'll have to use file.write(...), but it gets less portable, since it now relies on the size of the value being the same between different platforms, AND on the internal representation being the same. This is not a problem if you are always running this on the same type of machine (Windows on an x86 processor, for example), but it will be a problem if you start using the code on different types of machines (x86, Sparc, ARM, IBM mainframe, mobile-phone DSP, etc.) and possibly also between different OSes.
Something like this will work with the above restrictions:
int value = 4711;
file.write((char *)&value, sizeof(value));
It is much more portable to write the value to a file in text form, which can be read by any other computer that recognises the same character encoding.
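A minimal sketch of the text-form approach (the file name here is just an example):
#include <fstream>

int main() {
    int value = 4711;
    {
        std::ofstream out("value.txt");
        out << value << '\n';      // written as the characters "4711"
    }
    std::ifstream in("value.txt");
    int back = 0;
    in >> back;                    // parsed back into an int on any platform
}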

This will convert an unsigned long long into multiple characters depending on how big the number is, and output them to a file.
#include <fstream>

int main() {
    unsigned long long number = 2098798987879879999;
    std::ofstream out("out.txt");
    while (number) { // While number != 0.
        unsigned long long n = number & 255; // Copy the 8 rightmost bits.
        number >>= 8; // Shift the original number 8 bits right.
        out << static_cast<unsigned char>(n); // Cast to char and output.
    }
    out << std::endl; // Append line break for every number.
}
You can read it back from a file using something like this
#include <iostream>
#include <fstream>
#include <algorithm>
#include <string>

int main() {
    std::ifstream in("out.txt");
    unsigned long long number = 0;
    std::string s;
    std::getline(in, s); // Read line of characters.
    std::reverse(begin(s), end(s)); // To account for little-endian order.
    for (unsigned char c : s) {
        number <<= 8;
        number |= c;
    }
    std::cout << number << std::endl;
}
This outputs
2098798987879879999

Related

how to read/write a sequence of bits bit by bit in C++

I have implemented the Huffman coding algorithm in C++, and it's working fine. I want to create a text compression algorithm.
Behind every file or piece of data in the digital world there are 0s and 1s.
I want to persist the sequence of bits (0/1) generated by the Huffman encoding algorithm to a file.
My goal is to keep the number of bits used in the file to a minimum. I'm storing the metadata needed for decoding in a separate file. I want to write the data to the file bit by bit, and then read it back bit by bit in C++.
The problem I'm facing with binary mode is that it doesn't let me write data bit by bit.
I want to write "10101" to the file bit by bit, but it writes the ASCII value (8 bits) of each character instead.
code
#include "iostream"
#include "fstream"
using namespace std;
int main(){
ofstream f;
f.open("./one.bin", ios::out | ios::binary);
f<<"10101";
f.close();
return 0;
}
output: the file contains the five ASCII characters "10101" (five bytes), not five bits.
Any help or pointers would be appreciated. Thank you.
"Binary mode" means only that you have requested that the actual bytes you write are not corrupted by end-of-line conversions. (This is only a problem on Windows. No other system has the need to deliberately corrupt your data.)
You are still writing a byte at a time in binary mode.
To write bits, you accumulate them in an integer. For convenience, in an unsigned integer. This is your bit buffer. You need to decide whether to accumulate them from the least to most or from the most to least significant positions. Once you have eight or more bits accumulated, you write out one byte to your file, and remove those eight bits from the buffer.
When you're done, if there are bits left in your buffer, you write out those last one to seven bits to one byte. You need to carefully consider how exactly you do that, and how to know how many bits there were, so that you can properly decode the bits on the other end.
The accumulation and extraction are done using the bit operations in your language. In C++ (and many other languages), those are & (and), | (or), >> (right shift), and << (left shift).
For example, to insert one bit x into your buffer, and later three bits from y, ending up with the earliest bits in the most significant positions:
unsigned buf = 0, bits = 0;
...
// some loop
{
    ...
    // write one bit (don't need the & if you know x is 0 or 1)
    buf = (buf << 1) | (x & 1);
    bits++;
    ...
    // write three bits
    buf = (buf << 3) | (y & 7);
    bits += 3;
    ...
    // write bytes from the buffer before it fills the integer length
    if (bits >= 8) { // the if could be a while if you expect 16 or more
        // out is an ostream -- must be in binary mode if on Windows
        bits -= 8;
        out.put(buf >> bits);
    }
    ...
}
...
// write any leftover bits (it is assumed here that bits is in 0..7 --
// if not, first repeat the if or while from above to clear out bytes)
if (bits) {
    out.put(buf << (8 - bits));
    bits = 0;
}
...
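Reading the bits back is the mirror image: refill a bit buffer one byte at a time and hand bits out from the most significant end. A sketch under the same MSB-first convention (in must be an istream in binary mode; your metadata has to tell you how many bits of the final byte are valid):
#include <istream>

unsigned rbuf = 0, rbits = 0;

// Return the next bit (0 or 1), or -1 at end of input.
int get_bit(std::istream& in) {
    if (rbits == 0) {
        int byte = in.get();
        if (byte < 0)              // end of input
            return -1;
        rbuf = static_cast<unsigned>(byte);
        rbits = 8;
    }
    rbits--;
    return (rbuf >> rbits) & 1;
}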

Serialize/deserialize unsigned char

I'm working on an API for an embedded device, and need to display an image generated (by the API). The screen attached to the device allows me to render bitmaps, with data stored as unsigned char image[] = { 0B00000000, 0B00001111, 0B11111110... }.
What is the easiest way to deserialize a string in whatever format needed?
My approach was to create a stringstream, separate by comma and push to vector<char>. However, the function to render bitmaps will only accept char, and from what I can find online it seems to be quite difficult to convert it. Ideally, I'd rather not use a vector at all, as including it adds several kbs to the project, which is limited in size by both the download speed of the embedded device (firmware is transferred by EDGE) and the onboard storage.
From the comments, it sounds like you want to convert a string composed of a series of "0b00000000" style literals, comma separated, into an array of their actual values. The way I would do this is to:
Get the number of bytes in the image (I assume this is known from the string length?).
Create a std::vector of unsigned char to hold the results.
For each byte in the input, construct a std::bitset from the string value, and then get its actual value.
Here's a code example. Since you have said you'd rather not use vector I have used C-style arrays and strings:
#include <bitset>
#include <cstring>
#include <iostream>
#include <memory>
int main() {
auto input = "0b00000000,0b00001111,0b11111111";
auto length = strlen(input);
// Get the number of bytes from the string length. Each byte takes 10 chars
// plus a comma separator.
int size = (length + 1) / 11;
// Allocate memory to hold the result.
std::unique_ptr<unsigned char[]> bytes(new unsigned char[size]);
// Populate each byte individually.
for (int i = 0; i < size; ++i) {
// Create the bitset. The stride is 11, and skip the first 2 characters
// to skip the 0b prefix.
std::bitset<8> bitset(input + 2 + i * 11, 8);
// Store the resulting byte.
bytes[i] = bitset.to_ulong();
}
// Now loop back over each byte, and output it to confirm the result.
for (int i = 0; i < size; ++i) {
std::cout << "0b" << std::bitset<8>(bytes[i]) << std::endl;
}
}

C++ Convert char array to int representation

What is the best way to convert a char array (containing bytes from a file) into a decimal representation so that it can be converted back later?
E.g "test" -> 18951210 -> "test".
EDITED
It can't be done without a bignum class, since there are more possible character combinations than values in an unsigned long long (an unsigned long long will only hold about 8 characters).
If you have some sort of bignum class:
// biguint is assumed to be an arbitrary-precision unsigned integer type that
// supports the usual arithmetic and conversion to char for small values.
#include <algorithm>
#include <climits>
#include <string>

biguint string_to_biguint(const std::string& s) {
    biguint result(0);
    for(std::size_t i=0; i<s.length(); ++i) {
        result *= UCHAR_MAX + 1;        // base 256, so every byte value is a distinct digit
        result += (unsigned char)s[i];
    }
    return result;
}

std::string biguint_to_string(biguint u) {  // taken by value: u is modified below
    std::string result;
    do {
        result += (char)(u % (UCHAR_MAX + 1));
        u /= UCHAR_MAX + 1;
    } while (u>0);
    std::reverse(result.begin(), result.end());  // digits come out least-significant first
    return result;
}
Note: the round trip will lose leading NULs in the string, since they become leading zeros of the number.
I'm not sure what exactly you mean, but characters are stored in memory as their "representation", so you don't need to convert anything. If you still want to, you have to be more specific.
EDIT: You can
Read byte by byte, shifting the result 8 bits left and ORing in the next byte (see the sketch after this list).
Use mpz_inp_raw (from the GMP library).
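A sketch of the shift-and-OR idea for inputs of up to eight bytes (anything longer needs an arbitrary-precision type such as GMP's mpz_t):
#include <cstdint>
#include <string>

uint64_t pack_bytes(const std::string& s) {  // assumes s.size() <= 8
    uint64_t result = 0;
    for (unsigned char c : s) {
        result <<= 8;  // make room for the next byte
        result |= c;   // OR it into the low 8 bits
    }
    return result;
}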
You can use a tree similar to Huffman compression algorithm, and then represent the path in the tree as numbers.
You'll have to keep the dictionary somewhere, but you can just create a constant dictionary that covers the whole ASCII table, since the compression is not the goal here.
There is no conversion needed. You can just use pointers.
Example:
char array[4 * NUMBER];
int *pointer = (int *)array;   // view the same bytes as ints
Keep in mind that the "length" of pointer is NUMBER, not 4 * NUMBER.
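If you go down this road, note that reading a char buffer directly through an int* can run afoul of alignment and strict-aliasing rules; here is a sketch that copies the bytes instead (NUMBER is just a placeholder size, and 4-byte ints are assumed as in the example above):
#include <cstring>

const int NUMBER = 16;
char array[4 * NUMBER];   // raw bytes, e.g. read from the file
int values[NUMBER];

void bytes_to_ints() {
    std::memcpy(values, array, sizeof(values));  // same bytes, now safely usable as ints
}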
As mentioned, character strings are already ranges of bytes (and hence easily rendered as decimal numbers) to start with. Number your bytes from 000 to 255 and string them together and you've got a decimal number, for whatever that is worth. It would help if you explained exactly why you would want to be using decimal numbers, specifically, as hex would be easier.
If you care about compression of the underlying arrays forming these numbers for Unicode Strings, you might be interested in:
http://en.wikipedia.org/wiki/Standard_Compression_Scheme_for_Unicode
If you want some benefits of compression but still want fast random-access reads and writes within a "packed" number, you might find my "NSTATE" library to be interesting:
http://hostilefork.com/nstate/
For instance, if you just wanted a representation that only accommodated 26 English letters... you could store "test" in:
NstateArray<26> myString (4);
You could read and write the letters without going through a compression or decompression process, in a smaller range of numbers than a conventional string. Works with any radix.
Assuming you want to store the integers (I'm reading them as ASCII codes) in a string: this will add the leading zeros you need to get back the original string. A character is a byte with a max value of 255, so it will need three digits in numeric form. It could be done without the STL fairly easily too, but why not use the tools you have?
#include <iostream>
#include <iomanip>
#include <sstream>
using namespace std;

char array[] = "test";

int main()
{
    stringstream out;
    string s = array;
    out.fill('0');
    for (size_t i = 0; i < s.size(); ++i)
    {
        out << setw(3) << (int)s[i];   // setw must be applied for every value
    }
    cout << s << " -> " << out.str();
    return 0;
}
output:
test -> 116101115116
Added:
change line to
out << (int)s[i] << ",";
output
test -> 116,101,115,116,
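To convert the comma-separated form back into the original characters, a sketch:
#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::istringstream in("116,101,115,116,");
    std::string token, result;
    while (std::getline(in, token, ','))
        result += static_cast<char>(std::stoi(token));
    std::cout << result;   // prints "test"
    return 0;
}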

Incrementing Individual Characters in String

I don't know if I have the correct title for this, so please correct me if I am wrong and I will change my title.
I have a string, for this example I will use:
"8ce4b16b"
I would like to shift the bits (I think) along 1 so the string would be:
"9df5c27c"
Any Ideas?
EDIT:
Just so you know, these strings are hex. So it will never reach z.
All I want to do is add a number to the digits and step one place forward through the alphabet, so a->b, f->g, etc.
If the number is 9 there will be a condition to keep it as 9.
The output DOES NOT need to be a hex.
Also the string is only an example. It is part of an MD5 encryption.
Transform a string? This sounds like a job for std::transform():
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>

char increment(char c)
{
    if ('9' == c)
    {
        return '9';
    }
    return ++c;
}

std::string increment_string(const std::string& in)
{
    std::string out;
    std::transform(in.begin(), in.end(), std::back_inserter(out), increment);
    return out;
}

int main()
{
    assert(increment_string("8ce4b16b") == "9df5c27c");
    assert(increment_string("ffffffff") == "gggggggg");
    assert(increment_string("89898989") == "99999999"); // N.B: this is one of 2^8 strings that will return "99999999"
    assert(increment_string("99999999") == "99999999"); // This is one more. Mapping backwards is going to be tricky!
    return 1;
}
Any limits you wish to impose on the characters can be implemented in the increment() function, as demonstrated.
If, on the other hand, you wish to treat the string as a hexadecimal number and add 0x11111111 to it:
#include <cassert>
#include <sstream>
#include <string>

int main()
{
    std::istringstream iss("8ce4b16b");
    unsigned long int i;   // unsigned, so 0x8ce4b16b fits even where long is 32 bits
    iss >> std::hex >> i;
    i += 0x11111111;
    std::ostringstream oss;
    oss << std::hex << i;
    assert(oss.str() == "9df5c27c");
    return 1;
}
No bits were shifted in the construction of this string.
It looks like you simply added 0x11111111 to the integer. But can you specify precisely what type your input has? And what should the result be when you add one to "f" or "9"?
That's not shifting the bits ... shifting a bit multiplies a word value by 2. You're simply incrementing each hex value by 1, and that can be done by adding 0x11111111 to your dword.
For instance, if you took your value 0x8ce4b16b (that would be treating the values you printed above as-if they were a 4-byte double-word in hexadecimal), shifting it by one bit, you would end up with 0x19C962D6.
But if you simply want to increment each nibble of your dword (each individual value in a hex-number represents 4-bits or a nibble), you're going to have to add an offset of 0x1 to each nibble. Also there is no value of G in a hex-word ... you have the values 0->9, and then A->F, where F represents the base-10 value 15. Finally, when you add 0x1 to 0xF, you're going to wrap around to 0x0.
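A quick sketch showing the difference between the two operations on a 32-bit value:
#include <cstdint>
#include <cstdio>

int main()
{
    uint32_t v = 0x8ce4b16b;
    std::printf("%08x\n", (unsigned)(v << 1));          // 19c962d6 -- a real one-bit shift (the top bit falls off)
    std::printf("%08x\n", (unsigned)(v + 0x11111111));  // 9df5c27c -- every nibble bumped by one (no carries in this case)
    return 0;
}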
Do you mean you want to increment each character in the string?
You can do that by iterating through the string and adding one to each character.
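For example, a minimal sketch of that loop:
#include <string>

int main()
{
    std::string s = "8ce4b16b";
    for (char& c : s)
        ++c;               // '8' -> '9', 'c' -> 'd', ...
    // s is now "9df5c27c"
    return 0;
}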

Byte from string/int in C++

I'm a beginning user in C++ and I want to know how to do this:
How can I 'create' a byte from a string/int? For example, I have:
string some_byte = "202";
When I save that byte to a file, I want the file to be 1 byte instead of 3 bytes.
How is that possible?
Thanks in advance,
Tim
I would use C++'s string stream class (<sstream>) to convert the string to an unsigned char, and write the unsigned char to a binary file.
so something like [not real code]
std::string some_byte = "202";
std::istringstream str(some_byte);
int val;
if( !(str >> val))
{
    // bad conversion
}
if(val > 255)
{
    // too big
}
unsigned char ch = static_cast<unsigned char>(val);
printByteToFile(ch); // print the byte to file.
The simple answer is...
int value = atoi( some_byte.c_str() ) ;
There are a few other questions though.
1) What size is an int and is it important? (for almost all systems it's going to be more than a byte)
int size = sizeof(int) ;
2) Is the endianness important? (If it is, look into the htons() / ntohs() functions.)
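For example, a sketch of the htons()/ntohs() round trip for a 16-bit value (these come from <arpa/inet.h> on POSIX systems; on Windows they live in the Winsock headers):
#include <arpa/inet.h>   // htons / ntohs (POSIX)
#include <cstdint>

uint16_t to_file_order(uint16_t host_value) {
    return htons(host_value);   // host byte order -> big-endian "network" order
}

uint16_t from_file_order(uint16_t stored_value) {
    return ntohs(stored_value); // big-endian "network" order -> host byte order
}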
In C++, casting to/from strings is best done using string streams:
#include <sstream>
// ...
std::istringstream iss(some_string);
unsigned int ui;
iss >> ui;
if(!iss) throw some_exception('"' + some_string + "\" isn't an integer!");
unsigned char byte = ui;
To write to a file, you use file streams. However, streams usually write/read their data as formatted text, so you will have to open the file in binary mode and write binary, too:
#include <fstream>
// ...
std::ofstream ofs("test.bin", std::ios::binary);
ofs.write( reinterpret_cast<const char*>(&byte), sizeof(byte)/sizeof(char) );
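Reading the byte back works the same way in reverse; a sketch (assuming the test.bin written above):
#include <fstream>

int main() {
    std::ifstream ifs("test.bin", std::ios::binary);
    unsigned char byte = 0;
    ifs.read(reinterpret_cast<char*>(&byte), 1);
    // byte now holds the value that was written (202 in this example)
}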
Use boost::lexical_cast
#include "boost/lexical_cast.hpp"
#include <iostream>
int main(int, char**)
{
int a = boost::lexical_cast<int>("42");
if(a < 256 && a > 0)
unsigned char c = static_cast<unsigned char>(a);
}
You'll find the documentation at http://www.boost.org/doc/libs/1_43_0/libs/conversion/lexical_cast.htm
However, if the goal is to save space in a file, I don't think this is the right way to go. How will your program behave if you want to convert "257" into a byte? Just go for the simplest solution; you can deal with any space concerns later, if they turn out to be relevant (rule of thumb: always use "int" for integers and not other types unless there is a very specific reason other than early optimization).
EDIT
As the comments point out, this only works for integers; switching the target type to a byte won't work (it will throw an exception).
So what will happen if you try to parse "267"?
IMHO, it should go through an int, then do some bounds tests, and only then cast into a char. Going through atoi, for example, will be extremely bug-prone.
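A sketch of that int-then-bounds-check path without Boost, using strtol rather than atoi so parse errors are actually detectable:
#include <cerrno>
#include <cstdlib>

bool parse_byte(const char* text, unsigned char& out) {
    char* end = nullptr;
    errno = 0;
    long v = std::strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        return false;                          // not a whole, in-range number
    if (v < 0 || v > 255)
        return false;                          // does not fit in a byte
    out = static_cast<unsigned char>(v);
    return true;
}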