Why does ofstream insert a 0x0D byte before 0x0A?

I'm outputting an array of unsigned characters in C++ using ofstream fout("filename");
but a spurious character appears in the output. This is the part of the code that causes the problem:
for(int i = 0; i < 12; i++)
fout << DChuffTable[i];
and this is the definition of the array:
unsigned char DChuffTable[12] = {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0B};
In the output file I get a spurious 0x0D between 0x09 and 0x0A. I checked the array in the debugger right before it is written, and it is unchanged. What is causing this?

Your stream is opening in text mode, and since 0x0A is the line feed (LF) character, that's being converted by your stream to 0x0D 0x0A, i.e. CR/LF.
Open your stream in binary mode:
std::ofstream fout("filename", std::ios_base::out | std::ios_base::binary);
Then line ending conversions should not be performed.
This is the right choice whenever you write raw bytes rather than text, since text mode exists to translate line endings for the platform (on Windows, LF becomes CR/LF).

Related

Encrypt in C++ / Decrypt in x86

I am having a problem with a school assignment. The assignment is to write a metamorphic Hello World program. This program will produce 10 .com files that print "Hello World!" when executed. Each of the 10 .com files must be different from the others. I understand the concept of metamorphic vs oligomorphic vs polymorphic. My program currently creates 10 .com files and then writes the machine code to the files. I began by simply writing only the machine code to print hello world and tested it. It worked just fine. I then tried to add a decryption routine to the beginning of the machine code. Here is my current byte array:
#define ARRAY_SIZE(array) (sizeof((array))/sizeof((array[0])))
typedef unsigned char BYTE; // must be declared before its first use
BYTE pushCS = 0x0E;
BYTE popDS = 0x1F;
BYTE movDX = 0xBA;
BYTE helloAddr1 = 0x1A;
BYTE helloAddr2 = 0x01;
BYTE movAH = 0xB4;
BYTE nine = 0x09;
BYTE Int = 0xCD;
BYTE tOne = 0x21;
BYTE movAX = 0xB8;
BYTE ret1 = 0x01;
BYTE ret2 = 0x4C;
BYTE movBL = 0xB3;
BYTE keyVal = 0x03; // Encrypt/Decrypt key
BYTE data[] = { 0x8D, 0x0E, 0x01, 0xB7, 0x1D, 0xB3, keyVal, 0x30, 0x1C, 0x46, 0xFE, 0xCF, 0x75, 0xF9,
movDX, helloAddr1, helloAddr2, movAH, nine, Int, tOne, movAX, ret1, ret2, Int, tOne,
0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0x57, 0x6F, 0x72, 0x6C, 0x64, 0x21, 0x0D, 0x0D, 0x0A, 0x24 };
The decryption portion of the machine code is the first 14 bytes of "data". This decryption routine would take the obfuscated machine code bytes and decrypt them by xor-ing the bytes with the same key that was used to encrypt them. I am encrypting the bytes in my C++ code with this:
for (int i = 15; i < ARRAY_SIZE(data); i++)
{
data[i] ^= keyVal;
}
I have verified over and over again that my addressing is correct, considering that the code begins at offset 100h. What I have noticed is that when keyVal is 0x00, my code runs fine and I get 10 .com files that print Hello World!. However, this does me no good, as 0x00 leaves everything unchanged. When I provide an actual key like 0x02, my program no longer works. It simply hangs until I close DOSBox. Any hints as to the cause of this would be a great help. I have some interesting plans for junk insertion (the actual metamorphic part), but I don't want to move on to that until I figure out this encrypt/decrypt issue.

The decryption portion of the machine code is the first 14 bytes of "data".
and
for (int i = 15; i < ARRAY_SIZE(data); i++)
do not match, since C++ array indexes start at 0: the first 14 bytes are data[0] through data[13].
In your array data[15] == helloAddr1, which means the data[14] == movDX element is never encrypted. Double-check which elements should be encrypted, and start at i = 14 if required.

Output is not what it should be

I am working on a program that requires me to access data from a char array containing hex values. I have to use a function, called func() in this example, to access the data structure. func() contains 3 pointer variables, each of a different type, and I can use any of them to access the data in the array. Whichever data type I choose affects which values are read through the pointer. Here's the code:
unsigned char data[] =
{
0xBA, 0xDA, 0x69, 0x50,
0x33, 0xFF, 0x33, 0x40,
0x20, 0x10, 0x03, 0x30,
0x66, 0x03, 0x33, 0x40,
};
void func()
{
unsigned char* ch;
unsigned int* i;
unsigned short* s;
unsigned int v;
s = (unsigned short*)&data[0];
v = s[6]; // bytes 12 and 13 (0x66, 0x03), read as a little-endian 16-bit value
printf("val:0x%x \n",v);
}
Output:
val:0x366
The problem with this output is that it should be 0x0366 with the zero in front of the 3, but it gets cut off at the printf statement, and I'm not allowed to modify that. How else could I fix this?

Use a format that specifies leading zeros: %04x.
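Without changing the format passed to printf or replacing it entirely, I'm afraid there's no way to affect the output. The value itself is correct either way: 0x366 and 0x0366 are the same number, and %x simply does not print leading zeros.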

C++ Send bytes from a string?

I am writing a little program that talks to the serial port. I got the program working fine with one of these lines;
unsigned char send_bytes[] = { 0x0B, 0x11, 0x00, 0x02, 0x00, 0x69, 0x85, 0xA6, 0x0e, 0x01, 0x02, 0x3, 0xf };
However, the string to send is variable, so I want to make something like this:
char *blahstring;
blahstring = "0x0B, 0x11, 0x00, 0x02, 0x00, 0x69, 0x85, 0xA6, 0x0e, 0x01, 0x02, 0x3, 0xf";
unsigned char send_bytes[] = { blahstring };
It doesn't give me an error, but it also doesn't work. Any ideas?

A byte string looks like this:
const char *blahString = "\x0B\x11\x00\x02\x00\x69\x85\xA6\x0E\x01\x02\x03\x0f";
Remember that this is not a regular string: it contains embedded zero bytes, so functions such as strlen will stop early. It is wise to declare it explicitly as an array of characters and to keep track of its size yourself (the array is one byte larger than the data, because the literal also carries a terminating zero):
unsigned char blahString[] = "\x0B\x11\x00\x02\x00\x69\x85\xA6\x0E\x01\x02\x03\x0f";
unsigned char sendBytes[13];
memcpy(sendBytes, blahString, 13); // copies the 13 data bytes from blahString to sendBytes
This is not the same as the string you defined.
EDIT:
As for why your first send_bytes works and the second doesn't:
The first one creates an array of individual bytes, whereas the second one is built from a string of ASCII characters. So the first send_bytes is 13 bytes long, whereas the text in the second blahstring is much longer, because each byte is spelled out there as the ASCII characters '0', 'x', and so on.

blahstring is a string of characters.
The 1st character is 0, the 2nd character is x, the 3rd character is 0, the 4th character is B, etc. So the line
unsigned char send_bytes[] = { blahstring };
declares an array that (assuming that you perform a cast!) has only one item.
The example that works, by contrast, is an array whose 1st element has the value 0x0B, whose 2nd element has the value 0x11, and so on.

how to check a bit is enabled or not in array of hexadecimal digit

#include<iostream>
#define check_bit(var,pos) {return (var & (1 << pos))!=0;}
using namespace std;
int main()
{
uint8_t temp[150]={0x00, 0x02, 0x17, 0xe2, 0x1c, 0xa8, 0x00, 0x30, 0x96, 0xe1, 0x8c, 0x38,
0x88, 0x47, 0x00, 0x01,
0x30, 0xfe, 0x00, 0x01, 0x31, 0xfe, 0x45, 0x00, 0x00, 0x64, 0x3b, 0x89, 0x00, 0x00, 0xfe, 0x01,
0x33, 0x5a, 0xc0, 0xa8, 0x79, 0x02, 0x0a, 0x0a, 0x0a, 0x01, 0x08, 0x00, 0xe3, 0x86, 0x00, 0xea,
0x01, 0xd2, 0x00, 0x00, 0x00, 0x05, 0x02, 0x6a, 0x95, 0x98, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd,
0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd,
0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd,
0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd,
0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd, 0xab, 0xcd
};
uint16_t *ptr1=(uint16_t*)&temp[0];
while(!(*(ptr1+0)==0x88 && *(ptr1+1)==0x47))
{
ptr1++;
}
cout<<"MPLS packet";
uint32_t *ptr2=(uint32_t*)&temp[0];
cout<<"4 bytes accessed at a time";
ptr2++;
while(check_bit(*(ptr+3),7)!=1)
{
cout<<"bottom of the stack:label 0";
ptr2++;
}
cout<<"mpls label:1";
return 0;
}
The program is intended to identify whether a packet is MPLS by accessing two bytes at a time and checking for the bytes 88 and 47. If it is an MPLS packet, it should then access four bytes at a time and check whether a bit in the 3rd byte (30 in this case) is enabled. If it is not enabled, it should access the next four bytes and check again. I have written the program, but it is not working. I am also not able to access individual elements of the array: cout<<temp[0] prints a garbage value.
Please help.

First thing I noticed is that your code looks for consecutive 16-bit values of 0x88 and 0x47, but in the packet itself these values seem to be 8-bit (1 byte each). If ptr1 is changed to be uint8_t*, it will be able to find the values. I don't know what the correct behavior for the rest of the code is so I can't check it.
In general, directly reading values that are bigger than 8 bits (e.g. uint16_t or uint32_t) from memory here may not be a good idea because your program will behave differently on little-endian and big-endian processors. And as ydroneaud mentions in a comment, some processors won't be able to read these values because you read them from unaligned addresses.

I think I can fix your program, but you had better listen to the other folks, who know networking better than I do.
uint8_t *ptr=temp;
while(ptr[0]!=0x88 || ptr[1]!=0x47)
{
ptr++;
}
cout<<"MPLS packet";
ptr+=2;
cout<<"4 bytes accessed at a time";
while(!check_bit(ptr[2],7))
{
cout<<"bottom of the stack:label 0";
ptr+=4;
}
cout<<"mpls label:1";
return 0;
Edit: to print individual bytes from the array you need to cast them to some integer type first. This is because uint8_t is most likely typedeffed as unsigned char which is interpreted by cout as a character code. Then you need to set the cout to hexadecimal mode:
cout << hex << (int)ptr[2] << endl;
Edit 2: there is an error in your check_bit() macro. A macro is not a function, but a piece of text that is copied as is (replacing the arguments) in place where its name is mentioned. It must be
#define check_bit(var,pos) (((var)&(1<<(pos)))!=0)
or define a function instead:
bool check_bit(int var, int pos) {return (var & (1 << pos))!=0;}

A slightly more worked-out version of my comment: you should actually decode the network stack to be sure that MPLS is present, because it is not that unlikely for the value 0x8847 to occur somewhere in the payload, in addresses, and so on.
Let's assume you begin with an Ethernet frame. Note first that most applications will give you data from the destination MAC address onwards; preambles and such are discarded. So the 13th and 14th bytes are the type field. This tells you what is encapsulated in Ethernet. It is usually 0x0800, meaning IP; 0x8847 means MPLS unicast. Other options are possible, for example IPv6 or VLAN tags (described below). But note that you can determine with certainty which offsets to use: you know what is encapsulated in the MAC frame and where this encapsulated data starts (the 15th octet). Optional Q-tags can shift this; I explain these below.
Since you are looking for 0x8847, I guess you have MPLS directly over Ethernet, in which case you don't need to go any further. If your stack is more complex, you will also have to decode the next encapsulated data (e.g. IP) and take its size into account, up to the point where you find your MPLS header.
For Ethernet there are two somewhat common options: dot1q and QinQ tagging (VLAN tags). dot1q adds 4 bytes to the Ethernet header; you can recognise it because the type field will be 0x8100. In this case the real type field (of what is encapsulated) will be 4 bytes further on (so at the 17th byte), and the encapsulated data will start at the 19th byte. With QinQ the type will be 0x9100 (a widely used vendor value; the IEEE 802.1ad standard assigns 0x88A8) and the real type will be 8 bytes further on, at the 21st byte; the encapsulated data can be found from the 23rd byte onwards.
Of course, decoding the whole network stack would be crazy. To start with, you can ignore addressing, QoS, and so on. You mostly need to find the type of the next header and where it starts (this can be influenced by optional fields like dot1q). Usually you know beforehand which kind of stack you have on your system. So it comes down to studying these headers and finding the fixed offset where your MPLS header sits, which makes the work quite simple.

Hex to String Conversion C++/C/Qt?

I am interfacing with an external device which is sending data in hex format. It is of form
> %abcdefg,+xxx.x,T,+yy.yy,T,+zz.zz,T,A*hhCRLF
CR LF is carriage return line feed
hh->checksum
%abcdefg -> header
Each character in the above packet is sent as a hex representation (the xx, yy, abcd etc. are replaced with actual numbers). The problem is that at my end I store it in a const char*, and during the implicit conversion a checksum of, say, 0x05 is converted to \0x05. Here \0, being the null character, terminates my string, so frames are perceived as incorrect when they are not. I could change the implementation to process raw bytes (in hex form), but I was wondering whether there is another way, because working with strings greatly simplifies processing of the bytes, and that is what programmers are meant to do.
Also, in cutecom (on Linux RHEL 4) I checked the data on the serial port, and there too we saw \0x05 instead of 5 for the checksum.
Note that for storing incoming data I am using
//store data from serial here
unsigned char Buffer[SIZE];
//convert to a QString; this is where the problem with \0 arises
QString str((const char*)Buffer);
QString is "string" clone of Qt. Library is not an issue here I could use STL also, but C++ string library is also doing the same thing. Has somebody tried this type of experiment before? Do share your views.
EDIT
This is the sample code you can check for yourself also:
#include <iostream>
#include <string>
#include <QString>
#include <QApplication>
#include <QByteArray>
using std::cout;
using std::string;
using std::endl;
int main(int argc,char* argv[])
{
QApplication app(argc,argv);
int x = 0x05;
const char mydata[] = {
0x00, 0x00, 0x03, 0x84, 0x78, 0x9c, 0x3b, 0x76,
0xec, 0x18, 0xc3, 0x31, 0x0a, 0xf1, 0xcc, 0x99};
QByteArray data = QByteArray::fromRawData(mydata, sizeof(mydata));
printf("Hello %s\n",data.data());
string str("Hello ");
unsigned char ch[]={22,5,6,7,4};
QString s((const char*)ch);
qDebug("Hello %s",qPrintable(s));
cout << str << x ;
cout << "\nHello I am \0x05";
cout << "\nHello I am " << "0x05";
return app.exec();
}

QByteArray text = QByteArray::fromHex("517420697320677265617421");
text.data(); // returns "Qt is great!"

If your 0x05 is converted to the char '\x05', then you're not having hexadecimal values (that only makes sense if you have numbers as strings anyway), but binary ones. In C and C++, a char is basically just another integer type with very little added magic. So if you have a 5 and assign this to a char, what you get is whatever character your system's encoding defines as the fifth character. (In ASCII, that would be the ENQ char, whatever that means nowadays.)
If what you want instead is the char '5', then you need to convert the binary value into its string representation. In C++, this is usually done using streams:
const char ch = 5; // '\x05'
std::ostringstream oss;
oss << static_cast<int>(ch);
const std::string& str = oss.str(); // str now contains "5"
Of course, the C std library also provides functions for this conversion. If streaming is too slow for you, you might try those.

I think C++ string classes are usually designed to handle zero-terminated char sequences. If your data is of known length (as it appears to be), then you could use a std::vector<char>. This will provide some of the functionality of a string class, whilst ignoring nulls within the data.

As I see you want to eliminate control ASCII symbols. You could do it in the following way:
#include <algorithm> // transform
#include <iostream>
#include <iterator> // back_inserter
#include <string>
#include <QtCore/QString>
#include <QtCore/QByteArray>
using namespace std;
// test data from your comment
char data[] = { 0x49, 0x46, 0x50, 0x4a, 0x4b, 0x51, 0x52, 0x43, 0x2c, 0x31,
0x32, 0x33, 0x2e, 0x34, 0x2c, 0x54, 0x2c, 0x41, 0x2c, 0x2b,
0x33, 0x30, 0x2e, 0x30, 0x30, 0x2c, 0x41, 0x2c, 0x2d, 0x33,
0x30, 0x2e, 0x30, 0x30, 0x2c, 0x41, 0x2a, 0x05, 0x0d, 0x0a };
// functor to remove control characters
struct toAscii
{
// values >127 will be eliminated too
char operator ()( char value ) const { if ( value < 32 && value != 0x0d && value != 0x0a ) return '.'; else return value; }
};
int main(int argc,char* argv[])
{
string s;
transform( &data[0], &data[sizeof(data)], back_inserter(s), toAscii() );
cout << s; // STL string
// convert to QString ( if necessary )
QString str = QString::fromStdString( s );
QByteArray d = str.toAscii();
cout << d.data(); // QString
return 0;
}
The code above prints the following in console:
IFPJKQRC,123.4,T,A,+30.00,A,-30.00,A*.
If you have continuous stream of data you'll get something like:
IFPJKQRC,123.4,T,A,+30.00,A,-30.00,A*.
IFPJKQRC,123.4,T,A,+30.00,A,-30.00,A*.
IFPJKQRC,123.4,T,A,+30.00,A,-30.00,A*.
IFPJKQRC,123.4,T,A,+30.00,A,-30.00,A*.
IFPJKQRC,123.4,T,A,+30.00,A,-30.00,A*.
