Newline character in Text Document? - c++

I wrote a pretty simple function that reads in possible player names and stores them in a map for later use. Each line in the file is a new possible player name, but for some reason every name except the last seems to have an invisible newline character after it. My printout shows it like this...
nameLine = Georgio
Name: Georgio
0
nameLine = TestPlayer
Name: TestPlayer 0
Here is the actual code. I assume I need to be stripping something out but I am not sure what I need to be checking for.
bool PlayerManager::ParsePlayerNames()
{
    FileHandle_t file;
    file = filesystem->Open("names.txt", "r", "MOD");
    if(file)
    {
        int size = filesystem->Size(file);
        char *line = new char[size + 1];
        while(!filesystem->EndOfFile(file))
        {
            char *nameLine = filesystem->ReadLine(line, size, file);
            if(strcmp(nameLine, "") != 0)
            {
                Msg("nameLine = %s\n", nameLine);
                g_PlayerNames.insert(std::pair<char*, int>(nameLine, 0));
            }
            for(std::map<char*,int>::iterator it = g_PlayerNames.begin(); it != g_PlayerNames.end(); ++it)
            {
                Msg("Name: %s %d\n", it->first, it->second);
            }
        }
        return true;
    }
    Msg("[PlayerManager] Failed to find the Player Names File (names.txt)\n");
    filesystem->Close(file);
    return false;
}

You really need to consider using iostreams and std::string. The above code would be much simpler if you used the C++ constructs available to you.
Problems with your code:
why do you allocate a buffer for a single line which is the size of the file?
You don't clean up this buffer!
How does ReadLine fill the line buffer?
Presumably nameLine points to the beginning of the line buffer. If so, the key in the std::map is a pointer (char*) rather than a string as you were expecting, and the pointer is the same for every name! If it is different (i.e. you somehow read a line and then move the pointer along for each name), then the std::map will contain an entry per player; however, you won't be able to find an entry by player name, because the comparison will be a pointer comparison rather than the string comparison you are expecting!
I suggest that you look at implementing this using iostreams, here is some example code (without any testing)
std::ifstream fin("names.txt");
std::string line;
// Test the result of getline itself: checking fin.good() before the read
// would process one extra, failed read at end of file.
while (std::getline(fin, line)) // automatically drops the new line character!
{
    if (!line.empty())
    {
        g_PlayerNames.insert(std::pair<std::string, int>(line, 0));
    }
}
// now do what you need to
No need to do any manual memory management, and std::map is typed with std::string!
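To make the difference concrete, here is a minimal sketch of the same idea as a reusable function. The name LoadPlayerNames is hypothetical (it is not part of the original code), and the trailing '\r' check is an assumption that the file might have Windows line endings.

```cpp
#include <istream>
#include <map>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <utility>

// Hypothetical helper: reads one player name per line into a map keyed by
// std::string, so lookups compare text rather than pointer values.
std::map<std::string, int> LoadPlayerNames(std::istream& in)
{
    std::map<std::string, int> names;
    std::string line;
    while (std::getline(in, line))   // getline drops the '\n'
    {
        // Assumption: a file with Windows line endings may leave a '\r' behind.
        if (!line.empty() && line.back() == '\r')
            line.pop_back();
        if (!line.empty())
            names.insert(std::make_pair(line, 0));
    }
    return names;
}
```

Because the function takes any std::istream, it can be fed a std::ifstream for the real file or a std::istringstream while experimenting.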

ReadLine clearly includes the newline in the data it returns. Simply check for and remove it:
char *nameLine = filesystem->ReadLine(line, size, file);
// remove any newline...
if (char* p_nl = strchr(nameLine, '\n'))
    *p_nl = '\0';
(What this does is overwrite the newline character with a new NUL terminator, which effectively truncates the ASCIIZ string at that point.)
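A slightly more general variant, as a sketch: it also handles a '\r' left over from Windows line endings, which is an assumption about the file rather than something the question confirms. StripLineEnding is a hypothetical name.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: truncates a C string at the first '\r' or '\n'.
// strcspn returns the length of the leading run that contains neither
// character, which is the full length when no line ending is present,
// so the assignment is then a harmless overwrite of the terminator.
void StripLineEnding(char* s)
{
    assert(s != nullptr);
    s[std::strcspn(s, "\r\n")] = '\0';
}
```

It would be called on nameLine right after ReadLine returns, before the name is stored.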

Most likely the ReadLine function also reads the newline character. I suppose your file does not have a newline at the end of the very last line, so you do not get a newline for that name.
But until I know what filesystem, FileHandle_t, and Msg are, it is very hard to determine where the issue could be.

Related

Load shellcode from file to char* comes strange characters in end of text

I have a char array like the following:
// MessageBox
char xcode[] = "\x31\xc9\x64\x8b\x41\x30\x8b\x40\xc\x8b\x70\x14\xad\x96\xad\x8b\x58\x10\x8b\x53\x3c\x1\xda\x8b\x52\x78\x1\xda\x8b\x72\x20\x1\xde\x31\xc9\x41\xad\x1\xd8\x81\x38\x47\x65\x74\x50\x75\xf4\x81\x78\x4\x72\x6f\x63\x41\x75\xeb\x81\x78\x8\x64\x64\x72\x65\x75\xe2\x8b\x72\x24\x1\xde\x66\x8b\xc\x4e\x49\x8b\x72\x1c\x1\xde\x8b\x14\x8e\x1\xda\x31\xc9\x53\x52\x51\x68\x61\x72\x79\x41\x68\x4c\x69\x62\x72\x68\x4c\x6f\x61\x64\x54\x53\xff\xd2\x83\xc4\xc\x59\x50\x51\x66\xb9\x6c\x6c\x51\x68\x33\x32\x2e\x64\x68\x75\x73\x65\x72\x54\xff\xd0\x83\xc4\x10\x8b\x54\x24\x4\xb9\x6f\x78\x41\x0\x51\x68\x61\x67\x65\x42\x68\x4d\x65\x73\x73\x54\x50\xff\xd2\x83\xc4\x10\x68\x61\x62\x63\x64\x83\x6c\x24\x3\x64\x89\xe6\x31\xc9\x51\x56\x56\x51\xff\xd0";
Then I put the content of the variable above into a file (UTF-8 encoded, without the surrounding quotes) and tried to load it this way:
ifstream infile;
infile.open("shellcode.bin", std::ios::in | std::ios::binary);
infile.seekg(0, std::ios::end);
size_t file_size_in_byte = infile.tellg();
char* xcode = (char*)malloc(sizeof(char) * file_size_in_byte);
infile.seekg(0, std::ios::beg);
infile.read(xcode, file_size_in_byte);
printf("%s\n", xcode); // << prints content of xcode after load from file
if (infile.eof()) {
size_t bytes_really_read = infile.gcount();
}
else if (infile.fail()) {
}
infile.close();
I'm seeing some strange characters at the end of the text.
What is needed to fix it?
The issue is that the printf format specifier "%s" requires that the string is null-terminated. In your case, the null-terminator just happens to be after those characters you're seeing, but nothing guarantees where the null is unless you put one there.
Since you're using C++, one way to print the characters is to use the write() function available for streams:
#include <iostream>
//...
std::cout.write(xcode, file_size_in_byte);
The overall point is this -- if you have a character array that is not null-terminated and contains data, you must either:
Put the null in the right place before using the array in functions that look for the null-terminator or
Use functions that state how many characters to process from the character array.
The answer above uses item 2.
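For completeness, item 1 could look something like the sketch below: allocate one extra byte and write the terminator yourself. ReadFileTerminated is a hypothetical helper name, and returning an empty vector on failure is just one possible convention.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper: reads a whole file and appends a '\0' so the result
// can be passed to functions that look for a terminator, such as printf("%s").
// Returns an empty vector if the file cannot be opened or sized.
std::vector<char> ReadFileTerminated(const char* path)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in)
        return std::vector<char>();

    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    in.seekg(0, std::ios::beg);
    if (size < 0)
        return std::vector<char>();

    std::vector<char> buf(static_cast<std::size_t>(size) + 1);  // +1 for '\0'
    in.read(buf.data(), size);
    // gcount() is the number of bytes actually read, which may be fewer.
    buf.resize(static_cast<std::size_t>(in.gcount()) + 1);
    buf.back() = '\0';
    return buf;
}
```

The result can be printed with printf("%s\n", buf.data()). Note that "%s" still stops at the first zero byte inside the data, so for genuinely binary content the length-based write() remains the safer choice.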

C++ Send a file containing \0 via sockets without accidental closure

I serialize the file with the code below and send it over Winsock. This works fine with text files, but when I tried to send a JPG, the string contained \0 among its characters, so only part of the string was sent, as if \0 marked the end. I considered replacing \0 with something else, but if I replace it with, say, 'xx' and then replace it back on the other end, what happens to natural occurrences of 'xx' in the file? Sure, I could use a long, unlikely sequence, but that bloats the file.
Any help appreciated.
char* read_file(string path, int& len)
{
    std::ifstream infile(path);
    infile.seekg(0, infile.end);
    size_t length = infile.tellg();
    infile.seekg(0, infile.beg);
    len = length;
    char* buffer = new char[len]();
    infile.read(buffer, length);
    return buffer;
}
string load_to_buffer(string file)
{
    char* img;
    int ln;
    img = read_file(file, ln);
    string s = "";
    for (int i = 1; i <= ln; i++){
        char c = *(img + i);
        s += c;
    }
    return s;
}
Probably somewhere in your code (that isn't seen in the code you have posted) you use strlen() or std::string::length() to send the data, and/or you use std::string::c_str() to get the buffer. This results in truncated data because these functions stop at \0.
std::string is not a good fit for binary data. Use std::vector<char> instead, and get rid of the new[] allocation.
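One common way to avoid escaping altogether is to send the byte count ahead of the data, so the receiver never has to look for a terminator. The sketch below assumes a 4-byte big-endian length header; that layout and the names FrameMessage and UnframeMessage are illustrative choices, not anything Winsock requires.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical framing: a 4-byte big-endian length followed by the raw bytes.
// Embedded '\0' bytes need no special treatment because the size is explicit.
std::vector<char> FrameMessage(const std::vector<char>& payload)
{
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::vector<char> out;
    out.reserve(payload.size() + 4);
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((n >> shift) & 0xFF));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Reverses FrameMessage. Returns false if the buffer is shorter than the
// header claims, e.g. because more data still has to be received.
bool UnframeMessage(const std::vector<char>& framed, std::vector<char>& payload)
{
    if (framed.size() < 4)
        return false;
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n = (n << 8) | static_cast<unsigned char>(framed[i]);
    if (framed.size() - 4 < n)
        return false;
    payload.assign(framed.begin() + 4,
                   framed.begin() + 4 + static_cast<std::ptrdiff_t>(n));
    return true;
}
```

The framed buffer would then be handed to the socket call together with its size() rather than a length computed by strlen().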

C++ rapidxml access violation after a certain amount of time (visual studio 2013)

I have been using the excellent rapidxml library to read and use information from XML files to hold cutscene information for a game I am programming in C++. I have run into an odd problem.
I start by loading the XML file into a rapidxml::xml_document<>* from a std::ifstream* XMLFile:
std::stringstream buffer; //Create a string buffer to hold the loaded information
buffer << XMLFile->rdbuf(); //Pass the ifstream buffer to the string buffer
XMLFile->close(); //close the ifstream
std::string content(buffer.str()); //get the buffer as a string
buffer.clear();
cutScene = new rapidxml::xml_document<>;
cutScene->parse<0>(&content[0]);
root = cutScene->first_node();
My cutscene XML file is made up of "parts", and at the beginning I want to load all of those parts (which are all xml_nodes) into a vector:
//Load parts
if (parts->size() == 0) {
    rapidxml::xml_node<>* partNode = root->first_node("part");
    parts->push_back(partNode);
    for (int i = 1; i < numParts; i++) {
        parts->push_back(partNode->next_sibling());
        printf("name of part added at %i: %s.\n", i, parts->at(i)->name());
    }
}
That last line prints "name of part added at 1: part" to the console.
The problem is for some reason, whenever I try to access the vector and print the same name of that same specific part not as a part of this method, the name can be accessed but is just a random string of letters and numbers. It seems that for some reason rapidxml is deleting everything after my load method is complete. I am still new to posting on stackoverflow so if you need more information just ask, thanks!
Rapidxml is an in-situ XML parser. It alters the original string buffer (content in your case) to form null-terminated tokens such as element and attribute names, so every name pointer refers into that buffer. If content is a local variable, it is destroyed when your load method returns, and those pointers are left dangling. Secondly, the lifetime of the tree nodes referenced by the parts items is tied to the xml_document (cutScene) instance.
Keep the cutScene and content instances together with the vector; this keeps the vector items alive as well.
e.g:
struct SceneData
{
    std::vector<char> content;
    rapidxml::xml_document<> cutScene;
    std::vector<rapidxml::xml_node<>*> parts;
    bool Parse(const std::string& text);
};
bool SceneData::Parse(const std::string& text)
{
    // Copy the text into a buffer we own and terminate it for the parser.
    content.reserve(text.length() + 1);
    content.assign(text.begin(), text.end());
    content.push_back('\0');
    parts.clear();
    try
    {
        cutScene.parse<0>(content.data());
    }
    catch (const rapidxml::parse_error&)
    {
        return false;
    }
    // Load parts: walk every "part" sibling rather than pushing the first
    // node's next sibling repeatedly.
    rapidxml::xml_node<>* root = cutScene.first_node();
    if (!root)
        return false;
    for (rapidxml::xml_node<>* partNode = root->first_node("part");
         partNode;
         partNode = partNode->next_sibling("part"))
    {
        parts.push_back(partNode);
    }
    return true;
}
EDITED
The parser expects a sequence of characters terminated by '\0' as input. Since a buffer referenced by &string[0] is not guaranteed to be null-terminated (before C++11), it is recommended to copy the string content into a std::vector<char>.

Parsing a character array with several null terminated characters into different strings - C++

I asked this question before but with less information than I have now.
What I essentially have is a data block of type char. That block contains filenames that I need to format and put into a vector. I initially thought the filenames in this block were separated by three spaces. Now I realize the separators are '\0' null characters. So the solution that was provided was fantastic for the example I gave when I thought that there were spaces rather than null chars.
Here is what the structure looks like. Also, I should point out I DO have the size of the character data block.
filename1.bmp\0\0\0brick.bmp\0\0\0toothpaste.gif\0\0\0
The way the best solution did it was this:
// The stringstream will do the dirty work and deal with the spaces.
std::istringstream iss(s);
// Your filenames will be put into this vector.
std::vector<std::string> v;
// Copy every filename to a vector.
std::copy(std::istream_iterator<std::string>(iss),
std::istream_iterator<std::string>(),
std::back_inserter(v));
// They are now in the vector, print them or do whatever you want with them!
for(int i = 0; i < v.size(); ++i)
std::cout << v[i] << "\n";
This works fantastic for my original question but not with the fact they are null chars instead of spaces. Is there any way to make the above example work. I tried replacing null chars in the array with spaces but that didn't work.
Any ideas on the best way to format this char block into a vector of strings?
Thanks.
If you know your filenames don't have embedded "\0" characters in them, then this should work. (untested)
const char data[] = "filename1.bmp\0\0\0brick.bmp\0\0\0toothpaste.gif\0\0\0";
const char * buffer = data;
std::size_t size_of_buffer = sizeof(data) - 1; //Or whatever the real value is
const char * end_of_buffer = buffer + size_of_buffer;
std::vector<std::string> v;
const char * cursor = buffer;
while( cursor < end_of_buffer )
{
    std::string filename(cursor); //Stops at the first null
    v.push_back( filename );
    cursor = cursor + filename.size() + 3; //Skip the name and its three nulls
}
If they do have embedded null characters in the filename you'll need to be a little cleverer.
Something like this should work. (untested)
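const char * start_of_filename = buffer;
while( start_of_filename < end_of_buffer )
{
    //Create a cursor at the current spot and move cursor until we hit three nulls
    const char * scan_cursor = start_of_filename;
    while( end_of_buffer - scan_cursor >= 3 &&
           ( scan_cursor[0]!='\0' || scan_cursor[1]!='\0' || scan_cursor[2]!='\0' ) )
    {
        ++scan_cursor;
    }
    //No complete separator left: treat the rest of the buffer as the last word
    if( end_of_buffer - scan_cursor < 3 )
        scan_cursor = end_of_buffer;
    //From our start to the cursor is our word.
    v.push_back( std::string(start_of_filename,scan_cursor) );
    //Move on to the next word
    start_of_filename = ( scan_cursor == end_of_buffer ) ? end_of_buffer : scan_cursor+3;
}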
If spaces would be a suitable separator, you could just replace the null characters by spaces:
std::replace(s.begin(), s.end(), '\0', ' ');
... and go from there. However, I'd suspect that you really need to use the null characters as separators as file names typically can include spaces. In this case, you could either use std::getline() with '\0' as the end of line or use the find() and substr() members of the string itself. The latter would look something like this:
std::vector<std::string> v;
std::string const null(1, '\0');
for (std::string::size_type pos(0); (pos = s.find_first_not_of(null, pos)) != s.npos; )
{
    std::string::size_type end = s.find(null, pos);
    v.push_back(s.substr(pos, end - pos)); // start at pos, not at 0
    pos = end;
}
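The std::getline() alternative mentioned above could be sketched as follows. SplitOnNulls is a hypothetical name; the code assumes the block size is known, as the question states, and that empty names can be discarded.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a block of '\0'-separated names.
// `data` need not be null-terminated; `size` is the known block size.
std::vector<std::string> SplitOnNulls(const char* data, std::size_t size)
{
    // Constructing the string from pointer + length keeps embedded '\0's.
    const std::string block(data, size);
    std::istringstream iss(block);

    std::vector<std::string> names;
    std::string token;
    while (std::getline(iss, token, '\0'))   // '\0' acts as the delimiter
    {
        if (!token.empty())                  // runs of nulls give empty tokens
            names.push_back(token);
    }
    return names;
}
```

Skipping empty tokens means the exact number of separating nulls does not matter, which makes this a little more forgiving than counting three bytes by hand.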

sscanf() doesn't recognize format properly

When I use sscanf() in the following code, it is taking the whole line and placing it in the first string for some reason, and I do not see any problems with it. The output from Msg() is coming out like PatchVersion=1.1.1.5 = °?¦§-
The file looks like this (except each is a new line, not sure why it shows as one on StackOverflow)
PatchVersion=1.1.1.5
ProductName=tf
appID=440
Code:
bool ParseSteamFile()
{
    FileHandle_t file;
    file = filesystem->Open("steam.inf", "r", "MOD");
    if(file)
    {
        int size = filesystem->Size(file);
        char *line = new char[size + 1];
        while(!filesystem->EndOfFile(file))
        {
            char *subLine = filesystem->ReadLine(line, size, file);
            if(strstr(subLine, "PatchVersion"))
            {
                char *name = new char[32];
                char *value = new char[32];
                sscanf(subLine, "%s=%s", name, value);
                Msg("%s = %s\n", name, value);
            }
            else if(strstr(subLine, "ProductName"))
            {
                char *name = new char[32];
                char *value = new char[32];
                sscanf(subLine, "%s=%s", name, value);
                Msg("%s = %s\n", name, value);
            }
        }
        return true;
    }
    else
    {
        Msg("Failed to find the Steam Information File (steam.inf)\n");
        filesystem->Close(file);
        return false;
    }
    filesystem->Close(file);
    return false;
}
One solution would be to use the (rather underused, in my opinion) character group format specifier:
sscanf(subLine, "%[^=]=%s", name, value);
Also, you should use the return value of sscanf() to verify that you did indeed get both values, before relying on them.
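Putting both suggestions together, a minimal sketch might look like this. ParseKeyValue is a hypothetical name, and the 32-byte buffers simply mirror the sizes used in the question; the field widths keep sscanf from writing past them.

```cpp
#include <cstdio>

// Hypothetical helper: splits "key=value" into two 32-byte buffers.
// Returns true only when both fields were converted.
bool ParseKeyValue(const char* line, char (&name)[32], char (&value)[32])
{
    // %31[^=] reads up to 31 characters that are not '=', the literal '='
    // must come next, and %31s reads the value up to the next whitespace.
    // The width 31 leaves room for the terminating '\0' in each buffer.
    return std::sscanf(line, "%31[^=]=%31s", name, value) == 2;
}
```

A line without an '=' makes sscanf return 1 instead of 2, so the caller can skip it rather than print an uninitialized value.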
%s is "greedy", i.e. it keeps reading until it hits whitespace (or newline, or EOF). The '=' character is none of these, so sscanf just carries on, matching the entire line for the first %s.
You're probably better off using (for example) strtok(), or a simple character-by-character parser.
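As an illustration of the hand-written route, here is a small sketch that splits at the first '=' using std::string instead of strtok(). SplitKeyValue is a hypothetical name, and stripping a trailing line ending is an assumption about what the line reader leaves in place.

```cpp
#include <string>

// Hypothetical helper: splits "key=value" at the first '='.
// Returns false when there is no '=' or the key would be empty.
bool SplitKeyValue(const std::string& line, std::string& key, std::string& value)
{
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos || eq == 0)
        return false;

    key = line.substr(0, eq);
    value = line.substr(eq + 1);

    // Drop a trailing "\r\n" or "\n" if the reader kept it.
    const std::string::size_type last = value.find_last_not_of("\r\n");
    value.erase(last == std::string::npos ? 0 : last + 1);
    return true;
}
```

Unlike the %s conversion, this keeps any spaces that appear inside the value.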
From the manpage of scanf, regarding %s:
Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null character ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
%s will read characters until a whitespace is encountered. Since there are no whitespaces before/after the '=' sign, the entire string is read.
Your use of arrays is very poor C++ technique. You might use streams but if you insist on using sscanf and arrays then at least use vector to manage your memory.
You might print out exactly what is in subLine and check what Msg does. Is this your own code? I have never heard of FileHandle_t. I do know that it has a method that returns a char*, which presumably you have to manage.
Regular expressions are part of the boost library and will soon be in the standard library. They are fairly "standard" and you might do well to use it to parse your line.
(boost::regex or tr1::regex if you have it, VS2008 has it)
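With a compiler that provides std::regex (part of the standard library since C++11), the idea could be sketched like this. MatchKeyValue and the pattern are illustrative assumptions, not taken from the question; boost::regex would be used in much the same way.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: matches "key=value" with optional trailing whitespace.
bool MatchKeyValue(const std::string& line, std::string& key, std::string& value)
{
    // Group 1: one or more characters that are neither '=' nor whitespace.
    // Group 2: one or more non-whitespace characters after the '='.
    static const std::regex pattern("([^=\\s]+)=(\\S+)\\s*");

    std::smatch m;
    if (!std::regex_match(line, m, pattern))   // the whole line must match
        return false;
    key = m[1].str();
    value = m[2].str();
    return true;
}
```

Because regex_match requires the entire line to fit the pattern, malformed lines are rejected instead of being partially parsed.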