sscanf() doesn't recognize format properly - c++

When I use sscanf() in the following code, it is taking the whole line and placing it in the first string for some reason, and I do not see any problems with it. The output from Msg() is coming out like PatchVersion=1.1.1.5 = °?¦§-
The file looks like this (except each is a new line, not sure why it shows as one on StackOverflow)
PatchVersion=1.1.1.5
ProductName=tf
appID=440
Code:
bool ParseSteamFile()
{
FileHandle_t file;
file = filesystem->Open("steam.inf", "r", "MOD");
if(file)
{
int size = filesystem->Size(file);
char *line = new char[size + 1];
while(!filesystem->EndOfFile(file))
{
char *subLine = filesystem->ReadLine(line, size, file);
if(strstr(subLine, "PatchVersion"))
{
char *name = new char[32];
char *value = new char[32];
sscanf(subLine, "%s=%s", name, value);
Msg("%s = %s\n", name, value);
}
else if(strstr(subLine, "ProductName"))
{
char *name = new char[32];
char *value = new char[32];
sscanf(subLine, "%s=%s", name, value);
Msg("%s = %s\n", name, value);
}
}
return true;
}
else
{
Msg("Failed to find the Steam Information File (steam.inf)\n");
filesystem->Close(file);
return false;
}
filesystem->Close(file);
return false;
}

One solution would be to use the (rather underused, in my opinion) character group format specifier:
sscanf(subLine, "%[^=]=%s", name, value);
Also, you should use the return value of sscanf() to verify that you did indeed get both values, before relying on them.
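As a minimal sketch of both points (the scanset and the return-value check), something like the following could work. ParseKeyValue is a hypothetical helper name, and the 32-byte buffers and the field widths of 31 are assumptions carried over from the question's code:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: splits "name=value" using a scanset.
// Returns true only if sscanf() converted both fields.
bool ParseKeyValue(const char *line, std::string &name, std::string &value)
{
    char nameBuf[32];
    char valueBuf[32];

    // %31[^=] reads up to 31 characters that are not '=',
    // then the literal '=' must match,
    // then %31s reads up to 31 non-whitespace characters.
    // The widths leave room for the terminating '\0'.
    if (std::sscanf(line, "%31[^=]=%31s", nameBuf, valueBuf) != 2)
        return false;

    name = nameBuf;
    value = valueBuf;
    return true;
}
```

Called with the line "PatchVersion=1.1.1.5\n", this should give name "PatchVersion" and value "1.1.1.5"; a line without '=' makes sscanf() return less than 2, so the helper reports failure instead of leaving the value uninitialized.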

%s is "greedy", i.e. it keeps reading until it hits whitespace (or newline, or EOF). The '=' character is none of these, so sscanf just carries on, matching the entire line for the first %s.
You're probably better off using (for example) strtok(), or a simple character-by-character parser.
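A simple hand-written split could look roughly like this. It uses std::string::find rather than strtok(), which is a swapped-in technique, and SplitAtEquals is a hypothetical name, not something from the question:

```cpp
#include <string>

// Hypothetical helper: splits a line at the first '='.
// Returns false if there is no '=' or the name part is empty.
bool SplitAtEquals(const std::string &line, std::string &name, std::string &value)
{
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos || eq == 0)
        return false;

    name = line.substr(0, eq);
    value = line.substr(eq + 1);

    // Drop a trailing newline or carriage return left over from a line-based read.
    while (!value.empty() && (value.back() == '\n' || value.back() == '\r'))
        value.pop_back();

    return true;
}
```

Unlike the %s conversion, this keeps whitespace inside the value, which may or may not be what you want for steam.inf.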

From the manpage of scanf, regarding %s:
Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null character ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.

%s will read characters until a whitespace is encountered. Since there are no whitespaces before/after the '=' sign, the entire string is read.

Your use of arrays is very poor C++ technique. You might use streams but if you insist on using sscanf and arrays then at least use vector to manage your memory.
You might print out exactly what is in subLine and what Msg does. Is this your own code? I have never heard of FileHandle_t. I do know that it has a method that returns a char* that presumably you have to manage.
Regular expressions are part of the Boost library and, since C++11, of the standard library (std::regex). They are fairly "standard" and you might do well to use them to parse your line.
(boost::regex or tr1::regex if you have it, VS2008 has it)
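For illustration, a regex-based version might look like the sketch below. It assumes a C++11 compiler with std::regex; MatchKeyValue and the exact pattern are my own choices, not something from the question:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: matches "name=value" with optional trailing whitespace.
bool MatchKeyValue(const std::string &line, std::string &name, std::string &value)
{
    // Group 1: everything before the first '='.
    // Group 2: the following run of non-whitespace characters.
    static const std::regex pattern(R"(([^=]+)=(\S+)\s*)");

    std::smatch m;
    if (!std::regex_match(line, m, pattern))
        return false;

    name = m[1].str();
    value = m[2].str();
    return true;
}
```

regex_match() requires the whole line to match, so a line without '=' is rejected rather than partially parsed.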

Related

String Rev function, strange behavior for out of bounds exception (c++)

I was playing with string functions and wrote the following one. Obviously I set the first character in the ret string to be written at a position that is out of bounds, but instead of an exception, I get a string that has one extra place.
std::string StringManipulations::rev(std::string s)
{
std::string ret(s.size(), ' ');
for (int i = 0; i < s.size(); i++)
{
std::string ch;
ch.push_back(s[i]);
int place = s.size() -i;
ret.replace(place,1,ch);
}
return ret;
}
I write by mistake in a position that corresponds to a place that is one larger than the original string size that I assign at the beginning of the function.
Why don't we get an error ?
s = StringManipulations::rev("abcde");
std::cout << s.size();
std::cout << s;
output is : 6 _edcba
any help ?
solved: replace() accepts a position equal to the string's size; in that case it replaces nothing and inserts ch at the end, so we get a new string with size+1.
C++ has a zero-overhead rule.
This means that no overhead, (like checking if an index is in-bounds) should be done unintentionally.
Unchecked accesses such as operator[] never throw, because C++ simply doesn't verify that the index is valid; std::string::replace() does check its position, but it only rejects positions greater than size().
For the extra character, this might have something to do with (regular) C strings.
In C, strings are arrays of type char (char*) without a defined size.
The end of a string is denoted with a null terminator.
C++ strings are backwards compatible, meaning that they have a null terminator too.
It's possible that you replaced the terminator with another character but the next byte was also a zero, meaning that you added one more char.
In addition to the information above about null terminators, another answer to your question is that the docs say replace() will only throw (std::out_of_range) if the position is greater than the string size. A position equal to the size is still valid: nothing is replaced and the new characters are inserted at the end.
string replace api
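The behaviour described in the docs can be checked with a small sketch. The helper names (AppendViaReplace, ReplaceThrows, Rev) are hypothetical, and Rev is just one possible corrected version of the function from the question:

```cpp
#include <stdexcept>
#include <string>

// replace() at pos == size() replaces zero characters and inserts at the end.
std::string AppendViaReplace(std::string s)
{
    s.replace(s.size(), 1, "x");
    return s;
}

// Returns true if replace() at the given position throws std::out_of_range.
bool ReplaceThrows(std::string s, std::string::size_type pos)
{
    try {
        s.replace(pos, 1, "x");
    } catch (const std::out_of_range &) {
        return true;
    }
    return false;
}

// Reversal with the index corrected to size() - 1 - i,
// so every write stays inside the pre-sized string.
std::string Rev(const std::string &s)
{
    std::string ret(s.size(), ' ');
    for (std::string::size_type i = 0; i < s.size(); ++i)
        ret[s.size() - 1 - i] = s[i];
    return ret;
}
```

In the question's loop, the first iteration uses position s.size(), which appends 'a'; the remaining iterations overwrite positions 4 down to 1, and position 0 keeps its initial space. That matches the observed size of 6.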

Converting a Lua function chunk to a C string

I am working on a project that takes a Lua string and converts it into a C string – not at all difficult, of course. However, I run into trouble when attempting to convert a binary representation of a function, i.e. one produced by a call to string.dump, to a C string. I am having trouble reading the entire string.
While it is not the ultimate goal of the project, consider the following simple example where I print out the characters in a string one-by-one using a C function called chars that I have registered for use in Lua:
static void chars(char* cp) {
char* pointer = cp;
while (*pointer) {
printf("%c\n", *pointer);
++pointer;
}
return;
}
static int lua_chars(lua_State* L) {
lua_len(L, 1);
size_t len = static_cast<size_t>(lua_tonumber(L, -1)) + 1;
lua_pop(L, 1);
if (len > 0) {
char* cp = static_cast<char*>(malloc(len));
strcat(cp, lua_tostring(L, 1));
chars(cp);
free(cp);
}
return 0;
}
Calling chars from a Lua script would look like this:
chars("Hello World!")
and would print out the characters one by one with each followed by a newline.
Now to the actual issue. Consider this example where I declare a function in Lua, dump it with string.dump, and then pass that string to the function chars to print out its characters individually:
local function foo()
print("foo")
return
end
local s = assert(string.dump(foo))
chars(s)
The string s in its entirety, not printed with my function chars, looks something like this:
uaS?
xV(w#=stdin#A#$#&?&?printfoo_ENV
However, chars only prints the first five bytes:
u
a
S
(Note there are supposed to be two lines of whitespace before the 'u'.)
I am almost certain that this is due to null characters within the string, which I think interferes with lua_tostring's functionality. I have come across lua_Writer for reading chunks, but I have no idea how to use/code it. How can I successfully convert the entire string on the Lua stack to a C string?
I am almost certain that this is due to null characters within the string
Yes, it's exactly because Lua strings can contain zeroes.
which I think interferes with lua_tostring's functionality.
And this is false. lua_tostring() works as intended. It's just that the strcat() you're using only copies the data up to the first zero byte.
If you need to copy the string, use memcpy, passing it both the pointer to Lua string data and Lua string length (lua_len, lua_rawlen, etc).
But just for printing you don't even need to copy anything. Pass the len variable as an argument to chars(), and check that length instead of waiting for zero byte.
The problem isn't lua_tostring but strcat, which copies until it finds a null character. Your chars function has the same problem.
That should work:
memcpy(cp, lua_tostring(L, 1), len);
chars(cp, len);
...
static void chars(char* cp, size_t len) {
for (size_t i = 0; i < len; ++i, ++cp) {
putchar(*cp);
}
}
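The difference between zero-terminated and length-based handling can be shown without Lua at all. The following is only a sketch with hypothetical helper names; in the real binding the pointer and length would come from the Lua API (for example lua_tolstring(L, 1, &len), which returns both):

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Copies exactly len bytes, embedded zero bytes included.
std::vector<char> CopyAllBytes(const char *data, std::size_t len)
{
    std::vector<char> out(len);
    if (len > 0)
        std::memcpy(out.data(), data, len);
    return out;
}

// What strcat()/strlen()-style code sees: only the bytes before the first zero.
std::size_t VisibleToCStringFunctions(const char *data)
{
    return std::strlen(data);
}
```

For a buffer that starts like a dumped chunk (an escape byte, "Lua", then a zero byte), the strlen-based view stops after four bytes, while the length-based copy keeps everything.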

How to read in only a particular number of characters

I have a small query regarding reading a set of characters from a structure. For example: A particular variable contains a value "3242C976*32" (char - type). How can I get only the first 8 bits of this variable. Kindly help.
Thanks.
Edit:
I'm trying to read in a signal:
For Ex: $ASWEER,2,X:3242C976*32
into this structure:
struct pg
{
char command[7]; // saves as $ASWEER,2,X:3242C976*32
char comma1[1]; // saves as ,2,X:3242C976*32
char groupID[1]; // saves as 2,X:3242C976*32
char comma2[1]; // etc
char handle[2]; // this is the problem, need it to save specifically each part, buts its not
char canID[8];
char checksum[3];
}m_pg;
...
When memcpy-ing the buffer into the structure, it works, but because there are no carriage returns it saves the rest of the signal in each char variable. So, there is always garbage at the end.
You could convert the hex value in canID to a float (depending on how you want to display it), e.g.
float value1 = HexToFloat(m_pg.canID); // find a conversion script for HexToFloat
CString val;
val.Format("%0.3f",value1);
The garbage values aren't actually being stored in the structure; it only displays that way because nothing (no carriage return or '\0') ends each field. So format the message however you want to and display it using the CString val.
If "3242C976*3F" is a c-string or std::string, you can just do:
const char* str = "3242C976*3F";
char first_byte = str[0];
Or with an arbitrary memory block you can do:
SomeStruct memoryBlock;
char firstByte;
memcpy(&firstByte, &memoryBlock, 1);
Both copy the first 8 bits (1 byte) from the string or arbitrary memory block just as well.
After the edit (original answer below)
Just copy by parts. In C, something like this should work (could also work in C++ but may not be idiomatic)
strncpy(m_pg.command, value, 7); // m_pg.command[7] = 0; // oops
strncpy(m_pg.comma1, value+7, 1); // m_pg.comma1[1] = 0; // oops
strncpy(m_pg.groupID, value+8, 1); // m_pg.groupID[1] = 0; // oops
strncpy(m_pg.comma2, value+9, 1); // m_pg.comma2[1] = 0; // oops
// etc
Also, you don't have space for the string terminator in the members of the structure (therefore the oopses above). They are NOT strings. Do not printf them!
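One way to avoid the "oops" is to give every member one extra byte for the terminator. The sketch below assumes the fixed layout of the example signal ($ASWEER,2,X:3242C976*32); ParsedSignal, CopyField and ParseSignal are hypothetical names, and the two comma members are simply skipped:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical variant of the struct: each field has room for a '\0'.
struct ParsedSignal
{
    char command[7 + 1];
    char groupID[1 + 1];
    char handle[2 + 1];
    char canID[8 + 1];
    char checksum[3 + 1];
};

// Copies at most n characters into dst and always terminates it.
template <std::size_t N>
void CopyField(char (&dst)[N], const char *src, std::size_t n)
{
    if (n > N - 1)
        n = N - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
}

// Assumes the fixed 23-character layout shown in the question.
bool ParseSignal(const char *msg, ParsedSignal &out)
{
    if (std::strlen(msg) < 23)
        return false;

    CopyField(out.command, msg, 7);        // "$ASWEER"
    CopyField(out.groupID, msg + 8, 1);    // skips the comma at offset 7
    CopyField(out.handle, msg + 10, 2);    // skips the comma at offset 9
    CopyField(out.canID, msg + 12, 8);
    CopyField(out.checksum, msg + 20, 3);
    return true;
}
```

With terminated members, each field can be printed with %s without dragging the rest of the signal along.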
Don't read more than 8 characters. In C, something like
char value[9]; /* 8 characters and a 0 terminator */
int ch;
scanf("%8s", value);
/* optionally ignore further input */
while (((ch = getchar()) != '\n') && (ch != EOF)) /* void */;
/* input terminated with ch (either '\n' or EOF) */
I believe the above code also "works" in C++, but it may not be idiomatic in that language
If you have a char pointer, you can just set str[8] = '\0'; Be careful though, because if the buffer is less than 8 (EDIT: 9) bytes, this could cause problems.
(I'm just assuming that the name of the variable that already is holding the string is called str. Substitute the name of your variable.)
It looks to me like you want to split at the comma, and save up to there. This can be done with strtok(), to split the string into tokens based on the comma, or strchr() to find the comma, and strcpy() to copy the string up to the comma.

Newline character in Text Document?

I wrote a pretty simple function that reads in possible player names and stores them in a map for later use. Basically in the file, each line is a new possible player name, but for some reason it seems like all but the last name has some invisible new line character after it. My print out is showing it like this...
nameLine = Georgio
Name: Georgio
0
nameLine = TestPlayer
Name: TestPlayer 0
Here is the actual code. I assume I need to be stripping something out but I am not sure what I need to be checking for.
bool PlayerManager::ParsePlayerNames()
{
FileHandle_t file;
file = filesystem->Open("names.txt", "r", "MOD");
if(file)
{
int size = filesystem->Size(file);
char *line = new char[size + 1];
while(!filesystem->EndOfFile(file))
{
char *nameLine = filesystem->ReadLine(line, size, file);
if(strcmp(nameLine, "") != 0)
{
Msg("nameLine = %s\n", nameLine);
g_PlayerNames.insert(std::pair<char*, int>(nameLine, 0));
}
for(std::map<char*,int>::iterator it = g_PlayerNames.begin(); it != g_PlayerNames.end(); ++it)
{
Msg("Name: %s %d\n", it->first, it->second);
}
}
return true;
}
Msg("[PlayerManager] Failed to find the Player Names File (names.txt)\n");
filesystem->Close(file);
return false;
}
You really need to consider using iostreams and std::string. The above code would be so much simpler if you used the C++ constructs available to you.
Problems with your code:
Why do you allocate a buffer for a single line that is the size of the whole file?
You don't clean up this buffer!
How does ReadLine fill the line buffer?
Presumably nameLine points to the beginning of the line buffer. If so, the key in the std::map is a pointer (char*) rather than a string as you were expecting, and the pointer is the same for every name! If it is different (i.e. somehow you read a line and then move the pointer along for each name), then the std::map will contain an entry per player, but you won't be able to find an entry by player name, because the comparison will be a pointer comparison rather than the string comparison you are expecting!
I suggest that you look at implementing this using iostreams, here is some example code (without any testing)
std::ifstream fin("names.txt");
std::string line;
while (std::getline(fin, line)) // getline automatically drops the newline character!
{
if (!line.empty())
{
g_PlayerNames.insert(std::pair<std::string, int>(line, 0));
}
}
// now do what you need to
No need to do any manual memory management, and std::map is typed with std::string!
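A slightly fuller sketch of the same idea is shown below. It reads from any std::istream so it can be tried without a file, and it also strips a trailing '\r', which is an assumption on my part in case the file has Windows line endings. ReadPlayerNames is a hypothetical name:

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: one non-empty line per player name, value initialised to 0.
std::map<std::string, int> ReadPlayerNames(std::istream &in)
{
    std::map<std::string, int> names;
    std::string line;

    while (std::getline(in, line)) // getline removes the '\n' itself
    {
        // A file with CRLF line endings would leave a '\r' behind.
        if (!line.empty() && line.back() == '\r')
            line.pop_back();

        if (!line.empty())
            names.insert(std::make_pair(line, 0));
    }
    return names;
}
```

Because the key is a std::string, lookups such as names.count("Georgio") compare the text, not a pointer. For a quick check the function can be fed a std::istringstream instead of a file.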
ReadLine clearly includes the newline in the data it returns. Simply check for and remove it:
char *nameLine = filesystem->ReadLine(line, size, file);
// remove any newline...
if (char* p_nl = strchr(nameLine, '\n'))
*p_nl = '\0';
(What this does is overwrite the newline character with a new NUL terminator, which effectively truncates the ASCIIZ string at that point.)
Most likely the ReadLine function also reads the newline character. I suppose your file does not have a newline at the very last line, thus you do not get a newline for that name.
But until I know what filesystem, FileHandle_t, and Msg are, it is very hard to determine where the issue could be.

Issue with char[] in VS2008 - why does strcat append to the end of an empty array?

I am passing an empty char array that I need to recursively fill using strcat(). However, in the VS debugger, the array is not empty, it's full of some weird junk characters that I don't recognise. strcat() then appends to the end of these junk characters rather than at the front of the array.
I have also tried encoded[0] = '\0' to clear the junk before passing the array, but then strcat() doesn't append anything on the recursive call.
This is the code that supplies the array and calls the recursive function:
char encoded[512];
text_to_binary("Some text", encoded);
This is the recursive function:
void text_to_binary(const char* str, char* encoded)
{
char bintemp[9];
bintemp[0] = '\0';
while(*str != '\0')
{
ascii_to_binary(*str, bintemp);
strcat(encoded, bintemp);
str++;
text_to_binary(str, encoded);
}
}
What is going on?
ps. I can't use std::string - I am stuck with the char*.
Edit: This is the junk character in the array:
ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ...
You are not initialising the array. Change:
char encoded[512];
to
char encoded[512] = "";
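strcat appends to the end of the existing string: it searches for the terminating \0, copies the new text there, and then writes a \0 at the new end position. In an uninitialised array there is no defined end, so it starts after whatever junk happens to be there.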
You should clear the destination encoded with either encoded[0]=0; or memset first.
char encoded[512];.. encoded is not initialized and will contain junk (or 0xCCCCCCCC in debug builds).
I think your problem was due to the initialization of encoded. A few comments on your program:
First, it's better to avoid a recursive function when you can do it with a loop.
Second, you should pass the size of encoded to avoid a possible overflow error (in case the resulting string is bigger than encoded).
void text_to_binary(const char* str, char* encoded)
{
char bintemp[9];
bintemp[0] = '\0';
encoded[0] = '\0';
for(const char *i = str; *i != '\0'; i++)
{
ascii_to_binary(*i, bintemp);
strcat(encoded, bintemp);
}
}
PS: I haven't tried the source code, so if there is an error, add a comment and I will correct it.
Good luck with your project.
The solution to your immediate problem has been posted already, but your text_to_binary is still inefficient. You are essentially calling strcat in a loop with always the same string to concatenate to, and strcat needs to iterate through the string to find its end. This makes your algorithm quadratic. What you should do is to keep track of the end of encoded on your own and put the content of bintemp directly there. A better way to write the loop would be
while(*str != '\0')
{
ascii_to_binary(*str, bintemp);
strcpy(encoded, bintemp);
encoded += strlen(bintemp);
str++;
}
You don't need the recursion because you are already looping over str (I believe this to be correct, as your original code will fill encoded pretty weirdly). Also, in the modified version, encoded is always pointing to the end of the original encoded string, so you can just use strcpy instead of strcat.
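Putting the end-pointer idea together with a size check gives something like the sketch below. The source of ascii_to_binary was not posted, so the version here is a hypothetical stand-in that writes eight '0'/'1' characters, and the extra encodedSize parameter is my addition:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the asker's ascii_to_binary:
// writes the 8 bits of c as '0'/'1' characters plus a terminator into out[9].
void ascii_to_binary(char c, char *out)
{
    const unsigned char u = static_cast<unsigned char>(c);
    for (int bit = 0; bit < 8; ++bit)
        out[bit] = ((u >> (7 - bit)) & 1u) ? '1' : '0';
    out[8] = '\0';
}

// Linear-time version: keeps a pointer to the current end of the output
// instead of letting strcat() search for it on every iteration.
// Returns false if encoded (encodedSize bytes) is too small.
bool text_to_binary(const char *str, char *encoded, std::size_t encodedSize)
{
    if (encodedSize == 0)
        return false;

    char *end = encoded;
    *end = '\0';                              // start with an empty string
    std::size_t remaining = encodedSize - 1;  // reserve room for the terminator
    char bintemp[9];

    for (; *str != '\0'; ++str)
    {
        if (remaining < 8)
            return false;                     // output stays terminated
        ascii_to_binary(*str, bintemp);
        std::memcpy(end, bintemp, 8);
        end += 8;
        remaining -= 8;
        *end = '\0';
    }
    return true;
}
```

With a 512-byte buffer, text_to_binary("Hi", encoded, sizeof encoded) should leave "0100100001101001" in encoded, since 'H' is 0x48 and 'i' is 0x69.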
You didn't attach the source of ascii_to_binary; let's assume that it fills the buffer with a hex dump of the char (if this is the case, it's easier to use sprintf(encoded+(i*2),"%2x",*(str+i));).
What's the point of recursively calling text_to_binary? I think this might be a problem.