`ncurses` function `wgetstr` is modifying my variables - c++

SOLUTION Apparently, the wgetstr function does not allocate a new buffer. If the second argument is called data and has size n, and you type n or more characters, wgetstr writes past the end of data (the terminating null needs a byte too) and overwrites memory that does not belong to it, such as the place where cursorY is stored. To make everything work, I declared data with char data[] = "        "; (eight spaces) and wrote wgetnstr(inputWin, data, 8);.
--------------------------------------------------------------------------------------------------------------
It seems that the ncurses function wgetstr is literally changing the values of my variables. In a function called playGame, I have a variable called cursorY (of type int) which is adjusted whenever I press the up- or down-arrow on my keyboard (this works fine).
Please take a look at this code (inputWin is of type WINDOW*):
mvprintw(0, 0, (to_string(cursorY)).c_str());
refresh();
usleep(500000);
wgetstr(inputWin, data);
mvprintw(0, 0, (to_string(cursorY)).c_str());
refresh();
usleep(500000);
Suppose I move the cursor to the 6th row and then press Enter (which causes this piece of code to be executed). There are two things I can do:
Input just 1 character. After both refresh calls, the value 6 is shown on the screen (at position (0, 0)).
Input 2 or more characters. In this case, after the first refresh call I simply get 6, but after the second, I magically get 0.
The first two lines after the code above are
noecho();
_theView -> _theActualSheet -> putData(cursorY-1, cursorX/9 - 1, data);
(don't worry about the actual parameters: the math regarding them checks out). While I'm in putData, I get a segmentation fault, and gdb says that the first argument of putData was -1, so cursorY had to be 0 (the first two arguments of putData are used to access a two-dimensional array using SheetCells[row][column], where row and column are, respectively, the first and second formal parameters of putData).
Clearly, wgetstr modifies the value of cursorY. The name of the latter variable doesn't matter: changing it to cursorrY or something weird like monkeyBusiness (yes I've tried that) doesn't work. What sort of works is replacing the piece of code above with
mvprintw(0, 0, (to_string(cursorY)).c_str());
refresh();
usleep(500000);
int a = cursorY;
wgetstr(inputWin, data);
cursorY = a;
mvprintw(0, 0, (to_string(cursorY)).c_str());
refresh();
usleep(500000);
In both cases I see 6 at the top-left corner of my screen. However, now the string is acting all weird: when I type in asdf as my string, then move to the right (i.e., I press the right key on my keyboard), then type in asdf again, I get as^a.
So basically, I would like to know three things:
Why the HELL is wgetstr changing my variables?
Why is it only happening when I input more than 1 character?
What seems to be wrong with wgetstr in general? It seems terrible at handling input.
I could try other things (like manually reading in characters and then concatenating data with them), but wgetstr seems perfect for what I want to do, and there is no reason I should switch here.
Any help is much appreciated. (Keep in mind: I specifically want to know why the value of cursorY is being changed. If you would recommend not using wgetstr and have a good alternative, please tell me, but I'm most interested in knowing why cursorY is being altered.)
EDIT The variable data is of type char[] and declared like so: char data[] = "". I don't "clear" this variable (i.e., remove all "letters"), but I don't think this makes any difference, as I think wgetstr just overwrites the whole variable (or am I terribly wrong here?).

The buffer you provide for the data, data, is defined as being a single character long (only the null-terminator will be there). This means that if you enter any input of one or more characters, you will be writing outside the space provided by data, and thus overwrite something else. It looks like cursorY is the lucky variable that got hit.
You need to make sure that data is at least big enough to handle all inputs. And preferably, you should switch to some input function (like wgetnstr) that will let you pass the size of the buffer, otherwise it will always be possible to crash your application by typing enough characters.

wgetstr expects to write the received characters to a preallocated buffer, which should be at least as long as the expected input string. It does not allocate a new buffer for you!
What you've done is provide it with a single byte buffer, and are writing multiple bytes to it. This will stomp over the other variables you've defined in your function after data, such as cursorY, regardless of what it is called. Any changes to variables will in turn change the string that was read in:
int a = cursorY;
wgetstr(inputWin, data);
cursorY = a;
will write an int value into your string, which is why it is apparently getting corrupted.
What you should do is make data long enough for the anticipated input, and ideally use something like wgetnstr to ensure you don't walk off the end of the buffer and cause damage.
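As a rough sketch of the contract both answers describe: wgetnstr(win, buf, n) stores at most n characters followed by a terminator, so buf must hold n + 1 bytes. The helper boundedRead below is a hypothetical stand-in with the same contract, written against the standard library so that it runs without a terminal; readField and its nine-byte buffer are likewise only illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for wgetnstr(win, buf, n): copies at most n characters
// of `typed` into buf and always null-terminates, so buf needs n + 1 bytes.
// Returns the number of characters stored.
std::size_t boundedRead(const std::string& typed, char* buf, std::size_t n) {
    assert(buf != nullptr);
    const std::size_t count = typed.size() < n ? typed.size() : n;
    std::memcpy(buf, typed.data(), count);
    buf[count] = '\0';
    return count;
}

// Reads one field into a fixed buffer that is sized for the limit it passes.
std::string readField(const std::string& typed) {
    char data[9] = "";  // room for 8 characters plus the terminator
    boundedRead(typed, data, sizeof(data) - 1);
    return data;        // excess input was dropped, nothing was overwritten
}
```

With ncurses itself, the matching call would be wgetnstr(inputWin, data, sizeof(data) - 1);.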

Related

Why on earth is my file reading function placing null-terminators where excess CR LF carriages should be?

Today I tried to put together a simple OpenGL shader class, one that loads text from a file, does a little bit of parsing to build a pair of vertex and fragment shaders according to some (pretty sweet) custom syntax (for example, writing ".varying [type] [name];" would allow you to define a varying variable in both shaders while only writing it once, same with ".version",) then compiles an OpenGL shader program using the two, then marks the shader class as 'ready' if and only if the shader code compiled correctly.
Now, I did all this, but then encountered the most bizarre (and frankly kinda scary) problems. I set everything up, declared a new 'tt::Shader' with some file containing valid shader code, only to have it tell me that the shader was invalid but then give me an empty string when I asked what the error was (which means OpenGL gave me an empty string as that's where it gets it from.)
I tried again, this time with obviously invalid shader code, and while it identified that the shader was invalid, it still gave me nothing in terms of what the error was, just an empty string (from which I assumed that obviously the error identification portion of it was also just the same as before.)
Confused, I re-wrote both shaders, the valid and invalid one, by hand as a string, compiling the classes again with the string directly, with no file access. Doing this, the error vanished, the first one compiled correctly, and the second one failed but correctly identified what the error was.
Even more confused, I started comparing the strings from the files to those I wrote myself. It turns out the former were a tad longer than the latter, despite printing the same. After doing a bit of counting, I realised that the extra characters must be Windows CR LF carriage returns that got cut off in the importing process.
To test this, I took the hand-written strings, inserted carriage returns where they would be cut off, and ran my string comparison tests again. This time, it evaluated their lengths to be the same, but also told me that the two were still not equal, which was quite puzzling.
So, I wrote a simple for-loop to iterate through the characters of the two strings and print them next to one another, cast to integers so I could see their numeric values. I ran the program, looked through the (quite lengthy) list, and came to a very insightful though even less clarifying answer: the hidden characters were in the right places, but they weren't carriage returns ... they were null-terminators!
Here's the code for the file reading function I'm using. It's nothing fancy, just standard library stuff.
// Attempts to read the file with the given path, returning a string of its contents.
// If the file could not be found and read, an empty string will be returned.
// File strings are built by reading the file line-by-line and assembling a single string with new lines placed between them.
// Given this line-by-line method, take note that it will copy no more than 4096 bytes from a single line before moving on.
inline std::string fileRead(const std::string& path) {
if (!tt::fileExists(path))
return "";
std::ifstream a;
a.open(path);
std::string r;
const tt::uint32 _LIMIT = 4096;
char r0[_LIMIT];
tt::uint32 i = 0;
while (a.good()) {
a.getline(r0, _LIMIT);
if (i > 0)
r += "\n";
i++;
r += std::string(r0, static_cast<tt::uint32>(a.gcount()));
}
// TODO: Ask StackOverflow why on earth our file reading function is placing null characters where excess carriages should go.
for (tt::uint32 i = 0; i < r.length(); i++)
if (r[i] == '\0')
r[i] = '\r';
a.close();
tt::printL("Reading file '" + path + "' ...");
return r;
}
If y'all could take a read and tell me what the hell is going on with it, that'd be awesome, as I'm at a total loss for what its doing to my string to cause this.
Lastly, I do get why the null-terminators didn't show up to me but did for OpenGL: the latter was using C-strings, while I was doing everything with std::string objects, which store things based on length given that they're pretty much just fancy std::vector objects.
Read the documentation for the std::string constructors. The constructor std::string(const char*, size_t n) creates a string of exactly n characters copied from the buffer, regardless of its contents, so the result may contain one or more embedded null characters. Note that the size of a std::string does not include the terminating null character (so str[str.size()] == '\0' always holds).
So the code simply copies the null character from the output buffer of the getline function.
Why would it do that? See the documentation for gcount(): it returns the number of characters extracted by the last unformatted input operation. That count includes the extracted '\n' delimiter, which getline does not store but replaces with '\0' in the buffer. So the count is exactly one more than the number of characters you want to pass to the constructor.
So to fix it, simply replace that line with:
r += std::string(r0, static_cast<tt::uint32>(a.gcount()-1));
Be aware that subtracting 1 is only correct when a delimiter was actually extracted. If the last line has no trailing newline, or a read extracts nothing at end of file, gcount() does not count a delimiter and the subtraction is off by one.
Or you could simply have used std::getline() with a std::string as the destination instead of a buffer, and none of this would have happened.
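A minimal sketch of the std::getline approach suggested above. The function name readAll and the choice to take a std::istream& (so the same code works for files and in-memory text) are assumptions for illustration, as is the stripping of a trailing '\r', which only matters when CR LF line endings reach the program untranslated.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads every line from `in` and joins the lines with '\n'.
// std::getline discards the delimiter and sizes the string itself,
// so there is no fixed buffer and no gcount() arithmetic.
std::string readAll(std::istream& in) {
    std::string result;
    std::string line;
    bool first = true;
    while (std::getline(in, line)) {
        // Assumption: drop a trailing CR left over from a CR LF line ending.
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        if (!first) {
            result += '\n';
        }
        result += line;
        first = false;
    }
    return result;
}

// Small self-check: CR LF input comes back with '\n' separators and no '\0'.
void demoReadAll() {
    std::istringstream in("a\r\nb\r\n");
    const std::string text = readAll(in);
    assert(text == "a\nb");
    assert(text.find('\0') == std::string::npos);
}
```

For a file, the same function can be called with a std::ifstream opened on the path.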

String array unusable after setting it to 0 using memset

I have a class property which is an array of strings (std::string command[10]). When I assign a string value to it, the program stops executing. As you can see below, I have a string variable tempCommandStr which I assign to my property. I don't know what the error could be, but the print statement after the assignment is never executed, while the one preceding it is.
//Declared in class header
std::string command[10];
// Part of function which is causing problem.
string tempCommandStr(commandCharArray);
printf("%s\n", tempCommandStr.c_str()); // Prints fine.
this->command[i] = tempCommandStr; // Something goes wrong here. i is set to some correct value, i.e. not out of range.
printf("%s\n", this->command[i].c_str()); // Never prints. Also program stops responding.
// I noticed that getting any value from the array also stops the execution.
// Just the following statement would stop the program too.
printf("%s\n", this->command[i].c_str());
It's not just this property, I also have another array which has the same problem. What could be causing this? What's actually going wrong (look at edit)? Is there another better way to do this?
I'm running the program on an MBED so I've limited debugging options.
EDIT:
I found the problem: I was clearing the array before use, to remove any previous values, with memset(command, 0, sizeof(command));. This was causing the problem. Now I call clear() on each item in the array as follows, which fixed the execution problem.
for (int i = 0; i < sizeof(command)/sizeof(*command); i++){
command[i].clear();
}
Question: Why does setting the string array to 0 using memset make it unusable?
Why does setting the string array to 0 using memset make it unusable?
Because you're obliterating the values held in the string class, overwriting them all with 0s. A std::string has pointers to the memory where it's storing the string, character count information, etc. If you memset() all that to 0, it's not going to work.
You're coming from the wrong default position. Types where 'zeroing out' the memory is a meaningful (or even useful) operation are special; you should not expect any good to come from doing such a thing unless the type was specifically designed to work with that.
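To make the point above concrete, here is a small sketch, under the assumption that the array is a plain std::string[N] as in the question. The names resetCommands and demoReset are hypothetical. The static_assert records why memset is the wrong tool: it is only meaningful for trivially copyable types, and std::string is not one in any mainstream standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// memset-style zeroing is only meaningful for trivially copyable types.
// std::string manages its own storage, so it does not qualify.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string must not be cleared with memset");

// Resets every element through the string's own interface.
template <std::size_t N>
void resetCommands(std::string (&commands)[N]) {
    for (std::string& s : commands) {
        s.clear();
    }
}

// Usage sketch: the array stays usable after the reset.
void demoReset() {
    std::string command[10];
    command[3] = "forward";
    resetCommands(command);
    assert(command[3].empty());
    command[3] = "back";
    assert(command[3] == "back");
}
```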

Using, StringCchCat

I'm trying to use the StringCchCat function:
HRESULT X;
LPWSTR _strOutput = new wchar_t[100];
LPCWSTR Y =L"Sample Text";
X = StringCchCat(_strOutput, 100, Y);
But for some reason I keep getting the "E_INVALIDARG One or more arguments are invalid." error in X. _strOutput is also full of random characters.
This is actually part of a bigger program. What I'm trying to do is concatenate the "Sample Text" to the empty _strOutput variable. This is inside a loop, so it is going to happen multiple times. For this particular example it will be as if I'm assigning the text "Sample Text" to _strOutput.
Any Ideas?
If it's part of a loop, a simple *_strOutput = 0; will fix your issue.
If you're instead trying to copy a string, not concatenate it, there's a special function that does this for you: StringCchCopy.
Edit: As an aside, if you're using the TCHAR version of the API (and you are), you should declare your strings as TCHAR arrays (ie LPTSTR instead of LPWSTR, and _T("") instead of L""). This would keep your code at least mildly portable.
String copy/concat functions look for null terminators to know where to copy/concat to. You need to initialize the first element of _strOutput to zero so the buffer is null terminated, then you can copy/concat values to it as needed:
LPWSTR _strOutput = new wchar_t[100];
_strOutput[0] = L'\0'; // <-- add this
X = StringCchCat(_strOutput, 100, Y);
I'm writing this answer to notify you (so you see the red 1 at the top of any Stack Overflow page) because you had the same bug yesterday (in your message box) and I now realize I neglected to say this in my answer yesterday.
Keep in mind that the new[] operator on a built-in type like WCHAR or int does NOT initialize the data at all. The memory you get will have whatever garbage was there before the call to new[], whatever that is. The same happens if you say WCHAR x[100]; as a local variable. You must be careful to initialize data before using it. Compilers are usually good at warning you about this. (I believe C++ objects have their constructors called for each element, so that won't give you an error... unless you forget to initialize something in the class, of course. It's been a while.)
In many cases you'll want everything to be zeroes. The '\0'/L'\0' character is also a zero. The Windows API has a function ZeroMemory() that's a shortcut for filling memory with zeroes:
ZeroMemory(array, size of array in bytes)
So to initialize a WCHAR str[100] you can say
ZeroMemory(str, 100 * sizeof (WCHAR))
where the sizeof (WCHAR) turns 100 WCHARs into its equivalent byte count.
As the other answers say, simply setting the first character of a string to zero will be sufficient for a string. Your choice.
Also just to make sure: have you read the other answers to your other question? They are more geared toward the task you were trying to do (and I'm not at all knowledgeable on the process APIs; I just checked the docs for my answer).
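The initialization point made in these answers can be sketched portably. This is an illustration under stated assumptions, not the Windows API: appendText is a hypothetical stand-in for StringCchCat built on std::wcsncat, and makeEmptyBuffer is a hypothetical helper. What it relies on is standard C++: new wchar_t[n]() (note the trailing parentheses) value-initializes the array to zeros, whereas new wchar_t[n] leaves it indeterminate.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <memory>
#include <string>

// new wchar_t[n]() value-initializes the array, so every element is L'\0'
// and the buffer already holds a valid empty string.
std::unique_ptr<wchar_t[]> makeEmptyBuffer(std::size_t n) {
    return std::unique_ptr<wchar_t[]>(new wchar_t[n]());
}

// Portable stand-in for StringCchCat (an assumption for this sketch):
// appends src to dest without writing past `capacity` elements.
void appendText(wchar_t* dest, std::size_t capacity, const wchar_t* src) {
    assert(dest != nullptr && src != nullptr && capacity > 0);
    const std::size_t used = std::wcslen(dest);  // requires a terminated dest
    assert(used < capacity);
    // wcsncat writes at most (capacity - used - 1) characters plus L'\0'.
    std::wcsncat(dest, src, capacity - used - 1);
}
```

Because the buffer starts out terminated, the first append in a loop behaves like a copy and later appends extend the text.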

Why Is This Pointer Being Copied Incorrectly? Why Does The Segmentation Fault Not Occur Earlier?

I am debugging a program that reads data from a binary file and puts it into the fields of a TaggerDataUnigram object, TaggerDataUnigram being a class derived from TaggerData. All the reading operations read a number of data objects specified in the file and put the objects into fields of TaggerData. Therefore, I defined a function ReadForNumberToRead that takes a file and a Reader* as arguments, Reader being a base class for functors that define how to read the data from the file. Each Reader derivative takes a TaggerData* as an argument and stores the value of the pointer as a member. Unfortunately, TaggerData uses getters and setters, but the getters return references to the fields. So, for example, OpenClassReader accesses TaggerData::open_class through tagger_data_pointer_->getOpenClass().
Example: ForbiddingRuleReader's Constructor:
ForbiddingRuleReader::ForbiddingRuleReader(
FILE*& tagger_data_input_file_reference,
TaggerData* tagger_data_pointer)
: Reader(tagger_data_input_file_reference, tagger_data_pointer) {}
tagger_data_pointer_ is a protected member of Reader.
Reader::Reader(FILE*& tagger_data_input_file_reference,
TaggerData* tagger_data_pointer)
: TaggerDataFileInputOutput(tagger_data_input_file_reference),
tagger_data_pointer_(tagger_data_pointer) {} // tagger_data_pointer_ is initialized.
. . . and the identical constructor of ArrayTagReader:
ArrayTagReader::ArrayTagReader(FILE*& tagger_data_input_file_reference,
TaggerData* tagger_data_pointer)
: Reader(tagger_data_input_file_reference, tagger_data_pointer) {}
Their usages are likewise the same:
void TaggerDataUnigram::ReadTheForbiddingRules(
FILE*& unigram_tagger_data_input_file_reference) {
ForbiddingRuleReader forbidding_rule_reader(
unigram_tagger_data_input_file_reference,
this);
ReadForNumberToRead(unigram_tagger_data_input_file_reference,
&forbidding_rule_reader);
}
[. . .]
void TaggerDataUnigram::ReadTheArrayTags(
FILE*& unigram_tagger_data_input_file_reference) {
ArrayTagReader array_tag_reader(unigram_tagger_data_input_file_reference,
this);
ReadForNumberToRead(unigram_tagger_data_input_file_reference,
&array_tag_reader);
}
Needless to say, the TaggerDataUnigram object is not going out of scope.
OpenClassReader and ForbiddingRuleReader both work perfectly; they store a copy of the file and TaggerData* as fields and successively read data from the file and put it into its respective field in TaggerData. The problem arises when the ArrayTagReader is constructed. Despite sharing an identical constructor and being used the same way as ForbiddingRuleReader, something goes terribly wrong--tagger_data_pointer_ does not point to the same location in memory as the TaggerData* tagger_data_pointer the object was constructed with!
Breakpoint 1, ArrayTagReader::ArrayTagReader (this=0x7fffffffd640, tagger_data_input_file_reference=#0x7fffffffd720: 0x62a730, tagger_data_pointer=0x7fffffffd8c0)
at array_tag_reader.cc:10
10 : Reader(tagger_data_input_file_reference, tagger_data_pointer) {}
(gdb) print tagger_data_pointer
$1 = (TaggerData *) 0x7fffffffd8c0 <----------
(gdb) continue
Continuing.
Breakpoint 2, ArrayTagReader::operator() (this=0x7fffffffd640) at array_tag_reader.cc:12
12 void ArrayTagReader::operator()() {
(gdb) print tagger_data_pointer_
$2 = (TaggerData *) 0x7fffffffd720 <----------
In both OpenClassReader and ForbiddingRuleReader, tagger_data_pointer_ is equal to tagger_data_pointer.
Strangely, errors do not result immediately, even though the pointer is clearly invalid.
Breakpoint 3, ArrayTagReader::operator() (this=0x7fffffffd640) at array_tag_reader.cc:12
12 void ArrayTagReader::operator()() {
(gdb) print *tagger_data_pointer_
$3 = {_vptr.TaggerData = 0x62a730, open_class = std::set with 0 elements, forbid_rules = std::vector of length 275736, capacity -17591907707330 = {{tagi = -1972060027,
[. . .]
However, upon the first call of TagIndexReader::operator(), the program encounters a segmentation fault, specifically SIGSEGV. It's no surprise; though TagIndexReader's tagger_data_pointer_ is valid, a great part of the TaggerDataUnigram object was compromised.
Breakpoint 4, TagIndexReader::operator() (this=0x7fffffffd650) at tag_index_reader.cc:7
7 void TagIndexReader::operator()() {
(gdb) print tagger_data_pointer_
$16 = (TaggerData *) 0x7fffffffd8c0 <---------- This is the correct value.
(gdb) print *tagger_data_pointer_
$17 = {_vptr.TaggerData = 0x41e5b0 <vtable for TaggerDataUnigram+16>,
open_class = std::set with 6467592 elements<error reading variable: Cannot access memory at address 0x5200000051>,
Why is tagger_data_pointer being copied incorrectly? Why does the program not encounter a segmentation fault immediately after trying to write to invalid memory? How can I resolve this issue?
Thank you for your time.
Update:
These might be useful:
void ArrayTagReader::operator()() {
std::wstring array_tag = Compression::wstring_read(
tagger_data_file_reference_);
tagger_data_pointer_->getArrayTags().push_back(array_tag);
}
void ReadForNumberToRead(
FILE* tagger_data_input_file_reference,
Reader* pointer_to_a_reader) {
for (int unsigned number_to_read =
Compression::multibyte_read(tagger_data_input_file_reference);
number_to_read != 0;
--number_to_read) {
pointer_to_a_reader->operator()();
}
}
Update:
Somehow, I missed the declaration of tagger_data_pointer_ in ArrayTagReader; making the pointers const generated the compiler error that brought this to my attention. What I still don't understand is why:
The compiler didn't complain about the use of an uninitialized pointer.
The program did not encounter a segmentation fault when trying to modify e.g. tagger_data_pointer_->getArrayTags().
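The situation described in this update, a derived class redeclaring a member that its base class already has, can be sketched as follows. All types here are simplified, hypothetical versions of the ones in the question, and the stray member is initialized to nullptr only so that the demo is deterministic; in the question it was left uninitialized.

```cpp
#include <cassert>

struct TaggerData {
    int value = 0;
};

class Reader {
public:
    explicit Reader(TaggerData* p) : tagger_data_pointer_(p) {}
    TaggerData* basePointer() const { return tagger_data_pointer_; }

protected:
    TaggerData* tagger_data_pointer_;  // the member the constructor sets
};

class ShadowingReader : public Reader {
public:
    explicit ShadowingReader(TaggerData* p) : Reader(p) {}

    // Unqualified lookup finds the derived member first, so this returns
    // the stray pointer, not the one Reader's constructor initialized.
    TaggerData* visiblePointer() const { return tagger_data_pointer_; }

private:
    // Stray redeclaration that hides Reader::tagger_data_pointer_.
    TaggerData* tagger_data_pointer_ = nullptr;
};

void demoShadowing() {
    TaggerData data;
    ShadowingReader reader(&data);
    assert(reader.basePointer() == &data);       // base copy is correct
    assert(reader.visiblePointer() == nullptr);  // derived code sees the other one
}
```

Hiding a base member is legal C++, so the compiler accepts it silently unless a warning such as GCC's -Wshadow is enabled. As for the missing crash: an uninitialized pointer holds whatever bytes were already in that memory, and the gdb output shows 0x7fffffffd720, which looks like a stack address. Writes through it would then land in mapped memory and corrupt it rather than fault right away.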
"tagger_data_pointer_ does not point to the same location in memory as the TaggerData* tagger_data_pointer the object was constructed with"
That generally means the value has been overwritten. A very common cause is a buffer overflow in the preceding field or, less commonly, an underflow in the following field. It also explains why this is a problem that occurs in only one of your two classes; the other class has other neighbors. Still, not all overwrites are buffer over/underflows. Invalid typecasting is another possible problem.
Since you don't mean to change the pointer, do make it const. A second debugging technique is to replace the field with an array of 3 identical copies. Create a function that checks whether all three are the same, throws if not, and otherwise returns the single value. In the places where you dereferenced the pointer, you now call this check function. This gives you a good chance of detecting the exact nature of the change. Even fancier schemes add extra padding data with known values.
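One possible shape for the three-copies check described above. The class name GuardedPointer and the choice of std::logic_error are assumptions made for this sketch; the answer does not prescribe them.

```cpp
#include <cassert>
#include <stdexcept>

// Debugging aid: stores the same pointer three times and refuses to hand it
// out if the copies no longer agree, which indicates a stray memory write.
template <typename T>
class GuardedPointer {
public:
    explicit GuardedPointer(T* p) : copies_{p, p, p} {}

    // Returns the pointer, or throws if any copy has been overwritten.
    T* get() const {
        if (copies_[0] != copies_[1] || copies_[1] != copies_[2]) {
            throw std::logic_error("guarded pointer was overwritten");
        }
        return copies_[0];
    }

private:
    T* copies_[3];
};

void demoGuardedPointer() {
    int value = 42;
    GuardedPointer<int> guarded(&value);
    assert(guarded.get() == &value);
    assert(*guarded.get() == 42);
}
```

Replacing each tagger_data_pointer_-> with a call to get() would turn silent corruption into an exception at the first use after the overwrite.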
Despite sharing an identical constructor and being used the same way as ForbiddingRuleReader
I'm not sure why you think these are important, but I can tell you that according to the C++ standard, these have absolutely no relevance concerning whether two types have the same memory layout or whether one can reinterpret_cast (or the moral equivalent) between them.
I was not able to read your code properly as everything but the "UPDATE" is extremely messy and very hard to read for anyone but the author. The UPDATE part seems OK. So I'll just drop by a few tips on copying using pointers (as I recently saw that many people make these mistakes) and maybe it helps.
Make sure you're not just copying from or to a memory location that is not "marked" as allocated. In other words, if you have a pointer and you're just copying data in array to memory location it points to, nothing stops your program or other programs currently running on the computer to modify that area. You first allocate space (using new, malloc etc.) and then you can copy from/to it.
type *p = new type[size];
Even if you've satisfied point 1, make sure copied space doesn't exceed size.
Advice on the question, by comments on it (I can't comment ATM)...
You may be a really good programmer. But you'll make mistakes. Meaning you'll have to find them. Meaning you should keep your code tidy. But there's a far more crucial reason for being tidy. People reading your code don't really know where everything "should" be. For them, reading messy code is mission impossible. That's important because someone might have to continue your work on your code for the company. Or if you're looking for help from other programmers, like you're doing right now, you need people to get you.
An indentation should be 1 tab or 4 spaces (you can't use tab on StackOverflow), for every sub-block in your code (covered with { }), unless if the block is empty.
If an instruction is continuing in the next row because of the length, it also deserves an indent. In this case you can also add additional tabs or spaces to make everything look nice. For example, if you have an equation long enough to be separated in 3 rows, you can make every row start from '=' of the first row.
"UPDATE" section looks much better than the rest, but you should still use 4-space indent instead of 2-space indent.

remove escape characters from a char

I've been working with this for about 2 days now. I'm stuck, with a rather simple annoyance, but I'm not capable of solving it.
My program basically receives a TCP connection from a PHP script, and the message which is sent is stored in char buffer[1024];.
This buffer variable contains a unique key, which is being compared to a char key[1024] = "supersecretkey123";
The problem itself is that these two never compare equal, no matter what I do.
I've been printing the buffer and key variables out just above each other and by the look of it they are 100% identical. However, my equality test still fails.
if(key == buffer) { // do some thing here etc }
So then I started searching the internet for some information on what could be wrong. I later realized that it might be some escape characters getting in the way. But I'm not capable of printing them, removing them or even making sure they are there. So that's why I'm stuck, out of ideas on how to make these equal when the buffer variable matches the key variable.
The key does not change unless the declaration of the key is modified manually. The program itself is receiving the information and sending back information "correctly".
Thanks.
If you're using null-terminated strings, use the proper API: strcmp and its variants.
Additionally, the size in the declaration char key[1024] = "supersecretkey123"; is not needed. The compiler will not shrink the array, so the unused bytes are simply wasted; writing char key[] = "supersecretkey123"; sizes it to fit the literal.
If you are using C++, use std::string instead of char[]. You cannot compare two char arrays with == the way you are trying to: both arrays decay to pointers, so the comparison checks their addresses, not their contents. With std::string, == compares the contents.
If it's somehow mandatory to use char[] in your case, use strcmp.
Try with if(!strncmp(key,buffer,1024)). See this reference on strncmp.
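A minimal sketch combining the advice above, assuming buffer is null-terminated and that the sender appends a line ending to the key (the "escape characters" the question suspects). Both helper names, trimLineEnd and keyMatches, are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Assumption: the only unwanted characters are trailing '\r' and '\n'.
std::string trimLineEnd(std::string s) {
    while (!s.empty() && (s.back() == '\n' || s.back() == '\r')) {
        s.pop_back();
    }
    return s;
}

// Compares contents rather than addresses, ignoring a trailing line ending
// in the received buffer.
bool keyMatches(const char* key, const char* buffer) {
    assert(key != nullptr && buffer != nullptr);
    return trimLineEnd(buffer) == key;
}
```

To check whether such characters are really present, printing each byte of buffer as an integer makes them visible, since '\r' is 13 and '\n' is 10.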