Variable assignment overrides value of other variable - c++

I've been trying to learn a bit about socket programming in C++ and so have been developing a basic IRC bot. I've got it connecting and working, but I'm having an issue determining who has joined the channel (i.e. was the JOIN command for the bot itself, for a random user, or for me).
I'm using CLion on Windows 7.
I've got the following code:
//determine if message is JOIN command
if(regex_search(line, match, std::regex(R"(:([^!]*)!.*?JOIN)")))
{
//get the nickname of the bot (details.getNickName() shown in next code segment)
const char *trueNick = details.getNickName();
//reformat nickname because the following is returned from the method alone
//0x10042d0f9 <_ZStL6ignore+119> "CPlusPlusBotTest"
const char *nick = string(details.getNickName()).c_str();
//get the name of the user who joined
const char *name = match.str(1).c_str();
//debugging
std::cout << name << " - name\n";
std::cout << nick << " - nick\n";
//might not be the correct way to compare the two? but I'll sort that out later
if(name != nick)
{
//welcome the user
char param[1024];
sprintf(param, "%s :Hello, %s", details.getChannel(), name);
sendData("PRIVMSG", param);
}
}
I am unsure why I get the excess "stuff" (I have no idea what it is) from my getter, as it's simply a case of returning a private variable:
const char* BotDetails::getNickName() { return nickName; }
Regardless, that's not my issue given I can get rid of it (despite it possibly being rather hacky).
My issue is that when I connect to the channel for testing, and I set a breakpoint on the line assigning trueNick so I can see what happens as I step through the program, the following occurs:
1) trueNick is assigned the value: 0x10042d0f9 <_ZStL6ignore+119> "CPlusPlusBotTest"
2) nick is assigned the value: "CPlusPlusBotTest"
3) name is assigned the value: "Seanharrs" and nick is assigned the value: "Seanharrs"
This means when my debug statements run, nick and name are the same value. I am not sure why nick is being reassigned to be the same value as name, this should not occur. It happens every time as well, not just for my name. I tried using char arrays and strings instead but to no avail. I also find it strange that trueNick is never affected, it's only these two variables. Any help is appreciated (even if it's just an alt/better way to check this rather than a fix, because it may well be just an oddity on my end that nobody else experiences).

This line causes undefined behavior:
const char *nick = string(details.getNickName()).c_str();
It constructs a temporary string object and takes a pointer to its internal data. Because the string is a temporary, it is destroyed at the end of that statement, so the pointer is left dangling and any later read through it is undefined behavior.
EDIT:
It turned out that OP misunderstood the additional information displayed by the debugger and interpreted it as the value of the variable. After clarifying that, there is no need for the "conversion" which causes undefined behavior and the code can just be:
const char *nick = details.getNickName();
const std::string name = match.str(1); // keep a named copy: match.str(1) returns a temporary
if( name == nick )
{
//....
}
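As a minimal sketch of the same comparison without any raw pointers into temporaries, the joined nickname can be kept in a std::string. The helper names joinedNick and shouldWelcome are hypothetical and not part of OP's bot; the regular expression is the one from the question.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the nickname from an IRC JOIN line,
// or an empty string if the line is not a JOIN.
std::string joinedNick(const std::string& line) {
    static const std::regex joinPattern(R"(:([^!]*)!.*?JOIN)");
    std::smatch match;
    if (std::regex_search(line, match, joinPattern)) {
        return match.str(1);  // returned by value, so the caller owns the characters
    }
    return std::string();
}

// Hypothetical helper: true when somebody other than the bot joined.
bool shouldWelcome(const std::string& line, const char* botNick) {
    const std::string name = joinedNick(line);
    return !name.empty() && name != botNick;  // compares contents, not addresses
}
```

With something like this, the welcome branch in OP's code could be guarded by shouldWelcome(line, details.getNickName()).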
The example of undefined behavior is still shown below.
EXAMPLE CODE FOR THE UNDEFINED BEHAVIOR
Consider this code:
#include <iostream>
#include <string>
using namespace std;
int main() {
const char t[] = "0x10042d0f9 <_ZStL6ignore+119> \"CPlusPlusBotTest\"";
const char* pTrue = t;
const char* p = std::string(t).c_str();
std::cout << pTrue << std::endl;
std::cout << p << std::endl;
return 0;
}
it will output (rather: it may output):
0x10042d0f9 <_ZStL6ignore+119> "CPlusPlusBotTest"
0x10042d0f9 <_ZStL6ignore+119> "CPlusPlusBotTest"
(ideone.com used for this example and gave above output)
So from this you might think it was okay. (note however: I don't get the conversion mentioned by OP).
Now consider this code:
#include <iostream>
#include <string>
using namespace std;
int main() {
const char tdemo[] = "Some demo text"; // Added this line
const char t[] = "0x10042d0f9 <_ZStL6ignore+119> \"CPlusPlusBotTest\"";
const char* pTrue = t;
const char* p = std::string(t).c_str();
const char* pdemo = std::string(tdemo).c_str(); // Added this line
std::cout << pTrue << std::endl;
std::cout << p << std::endl;
return 0;
}
it will output (rather: it may output)
0x10042d0f9 <_ZStL6ignore+119> "CPlusPlusBotTest"
Some demo text
(ideone.com used for this example and gave above output)
As you can see, the text that p points to changed "unexpectedly". It changed because p was already dangling: it pointed to memory that had been freed when the temporary string was destroyed. The extra line
const char* pdemo = std::string(tdemo).c_str();
allocated a new temporary string, and the allocator happened to reuse that freed memory, so the characters seen through p changed.
In other words - you have undefined behavior.
My guess is that your problem is inside details.getNickName();
It seems to me that the pointer returned is not pointing to the same text every time. Maybe it has an initialization problem, so that it returns the wrong value the first time and correct values afterwards.
The line causing undefined behavior cannot perform the conversion claimed by OP, so it must be the return value of the function that changes.

Related

Regex stops working at 23 characters, but only if I pass in a string literal

I've been trying to narrow down this very strange behavior that I noticed. Here's the code.
#include <iostream>
#include <regex>
struct Scanner {
std::string::const_iterator read_head;
std::string::const_iterator eof;
Scanner(std::string const& program) {
read_head = program.cbegin();
eof = program.cend();
}
};
bool scan(Scanner const& scanner) {
using std::regex_constants::match_continuous;
static std::smatch match;
std::regex regex = std::regex("a+");
return std::regex_search(scanner.read_head, scanner.eof, match, regex, match_continuous);
}
int main() {
std::string str1 = "aaaaaaaaaaaaaaaaaaaaaa"; // 22 a's
std::string str2 = "aaaaaaaaaaaaaaaaaaaaaaa"; // 23 a's
Scanner s1(str1);
Scanner s2(str2);
Scanner s3("aaaaaaaaaaaaaaaaaaaaaa"); // 22 a's
Scanner s4("aaaaaaaaaaaaaaaaaaaaaaa"); // 23 a's
bool token1_found = scan(s1);
bool token2_found = scan(s2);
bool token3_found = scan(s3);
bool token4_found = scan(s4);
std::cout << std::boolalpha << token1_found << std::endl;
std::cout << std::boolalpha << token2_found << std::endl;
std::cout << std::boolalpha << token3_found << std::endl;
std::cout << std::boolalpha << token4_found << std::endl;
}
I would expect all four of these to show true, but bizarrely, I get:
true
true
true
false
It only seems to do this if I use the struct. I tried writing a function which did the same thing but by passing in either the string or the iterators directly, and everything behaved as expected in those cases (I didn't get this inexplicable false).
Does anybody know what's going on here?
EDIT:
I attempted to fix the problem, which according to @Geoffroy, is that Scanner is not taking ownership of the string. This is what I tried:
struct Scanner {
std::string program;
std::string::const_iterator read_head;
std::string::const_iterator eof;
Scanner(std::string program) : program(program) {
read_head = program.cbegin();
eof = program.cend();
}
};
but to no avail. Interestingly enough, when I do this, I get
true
false
true
false
EDIT 2:
Oh, but if I change
bool scan(Scanner const& scanner)
to
bool scan(Scanner scanner)
then I get
true
true
true
true
Does anyone know why that might be? I assumed that s1 - s4 would exist until the end of main.
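Scanner should own the string it uses; otherwise its iterators point into a temporary object.
It works when passing str1 and str2 because those objects still exist when scan is called. With the string literals, a temporary std::string is created for the constructor call and destroyed right after it, so using the stored iterators is undefined behavior.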
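A minimal sketch of a scanner that owns its string could look like the following. The name OwningScanner is hypothetical. The important detail is that the iterators are taken from the member, not from the constructor parameter, and copying is disabled here as a simplifying assumption so that a copy can never end up with iterators into another object's string.

```cpp
#include <regex>
#include <string>
#include <utility>

struct OwningScanner {
    std::string program;                     // owned copy of the input
    std::string::const_iterator read_head;   // both iterators point into `program`
    std::string::const_iterator eof;

    explicit OwningScanner(std::string text)
        : program(std::move(text)),
          read_head(program.cbegin()),       // `program` is the member here
          eof(program.cend()) {}

    // A default copy would keep iterators into the source object's string.
    OwningScanner(OwningScanner const&) = delete;
    OwningScanner& operator=(OwningScanner const&) = delete;
};

bool scan(OwningScanner const& scanner) {
    static const std::regex pattern("a+");
    std::smatch match;
    return std::regex_search(scanner.read_head, scanner.eof, match, pattern,
                             std::regex_constants::match_continuous);
}
```

Constructing it from a literal, as in OwningScanner s4("aaaaaaaaaaaaaaaaaaaaaaa"), is then fine, because the temporary is moved into the member before it is destroyed.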
@Geoffroy's answer is entirely correct: you are keeping iterators into temporary std::string objects, and using them is UB.
I was curious why 22 chars worked and 23 did not.
As it turns out, the libcxx std::string implementation decides between two representations based on whether strlen(input) < sizeof(size_t) + sizeof(size_t) + sizeof(char*) - 1. On a typical 64-bit system that bound is 8 + 8 + 8 - 1 = 23, so a 22-character string is stored inside the string object itself while a 23-character string is allocated on the heap.
The 23-character version therefore reads recently freed heap memory, and my guess is that this memory is reused by one of the intermediate expressions.
EDIT: The decision on which storage strategy to make is here and it is called from several constructors, among which the const char * constructor.
EDIT 2: This talk on YouTube describes the short string optimization and how Facebook's fbstring works. It also shows that libstdc++ short strings (as of 2016) had a capacity of only 15 characters while the string object was 32 bytes long.
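To see which of the two representations a given string uses, one can check whether its character buffer lies inside the string object itself. This is only a sketch: storedInline is a hypothetical helper, and the answer it gives depends on the standard library in use.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: true if the characters of `s` live inside the
// std::string object itself (short string optimization), false if they
// live in a separate heap allocation.
bool storedInline(const std::string& s) {
    const char* objBegin = reinterpret_cast<const char*>(&s);
    const char* objEnd = objBegin + sizeof(std::string);
    const char* data = s.data();
    std::less<const char*> before;  // std::less gives a total order over pointers
    return !before(data, objBegin) && before(data, objEnd);
}
```

On a typical 64-bit libc++ build one would expect this to return true for a 22-character string and false for a 23-character one; other standard libraries use different thresholds.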

C++ File Input/Output Outputting Numbers Instead of Chars

I have created a program that randomly assigns roles (jobs) to members of a certain house using file input/output. It builds successfully, but when I print the results with cout, I can see why the program is not working.
Here is the snippet of code I believe something is wrong with :
std::string foo = std::string("Preferences/") + std::to_string(members[random]) + "-Preferences";
cout << foo << endl;
And here is the members[random] array, it is randomly selecting members from this array and reviewing their available times and assigning them jobs based on their Preference input file.
unsigned const char members[22] =
{ 'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v' };
I have created a random number picker that goes through 0-21 and assigns the value it creates to variable random. So, in essence, it is members[random] and completely random.
Here is the output I get in my terminal.
Preferences/116-Preferences
But I expected the output to be Preferences/ followed by members[random] and then -Preferences.
It is printing a number instead of the char from my array.
I created a cout << members[random]; right below it, and every time I run the program, I get
Preferences/107-Preferences <---- A random number every time
k <---- random letter every time.
So I know it is using my random index, but the char is being turned into a number! How do I fix this so that the output is:
Preferences/t-Preferences
Please help me, and thanks!
"The more you overthink the plumbing, the easier it is to stop up
the drain" - Scotty, Star Trek III
Declaring members to be unsigned chars does not accomplish anything useful. A simple char will suffice. std::string already implements an overloaded + operator that takes a char parameter, so it's much easier than you thought it would be:
const char members[22] = {
'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v' };
// ...
std::string foo = std::string("Preferences/") + members[random]
+ "-Preferences";
There is no ::std::to_string(char) overload; the closest match is ::std::to_string(int). So your character is promoted to int, its numeric character code is converted to text, and you get your unwanted result.
Try instead
std::string foo("Preferences/");
foo.append(1, members[random]).append("-Preferences");
Variant using string streams:
std::ostringstream oss; // requires #include <sstream>
oss << "Preferences/" << members[random] << "-Preferences";
// get your string via:
oss.str();
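To make the difference concrete, here is a small sketch that builds the path both ways. The function names pathViaToString and pathViaChar are hypothetical, and the numeric result assumes an ASCII-compatible character set.

```cpp
#include <string>

// What the question's code effectively does: the char is promoted to int,
// so std::to_string produces the character code as decimal text.
std::string pathViaToString(char member) {
    return std::string("Preferences/") + std::to_string(member) + "-Preferences";
}

// Appending the char itself keeps the letter.
std::string pathViaChar(char member) {
    return std::string("Preferences/") + member + "-Preferences";
}
```

For member 't', the first function yields Preferences/116-Preferences (the output seen in the question) and the second yields Preferences/t-Preferences.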

ADO Recordset Field Value to C++ vector/array (capture pointer value)

I am trying to query a SQL Server (Express) database from Visual C++ (Express) and store the resulting dataset in a C++ vector (an array would be fine as well). To that end I researched the ADO library and found plenty of help on MSDN: reference the msado15.dll library and use its features (especially ADO Record Binding, which requires icrsint.h). I have been able to query the database and display the field values with printf(), but I am stumbling when I try to load the field values into a vector.
I originally tried loading the values by casting everything to char* (out of despair after many type conversion errors), only to find that the end result was a vector of pointers that all pointed to the same memory address. Next (and this is the code provided below) I attempted to assign the value at the memory location, but I end up with a vector holding only the first character of each value. How do I store the entire value that the Recordset field pointer (rs.symbol) refers to at the time it is pushed into the vector, instead of just the first character? In this case the values returned from SQL are strings.
#include "stdafx.h"
#import "msado15.dll" no_namespace rename("EOF", "EndOfFile")
#include "iostream"
#include <icrsint.h>
#include <vector>
int j;
_COM_SMARTPTR_TYPEDEF(IADORecordBinding, __uuidof(IADORecordBinding));
inline void TESTHR(HRESULT _hr) { if FAILED(_hr) _com_issue_error(_hr); }
class CCustomRs : public CADORecordBinding {
BEGIN_ADO_BINDING(CCustomRs)
ADO_VARIABLE_LENGTH_ENTRY2(1, adVarChar, symbol, sizeof(symbol), symbolStatus, false)
END_ADO_BINDING()
public:
CHAR symbol[6];
ULONG symbolStatus;
};
int main() {
::CoInitialize(NULL);
std::vector<char> tickers;
try {
char sym;
_RecordsetPtr pRs("ADODB.Recordset");
CCustomRs rs;
IADORecordBindingPtr picRs(pRs);
pRs->Open(L"SELECT symbol From Test", L"driver={sql server};SERVER=(local);Database=Securities;Trusted_Connection=Yes;",
adOpenForwardOnly, adLockReadOnly, adCmdText);
TESTHR(picRs->BindToRecordset(&rs));
while (!pRs->EndOfFile) {
// Process data in the CCustomRs C++ instance variables.
//Try to load field value into a vector
printf("Name = %s\n",
(rs.symbolStatus == adFldOK ? rs.symbol: "<Error>"));
//This is likely where my mistake is
sym = *rs.symbol;//only seems to store the first character at the pointer's address
// Move to the next row of the Recordset. Fields in the new row will
// automatically be placed in the CCustomRs C++ instance variables.
//Try to load field value into a vector
tickers.push_back (sym); //I can redefine everything as char*, but I end up with an array of a single memory location...
pRs->MoveNext();
}
}
catch (_com_error &e) {
printf("Error:\n");
printf("Code = %08lx\n", e.Error());
printf("Meaning = %s\n", e.ErrorMessage());
printf("Source = %s\n", (LPCSTR)e.Source());
printf("Description = %s\n", (LPCSTR)e.Description());
}
::CoUninitialize();
//This is me running tests to ensure the data passes as expected, which it doesn't
std::cin.get();
std::cout << "the vector contains: " << tickers.size() << '\n';
std::cin.get();
j = 0;
while (j < tickers.size()) {
std::cout << j << ' ' << tickers.size() << ' ' << tickers[j] << '\n';
j++;
}
std::cin.get();
}
Thank you for any guidance you can provide.
A std::vector<char*> does not work because the same buffer is used for all the records. So when pRs->MoveNext() is called, the new content is loaded into the buffer, overwriting the previous content.
You need to make a copy of the content.
I would suggest using a std::vector<std::string>:
std::vector<std::string> tickers;
...
tickers.push_back(std::string(rs.symbol));
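The effect of copying can be sketched without ADO at all. In the sketch below, a local char array stands in for the bound rs.symbol buffer and is overwritten on every iteration, the way MoveNext() overwrites the real one; collectSymbols and its rows parameter are hypothetical names used only for illustration.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Simulates reading rows into one reused buffer and copying each value out.
std::vector<std::string> collectSymbols(const std::vector<std::string>& rows) {
    char symbol[6];  // stand-in for the bound rs.symbol buffer, reused per row
    std::vector<std::string> tickers;
    for (const std::string& row : rows) {
        // Overwrite the shared buffer, as MoveNext() would.
        std::strncpy(symbol, row.c_str(), sizeof(symbol) - 1);
        symbol[sizeof(symbol) - 1] = '\0';
        // std::string copies the characters, so later overwrites do not matter.
        tickers.push_back(std::string(symbol));
    }
    return tickers;
}
```

Storing symbol itself (a char*) in the vector would instead store the same address once per row, which is the behavior described in the question.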
Why did you not use std::string instead of std::vector?
To add characters use one of these member functions:
basic_string& append( const CharT* s ); - for cstrings,
basic_string& append( const CharT* s,size_type count ); - otherwise.
Read more at: http://en.cppreference.com/w/cpp/string/basic_string/append.
If you want line breaks, simply append '\n' where you want them.

print called function name using GCC plugin

I need to print the names of the functions called by a program, using a GCC plugin.
For this I created a pass that runs after the ssa pass. I have already initialized the plugin, and I can loop over the statements using a gimple_stmt_iterator:
int read_calls(){
unsigned i;
const_tree str, op;
basic_block bb;
gimple stmt;
tree fnt;
FOR_EACH_BB_FN(bb, cfun) {
gimple_stmt_iterator gsi;
for (gsi=gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi))
{
stmt = gsi_stmt(gsi);
if (is_gimple_call(stmt)){
const char* name = THE_FUNCTION_I_NEED(stmt);
cerr << " Function : " << name << " is called \n";
}
}
}
return 0;
}
How can I print the name of the called function from its gimple node?
Can I also print other information, such as the line number where it was called, the name of the function it was called from, etc.?
I've been looking for the answer for hours, and it is actually pretty easy:
get_name(tree node). I tried many functions, since the documentation is really poor. I found it here:
GCC Middle and Back End API Reference
As you can see, there are no comments about what the functions do, and it is still about the best documentation I have found for GCC. Anyway, get_name(..) works fine, but I haven't found how to print the source line yet.
I know three ways:
1:
tree current_fn_decl = gimple_call_fndecl(stmt);
const char* name = function_name(DECL_STRUCT_FUNCTION(current_fn_decl));
2:
tree current_fn_decl = gimple_call_fndecl(stmt);
const char* name = IDENTIFIER_POINTER(DECL_NAME(current_fn_decl));
3:
tree current_fn_decl = gimple_call_fndecl(stmt);
const char* name = get_name(current_fn_decl);

VC++ function string::c_str(): the address of the first byte was set to 0 (compare to g++)

I ran into a strange problem: the result of std::string's c_str() function under VC++ is inconsistent with g++.
There is a function called Test that returns a string instance, and I need to store the result in a char*. As you can see, the function simply returns the string "resultstring". But when I try to get the result, something strange happens.
In Visual Studio, part two gives me "", while part one and part three both give "resultstring". Compiled with g++, all three parts of the same code give "resultstring". Let's look at the results first:
result of vs:
address:16841988
resultstring
address:16842096
"here is a empty line"
address:16842060
address:16842144
address:16842396
address:16842396
resultstring
result of g++
address:5705156
resultstring
address:5705156
resultstring
address:5705156
address:5705196
address:5705156
address:5705156
resultstring
The code is very simple and is listed below:
#include <iostream>
#include <string>
using namespace std;
string Test()
{
char a[64] = "resultstring";
return string(a);
}
int main(void)
{
//part one
cout << "address:"<< (unsigned)Test().c_str() << endl;
cout << Test().c_str() << endl;
//part two
char *j = const_cast<char*>(Test().c_str());
cout << "address:"<< (unsigned)Test().c_str() << endl;
cout << j << endl;
cout << "address:" << (unsigned)j <<endl;
//part three
string h3 = Test();
char* j2 = const_cast<char*>(h3.c_str());
cout << "address:"<< (unsigned)Test().c_str() << endl;
cout << "address:"<< (unsigned)h3.c_str() << endl;
cout << "address:" << (unsigned)j2 <<endl;
cout << j2 <<endl;
getchar();
return 0;
}
Now I have three questions.
1st, why does the code compiled by g++ print resultstring everywhere, while the Visual Studio build prints resultstring everywhere except for variable j? If you debug into this, you'll find that VC++ sets the memory that j points to to something like 00 65 73 75 …, which is esultstring with the first byte set to 00. So it is not strange that we get "": it's just like char* str = "\0something else", where you always get "". But the question is why this happens only with j.
2nd, why is one of the addresses printed for (unsigned)Test().c_str() different from the others? If we remove the line string h3 = Test(), the addresses are all the same.
3rd, is it "correct" behavior for Visual Studio to give "" for variable j? Why is it different from g++?
Looking forward to your replies.
Regards,
Kevin
This is totally flawed. You create and destroy a temporary string every time you call Test(). Any attempt to access memory using pointer returned by Test().c_str() after temporary was destroyed makes no sense - memory was freed already. It MIGHT have the old values (if nothing is written there before the access), but it might have anything as well (if it is reused before the access). It's Undefined Behavior.
In case of VC++ it is overwritten once and is not in other cases. With GCC - it's never overwritten. But this is pure chance. Once again - it's UB.
You have undefined behavior. The std::string returned by Test() is a temporary and the pointer returned by c_str() (stored in j) is no longer valid after the lifetime of the temporary ends. This means that anything can happen. The array the pointer points to may contain garbage, it may be the original string or the implementation may have null terminated the beginning of it. Accessing it may cause a segmentation fault or it may allow you to access the original string data. This can and usually does vary between different compilers and implementations of the standard library.
char *j = const_cast<char*>(Test().c_str());
// The contents pointed to by j are no longer valid, and accessing that content
// is undefined behavior
cout << "address:"<< (unsigned)Test().c_str() << endl;
The address is different between calls to Test() because it returns a temporary each time you call it. Some compilers may optimize this and/or the allocation of data may get the same block of memory but it is not guaranteed to be the same.
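If a mutable char* really is needed, one option is to copy the characters into a buffer that you own before the temporary goes away. The sketch below assumes this approach; ownedCopy is a hypothetical helper, and Test is redefined here only so the example is self-contained.

```cpp
#include <string>
#include <vector>

std::string Test() {
    return std::string("resultstring");
}

// Hypothetical helper: returns a mutable, null-terminated copy of `s`.
// The vector owns its characters, so it stays valid after `s` is destroyed.
std::vector<char> ownedCopy(const std::string& s) {
    return std::vector<char>(s.c_str(), s.c_str() + s.size() + 1);
}
```

A call such as std::vector<char> buf = ownedCopy(Test()); then gives a usable char* via buf.data() for as long as buf is alive. Keeping the result in a named std::string, as part three of the question does, works for the same reason.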