yacc outputs unexpected results - c++

I have this rule in a yacc file, and the rule's action calls a function defined in a separate C++ file.
The output is not what I expect, as the print statements below show. This is the rule in parser.y:
RecordItem : IdentifierList ':' TypeDenoter
{
char * result = declareRecordItem ($1 , $3);
$$ = result;
printf(" >>> inside RecordItem >> : %s\n",result);
}
;
and this is the function 'declareRecordItem' in main.cpp file :
char* declareRecordItem( std::vector<char* >* varList , char* type){
string stm = " ";
string identifier;
for(int i=0 ; i < varList-> size() ; i++)
{
identifier= string((*varList)[i]) ;
symtab[identifier] = string(type);
stm = stm + " " + string(type);
}
char * result = (char*)stm.c_str();
printf(">>> inside declareRecordItem >> : %s\n",result);
return result ;
}
The output inside the declareRecordItem function is correct, but once it returns to the RecordItem rule, either nothing is printed or strange symbols appear, as shown below. Any ideas?
>>> inside declareRecordItem >> : i32 i32
>>> inside RecordItem >> :

Inside declareRecordItem, you create a local variable stm of type std::string, and then return the value: (char*)stm.c_str();
However, stm ceases to exist as soon as declareRecordItem returns, and the pointer returned by stm.c_str() is no longer valid. It is quite possible that the memory it now points to has been reused to hold some other object.
So if you want to retain the value of the string, you should either
return the std::string itself, relying on RVO for efficiency, or
create and return a dynamic object (i.e. new std::string()), or
make a C-style copy of the string buffer (i.e. strdup(stm.c_str()))
In the latter two cases, you will need to manually release the copy once you are finished with it (delete for the std::string, free() for the strdup copy). In the first case, you will still need to make a copy of the underlying C string if you need to retain a C string.
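The first and third options can be sketched as follows. This is only an illustration under simplifying assumptions: the function names joinTypes and joinTypesC are hypothetical, the identifiers are passed as a std::vector<std::string> rather than the question's std::vector<char*>*, and the symbol table update is left out. The C-style copy is written out with malloc and memcpy, which is what strdup does.

```cpp
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::memcpy
#include <string>
#include <vector>

// Option 1: return the std::string by value. The caller receives its own
// object, so nothing dangles when the local variable goes out of scope.
std::string joinTypes(const std::vector<std::string>& names, const std::string& type) {
    std::string stm;
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (i != 0) stm += ' ';
        stm += type;  // one copy of the type name per declared identifier
    }
    return stm;
}

// Option 3: return a heap-allocated C string (the same effect as strdup).
// The caller owns the buffer and must release it with std::free().
char* joinTypesC(const std::vector<std::string>& names, const std::string& type) {
    const std::string stm = joinTypes(names, type);
    char* copy = static_cast<char*>(std::malloc(stm.size() + 1));
    if (copy != nullptr)
        std::memcpy(copy, stm.c_str(), stm.size() + 1);  // includes the '\0'
    return copy;
}
```

With the second variant the yacc action can keep using char* as its semantic value, provided that whoever consumes $$ eventually frees it.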
Mixing C and C++ string handling is confusing and leads to cluttered code. If you are going to use C++, I recommend using std::string* throughout, although you end up with the same manual memory management regime as you would have with C style strings. For compilers and other applications with symbol tables, a good compromise is to put the string into the symbol table (or a map of immutable symbol names) as soon as possible, even in the lexical scanner, and then pass around references to the symbols (or interned names). Then you can free the entire repository of retained strings at the end of the compilation.

Related

Problem in comparing strings without assigning to variable

int i = ("aac" > "aab");
cout << i;
The above code does not print 1, as I expected it to. But when I assign "aac" and "aab" to two separate string variables and compare the variables instead of the literals (code below), I get the desired output.
Could anyone help me please?
string s1 = "aac";
string s2 = "aab";
int i = (s1 > s2);
cout << i;
Literal constants like "aac" aren't std::string objects; rather, they are arrays of const char in (read-only) memory that evaluate, in most 'access' cases, to the address of their first element (i.e. a const char* pointer); so, a comparison between them will be a comparison between those addresses, something you are unlikely to be able to control or predict.
To get an inline comparison, in your case, you can use inline std::string constructors (sometimes known as "wrappers"), like this:
int i=(string("aac")>string("aab"));
Or, using the more 'modern' "curly-brace" initializer syntax:
int i = (string{ "aac" } > string{ "aab" });
For more brevity, you can make use of the fact that std::string has versions of the > (and similar) operators that take a string literal as one of the arguments; thus, you need only 'wrap' one of the literals, and could reduce the above code to something like:
int i = (string{ "aac" } > "aab");
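If you use C-style char * / char [] strings, you need to use strcmp, which returns a positive value when the first string sorts after the second (the value is not guaranteed to be 1):
int i = (strcmp("aac", "aab") > 0);
Otherwise, you are just comparing the addresses of the first elements of the two strings.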
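Both approaches from the answers can be put side by side in a small sketch. The helper names greaterAsString and greaterAsCString are made up for illustration.

```cpp
#include <cstring>
#include <string>

// Wrap one side in std::string: operator> then compares the characters
// lexicographically instead of comparing two pointers.
bool greaterAsString(const char* a, const char* b) {
    return std::string(a) > b;
}

// Stay with C strings: strcmp returns a positive value when a sorts after b,
// zero when they are equal, and a negative value otherwise.
bool greaterAsCString(const char* a, const char* b) {
    return std::strcmp(a, b) > 0;
}
```

Either helper returns true for ("aac", "aab"), whereas the raw expression "aac" > "aab" compares two unrelated addresses.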

Why is my string variable not printing a full string?

Program:
#include <iostream>
using namespace std;
int main() {
string str;
str[0] = 'a';
str[1] = 'b';
str[2] = 'c';
cout << str;
return 0;
}
Output:
No output.
If I replace cout << str; with cout << str[1], I get a proper output.
Output:
b
And if I change the data type of the variable to a character array (replacing string str; with char str[5];), I get the desired output.
Output:
abc
Why is my program behaving like this? How do I alter my code to get the desired output without changing the data type?
Your program has undefined behavior.
string str;
creates an empty string. It has length 0.
You are trying to write to the first three elements of this string with
str[0] = 'a';
str[1] = 'b';
str[2] = 'c';
These do not exist. Indexing a std::string out-of-bounds causes undefined behavior.
You can add characters to a string with any of the following methods:
str += 'a';
str += "a";
str.push_back('a');
str.append("a");
or you can resize the string first to the intended length before you index into any of its elements:
str.resize(3);
As pointed out by @Ayxan in a comment under this answer, you are also missing #include <string>. Without it, it is unspecified whether your program will compile, since it uses std::string, which is defined in <string>. Unless the standard makes a specific exception, it is unspecified whether including one standard library header also includes another. You should not rely on unspecified behavior that may break at any point.
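As a rough sketch of the two fixes described above (the function names are only illustrative), either of these builds "abc" without indexing past the end of the string:

```cpp
#include <string>

// Grow the string one element at a time; each call updates size().
std::string buildByAppending() {
    std::string str;
    str += 'a';
    str.push_back('b');
    str.append("c");
    return str;
}

// Give the string three elements first, then index into them.
std::string buildByResizing() {
    std::string str;
    str.resize(3);   // size() is now 3, elements are '\0'
    str[0] = 'a';
    str[1] = 'b';
    str[2] = 'c';
    return str;
}
```

Streaming either result to cout prints all three characters, because the size stored in the string matches what was written.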
In addition to walnut's answer, here's what's going on under the hood:
A std::string contains at least two members: a data pointer (type char*) and a size. The [] operator does not check the size; it only indexes into the memory behind the data pointer. Nor does the [] operator modify the size member of the string.
The << operator that streams the string to cout, however, does interpret the size member. It does so to be able to output strings containing null characters. Since the size member is still zero, nothing is printed.
You may wonder why the memory access within str[0] even succeeds, after all, the string never had any reason to allocate any memory for its data yet. This is due to the fact that virtually all std::string implementations use the small-string-optimization: The std::string object itself is a bit larger than it needs to be, and the space at its end is used instead of an allocation on the heap unless the string becomes longer than that space. As such, the default constructor will just point the data pointer to that internal storage to have it initialized, and your memory accesses are directed to existing memory. This is why you don't get a SEGFAULT unless you access the string's data way out of bounds. Doesn't change the fact that already your expression str[0] is undefined behavior, though. The symptoms may appear benign, but the disease is fatal.

LLVM Pass - Issues replacing a GlobalVariable

I am trying to write an LLVM pass which manipulates strings.
After iterating all the GlobalVariable objects and picking out the strings, I get the string data, perform the manipulation, create a new GlobalVariable and then use replaceAllUsesWith() to replace the old with the new. Sounds simple enough...
However, I am getting an assert error, telling me that the replacement should be the same type. I have not changed the length of the string, so I don't know why the type would be different. A cut down version of the code is below.
for (Module::global_iterator gi = M.global_begin(), ge = M.global_end(); gi != ge; gi++) {
GlobalVariable *gv = *gi;
ConstantDataSequential *cdata = dyn_cast<ConstantDataSequential>(gv->getInitializer());
std::string orig = "";
if (cdata->isString()) {
orig = cdata->getAsString();
} else if (cdata->isCString()) {
orig = cdata->getAsCString();
} else {
continue;
}
// string returned has the same length, but different contents
std::string modified = manipulateString(orig);
std::ostringstream oss;
oss << gv->getName() << "Modified" ;
Constant *cMod = ConstantDataArray::getString(M.getContext(), modified, true);
GlobalVariable *newGv = new GlobalVariable(M,
cMod->getType(),
true,
GlobalValue::ExternalLinkage,
cMod,
oss.str());
gv->replaceAllUsesWith(newGv);
}
Note: I've hand typed this code, so it may not compile, but it should serve as an illustration of what I'm trying to achieve and how I'm trying to achieve it.
For some reason, the new GlobalVariable has a different type. Printing the types at runtime yields:
gv->getType() = [36 x i8]*
newGv->getType() = [37 x i8]*
Both strings are 36 characters long. Why is the type of the new GlobalVariable different, even though the string length has not changed? Why has an extra element been added?
Also, replaceAllUsesWith() requires that the replacement be same type. If I wanted the replacement to be string of a different length, how would I achieve that?
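The extra element most likely comes from the third argument of ConstantDataArray::getString: passing true (AddNull) appends a terminating null, and the 36 characters returned by getAsString() already include the original terminator. As for a replacement of a different length: you cannot replace with an object of a different type. You can, however, cast the GlobalVariable to have the right type. What you want is...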
ConstantExpr::getPointerCast(newGv, gv->getType());
...except that that won't compile, because the second argument has to be a PointerType. You can always add another level of casting, making the code less clear but the compiler more happy:
ConstantExpr::getPointerCast(newGv, cast<PointerType>(gv->getType()));
I have found it helpful to use 0-length arrays for all variable-length arrays, and always cast constants to that.

Uninitialized char

I'm reading over a C++ class for parsing CSV files in one of my programming books for class. I primarily write in C# for work and don't interact with C++ code very often. One of the functions, getline, uses an uninitialized char variable and I'm confused as to whether it's a typo or not.
// getline: get one line, grow as needed
int Csv::getline(string& str)
{
char c;
for (line = ""; fin.get(c) && !endofline(c); )
line += c;
split();
str = line;
return !fin.eof();
}
fin is an istream. The documentation I'm reading shows the get (char& c); function being passed a reference, but which char in the stream is returned? What's the initial value of c?
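The initial value of c is indeterminate, but that does not matter, because the call to get sets c before it is ever read. Each call to get extracts the next unread character from the stream. The && operator evaluates its left operand first and there is a sequence point after it (the same holds for ||), so all the side effects of get are complete before endofline runs, and endofline sees the modified value of c. If get fails, for example at end of file, && short-circuits and endofline is not called at all.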

Converting/parsing pointer strings and doubles

Assignment:
Read in info from text file (done)
Retrieve only parts of text file using substr method (done)
Store info into instance variables (need help)
Here is the code I am having trouble with:
string* lati;
lati = new string(data.substr(0, data.find_first_of(",")));
double* latDub;
latDub = new double(atof((char *)lati));
this->latitude = *latDub;
I need to store the latitude into the instance variable latitude.
The variable data is the read-in text file.
this->latitude is declared as a double.
I have tested and the variable lati is the correct value, but once I try to convert it into a double the value changes to 0 for some reason. I am specifically supposed to use the atof method when converting!
(char *)lati doesn't do what you think it does. What you're clearly trying to do there is get the char sequence associated with lati, but what you're actually doing is just squeezing a string* into a char* which is all kinds of bad.
There's a member function on std::string that will give you exactly what you want. You should review the documentation for string, and replace (char *)lati with a call to that function.
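The member function hinted at here is presumably c_str(), which returns a pointer to the string's null-terminated character data. A minimal sketch, assuming the latitude is the text before the first comma as in the question (parseLatitude is a hypothetical helper name):

```cpp
#include <cstdlib>   // std::atof
#include <string>

// Converts the text before the first comma of data to a double.
double parseLatitude(const std::string& data) {
    // If there is no comma, find_first_of returns npos and substr keeps
    // the whole string.
    std::string lati = data.substr(0, data.find_first_of(","));
    // c_str() yields the characters themselves; no cast is needed.
    return std::atof(lati.c_str());
}
```

Note that atof returns 0.0 when no conversion can be performed, so it cannot distinguish malformed input from a genuine zero.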
Why your code compiles, but gives meaningless results has already been explained by adpalumbo. There are two fundamental problems in your code leading to that error, on which I want to expand here.
One is that you use a C-style cast: (T)obj. Basically, that just tells the compiler to shut up, you know what you are doing. That is rarely ever a good idea, because when you do know what you are doing, you can usually do without such casts.
The other one is that you are using objects allocated dynamically on the heap. In C++, objects should be created on the stack, unless you have very good reasons for using dynamic objects. And dynamic objects are usually hidden inside objects on the stack. So your code should read like this:
string lati(data.substr(0, data.find_first_of(",")));
double latDub = /* somehow create double from lati */;
this->latitude = latDub;
Of course, latDub is completely unnecessary, you could just as well write to this->latitude directly.
Now, the common way to convert a string into some other type would be streaming it through a string stream. Removing the unnecessary variables you introduced, your code would then look like this:
std::istringstream iss(data.substr(0, data.find_first_of(",")));
if( !(iss >> this->latitude) ) throw "Dude, you need error handling here!";
Usually you want to pack that conversion from a string into a utility function which you could reuse throughout your code:
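#include <sstream>
#include <stdexcept>
#include <string>
inline double convert3double(const std::string& str)
{
    std::istringstream iss(str);
    double result;
    // The parentheses matter: !iss >> result would parse as (!iss) >> result.
    if( !(iss >> result) )
        throw std::runtime_error("Dang!"); // std::exception has no standard string constructor
    return result;
}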
However, since the very same algorithm can be used for all types (for which operator>> is overloaded meaningfully with an input stream as the left operand), just make this a template:
template< typename T >
inline T convert3double(const std::string& str)
{
    std::istringstream iss(str);
    T result; // presumes default constructor
    if( !(iss >> result) ) // presumes operator>>; parentheses are required
        throw std::runtime_error("Dang!"); // needs <stdexcept>
    return result;
}
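For illustration, here is how such a template might be applied to the question's data. This is a sketch under assumptions: convertTo and latitudeFrom are hypothetical names, and the input is taken to be comma-separated with the latitude first.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Generic string-to-T conversion through a string stream.
template <typename T>
T convertTo(const std::string& str) {
    std::istringstream iss(str);
    T result{};
    if (!(iss >> result))
        throw std::runtime_error("cannot convert: " + str);
    return result;
}

// Extracts the field before the first comma and converts it to a double.
double latitudeFrom(const std::string& data) {
    return convertTo<double>(data.substr(0, data.find(',')));
}
```

Unlike atof, this reports input that does not start with a number by throwing, so the caller can tell a failed conversion apart from a real 0.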