This is Arduino code, but I have a feeling my error is with C++ in general and not specific to Arduino. I'm new to pointers and strings, so I probably am doing something wrong in that regard.
This is from a much larger program, but I've whittled it down to as little code as I can where I can still reproduce the bug.
It should just iterate through the letters of text[] and save each letter into newText[0][0], as well as print newText[0][0] and the counter variable i.
void setup() {
Serial.begin(9600);
}
void loop() {
const char text[] = "chunk";
static char newText[][10] = {};
static unsigned int i=0;
static int code = 0;
if(code == 0){
newText[0][0] = text[i]; //This
Serial.print(i);
Serial.println(newText[0][0]); //This
i++;
if(i>=strlen(text)){
code=1;
}
}
}
But as I have the code, i jumps to some number like 104 by the second iteration, when it should be equal to 1. (The exact value varies depending on how exactly the code looks.) If I comment out either of the lines that have the comment //This, then the counter works fine. Also, if I switch Serial.print(i); and the line before it, then it counts fine.
static char newText[][10] = {};
This creates a zero sized array (to be more precise a matrix of 0 rows and 10 columns). This would be illegal in C++ but gcc has an extension to allow it. Nevertheless even with gcc's extension, it's UB when you access newText[0][0] which is outside the size of the array. When you have UB anything can happen, the program can appear to work as you expect, it can print gibberish, it can crash, etc., literally anything.
So you need to declare a size that accommodates all your accesses, e.g.:
static char newText[1][10] = {};
if you only want 1 row (but in this case you don't need a matrix, a one dimension array would do fine).
Tip: use the String class for strings instead of raw arrays.
Related
I am trying to write a lexer, when I try to copy isdigit buffer value in an array of char, I get this core dumped error although I have done the same thing with identifier without getting error.
#include<fstream>
#include<iostream>
#include<cctype>
#include <cstring>
#include<typeinfo>
using namespace std;
int isKeyword(char buffer[]){
char keywords[22][10] = {"break","case","char","const","continue","default", "switch",
"do","double","else","float","for","if","int","long","return","short",
"sizeof","struct","void","while","main"};
int i, flag = 0;
for(i = 0; i < 22; ++i){
if(strcmp(keywords[i], buffer) == 0)
{
flag = 1;
break;
}
}
return flag;
}
int isSymbol_Punct(char word)
{
int flag = 0;
char symbols_punct[] = {'<','>','!','+','-','*','/','%','=',';','(',')','{', '}','.'};
for(int x= 0; x< 15; ++x)
{
if(word==symbols_punct[x])
{
flag = 1;
break;
}
}
return flag;
}
int main()
{
char buffer[15],buffer1[15];
char identifier[30][10];
char number[30][10];
memset(&identifier[0], '\0', sizeof(identifier));
memset(&number[0], '\0', sizeof(number));
char word;
ifstream fin("program.txt");
if(!fin.is_open())
{
cout<<"Error while opening the file"<<endl;
}
int i,k,j,l=0;
while (!fin.eof())
{
word = fin.get();
if(isSymbol_Punct(word)==1)
{
cout<<"<"<<word<<", Symbol/Punctuation>"<<endl;
}
if(isalpha(word))
{
buffer[j++] = word;
// cout<<"buffer: "<<buffer<<endl;
}
else if((word == ' ' || word == '\n' || isSymbol_Punct(word)==1) && (j != 0))
{
buffer[j] = '\0';
j = 0;
if(isKeyword(buffer) == 1)
cout<<"<"<<buffer<<", keyword>"<<endl;
else
{
cout<<"<"<<buffer<<", identifier>"<<endl;
strcpy(identifier[i],buffer);
i++;
}
}
else if(isdigit(word))
{
buffer1[l++] = word;
cout<<"buffer: "<<buffer1<<endl;
}
else if((word == ' ' || word == '\n' || isSymbol_Punct(word)==1) && (l != 0))
{
buffer1[l] = '\0';
l = 0;
cout<<"<"<<buffer1<<", number>"<<endl;
// cout << "Type is: "<<typeid(buffer1).name() << endl;
strcpy(number[k],buffer1);
k++;
}
}
cout<<"Identifier Table"<<endl;
int z=0;
while(strcmp(identifier[z],"\0")!=0)
{
cout <<z<<"\t\t"<< identifier[z]<<endl;
z++;
}
// cout<<"Number Table"<<endl;
// int y=0;
// while(strcmp(number[y],"\0")!=0)
// {
// cout <<y<<"\t\t"<< number[y]<<endl;
// y++;
// }
}
I am getting this error when I copy buffer1 in number[k] using strcpy. I do not understand why it is not being copied. When i printed the type of buffer1 to see if strcpy is not generating error, I got A_15, I searched for it, but did not find any relevant information.
The reason is here (line 56):
int i,k,j,l=0;
You might think that this initializes i, j, k, and l to 0, but in fact it only initializes l to 0. i, j, and k are declared here, but not initialized to anything. As a result, they contain random garbage, so if you use them as array indices you are likely to end up overshooting the bounds of the array in question.
At that point, anything could happen—in other words, this is undefined behavior. One likely outcome, which is probably happening to you, is that your program tries to access memory that hasn't been assigned to it by the operating system, at which point it crashes (a segmentation fault).
To give a concrete demonstration of what I mean, consider the following program:
#include <iostream>
void print_var(std::string name, int v)
{
std::cout << name << ": " << v << "\n";
}
int main(void)
{
int i, j, k, l = 0;
print_var("i", i);
print_var("j", j);
print_var("k", k);
print_var("l", l);
return 0;
}
When I ran this, I got the following:
i: 32765
j: -113535829
k: 21934
l: 0
As you can see, i, j, and k all came out such that using them as indices into any of the arrays you declared would exceed their bounds. Unless you are very lucky, this will happen to you, too.
You can fix this by initializing each variable separately:
int i = 0;
int j = 0;
int k = 0;
int l = 0;
Initializing each on its own line makes the initializations easier to see, helping to prevent mistakes.
A few side notes:
I was able to spot this issue immediately because I have my development environment configured to flag lines that provoke compiler warnings. Using a variable before it's being initialized should provoke such a warning if you're using a reasonable compiler, so you can fix problems like this as you run into them. Your development environment may support the same feature (and if it doesn't, you might consider switching to something that does). If nothing else, you can turn on warnings during compilation (by passing -Wall -Wextra to your compiler or the like—check its documentation for the specifics).
Since you declared your indices as int, they are signed integers, which means they can hold negative values (as j did in my demonstration). If you try to index into an array using a negative index, you will end up dereferencing a pointer to a location "behind" the start of the array in memory, so you will be in trouble even with an index of -1 (remember that a C-style array is basically just a pointer to the start of the array). Also, int probably has only 32 bits in your environment, so if you're writing 64-bit code then it's possible to define arrays too large for an int to fully cover, even if you were to index into the array from the middle. For these sorts of reasons, it's generally a good idea to type raw array indices as std::size_t, which is always capable of representing the size of the largest possible array in your target environment, and also is unsigned.
You describe this as C++ code, but I don't see much C++ here aside from the I/O streams. C++ has a lot of amenities that can help you guard against bugs compared to C-style code (which has to be written with great care). For example, you could replace your C-style arrays here with instances of std::array, which has a member function at() that does subscripting with bounds checking; that would have thrown a helpful exception in this case instead of having your program segfault. Also, it doesn't seem like you have a particular need for fixed-size arrays in this case, so you may better off using std::vector; this will automatically grow to accommodate new elements, helping you avoid writing outside the vector's bounds. Both support range-based for loops, which save you from needing to deal with indices by hand at all. You might enjoy Bjarne's A Tour of C++, which gives a nice overview of idiomatic C++ and will make all the wooly reference material easier to parse. (And if you want to pick up some nice C habits, both K&R and Kernighan and Pike's The Practice of Programming can save you much pain and tears).
Some general hints that might help you to avoid your cause of crash totally by design:
As this is C++, you should really refer to established C++ data types and schemes here as far as possible. I know, that distinct stuff in terms of parser/lexer writing can become quite low-level but at least for the things you want to achieve here, you should really appreciate that. Avoid plain arrays as far as possible. Use std::vector of uint8_t and/or std::string for instance.
Similar to point 1 and a consequence: Always use checked bounds iterations! You don't need to try to be better than the optimizer of your compiler, at least not here! In general, one should always avoid to duplicate container size information. With the stated C++ containers, this information is always provided on data source side already. If not possible for very rare cases (?), use constants for that, directly declared at/within data source definition/initialization.
Give your variables meaningful names, declare them as local to their used places as possible.
isXXX-methods - at least your ones, should return boolean values. You never return something else than 0 or 1.
A personal recommendation that is a bit controversional to be a general rule: Use early returns and abort criteria! Even after the check for file reading issues, you proceed further.
Try to keep your functions smart and non-boilerplate! Use sub-routines for distinct sub-tasks!
Try to avoid using namespace that globally! Even without exotic building schemes like UnityBuilds, this can become error-prone as hell for huger projects at latest.
the arrays keywords and symbols_punct should be at least static const ones. The optimizer will easily be able to recognize that but it's rather a help for you for fast code understanding at least. Try to use classes here to compound the things that belong together in a readable, adaptive, easy modifiable and reusable way. Always keep in mind, that you might want to understand your own code some months later still, maybe even other developers.
I'm new to C++, coming from mostly working with Java and I'm having a problem with a function I'm trying to write. I'm sure it's something simple, but nonetheless, it's giving me fits, so prepare for a painfully newbie question.
I'm trying to write a function as follows:
void foo(u_char *ct){
/* ct is a counter variable,
it must be written this way due to the library
I need to use keeping it as an arbitrary user argument*/
/*here I would just like to convert the u_char to an int,
print and increment it for each call to foo,
the code sample I'm working from attempts to do it more or less as follows:*/
int *counter = (int *) ct;
printf("Count: %d\n", *counter);
*counter++;
return;
}
When I try to run this in XCode (something I'm also new to using), I get a EXE_BAD_ACCESS exception on the printf() portion of foo. I'm really not sure what is going on here but I suspect that it has something to do with conflating values, pointers and references, something I don't yet have a strong gasp of how C++ understands them coming from Java. Does anyone see where I'm slipping up here?
Thanks.
An u_char would be 1 byte in memory (the name suggests it's just an unsigned char), an int is typically 4 bytes. In printf, you tell the runtime to read an int (4 bytes) from the address where counter resides. But you only own 1 byte there.
EDIT (based on comments down here where poster says it's called actually with the address of an int: foo((u_char*)&count) ):
void foo(u_char *ct)
{
int *pcounter = (int *)ct; // change pointer back to an int *
printf("Count: %d\n", *pcounter);
(*pcounter)++; // <<-- brackets here because of operator precedence.
}
Or even shorter (the wild c-style for which everbody but newbies loves this language):
void foo(u_char *ct)
{
printf("Count: %d\n", (*(int *)ct)++);
}
Note: i'm using the c++ compiler, hence why I can use pass by reference
i have a strange problem, and I don't really know what's going on.
Basically, I have a text file: http://pastebin.com/mCp6K3HB
and I'm reading the contents of the text file in to an array of atoms:
typedef struct{
char * name;
char * symbol;
int atomic_number;
double atomic_weight;
int electrons;
int neutrons;
int protons;
} atom;
this is my type definition of atom.
void set_up_temp(atom (&element_record)[DIM1])
{
char temp_array[826][20];
char temp2[128][20];
int i=0;
int j=0;
int ctr=0;
FILE *f=fopen("atoms.txt","r");
for (i = 0; f && !feof(f) && i < 827; i++ )
{
fgets(temp_array[i],sizeof(temp_array[0]),f);
}
for (j = 0; j < 128; j++)
{
element_record[j].name = temp_array[ctr];
element_record[j].symbol = temp_array[ctr+1];
element_record[j].atomic_number = atol(temp_array[ctr+2]);
element_record[j].atomic_weight = atol(temp_array[ctr+3]);
element_record[j].electrons = atol(temp_array[ctr+4]);
element_record[j].neutrons = atol(temp_array[ctr+5]);
element_record[j].protons = atol(temp_array[ctr+6]);
ctr = ctr + 7;
}
//Close the file to free up memory and prevent leaks
fclose(f);
} //AT THIS POINT THE DATA IS FINE
Here is the function I'm using to read the data. When i debug this function, and let it run right up to the end, I use the debugger to check it's contents, and the array has 100% correct data, that is, all elements are what they should be relative to the text file.
http://i.imgur.com/SEq9w7Q.png This image shows what I'm talking about. On the left, all the elements, 0, up to 127, are perfect.
Then, I go down to the function I'm calling it from.
atom myAtoms[118];
set_up_temp(myAtoms); //AT THIS POINT DATA IS FINE
region current_button_pressed; // NOW IT'S BROKEN
load_font_named("arial", "cour.ttf", 20);
panel p1 = load_panel("atomicpanel.txt");
panel p2 = load_panel("NumberPanel.txt");
As soon as ANYTHING is called, after i call set_up_temp, the elements 103 to 127 of my array turn in to jibberish. As more things get called, EVEN MORE of the array turns to jibberish. This is weird, I don't know what's happening... Does anyone have any idea? Thanks.
for (j = 0; j < 128; j++)
{
element_record[j].name = temp_array[ctr];
You are storing, and then returning, pointers into temp_array, which is on the stack. The moment you return from the function, all of temp_array becomes invalid -- it's undefined behavior to dereference any of those pointers after that point. "Undefined behavior" includes the possibility that you can still read elements 0 through 102 with no trouble, but 103 through 127 turn to gibberish, as you say. You need to allocate space for these strings that will live as long as the atom object. Since as you say you are using C++, the easiest fix is to change both char * members to std::string. (If you don't want to use std::string, the second easiest fix is to use strdup, but then you have to free that memory explicitly.)
This may not be the only bug in this code, but it's probably the one causing your immediate problem.
In case you're curious, the reason the high end of the data is getting corrupted is that on most (but not all) computers, including the one you're using, the stack grows downward, i.e. from high addresses to low. Arrays, however, always index from low addresses to high. So the high end of the memory area that used to be temp_array is the part that's closest to the stack pointer in the caller, and thus most likely to be overwritten by subsequent function calls.
Casual inspection yields this:
char temp_array[826][20];
...
for (i = 0; f && !feof(f) && i < 827; i++ )
Your code potentially allows i to become 826. Which means you're accessing the 827th element of temp_array. Which is one past the end. Oops.
Additionally, you are allocating an array of 118 atoms (atom myAtoms[118];) but you are setting 128 of them inside of set_up_temp in the for (j = 0; j < 128; j++) loop.
The moral of this story: Mind your indices and since you use C++ leverage things like std::vector and std::string and avoid playing with arrays directly.
Update
As Zack pointed out, you're returning pointers to stack-allocated variables which will go away when the set_up_temp function returns. Additionally, the fgets you use doesn't do what you think it does and it's HORRIBLE code to begin with. Please read the documentation for fgets and ask yourself what your code does.
You are allocating an array with space for 118 elements but the code sets 128 of them, thus overwriting whatever happens to live right after the array.
Also as other noted you're storing in the array pointers to data that is temporary to the function (a no-no).
My suggestion is to start by reading a good book about C++ before programming because otherwise you're making your life harder for no reason. C++ is not a language in which you can hope to make serious progress by experimentation.
Can somebody explain why next code output 26 timez 'Z' instead range from 'A' to 'Z', and how can I output this array correct. Look at code:
wchar_t *allDrvs[26];
int count = 0;
for (int n=0; n<26; n++)
{
wchar_t t[] = {L'A' + n, '\0'};
allDrvs[n] = t;
count++;
}
int j;
for(j = 0; j < count; j++)
{
std::wcout << allDrvs[j] << std::endl;
}
The problem (at least one) is:
{
wchar_t t[] = {L'A' + n, '\0'};
allDrvs[n] = t; //allDrvs points to t
count++;
} //t is deallocated here
//allDrvs[n] is a dangling pointer
So, short answer - undefined behavior on the line std::wcout << allDrvs[j].
To get a correct output - there's a crappy ugly version involving dynamic allocation and copying between arrays.
Then there's the correct version of using a std::vector<std::wstring> >.
Your t[] is on the stack; it only exists for one iteration of the loop at a time, and the next iteration appears to be reusing that space - not a behaviour that's required, but this seems to be what's happening based on your results. If you examine allDrvs[] with a debugger after the first loop completes, you'll probably see all the pointers point to the same memory location.
There's a variety of ways you could solve this. You can allocate a new t on the heap for each loop iteration (and delete them afterwards). You could do wchar_t allDrvs[26][2]; instead of wchar_t *allDrvs[26], and copy the contents of t over each iteration. You could display t right away in the first loop, instead of doing it later. You could use std::vector and std::wstring to manage things for you, instead of using arrays and pointers.
Your code has undefined behavior. Your t has automatic storage duration, so as soon as you exit the upper loop, it ceases to exist. Your allDrvs contains 26 pointers to objects that have been destroyed by the time you use them in the second loop.
As it happens, it looks like (under the circumstances you're running it, with the compiler you're using, etc.) what's happening is that it's re-using the same storage space for t at ever iteration of the loop, and when you use allDrvs in the second loop, that storage hasn't been overwritten, so you have 26 pointers to the same data.
Since you're using C++ anyway, I'd advise using std::wstring and probably std::vector instead -- for example, something on this general order:
std::vector<std::wstring> allDrvs;
for (char i=L'A'; i<L'Z'; i++)
allDrvs.push_back(std::wstring(i));
Technically, this isn't entirely portable -- it depends on 'A' .. 'Z' being contiguous, which isn't true with all character sets, IBM's EBCDIC being the obvious exception. Even in that case, it'll produce all the right outputs, but it'll also include a few additional items you didn't really want.
Nonetheless, the original depended on 'A'..'Z' being contiguous, and the code looks like it's probably intended for Windows anyway, so that's probably not really a big concern.
In conclusion: Thanks so much everyone! All the responses posted below were correct. The initial error was me forgetting to leave room for the NULL terminator. Strcpy() is a dangerous function because when I used it, it didn't know when the end of the 'string' was. Therefore, strcpy() grabbed to much data and overwrote the return address.
EDIT: added more code from the program
SOLVED: Honestly, my initial implementation was crap. I don't even know why I wrote swap that way if I wanted to swap out elements of the array. (At the time, each element only had the a char array in it. So I was able to get away with the old implementation). I have re-written it to:
void swap(ArrayElement list[], int index1, int index2) {
ArrayElement temp;
temp = list[index1];
list[index1] = list[index2];
list[index2] = temp;
}
I'm having problems with a segmentation fault at the end of the following function.
struct ArrayElement {
char data[SIZE_OF_ELEMENT];
// Implemented this way so that I can expand to multiple values later on
}
//In main:
ArrayElement* list = new ArrayElement[NUM_OF_ELEMENTS];
void swap(ArrayElement list[], int index1, int index2) {
char temp[SIZE_OF_ELEMENT];
strcpy(temp, list[index2].data);
strcpy(list[index2].data, list[index1].data);
strcpy(list[index1].data, temp);
}
The error is a segmentation fault at line 45, which is the ending curly brace of the function. This was compiled using g++. I used gbd to try and debug it and everything works correctly until it hits the curly brace.
I can give more code from the program if it is needed. I don't want to post the entire thing because this is for a class.
My best guess is, the string at list[index2].data is larger than temp[] and by copying, you overwrote the stack and the return address.
Try inserting a test for the length:
#include <iostream>
...
int n = strlen(list[index2].data);
std::cerr << "len=" << n << ", SIZE_OF_ELEMENT=" << SIZE_OF_ELEMENT << std::endl;
and see, if n (list[index2].data) is larger than SIZE_OF_ELEMENT
strcpy is a hazardous function. If the length of the input string is SIZE_OF_ELEMENT or more, you will write past the end of your temp array. If you must use a fixed size array as the output array in strcpy, you should test that the strcpy will work before using the function.
Even better is to switch from using char arrays to std::string.
Is data defined like this char data[SOME_CONSTANT]? If so then are you sure that SIZE_OF_ELEMENT is large enough? You are remembering the NULL terminator too, right?
If in ArrayElement data is defined like this char *data; and is allocated with malloc at a later time then are you sure that index1 has a buffer large enough for the data in index2 and vice versa? Again, you are remembering the NULL terminator too, right?