Segfaults on appending char* arrays - c++

I'm making a lexical analyzer and this is a function out of the whole thing. This function takes as argument a char, c, and appends this char to the end of an already defined char* array (yytext). It then increments the length of the text (yylen).
I keep getting segfaults on the shown line when it enters this function. What am I doing wrong here? Thanks.
BTW: I can't use strncpy/strcat, etc. (although if you want, you can show me that implementation too).
This is my code:
extern char *yytext;
extern int *yylen;
void consume(char c){
int s = *yylen + 1; //gets yylen (length of yytext) and adds 1
//now seg faults here
char* newArray = new char[s];
for (int i = 0;i < s - 1;i++){
newArray[i] = yytext[i]; //copy all chars from existing yytext into newArray
}
newArray[s-1] = c; //append c to the end of newArray
for (int i = 0;i < s;i++){ //copy all chars + c back to yytext
yytext[i] = newArray[i];
}
yylen++;
}

You have
extern int *yylen;
but try to use it like so:
int s = (int)yylen + 1;
If the variable is an int *, use it like an int * and dereference to get the int. If it is supposed to be an int, then declare it as such.

That can't work:
int s = (int)yylen + 1; //gets yylen (length of yytext) and adds 1
char newArray[s];
Use malloc or a big enough buffer:
char * newarray=(char*)(malloc(s));

Every C-style string should be null-terminated. From your description it seems you need to append the character c. So you need 2 extra locations (one for the appended character and the other for the null terminator).
Next, yylen is of type int *. You need to dereference it to get the length (assuming it points to a valid memory location). So, try -
int s = *yylen + 2;
I don't see the need of temporary array but there might be a reason why you are doing it. Now,
yytext[i] = newArray[i]; //seg faults here
you have to check whether yytext points to a valid, writable memory location. If it does, is it long enough to hold the appended character plus the null terminator?
But I would recommend using std::string rather than working with character arrays. With it, the problem can be solved in one line.
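As a rough sketch of both suggestions, assuming the lexer owns its buffer: the first version keeps the text in a std::string, and the second grows a heap buffer by hand without strcat or strncpy. The names text, consume_string, Buffer, and consume_manual are made up for illustration and are not part of the original lexer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical replacement for yytext/yylen: the string tracks its own length.
std::string text;

void consume_string(char c) {
    text.push_back(c);  // grows the storage as needed
}

// Manual variant. Assumes data is either nullptr or was allocated with new[].
struct Buffer {
    char* data = nullptr;
    std::size_t len = 0;  // number of characters, excluding the terminator
};

void consume_manual(Buffer& b, char c) {
    assert(b.data != nullptr || b.len == 0);
    // One slot for the new character, one for the '\0' terminator.
    char* grown = new char[b.len + 2];
    for (std::size_t i = 0; i < b.len; ++i) {
        grown[i] = b.data[i];  // copy the existing characters
    }
    grown[b.len] = c;
    grown[b.len + 1] = '\0';
    delete[] b.data;  // deleting nullptr is a no-op
    b.data = grown;   // repoint instead of copying back into the old block
    ++b.len;
}
```

The manual version repoints the buffer at the larger block rather than copying back into the old one, which is the step the original function was missing. Reallocating on every character is slow for long tokens; growing the capacity in larger steps is a common refinement.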

Related

C++ copying chars from static to dynamic array adds a bunch of random elements

I have a static array, but when copying values over to a dynamic array, I get a bunch of nonsense padded on. I need the resulting dynamic array to be exactly 8 characters.
unsigned char cipherText[9]; //Null terminated
cout<<cipherText<<endl; //outputs = F,ÿi~█ó¡
unsigned char* bytes = new unsigned char[8]; //new dynamic array
//Loop copies each element from static to dynamic array.
for(int x = 0; x < 8; x++)
{
bytes[x] = cipherText[x];
}
cout<<bytes; //output: F,ÿi~█ó¡²²²²½½½½½½½½ε■ε■
You need to update your code to copy the null terminator:
unsigned char* bytes = new unsigned char[9]; //new dynamic array
//Loop copies each element from static to dynamic array.
for(int x = 0; x < 9; x++)
{
bytes[x] = cipherText[x];
}
cout<<bytes;
This is assuming that cipherText does in fact contain a null terminated string.
C-strings need to be null-terminated if you want to print them properly. The machine has to know where to stop printing.
To solve this, make bytes one larger (9 instead of 8) to make space for the null character and then append it to the end:
bytes[8] = '\0';
Now cout will know to read only the first 8 characters from the array; after that it encounters the '\0' and stops.
How do you know cipherText is null terminated? Simply defining it won't make it null terminated.
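If the data is raw bytes rather than text, one option is to skip the terminator altogether and tell the stream how many bytes to write. A minimal sketch, assuming the caller knows the byte count; copy_bytes and print_bytes are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

// Copies n bytes into a new heap array. The caller must delete[] the result.
unsigned char* copy_bytes(const unsigned char* src, std::size_t n) {
    assert(src != nullptr);
    unsigned char* dst = new unsigned char[n];
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
    return dst;
}

// Writes exactly n bytes; no terminator is needed or read.
void print_bytes(std::ostream& out, const unsigned char* bytes, std::size_t n) {
    out.write(reinterpret_cast<const char*>(bytes),
              static_cast<std::streamsize>(n));
}
```

With this approach the dynamic array can stay exactly 8 elements long, and a zero byte inside the cipher text does not cut the output short.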

why memcpy alters the string array?

I have code which goes like this. Could you tell me why it is not behaving as I would expect it to be?
/*
* test.cpp
*
* Created on: Dec 6, 2012
* Author: sandeep
*/
#include<iostream>
#include<string.h>
using namespace std;
int main()
{
int i=0;
string s="hello A B:bye A B";
char *input;
input=new char(s.size());
for(i=0;i<=s.size();i++)
input[i]=s[i];
char *tokenized1[2],*tokenized2[3];
tokenized1[0]=strtok(input,":");
tokenized1[1]=strtok(NULL,":");
i=0;
char *lstring;
while(i<2)
{
lstring=new char(strlen(tokenized1[i]));
memcpy(lstring,tokenized1[i],strlen(tokenized1[i])+1);
cout<<tokenized1[0]<<" "<<tokenized1[1]<<endl;
tokenized2[0]=strtok(lstring," ");
tokenized2[1]=strtok(NULL," ");
tokenized2[2]=strtok(NULL," ");
char c=tokenized2[0][0];
cout<<c<<endl;
cout<<tokenized2[0]<<" "<<tokenized2[1]<<" "<<tokenized2[2]<<endl;
i++;
}
}
and the output is this.
hello A B by
h
hello A B
hello A B by
b
by
There are some junk values at the end of the 1st, 4th, and 6th lines of output.
Why did tokenized1[1] get altered when I did a memcpy of tokenized1[0], and how can I solve this?
There are a couple of bugs in the following new call. You need to use square brackets; also, the argument is off by one.
lstring=new char[strlen(tokenized1[i]) + 1];
Without the square brackets, you are allocating space for one character. As a result, the memcpy() writes past the allocated memory.
edit: I just noticed the other new, which will also need to be fixed:
input=new char[s.size() + 1];
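Finally, when i == s.size(), s[i] reads one position past the last character of the string (well-defined and equal to '\0' since C++11, but undefined for a non-const string before that) in: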
for(i=0;i<=s.size();i++)
input[i]=s[i];
There could well be other bugs, not to mention memory leaks...
You don't appear to be zero-terminating 'input'.
In addition to what NPE said, there are a couple of other small things:
char *input;
input=new char(s.size());
This might have something to do with it - you are allocating a single character. You then write that one character, and overwrite other memory that is used for who-knows-what. Try this instead:
char *input = new char[s.size() + 1];
Another issue is your loop, immediately below that:
for(i=0;i<=s.size();i++)
input[i]=s[i];
At least on my system, using std::string::operator[] with an offset equal to s.size() fails; I don't know about your particular implementation, but I'm pretty sure it fails as well. Be safe, rather than sorry, and recode your loop thusly:
for(i = 0; i < s.size(); i++)
input[i] = s[i];
input[i] = 0;
I hope this helps.
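Putting the fixes together, here is a minimal sketch of copying a std::string into a writable, terminated buffer for strtok. It uses std::vector<char> in place of new[] so that nothing has to be freed by hand; that substitution, and the helper names writable_copy and split, are assumptions for illustration rather than the asker's code.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Returns a modifiable copy of s, including the terminating '\0'.
std::vector<char> writable_copy(const std::string& s) {
    // c_str() is always terminated, so size() + 1 bytes are readable.
    return std::vector<char>(s.c_str(), s.c_str() + s.size() + 1);
}

// Splits s on any character in delims. strtok modifies its input, so it
// works on a private copy; the tokens are copied out as std::string.
std::vector<std::string> split(const std::string& s, const char* delims) {
    assert(delims != nullptr);
    std::vector<char> buf = writable_copy(s);
    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buf.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.push_back(tok);
    }
    return tokens;
}
```

Calling split(s, ":") and then split on each result with " " mirrors the two levels of tokenizing in the question. Each call finishes before the next one starts, so the two levels do not interfere through strtok's hidden state.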

Array initialization issue

I need an empty char array, but when I try to do something like this:
char *c;
c = new char [m];
int i;
for (i = 0; i < m; i++)
c[i] = 65 + i;
and then print c, I can see that c = 0x00384900 "НННННННээээ««««««««юоюою"
After the loop it becomes: 0x00384900 "ABCDEFGээээ««««««««юоюою"
How can I solve this problem? Or maybe there is a way to do it with string?
If you're trying to create a string, you need to make sure that the character sequence is terminated with the null character \0.
In other words:
char *c;
c = new char [m+1];
int i;
for (i = 0; i < m; i++)
c[i] = 65 + i;
c[m] = '\0';
Without it, functions on strings like printf won't know where the string ends.
printf("%s\n",c); // should work now
If you allocate an array on the heap with new char[m], its contents are not initialised for you.
To get a zeroed array you have these options:
Allocate an array statically or globally. The array will be filled with zeroes automatically.
Use ::memset( c, 0, m ); on a heap-allocated or stack array to fill it with zeroes.
Use high-level types like std::string.
I believe that's your debugger trying to interpret the string. When using a char array to represent a string in C or C++, you need to include a null byte at the end of the string. So, if you allocate m + 1 characters for c, and then set c[m] = '\0', your debugger should give you the value you are expecting.
If you want a dynamically-allocated string, then the best option is to use the string class from the standard library:
#include <string>
std::string s;
for (i = 0; i < m; i++)
s.push_back(65 + i);
C strings are null terminated. That means that the last character must be a null character ('\0' or just 0).
The functions that manipulate your string use the characters between the beginning of the array (the pointer you pass as a parameter) and the first null character. If there is no null character in your array, the function will read past the end of its memory until it finds one (a buffer over-read, which is undefined behavior). That's why you got some garbage printed in your example.
When you see a string literal in your code, like printf("Hello");, it is translated into an array of char of length 6 ('H', 'e', 'l', 'l', 'o' and '\0').
Of course, to avoid such complexity you can use std::string.
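A small sketch of these points: the literal "Hello" occupies six chars, and a heap buffer becomes a valid C string once its last element is '\0'. The helpers make_alphabet and make_alphabet_string are hypothetical, and they assume an ASCII-compatible character set, as the question's 65 + i does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// "Hello" has 5 visible characters plus the implicit '\0'.
static_assert(sizeof("Hello") == 6, "literal includes its terminator");

// Fills a new[] buffer with m letters starting at 'A' and terminates it.
// The caller must delete[] the result.
char* make_alphabet(std::size_t m) {
    assert(m <= 26);
    char* c = new char[m + 1];  // one extra slot for the terminator
    for (std::size_t i = 0; i < m; ++i) {
        c[i] = static_cast<char>('A' + i);
    }
    c[m] = '\0';
    return c;
}

// The same thing with std::string, which tracks its own length.
std::string make_alphabet_string(std::size_t m) {
    assert(m <= 26);
    std::string s;
    for (std::size_t i = 0; i < m; ++i) {
        s.push_back(static_cast<char>('A' + i));
    }
    return s;
}
```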

Why am i getting two different strings?

I wrote a very simple encryption program to practice C++ and I came across this weird behavior. When I convert my char* array to a string by setting the string equal to the array, I get a wrong string; however, when I create an empty string and append the chars in the array individually, it creates the correct string. Could someone please explain why this is happening? I just started programming in C++ last week and I cannot figure out why this is not working.
Btw, I checked online and these are apparently both valid ways of converting a char array to a string.
void expandPassword(string* pass)
{
int pHash = hashCode(pass);
int pLen = pass->size();
char* expPass = new char[264];
for (int i = 0; i < 264; i++)
{
expPass[i] = (*pass)[i % pLen] * (char) rand();
}
string str;
for (int i = 0; i < 264; i++)
{
str += expPass[i];// This creates the string version correctly
}
string str2 = expPass;// This creates much shorter string
cout <<str<<"\n--------------\n"<<str2<<"\n---------------\n";
delete[] expPass;
}
EDIT: I removed all of the zeros from the array and it did not change anything
When copying from char* to std::string, the copy stops when it reaches the first null ('\0') character. This points to a problem with your "encryption", which is producing embedded null characters.
This is one of the main reasons why encoding is used with encrypted data. After encryption, the resulting data should be encoded using Hex/base16 or base64 algorithms.
A C string, which is what you are constructing, is a series of characters ending with a \0 (zero) value.
In the case of
expPass[i] = (*pass)[i % pLen] * (char) rand();
you may be inserting \0 into the array if the expression evaluates to 0. You also do not append a \0 at the end of the array, so it is not guaranteed to be a valid C string.
When you do
string str2 = expPass;
the string may well come out shorter, since it is truncated at the first \0 found in the array.
This is because str2 = expPass interprets expPass as a C-style string, meaning that a zero-valued ("null") byte '\0' indicates the end of the string. So, for example, this:
char p[2];
p[0] = 'a';
p[1] = '\0';
std::string s = p;
will cause s to have length 1, since p has only one nonzero byte before its terminating '\0'. But this:
char p[2];
p[0] = 'a';
p[1] = '\0';
std::string s;
s += p[0];
s += p[1];
will cause s to have length 2, because it explicitly adds both bytes to s. (A std::string, unlike a C-style string, can contain actual null bytes — though it's not always a good idea to take advantage of that.)
I guess the following line cuts your string:
expPass[i] = (*pass)[i % pLen] * (char) rand();
If (char) rand() is 0, or the product wraps around to 0 when stored in a char, you get a string terminator at position i.
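If the embedded zero bytes are meant to be kept, std::string has a constructor that takes a pointer and an explicit length, so it does not scan for a terminator. A minimal sketch; from_c_string and from_bytes are hypothetical names used only to contrast the two constructors.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stops at the first '\0'; buf must contain one somewhere.
std::string from_c_string(const char* buf) {
    assert(buf != nullptr);
    return std::string(buf);
}

// Copies exactly n bytes, zero bytes included.
std::string from_bytes(const char* buf, std::size_t n) {
    assert(buf != nullptr);
    return std::string(buf, n);
}
```

In the question's code the second form would be written as string str2(expPass, 264);, which should produce the same contents as the character-by-character loop.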

Please suggest what is wrong with this string reversal function?

This code is compiling clean. But when I run this, it gives exception "Access violation writing location" at line 9.
void reverse(char *word)
{
int len = strlen(word);
len = len-1;
char * temp= word;
int i =0;
while (len >=0)
{
word[i] = temp[len]; //line9
++i;--len;
}
word[i] = '\0';
}
Have you stepped through this code in a debugger?
If not, what happens when i (increasing from 0) passes len (decreasing towards 0)?
Note that your two pointers word and temp have the same value - they are pointing to the same string.
Be careful: not all strings in a C++ program are writable. Even if your code is good it can still crash when someone calls it with a string literal.
When len gets to 0, you access the location before the start of the string (temp[0-1]).
Try this:
void reverse(char *word)
{
size_t len = strlen(word);
size_t i;
for (i = 0; i < len / 2; i++)
{
char temp = word[i];
word[i] = word[len - i - 1];
word[len - i - 1] = temp;
}
}
The function looks like it would not crash, but it won't work correctly and it will read from word[-1], which is not likely to cause a crash, but it is a problem. Your crashing problem is probably that you passed in a string literal that the compiler had put into a read-only data segment.
Something like this would crash on many operating systems.
char * word = "test";
reverse(word); // this will crash if "test" isn't in writable memory
There are also several problems with your algorithm. You have len = len-1 and later temp[len-1], which means that the last character will never be read, and when len==0, you will be reading from the character before the start of the word. Also, temp and word are both pointers, so they both point to the same memory. I think you meant to make a copy of word rather than just a copy of the pointer to word. You can make a copy of word with strdup. If you do that, and fix your off-by-one problem with len, then your function should work.
But that still won't fix the write crash, which is caused by code that you have not shown us.
Oh, and if you do use strdup be sure to call free to free temp before you leave the function.
Well, for one, when len == 0 len-1 will be a negative number. And that's pretty illegal. Second, it's quite possible that your pointer is pointing at an unreserved area of memory.
If you called that function as follows:
reverse("this is a test");
then at least one compiler will pass in a read-only string, due to backwards compatibility with C, where you can pass string literals as non-const char*.
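A short sketch of the difference between a literal and a writable array. It uses std::reverse from <algorithm> in place of the hand-written loop, which is a substitution and not the asker's code; reverse_in_place is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>

// Reverses a writable, null-terminated string in place.
// The terminator stays where it is; only the characters before it move.
void reverse_in_place(char* word) {
    assert(word != nullptr);
    std::reverse(word, word + std::strlen(word));
}

// Usage: an array initialised from a literal is a writable copy.
//   char word[] = "test";
//   reverse_in_place(word);   // word now holds "tset"
```

Declaring the argument as char word[] = "test"; rather than char * word = "test"; gives the function its own modifiable storage, so the write no longer lands in the literal itself.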