Using pointers to substitute characters in C strings - c++

This is not homework, but study for a midterm.
I cannot use any type of array indexing such as str[i] or *(str+i)
I have to take the c-string "EECS280ISAWESOME" and substitute each 'E' with the c-string "XY". I also have to allow the replacement string ("XY" here) to be of any length.
The following main is given:
int main () {
const char* S = "EECS280ISAWESOME";
const char* P = "XY";
char result[256];
subsituteChar(S,P,'E', result);
cout << result << endl;
}
My solution seems complex, ugly, and like bad practice. I could do it better by dereferencing with an offset, such as *(R+1), but I don't think that's allowed.
void subsituteChar(const char* S, const char* P, char c, char* R) {
while(*S != '\0') {
if(*S == c) {
const char* PP = P;
while (*P != '\0') {
*R = *P;
R++;
P++;
}
P = PP;
} else {
*R = *S;
R++;
}
S++;
}
}
This works, but I am left with XYXYCS280ISAWXYSOMXY2. I have no idea where the weird 2 came from.

Since this is a study problem, here's a hint to get you started. Try initializing your result array. I did this:
char result[256];
for (int i = 0; i < 250; ++i)
result[i] = 'a';
result[250] = 0;
Now run your code again, and you'll see that you've got lots of 'a' characters at the end of your output, up to the point where you get to character 250. That is, you'll see:
"XYXYCS280ISAWXYSOMXYaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
If you want to figure out the rest for yourself, STOP READING NOW.
The problem is that you're not explicitly null terminating your string.
Note that it's not a problem with your last character of S being replaced, but rather with your while loop's terminal condition. Since you aren't manually copying over the '\0' character from S, you're left hoping that the result array is full of '\0' characters, which C and C++ don't guarantee.
Simply adding the following line to the end of your substituteChar function will solve the problem:
*R = '\0';

Does this fit?
#include<stdio.h>
int main () {
const char* S = "EECS280ISAWESOME";
const char* P = "XY";
char result[256];
while(*S)
{
if(*S=='E')
printf("XY");
else
printf("%c", *S);
S++;
}
}

Your code is correct apart from one small error. Every C string must be terminated with a null character, so just add *R = '\0'; after the end of the outer while loop and the algorithm works perfectly.

Related

How to return a pointer from a char* function

I am doing an exercise that is assigned by the Prof. The problem is:
Write the function myStrChr(). The function has two parameters: a const char * s pointing to the first character in a C-style string, and a char c. Return a pointer to the first appearance of c appearing inside s and nullptr (0) if c does not appear inside s.
I have been trying, but still cannot figure out how to fix the error: "invalid conversion from const char* to char*" or "invalid conversion from char* to const char*".
Please show me how to fix it.
char* mystrChr(const char *s, char c)
{
size_t len{};
for (size_t i = 0; s[i] != '\0'; i++)
len++;
char *p = s;
for (size_t i = 0; i < len; i++)
{
if (s[i] == c)
p = &s[i];
else
p = nullptr;
}
return p;
}
As mentioned in the comments, you need either to consistently use const char* throughout your function or, if you are required to return a (non-const) char* then you need to explicitly cast out the constness before returning.
But there are other problems in your code! The most serious is the fact that your second for loop doesn't stop when it finds a match: you should return the address of the character as soon as you find it.
Second (but less serious), you really don't need two loops - you can simply merge the two and either return a pointer to the found character or, if the loop ends without finding a match, return nullptr.
Here's a much-reduced version of the function that works:
const char* mystrChr(const char* s, char c)
{
while (*s) { // Shorthand way of writing "while (*s != '\0')"
if (*s == c) return s; // Found a match, so return current pointer!
++s;
}
return nullptr; // No match found - return signal null pointer
}
If you require your function to return a non-const char* pointer, then you can make the explicit cast, just before returning:
char* mystrChr(const char* s, char c)
{
while (*s) {
if (*s == c) return const_cast<char*>(s); // Cast away the constness
++s;
}
return nullptr; // The cast is not required when using nullptr
}
EDIT: If, as mentioned in the comments, you want to return the address of the string's "end-marker" if a nul character is passed as the c argument (as, in fact, the std::strchr function does), you can do this by modifying the return statement after the loop:
const char* mystrChr(const char* s, char c)
{
while (*s) {
if (*s == c) return s;
++s;
}
return (c == '\0') ? s : nullptr;
}
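To see the three cases side by side, here is a small sketch that pairs the final version with a hypothetical helper, offsetOrMinusOne, which turns the returned pointer into an offset. The helper and its name are assumptions made for illustration and are not part of the exercise.

```cpp
#include <cstddef>

// The version from the answer above that mirrors std::strchr for c == '\0'.
const char* mystrChr(const char* s, char c) {
    while (*s) {
        if (*s == c) return s;
        ++s;
    }
    return (c == '\0') ? s : nullptr;
}

// Hypothetical helper: offset of the match from the start of s, or -1.
std::ptrdiff_t offsetOrMinusOne(const char* s, char c) {
    const char* hit = mystrChr(s, c);
    return hit ? hit - s : -1;
}
```

For "hello" this gives 2 for 'l', -1 for 'z', and 5 for '\0' (the position of the terminator).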

C++ Junior interview question: function to compress a character sequence with only char pointers

I was at a job interview the other day and I had the following function to implement:
char* Compress (char * text);
The rules also said that you are not allowed to use standard library functions or types such as strlen, strcpy, string, etc. The function has to compress a given character sequence.
For example, if the input text is "11112222333344411", the value returned by Compress is "12341"; if the input is "aaAbbBBcCCa", it returns "aAbBcCa".
I am not sure I did everything properly (with memory handling) here so any suggestions would be great. Am I doing it right that I delete the value of temp every time? Also if there is a simpler way to implement this function (without using the standard library functions of course) I would be really pleased to see it.
#include <iostream>
char* Compress(char* text) {
char* temp;
char* _compText;
int size = 1;
_compText = nullptr;
for (size_t i = 0; text[i] != '\0'; ++i)
{
if (text[i] != text[i + 1]) {
++size;
temp = _compText;
_compText = new char[size];
for (size_t j = 0; j < size-2; ++j)
{
_compText[j] = temp[j];
}
_compText[size-2] = text[i];
_compText[size-1] = '\0';
delete[] temp;
}
}
return _compText;
}
int main()
{
char t[] = "111122222233333444444555555111";
char* compedT;
std::cout << "Before:\n";
std::cout << t;
compedT = Compress(t);
std::cout << "\nAfter: \n";
std::cout << compedT;
delete[] compedT;
return 0;
}
The function is implemented incorrectly to begin with.
Its declaration is
char* Compress (char * text);
                ^^^^^^^
That is, its parameter is not const char *. This means that the function should update the source string in place and return a pointer to its first character. There is no need to allocate memory dynamically to perform the task.
The function can be defined as shown in the following demonstration program.
#include <iostream>
char * Compress( char *s )
{
for ( char *p = s, *q = s; *q; )
{
if ( *++q != *p ) *++p = *q;
}
return s;
}
int main()
{
char s[] = "11112222333344411";
std::cout << Compress( s ) << '\n';
}
Its output is
12341
Alternatively, the function can be written as follows:
char * Compress( char *s )
{
for ( char *p = s, *q = s; *q; )
{
if ( ( *++q != *p ) and ( ++p != q ) ) *p = *q;
}
return s;
}
As for your own implementation, you should pay attention to compiler warnings such as this one:
warning: comparison of integer expressions of different signedness: 'size_t' {aka 'long unsigned int'} and 'int' [-Wsign-compare]
34 | for (size_t j = 0; j < size-2; ++j)
| ~~^~~~~~~~
Also, your function returns nullptr for an empty string, which looks logically inconsistent, and the function is very inefficient. :)
And do not use names that start with an underscore.
Does my code have memory leak?
As far as I can see, no; there is no memory leak.
That said, the use of bare owning pointers makes it difficult to spot memory leaks. They are a bad design choice, especially when transferring ownership to outside of the function. At the very least, there should be a comment near the function declaration that should document how the caller must clean up the allocation. If you need a dynamic array, a better solution is to use a container.
Am I doing it right that I delete the value of temp everytime?
As far as memory leaks are concerned, yes, you do need to delete every allocation. But reallocating memory on every iteration is unnecessary and quite slow. In fact, there doesn't appear to be any need for dynamic allocation at all (see Vlad's answer).
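To illustrate the container suggestion, here is a minimal sketch that builds the result in a std::string instead of a hand-managed buffer. It deliberately ignores the interview's ban on the standard library and leaves the input untouched. The name CompressCopy is hypothetical.

```cpp
#include <string>

// Returns a new string in which each run of equal characters is collapsed
// to a single character. The input text is not modified.
std::string CompressCopy(const char* text) {
    std::string out;
    for (const char* p = text; *p != '\0'; ++p) {
        // Keep a character only when it differs from the one just kept.
        if (out.empty() || out.back() != *p) {
            out.push_back(*p);
        }
    }
    return out;  // the string owns its memory, so the caller has nothing to delete
}
```

Here an empty input yields an empty string rather than nullptr, and ownership of the result is clear from the return type.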

How to declare an empty char* and increase the size dynamically?

Let's say I am trying to do the following (this is a sub problem of what I am trying to achieve):
int compareFirstWord(char* sentence, char* compareWord){
char* temp; int i=-1;
while(*(sentence+(++i))!=' ') { *(temp+i) = *(sentence+i); }
return strcmp(temp, compareWord); }
When I ran compareFirstWord("Hi There", "Hi");, I got an error at the copy line. It said I was using temp uninitialized. Then I used char* temp = new char[]; In this case the function returned 1 and not 0. When I debugged, I saw temp starting with some random characters of length 16, and strcmp fails because of this.
Is there a way to declare an empty char* and increase its size dynamically to just the length and contents I need? Is there any way to make the function work? I don't want to use std::string.
In C, you may do:
int compareFirstWord(const char* sentence, const char* compareWord)
{
while (*compareWord != '\0' && *sentence == *compareWord) {
++sentence;
++compareWord;
}
if (*compareWord == '\0' && (*sentence == '\0' || *sentence == ' ')) {
return 0;
}
return *sentence < *compareWord ? -1 : 1;
}
With std::string, you just have:
int compareFirstWord(const std::string& sentence, const std::string& compareWord)
{
return sentence.compare(0, sentence.find(" "), compareWord);
}
temp is an uninitialized variable.
It looks like you are attempting to extract the first word out of the sentence in your loop.
In order to do it this way, you would first have to initialize temp to be at least as long as your sentence.
Also, your sentence may not have a space in it. (What about period, \t, \r, \n? Do these matter?)
In addition, you must terminate temp with a null character.
You could try:
int len = strlen(sentence);
char* temp = new char[len + 1];
int i = 0;
while(i < len && *(sentence+(i))!=' ') {
*(temp+i) = *(sentence+i);
i++;
}
*(temp+i) = '\0';
int comparable = strcmp(temp, compareWord);
delete[] temp; // temp was allocated with new[], so it must be released with delete[]
return comparable;
Also consider using isspace(*(sentence+(i))), which will at least catch all whitespace.
In general, however, I'd use a library, or STL... Why reinvent the wheel...
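As a rough sketch of the isspace suggestion, the comparison can also be done without copying the first word at all. The helper atWordEnd is a name invented here, and treating any whitespace character as the end of the first word is an assumption about what the asker wants.

```cpp
#include <cctype>

// True at the end of the first word: end of string or any whitespace.
bool atWordEnd(const char* p) {
    return *p == '\0' || std::isspace(static_cast<unsigned char>(*p));
}

// Compares the first word of sentence with compareWord, strcmp-style.
int compareFirstWord(const char* sentence, const char* compareWord) {
    while (!atWordEnd(sentence) && *sentence == *compareWord) {
        ++sentence;
        ++compareWord;
    }
    if (atWordEnd(sentence)) {
        // First word is exhausted: equal if compareWord is too, else shorter.
        return *compareWord == '\0' ? 0 : -1;
    }
    // Mismatch inside the first word; compare as unsigned, like strcmp does.
    return static_cast<unsigned char>(*sentence) <
           static_cast<unsigned char>(*compareWord) ? -1 : 1;
}
```

Because nothing is allocated, there is no buffer to size, terminate, or delete.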

Add characters to a character array c++

Can someone tell me what's wrong with the following?
I'm trying to add characters to a character array. name is a pointer to a character array in the MyString class.
void MyString::add_chars(char* c)
{
if(l < strlen(c)+strlen(name))
name = resize(name, l, sizeof(c));
int i,j;
for(i=0; i<strlen(c); i++) {
name[i+l-1] = c[i];
l++;
}
}
char* MyString::resize(char* vptr, int currentsize, int extra) {
char* temp = new char[currentsize + extra];
int i;
for (i = 0; i < currentsize; i++) {
temp[i] = vptr[i];
}
vptr = temp;
return vptr;
}
And in main:
MyString g ("and");
g.add_chars("baasdf");
cout << g.get_name() << "\n";
But get_name returns "andb". How can I fix my code?
Edit:
Updated code, still same result..
void StringList::add_chars(char* c)
{
char* my_new_string = resize(name, l, sizeof(char));
if( my_new_string != NULL )
{
delete [] name;
name = my_new_string;
}
int i,j;
for(i=0; i<strlen(c); i++) {
name[i+l-1] = c[i];
l++;
}
name[l-1] = '\0';
}
char* StringList::resize(char* vptr, int currentsize, int extra) {
char* temp = new char[currentsize + extra + 1];
int i;
for (i = 0; i < currentsize; i++) {
temp[i] = vptr[i];
}
vptr = temp;
return vptr;
}
This line is wrong:
name = resize(name, l, sizeof(c));
You should not take the sizeof(char*), which your c variable is, but you should do sizeof(char) or just 1.
Also, make sure that you do +1 on the size to take care of the zero termination char at the end of your string.
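The difference between the two measurements can be seen in a small sketch. The function names pointerSize and textLength are made up for this illustration.

```cpp
#include <cstddef>
#include <cstring>

// sizeof applied to a char* parameter measures the pointer, not the text.
std::size_t pointerSize(const char* c) { return sizeof(c); }

// strlen counts the characters before the terminating '\0'.
std::size_t textLength(const char* c) { return std::strlen(c); }
```

pointerSize returns the same value for every argument (commonly 4 or 8, depending on the platform), while textLength("baasdf") is 6.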
How can I fix my code?
Don't fix it. Throw it away and use vector<char> or just string.
But I insist, how can I fix my code!?
OK, OK, here is how...
Get a nice debugger, for example this one.
Step carefully through the code, constantly inspecting the variables and comparing them with what you expect them to be.
When you reach the call to resize, take note of sizeof(c) (assigned to extra parameter of resize). When you realize it is not what you expected, ask yourself: what is the purpose of sizeof, and you'll understand why.
BTW, you also have a memory leak and very poor performance due to all these strlen calls.
Firstly, am I right in assuming that this is a learning exercise in "how to create your own string class"? C++ already has a built-in string type, which you should prefer in most cases.
The sizeof operator yields the size (in bytes) of its operand, which in this case is c, whose type is char*. It looks like what you're actually after is the length of a null-terminated character array (a "C string"). You're already using strlen, so I'd suggest you simply use that again (taking the null terminator into account too):
name = resize(name, l, strlen(c) + 1);
Note, that your code looks as if it suffers from memory leaks. You're assigning a new value to your name variable without clearing up whatever existed there first.
if(l < strlen(c)+strlen(name))
{
char* my_new_string = resize(name, l, strlen(c));
if( my_new_string != NULL )
{
delete [] name;
name = my_new_string;
}
}
EDIT: As other replies have pointed out, there's still plenty wrong with the code which could be resolved using C++'s string and vector.
Here's one possible way you could implement add_chars
void MyString::add_chars(char* c)
{
if( c != NULL && name != NULL )
{
size_t newlength = strlen(c) + strlen(name) + 1;
char* newstring = new char[newlength];
if( newstring != NULL )
{
size_t namelength = strlen(name);
size_t remaining = newlength - namelength;
strncpy( newstring, name, newlength );
strncpy( &newstring[namelength] , c, remaining );
delete [] name;
name = newstring;
}
}
}

Find the first occurrence of char c in char *s or return -1

For a homework assignment, I need to implement a function which takes a char *s and a char c and return the index of c if found, and -1 otherwise.
Here's my first try:
int IndexOf(const char *s, char c) {
for (int i = 0; *s != '\0'; ++i, ++s) {
if (*s == c) {
return i;
}
}
return -1;
}
Is that an okay implementation, or are there things to improve?
EDIT: Sorry, I didn't mention that I should only use pointer arithmetic/dereferencing, not something like s[i]. Besides, use of the Standard Library is not allowed.
Yes, it's fine, but you could increment only one variable:
int IndexOf(const char *s, char c) {
for (int i = 0; s[i] != '\0'; ++i) {
if (s[i] == c) {
return i;
}
}
return -1;
}
Won't make any serious difference though, mostly a matter of taste.
Looks fine to me, at least given the signature. Just to add to the "many slightly different ways to do it" roadshow:
int IndexOf(const char *s, const char c) {
for (const char *p = s; *p != 0; ++p) {
if (*p == c) return p - s;
}
return -1;
}
Slight issue - p-s isn't guaranteed to work if the result is sufficiently big, and certainly goes wrong here if the correct result is bigger than INT_MAX. To fix this:
size_t IndexOf(const char *s, const char c) {
for (size_t idx = 0; s[idx] != 0; ++idx) {
if (s[idx] == c) return idx;
}
return SIZE_MAX;
}
As sharptooth says, if for some didactic reason you're not supposed to use the s[i] syntax, then *(s+i) is the same.
Note the slightly subtle point that because the input is required to be nul-terminated, the first occurrence of c cannot be at index SIZE_MAX unless c is 0 (and even then we're talking about a rather unusual C implementation). So it's OK to use SIZE_MAX as a magic value.
All the size issues can be avoided by returning a pointer to the found character (or null) instead of an index (or -1):
char *findchr(const char *s, const char c) {
while (*s) {
if (*s == c) return (char *)s;
++s;
}
return 0;
}
Instead you get an issue with const-safety, the same as the issue that the standard function strchr has with const-safety, and that can be fixed by providing const and non-const overloads.
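One way the overload pair could look in C++ is sketched below. It reuses the findchr name from above; delegating the non-const overload to the const one is a choice made for this sketch, not something the answer prescribes.

```cpp
// Const overload: searching read-only text yields a read-only pointer.
const char* findchr(const char* s, char c) {
    while (*s) {
        if (*s == c) return s;
        ++s;
    }
    return nullptr;
}

// Non-const overload: delegates to the const version, then restores
// mutability. The cast is safe because s pointed to mutable data to begin with.
char* findchr(char* s, char c) {
    return const_cast<char*>(findchr(static_cast<const char*>(s), c));
}
```

A caller holding a char buffer gets back a char* it may write through, while a caller holding a const char* cannot accidentally obtain a writable pointer.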
Here's a way to do it without keeping track of the index:
int IndexOf(const char *s, char c) {
const char *p = s;
while (*p != '\0') {
if (*p == c) {
return p - s;
}
++p;
}
return -1;
}
This is not necessarily better than your solution. Just demonstrating another way to use pointer arithmetic.
FWIW, I would define the function to return size_t rather than int. Also, for real-world usage (not homework), you would probably want to consider what the proper behavior should be if s is a NULL pointer.
Yours is perfectly fine, as far as it goes. You should also write a simple test program that tests for the first char, last char, and a missing char.
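A minimal sketch of such a test follows. The check routine countIndexOfFailures is a hypothetical name, and the implementation from the question is repeated only so that the example is self-contained.

```cpp
// The implementation from the question.
int IndexOf(const char* s, char c) {
    for (int i = 0; *s != '\0'; ++i, ++s) {
        if (*s == c) {
            return i;
        }
    }
    return -1;
}

// Hypothetical check routine: returns the number of failed expectations.
int countIndexOfFailures() {
    int failures = 0;
    if (IndexOf("hello", 'h') != 0) ++failures;   // first character
    if (IndexOf("hello", 'o') != 4) ++failures;   // last character
    if (IndexOf("hello", 'x') != -1) ++failures;  // missing character
    if (IndexOf("", 'h') != -1) ++failures;       // empty string
    return failures;
}
```

Calling countIndexOfFailures() from main and checking for 0 is enough to catch off-by-one mistakes at either end of the string.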
Piling on to the 'other ways to do it' group, here is one with no break, a single return, and showing off pointer arithmetic. But, beware: if I were grading your homework, I would grade yours higher than mine. Yours is clear and maintainable, mine needlessly uses ?: and pointer subtraction.
#include <stdio.h>
int IndexOf(const char *s, const char c)
{
const char * const p = s;
while(*s && *s != c) s++;
return (*s) ? s-p : -1;
}
#ifdef TEST
int main()
{
printf("hello, h: %d\n", IndexOf("hello", 'h'));
printf("hello, g: %d\n", IndexOf("hello", 'g'));
printf("hello, o: %d\n", IndexOf("hello", 'o'));
printf("hello, 0: %d\n", IndexOf("hello", 0));
}
#endif
The output of this program is:
hello, h: 0
hello, g: -1
hello, o: 4
hello, 0: -1
There's a typo (index instead of i), but otherwise it looks fine. I doubt you'd be able to do much better than this (both in terms of efficiency and code clarity.)
Yes, you should return i, not index. I think it's just a typo.
Another variant, as an old school C programmer may write it:
int IndexOf(const char *s, char c) {
int i = 0;
while (s[i] && (s[i] != c)) ++i;
return (s[i] == c)?i:-1;
}
Benefits: short, only one variable, only one return point, no break (considered harmful by some people).
For clarity I would probably go for the one below:
int IndexOf(const char *s, char c) {
int result = -1;
for (int i = 0; s[i] != 0; ++i) {
if (s[i] == c) {
result = i;
break;
}
}
return result;
}
It uses a break, but has only one return point, and is still short.
You may also notice I used plain 0 instead of '\0', just as a reminder that char is a numeric type and that single quotes are merely a shorthand for converting characters to their values. Obviously, comparing to 0 can also be replaced by ! in C.
EDIT:
If only pointer arithmetic is allowed, this does not change much... really s[i] is pointer arithmetic... but you can rewrite it *(s+i) if you prefer (or even i[s] if you like obfuscation)
int IndexOf(const char *s, char c) {
int result = -1;
for (int i = 0; *(s+i) != 0; ++i) {
if (*(s+i) == c) {
result = i;
break;
}
}
return result;
}
For a version that works for most cases on x86 systems, one can use:
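#include <stdint.h> // uint32_t
int IndexOf(char *s, char sr)
{
uint32_t *x = (uint32_t*)s; // scan the string four bytes at a time
uint32_t msk[] = { 0xff, 0xff00, 0xff0000, 0xff000000 };
uint32_t b = (unsigned char)sr; // avoid sign extension of negative char values
uint32_t f[4] = { b, b << 8, b << 16, b << 24 };
uint32_t c[4], m;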
for (;;) {
m = *x;
c[0] = m & msk[0]; if (!c[0]) break; if (c[0] == f[0]) return (char*)x - s;
c[1] = m & msk[1]; if (!c[1]) break; if (c[1] == f[1]) return (char*)x - s + 1;
c[2] = m & msk[2]; if (!c[2]) break; if (c[2] == f[2]) return (char*)x - s + 2;
c[3] = m & msk[3]; if (!c[3]) break; if (c[3] == f[3]) return (char*)x - s + 3;
x++;
}
return -1;
}
Limitations:
It breaks if the string is shorter than four bytes and its address is closer than four bytes to the end of an MMU page.
Also, the mask pattern is little endian; for big endian systems the order of the msk[] and f[] arrays has to be reversed.
In addition, if the hardware can't do misaligned multi-byte accesses (x86 can), then it will also fail if the string doesn't start at an address that's a multiple of four.
All of these are solvable with more elaborate versions, if you wish...
Why would you ever want to do weird things like that - what's the purpose ?
One does so for optimization. A char-by-char check is simple to code and understand but optimal performance, at least for strings above a certain length, tends to require operations on larger blocks of data. Your standard library code will contain some such "funny" things for that reason. If you compare larger blocks in a single operation (and with e.g. SSE2 instructions, one can extend this to 16 bytes at a time) more work gets done in the same time.
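To give a flavor of what "larger blocks" means, here is a sketch of a widely known bit trick that tests all four bytes of a 32-bit word in a few operations. It is not the code from the answer above and it does not read memory at all, so the alignment and page-boundary caveats still apply to any real scanner built on it. The function names are hypothetical.

```cpp
#include <cstdint>

// True if any of the four bytes of v is zero.
bool hasZeroByte(std::uint32_t v) {
    return ((v - 0x01010101u) & ~v & 0x80808080u) != 0;
}

// True if any of the four bytes of v equals b.
bool hasByte(std::uint32_t v, unsigned char b) {
    // XOR turns every byte equal to b into zero, then the zero test is reused.
    return hasZeroByte(v ^ (0x01010101u * b));
}
```

A word-at-a-time search would apply both tests to each loaded word and fall back to a byte-by-byte check only for the word in which one of them fires.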