C++ Check and modify strings / String subscript out of range


I'm trying to make a program which modifies words in a specific manner:
It should first check the ending of the words and then proceed to modify them. I won't explain it in detail, because it doesn't make much sense in English.
I've written the following:
#include "stdafx.h"
#include <iostream>
#include <string>
using namespace std;
int main()
{
cout << "Por favor, introduzca los gentilicios separados por la tecla enter, para finalizar, escriba OK" << '\n';
string name[10];
string place[10];
for (int i(0); (i < 10); i++)
{
getline(cin, name[i]);
if (name[i] == "OK") //Error here
break;
}
for (int i(0); (i < 10); i++)
{
place[i] = name[i];
if (name[i][name[i].length() - 1] == 'c')
{
if (name[i][name[i].length()] == 'a' || (name[i][name[i].length()] == 'o') || (name[i][name[i].length()] == 'u'))
place[i][place[i].length() - 1] = 'q';
place[i][place[i].length()] = 'u';
place[i] = place[i] + "istan";
}
else if (name[i][name[i].length()] == 'a' || name[i][name[i].length()] == 'e' || name[i][name[i].length()] == 'i' || name[i][name[i].length()] == 'o' || name[i][name[i].length()] == 'u')
{
place[i][place[i].length()] = 'i';
place[i] = place[i] + "stan";
}
if (name[i][name[i].length()] == 's')
place[i] = place[i] + "tan";
else {
place[i] = place[i] + "istan";
}
place[i][0] = toupper(place[i][0]);
}
for (int i(0); (i < 10); i++)
{
cout << place[i] << '\n';
}
return 0;
}
Now I'm getting the error "String subscript out of range". I would like to know exactly where the error is. I know it appears when I type "OK", at line "18".

The problem is the condition i <= sizeof(name). sizeof(name) returns the size of the array in bytes, not the number of elements in it. Even if it returned the number of elements, <= would still be wrong and would cause an out-of-bounds access (it should be <).
To loop through all elements in an array, you can use the range-based for-loop:
for(auto& n : name)
{
getline(cin, n);
if (n == "OK")
break;
}
Or to do it the right way with the C-style for-loop:
for (int i(0); i < sizeof(name)/sizeof(name[0]); i++)
{
…
}

Here:
for (int i(0); (i <= sizeof(name)); i++)
sizeof(name) is the size in bytes of the array, which as it is an array of std::string is effectively meaningless. If you want to iterate over 10 items, simply say so (note also that less-than-or-equals is also wrong here):
for (int i = 0; i < 10; i++)
And here:
getline(cin, name[i]);
whenever you perform input you must check the return value of the input function and handle any errors:
if( ! getline(cin, name[i]) ) {
// handle error somehow
}
And here:
string * p;
you do not want to be dealing with pointers to strings. If you want to access the contents of a string, you use operator[] or other string member functions on the string.

std::strings are not like C strings. You can't just grab a part of them through a std::string*. When you write
*(p+ (name[i].length()-2))
you are actually saying: advance the address stored in p by name[i].length()-2 strings and access the std::string found there. If that goes past the end of the name array, it is undefined behavior. If not, you still have a std::string, which cannot be compared with a char. If you want to check if the string ends with "ca" then you can just use
if (name[i].substr(name[i].size() - 2) == "ca")

Your last loop is doing something quite funky. There's no need to go that far. You can just do something like:
if (name[i][name[i].length() - 2] == 'c')
to compare the next-to-last character with c, and a very similar test to compare the last one with a.
To clarify why what you're doing is not OK: you first get p as a pointer to the current element, which is a string. Then you do some pointer arithmetic, p + (name[i].length() - 2), which still results in a pointer to a string. Finally, you dereference this, resulting in a string, which you can't compare to a char. Moreover, the pointer now refers to some arbitrary address in memory, so the dereference would produce a string with very bad data in it. If you tried to work with it you'd break your program.
You seem to be working with the string as one would with a C-like string, a char*. The two are not the same, even though they represent the same concepts. A C++ string, usually, has a size field, and a char* pointer inside it, as well as a bunch of other logic to make working with it a char-m.

Because you aren't comparing against a specific char in the string, you're comparing against a string.
Considering the following bit of code:
*(p + (name[i].length() - 2))
This evaluates to a string because you are taking p (a string*) and adding an integer offset to it, which gives another string*; dereferencing that gives a std::string, not a char. A std::string cannot be compared with a char, so the two sides of the == are not comparable.
What you need here instead is this:
if (name[i][name[i].length() - 2] == 'c')
Since name[i] is already a string, we can just get the char from it using the code above. This does return char, so it's comparable. This also allows you to get rid of the whole string* bit as it is not needed.

First, (i <= sizeof(name)) is wrong; it should be i < sizeof(name) / sizeof(*name). sizeof(array) returns the size of the array in bytes, so you need to divide it by the size of one element to get the number of elements in the array. If you find that complicated, use std::vector:
vector<string> name(10); //a vector of size 10
for (size_t i = 0; i < name.size(); i++) //name.size(), simple
Secondly, you need to keep track of how many strings were actually stored in your name array, or check for name[i] == "OK" and break out of the second loop as well (as in the first loop). The elements of name after "OK" were never filled in.
Thirdly, don't use *(p+ (name[i].length()-2)). If you want the second last character of name[i], you can write it as name[i][name[i].size()-2] or name[i].end()[-2] or end(name[i])[-2]
If you want to check if the word ends in "ca", then you can use substr:
if (name[i].substr(name[i].size() - 2) == "ca")
{
//...
}
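One caveat about the suffix checks above: name[i].substr(name[i].size() - 2) throws std::out_of_range when the word has fewer than two characters, and name[i][name[i].size() - 2] is undefined behavior in that case. The array slots that were never filled in after "OK" are empty strings, so the case does come up. The following is a minimal sketch of a guarded check; endsWith and lastChar are hypothetical helper names, not something from the question.

```cpp
#include <string>

// Hypothetical helper: true when `word` ends with `suffix`.
// The length test runs first, so a word shorter than the suffix is
// rejected before any position is computed from size() - suffix.size().
bool endsWith(const std::string& word, const std::string& suffix)
{
    return word.size() >= suffix.size() &&
           word.compare(word.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: the last character, or '\0' for an empty string.
char lastChar(const std::string& word)
{
    return word.empty() ? '\0' : word.back();
}
```

With these, a test such as endsWith(name[i], "ca") or lastChar(name[i]) == 's' stays safe even for the empty entries.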

Related

Replacing a substring with a space character

I am given a string and I have to remove a substring from it. Namely WUB, and replace it with a space character.
There are 2 WUBs between 'ARE' and 'THE', so the first condition in the if statement is meant to avoid printing two blank spaces, but when the code runs two blank spaces are still printed.
Input: WUBWEWUBAREWUBWUBTHEWUBCHAMPIONSWUBMYWUBFRIENDWUB
Output: WE ARE THE CHAMPIONS MY FRIEND
Here is my code so far:
#include <iostream>
using namespace std;
int main()
{
const string check = "WUB";
string s, p;
int ct = 0;
cin >> s;
for (int i = 0; i < s.size(); i++)
{
if (s[i] == 'W' && s[i+1] == 'U' && s[i+2] == 'B')
{
i += 2;
if (p[ct] == '32' || p.empty())
{
continue;
}
else
{
p += ' ';
ct++;
}
}
else
{
p += s[i];
ct++;
}
}
cout << p;
return 0;
}
Why is the first if statement never executed?

2 things are going to break your code:
your for loop runs while i < s.size(), but the body reads s[i+1] and s[i+2], which can index past the end: (s[i]=='W' && s[i+1]=='U' && s[i+2]=='B')
and here: if(p[ct]=='32') you surely mean if(p[ct]==32) or if(p[ct]==' ')

This condition
if(p[ct]=='32')
should read either
if(p[ct]==32)
or
if(p[ct]==' ')
that is, compare to the numeric value of the space character or to the space character itself.
Additionally, when your i grows close to the string's length, the subexpressions s[i+1] and s[i+2] may reach non-existing characters of the string. You should only test for WUB while i + 2 < s.length() holds.
EDIT
For a full solution you need to fully understand the problem you want to solve. The problem statement is a bit vague:
remove a substring ("WUB") from (a given string). And put a space inplace of it if required.
You considered the last condition, but not deeply enough. What does it mean, 'if required'? Replacement is not required if the resulting string is empty or you appended a space to it already (when you encounter a second of further consecutive WUB). It is also not necessary if you are at WUB, but there is nothing more following it - except possibly another WUBs...
So, when you find a "WUB" substring it is too early to decide if a space is needed. You know you need a space when you find a non-WUB text following some WUB (or WUBs) and there was some text before those WUB(s).
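The rule described above (emit a space only once ordinary text follows one or more WUBs, and only if some text was already produced) can be sketched as follows. This is one possible reading of the problem statement, not the asker's code; removeWub and pendingSpace are names chosen for illustration.

```cpp
#include <string>

// Hypothetical helper: drops every "WUB" and puts a single space
// between the words that were separated by one or more WUBs.
std::string removeWub(const std::string& s)
{
    std::string out;
    bool pendingSpace = false;  // a WUB was seen since the last copied character
    std::size_t i = 0;
    while (i < s.size()) {
        // compare() clamps the length, so this never reads past the end
        if (s.compare(i, 3, "WUB") == 0) {
            pendingSpace = true;  // too early to decide whether a space is needed
            i += 3;
        } else {
            // ordinary text after WUB(s): a space is needed only if
            // something was written before those WUB(s)
            if (pendingSpace && !out.empty())
                out += ' ';
            pendingSpace = false;
            out += s[i];
            ++i;
        }
    }
    return out;  // a trailing WUB leaves pendingSpace set and adds nothing
}
```

Because the decision is deferred until a non-WUB character shows up, leading, trailing and repeated WUBs need no special cases.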

There are actually three bugs here, so it's probably worth collecting them in one answer:
The first condition:
if (s[i] == 'W' && s[i+1] == 'U' && s[i+2] == 'B')
is out of bounds for the last two characters. One fix would be to check the length first (written as i + 2 < s.length() so that nothing is subtracted from an unsigned length):
if(i + 2 < s.length() && s[i] == 'W' && s[i+1] == 'U' && s[i+2] == 'B')
There's a multicharacter-literal in
if (p[ct] == '32' || p.empty())
Use ' ' or 32 or std::isspace instead. IMO the last one is the best.
In the same condition
p[ct] == '32'
is always out of bounds: ct is equal to p.length(). (Credits to Some programmer dude, who mentioned this in the comments!) The variable ct is also redundant, since std::string knows its length. I suggest using std::string::back() to access the last character and reordering the condition like so:
if (p.empty() || std::isspace(p.back()))

The algorithm to this program is on the right track.
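However, there are a few issues.
The for loop indexes past the end of the string. One way to solve this is to test for "WUB" only while at least three characters remain (any characters left after that are copied as they are). Something like this:
for (size_t i = 0; i + 3 <= s.size(); i++) {
// safe to read s[i], s[i+1] and s[i+2] here
}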
I do not suggest using other variables as counters like ct. In this case ct can reach an index out of bound error by using p[ct] inside the for loop.
Creating a string and using append() function will be a better solution. In this case, we iterate through each character in the string and if we find "WUB" then we append a " ". Otherwise, we append the character.
I highly recommend writing the first if() statement using substr() from C++.
This makes the code easier to read.
substr creates and returns a new string that starts at a given position and contains a given number of characters. Here is the syntax
syntax: substr(startingIndex, count);
count is a number of characters, not an ending index; if fewer than count characters remain, the result is simply shorter
#include <string>
#include <iostream>
using namespace std;
int main() {
string s, p;
cin >> s;
for (size_t i = 0; i < s.size(); ) {
if (s.substr(i, 3) == "WUB") {
p.append(" "); // one space per WUB
i += 3; // skip the whole marker
} else {
p.append(s.substr(i, 1)); // copy one ordinary character
i++;
}
}
cout << p;
}

Encrypting a string but receiving an infinite loop

Problem:
I was trying to encrypt a std::string password with a single rule:
Add "0" before and after a vowel
So that bAnanASplit becomes b0A0n0a0n0A0Spl0i0t.
However, I got stuck in an infinite loop.
Here is the code:
const std::string VOWELS = "AEIOUaeiou";
std::string pass = "bAnanASplit";
//Add zeroes before and after vowels
for (int i = 0; i < pass.length(); ++i)
{
i = pass.find_first_of(VOWELS, i);
std::cout << pass << "\n";
if(i != std::string::npos)
{
std::cout << pass[i] << ": " << i << "\n";
pass.insert(pass.begin() + i++, '0');
pass.insert(pass.begin() + ++i, '0');
}
}
...And the result:
bAnanASplit
A: 1
b0A0nanASplit
a: 5
b0A0n0a0nASplit
A: 9
b0A0n0a0n0A0Split
i: 15
b0A0n0a0n0A0Spl0i0t
b0A0n0a0n0A0Spl0i0t
A: 2
b00A00n0a0n0A0Spl0i0t
a: 8
b00A00n00a00n0A0Spl0i0t
A: 14
b00A00n00a00n00A00Spl0i0t
i: 22
b00A00n00a00n00A00Spl00i00t
b00A00n00a00n00A00Spl00i00t
...
Any help? This sure seems strange.
Edit: All the answers were useful, and therefore I have accepted the one which I think best answers the question. However, the best way to solve the problem is shown in this answer.

Never, ever, modify the collection/container you are iterating upon!
Saves you a lot of trouble that way.
Let's start with your code and generate a new string with vowels surrounded by 0.
const std::string VOWELS = "AEIOUaeiou";
std::string pass = "bAnanASplit", replacement;
//Add zeroes before and after vowels
for (auto ch : pass)
{
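if(VOWELS.find(ch) != std::string::npos)
{
// append one char at a time: '0' + ch + '0' would add the character codes as integers
replacement += '0';
replacement += ch;
replacement += '0';
}
else
replacement += ch;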
}
And there you have it!

As the OP seems to look for the exact reason for the misbehavior, I thought to add another answer as the existing answers do not show the exact issue.
The reason for the unexpected behavior is visible in following lines.
for (int i = 0; i < pass.length(); ++i)
{
i = pass.find_first_of(VOWELS, i);
...
Problem 1:
The loop counter i is an int (i.e. a signed int). But std::string::find_first_of returns std::string::npos if there's no match. This is usually the maximum number representable by an unsigned long. Assigning a huge unsigned value to a shorter signed variable stores a totally unexpected value (assuming you are not aware of that). In this case, i becomes -1 on most platforms (try int k = std::string::npos; and print k if you need to be sure). The loop's ++i then turns that -1 into 0, which satisfies the loop condition i < pass.length(), so the next iteration is allowed and the search starts again from the beginning of the string.
Problem 2:
Closely related to the above problem, same variable i is used to define the start position for the find operation. But, as explained, i will not represent the index of the character as you would expect.
Solution:
Storing a malformed value can be solved by using the proper data type. In the current scenario, best options would be using std::string::size_type as this is always guaranteed to work (most probably this will be equal to size_t everywhere). To make the program work with the given logic, you will also have to use a different variable to store the find result.
However, a better solution would be using a std::stringstream for building the string. This will perform better than modifying a string by inserting characters in the middle.
e.g.
#include <iostream>
#include <sstream>
#include <string>
int main() {
using namespace std;
const string VOWELS = "AEIOUaeiou";
const string pass = "bAnanASplit";
stringstream ss;
for (const char pas : pass) {
if (VOWELS.find(pas) == std::string::npos) {
ss << pas;
} else {
ss << '0' << pas << '0';
}
}
cout << pass << "\n";
cout << ss.str() << endl;
}
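For completeness, the in-place variant mentioned in the solution above (a std::string::size_type index plus a separate variable for the find result) could look roughly like the sketch below. padVowels is a hypothetical name, and this is only one way to arrange it: the search position is kept in an unsigned variable and npos is tested before the position is ever used.

```cpp
#include <string>

// Hypothetical helper: returns a copy of `pass` with '0' inserted
// before and after every vowel.
std::string padVowels(std::string pass)
{
    const std::string VOWELS = "AEIOUaeiou";
    std::string::size_type pos = pass.find_first_of(VOWELS);
    while (pos != std::string::npos) {
        pass.insert(pos, 1, '0');      // zero before the vowel; the vowel moves to pos + 1
        pass.insert(pos + 2, 1, '0');  // zero right after the vowel
        // continue searching after the second inserted zero
        pos = pass.find_first_of(VOWELS, pos + 3);
    }
    return pass;
}
```

For the password from the question, padVowels("bAnanASplit") yields b0A0n0a0n0A0Spl0i0t. The stringstream version above is still likely to be cheaper for long inputs, since each insert in the middle shifts the rest of the string.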

You are not exiting the loop when find_first_of() returns std::string::npos, which happens once the search starts after the last vowel (the i of Split). std::string::npos is the largest value that a size_t can hold, and because i is a signed integer it ends up holding some unexpected value (likely -1). After the loop's increment, the terminating condition i < pass.length() may hold true again and the loop continues. So, I recommend the following changes to your code:
for (size_t i = 0; i < pass.length(); ++i)
{
i = pass.find_first_of(VOWELS, i);
if(i == std::string::npos)
break;
pass.insert(pass.begin() + i++, '0');
pass.insert(pass.begin() + ++i, '0');
}
On the same note, if (i != std::string::npos) compares a signed int with an unsigned value, so it may not do what you expect.
But then again, it is better not to modify the container while you are iterating over it, as @Tanveer mentioned in his answer.

Input C-style string and get the length

The string input format is like this
str1 str2
I DONT know the no. of characters to be inputted beforehand so need to store 2 strings and get their length.
Using the C-style strings ,tried to made use of the scanf library function but was actually unsuccessful in getting the length.This is what I have:
// M W are arrays of char with size 25000
while (T--)
{
memset(M,'0',25000);memset(W,'0',25000);
scanf("%s",M);
scanf("%s",W);
i = 0;m = 0;w = 0;
while (M[i] != '0')
{
++m; ++i; // incrementing till array reaches '0'
}
i = 0;
while (W[i] != '0')
{
++w; ++i;
}
cout << m << w;
}
Not efficient mainly because of the memset calls.
Note:
I'd be better off using std::string but then because of 25000 length input and memory constraints of cin I switched to this.If there is an efficient way to get a string then it'd be good

Aside from the answers already given, I think your code is slightly wrong:
memset(M,'0',25000);memset(W,'0',25000);
Do you really mean to fill the string with the character zero (value 48 or 0x30 [assuming ASCII before some pedant downvotes my answer and points out that there are other encodings]), or with a NUL (character of the value zero). The latter is 0, not '0'
scanf("%s",M);
scanf("%s",W);
i = 0;m = 0;w = 0;
while (M[i] != '0')
{
++m; ++i; // incrementing till array reaches '0'
}
If you are looking for the end of the string, you should be using 0, not '0' (as per above).
Of course, scanf will put a 0 at the end of the string for you, so there's no need to fill the whole string with 0 [or '0'].
And strlen is an existing function that will give the length of a C style string, and will most likely have a more clever algorithm than just checking each character and increment two variables, making it faster [for long strings at least].

You do not need memset when using scanf; scanf adds the terminating '\0' to the string.
Also, strlen is a simpler way to determine the string's length:
scanf("%s %s", M, W); // provided that M and W contain enough space to store the string
m = strlen(M); // don't forget #include <string.h>
w = strlen(W);
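The "enough space" condition in the comment above can be enforced with a field width in the format string. The sketch below assumes 25000-byte buffers as in the question (so the width is 24999, leaving room for the terminator) and uses sscanf on an in-memory line instead of scanf on stdin, purely so that it can be run in isolation. readPair is a hypothetical helper name.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper: reads two whitespace-separated words from `line`
// into M and W (each assumed to be at least 25000 bytes) and stores their
// lengths in m and w. Returns false if fewer than two words were found.
bool readPair(const char* line, char* M, char* W, std::size_t& m, std::size_t& w)
{
    // %24999s stores at most 24999 characters plus the terminating '\0'
    if (std::sscanf(line, "%24999s %24999s", M, W) != 2)
        return false;
    m = std::strlen(M);  // no memset needed: sscanf already terminated the strings
    w = std::strlen(W);
    return true;
}
```

Checking the return value also covers the case where the input ends early, which the original loop does not handle.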

A C-style strlen without memset may look like this:
#include <iostream>
using namespace std;
// named my_strlen so it cannot clash with the standard strlen
unsigned my_strlen(const char *str) {
const char *p = str;
unsigned len = 0;
while (*p != '\0') {
len++;
p++; // advance to the next character
}
return len;
}
int main() {
cout << my_strlen("C-style string");
return 0;
}
It prints 14.

Appending character to a string in C with strcat

Hi guys I'm still really confused with pointers and I'm wondering if there's anyways to do the following without having to use sprintf:
char a[100], b[100], c[2];
//Some code that puts a string into a
for(i = 0; i<strlen(a); i++)
{
if(a[i] == 'C')
strcat(b, "b");
else if(a[i] == 'O')
strcat(b, "a");
else if(a[i] == 'D')
strcat(b, "1");
else
{
sprintf(c, "%s", a[i]);
strcat(b, c);
}
}
pretty much a for loop looping through a string(an array) and filling up another string with a character(or string) depending on what the character is, if the character ain'T C, O or D it just adds it to the other string.
I can't seem to just do strcat(b, a[i]); and I understand that it wouldn't work because it would try strcat(char *, char) instead of char*, const char*).
Is there anyway I can turn it into a pointer? they still confuse me so much..and I'm new to programming in general just to low level languages...
also what would be the best way to initialize char[]s? that are gonna be filled with a string, what I use right now is :
char ie[30] = ""
Also let me know if there's any easier way to do what
I want and sorry if it's unclear this is obviously a throwaway script but the same concept is used in my script.
Thank you in advance stackoverflow :X

(1) One bug may be in your code:
Your comment says "Some code that puts a string into a", and I think you don't assign any string to b. So by default char b[100]; holds garbage values (there may be no \0 in b), but the string concatenation function expects b to be a string. So
strcat(b, "b"); <-- is undefined behavior
(2) A technique to initialize an empty string:
Yes, you should always initialize your variables (arrays) with default values, like:
char a[100] = {0}, b[100] = {0}, c[2] = {0};
note: the remaining elements of a partially initialized array are set to 0 (null). Initializing variables is considered good practice.
(3) Yes, strcat(b, a[i]); is wrong:
To concatenate the string from a[i] onwards into b you can do:
strcat(b, a + i);
Yes, you are correct, strcat(b, a[i]); is indeed not valid.
note: a[i] and (a + i) are not the same; a[i] has type char, whereas (a + i) is a pointer to char that points to the string starting at position i of a.
Suppose you have following string array a and value of i is 2 then:
+----+----+----+---+---+----+----+----+---+
| 'u'| 's' |'e'|'r'|'5'| '6'| '7'|'8' | 0 |
+----+----+----+---+---+----+----+----+---+
201 202 203 204 205 206 207 208 209 210 211
^ ^
| |
a (a+i)
So in the above diagram the value of a is 201 and its type is char [100] (assuming the array is 100 in size). (a + i) also points to a string, the one starting from 'e' at address 203, whereas a[i] = 'e'.
So you can't do strcat(b, a[i]); but strcat(b, a + i); is valid syntax.
Additionally, from @BenVoigt: to concatenate n chars of a starting at the ith position you can do:
strncat(b, a+i, n);
It will append n chars from a+i to b.

Since you want to take a substring of a exactly one character long:
strncat(b, a+i, 1);
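Put together with the loop from the question, the strncat call replaces the sprintf branch. The following is a minimal sketch under the assumption that b is large enough for strlen(a) + 1 bytes; translate is a hypothetical function name.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies `a` into `b`, mapping C -> b, O -> a, D -> 1.
// Assumes `b` has room for strlen(a) + 1 bytes.
void translate(const char* a, char* b)
{
    b[0] = '\0';  // b must be a valid (empty) string before the first strcat
    const std::size_t len = std::strlen(a);
    for (std::size_t i = 0; i < len; i++) {
        if (a[i] == 'C')
            std::strcat(b, "b");
        else if (a[i] == 'O')
            std::strcat(b, "a");
        else if (a[i] == 'D')
            std::strcat(b, "1");
        else
            std::strncat(b, a + i, 1);  // append exactly one character of a
    }
}
```

strncat always writes a terminating null after the copied characters, so b remains a valid string after every iteration.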

Initialize all char arrays to zero so that no garbage values exist in the code.
You are appending to a char array that contains garbage.
char a[100]={0}, b[100]={0}, c[2]={0};
Now the strcat() function behaves properly.

You seem to be confused in regards to strings. A string isn't just an array. Which book are you reading?
When you first call strcat to operate on b, b isn't guaranteed to be a string. The result is undefined behaviour. This code might seem to function correctly on your system, but if it does then that is by coincidence. I have seen code like this fail in strange ways on other systems. Fix it like this:
char a[100], b[100];
//Some code that puts a string into a
a[x] = '\0'; // <--- Null terminator is required for a to contain a "string".
// Otherwise, you can't pass a to strlen.
for(i = 0; i<strlen(a); i++)
{
if(a[i] == 'C')
b[i] = 'b';
else if(a[i] == 'O')
b[i] = 'a';
else if(a[i] == 'D')
b[i] = '1';
else
b[i] = a[i];
}
b[i] = '\0'; // If you don't put a null character at the end, it isn't a string.
Now, what is a string?

There are many possible ways to do as you wish. There are ways that avoid using strcat() and sprintf() altogether — see below; you can avoid sprintf() while continuing to use strcat().
The way I'd probably do it would keep a record of where the next character is to be added to the target string, b. This will be more efficient since repeatedly using strcat() involves quadratic behaviour as you build up a string one character at a time. Also, it is generally best to avoid using strlen() in the loop condition for the same reason; it is (probably) evaluated on each iteration, so that it too leads to quadratic behaviour.
char a[100], b[100];
char *bp = b;
//Some code that puts a string into a
size_t len = strlen(a);
for (int i = 0; i < len; i++, bp++)
{
if (a[i] == 'C')
*bp = 'b';
else if(a[i] == 'O')
*bp = 'a';
else if(a[i] == 'D')
*bp = '1';
else
*bp = a[i];
}
*bp = '\0'; // Null-terminate the string
You could also do without the pointer by using the index variable i to assign to b (as long as you only add one character to the output for each input character):
char a[100], b[100];
//Some code that puts a string into a
size_t len = strlen(a);
size_t i; // declared outside the loop because it is used after it
for (i = 0; i < len; i++)
{
if (a[i] == 'C')
b[i] = 'b';
else if(a[i] == 'O')
b[i] = 'a';
else if(a[i] == 'D')
b[i] = '1';
else
b[i] = a[i];
}
b[i] = '\0'; // Null-terminate the string
As long as the string in a is short enough to fit, the code shown (either version) cannot overflow b. If you sometimes added several characters to b, you'd either need two indexes (i and j), or you could increment the pointer bp in the first version more than once per loop, and you'd need to ensure that you don't overflow the bounds of b.
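The two-index variant with a bounds check might look like the sketch below. The rule that 'D' expands to the two characters "10" is made up here purely to show an input character producing more than one output character; expand and bsize are likewise hypothetical names.

```cpp
#include <cstddef>

// Hypothetical helper: i walks the input, j walks the output.
// Returns false (leaving a truncated but terminated string in b)
// when the result does not fit into bsize bytes.
bool expand(const char* a, char* b, std::size_t bsize)
{
    if (bsize == 0)
        return false;
    std::size_t j = 0;
    for (std::size_t i = 0; a[i] != '\0'; i++) {
        char single[2] = { a[i], '\0' };
        const char* piece = single;        // default: copy the character unchanged
        if (a[i] == 'C')      piece = "b";
        else if (a[i] == 'O') piece = "a";
        else if (a[i] == 'D') piece = "10"; // made-up two-character replacement
        for (std::size_t k = 0; piece[k] != '\0'; k++) {
            if (j + 1 >= bsize) {           // keep one byte for the terminator
                b[j] = '\0';
                return false;
            }
            b[j++] = piece[k];
        }
    }
    b[j] = '\0';  // null-terminate the string
    return true;
}
```

Here j can advance by more than one per input character, which is why it has to be tracked and checked separately from i.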

Comparing chars in a character array with strcmp

I have read an xml file into a char [] and am trying to compare each element in that array with certain chars, such as "<" and ">". The char array "test" is just an array of one element and contains the character to be compared (I had to do it like this or the strcmp method would give me an error about converting char to const char*). However, something is wrong and I cannot figure it out. Here is what I am getting:
< is being compared to: < strcmp value: 44
Any idea what is happening?
char test[1];
for (int i=0; i<amountRead; ++i)
{
test[0] = str[i];
if( strcmp(test, "<") == 0)
cout<<"They are equal"<<endl;
else
{
cout<<test[0]<< " is being compare to: "<<str[i]<<" strcmp value= "<<strcmp(test, "<") <<endl;
}
}

strcmp() expects both of its parameters to be null terminated strings, not simple characters. If you want to compare characters for equality, you don't need to call a function, just compare the characters:
if (test[0] == '<') ...

You need to 0-terminate your test string.
char test[2];
for (int i=0; i<amountRead; ++i)
{
test[0] = str[i];
test[1] = '\0'; //you could do this before the loop instead.
...
But if you always intend to compare one character at a time, then the temp buffer isn't necessary at all. You could do this instead
for (int i=0; i<amountRead; ++i)
{
if (str[i] == '<')
cout<<"They are equal"<<endl;
else
{
cout << str[i] << " is being compare to: <" << endl;
}
}

strcmp wants both strings to be 0 terminated.
When you have non-0 terminated strings, use strncmp:
if( strncmp(test, "<", 1) == 0 )
It is up to you to make sure that both strings are at least N characters long (where N is the value of the 3rd parameter). strncmp is a good function to have in your mental toolkit.
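If the goal is simply to find the angle brackets in the buffer, the direct character comparison suggested in the other answers needs neither a temporary buffer nor a null terminator. Here is a minimal sketch, assuming str and amountRead have the meaning they have in the question; countChar is a hypothetical helper name.

```cpp
#include <cstddef>

// Hypothetical helper: counts how often `wanted` occurs in the first
// `amountRead` bytes of `str`. The buffer need not be null-terminated,
// because the loop is bounded by amountRead.
std::size_t countChar(const char* str, std::size_t amountRead, char wanted)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < amountRead; ++i) {
        if (str[i] == wanted)  // char compared with char: no strcmp involved
            ++count;
    }
    return count;
}
```

For example, countChar(str, amountRead, '<') gives the number of opening brackets that were read.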
