I was solving a question online on strings where we had to perform run-length encoding on a given string, I wrote this function to achieve the answer
using namespace std;
string runLengthEncoding(string str) {
vector <char> encString;
int runLength = 1;
for(int i = 1; i < str.length(); i++)
{
if(str[i - 1] != str[i] || runLength == 9)
{
encString.push_back(to_string(runLength)[0]);
encString.push_back(str[i - 1]);
runLength = 0;
}
runLength++;
}
encString.push_back(to_string(runLength)[0]);
encString.push_back(str[str.size() - 1]);
string encodedString(encString.begin(), encString.end());
return encodedString;
}
Here I was getting a very long error on this particular line in the for loop and outside it when I wrote:
encString.push_back(to_string(runLength));
which I later found out should be:
encString.push_back(to_string(runLength)[0]);
instead
I don't quite understand why I have to insert it as a 2D element(I don't know if that is the right way to say it, forgive me I am a beginner in this) when I am just trying to insert the integer...
In stupid terms - why do I gotta add [0] in this?
std::to_string() returns a std::string. That's what it does, if you check your C++ textbook for a description of this C++ library function that's what you will read there.
encString.push_back( /* something */ )
Because encString is a std::vector<char>, it logically follows that the only thing can be push_back() into it is a char. Just a single char. C++ does not allow you to pass an entire std::string to a function that takes a single char parameter. C++ does not work this way, C++ allows only certain, specific conversions betweens different types, and this isn't one of them.
And that's why encString.push_back(to_string(runLength)); does not work. The [0] operator returns the first char from the returned std::string. What a lucky coincidence! You get a char from that, the push_back() expects a single char value, and everyone lives happily ever after.
Also, it is important to note that you do not, do not "gotta add [0]". You could use [1], if you have to add the 2nd character from the string, or any other character from the string, in the same manner. This explains the compilation error. Whether [0] is the right solution, or not, is something that you'll need to figure out separately. You wanted to know why this does not compile without the [0], and that's the answer: to_string() returns a std::string put you must push_back() a single char value, and using [0] makes it happen. Whether it's the right char, or not, that's a completely different question.
Related
I'm trying to get the length of a character array in a second function. I've looked at a few questions on here (1 2) but they don't answer my particular question (although I'm sure something does, I just can't find it). My code is below, but I get the error "invalid conversion from 'char' to 'const char*'". I don't know how to convert my array to what is needed.
#include <cstring>
#include <iostream>
int ValidInput(char, char);
int main() {
char user_input; // user input character
char character_array[26];
int valid_guess;
valid_guess = ValidGuess(user_input, character_array);
// another function to do stuff with valid_guess output
return 0;
}
int ValidGuess (char user_guess, char previous_guesses) {
for (int index = 0; index < strlen(previous_guesses); index++) {
if (user_guess == previous_guesses[index]) {
return 0; // invalid guess
}
}
return 1; // valid guess, reaches this if for loop is complete
}
Based on what I've done so far, I feel like I'm going to have a problem with previous_guesses[index] as well.
char user_input;
defines a single character
char character_array[26];
defines an array of 26 characters.
valid_guess = ValidGuess(user_input, character_array);
calls the function
int ValidGuess (char user_guess, char previous_guesses)
where char user_guess accepts a single character, lining up correctly with the user_input argument, and char previous_guesses accepts a single character, not the 26 characters of character_array. previous_guesses needs a different type to accommodate character_array. This be the cause of the reported error.
Where this gets tricky is character_array will decay to a pointer, so
int ValidGuess (char user_guess, char previous_guesses)
could be changed to
int ValidGuess (char user_guess, char * previous_guesses)
or
int ValidGuess (char user_guess, char previous_guesses[])
both ultimately mean the same thing.
Now for where things get REALLY tricky. When an array decays to a pointer it loses how big it is. The asker has gotten around this problem, kudos, with strlen which computes the length, but this needs a bit of extra help. strlen zips through an array, counting until it finds a null terminator, and there are no signs of character_array being null terminated. This is bad. Without knowing where to stop strlen will probably keep going1. A quick solution to this is go back up to the definition of character_array and change it to
char character_array[26] = {};
to force all of the slots in the array to 0, which just happens to be the null character.
That gets the program back on its feet, but it could be better. Every call to strlen may recount (compilers are smart and could compute once per loop and store the value if it can prove the contents won't change) the characters in the string, but this is still at least one scan through every entry in character_array to see if it's null when what you really want to do is scan for user_input. Basically the program looks at every item in the array twice.
Instead, look for both the null terminator and user_input in the same loop.
int index = 0;
while (previous_guesses[index] != '\0' ) {
if (user_guess == previous_guesses[index]) {
return 0; // prefer returning false here. The intent is clearer
}
index++;
}
You can also wow your friends by using pointers and eliminating the need for the index variable.
while (*previous_guesses != '\0' ) {
if (user_guess == *previous_guesses) {
return false;
}
previous_guesses++;
}
The compiler knows and uses this trick too, so use the one that's easier for you to understand.
For 26 entries it probably doesn't matter, but if you really want to get fancy, or have a lot more than 26 possibilities, use a std::set or a std::unordered_set. They allow only one of an item and have much faster look-up than scanning a list one by one, so long as the list is large enough to get over the added complexity of a set and take advantage of its smarter logic. ValidGuess is replaced with something like
if (used.find(user_input) != used.end())
Side note: Don't forget to make the user read a value into user_input before the program uses it. I've also left out how to store the previous inputs because the question does as well.
1 I say probably because the Standard doesn't say what to do. This is called Undefined Behaviour. C++ is littered with the stuff. Undefined Behaviour can do anything -- work, not work, visibly not work, look like it works until it doesn't, melt your computer, anything -- but what it usually does is the easiest and fastest thing. In this case that's just keep going until the program crashes or finds a null.
I want to make a function that removes all the characters of ch in a c-string.
But I keep getting an access violation error.
Unhandled exception at 0x000f17ba in testassignments.exe: 0xC0000005: Access violation writing location 0x000f787e.
void removeAll(char* &s, const char ch)
{
int len=strlen(s);
int i,j;
for(i = 0; i < len; i++)
{
if(s[i] == ch)
{
for(j = i; j < len; j++)
{
s[j] = s[j + 1];
}
len--;
i--;
}
}
return;
}
I expected the c-string to not contain the character "ch", but instead, I get an access violation error.
In the debug I got the error on the line:
s[j] = s[j + 1];
I tried to modify the function but I keep getting this error.
Edit--
Sample inputs:
s="abmas$sachus#settes";
ch='e' Output->abmas$sachus#settes, becomes abmas$sachus#stts
ch='t' Output-> abmas$sachus#stts, becomes abmas$sachus#ss.
Instead of producing those outputs, I get the access violation error.
Edit 2:
If its any help, I am using Microsoft Visual C++ 2010 Express.
Apart from the inefficiency of your function shifting the entire remainder of the string whenever encountering a single character to remove, there's actually not much wrong with it.
In the comments, people have assumed that you are reading off the end of the string with s[j+1], but that is untrue. They are forgetting that s[len] is completely valid because that is the string's null-terminator character.
So I'm using my crystal ball now, and I believe that the error is because you're actually running this on a string literal.
// This is NOT okay!
char* str = "abmas$sachus#settes";
removeAll(str, 'e');
This code above is (sort of) not legal. The string literal "abmas$sachus#settes" should not be stored as a non-const char*. But for backward compatibility with C where this is allowed (provided you don't attempt to modify the string) this is generally issued as a compiler warning instead of an error.
However, you are really not allowed to modify the string. And your program is crashing the moment you try.
If you were to use the correct approach with a char array (which you can modify), then you have a different problem:
// This will result in a compiler error
char str[] = "abmas$sachus#settes";
removeAll(str, 'e');
Results in
error: invalid initialization of non-const reference of type ‘char*&’ from an rvalue of type ‘char*’
So why is that? Well, your function takes a char*& type that forces the caller to use pointers. It's making a contract that states "I can modify your pointer if I want to", even if it never does.
There are two ways you can fix that error:
The TERRIBLE PLEASE DON'T DO THIS way:
// This compiles and works but it's not cool!
char str[] = "abmas$sachus#settes";
char *pstr = str;
removeAll(pstr, 'e');
The reason I say this is bad is because it sets a dangerous precedent. If the function actually did modify the pointer in a future "optimization", then you might break some code without realizing it.
Imagine that you want to output the string with characters removed later, but the first character was removed and you function decided to modify the pointer to start at the second character instead. Now if you output str, you'll get a different result from using pstr.
And this example is only assuming that you're storing the string in an array. Imagine if you actually allocated a pointer like this:
char *str = new char[strlen("abmas$sachus#settes") + 1];
strcpy(str, "abmas$sachus#settes");
removeAll(str, 'e');
Then if removeAll changes the pointer, you're going to have a BAD time when you later clean up this memory with:
delete[] str; //<-- BOOM!!!
The I ACKNOWLEDGE MY FUNCTION DEFINITION IS BROKEN way:
Real simply, your function definition should take a pointer, not a pointer reference:
void removeAll(char* s, const char ch)
This means you can call it on any modifiable block of memory, including an array. And you can be comforted by the fact that the caller's pointer will never be modified.
Now, the following will work:
// This is now 100% legit!
char str[] = "abmas$sachus#settes";
removeAll(str, 'e');
Now that my free crystal-ball reading is complete, and your problem has gone away, let's address the elephant in the room:
Your code is needlessly inefficient!
You do not need to do the first pass over the string (with strlen) to calculate its length
The inner loop effectively gives your algorithm a worst-case time complexity of O(N^2).
The little tricks modifying len and, worse than that, the loop variable i make your code more complex to read.
What if you could avoid all of these undesirable things!? Well, you can!
Think about what you're doing when removing characters. Essentially, the moment you have removed one character, then you need to start shuffling future characters to the left. But you do not need to shuffle one at a time. If, after some more characters you encounter a second character to remove, then you simply shunt future characters further to the left.
What I'm trying to say is that each character only needs to move once at most.
There is already an answer demonstrating this using pointers, but it comes with no explanation and you are also a beginner, so let's use indices because you understand those.
The first thing to do is get rid of strlen. Remember, your string is null-terminated. All strlen does is search through characters until it finds the null byte (otherwise known as 0 or '\0')...
[Note that real implementations of strlen are super smart (i.e. much more efficient than searching single characters at a time)... but of course, no call to strlen is faster]
All you need is your loop to look for the NULL terminator, like this:
for(i = 0; s[i] != '\0'; i++)
Okay, and now to ditch the inner loop, you just need to know where to stick each new character. How about just keeping a variable new_size in which you are going to count up how long the final string is.
void removeAll(char* s, char ch)
{
int new_size = 0;
for(int i = 0; s[i] != '\0'; i++)
{
if(s[i] != ch)
{
s[new_size] = s[i];
new_size++;
}
}
// You must also null-terminate the string
s[new_size] = '\0';
}
If you look at this for a while, you may notice that it might do pointless "copies". That is, if i == new_size there is no point in copying characters. So, you can add that test if you want. I will say that it's likely to make little performance difference, and potentially reduce performance because of additional branching.
But I'll leave that as an exercise. And if you want to dream about really fast code and just how crazy it gets, then go and look at the source code for strlen in glibc. Prepare to have your mind blown.
You can make the logic simpler and more efficient by writing the function like this:
void removeAll(char * s, const char charToRemove)
{
const char * readPtr = s;
char * writePtr = s;
while (*readPtr) {
if (*readPtr != charToRemove) {
*writePtr++ = *readPtr;
}
readPtr++;
}
*writePtr = '\0';
}
I've just been introduced to toupper, and I'm a little confused by the syntax; it seems like it's repeating itself. What I've been using it for is for every character of a string, it converts the character into an uppercase character if possible.
for (int i = 0; i < string.length(); i++)
{
if (isalpha(string[i]))
{
if (islower(string[i]))
{
string[i] = toupper(string[i]);
}
}
}
Why do you have to list string[i] twice? Shouldn't this work?
toupper(string[i]); (I tried it, so I know it doesn't.)
toupper is a function that takes its argument by value. It could have been defined to take a reference to character and modify it in-place, but that would have made it more awkward to write code that just examines the upper-case variant of a character, as in this example:
// compare chars case-insensitively without modifying anything
if (std::toupper(*s1++) == std::toupper(*s2++))
...
In other words, toupper(c) doesn't change c for the same reasons that sin(x) doesn't change x.
To avoid repeating expressions like string[i] on the left and right side of the assignment, take a reference to a character and use it to read and write to the string:
for (size_t i = 0; i < string.length(); i++) {
char& c = string[i]; // reference to character inside string
c = std::toupper(c);
}
Using range-based for, the above can be written more briefly (and executed more efficiently) as:
for (auto& c: string)
c = std::toupper(c);
As from the documentation, the character is passed by value.
Because of that, the answer is no, it shouldn't.
The prototype of toupper is:
int toupper( int ch );
As you can see, the character is passed by value, transformed and returned by value.
If you don't assign the returned value to a variable, it will be definitely lost.
That's why in your example it is reassigned so that to replace the original one.
As many of the other answers already say, the argument to std::toupper is passed and the result returned by-value which makes sense because otherwise, you wouldn't be able to call, say std::toupper('a'). You cannot modify the literal 'a' in-place. It is also likely that you have your input in a read-only buffer and want to store the uppercase-output in another buffer. So the by-value approach is much more flexible.
What is redundant, on the other hand, is your checking for isalpha and islower. If the character is not a lower-case alphabetic character, toupper will leave it alone anyway so the logic reduces to this.
#include <cctype>
#include <iostream>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
for (auto s = text; *s != '\0'; ++s)
*s = std::toupper(*s);
std::cout << text << '\n';
}
You could further eliminate the raw loop by using an algorithm, if you find this prettier.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <utility>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
std::transform(std::cbegin(text), std::cend(text), std::begin(text),
[](auto c){ return std::toupper(c); });
std::cout << text << '\n';
}
toupper takes an int by value and returns the int value of the char of that uppercase character. Every time a function doesn't take a pointer or reference as a parameter the parameter will be passed by value which means that there is no possible way to see the changes from outside the function because the parameter will actually be a copy of the variable passed to the function, the way you catch the changes is by saving what the function returns. In this case, the character upper-cased.
Note that there is a nasty gotcha in isalpha(), which is the following: the function only works correctly for inputs in the range 0-255 + EOF.
So what, you think.
Well, if your char type happens to be signed, and you pass a value greater than 127, this is considered a negative value, and thus the int passed to isalpha will also be negative (and thus outside the range of 0-255 + EOF).
In Visual Studio, this will crash your application. I have complained about this to Microsoft, on the grounds that a character classification function that is not safe for all inputs is basically pointless, but received an answer stating that this was entirely standards conforming and I should just write better code. Ok, fair enough, but nowhere else in the standard does anyone care about whether char is signed or unsigned. Only in the isxxx functions does it serve as a landmine that could easily make it through testing without anyone noticing.
The following code crashes Visual Studio 2015 (and, as far as I know, all earlier versions):
int x = toupper ('é');
So not only is the isalpha() in your code redundant, it is in fact actively harmful, as it will cause any strings that contain characters with values greater than 127 to crash your application.
See http://en.cppreference.com/w/cpp/string/byte/isalpha: "The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF."
I have a bit of code that that uses memcpy and I am trying to convert it over to using std::copy. I am getting stuck on the syntax but this is code currently works for memcpy
length=sizeof(int)+title.size()*80;
char *otitle=new char [length];
unsigned int *ntitle=reinterpret_cast<unsigned int *>(otitle);
*ntitle=title.size();
for (i=0; i< title.size(); i++){
memcpy(otitle+sizeof(int)+i*80, getTitle(i).c_str(), 80);
//std::copy(getTitle(i).begin(), getTitle(i).end() , otitle);
}
doSomething(otitle, length);
delete otitle;
The line that is commented out isn't correct and so I would appreciate any help. Essentially, "title" is an std::vector and the function "getTitle()" returns the string in the ith element of "title. otitle is a long character array and I want to be able to copy over each element of "title" into otitle (see memcpy).
Since getTitle returns a string by value, when you call it twice, you get two different strings. You need to make sure you are calling begin and end on the same string though. So you can store the result in another string variable.
std::string str = getTitle(i);
std::copy(str.begin(), str.end(), otitle + sizeof(int) + i*80);
If your original version with memcpy handles copying over a null-terminator (can't tell from your snippet), you'll need to handle that manually. There are other details that might be wrong here, because I don't know the exact details of what your code is supposed to be doing, but this at least gives a correct usage of std::copy.
I wrote the following code:
char *pch=new char[12];
char *f=new char[42];
char *lab=new char[20];
char *mne=new char[10];
char *add=new char[10];
If initially I want these arrays to be null, can't I do this:
*lab="\0";
*mne="\0";
and so on.....
And after that if I want to add some cstring to an empty array can't I check:
if(strcmp(lab,"\0")==0)
//then add cstring by *lab="cstring";
And if I can't do any of these things, please tell me the right way to do it...
In C++11, an easy way to initialize arrays is by using brace-initializers:
char * p = new char[100] { 0 };
The reasoning here is that all the missing array elements will be zero-initialized. You can also use explicit value-initialization (I think that's even allowed in C++98/03), which is zero-initalization for the primitive types:
char * q = new char[110]();
First of all, as DeadMG says, the correct way of doing this is using std:string:
std::string lab; // empty initially, no further initialization needed
if (lab.size() == 0) // string empty, note, very fast, no character comparison
lab += "cstring"; // or even lab = "cstring", as lab is empty
Also, in your code, if you insist in using C strings, after the initialization, the correct checking for the empty string would be
if (*lab == '\0')
First of all, I agree with everybody else to use a std::string instead of character arrays the vast majority of the time. Link for help is here: C++ Strings Library
Now to directly answer your question as well:
*lab="\0";
*mne="\0";
and so on.....
This is wrong. Assuming your compiler doesn't give you an error, you're not assigning the "null terminator" to those arrays, you're trying to assign the pointer value of where the "\0" string is to the first few memory locations where the char* is pointing to! Remember, your variables are pointers, not strings. If you're trying to just put a null-character at the beginning, so that strlen or other C-string functions see an "empty" string, do this: *lab='\0'; The difference is that with single-ticks, it denotes the character \0 whereas with double, it's a string literal, which returns a pointer to the first element. I hope that made sense.
Now for your second, again, you can't just "assign" like that to C-style strings. You need to put each character into the array and terminate it correctly. Usually the easiest way is with sprintf:
sprintf(lab, "%s", "mystring");
This may not make much sense, especially as I'm not dereferencing the pointer, but I'll walk you through it. The first argument says to sprintf "output your characters to where this pointer is pointing." So it needs the raw pointer. The second is a format string, like printf uses. So I'm telling it to use the first argument as a string. And the 3rd is what I want in there, a pointer to another string. This example would also work with sprintf(lab, "mystring") as well.
If you want to get into C-style string processing, you need to read some examples. I'm afraid I don't even know where to look on the 'net for good examples of that, but I wish you good luck. I'd highly recommend that you check out the C++ strings library though, and the basic_string<> type there. That's typedef'd to just std::string, which is what you should use.