fastest way to read the last line of a string? - c++

I'd like to know the fastest way for reading the last line in a std::string object.
Technically, the string after the last occurrence of \n in the fastest possible way?

This can be done using just string::find_last_of and string::substr like so
std::string get_last_line(const std::string &str)
{
auto position = str.find_last_of('\n');
if (position == std::string::npos)
return str;
else
return str.substr(position + 1);
}
see: example

I would probably use std::string::rfind and std::string::substr combined with guaranteed std::string::npos wrap around to be succinct:
inline std::string last_line_of(std::string const& s)
{
return s.substr(s.rfind('\n') + 1);
}
If s.rfind('\n') doesn't find anything it returns std::string::npos. The C++ standard says std::string::npos + 1 == 0. And returning s.substr(0) is always safe.
If s.rfind('\n') does find something then you want the substring starting from the next character. Again returning s.substr(s.size()) is safe according to the standard.
NOTE: In C++17 this method will benefit from guaranteed return value optimization so it should be super efficient.

I thought of a way that reads the string inversely (backwards) while storing what it reads
std::string get_last_line(const std::string &str)
{
size_t l = str.length();
std::string last_line_reversed, last_line;
for (--l; l > 0; --l)
{
char c = str.at(l);
if (c == '\n')
break;
last_line_reversed += c;
}
l = last_line_reversed.length();
size_t i = 0, y = l;
for (; i < l; ++i)
last_line += last_line_reversed[--y];
return last_line;
}
until it counters a '\n' character then reverse the stored string back and return it. If the target string is big and has a lot of new lines, this function would be very efficient.

Related

string::replace not working correctly 100% of the time?

I'm trying to replace every space character with '%20' in a string, and I'm thinking of using the built in replace function for the string class.
Currently, I have:
void replaceSpace(string& s)
{
int len = s.length();
string str = "%20";
for(int i = 0; i < len; i++) {
if(s[i] == ' ') {
s.replace(i, 1, str);
}
}
}
When I pass in the string "_a_b_c_e_f_g__", where the underscores represent space, my output is "%20a%20b%20c%20e_f_g__". Again, underscores represent space.
Why is that the spaces near the beginning of the string are replaced, but the spaces towards the end aren't?
You are making s longer with each replacement, but you are not updating len which is used in the loop condition.
Modifying the string that you are just scanning is like cutting the branch under your feet. It may work if you are careful, but in this case you aren't.
Namely, you take the string len at the beginning but with each replacement your string gets longer and you are pushing the replacement places further away (so you never reach all of them).
The correct way to cut this branch is from its end (tip) towards the trunk - this way you always have a safe footing:
void replaceSpace(string& s)
{
int len = s.length();
string str = "%20";
for(int i = len - 1; i >= 0; i--) {
if(s[i] == ' ') {
s.replace(i, 1, str);
}
}
}
You're growing the string but only looping to its initial size.
Looping over a collection while modifying it is very prone to error.
Here's a solution that doesn't:
void replace(string& s)
{
string s1;
std::for_each(s.begin(),
s.end(),
[&](char c) {
if (c == ' ') s1 += "%20";
else s1 += c;
});
s.swap(s1);
}
As others have already mentioned, the problem is you're using the initial string length in your loop, but the string gets bigger along the way. Your loop never reaches the end of the string.
You have a number of ways to fix this. You can correct your solution and make sure you go to the end of the string as it is now, not as it was before you started looping.
Or you can use #molbdnilo 's way, which creates a copy of the string along the way.
Or you can use something like this:
std::string input = " a b c e f g ";
std::string::size_type pos = 0;
while ((pos = input.find(' ', pos)) != std::string::npos)
{
input.replace(pos, 1, "%20");
}
Here's a function that can make it easier for you:
string replace_char_str(string str, string find_str, string replace_str)
{
size_t pos = 0;
for ( pos = str.find(find_str); pos != std::string::npos; pos = str.find(find_str,pos) )
{
str.replace(pos ,1, replace_str);
}
return str;
}
So if when you want to replace the spaces, try it like this:
string new_str = replace_char_str(yourstring, " ", "%20");
Hope this helps you ! :)

C++ - Attempting to use string functions to reverse an input string

As part of a homework assignment I need to be able to take an input string and manipulate it several ways using a list of string functions. The first function takes a string and reverses it using a for loop. This is what I have:
#include <iostream>
#include <string>
namespace hw06
{
typedef std::string::size_type size_type;
//reverse function
std::string reverse( const std::string str );
}
// Program execution begins here.
int main()
{
std::string inputStr;
std::cout << "Enter a string: ";
std::getline( std::cin, inputStr );
std::cout << "Reversed: " << hw06::reverse( inputStr )
<< std::endl;
return 0;
}
//reverse function definition
std::string hw06::reverse( const std::string str )
{
std::string reverseStr = "";
//i starts as the last digit in the input. It outputs its current
//character to the return value "tempStr", then goes down the line
//adding whatever character it finds until it reaches position 0
for( size_type i = (str.size() - 1); (i >= 0); --i ){
reverseStr += str.at( i );
}
return reverseStr;
}
The program asks for input, then returns this error:
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::tat
I'm really at a loss as to what I'm doing wrong here. The loop seems correct to me, so am I misunderstanding how to reference the function?
Unless you really want to write a loop, it's probably easier to just do something like:
std::string reverse(std::string const &input) {
return std::string(input.rbegin(), input.rend());
}
The problem is that your loop never terminates. You have as your condition i >= 0, but size_type is unsigned, so 0 - 1 == 2^(sizeof(size_t) * 8) - 1, which is certainly out of the range of your string. Therefore, you need to pick something else as your termination condition. One option is you can use i != std::string::npos, but that feels wrong. You're probably better off with something like:
for (size_type i = str.size(); i != 0; ) {
reverseStr += str.at(--i);
}
EDIT: I did some checking on i != std::string::npos. It should be well-defined and OK. However, it still seems like the Wrong Way To Do It.
As Andreas Grapentin said, the problem is that std::string::size() returns a size_t which is required by the standard to be an unsigned type. So it will always be >= 0 and when you hit 0 and decrement it, you will go to some really large, positive number.
Consider something like this:
std::string hw06::reverse(const std::string &str)
{
std::string reverseStr;
for(size_t i = str.size(); i != 0; i--)
reverseStr += str.at(i - 1);
return reverseStr;
}
I'm not keen on answering homework questions, but seeing some of the answers, I couldn't resist this:
std::string hw06::reverse(const std::string &str)
{ return std::string(str.rbegin(), str.rend()); }
Simple, clean and least wasteful if you can't do it in-place.
As other answers say, the problem is in the loop. I'll suggest using the following "goes to" operator :)
for(size_t i = str.size(); i --> 0;)
{
}
use i-- and not --i. Or u will decrease i value before getting the char and get loop problems.

creating a string split function in C++ [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Splitting a string in C++
Im trying to create a function that mimics the behavior of the getline() function, with the option to use a delimiter to split the string into tokens.
The function accepts 2 strings (the second is being passed by reference) and a char type for the delimiter. It loops through each character of the first string, copying it to the second string and stops looping when it reaches the delimiter. It returns true if the first string have more characters after the delimiter and false otherwise. The position of the last character is being saved in a static variable.
for some reason the the program is going into an infinite loop and is not executing anything:
const int LINE_SIZE = 160;
bool strSplit(string sFirst, string & sLast, char cDelim) {
static int iCount = 0;
for(int i = iCount; i < LINE_SIZE; i++) {
if(sFirst[i] != cDelim)
sLast[i-iCount] = sFirst[i];
else {
iCount = i+1;
return true;
}
}
return false;
}
The function is used in the following way:
while(strSplit(sLine, sToken, '|')) {
cout << sToken << endl;
}
Why is it going into an infinite loop, and why is it not working?
I should add that i'm interested in a solution without using istringstream, if that's possible.
It is not exactly what you asked for, but have you considered std::istringstream and std::getline?
// UNTESTED
std::istringstream iss(sLine);
while(std::getline(iss, sToken, '|')) {
std::cout << sToken << "\n";
}
EDIT:
Why is it going into an infinite loop, and why is it not working?
We can't know, you didn't provide enough information. Try to create an SSCCE and post that.
I can tell you that the following line is very suspicious:
sLast[i-iCount] = sFirst[i];
This line will result in undefined behavior (including, perhaps, what you have seen) in any of the following conditions:
i >= sFirst.size()
i-iCount >= sLast.size()
i-iCount < 0
It appears to me likely that all of those conditions are true. If the passed-in string is, for example, shorter than 160 lines, or if iCount ever grows to be bigger than the offset of the first delimiter, then you'll get undefined behavior.
LINE_SIZE is probably larger than the number of characters in the string object, so the code runs off the end of the string's storage, and pretty much anything can happen.
Instead of rolling your own, string::find does what you need.
std::string::size_type pos = 0;
std::string::size_type new_pos = sFirst.find('|', pos);
The call to find finds the first occurrence of '|' that's at or after the position 'pos'. If it succeeds, it returns the index of the '|' that it found. If it fails, it returns std::string::npos. Use it in a loop, and after each match, copy the text from [pos, new_pos) into the target string, and update pos to new_pos + 1.
are you sure it's the strSplit() function that doesn't return or is it your caller while loop that's infinite?
Shouldn't your caller loop be something like:
while(strSplit(sLine, sToken, '|')) {
cout << sToken << endl;
cin >> sLine >> endl;
}
-- edit --
if value of sLine is such that it makes strSplit() to return true then the while loop becomes infinite.. so do something to change the value of sLine for each iteration of the loop.. e.g. put in a cin..
Check this out
std::vector<std::string> spliString(const std::string &str,
const std::string &separator)
{
vector<string> ret;
string::size_type strLen = str.length();
char *buff;
char *pch;
buff = new char[strLen + 1];
buff[strLen] = '\0';
std::copy(str.begin(), str.end(), buff);
pch = strtok(buff, separator.c_str());
while(pch != NULL)
{
ret.push_back(string(pch));
pch = strtok(NULL, separator.c_str());
}
delete[] buff;
return ret;
}

Loop quitting for no reason

I have a question regarding C++. This is my current function:
string clarifyWord(string str) {
//Remove all spaces before string
unsigned long i = 0;
int currentASCII = 0;
while (i < str.length()) {
currentASCII = int(str[i]);
if (currentASCII == 32) {
str.erase(i);
i++;
continue;
} else {
break;
}
}
//Remove all spaces after string
i = str.length();
while (i > -1) {
currentASCII = int(str[i]);
if (currentASCII == 32) {
str.erase(i);
i--;
continue;
} else {
break;
}
}
return str;
}
Just to get the basic and obvious things out of the way, I have #include <string> and using namespace std; so I do have access to the string functions.
The thing is though that the loop is quitting and sometimes skipping the second loop. I am passing in the str to be " Cheese " and it should remove all the spaces before the string and after the string.
In the main function, I am also assigning a variable to clarifyWord(str) where str is above. It doesn't seem to print that out either using cout << str;.
Is there something I am missing with printing out strings or looping with strings? Also ASCII code 32 is Space.
Okay so the erase function you are calling looks like this:
string& erase ( size_t pos = 0, size_t n = npos );
The n parameter is the number of items to delete. The npos means, delete everything up until the end of the string, so set the second parameter to 1.
str.erase(i,1)
[EDIT]
You could change the first loop to this:
while (str.length() > 0 && str[0] == ' ')
{
str.erase(0,1);
}
and the second loop to this:
while (str.length() > 0 && str[str.length() - 1] == ' ')
{
str.erase(str.length() - 1, 1);
}
In your second loop, you can't initialize i to str.length().
str[str.length()] is going to be after the end of your string, and so is unlikely to be a space (thus triggering the break out of the second loop).
You're using erase (modifying the string) while you're in a loop checking its size. This is a dangerous way of processing the string. As you return a new string, I would recommend you first to search for the first occurrence in the string of the non-space character, and then the last one, and then returning a substring. Something along the lines of (not tested):
size_t init = str.find_first_not_of(' ');
if (init == std::string::npos)
return "";
size_t fini = std.find_last_not_of(' ');
return str.substr(init, fini - init + 1);
You see, no loops, erases, etc.
unsigned long i ... while (i > -1) Well, that's not right, is it? How would you expect that to work? The compiler will in fact convert both operands to the same type: while (i > static_cast<unsigned long>(-1)). And that's just another way to write ULONG-MAX, i.e. while (i > ULONG_MAX). In other words, while(false).
You're using erase incorrectly. It'll erase from pos to npos.
i.e. string& erase ( size_t pos = 0, size_t n = npos );
See: http://www.cplusplus.com/reference/string/string/erase/
A better way to do this is to note the position of the first non space and where the spaces occur at the end of the string. Then use either substr or erase twice.
You also don't need to go to the trouble of doing this:
currentASCII = int(str[i]);
if (currentASCII == 32) {
Instead do this:
if (str[i] == ' ') {
Which I think you'll agree is a lot easier to read.
So, you can shorten it somewhat with something like: (not tested but it shouldn't be far
off)
string clarifyWord(string str) {
int start = 0, end = str.length();
while (str[start++] == ' ');
while (str[end--] == ' ');
return str.substr(start, end);
}

C++ Remove new line from multiline string

Whats the most efficient way of removing a 'newline' from a std::string?
#include <algorithm>
#include <string>
std::string str;
str.erase(std::remove(str.begin(), str.end(), '\n'), str.cend());
The behavior of std::remove may not quite be what you'd expect.
A call to remove is typically followed by a call to a container's erase method, which erases the unspecified values and reduces the physical size of the container to match its new logical size.
See an explanation of it here.
If the newline is expected to be at the end of the string, then:
if (!s.empty() && s[s.length()-1] == '\n') {
s.erase(s.length()-1);
}
If the string can contain many newlines anywhere in the string:
std::string::size_type i = 0;
while (i < s.length()) {
i = s.find('\n', i);
if (i == std::string:npos) {
break;
}
s.erase(i);
}
You should use the erase-remove idiom, looking for '\n'. This will work for any standard sequence container; not just string.
Here is one for DOS or Unix new line:
void chomp( string &s)
{
int pos;
if((pos=s.find('\n')) != string::npos)
s.erase(pos);
}
Slight modification on edW's solution to remove all exisiting endline chars
void chomp(string &s){
size_t pos;
while (((pos=s.find('\n')) != string::npos))
s.erase(pos,1);
}
Note that size_t is typed for pos, it is because npos is defined differently for different types, for example, -1 (unsigned int) and -1 (unsigned float) are not the same, due to the fact the max size of each type are different. Therefore, comparing int to size_t might return false even if their values are both -1.
s.erase(std::remove(s.begin(), s.end(), '\n'), s.end());
The code removes all newlines from the string str.
O(N) implementation best served without comments on SO and with comments in production.
unsigned shift=0;
for (unsigned i=0; i<length(str); ++i){
if (str[i] == '\n') {
++shift;
}else{
str[i-shift] = str[i];
}
}
str.resize(str.length() - shift);
std::string some_str = SOME_VAL;
if ( some_str.size() > 0 && some_str[some_str.length()-1] == '\n' )
some_str.resize( some_str.length()-1 );
or (removes several newlines at the end)
some_str.resize( some_str.find_last_not_of(L"\n")+1 );
Another way to do it in the for loop
void rm_nl(string &s) {
for (int p = s.find("\n"); p != (int) string::npos; p = s.find("\n"))
s.erase(p,1);
}
Usage:
string data = "\naaa\nbbb\nccc\nddd\n";
rm_nl(data);
cout << data; // data = aaabbbcccddd
All these answers seem a bit heavy to me.
If you just flat out remove the '\n' and move everything else back a spot, you are liable to have some characters slammed together in a weird-looking way. So why not just do the simple (and most efficient) thing: Replace all '\n's with spaces?
for (int i = 0; i < str.length();i++) {
if (str[i] == '\n') {
str[i] = ' ';
}
}
There may be ways to improve the speed of this at the edges, but it will be way quicker than moving whole chunks of the string around in memory.
If its anywhere in the string than you can't do better than O(n).
And the only way is to search for '\n' in the string and erase it.
for(int i=0;i<s.length();i++) if(s[i]=='\n') s.erase(s.begin()+i);
For more newlines than:
int n=0;
for(int i=0;i<s.length();i++){
if(s[i]=='\n'){
n++;//we increase the number of newlines we have found so far
}else{
s[i-n]=s[i];
}
}
s.resize(s.length()-n);//to delete only once the last n elements witch are now newlines
It erases all the newlines once.
About answer 3 removing only the last \n off string code :
if (!s.empty() && s[s.length()-1] == '\n') {
s.erase(s.length()-1);
}
Will the if condition not fail if the string is really empty ?
Is it not better to do :
if (!s.empty())
{
if (s[s.length()-1] == '\n')
s.erase(s.length()-1);
}
To extend #Greg Hewgill's answer for C++11:
If you just need to delete a newline at the very end of the string:
This in C++98:
if (!s.empty() && s[s.length()-1] == '\n') {
s.erase(s.length()-1);
}
...can now be done like this in C++11:
if (!s.empty() && s.back() == '\n') {
s.pop_back();
}
Optionally, wrap it up in a function. Note that I pass it by ptr here simply so that when you take its address as you pass it to the function, it reminds you that the string will be modified in place inside the function.
void remove_trailing_newline(std::string* str)
{
if (str->empty())
{
return;
}
if (str->back() == '\n')
{
str->pop_back();
}
}
// usage
std::string str = "some string\n";
remove_trailing_newline(&str);
Whats the most efficient way of removing a 'newline' from a std::string?
As far as the most efficient way goes--that I'd have to speed test/profile and see. I'll see if I can get back to you on that and run some speed tests between the top two answers here, and a C-style way like I did here: Removing elements from array in C. I'll use my nanos() timestamp function for speed testing.
Other References:
See these "new" C++11 functions in this reference wiki here: https://en.cppreference.com/w/cpp/string/basic_string
https://en.cppreference.com/w/cpp/string/basic_string/empty
https://en.cppreference.com/w/cpp/string/basic_string/back
https://en.cppreference.com/w/cpp/string/basic_string/pop_back