Is there a convenient way to parse an integer from a std::string::iterator in C++? For this specific question I only care about nonnegative base 10 integers, but all of these solutions can be extended fairly easily to arbitrary integers. Note that, unlike in similar questions, I don't have a reference to the original string, only a pair of iterators, e.g.
int parse_next_int(std::string::iterator begin, std::string::iterator end) {
// ...
}
I can think of a number of ways, but none are great. Another note, I'm not declaring stl headers, and I'm assuming everything is done in the std namespace. Hopefully this won't make the examples too difficult to parse.
Allocate a new string, and then call stoi:
int parse_next_int(string::iterator begin, string::iterator end) {
string::iterator num_end = find_if(
begin, end, [](char c)->bool{return !isdigit(c);});
string to_parse(begin, num_end);
return stoi(to_parse);
}
The downside of this is that I end up allocating a new buffer for something that could presumably be parsed on the fly.
Treat unsafely as a c string.
int parse_next_int(std::string::iterator begin, std::string::iterator end) {
return atoi(&(*begin));
}
This will somewhat work, but if it hits the end of the string and the string is not null terminated (which isn't guaranteed with C++ strings) it will read past the end of the buffer and may segfault, so while nice and concise, this is probably the worst.
Write it myself:
int parse_next_int(std::string::iterator begin, std::string::iterator end) {
int result = 0;
while (begin != end && isdigit(*begin)) {
result = result * 10 + (*begin++ - '0');
}
return result;
}
This works and is simple, but it's also heavily problem dependent and not very error tolerant.
Is there some significantly different method that mostly relies on more tolerant stl calls, while still being simple and avoids copying unnecessary buffers?
If you have access to boost you could use:
int parse_next_int(std::string::iterator begin, std::string::iterator end) {
return boost::lexical_cast<int>(&(*begin), std::distance(begin, end));
}
Create a std::string from the iterators.
Create a std::istringstream from the string.
Extract the integer from the istringstream.
int parse_next_int(std::string::iterator begin, std::string::iterator end) {
std::string s(begin, end);
std::istringstream str(s);
int i;
str >> i;
return i;
}
PS Add error handling code to make it production worthy.
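As a rough idea of what that error handling could look like, here is a minimal sketch. The name try_parse_next_int and the bool/out-parameter interface are assumptions made for illustration, not part of the answer above; the stream's failbit is used to detect both non-numeric input and overflow.

```cpp
#include <sstream>
#include <string>

// Hypothetical variant of parse_next_int that reports failure instead of
// returning an unspecified value. Returns true and stores the number in
// `out` on success; returns false (leaving `out` untouched) otherwise.
bool try_parse_next_int(std::string::const_iterator begin,
                        std::string::const_iterator end,
                        int& out) {
    std::string s(begin, end);   // still copies the range, as in the answer above
    std::istringstream str(s);
    int value = 0;
    if (!(str >> value)) {       // failbit is set on non-numeric input or overflow
        return false;
    }
    out = value;
    return true;
}
```

Note that operator>> skips leading whitespace and accepts a sign, which may or may not be what you want.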
Don't use atoi, it causes undefined behaviour if the number would exceed INT_MAX. Your option 3 has the same problem.
My suggestion is:
Find the end of the number, using find_if or strchr or whatever other method; allow for leading - or + if you want.
Null-terminate the substring
Use strtol to convert, with code to handle all the overflow cases.
Regarding the null termination, you could choose one of the following:
Copy to an automatic array (easiest option).
If end is not actually the end of the string, then write a temporary null terminator there, and restore the old character afterwards.
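The steps above, using the "copy to an automatic array" option, can be sketched roughly as follows. The function name, the bool/out-parameter interface and the 32-byte buffer size are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cerrno>
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <string>

// Sketch: returns true and stores the parsed value in `out`, or returns false
// when there are no digits or the value does not fit in an int.
bool parse_next_int_strtol(std::string::const_iterator begin,
                           std::string::const_iterator end,
                           int& out) {
    char buf[32];  // assumed to be large enough for any int, sign and '\0'

    // Step 1: find the end of the number (optional sign, then digits).
    std::string::const_iterator it = begin;
    if (it != end && (*it == '-' || *it == '+')) ++it;
    std::string::const_iterator num_end =
        std::find_if(it, end, [](char c) { return c < '0' || c > '9'; });
    if (num_end == it) return false;  // no digits at all

    // Step 2: copy to an automatic array and null-terminate the copy.
    const std::size_t len = static_cast<std::size_t>(num_end - begin);
    if (len >= sizeof buf) return false;  // too long for the buffer; treated as failure here
    std::copy(begin, num_end, buf);
    buf[len] = '\0';

    // Step 3: convert with strtol and check the overflow cases.
    errno = 0;
    char* parse_end = nullptr;
    const long value = std::strtol(buf, &parse_end, 10);
    if (parse_end == buf || errno == ERANGE) return false;
    if (value < INT_MIN || value > INT_MAX) return false;
    out = static_cast<int>(value);
    return true;
}
```

A limitation of this sketch is that a number padded with many leading zeros is rejected because it does not fit in the buffer, even if its value would fit in an int.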
Note that since C++11, std::strings are guaranteed to be null-terminated, so your dereference-and-treat-as-a-C-string solution is not unsafe at all; and, with a comment explaining what's going on, it would have my vote for the best solution to this problem.
I'm attempting to insert an array of unsigned ints into a std::vector.
Here is my current code:
auto add_chars(std::vector<char> & vec, unsigned val[]){
std::string tmp;
tmp.resize(11); // Max chars a uint can be represented by, including the '\0' for sprintf
for (auto x = 0; x< 10; x++){
auto char_count = sprintf(tmp.data(),"%u", val[x]);
vec.insert(vec.begin()+vec.size(),tmp.data(), tmp.data()+char_count);
}
}
int main(){
std::vector<char> chars;
unsigned val[10] {1,200,3,4,5,6000,7,8,9000};
add_chars(chars,val);
for (auto & item : chars){
std::cout << item;
}
}
This solution works, however I question its efficiency (and elegance).
Two questions:
Is there a more idiomatic way of doing this?
Is there a more efficient way of doing this?
Edit: fixed a bug in the code that was introduced while transferring it here.
Also, I'm aware that '9000' can't be represented as one char; that's why I'm using the buffer and sprintf to generate multiple chars for the one uint.
Is there a more idiomatic way of doing this?
A character stream is idiomatic for this. Unfortunately, the standard only has a stream for building a string, not a vector. You can copy the string into a vector though. This is not the most efficient way:
std::ostringstream ss;
unsigned val[10] {1,200,3,4,5,6000,7,8,9000};
for (auto v : val)
ss << v;
std::string str = ss.str();
// if you need a vector for some reason
std::vector<char> chars(std::begin(str), std::end(str));
Or you could write your own custom vector stream, but that will be a lot of boilerplate.
I would make a couple of changes (and fix the bug).
Firstly, the number of digits in an integer is limited, so there's no need to use a dynamic object like std::string; a simple char array will do. Assuming unsigned is 32 bits wide (as uint32_t is), 10 decimal characters are sufficient, 11 if you include a nul terminator.
Secondly sprintf and similar are inefficient because they have to interpret the format string, "%u" in your case. A hand written function to perform the conversion from uint32_t to digits would be more efficient.
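A hand-written conversion along those lines might look like the sketch below. The names append_decimal and add_chars (with an explicit element count) are assumptions for illustration; the buffer is sized from std::numeric_limits rather than a hard-coded 10, so it does not depend on unsigned being 32 bits.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Appends the decimal digits of `value` to `out` without using sprintf.
void append_decimal(std::vector<char>& out, unsigned value) {
    // digits10 + 1 is the maximum number of decimal digits of an unsigned.
    char buf[std::numeric_limits<unsigned>::digits10 + 1];
    char* const end = buf + sizeof buf;
    char* p = end;
    do {                                   // do/while so that 0 yields "0"
        *--p = static_cast<char>('0' + value % 10);
        value /= 10;
    } while (value != 0);
    out.insert(out.end(), p, end);         // digits were produced back to front
}

// Appends the digits of every element of vals[0..count) to `vec`.
void add_chars(std::vector<char>& vec, const unsigned* vals, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        append_decimal(vec, vals[i]);
    }
}
```

No nul terminator is needed here because the digits are copied by pointer range rather than as a C string.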
Let's say I am traversing a string of length n. I want it to end at a specific character that fulfils some conditions. I know that C style strings can be terminated at the i'th position by simply assigning the character '\0' at position i in the character array.
Is there any way to achieve the same result in an std::string (C++ style string)? I can think of substr, erase, etc. but all of them are linear in their complexity, which I cannot afford to use.
TL;DR, is there any "end" character for an std::string? Can I make the end iterator point to the current character somehow?
You can use resize:
std::string s = /* ... */;
if (auto n = s.find(c); n != s.npos) {
s.resize(n);
}
The logical answer here is basic_string::resize. What the standard says about this function is:
Effects: Alters the length of the string designated by *this as follows:
If n <= size(), the function replaces the string designated by *this with a string of length n whose elements are a copy of the initial elements of the original string designated by *this.
If n > size(), the function replaces the string designated by *this with a string of length n whose first size() elements are a copy of the original string designated by *this, and whose remaining elements are all initialized to c.
Now, that looks very much like linear time. However, the standard does not specifically state that things will happen this way. They only state that it will be "as if" things happen this way. Therefore, an implementation is completely free to implement the shrinking version of resize by shifting one pointer and writing a NUL character. Nothing in the standard would forbid such an implementation.
So the real question is... are standard library implementations written by complete morons? It's certainly possible that they are. But it's probably wise not to assume so.
Personally, I'd just use resize on the assumption that the library implementers know what they're doing. After all, if they can't write an optimization as simple as that, then who knows what other things they're doing wrong? If you can't trust your standard library implementation not to do stupid things, then you shouldn't be using it in performance-critical code.
is there any "end" character for an std::string?
No. It is possible to define a std::string that is not null terminated. You won't be able to do a few things for such strings, such as treat the return value of std::string::data() as a null terminated C string 1, but a std::string can be constructed that way.
Can I make the end iterator point to the current character somehow?
To get a std::string::iterator to point to a certain character, you'll have to traverse the string.
E.g.
std::string str = "This is a string";
auto iter = str.begin();
auto end = iter;
while ( end != str.end() && *end != 'r' )
++end;
After that, the range defined by iter and end contains the string "This is a st".
If that is not acceptable, you'll have to adapt your code to check the value of the character for every step.
std::string str = "This is a string";
auto iter = str.begin();
// Break when 'r' is encountered or end of string is reached.
while ( iter != str.end() && *iter != 'r' )
{
// Use *iter, then advance so the loop terminates
...
++iter;
}
1 Thanks are due to @Cubbi for pointing out an error in what I stated. std::string::data() can return a char const* that is not null terminated if using a version of C++ earlier than C++11. If using C++11 or later, std::string::data() is required to return a null terminated char const*.
std::string does not have an "end character" like C-style strings. You can have many null characters inside a single std::string. If you want the string to end after a certain character, then you need to erase the rest of the characters in the string after that last character.
In your case that would give you something like
string_variable.erase(pos_of_last_character + 1);
TL;DR, is there any "end" character for an std::string? Can I make the end iterator point to the current character somehow?
Not really. std::string tracks the number of characters it stores, reported by std::string::size(), independently of any sentinel character like '\0'.
A '\0' only matters when a std::string is initialized from a const char*, where it marks the end of the input.
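A small sketch may make the distinction concrete. The two helper functions are hypothetical and exist only to show the two constructors side by side: one is given an explicit length, the other has to look for a '\0'.

```cpp
#include <cstddef>
#include <string>

// Built with an explicit length, so the embedded '\0' is just another
// character and size() is 5.
std::string with_embedded_nul() {
    const char raw[] = {'a', 'b', '\0', 'c', 'd'};
    return std::string(raw, sizeof raw);
}

// Built through the const char* constructor, which stops at the first '\0',
// so size() is 2.
std::string from_c_string() {
    const char raw[] = {'a', 'b', '\0', 'c', 'd', '\0'};
    return std::string(raw);
}
```

In the first case the '\0' at index 2 does not end the string; only size() decides where the string ends.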
I was trying to write a function that returns the first non-repeated character in a string. The algorithm I made was:
Assert that the string is non-empty
Iterate through the string and add all non-repeated characters to a set
Assert that the set be non-empty
Iterate through string again and return the first character that's in the set
Add a useless return statement to make the compiler happy. (Arbitrarily return 'F')
Obviously my algorithm is very "brute force" and could be improved on. It runs, anyhow. I was wondering if there's a better way to do this and was also wondering what the convention is for useless return statements. Don't be afraid to criticize me harshly. I'm trying to become a C++ stiffler. ;)
#include <cassert>
#include <iostream>
#include <string>
#include <set>
char first_nonrepeating_char(const std::string&);
int main() {
std::string S = "yodawgIheardyoulike";
std::cout << first_nonrepeating_char(S);
}
// Finds the first non-repeated character in the string
char first_nonrepeating_char(const std::string& str) {
assert (str.size() > 0);
std::set<char> nonRepChars;
std::string::const_iterator it = str.begin();
while (it != str.end()) {
if (nonRepChars.count(*it) == 0) {
nonRepChars.insert(*it);
} else {
nonRepChars.erase(*it);
}
++it;
}
assert (nonRepChars.size() != 0);
it = str.begin();
while (it != str.end()) {
if (nonRepChars.count(*it) == 1) return (*it);
++it;
}
return ('F'); // NEVER HAPPENS
}
The main problem is just getting rid of warnings.
Ideally you should be able to just say
assert( false ); // Should never get here
but unfortunately that does not get rid of all warnings with the compilers I use most, namely Visual C++ and g++.
Instead I do this:
xassert_should_never_get_here();
where xassert_should_never_get_here is a function that
is declared as "noreturn" by compiler-specific means, e.g. __declspec for Visual C++,
has an assert(false) to handle debug builds,
then throws a std::logic_error.
The last two points are accomplished by a macro XASSERT (its actual name in my code is CPPX_XASSERT, it's always a good idea to use prefixes for macro names so as to reduce name conflict probability).
Of course, the assertion that you should not get to the end, is equivalent to an assertion that the argument string does contain at least one non-repeated character, which therefore is a precondition of the function (part of its contract), which I think should be documented by a comment. :-)
There are three main "modern C++" ways of coding things up when you do not have that precondition, namely
choose one char value to signify "no such", e.g. '\0', or
throw an exception in the case of no such, or
return a boxed result which can be logically "empty", e.g. the Boost class corresponding to Barton and Nackman's Fallible (boost::optional).
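The third option can be sketched as follows. This uses std::optional from C++17 as a stand-in for the Boost class, and a per-character count table instead of the set; both choices, and the function name, are assumptions for illustration rather than the code described above.

```cpp
#include <array>
#include <climits>
#include <optional>
#include <string>

// Returns the first character that occurs exactly once, or an empty
// optional when every character repeats (or the string is empty).
std::optional<char> find_first_nonrepeating_char(const std::string& str) {
    std::array<int, UCHAR_MAX + 1> counts{};  // zero-initialised counters
    for (char c : str) {
        ++counts[static_cast<unsigned char>(c)];
    }
    for (char c : str) {
        if (counts[static_cast<unsigned char>(c)] == 1) return c;
    }
    return std::nullopt;                      // no precondition, no dummy return value
}
```

The caller then has to check the result before using it, which makes the "no such character" case explicit in the function's type instead of in a comment.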
About the algorithm: when you're not interested in where the first non-repeating char is, you can avoid the rescan of the string by maintaining a count per character, e.g. by using a map<char, int> instead of a set<char>.
There is a simpler and "cleaner" way of doing it, but it is not computationally faster than "brute force".
Use a table that counts the number of occurrences of each character in the input string.
Then go over the input string one more time, and return the first character whose count is 1.
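char GetFirstNonRepeatedChar(const char* s)
{
int table[256] = {0}; // one counter per possible unsigned char value
for (int i=0; s[i]!=0; i++)
table[(unsigned char)s[i]]++; // the cast avoids a negative index when char is signed
for (int i=0; s[i]!=0; i++)
if (table[(unsigned char)s[i]] == 1)
return s[i];
return 0; // no non-repeated character found
}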
Note: the above will work for ASCII strings.
If you're using a different format, then you'll need to change the 256 (and the char of course).
In my code, I have char array and here it is: char pIPAddress[20];
And I'm setting this array from a string with this code: strcpy(pIPAddress,pString.c_str());
After this, pIPAddress holds for example "192.168.1.123 ". But I don't want the spaces; I need to delete them. For this I did pIPAddress[13]=0;.
But if the IP length changes, it won't work. How can I remove the spaces in an efficient way? Or are there other ways?
Thnx
The simplest approach that you can do is to use the std::remove_copy algorithm:
std::string ip = read_ip_address();
char ipchr[20];
*std::remove_copy( ip.begin(), ip.end(), ipchr, ' ' ) = 0; // [1]
The next question would be why would you want to do this, because it might be better not to copy it into an array but rather remove the spaces from the string and then use c_str() to retrieve a pointer...
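Removing the spaces from the string itself, as suggested in the previous paragraph, is usually done with the erase-remove idiom. A minimal sketch, with without_spaces as a hypothetical helper name:

```cpp
#include <algorithm>
#include <string>

// Returns a copy of `ip` with every ' ' character removed.
std::string without_spaces(std::string ip) {
    // std::remove shifts the kept characters to the front and returns the
    // new logical end; erase then drops the leftover tail.
    ip.erase(std::remove(ip.begin(), ip.end(), ' '), ip.end());
    return ip;
}
```

The result's c_str() can then be passed wherever a null-terminated pointer is needed, for as long as the string object is alive and unmodified.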
EDIT As per James's suggestion, if you want to remove all whitespace and not just the ' ' character, you can use std::remove_copy_if with a predicate. I have tested passing std::isspace directly and it seems to work, but the single-argument overload (declared in <cctype>) has undefined behaviour for negative values other than EOF, so non-ASCII characters, which may be negative when char is signed, can be problematic:
#include <cctype>
#include <locale>
#include <algorithm>
#include <string>
int main() {
std::string s = get_ip_address();
char ip[20];
*std::remove_copy_if( s.begin(), s.end(), ip, (int (*)(int))std::isspace ) = 0; // [1]
}
The horrible cast in the last argument is required to select a particular overload of isspace.
[1] The *... = 0; needs to be added to ensure NUL termination of the string. The remove_copy and remove_copy_if algorithms return an end iterator in the output sequence (i.e. one beyond the last element edited), and the *...=0 dereferences that iterator to write the NUL. Alternatively the array can be initialized before calling the algorithm char ip[20] = {}; but that will write \0 to all 20 characters in the array, rather than only to the end of the string.
If spaces are only at the end (or beginning) of your string, you'd best use boost::trim
#include <boost/algorithm/string/trim.hpp>
std::string pString = ...
boost::trim(pString);
strcpy(pIPAddress,pString.c_str());
If you want to handcode, <cctype> has the function isspace, which also has a locale specific version.
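A hand-coded trim based on isspace from <cctype> might look like this sketch; trim_copy is a hypothetical name, and the cast to unsigned char is there because isspace is only defined for values representable as unsigned char (or EOF).

```cpp
#include <cctype>
#include <string>

// Returns a copy of `s` without leading and trailing whitespace.
std::string trim_copy(const std::string& s) {
    std::string::size_type first = 0;
    std::string::size_type last = s.size();
    while (first < last && std::isspace(static_cast<unsigned char>(s[first]))) {
        ++first;   // skip leading whitespace
    }
    while (last > first && std::isspace(static_cast<unsigned char>(s[last - 1]))) {
        --last;    // skip trailing whitespace
    }
    return s.substr(first, last - first);
}
```

Whitespace in the middle of the string is left alone, which matches what boost::trim does.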
I see you have a std::string. You can use the erase() method :
std::string tmp = pString;
for(std::string::iterator iter = tmp.begin(); iter != tmp.end(); )
if(*iter == ' ') iter = tmp.erase(iter); else ++iter; // only advance when nothing was erased
Then you can copy the contents of tmp into your char array.
Note that raw char arrays are discouraged in modern C++ and you shouldn't use them unless you absolutely have to. Either way, you should do all your string manipulations using std::string.
To make the solution work in all cases, I suggest you iterate through your string and deal with each space as you find it.
A more high-level solution may be for you to use the string methods that allow you to do that automatically. (see: http://www.cplusplus.com/reference/string/string/)
I think if you are using
strcpy(pIPAddress,pString.c_str())
then nothing is required to be done, as c_str() returns a const char* to a null terminated string. So after doing the above operation your char array 'pIPAddress' is itself null terminated. So nothing needs to be done to adjust the length as you said.
It's been a while since I have worked with C++, I'm currently catching up for an upcoming programming test. I have the following function that has this signature:
void MyIntToChar(int *arrayOfInt,char* output)
arrayOfInt is an array of integers and char* output is a buffer that should be long enough to hold the string representation of the integers that the function receives.
Here is an example of the usage of such function:
int numbers[3] = {11, 26, 81};
char* output = ""; // this I'm sure is not valid, any suggestions on how to
// to properly initialize this string?
MyIntToChar(numbers,output);
cout << output << endl; // this should print "11 26 81" or "11, 26, 81".
// i.e. formatting should not be a problem.
I have been reviewing my old c++ notes from college, but I keep having problems with these. I'm hating myself right now for going to the Java world and not working in this.
Thanks.
void MyIntToChar(int *arrayOfInt, char* output);
That's wrong in several ways. First of all, it's a misnomer. You cannot, in general, convert an integer into one character, because only ten of all integers (0...9) would fit into one. So I will assume you want to convert integers into strings instead.
Then, if you pass arrays to functions, they decay to pointers to their first element, and all information about the array's size is lost. So when you pass arrays to functions, you need to pass size information, too.
Either use the C way of doing this and pass in the number of elements as std::size_t (to be obtained as sizeof(myarray)/sizeof(myarray[0])):
void MyIntToStr(int *arrayOfInt, std::size_t arraySize, char* output);
Or do it the C++ way and pass in two iterators, one pointing at the first element (so-called begin iterator) and the other pointing to one behind the last (end iterator):
void MyIntToStr(int *begin, int *end, char* output);
You can improve on that by not insisting on the iterators being int*, but anything which, when dereferenced, yields an int:
template< typename FwdIt >
void MyIntToStr(FwdIt begin, FwdIt end, char* output);
(Templates would require you to implement the algorithm in a header.)
Then there's the problems with the output. First of all, do you really expect all the numbers to be written into one string? If so, how should they be separated? Nothing? Whitespace? Comma?
Or do you expect an array of strings to be returned?
Assuming you really want one string, if I pass the array {1, 2, 3, 4, 5} into your function, it needs space for five single-digit integers plus the space needed for four separators. Your function signature suggests you want me to allocate that upfront, but frankly, if I have to calculate this myself, I might just as well do the conversions myself. Further, I have no way of telling you how much memory that char* points to, so you can't check whether I was right. As generations of developers have found out, this is so hard to get right every time, that several computer languages have been invented to make things easier for programmers. One of those is C++, which nowadays comes with a dynamically resizing string class.
It would be much easier (for you and for me) if I could pass you a string and you wrote into that:
template< typename FwdIt >
void MyIntToStr(FwdIt begin, FwdIt end, std::string& output);
Note that this passes the string per non-const reference. This allows you to modify my string and lets me see the changes you made.
However, once we're doing this, you might just as well return a new string instead of requiring me to pass one to you:
template< typename FwdIt >
std::string MyIntToStr(FwdIt begin, FwdIt end);
If, however, you actually wanted an array of strings returned, you shouldn't take one string to write to, but some means of specifying where to write them. The naive way of doing this would be to pass a dynamically resizable array of dynamically resizable strings. In C++, this is spelled std::vector<std::string>:
template< typename FwdIt >
void MyIntToStr(FwdIt begin, FwdIt end, std::vector<std::string>& output);
Again, it might be better if you returned such an array (although some would disagree, since copying an array of strings might be considered too expensive). However, the best way to do this would not require me to accept the result in the form of a std::vector. What if I needed the strings in a (linked) list instead? Or written to some stream?
The best way to do this would be for your function to accept an output iterator to which you write your result:
template< typename FwdIt, typename OutIt >
void MyIntToStr(FwdIt begin, FwdIt end, OutIt output);
Of course, now that's so general that it's hard to see what it does, so it's good we gave it a good name. However, looking at it I immediately think that this should build on another function which is needed probably even more than this one: A function that takes one integer and converts it to one string. Assuming that we have such a function:
std::string MyIntToStr(int i);
it's very easy to implement the array versions:
template< typename FwdIt, typename OutIt >
void MyIntToStr(FwdIt begin, FwdIt end, OutIt output)
{
while(begin != end)
*output++ = MyIntToStr(*begin++);
}
Now all that remains for you to do is to implement that std::string MyIntToStr(int i); function. As someone else already wrote, that's easily done using string streams (from <sstream>) and you shouldn't have a problem finding some good examples for that. However, it's even easier to find bad examples, so I'd rather give you one here:
std::string MyIntToStr(int i)
{
std::ostringstream oss;
oss << i;
if(!oss) throw "bah!"; // put your error reporting mechanism here
return oss.str();
}
Of course, given templates, that's easy to generalize to accepting anything that's streamable:
template< typename T >
std::string MyIntToStr(const T& obj)
{
std::ostringstream oss;
oss << obj;
if(!oss) throw "bah!"; // put your error reporting mechanism here
return oss.str();
}
Note that, given this general function template, the MyIntToStr() working on arrays now automatically works on arrays of any type the function template working on one object works on.
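Putting the two templates together, a compilable sketch might look like the following. The to_strings helper and the use of std::runtime_error in place of throw "bah!" are assumptions added for illustration; std::back_inserter is one possible output iterator, appending each converted string to a vector.

```cpp
#include <iterator>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Converts one streamable object into a string.
template <typename T>
std::string MyIntToStr(const T& obj) {
    std::ostringstream oss;
    oss << obj;
    if (!oss) throw std::runtime_error("conversion to string failed");
    return oss.str();
}

// Converts every element of [begin, end) and writes the results to `output`.
template <typename FwdIt, typename OutIt>
void MyIntToStr(FwdIt begin, FwdIt end, OutIt output) {
    while (begin != end)
        *output++ = MyIntToStr(*begin++);  // one argument: picks the overload above
}

// Hypothetical convenience wrapper that collects the strings in a vector.
std::vector<std::string> to_strings(const int* begin, const int* end) {
    std::vector<std::string> result;
    MyIntToStr(begin, end, std::back_inserter(result));
    return result;
}
```

For the array in the question, to_strings(numbers, numbers + 3) yields the three strings "11", "26" and "81"; an std::ostream_iterator<std::string> could be passed instead to write them straight to a stream.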
So, at the end of this (rather epic, I apologize) journey, this is what we arrived at: a generalized function template to convert anything (which can be written to a stream) into a string, and a generalized function template to convert the contents of any array of objects (which can be written to a stream) into strings written through an output iterator.
Note that, if you had at least a dozen 90-minute lectures on C++ and your instructors failed to teach you enough to at least understand what I've written here, you have not been taught well according to modern C++ teaching standards.
Well an integer converted to string will require a max of 12 bytes including sign (assuming 32bit), so you can allocate something like this
char * output= new char[12*sizeof(numbers)/sizeof(int)];
First of all, it is impossible to use your function the way your example shows:
char* output = "" points at a string literal that holds only one byte (the null terminator '\0') and must not be modified. So you can't put a whole string in it; writing to it is undefined behaviour and will typically cause a segmentation fault. So, here you are going to make use of the heap. This is already implemented in std::string and std::stringstream. So use them for this problem.
Let's have a look:
#include <string>
#include <sstream>
#include <iostream>
std::string intArrayToString(int *integers, int numberOfInts)
{
std::stringstream ss;
for (int i = 0; i < numberOfInts; i++)
{
ss << integers[i] << ", ";
}
std::string temp = ss.str();
return temp.substr(0, temp.size() - 2); // Cut off the extra ", "
}
And if you want to convert it to char*, you can use yourString.c_str();
Here's a possibility if you are willing to reconsider a change in prototype of the function
template<int n>
void MyIntToChar(int (&iarr)[n], string &output){
stringstream ss;
for(size_t id = 0; id < n; ++id){
ss << iarr[id];
if(id != n - 1) ss << " ";
}
output = ss.str();
}
int main(){
int numbers[3] = {11, 26, 81};
string out = "";
MyIntToChar(numbers, out);
}
You should take a look at the std::stringstream, or, more C-ish (as char* type instead of strings might suggest) sprintf.
Did you try sprintf()? It will do the job. As for initializing the char*, you either have to allocate it by calling malloc, or you can declare it as a char array and pass its address to the function rather than a value.
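A bounded variant of that idea, using snprintf so the function cannot write past the caller's buffer, could be sketched like this. The extra count and outputSize parameters and the -1 error value are assumptions; the signature in the question has no way of carrying either size.

```cpp
#include <cstddef>
#include <cstdio>

// Writes the integers separated by single spaces into `output`.
// Returns the number of characters written (excluding the '\0'),
// or -1 if the buffer is too small.
int MyIntToChar(const int* arrayOfInt, std::size_t count,
                char* output, std::size_t outputSize) {
    if (outputSize == 0) return -1;
    output[0] = '\0';                       // valid result for an empty array
    std::size_t used = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const int n = std::snprintf(output + used, outputSize - used, "%s%d",
                                    i == 0 ? "" : " ", arrayOfInt[i]);
        // snprintf returns the length it wanted to write; a value that does
        // not fit in the remaining space means the output was truncated.
        if (n < 0 || static_cast<std::size_t>(n) >= outputSize - used) return -1;
        used += static_cast<std::size_t>(n);
    }
    return static_cast<int>(used);
}
```

With int numbers[3] = {11, 26, 81}; and char output[32];, calling MyIntToChar(numbers, 3, output, sizeof output) leaves "11 26 81" in output.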
Sounds like you would like to use C. Here's an example. There's an extra ", " at the end of the output but it should give you a feel for the concept. I changed the return type so that I would know how many bytes of output were used (the alternative would be to initialize output with nulls), and I added a count parameter because sizeof on a pointer parameter gives the size of the pointer, not the number of elements.
#include <stdio.h>
int MyIntToChar(const int *arrayOfInt, int count, char* output) {
int bytes_used = 0; // use to bump the address past what has been used
for (int i = 0 ; i < count; ++i)
bytes_used += sprintf(output + bytes_used, "%d, ", arrayOfInt[i]);
return bytes_used;
}
int main() {
int numbers[5] = {5, 2, 11, 26, 81};
const int count = sizeof(numbers)/sizeof(numbers[0]);
char output[count * 13 + 1]; // up to 11 chars per 32-bit int, plus ", ", plus the final '\0'
int bytes_used = MyIntToChar(numbers, count, output);
printf("%.*s\n", bytes_used, output); // prints "5, 2, 11, 26, 81, "
}