std::string characters somehow turned into Unicode/ASCII numbers

I have a function ls() which takes a vector of strings and joins it into a comma-separated list, wrapped in parentheses ():
std::string ls(std::vector<std::string> vec, std::string wrap="()", std::string sep=", ") {
std::string wrap_open, wrap_close;
wrap_open = std::to_string(wrap[0]);
wrap_close = std::to_string(wrap[1]);
std::string result = wrap_open;
size_t length = vec.size();
if (length > 0) {
if (length == 1) {
result += vec[0];
result += wrap_close;
}
else {
for (int i = 0; i < vec.size(); i++) {
if (i == vec.size() - 1) {
result += sep;
result += vec[i];
result += wrap_close;
}
else if (i == 0) {
result += vec[i];
}
else {
result += sep;
result += vec[i];
}
}
}
}
else {
result += wrap_close;
}
return result;
}
If I pass this vector
std::vector<std::string> vec = {"hello", "world", "three"};
to the ls() function, I should get this string:
std::string parsed_vector = ls(vec);
// AKA
std::string result = "(hello, world, three)"
The parsing works fine, however the characters in the wrap string turn into numbers when printed.
std::cout << result << std::endl;
Will result in the following:
40hello, world, three41
When it should instead result in this:
(hello, world, three)
The ( is turned into 40, and the ) is turned into 41.
My guess is that the characters are being converted to their Unicode/ASCII numeric values or something like that, but I do not know how this happened or what to do about it.

The problem here is that std::to_string converts a number to a string. There is no overload for char, so wrap[0] is promoted to int and you end up converting the character's ASCII value to a string:
wrap_open = std::to_string(wrap[0]);
wrap_close = std::to_string(wrap[1]);
Instead, you could simply do:
std::string wrap_open(1, wrap[0]);
std::string wrap_close(1, wrap[1]);
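To make the difference concrete, here is a minimal sketch. The helper names as_number and as_text are hypothetical, and the expected values assume an ASCII execution character set, where '(' has the value 40.

```cpp
#include <string>

// Hypothetical helper: what the question's code does.
// The char is promoted to int, so the numeric value is formatted.
std::string as_number(char c) {
    return std::to_string(c);
}

// Hypothetical helper: what the fix does.
// The (count, char) constructor builds a string of `count` copies of c.
std::string as_text(char c) {
    return std::string(1, c);
}
```

With ASCII, as_number('(') yields "40" while as_text('(') yields "(".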
Note that you can greatly simplify your function by using std::ostringstream:
std::ostringstream oss;
oss << wrap[0];
for (size_t i = 0; i < vec.size(); i++)
{
if (i != 0) oss << sep;
oss << vec[i];
}
oss << wrap[1];
return oss.str();
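Put together as a complete function, the stream-based version might look like the sketch below. The name ls_joined is hypothetical, the parameters are taken by const reference (a change from the question's signature), and wrap is assumed to hold at least two characters.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical rewrite of ls(): joins `vec` with `sep` and wraps the result
// in wrap[0] ... wrap[1]. Assumes `wrap` has at least two characters;
// at() throws std::out_of_range otherwise.
std::string ls_joined(const std::vector<std::string>& vec,
                      const std::string& wrap = "()",
                      const std::string& sep = ", ") {
    std::ostringstream oss;
    oss << wrap.at(0);              // a char is streamed as a character, not a number
    for (std::size_t i = 0; i < vec.size(); ++i) {
        if (i != 0) oss << sep;     // separator before every element except the first
        oss << vec[i];
    }
    oss << wrap.at(1);
    return oss.str();
}
```

Because the separator is written before every element except the first, the empty, single-element and multi-element cases need no special handling.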

I won't comment on how the function could be improved, beyond noting that passing a vector by value is rarely a good idea when the function only reads it (a const reference avoids the copy). Here is how to fix your current issue:
std::string ls(std::vector<std::string> vec, std::string wrap = "()", std::string sep = ", ") {
std::string wrap_open, wrap_close;
wrap_open = wrap.at(0); //<----
wrap_close = wrap.at(1); //<----
std::string result = wrap_open;
size_t length = vec.size();
if (length > 0) {
... // Rest of the code
You don't need to use std::to_string. Assigning a char to a std::string with = calls the assignment operator overload that takes a single character, which replaces the string's contents with that one character.
I recommend reading about std::string; it is apparent that you aren't using the full potential of the standard library.
EDIT: After discussing the usage of .at() vs the [] operator in the comments, I've decided to add this to the answer:
The main difference between .at() and [] is bounds checking. .at() performs a bounds check and throws a std::out_of_range exception if the index is invalid. The [] operator (IMHO) is present in STL containers for backwards compatibility (imagine refactoring old C code into a C++ project). The point is that it behaves like you would expect [] to behave and doesn't do any bounds checking.
In general I recommend .at(), especially to beginners and especially if you are relying on human input. The uncaught exception produces an easy-to-understand error, while an unchecked [] will either produce weird values or a RAV (read access violation), depending on the type stored in the container, and from experience beginners usually have a harder time debugging this.
Bear in mind that this is just the opinion of one programmer and opinions may vary (as is visible in the discussion).
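As a small illustration of the bounds check, the sketch below wraps .at() in a hypothetical helper, at_throws, that reports whether the access was rejected. Calling [] with the same out-of-range index would be undefined behaviour rather than an exception, so it is not demonstrated.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true if s.at(index) throws std::out_of_range.
bool at_throws(const std::string& s, std::size_t index) {
    try {
        (void)s.at(index);   // bounds-checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;         // index >= s.size()
    }
}
```

For the default wrap string "()", indices 0 and 1 are accepted and index 2 is rejected.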
Hope it helps!

Related

Recognize string formatting Debug Assertion

I have a runtime problem with code below.
The purpose is to "recognize" the formats (%s %d etc) within the input string.
To do this, it returns an integer that matches the data type.
Then the extracted types are manipulated/handled in other functions.
I want to clarify that my purpose isn't to write formatted types in a string (snprintf etc.) but only to recognize/extract them.
The problem is the crash of my application with error:
Debug Assertion Failed!
Program:
...ers\Alex\source\repos\TestProgram\Debug\test.exe
File: minkernel\crts\ucrt\appcrt\convert\isctype.cpp
Line: 36
Expression: c >= -1 && c <= 255
My code:
#include <iostream>
#include <cstring>
enum Formats
{
TYPE_INT,
TYPE_FLOAT,
TYPE_STRING,
TYPE_NUM
};
typedef struct Format
{
Formats Type;
char Name[5 + 1];
} SFormat;
SFormat FormatsInfo[TYPE_NUM] =
{
{TYPE_INT, "d"},
{TYPE_FLOAT, "f"},
{TYPE_STRING, "s"},
};
int GetFormatType(const char* formatName)
{
for (const auto& format : FormatsInfo)
{
if (strcmp(format.Name, formatName) == 0)
return format.Type;
}
return -1;
}
bool isValidFormat(const char* formatName)
{
for (const auto& format : FormatsInfo)
{
if (strcmp(format.Name, formatName) == 0)
return true;
}
return false;
}
bool isFindFormat(const char* strBufFormat, size_t stringSize, int& typeFormat)
{
bool foundFormat = false;
std::string stringFormat = "";
for (size_t pos = 0; pos < stringSize; pos++)
{
if (!isalpha(strBufFormat[pos]))
continue;
if (!isdigit(strBufFormat[pos]))
{
stringFormat += strBufFormat[pos];
if (isValidFormat(stringFormat.c_str()))
{
typeFormat = GetFormatType(stringFormat.c_str());
foundFormat = true;
}
}
}
return foundFormat;
}
int main()
{
std::string testString = "some test string with %d arguments"; // crash application
// std::string testString = "%d some test string with arguments"; // not crash application
size_t stringSize = testString.size();
char buf[1024 + 1];
memcpy(buf, testString.c_str(), stringSize);
buf[stringSize] = '\0';
for (size_t pos = 0; pos < stringSize; pos++)
{
if (buf[pos] == '%')
{
if (buf[pos + 1] == '%')
{
pos++;
continue;
}
else
{
char bufFormat[1024 + 1];
memcpy(bufFormat, buf + pos, stringSize);
bufFormat[stringSize] = '\0';
int typeFormat;
if (isFindFormat(bufFormat, stringSize, typeFormat))
{
std::cout << "type = " << typeFormat << "\n";
// ...
}
}
}
}
}
As noted in the comments in the code, the string that starts with %d works, while the string with %d in the middle crashes the application.
I also wanted to ask: is there a better or faster way to recognize format specifiers such as %d and %s within a string (not necessarily by returning an int)?
Thanks.

Let's take a look at this else clause:
char bufFormat[1024 + 1];
memcpy(bufFormat, buf + pos, stringSize);
bufFormat[stringSize] = '\0';
The variable stringSize was initialized with the size of the original format string. Let's say it's 30 in this case.
Let's say you found the %d code at offset 20. You're going to copy 30 characters, starting at offset 20, into bufFormat. That means you're copying 20 characters past the end of the original string. You could possibly read off the end of the original buf, but that doesn't happen here because buf is large. The third line sets a NUL into the buffer at position 30, again past the end of the data, but your memcpy copied the NUL from buf into bufFormat, so that's where the string in bufFormat will end.
Now bufFormat contains the string "%d arguments". Inside isFindFormat you search for the first isalpha character. Possibly you meant isalnum here? We can only get to the isdigit line if the isalpha check passes, and if a character is isalpha, it's not isdigit.
In any case, after isalpha passes, isdigit will definitely return false, so we enter that if block. Your code will find the right type here. But the loop doesn't terminate. Instead, it continues scanning up to stringSize characters, which is the stringSize from main, that is, the size of the original format string. But the string you're passing to isFindFormat only contains the part starting at '%'. So you're going to scan past the end of the string and read whatever's in the buffer. Those bytes are uninitialized, and any char with a negative value passed to isalpha fails the c >= -1 && c <= 255 check, which is the assertion you're seeing.
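Independently of the loop bound, the character classification functions expect a value representable as unsigned char (or EOF), so a plain char should be converted before the call. A minimal sketch, where the wrapper name is_alpha_char is hypothetical:

```cpp
#include <cctype>

// Hypothetical wrapper: converts to unsigned char first so that chars with
// a negative value (possible where plain char is signed) never reach
// std::isalpha as a negative int.
bool is_alpha_char(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
}
```

This does not fix reading past the end of the string; it only keeps the argument within the range the function accepts.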
There's a lot more going on here. You're mixing and matching std::string and C strings; see if you can use std::string::substr instead of copying. You can use std::string::find to find characters in a string. If you have to use C strings, use strcpy instead of memcpy followed by the addition of a NUL.
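A sketch of the std::string::find approach, under some assumptions: the function name find_format_specifiers is hypothetical, only the three specifiers from the question (d, f, s) are recognized, "%%" is treated as a literal percent sign, and width or precision flags are not handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the conversion character of every "%d",
// "%f" or "%s" in `text`, in order of appearance.
std::vector<char> find_format_specifiers(const std::string& text) {
    std::vector<char> found;
    std::size_t pos = text.find('%');
    while (pos != std::string::npos && pos + 1 < text.size()) {
        const char next = text[pos + 1];
        if (next == 'd' || next == 'f' || next == 's') {
            found.push_back(next);
        }
        // Skip both characters so that "%%" is never re-read as a specifier.
        pos = text.find('%', pos + 2);
    }
    return found;
}
```

No fixed-size buffers or manual NUL termination are involved, so there is nothing to read past.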

You could delegate this to a regex engine, which is built for searching through strings.
Since C++11 the standard library supports this directly; all you have to do is
#include <regex>
Then you can match against strings in various ways. For instance, regex_match together with an smatch lets you find your target in just a few lines using only the standard library. Note that regex_match succeeds only if the whole string matches the pattern; use regex_search to find a match anywhere in the string.
std::smatch sm;
std::regex_match(testString.cbegin(), testString.cend(), sm, str_exp);
where str_exp is the regex describing what you want to find.
After a successful match, sm holds the full match followed by each captured sub-expression, which you can print like this:
for (std::size_t i = 0; i < sm.size(); ++i)
{
std::cout << "Match:" << sm[i] << std::endl;
}
EDIT:
To better illustrate the result, here is a simple example:
// target string to be searched
std::string target_string = "simple example no.%d is: %s";
// pattern to look for
std::regex str_exp("(%[sd])");
// match object
std::smatch sm;
// repeatedly search for the pattern, dropping the part of the string already matched
std::cout << "My format strings extracted:" << std::endl;
while (std::regex_search(target_string, sm, str_exp))
{
std::cout << sm[0] << std::endl;
target_string = sm.suffix();
}
You can easily support additional format specifiers by modifying the str_exp regular expression.
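If you would rather not modify the target string while searching, std::sregex_iterator walks over all matches in place. The sketch below assumes a hypothetical helper name, extract_formats, and a pattern covering %d, %f and %s; it does not treat "%%" specially.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every "%d", "%f" or "%s" found in `text`.
std::vector<std::string> extract_formats(const std::string& text) {
    static const std::regex pattern("%[dfs]");
    std::vector<std::string> found;
    // A default-constructed sregex_iterator acts as the end-of-matches marker.
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        found.push_back(it->str());
    }
    return found;
}
```

The iterator keeps its own position, so the input can stay const.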

fastest way to read the last line of a string?

I'd like to know the fastest way for reading the last line in a std::string object.
Technically, the string after the last occurrence of \n in the fastest possible way?

This can be done using just string::find_last_of and string::substr like so
std::string get_last_line(const std::string &str)
{
auto position = str.find_last_of('\n');
if (position == std::string::npos)
return str;
else
return str.substr(position + 1);
}

I would probably use std::string::rfind and std::string::substr combined with guaranteed std::string::npos wrap around to be succinct:
inline std::string last_line_of(std::string const& s)
{
return s.substr(s.rfind('\n') + 1);
}
If s.rfind('\n') doesn't find anything it returns std::string::npos. Because npos is the largest value of an unsigned type, std::string::npos + 1 == 0, and returning s.substr(0) is always safe.
If s.rfind('\n') does find something, you want the substring starting at the next character. Even when the newline is the last character, the start position equals s.size(), and s.substr(s.size()) is well defined: it returns an empty string.
NOTE: In C++17 this function benefits from guaranteed copy elision for the returned temporary, so no extra copy of the result is made.
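If C++17 is available and the caller only needs to look at the last line, the same rfind idea can return a std::string_view, which avoids copying the characters at all. This is a sketch under that assumption; the name last_line_view is hypothetical, and the returned view is only valid while the original string is alive and unmodified.

```cpp
#include <string_view>

// Hypothetical helper (requires C++17): returns a view of the text after
// the last '\n', or of the whole input if there is no newline.
std::string_view last_line_view(std::string_view s) {
    // rfind returns npos when nothing is found, and npos + 1 wraps to 0.
    return s.substr(s.rfind('\n') + 1);
}
```

A std::string converts implicitly to std::string_view, so last_line_view(text) works for an existing std::string named text.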

I thought of a way that reads the string backwards, storing what it reads until it encounters a '\n' character, then reverses the stored characters and returns them:
std::string get_last_line(const std::string &str)
{
    std::string last_line_reversed;
    // Walk from the last character towards the first; l is one past the index read,
    // so an empty string is handled and index 0 is not skipped.
    for (size_t l = str.length(); l > 0; --l)
    {
        char c = str[l - 1];
        if (c == '\n')
            break;
        last_line_reversed += c;
    }
    // Reverse the collected characters to restore the original order.
    return std::string(last_line_reversed.rbegin(), last_line_reversed.rend());
}
If the target string is big and has a lot of new lines, this function only touches the last line, so it can be efficient.

Performance optimization for std::string

When I did some performance test in my app I noticed a difference in the following code (Visual Studio 2010).
Slower version
while(heavyloop)
{
if(path+node+"/" == curNode)
{
do something
}
}
This will cause some extra mallocs for the resulting string to be generated.
In order to avoid these mallocs, I changed it in the following way:
std::string buffer;
buffer.reserve(500); // Big enough to hold all combinations without the need of malloc
while(heavyloop)
{
buffer = path;
buffer += node;
buffer += "/";
if(buffer == curNode)
{
do something
}
}
While the second version looks a bit more awkward than the first, it's still readable enough. What I was wondering, though, is whether this kind of optimization is an oversight on the part of the compiler, or whether it always has to be done manually. Since it only changes the order of allocations, I would expect that the compiler could figure it out on its own. On the other hand, certain conditions have to be met to really make it an optimization, and they may not necessarily be fulfilled; but if they are not, the code would at least perform as well as the first version. Are newer versions of Visual Studio better in this regard?
A more complete version which shows the difference (SSCCE):
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <string>
std::string gen_random(std::string &oString, const int len)
{
static const char alphanum[] =
"0123456789"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
"abcdefghijklmnopqrstuvwxyz";
oString = "";
for (int i = 0; i < len; ++i)
{
oString += alphanum[rand() % (sizeof(alphanum) - 1)];
}
return oString;
}
int main(int argc, char *argv[])
{
clock_t start = clock();
std::string s = "/";
size_t adds = 0;
size_t subs = 0;
size_t max_len = 0;
s.reserve(100000);
for(size_t i = 0; i < 1000000; i++)
{
std::string t1;
std::string t2;
if(rand() % 2)
{
// Slow version
//s += gen_random(t1, (rand() % 15)+3) + "/" + gen_random(t2, (rand() % 15)+3);
// Fast version
s += gen_random(t1, (rand() % 15)+3);
s += "/";
s += gen_random(t2, (rand() % 15)+3);
adds++;
}
else
{
subs++;
size_t pos = s.find_last_of("/", s.length()-1);
if(pos != std::string::npos)
s.resize(pos);
if(s.length() == 0)
s = "/";
}
if(max_len < s.length())
max_len = s.length();
}
std::cout << "Elapsed: " << clock() - start << std::endl;
std::cout << "Added: " << adds << std::endl;
std::cout << "Subtracted: " << subs << std::endl;
std::cout << "Max: " << max_len << std::endl;
return 0;
}
On my system I get about 1 second difference between the two (tested with gcc this time but there doesn't seem to be any notable difference to Visual Studio there):
Elapsed: 2669
Added: 500339
Subtracted: 499661
Max: 47197
Elapsed: 3417
Added: 500339
Subtracted: 499661
Max: 47367

Your slow version may be rewritten as
while(heavyloop)
{
std::string tempA = path + node;
std::string tempB = tempA + "/";
if(tempB == curNode)
{
do something
}
}
This is not an exact equivalent, but it makes the temporary objects more visible.
Note the two temporary objects, tempA and tempB. They are created because operator+ for std::string always produces a new std::string object. This is how std::string is designed. A compiler won't be able to optimize this code.
There is a technique in C++ called expression templates that addresses this issue, but again, it is done at the library level.
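If the goal is only the comparison, one alternative (not proposed in the thread, so treat it as an assumption about what the loop needs) is to compare curNode against the pieces directly, which builds no temporary string at all. The function name equals_path_node_slash is hypothetical.

```cpp
#include <string>

// Hypothetical helper: returns true if cur == path + node + "/" without
// constructing the concatenated string.
bool equals_path_node_slash(const std::string& cur,
                            const std::string& path,
                            const std::string& node) {
    // Lengths must agree first; this also guarantees cur is non-empty below.
    if (cur.size() != path.size() + node.size() + 1) return false;
    return cur.compare(0, path.size(), path) == 0 &&            // prefix is path
           cur.compare(path.size(), node.size(), node) == 0 &&  // then node
           cur.back() == '/';                                   // then the slash
}
```

The length check is cheap and rejects most mismatches before any characters are compared.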

For class types (like std::string) there is no requirement that the conventional relationship between operator + and operator += be honoured like you expect. There is certainly no requirement that a = a + b and a += b have the same net effect, since operator=(), operator+() and operator+=() can all potentially be implemented individually, and not work together in tandem.
As such, a compiler would be semantically incorrect if it replaced
if(path+node+"/" == curNode)
with
std::string buffer = path;
buffer += node;
buffer += "/";
if (buffer == curNode)
If there were some constraint in the standard, for example a fixed relationship between overloaded operator+() and overloaded operator+=(), then the two fragments of code would have the same net effect. However, there is no such constraint, so the compiler is not permitted to make such substitutions. Doing so could change the meaning of the code.
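A deliberately contrived sketch of that point: the hypothetical type Counter below has an operator+ that adds and an operator+= that subtracts. Nothing in the language forbids this, which is why a + b and a += b cannot be treated as interchangeable for class types in general.

```cpp
// Hypothetical type with an intentionally inconsistent operator pair.
struct Counter {
    int value;
};

// operator+ adds the two values and returns a new object.
Counter operator+(Counter a, const Counter& b) {
    a.value += b.value;
    return a;
}

// operator+= (perversely) subtracts, modifying the left operand in place.
Counter& operator+=(Counter& a, const Counter& b) {
    a.value -= b.value;
    return a;
}
```

Starting from a value of 5 and an operand of 2, a + b produces 7, while a += b leaves a at 3.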

path+node+"/" allocates a temporary string to compare with curNode; that is simply how C++ evaluates the expression.

C++ - Attempting to use string functions to reverse an input string

As part of a homework assignment I need to be able to take an input string and manipulate it several ways using a list of string functions. The first function takes a string and reverses it using a for loop. This is what I have:
#include <iostream>
#include <string>
namespace hw06
{
typedef std::string::size_type size_type;
//reverse function
std::string reverse( const std::string str );
}
// Program execution begins here.
int main()
{
std::string inputStr;
std::cout << "Enter a string: ";
std::getline( std::cin, inputStr );
std::cout << "Reversed: " << hw06::reverse( inputStr )
<< std::endl;
return 0;
}
//reverse function definition
std::string hw06::reverse( const std::string str )
{
std::string reverseStr = "";
//i starts as the last digit in the input. It outputs its current
//character to the return value "tempStr", then goes down the line
//adding whatever character it finds until it reaches position 0
for( size_type i = (str.size() - 1); (i >= 0); --i ){
reverseStr += str.at( i );
}
return reverseStr;
}
The program asks for input, then returns this error:
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::at
I'm really at a loss as to what I'm doing wrong here. The loop seems correct to me, so am I misunderstanding how to reference the function?

Unless you really want to write a loop, it's probably easier to just do something like:
std::string reverse(std::string const &input) {
return std::string(input.rbegin(), input.rend());
}

The problem is that your loop condition can never become false. The condition is i >= 0, but size_type is unsigned, so when i is 0 the decrement wraps around: 0 - 1 == 2^(sizeof(size_t) * 8) - 1, which is certainly out of the range of your string, and at() throws. Therefore, you need to pick something else as your termination condition. One option is i != std::string::npos, but that feels wrong. You're probably better off with something like:
for (size_type i = str.size(); i != 0; ) {
reverseStr += str.at(--i);
}
EDIT: I did some checking on i != std::string::npos. It should be well-defined and OK. However, it still seems like the Wrong Way To Do It.
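The wrap-around itself can be checked in isolation. A minimal sketch, with the hypothetical helper name decrement_wraps:

```cpp
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical helper: decrements an unsigned zero and reports whether the
// result is the largest representable std::size_t, as unsigned arithmetic
// requires.
bool decrement_wraps() {
    std::size_t i = 0;
    --i;  // well defined for unsigned types: wraps modulo 2^N
    return i == std::numeric_limits<std::size_t>::max();
}
```

std::string::npos is defined as the same all-ones value of std::string::size_type, which is why the i != std::string::npos condition works.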

As Andreas Grapentin said, the problem is that std::string::size() returns a size_t which is required by the standard to be an unsigned type. So it will always be >= 0 and when you hit 0 and decrement it, you will go to some really large, positive number.
Consider something like this:
std::string hw06::reverse(const std::string &str)
{
std::string reverseStr;
for(size_t i = str.size(); i != 0; i--)
reverseStr += str.at(i - 1);
return reverseStr;
}

I'm not keen on answering homework questions, but seeing some of the answers, I couldn't resist this:
std::string hw06::reverse(const std::string &str)
{ return std::string(str.rbegin(), str.rend()); }
Simple, clean and least wasteful if you can't do it in-place.

As other answers say, the problem is in the loop. I'll suggest using the following "goes to" operator :) It is really just i-- > 0: compare i with 0, then post-decrement it.
for(size_t i = str.size(); i --> 0;)
{
}
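Filled in for the reverse function, that loop might look like the sketch below. The name reverse_countdown is hypothetical, and the spacing i-- > 0 is used instead of i --> 0 to make the two operators visible.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: reverses `str` with a count-down loop that is safe
// for unsigned indices.
std::string reverse_countdown(const std::string& str) {
    std::string reversed;
    reversed.reserve(str.size());
    // The condition tests i before decrementing it, so the body sees
    // size-1, size-2, ..., 0 and the loop stops once i was 0 at the test.
    for (std::size_t i = str.size(); i-- > 0;) {
        reversed += str[i];
    }
    return reversed;
}
```

An empty input never enters the loop body, so no out-of-range access can occur.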

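Use i-- in the loop condition (for example i-- > 0) rather than --i in the increment clause. With an unsigned size_type, decrementing past 0 wraps around to a huge value, and that is what causes the loop problem.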

istringstream invalid error beginner

I have this piece of code :
if(flag == 0)
{
// converting string value to integer
istringstream(temp) >> value ;
value = (int) value ; // value is a
}
I am not sure if I am using istringstream correctly. I want to convert the string "temp" to an integer stored in "value".
Compiler error : Invalid use of istringstream.
How should I fix it?
After trying the fix from the first answer, I get the following error:
stoi was not declared in this scope
Is there a way to work around it? The code I am using right now is:
int i = 0 ;
while(temp[i] != '\0')
{
if(temp[i] == '.')
{
flag = 1;
double value = stod(temp);
}
i++ ;
}
if(flag == 0)
{
// converting string value to integer
int value = stoi(temp) ;
}

Unless you really need to do otherwise, consider just using something like:
int value = std::stoi(temp);
If you must use a stringstream, you typically want to use it wrapped in a lexical_cast function:
int value = lexical_cast<int>(temp);
The code for that looks something like:
template <class T, class U>
T lexical_cast(U const &input) {
std::istringstream buffer(input);
T result;
buffer >> result;
return result;
}
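As for how to imitate stoi if you don't have one, I'd use strtol as the starting point:
int stoi(const std::string &s, size_t *pos = NULL, int base = 10) {
    char *end = NULL;
    long result = strtol(s.c_str(), &end, base);
    if (pos) *pos = end - s.c_str(); // number of characters consumed
    return static_cast<int>(result);
}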
Note that this is pretty much a quick and dirty imitation that doesn't really fulfill the requirements of stoi correctly at all. For example, it should really throw an exception if the input couldn't be converted at all (e.g., passing letters in base 10).
For double you can implement stod about the same way, but using strtod instead.
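A slightly more careful imitation that does report failures might look like the sketch below. The name checked_stoi is hypothetical, and the choice of exception types simply mirrors what std::stoi is documented to throw; it is still not a drop-in replacement.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: strtol-based conversion that throws instead of
// silently returning 0 or a clamped value.
int checked_stoi(const std::string& s, int base = 10) {
    char* end = 0;
    errno = 0;
    const long result = std::strtol(s.c_str(), &end, base);
    if (end == s.c_str()) {
        // strtol consumed nothing, e.g. the input was "abc".
        throw std::invalid_argument("checked_stoi: no conversion");
    }
    if (errno == ERANGE || result < INT_MIN || result > INT_MAX) {
        throw std::out_of_range("checked_stoi: value out of range for int");
    }
    return static_cast<int>(result);
}
```

Like std::stoi, this accepts trailing characters after the number, so "-7xyz" converts to -7.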

First of all, istringstream is not an operator. It is an input stream class to operate on strings.
You may do something like the following:
istringstream iss(temp); // temp is the string to convert
iss >> value;
cout << "value = " << value;
You can find a simple example of istringstream usage here: http://www.cplusplus.com/reference/sstream/istringstream/istringstream/
