Unexplained out_of_range in string::substr - c++

I have been getting a really annoying error about an std::out_of_range when calling substr. The exact error is
terminate called after throwing an
instance of 'std::out_of_range'
what(): basic_string::substr
I'm absolutely sure that tmp_request has a length greater than 1. No matter what I pass to substr (1, 2, or bodypos), it always throws that error. I'm using g++ on Unix.
The only interesting thing I can add is that the string contains multiple "\r\n" sequences, including one "\r\n\r\n".
In one cpp file:
std::string tmp_request, outRequest;
tmp_request = SS_Twitter->readData();
outRequest = SS_Twitter->parse(tmp_request);
In another:
std::string parse(const std::string &request)
{
std::map<std::string,std::string> keyval;
std::string outRequest;
if(request[0]=='P')
{
if(request.find("register")!=std::string::npos)
{ //we have a register request
size_t bodypos = request.find("username");
if(bodypos==std::string::npos)
{
HttpError(400,"Malformed HTTP POST request. Could not find key username.",request);
}
else
{
std::string body = request.substr(bodypos);
StringExplode(body,"&", "=",keyval);
outRequest = "doing stuff";
}
}
Update:
std::string request2("P\r\nregister\r\nusername=hello\r\n\r\n");
std::string body = request2.substr(4);
That throws the same error. Now I know this is perfectly valid and correct code, but it's still throwing the error.

I modified your sample slightly to decrease amount of indentation used.
There are 5 "test cases" and none causes any problem. Could you please provide a sample request that reproduces the problem you're having?
EDIT: Forgot to mention: if this sample as it is (with the commented-out bits) doesn't produce that error, your best bet is that you have a bug in your StringExplode function. You could post its source to get more helpful advice.
EDIT2:
In your StringExplode, change results[tmpKey] = tmpKey.substr(found+1); to results[tmpKey] = tmpResult[i].substr(found+1);. Also change int found to size_t found and remove every if (found > 0). That fixes your mysterious out_of_range: you were calling substr on the wrong string. Just in case, here's the code with the fix:
void StringExplode(std::string str, std::string objseparator, std::string keyseperator,
std::map <std::string, std::string> &results)
{
size_t found;
std::vector<std::string> tmpResult;
found = str.find_first_of(objseparator);
while(found != std::string::npos)
{
tmpResult.push_back(str.substr(0,found));
str = str.substr(found+1);
found = str.find_first_of(objseparator);
}
if(str.length() > 0)
{
tmpResult.push_back(str);
}
for(size_t i = 0; i < tmpResult.size(); i++)
{
found = tmpResult[i].find_first_of(keyseperator);
while(found != std::string::npos)
{
std::string tmpKey = tmpResult[i].substr(0, found);
results[tmpKey] = tmpResult[i].substr(found+1);
found = tmpResult[i].find_first_of(keyseperator, found + 1 + results[tmpKey].size()); // skip the separator too, so an empty value cannot loop forever
}
}
}
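To see why the original line always threw, note that tmpKey holds exactly found characters, so tmpKey.substr(found+1) asks for a position past the end of the string. The following is a minimal sketch of that rule; the helper name value_after and the "<out_of_range>" marker are made up for illustration and are not part of the original code.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the text after position `found` in `s`,
// or the marker "<out_of_range>" if substr rejects the position.
std::string value_after(const std::string& s, std::size_t found)
{
    try {
        // substr(pos) throws std::out_of_range only when pos > s.size();
        // pos == s.size() is allowed and yields an empty string.
        return s.substr(found + 1);
    } catch (const std::out_of_range&) {
        return "<out_of_range>";
    }
}
```

Calling it on the whole "username=hello" entry with found = 8 gives the value, while calling it on the key alone ("username", also with found = 8) hits the exception path.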
Initial test code:
#include <iostream>
#include <map>
#include <string>
std::string parse(const std::string &request)
{
std::map<std::string,std::string> keyval;
std::string outRequest;
if(request[0] != 'P')
return outRequest;
if(request.find("register") == std::string::npos)
return outRequest;
//we have a register request
size_t bodypos = request.find("username");
if(bodypos==std::string::npos)
{
// HttpError(400,"Malformed HTTP POST request. Could not find key username.",request);
// you said HttpError returns, so here's a return
return outRequest;
}
std::string body = request.substr(bodypos);
// StringExplode(body,"&", "=",keyval);
outRequest = "doing stuff";
return outRequest;
}
int main()
{
std::string request("P\r\nregister\r\nusername=hello\r\n\r\n");
std::cout << "[" << parse(request) << "]\n";
request = "Pregisternusername=hello\r\n\r\n";
std::cout << "[" << parse(request) << "]\n";
request = "Pregisternusername=hello";
std::cout << "[" << parse(request) << "]\n";
request = "registernusername=hello";
std::cout << "[" << parse(request) << "]\n";
request = "";
std::cout << "[" << parse(request) << "]\n";
return 0;
}
This outputs, predictably:
[doing stuff]
[doing stuff]
[doing stuff]
[]
[]

Are you sure that it's failing on that substr and not on a substr call within the HttpError or StringExplode functions? If you haven't already, you should run this through a debugger so that you can see exactly where it's throwing the exception. Alternatively, you could add a:
std::cout << "calling substr" << std::endl;
line immediately before you call substr, and a similar line immediately afterwards, so that it would look like:
std::cout << "calling substr" << std::endl;
std::string body = request.substr(bodypos);
std::cout << "finished calling substr" << std::endl;
StringExplode(body,"&", "=",keyval);
outRequest = "doing stuff";
If that substr really is throwing the exception, then you'll know because the program will print "calling substr" without a matching "finished calling substr". If it prints the pair of debug messages, though, or none at all, then something else is throwing the exception.
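If a debugger is not at hand, another option is to catch the exception right at the suspect call and log the arguments. This is only a sketch: checked_substr and its label parameter are hypothetical names, not something from the question's code.

```cpp
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: performs s.substr(pos) and, on failure, reports
// which call site failed. `label` is an arbitrary tag chosen by the caller.
bool checked_substr(const std::string& s, std::size_t pos,
                    const char* label, std::string& out)
{
    try {
        out = s.substr(pos);
        return true;
    } catch (const std::out_of_range& e) {
        std::cerr << label << ": " << e.what()
                  << " (pos=" << pos << ", size=" << s.size() << ")\n";
        return false;
    }
}
```

Wrapping each substr call in parse, HttpError, and StringExplode with a different label would show which one is actually throwing.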

One fairly obvious thing wrong with your code:
int k = read(ns, buf, sizeof(buf)-1);
buf[k] = '\0';
You are not checking that read() succeeded. It returns -1 on failure, in which case buf[k] writes before the start of the buffer and can cause all sorts of memory corruption problems.
Also:
char * buf2 = const_cast<char *>(reply.c_str());
write(ns,buf2,sizeof(buf2));
You are taking the size of the pointer - you want the length of the output string:
write(ns, buf2, reply.size() );
And you should once again test that write succeeded and that it wrote as many bytes as you requested, though this shouldn't directly cause the substr() error.
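The checks described above could look roughly like this. It is a sketch that assumes a POSIX system and a blocking descriptor; write_all and read_cstr are illustrative names, not functions from the original program.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <unistd.h>

// Sketch: write the whole string to fd, retrying on short writes and EINTR.
// Returns true only if every byte was written.
bool write_all(int fd, const std::string& data)
{
    const char* p = data.data();
    std::size_t left = data.size();
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, try again
            return false;                  // real error
        }
        if (n == 0) return false;          // no progress, give up
        p += n;
        left -= static_cast<std::size_t>(n);
    }
    return true;
}

// Sketch: read up to cap-1 bytes and NUL-terminate only on success.
// Returns the byte count, or -1 on error (buf is not indexed in that case).
ssize_t read_cstr(int fd, char* buf, std::size_t cap)
{
    if (cap == 0) return -1;
    ssize_t k = read(fd, buf, cap - 1);
    if (k < 0) return -1;   // never use a negative k as an index
    buf[k] = '\0';
    return k;
}
```

Note that write_all uses data.size() for the length, never sizeof on a pointer.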

Looks like you need an else after
if(bodypos==std::string::npos)
{
HttpError(...);
}
otherwise you are calling substr with bodypos = npos

You might consider using the (unsigned) type std::string::size_type instead of int.
Why are you casting the result of find to an int here:
int(request.find("register"))!=std::string::npos
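Keeping the result of find() in its own unsigned type makes the comparison with npos exact and needs no cast. A small sketch, where contains is an illustrative helper name:

```cpp
#include <string>

// Sketch: store the result of find() in std::string::size_type so that the
// comparison against npos is done between two values of the same type.
bool contains(const std::string& haystack, const std::string& needle)
{
    const std::string::size_type pos = haystack.find(needle);
    return pos != std::string::npos;
}
```

With an int, a later test such as if (found > 0) silently changes meaning, because npos typically turns into a negative number once it is converted.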


property_tree get array value as String

I want to parse a JSON file/string with Boost's property_tree, but instead of having sub-trees parsed into an array, I would like them to stay as a string for use in another existing function that only deals with JSON strings.
I hope the following example is sufficient:
example.json
{
"type": "myType",
"colors": {
"color0":"red",
"color1":"green",
"color2":"blue"
}
}
main.cpp
std::ifstream file("example.json"); // open the file; a stringstream would hold only the file name
ptree pt;
read_json(file, pt);
std::string sType = pt.get("type", "");
std::string sColors = pt.get<std::string>("colors");
std::cout << "sType: " << sType << std::endl; // sType: myType
std::cout << "sColors: " << sColors << std::endl; // sColors: {"color0":"red", "color1":"green", "color2":"blue"}
I've tried several functions. For example, pt.get_child("colors") just returns another ptree, and pt.get_value<std::string>("colors") returns an empty string ("").
The desired output would look like this:
sColors: {"color0":"red", "color1":"green", "color2":"blue"}
or
sColors: {\"color0\":\"red\", \"color1\":\"green\", \"color2\":\"blue\"}
Is there a way to receive the desired output for sColors?
I found a possible solution faster than anticipated. The following code gives a satisfactory answer:
std::stringstream os;
write_json(os, pt.get_child("colors"), false);
std::string sColors = os.str();
std::cout << "sColors: " << sColors << std::endl;
If there is a more elegant solution feel free to post it as well!

C++: Separating a char* with '\t' delimiter

I've been fighting this problem for a while now, and can't seem to find a simple solution that doesn't involve parsing a char * by hand. I need to split my char* variable by '\t', and I've tried the following ways:
Method 1:
char *splitentry;
std::string ss;
splitentry = strtok(read_msg_.data(), "\\t");
while(splitentry != NULL)
{
std::cout << splitentry << std::endl;
splitentry = strtok(NULL, "\\t");
}
Using the input '\tthis\tis\ta\ttest'
results in this output:
his
is
a
es
Method 2:
std::string s(read_msg_.data());
boost::algorithm::split(strs, s, boost::is_any_of("\\t"));
for (int i = 0; i < strs.size(); i++)
std::cout << strs.at(i) << std::endl;
Which creates an identical output.
I've tried using boost::split_regex and used "\\t" as my regex value, but nothing gets split. Will I have to split it on my own, or am I going about this incorrectly?
I would try to make things a little simpler by sticking to std:: functions. (p.s. you never use this: std::string ss;)
Why not do something like this?
Method 1: std::istringstream
std::istringstream ss(read_msg_.data());
std::string line;
while( std::getline(ss,line,ss.widen('\t')) )
std::cout << line << std::endl;
Method 2: std::string::substr (my preferred method as it is lighter)
std::string data(read_msg_.data());
std::size_t SPLITSTART(0); // signifies the start of the cell
std::size_t SPLITEND(0); // signifies the end of the cell
while( SPLITEND != std::string::npos ) {
SPLITEND = data.find('\t',SPLITSTART);
// SPLITEND-SPLITSTART signifies the size of the string
std::cout << data.substr(SPLITSTART,SPLITEND-SPLITSTART) << std::endl;
SPLITSTART = SPLITEND+1;
}
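One likely reason both attempts in the question drop the letter t: the literal "\\t" is two characters, a backslash and a t, and both strtok and boost::is_any_of treat every character of the delimiter string as a separate delimiter. Assuming the data really contains tab characters, splitting on '\t' could be sketched as follows; split_on is an illustrative name, not part of any library.

```cpp
#include <string>
#include <vector>

// Sketch: split `data` on a single delimiter character, keeping empty fields.
std::vector<std::string> split_on(const std::string& data, char delim)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type end = data.find(delim, start);
        if (end == std::string::npos) {
            fields.push_back(data.substr(start));  // last field
            break;
        }
        fields.push_back(data.substr(start, end - start));
        start = end + 1;
    }
    return fields;
}
```

Unlike strtok, this keeps the empty field produced by a leading tab, so the caller can decide whether to skip it.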

Parse JSON array using casablanca

I am trying to read from a JSON response in Casablanca. The sent data looks like this:
{
"devices":[
{"id":"id1",
"type":"type1"},
{"id":"id2",
"type":"type2"}
]
}
Does anyone know how to do this? Casablanca tutorials only seem to care about creating such arrays and not about reading from them.
Let's assume you got your json as an http response:
web::json::value json;
web::http::http_request request;
//fill the request properly, then send it:
client
.request(request)
.then([&json](web::http::http_response response)
{
json = response.extract_json().get();
})
.wait();
Note that no error checking is done here, so let's assume everything works fine (if not, see the Casablanca documentation and examples).
The returned json can then be read via the at(utility::string_t) function. In your case it is an array (you either know that or check it via is_array()):
auto array = json.at(U("devices")).as_array();
for(size_t i=0; i<array.size(); ++i)
{
auto id = array[i].at(U("id")).as_string();
auto type = array[i].at(U("type")).as_string();
}
With this you get the entries of the json response stored in string variables.
In practice, you might also want to check whether the response has the corresponding fields, e.g. via has_field(U("id")), and if so, whether the entries are non-null via is_null(). Otherwise, the as_string() function throws an exception.
The following is a recursive function I wrote for parsing JSON values in cpprestsdk. If you would like additional info or elaboration, feel free to ask.
std::string DisplayJSONValue(web::json::value v)
{
std::stringstream ss;
try
{
if (!v.is_null())
{
if(v.is_object())
{
// Loop over each element in the object
for (auto iter = v.as_object().cbegin(); iter != v.as_object().cend(); ++iter)
{
// It is necessary to make sure that you get the value as const reference
// in order to avoid copying the whole JSON value recursively (too expensive for nested objects)
const utility::string_t &str = iter->first;
const web::json::value &value = iter->second;
if (value.is_object() || value.is_array())
{
ss << "Parent: " << str << std::endl;
ss << DisplayJSONValue(value);
ss << "End of Parent: " << str << std::endl;
}
else
{
ss << "str: " << str << ", Value: " << value.serialize() << std::endl;
}
}
}
else if(v.is_array())
{
// Loop over each element in the array
for (size_t index = 0; index < v.as_array().size(); ++index)
{
const web::json::value &value = v.as_array().at(index);
ss << "Array: " << index << std::endl;
ss << DisplayJSONValue(value);
}
}
else
{
ss << "Value: " << v.serialize() << std::endl;
}
}
}
catch (const std::exception& e)
{
std::cout << e.what() << std::endl;
ss << "Value: " << v.serialize() << std::endl;
}
return ss.str();
}

Getting C-string from local copy of returned std::string

I am trying to debug a problem related to the scope of the character array contained within a std::string. I have posted the relevant code sample below,
#include <cstdlib>   // getenv
#include <cstring>   // strcmp
#include <iostream>
#include <string>
const char* objtype;
namespace A
{
std::string get_objtype()
{
std::string result;
std::string envstr( ::getenv("CONFIG_STR") );
std::size_t pos1 = 0, pos2 = 0, pos3 = 0;
pos1 = envstr.find("objtype"); // find() matches the whole word; find_first_of() matches any single character of it
if (pos1 != std::string::npos)
pos2 = envstr.find_first_of("=", pos1+7);
if (pos2 != std::string::npos)
{
pos3 = envstr.find_first_of(";", pos2+1);
if (pos3 != std::string::npos)
result = envstr.substr(pos2+1, pos3 - pos2 - 1);
}
const char* result_cstr = result.c_str();
std::cerr << "get_objtype()" << reinterpret_cast<long>((void*)result_cstr) << std::endl;
return result;
}
void set_objtype()
{
objtype = get_objtype().c_str();
std::cerr << "Objtype " << objtype << std::endl;
std::cerr << "main()" << reinterpret_cast<long>((void*)objtype) << std::endl;
}
}
int main()
{
using namespace A;
std::cerr << "main()" << reinterpret_cast<long>((void*)objtype) << std::endl;
set_objtype();
if (::strcmp(objtype, "AAAA") == 0)
std::cerr << "Do work for objtype == AAAA " << std::endl;
else
std::cerr << "Do work for objtype != AAAA" << std::endl;
}
This was compiled and executed on MacOS 12.3 with g++ 4.2.1. The output from running this is as follows,
$ g++ -g -DNDEBUG -o A.exe A.cpp
$ CONFIG_STR="objtype=AAAA;objid=21" ./A.exe
main()0
get_objtype()140210713147944
Objtype AAAA
main()140210713147944
Do work for objtype == AAAA
$
My questions are these:
The pointer values printed from main() and get_objtype() are the same. Is this due to RVO?
The last line of output shows that the global pointer to the C-string is still OK even after the enclosing std::string has gone out of scope. So, when does the returned value go out of scope, and when is the string's array deleted? Any help from the community is appreciated. Thanks.
The pointer value won't change, but the memory it points to may no longer be part of a string.
objtype is invalid on the line right after you set it in set_objtype(), because the result of get_objtype() isn't saved anywhere: it is a temporary std::string that is destroyed at the end of that statement, and its buffer goes with it.
It may work, but it's accessing invalid memory, so it is invalid code and if you rely on things like this, you will eventually run into big problems.
You should look at the disassembly using objdump to check whether it's RVO.
But from experiments I did (making result global and making copies of it), it looks like the buffer behind c_str is reference counted.
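A safe alternative is to keep the std::string itself alive for as long as the pointer is needed. The sketch below uses made-up names (make_objtype stands in for get_objtype(), and objtype_storage is a hypothetical global) to show the idea.

```cpp
#include <cstring>
#include <string>

// Stand-in for get_objtype() from the question; returns a temporary string.
std::string make_objtype()
{
    return std::string("AAAA");
}

// Keep the std::string itself alive; the pointer from c_str() stays valid
// as long as this object exists and is not modified.
std::string objtype_storage;

const char* set_objtype_safely()
{
    objtype_storage = make_objtype();  // move the temporary into storage
    return objtype_storage.c_str();    // points into objtype_storage, not into a temporary
}
```

Storing the string and calling c_str() only where a const char* is required avoids depending on what happens to freed memory.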

c++ map<string,string>::find seems to return garbage iterator

map<string,string>::find seems to be returning a garbage iterator, since I can access neither my_it->first nor my_it->second (NB: my_it != my_map.end() is verified). VC2010 reports a debug error, and looking deeper reveals
my_it is (Bad Ptr, Bad Ptr).
The 'offending' map is a class attribute, _match, shown below in context:
class NicePCREMatch
{
private:
map<string, string, less<string> > _match;
public:
void addGroup(const string& group_name, const string& value);
string group(const string& group_name);
};
Here is the code that returns elements by key (the commented-out code works fine):
string NicePCREMatch::group(const string& group_name)
{
/*for (map<string, string, less<string> >::iterator j = _match.begin(); j != _match.end(); j++)
{
if(!strcmp(j->first.c_str(), group_name.c_str()))
{
return j->second;
}
}
throw runtime_error("runtime_error: no such group");*/
map<string, string, less<string> >::iterator i = _match.find(group_name);
if (i == _match.end())
{
throw runtime_error("runtime_error: no such group");
}
return i->second;
}
And here is the code that inserts new elements in the map:
void NicePCREMatch::addGroup(const string& group_name, const string& value)
{
_match.insert(pair<string, string>(group_name, value));
}
Another class uses NicePCREMatch as follows:
template<class Match_t>
vector<Match_t> NicePCRE<Match_t>::match(const string& buf)
{
[snip]
Match_t m;
[snip]
m.addGroup(std::string((const char *)tabptr + 2, name_entry_size - 3), \
buf.substr(ovector[2*n], ovector[2*n+1] - ovector[2*n]));
[snip]
addMatch(m);
[snip]
return _matches;
}
Where,
template<class Match_t>
void NicePCRE<Match_t>::addMatch(const Match_t& m)
{
_matches.push_back(m);
}
Finally, client code uses NicePCRE class as follows:
void test_NicePCRE_email_match(void)
{
NicePCRE<> npcre;
npcre.compile("(?P<username>[a-zA-Z]+?)(?:%40|#)(?P<domain>[a-zA-Z]+\\.[a-zA-Z]{2,6})");
vector<NicePCREMatch> matches = npcre.match("toto#yahoo.com");
assert(!matches.empty());
assert(!strcmp(matches.begin()->group("username").c_str(), "toto"));
cout << matches.begin()->group("domain").c_str() << endl;
assert(!strcmp(matches.begin()->group("domain").c_str(), "yahoo.com"));
}
BTW, this is pretty much my main (the oddest TDD ever :) ):
int main()
{
int test_cnt = 0;
cout << "Running test #" << test_cnt << " .." << endl;
test_NicePCRE_email_match();
cout << "OK." << endl << endl;
test_cnt++;
SleepEx(5000, 1);
return 0;
}
What am I doing wrong here?
EDIT:
The following modification (compare with the version above) solved my problem. Viz,
void NicePCREMatch::addGroup(const string& group_name, const string& value)
{
_match.insert(pair<string, string>(group_name.c_str(), value.c_str()));
}
Client code (slightly modified) now looks like this:
void test_NicePCRE_email_match(void)
{
NicePCRE<> npcre;
npcre.compile("(?P<username>[a-zA-Z]+?)(?:%40|#)(?P<domain>[a-zA-Z]+\\.[a-zA-Z]{2,6})");
vector<NicePCREMatch> matches = npcre.match("toto#yahoo.com");
assert(!matches.empty());
try
{
assert(!strcmp(matches.begin()->group("username").c_str(), "toto"));
assert(!strcmp(matches.begin()->group("domain").c_str(), "yahoo.com"));
cout << "username = " << matches.begin()->group("username") << endl;
cout << "domain = " << matches.begin()->group("domain") << endl;
}
catch (const runtime_error& e)
{
cout << "Caught: " << e.what() << endl;
assert(0x0);
}
}
This is quite bizarre. Can someone please explain? However, I consider my problem solved already.
Thanks, everyone.
Your issue is here
if (i == _match.end())
{
throw runtime_error("runtime_error: no such group");
}
return i->second;
Your find failed for some reason. I can't say why because I don't have the full code. After the failure you throw an exception, but nothing outside catches it. Please add a try/catch at the point where you call group() and implement the logic for the case where the match is not found.
I tried your sample snippets (plus some changes to get them to compile), and it looks like Visual Studio continues with the next line in the function even after a throw statement. I don't know the theory behind it. I was a bit surprised to see such behavior.
[To make sure that your class structure is not causing the problem, I tried a simple global function, and it showed the same behavior. If somebody can explain this, please feel free.]
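The try/catch suggested above might be sketched like this. The free function group is a minimal stand-in for NicePCREMatch::group(), and group_or with its fallback parameter is a hypothetical caller-side helper.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Minimal stand-in for NicePCREMatch::group(): throws if the key is missing.
std::string group(const std::map<std::string, std::string>& m,
                  const std::string& name)
{
    std::map<std::string, std::string>::const_iterator it = m.find(name);
    if (it == m.end())
        throw std::runtime_error("runtime_error: no such group");
    return it->second;  // only reached when the iterator is valid
}

// Caller-side handling: fall back to a default when the group is absent.
std::string group_or(const std::map<std::string, std::string>& m,
                     const std::string& name, const std::string& fallback)
{
    try {
        return group(m, name);
    } catch (const std::runtime_error&) {
        return fallback;
    }
}
```

The iterator is dereferenced only on the path where it was compared against end() and found to be different.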
This might be caused by one of three things: you modify the map in some way after the call to find, you have memory corruption somewhere in your program, or the debugger is simply not showing the correct values for the iterator.
Try using debug output. If the code crashes when you try to output the values, then the iterator is probably really broken.
Also make sure you do not modify the map after the call to find. If you do, this may invalidate the iterator, so you need to move the find call to immediately before the use of the iterator.
If neither of the above helps, you probably have memory corruption somewhere and you need to find it. Maybe use valgrind for that. Please note this should be your last resort, only once the two other options have been ruled out.