This question is for testing purposes only.
I'm trying to store function pointers that take different numbers of parameters (and whose parameters can have different types).
Basically, I wrote the following snippet in C++11:
#include <functional>
#include <iostream>
void fct(int nb, char c, int nb2, int nb3) {
std::cout << nb << c << nb2 << nb3 << std::endl;
}
template <typename... Args>
void call(void (*f)(), Args... args) {
(reinterpret_cast<void(*)(Args...)>(f))(args...);
}
int main(void) {
call(reinterpret_cast<void(*)()>(&fct), 42, 'c', 19, 94);
}
I convert a void(*)(int, char, int, int) function pointer into a generic void(*)() function pointer. Then, by using variadic template parameters, I simply recast the function pointer to its original type and call the function with some parameters.
This code compiles and runs. Most of the time, it displays the correct values. However, it produces Valgrind errors on Mac OS (about uninitialized values), and it sometimes prints unexpected garbage.
==52187== Conditional jump or move depends on uninitialised value(s)
==52187== at 0x1004E4C3F: _platform_memchr$VARIANT$Haswell (in /usr/lib/system/libsystem_platform.dylib)
==52187== by 0x1002D8B96: __sfvwrite (in /usr/lib/system/libsystem_c.dylib)
==52187== by 0x1002D90AA: fwrite (in /usr/lib/system/libsystem_c.dylib)
==52187== by 0x100025D29: std::__1::__stdoutbuf<char>::overflow(int) (in /usr/lib/libc++.1.dylib)
==52187== by 0x10001B91C: std::__1::basic_streambuf<char, std::__1::char_traits<char> >::xsputn(char const*, long) (in /usr/lib/libc++.1.dylib)
==52187== by 0x10003BDB0: std::__1::ostreambuf_iterator<char, std::__1::char_traits<char> > std::__1::__pad_and_output<char, std::__1::char_traits<char> >(std::__1::ostreambuf_iterator<char, std::__1::char_traits<char> >, char const*, char const*, char const*, std::__1::ios_base&, char) (in /usr/lib/libc++.1.dylib)
==52187== by 0x10003B9A7: std::__1::num_put<char, std::__1::ostreambuf_iterator<char, std::__1::char_traits<char> > >::do_put(std::__1::ostreambuf_iterator<char, std::__1::char_traits<char> >, std::__1::ios_base&, char, long) const (in /usr/lib/libc++.1.dylib)
==52187== by 0x1000217A4: std::__1::basic_ostream<char, std::__1::char_traits<char> >::operator<<(int) (in /usr/lib/libc++.1.dylib)
==52187== by 0x1000011E8: fct(int, char, int, int) (in ./a.out)
==52187== by 0x1000013C2: void call<int, char, int, int>(void (*)(), int, char, int, int) (in ./a.out)
==52187== by 0x100001257: main (in ./a.out)
I find this quite curious because, when I call the function, I have recast the function pointer to its original type. I thought it was similar to casting a data pointer to void* and then casting it back to the original type.
What is wrong with my code? Can't we cast a function pointer to void(*)() and then cast it back to the original function pointer type?
If not, are there other ways to achieve this? I'm not interested in std::bind, which does not do what I want.
Going out on a limb and guessing what you did to get it to fail...
#include <functional>
#include <iostream>
#include <string>
void fct(int nb, char c, int nb2, std::string nb3) {
std::cout << nb << c << nb2 << nb3 << std::endl;
}
template <typename... Args>
void call(void (*f)(), Args... args) {
(reinterpret_cast<void(*)(Args...)>(f))(args...);
}
int main(void) {
call(reinterpret_cast<void(*)()>(&fct), 42, 'c', 19, "foobar");
}
This will fail because "foobar" never gets converted to std::string. How could the compiler know to do that when the argument only goes through Args...? Args is deduced from the call site, so the pointer is cast to void(*)(int, char, int, const char*), which is not the type of fct.
I'm not sure exactly how a caller passes a std::string by value (a string reference would be passed as a pointer), but I suspect it is more than a single pointer to char. When the callee receives that pointer to char where it expects an entire std::string object, it freaks out.
I think if you change to
void fct(int nb, char c, int nb2, const char* nb3)
or
call(reinterpret_cast<void(*)()>(&fct), 42, 'c', 19, std::string("foobar"));
then it might work.
You said you're also interested in alternative implementations. Personally, I wouldn't implement things this way even if it worked perfectly; function pointers and reinterpret_casts are both things I try to avoid. I haven't tested this code, but my thought would be:
#include <functional>
#include <iostream>
#include <boost/any.hpp>
template <typename... Args>
void call(boost::any clbl, Args... args) {
auto f = boost::any_cast<std::function<void(Args...)>>(clbl);
f(args...);
}
int main(void) {
std::function<void(int, char, int, int)> func = fct;
call(boost::any(func), 42, 'c', 19, 94);
}
Edit: this code, combined with your definition of fct, works correctly, and runs clean under valgrind on Fedora, compiled with clang35.
I started playing with the C++11 standard and its built-in threading support. From what I gather, when the value of a future is retrieved, it is moved out, transferring ownership away from the original object (as the old auto_ptr did on assignment). I tested this by printing the address of the char array inside a std::string object in the thread, and printing it again after receiving the string back in main. However, the addresses are different. I would appreciate it if someone could tell me why they differ in this simple code, and what the code would have to look like for them to be equal:
#include <cstdlib>
#include <iostream>
#include <vector>
#include <algorithm>
#include <chrono>
#include <string>
#include <thread>
#include <future>
using namespace std;
void thrfut(promise<string>&& promReceived)
{
string strObj("Hello from future");
cout << "Address of char array inside string inside of thread " << (void*)strObj.data() << endl;
promReceived.set_value(strObj);
}
int main(int argc, char** argv)
{
promise<string> promiseOfText;
future<string> futureText = promiseOfText.get_future(); // has to be before creating thread if the promise is passed rvalue reference, should be moved on calling get or not ?
thread threadHandlingPromise(&thrfut, std::move(promiseOfText));
string stringReceived = futureText.get();
cout << "Received from promise through thread: " << stringReceived << endl;
cout << "Address of of char array inside string received from promise in main " << (void*)stringReceived.data() << endl;
threadHandlingPromise.join();
return 0;
}
Here is a sample output
Address of char array inside string inside of thread 0x10ebc9be1
Received from promise through thread: Hello from future
Address of of char array inside string received from promise in main 0x7fff510f68c9
FYI: OS X 10.9.1 with Xcode 5 clang++ in NetBeans 8.0. Other people have run the code on Ubuntu and Windows and got the same address in both places.
EDIT (cf. comments in answers)
I tried this:
struct MYC
{ MYC() = default;
~MYC() { delete _pInt; };
MYC(const MYC & myc) { puts("MYC copy"); _pInt = nullptr; if(myc._pInt != nullptr) { _pInt = new int{*myc._pInt}; } }
MYC(MYC && myc) { puts("MYC move"); delete _pInt; _pInt = myc._pInt; myc._pInt = nullptr; }
void setMe(int value) { delete _pInt; _pInt = new int{value} ; }
int * _pInt = nullptr;
};
void thrfut(promise<MYC>&& promReceived)
{
MYC obj;
obj.setMe(5);
cout << "Address of int inside MYC inside thread " << (void*)obj._pInt << endl;
promReceived.set_value(std::move(obj));
}
int main(int argc, char** argv)
{
promise<MYC> promiseOfMYC;
future<MYC> futureMYC = promiseOfMYC.get_future();
thread threadHandlingPromise(&thrfut, std::move(promiseOfMYC));
auto mycReceived = futureMYC.get();
cout << "Address of int inside MYC received from promise in main " << (void*)mycReceived._pInt << endl;
cout << "Value of int inside MYC received from promise in main " << *(mycReceived._pInt) << endl;
threadHandlingPromise.join();
return 0;
}
and got:
Address of int inside MYC inside thread 0x7fd1b9c00110
MYC move
MYC move
Address of int inside MYC received from promise in main 0x7fd1b9c00110
Value of int inside MYC received from promise in main 5
Which confirms the move dynamics for the non-string classes.
Edit
I finally remembered that GNU libstdc++'s std::string employs reference counting. Perhaps this explains something (though not everything)?
Is std::string refcounted in GCC 4.x / C++11?
Original answer
(1) It is indeed system dependent
$ g++48 promiseStr.cpp -o promiseStr -Wall -Wextra -std=c++0x -O0 -g3 -pthread && echo OK
OK
$ ./promiseStr
Address of char array inside string inside of thread 0x7f4b400008d8
Received from promise through thread: Hello from future
Address of of char array inside string received from promise in main 0x7f4b400008d8
$ lsb_release -a
LSB Version: (snip)
Distributor ID: Ubuntu
Description: Ubuntu 12.04.2 LTS
Release: 12.04
Codename: precise
$ g++48 --version
g++48 (GCC) 4.8.1
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
(2) future<T>::get() returns T, not T&&, so stringReceived might be copy-constructed.
http://www.cplusplus.com/reference/future/future/get/
(3) You might as well try ltrace on Linux to see what happens under the hood (Sorry, I have no OS X machines.)
$ ltrace -n2 -f -C ./promiseStr 2>&1 | grep basic_string
[pid 6899] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)(0x7f486c17fde0, 0x406ea0, 0x7f486c17fdef, 0x7f486c180700, 0x7f486c180700) = 0x7f48640008d8
[pid 6899] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)(0x244e0b0, 0x7f486c17fde0, 0x7f4864000900, 0x7f486c17fd90, 0x7f486c17fd20) = 0x7f48640008d8
[pid 6899] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()(0x7f486c17fde0, 0xffffffff, 0, 0x7f4864000028, 0x244e038 <unfinished ...>
[pid 6899] <... std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() resumed> ) = 1
[pid 6898] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string&&)(0x7fff5c17ece0, 0x244e0b0, 0x244e0b0, -1, 0x244e038) = 0x7f486cf723d8
[pid 6898] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()(0x244e0b0, 1, 0x244e0a0, -1, 0x244e060) = 0x244e0b0
[pid 6898] std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)(0x60bd40, 0x7fff5c17ece0, 0x7fff5c17ece0, 0x203a6461, 0x7f486c53cab0) = 0x60bd40
[pid 6898] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()(0x7fff5c17ece0, 0x7f486c759250, 0x7f486c180700, 0x7f486c759250, 0) = 0
With promReceived.set_value(strObj); the string's copy constructor is called, not the move constructor.
And even if you use promReceived.set_value(std::move(strObj)); there may be no guarantee that data() returns the same value. As Mehrdad says, this can differ between implementations, for example because of the small string optimization.
(Edit: your implementation may also use the copy constructor inside the library.)
I tried to overload QDebug::operator<< for std::string. I know that we can print std::string objects with qDebug() by calling std::string::c_str(), but I want to avoid typing .c_str() each time.
Here is my attempt
#include <QDebug>
#include <string>
inline const QDebug& operator<< (const QDebug& qDebugObj, const std::string& str) {
return qDebugObj << str.c_str();
}
int main()
{
std::string s = "4444555";
qDebug() << s;
}
This program produces segmentation fault. What is incorrect with this code?
Here is the stack:
#1 0x00000037c407a911 in malloc () from /lib64/libc.so.6
#2 0x00000037ca8bd09d in operator new(unsigned long) () from /usr/lib64/libstdc++.so.6
#3 0x00000037ca89c3c9 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) ()
from /usr/lib64/libstdc++.so.6
#4 0x00000037ca89cde5 in ?? () from /usr/lib64/libstdc++.so.6
#5 0x00000037ca89cf33 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) () from /usr/lib64/libstdc++.so.6
#6 0x00000000004012ca in operator<< (qDebugObj=..., str="4444555") at main.cpp:5
If you look at the overloaded output operators, you will see that none of them takes a const-qualified stream. That is your problem: you are trying to modify a constant object. Remove the const qualification from qDebugObj and from the return value.
You should have compiler warnings screaming about it, and if not then you need to enable more warnings (at least use -Wall when compiling with GCC/clang).
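The actual problem, as answered by Mike Seymour in a comment, is that your overload will be called recursively until you get a stack overflow. QDebug's own operator<< members are non-const, so they cannot be used on a const QDebug&; the const char* returned by c_str() is therefore implicitly converted back to a std::string (the basic_string(char const*) frame in your stack trace), and your overload calls itself.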
A way of bypassing that might be to convert the string to something else, like for example a QString:
return qDebugObj << QString::fromStdString(str);
In addition to trying to make an output stream const, you also did not follow the instructions in the Qt documentation:
// with the fixed output operator
inline QDebug operator<<(QDebug dbg, const std::string& str)
{
dbg.nospace() << QString::fromStdString(str);
return dbg.space();
}
Qt wants the QDebug object passed by copy (not by reference). There used to be a reason for that, but I cannot remember what it was.
I recently read about using GCC's code generation features (specifically, the -finstrument-functions compiler flag) to easily add instrumentation to my programs. I thought it sounded really cool and went to try it out on a previous C++ project. After several revisions of my patch, I found that any time I tried to use an STL container or print to stdout using C++ stream I/O, my program would immediately crash with a segfault. My first idea was to maintain a std::list of Event structs
typedef struct
{
unsigned char event_code;
intptr_t func_addr;
intptr_t caller_addr;
pthread_t thread_id;
timespec ts;
}Event;
list<Event> events;
which would be written to a file when the program terminated. GDB told me that when I tried to add an Event to the list, calling events.push_back(ev) itself initiated an instrumentation call. This wasn't terribly surprising and made sense after I thought about it for a bit, so on to plan 2.
The example in the blog which got me involved in all this mess didn't do anything crazy, it simply wrote a string to a file using fprintf(). I didn't think there would be any harm in using C++'s stream-based I/O instead of the older (f)printf(), but that assumption proved to be wrong. This time, instead of a nearly-infinite death spiral, GDB reported a fairly normal-looking descent into the standard library... followed by a segfault.
A Short Example
#include <list>
#include <iostream>
#include <string>
#include <stdio.h>
using namespace std;
extern "C" __attribute__ ((no_instrument_function)) void __cyg_profile_func_enter(void*, void*);
list<string> text;
extern "C" void __cyg_profile_func_enter(void* /* unused */, void* /* unused */)
{
// Method 1
text.push_back("NOPE");
// Method 2
cout << "This explodes" << endl;
// Method 3
printf("This works!");
}
Sample GDB Backtrace
Method 1
#0 _int_malloc (av=0x7ffff7380720, bytes=29) at malloc.c:3570
#1 0x00007ffff704ca45 in __GI___libc_malloc (bytes=29) at malloc.c:2924
#2 0x00007ffff7652ded in operator new(unsigned long) ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007ffff763ba89 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007ffff763d495 in char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007ffff763d5e3 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00000000004028c1 in __cyg_profile_func_enter () at src/instrumentation.cpp:82
#7 0x0000000000402c6f in std::move<std::string&> (__t=...) at /usr/include/c++/4.6/bits/move.h:82
#8 0x0000000000402af5 in std::list<std::string, std::allocator<std::string> >::push_back(std::string&&) (this=0x6055c0, __x=...) at /usr/include/c++/4.6/bits/stl_list.h:993
#9 0x00000000004028d2 in __cyg_profile_func_enter () at src/instrumentation.cpp:82
#10 0x0000000000402c6f in std::move<std::string&> (__t=...) at /usr/include/c++/4.6/bits/move.h:82
#11 0x0000000000402af5 in std::list<std::string, std::allocator<std::string> >::push_back(std::string&&) (this=0x6055c0, __x=...) at /usr/include/c++/4.6/bits/stl_list.h:993
#12 0x00000000004028d2 in __cyg_profile_func_enter () at src/instrumentation.cpp:82
#13 0x0000000000402c6f in std::move<std::string&> (__t=...) at /usr/include/c++/4.6/bits/move.h:82
#14 0x0000000000402af5 in std::list<std::string, std::allocator<std::string> >::push_back(std::string&
...
Method 2
#0 0x00007ffff76307d1 in std::ostream::sentry::sentry(std::ostream&) ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00007ffff7630ee9 in std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long) ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007ffff76312ef in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x000000000040251e in __cyg_profile_func_enter () at src/instrumentation.cpp:81
#4 0x000000000040216d in _GLOBAL__sub_I__ZN8GLWindow7attribsE () at src/glwindow.cpp:164
#5 0x0000000000402f2d in __libc_csu_init ()
#6 0x00007ffff6feb700 in __libc_start_main (main=0x402cac <main()>, argc=1, ubp_av=0x7fffffffe268,
init=0x402ed0 <__libc_csu_init>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7fffffffe258) at libc-start.c:185
#7 0x0000000000401589 in _start ()
Environment:
Ubuntu Linux 12.04 (x64)
GCC 4.6.3
Intel 3750K CPU
8GB RAM
The problem with using cout in the instrumentation function is that the instrumentation function is being called from __libc_csu_init(), which runs very early in the runtime's initialization, before global C++ objects get a chance to be constructed (in fact, I think __libc_csu_init() is responsible for kicking off those constructors, at least indirectly).
So cout hasn't been constructed yet, and trying to use it doesn't work very well...
And that may well be the problem you run into when trying to use std::list after fixing the infinite recursion (mentioned in Dave S' answer).
If you're willing to lose some instrumentation during initialization, you can do something like:
#include <iostream>
#include <stdio.h>
int initialization_complete = 0;
using namespace std;
extern "C" __attribute__ ((no_instrument_function)) void __cyg_profile_func_enter(void*, void*);
extern "C" void __cyg_profile_func_enter(void* /* unused */, void* /* unused */)
{
if (!initialization_complete) return;
// Method 2
cout << "This explodes" << endl;
// Method 3
printf("This works! ");
}
void foo()
{
cout << "foo()" << endl;
}
int main()
{
initialization_complete = 1;
foo();
}
The first case seems to be infinite recursion, resulting in a stack overflow. This is probably because std::list is a template, and its code is generated as part of the translation unit where you're using it. This causes it to get instrumented as well. So you call push_back, which calls the handler, which calls push_back, ...
The second, if I had to guess, might be similar, though it's harder to tell.
The solution is to compile the instrumentation functions separately, without -finstrument-functions. Note that the example blog compiled trace.c separately, without the option.
Does anyone know if it's kosher to pass a boost::unordered_set as the first parameter to boost::split? Under libboost1.42-dev, this seems to cause problems. Here's a small example program that causes the problem, call it test-split.cc:
#include <boost/algorithm/string/classification.hpp>
#include <boost/algorithm/string/split.hpp>
#include <boost/unordered_set.hpp>
#include <string>
int main(int argc, char **argv) {
boost::unordered_set<std::string> tags_set;
boost::split(tags_set, "a^b^c^",
boost::is_any_of(std::string(1, '^')));
return 0;
}
Then, if I run the following commands:
g++ -o test-split test-split.cc; valgrind ./test-split
I get a bunch of complaints in valgrind like the one that follows (I also sometimes see coredumps without valgrind, though it seems to vary based on timing):
==16843== Invalid read of size 8
==16843== at 0x4ED07D3: std::string::end() const (in /usr/lib/libstdc++.so.6.0.13)
==16843== by 0x401EE2: unsigned long boost::hash_value<char, std::allocator<char> >(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (in /tmp/test-split)
...
==16843== by 0x402248: boost::unordered_set<std::string, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::string> >& boost::algorithm::split<boost::unordered_set<std::string, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::string> >, char const [26], boost::algorithm::detail::is_any_ofF<char> >(boost::unordered_set<std::string, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::string> >&, char const (&) [26], boost::algorithm::detail::is_any_ofF<char>, boost::algorithm::token_compress_mode_type) (in /tmp/test-split)
==16843== by 0x40192A: main (in /tmp/test-split)
==16843== Address 0x5936610 is 0 bytes inside a block of size 32 free'd
==16843== at 0x4C23E0F: operator delete(void*) (vg_replace_malloc.c:387)
==16843== by 0x4ED1EE8: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() (in /usr/lib/libstdc++.so.6.0.13)
==16843== by 0x404A8B: void boost::unordered_detail::hash_unique_table<boost::unordered_detail::set<boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::string> > >::insert_range_impl<boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, char const*>, boost::algorithm::split_iterator<char const*>, boost::use_default, boost::use_default> >(std::string const&, boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, char const*>, boost::algorithm::split_iterator<char const*>, boost::use_default, boost::use_default>, boost::transform_iterator<boost::algorithm::detail::copy_iterator_rangeF<std::string, char const*>, boost::algorithm::split_iterator<char const*>, boost::use_default, boost::use_default>) (in /tmp/test-split)
...
==16843== by 0x402248: boost::unordered_set<std::string, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::string> >& boost::algorithm::split<boost::unordered_set<std::string, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::string> >, char const [26], boost::algorithm::detail::is_any_ofF<char> >(boost::unordered_set<std::string, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::string> >&, char const (&) [26], boost::algorithm::detail::is_any_ofF<char>, boost::algorithm::token_compress_mode_type) (in /tmp/test-split)
==16843== by 0x40192A: main (in /tmp/test-split)
This is a Debian Squeeze box; here's my relevant system info:
$ g++ --version
g++ (Debian 4.4.5-2) 4.4.5
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ dpkg -l | grep boost
ii libboost-iostreams1.42.0 1.42.0-4 Boost.Iostreams Library
ii libboost1.42-dev 1.42.0-4 Boost C++ Libraries development files
$ uname -a
Linux gcc44-buildvm 2.6.32-5-amd64 #1 SMP Fri Sep 17 21:50:19 UTC 2010 x86_64 GNU/Linux
However, the code seems to work fine if I downgrade libboost1.42-dev to libboost1.40-dev. So is this a bug in boost 1.42, or am I misusing boost::split by passing in a container that can't handle sequences? Thanks!
This was confirmed on the boost-users mailing list to be a bug in the boost::unordered_set implementation. There is a patch available on the mailing list, and a fix will be checked in soon, hopefully in time for boost 1.45.
Boost-users: patch
Boost-users: confirmation
Thanks everyone for looking into this!
I think the answer should be yes.
Reading the headers (split.hpp and iter_find.hpp) split takes a SequenceSequenceT& Result as its first argument, which it passes to iter_split which range-constructs it from two boost::transform_iterators:
SequenceSequenceT Tmp(itBegin, itEnd);
Result.swap(Tmp);
return Result;
So all it needs of this type is a constructor that takes a pair of iterators which dereference to std::string (or, technically, to BOOST_STRING_TYPENAME), a .swap() member, and a SequenceSequenceT::iterator type whose value type is std::string.
proof:
#include <boost/algorithm/string/classification.hpp>
#include <boost/algorithm/string/split.hpp>
#include <string>
#include <iterator>
#include <algorithm>
#include <iostream>
struct X
{
typedef std::iterator<std::forward_iterator_tag,
std::string, ptrdiff_t, std::string*, std::string&>
iterator;
X() {}
template<typename Iter> X(Iter i1, Iter i2)
{
std::cout << "Constructed X: ";
copy(i1, i2, std::ostream_iterator<std::string>(std::cout, " " ));
std::cout << "\n";
}
void swap(X&) {}
};
int main()
{
X x;
boost::split(x, "a^b^c^", boost::is_any_of(std::string(1, '^')));
}
I think that unordered_set<std::string> should satisfy these requirements as well.
Apparently, the answer is yes (originally: no).
Using the following code, I get compile-time warnings and a runtime assert (Visual C++ v10) on the unordered_set while the vector works fine (apart from an empty string in the last element, due to the trailing '^').
boost::unordered_set<std::string> tags_set;
vector<string> SplitVec; // #2: Search for tokens
boost::split( SplitVec, "a^b^c^", boost::is_any_of("^") );
boost::split( tags_set, "a^b^c^", boost::is_any_of("^") );
Iterator compatibility between the source (string) and the target container is the issue. I would post the warning, but it's one of those "War and Peace" template warnings.
EDIT:
This looks like a bug in Boost unordered_set? When I use the following, it works as you would expect:
std::unordered_set<std::string> tags_set_std;
boost::split( tags_set_std, string("a^b^c^"), boost::is_any_of(string("^")) );