The Eigen documentation mentions that structures having fixed-size vectorizable Eigen types can cause problems with allocation. They mention:
If you're compiling in [c++17] mode only with a sufficiently recent compiler (e.g., GCC>=7, clang>=5, MSVC>=19.12), then everything is taken care by the compiler and you can stop reading.
Does it mean that the problem is present when compiling with [c++20], since they stress [c++17] mode only? Or is the problem solved for any compiler since [c++17]?
This statement is based on P0035R4 (Dynamic memory allocation for over-aligned data), which was introduced in C++17, so the statement arguably also holds for C++20. However, there seem to be some issues and pitfalls with it even when following the Eigen guidelines, e.g. as highlighted in PointCloudLibrary/pcl issue #4258.
The documentation is pretty clear: only compiling in C++17 mode with a recent compiler (it's an AND) protects you from the alignment issues. So using a recent compiler but failing to set -std=c++17 exposes you to the issue. The solution, as suggested in the documentation, is to modify the code and take advantage of the EIGEN_MAKE_ALIGNED_OPERATOR_NEW macro (which is empty for C++17).
Alternatively, the documentation proposes other, less intrusive solutions: here.
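The C++17 behaviour that the quoted sentence relies on can be checked without Eigen. The following is a minimal sketch, assuming a hypothetical 32-byte-aligned type Packet4d as a stand-in for a fixed-size vectorizable type and a hypothetical class Body that holds one; neither name comes from Eigen. When the compiler defines __cpp_aligned_new, a plain new expression is expected to return suitably aligned storage for Body.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a fixed-size vectorizable type (not an Eigen type).
struct alignas(32) Packet4d {
    double v[4];
};

// Hypothetical user class; it inherits the 32-byte requirement from its member.
struct Body {
    Packet4d position;
    int id;
};

// True when the compiler advertises aligned new (P0035R4), i.e. C++17 mode or later.
#if defined(__cpp_aligned_new)
constexpr bool has_aligned_new = true;
#else
constexpr bool has_aligned_new = false;
#endif

// Returns true if p is a multiple of alignment.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocates one Body with a plain new expression and reports whether the
// returned address satisfies alignof(Body).
inline bool heap_body_is_aligned() {
    std::unique_ptr<Body> b(new Body());
    return is_aligned(b.get(), alignof(Body));
}
```

If has_aligned_new is false (for example when -std=c++17 is not set), heap_body_is_aligned() may or may not return true depending on what the default allocator happens to hand back, which is the situation the macro is meant to cover.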
P0040R3 (adopted 2016-06, see also N4603) introduced some extended memory management algorithms like std::uninitialized_move_n into the draft, and they finally became part of ISO C++17. Some of them have an extra overload with an ExecutionPolicy parameter for potential support of parallelism.
However, as of now (Aug 2018), I can't find any standard library implementation that ships these overloads, and the documentation of the implementations I've checked does not make this clear. Specifically, the current state is:
libstdc++ shows it does not support P0040R3 in trunk, but actually at least std::destroy_at and std::uninitialized_move_n without ExecutionPolicy are in GCC 8.2.
libc++ has "Complete" support of P0040R3 since 4.0, but the overloads with ExecutionPolicy are actually missing.
Microsoft VC++ has support of P0040R3 since VS 2017 15.3 with /std:c++17 or /std:c++latest, but the overloads with ExecutionPolicy are actually missing.
The only implementation with ExecutionPolicy overloads I know of is in HPX, but that is not a full implementation of the standard library. If I want to use the features portably, I have to adapt to a custom implementation like that rather than use the std names directly. But I would still prefer to use the std implementation in the future (unless it has known bugs). (The reason is that implementation-defined execution policies are tightly coupled with concrete implementations, so external implementations, as well as their client code, would likely have less opportunity to utilize the various execution policies in general; although this is not necessarily true for client code that is not guaranteed to be portable in the sense of conforming to the standard.) Thus, I want something available for conditional inclusion in my portable adaptation layer: get the specified features with using std::... when they are provided by the standard library, and complement them with my own implementations as a fallback for the missing parts only when necessary.
As far as I know, the SD-6 feature testing macros as well as P0941R2 show that __cpp_lib_raw_memory_algorithms is sufficient for the features in P0040R3. On the other hand, __cpp_lib_parallel_algorithm does not seem to be related to <memory> at all. So there is no way to express a state like that of the current libc++ and MSVC implementations: std names from P0040R3 present, but ExecutionPolicy overloads missing. And I'm not sure __has_include(<execution>) would ever work. The reality may be quirkier, e.g. P0336R1 is not even supported by libc++.
So, how to get it perfectly portable in my code when the features become (hopefully) available in some newer version of the standard library implementations, except inspecting the source of each version of them, or totally reinventing my wheels of the whole P0040R3?
Edited:
I know the intended use of feature testing macros and I think libstdc++ has done the right thing. However, there is room to improve. More specifically, my code of the portable layer would play the role of the implementation (like HPX), but more "lightweight" in the sense of not reinventing wheels when they are already provided by the standard library implementation:
namespace my
{
#if ???
//#if __cpp_lib_raw_memory_algorithms
using std::uninitialized_move_n;
// XXX: Is this sufficient???
#else
// ... wheels here ... not expected to be more efficient to std counterparts in general
#endif
}
so my client code can be:
my::uninitialized_move_n(???::par, iter, size, d_iter);
rather than (copied from Barry's answer):
#if __cpp_lib_raw_memory_algorithms
std::uninitialized_move_n(std::execution::par, iter, size, d_iter);
#else
// ???
#endif
Both pieces of the code can work, but obviously checking __cpp_lib_raw_memory_algorithms directly everywhere in client code is more costly.
Ideally I should have some complete up-to-date standard library implementation, but that is not always the case I can guarantee (particularly working with environments where the standard library is installed as parts of system libraries). I need the adaption to ease the clients' work anyway.
The fallback is obvious: avoiding the using std::uninitialized_move_n; path totally. I'm afraid this would be a pessimistic implementation so I want to avoid this approach when possible.
Second update:
Because "perfectly portable" sounds unclear, I have illustrated some code in the edit above. Although the question is not changed and still covered by the title, I will make it more concrete here.
The "perfectly portable" way I want in the question is restricted as, given the code like the edit above, filling up any parts marked in ???, without relying on any particular versions of language implementations (e.g., nothing like macro names depended on implementations should be used for the purpose).
See here and here for code examples that fail to meet the criteria. (Well, these versions were figured out by inspecting commit logs... certainly imperfect, and still buggy in some cases.) Note this is not yet related to the overloads with ExecutionPolicy, because they are missing in the mentioned standard library implementations, and my next action depends on the solution to this question. (But the future of the names in std should be clear.)
A perfect (enough) solution can be, for example, adding a new feature testing macro to make the overloads independent from __cpp_lib_raw_memory_algorithms so in future I can just add my implementation of the overloads with ExecutionPolicy when they are not detected by the stand-alone new feature testing macro, without messing up the condition of #if again. But certainly I can't guarantee this way would be feasible; it ultimately depends on the decision of the committee and vendors.
I'm not sure whether there can be other directions.
The initial version of P0941 contained a table which made it clear that P0040R3 has the corresponding feature-test macro __cpp_lib_raw_memory_algorithms. This implies that the correct, portable way to write code to conditionally use this feature is:
#if __cpp_lib_raw_memory_algorithms
std::uninitialized_move_n(std::execution::par, iter, size, d_iter);
#else
// ???
#endif
The imposed requirement is that if that macro is defined, then that function exists and does what the standard prescribes. But that macro not being defined does not really say anything. As you point out, there are parts of P0040R3 that are implemented in libstdc++ - parts, but not all, which is why the feature-test macro is not defined.
There is currently a concerted effort to implement the parallel algorithms in libstdc++.
As to what to do in the #else branch there, well... you're kind of on your own.
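One possible shape for that #else branch, limited to the sequential overload, is sketched below. It is a minimal sketch under stated assumptions: the namespace my is taken from the question, the fallback body is a hand-written illustration rather than any vendor's code, and it deliberately does not attempt the ExecutionPolicy overloads, which the feature-test macro does not distinguish anyway.

```cpp
#include <cassert>
#include <iterator>
#include <memory>
#include <new>
#include <string>
#include <utility>

namespace my {

#if defined(__cpp_lib_raw_memory_algorithms)
// The library advertises P0040R3: reuse its implementation.
using std::uninitialized_move_n;
#else
// Fallback (illustrative): move-construct n objects into raw storage and
// destroy the already constructed ones if a constructor throws.
template <class InputIt, class Size, class ForwardIt>
std::pair<InputIt, ForwardIt>
uninitialized_move_n(InputIt first, Size n, ForwardIt d_first) {
    using T = typename std::iterator_traits<ForwardIt>::value_type;
    ForwardIt cur = d_first;
    try {
        for (; n > 0; ++first, (void)++cur, --n) {
            ::new (static_cast<void*>(std::addressof(*cur))) T(std::move(*first));
        }
    } catch (...) {
        for (; d_first != cur; ++d_first) {
            std::addressof(*d_first)->~T();
        }
        throw;
    }
    return {first, cur};
}
#endif

}  // namespace my
```

Client code then calls my::uninitialized_move_n(iter, size, d_iter) in both configurations, so the #if lives in one header instead of at every call site.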
As used in this answer, I'm looking for a C++11 compatible code for the same but the usage of std::quoted prevents me from achieving that. Can anyone suggest an alternative solution?
I give my answer assuming that you expect to find a generic approach to handle such situations. The main question that defines the guideline for me is:
"How long am I supposed to maintain this code for an older compiler version?"
If I'm certain that it will be migrated to the newer toolset along with the rest of the code base (even though in a few years time, but it will inevitably happen), then I just copy-paste implementation from the standard headers of the next target version of my compiler and put it into namespace std in a separate header within my code base. Even though it's a very rude hack, it ensures that I have exactly the same code version as the one I'll get after migration. As I start using newer (in this case C++14-compatible) compiler, I will just remove my own "quoted.h", and that's it.
Important Caveat: Barry suggested to copy-paste gcc's implementation, and I agree as long as the gcc is your main target compiler. If that's not the case, then I'd take the one from your compiler. I'm making this statement explicitly because I had troubles when I tried to copy gcc's std::nested_exception into my code base and, having switched from Visual Studio 2013 to 2017, noticed several differences. Also, in the case of gcc, pay attention to its license.
If I'm in a situation where I'll have to maintain compatibility with this older compiler for quite a while (for instance, if my product targets multiple compiler versions), then it's preferable to first check whether similar functionality is available in Boost. In most cases there is, so check the Boost website. Even though it states
"Quoted" I/O Manipulators for Strings are not yet accepted into Boost
as public components. Thus the header file is currently located in
you are able to use it from "boost/detail". And, I strongly believe that it's still better than writing your own version (despite the advice from Synxis), even though the latter can be quite simple.
If you're obliged to maintain the old toolset and you cannot use Boost, well...then it's maybe indeed worth thinking of putting your own implementation in.
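For that last case, a hand-written replacement can indeed stay small. The following is a minimal sketch, assuming a hypothetical namespace compat with string-to-string helpers quote and unquote; it mimics the delimiter and escape rules of std::quoted but is not a stream manipulator and is not the standard library's or Boost's implementation. It compiles as C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace compat {  // hypothetical name, not part of any library

// Wraps s in delim and prefixes every delim or escape character with escape.
inline std::string quote(const std::string& s, char delim = '"', char escape = '\\') {
    std::string out;
    out.reserve(s.size() + 2);
    out.push_back(delim);
    for (char c : s) {
        if (c == delim || c == escape) out.push_back(escape);
        out.push_back(c);
    }
    out.push_back(delim);
    return out;
}

// Reverses quote(). Input that does not start with delim is returned unchanged.
inline std::string unquote(const std::string& s, char delim = '"', char escape = '\\') {
    if (s.empty() || s[0] != delim) return s;
    std::string out;
    for (std::size_t i = 1; i < s.size(); ++i) {
        const char c = s[i];
        if (c == escape) {
            if (++i < s.size()) out.push_back(s[i]);  // keep the escaped character
        } else if (c == delim) {
            break;                                    // closing delimiter
        } else {
            out.push_back(c);
        }
    }
    return out;
}

}  // namespace compat
```

A stream-based call such as os << std::quoted(s) would then become os << compat::quote(s) until the toolset is upgraded.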
It looks like using Eigen types with STL containers is very messy and requires special attention to alignment issues. My problem is that I'm planning to create complex class hierarchy with dozens of classes that might include one or more Eigen types as member variables. From the documentation it appears that as soon as you include Eigen type in member variables, your class gets "infected" with same issues as Eigen types. This means I've to take extra care for using STL containers not only for Eigen types but also for all of my dozens of classes.
What worries me even more is that anyone who uses instances of my classes in their code will have the same issues and will need to be an expert on this subject, even if my classes don't expose any Eigen types in their public interface!
This is quite frustrating. Questions I have,
Is my understanding above correct (I only need to support C++11 and modern compilers)?
Is there any pattern people use so they don't have to pollute their code with special Eigen handling all over places?
I'm thinking about disabling whole vectorization globally. Would that resolve above issues at the cost of performance? Can it be enabled selectively for only specific code?
If I forget to take care of alignment issue somewhere in code, do I always get compile time error OR issue might remain hidden and there can be crash at runtime?
Yes, your understanding is mostly correct, but I should add that this only concerns Eigen's fixed-size types that require alignment, such as Vector4f, Matrix2d, etc., but not Vector3f or MatrixXd. Moreover, the core of the problem is that STL containers do not honor alignas requirements yet, though this should come in some future C++ version.
I think that the easiest way to avoid such difficulties is to use non-aligned Eigen's types for class members and container value-types such as:
typedef Eigen::Matrix<float,4,1,Eigen::DontAlign> UVector4f;
typedef Eigen::Matrix<double,2,2,Eigen::DontAlign> UMatrix2d;
This way you do not have to bother about alignment issues, and you won't lose explicit vectorization. In Eigen 3.3, unaligned objects are vectorized too.
EDIT:
Regarding your last question, unfortunately, there is no way in C++ to detect such a shortcoming at compile time. If assertions are not disabled and an invalid unaligned allocation occurs at runtime, then you will get an explicit assertion message, but that's all we can do. Therefore, the fact that your program runs fine on a given system with some given compilation flags does not mean that your code is safe. For instance, on most 64-bit systems buffers are aligned on a 16-byte boundary, so if you do not enable the AVX instruction set, your code will run fine. On the other hand, the same code may assert when moving to a more exotic platform, or when enabling AVX instructions, which require 32-byte alignment by default. Nonetheless, static analysers are becoming more and more powerful, and I think that some of these issues could be detected by them.
Another strategy consists in checking your program with a custom malloc returning 8 bytes aligned buffers only. This way you should be able to catch all shortcomings, assuming your program is well covered by unit tests. To do so, you must compile with -DEIGEN_MALLOC_ALREADY_ALIGNED=0, for obvious reasons.
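The way an alignment requirement spreads to enclosing classes, and why a 16-byte-aligned allocator can hide the problem while an 8-byte one exposes it, can be shown with standard C++ alone. This is a minimal sketch using hypothetical stand-in structs (Aligned4f, Plain3f, Unaligned4f and the classes built on them); they only imitate the alignment of the Eigen types discussed above and are not Eigen code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins, chosen only for their alignment.
struct alignas(16) Aligned4f { float v[4]; };  // like a 16-byte-aligned fixed-size type
struct Plain3f { float v[3]; };                // like a type with no extra requirement
struct Unaligned4f { float v[4]; };            // like a DontAlign variant

// User classes: the requirement of a member becomes the requirement of the class.
struct Particle { Aligned4f position; int id; };
struct Marker { Plain3f position; int id; };
struct LooseParticle { Unaligned4f position; int id; };

// True when T needs stricter alignment than an allocator that only
// guarantees `guaranteed` bytes can promise.
template <class T>
constexpr bool exceeds_guarantee(std::size_t guaranteed) {
    return alignof(T) > guaranteed;
}

static_assert(alignof(Particle) == 16, "Particle inherits the 16-byte requirement");
static_assert(alignof(LooseParticle) < 16, "the unaligned variant does not");
```

With these definitions, exceeds_guarantee<Particle>(16) is false while exceeds_guarantee<Particle>(8) is true, which mirrors why testing against an allocator that only returns 8-byte-aligned buffers surfaces mistakes that a typical 64-bit system masks.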
My friend introduced me to hash tables as a way to easily associate strings with integers, and so I tried to implement them in my current project using std::unordered_map. A sample of the code is shown below (the full function is not shown for brevity, nor is the header; I am confident the problem does not reside there):
unordered_map <string,int> world::gentypetable(){
unordered_map <string,int> hashtable;
hashtable.emplace("Normal",0);
hashtable.emplace("Fire",1);
hashtable.emplace("Water",2);
hashtable.emplace("Electric",3);
hashtable.emplace("Grass",4);
However, when I try to compile this code using g++ 3.4.4-999, I receive the following error:
error: 'class std::unordered_map<std::basic_string,<char>, int>' has no member named 'emplace'
I suspect that this is because the compiler is outdated. Is that why, or is there another reason? And if it is due to the compiler, is there an alternative syntax that could be used to avoid the lengthy process of updating it to the current version?
Yes, the problem is the outdated compiler. However, the problem does not lie in unordered_map itself: back then it was implemented as a custom extension, ext/hash_map, and some code used appropriate defines to make the names consistent. It only officially entered the standard with C++11, but many compilers had shipped it as an extension for quite some time before that.
The problem is that emplace and move semantics are relatively new concepts. They were also standardized in C++11, but only the more modern compilers support them.
That is why the compiler might recognize and be able to use hash maps, but does not know the method itself.
You could probably try swapping emplace for plain ol' insert. It may create unnecessary temporaries, but it is the long-established way of adding elements to STL containers.
Side note: gcc 3.4.4 is really old now, and lacks some crucial stuff. It is also quite inferior in terms of optimization algorithms compared to its newer releases. It would be good to update.
unordered_map was introduced in c++11, your compiler seems to be from 2009.
An alternative could be to use the boost variant of it, or to update the compiler.
I suspect that this is because the compiler is outdated.
Yes. According to this bug report that function wasn't implemented until version 4.8.0. I'm somewhat surprised that your compiler supports C++11 at all, since it's several years older than that.
Is there an alternative syntax that could be used to avoid the lengthy process of updating it to the current version?
You might be able to do something like
hashtable.insert({"Normal",0});
if your compiler supports brace-initialisation; or there's the array-like syntax
hashtable["Normal"] = 0;
Note that these have different behaviour if the key already exists; the first does nothing, while the second replaces the existing value.
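The difference between the two forms can be seen in a short example. This is a sketch assuming a compiler that already ships std::unordered_map; the function name make_type_table is hypothetical, and std::make_pair is used so that brace-initialisation is not required.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Builds the table from the question using insert instead of emplace.
inline std::unordered_map<std::string, int> make_type_table() {
    std::unordered_map<std::string, int> table;
    table.insert(std::make_pair(std::string("Normal"), 0));
    table.insert(std::make_pair(std::string("Fire"), 1));
    table.insert(std::make_pair(std::string("Water"), 2));
    table.insert(std::make_pair(std::string("Electric"), 3));
    table.insert(std::make_pair(std::string("Grass"), 4));
    return table;
}
```

Calling insert again with the key "Fire" leaves the stored value at 1 and reports the failure through the bool in the returned pair, whereas table["Fire"] = 7 overwrites it.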
You're using an old compiler. Please update for the sake of everyone and especially the kittens who will inevitably be killed when you use the old compiler.
See this question on how to install GCC on Windows. In short, install via the installer.
I've found an odd inconsistency between Rcpp code compiled with and without -std=c++0x.
Consider the expression
Function data_frame("data.frame");
GenericVector a;
a.push_back("17");
return data_frame(a, _["stringsAsFactors"]=0);
(ed. note: coercion to DataFrame in Rcpp actually thunks down to the R function, but doesn't allow the user to set that flag.)
In "old" C++ (w/o -std=c++0x set) this code works. In modern C++ (w/ -std=c++0x set), this fails, saying "cannot coerce class "pairlist" into a data.frame".
Obviously, this isn't the end of the world: I just don't use any newer features. However, I confess to being totally at a loss as to what causes this difference, and how to work around it without throwing C++11 away. Any ideas, anyone?
Code targeting features of the new standard was written in Rcpp about 2 years ago.
But later we realized that CRAN did not accept the -std=c++0x flag for gcc (or equivalent flags for other compilers) and forced the C++98 standard, so we cannot realistically use it.
Consequently we pretty much don't maintain the C++11 aware code. That's a shame because we'd really like to, but we prefer the exposure of being accepted in CRAN. Since we don't maintain, there are probably many things that don't work as they should.
This particular issue is probably easy to fix. And this will happen as soon as we get the green light on using C++11.
We love C++11 and cannot wait to use it. But we cannot use it in uploads to CRAN (as per a decree of the CRAN maintainers who consider C++11 "non portable" at this point -- please complain to them, not us, if that irks you).
Consequently it is currently "barred". There is a bit of detection in RcppCommon.h and we define HAS_CXX0X. But we haven't really written code for this, as we can't (yet) per the previous paragraph.
So if you found a bug, please do us the favor of reporting it where we request follow-ups be sent: the rcpp-devel list. Reproducible is good, patches even better :)