In C++17, a number of special mathematical functions were added to the standard library. One of them computes the associated Laguerre polynomials. The second argument requires an unsigned int, but the mathematical definition is valid for real numbers as well. Is there any reason it's limited to nonnegative integers? Is it simply due to the fact that binomial(n,k) is easier/faster/simpler to calculate when n and k are both positive integers?
Walter E. Brown, father of special functions in C++, never explicitly answered your Laguerre question as far as I know. Nevertheless, when one reads what Brown did write, a likely motive becomes clear:
Many of the proposed Special Functions have definitions over some or all of the complex plane as well as over some or all of the real numbers. Further, some of these functions can produce complex results, even over real-valued arguments. The present proposal restricts itself by considering only real-valued arguments and (correspondingly) real-valued results.
Our investigation of the alternative led us to realize that the complex landscape for the Special Functions is figuratively dotted with land mines. In coming to our recommendation, we gave weight to the statement from a respected colleague that “Several Ph.D. dissertations would [or could] result from efforts to implement this set of functions over the complex domain.” This led us to take the position that there is insufficient prior art in this area to serve as a basis for standardization, and that such standardization would be therefore premature....
Of course, you are asking about real numbers rather than complex, so I cannot prove that the reason is the same, but Abramowitz and Stegun (upon whose manual Brown's proposal was chiefly based) provide extra support to some special functions of integer order. Curiously, in chapter 13 of my copy of Abramowitz and Stegun, I see no extra support, so you have a point, don't you? Maybe Brown was merely being cautious, not wishing to push too much into the C++ standard library all at once, but there does not immediately seem to me to be any obvious reason why floating-point arguments should not have been supported in this case.
Not that I would presume to second-guess Brown.
As you likely know, you can probably use chapter 13 of Abramowitz and Stegun to get the effect you want without too much trouble. Maybe Brown himself would have said just that, but to be sure, one would need to ask him.
For information, Brown's proposal, linked earlier, explicitly refers C++'s assoc_laguerre to Abramowitz and Stegun, sect. 13.6.9.
In summary, at the time C++ was first gaining special-function support, caution prevailed: the committee did not want the library to advance too far, too fast. It seems fairly probable that this factor chiefly accounts for the lack you note.
I know for a fact that there was great concern over real and perceived implementability. Also, new library features have to be justified: features have to be useful to more than one community. Integer-order assoc_laguerre is the form physicists most often use; a signed integer or real order was probably thought too abstract. As it was, the special math functions barely made it into C++17. I actually think part of the reason they got in was the perception that C++17 was a light release and folks wanted to bolster it.
As we all know, there is no reason for the order parameter to be unsigned, or even integral for that matter. The underlying implementation that I put in libstdc++ (gcc) has a real order alpha. Real-order assoc_laguerre is useful in quadrature rules, for example.
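For illustration, here is a minimal sketch (not the libstdc++ implementation) showing that a real order alpha poses no implementation difficulty, using the standard three-term recurrence (k+1) L_{k+1}^a(x) = (2k+1+a-x) L_k^a(x) - (k+a) L_{k-1}^a(x):

// A minimal sketch of assoc_laguerre with real order alpha, evaluated via
// the standard three-term recurrence; not an actual library implementation.
double assoc_laguerre_real(unsigned n, double alpha, double x)
{
    if (n == 0)
        return 1.0;
    double lm1 = 1.0;              // L_0^a(x)
    double l   = 1.0 + alpha - x;  // L_1^a(x)
    for (unsigned k = 1; k < n; ++k) {
        double lp1 = ((2.0 * k + 1.0 + alpha - x) * l - (k + alpha) * lm1) / (k + 1.0);
        lm1 = l;
        l = lp1;
    }
    return l;
}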
I might recommend to the committee adding overloads for real order, in addition to relaxing the unsigned integer order to signed.
Question
This is something that's been bugging me for some time now, but I couldn't find a definitive answer for it:
Is anyone aware of a proposal to introduce a standard 2D and/or 3D Vector (a struct with x,y and z members) to the STL?
If not, is there a realistic way to get such a class into the next version of the standard - short of writing a complete and perfectly written proposal myself?
And, are there any good reasons (aside from no one having the time) why this hasn't already been done?
I'm definitely willing to contribute, but I believe I lack the experience to produce something of high enough quality to get accepted (I'm not a professional programmer).
Reasoning / Background
By now I've seen dozens of libraries and frameworks (be it for graphics, physics, math, navigation, sensor fusion ...) which all basically implement their own version of
struct Vector2d {
    double x, y;
    // ...
};

/* ...
 * operator overloads
 */
and/or its 3D equivalent, not to mention all the occasions where I implemented one myself before I took the time to do a proper, reusable version.
Obviously, this is not something difficult and I'm not worrying about suboptimal implementations, but every time I want to combine two libraries or reuse code from a different project, I have to take care of converting one version into the other (either by casting or, if possible, text replacement).
Now that the committee strives to significantly extend the standard library for C++17 (especially with a 2D graphics framework), I really would like to have a common 2D vector baked into all interfaces from the start, so I can just write e.g.:
drawLine(transformCoordinates(trackedObject1.estimatePos(), params),
         transformCoordinates(trackedObject2.estimatePos(), params));
rather than
MyOwnVec2D t1{trackedObject1.estimatePosX(), trackedObject1.estimatePosY()};
MyOwnVec2D t2{trackedObject2.estimatePosX(), trackedObject2.estimatePosY()};
t1 = transformCoordinates(t1, params);
t2 = transformCoordinates(t2, params);
drawLine(t1.x, t1.y, t2.x, t2.y);
The example might be a little exaggerated, but I think it shows my point.
I'm aware of std::valarray, which already goes in the right direction, as it allows standard operations like addition and multiplication, but it carries far too much weight if all you need are two or three coordinates. I think a valarray, with fixed size and no dynamic memory allocation (e.g. based on std::array) would be an acceptable solution, especially as it would come with a trivial iterator implementation, but I personally would prefer a class with x, y (and z) members.
Remark: I'm sorry if this topic has already been discussed (and I would be surprised if it hasn't), but every time I'm searching for 2d vectors I get results talking about something like std::vector<std::vector<T>> or how to implement a certain transformation, but nothing on the topic of standardization.
are there any good reasons (aside from no one having the time) why this hasn't already been done?
There's essentially no reason to.
Forming a type that contains two or three elements is utterly trivial, and all the operations can be trivially defined too. Furthermore, the C++ standard library is not intended to be a general-purpose mathematical toolsuite: it makes sense to use specialised third-party libraries for that if you are serious about mathematical types and constructs beyond the functions and operators that you can throw together in half an hour.
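To illustrate just how trivial, a throwaway sketch along these lines (the names are made up) covers most day-to-day uses and takes minutes to write:

// A minimal sketch of such a type; roughly the half-hour's work referred to above.
struct Vec2 {
    double x, y;
};

constexpr Vec2 operator+(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
constexpr Vec2 operator-(Vec2 a, Vec2 b) { return {a.x - b.x, a.y - b.y}; }
constexpr Vec2 operator*(double s, Vec2 v) { return {s * v.x, s * v.y}; }
constexpr double dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }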
And we do not standardise things that do not need to be standardised.
If C++ were to gain some kind of standardised 3D graphics API then I can see this changing but not until then. And hopefully C++ will never gain any kind of standardised 3D graphics API, because that is not what it is for.
However, if you feel strongly about it, you can start a conversation on std-discussion, where all the experts (and some assuredly non-experts) live; sometimes such conversations lead to the formation of proposals, and it needn't be you who ends up writing it.
In case someone else has an interest in it, I wanted to point out that the July 2014 version of "A Proposal to Add 2D Graphics Rendering and Display to C++" includes a 2D point class / struct (my question was based on the initial draft from January 2014). So maybe there will be at least a simple standard 2D-Vector in c++1z.
This question talks of an optimization of the sort function that cannot be readily achieved in C:
Performance of qsort vs std::sort?
Are there more examples of compiler optimizations which would be impossible or at least difficult to achieve in C when compared to C++?
As #sehe mentioned in a comment, it's about the abstractions more than anything else. In other words, if the language allows the coder to express intent better, then the compiler can emit code which implements that intent in a more optimal fashion.
A simple example is std::fill. Sure, for basic types you could use memset, but let's say it's an array of 32-bit unsigned longs. std::fill knows that the array size is a multiple of 32 bits, and depending on the compiler, it might even be able to assume that the array is properly aligned on a 32-bit boundary as well.
All of this combined may allow the compiler to emit code which sets the value 32 bits at a time, with no run-time checks to make sure that it is valid to do so. If we are lucky, the compiler will recognize this and replace it with a particularly efficient architecture-specific version of the code.
(In reality, gcc and probably the other mainstream compilers already do this for just about anything that could be considered equivalent to a memset, including std::fill.)
Often, memset is implemented with run-time checks for these kinds of things in order to choose the optimal code path. While the difference is probably negligible, the idea is that we have better expressed the intent of "filling" an array with a specific value, so the compiler is able to make slightly better choices.
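To make the contrast concrete, here is a minimal sketch (names are mine) of the two ways of expressing the same operation; in practice a good compiler typically turns both into equivalent code:

#include <algorithm>
#include <cstdint>
#include <cstring>

// Both functions zero the same buffer. memset only sees "n * 4 bytes", while
// std::fill sees "n elements of uint32_t": the element type, its alignment
// and the fill value are all stated directly.
void clear_bytes(std::uint32_t* buf, std::size_t n)
{
    std::memset(buf, 0, n * sizeof(std::uint32_t));
}

void clear_elements(std::uint32_t* buf, std::size_t n)
{
    std::fill(buf, buf + n, std::uint32_t{0});
}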
Other more complicated language features do a good job of using the expression of intent to get larger gains, but this is the simplest example.
To be clear, my point is not that std::fill is "better" than memset; rather, this is an example of how C++ allows better expression of intent to the compiler, giving it more information at compile time and so making some optimizations easier to implement.
It depends a bit on what you think of as the optimization here. If you're thinking of it purely as "std::sort vs. qsort", then there are thousands of other similar optimizations. Using a C++ template can support inlining in situations where essentially the only reasonable alternative in C is to use a pointer to a function, and nearly no known compiler will inline the code being called. Depending on your viewpoint, this is either a single optimization or an entire (open-ended) family of them.
Another possibility is using template metaprogramming to turn something into a compile-time constant that would normally have to be computed at run time in C. In theory you could usually do this in C by embedding a magic number via a #define, but that can lose context, flexibility, or both (e.g., in C++ you can define a constant at compile time, carry out an arbitrary calculation from that input, and produce a compile-time constant used by the rest of the code; given the much more limited calculations you can carry out in a #define, that's not possible nearly as often).
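As a toy illustration of the kind of thing meant here (the classic compile-time factorial, a sketch rather than anything from a real codebase):

// Classic compile-time factorial via template metaprogramming: the whole
// calculation happens in the compiler, and the result is usable anywhere a
// constant expression is required (array bounds, template arguments, ...).
template <unsigned N>
struct Factorial {
    static const unsigned long value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static const unsigned long value = 1;
};

static int scratch[Factorial<5>::value];  // an array of 120 ints, sized at compile time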
Yet another possibility is function overloading and template specialization. These are separate, but give the same basic result: using code that's specialized to a particular type. In C, to keep the number of functions you deal with halfway reasonable, you frequently end up writing code that (for example) converts all integers to a long, then does math on that. Templates, template specialization, and overloading make it relatively easy to use code that keeps smaller types at their native sizes, which can give a substantial speed increase (especially when it enables vectorizing the math).
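A hypothetical sketch of the "native sizes" point; the function name is made up, and the lane counts assume 128-bit SSE registers:

#include <cstddef>
#include <cstdint>

// One template, many element widths: each instantiation works on the type's
// native size, which is what lets the vectorizer pack more lanes per register.
template <typename T>
void add_arrays(const T* a, const T* b, T* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

// add_arrays<std::uint8_t> can process 16 elements per 128-bit register,
// whereas a C routine promoting everything to long would process only 2.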
One last obvious possibility stems from simply providing quite a few pre-built data structures and algorithms, and allowing such things to be packaged for relatively easy, efficient re-use. I doubt I could even count the number of times I wrote code in C using what I knew were relatively inefficient data structures and/or algorithms, simply because it wasn't worth the time to find (or adapt) a more efficient one to the task at hand. Yes, if it really became a major bottleneck, I'd go to the trouble of finding or writing something better -- but doing a bit of comparing, it's still fairly common to see speed double when written in C++.
I should add, however, that all of these are undoubtedly possible with C, at least in theory. If you approach this from a viewpoint of something like language complexity theory and theoretical models of computation (e.g., Turing machines) there's no question that C and C++ are equivalent. With enough work writing specialized versions of each function, you can/could theoretically do all of those same things with C as you can with C++.
From a viewpoint of what code you can plan on really writing in a practical project, the story changes very quickly -- the limit on what you can do mostly comes down to what you can reasonably manage, not anything like the theoretical model of computation represented by the language. Levels of optimization that are almost entirely theoretical in C are not only practical, but quite routine in C++.
Even the qsort vs std::sort example is invalid. If a C implementation wanted, it could put an inline version of qsort in stdlib.h, and any decent C compiler could handle inlining the comparison function. The reason this usually isn't done is that it's massively bloated and of dubious performance benefit -- issues C++ folks tend not to care about...
I have recently read an article about fast sqrt calculation. Therefore, I have decided to ask the SO community and its experts to help me find out which STL algorithms or mathematical calculations can be implemented faster with programming hacks.
It would be great if you can give examples or links.
Thanks in advance.
System library developers have more concerns than just performance in mind:
Correctness and standards compliance: Critical!
General use: No optimisations are introduced, unless they benefit the majority of users.
Maintainability: Good hand-written assembly code can be faster, but you don't see much of it. Why?
Portability: Decent libraries should be portable to more than just Windows/x86/32bit.
Many optimisation hacks that you see around violate one or more of the requirements above.
In addition, optimisations that will be useless or even break when the next generation CPU comes around the corner are not a welcome thing.
If you don't have profiler evidence on it being really useful, don't bother optimising the system libraries. If you do, work on your own algorithms and code first, anyway...
EDIT:
I should also mention a couple of other all-encompassing concerns:
The cost/effort to profit/result ratio: Optimisations are an investment. Some of them are seemingly-impressive bubbles. Others are deeper and more effective in the long run. Their benefits must always be considered in relation to the cost of developing and maintaining them.
The marketing people: No matter what you think, you'll end up doing whatever they want - or think they want.
Probably all of them can be made faster for a specific problem domain.
Now the real question is, which ones should you hack to make faster? None, until the profiler tells you to.
Several of the algorithms in <algorithm> can be optimized for vector<bool>::[const_]iterator. These include:
find
count
fill
fill_n
copy
copy_backward
move // C++0x
move_backward // C++0x
swap_ranges
rotate
equal
I've probably missed some. But all of the above algorithms can be optimized to work on many bits at a time instead of just one bit at a time (as would a naive implementation).
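To sketch the flavour of such an optimization (this is an illustration of the idea only, not the actual implementation in any library), counting bits a whole machine word at a time rather than one bit at a time:

#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// When the bits are packed into machine words, a count can consume a whole
// word per iteration instead of testing each bit individually.
std::size_t count_set_bits(const std::vector<std::uint64_t>& words)
{
    std::size_t total = 0;
    for (std::uint64_t w : words)
        total += std::bitset<64>(w).count();  // popcount over 64 bits at once
    return total;
}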
This is an optimization that I suspect is sorely missing from most STL implementations. It is not missing from this one:
http://libcxx.llvm.org/
This is where you really need to listen to project managers and MBAs. What you're suggesting is re-implementing parts of the STL and/or the standard C library. There is an associated cost in terms of implementation time and maintenance burden, so you shouldn't do it unless you really, genuinely need to, as John points out. The rule is simple: is this calculation you're doing slowing you down (i.e., are you CPU-bound)? If not, don't create your own implementation just for the sake of it.
Now, if you're really interested in fast maths, there are a few places you can start. The GNU multi-precision library (GMP) implements many algorithms from modern computer arithmetic and seminumerical algorithms, all about doing maths on arbitrary-precision integers and floats insanely fast. The people who write it optimise in assembly per build platform; it is about as fast as you can get in single-core mode. This is the most general case I can think of for optimised maths, i.e. maths that isn't specific to a certain domain.
Tying my first and second paragraphs in with what thkala has said: consider that GMP/MPIR have optimised assembly versions for each CPU architecture and OS they support. Really. It's a big job, but it is what makes those libraries so fast on the specific, small subset of problems they target.
Sometimes domain-specific enhancements can be made. This is about understanding the problem in question. For example, when doing finite-field arithmetic over Rijndael's finite field you can, based on the knowledge that the field has characteristic 2 and its elements are degree-7 polynomials (8 coefficients), assume that your integers fit in a uint8_t and that addition/subtraction are equivalent to xor operations. How does this work? Well, basically, if you add or subtract two elements, each coefficient is either zero or one. If the corresponding coefficients are both zero or both one, the result is zero; if they differ, the result is one. Coefficient by coefficient, that is equivalent to xor across an 8-bit binary string, where each bit represents a term in the polynomial. Multiplication is also relatively efficient. You can bet that Rijndael was designed to take advantage of this kind of result.
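A small sketch of that arithmetic, assuming the AES/Rijndael reduction polynomial x^8 + x^4 + x^3 + x + 1:

#include <cstdint>

// GF(2^8) arithmetic as used by Rijndael/AES. Addition and subtraction are
// both XOR; multiplication is shift-and-add with reduction modulo the AES
// polynomial x^8 + x^4 + x^3 + x + 1 (whose low byte is 0x1B).
std::uint8_t gf_add(std::uint8_t a, std::uint8_t b) { return a ^ b; }

std::uint8_t gf_mul(std::uint8_t a, std::uint8_t b)
{
    std::uint8_t result = 0;
    while (b) {
        if (b & 1)
            result ^= a;        // "add" the current multiple of a
        bool carry = a & 0x80;  // would the shift overflow out of 8 bits?
        a <<= 1;
        if (carry)
            a ^= 0x1B;          // reduce modulo the AES polynomial
        b >>= 1;
    }
    return result;
}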
That's a very specific result. It depends entirely on what you're doing to make things efficient. I can't imagine many STL functions are purely optimised for CPU speed, because among other things the STL provides collections via templates (which are about memory), file access (which is about storage), exception handling, and so on. In short, being really fast is a narrow subset of what the STL does and what it aims to achieve. Also, you should note that optimisation can be looked at from different angles. For example, if your app is heavy on IO, you are IO-bound. Having a massively efficient square root calculation isn't really helpful, since "slowness" really means waiting on the disk/OS/your file-parsing routine.
In short, you as a developer of an STL library are trying to build an "all round" library for many different use cases.
But, since these things are always interesting, you might well be interested in bit twiddling hacks. I can't remember where I saw that, but I've definitely stolen that link from somebody else on here.
Almost none. The standard library is designed the way it is for a reason.
Taking sqrt, which you mention as an example, the standard library version is written to be as fast as possible, without sacrificing numerical accuracy or portability.
The article you mention is really beyond useless. There are some good articles floating around the 'net describing more efficient ways to implement square roots, but this article isn't among them (it doesn't even measure whether the described algorithms are faster!). Carmack's trick is slower than std::sqrt on a modern CPU, as well as being less accurate.
It was used in a game something like 12 years ago, when CPUs had very different performance characteristics. It was faster then, but CPUs have changed, and today it's both slower and less accurate than the CPU's built-in sqrt instruction.
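For reference, the trick under discussion is essentially the well-known "fast inverse square root"; a sketch follows purely for historical illustration, not as a recommendation:

#include <cstdint>
#include <cstring>

// The classic "fast inverse square root" (shown for historical interest only;
// on current CPUs std::sqrt and hardware rsqrt are both faster and more accurate).
float fast_inv_sqrt(float x)
{
    float half = 0.5f * x;
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // reinterpret the float's bits without UB
    bits = 0x5f3759df - (bits >> 1);       // magic-number initial guess
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y * (1.5f - half * y * y);      // one Newton-Raphson refinement step
}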
You can implement a square root function which is faster than std::sqrt without losing accuracy, but then you lose portability, as it'll rely on CPU features not present on older CPUs.
Speed, accuracy, portability: choose any two. The standard library tries to balance all three, which means that the speed isn't as good as it could be if you were willing to sacrifice accuracy or portability, and accuracy is good, but not as good as it could be if you were willing to sacrifice speed, and so on.
In general, forget any notion of optimizing the standard library. The question you should be asking is whether you can write more specialized code.
The standard library has to cover every case. If you don't need that, you might be able to speed up the cases that you do need. But then it is no longer a suitable replacement for the standard library.
Now, there are no doubt parts of the standard library that could be optimized; the C++ IOStreams library in particular comes to mind. It is often naively, and very inefficiently, implemented. The C++ committee's technical report on C++ performance has an entire chapter dedicated to exploring how IOStreams could be implemented to be faster.
But that's I/O, where performance is often considered to be "unimportant".
For the rest of the standard library, you're unlikely to find much room for optimization.
I'm convinced that software testing is indeed very important, especially in science. However, over the last 6 years I have never come across any scientific software project that was under regular testing (and most of them were not even version controlled).
Now I'm wondering how you deal with software tests for scientific codes (numerical computations).
From my point of view, standard unit tests often miss the point, since there is no exact result, so using assert(a == b) might prove a bit difficult due to "normal" numerical errors.
So I'm looking forward to reading your thoughts about this.
I am also in academia and I have written quantum mechanical simulation programs to be executed on our cluster. I made the same observation regarding testing or even version control. It was even worse in my case: I am using a C++ library for my simulations, and the code I got from others was pure spaghetti code, with no inheritance and not even functions.
I rewrote it and I also implemented some unit testing. You are correct that you have to deal with the numerical precision, which can be different depending on the architecture you are running on. Nevertheless, unit testing is possible, as long as you are taking these numerical rounding errors into account. Your result should not depend on the rounding of the numerical values, otherwise you would have a different problem with the robustness of your algorithm.
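In practice that means assertions shaped like the sketch below rather than assert(a == b); the tolerance values are placeholders to be tuned per quantity under test:

#include <cassert>
#include <cmath>

// Tolerance-aware comparison for numerical unit tests; abs_tol and rel_tol
// are illustrative defaults, to be chosen per test.
bool almost_equal(double a, double b,
                  double abs_tol = 1e-12, double rel_tol = 1e-9)
{
    return std::fabs(a - b) <= abs_tol + rel_tol * std::fabs(b);
}

// usage in a test:
//   assert(almost_equal(computed_energy, reference_energy));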
So, to conclude, I use unit testing for my scientific programs, and it really makes one more confident about the results, especially with regards to publishing the data in the end.
Just been looking at a similar issue (google: "testing scientific software") and came up with a few papers that may be of interest. These cover both the mundane coding errors and the bigger issues of knowing if the result is even right (depth of the Earth's mantle?)
http://http.icsi.berkeley.edu/ftp/pub/speech/papers/wikipapers/cox_harris_testing_numerical_software.pdf
http://www.cs.ua.edu/~SECSE09/Presentations/09_Hook.pdf (broken link; new link is http://www.se4science.org/workshops/secse09/Presentations/09_Hook.pdf)
http://www.associationforsoftwaretesting.org/?dl_name=DianeKellyRebeccaSanders_TheChallengeOfTestingScientificSoftware_paper.pdf
I thought the idea of mutation testing described in 09_Hook.pdf (see also matmute.sourceforge.net) is particularly interesting as it mimics the simple mistakes we all make. The hardest part is to learn to use statistical analysis for confidence levels, rather than single pass code reviews (man or machine).
The problem is not new. I'm sure I have an original copy of "How accurate is scientific software?" by Hatton et al., Oct 1994, which even then showed how different implementations of the same theories (as algorithms) diverged rather rapidly. (It's also ref. 8 in the Kelly & Sanders paper.)
(Update, Oct 2019) More recently: Testing Scientific Software: A Systematic Literature Review
I'm also using cpptest for its TEST_ASSERT_DELTA. I'm writing high-performance numerical programs in computational electromagnetics and I've been happily using it in my C++ programs.
I typically go about testing scientific code the same way as I do with any other kind of code, with only a few retouches, namely:
I always test my numerical codes for cases that make no physical sense and make sure the computation actually stops before producing a result. I learned this the hard way: I had a function that was computing some frequency responses, then supplied a matrix built from them as an argument to another function, which eventually gave its answer as a single vector. The matrix could have been any size depending on how many terminals the signal was applied to, but my function was not checking whether the matrix size was consistent with the number of terminals (2 terminals should have meant a 2 x 2 x n matrix); however, the code itself was written so as not to depend on that, since it just had to do some basic matrix operations regardless of size. Eventually, the results were perfectly plausible, well within the expected range and, in fact, partially correct: only half of the solution vector was garbled. It took me a while to figure it out. If your data looks correct, is assembled in a valid data structure and the numerical values are good (e.g. no NaNs or negative numbers of particles) but it doesn't make physical sense, the function has to fail gracefully.
I always test the I/O routines even if they are just reading a bunch of comma-separated numbers from a test file. When you're writing code that does twisted math, it's always tempting to jump into debugging the part of the code that is so math-heavy that you need a caffeine jolt just to understand the symbols. Days later, you realize you are also adding the ASCII value of \n to your list of points.
When testing for a mathematical relation, I always test it "by the book", and I also learned this by example. I've seen code that was supposed to compare two vectors but only checked for equality of elements and did not check for equality of length.
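A sketch of what "by the book" means in code, with both the length check and a per-element tolerance (names and tolerances are illustrative):

#include <cmath>
#include <cstddef>
#include <vector>

// Compare two numerical vectors properly: lengths first, then each element
// within a tolerance, instead of silently iterating over the shorter one.
bool vectors_match(const std::vector<double>& a,
                   const std::vector<double>& b,
                   double tol = 1e-9)
{
    if (a.size() != b.size())
        return false;                       // the check that is often forgotten
    for (std::size_t i = 0; i < a.size(); ++i)
        if (std::fabs(a[i] - b[i]) > tol)
            return false;
    return true;
}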
Please take a look at the answers to the SO question How to use TDD correctly to implement a numerical method?
I just found out about the Boost Phoenix library (hidden in the Spirit project) and, as a fan of the functional-programming style (but still an amateur, with some small experience in Haskell and Scheme), I wanted to play around with this library to learn about reasonable applications for it.
Besides the increase in expressiveness and clarity of code written in FP style, I'm especially interested in lazy evaluation for speeding up computations at low cost.
A small and simple example would be the following:
There is some kind of routing problem (like the TSP) which uses a Euclidean distance matrix. We assume that some of the values of the distance matrix are never used, and some are used very often (so it isn't a good idea to compute them on the fly for every call). Now it seems reasonable to have a lazy data structure holding the distance values; a plain-C++ sketch of this idea appears after these questions. How would that be possible with Phoenix? (Ignoring the fact that it could easily be done without FP-style programming at all.) Reading the official documentation of Phoenix didn't let me understand enough to answer that.
Is it possible at all? (In Haskell, for example, the ability to create thunks that guarantee the value can be computed later is at the core of the language.)
What does using a vector with all the lazy functions defined in Phoenix mean? Naive as I am, I tried to fill two matrices (vectors of vectors) with random values, one with the normal push_back, the other with boost::phoenix::push_back, and tried to read out only a small number of values from these matrices and store them in a container for printing. The lazy one was always empty. Am I using Phoenix in the wrong way, or should this be possible? Or did I misunderstand the function of the containers/algorithms in Phoenix? A small clue for the latter is the existence of a special list data structure in the FC++ library, which influenced Phoenix.
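For what it's worth, here is a hand-rolled, non-Phoenix sketch of the lazy distance matrix idea, with each entry behaving like a cached thunk (all names are made up, and it uses C++17's std::optional):

#include <cmath>
#include <cstddef>
#include <optional>
#include <vector>

// Plain C++ (no Phoenix): each distance is computed on first access and then
// cached, which is the memoized flavour of laziness described above.
struct LazyDistanceMatrix {
    std::vector<double> xs, ys;                        // point coordinates
    mutable std::vector<std::optional<double>> cache;  // one slot per (i, j) pair

    LazyDistanceMatrix(std::vector<double> x, std::vector<double> y)
        : xs(std::move(x)), ys(std::move(y)), cache(xs.size() * xs.size()) {}

    double operator()(std::size_t i, std::size_t j) const
    {
        auto& slot = cache[i * xs.size() + j];
        if (!slot)                                     // first access: compute and remember
            slot = std::hypot(xs[i] - xs[j], ys[i] - ys[j]);
        return *slot;
    }
};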
Additionally:
What are you using phoenix for?
Do you know of some good resources regarding Phoenix? (tutorials, blog entries...)
Thanks for all your input!
As requested, my comment (with additions and small modifications) as an answer...
I know your position exactly. I too played around with Phoenix a while ago (although I didn't dig in very deeply; it was mostly a byproduct of reading the Boost::Spirit tutorial), relatively soon after catching the functional bug and learning basic Haskell, and I didn't get anything working :( This is, by the way, in sync with my general experience with dark template magic: very easy to misunderstand, screw up, and get punched in the face by totally unexpected behaviour or incomprehensible error messages.
I'd advise you to stay away from Phoenix for a long time. I like FP too, but FP in C++ is even uglier than mutability in Haskell (they'd be head to head, but C++ is already ugly, and Haskell is, at least according to Larry Wall, the most beautiful language ever ;) ). Learn and use FP, and when you're good at it and forced to use C++, use Phoenix. But for learning, a library that bolts a wholly different paradigm onto an already complex language (i.e. FP in C++) is not advisable.