Array operation; Fortran equivalent of (:) - c++

I am converting from Fortran to C++. In Fortran, on a 2D array, e.g. A(1000,3), I can do the following:
A(1,:)=(x0,y0,z0)
I have searched online and haven't found anything close to that for C++ so I can vectorize some operations. Any help would be appreciated.

The C++ language does not include a built-in matrix type, but includes all the tools you need to build one. Fortunately, that does not mean you need to go and build one all by yourself. Indeed, that would be a bit like reinventing the wheel.
I suggest you rely on one of the existing open source C++ matrix/vector packages, rather than defining ad hoc types manually. One possibility is to use the Armadillo library.
Armadillo Wikipedia article
Armadillo online documentation
What you want to do can be done like this:
A.row(0) = rowvec{x0, y0, z0};
Example of self-contained source code below:
#include <armadillo>
#include <iostream>

using arma::Mat;
using arma::rowvec;

int main()
{
    Mat<double> A(1000, 3, arma::fill::zeros);

    std::cout << "Shape of matrix A: "
              << A.n_rows << "x" << A.n_cols << std::endl;

    double x0 = 1.5;
    double y0 = 2.5;
    double z0 = 3.5;

    A.row(0) = rowvec{x0, y0, z0}; // HERE

    std::cout << " A(0,0)=" << A(0,0) << " A(0,1)=" << A(0,1)
              << " A(0,2)=" << A(0,2) << std::endl;
    std::cout << " A(1,0)=" << A(1,0) << " A(1,1)=" << A(1,1)
              << " A(1,2)=" << A(1,2) << std::endl;

    return 0;
}
These specialized libraries go much further than this. For example, you can write a matrix product simply as P=A*B, and the library will gracefully call the appropriate, highly optimized BLAS/LAPACK backend routines for you. You can remain blissfully ignorant of the sad facts that the backend is Fortran-based and thus insists on numbering array elements from 1, not 0, and also insists on storing 2-dimensional arrays as contiguous columns, unlike C/C++, which prefers contiguous rows.
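For instance, a minimal sketch of such a product (matrix sizes and fill values are illustrative, not taken from the question):
#include <armadillo>

int main()
{
    arma::mat A(1000, 3, arma::fill::randu);   // 1000 x 3, random entries
    arma::mat B(3, 5, arma::fill::randu);      // 3 x 5
    arma::mat P = A * B;                       // dispatched to the optimized BLAS backend when available
    P.print("P:");
    return 0;
}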
Of course it will take some time to learn one of the existing libraries, but it is unfortunately quite possible that writing/debugging your own type system will take even more time.
Alternatively, if you insist on designing/using your own C++ linear algebra package, you could do worse than getting a copy of this book:
"The C++ Programming Language, 4th edition"
by Bjarne Stroustrup, ISBN 978-0-321-56384-2.
As you probably know, the initial C++ design effort, and a good chunk of the subsequent language evolution, was led by this author. The 4th edition of his book includes Chapter 29 (only about 30 pages), titled "A Matrix Design". Chapter 29 makes extensive use of "advanced" C++11 features such as templates and operator overloading.
Regarding elementary C++ types:
Using an array of std::arrays assumes that you know the size of your matrix at compile time. This is a restriction that was fortunately removed from modern versions of Fortran.
Using a vector of vectors would allocate the memory of the matrix as N separate areas, which is both less time-efficient and incompatible with BLAS/LAPACK backend libraries (a contiguous single-vector alternative is sketched below).
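If you do prefer to stay with elementary types, here is a minimal sketch (not part of the original answer) of a single contiguous std::vector with manual row-major indexing, which keeps the storage in one block:
#include <cstddef>
#include <vector>

int main()
{
    const std::size_t rows = 1000, cols = 3;
    std::vector<double> A(rows * cols);              // one contiguous allocation
    auto at = [&](std::size_t i, std::size_t j) -> double& {
        return A[i * cols + j];                      // row-major indexing
    };
    at(0, 0) = 1.5; at(0, 1) = 2.5; at(0, 2) = 3.5;  // the A(1,:) assignment, element by element
    return 0;
}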

If you use std::array containers and std::array::operator=, you could do this:
std::array<std::array<int, 3>, 1000> A;
int x0 = 1, y0 = 2, z0 = 3;  // example values; left uninitialized in the original
A[0] = std::array<int, 3>{x0, y0, z0};

Related

boost::multiprecision

I have just started using boost::multiprecision trying to speed up some calculations previously done in Matlab. I found quite an unexpected problem, though. My calculations involve complex numbers, so I am using cpp_complex_50 type (e.g. cpp_complex_50 A, B;)
At some point I need to use the boost::math::tools::bracket_and_solve_root() function, which requires that the function it works on returns real values. Here comes my problem: I cannot convert my complex multiprecision variable A.real() to any type that is real, e.g. to the cpp_dec_float_50 type or even double. The task should be straightforward, but I am virtually drowned in error complaints from my compiler (MSVC2015) and cannot solve it. Any hints at how to convert the data are more than welcome.
A somewhat connected question is the problem of initializing cpp_complex_50 variables with real values. At the moment I can only use data of type double at initialization, which means I am already losing some accuracy at the initialization stage, e.g.:
cpp_complex_50 A = 4.0 * boost::math::constants::pi<double>(); // it works
but
cpp_complex_50 A = 4.0 * boost::math::constants::pi<cpp_dec_float_50>(); // It does NOT work
Any hints are welcome. I am stuck at this, despite nice initial results.
Regards
Pawel
cpp_complex uses cpp_bin_float.
Live On Compiler Explorer
#include <boost/math/constants/constants.hpp>
#include <boost/multiprecision/cpp_complex.hpp>
#include <iostream>

namespace bmp = boost::multiprecision;

int main() {
    using Complex = bmp::cpp_complex_100;
    using Real    = Complex::value_type;

    Real r = 4.0 * boost::math::constants::pi<Real>();

    Complex b(r, {});
    // or
    b = r.convert_to<Complex>();

    std::cout << b.str(100) << std::endl;
}
Prints
12.56637061435917295385057353311801153678867759750042328389977836923126562514483599451213930136846827
Following the valuable comment from sehe, the code
cpp_complex_50 A = 4.0 * boost::math::constants::pi<cpp_bin_float_50>();
cout << A << endl;
works, producing:
12.5663706143591729538505735331180115367886775975
Similarly,
cpp_bin_float_50 B = A.real();
cout << B << endl;
works as well, printing the same.

R check doesn't like std::cout (C++)

I'm trying to submit a package to CRAN which contains C++ code (I have no clue about C++, the cpp files were written by somebody else).
The R check complains about ‘std::cout’ (C++)
Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console, nor the C RNG
I found in the code the following command:
integrate_const(stepper_type(default_error_checker<double>(abs_error, rel_error)),
                mDifEqn,
                x,
                0.0,
                (precipitationLength * timeStep),
                timeStep,
                streaming_observer(std::cout));
I guess R (CRAN) expects something else rather than std::cout... but what?
Your C++ project may well be using standard input and output.
The issue, as discussed in the Writing R Extensions manual, is that you then end up mixing two output systems: R's, and the C++ one.
So you are "encouraged" to replace all uses of, say,
std::cout << "The value of foo is " << foo << std::endl;
with something like
Rprintf("The value of foo is %f\n", foo);
so that your output gets blended properly with R's. In one of my (non-Rcpp) packages I had to do a lot of tedious patching for that...
Now, as mentioned in a comment by @vasicbre and an answer by @Dason, if you use Rcpp you can simply do
Rcpp::Rcout << "The value of foo is " << foo << std::endl;
If you already use Rcpp this is pretty easy, otherwise you need to decide if that makes it worth adding Rcpp...
edit: fixed typo in Rcpp::Rcout.
If you want to stream to R's buffered output you'll want to use Rcpp::Rcout instead of std::cout.
For more details you can read this article by one of Rcpp's authors: http://dirk.eddelbuettel.com/blog/2012/02/18/
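As a minimal sketch, assuming the package compiles against Rcpp (the function name here is hypothetical):
#include <Rcpp.h>

// [[Rcpp::export]]
void report_value(double foo) {
    // Rcpp::Rcout writes through R's output connection, so the text is
    // buffered and interleaved correctly with R's own output.
    Rcpp::Rcout << "The value of foo is " << foo << std::endl;
}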

Can std::vector<std::complex<boost::multiprecision::float128>>(N).data() safely be reinterpret_casted to fftwq_complex*?

I did not really expect the following example to work, but indeed it does (g++ 4.6.4, with --std=c++0x):
#include <boost/multiprecision/float128.hpp>
#include <blitz/array.h>
#include <fftw3.h>

#include <complex>
#include <iostream>
#include <vector>

int main(int /*argc*/, char** /*argv*/)
{
    // these are the same
    std::cout << sizeof(std::complex<boost::multiprecision::float128>) << " "
              << sizeof(fftwq_complex) << std::endl;

    typedef std::vector<std::complex<boost::multiprecision::float128>> boost128cvec;
    //typedef std::vector<std::complex<boost::multiprecision::float128>, fftw::allocator<std::complex<boost::multiprecision::float128>>> boost128cvec;

    // declare a std::vector consisting of std::complex<boost::multiprecision::float128>
    boost128cvec test_vector3(12);

    // casting its data storage to fftwq_complex*
    fftwq_complex* test_ptr3 = reinterpret_cast<fftwq_complex*>(test_vector3.data());

    // also create a view to the same data as a blitz::Array
    blitz::Array<std::complex<boost::multiprecision::float128>, 1>
        test_array3(test_vector3.data(), blitz::TinyVector<int, 1>(12), blitz::neverDeleteData);

    test_vector3[3] = std::complex<boost::multiprecision::float128>(1.23, 4.56);

    // this line would not work with std::vector
    test_array3 = sin(test_array3);

    // this line would not work with the built-in type __float128
    test_vector3[4] = sin(test_vector3[3]);

    // all of those print the same numbers
    std::cout << "fftw::vector: " << test_vector3[3].real() << " + i " << test_vector3[3].imag() << std::endl;
    std::cout << "fftw_complex: " << (long double)test_ptr3[3][0] << " + i " << (long double)test_ptr3[3][1] << std::endl;
    std::cout << "blitz:        " << test_array3(3).real() << " + i " << test_array3(3).imag() << std::endl << std::endl;
}
Two remarks:
The goal is to be able to use both fftw and blitz::Array operations on the same data without the need to copy it around, while at the same time being able to use generic functions like sin() for complex variables with quad precision as well.
The blitz part works fine, which is expected. But the surprise (to me) was that the fftwq_complex* part also works fine.
The fftw::allocator is a simple replacement for std::allocator which uses fftwq_malloc to ensure correct SIMD alignment, but that is not important for this question, so I left it out (at least I think so).
My Question is: How thin is the ice I'm stepping on?
You're pretty much safe:
std::vector is compatible with a C array (you can access a pointer to the first element via vector.data()), as answered in this question.
std::complex<T> is designed to be compatible with an array of the form T[2], which is compatible with FFTW. This is described in the FFTW documentation:
C++ has its own complex template class, defined in the standard <complex> header file. Reportedly, the C++ standards committee has recently agreed to mandate that the storage format used for this type be binary-compatible with the C99 type, i.e. an array T[2] with consecutive real [0] and imaginary [1] parts. (See report http://www.open-std.org/jtc1/sc22/WG21/docs/papers/2002/n1388.pdf WG21/N1388.) Although not part of the official standard as of this writing, the proposal stated that: “This solution has been tested with all current major implementations of the standard library and shown to be working.” To the extent that this is true, if you have a variable complex<double> *x, you can pass it directly to FFTW via reinterpret_cast<fftw_complex*>(x).
The only thing to keep in mind is that the pointer returned by data() is invalidated if you add values to your vector (or otherwise cause it to reallocate).
The last part is the compatibility between boost::multiprecision::float128 and __float128. The Boost documentation gives no guarantee about this.
What can be done, however, is to add some static assertions to your code which fail if the conversion is not possible. They could look like this:
// assumes <type_traits> and <boost/multiprecision/float128.hpp> are included
// and float128 refers to boost::multiprecision::float128
static_assert(std::is_standard_layout<float128>::value, "not a standard-layout type");
static_assert(sizeof(float128) == sizeof(__float128), "size mismatch");
Here sizeof guarantees that the Boost type and __float128 have the same size, and is_standard_layout checks that:
A pointer to a standard-layout class may be converted (with reinterpret_cast) to a pointer to its first non-static data member and vice versa.
Of course, this only gives a hint as to whether it will work in the end, as you cannot tell whether the type really is a __float128; but since Boost states that their type is a thin wrapper around it, it should be fine. If there are changes in the design or structure of float128, the static assertions should fail.

std::chrono & Boost.Units

I'm working on a software design in which I'd like to leverage Boost.Units. Some of the units I'd like to use represent time, however, I'm inclined to use the C++11 std::chrono units for those since they're standard.
I'm wondering if there's any clean integration between Boost.Units and chrono or whether I have to resort to writing my own converters and lose type safety by just copying scalar values between the types.
Are there any best practices for this issue?
If you just want to convert a std::chrono duration to a boost time quantity you can use the following template function:
using time_quantity = boost::units::quantity<boost::units::si::time, double>;

template <class Rep, class Period>
time_quantity toBoostTime(std::chrono::duration<Rep, Period> in)
{
    return time_quantity::from_value(double(in.count()) * double(Period::num) / double(Period::den));
}
One thing to note is that the returned time_quantity will always be in seconds and the storage type will be of type double. If any of those two are a problem, the template can be adapted.
Example:
namespace bu = boost::units;
namespace sc = std::chrono;
using time_quantity_ms = bu::quantity<decltype(bu::si::milli * bu::si::second), int32_t>;
std::cout << "Test 1: " << toBoostTime(sc::seconds(10)) << std::endl;
std::cout << "Test 2: " << toBoostTime(sc::milliseconds(10)) << std::endl;
std::cout << "Test 3: " << static_cast<time_quantity_ms>(toBoostTime(sc::milliseconds(10))) << std::endl;
/* OUTPUT */
Test 1: 10 s
Test 2: 0.01 s
Test 3: 10 ms
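Going the other way is not covered above, but a sketch under the same assumptions (the quantity's value() is in seconds; fromBoostTime is a hypothetical helper, not part of the original answer) could look like this:
template <class Duration>
Duration fromBoostTime(time_quantity q)
{
    using Rep    = typename Duration::rep;
    using Period = typename Duration::period;
    // value() is in seconds; scale it into the number of Duration ticks.
    return Duration(static_cast<Rep>(q.value() * Period::den / Period::num));
}
// e.g. fromBoostTime<std::chrono::milliseconds>(toBoostTime(sc::seconds(10))) yields 10000 ms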
This may not be a perfect answer, but boost::chrono provides an example of how to integrate it with a units system they define in the example itself (devel) (version at time of writing).
Essentially, based on the boost.units examples for quaternion and complex numbers it should be possible to define the same functions for the std::chrono units, though it may require additional code for new user-defined units.
There is also a similar, though slightly different question regarding boost::date_time which may also have useful information.
Sorry this isn't a full answer, but perhaps it will be a start someone else can complete!

cout or printf which of the two has a faster execution speed C++?

I have been coding in C++ for a long time. I always wondered which has a faster execution speed, printf or cout?
Situation: I am designing an application in C++ and I have certain constraints, such as a time limit for execution. My application has loads of printing commands on the console. So which one would be preferable, printf or cout?
Each has its own overheads. Depending on what you print, either may be faster.
Here are two points that come to mind -
printf() has to parse the "format" string and act upon it, which adds a cost.
cout has a more complex inheritance hierarchy and passes around objects.
In practice, the difference shouldn't matter for all but the weirdest cases. If you think it really matters - measure!
EDIT -
Oh, heck, I don't believe I'm doing this, but for the record, on my very specific test case, with my very specific machine and its very specific load, compiling in Release using MSVC -
Printing 150,000 "Hello, World!"s (without using endl) takes about -
90ms for printf(), 79ms for cout.
Printing 150,000 random doubles takes about -
3450ms for printf(), 3420ms for cout.
(averaged over 10 runs).
The differences are so slim this probably means nothing...
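For reference, a minimal sketch of how such a measurement could be set up (this is not the exact code used above; the counts, strings, and timing method are illustrative):
#include <chrono>
#include <cstdio>
#include <iostream>

int main() {
    const int N = 150000;

    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < N; ++i)
        std::printf("Hello, World!\n");
    auto t1 = std::chrono::steady_clock::now();
    for (int i = 0; i < N; ++i)
        std::cout << "Hello, World!\n";
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::milliseconds;
    // Report on stderr so the timing line is not mixed into the measured output.
    std::cerr << "printf: " << std::chrono::duration_cast<ms>(t1 - t0).count() << " ms, "
              << "cout: "   << std::chrono::duration_cast<ms>(t2 - t1).count() << " ms\n";
}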
Do you really need to care which has a faster execution speed? They are both used simply for printing text to the console/stdout, which typically isn't a task that demands ultra-high efficiency. For that matter, I wouldn't imagine there to be a large difference in speed anyway (though one might expect printf to be marginally quicker because it lacks the minor complications of object-orientedness). Yet given that we're dealing with I/O operations here, even a minor difference would probably be swamped by the I/O overhead. Certainly, if you compared the equivalent methods for writing to files, that would be the case.
printf is simply the standard way to output text to stdout in C.
'cout' piping is simply the standard way to output text to stdout in C++.
Saying all this, there is a thread on the comp.lang.c++ newsgroup discussing the same issue. Consensus does however seem to be that you should choose one over the other for reasons other than performance.
The reason C++ cout is slow is the default synchronization with stdio.
Try executing the following to disable it:
std::ios_base::sync_with_stdio(false);
http://www.cplusplus.com/reference/iostream/ios_base/sync_with_stdio/
http://msdn.microsoft.com/es-es/library/7yxhba01.aspx
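A minimal sketch of where the call belongs (before any I/O on the standard streams):
#include <iostream>

int main() {
    // Disable synchronization with the C stdio streams before doing any output.
    std::ios_base::sync_with_stdio(false);
    std::cout << "unsynchronized, typically faster output\n";
}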
On Windows at least, writing to the console is a huge bottleneck, so a "noisy" console mode program will be far slower than a silent one. So on that platform, slight differences in the library functions used to address the console will probably make no significant difference in practice.
On other platforms it may be different. Also it depends just how much console output you are doing, relative to other useful work.
Finally, it depends on your platform's implementation of the C and C++ I/O libraries.
So there is no general answer to this question.
Performance is a non-issue for this comparison; I can't think of a case where it actually matters when developing a console program. However, there are a few points you should take into account:
Iostreams use operator chaining instead of varargs. This means that your program can't crash because you passed the wrong number of arguments. This can happen with printf.
Iostreams use operator overloading instead of varargs -- this means your program can't crash because you passed an int where it was expecting a string. This can happen with printf.
Iostreams don't have native support for format strings (which is the major root cause of the first two points). This is generally a good thing, but sometimes format strings are useful. The Boost format library brings this functionality to Iostreams for those who need it, with defined behavior (it throws an exception) rather than undefined behavior (as is the case with printf). This currently falls outside the standard. A minimal sketch follows this list.
Iostreams, unlike their printf equivalents, can handle variable-length buffers directly themselves, instead of you being forced to deal with hardcoded cruft.
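For the format-string point just above, a minimal sketch of Boost.Format (the format string and values are illustrative):
#include <boost/format.hpp>
#include <iostream>

int main() {
    // Arguments are fed in with operator%; a wrong argument count or an
    // unformattable type raises an exception rather than undefined behavior.
    std::cout << boost::format("x = %1%, name = %2%\n") % 42 % "widget";
}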
Go for cout.
I was recently working on a C++ console application on Windows that copied files using CopyFileEx, echoing the 'to' and 'from' paths to the console for each copy and then displaying the average throughput at the end of the operation.
When I ran the console application using printf to echo out the strings I was getting 4 MB/sec; when I replaced the printf with std::cout, the throughput dropped to 800 KB/sec.
I was wondering why the std::cout call was so much more expensive and even went so far as to echo out the same string on each copy to get a better comparison on the calls. I did multiple runs to even out the comparison, but the 4x difference persisted.
Then I found this answer on Stack Overflow...
Switching on buffering for stdout did the trick, now my throughput numbers for printf and std::cout are pretty much the same.
I have not dug any deeper into how printf and cout differ in console output buffering, but setting the output buffer before I begin writing to the console solved my problem.
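Presumably something along these lines (a sketch only; the buffer size is arbitrary and the original post does not show the exact call it used):
#include <cstdio>

int main() {
    // Give stdout a large, fully buffered area before any console writes.
    static char buf[1 << 16];
    std::setvbuf(stdout, buf, _IOFBF, sizeof buf);
    // ... copy loop echoing 'to'/'from' paths follows ...
    return 0;
}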
Another Stack Overflow question addressed the relative speed of C-style formatted I/O vs. C++ iostreams:
Why is snprintf faster than ostringstream or is it?
http://www.fastformat.org/performance.html
Note, however, that the benchmarks discussed were for formatting to memory buffers. I'd guess that if you're actually performing the I/O to a console or file that the relative speed differences would be much smaller due to the I/O taking more of the overall time.
If you're using C++, you should use cout instead, as printf belongs to the C family of functions. There are many improvements made for cout that you may benefit from. As for speed, it isn't an issue, as console I/O is going to be slow anyway.
In practical terms I have always found printf to be faster than cout. But then again, cout does a lot more for you in terms of type safety. Also remember printf is a simple function whereas cout is an object based on a complex streams hierarchy, so it's not really fair to compare execution times.
To settle this:
#include <iostream>
#include <cstdio>
#include <ctime>

using namespace std;

int main(int argc, char* argv[]) {
    const char* const s1 = "some text";
    const char* const s2 = "some more text";
    int x = 1, y = 2, z = 3;
    const int BIG = 2000;
    time_t now = time(0);

    for (int i = 0; i < BIG; i++) {
        if (argc == 1) {
            cout << i << s1 << s2 << x << y << z << "\n";
        }
        else {
            printf("%d%s%s%d%d%d\n", i, s1, s2, x, y, z);
        }
    }

    cout << (argc == 1 ? "cout " : "printf ") << time(0) - now << endl;
}
produces identical timings for cout and printf.
Why don't you do an experiment? On average for me, printing the string helloperson;\n using printf takes 2 clock ticks, while cout using endl takes a huge amount of time - 1248996720685 clock ticks. Using cout with "\n" as the newline takes only 41981 clock ticks. The short URL for my code is below:
cpp.sh/94qoj
link may have expired.
To answer your question, printf is faster.
#include <iostream>
#include <ctime>
#include <cstdio>

using namespace std;

int main()
{
    clock_t one;
    clock_t two;

    // The accumulators must start at zero, otherwise the sums are garbage.
    clock_t averagePrintf = 0;
    clock_t averageCout = 0;
    clock_t averagedumbHybrid = 0;

    for (int j = 0; j < 100; j++) {
        one = clock();
        for (int d = 0; d < 20; d++) {
            printf("helloperson;");
            printf("\n");
        }
        two = clock();
        averagePrintf += two - one;

        one = clock();
        for (int d = 0; d < 20; d++) {
            cout << "helloperson;";
            cout << endl;
        }
        two = clock();
        averageCout += two - one;

        one = clock();
        for (int d = 0; d < 20; d++) {
            cout << "helloperson;";
            cout << "\n";
        }
        two = clock();
        averagedumbHybrid += two - one;
    }

    averagePrintf /= 100;
    averageCout /= 100;
    averagedumbHybrid /= 100;

    cout << "printf took " << averagePrintf << endl;
    cout << "cout took " << averageCout << endl;
    cout << "hybrid took " << averagedumbHybrid << endl;
}
Yes, I did use the word dumb. I first made it for myself, thinking that the results were crazy, so I searched it up, which ended up with me posting my code.
Hope it helps,
Ndrewffght
If you ever need to find out for performance reasons, something else is fundamentally wrong with your application - consider using some other logging facility or UI ;)
Under the hood, they will both use the same code, so speed differences will not matter.
If you are running on Windows only, the non-standard cprintf() might be faster as it bypasses a lot of the streams stuff.
However it is an odd requirement. Nobody can read that fast. Why not write output to a file, then the user can browse the file at their leisure?
Anecdotal evidence:
I once designed a logging class to use ostream operators - the implementation was insanely slow (for huge amounts of data).
I didn't analyze it too much, so it might just as well have been caused by not using ostreams correctly, or simply by the amount of data logged to disk. (The class was scrapped because of the performance problems, and in practice printf / fmtmsg style was preferred.)
I agree with the other replies that in most cases, it doesn't matter. If output really is a problem, you should consider ways to avoid / delay it, as the actual display updates typically cost more than a correctly implemented string build. Thousands of lines scrolling by within milliseconds isn't very informative anyway.
You should never need to ask this question, as the user will only be able to read slower than both of them.
If you need fast execution, don't use either.
As others have mentioned, use some kind of logging if you need a record of the operations.