Armadillo random number generator only generating zeros (Windows, MSYS2) - c++

The following test program is supposed to generate a vector of 5 random elements, but the output contains only zeroes when I compile and run it on my machine.
#include <iostream>
#include <armadillo>
using std::cout;
using std::endl;
using arma::vec;
int main()
{
    arma::arma_rng::set_seed(1);
    vec v = arma::randu<vec>(5);
    cout << v << endl;
    cout << v(0) << endl;
    return 0;
}
Compilation/output
$ g++ main.cpp -o example -std=c++11 -O2 -larmadillo
$ ./example.exe
0
0
0
0
0
0
I'm on Windows 10, using gcc ((Rev1, Built by MSYS2 project) 8.2.1 20181214) and Armadillo (9.200.6) from MSYS2.
Packages (pacman in mingw64 subsystem):
mingw64/mingw-w64-x86_64-armadillo 9.200.6-1
mingw64/mingw-w64-x86_64-gcc-8.3.0-1
Any idea what could cause this?
I suspect this might be because I'm using the MSYS2 build of Armadillo, but I'm not sure, and I haven't tried compiling the library myself (yet).
EDIT: Since this seems to be related to MSYS2 somehow, I opened an issue here: https://github.com/msys2/MINGW-packages/issues/5019

Related

Problem with running mlpack sample program

I have installed mlpack via msys2.
Also I have installed gcc via msys2.
I made a simple program in C++ from the code on the mlpack website:
// This is an interactive demo, so feel free to change the code and click the 'Run' button.
// This simple program uses the mlpack::neighbor::NeighborSearch object
// to find the nearest neighbor of each point in a dataset using the L1 metric,
// and then print the index of the neighbor and the distance of it to stdout.
#include <C:\msys64\mingw64\include\mlpack\core.hpp>
#include <C:\msys64\mingw64\include\mlpack\methods\neighbor_search\neighbor_search.hpp>
using namespace mlpack;
using namespace mlpack::neighbor; // NeighborSearch and NearestNeighborSort
using namespace mlpack::metric; // ManhattanDistance
int main()
{
    // Load the dataset (hard-coded here rather than read from a file).
    arma::mat data("0.339406815,0.843176636,0.472701471; \
                    0.212587646,0.351174901,0.81056695; \
                    0.160147626,0.255047893,0.04072469; \
                    0.564535197,0.943435462,0.597070812");
    data = data.t();
    // Use templates to specify that we want a NeighborSearch object which uses
    // the Manhattan distance.
    NeighborSearch<NearestNeighborSort, ManhattanDistance> nn(data);
    // Create the objects we will store the nearest neighbors in.
    arma::Mat<size_t> neighbors;
    arma::mat distances; // We need to store the distance too.
    // Compute the neighbors.
    nn.Search(1, neighbors, distances);
    // Write each neighbor and distance to stdout.
    for (size_t i = 0; i < neighbors.n_elem; ++i)
    {
        std::cout << "Nearest neighbor of point " << i << " is point "
            << neighbors[i] << " and the distance is " << distances[i]
            << "." << std::endl;
    }
    return 0;
}
I compile the program as follows:
g++ nearest-neighbour.cpp -o nearest-neighbour -std=c++11 -larmadillo -l mlpack -lomp
I get an error while executing the executable.
After installing Dependency Walker, I see the procedure above flagged in red; I don't know what that means.
This time I used the command below to compile:
g++ -std=c++11 nearest_neighbour.cpp -o nearest_neighbour.exe -larmadillo -llapack -fopenmp -lmlpack -lboost_serialization-mt -lopenblas

sin(<minus zero>) does not return the expected result on Visual Studio 2013 64bit

In the engineering application that I develop for, I stumbled over a difference in the result of sin(-0) between 32-bit and 64-bit builds. Due to the nature of the computations, this propagates into some phase differences.
We are developing on Windows with MSVC 2013.
Apparently the floating-point standard specifies that sin(-0) returns the argument unchanged, according to cppreference/sin at least.
I've done some investigation and these are some other results I got:
// Visual Studio 2013 32 bit on Win7 - default arguments
std::sin( -0 ) = -0
std::sin( 0 ) = 0
// Visual Studio 2013 64 bit on Win7 - default arguments
std::sin( -0 ) = 0 // the faulty one
std::sin( 0 ) = 0
// g++ (GCC) 5.1.0 : g++ -std=c++11 -O2 -Wall -pedantic -mfpmath=387 -m64 main.cpp && ./a.out
std::sin( -0 ) = -0
std::sin( 0 ) = 0
// g++ (GCC) 5.1.0 : g++ -std=c++11 -O2 -Wall -pedantic -mfpmath=sse -m64 main.cpp && ./a.out
std::sin( -0 ) = -0
std::sin( 0 ) = 0
I also know that the Intel math libraries (libm*.dll) also return sin(-0)=-0.
Looking into the disassembly, the implementation of std::sin directs into msvcr120d.dll.
The questions:
is this an error in Microsoft's sin routine implementation on 64bit?
should I have used some specific compiler argument that I do not know about?
The code to use for the above output:
#include <cmath>
#include <iostream>
void printSin( const double dfPh )
{
    const auto dfSinPh = std::sin( dfPh );
    std::cout.precision( 16 );
    std::cout << "std::sin( " << dfPh << " ) = " << dfSinPh << std::endl;
}
int main()
{
    printSin( -0.00000000000000000000 );
    printSin( +0.00000000000000000000 );
    return 0;
}
In the end I resorted to a poor man's solution. I specifically check for -0 using std::signbit and closeness to zero, and consider sin(-0) = -0.
Slow? Perhaps, but correct for our needs.

g++ 4.8.1 on Ubuntu can't compile large bitsets

My source:
#include <iostream>
#include <bitset>
using std::cout;
using std::endl;
typedef unsigned long long U64;
const U64 MAX = 8000000000L;
struct Bitmap
{
    void insert(U64 N) { this->s.set(N % MAX); }
    bool find(U64 N) const { return this->s.test(N % MAX); }
private:
    std::bitset<MAX> s;
};
int main()
{
    cout << "Bitmap size: " << sizeof(Bitmap) << endl;
    Bitmap* s = new Bitmap();
    // ...
}
Compilation command and its output:
g++ -g -std=c++11 -O4 tc002.cpp -o latest
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.8/README.Bugs> for instructions.
A bug report and its fix would take a long time... Has anybody run into this problem already? Can I tweak some compiler flags, or change something in the source, to work around it?
I'm compiling on Ubuntu, which is actually VMware virtual machine with 12GB memory and 80GB disk space, and host machine is MacBook Pro:
uname -a
Linux ubuntu 3.11.0-15-generic #23-Ubuntu SMP Mon Dec 9 18:17:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
On my machine, g++ 4.8.1 needs a maximum of about 17 gigabytes of RAM to compile this file, as observed with top.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18287 nm 20 0 17.880g 0.014t 808 D 16.6 95.7 0:17.72 cc1plus
't' in the RES column stands for terabytes ;)
The time taken is
real 1m25.283s
user 0m31.279s
sys 0m5.819s
In C++03 mode, g++ compiles the same file using just a few megabytes. The time taken is
real 0m0.107s
user 0m0.074s
sys 0m0.011s
I would say this is definitely a bug. A workaround is to give the machine more RAM, or enable swap. Or use clang++.
[Comment]
This little thing:
#include <bitset>
int main() {
    std::bitset<8000000000UL> b;
}
results in 'virtual memory exhausted: Cannot allocate memory' when compiled with
g++ (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2

Cygwin g++ x86_64 segmentation fault (core dumped) when using > 2GB memory

I've written a prime sieve program in C++, which uses ~12 GB of RAM to calculate all primes below 100,000,000,000 (100 billion).
The program works fine when compiled with Visual Studio 2012 (in a project set up for x64) as well as g++ on 64 bit linux. However, when compiled with g++ in cygwin64 on Windows 7 Home Premium 64 bit, a segmentation fault occurs when attempting to use more than ~2GB ram (running the sieve for > ~17,000,000,000)
I'm fairly sure it's running as a 64 bit process as there's no *32 next to the process name in task manager.
The code:
#include <iostream>
#include <vector>
#include <cmath>
#include <cstdlib>
using namespace std;
long long sieve(long long n);
int main(int argc, char** argv) {
    const long long ONE_BILLION = 1000*1000*1000;
    if (argc == 2)
        cout << sieve(atol(argv[1])) << endl;
    else
        cout << sieve(ONE_BILLION * 100) << endl;
}
long long sieve(long long n) {
    vector<bool> bools(n+1);
    for (long long i = 0; i <= n; i++)
        bools[i] = true;
    double csqrtn = sqrt(n);
    for (long long i = 2; i < csqrtn; ++i)
        if (bools[i])
            for (long long j = i * i; j < n; j += i)
                bools[j] = false;
    long long primes2 = 0;
    for (long long i = 2; i < n; i++)
        if (bools[i])
            primes2++;
    return primes2;
}
Working fine in Visual Studio:
Working fine on x64 Linux:
Compiled with the command:
$ g++ -O3 sieve.cpp -o sieve.exe
Running for 18 billion fails:
$ ./sieve.exe 18000000000
Segmentation fault (core dumped)
Works fine (using 2,079,968 K memory according to task manager, though my reputation doesn't allow me to post a third link.)
$ ./sieve.exe 17000000000
755305935
g++ version:
$ g++ --version
g++ (GCC) 4.8.1
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Note: if you are going to try and run this yourself, it can take quite a long time. On a 3570K @ 4.2 GHz, running 100 billion in Visual Studio takes around 30 minutes, 1 billion around 10 seconds. However, you might be able to reproduce the error with just the vector allocation.
Edit: since I didn't explicitly put a question: Why does this happen? Is it a limitation of the cygwin64 DLL (cygwin64 was only fully released about a month ago)?
Try increasing the cygwin memory limit. This cygwin documentation suggests that the default maximum application heap size on 64-bit platforms is 4GB... although, this may be referring to 32-bit executables on 64-bit platforms... not sure what restrictions cygwin64 64-bit applications would have regarding their maximum heap size.

Nested openmp causes segmentation fault (MacOS X only)

I am building a C++ application that uses nested OpenMP. However, it crashes. The problem goes away when either one of the two omp pragmas is removed, or when the wait function is inside the main file itself. The OS is MacOS X Lion; the compiler should be either llvm-gcc or gcc-4.2 (I am not sure, I simply used cmake...). I built the following app to demonstrate:
EDIT: I now tried the same on a linux machine, it works fine. So it's a pure MACOS X (lion) issue.
OMP_NESTED is set to true.
The main:
#include "waiter.h"
#include <iostream>
#include <ctime>
#include <omp.h>
void wait() {
    int seconds = 1;
    #pragma omp parallel for
    for (int i = 0; i < 2; i++) {
        clock_t endwait;
        endwait = clock() + seconds * CLOCKS_PER_SEC;
        while (clock() < endwait) {}
        std::cout << i << "\n";
    }
}
int main() {
    std::cout << "blub\n";
    #pragma omp parallel for
    for (int i = 0; i < 5; i++) {
        Waiter w; // causes crash
        // wait(); // works
    }
    std::cout << "blub\n";
    return 0;
}
header:
#ifndef WAITER_H_
#define WAITER_H_

class Waiter {
public:
    Waiter();
};

#endif // WAITER_H_
implementation:
#include "waiter.h"
#include <omp.h>
#include <ctime>
#include <iostream>
Waiter::Waiter() {
    int seconds = 1;
    #pragma omp parallel for
    for (int i = 0; i < 5; i++) {
        clock_t endwait;
        endwait = clock() + seconds * CLOCKS_PER_SEC;
        while (clock() < endwait) {}
        std::cout << i << "\n";
    }
}
CMakeLists.txt:
cmake_minimum_required (VERSION 2.6)
project (waiter)
set(CMAKE_CXX_FLAGS "-fPIC -fopenmp")
set(CMAKE_C_FLAGS "-fPIC -fopenmp")
set(CMAKE_SHARED_LINKER_FLAGS "-fPIC -fopenmp")
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${PROJECT_BINARY_DIR}/lib)
set(EXECUTABLE_OUTPUT_PATH ${PROJECT_BINARY_DIR}/bin)
add_library(waiter SHARED waiter.cpp waiter.h)
add_executable(use_waiter use_waiter.cpp)
target_link_libraries(use_waiter waiter)
Thanks for the help!
EDIT: rewritten with more details.
OpenMP causes intermittent failures on gcc 4.2, but it is fixed by gcc 4.6.1 (or perhaps 4.6). You can get the 4.6.1 binary from http://hpc.sourceforge.net/ (look for gcc-lion.tar.gz).
The OpenMP failure on Lion with gcc older than 4.6.1 is intermittent. It seems to happen after many OpenMP calls, so nesting likely makes it more frequent, but nesting is not required. This link doesn't use nested OpenMP (there is a parallel for within a standard single-threaded for), yet it fails. My own code hung or crashed intermittently due to OpenMP after many minutes of working fine with gcc 4.2 (with no nested pragmas) on Lion, and was completely fixed by gcc 4.6.1.
I downloaded your code and compiled it with gcc 4.2 and it ran fine on my machine (with both the Waiter w; and wait(); options :-). I just used:
g++ -v -fPIC -fopenmp use_waiter.cpp waiter.cpp -o waiter
I tried increasing the loop maxes but still couldn't get it to fail. I see both the starting and ending blub.
What error message do you see?
Are you sure that the gcc 4.6 you downloaded is being used (use -v to make sure)?
See also here.