C++ atomic variable compare and increment - c++

I would like to know how to make the following function atomic as a whole.
With my code, I believe there can be a situation in which two threads both pass the condition and return 0 and 1 respectively, right?
static std::atomic<uintV> shared_v (0);
int compare_increment() {
if (shared_v >= 10) {
return -1;
}
return shared_v++;
}
Any help would be appreciated.

You can use compare_exchange_weak in a loop to achieve this sort of read-modify-write effect without a mutex.
Example (untested):
int compare_increment() {
uintV old = shared_v.load();
do {
if (old >= 10)
return -1;
} while (!shared_v.compare_exchange_weak(old, old+1));
return old;
}
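For reference, here is a self-contained sketch of the loop above together with a small harness to exercise it. It assumes uintV is an unsigned integer type (the question does not show its definition); kLimit and count_successes are hypothetical names added only for illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Assumption: the question's uintV is an unsigned integer type.
using uintV = unsigned;

static std::atomic<uintV> shared_v{0};
constexpr uintV kLimit = 10;  // hypothetical name for the bound used in the question

// Returns the pre-increment value, or -1 once the limit has been reached.
int compare_increment() {
    uintV old = shared_v.load();
    do {
        if (old >= kLimit)
            return -1;
        // On failure, compare_exchange_weak stores the current value in `old`,
        // so the limit check runs again before the next attempt.
    } while (!shared_v.compare_exchange_weak(old, old + 1));
    return static_cast<int>(old);
}

// Hypothetical harness: starts `threads` threads, each calling
// compare_increment `calls` times, and returns the number of successful calls.
int count_successes(int threads, int calls) {
    std::atomic<int> successes{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&successes, calls] {
            for (int i = 0; i < calls; ++i)
                if (compare_increment() != -1)
                    ++successes;
        });
    }
    for (auto &th : pool)
        th.join();
    return successes.load();
}
```

However many threads and calls are used, count_successes should report exactly 10 successes, because the check and the increment happen as one atomic step.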

I wrote one for a bit of fun.
$ cat compare-exchange-atomic-test.cpp
#include <array>
#include <atomic>
#include <iostream>
#include <string>
#include <thread>
std::atomic<int> value{0};
std::atomic<int> loops{0};
int comparison_target = 10;
void f() {
int v;
do {
v = value;
if (v < comparison_target) {
return;
}
++loops;
} while (!value.compare_exchange_weak(v, v + 1));
}
void g() {
int i;
for (i = 0; i < 1000; ++i) {
f();
}
}
int main(int argc, char *argv[]) {
if (argc > 1) {
comparison_target = std::stoi(argv[1]);
}
std::array<std::thread, 64> threads;
for (auto &x : threads) {
x = std::thread(g);
}
for (auto &x : threads) {
x.join();
}
std::cout << value << ' ' << comparison_target << ' ' << loops << std::endl;
return 0;
}
$ g++ -Wall -W -pedantic -g -O3 -flto -fno-fat-lto-objects -mcpu=native -DNDEBUG -pthread -MMD -fPIC -fPIE -std=c++17 compare-exchange-atomic-test.cpp -pthread -flto -fno-fat-lto-objects -o compare-exchange-atomic-test
$ ./compare-exchange-atomic-test 1
0 1 0
$ ./compare-exchange-atomic-test 0
64000 0 455100
$ ./compare-exchange-atomic-test 0
64000 0 550596

Related

Clang++ compiled successfully but executable file didn't run

I ran into this problem when solving a math problem. Here's the code:
// program to check if parentheses expression is correct
// {()}[{()}] = true, ()(} = false
#include <string>
#include <iostream>
#include <stack>
using namespace std;
int par(string str) {
int a = str.length();
stack<char> S;
char x;
for (int i = 0; i < a; i++) {
x = str[i];
if (x == '(' || x == '[' || x == '{') {
S.push(x);
} else {
if (x == ')') {
if (S.top() == '(') {
S.pop();
} else
return 0;
} else if (x == ']') {
if (S.top() == '[') {
S.pop();
} else
return 0;
} else if (x == '}') {
if (S.top() == '{') {
S.pop();
} else
return 0;
}
}
}
if (!S.empty()) {
return 0;
} else
return 1;
}
int main() {
int n;
string str;
cin >> n;
for (int i = 0; i < n; i++) {
cin >> str;
cout << par(str) << endl;
}
return 0;
}
Please ignore the poor coding style. I compiled this code with the following command:
clang++ -std=c++20 -Wall -Wextra -pedantic -g par.cpp -o par.exe
The code compiled and an executable was generated, but when I run par.exe it terminates immediately.
With g++, however, it works fine:
g++ -std=c++20 -Wall -Wextra -pedantic -g par.cpp -o par.exe
Here's the clang++ version:
>clang++ --version
clang version 15.0.5
Target: x86_64-w64-windows-gnu
Thread model: posix
InstalledDir: C:/msys64/ucrt64/bin
and g++:
>g++ --version
g++.exe (tdm64-1) 10.3.0
Copyright (C) 2020 Free Software Foundation, Inc.
I'm assuming this is because of the <stack> and <queue> headers, since programs that don't use them work fine when built with clang.
Any ideas?
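Separately from the toolchain question, note that par() calls S.top() without checking whether the stack is empty, which is undefined behavior for input such as ")". The following is a minimal sketch of a guarded version. It is not offered as an explanation of the clang++ behavior, and the variable names open and expected are only illustrative.

```cpp
#include <stack>
#include <string>

// Returns 1 if every bracket in `str` is correctly matched, otherwise 0.
int par(const std::string &str) {
    std::stack<char> open;  // opening brackets that have not been closed yet
    for (char x : str) {
        if (x == '(' || x == '[' || x == '{') {
            open.push(x);
            continue;
        }
        char expected;
        if (x == ')') expected = '(';
        else if (x == ']') expected = '[';
        else if (x == '}') expected = '{';
        else continue;  // non-bracket characters are ignored, as in the original
        // Check for an empty stack before calling top().
        if (open.empty() || open.top() != expected)
            return 0;
        open.pop();
    }
    return open.empty() ? 1 : 0;
}
```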

Undefined reference for TicTacToe [duplicate]

I've got this program that fails with an undefined reference, but I don't know why; everything seems correct to me. Can someone help me?
This is the error:
g++ -c driver.cpp -std=c++11 -pedantic -Wall
g++ -o driver driver.o
/usr/bin/ld: driver.o: in function `main':
driver.cpp:(.text+0x23): undefined reference to `TicTacToe::TicTacToe()'
/usr/bin/ld: driver.cpp:(.text+0x2f): undefined reference to `TicTacToe::makeMove()'
collect2: error: ld returned 1 exit status
make: *** [makefile:2: driver] Error 1
And these are the files I'm using:
The driver program:
// driver.cpp: use of TicTacToe class
#include "TicTacToe.h" // include definition of class TicTacToe
int main() {
TicTacToe g; // creates object g of class TicTacToe
g.makeMove(); // invokes function makeMove
}
This is the class:
// TicTacToe.h
#ifndef TICTACTOE_H
#define TICTACTOE_H
#include <array>
class TicTacToe {
private:
enum Status {WIN, DRAW, CONTINUE}; // enumeration constants
std::array<std::array<int, 3>, 3> board;
public:
TicTacToe(); // default constructor
void makeMove(); // make move
void printBoard() const; // print board
bool validMove(int, int) const; // validate move
bool xoMove(int); // x o move
Status gameStatus() const; // game status
};
#endif
These are the class function members:
// TicTacToe.cpp
// Member-function definitions for class TicTacToe.
#include <iostream>
#include <iomanip>
#include "TicTacToe.h" // include definition of class TicTacToe
using std::cout;
using std::cin;
using std::setw;
TicTacToe::TicTacToe() {
for (int j{0}; j < 3; ++j) { // initialize board
for (int k{0}; k < 3; ++k) {
board[j][k] = ' ';
}
}
}
void TicTacToe::makeMove() {
printBoard();
while (true) {
if (xoMove('X')) {
break;
}
else if (xoMove('O')) {
break;
}
}
}
void TicTacToe::printBoard() const {
cout << " 0 1 2\n\n";
for (int r{0}; r < 3; ++r) {
cout << r;
for (int c = 0; c < 3; ++c) {
cout << setw(3) << static_cast< char > (board[r][c]);
if (c != 2) {
cout << " |";
}
}
if (r != 2) {
cout << "\n ____|____|____\n | | \n";
}
}
cout << "\n\n";
}
bool TicTacToe::xoMove(int symbol) {
int x;
int y;
do {
cout << "Player " << static_cast<char>(symbol) << " enter move: ";
cin >> x >> y;
cout << '\n';
} while (!validMove(x, y));
board[x][y] = symbol;
printBoard();
Status xoStatus = gameStatus();
if (xoStatus == WIN) {
cout << "Player " << static_cast<char>(symbol) << " wins!\n";
return true;
}
else if (xoStatus == DRAW) {
cout << "Game is draw.\n";
return true;
}
else { // CONTINUE
return false;
}
}
bool TicTacToe::validMove(int r, int c) const {
return r >= 0 && r < 3 && c >= 0 && c < 3 && board[r][c] == ' ';
}
// must specify that type Status is part of the TicTacToe class.
TicTacToe::Status TicTacToe::gameStatus() const {
// check for a win on diagonals
if (board[0][0] != ' ' && board[0][0] == board[1][1] && board[0][0] == board[2][2]) {
return WIN;
}
else if (board[2][0] != ' ' && board[2][0] == board[1][1] && board[2][0] == board[0][2]) {
return WIN;
}
// check for win in rows
for (int a{0}; a < 3; ++a) {
if (board[a][0] != ' ' && board[a][0] == board[a][1] && board[a][0] == board[a][2]) {
return WIN;
}
}
// check for win in columns
for (int a{0}; a < 3; ++a) {
if (board[0][a] != ' ' && board[0][a] == board[1][a] && board[0][a] == board[2][a]) {
return WIN;
}
}
// check for a completed game
for (int r{0}; r < 3; ++r) {
for (int c{0}; c < 3; ++c) {
if (board[r][c] == ' ') {
return CONTINUE; // game is not finished
}
}
}
return DRAW; // game is a draw
}
It's probably something stupid but I don't know what I have to look for.
Compile both source files, then link both object files, step by step:
g++ -c driver.cpp TicTacToe.cpp -std=c++11 -pedantic -Wall
g++ -o driver driver.o TicTacToe.o
./driver

Compiler optimizes out the coroutine value

I have implemented two coroutines that use the Resumable class:
#include <coroutine>
#include <future>
#include <iostream>
class Resumable
{
public:
class promise_type
{
public:
auto initial_suspend()
{
return std::suspend_always();
}
auto final_suspend() noexcept
{
return std::suspend_always();
}
auto yield_value(const int& value)
{
value_ = value;
return std::suspend_always();
}
auto return_value(const int& value)
{
value_ = value;
return std::suspend_always();
}
auto get_return_object()
{
return std::coroutine_handle<promise_type>::from_promise(*this);
}
void unhandled_exception()
{
}
public:
int value_;
};
public:
Resumable(std::coroutine_handle<promise_type> handle)
: handle_{ handle }
{
}
~Resumable()
{
handle_.destroy();
}
int get()
{
if (!handle_.done())
{
handle_.resume();
}
return handle_.promise().value_;
}
private:
std::coroutine_handle<promise_type> handle_;
};
Resumable generate(int a)
{
for (int i = 1; i < a; ++i)
{
co_yield i;
}
co_return 0;
}
Resumable n_generate(int a)
{
auto gen = generate(a);
for (auto value = gen.get(); value != 0; value = gen.get())
{
//std::cout << -value << std::endl; <-- uncommenting this fixes the problem
co_yield -value;
}
co_return 0;
}
int main(int argc, const char* argv[])
{
auto gen_10 = n_generate(10);
for (auto value = gen_10.get(); value != 0; value = gen_10.get())
{
std::cout << value << std::endl;
}
return 0;
}
The output for this code is empty.
If I uncomment the marked line in n_generate, the output will be:
-1
-1
-2
-2
-3
-3
-4
-4
-5
-5
-6
-6
-7
-7
-8
-8
-9
-9
How can I pass the negative values to main without printing them in n_generate function?
I am using gcc-10 on Ubuntu 20.04.
Additional flags in my makefile:
LDLIBS := -lstdc++
CXXFLAGS := -g -std=c++2a -fcoroutines

Why calling via weak_ptr is so slow?

I have read the question What's the performance penalty of weak_ptr? but my own tests show different results.
I'm making delegates with smart pointers. The simple code below reproduces the performance issue with weak_ptr. Can anybody tell me why?
#include <chrono>
#include <functional>
#include <iostream>
#include <memory>
#include <stdint.h>
#include <string>
#include <utility>
struct Foo
{
Foo() : counter(0) { incrStep = 1;}
void bar()
{
counter += incrStep;
}
virtual ~Foo()
{
std::cout << "End " << counter << std::endl;
}
private:
uint64_t counter;
uint64_t incrStep;
};
void pf(const std::string &md, const std::function<void()> &g)
{
const auto st = std::chrono::high_resolution_clock::now();
g();
const auto ft = std::chrono::high_resolution_clock::now();
const auto del = std::chrono::duration_cast<std::chrono::milliseconds>(ft - st);
std::cout << md << " \t: \t" << del.count() << std::endl;
}
And the test:
int main(int , char** )
{
volatile size_t l = 1000000000ULL;
size_t maxCounter = l;
auto a = std::make_shared<Foo>();
std::weak_ptr<Foo> wp = a;
pf("call via raw ptr ", [=](){
for (size_t i = 0; i < maxCounter; ++i)
{
auto p = a.get();
if (p)
{
p->bar();
}
}
});
pf("call via shared_ptr ", [=](){
for (size_t i = 0; i < maxCounter; ++i)
{
if (a)
{
a->bar();
}
}
});
pf("call via weak_ptr ", [=](){
std::shared_ptr<Foo> p;
for (size_t i = 0; i < maxCounter; ++i)
{
p = wp.lock();
if (p)
{
p->bar();
}
}
});
pf("call via shared_ptr copy", [=](){
volatile std::shared_ptr<Foo> p1 = a;
std::shared_ptr<Foo> p;
for (size_t i = 0; i < maxCounter; ++i)
{
p = const_cast<std::shared_ptr<Foo>& >(p1);
if (p)
{
p->bar();
}
}
});
pf("call via mem_fn ", [=](){
auto fff = std::mem_fn(&Foo::bar);
for (size_t i = 0; i < maxCounter; ++i)
{
fff(a.get());
}
});
return 0;
}
Results:
$ ./test
call via raw ptr : 369
call via shared_ptr : 302
call via weak_ptr : 22663
call via shared_ptr copy : 2171
call via mem_fn : 2124
End 5000000000
As you can see, weak_ptr is about 10 times slower than shared_ptr with copying and std::mem_fn, and about 60 times slower than using a raw pointer or shared_ptr.get().
In trying to reproduce your test I realised that the optimizer might be eliminating more than it should. What I did was to utilize random numbers to defeat over-optimization and these results seem realistic with std::weak_ptr being nearly three times slower than the std::shared_ptr or its raw pointer.
I calculate a checksum in each test to ensure they are all doing the same work:
#include <chrono>
#include <cstdlib>
#include <memory>
#include <random>
#include <vector>
#include <iomanip>
#include <iostream>
#define OUT(m) do{std::cout << m << '\n';}while(0)
class Timer
{
using clock = std::chrono::steady_clock;
using microseconds = std::chrono::microseconds;
clock::time_point tsb;
clock::time_point tse;
public:
void start() { tsb = clock::now(); }
void stop() { tse = clock::now(); }
void clear() { tsb = tse; }
friend std::ostream& operator<<(std::ostream& o, const Timer& timer)
{
return o << timer.secs();
}
// return time difference in seconds
double secs() const
{
if(tse <= tsb)
return 0.0;
auto d = std::chrono::duration_cast<microseconds>(tse - tsb);
return double(d.count()) / 1000000.0;
}
};
constexpr auto N = 100000000U;
int main()
{
std::mt19937 rnd{std::random_device{}()};
std::uniform_int_distribution<int> pick{0, 100};
std::vector<int> random_ints;
for(auto i = 0U; i < 1024; ++i)
random_ints.push_back(pick(rnd));
std::shared_ptr<int> sptr = std::make_shared<int>(std::rand() % 100);
int* rptr = sptr.get();
std::weak_ptr<int> wptr = sptr;
Timer timer;
unsigned sum = 0;
sum = 0;
timer.start();
for(auto i = 0U; i < N; ++i)
{
sum += random_ints[i % random_ints.size()] * *sptr;
}
timer.stop();
OUT("sptr: " << sum << " " << timer);
sum = 0;
timer.start();
for(auto i = 0U; i < N; ++i)
{
sum += random_ints[i % random_ints.size()] * *rptr;
}
timer.stop();
OUT("rptr: " << sum << " " << timer);
sum = 0;
timer.start();
for(auto i = 0U; i < N; ++i)
{
sum += random_ints[i % random_ints.size()] * *wptr.lock();
}
timer.stop();
OUT("wptr: " << sum << " " << timer);
}
Compiler flags:
g++ -std=c++14 -O3 -g0 -D NDEBUG -o bin/timecpp src/timecpp.cpp
Example Output:
sptr: 1367265700 1.26869 // shared pointer
rptr: 1367265700 1.26435 // raw pointer
wptr: 1367265700 2.99008 // weak pointer
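The extra cost comes from calling lock() on every iteration, which typically has to update the reference count atomically. When the design allows it, one option is to lock once outside the hot loop and reuse the resulting shared_ptr. The sketch below contrasts the two patterns. The function names are hypothetical, and it makes no claim about timings on any particular platform.

```cpp
#include <memory>
#include <vector>

// Locks the weak_ptr on every iteration, as in the benchmark above.
unsigned sum_lock_each(const std::weak_ptr<int> &wp, const std::vector<int> &xs) {
    unsigned sum = 0;
    for (int x : xs) {
        if (auto sp = wp.lock())  // reference count is updated on each pass
            sum += static_cast<unsigned>(x * *sp);
    }
    return sum;
}

// Locks once and reuses the resulting shared_ptr inside the loop.
unsigned sum_lock_once(const std::weak_ptr<int> &wp, const std::vector<int> &xs) {
    unsigned sum = 0;
    if (auto sp = wp.lock()) {  // single reference count update
        for (int x : xs)
            sum += static_cast<unsigned>(x * *sp);
    }
    return sum;
}
```

Note the semantic difference: the second version keeps the object alive for the whole loop, whereas the first lets it expire between iterations.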

Error: expected an identifier while compiling a program that uses CUSP, an open-source CUDA library

The basic idea is to perform sparse matrix-vector multiplication (SpMV) with CUSP using the Compressed Sparse Row (CSR) format. The code is below.
The code of csr.cu:
#include <string.h>
#include "fileproc.h"
#include <cusp/csr_matrix.h>
#include <cusp/detail/device/spmv/csr_vector.h>
#define VALUETYPE float
#define INDEXTYPE int
using namespace std;
int main( int argc, char *argv[] )
{
string file=argv[1];
INDEXTYPE m;
INDEXTYPE n;
INDEXTYPE nnz;
INDEXTYPE * rowA,*colA;
VALUETYPE * valA;
readFile(file,colA,rowA,valA,m,n,nnz);
// the rest of the code is omitted because there is too much of it
return 0;
}
The code of fileproc.h:
#ifndef _FILEPROC_H_
#define _FILEPROC_H_
#include <string>
#include <iostream>
#include <fstream>
/*triples for storing original data from files */
template <class T, class ITYPE>
struct Triple
{
ITYPE row; // row index
ITYPE col; // col index
T val; // value
};
/*counting the nonzeros along the rows*/
template <typename ITYPE>
ITYPE CumulativeSum (ITYPE * arr, ITYPE size)
{
ITYPE prev;
ITYPE tempnz = 0 ;
for (ITYPE i = 0 ; i < size ; ++i)
{
prev = arr[i];
arr[i] = tempnz;
tempnz += prev ;
}
return (tempnz) ; // return sum
}
/*convert the matrix to triples*/
template <typename T,typename ITYPE>
void Triple_init(std::string fileName,Triple<T,ITYPE> *A)
{
ITYPE flops;
//cout <<"reading input matrix in text(ascii).."<<endl;
int m,n,nnz;
std::ifstream infile(fileName.c_str());
char line[256];
char c=infile.get();
while(c=='%')
{
infile.getline(line,256);
c=infile.get();
}
infile.unget();
infile>>m>>n>>nnz;
flops=nnz*2;
if(infile.is_open())
{
ITYPE cnz=0;
while (!infile.eof() && cnz<nnz)
{
infile>>A[cnz].row>>A[cnz].col>>A[cnz].val;
A[cnz].row--;
A[cnz].col--;
//cout<<A[cnz].row<<" "<<A[cnz].col<<" "<<A[cnz].val<<endl;
++cnz;
}
std::assert(cnz==nnz);
}
infile.close();
}
/*assigned the value to m(row),n(column) and nnz(the number of non-zeros)*/
template <typename ITYPE>
void init_nnz(std::string fileName,ITYPE& m,ITYPE& n,ITYPE& nnz)
{
ITYPE flops;
std::cout <<"reading input matrix in text(ascii).."<<std::endl;
std::ifstream infile(fileName.c_str());
char line[256];
char c=infile.get();
while(c=='%')
{
infile.getline(line,256);
c=infile.get();
}
infile.unget();
infile>>m>>n>>nnz;
infile.close();
}
/*convert to CSR format*/
template <class T,class ITYPE>
void init_csr(Triple<T,ITYPE> * triples,ITYPE *& jc,ITYPE *& ir,T *& num,ITYPE m,ITYPE n,ITYPE size)
{
// Constructing empty Csr objects (size = 0) are not allowed.
assert(size != 0 && n != 0);
ITYPE * w = new ITYPE[m]; // workspace of size n (# of columns)
for(ITYPE k = 0; k < m; ++k)
w[k] = 0;
for (ITYPE k = 0 ; k < size ; ++k)
{
int tmp = triples[k].row;
w [ tmp ]++ ; // column counts (i.e, w holds the "col difference array")
}
if(size > 0)
{
ir[m] = CumulativeSum (w, m) ; // cumulative sum of w
for(ITYPE k = 0; k < m; ++k)
ir[k] = w[k];
ITYPE last;
for (ITYPE k = 0 ; k < size ; ++k)
{
jc[ last = w[ triples[k].row ]++ ] = triples[k].col ;
num[last] = triples[k].val ;
}
}
delete [] w;
}
/*
** convert the file into CSR format
*/
template <class T,class ITYPE>
void readFile(std::string name,ITYPE *& jc,ITYPE *& ir,T *& num,ITYPE & m,ITYPE & n,ITYPE &nnz)
{
init_nnz<ITYPE>(name,m,n,nnz);
Triple<T,ITYPE> * A=new Triple<T,ITYPE>[nnz];
Triple_init<T,ITYPE>(name,A);
init_csr(A,jc,ir,num,m,n,nnz);
}
#endif
The makefile is below:
NVIDIA = $(HOME)/NVIDIA_CUDA-5.0_Samples
CUDA = /usr/local/cuda-5.0
CUSP = /home/taoyuan/setup_package
NVIDINCADD = -I$(NVIDIA)/common/inc
CUDAINCADD = -I$(CUDA)/include
CC = -L/usr/lib64/ -lstdc++
GCCOPT = -O2 -fno-rtti -fno-exceptions
INTELOPT = -O3 -fno-rtti -xW -restrict -fno-alias
SPMV = $(CUSP)/cusp/detail/device/spmv
#DEB = -g
#NVCC = -G
#ARCH = -arch=sm_20
ARCH = -arch=sm_35
csr:csr.cu
nvcc $(DEB) $(NVCC) $(ARCH) $(CC) -lm $(NVIDINCADD) -I$(CUSP) -I$(SPMV) -I$(THRUST) -lcusparse -I./ -o $(@) $(<)
The OS is Red Hat Enterprise Linux Server release 6.2, and the software is CUDA 5.0 and Thrust 1.5.1, which is set up in the directory "/home/taoyuan/setup_package". The GPU is a K20M.
While compiling the program, I got the following error:
fileproc.h(60): error: expected an identifier
Use assert instead of std::assert. assert is a macro, so it is not in the std namespace. It is defined in <cassert>, which fileproc.h should include.
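A minimal sketch of the corrected usage follows. checked_count is a hypothetical helper that mirrors the check in Triple_init, not part of the original code. Because the preprocessor expands assert before name lookup happens, writing std::assert(...) leaves the compiler with std:: followed by the macro's expansion, which is most likely what triggers "expected an identifier".

```cpp
#include <cassert>  // defines the assert macro

// Hypothetical helper mirroring the consistency check in Triple_init.
int checked_count(int cnz, int nnz) {
    assert(cnz == nnz);  // unqualified: macros are not members of any namespace
    return cnz;
}
```

With NDEBUG defined (as in optimized builds), the assert expands to nothing and the check is skipped.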