When trying to retrieve data from an af::array (arrayfire) from the device via host(), my output data on the host is wrong (i.e. wrong values). For testing that, I wrote a small code sample (based on https://stackoverflow.com/a/29212923/2546099):
int main(void) {
size_t vector_size = 16;
af::array in_test_array = af::constant(1., vector_size), out_test_array = af::constant(0., vector_size);
af_print(in_test_array);
double *local_data_ptr = new double[vector_size]();
for(int i = 0; i < vector_size; ++i)
std::cout << local_data_ptr[i] << '\t';
std::cout << '\n';
in_test_array.host(local_data_ptr);
for(int i = 0; i < vector_size; ++i)
std::cout << local_data_ptr[i] << '\t';
std::cout << '\n';
delete[] local_data_ptr;
out_test_array = in_test_array;
af_print(out_test_array);
return 0;
}
My output is
in_test_array
[16 1 1 1]
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0.007813 0.007813 0.007813 0.007813 0.007813 0.007813 0.007813 0.007813 0 0 0 0 0 0 0 0
out_test_array
[16 1 1 1]
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
Why are half the values in the pointer set to 0.007813, and not all values to 1? When changing the default value for in_test_array to 2, half the values are set to 2, and for 3 those values are set to 32. Why does that happen?
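The data types on the ArrayFire side and the C++ side do not match. af::constant() creates an f32 (float) array unless you pass a dtype, but host() copies the raw bytes into your double buffer without converting them. The 16 floats occupy only 64 bytes, so they fill just the first 8 doubles, and each pair of floats is reinterpreted as a single double; the remaining 8 doubles keep their initial value of 0.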
For float use:
af::array in_test_array = af::constant(1., vector_size),
out_test_array = af::constant(0., vector_size);
float *local_data_ptr = new float[vector_size]();
For double use:
af::array in_test_array = af::constant(1., vector_size, f64),
out_test_array = af::constant(0., vector_size, f64);
double *local_data_ptr = new double[vector_size]();
In both cases above, ArrayFire will return 1.0 in the local_data_ptr buffer, just with different data types.
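The values 0.007813, 2 and 32 from the question can be reproduced without ArrayFire. The following is a minimal Python sketch (not part of the original code; the helper name read_floats_as_doubles is made up for illustration) that packs 16 float32 values, copies the bytes unconverted into a zeroed buffer sized for 16 doubles, and reads that buffer back as doubles. It assumes little-endian IEEE 754 layout, which is what typical x86 hosts use.

```python
import struct

def read_floats_as_doubles(value, count=16):
    """Pack `count` float32 copies of `value`, then read the bytes back as `count` doubles."""
    float_bytes = struct.pack(f"<{count}f", *([value] * count))
    # Zero-initialised buffer sized for `count` doubles, like new double[count]().
    buffer = bytearray(count * 8)
    # Raw byte copy without any conversion; this fills only the first half of the buffer.
    buffer[:len(float_bytes)] = float_bytes
    return list(struct.unpack(f"<{count}d", bytes(buffer)))

for value in (1.0, 2.0, 3.0):
    doubles = read_floats_as_doubles(value)
    print(value, [round(d, 6) for d in doubles])
```

The first 8 entries come out as roughly 0.007813, 2 and 32 for the inputs 1, 2 and 3, and the last 8 entries stay 0, which matches the pattern reported in the question.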
I am learning Python. I want to calculate the correlation between values. Below is my data, which is a dictionary.
My_data = {1: [1450.0, -80.0, 840.0, -220.0, 630.0, 780.0, -1140.0], 2: [1450.0, -80.0, 840.0, -220.0, 630.0, 780.0, -1140.0],3:[ 720.0, -230.0, 460.0, 220.0, 710.0, -460.0, 90.0] }
This is what I expect to have in return.
      1     2     3
1     1  0.69  0.77
2           1  0.54
3                 1
This is the code I tried. I get TypeError: unsupported operand type(s) for /: 'list' and 'long'.
I am not sure what went wrong. I would appreciate it if somebody could explain the error and help me get the desired result.
my_array=np.array(My_data .values())
Correlation = np.corrcoef(my_array,my_array)
Case 1: if you are open to using pandas
Using pandas (which is built on top of NumPy), you can proceed as follows:
In [55]: import pandas as pd
In [56]: df = pd.DataFrame.from_dict(My_data, orient='index').T
In [57]: df.corr(method='pearson')
Out[57]:
1 2 3
1 1.000000 1.000000 0.384781
2 1.000000 1.000000 0.121978
3 0.384781 0.121978 1.000000
In [58]: df.corr(method='kendall')
Out[58]:
1 2 3
1 1.000000 1.000000 0.333333
2 1.000000 1.000000 0.240385
3 0.333333 0.240385 1.000000
In [59]: df.corr(method='spearman')
Out[59]:
1 2 3
1 1.000000 1.00000 0.464286
2 1.000000 1.00000 0.327370
3 0.464286 0.32737 1.000000
Explanation:
The following line creates a pandas.DataFrame from the dictionary My_data
df = pd.DataFrame.from_dict(My_data, orient='index').T
Which looks like this:
In [60]: df
Out[60]:
1 2 3
0 1450.0 1450.0 720.0
1 -80.0 -80.0 -230.0
2 840.0 840.0 460.0
3 -220.0 -220.0 220.0
4 630.0 630.0 710.0
5 780.0 780.0 -460.0
6 -1140.0 -1140.0 90.0
7 NaN 450.0 -640.0
8 NaN 730.0 870.0
9 NaN -810.0 -290.0
10 NaN 390.0 -2180.0
11 NaN -220.0 -790.0
12 NaN -1640.0 65.0
13 NaN -590.0 70.0
14 NaN -145.0 460.0
15 NaN -420.0 NaN
16 NaN 620.0 NaN
17 NaN 450.0 NaN
18 NaN -90.0 NaN
19 NaN 990.0 NaN
20 NaN -705.0 NaN
Then df.corr() computes the pairwise correlation between columns, excluding the NaN entries for each pair.
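To make "pairwise correlation between columns" concrete, here is a small standard-library sketch of the Pearson coefficient applied to every pair of lists in My_data from the question. It is only an illustration of the formula, not how pandas implements corr(); the helper name pearson is made up, and the sketch assumes all lists have the same length and contain no missing values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

My_data = {
    1: [1450.0, -80.0, 840.0, -220.0, 630.0, 780.0, -1140.0],
    2: [1450.0, -80.0, 840.0, -220.0, 630.0, 780.0, -1140.0],
    3: [720.0, -230.0, 460.0, 220.0, 710.0, -460.0, 90.0],
}

keys = sorted(My_data)
# Full symmetric matrix: entry [i][j] is the correlation of series keys[i] with keys[j].
matrix = [[pearson(My_data[a], My_data[b]) for b in keys] for a in keys]
for key, row in zip(keys, matrix):
    print(key, [round(r, 6) for r in row])
```

Because series 1 and 2 are identical in the posted data, their correlation is 1, and both correlate with series 3 by the same amount.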
Case 2: if you want a pure numpy solution
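You need to convert your data into a numpy.ndarray first; then you can compute the correlation as shown below. Here new_data is assumed to be a dictionary whose lists all have the same length, such as My_data above. On Python 3, wrap the values in list(...) first, because dict.values() returns a view rather than a list.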
In [91]: np.corrcoef(np.asarray(new_data.values()))
Out[91]:
array([[ 1. , 1. , 0.38478131],
[ 1. , 1. , 0.38478131],
[ 0.38478131, 0.38478131, 1. ]])
I am trying to diagonalize the matrix:
In my analysis, I set $\hbar = 1$. The code is:
MODULE FUNCTION_CONTAINER
IMPLICIT NONE
SAVE
INTEGER, PARAMETER :: DBL = SELECTED_REAL_KIND(P = 15,R = 200)
COMPLEX(KIND = DBL), PARAMETER :: IMU = (0.0D0, 1.0D0)
REAL(KIND = DBL), PARAMETER :: S = 1.0D0
INTEGER, PARAMETER :: TEMP1 = NINT((2.0D0 * S) + 1.0D0)
INTEGER, PARAMETER :: DIMJ = TEMP1
INTEGER, PARAMETER :: TEMP2 = TEMP1*TEMP1
INTEGER, PARAMETER :: DIMMAT = TEMP2
CONTAINS
INTEGER FUNCTION KRONDELTAR(K,L)
IMPLICIT NONE
REAL(KIND = DBL), INTENT(IN)::K,L
REAL(KIND = DBL) :: TEMP
TEMP = DABS(K - L)
IF (TEMP < 0.000001D0) THEN
KRONDELTAR = 1
ELSE
KRONDELTAR = 0
END IF
END FUNCTION KRONDELTAR
SUBROUTINE MATJplus(MATOUT)
IMPLICIT NONE
COMPLEX(KIND = DBL),DIMENSION(DIMJ,DIMJ),INTENT(OUT)::MATOUT
INTEGER::K,L
REAL(KIND = DBL)::M,MP
DO K = 1,DIMJ
DO L = 1,DIMJ
MP = (S + 1.0D0) - L
M = (S + 1.0D0) - K
MATOUT(K,L) = DSQRT(S * (S + 1.0D0) - M * (M + 1.0D0)) * KRONDELTAR(MP,M + 1)
END DO
END DO
END SUBROUTINE MATJplus
SUBROUTINE MATJminus(MATOUT)
IMPLICIT NONE
COMPLEX(KIND = DBL),DIMENSION(DIMJ,DIMJ),INTENT(OUT)::MATOUT
INTEGER::K,L
REAL(KIND = DBL)::MP,M
DO K = 1,DIMJ
DO L = 1,DIMJ
MP = (S + 1) - L
M = (S + 1) - K
MATOUT(K,L) = DSQRT(S* (S + 1.0D0) - M * (M - 1.0D0)) * KRONDELTAR(MP,M - 1)
END DO
END DO
END SUBROUTINE MATJminus
SUBROUTINE MATJy(MATOUT)
IMPLICIT NONE
COMPLEX(KIND = DBL),DIMENSION(DIMJ,DIMJ),INTENT(OUT)::MATOUT
COMPLEX(KIND = DBL),DIMENSION(DIMJ,DIMJ)::Jp,Jm
CALL MATJplus(Jp)
CALL MATJminus(Jm)
MATOUT = (Jp - Jm)/(2.0D0 * IMU)
END SUBROUTINE MATJy
SUBROUTINE DIAGONALIZEJy(EIGENSTATESJy,EIGENVALUESJY)
IMPLICIT NONE
COMPLEX(KIND = DBL),DIMENSION(DIMJ,DIMJ),INTENT(OUT)::EIGENSTATESJy
REAL(KIND = DBL), DIMENSION(DIMJ),INTENT(OUT)::EIGENVALUESJY
COMPLEX(KIND = DBL),DIMENSION(DIMJ,DIMJ)::JyTEMP,Jy
COMPLEX(KIND = DBL),DIMENSION(2*DIMJ)::D1
REAL(KIND = DBL),DIMENSION(3*DIMJ - 2)::D2
INTEGER::D3
CALL MATJy(Jy)
JyTEMP = Jy
CALL ZHEEV('V','U',DIMJ,JyTEMP,DIMJ,EIGENVALUESJy,D1,2*DIMJ,D2,D3)
EIGENSTATESJy = JyTEMP
END SUBROUTINE DIAGONALIZEJy
END MODULE FUNCTION_CONTAINER
PROGRAM TEST
USE FUNCTION_CONTAINER
IMPLICIT NONE
COMPLEX(KIND = DBL), DIMENSION(DIMJ,DIMJ) :: EIGENSTATESJy, MatrixJy
REAL(KIND = DBL), DIMENSION(DIMJ) :: EIGENVALUESJy
CALL DIAGONALIZEJy(EIGENSTATESJy,EIGENVALUESJY)
CALL MATJy(MatrixJy)
OPEN(1, FILE = 'EIGENVALUESJy.DAT')
OPEN(2, FILE = 'EIGENSTATESJyREAL.DAT')
OPEN(3,FILE = 'EIGENSTATESJyCOMPLEX.DAT')
WRITE (1,*) EIGENVALUESJy
WRITE (2,*) REAL(EIGENSTATESJy)
WRITE (3,*) AIMAG(EIGENSTATESJy)
CLOSE(1)
CLOSE(2)
CLOSE(3)
END PROGRAM TEST
Up to the subroutine DIAGONALIZEJy, I am simply constructing the matrix stated above. One can easily check that Fortran constructs it correctly by writing out the result of the subroutine MATJy. I transfer the data to Mathematica. The results are:
{{-1., -9.19403*10^-17, 1.}}
This is the list of eigenvalues. The list of eigenvectors is:
{{-0.5 + 0. I, 0. - 0.707107 I, 0.5 + 0. I}, {0.707107 + 0. I,
0. + 1.04083*10^-16 I, 0.707107 + 0. I}, {-0.5 + 0. I,
0. + 0.707107 I, 0.5 + 0. I}}
The first eigenvector corresponds to the first eigenvalue (at least that's what I get by printing the column vectors from EigenvectorsJy one by one).
Clearly, the result is wrong. See:
http://www.wolframalpha.com/widgets/view.jsp?id=9aa01caf50c9307e9dabe159c9068c41
I hope the link shows the results for the eigenvalues problem done using a widget. The eigenvalues are correct but all the eigenvectors are way off.
Also, when I run only the subroutine that diagonalizes the matrix in my main program, which contains a whole host of other stuff, the results are:
{{0.885212, 0., -0.920222}}
and
{{0.0439691 + 0. I, -0.388918 + 0. I, 0.5 + 0. I}, {0.707107 + 0. I,
0. + 1.04083*10^-16 I, 0.707107 + 0. I}, {-0.5 + 0. I,
0. + 0.707107 I, 0.5 + 0. I}}
As you can see, the nonzero eigenvalues are a bit off, and so are the eigenvectors (which are still incorrect). Why is the main program giving a different result, perhaps exacerbating the error? And why am I getting wrong answers in the first place (in the minimal example above)?
Edit: Apparently, the link doesn't show the results so here's a snippet:
In short, your Jy matrix in the code seems to be the complex conjugate of what is desired (i.e., the image posted in the Question), which results in the eigenvectors that are complex conjugate of the correct ones.
The above error seems to originate from the OP's assumption that list-directed output (as write(*,*) A) prints the matrix elements in the "row-major" order, while in fact they are printed in the "column-major" order (see the comments below). By noting this and correcting the program accordingly, I think the program will work as expected.
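The effect of the storage order can be illustrated with a short Python sketch (the helper names are invented for this example). It flattens a matrix column by column, the way Fortran stores an array and writes it with list-directed output, and then reshapes the flat list as if it had been written row by row.

```python
def column_major(matrix):
    """Flatten a matrix column by column, as Fortran does for list-directed output."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(cols) for r in range(rows)]

def reshape_row_major(flat, rows, cols):
    """Rebuild a matrix from a flat list, assuming it was written row by row."""
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

a = [[11, 12],
     [21, 22]]

flat = column_major(a)
misread = reshape_row_major(flat, 2, 2)
print(flat)     # [11, 21, 12, 22]
print(misread)  # [[11, 21], [12, 22]], i.e. the transpose of a
```

Reading column-major output as row-major therefore yields the transpose. For a Hermitian matrix the transpose equals the complex conjugate, which is consistent with the conjugated Jy described here.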
More specifically, adding the following utility routine to print a matrix
subroutine printmat( msg, mat )
implicit none
character(*), intent(in) :: msg
complex(DBL), intent(in) :: mat( dimJ, dimJ )
integer i1, i2
print *
print *, msg
do i1 = 1, dimJ
print "(3('(',f10.6,',',f10.6,' ) '))", ( mat( i1, i2 ), i2 = 1,dimJ )
enddo
end subroutine
and checking the value of Jp, Jm, Jy in the subroutine MATJy()
Jp:
( 0.000000, 0.000000 ) ( 0.000000, 0.000000 ) ( 0.000000, 0.000000 )
( 1.414214, 0.000000 ) ( 0.000000, 0.000000 ) ( 0.000000, 0.000000 )
( 0.000000, 0.000000 ) ( 1.414214, 0.000000 ) ( 0.000000, 0.000000 )
Jm:
( 0.000000, 0.000000 ) ( 1.414214, 0.000000 ) ( 0.000000, 0.000000 )
( 0.000000, 0.000000 ) ( 0.000000, 0.000000 ) ( 1.414214, 0.000000 )
( 0.000000, 0.000000 ) ( 0.000000, 0.000000 ) ( 0.000000, 0.000000 )
Jy * sqrt(2):
( 0.000000, 0.000000 ) ( 0.000000, 1.000000 ) ( 0.000000, 0.000000 )
( 0.000000, -1.000000 ) ( 0.000000, 0.000000 ) ( 0.000000, 1.000000 )
( 0.000000, 0.000000 ) ( 0.000000, -1.000000 ) ( 0.000000, 0.000000 )
eigenvaluesJy(1) = -1.000000
eigvec:
( -0.500000, 0.000000 )
( 0.000000, -0.707107 )
( 0.500000, 0.000000 )
eigenvaluesJy(2) = -0.000000
eigvec:
( 0.707107, 0.000000 )
( 0.000000, 0.000000 )
( 0.707107, 0.000000 )
eigenvaluesJy(3) = 1.000000
eigvec:
( -0.500000, 0.000000 )
( 0.000000, 0.707107 )
( 0.500000, 0.000000 )
we see that the above Jy matrix is the complex conjugate of the desired matrix (given as an image in the Question). The reason seems to be that the Jp and Jm matrices are given as the transpose of the correct ones (according to some pages like this and this). For example, if we change their index as
SUBROUTINE MATJplus(MATOUT)
IMPLICIT NONE
COMPLEX(KIND = DBL),DIMENSION(DIMJ,DIMJ),INTENT(OUT)::MATOUT
INTEGER::K,L
REAL(KIND = DBL)::M,MP
DO K = 1,DIMJ
DO L = 1,DIMJ
MP = (S + 1.0D0) - L !! 1, 0, -1 ("m_prime")
M = (S + 1.0D0) - K !! 1, 0, -1 ("m")
!>>> Here, we swap the indices K and L in the LHS
!! MATOUT(K,L) = DSQRT(S * (S + 1.0D0) - M * (M + 1.0D0)) * KRONDELTAR(MP, M + 1)
MATOUT(L,K) = DSQRT(S * (S + 1.0D0) - M * (M + 1.0D0)) * KRONDELTAR(MP, M + 1)
END DO
END DO
call printmat( "Jplus:", matout )
END SUBROUTINE
(and modifying MATJminus() similarly), we obtain the expected result:
Jp:
( 0.000000, 0.000000 ) ( 1.414214, 0.000000 ) ( 0.000000, 0.000000 )
( 0.000000, 0.000000 ) ( 0.000000, 0.000000 ) ( 1.414214, 0.000000 )
( 0.000000, 0.000000 ) ( 0.000000, 0.000000 ) ( 0.000000, 0.000000 )
Jm:
( 0.000000, 0.000000 ) ( 0.000000, 0.000000 ) ( 0.000000, 0.000000 )
( 1.414214, 0.000000 ) ( 0.000000, 0.000000 ) ( 0.000000, 0.000000 )
( 0.000000, 0.000000 ) ( 1.414214, 0.000000 ) ( 0.000000, 0.000000 )
Jy * sqrt(2):
( 0.000000, 0.000000 ) ( 0.000000, -1.000000 ) ( 0.000000, 0.000000 )
( 0.000000, 1.000000 ) ( 0.000000, 0.000000 ) ( 0.000000, -1.000000 )
( 0.000000, 0.000000 ) ( 0.000000, 1.000000 ) ( 0.000000, 0.000000 )
eigenvaluesJy(1) = -1.000000
eigvec:
( -0.500000, 0.000000 )
( 0.000000, 0.707107 )
( 0.500000, 0.000000 )
eigenvaluesJy(2) = -0.000000
eigvec:
( 0.707107, 0.000000 )
( 0.000000, -0.000000 )
( 0.707107, 0.000000 )
eigenvaluesJy(3) = 1.000000
eigvec:
( -0.500000, 0.000000 )
( 0.000000, -0.707107 )
( 0.500000, 0.000000 )
For convenience, here are some matrices taken from the above pages (which can be compared directly with the above Jp, Jm, Jy):
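Those matrices can also be rebuilt numerically. The following is a standard-library Python sketch (not the Fortran from the question; the function names are made up) that constructs J+ and J- for S = 1 with rows indexed by m' and columns by m, in the basis order m = 1, 0, -1, forms Jy = (J+ - J-)/(2i), and checks the eigenvector reported above for the eigenvalue -1.

```python
import math

S = 1.0
M_VALUES = [1.0, 0.0, -1.0]  # basis order m = 1, 0, -1, as in the Fortran code
N = len(M_VALUES)

def ladder(sign):
    """Return J+ (sign=+1) or J- (sign=-1); rows are indexed by m', columns by m."""
    mat = [[0j] * N for _ in range(N)]
    for col, m in enumerate(M_VALUES):
        for row, m_prime in enumerate(M_VALUES):
            if abs(m_prime - (m + sign)) < 1e-6:  # Kronecker delta(m', m +/- 1)
                mat[row][col] = complex(math.sqrt(S * (S + 1) - m * (m + sign)))
    return mat

def matvec(mat, vec):
    return [sum(mat[r][c] * vec[c] for c in range(N)) for r in range(N)]

jp, jm = ladder(+1), ladder(-1)
jy = [[(jp[r][c] - jm[r][c]) / 2j for c in range(N)] for r in range(N)]

# Same construction with transposed ladder matrices, as in the original code.
jy_transposed = [[(jp[c][r] - jm[c][r]) / 2j for c in range(N)] for r in range(N)]

# Eigenvector listed above for eigenvalue -1 after the index swap.
v = [-0.5, 1j / math.sqrt(2), 0.5]
residual = max(abs(a - (-1.0) * b) for a, b in zip(matvec(jy, v), v))

for row in jy:
    print([complex(round(z.real, 6), round(z.imag, 6)) for z in row])
print("residual:", residual)
```

The residual is at the level of rounding error, and jy_transposed equals the element-wise complex conjugate of jy, in line with the explanation above.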
I compiled the code below on CentOS 5.3 and CentOS 6.3:
#include <pthread.h>
#include <list>
#include <unistd.h>
#include <iostream>
using namespace std;
pthread_mutex_t _mutex;
pthread_spinlock_t spinlock;
list<int *> _task_list;
void * run(void*);
int main()
{
int worker_num = 3;
pthread_t pids[worker_num];
pthread_mutex_init(&_mutex, NULL);
for (int worker_i = 0; worker_i < worker_num; ++worker_i)
{
pthread_create(&(pids[worker_i]), NULL, run, NULL);
}
sleep(14);
}
void *run(void * args)
{
int *recved_info;
long long start;
while (true)
{
pthread_mutex_lock(&_mutex);
if (_task_list.empty())
{
recved_info = 0;
}
else
{
recved_info = _task_list.front();
_task_list.pop_front();
}
pthread_mutex_unlock(&_mutex);
if (recved_info == 0)
{
int f = usleep(1);
continue;
}
}
}
When running on 5.3, you can't even find the process in top; CPU usage is around 0%. But on CentOS 6.3, it's about 20% with 6 threads on a 4-core CPU.
So I checked a.out with time and strace; the results are roughly as follows:
On 5.3:
real 0m14.003s
user 0m0.001s
sys 0m0.001s
On 6.3:
real 0m14.002s
user 0m1.484s
sys 0m1.160s
the strace:
on 5.3:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 91.71    0.002997           0     14965           nanosleep
  8.29    0.000271         271         1           execve
  0.00    0.000000           0         5           read
  0.00    0.000000           0        10         4 open
  0.00    0.000000           0         6           close
  0.00    0.000000           0         4         4 stat
  0.00    0.000000           0         6           fstat
  0.00    0.000000           0        22           mmap
  0.00    0.000000           0        13           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         3           rt_sigaction
  0.00    0.000000           0         3           rt_sigprocmask
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         3           clone
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0        38         4 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         4           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.003268                 15092        13 total
on 6.3:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.99    1.372813          36     38219           nanosleep
  0.01    0.000104           0       409        43 futex
  0.00    0.000000           0         5           read
  0.00    0.000000           0         6           open
  0.00    0.000000           0         6           close
  0.00    0.000000           0         6           fstat
  0.00    0.000000           0        22           mmap
  0.00    0.000000           0        15           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         3           rt_sigaction
  0.00    0.000000           0         3           rt_sigprocmask
  0.00    0.000000           0         7         7 access
  0.00    0.000000           0         3           clone
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         4           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    1.372917                 38716        50 total
The time and strace results come from different runs, so the numbers differ a little, but I think they still show the trend.
I checked the kernel config options CONFIG_HIGH_RES_TIMERS, CONFIG_HPET and CONFIG_HZ:
On 5.3:
$ cat /boot/config-`uname -r` |grep CONFIG_HIGH_RES_TIMERS
$ cat /boot/config-`uname -r` |grep CONFIG_HPET
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
# CONFIG_HPET_MMAP is not set
$ cat /boot/config-`uname -r` |grep CONFIG_HZ
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
On 6.3:
$ cat /boot/config-`uname -r` |grep CONFIG_HIGH_RES_TIMERS
CONFIG_HIGH_RES_TIMERS=y
$ cat /boot/config-`uname -r` |grep CONFIG_HPET
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_HPET=y
CONFIG_HPET_MMAP=y
$ cat /boot/config-`uname -r` |grep CONFIG_HZ
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
In fact, I also tried the code on Arch on ARM and on xubuntu13.04-amd64-desktop; both behave the same as CentOS 6.3.
So what can I do to figure out the reason for the different CPU usage?
Does it have anything to do with the kernel config?
You're correct, it has to do with the kernel config. usleep(1) will try to sleep for one microsecond. Before high resolution timers, it was not possible to sleep for less than a jiffy (in your case HZ=1000 so 1 jiffy == 1 millisecond).
On CentOS 5.3 which does not have these high resolution timers, you would sleep between 1ms and 2ms[1]. On CentOS 6.3 which has these timers, you're sleeping for close to one microsecond. That's why you're using more cpu on this platform: you're simply polling your task list 500-1000 times more.
If you change the code to usleep(1000), CentOS 5.3 will behave the same as before, while the CPU time on CentOS 6.3 will decrease to the same ballpark as the program running on CentOS 5.3.
There is a full discussion of this in the Linux manual: run man 7 time.
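The "between 1ms and 2ms" figure follows from simple arithmetic. Below is a deliberately simplified Python model (the function name is hypothetical, and real kernels have further effects that this ignores) in which a sleep can only end on a jiffy tick: the request is rounded up to whole jiffies, and up to one more jiffy is added because the sleep may start anywhere inside the current tick.

```python
import math

def tick_based_sleep_bounds_us(requested_us, hz):
    """Bounds in microseconds for a sleep when timers only fire on jiffy ticks (simplified model)."""
    jiffy_us = 1_000_000 / hz
    whole_jiffies = math.ceil(requested_us / jiffy_us)
    lower = whole_jiffies * jiffy_us
    # One extra tick, since the sleep may begin anywhere within the current jiffy.
    upper = lower + jiffy_us
    return lower, upper

bounds = tick_based_sleep_bounds_us(1, 1000)
print(bounds)  # (1000.0, 2000.0)
```

With HZ=1000, a 1 microsecond request therefore costs at least a thousand times more wall-clock time per loop iteration than the requested interval, which is why the polling loop is nearly free on the tick-based kernel.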
Note that your code should use condition variables instead of polling your task list at a certain time interval. That's a more efficient and clean way to do what you're doing.
Also, your main should really join the threads instead of just sleeping for 14 seconds.
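Both suggestions can be sketched as follows. This is a Python analogue using threading.Condition, not a drop-in replacement for the pthread code; the function name run_workers and the "multiply by 2" work item are invented for illustration. In C or C++ the corresponding primitives would be pthread_cond_wait, pthread_cond_signal and pthread_join.

```python
import threading
from collections import deque

def run_workers(items, worker_count=3):
    """Process items with worker threads that block on a condition variable instead of polling."""
    tasks = deque()
    results = []
    cond = threading.Condition()
    done = False  # set to True once no more tasks will be added

    def worker():
        while True:
            with cond:
                # Block until there is work or shutdown is requested; no timed polling.
                while not tasks and not done:
                    cond.wait()
                if not tasks:
                    return  # queue drained and shutdown requested
                item = tasks.popleft()
            processed = item * 2  # stand-in for real work, done outside the lock
            with cond:
                results.append(processed)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for item in items:
        with cond:
            tasks.append(item)
            cond.notify()  # wake one waiting worker
    with cond:
        done = True
        cond.notify_all()  # let every worker observe the shutdown flag
    for t in threads:
        t.join()  # wait for the workers instead of sleeping for a fixed time
    return sorted(results)

print(run_workers(range(10)))
```

Idle workers consume no CPU time while waiting, regardless of the timer resolution of the kernel.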
[1] There is one exception. If your application were running under a realtime scheduling policy (SCHED_FIFO or SCHED_RR), it would busy-wait instead of sleeping, in order to sleep close to the right amount. But by default you need root privileges for that.