My code goes like this:
if (x[b] != x[c] && x[a] != x[b])
{
if(x[a] != x[b] != x[c] != x[d])
{
if(y[a] != y[b] != y[c] != y[d])
{
int m = (y[a] - y[b])/(x[a] - x[b]);
int b = y[a] - x[a] * m;
if((y[c] - x[c]*m > b && y[d] - x[d]*m > b)|| (y[c] - x[c]*m < b && y[d] - x[d]*m < b)){
m = (y[b] - y[c])/(x[b] - x[c]);
b = y[b] - x[b] * m;
if((y[a] - x[a]*m > b && y[d] - x[d]*m > b)|| (y[a] - x[a]*m < b && y[d] - x[d]*m < b))
{
int xx[4];
int yy[4];
xx[0] = x[a];
xx[1] = x[b];
xx[2] = x[c];
xx[3] = x[d];
yy[0] = y[a];
yy[1] = y[b];
yy[2] = y[c];
yy[3] = y[d];
if(c == 1)
{
c = 0;
pt = povrsina(xx, yy);
pc = p - pt;
}
pv = povrsina(xx, yy);
pr = p - pv;
if(pr < pc)
{
pc = pr;
pt = pv;
}
else if(pr == pc)
{
pt = max(pt, pv);
}
}
}
}
}
}
The input I have tried is this:
5
30
0 0
10 0
0 10
10 10
7 3
My program crashes with an error code for Division by zero in line 174:
m = (y[b] - y[c])/(x[b] - x[c]);
So I added the first if statement checking
if x[b] != x[c]
but it still crashes for some reason with the same error code and in the same line.
The following works correctly.
int n = 0;
double p = 0;
scanf("%d\n%lf", &n, &p);
int *x = (int *) malloc(sizeof(int) * n);
int *y = (int *) malloc(sizeof(int) * n);
for(int i = 0; i < n; i++){
scanf("%d %d", &x[i], &y[i]);
}
for(int i = 0; i < n; i++){
printf("%d %d\n", x[i], y[i]);
}
See the demo.
Update
Your code is crashing because of the following line.
int m = (y[a] - y[b])/(x[a] - x[b]);
Think if the value of a, b, c, d is 0, 2, 3, 4, then the denominator term (x[a] - x[b]) represents (x[0] - x[2]) which is equal to zero, right? Because the 1st and 3rd input is 0 0 and 0 10.
If you divide a value by zero, then it becomes infinity and your program will crash at that time. While writing statement with division operation, make sure the divisor is not zero.
Related
I was learning MO's Algorithm. In that I found a question. In which we have to make a program to take input n for n nodes of a tree then n-1 pairs of u and v denoting the connection between node u and node v. After that giving the n node values.
Then we will ask q queries. For each query we take input of k and l which denote the two nodes of that tree. Now we have to find the product of all the nodes in the path of k and l (including k and l).
I want to use MO's algorithm. https://codeforces.com/blog/entry/43230
But I am unable to make the code. Can anybody help me out in this.
The basic code for that would be:
int n, q;
int nxt[ N ], to[ N ], hd[ N ];
struct Que{
int u, v, id;
} que[ N ];
void init() {
// read how many nodes and how many queries
cin >> n >> q;
// read the edge of tree
for ( int i = 1 ; i < n ; ++ i ) {
int u, v; cin >> u >> v;
// save the tree using adjacency list
nxt[ i << 1 | 0 ] = hd[ u ];
to[ i << 1 | 0 ] = v;
hd[ u ] = i << 1 | 0;
nxt[ i << 1 | 1 ] = hd[ v ];
to[ i << 1 | 1 ] = u;
hd[ v ] = i << 1 | 1;
}
for ( int i = 0 ; i < q ; ++ i ) {
// read queries
cin >> que[ i ].u >> que[ i ].v;
que[ i ].id = i;
}
}
int dfn[ N ], dfn_, block_id[ N ], block_;
int stk[ N ], stk_;
void dfs( int u, int f ) {
dfn[ u ] = dfn_++;
int saved_rbp = stk_;
for ( int v_ = hd[ u ] ; v_ ; v_ = nxt[ v_ ] ) {
if ( to[ v_ ] == f ) continue;
dfs( to[ v_ ], u );
if ( stk_ - saved_rbp < SQRT_N ) continue;
for ( ++ block_ ; stk_ != saved_rbp ; )
block_id[ stk[ -- stk_ ] ] = block_;
}
stk[ stk_ ++ ] = u;
}
bool inPath[ N ];
void SymmetricDifference( int u ) {
if ( inPath[ u ] ) {
// remove this edge
} else {
// add this edge
}
inPath[ u ] ^= 1;
}
void traverse( int& origin_u, int u ) {
for ( int g = lca( origin_u, u ) ; origin_u != g ; origin_u = parent_of[ origin_u ] )
SymmetricDifference( origin_u );
for ( int v = u ; v != origin_u ; v = parent_of[ v ] )
SymmetricDifference( v );
origin_u = u;
}
void solve() {
// construct blocks using dfs
dfs( 1, 1 );
while ( stk_ ) block_id[ stk[ -- stk_ ] ] = block_;
// re-order our queries
sort( que, que + q, [] ( const Que& x, const Que& y ) {
return tie( block_id[ x.u ], dfn[ x.v ] ) < tie( block_id[ y.u ], dfn[ y.v ] );
} );
// apply mo's algorithm on tree
int U = 1, V = 1;
for ( int i = 0 ; i < q ; ++ i ) {
pass( U, que[ i ].u );
pass( V, que[ i ].v );
// we could our answer of que[ i ].id
}
}
This problem is a slight modification of the blog that you have shared.
Problem Tags:- MO's Algorithm, Trees, LCA, Binary Lifting, Sieve, Precomputation, Prime Factors
Precomputations:- Just we need to do some precomputations with seiveOfErothenesis to store the highest prime factor of each element possible in input constraints. Then using this we will store all the prime factors and their powers for each element in input array in another matrix.
Observation:- with the constraints you can see the there can be very few such primes possible for each element. For an element (10^6) there can be a maximum of 7 prime factors possible.
Modify MO Algo Given in blog:- Now in our compute method we just need to maintain a map that will store the current count of the prime factor. While adding or subtracting each element in solving the queries we will iterate on the prime factors of that element and divide our result(storing total no. of factors) with the old count of that prime and then update the count of that prime and the multiple our result with the new count.(This will be O(7) max for each addition/subtraction).
Complexity:- O(T * ((N + Q) * sqrt(N) * F)) where F is 7 in our case. F is the complexity of your check method().
T - no of test cases in input file.
N - the size of your input array.
Q - No. of queries.
Below is an implementation of the above approach in JAVA. computePrimePowers() and check() are the methods you would be interested in.
import java.util.*;
import java.io.*;
public class Main {
static int BLOCK_SIZE;
static int ar[];
static ArrayList<Integer> graph[];
static StringBuffer sb = new StringBuffer();
static boolean notPrime[] = new boolean[1000001];
static int hpf[] = new int[1000001];
static void seive(){
notPrime[0] = true;
notPrime[1] = true;
for(int i = 2; i < 1000001; i++){
if(!notPrime[i]){
hpf[i] = i;
for(int j = 2 * i; j < 1000001; j += i){
notPrime[j] = true;
hpf[j] = i;
}
}
}
}
static long modI[] = new long[1000001];
static void computeModI() {
for(int i = 1; i < 1000001; i++) {
modI[i] = pow(i, 1000000005);
}
}
static long pow(long x, long y) {
if (y == 0)
return 1;
long p = pow(x, y / 2);
p = (p >= 1000000007) ? p % 1000000007 : p;
p = p * p;
p = (p >= 1000000007) ? p % 1000000007 : p;
if ((y & 1) == 0)
return p;
else {
long tt = x * p;
return (tt >= 1000000007) ? tt % 1000000007 : tt;
}
}
public static void main(String[] args) throws Exception {
Reader s = new Reader();
int test = s.nextInt();
seive();
computeModI();
for(int ii = 0; ii < test; ii++){
int n = s.nextInt();
lcaTable = new int[19][n + 1];
graph = new ArrayList[n + 1];
arrPrimes = new int[n + 1][7][2];
primeCnt = new int[1000001];
visited = new int[n + 1];
ar = new int[n + 1];
for(int i = 0; i < graph.length; i++) graph[i] = new ArrayList<>();
for(int i = 1; i < n; i++){
int u = s.nextInt(), v = s.nextInt();
graph[u].add(v);
graph[v].add(u);
}
int ip = 1; while(ip <= n) ar[ip++] = s.nextInt();
computePrimePowers();
int q = s.nextInt();
LVL = new int[n + 1];
dfsTime = 0;
dfs(1, -1);
BLOCK_SIZE = (int) Math.sqrt(dfsTime);
int Q[][] = new int[q][4];
int i = 0;
while(q-- > 0) {
int u = s.nextInt(), v = s.nextInt();
Q[i][0] = lca(u, v);
if (l[u] > l[v]) {
int temp = u; u = v; v = temp;
}
if (Q[i][0] == u) {
Q[i][1] = l[u];
Q[i][2] = l[v];
}
else {
Q[i][1] = r[u]; // left at col1 in query
Q[i][2] = l[v]; // right at col2
}
Q[i][3] = i;
i++;
}
Arrays.sort(Q, new Comparator<int[]>() {
#Override
public int compare(int[] x, int[] y) {
int block_x = (x[1] - 1) / (BLOCK_SIZE + 1);
int block_y = (y[1] - 1) / (BLOCK_SIZE + 1);
if(block_x != block_y)
return block_x - block_y;
return x[2] - y[2];
}
});
solveQueries(Q);
}
System.out.println(sb);
}
static long res;
private static void solveQueries(int [][] Q) {
int M = Q.length;
long results[] = new long[M];
res = 1;
int curL = Q[0][1], curR = Q[0][1] - 1;
int i = 0;
while(i < M){
while (curL < Q[i][1]) check(ID[curL++]);
while (curL > Q[i][1]) check(ID[--curL]);
while (curR < Q[i][2]) check(ID[++curR]);
while (curR > Q[i][2]) check(ID[curR--]);
int u = ID[curL], v = ID[curR];
if (Q[i][0] != u && Q[i][0] != v) check(Q[i][0]);
results[Q[i][3]] = res;
if (Q[i][0] != u && Q[i][0] != v) check(Q[i][0]);
i++;
}
i = 0;
while(i < M) sb.append(results[i++] + "\n");
}
static int visited[];
static int primeCnt[];
private static void check(int x) {
if(visited[x] == 1){
for(int i = 0; i < 7; i++) {
int c = arrPrimes[x][i][1];
int pp = arrPrimes[x][i][0];
if(pp == 0) break;
long tem = res * modI[primeCnt[pp] + 1];
res = (tem >= 1000000007) ? tem % 1000000007 : tem;
primeCnt[pp] -= c;
tem = res * (primeCnt[pp] + 1);
res = (tem >= 1000000007) ? tem % 1000000007 : tem;
}
}
else if(visited[x] == 0){
for(int i = 0; i < 7; i++) {
int c = arrPrimes[x][i][1];
int pp = arrPrimes[x][i][0];
if(pp == 0) break;
long tem = res * modI[primeCnt[pp] + 1];
res = (tem >= 1000000007) ? tem % 1000000007 : tem;
primeCnt[pp] += c;
tem = res * (primeCnt[pp] + 1);
res = (tem >= 1000000007) ? tem % 1000000007 : tem;
}
}
visited[x] ^= 1;
}
static int arrPrimes[][][];
static void computePrimePowers() {
int n = arrPrimes.length;
int i = 0;
while(i < n) {
int ele = ar[i];
int k = 0;
while(ele > 1) {
int c = 0;
int pp = hpf[ele];
while(hpf[ele] == pp) {
c++; ele /= pp;
}
arrPrimes[i][k][0] = pp;
arrPrimes[i][k][1] = c;
k++;
}
i++;
}
}
static int dfsTime;
static int l[] = new int[1000001], r[] = new int[1000001], ID[] = new int[1000001], LVL[], lcaTable[][];
static void dfs(int u, int p){
l[u] = ++dfsTime;
ID[dfsTime] = u;
int i = 1;
while(i < 19) {
lcaTable[i][u] = lcaTable[i - 1][lcaTable[i - 1][u]];
i++;
}
i = 0;
while(i < graph[u].size()){
int v = graph[u].get(i);
i++;
if (v == p) continue;
LVL[v] = LVL[u] + 1;
lcaTable[0][v] = u;
dfs(v, u);
}
r[u] = ++dfsTime;
ID[dfsTime] = u;
}
static int lca(int u, int v){
if (LVL[u] > LVL[v]) {
int temp = u;
u = v; v = temp;
}
int i = 18;
while(i >= 0) {
if (LVL[v] - (1 << i) >= LVL[u]) v = lcaTable[i][v];
i--;
}
if (u == v) return u;
i = 18;
while(i >= 0){
if (lcaTable[i][u] != lcaTable[i][v]){
u = lcaTable[i][u];
v = lcaTable[i][v];
}
i--;
}
return lcaTable[0][u];
}
}
// SIMILAR SOLUTION FOR FINDING NUMBER OF DISTINCT ELEMENTS FROM U TO V
// USING MO's ALGORITHM
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 40005;
const int MAXM = 100005;
const int LN = 19;
int N, M, K, cur, A[MAXN], LVL[MAXN], DP[LN][MAXN];
int BL[MAXN << 1], ID[MAXN << 1], VAL[MAXN], ANS[MAXM];
int d[MAXN], l[MAXN], r[MAXN];
bool VIS[MAXN];
vector < int > adjList[MAXN];
struct query{
int id, l, r, lc;
bool operator < (const query& rhs){
return (BL[l] == BL[rhs.l]) ? (r < rhs.r) : (BL[l] < BL[rhs.l]);
}
}Q[MAXM];
// Set up Stuff
void dfs(int u, int par){
l[u] = ++cur;
ID[cur] = u;
for (int i = 1; i < LN; i++) DP[i][u] = DP[i - 1][DP[i - 1][u]];
for (int i = 0; i < adjList[u].size(); i++){
int v = adjList[u][i];
if (v == par) continue;
LVL[v] = LVL[u] + 1;
DP[0][v] = u;
dfs(v, u);
}
r[u] = ++cur; ID[cur] = u;
}
// Function returns lca of (u) and (v)
inline int lca(int u, int v){
if (LVL[u] > LVL[v]) swap(u, v);
for (int i = LN - 1; i >= 0; i--)
if (LVL[v] - (1 << i) >= LVL[u]) v = DP[i][v];
if (u == v) return u;
for (int i = LN - 1; i >= 0; i--){
if (DP[i][u] != DP[i][v]){
u = DP[i][u];
v = DP[i][v];
}
}
return DP[0][u];
}
inline void check(int x, int& res){
// If (x) occurs twice, then don't consider it's value
if ( (VIS[x]) and (--VAL[A[x]] == 0) ) res--;
else if ( (!VIS[x]) and (VAL[A[x]]++ == 0) ) res++;
VIS[x] ^= 1;
}
void compute(){
// Perform standard Mo's Algorithm
int curL = Q[0].l, curR = Q[0].l - 1, res = 0;
for (int i = 0; i < M; i++){
while (curL < Q[i].l) check(ID[curL++], res);
while (curL > Q[i].l) check(ID[--curL], res);
while (curR < Q[i].r) check(ID[++curR], res);
while (curR > Q[i].r) check(ID[curR--], res);
int u = ID[curL], v = ID[curR];
// Case 2
if (Q[i].lc != u and Q[i].lc != v) check(Q[i].lc, res);
ANS[Q[i].id] = res;
if (Q[i].lc != u and Q[i].lc != v) check(Q[i].lc, res);
}
for (int i = 0; i < M; i++) printf("%d\n", ANS[i]);
}
int main(){
int u, v, x;
while (scanf("%d %d", &N, &M) != EOF){
// Cleanup
cur = 0;
memset(VIS, 0, sizeof(VIS));
memset(VAL, 0, sizeof(VAL));
for (int i = 1; i <= N; i++) adjList[i].clear();
// Inputting Values
for (int i = 1; i <= N; i++) scanf("%d", &A[i]);
memcpy(d + 1, A + 1, sizeof(int) * N);
// Compressing Coordinates
sort(d + 1, d + N + 1);
K = unique(d + 1, d + N + 1) - d - 1;
for (int i = 1; i <= N; i++) A[i] = lower_bound(d + 1, d + K + 1, A[i]) - d;
// Inputting Tree
for (int i = 1; i < N; i++){
scanf("%d %d", &u, &v);
adjList[u].push_back(v);
adjList[v].push_back(u);
}
// Preprocess
DP[0][1] = 1;
dfs(1, -1);
int size = sqrt(cur);
for (int i = 1; i <= cur; i++) BL[i] = (i - 1) / size + 1;
for (int i = 0; i < M; i++){
scanf("%d %d", &u, &v);
Q[i].lc = lca(u, v);
if (l[u] > l[v]) swap(u, v);
if (Q[i].lc == u) Q[i].l = l[u], Q[i].r = l[v];
else Q[i].l = r[u], Q[i].r = l[v];
Q[i].id = i;
}
sort(Q, Q + M);
compute();
}
}
Demo
This script produces a fibonacci spiral which starts off in the centre and branches outward. Although the output is correct, i get an error message saying:
"Run-Time Check Failure #2 - Stack around the variable 'fibArr' was corrupted."
I've tried reducing fibCount by 1, but this caused the central number to be 0 which I dont want.
int fibFunc(int);
int main() {
//--setting variables
const int rowCount = 5, colCount = 5;
const int fibCount = rowCount * colCount - 1;
static int f;
int i, j;
//--creating grid
int grid[rowCount][colCount] = { 0 };
//Create fibArr
f = fibCount;
int fibArr[fibCount];
while (f >= 0) {
fibArr[f] = fibFunc(f);
f--;
}
//directions settings
enum direction {
DOWN, RIGHT, UP, LEFT
} d = DOWN;
//spiral loop
int R = 0, C = 0; //R for row, C for column
f = fibCount;
while (f >= 0) {
grid[R][C] = fibArr[f];
f--;
if (d == DOWN) {
if (grid[R+1][C] == 0)
R++;
else d = RIGHT;
}
if (d == RIGHT) {
if (grid[R][C+1] == 0)
C++;
else d = UP;
}
if (d == UP) {
if (grid[R-1][C] == 0)
R--;
else d = LEFT;
}
if (d == LEFT) {
if (grid[R][C-1] == 0)
C--;
else {
d = DOWN;
R++;
}
}
}
printFib(fibArr, fibCount);
for (i = 0; i < rowCount; i++) {
for (j = 0; j < colCount; j++) {
cout << setw(7) << grid[i][j];
}
cout << endl;
cout << endl;
}
system("pause>nul");
}
int fibFunc(int n) {
if (n == 0 || n == 1)
return 1;
else
return fibFunc(n - 1) + fibFunc(n - 2);
}
Output
75025 55 89 144 233
46368 34 1 2 377
28657 21 1 3 610
17711 13 8 5 987
10946 6765 4181 2584 1597
As juanchopanza pointed out, array indexing is from 0 to size - 1.
const int rowCount = 5, colCount = 5;
const int fibCount = rowCount * colCount - 1;
This would result in:
fibCount = 24 = ( 5 * 5 ) - 1;
Further down, at your while loop:
int R = 0, C = 0; //R for row, C for column
f = fibCount;
while (f >= 0) {
grid[R][C] = fibArr[f]; // f = 24; size = 24;
...
}
Here you take a value from your array fibArray with the index of 24. Last index would be 23.
Change this to:
const int fibCount = rowCount * colCount;
Then before your while loop:
f = fibCount - 1;
Besides that I would use a single dimensional array like this:
int rows = 5, cols = 5;
char* tst = new char[ rows * cols ];
int select_row = 2;
int select_col = 3;
tst[ select_row * cols + select_col ] = 65;
char c = tst[ select_row * cols + select_col ];
delete[] tst;
So I've traced the segfault to the line, but I don't understand why this is a segfault. Can someone elaborate on the error of my ways?
Here are the variable declarations.
size_t i, j, n, m, chunk_size, pixel_size;
i = j = n = m = 0;
chunk_size = 256;
pixel_size = 4;
Here are the array declarations.
uint8_t** values = new uint8_t*[chunk_size];
for (i = 0; i < chunk_size; ++ i)
values[i] = new uint8_t[chunk_size];
float** a1 = new float*[chunk_size];
for (i = 0; i < chunk_size; ++i)
a1[i] = new float[chunk_size];
And here is where the segfault occurs.
float delta, d;
for (i = 0; i < 256; ++i) {
for (j = n = m = d = 0; j < 256; j = m) {
while (i == 0 || d != 0) {
d = a1[i][m]; <------SEGFAULT per GDB
++m;
}
delta = (d - a1[i][j]) / m;
n = j + 1;
while (n < j + m) {
a1[i][n] = a1[i][n - 1] + delta;
++n;
}
}
}
I'm fairly new to C++ and can't figure out why this would be a segfault. Is this not the proper way to set a variables value to a variable in an array? Is that the source of my segfault?
Note: The point of this whole thing is too expand a 4x4 array to a 256x256 array with my simpleton interpolation formula.
while (i == 0 || d != 0) {
d = a1[i][m]; <------SEGFAULT per GDB
++m;
}
This is an endless while loop, in some cases (e.g. in the first iteration of the outer loop).
Your outer loop starts out with i = 0 and the inner loops starts with d = 0 and the logic controlling the while loop is not sufficient (see code comment).
for (i = 0; i < 256; ++i) {
for (j = n = m = d = 0; j < 256; j = m) {
// Here i == 0 is ALWAYS true (so d != 0 is ignored due to
// short-circuit evaluation) and then 'm' is continuously incremented
// until it goes out of bounds
while (i == 0 || d != 0) {
d = a1[i][m]; <------SEGFAULT per GDB
++m;
}
delta = (d - a1[i][j]) / m;
n = j + 1;
while (n < j + m) {
a1[i][n] = a1[i][n - 1] + delta;
++n;
}
}
The problem is in the following lines :
while (i == 0 || d != 0) {
d = a1[i][m]; <------SEGFAULT per GDB
++m;
}
Your while loop will keep on going while i equals 0. Since you never increment i in your while loop, m keeps on incrementing forever until arriving out of bounds, causing the segfault issue that you are having.
Make sure you check the values of i and m, so that they are in the allocated memory range and your code will work.
This is a problem I have been struggling for a week, coming back just to give up after wasted hours...
I am supposed to find coefficents for the following Laguerre polynomial:
P0(x) = 1
P1(x) = 1 - x
Pn(x) = ((2n - 1 - x) / n) * P(n-1) - ((n - 1) / n) * P(n-2)
I believe there is an error in my implementation, because for some reason the coefficents I get seem way too big. This is the output this program generates:
a1 = -190.234
a2 = -295.833
a3 = 378.283
a4 = -939.537
a5 = 774.861
a6 = -400.612
Description of code (given below):
If you scroll the code down a little to the part where I declare array, you'll find given x's and y's.
The function polynomial just fills an array with values of said polynomial for certain x. It's a recursive function. I believe it works well, because I have checked the output values.
The gauss function finds coefficents by performing Gaussian elimination on output array. I think this is where the problems begin. I am wondering, if there's a mistake in this code or perhaps my method of veryfying results is bad? I am trying to verify them like that:
-190.234 * 1.5 ^ 5 - 295.833 * 1.5 ^ 4 ... - 400.612 = -3017,817625 =/= 2
Code:
#include "stdafx.h"
#include <conio.h>
#include <iostream>
#include <iomanip>
#include <math.h>
using namespace std;
double polynomial(int i, int j, double **tab)
{
double n = i;
double **array = tab;
double x = array[j][0];
if (i == 0) {
return 1;
} else if (i == 1) {
return 1 - x;
} else {
double minusone = polynomial(i - 1, j, array);
double minustwo = polynomial(i - 2, j, array);
double result = (((2.0 * n) - 1 - x) / n) * minusone - ((n - 1.0) / n) * minustwo;
return result;
}
}
int gauss(int n, double tab[6][7], double results[7])
{
double multiplier, divider;
for (int m = 0; m <= n; m++)
{
for (int i = m + 1; i <= n; i++)
{
multiplier = tab[i][m];
divider = tab[m][m];
if (divider == 0) {
return 1;
}
for (int j = m; j <= n; j++)
{
if (i == n) {
break;
}
tab[i][j] = (tab[m][j] * multiplier / divider) - tab[i][j];
}
for (int j = m; j <= n; j++) {
tab[i - 1][j] = tab[i - 1][j] / divider;
}
}
}
double s = 0;
results[n - 1] = tab[n - 1][n];
int y = 0;
for (int i = n-2; i >= 0; i--)
{
s = 0;
y++;
for (int x = 0; x < n; x++)
{
s = s + (tab[i][n - 1 - x] * results[n-(x + 1)]);
if (y == x + 1) {
break;
}
}
results[i] = tab[i][n] - s;
}
}
int _tmain(int argc, _TCHAR* argv[])
{
int num;
double **array;
array = new double*[5];
for (int i = 0; i <= 5; i++)
{
array[i] = new double[2];
}
//i 0 1 2 3 4 5
array[0][0] = 1.5; //xi 1.5 2 2.5 3.5 3.8 4.1
array[0][1] = 2; //yi 2 5 -1 0.5 3 7
array[1][0] = 2;
array[1][1] = 5;
array[2][0] = 2.5;
array[2][1] = -1;
array[3][0] = 3.5;
array[3][1] = 0.5;
array[4][0] = 3.8;
array[4][1] = 3;
array[5][0] = 4.1;
array[5][1] = 7;
double W[6][7]; //n + 1
for (int i = 0; i <= 5; i++)
{
for (int j = 0; j <= 5; j++)
{
W[i][j] = polynomial(j, i, array);
}
W[i][6] = array[i][1];
}
for (int i = 0; i <= 5; i++)
{
for (int j = 0; j <= 6; j++)
{
cout << W[i][j] << "\t";
}
cout << endl;
}
double results[6];
gauss(6, W, results);
for (int i = 0; i < 6; i++) {
cout << "a" << i + 1 << " = " << results[i] << endl;
}
_getch();
return 0;
}
I believe your interpretation of the recursive polynomial generation either needs revising or is a bit too clever for me.
given P[0][5] = {1,0,0,0,0,...}; P[1][5]={1,-1,0,0,0,...};
then P[2] is a*P[0] + convolution(P[1], { c, d });
where a = -((n - 1) / n)
c = (2n - 1)/n and d= - 1/n
This can be generalized: P[n] == a*P[n-2] + conv(P[n-1], { c,d });
In every step there is involved a polynomial multiplication with (c + d*x), which increases the degree by one (just by one...) and adding to P[n-1] multiplied with a scalar a.
Then most likely the interpolation factor x is in range [0..1].
(convolution means, that you should implement polynomial multiplication, which luckily is easy...)
[a,b,c,d]
* [e,f]
------------------
af,bf,cf,df +
ae,be,ce,de, 0 +
--------------------------
(= coefficients of the final polynomial)
The definition of P1(x) = x - 1 is not implemented as stated. You have 1 - x in the computation.
I did not look any further.
I am trying to make a fraction calculator that calculates on a cuda devise, below is first the sequential version and then my try for a parallel version.
It runs without error, but for some reason do it not give the result back, I have been trying to get this to work for 2 weeks now, but can’t find the error!
Serilized version
int f(int x, int c, int n);
int gcd(unsigned int u, unsigned int v);
int main ()
{
clock_t start = clock();
srand ( time(NULL) );
int x = 1;
int y = 2;
int d = 1;
int c = rand() % 100;
int n = 323;
if(n % y == 0)
d = y;
while(d == 1)
{
x = f(x, c, n);
y = f(f(y, c, n), c, n);
int abs = x - y;
if(abs < 0)
abs = abs * -1;
d = gcd(abs, n);
if(d == n)
{
printf("\nd == n");
c = 0;
while(c == 0 || c == -2)
c = rand() % 100;
x = 2;
y = 2;
}
}
int d2 = n/d;
printf("\nTime elapsed: %f", ((double)clock() - start) / CLOCKS_PER_SEC);
printf("\nResult: %d", d);
printf("\nResult2: %d", d2);
int dummyReadForPause;
scanf_s("%d",&dummyReadForPause);
}
int f(int x, int c, int n)
{
return (int)(pow((float)x, 2) + c) % n;
}
int gcd(unsigned int u, unsigned int v){
int shift;
/ * GCD(0,x) := x * /
if (u == 0 || v == 0)
return u | v;
/ * Let shift := lg K, where K is the greatest power of 2
dividing both u and v. * /
for (shift = 0; ((u | v) & 1) == 0; ++shift) {
u >>= 1;
v >>= 1;
}
while ((u & 1) == 0)
u >>= 1;
/ * From here on, u is always odd. * /
do {
while ((v & 1) == 0) / * Loop X * /
v >>= 1;
/ * Now u and v are both odd, so diff(u, v) is even.
Let u = min(u, v), v = diff(u, v)/2. * /
if (u < v) {
v -= u;
} else {
int diff = u - v;
u = v;
v = diff;
}
v >>= 1;
} while (v != 0);
return u << shift;
}
parallel version
#define threads 512
#define MaxBlocks 65535
#define RunningTheads (512*100)
__device__ int gcd(unsigned int u, unsigned int v)
{
int shift;
if (u == 0 || v == 0)
return u | v;
for (shift = 0; ((u | v) & 1) == 0; ++shift) {
u >>= 1;
v >>= 1;
}
while ((u & 1) == 0)
u >>= 1;
do {
while ((v & 1) == 0)
v >>= 1;
if (u < v) {
v -= u;
} else {
int diff = u - v;
u = v;
v = diff;
}
v >>= 1;
} while (v != 0);
return u << shift;
}
__device__ bool cuda_found;
__global__ void cudaKernal(int *cArray, int n, int *outr)
{
int index = blockIdx.x * threads + threadIdx.x;
int x = 1;
int y = 2;
int d = 4;
int c = cArray[index];
while(d == 1 && !cuda_found)
{
x = (int)(pow((float)x, 2) + c) % n;
y = (int)(pow((float)y, 2) + c) % n;
y = (int)(pow((float)y, 2) + c) % n;
int abs = x - y;
if(abs < 0)
abs = abs * -1;
d = gcd(abs, n);
}
if(d != 1 && !cuda_found)
{
cuda_found = true;
outr = &d;
}
}
int main ()
{
int n = 323;
int cArray[RunningTheads];
cArray[0] = 1;
for(int i = 1; i < RunningTheads-1; i++)
{
cArray[i] = i+2;
}
int dresult = 0;
int *dev_cArray;
int *dev_result;
HANDLE_ERROR(cudaMalloc((void**)&dev_cArray, RunningTheads*sizeof(int)));
HANDLE_ERROR(cudaMalloc((void**)&dev_result, sizeof(int)));
HANDLE_ERROR(cudaMemcpy(dev_cArray, cArray, RunningTheads*sizeof(int), cudaMemcpyHostToDevice));
int TotalBlocks = ceil((float)RunningTheads/(float)threads);
if(TotalBlocks > MaxBlocks)
TotalBlocks = MaxBlocks;
printf("Blocks: %d\n", TotalBlocks);
printf("Threads: %d\n\n", threads);
cudaKernal<<<TotalBlocks,threads>>>(dev_cArray, n, dev_result);
HANDLE_ERROR(cudaMemcpy(&dresult, dev_result, sizeof(int), cudaMemcpyDeviceToHost));
HANDLE_ERROR(cudaFree(dev_cArray));
HANDLE_ERROR(cudaFree(dev_result));
if(dresult == 0)
dresult = 1;
int d2 = n/dresult;
printf("\nResult: %d", dresult);
printf("\nResult2: %d", d2);
int dummyReadForPause;
scanf_s("%d",&dummyReadForPause);
}
Lets have a look at your kernel code:
__global__ void cudaKernal(int *cArray, int n, int *outr)
{
int index = blockIdx.x * threads + threadIdx.x;
int x = 1;
int y = 2;
int d = 4;
int c = cArray[index];
while(d == 1 && !cuda_found) // always false because d is always 4
{
x = (int)(pow((float)x, 2) + c) % n;
y = (int)(pow((float)y, 2) + c) % n;
y = (int)(pow((float)y, 2) + c) % n;
int abs = x - y;
if(abs < 0)
abs = abs * -1;
d = gcd(abs, n); // never writes to d because the loop won't
// be executed
}
if(d != 1 && !cuda_found) // maybe true if cuda_found was initalized
// with false
{
cuda_found = true; // Memory race here.
outr = &d; // you are changing the adresse where outr
// points to; the host code does not see this
// change. your cudaMemcpy dev -> host will copy
// the exact values back from device that have
// been uploaded by cudaMemcpy host -> dev
// if you want to set outr to 4 than write:
// *outr = d;
}
}
One of the problems is you don't return the result. In your code you just change outr which has local scope in your kernel function (i.e. changes are not seen outside this function). You should write *outr = d; to change the value of memory you're pointing with outr.
and I'm not sure if CUDA initializes global variables with zero. I mean are you sure cuda_found is always initialized with false?