Second iteration crash - order irrelevant - c++

To save on global memory transfers, and because all of the steps of the code work individually, I have tried to combine all of the kernels into a single kernel, with the first 2 (of 3) steps being done as device calls rather than global calls.
This is failing in the second half of the first step.
There is a function that I need to call twice, to calculate the 2 halves of an image. Regardless of the order the image is calculated in, it crashes on the second iteration.
After examining the code as well as I could, and running it multiple times with different return points, I have found what makes it crash.
__device__
void IntersectCone( float* ModDistance,
float* ModIntensity,
float3 ray,
int threadID,
modParam param )
{
bool ignore = false;
float3 normal = make_float3(0.0f,0.0f,0.0f);
float3 result = make_float3(0.0f,0.0f,0.0f);
float normDist = 0.0f;
float intensity = 0.0f;
float check = abs( Dot(param.position, Cross(param.direction,ray) ) );
if(check > param.r1 && check > param.r2)
ignore = true;
float tran = param.length / (param.r2/param.r1 - 1);
float length = tran + param.length;
float Lsq = length * length;
float cosSqr = Lsq / (Lsq + param.r2 * param.r2);
//Changes the centre position?
float3 position = param.position - tran * param.direction;
float aDd = Dot(param.direction, ray);
float3 e = position * -1.0f;
float aDe = Dot(param.direction, e);
float dDe = Dot(ray, e);
float eDe = Dot(e, e);
float c2 = aDd * aDd - cosSqr;
float c1 = aDd * aDe - cosSqr * dDe;
float c0 = aDe * aDe - cosSqr * eDe;
float discr = c1 * c1 - c0 * c2;
if(discr <= 0.0f)
ignore = true;
if(!ignore)
{
float root = sqrt(discr);
float sign;
if(c1 > 0.0f)
sign = 1.0f;
else
sign = -1.0f;
//Try opposite sign....?
float3 result = (-c1 + sign * root) * ray / c2;
e = result - position;
float dot = Dot(e, param.direction);
float3 s1 = Cross(e, param.direction);
float3 normal = Cross(e, s1);
if( (dot > tran) || (dot < length) )
{
if(Dot(normal,ray) <= 0)
{
normal = Norm(normal); //This stuff (1)
normDist = Magnitude(result);
intensity = -IntensAt1m * Dot(ray, normal) / (normDist * normDist);
}
}
}
ModDistance[threadID] = normDist; //and this stuff (2)
ModIntensity[threadID] = intensity;
}
There are two things I can do to make this not crash, both of which negate the point of the function: not writing to ModDistance[] and ModIntensity[], or not writing to normDist and intensity.
First chance exceptions are thrown by the code above, but not if either of the two marked blocks is commented out.
Also, the program only crashes the second time this routine is called.
Have been trying to figure this out all day, any help would be fantastic.
The code that calls it is:
int subrow = threadIdx.y + Mod_Height/2;
int threadID = subrow * (Mod_Width+1) + threadIdx.x;
int obsY = windowY + subrow;
float3 ray = CalculateRay(obsX,obsY);
if( !IntersectSphere(ModDistance, ModIntensity, ray, threadID, param) )
{
IntersectCone(ModDistance, ModIntensity, ray, threadID, param);
}
subrow = threadIdx.y;
threadID = subrow * (Mod_Width+1) + threadIdx.x;
obsY = windowY + subrow;
ray = CalculateRay(obsX,obsY);
if( !IntersectSphere(ModDistance, ModIntensity, ray, threadID, param) )
{
IntersectCone(ModDistance, ModIntensity, ray, threadID, param);
}

The kernel is running out of resources. As posted in the comments, it was giving the error cudaErrorLaunchOutOfResources.
To avoid this, use a __launch_bounds__ qualifier to tell the compiler the maximum block size you intend to launch the kernel with. The compiler then limits register usage so that a block of that size fits within the available resources. See the CUDA programming guide for details on __launch_bounds__.
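To see why a fused kernel can fail to launch at all, it can help to do the register arithmetic by hand. The sketch below is plain host-side C++ (not CUDA), and every number fed into it is hypothetical: the per-thread register count would come from the compiler's resource report, and the per-block limit from the device properties (cudaDeviceProp::regsPerBlock). It ignores the granularity with which real hardware allocates registers, so treat it as a rough, optimistic check only.

```cpp
#include <cassert>

// Rough check: does a block of `threadsPerBlock` threads, each needing
// `regsPerThread` registers, fit in the per-block register budget?
// Real hardware allocates registers in coarser units, so this is optimistic.
bool blockFitsRegisterBudget(int regsPerThread, int threadsPerBlock, int regsPerBlockLimit) {
    assert(regsPerThread > 0 && threadsPerBlock > 0 && regsPerBlockLimit > 0);
    long long needed = static_cast<long long>(regsPerThread) * threadsPerBlock;
    return needed <= regsPerBlockLimit;
}

// Largest per-thread register count that still lets a block of this size fit.
int maxRegsPerThread(int threadsPerBlock, int regsPerBlockLimit) {
    assert(threadsPerBlock > 0);
    return regsPerBlockLimit / threadsPerBlock;
}
```

For example, with a hypothetical budget of 32768 registers per block, a 512-thread block at 70 registers per thread does not fit (35840 > 32768), while 64 registers per thread does. Declaring a launch bound of 512 threads asks the compiler to stay within that kind of limit.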

Related

Interpreting Visual Studio profiler, is this subtraction slow? Can I make all this any faster?

I'm using the Visual Studio profiler for the first time and I'm trying to interpret the results. Looking at the percentages on the left, I found this subtraction's time cost a bit strange:
Other parts of the code contain more complex expressions, like:
Even a simple multiplication seems way faster than the subtraction:
Other multiplications take way longer and I really don't get why, like this:
So, I guess my question is if there is anything weird going on here.
Complex expressions take longer than that subtraction, and some expressions take way longer than other, similar ones. I ran the profiler several times and the distribution of the percentages is always like this. Am I just interpreting this wrong?
Update:
I was asked to give the profile for the whole function so here it is, even though it's a bit big. I ran the function inside a for loop for 1 minute and got 50k samples. The function contains a double loop. I include the text first for ease, followed by the pictures of profiling. Note that the code in text is a bit updated.
for (int i = 0; i < NUMBER_OF_CONTOUR_POINTS; i++) {
vec4 contourPointV(contour3DPoints[i], 1);
float phi = angles[i];
float xW = pose[0][0] * contourPointV.x + pose[1][0] * contourPointV.y + contourPointV.z * pose[2][0] + pose[3][0];
float yW = pose[0][1] * contourPointV.x + pose[1][1] * contourPointV.y + contourPointV.z * pose[2][1] + pose[3][1];
float zW = pose[0][2] * contourPointV.x + pose[1][2] * contourPointV.y + contourPointV.z * pose[2][2] + pose[3][2];
float x = -G_FU_STRICT * xW / zW;
float y = -G_FV_STRICT * yW / zW;
x = (x + 1) * G_WIDTHo2;
y = (y + 1) * G_HEIGHTo2;
y = G_HEIGHT - y;
phi -= extraTheta;
if (phi < 0)phi += CV_PI2;
int indexForTable = phi * oneKoverPI;
//vec2 ray(cos(phi), sin(phi));
vec2 ray(cos_pre[indexForTable], sin_pre[indexForTable]);
vec2 ray2(-ray.x, -ray.y);
float outerStepX = ray.x * step;
float outerStepY = ray.y * step;
cv::Point2f outerPoint(x + outerStepX, y + outerStepY);
cv::Point2f innerPoint(x - outerStepX, y - outerStepY);
cv::Point2f contourPointCV(x, y);
cv::Point2f contourPointCVcopy(x, y);
bool cut = false;
if (!isInView(outerPoint.x, outerPoint.y) || !isInView(innerPoint.x, innerPoint.y)) {
cut = true;
}
bool outside2 = true; bool outside1 = true;
if (cut) {
outside2 = myClipLine(contourPointCV.x, contourPointCV.y, outerPoint.x, outerPoint.y, G_WIDTH - 1, G_HEIGHT - 1);
outside1 = myClipLine(contourPointCVcopy.x, contourPointCVcopy.y, innerPoint.x, innerPoint.y, G_WIDTH - 1, G_HEIGHT - 1);
}
myIterator innerRayMine(contourPointCVcopy, innerPoint);
myIterator outerRayMine(contourPointCV, outerPoint);
if (!outside1) {
innerRayMine.end = true;
innerRayMine.prob = true;
}
if (!outside2) {
outerRayMine.end = true;
innerRayMine.prob = true;
}
vec2 normal = -ray;
float dfdxTerm = -normal.x;
float dfdyTerm = normal.y;
vec3 point3D = vec3(xW, yW, zW);
cv::Point contourPoint((int)x, (int)y);
float Xc = point3D.x; float Xc2 = Xc * Xc; float Yc = point3D.y; float Yc2 = Yc * Yc; float Zc = point3D.z; float Zc2 = Zc * Zc;
float XcYc = Xc * Yc; float dfdxFu = dfdxTerm * G_FU; float dfdyFv = dfdyTerm * G_FU; float overZc2 = 1 / Zc2; float overZc = 1 / Zc;
pixelJacobi[0] = (dfdyFv * (Yc2 + Zc2) + dfdxFu * XcYc) * overZc2;
pixelJacobi[1] = (-dfdxFu * (Xc2 + Zc2) - dfdyFv * XcYc) * overZc2;
pixelJacobi[2] = (-dfdyFv * Xc + dfdxFu * Yc) * overZc;
pixelJacobi[3] = -dfdxFu * overZc;
pixelJacobi[4] = -dfdyFv * overZc;
pixelJacobi[5] = (dfdyFv * Yc + dfdxFu * Xc) * overZc2;
float commonFirstTermsSum = 0;
float commonFirstTermsSquaredSum = 0;
int test = 0;
while (!innerRayMine.end) {
test++;
cv::Point xy = innerRayMine.pos(); innerRayMine++;
int x = xy.x;
int y = xy.y;
float dx = x - contourPoint.x;
float dy = y - contourPoint.y;
vec2 dxdy(dx, dy);
float raw = -glm::dot(dxdy, normal);
float heavisideTerm = heaviside_pre[(int)raw * 100 + 1000];
float deltaTerm = delta_pre[(int)raw * 100 + 1000];
const Vec3b rgb = ante[y * 640 + x];
int red = rgb[0]; int green = rgb[1]; int blue = rgb[2];
red = red >> 3; red = red << 10; green = green >> 3; green = green << 5; blue = blue >> 3;
int colorIndex = red + green + blue;
pF = pFPointer[colorIndex];
pB = pBPointer[colorIndex];
float denAsMul = 1 / (pF + pB + 0.000001);
pF = pF * denAsMul;
float pfMinusPb = 2 * pF - 1;
float denominator = heavisideTerm * (pfMinusPb)+pB + 0.000001;
float commonFirstTerm = -pfMinusPb / denominator * deltaTerm;
commonFirstTermsSum += commonFirstTerm;
commonFirstTermsSquaredSum += commonFirstTerm * commonFirstTerm;
}
}
Visual Studio profiles by sampling: it interrupts execution frequently and records the value of the instruction pointer, then maps it back to the source and calculates how often each line was hit.
There are a few issues with that: in optimized code it is not always possible to tell which source line produced a specific assembly instruction.
One trick I use is to move the code of interest into a separate function and declare it with __declspec(noinline).
In your example, are you sure the subtraction was performed as many times as the multiplication? I would be more puzzled by the difference between the subsequent multiplications (0.39% and 0.53%).
Update:
I believe that the following lines:
float phi = angles[i];
and
phi -= extraTheta;
got moved together in assembly and the time spent getting angles[i] was added to that subtraction line.
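The noinline trick mentioned above could look roughly like this. NOINLINE and wrapAngle are names made up for this illustration. __declspec(noinline) is the MSVC spelling; the __attribute__((noinline)) branch is only there so the snippet also builds with GCC or Clang.

```cpp
#include <cassert>

#if defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE __attribute__((noinline))
#endif

// Hypothetical helper: the expression of interest is isolated in its own
// function so that the compiler cannot merge it with neighbouring lines and
// the profiler attributes its samples to this function.
NOINLINE float wrapAngle(float phi, float extraTheta, float twoPi) {
    assert(twoPi > 0.0f);
    phi -= extraTheta;
    if (phi < 0.0f) phi += twoPi;
    return phi;
}
```

In the loop from the question, the two lines `phi -= extraTheta;` and `if (phi < 0)phi += CV_PI2;` would then become a single call such as `phi = wrapAngle(phi, extraTheta, CV_PI2);`. The call itself adds some overhead, so this is for measuring only.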

OpenCL path tracer creates strange noise patterns

I've made a path tracer using OpenCL and C++, following the basic structure in this tutorial: http://raytracey.blogspot.com/2016/11/opencl-path-tracing-tutorial-2-path.html. As far as I can tell, nothing is wrong with the path tracing algorithm itself, but I get strange stripe patterns in the image that don't match the regular noise of path tracing. [image: striped image]
There are distinct vertical stripes and narrower horizontal ones that make the image look granular regardless of how many samples I take per pixel. Again, pixel by pixel, the path tracer seems to be working (the outlines of objects are correct even where they appear mid-stripe), as seen here: [image: close-up].
The only differences between my code and the one in the tutorial I linked are that Sam Lapere appears to be using the C++ wrapper for OpenCL, and that I've added a couple of features like movement. There are also a few differences in how I'm handling light bounces.
I'm new to OpenCL. What could be causing this? It seems like it doesn't have to do with my ray tracer itself, but rather with the way I'm using OpenCL. I'm also using an SDL texture and renderer to show the image on the screen.
Here is the tracer code if it helps:
kernel:
__kernel void render_kernel
(__constant struct Sphere* spheres, const int width, const int height,
const int sphere_count, __global int * output, __global float3*
pixel_buckets, __global int* counter, __constant struct Ray* camera,
__global bool* reset){
int gid = get_global_id(0);
//for movement
if (*reset){
pixel_buckets[gid] = (float3)(0,0,0);
counter[gid] = 0;
}
int xcoord = gid % width;
int ycoord = gid / width;
struct Ray camray = createCamRay(xcoord, ycoord, width, height, counter[gid], camera);
float3 final_color = trace(spheres, &camray, sphere_count, xcoord, ycoord);
counter[gid] ++;
//average colors
pixel_buckets[gid] += final_color;
output[gid] = colorInt(clampColor(pixel_buckets[gid] / counter[gid]));
}
trace:
float3 trace(__constant struct Sphere* spheres, struct Ray* camray, const int sphere_count,
unsigned int seed0, unsigned int seed1){
struct Ray ray = *camray;
struct Sphere sphere1;
sphere1.center = (float3)(0, 0, 3);
sphere1.radius = 0.7;
sphere1.color = (float3)(1,1,0);
const int bounce_count = 8;
float3 colors[20];
float3 emiss[20];
for (int bounce = 0; bounce < bounce_count; bounce ++){
int sphere_id = 0;
float hit_distance = intersectScene(spheres, &ray, &sphere_id, sphere_count);
struct Sphere hit_sphere = spheres[sphere_id];
float3 hit_point = ray.origin + (ray.direction * hit_distance);
float3 normal = normalize(hit_point - hit_sphere.center);
if (dot(normal, -ray.direction) < 0){
normal = -normal;
}
//random bounce angles
float rand_theta = get_random(seed0, seed1);
float theta = acos(sqrt(rand_theta));
float rand_phi = get_random(seed0, seed1);
float phi = 2 * PI * rand_phi;
//scales the tnb vectors
float x = sin(theta) * sin(phi);
float y = sin(theta) * cos(phi);
float n = cos(theta);
float3 hemx = normalize(cross(ray.direction, normal)) * x;
float3 hemy = normalize(cross(hemx, normal)) * y;
normal = normal * n;
float3 new_ray = normalize(hemx + hemy + normal);
ray.origin = hit_point + (normal * EPSILON);
ray.direction = new_ray;
colors[bounce] = hit_sphere.color;
emiss[bounce] = hit_sphere.emmissive;
}
colors[bounce_count] = (float3)(0,0,0);
emiss[bounce_count] = (float3)(0,0,0);
for (int i = bounce_count - 1; i >= 0; i--){
colors[i] = (colors[i] * emiss[i]) + (colors[i] * colors[i + 1]);
}
return colors[0];
}
random number generator:
float get_random(unsigned int *seed0, unsigned int *seed1) {
/* hash the seeds using bitwise AND operations and bitshifts */
*seed0 = 36969 * ((*seed0) & 65535) + ((*seed0) >> 16);
*seed1 = 18000 * ((*seed1) & 65535) + ((*seed1) >> 16);
unsigned int ires = ((*seed0) << 16) + (*seed1);
/* use union struct to convert int to float */
union {
float f;
unsigned int ui;
} res;
res.ui = (ires & 0x007fffff) | 0x40000000; /* bitwise AND, bitwise OR */
return (res.f - 2.0f) / 2.0f;
}
thanks
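One way to rule the generator in or out is to run it on the host and look at the values it produces. The following is a minimal C++ port of get_random, offered as a sketch: getRandom is a hypothetical name, and std::memcpy stands in for the union, which is not a portable way to reinterpret bits in C++. Note that the function takes pointers so that both seeds advance on every call.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Host-side port of get_random: two multiply-with-carry steps, then the low
// 23 bits are used as the mantissa of a float in [2, 4).
float getRandom(std::uint32_t* seed0, std::uint32_t* seed1) {
    assert(seed0 != nullptr && seed1 != nullptr);
    *seed0 = 36969u * (*seed0 & 65535u) + (*seed0 >> 16);
    *seed1 = 18000u * (*seed1 & 65535u) + (*seed1 >> 16);
    std::uint32_t ires = (*seed0 << 16) + *seed1;
    std::uint32_t bits = (ires & 0x007fffffu) | 0x40000000u;
    float f;
    std::memcpy(&f, &bits, sizeof f); // reinterpret the bit pattern as a float
    return (f - 2.0f) / 2.0f;          // maps [2, 4) to [0, 1)
}
```

Starting from the same pair of seeds always yields the same sequence, so how the seeds are chosen per pixel and per frame determines how much the samples vary.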

random velocities when using simulation shader

I'm trying to implement a Fruchterman Reingold simulation using shaders. Before implementing the compute portion in a shader I wrote it in JavaScript. It works exactly as I expect, as seen here:
http://jaredmcqueen.github.io/gpgpu-force-direction/canvas_app.html
When implementing the compute portion in a shader, I get a stable structure that randomly drifts around the screen. I cannot figure out what repulsion / attraction forces are causing my graphs to float around so unpredictably:
http://jaredmcqueen.github.io/gpgpu-force-direction/gpgpu_app.html
The core of the physics comes from the repulsion / attraction functions:
//fr(x) = (k*k)/x;
vec3 addRepulsion(vec3 self, vec3 neighbor){
vec3 diff = self - neighbor;
float x = length( diff );
float f = ( k * k ) / x;
return normalize(diff) * f;
}
//fa(x) = (x*x)/k;
vec3 addAttraction(vec3 self, vec3 neighbor){
vec3 diff = self - neighbor;
float x = length( diff );
float f = ( x * x ) / k;
return normalize(diff) * f;
}
Any insight as to why GPGPU simulation shaders would behave seemingly randomly would be greatly appreciated.
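For comparing individual values against the shader output, a CPU-side version of the two force functions can be handy. This is only a sketch: Vec3 and its helpers are minimal stand-ins for the GLSL types, k is passed as a parameter instead of being a global, and the zero-distance guard is an assumption that is not present in the shader code above.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 scale(Vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }
float magnitude(Vec3 v) { return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }

// fr(x) = k*k / x, directed from neighbor towards self.
Vec3 addRepulsion(Vec3 self, Vec3 neighbor, float k) {
    assert(k > 0.0f);
    Vec3 diff = sub(self, neighbor);
    float x = magnitude(diff);
    if (x == 0.0f) return {0.0f, 0.0f, 0.0f}; // assumption: coincident nodes exert no force
    return scale(diff, (k * k / x) / x);      // normalize(diff) * f
}

// fa(x) = x*x / k, along the same direction; the caller subtracts the result.
Vec3 addAttraction(Vec3 self, Vec3 neighbor, float k) {
    assert(k > 0.0f);
    Vec3 diff = sub(self, neighbor);
    float x = magnitude(diff);
    if (x == 0.0f) return {0.0f, 0.0f, 0.0f}; // same assumption as above
    return scale(diff, (x * x / k) / x);
}
```

Without such a guard, two nodes at the same position give a division by zero and a normalize of the zero vector, which is worth ruling out when positions come from a texture.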
It doesn't seem random: the structure stabilizes in a seemingly correct state and then drifts away in a constant direction.
It looks like you apply the force in the shader and then update your model's position on the CPU side; this global model position should either stay constant or be updated by a different value.
From what I've seen in the code, I recommend eliminating the floating point comparisons (compareNodePosition.w == -1.0 || 0.0) and the continue statement. Please say if it helped. I haven't looked into the algorithm's logic yet.
It turns out I was iterating through the edges incorrectly. Here's my new edge iteration:
float idx = selfEdgeIndices.x;
float idy = selfEdgeIndices.y;
float idz = selfEdgeIndices.z;
float idw = selfEdgeIndices.w;
float start = idx * 4.0 + idy;
float end = idz * 4.0 + idw;
if(! ( idx == idz && idy == idw ) ){
float edgeIndex = 0.0;
for(float y = 0.0; y < edgesTexWidth; y++){
for(float x = 0.0; x < edgesTexWidth; x++){
vec2 ref = vec2( x + 0.5 , y + 0.5 ) / vec2(edgesTexWidth,edgesTexWidth);
vec4 pixel = texture2D(edgeData,ref);
if (edgeIndex >= start && edgeIndex < end){
nodePosition = getNeighbor(pixel.x);
nodeDiff.xyz -= addAttraction(currentNodePosition.xyz, nodePosition);
}
edgeIndex++;
if (edgeIndex >= start && edgeIndex < end){
nodePosition = getNeighbor(pixel.y);
nodeDiff.xyz -= addAttraction(currentNodePosition.xyz, nodePosition);
}
edgeIndex++;
if (edgeIndex >= start && edgeIndex < end){
nodePosition = getNeighbor(pixel.z);
nodeDiff.xyz -= addAttraction(currentNodePosition.xyz, nodePosition);
}
edgeIndex++;
if (edgeIndex >= start && edgeIndex < end){
nodePosition = getNeighbor(pixel.w);
nodeDiff.xyz -= addAttraction(currentNodePosition.xyz, nodePosition);
}
edgeIndex++;
}
}
}

Converting 2D Noise to 3D

I've recently started experimenting with noise (simple Perlin noise), and have run into a slight problem with animating it. So far I've come across an awesome-looking 3D noise (https://github.com/ashima/webgl-noise) that I could use in my project but that I understood nothing of, and a bunch of tutorials that explain how to create simple 2D noise.
For the 2d noise, I originally used the following fragment shader:
uniform sampler2D al_tex;
varying vec4 varying_pos; //Actual coords
varying vec2 varying_texcoord; //Normalized coords
uniform float time;
float rand(vec2 co) { return fract(sin(dot(co, vec2(12.9898, 78.233))) * 43758.5453); }
float ease(float p) { return 3*p*p - 2*p*p*p; }
float cnoise(vec2 p, int wavelength)
{
int ix1 = (int(varying_pos.x) / wavelength) * wavelength;
int iy1 = (int(varying_pos.y) / wavelength) * wavelength;
int ix2 = (int(varying_pos.x) / wavelength) * wavelength + wavelength;
int iy2 = (int(varying_pos.y) / wavelength) * wavelength + wavelength;
float x1 = ix1 / 1280.0f;
float y1 = iy1 / 720.0f;
float x2 = ix2 / 1280.0f;
float y2 = iy2 / 720.0f;
float xOffset = (varying_pos.x - ix1) / wavelength;
float yOffset = (varying_pos.y - iy1) / wavelength;
xOffset = ease(xOffset);
yOffset = ease(yOffset);
float t1 = rand(vec2(x1, y1));
float t2 = rand(vec2(x2, y1));
float t3 = rand(vec2(x2, y2));
float t4 = rand(vec2(x1, y2));
float tt1 = mix(t1, t2, xOffset);
float tt2 = mix(t4, t3, xOffset);
return mix(tt1, tt2, yOffset);
}
void main()
{
float t = 0;
int minFreq = 0;
int noIterations = 8;
for (int i = 0; i < noIterations; i++)
t += cnoise(varying_texcoord, int(pow(2, i + minFreq))) / pow(2, noIterations - i);
gl_FragColor = vec4(vec3(t), 1);
}
The result that I got was this:
Now, I want to animate it with time. My first thought was to change the rand function to take a vec3 instead of vec2, and then change my cnoise function accordingly, to interpolate values in the z direction too. With that goal in mind, I made this:
uniform sampler2D al_tex;
varying vec4 varying_pos;
varying vec2 varying_texcoord;
uniform float time;
float rand(vec3 co) { return fract(sin(dot(co, vec3(12.9898, 78.2332, 58.5065))) * 43758.5453); }
float ease(float p) { return 3*p*p - 2*p*p*p; }
float cnoise(vec3 pos, int wavelength)
{
ivec3 iPos1 = (ivec3(pos) / wavelength) * wavelength; //The first value that I'll sample to interpolate
ivec3 iPos2 = iPos1 + wavelength; //The second value
vec3 transPercent = (pos - iPos1) / wavelength; //Transition percent - A float in [0-1) indicating how much of each of the above values will contribute to final result
transPercent.x = ease(transPercent.x);
transPercent.y = ease(transPercent.y);
transPercent.z = ease(transPercent.z);
float t1 = rand(vec3(iPos1.x, iPos1.y, iPos1.z));
float t2 = rand(vec3(iPos2.x, iPos1.y, iPos1.z));
float t3 = rand(vec3(iPos2.x, iPos2.y, iPos1.z));
float t4 = rand(vec3(iPos1.x, iPos2.y, iPos1.z));
float t5 = rand(vec3(iPos1.x, iPos1.y, iPos2.z));
float t6 = rand(vec3(iPos2.x, iPos1.y, iPos2.z));
float t7 = rand(vec3(iPos2.x, iPos2.y, iPos2.z));
float t8 = rand(vec3(iPos1.x, iPos2.y, iPos2.z));
float tt1 = mix(t1, t2, transPercent.x);
float tt2 = mix(t4, t3, transPercent.x);
float tt3 = mix(t5, t6, transPercent.x);
float tt4 = mix(t8, t7, transPercent.x);
float tt5 = mix(tt1, tt2, transPercent.y);
float tt6 = mix(tt3, tt4, transPercent.y);
return mix(tt5, tt6, transPercent.z);
}
float fbm(vec3 p)
{
float t = 0;
int noIterations = 8;
for (int i = 0; i < noIterations; i++)
t += cnoise(p, int(pow(2, i))) / pow(2, noIterations - i);
return t;
}
void main()
{
vec3 p = vec3(varying_pos.xy, time);
float t = fbm(p);
gl_FragColor = vec4(vec3(t), 1);
}
However, on doing this, the animation feels... strange. It's as though I'm watching a slideshow of Perlin noise slides, with the individual slides fading in. All other Perlin noise examples that I have tried (like https://github.com/ashima/webgl-noise) are actually animated with time: you can see the pattern moving, rather than a sequence of images fading into one another. I know that I could just use the webgl-noise shader, but I want to make one myself, and for some reason I'm failing miserably. Could anyone tell me where I am going wrong, or suggest how I can animate it properly with time?
You should probably include z in the sin function:
float rand(vec3 co) { return fract(sin(dot(co.xy ,vec2(12.9898,78.233)) + co.z) * 43758.5453); }
Apparently the somewhat random numbers are prime numbers. This is to avoid patterns in the noise. I found another prime number, 94418953, and included that in the sin/dot function. Try this:
float rand(vec3 co) { return fract(sin(dot(co.xyz ,vec3(12.9898,78.233, 9441.8953))) * 43758.5453); }
EDIT: You don't take into account wavelength on the z axis. This means that all your iterations will have the same interpolation distance. In other words, you will get the fade effect you're describing. Try calculating z the same way you calculate x and y:
int iz1 = (int(p.z) / wavelength) * wavelength;
int iz2 = (int(p.z) / wavelength) * wavelength + wavelength;
float z1 = iz1 / 720.0f;
float z2 = iz2 / 720.0f;
float zOffset = (p.z - iz1) / wavelength;
This means, however, that the z value will vary at the same rate as y. So if you want it to scale from 0 to 1, you should probably multiply z by 720 before passing it into the noise function.
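To make the idea concrete, here is a CPU-side sketch of value noise in which z is snapped to the lattice and interpolated exactly like x and y. It is written in C++ rather than GLSL so it can be run and checked on its own. hashRand, valueNoise3 and the floating-point wavelength are choices made for this illustration, not the code from the question.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// fract(sin(dot(c, constants)) * 43758.5453), as in the shaders above.
float hashRand(Vec3 c) {
    float s = std::sin(c.x * 12.9898f + c.y * 78.233f + c.z * 58.5065f) * 43758.5453f;
    return s - std::floor(s);
}

float ease(float p) { return 3.0f * p * p - 2.0f * p * p * p; }
float mix(float a, float b, float t) { return a + (b - a) * t; }

float valueNoise3(Vec3 p, float wavelength) {
    assert(wavelength > 0.0f);
    // Lower and upper lattice corners of the cell containing p, on all three axes.
    Vec3 lo = { std::floor(p.x / wavelength) * wavelength,
                std::floor(p.y / wavelength) * wavelength,
                std::floor(p.z / wavelength) * wavelength };
    Vec3 hi = { lo.x + wavelength, lo.y + wavelength, lo.z + wavelength };
    // Eased position inside the cell, roughly in [0, 1) per axis.
    float tx = ease((p.x - lo.x) / wavelength);
    float ty = ease((p.y - lo.y) / wavelength);
    float tz = ease((p.z - lo.z) / wavelength);
    // Interpolate along x on the four cell edges, then along y, then along z.
    float x00 = mix(hashRand({lo.x, lo.y, lo.z}), hashRand({hi.x, lo.y, lo.z}), tx);
    float x10 = mix(hashRand({lo.x, hi.y, lo.z}), hashRand({hi.x, hi.y, lo.z}), tx);
    float x01 = mix(hashRand({lo.x, lo.y, hi.z}), hashRand({hi.x, lo.y, hi.z}), tx);
    float x11 = mix(hashRand({lo.x, hi.y, hi.z}), hashRand({hi.x, hi.y, hi.z}), tx);
    return mix(mix(x00, x10, ty), mix(x01, x11, ty), tz);
}
```

If z is a time in seconds while x and y are pixel coordinates, z crosses lattice cells far more slowly than the image does, so scaling time before passing it in (as suggested above) decides how fast the pattern evolves.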
Check this code. It's a simple version of 3D noise:
// Here are some easy to understand noise gens... the D line in cubic interpolation (rounding)
function rndng ( n: float ): float
{//random proportion -1, 1 ... many people use Sin to take
//linearity out of a pseudo random, exp n*n is faster on central processor.
var e = ( n *321.9234)%1;
return (e*e*111.07546)%2-1;
}
function lerps(o:float, v:float, alpha:float):float
{
o += ( v - o ) * alpha;
return o;
}
//3d ----------------
function lnz ( vtx: Vector3 ): float //3d perlin noise code fast
{
vtx= Vector3 ( Mathf.Abs(vtx.x) , Mathf.Abs(vtx.y) , Mathf.Abs(vtx.z) ) ;
var I = Vector3 (Mathf.Floor(vtx.x),Mathf.Floor(vtx.y),Mathf.Floor(vtx.z));
var D = Vector3(vtx.x%1,vtx.y%1,vtx.z%1);
D = Vector3(D.x*D.x*(3.0-2.0*D.x),D.y*D.y*(3.0-2.0*D.y),D.z*D.z*(3.0-2.0*D.z));
var W = I.x + I.y*71.0 + 125.0*I.z;
return lerps(
lerps( lerps(rndng(W+0.0),rndng(W+1.0),D.x) , lerps(rndng(W+71.0),rndng(W+72.0),D.x) , D.y)
,
lerps( lerps(rndng(W+125.0),rndng(W+126.0),D.x) , lerps(rndng(W+153.0),rndng(W+154.0),D.x) , D.y)
,
D.z
);
}
//1d ----------------
function lnzo ( vtx: Vector3 ): float //perlin noise, same as unityfunction version
{
var total = 0.0;
for (var i:int = 1; i < 5; i ++)
{
total+= lnz2(Vector3 (vtx.x*(i*i),0.0,vtx.z*(i*i)))/(i*i);
}
return total*5;
}
//2d 3 axis honeycombe noise ----------------
function lnzh ( vtx: Vector3 ): float // perlin noise, 2d, with 3 axes at 60'instead of 2 x y axes
{
vtx= Vector3 ( Mathf.Abs(vtx.z) , Mathf.Abs(vtx.z*.5-vtx.x*.866) , Mathf.Abs(vtx.z*.5+vtx.x*.866) ) ;
var I = Vector3 (Mathf.Floor(vtx.x),Mathf.Floor(vtx.y),Mathf.Floor(vtx.z));
var D = Vector3(vtx.x%1,vtx.y%1,vtx.z%1);
//D = Vector3(D.x*D.x*(3.0-2.0*D.x),D.y*D.y*(3.0-2.0*D.y),D.z*D.z*(3.0-2.0*D.z));
var W = I.x + I.y*71.0 + 125.0*I.z;
return lerps(
lerps( lerps(rndng(W+0.0),rndng(W+1.0),D.x) , lerps(rndng(W+71.0),rndng(W+72.0),D.x) , D.y)
,
lerps( lerps(rndng(W+125.0),rndng(W+126.0),D.x) , lerps(rndng(W+153.0),rndng(W+154.0),D.x) , D.y)
,
D.z
);
}
//2d ----------------
function lnz2 ( vtx: Vector3 ): float // i think this is 2d perlin noise
{
vtx= Vector3 ( Mathf.Abs(vtx.x) , Mathf.Abs(vtx.y) , Mathf.Abs(vtx.z) ) ;
var I = Vector3 (Mathf.Floor(vtx.x),Mathf.Floor(vtx.y),Mathf.Floor(vtx.z));
var D = Vector3(vtx.x%1,vtx.y%1,vtx.z%1);
D = Vector3(D.x*D.x*(3.0-2.0*D.x),D.y*D.y*(3.0-2.0*D.y),D.z*D.z*(3.0-2.0*D.z));
var W = I.x + I.y*71.0 + 125.0*I.z;
return lerps(
lerps( lerps(rndng(W+0.0),rndng(W+1.0),D.x) , lerps(rndng(W+71.0),rndng(W+72.0),D.x) , D.z)
,
lerps( rndng(W+125.0), rndng(W+126.0),D.x)
,
D.z
);
}

Am I converting local space to world space coordinates properly?

I'm trying to create a bone and IK system. Below is the recursive method that calculates the absolute position and absolute angle of each bone. I call it with the root bone and zeroed parameters. It works fine, but when I try to use CCD IK I get discrepancies between the resulting end point and the calculated one. So maybe I'm doing this wrong even though it appears to work.
Thanks
void Skeleton::_updateBones( Bone* root,float realStartX, float realStartY, float realStartAngle )
{
if(!root->isRelative())
{
realStartX = 0.0f;
realStartY = 0.0f;
realStartAngle = 0.0f;
}
realStartX += root->getX();
realStartY += root->getY();
realStartAngle += root->getAngle();
float vecX = sin(realStartAngle);
float vecY = cos(realStartAngle);
realStartX += (vecX * root->getLength());
realStartY += (vecY * root->getLength());
root->setFrame(realStartX,realStartY,realStartAngle);
float angle = fmod(realStartAngle,2.0f * 3.141592f);
if( angle < -3.141592f )
angle += (2.0f * 3.141592);
else if( angle > 3.141592f )
angle -= (2.0f * 3.141592f);
for(std::list<Bone>::iterator it = root->begin(); it != root->end(); ++it)
{
_updateBones(&(*it),realStartX,realStartY,angle);
}
}
This looks wrong.
float vecX = sin(realStartAngle);
float vecY = cos(realStartAngle);
Swap sin() and cos().
float vecX = cos(realStartAngle);
float vecY = sin(realStartAngle);
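As a standalone check of the corrected formula, the sketch below computes the end point of a bone from its parent frame. Frame and boneEnd are hypothetical names, and the convention is an assumption: angles are in radians, measured counter-clockwise from the +x axis. If the rest of the system measures angles from the y axis instead, the original sin/cos order would be the consistent one.

```cpp
#include <cassert>
#include <cmath>

struct Frame { float x, y, angle; };

// End point of a bone that starts at `parent`, is rotated by `localAngle`
// relative to it and has the given length.
Frame boneEnd(Frame parent, float localAngle, float boneLength) {
    assert(boneLength >= 0.0f);
    float angle = parent.angle + localAngle;
    return { parent.x + std::cos(angle) * boneLength,
             parent.y + std::sin(angle) * boneLength,
             angle };
}
```

With this convention, a bone at angle 0 extends along +x and a bone at a quarter turn extends along +y, which is what a CCD solver using atan2(y, x) would expect.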