I currently have the following GLSL functions defined for raising a complex number to a power.
dvec2 p2 (dvec2 t) {return (dvec2 (cmul(t,t) ));}
dvec2 p3 (dvec2 t) {return (dvec2 (cmul(cmul(t,t),t) ));}
dvec2 p4 (dvec2 t) {return (dvec2 (cmul(cmul(cmul(t,t),t),t) ));}
dvec2 p5 (dvec2 t) {return (dvec2 (cmul(cmul(cmul(cmul(t,t),t),t),t) ));}
dvec2 p6 (dvec2 t) {return (dvec2 (cmul(cmul(cmul(cmul(cmul(t,t),t),t),t),t) ));}
dvec2 p7 (dvec2 t) {return (dvec2 (cmul(cmul(cmul(cmul(cmul(cmul(t,t),t),t),t),t),t) ));}
dvec2 p8 (dvec2 t) {return (dvec2 (cmul(cmul(cmul(cmul(cmul(cmul(cmul(t,t),t),t),t),t),t),t) ));}
I can use these in complex-number formulas like
dvec2 func (dvec2 z) { return (dvec2( cadd(csub(p4(z),cmul(c5,p2(z))),c4) )); }
and it works fine.
Now I want to get rid of those p2,p3,p4,etc functions and write a more generalized power function. So I tried the following
dvec2 cpow (dvec2 c, int p) {
for (int i = 0; i < p; i++) {
c=cmul(c,c);
}
return c;
}
which I then call like
dvec2 func (dvec2 z) { return (dvec2( cadd(csub(cpow(z,4),cmul(c5,cpow(z,2))),c4) )); }
But it gives different results. I can find plenty of complex power routines online, but they all use log and trig calls, which are not available in double precision, and I need double precision for this GLSL code.
Can any GLSL gurus spot why that simple cpow loop would not work?
Your conversion is wrong: inside the loop you multiply the running result by itself, so each iteration squares it instead of multiplying by the original value once more. After p iterations you get c^(2^p) rather than c^p. Change your function to:
dvec2 cpow (dvec2 c, int p)
{
dvec2 a=dvec2(1.0,0.0); // multiplicative identity, assuming a.x is the real part
for (int i = 1; i <= p; i++) a=cmul(a,c);
return a;
}
If you want something faster, you can use power by squaring, or even convert to polar representation, do the power there, and convert back to the Cartesian form of the complex number. For more info see:
How to express tetration function, for complex numbers
Look for the vec2 cpow(vec2 a,vec2 b) // a^b in the GLSL code for tetration fractal.
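For reference, the power-by-squaring idea mentioned above can be sketched roughly as follows. This is C++ rather than GLSL, with std::complex<double> standing in for dvec2 and operator* standing in for cmul; the name cpow_int is made up for this sketch. It uses only multiplications, so nothing depends on log or trig calls.

```cpp
#include <cassert>
#include <complex>

using cplx = std::complex<double>;

// Integer power by repeated squaring: O(log p) complex multiplications.
cplx cpow_int(cplx c, int p) {
    assert(p >= 0);            // negative exponents are out of scope for this sketch
    cplx a(1.0, 0.0);          // multiplicative identity
    while (p > 0) {
        if (p & 1) a *= c;     // fold in the current bit of the exponent
        c *= c;                // c, c^2, c^4, c^8, ...
        p >>= 1;
    }
    return a;
}
```

In a shader the same loop should carry over by writing a = cmul(a, c) and c = cmul(c, c) in place of the two compound assignments.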
I'm developing my own graphics engine to render all sorts of fractals (like my video here for example), and I'm currently working on optimizing my code for Julia Set matings (see this question and my project for more details). In the fragment shader, I use this function:
vec3 JuliaMatingLoop(dvec2 z)
{
...
for (int k = some_n; k >= 0; --k)
{
// z = z^2
z = dcproj(c_2(z));
// Mobius Transformation: (ma[k] * z + mb[k]) / (mc[k] * z + md[k])
z = dcproj(dc_div(dc_mult(ma[k], z) + mb[k], dc_mult(mc[k], z) + md[k]));
}
...
}
And after reading this, I realized that I'm doing Mobius transformations in this code, and (mathematically speaking) I can use matrices to accomplish the same operation. However, the a, b, c, and d constants are all complex numbers (represented as ma[k], mb[k], mc[k], and md[k] in my code), whereas the elements in GLSL matrices contain only real numbers (rather than vec2). And so to my question: is there a way to optimize these Mobius transformations using matrices in GLSL? Or any other way of optimizing this part of my code?
Helper functions (I need to use doubles for this part, so I can't optimize by switching to using floats):
// Squaring
dvec2 c_2(dvec2 c)
{
return dvec2(c.x*c.x - c.y*c.y, 2*c.x*c.y);
}
// Multiplying
dvec2 dc_mult(dvec2 a, dvec2 b)
{
return dvec2(a.x*b.x - a.y*b.y, a.x*b.y + a.y*b.x);
}
// Dividing
dvec2 dc_div(dvec2 a, dvec2 b)
{
double x = b.x * b.x + b.y * b.y;
return dvec2((a.x * b.x + a.y * b.y) / x, (b.x * a.y - a.x * b.y) / x);
}
// Riemann Projecting
dvec2 dcproj(dvec2 c)
{
if (!isinf(c.x) && !isinf(c.y) && !isnan(c.x) && !isnan(c.y))
return c;
return dvec2(infinity, 0);
}
I'm not sure if this will help, but yes you can do complex arithmetic by matrices.
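If you regard a complex number z as a real two-vector with components Re(z), Im(z), then

\[ Az + B \;\sim\; \begin{pmatrix} \operatorname{Re}A & -\operatorname{Im}A \\ \operatorname{Im}A & \operatorname{Re}A \end{pmatrix} \begin{pmatrix} \operatorname{Re}z \\ \operatorname{Im}z \end{pmatrix} + \begin{pmatrix} \operatorname{Re}B \\ \operatorname{Im}B \end{pmatrix} \]

Of course you actually want (A*z + B) / (C*z + D). If you compute A*z + B as the vector (x, y) and C*z + D as the vector (x', y'), then the answer you seek is

\[ \begin{pmatrix} x' & -y' \\ y' & x' \end{pmatrix}^{-1} \begin{pmatrix} x \\ y \end{pmatrix} \]

i.e.

\[ \frac{1}{x'^2 + y'^2} \begin{pmatrix} x' & y' \\ -y' & x' \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \]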
One thing to note, though, is that in these formulae, as in your code, division is not implemented as robustly as it could be. The trouble lies in evaluating b.x * b.x + b.y * b.y. This can overflow to infinity, or underflow to 0, even though the result of the division would be quite reasonable. A commonly used way round this is Smith's method (e.g. here), and if you search for 'robust complex division' you'll find more recent work. Often this sort of thing matters little, but if you are iterating off to infinity it could make a difference.
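As a rough illustration of Smith's method, here is a minimal C++ sketch. It uses std::complex<double> in place of dvec2, and the name smith_div is my own; it is not meant as a drop-in replacement for dc_div, only as a picture of the idea of scaling by the larger component of the divisor.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

using cplx = std::complex<double>;

// Smith's method for a / b: divide through by the larger component of b,
// so that b.x*b.x + b.y*b.y is never formed explicitly.
cplx smith_div(cplx a, cplx b) {
    assert(b.real() != 0.0 || b.imag() != 0.0);  // division by zero is not handled here
    if (std::fabs(b.real()) >= std::fabs(b.imag())) {
        const double r = b.imag() / b.real();
        const double den = b.real() + b.imag() * r;
        return cplx((a.real() + a.imag() * r) / den,
                    (a.imag() - a.real() * r) / den);
    } else {
        const double r = b.real() / b.imag();
        const double den = b.real() * r + b.imag();
        return cplx((a.real() * r + a.imag()) / den,
                    (a.imag() * r - a.real()) / den);
    }
}
```

With a divisor such as (1e200, 1e200), the naive sum of squares overflows, while the ratio r used here stays close to 1.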
I'm referencing a mathematics paper, but the terminology is strange, and I'm unsure of how to code the following:
Return if the orthogonal projection of Point P exists on S(P2, P3).
I found std::inner_product but not sure if thats the correct method to use.
The concept is that you project P onto S and then check whether that projection P' is between P2 and P3.
To make it a little easier you say that P2 is the support-vector of S and P3-P2 is the direction-vector. You then project P-P2 onto the normalized P3-P2 (you compute the scalar-product between them) which gives you the distance D of P' to P2.
Now in your case you only want to know if P' is between P2 and P3. That is true if D is between 0 and the length of P3-P2 (or between 0 and 1 if you divide D by that length).
You want the orthogonal projection of P (on the line given by P2 and P3) to be inside the segment [P2,P3]. Mathematically, this can be written simply as (I'm writing vect(A, B) for the vector AB, because I do not know how to use the arrow notation):
0 <= vect(P2, P) . vect (P2, P3) <= vect(P2, P3) . vect(P2, P3)
You can indeed use std::inner_product but if your points are something as simple as:
struct Point {
double x;
double y;
};
You could just use
Point operator - (const Point& a, const Point& b) {
return {a.x - b.x, a.y - b.y};
}
double operator * (const Point& a, const Point& b) {
return a.x * b.x + a.y * b.y;
}
And the mathematical formula just gives:
bool is_proj_inside(const Point& P, const Point& P2, const Point& P3) {
double p_proj = (P - P2) * (P3 - P2);
double p3_proj = (P3 - P2) * (P3 - P2);
return (p_proj >= 0) && (p_proj <= p3_proj);
}
Yes, you can use inner_product (the dot product) to get the result in a very simple way.
Make vectors
V2 = P - P2
V3 = P - P3
V = P3 - P2
Find signs of dot products
D2 = Dot(V2,V) and D3 = Dot(V3,V)
The projection of point P lies on S(P2, P3) if
D2 >= 0 and
D3 <= 0
Note: there is no need for normalization, square roots, etc. Just some subtractions, multiplications and additions.
(Explanation: the angles P-P2-P3 and P-P3-P2 must both be acute or right.)
Consider the following problem:
My question is the following: how to optimize the following independent functions:
// Computation of the coordinates of P
inline std::array<double, 3> P(const std::array<double, 3>& A,
const std::array<double, 3>& B,
const std::array<double, 3>& M)
{
// The most inefficient version in the world (to be verified)
std::array<double, 3> AB = {B[0]-A[0], B[1]-A[1], B[2]-A[2]};
std::array<double, 3> AM = {M[0]-A[0], M[1]-A[1], M[2]-A[2]};
double norm = std::sqrt(AB[0]*AB[0]+AB[1]*AB[1]+AB[2]*AB[2]);
double dot = AB[0]*AM[0]+AB[1]*AM[1]+AB[2]*AM[2];
double d1 = dot/norm;
std::array<double, 3> AP = {AB[0]*d1/norm, AB[1]*d1/norm, AB[2]*d1/norm};
std::array<double, 3> P = {A[0]+AP[0], A[1]+AP[1], A[2]+AP[2]};
return P;
}
// Computation of the distance d0
inline double d0(const std::array<double, 3>& A,
const std::array<double, 3>& B,
const std::array<double, 3>& M)
{
// The most inefficient version in the world (to be verified)
std::array<double, 3> AB = {B[0]-A[0], B[1]-A[1], B[2]-A[2]};
std::array<double, 3> AM = {M[0]-A[0], M[1]-A[1], M[2]-A[2]};
double norm = std::sqrt(AB[0]*AB[0]+AB[1]*AB[1]+AB[2]*AB[2]);
double dot = AB[0]*AM[0]+AB[1]*AM[1]+AB[2]*AM[2];
double d1 = dot/norm;
std::array<double, 3> AP = {AB[0]*d1/norm, AB[1]*d1/norm, AB[2]*d1/norm};
std::array<double, 3> P = {A[0]+AP[0], A[1]+AP[1], A[2]+AP[2]};
std::array<double, 3> MP = {P[0]-M[0], P[1]-M[1], P[2]-M[2]};
double d0 = std::sqrt(MP[0]*MP[0]+MP[1]*MP[1]+MP[2]*MP[2]);
return d0;
}
// Computation of the distance d1
inline double d1(const std::array<double, 3>& A,
const std::array<double, 3>& B,
const std::array<double, 3>& M)
{
// The most inefficient version in the world (to be verified)
std::array<double, 3> AB = {B[0]-A[0], B[1]-A[1], B[2]-A[2]};
std::array<double, 3> AM = {M[0]-A[0], M[1]-A[1], M[2]-A[2]};
double norm = std::sqrt(AB[0]*AB[0]+AB[1]*AB[1]+AB[2]*AB[2]);
double dot = AB[0]*AM[0]+AB[1]*AM[1]+AB[2]*AM[2];
double d1 = dot/norm;
return d1;
}
// Computation of the distance d2
inline double d2(const std::array<double, 3>& A,
const std::array<double, 3>& B,
const std::array<double, 3>& M)
{
// The most inefficient version in the world (to be verified)
std::array<double, 3> AB = {B[0]-A[0], B[1]-A[1], B[2]-A[2]};
std::array<double, 3> AM = {M[0]-A[0], M[1]-A[1], M[2]-A[2]};
double norm = std::sqrt(AB[0]*AB[0]+AB[1]*AB[1]+AB[2]*AB[2]);
double dot = AB[0]*AM[0]+AB[1]*AM[1]+AB[2]*AM[2];
double d1 = dot/norm;
double d2 = norm-d1;
return d2;
}
The goal is for each function to be optimized as much as possible (I will execute these functions billions of times).
From an algorithmic point of view, you can compute the projection of one vector onto another without any sqrt call.
The pseudocode below is from here:
http://www.euclideanspace.com/maths/geometry/elements/line/projections/
// projection of vector v1 onto v2
inline vector3 projection( const vector3& v1, const vector3& v2 ) {
float v2_ls = v2.len_squared();
return v2 * ( dot( v2, v1 )/v2_ls );
}
where dot() is a dot product of two vectors and len_squared is the dot product of vector with self.
NOTE: Try to pre calculate inverse of v2_ls before main loop, if possible.
It is probably better to compute all requested quantities in a single go.
Let P = A + t . AB the vector equation giving the position of P. Express that MP and AB are orthogonal: MP . AB = 0 = (MA + t . AB) . AB, which yields t= - (MA . AB) / AB^2, and P.
t is the ratio AP / AB, hence d1 = t . |AB|. Similarly, d2 = (1 - t) . |AB|. d0 is obtained from Pythagoras, d0^2 = MA^2 - d1^2, or by direct computation of |MP|.
Accounting: compute MA (3 add), AB (3 add), AB^2 (2 add, 3 mul), MA.AB (2 add, 3 mul), t (1 div), P (3 add, 3 mul), |AB| (1 sqrt), d1 (1 mul), d2 (1 add, 1 mul), MA^2 (2 add, 3 mul), d0 (1 add, 1 mul, 1 sqrt).
Total 17 add, 15 mul, 1 div, 2 sqrt.
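The accounting above could be turned into code roughly like this. It is only a sketch under my own assumptions: the Projection struct and the names project and dot are hypothetical, and d1 is returned signed (negative when P falls before A on the line).

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

struct Projection {   // hypothetical result type for this sketch
    Vec3 P;           // foot of the perpendicular from M onto line AB
    double d0;        // |MP|
    double d1;        // signed length of AP along AB
    double d2;        // |AB| - d1
};

static double dot(const Vec3& u, const Vec3& v) {
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2];
}

Projection project(const Vec3& A, const Vec3& B, const Vec3& M) {
    const Vec3 AB = {B[0] - A[0], B[1] - A[1], B[2] - A[2]};
    const Vec3 AM = {M[0] - A[0], M[1] - A[1], M[2] - A[2]};
    const double ab2 = dot(AB, AB);
    assert(ab2 > 0.0);                        // A and B must be distinct
    const double t = dot(AM, AB) / ab2;       // t = AP / AB, the only division
    const double len = std::sqrt(ab2);        // |AB|
    const double d1 = t * len;
    // Pythagoras: d0^2 = |AM|^2 - d1^2; clamp a tiny negative rounding residue.
    const double d0sq = dot(AM, AM) - d1 * d1;
    Projection r;
    r.P = {A[0] + t * AB[0], A[1] + t * AB[1], A[2] + t * AB[2]};
    r.d0 = std::sqrt(d0sq > 0.0 ? d0sq : 0.0);
    r.d1 = d1;
    r.d2 = len - d1;
    return r;
}
```

When M is very close to the line, the Pythagoras form loses accuracy through cancellation; computing |MP| directly from P is the safer choice in that case, at the cost of a few more operations.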
If you want portable code, so no processor specific features are used, I'd suggest the following:-
1) As I mentioned in my comment above, create a 3D vector class, it will just make it a lot easier to write the code (optimise development time)
2) Create an intersection class that uses lazy evaluation to get P, d0 and d1, like this:-
class Intersection
{
public:
Intersection (A, B, M) { store A, B, M; constants_calculated = false }
Point GetP () { CalculateConstants; Return P; }
double GetD0 () { CalculateConstants; Return D0; }
double GetD1 () { CalculateConstants; Return D1; }
private:
CalculateConstants ()
{
if (!constants_calculated)
{
calculate and store common expressions required for P, d0 and d1
constants_calculated = true
}
}
};
3) Don't call it a billion times. Not doing something is infinitely quicker. Why does it need to be called so often? Is there a way to do the same thing with fewer calls to find P, d0 and d1?
If you can use processor specific features, then you could look into doing things like using SIMD, but that may require dropping the precision from double to float.
The following is a C++ implementation of the point-to-line projection calculation:
#include <iostream>
#include <cmath>
using namespace std;
int main() {
// the point
double x0 = 1.0, y0 = 1.0;
// the line equation
double A = 1.0, B = 1.0, C = 0.0;
// signed point-to-line distance (its sign tells which side of the line the point is on)
double norm = sqrt(A * A + B * B);
double dist = (A * x0 + B * y0 + C) / norm;
// projected point: move from (x0, y0) along the unit normal by -dist
double x1 = x0 - dist * A / norm;
double y1 = y0 - dist * B / norm;
// result
cout << "project point:(" << x1 << ", " << y1 << ")" << endl;
return 0;
}
I am attempting to approximate integrals using an adaptive Trapezoidal Rule.
I have a coarse integral approximation:
//Approximates the integral of f across the interval [a,b]
double coarse_app(double(*f)(double x), double a, double b) {
return (b - a) * (f(a) + f(b)) / 2.0;
}
I have a fine integral approximation:
//Approximates the integral of f across the interval [a,b]
double fine_app(double(*f)(double x), double a, double b) {
double m = (a + b) / 2.0;
return (b - a) / 4.0 * (f(a) + 2.0 * f(m) + f(b));
}
This is made adaptive by summing the approximation across decreasing portions of the given interval until either the recursion level is too high or the coarse and fine approximation are very close to one another:
//Adaptively approximates the integral of f across the interval [a,b] with
// tolerance tol.
double trap(double(*f)(double x), double a, double b, double tol) {
double q = fine_app(f, a, b);
double r = coarse_app(f, a, b);
if ((currentLevel >= minLevel) && (abs(q - r) <= 3.0 * tol)) {
return q;
} else if (currentLevel >= maxLevel) {
return q;
} else {
++currentLevel;
return (trap(f, a, b / 2.0, tol / 2.0) + trap(f, a + (b / 2.0), b, tol / 2.0));
}
}
If I manually calculate an integral by breaking it up into sections and using fine_app on it, I get a very good approximation. However, when I use the trap function, which should do this for me, all of my results are far too small.
For example, trap(square, 0, 2.0, 1.0e-2) gives the output 0.0424107, where the square function is defined as x^2. However, the output should be around 2.667. This is far worse than doing a single run of fine_app on the entire interval, which gives a value of 3.
Conceptually, I believe I have it implemented correctly, but there is something about C++ recursion which is not doing what I expect it to.
First time programming in C++, so all improvements are welcome.
I'm assuming you have currentLevel defined somewhere else. You don't want to do that. You also calculate your midpoints incorrectly.
Take a = 3, b = 5:
[a, b / 2.0] = [3, 2.5]
[a + b / 2.0, b] = [5.5, 5]
The correct points should be [3, 4] and [4, 5]
The code should look like this:
double trap(double(*f)(double x), double a, double b, double tol, int currentLevel) {
double q = fine_app(f, a, b);
double r = coarse_app(f, a, b);
if ((currentLevel >= minLevel) && (abs(q - r) <= 3.0 * tol)) {
return q;
} else if (currentLevel >= maxLevel) {
return q;
} else {
++currentLevel;
return (trap(f, a, (a + b) / 2.0, tol / 2, currentLevel) + trap(f, (a + b) / 2.0, b, tol / 2, currentLevel));
}
}
You can add a helper function so you don't have to specify currentLevel:
double integrate(double (*f)(double x), double a, double b, double tol)
{
return trap(f, a, b, tol, 1);
}
If I call this as integrate(square, 0, 2, 0.01) I get the answer of 2.6875, which means you need an even lower tolerance to converge to the correct result of 8/3 = 2.6666...7. You can check the exact error bound on this by using the error terms for Simpson's method.
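For completeness, here is a self-contained sketch of the same idea that can be compiled on its own. The names adaptive_trap, coarse, fine and max_depth are my own, and unlike the code above it has no minLevel: it stops as soon as the coarse and fine estimates agree or the depth cap is reached.

```cpp
#include <cassert>
#include <cmath>

// Trapezoid rule on [a, b] with one panel (coarse) or two panels (fine).
static double coarse(double (*f)(double), double a, double b) {
    return (b - a) * (f(a) + f(b)) / 2.0;
}
static double fine(double (*f)(double), double a, double b) {
    const double m = (a + b) / 2.0;
    return (b - a) / 4.0 * (f(a) + 2.0 * f(m) + f(b));
}

// The recursion depth is passed by value, so sibling calls do not share state.
// max_depth is a hypothetical safety cap, not a value from the original post.
double adaptive_trap(double (*f)(double), double a, double b,
                     double tol, int depth = 0, int max_depth = 30) {
    assert(tol > 0.0);
    const double q = fine(f, a, b);
    const double r = coarse(f, a, b);
    if (std::fabs(q - r) <= 3.0 * tol || depth >= max_depth) return q;
    const double m = (a + b) / 2.0;           // split at the true midpoint
    return adaptive_trap(f, a, m, tol / 2.0, depth + 1, max_depth)
         + adaptive_trap(f, m, b, tol / 2.0, depth + 1, max_depth);
}

double square(double x) { return x * x; }
```

Calling adaptive_trap(square, 0.0, 2.0, 1.0e-2) should land within the requested tolerance of 8/3.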
I have several questions related to the subject given in the topic title.
I am already using Perlin functions to create lightning in my application, but I am not totally happy about my implementation.
The following questions are based on the initial and the improved Perlin noise implementations.
To simplify the issue, let's assume I am creating a simple 2D lightning by modulating the height of a horizontal line consisting of N nodes at these nodes using a 1D Perlin function.
As far as I have understood, two subsequent values passed to the Perlin function must differ by at least one, or the resulting two values will be identical. That is because with the simple Perlin implementation, the Random function works with an int argument, and in the improved implementation values are mapped to [0..255] and are then used as index into an array containing the values [0..255] in a random distribution. Is that right?
How do I achieve that the first and the last offset value (i.e. for nodes 0 and N-1) returned by the Perlin function is always 0 (zero)? Right now I am modulating a sine function (0 .. Pi) with my Perlin function to achieve that, but that's not really what I want. Just setting them to zero is not what I want, since I want a nice lightning path w/o jaggies at its ends.
How do I vary the Perlin function (so that I would get two different paths I could use as animation start and end frames for the lightning)? I could of course add a fixed random offset per path calculation to each node value, or use a differently setup permutation table for improved Perlin noise, but are there better options?
That depends on how you implement it and sample from it. Using multiple octaves helps counter integers quite a bit.
The octaves and additional interpolation/sampling done for each provides much of the noise in perlin noise. In theory, you should not need to use different integer positions; you should be able to sample at any point and it will be similar (but not always identical) to nearby values.
I would suggest using the perlin as a multiplier instead of simply additive, and using a curve over the course of the lightning. For example, having perlin in the range [-1.5, 1.5] and a normal curve over the lightning (0 at both ends, 1 in the center), lightning + (perlin * curve) will keep your end points still. Depending on how you've implemented your perlin noise generator, you may need something like:
lightning.x += ((perlin(lightning.y, octaves) * 2.0) - 1.0) * curve(lightning.y);
if perlin returns [0,1] or
lightning.x += ((perlin(lightning.y, octaves) / 128.0) - 1.0) * curve(lightning.y);
if it returns [0, 255]. Assuming lightning.x started with a given value, perhaps 0, that would give a somewhat jagged line that still met the original start and end points.
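A small C++ sketch of the noise-times-curve idea, under assumptions of my own: noise01 is a hypothetical stand-in for a Perlin function returning values in [0, 1], and the envelope is a half sine wave, which is only one possible choice of curve that is 0 at both ends and 1 in the middle.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a Perlin-style noise function returning [0, 1].
double noise01(double x) {
    return 0.5 + 0.5 * std::sin(12.9898 * x) * std::cos(4.1414 * x);
}

// Envelope that is 0 at both ends of the path and 1 in the middle.
double envelope(double t) {
    const double pi = 3.14159265358979323846;
    return std::sin(pi * t);
}

// Offsets for n nodes; the envelope forces the first and last offsets to 0.
std::vector<double> lightning_offsets(int n, double amplitude) {
    assert(n >= 2);
    std::vector<double> offsets(n);
    for (int i = 0; i < n; ++i) {
        const double t = static_cast<double>(i) / (n - 1);  // 0 .. 1 along the path
        const double centred = (noise01(t) - 0.5) * 2.0;    // map [0, 1] to [-1, 1]
        offsets[i] = amplitude * centred * envelope(t);
    }
    return offsets;
}
```

Centring the noise around 0 before multiplying keeps the path from drifting to one side of the straight line between the end points.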
Add a dimension to the noise for every dimension you add to the lightning. If you're modifying the lightning in one dimension (horizontal jagged), you need 1D perlin noise. If you want to animate it, you need 2D. If you wanted lightning that was jagged on two axes and animated, you'd need 3D noise, and so on.
After reading peachykeen's answer and doing some more research of my own on the internet, I have found the following solution to work for me.
With my implementation of Perlin noise, using a value range of [0.0 .. 1.0] for the lightning path nodes works best, passing the value (double) M / (double) N for node M to the Perlin noise function.
To have a noise function F' return the same value for node 0 and node N-1, F can be cross-faded with a shifted copy of itself. With T = N - 1 as the index of the last node, the following formula can be applied: F'(M) = ((T - M) * F(M) + M * F(M - T)) / T, which gives F'(0) = F'(T) = F(0). In order to have the lightning path offsets begin and end with 0, you simply need to subtract F'(0) from all lightning path offsets after having computed the path.
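A minimal C++ sketch of such a wrap-around noise function, using the usual linear cross-fade of F with a copy of itself shifted by the period T, which I am assuming is the intent here. F below is a hypothetical placeholder for the real noise function, and looped is a made-up name.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the 1D noise function F.
double F(double x) {
    return std::sin(1.7 * x) + 0.5 * std::sin(4.3 * x + 1.0);
}

// Cross-fade F with a copy of itself shifted by the period T, so that
// looped(0, T) and looped(T, T) both equal F(0).
double looped(double x, double T) {
    assert(T > 0.0);
    return ((T - x) * F(x) + x * F(x - T)) / T;
}
```

Subtracting looped(0, T) from every node's offset then makes the path start and end at exactly 0.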
To randomize the lightning path, before computing the offsets for each path node, a random offset R can be computed and added to the values passed to the noise function, so that the offset of node M is O = F'(M + R). To animate the lightning, two lightning paths need to be computed (start and end frame), and then each path vertex has to be lerped between its start and end position. Once the end frame has been reached, the end frame becomes the start frame and a new end frame is computed. For a 3D path, two offset vectors can be computed for each path node that are perpendicular to the path at that node and to each other, and these can be scaled with two 1D Perlin noise values to lerp the node position from start to end frame position. That may be cheaper than doing 3D Perlin noise and works quite well in my application.
Here is my implementation of standard 1D Perlin noise as a reference (some stuff is virtual because I am using this as base for improved Perlin noise, allowing to use standard or improved Perlin noise in a strategy pattern application. The code has been simplified somewhat as well to make it more concise for publishing it here):
Header file:
#ifndef __PERLIN_H
#define __PERLIN_H
class CPerlin {
private:
int m_randomize;
protected:
double m_amplitude;
double m_persistence;
int m_octaves;
public:
virtual void Setup (double amplitude, double persistence, int octaves, int randomize = -1);
double ComputeNoise (double x);
protected:
double LinearInterpolate (double a, double b, double x);
double CosineInterpolate (double a, double b, double x);
double CubicInterpolate (double v0, double v1, double v2, double v3, double x);
double Noise (int v);
double SmoothedNoise (int x);
virtual double InterpolatedNoise (double x);
};
#endif //__PERLIN_H
Implementation:
#include <math.h>
#include <stdlib.h>
#include "perlin.h"
#define INTERPOLATION_METHOD 1
#ifndef Pi
# define Pi 3.141592653589793240
#endif
inline double CPerlin::Noise (int n) {
n = (n << 13) ^ n;
return 1.0 - ((n * (n * n * 15731 + 789221) + 1376312589) & 0x7fffffff) / 1073741824.0;
}
double CPerlin::LinearInterpolate (double a, double b, double x) {
return a * (1.0 - x) + b * x;
}
double CPerlin::CosineInterpolate (double a, double b, double x) {
double f = (1.0 - cos (x * Pi)) * 0.5;
return a * (1.0 - f) + b * f;
}
double CPerlin::CubicInterpolate (double v0, double v1, double v2, double v3, double x) {
double p = (v3 - v2) - (v0 - v1);
double x2 = x * x;
return v1 + (v2 - v0) * x + (v0 - v1 - p) * x2 + p * x2 * x;
}
double CPerlin::SmoothedNoise (int v) {
return Noise (v) / 2 + Noise (v-1) / 4 + Noise (v+1) / 4;
}
int FastFloor (double v) { return (int) ((v < 0) ? v - 1 : v); }
double CPerlin::InterpolatedNoise (double v) {
int i = FastFloor (v);
double v1 = SmoothedNoise (i);
double v2 = SmoothedNoise (i + 1);
#if INTERPOLATION_METHOD == 2
double v0 = SmoothedNoise (i - 1);
double v3 = SmoothedNoise (i + 2);
return CubicInterpolate (v0, v1, v2, v3, v - i);
#elif INTERPOLATION_METHOD == 1
return CosineInterpolate (v1, v2, v - i);
#else
return LinearInterpolate (v1, v2, v - i);
#endif
}
double CPerlin::ComputeNoise (double v) {
double total = 0, amplitude = m_amplitude, frequency = 1.0;
v += m_randomize;
for (int i = 0; i < m_octaves; i++) {
total += InterpolatedNoise (v * frequency) * amplitude;
frequency *= 2.0;
amplitude *= m_persistence;
}
return total;
}
void CPerlin::Setup (double amplitude, double persistence, int octaves, int randomize) {
m_amplitude = (amplitude > 0.0) ? amplitude : 1.0;
m_persistence = (persistence > 0.0) ? persistence : 2.0 / 3.0;
m_octaves = (octaves > 0) ? octaves : 6;
m_randomize = (randomize < 0) ? (rand () * rand ()) & 0xFFFF : randomize;
}