I'm trying to find the point where 2 lines intersect.
I am using this example on geeks for geeks.
This works well for small numbers. I am using "accurate" latitude and longitude points which are scaled by 10000000. So a latitude of 40.4381311 would be 404381311 in my program. I can't use double as I will lose accuracy and the whole point of getting the points scaled like this is for accuracy.
So in the geeks for geeks example I changed everywhere they use double to use int64_t. But unfortunately this is still not enough.
By the time you get to:
double determinant = a1*b2 - a2*b1;
or:
double x = (b2*c1 - b1*c2)/determinant;
double y = (a1*c2 - a2*c1)/determinant;
The number will be too big for int64_t. What is my best way around this issue while still keeping accuracy.
I am writing this code for an ESP32 and it doesn't seem to have __int128/__int128_t.
Thanks!
Related
Suppose that I have the following chart -- https://www.desmos.com/calculator/eabskyo0hk
I need to do the following:
Determine whether apex (endpoint1), dest (newvertex) and endpoint1
are collinear.
If they are not collinear, find their intersection point.
I wrote the following code to do that -- http://coliru.stacked-crooked.com/a/c08cbe113e706a9c
Unfortunately, it doesn't work properly. It says that those points are not collinear but then it says that their intersection point is dest (newvertex) so that invalidates the first statement.
I suppose that this must be due to the floating point-related issues but I wonder how can I fix it w/o changing the coordinates?
Thanks in advance.
Your coordinate values have very large magnitude, so comparison of cross-product magnitude with std::numeric_limits<T>::epsilon() gives false result.
Edit: std::numeric_limits<T>::epsilon() is not suitable as "small value here".
You can try to scale "small value" accordingly to coordinates. Example:
my_eps = 1.0E-9;
eps = my_eps * Max(abs(x1), abs(y1)....
Note that small difference of two big values already contains large relative error, so next approach is wrong:
eps = my_eps * Max(abs(y1 - y2), abs(x1 - x3), abs(y1 - y3), abs(x1 - x2));
Described method should improve result, but won't provide "silver bullet" - there are special computational methods for robust, error-prone intersection calculations (problem is global). Example - orientation test
I am looking some advice for a Qt program I am working on, that uses Qwt to draw a line graph.
Basically my problem arises from the graph's x axis, which is in 24:00 time. I have a QPolygonF that stores a series of QPointFs that hold the values for my plot curve, where every 1.0 in the x axis equates to 1 second. I then use unix timestamps to set each value for the x axis, so basically I have double xAxis initialised to 0.0 which is added to the QPolygonF like points.append(xAxis, yAxis) for the start of curve and for each point thereafter I use currentTime - prevTime to find the difference between both timestamps and then increase xAxis by said difference using +=. If that makes sense.
Anyway, currently everything is displayed in whole seconds and it works perfectly fine. However, I need it to be precise to the millisecond. What I need some guidance on is working with large high precision doubles.
Working with unix timestamps in seconds is easy as that can be done with a simple int, but when you increase the number of digits to include milliseconds doubles are switched to scientific notation.
My question is: how do I store potentially large numbers, like 22429.388 or larger, if they revert to scientific notation?
Thanks and sorry if this is a very basic question.
You say you graph axis is 24:00 long. This will be 24*3600 seconds, so 24*3600*1000 milliseconds: 86,400,000 which is way smaller than INT_MAX (=2,147,483,647).
So there should be no problem storing your x values as an int. You just need to make first axis value be 0 then last axis value will be 86,400,000.
If your times do not start at 0, you just need to define the smallest time displayed as a "reference date" and store values based on this "reference date" (to guarantee they will all be between 00:00:00.0000 (i.e: 0 as an int) and 24:00:00.0000 (i.e: 86,400,000 as an int)).
I am learning about Two Dimensional Neuron Network so I am facing many obstacles but I believe it is worth it and I am really enjoying this learning process.
Here's my plan: To make a 2-D NN work on recognizing images of digits. Images are 5 by 3 grids and I prepared 10 images from zero to nine. For Example this would be number 7:
Number 7 has indexes 0,1,2,5,8,11,14 as 1s (or 3,4,6,7,9,10,12,13 as 0s doesn't matter) and so on. Therefore, my input layer will be a 5 by 3 neuron layer and I will be feeding it zeros OR ones only (not in between and the indexes depends on which image I am feeding the layer).
My output layer however will be one dimensional layer of 10 neurons. Depends on which digit was recognized, a certain neuron will fire a value of one and the rest should be zeros (shouldn't fire).
I am done with implementing everything, I have a problem in computing though and I would really appreciate any help. I am getting an extremely high error rate and an extremely low (negative) output values on all output neurons and values (error and output) do not change even on the 10,000th pass.
I would love to go further and post my Backpropagation methods since I believe the problem is in it. However to break down my work I would love to hear some comments first, I want to know if my design is approachable.
Does my plan make sense?
All the posts are speaking about ranges ( 0->1, -1 ->+1, 0.01 -> 0.5 etc ), will it work for either { 0 | .OR. | 1 } on the output layer and not a range? if yes, how can I control that?
I am using TanHyperbolic as my transfer function. Does it make a difference between this and sigmoid, other functions.. etc?
Any ideas/comments/guidance are appreciated and thanks in advance
Well, by the description given above, I think that the design and approach taken it's correct! With respect to the choice of the activation function, remember that those functions help to get the neurons which have the largest activation number, also, their algebraic properties, such as an easy derivative, help with the definition of Backpropagation. Taking this into account, you should not worry about your choice of activation function.
The ranges that you mention above, correspond to a process of scaling of the input, it is better to have your input images in range 0 to 1. This helps to scale the error surface and help with the speed and convergence of the optimization process. Because your input set is composed of images, and each image is composed of pixels, the minimum value and and the maximum value that a pixel can attain is 0 and 255, respectively. To scale your input in this example, it is essential to divide each value by 255.
Now, with respect to the training problems, Have you tried checking if your gradient calculation routine is correct? i.e., by using the cost function, and evaluating the cost function, J? If not, try generating a toy vector theta that contains all the weight matrices involved in your neural network, and evaluate the gradient at each point, by using the definition of gradient, sorry for the Matlab example, but it should be easy to port to C++:
perturb = zeros(size(theta));
e = 1e-4;
for p = 1:numel(theta)
% Set perturbation vector
perturb(p) = e;
loss1 = J(theta - perturb);
loss2 = J(theta + perturb);
% Compute Numerical Gradient
numgrad(p) = (loss2 - loss1) / (2*e);
perturb(p) = 0;
end
After evaluating the function, compare the numerical gradient, with the gradient calculated by using backpropagation. If the difference between each calculation is less than 3e-9, then your implementation shall be correct.
I recommend to checkout the UFLDL tutorials offered by the Stanford Artificial Intelligence Laboratory, there you can find a lot of information related to neural networks and its paradigms, it's worth to take look at it!
http://ufldl.stanford.edu/wiki/index.php/Main_Page
http://ufldl.stanford.edu/tutorial/
For a group project, we are attempting to make a game, where functions are executed whenever a player forms a set of specific hand gestures in front of a camera. To process the images, we are using Open-CV 2.3.
During the image-processing we are trying to find the length between two points.
We already know this can be done very easily with Pythagoras law, though it is known that Pythagoras law requires much computer power, and we wish to do this as low-resource as possible.
We wish to know if there exist any build-in function within Open-CV or standard library for C++, which can handle low-resource calculations of the distance between two points.
We have the coordinates for the points, which are in pixel values (Of course).
Extra info:
Previous experience have taught us, that OpenCV and other libraries are heavily optimized. As an example, we attempted to change the RGB values of the live image feed from the camera with a for loop, going through each pixel. This provided with a low frame-rate output. Instead we decided to use an Open-CV build-in function instead, which instead gave us a high frame-rate output.
You should try this
cv::Point a(1, 3);
cv::Point b(5, 6);
double res = cv::norm(a-b);//Euclidian distance
As you correctly pointed out, there's an OpenCV function that does some of your work :)
(Also check the other way)
It is called magnitude() and it calculates the distance for you. And if you have a vector of more than 4 vectors to calculate distances, it will use SSE (i think) to make it faster.
Now, the problem is that it only calculate the square of the powers, and you have to do by hand differences. (check the documentation). But if you do them also using OpenCV functions it should be fast.
Mat pts1(nPts, 1, CV_8UC2), pts2(nPts, 1, CV_8UC2);
// populate them
Mat diffPts = pts1-pts2;
Mat ptsx, ptsy;
// split your points in x and y vectors. maybe separate them from start
Mat dist;
magnitude(ptsx, ptsy, dist); // voila!
The other way is to use a very fast sqrt:
// 15 times faster than the classical float sqrt.
// Reasonably accurate up to root(32500)
// Source: http://supp.iar.com/FilesPublic/SUPPORT/000419/AN-G-002.pdf
unsigned int root(unsigned int x){
unsigned int a,b;
b = x;
a = x = 0x3f;
x = b/x;
a = x = (x+a)>>1;
x = b/x;
a = x = (x+a)>>1;
x = b/x;
x = (x+a)>>1;
return(x);
}
This ought to a comment, but I haven't enough rep (50?) |-( so I post it as an answer.
What the guys are trying to tell you in the comments of your questions is that if it's only about comparing distances, then you can simply use
d=(dx*dx+dy*dy) = (x1-x2)(x1-x2) + (y1-y2)(y1-y2)
thus avoiding the square root. But you can't of course skip the square elevation.
Pythagoras is the fastest way, and it really isn't as expensive as you think. It used to be, because of the square-root. But modern processors can usually do this within a few cycles.
If you really need speed, use OpenCL on the graphics card for image processing.
I have an audio file and I am iterating through the file and taking 512 samples at each step and then passing them through an FFT.
I have the data out as a block 514 floats long (Using IPP's ippsFFTFwd_RToCCS_32f_I) with real and imaginary components interleaved.
My problem is what do I do with these complex numbers once i have them? At the moment I'm doing for each value
const float realValue = buffer[(y * 2) + 0];
const float imagValue = buffer[(y * 2) + 1];
const float value = sqrt( (realValue * realValue) + (imagValue * imagValue) );
This gives something slightly usable but I'd rather some way of getting the values out in the range 0 to 1. The problem with he above is that the peaks end up coming back as around 9 or more. This means things get viciously saturated and then there are other parts of the spectrogram that barely shows up despite the fact that they appear to be quite strong when I run the audio through audition's spectrogram. I fully admit I'm not 100% sure what the data returned by the FFT is (Other than that it represents the frequency values of the 512 sample long block I'm passing in). Especially my understanding is lacking on what exactly the compex number represents.
Any advice and help would be much appreciated!
Edit: Just to clarify. My big problem is that the FFT values returned are meaningless without some idea of what the scale is. Can someone point me towards working out that scale?
Edit2: I get really nice looking results by doing the following:
size_t count2 = 0;
size_t max2 = kFFTSize + 2;
while( count2 < max2 )
{
const float realValue = buffer[(count2) + 0];
const float imagValue = buffer[(count2) + 1];
const float value = (log10f( sqrtf( (realValue * realValue) + (imagValue * imagValue) ) * rcpVerticalZoom ) + 1.0f) * 0.5f;
buffer[count2 >> 1] = value;
count2 += 2;
}
To my eye this even looks better than most other spectrogram implementations I have looked at.
Is there anything MAJORLY wrong with what I'm doing?
The usual thing to do to get all of an FFT visible is to take the logarithm of the magnitude.
So, the position of the output buffer tells you what frequency was detected. The magnitude (L2 norm) of the complex number tells you how strong the detected frequency was, and the phase (arctangent) gives you information that is a lot more important in image space than audio space. Because the FFT is discrete, the frequencies run from 0 to the nyquist frequency. In images, the first term (DC) is usually the largest, and so a good candidate for use in normalization if that is your aim. I don't know if that is also true for audio (I doubt it)
For each window of 512 sample, you compute the magnitude of the FFT as you did. Each value represents the magnitude of the corresponding frequency present in the signal.
mag
/\
|
| ! !
| ! ! !
+--!---!----!----!---!--> freq
0 Fs/2 Fs
Now we need to figure out the frequencies.
Since the input signal is of real values, the FFT is symmetric around the middle (Nyquist component) with the first term being the DC component. Knowing the signal sampling frequency Fs, the Nyquist frequency is Fs/2. And therefore for the index k, the corresponding frequency is k*Fs/512
So for each window of length 512, we get the magnitudes at specified frequency. The group of those over consecutive windows form the spectrogram.
Just so people know I've done a LOT of work on this whole problem. The main thing I've discovered is that the FFT requires normalisation after doing it.
To do this you average all the values of your window vector together to get a value somewhat less than 1 (or 1 if you are using a rectangular window). You then divide that number by the number of frequency bins you have post the FFT transform.
Finally you divide the actual number returned by the FFT by the normalisation number. Your amplitude values should now be in the -Inf to 1 range. Log, etc, as you please. You will still be working with a known range.
There are a few things that I think you will find helpful.
The forward FT will tend to give larger numbers in the output than in the input. You can think of it as all of the intensity at a certain frequency being displayed at one place rather than being distributed through the dataset. Does this matter? Probably not because you can always scale the data to fit your needs. I once wrote an integer based FFT/IFFT pair and each pass required rescaling to prevent integer overflow.
The real data that are your input are converted into something that is almost complex. As it turns out buffer[0] and buffer[n/2] are real and independent. There is a good discussion of it here.
The input data are sound intensity values taken over time, equally spaced. They are said to be, appropriately enough, in the time domain. The output of the FT is said to be in the frequency domain because the horizontal axis is frequency. The vertical scale remains intensity. Although it isn't obvious from the input data, there is phase information in the input as well. Although all of the sound is sinusoidal, there is nothing that fixes the phases of the sine waves. This phase information appears in the frequency domain as the phases of the individual complex numbers, but often we don't care about it (and often we do too!). It just depends upon what you are doing. The calculation
const float value = sqrt((realValue * realValue) + (imagValue * imagValue));
retrieves the intensity information but discards the phase information. Taking the logarithm essentially just dampens the big peaks.
Hope this is helpful.
If you are getting strange results then one thing to check is the documentation for the FFT library to see how the output is packed. Some routines use a packed format where real/imaginary values are interleaved, or they may begin at the N/2 element and wrap around.
For a sanity check I would suggest creating sample data with known characteristics, eg Fs/2, Fs/4 (Fs = sample frequency) and compare the output of the FFT routine with what you'd expect. Try creating both a sine and cosine at the same frequency, as these should have the same magnitude in the spectrum, but have different phases (ie the realValue/imagValue will differ, but the sum of squares should be the same.
If you're intending on using the FFT though then you really need to know how it works mathematically, otherwise you're likely to encounter other strange problems such as aliasing.