I wrote a program that simulates perfectly elastic collisions in 1D using an event-driven algorithm. The problem is something akin to Newton's cradle.
I was trying to fix something I perceive to be an issue, which is that when I give two spheres an equal initial velocity, the position and velocity arrays are updated twice due to simultaneous collisions (and thus I get a double output for the same instant).
To this end, I created a variable to check whether the next collision happens within 0 seconds (i.e. "now"). However, once I do that, the time until collision changes completely for those simultaneous collisions.
There are 5 particles in total, and each one has a radius of 1, with a separation of 0.5 between them. Collisions are perfectly elastic, gravity is disregarded, and the algorithm stops when the final particle hits a wall placed arbitrarily in front of it.
Initial velocity for the first two particles is 1, so the first collision should occur after 0.5, and the second and third collisions should occur simultaneously after another 0.5.
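For reference, the time-to-collision arithmetic can be sketched like this (a hypothetical helper, not the actual updatePos; the names and signature are mine):

```cpp
#include <limits>

// Time until two unit-radius spheres touch in 1D, assuming sphere j
// starts to the right of sphere i. Centers are 2.0 apart at contact.
double timeToCollision(double xi, double vi, double xj, double vj) {
    double closing = vi - vj;  // speed at which the gap shrinks
    if (closing <= 0.0)        // separating or moving in lockstep: never collide
        return std::numeric_limits<double>::infinity();
    return (xj - xi - 2.0) / closing;  // surface gap / closing speed
}
```

With radius 1 and a surface separation of 0.5, the centers start 2.5 apart, so a particle moving at 1 toward a stationary neighbor hits it after (2.5 - 2.0) / 1 = 0.5, matching the expected first collision time.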
Before adding the variable check, the time until collision was output as 6.94906e-310 (verified by printing the return value of the function that calculates it).
With the new variable that was going to be used to check whether the previous value was zero, the time until a simultaneous collision is now output as -1, both in the new variable and in the return value of the aforementioned function.
I'm guessing this has something to do with it being an extremely small double value, but I don't quite understand the problem. How could creating one variable affect the value of another (albeit somewhat related) variable to this extent?
The code below is just to help visualize the problem. It is not a MWE. I'd have to post almost my entire code here to produce a MWE. Because this is for a class, I could get plagiarized or be accused of plagiarism myself, so I don't feel comfortable posting it.
Again, not a MWE, just to better explain what I'm doing.
//x and v are arrays with the positions and
//velocities, respectively, of each particle.
//hasReachedWall(x) checks if the last particle has hit the wall
while (!hasReachedWall(x)) {
    //The return value of updatePos(x, v) is a double with the time until the next collision.
    //aux is the cause of the problem. Its existence somehow changes
    //the return value to -1 when it should be near zero. Without this variable,
    //the return value is as expected (near zero)
    double aux = updatePos(x, v);
    cout << aux << endl;
    //t is also a double
    t += aux;
}
EDIT: I am aware of how doubles are stored internally and that operations performed on them carry errors. My problem is that the mere creation of an intermediate variable completely changes the result of an operation.
I'm creating a double to store the return value of a function (another double - not long double, just double), and the return value of the function changes radically. I don't understand why.
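Separately from that mystery, comparing a collision time against exactly 0 is fragile for doubles; a tolerance check is the usual way to detect "simultaneous" collisions (a sketch; EPS is an assumed value you would tune):

```cpp
#include <cmath>

// Treat any collision time within EPS of zero as happening "now", so
// two simultaneous collisions get processed in the same time step.
bool happensNow(double dt) {
    const double EPS = 1e-9;  // assumed tolerance; tune to your time scale
    return std::fabs(dt) < EPS;
}
```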
Given a real number X in [0, 1] and a bin size dx, I have to identify which bin X falls into. I am using i = std::size_t(X/dx), which works very well. I then look up the corresponding value of a given array v and set a second variable Y with double Y = v[i]. The whole code looks as follows:
double X = func();
double dx = 0.01;
std::size_t i = std::size_t(X / dx);
double Y = v[i];
std::cout << Y;
This method correctly gives the expected value for the index i within the range [0, length(v)].
My main issue is not with finding the index but with using it: X is determined by an auxiliary function, and whenever I set Y = v[i] using the index determined above, the code becomes extremely slow.
Without commenting out or removing any of the lines, the code becomes much faster if I set X to some random value between 0 and 1 right after its definition, or if I set i to some random value between 0 and the length of v after the third line.
Could anyone tell me why this occurs? The speed changes by a factor of 1000 if not more, and since the faster variants only add steps and func() is called anyway, I can't understand why they should be faster.
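For what it's worth, a self-contained version of the binning step looks like this (the names are mine; the clamp guards the X == 1.0 edge case):

```cpp
#include <algorithm>
#include <cstddef>

// Map X in [0, 1] to a bin index for bins of width dx.
std::size_t binIndex(double X, double dx, std::size_t nBins) {
    std::size_t i = static_cast<std::size_t>(X / dx);
    return std::min(i, nBins - 1);  // keep X at (or near) 1.0 in range
}
```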
Since you have put no code in the question, here are some wild guesses:
You didn't sort all the X results before accessing the lookup table. Processing a sorted array is faster.
Some of the X values were denormalized, which takes a toll on computation time on certain CPU types, including yours.
The dataset is too big for the L3 cache, so it always accessed RAM instead of getting the quick cache hits seen in the other test.
The compiler was optimizing all of the expensive function calls out, but in the real-world test scenario it is not.
The time measurement has bugs.
The computer's performance is not stable (e.g. it's a shared server, or an antivirus is feeding on RAM bandwidth).
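The guess about denormalized values is easy to test directly; std::fpclassify reports whether a double is subnormal:

```cpp
#include <cmath>

// True if x is a subnormal (denormalized) double; some CPUs process
// these orders of magnitude more slowly than normal values.
bool isSubnormal(double x) {
    return std::fpclassify(x) == FP_SUBNORMAL;
}
```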
I'm curious if there's a way to make this function:
float linearInterpolation(float startPoint, float endPoint, float time)
{
    return startPoint + ((endPoint - startPoint) * time);
}
More linear: right now, as the start point nears the end point, it slows down. I just want it to move at the same speed all the way through, no slowing down or speeding up. If I need to add another variable or something, that can be done. Another function that takes the same variables and outputs the next value would also be acceptable.
That looks like a correct implementation of lerp to me. Without seeing the code that calls it, I'm going to guess that it is slowing down because you are calling it with a different startPoint each time (each frame?), for example with the previous result of linearInterpolation, which shrinks the distance left to interpolate on every call.
Make sure the startPoint and endPoint variables are the same for each call over the life of the interpolation and only the time variable is increasing.
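To make the two call patterns concrete, here is a small sketch (the numbers are illustrative):

```cpp
float linearInterpolation(float startPoint, float endPoint, float time) {
    return startPoint + ((endPoint - startPoint) * time);
}

// Constant speed: endpoints stay fixed, only time advances 0 -> 1,
// so each equal step in time moves the result by an equal amount.
//
// Ease-out bug: feeding each result back in as the next startPoint
// makes every call cover only a fraction of the remaining distance,
// so the motion slows down as it approaches the end point.
```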
UPDATE: MY BAD. This was not the cause of the slowdown. I had other bugs.
C++ MFC. Visual Studio 12.
I'm trying to optimize performance within a draw loop. I have a list of all my objects (ListAll); let's say it has 300 objects, all with unique IDs. I have a second list (ListNow) of the IDs which need to be rendered, of size 100. All the values in ListNow have associated objects stored in ListAll.
Currently, ListAll is a CMap<UINT, UINT, Object*, Object*>, and ListNow is a CArray<UINT, UINT>.
// this is the slower, current method
for (int i = 0; i < ListNow.GetSize(); i++)
{
    UINT id = ListNow.GetAt(i);
    if (ListAll->Lookup(id, object))
    {
        object->draw();
    }
}
In the past I only had ListAll (the CMap), and I called draw() on every object in it. It contained only the 100 I wanted to draw, and I 'rebuilt' it every time I switched what was being drawn.
// this is the faster, old method
POSITION pos = ListAll->GetStartPosition();
while (pos)
{
    ListAll->GetNextAssoc(pos, id, object);
    object->Draw();
}
Technically both loops run in O(n) time... but simply adding the CMap::Lookup call to the loop has doubled the time it takes. I have properly set my CMap's hash table size to a prime number larger than the number of objects in the CMap. The slowdown is blatant with lists of 300,000 elements and above.
I switched to this system so that I could store all the objects in the draw lists and quickly swap what is being drawn between different windows using the same object lists. This drastically speeds up switching, but has slowed down each individual draw call. Switching back now is not an option; we knew each draw call would slow down a bit, but not this much. The slowdown is definitely in the code shown, because when I switch back to drawing everything (removing the lookup), the time is cut in half.
My only idea for increasing performance is to record the last-drawn object pointers in a list and tell the function whether it needs to change (call Lookup()) or can simply reuse the last-drawn pointers (GetNext()), since 90% of the time nothing has changed between calls.
Does anyone have a faster solution than this? I'm dreaming of some tricky bit-masking solution that somehow produces the object pointers I want. Anything would help at this point.
It appears that your problem would be solved if you store Object pointers instead of IDs in your ListNow.
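A sketch of that idea using standard containers in place of CMap/CArray (purely illustrative; the MFC version is analogous): resolve each ID to its Object* once, when the visible set changes, and iterate plain pointers in the per-frame loop.

```cpp
#include <unordered_map>
#include <vector>

struct Object {
    int drawCount = 0;
    void draw() { ++drawCount; }
};

// Rebuild only when the set of visible IDs changes: pay the lookup
// cost once here instead of once per object per frame.
std::vector<Object*> buildDrawList(
        const std::unordered_map<unsigned, Object*>& listAll,
        const std::vector<unsigned>& listNow) {
    std::vector<Object*> drawList;
    drawList.reserve(listNow.size());
    for (unsigned id : listNow) {
        auto it = listAll.find(id);
        if (it != listAll.end())
            drawList.push_back(it->second);
    }
    return drawList;
}
```

The per-frame loop then touches no hash table at all: `for (Object* obj : drawList) obj->draw();`.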
I am working on building a Tic Tac Toe game with varying AI implementations for a computer opponent for the sake of learning different algorithms and how to implement them. The first I am trying which should be the easiest is just having the computer choose a random space each time.
This works to a certain extent; the issue is run time. Every time the aiRandMove() method is called, it takes longer and longer to pick a move, to the point where after 5 moves have been made on the board (CPU + user combined) the program appears to hang (although that isn't technically the case).
Upon further debugging I realized this should be expected, since aiRandMove() randomly chooses X and Y coordinates and then tests the move for legality. As fewer and fewer spaces remain open, there are fewer and fewer legal moves, and thus many more failed attempts by the randomizer to generate a legal one.
My question is: is there any way to modify this that would at least reduce the time taken by the function? As far as I can tell from googling and running through the problem myself, I cannot think of a way to optimize this without compromising the "randomness" of the function. I thought about keeping an array of the moves the computer had attempted, but that would not resolve the problem, because it would not affect how often rand() generates duplicate numbers. Here is the code for the function, which is all that is really relevant to this issue:
//Function which handles the AI making a random move. Requires a Board
//object to test move legality and a Player object to make a move with;
//both are passed by reference because changes to the board state and the
//player's evaluation arrays must be saved
char aiRandMove(Player &ai, Board &game){
    int tryX;
    int tryY; //Variables to store the computer's attempted moves
    bool moveMade = false;
    char winner;
    while(!moveMade){
        srand(time(NULL)); //Randomizes the seed for rand()
        tryX = rand() % 3;
        tryY = rand() % 3; //coordinates are random numbers between 0 and 2
        cout << "Trying move " << tryX << ", " << tryY << endl;
        if(game.isLegalMove(tryX, tryY)){
            winner = game.makeMove(tryX, tryY, ai);
            moveMade = true;
        }
    }
    return winner;
}
I have also tried moving the seed call out of the while loop (it was put inside the loop to "increase randomness", even though that is something of a logical folly), and this has not improved results either.
If all else fails, I may just label this method "Easy" and keep the purely random moves until I can tell whether I need to block or make the winning move. But perhaps there are other random functions that may assist in this endeavor. Any and all thoughts and comments are more than appreciated!
You need to remove the invalid moves from the equation, such as with the following pseudo-code, using an array to collect valid moves:
possibleMoves = []
for each move in allMoves:
    if move is valid:
        add move to possibleMoves
move = possibleMoves[random(possibleMoves.length)]
That removes the possibility that you will call random more than once per attempted move since all possibilities in the array are valid.
Alternatively, you can start the game with all moves in the possibleMoves array and remove each possibility as it's used.
You also need to learn that it's better to seed a random number generator once and then just use the numbers it generates. Seeding it with time(0) every time you try to get a random number will ensure that you get the same number for an entire second.
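With C++11's <random>, the seed-once pattern looks like this (a sketch; the helper name is mine):

```cpp
#include <random>

// One generator, seeded once at startup, reused for every move.
std::mt19937 rng{std::random_device{}()};
std::uniform_int_distribution<int> coord{0, 2};

// Called from the move loop: no reseeding, just draw a coordinate.
int randomCoord() {
    return coord(rng);
}
```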
Given that there are at most 9 choices, even your random picking would not cause a long delay. What is causing the long delay is calling srand inside the loop: it makes your program get the same random numbers for the duration of a second, while the loop is probably executing millions of times in that second (or would be, without the cout call).
Move the srand call outside of the loop (or better yet, just call it once at the start of your program).
That is not to say you shouldn't look at ways of removing the unavailable moves from the random selection, as it may make a difference for other types of games.
You could reduce that to very acceptable levels by creating a list of free coordinates and getting a random index in that collection. Conceptually:
#include <vector>

struct tictactoe_point
{
    int x, y;
};

std::vector<tictactoe_point> legal_points;
tictactoe_point point;

for (point.x = 0; point.x < 3; point.x++)
{
    for (point.y = 0; point.y < 3; point.y++)
    {
        if (game.isLegalMove(point.x, point.y))
        {
            legal_points.push_back(point);
        }
    }
}

point = legal_points[rand() % legal_points.size()];
game.makeMove(point.x, point.y, ai);
moveMade = true;
This solution is not optimal, but it's a significant improvement: now, the time it takes to make a move is fully predictable. This algorithm will complete with one single call to rand.
The fact that you call srand each time you pick a number makes the process even slower, but then again, the major problem is that your current solution has to try over and over again. It's not bounded: it may never complete. Even if srand were considerably slow, knowing it runs exactly once, rather than an indefinite number of times, would make it viable (though still not optimal).
There are many ways to improve on this:
Keep a list of valid coordinates to play, and remove the coordinates when either the player or the AI plays it. This way you don't have to rebuild the list at every turn. It won't make a big difference for a tic-tac-toe game, but it would make a big difference if you had a larger board.
Use the standard C++ random function. This isn't really an algorithm improvement, but rand() in C is pretty crappy (I know, I know, it's a long video, but this guy really really knows his stuff).
The reason it seems slower every move is that the AI keeps picking moves that have already been made, so it randomly re-picks either another illegal move (possibly one it has tried before) or, eventually, a legal square.
To speed this part of your program up, you could have a collection (e.g. a linked list) that contains the open positions, and use your random function over that list. When a move is picked by you or the AI, remove the element from the list.
This removes the recurring problem of the AI picking squares that are already taken.
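A sketch of that remove-on-use collection, using a vector with swap-and-pop so removal is O(1) (the names are mine, not from the question):

```cpp
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

struct Cell { int x, y; };

// Pick a random free cell and delete it in O(1): swap the chosen
// element with the last one, then pop the back.
Cell takeRandomCell(std::vector<Cell>& freeCells) {
    std::size_t i = std::rand() % freeCells.size();
    std::swap(freeCells[i], freeCells.back());
    Cell picked = freeCells.back();
    freeCells.pop_back();
    return picked;
}
```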
Here is what I'm doing. My application takes points from the user while dragging and in real time displays a filled polygon.
It basically adds the mouse position on MouseMove. Each point is a USERPOINT and has bezier handles, because eventually I will do bezier curves, and this is why I must transfer them into a vector.
So basically MousePos -> USERPOINT. The USERPOINT gets added to a std::vector<USERPOINT>. Then in my UpdateShape() function, I do this:
DrawingPoints is defined like this:
std::vector<std::vector<GLdouble>> DrawingPoints;
Contour[i].DrawingPoints.clear();
for(unsigned int x = 0; x < Contour[i].UserPoints.size() - 1; ++x)
    SetCubicBezier(
        Contour[i].UserPoints[x],
        Contour[i].UserPoints[x + 1],
        i);
SetCubicBezier() currently looks like this:
void OGLSHAPE::SetCubicBezier(USERFPOINT &a, USERFPOINT &b, int &currentcontour)
{
    std::vector<GLdouble> temp(2);
    if(a.RightHandle.x == a.UserPoint.x && a.RightHandle.y == a.UserPoint.y
       && b.LeftHandle.x == b.UserPoint.x && b.LeftHandle.y == b.UserPoint.y)
    {
        temp[0] = (GLdouble)a.UserPoint.x;
        temp[1] = (GLdouble)a.UserPoint.y;
        Contour[currentcontour].DrawingPoints.push_back(temp);
        temp[0] = (GLdouble)b.UserPoint.x;
        temp[1] = (GLdouble)b.UserPoint.y;
        Contour[currentcontour].DrawingPoints.push_back(temp);
    }
    else
    {
        //do cubic bezier calculation
    }
}
So for the cubic bezier, I need to turn USERPOINTs into GLdouble[2] (since the GLU tesselator takes in a static array of doubles).
So I did some profiling. At ~ 100 points, the code:
for(unsigned int x = 0; x < Contour[i].UserPoints.size() - 1; ++x)
    SetCubicBezier(
        Contour[i].UserPoints[x],
        Contour[i].UserPoints[x + 1],
        i);
took 0 ms to execute. Then, around 120 points, it jumps to 16 ms and never looks back. I'm positive this is due to std::vector. What can I do to keep it at 0 ms? I don't mind using lots of memory while generating the shape and then removing the excess once the shape is finalized, or something like that.
0 ms is no time... nothing executes in no time. This should be your first indicator that you need to check your timing method rather than trust the timing results.
Namely, timers typically don't have good resolution. Your pre-16 ms results are probably actually somewhere between 1 ms and 15 ms, incorrectly reported as 0 ms. In any case, if we could tell you how to keep it at 0 ms, we'd be rich and famous.
Instead, find out which parts of the loop take the longest and optimize those. Don't work toward an arbitrary time measure. I'd recommend getting a good profiler to get accurate results; then you don't need to guess what's slow (something in the loop), but can actually see which part is slow.
You could use vector::reserve() to avoid unnecessary reallocations in DrawingPoints (reserving two points per segment, since the straight-line branch pushes two entries each):
Contour[i].DrawingPoints.reserve(2 * Contour[i].UserPoints.size());
for(unsigned int x = 0; x < Contour[i].UserPoints.size() - 1; ++x) {
...
}
If you actually timed the second code snippet only (as you stated in your post), then you're probably just reading from the vector. That means the cause cannot be the reallocation cost of the vector. In that case, it may be due to CPU cache effects: small datasets can be read at lightning speed from the CPU cache, but when the dataset is larger than the cache (or when alternately reading from different memory locations), the CPU has to access RAM, which is distinctly slower.
If the part of the code you profiled appends data to the vector, then use std::vector::reserve() with an appropriate capacity (the number of expected entries) before filling it.
However, observe two general rules for profiling/benchmarking:
1) Use a time measurement method with high resolution (as others stated, the resolution of your timer IS too low).
2) In any case, run the code snippet more than once (e.g. 100 times), take the total time of all runs, and divide it by the number of runs. This will give you some REAL numbers.
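Both rules can be followed with a helper along these lines (a sketch using std::chrono; the name is mine):

```cpp
#include <chrono>

// Run `work` many times and return the average duration in
// microseconds, so sub-millisecond loops aren't reported as "0 ms".
template <typename F>
double averageMicroseconds(F work, int runs) {
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i)
        work();
    auto t1 = std::chrono::steady_clock::now();
    auto total =
        std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0);
    return static_cast<double>(total.count()) / runs;
}
```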
There's a lot of guessing going on here. Good guesses, I imagine, but guesses nevertheless. And measuring how much time functions take doesn't tell you how they spend it. You can try different things and see the time change, which suggests what was taking the time, but you can't really be certain.
If you really want to know what's taking the time, you need to catch it when it's taking that time, and find out for certain what it's doing. One way is to single-step it at the instruction level through that code, but I suspect that's out of the question. The next best way is to get stack samples. You can find profilers that are based on stack samples. Personally, I rely on the manual technique, for the reasons given here.
Notice that it's not really about measuring time. It's about finding out why that extra time is being spent, which is a very different question.