I am using Box2D for a game that communicates with a server, and I need complete determinism. I would simply like to use integer/fixed-point math to achieve this, and I was wondering if there is a way to enable that in Box2D.
Yes, albeit only with a fixed-point implementation and with modifications to the Box2D library code.
The C++ library code for Box2D 2.3.2 uses the float32 type for its implementation of real-number-like values. Since float32 is defined in b2Settings.h via a typedef (to the C++ float type), that one line can be changed to use a different underlying implementation of real-number-like values.
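For reference, the relevant line in b2Settings.h looks like this; the Fixed type shown in the comment is a hypothetical stand-in for whatever fixed-point class you supply, not something Box2D provides:

// b2Settings.h (Box2D 2.3.2)
typedef float float32;

// Changed to use a custom fixed-point type instead:
// typedef Fixed float32;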
Unfortunately, some of the code (like b2Max) is used or written in ways that break if float32 is not defined to be a float. Those errors then have to be chased down and the errant code rewritten so that the new type can be used.
I have done this sort of work myself, including writing my own fixed-point implementation. The short of it is that I'd recommend a 64-bit implementation with between 14 and 24 bits for the fractional portion of values (at least to make it through most of the Testbed tests without unusable amounts of underflow/overflow). You can take a look at my fork to see how I've done this, but it's not presently code that's ready for release (not as of 2/11/2017).
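To give a rough idea of the shape such a type takes, here is a minimal sketch (the class name, bit count, and operator set are placeholders, not the code from my fork; a usable version also needs comparisons, sqrt, trig, and proper overflow handling before Box2D will run on it):

#include <cstdint>

class Fixed64
{
public:
    static const int FRACTION_BITS = 20; // somewhere in the 14-24 range

    Fixed64() : m_value(0) {}
    Fixed64(float v) : m_value(static_cast<int64_t>(v * (int64_t(1) << FRACTION_BITS))) {}

    explicit operator float() const
    {
        return static_cast<float>(m_value) / (int64_t(1) << FRACTION_BITS);
    }

    Fixed64 operator+(Fixed64 o) const { return FromRaw(m_value + o.m_value); }
    Fixed64 operator-(Fixed64 o) const { return FromRaw(m_value - o.m_value); }

    Fixed64 operator*(Fixed64 o) const
    {
        // Pre-shifting each operand keeps the product within 64 bits at the
        // cost of some precision; a full implementation would use a 128-bit
        // intermediate instead.
        return FromRaw((m_value >> (FRACTION_BITS / 2)) *
                       (o.m_value >> (FRACTION_BITS - FRACTION_BITS / 2)));
    }

private:
    static Fixed64 FromRaw(int64_t raw)
    {
        Fixed64 f;
        f.m_value = raw;
        return f;
    }

    int64_t m_value;
};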
The only way you can achieve determinism in a physics engine is to use a fixed time step when updating it. You can read more at this link: http://saltares.com/blog/games/fixing-your-timestep-in-libgdx-and-box2d/
The DirectX Math API for matrix calculation contains separate functions for generating left-handed vs. right-handed matrices (e.g. XMMatrixLookAtLH vs. XMMatrixLookAtRH, alongside XMMatrixPerspectiveLH vs. XMMatrixPerspectiveRH).
I don't fully understand the difference between the two coordinate systems (especially as they apply to traditional DirectX and OpenGL), but why is the API structured like this, as opposed to combining the entry points and providing, say, an enum indicating handedness, or an extra generic function that converts a matrix intended for a right-handed system into one for a left-handed system (or vice versa)? Is it simply that both operations need to be fast (i.e. the options could be provided, but they would be too slow for any practical purpose and thus not worth supporting), or is there something fundamental about these matrix functions that requires the LH and RH variants to be entirely separate entry points?
EDIT: To clarify: while I do appreciate answers expounding on why the API design decisions were made, my primary curiosity is whether a parameterization or after-the-fact conversion function can be implemented correctly or efficiently once you consider the math and the implementation (i.e. if the two halves can't really share code, a combined version would be inefficient).
The DirectXMath project has a long history to it. I started working on it back in 2008 when it was "xboxmath" for the Xbox 360, focused on VMX128 with no SSE/SSE2 optimizations. Much of the initial API surface area has been preserved since then, as I've tried to maintain support for existing clients while moving from xboxmath to xnamath, then from xnamath to DirectXMath, including keeping "LH" and "RH" as two distinct functions.
There is a practical reason for this design: a single application is only going to use one or the other, not both. Having a potential run-time check of a parameter to pick something that is fixed and known is not that useful.
Another practical reason is to minimize branching in the code. Most of the DirectXMath functions are straight-line code that avoids all branching, using element selects instead. This was originally motivated by the fact that the Xbox 360 was an extremely fast in-order processor, but didn't have a particularly advanced branch predictor.
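As a sketch of what that idiom looks like (this is the general compare-and-select pattern, not the library's actual implementation of any particular function):

#include <DirectXMath.h>
using namespace DirectX;

// Branch-free max: the comparison produces a per-element mask, and the
// select picks elements from either input based on that mask, so no
// conditional jump is ever generated.
inline XMVECTOR VectorMax(FXMVECTOR a, FXMVECTOR b)
{
    XMVECTOR mask = XMVectorGreater(a, b); // per-element all-ones or zero
    return XMVectorSelect(b, a, mask);     // where mask is set, take a
}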
Generally the choice of viewing coordinate system is a matter of historical comfort: OpenGL has long preferred column-major, right-handed viewing systems. DirectX has historically used row-major, left-handed viewing coordinates. XNA Game Studio chose to go with row-major, right-handed viewing coordinates.
With the modern programmable GPU pipeline, there is actually no requirement to use one or the other as long as you are consistent: the DirectX API can support either LH or RH. Most DirectX samples, including anything written using DXUT, use left-handed viewing coordinates. Most Windows Store samples and .NET-based systems generally stick with the XNA Game Studio convention.
Because of all this, DirectXMath supports row-major matrices and leaves it up to the developer to use right-handed vs. left-handed. The SimpleMath wrapper for DirectXMath is intended to feel natural to those coming from C#'s XNA Game Studio math library, so it assumes right-handed.
In response to #galpo1n: "DirectXMath is quite ancient now too, not a good example of C++, it is good enough for what it does, but most project would rewrite their math library."
xboxmath's tradition is to use C-callable functions because in the early days of Xbox 360 there were still plenty of developers who preferred C over C++. That has become far less important over time as C++ compilers have matured and developer tastes have changed. In the transition from XNAMath to DirectXMath, I made the library C++ only (i.e. no C), took advantage of things like stdint.h and C++ namespaces, and made use of templates and specializations to improve the implementation of permute and shuffle operations for the SSE/SSE2 instruction set.
The C++ language use of DirectXMath has also tracked Visual C++ compiler support. DirectXMath 3.08 uses =default, and the upcoming 3.09 release uses constexpr. At its core, though, it remains a basically C interface by design. Really, the best way to think of it is that each DirectXMath function is a 'meta-intrinsic': they are all inline, and really you want the compiler to stitch a bunch of them together into one codepath for maximum efficiency. While more recent compilers have gotten better at optimizing C++ code patterns, even the old ones (think Visual C++ .NET 2002 era) did pretty well with C code.
The original API implemented VMX128 and "no-intrinsics". XNAMath implemented VMX128, no-intrinsics, and SSE/SSE2 for x86 & x64. DirectXMath no longer supports VMX128, but added ARM-NEON, and I've since added optional codepaths for SSE3, SSE4.1, AVX, and AVX2. As such, this C-style API has proven to map well to the SIMD intrinsics of a variety of processor families.
SimpleMath is where I decided to put C++ type-conversion behaviors to hide the verbosity around loading & storing data. This is less efficient because the programmer may not realize they are actually doing something expensive, which is why I kept it out of the base library. If you avoid SimpleMath and stick with DirectXMath, you will write more verbose code, but in return you know when you are doing something that's potentially performance-impacting; with C++ implicit conversions and constructors, you can end up spilling to memory through temporaries when you didn't expect to. If I had put this in the base library, performance-sensitive programmers couldn't easily opt out. It's all a matter of trade-offs.
UPDATE: If you really need a parameterized version, you could do something simple and let the compiler take care of optimizing it:
#include <DirectXMath.h>
using namespace DirectX;

// Combined entry point: wherever rhcoords is a compile-time constant, the
// compiler will fold the branch away entirely.
inline XMMATRIX XM_CALLCONV XMMatrixPerspectiveFov(float FovAngleY, float AspectRatio, float NearZ, float FarZ, bool rhcoords)
{
    if (rhcoords)
    {
        return XMMatrixPerspectiveFovRH(FovAngleY, AspectRatio, NearZ, FarZ);
    }
    else
    {
        return XMMatrixPerspectiveFovLH(FovAngleY, AspectRatio, NearZ, FarZ);
    }
}
The library is all inline, so you basically have the source. Each case is a little different, so you can optimize each parameterized version individually. Some of the functions have full expansions of both codepaths; some just have one. In most cases it just comes down to negating Z.
API design like this is quite opinion-based. In an engine, the time you spend computing a matrix whose only effective difference is handedness is negligible. They could have used a set of flags instead of name and code duplication without any real major issue.
DirectXMath is quite ancient now too, and not a good example of C++. It is good enough for what it does, but most projects would rewrite their math library.
With modern GPUs and shaders, handedness is a pure fashion choice, as long as your pipeline is consistent (or performs conversions when required) from the modeling tool to the render engine. It is worth noting that in addition to handedness, you often have to deal with a Y-up or Z-up convention.
The easy way to understand handedness is to form a frame with your fingers (thumb is X, index is Y, middle is Z). If you do that with both hands and try to align two of the fingers, the difference is obvious: the third axis is inverted. That's all :)
I have a series of C++ signal-processing classes which use 32-bit floats as their primary sample datatype. For example, all the oscillator classes return floats for every sample that's requested. The same goes for all the classes: all sample calculations are in floating point.
I am porting these classes to iOS, and for performance reasons I want to operate in 8.24 fixed point to get the most out of the processor; word has it there are major performance advantages on iOS to crunching integers instead of floats. I'm currently doing all the calculations in floats, then converting to SInt32 at the final stage before output, which means every sample needs to be converted at that final stage.
Do I simply change the datatype used inside my classes from float to SInt32, so that my oscillators, filters, etc. calculate in fixed point by passing SInt32s around internally instead of floats?

Is it really this simple, or do I have to completely rewrite all the different algorithms?

Is there any other voodoo I need to understand before taking on this mission?

Many thanks to anyone who finds the time to comment on this. It's much appreciated.
It's mostly a myth. Floating-point performance used to be slow if you compiled for armv6 in Thumb mode; this is not an issue in armv7, which supports Thumb-2 (I'll avoid further discussion of armv6, which is no longer supported in Xcode). You also want to avoid using doubles, since floats can use the faster NEON (a.k.a. Advanced SIMD) unit. This is easy to do accidentally; try enabling -Wshorten.
I also doubt you'll get significantly better performance from an 8.24 multiply, especially compared with making use of the NEON unit. Changing float to int/int32_t/SInt32 will also not automatically do the shifts an 8.24 multiply requires.
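To illustrate why (the type alias and helper name here are mine, purely for illustration): multiplying two 8.24 values with a plain integer multiply yields a result scaled by 2^48, so the product has to be computed in 64 bits and shifted back down:

#include <cstdint>

typedef int32_t Fixed824; // 8 integer bits, 24 fractional bits

inline Fixed824 FixedMul(Fixed824 a, Fixed824 b)
{
    // Widen to 64 bits so the intermediate doesn't overflow, then shift
    // the extra 2^24 scale factor back out.
    return (Fixed824)(((int64_t)a * (int64_t)b) >> 24);
}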
If you know that converting floats to ints is the slow part, consider using some of the functions in Accelerate.framework, namely vDSP_vfix16() or vDSP_vfixr16().
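A minimal sketch of what that looks like (buffer names and sizes are made up; vDSP_vfix16 truncates toward zero, while vDSP_vfixr16 rounds):

#include <Accelerate/Accelerate.h>

float input[256];   // samples produced by the float-based classes
short output[256];  // 16-bit integer samples for output

// Convert all 256 floats in one call, stride 1 through both buffers.
vDSP_vfix16(input, 1, output, 1, 256);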
I'm writing an application that uses an SVM to do classification on some images (specifically these). My Matlab implementation works really well. Using a SIFT bag-of-words approach, I'm able to get near 100% accuracy with a linear kernel.
I need to implement this in C++ for speed/portability reasons, and so I've tried using both libsvm and dlib. I've tried multiple SVM types (c_svm, nu_svm, one_class) and multiple kernels (linear, polynomial, rbf). The best I've been able to achieve is around 50% accuracy - even on the same samples that I've trained on. I've confirmed that my feature generators are working, because when I export my c++-generated features to Matlab and train on those, I'm able to get near-perfect results again.
Is there something magical about Matlab's SVM implementation? Are there any common pitfalls or areas that I might look into that would explain the behavior I'm seeing? I know this is a little vague, but part of the problem is that I don't know where to go. Please let me know in the comments if there is other info I can provide that would be helpful.
There is nothing magical about the Matlab version of the libraries, other than that it runs in Matlab, which makes it harder to shoot yourself in the foot.
A checklist:

- Are you normalizing your data, making all values lie between 0 and 1 (or between -1 and 1), either linearly or using the mean and the standard deviation? (A scaling sketch follows this list.)
- Are you doing a parameter search for a good value of C (or C and gamma in the case of an RBF kernel), using cross-validation or a hold-out set?
- Are you sure you're handling NaN and all the other floating-point nastiness? Matlab is very good at hiding this from you; C++ not so much.
- Could it be that you're loading your data incorrectly, e.g. reading a "%s" into a double, or something else that is adding noise to your input data?
- Could it be that libsvm/dlib expects the data in row-major order and you're sending it in column-major (or the other way around)? Again, Matlab makes this almost impossible to get wrong; C++ not so much.
- 32/64-bit nastiness: one version of the library, the executable compiled with the other?
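For the normalization point, here is a minimal sketch of linear min-max scaling to [-1, 1] in C++ (the function name and column-at-a-time layout are assumptions; libsvm's svm-scale tool applies the same transform):

#include <algorithm>
#include <vector>

void ScaleFeatureColumn(std::vector<double>& column)
{
    std::pair<std::vector<double>::iterator, std::vector<double>::iterator>
        bounds = std::minmax_element(column.begin(), column.end());
    double lo = *bounds.first;
    double hi = *bounds.second;
    if (hi == lo)
        return; // constant feature; leave it alone
    for (double& v : column)
        v = -1.0 + 2.0 * (v - lo) / (hi - lo);
}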
Some other things:

- Could it be that in Matlab you're somehow leaking the class (y) into the preprocessing? No one does this on purpose, but I've seen it happen. If you make almost any f(y) a feature, you'll get almost 100% every time.
- Sometimes it helps to verify that everything is numerically identical by printing to file before training, both in C++ and Matlab.
I'm very happy with libsvm using the RBF kernel. carlosdc pointed out the most common errors in the correct order :-). For libsvm: did you use the Python tools shipped with libsvm? If not, I recommend doing so. Write your feature vectors to a file (from Matlab and/or C++) and do a meta-training run for the RBF kernel with easy.py. You get the parameters and a prediction for the generated model. If this prediction is OK, continue with C++. From training you also get a scaled feature file (min/max transformed to -1.0/1.0 for every feature). Compare these to your C++ implementation as well.
Some libsvm issues: a nasty habit is (if I remember correctly) that values scaling to 0 (zero) are omitted in the scaled file. In grid.py there is a parameter "nr_local_worker" which defines the number of threads; you might wish to increase it.
In my program I am tracking device movement with CMMotionManager and using the quaternion representation of device attitude. For positioning the device in a global coordinate system, I need to do some basic calculations involving quaternions. For that I want to write a Quaternion class with all the needed functions implemented. Does somebody know the proper way to do this, or have some general guidelines on how it should be done?
I've found a sample project from Apple which has Quaternion, Vector2, and Vector3 classes written in C++, but I think they're not very easy to use in Cocoa, since I can't define properties of an object in an Obj-C header file using these C++ classes. Or am I wrong?
Thank you.
You're quite possibly not interested in OpenGL at all, but as part of GLKit, Apple supplies GLKQuaternion. It has a C interface but should be easy to wrap in a class. Using it is recommended over expressing the mathematics directly, since it likely uses the machine's vector unit as fully as possible.
Alternatively, you can use GLKQuaternion directly from both C++ and Objective-C, since both are supersets of C.
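A quick sketch of what that looks like (the two rotations here are arbitrary examples); since GLKit's quaternion API is plain C, this compiles in both C++ and Objective-C translation units:

#include <GLKit/GLKMath.h>
#include <math.h>

// Build two rotations and compose them (tilt first, then spin).
GLKQuaternion tilt = GLKQuaternionMakeWithAngleAndAxis(M_PI_4, 1.0f, 0.0f, 0.0f);
GLKQuaternion spin = GLKQuaternionMakeWithAngleAndAxis(M_PI_2, 0.0f, 1.0f, 0.0f);
GLKQuaternion combined = GLKQuaternionMultiply(spin, tilt);

// Rotate the -Z "forward" vector by the combined attitude.
GLKVector3 forward = GLKQuaternionRotateVector3(combined, GLKVector3Make(0.0f, 0.0f, -1.0f));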
You're asking multiple questions here. How to implement a Quaternion class, how to use them to represent orientation (attitude), how to integrate a C++ implementation with Objective C, how to integrate all that into Cocoa. I'll answer the last with a question: Does Cocoa really need to know what you are using under the hood to represent attitude?
There are lots of existing packages out there that use quaternions to represent orientation in three-dimensional space. Eigen is one; there are lots of others. Don't reinvent the wheel. There are some gotchas that you do need to beware of, particularly when using a third-party package. See View Matrix from Quaternion. If you scroll down and look at some of the other answers there, you'll see that two of the packages mentioned by others are subject to the very issues I talked about in my answer.
Note well: I am not saying don't use quaternions. That would be rather hypocritical; I use them all the time. They work very nicely as a compact means of representing rotation. You just need to beware of issues, mainly because too many people who use them / implement software for them are clueless regarding those issues.
I'm doing some linear algebra math and was looking for a really lightweight, simple-to-use matrix class that could handle different dimensions: 2x2, 2x1, 3x1, and 1x2, basically.
I presume such a class could be implemented with templates, using specialization in some cases for performance.
Does anybody know of a simple implementation available for use? I don't want "bloated" implementations, as I'll be running this in an embedded environment where memory is constrained.
Thanks
You could try Blitz++ -- or Boost's uBLAS
I've recently looked at a variety of C++ matrix libraries, and my vote goes to Armadillo.
The library is heavily templated and header-only.
Armadillo also leverages templates to implement a delayed evaluation framework (resolved at compile time) to minimize temporaries in the generated code (resulting in reduced memory usage and increased performance).
However, these advanced features are only a burden to the compiler and not your implementation running in the embedded environment, because most Armadillo code 'evaporates' during compilation due to its design approach based on templates.
And despite all that, one of its main design goals has been ease of use - the API is deliberately similar in style to Matlab syntax (see the comparison table on the site).
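For a flavor of that Matlab-like style, here's a tiny sketch using Armadillo's fixed-size types, which avoid heap allocation (the values are arbitrary):

#include <armadillo>

int main()
{
    arma::mat::fixed<2, 2> A; // fixed-size: storage lives inside the object
    A(0, 0) = 1.0; A(0, 1) = 2.0;
    A(1, 0) = 3.0; A(1, 1) = 4.0;

    arma::vec::fixed<2> x;
    x(0) = 5.0; x(1) = 6.0;

    arma::vec::fixed<2> y = A * x; // delayed-evaluation expression
    y.print("y:");
    return 0;
}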
Additionally, although Armadillo can work standalone, you might want to consider using it with LAPACK (and BLAS) implementations available to improve performance. A good option would be for instance OpenBLAS (or ATLAS). Check Armadillo's FAQ, it covers some important topics.
A quick search on Google dug up this presentation showing that Armadillo has already been used in embedded systems.
std::valarray is pretty lightweight.
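For what it's worth, valarray gives you element-wise arithmetic out of the box but no matrix semantics; you'd impose those yourself (the row-major 2x2 layout here is just a choice):

#include <iostream>
#include <valarray>

int main()
{
    std::valarray<double> a = {1.0, 2.0, 3.0, 4.0}; // 2x2, row-major
    std::valarray<double> b = a * 2.0;              // element-wise scaling
    std::cout << b[3] << '\n';                      // prints 8
    return 0;
}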
I use the Newmat libraries for matrix computations. It's open source and easy to use, although I'm not sure it fits your definition of lightweight (it includes over 50 source files, which Visual Studio compiles into a 1.8 MB static library).
CML matrix is pretty good, but may not be lightweight enough for an embedded environment. Check it out anyway: http://cmldev.net/?p=418
Another option, although it may be too late, is:
https://launchpad.net/lwmatrix
I for one wasn't able to find a simple enough library, so I wrote one myself: http://koti.welho.com/aarpikar/lib/
I think it should be able to handle different matrix dimensions (2x2, 3x3, 3x1, etc.) by simply setting some rows or columns to zero. It won't be the fastest approach, since internally all operations are done on 4x4 matrices. Although in theory there might exist processors that can handle a 4x4 operation in one tick. At least I would much rather believe in the existence of such processors than go optimizing those low-level matrix calculations. :)
How about just storing the matrix in an array, like
2x3 matrix = {2,3,val1,val2,...,val6}
This is really simple, and addition operations are trivial. However, you need to write your own multiplication function; a sketch follows.
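For completeness, here's one way such a multiplication function could look, given the layout above (the function and type names are mine, purely for illustration):

#include <cassert>
#include <vector>

typedef std::vector<float> Matrix; // {rows, cols, val1, val2, ...}

Matrix Multiply(const Matrix& a, const Matrix& b)
{
    int ar = (int)a[0], ac = (int)a[1];
    int bc = (int)b[1];
    assert(ac == (int)b[0]); // inner dimensions must agree

    Matrix c(2 + ar * bc, 0.0f);
    c[0] = (float)ar;
    c[1] = (float)bc;

    // Standard row-major triple loop over the {rows, cols, values...} layout.
    for (int i = 0; i < ar; ++i)
        for (int j = 0; j < bc; ++j)
            for (int k = 0; k < ac; ++k)
                c[2 + i * bc + j] += a[2 + i * ac + k] * b[2 + k * bc + j];

    return c;
}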