the representation of the Vertex index of SMPL - computer-vision

Please tell me about the 6890 vertices output by SMPL: do different index ranges carry different meanings, such as 0~1000 for the arm and 1001~2000 for the leg?
I would like to explore the relationships between these points for scientific research.

Related

How can I get a random uint or the last digit of float in HLSL/GLSL?

I just need a random uint, better ranging from 0-6, but there is no enumeration type in openGL. I learned that I can get a random float ranging 0-1 from the code below:
frac(sin(dot(uv, float2(12.9898, 78.233))) * 43758.5453123)
I tried to do 1/above and take floor(), but it doesn't work. How can I get a random int? Or is there a way to get the last digit of the float (so presumably still random)?
First, let's define what we mean by "random". In the context of this answer, a "random" variable is a variable whose values are unpredictable. That is, there is no function that determines/computes an outcome for the random variable when being evaluated (with any possible inputs). Or at least, no such function has been found (yet).
Obviously, when we are talking about computing here, there is no such thing as a true random variable as described above, because anything we do in computing (and by extension in a shader) is necessarily bound to the set of functions that are computable.
Your proposed function in the question:
f(uv) = frac(sin(dot(uv, float2(12.9898, 78.233))) * 43758.5453123)
is just a computable function. It takes as input a vector uv, which itself is a deterministic/computable value - such as derived from a built-in or custom varying variable giving you the "coordinates" of the current fragment.
The function's result is likewise a computable/deterministic value (the one that the input vector uv maps to). Differences in IEEE 754 rules and precision aside (which may vary between GPUs, such as desktop and mobile ones), the function itself is purely deterministic/computable and therefore does not give you a random value.
We humans may think that the output is random, because we lack the intuition for the functions used to compute the result, such that when we "see" a number 0.623513632 followed by another number 0.9734126 for only slight variations in the input vector, we could draw the conclusion that "yeah, that looks pretty random", when in fact it obviously isn't. It is just what that function computed, given two input values.
So, when you already have a deterministic function like the above and wanted to obtain values in the closed range [0, 6] from it as a GLSL uint, you can simply scale the output of said function by multiplying the function's result with 7.0 and truncating the result:
g(uv) = uint(f(uv) * 7.0)
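To make the scaling step concrete, here is a minimal CPU-side sketch in C++ that mirrors f(uv) and g(uv). The names hash01 and hash0to6 are hypothetical, and the final clamp is an addition of this sketch (not part of the formula above) that guards against the fractional part rounding to exactly 1.0 in floating point.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// CPU-side stand-in for the shader expression
// frac(sin(dot(uv, float2(12.9898, 78.233))) * 43758.5453123).
float hash01(float u, float v) {
    float d = u * 12.9898f + v * 78.233f;   // dot(uv, float2(12.9898, 78.233))
    float s = std::sin(d) * 43758.5453123f;
    return s - std::floor(s);               // frac(): nominally in [0, 1)
}

// Scale to [0, 7) and truncate. The clamp (an assumption of this sketch)
// covers the rare case where rounding yields a fractional part of 1.0.
std::uint32_t hash0to6(float u, float v) {
    float scaled = hash01(u, v) * 7.0f;
    return std::min<std::uint32_t>(static_cast<std::uint32_t>(scaled), 6u);
}
```

Calling hash0to6 twice with the same coordinates returns the same value, which is exactly the determinism discussed above.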
If you wanted to obtain true random numbers drawn from a random variable (whose deterministic function simply hasn't been found yet), you can obtain such values from a physical entropy source such as atmospheric noise (for example via random.org) and use them as an input to your shader (such as via textures or buffer objects).
But, from a computational perspective, a shader is just a function taking in values (ints, floats, ...) and computing (by means of computable functions) a deterministic result.
All we can do is to shuffle/scramble/diffuse the input bits in such a way, that the result "looks" like random to us. We then call these "pseudo-random" values.
Taking this a step further, we could now ask the question of the distribution quality of the obtained pseudo-random values. This has two qualities:
how evenly distributed are the pseudo-random values in their domain/interval? I.e. do all possible values have the same probability of occurring? Or: Do you even want to have uniformly-distributed values or should the values follow another distribution (like Gaussian?)
how well are two values drawn from two sequential input values spaced apart? I.e. what is the frequency of the pseudo-random values?
There are different (deterministic) algorithms/functions depending on which distribution and which frequency spectrum your values should have. But first, you should define an answer to the two questions for your use-case.
And by the way, the commonly used function in your question to obtain pseudo-random numbers in a shader has a terrible distribution quality.
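As a rough illustration of the "shuffle/scramble/diffuse the input bits" idea, integer mixing functions are a common alternative to the sin-based one-liner. The sketch below is written in C++ so it can be run on the CPU; the function names and the specific constants are assumptions chosen for illustration, not a vetted recommendation, and the same operations would need to be ported to GLSL/HLSL uint arithmetic.

```cpp
#include <cstdint>

// Illustrative 32-bit mixing function built from xor-shift and multiply
// steps. The constants are assumptions of this sketch.
std::uint32_t mixBits(std::uint32_t x) {
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

// Chain the mix over integer pixel coordinates, then reduce to 0..6.
// The modulo introduces a tiny bias because 2^32 is not a multiple of 7.
std::uint32_t random0to6(std::uint32_t px, std::uint32_t py) {
    return mixBits(px ^ mixBits(py + 0x9e3779b9U)) % 7u;
}
```

Whether the resulting distribution and frequency content are good enough still has to be checked against the two questions above for the use-case at hand.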
Last but not least, it should also be mentioned that true randomness (i.e. non-determinism), like when you do use an entropy source as input values, is oftentimes an undesirable property in computation, because it:
makes it difficult to repeat the same computation / output when needed, which is useful in various algorithms in the context of path tracing
makes it difficult to reproduce/debug/inspect your function for a particular run when every following execution/run will yield a different output

If two languages follow IEEE 754, will calculations in both languages result in the same answers?

I'm in the process of converting a program from Scilab code to C++. One loop in particular is producing a slightly different result than the original Scilab code (it's a long piece of code so I'm not going to include it in the question but I'll try my best to summarise the issue below).
The problem is, each step of the loop uses calculations from the previous step. Additionally, the difference between calculations only becomes apparent around the 100,000th iteration (out of approximately 300,000).
Note: I'm comparing the output of my C++ program with the outputs of Scilab 5.5.2 using the "format(25);" command. Meaning I'm comparing 25 significant digits. I'd also like to point out I understand how precision cannot be guaranteed after a certain number of bits but read the sections below before commenting. So far, all calculations have been identical up to 25 digits between the two languages.
In attempts to get to the bottom of this issue, so far I've tried:
Examining the data type being used:
I've managed to confirm that Scilab is using IEEE 754 doubles (according to the language documentation). Also, according to Wikipedia, C++ isn't required to use IEEE 754 for doubles, but from what I can tell, everywhere I use a double in C++ it has perfectly matched Scilab's results.
Examining the use of transcendental functions:
I've also read from What Every Computer Scientist Should Know About Floating-Point Arithmetic that IEEE does not require transcendental functions to be exactly rounded. With that in mind, I've compared the results of these functions (sin(), cos(), exp()) in both languages and again, the results appear to be the same (up to 25 digits).
The use of other functions and predefined values:
I repeated the above steps for the use of sqrt() and pow(). As well as the value of Pi (I'm using M_PI in C++ and %pi in Scilab). Again, the results were the same.
Lastly, I've rewritten the loop (very carefully) in order to ensure that the code is identical between the two languages.
Note: Interestingly, I noticed that for all the above calculations the results between the two languages match farther than the actual result of the calculations (outside of floating point arithmetic). For example:
Value of sin(x) using Wolfram Alpha = 0.123456789.....
Value of sin(x) using Scilab & C++ = 0.12345yyyyy.....
Even once the values computed by Scilab and C++ started to differ from the actual result (from Wolfram), the two languages' results still matched each other. This leads me to believe that most of the values are being calculated in the same way in both languages, even though IEEE 754 doesn't require that.
My original thinking was that one of the first three points above is implemented differently between the two languages. But from what I can tell everything seems to produce identical results.
Is it possible that even though all the inputs to these loops are identical, the results can be different? Possibly because a very small error (past what I can see with 25 digits) is occurring that accumulates over time? If so, how can I go about fixing this issue?
No, the format of the numbering system does not guarantee equivalent answers from functions in different languages.
Functions, such as sin(x), can be implemented in different ways, using the same language (as well as different languages). The sin(x) function is an excellent example. Many implementations will use a look-up table or look-up table with interpolation. This has speed advantages. However, some implementations may use a Taylor Series to evaluate the function. Some implementations may use polynomials to come up with a close approximation.
Having the same numeric format is one hurdle to solve between languages. Function implementation is another.
Remember, you need to consider the platform as well. A program that runs on an 80-bit floating point processor can produce different results than a program that uses a 64-bit floating point software implementation.
Some architectures provide the capability of using extended precision floating point registers (e.g. 80 bits internally, versus 64-bit values in RAM). So, it's possible to get slightly different results for the same calculation, depending on how the computations are structured, and the optimization level used to compile the code.
Yes, it's possible to have different results. It's possible even if you are using exactly the same source code in the same programming language for the same platform. Sometimes it's enough to have a different compiler switch; for example -ffast-math would lead the compiler to optimize your code for speed rather than accuracy, and, if your computational problem is not well-conditioned to begin with, the result may be significantly different.
For example, suppose you have this code:
x_8th = x*x*x*x*x*x*x*x;
One way to compute this is to perform 7 multiplications. This would be the default behavior for most compilers. However, you may want to speed this up by specifying compiler option -ffast-math and the resulting code would have only 3 multiplications:
temp1 = x*x; temp2 = temp1*temp1; x_8th = temp2*temp2;
The result would be slightly different because finite precision arithmetic is not associative, but sufficiently close for most applications and much faster. However, if your computation is not well-conditioned that small error can quickly get amplified into a large one.
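The non-associativity is easy to see with addition. The following C++ sketch uses two hypothetical helper functions that differ only in where the parentheses go; it assumes the code is compiled without options such as -ffast-math that allow the compiler to reassociate.

```cpp
// Evaluates (a + b) + c.
double sumLeftFirst(double a, double b, double c) {
    return (a + b) + c;
}

// Evaluates a + (b + c).
double sumRightFirst(double a, double b, double c) {
    return a + (b + c);
}
```

With a = 1e100, b = -1e100 and c = 1.0, the first grouping cancels the large terms and returns 1.0, while in the second grouping c is absorbed into b by rounding and the result is 0.0. In a long loop where each step feeds the next, much smaller discrepancies of the same kind can accumulate.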
Note that it is possible that the Scilab and C++ are not using the exact same instruction sequence, or that one uses FPU and the other uses SSE, so there may not be a way to get them to be exactly the same.
As commented by IInspectable, if your compiler has _control87() or something similar, you can use it to change the precision and/or rounding settings. You could try combinations of this to see if it has any effect, but again, even if you manage to get the settings identical for Scilab and C++, differences in the actual instruction sequences may be the issue.
http://msdn.microsoft.com/en-us/library/e9b52ceh.aspx
If SSE is used, I'm not sure what can be adjusted as I don't think SSE has an 80 bit precision mode.
In the case of using FPU in 32 bit mode, and if your compiler doesn't have something like _control87, you could use assembly code. If inline assembly is not allowed, you would need to call an assembly function. This example is from an old test program:
static short fcw; /* 16 bit floating point control word */
/* ... */
/* set precision control to extended precision */
__asm{
fnstcw fcw
or fcw,0300h
fldcw fcw
}

Is there any way to make sure the output of the float-point the same in different OS?

Here is my code:
int a = 0x451998a0;
float b = *((float *)&a);
printf("coverto float: %f, %.10lf\n", b, b);
In windows the output is:
coverto float: 2457.539063, 2457.5390625000
In linux the output is:
coverto float: 2457.539062, 2457.5390625000
Is there any way to make sure the output is the same?
The behavior you're seeing is just a consequence of the fact that Windows' printf() function is implemented differently from Linux's printf() function. Most likely the difference is in how printf() implements number rounding.
How printf() works under the hood in either system is an implementation detail; thus the system is not likely to provide such fine-grained control on how printf() displays the floating point values.
There are two ways that may work to keep them the same:
Use more precision during calculation than while displaying it. For example, some scientific and graphing calculators use double precision for all internal calculations, but display the results with only float precision.
Use a cross-platform printf() library. Such libraries would most likely have the same behavior on all platforms, as the calculations required to determine what digits to display are usually platform-agnostic.
However, this really isn't as big of a problem as you think it is. The difference between the outputs is 0.000001. That is a ~0.00000004% difference from either of the two values. The display error is really quite negligible.
Consider this: the distance between Los Angeles and New York is 2464 miles, which is of the same order of magnitude as the numbers in your display outputs. A difference of 0.000001 miles is 1.61 millimeters. We of course don't measure distances between cities with anywhere near that kind of precision. :-)
If you use the same printf() implementation, there's a good chance they'll show the same output. Depending on what you're up to, it may be easier to use GNU GCC on both OSes, or to get printf() source code and add it to your project (you should have no trouble googling one).
BTW - have you actually checked what that hex number encodes? Should it round up or down? The 625 thing is likely itself rounded, so you shouldn't assume it should round to 63....
The obvious answer is to use less precision in your output. In general,
if there's any calculation involved, you can't even be sure that the
actual floating point values are identical. And how printf and
ostream round is implementation defined, even if the floating point
values are equal.
In general, C++ doesn't guarantee that two implementations produce the
same results. In this particular case, if it's important, you can do
the rounding by hand, before doing the conversion, but you'll still have
occasional problems because the actual floating point values will be
different. This may, in fact, occur even with different levels of
optimization with the same compiler. So anything you try (other than
writing the entire program in assembler) is bound to be a losing battle
in the end.
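As a minimal sketch of "doing the rounding by hand" for the value in the question: the bit pattern 0x451998a0 decodes to exactly 2457.5390625, so at six decimals it sits on a tie and the two printf() implementations simply break the tie differently. The C++ helpers below (floatFromBits and formatSixDecimals are hypothetical names) pick the tie-breaking rule explicitly and only use integer formatting. The sketch assumes a non-negative input that is small enough for value * 1e6 to fit in a long long, and the multiplication can itself round for other inputs.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Reinterpret a 32-bit pattern as a float without casting pointers.
float floatFromBits(std::uint32_t bits) {
    float f;
    static_assert(sizeof f == sizeof bits, "float is assumed to be 32 bits wide");
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Format a non-negative value with six decimals. std::llround rounds
// halfway cases away from zero, so the printed digits no longer depend
// on how the platform's printf() breaks ties.
std::string formatSixDecimals(double value) {
    long long scaled = std::llround(value * 1e6);
    char buf[64];
    std::snprintf(buf, sizeof buf, "%lld.%06lld",
                  scaled / 1000000, scaled % 1000000);
    return buf;
}
```

For the value in the question this yields "2457.539063" regardless of the C library, because only integer formatting is left to snprintf.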

Better compression algorithm for vector data?

I need to compress some spatially correlated data records. Currently I am getting 1.2x-1.5x compression with zlib, but I figure it should be possible to get more like 2x. The data records have various fields, but for example, zlib seems to have trouble compressing lists of points.
The points represent a road network. They are pairs of fixed-point 4-byte integers of the form XXXXYYYY. Typically, if a single data block has 100 points, there will only be a few combinations of the top two bytes of X and Y (spatial correlation). But the bottom bytes are always changing and must look like random data to zlib.
Similarly, the records have 4-byte IDs which tend to have constant high bytes and variable low bytes.
Is there another algorithm that would be able to compress this kind of data better? I'm using C++.
Edit: Please no more suggestions to change the data itself. My question is about automatic compression algorithms. If somebody has a link to an overview of all popular compression algorithms I'll just accept that as answer.
You'll likely get much better results if you try to compress the data yourself based on your knowledge of its structure.
General-purpose compression algorithms just treat your data as a bitstream. They look for commonly-used sequences of bits, and replace them with shorter dictionary indices.
But the duplicate data doesn't go away. The duplicated sequence gets shorter, but it's still duplicated just as often as it was before.
As I understand it, you have a large number of data points of the form
XXxxYYyy, where the upper-case letters are very uniform. So factor them out.
Rewrite the list as something similar to this:
XXYY // a header describing the common first and third byte for all the subsequent entries
xxyy // the remaining bytes, which vary
xxyy
xxyy
xxyy
...
XXYY // next unique combination of 1st and 3rd byte
xxyy
xxyy
...
Now, each combination of the rarely varying bytes is listed only once, rather than duplicated for every entry they occur in. That adds up to a significant space saving.
Basically, try to remove duplicate data yourself, before running it through zlib. You can do a better job of it because you have additional knowledge about the data.
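The regrouping described above could be sketched in C++ roughly as follows. The Point struct and the function names are hypothetical, the split into 16-bit high and low halves follows the "top two bytes" description in the question, and the result would still need to be serialized before being handed to zlib.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct Point { std::uint32_t x, y; };

using HighKey = std::pair<std::uint16_t, std::uint16_t>;  // (XX, YY)
using LowPart = std::pair<std::uint16_t, std::uint16_t>;  // (xx, yy)
using Grouped = std::map<HighKey, std::vector<LowPart>>;

// Store each combination of high halves once, followed by the low halves
// of every point that shares it.
Grouped groupByHighBytes(const std::vector<Point>& points) {
    Grouped groups;
    for (const Point& p : points) {
        HighKey key{static_cast<std::uint16_t>(p.x >> 16),
                    static_cast<std::uint16_t>(p.y >> 16)};
        groups[key].emplace_back(static_cast<std::uint16_t>(p.x & 0xFFFFu),
                                 static_cast<std::uint16_t>(p.y & 0xFFFFu));
    }
    return groups;
}

// Rebuild full points; order is by group, then by insertion within a group.
std::vector<Point> ungroup(const Grouped& groups) {
    std::vector<Point> points;
    for (const auto& entry : groups) {
        for (const LowPart& low : entry.second) {
            points.push_back(Point{
                (static_cast<std::uint32_t>(entry.first.first) << 16) | low.first,
                (static_cast<std::uint32_t>(entry.first.second) << 16) | low.second});
        }
    }
    return points;
}
```

Note that this sketch does not preserve the original point order across groups; if order matters (as it usually does for a road polyline), the groups would have to be emitted as consecutive runs instead of being collected in a map.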
Another approach might be, instead of storing these coordinates as absolute numbers, write them as deltas, relative deviations from some location chosen to be as close as possible to all the entries. Your deltas will be smaller numbers, which can be stored using fewer bits.
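A minimal C++ sketch of the delta idea, assuming the per-axis minimum of the block is used as the reference location (one possible choice among several; the struct and function names are hypothetical):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Point { std::uint32_t x, y; };

struct DeltaBlock {
    Point origin;                // reference location shared by the block
    std::vector<Point> offsets;  // each point minus the origin, per axis
};

// Use the per-axis minimum as origin so every offset is non-negative.
DeltaBlock toDeltas(const std::vector<Point>& points) {
    DeltaBlock block{{0, 0}, {}};
    if (points.empty()) return block;
    block.origin = points.front();
    for (const Point& p : points) {
        block.origin.x = std::min(block.origin.x, p.x);
        block.origin.y = std::min(block.origin.y, p.y);
    }
    for (const Point& p : points) {
        block.offsets.push_back(Point{p.x - block.origin.x, p.y - block.origin.y});
    }
    return block;
}

// Inverse transform: add the origin back to every offset.
std::vector<Point> fromDeltas(const DeltaBlock& block) {
    std::vector<Point> points;
    for (const Point& d : block.offsets) {
        points.push_back(Point{block.origin.x + d.x, block.origin.y + d.y});
    }
    return points;
}
```

The offsets are still held in 32-bit fields here; the saving only materializes once they are written out with fewer bits, or once a general-purpose compressor sees the long runs of zero high bytes.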
Not specific to your data, but I would recommend checking out 7zip instead of zlib if you can. I've seen ridiculously good compression ratios using this.
http://www.7-zip.org/
Without seeing the data and its exact distribution, I can't say for certain what the best method is, but I would suggest that you start each group of 1-4 records with a byte whose 8 bits indicate the following:
0-1 Number of bytes of ID that should be borrowed from previous record
2-4 Format of position record
6-7 Number of succeeding records that use the same 'mode' byte
Each position record may be stored one of eight ways; all types other than 000 use signed displacements. The number after the bit code is the size of the position record.
000 - 8 - Two full four-byte positions
001 - 3 - Twelve bits for X and Y
010 - 2 - Ten-bit X and six-bit Y
011 - 2 - Six-bit X and ten-bit Y
100 - 4 - Two sixteen-bit signed displacements
101 - 3 - Sixteen-bit X and 8-bit Y signed displacement
110 - 3 - Eight-bit signed displacement for X; 16-bit for Y
111 - 2 - Two eight-bit signed displacements
A mode byte of zero will store all the information applicable to a point without reference to any previous point, using a total of 13 bytes to store 12 bytes of useful information. Other mode bytes will allow records to be compacted based upon similarity to previous records. If four consecutive records differ only in the last bit of the ID, and either have both X and Y within +/- 127 of the previous record, or have X within +/- 31 and Y within +/- 511, or X within +/- 511 and Y within +/- 31, then all four records may be stored in 13 bytes (an average of 3.25 bytes each, a 73% reduction in space).
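To illustrate how a single displacement could be matched against the eight position formats listed above, here is a small C++ sketch. The names are hypothetical, the symmetric ranges (for example +/- 127 for eight bits) follow the figures quoted in the text, and the preference order among formats of equal size is an arbitrary assumption of this sketch.

```cpp
// Format code (the three-bit value from the table) and record size in bytes.
struct PositionFormat {
    unsigned code;
    unsigned bytes;
};

// True if d fits a signed field of the given width, using the symmetric
// range +/- (2^(bits-1) - 1) quoted in the text.
bool fitsSigned(long long d, unsigned bits) {
    long long limit = (1LL << (bits - 1)) - 1;
    return d >= -limit && d <= limit;
}

// Pick the smallest format that can hold the displacement (dx, dy).
PositionFormat choosePositionFormat(long long dx, long long dy) {
    if (fitsSigned(dx, 8) && fitsSigned(dy, 8)) return {7, 2};    // 111
    if (fitsSigned(dx, 10) && fitsSigned(dy, 6)) return {2, 2};   // 010
    if (fitsSigned(dx, 6) && fitsSigned(dy, 10)) return {3, 2};   // 011
    if (fitsSigned(dx, 12) && fitsSigned(dy, 12)) return {1, 3};  // 001
    if (fitsSigned(dx, 16) && fitsSigned(dy, 8)) return {5, 3};   // 101
    if (fitsSigned(dx, 8) && fitsSigned(dy, 16)) return {6, 3};   // 110
    if (fitsSigned(dx, 16) && fitsSigned(dy, 16)) return {4, 4};  // 100
    return {0, 8};                                                // 000: full positions
}
```

This chooses a format per record in isolation; a real encoder would also have to take the other records sharing the same mode byte into account.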
A "greedy" algorithm may be used for compression: examine a record to see what size ID and XY it will have to use in the output, and then grab up to three more records until one is found that either can't "fit" with the previous records using the chosen sizes, or could be written smaller (note that if e.g. the first record has X and Y displacements both equal to 12, the XY would be written with two bytes, but until one reads following records one wouldn't know which of the three two-byte formats to use).
Before setting your format in stone, I'd suggest running your data through it. It may be that a small adjustment (e.g. using 7+9 or 5+11 bit formats instead of 6+10) would allow many data to pack better. The only real way to know, though, is to see what happens with your real data.
It looks like the Burrows–Wheeler transform might be useful for this problem. It has a peculiar tendency to put runs of repeating bytes together, which might make zlib compress better. This article suggests I should combine other algorithms than zlib with BWT, though.
Intuitively it sounds expensive, but a look at some source code shows that reverse BWT is O(N) with 3 passes over the data and a moderate space overhead, likely making it fast enough on my target platform (WinCE). The forward transform is roughly O(N log N) or slightly over, assuming an ordinary sort algorithm.
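For reference, the forward transform can be sketched naively in C++ by sorting all rotations of the input and taking the last column. This is only an illustration of what the transform does (the function name is hypothetical): it is far slower than the suffix-sorting approaches used by real implementations, and it assumes the input ends with a unique sentinel character that compares smaller than every other byte.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Naive forward Burrows-Wheeler transform: sort the rotations of `text`
// and return the last character of each rotation in sorted order.
std::string bwtForward(const std::string& text) {
    const std::size_t n = text.size();
    std::vector<std::size_t> rot(n);
    std::iota(rot.begin(), rot.end(), std::size_t{0});  // rotation start offsets
    std::sort(rot.begin(), rot.end(), [&](std::size_t a, std::size_t b) {
        for (std::size_t k = 0; k < n; ++k) {
            unsigned char ca = static_cast<unsigned char>(text[(a + k) % n]);
            unsigned char cb = static_cast<unsigned char>(text[(b + k) % n]);
            if (ca != cb) return ca < cb;
        }
        return false;  // identical rotations
    });
    std::string out(n, '\0');
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = text[(rot[i] + n - 1) % n];  // character preceding the rotation start
    }
    return out;
}
```

For the input "banana$" this produces "annb$aa", which shows the tendency to group equal characters together.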
Sort the points by some kind of proximity measure such that the average distance between adjacent points is small. Then store the difference between adjacent points.
You might do even better if you manage to sort the points so that most differences are positive in both the x and y axes, but I can't say for sure.
As an alternative to zlib, a family of compression techniques that works well when the probability distribution is skewed towards small numbers is universal codes. They would have to be tweaked for signed numbers (encode (abs(x) << 1) + (x < 0 ? 1 : 0)).
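The signed-number tweak can be sketched in C++ as a pair of helpers that fold the sign into the lowest bit, so that small magnitudes of either sign become small unsigned values suitable for a universal code. The names foldSign and unfoldSign are hypothetical; a 64-bit intermediate is used so that the most negative 32-bit value does not overflow.

```cpp
#include <cstdint>

// Map a signed value to an unsigned one: magnitude shifted left by one,
// with the sign stored in bit 0 (0 -> 0, 1 -> 2, -1 -> 3, 2 -> 4, ...).
std::uint64_t foldSign(std::int32_t x) {
    std::int64_t wide = x;
    std::uint64_t magnitude = static_cast<std::uint64_t>(wide < 0 ? -wide : wide);
    return (magnitude << 1) | (x < 0 ? 1u : 0u);
}

// Inverse of foldSign.
std::int32_t unfoldSign(std::uint64_t folded) {
    std::int64_t magnitude = static_cast<std::int64_t>(folded >> 1);
    return static_cast<std::int32_t>((folded & 1u) ? -magnitude : magnitude);
}
```

The folded values would then be fed to whichever universal code is chosen; that encoder is not shown here.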
You might want to write two lists to the compressed file: a NodeList and a LinkList. Each node would have an ID, x, y. Each link would have a FromNode and a ToNode, along with a list of intermediate xy values. You might be able to have a header record with a false origin and have node xy values relative to that.
This would provide the most benefit if your streets follow an urban grid network, by eliminating duplicate coordinates at intersections.
If the compression is not required to be lossless, you could use truncated deltas for intermediate coordinates. While someone above mentioned deltas, keep in mind that a loss in connectivity would likely cause more problems than a loss in shape, which is what would happen if you use truncated deltas to represent the last coordinate of a road (which is often an intersection).
Again, if your roads aren't on an urban grid, this probably wouldn't buy you much.

Why would I use 2's complement to compare two doubles instead of comparing their differences against an epsilon value?

Referenced here and here...Why would I use two's complement over an epsilon method? It seems like the epsilon method would be good enough for most cases.
Update: I'm purely looking for a theoretical reason why you'd use one over the other. I've always used the epsilon method.
Has anyone used the 2's complement comparison successfully? Why? Why Not?
the second link you reference mentions an article that has quite a long description of the issue:
http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
but unless you are tweaking performance I would stick with epsilon so people can debug your code
The bits method might be faster. I say might because on modern (multicore, highly pipelined) processors it is often impossible to guess what is really faster.
Code the simplest, most obviously correct implementation, then measure, then optimise.
In short, when comparing two floats with unknown origins, picking an epsilon that is valid is almost impossible.
For example:
What is a good epsilon when comparing distance in miles between Atlanta GA, Dallas TX and some place in Ohio?
What is a good epsilon when comparing distance in miles between my left foot, my right foot and the computer under my desk?
EDIT:
Ok, I'm getting a fair number of people not understanding why you wouldn't know what your epsilon is.
Back in the old days of lore, I wrote two programs that worked with NeverWinter Nights (a game made by BioWare). One of the programs took a binary model and converted it to ASCII. The other program took an ASCII model and compiled it into binary. One of the tests I wrote was to take all of BioWare's binary models, decompile them to ASCII and then back to binary. Then I compared my binary version with the original one from BioWare. One of the problems during the comparison was dealing with some of the slight variances in floating point values. So instead of coming up with a bunch of different EPSILONS for each type of floating point number (vertex, normal, etc), I wanted to use something such as this two's complement compare, thus avoiding the whole multiple EPSILON issue.
The same type of issue can apply to any type of software that processes 3rd party data and then needs to validate their results with the original. In these cases you might not even know what the floating point values represent, you just have to compare them. We ran into this issue with our industrial automation software.
EDIT:
LOL, this has been voted up and down by different people.
I'll boil the problem down to this, given two arbitrary floating point numbers, how do you decide what epsilon to use? You can't.
How can you compare 1e23 and 1.0001e23 with an epsilon and still compare 1e-23 and 5.2e-23 using the same epsilon? Sure, you can do some dynamic epsilon tricks, but that is the whole point to the integer compare (which does NOT require the integers be exact).
The integer compare is able to compare two floats using an epsilon relative to the magnitude of the numbers.
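One way to sketch such an integer compare in C++ is shown below. It is not the exact code from the linked article: instead of the two's complement adjustment it converts the sign-magnitude bit pattern to a signed 64-bit integer, which gives the same ordering. The function names are hypothetical, float is assumed to be a 32-bit IEEE 754 type, and the tolerance is expressed in units in the last place (ULPs).

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Map a float's bit pattern to an integer whose ordering matches the
// ordering of the floats. +0.0f and -0.0f both map to 0.
std::int64_t orderedBits(float f) {
    std::uint32_t u;
    static_assert(sizeof u == sizeof f, "float is assumed to be 32 bits wide");
    std::memcpy(&u, &f, sizeof u);
    std::int64_t magnitude = u & 0x7FFFFFFFu;
    return (u & 0x80000000u) ? -magnitude : magnitude;
}

// True if a and b are at most maxUlps representable floats apart.
// NaNs never compare as almost equal.
bool almostEqualUlps(float a, float b, std::int64_t maxUlps) {
    if (std::isnan(a) || std::isnan(b)) return false;
    std::int64_t diff = orderedBits(a) - orderedBits(b);
    if (diff < 0) diff = -diff;
    return diff <= maxUlps;
}
```

With a tolerance of a few ULPs, 1e23 and 1.0001e23 compare as different, and so do 1e-23 and 5.2e-23, without any magnitude-specific epsilon having been chosen. One caveat of the sketch: values on either side of zero are only a handful of ULPs apart when both are tiny denormals, so comparisons against zero may still deserve separate handling.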
EDIT
Steve, let's look at what you said in the comments:
"But you know what equality means to you... Hence, you should be able to find an appropriate epsilon".
Turn this statement around to say:
"If you know what equality means to you, then you should be able to find an appropriate epsilon."
The whole point to what I am trying to say is that there are applications where we don't know what equality means in the absolute sense, thus we have to resort to a relative compare which is what the integer version is trying to do.
When it comes to speed, follow these rules:
If you're not a very experienced developer, don't optimize.
If you are an experienced developer, don't optimize yet.
Do the easiest method.
Alex
Oskar's right. Don't screw with this unless you really, really need that performance.
And you don't. If you were in the situation that did, you wouldn't have needed to ask the question -- you'd already know. If you think you do, then you don't. Your performance problems lie elsewhere. Just use the readable version.
Using any method that compares bitwise will result in trouble when fractions are represented by approximations. All floating point numbers with fractions that are not denominated in powers of two (1/2, 1/4, 1/8, 1/65536, &c) are approximated. So, of course, are all irrational numbers.
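double tenth = 0.1;              /* not exactly representable in binary */
double fifth = 0.2;              /* not exactly representable either */
double sum = tenth + fifth;      /* 0.30000000000000004 with IEEE 754 doubles */
if (sum != 0.3)
    printf("Approximation!\n");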
The only time comparing bitwise would work is when you derive the floating point numbers exactly the same way or they are exact representations (whole numbers, fraction powers of two). Even then, there can be multiple representations of some numbers, though I have never seen this in a working system.