Floating point comparison in Apple libraries [duplicate] - c++

Apple CoreGraphics.framework, CGGeometry.h:
CG_INLINE bool __CGSizeEqualToSize(CGSize size1, CGSize size2)
{
return size1.width == size2.width && size1.height == size2.height;
}
#define CGSizeEqualToSize __CGSizeEqualToSize
Why do they (Apple) compare floats with ==? I can't believe this is a mistake. So can you explain it to me?
(I'd have expected something like fabs(size1.width - size2.width) < 0.001.)

Floating point comparisons are native width on all OSX and iOS architectures.
For float, that comes to:
i386, x86_64:
32 bit XMM register (or memory for second operand)
using an instruction in the family of ucomiss
ARM:
32 bit register
using instructions in the same family as vcmp
Some of the floating point comparison issues have been removed by restricting storage to 32/64 bits for these types. Other platforms may often use the native 80 bit FPU. On OS X, SSE instructions are favored, and they use natural widths. So that reduces many of the floating point comparison issues.
But there is still room for error, or times when you will favor approximation. One hidden detail about CGGeometry types' values is that they may be rounded to a nearby integer (you may want to do this yourself in some cases).
Given the range of CGFloat (float, or double on x86_64) and typical values, it's reasonable to assume the rounded values will generally be represented accurately enough that the results will be suitably comparable in the majority of cases. Therefore, it's "pretty safe", "pretty accurate", and "pretty fast" within those confines.
There are still times when you may prefer approximate comparisons in geometry calculations, but Apple's implementation is what I'd consider the closest to a reference implementation for the general purpose solution in this context.
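For cases where you do want an approximate comparison, a tolerance-based helper along the lines the question sketches is easy to write. The following is a minimal illustration using a stand-in Size struct so it compiles anywhere; on an Apple platform you would use CGSize/CGFloat instead, and the 0.001 tolerance is an arbitrary assumption you should tune to your needs:
#include <cmath>

// Stand-in for CGSize so the example is self-contained.
struct Size { double width, height; };

// Approximate comparison: treat the sizes as equal if both components
// differ by less than the given tolerance.
static bool nearlyEqualSize(Size a, Size b, double tolerance = 0.001) {
    return std::fabs(a.width - b.width) < tolerance &&
           std::fabs(a.height - b.height) < tolerance;
}

int main() {
    Size a{10.0, 20.0}, b{10.0004, 19.9997};
    return nearlyEqualSize(a, b) ? 0 : 1;   // returns 0: within tolerance
}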

Related

How does the CPU "cast" a floating point x87 (i think) value?

I just wanted to know how the CPU "casts" a floating point number.
I mean, I suppose that when we use a "float" or "double" in C/C++ the compiler is using the x87 unit, or am I wrong? (I couldn't find the answer.) So, if this is the case and the floating point numbers are not emulated, how does the compiler cast them?
I mean, I suppose that when we use a "float" or "double" in C/C++ the compiler is using the x87 unit, or am I wrong?
On modern Intel processors, the compiler is likely to use the SSE/AVX registers. The FPU is often not in regular use.
I just wanted to know how the CPU "casts" a floating point number.
Converting an integer to a floating-point number is a computation that is basically (glossing over some details):
Start with the binary (for unsigned types) or two’s complement (for signed types) representation of the integer.
If the number is zero, return all bits zero.
If it is negative, remember that and negate the number to make it positive.
Locate the highest bit set in the integer.
Locate the lowest bit that will fit in the significand of the destination format. (For example, for the IEEE-754 binary32 format commonly used for float, 24 bits fit in the significand, so the 25th bit after the highest bit set does not fit.)
Round the number at that position where the significand will end.
Calculate the exponent, which is a function of where the highest bit set is. Add a “bias” used in encoding the exponent (127 for binary32, 1023 for binary64).
Assemble a sign bit, bits for the exponent, and bits for the significand (omitting the high bit, because it is always one). Return those bits.
That computation prepares the bits that represent a floating-point number. (It omits details involving special cases like NaNs, infinities, and subnormal numbers because these do not occur when converting typical integer formats to typical floating-point formats.)
That computation may be performed “in software” (that is, with general instructions for shifting bits, testing values, and so on) or “in hardware” (that is, with special instructions for doing the conversion). All desktop computers have instructions for this. Small processors for special-purpose embedded use might not have such instructions.
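To make the steps above concrete, here is a small C++ sketch that builds the binary32 bit pattern for an unsigned 32-bit integer by hand and compares it against the hardware conversion. It is my own illustration of the algorithm described above (unsigned input only, so the sign and negation steps are skipped), not production code:
#include <cstdint>
#include <cstdio>
#include <cstring>

// Build the IEEE-754 binary32 bit pattern for an unsigned 32-bit integer,
// rounding to nearest with ties to even.
static uint32_t u32_to_binary32_bits(uint32_t n) {
    if (n == 0) return 0;                               // zero -> all bits zero

    int high = 31;                                      // locate the highest set bit
    while (((n >> high) & 1u) == 0) --high;

    uint32_t exponent = static_cast<uint32_t>(high) + 127;   // add the binary32 bias

    uint32_t mantissa;
    if (high <= 23) {
        mantissa = n << (23 - high);                    // everything fits in the significand
    } else {
        int shift = high - 23;                          // low bits that do not fit
        mantissa = n >> shift;
        uint32_t dropped = n & ((1u << shift) - 1);
        uint32_t halfway = 1u << (shift - 1);
        if (dropped > halfway || (dropped == halfway && (mantissa & 1u)))
            ++mantissa;                                 // round to nearest, ties to even
        if (mantissa >> 24) {                           // rounding overflowed the significand
            mantissa >>= 1;
            ++exponent;
        }
    }
    return (exponent << 23) | (mantissa & 0x7FFFFFu);   // drop the implicit leading 1
}

int main() {
    const uint32_t tests[] = {1u, 7u, 16777217u, 4294967295u};
    for (uint32_t n : tests) {
        uint32_t bits = u32_to_binary32_bits(n);
        float sw, hw = static_cast<float>(n);
        std::memcpy(&sw, &bits, sizeof sw);
        std::printf("%u -> software %.9g, hardware %.9g\n", n, sw, hw);
    }
}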
It is not clear what you mean by
"casting" a floating point number?
If the target architecture has an FPU, then the compiler will issue FPU instructions to manipulate floating point variables; no mystery there...
In order to assign a float variable to an int variable, the float must be truncated or rounded (up or down). Special instructions usually exist to serve this purpose.
If the target architecture is "FPU-less", then the compiler (toolchain) might provide a software implementation of floating point operations using the CPU instructions available. For example, an expression like a = x * y; will be equivalent to a = fmul(x, y); where fmul() is a compiler-provided special function (intrinsic) that does floating point operations without an FPU. Of course this is typically MUCH slower than using a hardware FPU. Floating point arithmetic is not used on such platforms if performance matters; fixed point arithmetic (https://en.wikipedia.org/wiki/Fixed-point_arithmetic) could be used instead.
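As a small, plain-C++ illustration of the truncate-versus-round point above: assigning a floating point value to an int truncates toward zero, while std::lrint() rounds according to the current rounding mode (round-to-nearest by default); these are exactly the two behaviours that the special conversion instructions mentioned above implement.
#include <cmath>
#include <cstdio>

int main() {
    float x = 2.7f, y = -2.7f;
    std::printf("(int)x = %d, (int)y = %d\n", (int)x, (int)y);                         // 2, -2 (truncation)
    std::printf("lrint(x) = %ld, lrint(y) = %ld\n", std::lrint(x), std::lrint(y));     // 3, -3 (rounding)
}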

Truncate Floats and Doubles after user defined points in X87 and SSE FPUs

I have made a function g that is able to approximate a function to a certain degree. This function gives accurate results up to 5 decimals (1.23456xxxxxxxxxxxx, where the x positions are just rounding errors / junk).
To avoid spreading error to other computations that will use the results of g, I would like to set all the x positions to zero, or better yet, set to 0 everything after the 5th decimal.
I haven't found anything in the x87 and SSE literature that lets me play with IEEE 754 bits or their representation the way I would like to.
There is an old reference to the FISTP instruction for x87 that is apparently mirrored in the SSE world with FISTTP, with the benefit that FISTTP doesn't necessarily modify the control word and is therefore faster.
I have noticed that FISTTP was called "chopping mode", but in more modern literature it is just "rounding toward zero" or "truncate", and this confuses me because "chopping" means removing something altogether, whereas "rounding toward zero" doesn't necessarily mean the same thing to me.
I don't need to round to zero, I only need to preserve up to 5 decimals as the last step in my function before storing the result in memory; how do I do this in x87 (scalar FPU) and SSE (vector FPU)?
As several people commented, rounding earlier doesn't help the final result be more accurate. If you want to read more about floating point comparisons and weirdness / gotchas, I highly recommend Bruce Dawson's series of articles on floating point. Here's a quote from the index post of the series:
We’ve finally reached the point in this series that I’ve been waiting
for. In this post I am going to share the most crucial piece of
floating-point math knowledge that I have. Here it is:
[Floating-point] math is hard.
You just won’t believe how vastly, hugely, mind-bogglingly hard it is.
I mean, you may think it’s difficult to calculate when trains from
Chicago and Los Angeles will collide, but that’s just peanuts to
floating-point math.
(Bonus points if you recognize that last paragraph as a paraphrase of a famous line about space.)
How you could actually implement your bad idea:
There aren't any machine instructions or C standard library functions to truncate or round to anything other than integer.
Note that there are machine instructions (and C functions) that round a double to nearest (representable) integer without converting it to intmax_t or anything, just double->double. So no round-trip through a fixed-width 2's complement integer.
So to use them, you could scale your float up by some factor, round to nearest integer, then scale back down (like chux's round()-based function, but I'd recommend C99 double rint(double) instead of round(); round has weird rounding semantics that don't match any of the available rounding modes on x86, so it compiles to worse code).
The x86 asm instructions you keep mentioning are nothing special, and don't do anything that you can't ask the compiler to do with pure C.
FISTP (Float Integer STore and Pop the x87 stack) is one way for a compiler or asm programmer to implement long lrint(double) or (int)nearbyint(double). Some compilers make better code for one or the other. It rounds using the current x87 rounding mode (the default: round to nearest), which is the same semantics as those ISO C standard functions.
FISTTP (Float Integer STore with Truncation and Pop the x87 stack) is part of SSE3, even though it operates on the x87 stack. It lets compilers avoid setting the rounding mode to truncation (round-towards-zero) to implement the C truncation semantics of (long)x, and then restoring the old rounding mode.
This is what the "not modify the control word" stuff is about. Neither instruction modifies it, but to implement (int)x without FISTTP, the compiler has to use other instructions to change and restore the rounding mode around a FIST instruction. Or it can just use SSE2 CVTTSD2SI to convert a double in an xmm register with truncation, instead of an FP value on the legacy x87 stack.
Since FISTTP is only available with SSE3, you'd only use it for long double, or in 32-bit code that had FP values in x87 registers anyway because of the crusty old 32-bit ABI which returns FP values on the x87 stack.
P.S. If you didn't recognize Bruce's HHGTG reference, the original is:
Space is big. Really big. You just won’t believe how vastly hugely
mindbogglingly big it is. I mean you may think it’s a long way down
the road to the chemist’s, but that’s just peanuts to space.
how do I do this in x87 (scalar FPU) and SSE (vector FPU)?
The following uses neither x87 nor SSE. I've included it as a community reference for general purpose code. If anything, it can be used to test an x87 solution.
Any "chopping" of the result of g() will at least marginally increase error, hopefully tolerably so, as OP said "To avoid spreading error to other computations ..."
It is unclear if OP wants "accurate results up to 5 decimals" to reflect absolute precision (+/- 0.000005) or relative precision (+/- 0.000005 * result). I will assume "absolute precision".
Since float and double are most often binary floating point types, any "chop" will yield the FP number nearest to a multiple of 0.00001.
Text Method:
// needs <stdio.h>, <stdlib.h> and <float.h>
// buffer layout:  -  x  xxx...xxx  .  xxxxx  \0
char buf[1 + 1 + DBL_MAX_10_EXP + 1 + 5 + 1];
sprintf(buf, "%.5f", x);
x = atof(buf);
round() rint() method:
// needs <math.h> (fabs, rint) and <float.h> (DBL_MAX)
#define SCALE 100000.0
if (fabs(x) < DBL_MAX / SCALE) {
    x = x * SCALE;
    x = rint(x) / SCALE;
}
Direct bit manipulation of x. Simply zero select bits in the significand.
TBD code.
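As a placeholder for that "TBD code", here is a minimal sketch of the bit-manipulation idea (my own illustration, assuming IEEE-754 binary64 and that double and uint64_t have the same size). Note that clearing low significand bits truncates to a power-of-two granularity, not to decimal digits, so it only approximates "keep 5 decimals":
#include <cstdint>
#include <cstring>

// Clear the lowest bits_to_clear bits of the significand (0 <= bits_to_clear < 52).
static double chop_low_mantissa_bits(double x, int bits_to_clear) {
    uint64_t u;
    std::memcpy(&u, &x, sizeof u);                      // reinterpret the bits safely
    u &= ~((UINT64_C(1) << bits_to_clear) - 1);         // zero the selected low bits
    std::memcpy(&x, &u, sizeof x);
    return x;
}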

Should I use floating point's NaN, or floating point + bool for a data set that contains invalid values?

I have a large amount of data to process with math intensive operations on each data set. Much of it is analogous to image processing. However, since this data is read directly from a physical device, many of the pixel values can be invalid.
This makes NaN's property of representing values that are not a number and spreading on arithmetic operations very compelling. However, it also seems to require turning off some optimizations such as gcc's -ffast-math, plus we need to be cross platform. Our current design uses a simple struct that contains a float value and a bool indicating validity.
While it seems NaN was designed with this use in mind,
others think it is more trouble than it is worth. Does anyone have advice based on their more intimate experience with IEEE754 with performance in mind?
BRIEF: For strictest portability, don't use NaNs. Use a separate valid bit, e.g. a template like Valid<T>. However, if you know that you will only ever run on IEEE 754-2008 machines, and not IEEE 754-1985 (see below), then you may get away with it.
For performance, it is probably faster not to use NaNs on most of the machines that you have access to. However, I have been involved with hardware design of FP on several machines that are improving NaN handling performance, so there is a trend to make NaNs faster, and, in particular, signalling NaNs should soon be faster than Valid<T>.
DETAIL:
Not all floating point formats have NaNs. Not all systems use IEEE floating point. IBM hex floating point can still be found on some machines - actually systems, since IBM now supports IEEE FP on more recent machines.
Furthermore, IEEE Floating Point itself had compatibility issues wrt NaNs, in IEEE 754-1985. E.g, see wikipedia http://en.wikipedia.org/wiki/NaN:
The original IEEE 754 standard from 1985 (IEEE 754-1985) only described binary floating point formats, and did not specify how the signaled/quiet state was to be tagged. In practice, the most significant bit of the significand determined whether a NaN is signalling or quiet. Two different implementations, with reversed meanings, resulted.
* Most processors (including those of the Intel/AMD x86-32/x86-64 family, the Motorola 68000 family, the AIM PowerPC family, the ARM family, and the Sun SPARC family) set the signaled/quiet bit to non-zero if the NaN is quiet, and to zero if the NaN is signaling. Thus, on these processors, the bit represents an 'is_quiet' flag.
* In NaNs generated by the PA-RISC and MIPS processors, the signaled/quiet bit is zero if the NaN is quiet, and non-zero if the NaN is signaling. Thus, on these processors, the bit represents an 'is_signaling' flag.
Thus, if your code may run on older HP machines, or current MIPS machines (which are ubiquitous in embedded systems), you should not depend on a fixed encoding of NaN, but should have a machine-dependent #ifdef for your special NaNs.
IEEE 754-2008 standardizes NaN encodings, so this is getting better. It depends on your market.
As for performance: many machines essentially trap, or otherwise take a major hiccup in performance, when performing computations involving both SNaNs (which must trap) and QNaNs (which don't need to trap, i.e. which could be fast - and which are getting faster in some machines as we speak.)
I can say with confidence that on older machines, particularly older Intel machines, you did NOT want to use NaNs if you cared about performance. E.g. http://www.cygnus-software.com/papers/x86andinfinity.html says "The Intel Pentium 4 handles infinities, NANs, and denormals very badly. ... If you write code that adds floating point numbers at the rate of one per clock cycle, and then throw infinities at it as input, the performance drops. A lot. A huge amount. ... NANs are even slower. Addition with NANs takes about 930 cycles. ... Denormals are a bit trickier to measure."
Get the picture? Almost 1000x slower to use a NaN than to do a normal floating point operation? In this case it is almost guaranteed that using a template like Valid<T> will be faster.
However, see the reference to "Pentium 4"? That's a really old web page. For years people like me have been saying "QNaNs should be faster", and it has slowly taken hold.
More recently (2009), Microsoft says http://connect.microsoft.com/VisualStudio/feedback/details/498934/big-performance-penalty-for-checking-for-nans-or-infinity "If you do math on arrays of double that contain large numbers of NaN's or Infinities, there is an order of magnitude performance penalty."
If I feel impelled, I may go and run a microbenchmark on some machines. But you should get the picture.
This should be changing because it is not that hard to make QNaNs fast. But it has always been a chicken and egg problem: hardware guys like those I work with say "Nobody uses NaNs, so we won't make them fast", while software guys don't use NaNs because they are slow. Still, the tide is slowly changing.
Heck, if you are using gcc and want best performance, you turn on optimizations like "-ffinite-math-only ... Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or +-Infs." Similar is true for most compilers.
By the way, you can google like I did, "NaN performance floating point" and check refs out yourself. And/or run your own microbenchmarks.
Finally, I have been assuming that you are using a template like
template<typename T> class Valid {
...
bool valid;
T value;
...
};
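For readers who want to see the elided parts filled in, here is a minimal, self-contained sketch of what such a template might look like. The member names and the choice of operators are my own illustration, not the answer author's; the point is that validity propagates through arithmetic much as a NaN would:
#include <cstdio>

template <typename T>
class Valid {
    bool valid;
    T value;
public:
    Valid() : valid(false), value() {}                 // default-constructed: invalid
    Valid(T v) : valid(true), value(v) {}
    bool is_valid() const { return valid; }
    T get_or(T fallback) const { return valid ? value : fallback; }
    friend Valid operator*(Valid a, Valid b) {         // invalid * anything = invalid
        return (a.valid && b.valid) ? Valid(a.value * b.value) : Valid();
    }
};

int main() {
    Valid<float> a(2.0f), b;                           // b is invalid
    std::printf("%d %g\n", (a * b).is_valid(), (a * a).get_or(-1.0f));  // prints: 0 4
}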
I like templates like this, because they can bring "validity tracking" not just to FP, but also to integers (Valid<int>), etc.
But they can have a big cost. The operations are probably not much more expensive than NaN handling on old machines, but the data density can be really poor. sizeof(Valid<float>) may sometimes be 2*sizeof(float). This bad density may hurt performance much more than the operations involved.
By the way, you should consider template specialization, so that Valid<float> uses NaNs if they are available and fast, and a valid bit otherwise.
template <>
class Valid<float> {
    float value;
public:
    bool is_valid() const {
        // NaN is the only value that compares unequal to itself, so this
        // detects the NaN used to mark the value as invalid.
        return value == value;
    }
};
etc.
Anyway, you are better off having as few valid bits as possible, and packing them elsewhere, rather than putting a Valid<float> right next to each value. E.g.
struct Point { float x, y, z; };
Valid<Point> pt;
is better (density wise) than
struct Point_with_Valid_Coords { Valid<float> x, y, z; };
unless you are using NaNs - or some other special encoding.
And
struct Point_with_Valid_Coords { float x, y, z; bool valid_x, valid_y, valid_z; };
is in between - but then you have to do all the code yourself.
BTW, I have been assuming you are using C++. If FORTRAN or Java ...
BOTTOM LINE: separate valid bits are probably faster and more portable.
But NaN handling is speeding up, and one day soon it will be good enough.
By the way, my preference: create a Valid template. Then you can use it for all data types. Specialize it for NaNs if it helps. Although my life is making things faster, IMHO it is usually more important to make the code clean.
If invalid data is very common, you are of course wasting a lot of time running this data through the processing. If invalid data is common enough, it is probably better to use some kind of sparse data structure that holds only the valid data. If it is not very common, you can of course keep a sparse data structure recording which data is invalid. That way you would not waste a bool for each value. But maybe memory is not a problem for you...
If you are doing operations such as multiplying two possibly invalid data entries, I understand it is compelling to use NaNs instead of checking both variables to see if they are valid and setting the same flag in the result.
How portable do you need to be? Will you ever need to be able to port it to an architecture with only fixed point support? If that is the case, I think your choice is clear.
Personally I would only use NaNs if it proved to be much faster. Otherwise I'd say the code gets more clear if you have explicit handling of invalid data.
Since the floating-point numbers come from a device, they probably have a limited range. You can use some other special number, rather than NaN, to indicate absence of data, e.g. 1e37. This solution is portable. I do not know whether or not it is more convenient for you than using a bool flag.

Floating point versus fixed point: what are the pros/cons?

A floating point type represents a number by storing its significant digits and its exponent in separate binary fields, so that it fits in 16, 32, 64 or 128 bits.
A fixed point type stores numbers with two words, one representing the integer part, the other representing the part past the radix point, in negative powers of two: 2^-1, 2^-2, 2^-3, etc.
Floats are better because they have a wider range in an exponent sense, but not if one wants to store numbers with more precision for a certain range, for example only using integers from -16 to 16, thus using more bits to hold digits past the radix point.
In terms of performance, which one has the best performance, or are there cases where one is faster than the other?
In video game programming, does everybody use floating point because the FPU makes it faster, or because the performance drop is just negligible, or do they make their own fixed point type?
Why isn't there any fixed point type in C/C++?
That definition covers a very limited subset of fixed point implementations.
It would be more correct to say that in fixed point only the mantissa is stored and the exponent is a constant determined a-priori. There is no requirement for the binary point to fall inside the mantissa, and definitely no requirement that it fall on a word boundary. For example, all of the following are "fixed point":
64 bit mantissa, scaled by 2^-32 (this fits the definition listed in the question)
64 bit mantissa, scaled by 2^-33 (now the integer and fractional parts cannot be separated by an octet boundary)
32 bit mantissa, scaled by 2^4 (now there is no fractional part)
32 bit mantissa, scaled by 2^-40 (now there is no integer part)
GPUs tend to use fixed point with no integer part (typically a 32-bit mantissa scaled by 2^-32). Therefore APIs such as OpenGL and Direct3D often use floating-point types which are capable of holding these values. However, manipulating the integer mantissa is often more efficient, so these APIs allow specifying coordinates (in texture space, color space, etc.) this way as well.
As for your claim that C++ doesn't have a fixed point type, I disagree. All integer types in C++ are fixed point types. The exponent is often assumed to be zero, but this isn't required and I have quite a bit of fixed-point DSP code implemented in C++ this way.
At the code level, fixed-point arithmetic is simply integer arithmetic with an implied denominator.
For many simple arithmetic operations, fixed-point and integer operations are essentially the same. However, there are some operations for which the intermediate values must be represented with a higher number of bits and then rounded off. For example, to multiply two 16-bit fixed-point numbers, the result must be temporarily stored in 32 bits before renormalizing (or saturating) back to 16-bit fixed-point.
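A minimal sketch of that widening step, using an assumed Q8.8 format (16-bit values with 8 fractional bits, i.e. an implied denominator of 256); the format and helper names are my own for illustration:
#include <cstdint>
#include <cstdio>

using q8_8 = int16_t;

constexpr q8_8 to_q8_8(double x)   { return static_cast<q8_8>(x * 256.0); }
constexpr double from_q8_8(q8_8 x) { return x / 256.0; }

// Multiply two Q8.8 numbers: widen to 32 bits for the intermediate product,
// then shift right by 8 (arithmetic shift) to renormalize back to Q8.8.
q8_8 q8_8_mul(q8_8 a, q8_8 b) {
    int32_t wide = static_cast<int32_t>(a) * static_cast<int32_t>(b);
    return static_cast<q8_8>(wide >> 8);
}

int main() {
    q8_8 a = to_q8_8(3.25), b = to_q8_8(-1.5);
    std::printf("%g\n", from_q8_8(q8_8_mul(a, b)));    // prints -4.875
}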
When the software does not take advantage of vectorization (such as CPU-based SIMD or GPGPU), integer and fixed-point arithmetic are faster than the FPU. When vectorization is used, the efficiency of the vectorization matters much more, and the performance difference between fixed-point and floating-point is moot.
Some architectures provide hardware implementations for certain math functions, such as sin, cos, atan, sqrt, for floating-point types only. Some architectures do not provide any hardware implementation at all. In both cases, specialized math software libraries may provide those functions using only integer or fixed-point arithmetic. Often, such libraries will provide multiple levels of precision, for example answers which are only accurate up to N bits of precision, which is less than the full precision of the representation. The limited-precision versions may be faster than the highest-precision version.
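As a flavour of what the integer-only routines in such a library can look like, here is a bit-by-bit integer square root (my own illustration, not taken from any particular library). It needs no FPU and returns floor(sqrt(n)):
#include <cstdint>
#include <cstdio>

uint32_t isqrt(uint32_t n) {
    uint32_t result = 0;
    for (uint32_t bit = 1u << 15; bit != 0; bit >>= 1) {   // sqrt of a 32-bit value fits in 16 bits
        uint32_t trial = result | bit;
        if (static_cast<uint64_t>(trial) * trial <= n)     // widen to avoid overflow
            result = trial;                                // keep the bit if the square still fits
    }
    return result;
}

int main() {
    std::printf("%u %u %u\n", isqrt(0), isqrt(26), isqrt(4294967295u));  // 0 5 65535
}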
Fixed point is widely used in DSP and embedded systems where often the target processor has no FPU, and fixed point can be implemented reasonably efficiently using an integer ALU.
In terms of performance, that is likely to vary depending on the target architecture and application. Obviously if there is no FPU, then fixed point will be considerably faster. When you have an FPU it will depend on the application too. For example, performing some functions such as sqrt() or log() will be much faster when directly supported in the instruction set rather than implemented algorithmically.
There is no built-in fixed point type in C or C++, I imagine, because they (or at least C) were envisaged as systems level languages and the need for fixed point is somewhat domain specific, and also perhaps because on a general purpose processor there is typically no direct hardware support for fixed point.
In C++, defining a fixed-point data type class with suitable operator overloads and associated math functions can easily overcome this shortcoming. However there are good and bad solutions to this problem. A good example can be found here: http://www.drdobbs.com/cpp/207000448. The link to the code in that article is broken, but I tracked it down to ftp://66.77.27.238/sourcecode/ddj/2008/0804.zip
You need to be careful when discussing "precision" in this context.
For the same number of bits in representation the maximum fixed point value has more significant bits than any floating point value (because the floating point format has to give some bits away to the exponent), but the minimum fixed point value has fewer than any non-denormalized floating point value (because the fixed point value wastes most of its mantissa in leading zeros).
Also depending on the way you divide the fixed point number up, the floating point value may be able to represent smaller numbers meaning that it has a more precise representation of "tiny but non-zero".
And so on.
The difference between floating point and integer math depends on the CPU you have in mind. On Intel chips the difference is not big in clock ticks. Integer math is still faster because there are multiple integer ALUs that can work in parallel. Compilers are also smart enough to use special address calculation instructions to optimize an add/multiply into a single instruction. Conversion counts as an operation too, so just choose your type and stick with it.
In C++ you can build your own type for fixed point math. You just define a struct with one int, overload the appropriate operators, and make them do what they normally do plus a shift to put the radix point back in the right position.
You don't use float in games because it is faster or slower; you use it because it is easier to implement the algorithms in floating point than in fixed point. You are assuming the reason has to do with computing speed, and that is not the reason; it has to do with ease of programming.
For example, you may define the width of the screen/viewport as going from 0.0 to 1.0, the height of the screen from 0.0 to 1.0, the depth of the world from 0.0 to 1.0, and so on. Matrix math, etc. makes things really easy to implement. Do all of the math that way up to the point where you need to compute real pixels on a real screen size, say 800x400: project the ray from the eye to the point on the object in the world and compute where it pierces the screen using the 0-to-1 math, then multiply x by 800 and y by 400 and place that pixel.
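A toy illustration of that last step (the screen size and point values are made-up numbers; the coordinates were chosen to be exactly representable so the pixel values in the comments are exact):
#include <cstdio>

int main() {
    const int width = 800, height = 400;
    float nx = 0.375f, ny = 0.8125f;             // normalized [0,1) coordinates
    int px = static_cast<int>(nx * width);       // 300
    int py = static_cast<int>(ny * height);      // 325
    std::printf("pixel (%d, %d)\n", px, py);
}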
Floating point does not store the exponent and mantissa in separate words; the mantissa is a goofy number, what is left over after the exponent and sign, like 23 bits, not 16 or 32 or 64 bits.
Floating point math at its core uses fixed point logic with extra logic and extra steps required. By definition, compared apples to apples, fixed point math is cheaper because you don't have to manipulate the data on the way into the ALU and don't have to manipulate the data on the way out (normalize). When you add in IEEE and all of its garbage, that adds even more logic and more clock cycles, etc. (properly signed infinity, quiet and signaling NaNs, different results for the same operation if an exception handler is enabled). As someone pointed out in a comment, in a real system where you can do fixed and float in parallel, you can take advantage of some or all of the processors and recover some clocks that way. Both float and fixed clock rates can be increased by using vast quantities of chip real estate; fixed will remain cheaper, but float can approach fixed speeds using these kinds of tricks, as well as parallel operation.
One issue not covered in the answers is power consumption. Though it highly depends on the specific hardware architecture, usually an FPU consumes much more energy than the CPU's integer ALU, so if you target mobile applications where power consumption is important, it's worth considering a fixed point implementation of the algorithm.
It depends on what you're working on. If you're using fixed point then you lose precision; you have to select the number of places after the radix point (which may not always be good enough). With floating point you don't need to worry about this, as the precision offered is nearly always good enough for the task in hand: it uses a standard-form representation of the number.
The pros and cons come down to speed and resources. On modern 32-bit and 64-bit platforms there is really no need to use fixed point. Most systems come with built-in FPUs that are hardwired and optimised for floating point operations. Furthermore, most modern CPUs provide extensions such as the SIMD instruction sets, which help optimise vector-based methods via vectorisation and unrolling. So fixed point only comes with a downside.
On embedded systems and small microcontrollers (8-bit and 16-bit) you may not have an FPU nor extended instruction sets. In which case you may be forced to use fixed point methods or the limited floating point instruction sets that are not very fast. So in these circumstances fixed point will be a better, or even your only, choice.

changing float type to short but with same behaviour as float type variable

Is it possible to change the
float *pointer
type that is used in the VS c++ project
to some other type, so that it will still behave as a floating type but with less range?
I know that the floating point values never exceed some fixed value in that project, so I want to optimize the program by reducing the memory it uses. It doesn't need 4 bytes for each element of the 'float *pointer'; 2 bytes will be enough, I think. If I change a float to a short and imitate the floating point behaviour, then it will use half the memory. How do I do it?
EDIT:
It calculates the probabilities. So there are divisions like
A / B
Where A < B,
And also B (and A) can be from 1 to 10 000.
There is a standard 16-bit floating point format described in IEEE 754-2008 called "binary16". It is specified as a format to store floating point values with reduced precision. There is almost no compiler support for it yet (I think GCC supports it for certain ARM platforms), but it is quite easy to roll your own routines. This fellow:
http://blog.fpmurphy.com/2008/12/half-precision-floating-point-format_14.html
wrote a bit about it and also presents a routine to convert half-float <-> float.
Also, here seems to be a half-float C++ wrapper class:
half.h:
http://www.koders.com/cpp/fidABD00D95DE84C73BF0218AC621E400E07AA77B53.aspx
half.cpp
http://www.koders.com/cpp/fidF0DD0510FAAED03817A956D251787609BEB5989E.aspx
which supplies "HalfFloat" as a possible drop-in replacement type.
Maybe use fixed-point math? It all depends on the value range and the precision you want to achieve.
http://www.eetimes.com/discussion/other/4024639/Fixed-point-math-in-C
For C there is a lot of code that makes fixed point easy, and I'm pretty sure there are also many C++ classes that make it even easier, but I don't know of any; I'm more into C.
The first, obvious, memory optimization would be to try and get rid of the pointer. If you can store just the float, that may, depending on the larger context, reduce your memory consumption from eight to four bytes already. (On a 64-bit system, from twelve to four.)
Whether you can get by with a short depends on what your program does with the values. You may be able to use fixed point arithmetic using an integral type such as a short, yes, but your question shows way too little context to judge that.
The code you posted and the text in the question do not deal with actual float, but with pointers to float. In all architectures I know of, the size of a pointer is the same regardless of the pointed type, so there would be no improvement in changing that to a short or char pointer.
Now, about the actual pointed-to elements: what is the range that you expect in your application? What is the precision you need? How many of those elements do you have? What are the memory constraints of your target platform? Unless the range and precision are small and the number of elements huge, just use floats. Also note that if you need floating point operations, storing any other type will require conversions before and after each operation, and you might be impacting performance.
Without greater knowledge of what you are doing: the ranges for short in many architectures are [-32K, 32K), where K stands for 1024. If your data range is [-32, 32) and you can do with roughly 3 decimal digits, you could use fixed point arithmetic with shorts, but there are few such situations.