Why is calculated Free & Total drive space capping out at 2G? - c++

I wrote a directory information utility and (because I, and the people I wrote this for, collect and use vintage hardware) made it compatible with DOS and Windows 9x as well as 64-bit Windows XP/Vista/7/8 (because we also use those). The problem I'm running into is that on Windows 9x it reports available drive space and total drive space as 2 GB (well, 1.9997 GB) even on larger drives. On Windows XP and beyond (32-bit or 64-bit) it reports the drive sizes correctly. In DOS, of course, this isn't an issue, as the maximum drive size in DOS is already 2 GB.
The code I'm using is (DInfo.Path is the directory being accessed, with [0] being the drive letter - A, B, C, etc...):
_dos_getdiskfree(DInfo.Path[0] - 'A' + 1, &Free);
BlockSize = Free.sectors_per_cluster * Free.bytes_per_sector;
for (i = 0; i < BlockSize; i++) {
    DriveBytes += Free.total_clusters;
    if (DriveBytes < Free.total_clusters) ++DBOverflow;
    FreeBytes += Free.avail_clusters;
    if (FreeBytes < Free.avail_clusters) ++FBOverflow;
}
The only difference between the code in the DOS stub and the Windows portion of the executable is that _dos_getdiskfree is replaced with _getdiskfree. I use unsigned __int32 variables in the above code (or unsigned long for the DOS code). I used 32-bit variables for compatibility and to reduce rewriting the code as much as possible when converting the DOS code to Windows code. In Windows XP+ I could probably have simplified things by using __int64 variables, but I wasn't sure whether Windows 9x would provide those, or even whether the 32-bit versions of Windows XP+ would, and I really didn't want to research it just to streamline things a bit. Even on older hardware it works fast enough with the loop.
With the overflow and byte variables both 32-bit integers, the size should max out at 8 exabytes (kilobytes, megabytes, gigabytes, terabytes, petabytes, exabytes, in case you were wondering), and since the largest drives currently available are measured in single-digit terabytes, that limit shouldn't cause a problem for a while. At least it's doubtful to do so during my lifetime.
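For reference, the counter pair maps back to a single 64-bit figure like this (a sketch only; it assumes the compiler provides unsigned __int64, which the Windows builds do but the DOS build may not):

/* Hypothetical reconstruction, not part of the utility itself:
   DBOverflow counts how many times DriveBytes wrapped past 2^32,
   so the full total is (overflow << 32) + remainder. */
unsigned __int64 TotalDriveBytes =
    ((unsigned __int64)DBOverflow << 32) + DriveBytes;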

The answer provided by Raymond Chen in a comment has fixed the problem. Using GetDiskFreeSpaceEx rather than GetDiskFreeSpace produced the correct results.
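For anyone landing here later, a minimal sketch of the GetDiskFreeSpaceEx approach (assuming an ANSI Win32 build; note that GetDiskFreeSpaceEx requires Windows 95 OSR2 or later, so the original GetDiskFreeSpace path may still be needed as a fallback on the earliest 9x releases):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    ULARGE_INTEGER freeToCaller, totalBytes, totalFree;

    /* "C:\\" is just an example root; the question's DInfo.Path would be used instead. */
    if (GetDiskFreeSpaceExA("C:\\", &freeToCaller, &totalBytes, &totalFree))
    {
        /* %I64u is the MSVC-specific format for unsigned __int64 */
        printf("Total bytes: %I64u\n", totalBytes.QuadPart);
        printf("Free bytes : %I64u\n", totalFree.QuadPart);
    }
    return 0;
}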

Related

I converted my compiler from 32-bit to 64-bit, but I still can't use more than 2 GB :( Why?

I can create this array:
int Array[490000000];
cout << "Array Byte= " << sizeof(Array) << endl;
The array is 1,960,000,000 bytes, which converts to about 1.96 GB - roughly 2 GB.
But I can't create both of these at the same time:
int Array[490000000];
int Array2[490000000];
It gives an error. Why? Sorry for bad English :)
Also, I checked my compiler like this:
printf("%d\n", sizeof(char *));
it gives me 8.
C++ programs are not usually compiled to have 2 GB+ of stack space, regardless of whether they are compiled in 32-bit mode or 64-bit mode. Stack space can be increased as part of the compiler options, but even in the scenario where it is permissible to set the stack size that high, it's still not an idiomatic or recommended solution.
If you need a 2 GB array, you should use std::vector<int> Array(490'000'000); (strongly recommended) or a manually created array, i.e. int* Array = new int[490'000'000]; (remember that manually allocated memory must be manually deallocated with delete[]), either of which will allocate dynamic memory. You'll still want to be compiling in 64-bit mode, since this will brush up against the maximum memory limit of your application if you don't, but in your scenario it's not strictly necessary, since 2 GB is less than the maximum memory of a 32-bit application.
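A minimal, self-contained sketch of the recommended approach (the element count is taken from the question):

#include <iostream>
#include <vector>

int main()
{
    // ~1.96 GB of ints, allocated on the heap rather than the stack.
    std::vector<int> Array(490000000);
    std::cout << "Array bytes = " << Array.size() * sizeof(int) << std::endl;
    return 0;
}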
But still I can't use more than 2 GB :( why?
The C++ language has no mechanism to modify (or report) how much automatic memory is available (or at least I have not seen one). Compilers rely on the OS to provide some 'useful' amount. You will have to search (Google, your hardware documents, user's manuals, etc.) to find out how much. This limit is machine dependent, in that some machines do not have as much memory as you may want.
On Ubuntu, for the last few releases, the POSIX function ::pthread_attr_getstacksize(...) reports 8 MB per thread. (I am not sure of the proper terminology, but) what Linux calls the 'stack' is the resource the C++ compiler uses for automatic memory. For this release of OS and compiler, the limit for automatic variables is thus 8 MB (much smaller than 2 GB).
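For reference, a sketch of that query (Linux/POSIX; pthread_attr_init() fills the attribute object with the implementation's defaults, which is where the 8 MB figure comes from):

#include <pthread.h>
#include <iostream>

int main()
{
    pthread_attr_t attr;
    size_t stackSize = 0;

    pthread_attr_init(&attr);                      // defaults for this implementation
    pthread_attr_getstacksize(&attr, &stackSize);  // typically 8 MB on glibc/Ubuntu
    pthread_attr_destroy(&attr);

    std::cout << "default stack size: " << stackSize << " bytes\n";
    return 0;
}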
I suppose that because the next machine might have more memory, the compiler might be given more automatic memory, and I've seen no semantics that would limit the size of your array based on the memory size of the machine performing the compile ...
so there can be no compile-time report that the stack will overflow.
I see Posix has a function suggesting a way to adjust size of stack. I've not tried it.
I have also found Ubuntu commands that can report and adjust the sizes of various memory limits.
From https://www.nics.tennessee.edu/:
The command to modify limits varies by shell. The C shell (csh) and
its derivatives (such as tcsh) use the limit command to modify limits.
The Bourne shell (sh) and its derivatives (such as ksh and bash) use
the ulimit command. The syntax for these commands varies slightly and
is shown below. More detailed information can be found in the man page
for the shell you are using.
One minor experiment ... at the command prompt,
$ dtb_chimes
launches this work-in-progress app, which uses the POSIX call above and reports an 8 MB stack (automatic memory).
With the ulimit prefix command
$ ulimit -S -s 131072 ; dtb_chimes
the app now reports 134,217,728
./dtb_chimes_ut
default Stack size: 134,217,728
argc: 1
1 ./dtb_chimes_ut
But I have not confirmed the actual allocation ... and this is still a lot smaller than 1.96 GBytes ... but, maybe you can get there.
Note: I strongly recommend std::vector versus big array.
On my Ubuntu desktop, there is 4 GByte total dram (I have memory test utilities), and my dynamic memory is limited to about 3.5 GB. Again, the amount of dynamic memory is machine dependent.
64 bits address a lot more memory than I can afford.

How to find out if we are using really 48, 56 or 64 bits pointers

I am using tricks to store extra information in pointers. At the moment some bits of a pointer are unused (the highest 16 bits), but this will change in the future. I would like to have a way to detect if we are compiling or running on a platform that will use more than 48 bits for pointers.
related things:
Why can't OS use entire 64-bits for addressing? Why only the 48-bits?
http://developer.amd.com/wordpress/media/2012/10/24593_APM_v2.pdf
The solution is needed for x86-64, Windows, C/C++, preferably something that can be done at compile time.
Solutions for other platforms are also of interest but will not be marked as the correct answer.
Windows has exactly one switch for 32bit and 64bit programs to determine the top of their virtual-address-space:
IMAGE_FILE_LARGE_ADDRESS_AWARE
For both types, omitting it limits the program to the lower 2 GB of address-space, severely reducing the memory an application can map and thus also reducing effectiveness of Address-Space-Layout-Randomization (ASLR, an attack mitigation mechanism).
There is one upside to it though, and just what you seem to want: Only the lower 31 bits of a pointer can be set, so pointers can be safely round-tripped through int (32 bit integer, sign- or zero-extension).
At run-time the situation is slightly better:
Just use the CPUID instruction, function EAX=80000008H, and read the maximum number of address bits used for virtual addresses from bits 8-15 of EAX.
The OS cannot use more than the CPU supports; remember that Intel insists on canonical addresses (sign-extended).
See here for how to use cpuid from C++: CPUID implementations in C++
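For example, on Windows/MSVC the run-time check might look like this (a sketch using the __cpuid intrinsic from <intrin.h>; it assumes the CPU supports extended leaf 80000008H, which any x86-64 processor does):

#include <intrin.h>
#include <cstdio>

int main()
{
    int regs[4] = { 0 };            // EAX, EBX, ECX, EDX
    __cpuid(regs, 0x80000008);      // extended function 80000008H

    int physicalBits = regs[0] & 0xFF;         // EAX bits 0-7
    int virtualBits  = (regs[0] >> 8) & 0xFF;  // EAX bits 8-15

    std::printf("physical address bits: %d\n", physicalBits);
    std::printf("virtual address bits : %d\n", virtualBits);
    return 0;
}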

Heap size for windows app

I have a Windows 7 Ultimate 64-bit computer with 8 GB of RAM. I created a Win32 console application in MSVC and wrote the following:
size_t const s_chunkSize = 1024 * 32;
size_t total = 0;
for (;;)
{
    if (!::malloc(s_chunkSize))
    {
        break;
    }
    total += s_chunkSize;
}
printf("total = %li", total);
// yes, I do not free allocated memory for simplicity
It outputs 2111668224, which is just below 2 GB. How can I force my program to allocate more than 2 GB? Do I have to change some MSVC project settings? Do I have to use Windows-specific functions instead of malloc? Or do I have to configure Windows somehow?
As explained in the comments, you must use the /LARGEADDRESSAWARE linker flag to enable the use of >2GB of virtual address space on machines that provide it (typically, 32 bit machines with the /3GB flag or 64 bit machines). Notice that doing this requires you to exercise extra care when dealing with pointers ( http://blogs.msdn.com/b/oldnewthing/archive/2004/08/12/213468.aspx and articles linked from there), and won't allow you to access more than 4 GB of virtual address space anyway.
A better solution is to build a 64 bit version of your program: you are no longer restricted to a 32 bit address space, and you avoid the caveats of addresses with the high bit set. Obviously, the downside (beside the porting problems that may arise) is that the generated executable will run only on 64 bit machines.

How to write convertible code, 32 bit/64 bit?

A C++-specific question. I read a question about what makes a program 32-bit or 64-bit, and the answer it got was something like this (sorry, I can't find the question; it was some days ago that I looked at it and I can't find it again :( ): as long as you don't make any "pointer assumptions", you only need to recompile it. So my question is, what are pointer assumptions? To my understanding there are 32-bit pointers and 64-bit pointers, so I figure it has something to do with that. Please show the difference in code between them. Any other good habits to keep in mind while writing code that make it easy to convert between the two are also welcome :) though please share examples with them.
P.S. I know there is this post:
How do you write code that is both 32 bit and 64 bit compatible?
but I thought it was kind of too general, with no good examples for new programmers like myself. Like, what is a 32-bit storage unit, etc.? Kind of hoping to break it down a bit more (no pun intended ^^).
In general it means that your program behavior should never depend on the sizeof() of any types (that are not made to be of some exact size), neither explicitly nor implicitly (this includes possible struct alignments as well).
Pointers are just a subset of them, and it probably also means that you should not try to rely on being able to convert between unrelated pointer types and/or integers, unless they are specifically made for this (e.g. intptr_t).
In the same way you need to take care of things written to disk, where you should also never rely on the size of, e.g., built-in types being the same everywhere.
Whenever you have to (because of, e.g., external data formats), use explicitly sized types like uint32_t.
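For instance, a hypothetical on-disk record could be declared with fixed-width types so every field has the same size on every platform (a sketch; the field names are made up):

#include <cstdint>

struct FileHeader
{
    std::uint32_t magic;        // exactly 4 bytes on every conforming platform
    std::uint32_t version;
    std::uint64_t payloadSize;  // exactly 8 bytes everywhere
};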
For a well-formed program (that is, a program written according to syntax and semantic rules of C++ with no undefined behaviour), the C++ standard guarantees that your program will have one of a set of observable behaviours. The observable behaviours vary due to unspecified behaviour (including implementation-defined behaviour) within your program. If you avoid unspecified behaviour or resolve it, your program will be guaranteed to have a specific and certain output. If you write your program in this way, you will witness no differences between your program on a 32-bit or 64-bit machine.
A simple (forced) example of a program that will have different possible outputs is as follows:
#include <iostream>

int main()
{
    std::cout << sizeof(void*) << std::endl;
    return 0;
}
This program will likely have different output on 32- and 64-bit machines (but not necessarily). The result of sizeof(void*) is implementation-defined. However, it is certainly possible to have a program that contains implementation-defined behaviour but is resolved to be well-defined:
#include <iostream>

int main()
{
    int size = sizeof(void*);
    if (size != 4) {
        size = 4;
    }
    std::cout << size << std::endl;
    return 0;
}
This program will always print out 4, despite the fact it uses implementation-defined behaviour. This is a silly example because we could have just done int size = 4;, but there are cases when this does appear in writing platform-independent code.
So the rule for writing portable code is: aim to avoid or resolve unspecified behaviour.
Here are some tips for avoiding unspecified behaviour:
Do not assume anything about the size of the fundamental types beyond what the C++ standard specifies: a char is at least 8 bits, both short and int are at least 16 bits, and so on.
Don't try to do pointer magic (casting between pointer types or storing pointers in integral types).
Don't use an unsigned char* to read the value representation of a non-char object (for serialisation or related tasks).
Avoid reinterpret_cast.
Be careful when performing operations that may over or underflow. Think carefully when doing bit-shift operations.
Be careful when doing arithmetic on pointer types.
Don't use void*.
There are many more occurrences of unspecified or undefined behaviour in the standard. It's well worth looking them up. There are some great articles online that cover some of the more common differences that you'll experience between 32- and 64-bit platforms.
"Pointer assumptions" is when you write code that relies on pointers fitting in other data types, e.g. int copy_of_pointer = ptr; - if int is a 32-bit type, then this code will break on 64-bit machines, because only part of the pointer will be stored.
So long as pointers are only stored in pointer types, it should be no problem at all.
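If you genuinely need to round-trip a pointer through an integer, a sketch of the portable way is to use the dedicated integer types from <cstdint> rather than int:

#include <cstdint>

void stash_and_restore(int* ptr)
{
    // uintptr_t is guaranteed (where provided) to be wide enough to hold a pointer,
    // so this works identically on 32-bit and 64-bit builds.
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(ptr);
    int* restored = reinterpret_cast<int*>(bits);
    (void)restored;
}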
Typically, pointers are the size of the "machine word", so on a 32-bit architecture pointers are 32 bits, and on a 64-bit architecture all pointers are 64-bit. However, there are SOME architectures where this is not true. I have never worked on such machines myself [other than x86 with its "far" and "near" pointers - but let's ignore that for now].
Most compilers will warn you when you convert a pointer to an integer type that the pointer doesn't fit into, so if you enable warnings, MOST of the problems will become apparent - fix the warnings, and chances are pretty decent that your code will work straight away.
There will be no difference between 32-bit code and 64-bit code; the goal of C/C++ and other high-level languages, unlike assembly language, is portability.
The only difference will be the distribution you compile your code on; all the work is done automatically by your compiler/linker, so just don't think about it.
But: if you are programming on a 64-bit distribution and you need to use an external library, for example SDL, that external library will also have to be compiled as 64-bit if you want your code to compile.
One thing to know is that your ELF file will be bigger on a 64-bit distribution than on a 32-bit one; that's only logical.
What's the deal with pointers? When you increment/change a pointer, the compiler increments it by the size of the pointed-to type.
The pointed-to type's size is defined by your processor's register size / the distribution you're working on.
But you just don't have to care about this; the compiler does everything for you.
In sum: that's why you can't execute a 64-bit ELF file on a 32-bit distribution.
Typical pitfalls for 32bit/64bit porting are:
The implicit assumption by the programmer that sizeof(void*) == 4 * sizeof(char).
If you're making this assumption and e.g. allocate arrays that way ("I need 20 pointers so I allocate 80 bytes"), your code breaks on 64bit because it'll cause buffer overruns.
The "kitten-killer" , int x = (int)&something; (and the reverse, void* ptr = (void*)some_int). Again an assumption of sizeof(int) == sizeof(void*). This doesn't cause overflows but looses data - the higher 32bit of the pointer, namely.
Both of these issues are of a class called type aliasing (assuming identity / interchangeability / equivalence on a binary representation level between two types), and such assumptions are common; like on UN*X, assuming time_t, size_t, off_t are int, or on Windows, that HANDLE, void* and long are interchangeable, etc...
Assumptions about data structure / stack space usage (See 5. below as well). In C/C++ code, local variables are allocated on the stack, and the space used there is different between 32bit and 64bit mode due to the point below, and due to the different rules for passing arguments (32bit x86 usually on the stack, 64bit x86 in part in registers). Code that just about gets away with the default stacksize on 32bit might cause stack overflow crashes on 64bit.
This is relatively easy to spot as a cause of the crash but depending on the configurability of the application possibly hard to fix.
Timing differences between 32bit and 64bit code (due to different code sizes / cache footprints, or different memory access characteristics / patterns, or different calling conventions ) might break "calibrations". Say, for (int i = 0; i < 1000000; ++i) sleep(0); is likely going to have different timings for 32bit and 64bit ...
Finally, the ABI (Application Binary Interface). There's usually bigger differences between 64bit and 32bit environments than the size of pointers...
Currently, two main "branches" of 64bit environments exist, IL32P64 (what Win64 uses - int and long are int32_t, only uintptr_t/void* is uint64_t, talking in terms of the sized integers from <stdint.h>) and LP64 (what UN*X uses - int is int32_t, long is int64_t and uintptr_t/void* is uint64_t), but there are "subdivisions" of different alignment rules as well - some environments assume long, float or double align at their respective sizes, while others assume they align at multiples of four bytes. In 32bit Linux, they all align at four bytes, while in 64bit Linux, float aligns at four bytes, and long and double at eight-byte multiples.
The consequence of these rules is that in many cases, both sizeof(struct { ... }) and the offsets of structure/class members are different between 32bit and 64bit environments even if the data type declaration is completely identical.
Beyond impacting array/vector allocations, these issues also affect data input/output, e.g. through files - if a 32bit app writes e.g. struct { char a; int b; char c; long d; double e; } to a file that the same app recompiled for 64bit reads in, the result will not be quite what's hoped for.
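A quick way to see this on a given machine is to print the size and member offsets (a sketch using the struct from the previous paragraph; the numbers typically differ between 32-bit and 64-bit builds because long and the alignment rules differ):

#include <cstddef>
#include <cstdio>

struct S { char a; int b; char c; long d; double e; };

int main()
{
    std::printf("sizeof(S) = %zu\n", sizeof(S));
    std::printf("offsets: b=%zu c=%zu d=%zu e=%zu\n",
                offsetof(S, b), offsetof(S, c), offsetof(S, d), offsetof(S, e));
    return 0;
}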
The examples just given are only about language primitives (char, int, long etc.) but of course affect all sorts of platform-dependent / runtime library data types, whether size_t, off_t, time_t, HANDLE, or essentially any nontrivial struct/union/class ... - so the room for error here is large.
And then there's the lower-level differences, which come into play e.g. for hand-optimized assembly (SSE/SSE2/...); 32bit and 64bit have different (numbers of) registers, different argument passing rules; all of this affects strongly how such optimizations perform and it's very likely that e.g. SSE2 code which gives best performance in 32bit mode will need to be rewritten / needs to be enhanced to give best performance 64bit mode.
There's also code design constraints which are very different for 32bit and 64bit, particularly around memory allocation / management; an application that's been carefully coded to "maximize the hell out of the mem it can get in 32bit" will have complex logic on how / when to allocate/free memory, memory-mapped file usage, internal caching, etc - much of which will be detrimental in 64bit where you could "simply" take advantage of the huge available address space. Such an app might recompile for 64bit just fine, but perform worse there than some "ancient simple deprecated version" which didn't have all the maximize-32bit peephole optimizations.
So, ultimately, it's also about enhancements / gains, and that's where more work comes in, partly in programming, partly in design/requirements. Even if your app cleanly recompiles both on 32bit and 64bit environments and is verified on both, is it actually benefitting from 64bit? Are there changes that can/should be made to the code logic to make it do more / run faster in 64bit? Can you make those changes without breaking 32bit backward compatibility? Without negative impacts on the 32bit target? Where will the enhancements be, and how much can you gain?
For a large commercial project, answers to these questions are often important markers on the roadmap because your starting point is some existing "money maker"...

segfault on Win XP x64, doesn't happen on XP x32 - strncpy issue? How to fix?

I'm pretty inexperienced using C++, but I'm trying to compile version 2.0.2 of the SBML toolbox for matlab on a 64-bit XP platform. The SBML toolbox depends upon Xerces 2.8 and libsbml 2.3.5.
I've been able to build and compile the toolbox on a 32-bit machine, and it works when I test it. However, after rebuilding it on a 64-bit machine (which is a HUGE PITA!), I get a segmentation fault when I try to read long .xml files with it.
I suspect that the issue is caused by pointer addressing issues.
The stack trace from the segmentation fault starts with:
[ 0] 000000003CB3856E libsbml.dll+165230 (StringBuffer_append+000030)
[ 6] 000000003CB1BFAF libsbml.dll+049071 (EventAssignment_createWith+001631)
[ 12] 000000003CB1C1D7 libsbml.dll+049623 (SBML_formulaToString+000039)
[ 18] 000000003CB2C154 libsbml.dll+115028 (
So I'm looking at the StringBuffer_append function in the libsbml code:
LIBSBML_EXTERN
void
StringBuffer_append (StringBuffer_t *sb, const char *s)
{
  unsigned long len = strlen(s);
  StringBuffer_ensureCapacity(sb, len);
  strncpy(sb->buffer + sb->length, s, len + 1);
  sb->length += len;
}
ensureCapacity looks like this:
LIBSBML_EXTERN
void
StringBuffer_ensureCapacity (StringBuffer_t *sb, unsigned long n)
{
  unsigned long wanted = sb->length + n;
  unsigned long c;

  if (wanted > sb->capacity)
  {
    /**
     * Double the total new capacity (c) until it is greater-than wanted.
     * Grow StringBuffer by this amount minus the current capacity.
     */
    for (c = 2 * sb->capacity; c < wanted; c *= 2) ;
    StringBuffer_grow(sb, c - sb->capacity);
  }
}
and StringBuffer_grow looks like this:
LIBSBML_EXTERN
void
StringBuffer_grow (StringBuffer_t *sb, unsigned long n)
{
  sb->capacity += n;
  sb->buffer = (char *) safe_realloc(sb->buffer, sb->capacity + 1);
}
Is it likely that the
strncpy(sb->buffer + sb->length, s, len + 1);
in StringBuffer_append is the source of my segfault?
If so, can anyone suggest a fix? I really don't know C++, and am particularly confused by pointers and memory addressing, so am likely to have no idea what you're talking about - I'll need some hand-holding.
Also, I put details of my build process online here, in case anyone else is dealing with trying to compile C++ for 64-bit systems using Microsoft Visual C++ Express Edition.
Thanks in advance!
-Ben
Try printing, or using a debugger, to see what values you're getting for some of your intermediate variables. In StringBuffer_append() output len; in StringBuffer_ensureCapacity() observe sb->capacity and c before and inside the loop. See if the values make sense.
A segmentation fault may be caused by accessing data beyond the end of the string.
The strange fact that it worked on a 32-bit machine but not on the 64-bit O/S is also a clue. Are the physical and pagefile memory sizes the same for the two machines? Also, on a 64-bit machine the kernel space may be larger than on the 32-bit machine, eating some of the memory space that was in the user part of the address space on the 32-bit O/S. For XML the entire document must fit into memory; there are probably some switches to set the size if this is the problem. The difference between machines should only be the cause of the problem if you are working with a very large string. If the string is not huge, it may be some problem with a library or utility method that doesn't work well in a 64-bit environment.
Also, use a simple/small xml file to start with if you have nothing else to try.
Where do you initialize sb->length? Your problem is likely in strncpy(), though I don't know why the 32-bit to 64-bit O/S change would matter. Your best bet is looking at the intermediate values; the problem will then be obvious.
Being one of the developers of libsbml, I just stumbled over this. Is this still a problem for you? In the meantime we have released libsbml 5, with separate 64-bit and 32-bit versions and much improved testing; please have a look at:
http://sf.net/projects/sbml/files/libsbml
The problem could be pretty much anything. True, it might be that strncpy does something bad, but most likely, it is simply passed a bad pointer. Which could originate from anywhere. A segfault (or access violation in Windows) simply means that the application tried to read or write to an address it did not have permission to access. So the real question is, where did that address come from? The function that tried to follow the pointer is probably ok. But it was passed a bad pointer from somewhere else. Probably.
Unfortunately, debugging C code is not trivial at the best of time. If the code isn't your own, that doesn't make it easier. :)
StringBuffer is created as follows:
/**
* Creates a new StringBuffer and returns a pointer to it.
*/
LIBSBML_EXTERN
StringBuffer_t *
StringBuffer_create (unsigned long capacity)
{
  StringBuffer_t *sb;

  sb = (StringBuffer_t *) safe_malloc(sizeof(StringBuffer_t));
  sb->buffer = (char *) safe_malloc(capacity + 1);
  sb->capacity = capacity;

  StringBuffer_reset(sb);

  return sb;
}
A bit more of the stack trace is:
[ 0] 000000003CB3856E libsbml.dll+165230 (StringBuffer_append+000030)
[ 6] 000000003CB1BFAF libsbml.dll+049071 (EventAssignment_createWith+001631)
[ 12] 000000003CB1C1D7 libsbml.dll+049623 (SBML_formulaToString+000039)
[ 18] 000000003CB2C154 libsbml.dll+115028 (Rule::setFormulaFromMath+000036)
[ 20] 0000000001751913 libmx.dll+137491 (mxCheckMN_700+000723)
[ 25] 000000003CB1E7B2 libsbml.dll+059314 (KineticLaw_getFormula+000018)
[ 37] 0000000035727749 TranslateSBML.mexw64+030537 (mexFunction+009353)
It seems that if the fault were in any of the other StringBuffer_* functions, they would appear in the stack trace. I disagree with how _ensureCapacity and _grow are implemented: none of the functions check whether realloc succeeds, and a realloc failure will certainly cause a segfault. I would insert a check for NULL after _ensureCapacity. With the way _ensureCapacity and _grow are written, it also seems possible to get an off-by-one error. If you're running on Windows, the 64-bit and 32-bit systems may have different page protection mechanisms that cause it to fail. (You can often live through off-by-one errors in malloc'ed memory on systems with weak page protection.)
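For example, the NULL check suggested above might look something like this (a sketch only; it assumes you can patch the libsbml source, and that a failed safe_realloc leaves sb->buffer set to NULL):

LIBSBML_EXTERN
void
StringBuffer_append (StringBuffer_t *sb, const char *s)
{
  unsigned long len = strlen(s);

  StringBuffer_ensureCapacity(sb, len);
  if (sb->buffer == NULL)   /* the grow/realloc may have failed */
  {
    return;                 /* bail out instead of writing through a bad pointer */
  }

  strncpy(sb->buffer + sb->length, s, len + 1);
  sb->length += len;
}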
Let's assume that safe_malloc and safe_realloc do something sensible like aborting the program when they can't get the requested memory. That way your program won't continue executing with invalid pointers.
Now let's look at how big StringBuffer_ensureCapacity grows the buffer to, in comparison to the wanted capacity. It's not an off-by-one error. It's an off-by-a-factor-of-two error.
How your program ever worked on 32-bit, I can't guess.
In response to bk1e's comment on the question - unfortunately, I need version 2.0.2 for use with the COBRA toolbox, which doesn't work with the newer version 3. So, I'm stuck with this older version for now.
I'm also hitting some walls attempting to debug - I'm building a .dll, so in addition to recompiling xerces to make sure it has the same debugging settings in MSVC++, I also need to attach to the Matlab process to do the debugging - it's a pretty big jump for my limited experience in this environment, and I haven't dug into it very far yet.
I had hoped there was some obvious syntax issue with the buffer allocation or expansion. Looks like I'm in for a few more days of pain, though.
unsigned long is not a safe type to use for sizes on a 64-bit machine in Windows. Unlike Linux, Windows defines long to be 32 bits on both 32- and 64-bit architectures. So if the buffer being appended to grows beyond 4 GB in size (or if you're trying to append a string whose length is >4GB), you need to change the unsigned long type declarations to size_t (which is 64 bits on 64-bit architectures, in all operating systems).
However, if you're only reading a 1.5 MB file, I don't see how you'd ever get a StringBuffer to exceed 4 GB in size, so this might be a blind alley.
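A tiny sketch that makes the type-width difference visible (on a 64-bit Windows/LLP64 build the first line prints 4 and the second 8; on 64-bit Linux both print 8):

#include <cstdio>

int main()
{
    std::printf("sizeof(unsigned long) = %u\n", (unsigned)sizeof(unsigned long));
    std::printf("sizeof(size_t)        = %u\n", (unsigned)sizeof(size_t));
    return 0;
}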