What is the big deal with BUILDING 64-bit versions of binaries? - c++

There are a ton of drivers & famous applications that are not available in 64-bit. Adobe, for instance, does not provide a 64-bit Flash Player plugin for Internet Explorer. Because of that, even though I am running 64-bit Vista, I have to run 32-bit IE. Microsoft Office and Visual Studio also don't ship in 64-bit, AFAIK.
Now personally, I haven't had many problems building my applications in 64-bit. I just have to remember a few rules of thumb, e.g. always use SIZE_T instead of UINT32 for string lengths, etc.
So my question is, what is preventing people from building for 64-bit?

If you are starting from scratch, 64-bit programming is not that hard. However, all the programs you mention are not new.
It's a whole lot easier to build a 64-bit application from scratch than to port one from an existing code base. There are many gotchas when porting, especially in applications where some level of optimization has been done. Programmers use lots of little assumptions to gain speed, and these are not always easy to port to 64-bit quickly. A few examples I've had to deal with (a sketch follows the list):
Proper alignment of elements within a struct. As data sizes change, assumptions that certain fields in a struct will be aligned on an optimal memory boundary may fail
The length of long integers changes, so if you pass values over a socket to another program that may not be 64-bit, you need to refactor your code
Pointer lengths change, so hard-to-decipher code written by a guru who has since left the company becomes a little trickier to debug
Underlying libraries will also need to have 64-bit support to properly link. This is a large part of the problem of porting code if you rely on any libraries that are not open source
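A minimal sketch of a couple of these, assuming a hypothetical PacketHeader struct; the exact sizes depend on the data model (LP64 Linux vs. LLP64 Windows):

    #include <cstdint>
    #include <cstdio>

    // Hypothetical wire struct: its size and field offsets change between 32- and
    // 64-bit builds because `long` (on LP64 Linux) and the pointer grow, and padding shifts.
    struct PacketHeader {
        uint32_t id;
        long     length;   // 4 bytes on 32-bit, 8 on 64-bit Linux (still 4 on 64-bit Windows)
        void*    payload;  // 4 bytes on 32-bit, 8 bytes on 64-bit
    };

    int main() {
        // Code that memcpy'd this struct straight onto a socket, assuming a fixed size,
        // silently changes its wire format when rebuilt for 64-bit.
        std::printf("sizeof(PacketHeader) = %zu\n", sizeof(PacketHeader));

        // Storing a pointer in a 32-bit integer no longer fits; uintptr_t is the portable choice.
        int value = 42;
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(&value);
        std::printf("pointer stored in a %zu-byte integer\n", sizeof(addr));
        return 0;
    }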

In addition to the things in #jvasak's post, the major thing that can cause bugs:
pointers are larger than ints - a huge amount of code makes the assumption that the sizes are the same.
Remember that Windows will not even allow an application (whether 32-bit or 64-bit) to handle pointers with an address above 0x7FFFFFFF (2 GB or above) unless it has been specially marked as LARGEADDRESSAWARE, because so many applications will treat the pointer as a negative value at some point and fall over.
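A minimal sketch (the validity check itself is made up) of the kind of code that flag guards against; it round-trips the address through a signed 32-bit integer:

    #include <cstdint>

    // With addresses above 0x7FFFFFFF, the cast makes the value negative on a 32-bit
    // build and the check wrongly fails; on a 64-bit build it truncates the pointer
    // outright. looks_allocated is a hypothetical helper for illustration only.
    bool looks_allocated(void* p) {
        return static_cast<int32_t>(reinterpret_cast<std::uintptr_t>(p)) > 0;
    }

    // The fix is to stop round-tripping pointers through signed 32-bit integers:
    bool looks_allocated_fixed(void* p) {
        return p != nullptr;
    }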

The biggest issues that I've run into porting our C/C++ code to 64-bit are with support from 3rd-party libraries. E.g. there are currently only 32-bit versions of the Lotus Notes API and also MAPI, so you can't even link against them.
Also, since you can't load a 32-bit DLL into your 64-bit process, you get burnt again trying to load things dynamically. We ran into this problem trying to support Microsoft Access under 64-bit. From Wikipedia:
The Jet Database Engine will remain 32-bit for the foreseeable future. Microsoft has no plans to natively support Jet under 64-bit versions of Windows.
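A quick sketch of what that failure mode looks like, with a hypothetical DLL name; from a 64-bit process, LoadLibrary on a 32-bit DLL fails with ERROR_BAD_EXE_FORMAT instead of loading it:

    #include <windows.h>
    #include <cstdio>

    int main() {
        // "Legacy32BitThing.dll" is a placeholder for any 32-bit-only dependency.
        HMODULE h = LoadLibraryA("Legacy32BitThing.dll");
        if (h == nullptr) {
            // Error 193 (ERROR_BAD_EXE_FORMAT) is the typical result of a bitness mismatch.
            std::printf("LoadLibrary failed, error %lu\n", GetLastError());
            return 1;
        }
        FreeLibrary(h);
        return 0;
    }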

Just a guess, but I would think a large part of it is support. If Adobe compiles a 64-bit version, they have to support it. Even though it may be a simple compile switch, they'd still have to run through a lot of testing and train their support staff to respond correctly, and when they do run into issues, fixing them means either a new version of the 32-bit binary or a branch in the code, etc. So while it seems simple, for a large application it can still end up costing a lot.

Another reason that a lot of companies have not gone through the effort of creating 64-bit versions is simply that they don't need to.
Windows has WoW64 (Windows-on-Windows 64-bit) and Linux can have the 32-bit libraries available alongside the 64-bit ones. Both allow 32-bit applications to run in 64-bit environments.
As long as the software is able to run in this way, there is not a major incentive to convert to 64 bit.
Exceptions to this are things such as device drivers, as they are tied in deeper with the operating system and cannot run in the 32-bit layer that x86-64/AMD64-based 64-bit operating systems offer (IA64 is unable to do this, from what I understand).
I agree with you on Flash Player, though. I am very disappointed that Adobe has not updated this product. As you have pointed out, it does not work properly in 64-bit, requiring you to run the 32-bit version of Internet Explorer.
I think it is a strategic mistake on Adobe's part. Having to run the 32-bit browser for Flash Player is an inconvenience for users, and many will not understand the workaround. This could leave developers apprehensive about using Flash. The most important thing for a web site is to make sure everyone can view it; solutions that alienate users are typically not popular ones. Flash's popularity was fed by its own popularity: the more sites used it, the more users had it on their systems, and the more users had it on their systems, the more sites were willing to use it.
The retail market pushes these things forward. When a general consumer goes to buy a new computer, they aren't going to know whether they need a 64-bit OS; they are going to get it either because they hear it is the latest and greatest thing, the future of computing, or just because they don't know the difference.
Vista has been out for about 2 years now, and Windows XP 64-bit was out before that. In my mind that is too long for a major technology such as Flash to go without an upgrade if Adobe wants to hold on to its market. It may have to do with Adobe taking over Macromedia, and be a sign that Adobe does not feel Flash is part of its future. I find that hard to believe, as I think Flash and Dreamweaver were the most valuable parts of what they got out of Macromedia; but then why have they not updated it yet?

It is not as simple as just flipping a switch on your compiler, at least not if you want to do it right. The most obvious example is that any integer datatype you use to hold a pointer must be wide enough for a 64-bit address. If you have any code that makes assumptions about the size of pointers (e.g. a datatype which allocates 4 bytes of memory per pointer; see the sketch below), you'll need to change it, and all this needs to have been done in any libraries you use, too. Miss just a few spots and you'll end up with pointers being down-cast and pointing at the wrong location. Pointers are not the only sticky point, but they are certainly the most obvious.
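A minimal sketch of that 4-bytes-per-pointer assumption, using a hypothetical table allocator:

    #include <cstdlib>
    #include <cstring>

    // Hypothetical helper: allocate a zeroed table of `count` pointer slots.
    void** make_pointer_table(std::size_t count) {
        // Broken on 64-bit (the allocation is half the needed size, so later writes overflow it):
        //   void** table = (void**)std::malloc(count * 4);
        // Portable on either architecture:
        void** table = static_cast<void**>(std::malloc(count * sizeof(void*)));
        if (table != nullptr) {
            std::memset(table, 0, count * sizeof(void*));
        }
        return table;
    }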

Primarily a support and QA issue. The engineering work to build for 64-bit is fairly trivial for most code, but the testing effort, and the support cost, don't scale down the same way.
On the testing side, you'll still have to run all the same tests, even though you "know" they should pass.
For a lot of applications, converting to a 64-bit memory model doesn't actually give any benefit (since they never need more than a few GB of RAM), and can actually make things slower, due to the larger pointer size (every pointer field in an object becomes twice as large).
Add to that the lack of demand (due to the chicken/egg problem), and you can see why it wouldn't be worth it for most developers.

Their Linux/Flash blog goes some way toward explaining why there isn't a 64-bit Flash Player as yet. Some of it is Linux-specific, some is not.

Related

64 bit features in a 32 bit application?

I have a 32 bit application which I plan to run on 64bit Windows 7.
At this stage I cannot convert the entire application to 64-bit due to dependencies on third-party functionality.
However, I would like to have access to the xmm9-xmm15 registers in my SSE optimizations, and also make use of the additional registers that 64-bit CPUs provide in general, while executing my application.
Is this possible to achieve with some compiler flag?
It seems to me that the best way would be to divide your program into more than one executable. The EXE compiled as 64-bit can communicate with a second, 32-bit EXE that uses the 32-bit third-party DLL you need. You will have some overhead in the communication and will have to implement starting/stopping of the dependent process, but you will have a clear program architecture.
If you develop a native C++ application, you can implement the second EXE, for example, as a COM out-of-process object (LocalServer or even LocalService). You could also consider implementing the COM server in C# (see here); sometimes that route simplifies the implementation and lets you use the advantages of both .NET and native programming.
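If COM feels too heavyweight, even a plain named pipe is enough for the split described above; a minimal sketch of the 64-bit side, assuming a hypothetical pipe name and message format (the 32-bit helper would create the pipe with CreateNamedPipe and service requests against the legacy DLL):

    #include <windows.h>
    #include <cstdio>

    int main() {
        // Connect to the pipe the 32-bit helper EXE is assumed to have created.
        HANDLE pipe = CreateFileA("\\\\.\\pipe\\legacy32helper",   // hypothetical pipe name
                                  GENERIC_READ | GENERIC_WRITE,
                                  0, nullptr, OPEN_EXISTING, 0, nullptr);
        if (pipe == INVALID_HANDLE_VALUE) {
            std::printf("helper not running, error %lu\n", GetLastError());
            return 1;
        }

        const char request[] = "DoLegacyWork";   // whatever protocol you define
        char reply[256] = {};
        DWORD written = 0, read = 0;
        WriteFile(pipe, request, sizeof(request), &written, nullptr);
        ReadFile(pipe, reply, sizeof(reply) - 1, &read, nullptr);
        std::printf("helper replied: %s\n", reply);

        CloseHandle(pipe);
        return 0;
    }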
Yes, you can. See this example for how.
As Christopher pointed out, this technique will not always work, but for this express purpose (jumping into a few lines of handcrafted assembly to use the extra registers, storing the result somewhere, and then jumping back to 32-bit mode) it should work out just fine.

C++ BoundsChecker followup

We've been running for years with BoundsChecker for Visual C++ 6 (I think it was BoundsChecker 5 or 6, too). We've upgraded to VS2008 (finally!), and now need a follow-up for the outdated BoundsChecker.
How's the landscape?
What tools are out there?
Any new kids in town?
Any new ideas dealing with the problems we used memory profilers for?
Your recent experiences with these tools?
Recommendations?
The main application is C++ with many COM DLLs; we are looking to track native C++ and COM leaks and objects. For an application of that size, BoundsChecker was already a pain: performance, sorting through the slew of data, and some of its limitations.
Support for managed applications (primarily C#) is required, though that may be a separate tool.
Related (but IMO incomplete) question: Modern equivalent of BoundsChecker for Visual Studio 2008
[edit]
Regarding the comment, "In modern C++, you just use self-checking types, and bounds are never broken":
Reference-counted smart pointers can have cyclic references. Interfacing with COM components is inherently unsafe, as it requires a lot of manual memory management. I've had a UI-less 3rd-party service leak GDI handles so that it crashed our overnight tests; the vendor blamed it on a "strange" Microsoft API. I have to interface with C-based libraries, and I have tons of legacy code that assumes allocation trickery in the style of Numerical Recipes is a good thing and that variable names longer than 3 letters are for typists. I have code from engineers for whom a std::vector<double>::iterator looks much scarier than a double ***; good luck developing and testing any of this without a solid background in signal processing.
So unless you come here, rewrite and encapsulate the core of a million lines of code in fool-proof C++ classes and make sure a few dozen products still work as before, keep your smart-assery to yourself. I wish I wouldn't need a memory checker, but I do. Thank you.
Disclaimer and warning: I work for Micro Focus, owner of the DevPartner Studio and BoundsChecker products.
BoundsChecker 10.5, part of DevPartner Studio 10.5 (though you can buy it by itself), supports Visual Studio 2005, 2008 and 2010 unmanaged code for 32 and 64 bit applications in essentially the same way it supported 32 bit applications on Visual Studio 6.0. While enhancing it to support X64 applications, we found and fixed quite a few very old problems, and made a start at working in spite of the .NET 4.0 code present in some VS 2010 applications. I say "in spite of" because .NET 4.0 turns out to do a lot of very nasty things in the process space, doing some things that Microsoft warns everybody else not to do, and has a certain amount of built-in resistance to tools like BoundsChecker, which are essentially gigantic viruses.
Anyway, since that release (February 4th), we have updated it to work on Windows 7 SP1 (which isn't quite public yet), and as far as BoundsChecker is concerned, we work with Visual Studio 2010 SP1 as well. We also discovered a nasty .NET 4.0 trap and figured out how to prevent it from taking us down. These enhancements and fixes will be available in our next public update, hopefully within the next month or so.
I have a massive application (here at work), and the new BoundsChecker 10.5 (which supports 64-bit apps now) pretty much works with it. The trick, Max, is not to turn on all the checker features of DevPartner BoundsChecker at once. Turn on just memory leaks, or just some other feature, then run your app. And by all means exclude modules you don't need. There are quite a few settings you can tune so it goes faster. But yes, it does take a performance hit. That is the name of the ballgame.
Intel's Parallel Inspector gave us thousands and thousands of false positives, so we didn't use it.
Purify only works on 32-bit apps, i.e. small 32-bit native apps. Forget about using it with a managed C++ app.
And just for the record, if you have a large 32-bit app, memory analysis tools in general won't work very well because of their massive memory overhead: since you have very limited memory in a 32-bit address space, you quickly run out of room and the tools fail.
We evaluated Boundschecker, Intel's Inspector and Purify.
They were all more or less crap.
For our main application, BoundsChecker could not start it even after many hours; it only worked for a couple of smaller applications, but it did find a couple of things there (I think we're still in contact with them to figure things out).
Intel's Inspector works, but does not instrument the code; it runs on the executable only (it may work better when used with the whole suite of Intel products).
Purify failed miserably; we were never able to use it.
We're still in limbo about that.
Max.
Boundschecker: I just bought a (&(^ subscription which only entitles me to use the damned product for 99 days, so I'm pretty damned upset about that) but anyhow I was having big memory troubles and thought I ought to run this thing. It seems to catch lots of interesting things, but it is so damned slow that, well, put it this way: my application is still in the DLL init code; it has been running for at least a couple of hours, and so far it hasn't even gotten as far as the app normally does in the first COUPLE SECONDS. Boundschecker used to be 'the shit' back in the NuMega days, but it seems like it is really another technological orphan being peddled by an opportunistic business entity, like Borland compilers.
So I really like it when it works; it has lots of great info. I just need to see if I will be able to actually get any decent results. It is currently using 4+ GB of RAM and hasn't even started up fully yet. Since I use Win7/64 with a crippled Home edition which will only recognize 12 GB, I may run out of memory before anything really interesting happens. And that will be sometime a few days from now...

Why does the chip control the language to choose

I've asked the question before of what language I should learn for embedded development. Most embedded engineers said C and C++ are a must, but also pointed out that it depends on the chip.
Can someone clarify? Is it a compiler issue or what? Do chips come with their own specific compilers (like a C compiler or C++ compiler), and is that why you have to use the language the compiler knows? Is it not possible to code and compile it elsewhere, then burn it to the chip directly in its compiled state? (I think I heard an acquaintance say something to this effect.)
I'm not sure how this works, as clearly I don't know much about embedded systems or how they work. It's probably an easy answer for those of you who know.
Probably they meant that some toolchains do not support C++. Yes, many chips and boards do come with their own toolchains. Different processors have different instruction sets, which means a different compiler (or, more specifically, a different backend). That doesn't mean you always have to relearn everything; many of these toolchains are based on GCC (often considered the most widely ported compiler). The final executable/image formats also vary, so you need a specific linker. Most likely you will be cross-compiling for the chip on a "regular" computer and then burning the result to the chip. However, that doesn't mean you can use a typical compiler and linker targeted at a desktop operating system.
It "depends on the chip" in three possible ways:
Some very constrained architectures are not suited to C++, or at least C++ provides constructs not suited to such architectures so offers no benefit over C. Most 8 bit devices fall into this category, but by no means all; I have seen useful C++ code implemented on MegaAVR for example.
Some devices are not supported by a C++ compiler. For example Microchip's dsPIC/PIC24 compiler is C only (third-party tools may have C++ support).
The chip architecture is designed specifically for a particular language; for example INMOS Transputers invariably ran OCCAM.
As well as C and C++, other possibilities are assembler, Forth, Ada, Pascal and many others, but C is almost ubiquitous; few chip vendors will release a new architecture or device without a C compiler being available from day one. For other languages you will generally have to wait until a third party decides to develop one, and that wait may be forever for a niche architecture.
Is it not possible to code and compile it elsewhere, then burn it to the chip directly in its compiled state?
That is called cross-compilation or cross-development, and is the usual development method for embedded systems. Most embedded systems lack the OS, file, performance and memory resources to self-host a compiler, and most developers want the comfort of a sophisticated development environment with IDEs, debuggers etc. in a familiar user-oriented desktop OS.
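To give a feel for what actually gets cross-compiled and burned, here is a minimal bare-metal-style sketch, assuming a made-up GPIO register address; there is no OS underneath, and the vendor toolchain's linker script decides where the code lands in flash:

    #include <stdint.h>

    /* Hypothetical memory-mapped GPIO output register; the real address comes from
       the chip's datasheet. */
    #define LED_PORT (*(volatile uint32_t *)0x40020014u)

    static void delay(volatile uint32_t n) { while (n--) { } }

    int main(void) {
        for (;;) {
            LED_PORT ^= (1u << 5);   /* toggle one pin */
            delay(100000u);
        }
    }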
I'm not sure how this works, as clearly I don't know much about embedded systems or how they work.
Get up-to-speed with some of these:
http://www.state-machine.com/arm/Building_bare-metal_ARM_with_GNU.pdf
http://www.eetimes.com/design/embedded
http://www.amazon.com/exec/obidos/ASIN/020179523X
http://www.amazon.com/Embedded-Systems-Firmware-Demystified-CD-ROM/dp/1578200997
Yes, there are many architectures for which a C compiler exists but a C++ compiler does not. The smaller and less fully-featured a processor you choose, the more likely this situation is to occur.
For embedded development, you almost always compile the code 'elsewhere', as you say, and then send it to the chip for execution/debugging. The process of compiling code for a different architecture than the compiler itself is built for is called 'cross-compiling'.
You are correct: chips have variations on compilers. Most/many modern chips have a gcc port; but not all.
The term 'embedded' is used to describe a vast range of hardware. Most embedded software engineering will consist of writing C/C++ code to produce a binary for a target microprocessor, but there are devices that you may work with that are not coded with compiled binary.
One example is a Programmable Logic Controller (PLC). These devices use a language called "Ladder Logic". It's a wonderful language. I have enjoyed working with it in the past.
Another thing you may encounter, as I have in the past, is devices that have interpreted BASIC emulators. Hopefully that is rare today.
C/C++ is a very good choice for firmware development, where the software you write runs on an embedded CPU/microcontroller. In order to program the device properly, you will need to know both the language and the device architecture.
The same code probably will not work on different devices, so you have to learn the language and the device architecture together.
Another option is FPGAs, which are not microcontrollers. FPGAs are devices with specialized cells that can be configured into any type of synchronous circuit, including a microcontroller. FPGAs are programmed with hardware description languages, like Verilog and VHDL. The "compiled" (synthesized) version of the design is called gateware.
The HDLs are the same languages used for ASIC design as well. The path to properly learning such a language is long, so I recommend starting with C/C++ on a PIC from Microchip, which is a low-cost and widely accepted microcontroller.
If you intend to do FPGA development, the knowledge gained with C/C++ and the PIC will be helpful and important, because most FPGAs have an embedded CPU/microcontroller inside.
There is no direct scientific reason for it. In a lot of cases it has to do with the management and politics of the specific company.
Some companies are driven to create a turn-key system and force you to buy that system and pay for maintenance. It locks out individual developers, but there are many companies, and especially government agencies, that prefer this model because the support is often much better and you can often drive the direction of the vendor's products to suit your needs.
Other companies do not have the staff or the talent, so they outsource the solution and sometimes take whatever they can get. You might end up with a one-time tool that is never updated or fixed again after the contractor leaves, or if it is fixed, it is a patch job by someone else. It takes money to make money, but if you run out of money before you can sell your product, you still fail.
Sometimes you have companies that both maintain their in-house, must-buy-from-them tool AND have individuals who also contribute to open tools like GCC.
Sometimes the politics or management in the company include individuals who have a strong opinion of how the world must be and only allow tools to be developed for a specific language. Or perhaps they are owned by, partner with, or simply like a company that has a specific language, and this chip product came to be simply to support that language.
On top of all of this you have the very real technical problems of memory space, the quality and efficiency of the instruction set, and how compiler-friendly it is. Some architectures may be fine for assembler, but higher-level compiled code chews up the limited memory resources too quickly.
GCC in particular has a lot of internal problems (not the people, but the software/source code itself). I challenge you to write a backend, even with the tutorials that are out there. A company requires specialized talent in order to create and then maintain a GCC backend year after year, otherwise you get dumped. If your chip architecture is not 32-bit or bigger, you are already fighting a losing battle with GCC; your chip architecture might be compiler-friendly, just not friendly to the popular compilers' designs.
In the near future LLVM is going to shine as a cross-compiler relative to GCC because it has not yet built up this internal bulk, and perhaps because its internal guts are themselves a defined language/system it may never suffer what has happened to GCC. As more folks get comfortable with LLVM, we will see a number of architectures ported to it. The msp430 backend was done specifically to demonstrate that you can add a target literally in an afternoon. By the end of next month, some motivated individual could have most of the targets we have ever heard of ported to LLVM. And you don't have to build a cross-compiler; it is always a cross-compiler. I only mention LLVM because the door is now open for targets that have suffered from bad tools to recover.
Some companies, microcontroller vendors in particular, can and will make the programming interface proprietary so that you must use their programming tool (or hack it and take your chances publishing those results, or play a cat-and-mouse game as they change it to defeat you). And they may have made tools only for Windows, leaving the Linux and Apple folks hanging in the wind. Or they make it so that the only binaries it will load are the ones generated by their tools; here again you may hack through the binary format to allow an alternate compiler, and they may or may not work to defeat you.
Despite the technical problems, the biggest issue is the company's politics, management, marketing teams, and the supply (or lack) of talent in the engineering staff. The bottom line: follow the dollars, not the technology or science, to understand why this language is supported and not that one, or why the support for a given language is good, bad, or marginal.
What language should you learn as a result of all of this? Start with assembler on at least three different architectures. Then C, and then C++ if you feel you really need it. C and assembler are your primary languages for embedded (depending on your definition of embedded). That is not to say everything is written in assembler; we write assembler mostly for initial boot code and to support C, interrupt handling, or special instructions the compiler cannot generate. There are places like microcontrollers where it may very well make sense to use assembler for various reasons: tools, limited chip resources, etc. Even if you don't use assembler, knowing it makes you a much better high-level programmer.
You do need to decide what your definition of embedded is. Is it API and library calls for an application on an (embedded) Linux system (indistinguishable from the same program/calls on a desktop system)? Or at the other end of the spectrum, are you talking about a microcontroller with maybe 256 or 1024 bytes (not mega or giga, but bytes) of program space? Or something in the middle? The majority of the "embedded" folks out there are closer to API calls for applications on an operating system (RTOS, Linux, WinCE, etc.) than to the deeply embedded end, so that means C, maybe C++ (always be able to fall back on C), while avoiding Python and other scripting languages that are resource hogs.
Some 8-bit parts cannot efficiently access data on a stack. Instead of using a stack to pass parameters, auto variables and parameters are statically allocated; typically, a linker allocates the automatic variables for main() at one end of memory, then allocates the variables for functions that are called only by main, then the variables for functions that are called only by those functions, and so on. This yields an optimal allocation fairly easily, subject to some caveats:
Recursion can only be supported by adding code to explicitly copy variables onto some sort of stack arrangement; in many compilers, it's simply not supported at all.
If a function looks like it "might" call another function, the linker will assume it can do so in all cases (e.g. it may be that when 'foo' calls 'bar', one of its parameters might always have a value such that 'bar' won't call 'boz', but the linker won't know that).
Any call to a function pointer with a certain signature will be regarded as a call to all functions with the same signature whose address is taken.
If the evaluation of more than one parameter to a function requires making additional function calls, additional temporary storage must generally be pessimistically allocated even if optimal placement of the parameter storage could have avoided that.
There are many types of C programs for which the above restrictions pose no problem at all, and many more for which they pose a nuisance but not a huge one (e.g. by adding dummy parameters or return values to ensure different classes of indirectly-called functions have different signatures). Unfortunately, the code generated by a C++-to-C pre-compiler will almost always involve function pointers whose call graph cannot be reasonably divined, so using C++ on such a platform is apt to be difficult if not impossible.
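A hand-written illustration of the overlay such a linker computes automatically; the union is only there to make the sharing visible, and all the names are made up. helper_a and helper_b are never live at the same time, so their "stack frames" can occupy the same static storage:

    #include <stdint.h>

    /* Locals of two leaf functions overlaid on the same static storage. */
    static union {
        struct { uint8_t buf[8]; uint8_t i; } a_locals;  /* "frame" of helper_a */
        struct { uint16_t acc; uint8_t j; }   b_locals;  /* "frame" of helper_b */
    } overlay;

    static uint8_t helper_a(void) {
        for (overlay.a_locals.i = 0; overlay.a_locals.i < 8; ++overlay.a_locals.i)
            overlay.a_locals.buf[overlay.a_locals.i] = overlay.a_locals.i;
        return overlay.a_locals.buf[7];
    }

    static uint16_t helper_b(void) {
        overlay.b_locals.acc = 0;
        for (overlay.b_locals.j = 0; overlay.b_locals.j < 10; ++overlay.b_locals.j)
            overlay.b_locals.acc = (uint16_t)(overlay.b_locals.acc + overlay.b_locals.j);
        return overlay.b_locals.acc;
    }

    int main(void) {
        /* main calls helper_a, then helper_b; neither calls the other, so sharing is
           safe. Recursion, or calling both through one function pointer, would break it. */
        return (int)(helper_a() + helper_b());
    }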

Are there any Netbooks powerful enough for moderate C++ compilation?

I've tried a few Asus Ones, and found that even switching between multiple windows could take seconds. Is there anything powerful enough in that form factor for C++ programmers to build small to moderate size projects?
I also give it a qualified yes.
What OS you use may matter a lot. I have Kubuntu on an HP 2140 netbook with only 1 GB of RAM and the usual Atom N270 CPU, and it is actually rather snappy for window or desktop switches etc. under KDE 4.3.
Compile times are OK, but I am spoiled by better machines at the office and even at home. I got this for the form factor, and I take it with me while commuting. I mostly edit, write docs, etc. while I am on the train and then commit back to SVN at the other end. That works well for me, including the occasional make or make check.
It depends on what compiler and editor/IDE you decide to use. The wimpiest Netbook is still a killer machine compared to what we used 20 or even 10 years ago. One of the easiest routes to better performance is to use an older editor/IDE (the compiler itself will probably be all right). Of course, we expected slower compilation back then too, but even so a minute to switch between windows would have been excessive.
Perhaps the HP Mini note? Amazon Link
You could also try compiling under the nice command, which lowers the build's priority so that, supposedly, it only does its intensive work during the moments when you're not using your computer much.
I have an eeepc and it's okay for compilation. I definitely wouldn't want to do a complete Boost build on it. It works, but you're kind of slumming it: P4 speed, slow hard drive, small amount of RAM... less of everything.
A low-quality netbook has more resources and capabilities than my development workstations did a decade ago. What you will need, however, is RAM. If you try to do too many things at a time on one of these things you will swap like crazy because modern software and modern operating systems are written by lazy, slack developers who think RAM is a limitless resource. Alternatively you can boost your RAM. My wife's netbook got boosted to 1GB before the unit was even taken home and it's not at all bad.

Issues in porting C/C++ code to VxWorks

I need to port a C/C++ codebase that already supports Linux/Mac to VxWorks. I am pretty new to VxWorks. Could you let me know what possible issues could arise?
We recently did the opposite conversion: we ported code from a PowerPC machine running VxWorks to an Intel system running Linux. I don't remember hitting many snags as far as differences between the operating systems go. Obviously any call to an OS-specific API will have to change, and we were not making extensive use of those functions.
Our biggest problem was not the difference between the operating systems, but rather the difference between PowerPC and Intel hardware. PowerPC is big-endian and Intel is little-endian. Our software is written in C and made many assumptions about the order of bytes, and this was an absolute nightmare to get working smoothly again. There were literally hundreds of structures that defined bitfields and needed to be re-ordered to work correctly. We ended up implementing a #pragma in GCC that reversed these bitfields at their definition (#pragma reverse_bitfields).
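A minimal sketch of the kind of assumption that bites here: reading a multi-byte field straight out of a byte buffer gives different answers on big-endian PowerPC and little-endian Intel, while assembling it byte-by-byte does not:

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    int main(void) {
        const uint8_t wire[4] = {0x00, 0x00, 0x01, 0x02};  /* big-endian 0x00000102 on the wire */

        uint32_t naive;
        memcpy(&naive, wire, 4);  /* 0x00000102 on PowerPC, 0x02010000 on Intel */

        /* Explicit byte assembly gives the same value on either architecture. */
        uint32_t portable = ((uint32_t)wire[0] << 24) | ((uint32_t)wire[1] << 16) |
                            ((uint32_t)wire[2] << 8)  |  (uint32_t)wire[3];

        printf("naive = 0x%08X, portable = 0x%08X\n", (unsigned)naive, (unsigned)portable);
        return 0;
    }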
Much depends on which version of VxWorks you're targeting, and the actual target processor itself. One thing you will have to deal with is that there is no paged memory system or virtual memory--you have what's there. The environment itself is far more constrained than a linux system. Sometimes the work involved in porting applications goes all the way back to the architecture level because resources are not as unlimited as they are in linux.
Some other tips:
license vxworks such that you have the source code available
use a real, physical target as soon as possible in the development cycle; do not count on the simulators accurately emulating the target
use TSRs (technical support requests) as necessary; I don't know how they structure the purchase of the right to create TSRs, but don't let anybody cheap out on these
Depending on what processor you are running VxWorks on, endianness, structure packing, and memory alignment could all be issues. The last time I used VxWorks it supported a pthreads, sockets, and mutex layer that mimicked the Unix environments easily enough.
It's difficult to tell, without knowing more about the application that you're porting: What linux libraries and api calls does it use? Is it self-contained, or does it rely on slews of linux command-line tools and scripts to do its job?
As Average says, endianness can cause you way more problems than you expect - particularly if you're not prepared for it.