Need book & website suggestions for advanced low-level programming - C++

I want to learn all the advanced details of low-level programming, so that I'm able to:
Learn advanced C/C++
Optimize my code with and without inline assembly
Understand the internals of an EXE, a DLL, a thread, and a process
Efficiently make use of technologies like SSE, 3DNow!, and MMX
Debug and disassemble executables/libraries and understand what's going on inside
Understand the differences/features of different CPUs/platforms like x86, MIPS, ARM, and PowerPC
My first target is an x86 Windows-based system. After that come Linux-based platforms, and embedded systems follow.
Any books, websites, tutorials, forums, or communities that give me what I'm looking for DIRECTLY are fine.
Thanks.

What you are asking for cannot be found in a single book. Much of what you have mentioned is best found in the user manuals or functional specifications for the various processors. I recommend starting with an understanding of the core x86 architecture and working up from there. One of the old Intel 386 or 486 manuals might be a good start.
I know of no websites for this type of information.

A few recommendations from among my personal favourites to get you started:
“Effective C++: 55 Specific Ways to Improve Your Programs and Designs (3rd Edition)”
-- Scott Meyers
“Inside the Machine” -- John Stokes
“Hacker’s Delight” -- Henry S. Warren
“The Software Optimization Cookbook” -- Richard Gerber
“Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A: Instruction Set Reference, A-M” (253666-021)
“Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B: Instruction Set Reference, N-Z” (253667-021)

Maybe it's time for you to get an account on http://my.safaribooksonline.com/, unplug the phone for a couple of weeks, load the refrigerator up with Jolt and Funyuns, say goodbye to your family and friends, and then read as many books as you can. They have a pretty substantial library on there that covers most of the topics that you're looking for.

That's a bit too much to want to learn all at once. :)
I would suggest starting with the basic ARM v4 core architecture; it is simple enough to understand.
Then move on to the 8086, and build up to later versions of ARM and x86.
ARM is of the RISC type, and x86 of the CISC type.
You can never learn all of the processors (just as you won't ever be able to learn all the programming languages), but knowing one or two will enable you to grasp any other you come across.
There is nothing particularly object-oriented about low-level programming, so it doesn't matter whether you use C++ or C.
Get a full-system simulator like GXemul or QEMU.
Try to execute a hello-world assembly program without using the processor runtime libraries (you want it hard, right?); a sketch of that exercise follows below.
Others might be able to guide you with respect to SSE, MMX, etc.
Check out infocenter.arm.com for the ARM assembly language and architecture specifications.
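Here is a minimal sketch of that exercise, hedged: it targets x86-64 Linux rather than ARM (the ARM version is the same idea, just with different registers and syscall numbers), and it assumes GCC or Clang with -nostdlib so no runtime library is linked in.

    // hello.cpp -- "hello world" with no C/C++ runtime at all.
    // Build: g++ -nostdlib -static -o hello hello.cpp
    static const char msg[] = "Hello, world!\n";

    extern "C" void _start() {
        long ret;
        // write(1, msg, sizeof msg - 1) via the raw Linux syscall interface
        asm volatile("syscall"
                     : "=a"(ret)
                     : "a"(1),                  // SYS_write
                       "D"(1),                  // fd 1 = stdout
                       "S"(msg),
                       "d"(sizeof msg - 1)
                     : "rcx", "r11", "memory");
        // exit(0) -- there is no runtime to return to
        asm volatile("syscall" : : "a"(60), "D"(0) : "rcx", "r11");
        __builtin_unreachable();
    }

On ARM Linux the same program would load the syscall number into r7 and use svc 0 instead; under a bare-metal simulator you would talk to a UART rather than a file descriptor.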

I've always found Computer Systems: A Programmer's Perspective (http://www.amazon.com/Computer-Systems-Programmers-Randal-Bryant/dp/013034074X) to be a very good book. It has a large amount of information about computer architecture, and it taught me about memory management, compilation and linking (as well as how to debug linking errors), optimization, relocatable object code, and some lower-level architecture topics, like how to approach computing from a low level (e.g. what the internals of the processor are like). There are a lot of good exercises, ranging from optimization examples to implementing buffer overflows. It discusses how to write inline assembly code (and make it work). There's even a section on writing code for a fictional (Y86) processor.
One caveat, though, is that it tends to focus too heavily on the Intel processor line (in my opinion). If you want something that's a bit more along the lines of working with, say, the ARM line, then you'll probably want to take the recommendations from others above.
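For a taste of the inline-assembly material that kind of book covers, here's a small, hedged example using GCC/Clang extended asm on x86 (my own snippet, not the book's): reading the time-stamp counter to time a short stretch of code.

    #include <cstdint>
    #include <cstdio>

    // Read the x86 time-stamp counter; the value comes back in EDX:EAX.
    static inline uint64_t rdtsc() {
        uint32_t lo, hi;
        asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
        return (static_cast<uint64_t>(hi) << 32) | lo;
    }

    int main() {
        uint64_t start = rdtsc();
        volatile int sink = 0;
        for (int i = 0; i < 1000; ++i) sink += i;   // something to measure
        uint64_t end = rdtsc();
        std::printf("approx. %llu cycles\n",
                    static_cast<unsigned long long>(end - start));
    }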

Related

Microarchitectural profiling of C++ and assembly code on MIPS

As part of a course project, I need to analyze a piece of C++ code for performance and find out which parts of the computer architecture (MIPS or x86) are most heavily used while running the code and are possibly a bottleneck for performance. I have been looking at various profilers for analyzing the performance and came across SimpleScalar, which is a great tool but sadly only works with C code.
Since I am more familiar with the MIPS architecture, it would be great if there were a tool like SimpleScalar for simulating and profiling C++ code on MIPS. I am looking at the performance-critical parts like branching, caches, the instruction set, addressing modes, etc. If not, mention of any tool that can do a similar kind of analysis for x86 architectures would be great as well.
(Just to clarify, I'm not looking for just any old profiler, but for one that understands the CPU microarchitecture and knows which parts of the CPU are taken advantage of or underused.)
CACTI has detailed low-level simulation of cache.
SESC is a cycle accurate computer architecture simulator that supports MIPS.
SESC includes CACTI.
I doubt that what you want is possible. C++ is the language, but it still needs to be compiled for the target architecture. The optimisations (or the lack of them) will determine a lot of your performance characteristics, like cache use, etc. So I guess you need to look for machine-level profilers (hopefully ones that support your compiler's debug format, so you can see the source code context).
My understanding is that SimpleScalar can simulate and profile MIPS machine code regardless of the language it was compiled from.
(The source-level debugger "DLite!" that comes with SimpleScalar may only support a few languages, but it sounds like you don't need to "debug" your code.)
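Not a tool recommendation, but a hedged illustration of the kind of code whose behaviour such a microarchitectural simulator would expose: the two functions below do the same arithmetic, yet the column-major walk takes far more data-cache misses on any line-based cache, MIPS or x86 alike.

    #include <cstdio>

    const int N = 1024;
    static double grid[N][N];

    // Row-major walk: consecutive elements, roughly one miss per cache line.
    double sumRowMajor() {
        double total = 0.0;
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                total += grid[i][j];
        return total;
    }

    // Column-major walk: an 8 KB stride, so nearly every access touches a new line.
    double sumColumnMajor() {
        double total = 0.0;
        for (int j = 0; j < N; ++j)
            for (int i = 0; i < N; ++i)
                total += grid[i][j];
        return total;
    }

    int main() {
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                grid[i][j] = 1.0;
        std::printf("%f %f\n", sumRowMajor(), sumColumnMajor());
    }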

recommended guides/books to read assembly [closed]

So lately I've been interested in reading the assembly displayed by a disassembler like OllyDbg. The reason I want to read this assembly is to learn how other developers build their applications, or things like the file formats of the binary files a program uses.
It's not like I'm a complete newbie at programming, since I've been using C++ and C# for a while now, and I have a solid understanding of C++, so the whole pointer concept is clear to me.
I know that there are tons of assembly guides out there on the internet, but I have no idea how reliable they are. This tutorial, http://jakash3.wordpress.com/2010/04/24/x86-assembly-a-crash-course-tutorial-i/, was very useful to me; it's the kind of tutorial with just a short explanation of each instruction. It was very clear, but it doesn't cover all of the assembly instructions.
I hope someone can give me a good guide/tutorial.
I think the standard guide to assembly is The Art of Assembly. It's online and free.
If you are interested in x86 assembly and the opcodes for all instructions, try the Intel manuals (free download). If you want to know how to program in assembler, use the recommendation by Seth Carnegie. Another download would be the 32-bit edition.
I learned much of what I know about assembly language from Matt Pietrek's Just Enough Assembly Language to Get By and its sequel. It is especially aimed at C or C++ programmers who want to read what the compiler emits in their debugger.
Pietrek's material is beyond any doubt, and his writing is clear and entertaining. It's tailored to the Windows platform, though.
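To practice the same thing with current tools, a hedged sketch (assuming g++ and binutils are installed; any disassembler, OllyDbg included, will show the same instructions in the built binary): compile a tiny function and read what the compiler emits.

    // sum.cpp -- a small function to disassemble.
    // Object code + disassembly:  g++ -O2 -c sum.cpp && objdump -d -M intel sum.o
    // Or an assembly listing directly:  g++ -O2 -S -masm=intel sum.cpp
    int sumSquares(const int* values, int count) {
        int total = 0;
        for (int i = 0; i < count; ++i)
            total += values[i] * values[i];
        return total;
    }

Comparing the -O0 and -O2 output of the same function is a quick way to see what the optimizer actually does.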
This is a different approach. I recently released a first draft of learning assembly by example. It is not x86 assembly, on purpose: learning a better and easier-to-understand instruction set first will get you through the basic concepts; from there, other instruction sets, x86 included, are often just a matter of downloading an instruction set reference for the desired processor. https://github.com/dwelch67/lsasim
Normally, googling "xyz instruction set" will get you a number of hits (instead of xyz, put the one you are interested in: ARM, AVR, 6502, etc.). Ideally you want the vendor's documentation, which is usually free. There have been so many variations on the x86 by different companies that it adds to the mess, but there are a lot of good online references. For other families (MSP430, AVR, ARM, MIPS, PIC, etc.) you can often go to the core-processor vendor's site to find a good reference.
MSP430, ARM and Thumb are also good first-time instruction sets if you are not interested in the lsa thing; MIPS or DLX as well, so I am told, though unfortunately I have not learned those yet. Save AVR and x86 until after you have learned something else. PIC has a few flavors; the non-MIPS, traditional PIC instruction set is certainly educational in its simplicity and approach, and might be a stepping stone to x86 (I don't necessarily recommend PIC as a first instruction set either).
I recommend learning a couple of non-x86 instruction sets first, and when you do get to x86, learning the 8088/86 instructions first; I can give you an ISBN for the original Intel manuals, which can probably be found for a few bucks at a used (online) book store. Lots of websites have the x86 instruction set documented as well. I highly recommend a visible simulator before trying hardware; it will make life easier. QEMU, for example, is not very visible nor easy to make visible, whereas gdb's simulators might be, since you can at least step and dump things out.
I really liked Programming From the Ground Up, a free book which aims to teach you the basics of ASM programming in a pretty easy to understand way. You can check it out here:
Programming from the Ground up

Microcontroller programming

I'm working on a robotic arm project along with some engineers. We haven't settled on the microcontroller of choice yet, but currently a PIC is being tested. I was wondering whether there are micros that support C++?
Background:
I'm a (Java) software developer and a beginner in embedded systems, currently programming using the MikroElektronika IDE and the C language.
AVR, MSP430, Blackfin, almost anything 32-bit (ARM, AVR32, the Renesas RX family).
If you are starting from nothing, an ARM is probably the best way to go. Atmel, NXP, TI and others have single chip ARM microcontrollers with inexpensive development kits.
I know you're asking for C++, but I just got a netduino that runs C# (very similar in syntax and concept to Java) and I'm loving it.
The whole dev board (which in many aspects is compatible with readily available arduino shields) costs less than 40 bucks.
I would add to hexa's answer that for ARM, LLVM is also a good compiler (I use binutils to assemble and link).
Going bare metal with C++ is not optimal for a number of reasons. You are not running on top of an operating system, so, to name one, dynamic memory allocation simply doesn't exist: no new, no malloc. I don't mean you CAN'T go C++, but I would refrain. (A heap-free sketch follows below.)
I've used MikroC for PICs; it's OK, but I'd go with MPLAB, just a matter of personal taste.
If you want to go ARM, go GCC.
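To make the "no new, no malloc" point concrete, here is a hedged sketch of heap-free C++ for a microcontroller; the Servo class and the channel numbers are invented for illustration, and everything lives in static storage.

    #include <cstdint>

    class Servo {
    public:
        explicit Servo(uint8_t channel) : channel_(channel), targetUs_(1500) {}
        void setTarget(uint16_t pulseUs) { targetUs_ = pulseUs; }
        uint16_t target() const { return targetUs_; }
        uint8_t channel() const { return channel_; }
    private:
        uint8_t  channel_;
        uint16_t targetUs_;   // pulse width in microseconds
    };

    // The objects live in static storage (.data/.bss), never on a heap.
    static Servo joints[3] = { Servo(0), Servo(1), Servo(2) };

    int main() {
        joints[1].setTarget(1200);   // command one joint, say the elbow
        // Real firmware would now loop forever refreshing the PWM hardware.
        return joints[1].target() == 1200 ? 0 : 1;
    }

Classes, templates and constructors all work fine this way; it is the heap (and often exceptions/RTTI on the smallest parts) that people usually leave out.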
Why don't you try the mbed platform? It's an open-source, Arduino-like board which I consider to be more powerful. It is programmed in C/C++, and the good part is that there are literally thousands of APIs you can use in your project.
Hope this helps
https://mbed.org/

Writing a new JIT

I'm interested in starting my own JIT project in C++. I'm not that unfamiliar with assembly or compiler design, but I am very unfamiliar with the resulting machine code format -- like, what does a mov instruction actually look like when all is said and done and it's time to call that function pointer? So, what are the best resources for creating such a thing?
Edit: Right now I'm only interested in x86 on Windows, stretching a tiny bit to 64-bit Windows in the future.
You want to have a look at the processor manuals for the architecture you are interested in. Those manuals describe the opcode encoding. For x86 processors, the manuals can be downloaded from this page.
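As a hedged illustration of where those manuals lead, here is a minimal sketch for x86 Windows: hand-encode mov eax, 42 followed by ret (B8 imm32 and C3 in the Intel encoding), copy the bytes into executable memory, and call them through a function pointer. A real JIT would also drop the write permission with VirtualProtect once the code is emitted; this shows only the bare mechanism.

    #include <windows.h>
    #include <cstdio>
    #include <cstring>

    int main() {
        // mov eax, 42 ; ret  -- a function that returns 42 in EAX
        unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

        void* mem = VirtualAlloc(nullptr, sizeof code,
                                 MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
        if (!mem) return 1;
        std::memcpy(mem, code, sizeof code);

        typedef int (*JitFn)();
        JitFn fn = reinterpret_cast<JitFn>(mem);
        std::printf("jitted function returned %d\n", fn());

        VirtualFree(mem, 0, MEM_RELEASE);
        return 0;
    }

The same bytes work on 32-bit and 64-bit Windows, since a no-argument function returns its int result in EAX under both calling conventions.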
Starting your project on top of LLVM might shield you from the platform details.
http://llvm.org/
LLVM is used by several dynamic language JIT compilers.
GNU lightning is a multi-architecture (x86, SPARC, PPC) library for generating code within another program. You'll need to understand general assembly language concepts, but not at a very deep level. You won't have to write anything architecture-specific at all. The down side to lightning (at least last time I used it) is that the interface presented is the intersection of the features available on the supported targets: The small register set of x86, a RISC instruction set like SPARC, and so on. The single-pass code generation is easy to use but has its own quirks, like you can't relocate your output buffer (because of address references) so if you run out of space you generally have to start over. The good thing is that you will probably get a working example going very quickly.
Older versions of NASM come with a fairly concise opcode reference that has x86 instruction encodings. (Looks like there's no 64-bit info, though.) I found this one using google:
http://alien.dowling.edu/~rohit/nasmdocb.html
The official manuals say basically the same thing (and a lot more besides), but not quite so conveniently.

Why does the chip control the choice of language?

I've asked before what language I should learn for embedded development. Most embedded engineers said C and C++ are a must, but they also pointed out that it depends on the chip.
Can someone clarify? Is it a compiler issue or what? Do chips come with their own specific compilers (like a C compiler or a C++ compiler), and is that why you have to use the language the compiler knows? Is it not possible to code and compile it elsewhere, and then burn it to the chip directly in its compiled state? (I think I heard an acquaintance say something to this effect.)
I'm not sure how this works, as clearly I don't know much about embedded systems or how they work. It's probably an easy answer for those of you who know.
Probably they meant that some toolchains do not support C++. Yes, many chips and boards do come with their own toolchains. Different processors have different instruction sets, which means a different compiler (or, more specifically, a different backend). That doesn't mean you always have to relearn everything; many of these toolchains are based on GCC (often considered the most widely ported compiler). The final executable/image formats also vary, so you need a specific linker. Most likely you will be (cross-)compiling for the chip on a "regular" computer, then burning the result to the chip. However, that doesn't mean you can use a typical compiler and linker targeted at a desktop operating system.
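A hedged sketch of what that looks like in practice (the GPIO register address and file name are invented): the same C++ source builds for different targets simply by switching toolchains.

    // blink.cpp
    // Host build:        g++ -O2 -c blink.cpp
    // ARM cross build:   arm-none-eabi-g++ -O2 -mthumb -mcpu=cortex-m3 -c blink.cpp
    // The language is identical; only the compiler backend, the linker script
    // and the final image format change.

    volatile unsigned int* const LED_REG =
        reinterpret_cast<volatile unsigned int*>(0x40020014);  // made-up address

    void toggleLed() {
        *LED_REG ^= 1u;   // flip one GPIO output bit
    }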
It "depends on the chip" in three possible ways:
Some very constrained architectures are not suited to C++, or at least C++ provides constructs not suited to such architectures and so offers no benefit over C. Most 8-bit devices fall into this category, but by no means all; I have seen useful C++ code implemented on MegaAVR, for example.
Some devices are not supported by a C++ compiler. For example Microchip's dsPIC/PIC24 compiler is C only (third-party tools may have C++ support).
The chip architecture is designed specifically for a particular language; for example INMOS Transputers invariably ran OCCAM.
As well as C and C++, other possibilities are assembler, Forth, Ada, Pascal and many others, but C is almost ubiquitous; few chip vendors will release a new architecture or device without a C compiler being available from day one. For other languages you will generally have to wait until a third party decides to develop one, and that wait may be forever for a niche architecture.
Is it not possible to code and compile it elsewhere, then burn it to the chip directly in its compiled state?
That is called cross-compilation or cross-development, and it is the usual development method for embedded systems. Most embedded systems lack the OS, file system, performance and memory resources to host a compiler themselves, and most developers want the comfort of a sophisticated development environment with IDEs, debuggers, etc. on a familiar desktop OS.
I'm not sure how this works, as clearly I don't know much about embedded systems or how they work.
Get up to speed with some of these:
http://www.state-machine.com/arm/Building_bare-metal_ARM_with_GNU.pdf
http://www.eetimes.com/design/embedded
http://www.amazon.com/exec/obidos/ASIN/020179523X
http://www.amazon.com/Embedded-Systems-Firmware-Demystified-CD-ROM/dp/1578200997
Yes, there are many architectures for which a C compiler exists but a C++ compiler does not. The smaller and less fully-featured a processor you choose, the more likely this situation is to occur.
For embedded development, you almost always compile the code 'elsewhere', as you say, and then send it to the chip for execution/debugging. The process of compiling code for an architecture different from the one the compiler itself runs on is called 'cross-compiling'.
You are correct that chips have variations in compilers. Most/many modern chips have a GCC port, but not all.
The term 'embedded' is used to describe a vast range of hardware. Most embedded software engineering consists of writing C/C++ code to produce a binary for a target microprocessor, but there are devices you may work with that are not programmed with a compiled binary.
One example is a Programmable Logic Controller (PLC). These devices use a language called "Ladder Logic". It's a wonderful language. I have enjoyed working with it in the past.
Another thing you may encounter, as I have in the past, is devices that have interpreted BASIC emulators. Hopefully that is rare today.
C/C++ are a very good choice for firmware development, i.e. the software that will run on an embedded CPU/microcontroller. In order to program the device properly, you will need to know both the language and the device architecture.
The same code probably will not work on different devices, so you have to learn the language and the device architecture.
Another option is FPGAs, which are not microcontrollers. FPGAs are devices with specialized cells capable of transforming themselves into any type of synchronous circuit, including a microcontroller. FPGAs are programmed with hardware description languages, like Verilog and VHDL. The "compiled" (synthesized) version of the design is called gateware.
The HDLs are the same languages used for ASIC design as well. The path to properly learning those languages is long, so I recommend starting with C/C++ on a PIC from Microchip, which is a low-cost and widely accepted microcontroller.
If you intend to do FPGA development, the knowledge gained with C/C++ and the PIC will be helpful and important, because most FPGAs have an embedded CPU/microcontroller inside.
There is no direct scientific reason for it. In a lot of cases it has to do with the management and politics of the specific company.
Some companies are driven to create a turn key system and force you to buy that system and pay for maintenance. It locks out the individual developers, but there are many companies and esp government agencies that prefer this model because the support is often much better and you can often drive the direction of their products to suit your needs.
Other companies do not have the staff or the talent and outsource the solution and sometimes take whatever they can get. And you might end up with a one time developed tool that after the contractor leaves is never updated or fixed again, or if it is fixed it is a patch job by someone else. It takes money to make money, but if you run out of money before you can sell your product you still fail.
Sometimes you have companies that both maintain their in-house, must-buy-from-them tool AND have individuals who also contribute to open tools like GCC.
Sometimes the politics or management in the company have individuals that have a strong opinion of how the world must be and only allow tools to be developed for a specific language. Or perhaps they are owned by or partner with or just like a company that has a specific language and this chip product came to be simply to support that language.
On top of all of this you have the very real technical problems of memory space, the quality and efficiency of the instruction set and how compiler friendly it is. Some architectures may be fine for assembler, but higher level compiled code chews up the limited memory resources too quickly.
GCC in particular has a lot of problems internally (not the people, but the software/source code itself). I challenge you to write a backend, even with the tutorials that are out there. A company requires specialised talent in order to create and then maintain a GCC backend year after year; otherwise you get dumped. If your chip architecture is not 32-bit or bigger, you are already fighting a losing battle with GCC; your architecture might be compiler friendly, just not friendly to the way the popular compilers are designed.
In the near future LLVM is going to shine as a cross-compiler relative to GCC because it has not yet built up this internal bulk, and perhaps, because its internal guts are themselves a defined language/system, it may never suffer what has happened to GCC. As more folks get comfortable with LLVM, we will see a number of architectures ported to it. The MSP430 backend was done specifically to demonstrate that you can add a target literally in an afternoon. By the end of next month, some motivated individual could have all of the targets most of us have ever heard of ported to LLVM. And you don't have to build a cross-compiler; it is always a cross-compiler. I only mention LLVM because the door is now open for targets that have suffered from bad tools to recover.
Some companies, microcontroller vendors in particular, can and will make the programming interface proprietary so that you must use their programming tool (or hack it and take your chances with publishing those results, or play cat and mouse as they change it to defeat you). They may have made tools only for Windows, leaving the Linux and Apple folks hanging in the wind. Or they make it so that the only binaries the chip will load are the ones generated by their tools; here again you may hack through the binary format to allow an alternate compiler, and they may or may not work to defeat you.
Despite the technical problems, the biggest factors are the company's politics, management, marketing teams, and the supply of (or lack of) talent on the engineering staff. The bottom line: follow the dollars, not the technology or the science, to understand why this language is supported and not that one, or why the support for a given language is good, bad, or marginal.
What language should you learn as a result of all this? Start with assembler on at least three different architectures. Then C, and then C++ if you feel you really need it. C and assembler are your primary languages for embedded (depending on your definition of embedded). We write assembler mostly for initial boot code and to support C: interrupt handling, or special instructions the compiler cannot generate. There are places, like microcontrollers, where it may very well make sense to use assembler for various reasons: tools, limited chip resources, etc. Even if you don't use assembler, knowing it makes you a much better high-level programmer.
You do need to decide what your definition of embedded is. Is it API and library calls for an application on a(n embedded) Linux system (indistinguishable from the same program/calls on a desktop system)? Or, at the other end of the spectrum, are you talking about a microcontroller with maybe 256 or 1024 bytes (not mega or giga, but bytes) of program space? Or something in the middle? The majority of the "embedded" folks out there are closer to the API calls for applications on an operating system (an RTOS, Linux, WinCE, etc.) than to the deeply embedded, so that means C, maybe C++ (always be able to fall back on C), while trying to avoid Python and other scripting languages that are resource hogs.
Some 8-bit parts cannot efficiently access data on a stack. Instead of using a stack to pass parameters, auto variables and parameters are statically allocated; typically, a linker allocates the automatic variables for main() at one end of memory, then allocates the variables for functions that are called by main and nothing else, then the variables for functions that are called by those functions and nothing else, and so on. This will yield an optimal allocation fairly easily, subject to some caveats:
Recursion can only be supported by adding code to explicitly copy variables onto some sort of stack arrangement; in many compilers, it's simply not supported at all.
If a function looks like it "might" call another function, the linker will assume it can do so in all cases (e.g. it may be that when 'foo' calls 'bar', one of its parameters might always have a value such that 'bar' won't call 'boz', but the linker won't know that).
Any call to a function pointer with a certain signature will be regarded as a call to all functions with the same signature whose address is taken.
If the evaluation of more than one parameter to a function requires making additional function calls, additional temporary storage must generally be pessimistically allocated even if optimal placement of the parameter storage could have avoided that.
There are many types of C programs for which the above restrictions pose no problem at all, and many more for which they pose a nuisance but not a huge one (e.g. by adding dummy parameters or return values to ensure that different classes of indirectly-called functions have different signatures). Unfortunately, the code generated by a C++-to-C pre-compiler will almost always involve function pointers whose call graph cannot be reasonably divined, so using C++ on such a platform is apt to be difficult if not impossible.
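A hedged sketch of that dummy-parameter workaround (all names invented): giving each class of indirectly-called functions its own signature keeps their call graphs, and therefore their statically overlaid variables, separate.

    // On a stack-less 8-bit toolchain, a call through a function pointer is
    // treated as a call to every function with the same signature. Distinct
    // dummy tag types give the timer callbacks and the main-loop callbacks
    // distinct signatures, so the linker never conflates the two groups.
    enum TimerTag { TIMER_TAG };
    typedef void (*TimerCallback)(TimerTag);

    enum MainTag { MAIN_TAG };
    typedef void (*MainCallback)(MainTag);

    static void blinkLed(TimerTag)   { /* toggle an LED pin */ }
    static void pollButtons(MainTag) { /* scan the keypad   */ }

    static TimerCallback timerHook = blinkLed;
    static MainCallback  mainHook  = pollButtons;

    void timerIsr() { timerHook(TIMER_TAG); }  // seen as reaching only TimerCallback functions
    void mainLoop() { mainHook(MAIN_TAG); }    // seen as reaching only MainCallback functions

    int main() { timerIsr(); mainLoop(); }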