I'm writing a MIPS32 emulator and would like to make it possible to use the whole Standard C Library (maybe with the GNU extensions) when compiling C programs with gcc.
As I understand at this point, I/O is handled by syscalls on the MIPS32 architecture. To successfully run a program using libc/glibc, how can I tell which syscalls I need to emulate (without trial and error)?
Edit: See this for an example of what I mean by syscalls.
(You can check out the project here if you are interested, any feedback is welcome. Keep in mind that it's in a very early stage)
Very Short Answer
Read the much longer answer.
Short Answer
If you intend to provide a custom libc that uses some feature of your emulator to have the host OS execute your system calls, you have to implement all of them.
Much Longer Answer
Step back for a minute and look at the way things are typically layered in a real (non-emulated) system:
1. The peripherals have some I/O interface (e.g., numbered ports or memory mapping) that the CPU can tickle to make them do whatever they do.
2. The CPU runs software that understands how to manipulate the hardware. This can be a single-purpose program or an operating system that runs other programs. Since libc is in the picture, let's assume there's an OS and that it's something Unix-y.
3. Userspace programs run by the OS use a defined interface between themselves and the OS to ask for certain "system" functions to be carried out.
What you're trying to accomplish takes place between layers 3 and 2, where a function in libc or user code does whatever the OS defines as triggering a system call. This opens up numerous cans of worms:
What the OS defines as triggering a system call differs from OS to OS and (rarely) between versions of the same OS. This problem is mitigated on "real" systems by providing a dynamically-linkable libc that takes care of hiding those details. That aside, if you have a MIPS32 binary you want to run, does it use a system call convention that your emulator supports?
You would need to provide a custom libc that does something your emulator can recognize as making a particular system call and carry it out. Any program you wish to run will have to be cross-compiled to MIPS32 and statically linked with it, as would any other libraries the program requires (libm comes to mind). Alternately, your emulator package will need to provide a simulation of a dynamic linker plus dynamically-linkable copies of all required libraries, because opening those on the host won't work. If you have enough source to recompile the program from scratch, porting might be better than emulation.
Any code that makes assumptions about paths to files on a particular system or other assumptions about what they'll find in certain devices (which are themselves files) won't run correctly.
If you're providing layer 2, you're signing yourself up to provide a complete, correct simulation of the behavior of one particular version of an entire operating system. Some calls like read() and write() would be easy to deal with; others like fork(), uselib() and ioctl() would be much more difficult. There also isn't necessarily a one-to-one mapping of calls and behaviors your program uses with those your host OS provides. All of this assumes the host is Unix and the target program is, too. If the target is compiled for some other environment, all bets are off.
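To make the "layer 2" work concrete, here is a minimal sketch of how an emulator might forward a few calls to the host. It assumes the guest follows the Linux o32 convention (call number in $v0, arguments in $a0..$a2, result in $v0, error flag in $a3) and that guest file descriptors can be passed straight to the host. The Cpu struct, guest_ptr() and handle_syscall() are hypothetical names, not part of any existing emulator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical emulator state: 32 general registers plus flat guest RAM. */
enum { REG_V0 = 2, REG_A0 = 4, REG_A1 = 5, REG_A2 = 6, REG_A3 = 7 };

typedef struct {
    uint32_t regs[32];
    uint8_t *mem;        /* guest address 0 corresponds to mem[0] */
    uint32_t mem_size;
    int      exited;
    int      exit_code;
} Cpu;

/* Linux o32 system call numbers on MIPS start at 4000. */
enum { SYS_MIPS_EXIT = 4001, SYS_MIPS_READ = 4003, SYS_MIPS_WRITE = 4004 };

/* Host pointer for the guest range [addr, addr+len), or NULL if out of range. */
static void *guest_ptr(Cpu *cpu, uint32_t addr, uint32_t len)
{
    if (addr > cpu->mem_size || len > cpu->mem_size - addr)
        return NULL;
    return cpu->mem + addr;
}

/* Called when the emulated CPU executes a `syscall` instruction. */
void handle_syscall(Cpu *cpu)
{
    assert(cpu != NULL && cpu->mem != NULL);
    uint32_t nr = cpu->regs[REG_V0];
    uint32_t a0 = cpu->regs[REG_A0];
    uint32_t a1 = cpu->regs[REG_A1];
    uint32_t a2 = cpu->regs[REG_A2];
    long ret = -1;
    int err = ENOSYS;    /* default: call not implemented */

    switch (nr) {
    case SYS_MIPS_EXIT:
        cpu->exited = 1;
        cpu->exit_code = (int)a0;
        ret = 0;
        break;
    case SYS_MIPS_WRITE: {
        void *buf = guest_ptr(cpu, a1, a2);
        if (!buf) { err = EFAULT; break; }
        ret = write((int)a0, buf, a2);
        if (ret < 0) err = errno;
        break;
    }
    case SYS_MIPS_READ: {
        void *buf = guest_ptr(cpu, a1, a2);
        if (!buf) { err = EFAULT; break; }
        ret = read((int)a0, buf, a2);
        if (ret < 0) err = errno;
        break;
    }
    default:
        break;
    }

    /* $a3 = 0 on success, 1 on error; $v0 holds the result or an errno.
     * Host errno values are passed through untranslated here, which is a
     * simplification: guest and host numbering can differ. */
    if (ret < 0) {
        cpu->regs[REG_A3] = 1;
        cpu->regs[REG_V0] = (uint32_t)err;
    } else {
        cpu->regs[REG_A3] = 0;
        cpu->regs[REG_V0] = (uint32_t)ret;
    }
}
```

Every call that is not in the switch fails with ENOSYS, which at least makes the missing ones easy to spot while a test program runs.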
That last point is why most emulators provide just a CPU and the hardware behaviors of some target system (i.e., everything in layer 1). With those in place, you can run an original system's boot ROM, OS and user programs, all unaltered. There are a number of existing MIPS32 emulators that do just this and can run unaltered versions of the operating systems that ran on the hardware they emulate.
HTH and best of luck on your project.
Most of the ISO standard C library can be written in straight C. Only a few portions need access to lower level OS functionality.
At a minimum, you'll need to emulate basic I/O at the block or character level for fopen, fread, and fwrite. You could take the Unix approach, though, and implement those on top of the lower-level open, read, and write calls.
And you'll have to manage dynamic memory allocation for malloc and free.
And setjmp and longjmp, which need access to the execution stack.
Also time and the signal.h functions.
I don't know exactly how MIPS works, but on Win32, OS calls have to be explicitly imported into a process via the DLL/EXE import table. There could be something similar in the executable format used by the MIPS system.
The usual approach is to emulate not only the CPU, but also a representative set of standard peripherals. Then you start an operating system in your emulator which comes with a libc and hardware drivers included. Libc will invoke the OS's drivers, which invoke the virtual hardware in your emulator. For a popular example, see DosBox.
The other interpretation of your question is that you don't want to write a full emulator, but a binary compatibility layer that allows you to execute MIPS32 binaries on a non-MIPS32 system. A popular example of that is Mac OS X (Intel), which can also execute PowerPC applications.
In the latter scenario you need to emulate either the OS's ABI (application binary interface) or maybe you can get away with libc's ABI. In both cases you need to implement stub code running on the emulator and proxy code running on the host:
The stub serializes the function call arguments
...and transmits them from emulator memory to host memory using some special virtual instructions
The proxy needs to patch the arguments (endianness, integer length, address space ...)
...and executes the function call on the host system
The proxy then patches and serializes the outgoing function arguments
...and transmits them back to the stub
...which returns the data to the caller
Most calls will not be able to work with a generic stub/proxy, but need specific solutions.
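As a rough illustration of the "patch the arguments" step on the proxy side, the sketch below converts one guest argument block into host form. It assumes a big-endian guest, a flat guest RAM array, and an argument block of three 32-bit words {fd, buf, len}. All names (load_be32, guest_to_host, HostWriteArgs, patch_write_args) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian value, independent of the host's byte order. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value in big-endian order. */
void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Translate a guest address range into a host pointer (address space fix-up).
 * Returns NULL if the range is not fully inside guest RAM. */
void *guest_to_host(uint8_t *ram, uint32_t ram_size, uint32_t addr, uint32_t len)
{
    if (addr > ram_size || len > ram_size - addr)
        return NULL;
    return ram + addr;
}

/* Host-side view of the arguments of a write(fd, buf, len)-style call. */
typedef struct {
    int         fd;
    const void *buf;
    size_t      len;
} HostWriteArgs;

/* Convert the guest argument block at args_addr into host form.
 * Returns 0 on success, -1 if any guest address is invalid. */
int patch_write_args(uint8_t *ram, uint32_t ram_size, uint32_t args_addr,
                     HostWriteArgs *out)
{
    assert(ram != NULL && out != NULL);
    const uint8_t *args = guest_to_host(ram, ram_size, args_addr, 12);
    if (!args)
        return -1;

    uint32_t fd  = load_be32(args);        /* endianness fix-up */
    uint32_t buf = load_be32(args + 4);
    uint32_t len = load_be32(args + 8);

    void *host_buf = guest_to_host(ram, ram_size, buf, len);
    if (!host_buf)
        return -1;

    out->fd  = (int)fd;                    /* integer width fix-up */
    out->buf = host_buf;
    out->len = len;
    return 0;
}
```

The proxy would then call the host function with the patched values and run the same conversions in reverse for anything it has to hand back.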
Good luck!
Is there a way to share memory between a Linux program and a Windows program running through Wine, and if so, how?
Since it may be hard to see why anyone would do such a thing, here is my situation:
I have a proprietary program compiled only for Windows, but this program has an open C plugin API. I'd like to run part of my code in a native application (to use other libraries and the other advantages of Linux) and do the IPC in a fast way.
The purpose of Wine is to provide a WinAPI-like environment on Unix(-like) systems. This implies that Wine may be considered a separate, API-facaded, "independent" operating system on top of and alongside a Unix-like system. Thus, that machine you mention may actually have two OSes, one over the other. Firstly, the "real" (controlling real hardware) one, that is, GNU/Linux. Secondly, there is the WinAPI implementation known as Wine on top of the POSIX/SUS interfaces.
And, as far as humankind is concerned, there's one, and only one single portable way to create inter-process communication between machines with different operating systems, and, as you may have already noticed, I refer to sockets.
The Wine subsystem may be considered a semi-virtual machine by its own right, isolated from the Linux kernel, but tightly coupled to it at the same time.
For efficiency purposes, my proposal is to use sockets in conjunction with what I call the SHMNP (Shared Memory Network Protocol) to provide network-wide shared memory. Again, remember, both "machines" (although it's physically just one) shall be thought of as independent. The Wine implementation is too dirty for the clumsy details to be easily worked around (although that's nothing compared to Cygwin's hacks).
The SHMNP works this way. Note, however, that the SHMNP does not exist! It's just theoretical, and the protocol structures et al are not presented for obvious reasons.
Both machines create their own sockets/shared-memory areas (it's assumed they negotiated the area's size previously). At the same time, they choose a port number and one of the machines becomes the server, the other one becoming the client. The connection is initialized.
Initially, all "shared" memory in both machines contains uninitialized data (the other machine may have different values for any given shared memory block).
Until the connection is closed, if either of the two machines writes to any address of the shared memory area, a message shall be sent to the other machine with the information that changed. The Linux kernel's funky features may be exploited to allow even raw pointers to work perfectly fine with this (see below). I'm, however, not aware of a way to do it in Windows other than by specialized ReadNetworkShared() and WriteNetworkShared()-like procedures.
The implementation may provide some sort of synchronization mechanism, so to allow network-wide semaphores, mutexes, et al.
Linux kernel specific quirks:
Most modern general-purpose hardware architectures and operating systems provide a way to protect memory from malicious/buggy/unintended use by a user process. Whenever you read/write to memory that isn't mapped in your process's virtual address space, the CPU will notify the operating system kernel that a page fault has occurred. Subsequently, the kernel (if Unix(-like)) will send a segmentation violation signal to the offending process, or in other words, you receive SIGSEGV.
The hidden magical secret is that SIGSEGV may be caught, and handled. Thus, we may mmap() some memory (the shared memory area), mark it as read-only with mprotect(), then, whenever we try to write to an address in the shared memory area, the process will receive a SIGSEGV. The signal handler subsequently performs checks in the siginfo_t passed on by the kernel, and deduces one of two actions.
If the faulty address is not in the shared memory area, abort() or whatever.
Otherwise, the page about to be written shall be copied to temporary storage (maybe with the help of splice()?). Then, mark that page as read/write, and set up a timer so that after a timeout the page is marked read-only again and the (maybe compressed) difference between the old copy and the now-written page is sent through the socket (SIMD may help you here). The handler then returns, allowing the write (and maybe, other writes!) to complete without further intervention until the timer fires.
Whenever a machine receives compressed data through the socket, it's simply decompressed and written where it belongs.
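The write-detection half of this can be sketched on Linux roughly as follows. It covers only the mmap()/mprotect()/SIGSEGV part: there is no timer, diff or socket, and it only counts the pages that became writable. Calling mprotect() from a signal handler is not on the POSIX list of async-signal-safe functions, so treat that as an assumption that happens to hold on Linux. All function and variable names are made up.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* for MAP_ANONYMOUS and siginfo_t */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static uint8_t *g_area;                       /* start of the tracked area */
static size_t   g_area_len;                   /* its length in bytes */
static size_t   g_page;                       /* system page size */
static volatile sig_atomic_t g_dirty_pages;   /* pages made writable so far */

/* SIGSEGV handler: unprotect the faulting page if it is ours, else die. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uint8_t *addr = (uint8_t *)info->si_addr;

    if (addr < g_area || addr >= g_area + g_area_len)
        abort();                              /* a genuine crash */

    uint8_t *page = g_area + ((size_t)(addr - g_area) / g_page) * g_page;
    if (mprotect(page, g_page, PROT_READ | PROT_WRITE) != 0)
        abort();
    g_dirty_pages++;
    /* Returning restarts the faulting instruction, which now succeeds. */
}

/* Map `pages` read-only pages and start tracking writes to them.
 * Returns the area, or NULL on failure. */
uint8_t *tracked_area_create(size_t pages)
{
    assert(pages > 0);
    g_page = (size_t)sysconf(_SC_PAGESIZE);
    g_area_len = pages * g_page;

    void *p = mmap(NULL, g_area_len, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    g_area = p;

    struct sigaction sa = {0};
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;
    return g_area;
}

/* Number of pages that have been written to (and unprotected) so far. */
int tracked_area_dirty_pages(void)
{
    return (int)g_dirty_pages;
}
```

A caller would write through a volatile pointer, e.g. `volatile uint8_t *a = tracked_area_create(2); a[0] = 42;`. The first store to each page takes one fault, and later stores to the same page run at full speed until something marks the page read-only again.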
Hope this helps you!
Edit: I just found an obvious flaw in the pre-edit design. If a (compressed) page was sent to another machine, that other machine would be unable to differentiate between data that has been modified within the page and data that hasn't been modified. This involves a race condition, where the receiving machine may lose information it hasn't yet sent. However, some more Linux-kernel-specific stuff fixes it.
I'm not sure this is a good idea or if it will even work, but you could create files in /dev/shm and access them both from Wine and your native Linux application.
It's not guaranteed to exist, so you should have a fallback IPC method.
https://superuser.com/questions/45342/when-should-i-use-dev-shm-and-when-should-i-use-tmp
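For the native Linux side, a minimal sketch of the /dev/shm idea could look like this. The region name is whatever you pick, and map_shared_file() is a made-up helper. It returns NULL when /dev/shm is not usable, so the caller can switch to its fallback IPC. Whether the Windows program under Wine ends up sharing the same pages when it maps the same file is exactly the part I'm unsure about, so this shows only the Linux half.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or open) /dev/shm/<name>, size it, and map it shared.
 * Returns the mapping, or NULL if /dev/shm is unavailable or anything fails. */
void *map_shared_file(const char *name, size_t size)
{
    assert(name != NULL && size > 0);

    char path[256];
    int n = snprintf(path, sizeof path, "/dev/shm/%s", name);
    if (n < 0 || (size_t)n >= sizeof path)
        return NULL;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;                 /* e.g. /dev/shm does not exist */

    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }

    /* MAP_SHARED makes stores visible to every other mapping of the file. */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid after close */
    return p == MAP_FAILED ? NULL : p;
}
```

Two native processes that call this with the same name see each other's writes. You still need your own synchronization on top, since this gives you nothing but raw bytes.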
Otherwise, you might try building a winelib application that can call your Windows code from Linux: http://web.archive.org/web/20150225173552/http://wine-wiki.org/index.php/WineLib#Calling_a_Native_Windows_dll_from_Linux. I am also not sure whether it will work.
When a program is compiled it is converted to machine code which can be "understood" by the machine. How does this machine code interact with the operating system in order to do things like getting input from the keyboard ?
To me, it seems that the machine code should run at a lower level than the operating system and therefore, I can't understand how the OS can act as an intermediary between the compiled application and the hardware.
PS : I just started C ++ programming and I am trying to understand how cin and cout work
This is a very good question (better than you know), and there is a lot to learn. A LOT.
I'll try to keep it short. The operating system acts as a level of abstraction between software and hardware:
Software
.
/|\
| communicates with
\|/
'
Operating System
.
/|\
| communicates with
\|/
'
Hardware
The OS communicates with the hardware through programs called drivers (widely used term), and the OS communicates with software through procedures called system calls (not-so-widely used term).
Essentially, when you make a system call, you are leaving your program and entering code of the operating system. System calls are the only way programmers are allowed to communicate with resources.
Now I would stop there, but you also said:
To me, it seems that the machine code should run at a lower level than
the operating system and therefore, I can't understand how the OS can
act as an intermediary between the compiled application and the
hardware.
This is tricky, but simple once you understand some basics.
First, all code is just machine code running on the CPU. No code is higher or lower than other code (with the exception of some commands that can only be run in a privileged kernel mode). So the question is, how can the OS possibly be in control even though it is relinquishing control of the CPU to the user?
When code is running on a CPU, there is a concept called an interrupt. This is a signal sent to the CPU that causes the currently running code to stop and get switched out with another piece of code, called an interrupt handler.
Examples of interrupts include the keyboard, the mouse, and most importantly, the clock.
The clock interrupt is raised on a regular basis and causes the operating system's clock interrupt handler to run. Within this clock interrupt handler is the operating system's code that examines what code is currently running and determines what code needs to run next. This can be either more operating system code or more user code.
Because the clock is always ticking, and because the operating system always gets this periodic chance to run on the CPU, it is able to orchestrate everything within the computer, even though it runs using the same set of CPU commands as any normal program.
The operating system provides system calls that programs can call to get access to lower level services.
Note that system calls are different from the system() function that you have probably used to execute external programs.
System calls are used to do things like access files, communicate over the network, request heap memory, etc.
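To see one directly: on a Unix-like system, output from something like cout typically ends up in the write() system call (strictly speaking, in the thin C library wrapper around it). A small sketch, where put_string() is just a made-up helper name:

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Send a whole string to a file descriptor with write(), retrying on
 * short writes. Returns 0 on success, -1 on error. */
int put_string(int fd, const char *s)
{
    assert(s != NULL);
    size_t left = strlen(s);

    while (left > 0) {
        ssize_t n = write(fd, s, left);   /* control enters the kernel here */
        if (n < 0)
            return -1;
        s += n;
        left -= (size_t)n;
    }
    return 0;
}
```

Calling `put_string(1, "hello\n")` prints to standard output, because file descriptor 1 is stdout by convention. Reading the keyboard works the same way in the other direction, through read() on descriptor 0.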
Is there a way I could make a C or C++ program that would run without an operating system and that would draw something like a red pixel to the top left corner? I have always wondered how these types of applications are made. Since Windows is written in C I imagine there is a way to do this.
Thanks
If you're writing for a bare processor, with no library support at all, you'll have to get all the hardware manuals, figure out how to access your video memory, and perform whatever operations that hardware requires to get a pixel drawn onto the display (or a sound on the beeper, or a block of memory read from the disk, or whatever).
When you're using an operating system, you'll rely on device drivers to know all this for you. Programs are still written, every day, for platforms without operating systems, but rarely for a bare processor. Many small MPUs come with a support library, usually a set of routines that lets you manipulate whatever peripheral devices they support.
It can certainly be done. You typically write the code in C, and you pretty much have to do everything on your own, with no standard library. To set your pixel, you'd usually load a pointer to the physical address of the screen, and write the correct value to that pointer. Alternatively, on a PC you could consider using the VESA BIOS. In all honesty, it's fairly similar to the way most code for MS-DOS was written (most used MS-DOS to read and write data on disk, but little else).
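A minimal sketch of the "write the correct value to that pointer" part, assuming a linear framebuffer with 32 bits per pixel in 0x00RRGGBB layout. The real base address, pitch and pixel format have to come from your hardware manual or from the VESA BIOS; the Framebuffer struct and put_pixel() are made-up names. assert() comes from the hosted library and is there only so the sketch can be tried on a normal machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A linear framebuffer, 32 bits per pixel (assumed 0x00RRGGBB). */
typedef struct {
    volatile uint32_t *base;   /* on real hardware: the video memory address */
    uint32_t width;            /* visible pixels per row */
    uint32_t height;           /* number of rows */
    uint32_t pitch;            /* pixels from one row start to the next */
} Framebuffer;

/* Set one pixel; coordinates outside the screen are ignored. */
void put_pixel(Framebuffer *fb, uint32_t x, uint32_t y, uint32_t rgb)
{
    assert(fb != NULL && fb->base != NULL);
    if (x >= fb->width || y >= fb->height)
        return;
    fb->base[(size_t)y * fb->pitch + x] = rgb;
}
```

With that, the red pixel in the top left corner is `put_pixel(&fb, 0, 0, 0x00FF0000u)`. The pointer is volatile so the compiler does not optimize away stores that nothing in the program ever reads back.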
The core bootloader and the part of the Kernel that bootstraps the OS are written in assembly. See http://en.wikipedia.org/wiki/Booting for a brief writeup of how an operating system boots. There's no way I'm aware of to write a bootloader or Kernel purely in a higher level language such as C or C++ without using assembly.
You need to write a bootstrapper and a loader combination followed by a payload which involves setting the VGA mode manually by interrupt, grabbing a handle to the basic video buffer and then writing a value to the 0th byte.
Start here: http://en.wikipedia.org/wiki/Bootstrapping_(computing)
Without an OS it's difficult to have a loader, which means no dynamic libc. You'd have to link statically, as well as have a decent amount of bootstrap code written in assembly (although it could be provided as object files which you could then link with). Also, since you'd be at the mercy of whatever the system has, you'd be stuck with the VESA video modes (unless you want to write your own graphics driver and subsystem, which you don't).
There is, but not generally from within the OS. Initially, they are an asm stub that's executed from the MBR on the drive. See MBR. For x86 processors, this is generally 16-bit code; it jumps into the operating system code from there and upgrades to 32-bit/64-bit mode depending on the operating system and chipset.
What exactly does an operating system do? I know that operating systems can be programmed, in, for example, C++, but I previously believed that C++ programs must be run under an operating system? Can somebody please explain and give links? thanks in advance, ell
An operating system is a layer between your code (user code) and the hardware.
The OS is responsible for managing the physical components and giving you a simple (hopefully) API off of which to build. It handles which programs run, when, who goes first, how memory is handled, who gets memory, video drawing, and all that good stuff.
For example, when making a GUI, instead of you sending each bit to the monitor, you tell the OS (or window manager) to make a window. You then tell it to place a button in your window. The OS then handles drawing the window, moving the window, moving the button (but keeping it where it should be in the window).
Now, you can program an operating system in C++, but it's not easy. You have to develop your kernel and whatnot, find a way to interface with the hardware, then expose that interface to your users and their programs.
So, essentially, an OS handles software-to-hardware interfacing and manages your physical resources. C++ programs can be run in an OS or, with enough work, run by themselves or even be an OS.
Actually, the C++ standard itself has something to say on this issue. §1.4/7:
Two kinds of implementations are defined: hosted and freestanding. For a hosted implementation, this International Standard defines the set of available libraries. A freestanding implementation is one in which execution may take place without the benefit of an operating system, and has an implementation-defined set of libraries that includes certain language-support libraries (17.4.1.3).
And in 17.4.1.3,
A freestanding implementation has an implementation-defined set of headers. This set shall include at least the following headers, as shown in Table 13:
Table 13—C++ Headers for Freestanding Implementations
_______________________________________________
Subclause                          Header(s)
18.1  Types                        <cstddef>
18.2  Implementation properties    <limits>
18.3  Start and termination        <cstdlib>
18.4  Dynamic memory management    <new>
18.5  Type identification          <typeinfo>
18.6  Exception handling           <exception>
18.7  Other runtime support        <cstdarg>
The supplied version of the header <cstdlib> shall declare at least the functions abort(), atexit(), and exit() (18.3).
These headers either define constants or provide basic support to the compiler. In practice, some language features will be missing until the OS completes some initialization, for example new and catch.
An OS is really just a program that runs other programs and manages hardware resources for them.
If you are really serious about getting into the internals, I'd recommend reading the book Understanding the Linux Kernel.
sure, http://en.wikipedia.org/wiki/Operating_system
An operating system is the software on a computer that manages the way different programs use its hardware, and regulates the ways that a user controls the computer. Operating systems are found on almost any device that contains a computer with multiple programs—from cellular phones and video game consoles to supercomputers and web servers. Some popular modern operating systems for personal computers include Microsoft Windows, Mac OS X, and Linux (see also: list of operating systems, comparison of operating systems).
I mean the description of an operating system, what it does when and why goes far beyond an answer on this site imho.
An operating system, more specifically its kernel, is developed in a language such as C. And it is compiled into machine code just like any other program. The major difference between a mainstream OS and some code that you write in C is that the C code will run in a timeshare via the OS's CPU Scheduler. Also consider that the OS runs first, and is able to set up an environment in which it completely controls and restricts anything it launches. Also keep in mind that system calls are how a process can communicate back to the OS; everything is just typical machine instructions that could run on any other processor of its type.
A few key features that any mainstream OS provides:
CPU Scheduler - This will load a process, allow it to run for a very limited amount of time before kicking it back off, regaining control and allowing something else to run (whether it be a kernel task or another process; typically kernel tasks have priority)
Memory management - Any application which you run does not have exact memory addresses since this is prone to change. All processes will run in virtual memory, and the OS will translate virtual memory (ex: 0x41000+) into a physical address. (again, it's abstracting the hardware, as is mentioned often)
File systems - various kinds
Resources - any kind of device can be treated as a resource. A process may request access to a resource. (Oddly enough, in this day and age no mainstream OS has a mechanism for preventing deadlocks for resources.)
Security - This is done through roles. It is very important that every process run within tight constraints. This is another abstraction that the OS provides.
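To illustrate the memory management point, here is a toy sketch of translating a virtual address into a physical one with a single-level page table and 4 KiB pages. Real systems use multi-level tables that the MMU walks in hardware using data the OS sets up; the sizes and names here (PageTable, translate) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    VM_PAGE_SHIFT = 12,                   /* 4 KiB pages */
    VM_PAGE_SIZE  = 1 << VM_PAGE_SHIFT,
    VM_NUM_PAGES  = 1024                  /* toy address space: 4 MiB */
};

typedef struct {
    uint32_t frame[VM_NUM_PAGES];         /* physical frame per virtual page */
    uint8_t  present[VM_NUM_PAGES];       /* 1 if the page is mapped */
} PageTable;

/* Translate vaddr. Returns 0 and stores the physical address in *phys,
 * or returns -1 if the page is not mapped (a "page fault"). */
int translate(const PageTable *pt, uint32_t vaddr, uint32_t *phys)
{
    assert(pt != NULL && phys != NULL);
    uint32_t page   = vaddr >> VM_PAGE_SHIFT;
    uint32_t offset = vaddr & (VM_PAGE_SIZE - 1);

    if (page >= VM_NUM_PAGES || !pt->present[page])
        return -1;

    *phys = (pt->frame[page] << VM_PAGE_SHIFT) | offset;
    return 0;
}
```

If virtual page 0x41 is mapped to physical frame 0x7, address 0x41abc translates to 0x7abc: the offset within the page is kept and only the page number is replaced.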
An operating system is just software which is an interface between your hardware and your software. It makes an abstraction of this hardware to make it simpler to use. For instance, you don't have to read the keyboard status in your programs to check if the user hit a key. You might think of it as a lot of bricks put together and piled on top of each other, from a very precise view of the hardware to a very abstract view (from bits, to windows or buttons... for instance)
You don't have to program an operating system in a specific language, but most of them are written in C for efficiency and convenience reasons. You can do programming (your own applications) in any language then, provided that you have the correct libraries installed on the operating system.
There is no "clear" definition of what are the responsibilities of an OS. It could include the following:
Memory management
Devices & drivers
File system(s)
Processes and threads
System calls
In a nutshell, an OS is a program that enables the user to control the computer's hardware in a relatively simple way
From a programming perspective operating systems primarily provide abstraction. Abstraction from the details of the CPU and memory management, abstraction from dealing with hardware devices, abstraction from the details of network protocol stacks.
The operating system provides a higher-level programming interface, often standardized across several operating systems like POSIX does for all Unix flavors.
After reading the question, I see what you're trying to ask. What you're asking is if C/C++ programs require an OS to run. The answer is no. A C/C++ compiler translates human-readable source code into machine language. It doesn't require a specific operating system. However, if you compile in, say, Visual Studio, the resulting executable machine code can't run on anything but Windows.
Specifically, C/C++ code is usually portable: if you have a compiler for an operating system, you can compile the code and it will run there. However, sometimes you have machine-specific code (or OS-specific code), such as a Windows application that uses Windows-based interfaces that cannot be ported over to another operating system. Some examples I can think of: directory operations are usually not portable and usually depend on what OS you're on. However, most file operations, like fopen, are portable.
An OS is a bit different. It requires a different kind of compiler, and it requires a different way to load. Most OSes are written in C/C++, compiled by a compiler, and then distributed. For example, Microsoft wrote Windows 95 in C/C++, put it through a compiler, burned the resulting executable code onto a CD-ROM, and sold it to you; you put the disk in, it copies the executable code onto your machine, and you use it.
They don't give you the source code for your computer to compile; they usually give you the resulting executable.
Basically an OS is the program that all other programs run inside of. It's literally the first program that your computer starts running when it boots up. As such, it controls all the hardware, and acts as the gatekeeper for other programs to access that hardware. It also controls ( or should, at least ) all the programs that are running under it -- when they start, how they stop, and what resources they have access to. You might call it "The Master Control Program" :)
The term "operating system" when applied to a PC, normally refers to a modern "protected memory" operating system that provides not only a basic set of system services but also a complete user interface:
the combination of a kernel, device drivers, and system services that provide memory protection, tasks that can not interfere with each other's memory, and threads which are units of execution within a process, as well as ways for threads and tasks to talk to each other and to access shared resources like file-systems that contain files, on storage devices like your PC's hard disk, are in fact, the core of the operating system.
the "shell" on top of that operating system might be as simple as the "command.com" text command prompt on DOS (remember " C:> _ "?) or as complex as the Windows Shell, including its control panel, etc.
Sometimes, a "linux distribution" contains far more than an operating system, but is informally referred to by a single name (such as Ubuntu) and so the line between what the operating system is (the linux kernel and standard libraries perhaps) and the applications that merely ship with that operating system (the Gnome and KDE environments on Linux) is pretty gray.
A great way to learn what an operating system really is, is to read one of Tanenbaum's books on Operating Systems. I believe he shows the implementation in detail of his "minix" kernel. Another book is "Linux Kernel Internals". If you can handle the technical detail in this kind of book, then you can really understand what an operating system "kernel" is, and then begin to make sense of the layers around that kernel.
I am not aware of one commercial or open source operating system that is written primarily in C++. Such system-level programming is most commonly carried out in a mix of pure ANSI C, and Assembly/Machine language. The low level assembly bits often are involved in tasks like handling interrupts, initializing hardware and booting the system up. Before you have a heap, and a stack, and a working virtual memory system, you wouldn't want to be using C++ objects, or even certain C features like malloc. Your resources and your design must be constrained by performance criteria, and any kind of extra overhead, even a semantic overhead, is to be deplored.
Recently Linus Torvalds famously insulted C++ and described on a mailing list why he would never use it for a Linux kernel. I believe however, that C++ is making inroads in areas that have typically been havens of "pure C". The Gnu GCC team for example is willing to allow C++ into the GCC codebase now, at last.
So I take my C++ program in Visual Studio, compile, and it'll spit out a nice little EXE file. But EXEs will only run on Windows, and I hear a lot about how C/C++ compiles into assembly language, which runs directly on a processor. The EXE runs with the help of Windows, or I could have a program that makes an executable that runs on a Mac. But aren't I compiling C++ code into assembly language, which is processor specific?
My Insights:
I'm guessing I'm probably not. I know there's an Intel C++ compiler, so would it make processor-specific assembly code? EXEs run on Windows, so they take advantage of tons of things already set up, from graphics packages to the massive .NET framework. A processor-specific executable would be literally starting from scratch, with just the instruction set of the processor.
Would this executable be a file-type? We could be running windows and open it, but then would control switch to processor only? I assume this executable would be something like an operating system, in that it would have to be run before anything else was booted up, and have only the processor instruction set to "use".
Let's think about what "run" means...
Something has to load the binary codes into memory. That's an OS feature. The .EXE or binary executable file or bundle or whatever, is formatted in a very OS-specific way so that the OS can load it into memory.
Something has to turn control over to those binary codes. There's the OS, again.
The I/O routines (in C++, but this is true in most places) are just a library that encapsulates OS API's. Drat that OS, it's everywhere.
Reminiscing.
In the olden days (yes, I'm this old) I worked on machines that didn't have OS's. We also didn't have C.
We wrote machine codes using tools like "assemblers" and "linkers" to create big binary images that we could load into the machine. We had to load these binary images through a painful bootstrap process.
We'd use front panel keys to load enough code into memory to read a handy device like a punched paper-tape reader. This would load a small piece of fairly standard boot linking loader software. (We used mylar tape so it wouldn't wear out.)
Then, when we had this linking loader in memory, we could feed the tape we'd prepared earlier with the assembler.
We wrote our own device drivers. Or we used library routines that were in source form, punched on paper tapes.
A "patch" was actually patched pieces of paper tape. Plus, since there were also little bugs, we'd have to adjust the memory image based on hand-written instructions -- patches that hadn't been put into the tape.
Later, we had simple OS's that had simple API's, simple device drivers, and a few utilities like a "file system", an "editor" and a "compiler". It was for a language called Jovial, but we also used Fortran sometimes.
We had to solder serial interface boards so we could plug in a device. We had to write device drivers.
Bottom Line.
You can easily write C++ programs that don't require an OS.
Learn about the hardware BIOS (or BIOS-like) facilities that are part of your processor's chipset. Most modern hardware has a simple OS wired into ROM that does power-on self-test (POST), loads a few simple drivers, and locates boot blocks.
Learn how to write your own boot block. That is the first proper "software" thing that's loaded after POST. This isn't all that hard. You can use various partitioning tools to force your boot block program onto a disk and you'll have complete control over the hardware. No OS.
Learn how GRUB, LILO or BootCamp launch an OS. It's not complicated. Once they're booted, they can load your program and you're off and running. This is slightly simpler because you create the kind of partition that a boot loader wants to load. Base yours on the Linux kernel and you'll be happier. Don't try to figure out how Windows boots -- it's too complicated.
Read up on ELF. http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
Learn how device drivers are written. If you don't use an OS, you'll need to write device drivers.
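To make the boot block item above a bit more concrete, here is a minimal sketch in C that builds a 512-byte boot sector image on the host. It assumes the classic PC BIOS convention, where the firmware only treats a sector as bootable if its last two bytes are 0x55 and 0xAA. The names `make_boot_sector` and `is_bootable` are made up for illustration, and a real MBR that also carries a partition table leaves less room for code than this allows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u

/* Build a boot sector image: code first, zero padding, signature last.
   Returns 0 on success, -1 if the code does not fit before the signature. */
int make_boot_sector(uint8_t sector[SECTOR_SIZE], const uint8_t *code, size_t code_len)
{
    assert(sector != NULL);
    if (code_len > SECTOR_SIZE - 2u)
        return -1;
    memset(sector, 0, SECTOR_SIZE);
    if (code_len > 0)
        memcpy(sector, code, code_len);
    sector[510] = 0x55;  /* boot signature, low byte  */
    sector[511] = 0xAA;  /* boot signature, high byte */
    return 0;
}

/* The check a classic BIOS performs before jumping into the sector. */
int is_bootable(const uint8_t sector[SECTOR_SIZE])
{
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

Feeding it the two bytes 0xEB 0xFE (an x86 "jump to self") gives the smallest possible "program": it boots and then hangs, with no OS anywhere.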
The problem is that the OS really does a lot to start your programs. The EXE file itself has header information on it that Windows recognizes, identifying itself as an EXE file. Your app does everything, from filesystem access to memory allocations, through the OS.
But yes, you CAN run apps compiled for Windows/Intel on other platforms without emulation. If you want to run your EXE on a Mac or UNIX, you will need to install a bit more software to do the work that Windows would do to run your program -- take a look at the "Wine" project.
What you're talking about is what's known in the embedded world as a "bare-metal" application. They're very common for things like an ARM Cortex-M3 that goes in (say) a debit-card validator box or an interactive toy, and doesn't have enough memory or capability to run a full operating system. So, instead of getting an "ARM/Linux" compiler that would compile an application to run on Linux on an ARM processor, you get an "ARM bare-metal" compiler that compiles things to run on an ARM processor without an operating system. (I'm using ARM rather than x86 as an example, because x86 bare-metal applications are really quite rare these days.)
As stated in your question and the other answers, your application will need to do some things that would otherwise be taken care of by the operating system.
First, it needs to initialize the memory system, the interrupt vectors, and various other bits of board goo. Typically this is something that a bare-metal compiler will do for you, though if you have a weird board, you may need to tell it how to do that. This gets things from the point where the board turns on to the point where your main() function starts.
Then, you need to interact with things outside the CPU and RAM. An operating system includes all sorts of functions for doing this -- disk I/O, screen output, keyboard and mouse input, networking, etc., so forth, and so on. Without an operating system, you have to get that from somewhere else. You may get some of that from libraries from your hardware manufacturer; for instance, a board I was recently playing with has a 40x200-pixel LED screen, and it came with a library with the code to turn that on and set individual pixel values on it. And there are several companies selling libraries to implement a TCP/IP stack and things like that, for doing networking or whatnot.
Consider, for example, that this makes it difficult to do even a basic printf. When you have an operating system, printf just sends a message to the operating system that says "put this string on the console", and the operating system finds the current cursor position on the console, and does all the stuff to figure out what pixels to change on the screen, and what CPU instructions to use to change those pixels, in order to do that.
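Without an OS, the "put this string on the console" part becomes your job. Here is a rough sketch of what that looks like, assuming a serial-port-style device that accepts one byte at a time. Everything here is hypothetical: on real hardware `uart_write_byte` would store to a transmit register whose address comes from the chip's datasheet, while in this sketch it appends to a buffer so the code can run on an ordinary computer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-in for "whatever the device shows". */
char console[64];
size_t console_len;

/* Hypothetical driver routine: on a real board this would be a single
   store to a memory-mapped transmit register. */
void uart_write_byte(uint8_t b)
{
    if (console_len < sizeof console - 1) {
        console[console_len++] = (char)b;
        console[console_len] = '\0';
    }
}

/* puts() without libc: push the string out one byte at a time. */
void bm_puts(const char *s)
{
    assert(s != NULL);
    while (*s != '\0')
        uart_write_byte((uint8_t)*s++);
}

/* The "%u" part of printf, done by hand. */
void bm_put_uint(uint32_t v)
{
    char digits[10];  /* a 32-bit value has at most 10 decimal digits */
    size_t n = 0;
    do {
        digits[n++] = (char)('0' + v % 10u);
        v /= 10u;
    } while (v != 0);
    while (n > 0)
        uart_write_byte((uint8_t)digits[--n]);
}
```

Calling `bm_puts("x=")` followed by `bm_put_uint(1234)` leaves `x=1234` in the buffer. A full printf is this idea repeated for every format specifier.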
Oh, and did we mention that you first have to figure out how to get the program into the CPU? A typical computer has a bit of programmable ROM that it will load instructions from when it starts up. On an x86, this is the BIOS, and it usually already contains a handy program that gets the CPU started, sets up the display, looks for disks, and loads a program off the disk that it finds. On an embedded system, that's typically where your program goes -- which means you need some way to put your program there. Often, that means you have a device called a "debugger" that's physically attached to your embedded board that loads the program -- and can also do things that allow you to pause the processor and determine what its state is, so that you can step through your program just as if you were running it in a software debugger on your computer. But I digress.
Anyway, to answer your second question, this executable that you'd create is something that gets stored in that ROM on your embedded board -- or perhaps you'd just store a bit of it in ROM (which is, after all, pretty small) and store the rest on a flash drive, and the bit in ROM would include the instructions to get the rest of it off the flash drive. It would probably be stored as a file on your main computer (that is, the Linux or Windows computer where you're creating it), but that's just for storage, it wouldn't run there.
You'll notice that when you've got a lot of these libraries together, they're doing a fair bit of what an operating system does, and there's sort of this space between the pile of libraries and a real operating system. In that space goes what's called an RTOS -- "real-time operating system". The smaller ones of these are really just collections of libraries that work together to do all the operating-systemy things, and sometimes also include stuff so you can run multiple threads at once (and then you can have different threads act like different programs) -- though all of this is compiled into the same "program", and the RTOS is really nothing more than a library you've included. Larger ones start storing parts of the code in separate places, and I think some of them can even load pieces of code off of disks -- just like Windows and Linux do when running a program. It's sort of a continuum, rather than an either/or.
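To give a feel for the "it's really just a library" end of that continuum, here is a toy cooperative scheduler in C. This is only a sketch of the general idea, not how FreeRTOS or any particular RTOS is implemented: real ones typically preempt tasks from a timer interrupt and give each task its own stack. All names (`sched_add`, `sched_run`, `countdown`) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 4

/* A task does one slice of work and returns 1 to be called again, 0 when done. */
typedef int (*task_fn)(void *state);

struct task {
    task_fn run;
    void *state;
    int alive;
};

static struct task tasks[MAX_TASKS];
static size_t task_count;

/* Register a task. Returns 0 on success, -1 if the table is full. */
int sched_add(task_fn run, void *state)
{
    assert(run != NULL);
    if (task_count == MAX_TASKS)
        return -1;
    tasks[task_count].run = run;
    tasks[task_count].state = state;
    tasks[task_count].alive = 1;
    task_count++;
    return 0;
}

/* Round-robin over the tasks until all have finished.
   Returns the number of slices that were executed. */
unsigned sched_run(void)
{
    unsigned slices = 0;
    size_t remaining = 0;
    for (size_t i = 0; i < task_count; i++)
        if (tasks[i].alive)
            remaining++;
    while (remaining > 0) {
        for (size_t i = 0; i < task_count; i++) {
            if (!tasks[i].alive)
                continue;
            slices++;
            if (tasks[i].run(tasks[i].state) == 0) {
                tasks[i].alive = 0;
                remaining--;
            }
        }
    }
    return slices;
}

/* Example task: decrement a counter once per slice until it reaches zero. */
int countdown(void *state)
{
    int *n = state;
    if (*n > 0)
        (*n)--;
    return *n > 0;
}
```

Two `countdown` tasks registered with different counters take turns until both reach zero. Both "programs" live in the same binary as the scheduler, which is the point being made above.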
The FreeRTOS system is an open-source RTOS that's towards the smaller end of the RTOS space; they might be a good place to look at some of this if you're more interested. They do have some examples of x86 applications, which would give you an idea of what sort of x86 systems would run a bare-metal or RTOS-based program and how you'd compile something to run on one; link here: http://www.freertos.org/a00090.html#186.
The computer is not the CPU. To do anything useful, the CPU has to be connected to memory and IO controllers and other devices. An OS takes care of abstracting all of that from running programs. So, if you want to write a program that runs without an OS, your program will have to replicate at least some features of an OS: Taking over from the BIOS during the boot process, initializing devices, communicating with the disk controller to load code and data, communicating with the display controller to show information to the user, communicating with the keyboard controller and the mouse controller to read user input etc etc etc.
Unless you are building an embedded system with specialized hardware, there is no point in doing this. Besides, running your program would mean the user would have to give up running other programs. While this may be acceptable for an ATM today or WordStar in 1984, these days people frown on not being able to check email while listening to music.
Sure, they exist. They are called cross compilers. For example, that's how I can program for the iPhone platform using Xcode.
A related type of compiler is one that compiles for a virtual platform. That's how Java works.
Any given compiler/toolset produces code for a particular processor/OS combination. So your Visual Studio compile example produces code for x86/Windows. That .EXE will only run on x86/Windows and not on (for example) ARM/Windows (as used by some cellphones).
To produce code for a processor/OS combination other than what you're running the compiler on requires what is generally referred to as a cross-compiler. If you have a full professional Visual Studio subscription, you can get the ARM cross compiler, which will allow you to produce ARM/Windows .EXE files which won't run on your desktop machine, but WILL run on an ARM/Windows based cellphone or palmtop.
Yes, you can make an executable that runs on the 'bare metal' of a processor. Obviously that's how operating system kernels work. The main thing you need to do is create an executable that uses no libraries whatsoever. However, the "no libraries" restriction includes the C standard library! So that means no malloc, no printf, etc. You have to basically be your own OS and manage memory and I/O yourself. This will inevitably require a fair bit of work directly in assembly at some stage.
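As an illustration of "manage memory yourself", here is about the simplest allocator you could write when there is no malloc: a bump allocator that hands out pieces of a fixed array and never frees them individually. This is a sketch under assumptions of my own (a 1 KiB heap, 8-byte alignment, hypothetical names `bump_alloc` and `bump_reset`); a real kernel would need something considerably smarter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE 1024u
#define ALIGNMENT 8u

/* The whole "heap" is one static array; the union forces 8-byte alignment. */
static union {
    unsigned char bytes[HEAP_SIZE];
    uint64_t force_align;
} heap;
static size_t heap_used;

/* Hand out the next free chunk, or NULL if the request cannot be satisfied. */
void *bump_alloc(size_t size)
{
    assert(heap_used <= HEAP_SIZE);
    if (size == 0 || size > HEAP_SIZE)
        return NULL;
    /* Round up so that every returned pointer stays aligned. */
    size_t rounded = (size + (ALIGNMENT - 1u)) & ~(size_t)(ALIGNMENT - 1u);
    if (rounded > HEAP_SIZE - heap_used)
        return NULL;
    void *p = &heap.bytes[heap_used];
    heap_used += rounded;
    return p;
}

/* There is no free(): the only way to reclaim memory is to drop everything. */
void bump_reset(void)
{
    heap_used = 0;
}
```

The trade-off is obvious: allocation is a couple of instructions, but nothing can be released individually, which is often acceptable during early boot and rarely acceptable afterwards.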
You also lose several other luxuries, such as main(), which can't be the starting point of your program since main() is something that is invoked by the OS and the C runtime environment.
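What runs before main() on bare metal is usually a small piece of startup code. The sketch below simulates its two classic jobs on the host: copying the initial values of global variables from ROM/flash into RAM, and zeroing the uninitialized globals. On a real target the section boundaries come from the linker script and the handler never returns; the arrays and the names `reset_handler` and `app_main` here are stand-ins I made up so the example can run anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DATA_WORDS 3
#define BSS_WORDS  4

/* Stand-in for the initial values of globals as stored in ROM/flash. */
static const uint32_t flash_data_image[DATA_WORDS] = { 10, 20, 30 };

/* Stand-ins for RAM, which holds garbage at power-on. */
uint32_t ram_data[DATA_WORDS] = { 0xDEADBEEF, 0xDEADBEEF, 0xDEADBEEF };
uint32_t ram_bss[BSS_WORDS]   = { 0xDEADBEEF, 0xDEADBEEF, 0xDEADBEEF, 0xDEADBEEF };

/* Plays the role of main(): it may assume that globals are already set up. */
static int app_main(void)
{
    return (int)(ram_data[0] + ram_data[2] + ram_bss[1]);
}

/* What the CPU would jump to at reset. Deliberately uses no libc calls. */
int reset_handler(void)
{
    assert(sizeof ram_data == sizeof flash_data_image);
    for (size_t i = 0; i < DATA_WORDS; i++)  /* copy initialized globals */
        ram_data[i] = flash_data_image[i];
    for (size_t i = 0; i < BSS_WORDS; i++)   /* zero uninitialized globals */
        ram_bss[i] = 0;
    return app_main();                       /* real startup code never returns */
}
```

When an OS and a C runtime are present, they do the equivalent of `reset_handler` for you, which is why main() looks like the start of the program.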
Absolutely! That is what embedded programming is. As many have probably said already, the operating system does quite a bit for you. And even in the embedded world, without an operating system, a number of the development tools will provide the startup code to get the processor running enough to jump to your program. Some/many provide full-blown C/C++ libraries so that you can call functions like memcpy() and sometimes even malloc() and printf().
You are welcome to provide every line of code and every instruction and not use a development tool package, but still use a compiler like gcc, for example. Some of the binary formats are common to those run on operating systems, ELF for example. You can execute ELF files on Linux but also have your embedded program result in an ELF binary. The processor cannot execute an ELF file directly, but whatever tool programs the boot PROM (or, in some cases, RAM) will extract the binary program from the ELF file, not unlike an operating system extracting the program to run from an ELF file. EXE is not one of those file formats.
Your favorite Windows application compiler is probably not an embedded compiler either, although you can sometimes use one to do the high-level language stuff and then use an alternative assembler and linker. More work than it is worth, usually. For example, you write a function in C (that does NOT make any library or system calls) and compile that to an object. Write your own or find a utility to extract the compiled binary from that object, or convert it to another object format or to assembler (disassemble). Add your startup code and other assembly to it. Assemble and link everything together as an embedded program. I did it once with Microsoft's embedded Visual C just to see how it measured up to other compilers; it wasn't horrible, but it certainly was not worth the effort of hacking to get at the output.
Every processor, from the one in your computer to the one in your cell phone or microwave, has to have some boot-up code. That code is not running on an operating system. That code uses the same or similar compilers as operating system applications use. For some devices that code puts the processor, memory, and on- and off-chip peripherals in a state where the operating system can be started. From there the operating system takes over. On your computer this would be the BIOS followed by the bootloader, then eventually the operating system: DOS, Windows, Linux, etc.
The main problem is the file format. PE is very different from ELF (used in Unix-like systems). A valid PE program cannot be a valid ELF. So, you either load the binary dynamically with different starters or you have to give up.
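The incompatibility is visible in the very first bytes of the file. An ELF file starts with the four bytes 0x7F 'E' 'L' 'F', while a PE executable starts with the 'M' 'Z' of its DOS header, so the same file cannot begin with both. A minimal sketch of a format sniffer in C (the function and enum names are mine, and a thorough PE check would also follow the offset stored at 0x3C to the "PE\0\0" signature, which this does not do):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum exe_format { FMT_UNKNOWN, FMT_ELF, FMT_PE_OR_DOS };

/* Guess the executable format from the leading magic bytes only. */
enum exe_format sniff_format(const unsigned char *buf, size_t len)
{
    static const unsigned char elf_magic[4] = { 0x7F, 'E', 'L', 'F' };

    assert(buf != NULL || len == 0);
    if (len >= 4 && memcmp(buf, elf_magic, 4) == 0)
        return FMT_ELF;
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z')
        return FMT_PE_OR_DOS;  /* MZ header: a DOS or PE executable */
    return FMT_UNKNOWN;
}
```

A loader on either OS performs roughly this kind of check first and refuses anything that does not carry its own format's magic.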
Other than that, with knowledge of OS services, the value of registers at startup, etc., your code can probably detect easily and reliably which OS you are running under and act accordingly (some malware does just that). Another challenge is then reusing code instead of having two or more different programs in the same binary. Basically you would have to write an emulator, at least for the services that you need.
Also, don't forget about the Windows libraries. Look into Qt and GTK+.