Ok so I'm very very confused how a piece of hardware can understand code.
I read somewhere it has to do with voltages, but how exactly does the piece of hardware know what an instruction in software means? I know drivers are the bridge between software and hardware, but a driver is still software :S.
For example, in C++ we have pointers and they can point to some address in memory.. Can we have a pointer that points to some hardware address and then write to that address and it would affect the hardware? Or does hardware not have addresses?
I guess what I'm really asking is how does the OS or BIOS know where a piece of hardware is and how to talk to it?
For example, in C++ we have pointers and they can point to some
address in memory.. Can we have a pointer that points to some hardware
address and then write to that address and it would affect the
hardware? Or does hardware not have addresses?
Some hardware has addresses you can reach like pointers, and some doesn't (in which case it most likely uses something called I/O ports, which require the special IN and OUT instructions instead of regular memory operations). But much modern hardware has a memory address somewhere, and if you write the correct value to the correct address, the hardware will do what you ask it to do. This ranges from the really simple, say a serial port where you write a byte to an "output register" and the byte is sent along the serial line, while another address holds the input data being received on the serial port, to graphics cards that have a machine language of their own and can run hundreds or thousands of threads.
And normally, it's the OS's responsibility, via drivers, to access the hardware.
This is very simplified, and the whole subject of programming, OS and hardware is enough to write a fairly thick book about (and that's just in general terms, if you want to actually know about specific hardware, it's easily a few dozen pages for a serial port, and hundreds or thousands of pages for a graphics chip).
There are whole books on this topic. But briefly:
Software talks to hardware in a variety of ways. A given piece of hardware may respond to values written to very specific addresses ("memory mapped") or to I/O ports accessed through instructions supported by the CPU (e.g., the x86 in and out instructions). When a memory-mapped port (address) is accessed, the hardware is designed to recognize the specific address or small range of addresses and route the signals to the peripheral hardware rather than to memory. In the case of I/O instructions, the CPU has a separate set of signals used specifically for that purpose.
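To make the routing idea concrete, here is a minimal sketch that simulates address decoding in ordinary C++, so it runs on any machine. Everything in it is invented for illustration: the `Bus` type, the address ranges and the pretend serial port. Real hardware does this routing in logic, not in software.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Hypothetical memory map, made up for this sketch:
//   0x0000 - 0x0FFF : RAM
//   0x8000          : serial port output register
constexpr std::uint32_t kRamSize = 0x1000;
constexpr std::uint32_t kSerialOut = 0x8000;

struct Bus {
    std::array<std::uint8_t, kRamSize> ram{};
    std::string serial_line;  // bytes "sent" by the pretend serial port

    // Route a write by looking only at the address, as a decoder would.
    // Returns false if nothing responds at that address.
    bool write(std::uint32_t addr, std::uint8_t value) {
        if (addr < kRamSize) {
            ram[addr] = value;  // ordinary memory
            return true;
        }
        if (addr == kSerialOut) {
            serial_line.push_back(static_cast<char>(value));  // device register
            return true;
        }
        return false;
    }
};
```

From the caller's point of view, bus.write(0x0010, 42) and bus.write(kSerialOut, 'A') look the same; only the address decides whether the byte lands in RAM or goes out on the serial line.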
The OS (at the lowest level - board support package) and BIOS have "knowledge" built in to them about the hardware address and/or the I/O ports needed to execute the various hardware functions available. That is, at some level, they have coded in exactly what addresses are needed for the different features.
You should read The Soul of a New Machine, by Tracy Kidder. It was published in 1981 and won a Pulitzer Prize, and it goes to great lengths to explain in layman's terms how a computer works and how humans must think to create it. Besides, it's a real story and one of the few to convey the thrill of hardware and software.
All in all, a nice introduction to the subject.
The hardware engineers know where the memory and peripherals live in the processor's address space. So it is something that is known, because those addresses were chosen by someone and documented so that others could write drivers.
The processor does not know peripherals from RAM. The instructions simply use addresses ultimately determined by the programmers who wrote the software the processor is running. That implies, correctly, that the peripherals and RAM (and ROM) are all just addresses. If you were writing a video driver and were changing the resolution of the screen, there would be a handful of addresses that you would need to write to. At some point between the processor core and the peripheral (the video card) there is hardware that examines the address and basically routes it to the right place. This is how the hardware was designed: it examines addresses; some address ranges are RAM and are sent to the memory to be handled, and some are peripherals and are sent there to be handled. Sometimes the memory ranges are themselves programmable, so that you can organize your memory space for whatever reason. It is similar to moving house: it is still you and your stuff at the new house, but it has a different address, and the postal folks who deliver the mail know how to find your new address.
And then there are MMUs, which add a layer of protection and other features. The MMU (memory management unit) can also virtualize an address, so the processor may be programmed to write to address 0x100000 but the MMU translates that to 0x2300000 before it goes out on the normal bus to be sorted as memory or peripheral, eventually finding its destination. Why would you do such a thing? There are two major reasons. One is so that, for example, when you compile an application to run in your operating system, all programs for that OS can be compiled to run at the same address, let's say address 0x8000. But there is only one physical address 0x8000 out there (let's assume). What happens is that the operating system has configured the MMU for your program such that your program thinks it is running at that address. The other is protection: the operating system can, if it chooses and the MMU has the feature, add protections such that if your program tries to access something outside its allocated memory space, a fault occurs and your program is prevented from doing that, prevented from hacking into or crashing other programs' memory space. Likewise, if the operating system supports it, it could also choose to use that fault to swap out some data from RAM to disk and then give you more RAM (virtual memory), allowing the programs to think there is more memory than there really is. An MMU is not the only way to do all of this, but it is the popular way.
So when you have that pointer in C++ running on some operating system, it is most likely a virtual address, not the physical address; the MMU converts the address that has been given to your program into the real memory address. When the OS chooses to switch out your program for another, it is relatively easy to tell the MMU to let the other task think that the low-numbered address space, 0x8000 for example, now belongs to the other program. And your program is put to sleep (not executed) for a while.
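The translation step can be sketched as a toy model in plain C++. This is only an illustration: a real MMU is hardware walking multi-level tables, and the `ToyMmu` type, the 4 KiB page size and the flat map used here are assumptions made for the example.

```cpp
#include <cstdint>
#include <unordered_map>

constexpr std::uint32_t kPageSize = 0x1000;  // 4 KiB pages, assumed for this sketch

// A toy single-level "page table": virtual page number -> physical page number.
struct ToyMmu {
    std::unordered_map<std::uint32_t, std::uint32_t> pages;

    // Record that the page containing 'virt' is backed by the page at 'phys'.
    void map(std::uint32_t virt, std::uint32_t phys) {
        pages[virt / kPageSize] = phys / kPageSize;
    }

    // Writes the physical address to 'phys' and returns true, or returns
    // false to model a fault when the program touches an unmapped address.
    bool translate(std::uint32_t virt, std::uint32_t& phys) const {
        auto it = pages.find(virt / kPageSize);
        if (it == pages.end()) return false;
        phys = it->second * kPageSize + virt % kPageSize;
        return true;
    }
};
```

With mmu.map(0x100000, 0x2300000), translating 0x100010 yields 0x2300010, while translating an address that was never mapped returns false, which stands in for the fault. Giving two ToyMmu objects different mappings for 0x8000 models two programs that both believe they run at the same address.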
I wondered if C or C++ has a way to find where the operating system operates in RAM and free that place. I know that I can use free() to free up memory place. I wonder if I can shut down my computer by freeing my operating system's RAM space.
Before protected memory was a thing you could just access any bit of memory using its physical address and manipulate it. This was how DOS and DOS-based Windows (pre Windows 95, like 3.1) worked.
Protected memory, or virtualized memory, means you can do things like swap out parts of memory to disk, in effect pretending to have more memory than the computer physically has. Chunks of memory can be swapped around as necessary, paged in and paged out, with the running program being none the wiser. These addresses are all virtual, or "fake" in that they don't physically exist, but as far as the program is concerned they are real and work exactly as you'd expect, something accomplished by the Memory Management Unit (MMU) integrated in the CPU.
After protected memory your "user space" program no longer sees physical memory addresses, but instead virtual addresses that the operating system itself manages. On Intel-type systems the kernel, the core of the operating system, runs within a special protection ring that prevents user programs from directly accessing or manipulating memory.
Any multi-user system must implement this kind of memory and kernel protection or there would be no way to prevent one user from accessing the memory of another user's processes.
Within the kernel there is no "malloc" or "free" in the conventional sense, the kernel has its own special allocation mechanisms. These are completely separate from the traditional malloc() and free() functions in the C standard library and are not in any way inter-compatible. Each kernel, be it Linux or BSD or Windows or otherwise, does this in a different way even if they can all support user-space code that uses the exact same malloc() function.
There should be no way that you can, through simple memory allocation calls, crash the system. If you can, congratulations, you've found an exploit and should document it and forward it to the appropriate parties for further analysis. Keep in mind this kind of thing is heavily researched so the likelihood of you discovering one by chance is very low. Competitions like pwn2own show just how much work is involved in bypassing all this security.
It's also important to remember that the operating system does not necessarily live in a fixed location. Address Space Layout Randomization is a technique to scramble the addresses of various functions and data to ensure that an exploit can't use hard-coded values. Before this was common you could predict where various things would live in memory and do blind manipulation through a tiny bug, but that's made much harder now as you must not only find an exploit to manipulate, but another to discover the address in the first place.
All that being said, there's nothing special about C or C++ in terms of "power" that makes it able to do things no other language can do. Any program that is able to bind against the operating system functions has the same "power" in terms of control. This includes Python, Perl, Ruby, Node.js, C# and a long, long list of others that can bind to C libraries and make arbitrary function calls.
People prototype "exploits" in whatever language is the most convenient, and often that's Perl or Python as often as C. It really depends on what you're trying to accomplish. Some bugs, once discovered, are so easy to reproduce you could do it with something as mundane as browser JavaScript, as was the case with Row Hammer.
You mention free() as a means to free memory, which is correct but too simplified. Its counterparts malloc() and calloc() are library functions that, when their own pool runs out, make a system call asking the operating system for a chunk of memory. When you call free(), you relinquish ownership of the memory you asked for and return it to the allocator, which may in turn hand it back to the operating system.
Your C/C++ program runs in a virtual address space which the operating system's memory management subsystem maps to actual RAM addresses. No matter what address you access, it can never be out of this virtual address space which is entirely under the control of the operating system.
On modern operating systems, a user application can never access the operating system's memory. All the memory it uses is granted to it by the operating system. The OS acts as a bridge/abstraction between your user applications and the hardware; that is its whole purpose: to prevent direct interaction with the hardware, in your case RAM.
RAM was once upon a time directly accessible before the advent of virtual memory. It was exactly due to this vulnerability, along with the need to run programs larger than the system memory, that virtual memory was introduced.
The only way you can mess with the operating system in user space is to make system calls with malignant arguments.
Pointers in C are very powerful and seem efficient. But how can using a pointer give you access to hardware?
My idea of this would be setting a pointer's value equal to a hardware's associated object and then manipulating it through the pointer. But if you already have enough access to the hardware's objects and properties to use a pointer on it, where does the pointer come into play? Perhaps I'm visualizing something wrong?
I'm running on windows 7.
A basic example along with an explanation of why the pointer is needed to manipulate that hardware property would be great.
The pointer holds a memory address. And not all of the memory addressing range points to RAM areas alone. Memory addresses have ranges and some ranges map to hardware registers. And by writing to these registers, we can access the hardware. Of course, this also depends on which operating system and which hardware. Here is an example.
In a freestanding environment (like a microcontroller), on a hardware platform that does not
have a Memory Management Unit (some ARM microprocessors), or under an operating system that
does not support hardware protection (like DOS), pointers give you raw access to hardware
through the magic of memory-mapped I/O. Pointers in a program running on an operating
system like Windows or Linux (or just about any modern operating system) are pointers in
a virtual address space. These pointers will not allow you to directly access
hardware.
The way that memory mapped I/O works is that certain physical memory addresses are
reserved for communication with devices in the system. When an address that belongs to a
device is accessed the data is routed to the appropriate register of the device. On x86
platforms this translation is done by the north bridge.
Most hardware is memory mapped. What this means is that it exposes a range of hardware registers (or other hardware entities) as memory areas. These memory locations can be accessed like any other memory. You can read and write them by using memory addresses, and these reads and writes make things happen in the hardware. Just as an example, a write to a hardware register (a memory address) may cause an LED to turn on, or a robot motor to start turning. The hardware's operations are exposed via such memory-mapped registers.
Now pointers are language entities that let you access a memory location. You stick an address into a pointer and dereference it to read (or write) from (or to) that address. So, basically the way you operate the hardware is by accessing its address space via pointers.
This is a very basic question that has been boggling my mind since the day I heard about the concept of virtual and physical memory in my OS class. I know that with compile-time and load-time address binding the logical (virtual) and physical addresses are the same, but with execution-time binding they differ.
First of all, why is it beneficial to generate virtual addresses at compile and load time, and what is returned when we apply the ampersand operator to get the address of a variable of a native data type, a user-defined type, or a function definition?
And how exactly does the OS map from virtual to physical addresses when it does so? These questions are just out of curiosity, and I would love some good, deep insights covering modern-day OSes and how it was in early OSes. I am only C/C++ specific since I don't know much about other languages.
Physical addresses occur in hardware, not software. A possible/occasional exception is in the operating system kernel. Physical means it's the address that the system bus and the RAM chips see.
Not only are physical addresses useless to software, but it could be a security issue. Being able to access any physical memory without address translation, and knowing the addresses of other processes, would allow unfettered access to the machine.
That said, smaller or embedded machines might have no virtual memory, and some older operating systems did allow shared libraries to specify their final physical memory location. Such policies hurt security and are obsolete.
At the application level (e.g. Linux application process), only virtual addresses exist. Local variables are on the stack (or in registers). The stack is organized in call frames. The compiler generates the offset of a local variable within the current call frame, usually an offset relative to the stack pointer or frame pointer register (so the address of a local variable, e.g. in a recursive function, is known only at runtime).
Try stepping through a recursive function in the gdb debugger and displaying the address of some local variable to understand more. Also try the bt command of gdb.
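If you would rather see this without a debugger, here is a small sketch that records the address of the same local variable at every level of a recursion. The function name `descend` and its parameters are made up for the example; the actual address values depend on your platform and will differ from run to run.

```cpp
#include <cstdint>
#include <vector>

// Records the address of 'local' in each active call frame.
int descend(int depth, std::vector<std::uintptr_t>& seen) {
    int local = depth;
    seen.push_back(reinterpret_cast<std::uintptr_t>(&local));
    if (depth == 0) return local;
    // 'local' is still alive during the recursive call, so every frame
    // needs its own copy at its own address.
    return descend(depth - 1, seen) + local;
}
```

Calling descend(3, seen) leaves four different addresses in seen, one per call frame, even though the source code names only a single variable. Printing them usually shows the stack moving in one direction by a fixed step, but that part is platform-specific.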
Type
cat /proc/self/maps
to understand the address space (and virtual memory mapping) of the process executing that cat command.
Within the kernel, the mapping from virtual addresses to physical RAM is done by code implementing paging and driving the MMU. Some system calls (notably mmap(2) and others) can change the address space of your process.
Some early computers (e.g. those from the 1950s or early 1960s like the CAB 500, IBM 1130 or IBM 1620) did not have any MMU; even the original Intel 8086 didn't have any memory protection. At that time (the 1960s), C did not exist. On processors without an MMU you don't have virtual addresses (only physical ones, including in your embedded C code for a washing-machine manufacturer). Some machines could protect certain memory banks against writes through physical switches. Today, some low-end cheap processors (those in washing machines) don't have any MMU. Most cheap microcontrollers don't have any MMU. Often (but not always), the program is in some ROM, so it cannot be overwritten by buggy code.
I sometimes see statements that on some platforms the following C or C++ code:
int* ptr;
*ptr = 0;
can result in writing to a hardware input-output port if ptr happens to store the address to which that port is mapped. Usually they are called "embedded platforms".
What are real examples of such platforms?
Most systems in my experience use memory-mapped I/O. The x86 platform has a separate, non-memory-mapped I/O address space (that uses the in/out family of processor op-codes), but the PC architecture also extensively uses the standard memory address space for device I/O, which has a larger address space, faster access (generally), and easier programming (generally).
I think that the separate I/O address space was used initially because the memory address space of processors was sometimes quite limited and it made little sense to use a portion of it for device access. Once the memory address space was opened up to megabytes or more, that reason to separate I/O addresses from memory addresses became less important.
I'm not sure how many processors provide a separate I/O address space like the x86 does. As an indication of how the separate I/O address space has fallen out of favor, when the x86 architecture moved into the 32-bit realm, nothing was done to increase the I/O address space from 64KB (though they did add the ability to move 32-bit chunks of data in one instruction). When x86 moved into the 64-bit realm, the I/O address space remained at 64KB and they didn't even add the ability to move data in 64-bit units...
Also note that modern desktop and server platforms (or other systems that use virtual memory) generally don't permit an application to access I/O ports, whether they're memory-mapped or not. That access is restricted to device drivers, and even device drivers will have some OS interface to deal with virtual memory mappings of the physical address and/or to set up DMA access.
On smaller systems, like embedded systems, I/O addresses are often accessed directly by the application. For systems that use memory-mapped addresses, that will usually be done by simply setting a pointer with the physical address of the device's I/O port and using that pointer like any other. However, to ensure that the access occurs and occurs in the right order, the pointer must be declared as pointing to a volatile object.
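A minimal sketch of that pattern follows. The register address in the comment is hypothetical (a real one comes from the chip's datasheet), and the function `send_string` is invented for the example; it takes the register as a parameter so the same code can be tried on a desktop with an ordinary variable standing in for the device.

```cpp
#include <cstdint>

// On a real microcontroller the register would sit at a fixed address taken
// from the datasheet, for example (hypothetical address):
//   volatile std::uint8_t* const uart_tx =
//       reinterpret_cast<volatile std::uint8_t*>(0x4000C000);

// Writes each byte of a string to the transmit register, one store per byte.
// Because the pointee is volatile, the compiler may not merge or drop the
// stores, and it keeps them in program order relative to each other.
inline int send_string(volatile std::uint8_t* tx_reg, const char* text) {
    int stores = 0;
    for (; *text != '\0'; ++text) {
        *tx_reg = static_cast<std::uint8_t>(*text);
        ++stores;
    }
    return stores;
}
```

Without volatile, an optimizer would be entitled to keep only the last store, since the earlier values are never read back by the program. With a real device behind the address, that would silently drop every byte but the final one.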
To access a device that uses something other than a memory-mapped I/O port (like the x86's I/O address space), a compiler will generally provide an extension that allows you to read or write to that address space. In the absence of such an extension, you'd need to call an assembly language function to perform the I/O.
This is called Memory-mapped I/O, and a good place to start is the Wikipedia article.
Modern operating systems usually protect you from this unless you're writing drivers, but this technique is relevant even on PC architectures. Remember the DOS 640 KB limit? That's because memory addresses from 640 KB to 1 MB were allocated for I/O.
PlayStation. That was how we got some direct optimized access to low-level graphics (and other) features of the system.
An NDIS driver on Windows is an example. This is called memory mapped I/O and the benefit of this is performance.
See Embedded-Systems for examples of devices that use memory-mapped I/O, e.g. routers, ADSL modems, microcontrollers, etc.
It is mostly used when writing drivers, since most peripheral devices communicate with the main CPU through memory mapped registers.
Motorola 68k series and PowerPC are the big ones.
You can do this in modern Windows (and I'm pretty sure Linux offers it too). It's called memory mapped files. You can load a file into memory on Windows and then write/alter it just by manipulating pointers.
In my C++ program (on Windows), I'm allocating a block of memory and can make sure it stays locked (unswapped and contiguous) in physical memory (i.e. using VirtualAllocEx(), MapUserPhysicalPages() etc).
In the context of my process, I can get the VIRTUAL memory address of that block,
but I need to find out the PHYSICAL memory address of it in order to pass it to some external device.
1. Is there any way I can translate the virtual address to the physical one within my program, in USER mode?
2. If not, I can find out this virtual to physical mapping only in KERNEL mode. I guess it means I have to write a driver to do it...? Do you know of any readily available driver/DLL/API which I can use, that my application (program) will interface with to do the translation?
3. In case I'll have to write the driver myself, how do I do this translation? Which functions do I use? Is it MmGetPhysicalAddress()? How do I use it?
4. Also, if I understand correctly, MmGetPhysicalAddress() returns the physical address of a virtual base address that is in the context of the calling process. But if the calling process is the driver, and I'm using my application to call the driver for that function, I'm changing contexts and I am no longer in the context of the app when the MmGetPhysicalAddress routine is called... so how do I translate the virtual address in the application (user-mode) memory space, not the driver?
Any answers, tips and code excerpts will be much appreciated!!
Thanks
In my C++ program (on Windows), I'm allocating a block of memory and can make sure it stays locked (unswapped and contiguous) in physical memory (i.e. using VirtualAllocEx(), MapUserPhysicalPages() etc).
No, you can't really ensure that it stays locked. What if your process crashes, or exits early? What if the user kills it? That memory will be reused for something else, and if your device is still doing DMA, that will eventually result in data loss/corruption or a bugcheck (BSOD).
Also, MapUserPhysicalPages is part of Windows AWE (Address Windowing Extensions), which is for handling more than 4 GB of RAM on 32-bit versions of Windows Server. I don't think it was intended to be used to hack up user-mode DMA.
1. Is there any way I can translate the virtual address to the physical one within my program, in USER mode?
There are drivers that let you do this, but you cannot program DMA from user mode on Windows and still have a stable and secure system. Letting a process that runs as a limited user account read/write physical memory allows that process to own the system. If this is for a one-off system or a prototype, this is probably acceptable, but if you expect other people (particularly paying customers) to use your software and your device, you should write a driver.
2. If not, I can find out this virtual to physical mapping only in KERNEL mode. I guess it means I have to write a driver to do it...?
That is the recommended way to approach this problem.
Do you know of any readily available driver/DLL/API which I can use, that my application (program) will interface with to do the translation?
You can use an MDL (Memory Descriptor List) to lock down arbitrary memory, including memory buffers owned by a user-mode process, and translate its virtual addresses into physical addresses. You can also have Windows temporarily create an MDL for the buffer passed into a call to DeviceIoControl by using METHOD_IN_DIRECT or METHOD_OUT_DIRECT.
Note that contiguous pages in the virtual address space are almost never contiguous in the physical address space. Hopefully your device is designed to handle that.
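To illustrate what "handling that" means for a device, here is a sketch in portable C++ (not driver code) that takes the physical page frame backing each virtual page of a buffer and groups the frames into physically contiguous runs, which is roughly the information a scatter/gather list carries. The `Run` struct, the `build_runs` function and the frame numbers are all made up for the example; in a real Windows driver this information comes from the MDL and the DMA APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One physically contiguous piece of a buffer.
struct Run {
    std::uint64_t first_frame;  // physical page frame number where the run starts
    std::size_t page_count;     // number of consecutive pages in the run
};

// Given the physical frame backing each virtual page of a buffer (in virtual
// order), merge neighbours that are also adjacent in physical memory.
std::vector<Run> build_runs(const std::vector<std::uint64_t>& frames) {
    std::vector<Run> runs;
    for (std::uint64_t frame : frames) {
        if (!runs.empty() &&
            runs.back().first_frame + runs.back().page_count == frame) {
            ++runs.back().page_count;       // extends the current run
        } else {
            runs.push_back(Run{frame, 1});  // starts a new run
        }
    }
    return runs;
}
```

A four-page buffer backed by frames 0x1A0, 0x1A1, 0x07F and 0x300 comes out as three runs, so a device that can only take a single base address and length could not transfer it in one go.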
3. In case I'll have to write the driver myself, how do I do this translation? Which functions do I use? Is it MmGetPhysicalAddress()? How do I use it?
There's a lot more to writing a driver than just calling a few APIs. If you're going to write a driver, I would recommend reading as much relevant material as you can from MSDN and OSR. Also, look at the examples in the Windows Driver Kit.
4. Also, if I understand correctly, MmGetPhysicalAddress() returns the physical address of a virtual base address that is in the context of the calling process. But if the calling process is the driver, and I'm using my application to call the driver for that function, I'm changing contexts and I am no longer in the context of the app when the MmGetPhysicalAddress routine is called... so how do I translate the virtual address in the application (user-mode) memory space, not the driver?
Drivers are not processes. A driver can run in the context of any process, as well as various elevated contexts (interrupt handlers and DPCs).
You have a virtually contiguous buffer in your application. That range of virtual memory is, as you noted, only available in the context of your application, and some of it may be paged out at any time. So, in order to access the memory from a device (which is to say, do DMA), you need to both lock it down and get a description that can be passed to a device.
You can get a description of the buffer called an MDL, or Memory Descriptor List, by sending an IOCTL (via the DeviceIoControl function) to your driver using METHOD_IN_DIRECT or METHOD_OUT_DIRECT. See the following page for a discussion of defining IOCTLs.
http://msdn.microsoft.com/en-us/library/ms795909.aspx
Now that you have a description of the buffer in a driver for your device, you can lock it down so that the buffer remains in memory for the entire period that your device may act on it. Look up MmProbeAndLockPages on MSDN.
Your device may or may not be able to read or write all of the memory in the buffer. The device may only support 32-bit DMA and the machine may have more than 4GB of RAM. Or you may be dealing with a machine that has an IOMMU, a GART or some other address translation technology. To accommodate this, use the various DMA APIs to get a set of logical addresses that are good for use by your device. In many cases, these logical addresses will be equivalent to the physical addresses that your question originally asked about, but not always.
Which DMA API you use depends on whether your device can handle scatter/gather lists and such. Your driver, in its setup code, will call IoGetDmaAdapter and use some of the functions returned by it.
Typically, you'll be interested in GetScatterGatherList and PutScatterGatherList. You supply a function (ExecutionRoutine) which actually programs your hardware to do the transfer.
There are a lot of details involved. Good luck.
You can not access the page tables from user space, they are mapped in the kernel.
If you are in the kernel, you can simply inspect the value of CR3 to locate the base page table address and then begin your resolution.
This blog series has a wonderful explanation of how to do this. You do not need any OS facility/API to resolve virtual<->physical addresses.
Virtual Address: f9a10054
1: kd> .formats 0xf9a10054
  Binary: 11111001 10100001 00000000 01010100

Field                                  Bits             Meaning
Page Directory Pointer Index (PDPI)    11               Index into 1st table (Page Directory Pointer Table)
Page Directory Index (PDI)             111001 101       Index into 2nd table (Page Directory Table)
Page Table Index (PTI)                 00001 0000       Index into 3rd table (Page Table)
Byte Index                             0000 01010100    0x054, the offset into the physical memory page
The example in that series uses WinDbg; !dq is a physical memory read.
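The bit-slicing in that breakdown can be reproduced with a few shifts and masks. This is a sketch that assumes the same layout as the example, a 32-bit address under PAE paging with 4 KiB pages, split 2 + 9 + 9 + 12 bits; the `PaeIndices` struct and `split_pae` function are names invented here. It only computes the indices and does not walk any real page tables.

```cpp
#include <cstdint>

// Fields of a 32-bit virtual address under PAE paging with 4 KiB pages.
struct PaeIndices {
    std::uint32_t pdpi;    // index into the Page Directory Pointer Table (2 bits)
    std::uint32_t pdi;     // index into the Page Directory (9 bits)
    std::uint32_t pti;     // index into the Page Table (9 bits)
    std::uint32_t offset;  // byte offset inside the 4 KiB page (12 bits)
};

inline PaeIndices split_pae(std::uint32_t va) {
    PaeIndices r;
    r.pdpi   = (va >> 30) & 0x3;
    r.pdi    = (va >> 21) & 0x1FF;
    r.pti    = (va >> 12) & 0x1FF;
    r.offset = va & 0xFFF;
    return r;
}
```

For 0xf9a10054 this gives PDPI 0x3, PDI 0x1CD, PTI 0x10 and offset 0x054, which matches the binary groups shown above.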
1) No
2) Yes, you have to write a driver. Best would be either a virtual driver, or change the driver for the special-external device.
3) This gets very confusing here. MmGetPhysicalAddress should be the method you are looking for, but I really don't know how the physical address is mapped to the bank/chip/etc. on the physical memory.
4) You cannot use paged memory, because that gets relocated. You can lock paged memory with MmProbeAndLockPages on an MDL you can build on memory passed in from the user mode calling context. But it is better to allocate non-paged memory and hand that to your user mode application.
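// BUFFER_SIZE and POOL_TAG are placeholders defined by the driver.
PVOID p = ExAllocatePoolWithTag( NonPagedPool, BUFFER_SIZE, POOL_TAG );
if ( p != NULL ) {
    PHYSICAL_ADDRESS realAddr = MmGetPhysicalAddress( p );
    // use realAddr
}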
You really shouldn't be doing stuff like this in usermode; as Christopher says, you need to lock the pages so that mm doesn't decide to page out your backing memory while a device is using it, which would end up corrupting random memory pages.
But if the calling process is the driver, and I'm using my application to call the driver for that function, I'm changing contexts and I am no longer in the context of the app when the MmGetPhysicalAddress routine is called
Drivers don't have context like user-mode apps do; if you're calling into a driver via an IOCTL or something, you are usually (but not guaranteed to be!) in the calling user thread's context. But really, this doesn't matter for what you're asking, because kernel-mode memory (anything above 0x80000000) has the same mapping no matter where you are, and you'd end up allocating memory on the kernel side. But again, write a proper driver. Use WDF (http://www.microsoft.com/whdc/driver/wdf/default.mspx), and it will make writing a correct driver much easier (though still pretty tricky; Windows driver writing is not easy).
EDIT: Just thought I'd throw out a few book references to help you out, you should definitely (even if you don't pursue writing the driver) read Windows Internals by Russinovich and Solomon (http://www.amazon.com/Microsoft-Windows-Internals-4th-Server/dp/0735619174/ref=pd_bbs_sr_2?ie=UTF8&s=books&qid=1229284688&sr=8-2); Programming the Microsoft Windows Driver Model is good too (http://www.amazon.com/Programming-Microsoft-Windows-Driver-Second/dp/0735618038/ref=sr_1_1?ie=UTF8&s=books&qid=1229284726&sr=1-1)
Wait, there is more. For the privilege of running on your customer's 64-bit Vista, you get to expend more time and money to get your kernel-mode driver signed by Microsoft,