Suppose we write a program in C and print the address of one of the variables declared in the program, is the address that gets printed on the screen the virtual address or the physical address of the variable?
If it is the virtual address, why is it that it still has the same range as a bit range of physical memory? Eg. for a 32 bit machine if it returns 0x833CA23E.
The address is going to be a virtual address in virtual memory, because the application has no knowledge of physical memory. That is hidden by the kernel and the MMU.
I am not sure what you mean by the same "bit range". If you have a 32-bit address space it will range across the entire 32-bit space regardless of what amount of physical memory you have. Likewise for 64-bit.
In most typical cases (Windows, Linux, etc.) it'll be a virtual address.
In the typical cases like Linux and Windows, both virtual addresses and physical addresses are normally 32 bits, so having numbers in the same range becomes inevitable. It is possible to allocate more than 4 gigabytes of memory, and when/if you do so, you end up with addresses larger than 32 bits--but unless you take special steps to do that, a 32-bit address is what you'll get by default.
When you do use more than 4 GB of memory under a 32-bit OS, you're normally doing so via some special API, such as Windows' Address Windowing Extensions. Using these, you get access to more than 4 GB of RAM, but it's not what's going to happen by default with code that's even close to portable.
Some (versions of some) operating systems also use Intel's Physical Address Extensions (PAE) to give the system as a whole access to more than 4 GB of RAM, but even when these are in use, any single process running on the system is still limited to addressing 4 GB (i.e., with PAE, you can have a limit of 4 GB per process, whereas older systems had a limit of 4 GB total, divided as needed between the processes).
It will be a 32 bit virtual address in most cases.
If your OS supports does paging then it would be the virtual address. It could have been mapped to the same physical address using paging. Linux and Windows do paging.
Another thing that matters is the architecture. On Intel x86 32bit system it will be 32 bit address. The first 10 bits of the address will be used to get page table. The second 10 bits will be used to get page from the selected page table. And the last 12 bits will give you the actual physical address from that page.
I hope it answers your question.
Related
I am developing a 32 bit application and got out of memory error.
And I noticed that my Visual Studio and a plugin (other apps too) used too much memory which is around 4 or 5 GB.
So I suspected that these program use up all the memory addresses where my program is able to find free memory.
I suppose that 32 bit can only use the first 4 GB, other memory it can not use at all.
I don't know if I am correct with this, other wise I will look for other answers, like I have bug in my code.
Your statement of
I suppose that 32bit can only use the first 4 giga byte, othere momery
it can not use at all.
is definitely incorrect. In a 64-bit OS, all applications can use all of the memory, regardless of what bitness it is, thanks to the translation table for virtual to physical memory being 64-bit.
Some really ancient hardware may not allow DMA to addresses above 4GB, but I really hope most of that is in the junk-yard by now.
If the system as a whole is running low on memory, it will affect all applications more or less equally.
However, a 32-bit application can only, by default, use the lower 2GB of the virtual address range (although these 2GB can be placed anywhere in the physical memory, as described above by means of a 64-bit translation table). You can extend this to nearly 4GB (3GB in a 32-bit OS, and subject to the /3GB boot flag in this case) by using /LARGEADDRESSAWARE in your linking command - this simply tells the OS that your application will "understand" that addresses can be negative, and thus will operate correctly with addresses over 2GB.
Any system can be brought down by a too heavy load.
But in normal use in Windows and any other virtual memory OS, the memory consumption of other programs does not much affect any given program execution.
Getting an out of memory error is unusual, but it can happen if you make a large allocation or if you declare a large local automatic variable. It can also happen if you fail to properly deallocate memory that's no longer used, i.e. if the program is leaking memory. For a 32-bit program on a 64-bit machine it's then not memory itself that's used up, but available address space within the program.
I have a 32-bit application consisting one EXE and multiple DLLs. The EXE has been built with /LARGEADDRESSAWARE flag set. So I expect on a 64-bit OS I should get 4 GB of user address space. But on some 64-bit Win 7 systems I am getting only 2 GB of user address space.
The physical memory is 8 GB if that matters. What could be reason for this behavior?
After browsing through MSDN, I found the following:
On http://msdn.microsoft.com/en-us/library/windows/desktop/aa366770(v=vs.85).aspx (the page for MEMORYSTATUSEX which is used by GlobalMemoryStatusEx (http://msdn.microsoft.com/en-us/library/windows/desktop/aa366589(v=vs.85).aspx) ) the description for ullTotalVirtual is:
this value is approximately 2 GB for most 32-bit processes on an x86 processor and approximately 3 GB for 32-bit processes that are large address aware running on a system with 4-gigabyte tuning enabled.
The 4GB tuning page is: http://msdn.microsoft.com/en-us/library/windows/desktop/bb613473(v=vs.85).aspx and it says something like:
On 64-bit editions of Windows, 32-bit applications marked with the IMAGE_FILE_LARGE_ADDRESS_AWARE flag have 4 GB of address space available.
Itanium editions of Windows Server 2003: Prior to SP1, 32-bit processes have only 2 GB of address space available.
Also, the memory limits page (http://msdn.microsoft.com/en-us/library/aa366778.aspx#memory_limits) can come handy if you want to determine the total memory your system supports.
However the real useful information comes from Mark Russinowich's blog: http://blogs.technet.com/b/markrussinovich/archive/2008/07/21/3092070.aspx
While 4GB is the licensed limit for 32-bit client SKUs, the effective limit is actually lower and dependent on the system's chipset and connected devices. The reason is that the physical address map includes not only RAM, but device memory as well, and x86 and x64 systems map all device memory below the 4GB address boundary to remain compatible with 32-bit operating systems that don't know how to handle addresses larger than 4GB.
So the conclusion is that yes, this might depend on the configuration of the system. Maybe you can complete your question with a table, with the amount of the memory you get on each system and some important system configuration settings, and we might discover a pattern in this case.
The problem is that an Application has a whole has to be large Adress aware - so that pointers are to be treated as unsigned.
If however on "some" system some of your used DLL is not large adressaware this renders your whole program not large address aware.
http://blogs.msdn.com/b/oldnewthing/archive/2010/09/22/10065933.aspx
Reading through the OpenGL Programming Guide, 8th Edition.
This is really a hardware question, actually...
I come to a section on OpenGL buffers, and as far as I understand they are memory spaces allocated in graphics card memory, is this correct?
If so, how are we able to get a pointer to read or modify that memory using glMapBuffer() ? As far as I was aware, all possible memory addresses (eg on a 64bit system there are uint64_t num = 0x0; num = ~num; possible addresses) were used for system memory as in RAM / CPU side Memory.
glMapBuffers() returns a void* to some memory. How can that pointer point to memory inside the graphics card? Particularly if I had a 32 bit system, and more than 4GB of RAM, and then a Graphics Card with say 2GB/4GB of memory. Surely there aren't enough addresses?!
This is really a hardware question, actually...
No it's not. You'll see why in a moment.
I come to a section on OpenGL buffers, and as far as I understand they are memory spaces allocated in graphics card memory, is this correct?
Not quite. You must understand that while OpenGL gets you really close to the actual hardware, you're still very far from touching it directly. What glMapBuffer does is, that it sets up a virtual address range mapping. On modern computer systems the software doesn't operate on physical addresses. Instead a virtual address space (of some size) is used. This virtual address space looks like one large contiguous block of memory to the software, while in fact its backed by a patchwork of physical pages. Those pages can be implemented anyhow, they can be actual physical memory, they can be I/O memory, they can even be created in-situ by another program. The mechanism for that is provided by the CPU's Memory Management Unit in collaboration with the OS.
So for each process the OS manages a table of which part of the process virtual address space maps to what page handler. If you're running Linux have a look at /proc/$PID/maps. If you have a program that uses glMapBuffer read (with your program, don't call system) /proc/self/maps before and after the map buffer and look for the differences.
As far as I was aware, all possible memory addresses (eg on a 64bit system there are uint64_t num = 0x0; num = ~num; possible addresses) were used for system memory as in RAM / CPU side Memory.
What makes you think that? Whoever told you that (if somebody told you that) should be slapped in the face… hard.
What you have is a virtual address space. And this address space is completely different from the physical address space on the hardware side. In fact the size of the virtual address space and the size of physical address spaces can differ largely. For example for a long time there were 32 bit CPUs and 32 bit operating systems around. But already then it was desireable to have more than 4 GiB of system memory. So while the CPU would support only 32 bits of address space for a process (maximum size of a pointer), it may have provided 36 bits of physical address lines to memory, to support some 64 GiB of system RAM; it would then be the OS's job to manually switch those extra bits, so that while each process sees only some 3 GiB of system RAM (max.) processes in total could spread. A technique like that has become known as Physical Address Extension (PAE).
Furthermore not all of the address space in a process are backed by RAM. Like I already explained, address space mappings could be backed by anything. Often the memory pagefault handler will also implement swapping, i.e. if there's not enough free RAM around it will use HDD storage (in fact on Linux all userspace requests for memory are backed by the Disk I/O Cache handler). Also since the address space mappings are per process, some part of the address space is mapped kernel memory, which is the (physically) same for all processes and also resides at the same place in all processes. From user space this address space mapping is not accessible, but as soon as a syscall makes a transistion into kernel space it gets accessible; yes the OS kernel uses virtual memory internally, too. It just can't choose as broadly from the available backings (for example it would be very difficult for a network driver to operate, if its memory was backed by the network itself).
Anyway: On modern 64 bit systems you got a 64 bit pointer size, and with current hardware there are 48 bits of physical RAM address lines. Which leaves plenty of space, namely 16 × 48 bits (EDIT which means 2^16 - 1 times a 48 bit address space), for virtual mappings where there's no RAM around. And because there's so much to go around, each and every PCI card gets its very own address space, that behaves a little bit like RAM to the CPU (remember those PAE I mentioned earlier, well, in good old 32 bit times something like that had to be done to talk with extension cards already).
Now here comes the OpenGL driver. It simply provides a new address mapping handler, that usually just builds on top of the PCI address space handler, which will map a portion of virtual address space of a process. And whatever happens in that address space will be reflected by that mapping handler into a buffer ultimately accessed by the GPU. However the GPU itself may be accessing CPU memory directly. And what AMD plans is, that GPU and CPU will live on the same Die and access the same memory, so there's no longer a physical distinction there.
glMapBuffers() returns a pointer in the virtual memory space of the application, that's why the pointer could point in something above 4GB on a 64bits system.
The memory you manipulate with the mapped pointer could be a cpu copy (shadow) of the texture(or buffer) allocated on gpu or it could be the actual texture moved to system memory. It's often the operating system who decides if a texture resides on system memory or gpu memory. The operating system can move the texture from one location to another and can make a copy of it (shadow)
Computers have multiple layers of memory mapping. First all physically addressable components (including RAM, PCI memory windows, other device registers, etc) are all assigned a physical address. The size of this address varies. Even on 32-bit Intel devices the physical space might be a 36-bit address (to the CPU) while it is a 64-bit address in the PCI memory space.
On top of that mapping is a virtual mapping. There are many different virtual mappings, including one for each process. Within that space (which on a 64-bit system is as huge as you say) any combination (including repeats!) of physical space can be mapped as long as it fits. On a 64-bit system the available virtual space is so large that every possible physical device could easily be mapped.
can I provide extra space to the process other than provided by the operating system.
Can extra detachable memory be used for such purposes.
can I provide extra space to the process other than provided by the
operating system.
No you cant, for every piece of memory you have to request your OS.malloc(), new and other memory allocating functions and operator resolve as a system call that request OS for memory to be provided to the program.
Every process has a definite maximum memory space allocated to it, that depends on the machine architecture. On a 32-bit machine, the maximum addressable space is 2^32 bytes ~= 4GB. Hence a process should be able to address 4 GB of memory typically. But this space is divided into two parts, 1. Kernel Space and 2. Process Space. Kernel space is used for OS drivers etc while Process space is the space where your data can be allocated. Hence the memory available to you is just the Process space.
On a typical Windows XP machine, it is equally divided. i.e. 2 GB for process space (However, there are ways to modify this. For example, the /3G option). Any allocation beyond 2 GB gives a out of memory error.This process space becomes more when you move from a 32-bit application to a 64-bit application. This is one of the major incentives for moving to 64-bit applications.
So to answer your question, there is a maximum memory available to a process beyond which the OS denies memory allocations to the process.
There are some obscure ways. E.g. if you would attach a Windows CE device to a Windows PC, the memory of that device could be accessed via the "RAPI" interface. The Windows OS wouldn't be aware of this device memory; this was handles via the ActiveSync service. It wasn't very quick memory, though.
With very large amounts of ram these days I was wondering, it is possible to allocate a single chunk of memory that is larger than 4GB? Or would I need to allocate a bunch of smaller chunks and handle switching between them?
Why???
I'm working on processing some openstreetmap xml data and these files are huge. I'm currently streaming them in since I can't load them all in one chunk but I just got curious about the upper limits on malloc or new.
Short answer: Not likely
In order for this to work, you absolutely would have to use a 64-bit processor.
Secondly, it would depend on the Operating System support for allocating more than 4G of RAM to a single process.
In theory, it would be possible, but you would have to read the documentation for the memory allocator. You would also be more susceptible to memory fragmentation issues.
There is good information on Windows memory management.
A Primer on physcal and virtual memory layouts
You would need a 64-bit CPU and O/S build and almost certainly enough memory to avoid thrashing your working set. A bit of background:
A 32 bit machine (by and large) has registers that can store one of 2^32 (4,294,967,296) unique values. This means that a 32-bit pointer can address any one of 2^32 unique memory locations, which is where the magic 4GB limit comes from.
Some 32 bit systems such as the SPARCV8 or Xeon have MMU's that pull a trick to allow more physical memory. This allows multiple processes to take up memory totalling more than 4GB in aggregate, but each process is limited to its own 32 bit virtual address space. For a single process looking at a virtual address space, only 2^32 distinct physical locations can be mapped by a 32 bit pointer.
I won't go into the details but This presentation (warning: powerpoint) describes how this works. Some operating systems have facilities (such as those described Here - thanks to FP above) to manipulate the MMU and swap different physical locations into the virtual address space under user level control.
The operating system and memory mapped I/O will take up some of the virtual address space, so not all of that 4GB is necessarily available to the process. As an example, Windows defaults to taking 2GB of this, but can be set to only take 1GB if the /3G switch is invoked on boot. This means that a single process on a 32 bit architecture of this sort can only build a contiguous data structure of somewhat less than 4GB in memory.
This means you would have to explicitly use the PAE facilities on Windows or Equivalent facilities on Linux to manually swap in the overlays. This is not necessarily that hard, but it will take some time to get working.
Alternatively you can get a 64-bit box with lots of memory and these problems more or less go away. A 64 bit architecture with 64 bit pointers can build a contiguous data structure with as many as 2^64 (18,446,744,073,709,551,616) unique addresses, at least in theory. This allows larger contiguous data structures to be built and managed.
The advantage of memory mapped files is that you can open a file much bigger than 4Gb (almost infinite on NTFS!) and have multiple <4Gb memory windows into it.
It's much more efficent than opening a file and reading it into memory,on most operating systems it uses the built-in paging support.
This shouldn't be a problem with a 64-bit OS (and a machine that has that much memory).
If malloc can't cope then the OS will certainly provide APIs that allow you to allocate memory directly. Under Windows you can use the VirtualAlloc API.
it depends on which C compiler you're using, and on what platform (of course) but there's no fundamental reason why you cannot allocate the largest chunk of contiguously available memory - which may be less than you need. And of course you may have to be using a 64-bit system to address than much RAM...
see Malloc for history and details
call HeapMax in alloc.h to get the largest available block size
Have you considered using memory mapped files? Since you are loading in really huge files, it would seem that this might be the best way to go.
It depends on whether the OS will give you virtual address space that allows addressing memory above 4GB and whether the compiler supports allocating it using new/malloc.
For 32-bit Windows you won't be able to get single chunk bigger than 4GB, as the pointer size is 32-bit, thus limiting your virtual address space to 4GB. (You could use Physical Address Extension to get more than 4GB memory; however, I believe you have to map that memory into the virtualaddress space of 4GB yourself)
For 64-bit Windows, the VC++ compiler supports 64-bit pointers with theoretical limit of the virtual address space to 8TB.
I suspect the same applies for Linux/gcc - 32-bit does not allow you, whereas 64-bit allows you.
As Rob pointed out, VirtualAlloc for Windows is a good option for this, as is an anonymouse file mapping. However, specifically with respect to your question, the answer to "if C or C++" can allocate, the answer is NO THIS IS NOT SUPPORTED EVEN ON WIN7 RC 64
In the PE/COFF specification for exe files, the field which specifies the HEAP reserve and HEAP commit, is a 32 bit quantity. This is in-line with the physical size limitations of the current heap implmentation in the windows CRT, which is just short of 4GB. So, there is no way to allocate more than 4GB from C/C++ (technicall the OS support facilities of CreateFileMapping and VirtualAlloc/VirtualAllocNuma etc... are not C or C++).
Also, BE AWARE that there are underlying x86 or amd64 ABI construct's known as the page table's. This WILL in effect do what you are concerened about, allocating smaller chunks for your larger request, even though this is happining in kernel memory, there is an effect on the overall system, these tables are finite.
If you are allocating memory in such grandious purportions, you would be well advised to allocate based on the allocation granularity (which VirtualAlloc enforces) and also to identify optional flags's or methods to enable larger pages.
4kb pages were the initial page size for the 386, subsaquently the pentium added 4MB. Today, the AMD64 (Software Optimization Guide for AMD Family 10h Processors) has a maximum page table entry size of 1GB. This mean's for your case here, let's say you just did 4GB, it would require only 4 unique entries in the kernel's directory to locate\assign and permission your process's memory.
Microsoft has also released this manual that articulates some of the finer points of application memory and it's use for the Vista/2008 platform and newer.
Contents
Introduction. 4
About the Memory Manager 4
Virtual Address Space. 5
Dynamic Allocation of Kernel Virtual
Address Space. 5
Details for x86 Architectures. 6
Details for 64-bit Architectures. 7
Kernel-Mode Stack Jumping in x86
Architectures. 7
Use of Excess Pool Memory. 8
Security: Address Space Layout
Randomization. 9
Effect of ASLR on Image Load
Addresses. 9
Benefits of ASLR.. 11
How to Create Dynamically Based
Images. 11
I/O Bandwidth. 11
Microsoft SuperFetch. 12
Page-File Writes. 12
Coordination of Memory Manager and
Cache Manager 13
Prefetch-Style Clustering. 14
Large File Management 15
Hibernate and Standby. 16
Advanced Video Model 16
NUMA Support 17
Resource Allocation. 17
Default Node and Affinity. 18
Interrupt Affinity. 19
NUMA-Aware System Functions for
Applications. 19
NUMA-Aware System Functions for
Drivers. 19
Paging. 20
Scalability. 20
Efficiency and Parallelism.. 20
Page-Frame Number and PFN Database. 20
Large Pages. 21
Cache-Aligned Pool Allocation. 21
Virtual Machines. 22
Load Balancing. 22
Additional Optimizations. 23
System Integrity. 23
Diagnosis of Hardware Errors. 23
Code Integrity and Driver Signing. 24
Data Preservation during Bug Checks. 24
What You Should Do. 24
For Hardware Manufacturers. 24
For Driver Developers. 24
For Application Developers. 25
For System Administrators. 25
Resources. 25
If size_t is greater than 32 bits on your system, you've cleared the first hurdle. But the C and C++ standards aren't responsible for determining whether any particular call to new or malloc succeeds (except malloc with a 0 size). That depends entirely on the OS and the current state of the heap.
Like everyone else said, getting a 64bit machine is the way to go. But even on a 32bit machine intel machine, you can address bigger than 4gb areas of memory if your OS and your CPU support PAE. Unfortunately, 32bit WinXP does not do this (does 32bit Vista?). Linux lets you do this by default, but you will be limited to 4gb areas, even with mmap() since pointers are still 32bit.
What you should do though, is let the operating system take care of the memory management for you. Get in an environment that can handle that much RAM, then read the XML file(s) into (a) data structure(s), and let it allocate the space for you. Then operate on the data structure in memory, instead of operating on the XML file itself.
Even in 64bit systems though, you're not going to have a lot of control over what portions of your program actually sit in RAM, in Cache, or are paged to disk, at least in most instances, since the OS and the MMU handle this themselves.