Shared memory and performance [closed] - c++

What is the performance penalty when accessing a data structure if it is located:
In the same process's memory block.
In a shared memory block (including locking, but assuming
no other process accesses it for a significant amount of time).
I am interested in approximate comparison values (e.g., percentages) for access, read, and write.

All your process memory is memory-mapped. It does not matter whether one or more processes map the same physical pages of memory; there is no difference in access speed in this regard.
What matters is whether the memory is located on the local or a remote NUMA node.
See the NUMA benchmarks in Challenges of Memory Management on Modern NUMA Systems.
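To make the point concrete, here is a minimal sketch (POSIX, Linux-style) of mapping a shared memory object with shm_open and mmap; the name "/demo_shm" and the size are arbitrary placeholders. Once the mapping exists, reads and writes are ordinary loads and stores, just as with private process memory:

```cpp
// Minimal sketch: map a POSIX shared memory object and access it like
// ordinary memory. "/demo_shm" and the size are arbitrary choices.
// On Linux, link with -lrt if your glibc requires it.
#include <fcntl.h>      // shm_open, O_* flags
#include <sys/mman.h>   // mmap, munmap, shm_unlink
#include <unistd.h>     // ftruncate, close
#include <cstring>
#include <cstdio>

int main() {
    const size_t kSize = 4096;
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, kSize) != 0) { perror("ftruncate"); return 1; }

    void* p = mmap(nullptr, kSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    // Once mapped, access is a plain load/store; the CPU does not care
    // whether another process maps the same physical pages.
    std::strcpy(static_cast<char*>(p), "hello from shared memory");
    std::printf("%s\n", static_cast<char*>(p));

    munmap(p, kSize);
    close(fd);
    shm_unlink("/demo_shm");  // remove the object when done
    return 0;
}
```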

Related

What are memory mapped files? [closed]

Recently, I came across this video that shows how to use mmap() with file I/O. However, I can't find his video that documents the function. I don't understand what it is, why it exists, or how it relates to files.
Too much of the jargon flies over my head for me to make sense of it. I had the same problem with sites like Wikipedia.
Files are arrays of bytes stored in a filesystem.
"Memory" in this case is an array of bytes stored in RAM.
Memory mapping is something an operating system does: it gives some range of bytes in a process's address space a special meaning.
A memory-mapped file is generally a file in the filesystem that the operating system has mapped to some range of bytes in a process's memory. When the process writes to that memory range, the operating system takes care that the bytes are written to the file, and when the process reads from it, the operating system takes care that the file's contents are read.
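As an illustration, here is a minimal sketch of reading a file through mmap() instead of read(); the path "example.txt" is a placeholder, and error handling is kept to the bare minimum:

```cpp
// Minimal sketch: read a file through mmap instead of read().
// "example.txt" is a placeholder path.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int fd = open("example.txt", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) { return 1; }  // mmap of an empty file fails

    // Map the whole file read-only into our address space.
    void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    // The file's bytes are now addressable like an in-memory array;
    // the OS pages them in from disk on demand.
    fwrite(p, 1, st.st_size, stdout);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```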

Is the HEAP a term for RAM, processor memory, or BOTH? And how many unions can I allocate at once? [closed]

I was hoping someone could lay down some schooling about the whole heap and stack ordeal. I am trying to make a program that creates about 20,000 instances of just one union, and some day I may want to implement a much larger program. Beyond my current project, which consists of at most 20,000 unions stored wherever C++ will allocate them, do you think I could up the ante into the millions (approximately 1,360,000 or so) while retaining a reasonable return speed on function calls? And how do you think it will handle 20,000?
The heap is an area used for dynamic memory allocation.
It's usually used to allocate space for collections of variable size, and/or to allocate a large amount of memory. It's definitely not the CPU registers.
Beyond this, I think there is no guarantee about what the heap is.
It may be RAM, processor cache, or even HDD storage (via swapping). Let the OS and hardware decide what it will be in a particular case.
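For a sense of scale, here is a minimal sketch under the question's assumptions; the union type "Value" is invented for illustration. 20,000 instances of an 8-byte union occupy only about 160 KB of heap, and even 1,360,000 instances come to roughly 10 MB, which is trivial on a modern machine:

```cpp
// Minimal sketch: 20,000 instances of a hypothetical union, heap-allocated.
// "Value" is an invented example type, not from the question.
#include <vector>
#include <cstdio>

union Value {
    int   i;
    float f;
    char  bytes[8];
};

int main() {
    // std::vector stores its elements on the heap. 20,000 * 8 bytes is
    // about 160 KB; 1,360,000 * 8 bytes is about 10.4 MB.
    std::vector<Value> values(20000);
    values[0].i = 42;
    std::printf("first element: %d, total bytes: %zu\n",
                values[0].i, values.size() * sizeof(Value));
    return 0;
}
```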

TCP vs Shared Memory? [closed]

I understand that if shared memory is used correctly, it can be faster than any other kind of IPC. My question is a bit more specific: if I transfer many small packets, e.g. 100 bytes, from different programs to one main program, what kind of speed difference can I expect?
The benefit from using shared memory will not be that large, because you will end up using condition variables on the shared memory (cf. pthread_condattr_setpshared; it will be substantial coding work, by the way). Your logic is then governed by the OS scheduler, which is not very different from using a localhost TCP connection; on most OSes, localhost has a separate, fast implementation distinct from standard TCP.
If it's OK to rely entirely on a spinlock on the shared memory, then you will indeed realize a substantial speedup, something like 3x.
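For reference, here is a minimal sketch of the setup this answer alludes to: initializing a mutex and condition variable with the PTHREAD_PROCESS_SHARED attribute so they can live in shared memory and synchronize separate processes. The struct layout is invented for illustration, and the shm_open/mmap plumbing and error handling are omitted:

```cpp
// Minimal sketch: a mutex and condition variable placed in shared memory
// and marked process-shared. "shm" is assumed to point into a mapping
// that both processes have mapped (e.g., via shm_open + mmap).
#include <pthread.h>

struct SharedQueueCtl {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             ready;  // example payload flag
};

void init_shared_ctl(SharedQueueCtl* shm) {
    pthread_mutexattr_t mattr;
    pthread_mutexattr_init(&mattr);
    pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&shm->mutex, &mattr);
    pthread_mutexattr_destroy(&mattr);

    pthread_condattr_t cattr;
    pthread_condattr_init(&cattr);
    pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
    pthread_cond_init(&shm->cond, &cattr);
    pthread_condattr_destroy(&cattr);

    shm->ready = 0;
}
```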

How to allocate resources to 2 VMs on a dedicated server for maximum performance? [closed]

I have a dedicated server with 32 GB of RAM.
This server exists solely to host two virtual machines (VMware vSphere). It does nothing else.
How much memory can I allocate to each of these machines for maximum performance?
Not sure I understand your question fully, but I guess you mean: when I have 32 GB of RAM on my ESXi host, how much is left to run VMs in?
For running ESXi on a 32 GB host, my guess is that you will lose about 500 MB to the hypervisor.
Per VM there will be some memory overhead, which depends on the number of vCPUs and the amount of RAM assigned; see: http://pubs.vmware.com/vsphere-4-esx-vcenter/topic/com.vmware.vsphere.resourcemanagement.doc_41/managing_memory_resources/r_overhead_memory_on_virtual_machines.html
When the VMs are NOT 64-bit and are NOT using large pages, you will have Transparent Page Sharing, which means memory deduplication: if duplicate small pages of memory (4 KB) are found in physical ESXi memory, each such page is stored only once. This can save quite a lot of memory.
Read my blogpost on memory overcommit: http://www.gabesvirtualworld.com/memory-overcommit-in-production-yes-yes-yes/
Also read up on memory compression in ESXi. Search my blog for "Memory management and compression" (I can only post two links in this answer because I don't have enough reputation yet).
If you read the links, you'll understand why there is not just one answer :-)

How does a circularBuffer improve performance vs. competing for a particular memory address (where a mutex suspends the other)? [closed]

In theory, a memory circularBuffer sounds like a good idea... the setting and the getting are never at the same address. However, the limiting factor is the hardware: the computer will only allow us to access one memory location at a time. So how can a circularBuffer improve performance?
This link gives some reasons why circular buffers offer better performance than synchronized access to a single, shared data structure.
What hardware are you using, that only allows access to one memory location at a time?
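To illustrate the first answer's point, here is a minimal sketch of a single-producer/single-consumer ring buffer built on C++11 atomics; the names are invented for illustration. Because the producer only ever writes the head index and the consumer only ever writes the tail index, neither side blocks the other and no mutex is needed:

```cpp
// Minimal sketch: single-producer/single-consumer ring buffer using atomics.
// The producer writes only head_, the consumer writes only tail_, so the two
// sides never contend on the same variable and no mutex is required.
#include <atomic>
#include <cstddef>

template <typename T, size_t N>  // N must be a power of two
class SpscRing {
    T buf_[N];
    std::atomic<size_t> head_{0};  // written by the producer only
    std::atomic<size_t> tail_{0};  // written by the consumer only
public:
    bool push(const T& item) {
        size_t h = head_.load(std::memory_order_relaxed);
        if (h - tail_.load(std::memory_order_acquire) == N)
            return false;  // buffer full
        buf_[h & (N - 1)] = item;
        head_.store(h + 1, std::memory_order_release);  // publish the item
        return true;
    }
    bool pop(T& out) {
        size_t t = tail_.load(std::memory_order_relaxed);
        if (head_.load(std::memory_order_acquire) == t)
            return false;  // buffer empty
        out = buf_[t & (N - 1)];
        tail_.store(t + 1, std::memory_order_release);  // free the slot
        return true;
    }
};
```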