Can I limit the hugepages size of DPDK?

I have two programs; one is based on DPDK, and both use hugepages. But DPDK grabs all of the hugepages by default, and I can't find any documentation on how to set the amount of hugepage memory DPDK may use. Is there any handy way to do this? If not, I will have to dig into the DPDK source and modify it.

A couple of caveats: up to DPDK 18.11, the -m and --socket-mem options behave as @Andriy states. But with the release of DPDK 19.11 there has been a huge revamp of the internal memory-management logic.
For versions newer than 18.11, use --legacy-mem to emulate the older memory model. If legacy mode is not an option, use --socket-limit instead.
Assuming the platform is x86 (hence 2MB and 1GB hugepages): as far as I recall, the 2MB pool can be changed dynamically, while 1GB pages cannot be allocated dynamically. Hence any hugepage options passed on the kernel command line will result in an equal allocation across all NUMA nodes.

Sure, there are a few command line options. The easiest is -m <megabytes>, but its internal logic might be completely wrong if you have several NUMA nodes.
I recommend using --socket-mem <mbytes,mbytes,...> instead, which allows you to allocate a specific number of megabytes per NUMA node.
For more details please see: https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html
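For example, a hypothetical invocation (application name, core list, and sizes are placeholders) that pre-allocates 512MB on NUMA socket 0 and nothing on socket 1, and on newer DPDK releases additionally caps socket 0 at that amount:
./my_dpdk_app -l 0-3 -n 4 --socket-mem 512,0 --socket-limit 512,0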

To add to whatever @Andriy mentioned above, one can also set a hard limit implicitly via
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages for the 2MB hugepage size.
Similarly, it can be done for 1GB pages too, but that depends on platform support.
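For instance (the counts below are placeholders), one could reserve 1024 x 2MB hugepages in total and start the DPDK application with --socket-mem 512, leaving the remaining pages available for the other program:
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
./my_dpdk_app --socket-mem 512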

Related

Limit buffer cache used for mmap

I have a data structure that I'd like to rework to page out on-demand. mmap seems like an easy way to run some initial experiments. However, I want to limit the amount of buffer cache that the mmap uses. The machine has enough memory to page the entire data structure into cache, but for test reasons (and some production reasons too) I don't want to allow it to do that.
Is there a way to limit the amount of buffer cache used by mmap?
Alternatively, an mmap alternative that can achieve something similar and still limit memory usage would work too.
From my understanding, it is not possible. Memory mapping is controlled by the operating system. The kernel makes the decisions about how to use the available memory in the best way, but it looks at the system as a whole. I'm not aware that per-process quotas for caches are supported (at least, I have not seen such APIs in Linux or BSD).
There is madvise to give the kernel hints, but it does not support limiting the cache used by a single process. You can give hints like MADV_DONTNEED, which reduces the pressure on the cache from other applications, but I would expect it to do more harm than good, since it will most likely make caching less efficient and lead to more I/O load on the system overall.
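Just to illustrate the API being discussed, here is a minimal sketch (the file name is a placeholder and error handling is reduced to early returns) that maps a file and then hints that its pages will not be needed soon:

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main() {
    int fd = open("data.bin", O_RDONLY);                 // hypothetical input file
    if (fd < 0) return 1;
    struct stat st;
    if (fstat(fd, &st) != 0) return 1;
    void *p = mmap(nullptr, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return 1;
    // ... read the mapped data here ...
    madvise(p, st.st_size, MADV_DONTNEED);               // hint: we will not need these pages soon
    munmap(p, st.st_size);
    close(fd);
    return 0;
}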
I see only two alternatives. One is trying to solve the problem at the OS level, and the other is to solve it at the application level.
At the OS level, I see two options:
You could run a virtual machine, but most likely this is not what you want. I would also expect that it will not improve the overall system performance. Still, it would be at least a way to define upper limits on the memory consumption.
Docker is another idea that comes to mind; it also operates at the OS level, but to the best of my knowledge it does not support defining cache quotas, so I don't think it will work.
That leaves only one option, which is to look at the application level. Instead of using memory mapped files, you could use explicit file system operations. If you need to have full control over the buffer, I think it is the only practical option. It is more work than memory mapping, and it is also not guaranteed to perform better.
If you want to stay with memory mapping, you could also map only parts of the file into memory and unmap other parts when you exceed your memory quota. This has the same problems as explicit file I/O operations (more implementation work and non-trivial tuning to find a good caching strategy).
Having said that, you could question the requirement to limit the cache memory usage. I would expect that the kernel does a pretty good job of allocating memory resources in a sensible way; at least, it will likely do better than the solutions I have sketched. (Explicit file I/O plus an internal cache might be fast, but it is not trivial to implement and tune. Here is a comparison of the trade-offs: mmap() vs. reading blocks.)
During testing, you could run the application with ionice -c 3 and nice -n 20 to somewhat reduce the impact on the other productive applications.
There is also a tool called nocache. I have never used it, but reading through its documentation, it seems related to your question.
It might be possible to accomplish this through the use of mmap() and Linux control groups (cgroups). Once the cgroup tools are installed, you can create arbitrary limits on, among other things, the amount of physical memory used by a process. As an example, here we limit physical memory to 128 MB and physical+swap memory to 256 MB:
cgcreate -g memory:/limitMemory
echo $(( 128 * 1024 * 1024 )) > /sys/fs/cgroup/memory/limitMemory/memory.limit_in_bytes
echo $(( 256 * 1024 * 1024 )) > /sys/fs/cgroup/memory/limitMemory/memory.memsw.limit_in_bytes
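To actually run a program under that limit you can either write its PID into the cgroup's tasks file, or, if your distribution's cgroup-tools (libcgroup) package provides cgexec, launch it directly in the group (program name is a placeholder):
cgexec -g memory:limitMemory ./myprogram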
I would go the route of only mapping parts of the file at a time, so you retain full control over exactly how much memory is used.
You may use an IPC shared memory segment; then you are the master of your memory segments.

How to get buffered/cached memory size in C++ under Linux?

I want to warn the user when memory available is low. Currently I'm using sysconf(_SC_PHYS_PAGES) to get the number of physical pages available.
However, there is also memory that the OS uses as buffer and cache. How do I obtain them programmatically?
The way the free command from procps does it is by reading /proc/meminfo; you can look at the procps source to see how. Its meminfo function updates globals, in particular kb_main_buffers and kb_main_cached. You could probably reuse their code to do what you want (assuming your license is compatible).
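If you would rather not pull in the procps code, a minimal sketch that parses the Buffers: and Cached: lines from /proc/meminfo directly (values are reported in kB) could look like this:

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::ifstream meminfo("/proc/meminfo");
    std::string line;
    long buffers_kb = 0, cached_kb = 0;
    while (std::getline(meminfo, line)) {
        std::istringstream iss(line);
        std::string key;
        long value = 0;
        iss >> key >> value;                      // e.g. "Buffers:" 123456 "kB"
        if (key == "Buffers:") buffers_kb = value;
        else if (key == "Cached:") cached_kb = value;
    }
    std::cout << "buffers: " << buffers_kb << " kB, cached: " << cached_kb << " kB\n";
    return 0;
}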

JVM minimum heap size recommendation reasons?

BEA recommends keeping the min and max heap sizes the same. They didn't elaborate on the reason for the suggestion. Can someone provide details?
I also got a different recommendation from an architect: don't set anything for the minimum and just set the maximum. Any comments on this? If I don't set it, what is the default?
What is the best tool to monitor and tune JVM settings? I am using JDK 1.6 on BEA WebLogic 10g, with a 32-bit JVM on Linux.
Is a max heap size of 2GB any good? The server has lots of RAM. Currently it is set to 1.5GB, and usage sits at 80% with 40 concurrent users.
Thanks,
When the JVM needs to increase the size of the heap, it will invoke a full garbage collection, which may reduce throughput or cause a pause, so I would think this is recommended by them for performance reasons.
The default value is documented as 2MB, so if you don't override it you are likely to get a lot of (probably very quick) full collections after startup as the heap is repeatedly resized.
Unless you are trying to keep the memory footprint as small as possible, I would follow BEA's advice.
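For example (heap size and application name are placeholders), pinning the heap so it never has to grow:
java -Xms1536m -Xmx1536m -jar myapp.jar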
Impossible to say from the information given whether 2GB is appropriate, or whether the objects using up that space are still reachable; the old generation will just gradually fill up until it runs out of space, at which point a full collection is invoked. Where does that 80% figure come from?
Use the following JVM arguments to log GC details to a file called gc.log:
-verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log
Then you can analyse this using something like http://www.tagtraum.com/gcviewer.html

Limit physical memory per process

I am writing an algorithm to perform some external memory computations, i.e. where your input data does not fit into main memory and you have to consider the I/O complexity.
Since for my tests I do not always want to use real inputs, I want to limit the amount of memory available to my process. What I have found is that I can set the mem kernel parameter to limit the physical memory used by all processes combined (is that correct?)
Is there a way to do the same, but with a per-process limit? I have seen ulimit, but it only limits virtual memory per process. Any ideas (maybe I can even set it programmatically from within my C++ code)?
You can try cgroups.
To use them, type the following commands as root.
# mkdir /dev/cgroups
# mount -t cgroup -omemory memory /dev/cgroups
# mkdir /dev/cgroups/test
# echo 10000000 > /dev/cgroups/test/memory.limit_in_bytes
# echo 12000000 > /dev/cgroups/test/memory.memsw.limit_in_bytes
# echo <PID> > /dev/cgroups/test/tasks
where <PID> is the PID of the process you want to add to the cgroup. Note that the limit applies to the sum of all the processes assigned to this cgroup.
From this moment on, the processes are limited to 10MB of physical memory and 12MB of physical+swap.
There are other tunable parameters in that directory, but the exact list will depend on the kernel version you are using.
You can even make hierarchies of limits, just creating subdirectories.
The cgroup is inherited when you fork/exec, so if you add the shell from where your program is launched to a cgroup it will be assigned automatically.
Note that you can mount the cgroups in any directory you want, not just /dev/cgroups.
I can't provide a direct answer, but for this kind of work I usually write my own memory management system so that I have full control over the memory area and how much I allocate. This is usually applicable when you're writing for microcontrollers as well. Hope it helps.
I would use setrlimit with the RLIMIT_AS parameter to set the limit on virtual memory (this is what ulimit does), and then have the process call mlockall(MCL_CURRENT|MCL_FUTURE) to force the kernel to fault in and lock all of the process's pages into physical RAM, so that the amount of virtual memory equals the amount of physical memory for this process.
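A minimal sketch of that idea (the 64MB limit is a placeholder; note that mlockall is itself subject to RLIMIT_MEMLOCK and usually needs appropriate privileges):

#include <sys/mman.h>
#include <sys/resource.h>
#include <cstdio>

int main() {
    // Cap the virtual address space at 64MB (placeholder value).
    struct rlimit rl;
    rl.rlim_cur = 64UL * 1024 * 1024;
    rl.rlim_max = 64UL * 1024 * 1024;
    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }
    // Lock all current and future pages into RAM, so virtual usage ~= physical usage.
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return 1;
    }
    // ... run the external-memory algorithm here ...
    return 0;
}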
Have you considered trying your code in some kind of virtual environment? A virtual machine might be too much for your needs, but something like User-Mode Linux could be a good fit. It runs a Linux kernel as a single process inside your regular operating system. Then you can provide a separate mem= kernel setting, as well as a separate swap space, to make controlled experiments.
The kernel mem= boot parameter limits how much memory the OS will use in total.
This is almost never what the user wants.
For per-process physical memory there is the RSS rlimit (RLIMIT_RSS), but note that modern Linux kernels do not actually enforce it; RLIMIT_AS (what ulimit -v sets) limits the virtual address space instead.
As other posters have already indicated, setrlimit is the most likely solution; it controls the limits of all configurable aspects of a process environment. Use this command to see the individual settings for your shell process:
ulimit -a
The ones most pertinent to your scenario in the resulting output are as follows:
data seg size (kbytes, -d) unlimited
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
virtual memory (kbytes, -v) unlimited
Check out the manual page for setrlimit ("man setrlimit"); it can be invoked programmatically from your C/C++ code. I have used it to good effect in the past for controlling stack size limits. (By the way, there is no dedicated man page for ulimit; it is actually a bash built-in, so it is documented in the bash man page.)

How to profile memory usage?

I am aware of Valgrind, but it just detects memory management issues. What I am searching is a tool that gives me an overview, which parts of my program do consume how much memory. A graphical representation with e.g. a tree map (as KCachegrind does for Callgrind) would be cool.
I am working on a Linux machine, so Windows tools will not help me very much.
Use massif, which is part of the Valgrind tools. massif-visualizer can help you graph the data or you can just use the ms_print command.
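Typical usage looks something like this (program name is a placeholder; Massif writes its data to massif.out.<pid>):
valgrind --tool=massif ./myprogram
ms_print massif.out.12345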
Try out the heap profiler delivered with gperftools, by Google. I've always built it from sources, but it's available as a precompiled package under several Linux distros.
It's as simple to use as linking a dynamic library into your executables and running the program. It collects information about every dynamic memory allocation (as far as I've seen) and saves a memory dump to disk every time one of the following happens:
HEAP_PROFILE_ALLOCATION_INTERVAL bytes have been allocated by the program (default: 1GB)
the high-water memory usage mark increases by HEAP_PROFILE_INUSE_INTERVAL bytes (default: 100MB)
HEAP_PROFILE_TIME_INTERVAL seconds have elapsed (default: inactive)
You explicitly call HeapProfilerDump() from your code
The last one, in my experience, is the most useful because you can control exactly when to have a snapshot of the heap usage and then compare two different snapshots and see what's wrong.
Finally, there are several possible output formats, either textual or graphical (in the form of a directed graph).
Using this tool I've been able to spot incorrect memory usages that I couldn't find using Massif.
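For reference, one common way to run it without modifying the code is to preload the tcmalloc library and set a dump prefix, then inspect the dumps with pprof (the library path and file names are placeholders for whatever your distribution installs):
LD_PRELOAD=/usr/lib/libtcmalloc.so HEAPPROFILE=/tmp/myprog.hprof ./myprogram
pprof --text ./myprogram /tmp/myprog.hprof.0001.heap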
A "newer" option is HeapTrack. Contrary to massif, it is an instrumented version of malloc/free that stores all the calls and dumps a log.
The GUI is nice (but requires Qt5 IIRC) and the results timings (because you may want to track time as well) are less biased than valgrind (as they are not emulated).
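Basic usage is along these lines (program name and the generated file name are placeholders):
heaptrack ./myprogram
heaptrack_gui heaptrack.myprogram.12345.gz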
Use the callgrind tool with Valgrind.