How to release the hugepages allocated by DPDK application?


I am using the DPDK-PROX application. Whenever I close it, the hugepages it allocated are not released, so I have to restart the system every time. Is there a solution?
I looked at the questions below, but they did not resolve my issue:
How to release hugepages from the crashed application
How to really free hugepages in Linux for use by a new process?
Proper Way to Release a Hugepage?
This is what I see in /proc/meminfo:
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
HugePages_Total: 1024
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
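The counters above can also be read programmatically, which is handy for checking whether pages come back after the application exits. The following is only a minimal sketch: the function names are hypothetical, and the sample text simply repeats the values quoted in the question.

```python
def parse_hugepage_counters(meminfo_text):
    """Extract the HugePages_* counters and Hugepagesize from /proc/meminfo text."""
    counters = {}
    for line in meminfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key = key.strip()
        if key.startswith("HugePages_") or key == "Hugepagesize":
            # The first token is the number; Hugepagesize is followed by "kB".
            counters[key] = int(value.split()[0])
    return counters


def hugepages_in_use_kb(counters):
    """Memory (in kB) held by hugepages that are not currently free."""
    used_pages = counters["HugePages_Total"] - counters["HugePages_Free"]
    return used_pages * counters["Hugepagesize"]


# Values copied from the question, used here as sample input.
SAMPLE = """\
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
HugePages_Total: 1024
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
"""

counters = parse_hugepage_counters(SAMPLE)
print(counters)
print(hugepages_in_use_kb(counters), "kB in use")
```

On a live system the same functions can be fed the contents of /proc/meminfo instead of the sample text.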

The behavior you are seeing is characteristic of DPDK versions before 18.05, or 18.05+ with --legacy-mem parameter. I'm going to assume the former as that is more likely.
This is happening because DPDK supports a certain flavor of multiprocessing, which allows "secondary" processes to attach to a "primary" process, even after it has already exited. Since DPDK relies on hugepage files (in hugetlbfs filesystem) to share memory between its primary and secondary processes, these hugepage files are not deleted after application exit, to allow secondary processes to use them.
There are multiple solutions to this problem. First one - you may not need to do anything at all. If you fear that DPDK will not be able to start again because you've run out of hugepages, then this is not an issue, because DPDK will clear out any unused hugepages before allocating new ones.
If you want to release this memory back to the system after shutting down the process, you can do so manually. To do that, you have to know where the hugepages are stored (i.e. where your hugetlbfs is mounted); on many distros it is mounted at /dev/hugepages by default. If you go there and delete all of the rte* files (with the default DPDK file prefix they are typically named rtemap_<N>; this assumes you're using the default prefix, which you probably are), all of the hugepages will be freed back to the system.
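The manual cleanup can be scripted. This is a rough sketch rather than a DPDK-provided tool: the function name and its parameters are hypothetical, the /dev/hugepages default follows the text above, and the `<prefix>map_*` file naming is an assumption about how DPDK names its backing files, so check what is actually in your hugetlbfs mount first. Only run it once no DPDK process is still using the files; it will usually need root.

```python
import glob
import os


def release_stale_hugepages(mount_dir="/dev/hugepages", file_prefix="rte", dry_run=True):
    """Delete leftover DPDK hugepage files so the kernel can reclaim the pages.

    Assumption: backing files are named "<file_prefix>map_<N>" inside mount_dir.
    With dry_run=True nothing is deleted; the matching paths are only reported.
    """
    pattern = os.path.join(glob.escape(mount_dir), glob.escape(file_prefix) + "map_*")
    matched = sorted(glob.glob(pattern))
    if not dry_run:
        for path in matched:
            os.unlink(path)
    return matched


if __name__ == "__main__":
    # Dry run by default: list what would be removed without touching anything.
    for path in release_stale_hugepages():
        print("would remove", path)
```

Calling `release_stale_hugepages(dry_run=False)` performs the actual deletion; afterwards HugePages_Free in /proc/meminfo should rise again.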
Finally, if you don't care about multiprocessing, you may use --huge-unlink EAL parameter[1] when running DPDK. This will make it so that whatever hugepages DPDK allocates, it will remove the files right afterwards. Then, when application closes, the hugepages will be automatically released by virtue of closing the file handles.
In newer versions of DPDK, this is less of a problem, because DPDK can dynamically scale its memory usage up and down [2]. I'm not familiar with DPDK-PROX, so I cannot say whether it supports newer DPDK versions.
[1] https://software.intel.com/en-us/articles/memory-in-dpdk-part-3-1711-and-earlier-releases
[2] https://software.intel.com/en-us/articles/memory-in-dpdk-part-4-1811-and-beyond

Related

In Windows, why does a leak of a few bytes show up as high commit memory in Resource Monitor?

We have a 64-bit application running on Windows, and we know that its C++ code leaks a few bytes of memory. On a machine with 16 GB of physical RAM and a 32 GB pagefile.sys, Resource Monitor shows 22 GB of commit memory and a 900 MB working set for our process.
I know that the OS creates a virtual address space, organized in pages, for every process, and that the size of that address space depends on whether the process is 32-bit or 64-bit. I also know that the OS swaps pages out to disk (i.e. to pagefile.sys) to make room for other applications. On Windows, I believe the page size is 4 KB. What I want to know is this: if one byte is leaked in a 4 KB page in physical RAM, will the process be shown as using 4 KB rather than one byte after that page is swapped to disk?

Effect of net.ipv4 kernel variables on DPDK ports

Sorry in advance if this question is trivial, or if it answers itself.
Our devices run an application that employs DPDK to drive their NICs.
As part of device setup, some initialization is done, including setting kernel variables such as net.ipv4.tcp_keepalive_intvl, tcp_max_syn_backlog, net.ipv4.conf.all.log_martians, etc.
Do these kinds of variables have any effect on our ports under DPDK control?
Probably not, since DPDK drives the NICs from user space, but I am not confident enough to assert it.

As long as your NIC is driven by one of the PMDs listed under drivers/net/, few of them rely on net.ipv4 settings; TAP/TUN are the exceptions. So a physical NIC would not be affected.
[EDIT-1]
Only physical NICs with a user-space PMD (not TAP representations), such as e1000, ixgbe, i40e, ice and fm10k, can be guaranteed to be unaffected. PMDs like AF_PACKET/AF_XDP are also not affected, while TAP/PCAP go through the Linux stack.

How to swap two memory regions from different banks in FLASH Memory for STM32L475?

I am working on the B-L475E-IOT01A2, an STM32L475 series Discovery IoT kit with an ARM Cortex-M4 core. It has two banks of flash memory of 512 KB each. I am implementing two applications along with a bootloader, and all of them are stored in flash. Since there is very little space, the bootloader, the 1st application and part of the 2nd application are stored in the 1st bank, whereas the 2nd bank contains the remaining part of the 2nd application. At some point in the bootloader program, I need to swap the two applications.
The problem is that only part of each application gets swapped, because the 2nd application is spread across both banks. Only one page (2 KB) of flash can be written at a time. Each application is 384 KB in size, which works out to 192 pages. But after running the swapping program, only 72 pages were swapped.
Here are the addresses of the applications and the bootloader.
BOOTLOADER_ADDRESS 0x08000000 (Size = 48 KB)
APPLICATION1_ADDRESS 0x0800F000 (Size = 384 KB)
APPLICATION2_ADDRESS 0x0806F800 (Size = 384 KB)
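The layout above can be checked with a bit of arithmetic. The sketch below uses only the figures given in the question (flash starting at 0x08000000, two 512 KB banks, 2 KB pages, 384 KB applications); the helper names are hypothetical. It shows how many pages of each application fall in each bank, and does not attempt to explain why exactly 72 pages were swapped.

```python
FLASH_BASE = 0x08000000
BANK_SIZE = 512 * 1024   # two banks of 512 KB each
PAGE_SIZE = 2 * 1024     # one flash page is 2 KB
APP_SIZE = 384 * 1024

APPLICATION1_ADDRESS = 0x0800F000
APPLICATION2_ADDRESS = 0x0806F800


def page_index(address):
    """Page number counted from the start of flash."""
    return (address - FLASH_BASE) // PAGE_SIZE


def pages_per_bank(start, size):
    """Return (pages in bank 1, pages in bank 2) for a region of flash."""
    bank2_start = FLASH_BASE + BANK_SIZE
    end = start + size
    in_bank1 = max(0, min(end, bank2_start) - start) // PAGE_SIZE
    in_bank2 = max(0, end - max(start, bank2_start)) // PAGE_SIZE
    return in_bank1, in_bank2


for name, start in (("application 1", APPLICATION1_ADDRESS),
                    ("application 2", APPLICATION2_ADDRESS)):
    bank1, bank2 = pages_per_bank(start, APP_SIZE)
    print(f"{name}: first page {page_index(start)}, "
          f"{bank1} pages in bank 1, {bank2} pages in bank 2")
```

With these numbers, application 1 sits entirely in bank 1 (192 pages), while application 2 has 33 pages in bank 1 and 159 pages in bank 2, so any page-by-page swap has to cross the bank boundary partway through application 2.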
So what should I do to ensure proper swapping? Should I enable dual bank mode or store the 2nd Application in the 2nd bank or do something else?
Your help will be highly appreciated.
Thanks,
Shetu

One possible workaround, or rather a different approach, is to integrate the bootloader functionality into both application 1 and application 2, and to place each application in its own flash bank (1 and 2). Using dual bank mode makes switching back and forth between applications much easier. I have used this approach with an STM32F7 device.
When the device boots it is configured to boot from flash bank 1 or 2 depending on several device option bytes/settings. If your code in the bootloader/application decides to boot into the other application, it can do this by modifying some option bytes and then performing a soft reset. Also, while running bootloader/application from one flash bank, the other flash bank can be updated.
If using this approach to do firmware updates, you must be especially careful that new firmware versions do not break the firmware update functionality of the bootloader.

.NET 2 to .NET 4 migration results in stack overflow crash

I have legacy software written in C++ and C#, which worked on Windows XP SP3 with .NET 2.0 (VS2005). The software does scanning and image processing, with plenty of memory-intensive processing. The PC has 2 GB of RAM. A stack size of 15 MB is reserved for the software process.
This software was migrated to .NET 4 (VS2010). The code logic was not altered during the migration. The software works properly for individual scans and processing. However, for continuous job runs the software crashes at random places. For all the crashes, the event viewer shows 'The software was terminated due to stack overflow'. Debugging the crash dump points to ntdll.dll (kernel dll).
To fix the issue, the following solutions were tried. None of them worked.
Stack size increased to 20MB. Software crashed.
Process is allocated 820 MB by VirtualAlloc in the beginning. This was increased to 1024 MB. It delayed the crash by a day. But eventually it crashed.
alloca was used to allocate memory for local variables. These were replaced by _malloca.
Please let me know if .NET 4 migration requires major increase in RAM to run the software without failure. Inputs on memory requirement change for .net 2 to .net 4 migration are welcome.

Windows based C++ application consumes more CPU over time

We have a C++ based multi-threaded application on Windows that captures network packets in real time using the WinPCAP library and then processes these packets to monitor the network. This application is intended to run 24x7. Our application easily consumes 7-8 GB of RAM.
The issue we are observing:
Let's say the application is monitoring 100 Mbps of network traffic and consumes 60% CPU. We have observed that when the application keeps running for a longer duration, such as a day or two, its CPU consumption increases to 70-80%, even though it is still processing 100 Mbps of traffic (doing the same amount of work).
We have tried to debug this issue down to the thread level using ProcessExplorer and noticed that the packet capturing threads start consuming more CPU over time. The issue is not resolved even after restarting the application. Only a machine restart solves the problem.
We have observed that this issue is easily reproducible on Windows 2012 R2 Server during overnight runs. On Windows 7 the issue also happens, but only over a few days.
Any idea what might be causing this ?
Thanks in Advance

What about memory allocation? Because you are using a lot of memory, it could be a memory fragmentation problem: if you allocate and reallocate buffers repeatedly, the processor has to spend more and more time finding and allocating available space.

I finally found the reason for the above behavior : it was the winpcap code that was causing it. After replacing that, we did not observe this behavior.
