I am stuck on a strange error in message queue code that used to work fine. The error I am getting is:
terminate called after throwing an instance of 'std::runtime_error'
what(): Unable to create named message queue
The code which is throwing the error is:
void VMGR::initialize_mq()
{
int errsv;
queue_attr.mq_flags = 0;
queue_attr.mq_maxmsg = 10;
queue_attr.mq_msgsize = api::msg_sz;
queue_attr.mq_curmsgs = 0;
mq_unlink (api::nm_mqueue);
mq_close(queue);
queue = mq_open( api::nm_mqueue, O_RDWR | O_CREAT | O_EXCL|O_NONBLOCK, IO_FILE_PERMISSIONS, &queue_attr );
if (queue == (mqd_t)-1)
{
errsv = errno;
throw std::runtime_error(
std::string( "Unable to create named message queue" )
);
}
std::cout<< "initialize_mq works"<< std::endl;
return;
}
I have checked /dev/mqueue thoroughly to make sure I am not using a queue name that already exists. I have wasted a lot of time trying to resolve this error.
I need help and guidance.
Edit
After executing ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1030932
max locked memory (kbytes, -l) 65536
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1030932
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The message queues currently present on the system:
Data7aa vmgr_proc135978 vmgr_proc40571 vmgr_proc55053
Data7bb vmgr_proc136146 vmgr_proc42152 vmgr_proc56557
Data7cc vmgr_proc136536 vmgr_proc43732 vmgr_proc9190
Data7dd vmgr_proc16914 vmgr_proc45247 vmgr_proc93026
Data7ee vmgr_proc17107 vmgr_proc46576 vmgr_proc9362
Data7gg vmgr_proc21970 vmgr_proc47958 vmgr_proc93706
Data7hh vmgr_proc29728 vmgr_proc49422 vmgr_proc93925
hello_moto vmgr_proc31111 vmgr_proc49700 vmgr_proc94596
vmgr_proc123434 vmgr_proc32504 vmgr_proc51452 vmgr_proc9528
vmgr_proc132441 vmgr_proc35308 vmgr_proc51869 vmgr_send
vmgr_proc132550 vmgr_proc39121 vmgr_proc53478
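The posted code saves errno in errsv but never reports it, so the exception cannot say why mq_open failed. Below is a minimal sketch of a helper that folds the saved errno into the message. The name describe_mq_open_failure and the hint texts are assumptions for illustration, and the hints only paraphrase the Linux mq_open(3) and mq_overview(7) man pages.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper (not part of the original code): turns the errno saved
// right after a failed mq_open() into a message that names a likely cause.
std::string describe_mq_open_failure(int saved_errno) {
    std::string msg = "Unable to create named message queue: ";
    msg += std::strerror(saved_errno);
    switch (saved_errno) {
    case EEXIST:
        msg += " [O_CREAT|O_EXCL was given and the name already exists]";
        break;
    case EMFILE:
        msg += " [a per-process or per-user limit was hit; compare ulimit -n and ulimit -q]";
        break;
    case ENOSPC:
        msg += " [too many queues system-wide; see /proc/sys/fs/mqueue/queues_max]";
        break;
    case EINVAL:
        msg += " [bad name, or mq_maxmsg/mq_msgsize outside the limits in /proc/sys/fs/mqueue]";
        break;
    default:
        break;  // no extra hint for other error codes
    }
    return msg;
}
```

In initialize_mq this could be used as throw std::runtime_error(describe_mq_open_failure(errsv));. With dozens of queues already present and a 819200-byte POSIX message queue limit, EMFILE is one plausible suspect, but that is a guess until the actual errno is printed.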
When I searched for this problem, I found almost no one else who had run into it, which is strange. By the way, I use dpdk-19.11.12 and Ubuntu 20.04.
After looking at the DPDK source code, I know that the above code is in the rte_eth_dev_create function:
if (priv_data_size) {
ethdev->data->dev_private = rte_zmalloc_socket(
name, priv_data_size, RTE_CACHE_LINE_SIZE,
device->numa_node);
if (!ethdev->data->dev_private) {
RTE_LOG(ERR, EAL, "failed to allocate private data");
retval = -ENOMEM;
goto probe_failed;
}
}
It seems rte_zmalloc_socket returns a NULL pointer. Why does this happen? I allocated the relevant hugepage memory as requested.
Some Information:
EAL: Detected 4 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:31:00.0 on NUMA socket 3
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:31:00.1 on NUMA socket 3
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:51:00.0 on NUMA socket 5
EAL: probe driver: 8086:10fb net_ixgbe
failed to allocate private data
EAL: Requested device 0000:51:00.0 cannot be used
EAL: PCI device 0000:51:00.1 on NUMA socket 5
EAL: probe driver: 8086:10fb net_ixgbe
failed to allocate private data
EAL: Requested device 0000:51:00.1 cannot be used
Hugepages: (cat /proc/meminfo | grep Huge)
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 20
HugePages_Free: 19
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 20971520 kB
and mount information:
(mount | grep huge)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=1024M)
nodev on /mnt/huge type hugetlbfs (rw,relatime,pagesize=1024M)
NIC:
Network devices using DPDK-compatible driver
============================================
0000:51:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' drv=igb_uio unused=ixgbe,vfio-pci
0000:51:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' drv=igb_uio unused=ixgbe,vfio-pci
Network devices using kernel driver
===================================
0000:31:00.0 'I350 Gigabit Network Connection 1521' if=enp49s0f0 drv=igb unused=igb_uio,vfio-pci
0000:31:00.1 'I350 Gigabit Network Connection 1521' if=enp49s0f1 drv=igb unused=igb_uio,vfio-pci *Active*
Numa information:(numactl -H)
node 5 cpus: 30 31 32 33 34 35 78 79 80 81 82 83
node 5 size: 64475 MB
node 5 free: 58743 MB
The 82599ES NICs are on NUMA node 5.
I thought I had done all the initialization, but rte_eal_init fails with the error "failed to allocate private data".
Any ideas on this issue? Thanks for your help.
-----------------------update-----------------------------------------
Since this problem occurs in rte_zmalloc_socket(), I guess that something is wrong with my hugepage configuration. But as posted above, I checked the hugepages with these commands:
1. cat /proc/meminfo | grep Huge
2. mount | grep huge
The first command checks the available 1G hugepages and the second checks the mount points.
The results above look normal, which is what confuses me most.
Any clues? Thanks.
Thanks, the issue has been resolved.
The cause of the problem is that my system has 8 NUMA nodes and the NICs bound to DPDK are on node 5, but by default DPDK only supports 4 NUMA nodes.
This is how I tracked the problem down:
1. rte_eth_dev_create calls rte_zmalloc_socket to allocate the private data.
2. rte_zmalloc_socket directly calls rte_malloc_socket:
void *
rte_zmalloc_socket(const char *type, size_t size, unsigned align, int socket)
{
void *ptr = rte_malloc_socket(type, size, align, socket);
#ifdef RTE_MALLOC_DEBUG
/*
* If DEBUG is enabled, then freed memory is marked with poison
* value and set to zero on allocation.
* If DEBUG is not enabled then memory is already zeroed.
*/
if (ptr != NULL)
memset(ptr, 0, size);
#endif
return ptr;
}
3. rte_malloc_socket calls malloc_heap_alloc with socket_arg = 5 in my case:
void *
rte_malloc_socket(const char *type, size_t size, unsigned int align, int socket_arg)
{
/* return NULL if size is 0 or alignment is not power-of-2 */
if (size == 0 || (align && !rte_is_power_of_2(align)))
return NULL;
/* if there are no hugepages and if we are not allocating from an
* external heap, use memory from any socket available. checking for
* socket being external may return -1 in case of invalid socket, but
* that's OK - if there are no hugepages, it doesn't matter.
*/
if (rte_malloc_heap_socket_is_external(socket_arg) != 1 &&
!rte_eal_has_hugepages())
socket_arg = SOCKET_ID_ANY;
return malloc_heap_alloc(type, size, socket_arg, 0,
align == 0 ? 1 : align, 0, false);
}
4. In malloc_heap_alloc:
void*
malloc_heap_alloc(const char *type, size_t size, int socket_arg, unsigned int
flags, size_t align, size_t bound, bool contig)
{
int socket, heap_id, i;
void *ret;
/* return NULL if size is 0 or alignment is not power-of-2 */
if (size == 0 || (align && !rte_is_power_of_2(align)))
return NULL;
if (!rte_eal_has_hugepages() && socket_arg < RTE_MAX_NUMA_NODES)
socket_arg = SOCKET_ID_ANY;
if (socket_arg == SOCKET_ID_ANY)
socket = malloc_get_numa_socket();
else
socket = socket_arg;
/* turn socket ID into heap ID */
heap_id = malloc_socket_to_heap_id(socket);
/* if heap id is negative, socket ID was invalid */
if (heap_id < 0)
return NULL;
...
}
My program returns NULL because heap_id = -1 while socket = 5, which eventually causes rte_zmalloc_socket to return NULL, so the private data for the NICs cannot be allocated.
Running "meson configure" in the DPDK build directory gave this result:
max_numa_nodes 4 maximum number of NUMA nodes supported by EAL
So in my case, I need to rebuild DPDK using either one of these steps:
To rebuild from scratch: "cd [dpdk parent folder]; rm -rf build; meson -Dmax_numa_nodes=8 build"
To reuse the existing build folder: "cd [dpdk parent folder]; meson --reconfigure -Dmax_numa_nodes=8 build"
After that, everything is back to normal.
Thanks for the help, and I hope this helps anyone else who runs into the same problem. :)
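The failure mode can be illustrated with a small model. The sketch below is NOT DPDK's code. It is a simplified, hypothetical stand-in (the HeapTable type and its members are invented here) that assumes heaps are only created for socket IDs below a compile-time maximum, so a lookup for socket 5 finds nothing when that maximum is 4.

```cpp
#include <cstddef>
#include <vector>

// Simplified, hypothetical model of the lookup, not DPDK's actual code:
// a heap exists only for socket IDs below the configured maximum.
struct HeapTable {
    std::vector<int> heap_socket_ids;  // socket ID served by each heap

    HeapTable(int detected_sockets, int max_numa_nodes) {
        for (int s = 0; s < detected_sockets && s < max_numa_nodes; ++s)
            heap_socket_ids.push_back(s);
    }

    // Returns the heap index for a socket, or -1 if no heap serves it.
    int socket_to_heap_id(int socket) const {
        for (std::size_t i = 0; i < heap_socket_ids.size(); ++i)
            if (heap_socket_ids[i] == socket)
                return static_cast<int>(i);
        return -1;
    }
};
```

With HeapTable(8, 4) a lookup for socket 5 yields -1, while HeapTable(8, 8) finds a heap for it, which mirrors the effect of raising max_numa_nodes from 4 to 8.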
Overview
I have a C++ application that reads a large amount of data (~1T). I run it using hugepages (614400 pages at 2M) and this works, until it hits 128G.
For testing I created a simple application in c++ that allocates chunks of 2M until it can't.
Application is run using:
LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=yes ./a.out
While it is running I monitor the number of free hugepages (from /proc/meminfo).
I can see that it consumes hugepages at the expected rate.
However the application crashes with a std::bad_alloc exception at 128G allocated (or 65536 pages).
If I run two or more instances at the same time, they all crash at 128G each.
If I decrease the cgroup limit to something small, say 16G, the app crashes correctly at that point with a 'bus error'.
Am I missing something trivial? Please look below for details.
I'm running out of ideas...
Details
Machine, OS and software:
CPU : Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Memory : 1.5T
Kernel : 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
OS : CentOS Linux release 7.4.1708 (Core)
hugetlbfs : 2.16-12.el7
gcc : 7.2.1 20170829
Simple test code I used (allocates chunks of 2M until free hugepages is below a limit)
#include <iostream>
#include <fstream>
#include <vector>
#include <array>
#include <string>
#define MEM512K (512*1024ul)
#define MEM2M (4*MEM512K)
// data block
template <size_t N>
struct DataBlock {
char data[N];
};
// Hugepage info
struct HugePageInfo {
size_t memfree;
size_t total;
size_t free;
size_t size;
size_t used;
double used_size;
};
// dump hugepage info
void dumpHPI(const HugePageInfo & hpi) {
std::cout << "HugePages total : " << hpi.total << std::endl;
std::cout << "HugePages free : " << hpi.free << std::endl;
std::cout << "HugePages size : " << hpi.size << std::endl;
}
// dump hugepage info in one line
void dumpHPIline(const size_t i, const HugePageInfo & hpi) {
std::cout << i << " "
<< hpi.memfree << " "
<< hpi.total-hpi.free << " "
<< hpi.free << " "
<< hpi.used_size
<< std::endl;
}
// get hugepage info from /proc/meminfo
void getHugePageInfo( HugePageInfo & hpi ) {
std::ifstream fmeminfo;
fmeminfo.open("/proc/meminfo",std::ifstream::in);
std::string line;
size_t n=0;
while (std::getline(fmeminfo,line)) {
const size_t sep = line.find_first_of(':');
if (sep==std::string::npos) continue;
const std::string lblstr = line.substr(0,sep);
const size_t endpos = line.find(" kB");
const std::string trmstr = line.substr(sep+1,(endpos==std::string::npos ? line.size() : endpos-sep-1));
const size_t startpos = trmstr.find_first_not_of(' ');
const std::string valstr = (startpos==std::string::npos ? trmstr : trmstr.substr(startpos) );
if (lblstr=="HugePages_Total") {
hpi.total = std::stoi(valstr);
} else if (lblstr=="HugePages_Free") {
hpi.free = std::stoi(valstr);
} else if (lblstr=="Hugepagesize") {
hpi.size = std::stoi(valstr);
} else if (lblstr=="MemFree") {
hpi.memfree = std::stoi(valstr);
}
}
hpi.used = hpi.total - hpi.free;
hpi.used_size = double(hpi.used*hpi.size)/1024.0/1024.0;
}
// allocate data
void test_rnd_data() {
typedef DataBlock<MEM2M> elem_t;
HugePageInfo hpi;
getHugePageInfo(hpi);
dumpHPIline(0,hpi);
std::array<elem_t *,MEM512K> memmap;
for (size_t i=0; i<memmap.size(); i++) memmap[i]=nullptr;
for (size_t i=0; i<memmap.size(); i++) {
// allocate a new 2M block
memmap[i] = new elem_t();
// output progress
if (i%1000==0) {
getHugePageInfo(hpi);
dumpHPIline(i,hpi);
if (hpi.free<1000) break;
}
}
std::cout << "Cleaning up...." << std::endl;
for (size_t i=0; i<memmap.size(); i++) {
if (memmap[i]==nullptr) continue;
delete memmap[i];
}
}
int main(int argc, const char** argv) {
test_rnd_data();
}
Hugepages are set up at boot time to use 614400 pages of 2M each.
From /proc/meminfo:
MemTotal: 1584978368 kB
MemFree: 311062332 kB
MemAvailable: 309934096 kB
Buffers: 3220 kB
Cached: 613396 kB
SwapCached: 0 kB
Active: 556884 kB
Inactive: 281648 kB
Active(anon): 224604 kB
Inactive(anon): 15660 kB
Active(file): 332280 kB
Inactive(file): 265988 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 2097148 kB
SwapFree: 2097148 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 222280 kB
Mapped: 89784 kB
Shmem: 18348 kB
Slab: 482556 kB
SReclaimable: 189720 kB
SUnreclaim: 292836 kB
KernelStack: 11248 kB
PageTables: 14628 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 165440732 kB
Committed_AS: 1636296 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 7789100 kB
VmallocChunk: 33546287092 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 614400
HugePages_Free: 614400
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 341900 kB
DirectMap2M: 59328512 kB
DirectMap1G: 1552941056 kB
Limits from ulimit:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 6191203
max locked memory (kbytes, -l) 1258291200
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
cgroup limit:
> cat /sys/fs/cgroup/hugetlb/hugetlb.2MB.limit_in_bytes
9223372036854771712
Tests
Output when running the test code using HUGETLB_DEBUG=1:
...
libhugetlbfs [abc:185885]: INFO: Attempting to map 2097152 bytes
libhugetlbfs [abc:185885]: INFO: ... = 0x1ffb200000
libhugetlbfs [abc:185885]: INFO: hugetlbfs_morecore(2097152) = ...
libhugetlbfs [abc:185885]: INFO: heapbase = 0xa00000, heaptop = 0x1ffb400000, mapsize = 1ffaa00000, delta=2097152
libhugetlbfs [abc:185885]: INFO: Attempting to map 2097152 bytes
libhugetlbfs [abc:185885]: WARNING: New heap segment map at 0x1ffb400000 failed: Cannot allocate memory
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)
Using strace:
...
mmap(0x1ffb400000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0x1ffa200000) = 0x1ffb400000
mmap(0x1ffb600000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0x1ffa400000) = 0x1ffb600000
mmap(0x1ffb800000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0x1ffa600000) = 0x1ffb800000
mmap(0x1ffba00000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0x1ffa800000) = 0x1ffba00000
mmap(0x1ffbc00000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0x1ffaa00000) = -1 ENOMEM (Cannot allocate memory)
write(2, "libhugetlbfs", 12) = 12
write(2, ": WARNING: New heap segment map "..., 79) = 79
mmap(NULL, 3149824, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
write(2, "terminate called after throwing "..., 48) = 48
write(2, "std::bad_alloc", 14) = 14
write(2, "'\n", 2) = 2
write(2, " what(): ", 11) = 11
write(2, "std::bad_alloc", 14) = 14
write(2, "\n", 1) = 1
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
gettid() = 188617
tgkill(188617, 188617, SIGABRT) = 0
--- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=188617, si_uid=1001} ---
Finally in /proc/pid/numa_maps:
...
1ffb000000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N1=1 kernelpagesize_kB=2048
1ffb200000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N1=1 kernelpagesize_kB=2048
1ffb400000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N1=1 kernelpagesize_kB=2048
1ffb600000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N1=1 kernelpagesize_kB=2048
1ffb800000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N1=1 kernelpagesize_kB=2048
...
However the application crashes with a std::bad_alloc exception at 128G allocated (or 65536 pages).
You are allocating too many small segments. There is a limit on the number of map segments (memory mappings) a process can have:
sysctl -n vm.max_map_count
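The test tries to create up to 512 * 1024 == 524288 separate 2M mappings (1024 * 512 * 4 == 2097152 is the size of one block in bytes), but the default value of vm.max_map_count is only 65530, which is right around the 65536 pages where the allocation fails.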
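As a quick check of the arithmetic, the snippet below computes the ceiling that results when every 2M chunk needs its own mapping. It is only a worked calculation. The function name is made up for illustration, and 65536 is the page count reported in the question.

```cpp
#include <cstdint>

constexpr std::uint64_t kMiB = 1024ull * 1024ull;
constexpr std::uint64_t kGiB = 1024ull * kMiB;

// Upper bound on reachable memory when each chunk consumes one mapping.
constexpr std::uint64_t max_bytes_with_one_mapping_per_chunk(
        std::uint64_t map_limit, std::uint64_t chunk_bytes) {
    return map_limit * chunk_bytes;
}

// 65536 mappings of 2 MiB each come to exactly 128 GiB.
static_assert(max_bytes_with_one_mapping_per_chunk(65536, 2 * kMiB) == 128 * kGiB,
              "65536 x 2 MiB == 128 GiB");
```

This is why each process stops at roughly the same point regardless of how many hugepages are still free.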
You can change it with:
sysctl -w vm.max_map_count=3000000
Or you could allocate a bigger segment in your code.
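A minimal sketch of the "bigger segment" idea follows: request one large block and hand out fixed-size chunks from it instead of calling new once per 2M block. ChunkArena is a hypothetical helper, not something from the question. Whether the single allocation ends up as one hugepage mapping depends on the allocator and libhugetlbfs, which is not verified here.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical helper: one large allocation carved into fixed-size chunks.
class ChunkArena {
public:
    ChunkArena(std::size_t chunk_bytes, std::size_t chunk_count)
        : chunk_bytes_(chunk_bytes),
          chunk_count_(chunk_count),
          storage_(new char[chunk_bytes * chunk_count]) {}

    // Hands out the next unused chunk; throws when the arena is exhausted.
    char* next_chunk() {
        if (used_ == chunk_count_)
            throw std::bad_alloc();
        return storage_.get() + chunk_bytes_ * used_++;
    }

    std::size_t used() const { return used_; }

private:
    std::size_t chunk_bytes_;
    std::size_t chunk_count_;
    std::size_t used_ = 0;
    std::unique_ptr<char[]> storage_;  // released as a whole in the destructor
};
```

In the test program, an arena sized for, say, 1024 chunks of 2M would replace 1024 separate new elem_t() calls with one request to the allocator.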
I have an old C++ application running on OS X (10.10/Yosemite).
When I'm debugging the application I get an error in the following lines of code:
// create pipe
int pipefd[2];
int piperet = pipe(pipefd);
if( piperet )
{
wcsncpy(errbuf, CEmpError::GetErrorText(CEmpError::ERR_SYSTEM, L"Can't create pipe for IPC.", errno).c_str(), errbuflen);
CEmpError::LogError(errbuf);
return CEmpError::ERR_SYSTEM; //= 115
}
The application runs and executes these lines of code several times. After a while piperet is -1. The errno error code is 25.
After some research, this means "Too many open files". Is there a workaround to close all these open files? Or is there a way to find out which files are open?
When I type ulimit -a in Terminal I get:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 2560
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 709
virtual memory (kbytes, -v) unlimited
I'm not a C++ expert, so here is the relevant code. I guess all pipes or pipefd that are not needed get closed.
// create pipe
int pipefd[2];
int piperet = pipe(pipefd);
if( piperet )
{
wcsncpy(errbuf, CEmpError::GetErrorText(CEmpError::ERR_SYSTEM, L"Can't create pipe for IPC.", errno).c_str(), errbuflen);
CEmpError::LogError(errbuf);
return CEmpError::ERR_SYSTEM;
}
CEmpError *pError = 0;
// after transfer the execution bit could be reset, so set the rights back
chmod(args[0], S_IWUSR | S_IRUSR | S_IXUSR | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH );
pid_t pid = fork();
if(pid == 0)
{ // child process
close(pipefd[0]); // close reading end
int fd = pipefd[1];
// redirect stdout and stderr to pipe
dup2(fd, STDOUT_FILENO);
dup2(fd, STDERR_FILENO);
close(fd); // not needed anymore
// execute setup.sh with built argument list
execvp(args[0], (char**)args);
// if we ever reached this line the exec failed and we need to report error to parent process
// once we are in child process we will print the error into stdout of our child process
// and parent process will parse and return it to the caller.
char buf[128];
sprintf(buf, "setup.sh:ERROR:PI%03d",CEmpError::ERR_EXEC);
perror(buf);
// keep the process alive until the parent process got the error from the pipe and killed this child process
sleep(5);
return CEmpError::ERR_EXEC;
}
else if (pid > 0)
{ // parent process
delete[] args[0]; // release memory allocated to f.
delete[] args[3]; // release memory allocated to log f.
delete[] args[5]; // release memory allocated to pn
close(pipefd[1]);
pParser = new CPackageInstallerParser();
FILE* fp = fdopen(pipefd[0], "r");
/*int res = */setvbuf(fp, NULL, _IOLBF, 0);
try
{
pParser->ParseOutput(fp, statusCallback, statusContext, logFileName);
}
catch (CEmpError* pErr)
{
if (pErr->ErrorCode == CEmpError::ERR_EXEC)
kill(pid, SIGABRT); // the error is parsed kill the child process
pError = pErr;
}
catch (...)
{
// some exception from statusCallback
fclose(fp);
delete pParser;
pParser = NULL;
throw;
}
fclose(fp);
int stat;
// wait for the installation process to end.
waitpid(pid, &stat, 0);
if (WIFEXITED(stat) && (stat % 256 == 0) && pError == NULL)
{
// exited normally with code 0 (success)
// printf("Installed successfully!\n");
// register successful operation result
try
{
RegisterResult(operation);
}
catch (CEmpError* pErr)
{
pError = pErr;
}
}
else
{
if (pError == NULL) // no error was caught by parser
pError = new CEmpError(CEmpError::ERR_UNKNOWN);
//dumpError(stat);
}
}
else
pError = new CEmpError(CEmpError::ERR_FORK);
//clean up and exit
if (pParser != NULL)
delete pParser;
pParser = NULL;
int exitcode = 0;
if (pError != NULL)
{
exitcode = pError->ErrorCode;
wcsncpy(errbuf, pError->GetErrorText().c_str(), errbuflen);
pError->Log();
delete pError;
}
return exitcode;
You need to close the pipe FDs with close() when you no longer need them.
You're allowed to have 2560 open files per process, so you should close the other files and/or pipes when they are no longer needed.
It is always good advice to release resources when you're done with them.
I'm writing a program that reads from two mono ALSA devices and writes them to one stereo ALSA device.
I use three threads and a ping-pong buffer to manage them: two reading threads and one writing thread. Their configurations are as follows:
// Capture ALSA device
alsaBufferSize = 16384;
alsaCaptureChunkSize = 4096;
bitsPerSample = 16;
samplingFrequency = 24000;
numOfChannels = 1;
block = true;
accessType = SND_PCM_ACCESS_RW_INTERLEAVED;
// Playback device (only list params that are different from above)
alsaBufferSize = 16384 * 2;
numOfChannels = 2;
accessType = SND_PCM_ACCESS_RW_NON_INTERLEAVED;
The two reading threads write to the ping buffer and then to the pong buffer. The writing thread waits for either of the two buffers to be ready, locks it, reads from it, and then unlocks it.
But when I run this program, an xrun occurs and cannot be recovered from:
ALSA lib pcm.c:7316:(snd_pcm_recover) underrun occurred
ALSA lib pcm.c:7319:(snd_pcm_recover) cannot recovery from underrun, prepare failed: Broken pipe
Below is my code for writing to ALSA playback device:
bool CALSAWriter::writen(uint8_t** a_pOutputBuffer, uint32_t a_rFrames)
{
bool ret = false;
// 1. write audio chunk to ALSA
const snd_pcm_sframes_t alsaCaptureChunkSize = static_cast<snd_pcm_sframes_t>(a_rFrames); //(m_pALSACfg->alsaCaptureChunkSize);
const snd_pcm_sframes_t writenFrames = snd_pcm_writen(m_pALSAHandle, (void**)a_pOutputBuffer, alsaCaptureChunkSize);
if (0 < writenFrames)
{// write succeeded
ret = true;
}
else
{// write failed
logPrint("CALSAWriter WRITE FAILED for writen farmes = %d ", writenFrames);
ret = false;
const int alsaReadError = static_cast<int>(writenFrames);// alsa error is of int type
if (ALSA_OK == snd_pcm_recover(m_pALSAHandle, alsaReadError, 0))
{// recovery succeeded
a_rFrames = 0;// only recovery was done, no write at all was done
}
else
{
logPrint("CALSAWriter: failed to recover from ALSA write error: %s (%i)", snd_strerror(alsaReadError), alsaReadError);
ret = false;
}
}
// 2. check current buffer load
snd_pcm_sframes_t framesInBuffer = 0;
snd_pcm_sframes_t delayedFrames = 0;
snd_pcm_avail_delay(m_pALSAHandle, &framesInBuffer, &delayedFrames);
// round to nearest int, cast is safe, buffer size is no bigger than uint32_t
const int32_t ONE_HUNDRED_PERCENTS = 100;
const uint32_t bufferLoadInPercents = ONE_HUNDRED_PERCENTS *
static_cast<int32_t>(framesInBuffer) / static_cast<int32_t>(m_pALSACfg->alsaBufferSize);
logPrint("write: ALSA buffer percentage: %u, delayed frames: %d", bufferLoadInPercents, delayedFrames);
return ret;
}
Other diagnostic info:
02:53:00.465047 log info V 1 [write: ALSA buffer percentage: 75, delayed frames: 4096]
02:53:00.635758 log info V 1 [write: ALSA buffer percentage: 74, delayed frames: 4160]
02:53:00.805714 log info V 1 [write: ALSA buffer percentage: 74, delayed frames: 4152]
02:53:00.976781 log info V 1 [write: ALSA buffer percentage: 74, delayed frames: 4144]
02:53:01.147948 log info V 1 [write: ALSA buffer percentage: 0, delayed frames: 0]
02:53:01.317113 log error V 1 [CALSAWriter WRITE FAILED for writen farmes = -32 ]
02:53:01.317795 log error V 1 [CALSAWriter: failed to recover from ALSA write error: Broken pipe (-32)]
It took me about 3 days to find a solution. Thanks to @CL. for the tip that "writen is called too late".
Issue:
Thread switching time is not constant.
Solution:
Insert an empty buffer before you invoke "writen" for the first time. The length of this buffer can be any value large enough to absorb the thread-switching delay. I set it to 150ms.
Alternatively, you can raise the thread priority, although I can't do that in my case. Refer to "ALSA: Ways to prevent underrun for speaker".
Problem diagnostic:
The facts are:
"readi" returns every 171ms (4096/24000 = 0.171s). The reading thread then marks the buffer as ready.
Once a buffer is ready, "writen" is invoked in the writing thread. The buffer is copied to the ALSA playback device, which takes 171ms to play it.
If the playback device has finished playing everything in its buffer and no new buffer has been written, an underrun occurs.
The real scenario here:
At 0ms, "readi" starts. At 171ms "readi" finishes.
At 172ms (1ms for thread switching), "writen" starts. At 343ms an underrun will happen if no new buffer has been written.
At 171ms, "readi" starts again. At 342ms "readi" finishes.
This time, thread switching takes 2ms. "writen" would start at 344ms, but the underrun has already occurred at 343ms.
When CPU load is high, there is no guarantee how long thread switching will take. That's why you can insert an empty buffer at the first write, which turns the scenario into:
At 0ms, "readi" starts. At 171ms "readi" finishes.
At 172ms (1ms for thread switching), "writen" starts with a 150ms-long empty buffer. At 493ms an underrun will happen if no new buffer has been written.
At 171ms, "readi" starts again. At 342ms "readi" finishes.
This time, thread switching takes 50ms. "writen" starts at 392ms, so no underrun occurs.
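The numbers in this timeline can be reproduced with a few helpers. This is a sketch under the configuration from the question (24000 Hz, 16-bit samples, 2 channels, non-interleaved access). The function names are invented for illustration and nothing here calls into ALSA.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of frames that cover `millis` milliseconds at `rate_hz`.
std::size_t frames_for_ms(unsigned rate_hz, unsigned millis) {
    return static_cast<std::size_t>(rate_hz) * millis / 1000;
}

// Duration of a chunk in milliseconds, rounded to the nearest integer.
long chunk_duration_ms(std::size_t frames, unsigned rate_hz) {
    return std::lround(frames * 1000.0 / rate_hz);
}

// One zero-filled 16-bit buffer per channel (non-interleaved layout).
std::vector<std::vector<std::int16_t>> make_silence(unsigned channels,
                                                    std::size_t frames) {
    return std::vector<std::vector<std::int16_t>>(
        channels, std::vector<std::int16_t>(frames, 0));
}
```

A 150ms prefill at 24000 Hz is 3600 frames, and a 4096-frame chunk lasts about 171ms. To apply it, the data() pointer of each channel buffer would go into the void* array passed to snd_pcm_writen once, before the first real chunk is written.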
We're getting odd behaviour when trying to allocate an approximately 10MB block of memory from huge pages. System is SL6.4 64-bit, recent Intel CPU, 64GB RAM.
Initially we allocated 20 huge pages which should be enough.
$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
HugePages_Total: 20
HugePages_Free: 20
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Other huge page settings:
/proc/sys/kernel/shmall = 4294967296
/proc/sys/kernel/shmmax = 68719476736
/proc/sys/kernel/shmmni = 4096
/proc/sys/kernel/shm_rmid_forced = 0
shmget fails with ENOMEM. The only explanation I can find for this is in the man page, which states "No memory could be allocated for segment overhead.", but I haven't been able to discover what "segment overhead" is.
On another server with the same number of pages configured, shmget returns successfully.
On the problem server we increased the number of huge pages to 100. The allocation then succeeds, but 64 additional 2MB huge page segments are allocated as well:
$ ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x0091efab 10223638 rsprod 600 2097152 1
0x0092efab 10256407 rsprod 600 2097152 1
0x0093efab 10289176 rsprod 600 2097152 1
0x0094efab 10321945 rsprod 600 2097152 1
0x0095efab 10354714 rsprod 600 2097152 1
0x0096efab 10387483 rsprod 600 2097152 1
...
0x00cdefab 12189778 rsprod 600 2097152 1
0x00ceefab 12222547 rsprod 600 2097152 1
0x00cfefab 12255316 rsprod 600 2097152 1
0x00d0efab 12288085 rsprod 600 2097152 1
0x00000000 12320854 rsprod 600 10485760 1
The code calling shmget is below. This is only being called once in the application.
uint64_t GetHugePageSize()
{
FILE *meminfo = fopen("/proc/meminfo", "r");
if(meminfo == NULL) {
return 0;
}
char line[256];
while(fgets(line, sizeof(line), meminfo)) {
uint64_t zHugePageSize = 0;
if(sscanf(line, "Hugepagesize: %lu kB", &zHugePageSize) == 1) {
fclose(meminfo);
return zHugePageSize*1024;
}
}
fclose(meminfo);
return 0;
}
char* HugeTableNew(size_t aSize, int& aSharedMemID)
{
static const uint64_t sHugePageSize = GetHugePageSize();
uint64_t zSize = aSize;
// round up to next page size, otherwise shmat fails with EINVAL (22)
const uint64_t HUGE_PAGE_MASK = sHugePageSize-1;
if(aSize & HUGE_PAGE_MASK) {
zSize = (aSize&~HUGE_PAGE_MASK) + sHugePageSize;
}
aSharedMemID = shmget(IPC_PRIVATE, zSize, IPC_CREAT|SHM_HUGETLB|SHM_R|SHM_W);
if(aSharedMemID < 0) {
perror("shmget");
return nullptr;
}
...
Does anyone know:
What causes the allocation to fail when there are enough free huge pages available?
What causes the extra 2MB pages to be allocated?
What "segment overhead" is?
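For reference, the rounding step in HugeTableNew can be checked in isolation. The sketch below uses hypothetical helper names and assumes the page size is a power of two. It shows that the 10485760-byte segment from the ipcs output needs 5 pages of 2MB, well under the 20 that were configured.

```cpp
#include <cstdint>

constexpr std::uint64_t kHugePage2M = 2ull * 1024 * 1024;

// Rounds size up to the next multiple of page_size (must be a power of two).
constexpr std::uint64_t round_up_to_page(std::uint64_t size,
                                         std::uint64_t page_size) {
    return (size + page_size - 1) & ~(page_size - 1);
}

// Number of whole pages needed to hold size bytes.
constexpr std::uint64_t pages_needed(std::uint64_t size,
                                     std::uint64_t page_size) {
    return round_up_to_page(size, page_size) / page_size;
}

// 10485760 bytes is already a multiple of 2 MiB and occupies 5 pages.
static_assert(pages_needed(10485760, kHugePage2M) == 5, "10 MiB == 5 x 2 MiB");
```

So the request size alone does not explain the ENOMEM with 20 free pages.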