I have the gRPC route_guide example working on both aarch64 and x86_64, but when I port this example into an existing CMake project, only x86_64 works. The aarch64 build just hangs in builder.BuildAndStart().
The code is identical for the 2 platforms:
void RunServer(const std::string& db_path) {
    std::cout << "RunServer" << std::endl;
    std::string server_address("0.0.0.0:50051");
    RouteGuideImpl service(db_path);
    ServerBuilder builder;
    builder.AddListeningPort(server_address, grpc::InsecureServerCredentials());
    builder.RegisterService(&service);
    //builder.SetSyncServerOption(ServerBuilder::SyncServerOption::MAX_POLLERS,2); // Doesn't make a difference
    std::cout << "Before builder.BuildAndStart()" << std::endl;
    std::unique_ptr<grpc::Server> server(builder.BuildAndStart());
    std::cout << "After builder.BuildAndStart()" << std::endl;
    std::cout << "Server listening on " << server_address << std::endl;
    server->Wait();
}
Using gdb I can see that x86_64 starts 1+4 threads and then listens for gRPC traffic. It also works with the other route_guide examples (e.g. both the C++ and Node versions) communicating with this server.
(gdb) run
Starting program: /home/me/mycode/grpc_project/build/myexecutable
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffd0bd9700 (LWP 30537)]
RunServer
DB parsed, loaded 100 features.
Before builder.BuildAndStart()
[New Thread 0x7fffd03d8700 (LWP 30538)]
[New Thread 0x7fffcfbd7700 (LWP 30539)]
[New Thread 0x7fffcf3d6700 (LWP 30540)]
[New Thread 0x7fffcebd5700 (LWP 30541)]
After builder.BuildAndStart()
Server listening on 0.0.0.0:50051
But on the aarch64 platform only 1+2 threads are started, and then gRPC hangs:
Starting program: /home/ubuntu/mycode/grpc_project/build/myexecutable
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fab553030 (LWP 27568)]
RunServer
DB parsed, loaded 100 features.
Before builder.BuildAndStart()
[New Thread 0x7faad53030 (LWP 27569)]
[New Thread 0x7faa553030 (LWP 27570)]
Where should I start debugging this, and what could the root cause be?
Others (both here on SO and elsewhere) have suggested limiting the number of threads using ResourceQuota, like this:
grpc::ResourceQuota rq;
rq.SetMaxThreads(n); // n=1 or n=2
builder.SetResourceQuota(rq);
But it makes no difference; aarch64 still hangs.
Ubuntu 16.04 with aarch64 (Nvidia TX2)
Environment
OS: CentOS 7.9_x64
Memory, CPU, current Disk Space: Memory 96G, Disk 1T
TDengine Version: TDengine-server-2.0.20.13-Linux-x64
The TDengine taosd daemon crashes with a core dump.
gdb output:
[New LWP 5461]
[New LWP 5499]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/taosd'.
Program terminated with signal 11, Segmentation fault.
#0 0x000056308db735cf in gcBuildQueryJson (pContext=0x7fdfdc0008c0, cmd=0x7fdfe00014a0, result=0x7fdfcc048ab0, numOfRows=682) at /home/ubuntu/workroom/jenkins/TDinternal/community/src/plugins/http/src/httpGcJson.c:154
154 /home/ubuntu/workroom/jenkins/TDinternal/community/src/plugins/http/src/httpGcJson.c: No such file or directory.
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64
How can I resolve this?
It's a bug in TDengine-server. You don't "resolve" bugs.
You can try to figure out what the bug is (via debugging), or you can try a newer version of TDengine-server (the current one appears to be 2.2.0.2) and hope that the particular bug you've hit has been fixed.
I use unordered_map and start 3 worker threads in my C++ program. When I ran the code under gdb, the following messages kept showing up continually:
[New Thread 0x2aaac3dc6700 (LWP 4709)]
[Thread 0x2aaac3dc6700 (LWP 4709) exited]
[New Thread 0x2aaac3dc6700 (LWP 4710)]
[Thread 0x2aaac3dc6700 (LWP 4710) exited]
[New Thread 0x2aaac3dc6700 (LWP 4711)]
[Thread 0x2aaac3dc6700 (LWP 4711) exited]
.....................
How can so many threads be created and then quickly destroyed? Are there background threads that garbage-collect obsolete objects (as in Java)? I suspect the unordered_map object is causing all these threads to be created to clean up after it.
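For context, std::unordered_map does not start threads, and C++ has no background garbage collector, so messages like these come from code that creates threads explicitly. A minimal sketch of one pattern that would produce them, assuming some code path launches a fresh thread per task (the helper name run_tasks_on_fresh_threads is hypothetical and not taken from the program above):

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: runs `n` tiny tasks, each on its own short-lived thread.
// Returns how many tasks completed.
int run_tasks_on_fresh_threads(int n) {
    std::atomic<int> completed{0};
    for (int i = 0; i < n; ++i) {
        // Under gdb, creating the thread is reported as "[New Thread ...]".
        std::thread worker([&completed] { ++completed; });
        // The thread finishes almost immediately; gdb reports "[Thread ... exited]".
        worker.join();
    }
    return completed.load();
}
```

If the program or a library it links against does something similar (for example, one thread per request or per timer tick), each iteration would show up as one created/exited pair, regardless of which containers are in use.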
I attached gdb to a long-running process (>25 hours), using screen on my Ubuntu machine to manage the session. I was able to reattach to the session and get the gdb console back, but on continuing I saw my process receive SIGABRT and exit, followed by other exit messages.
[New LWP 122]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fe8ef29ea15 in futex_abstimed_wait_cancelable (private=0, abstime=0x7ffc8c628420, expected=0, futex_word=0x7fe8e6378640) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
205 ../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
(gdb) c
Continuing.
(gdb) [Thread 0x7be8d8bfd700 (LWP 48) exited]
Thread 32 "my-process" received signal SIGABRT, Aborted.
[Switching to Thread 0x7be8d2bbd700 (LWP 60)]
0x00007fe8eece4428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) c
Continuing.
Couldn't get registers: No such process.
Couldn't get registers: No such process.
Couldn't get registers: No such process.
(gdb) [Thread 0x7be8b67ff700 (LWP 119) exited]
[Thread 0x7be8b49fe700 (LWP 122) exited]
...
I am not able to get the gdb console back after that, though I still see the gdb process when I run ps -ef:
root 133 0 1 Jan14 ? 00:26:09 gdb --pid=23
How do I get the console back for this gdb process? I wanted to see the backtrace.
Or is there a better way to attach gdb to a long-running process?
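One alternative to keeping gdb attached for many hours, assuming the program can be modified and core files are permitted on the machine, is to let the process write a core file when it aborts and inspect that afterwards. A minimal sketch using the POSIX getrlimit/setrlimit calls (the helper name enable_core_dumps is hypothetical):

```cpp
#include <sys/resource.h>

// Hypothetical helper: raise the soft core-file size limit to the hard limit
// so that a fatal signal such as SIGABRT can leave a core file behind.
// Returns true on success.
bool enable_core_dumps() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    // An unprivileged process may raise its soft limit up to the hard limit.
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Called early in main(), this would allow the backtrace to be read later by loading the executable together with the core file in gdb and running bt. Where the core file ends up, and whether one is written at all, depends on the system configuration (the hard limit and the kernel's core_pattern setting).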
I am compiling the unit tests (via Google Test) for my program, and whenever I compile in DEBUG mode on Solaris 11.3 with Eigen 3.2.x, I get this SIGSEGV followed by a core dump when running the program in gdb:
(gdb) r
...
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x0830fc30 in Eigen::internal::ploadu (from=0xfeffe5a0) at ./eigen/Eigen/src/Core/arch/SSE/Complex.h:307
307 { EIGEN_DEBUG_UNALIGNED_LOAD return Packet1cd(ploadu((const double*)from)); }
(gdb)
When I print from in gdb, this is what I get:
gdb p from: (const std::complex< double > *) 0xfeffe5a0
This SIGSEGV occurs only on Solaris, and only when compiling with -Og. I've compiled and tested the code on multiple other OSes without any issues. Is this a known issue? It looks like it has to do with SSE optimizations and alignment, but I cannot pinpoint exactly what is going on.
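One way to test the alignment hypothesis is to check the address gdb reports. A minimal sketch (the helper name is_aligned is hypothetical): SSE aligned loads require 16-byte alignment, and 0xfeffe5a0 ends in a zero hex digit, so it is a multiple of 16.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true if `address` is a multiple of `alignment` (alignment must be nonzero).
bool is_aligned(std::uintptr_t address, std::size_t alignment) {
    return address % alignment == 0;
}
```

Here is_aligned(0xfeffe5a0, 16) evaluates to true, and the faulting function is the unaligned load (ploadu) in any case. That suggests the alignment of the from pointer itself is probably not the problem, so it may be worth looking at other things around the faulting instruction, such as the stack pointer and the full backtrace.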
I am using CentOS 6.3 64-bit and trying to develop a C++ application using IBM® Data Studio Version 4.1.0.1, which is a flavor of Eclipse. I have written a multithreaded application, and all was going well until I tried to do file I/O.
I used ofstream to write my data to a file, but I cannot debug this snippet of code: if I place any breakpoints in it, the IDE reports that the breakpoint cannot be placed.
This piece of code runs in a thread (pthread):
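ofstream myfile;
char chFileName[100];
// snprintf bounds the write to the buffer size; %s expects const char*,
// so pass .c_str() if strType or strGroupID are std::string
snprintf(chFileName, sizeof(chFileName), "Json_%s_%s_%d", strType, strGroupID, iIndex);
myfile.open(chFileName);
myfile << strJSON;
myfile.close();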
Here is the console output before the application and thread terminates.
[Thread debugging using libthread_db enabled]
[New Thread 0x7fffe9f1e700 (LWP 28380)]
[New Thread 0x7fffe951d700 (LWP 28381)]
[Switching to Thread 0x7fffe951d700 (LWP 28381)]
[Thread 0x7fffe951d700 (LWP 28381) exited]
[Thread 0x7fffe9f1e700 (LWP 28380) exited]
Is it unwise to perform file I/O in a thread?
Could someone please help me fix this issue?
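File I/O from a thread is fine in itself. A minimal sketch of the same write, assuming C++11 std::thread instead of the pthread API used above and a hypothetical helper name (write_json_in_thread); it checks that the file actually opened and joins the worker so the process cannot exit before the write completes:

```cpp
#include <fstream>
#include <string>
#include <thread>

// Hypothetical helper: writes `json` to `path` on a worker thread and waits for it.
// Returns true if the file was opened and written without a stream error.
bool write_json_in_thread(const std::string& path, const std::string& json) {
    bool ok = false;
    std::thread writer([&] {
        std::ofstream out(path);
        if (!out.is_open()) {
            return;  // open failed (bad path, permissions, ...)
        }
        out << json;
        out.close();
        ok = !out.fail();  // fail() is set if the write or the close failed
    });
    // join() makes the result visible here and keeps the caller from
    // continuing (or exiting) before the write has finished.
    writer.join();
    return ok;
}
```

If a version like this works outside the IDE, the breakpoint problem is more likely related to the debugger setup (for example, missing debug information for that file) than to doing file I/O in a thread.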