RX channel out of range for configured RX frontends - C++

I have a simple C++ test program for an Ettus X310 that used to work but now doesn't. I'm simply trying to set the center frequency of two channels of a single USRP. The out of range error in the title occurs when I try to set anything on the second channel.
I get a crash with a channel out of range error:
$ ./t2j.out
linux; GNU C++ version 4.8.4; Boost_105400; UHD_003.009.001-0-gf7a15853
-- X300 initialization sequence...
-- Determining maximum frame size... 1472 bytes.
-- Setup basic communication...
-- Loading values from EEPROM...
-- Setup RF frontend clocking...
-- Radio 1x clock:200
-- Initialize Radio0 control...
-- Performing register loopback test... pass
-- Initialize Radio1 control...
-- Performing register loopback test... pass
terminate called after throwing an instance of 'uhd::index_error'
what(): LookupError: IndexError: multi_usrp: RX channel 140445275195320 out of range for configured RX frontends
Aborted (core dumped)
Here is my test program:
#include <cstdio>
#include <string>
#include <vector>
// Header below is assumed for GNU Radio 3.7's gr-uhd component.
#include <gnuradio/uhd/usrp_source.h>

int main( void )
{
    // sources
    gr::uhd::usrp_source::sptr usrp1;
    const std::string usrp_addr = std::string( "addr=192.168.10.30" );
    uhd::stream_args_t usrp_args = uhd::stream_args_t( "fc32" );
    usrp_args.channels = std::vector<size_t> ( 0, 1 );
    usrp1 = gr::uhd::usrp_source::make( usrp_addr, usrp_args );
    usrp1->set_subdev_spec( std::string( "A:AB B:AB" ), 0 );
    usrp1->set_clock_source( "external" );
    usrp1->set_samp_rate( 5.0e6 );
    usrp1->set_center_freq( 70e6, 0 ); // this is OK
    usrp1->set_center_freq( 70e6, 1 ); // crashes here with RX channel out of range error!
    printf( "test Done!\n" );
    return 0;
}
The only thing I've found so far in searching is to make sure PYTHONPATH is set correctly (and, for the heck of it, I made sure it pointed to site-packages), but that seems to be related to GRC rather than C++.
I am using Ubuntu 14.04.4 and UHD 3.9.1 with GNU Radio 3.7.8.1 (I've also tried 3.7.9.2) with the same result.
The hardware is an Ettus X310 with two BasicRX daughterboards.

Someone from the gnuradio/uhd mailing list helped me. It turns out the vector initialization was wrong: std::vector<size_t>( 0, 1 ) calls the count/value constructor, which builds an empty vector (zero elements, each with value 1), so only the default channel 0 was ever configured.
Replace:
usrp_args.channels = std::vector<size_t> ( 0, 1 );
With these two lines:
usrp_args.channels.push_back( 0 );
usrp_args.channels.push_back( 1 );
There are other, more concise methods (see below), but this does the trick for now.
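For example, one of those more concise alternatives is brace initialization, which really does build the two-element {0, 1} channel list:
// Unlike std::vector<size_t>( 0, 1 ), list-initialization creates a vector
// that actually contains the elements 0 and 1.
usrp_args.channels = std::vector<size_t>{ 0, 1 };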
-Bob

Related

Pytorch inference time difference between CUDA 10.0 & 10.2

We have a working library that uses LibTorch 1.5.0, built with CUDA 10.0 which runs as expected.
We are working on upgrading to CUDA 10.2 for various non-PyTorch related reasons. We noticed that when we run LibTorch inference on the newly compiled LibTorch (compiled exactly the same, except changing to CUDA 10.2), the runtime is about 20x slower.
We also checked it using the precompiled binaries.
This was tested on 3 different machines using 3 different GPUs (Tesla T4, GTX 980 & P1000), and all give a consistent ~20x slowdown on CUDA 10.2
(both on Windows 10 & Ubuntu 16.04), all with the latest drivers and on 3 different torch scripts (of the same architecture).
I've simplified the code to be extremely minimal without external dependencies other than Torch
int main(int argc, char** argv)
{
// Initialize CUDA device 0
cudaSetDevice(0);
std::string networkPath = DEFAULT_TORCH_SCRIPT;
if (argc > 1)
{
networkPath = argv[1];
}
auto jitModule = std::make_shared<torch::jit::Module>(torch::jit::load(networkPath, torch::kCUDA));
if (jitModule == nullptr)
{
std::cerr << "Failed creating module" << std::endl;
return EXIT_FAILURE;
}
// Meaningless data, just something to pass to the module to run on
// PATCH_HEIGHT & WIDTH are defined as 256
uint8_t* data = new uint8_t[PATCH_HEIGHT * PATCH_WIDTH * 3];
memset(data, 0, PATCH_HEIGHT * PATCH_WIDTH * 3);
auto stream = at::cuda::getStreamFromPool(true, 0);
bool res = infer(jitModule, stream, data, PATCH_WIDTH, PATCH_HEIGHT);
std::cout << "Warmed up" << std::endl;
res = infer(jitModule, stream, data, PATCH_WIDTH, PATCH_HEIGHT);
delete[] data;
return 0;
}
// Inference function
bool infer(std::shared_ptr<JitModule>& jitModule, at::cuda::CUDAStream& stream, const uint8_t* inputData, int width, int height)
{
std::vector<torch::jit::IValue> tensorInput;
// This function uses cudaMemcpy to copy the data to the device and creates a torch::Tensor from it.
// I can paste it if it's relevant, but left it out for now to keep this as clean as possible.
if (!prepareInput(inputData, width, height, tensorInput, stream))
{
return false;
}
// Reduce memory usage, without gradients
torch::NoGradGuard noGrad;
{
at::cuda::CUDAStreamGuard streamGuard(stream);
auto totalTimeStart = std::chrono::high_resolution_clock::now();
jitModule->forward(tensorInput);
// The synchronize here is just for timing's sake, not used in production
cudaStreamSynchronize(stream.stream());
auto totalTimeStop = std::chrono::high_resolution_clock::now();
printf("forward sync time = %.3f milliseconds\n",
std::chrono::duration<double, std::milli>(totalTimeStop - totalTimeStart).count());
}
return true;
}
When we build this against LibTorch compiled with CUDA 10.0 we get a runtime of 18 ms, and when we run it with LibTorch compiled with CUDA 10.2 we get a runtime of 430 ms.
Any thoughts on that?
This issue was also posted on PyTorch Forums.
Issue on GitHub
UPDATE
I profiled this small program with both CUDA versions.
It seems that they use very different kernels:
96.5% of the CUDA 10.2 compute time is spent in conv2d_grouped_direct_kernel, which takes ~60-100 ms on my P1000,
whereas the top kernels in the CUDA 10.0 run are:
47.1% - cudnn::detail::implicit_convolve_sgemm (~1.5 ms)
23.1% - maxwell_scudnn_winograd_128x128_ldg1_ldg4_tile148n_nt (~0.4 ms)
8.5% - maxwell_scudnn_128x32_relu_small_nn (~0.4 ms)
So it's easy to see where the time difference comes from. Now the question is: why?
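Not part of the original post, but since the slowdown comes down to which convolution kernel gets selected, one thing that may be worth trying is cuDNN autotuning (the LibTorch counterpart of torch.backends.cudnn.benchmark = True), which benchmarks the candidate algorithms on the first forward pass and caches the fastest one. A minimal sketch, assuming it is called once before the warm-up inference:
#include <ATen/Context.h>

void enable_cudnn_autotuning()
{
    // Ask cuDNN to benchmark the available convolution algorithms per input
    // shape and reuse the fastest one, instead of relying on its heuristic choice.
    at::globalContext().setBenchmarkCuDNN( true );
}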

gcov report not showing coverage for functions called within boost::asio::io_service thread

I am using boost::asio::io_service together with a std::thread, as shown below.
function:
91 0 : void start_tcp( std::shared_ptr<mapping_t> m_mapping, l_publisher& ll_publisher,
92 : reg_publisher& reg_publisher, std::mutex& mtx, bool debug, const string& host,
93 : uint16_t port, uint16_t maximum_number_of_connections ) {
94 : spdlog::info( "start_tcp debug:{} host:{} port:{} max_connections:{}", debug, host, port,
95 0 : maximum_number_of_connections );
96 0 : modbus_t* ctx = nullptr;
97 0 : if( host == "0.0.0.0" ) {
98 0 : ctx = new_tcp( NULL, port );
99 : } else {
100 0 : ctx = new_tcp( host.c_str(), port );
101 : }
Above function is called from a thread inside main.cpp:
: std::thread tcp_thread{start_tcp,
424 : m_mapping,
425 0 : std::ref( l_publisher ),
426 0 : std::ref( reg_publisher ),
427 0 : std::ref( mtx ),
428 : debug_mode,
429 0 : input_file["bindAddress"].as<string>(),
430 0 : input_file["port"].as<unsigned>(),
431 0 : config["maximum-connections"].as<unsigned>()};
432 0 : io_service->run();
When I compile, I can see that .gcno files are generated, which means the -g -O0 --coverage compiler flags and the -lgcov linker flag are set properly. Also, after the test has run, the .gcda files are generated. But no coverage is shown for the function start_tcp, even though it is executed.
I can see the following printed to the console when the test executes, which means this function runs, yet the coverage report shows nothing for it:
spdlog::info( "start_tcp debug:{} host:{} port:{} max_connections:{}", debug,
excerpt from test output:
[2020-03-21 10:41:48.268] [info] start_tcp debug:false host:127.0.0.1 port:1502 max_connections:50\n'
The test is not a unit test; rather, it is a functional test written in Python (using the subprocess module) that validates the command line options of an executable built from the source.
Can someone help me with this problem? Am I doing anything wrong, or do threads started this way need some other special flags to get the right coverage report?
I resolved this problem. It has nothing to do with threads. The test uses the executable built from source: it spawns the process with Python's subprocess and, at the end of the test, terminates it with SIGTERM. The executable was not handling SIGTERM, so no .gcda files were created, as explained here:
http://gcc.1065356.n8.nabble.com/gcov-can-t-collect-data-when-process-is-executed-by-systemctl-start-but-it-can-when-executed-by-procs-td1396806.html
By default systemd terminates a service by sending SIGTERM.
(See systemd.kill(5).) If your service doesn't handle it and will just be
killed, it won't produce .gcda file.
Your service should handle SIGTERM and terminate cleanly, like:
#include <signal.h>
#include <stdlib.h>

void handler(int signum)
{
    /* notify the operator that the service has received SIGTERM
       and clean up (close file descriptors, etc). */
    exit(0);
}

int main(int argc, char **argv)
{
    signal(SIGTERM, handler);   /* note: signal(SIGTERM, handler), not signal(handler, SIGTERM) */
    /* do service */
}
By using exit(), the functions registered by atexit() and on_exit() would be
called. GCC registers one atexit function to produce .gcda file.
Even without gcov, it's recommended to catch SIGTERM and terminate your service cleanly.
After adding code to my main.cpp to handle SIGTERM as suggested above, everything works fine and I got the proper coverage report.
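For reference, a minimal sketch of what that handler can look like in a program like this one (the names are hypothetical; in the real main.cpp you would also stop the io_service and join the tcp_thread before exiting):
#include <csignal>
#include <cstdlib>

// Exiting through exit() runs the atexit handlers, including the one gcov
// registers to write the .gcda files.
static void sigterm_handler( int /*signum*/ )
{
    // Real cleanup (stop io_service, join tcp_thread, close sockets) goes here.
    std::exit( 0 );
}

int main( int argc, char** argv )
{
    std::signal( SIGTERM, sigterm_handler );
    // ... set up io_service, tcp_thread, etc. ...
}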
Thanks.

I'm trying to get the base address of loaded dylibs in osx

I have been trying for a long time to get the base addresses of loaded OS X dylibs in running processes at runtime.
I've dumped all attached dylibs at runtime using dyld_all_image_infos (after calling task_info), and got their names, image load address, mach_header, and segment_command.
But I can't get their base address at runtime.
Everything works great, except that I'm confused about how to get the actual base address of the requested image at runtime.
Also, the magic number I get after reading the mach_header at the image load address is "feedfacf". Shouldn't it be "feedface"?
After searching, I found that "fileoff" in "segment_command" should lead me to the data section. That isn't working either.
My output looks fine, but when I try to view this memory region in Cheat Engine, for example, it is empty!
My Debug output:
mach object header
magic number feedfacf
cputype 01000007 - x86_64
subcputype 00000003
filetype 00000006
ncmds 00000015
sizeofcmds 00000988
flags 00118085
Segment
File Off 45545f5f
DYLIB: client.dylib
My Code is the following:
// mach_header
mach_msg_type_number_t size4 = sizeof(mach_header);
uint8_t* info_addr2 =
readProcessMemory(this->task, (mach_vm_address_t) info[i].imageLoadAddress, &size4);
mach_header *info2 = (mach_header*) info_addr2;
// segment
mach_msg_type_number_t size5 = sizeof(segment_command);
uint8_t* info_addr3 =
readProcessMemory(this->task, (mach_vm_address_t) info[i].imageLoadAddress + sizeof(info2), &size5);
segment_command *info3 = (segment_command*) info_addr3;
tmp_str = tmp_str.substr(tmp_str.find_last_of("/") + 1, tmp_str.length());
//printf("%i Debug: %s\n", i, tmp_str.c_str());
this->dylibs[cc].Name = tmp_str;
this->dylibs[cc].imgHead = info2;
// this->dylibs[i].tSize = this->size_of_image((mach_header*)info[i].imageLoadAddress);
if(strcmp(tmp_str.c_str(), "client.dylib") == 0){
this->client = cc;
printf("mach object header\n");
printf("magic number\t%08x\n", info2->magic);
printf("cputype\t\t%08x - %s\n", info2->cputype,cpu_type_name(info2->cputype));//cputype_to_string(info2->filetype)
printf("subcputype\t%08x\n", info2->cpusubtype);
printf("filetype\t%08x\n", info2->filetype);// filetype_to_string(info2->filetype)
printf("ncmds\t\t%08x\n", info2->ncmds);
printf("sizeofcmds\t%08x\n", info2->sizeofcmds);
printf("flags\t\t%08x\n", info2->flags);
printf("Segment\n");
printf("File Off\t%x\n", info3->fileoff );
Could anyone help me?
I would appreciate it!
(P.S.: The code is a bit messy, but I have been working on this for days and can't get it to work, so I didn't want to write it in a nicer style right now!)
The vmaddr field of the segment_command structure (segment_command_64 on 64-bit OS X) is the address of the segment in virtual memory.
The structure of a mach-o image header (a dylib being one particular example) is the following:
| mach_header | load_command | load_command | ... | load_command |
The count of load_command structures is stored in the ncmds field of the mach_header. Some of them are segment_command structures; you can check with the cmd field of load_command: if cmd is LC_SEGMENT it is a segment_command, and if it is LC_SEGMENT_64 it is a segment_command_64. Just cast the load_command to the appropriate segment_command type.
Usually the mach_header is stored in the first bytes of the first segment of the dylib, so info[i].imageLoadAddress is what you are looking for, if I understand your question right.
One other thing: you mentioned the magic values feedfacf and feedface. feedface is MH_MAGIC, while feedfacf is MH_MAGIC_64. All modern MacBooks, iMacs, and Mac minis are 64-bit.
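To illustrate the walk described above, here is a minimal sketch (not the poster's code) that assumes the mach_header_64 and all of its load commands have already been copied out of the target process into a local buffer, for example with the readProcessMemory helper from the question:
#include <mach-o/loader.h>
#include <cstdint>
#include <cstdio>

// buffer points to a local copy of the remote image: mach_header_64 followed
// by sizeofcmds bytes of load commands.
void print_segments( const uint8_t* buffer )
{
    const mach_header_64* header = reinterpret_cast<const mach_header_64*>( buffer );
    if( header->magic != MH_MAGIC_64 ) {
        printf( "not a 64-bit mach-o image\n" );
        return;
    }
    // Load commands start immediately after the header.
    const uint8_t* cmd_ptr = buffer + sizeof( mach_header_64 );
    for( uint32_t i = 0; i < header->ncmds; ++i ) {
        const load_command* lc = reinterpret_cast<const load_command*>( cmd_ptr );
        if( lc->cmd == LC_SEGMENT_64 ) {
            const segment_command_64* seg = reinterpret_cast<const segment_command_64*>( lc );
            // vmaddr is the segment's address in virtual memory.
            printf( "segment %.16s vmaddr 0x%llx vmsize 0x%llx\n",
                    seg->segname, seg->vmaddr, seg->vmsize );
        }
        cmd_ptr += lc->cmdsize; // advance to the next load command
    }
}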

Angelscript - RegisterScriptArray fails

I am trying to get an AngelScript test running; however, calling RegisterScriptArray() fails:
System function (1, 39) : ERR : Expected '<end of file>'
(0, 0) : ERR : Failed in call to function 'RegisterObjectBehaviour' with 'array' and 'array<T># f(int&in type, int&in list) {repeat T}' (Code: -10)
the code is:
engine = asCreateScriptEngine(ANGELSCRIPT_VERSION);
// message callback
int r = engine->SetMessageCallback(asFUNCTION(as_messageCallback), 0, asCALL_CDECL); assert( r >= 0 );
RegisterStdString(engine);
RegisterScriptArray(engine, false);
r = engine->RegisterGlobalFunction("void print(const string &in)", asFUNCTION(as_print), asCALL_CDECL); assert( r >= 0 );
What should I do? If I comment out the call it works, but that's obviously not what I want to achieve, as I want arrays.
After asking on their forums I got a reply (actually quite some time ago).
http://www.gamedev.net/topic/657233-registerscriptarray-fails
In case the link dies:
The main issue was a version mismatch between the plugins (which I compiled and installed manually) and the core (which I installed through my package manager). Now I include the plugins in my code and the core is manually compiled.
Hope it helps others encountering the same issue.
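Not from the original thread, but a quick way to spot this kind of core/add-on version mismatch is to print the version the headers were built against next to the version of the library that is actually linked. A minimal sketch:
#include <angelscript.h>
#include <cstdio>

int main()
{
    // ANGELSCRIPT_VERSION_STRING is the header (compile-time) version;
    // asGetLibraryVersion() reports the version of the linked library.
    printf( "compiled against AngelScript %s\n", ANGELSCRIPT_VERSION_STRING );
    printf( "linked AngelScript library   %s\n", asGetLibraryVersion() );
    return 0;
}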

cocos2dx : Issue when returning multiple numberOfCellsInTableView in CCTableView?

I am creating more than one table view in cocos2dx using CCTableView. When I define the data source method numberOfCellsInTableView, I want to return a different value for each table view,
so I used an if statement,
but it seems I am unable to check the condition properly.
The code in my CCLayer class's init method is as follows:
CCLOG("init debug 10 %d",characterImageNameArray->count());
numberOfRowsIncharacterTable = characterImageNameArray->count();
this->characterTable = cocos2d::extension::CCTableView::create(this,cocos2d::CCSizeMake((winSize.width/6.0)-20, winSize.height-720.0));
The rest of the definition is as follows:
unsigned int numberOfCellsInTableView (cocos2d::extension::CCTableView * table)
{
CCLOG("init debug 11 ");
int rVal = 0;
if (table==this->characterTable) {
CCLOG("init debug 11a ");
rVal = this->characterImageNameArray->count();
}
CCLOG("init debug 12 rVal %d",rVal);
return rVal;
}
The following is the console debug log:
Cocos2d: init debug 9
Cocos2d: init debug 10 6
Cocos2d: init debug 11
Cocos2d: init debug 12 rVal 0
Cocos2d: init debug 11
Cocos2d: init debug 12 rVal 0
Cocos2d: init debug 11
Cocos2d: init debug 12 rVal 0
I can't figure out what has gone wrong.
I also present different table views in my scene, and I follow a different approach.
I create TableView* my_table_01 = Table_creation....
Then I set a tag with my_table_01->setTag(TAG_TABLE_01), and so on for the other tables, changing the tag.
Then, in numberOfCellsInTableView, I decide the number of cells by checking the tag of the table, as in the sketch below:
if (table->getTag() == ....) return number_cells_for_this_table;
You can also use the same approach for the cell size and have multiple table views.
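A minimal sketch of that tag-based data source (the tag values, the second table, and its array are made up for illustration):
// Hypothetical tags for two table views.
enum { TAG_TABLE_CHARACTER = 100, TAG_TABLE_ITEMS = 101 };

// After creating each table view:
//   this->characterTable->setTag( TAG_TABLE_CHARACTER );
//   this->itemsTable->setTag( TAG_TABLE_ITEMS );

unsigned int numberOfCellsInTableView( cocos2d::extension::CCTableView* table )
{
    switch( table->getTag() )
    {
        case TAG_TABLE_CHARACTER:
            return this->characterImageNameArray->count();
        case TAG_TABLE_ITEMS:
            return this->itemsImageNameArray->count(); // hypothetical second array
        default:
            return 0;
    }
}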