Trace32 - PBI=MCIServer illegal command error

I am creating a Ubicom32 core under the MCI-Server configuration in the T32 tree (T32 Start).
When I start the core, I get:
PBI=MCIServer (Illegal command)
Config=C:\Temp\T321000023.t32
T32SYS = C:\T32\ (config file)
I have the settings below in my T321000023.t32 file, which is auto-generated:
;Connection to Host
PBI=MCISERVER
NODE=localhost
INSTANCE=1
CORE=4
How can I fix this? Is any other configuration required for the MCI-Server setup?
I do not get the error when I start the Ubicom32 core via the simulator or a JTAG dongle.

Lauterbach supports the MCI-Server configuration only for a limited number of target processor architectures, and Ubicom32 is currently not in the list of supported architectures.
Support for the MCI-Server configuration requires not only the effort to implement this feature for an architecture, but also customers who are interested in the feature and willing to purchase it. Please contact Lauterbach support (support@lauterbach.com) to discuss feasibility and conditions.

Related

OpenVINO GPU performance optimization

I'm trying to speed up inference in a people-counter application. In order to use the GPU, I've set the inference engine configuration as described:
device_name = "GPU"
ie.SetConfig({ {PluginConfigParams::KEY_CONFIG_FILE, "./cldnn_global_custom_kernels/cldnn_global_custom_kernels.xml"} }, device_name);
and when loading the network on the inference engine, I've set the target device as described below:
CNNNetwork network = netReader.getNetwork();
TargetDevice t_device = InferenceEngine::TargetDevice::eGPU;
network.setTargetDevice(t_device);
const std::map<std::string, std::string> dyn_config = { { PluginConfigParams::KEY_DYN_BATCH_ENABLED, PluginConfigParams::YES } };
ie_.LoadNetwork(network,device_name, dyn_config);
but the inference engine still uses the CPU, which slows down the inference time. Is there a way to use the Intel GPU at maximum power to do inference on a particular network? I'm using the person-detection-retail-0013 model.
Thanks.
Did you mean person-detection-retail-0013? I haven't found pedestrian-detection-retail-013 in the open_model_zoo repo.
It might be expected that you see a slowdown while using the GPU. The network you tested has the following layers as part of its topology: PriorBox and DetectionOutput. Those layers are executed on the CPU, as the documentation says: https://docs.openvinotoolkit.org/latest/_docs_IE_DG_supported_plugins_CL_DNN.html
My guess is that this may be the reason for the slowdown.
But to be 100% sure, I would suggest running the benchmark_app tool to benchmark the model. This tool can print detailed performance information about each layer, which should help shed light on the real root cause of the slowdown. More information about benchmark_app can be found here: https://docs.openvinotoolkit.org/latest/_inference_engine_samples_benchmark_app_README.html
PS: Just a piece of advice regarding usage of the IE API: setTargetDevice is a deprecated method. It is enough to set the device using LoadNetwork, as in your example: ie_.LoadNetwork(network,device_name, dyn_config);
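As an illustration only, here is a minimal sketch of selecting the GPU purely through LoadNetwork with the Core API (this assumes an OpenVINO version that provides InferenceEngine::Core::ReadNetwork; the model file names are placeholders):

#include <inference_engine.hpp>

using namespace InferenceEngine;

int main() {
    Core ie;
    // Read the IR (file names here are placeholders).
    CNNNetwork network = ie.ReadNetwork("person-detection-retail-0013.xml",
                                        "person-detection-retail-0013.bin");
    // The target device is chosen here; no setTargetDevice call is needed.
    ExecutableNetwork executable = ie.LoadNetwork(network, "GPU");
    InferRequest request = executable.CreateInferRequest();
    // ... fill the input blobs and call request.Infer() ...
    return 0;
}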
Hope this helps.

TensorFlow places softmax op on CPU instead of GPU

I have a TensorFlow model with multiple inputs and several layers, and a final softmax layer. The model is trained in Python (using the Keras framework), then saved, and inference is done using a C++ program that uses a CMake build of TensorFlow (basically following these instructions: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/cmake).
In Python (tensorflow-gpu), all ops use the GPU (shown using log_device_placement):
out/MatMul: (MatMul): /job:localhost/replica:0/task:0/gpu:0
2017-12-04 14:07:38.005837: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\simple_placer.cc:872] out/MatMul: (MatMul)/job:localhost/replica:0/task:0/gpu:0
out/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/gpu:0
2017-12-04 14:07:38.006201: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\simple_placer.cc:872] out/BiasAdd: (BiasAdd)/job:localhost/replica:0/task:0/gpu:0
out/Softmax: (Softmax): /job:localhost/replica:0/task:0/gpu:0
2017-12-04 14:07:38.006535: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\simple_placer.cc:872] out/Softmax: (Softmax)/job:localhost/replica:0/task:0/gpu:0
To save the graph, the freeze_graph script is used (the script producing the log above loads the frozen graph in .pb format again).
When I use the C++ program and load the frozen graph (closely following the LoadGraph() function in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/main.cc - ReadBinaryProto() and session->Create()), and log the device placements again, I find that the Softmax is placed on the CPU (all other ops are on the GPU):
dense_6/MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
dense_6/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
dense_6/Relu: (Relu): /job:localhost/replica:0/task:0/device:GPU:0
out/MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
out/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:GPU:0
out/Softmax: (Softmax): /job:localhost/replica:0/task:0/device:CPU:0
This placement is also confirmed by high CPU/low GPU utilization, and is also apparent from profiling the application. The data type of the out layer is float32 (out/Softmax -> (<tf.Tensor 'out/Softmax:0' shape=(?, 1418) dtype=float32>,)).
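For reference, here is a condensed sketch of the loading path described above, with device-placement logging switched on from C++ (this assumes the TF 1.x C++ API; the graph path is a placeholder):

#include <memory>
#include <string>
#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/lib/core/errors.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/public/session.h"

tensorflow::Status LoadGraph(const std::string& graph_path,
                             std::unique_ptr<tensorflow::Session>* session) {
  tensorflow::GraphDef graph_def;
  // ReadBinaryProto() and Create(), as in the label_image example.
  TF_RETURN_IF_ERROR(tensorflow::ReadBinaryProto(tensorflow::Env::Default(),
                                                 graph_path, &graph_def));
  tensorflow::SessionOptions options;
  // Ask the placer to log where every op ends up (CPU vs. GPU).
  options.config.set_log_device_placement(true);
  session->reset(tensorflow::NewSession(options));
  return (*session)->Create(graph_def);
}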
Further investigation revealed:
Creating the softmax op in C++ and placing it on the GPU explicitly throws this error message:
Cannot assign a device for operation 'tsoftmax': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
A call to tensorflow::LogAllRegisteredKernels() also showed that Softmax is only available for the CPU!
The build directory contains many files related to "softmax" (e.g. tf_core_gpu_kernels_generated_softmax_op_gpu.cu.cc.obj.Release.cmake). I don't know how to check every compilation step, though.
When I look into tf_core_gpu_kernels.lib (one can open a .lib with 7-Zip ;)), there are files like "tf_core_gpu_kernels_generated_softmax_op_gpu.cu.cc.lib" - so I believe there is nothing wrong with compiling the kernels themselves.
But: inspecting tensorflow.dll with Dependency Walker shows that only CPU kernels for Softmax are included (there are functions like const tensorflow::SoftmaxOp<struct Eigen::ThreadPoolDevice,double>, but no GPU functions such as const tensorflow::SoftplusGradOp<struct Eigen::GpuDevice,float>).
Setup: TensorFlow 1.3.0, Windows 10, GPU: NVIDIA GTX 1070 (8 GB RAM; memory utilization also very low).
I found a workaround: include tf_core_gpu_kernels.lib in one of the build steps (create_def_file.py). More details here: GitHub Issue 15254

TensorFlow does not recognize GPU on AWS

So here it goes: I wanted to use TensorFlow with a GPU on AWS (p2.xlarge instance). Unfortunately, something must have gone wrong, and I continue to get:
InvalidArgumentError (see above for traceback): Cannot assign a device to node 'Variable_1': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
I checked both CUDA and cuDNN:
nvcc -V
cat /usr/local/cuda/include/cudnn.h
and got 8.0 and 5.1, respectively.
I call the GPU like this:
import tensorflow as tf

with tf.device('/gpu:0'):
    a = tf.Variable(tf.truncated_normal([100, 100]))
    b = tf.Variable(tf.truncated_normal([100, 1000]))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize a and b before using them
    sess.run(tf.matmul(a, b))
Happy to post more details if necessary - I don't know what will be useful yet.
I suppose you're trying to set up an EC2 instance from scratch? That can be difficult.
Instead, I'd strongly recommend using the Deep Learning AMI (https://aws.amazon.com/machine-learning/amis/). It comes preinstalled with everything you need (drivers, popular DL libraries, etc.). It's also free to use, you just pay for the instance itself.

How can I find the default MaxPermSize when -XX:+PrintFlagsFinal is not supported?

I'm working with a system where a number of jobs, implemented as Java applications, can be started simultaneously. Each job runs in a separate JVM.
Some of these jobs require a bigger permgen size than others. However, it is not feasible to allow all jobs to use the maximum value, as the OS memory is limited.
Therefore, I want to specify -XX:MaxPermSize for every job. Currently, the jobs are running without any -XX:MaxPermSize argument, so they must be using the default value. But how can I find out what the default value is?
I have seen Default values for Xmx, Xms, MaxPermSize on non-server-class machines where the accepted answer is to run java -XX:+PrintFlagsFinal, which should output the default values. However, the JVM version I'm running does not support that argument (Unrecognized VM option '+PrintFlagsFinal'). Updating to a newer JVM is not currently an option.
So what are my options for finding the default value?
System information:
> java -version
Java(TM) SE Runtime Environment (build 1.6.0_14-b08)
Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> cat /etc/issue
Welcome to SUSE Linux Enterprise Server 11 SP2 (x86_64) - Kernel \r (\l).
> uname -r
3.0.101-0.7.17-default
The default values for the various regions depend on:
The collector being used (which depends on the Java version, unless you're specifying the collector explicitly using CLI args).
The heap sizes etc. that you have specified using CLI args; the GC distributes the space according to certain ratios.
The installed (or otherwise available) RAM on the machine.
How to find out:
From GC log files (-Xloggc:gc.log): I would expect that at least the Full GC entries report the perm gen sizes; see the examples at the bottom. You can take a representative GC log file, find the max perm gen size from it, and decide based on that (a small parsing sketch follows the examples).
Additional params like PrintFlagsFinal etc. (specific to the Java version).
I'll look through the 1.6 options to see if I can find something and update the post; otherwise it's time to upgrade. :-)
Here are 3 examples from different GCs (Metaspace, CMS Perm, and PSPermGen are what you're looking for):
2014-11-14T08:43:53.197-0500: 782.973: [Full GC (Ergonomics) [PSYoungGen: 54477K->0K(917504K)] [ParOldGen: 1042738K->367416K(1048576K)] 1097216K->367416K(1966080K), [Metaspace: 46416K->46389K(1091584K)], 0.4689827 secs] [Times: user=3.52 sys=0.07, real=0.47 secs]
2014-10-29T06:14:56.491-0400: 6.754: [Full GC2014-10-29T06:14:56.491-0400: 6.754: [CMS: 96098K->113997K(5242880K), 0.7076870 secs] 735545K->113997K(6186624K), [CMS Perm : 13505K->13500K(51200K)], 0.7078280 secs] [Times: user=0.69 sys=0.01, real=0.71 secs]
2014-10-29T21:13:33.140-0500: 2644.411: [Full GC [PSYoungGen: 2379K->0K(695296K)] [ParOldGen: 1397977K->665667K(1398272K)] 1400357K->665667K(2093568K) [PSPermGen: 106995K->106326K(262144K)], 1.2151010 secs] [Times: user=6.83 sys=0.09, real=1.22 secs]
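If it helps, here is a throwaway sketch (in C++, purely for illustration) that scans a GC log for those entries and reports the largest committed perm gen/metaspace capacity seen; the entry formats are taken from the examples above:

#include <algorithm>
#include <fstream>
#include <iostream>
#include <regex>
#include <string>

int main(int argc, char** argv) {
    std::ifstream log(argc > 1 ? argv[1] : "gc.log");
    // Matches e.g. "[PSPermGen: 106995K->106326K(262144K)]"; the value in
    // parentheses is the committed capacity of the region.
    std::regex entry(R"((PSPermGen|CMS Perm|Metaspace)\s*:\s*\d+K->\d+K\((\d+)K\))");
    long max_capacity_k = 0;
    std::string line;
    std::smatch match;
    while (std::getline(log, line)) {
        if (std::regex_search(line, match, entry))
            max_capacity_k = std::max(max_capacity_k, std::stol(match[2]));
    }
    std::cout << "Largest committed perm/metaspace capacity: "
              << max_capacity_k << "K" << std::endl;
    return 0;
}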

How to read a multi-session DVD disk size in Windows?

Trying to read the sizes of disks that were created in multiple sessions using GetDiskFreeSpaceEx() gives the size of the last session only. How do I correctly read the number and sizes of all sessions in C/C++?
Thanks.
You might want to look at the DeviceIoControl API function. See here for control codes. Here is a code example that retrieves the size of a CD disk. Substitute
CreateFile(TEXT("\\\\.\\PhysicalDrive0")
with e.g.
CreateFile(TEXT("\\\\.\\F:") /* Drive is F: */
if you wish.
Note: The page says that DeviceIoControl can be used to "retrieve information about a floppy disk drive, hard disk drive, tape drive, or CD-ROM drive", but I have also tested it on a DVD, and it seemed to work perfectly. I did not have access to any multisession DVDs to test, so you'll have to test if that works yourself. If it doesn't work, I'd try some of the other control codes, at least IOCTL_DISK_GET_DRIVE_GEOMETRY_EX, IOCTL_DISK_GET_DRIVE_LAYOUT_EX, IOCTL_DISK_GET_LENGTH_INFO and IOCTL_DISK_GET_PARTITION_INFO_EX.
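To make that concrete, here is a minimal sketch (an illustration only, and untested on multisession DVDs, as noted above) that asks the drive for its reported length via IOCTL_DISK_GET_LENGTH_INFO:

#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

int main(void)
{
    /* Open the drive; substitute the letter for your drive. */
    HANDLE hDrive = CreateFile(TEXT("\\\\.\\F:"), GENERIC_READ,
                               FILE_SHARE_READ | FILE_SHARE_WRITE,
                               NULL, OPEN_EXISTING, 0, NULL);
    if (hDrive == INVALID_HANDLE_VALUE)
        return 1;

    GET_LENGTH_INFORMATION info;
    DWORD bytesReturned;
    if (DeviceIoControl(hDrive, IOCTL_DISK_GET_LENGTH_INFO, NULL, 0,
                        &info, sizeof(info), &bytesReturned, NULL))
        printf("Reported length: %lld bytes\n", info.Length.QuadPart);

    CloseHandle(hDrive);
    return 0;
}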
If all fails with DeviceIoControl, you could possibly make use of the Windows Image Mastering API (IMAPI). You'll need v2 of the API (included with Vista & later; it can be added to XP & 2003 too, see here: What's new in IMAPIv2) for DVD support. This API is primarily for CD burning, but perhaps it contains some functionality for retrieving the disk size; I'd find it weird if it didn't. In particular, this example seems interesting. I do not know if this one works for multisession disks either, but since it can create them, I guess it's likely.
Here are some resources for IMAPI:
MSDN - IMAPI
MSDN - IMAPI interfaces
MSDN - Creating multisession disks with IMAPI (note: example with VB, not C or C++)
Hey, I've got at least two solutions for you:
1) Download dvd+rw-mediainfo.exe from http://fy.chalmers.se/~appro/linux/DVD+RW/tools/win32/; it's a tool that reads info about your disc. Then just make a system call from your app and parse the results (see the sketch after the sample output below). Here's example output:
D:\Downloads>"dvd+rw-mediainfo.exe" f:
INQUIRY: [HL-DT-ST][DVDRAM GT30N ][1.01]
GET [CURRENT] CONFIGURATION:
Mounted Media: 10h, DVD-ROM
Current Write Speed: 1.0x1385=1385KB/s
Write Speed #0: 8.0x1385=11080KB/s
Write Speed #1: 4.0x1385=5540KB/s
Write Speed #2: 2.0x1385=2770KB/s
Write Speed #3: 1.0x1385=1385KB/s
Speed Descriptor#0: 00/2292991 R#8.0x1385=11080KB/s W#8.0x1385=11080KB/s
READ DVD STRUCTURE[#0h]:
Media Book Type: 01h, DVD-ROM book [revision 1]
Legacy lead-out at: 2292992*2KB=4696047616
READ DISC INFORMATION:
Disc status: complete
Number of Sessions: 1
State of Last Session: complete
Number of Tracks: 1
READ TRACK INFORMATION[#1]:
Track State: complete
Track Start Address: 0*2KB
Free Blocks: 0*2KB
Track Size: 2292992*2KB
Last Recorded Address: 2292991*2KB
FABRICATED TOC:
Track#1 : 17#0
Track#AA : 17#2292992
Multi-session Info: #1#0
READ CAPACITY: 2292992*2048=4696047616
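As a rough illustration of the "system call and parse" approach (the _popen invocation and the line prefixes are assumptions based on the sample output above):

#include <cstdio>
#include <iostream>
#include <string>

int main()
{
    // Run the tool and read its stdout line by line.
    FILE* pipe = _popen("dvd+rw-mediainfo.exe f:", "r");
    if (!pipe)
        return 1;

    char line[512];
    while (fgets(line, sizeof line, pipe)) {
        std::string s(line);
        // Keep only the fields we care about (prefixes from the output above).
        if (s.find("Number of Sessions:") != std::string::npos ||
            s.find("Track Size:") != std::string::npos ||
            s.find("READ CAPACITY:") != std::string::npos)
            std::cout << s;
    }
    _pclose(pipe);
    return 0;
}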
2) Investigate mciSendString from [DllImport("winmm.dll", EntryPoint = "mciSendStringA", CharSet = CharSet.Ansi)]; I suspect you can send some command and get the desired results.
PS: Of course, you may download the dvd+rw-mediainfo.exe sources from here and investigate further; I am just giving you ideas to think about.
UPDATE
Link to source code updated, thanks @oystein
There are many ways to do this, since DVD drives have several interfaces for it due to legacy and backward-compatibility issues.
You could send an IOCTL_SCSI_PASS_THROUGH_DIRECT command to the DVD drive (the physical-device handle for it). With it you issue SCSI commands that will be answered by the drive. You can read session information, disc information, disc capacity and more.
I believe that dvd+rw-mediainfo.exe issues these.
Unfortunately, the interface is a bit tricky and obscure, since it is a command within a command. The pass-through structure has a byte buffer you will have to fill in yourself with the command structure.
Or you can call IOCTL_CDROM_READ_TOC_EX:
http://www.osronline.com/ddkx/storage/k306_2cs2.htm
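A minimal sketch of that call (this assumes the CDROM_READ_TOC_EX and CDROM_TOC_SESSION_DATA declarations from ntddcdrm.h in the DDK/WDK; untested here on multisession DVDs):

#include <windows.h>
#include <winioctl.h>
#include <ntddcdrm.h>
#include <stdio.h>

int main(void)
{
    HANDLE hDrive = CreateFile(TEXT("\\\\.\\F:"), GENERIC_READ,
                               FILE_SHARE_READ | FILE_SHARE_WRITE,
                               NULL, OPEN_EXISTING, 0, NULL);
    if (hDrive == INVALID_HANDLE_VALUE)
        return 1;

    /* Ask for the session descriptor rather than the full TOC. */
    CDROM_READ_TOC_EX request = {0};
    request.Format = CDROM_READ_TOC_EX_FORMAT_SESSION;

    CDROM_TOC_SESSION_DATA sessions;
    DWORD bytesReturned;
    if (DeviceIoControl(hDrive, IOCTL_CDROM_READ_TOC_EX,
                        &request, sizeof(request),
                        &sessions, sizeof(sessions), &bytesReturned, NULL)) {
        printf("First complete session: %u\n", sessions.FirstCompleteSession);
        printf("Last complete session:  %u\n", sessions.LastCompleteSession);
    }
    CloseHandle(hDrive);
    return 0;
}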
I also believe that the exact set of IOCTLs/commands that will work depends on the drive and its firmware.
Older drives will not support the newer interfaces, and some of the newer drives will not support legacy interfaces.
Thus, some of the libraries & tools might use one or more of these interfaces.
Accessing the older sessions is all quite messy, really, since most OSes will not care about them, only the most recent ones.