OpenCL C++ context properties syntax - c++

I'm trying to learn OpenCL using the C++ bindings. The only thing I haven't understood so far is the following syntax. Trying to create a context based on a device type:
cl::Context context(CL_DEVICE_TYPE_CPU, properties);
I'm using nvidia's ICD, which as I understand won't let you create a context without defining the platform, so I need the second argument. From the standard, cl_context_properties should be a list of property names, followed by the corresponding values, ended by 0. There's only one cl_context_properties in the standard (table 4.4), which is the property CL_CONTEXT_PLATFORM and has property value of cl_platform_id type. Based on that I thought therefore that this should be OK:
cl_context_properties properties[] =
{ CL_CONTEXT_PLATFORM, platforms[0], 0};
where platforms is my vector of platforms. But it will fail to compile unless instead of platforms[0] I put:
(cl_context_properties)(platforms[0])()
This is from the example code in the cl.hpp header file.
1) It looks like platforms is being cast to type cl_context_properties. Why is this necessary?
2) Why is there an extra set of brackets () at the end?
Please assume that I'm not a C++ expert (definitely true). I know it's only a small thing but I don't like writing code that I don't understand fully.

I have no experience with OpenCL, so my answer is mainly about the C++ involved.
Why is the cast necessary?
The cast is necessary because you're declaring a C array, properties[], in which every element must be of type cl_context_properties.
Since cl_platform_id is a different type, it has to be cast to the element type, cl_context_properties.
You're using C-style cast that looks like this:
(type_to_cast_to)(expression_to_be_cast).
If expression_to_be_cast is just a variable, as in your case, you can omit the parentheses around it (this only shows the cast syntax; as explained below, platforms[0] still needs the trailing ()):
cl_context_properties properties[] =
{ CL_CONTEXT_PLATFORM, (cl_context_properties)platforms[0], 0};
Why is there an extra set of brackets () at the end?
You need the trailing () because platforms[0] is of type cl::Platform, which is not a plain type (like int, char or double) but a wrapper class. Calling its operator() returns the underlying cl_platform_id handle that you need. Because a function call binds more tightly than a cast, (cl_context_properties)(platforms[0])() first calls platforms[0]() and then casts the result.
So the following is a cleaner way to write it:
cl_context_properties properties[] =
{ CL_CONTEXT_PLATFORM, reinterpret_cast<cl_context_properties>(platforms[0]()), 0};
Here the C-style cast is replaced with a named C++ cast, which is generally preferable in C++ (you can read about it here), applied to the object returned by calling operator() on platforms[0]. reinterpret_cast is needed rather than static_cast because cl_platform_id is a pointer type and cl_context_properties is an integer type, and static_cast cannot convert a pointer to an integer.
The operator() is defined in the class cl::detail::Wrapper<T> (class reference), which is the parent class of cl::Platform.
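To make the two steps concrete without needing the OpenCL headers, here is a minimal sketch. Everything in it is a stand-in I made up for illustration: FakePlatformRec, fake_platform_id, fake_context_properties, FAKE_CONTEXT_PLATFORM (an arbitrary value) and the Wrapper template only imitate the shape of the real types, on the assumption that the handle is an opaque pointer and the property type is an integer wide enough to hold it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the OpenCL C types (assumption: the handle is an
// opaque pointer and the property type is a pointer-sized integer).
struct FakePlatformRec { int id; };
using fake_platform_id = FakePlatformRec*;
using fake_context_properties = std::intptr_t;
constexpr fake_context_properties FAKE_CONTEXT_PLATFORM = 1;  // arbitrary value

// Minimal wrapper in the spirit of cl::detail::Wrapper<T>: operator()
// hands back the raw handle stored inside the object.
template <typename T>
class Wrapper {
public:
    explicit Wrapper(T raw) : object_(raw) {}
    T operator()() const { return object_; }
private:
    T object_;
};
using FakePlatform = Wrapper<fake_platform_id>;

// Builds the zero-terminated { name, value, 0 } list for one platform.
std::vector<fake_context_properties> make_properties(const FakePlatform& p) {
    return {
        FAKE_CONTEXT_PLATFORM,
        // p() yields the raw pointer; reinterpret_cast turns it into an integer.
        reinterpret_cast<fake_context_properties>(p()),
        0
    };
}

void demo_properties() {
    FakePlatformRec rec{7};
    FakePlatform platform(&rec);
    const std::vector<fake_context_properties> props = make_properties(platform);
    assert(props.size() == 3 && props[0] == FAKE_CONTEXT_PLATFORM && props[2] == 0);
    // The stored integer converts back to the original handle.
    assert(reinterpret_cast<fake_platform_id>(props[1]) == &rec);
}
```

Replacing reinterpret_cast with static_cast in make_properties should be rejected by the compiler, which is the same reason the real property list needs a C-style or reinterpret cast.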

Calling vkEnumerateDeviceExtensionProperties "twice" - is it required?

From the man page for vkEnumerateDeviceExtensionProperties,
vkEnumerateDeviceExtensionProperties retrieves properties for
extensions on a physical device whose handle is given in
physicalDevice. To determine the extensions implemented by a layer set
pLayerName to point to the layer’s name and any returned extensions
are implemented by that layer. Setting pLayerName to NULL will return
the available non-layer extensions. pPropertyCount must be set to the
size of the VkExtensionProperties array pointed to by pProperties. The
pProperties should point to an array of VkExtensionProperties to be
filled out or null. If null, vkEnumerateDeviceExtensionProperties will
update pPropertyCount with the number of extensions found. The
definition of VkExtensionProperties is as follows:
(emphasis mine). It seems that in the current implementation (Windows SDK v1.0.13), pPropertyCount is updated with the number of extensions regardless of whether pProperties is null. However, the documentation doesn't appear to say explicitly what happens in this situation.
Here's an example of why having such a feature is 'nicer':
const uint32_t MaxCount = 1024; // More than you'll ever need
uint32_t ActualCount = MaxCount;
VkLayerProperties layers[MaxCount];
VkResult result = vkEnumerateDeviceLayerProperties(physicalDevice, &ActualCount, layers);
//...
vs.
uint32_t ActualCount = 0;
VkLayerProperties* layers = nullptr;
VkResult result = vkEnumerateDeviceLayerProperties(physicalDevice, &ActualCount, nullptr);
if (ActualCount > 0)
{
    // alloca returns void*, so C++ needs an explicit cast
    layers = static_cast<VkLayerProperties*>(alloca(ActualCount * sizeof(VkLayerProperties)));
    result = vkEnumerateDeviceLayerProperties(physicalDevice, &ActualCount, layers);
    //...
}
My question is: am I depending on unsupported functionality by doing this, or is this somehow advertised somewhere else in the documentation?
From the latest spec:
For both vkEnumerateInstanceExtensionProperties and vkEnumerateDeviceExtensionProperties, if pProperties is NULL, then the number of extensions properties available is returned in pPropertyCount. Otherwise, pPropertyCount must point to a variable set by the user to the number of elements in the pProperties array, and on return the variable is overwritten with the number of structures actually written to pProperties. If pPropertyCount is less than the number of extension properties available, at most pPropertyCount structures will be written. If pPropertyCount is smaller than the number of extensions available, VK_INCOMPLETE will be returned instead of VK_SUCCESS, to indicate that not all the available properties were returned.
So your approach is correct, even though it's a bit wasteful on memory. Similar functions returning arrays also behave like this.
Also note that since 1.0.13, device layers are deprecated. All instance layers are able to intercept commands to both the instance and the devices created from it.
Many Vulkan commands rely on a two-call pattern:
First call: get the number of structures or handles that will be returned.
Second call: pass a properly sized array to receive the requested structures or handles. In this second call, the count parameter tells Vulkan the size of your array.
If the second call returns VK_INCOMPLETE, the array you passed was too short to hold all the objects. Note that VK_INCOMPLETE is not an error but a partial success (2.6.2 Return Codes: "All successful completion codes are non-negative values.").
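The count/array contract described above can be tried out without a Vulkan driver. The sketch below is only a mock: mock_enumerate, MockResult and the three hard-coded items are invented for illustration, and the numeric value of MOCK_INCOMPLETE is arbitrary apart from being non-negative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result codes following the sign convention quoted above:
// success codes are non-negative, and "incomplete" is one of them.
enum MockResult { MOCK_SUCCESS = 0, MOCK_INCOMPLETE = 5 };

// Hypothetical enumerate function that follows the count/array contract.
MockResult mock_enumerate(std::uint32_t* count, int* items) {
    static const int available[] = {10, 20, 30};
    const std::uint32_t total = 3;
    if (items == nullptr) {          // no array: report how many items exist
        *count = total;
        return MOCK_SUCCESS;
    }
    const std::uint32_t n = std::min(*count, total);
    std::copy(available, available + n, items);
    *count = n;                      // overwritten with the number written
    return n < total ? MOCK_INCOMPLETE : MOCK_SUCCESS;
}

void demo_two_calls() {
    std::uint32_t count = 0;
    MockResult res = mock_enumerate(&count, nullptr);   // first call: count only
    assert(res == MOCK_SUCCESS && count == 3);

    std::vector<int> items(count);
    res = mock_enumerate(&count, items.data());         // second call: fill array
    assert(res == MOCK_SUCCESS && items[2] == 30);

    // A buffer that is too small gives a partial, non-negative result.
    int small[2] = {0, 0};
    std::uint32_t small_count = 2;
    res = mock_enumerate(&small_count, small);
    assert(res == MOCK_INCOMPLETE && res >= 0);
    assert(small_count == 2 && small[1] == 20);
}
```

The last part of demo_two_calls is the case the question asks about: passing a fixed-size array works, but only as long as the guess is at least as large as the real count.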
Your Question :
Am I depending on unsupported functionality by doing
this, or is this somehow advertised somewhere else in the
documentation?
You proposed creating a big array before calling the function, to avoid calling the Vulkan function twice.
My reply: yes, and you are making a bad design decision by "guessing"
the array size.
Please don't get me wrong. I strongly agree that it is annoying to call the same function twice, but you can solve that by wrapping these kinds of functions in something more programmer-friendly.
I'll use another Vulkan function to illustrate. Let's say you want to avoid the double call to:
VkResult vkEnumeratePhysicalDevices(
VkInstance instance,
uint32_t* pPhysicalDeviceCount,
VkPhysicalDevice* pPhysicalDevices);
A possible solution is to write a small wrapper function:
VkResult getPhysicalDevices(VkInstance instance, std::vector<VkPhysicalDevice>& container){
    uint32_t count = 0;
    VkResult res = vkEnumeratePhysicalDevices(instance, &count, NULL); // first call: get the count
    if (res < 0) // negative codes are errors
        return res;
    container.resize(count); // removes extra entries or allocates more
    res = vkEnumeratePhysicalDevices(instance, &count, container.data()); // second call: all entries overwritten
    return res; // VK_SUCCESS, or VK_INCOMPLETE if more devices appeared in between
}
That's my two cents on the double call to Vulkan functions. It is a naive implementation and may not work in all cases! Note that you must create the vector BEFORE you call the wrapper.
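If you end up writing several such wrappers, the pattern can be factored out once. The following is a rough sketch rather than anything from the Vulkan headers: enumerate_all is a name I made up, the callable is assumed to take (uint32_t* count, T* items) and return an int-like code where negative means failure, and the demo uses a lambda in place of a real Vulkan call.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Generic two-call helper (hypothetical). `enumerate` is any callable taking
// (uint32_t* count, T* items); `incomplete` is the partial-success code.
template <typename T, typename Enumerate>
int enumerate_all(Enumerate enumerate, int incomplete, std::vector<T>& out) {
    int res;
    do {
        std::uint32_t count = 0;
        res = enumerate(&count, static_cast<T*>(nullptr));  // query the count
        if (res < 0) { out.clear(); return res; }
        out.resize(count);
        res = enumerate(&count, out.data());                 // fill the array
        if (res < 0) { out.clear(); return res; }
        out.resize(count);  // keep only what was actually written
    } while (res == incomplete);  // the list grew between the calls: retry
    return res;
}

int demo_enumerate_all() {
    const std::vector<int> source = {1, 2, 3, 4};
    // Stand-in for a Vulkan enumerate call; 5 plays the role of "incomplete".
    auto mock = [&](std::uint32_t* count, int* items) -> int {
        const std::uint32_t total = static_cast<std::uint32_t>(source.size());
        if (!items) { *count = total; return 0; }
        const std::uint32_t n = *count < total ? *count : total;
        for (std::uint32_t i = 0; i < n; ++i) items[i] = source[i];
        *count = n;
        return n < total ? 5 : 0;
    };
    std::vector<int> out;
    const int res = enumerate_all<int>(mock, 5, out);
    assert(res == 0 && out == source);
    return res;
}
```

With a real Vulkan function you would pass a lambda that binds the extra arguments (the instance or physical device) and forwards the count and array pointers.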
Good Luck!

How to generate a cv::Mat type code?

I've been using the c-style api to generate opencv type codes. For example:
cv::Mat(h, w, CV_8UC2);
CV_8UC2 is a macro defined in types_c.h (deprecated?) in terms of CV_MAKETYPE:
#define CV_MAKETYPE(depth,cn) (CV_MAT_DEPTH(depth) + (((cn)-1) << CV_CN_SHIFT))
Is there a similar type code generation function in the c++ api, something like
Mat m(w,h, cv::Type(Vec<unsigned char, 2>).typecode()) ?
As I said in my comments, CV_MAKETYPE is not deprecated, and afaik it is the standard way of generating those "type codes".
However (and just for fun), an alternative, more C++-ish, way of generating arbitrary codes (still in compile time) can be achieved by using TMP...
template <int depth,
int cn>
struct make_type
{
enum {
// (yes, it is exactly the same expression used by CV_MAKETYPE)
value = ((depth) & CV_MAT_DEPTH_MASK) + (((cn)-1) << CV_CN_SHIFT)
};
};
// You can check that it works exactly the same as good, old `CV_MAKETYPE`
cout << make_type<CV_8U,2>::value << " "<< CV_MAKETYPE(CV_8U,2) << endl;
... but don't do this. While TMP is fun and amazing, CV_MAKETYPE is the right way of doing things in this case.
EDIT: OpenCV has its own type traits utilities. In core/traits.hpp we can find class DataType:
The DataType class is basically used to provide a description of ...
primitive data types without adding any fields or methods to the
corresponding classes (and it is actually impossible to add anything
to primitive C/C++ data types). This technique is known in C++ as
class traits. It is not DataType itself that is used but its
specialized versions
...
The main purpose of this class is to convert compilation-time type
information to an OpenCV-compatible data type identifier
...
So, such traits are used to tell OpenCV which data type you are working with,
even if such a type is not native to OpenCV.
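To show what "convert compilation-time type information to a data type identifier" looks like in practice, here is a self-contained sketch of the traits technique. It does not use OpenCV at all: kCnShift, kDepth8U, kDepth32F, depth_of, Vec and type_code are local stand-ins I defined for illustration, on the assumption that the encoding is the one in the CV_MAKETYPE macro quoted in the question (depth in the low three bits, channels minus one shifted left by three). The real DataType specializations are not reproduced here.

```cpp
#include <cstdint>

// Local stand-ins mirroring the encoding used by CV_MAKETYPE (assumption).
constexpr int kCnShift   = 3;
constexpr int kDepthMask = (1 << kCnShift) - 1;
constexpr int kDepth8U   = 0;
constexpr int kDepth32F  = 5;

constexpr int make_type(int depth, int cn) {
    return (depth & kDepthMask) + ((cn - 1) << kCnShift);
}

// Hypothetical trait: maps a C++ element type to a depth code.
template <typename T> struct depth_of;
template <> struct depth_of<std::uint8_t> { static constexpr int value = kDepth8U; };
template <> struct depth_of<float>        { static constexpr int value = kDepth32F; };

// Hypothetical fixed-size vector, standing in for cv::Vec<T, N>.
template <typename T, int N> struct Vec { T val[N]; };

// Trait that turns a compile-time type into a type code:
// scalars count as one channel, Vec<T, N> as N channels.
template <typename T> struct type_code {
    static constexpr int value = make_type(depth_of<T>::value, 1);
};
template <typename T, int N> struct type_code<Vec<T, N>> {
    static constexpr int value = make_type(depth_of<T>::value, N);
};

// Checked at compile time; 8 is the value CV_MAKETYPE(CV_8U, 2) produces.
static_assert(type_code<Vec<std::uint8_t, 2>>::value == 8, "two-channel 8-bit");
static_assert(type_code<float>::value == 5, "single-channel float");
```

The point of the trait is that no field or method is added to float or uint8_t themselves; the information lives in the specializations.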

D language function call with argument

I am learning D and have mostly experience in C#. Specifically I am trying to use the Derelict3 Binding to SDL2. I have been able to get some basic functionality working just fine but I have become stumped on how to create an array argument for a specific call.
The library contains a call
SDL_RenderDrawLines(SDL_Renderer*, const(SDL_Point)*, int) //Derelict3 Binding
And I have been unable to correctly form the argument for
const(SDL_Point)*
The SDL Documentation for this function states that this argument is an array of SDL_Point, but I am unclear how to create an appropriate array to pass to this function.
Here is an example of what I have at the moment:
void DrawShape(SDL_Renderer* renderer)
{
SDL_Point a = { x:10, y:10};
SDL_Point b = { x:500, y:500};
const(SDL_Point[2]) points = [a,b];
Uint8 q = 255;
SDL_SetRenderDrawColor(renderer,q,q,q,q);
SDL_RenderDrawLines(renderer,points,1);
}
And the compiler complains that I am not passing the correct type of argument for const(SDL_Point)* in points.
Error: function pointer SDL_RenderDrawLines (SDL_Renderer*, const(SDL_Point)*, int)
is not callable using argument types (SDL_Renderer*, const(SDL_Point[2u]), int)
I suspect this is a fundamental misunderstanding on my part so any help would be appreciated.
Arrays aren't implicitly castable to pointers in D. Instead, each array (both static and dynamic) has an intrinsic .ptr property that is a pointer to its first element.
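Change your code to:
SDL_RenderDrawLines(renderer, points.ptr, 2);
(The count argument is the number of points in the array, so it should be 2 here; with a count of 1 no line is drawn.)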
Given that the call asks for a pointer and a length, I feel it is safer to define your own wrapper (under its own name, so it does not clash with the binding's function pointer):
int renderDrawLines(SDL_Renderer* rend, const SDL_Point[] points){
    return SDL_RenderDrawLines(rend, points.ptr, cast(int) points.length);
}
(Why it isn't already defined I don't know; any performance hit from the extra function call is just an -inline away from being resolved.)

How to call a JITed LLVM function with unknown type?

I am implementing a front-end for a JIT compiler using LLVM. I started by following the Kaleidoscope example in the LLVM tutorial. I know how to generate and JIT LLVM IR using the LLVM C++ API. I also know how to call the JITed function, using the "getPointerToFunction" method of llvm::ExecutionEngine.
getPointerToFunction returns a void* which I must then cast to the correct function type. For example, in my compiler I have unit test that looks like the following:
void* compiled_func = compiler.get_function("f");
auto f = reinterpret_cast<int32_t(*)(int32_t)>(compiled_func);
int32_t result = f(10);
The problem is that I have to know the function signature beforehand. In the example above, I have a function "f" which takes a 32-bit integer and returns a 32-bit integer. Since I create "f" myself, I know what the function type is, so I'm able to call the JIT'ed function. However, in general, I do not know the function signatures (or the struct types) entered by the user. The user can create arbitrary functions with arbitrary arguments and return types, so I don't know which function pointer type to cast the void* from LLVM's getPointerToFunction to. My runtime needs to be able to call those functions (for a Read-Evaluate-Print loop, for example). How can I handle such arbitrary functions from my JIT runtime?
Thanks
There's not much information you can get from compiled_func - as you wrote, it's just a void*. But when you write "in general, I do not know what the function signature is", that's not accurate - you've just compiled that function, so you should have access to the LLVM Function object, which can be queried about its type. It's true that it's an LLVM IR type and not a C++ type, but you can often know which translates to which.
For example, if we borrow code from the tutorial's section on JITting Kaleidoscope:
if (Function *LF = F->Codegen()) {
LF->dump(); // Dump the function for exposition purposes.
// JIT the function, returning a function pointer.
void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
// Cast it to the right type (takes no arguments, returns a double) so we
// can call it as a native function.
double (*FP)() = (double (*)())(intptr_t)FPtr;
fprintf(stderr, "Evaluated to %f\n", FP());
}
Then yes, FPtr was "assumed" to be of type double (), but there's also LF of type Function* here, so you could have done something like:
Type* RetTy = LF->getReturnType();
if (RetTy->isDoubleTy()) {
double (*FP)() = (double (*)())(intptr_t)FPtr;
fprintf(stderr, "Evaluated to %f\n", FP());
} else if (RetTy->isIntegerTy(32)) {
int (*FP)() = (int (*)())(intptr_t)FPtr;
fprintf(stderr, "Evaluated to %d\n", FP());
} else ...
And in much the same way, you can query a function about its parameter types.
A bit cumbersome? You can use your execution engine to invoke the function, via its handy runFunction method, which receives a vector of GenericValues and returns a GenericValue. You should still query the Function type to find what the underlying type under each GenericValue should be.
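The "query the type, then pick the cast" idea can be sketched without LLVM. In the example below, RetKind, CompiledFn and call_and_format are names I invented; in a real front end the tag would be derived from the Function's return type as shown above, and the address would come from the execution engine. Two ordinary functions stand in for JIT output, and the sketch assumes a platform where a function pointer survives a round trip through void* (true on the usual desktop targets, but only conditionally supported by the standard).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical tag recorded by the front end when it compiles a function.
enum class RetKind { Double, Int32 };

struct CompiledFn {
    void*   address;  // what the JIT hands back
    RetKind ret;      // what the front end knows about it
};

// Calls a zero-argument compiled function and formats its result,
// choosing the function pointer type from the recorded tag.
std::string call_and_format(const CompiledFn& fn) {
    switch (fn.ret) {
    case RetKind::Double: {
        auto f = reinterpret_cast<double (*)()>(fn.address);
        return std::to_string(f());
    }
    case RetKind::Int32: {
        auto f = reinterpret_cast<std::int32_t (*)()>(fn.address);
        return std::to_string(f());
    }
    }
    return "<unsupported>";
}

// Ordinary functions standing in for JIT-compiled code.
static double       native_half()  { return 0.5; }
static std::int32_t native_seven() { return 7; }

void demo_dispatch() {
    const CompiledFn a{reinterpret_cast<void*>(&native_half), RetKind::Double};
    const CompiledFn b{reinterpret_cast<void*>(&native_seven), RetKind::Int32};
    assert(call_and_format(a) == "0.500000");
    assert(call_and_format(b) == "7");
}
```

This only scales to a fixed menu of signatures; for truly arbitrary argument lists you are back to something like runFunction with GenericValues, or to generating a uniform trampoline in IR.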

Dereferencing a pointer to a variable with an unknown type

I didn't know exactly how to explain the problem I am having, so sorry if the title of the question is vague.
What I have right now is a list of virtual addresses stored in variables. For example, I have
0x8c334dd
stored in a char variable. This is the address of another variable that holds data. What I want to do is go to that address and get the data stored there.
My assumption was that dereferencing the pointer would be the best way to go. Unfortunately I don't know the type of the variable that the address points to, so how does dereferencing work in this case? I cannot do: *(char *) 0x8c334dd because I don't know the type of the variable that the address points to...
If I cast it to (int *), I get the data for some of the variables (remember that I have several addresses), but for others I just get an address, and I need the data (these variables are structs, chars, etc.).
I am working with the ELF Symbol Table
In general, C++ or C have no way of knowing what type of pointer you have.
The usual way to solve this problem is to make the pointer point to a struct, and have a known position in the struct indicate the type of the data. Usually the known position is the first position in the struct.
Example:
#include <stdint.h> // for uint32_t
// signature value; use any value unlikely to happen by chance
#define VAR_SIG 0x11223344
typedef enum
{
    vartypeInvalid = 0,
    vartypeInt,
    vartypeFloat,
    vartypeDouble,
    vartypeString,
    vartypeMax // not a valid vartype
} VARTYPE;
typedef struct
{
    VARTYPE type;
#ifdef DEBUG
    uint32_t sig;
#endif // DEBUG
    union // declared as a member named data, so fields are v.data.i, v.data.f, ...
    {
        int i;
        float f;
        double d;
        char *s;
    } data;
} VAR;
You can then do a sanity check: you can see if the type field has a value greater than vartypeInvalid and less than vartypeMax (and you will never need to edit those names in the sanity check code; if you add more types, you add them before vartypeMax in the list). Also, for a DEBUG build, you can check that the signature field sig contains some specific signature value. (This means that your init code to init a VAR instance needs to always set the sig field, of course.)
If you do something like this, then how do you initialize it? Runtime code will always work:
VAR v;
#ifdef DEBUG
v.sig = VAR_SIG;
#endif // DEBUG
v.type = vartypeFloat;
v.data.f = 3.14f; // assign through the union member that matches the type tag
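The sanity check described above might look roughly like this. It is a sketch written as C++ rather than C, with names of my own (Var, kVarSig, var_is_valid, var_as_double), and it keeps the signature field unconditionally instead of only in DEBUG builds to keep the example short.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kVarSig = 0x11223344;  // same idea as VAR_SIG above

enum VarType {
    vartypeInvalid = 0,
    vartypeInt,
    vartypeFloat,
    vartypeDouble,
    vartypeString,
    vartypeMax  // not a valid type; new types go before this entry
};

struct Var {
    VarType       type;
    std::uint32_t sig;  // kept unconditionally here for simplicity
    union {
        int         i;
        float       f;
        double      d;
        const char* s;
    } data;
};

// Hypothetical helper: true when the tag is in range and the signature matches.
bool var_is_valid(const Var* v) {
    return v != nullptr &&
           v->type > vartypeInvalid && v->type < vartypeMax &&
           v->sig == kVarSig;
}

// Reads the payload as a double when the tag says it is numeric.
bool var_as_double(const Var* v, double* out) {
    if (!var_is_valid(v)) return false;
    switch (v->type) {
    case vartypeInt:    *out = v->data.i; return true;
    case vartypeFloat:  *out = v->data.f; return true;
    case vartypeDouble: *out = v->data.d; return true;
    default:            return false;  // strings are not numeric
    }
}

void demo_var() {
    Var v{};
    v.type = vartypeFloat;
    v.sig = kVarSig;
    v.data.f = 1.5f;
    double d = 0.0;
    const bool ok = var_as_double(&v, &d);
    assert(ok && d == 1.5);

    v.sig = 0;  // a corrupted signature makes the check fail
    const bool bad = var_as_double(&v, &d);
    assert(!bad);
}
```

Every read goes through the tag, so the code never has to guess which union member is live.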
What if you want to initialize it at compile time? It's easy if you want to initialize it with an integer value, because the int type is the first type in the union:
VAR v =
{
vartypeInt,
#ifdef DEBUG
VAR_SIG,
#endif // DEBUG
1234
};
If you are using a C99 compliant version of C, you can actually initialize the struct with a field name and have it assign any type. But Microsoft C isn't C99 compliant, so the above is a nightmare if you want to init your struct with a float or double value. (If you cast the float value to an integer, C won't just change the type, it will round the value; and there is no trick I know of to portably get a 32-bit integer value that correctly represents a 32-bit float at compile time in a C program.)
See also: "Compile time float packing/punning".
If you are working with pointers, though, that's easy. Just make the first field name in the union be a pointer type, cast the pointer to void * and init the struct as above (the pointer would go where 1234 went above).
If you are reading tables written by someone else's code, and you don't have a way to add a type identifier field, I don't have a general answer for you. I guess you could try reading the pointer out as different types, and see which one(s) work?
Just wanted to add something for people out there working with the ELF symbol table: I've found the DIEs in the DWARF file easier to work with. You can get the addresses, types and names of variables using DWARF instead of ELF, and libdwarf has good documentation.