CUDA kernel launch macro with templates


I made a macro to simplify CUDA kernel calls:
#define LAUNCH LAUNCH_ASYNC
#define LAUNCH_ASYNC(kernel_name, gridsize, blocksize, ...) \
LOG("Async kernel launch: " #kernel_name); \
kernel_name <<< (gridsize), (blocksize) >>> (__VA_ARGS__);
#define LAUNCH_SYNC(kernel_name, gridsize, blocksize, ...) \
LOG("Sync kernel launch: " #kernel_name); \
kernel_name <<< (gridsize), (blocksize) >>> (__VA_ARGS__); \
cudaDeviceSynchronize(); \
// error check, etc...
Usage:
LAUNCH(my_kernel, 32, 32, param1, param2)
LAUNCH(my_kernel<int>, 32, 32, param1, param2)
This works fine; with the first define I can enable synchronous calls and error checking for debugging.
However it does not work with multiple template arguments like below:
LAUNCH(my_kernel<int,float>, 32, 32, param1, param3)
The error message I get in the line where I call the macro:
error : expected a ">"
Is it possible to make this macro work with multiple template arguments?

The problem is that the preprocessor knows nothing about angle bracket nesting, so it interprets the comma between them as a macro argument separator.
If the kernel-launch syntax supports parentheses around the kernel name (I can't check now, not on a CUDA machine), you could do this:
LAUNCH((my_kernel<int, float>), 32, 32, param1, param3)
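The effect of the extra parentheses can be sketched in host-only C++. This is an illustration under assumptions: `combined_size` and `CALL` are hypothetical names standing in for the kernel and the launch macro, and the sketch says nothing about whether nvcc accepts a parenthesised kernel name before `<<< >>>`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a kernel with two template parameters.
template <typename A, typename B>
std::size_t combined_size() { return sizeof(A) + sizeof(B); }

// One-parameter macro: a top-level comma in the argument is treated
// as an argument separator by the preprocessor.
#define CALL(fn) fn()

// CALL(combined_size<char, short>) would not compile: the macro would
// receive two arguments, "combined_size<char" and "short>".

inline std::size_t demo_parenthesised_call() {
    // The extra parentheses hide the comma from the preprocessor, and
    // (combined_size<char, short>)() is still a valid call expression.
    return CALL((combined_size<char, short>));
}
```

Calling `demo_parenthesised_call()` returns the sum of the two type sizes, showing that the whole template-id reached the macro as one argument.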

Something else you could try that I have used (based on the macro you posted) is wrapping the kernel block size and grid size arguments in their own macro:
#define KERNEL_ARGS2(grid, block) <<< grid, block >>>
#define KERNEL_ARGS3(grid, block, sh_mem) <<< grid, block, sh_mem >>>
#define KERNEL_ARGS4(grid, block, sh_mem, stream) <<< grid, block, sh_mem, stream >>>
Now you should be able to use your macro like so:
#define CUDA_LAUNCH(kernel_name, gridsize, blocksize, ...) \
kernel_name KERNEL_ARGS2(gridsize, blocksize)(__VA_ARGS__);
You can use it like:
CUDA_LAUNCH(my_kernel, grid_size, block_size, input, output, size);
This will launch the kernel called 'my_kernel' with the given grid and block size and the input arguments (here input and output are float* and size is an int).

Consider this solution, which also reports launch errors:
inline void echoError(cudaError_t e, const char *strs) {
if (e != cudaSuccess) {
fprintf(stderr, "Failed to launch %s, error: %s\n",
strs, cudaGetErrorString(e));
exit(EXIT_FAILURE);
}
}
#define CUDA_KERNEL_DYN(kernel, bpg, tpb, shd, ...){ \
kernel<<<bpg,tpb,shd>>>( __VA_ARGS__ ); \
cudaError_t err = cudaGetLastError(); \
echoError(err, #kernel); \
}

Related

Error in macro with __VA_OPT__ and parenthesis

I am using C++ with gcc. I have logging code with a macro like this:
#define E_DEBUG(level, ...) \
if (err_get_debug_level() >= level) \
err_msg(ERR_DEBUG, FILELINE, __VA_OPT__)
#define ERR_DEBUG 1
#define FILELINE __FILE__ , __LINE__
int err_get_debug_level(void);
void err_msg(int lvl, const char *path, long ln, const char *fmt, ...);
int main ( void ) {
E_DEBUG(1,("%d",14));
}
The code gives the error: __VA_OPT__ must be followed by an open parenthesis
I changed the code according to "Error in macro with __va_args__ and parenthesis".
The code now looks like:
#define PASTE(...) __VA_OPT__
#define E_DEBUG(level, ...) \
if (err_get_debug_level() >= level) \
err_msg(ERR_DEBUG, FILELINE, PASTE __VA_OPT__)
#define ERR_DEBUG 1
#define FILELINE __FILE__ , __LINE__
int err_get_debug_level(void);
void err_msg(int lvl, const char *path, long ln, const char *fmt, ...);
int main ( void ) {
E_DEBUG(1,("%d",14));
}
It gives the error: unterminated __VA_OPT__. How should I fix it?

__VA_OPT__ is used to conditionally insert something in your macro; it is not equivalent to __VA_ARGS__. You need both:
#define E_DEBUG(level, ...) \
if (err_get_debug_level() >= level) \
err_msg(ERR_DEBUG, FILELINE __VA_OPT__(,) __VA_ARGS__)
This will not compile with your code because you used ("%d", 14) and I don't really understand why. If you remove the extra brackets, the code compiles.
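A minimal sketch of the same `__VA_OPT__(,) __VA_ARGS__` pairing, assuming a compiler that supports `__VA_OPT__` (C++20, or recent GCC/Clang as an extension). `FORMAT_INTO` and `demo_va_opt` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: formats into a caller-supplied char array.
// __VA_OPT__(,) emits the comma only when variadic arguments are present.
#define FORMAT_INTO(buf, fmt, ...) \
    std::snprintf((buf), sizeof(buf), fmt __VA_OPT__(,) __VA_ARGS__)

inline bool demo_va_opt() {
    char with_args[32];
    char without_args[32];
    FORMAT_INTO(with_args, "%d-%d", 1, 2);   // comma inserted before 1, 2
    FORMAT_INTO(without_args, "plain");      // no trailing comma emitted
    return std::strcmp(with_args, "1-2") == 0 &&
           std::strcmp(without_args, "plain") == 0;
}
```

Both calls expand to a well-formed `snprintf` call, which is the property the answer relies on.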

C++ assert with time stamp

Is it possible to log information, with a timestamp, when an assert fails?
For example:
int a = 10;
assert( a > 100 );
This fails, and I would like the output to include the timestamp, like this:
2013-12-02 , 17:00:05 assert failed !! (a > 100) line : 22
Thank you

assert is a macro (it has to be one, to give __LINE__ and __FILE__ information).
You could define your own. I would name it something else like tassert for readability reasons, perhaps like (untested code)
#ifdef NDEBUG
#define tassert(Cond) do {if (0 && (Cond)) {}; } while(0)
#else
#define tassert_at(Cond,Fil,Lin) do { if (!(Cond)) { \
time_t now##Lin = time(NULL); \
char tbuf##Lin [64]; struct tm tm##Lin; \
localtime_r(&now##Lin, &tm##Lin); \
strftime (tbuf##Lin, sizeof(tbuf##Lin), \
"%Y-%m-%d,%T", &tm##Lin); \
fprintf(stderr, "tassert %s failure: %s %s:%d\n", \
#Cond, tbuf##Lin, Fil, Lin); \
abort(); }} while(0)
#define tassert(Cond) tassert_at(Cond,__FILE__,__LINE__)
#endif /*NDEBUG*/
I am using cpp concatenation ## with Lin to lower the probability of name collisions, and cpp stringification # to make a string out of the Cond macro formal. Cond is always expanded, so the compiler catches syntax errors in it even when tassert is disabled with NDEBUG (the mechanism assert(3) uses).
One could put most of the code in the above macro in some function, e.g.
void tassert_at_failure (const char* cond, const char* fil, int lin) {
time_t now = time(NULL);
char tbuf[64]; struct tm tm;
localtime_r (&now, &tm);
strftime (tbuf, sizeof(tbuf), "%Y-%m-%d,%T", &tm);
fprintf (stderr, "tassert %s failure: %s %s:%d\n",
cond, tbuf, fil, lin);
abort();
}
and then just define (a bit like <assert.h> does...)
#define tassert_at(Cond,Fil,Lin) do { if (!(Cond)) { \
tassert_at_failure(#Cond, Fil, Lin); }} while(0)
but I don't like much that approach, because for debugging with gdb having  abort() being called in the macro is much easier (IMHO size of code for debugging executables does not matter at all; calling abort in a macro is much more convenient inside gdb - making shorter backtraces and avoiding one down command...). If you don't want libc portability and just use recent GNU libc you could simply redefine the Glibc specific __assert_fail function (see inside <assert.h> header file). YMMV.
BTW, in real C++ code I prefer to use << for assertion-like debug outputs. This enables usage of my own operator << outputting routines (if you give it as an additional macro argument) so I am thinking of (untested code!)
#define tassert_message_at(Cond,Out,Fil,Lin) \
do { if (!(Cond)) { \
time_t now##Lin = time(NULL); \
char tbuf##Lin [64]; struct tm tm##Lin; \
localtime_r(&now##Lin, &tm##Lin); \
strftime (tbuf##Lin, sizeof(tbuf##Lin), \
"%Y-%m-%d,%T", &tm##Lin); \
std::clog << "assert " << #Cond << " failed " \
<< tbuf##Lin << " " << Fil << ":" << Lin \
<< Out << std::endl; \
abort (); } } while(0)
#define tassert_message(Cond,Out) \
tassert_message_at(Cond,Out,__FILE__,__LINE__)
and then I would use tassert_message(i>5,"i=" << i);
BTW, you might want to use syslog(3) instead of fprintf in your tassert_at macro.
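The timestamp formatting can also be separated from the abort so that it can be checked on its own. The following is a sketch using only the standard library. `format_tassert_failure` is a hypothetical helper, not part of the answer above, and the output layout simply mimics the one requested in the question:

```cpp
#include <cassert>
#include <ctime>
#include <sstream>
#include <string>

// Hypothetical helper: builds the failure line but leaves printing and
// abort() to the caller.
inline std::string format_tassert_failure(const char* cond,
                                          const char* file, int line) {
    std::time_t now = std::time(nullptr);
    char tbuf[64] = "";
    // std::localtime is not thread-safe; it is used here only to stay
    // within the standard library (localtime_r is POSIX).
    if (const std::tm* tm = std::localtime(&now)) {
        std::strftime(tbuf, sizeof(tbuf), "%Y-%m-%d , %H:%M:%S", tm);
    }
    std::ostringstream os;
    os << tbuf << " assert failed !! (" << cond << ") line : " << line
       << " file : " << file;
    return os.str();
}
```

A macro in the style of tassert_at could pass `#Cond`, `__FILE__` and `__LINE__` to this helper, write the result to stderr, and then call `abort()` itself.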

How to create a macro that throws on verifying HR and also logs?

I have a scenario in the code where the following pattern exists:
if (!function(A))
{
log("this is the %d error in this file called %s", num, fileName);
throw AppException(FUNCTION_ERROR);
}
The issue with this is that you need to do it all the time, and the code looks really dirty. So I want to define a macro like this:
#define VerifyOrThrow(b, retcode, logerror)
if (b == 0) \
{ \
log(logerror,arg1, arg2) -->this is the issue \
throw(AppException(retcode)); \
}
then I can use it like this in a single line
VerifyOrThrow(functionA(), FUNCTION_ERROR,this is the %d error in this file called %s);
The issue is I am not sure how to define the macro for the variable length argument for the log string.
Any ideas?

Use __VA_ARGS__ as:
#define VerifyOrThrow(b, retcode, ...) \
if ((b) == 0) \
{ \
log(__VA_ARGS__); \
throw(AppException(retcode)); \
}
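A self-contained sketch of the same idea follows. It makes assumptions: `AppException` here is a stand-in defined for the example, logging goes to stderr through `fprintf` rather than the question's `log`, and the macro is wrapped in `do { } while (0)` so it behaves like a single statement.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>

// Assumed stand-in for the question's AppException.
struct AppException : std::runtime_error {
    explicit AppException(int c)
        : std::runtime_error("AppException"), code(c) {}
    int code;
};

// Logs with printf-style arguments, then throws.
#define VERIFY_OR_THROW(cond, retcode, ...)      \
    do {                                         \
        if (!(cond)) {                           \
            std::fprintf(stderr, __VA_ARGS__);   \
            throw AppException(retcode);         \
        }                                        \
    } while (0)

inline int demo_verify(bool ok) {
    try {
        VERIFY_OR_THROW(ok, 7, "error %d in file %s\n", 1, "demo.cpp");
    } catch (const AppException& e) {
        return e.code;   // reached only when ok is false
    }
    return 0;
}
```

The format string and its arguments travel together through `__VA_ARGS__`, so the call site stays on one line.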

My favourite logging macro in C++:
#define ATHROW( msg ) \
{ \
std::ostringstream os; \
os << msg; \
throw ALib::Exception( os.str(), __LINE__, __FILE__ ); \
}
In this case, I throw an exception, but you could do whatever you want after you have formatted the string. In use:
ATHROW( "The value of x is " << x << " when it should be " << correct );
This has all the advantages of the C++ stream output system: type safety, extensibility, etc. It is also portable to compilers that predate C++11, which variadic macros are not (they only became part of the C++ standard with C++11).
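A runnable variant of the stream-based macro might look like this. It is a sketch that assumes `std::runtime_error` in place of `ALib::Exception` (whose constructor is not shown in the answer), and `STREAM_THROW` and `demo_stream_throw` are hypothetical names:

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// The macro argument is pasted after "os_ <<", so it may itself
// contain further << operators.
#define STREAM_THROW(msg)                          \
    do {                                           \
        std::ostringstream os_;                    \
        os_ << msg;                                \
        throw std::runtime_error(os_.str());       \
    } while (0)

inline std::string demo_stream_throw(int x, int correct) {
    try {
        STREAM_THROW("The value of x is " << x
                     << " when it should be " << correct);
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return "";
}
```

Because the message contains no top-level commas, a single macro parameter is enough and no variadic macro is needed.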

How to enable the TRACE macro in Release mode?

The TRACE macro can be used to output diagnostic messages to the debugger when the code is compiled in Debug mode. I need the same messages while in Release mode. Is there a way to achieve this?
(Please do not waste your time discussing why I should not be using TRACE in Release mode :-)

Actually, the TRACE macro is a lot more flexible than OutputDebugString. It takes a printf() style format string and parameter list whereas OutputDebugString just takes a single string. In order to implement the full TRACE functionality in release mode you need to do something like this:
void trace(const char* format, ...)
{
char buffer[1000];
va_list argptr;
va_start(argptr, format);
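vsnprintf(buffer, sizeof(buffer), format, argptr); // bounded; wvsprintf can write up to 1024 characters
va_end(argptr);
OutputDebugStringA(buffer); // ANSI variant, since buffer is char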
}

A few years back I needed similar functionality so I cobbled together the following code. Just save it into a file, e.g. rtrace.h, include it at the end of your stdafx.h, and add _RTRACE to the release mode Preprocessor defines.
Maybe someone will find a use for it :-)
John
#pragma once
//------------------------------------------------------------------------------------------------
//
// Author: John Cullen
// Date: 2006/04/12
// Based On: MSDN examples for variable argument lists and ATL implementation of TRACE.
//
// Description: Allows the use of TRACE statements in RELEASE builds, by overriding the
// TRACE macro definition and redefining in terms of the RTRACE class and overloaded
// operator (). Trace output is generated by calling OutputDebugString() directly.
//
//
// Usage: Add to the end of stdafx.h and add _RTRACE to the preprocessor defines (typically
// for RELEASE builds, although the flag will be ignored for DEBUG builds.
//
//------------------------------------------------------------------------------------------------
#ifdef _DEBUG
// NL defined as a shortcut for writing FTRACE(_T("\n")); for example, instead write FTRACE(NL);
#define NL _T("\n")
#define LTRACE TRACE(_T("%s(%d): "), __FILE__, __LINE__); TRACE
#define FTRACE TRACE(_T("%s(%d): %s: "), __FILE__, __LINE__, __FUNCTION__); TRACE
#else // _DEBUG
#ifdef _RTRACE
#undef TRACE
#define TRACE RTRACE()
#define LTRACE RTRACE(__FILE__, __LINE__)
#define FTRACE RTRACE(__FILE__, __LINE__, __FUNCTION__)
#define NL _T("\n")
class RTRACE
{
public:
// default constructor, no params
RTRACE(void) : m_pszFileName( NULL ), m_nLineNo( 0 ), m_pszFuncName( NULL ) {};
// overloaded constructor, filename and lineno
RTRACE(PCTSTR const pszFileName, int nLineNo) :
m_pszFileName(pszFileName), m_nLineNo(nLineNo), m_pszFuncName(NULL) {};
// overloaded constructor, filename, lineno, and function name
RTRACE(PCTSTR const pszFileName, int nLineNo, PCTSTR const pszFuncName) :
m_pszFileName(pszFileName), m_nLineNo(nLineNo), m_pszFuncName(pszFuncName) {};
virtual ~RTRACE(void) {};
// no arguments passed, e.g. RTRACE()()
void operator()() const
{
// no arguments passed, just dump the file, line and function if requested
OutputFileAndLine();
OutputFunction();
}
// format string and parameters passed, e.g. RTRACE()(_T("%s\n"), someStringVar)
void operator()(const PTCHAR pszFmt, ...) const
{
// dump the file, line and function if requested, followed by the TRACE arguments
OutputFileAndLine();
OutputFunction();
// perform the standard TRACE output processing
va_list ptr; va_start( ptr, pszFmt );
INT len = _vsctprintf( pszFmt, ptr ) + 1;
TCHAR* buffer = (PTCHAR) malloc( len * sizeof(TCHAR) );
_vstprintf( buffer, pszFmt, ptr );
va_end( ptr );
OutputDebugString(buffer);
free( buffer );
}
private:
// output the current file and line
inline void OutputFileAndLine() const
{
if (m_pszFileName && _tcslen(m_pszFileName) > 0)
{
INT len = _sctprintf( _T("%s(%d): "), m_pszFileName, m_nLineNo ) + 1;
PTCHAR buffer = (PTCHAR) malloc( len * sizeof(TCHAR) );
_stprintf( buffer, _T("%s(%d): "), m_pszFileName, m_nLineNo );
OutputDebugString( buffer );
free( buffer );
}
}
// output the current function name
inline void OutputFunction() const
{
if (m_pszFuncName && _tcslen(m_pszFuncName) > 0)
{
INT len = _sctprintf( _T("%s: "), m_pszFuncName ) + 1;
PTCHAR buffer = (PTCHAR) malloc( len * sizeof(TCHAR) );
_stprintf( buffer, _T("%s: "), m_pszFuncName );
OutputDebugString( buffer );
free( buffer );
}
}
private:
PCTSTR const m_pszFuncName;
PCTSTR const m_pszFileName;
const int m_nLineNo;
};
#endif // _RTRACE
#endif // _DEBUG

TRACE is just a macro for OutputDebugString. So you can easily just make your own TRACE macro (or call it something else) that will call OutputDebugString.

This is the simplest code I have seen:
#undef ATLTRACE
#undef ATLTRACE2
#define ATLTRACE2 CAtlTrace(__FILE__, __LINE__, __FUNCTION__)
#define ATLTRACE ATLTRACE2
see
http://alax.info/blog/1351

In MFC, TRACE is defined as ATLTRACE. And in release mode that is defined as:
#define ATLTRACE __noop
So, using the out-of-the-box TRACE from MFC, you won't actually be able to read any TRACE text, because it won't even be written out. You could write your own TRACE function instead, then redefine the TRACE macro. You could do something like this:
void MyTrace(const CString& text)
{
::OutputDebugString(text); // Outputs to the debugger, same as regular TRACE
// TODO: Do whatever output you need here. Write to event log / write to text file / write to pipe etc.
}

How do you create a debug only function that takes a variable argument list? Like printf()

I'd like to make a debug logging function with the same parameters as printf. But one that can be removed by the pre-processor during optimized builds.
For example:
Debug_Print("Warning: value %d > 3!\n", value);
I've looked at variadic macros but those aren't available on all platforms. gcc supports them, msvc does not.

I still do it the old way, by defining a macro (XTRACE, below) which correlates to either a no-op or a function call with a variable argument list. Internally, call vsnprintf so you can keep the printf syntax:
#include <stdarg.h>
#include <stdio.h>
#include <tchar.h>
#include <windows.h>
void XTrace0(LPCTSTR lpszText)
{
::OutputDebugString(lpszText);
}
void XTrace(LPCTSTR lpszFormat, ...)
{
va_list args;
va_start(args, lpszFormat);
TCHAR szBuffer[512]; // get rid of this hard-coded buffer
_vsntprintf(szBuffer, 511, lpszFormat, args); // TCHAR-aware variant of _vsnprintf
szBuffer[511] = 0; // not terminated automatically on truncation
::OutputDebugString(szBuffer);
va_end(args);
}
Then a typical #ifdef switch:
#ifdef _DEBUG
#define XTRACE XTrace
#else
#define XTRACE
#endif
Well that can be cleaned up quite a bit but it's the basic idea.

This is how I do debug print outs in C++. Define 'dout' (debug out) like this:
#ifdef DEBUG
#define dout cout
#else
#define dout 0 && cout
#endif
In the code I use 'dout' just like 'cout'.
dout << "in foobar with x= " << x << " and y= " << y << '\n';
If the preprocessor replaces 'dout' with '0 && cout' note that << has higher precedence than && and short-circuit evaluation of && makes the whole line evaluate to 0. Since the 0 is not used the compiler generates no code at all for that line.
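The short-circuit behaviour can be checked with a small sketch. `DOUT_DISABLED`, `expensive` and `g_calls` are hypothetical names introduced for the example. It also shows a consequence worth keeping in mind: any side effects in the streamed expression are skipped as well.

```cpp
#include <cassert>
#include <iostream>

static int g_calls = 0;

// Counts how often it is evaluated.
inline int expensive() { ++g_calls; return 42; }

// The disabled form of 'dout' from the answer above.
#define DOUT_DISABLED 0 && std::cout

inline int demo_dout_disabled() {
    g_calls = 0;
    // Parsed as 0 && (std::cout << "value = " << expensive() << '\n');
    // the right-hand side is never evaluated.
    DOUT_DISABLED << "value = " << expensive() << '\n';
    return g_calls;
}
```

`demo_dout_disabled()` returns 0 because `expensive()` is never called, so debug and release builds differ if the streamed expressions have side effects.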

Here's something that I do in C/C++. First off, you write a function that uses the varargs stuff (see the link in Stu's posting). Then do something like this:
int debug_printf( const char *fmt, ... );
#if defined( DEBUG )
#define DEBUG_PRINTF(x) debug_printf x
#else
#define DEBUG_PRINTF(x)
#endif
DEBUG_PRINTF(( "Format string that takes %s %s\n", "any number", "of args" ));
All you have to remember is to use double-parens when calling the debug function, and the whole line will get removed in non-DEBUG code.
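A sketch of the double-parenthesis technique with both variants side by side. The `_ON`/`_OFF` macro names and the call counter are assumptions made so that the two expansions can be compared in one translation unit:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

static int g_debug_calls = 0;

// printf-style function that forwards its arguments to vfprintf.
inline int debug_printf(const char* fmt, ...) {
    ++g_debug_calls;
    std::va_list args;
    va_start(args, fmt);
    int n = std::vfprintf(stderr, fmt, args);
    va_end(args);
    return n;
}

// The whole parenthesised argument list is one macro argument x.
#define DEBUG_PRINTF_ON(x)  debug_printf x
#define DEBUG_PRINTF_OFF(x)

inline int demo_double_parens() {
    g_debug_calls = 0;
    DEBUG_PRINTF_ON(("%s %s\n", "any number", "of args"));   // one call
    DEBUG_PRINTF_OFF(("%s %s\n", "any number", "of args"));  // expands to ';'
    return g_debug_calls;
}
```

Only the enabled form reaches `debug_printf`, so `demo_double_parens()` returns 1.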

Ah, vsprintf() was the thing I was missing. I can use this to pass the variable argument list directly to printf():
#include <stdarg.h>
#include <stdio.h>
void DBG_PrintImpl(const char * format, ...)
{
char buffer[256];
va_list args;
va_start(args, format);
vsnprintf(buffer, sizeof(buffer), format, args); // bounded variant of vsprintf
printf("%s", buffer);
va_end(args);
}
Then wrap the whole thing in a macro.

Another fun way to stub out variadic functions is:
#define function sizeof

@CodingTheWheel:
There is one slight problem with your approach. Consider a call such as
XTRACE("x=%d", x);
This works fine in the debug build, but in the release build it will expand to:
("x=%d", x);
Which is perfectly legitimate C and will compile and usually run without side-effects but generates unnecessary code. The approach I usually use to eliminate that problem is:
Make the XTrace function return an int (just return 0, the return value doesn't matter)
Change the #define in the #else clause to:
0 && XTrace
Now the release version will expand to:
0 && XTrace("x=%d", x);
and any decent optimizer will throw away the whole thing since short-circuit evaluation would have prevented anything after the && from ever being executed.
Of course, just as I wrote that last sentence, I realized that perhaps the original form might be optimized away too and in the case of side effects, such as function calls passed as parameters to XTrace, it might be a better solution since it will make sure that debug and release versions will behave the same.

In C++ you can use the streaming operator to simplify things:
#if defined _DEBUG
class Trace
{
public:
static Trace &GetTrace () { static Trace trace; return trace; }
Trace &operator << (int value) { /* output int */ return *this; }
Trace &operator << (short value) { /* output short */ return *this; }
Trace &operator << (Trace &(*function)(Trace &trace)) { return function (*this); }
static Trace &Endl (Trace &trace) { /* write newline and flush output */ return trace; }
// and so on
};
#define TRACE(message) Trace::GetTrace () << message << Trace::Endl
#else
#define TRACE(message)
#endif
and use it like:
void Function (int param1, short param2)
{
TRACE ("param1 = " << param1 << ", param2 = " << param2);
}
You can then implement customised trace output for classes in much the same way you would do it for outputting to std::cout.

What platforms are they not available on? stdarg is part of the standard library:
http://www.opengroup.org/onlinepubs/009695399/basedefs/stdarg.h.html
Any platform not providing it is not a standard C implementation (or very, very old). For those, you will have to use varargs:
http://opengroup.org/onlinepubs/007908775/xsh/varargs.h.html

Part of the problem with this kind of functionality is that it often requires variadic macros. These were standardized fairly recently (C99), and lots of old C compilers do not support the standard, or have their own special workaround.
Below is a debug header I wrote that has several cool features:
Supports C99 and C89 syntax for debug macros
Enable/Disable output based on function argument
Output to file descriptor(file io)
Note: For some reason I had some slight code formatting problems.
#ifndef _DEBUG_H_
#define _DEBUG_H_
#if HAVE_CONFIG_H
#include "config.h"
#endif
#include "stdarg.h"
#include "stdio.h"
#define ENABLE 1
#define DISABLE 0
extern FILE* debug_fd;
int debug_file_init(char *file);
int debug_file_close(void);
#if HAVE_C99
#define PRINT(x, format, ...) \
if ( x ) { \
if ( debug_fd != NULL ) { \
fprintf(debug_fd, format, ##__VA_ARGS__); \
} \
else { \
fprintf(stdout, format, ##__VA_ARGS__); \
} \
}
#else
void PRINT(int enable, char *fmt, ...);
#endif
#if _DEBUG
#if HAVE_C99
#define DEBUG(x, format, ...) \
if ( x ) { \
if ( debug_fd != NULL ) { \
fprintf(debug_fd, "%s : %d " format, __FILE__, __LINE__, ##__VA_ARGS__); \
} \
else { \
fprintf(stderr, "%s : %d " format, __FILE__, __LINE__, ##__VA_ARGS__); \
} \
}
#define DEBUGPRINT(x, format, ...) \
if ( x ) { \
if ( debug_fd != NULL ) { \
fprintf(debug_fd, format, ##__VA_ARGS__); \
} \
else { \
fprintf(stderr, format, ##__VA_ARGS__); \
} \
}
#else /* HAVE_C99 */
void DEBUG(int enable, char *fmt, ...);
void DEBUGPRINT(int enable, char *fmt, ...);
#endif /* HAVE_C99 */
#else /* _DEBUG */
#define DEBUG(x, format, ...)
#define DEBUGPRINT(x, format, ...)
#endif /* _DEBUG */
#endif /* _DEBUG_H_ */
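One caveat with statement macros of the form `if ( x ) { ... }` is that a call followed by `;` cannot sit directly between an unbraced `if` and its `else`. A common remedy is the `do { } while (0)` wrapper, sketched here with hypothetical names (`PRINT_SAFE`, `record_print`) and a counter standing in for the `fprintf` calls:

```cpp
#include <cassert>

static int g_prints = 0;

// Stand-in for the fprintf calls in the header above.
inline void record_print() { ++g_prints; }

// With a brace-only macro, "if (outer) MACRO(x); else ..." expands to
// "if (outer) if (x) { ... } ; else ...", which does not compile because
// the stray ';' ends the outer if before the else.
// Wrapping the body makes the expansion a single statement:
#define PRINT_SAFE(x) do { if (x) { record_print(); } } while (0)

inline int demo_safe_macro(bool outer, bool enabled) {
    g_prints = 0;
    if (outer)
        PRINT_SAFE(enabled);
    else
        g_prints = -1;   // marks that the else branch was taken
    return g_prints;
}
```

The same wrapper could be applied to PRINT, DEBUG and DEBUGPRINT without changing how they are called.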

Have a look at this thread:
How to make a variadic macro (variable number of arguments)
It should answer your question.

This is what I use:
inline void DPRINTF(int level, char *format, ...)
{
# ifdef _DEBUG_LOG
va_list args;
va_start(args, format);
if(debugPrint & level) {
vfprintf(stdout, format, args);
}
va_end(args);
# endif /* _DEBUG_LOG */
}
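which costs essentially nothing at run-time when the _DEBUG_LOG flag is turned off, since the empty inline function can be optimized away (argument expressions with side effects are still evaluated).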

This is a TCHAR version of user's answer, so it will work as ASCII (normal), or Unicode mode (more or less).
#define DEBUG_OUT( fmt, ...) DEBUG_OUT_TCHAR( \
TEXT(##fmt), ##__VA_ARGS__ )
#define DEBUG_OUT_TCHAR( fmt, ...) \
Trace( TEXT("[DEBUG]") #fmt, \
##__VA_ARGS__ )
void Trace(LPCTSTR format, ...)
{
LPTSTR OutputBuf;
OutputBuf = (LPTSTR)LocalAlloc(LMEM_ZEROINIT, \
(size_t)(4096 * sizeof(TCHAR)));
va_list args;
va_start(args, format);
int nBuf;
_vstprintf_s(OutputBuf, 4095, format, args);
::OutputDebugString(OutputBuf);
va_end(args);
LocalFree(OutputBuf); // tyvm @sam shaw
}
I say, "more or less", because it won't automatically convert ASCII string arguments to WCHAR, but it should get you out of most Unicode scrapes without having to worry about wrapping the format string in TEXT() or preceding it with L.
Largely derived from MSDN: Retrieving the Last-Error Code

Not exactly what's asked in the question, but this code is helpful for debugging: it prints each variable's value along with its name. It is completely type independent and supports a variable number of arguments.
It can even display the values of STL containers nicely, provided you overload the output operator for them.
#define show(...) describe(#__VA_ARGS__, __VA_ARGS__)
template<typename T>
void describe(string var_name,T value)
{
clog<<var_name<<" = "<<value<<" ";
}
template<typename T,typename... Args>
void describe(string var_names,T value,Args... args)
{
string::size_type pos = var_names.find(',');
string name = var_names.substr(0,pos);
var_names = var_names.substr(pos+1);
clog<<name<<" = "<<value<<" | ";
describe(var_names,args...);
}
Sample Use :
int main()
{
string a;
int b;
double c;
a="string here";
b = 7;
c= 3.14;
show(a,b,c);
}
Output :
a = string here | b = 7 | c = 3.14

Having come across the problem today, my solution is the following macro:
static WCHAR g_debugBuf[1024];
#define DLog(fmt, ...) do { swprintf(g_debugBuf, 1024, fmt, ##__VA_ARGS__); OutputDebugStringW(g_debugBuf); } while (0)
You can then call the macro like this:
int value = 42;
DLog(L"The answer is: %d\n", value);
