Template Function Branch Optimization - c++

I'm trying to write a template method to create shaders for Direct3D. The API functions to create each type of shader as well as the types of shaders have different names. So, I wrote the following code:
class Shader final
{
public:
explicit Shader( _In_ ID3DBlob *const pBlob );
template <class T>
void Create
( std::weak_ptr<ID3D11Device>& pDevice
, CComPtr<T>& pResource )
{
auto p_Device = pDevice.lock();
if ( mp_Blob && p_Device )
{
HRESULT hr = E_FAIL;
ID3D11ClassLinkage* pClassLinkage = nullptr; // unsupported for now
pResource.Release();
CComPtr<ID3D11DeviceChild> pRes;
if ( std::is_same<T, ID3D11VertexShader>() )
{
hr = p_Device->CreateVertexShader
( mp_Blob->GetBufferPointer()
, mp_Blob->GetBufferSize()
, pClassLinkage
, reinterpret_cast<ID3D11VertexShader**>( &pRes ) );
}
else if ( std::is_same<T, ID3D11HullShader>() )
{
hr = p_Device->CreateHullShader
( mp_Blob->GetBufferPointer()
, mp_Blob->GetBufferSize()
, pClassLinkage
, reinterpret_cast<ID3D11HullShader**>( &pRes ) );
}
else if ( std::is_same<T, ID3D11DomainShader>() )
{
hr = p_Device->CreateDomainShader
( mp_Blob->GetBufferPointer()
, mp_Blob->GetBufferSize()
, pClassLinkage
, reinterpret_cast<ID3D11DomainShader**>( &pRes ) );
}
else if ( std::is_same<T, ID3D11GeometryShader>() )
{
hr = p_Device->CreateGeometryShader
( mp_Blob->GetBufferPointer()
, mp_Blob->GetBufferSize()
, pClassLinkage
, reinterpret_cast<ID3D11GeometryShader**>( &pRes ) );
}
else if ( std::is_same<T, ID3D11ComputeShader>() )
{
hr = p_Device->CreateComputeShader
( mp_Blob->GetBufferPointer()
, mp_Blob->GetBufferSize()
, pClassLinkage
, reinterpret_cast<ID3D11ComputeShader**>( &pRes ) );
}
else if ( std::is_same<T, ID3D11PixelShader>() )
{
hr = p_Device->CreatePixelShader
( mp_Blob->GetBufferPointer()
, mp_Blob->GetBufferSize()
, pClassLinkage
, reinterpret_cast<ID3D11PixelShader**>( &pRes ) );
}
else
{
assert( false
&& "Need a pointer to an ID3D11 shader interface" );
}
//TODO: log hr's error code.
assert( SUCCEEDED( hr ) && "Error: shader creation failed!" );
if ( FAILED( hr ) )
{
pResource.Release();
}
else
{
hr = pRes->QueryInterface( IID_PPV_ARGS( &pResource ) );
assert( SUCCEEDED( hr ) );
}
}
}
private:
CComPtr<ID3DBlob> mp_Blob;
};
It should work, although I have not tested it yet. But the issue is that the compiler doesn't throw away the branching paths that will certainly not be taken. So for example:
CComPtr<ID3D11DomainShader> pDS;
//pShader is an instance of Shader class
pShader->Create(pDevice, pDS);
will create a domain shader. But the compiler keeps all the paths in the generated function instead of generating just
void Create
( std::weak_ptr<ID3D11Device>& pDevice
, CComPtr<ID3D11DomainShader>& pResource )
{
auto p_Device = pDevice.lock();
if ( mp_Blob && p_Device )
{
HRESULT hr = E_FAIL;
ID3D11ClassLinkage* pClassLinkage = nullptr; // unsupported for now
pResource.Release();
CComPtr<ID3D11DeviceChild> pRes;
if ( true ) // this is the evaluation of std::is_same<ID3D11DomainShader, ID3D11DomainShader>()
{
hr = p_Device->CreateDomainShader
( mp_Blob->GetBufferPointer()
, mp_Blob->GetBufferSize()
, pClassLinkage
, reinterpret_cast<ID3D11DomainShader**>( &pRes ) );
}
//TODO: log hr's error code.
assert( SUCCEEDED( hr ) && "Error: shader creation failed!" );
if ( FAILED( hr ) )
{
pResource.Release();
}
else
{
hr = pRes->QueryInterface( IID_PPV_ARGS( &pResource ) );
assert( SUCCEEDED( hr ) );
}
}
}
I think there should be a way to do this because the type of the shader is known at compile time, but I don't really know how (my metaprogramming skills still need to grow).
P.S.
I compiled with both debug and release settings, and in both the paths are kept.

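The following may help: add one overload per shader type (as private members of Shader, so that mp_Blob is accessible):
HRESULT createShader(
ID3D11Device& device,
CComPtr<ID3D11VertexShader>& /*pResource*/, // only selects the overload
CComPtr<ID3D11DeviceChild>& pRes) // by reference, so the caller sees the created shader
{
return device.CreateVertexShader(
mp_Blob->GetBufferPointer(),
mp_Blob->GetBufferSize(),
nullptr, // class linkage, unsupported for now
reinterpret_cast<ID3D11VertexShader**>(&pRes));
}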
// similar for other Shader type
template <class T>
void Create(
std::weak_ptr<ID3D11Device>& pDevice,
CComPtr<T>& pResource)
{
auto p_Device = pDevice.lock();
if (!mp_Blob || !p_Device) {
return;
}
pResource.Release();
CComPtr<ID3D11DeviceChild> pRes;
// ---------------- 8< --------------------
// Here is the change: no more `if` to check type,
// let the compiler choose the correct overload
HRESULT hr = createShader(*p_Device, pResource, pRes);
// ---------------- >8 --------------------
assert( SUCCEEDED( hr ) && "Error: shader creation failed!" );
if ( FAILED( hr ) ) {
pResource.Release();
} else {
hr = pRes->QueryInterface( IID_PPV_ARGS( &pResource ) );
assert( SUCCEEDED( hr ) );
}
}
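The overload-based dispatch can be tried out without Direct3D. What follows is only a minimal sketch: VertexShader, PixelShader and Device are hypothetical stand-ins (not D3D interfaces), and the creation functions return a string instead of an HRESULT. It is meant to show that each instantiation of the template contains a single call and no branch on the type.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for ID3D11VertexShader / ID3D11PixelShader.
struct VertexShader {};
struct PixelShader {};

// Hypothetical stand-in for the device: one creation function per shader type.
struct Device {
    std::string CreateVertexShader() { return "vertex"; }
    std::string CreatePixelShader()  { return "pixel"; }
};

// One overload per shader type; the second parameter only selects the overload.
inline std::string createShader(Device& device, VertexShader&) {
    return device.CreateVertexShader();
}
inline std::string createShader(Device& device, PixelShader&) {
    return device.CreatePixelShader();
}

// The template contains no type test: overload resolution picks the call
// at compile time, once per instantiation.
template <class T>
std::string Create(Device& device, T& resource) {
    return createShader(device, resource);
}
```

Calling Create with a type that has no matching overload fails to compile, which takes the place of the runtime assert( false ) branch in the original code.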

Regarding your optimisations:
I think you're upset about code being generated to handle every shader type regardless of the template argument.
You need to shift your is_same logic into the enable_if in my metaprogramming solution; then the function that matches the template argument will contain ONLY the code you want.
However, I still read your question as a problem of too much abstraction: you cannot make an Animal class accept a Banana only when the underlying animal is a Monkey.
(In this classic example, Monkey derives from Animal and Banana from Food, where Animal has a method void eat(Food).)
Answer of how to do what you want well
A bit long, so I skimmed it.
Remember that metaprogramming won't always save the day (there are many cases where you know the types but the program doesn't; take, for example, columns in database result sets).
High performance
Don't let unknown types in in the first place. Here's a common pattern:
class unverified_thing: public base_class {
public:
unverified_thing(base_class* data): type_code(-1), data(data) { }
void set_type_code(int to) { /*throw if not -1*/ type_code = to; }
derived_A* get_as_derived_A() const { /*throw if not the right type code*/
return static_cast<derived_A*>(data);
}
derived_B* get_as_derived_B() const { /*throw if not the right type code*/
return static_cast<derived_B*>(data);
}
//now do the base class methods
whatever base_class_method() {
return data->base_class_method();
}
private:
int type_code; // -1 until the type has been verified
base_class* data; // non-owning pointer to the wrapped object
};
Now you can pretend unverified_thing is your data, and you have introduced a form of type checking. You can afford to throw in the getter because you won't be calling it every frame; you only deal with it during setup.
So say shader is the base class of fragment_shader and vertex_shader: you can work with a plain shader that has its type_code set right up until you compile it, then cast to the correct derived type, with a runtime error if the type is wrong. This avoids C++ RTTI, which can be quite heavy.
Remember you can afford setup time; you want to make sure every bit of data you send into the engine is correct.
This pattern comes from only allowing validated input through (which stops SO many bugs): you have an unverified_thing that doesn't derive from the data type, and you can only extract the data without error once the type has been set as verified.
An even better way to do this (but can get messy quick) is to have:
template<bool VERIFIED=true>
class user_input { };
/*somewhere in your dialog class (or whatever)*/
user_input<false> get_user_input() const { /*whatever*/ }
/*then have somewhere*/
user_input<> verify_input(const user_input<false>& some_input) { /*which will throw as needed*/ }
For large data classes of user_input it can be good to hide a large_data* inside the user_input class, but you get the idea.
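A compilable sketch of the verified/unverified idea might look like the following. The stored std::string, the "must not be empty" rule and the consume function are assumptions made up for illustration; only the user_input<bool> shape and verify_input come from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// The bool parameter records at compile time whether the data was verified.
template <bool VERIFIED = true>
class user_input {
public:
    explicit user_input(std::string text) : text_(std::move(text)) {}
    const std::string& text() const { return text_; }
private:
    std::string text_;
};

// The only way to obtain a verified input from an unverified one.
// Hypothetical rule for this sketch: the text must not be empty.
inline user_input<> verify_input(const user_input<false>& raw) {
    if (raw.text().empty()) {
        throw std::invalid_argument("empty input");
    }
    return user_input<>(raw.text());
}

// Hypothetical consumer: it accepts verified input only, so passing a
// user_input<false> is a compile-time error rather than a runtime check.
inline std::size_t consume(const user_input<>& input) {
    return input.text().size();
}
```

The design choice here is that the check runs once, at the boundary, and everything downstream can rely on the type alone.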
To use metaprogramming (Limits how flexible the end result can be re. user input)
template<class U>
typename ::std::enable_if<my_funky_criteria<U>::value,funky_shader>::type
Create(::std::istream& input) { /*blah*/ }
with
template<class U>
struct my_funky_criteria: ::std::conditional</*what you want*/,::std::true_type,::std::false_type>::type { };
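To make the enable_if idea concrete, here is a small sketch that compiles on its own. VertexShader and PixelShader are hypothetical stand-in types, and the functions return a string purely so the selected version is observable; the real code would call the matching device function instead.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for the D3D shader interfaces.
struct VertexShader {};
struct PixelShader {};

// Each overload takes part in overload resolution only for "its" type,
// so an instantiation contains just the code for that shader type.
template <class T>
typename std::enable_if<std::is_same<T, VertexShader>::value, std::string>::type
Create() { return "vertex"; }

template <class T>
typename std::enable_if<std::is_same<T, PixelShader>::value, std::string>::type
Create() { return "pixel"; }
```

Create<int>() would not compile in this sketch, because no overload is enabled for int.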

This must be a compiler settings issue, even though you stated that you use release mode. Have you checked which optimization level your release configuration actually uses? In MSVC, /O1 optimizes for size and /O2 for speed (there is no /O3), and an optimizer favouring size could perhaps reuse the same binary instead of creating a version for each type (I'm not sure whether the standard prohibits that).
Also, check the disassembly window to see whether the IDE is simply misleading you, and rebuild your project; sometimes Visual Studio fails to notice changed header files.
There is simply no other answer than build settings in this particular case...

Related

Adding XPointer/XPath searches to ALL(?) C++ JSON libraries, is it doable?

Is it possible to extend all(?) existing C++ JSON libraries with XPath/XPointer or subset with just one C++ implementation? At least those with iterators for object and array values?
I have reviewed three C++ JSON libraries (nlohmann, Boost.JSON and RapidJSON) to see the internals and check their search functionality. Some have implemented JSON Pointer. JSON Pointer is basic, almost like working with JSON as a name-value list.
XML has XPath and XPointer searches, and the rules are standardized. With XPath and XPointer you can do more.
One reason for reviewing these libraries was to see whether it is possible to extend any of them with better search functionality. Or might it be possible to extend all(?) C++ JSON libraries at once?
A longer text describing this can be found here, trying to be brief.
I tried to write one traverse method that selects JSON values with one specific property name, and that method should work on all tested JSON libraries. If I got that to work, it may be possible to add more search logic and get it to work on almost all C++ JSON libraries.
I got this C++ templated function to work on all tested JSON libraries. It can walk the JSON tree and select JSON values.
What is needed is to implement specializations of is_object, is_array, compare_name, get_value, begin and end. Those are just one-liners, so it's easy.
template<typename json_value>
bool is_object( const json_value* p )
{ static_assert(sizeof(json_value) == 0, "Only specializations of is_object is allowed"); }
template<typename json_value>
bool is_array( const json_value* p )
{ static_assert(sizeof(json_value) == 0, "Only specializations of is_array is allowed"); }
template<typename iterator>
bool compare_name( iterator it, std::string_view stringName )
{ static_assert(sizeof(it) == 0, "Only specializations of compare_name is allowed"); }
template<typename iterator, typename json_value>
const json_value* get_value( iterator it )
{ static_assert(sizeof(it) == 0, "Only specializations of get_value is allowed"); }
template<typename iterator, typename json_value>
iterator begin( const json_value& v ) { return std::begin( v ); }
template<typename iterator, typename json_value>
iterator end( const json_value& v ) { return std::end( v ); }
// ------------------------------------------------
// Selects all json values that match property name
template<typename json_value, typename object_iterator,typename array_iterator = object_iterator>
uint32_t select( const json_value& jsonValue, std::string_view stringQuery, std::vector<const json_value*>* pvectorValue = nullptr )
{ assert( is_object( &jsonValue ) || is_array( &jsonValue ) );
uint32_t uCount = 0;
if( is_object( &jsonValue ) == true ) // found object ?
{
for( auto it = begin<object_iterator,json_value>( jsonValue ); it != end<object_iterator,json_value>( jsonValue ); it++ )
{
if( is_object( get_value<object_iterator,json_value>( it ) ) == true )
{ // found object, scan it
auto value = get_value<object_iterator,json_value>( it );
uCount += select<json_value,object_iterator>( *value, stringQuery, pvectorValue );
}
else if( is_array( get_value<object_iterator,json_value>( it ) ) == true )
{ // found array, scan it
auto parray = get_value<object_iterator,json_value>( it );
uCount += select<json_value,object_iterator,array_iterator>( *parray, stringQuery, pvectorValue );
}
else if( compare_name<object_iterator>( it, stringQuery ) == true )
{ // property name matches, store value if pointer to vector
if( pvectorValue != nullptr ) pvectorValue->push_back( get_value<object_iterator,json_value>( it ) );
uCount++;
}
}
}
else if( is_array( &jsonValue ) == true ) // found array
{
for( auto it = begin<array_iterator,json_value>( jsonValue ); it != end<array_iterator,json_value>( jsonValue ); it++ )
{
if( is_object( get_value<array_iterator,json_value>( it ) ) == true )
{ // found object, scan it
auto value = get_value<array_iterator,json_value>( it );
uCount += select<json_value,object_iterator>( *value, stringQuery, pvectorValue );
}
else if( is_array( get_value<array_iterator,json_value>( it ) ) == true )
{ // found array, scan it
auto parray = get_value<array_iterator,json_value>( it );
uCount += select<json_value,object_iterator,array_iterator>( *parray, stringQuery, pvectorValue );
}
}
}
return uCount;
}
If this works, and if I haven't forgotten something, shouldn't it be possible to extend all libraries with just one implementation? The additional logic for XPath and XPointer is not dependent on the implementation of these C++ JSON libraries.
Am I missing something?

CA2202 Do not dispose objects multiple times - many times in managed c++

We are getting many instances of: "CA2202 Do not dispose objects multiple times" in managed c++ with code analysis on.
To me it seems like a mistake in the code analysis, but I may be missing something.
CA2202 Do not dispose objects multiple times Object 'gcnew ConfigurationDataAssembler()' can be disposed more than once in method 'DataAssembler::CreateConfiguration(Guid, int^, int^, ObjectReference^, ObjectReference^, ObjectReference^, List^%, PLResult^%)'. To avoid generating a System.ObjectDisposedException you should not call Dispose more than one time on an object.: Lines: 935, 938 PL dataassembler.cpp 935
The two lines it mentions are "return nullptr" and "return configDTO"
I have marked those lines with comments // here, // and here
Here is the function
//---------------------------------------------------------------------------------------------------------
// For IDataAssembler
Ivara::PL::Data::UIData::Control::MCLBConfig^ DataAssembler::CreateConfiguration( System::Guid activityKey, int subscriptionID, int controlID, ObjectReference^ pRootObjRef, ObjectReference^ pSelectedObjRef, ObjectReference^ pOwningObjRef, [Out] List<Ivara::PL::Data::UIData::Control::ConfigurationListItem^>^% configList, [Out] PLResult^% result )
{
try
{
AutoStopWatch stopwatch( __FUNCTION__, LogCategories::RemotingTimings );
ThreadToActivity cTTA( activityKey );
result = PLResult::Success;
//param check
if ( subscriptionID <= 0 )
{
throw gcnew Ivara::PL::Exceptions::IvaraArgumentException( _T( "Invalid configurationID" ), _T( "configurationID" ) );
}
//fetch config
UserConfigurationOR orUserConfig( subscriptionID );
if ( !orUserConfig.isSet() )
{
result = gcnew PLResult( PLResult::eStatus::RelatedObjectNotFound, String::Format( _T( "The user configuration {0} could not be found" ), subscriptionID ) );
return nullptr;
}
UserConfiguration* pUserConfig = orUserConfig.qryObjPtr();
if ( pUserConfig == NULL )
{
result = gcnew PLResult( PLResult::eStatus::RelatedObjectNotFound, String::Format( _T( "The user configuration {0} could not be fetched, even though isSet returns true" ), subscriptionID ) );
return nullptr;
}
//create assembler
ConfigurationDataAssembler assembler;
assembler.Initialize( controlID, pRootObjRef, pSelectedObjRef, pOwningObjRef, result );
if ( result != PLResult::Success )
{
return nullptr; // here
}
Ivara::PL::Data::UIData::Control::MCLBConfig^ configDTO = assembler.AssembleConfigurationDTO( pUserConfig, configList /*out param*/, nullptr );
return configDTO; // and here
}
catch ( OTBaseException& unmanagedException )
{
throw FatalExceptionPolicy::HandleUnmanagedException( &unmanagedException, __FUNCDNAME__, __FILE__, __LINE__ );
}
catch ( Exception^ managedException )
{
throw FatalExceptionPolicy::HandleManagedException( managedException, __FUNCDNAME__, __FILE__, __LINE__ );
}
}

Strange Return Value

My code,
LPSTR Internal::Gz_GetSystemKey( BOOL SHOW_ERROR, BOOL SHOW_KEY ) {
HW_PROFILE_INFO HwProfInfo;
if (!GetCurrentHwProfile(&HwProfInfo))
{
if(SHOW_ERROR)
Message::Error( "An Internal Error Has Occurred", "Gizmo Message", TRUE );
return NULL;
}
std::string __clean( (char*)HwProfInfo.szHwProfileGuid );
__clean.append( std::string( (char*)HwProfInfo.szHwProfileName ) );
LPSTR neet_key = Crypt::CRC32( Crypt::MD5( (char*)__clean.c_str() ) );
if (SHOW_KEY)
Message::Info( neet_key ); // shows expected result
return neet_key; // returns strange ascii result
};
Gz BOOL Gz_CreateContext( BOOL SHOW_ERROR, BOOL SHOW_KEY ) {
HKEY CHECK; // key result container
BOOL RESULT;
std::wstring neet_key_uni; // must use unicode string in RegSetValueExW
if ( RegOpenKey(HKEY_CURRENT_USER, TEXT("Software\\NEET\\Gizmo\\"), &CHECK) != ERROR_SUCCESS )
goto CREATE_REG_CONTEXT;
else
goto STORE_NEET_KEY;
CREATE_REG_CONTEXT:
if ( RegCreateKeyA( HKEY_CURRENT_USER, "Software\\NEET\\Gizmo\\", &CHECK ) != ERROR_SUCCESS ) {
if( SHOW_ERROR )
Message::Error( "Context Could Not Be Created" );
RESULT = FALSE;
goto END_MACRO;
}
STORE_NEET_KEY:
LPSTR neet_key = Internal::Gz_GetSystemKey( SHOW_ERROR, SHOW_KEY ); // GetSystemKey generates good key, returns weird ascii
Message::Notify( neet_key );
neet_key_uni = std::wstring(neet_key, neet_key+strlen(neet_key));
if ( RegSetValueEx( CHECK, TEXT("Key"), 0, REG_SZ, (const BYTE*)neet_key_uni.c_str(), ( neet_key_uni.size() + 1 ) * sizeof( wchar_t ) ) != ERROR_SUCCESS ) {
if( SHOW_ERROR )
Message::Error( "Context Could Not Be Reached" );
RESULT = FALSE;
goto END_MACRO;
}
RESULT = TRUE;
END_MACRO:
RegCloseKey(CHECK); // safely close registry key
return RESULT;
};
I'm creating a simple PC identification lib for practice, not for commercial use.
Message::Info( neet_key ); shows the expected key, but the value actually returned is a strange ASCII string.
Any ideas why? The 'Message' namespace/functions are just message boxes. As for the 'Crypt' namespace/functions, they aren't the issue at hand.
From the comments: Who owns the memory for 'neet_key'? My guess would be that 'Message::Info' shows a valid value because whatever memory structure it comes from is still alive at that point, but once the function returns, that memory is no longer valid. Therefore the returned value prints rubbish.
This is a common issue in C++. I would highly recommend that you avoid raw pointers where possible (especially when returning from functions/methods). For strings you could obviously use 'std::string'.
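As a rough illustration of the 'std::string' suggestion: make_key below is a hypothetical stand-in for the Crypt::CRC32( Crypt::MD5( ... ) ) step, whose implementation the question does not show, so only the ownership aspect is modelled. The point is that the string is returned by value, so the caller receives its own copy instead of a pointer into memory that may already be gone.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the question's key generation. It merely
// concatenates its inputs; the real code would hash them.
inline std::string make_key(const std::string& guid, const std::string& name) {
    std::string clean = guid;
    clean.append(name);
    // Returning clean.c_str() here would hand out a pointer into 'clean',
    // which is destroyed when the function returns. Returning the
    // std::string itself transfers the contents to the caller.
    return clean;
}
```

When an API such as RegSetValueExA needs a C string, the caller can use c_str() on its own std::string for the duration of that call.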

C++ iterator class causes R6025 run-time error in Visual C++

When I run the following code I get an 'R6025' run-time error in Visual C++:
CommandParameterAndValue param( "Key", "value" );
parameters.AddParameter( &param );
parameters.HasParameter( "akeyval" );
I am lost, any ideas? Is it something to do with the casting?
typedef std::vector<iCommandParameter *> ParamsVectorList;
class CommandParametersList
{
public:
.... functions here ....
void AddParameter( iCommandParameter *param );
bool HasParameter( std::string parameterKey );
protected:
ParamsVectorList m_parameters;
};
void CommandParametersList::AddParameter( iCommandParameter *param )
{
m_parameters.push_back( param );
}
bool CommandParametersList::HasParameter( std::string parameterKey )
{
ParamsVectorList::iterator it;
CommandParameterAndValue *paramItem = NULL;
bool returnValue = false;
for ( it = m_parameters.begin(); it != m_parameters.end(); it++ )
{
paramItem = static_cast<CommandParameterAndValue *>( *it );
if ( paramItem->GetKey().compare( parameterKey ) == 0 )
{
returnValue = true;
break;
}
}
return returnValue;
}
I need more information to give a complete answer, but if you look here: http://support.microsoft.com/kb/125749
That run-time error means you tried to call a pure virtual function - it couldn't find an implementation. I would suggest running through a debugger and finding which line of code throws this error. Then it should be easy to understand and fix. It's probably happening here:
if ( paramItem->GetKey().compare( parameterKey ) == 0 )
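If the cast is the suspect, one way to rule it out is a checked downcast. The sketch below is built on assumptions: the question does not show iCommandParameter or CommandParameterAndValue, so both classes here are hypothetical reconstructions, and HasParameter is written as a free function for brevity. dynamic_cast yields nullptr when the object is not of the requested type, whereas static_cast performs no check.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reconstruction of the interface (not shown in the question).
class iCommandParameter {
public:
    virtual ~iCommandParameter() = default;
};

// Hypothetical reconstruction of the concrete parameter class.
class CommandParameterAndValue : public iCommandParameter {
public:
    CommandParameterAndValue(std::string key, std::string value)
        : m_key(std::move(key)), m_value(std::move(value)) {}
    const std::string& GetKey() const { return m_key; }
    const std::string& GetValue() const { return m_value; }
private:
    std::string m_key;
    std::string m_value;
};

inline bool HasParameter(const std::vector<iCommandParameter*>& parameters,
                         const std::string& parameterKey) {
    for (iCommandParameter* param : parameters) {
        // nullptr if param is null or does not point to a CommandParameterAndValue.
        auto* item = dynamic_cast<CommandParameterAndValue*>(param);
        if (item != nullptr && item->GetKey() == parameterKey) {
            return true;
        }
    }
    return false;
}
```

A checked cast does not help if the stored pointer refers to an object that has already been destroyed (for example, the address of a local variable that went out of scope), so the lifetime of what AddParameter receives is worth checking too.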

Error handling for xml parsing

I'm using tinyxml to parse xml files, and I've found that error handling here lends itself to arrow code. Our error handling is simply reporting a message to a file.
Here is an example:
const TiXmlElement *objectType = dataRoot->FirstChildElement( "game_object" );
if ( objectType ) {
do {
const char *path = objectType->Attribute( "path" );
if ( path ) {
const TiXmlElement *instance = objectType->FirstChildElement( "instance" );
if ( instance ) {
do {
int x = 0, y = 0; // initialize both; QueryIntAttribute leaves the value untouched on failure
instance->QueryIntAttribute( "x", &x );
instance->QueryIntAttribute( "y", &y );
if ( x >= 0 && y >= 0 ) {
AddGameObject( new GameObject( path, x, y ));
} else {
LogErr( "Tile location negative for GameObject in state file." );
return false;
}
} while ( instance = instance->NextSiblingElement( "instance" ));
} else {
LogErr( "No instances specified for GameObject in state file." );
return false;
}
} else {
LogErr( "No path specified for GameObject in state file." );
return false;
}
} while ( objectType = objectType->NextSiblingElement( "game_object" ));
} else {
LogErr( "No game_object specified in <game_objects>. Thus, not necessary." );
return false;
}
return true;
I'm not huffing and puffing over it, but if anyone can think of a cleaner way to accomplish this it would be appreciated.
P.S. Exceptions not an option.
Edit:
Would something like this be preferable?
if ( !path ) {
// Handle error, return false
}
// Continue
This eliminates the arrow code, but the arrow code kind of puts all of the error logging on one place.
Using return values as error codes just leads to such code, it can't be improved much. A slightly cleaner way would use goto to group all error handling into a single block and to decrease the nesting of blocks.
This does not, however, solve the actual problem, which is using return values as error codes. In C, there is no alternative, but in C++ exceptions are available and should be used. If they are not an option, you are stuck with what you have.
You could create a macro for that, which encapsulates the if (!var) { .. return false; } and error reporting.
However, I do not see how this can be improved all that much; it's just the way it is. C'est la vie. C'est le code...
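Such a macro could look roughly like this. REQUIRE_OR_FAIL, LoadObject and ErrorLog are names invented for this sketch, and LogErr is a stand-in that records messages in memory, since the question's real LogErr (which writes to a file) is not shown.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the question's LogErr: collects messages so they can be inspected.
inline std::vector<std::string>& ErrorLog() {
    static std::vector<std::string> log;
    return log;
}
inline void LogErr(const char* message) { ErrorLog().push_back(message); }

// Logs the message and returns false from the enclosing function when the
// condition does not hold. The do/while(0) wrapper makes the macro behave
// like a single statement, even inside an unbraced if/else.
#define REQUIRE_OR_FAIL(cond, message) \
    do {                               \
        if (!(cond)) {                 \
            LogErr(message);           \
            return false;              \
        }                              \
    } while (0)

// Hypothetical function using the macro in place of nested if/else blocks.
inline bool LoadObject(const char* path, int x, int y) {
    REQUIRE_OR_FAIL(path != nullptr, "No path specified for GameObject in state file.");
    REQUIRE_OR_FAIL(x >= 0 && y >= 0, "Tile location negative for GameObject in state file.");
    return true;
}
```

The trade-off is that the macro hides a return statement, which some code bases prefer to avoid.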
"I'm not huffing and puffing over it, but if anyone can think of a cleaner way to accomplish this it would be appreciated."
I have replaced the nested ifs with return statements on error (this makes the code "flow down" instead of going "arrow shaped"). I have also replaced your do loops with for loops (so I could understand it better).
Is this what you wanted?
const TiXmlElement *objectType = dataRoot->FirstChildElement( "game_object" );
if ( !objectType ) {
LogErr( "No game_object specified in <game_objects>. Thus, not necessary." );
return false;
}
for(; objectType != 0; objectType = objectType->NextSiblingElement( "game_object" )) {
const char *path = objectType->Attribute( "path" );
if ( !path ) {
LogErr( "No path specified for GameObject in state file." );
return false;
}
const TiXmlElement *instance = objectType->FirstChildElement( "instance" );
if ( !instance ) {
LogErr( "No instances specified for GameObject in state file." );
return false;
}
for(; instance != 0; instance = instance->NextSiblingElement( "instance" )) {
int x = 0, y = 0; // initialize both; QueryIntAttribute leaves the value untouched on failure
instance->QueryIntAttribute( "x", &x );
instance->QueryIntAttribute( "y", &y );
if ( x >= 0 && y >= 0 ) {
AddGameObject( new GameObject( path, x, y ));
} else {
LogErr( "Tile location negative for GameObject in state file." );
return false;
}
}
}
return true;
I know it is a little late, but I know that QueryIntAttribute returns a value which can be used for error handling in case you want this for your attributes too.
if (instance->QueryIntAttribute("x",&x)!=TIXML_SUCCESS)
cout << "No x value found";