Why does a Resource Barrier need the "before" state? - directx-12

I am sorry for my poor English.
I have a question about the Direct3D 12 D3D12_RESOURCE_TRANSITION_BARRIER structure:
typedef struct D3D12_RESOURCE_TRANSITION_BARRIER {
    ID3D12Resource        *pResource;
    UINT                  Subresource;
    D3D12_RESOURCE_STATES StateBefore;
    D3D12_RESOURCE_STATES StateAfter;
} D3D12_RESOURCE_TRANSITION_BARRIER;
I would think the GPU already knows the resource's state, because of StateAfter.
What is the purpose of StateBefore?
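For reference, here is a minimal sketch (my own assumptions, not part of the question) of how such a barrier is typically filled in and recorded, assuming a command list cmdList and a resource texture that was just written by a copy:

D3D12_RESOURCE_BARRIER barrier = {};
barrier.Type  = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
barrier.Transition.pResource   = texture;
barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
// The application tells the runtime which state it believes the resource is in...
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_COPY_DEST;
// ...and which state it should be in afterwards.
barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE;
cmdList->ResourceBarrier(1, &barrier);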

Related

D3D11: E_OUTOFMEMORY when mapping vertex buffer

In my Unity game, I have to modify a lot of graphics resources, such as textures and vertex buffers, via native code to keep performance good.
The problems start when the code calls ID3D11DeviceContext::Map on the immediate context several times in a very short time (I mean very short - it is called from different threads running in parallel). There is no pattern to whether the mapping succeeds or fails. The method call looks like this:
ID3D11DeviceContext* sU_m_D_context;

void* BeginModifyingVBO(void* bufferHandle)
{
    ID3D11Buffer* d3dbuf = static_cast<ID3D11Buffer*>(bufferHandle);
    D3D11_MAPPED_SUBRESOURCE mapped;
    HRESULT res = sU_m_D_context->Map(d3dbuf, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
    assert(mapped.pData);
    return mapped.pData;
}

void FinishModifyingVBO(void* bufferHandle)
{
    ID3D11Buffer* d3dbuf = static_cast<ID3D11Buffer*>(bufferHandle);
    sU_m_D_context->Unmap(d3dbuf, 0);
}

std::mutex sU_m_D_locker;

void Mesh::ApplyBuffer()
{
    sU_m_D_locker.lock();
    // map buffer
    VBVertex* mappedBuffer = (VBVertex*)BeginModifyingVBO(this->currentBufferPtr);
    memcpy(mappedBuffer, this->mainBuffer, this->mainBufferLength * sizeof(VBVertex));
    // unmap buffer
    FinishModifyingVBO(this->currentBufferPtr);
    sU_m_D_locker.unlock();
    this->markedAsChanged = false;
}
where d3dbuf is a dynamic vertex buffer. I don't know why, but sometimes the result is E_OUTOFMEMORY, even though there is plenty of free memory. I tried to surround the code with mutexes - no effect.
Is this really a memory problem, or maybe something less obvious?
None of the device context methods are thread-safe. If you are going to use them from several threads, you will need to either manually synchronize all the calls or use multiple (deferred) contexts, one per thread. See Introduction to Multithreading in Direct3D 11.
Also, the error checking should be better: you always need to check returned HRESULT values, because in case of failure something like assert(mapped.pData); may still pass.
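As a hedged sketch of the error-checking point only (the locking strategy - one mutex around every immediate-context call, or one deferred context per thread - still has to be applied around it, as in the asker's ApplyBuffer):

void* BeginModifyingVBO(void* bufferHandle)
{
    ID3D11Buffer* d3dbuf = static_cast<ID3D11Buffer*>(bufferHandle);
    D3D11_MAPPED_SUBRESOURCE mapped = {};

    // Caller must already hold the mutex that serializes all sU_m_D_context calls.
    HRESULT res = sU_m_D_context->Map(d3dbuf, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
    if (FAILED(res))
        return nullptr;   // report the failure instead of touching mapped.pData
    return mapped.pData;
}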

Thread data encapsulation best practices

I have a question about how to encapsulate data gracefully in a case like this.
Suppose we want to write a class that can asynchronously download images from the internet, with a non-blocking method:
void download(string url)
This method will create a thread and start downloading, and then invoke a callback:
void callback(char* data)
What is better: allocating the memory for data inside Downloader, or allocating it outside the Downloader class? In the first case we will need to copy the data returned in the callback, which is not good if the data is big; on the other hand, we would be allocating memory inside the Downloader class and releasing it somewhere else. In the second case we need to allocate the memory ourselves and pass it as a parameter to the download method:
char *data = new char[DATA_SIZE];
downloader.download(url, data);
But how can we protect this allocated data from being changed by the calling thread while it is used by the downloader thread? I think there should be a way to do this without synchronization in the calling thread - some way to make this logic invisible to the client.
I hope my intent is clear.
Some sprinkling of C++ Standard Library classes would probably be good.
#include <future>
#include <string>
#include <vector>

// Started elsewhere; hands the promise to whatever eventually calls callback().
void func_which_begins_asynchronous_process(std::string url,
                                            std::promise<std::vector<unsigned char>> promise);

std::future<std::vector<unsigned char>> download(std::string url) {
    std::promise<std::vector<unsigned char>> promise;
    std::future<std::vector<unsigned char>> future = promise.get_future();
    // I'm fairly certain that both promise and future ref-count their shared state,
    // so it's safe to move the promise and later even destroy the promise object.
    func_which_begins_asynchronous_process(url, std::move(promise));
    return future;
}

void callback(std::vector<unsigned char> data, std::promise<std::vector<unsigned char>> promise) {
    promise.set_value(std::move(data));
}

int main() {
    auto future = download("google.com");
    // Do whatever
    std::vector<unsigned char> result = future.get();
    // Do whatever
    return 0;
}
This will both make the code easy to reason about, and will reliably handle the "pointer ownership" issue you discussed in your original post.
I don't know the exact semantics/requirements of your code, so my code won't work in your solution "as-is", but this should give you a pretty good idea of the kind of paradigm that will solve your problem.
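For completeness, here is a hypothetical sketch of what func_which_begins_asynchronous_process might look like (the name and the simulated "download" are placeholders, not real networking code):

#include <future>
#include <string>
#include <thread>
#include <vector>

void callback(std::vector<unsigned char> data, std::promise<std::vector<unsigned char>> promise);

// Pretend-worker: stands in for whatever actually fetches the bytes.
static void worker(std::string url, std::promise<std::vector<unsigned char>> promise)
{
    std::vector<unsigned char> data(url.begin(), url.end()); // placeholder payload
    callback(std::move(data), std::move(promise));           // fulfills the future
}

void func_which_begins_asynchronous_process(std::string url,
                                            std::promise<std::vector<unsigned char>> promise)
{
    // Hand the promise off to a background thread; the future returned by download()
    // stays valid because the shared state is reference-counted.
    std::thread(worker, std::move(url), std::move(promise)).detach();
}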

Locking/Unlocking functions with CRITICAL_SECTION

So, when I use EnterCriticalSection and LeaveCriticalSection, it throws an exception at me. This is my current setup:
void printer::Unlock()
{
    LeaveCriticalSection(&_cs);
}

void printer::Lock()
{
    EnterCriticalSection(&_cs);
}
_cs is a CRITICAL_SECTION object declared inside my class "printer" like this:

class printer {
private:
    static CRITICAL_SECTION _cs;
When I call "Lock" it throws the exception. I'm not really sure why; I've tried reading MSDN but I don't quite 100% understand it.
(I don't want to use mutexes...)
I believe you need to add
InitializeCriticalSection(&_cs);
If that fails, you might try changing the CRITICAL_SECTION _cs to mutable rather than static, but that's kind of a shot in the dark.
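A minimal sketch of that, assuming the rest of the printer class stays as-is: the static member needs exactly one out-of-class definition, and the critical section must be initialized once before the first Lock() call and deleted when you are done with it.

#include <windows.h>

class printer {
public:
    static void Init()     { InitializeCriticalSection(&_cs); } // call once at startup
    static void Shutdown() { DeleteCriticalSection(&_cs); }     // call once at shutdown
    void Lock()            { EnterCriticalSection(&_cs); }
    void Unlock()          { LeaveCriticalSection(&_cs); }
private:
    static CRITICAL_SECTION _cs;
};

CRITICAL_SECTION printer::_cs; // the one definition of the static member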

How can I do automata/state machine coding in C++?

I have used it in another programming language and it's very useful.
I cannot find anything about this for C++.
Let's take the following code as an example:
#include <cstdio>
#include <cstdlib>

void change();

enum
{
    end = 0,
    gmx
};

int gExitType;

int main()
{
    gExitType = end;
    SetTimer(&change, 10000, 0); // SetTimer is provided by the surrounding framework
    return 0;
}

void ApplicationExit()
{
    switch (gExitType)
    {
    case end:
        printf("This application was ended by the server");
        break;
    case gmx:
        printf("This application was ended by the timer");
        break;
    }
    ::exit(0);
}

void change()
{
    gExitType = gmx;
    ApplicationExit();
}
That's roughly how we would do it in C++, but when using a state machine/automata I could do something like this in the other language:
void change();

int main()
{
    state exitType:end;
    SetTimer(&change, 10000, 0);
    return 0;
}

void ApplicationExit() <exitType:end>
{
    printf("This application was ended by the server");
}

void ApplicationExit() <exitType:gmx>
{
    printf("This application ended by the timer");
}

void change()
{
    state exitType:gmx;
    ApplicationExit();
}
In my opinion this is a really elegant way to achieve things.
How would I do this in C++? This code doesn't seem to work (obviously, as I cannot find anything automata-related for C++).
To clarify my opinion:
So what are the advantages of using this technique? Well, as you can clearly see, the code is smaller; granted, I added an enum to the first version to make the examples more similar, but the ApplicationExit functions are definitely smaller. It's also a lot more explicit - you don't need large switch statements in functions to determine what's going on, and if you wanted, you could put the different ApplicationExits in different files to handle different sets of code independently. It also uses fewer global variables.
There are C++ libraries like Boost.Statechart that specifically try to provide rich support for encoding state machines:
http://www.boost.org/doc/libs/1_54_0/libs/statechart/doc/tutorial.html
Besides this, one very elegant way to encode certain types of state machines is by defining them as a coroutine:
http://c2.com/cgi/wiki?CoRoutine
http://eli.thegreenplace.net/2009/08/29/co-routines-as-an-alternative-to-state-machines/
Coroutines are not directly supported in C++, but there are two possible approaches for implementing them:
1) Using a technique similar to Duff's device, explained in detail here (see the sketch after this list):
http://blog.think-async.com/search/label/coroutines
This is very similar to how C#'s iterators work, for example, and one limitation is that yielding from the coroutine can be done only from the topmost function in the coroutine call stack. On the other hand, the advantage of this method is that very little memory is required for each instance of the coroutine.
2) Allocating a separate stack and register space for each coroutine.
This essentially makes the coroutine a full-blown thread of execution, with the only difference that the user has full responsibility for the thread scheduling (also known as cooperative multitasking).
A portable implementation is available from Boost:
http://www.boost.org/doc/libs/1_54_0/libs/coroutine/doc/html/coroutine/intro.html
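As an illustration of approach 1), here is a minimal sketch of the switch-based technique (hypothetical macros in the spirit of protothreads, not code from the linked article): each call to run() resumes where the previous call left off.

#include <cstdio>

#define CORO_BEGIN(state) switch (state) { case 0:
#define CORO_YIELD(state) do { state = __LINE__; return; case __LINE__:; } while (0)
#define CORO_END() }

struct Counter {
    int state = 0;   // remembers the resume point between calls
    int i = 0;
    void run() {
        CORO_BEGIN(state);
        for (i = 0; i < 3; ++i) {
            std::printf("step %d\n", i);
            CORO_YIELD(state);   // return now, resume here on the next call
        }
        CORO_END();
    }
};

int main() {
    Counter c;
    c.run();  // prints "step 0"
    c.run();  // prints "step 1"
    c.run();  // prints "step 2"
    return 0;
}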
For this particular example, you could use objects and polymorphism to represent the different states. For example:
#include <cstdio>
#include <cstdlib>

class StateObject
{
public:
    virtual void action(void) = 0;
};

class EndedBy : public StateObject
{
private:
    const char *const reason;
public:
    EndedBy( const char *const reason_ ) : reason( reason_ ) { }
    virtual void action(void)
    {
        puts(reason);
    }
};

EndedBy EndedByServer("This application was ended by the server");
EndedBy EndedByTimer ("This application ended by the timer");

StateObject *state = &EndedByServer;

void change()
{
    state = &EndedByTimer;
}

void ApplicationExit()
{
    state->action();
    ::exit(0);
}

int main()
{
    SetTimer(&change, 10000, 0);
    // whatever stuff here...
    // presumably eventually causes ApplicationExit() to get called before return 0;
    return 0;
}
That said, this isn't great design, and it isn't an FSM in the general sense. But it would address your immediate need.
You might look up the State Pattern (one reference: http://en.wikipedia.org/wiki/State_pattern ) for a more general treatment of this pattern.
The basic idea is that each state is a subclass of some common "state" class, and you use polymorphism to determine the different actions and behaviors represented by each state. A pointer to the common "state" base class then keeps track of the state you're currently in.
The state objects may be of different types, or, as in my example above, different instances of the same type configured differently, or a blend.
You can use template value specialization over an int to achieve pretty much what you want.
(Sorry, I'm on my tablet so I cannot provide an example; I will update on Sunday.)
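Here is a hypothetical sketch of what that might look like (my guess at the intent, not the answerer's code); note that, unlike the runtime gExitType in the question, value specialization picks the state at compile time:

#include <cstdio>

enum ExitType { end = 0, gmx };

// Primary template is only declared; each state gets its own specialization.
template <ExitType E> void ApplicationExit();

template <> void ApplicationExit<end>() {
    std::printf("This application was ended by the server\n");
}

template <> void ApplicationExit<gmx>() {
    std::printf("This application was ended by the timer\n");
}

int main() {
    ApplicationExit<end>();  // which body runs is chosen by the template argument
    ApplicationExit<gmx>();
    return 0;
}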

Thread-Safe Game Engine: Multi-Threading Best Practices?

I'm writing a multi-threaded game engine, and I'm wondering about best practices around waiting for threads. It occurs to me that there could be much better options out there than what I've implemented, so I'm wondering what you guys think.
Option A) "wait()" method gets called at the top of every other method in the class. This is my current implementation, and I'm realizing it's not ideal.
#include <SDL_thread.h>

class Texture {
public:
    Texture(const char *filename, bool async = true);
    ~Texture();
    void Render();
private:
    SDL_Thread *thread;
    const char *filename;
    void wait();
    static int load(void *data);
};

void Texture::wait() {
    if (thread != NULL) {
        SDL_WaitThread(thread, NULL);
        thread = NULL;
    }
}

int Texture::load(void *data) {
    Texture *self = static_cast<Texture *>(data);
    // Load the Image Data in the Thread Here...
    return 0;
}

Texture::Texture(const char *filename, bool async) {
    this->filename = filename;
    if (async) {
        thread = SDL_CreateThread(load, NULL, this);
    } else {
        thread = NULL;
        load(this);
    }
}

Texture::~Texture() {
    // Unload the Thread and Texture Here
}

void Texture::Render() {
    wait();
    // Render the Texture Here
}
Option B) Convert the "wait()" method into a function pointer. This would save my program a jump at the top of every other method; each method would simply check "thread != NULL" at the top. Still not ideal, but I feel like the fewer jumps, the better. (I've also considered just using the "inline" keyword on the function... but would this pull in the entire contents of the wait function when all I really need is the "if (thread != NULL)" check to decide whether the rest of the code should be executed?)
Option C) Convert all of the class's methods into function pointers, and ditch the whole concept of calling "wait()" except while actually loading the texture. I see advantages and disadvantages to this approach... namely, it feels the most difficult to implement and keep track of. Admittedly, my knowledge of the inner workings of GCC's optimizations, assembly, and especially memory-to-CPU-to-memory communication isn't the best, so using a bunch of function pointers might actually be slower than a properly defined class.
Anyone have any even better ideas?
Best practice is often not reinventing the wheel :D
You might want to take a look at the std::thread library, if you have a compiler that supports C++11. Everything you need is already implemented and made as safe as possible (which is not really safe, considering the topic).
In particular, your wait() function can be implemented with std::condition_variable.
The Boost thread library offers pretty much the same functionality.
I don't know about the library you're using, sorry :D
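As a minimal sketch (my own assumptions, not the answerer's code) of what that suggestion looks like in practice, here the loader thread flips a flag and wait() blocks on a std::condition_variable until it does:

#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

class Texture {
public:
    explicit Texture(const char *filename) {
        worker = std::thread([this, filename] {
            std::printf("loading %s\n", filename);   // stand-in for the real decode
            {
                std::lock_guard<std::mutex> lock(m);
                loaded = true;
            }
            cv.notify_all();
        });
    }

    ~Texture() {
        if (worker.joinable())
            worker.join();
    }

    void Render() {
        wait();
        // Render the Texture Here
    }

private:
    void wait() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return loaded; });   // blocks until the loader is done
    }

    std::thread worker;
    std::mutex m;
    std::condition_variable cv;
    bool loaded = false;
};

int main() {
    Texture tex("player.png");
    tex.Render();   // first use waits for the background load to finish
    return 0;
}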