Proper Vulkan frame synchronization - C++

I'm trying to synchronize frames in the Vulkan API, but I'm running into some weird problems. I implemented synchronization like this:
void RenderSystem::OnUpdate(const float deltaTime)
{
    uint32_t frameIndex{};
    auto result = SwapChain->AcquireNextImageIndex(PresentationCompleteSemaphore.get(),
                                                   nullptr,
                                                   &frameIndex);

    InFlightFences[frameIndex]->Wait();
    InFlightFences[frameIndex]->Reset();

    if (result == VK_ERROR_OUT_OF_DATE_KHR)
    {
        Recreate();
        return;
    }
    else if (result != VK_SUCCESS && result != VK_SUBOPTIMAL_KHR)
    {
        throw std::runtime_error("Error when acquiring next image...");
    }

    UpdateModelMatrix(deltaTime, frameIndex); // TODO: Remove this! For testing purposes only

    VkPipelineStageFlags waitStages[] = { VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT };

    GraphicsMainQueue.Submit({ TriangleCommandBuffers[frameIndex].get() },
                             { PresentationCompleteSemaphore.get() },
                             { RenderCompleteSemaphore.get() },
                             InFlightFences[frameIndex].get(),
                             waitStages);

    result = PresentationQueue.Present({ RenderCompleteSemaphore.get() },
                                       { SwapChain.get() },
                                       &frameIndex);

    if (result == VK_ERROR_OUT_OF_DATE_KHR || result == VK_SUBOPTIMAL_KHR || MainWindow->HasBeenResized())
        Recreate();
    else if (result != VK_SUCCESS)
        throw std::runtime_error("Failed to present result!");
}
It works like a charm on Windows 10. Unfortunately, on Linux Mint it doesn't work in some cases. First of all, moving the window on Linux is very laggy and sometimes freezes the whole OS for a second, but that's not the biggest problem. Closing the window calls vkDeviceWaitIdle and... it freezes the application. It never starts responding again because it waits for the device forever. The validation layer doesn't report any problem with my code.
I partly solved this problem by moving the fence synchronization to the bottom of my function, but in my opinion it's a suboptimal solution, because I wait for the frame to finish rendering instead of preparing the next frame.
// ...
    if (result == VK_ERROR_OUT_OF_DATE_KHR || result == VK_SUBOPTIMAL_KHR || MainWindow->HasBeenResized())
        Recreate();
    else if (result != VK_SUCCESS)
        throw std::runtime_error("Failed to present result!");

    InFlightFences[frameIndex]->Wait();
    InFlightFences[frameIndex]->Reset();
}
How can I properly synchronize frames not only on Windows but also on Linux? What am I doing wrong? What am I missing?

You have only one set of semaphores. That means access to those semaphores may not be properly synchronized.
Let's see the code without the distractors:
AcquireNextImageIndex( PresentationCompleteSemaphore, frameIndex );
InFlightFences[frameIndex].WaitAndReset();
QSubmit( PresentationCompleteSemaphore, RenderCompleteSemaphore, InFlightFences[frameIndex] );
Present( RenderCompleteSemaphore, frameIndex );
Now, how do we know we can reuse PresentationCompleteSemaphore on Acquire? The Submit waits on (and unsignals) it, and must finish doing so first. We could infer this from the fence, but the fence wait happens after the Acquire. So the semaphore still might be in use while Acquire tries to reuse it. This is a possible program flow:
AcquireNextImageIndex( PresentationCompleteSemaphore ) -> frameIndex = 0;
QSubmit( PresentationCompleteSemaphore, RenderCompleteSemaphore, InFlightFences[0] );
// hazard; QSubmit still might be waiting on PresentationCompleteSemaphore
AcquireNextImageIndex( PresentationCompleteSemaphore ) -> frameIndex = 1;
How do we know we can reuse RenderCompleteSemaphore? The QSubmit can only use it when Present is already done with it. The only sane way currently to infer that is when Acquire gives back the same swapchain image. This is a possible program flow:
AcquireNextImageIndex( PresentationCompleteSemaphore ) -> frameIndex = 0;
QSubmit( PresentationCompleteSemaphore, RenderCompleteSemaphore, InFlightFences[0] );
Present( RenderCompleteSemaphore, 0 );
AcquireNextImageIndex( PresentationCompleteSemaphore ) -> frameIndex = 1;
// hazard; RenderCompleteSemaphore might still be waited on by Present
// which presented image 0, but we acquired image 1, so it might be async
QSubmit( PresentationCompleteSemaphore, RenderCompleteSemaphore, InFlightFences[1] );
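A common way to avoid both hazards is to keep one full set of sync objects per frame in flight and to wait on that slot's fence before touching its semaphores again. Below is only a minimal sketch of that scheme in plain Vulkan calls; MAX_FRAMES_IN_FLIGHT, the imageAvailable/renderFinished/inFlight arrays, RecreateSwapchain and the elided recording/present code are assumptions, not taken from the question's wrappers:

#include <vulkan/vulkan.h>
#include <array>
#include <cstdint>

static constexpr uint32_t MAX_FRAMES_IN_FLIGHT = 2;

// Created once at startup; the fences are created already signaled.
extern VkDevice device;
extern VkSwapchainKHR swapchain;
extern std::array<VkSemaphore, MAX_FRAMES_IN_FLIGHT> imageAvailable;
extern std::array<VkSemaphore, MAX_FRAMES_IN_FLIGHT> renderFinished;
extern std::array<VkFence, MAX_FRAMES_IN_FLIGHT> inFlight;
extern void RecreateSwapchain();

void DrawFrame()
{
    static uint32_t currentFrame = 0;

    // The fence proves the previous submission that waited on
    // imageAvailable[currentFrame] has finished, so that semaphore is
    // unsignaled again and safe to hand to Acquire.
    vkWaitForFences(device, 1, &inFlight[currentFrame], VK_TRUE, UINT64_MAX);

    uint32_t imageIndex = 0;
    VkResult result = vkAcquireNextImageKHR(device, swapchain, UINT64_MAX,
                                            imageAvailable[currentFrame],
                                            VK_NULL_HANDLE, &imageIndex);
    if (result == VK_ERROR_OUT_OF_DATE_KHR)
    {
        RecreateSwapchain();
        return; // the fence is still signaled, so the next call won't deadlock
    }

    // Reset only once we know a submission will follow, then submit with
    //   wait:   imageAvailable[currentFrame]
    //   signal: renderFinished[currentFrame]
    //   fence:  inFlight[currentFrame]
    vkResetFences(device, 1, &inFlight[currentFrame]);
    // ... record and vkQueueSubmit(...) ...

    // Present waits on renderFinished[currentFrame]; as the second hazard above
    // suggests, indexing that semaphore by swapchain image is even safer.
    // ... vkQueuePresentKHR(...) ...

    currentFrame = (currentFrame + 1) % MAX_FRAMES_IN_FLIGHT;
}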

Related

Using timer with zmq

I am working on a project where I have to use zmq_poll, but I do not completely understand what it does.
So I tried to implement it:
zmq_pollitem_t timer_open(void){
    zmq_pollitem_t items[1];

    if( items[0].socket == nullptr ){
        printf("error socket %s: %s\n", zmq_strerror(zmq_errno()));
        return;
    }
    else{
        items[0].socket = gsock;
    }

    items[0].fd = -1;
    items[0].events = ZMQ_POLLIN;

    // get a timer
    items[0].fd = timerfd_create( CLOCK_REALTIME, 0 );
    if( items[0].fd == -1 )
    {
        printf("timerfd_create() failed: errno=%d\n", errno);
        items[0].socket = nullptr;
        return;
    }

    int rc = zmq_poll(items, 1, -1);
    if(rc == -1){
        printf("error poll %s: %s\n", zmq_strerror(zmq_errno()));
        return;
    }
    else
        return items[0];
}
I am very new to this topic; I have to modify an old existing project and replace its functions with the zmq ones. On other websites I saw examples where they used two items and called zmq_poll in an endless loop. I have read the documentation but still could not properly understand how this works. These are the other two functions I have implemented; I do not know if this is the correct way to implement them:
void timer_set(zmq_pollitem_t items[], long msec, ipc_timer_mode_t mode) {
    struct itimerspec t;
    ...
    timerfd_settime( items[0].fd, 0, &t, NULL );
}

void timer_close(zmq_pollitem_t items[]){
    if( items[0].fd != -1 )
        close(items[0].fd);
    items[0].socket = nullptr;
}
I am not sure if I need the zmq_poll function because I am using a timer.
EDIT:
void some_function_timer_example() {
    // We want to wait on two timers
    zmq_pollitem_t items[2];

    // Setup first timer
    ipc_timer_open_(&items[0]);
    ipc_timer_set_(&items[0], 1000, IPC_TIMER_ONE_SHOT);

    // Setup second timer
    ipc_timer_open_(&items[1]);
    ipc_timer_set_(&items[1], 1000, IPC_TIMER_ONE_SHOT);

    // Now wait for the timers in a loop
    while (1) {
        //ipc_timer_set_(&items[0], 1000, IPC_TIMER_REPEAT);
        //ipc_timer_set_(&items[1], 5000, IPC_TIMER_REPEAT);
        int rc = zmq_poll (items, 2, -1);
        assert (rc >= 0); /* Returned events will be stored in items[].revents */

        if (items[0].revents & ZMQ_POLLIN) {
            // Process task
            std::cout << "revents: 1" << std::endl;
        }
        if (items[1].revents & ZMQ_POLLIN) {
            // Process weather update
            std::cout << "revents: 2" << std::endl;
        }
    }
}
Now it still prints very fast and is not waiting; it only waits at the beginning. When the timer_set is inside the loop it waits properly, but only if the waiting times are the same, like ipc_timer_set(&items[1], 1000, ...) and ipc_timer_set(&items[0], 1000, ...).
So how do I have to change this? Or is this the correct behavior?
zmq_poll works like select, but it allows some additional things. For instance, you can poll on regular synchronous file descriptors as well as the special asynchronous ØMQ sockets.
In your case you can use the timer fd as you have tried to do, but you need to make a few small changes.
First you have to consider how you will invoke these timers. I think the use case is that you want to create multiple timers and wait for them. This would typically be the function in your current code that runs a loop for the timer (either using select() or whatever else it might be doing).
It would be something like this:
void some_function() {
    // We want to wait on two timers
    zmq_pollitem_t items[2];

    // Setup first timer
    ipc_timer_open(&items[0]);
    ipc_timer_set(&items[0], 1000, IPC_TIMER_REPEAT);

    // Setup second timer
    ipc_timer_open(&items[1]);
    ipc_timer_set(&items[1], 5000, IPC_TIMER_ONE_SHOT);

    // Now wait for the timers in a loop
    while (1) {
        int rc = zmq_poll (items, 2, -1);
        assert (rc >= 0); /* Returned events will be stored in items[].revents */
    }
}
Now, you need to fix the ipc_timer_open. It will be very simple - just create the timer fd.
// Takes a pointer to pre-allocated zmq_pollitem_t and returns 0 for success, -1 for error
int ipc_timer_open(zmq_pollitem_t *items){
    items[0].socket = NULL;
    items[0].events = ZMQ_POLLIN;

    // get a timer
    items[0].fd = timerfd_create( CLOCK_REALTIME, 0 );
    if( items[0].fd == -1 )
    {
        printf("timerfd_create() failed: errno=%d\n", errno);
        return -1; // error
    }
    return 0;
}
Edit: Added as reply to comment, since this is long:
From the documentation:
If both socket and fd are set in a single zmq_pollitem_t, the ØMQ socket referenced by socket shall take precedence and the value of fd shall be ignored.
So if you are passing the fd, you have to set socket to NULL. I am not even clear where gsock is coming from. Is this in the documentation? I couldn't find it.
And when will it break out of the while(1) loop?
This is application logic, and you have to code it according to what you require. zmq_poll just keeps returning every time one of the timers fires. In this example, zmq_poll returns every second because the first timer (which is a repeat) keeps triggering. But at 5 seconds it will also return because of the second timer (which is a one-shot). It's up to you to decide when to exit the loop. Do you want this to go on forever? Do you need to check a different condition to exit the loop? Do you want to do this, say, 100 times and then return? You can code whatever logic you want on top of this code.
And what kind of events are returned back
ZMQ_POLLIN since timer fds behave like readable file descriptors.
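One detail worth a hedged sketch (this is general timerfd behaviour, not something the question's ipc_timer_* wrappers show): after zmq_poll reports ZMQ_POLLIN on a timer fd, the 8-byte expiration counter has to be read from that fd, otherwise it stays readable and the loop returns again immediately:

#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <zmq.h>

// Sketch: consume a timerfd expiration after zmq_poll marked it readable.
// 'item' is the zmq_pollitem_t whose fd is the timerfd.
static void drain_timerfd(zmq_pollitem_t *item)
{
    if (item->revents & ZMQ_POLLIN) {
        uint64_t expirations = 0; // number of times the timer fired since the last read
        if (read(item->fd, &expirations, sizeof(expirations)) != sizeof(expirations))
            printf("read() on timerfd failed\n");
    }
}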

QNX pthread_mutex_lock causing deadlock error ( 45 = EDEADLK )

I am implementing an asynchronous log writing mechanism for my project's multithreaded application. Below is partial code for the part where the error occurs.
void CTraceFileWriterThread::run()
{
    bool fShoudIRun = shouldThreadsRun(); // Some global function which decides if operations need to stop. Not really relevant here. Assume "true" value.
    while(fShoudIRun)
    {
        std::string nextMessage = fetchNext();
        if( !nextMessage.empty() )
        {
            process(nextMessage);
        }
        else
        {
            fShoudIRun = shouldThreadsRun();
            condVarTraceWriter.wait();
        }
    }
}

//This is the consumer. This is in my thread with lower priority
std::string CTraceFileWriterThread::fetchNext()
{
    // When there are a lot of logs, I mean A LOT, I believe the
    // control stays in this function for a long time and another
    // thread calling the "add" function is not able to acquire the lock
    // since it is held here.
    std::string message;
    if( !writeQueue.empty() )
    {
        writeQueueMutex.lock(); // Obj of our wrapper around pthread_mutex_lock
        message = writeQueue.front();
        writeQueue.pop(); // std::queue
        writeQueueMutex.unLock();
    }
    return message;
}

// This is the producer and is called from multiple threads.
void CTraceFileWriterThread::add( std::string outputString )
{
    if ( !outputString.empty() )
    {
        // crashes here while trying to acquire the lock when there are lots of
        // logs in prod systems.
        writeQueueMutex.lock();
        const size_t writeQueueSize = writeQueue.size();
        if ( writeQueueSize == maximumWriteQueueCapacity )
        {
            outputString.append( "\n queue full, discarding traces, traces are incomplete" );
        }
        if ( writeQueueSize <= maximumWriteQueueCapacity )
        {
            bool wasEmpty = writeQueue.empty();
            writeQueue.push(outputString);
            condVarTraceWriter.post(); // will be waiting in a function which calls "fetchNext"
        }
        writeQueueMutex.unLock();
    }
}

int wrapperMutex::lock() {
    //#[ operation lock()
    int iRetval;
    int iRetry = 10;
    do
    {
        //
        iRetry--;
        tRfcErrno = pthread_mutex_lock (&tMutex);
        if ( (tRfcErrno == EINTR) || (tRfcErrno == EAGAIN) )
        {
            iRetval = RFC_ERROR;
            (void)sched_yield();
        }
        else if (tRfcErrno != EOK)
        {
            iRetval = RFC_ERROR;
            iRetry = 0;
        }
        else
        {
            iRetval = RFC_OK;
            iRetry = 0;
        }
    } while (iRetry > 0);
    return iRetval;
    //#]
}
I generated the core dump and analysed it with GDB, and here are some findings:
Program terminated with signal 11, Segmentation fault.
"Errno=45" at the add function where I am trying to acquire the lock. The wrapper we have around pthread_mutex_lock tries to acquire the lock for around 10 times before it gives up.
The code works fine when there are fewer logs. Also, we do not have C++11 or later and hence are restricted to the QNX mutex. Any help is appreciated, as I have been looking at this issue for over a month with little progress. Please ask if any more info is required.

Vulkan Fence already in use by another submission

I'm trying to create a game using Vulkan and C++. I've got to the part where I use multiple command buffers with threading - at least I think so, if I'm doing it correctly.
Now I'm having a problem with a fence. The console (I added a validation layer) says "Fence 0x21 is already in use by another submission."
I never used the fence in other functions.
The code below is the draw function. I call this function in a loop.
update_ubo (); // this function just writes uniform data to the uniform buffer in the local device.

uint32_t image_index = 0;
VkResult result = vkAcquireNextImageKHR (device, swapchain, numeric_limits <uint64_t>::max (), semaphore_image_avail, fence, &image_index);

// I hope I'm using multithreading correctly.
// all command buffers recorded in the record_commandbuffers function are secondary command buffers.
#pragma omp parallel for num_threads(thread::hardware_concurrency ())
for (int64_t i = 0 ; i < (int64_t) vkthreads.size () ; i ++)
    record_commandbuffers (vkthreads [i], framebuffers [image_index]);

VkCommandBufferBeginInfo cmdbuf_info = {
    VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
    nullptr,
    VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
    nullptr
};

VkRenderPassBeginInfo renderpass_begin = {
    VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
    nullptr,
    renderpass,
    framebuffers [image_index],
    {
        { 0, 0 },
        swapchain_extent
    },
    1,
    &clear_value
};

vkBeginCommandBuffer (pcmdbuf, &cmdbuf_info);
vkCmdBeginRenderPass (pcmdbuf, &renderpass_begin, VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS);
for (size_t i = 0 ; i < vkthreads.size () ; i ++)
    vkCmdExecuteCommands (pcmdbuf, (uint32_t) vkthreads [i].cmdbufs.size (), vkthreads [i].cmdbufs.data ());
vkCmdEndRenderPass (pcmdbuf);
vkEndCommandBuffer (pcmdbuf);

if (result == VK_ERROR_OUT_OF_DATE_KHR)
    window_changed ();
else if (result != VK_SUCCESS && result != VK_SUBOPTIMAL_KHR)
    throw exception ("Could not acquire next images.");

VkPipelineStageFlags pipeline_flags [] = { VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT };

VkSubmitInfo submit_info = {
    VK_STRUCTURE_TYPE_SUBMIT_INFO,
    nullptr,
    1,
    &semaphore_image_avail,
    pipeline_flags,
    1,
    &pcmdbuf,
    1,
    &semaphore_render_finished
};

if (vkQueueSubmit (graphics_queue, 1, &submit_info, fence))
    throw exception ("Could not submit information into the graphics queue.");

while (vkWaitForFences (device, 1, &fence, VK_TRUE, (uint64_t)100000000) == VK_TIMEOUT)
    ;
vkResetFences (device, 1, &fence);

VkSwapchainKHR swapchains [] = { swapchain };
VkPresentInfoKHR present_info = {
    VK_STRUCTURE_TYPE_PRESENT_INFO_KHR,
    nullptr,
    1,
    &semaphore_render_finished,
    1,
    swapchains,
    &image_index,
    nullptr
};

result = vkQueuePresentKHR (present_queue, &present_info);
if (result == VK_ERROR_OUT_OF_DATE_KHR)
    window_changed ();
else if (result != VK_SUCCESS && result != VK_SUBOPTIMAL_KHR)
    throw exception ("Could not present the queue.");
P.S. FPS dropped significantly when I added multithreading (2000 FPS to 210 FPS, in a debug build), and CPU usage went up significantly, which is expected. Should I care about FPS?
You pass the same fence to AcquireNextImage and QueueSubmit without waiting in between. You only need to pass it to QueueSubmit, as the semaphore will take care of any required sync. Just pass VK_NULL_HANDLE to vkAcquireNextImageKHR.
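Concretely, only the acquire call changes. A minimal sketch reusing the question's own variable names (device, swapchain, semaphore_image_avail, fence, image_index), with the rest of the submit/present code left as it was:

// Only the semaphore is signaled by the acquire; the fence is reserved for the submit.
VkResult result = vkAcquireNextImageKHR (device, swapchain,
                                         numeric_limits <uint64_t>::max (),
                                         semaphore_image_avail,
                                         VK_NULL_HANDLE,   // was: fence
                                         &image_index);

// ... record command buffers and fill submit_info exactly as before ...

// The fence is now signaled only by this submission, so the existing
// vkWaitForFences / vkResetFences pair that follows remains correct.
if (vkQueueSubmit (graphics_queue, 1, &submit_info, fence))
    throw exception ("Could not submit information into the graphics queue.");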
Paying such a huge penalty of 5 ms per frame due to threading overhead does seem a bit steep; I'd expect a millisecond or two from the scheduler, though it depends on how you are actually multithreading. But as long as the total per frame stays under 16 ms for 60 FPS, it's no big deal.

How do I interrupt xcb_wait_for_event?

In a separate thread (std::thread), I have an event loop that waits on xcb_wait_for_event. When the program exits, I'd like to shut things down nicely by interrupting (I have a solution that sets a thread-local variable, and checkpoints in the loop throw an exception), and then joining my event thread into the main thread. The issue is xcb_wait_for_event; I need a way to return from it early, or I need an alternative to the function.
Can anyone suggest a solution? Thanks for your help!
I believe I've come up with a suitable solution. I've replaced xcb_wait_for_event with the following function:
xcb_generic_event_t *WaitForEvent(xcb_connection_t *XConnection)
{
    xcb_generic_event_t *Event = nullptr;
    int XCBFileDescriptor = xcb_get_file_descriptor(XConnection);
    fd_set FileDescriptors;
    struct timespec Timeout = { 0, 250000000 }; // Check for interruptions every 0.25 seconds

    while (true)
    {
        interruptible<std::thread>::check();

        FD_ZERO(&FileDescriptors);
        FD_SET(XCBFileDescriptor, &FileDescriptors);

        if (pselect(XCBFileDescriptor + 1, &FileDescriptors, nullptr, nullptr, &Timeout, nullptr) > 0)
        {
            if ((Event = xcb_poll_for_event(XConnection)))
                break;
        }
    }

    interruptible<std::thread>::check();
    return Event;
}
Making use of xcb_get_file_descriptor, I can use pselect to wait until there are new events, or until a specified timeout has occurred. This method incurs negligible additional CPU costs, resting at a flat 0.0% (on this i7). The only "downside" is having to wait a maximum of 0.25 seconds to check for interruptions, and I'm sure that limit could be safely lowered.
A neater way would be to do something like this (the code snippet is extracted from some code I am currently working on):
void QXcbEventQueue::sendCloseConnectionEvent() const
{
    // A hack to close XCB connection. Apparently XCB does not have any APIs for this?
    xcb_client_message_event_t event;
    memset(&event, 0, sizeof(event));

    event.response_type = XCB_CLIENT_MESSAGE;
    event.format = 32;
    event.sequence = 0;
    event.window = m_connection->clientLeader();
    event.type = m_connection->atom(QXcbAtom::_QT_CLOSE_CONNECTION);
    event.data.data32[0] = 0;

    xcb_connection_t *c = m_connection->xcb_connection();
    xcb_send_event(c, false, m_connection->clientLeader(),
                   XCB_EVENT_MASK_NO_EVENT, reinterpret_cast<const char *>(&event));
    xcb_flush(c);
}
For _QT_CLOSE_CONNECTION, use your own atom to signal an exit; in my case clientLeader() is some invisible window that is always present on my X11 connection. If you don't have any invisible window that could be reused for this purpose, create one :)
With this you can terminate the thread that is blocked in xcb_wait_for_event when you see this special event arriving.
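For reference, the receiving side might look roughly like this; connection and closeConnectionAtom are placeholders for whatever xcb_connection_t and atom you actually use, so treat this as a sketch rather than the answerer's code:

#include <xcb/xcb.h>
#include <cstdint>
#include <cstdlib>

// Break out of the blocking wait when the custom client message arrives.
void EventLoop(xcb_connection_t *connection, xcb_atom_t closeConnectionAtom)
{
    while (xcb_generic_event_t *event = xcb_wait_for_event(connection))
    {
        const uint8_t type = event->response_type & ~0x80; // strip the "synthetic event" bit
        if (type == XCB_CLIENT_MESSAGE)
        {
            auto *client = reinterpret_cast<xcb_client_message_event_t *>(event);
            if (client->type == closeConnectionAtom)
            {
                std::free(event);
                break; // leave the loop so the thread can be joined
            }
        }
        // ... dispatch all other events as usual ...
        std::free(event);
    }
}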

How to properly handle audio interruptions?

I've created an OpenGL 3D game that uses OpenAL for audio playback, and I'm experiencing a problem of losing audio if the "Home" button is pressed before the audio device is initialized. I tried to hook into the audio session interruption handler, but my callback is never called, no matter whether I minimize or maximize my application. My "OpenALInterruptionListener" is never called.
What am I doing wrong?
AudioSessionInitialize(NULL, NULL, OpenALInterriptionListener, this);
void OpenALInterriptionListener(void * inClientData, UInt32 inInterruptionState)
{
    OpenALDevice * device = (OpenALDevice *) inClientData;

    if (inInterruptionState == kAudioSessionBeginInterruption)
    {
        alcSuspendContext(_context);
        alcMakeContextCurrent(_context);
        AudioSessionSetActive(false);
    }
    else if (inInterruptionState == kAudioSessionEndInterruption)
    {
        UInt32 sessionCategory = kAudioSessionCategory_AmbientSound;
        AudioSessionSetProperty(kAudioSessionProperty_AudioCategory, sizeof(sessionCategory), &sessionCategory);
        AudioSessionSetActive(true);
        alcMakeContextCurrent(_context);
        alcProcessContext(_context);
    }
}
Please note that there are currently issues with audio interruptions on iOS. Begin-interruption notifications are fine, but end-interruption notifications do not always work. There is a bug filed with Apple on this and they have not responded.
Try using NULL in alcMakeContextCurrent()
void OpenALInterriptionListener(void *inClientData, UInt32 inInterruptionState)
{
    OpenALDevice * device = (OpenALDevice *) inClientData;
    OSStatus nResult;

    if( inInterruptionState == kAudioSessionBeginInterruption )
    {
        alcMakeContextCurrent(NULL);
    }
    else if( inInterruptionState == kAudioSessionEndInterruption )
    {
        nResult = AudioSessionSetActive(true);
        if( nResult )
        {
            // "Error setting audio session active"
        }
        alcMakeContextCurrent( device->GetContext() );
    }
}
}