After a while, glGenBuffers is very slow - opengl

I'm programming an in-game editor for my simple 2D game, where meshes are dynamically created and removed quite often. I'm using libGDX.
Everything works fine, except that the editor becomes slow: it responds to input events very late.
To find the weak point in my code (which isn't optimized at all at the moment), I ran JProfiler and profiled the CPU. It turns out that glGenBuffers took over 2 seconds for 14 invocations! That is for 7 meshes, with one vertex and one index buffer each. I have a fairly fast machine (i7-4790T, GTX 980M, 16 GB RAM...), so the hardware shouldn't be the problem here.
I just want to know how this is possible, since I have no idea where to start looking.

The first two comments were right. I was rebuilding a large part of the scene just to check whether it still exists (not that clever in general). If it did still exist, the rebuilt part wasn't disposed; only replaced parts were cleaned up correctly. :/
I found the issue by profiling memory with JProfiler, as suggested in the first comment.
I didn't check the graphics memory, but the garbage must have been piling up there as well.

Buffers always live in memory, so cleanup is always needed: framebuffers, byte buffers, etc.
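For reference, here is a minimal sketch of that cleanup in raw OpenGL terms; libGDX exposes the same thing through dispose() on Mesh and the other Disposable resources, so the names below are plain GL calls, not libGDX code:

GLuint vbo = 0, ibo = 0;
glGenBuffers(1, &vbo);   // vertex buffer for the mesh
glGenBuffers(1, &ibo);   // index buffer for the mesh
// ... upload data and draw for as long as the mesh is alive ...

// When the mesh is rebuilt or removed, the buffers must be deleted again,
// otherwise the driver keeps accumulating buffer objects:
glDeleteBuffers(1, &vbo);
glDeleteBuffers(1, &ibo);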

Related

UE4 - Detect Compute Shader Completion

I'm currently trying to make a plugin for Unreal that involves a compute shader. Currently, my shader runs, and I can access the memory block that it writes to, but I can't get to it at the right time. Either my array isn't fully formed, or I've come too late and all my data is (seemingly) gone. I don't expect there to be a BlockUntilShaderIsDonePlease function, but anything to go on to grab my data when everything's been filled out would be greatly appreciated.
Currently, I'm trying to loop until I extract the length value I expect from my shader code:
//This is at the bottom of my usf function
InterlockedExchange(OutputArray[0], OutputArray.IncrementCounter(), Count);
Not only does this not work, but it seems like a bad way to go about it.
I've spent the last month scratching away at this; any help from someone who knows what they're talking about would be huge for me. Thanks!
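One pattern worth sketching here (hedged, not the plugin's actual code): enqueue the dispatch plus a CPU-side readback as a single render command, then block the game thread until the render thread has executed it with FlushRenderingCommands(). DispatchMyComputeShader and CopyResultsToCPU are hypothetical placeholders for the plugin's own dispatch/readback code; ENQUEUE_RENDER_COMMAND and FlushRenderingCommands are standard UE4 rendering-thread helpers.

#include "RenderingThread.h"   // ENQUEUE_RENDER_COMMAND, FlushRenderingCommands

void RunMyComputeShaderBlocking()
{
    ENQUEUE_RENDER_COMMAND(DispatchMyCS)(
        [](FRHICommandListImmediate& RHICmdList)
        {
            DispatchMyComputeShader(RHICmdList);   // fills the output buffer on the GPU
            CopyResultsToCPU(RHICmdList);          // e.g. lock/copy the structured buffer here
        });

    // Blocks the calling (game) thread until the render thread has executed
    // everything queued so far, including the command above, so the CPU-side
    // copy made inside it is complete when this returns.
    FlushRenderingCommands();
}

This stalls the game thread, so polling a fence over several frames (FRenderCommandFence) is usually preferable if blocking is not acceptable.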

Style transfer on large image. (in chunks?)

I am looking into various style transfer models and I noted that they all have limited resolution (when running on a Pixel 3, for example, I couldn't go beyond 1,024x1,024 without going OOM).
I've noticed a few apps (eg this app) which appear to be doing style transfer on images up to ~10 MP. These apps also show a progress bar, which I guess means they don't just call a single TensorFlow "run" on the entire image, since otherwise they wouldn't know how much has been processed.
I would guess they are using some sort of tiling, but naively splitting the image into 256x256 tiles produces an inconsistent style (not just at the borders).
As this seems like an obvious problem, I tried to find publications about it, but couldn't find any. Am I missing something?
Thanks!
I would guess people split the model into multiple ones (for VGG it is easy to do manually, e.g. by layers) and then use Keras's model.summary() (or benchmarks) to estimate the relative time each step takes and thus drive the progress bar. Such a split probably also saves memory, as TensorFlow Lite might not be clever enough to reuse the memory holding intermediate activations from lower layers once they are no longer needed.

Testing ZXing - some discoveries

Just to start off, thanks for making ZXing freely available.
I have been working in the barcode field since '98, and have a fair bit of experience decoding barcodes from images.
I consult now, and one of my products is a program that tests barcode decoders. This is an extremely CPU-intensive test, with literally millions of images being tested per symbology. I test with different amounts of blur, different angles, different brightness and contrast, different sizes (pixels per module), curvature, perspective distortion, ripple, varying illumination ... the whole shebang. I am testing the C++ implementation on a Linux machine.
I have come across a few issues that may interest the developers:
1) Aztec has a bug that crashes sometimes. In AztecDecoder.cpp, in Decoder::correctBits, numECCCodewords sometimes takes on a negative value, and bad things happen after that. I was able to patch it with a simple test and a "throw" to complete my testing (a sketch of such a guard follows this list).
2) Aztec has a huge memory leak. In no time at all, my program is taking over one GB of RAM, and the number steadily increases over time. It gets bad enough that the OS starts bogging down and reboots after a while. I don't seem to have this problem with other symbologies.
3) You don't seem to make the version number available in the code. I know, you should keep track of which version you downloaded. Life isn't always like that, sometimes you inherit code, and you had nothing to do with downloading it. Even a simple #define in a header file would suffice. I like to display the version number of the decoder in my test program. I get this value directly from the decoder so that if I upgrade decoders, the reported version automatically changes. Just a nice thing to have.
4) Aztec appears to be extremely weak. I know, it's only alpha, but it hardly ever decodes. I don't mean to pick on Aztec, but it gave me the most issues.
5) The entire UPC family misdecodes (returns incorrect information for a correctly encoded barcode) extremely frequently. You may want to put some protection in there. ITF does a fair bit as well. All well-known weaknesses.
6) Aztec misdecodes as well. This should pretty well never happen with a Reed-Solomon error correction system. I haven't taken the time to look into this yet.
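Regarding item 1, here is a hypothetical sketch of the kind of guard described there, placed in Decoder::correctBits in AztecDecoder.cpp; the names follow the report, and the exact exception type should match whatever the surrounding ZXing code throws:

// Reject the symbol instead of continuing with a negative codeword count.
if (numECCCodewords < 0) {
    throw ReaderException("Aztec: invalid number of error correction codewords");
}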
Those were the major problems, now for comments:
1) There is no support for barcodes at angles (omnidirectional decoding). This is supported by all major commercial packages.
2) Decoding appears to be weak overall in terms of handling blur, and low pixels per module. Yes, I realize that it is a free package, but just stating the weak points.
That's about all that I found now. I'll update when I have more information.

How to search for images/png/jpeg/any other types in the memory of a program and display it?

Well, lately I have found a very interesting article about Map Hacks in online games.
The article mentioned that they used a memory scanner to look for images in the process's memory.
How would they build such a program? Is a solution for this freely available?
If not, how would I code it in C++? How can I know that a piece of memory is an "image"?
I can load my own DLL into the process, so that shouldn't be a big issue.
To answer your questions:
A memory scanner uses OS APIs to query memory from another process and search it for patterns or differences. A great tool for this is Cheat Engine.
The tool mentioned in the article visualizes the memory by coloring pixels according to the value of the bytes in memory. The alignment still needs to be done manually and could be very time consuming. I don't think the mentioned program was ever released.
The main problem is that you can't know that a particular piece of memory is supposed to be a map. Any big regular structure could look like one when colorized and aligned. Finding the actual piece of memory you are looking for is very hard.
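As a rough illustration of what such a scanner does under the hood, here is a minimal Windows-only sketch; it assumes the target process was opened with PROCESS_QUERY_INFORMATION | PROCESS_VM_READ and it skips error handling (a real tool would also record each region's base address so dumps can be compared by offset):

#include <windows.h>
#include <cstdint>
#include <vector>

std::vector<uint8_t> DumpProcessMemory(HANDLE process)
{
    std::vector<uint8_t> dump;
    MEMORY_BASIC_INFORMATION mbi{};
    uint8_t* addr = nullptr;

    // Walk the target's address space region by region.
    while (VirtualQueryEx(process, addr, &mbi, sizeof(mbi)) == sizeof(mbi)) {
        if (mbi.State == MEM_COMMIT && (mbi.Protect & (PAGE_READONLY | PAGE_READWRITE))) {
            std::vector<uint8_t> region(mbi.RegionSize);
            SIZE_T read = 0;
            if (ReadProcessMemory(process, mbi.BaseAddress, region.data(), region.size(), &read))
                dump.insert(dump.end(), region.begin(), region.begin() + read);
        }
        addr = static_cast<uint8_t*>(mbi.BaseAddress) + mbi.RegionSize;
    }
    return dump;
}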
Additional Info:
A map like this in a game is very dynamic: if units or anything else move, the visibility has to be updated. So the actual format is most likely a plain binary bitmap rather than a specific image format (PNG, JPEG, ...).
I personally find looking for a map structure in memory very inefficient and time-consuming. It's beautiful to show to people who have no idea about reverse engineering, but to me it seems very impractical. Which approach is best depends entirely on the game and your creativity.
I hope I can help you with the following example of how I made a map hack for StarCraft 2.
My idea was to load up a replay of a game, where I have full view of the map, and find the difference compared to loading up a normal game, where my vision is restricted. I switched a couple of times between replay and normal game and could indeed find a state variable that was 0 in a normal game and 1 in a replay (a common tool for finding memory like this is Cheat Engine).
Next I loaded the game up in a debugger and put a memory access breakpoint on this state variable. Then, when loading up a normal game, I changed the value whenever it was accessed while the map was loading. Through some trial and error I was able to find the code location responsible for revealing the minimap and the real map. The only task left was to create a DLL that detours that code location and makes sure the map is always revealed on every map load.
Reply to typ1232: You mention that it's hard and impractical to find the map structure in memory. Here's a method I have had great success with: load up a map in any game with fog of war, like StarCraft 2. Take a dump of the memory and save it. Send out troops/units to reveal as much of the previously undiscovered map as possible, then take another memory dump. Compare the two dumps and look closer at the areas in memory with a high frequency of changes. This is likely to be where the map is stored.
Sorry if I'm doing it wrong, I'm new to Stack Overflow :)
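A minimal sketch of the diffing step described above; the chunk size and the "lots of changes" threshold are arbitrary choices, and it assumes both dumps cover the same regions in the same order:

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

void ReportHotChunks(const std::vector<uint8_t>& before,
                     const std::vector<uint8_t>& after,
                     size_t chunkSize = 4096)
{
    const size_t n = std::min(before.size(), after.size());
    for (size_t base = 0; base < n; base += chunkSize) {
        const size_t end = std::min(base + chunkSize, n);
        size_t changed = 0;
        for (size_t i = base; i < end; ++i)
            changed += (before[i] != after[i]);
        if (changed > (end - base) / 4)   // more than 25% of the bytes changed
            std::printf("chunk at offset 0x%zx: %zu bytes differ\n", base, changed);
    }
}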
This might be a somewhat broader answer to the subject of "finding data", but there are binary analysis tools out there. For example, ..cantor.dust.. is a binary visualization tool (though it's only in beta, the idea remains the same). You can search for different patterns within a memory dump, such as "images" or structures. Search YouTube for cantor.dust: the creator gave a presentation at DerbyCon on how he used it to find EFI structures and recreate an exploit of a PNG parser at the EFI level.
I also think that saving two memory states (visible map vs. limited-visibility map) and searching for the changes is viable, if not the best option; I'm just trying to point out an alternative.

Help with algorithm to dynamically update text display

First, some backstory:
I'm making what may amount to a "roguelike" game so I can exercise some interesting ideas I've got floating around in my head. The gameplay isn't going to be a dungeon crawl, but in any case, the display is going to be done in a similar fashion, with simple ASCII characters.
Being that this is a self exercise, I endeavor to code most of it myself.
Eventually I'd like to have the game run on arbitrarily large game worlds (to the point where I envision having the game networked and spanning many monitors in a computer lab).
Right now, I've got some code that can read and write to arbitrary sections of a text console, and a simple partitioning system set up so that I can path-find efficiently.
And now the question:
I've run some benchmarks, and the biggest bottleneck is the redrawing of the text console.
Having a game world that large will require an intelligent update of the display. I don't want to have to re-push my entire game buffer every frame... I need some pointers on how to set it up so that it only draws sections of the game that have been updated (and not just individual characters, as I've got now).
I've been manipulating the Windows console via windows.h, but I would also be interested in getting it to run on Linux machines over a PuTTY client connected to the server.
I've tried adapting some video-processing routines, as there is nearly a 1:1 ratio between pixel and character, but I had no luck.
Really I want a simple explanation of some of the principles behind it, but some example (pseudo)code would be nice too.
Use curses, or if you need to do it yourself, read about the VTnnn control codes. Both of these should work on Windows and on *nix terminals and consoles. You can also consult the NetHack source code for hints. This will let you change characters on the screen only where changes have happened.
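For the VT-code route, a tiny sketch of the idea; it assumes an ANSI/VT100-compatible terminal (a *nix console or PuTTY; the classic Windows console instead needs the windows.h console API or virtual terminal processing enabled):

#include <cstdio>

// Move the cursor to (row, col), both 1-based, and draw a single character there.
void DrawCell(int row, int col, char glyph)
{
    std::printf("\x1b[%d;%dH%c", row, col, glyph);
}

// Typical use: diff the new frame against the previous one and call DrawCell
// only for cells whose character (or colour) changed, then fflush(stdout).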
I am not going to claim to understand this, but I believe this is close to the issue behind James Gosling's legendary Gosling Emacs redrawing code. See his paper, titled appropriately, "A Redisplay Algorithm", and also the general string-to-string correction problem.
"Having a game world that large will require an intelligent update of the display. I don't want to have to re-push my entire game buffer every frame... I need some pointers on how to set it up so that it only draws sections of the game that have been updated (and not just individual characters as I've got now)."
The size of the game world isn't really relevant, as all you need to do is work out the visible area for each client and send that data. If you have a typical 80x25 console display then you're going to be sending just 2 or 3 kilobytes of data each time, even if you add in colour codes and the like. This is typical of most online games of this nature: update what the person can see, not everything in the world.
If you want to experiment with trying to find a way to cut down what you send, then feel free to do that for learning purposes, but we're about 10 years past the point where it was inefficient to update a console display in something approaching real time, and it would be a shame to waste time fixing a problem that doesn't need fixing. Note that the paper mentioned above gives an O(ND) solution, whereas simply sending the entire console is half of O(N), where N is defined as the sum of the lengths of A and B, and D is the size of the minimum edit script.
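To make that concrete, a small sketch of the "send only the visible area" idea; the Cell layout, the world storage and the 80x25 viewport size are placeholder assumptions:

#include <cstddef>
#include <vector>

struct Cell { char glyph; unsigned char colour; };

std::vector<Cell> ExtractViewport(const std::vector<Cell>& world, std::size_t worldWidth,
                                  std::size_t camX, std::size_t camY,
                                  std::size_t viewW = 80, std::size_t viewH = 25)
{
    std::vector<Cell> view;
    view.reserve(viewW * viewH);
    for (std::size_t y = 0; y < viewH; ++y)
        for (std::size_t x = 0; x < viewW; ++x)
            view.push_back(world[(camY + y) * worldWidth + (camX + x)]);
    return view;   // 80 * 25 * 2 bytes = 4000 bytes per update, before any compression
}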