I'm not trying to stream or anything; I just want to speed up my file-loading code by loading vertex and index data directly into OpenGL's buffer instead of going through an intermediate buffer first. Here's the code that grabs the pointer:
void* VertexArray::beginIndexLoad(GLenum indexFormat, unsigned int indexCount)
{
if (vao == 0)
return NULL;
bindArray();
glGenBuffers(1, &ibo);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, indexSize(indexFormat) * indexCount, NULL, GL_STATIC_DRAW);
iformat = indexFormat;
icount = indexCount;
GLenum err = glGetError();
printf("%i\n", err);
void* ptr = glMapBuffer(GL_ELEMENT_ARRAY_BUFFER, GL_WRITE_ONLY);
err = glGetError();
printf("%i\n", err);
unbindArray();
return ptr;
}
The problem is, this returns NULL. What's worse, just before this I do something similar with GL_ARRAY_BUFFER and get a perfectly valid pointer. Why does this one fail while the other succeeds?
The first glGetError returns 1280 (GL_INVALID_ENUM). The second returns 1285 (GL_OUT_OF_MEMORY). I know it's not actually out of memory, because uploading the exact same data normally via glBufferData works fine.
Maybe I'm just handling vertex arrays wrong?
(P.S. I asked this on the gamedev Stack Exchange and got nothing, so I'm re-posting here to try to figure it out.)
First and foremost, your error-checking code is wrong: glGetError returns errors from a queue, one per call, so you must call it in a loop until it returns GL_NO_ERROR.
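A minimal sketch of such a loop (checkGLErrors is a hypothetical helper, not from your code):
void checkGLErrors(const char* where)
{
    GLenum err;
    // glGetError pops one queued error per call, so drain until empty
    while ((err = glGetError()) != GL_NO_ERROR)
        printf("GL error at %s: 0x%04X\n", where, err);
}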
Regarding the GL_OUT_OF_MEMORY error code: it can also mean out of address space. If a large contiguous area of virtual address space is requested from the OS, but the process's address space is so fragmented that no chunk of that size is available, the allocation fails even if the total amount of free address space would suffice.
This has become the bane of 32-bit systems. A simple remedy is to use a 64-bit system. If you're stuck with a 32-bit platform, you'll have to defragment your address space, which is not trivial.
If I were you I would try the following:
Replace GL_STATIC_DRAW with GL_DYNAMIC_DRAW
Make sure that indexSize(indexFormat) * indexCount produces the size you are expecting
Try using glMapBufferRange instead of glMapBuffer, something along the lines of glMapBufferRange(GL_ELEMENT_ARRAY_BUFFER, 0, yourBufferSize, GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT); (see the sketch after this list)
Check that ibo is of type GLuint
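A minimal sketch combining suggestions 2 and 3, using your indexSize/indexFormat/indexCount names:
GLsizeiptr byteSize = indexSize(indexFormat) * indexCount;
printf("requesting %lld bytes\n", (long long)byteSize); // sanity-check the size
glBufferData(GL_ELEMENT_ARRAY_BUFFER, byteSize, NULL, GL_DYNAMIC_DRAW);
void* ptr = glMapBufferRange(GL_ELEMENT_ARRAY_BUFFER, 0, byteSize,
                             GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);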
EDIT: FWIW, I would get gDEBugger and set a breakpoint to break whenever there is an OpenGL error.
I solved the problem: I was passing in indexSize(header.indexFormat) when I should have been passing in header.indexFormat. I feel like an idiot now; sorry for wasting everyone's time.
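For anyone skimming, the difference (header and indexSize as in my code; va is a placeholder VertexArray instance):
// wrong: passes a byte size where a GLenum is expected
va.beginIndexLoad(indexSize(header.indexFormat), header.indexCount);
// right: pass the format enum itself
va.beginIndexLoad(header.indexFormat, header.indexCount);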
Paul Aner is looking for a canonical answer:
I think the reason for this question is clear: I want the main loop NOT to lock up while a compute shader is processing larger amounts of data. I could try to separate the data into smaller chunks, but if the computations were done on the CPU, I would simply start a thread and everything would run nicely and smoothly. Although I would of course have to wait until the calculation thread delivers new data to update the screen, the GUI (ImGUI) would not lock up...
I have written a program that does some calculations on a compute shader and the returned data is then being displayed. This works perfectly, except that the program execution is blocked while the shader is running (see code below) and depending on the parameters, this can take a while:
void CalculateSomething(GLfloat* Result)
{
// load some uniform variables
glDispatchCompute(X, Y, 1);
glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
GLfloat* mapped = (GLfloat*)(glMapBuffer(GL_SHADER_STORAGE_BUFFER, GL_READ_ONLY));
memcpy(Result, mapped, sizeof(GLfloat) * X * Y);
glUnmapBuffer(GL_SHADER_STORAGE_BUFFER);
}
int main()
{
// Initialization stuff
// ...
while (glfwWindowShouldClose(Window) == 0)
{
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glfwPollEvents();
glfwSwapInterval(2); // Doesn't matter what I put here
CalculateSomething(Result);
Render(Result);
glfwSwapBuffers(Window.WindowHandle);
}
}
To keep the main loop running while the compute shader is calculating, I changed CalculateSomething to something like this:
void CalculateSomething(GLfloat* Result)
{
// load some uniform variables
glDispatchCompute(X, Y, 1);
GPU_sync = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
}
bool GPU_busy()
{
GLint GPU_status;
if (GPU_sync == NULL)
return false;
else
{
glGetSynciv(GPU_sync, GL_SYNC_STATUS, 1, nullptr, &GPU_status);
return GPU_status == GL_UNSIGNALED;
}
}
These two functions are part of a class, and it would get a little messy and complicated if I had to post all of that here (if more code is needed, tell me). So every loop, when the class is told to do the computation, it first checks whether the GPU is busy. If the GPU is done, the result is copied to CPU memory (or a new calculation is started); otherwise it returns to main without doing anything else. Anyway, this approach works in that it produces the right result, but my main loop is still blocked.
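Roughly, the main-loop integration looks like this (a sketch; StartOrCollectCalculation stands in for the class method I can't post in full):
while (glfwWindowShouldClose(Window) == 0)
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glfwPollEvents();
    if (!GPU_busy())
        StartOrCollectCalculation(Result); // map + memcpy a finished result, or dispatch the next job
    Render(Result); // renders the old result while the GPU is still busy
    glfwSwapBuffers(Window.WindowHandle);
}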
Doing some timing revealed that CalculateSomething, Render (and everything else) run fast (as I would expect them to). But now glfwSwapBuffers takes >3000 ms (depending on how long the calculations of the compute shader take).
Shouldn't it be possible to swap buffers while a compute shader is running? Rendering the result seems to work fine and without delay (as long as the compute shader is not done yet, the old result should get rendered). Or am I missing something here (do queued OpenGL calls get processed before glfwSwapBuffers does anything)?
Edit:
I'm not sure why this question got closed or what additional information is needed (other than perhaps the OS, which would be Windows). As for "desired behavior": well, I'd like the glfwSwapBuffers call not to block my main loop. For additional information, please ask...
As pointed out by Erdal Küçük, an implicit glFlush might cause latency. I put this call before glfwSwapBuffers for testing purposes and timed it; no latency there...
I'm sure I can't be the only one who ever ran into this problem. Maybe someone could try to reproduce it? Simply put a compute shader in the main loop that takes a few seconds to do its calculations. I have read somewhere that similar problems occur especially when calling glMapBuffer. This seems to be an issue with the GPU driver (mine would be an integrated Intel GPU). But nowhere have I read about latencies above 200 ms...
I solved a similar issue with a GL_PIXEL_PACK_BUFFER effectively used as an offscreen compute target. The approach with fences is correct, but you then need a separate function that checks the status of the fence by using glGetSynciv to read GL_SYNC_STATUS. The solution (admittedly in Java) can be found here.
An explanation of why this is necessary can be found in @Nick Clark's comment:
Every call in OpenGL is asynchronous, except for the frame buffer swap, which stalls the calling thread until all submitted commands have been executed. Hence glfwSwapBuffers seems to take so long.
The relevant portion from the solution is:
public void finishHMRead( int pboIndex ){
int[] length = new int[1];
int[] status = new int[1];
GLES30.glGetSynciv( hmReadFences[ pboIndex ], GLES30.GL_SYNC_STATUS, 1, length, 0, status, 0 );
int signalStatus = status[0];
int glSignaled = GLES30.GL_SIGNALED;
if( signalStatus == glSignaled ){
// Ready a temporary ByteBuffer for mapping (we'll unmap the pixel buffer and lose this) and a permanent ByteBuffer
ByteBuffer pixelBuffer;
texLayerByteBuffers[ pboIndex ] = ByteBuffer.allocate( texWH * texWH );
// map data to a bytebuffer
GLES30.glBindBuffer( GLES30.GL_PIXEL_PACK_BUFFER, pbos[ pboIndex ] );
pixelBuffer = ( ByteBuffer ) GLES30.glMapBufferRange( GLES30.GL_PIXEL_PACK_BUFFER, 0, texWH * texWH * 1, GLES30.GL_MAP_READ_BIT );
// Copy to the long term ByteBuffer
pixelBuffer.rewind(); //copy from the beginning
texLayerByteBuffers[ pboIndex ].put( pixelBuffer );
// Unmap and unbind the currently bound pixel buffer
GLES30.glUnmapBuffer( GLES30.GL_PIXEL_PACK_BUFFER );
GLES30.glBindBuffer( GLES30.GL_PIXEL_PACK_BUFFER, 0 );
Log.i( "myTag", "Finished copy for pbo data for " + pboIndex + " at: " + (System.currentTimeMillis() - initSphereStart) );
acknowledgeHMReadComplete();
} else {
// If it wasn't done, resubmit for another check in the next render update cycle
RefMethodwArgs finishHmRead = new RefMethodwArgs( this, "finishHMRead", new Object[]{ pboIndex } );
UpdateList.getRef().addRenderUpdate( finishHmRead );
}
}
Basically, fire off the compute shader, then wait for the glGetSynciv check of GL_SYNC_STATUS to return GL_SIGNALED, then rebind the GL_SHADER_STORAGE_BUFFER and perform the glMapBuffer operation.
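In C++ terms, the same idea looks roughly like this (a sketch using the question's buffer target; pollComputeResult is a hypothetical helper):
bool pollComputeResult(GLsync fence, GLfloat* result, size_t count)
{
    GLint status = GL_UNSIGNALED;
    glGetSynciv(fence, GL_SYNC_STATUS, 1, nullptr, &status);
    if (status != GL_SIGNALED)
        return false; // still running: poll again next frame instead of blocking
    GLfloat* mapped = (GLfloat*)glMapBuffer(GL_SHADER_STORAGE_BUFFER, GL_READ_ONLY);
    memcpy(result, mapped, sizeof(GLfloat) * count);
    glUnmapBuffer(GL_SHADER_STORAGE_BUFFER);
    glDeleteSync(fence);
    return true;
}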
I'm completely new to OpenGL, so I assume I'm probably doing something stupid here. Basically I'm just trying to memory map a buffer object I've created, but glMapBuffer() is returning NULL and giving an error code of GL_INVALID_ENUM. Here's the relevant code that's failing:
glGenVertexArrays(1, &m_vao);
glBindVertexArray(m_vao);
glGenBuffers(1, &m_vertex_buffer);
glNamedBufferStorage(m_vertex_buffer,
BUFFER_SIZE,
NULL,
GL_MAP_READ_BIT | GL_MAP_WRITE_BIT | GL_DYNAMIC_STORAGE_BIT);
glBindBuffer(GL_ARRAY_BUFFER, m_vertex_buffer);
void* vertex_buffer = glMapBuffer(GL_ARRAY_BUFFER, GL_READ_ONLY);
if (!vertex_buffer)
{
GLenum error = glGetError();
fprintf(stderr, "Buffer map failed! %d (%s)\n", error, gluErrorString(error));
return;
}
glUnmapBuffer(GL_ARRAY_BUFFER);
glBindBuffer(GL_ARRAY_BUFFER, 0);
This is printing:
Buffer map failed! 1280 (invalid enumerant)
According to the docs, that error gets returned if the target is not one of the available target enums. That said, GL_ARRAY_BUFFER is definitely listed as available.
Am I just doing something wrong here?
In case it helps anyone else, I had multiple issues:
glGetError() returns the first value from a queue of errors. The GL_INVALID_ENUM I thought I was getting was actually from a previous (unrelated) call.
Per this thread, glGenBuffers() only allocates a new buffer name; the buffer object itself is not created until the name is first bound to a context with glBindBuffer(). I was instead calling glNamedBufferStorage() immediately, which resulted in a GL_INVALID_OPERATION since the buffer didn't actually exist yet. So basically, I should always use glCreate*() instead of glGen*() when using DSA (see the sketch after this list).
It turns out glNamedBufferStorage() creates immutable storage (the size and flags are fixed once set, though the contents can still be written if GL_DYNAMIC_STORAGE_BIT is specified), while glNamedBufferData() creates mutable storage that can be reallocated; the documentation isn't very explicit about this. In any case, I'm using glNamedBufferData() now.
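A sketch of the gen/create difference from point 2 (BUFFER_SIZE as above; the storage flags are just illustrative):
GLuint buf;
glGenBuffers(1, &buf); // reserves a name only; no buffer object exists yet
glNamedBufferStorage(buf, BUFFER_SIZE, NULL, GL_DYNAMIC_STORAGE_BIT); // GL_INVALID_OPERATION
glCreateBuffers(1, &buf); // creates the buffer object immediately
glNamedBufferStorage(buf, BUFFER_SIZE, NULL, GL_DYNAMIC_STORAGE_BIT); // OK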
I'm now able to successfully map and write to my buffers after the following setup:
glCreateBuffers(1, &m_vertex_buffer);
glNamedBufferData(m_vertex_buffer, BUFFER_SIZE, NULL, GL_DYNAMIC_DRAW);
void* vertex_buffer_ptr = glMapNamedBuffer(m_vertex_buffer, GL_WRITE_ONLY);
For the past couple of hours I've been trying to track down a bug in my program, which only occurs when running it in release mode. I've already resolved all level-4 compiler-warnings, and there are no uninitialized variables anywhere (Which would usually be my first suspect in a case like this).
This is a tough one to explain, since I don't even know exactly what's going on, so bear with me, please.
After a lot of debugging, I've narrowed the cause of the bug down to somewhere in the following function:
void CModelSubMesh::Update()
{
ModelSubMesh::Update();
auto bHasAlphas = (GetAlphaCount() > 0) ? true : false;
auto bAnimated = (!m_vertexWeights.empty() || !m_weightBoneIDs.empty()) ? true : false;
if(bHasAlphas == false && bAnimated == false)
m_glMeshData = std::make_unique<GLMeshData>(m_vertices,m_normals,m_uvs,m_triangles);
else
{
m_glmesh = GLMesh();
auto bufVertex = OpenGL::GenerateBuffer();
auto bufUV = OpenGL::GenerateBuffer();
auto bufNormal = OpenGL::GenerateBuffer();
auto bufIndices = OpenGL::GenerateBuffer();
auto bufAlphas = 0;
if(bHasAlphas == true)
bufAlphas = OpenGL::GenerateBuffer();
auto vao = OpenGL::GenerateVertexArray();
m_glmesh.SetVertexArrayObject(vao);
m_glmesh.SetVertexBuffer(bufVertex);
m_glmesh.SetUVBuffer(bufUV);
m_glmesh.SetNormalBuffer(bufNormal);
if(bHasAlphas == true)
m_glmesh.SetAlphaBuffer(bufAlphas);
m_glmesh.SetIndexBuffer(bufIndices);
m_glmesh.SetVertexCount(CUInt32(m_vertices.size()));
auto numTriangles = CUInt32(m_triangles.size()); // CUInt32 is equivalent to static_cast<unsigned int>
m_glmesh.SetTriangleCount(numTriangles);
// PLACEHOLDER LINE
OpenGL::BindVertexArray(vao);
OpenGL::BindBuffer(bufVertex,GL_ARRAY_BUFFER);
OpenGL::BindBufferData(CInt32(m_vertices.size()) *sizeof(glm::vec3),&m_vertices[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);
OpenGL::EnableVertexAttribArray(SHADER_VERTEX_BUFFER_LOCATION);
OpenGL::SetVertexAttribData(
SHADER_VERTEX_BUFFER_LOCATION,
3,
GL_FLOAT,
GL_FALSE,
(void*)0
);
OpenGL::BindBuffer(bufUV,GL_ARRAY_BUFFER);
OpenGL::BindBufferData(CInt32(m_uvs.size()) *sizeof(glm::vec2),&m_uvs[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);
OpenGL::EnableVertexAttribArray(SHADER_UV_BUFFER_LOCATION);
OpenGL::SetVertexAttribData(
SHADER_UV_BUFFER_LOCATION,
2,
GL_FLOAT,
GL_FALSE,
(void*)0
);
OpenGL::BindBuffer(bufNormal,GL_ARRAY_BUFFER);
OpenGL::BindBufferData(CInt32(m_normals.size()) *sizeof(glm::vec3),&m_normals[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);
OpenGL::EnableVertexAttribArray(SHADER_NORMAL_BUFFER_LOCATION);
OpenGL::SetVertexAttribData(
SHADER_NORMAL_BUFFER_LOCATION,
3,
GL_FLOAT,
GL_FALSE,
(void*)0
);
if(!m_vertexWeights.empty())
{
m_bufVertWeights.bufWeights = OpenGL::GenerateBuffer();
OpenGL::BindBuffer(m_bufVertWeights.bufWeights,GL_ARRAY_BUFFER);
OpenGL::BindBufferData(CInt32(m_vertexWeights.size()) *sizeof(float),&m_vertexWeights[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);
OpenGL::EnableVertexAttribArray(SHADER_BONE_WEIGHT_LOCATION);
OpenGL::BindBuffer(m_bufVertWeights.bufWeights,GL_ARRAY_BUFFER);
OpenGL::SetVertexAttribData(
SHADER_BONE_WEIGHT_LOCATION,
4,
GL_FLOAT,
GL_FALSE,
(void*)0
);
}
if(!m_weightBoneIDs.empty())
{
m_bufVertWeights.bufBoneIDs = OpenGL::GenerateBuffer();
OpenGL::BindBuffer(m_bufVertWeights.bufBoneIDs,GL_ARRAY_BUFFER);
OpenGL::BindBufferData(CInt32(m_weightBoneIDs.size()) *sizeof(int),&m_weightBoneIDs[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);
OpenGL::EnableVertexAttribArray(SHADER_BONE_WEIGHT_ID_LOCATION);
OpenGL::BindBuffer(m_bufVertWeights.bufBoneIDs,GL_ARRAY_BUFFER);
glVertexAttribIPointer(
SHADER_BONE_WEIGHT_ID_LOCATION,
4,
GL_INT,
0,
(void*)0
);
}
if(bHasAlphas == true)
{
OpenGL::BindBuffer(bufAlphas,GL_ARRAY_BUFFER);
OpenGL::BindBufferData(CInt32(m_alphas.size()) *sizeof(glm::vec2),&m_alphas[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);
OpenGL::EnableVertexAttribArray(SHADER_USER_BUFFER1_LOCATION);
OpenGL::SetVertexAttribData(
SHADER_USER_BUFFER1_LOCATION,
2,
GL_FLOAT,
GL_FALSE,
(void*)0
);
}
OpenGL::BindBuffer(bufIndices,GL_ELEMENT_ARRAY_BUFFER);
OpenGL::BindBufferData(numTriangles *sizeof(unsigned int),&m_triangles[0],GL_STATIC_DRAW,GL_ELEMENT_ARRAY_BUFFER);
OpenGL::BindVertexArray(0);
OpenGL::BindBuffer(0,GL_ARRAY_BUFFER);
OpenGL::BindBuffer(0,GL_ELEMENT_ARRAY_BUFFER);
}
ComputeTangentBasis(m_vertices,m_uvs,m_normals,m_triangles);
}
My program is a graphics application, and this piece of code generates the object buffers which are required for rendering later on. The bug basically causes the vertices of a specific mesh to be rendered incorrectly when certain conditions are met. The bug is consistent and happens every time for the same mesh.
Sadly I can't narrow the code down any further, since that would make the bug disappear, and explaining what each line does would take quite a while and isn't too relevant here. I'm almost positive that this is a problem with compiler optimization, so the actual bug is more of a side-effect in this case anyway.
With the code above, the bug will occur, but only when in release mode. The interesting part is the line I marked as "PLACEHOLDER LINE".
If I change the code to one of the following 3 variants, the bug will disappear:
#1:
void CModelSubMesh::Update()
{
[...]
// PLACEHOLDER LINE
std::cout<<numTriangles<<std::endl;
[...]
}
#2:
#pragma optimize( "", off )
void CModelSubMesh::Update()
{
[...] // No changes to the code
}
#pragma optimize( "", on )
#3:
static void test()
{
auto *f = new float; // Do something to make sure the compiler doesn't optimize this function away; Doesn't matter what
delete f;
}
void CModelSubMesh::Update()
{
[...]
// PLACEHOLDER LINE
test();
[...]
}
Especially variant #2 indicates that something is being optimized which shouldn't be.
I don't expect anyone to magically know what the root of the problem is, since that would require deeper knowledge of the code. However, maybe someone with a better understanding of the compiler optimization process can give me some hints, what could be going on here?
Since almost any change to the code gets rid of the bug, I'm just not sure what I can do to actually find the cause of it.
Most often, when I've hit something that works in debug but not in release, it's been an uninitialized variable. Debug runtimes typically fill fresh memory with a known pattern (MSVC, for example, fills uninitialized stack memory with 0xCC), but you lose that once optimizations are turned on; see the sketch at the end of this answer.
This could explain why modifying the program alters the behavior: by adjusting the memory map of your application, you end up getting some random different chunk of uninitialized memory that happens to mask the issue.
If you're keeping up good memory-management hygiene, you might find the issue quickly with a tool like Valgrind. Long term, you may want to look into a memory-management framework that detects memory abuse automagically (see Ogre MemoryTracker, TCMalloc, Clang's MemorySanitizer).
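A contrived sketch of that failure mode (all names hypothetical):
#include <cstdio>
float scaleFor(bool hasScale)
{
    float scale;       // uninitialized; a debug fill pattern may hide the bug
    if (hasScale)
        scale = 2.0f;
    return scale;      // undefined behavior when hasScale is false
}
int main()
{
    printf("%f\n", scaleFor(false)); // may "work" in debug, print garbage in release
}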
Using a Radeon HD 3870, I've run into strange behavior with AMD's drivers (updated yesterday to the newest version available).
First of all, I'd like to note that the whole code ran without problems on an NVIDIA GeForce 540M, and that glGetUniformLocation isn't failing all the time.
My problem is that glGetUniformLocation returns strange values for one of the shader programs used in my app, while the other shader program doesn't have this flaw. I switch between these shaders from frame to frame, so I'm sure the problem isn't temporary and is tied to that one shader. By strange values I mean something like 17-26, while I have only 9 uniforms present. My shader interface takes the value just obtained and queries for the type of the GLSL variable, and as a side effect also queries its name. For all of those 17-26 locations, the returned name was unset, and the same for the type. I then had the idea of debugging into the interface (a separate library) and changing those values to what I'd expect: 0-8. Using the debugger I changed them, and indeed the proper variable names came back for that shader, and the types were correct too.
My question is: how can code that always works on NVIDIA, and works with the other shader on Radeon, fail for another shader that is treated exactly the same way?
I include related part of interface for this:
//this fails to return correct value
m_location = glGetUniformLocation(m_program.getGlID(), m_name.c_str());
printGLError();
if(m_location == -1){
std::cerr << "ERROR: Uniform " << m_name << " doesn't exist in program" << std::endl;
return FAILURE;
}
GLsizei charSize = m_name.size()+1, size = 0, length = 0;
GLenum type = 0;
GLchar* name = new GLchar[charSize];
name[charSize-1] = '\0';
glGetActiveUniform(m_program.getGlID(), m_location, charSize, &length, &size, &type, name); // note: m_location is used as an index here, which is the bug (see the answer below)
delete[] name; name = 0;
if(!TypeResolver::resolve(type, m_type))
return FAILURE;
m_prepared = true;
m_applied = false;
The index you pass to glGetActiveUniform is not supposed to be a uniform location. Uniform locations are only used with glUniform calls; nothing else.
The index you pass to glGetActiveUniform is just an index between 0 and the value returned by glGetProgramiv with GL_ACTIVE_UNIFORMS. It is used to ask which uniforms exist and to inspect the properties of those uniforms.
Your code works on NVIDIA only because you got lucky. The OpenGL specification doesn't guarantee that the order of uniform locations is the same as the order of active uniform indices. AMD's drivers don't work that way, so your code doesn't work.
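A sketch of the correct enumeration: iterate the active-uniform indices, then look up each location by name (all standard GL calls):
GLint count = 0;
glGetProgramiv(program, GL_ACTIVE_UNIFORMS, &count);
for (GLint i = 0; i < count; ++i)
{
    GLchar name[256];
    GLsizei length = 0;
    GLint size = 0;
    GLenum type = 0;
    glGetActiveUniform(program, i, sizeof(name), &length, &size, &type, name);
    // the location is NOT i; it must be queried separately, by name
    GLint location = glGetUniformLocation(program, name);
    printf("uniform #%d '%s' -> location %d\n", i, name, location);
}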
I am running through a file and dealing with 30 or so different fragment types. Every time, I read in a fragment and compare its type (in hex) with those of the fragments I know. Is this fast, or is there a quicker way to do this?
Here is a sample of the code I am using:
// Iterate through the fragments and address them individually
for(int i = 0; i < header.fragmentCount; i++)
{
// Read in memory for the current fragment
memcpy(&frag, (wld + file_pos), sizeof(struct_wld_basic_frag));
// Deal with each frag type
switch(frag.id)
{
// Texture Bitmap Name(s)
case 0x03:
errorLog.OutputSuccess("[%i] 0x03 - Texture Bitmap Name", i);
break;
// Texture Bitmap Info
case 0x04:
errorLog.OutputSuccess("[%i] 0x04 - Texture Bitmap Info", i);
break;
// Texture Bitmap Reference Info
case 0x05:
errorLog.OutputSuccess("[%i] 0x05 - Texture Bitmap Reference Info", i);
break;
// Two-dimensional Object
case 0x06:
errorLog.OutputSuccess("[%i] 0x06 - Two-dimensioanl object", i);
break;
It runs through about 30 of these and when there are thousands of fragments, it can chug a bit. How would one recommend I speed this process up?
Thank you!
If all of these cases are the same except for the format string, consider having an array of format strings and no switch at all, as in:
const char *fmtStrings[] = {
NULL, NULL, NULL,
"[%i] 0x03 - Texture Bitmap Name",
"[%i] 0x04 - Texture Bitmap Info",
/* ... */
};
// ...
errorLog.OutputSuccess(fmtStrings[frag.id], i);
// (range checks elided)
This should be less expensive than a switch, as it won't involve a branch misprediction penalty. That said, the cost of this switch is probably less than the cost of actually formatting the output string, so your optimization efforts may be a bit misplaced.
The switch statement should be very fast: when your code is optimized (and even sometimes when it isn't), it is implemented as a jump table. Go into the debugger, put a breakpoint on the switch, and check the disassembly to make sure that's the case.
I think the memcpy is probably causing a lot of the overhead. Maybe use your switch statement on a direct access to your data at (wld + file_pos), as sketched below.
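Something along these lines (a sketch assuming wld is a byte pointer and the struct layout matches the file data):
const struct_wld_basic_frag* fragPtr =
    (const struct_wld_basic_frag*)(wld + file_pos);
switch (fragPtr->id) // no per-fragment memcpy
{
    // ... cases as before ...
}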
I'm skeptical that the 30 case statements are the issue. That's just not very much code compared to whatever your memcpy and errorLog methods are doing. First verify that your speed is limited by CPU time and not by disk access. If you really are CPU bound, examine the code in a profiler.
If your fragment identifiers aren't too sparse, you can create an array of fragment type names and use it as a lookup table.
static const char *FRAGMENT_NAMES[] = {
0,
0,
0,
"Texture Bitmap Name", // 0x03
"Texture Bitmap Info", // 0x04
// etc.
};
...
const char *name = FRAGMENT_NAMES[frag.id];
if (name) {
errorLog.OutputSuccess("[%i] %x - %s", i, frag.id, name);
} else {
// unknown name
}
If your log statements are always strings of the form "[%i] 0xdd - message..." and frag.id is always an integer between 0 and 30, you could instead declare an array of strings:
std::string messagesArray[] = {"[%i] 0x00 - message one", "[%i] 0x01 - message two", ...};
Then replace the switch statement with
errorLog.OutputSuccess(messagesArray[frag.id], i);
If the possible fragment type values are all contiguous, and you don't want to do anything much more complex than printing a string upon matching, you can just index into an array, e.g.:
const char* typeNames[] = {"Texture Bitmap Name", "Texture Bitmap Info", ...};
/* for each frag.id: */
if (LOWER_LIMIT <= frag.id && frag.id < UPPER_LIMIT) {
printf("[%i] %#02x - %s\n", i, frag.id, typeNames[frag.id-LOWER_LIMIT]);
} else {
/* complain about error */
}
It's impossible to say for sure without seeing more, but it appears that you can avoid the memcpy, and instead use a pointer to walk through the data.
struct_wld_basic_frag *frag = (struct_wld_basic_frag *)(wld + file_pos);
for (i=0; i<header.fragmentCount; i++)
    errorLog.OutputSuccess(fragment_strings[frag[i].id], i);
For the moment, I've assumed an array of strings for the different fragment types, as recommended by @Chris and @Ates. Even at worst, that will improve readability and maintainability without hurting speed. At best, it might (for example) improve cache usage and give a major speed improvement: one copy of the code that calls errorLog.OutputSuccess, instead of 30 separate copies, can make room for a lot of other "stuff" in the cache.
Avoiding copying the data every time is much more likely to do real good, though. At the same time, I should add that this can cause a problem of its own: if the data isn't correctly aligned in the original buffer, attempting to read it through the pointer won't work.