I have joined a project to simplify legacy graphics code, and would be grateful for advice on this data conversion problem.
The input is compressed textures in the DXT1, DXT3, and DXT5 formats. The data is in main memory, not graphics card memory. The input does not have the standard DDS_HEADER, only the compressed pixel data. The desired output is QImages.
Using existing metadata, we can construct a DDS_HEADER, write the texture to a temporary file, and then load the QImage from that file. However, we want to avoid this solution and work with the original data directly, as there are many, many instances of it.
My research has uncovered no Qt functions to perform this conversion directly. So far, the most promising-sounding approach is to use our existing OpenGL context to draw the texture to a QOpenGLFrameBufferObject, which has a toImage() member function. However, I don't understand how to construct a usable texture object out of the raw data and draw that to the frame buffer.
Edit: A clarification, based on Scheff's excellent answer. I am aware that the textures can be manually decompressed and a QImage loaded from the result. I would prefer to avoid this step and use library functions if possible, for greatest simplicity. QOpenGLTexture has a member function setCompressedData that might be used.
Thanks for any suggestions.
Reading this question, I became curious and learned about S3 Texture Compression. Funnily enough, although I had heard about compressed textures in the past, I always assumed it would be something complicated like the LZW algorithm or JPEG compression, and never dug deeper. But today I realized I was totally wrong.
S3 Texture Compression is actually much simpler, though it can achieve quite impressive compression ratios.
Nice introductions can easily be found by googling. The question already mentions MSDN. Additionally, I found some other sites which gave me a quite good introduction to this topic in very little time:
khronos.org: S3 Texture Compression (which I consider the authoritative source)
FSDeveloper.com: DXT compression explained
Joost's Dev Blog: Texture formats for 2D games, part 3: DXT and PVRTC
MSDN: Programming Guide for DDS
Brandon Jones webgl-texture-utils on GitHub
nv_dds on GitHub: DDS image loader for OpenGL / OpenGL ES2
Concerning the GitHub projects, it seems that somebody has already done the work. I skimmed the code a little but, in the end, I'm not sure whether they support all possible features. However, I "borrowed" the test images from Brandon Jones' site, so it's only fair to mention it.
So, this is my actual answer: an alternative approach could be to decode the texture into a QImage entirely on the CPU side.
As a proof of concept, I leave here the result of the code fiddling I did this morning – my attempt to transform the linked descriptions into working C++ code – DXT1-QImage.cc:
#include <cstdint>
#include <cstring> // strncmp
#include <fstream>
#include <QtWidgets>
#ifndef _WIN32
typedef quint32 DWORD;
#endif // _WIN32
/* borrowed from:
* https://msdn.microsoft.com/en-us/library/windows/desktop/bb943984(v=vs.85).aspx
*/
struct DDS_PIXELFORMAT {
    DWORD dwSize;
    DWORD dwFlags;
    DWORD dwFourCC;
    DWORD dwRGBBitCount;
    DWORD dwRBitMask;
    DWORD dwGBitMask;
    DWORD dwBBitMask;
    DWORD dwABitMask;
};
/* borrowed from:
* https://msdn.microsoft.com/en-us/library/windows/desktop/bb943982(v=vs.85).aspx
*/
struct DDS_HEADER {
    DWORD dwSize;
    DWORD dwFlags;
    DWORD dwHeight;
    DWORD dwWidth;
    DWORD dwPitchOrLinearSize;
    DWORD dwDepth;
    DWORD dwMipMapCount;
    DWORD dwReserved1[11];
    DDS_PIXELFORMAT ddspf;
    DWORD dwCaps;
    DWORD dwCaps2;
    DWORD dwCaps3;
    DWORD dwCaps4;
    DWORD dwReserved2;
};
inline quint32 stretch(std::uint16_t color)
{
    return 0xff000000u
        | (quint32)(color & 0x001f) << 3 // >> 0 << 3 << 0
        | (quint32)(color & 0x07e0) << 5 // >> 5 << 2 << 8
        | (quint32)(color & 0xf800) << 8; // >> 11 << 3 << 16
}
void makeLUT(
    quint32 lut[4], std::uint16_t color0, std::uint16_t color1)
{
    const quint32 argb0 = stretch(color0);
    const quint32 argb1 = stretch(color1);
    lut[0] = argb0;
    lut[1] = argb1;
    if (color0 > color1) {
        lut[2] = 0xff000000u
            | ((((argb0 & 0xff0000) >> 15) + ((argb1 & 0xff0000) >> 16)) / 3) << 16
            | ((((argb0 & 0x00ff00) >> 7) + ((argb1 & 0x00ff00) >> 8)) / 3) << 8
            | ((((argb0 & 0x0000ff) << 1) + ((argb1 & 0x0000ff) >> 0)) / 3) << 0;
        lut[3] = 0xff000000u
            | ((((argb1 & 0xff0000) >> 15) + ((argb0 & 0xff0000) >> 16)) / 3) << 16
            | ((((argb1 & 0x00ff00) >> 7) + ((argb0 & 0x00ff00) >> 8)) / 3) << 8
            | ((((argb1 & 0x0000ff) << 1) + ((argb0 & 0x0000ff) >> 0)) / 3) << 0;
    } else {
        lut[2] = 0xff000000u
            | ((((argb0 & 0xff0000) >> 16) + ((argb1 & 0xff0000) >> 16)) / 2) << 16
            | ((((argb0 & 0x00ff00) >> 8) + ((argb1 & 0x00ff00) >> 8)) / 2) << 8
            | ((((argb0 & 0x0000ff) >> 0) + ((argb1 & 0x0000ff) >> 0)) / 2) << 0;
        lut[3] = 0xff000000u;
    }
}
const std::uint8_t* uncompress(
    const std::uint8_t *data, QImage &qImg, int x, int y)
{
    // get color 0 and color 1
    std::uint16_t color0 = data[0] | data[1] << 8;
    std::uint16_t color1 = data[2] | data[3] << 8;
    data += 4;
    quint32 lut[4]; makeLUT(lut, color0, color1);
    // decode 4 x 4 pixels
    for (int i = 0; i < 4; ++i) {
        qImg.setPixel(x + 0, y + i, lut[data[i] >> 0 & 3]);
        qImg.setPixel(x + 1, y + i, lut[data[i] >> 2 & 3]);
        qImg.setPixel(x + 2, y + i, lut[data[i] >> 4 & 3]);
        qImg.setPixel(x + 3, y + i, lut[data[i] >> 6 & 3]);
    }
    data += 4;
    // done
    return data;
}
QImage loadDXT1(const char *file)
{
    std::ifstream fIn(file, std::ios::in | std::ios::binary);
    // read magic code
    enum { sizeMagic = 4 }; char magic[sizeMagic];
    if (!fIn.read(magic, sizeMagic)) {
        return QImage(); // ERROR: read failed
    }
    if (strncmp(magic, "DDS ", sizeMagic) != 0) {
        return QImage(); // ERROR: wrong format (wrong magic code)
    }
    // read header
    DDS_HEADER header;
    if (!fIn.read((char*)&header, sizeof header)) {
        return QImage(); // ERROR: read failed
    }
    qDebug() << "header size:" << sizeof header;
    // get raw data (size computation unclear)
    const unsigned w = (header.dwWidth + 3) / 4;
    const unsigned h = (header.dwHeight + 3) / 4;
    std::vector<std::uint8_t> data(w * h * 8);
    qDebug() << "data size:" << data.size();
    if (!fIn.read((char*)data.data(), data.size())) {
        return QImage(); // ERROR: read failed
    }
    // decode raw data
    QImage qImg(header.dwWidth, header.dwHeight, QImage::Format_ARGB32);
    const std::uint8_t *pData = data.data();
    for (int y = 0; y < (int)header.dwHeight; y += 4) {
        for (int x = 0; x < (int)header.dwWidth; x += 4) {
            pData = uncompress(pData, qImg, x, y);
        }
    }
    qDebug() << "processed image size:" << fIn.tellg();
    // done
    return qImg;
}
int main(int argc, char **argv)
{
    qDebug() << "Qt Version:" << QT_VERSION_STR;
    QApplication app(argc, argv);
    // build QImage
    QImage qImg = loadDXT1("test-dxt1.dds");
    // setup GUI
    QMainWindow qWin;
    QLabel qLblImg;
    qLblImg.setPixmap(QPixmap::fromImage(qImg));
    qWin.setCentralWidget(&qLblImg);
    qWin.show();
    // exec. application
    return app.exec();
}
I did the development and debugging on VS2013. To check whether it is portable to Linux, the best I could do was to compile and test on Cygwin as well.
For this, I wrote a QMake file DXT1-QImage.pro:
SOURCES = DXT1-QImage.cc
QT += widgets
To compile and run it in bash:
$ qmake-qt5 DXT1-QImage.pro
$ make
g++ -c -fno-keep-inline-dllexport -D_GNU_SOURCE -pipe -O2 -Wall -W -D_REENTRANT -DQT_NO_DEBUG -DQT_WIDGETS_LIB -DQT_GUI_LIB -DQT_CORE_LIB -I. -isystem /usr/include/qt5 -isystem /usr/include/qt5/QtWidgets -isystem /usr/include/qt5/QtGui -isystem /usr/include/qt5/QtCore -I. -I/usr/lib/qt5/mkspecs/cygwin-g++ -o DXT1-QImage.o DXT1-QImage.cc
g++ -o DXT1-QImage.exe DXT1-QImage.o -lQt5Widgets -lQt5Gui -lQt5Core -lGL -lpthread
$ ./DXT1-QImage
Qt Version: 5.9.2
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-ds32737'
header size: 124
data size: 131072
processed image size: 131200
QXcbShmImage: shmget() failed (88: Function not implemented) for size 1048576 (512x512)
For the test, I used the sample file test-dxt1.dds.
And this is what came out:
For comparison, the original image:
Notes:
I implemented a file loader although the questioner explicitly mentioned that he wants to convert raw image data which is already in memory. I had to do this as I didn't see any other way to get (valid) DXT1 raw data into memory on my side (to verify afterwards whether it works or not).
My debug output shows that my loader reads 131200 bytes (i.e. 4 bytes magic code, 124 bytes header, and 131072 bytes compressed image data).
In contrast to this, the file test-dxt1.dds contains 174904 bytes. So, there is additional data in the file, but I do not (yet) know what it is for.
After getting feedback that my first answer didn't match the questioner's expectations, I modified my sources to draw the DXT1 raw data into an OpenGL texture.
So, this answer addresses specifically this part of the question:
However, I don't understand how to construct a usable texture object out of the raw data and draw that to the frame buffer.
The modifications are strongly "inspired" by the Qt docs Cube OpenGL ES 2.0 example.
The essential part is how the QOpenGLTexture is constructed out of the DXT1 raw data:
_pQGLTex = new QOpenGLTexture(QOpenGLTexture::Target2D);
_pQGLTex->setFormat(QOpenGLTexture::RGB_DXT1);
_pQGLTex->setSize(_img.w, _img.h);
_pQGLTex->allocateStorage(QOpenGLTexture::RGBA, QOpenGLTexture::UInt8);
_pQGLTex->setCompressedData((int)_img.data.size(), _img.data.data());
_pQGLTex->setMinificationFilter(QOpenGLTexture::Nearest);
_pQGLTex->setMagnificationFilter(QOpenGLTexture::Linear);
_pQGLTex->setWrapMode(QOpenGLTexture::ClampToEdge);
And, this is the complete sample code DXT1-QTexture-QImage.cc:
#include <cstdint>
#include <cstring> // strncmp
#include <fstream>
#include <QtWidgets>
#include <QOpenGLFunctions_4_0_Core>
#ifndef _WIN32
typedef quint32 DWORD;
#endif // _WIN32
/* borrowed from:
* https://msdn.microsoft.com/en-us/library/windows/desktop/bb943984(v=vs.85).aspx
*/
struct DDS_PIXELFORMAT {
DWORD dwSize;
DWORD dwFlags;
DWORD dwFourCC;
DWORD dwRGBBitCount;
DWORD dwRBitMask;
DWORD dwGBitMask;
DWORD dwBBitMask;
DWORD dwABitMask;
};
/* borrowed from:
* https://msdn.microsoft.com/en-us/library/windows/desktop/bb943982(v=vs.85).aspx
*/
struct DDS_HEADER {
DWORD dwSize;
DWORD dwFlags;
DWORD dwHeight;
DWORD dwWidth;
DWORD dwPitchOrLinearSize;
DWORD dwDepth;
DWORD dwMipMapCount;
DWORD dwReserved1[11];
DDS_PIXELFORMAT ddspf;
DWORD dwCaps;
DWORD dwCaps2;
DWORD dwCaps3;
DWORD dwCaps4;
DWORD dwReserved2;
};
struct Image {
int w, h;
std::vector<std::uint8_t> data;
explicit Image(int w = 0, int h = 0):
w(w), h(h), data(((w + 3) / 4) * ((h + 3) / 4) * 8)
{ }
~Image() = default;
Image(const Image&) = delete;
Image& operator=(const Image&) = delete;
Image(Image &&img): w(img.w), h(img.h), data(std::move(img.data)) { }
};
Image loadDXT1(const char *file)
{
std::ifstream fIn(file, std::ios::in | std::ios::binary);
// read magic code
enum { sizeMagic = 4 }; char magic[sizeMagic];
if (!fIn.read(magic, sizeMagic)) {
return Image(); // ERROR: read failed
}
if (strncmp(magic, "DDS ", sizeMagic) != 0) {
return Image(); // ERROR: wrong format (wrong magic code)
}
// read header
DDS_HEADER header;
if (!fIn.read((char*)&header, sizeof header)) {
return Image(); // ERROR: read failed
}
qDebug() << "header size:" << sizeof header;
// get raw data (size computation unclear)
Image img(header.dwWidth, header.dwHeight);
qDebug() << "data size:" << img.data.size();
if (!fIn.read((char*)img.data.data(), img.data.size())) {
return Image(); // ERROR: read failed
}
qDebug() << "processed image size:" << fIn.tellg();
// done
return img;
}
const char *vertexShader =
"#ifdef GL_ES\n"
"// Set default precision to medium\n"
"precision mediump int;\n"
"precision mediump float;\n"
"#endif\n"
"\n"
"uniform mat4 mvp_matrix;\n"
"\n"
"attribute vec4 a_position;\n"
"attribute vec2 a_texcoord;\n"
"\n"
"varying vec2 v_texcoord;\n"
"\n"
"void main()\n"
"{\n"
" // Calculate vertex position in screen space\n"
" gl_Position = mvp_matrix * a_position;\n"
"\n"
" // Pass texture coordinate to fragment shader\n"
" // Value will be automatically interpolated to fragments inside polygon faces\n"
" v_texcoord = a_texcoord;\n"
"}\n";
const char *fragmentShader =
"#ifdef GL_ES\n"
"// Set default precision to medium\n"
"precision mediump int;\n"
"precision mediump float;\n"
"#endif\n"
"\n"
"uniform sampler2D texture;\n"
"\n"
"varying vec2 v_texcoord;\n"
"\n"
"void main()\n"
"{\n"
" // Set fragment color from texture\n"
"#if 0 // test check tex coords\n"
" gl_FragColor = vec4(1, v_texcoord.x, v_texcoord.y, 1);\n"
"#else // (not) 0;\n"
" gl_FragColor = texture2D(texture, v_texcoord);\n"
"#endif // 0\n"
"}\n";
struct Vertex {
QVector3D coord;
QVector2D texCoord;
Vertex(qreal x, qreal y, qreal z, qreal s, qreal t):
coord(x, y, z), texCoord(s, t)
{ }
};
class OpenGLWidget: public QOpenGLWidget, public QOpenGLFunctions_4_0_Core {
private:
const Image &_img;
QOpenGLShaderProgram _qGLSProg;
QOpenGLBuffer _qGLBufArray;
QOpenGLBuffer _qGLBufIndex;
QOpenGLTexture *_pQGLTex;
public:
explicit OpenGLWidget(const Image &img):
QOpenGLWidget(nullptr),
_img(img),
_qGLBufArray(QOpenGLBuffer::VertexBuffer),
_qGLBufIndex(QOpenGLBuffer::IndexBuffer),
_pQGLTex(nullptr)
{ }
virtual ~OpenGLWidget()
{
makeCurrent();
delete _pQGLTex;
_qGLBufArray.destroy();
_qGLBufIndex.destroy();
doneCurrent();
}
// disabled: (to prevent accidental usage)
OpenGLWidget(const OpenGLWidget&) = delete;
OpenGLWidget& operator=(const OpenGLWidget&) = delete;
protected:
virtual void initializeGL() override
{
initializeOpenGLFunctions();
glClearColor(0, 0, 0, 1);
initShaders();
initGeometry();
initTexture();
}
virtual void paintGL() override
{
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
_pQGLTex->bind();
QMatrix4x4 mat; mat.setToIdentity();
_qGLSProg.setUniformValue("mvp_matrix", mat);
_qGLSProg.setUniformValue("texture", 0);
// draw geometry
_qGLBufArray.bind();
_qGLBufIndex.bind();
quintptr offset = 0;
int coordLocation = _qGLSProg.attributeLocation("a_position");
_qGLSProg.enableAttributeArray(coordLocation);
_qGLSProg.setAttributeBuffer(coordLocation, GL_FLOAT, offset, 3, sizeof(Vertex));
offset += sizeof(QVector3D);
int texCoordLocation = _qGLSProg.attributeLocation("a_texcoord");
_qGLSProg.enableAttributeArray(texCoordLocation);
_qGLSProg.setAttributeBuffer(texCoordLocation, GL_FLOAT, offset, 2, sizeof(Vertex));
glDrawElements(GL_TRIANGLE_STRIP, 4, GL_UNSIGNED_SHORT, 0);
//glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, 0);
}
private:
void initShaders()
{
if (!_qGLSProg.addShaderFromSourceCode(QOpenGLShader::Vertex,
QString::fromLatin1(vertexShader))) close();
if (!_qGLSProg.addShaderFromSourceCode(QOpenGLShader::Fragment,
QString::fromLatin1(fragmentShader))) close();
if (!_qGLSProg.link()) close();
if (!_qGLSProg.bind()) close();
}
void initGeometry()
{
Vertex vertices[] = {
// x y z s t
{ -1.0f, -1.0f, 0.0f, 0.0f, 0.0f },
{ +1.0f, -1.0f, 0.0f, 1.0f, 0.0f },
{ +1.0f, +1.0f, 0.0f, 1.0f, 1.0f },
{ -1.0f, +1.0f, 0.0f, 0.0f, 1.0f }
};
enum { nVtcs = sizeof vertices / sizeof *vertices };
// OpenGL ES doesn't have QUADs. A TRIANGLE_STRIP does the job as well.
GLushort indices[] = { 3, 0, 2, 1 };
//GLushort indices[] = { 0, 1, 2, 0, 2, 3 };
enum { nIdcs = sizeof indices / sizeof *indices };
_qGLBufArray.create();
_qGLBufArray.bind();
_qGLBufArray.allocate(vertices, nVtcs * sizeof (Vertex));
_qGLBufIndex.create();
_qGLBufIndex.bind();
_qGLBufIndex.allocate(indices, nIdcs * sizeof (GLushort));
}
void initTexture()
{
#if 0 // test whether texturing works at all
//_pQGLTex = new QOpenGLTexture(QImage("test.png").mirrored());
_pQGLTex = new QOpenGLTexture(QImage("test-dxt1.dds").mirrored());
#else // (not) 0
_pQGLTex = new QOpenGLTexture(QOpenGLTexture::Target2D);
_pQGLTex->setFormat(QOpenGLTexture::RGB_DXT1);
_pQGLTex->setSize(_img.w, _img.h);
_pQGLTex->allocateStorage(QOpenGLTexture::RGBA, QOpenGLTexture::UInt8);
_pQGLTex->setCompressedData((int)_img.data.size(), _img.data.data());
#endif // 0
_pQGLTex->setMinificationFilter(QOpenGLTexture::Nearest);
_pQGLTex->setMagnificationFilter(QOpenGLTexture::Nearest);
_pQGLTex->setWrapMode(QOpenGLTexture::ClampToEdge);
}
};
int main(int argc, char **argv)
{
qDebug() << "Qt Version:" << QT_VERSION_STR;
QApplication app(argc, argv);
// load a DDS image to get DTX1 raw data
Image img = loadDXT1("test-dxt1.dds");
// setup GUI
QMainWindow qWin;
OpenGLWidget qGLView(img);
/* I apply brute force to get the main window to a sufficient size
 * -> not appropriate for a productive solution...
 */
qGLView.setMinimumSize(img.w, img.h);
qWin.setCentralWidget(&qGLView);
qWin.show();
// exec. application
return app.exec();
}
For the test, I used (again) the sample file test-dxt1.dds.
And this is, how it looks (sample compiled with VS2013 and Qt 5.9.2):
Notes:
The texture is displayed upside-down. Please consider that the original sample, as well as my (disabled) code for texture loading from a QImage, applies QImage::mirrored(). It seems that QImage stores its data from top to bottom whereas OpenGL textures expect the opposite – from bottom to top. I guess the easiest fix would be to apply the mirroring after the texture is converted back to a QImage.
My original intention was to also implement the part that reads the texture back into a QImage (as described/sketched in the question). In general, I have already done something like this in OpenGL (but without Qt) – I recently posted another answer, OpenGL offscreen render, about it. I have to admit that I had to cancel this plan due to a "time-out", caused by some issues which took me quite a while to fix. I will share these experiences in the following, as I think they could be helpful for others.
To find sample code for the initialization of a QOpenGLTexture with DXT1 data, I did a long Google search – without success. Hence, I scanned the Qt documentation of QOpenGLTexture for methods which looked promising/necessary to get it working. (I have to admit that I have done OpenGL texturing successfully before, but in pure OpenGL.) Finally, I arrived at the actual set of necessary functions. It compiled and started, but all I got was a black window. (Every time I start something new with OpenGL, it first ends up as a black or blue window – depending on which clear color I used.) So, I had a look at qopengltexture.cpp on woboq.org (specifically at the implementation of QOpenGLTexture::QOpenGLTexture(QImage&, ...)). This didn't help much – they do it very similarly to how I tried.
The most essential problem I could only fix by discussing the program with a colleague, who contributed the final hint: I had tried to get this running using QOpenGLFunctions. The last steps (toward the final fix) were trying this out with
_pQGLTex = new QOpenGLTexture(QImage("test.png").mirrored());
(worked) and
_pQGLTex = new QOpenGLTexture(QImage("test-dxt1.dds").mirrored());
(did not work).
This brought us to the idea that QOpenGLFunctions (which is claimed to be compatible with OpenGL ES 2.0) simply does not seem to enable S3 texture loading. Hence, we replaced QOpenGLFunctions with QOpenGLFunctions_4_0_Core and, suddenly, it worked.
I did not override the QOpenGLWidget::resizeGL() method, as I use an identity matrix for the model-view-projection transformation of the OpenGL rendering. This is intended to make model space and clip space identical. Instead, I built a rectangle (-1, -1, 0) – (+1, +1, 0) which should exactly fill the (visible part of the) clip space x-y plane (and it does).
This can be checked visually by enabling the left-in debug code in the shader
gl_FragColor = vec4(1, v_texcoord.x, v_texcoord.y, 1);
which uses the texture coordinates themselves as the green and blue color components. This produces a nicely colored rectangle with red in the lower-left corner, magenta (red and blue) in the upper-left, yellow (red and green) in the lower-right, and white (red, green, and blue) in the upper-right corner. It shows that the rectangle fits perfectly.
As I forced the minimum size of the OpenGLWidget to the exact size of the texture image, the texture-to-pixel mapping should be 1:1. I checked what happens if magnification is set to Nearest – there was no visual difference.
I have to admit that the DXT1 data rendered as a texture looks much better than the decompression I exposed in my other answer. Considering that this is the exact same data (read by a nearly identical loader), this leads me to think that my own uncompress algorithm does not yet consider something (in other words: it still seems to be buggy). Hmm... it seems that needs additional investigation.
Related
[Please look at the edit below; the solution to the question may simply be there.]
I'm trying to learn OpenCL through the study of a small ray tracer (see the code below, from this link).
I don't have a "real" GPU; I'm currently on a macOS laptop with an Intel(R) Iris(TM) Graphics 6100 graphics chip.
The code works well on the CPU, but its behavior is strange on the GPU. It works (or not) depending on the number of samples per pixel (the number of rays that are shot through a pixel to get its color after propagating the rays through the scene). If I take a small number of samples (64) I can render a 1280x720 picture, but if I take 128 samples I'm only able to get a smaller picture. As I understand things, the number of samples should not change anything (except the quality of the picture, of course). Is there something purely related to OpenCL/GPU that I am missing?
Moreover, it seems to be the extraction of the results from the GPU's memory that crashes:
queue.enqueueReadBuffer(cl_output, CL_TRUE, 0, image_width * image_height * sizeof(cl_float4), cpu_output);
I get an "Abort trap: 6" at this stage.
I'm missing something.
[EDIT] After some research I found an interesting trail: the graphics driver may deliberately abort the task because it takes too much time. This behavior (a watchdog timeout) was put in place to avoid "frozen" screens. This topic talks about that.
What do you think about that?
I can't find a way to turn this behavior off. Do you know how?
Here are the files:
main.cpp:
// OpenCL based simple sphere path tracer by Sam Lapere, 2016
// based on smallpt by Kevin Beason
// http://raytracey.blogspot.com
#include <iostream>
#include <fstream>
#include <vector>
#include <CL/cl.hpp> // forward slash for portability; on macOS: #include <OpenCL/cl.hpp>
using namespace std;
using namespace cl;
const int image_width = 1280;
const int image_height = 720;
cl_float4* cpu_output;
CommandQueue queue;
Device device;
Kernel kernel;
Context context;
Program program;
Buffer cl_output;
Buffer cl_spheres;
// dummy variables are required for memory alignment
// float3 is considered as float4 by OpenCL
struct Sphere
{
cl_float radius;
cl_float dummy1;
cl_float dummy2;
cl_float dummy3;
cl_float3 position;
cl_float3 color;
cl_float3 emission;
};
void pickPlatform(Platform& platform, const vector<Platform>& platforms){
if (platforms.size() == 1) platform = platforms[0];
else{
int input = 0;
cout << "\nChoose an OpenCL platform: ";
cin >> input;
// handle incorrect user input
while (input < 1 || input > platforms.size()){
cin.clear(); //clear errors/bad flags on cin
cin.ignore(cin.rdbuf()->in_avail(), '\n'); // ignores exact number of chars in cin buffer
cout << "No such option. Choose an OpenCL platform: ";
cin >> input;
}
platform = platforms[input - 1];
}
}
void pickDevice(Device& device, const vector<Device>& devices){
if (devices.size() == 1) device = devices[0];
else{
int input = 0;
cout << "\nChoose an OpenCL device: ";
cin >> input;
// handle incorrect user input
while (input < 1 || input > devices.size()){
cin.clear(); //clear errors/bad flags on cin
cin.ignore(cin.rdbuf()->in_avail(), '\n'); // ignores exact number of chars in cin buffer
cout << "No such option. Choose an OpenCL device: ";
cin >> input;
}
device = devices[input - 1];
}
}
void printErrorLog(const Program& program, const Device& device){
// Get the error log and print to console
string buildlog = program.getBuildInfo<CL_PROGRAM_BUILD_LOG>(device);
cerr << "Build log:" << std::endl << buildlog << std::endl;
// Print the error log to a file
FILE *log = fopen("errorlog.txt", "w");
fprintf(log, "%s\n", buildlog.c_str()); // %s expects a C string, not a std::string
fclose(log);
cout << "Error log saved in 'errorlog.txt'" << endl;
system("PAUSE");
exit(1);
}
void initOpenCL()
{
// Get all available OpenCL platforms (e.g. AMD OpenCL, Nvidia CUDA, Intel OpenCL)
vector<Platform> platforms;
Platform::get(&platforms);
cout << "Available OpenCL platforms : " << endl << endl;
for (int i = 0; i < platforms.size(); i++)
cout << "\t" << i + 1 << ": " << platforms[i].getInfo<CL_PLATFORM_NAME>() << endl;
// Pick one platform
Platform platform;
pickPlatform(platform, platforms);
cout << "\nUsing OpenCL platform: \t" << platform.getInfo<CL_PLATFORM_NAME>() << endl;
// Get available OpenCL devices on platform
vector<Device> devices;
platform.getDevices(CL_DEVICE_TYPE_ALL, &devices);
cout << "Available OpenCL devices on this platform: " << endl << endl;
for (int i = 0; i < devices.size(); i++){
cout << "\t" << i + 1 << ": " << devices[i].getInfo<CL_DEVICE_NAME>() << endl;
cout << "\t\tMax compute units: " << devices[i].getInfo<CL_DEVICE_MAX_COMPUTE_UNITS>() << endl;
cout << "\t\tMax work group size: " << devices[i].getInfo<CL_DEVICE_MAX_WORK_GROUP_SIZE>() << endl << endl;
}
// Pick one device
pickDevice(device, devices);
cout << "\nUsing OpenCL device: \t" << device.getInfo<CL_DEVICE_NAME>() << endl;
cout << "\t\t\tMax compute units: " << device.getInfo<CL_DEVICE_MAX_COMPUTE_UNITS>() << endl;
cout << "\t\t\tMax work group size: " << device.getInfo<CL_DEVICE_MAX_WORK_GROUP_SIZE>() << endl;
// Create an OpenCL context and command queue on that device.
context = Context(device);
queue = CommandQueue(context, device);
// Convert the OpenCL source code to a string
string source;
ifstream file("opencl_kernel.cl");
if (!file){
cout << "\nNo OpenCL file found!" << endl << "Exiting..." << endl;
system("PAUSE");
exit(1);
}
while (!file.eof()){
char line[256];
file.getline(line, 255);
source += line;
source += '\n'; // keep line breaks so a "//" comment cannot swallow the next line
}
const char* kernel_source = source.c_str();
// Create an OpenCL program by performing runtime source compilation for the chosen device
program = Program(context, kernel_source);
cl_int result = program.build({ device });
if (result) cout << "Error during compilation OpenCL code!!!\n (" << result << ")" << endl;
if (result == CL_BUILD_PROGRAM_FAILURE) printErrorLog(program, device);
// Create a kernel (entry point in the OpenCL source program)
kernel = Kernel(program, "render_kernel");
}
void cleanUp(){
delete[] cpu_output; // array delete to match new[]
}
inline float clamp(float x){ return x < 0.0f ? 0.0f : x > 1.0f ? 1.0f : x; }
// convert RGB float in range [0,1] to int in range [0, 255] (note: no gamma correction is applied here)
inline int toInt(float x){ return int(clamp(x) * 255 + .5); }
void saveImage(){
// write image to PPM file, a very simple image file format
// PPM files can be opened with IrfanView (download at www.irfanview.com) or GIMP
FILE *f = fopen("opencl_raytracer.ppm", "w");
fprintf(f, "P3\n%d %d\n%d\n", image_width, image_height, 255);
// loop over all pixels, write RGB values
for (int i = 0; i < image_width * image_height; i++)
fprintf(f, "%d %d %d ",
toInt(cpu_output[i].s[0]),
toInt(cpu_output[i].s[1]),
toInt(cpu_output[i].s[2]));
}
#define float3(x, y, z) {{x, y, z}} // macro to replace ugly initializer braces
void initScene(Sphere* cpu_spheres){
// left wall
cpu_spheres[0].radius = 200.0f;
cpu_spheres[0].position = float3(-200.6f, 0.0f, 0.0f);
cpu_spheres[0].color = float3(0.75f, 0.25f, 0.25f);
cpu_spheres[0].emission = float3(0.0f, 0.0f, 0.0f);
// right wall
cpu_spheres[1].radius = 200.0f;
cpu_spheres[1].position = float3(200.6f, 0.0f, 0.0f);
cpu_spheres[1].color = float3(0.25f, 0.25f, 0.75f);
cpu_spheres[1].emission = float3(0.0f, 0.0f, 0.0f);
// floor
cpu_spheres[2].radius = 200.0f;
cpu_spheres[2].position = float3(0.0f, -200.4f, 0.0f);
cpu_spheres[2].color = float3(0.9f, 0.8f, 0.7f);
cpu_spheres[2].emission = float3(0.0f, 0.0f, 0.0f);
// ceiling
cpu_spheres[3].radius = 200.0f;
cpu_spheres[3].position = float3(0.0f, 200.4f, 0.0f);
cpu_spheres[3].color = float3(0.9f, 0.8f, 0.7f);
cpu_spheres[3].emission = float3(0.0f, 0.0f, 0.0f);
// back wall
cpu_spheres[4].radius = 200.0f;
cpu_spheres[4].position = float3(0.0f, 0.0f, -200.4f);
cpu_spheres[4].color = float3(0.9f, 0.8f, 0.7f);
cpu_spheres[4].emission = float3(0.0f, 0.0f, 0.0f);
// front wall
cpu_spheres[5].radius = 200.0f;
cpu_spheres[5].position = float3(0.0f, 0.0f, 202.0f);
cpu_spheres[5].color = float3(0.9f, 0.8f, 0.7f);
cpu_spheres[5].emission = float3(0.0f, 0.0f, 0.0f);
// left sphere
cpu_spheres[6].radius = 0.16f;
cpu_spheres[6].position = float3(-0.25f, -0.24f, -0.1f);
cpu_spheres[6].color = float3(0.9f, 0.8f, 0.7f);
cpu_spheres[6].emission = float3(0.0f, 0.0f, 0.0f);
// right sphere
cpu_spheres[7].radius = 0.16f;
cpu_spheres[7].position = float3(0.25f, -0.24f, 0.1f);
cpu_spheres[7].color = float3(0.9f, 0.8f, 0.7f);
cpu_spheres[7].emission = float3(0.0f, 0.0f, 0.0f);
// lightsource
cpu_spheres[8].radius = 1.0f;
cpu_spheres[8].position = float3(0.0f, 1.36f, 0.0f);
cpu_spheres[8].color = float3(0.0f, 0.0f, 0.0f);
cpu_spheres[8].emission = float3(9.0f, 8.0f, 6.0f);
}
int main(){
// initialise OpenCL
initOpenCL();
// allocate memory on CPU to hold the rendered image
cpu_output = new cl_float4[image_width * image_height]; // matches the cl_float4* declaration
// initialise scene
const int sphere_count = 9;
Sphere cpu_spheres[sphere_count];
initScene(cpu_spheres);
// Create buffers on the OpenCL device for the image and the scene
cl_output = Buffer(context, CL_MEM_WRITE_ONLY, image_width * image_height * sizeof(cl_float3));
cl_spheres = Buffer(context, CL_MEM_READ_ONLY, sphere_count * sizeof(Sphere));
queue.enqueueWriteBuffer(cl_spheres, CL_TRUE, 0, sphere_count * sizeof(Sphere), cpu_spheres);
// specify OpenCL kernel arguments
kernel.setArg(0, cl_spheres);
kernel.setArg(1, image_width);
kernel.setArg(2, image_height);
kernel.setArg(3, sphere_count);
kernel.setArg(4, cl_output);
// every pixel in the image has its own thread or "work item",
// so the total amount of work items equals the number of pixels
std::size_t global_work_size = image_width * image_height;
std::size_t local_work_size = kernel.getWorkGroupInfo<CL_KERNEL_WORK_GROUP_SIZE>(device);
cout << "Kernel work group size: " << local_work_size << endl;
// Ensure the global work size is a multiple of local work size
if (global_work_size % local_work_size != 0)
global_work_size = (global_work_size / local_work_size + 1) * local_work_size;
cout << "Rendering started..." << endl;
// launch the kernel
queue.enqueueNDRangeKernel(kernel, NullRange, global_work_size, local_work_size);
queue.finish();
cout << "Rendering done! \nCopying output from device to host" << endl;
// read and copy OpenCL output to CPU
queue.enqueueReadBuffer(cl_output, CL_TRUE, 0, image_width * image_height * sizeof(cl_float3), cpu_output);
// save image
saveImage();
cout << "Saved image to 'opencl_raytracer.ppm'" << endl;
// release memory
cleanUp();
system("PAUSE");
}
opencl_kernel.cl:
/* OpenCL based simple sphere path tracer by Sam Lapere, 2016*/
/* based on smallpt by Kevin Beason */
/* http://raytracey.blogspot.com */
__constant float EPSILON = 0.00003f; /* required to compensate for limited float precision */
__constant float PI = 3.14159265359f;
__constant int SAMPLES = 128;
typedef struct Ray{
float3 origin;
float3 dir;
} Ray;
typedef struct Sphere{
float radius;
float3 pos;
float3 color;
float3 emission;
} Sphere;
static float get_random(unsigned int *seed0, unsigned int *seed1) {
/* hash the seeds using bitwise AND operations and bitshifts */
*seed0 = 36969 * ((*seed0) & 65535) + ((*seed0) >> 16);
*seed1 = 18000 * ((*seed1) & 65535) + ((*seed1) >> 16);
unsigned int ires = ((*seed0) << 16) + (*seed1);
/* use union struct to convert int to float */
union {
float f;
unsigned int ui;
} res;
res.ui = (ires & 0x007fffff) | 0x40000000; /* bitwise AND, bitwise OR */
return (res.f - 2.0f) / 2.0f;
}
Ray createCamRay(const int x_coord, const int y_coord, const int width, const int height){
float fx = (float)x_coord / (float)width; /* convert int in range [0 - width] to float in range [0-1] */
float fy = (float)y_coord / (float)height; /* convert int in range [0 - height] to float in range [0-1] */
/* calculate aspect ratio */
float aspect_ratio = (float)(width) / (float)(height);
float fx2 = (fx - 0.5f) * aspect_ratio;
float fy2 = fy - 0.5f;
/* determine position of pixel on screen */
float3 pixel_pos = (float3)(fx2, -fy2, 0.0f);
/* create camera ray*/
Ray ray;
ray.origin = (float3)(0.0f, 0.1f, 2.0f); /* fixed camera position */
ray.dir = normalize(pixel_pos - ray.origin); /* vector from camera to pixel on screen */
return ray;
}
/* (__global Sphere* sphere, const Ray* ray) */
float intersect_sphere(const Sphere* sphere, const Ray* ray) /* version using local copy of sphere */
{
float3 rayToCenter = sphere->pos - ray->origin;
float b = dot(rayToCenter, ray->dir);
float c = dot(rayToCenter, rayToCenter) - sphere->radius*sphere->radius;
float disc = b * b - c;
if (disc < 0.0f) return 0.0f;
else disc = sqrt(disc);
if ((b - disc) > EPSILON) return b - disc;
if ((b + disc) > EPSILON) return b + disc;
return 0.0f;
}
bool intersect_scene(__constant Sphere* spheres, const Ray* ray, float* t, int* sphere_id, const int sphere_count)
{
/* initialise t to a very large number,
so t will be guaranteed to be smaller
when a hit with the scene occurs */
float inf = 1e20f;
*t = inf;
/* check if the ray intersects each sphere in the scene */
for (int i = 0; i < sphere_count; i++) {
Sphere sphere = spheres[i]; /* create local copy of sphere */
/* float hitdistance = intersect_sphere(&spheres[i], ray); */
float hitdistance = intersect_sphere(&sphere, ray);
/* keep track of the closest intersection and hitobject found so far */
if (hitdistance != 0.0f && hitdistance < *t) {
*t = hitdistance;
*sphere_id = i;
}
}
return *t < inf; /* true when ray intersects the scene */
}
/* the path tracing function */
/* computes a path (starting from the camera) with a defined number of bounces, accumulates light/color at each bounce */
/* each ray hitting a surface will be reflected in a random direction (by randomly sampling the hemisphere above the hitpoint) */
/* small optimisation: diffuse ray directions are calculated using cosine weighted importance sampling */
float3 trace(__constant Sphere* spheres, const Ray* camray, const int sphere_count, const int* seed0, const int* seed1){
Ray ray = *camray;
float3 accum_color = (float3)(0.0f, 0.0f, 0.0f);
float3 mask = (float3)(1.0f, 1.0f, 1.0f);
for (int bounces = 0; bounces < 8; bounces++){
float t; /* distance to intersection */
int hitsphere_id = 0; /* index of intersected sphere */
/* if ray misses scene, return background colour */
if (!intersect_scene(spheres, &ray, &t, &hitsphere_id, sphere_count))
return accum_color += mask * (float3)(0.15f, 0.15f, 0.25f);
/* else, we've got a hit! Fetch the closest hit sphere */
Sphere hitsphere = spheres[hitsphere_id]; /* version with local copy of sphere */
/* compute the hitpoint using the ray equation */
float3 hitpoint = ray.origin + ray.dir * t;
/* compute the surface normal and flip it if necessary to face the incoming ray */
float3 normal = normalize(hitpoint - hitsphere.pos);
float3 normal_facing = dot(normal, ray.dir) < 0.0f ? normal : normal * (-1.0f);
/* compute two random numbers to pick a random point on the hemisphere above the hitpoint*/
float rand1 = 2.0f * PI * get_random(seed0, seed1);
float rand2 = get_random(seed0, seed1);
float rand2s = sqrt(rand2);
/* create a local orthogonal coordinate frame centered at the hitpoint */
float3 w = normal_facing;
float3 axis = fabs(w.x) > 0.1f ? (float3)(0.0f, 1.0f, 0.0f) : (float3)(1.0f, 0.0f, 0.0f);
float3 u = normalize(cross(axis, w));
float3 v = cross(w, u);
/* use the coordinate frame and random numbers to compute the next ray direction */
float3 newdir = normalize(u * cos(rand1)*rand2s + v*sin(rand1)*rand2s + w*sqrt(1.0f - rand2));
/* add a very small offset to the hitpoint to prevent self intersection */
ray.origin = hitpoint + normal_facing * EPSILON;
ray.dir = newdir;
/* add the colour and light contributions to the accumulated colour */
accum_color += mask * hitsphere.emission;
/* the mask colour picks up surface colours at each bounce */
mask *= hitsphere.color;
/* perform cosine-weighted importance sampling for diffuse surfaces*/
mask *= dot(newdir, normal_facing);
}
return accum_color;
}
__kernel void render_kernel(__constant Sphere* spheres, const int width, const int height, const int sphere_count, __global float3* output){
unsigned int work_item_id = get_global_id(0); /* the unique global id of the work item for the current pixel */
unsigned int x_coord = work_item_id % width; /* x-coordinate of the pixel */
unsigned int y_coord = work_item_id / width; /* y-coordinate of the pixel */
/* seeds for random number generator */
unsigned int seed0 = x_coord;
unsigned int seed1 = y_coord;
Ray camray = createCamRay(x_coord, y_coord, width, height);
/* add the light contribution of each sample and average over all samples*/
float3 finalcolor = (float3)(0.0f, 0.0f, 0.0f);
float invSamples = 1.0f / SAMPLES;
for (int i = 0; i < SAMPLES; i++)
finalcolor += trace(spheres, &camray, sphere_count, &seed0, &seed1) * invSamples;
/* store the pixelcolour in the output buffer */
output[work_item_id] = finalcolor;
}
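Aside: since the symptom is CPU-vs-GPU divergence, the kernel's pure-math helpers can be sanity-checked off-device with a direct C++ port. Below, get_random_host, intersect_sphere_host, V3 and dot3 are my names for host-side copies of the kernel functions (they are not part of the tracer); this is a verification sketch only.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Minimal 3-component vector standing in for OpenCL's float3.
struct V3 { float x, y, z; };
static float dot3(V3 a, V3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Port of the kernel's get_random: two coupled multiply-with-carry steps,
// then the union trick (type punning, as in the kernel) forces the float
// exponent so the mantissa bits land in [2, 4), which (f - 2) / 2 maps
// into [0, 1).
float get_random_host(std::uint32_t* seed0, std::uint32_t* seed1) {
    *seed0 = 36969u * ((*seed0) & 65535u) + ((*seed0) >> 16);
    *seed1 = 18000u * ((*seed1) & 65535u) + ((*seed1) >> 16);
    std::uint32_t ires = ((*seed0) << 16) + (*seed1);
    union { float f; std::uint32_t ui; } res;
    res.ui = (ires & 0x007fffffu) | 0x40000000u;
    return (res.f - 2.0f) / 2.0f;
}

// Port of intersect_sphere: solves the reduced quadratic t^2 - 2bt + c = 0
// and returns the nearest root in front of the origin, or 0 on a miss.
float intersect_sphere_host(V3 center, float radius, V3 origin, V3 dir) {
    const float EPSILON = 0.00003f;
    V3 rayToCenter = { center.x - origin.x, center.y - origin.y,
                       center.z - origin.z };
    float b = dot3(rayToCenter, dir);
    float c = dot3(rayToCenter, rayToCenter) - radius * radius;
    float disc = b * b - c;                  // discriminant / 4
    if (disc < 0.0f) return 0.0f;            // ray misses the sphere
    disc = std::sqrt(disc);
    if (b - disc > EPSILON) return b - disc; // nearer hit in front of us
    if (b + disc > EPSILON) return b + disc; // origin inside the sphere
    return 0.0f;
}
```

For example, a unit sphere at the origin hit by a ray from (0, 0, 5) looking down -z should report a hit distance of 4, and the same ray looking up +z should miss.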
Since your program works correctly on the CPU but not on the GPU, you may be exceeding the GPU's TDR (Timeout Detection and Recovery) timer.
A common cause of the Abort trap: 6 error when computing on the GPU is locking the GPU into compute mode for too long (a typical limit seems to be 5 seconds, though I found contradicting sources on this number). When this happens, the watchdog forcefully stops and restarts the graphics driver to prevent the screen from freezing.
There are a couple of possible solutions to this problem:
Work on a headless machine
Most (if not all) operating systems won't enforce the TDR if no screen is attached.
Switch GPU mode
If you are working on an Nvidia Tesla GPU, you can check whether it's possible to switch it to Tesla Compute Cluster mode. In this mode the TDR limit is not enforced. There may be a similar mode for AMD GPUs, but I'm not sure.
Change the TDR value
This can be done under Windows by setting the TdrDelay and TdrDdiDelay registry keys under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers to a higher value. Be careful not to set the number too high, or you won't be able to tell whether the driver has really crashed.
Also note that graphics driver or Windows updates may reset these values to their defaults.
Under Linux the TDR should already be disabled by default (I know it is under Ubuntu 18 and CentOS 8, but I haven't tested other versions/distros). If you have problems anyway, you can add Option "Interactive" "0" to your Xorg config, as stated in this SO question.
Unfortunately I don't know (and couldn't find) a way to do this on macOS. However, I do know that this limit is not enforced on a secondary GPU if you have one installed in your system.
Split your work in smaller chunks
If you can manage to split your computation into smaller chunks, you may be able to stay under the TDR timer (e.g. two computations that take 4 s each instead of a single 8 s one). Whether this is feasible depends on your problem, and it may or may not be an easy task.
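The chunking idea maps naturally onto the global-offset form of an NDRange launch (clEnqueueNDRangeKernel accepts a global work offset). A minimal sketch of just the bookkeeping, with make_chunks being a hypothetical helper name:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Split `total` work-items into launches of at most `chunk` items each.
// Each pair is (global offset, number of items). In OpenCL, each pair
// would become one enqueueNDRangeKernel call (with the offset passed as
// the global work offset) followed by a finish(), so no single launch
// runs long enough to trip the TDR watchdog.
std::vector<std::pair<std::size_t, std::size_t>>
make_chunks(std::size_t total, std::size_t chunk) {
    std::vector<std::pair<std::size_t, std::size_t>> out;
    for (std::size_t off = 0; off < total; off += chunk)
        out.push_back({off, std::min(chunk, total - off)});
    return out;
}
```

For example, 1000 work-items with a chunk size of 256 produce four launches, the last one covering the remaining 232 items.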
I am having a hard time figuring out how to include the graphics.h file with my compiler. All the information I have come across is for IDEs such as Code::Blocks. I would like to be able to include the graphics file for use without facing any problems. My questions are:
Can you use a text editor like Atom to create a graphics object?
If so what steps should be taken in order to accomplish that?
There are a lot of graphics formats available, with varying capabilities.
The first distinction I would make is:
raster graphics vs. vector graphics
Raster graphics (storing the image pixel by pixel) are more often binary encoded, as the amount of data is usually directly proportional to the size of the image. However, some of them are textually encoded, or support both textual and binary encodings.
Examples are:
Portable anymap
X Pixmap
Although these file formats are a little exotic, it is not difficult to find software which supports them. E.g. GIMP supports both out of the box (even on Windows). Btw. they are so simple that it is not too complicated to write a loader and writer yourself.
A simple reader and writer for PPM (the color version of Portable anymap) can be found in my answer to SO: Convolution for Edge Detection in C.
Vector graphics (which store the graphics primitives that build the image) are more often textually encoded. As vector graphics can be losslessly scaled to any image size by simply applying a scaling factor to all coordinates, file size and destination image size are not directly related. Thus, vector graphics are the preferable format for drawings, especially if they are needed in multiple target resolutions.
For this, I would exclusively recommend:
Scalable Vector Graphics
which is (hopefully) the upcoming standard for scalable graphics in Web content. Qt provides (limited) support for SVG, and thus it is my preferred option for resolution-independent icons.
A different (but maybe related) option is to embed graphics in source code. This can be done with practically any format if your image loader library provides image loading from memory (as well as from file). (All the ones I know do.)
Thus, the problem can be reduced to: how to embed a large chunk of (ASCII or binary) data as a constant in C/C++ source code? — which is IMHO trivial to solve.
I did this in my answer for SO: Paint a rect on qglwidget at specifit times.
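The embedding step itself can be automated. The following is a minimal sketch equivalent to what xxd -i prints (to_c_array is my name, not a library function); the output can be pasted into a source file and handed to any loader that reads images from memory:

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Turn a byte buffer into a C array initializer suitable for pasting
// into a source file, e.g. the raw bytes of an image file.
std::string to_c_array(const std::vector<unsigned char>& data,
                       const std::string& name) {
    std::string out = "const unsigned char " + name + "[] = {";
    char buf[8];
    for (std::size_t i = 0; i < data.size(); ++i) {
        // Prefix every element but the first with a comma.
        std::snprintf(buf, sizeof(buf), "%s0x%02x",
                      i ? ", " : " ", (unsigned)data[i]);
        out += buf;
    }
    out += " };";
    return out;
}
```

E.g. the two bytes 0x50 0x33 ("P3", the ASCII PPM magic) become `const unsigned char img[] = { 0x50, 0x33 };`.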
Update:
As I noticed that the linked sample for PPM (as well as another for PBM) actually reads the binary format, I implemented a sample application which demonstrates the usage of ASCII PPM.
I believe that XPM is better suited to the specific requirement of being editable in a text editor. Thus, I considered this in my sample as well.
As the question doesn't mention which specific internal image format is desired, nor in which API it shall be usable, I chose Qt, which
is something I'm familiar with
provides a QImage which is used as destination for image import
needs only a few lines of code for visual output of result.
Source code test-QShowPPM-XPM.cc:
// standard C++ header:
#include <cassert>
#include <cctype>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
// Qt header:
#include <QtWidgets>
// sample image in ASCII PPM format
// (taken from https://en.wikipedia.org/wiki/Netpbm_format)
const char ppmData[] =
"P3\n"
"3 2\n"
"255\n"
"255 0 0 0 255 0 0 0 255\n"
"255 255 0 255 255 255 0 0 0\n";
// sample image in XPM3 format
/* XPM */
const char *xpmData[] = {
// w, h, nC, cPP
"16 16 5 1",
// colors
" c #ffffff",
"# c #000000",
"g c #ffff00",
"r c #ff0000",
"b c #0000ff",
// pixels
" ## ",
" ###gg### ",
" #gggggggg# ",
" #gggggggggg# ",
" #ggbbggggbbgg# ",
" #ggbbggggbbgg# ",
" #gggggggggggg# ",
"#gggggggggggggg#",
"#ggrrggggggrrgg#",
" #ggrrrrrrrrgg# ",
" #ggggrrrrgggg# ",
" #gggggggggggg# ",
" #gggggggggg# ",
" #gggggggg# ",
" ###gg### ",
" ## "
};
// Simplified PPM ASCII Reader (no support of comments)
inline int clamp(int value, int min, int max)
{
return value < min ? min : value > max ? max : value;
}
inline int scale(int value, int maxOld, int maxNew)
{
return value * maxNew / maxOld;
}
QImage readPPM(std::istream &in)
{
std::string header;
std::getline(in, header);
if (header != "P3") throw "ERROR! Not a PPM ASCII file.";
int w = 0, h = 0, max = 255; // width, height, max component value
if (!(in >> w >> h >> max)) throw "ERROR! Premature end of file.";
if (max <= 0 || max > 255) throw "ERROR! Invalid format.";
QImage qImg(w, h, QImage::Format_RGB32);
for (int y = 0; y < h; ++y) {
for (int x = 0; x < w; ++x) {
int r, g, b;
if (!(in >> r >> g >> b)) throw "ERROR! Premature end of file.";
qImg.setPixel(x, y,
scale(clamp(r, 0, 255), max, 255) << 16
| scale(clamp(g, 0, 255), max, 255) << 8
| scale(clamp(b, 0, 255), max, 255));
}
}
return qImg;
}
// Simplified XPM Reader (implements sub-set of XPM3)
char getChar(const char *&p)
{
if (!*p) throw "ERROR! Premature end of file.";
return *p++;
}
std::string getString(const char *&p)
{
std::string str;
while (*p && !isspace(*p)) str += *p++;
return str;
}
void skipWS(const char *&p)
{
while (*p && isspace(*p)) ++p;
}
QImage readXPM(const char **xpmData)
{
int w = 0, h = 0; // width, height
int nC = 0, cPP = 1; // number of colors, chars per pixel
{ std::istringstream in(*xpmData);
if (!(in >> w >> h >> nC >> cPP)) throw "ERROR! Premature end of file.";
++xpmData;
}
std::map<std::string, std::string> colTbl;
for (int i = nC; i--; ++xpmData) {
const char *p = *xpmData;
std::string chr;
for (int j = cPP; j--;) chr += getChar(p);
skipWS(p);
if (getChar(p) != 'c') throw "ERROR! Format not supported.";
skipWS(p);
colTbl[chr] = getString(p);
}
QImage qImg(w, h, QImage::Format_RGB32);
for (int y = 0; y < h; ++y, ++xpmData) {
const char *p = *xpmData;
for (int x = 0; x < w; ++x) {
std::string pixel;
for (int j = cPP; j--;) pixel += getChar(p);
qImg.setPixelColor(x, y, QColor(colTbl[pixel].c_str()));
}
}
return qImg;
}
// a customized QLabel to handle scaling
class LabelImage: public QLabel {
private:
QPixmap _qPixmap, _qPixmapScaled;
public:
LabelImage();
LabelImage(const QPixmap &qPixmap): LabelImage()
{
setPixmap(qPixmap);
}
LabelImage(const QImage &qImg): LabelImage(QPixmap::fromImage(qImg))
{ }
void setPixmap(const QPixmap &qPixmap) { setPixmap(qPixmap, size()); }
protected:
virtual void resizeEvent(QResizeEvent *pQEvent);
private:
void setPixmap(const QPixmap &qPixmap, const QSize &size);
};
// main function
int main(int argc, char **argv)
{
qDebug() << QT_VERSION_STR;
// main application
#undef qApp // undef macro qApp out of the way
QApplication qApp(argc, argv);
// setup GUI
QMainWindow qWin;
QGroupBox qBox;
QGridLayout qGrid;
std::istringstream inPPM(ppmData); // named stream: a temporary cannot bind to std::istream&
LabelImage qLblImgPPM(readPPM(inPPM));
qGrid.addWidget(&qLblImgPPM, 0, 0, Qt::AlignCenter);
LabelImage qLblImgXPM(readXPM(xpmData));
qGrid.addWidget(&qLblImgXPM, 1, 0, Qt::AlignCenter);
qBox.setLayout(&qGrid);
qWin.setCentralWidget(&qBox);
qWin.show();
// run application
return qApp.exec();
}
// implementation of LabelImage
LabelImage::LabelImage(): QLabel()
{
setFrameStyle(Raised | Box);
setAlignment(Qt::AlignCenter);
//setMinimumSize(QSize(1, 1)); // seems to be not necessary
setSizePolicy(QSizePolicy(QSizePolicy::Ignored, QSizePolicy::Ignored));
}
void LabelImage::resizeEvent(QResizeEvent *pQEvent)
{
QLabel::resizeEvent(pQEvent);
setPixmap(_qPixmap, pQEvent->size());
}
void LabelImage::setPixmap(const QPixmap &qPixmap, const QSize &size)
{
_qPixmap = qPixmap;
_qPixmapScaled = _qPixmap.scaled(size, Qt::KeepAspectRatio);
QLabel::setPixmap(_qPixmapScaled);
}
This has been compiled in VS2013 and tested in Windows 10 (64 bit).
I am currently trying to render textured objects in OpenGL. Everything worked fine until I wanted to render a texture with transparency. Instead of showing the object as transparent, it just rendered totally black.
The method for loading the texture file is this:
// structures for reading and information variables
char magic[4];
unsigned char header[124];
unsigned int width, height, linearSize, mipMapCount, fourCC;
unsigned char* dataBuffer;
unsigned int bufferSize;
fstream file(path, ios::in|ios::binary);
// read magic and header
if (!file.read((char*)magic, sizeof(magic))){
cerr<< "File " << path << " not found!"<<endl;
return false;
}
if (magic[0]!='D' || magic[1]!='D' || magic[2]!='S' || magic[3]!=' '){
cerr<< "File does not comply with dds file format!"<<endl;
return false;
}
if (!file.read((char*)header, sizeof(header))){
cerr<< "Not able to read file information!"<<endl;
return false;
}
// derive information from header
height = *(int*)&(header[8]);
width = *(int*)&(header[12]);
linearSize = *(int*)&(header[16]);
mipMapCount = *(int*)&(header[24]);
fourCC = *(int*)&(header[80]);
// determine dataBuffer size
bufferSize = mipMapCount > 1 ? linearSize * 2 : linearSize;
dataBuffer = new unsigned char [bufferSize];
// read data and close file
if (file.read((char*)dataBuffer, bufferSize))
cout<<"Loading texture "<<path<<" successful"<<endl;
else{
cerr<<"Data of file "<<path<<" corrupted"<<endl;
return false;
}
file.close();
// check pixel format
unsigned int format;
switch(fourCC){
case FOURCC_DXT1:
format = GL_COMPRESSED_RGBA_S3TC_DXT1_EXT;
break;
case FOURCC_DXT3:
format = GL_COMPRESSED_RGBA_S3TC_DXT3_EXT;
break;
case FOURCC_DXT5:
format = GL_COMPRESSED_RGBA_S3TC_DXT5_EXT;
break;
default:
cerr << "Compression type not supported or corrupted!" << endl;
return false;
}
glGenTextures(1, &ID);
glBindTexture(GL_TEXTURE_2D, ID);
glPixelStorei(GL_UNPACK_ALIGNMENT,1);
unsigned int blockSize = (format == GL_COMPRESSED_RGBA_S3TC_DXT1_EXT) ? 8 : 16;
unsigned int offset = 0;
/* load the mipmaps */
for (unsigned int level = 0; level < mipMapCount && (width || height); ++level) {
unsigned int size = ((width+3)/4)*((height+3)/4)*blockSize;
glCompressedTexImage2D(GL_TEXTURE_2D, level, format, width, height,
0, size, dataBuffer + offset);
offset += size;
width /= 2;
height /= 2;
}
textureType = DDS_TEXTURE;
return true;
In the fragment shader I just set the gl_FragColor = texture2D( myTextureSampler, UVcoords )
I hope that there is an easy explanation such as some code missing.
In the OpenGL initialization I enabled GL_BLEND and set a blend function.
Does anyone have an idea of what I did wrong?
Make sure the blend function is the correct function for what you are trying to accomplish. For what you've described that should be glBlendFunc(GL_SRC_ALPHA,GL_ONE_MINUS_SRC_ALPHA);
You probably shouldn't set the blend function in your OpenGL initialization function; instead, wrap it around your draw calls like:
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
// gl draw functions (glDrawArrays, glDrawElements, etc.)
glDisable(GL_BLEND);
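For reference, glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) makes the pipeline compute, per channel, out = src * srcAlpha + dst * (1 - srcAlpha). A plain C++ restatement of that formula for illustration (not GL code; the function name is mine):

```cpp
#include <cassert>

// Per-channel result of GL_SRC_ALPHA / GL_ONE_MINUS_SRC_ALPHA blending:
// linear interpolation between destination and source by source alpha.
float blend_channel(float src, float dst, float srcAlpha) {
    return src * srcAlpha + dst * (1.0f - srcAlpha);
}
```

So a half-transparent red fragment over a blue background yields (0.5, 0, 0.5): exactly the see-through effect the question is after, and why a wrong (or unset) blend function can make transparent texels render black instead.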
Are you clearing the 2D texture binding before you swap buffers? I.e.:
glBindTexture(GL_TEXTURE_2D, 0);
I'm trying to store a 1365x768 image on a 2048x1024 texture in OpenGL ES but the resulting image once drawn appears skewed. If I run the same 1365x768 image through gluScaleImage() and fit it onto the 2048x1024 texture it looks fine when drawn but this OpenGL call is slow and hurts performance.
I'm doing this on an Android device (Motorola Milestone) which has 256MB of memory. Not sure if the memory is a factor though since it works fine when scaled using gluScaleImage() (it's just slower.)
Mapping smaller textures (854x480 onto 1024x512, for example) works fine though. Does anyone know why this is and suggestions for what I can do about it?
Update
Some code snippets to help understand context...
// uiImage is loaded. The texture dimensions are determined from upsizing the image
// dimensions to a power of two size:
// uiImage->_width = 1365
// uiImage->_height = 768
// width = 2048
// height = 1024
// Once the image is loaded:
// INT retval = gluScaleImage(GL_RGBA, uiImage->_width, uiImage->_height, GL_UNSIGNED_BYTE, uiImage->_texels, width, height, GL_UNSIGNED_BYTE, data);
copyImage(GL_RGBA, uiImage->_width, uiImage->_height, GL_UNSIGNED_BYTE, uiImage->_texels, width, height, GL_UNSIGNED_BYTE, data);
if (pixelFormat == RGB565 || pixelFormat == RGBA4444)
{
unsigned char* tempData = NULL;
unsigned int* inPixel32;
unsigned short* outPixel16;
tempData = new unsigned char[height*width*2];
inPixel32 = (unsigned int*)data;
outPixel16 = (unsigned short*)tempData;
if(pixelFormat == RGB565)
{
// "RRRRRRRRGGGGGGGGBBBBBBBBAAAAAAAA" --> "RRRRRGGGGGGBBBBB"
for(unsigned int i = 0; i < numTexels; ++i, ++inPixel32)
{
*outPixel16++ = ((((*inPixel32 >> 0) & 0xFF) >> 3) << 11) |
((((*inPixel32 >> 8) & 0xFF) >> 2) << 5) |
((((*inPixel32 >> 16) & 0xFF) >> 3) << 0);
}
}
if(tempData != NULL)
{
delete [] data;
data = tempData;
}
}
// [snip..]
// Copy function (mostly)
static void copyImage(GLint widthin, GLint heightin, const unsigned int* datain, GLint widthout, GLint heightout, unsigned int* dataout)
{
unsigned int* p1 = const_cast<unsigned int*>(datain);
unsigned int* p2 = dataout;
int nui = widthin * sizeof(unsigned int);
for(int i = 0; i < heightin; i++)
{
memcpy(p2, p1, nui);
p1 += widthin;
p2 += widthout;
}
}
In the render code, without changing my texture coordinates, I should see the correct image when using gluScaleImage(), and a smaller image (requiring some later correction factors) with the copyImage() code. This is what happens when the image is small (854x480, for example, works fine with copyImage()), but when I use the 1365x768 image the skewing appears.
Finally solved the issue. The first thing to know is the maximum texture size allowed by the device:
GLint texSize;
glGetIntegerv(GL_MAX_TEXTURE_SIZE, &texSize);
When I ran this, the maximum texture size for the Motorola Milestone was 2048x2048, which was fine in my case.
After messing with the texture mapping to no end, I finally decided to try opening and re-saving the image... and voilà, it suddenly began working. I don't know what was wrong with the format the original image was stored in, but as advice to anyone else experiencing a similar problem: it might be worth looking at the image file itself.
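For anyone hitting the same symptom: the pad-to-POT approach in the question can be checked off-device. A sketch of the row-by-row copy plus the texture-coordinate scaling that goes with it (all names are mine; texels are 32-bit RGBA):

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Copy a w x h image into the top-left corner of a potW x potH texture,
// and compute the maximum texture coordinates (s, t) covering only the
// valid region. Drawing with coordinates in [0, s] x [0, t] then samples
// the image without skew, despite the padding.
struct Padded {
    std::vector<unsigned int> texels;
    float s, t;
};

Padded pad_to_pot(const std::vector<unsigned int>& src,
                  int w, int h, int potW, int potH) {
    Padded out;
    out.texels.assign((std::size_t)potW * potH, 0u); // padding stays zeroed
    for (int y = 0; y < h; ++y)                      // one memcpy per row
        std::memcpy(&out.texels[(std::size_t)y * potW],
                    &src[(std::size_t)y * w],
                    (std::size_t)w * sizeof(unsigned int));
    out.s = (float)w / potW;
    out.t = (float)h / potH;
    return out;
}
```

Note the per-row destination stride is potW, not w: mixing those two up is exactly what produces a "skewed" image like the one described.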
I am trying to do some image processing on a UIImage using some EAGLView code from the GLImageProcessing sample from Apple. The sample code is configured to process a pre-installed image (Image.png). I am trying to modify the code so that it accepts a UIImage (or at least CGImage data) of my choice and processes that instead. The problem is, the texture-loader method loadTexture() (below) seems to accept only C structures as parameters, and I have not been able to get it to accept a UIImage* or a CGImage as a parameter. Can someone give me a clue as to how to bridge the gap so that I can pass my UIImage into the C function?
------------ from Texture.h ---------------
#ifndef TEXTURE_H
#define TEXTURE_H
#include "Imaging.h"
void loadTexture(const char *name, Image *img, RendererInfo *renderer);
#endif /* TEXTURE_H */
----------------from Texture.m---------------------
#import <UIKit/UIKit.h>
#import "Texture.h"
static unsigned int nextPOT(unsigned int x)
{
x = x - 1;
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >>16);
return x + 1;
}
// This is not a fully generalized image loader. It is an example of how to use
// CGImage to directly access decompressed image data. Only the most commonly
// used image formats are supported. It will be necessary to expand this code
// to account for other uses, for example cubemaps or compressed textures.
//
// If the image format is supported, this loader will generate an OpenGL 2D texture object
// and upload texels from it, padding to POT if needed. For image processing purposes,
// border pixels are also replicated here to ensure proper filtering during e.g. blur.
//
// The caller of this function is responsible for deleting the GL texture object.
void loadTexture(const char *name, Image *img, RendererInfo *renderer)
{
GLuint texID = 0, components, x, y;
GLuint imgWide, imgHigh; // Real image size
GLuint rowBytes, rowPixels; // Image size padded by CGImage
GLuint POTWide, POTHigh; // Image size padded to next power of two
CGBitmapInfo info; // CGImage component layout info
CGColorSpaceModel colormodel; // CGImage colormodel (RGB, CMYK, paletted, etc)
GLenum internal, format;
GLubyte *pixels, *temp = NULL;
CGImageRef CGImage = [UIImage imageNamed:[NSString stringWithUTF8String:name]].CGImage;
rt_assert(CGImage);
if (!CGImage)
return;
// Parse CGImage info
info = CGImageGetBitmapInfo(CGImage); // CGImage may return pixels in RGBA, BGRA, or ARGB order
colormodel = CGColorSpaceGetModel(CGImageGetColorSpace(CGImage));
size_t bpp = CGImageGetBitsPerPixel(CGImage);
if (bpp < 8 || bpp > 32 || (colormodel != kCGColorSpaceModelMonochrome && colormodel != kCGColorSpaceModelRGB))
{
// This loader does not support all possible CGImage types, such as paletted images
CGImageRelease(CGImage);
return;
}
components = bpp>>3;
rowBytes = CGImageGetBytesPerRow(CGImage); // CGImage may pad rows
rowPixels = rowBytes / components;
imgWide = CGImageGetWidth(CGImage);
imgHigh = CGImageGetHeight(CGImage);
img->wide = rowPixels;
img->high = imgHigh;
img->s = (float)imgWide / rowPixels;
img->t = 1.0;
// Choose OpenGL format
switch(bpp)
{
default:
rt_assert(0 && "Unknown CGImage bpp");
case 32:
{
internal = GL_RGBA;
switch(info & kCGBitmapAlphaInfoMask)
{
case kCGImageAlphaPremultipliedFirst:
case kCGImageAlphaFirst:
case kCGImageAlphaNoneSkipFirst:
format = GL_BGRA;
break;
default:
format = GL_RGBA;
}
break;
}
case 24:
internal = format = GL_RGB;
break;
case 16:
internal = format = GL_LUMINANCE_ALPHA;
break;
case 8:
internal = format = GL_LUMINANCE;
break;
}
// Get a pointer to the uncompressed image data.
//
// This allows access to the original (possibly unpremultiplied) data, but any manipulation
// (such as scaling) has to be done manually. Contrast this with drawing the image
// into a CGBitmapContext, which allows scaling, but always forces premultiplication.
CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider(CGImage));
rt_assert(data);
pixels = (GLubyte *)CFDataGetBytePtr(data);
rt_assert(pixels);
// If the CGImage component layout isn't compatible with OpenGL, fix it.
// On the device, CGImage will generally return BGRA or RGBA.
// On the simulator, CGImage may return ARGB, depending on the file format.
if (format == GL_BGRA)
{
uint32_t *p = (uint32_t *)pixels;
int i, num = img->wide * img->high;
if ((info & kCGBitmapByteOrderMask) != kCGBitmapByteOrder32Host)
{
// Convert from ARGB to BGRA
for (i = 0; i < num; i++)
p[i] = (p[i] << 24) | ((p[i] & 0xFF00) << 8) | ((p[i] >> 8) & 0xFF00) | (p[i] >> 24);
}
// All current iPhoneOS devices support BGRA via an extension.
if (!renderer->extension[IMG_texture_format_BGRA8888])
{
format = GL_RGBA;
// Convert from BGRA to RGBA
for (i = 0; i < num; i++)
#if __LITTLE_ENDIAN__
p[i] = ((p[i] >> 16) & 0xFF) | (p[i] & 0xFF00FF00) | ((p[i] & 0xFF) << 16);
#else
p[i] = ((p[i] & 0xFF00) << 16) | (p[i] & 0xFF00FF) | ((p[i] >> 16) & 0xFF00);
#endif
}
}
// Determine if we need to pad this image to a power of two.
// There are multiple ways to deal with NPOT images on renderers that only support POT:
// 1) scale down the image to POT size. Loses quality.
// 2) pad up the image to POT size. Wastes memory.
// 3) slice the image into multiple POT textures. Requires more rendering logic.
//
// We are only dealing with a single image here, and pick 2) for simplicity.
//
// If you prefer 1), you can use CoreGraphics to scale the image into a CGBitmapContext.
POTWide = nextPOT(img->wide);
POTHigh = nextPOT(img->high);
if (!renderer->extension[APPLE_texture_2D_limited_npot] && (img->wide != POTWide || img->high != POTHigh))
{
GLuint dstBytes = POTWide * components;
GLubyte *temp = (GLubyte *)malloc(dstBytes * POTHigh);
for (y = 0; y < img->high; y++)
memcpy(&temp[y*dstBytes], &pixels[y*rowBytes], rowBytes);
img->s *= (float)img->wide/POTWide;
img->t *= (float)img->high/POTHigh;
img->wide = POTWide;
img->high = POTHigh;
pixels = temp;
rowBytes = dstBytes;
}
// For filters that sample texel neighborhoods (like blur), we must replicate
// the edge texels of the original input, to simulate CLAMP_TO_EDGE.
{
GLuint replicatew = MIN(MAX_FILTER_RADIUS, img->wide-imgWide);
GLuint replicateh = MIN(MAX_FILTER_RADIUS, img->high-imgHigh);
GLuint imgRow = imgWide * components;
for (y = 0; y < imgHigh; y++)
for (x = 0; x < replicatew; x++)
memcpy(&pixels[y*rowBytes+imgRow+x*components], &pixels[y*rowBytes+imgRow-components], components);
for (y = imgHigh; y < imgHigh+replicateh; y++)
memcpy(&pixels[y*rowBytes], &pixels[(imgHigh-1)*rowBytes], imgRow+replicatew*components);
}
if (img->wide <= renderer->maxTextureSize && img->high <= renderer->maxTextureSize)
{
glGenTextures(1, &texID);
glBindTexture(GL_TEXTURE_2D, texID);
// Set filtering parameters appropriate for this application (image processing on screen-aligned quads.)
// Depending on your needs, you may prefer linear filtering, or mipmap generation.
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexImage2D(GL_TEXTURE_2D, 0, internal, img->wide, img->high, 0, format, GL_UNSIGNED_BYTE, pixels);
}
if (temp) free(temp);
CFRelease(data);
CGImageRelease(CGImage);
img->texID = texID;
}
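As a side check on the padding path above: nextPOT is self-contained and easy to verify in isolation. A stand-alone C++ copy of the bit trick, which smears the highest set bit of x - 1 into all lower positions and then adds one, yielding the smallest power of two >= x (for x >= 1):

```cpp
#include <cassert>

// Smallest power of two greater than or equal to x (x >= 1).
// Copied from the question's loader for verification purposes.
unsigned int nextPOT(unsigned int x) {
    x = x - 1;          // handle exact powers of two
    x = x | (x >> 1);   // propagate the top set bit downward...
    x = x | (x >> 2);
    x = x | (x >> 4);
    x = x | (x >> 8);
    x = x | (x >> 16);  // ...until all lower bits are set
    return x + 1;
}
```

For the loader this means e.g. a 1365x768 image gets padded to a 2048x1024 texture, with img->s and img->t scaled accordingly.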
Side Note: The above code is the original, unmodified sample code from Apple and compiles without errors. However, when I try to modify the .h and .m to accept a UIImage* parameter (as below), the compiler generates the following error: "Error: expected declaration specifiers or "..." before UIImage"
----------Modified .h Code that generates the Compiler Error:-------------
void loadTexture(const char *name, Image *img, RendererInfo *renderer, UIImage* newImage)
You are probably importing this .h into a .c file somewhere. That tells the compiler to use C rather than Objective-C. UIKit.h (and its many children) are Objective-C and cannot be compiled by a C compiler.
You can rename all your .c files to .m, but what you probably really want is to use CGImageRef and import CGImage.h. CoreGraphics is C-based; UIKit is Objective-C. There is no problem with Texture.m being Objective-C, if you want. Just make sure that Texture.h is pure C. Alternatively (and I do this a lot with C++ code), you can make a Texture+C.h header that exposes just the C-safe functions. Import Texture.h in Objective-C code and Texture+C.h in C code. Or name them the other way around if that's more convenient, with a Texture+ObjC.h.
It sounds like your file isn't importing the UIKit header.
Why are you passing a new image to loadTexture, instead of using loadTexture's own UIImage loading to open the new image you want?
loadTexture:
void loadTexture(const char *name, Image *img, RendererInfo *renderer)
{
GLuint texID = 0, components, x, y;
GLuint imgWide, imgHigh; // Real image size
GLuint rowBytes, rowPixels; // Image size padded by CGImage
GLuint POTWide, POTHigh; // Image size padded to next power of two
CGBitmapInfo info; // CGImage component layout info
CGColorSpaceModel colormodel; // CGImage colormodel (RGB, CMYK, paletted, etc)
GLenum internal, format;
GLubyte *pixels, *temp = NULL;
[Why not have the following fetch your UIImage?]
CGImageRef CGImage = [UIImage imageNamed:[NSString stringWithUTF8String:name]].CGImage;
rt_assert(CGImage);
if (!CGImage)
return;