I'm currently working on something in C++ using SDL2 that requires drawing a lot of individual pixels with specific color values to the screen every update. I'm using SDL_RenderDrawPoint just to verify that my program works, but I'm sure its performance is terrible. From a cursory search, it seems the fastest approach is a streaming texture the size of my window: update it with SDL_UpdateTexture from a vector of pixels holding my desired values, with a default of {0,0,0,0} RGBA for any pixel not changed.
However, every attempt I've made at writing it fails, and I'm not sure where my misunderstanding lies. Below is my current code, which attempts to draw a specific RGBA color value to a specific x,y coordinate in my texture. I assume the part of the buffer I'm accessing with my x,y values is incorrect, but I'm unsure how to fix it.
Any help is appreciated, including suggestions on how to do this efficiently without a texture if there's a better way.
SDL_Texture* windowTexture = SDL_CreateTexture(render, SDL_PIXELFORMAT_RGBA8888, SDL_TEXTUREACCESS_STREAMING, screenWidth, screenHeight);
unsigned int* lockedPixels = nullptr;
std::vector<int> pixels(screenHeight*screenWidth*4, 0);
int pitch = 0;
int start = (y * screenWidth + x) * 4;
pixels[start + 0] = B;
pixels[start + 1] = G;
pixels[start + 2] = R;
pixels[start + 3] = A;
SDL_UpdateTexture(windowTexture, nullptr, pixels.data(), screenWidth * 4);
The pixel format RGBA8888 means that each pixel is a 32 bit element with each channel (i.e. red, green, blue or alpha) taking up 8 bits, in that order.
You may want to declare pixels with an exactly-32-bit unsigned type. A plain unsigned int is typically 32 bits, but that is not guaranteed; it may be larger.
std::vector<Uint32> pixels(screenHeight*screenWidth, 0); // Note: no *4
The individual R, G, B, A values (which should each be 8 bit unsigned integers) can be combined into one pixel by using shifts and bit-wise ORs:
int start = y * screenWidth + x; // Note: no *4
pixels[start] = (static_cast<Uint32>(R) << 24) | (G << 16) | (B << 8) | A; // cast first so R << 24 can't overflow a signed int
Lastly, you may want to not hardcode the last parameter of SDL_UpdateTexture (i.e. pitch). Instead, use screenWidth * sizeof(Uint32).
The implementation above is basically a direct implementation of "RGBA8888" and allows you to access individual pixels.
Alternatively, you could also declare an array of four times the size containing 8 bit unsigned integers. Then, the first four indices would correspond to the R, G, B, A values of the first pixel, the next four indices would correspond to the R, G, B, A values of the second pixel, etc.
Which one is faster would depend on the exact system and use-case (whether the most common operations are on pixels or individual channels).
PS. Instead of Uint32 you could also use C++'s own std::uint32_t from the <cstdint> header.
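Putting the answer together, the packing and indexing steps can be checked in isolation, without SDL at all. This is just a sketch; packRGBA and putPixel are hypothetical helper names, and the actual upload is still the SDL_UpdateTexture call shown above.

```cpp
#include <cstdint>
#include <vector>

// Pack 8-bit channels into one RGBA8888 pixel (R in the most significant byte).
std::uint32_t packRGBA(std::uint8_t r, std::uint8_t g, std::uint8_t b, std::uint8_t a)
{
    return (static_cast<std::uint32_t>(r) << 24) |
           (static_cast<std::uint32_t>(g) << 16) |
           (static_cast<std::uint32_t>(b) << 8)  |
            static_cast<std::uint32_t>(a);
}

// Write one pixel at (x, y) into a screenWidth-wide buffer of whole pixels.
void putPixel(std::vector<std::uint32_t>& pixels, int screenWidth,
              int x, int y, std::uint32_t rgba)
{
    pixels[static_cast<std::size_t>(y) * screenWidth + x] = rgba;
}
```

A frame then becomes: write pixels via putPixel, call SDL_UpdateTexture(windowTexture, nullptr, pixels.data(), screenWidth * sizeof(Uint32)), and render the texture with SDL_RenderCopy and SDL_RenderPresent.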
I tried to create a heightmap from a PNG or JPG file, and it works for about 75%, but I can't solve the last 25%...
Here is a picture of the map as png
And this is the resulting heightmap/terrain
As you can see, the symbols start to repeat, and I have no clue why.
The code:
auto image = IMG_Load(path.c_str());
int lineOffSet = i*(image->pitch/4);
uint32 pixel = static_cast<uint32*>(image->pixels)[lineOffSet + j];
uint8 r, g ,b;
SDL_GetRGB(pixel,image->format,&r, &g, &b);
What I tried:
The number of vertices is correct (256x256).
int lineOffSet = i*(image->pitch/4);
The 4 represents the bytes per pixel, which in this case should be 3, but with 3 I get a completely different terrain (the pitch is 768). The range of i and j goes from 0 to 255.
I hope someone has a hint to solve this.
I think you calculate the address of the desired pixel wrongly. You assume that one pixel is 4 bytes in size, but here it is 3 (pitch 768 = 256 * 3 bytes per row). It's usually more reliable to calculate the address directly in bytes and only then cast to uint32. Also note that image->pixels is a void*, so it has to be cast to a byte pointer before doing the arithmetic. Try this:
uint32 pixel = *reinterpret_cast<uint32*>(
                   static_cast<uint8*>(image->pixels) +
                   image->pitch * i +
                   image->format->BytesPerPixel * j);
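The byte-based addressing can be sketched and sanity-checked without SDL. pixelAt below is a hypothetical helper where pitch and bytes_per_pixel stand in for image->pitch and image->format->BytesPerPixel; using memcpy also sidesteps the unaligned-access and strict-aliasing pitfalls of dereferencing an arbitrary byte address as a uint32.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Read the 32-bit value whose first byte sits at row i, column j of a pixel
// buffer with the given pitch (bytes per row) and bytes-per-pixel.
std::uint32_t pixelAt(const std::uint8_t* pixels, int pitch, int bytes_per_pixel,
                      int i, int j)
{
    std::uint32_t value = 0;
    std::memcpy(&value, pixels + pitch * i + bytes_per_pixel * j, sizeof value);
    return value;
}
```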
I have a function that needs to return a 16 bit unsigned int vector, but for another from which I also call this one, I need the output in 8 bit unsigned int vector format. For example, if I start out with:
std::vector<uint16_t> myVec(640*480);
How might I convert it to the format of:
std::vector<uint8_t> myVec2(640*480*4);
UPDATE (more information):
I am working with libfreenect and its getDepth() method. I have modified it to output a 16-bit unsigned integer vector so that I can retrieve the depth data in millimeters. However, I would also like to display the depth data. I am working with some example C++ code from the freenect installation, which uses GLUT and requires an 8-bit unsigned int vector to display the depth; however, I need the 16-bit version to retrieve the depth in millimeters and log it to a text file. Therefore, I was looking to retrieve the data as a 16-bit unsigned int vector in GLUT's draw function, and then convert it so that I can display it with the GLUT function that's already written.
As per your update, assuming the 8-bit unsigned int is going to be displayed as a gray scale image, what you need is akin to a Brightness Transfer Function. Basically, your output function is looking to map the data to the values 0-255, but you don't necessarily want those to correspond directly to millimeters. What if all of your data was from 0-3mm? Then your image would look almost completely black. What if it was all 300-400mm? Then it'd be completely white because it was clipped to 255.
A rudimentary way to do it would be to find the minimum and maximum values, and do this:
double scale = 255.0 / (double)(maxVal - minVal);
for( size_t i = 0; i < myVec.size(); ++i )
{
    myVec2.at(i) = (uint8_t)((double)(myVec.at(i)-minVal) * scale);
}
depending on the distribution of your data, you might need to do something a little more complex to get the most out of your dynamic range.
Edit: This assumes your glut function is creating an image, if it is using the 8-bit value as an input to a graph then you can disregard.
Edit2: An update after your other update. If you want to fill a 640x480x4 vector, you are clearly building an image. You need to do what I outlined above, but the 4 channels it is looking for are Red, Green, Blue, and Alpha. The Alpha channel should be 255 everywhere (it controls transparency, and you don't want the image to be transparent). As for the other 3: if you set all three channels (red, green, and blue) to the same scaled value from the function above, the pixel will appear as grayscale. For example, if my data ranged from 0-25mm, then for a pixel whose value is 10mm I would set the data to 255/(25-0) * 10 = 102, and therefore the pixel would be (102, 102, 102, 255).
Edit 3: Adding wikipedia link about Brightness Transfer Functions - https://en.wikipedia.org/wiki/Color_mapping
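Putting the whole mapping in one place, here is a self-contained sketch in plain C++ (not the GLUT code itself; depthToGray is a hypothetical helper, and it assumes a non-empty input):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Scale 16-bit depth values into an 8-bit grayscale RGBA buffer:
// each depth sample becomes one (v, v, v, 255) pixel.
std::vector<std::uint8_t> depthToGray(const std::vector<std::uint16_t>& depth)
{
    auto mm = std::minmax_element(depth.begin(), depth.end());
    double minVal = *mm.first;
    double range  = *mm.second - minVal;

    std::vector<std::uint8_t> out(depth.size() * 4);
    for (std::size_t i = 0; i < depth.size(); ++i) {
        // Map [minVal, maxVal] onto [0, 255]; guard against a flat image.
        std::uint8_t v = static_cast<std::uint8_t>(
            range > 0 ? (depth[i] - minVal) * 255.0 / range : 0);
        out[i * 4 + 0] = v;   // red
        out[i * 4 + 1] = v;   // green
        out[i * 4 + 2] = v;   // blue
        out[i * 4 + 3] = 255; // alpha: fully opaque
    }
    return out;
}
```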
How might I convert it to the format of:
std::vector<uint8_t> myVec2; such that myVec2.size() will be twice as
big as myVec.size()?
myVec2.reserve(myVec.size() * 2);
for (auto it = begin(myVec); it!=end(myVec); ++it)
{
uint8_t val = static_cast<uint8_t>(*it); // isolate the low 8 bits
myVec2.push_back(val);
val = static_cast<uint8_t>((*it) >> 8); // isolate the upper 8 bits
myVec2.push_back(val);
}
Or you can change the order of the push_back()s if it matters which byte comes first (the upper or the lower).
Straightforward way:
std::vector<std::uint8_t> myVec2(myVec.size() * 2);
std::memcpy(myVec2.data(), myVec.data(), myVec.size() * sizeof(std::uint16_t));
Note that the byte order within each pair then depends on the host's endianness. The standard-library alternative
std::copy( begin(myVec), end(myVec), begin(myVec2));
is not equivalent: it converts element by element, truncating each 16-bit value to its low 8 bits and filling only half of myVec2.
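To see concretely what the memcpy version produces, here is a self-contained sketch (toBytes is a hypothetical name); within each pair, the byte order follows the host's endianness, unlike the explicit shift-based loop above.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Reinterpret a vector of 16-bit values as raw bytes, two per element.
std::vector<std::uint8_t> toBytes(const std::vector<std::uint16_t>& in)
{
    std::vector<std::uint8_t> out(in.size() * sizeof(std::uint16_t));
    std::memcpy(out.data(), in.data(), out.size());
    return out;
}
```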
Let's start with some code:
QByteArray OpenGLWidget::modifyImage(QByteArray imageArray, const int width, const int height){
if (vertFlip){
/* Each pixel consists of four unsigned chars: Red Green Blue Alpha.
 * The field is normally 640*480, which means the whole picture is in fact 640*4 uChars wide.
 * The whole ByteArray is one-dimensional, which means that index 640*4 is the red of the first pixel of the second row.
 * This function is EXTREMELY SLOW
 */
QByteArray tempArray = imageArray;
for (int h = 0; h < height; ++h){
for (int w = 0; w < width/2; ++w){
for (int i = 0; i < 4; ++i){
imageArray.data()[h*width*4 + 4*w + i] = tempArray.data()[h*width*4 + (4*width - 4*w) + i ];
imageArray.data()[h*width*4 + (4*width - 4*w) + i] = tempArray.data()[h*width*4 + 4*w + i];
}
}
}
}
return imageArray;
}
This is the code I use right now to vertically flip an image which is 640*480 (The image is actually not guaranteed to be 640*480, but it mostly is). The color encoding is RGBA, which means that the total array size is 640*480*4. I get the images with 30 FPS, and I want to show them on the screen with the same FPS.
On an older CPU (Athlon x2) this code is just too much: the CPU is racing to keep up with the 30 FPS, so the question is: can I do this more efficient?
I am also working with OpenGL; does that have a gimmick I am not aware of that can flip images with relatively low CPU/GPU usage?
According to this question, you can flip an image in OpenGL by scaling it by (1,-1,1). This question explains how to do transformations and scaling.
You can improve at least by doing it blockwise, making use of the cache architecture. In your example one of the accesses (either the read OR the write) will be off-cache.
For a start it can help to "capture scanlines" if you're using two loops to loop through the pixels of an image, like so:
for (int y = 0; y < height; ++y)
{
// Capture scanline.
char* scanline = imageArray.data() + y*width*4;
for (int x = 0; x < width/2; ++x)
{
const int flipped_x = width - x-1;
for (int i = 0; i < 4; ++i)
swap(scanline[x*4 + i], scanline[flipped_x*4 + i]);
}
}
Another thing to note is that I used swap instead of a temporary image. That'll tend to be more efficient since you can just swap using registers instead of loading pixels from a copy of the entire image.
But also it generally helps to work with a 32-bit integer instead of one byte at a time if you're going to be doing anything like this. If you're accessing pixels through 8-bit types but know that each pixel is 32 bits, as in your case, you can generally get away with a cast to uint32_t*, e.g.
for (int y = 0; y < height; ++y)
{
uint32_t* scanline = (uint32_t*)imageArray.data() + y*width;
std::reverse(scanline, scanline + width);
}
At this point you might parallelize the y loop. Flipping an image horizontally (it should be "horizontal" if I understood your original code correctly) in this way is a little bit tricky with the access patterns, but you should be able to get quite a decent boost using the above techniques.
I am also working with OpenGL, does that have a gimmick I am not aware
of that can flip images with relatively low CPU/GPU usage?
Naturally the fastest way to flip images is to not touch their pixels at all and just save the flipping for the final part of the pipeline when you render the result. For this you might render a texture in OGL with negative scaling instead of modifying the pixels of a texture.
Another thing that's really useful in video and image processing is to represent an image to process like this for all your image operations:
struct Image32
{
uint32_t* pixels;
int32_t width;
int32_t height;
int32_t x_stride;
int32_t y_stride;
};
The stride fields are what you use to get from one scanline (row) of an image to the next vertically and from one column to the next horizontally. When you use this representation, you can use negative values for the strides and offset the pixels pointer accordingly. You can also use the stride fields to, say, render only every other scanline of an image for fast interactive half-res scanline previews by using y_stride=width*2 and height/=2. You can quarter-res an image by setting the x stride to 2 and the y stride to 2*width and then halving the width and height. And you can render a cropped image without making your blit functions accept a boatload of parameters by just modifying these fields, keeping the y stride at the full image width so that you still get from one row of the cropped section to the next:
// Using the stride representation of Image32, this can now
// blit a cropped source, a horizontally flipped source,
// a vertically flipped source, a source flipped both ways,
// a half-res source, a quarter-res source, a quarter-res
// source that is horizontally flipped and cropped, etc,
// and all without modifying the source image in advance
// or having to accept all kinds of extra drawing parameters.
void blit(int dst_x, int dst_y, Image32 dst, Image32 src);
// We don't have to do things like this (and I think I lost
// some capabilities with this version below but it hurts my
// brain too much to think about what capabilities were lost):
void blit_gross(int dst_x, int dst_y, int dst_w, int dst_h, uint32_t* dst,
int src_x, int src_y, int src_w, int src_h,
const uint32_t* src, bool flip_x, bool flip_y);
By using negative values and passing it to an image operation (ex: a blit operation), the result will naturally be flipped without having to actually flip the image. It'll end up being "drawn flipped", so to speak, just as with the case of using OGL with a negative scaling transformation matrix.
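To make the stride idea concrete, here is a minimal sketch; pixel_at and flip_x_view are hypothetical helpers, not part of any library. The flip becomes a constant-time change of view rather than a pixel copy.

```cpp
#include <cstdint>

struct Image32
{
    std::uint32_t* pixels;
    std::int32_t width;
    std::int32_t height;
    std::int32_t x_stride; // pixels to step to the next column
    std::int32_t y_stride; // pixels to step to the next row
};

// Fetch the pixel at (x, y), honoring the strides.
std::uint32_t pixel_at(const Image32& img, int x, int y)
{
    return img.pixels[y * img.y_stride + x * img.x_stride];
}

// Build a horizontally flipped *view* of an image: no pixels are copied;
// we just start at the last column of each row and step backwards.
Image32 flip_x_view(Image32 img)
{
    img.pixels += (img.width - 1) * img.x_stride;
    img.x_stride = -img.x_stride;
    return img;
}
```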
I have written a volume rendering program that turns some 2d images into a 3d volume that can be rotated around by a user. I need to calculate a normal for each point in the 3d texture (for lighting) by taking the gradient in each direction around the point.
Calculating the normal requires six extra texture accesses within the fragment shader. The program is much faster without these extra texture access, so I am trying to precompute the gradients for each direction (x,y,z) in bytes and store it in the BGA channels of the original texture. My bytes seem to contain the right values when I test on the CPU, but when I get to the shader it comes out looking wrong. It's hard to tell why it fails from the shader, I think it is because some of the gradient values are negative. However, when I specify the texture type as GL_BYTE (as opposed to GL_UNSIGNED_BYTE) it is still wrong, and that screws up how the original texture should look. I can't tell exactly what's going wrong just by rendering the data as colors. What is the right way to put negative values into a texture? How can I know that values are negative when I read from it in the fragment shader?
The following code shows how I run the operation to compute the gradients from a byte array (byte[] all) and then turn it into a byte buffer (ByteBuffer bb) that is read in as a 3d texture. The function toLoc(x,y,z,w,h,l) simply returns (x+w*(y+z*h))*4; it converts 3d subscripts to a 1d index. The image is grayscale, so I discard the gba channels and only use the r channel to hold the original value. The remaining channels (gba) store the gradient.
int pixelDiffxy=5;
int pixelDiffz=1;
int count=0;
Float r=0f;
byte t=r.byteValue();
for(int i=0;i<w;i++){
for(int j=0;j<h;j++){
for(int k=0;k<l;k++){
count+=4;
if(i<pixelDiffxy || i>=w-pixelDiffxy || j<pixelDiffxy || j>=h-pixelDiffxy || k<pixelDiffz || k>=l-pixelDiffz){
//set these all to zero since they are out of bounds
all[toLoc(i,j,k,w,h,l)+1]=t;//green=0
all[toLoc(i,j,k,w,h,l)+2]=t;//blue=0
all[toLoc(i,j,k,w,h,l)+3]=t;//alpha=0
}
else{
int ri=(int)all[toLoc(i,j,k,w,h,l)+0] & 0xff;
//find the values on the sides of this pixel in each direction (use red channel)
int xgrad1=(all[toLoc(i-pixelDiffxy,j,k,w,h,l)])& 0xff;
int xgrad2=(all[toLoc(i+pixelDiffxy,j,k,w,h,l)])& 0xff;
int ygrad1=(all[toLoc(i,j-pixelDiffxy,k,w,h,l)])& 0xff;
int ygrad2=(all[toLoc(i,j+pixelDiffxy,k,w,h,l)])& 0xff;
int zgrad1=(all[toLoc(i,j,k-pixelDiffz,w,h,l)])& 0xff;
int zgrad2=(all[toLoc(i,j,k+pixelDiffz,w,h,l)])& 0xff;
//find the difference between the values on each side and divide by the distance between them
int xgrad=(xgrad1-xgrad2)/(2*pixelDiffxy);
int ygrad=(ygrad1-ygrad2)/(2*pixelDiffxy);
int zgrad=(zgrad1-zgrad2)/(2*pixelDiffz);
Vec3f grad=new Vec3f(xgrad,ygrad,zgrad);
Integer xg=(int) (grad.x);
Integer yg=(int) (grad.y);
Integer zg=(int) (grad.z);
//System.out.println("gs are: "+xg +", "+yg+", "+zg);
byte gby= (byte) (xg.byteValue());//green channel
byte bby= (byte) (yg.byteValue());//blue channel
byte aby= (byte) (zg.byteValue());//alpha channel
//System.out.println("gba is: "+(int)gby +", "+(int)bby+", "+(int)aby);
all[toLoc(i,j,k,w,h,l)+1]=gby;//green
all[toLoc(i,j,k,w,h,l)+2]=bby;//blue
all[toLoc(i,j,k,w,h,l)+3]=aby;//alpha
}
}
}
}
ByteBuffer bb=ByteBuffer.wrap(all);
final GL gl = drawable.getGL();
final GL2 gl2 = gl.getGL2();
final int[] bindLocation = new int[1];
gl.glGenTextures(1, bindLocation, 0);
gl2.glBindTexture(GL2.GL_TEXTURE_3D, bindLocation[0]);
gl2.glPixelStorei(GL.GL_UNPACK_ALIGNMENT, 1);// 1-byte alignment
gl2.glTexParameteri(GL2.GL_TEXTURE_3D, GL.GL_TEXTURE_WRAP_S, GL2.GL_CLAMP);
gl2.glTexParameteri(GL2.GL_TEXTURE_3D, GL.GL_TEXTURE_WRAP_T, GL2.GL_CLAMP);
gl2.glTexParameteri(GL2.GL_TEXTURE_3D, GL2.GL_TEXTURE_WRAP_R, GL2.GL_CLAMP);
gl2.glTexParameteri(GL2.GL_TEXTURE_3D, GL.GL_TEXTURE_MAG_FILTER, GL.GL_LINEAR);
gl2.glTexParameteri(GL2.GL_TEXTURE_3D, GL.GL_TEXTURE_MIN_FILTER, GL.GL_LINEAR);
gl2.glTexEnvf(GL2.GL_TEXTURE_ENV, GL2.GL_TEXTURE_ENV_MODE, GL.GL_REPLACE);
gl2.glTexImage3D( GL2.GL_TEXTURE_3D, 0,GL.GL_RGBA,
w, h, l, 0,
GL.GL_RGBA, GL.GL_UNSIGNED_BYTE, bb );//GL_UNSIGNED_BYTE
Is there a better way to get a large array of signed data into the shader?
gl2.glTexImage3D( GL2.GL_TEXTURE_3D, 0,GL.GL_RGBA,
w, h, l, 0, GL.GL_RGBA, GL.GL_UNSIGNED_BYTE, bb );
Well, there are two ways to go about doing this, depending on how much work you want to do in the shader vs. what OpenGL version you want to limit things to.
The version that requires more shader work also requires a bit more out of your code. See, what you want to do is have your shader take unsigned bytes, then reinterpret them as signed bytes.
The way that this would typically be done is to pass unsigned normalized bytes (as you're doing), which produces floating-point values on the [0, 1] range, then simply expand that range by multiplying by 2 and subtracting 1, yielding numbers on the [-1, 1] range. This means that your uploading code needs to take its [-128, 127] signed bytes and convert them into [0, 255] unsigned bytes by adding 128 to them.
I have no idea how to do this in Java, which does not appear to have an unsigned byte type at all. You can't just pass a 2's complement byte and expect it to work in the shader; that's not going to happen. The byte value -128 would map to the floating-point value 1, which isn't helpful.
If you can manage to convert the data properly as I described above, then your shader access would have to unpack from the [0, 1] range to the [-1, 1] range.
If you have access to GL 3.x, then you can do this quite easily, with no shader changes:
gl2.glTexImage3D( GL2.GL_TEXTURE_3D, 0,GL.GL_RGBA8_SNORM,
w, h, l, 0, GL.GL_RGBA, GL.GL_BYTE, bb );
The _SNORM in the image format means that it is a signed, normalized format. So your bytes on the range [-128, 127] will be mapped to floats on the range [-1, 1]. Exactly what you want.
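The bias trick from the first option can be sketched in plain C++ (on the Java side the equivalent would be adding 128 before storing each byte). biasByte and unpack are hypothetical names; unpack mirrors what the fragment shader would do with the normalized [0, 1] sample.

```cpp
#include <cstdint>

// CPU side: map a signed byte in [-128, 127] to an unsigned byte in [0, 255]
// so it can be uploaded as GL_UNSIGNED_BYTE.
std::uint8_t biasByte(std::int8_t v)
{
    return static_cast<std::uint8_t>(v + 128);
}

// Shader side, sketched on the CPU: the sampler yields s = u / 255.0 in
// [0, 1]; expand it back to [-1, 1] with s * 2 - 1.
double unpack(std::uint8_t u)
{
    return (u / 255.0) * 2.0 - 1.0;
}
```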
I'm using JNI to obtain raw image data in the following format:
The image data is returned in the format of a DATA32 (32 bits) per pixel in a linear array ordered from the top left of the image to the bottom right going from left to right each line. Each pixel has the upper 8 bits as the alpha channel and the lower 8 bits are the blue channel - so a pixel's bits are ARGB (from most to least significant, 8 bits per channel). You must put the data back at some point.
The DATA32 format is essentially an unsigned int in C.
So I obtain an int[] array and then try to create a Buffered Image out of it by
int w = 1920;
int h = 1200;
BufferedImage b = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
int[] f = (new Capture()).capture();
for(int i = 0; i < f.length; i++){
b.setRGB(x, y, f[i]);
}
f is the array with the pixel data.
According to the Java documentation this should work since BufferedImage.TYPE_INT_ARGB is:
Represents an image with 8-bit RGBA color components packed into integer pixels. The image has a DirectColorModel with alpha. The color data in this image is considered not to be premultiplied with alpha. When this type is used as the imageType argument to a BufferedImage constructor, the created image is consistent with images created in the JDK1.1 and earlier releases.
Unless by 8-bit RGBA they mean that all components added together are encoded in 8 bits? But that is impossible.
This code does work, but the image that is produced is not at all like the image that it should produce. There are tonnes of artifacts. Can anyone see something obviously wrong in here?
Note I obtain my pixel data with
imlib_context_set_image(im);
data = imlib_image_get_data();
in my C code, using the library imlib2 with api http://docs.enlightenment.org/api/imlib2/html/imlib2_8c.html#17817446139a645cc017e9f79124e5a2
I'm an idiot.
This is merely a bug.
I forgot to include how I calculate x,y above.
Basically I was using
int x = i%w;
int y = i/h;
in the for loop, which is wrong. Should be
int x = i%w;
int y = i/w;
Can't believe I made this stupid mistake.
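The corrected index math can be checked directly. A tiny sketch (toXY is a hypothetical helper, with w standing for the 1920-pixel capture width):

```cpp
#include <utility>

// Convert a linear index into (x, y) for a row-major image of width w.
// The divisor for y is the width, not the height.
std::pair<int, int> toXY(int i, int w)
{
    return { i % w, i / w };
}
```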