GL_TEXTURE_RECTANGLE_ARB not working with shader and OS X - c++

I've got an OSX app that uses OpenGL. I'm drawing most of my stuff with textures of the type GL_TEXTURE_2D, and everything works fine as long as I stick to GL_TEXTURE_2D. But I need to have a couple of textures of type GL_TEXTURE_RECTANGLE_ARB.
To create a texture of type GL_TEXTURE_RECTANGLE_ARB I do the following:
// 4x4 RGBA texture data
GLubyte TexData[4*4*4] =
{
0x00, 0x00, 0xff, 0xff, 0x00, 0xff, 0x00, 0xff,
0xff, 0x00, 0x00, 0xff, 0x00, 0x00, 0xff, 0xff,
0x00, 0xff, 0x00, 0xff, 0xff, 0x00, 0x00, 0xff,
0x00, 0x00, 0xff, 0xff, 0x00, 0xff, 0x00, 0xff,
0xff, 0x00, 0x00, 0xff, 0x00, 0x00, 0xff, 0xff,
0x00, 0xff, 0x00, 0xff, 0xff, 0x00, 0x00, 0xff,
0x00, 0x00, 0xff, 0xff, 0x00, 0xff, 0x00, 0xff,
0xff, 0x00, 0x00, 0xff, 0x00, 0x00, 0xff, 0xff,
};
GLuint myArbTexture;
glEnable(GL_TEXTURE_RECTANGLE_ARB);
glGenTextures(1, &myArbTexture);
glBindTexture(GL_TEXTURE_RECTANGLE_ARB, myArbTexture);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameterf( GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_MAX_LEVEL, 0 );
glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, GL_RGBA, 4, 4, 0, GL_RGBA, GL_UNSIGNED_BYTE, TexData);
To draw the texture I do the following:
SetupMyShader();
SetupMyMatrices();
glEnable(GL_TEXTURE_RECTANGLE_ARB);
glBindTexture(GL_TEXTURE_RECTANGLE_ARB, myArbTexture);
DrawMyQuads();
My shader is very simple:
void main (void)
{
gl_FragColor = texture2D(Tex, v_texcoords)*u_color;
}
Using the above code, my shader always references the last texture used in:
glBindTexture(GL_TEXTURE_2D, lastTexture)
instead of referencing the texture specified in:
glBindTexture(GL_TEXTURE_RECTANGLE_ARB, myArbTexture);
Some things to make note of:
After every GL call I'm checking for errors with glGetError(), and I don't get any.
If I replace GL_TEXTURE_RECTANGLE_ARB with GL_TEXTURE_2D everything works fine
This is not an issue with uv (st) coordinates. When drawing with GL_TEXTURE_2D my uv are 0->1 and with GL_TEXTURE_RECTANGLE_ARB my uv are 0->texWidth or texHeight
I'm running a pretty new mac that reports GL Version 2.1 NVIDIA-10.2.1 310.41.15f01 so it certainly should support GL_TEXTURE_RECTANGLE_ARB. (I would think anyway)
I've narrowed down the issue enough to be pretty darn sure that when rendering the shader always refers to the previous texture that was bound as GL_TEXTURE_2D. My quad always draws in the right place and with sensible uv coords, it's just that it is referencing the wrong texture.
So, anyone got any guesses what I'm missing? Is there some call that I should be making other than glBindTexture that my shader would need when using GL_TEXTURE_RECTANGLE_ARB?

You will need to update your shader code to use rectangle textures. The uniform needs to be declared as:
uniform sampler2DRect Tex;
and accessed with:
gl_FragColor = texture2DRect(Tex, v_texcoords)*u_color;
Another aspect to keep in mind is that texture coordinates are defined differently for rectangle textures. Instead of the normalized texture coordinates in the range 0.0 to 1.0 used by all other texture types, rectangle textures use non-normalized texture coordinates in the range 0.0 to width and 0.0 to height.
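To illustrate the coordinate difference, here is a minimal sketch of a helper that converts normalized coordinates into the non-normalized ones a rectangle texture expects. The TexCoord struct and the toRectCoords function are hypothetical names made up for this example; they are not part of OpenGL.

```cpp
#include <cassert>

// Hypothetical helper type: a 2D texture coordinate.
struct TexCoord {
    float s;
    float t;
};

// Convert a normalized coordinate (0..1) into the non-normalized
// coordinate (0..width, 0..height) used by rectangle textures.
TexCoord toRectCoords(TexCoord normalized, int texWidth, int texHeight) {
    assert(texWidth > 0 && texHeight > 0);
    return TexCoord{normalized.s * static_cast<float>(texWidth),
                    normalized.t * static_cast<float>(texHeight)};
}
```

For the 4x4 texture in the question, the normalized corner (1.0, 1.0) would become (4.0, 4.0).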

Related

Receiving packets in Arduino

I have a sender that holds a bitmap array representing an image I've downloaded. The sender transmits the pixels to the receiver, which should insert them into another bitmap array and display the picture.
Below is my sender sketch:
static const unsigned char PROGMEM myBitmap[] ={
0x00, 0x00, 0x00, 0x00, 0x11, 0xff, 0xff, 0xff, 0xff, 0xdf, 0x07, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x21, 0xff, 0xff, 0xff, 0xff, 0xc0, 0xc7, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xc1, 0xff, 0xff, 0xff, 0xff, 0xe0, 0xe3, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x01, 0xff, 0xff, 0xff, 0xff, 0xe0, 0x31, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x01, 0xff, 0xff, 0xff, 0xff, 0xf2, 0x71, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x11, 0xff, 0xff, 0xff, 0xff, 0xff, 0xf8, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x31, 0xff, 0xff, 0xc0, 0xff, 0xff, 0xf8, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x31, 0xff, 0xff, 0xc0, 0x3f, 0xff, 0xfc, 0x7f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x71, 0xff, 0xff, 0xe0, 0x1f, 0xff, 0xfe, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x71, 0xff, 0xff, 0xf0, 0x07, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf3, 0xff, 0xff, 0xf0, 0x03, 0xef, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xf0, 0x03, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xf8, 0x01, 0xff, 0xff, 0xfe, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xf0, 0x00, 0xff, 0xff, 0xfc, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xe0, 0x00, 0x7f, 0x8f, 0xf8, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0x80, 0x00, 0x7f, 0x0f, 0xf8, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf9, 0xfe, 0xff, 0x80, 0x00, 0x3c, 0x0f, 0xf0, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xfe, 0xff, 0xc0, 0x00, 0x10, 0x0f, 0xe0, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xfe, 0xff, 0xc0, 0x00, 0x00, 0x0f, 0xe0, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xfe, 0xff, 0xc0, 0x00, 0x00, 0x0f, 0xc0, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xfe, 0x7f, 0xe0, 0x00, 0x00, 0x1f, 0xc0, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xfe, 0x7f, 0xff, 0x38, 0x00, 0xdf, 0x80, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xfe, 0x7f, 0xff, 0xf8, 0x01, 0xbf, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xfe, 0x7f, 0xff, 0xf0, 0x07, 0x7f, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xfe, 0x3f, 0xff, 0xe0, 0x1c, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0x3f, 0xff, 0xc0, 0x39, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0x3f, 0xff, 0x80, 0x3f, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0x3f, 0xff, 0x00, 0x3f, 0xfc, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xfe, 0x00, 0x1f, 0xfd, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xfc, 0x00, 0x1f, 0xf8, 0x0f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xf8, 0x70, 0x1f, 0xf8, 0x0f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xf0, 0xf8, 0x7f, 0xf8, 0x07, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xf1, 0xfc, 0xff, 0xf0, 0x07, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xfd, 0xff, 0xe3, 0xdc, 0xff, 0xf0, 0x1f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xe7, 0xdc, 0xff, 0xe0, 0x38, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xcf, 0xdc, 0xff, 0xe0, 0x70, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf9, 0xff, 0xff, 0xde, 0x0c, 0x7f, 0xe0, 0xe0, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf9, 0xff, 0xff, 0xbf, 0x1e, 0x7d, 0xe3, 0xc0, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf9, 0xff, 0xff, 0xbf, 0x1e, 0xff, 0xc7, 0xc0, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf9, 0xff, 0xff, 0x7f, 0xbe, 0xff, 0xc7, 0x80, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xfe, 0xff, 0xfe, 0xff, 0xcf, 0x80, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xff, 0xfc, 0xff, 0xdf, 0x80, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xfb, 0xff, 0xf8, 0xff, 0x9f, 0x80, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xe7, 0xff, 0xff, 0xff, 0x8f, 0x80, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xef, 0xff, 0xfb, 0xff, 0x87, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xef, 0xff, 0xff, 0xfe, 0x07, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf9, 0xff, 0xef, 0xff, 0xf3, 0xfe, 0x07, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf9, 0xff, 0xff, 0xff, 0xe3, 0xfe, 0x07, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf1, 0xff, 0xff, 0xff, 0xf3, 0xfe, 0xff, 0x07, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf8, 0xff, 0xff, 0xff, 0xf8, 0x7f, 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xf9, 0xff, 0xff, 0xff, 0xf8, 0x3f, 0xff, 0x1f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xb8, 0xff, 0xff, 0xff, 0xf0, 0x0f, 0xff, 0x1f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xb9, 0xff, 0xff, 0xff, 0xf8, 0x0f, 0xfe, 0x3f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x99, 0xff, 0xff, 0xff, 0xf8, 0x07, 0xfc, 0x3f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xd9, 0xff, 0xff, 0xff, 0xfc, 0x06, 0xfc, 0x3f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xd8, 0xff, 0xff, 0xff, 0xfc, 0x07, 0xf8, 0x7f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xd8, 0xff, 0xff, 0xff, 0xfc, 0x03, 0xfc, 0x7f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xd8, 0xff, 0xff, 0xff, 0xfe, 0x03, 0xfc, 0x7f, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xd8, 0xff, 0xff, 0xff, 0xfe, 0x03, 0xf8, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xd9, 0xff, 0xff, 0xff, 0xff, 0x01, 0xf8, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xd9, 0xff, 0xff, 0xff, 0xff, 0x01, 0xf9, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xd8, 0xff, 0xff, 0xff, 0xff, 0x01, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xc8, 0xff, 0xff, 0xff, 0xff, 0x81, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xc9, 0xff, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00
};
void setup() {
//initialize Serial Monitor
Serial.begin(115200);
while (!Serial);
Serial.println("LoRa Sender");
//set up LoRa transceiver module
LoRa.setPins(ss, rst, dio0);
//replace the LoRa.begin(---E-) argument with your location's frequency
//433E6 for Asia
//866E6 for Europe
//915E6 for North America
while (!LoRa.begin(915E6)) {
Serial.println(".");
delay(500);
}
// Change sync word (0xF3) to match the receiver
// The sync word assures you don't get LoRa messages from other LoRa transceivers
// ranges from 0-0xFF
LoRa.setSyncWord(0xF3);
Serial.println("LoRa Initializing OK!");
}
void loop() {
LoRa.beginPacket();
Serial.print("Sending packet");
LoRa.printf("0x%02x\n",myBitmap[i]);
i=i+1;
LoRa.endPacket();
ID=ID+1;
}
Below is my receiver sketch:
void setup() {
//initialize Serial Monitor
Serial.begin(115200);
while (!Serial);
Serial.println("LoRa Receiver");
//setup LoRa transceiver module
LoRa.setPins(ss, rst, dio0);
//replace the LoRa.begin(---E-) argument with your location's frequency
//433E6 for Asia
//866E6 for Europe
//915E6 for North America
while (!LoRa.begin(915E6)) {
Serial.println(".");
delay(500);
}
// Change sync word (0xF3) to match the receiver
// The sync word ensures you don't get LoRa messages from other LoRa transceivers
// ranges from 0-0xFF
LoRa.setSyncWord(0xF3);
Serial.println("LoRa Initializing OK!");
}
void loop() {
char buff[4]={0};
char c;
// try to parse packet
static unsigned char PROGMEM myBitmap [200]={};
int packetSize = LoRa.parsePacket();
uint8_t ix = 0;
if (packetSize) {
char buff[4] = {0};
while (LoRa.available()) {
Serial.print("this is the data");
buff[ix]=(char)LoRa.read();
}
Serial.print("this is the string\n");
Serial.print(finall);
// print RSSI of packet
Serial.print("' with RSSI ");
Serial.println(LoRa.packetRssi());
}
}
The problem is on the receiver: I am unable to create an array and store in it the pixels that the sender successfully transmits.
Any help would be appreciated. Thank you.
Your message-reading loop doesn't increment the index counter:
buff[ix]=(char)LoRa.read();
should be
buff[ix++]=(char)LoRa.read();
In case there may be more than 255 chars in one read, I'd declare ix at least 16-bit:
uint16_t ix = 0;
Also, you are declaring buff twice.
Also, to be safe, are you sure 4 bytes is enough?
There are two issues in your code:
In your receiver code, buff[ix]=(char)LoRa.read(); never increments ix.
You cannot send 1,024 bytes of data. The LoRa chip's buffer is 256 bytes long.
The second point means that you need to split your array in smaller pieces, and implement a transfer protocol with a few key elements:
Frame counter
Data
Checksum
When the sender is done sending a frame (say, 128 bytes), the receiver decodes the frame, saves it, and sends back an ACK of some sort, e.g. ACK xx yy zz nn, where xx is the counter, yy zz the checksum, and nn the number of bytes received. If the sender agrees with that, it moves on to the next frame. If not, it either retries or sends a FAIL signal.
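The framing idea can be sketched as follows. This is only an illustration under assumptions: the frame layout (counter, length, payload, 16-bit additive checksum), the 128-byte limit, and the names checksum16, buildFrame and parseFrame are all hypothetical, and the ACK exchange is left out. It uses std::vector for brevity; on a small microcontroller you would more likely use fixed-size arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed frame layout:
// [counter][length][payload ...][checksum high byte][checksum low byte]
constexpr std::size_t kMaxPayload = 128;

// Simple 16-bit additive checksum over the payload bytes.
std::uint16_t checksum16(const std::uint8_t* data, std::size_t len) {
    std::uint16_t sum = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum = static_cast<std::uint16_t>(sum + data[i]);
    }
    return sum;
}

// Sender side: wrap one slice of the image into a frame.
std::vector<std::uint8_t> buildFrame(std::uint8_t counter,
                                     const std::uint8_t* payload,
                                     std::size_t len) {
    assert(len <= kMaxPayload);
    std::vector<std::uint8_t> frame;
    frame.reserve(len + 4);
    frame.push_back(counter);
    frame.push_back(static_cast<std::uint8_t>(len));
    frame.insert(frame.end(), payload, payload + len);
    const std::uint16_t sum = checksum16(payload, len);
    frame.push_back(static_cast<std::uint8_t>(sum >> 8));
    frame.push_back(static_cast<std::uint8_t>(sum & 0xFF));
    return frame;
}

// Receiver side: check length and checksum, then extract the payload.
// Returns false if the frame is malformed or corrupted.
bool parseFrame(const std::vector<std::uint8_t>& frame,
                std::uint8_t& counter,
                std::vector<std::uint8_t>& payload) {
    if (frame.size() < 4) return false;
    const std::size_t len = frame[1];
    if (frame.size() != len + 4) return false;
    const std::uint16_t received =
        static_cast<std::uint16_t>((frame[len + 2] << 8) | frame[len + 3]);
    if (received != checksum16(frame.data() + 2, len)) return false;
    counter = frame[0];
    payload.assign(frame.begin() + 2, frame.begin() + 2 + static_cast<std::ptrdiff_t>(len));
    return true;
}
```

The receiver would copy each accepted payload to offset counter * 128 of its own bitmap array and acknowledge the frame. An additive checksum is weak; a CRC would catch more errors, at the cost of a little more code.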
If you mean you don't know how to create the array, you can have the sender transmit the size of the array first and then allocate the memory dynamically on the receiver:
unsigned char *arr = (unsigned char *)malloc(size); // cast needed: Arduino sketches compile as C++
Then you can copy the received buffer into it.

How to copy X bytes or bits from an __m128i into standard memory

I have a loop that's adding int16s from two arrays together via _mm_add_epi16(). There's a small array and a large array, the results get written back to the large array.
The intrinsic may get less than 8x int16s (128 bits) from the small array if it's reached its end - how do I store the results of _mm_add_epi16() back into standard memory int16_t* when I don't want all of its 128 bits? Padding the array to power-of-two is not an option. Example:
int16_t* smallArray;
int16_t* largeArray;
__m128i inSmallArray = _mm_load_si128((__m128i*)smallArray);
__m128i* pInLargeArray = (__m128i*)largeArray;
__m128i inLargeArray = _mm_load_si128(pInLargeArray);
inLargeArray = _mm_add_epi16(inLargeArray, inSmallArray);
_mm_store_si128(pInLargeArray, inLargeArray);
My guess is that I need to substitute _mm_store_si128() with a "masked" store somehow.
There is a _mm_maskmoveu_si128 intrinsic, which translates to maskmovdqu (in SSE) or vmaskmovdqu (in AVX).
// Store masks. The highest bit in each byte indicates the byte to store.
alignas(16) const unsigned char masks[16][16] =
{
{ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00 }
};
void store_n(__m128i mm, unsigned int n, void* storage)
{
assert(n < 16u);
_mm_maskmoveu_si128(mm, reinterpret_cast< const __m128i& >(masks[n]), static_cast< char* >(storage));
}
The problem with this code is that maskmovdqu (and, presumably, vmaskmovdqu) instructions have an associated hint for non-temporal access to the target memory, which makes the instruction expensive and also requires a fence afterwards.
AVX adds new instructions vmaskmovps/vmaskmovpd (and AVX2 also adds vpmaskmovd/vpmaskmovq), which work similarly to vmaskmovdqu but do not have the non-temporal hint and only operate on 32 and 64-bit granularity.
// Store masks. The highest bit in each 32-bit element indicates the element to store.
alignas(16) const unsigned char masks[4][16] =
{
{ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
{ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00 }
};
void store_n(__m128i mm, unsigned int n, void* storage)
{
assert(n < 4u);
_mm_maskstore_epi32(static_cast< int* >(storage), reinterpret_cast< const __m128i& >(masks[n]), mm);
}
AVX-512 adds masked stores, and you could use vmovdqu8/vmovdqu16 with an appropriate mask to store 8 or 16-bit elements.
void store_n(__m128i mm, unsigned int n, void* storage)
{
assert(n < 16u);
_mm_mask_storeu_epi8(storage, static_cast< __mmask16 >((1u << n) - 1u), mm);
}
Note that the above requires AVX-512BW and VL extensions.
If you require 8 or 16-bit granularity and don't have AVX-512 then you're better off with a function that manually stores the vector register piece by piece.
void store_n(__m128i mm, unsigned int n, void* storage)
{
assert(n < 16u);
unsigned char* p = static_cast< unsigned char* >(storage);
if (n >= 8u)
{
_mm_storel_epi64(reinterpret_cast< __m128i* >(p), mm);
mm = _mm_unpackhi_epi64(mm, mm); // move high 8 bytes to the low 8 bytes
n -= 8u;
p += 8;
}
if (n >= 4u)
{
std::uint32_t data = _mm_cvtsi128_si32(mm);
std::memcpy(p, &data, sizeof(data)); // typically generates movd
mm = _mm_srli_si128(mm, 4);
n -= 4u;
p += 4;
}
if (n >= 2u)
{
std::uint16_t data = _mm_extract_epi16(mm, 0); // or _mm_cvtsi128_si32
std::memcpy(p, &data, sizeof(data));
mm = _mm_srli_si128(mm, 2);
n -= 2u;
p += 2;
}
if (n > 0u)
{
std::uint32_t data = _mm_cvtsi128_si32(mm);
*p = static_cast< std::uint8_t >(data);
}
}
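For completeness, here is a sketch of a different technique from the masked and piecewise stores above: process the full 8-element vectors directly and route the leftover elements through small zero-padded temporaries, so no load or store ever touches memory past the end of either array. The function name addInto is hypothetical, and the sketch assumes only SSE2 and unaligned input pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <emmintrin.h> // SSE2 intrinsics

// Add `count` int16 values from `small` into `large`, in place.
// Unaligned loads/stores are used, so neither pointer needs 16-byte alignment.
void addInto(std::int16_t* large, const std::int16_t* small, std::size_t count) {
    assert(count == 0 || (large != nullptr && small != nullptr));
    std::size_t i = 0;
    // Full vectors of 8 x int16.
    for (; i + 8 <= count; i += 8) {
        const __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(large + i));
        const __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(small + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(large + i), _mm_add_epi16(a, b));
    }
    // Tail: fewer than 8 elements remain. Copy them into zero-padded
    // temporaries, add there, and copy only the valid elements back.
    const std::size_t rest = count - i;
    if (rest > 0) {
        alignas(16) std::int16_t tmpLarge[8] = {};
        alignas(16) std::int16_t tmpSmall[8] = {};
        std::memcpy(tmpLarge, large + i, rest * sizeof(std::int16_t));
        std::memcpy(tmpSmall, small + i, rest * sizeof(std::int16_t));
        const __m128i sum = _mm_add_epi16(
            _mm_load_si128(reinterpret_cast<const __m128i*>(tmpLarge)),
            _mm_load_si128(reinterpret_cast<const __m128i*>(tmpSmall)));
        _mm_store_si128(reinterpret_cast<__m128i*>(tmpLarge), sum);
        std::memcpy(large + i, tmpLarge, rest * sizeof(std::int16_t));
    }
}
```

Here count would be the number of elements left in the small array. The temporaries cost a few extra copies, but only once per call, and this also avoids reading past the end of the small array, which a masked store alone does not address.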

binary translation for a glPolygonStipple argument

I am trying to learn the language from http://www.glprogramming.com/red/chapter02.html. That site has an example of how to use glPolygonStipple. My understanding is that the hexadecimal numbers in the GLubyte arrays are translated to binary numbers to form a bitmap. I was just wondering how exactly the elements in these arrays make these patterns.
Here is the example from the website on this:
#include <Windows.h>
#include <GL/gl.h>
#include <GL/glut.h>
void display(void)
{
GLubyte fly[] = {
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x03, 0x80, 0x01, 0xC0, 0x06, 0xC0, 0x03, 0x60,
0x04, 0x60, 0x06, 0x20, 0x04, 0x30, 0x0C, 0x20,
0x04, 0x18, 0x18, 0x20, 0x04, 0x0C, 0x30, 0x20,
0x04, 0x06, 0x60, 0x20, 0x44, 0x03, 0xC0, 0x22,
0x44, 0x01, 0x80, 0x22, 0x44, 0x01, 0x80, 0x22,
0x44, 0x01, 0x80, 0x22, 0x44, 0x01, 0x80, 0x22,
0x44, 0x01, 0x80, 0x22, 0x44, 0x01, 0x80, 0x22,
0x66, 0x01, 0x80, 0x66, 0x33, 0x01, 0x80, 0xCC,
0x19, 0x81, 0x81, 0x98, 0x0C, 0xC1, 0x83, 0x30,
0x07, 0xe1, 0x87, 0xe0, 0x03, 0x3f, 0xfc, 0xc0,
0x03, 0x31, 0x8c, 0xc0, 0x03, 0x33, 0xcc, 0xc0,
0x06, 0x64, 0x26, 0x60, 0x0c, 0xcc, 0x33, 0x30,
0x18, 0xcc, 0x33, 0x18, 0x10, 0xc4, 0x23, 0x08,
0x10, 0x63, 0xC6, 0x08, 0x10, 0x30, 0x0c, 0x08,
0x10, 0x18, 0x18, 0x08, 0x10, 0x00, 0x00, 0x08};
GLubyte halftone[] = {
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55,
0xAA, 0xAA, 0xAA, 0xAA, 0x55, 0x55, 0x55, 0x55};
glClear (GL_COLOR_BUFFER_BIT);
glColor3f (1.0, 1.0, 1.0);
/* draw one solid, unstippled rectangle, */
/* then two stippled rectangles */
glRectf (25.0, 25.0, 125.0, 125.0);
glEnable (GL_POLYGON_STIPPLE);
glPolygonStipple (fly);
glRectf (125.0, 25.0, 225.0, 125.0);
glPolygonStipple (halftone);
glRectf (225.0, 25.0, 325.0, 125.0);
glDisable (GL_POLYGON_STIPPLE);
glFlush ();
}
void init (void)
{
glClearColor (0.0, 0.0, 0.0, 0.0);
glShadeModel (GL_FLAT);
}
void reshape (int w, int h)
{
glViewport (0, 0, (GLsizei) w, (GLsizei) h);
glMatrixMode (GL_PROJECTION);
glLoadIdentity ();
gluOrtho2D (0.0, (GLdouble) w, 0.0, (GLdouble) h);
}
int main(int argc, char** argv)
{
glutInit(&argc, argv);
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);
glutInitWindowSize (350, 150);
glutCreateWindow (argv[0]);
init ();
glutDisplayFunc(display);
glutReshapeFunc(reshape);
glutMainLoop();
return 0;
}
Binary is a base 2 number system, which means each digit is a 0 or a 1. This lends itself very well to stipple patterns, because a 0 means "don't draw this pixel", and a 1 means "draw this pixel". The stipple pattern used across a polygon is 2 dimensional, so you have several rows of these 0's and 1's, building up a pattern of pixels.
To be specific, you have 32 rows of 32 binary digits (bits) each.
Unfortunately, you can't enter binary numbers into the source code of C (before C23) or C++ (before C++14), so hexadecimal is commonly used instead. It's a base 16 number system, so each digit can be 0-9 or A-F (the letters A-F represent decimal values 10-15).
The nice thing about it is that each digit neatly corresponds to a pattern of 4 binary digits (or bits). That makes it very easy to convert. Here's how they correspond:
Hex Binary
0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
A 1010
B 1011
C 1100
D 1101
E 1110
F 1111
(If you're not familiar with how numbers are represented in binary, then that might look strange. There should be plenty of tutorials and explanations online though if you want to learn more about the details.)
When you see a hex number such as 0x31, you can firstly ignore the "0x" prefix -- that just indicates that the number is in hexadecimal. To figure out the binary equivalent, just look up the other digits in the table, one at a time, to get the binary equivalent. In this case, it's a 3 followed by a 1, which means the binary pattern is 0011 0001 (without the space).
In a stipple pattern, that means it will leave 2 pixels blank, draw 2 pixels, leave 3 pixels blank, and finally draw 1 pixel.
In the example code you posted, you can see several pairs of hex digits. Each hex pair gives you 8 binary bits (or 1 byte). That means 4 consecutive pairs of hex digits is 32 bits, which is one complete row of the stipple pattern. There are 32 rows in total.
It's worth noting that the example code has slightly confusing formatting. It's showing 8 hex pairs per line of source code. OpenGL doesn't care about that though. It just sees a contiguous array of numbers, which it splits into 32 bits per row.
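If you want to see the pattern for yourself, the lookup described above can be automated. The following is a minimal sketch; the function name stippleRowToString is made up for this example, and it assumes the default unpacking order, where the most significant bit of each byte comes first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Expand one 32-bit stipple row (4 consecutive bytes of a 128-byte
// pattern) into text: '*' for a 1 bit (draw), '.' for a 0 bit (skip).
std::string stippleRowToString(const unsigned char* pattern, std::size_t row) {
    assert(row < 32);
    std::string out;
    out.reserve(32);
    for (std::size_t byteIndex = 0; byteIndex < 4; ++byteIndex) {
        const unsigned char value = pattern[row * 4 + byteIndex];
        // Walk the byte from its highest bit down to its lowest.
        for (int bit = 7; bit >= 0; --bit) {
            out.push_back(((value >> bit) & 1) ? '*' : '.');
        }
    }
    return out;
}
```

Calling it for rows 0 through 31 of the fly or halftone array and printing each string on its own line should reveal the pattern as text.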

Drawing bitmap fonts with OpenGL, what does glRasterPos2i() do?

This is another one of those "I have a blank screen, please help me fix it" moments.
This example is from The OpenGL Programming Guide, Version 2.1, Page 311-312.
The example is supposed to draw 2 lines of text on the screen.
Part of the problem, I think, is that I don't understand how glRasterPos2i() works. Does it:
A:) Set the position of bitmaps to be drawn in the 3D world in homogeneous / "OpenGL coordinates"
B:) Set the position of bitmaps to be drawn on the screen in pixel coordinates
Here is the code I have so far: You can pretty much ignore the first big lump which defines what the bitmaps are.
#include <GL/glut.h>
#include <cstdlib>
#include <iostream>
#include <cstring>
// This first bit is kind of irrelevant, it sets up some fonts in memory as bitmaps
GLubyte space[] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
GLubyte letters[][13] = {
{ 0x00, 0x00, 0xc3, 0xc3, 0xc3, 0xc3, 0xff, 0xc3, 0xc3, 0xc3, 0x66, 0xc3, 0x18 },
{ 0x00, 0x00, 0xfe, 0xc7, 0xc3, 0xc3, 0xc7, 0xfe, 0xc7, 0xc3, 0xc3, 0xc7, 0xfe },
{ 0x00, 0x00, 0x7e, 0xe7, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xe7, 0x7e },
{ 0x00, 0x00, 0xfc, 0xce, 0xc7, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc7, 0xce, 0xfc },
{ 0x00, 0x00, 0xff, 0xc0, 0xc0, 0xc0, 0xc0, 0xfc, 0xc0, 0xc0, 0xc0, 0xc0, 0xff },
{ 0x00, 0x00, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xfc, 0xc0, 0xc0, 0xc0, 0xff },
{ 0x00, 0x00, 0x7e, 0xe7, 0xc3, 0xc3, 0xcf, 0xc0, 0xc0, 0xc0, 0xc0, 0xe7, 0x7e },
{ 0x00, 0x00, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xff, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3 },
{ 0x00, 0x00, 0x7e, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0x7e },
{ 0x00, 0x00, 0x7c, 0xee, 0xc6, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06 },
{ 0x00, 0x00, 0xc3, 0xc6, 0xcc, 0xd8, 0xf0, 0xe0, 0xf0, 0xd8, 0xcc, 0xc6, 0xc3 },
{ 0x00, 0x00, 0xff, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xc0 },
{ 0x00, 0x00, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xdb, 0xff, 0xff, 0xe7, 0xc3 },
{ 0x00, 0x00, 0xc7, 0xc7, 0xcf, 0xcf, 0xdf, 0xdb, 0xfb, 0xf3, 0xf3, 0xe3, 0xe3 },
{ 0x7e, 0xe7, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xe7, 0x7e },
{ 0xc0, 0xc0, 0xc0, 0xc0, 0xc0, 0xfe, 0xc0, 0xf3, 0xc7, 0xc3, 0xc3, 0xc7, 0xfe },
{ 0x00, 0x00, 0x3f, 0x6e, 0xdf, 0xdb, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0x66, 0x3c },
{ 0x00, 0x00, 0xc3, 0xc6, 0xcc, 0xd8, 0xf0, 0xfe, 0xc7, 0xc3, 0xc3, 0xc7, 0xfe },
{ 0x00, 0x00, 0x7e, 0xe7, 0x03, 0x03, 0x07, 0x7e, 0xe0, 0xc0, 0xc0, 0xe7, 0x7e },
{ 0x00, 0x00, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0xff },
{ 0x00, 0x00, 0x7e, 0xe7, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3 },
{ 0x00, 0x00, 0x18, 0x3c, 0x3c, 0x66, 0x66, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3 },
{ 0x00, 0x00, 0xc3, 0xe7, 0xff, 0xff, 0xdb, 0xdb, 0xc3, 0xc3, 0xc3, 0xc3, 0xc3 },
{ 0x00, 0x00, 0xc3, 0x66, 0x66, 0xc3, 0xc3, 0x18, 0xc3, 0xc3, 0x66, 0x66, 0xc3 },
{ 0x00, 0x00, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0xc3, 0xc3, 0x66, 0x66, 0xc3 },
{ 0x00, 0x00, 0xff, 0xc0, 0xc0, 0x60, 0x30, 0x7e, 0x0c, 0x06, 0x03, 0x03, 0xff }
};
// This is just copying from the book
GLuint fontOffset;
void makeRasterFont()
{
GLuint i, j;
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
fontOffset = glGenLists(128);
for(i = 0, j = 'A'; i < 26; i ++, j ++)
{
glNewList(fontOffset + ' ', GL_COMPILE);
glBitmap(8, 13, 0.0, 2.0, 10.0, 0.0, letters[i]);
glEndList();
}
glNewList(fontOffset + ' ', GL_COMPILE);
glBitmap(8, 13, 0.0, 2.0, 10.0, 0.0, space);
glEndList();
}
void init()
{
glShadeModel(GL_FLAT);
makeRasterFont();
}
void printString(char* s)
{
glPushAttrib(GL_LIST_BIT);
glListBase(fontOffset);
glCallLists(std::strlen(s), GL_UNSIGNED_BYTE, (GLubyte*)s);
glPopAttrib();
}
void display()
{
GLfloat white[3] = {1.0, 1.0, 1.0 };
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glColor3fv(white);
// Print some text on the screen at (20,60) and (20,40)
glRasterPos2i(20, 60);
printString("THE QUICK BROWN FOX JUMPS");
glRasterPos2i(20, 40);
printString("OVER A LAZY DOG");
glFlush();
}
void reshape(int w, int h)
{
// Set the viewport
glViewport(0, 0, (GLsizei)w, (GLsizei)h);
// Set viewing mode
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective(45.0, (GLfloat)w / (GLfloat)h, 0.01, 100.0);
glMatrixMode(GL_MODELVIEW);
}
int main(int argc, char** argv)
{
/* Init glut with a single buffer display mode,
* window size, position and title */
glutInit(&argc, argv);
glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB | GLUT_DEPTH);
glutInitWindowSize(500, 500);
glutInitWindowPosition(100, 100);
glutCreateWindow(argv[0]);
// Call init routine to set OpenGL specific initialization values
init();
// Set callback function
glutDisplayFunc(display);
glutReshapeFunc(reshape);
// Enter main loop
glutMainLoop();
return EXIT_SUCCESS;
}
Sorry for the type of question - I hate just asking "please fix my code", because really I should be able to fix it myself. On this occasion I find myself stuck, basically. Thanks for your time and help.
Solution:
For those interested, to "get it to work", the changes made were:
1: Change gluPerspective to gluOrtho2D(0, width, 0, height).
2: Change glNewList(fontOffset + ' ', GL_COMPILE) to glNewList(fontOffset + j, GL_COMPILE) - not BOTH, just the FIRST ONE IN THE LOOP.
3: Set the glRasterPos2i to be anywhere within the region specified by gluOrtho2D. My width and height are both 500, so I used coordinates (20, 60) and then (20, 40).
You could have just left it with gluPerspective, and used coordinates about (0,0) without specifying any transformations. However, since a bitmap is 2D I think this is less intuitive.
As to your rendering problem, here's a hint: you never use j...
In the for loop:
glNewList(fontOffset + ' ', GL_COMPILE);
replace the space character ' ' with the letter you want (j).
The glRasterPos function specifies the raster position in object coordinates. Those are passed through the current modelview and projection matrices (at the time of the glRasterPos-call) to get the actual raster position in window (viewport) coordinates to be used for things like glDrawPixels and glBitmap (thus option A). So given your current perspective projection and identity modelview, those (20,40) (which are probably meant as pixels) are quite off the screen. If you want to specify it in pixels (which is usually the case), you need to setup your transformation pipeline accordingly.
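To make that transformation concrete, here is a minimal sketch of the arithmetic for the simple case of an orthographic projection such as gluOrtho2D(0, width, 0, height), assuming an identity modelview matrix and a viewport that starts at (0, 0). The names WindowPos and rasterToWindow are hypothetical; OpenGL performs this mapping internally.

```cpp
#include <cassert>

struct WindowPos {
    double x;
    double y;
};

// Map object coordinates through an orthographic projection
// (left..right, bottom..top) and then through a viewport of
// viewportW x viewportH pixels whose origin is at (0, 0).
WindowPos rasterToWindow(double objX, double objY,
                         double left, double right,
                         double bottom, double top,
                         double viewportW, double viewportH) {
    assert(right != left && top != bottom);
    // Orthographic projection to normalized device coordinates in [-1, 1].
    const double ndcX = 2.0 * (objX - left) / (right - left) - 1.0;
    const double ndcY = 2.0 * (objY - bottom) / (top - bottom) - 1.0;
    // Viewport transform to window (pixel) coordinates.
    return WindowPos{(ndcX + 1.0) * 0.5 * viewportW,
                     (ndcY + 1.0) * 0.5 * viewportH};
}
```

With left = bottom = 0 and right = top = 500 on a 500x500 viewport, the position (20, 60) lands at pixel (20, 60), which is why object coordinates behave like pixels under that projection. With a projection spanning only -1..1, the same call puts the position thousands of pixels outside the window.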
But I wouldn't recommend using those old and deprecated (and likely slooow) pixel drawing functions at all, nor learning from the unfortunately badly outdated Red Book. Just draw a textured quad with a custom shader that takes window coordinates.

OpenGL and monochrome texture

Is it possible to pump monochrome (graphical data with 1 bit image depth) texture into OpenGL?
I'm currently using this:
glTexImage2D( GL_TEXTURE_2D, 0, 1, game->width, game->height, 0, GL_LUMINANCE, GL_UNSIGNED_BYTE, game->culture[game->phase] );
I'm pumping it with a square array of 8-bit unsigned integers in GL_LUMINANCE mode (one 8-bit channel represents the brightness of all 3 channels and full alpha), but it is IMO vastly inefficient, because the only values in the array are 0x00 and 0xFF.
Can I (and how) use simple one-bit per pixel array of booleans instead somehow? The excessive array size slows down any other operations on the array :(
After some research, I was able to render the 1-bit per pixel image as a texture with the following code:
static GLubyte smiley[] = /* 16x16 smiley face */
{
0x03, 0xc0, /*       ****       */
0x0f, 0xf0, /*     ********     */
0x1e, 0x78, /*    ****  ****    */
0x39, 0x9c, /*   ***  **  ***   */
0x77, 0xee, /*  *** ****** ***  */
0x6f, 0xf6, /*  ** ******** **  */
0xff, 0xff, /* **************** */
0xff, 0xff, /* **************** */
0xff, 0xff, /* **************** */
0xff, 0xff, /* **************** */
0x73, 0xce, /*  ***  ****  ***  */
0x73, 0xce, /*  ***  ****  ***  */
0x3f, 0xfc, /*   ************   */
0x1f, 0xf8, /*    **********    */
0x0f, 0xf0, /*     ********     */
0x03, 0xc0  /*       ****       */
};
float index[] = {0.0, 1.0};
glPixelStorei(GL_UNPACK_ALIGNMENT,1);
glPixelMapfv(GL_PIXEL_MAP_I_TO_R, 2, index);
glPixelMapfv(GL_PIXEL_MAP_I_TO_G, 2, index);
glPixelMapfv(GL_PIXEL_MAP_I_TO_B, 2, index);
glPixelMapfv(GL_PIXEL_MAP_I_TO_A, 2, index);
glTexImage2D(GL_TEXTURE_2D,0,GL_RGBA,16,16,0,GL_COLOR_INDEX,GL_BITMAP,smiley);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
The smallest uncompressed texture-format for luminance images uses 8 bits per pixel.
However, 1 bit per pixel images can be compressed without loss to the S3TC (DXT) formats. This will still not be 1 bit per pixel: DXT1, the most compact of them, uses 4 bits per pixel.
If you really need 1 bit per pixel you can do so with a little trick. Load 8 1 bit per pixel textures as one 8 bit Alpha-only texture (image 1 gets loaded into bit 1, image 2 into bit 2 and so on). Once you've done that you can "address" each of the sub-textures using the alpha-test feature and a bit of texture environment programming to turn alpha into a color.
This will of course only work if you have eight 1-bit-per-pixel textures, and it is tricky to get right.
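The packing half of that trick can be sketched on the CPU side like this. The function name packBitPlanes is hypothetical, the input images are assumed to hold one byte (0 or 1) per pixel, and bits are numbered from 0 here rather than from 1. The alpha-test and texture environment setup that selects a plane when drawing is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack up to eight 1-bit images into a single 8-bit-per-pixel buffer.
// Image k ends up in bit k of every output byte.
std::vector<std::uint8_t> packBitPlanes(
        const std::vector<std::vector<std::uint8_t>>& images,
        std::size_t pixelCount) {
    assert(images.size() <= 8);
    std::vector<std::uint8_t> packed(pixelCount, 0);
    for (std::size_t k = 0; k < images.size(); ++k) {
        assert(images[k].size() == pixelCount);
        for (std::size_t p = 0; p < pixelCount; ++p) {
            if (images[k][p] != 0) {
                packed[p] = static_cast<std::uint8_t>(packed[p] | (1u << k));
            }
        }
    }
    return packed;
}
```

The resulting buffer could then be uploaded as a single 8-bit alpha texture, which averages out to 1 bit per pixel per image when all eight planes are used.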