Transparent spectrogram selection overlays - c++

I'm trying to create transparent selection overlays on top of a spectrogram, but the result is not really satisfactory. In contrast, the overlays painted on top of a waveform work well, but I need to support both the waveform and the spectrogram view (and maybe other views in the future).
The selection overlay works fine in the waveform view.
Here's the selection overlay in the spectrogram view (the selection looks really bad and obscures parts of the spectrogram).
The code (VCL) is the same for both views:
void TWaveDisplayContainer::DrawSelectedRegion()
{
    if (selRange.selStart.x == selRange.selEnd.x)
    {
        DrawCursorPosition(selRange.selStart.x);
        return;
    }
    Graphics::TBitmap *pWaveBmp = eContainerView == WAVEFORM ? pWaveBmpLeft : pSfftBmpLeft;
    TRect selRect(selRange.selStart.x, 0, selRange.selEnd.x, pWaveLeft->Height);
    TCanvas *pCanvas = pWaveLeft->Canvas;
    int copyMode = pCanvas->CopyMode;

    pCanvas->Draw(0, 0, pWaveBmp);
    pCanvas->Brush->Color = clActiveBorder;
    pCanvas->CopyMode = cmSrcAnd;
    pCanvas->Rectangle(selRect);
    pCanvas->CopyRect(selRect, pWaveBmp->Canvas, selRect);
    pCanvas->CopyMode = copyMode;

    if (numChannels == 2)
    {
        TCanvas *pOtherCanvas = pWaveRight->Canvas;
        pWaveBmp = eContainerView == WAVEFORM ? pWaveBmpRight : pSfftBmpRight;
        pOtherCanvas->Draw(0, 0, pWaveBmp);
        pOtherCanvas->Brush->Color = clActiveBorder;
        pOtherCanvas->CopyMode = cmSrcAnd;
        pOtherCanvas->Rectangle(selRect);
        pOtherCanvas->CopyRect(selRect, pWaveBmp->Canvas, selRect);
        pOtherCanvas->CopyMode = copyMode;
    }
}
So I'm using the cmSrcAnd copy mode and the CopyRect method to do the actual painting/drawing (TCanvas corresponds to a device context, i.e. an HDC on Windows). I think that, since a spectrogram, unlike a waveform, doesn't really have a single background colour, simple mixing copy modes aren't going to work well in most cases.
Note that I can still accomplish what I want, but that would require messing with the individual pixels, which is something I'd like to avoid if possible.
I'm basically looking for an API able to do this (VCL wraps GDI, so even plain WINAPI is fine).
Any help is much appreciated.

I'm going to answer my own question, and hopefully this will prove useful to some people. Since there is apparently no way to achieve this in either plain VCL or WINAPI (except in some situations), I've written a simple function that blends a bitmap (32 bpp / 24 bpp) with an overlay colour (any colour).
The actual result also depends on the weights (w0, w1) given to the red, green and blue components of an individual pixel. Changing these will produce an overlay that leans more toward the spectrogram colour or the overlay colour, respectively.
The code:
Graphics::TBitmap *TSelectionOverlay::GetSelectionOverlay(Graphics::TBitmap *pBmp, TColor selColour,
                                                          TRect &rect, EChannel eChannel)
{
    Graphics::TBitmap *pSelOverlay = eChannel == LEFT ? pSelOverlayLeft : pSelOverlayRight;
    const unsigned cGreenShift = 8;
    const unsigned cBlueShift = 16;
    const unsigned overlayWidth = abs(rect.right - rect.left);
    const unsigned overlayHeight = abs(rect.bottom - rect.top);
    pSelOverlay->Width = pBmp->Width;
    pSelOverlay->Height = pBmp->Height;
    const unsigned startOffset = rect.right > rect.left ? rect.left : rect.right;
    pSelOverlay->Assign(pBmp);

    unsigned char cRed0, cGreen0, cBlue0, cRed1, cGreen1, cBlue1, bRedColor0, bGreenColor0, bBlueColor0;
    cBlue0 = selColour >> cBlueShift;
    cGreen0 = (selColour >> cGreenShift) & 0xFF;
    cRed0 = selColour & 0xFF;

    // These weights influence the appearance of the overlay (here we use 50/50)
    const float w0 = 0.5f;
    const float w1 = 0.5f;

    unsigned *pPixel;
    for (unsigned i = 0; i < overlayHeight; i++)
    {
        pPixel = (unsigned*)pSelOverlay->ScanLine[i]; // provides access to the pixel array
        for (unsigned j = 0; j < overlayWidth; j++)
        {
            unsigned pixel = pPixel[startOffset + j];
            cBlue1 = pixel >> cBlueShift;
            cGreen1 = (pixel >> cGreenShift) & 0xFF;
            cRed1 = pixel & 0xFF;
            // Blend the current bitmap pixel with the overlay colour
            bRedColor0 = cRed0*w0 + cRed1*w1;
            bGreenColor0 = cGreen0*w0 + cGreen1*w1;
            bBlueColor0 = cBlue0*w0 + cBlue1*w1;
            pPixel[startOffset + j] = (bBlueColor0 << cBlueShift) | (bGreenColor0 << cGreenShift) | bRedColor0;
        }
    }
    return pSelOverlay;
}
Note that, for some reason, CopyRect used with a CopyMode value of cmSrcCopy didn't work well, so I used Draw instead.
pCanvas->CopyMode = cmSrcCopy;
pCanvas->CopyRect(dstRect, pSelOverlay->Canvas, srcRect); // this still didn't work well--possibly a bug
so I used
pCanvas->Draw(0,0, pSelOverlay);
The result

Related

Seamless Textures in Voxel Worlds

I've been working on a Minecraft clone recently, and I've been able to generate simple infinite worlds with some noise for height maps etc., but the problem I'm facing is the textures. As you can see in the image below, the textures have some kind of border; they aren't seamless. I use a sprite sheet to send a single texture to the GPU and then use different texture coordinates for different block types. I'm using Vulkan as the rendering backend; here are some texturing details. I would really appreciate some insight on how to tackle this problem.
VkSamplerCreateInfo sInfo{};
sInfo.sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO;
sInfo.pNext = nullptr;
sInfo.magFilter = VK_FILTER_NEAREST;
sInfo.minFilter = VK_FILTER_LINEAR;
sInfo.addressModeU = VK_SAMPLER_ADDRESS_MODE_REPEAT;
sInfo.addressModeV = VK_SAMPLER_ADDRESS_MODE_REPEAT;
sInfo.addressModeW = VK_SAMPLER_ADDRESS_MODE_REPEAT;
sInfo.anisotropyEnable = VK_TRUE;
VkPhysicalDeviceProperties Props;
vkGetPhysicalDeviceProperties(Engine::Get().GetGpuHandle(), &Props);
sInfo.maxAnisotropy = Props.limits.maxSamplerAnisotropy;
sInfo.borderColor = VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK;
sInfo.unnormalizedCoordinates = VK_FALSE;
sInfo.compareEnable = VK_FALSE;
sInfo.compareOp = VK_COMPARE_OP_ALWAYS;
sInfo.mipmapMode = VK_SAMPLER_MIPMAP_MODE_LINEAR;
sInfo.mipLodBias = 0.f;
sInfo.minLod = 0.f;
sInfo.maxLod = 1;
VkImageCreateInfo iInfo{};
iInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
iInfo.pNext = nullptr;
iInfo.arrayLayers = 1;
iInfo.extent = { extent.width,extent.height,1 };
iInfo.imageType = VK_IMAGE_TYPE_2D;
iInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
iInfo.mipLevels = 1;
iInfo.samples = samples;
iInfo.flags = 0;
iInfo.tiling = VK_IMAGE_TILING_OPTIMAL;
iInfo.usage = VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT;
iInfo.format = format;
iInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
VkImageViewCreateInfo vInfo{};
vInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
vInfo.pNext = nullptr;
vInfo.image = m_Image;
vInfo.viewType = VK_IMAGE_VIEW_TYPE_2D;
vInfo.format = format;
vInfo.flags = 0;
vInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
vInfo.subresourceRange.baseMipLevel = 0;
vInfo.subresourceRange.levelCount = 1;
vInfo.subresourceRange.baseArrayLayer = 0;
vInfo.subresourceRange.layerCount = 1;
First I thought it was an issue with my sprite sheet, so I tried different ones, but that wasn't the problem it seems. Then I tried several other sampler parameter combinations, but still no luck.
The issue you are facing with texture borders is a common problem when using texture atlases (sprite sheets) in games. The reason for this is that texture coordinates are not always perfectly aligned with the pixel boundaries of the texture, which can result in sampling from adjacent pixels and introducing seams or artifacts.
There are several techniques that can be used to address this issue, some of which are:
1. Texture coordinate offsetting
2. Texture atlasing
In your code, it seems that you are using texture wrapping with the VK_SAMPLER_ADDRESS_MODE_REPEAT mode, which can exacerbate the issue of texture bleeding. One way to mitigate this is to use a border color that matches the texture's edge pixels to reduce the visible seams. You can set the border color using the sInfo.borderColor member of the VkSamplerCreateInfo structure.
Overall, it may be helpful to experiment with different combinations of texture padding, coordinate offsetting, and texture atlasing to find the best solution for your specific case. Additionally, you may want to consider using a tool such as TexturePacker or Sprite Sheet Packer to generate optimized texture atlases with padding and other optimizations to reduce visible seams.

Image to ASCII art conversion

Prologue
This subject pops up here on Stack Overflow from time to time, but it usually gets removed for being a poorly written question. I have seen many such questions and then silence from the OP (usually low rep) when additional information is requested. From time to time, if the input is good enough for me, I decide to respond with an answer, and it usually gets a few up-votes per day while active, but then after a few weeks the question gets removed/deleted and everything starts from the beginning. So I decided to write this Q&A so that I can reference such questions directly without rewriting the answer over and over again …
Another reason is also this meta thread targeted at me so if you got additional input, feel free to comment.
Question
How can I convert a bitmap image to ASCII art using C++?
Some constraints:
gray scale images
using mono-spaced fonts
keeping it simple (not using too advanced stuff for beginner level programmers)
Here is a related Wikipedia page ASCII art (thanks to #RogerRowland).
Here is a similar maze-to-ASCII-art conversion Q&A.
There are more approaches for image to ASCII art conversion which are mostly based on using mono-spaced fonts. For simplicity, I stick only to basics:
Pixel/area intensity based (shading)
This approach handles each pixel, or an area of pixels, as a single dot. The idea is to compute the average gray-scale intensity of this dot and then replace it with a character whose intensity is close enough to the computed one. For that we need a list of usable characters, each with a precomputed intensity; let's call it a character map. To choose more quickly which character is best for which intensity, there are two ways:
Linearly distributed intensity character map
So we use only characters which have an intensity difference with the same step. In other words, when sorted ascending then:
intensity_of(map[i])=intensity_of(map[i-1])+constant;
Also, when our character map is sorted, we can compute the character directly from the intensity (no search needed):
character = map[intensity_of(dot)/constant];
Arbitrary distributed intensity character map
So we have an array of usable characters and their intensities, and we need to find the intensity closest to intensity_of(dot). If we sort the map[], we can use binary search; otherwise we need an O(n) minimum-distance search loop or an O(1) dictionary. Sometimes, for simplicity, the character map[] can be treated as linearly distributed, causing a slight gamma distortion, usually unseen in the result unless you know what to look for.
Intensity-based conversion is also great for gray-scale images (not just black and white). If you select the dot as a single pixel, the result gets large (one pixel -> one character), so for larger images an area (a multiple of the font size) is selected instead, to preserve the aspect ratio and not enlarge too much.
How to do it:
Evenly divide the image into (gray-scale) pixels or (rectangular) areas dots
Compute the intensity of each pixel/area
Replace it by character from character map with the closest intensity
As the character map you can use any characters, but the result gets better if the pixels of each character are dispersed evenly over the character area. For starters you can use:
char map[]=" .,:;ox%##"; // 10 characters plus the terminator
sorted descending and treated as linearly distributed.
So if intensity of pixel/area is i = <0-255> then the replacement character will be
map[(255-i)*10/256];
If i==0 then the pixel/area is black, if i==127 then the pixel/area is gray, and if i==255 then the pixel/area is white. You can experiment with different characters inside map[] ...
Here is an ancient example of mine in C++ and VCL:
AnsiString m = " .,:;ox%##";
Graphics::TBitmap *bmp = new Graphics::TBitmap;
bmp->LoadFromFile("pic.bmp");
bmp->HandleType = bmDIB;
bmp->PixelFormat = pf24bit;

int x, y, i, l;
BYTE *p;
AnsiString s, endl;
endl = char(13); endl += char(10);
l = m.Length();
s = "";
for (y=0; y<bmp->Height; y++)
{
    p = (BYTE*)bmp->ScanLine[y];
    for (x=0; x<bmp->Width; x++)
    {
        i  = p[x+x+x+0];    // sum the B, G, R bytes of pixel x
        i += p[x+x+x+1];
        i += p[x+x+x+2];
        i  = (i*l)/768;     // scale 0..765 down to 0..l-1
        s += m[l-i];        // AnsiString is indexed from 1
    }
    s += endl;
}
mm_log->Lines->Text = s;
mm_log->Lines->SaveToFile("pic.txt");
delete bmp;
You need to replace/ignore VCL stuff unless you use the Borland/Embarcadero environment.
mm_log is the memo where the text is outputted
bmp is the input bitmap
AnsiString is a VCL type string indexed from 1, not from 0 as char*!!!
This is the result: Slightly NSFW intensity example image
On the left is the ASCII art output (font size 5 pixels), and on the right the input image zoomed a few times. As you can see, the output is large (pixel -> character). If you use larger areas instead of pixels, then the zoom is smaller, but of course the output is less visually pleasing. This approach is very easy and fast to code/process.
When you add more advanced things like:
automated map computations
automatic pixel/area size selection
aspect ratio corrections
Then you can process more complex images with better results:
Here is the result in a 1:1 ratio (zoom to see the characters):
Of course, for area sampling you lose the small details. This is an image of the same size as the first example sampled with areas:
Slightly NSFW intensity advanced example image
As you can see, this is more suited for bigger images.
Character fitting (hybrid between shading and solid ASCII art)
This approach tries to replace each area (no more single-pixel dots) with a character of similar intensity and shape. This leads to better results, even with bigger fonts, in comparison with the previous approach. On the other hand, this approach is of course a bit slower. There are more ways to do this, but the main idea is to compute the difference (distance) between an image area (dot) and a rendered character. You can start with the naive sum of absolute differences between pixels, but that will not lead to very good results, because even a one-pixel shift makes the distance big. Instead you can use correlation or other metrics. The overall algorithm is almost the same as in the previous approach:
Evenly divide the image into (gray-scale) rectangular areas (dots),
ideally with the same aspect ratio as the rendered font characters (this preserves the aspect ratio; do not forget that characters usually overlap a bit on the x-axis)
Compute the intensity of each area (dot)
Replace it by a character from the character map with the closest intensity/shape
How can we compute the distance between a character and a dot? That is the hardest part of this approach. While experimenting, I developed this compromise between speed, quality, and simplicity:
Divide character area to zones
Compute a separate intensity for left, right, up, down, and center zone of each character from your conversion alphabet (map).
Normalize all intensities so they are independent of area size, i=(i*256)/(xs*ys).
Process the source image in rectangle areas
(with the same aspect ratio as the target font)
For each area, compute the intensity in the same manner as in bullet #1
Find the closest match from intensities in the conversion alphabet
Output the fitted character
This is the result for font size = 7 pixels
As you can see, the output is visually pleasing, even with a bigger font size used (the previous approach example was with a 5 pixel font size). The output is roughly the same size as the input image (no zoom). The better results are achieved because the characters are closer to the original image, not only by intensity, but also by overall shape, and therefore you can use larger fonts and still preserve details (up to a point of course).
Here is the complete code for the VCL-based conversion application:
//---------------------------------------------------------------------------
#include <vcl.h>
#pragma hdrstop
#include "win_main.h"
//---------------------------------------------------------------------------
#pragma package(smart_init)
#pragma resource "*.dfm"
TForm1 *Form1;
Graphics::TBitmap *bmp=new Graphics::TBitmap;
//---------------------------------------------------------------------------
class intensity
{
public:
char c; // Character
int il, ir, iu ,id, ic; // Intensity of part: left,right,up,down,center
intensity() { c=0; reset(); }
void reset() { il=0; ir=0; iu=0; id=0; ic=0; }
void compute(DWORD **p,int xs,int ys,int xx,int yy) // p source image, (xs,ys) area size, (xx,yy) area position
{
int x0 = xs>>2, y0 = ys>>2;
int x1 = xs-x0, y1 = ys-y0;
int x, y, i;
reset();
for (y=0; y<ys; y++)
for (x=0; x<xs; x++)
{
i = (p[yy+y][xx+x] & 255);
if (x<=x0) il+=i;
if (x>=x1) ir+=i;
if (y<=y0) iu+=i;
if (y>=y1) id+=i;
if ((x>=x0) && (x<=x1) &&
(y>=y0) && (y<=y1))
ic+=i;
}
// Normalize
i = xs*ys;
il = (il << 8)/i;
ir = (ir << 8)/i;
iu = (iu << 8)/i;
id = (id << 8)/i;
ic = (ic << 8)/i;
}
};
//---------------------------------------------------------------------------
AnsiString bmp2txt_big(Graphics::TBitmap *bmp,TFont *font) // Character sized areas
{
int i, i0, d, d0;
int xs, ys, xf, yf, x, xx, y, yy;
DWORD **p = NULL,**q = NULL; // Bitmap direct pixel access
Graphics::TBitmap *tmp; // Temporary bitmap for single character
AnsiString txt = ""; // Output ASCII art text
AnsiString eol = "\r\n"; // End of line sequence
intensity map[97]; // Character map
intensity gfx;
// Input image size
xs = bmp->Width;
ys = bmp->Height;
// Output font size
xf = font->Size;   if (xf<0) xf = -xf;
yf = font->Height; if (yf<0) yf = -yf;
for (;;) // Loop to simplify the dynamic allocation error handling
{
// Allocate and initialise buffers
tmp = new Graphics::TBitmap;
if (tmp==NULL)
break;
// Allow 32 bit pixel access as DWORD/int pointer
tmp->HandleType = bmDIB; bmp->HandleType = bmDIB;
tmp->PixelFormat = pf32bit; bmp->PixelFormat = pf32bit;
// Copy target font properties to tmp
tmp->Canvas->Font->Assign(font);
tmp->SetSize(xf, yf);
tmp->Canvas->Font ->Color = clBlack;
tmp->Canvas->Pen ->Color = clWhite;
tmp->Canvas->Brush->Color = clWhite;
xf = tmp->Width;
yf = tmp->Height;
// Direct pixel access to bitmaps
p = new DWORD*[ys];
if (p == NULL) break;
for (y=0; y<ys; y++)
p[y] = (DWORD*)bmp->ScanLine[y];
q = new DWORD*[yf];
if (q == NULL) break;
for (y=0; y<yf; y++)
q[y] = (DWORD*)tmp->ScanLine[y];
// Create character map
for (x=0, d=32; d<128; d++, x++)
{
map[x].c = char(DWORD(d));
// Clear tmp
tmp->Canvas->FillRect(TRect(0, 0, xf, yf));
// Render tested character to tmp
tmp->Canvas->TextOutA(0, 0, map[x].c);
// Compute intensity
map[x].compute(q, xf, yf, 0, 0);
}
map[x].c = 0;
// Loop through the image by zoomed character size step
xf -= xf/3; // Characters are usually overlapping by 1/3
xs -= xs % xf;
ys -= ys % yf;
for (y=0; y<ys; y+=yf, txt += eol)
for (x=0; x<xs; x+=xf)
{
// Compute intensity
gfx.compute(p, xf, yf, x, y);
// Find the closest match in map[]
i0 = 0; d0 = -1;
for (i=0; map[i].c; i++)
{
d = abs(map[i].il-gfx.il) +
abs(map[i].ir-gfx.ir) +
abs(map[i].iu-gfx.iu) +
abs(map[i].id-gfx.id) +
abs(map[i].ic-gfx.ic);
if ((d0<0)||(d0>d)) {
d0=d; i0=i;
}
}
// Add fitted character to output
txt += map[i0].c;
}
break;
}
// Free buffers
if (tmp) delete tmp;
if (p ) delete[] p;
return txt;
}
//---------------------------------------------------------------------------
AnsiString bmp2txt_small(Graphics::TBitmap *bmp) // pixel sized areas
{
AnsiString m = " `'.,:;i+o*%&$##"; // Constant character map
int x, y, i, c, l;
BYTE *p;
AnsiString txt = "", eol = "\r\n";
l = m.Length();
bmp->HandleType = bmDIB;
bmp->PixelFormat = pf32bit;
for (y=0; y<bmp->Height; y++)
{
p = (BYTE*)bmp->ScanLine[y];
for (x=0; x<bmp->Width; x++)
{
i = p[(x<<2)+0];
i += p[(x<<2)+1];
i += p[(x<<2)+2];
i = (i*l)/768;
txt += m[l-i];
}
txt += eol;
}
return txt;
}
//---------------------------------------------------------------------------
void update()
{
int x0, x1, y0, y1, i, l;
x0 = bmp->Width;
y0 = bmp->Height;
if ((x0<64)||(y0<64)) Form1->mm_txt->Text = bmp2txt_small(bmp);
else Form1->mm_txt->Text = bmp2txt_big (bmp, Form1->mm_txt->Font);
Form1->mm_txt->Lines->SaveToFile("pic.txt");
for (x1=0, i=1, l=Form1->mm_txt->Text.Length(); i<=l; i++)
    if (Form1->mm_txt->Text[i] == 13) { x1 = i-1; break; }
for (y1=0, i=1, l=Form1->mm_txt->Text.Length(); i<=l; i++)
    if (Form1->mm_txt->Text[i] == 13) y1++;
x1 *= abs(Form1->mm_txt->Font->Size);
y1 *= abs(Form1->mm_txt->Font->Height);
if (y0<y1) y0 = y1;
x0 += x1 + 48;
Form1->ClientWidth = x0;
Form1->ClientHeight = y0;
Form1->Caption = AnsiString().sprintf("Picture -> Text (Font %ix%i)", abs(Form1->mm_txt->Font->Size), abs(Form1->mm_txt->Font->Height));
}
//---------------------------------------------------------------------------
void draw()
{
Form1->ptb_gfx->Canvas->Draw(0, 0, bmp);
}
//---------------------------------------------------------------------------
void load(AnsiString name)
{
bmp->LoadFromFile(name);
bmp->HandleType = bmDIB;
bmp->PixelFormat = pf32bit;
Form1->ptb_gfx->Width = bmp->Width;
Form1->ClientHeight = bmp->Height;
Form1->ClientWidth = (bmp->Width << 1) + 32;
}
//---------------------------------------------------------------------------
__fastcall TForm1::TForm1(TComponent* Owner):TForm(Owner)
{
load("pic.bmp");
update();
}
//---------------------------------------------------------------------------
void __fastcall TForm1::FormDestroy(TObject *Sender)
{
delete bmp;
}
//---------------------------------------------------------------------------
void __fastcall TForm1::FormPaint(TObject *Sender)
{
draw();
}
//---------------------------------------------------------------------------
void __fastcall TForm1::FormMouseWheel(TObject *Sender, TShiftState Shift, int WheelDelta, TPoint &MousePos, bool &Handled)
{
int s = abs(mm_txt->Font->Size);
if (WheelDelta<0) s--;
if (WheelDelta>0) s++;
mm_txt->Font->Size = s;
update();
}
//---------------------------------------------------------------------------
It is a simple form application (Form1) with a single TMemo mm_txt in it. It loads an image, "pic.bmp", and then according to the resolution chooses which approach to use to convert it to text, which is saved to "pic.txt" and sent to the memo to visualize.
For those without VCL, ignore the VCL stuff and replace AnsiString with any string type you have, and also the Graphics::TBitmap with any bitmap or image class you have at disposal with pixel access capability.
A very important note is that this uses the settings of mm_txt->Font, so make sure you set:
Font->Pitch = fpFixed
Font->Charset = OEM_CHARSET
Font->Name = "System"
to make this work properly, otherwise the font will not be handled as mono-spaced. The mouse wheel just changes the font size up/down to see results on different font sizes.
[Notes]
See Word Portraits visualization
Use a language with bitmap/file access and text output capabilities
I strongly recommend starting with the first approach, as it is very easy, straightforward and simple, and only then moving on to the second (which can be done as a modification of the first, so most of the code stays as is anyway).
It is a good idea to compute with inverted intensity (black pixels have the maximum value), because the standard text preview is on a white background, which leads to much better results.
You can experiment with the size, count, and layout of the subdivision zones, or use some grid like 3x3 instead.
Comparison
Finally here is a comparison between the two approaches on the same input:
The green dot marked images are done with approach #2 and the red ones with #1, all on a six-pixel font size. As you can see on the light bulb image, the shape-sensitive approach is much better (even if the #1 is done on a 2x zoomed source image).
Cool application
While reading today's new questions, I got an idea for a cool application that grabs a selected region of the desktop, continuously feeds it to the ASCII-art converter, and displays the result. After an hour of coding it was done, and I am so satisfied with the result that I simply must add it here.
OK the application consists of just two windows. The first master window is basically my old convertor window without the image selection and preview (all the stuff above is in it). It has just the ASCII preview and conversion settings. The second window is an empty form with transparent inside for the grabbing area selection (no functionality whatsoever).
Now on a timer, I just grab the selected area by the selection form, pass it to conversion, and preview the ASCIIart.
So you enclose an area you want to convert by the selection window and view the result in master window. It can be a game, viewer, etc. It looks like this:
So now I can watch even videos in ASCIIart for fun. Some are really nice :).
If you want to try to implement this in GLSL, take a look at this:
Convert floating-point numbers to decimal digits in GLSL?

How to most efficiently modify R / G / B values?

So I wanted to implement lighting in my pixel-based rendering system. I googled and found out that to display R/G/B values lighter or darker, I have to multiply each red, green and blue value by a number < 1 to display it darker and by a number > 1 to display it lighter.
So I implemented it like this, but it's really dragging down my performance, since I have to do this for each pixel:
void PixelRenderer::applyLight(Uint32& color){
    Uint32 alpha = color >> 24;
    alpha = alpha << 24;
    alpha = alpha >> 24;
    Uint32 red = color >> 16;
    red = red << 24;
    red = red >> 24;
    Uint32 green = color >> 8;
    green = green << 24;
    green = green >> 24;
    Uint32 blue = color;
    blue = blue << 24;
    blue = blue >> 24;
    red = red * 0.5;
    green = green * 0.5;
    blue = blue * 0.5;
    color = alpha << 24 | red << 16 | green << 8 | blue;
}
Any ideas or examples on how to improve the speed?
Try this: (EDIT: as it turns out, this is only a readability improvement, but read on for more insights.)
void PixelRenderer::applyLight(Uint32& color)
{
    Uint32 alpha = color >> 24;
    Uint32 red   = (color >> 16) & 0xff;
    Uint32 green = (color >> 8) & 0xff;
    Uint32 blue  = color & 0xff;

    red   = red * 0.5;
    green = green * 0.5;
    blue  = blue * 0.5;

    color = alpha << 24 | red << 16 | green << 8 | blue;
}
That having been said, you should understand that performing operations of that sort using a general-purpose processor such as the CPU of your computer is bound to be extremely slow. That's why hardware-accelerated graphics cards were invented.
EDIT
If you insist on operating this way, then you will probably have to resort to hacks in order to improve efficiency. One type of hack which is very often used when dealing with 8-bit channel values is lookup tables. With a lookup table, instead of multiplying each individual channel value by a float, you precompute an array of 256 values where the index into the array is a channel value, and the value in that index is the precomputed result of multiplying the channel value by that float. Then, when converting your image, you just use channel values to lookup entries of the array instead of performing actual float multiplication. This is much, much faster. (But still not nearly as fast as programming dedicated, massively parallel hardware do that stuff for you.)
EDIT
As others have already pointed out, if you are not planning to operate on the alpha channel, then you do not need to extract it and then later apply it, you can just leave it unaltered. So, you can just do color = (color & 0xff000000) | red << 16 | green << 8 | blue;
Shifts and masks like this are generally very fast on a modern processor. I might look at a few other things:
Follow the first rule of optimisation - profile your code. You can do this simply by calling the method millions of times and timing it. Are your calculations slow, or is it something else? What is slow? Try omitting part of the method - do things speed up?
Make sure that this function is declared inline (and make sure it has actually been inlined). The function call overhead will massively outweigh the pixel manipulations (particularly if it is virtual).
Consider declaring your method Uint32 PixelRenderer::applyLight(Uint32 color) and returning the modified value, that may help avoid some dereferences and give the compiler some additional optimisation opportunities.
Avoid fp to integer conversions, they can be very expensive. If a plain integer divide is insufficient, look at using fixed-point math.
Finally, look at the assembler to see what the compiler has generated (with optimisations on). Are there any branches or conversions? Has your method actually been inlined?
To preserve the alpha value in the front use:
(color>>1)&0x7F7F7F | (color&0xFF000000)
(A tweak on what Wimmel offered in the comments).
I think the 'learning curve' here is that you were using shift and shift back to mask out bits. You should use & with a masking value.
For a more general solution (where 0.0<=factor<=1.0) :
void PixelRenderer::applyLight(Uint32& color, double factor){
    Uint32 alpha = color & 0xFF000000;
    Uint32 red   = (color & 0x00FF0000) * factor;
    Uint32 green = (color & 0x0000FF00) * factor;
    Uint32 blue  = (color & 0x000000FF) * factor;
    color = alpha | (red & 0x00FF0000) | (green & 0x0000FF00) | (blue & 0x000000FF);
}
Notice there is no need to shift the components down to the low order bits before performing the multiplication.
Ultimately you may find that the bottleneck is floating point conversions and arithmetic.
To reduce that you should consider either:
Reduce it to an integer scaling factor, for example in the range 0-256.
Precompute factor*component as a 256-element array and 'pick' the components out of it.
I'm proposing a range of 0-256 (257 values) because a factor of 256 then represents exactly 1.0 after the >>8.
For a more general solution (where 0<=factor<=256):
void PixelRenderer::applyLight(Uint32& color, Uint32 factor){
    Uint32 alpha = color & 0xFF000000;
    Uint32 red   = ((color & 0x00FF0000) * factor) >> 8;
    Uint32 green = ((color & 0x0000FF00) * factor) >> 8;
    Uint32 blue  = ((color & 0x000000FF) * factor) >> 8;
    color = alpha | (red & 0x00FF0000) | (green & 0x0000FF00) | (blue & 0x000000FF);
}
Here's a runnable program illustrating the first example:
#include <stdio.h>
#include <inttypes.h>

typedef uint32_t Uint32;

Uint32 make(Uint32 alpha, Uint32 red, Uint32 green, Uint32 blue){
    return (alpha<<24)|(red<<16)|(green<<8)|blue;
}

void output(Uint32 color){
    printf("alpha=%"PRIu32" red=%"PRIu32" green=%"PRIu32" blue=%"PRIu32"\n",
           color>>24, (color&0xFF0000)>>16, (color&0xFF00)>>8, color&0xFF);
}

Uint32 applyLight(Uint32 color, double factor){
    Uint32 alpha = color & 0xFF000000;
    Uint32 red   = (color & 0x00FF0000) * factor;
    Uint32 green = (color & 0x0000FF00) * factor;
    Uint32 blue  = (color & 0x000000FF) * factor;
    return alpha | (red & 0x00FF0000) | (green & 0x0000FF00) | (blue & 0x000000FF);
}

int main(void) {
    Uint32 color1 = make(156,100,50,20);
    Uint32 result1 = applyLight(color1, 0.9);
    output(result1);

    Uint32 color2 = make(255,255,255,255);
    Uint32 result2 = applyLight(color2, 0.1);
    output(result2);

    Uint32 color3 = make(78,220,200,100);
    Uint32 result3 = applyLight(color3, 0.05);
    output(result3);
    return 0;
}
Expected Output is:
alpha=156 red=90 green=45 blue=18
alpha=255 red=25 green=25 blue=25
alpha=78 red=11 green=10 blue=5
One thing that I don't see anyone else mentioning is parallelizing your code. There are at least 2 ways to do this: SIMD instructions, and multiple threads.
SIMD instructions (like SSE, AVX, etc.) perform the same math on multiple pieces of data at the same time. So you could, for example, multiply the red, green, blue, and alpha of a pixel by the same values in 1 instruction, like this:
vec4 lightValue = vec4(0.5, 0.5, 0.5, 1.0);
vec4 result = vec_Mult(inputPixel, lightValue);
That's the equivalent of:
lightValue.red = 0.5;
lightValue.green = 0.5;
lightValue.blue = 0.5;
lightValue.alpha = 1.0;
result.red = inputPixel.red * lightValue.red;
result.green = inputPixel.green * lightValue.green;
result.blue = inputPixel.blue * lightValue.blue;
result.alpha = inputPixel.alpha * lightValue.alpha;
You can also cut your image into tiles and perform the lightening operation on several tiles at once using threads running on multiple cores. If you're using C++11, you can use std::thread to start multiple threads. Otherwise your OS probably has its own threading functionality, such as Win32 threads, Grand Central Dispatch, pthreads, Boost threads, Threading Building Blocks, etc.
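A minimal sketch of the tiling idea with std::thread (applyLightParallel is a hypothetical helper; it splits the buffer into contiguous chunks, one per hardware thread, and reuses the same per-channel scaling as the earlier examples):

```cpp
#include <cstdint>
#include <vector>
#include <thread>
#include <algorithm>

// Sketch: brighten a pixel buffer in parallel by giving each hardware
// thread one contiguous chunk (applyLightParallel is a hypothetical name).
void applyLightParallel(std::vector<uint32_t>& pixels, double factor)
{
    auto worker = [&pixels, factor](size_t begin, size_t end) {
        for (size_t i = begin; i < end; ++i) {
            uint32_t c = pixels[i];
            uint32_t a =  c & 0xFF000000;
            uint32_t r = std::min<uint32_t>(uint32_t(((c >> 16) & 0xFF) * factor), 255);
            uint32_t g = std::min<uint32_t>(uint32_t(((c >>  8) & 0xFF) * factor), 255);
            uint32_t b = std::min<uint32_t>(uint32_t(( c        & 0xFF) * factor), 255);
            pixels[i] = a | (r << 16) | (g << 8) | b;
        }
    };
    unsigned n = std::max(1u, std::thread::hardware_concurrency());
    size_t chunk = (pixels.size() + n - 1) / n;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < n; ++t) {
        size_t begin = std::min(pixels.size(), size_t(t) * chunk);
        size_t end   = std::min(pixels.size(), begin + chunk);
        if (begin < end) pool.emplace_back(worker, begin, end);
    }
    for (auto& th : pool) th.join();
}
```

The chunks never overlap, so no synchronization is needed; for 2D images you would typically split by rows or tiles instead of a flat index.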
You can combine both of the above and have multithreaded code that operates on whole pixels at a time.
If you want to take it even further, you can do your processing on the GPU of your machine using OpenGL, OpenCL, DirectX, Metal, Mantle, CUDA, or one of the other GPGPU technologies. GPUs generally have hundreds of cores that can very quickly process many tiles in parallel, each of which processes whole pixels (rather than just channels) at a time.
But an even better option may be not to write any code at all. It's extremely likely that someone has already done this work and you can leverage it. For example, on MacOS there's CoreImage and the Accelerate framework. On iOS you also have CoreImage, and there's also GPUImage. I'm sure there are similar libraries on Windows, Linux, and other OSes you might be working with.
Another solution, without using bit shifts, is to convert your 32-bit uint into a struct.
Try to keep the implementation in the .h include file so that it can be inlined.
If you don't want the implementation inlined (see above), modify your applyLight method to accept an array of pixels; method-call overhead can be significant for such a small method.
Enable the "loop unroll" optimisation on your compiler, which can enable the use of SIMD instructions.
Implementation:
#include <algorithm> // std::min, std::max
#include <cstdint>

class brightness {
private:
    struct pixel { uint8_t b, g, r, a; };
    float factor;
    static inline void apply(uint8_t& p, float f) {
        p = std::max(std::min(int(p * f), 255), 0);
    }
public:
    brightness(float factor) : factor(factor) { }
    void apply(uint32_t& color) {
        pixel& p = (pixel&)color;
        apply(p.b, factor);
        apply(p.g, factor);
        apply(p.r, factor);
    }
};
Implementation with a lookup table (slower when you use "loop unroll"):
class brightness {
    struct pixel { uint8_t b, g, r, a; };
    uint8_t table[256];
public:
    brightness(float factor) {
        for (int i = 0; i < 256; i++)
            table[i] = std::max(std::min(int(i * factor), 255), 0);
    }
    void apply(uint32_t& color) {
        pixel& p = (pixel&)color;
        p.b = table[p.b];
        p.g = table[p.g];
        p.r = table[p.r];
    }
};
// usage
brightness half_bright(0.5);
uint32_t pixel = 0xffffffff;
half_bright.apply(pixel);

C++/SDL: Fading out a surface already having per-pixel alpha information

Suppose we have a 32-bit PNG file of some ghostly/incorporeal character, which is drawn in a semi-transparent fashion. It is not equally transparent in every place, so we need the per-pixel alpha information when loading it to a surface.
For fading in/out, setting the alpha value of an entire surface is a good way; but not in this case, as the surface already has the per-pixel information and SDL doesn't combine the two.
What would be an efficient workaround (instead of asking the artist to provide some awesome fade in/out animation for the character)?
I think the easiest way to achieve the result you want is to start by loading the source surface containing your character sprites and then, for every instance of your ghost, create a working copy of that surface. Every time the alpha value of an instance changes, SDL_BlitSurface (doc) your source into your working copy and then apply your transparency factor (which you should probably keep as a float between 0 and 1) to every pixel's alpha channel.
In the case of a 32 bit surface, assuming that you initially loaded source and allocated working SDL_Surfaces you can probably do something along the lines of:
SDL_BlitSurface(source, NULL, working, NULL);
if (SDL_MUSTLOCK(working))
{
    if (SDL_LockSurface(working) < 0)
    {
        return -1;
    }
}
Uint8 *pixels = (Uint8 *)working->pixels;
int pitch_padding = working->pitch - (4 * working->w);
pixels += 3; // Big Endian will have an offset of 0, otherwise it's 3 (R, G and B)
for (int row = 0; row < working->h; ++row)
{
    for (int col = 0; col < working->w; ++col)
    {
        *pixels = (Uint8)(*pixels * character_transparency); // could be optimized, but probably not worth it
        pixels += 4;
    }
    pixels += pitch_padding;
}
if (SDL_MUSTLOCK(working))
{
    SDL_UnlockSurface(working);
}
This code was inspired by SDL_gfx (here), but if this is all you're doing, I wouldn't bother linking against a whole library just for that.
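For completeness, the per-frame transparency value itself can be driven by elapsed time. A small sketch (fadeFactor is a hypothetical helper, independent of SDL; it ramps linearly from 1.0 to 0.0 over fade_duration_ms):

```cpp
#include <cstdint>
typedef uint32_t Uint32; // matches SDL's typedef

// Sketch: derive the transparency factor for the current frame from
// elapsed time (fadeFactor is a hypothetical name).
float fadeFactor(Uint32 now_ms, Uint32 fade_start_ms, Uint32 fade_duration_ms)
{
    if (now_ms <= fade_start_ms) return 1.0f;       // fade hasn't started
    Uint32 elapsed = now_ms - fade_start_ms;
    if (elapsed >= fade_duration_ms) return 0.0f;   // fully faded out
    return 1.0f - float(elapsed) / float(fade_duration_ms);
}
```

Each frame you would call this with SDL_GetTicks(), store the result in character_transparency, and rerun the blit-and-scale loop above.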

Kinect SDK: align depth and color frames

I'm working with Kinect sensor and I'm trying to align depth and color frames so that I can save them as images which "fit" into each other. I've spent a lot of time going through msdn forums and modest documentation of Kinect SDK and I'm getting absolutely nowhere.
Based on this answer: Kinect: Converting from RGB Coordinates to Depth Coordinates
I have the following function, where depthData and colorData are obtained from NUI_LOCKED_RECT.pBits and mappedData is the output containing new color frame, mapped to depth coordinates:
bool mapColorFrameToDepthFrame(unsigned char *depthData, unsigned char* colorData, unsigned char* mappedData)
{
    INuiCoordinateMapper* coordMapper;
    // Get coordinate mapper
    m_pSensor->NuiGetCoordinateMapper(&coordMapper);
    NUI_DEPTH_IMAGE_POINT* depthPoints = new NUI_DEPTH_IMAGE_POINT[640 * 480];
    HRESULT result = coordMapper->MapColorFrameToDepthFrame(NUI_IMAGE_TYPE_COLOR, NUI_IMAGE_RESOLUTION_640x480, NUI_IMAGE_RESOLUTION_640x480, 640 * 480, reinterpret_cast<NUI_DEPTH_IMAGE_PIXEL*>(depthData), 640 * 480, depthPoints);
    if (FAILED(result))
    {
        return false;
    }
    int pos = 0;
    int* colorRun = reinterpret_cast<int*>(colorData);
    int* mappedRun = reinterpret_cast<int*>(mappedData);
    // For each pixel of new color frame
    for (int i = 0; i < 640 * 480; ++i)
    {
        // Find the corresponding pixel in original color frame from depthPoints
        pos = (depthPoints[i].y * 640) + depthPoints[i].x;
        // Set pixel value if it's within frame boundaries
        if (pos < 640 * 480)
        {
            mappedRun[i] = colorRun[pos];
        }
    }
    return true;
}
All I get when running this code is an unchanged color frame, except that all pixels where the depth frame had no information are removed (white).
With the OpenNI framework there is an option called registration.
IMAGE_REGISTRATION_DEPTH_TO_IMAGE – The depth image is transformed to have the same apparent vantage point as the RGB image.
OpenNI 2.0 and NiTE 2.0 work very well for capturing Kinect information, and there are a lot of tutorials.
You can have a look at this:
Kinect with OpenNI
OpenNI also has an example, SimpleViewer, that merges depth and color; you could look at that and try it.
This might not be the quick answer you're hoping for, but this transformation is done successfully within the ofxKinectNui addon for openFrameworks (see here).
It looks like ofxKinectNui delegates to the GetColorPixelCoordinatesFromDepthPixel function defined here.
I think the problem is that you're calling MapColorFrameToDepthFrame, when you should actually call MapDepthFrameToColorFrame.
The smoking gun is this line of code:
mappedRun[i] = colorRun[pos];
Reading from pos and writing to i is backwards, since pos = depthPoints[i] represents the depth coordinates corresponding to the color coordinates at i. You actually want to iterate over writing all depth coordinates and read from the input color image at the corresponding color coordinates.
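Under that reading, the fix is to swap which side pos indexes. A plain-array sketch of the corrected direction (ImgPoint and mapColorIntoDepthSpace are illustrative stand-ins for NUI_DEPTH_IMAGE_POINT and the question's buffers):

```cpp
#include <vector>

// Illustrative stand-in for NUI_DEPTH_IMAGE_POINT.
struct ImgPoint { int x, y; };

// depthPoints[i] holds the depth-space coordinates of color pixel i, so
// color pixel i is WRITTEN to that depth-space position; the question's
// code did the reverse (read at pos, write at i).
void mapColorIntoDepthSpace(const std::vector<int>& color,
                            const std::vector<ImgPoint>& depthPoints,
                            std::vector<int>& mapped, int width, int height)
{
    for (int i = 0; i < width * height; ++i) {
        int pos = depthPoints[i].y * width + depthPoints[i].x;
        if (pos >= 0 && pos < width * height)
            mapped[pos] = color[i]; // write at depth coords, read at color index
    }
}
```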
I think there are a few incorrect lines in your code.
First of all, what kind of depth map are you passing to your function? Depth data is stored using two bytes per value, which means the correct pointer type for your depth data is unsigned short.
Second, as I understand it you want to map the depth frame onto the color frame, so the correct Kinect SDK function to call is MapDepthFrameToColorFrame, not MapColorFrameToDepthFrame.
Finally, that function returns a map of points where, for each depth value at position [i], you get the x and y position where that point should be mapped. You don't need the colorData pointer for this.
So your function should be modified as follows:
/** Method used to build a depth map aligned to the color frame
 @param [in] depthData : pointer to your depth data
 @param [out] mappedData : pointer to your aligned depth map
 @return true if all is ok, false when something went wrong
*/
bool DeviceManager::mapColorFrameToDepthFrame(unsigned short *depthData, unsigned short* mappedData){
    INuiCoordinateMapper* coordMapper;
    NUI_COLOR_IMAGE_POINT* colorPoints = new NUI_COLOR_IMAGE_POINT[640 * 480]; // color points
    NUI_DEPTH_IMAGE_PIXEL* depthPoints = new NUI_DEPTH_IMAGE_PIXEL[640 * 480]; // depth pixels

    /** BE SURE THAT YOU ARE WORKING WITH THE RIGHT HEIGHT AND WIDTH */
    unsigned long refWidth = 0;
    unsigned long refHeight = 0;
    NuiImageResolutionToSize( NUI_IMAGE_RESOLUTION_640x480, refWidth, refHeight );
    int width = static_cast<int>( refWidth );   // get the image width the right way
    int height = static_cast<int>( refHeight ); // get the image height the right way

    // Fill the depth pixel structures from the raw depth values
    // (assuming depthData holds plain depth values; shift out the
    // player-index bits first if your data is packed)
    for (int i = 0; i < width * height; ++i)
        depthPoints[i].depth = depthData[i];

    m_pSensor->NuiGetCoordinateMapper(&coordMapper); // get the coord mapper

    // Map your frame
    HRESULT result = coordMapper->MapDepthFrameToColorFrame( NUI_IMAGE_RESOLUTION_640x480, width * height, depthPoints, NUI_IMAGE_TYPE_COLOR, NUI_IMAGE_RESOLUTION_640x480, width * height, colorPoints );
    if (FAILED(result))
    {
        delete[] colorPoints;
        delete[] depthPoints;
        return false;
    }

    // Apply the map in terms of x and y (image coordinates)
    for (int i = 0; i < width * height; i++)
        if (colorPoints[i].x >= 0 && colorPoints[i].x < width && colorPoints[i].y >= 0 && colorPoints[i].y < height)
            *(mappedData + colorPoints[i].x + colorPoints[i].y * width) = *(depthData + i);

    // free your memory!!!
    delete[] colorPoints;
    delete[] depthPoints;
    return true;
}
Make sure that your mappedData has been initialized correctly, for example as follows:
mappedData = (unsigned short*)calloc(width * height, sizeof(unsigned short));
Remember that the Kinect SDK does not provide an accurate alignment function between color and depth data.
If you want an accurate alignment between the two images you should use a calibration model.
In that case I suggest using the Kinect Calibration Toolbox, based on Heikkilä's calibration model.
You can find it at the following link:
http://www.ee.oulu.fi/~dherrera/kinect/.
First of all, you must calibrate your device.
That means, you should calibrate the RGB and the IR sensor and then find the transformation between RGB and IR.
Once you know this information, you can apply the function:
RGBPoint = RotationMatrix * DepthPoint + TranslationVector
Check OpenCV or ROS projects for further details on it.
Extrinsic Calibration
Intrinsic Calibration
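The rigid-transform formula above can be sketched in code (Vec3 and transformPoint are illustrative names; the rotation matrix R and translation vector t would come from your extrinsic calibration):

```cpp
// Sketch of RGBPoint = RotationMatrix * DepthPoint + TranslationVector
struct Vec3 { double x, y, z; };

Vec3 transformPoint(const double R[3][3], const Vec3& t, const Vec3& p)
{
    // Row-by-row matrix-vector product, then add the translation
    return {
        R[0][0]*p.x + R[0][1]*p.y + R[0][2]*p.z + t.x,
        R[1][0]*p.x + R[1][1]*p.y + R[1][2]*p.z + t.y,
        R[2][0]*p.x + R[2][1]*p.y + R[2][2]*p.z + t.z
    };
}
```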