Programming the encoding function based on a decoding RLE (bitwise) function - bit-manipulation

Currently I only have the decoding procedure. EncodedRle is a byte array with the encoded image bytes, pixel is the pixel position to draw on the decoded image, col is the decoded pixel color (either 0 or 255), and matSpan is the image pixel span (byte array).
Decoding function:
int pixel = 0;
foreach (var run in EncodedRle)
{
    byte col = (byte)((run & 0x01) * 255);
    var numPixelsInRun =
        (((run & 128) > 0 ? 1 : 0) |
         ((run & 64) > 0 ? 2 : 0) |
         ((run & 32) > 0 ? 4 : 0) |
         ((run & 16) > 0 ? 8 : 0) |
         ((run & 8) > 0 ? 16 : 0) |
         ((run & 4) > 0 ? 32 : 0) |
         ((run & 2) > 0 ? 64 : 0)) + 1;
    for (; numPixelsInRun > 0; numPixelsInRun--)
    {
        matSpan[pixel++] = col;
    }
}
How can I write the encoding function? I understand that I need to limit strides: sum up equal pixels in the same line until a different value is found, and cap each run at 128 pixels per byte? But I'm not good at bitwise operations and can't find a way to do it properly. Any tips on that?


How to convert 16 bit hex color to RGB888 values in C++

I have a uint16_t color and need to convert it into its RGB equivalent. The hex is set up so the first 5 bits represent red, the next 6 green, and the last 5 blue.
So far I have found something close to a solution, but not quite, due to truncation.
void hexToRGB(uint16_t hexValue)
{
    int r = ((hexValue >> 11) & 0x1F);  // Extract the 5 R bits
    int g = ((hexValue >> 5) & 0x3F);   // Extract the 6 G bits
    int b = (hexValue & 0x1F);          // Extract the 5 B bits
    r = ((r * 255) / 31) - 4;
    g = ((g * 255) / 63) - 2;
    b = ((b * 255) / 31) - 4;
    printf("r: %d, g: %d, b: %d\n", r, g, b);
}
int main()
{
    // 50712 = 0xC618
    hexToRGB(50712);
    return 0;
}
The example above yields r: 193, g: 192, b: 193, but it should be r: 192, g: 192, b: 192. I have been using this question as a reference, but I essentially need the reverse of what they are asking.
What about the following:
unsigned r = (hexValue & 0xF800) >> 8;  // rrrrr... ........ -> rrrrr000
unsigned g = (hexValue & 0x07E0) >> 3;  // .....ggg ggg..... -> gggggg00
unsigned b = (hexValue & 0x001F) << 3;  // ........ ...bbbbb -> bbbbb000
printf("r: %d, g: %d, b: %d\n", r, g, b);
That should result in 0xC618 --> 192, 192, 192, but 0xFFFF --> 248, 252, 248, i.e. not pure white.
If you want 0xFFFF to be pure white, you'll have to scale, so
unsigned r = (hexValue & 0xF800) >> 11;
unsigned g = (hexValue & 0x07E0) >> 5;
unsigned b = hexValue & 0x001F;
r = (r * 255) / 31;
g = (g * 255) / 63;
b = (b * 255) / 31;
Then 0xC618 --> 197, 194, 197, instead of the expected 192, 192, 192, but 0xFFFF is pure white and 0x0000 is pure black.
There are no "correct" ways to convert from the RGB565 scale to RGB888. Each colour component needs to be scaled from its 5-bit or 6-bit range to an 8-bit range and there are varying ways to do this each often producing different types of visual artifact in an image.
When scaling a colour in the n-bit range we might decide we want the following to be generally true:
that absolute black (e.g. 00000 in 5-bit space) must map to absolute black in 8-bit space;
that absolute white (e.g. 11111 in 5-bit space) must map to absolute white in 8-bit space.
Achieving this means we basically wish to scale the value from (2^n - 1) shades in n-bit space into (2^8 - 1) shades in 8-bit space. That is, we want to effectively do the following in some way:
r_8 = (255 * r / 31)
g_8 = (255 * g / 63)
b_8 = (255 * b / 31)
Different approaches often taken are:
scale using integer division
scale using floating division and then round
bitshift into 8-bit space and add the most significant bits
The latter approach is effectively the following
r_8 = (r << 3) | (r >> 2)
g_8 = (g << 2) | (g >> 4)
b_8 = (b << 3) | (b >> 2)
For your 5-bit value 11000 these would result in 8-bit values of:
197
197
198 (11000000 | 110)
Similarly your six bit value 110000 would result in 8-bit values of:
194
194
195 (11000000 | 11)
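Put together as a standalone converter, the bit-replication approach looks like this (a sketch; the function name is mine):

```cpp
#include <cstdint>

// Convert RGB565 to RGB888 via bit replication: shift each component
// into the high bits, then copy its top bits into the vacated low bits,
// so that 0 maps to 0 and the component maximum maps to 255.
void rgb565_to_rgb888(uint16_t c, uint8_t& r, uint8_t& g, uint8_t& b)
{
    uint8_t r5 = (c >> 11) & 0x1F;
    uint8_t g6 = (c >> 5) & 0x3F;
    uint8_t b5 = c & 0x1F;
    r = static_cast<uint8_t>((r5 << 3) | (r5 >> 2));
    g = static_cast<uint8_t>((g6 << 2) | (g6 >> 4));
    b = static_cast<uint8_t>((b5 << 3) | (b5 >> 2));
}
```

Per the values above, 0xC618 comes out as 198, 195, 198, while 0xFFFF and 0x0000 map to pure white and pure black.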

H264 getting frame height and width from sequence parameter set (SPS) NAL unit

I've been trying to find out how to calculate the width and height from an SPS NAL unit. I have an H264 video with these parameters:
h264 (High), yuvj420p(pc), 1280x720 [SAR 1:1 DAR 16:9], 20 fps, 20 tbr, 1200k tbn, 40 tbc
I've been searching for a formula that would calculate the width (1280) and height (720) but haven't found one that helps. Right now I'm using this formula; it works for most H264 streams, but in this case the width and height come out as 80x48:
if (frame_cropping_flag) {
    width  = ((pic_width_in_mbs_minus1 + 1) * 16) - frame_crop_left_offset * 2 - frame_crop_right_offset * 2;
    height = ((2 - frame_mbs_only_flag) * (pic_height_in_map_units_minus1 + 1) * 16) - (frame_crop_top_offset * 2) - (frame_crop_bottom_offset * 2);
}
else {
    width  = ((pic_width_in_mbs_minus1 + 1) * 16);
    height = ((2 - frame_mbs_only_flag) * (pic_height_in_map_units_minus1 + 1) * 16);
}
here is SPS as base64
Z2QAKa2EBUViuKxUdCAqKxXFYqOhAVFYrisVHQgKisVxWKjoQFRWK4rFR0ICorFcVio6ECSFITk8nyfk/k/J8nm5s00IEkKQnJ5Pk/J/J+T5PNzZprQCgC3YCqQAAAMB4AAASwGBAAH0AAADAjKAve+F4RCNQA==
here is SPS that I've parsed:
======= SPS =======
profile_idc : 100
constraint_set0_flag : 0
constraint_set1_flag : 0
constraint_set2_flag : 0
constraint_set3_flag : 0
constraint_set4_flag : 0
constraint_set5_flag : 0
reserved_zero_2bits : 0
level_idc : 41
seq_parameter_set_id : 0
chroma_format_idc : 1
separate_colour_plane_flag : 0
bit_depth_luma_minus8 : 0
bit_depth_chroma_minus8 : 0
qpprime_y_zero_transform_bypass_flag : 0
seq_scaling_matrix_present_flag : 1
log2_max_frame_num_minus4 : 41
pic_order_cnt_type : 4
log2_max_pic_order_cnt_lsb_minus4 : 0
delta_pic_order_always_zero_flag : 0
offset_for_non_ref_pic : 0
offset_for_top_to_bottom_field : 0
num_ref_frames_in_pic_order_cnt_cycle : 0
num_ref_frames : 2
gaps_in_frame_num_value_allowed_flag : 0
pic_width_in_mbs_minus1 : 4
pic_height_in_map_units_minus1 : 2
frame_mbs_only_flag : 1
mb_adaptive_frame_field_flag : 0
direct_8x8_inference_flag : 0
frame_cropping_flag : 0
frame_crop_left_offset : 0
frame_crop_right_offset : 0
frame_crop_top_offset : 0
frame_crop_bottom_offset : 0
vui_parameters_present_flag : 0
=== VUI ===
aspect_ratio_info_present_flag : 0
aspect_ratio_idc : 0
sar_width : 0
sar_height : 0
overscan_info_present_flag : 0
overscan_appropriate_flag : 0
video_signal_type_present_flag : 0
video_format : 0
video_full_range_flag : 0
colour_description_present_flag : 0
colour_primaries : 0
transfer_characteristics : 0
matrix_coefficients : 0
chroma_loc_info_present_flag : 0
chroma_sample_loc_type_top_field : 0
chroma_sample_loc_type_bottom_field : 0
timing_info_present_flag : 0
num_units_in_tick : 0
time_scale : 0
fixed_frame_rate_flag : 0
nal_hrd_parameters_present_flag : 0
vcl_hrd_parameters_present_flag : 0
low_delay_hrd_flag : 0
pic_struct_present_flag : 0
bitstream_restriction_flag : 0
motion_vectors_over_pic_boundaries_flag : 0
max_bytes_per_pic_denom : 0
max_bits_per_mb_denom : 0
log2_max_mv_length_horizontal : 0
log2_max_mv_length_vertical : 0
num_reorder_frames : 0
max_dec_frame_buffering : 0
=== HRD ===
cpb_cnt_minus1 : 0
bit_rate_scale : 0
cpb_size_scale : 0
bit_rate_value_minus1[0] : 0
cpb_size_value_minus1[0] : 0
cbr_flag[0] : 0
initial_cpb_removal_delay_length_minus1 : 0
cpb_removal_delay_length_minus1 : 0
dpb_output_delay_length_minus1 : 0
time_offset_length : 0
I guess it has something to do with the luma and chroma macroblock sizes. I've been able to calculate SubWidthC/SubHeightC and MbWidthC/MbHeightC, but I'm still confused about what to do next.
Hello. First of all, you are parsing the SPS incorrectly, so you need to fix that. If you parse it correctly you will have:
pic_width_in_mbs_minus1 : 79
pic_height_in_map_units_minus1 : 44
frame_mbs_only_flag : 1
frame_cropping_flag : 0
If you calculate the width and height using your formula, you will actually get 1280x720.
In any case, you should calculate the width and height using SubWidthC and SubHeightC as follows:
int SubWidthC;
int SubHeightC;
if (sps->chroma_format_idc == 0 && sps->separate_colour_plane_flag == 0) { // monochrome
    SubWidthC = SubHeightC = 0;
}
else if (sps->chroma_format_idc == 1 && sps->separate_colour_plane_flag == 0) { // 4:2:0
    SubWidthC = SubHeightC = 2;
}
else if (sps->chroma_format_idc == 2 && sps->separate_colour_plane_flag == 0) { // 4:2:2
    SubWidthC = 2;
    SubHeightC = 1;
}
else if (sps->chroma_format_idc == 3) { // 4:4:4
    if (sps->separate_colour_plane_flag == 0) {
        SubWidthC = SubHeightC = 1;
    }
    else if (sps->separate_colour_plane_flag == 1) {
        SubWidthC = SubHeightC = 0;
    }
}
int PicWidthInMbs = sps->pic_width_in_mbs_minus1 + 1;
int PicHeightInMapUnits = sps->pic_height_in_map_units_minus1 + 1;
int FrameHeightInMbs = (2 - sps->frame_mbs_only_flag) * PicHeightInMapUnits;
int crop_left = 0;
int crop_right = 0;
int crop_top = 0;
int crop_bottom = 0;
if (sps->frame_cropping_flag) {
    crop_left = sps->frame_crop_left_offset;
    crop_right = sps->frame_crop_right_offset;
    crop_top = sps->frame_crop_top_offset;
    crop_bottom = sps->frame_crop_bottom_offset;
}
int width = PicWidthInMbs * 16 - SubWidthC * (crop_left + crop_right);
int height = FrameHeightInMbs * 16 - SubHeightC * (2 - sps->frame_mbs_only_flag) * (crop_top + crop_bottom);
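As a sanity check, plugging the corrected values (pic_width_in_mbs_minus1 = 79, pic_height_in_map_units_minus1 = 44, frame_mbs_only_flag = 1, no cropping) into the uncropped branch reproduces 1280x720. A small sketch (helper names are mine):

```cpp
// Uncropped branch of the formula, factored into helpers.
int sps_width(int pic_width_in_mbs_minus1)
{
    return (pic_width_in_mbs_minus1 + 1) * 16;      // e.g. 80 MBs * 16
}

int sps_height(int pic_height_in_map_units_minus1, int frame_mbs_only_flag)
{
    int FrameHeightInMbs = (2 - frame_mbs_only_flag)
                         * (pic_height_in_map_units_minus1 + 1);
    return FrameHeightInMbs * 16;                   // e.g. 45 MBs * 16
}
```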
We now have an H.264 SPS parser in librem:
https://github.com/creytiv/rem/blob/master/include/rem_h264.h#L52
it can be used like this, to extract the resolution:
struct h264_sps sps;
struct vidsz vidsz;
h264_sps_decode(&sps, buf, len);
h264_sps_resolution(&sps, &vidsz);
printf("resolution: %u x %u\n", vidsz.w, vidsz.h);

Swap two colors using color matrix

How can I swap two colors using a color matrix? For instance swapping red and blue is easy. The matrix would look like:
0 0 1 0 0
0 1 0 0 0
1 0 0 0 0
0 0 0 1 0
0 0 0 0 1
So how can I swap any two colors in general? For example, there is Color1 with R1, G1, B1 and Color2 with R2, G2, B2.
EDIT: By swap I mean Color1 will translate into color2 and color2 will translate into color1. Looks like I need a reflection transformation. How to calculate it?
GIMP reference removed. Sorry for confusion.
This appears to be the section of color-exchange.c in the GIMP source that cycles through all the pixels and, if a pixel meets the chosen criteria (which can be a range of colors), swaps it with the chosen color:
for (y = y1; y < y2; y++)
{
    gimp_pixel_rgn_get_row (&srcPR, src_row, x1, y, width);
    for (x = 0; x < width; x++)
    {
        guchar pixel_red, pixel_green, pixel_blue;
        guchar new_red, new_green, new_blue;
        guint idx;

        /* get current pixel-values */
        pixel_red   = src_row[x * bpp];
        pixel_green = src_row[x * bpp + 1];
        pixel_blue  = src_row[x * bpp + 2];

        idx = x * bpp;

        /* want this pixel? */
        if (pixel_red >= min_red &&
            pixel_red <= max_red &&
            pixel_green >= min_green &&
            pixel_green <= max_green &&
            pixel_blue >= min_blue &&
            pixel_blue <= max_blue)
        {
            guchar red_delta, green_delta, blue_delta;

            red_delta   = pixel_red > from_red ?
                          pixel_red - from_red : from_red - pixel_red;
            green_delta = pixel_green > from_green ?
                          pixel_green - from_green : from_green - pixel_green;
            blue_delta  = pixel_blue > from_blue ?
                          pixel_blue - from_blue : from_blue - pixel_blue;

            new_red   = CLAMP (to_red + red_delta, 0, 255);
            new_green = CLAMP (to_green + green_delta, 0, 255);
            new_blue  = CLAMP (to_blue + blue_delta, 0, 255);
        }
        else
        {
            new_red   = pixel_red;
            new_green = pixel_green;
            new_blue  = pixel_blue;
        }

        /* fill buffer */
        dest_row[idx + 0] = new_red;
        dest_row[idx + 1] = new_green;
        dest_row[idx + 2] = new_blue;

        /* copy alpha-channel */
        if (has_alpha)
            dest_row[idx + 3] = src_row[x * bpp + 3];
    }
    /* store the dest */
    gimp_pixel_rgn_set_row (&destPR, dest_row, x1, y, width);

    /* and tell the user what we're doing */
    if (!preview && (y % 10) == 0)
        gimp_progress_update ((gdouble) y / (gdouble) height);
}
EDIT/ADDITION
Another way you could have transformed red to blue would be with this matrix:
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
-1 0 1 0 1
The only values that really matter are the bottom ones in this matrix.
This would be the same as saying: subtract 255 from red, keep green the same, and add 255 to blue. You could also cut the alpha in half like so:
-1 0 1 -0.5 1
So (just like the gimp source) you just need to find the difference between your current color and your target color, for each channel, and then apply the difference. Instead of channel values from 0 to 255 you would use values from 0 to 1.
You could have changed it from red to green like so:
-1 1 0 0 1
See here for some good info:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms533875%28v=vs.85%29.aspx
Good luck.
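To make the channel-difference idea concrete, here is a sketch that builds such a matrix (the ColorMatrix struct and makeShift name are mine; note this is the one-way translation described above, not a true two-way swap):

```cpp
// Build a 5x5 colour matrix whose bottom (translation) row shifts
// colour 1 to colour 2 per channel. Channels are in the 0..1 range
// rather than 0..255, matching how colour matrices are applied.
struct ColorMatrix { float m[5][5]; };

ColorMatrix makeShift(float r1, float g1, float b1,
                      float r2, float g2, float b2)
{
    ColorMatrix cm = {};
    for (int i = 0; i < 5; ++i)
        cm.m[i][i] = 1.0f;      // start from the identity
    cm.m[4][0] = r2 - r1;       // bottom row adds the per-channel delta
    cm.m[4][1] = g2 - g1;
    cm.m[4][2] = b2 - b1;
    return cm;
}
```

For pure red to pure blue this yields the bottom row -1 0 1 0 1 shown above.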
I solved it by creating a reflection matrix via D3DXMatrixReflect, using a plane that is perpendicular to the vector AB and intersects the midpoint of AB.
D3DXVECTOR3 AB( colorA.r-colorB.r, colorA.g-colorB.g, colorA.b-colorB.b );
D3DXPLANE plane( AB.x, AB.y, AB.z, -AB.x*midpoint.x-AB.y*midpoint.y-AB.z*midpoint.z );
D3DXMatrixReflect
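The same reflection can be computed without D3DX; a sketch (struct and function names are mine). Reflecting a point across the plane perpendicular to A->B through its midpoint is a Householder reflection, so applying it to colour A yields B and vice versa:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Reflect p across the plane that is perpendicular to the segment
// a -> b and passes through its midpoint.
Vec3 reflectAcrossMidplane(Vec3 p, Vec3 a, Vec3 b)
{
    Vec3 n = { b.x - a.x, b.y - a.y, b.z - a.z };          // plane normal
    Vec3 m = { (a.x + b.x) / 2, (a.y + b.y) / 2, (a.z + b.z) / 2 };
    double nn = n.x * n.x + n.y * n.y + n.z * n.z;
    // projection of (p - m) onto n, then mirror p across the plane
    double d = (p.x - m.x) * n.x + (p.y - m.y) * n.y + (p.z - m.z) * n.z;
    double k = 2.0 * d / nn;
    return { p.x - k * n.x, p.y - k * n.y, p.z - k * n.z };
}
```

Points equidistant from A and B (i.e. on the plane) are left unchanged, which is what makes this a swap rather than a one-way translation.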

implementing erosion, dilation in C, C++

I have theoretical understanding of how dilation in binary image is done.
AFAIK, If my SE (structuring element) is this
0 1
1 1.
where . represents the centre, and my image(binary is this)
0 0 0 0 0
0 1 1 0 0
0 1 0 0 0
0 1 0 0 0
0 0 0 0 0
so the result of dilation is
0 1 1 0 0
1 1 1 0 0
1 1 0 0 0
1 1 0 0 0
0 0 0 0 0
I got the above result by shifting the image in the 0, +1 (up) and -1 (left) directions, according to the SE, and taking the union of these three shifts.
Now, I need to figure out how to implement this in C, C++.
I am not sure how to begin and how to take the union of sets.
I thought of representing original image,three shifted images and final image obtained by taking union; all using matrix.
Is there any place where I can get some sample solution to start with or any ideas to proceed ?
Thanks.
There are tons of sample implementations out there.. Google is your friend :)
EDIT
The following is pseudo-code for the process (very similar to doing a convolution in 2D). I'm sure there are cleverer ways of doing it:
// grayscale image, binary mask
void morph(inImage, outImage, kernel, type) {
    // half size of the kernel, kernel size is n*n (easier if n is odd)
    sz = (kernel.n - 1) / 2;

    for X in inImage.rows {
        for Y in inImage.cols {
            if ( isOnBoundary(X, Y, inImage, sz) ) {
                // check if pixel (X,Y) is a boundary case and deal with it
                // (copy pixel as is); must consider half size of the kernel
                val = inImage(X, Y); // quick fix
            }
            else {
                list = [];

                // get the neighborhood of this pixel (X,Y)
                for I in kernel.n {
                    for J in kernel.n {
                        if ( kernel(I, J) == 1 ) {
                            list.add( inImage(X + I - sz, Y + J - sz) );
                        }
                    }
                }

                if type == dilation {
                    // dilation: set to one if any 1 is present, zero otherwise
                    val = max(list);
                } else if type == erosion {
                    // erosion: set to zero if any 0 is present, one otherwise
                    val = min(list);
                }
            }

            // set output image pixel
            outImage(X, Y) = val;
        }
    }
}
The above code is based on this tutorial (check the source code at the end of the page).
EDIT2:
list.add( inImage(X+I-sz, Y+J-sz) );
The idea is that we want to superimpose the kernel mask (of size n x n), centered at sz (half the mask size), on the current image pixel located at (X,Y), and then collect the intensities of the pixels where the mask value is one (we add them to a list). Once we have extracted all the neighbors of that pixel, we set the output image pixel to the maximum of that list (max intensity) for dilation, and to the minimum for erosion. (Of course, this only works for grayscale images and a binary mask.)
The indices of both X/Y and I/J in the statement above are assumed to start from 0.
If you prefer, you can always rewrite the I/J indices in terms of half the mask size (from -sz to +sz) with a small change (the way the tutorial I linked to does it)...
Example:
Consider this 3x3 kernel mask placed and centered on pixel (X,Y), and see how we traverse the neighborhood around it:
--------------------
| | | | sz = 1;
-------------------- for (I=0 ; I<3 ; ++I)
| | (X,Y) | | for (J=0 ; J<3 ; ++J)
-------------------- vect.push_back( inImage.getPixel(X+I-sz, Y+J-sz) );
| | | |
--------------------
Perhaps a better way to look at it is how to produce an output pixel of the dilation. For the corresponding pixel in the image, align the structuring element such that the origin of the structuring element is at that image pixel. If there is any overlap, set the dilation output pixel at that location to 1, otherwise set it to 0.
So this can be done by simply looping over each pixel in the image and testing whether or not the properly shifted structuring element overlaps with the image. This means you'll probably have 4 nested loops: x img, y img, x se, y se. So for each image pixel, you loop over the pixels of the structuring element and see if there is any overlap. This may not be the most efficient algorithm, but it is probably the most straightforward.
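Sketched in C++ (container choice and function name are mine), the four-nested-loop overlap test looks like this. Note that textbook dilation uses the reflected structuring element; the overlap test as described uses the SE directly, which is equivalent for symmetric elements:

```cpp
#include <cassert>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Binary dilation by the overlap test: for each image pixel, place the
// SE origin (oy, ox) at that pixel; if any SE pixel overlaps a set
// image pixel, the output pixel becomes 1, otherwise 0.
Grid dilate(const Grid& img, const Grid& se, int oy, int ox)
{
    const int H = (int)img.size(), W = (int)img[0].size();
    const int h = (int)se.size(),  w = (int)se[0].size();
    Grid out(H, std::vector<int>(W, 0));
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x)
            for (int i = 0; i < h && !out[y][x]; ++i)
                for (int j = 0; j < w; ++j) {
                    int yy = y + i - oy, xx = x + j - ox;  // image coords
                    if (se[i][j] && yy >= 0 && yy < H &&
                        xx >= 0 && xx < W && img[yy][xx]) {
                        out[y][x] = 1;
                        break;
                    }
                }
    return out;
}
```

The origin parameters make it easy to experiment with how the choice of SE origin changes the result.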
Also, I think your example is incorrect. The dilation depends on the origin of the structuring element. If the origin is...
at the top left zero: you need to shift the image (-1,-1), (-1,0), and (0,-1) giving:
1 1 1 0 0
1 1 0 0 0
1 1 0 0 0
1 0 0 0 0
0 0 0 0 0
at the bottom right: you need to shift the image (0,0), (1,0), and (0,1) giving:
0 0 0 0 0
0 1 1 1 0
0 1 1 0 0
0 1 1 0 0
0 1 0 0 0
MATLAB uses floor((size(SE)+1)/2) as the origin of the SE so in this case, it will use the top left pixel of the SE. You can verify this using the imdilate MATLAB function.
OpenCV
Example: Erosion and Dilation
/* structure of the image variable;
 * variable n stores the order of the square matrix
 * (a fixed upper bound so the struct is a complete type) */
#define MAX_N 64
typedef struct image {
    int mat[MAX_N][MAX_N];
    int n;
} image;

/* function receives image "to dilate" and returns "dilated"
 * structuring element predefined:
 * 0 1 0
 * 1 1 1
 * 0 1 0
 */
image* dilate(image* to_dilate)
{
    int i, j, n;
    int does_order_increase;
    image* dilated;

    /* calloc zero-initialises the output matrix */
    dilated = (image*)calloc(1, sizeof(image));
    n = to_dilate->n;
    does_order_increase = 0;

    /* checking whether there are any 1's on the border */
    for (i = 0; i < n; i++)
    {
        if (to_dilate->mat[0][i] == 1 || to_dilate->mat[i][0] == 1 ||
            to_dilate->mat[n-1][i] == 1 || to_dilate->mat[i][n-1] == 1)
        {
            does_order_increase = 1;
            break;
        }
    }

    /* size of dilated image initialized; it must grow by one on each
     * side when a border pixel is set */
    if (does_order_increase == 1)
        dilated->n = n + 2;
    else
        dilated->n = n;

    /* dilating image by checking every element of to_dilate and filling dilated;
     * does_order_increase shifts the indices when the order increases */
    for (i = 0; i < n; i++)
    {
        for (j = 0; j < n; j++)
        {
            if (to_dilate->mat[i][j] == 1)
            {
                dilated->mat[i + does_order_increase][j + does_order_increase] = 1;
                dilated->mat[i + does_order_increase - 1][j + does_order_increase] = 1;
                dilated->mat[i + does_order_increase][j + does_order_increase - 1] = 1;
                dilated->mat[i + does_order_increase + 1][j + does_order_increase] = 1;
                dilated->mat[i + does_order_increase][j + does_order_increase + 1] = 1;
            }
        }
    }

    /* dilated stores the dilated binary image */
    return dilated;
}
/* end of dilation */

Rotating a bitmap 90 degrees

I have one 64-bit integer, which I need to rotate 90 degrees in an 8 x 8 area (preferably with straight bit manipulation). I cannot figure out a handy algorithm for that. For instance, this:
// 0xD000000000000000 = 1101000000000000000000000000000000000000000000000000000000000000
1 1 0 1 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
after rotation becomes this:
// 0x101000100000000 = 0000000100000001000000000000000100000000000000000000000000000000
0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
I wonder if there's any solutions without need to use any pre-calculated hash-table(s)?
v = (v & 0x000000000f0f0f0fUL) << 004 | (v & 0x00000000f0f0f0f0UL) << 040 |
(v & 0xf0f0f0f000000000UL) >> 004 | (v & 0x0f0f0f0f00000000UL) >> 040;
v = (v & 0x0000333300003333UL) << 002 | (v & 0x0000cccc0000ccccUL) << 020 |
(v & 0xcccc0000cccc0000UL) >> 002 | (v & 0x3333000033330000UL) >> 020;
v = (v & 0x0055005500550055UL) << 001 | (v & 0x00aa00aa00aa00aaUL) << 010 |
(v & 0xaa00aa00aa00aa00UL) >> 001 | (v & 0x5500550055005500UL) >> 010;
Without using any look-up tables, I can't see much better than treating each bit individually:
unsigned long r = 0;
for (int i = 0; i < 64; ++i) {
r += ((x >> i) & 1) << (((i % 8) * 8) + (7 - i / 8));
}
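Wrapped into a compilable function (the rotate90_slow name is mine), the loop can be checked against the example in the question:

```cpp
#include <cstdint>

// Per-bit rotation: input bit i (row i/8, column i%8 from the msb)
// moves to output bit (i % 8) * 8 + (7 - i / 8).
uint64_t rotate90_slow(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; ++i)
        r |= ((x >> i) & 1) << (((i % 8) * 8) + (7 - i / 8));
    return r;
}
```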
There is an efficient way to perform bit reversal, using O(log n) shift operations. If you interpret a 64-bit UINT as an 8x8 array of bits, then bit reversal corresponds to a rotation by 180 degrees.
Half of these shifts effectively perform a horizontal reflection; the other half perform a vertical reflection. To obtain rotations by 90 and 270 degrees, an orthogonal (i.e. vertical or horizontal) reflection could be combined with a diagonal reflection, but the latter remains an awkward bit.
typedef unsigned long long uint64;
uint64 reflect_vert (uint64 value)
{
value = ((value & 0xFFFFFFFF00000000ull) >> 32) | ((value & 0x00000000FFFFFFFFull) << 32);
value = ((value & 0xFFFF0000FFFF0000ull) >> 16) | ((value & 0x0000FFFF0000FFFFull) << 16);
value = ((value & 0xFF00FF00FF00FF00ull) >> 8) | ((value & 0x00FF00FF00FF00FFull) << 8);
return value;
}
uint64 reflect_horiz (uint64 value)
{
value = ((value & 0xF0F0F0F0F0F0F0F0ull) >> 4) | ((value & 0x0F0F0F0F0F0F0F0Full) << 4);
value = ((value & 0xCCCCCCCCCCCCCCCCull) >> 2) | ((value & 0x3333333333333333ull) << 2);
value = ((value & 0xAAAAAAAAAAAAAAAAull) >> 1) | ((value & 0x5555555555555555ull) << 1);
return value;
}
uint64 reflect_diag (uint64 value)
{
uint64 new_value = value & 0x8040201008040201ull; // stationary bits
new_value |= (value & 0x0100000000000000ull) >> 49;
new_value |= (value & 0x0201000000000000ull) >> 42;
new_value |= (value & 0x0402010000000000ull) >> 35;
new_value |= (value & 0x0804020100000000ull) >> 28;
new_value |= (value & 0x1008040201000000ull) >> 21;
new_value |= (value & 0x2010080402010000ull) >> 14;
new_value |= (value & 0x4020100804020100ull) >> 7;
new_value |= (value & 0x0080402010080402ull) << 7;
new_value |= (value & 0x0000804020100804ull) << 14;
new_value |= (value & 0x0000008040201008ull) << 21;
new_value |= (value & 0x0000000080402010ull) << 28;
new_value |= (value & 0x0000000000804020ull) << 35;
new_value |= (value & 0x0000000000008040ull) << 42;
new_value |= (value & 0x0000000000000080ull) << 49;
return new_value;
}
uint64 rotate_90 (uint64 value)
{
return reflect_diag (reflect_vert (value));
}
uint64 rotate_180 (uint64 value)
{
return reflect_horiz (reflect_vert (value));
}
uint64 rotate_270 (uint64 value)
{
return reflect_diag (reflect_horiz (value));
}
In the above code, the reflect_diag() function still requires many shifts. I suspect that it is possible to implement this function with fewer shifts, but I have not yet found a way to do that.
If you're going to do this fast, you shouldn't object to lookup tables.
I'd break the 64 bit integers into N-bit chunks, and look up the N bit chunks in a position-selected table of transpose values. If you choose N=1, you need 64 lookups in tables of two slots, which is relatively slow. If you choose N=64, you need one table and one lookup but the table is huge :-}
N=8 seems like a good compromise. You'd need 8 tables of 256 entries. The code should look something like this:
// value to transpose is in v, a long
long r = 0; // result
r |= byte0transpose[(v >> 56) & 0xFF];
r |= byte1transpose[(v >> 48) & 0xFF];
r |= byte2transpose[(v >> 40) & 0xFF];
r |= byte3transpose[(v >> 32) & 0xFF];
r |= byte4transpose[(v >> 24) & 0xFF];
r |= byte5transpose[(v >> 16) & 0xFF];
r |= byte6transpose[(v >> 8) & 0xFF];
r |= byte7transpose[v & 0xFF];
Each table contains precomputed values that "spread" the contiguous bits in the input across the 64-bit transposed result. Ideally you'd compute these values offline and simply initialize the table entries.
If you don't care about speed, then the standard array-transpose algorithms will work; just index the 64-bit value as if it were a bit array.
I have a sneaking suspicion that one might be able to compute the transposition using bit-twiddling hacks.
To expand on my comment to Ira's answer, you can use:
#define ROT_BIT_0(X) X, (X)|0x1UL
#define ROT_BIT_1(X) ROT_BIT_0(X), ROT_BIT_0((X) | 0x100UL)
#define ROT_BIT_2(X) ROT_BIT_1(X), ROT_BIT_1((X) | 0x10000UL)
#define ROT_BIT_3(X) ROT_BIT_2(X), ROT_BIT_2((X) | 0x1000000UL)
#define ROT_BIT_4(X) ROT_BIT_3(X), ROT_BIT_3((X) | 0x100000000UL)
#define ROT_BIT_5(X) ROT_BIT_4(X), ROT_BIT_4((X) | 0x10000000000UL)
#define ROT_BIT_6(X) ROT_BIT_5(X), ROT_BIT_5((X) | 0x1000000000000UL)
#define ROT_BIT_7(X) ROT_BIT_6(X), ROT_BIT_6((X) | 0x100000000000000UL)
static unsigned long rot90[256] = { ROT_BIT_7(0) };
unsigned long rotate90(unsigned long v)
{
    unsigned long r = 0;
    r |= rot90[(v >> 56) & 0xff];
    r |= rot90[(v >> 48) & 0xff] << 1;
    r |= rot90[(v >> 40) & 0xff] << 2;
    r |= rot90[(v >> 32) & 0xff] << 3;
    r |= rot90[(v >> 24) & 0xff] << 4;
    r |= rot90[(v >> 16) & 0xff] << 5;
    r |= rot90[(v >> 8) & 0xff] << 6;
    r |= rot90[v & 0xff] << 7;
    return r;
}
This depends on 'unsigned long' being 64 bits, of course, and does the rotate assuming the bits are in row-major order with the msb being the upper right, which seems to be the case in this question...
This is quite easy using IA32 SIMD, there's a handy opcode to extract every eighth bit from a 64 bit value (this was written using DevStudio 2005):
char
    source[8] = {0, 0, 0, 0, 0, 0, 0, 0xd0},
    dest[8];
__asm
{
        mov ch, 3
        movq xmm0, qword ptr [source]
Rotate2:
        lea edi, dest
        mov cl, 8
Rotate1:
        pmovmskb eax, xmm0
        psllq xmm0, 1
        stosb
        dec cl
        jnz Rotate1
        movq xmm0, qword ptr [dest]
        dec ch
        jnz Rotate2
}
It rotates the data three times (-270 degrees) since +90 is a bit trickier (needs a bit more thought)
If you look at this as a two-dimensional array then you have the solution, no?
Just make the rows the new columns.
The first row is the last column, the 2nd is the one before last, and so on.
Visually at least, it looks like your solution.
Probably something like this:
for(int i = 0; i < 8; i++)
{
    for(int j = 0; j < 8; j++)
    {
        new_image[j*8 + 7 - i] = image[i*8 + j];
    }
}
If a plain loop is acceptable, the formula for the bit index is simple enough:
8 * Column + 7 - Row
Column and Row are 0-indexed.
This gives you this mapping:
7 15 23 31 39 47 55 63
6 14 22 ...
5 ...
4 ...
3 ...
2 ...
1 ...
0 8 16 24 32 40 48 56