H264 getting frame height and width from sequence parameter set (SPS) NAL unit - height

I've been trying to find out how to calculate the width and height from an SPS NAL unit. I have an H264 video which has these parameters:
h264 (High), yuvj420p(pc), 1280x720 [SAR 1:1 DAR 16:9], 20 fps, 20 tbr, 1200k tbn, 40 tbc
I've been searching for a formula that calculates the width (1280) and height (720), but haven't found one that helps. Right now I'm using the formula below. It works for most H264 streams, but in this case the computed width and height come out as 80x48:
if(frame_cropping_flag) {
width = ((pic_width_in_mbs_minus1 +1)*16) - frame_crop_left_offset*2 - frame_crop_right_offset*2;
height= ((2 - frame_mbs_only_flag)* (pic_height_in_map_units_minus1 +1) * 16) - (frame_crop_top_offset * 2) - (frame_crop_bottom_offset * 2);
}
else {
width = ((pic_width_in_mbs_minus1 +1)*16);
height= ((2 - frame_mbs_only_flag)* (pic_height_in_map_units_minus1 +1) * 16);
}
here is SPS as base64
Z2QAKa2EBUViuKxUdCAqKxXFYqOhAVFYrisVHQgKisVxWKjoQFRWK4rFR0ICorFcVio6ECSFITk8nyfk/k/J8nm5s00IEkKQnJ5Pk/J/J+T5PNzZprQCgC3YCqQAAAMB4AAASwGBAAH0AAADAjKAve+F4RCNQA==
here is SPS that I've parsed:
======= SPS =======
profile_idc : 100
constraint_set0_flag : 0
constraint_set1_flag : 0
constraint_set2_flag : 0
constraint_set3_flag : 0
constraint_set4_flag : 0
constraint_set5_flag : 0
reserved_zero_2bits : 0
level_idc : 41
seq_parameter_set_id : 0
chroma_format_idc : 1
separate_colour_plane_flag : 0
bit_depth_luma_minus8 : 0
bit_depth_chroma_minus8 : 0
qpprime_y_zero_transform_bypass_flag : 0
seq_scaling_matrix_present_flag : 1
log2_max_frame_num_minus4 : 41
pic_order_cnt_type : 4
log2_max_pic_order_cnt_lsb_minus4 : 0
delta_pic_order_always_zero_flag : 0
offset_for_non_ref_pic : 0
offset_for_top_to_bottom_field : 0
num_ref_frames_in_pic_order_cnt_cycle : 0
num_ref_frames : 2
gaps_in_frame_num_value_allowed_flag : 0
pic_width_in_mbs_minus1 : 4
pic_height_in_map_units_minus1 : 2
frame_mbs_only_flag : 1
mb_adaptive_frame_field_flag : 0
direct_8x8_inference_flag : 0
frame_cropping_flag : 0
frame_crop_left_offset : 0
frame_crop_right_offset : 0
frame_crop_top_offset : 0
frame_crop_bottom_offset : 0
vui_parameters_present_flag : 0
=== VUI ===
aspect_ratio_info_present_flag : 0
aspect_ratio_idc : 0
sar_width : 0
sar_height : 0
overscan_info_present_flag : 0
overscan_appropriate_flag : 0
video_signal_type_present_flag : 0
video_format : 0
video_full_range_flag : 0
colour_description_present_flag : 0
colour_primaries : 0
transfer_characteristics : 0
matrix_coefficients : 0
chroma_loc_info_present_flag : 0
chroma_sample_loc_type_top_field : 0
chroma_sample_loc_type_bottom_field : 0
timing_info_present_flag : 0
num_units_in_tick : 0
time_scale : 0
fixed_frame_rate_flag : 0
nal_hrd_parameters_present_flag : 0
vcl_hrd_parameters_present_flag : 0
low_delay_hrd_flag : 0
pic_struct_present_flag : 0
bitstream_restriction_flag : 0
motion_vectors_over_pic_boundaries_flag : 0
max_bytes_per_pic_denom : 0
max_bits_per_mb_denom : 0
log2_max_mv_length_horizontal : 0
log2_max_mv_length_vertical : 0
num_reorder_frames : 0
max_dec_frame_buffering : 0
=== HRD ===
cpb_cnt_minus1 : 0
bit_rate_scale : 0
cpb_size_scale : 0
bit_rate_value_minus1[0] : 0
cpb_size_value_minus1[0] : 0
cbr_flag[0] : 0
initial_cpb_removal_delay_length_minus1 : 0
cpb_removal_delay_length_minus1 : 0
dpb_output_delay_length_minus1 : 0
time_offset_length : 0
I guess it has something to do with the luma and chroma macroblock sizes. I've been able to calculate SubWidthC/SubHeightC and MbWidthC/MbHeightC, but I'm still confused about what to do next.

Hello. First of all, you are parsing the SPS incorrectly, so you need to fix that. If you parse it correctly, you will get:
pic_width_in_mbs_minus1 : 79
pic_height_in_map_units_minus1 : 44
frame_mbs_only_flag : 1
frame_cropping_flag : 0
If you calculate the width and height using your formula with these values, you will actually get 1280x720.
Anyway, you should calculate the height and width using SubWidthC and SubHeightC as follows:
int SubWidthC;
int SubHeightC;
// ChromaArrayType is 0 for monochrome and when separate_colour_plane_flag is set
int ChromaArrayType = sps->separate_colour_plane_flag ? 0 : sps->chroma_format_idc;
if (ChromaArrayType == 1) { //4:2:0
SubWidthC = SubHeightC = 2;
}
else if (ChromaArrayType == 2) { //4:2:2
SubWidthC = 2;
SubHeightC = 1;
}
else { //4:4:4, monochrome or separate colour planes
// the crop unit is one luma sample here; using 0 would silently ignore cropping
SubWidthC = SubHeightC = 1;
}
int PicWidthInMbs = sps->pic_width_in_mbs_minus1 + 1;
int PicHeightInMapUnits = sps->pic_height_in_map_units_minus1 + 1;
int FrameHeightInMbs = (2 - sps->frame_mbs_only_flag) * PicHeightInMapUnits;
int crop_left = 0;
int crop_right = 0;
int crop_top = 0;
int crop_bottom = 0;
if (sps->frame_cropping_flag) {
crop_left = sps->frame_crop_left_offset;
crop_right = sps->frame_crop_right_offset;
crop_top = sps->frame_crop_top_offset;
crop_bottom = sps->frame_crop_bottom_offset;
}
int width = PicWidthInMbs * 16 - SubWidthC * (crop_left + crop_right);
int height = FrameHeightInMbs * 16 - SubHeightC * (2 - sps->frame_mbs_only_flag) * (crop_top + crop_bottom);
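As a quick way to check the arithmetic above, here is a minimal self-contained sketch of the same calculation. The `SpsDims` and `Resolution` structs and the `sps_resolution` function are hypothetical names made up for this example; they are not part of any parser or library mentioned here, and the sketch assumes the SPS fields have already been parsed correctly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the few SPS fields needed here (not a real library type).
struct SpsDims {
    std::uint32_t chroma_format_idc;
    bool          separate_colour_plane_flag;
    std::uint32_t pic_width_in_mbs_minus1;
    std::uint32_t pic_height_in_map_units_minus1;
    bool          frame_mbs_only_flag;
    bool          frame_cropping_flag;
    std::uint32_t crop_left, crop_right, crop_top, crop_bottom;
};

struct Resolution { std::uint32_t width, height; };

Resolution sps_resolution(const SpsDims& s)
{
    assert(s.chroma_format_idc <= 3);

    // ChromaArrayType is 0 for monochrome and for separately coded colour planes.
    const std::uint32_t chroma_array_type =
        s.separate_colour_plane_flag ? 0 : s.chroma_format_idc;

    // Horizontal and vertical chroma subsampling; 1 when there is none.
    std::uint32_t sub_w = 1, sub_h = 1;
    if (chroma_array_type == 1)      { sub_w = 2; sub_h = 2; }  // 4:2:0
    else if (chroma_array_type == 2) { sub_w = 2; sub_h = 1; }  // 4:2:2

    // Field-coded streams store two map units per frame macroblock row.
    const std::uint32_t frame_factor = s.frame_mbs_only_flag ? 1 : 2;

    std::uint32_t width  = (s.pic_width_in_mbs_minus1 + 1) * 16;
    std::uint32_t height = frame_factor * (s.pic_height_in_map_units_minus1 + 1) * 16;
    if (s.frame_cropping_flag) {
        width  -= sub_w * (s.crop_left + s.crop_right);
        height -= sub_h * frame_factor * (s.crop_top + s.crop_bottom);
    }
    return {width, height};
}
```

With the correctly parsed values from this answer (79 and 44, no cropping) it gives 1280x720. A typical 1080p stream with `pic_height_in_map_units_minus1 = 67` and a bottom crop offset of 4 gives 1088 - 2 * 4 = 1080 rows.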

We now have an H.264 SPS parser in librem:
https://github.com/creytiv/rem/blob/master/include/rem_h264.h#L52
It can be used like this to extract the resolution:
struct h264_sps sps;
struct vidsz vidsz;
h264_sps_decode(&sps, buf, len);
h264_sps_resolution(&sps, &vidsz);
printf("resolution: %u x %u\n", vidsz.w, vidsz.h);

Related

How should I implement the Grassfire Algorithm in C++

So in my program, I generate a random grid using 2D Arrays where all indexes are initialized to 0. Now, a certain percentage of random indexes are filled with -1 which means that they are impassable/ act like a wall. The user also inputs a certain target index say (i,j) from where he starts and his goal is to reach index (0,0) by taking the shortest path possible.
To find the shortest path, I have to check the neighbours of each cell, starting from the target location. For each neighbour, I set its value to the current cell's value plus 1. Refer to my figure for more details. I have the code that calculates the shortest path, but I'm stuck on this incrementing part. I tried writing some code, but it doesn't seem to work. Any help would be appreciated.
GRID is generated in the following way:
1 is the user input location, and the goal is to reach X i.e 0,0
X 0 0 0 0 0 0 0 0 -1
0 0 0 -1 -1 0 0 0 0 0
0 0 0 0 -1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 -1
0 0 0 0 0 0 0 1 0 0
Starting by incrementing
X 0 0 0 0 0 0 0 0 -1
0 0 0 -1 -1 0 0 0 0 0
0 0 0 0 -1 3 3 3 3 3
0 0 0 0 0 3 2 2 2 -1
0 0 0 0 0 3 2 1 2 3
I have only shown it up to 3, but it keeps on going until index 0,0 is reached.
void waveAlgorithm(int *array, int height, int width, int x, int y)
{
while (array != NULL)
{
// Assume that index 0 0 is never 1
if (currX == 0 && currY == 0){
break;
}
// Check South
int currX = x;
int currY = y + 1;
if (currX < width && currX > 0 && currY < height && currY >= 0)
{
if (*(array + currX * width + currY) == 0)
{
(*(array + currX * width + currY))++;
}
}
// Check North
currX = x;
currY = y - 1;
if (currX < width && currX > 0 && currY < height && currY >= 0)
{
if (*(array + currX * width + currY) != -1)
{
(*(array + currX * width + currY))++;
}
}
// Check West
currX = x - 1;
currY = y;
if (currX < width && currX > 0 && currY < height && currY >= 0)
{
if (*(array + currX * width + currY) != -1)
{
(*(array + currX * width + currY))++;
}
}
// Check East
currX = x + 1;
currY = y;
if (currX < width && currX > 0 && currY < height && currY >= 0)
{
if (*(array + currX * width + currY) != -1)
{
(*(array + currX * width + currY))++;
}
}
}
}
I am kinda stuck while implementing this program, especially for the combined directions, i.e. North East, South East, etc. I tried writing a recursive program but couldn't figure out how to increment the cells:
waveAlgorithm(int *arr)
{
if(index is 0,0)
return;
waveAlgorithm(int[i+1][j]);
waveAlgorithm(int[i][j+1]);
waveAlgorithm(int[i-1][j]);
waveAlgorithm(int[i][j-1]);
}
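One common way to do the incrementing step is a breadth-first search with a queue rather than recursion, since a queue visits cells in order of distance. The sketch below is only an illustration under a few assumptions: the grid is a `std::vector` of rows instead of a raw `int*`, -1 marks a wall, 0 marks an unvisited cell, and all eight neighbours count as adjacent (which is what the figure in the question shows). The function name `grassfire` is made up for this example.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// Label every reachable free cell with its wave number, starting at 1 on the
// start cell. Walls (-1) are left untouched; unreachable cells stay 0.
void grassfire(std::vector<std::vector<int>>& grid, int start_row, int start_col)
{
    const int rows = static_cast<int>(grid.size());
    assert(rows > 0);
    const int cols = static_cast<int>(grid[0].size());
    assert(start_row >= 0 && start_row < rows);
    assert(start_col >= 0 && start_col < cols);
    assert(grid[start_row][start_col] != -1);

    // Offsets of the eight neighbours (N, S, E, W and the four diagonals).
    const int dr[8] = {-1, -1, -1, 0, 0, 1, 1, 1};
    const int dc[8] = {-1, 0, 1, -1, 1, -1, 0, 1};

    std::queue<std::pair<int, int>> frontier;
    grid[start_row][start_col] = 1;
    frontier.push(std::make_pair(start_row, start_col));

    while (!frontier.empty()) {
        const std::pair<int, int> cur = frontier.front();
        frontier.pop();
        for (int k = 0; k < 8; ++k) {
            const int r = cur.first + dr[k];
            const int c = cur.second + dc[k];
            if (r < 0 || r >= rows || c < 0 || c >= cols) continue;  // off the grid
            if (grid[r][c] != 0) continue;  // wall, or already labelled by an earlier wave
            grid[r][c] = grid[cur.first][cur.second] + 1;
            frontier.push(std::make_pair(r, c));
        }
    }
}
```

Because each cell is labelled the first time it is reached, its value is one more than the smallest number of moves from the start. The shortest path can then be read off by walking from (0,0) to any neighbour whose value is one less, repeating until the value 1 is reached.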

How to reverse an Adjacency list?

Let's say my adjacency list is as follows:
0 : 1
1 : 0
Its reverse should be empty, because both arcs already exist in the initial list:
0:
1:
or
0 : 1
1 :
becomes
0 :
1 : 0
another example:
0 :
1 :
becomes
0 : 1
1 : 0
last example
0 : 1, 4
1 : 0, 4
2 : 0, 1, 3, 4
3 :
4 : 3, 1
becomes
0 : 2, 3
1 : 2, 3
2 :
3 : 0, 1, 2, 4
4 : 0, 2
Is there an algorithm to do this?
Graphe *Graphe::grapheInverse( void ){
Graphe *r = new Graphe (_adjacences.size() );
for (unsigned i = 0; i < _adjacences.size(); i++)
for ( unsigned j = 0; j < _adjacences[i]->size(); j++ )
if ( (*_adjacences[i])[j] == 1 ) // or 0 or 2 or 3 or 4 or ... like last example
//r->addArcs( i, (*_adjacences[i])[j] ); //adds an arc from A to B
return r;
}
Just invert the adjacency matrix.
Wherever you have a 0, replace it with a 1, and vice versa. The only exception is the diagonal: always put a 0 there.
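The same inversion can be done directly on an adjacency list without building the full matrix. The following is a minimal sketch, assuming the graph is stored as a plain `std::vector` of neighbour lists; the `AdjacencyList` alias and the `complement` function are hypothetical names, not members of the `Graphe` class from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using AdjacencyList = std::vector<std::vector<std::size_t>>;

// Build the complement graph: an arc from -> to exists in the result exactly
// when it does not exist in the input. Self-loops are never added.
AdjacencyList complement(const AdjacencyList& graph)
{
    const std::size_t n = graph.size();
    AdjacencyList result(n);
    for (std::size_t from = 0; from < n; ++from) {
        // One row of the adjacency matrix, rebuilt on the fly.
        std::vector<bool> has_arc(n, false);
        for (std::size_t to : graph[from]) {
            assert(to < n);
            has_arc[to] = true;
        }
        for (std::size_t to = 0; to < n; ++to) {
            if (to != from && !has_arc[to]) {
                result[from].push_back(to);
            }
        }
    }
    return result;
}
```

Applied to the last example in the question, this produces exactly the expected lists, with each neighbour list in ascending order.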

Turning a 2D array of different color into a single color optimally

I am trying to find a solution of the puzzle game 'Flood It'. The main idea is to turn a whole N*M game board of k different colors into a single color. I have to start from the top left corner of the board and turn the same colored block into one of the colors of neighboring nodes and thus moving ahead and flooding the whole board into a single color at last. For example:
Initial Board:
1 1 1 2 2 3
1 1 2 3 4 5
1 1 1 1 3 4
1 4 3 2 1 5
2 3 4 5 1 2
Final Board:
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
where 1,2,3,4,5 represents different colors. I have prepared a C++ code for finding out the area of same colored block at any position of the board . This can be applied at the top left cell at first and then at the neighboring nodes of it to flood the color. My code is as follows:
#include <cstdint>
#include <vector>
#include <queue>
#include <string>
#include <iostream>
typedef std::vector<int32_t> vec_1d;
typedef std::vector<vec_1d> vec_2d;
// Print the 2d vector with a label
void dump(std::string const& label, vec_2d const& v)
{
std::cout << label << "\n";
for (std::size_t y(0); y < v.size(); ++y) {
for (std::size_t x(0); x < v[0].size(); ++x) {
std::cout << v[y][x] << " ";
}
std::cout << "\n";
}
std::cout << "\n";
}
// Recursive implementation of the search
void find_connected_r(int32_t target_color
, std::size_t x
, std::size_t y
, vec_2d const& colors
, vec_2d& result)
{
if ((result[y][x] == 1) || (colors[y][x] != target_color)) {
return;
}
result[y][x] = 1;
std::size_t width(colors[0].size());
std::size_t height(colors.size());
if (x > 0) {
find_connected_r(target_color, x - 1, y, colors, result);
}
if (y > 0) {
find_connected_r(target_color, x, y - 1, colors, result);
}
if (x < (width - 1)) {
find_connected_r(target_color, x + 1, y, colors, result);
}
if (y < (height - 1)) {
find_connected_r(target_color, x, y + 1, colors, result);
}
}
// Entry point to the search, select the implementation with last param
vec_2d find_connected(std::size_t x, std::size_t y, vec_2d const& colors, bool recursive)
{
if (colors.empty() || colors[0].empty()) {
throw std::runtime_error("Invalid input array size");
}
int32_t target_color(colors[y][x]);
vec_2d result(colors.size(), vec_1d(colors[0].size(), 0));
if (recursive) {
find_connected_r(target_color, x, y, colors, result);
}
else {
find_connected(target_color, x, y, colors, result);
}
return result;
}
void dump_coordinates(std::string const& label, vec_2d const& v)
{
std::cout << label << "\n";
for (std::size_t y(0); y < v.size(); ++y) {
for (std::size_t x(0); x < v[0].size(); ++x) {
if (v[y][x]) {
std::cout << "(" << x << ", " << y << ") ";
}
}
}
std::cout << "\n";
}
int main()
{
vec_2d colors{
{ 1, 1, 1, 1, 1, 1 }
, { 2, 2, 2, 3, 3, 1 }
, { 1, 1, 1, 1, 3, 1 }
, { 1, 3, 3, 3, 3, 1 }
, { 1, 1, 1, 1, 1, 1 }
};
}
How will I turn the whole board/matrix into a single color by examining the neighboring nodes?
A possible top-level algorithm to solve this puzzle is to repeat the following until there is only one color on the whole board:
Find all contiguous color regions. Treat the region at (0,0) as primary, all others as secondary.
Pick the largest (by count of tiles) secondary region with a color that is different to the primary region's color. Let's name the color of this secondary region the new_color.
Recolor the primary region to new_color.
Finding all the regions
We should keep a cumulative_mask to keep track of all the tiles that are already identified as part of some region.
First we find the primary region, starting search at (0,0), and update our cumulative_mask with the result.
Then repeat until no more regions can be found:
Find the position of the first zero tile in the cumulative_mask that has at least one neighboring non-zero tile in the primary region mask.
Find the region starting at this position.
Update the cumulative_mask with the mask of this region.
Selecting the color
Simply iterate through secondary regions, and find the region with largest count, which has a different color than the primary region.
Code
(also on coliru)
Note: Intentionally written in a way to make it possible to understand the algorithm. This could definitely be refactored, and it's missing a lot of error checking.
#include <cstdint>
#include <vector>
#include <queue>
#include <stdexcept> // std::runtime_error
#include <string>
#include <utility> // std::pair
#include <iostream>
typedef std::vector<int32_t> vec_1d;
typedef std::vector<vec_1d> vec_2d;
typedef std::pair<std::size_t, std::size_t> position;
position const INVALID_POSITION(-1, -1);
int32_t const INVALID_COLOR(0);
// ============================================================================
struct region_info
{
int32_t color;
vec_2d mask;
std::size_t count() const
{
std::size_t result(0);
for (std::size_t y(0); y < mask.size(); ++y) {
for (std::size_t x(0); x < mask[0].size(); ++x) {
if (mask[y][x]) {
++result;
}
}
}
return result;
}
};
struct region_set
{
// The region that contains (0, 0)
region_info primary;
// All other regions
std::vector<region_info> secondary;
};
// ============================================================================
// Print the 2D vector with a label
void dump(std::string const& label, vec_2d const& v)
{
std::cout << label << "\n";
for (std::size_t y(0); y < v.size(); ++y) {
for (std::size_t x(0); x < v[0].size(); ++x) {
std::cout << v[y][x] << " ";
}
std::cout << "\n";
}
std::cout << "\n";
}
// Print the coordinates of non-zero elements of 2D vector with a label
void dump_coordinates(std::string const& label, vec_2d const& v)
{
std::cout << label << "\n";
for (std::size_t y(0); y < v.size(); ++y) {
for (std::size_t x(0); x < v[0].size(); ++x) {
if (v[y][x]) {
std::cout << "(" << x << ", " << y << ") ";
}
}
}
std::cout << "\n";
}
void dump(region_info const& ri)
{
std::cout << "Region color: " << ri.color << "\n";
std::cout << "Region count: " << ri.count() << "\n";
dump("Region mask:", ri.mask);
}
void dump(region_set const& rs)
{
std::cout << "Primary Region\n" << "\n";
dump(rs.primary);
for (std::size_t i(0); i < rs.secondary.size(); ++i) {
std::cout << "Secondary Region #" << i << "\n";
dump(rs.secondary[i]);
}
}
// ============================================================================
// Find connected tiles - implementation
void find_connected(int32_t target_color
, std::size_t x
, std::size_t y
, vec_2d const& colors
, vec_2d& result)
{
std::size_t width(colors[0].size());
std::size_t height(colors.size());
std::queue<position> s;
s.push(position(x, y));
while (!s.empty()) {
position pos(s.front());
s.pop();
if (result[pos.second][pos.first] == 1) {
continue;
}
if (colors[pos.second][pos.first] != target_color) {
continue;
}
result[pos.second][pos.first] = 1;
if (pos.first > 0) {
s.push(position(pos.first - 1, pos.second));
}
if (pos.second > 0) {
s.push(position(pos.first, pos.second - 1));
}
if (pos.first < (width - 1)) {
s.push(position(pos.first + 1, pos.second));
}
if (pos.second < (height - 1)) {
s.push(position(pos.first, pos.second + 1));
}
}
}
// Find connected tiles - convenience wrapper
vec_2d find_connected(std::size_t x, std::size_t y, vec_2d const& colors)
{
if (colors.empty() || colors[0].empty()) {
throw std::runtime_error("Invalid input array size");
}
int32_t target_color(colors[y][x]);
vec_2d result(colors.size(), vec_1d(colors[0].size(), 0));
find_connected(target_color, x, y, colors, result);
return result;
}
// ============================================================================
// Change color of elements at positions with non-zero mask value to new color
vec_2d& change_masked(int32_t new_color
, vec_2d& colors
, vec_2d const& mask)
{
for (std::size_t y(0); y < mask.size(); ++y) {
for (std::size_t x(0); x < mask[0].size(); ++x) {
if (mask[y][x]) {
colors[y][x] = new_color;
}
}
}
return colors;
}
// Combine two masks
vec_2d combine(vec_2d const& v1, vec_2d const& v2)
{
vec_2d result(v1);
for (std::size_t y(0); y < v2.size(); ++y) {
for (std::size_t x(0); x < v2[0].size(); ++x) {
if (v2[y][x]) {
result[y][x] = v2[y][x];
}
}
}
return result;
}
// Find position of first zero element in mask
position find_first_zero(vec_2d const& mask)
{
for (std::size_t y(0); y < mask.size(); ++y) {
for (std::size_t x(0); x < mask[0].size(); ++x) {
if (!mask[y][x]) {
return position(x, y);
}
}
}
return INVALID_POSITION;
}
bool has_nonzero_neighbor(std::size_t x, std::size_t y, vec_2d const& mask)
{
bool result(false);
if (x > 0) {
result |= (mask[y][x - 1] != 0);
}
if (y > 0) {
result |= (mask[y - 1][x] != 0);
}
if (x < (mask[0].size() - 1)) {
result |= (mask[y][x + 1] != 0);
}
if (y < (mask.size() - 1)) {
result |= (mask[y + 1][x] != 0);
}
return result;
}
// Find position of first zero element in mask
// which neighbors at least one non-zero element in primary mask
position find_first_zero_neighbor(vec_2d const& mask, vec_2d const& primary_mask)
{
for (std::size_t y(0); y < mask.size(); ++y) {
for (std::size_t x(0); x < mask[0].size(); ++x) {
if (!mask[y][x]) {
if (has_nonzero_neighbor(x, y, primary_mask)) {
return position(x, y);
}
}
}
}
return INVALID_POSITION;
}
// ============================================================================
// Find all contiguous color regions in the image
// The region starting at (0,0) is considered the primary region
// All other regions are secondary
// If parameter 'only_neighbors' is true, search only for regions
// adjacent to primary region, otherwise search the entire board
region_set find_all_regions(vec_2d const& colors, bool only_neighbors = false)
{
region_set result;
result.primary.color = colors[0][0];
result.primary.mask = find_connected(0, 0, colors);
vec_2d cumulative_mask = result.primary.mask;
for (;;) {
position pos;
if (only_neighbors) {
pos = find_first_zero_neighbor(cumulative_mask, result.primary.mask);
} else {
pos = find_first_zero(cumulative_mask);
}
if (pos == INVALID_POSITION) {
break; // No unsearched tiles left
}
region_info reg;
reg.color = colors[pos.second][pos.first];
reg.mask = find_connected(pos.first, pos.second, colors);
cumulative_mask = combine(cumulative_mask, reg.mask);
result.secondary.push_back(reg);
}
return result;
}
// ============================================================================
// Select the color to recolor the primary region with
// based on the color of the largest secondary region of non-primary color
int32_t select_color(region_set const& rs)
{
int32_t selected_color(INVALID_COLOR);
std::size_t selected_count(0);
for (auto const& ri : rs.secondary) {
if (ri.color != rs.primary.color) {
if (ri.count() > selected_count) {
selected_count = ri.count();
selected_color = ri.color;
}
}
}
return selected_color;
}
// ============================================================================
// Solve the puzzle
// If parameter 'only_neighbors' is true, search only for regions
// adjacent to primary region, otherwise search the entire board
// Returns the list of selected colors representing the solution steps
vec_1d solve(vec_2d colors, bool only_neighbors = false)
{
vec_1d selected_colors;
for (int32_t i(0);; ++i) {
std::cout << "Step #" << i << "\n";
dump("Game board: ", colors);
region_set rs(find_all_regions(colors, only_neighbors));
dump(rs);
int32_t new_color(select_color(rs));
if (new_color == INVALID_COLOR) {
break;
}
std::cout << "Selected color: " << new_color << "\n";
selected_colors.push_back(new_color);
change_masked(new_color, colors, rs.primary.mask);
std::cout << "\n------------------------------------\n\n";
}
return selected_colors;
}
// ============================================================================
int main()
{
vec_2d colors{
{ 1, 1, 1, 1, 1, 1 }
, { 2, 2, 2, 3, 3, 1 }
, { 1, 1, 4, 5, 3, 1 }
, { 1, 3, 3, 4, 3, 1 }
, { 1, 1, 1, 1, 1, 1 }
};
vec_1d steps(solve(colors, true));
std::cout << "Solved in " << steps.size() << " step(s):\n";
for (auto step : steps) {
std::cout << step << " ";
}
std::cout << "\n\n";
}
// ============================================================================
Output of the program:
Step #0
Game board:
1 1 1 1 1 1
2 2 2 3 3 1
1 1 4 5 3 1
1 3 3 4 3 1
1 1 1 1 1 1
Primary Region
Region color: 1
Region count: 18
Region mask:
1 1 1 1 1 1
0 0 0 0 0 1
1 1 0 0 0 1
1 0 0 0 0 1
1 1 1 1 1 1
Secondary Region #0
Region color: 2
Region count: 3
Region mask:
0 0 0 0 0 0
1 1 1 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Secondary Region #1
Region color: 3
Region count: 4
Region mask:
0 0 0 0 0 0
0 0 0 1 1 0
0 0 0 0 1 0
0 0 0 0 1 0
0 0 0 0 0 0
Secondary Region #2
Region color: 4
Region count: 1
Region mask:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 1 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Secondary Region #3
Region color: 3
Region count: 2
Region mask:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1 1 0 0 0
0 0 0 0 0 0
Secondary Region #4
Region color: 4
Region count: 1
Region mask:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 1 0 0
0 0 0 0 0 0
Selected color: 3
------------------------------------
Step #1
Game board:
3 3 3 3 3 3
2 2 2 3 3 3
3 3 4 5 3 3
3 3 3 4 3 3
3 3 3 3 3 3
Primary Region
Region color: 3
Region count: 24
Region mask:
1 1 1 1 1 1
0 0 0 1 1 1
1 1 0 0 1 1
1 1 1 0 1 1
1 1 1 1 1 1
Secondary Region #0
Region color: 2
Region count: 3
Region mask:
0 0 0 0 0 0
1 1 1 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Secondary Region #1
Region color: 4
Region count: 1
Region mask:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 1 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Secondary Region #2
Region color: 5
Region count: 1
Region mask:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 1 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Secondary Region #3
Region color: 4
Region count: 1
Region mask:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 1 0 0
0 0 0 0 0 0
Selected color: 2
------------------------------------
Step #2
Game board:
2 2 2 2 2 2
2 2 2 2 2 2
2 2 4 5 2 2
2 2 2 4 2 2
2 2 2 2 2 2
Primary Region
Region color: 2
Region count: 27
Region mask:
1 1 1 1 1 1
1 1 1 1 1 1
1 1 0 0 1 1
1 1 1 0 1 1
1 1 1 1 1 1
Secondary Region #0
Region color: 4
Region count: 1
Region mask:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 1 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Secondary Region #1
Region color: 5
Region count: 1
Region mask:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 1 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Secondary Region #2
Region color: 4
Region count: 1
Region mask:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 1 0 0
0 0 0 0 0 0
Selected color: 4
------------------------------------
Step #3
Game board:
4 4 4 4 4 4
4 4 4 4 4 4
4 4 4 5 4 4
4 4 4 4 4 4
4 4 4 4 4 4
Primary Region
Region color: 4
Region count: 29
Region mask:
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 0 1 1
1 1 1 1 1 1
1 1 1 1 1 1
Secondary Region #0
Region color: 5
Region count: 1
Region mask:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 1 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Selected color: 5
------------------------------------
Step #4
Game board:
5 5 5 5 5 5
5 5 5 5 5 5
5 5 5 5 5 5
5 5 5 5 5 5
5 5 5 5 5 5
Primary Region
Region color: 5
Region count: 30
Region mask:
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
Solved in 4 step(s):
3 2 4 5
There's a bunch of things I don't understand in your code so instead of trying to fix them I'll create a new function and you can compare the two.
// this function is the recursive flood fill: it recolors the connected
// region of old_color that contains (x, y) to target_color
void flood_fill(int x, int y, const int old_color, const int target_color, vec_2d& colors)
{
    // stop at the board edges
    if (x < 0 || y < 0 || x >= (int)colors.size() || y >= (int)colors[0].size()) return;
    // stop at tiles that are not part of the region being recolored
    if (colors[x][y] != old_color) return;
    // fill in the current tile first so it is not visited again
    colors[x][y] = target_color;
    // visit all four neighbors: a region can wrap around other tiles,
    // so going only right and down would miss parts of it
    flood_fill(x+1, y, old_color, target_color, colors);
    flood_fill(x-1, y, old_color, target_color, colors);
    flood_fill(x, y+1, old_color, target_color, colors);
    flood_fill(x, y-1, old_color, target_color, colors);
}
// this function is called when the user inputs the x and y values
// the colors vector will be modified in place by reference
void change_color(int x, int y, vec_2d& colors)
{
    int target_color = colors[x][y];
    int old_color = colors[0][0];
    // nothing to do if the top left region already has the chosen color
    // (this also keeps the recursion from running forever)
    if (target_color == old_color) return;
    // call the recursive flood fill function
    flood_fill(0, 0, old_color, target_color, colors);
}
EDIT: Since you meant you wanted to solve the game instead of implementing the game...
Keep track of which colors are still available on the board at all times. On each "turn", find the color that will fill the most tile area starting from the top left. Repeat until all tiles are filled with the same color.
This is more of a brute force approach, and there is probably a more optimized method, but this is the most basic one in my opinion.
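To make the greedy idea above concrete, here is a minimal sketch: on each turn it tries every color present on the board and keeps the one that leaves the largest region at the top left. It is a heuristic and does not guarantee the minimum number of moves. The names `Board`, `Cell`, `primary_region`, `recolor` and `solve_greedy` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using Board = std::vector<std::vector<int>>;
using Cell = std::pair<int, int>;  // (row, column)

// Collect the cells of the 4-connected region that contains the top left tile.
std::vector<Cell> primary_region(const Board& board)
{
    assert(!board.empty() && !board[0].empty());
    const int rows = static_cast<int>(board.size());
    const int cols = static_cast<int>(board[0].size());
    const int color = board[0][0];
    const int dr[4] = {1, -1, 0, 0};
    const int dc[4] = {0, 0, 1, -1};

    std::vector<std::vector<bool>> seen(rows, std::vector<bool>(cols, false));
    std::vector<Cell> pending{Cell(0, 0)};
    std::vector<Cell> region;
    seen[0][0] = true;
    while (!pending.empty()) {
        const Cell cur = pending.back();
        pending.pop_back();
        region.push_back(cur);
        for (int k = 0; k < 4; ++k) {
            const int r = cur.first + dr[k];
            const int c = cur.second + dc[k];
            if (r < 0 || r >= rows || c < 0 || c >= cols) continue;
            if (seen[r][c] || board[r][c] != color) continue;
            seen[r][c] = true;
            pending.push_back(Cell(r, c));
        }
    }
    return region;
}

// Recolor the top left region in place.
void recolor(Board& board, int color)
{
    for (const Cell& cell : primary_region(board)) {
        board[cell.first][cell.second] = color;
    }
}

// Greedy solver: returns the sequence of colors chosen, one per turn.
std::vector<int> solve_greedy(Board board)
{
    std::vector<int> moves;
    const std::size_t total = board.size() * board[0].size();
    while (primary_region(board).size() < total) {
        std::set<int> colors;  // colors still present on the board
        for (const std::vector<int>& row : board) colors.insert(row.begin(), row.end());

        int best_color = board[0][0];
        std::size_t best_size = 0;
        for (int color : colors) {
            if (color == board[0][0]) continue;
            Board trial = board;  // try the move on a copy
            recolor(trial, color);
            const std::size_t size = primary_region(trial).size();
            if (size > best_size) {
                best_size = size;
                best_color = color;
            }
        }
        recolor(board, best_color);
        moves.push_back(best_color);
    }
    return moves;
}
```

The loop always terminates: as long as the board is not uniform, some tile of another color touches the top left region, so the best move strictly enlarges it.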

Converting 1-bit bmp file to array in C/C++ [closed]

I'm looking to turn a 1-bit bmp file of variable height/width into a simple two-dimensional array with values of either 0 or 1. I don't have any experience with image editing in code and most libraries that I've found involve higher bit-depth than what I need. Any help regarding this would be great.
Here's the code to read a monochrome .bmp file
(See dmb's answer below for a small fix for odd-sized .bmps)
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
unsigned char *read_bmp(char *fname,int* _w, int* _h)
{
unsigned char head[54];
FILE *f = fopen(fname,"rb");
// BMP header is 54 bytes
fread(head, 1, 54, f);
int w = head[18] + ( ((int)head[19]) << 8) + ( ((int)head[20]) << 16) + ( ((int)head[21]) << 24);
int h = head[22] + ( ((int)head[23]) << 8) + ( ((int)head[24]) << 16) + ( ((int)head[25]) << 24);
// lines are aligned on 4-byte boundary
int lineSize = (w / 8 + (w / 8) % 4);
int fileSize = lineSize * h;
unsigned char *img = malloc(w * h), *data = malloc(fileSize);
// skip the header
fseek(f,54,SEEK_SET);
// skip palette - two rgb quads, 8 bytes
fseek(f, 8, SEEK_CUR);
// read data
fread(data,1,fileSize,f);
// decode bits
int i, j, k, rev_j;
for(j = 0, rev_j = h - 1; j < h ; j++, rev_j--) {
for(i = 0 ; i < w / 8; i++) {
int fpos = j * lineSize + i, pos = rev_j * w + i * 8;
for(k = 0 ; k < 8 ; k++)
img[pos + (7 - k)] = (data[fpos] >> k ) & 1;
}
}
free(data);
*_w = w; *_h = h;
return img;
}
int main()
{
int w, h, i, j;
unsigned char* img = read_bmp("test1.bmp", &w, &h);
for(j = 0 ; j < h ; j++)
{
for(i = 0 ; i < w ; i++)
printf("%c ", img[j * w + i] ? '0' : '1' );
printf("\n");
}
return 0;
}
It is plain C, so no pointer casting - beware while using it in C++.
The biggest problem is that the lines in .bmp files are 4-byte aligned, which matters a lot with single-bit images. The code calculates the line size as "width / 8 + (width / 8) % 4", which is right for widths such as 16 or 32 but not in general; the general row size for a 1-bit image is ((width + 31) / 32) * 4 bytes. Each byte contains 8 pixels, not one, so we use the k-based loop.
I hope the rest of the code is obvious - much has been written about the .bmp header and palette data (the 8 bytes which we skip).
Expected output:
0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 1 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 1 0 0 1 0 0 0
0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 0 0 0 0 1 0
0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0
0 0 0 1 0 1 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0
I tried the solution of Viktor Lapyov on a 20x20 test image:
But with his code, I get this output (slightly reformatted but you can see the problem):
The last 4 pixels are not read. The problem is here. (The last partial byte in a row is ignored.)
// decode bits
int i, j, k, rev_j;
for(j = 0, rev_j = h - 1; j < h ; j++, rev_j--) {
for(i = 0 ; i < w / 8; i++) {
int fpos = j * lineSize + i, pos = rev_j * w + i * 8;
for(k = 0 ; k < 8 ; k++)
img[pos + (7 - k)] = (data[fpos] >> k ) & 1;
}
}
I rewrote the inner loop like this:
// decode bits
int i, byte_ctr, j, rev_j;
for(j = 0, rev_j = h - 1; j < h ; j++, rev_j--) {
for( i = 0; i < w; i++) {
byte_ctr = i / 8;
unsigned char data_byte = data[j * lineSize + byte_ctr];
int pos = rev_j * w + i;
unsigned char mask = 0x80 >> i % 8;
img[pos] = (data_byte & mask ) ? 1 : 0;
}
}
and all is well:
The following C code works with monochrome bitmaps of any size. I'll assume you've got your bitmap in a buffer, with height and width initialized from the file:
// allocate mem for global buffer
if (!(img = malloc(h * w)) )
return(0);
int i, j, k, scanline;
// calc the scanline: bytes needed for w bits, rounded up
scanline = (w + 7) >> 3;
// monochrome rows are zero-padded at every line end
// so that each row occupies a multiple of 4 bytes
if (scanline % 4)
scanline += (4 - scanline % 4);
// loop and set the img values; the file stores rows bottom-up,
// so k walks the buffer rows from last to first
for (i = 0, k = h - 1; i < h; i++, k--)
for (j = 0; j < w; j++) {
img[j+i*w] = (buffer[(j>>3)+k*scanline]
& (0x80 >> (j % 8))) ? 1 : 0;
}
Hope this helps. Converting it to 2D is now a trivial matter. If you get lost, here is the math to map between the 1D array and 2D coordinates: suppose r and c are the row and column and w is the width, then
index c + r * w in the 1D array corresponds to (r, c)
If you have further remarks, let me know!
Let's think of a 1x7 monochrome bitmap, i.e. a bitmap of a straight line 7 pixels wide. Its 7 pixels fit into a single byte, but each row must occupy a multiple of 4 bytes, so the row is padded with 3 extra bytes.
So the biSizeImage member of the BITMAPINFOHEADER structure will show a total of 4 bytes. Nonetheless, the biHeight and biWidth members will correctly state the true bitmap dimensions.
The above code will fail because 7 / 8 = 0 (integer division in C truncates). Hence loop "i" will not execute, and neither will loop "k".
That means the vector "img" now contains garbage values that do not correspond to the pixels contained in "data", i.e. the result is incorrect.
And by inductive reasoning, if it does not satisfy the base case, then chances are it won't do much good for the general case.
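Putting the points from these answers together (4-byte row padding, partial last byte, bottom-up row order), a decoder might look like the sketch below. It is only an illustration: it works on a file already loaded into memory, handles only uncompressed bottom-up 1-bit files, takes the pixel data offset from the file header instead of assuming 54 + 8 bytes, and returns palette indices (0 or 1) rather than colors. The names `mono_stride`, `read_le32`, `Pixels` and `decode_mono_bmp` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes per stored row of a 1-bit BMP: width bits rounded up to whole 32-bit words.
std::size_t mono_stride(std::size_t width)
{
    return ((width + 31) / 32) * 4;
}

// Read a little-endian 32-bit value starting at pos.
static std::uint32_t read_le32(const std::vector<std::uint8_t>& b, std::size_t pos)
{
    assert(pos + 4 <= b.size());
    return static_cast<std::uint32_t>(b[pos])
         | static_cast<std::uint32_t>(b[pos + 1]) << 8
         | static_cast<std::uint32_t>(b[pos + 2]) << 16
         | static_cast<std::uint32_t>(b[pos + 3]) << 24;
}

using Pixels = std::vector<std::vector<std::uint8_t>>;  // rows of 0/1, top row first

// Returns an empty result if the buffer is not a 1-bit bottom-up BMP.
Pixels decode_mono_bmp(const std::vector<std::uint8_t>& file)
{
    Pixels image;
    if (file.size() < 54 || file[0] != 'B' || file[1] != 'M') return image;

    const std::uint64_t data_offset = read_le32(file, 10);                        // bfOffBits
    const std::int32_t raw_width  = static_cast<std::int32_t>(read_le32(file, 18)); // biWidth
    const std::int32_t raw_height = static_cast<std::int32_t>(read_le32(file, 22)); // biHeight
    const unsigned bit_count = file[28] | (file[29] << 8);                        // biBitCount
    // A negative height means a top-down file, which this sketch does not handle.
    if (bit_count != 1 || raw_width <= 0 || raw_height <= 0) return image;

    const std::size_t width  = static_cast<std::size_t>(raw_width);
    const std::size_t height = static_cast<std::size_t>(raw_height);
    const std::uint64_t stride = mono_stride(width);
    if (data_offset > file.size() || stride * height > file.size() - data_offset) return image;

    image.assign(height, std::vector<std::uint8_t>(width, 0));
    for (std::size_t row = 0; row < height; ++row) {
        // Stored rows run bottom-up, so stored row 0 is the last image row.
        const std::uint8_t* line = file.data() + data_offset + stride * row;
        std::vector<std::uint8_t>& out = image[height - 1 - row];
        for (std::size_t x = 0; x < width; ++x) {
            // The most significant bit of each byte is the leftmost pixel.
            out[x] = (line[x / 8] >> (7 - x % 8)) & 1;
        }
    }
    return image;
}
```

For the 7-pixel-wide case discussed above, `mono_stride(7)` is 4: one data byte plus three padding bytes, and all 7 pixels are decoded from the partial byte.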

Image downscaling algorithm

Could you help me find the right algorithm for image resizing? I have an image of a number. The maximum size is 200x200, I need to get an image with size 15x15 or even less. The image is monochrome (black and white) and the result should be the same. That's the info about my task.
I've already tried one algorithm, here it is
// xscale, yscale - decrease/increase rate
for (int f = 0; f<=49; f++)
{
for (int g = 0; g<=49; g++)//49+1 - final size
{
xpos = (int)f * xscale;
ypos = (int)g * yscale;
picture3[f][g]=picture4[xpos][ypos];
}
}
But it doesn't work for reducing an image, which is my primary goal.
Could you help me find an algorithm that solves this problem? (The quality needn't be perfect, and speed doesn't matter.) Some information about it would be great too, considering that I'm a newbie. Of course, a short piece of C/C++ code (or a library) would be perfect too.
Edit:
I've found an algorithm. Will it be suitable for compressing from 200 to 20?
The general approach is to filter the input to generate a smaller size, and threshold to convert to monochrome. The easiest filter to implement is a simple average, and it often produces OK results. The Sinc filter is theoretically the best but it's impractical to implement and has ringing artifacts which are often undesirable. Many other filters are available, such as Lanczos or Tent (which is the generalized form of Bilinear).
Here's a version of an average filter combined with thresholding. It assumes picture4 is the input with pixel values of 0 or 1, and that the output picture3 uses the same format. I also assumed that x is the least significant dimension (i.e. picture4[y][x]), which is opposite to the usual mathematical notation, and opposite to the coordinates in your question.
int thumbwidth = 15;
int thumbheight = 15;
double xscale = (thumbwidth+0.0) / width;
double yscale = (thumbheight+0.0) / height;
double threshold = 0.5 / (xscale * yscale);
double yend = 0.0;
for (int f = 0; f < thumbheight; f++) // y on output
{
    double ystart = yend;
    yend = (f + 1) / yscale;
    if (yend >= height) yend = height - 0.000001;
    double xend = 0.0;
    for (int g = 0; g < thumbwidth; g++) // x on output
    {
        double xstart = xend;
        xend = (g + 1) / xscale;
        if (xend >= width) xend = width - 0.000001;
        double sum = 0.0;
        for (int y = (int)ystart; y <= (int)yend; ++y)
        {
            double yportion = 1.0;
            if (y == (int)ystart) yportion -= ystart - y;
            if (y == (int)yend) yportion -= y+1 - yend;
            for (int x = (int)xstart; x <= (int)xend; ++x)
            {
                double xportion = 1.0;
                if (x == (int)xstart) xportion -= xstart - x;
                if (x == (int)xend) xportion -= x+1 - xend;
                sum += picture4[y][x] * yportion * xportion;
            }
        }
        picture3[f][g] = (sum > threshold) ? 1 : 0;
    }
}
I've now tested this code. Here's the input 200x200 image, followed by a nearest-neighbor reduction to 15x15 (done in Paint Shop Pro), followed by the results of this code. I'll leave you to decide which is more faithful to the original; the difference would be much more obvious if the original had some fine detail.
To properly downscale an image, you should divide your image up into square blocks of pixels and then use something like Bilinear Interpolation in order to find the right color of the pixel that should replace the NxN block of pixels you're doing the interpolation on.
Since I'm not so good at the math involved, I'm not going to try to give you an example of what the code would look like. Sorry :(
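For the special case where the source size is an exact multiple of the target size, the block idea can be sketched without any interpolation maths. Note that this is a plain block average followed by a threshold, not bilinear interpolation, and it assumes a row-major array with one byte per pixel holding 0 or 1. The function name and memory layout are illustrative only.

```c
#include <stddef.h>

/* Shrink a monochrome image (one byte per pixel, values 0 or 1, row-major)
 * by an integer factor n: each n x n block becomes one output pixel, which
 * is set to 1 when at least half of the pixels in the block are set.
 * Assumes src_w and src_h are multiples of n, and that dst has room for
 * (src_w / n) * (src_h / n) bytes. */
void block_downscale(const unsigned char *src, size_t src_w, size_t src_h,
                     size_t n, unsigned char *dst)
{
    size_t dst_w = src_w / n;
    size_t dst_h = src_h / n;

    for (size_t by = 0; by < dst_h; by++) {
        for (size_t bx = 0; bx < dst_w; bx++) {
            size_t count = 0;   /* number of set pixels in this block */
            for (size_t y = 0; y < n; y++)
                for (size_t x = 0; x < n; x++)
                    count += src[(by * n + y) * src_w + (bx * n + x)];
            dst[by * dst_w + bx] = (2 * count >= n * n) ? 1 : 0;
        }
    }
}
```

Going from 200x200 to 20x20 would be a call with n = 10. Target sizes that do not divide the source evenly (such as 15x15) need fractional weights for the pixels on the edge of each block.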
Since you're fine with using a library, you could look into the ImageMagick C++ bindings (Magick++).
You could also output the image in a simple format like a pbm, and then call the imagemagick command to resize it:
system("convert input.pbm -resize 10x10 -compress none output.pbm");
Sample input file, as written out by your program (note: you don't need to use a new line for each row):
P1
20 20
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
The output file:
P1
10 10
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 0 1 1 0
0 0 0 0 1 0 0 1 1 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 1 1 0 1 1 0 0 0 0 0 0 1 1 1 1
1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0
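Producing the input file from an in-memory array might look like the sketch below. It assumes one byte per pixel with values 0 or 1 in row-major order; write_pbm is a hypothetical helper of mine, not part of ImageMagick. Lines are broken every 35 pixels because the plain PBM format asks for lines of at most 70 characters.

```c
#include <stdio.h>
#include <stddef.h>

/* Write a 0/1 image (one byte per pixel, row-major) as a plain-text PBM (P1).
 * In PBM, 1 means black and 0 means white.
 * Returns 0 on success, -1 on an I/O error. */
int write_pbm(FILE *out, const unsigned char *pixels, size_t width, size_t height)
{
    if (fprintf(out, "P1\n%zu %zu\n", width, height) < 0)
        return -1;

    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            /* End the line at the end of a row, or after 35 pixels,
             * so that no line exceeds 70 characters. */
            char sep = (x + 1 == width || (x + 1) % 35 == 0) ? '\n' : ' ';
            if (fprintf(out, "%d%c", pixels[y * width + x] ? 1 : 0, sep) < 0)
                return -1;
        }
    }
    return 0;
}
```

Open the target with fopen("input.pbm", "w"), pass the handle to write_pbm, and fclose it before running the convert command.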
I've found an implementation of bilinear interpolation in C.
Assuming that:
a - pointer to the primary array (the one we need to stretch/compress)
oldw - primary width
oldh - primary height
b - pointer to the secondary array (the one we get after compressing/stretching)
neww - secondary width
newh - secondary height
#include <stdio.h>
#include <math.h>
#include <sys/types.h>
void resample(void *a, void *b, int oldw, int oldh, int neww, int newh)
{
    int i;
    int j;
    int l;
    int c;
    float t;
    float u;
    float tmp;
    float d1, d2, d3, d4;
    u_int p1, p2, p3, p4; /* nearby pixels */
    u_char red, green, blue;
    for (i = 0; i < newh; i++) {
        for (j = 0; j < neww; j++) {
            tmp = (float) (i) / (float) (newh - 1) * (oldh - 1);
            l = (int) floor(tmp);
            if (l < 0) {
                l = 0;
            } else {
                if (l >= oldh - 1) {
                    l = oldh - 2;
                }
            }
            u = tmp - l;
            tmp = (float) (j) / (float) (neww - 1) * (oldw - 1);
            c = (int) floor(tmp);
            if (c < 0) {
                c = 0;
            } else {
                if (c >= oldw - 1) {
                    c = oldw - 2;
                }
            }
            t = tmp - c;
            /* coefficients */
            d1 = (1 - t) * (1 - u);
            d2 = t * (1 - u);
            d3 = t * u;
            d4 = (1 - t) * u;
            /* nearby pixels: a[i][j] */
            p1 = *((u_int*)a + (l * oldw) + c);
            p2 = *((u_int*)a + (l * oldw) + c + 1);
            p3 = *((u_int*)a + ((l + 1)* oldw) + c + 1);
            p4 = *((u_int*)a + ((l + 1)* oldw) + c);
            /* color components */
            blue = (u_char)p1 * d1 + (u_char)p2 * d2 + (u_char)p3 * d3 + (u_char)p4 * d4;
            green = (u_char)(p1 >> 8) * d1 + (u_char)(p2 >> 8) * d2 + (u_char)(p3 >> 8) * d3 + (u_char)(p4 >> 8) * d4;
            red = (u_char)(p1 >> 16) * d1 + (u_char)(p2 >> 16) * d2 + (u_char)(p3 >> 16) * d3 + (u_char)(p4 >> 16) * d4;
            /* new pixel R G B */
            *((u_int*)b + (i * neww) + j) = (red << 16) | (green << 8) | (blue);
        }
    }
}
Hope it will be useful for other users. Nevertheless, I still doubt whether it will work in my situation (compressing an array rather than stretching it). Any ideas?
I think you need interpolation. There are a lot of algorithms; for example, you can use bilinear interpolation.
If you use Win32, then the StretchBlt function may help.
The StretchBlt function copies a bitmap from a source rectangle into a destination rectangle, stretching or compressing the bitmap to fit the dimensions of the destination rectangle, if necessary. The system stretches or compresses the bitmap according to the stretching mode currently set in the destination device context.
One approach to downsizing a 200x200 image to, say 100x100, would be to take every 2nd pixel along each row and column. I'll leave you to roll your own code for downsizing to a size which is not a divisor of the original size. And I provide no warranty as to the suitability of this approach for your problem.
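For sizes that do not divide evenly, one possible way to "roll your own" is nearest-neighbour sampling, where each output pixel copies the source pixel at the proportional position. The sketch below assumes a row-major image with one byte per pixel; the function name decimate is made up for illustration. Like the every-2nd-pixel idea, it never looks at the skipped pixels, so thin strokes can disappear.

```c
#include <stddef.h>

/* Nearest-neighbour resize of a row-major, one-byte-per-pixel image.
 * Each output pixel copies the source pixel at the proportional position,
 * so 200 -> 100 picks every 2nd pixel; other ratios skip unevenly.
 * dst must have room for dst_w * dst_h bytes. */
void decimate(const unsigned char *src, size_t src_w, size_t src_h,
              unsigned char *dst, size_t dst_w, size_t dst_h)
{
    for (size_t y = 0; y < dst_h; y++) {
        size_t sy = y * src_h / dst_h;       /* integer division truncates */
        for (size_t x = 0; x < dst_w; x++) {
            size_t sx = x * src_w / dst_w;
            dst[y * dst_w + x] = src[sy * src_w + sx];
        }
    }
}
```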