C++ code apparently executing out of sequence


The code
This is a project on a Raspberry Pi using WiringPi. I have the following three member functions of a class template, along with pure virtuals for read() and write(). This base class is then subclassed by more specialized classes that provide the read() and write() functions (a sample is shown further below):
// IChip.hpp (Root abstract class)
class IChip {
public:
virtual bool test() noexcept = 0;
};
// End IChip.hpp
// IMemory.hpp (class of interest to the question)
template <typename TAddr, typename TWord>
class IMemory: public IChip {
protected:
...
TAddr m_wordCount;
TWord m_dataMax;
// ctor and dtor, and more member fields
public:
virtual TWord read(const TAddr addr) const noexcept = 0;
virtual void write(const TAddr addr, const TWord data) const noexcept = 0;
// accessors and whatnot ...
bool march(bool keepGoing = false) noexcept;
bool checkerboard(bool keepGoing = false) noexcept;
bool test() noexcept final override;
};
// End IMemory.hpp
// IMemory.cpp
template <typename TAddr, typename TWord>
bool IMemory<TAddr, TWord>::march(bool keepGoing) noexcept {
bool result = true;
TAddr i;
TWord r;
const uint64_t totalIter = (m_wordCount * 6) - 1;
uint64_t counter = 0;
std::cout << "Starting MARCH test." << std::endl;
for (i = 0; i < m_wordCount; i++) {
this->write(i, 0);
std::cout << '\r' << counter << " / " << totalIter << std::flush;
counter++;
}
for (i = 0; i < m_wordCount; i++) {
r = this->read(i);
if (r != 0) {
result = false;
if (!keepGoing)
return result;
}
this->write(i, m_dataMax);
std::cout << '\r' << counter << " / " << totalIter << std::flush;
counter++;
}
// 4 more similar loops
std::cout << std::endl;
std::cout << "MARCH test done." << std::endl;
return result;
}
template <typename TAddr, typename TWord>
bool IMemory<TAddr, TWord>::checkerboard(bool keepGoing) noexcept {
bool result = true;
TAddr i;
TWord r;
TWord curWord;
const uint64_t totalIter = (m_wordCount * 4) - 1;
uint64_t counter = 0;
std::cout << "Starting CHECKERBOARD test." << std::endl;
curWord = 0;
for (i = 0; i < m_wordCount; i++) {
this->write(i, curWord);
std::cout << '\r' << counter << " / " << totalIter << std::flush;
counter++;
curWord = curWord == 0 ? m_dataMax : 0;
}
curWord = 0;
for (i = 0; i < m_wordCount; i++) {
r = this->read(i);
if (r != curWord) {
result = false;
if (!keepGoing)
return result;
}
std::cout << '\r' << counter << " / " << totalIter << std::flush;
counter++;
curWord = curWord == 0 ? m_dataMax : 0;
}
// 2 more similar loops ...
std::cout << std::endl;
std::cout << "CHECKERBOARD test done." << std::endl;
return result;
}
template <typename TAddr, typename TWord>
bool IMemory<TAddr, TWord>::test() noexcept {
bool march_result = this->march();
bool checkerboard_result = this->checkerboard();
bool result = march_result && checkerboard_result;
std::cout << "MARCH: " << (march_result ? "Passed" : "Failed") << std::endl;
std::cout << "CHECKERBOARD: " << (checkerboard_result ? "Passed" : "Failed") << std::endl;
return result;
}
// Explicit instantiation
template class IMemory<uint16_t, uint8_t>;
// End IMemory.cpp
// Sample read() and write() from HM62256, a subclass of IMemory<uint16_t, uint8_t>
// These really just bitbang onto / read data from pins with appropriate timings for each chip.
// m_data and m_address are instances of a Bus class, that is just a wrapper around an array of pins, provides bit-banging and reading functionality.
uint8_t HM62256::read(uint16_t addr) const noexcept {
uint8_t result = 0;
m_data->setMode(INPUT);
m_address->write(addr);
digitalWrite(m_CSPin, LOW);
digitalWrite(m_OEPin, LOW);
delayMicroseconds(1);
result = m_data->read();
digitalWrite(m_OEPin, HIGH);
digitalWrite(m_CSPin, HIGH);
delayMicroseconds(1);
return result;
}
void HM62256::write(uint16_t addr, uint8_t data) const noexcept {
digitalWrite(m_OEPin, HIGH);
delayMicroseconds(1);
m_address->write(addr);
delayMicroseconds(1);
m_data->setMode(OUTPUT);
m_data->write(data);
digitalWrite(m_CSPin, LOW);
digitalWrite(m_WEPin, LOW);
delayMicroseconds(1);
digitalWrite(m_WEPin, HIGH);
digitalWrite(m_CSPin, HIGH);
delayMicroseconds(1);
}
// main.cpp
void hm62256_test() {
const uint8_t ADDR_PINS[] = {4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18};
const uint8_t DATA_PINS[] = {19, 20, 21, 22, 23, 24, 25, 26};
Chiptools::Memory::HM62256 *device = new Chiptools::Memory::HM62256(ADDR_PINS, DATA_PINS, 2, 3, 27);
device->setup();
bool result = device->test();
std::cout << "Device " << ( result ? "passed all" : "failed some") << " tests." << std::endl;
delete device;
}
int main(int argc, char *argv[]) {
wiringPiSetupGpio();
hm62256_test();
}
The output
Now when I run this, sometimes it works just fine:
Starting MARCH test.
196607 / 196607
MARCH test done.
Starting CHECKERBOARD test.
131071 / 131071
CHECKERBOARD test done.
MARCH: Passed
CHECKERBOARD: Passed
Device passed all tests.
But randomly I will get this output:
Starting MARCH test.
67113 / 196607Starting CHECKERBOARD test.
33604 / 131071MARCH: Failed
CHECKERBOARD: Failed
Device failed some tests.
Toolchain info
gcc 8.3.0 arm-linux / C++14
Cmake 3.16.3
No threading.
Compiler & Linker flags:
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -fsanitize=address,leak,undefined")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=address,leak,undefined -static-libasan")
The issue and what I tried
I have a couple dozen chips. All of the chips work fine with a TL866ii programmer / tester. This happens with all of these chips. So that rules out the chips as a source of the issue.
Well, at first I thought maybe I'm not flushing the cout stream properly, but AFAIK std::endl does flush the output, so that's not that.
Next, I set a few breakpoints: (A) right before march() returns, (B) right where checkerboard() is called (2nd line in test()), (C) at the 1st line inside checkerboard().
When output was as expected, breakpoints were hit in this order A, B, C.
When output was not as expected, breakpoints were hit in this order B, C, A.
What it looks like is happening is, sometimes checkerboard() is called while march() is still running, causing random GPIO output at which point one or both tests fail.
While I'm looking for a solution to this, I'm more interested in some insight on what is happening. I would've thought that since my code is not making use of multithreading, and per my understanding of the C++ standard, statements are executed one by one to completion before the next statement is executed. I'm aware that some compiler implementations do reorder statements for optimization, but AFAIK it should not affect the semantics of my code. I might be wrong as that stuff is way over my head.

This might not be an answer, but it's too long for a comment.
Is A at the return statement after the "March test done" line?
I'm basing the following comments on this output:
Starting MARCH test.
67113 / 196607Starting CHECKERBOARD test.
33604 / 131071MARCH: Failed
CHECKERBOARD: Failed
Device failed some tests.
What appears to be happening is your MARCH test is failing in the 3rd loop, thus returning early (within the loop). Your Checkerboard test then fails within the 2nd loop, and also returns early. If A is at the position I mentioned, then I think it's just luck or a compiler quirk that that breakpoint is hit.
That is to say, logically, I wouldn't expect breakpoint A to be hit at all when the failure occurs, only B and C. I think A being hit at the end is probably down to how the program was compiled, and maybe some odd optimizations. Or perhaps it depends on where in the assembly the debugger puts the breakpoint; it might just be on a final instruction that is going to be executed anyway. Try putting the breakpoint on the std::cout line before the return and see if it's still hit.
To expand on your comment, this is what I'm seeing in the problem output:
Starting MARCH test.
67113 / 196607 [march() returns early] [checkerboard() starts] Starting CHECKERBOARD test.
33604 / 131071 [checkerboard() returns early] [test() reports results] MARCH: Failed
CHECKERBOARD: Failed
Device failed some tests.
All in all, I think the output will match your expectations if you change your return lines from this:
if (!keepGoing)
return result;
to something like this:
if (!keepGoing) {
std::cout << std::endl;
std::cout << "MARCH test failed." << std::endl;
return result;
}
Which I would expect to produce an output like this:
Starting MARCH test.
67113 / 196607
MARCH test failed.
Starting CHECKERBOARD test.
33604 / 131071
CHECKERBOARD test failed.
MARCH: Failed
CHECKERBOARD: Failed
Device failed some tests.


C++: Use future.get with timeout and without blocking

I have a main loop that needs to trigger async work and must not wait for it to finish. What I want is to check, on every iteration of the while loop, whether the async work is done.
This can be accomplished with future.wait_for().
Since I don't want to block the main loop, I can use future.wait_for(0).
So far so good.
In addition, I'd like to verify that I received (or didn't receive) an answer within X ms.
I can do that by checking how long it has been since I launched the "async", and checking which comes first: X ms passing or future_status::ready being returned.
My question: is this good practice, or is there a better way to do it?
Some more information:
Since the main loop must launch many different async jobs, I need a lot of duplicated code: every launch needs to "remember" the timestamp at which it was launched, and every time I check whether an async job is ready, I need to recalculate the time difference for each job. This might be quite a hassle.
For now, this is an example of what I described (it might have build errors):
#include <chrono>
#include <future>
#define MAX_TIMEOUT_MS 30
using namespace std::chrono;
bool myFunc()
{
    bool result = false;
    //do something for quite some time
    return result;
}
int main()
{
    int timeout_ms = MAX_TIMEOUT_MS;
    steady_clock::time_point start;
    bool async_return = false;
    std::future<bool> fut; // declared outside the loop so it survives between iterations
    std::future_status status = std::future_status::ready;
    long long delta_ms = 0;
    while(true) {
        // On first time, or once we have an answer, launch async again
        if (status == std::future_status::ready) {
            fut = std::async (std::launch::async, myFunc);
            start = steady_clock::now(); // record the start timestamp whenever we launch async()
        }
        // do something...
        status = fut.wait_for(std::chrono::seconds(0));
        // check how long since we launched async
        delta_ms = duration_cast<milliseconds>(steady_clock::now() - start).count();
        if (status == std::future_status::ready) {
            async_return = fut.get();
            // and we do something with the result
        } else if (delta_ms > timeout_ms) {
            break; // timed out without an answer
        }
    }
    return 0;
}

One thing you might want to consider: If your while loop doesn't do any relevant work, and just checks for task completion, you may be doing a busy-wait (https://en.wikipedia.org/wiki/Busy_waiting).
This means you are wasting a lot of CPU time doing useless work. It may sound counter-intuitive, but it can hurt how quickly you detect task completion, even though you are checking constantly!
This can happen because, to the OS, this thread looks like it is doing a lot of work, so it receives high priority for processing. That may make the other threads (the ones doing your async job) look less important and take longer to complete. Of course, this is not set in stone and anything can happen, but it is still a waste of CPU if you are not doing any other work in that loop.
wait_for(0) is not the best option, since it effectively interrupts the execution of this thread even if the work is not ready yet, and it may take longer than you expect to resume (https://en.cppreference.com/w/cpp/thread/future/wait_for). std::future doesn't seem to have a truly non-blocking API yet (see "C++ async programming, how to not wait for future?"), but you can use other resources such as a mutex and try_lock (http://www.cplusplus.com/reference/mutex/try_lock/).
That said, if your loop still does important work, this flow is OK to use. But you might want to have a queue of completed jobs to check, instead of a single future. This queue would only be consumed by your main thread and can be implemented with a non-blocking, thread-safe "try_get" call that fetches the next completed job. As others commented, you may want to wrap your timekeeping logic in a job dispatcher class or similar.
Maybe something like this (pseudo code!):
struct WorkInfo {
    time_type begin_at; // initialized on job dispatch
    time_type finished_at;
    // more info
};
thread_safe_vector<WorkInfo> finished_work;
void timed_worker_job() {
    WorkInfo info; // must exist before its fields are filled in
    info.begin_at = current_time();
    do_real_job_work();
    info.finished_at = current_time();
    finished_work.push(info);
}
int main() {
    ...
    while (app_loop)
    {
        dispatch_some_jobs();
        WorkInfo workTemp;
        while (finished_work.try_get(&workTemp)) // returns true if it fetched a finished job
        {
            handle_finished_job(workTemp);
        }
    }
    ...
}
And if you are not familiar with them, I also suggest reading about thread pools (https://en.wikipedia.org/wiki/Thread_pool) and the producer-consumer problem (https://en.wikipedia.org/wiki/Producer%E2%80%93consumer_problem).

The code below runs tasks async and checks later if they are finished.
I've added some fake work and waits to see the results.
#define MAX_TIMEOUT_MS 30
struct fun_t {
size_t _count;
bool finished;
bool result;
fun_t () : _count (9999), finished (false), result (false) {
}
fun_t (size_t c) : _count (c), finished (false), result (false) {
}
fun_t (const fun_t & f) : _count (f._count), finished (f.finished), result (f.result) {
}
fun_t (fun_t && f) : _count (f._count), finished (f.finished), result (f.result) {
}
~fun_t () {
}
const fun_t & operator= (fun_t && f) {
_count = f._count;
finished = f.finished;
result = f.result;
return *this;
}
void run ()
{
for (int i = 0; i < 50; ++i) {
cout << _count << " " << i << endl;
std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
result = true;
finished = true;
cout << " results: " << finished << ", " << result << endl;
}
operator bool () { return result; }
};
int main()
{
int timeout_ms = MAX_TIMEOUT_MS;
chrono::steady_clock::time_point start;
bool async_return = false;
std::future_status status = std::future_status::ready;
int delta_ms = 0;
std::map<size_t, fun_t> futs;
std::vector<std::future<void>> futfuncs;
size_t count = 0;
bool loop = true;
cout << "Begin --------------- " << endl;
while (loop) {
loop = false;
// On first time, or once we have an answer, launch async again
if (count < 3 && status == std::future_status::ready) {
//std::future<bool> fut = std::async (std::launch::async, myFunc);
futs[count] = std::move(fun_t(count));
//futs[futs.size() - 1].fut = std::async (std::launch::async, futs[futs.size() - 1]);
futfuncs.push_back (std::move(std::async(std::launch::async, &fun_t::run, &futs[count])));
}
// do something...
std::this_thread::sleep_for(std::chrono::seconds(2));
for (auto & f : futs) {
if (! f.second.finished) {
cout << " Not finished " << f.second._count << ", " << f.second.finished << endl;
loop = true;
} else {
bool aret = f.second;
cout << "Result: " << f.second._count << ", " << aret << endl;
}
}
++count;
}
for (auto & f : futs) {
cout << " Verify " << f.second._count << ", " << f.second.finished;
if (f.second.finished) {
bool aret = f.second;
cout << "; result: " << aret;
}
cout << endl;
}
cout << "End --------------- " << endl;
return 0;
}
After trimming the output (there are too many lines), you can see the tasks. The first number is the task id, the second the iteration number.
Begin ---------------
0 0
0 1
0 2
Not finished 0, 0
1 0
0 20
1 1
Not finished 0, 0
Not finished 1, 0
2 0
1 20
0 40
2 1
0 49 // here task 0 ends
2 10
1 30
results: 1, 1 // "run" function ends
1 39
Result: 0, 1 // this is the verification "for"
Not finished 1, 0
Not finished 2, 0
results: 1, 1
Result: 0, 1
Result: 1, 1
Result: 2, 1
Verify 0, 1; result: 1
Verify 1, 1; result: 1
Verify 2, 1; result: 1
End ---------------

Dynamic batch is not supported on Intel NCS2 vpu

I'm trying to run FP16 person-detection-retail-0013 and person-reidentification-retail-0079 on Intel Neural Compute Stick hardware, but once I run the application to load the nets on the device I get this exception:
[INFERENCE ENGINE EXCEPTION] Dynamic batch is not supported
I load the net with the max batch size set to 1, and I started my project from the pedestrian tracker demo in the OpenVINO toolkit:
main.cpp --> CreatePedestrianTracker
CnnConfig reid_config(reid_model, reid_weights);
reid_config.max_batch_size = 16;
try {
if (ie.GetConfig(deviceName, CONFIG_KEY(DYN_BATCH_ENABLED)).as<std::string>() !=
PluginConfigParams::YES) {
reid_config.max_batch_size = 1;
std::cerr << "[DEBUG] Dynamic batch is not supported for " << deviceName << ". Fall back to batch 1." << std::endl;
}
}
catch (const InferenceEngine::details::InferenceEngineException& e) {
reid_config.max_batch_size = 1;
std::cerr << e.what() << " for " << deviceName << ". Fall back to batch 1." << std::endl;
}
Cnn.cpp --> void CnnBase::InferBatch
void CnnBase::InferBatch(
const std::vector<cv::Mat>& frames,
std::function<void(const InferenceEngine::BlobMap&, size_t)> fetch_results) const {
const size_t batch_size = input_blob_->getTensorDesc().getDims()[0];
size_t num_imgs = frames.size();
for (size_t batch_i = 0; batch_i < num_imgs; batch_i += batch_size) {
const size_t current_batch_size = std::min(batch_size, num_imgs - batch_i);
for (size_t b = 0; b < current_batch_size; b++) {
matU8ToBlob<uint8_t>(frames[batch_i + b], input_blob_, b);
}
if ((deviceName_.find("MYRIAD") == std::string::npos) && (deviceName_.find("HDDL") == std::string::npos)) {
infer_request_.SetBatch(current_batch_size);
}
infer_request_.Infer();
fetch_results(outputs_, current_batch_size);
}
}
I suppose that the problem could be the topology of the detection net, but I ask if anyone has had the same problem and solved the issue.
Thanks.

I am afraid the MYRIAD plugin does not support dynamic batch. Please try an updated version of the demo. You can find it, for example, here: https://github.com/opencv/open_model_zoo/tree/master/demos/pedestrian_tracker_demo
The demo is updated not to use dynamic batch at all.

Intel MKL Sparse QR Solve in C++ returns not initialized error

When attempting to use mkl_sparse_s_qr_solve, I receive a result of all 0's and the error status of SPARSE_STATUS_NOT_INITIALIZED which means that the handle/matrix is empty.
I have tried reading through the documentation thoroughly and printing all arrays that are used to instantiate the CSR sparse matrix required for the solve, and all the arrays contain the correct values.
int main() {
std::vector<int> i_b(22);
std::vector<int> i_e(22);
// j, v, and b vectors are just examples.
// Regardless of the resulting numbers of the solve, I just
// cannot get the SPARSE_STATUS_NOT_INITIALIZED error to stop
// for the qr solver.
std::vector<int> j(44, 0);
std::vector<float> v(44, 1.0);
std::vector<float> b(22, 1.0);
{
struct IncGenerator {
int current_;
IncGenerator(int start) : current_(start) {}
int operator() () {
current_ += 2;
return current_;
}
};
// Fill i_b with {0, 2, 4, ..., 42}
IncGenerator g(-2);
std::generate(i_b.begin(), i_b.end(), g);
// Fill i_e with {2, 4, 6, ..., 44}
IncGenerator f(0);
std::generate(i_e.begin(), i_e.end(), f);
}
// ...
// j, v, and b arrays are all the correct values
// confirmed. The sparse A matrix should have 2 values
// per row, with 22 rows, and 15 columns.
int out;
sparse_matrix_t A;
out = mkl_sparse_s_create_csr(&A, SPARSE_INDEX_BASE_ZERO, 22, 15, &i_b[0], &i_e[0], &j[0], &v[0]);
switch (out) {
case SPARSE_STATUS_SUCCESS:
std::cout << "Successfully created matrix!" << std::endl;
break;
case SPARSE_STATUS_NOT_INITIALIZED:
std::cout << "Not initialized." << std::endl;
break;
case SPARSE_STATUS_ALLOC_FAILED:
std::cout << "Internal memory allocation failed." << std::endl;
break;
default:
std::cout << "Unknown." << std::endl;
break;
}
std::vector<float> X(22 * 15);
out = mkl_sparse_s_qr_solve(SPARSE_OPERATION_NON_TRANSPOSE, A, NULL, SPARSE_LAYOUT_COLUMN_MAJOR, 1, &X[0], 15, &b[0], 22);
switch (out) {
case SPARSE_STATUS_SUCCESS:
std::cout << "Successfully solved!" << std::endl;
break;
case SPARSE_STATUS_NOT_INITIALIZED:
std::cout << "Not initialized." << std::endl;
break;
case SPARSE_STATUS_ALLOC_FAILED:
std::cout << "Internal memory allocation failed." << std::endl;
break;
default:
std::cout << "Unknown." << std::endl;
break;
}
return 0;
}
Therefore, for some reason I cannot solve with A because it either thinks A is empty or something else is uninitialized. I do not think A is empty (I wanted to check but there was no convenient way to print A) as the initialization of the matrix A returns as a successful operation (the only thing I am slightly doubtful of is the row beginning i_b and row end i_e indices).
Can anyone please offer some guidance?

This is not how you are supposed to use mkl_sparse_?_qr_solve. Sparse systems are solved in 3 steps (phases):
Reorder.
Factorize.
Solve.
First, you have to call mkl_sparse_qr_reorder, then mkl_sparse_?_qr_factorize, and only then mkl_sparse_?_qr_solve.
Try inserting the following code before the call to mkl_sparse_?_qr_solve:
struct matrix_descr descr;
descr.type = SPARSE_MATRIX_TYPE_GENERAL;
out = mkl_sparse_qr_reorder(A, descr);
switch (out) { ... }
out = mkl_sparse_s_qr_factorize(A, NULL); // 's' = single precision, matching the float data in the question
switch (out) { ... }
Or just use mkl_sparse_?_qr that will do all 3 steps for you in a single call. Separation of the process into three steps gives you more freedom. For example, if you want to solve several systems with the same A, you can save time by calling mkl_sparse_qr_reorder and mkl_sparse_?_qr_factorize only once.
Not directly related, but use MKL_INT rather than int for the index arrays. When MKL_ILP64 is defined, MKL_INT is not int but long long int.

Make compiler assume that all cases are handled in switch without default

Let's start with some code. This is an extremely simplified version of my program.
#include <stdint.h>
volatile uint16_t dummyColorRecepient;
void updateColor(const uint8_t iteration)
{
uint16_t colorData;
switch(iteration)
{
case 0:
colorData = 123;
break;
case 1:
colorData = 234;
break;
case 2:
colorData = 345;
break;
}
dummyColorRecepient = colorData;
}
// dummy main function
int main()
{
uint8_t iteration = 0;
while (true)
{
updateColor(iteration);
if (++iteration == 3)
iteration = 0;
}
}
The program compiles with a warning:
./test.cpp: In function ‘void updateColor(uint8_t)’:
./test.cpp:20:25: warning: ‘colorData’ may be used uninitialized in this function [-Wmaybe-uninitialized]
dummyColorRecepient = colorData;
~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
As you can see, there is an absolute certainty that the variable iteration is always 0, 1 or 2. However, the compiler doesn't know that and it assumes that switch may not initialize colorData. (Any amount of static analysis during compilation won't help here because the real program is spread over multiple files.)
Of course I could just add a default statement, like default: colorData = 0; but this adds additional 24 bytes to the program. This is a program for a microcontroller and I have very strict limits for its size.
I would like to inform the compiler that this switch is guaranteed to cover all possible values of iteration.

"As you can see, there is an absolute certainty that the variable iteration is always 0, 1 or 2."
From the perspective of the toolchain, this is not true. You can call this function from someplace else, even from another translation unit. The only place that your constraint is enforced is in main, and even there it's done in such a way that might be difficult for the compiler to reason about.
For our purposes, though, let's take as read that you're not going to link any other translation units, and that we want to tell the toolchain about that. Well, fortunately, we can!
If you don't mind being unportable, then there's GCC's __builtin_unreachable built-in to inform it that the default case is not expected to be reached, and should be considered unreachable. My GCC is smart enough to know that this means colorData is never going to be left uninitialised unless all bets are off anyway.
#include <stdint.h>
volatile uint16_t dummyColorRecepient;
void updateColor(const uint8_t iteration)
{
uint16_t colorData;
switch(iteration)
{
case 0:
colorData = 123;
break;
case 1:
colorData = 234;
break;
case 2:
colorData = 345;
break;
// Comment out this default case to get the warnings back!
default:
__builtin_unreachable();
}
dummyColorRecepient = colorData;
}
// dummy main function
int main()
{
uint8_t iteration = 0;
while (true)
{
updateColor(iteration);
if (++iteration == 3)
iteration = 0;
}
}
This won't add an actual default branch, because there's no "code" inside it. In fact, when I plugged this into Godbolt using x86_64 GCC with -O2, the program was smaller with this addition than without it — logically, you've just added a major optimisation hint.
There's actually a proposal to make this a standard attribute in C++ so it could be an even more attractive solution in the future.

Use the "immediately invoked lambda expression" idiom and an assert:
void updateColor(const uint8_t iteration)
{
const auto colorData = [&]() -> uint16_t
{
switch(iteration)
{
case 0: return 123;
case 1: return 234;
}
assert(iteration == 2);
return 345;
}();
dummyColorRecepient = colorData;
}
The lambda expression allows you to mark colorData as const. const variables must always be initialized.
The combination of assert + return statements allows you to avoid warnings and handle all possible cases.
assert compiles to nothing when NDEBUG is defined (the usual release configuration), so it adds no overhead there.
You can also factor out the function:
uint16_t getColorData(const uint8_t iteration)
{
switch(iteration)
{
case 0: return 123;
case 1: return 234;
}
assert(iteration == 2);
return 345;
}
void updateColor(const uint8_t iteration)
{
const uint16_t colorData = getColorData(iteration);
dummyColorRecepient = colorData;
}

You can get this to compile without warnings simply by adding a default label to one of the cases:
switch(iteration)
{
case 0:
colorData = 123;
break;
case 1:
colorData = 234;
break;
case 2: default:
colorData = 345;
break;
}
Alternatively:
uint16_t colorData = 345;
switch(iteration)
{
case 0:
colorData = 123;
break;
case 1:
colorData = 234;
break;
}
Try both, and use the shorter of the two.

I know there have been some good solutions, but as an alternative: if your values are known at compile time, you can replace the switch statement with a constexpr static function template and a couple of enumerations. Within a single class it would look something like this:
#include <cstdint>
#include <iostream>
class ColorInfo {
public:
enum ColorRecipient {
CR_0 = 0,
CR_1,
CR_2
};
enum ColorType {
CT_0 = 123,
CT_1 = 234,
CT_2 = 345
};
template<const uint8_t Iter>
static constexpr uint16_t updateColor() {
if constexpr (Iter == CR_0) {
std::cout << "ColorData updated to: " << CT_0 << '\n';
return CT_0;
}
if constexpr (Iter == CR_1) {
std::cout << "ColorData updated to: " << CT_1 << '\n';
return CT_1;
}
if constexpr (Iter == CR_2) {
std::cout << "ColorData updated to: " << CT_2 << '\n';
return CT_2;
}
}
};
int main() {
const uint16_t colorRecipient0 = ColorInfo::updateColor<ColorInfo::CR_0>();
const uint16_t colorRecipient1 = ColorInfo::updateColor<ColorInfo::CR_1>();
const uint16_t colorRecipient2 = ColorInfo::updateColor<ColorInfo::CR_2>();
std::cout << "\n--------------------------------\n";
std::cout << "Recipient0: " << colorRecipient0 << '\n'
<< "Recipient1: " << colorRecipient1 << '\n'
<< "Recipient2: " << colorRecipient2 << '\n';
return 0;
}
The cout statements within the if constexpr are only added for testing purposes, but this should illustrate another possible way to do this without having to use a switch statement provided your values will be known at compile time. If these values are generated at runtime I'm not completely sure if there is a way to use constexpr to achieve this type of code structure, but if there is I'd appreciate it if someone else with a little more experience could elaborate on how this could be done with constexpr using runtime values. However, this code is very readable as there are no magic numbers and the code is quite expressive.
-Update-
After reading more about constexpr it has come to my attention that they can be used to generate compile time constants. I also learned that they can not generate runtime constants but they can be used within a runtime function. We can take the above class structure and use it within a runtime function as such by adding this static function to the class:
static uint16_t colorUpdater(const uint8_t input) {
// Don't forget to offset input due to std::cin with ASCII value.
if ( (input - '0') == CR_0)
return updateColor<CR_0>();
if ( (input - '0') == CR_1)
return updateColor<CR_1>();
if ( (input - '0') == CR_2)
return updateColor<CR_2>();
return updateColor<CR_2>(); // Return the default type
}
However, I want to change the names of the two functions. The first function I will name colorUpdater(), and the new function that I just showed above I will name updateColor(), as it seems more intuitive this way. So the updated class will now look like this:
class ColorInfo {
public:
enum ColorRecipient {
CR_0 = 0,
CR_1,
CR_2
};
enum ColorType {
CT_0 = 123,
CT_1 = 234,
CT_2 = 345
};
static uint16_t updateColor(uint8_t input) {
if ( (input - '0') == CR_0 ) {
return colorUpdater<CR_0>();
}
if ( (input - '0') == CR_1 ) {
return colorUpdater<CR_1>();
}
if ( (input - '0') == CR_2 ) {
return colorUpdater<CR_2>();
}
return colorUpdater<CR_0>(); // Return the default type
}
template<const uint8_t Iter>
static constexpr uint16_t colorUpdater() {
if constexpr (Iter == CR_0) {
std::cout << "ColorData updated to: " << CT_0 << '\n';
return CT_0;
}
if constexpr (Iter == CR_1) {
std::cout << "ColorData updated to: " << CT_1 << '\n';
return CT_1;
}
if constexpr (Iter == CR_2) {
std::cout << "ColorData updated to: " << CT_2 << '\n';
return CT_2;
}
}
};
If you want to use this with compile time constants only you can use it just as before but with the function's updated name.
#include <iostream>
int main() {
auto output0 = ColorInfo::colorUpdater<ColorInfo::CR_0>();
auto output1 = ColorInfo::colorUpdater<ColorInfo::CR_1>();
auto output2 = ColorInfo::colorUpdater<ColorInfo::CR_2>();
std::cout << "\n--------------------------------\n";
std::cout << "Recipient0: " << output0 << '\n'
<< "Recipient1: " << output1 << '\n'
<< "Recipient2: " << output2 << '\n';
return 0;
}
And if you want to use this mechanism with runtime values you can simply do the following:
int main() {
uint8_t input;
std::cout << "Please enter input value [0,2]\n";
std::cin >> input;
auto output = ColorInfo::updateColor(input);
std::cout << "Output: " << output << '\n';
return 0;
}
And this will work with runtime values.

Well, if you are sure you won't have to handle other possible values, you can just use arithmetic. It gets rid of the branching and the load.
void updateColor(const uint8_t iteration)
{
dummyColorRecepient = 123 + 111 * iteration;
}

I'm going to extend Lightness Races in Orbit's answer.
The code I'm using currently is:
#ifdef __GNUC__
__builtin_unreachable();
#else
__assume(false);
#endif
__builtin_unreachable() works in GCC and Clang but not MSVC. I used __GNUC__ to check whether it is one of the first two (or another compatible compiler) and used __assume(false) for MSVC instead.

Solving 8-Puzzle in C++ with A* results in endless loop

I'm currently trying to solve the 8-Puzzle with the A* search algorithm, but my program gets stuck in an endless loop.
My main searching loop is:
std::vector<Field> Search::AStar(Field &start, Field &goal){
std::cout << "Calculating..." << std::endl;
std::unordered_map<Field, Field> explored;
std::vector<Field> searched;
if (Puzzle::finished(start))
return MakePath(start, start);
std::priority_queue<Field, std::vector<Field>, std::greater<Field>> frontier;
frontier.push(start);
Field current;
Field child;
size_t i = 0;
while (!frontier.empty())
{
current = frontier.top();
frontier.pop();
if (++i > 500)
{
std::cout << "Iteration Error" << std::endl;
return searched;
}
searched.push_back(current);
for (Direction d : Puzzle::Actions(current))
{
child = Puzzle::Action(d, current);
if (Puzzle::finished(child))
{
std::cout << "Found goal!" << std::endl;
return MakePath(explored[child], start);
}
child.CostG = current.CostG + 1; // Make a step
if (!isIn(child, explored) || child.CostG < explored[child].CostG)
{
child.CostH = Puzzle::Heuristic(child, goal); // Calculate Heuristic
child.CostF = child.CostG + child.CostH; // Calculate final costs
frontier.push(child);
explored[child] = child;
explored[child].setParent(&explored[current]);
}
}
}
std::cout << "Error: frontier Empty" << std::endl;
return searched;
}
The vector "searched" is just so that I can see what A* does, and I will delete it as soon as the algorithm works.
The CostG stands for the number of steps done until this point, the CostH are the estimated minimum (heuristic) costs to the "goal" and the CostF are those two combined.
The index of the Field::Boxes vector is the number of the field, and every element contains the position.
My Heuristic function looks like this:
inline int Heuristic(Field &goal)
{
size_t d = 0;
for (size_t i = 0; i < Boxes.size(); i++)
{
d += (std::abs(static_cast<int>(Boxes[i].x) - static_cast<int>(goal.Boxes[i].x))
+ std::abs(static_cast<int>(Boxes[i].y) - static_cast<int>(goal.Boxes[i].y)));
}
return d;
}
For better readability, the code is also on GitHub. However, to run it, you need SFML in your Visual Studio include directories.
Any help is appreciated!
Edit 1:
You no longer need SFML to run and debug the program! I committed the changes to GitHub; the link is the same.

The problem is that although you remove the current node from your frontier, you never add it to the explored set, i.e. you never close it. The following code should work. My revisions closely follow Wikipedia's A* pseudocode.
I also recommend you test your algorithm with the trivial heuristic (the one that returns zero for all values) on a simple puzzle to verify that your algorithm is implemented correctly. (See this answer for a brief explanation of this technique.)
while (!frontier.empty())
{
current = frontier.top();
frontier.pop();
if (++i > 500)
{
std::cout << "Iteration Error" << std::endl;
return searched;
}
// Check for goal here
if (Puzzle::finished(current))
{
std::cout << "Found goal!" << std::endl;
return MakePath(explored[current], start);
}
explored[current] = current; //close the current node
searched.push_back(current);
for (Direction d : Puzzle::Actions(current))
{
child = Puzzle::Action(d, current);
if (isIn(child,explored))
{
continue; //ignore the neighbor which is already evaluated
}
child.CostG = current.CostG + 1; // Make a step
if (!isIn(child, frontier)) //discovered a new node
{
frontier.push(child);
}
else if (child.CostG >= explored[child].CostG)
{
continue; //this is not a better path
}
//the path is best until now. Record it!
child.CostH = Puzzle::Heuristic(child, goal); // Calculate Heuristic
child.CostF = child.CostG + child.CostH; // Calculate final costs
//frontier.push(child); moved up to earlier point in code
explored[child] = child;
explored[child].setParent(&explored[current]);
}
}
