Some background:
I wrote a single-layer, multi-output perceptron class in C++. It uses the typical WX + b discriminant function and allows for user-defined activation functions. I have tested everything pretty thoroughly and it all seems to be working as I expect. However, I noticed a small logical error in my code, and when I attempted to fix it the network performed much worse than before. The error is as follows:
I evaluate the value at each output neuron using the following code:
output[i] =
activate_(std::inner_product(weights_[i].begin(), weights_[i].end(),
features.begin(), -1 * biases_[i]));
Here I treat the bias input as a fixed -1, but when I apply the learning rule to each bias, I treat the input as +1.
// Bias can be treated as a weight with a constant feature value of 1.
biases_[i] = weight_update(1, error, learning_rate_, biases_[i]);
So I attempted to fix my mistake by changing the call to weight_update to be consistent with the output evaluation:
biases_[i] = weight_update(-1, error, learning_rate_, biases_[i]);
But doing so results in a 20% drop in accuracy!
I have been pulling my hair out for the past few days trying to find some other logical error in my code which might explain this strange behaviour, but have come up empty handed. Can anyone with more knowledge than I provide any insight into this? I have provided the entire class below for reference. Thank you in advance.
#ifndef SINGLE_LAYER_PERCEPTRON_H
#define SINGLE_LAYER_PERCEPTRON_H
#include <algorithm>
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>
#include "functional.h"
#include "random.h"
namespace qp {
namespace rf {
namespace {
template <typename Feature>
double weight_update(const Feature& feature, const double error,
const double learning_rate, const double current_weight) {
return current_weight + (learning_rate * error * feature);
}
template <typename T>
using Matrix = std::vector<std::vector<T>>;
} // namespace
template <typename Feature, typename Label, typename ActivationFn>
class SingleLayerPerceptron {
public:
// For testing only.
SingleLayerPerceptron(const Matrix<double>& weights,
const std::vector<double>& biases, double learning_rate)
: weights_(weights),
biases_(biases),
n_inputs_(weights.front().size()),
n_outputs_(biases.size()),
learning_rate_(learning_rate) {}
// Initialize the layer with random weights and biases in [-1, 1].
SingleLayerPerceptron(std::size_t n_inputs, std::size_t n_outputs,
double learning_rate)
: n_inputs_(n_inputs),
n_outputs_(n_outputs),
learning_rate_(learning_rate) {
weights_.resize(n_outputs_);
std::for_each(
weights_.begin(), weights_.end(), [this](std::vector<double>& wv) {
generate_back_n(wv, n_inputs_,
std::bind(random_real_range<double>, -1, 1));
});
generate_back_n(biases_, n_outputs_,
std::bind(random_real_range<double>, -1, 1));
}
std::vector<double> predict(const std::vector<Feature>& features) const {
std::vector<double> output(n_outputs_);
for (auto i = 0ul; i < n_outputs_; ++i) {
output[i] =
activate_(std::inner_product(weights_[i].begin(), weights_[i].end(),
features.begin(), -1 * biases_[i]));
}
return output;
}
void learn(const std::vector<Feature>& features,
const std::vector<double>& true_output) {
const auto actual_output = predict(features);
for (auto i = 0ul; i < n_outputs_; ++i) {
const auto error = true_output[i] - actual_output[i];
for (auto weight = 0ul; weight < n_inputs_; ++weight) {
weights_[i][weight] = weight_update(
features[weight], error, learning_rate_, weights_[i][weight]);
}
// Bias can be treated as a weight with a constant feature value of 1.
biases_[i] = weight_update(1, error, learning_rate_, biases_[i]);
}
}
private:
Matrix<double> weights_; // n_outputs x n_inputs
std::vector<double> biases_; // 1 x n_outputs
std::size_t n_inputs_;
std::size_t n_outputs_;
ActivationFn activate_;
double learning_rate_;
};
struct StepActivation {
double operator()(const double x) const { return x > 0 ? 1 : -1; }
};
} // namespace rf
} // namespace qp
#endif /* SINGLE_LAYER_PERCEPTRON_H */
I ended up figuring it out...
My fix was indeed correct and the loss of accuracy was just a consequence of having a lucky (or unlucky) dataset.
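For anyone else hitting this: the point is simply that the bias has to be treated the same way in both places. A minimal sketch of the consistent convention, using the same -1 bias input in the forward pass and in the update (illustration only, not a drop-in for the class above):
double net = std::inner_product(weights.begin(), weights.end(),
                                features.begin(), -1.0 * bias);
double error = target - activate(net);
for (std::size_t j = 0; j < weights.size(); ++j)
  weights[j] += learning_rate * error * features[j];
bias += learning_rate * error * -1.0;  // same -1 "feature" as in the forward pass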
In some testing code there's a helper function like this:
auto make_condiment(bool salt, bool pepper, bool oil, bool garlic) {
// assumes that first bool is salt, second is pepper,
// and so on...
//
// Make up something according to flags
return something;
};
which essentially builds up something based on some boolean flags.
What concerns me is that the meaning of each bool is hardcoded in the name of the parameters, which is bad because at the call site it's hard to remember which parameter means what (yeah, the IDE can likely eliminate the problem entirely by showing those names when tab completing, but still...):
// at the call site:
auto obj = make_condiment(false, false, true, true); // what ingredients am I using and what not?
Therefore, I'd like to pass a single object describing the settings. Just aggregating the flags in an object such as std::array<bool,4> wouldn't help much, since the call site would still be a list of anonymous booleans.
I would like, instead, to enable a syntax like this:
auto obj = make_smart_condiment(oil + garlic);
which would generate the same obj as the previous call to make_condiment.
This new function would be:
auto make_smart_condiment(Ingredients ingredients) {
// retrieve the individual flags from the input
bool salt = ingredients.hasSalt();
bool pepper = ingredients.hasPepper();
bool oil = ingredients.hasOil();
bool garlic = ingredients.hasGarlic();
// same body as make_condiment, or simply:
return make_condiment(salt, pepper, oil, garlic);
}
Here's my attempt:
struct Ingredients {
public:
enum class INGREDIENTS { Salt = 1, Pepper = 2, Oil = 4, Garlic = 8 };
explicit Ingredients() : flags{0} {};
explicit Ingredients(INGREDIENTS const& f) : flags{static_cast<int>(f)} {};
private:
explicit Ingredients(int fs) : flags{fs} {}
int flags; // values 0-15
public:
bool hasSalt() const {
return flags % 2;
}
bool hasPepper() const {
return (flags / 2) % 2;
}
bool hasOil() const {
return (flags / 4) % 2;
}
bool hasGarlic() const {
return (flags / 8) % 2;
}
Ingredients operator+(Ingredients const& f) {
return Ingredients(flags + f.flags);
}
} salt{Ingredients::INGREDIENTS::Salt},
pepper{Ingredients::INGREDIENTS::Pepper},
oil{Ingredients::INGREDIENTS::Oil},
garlic{Ingredients::INGREDIENTS::Garlic};
However, I have the feeling that I am reinventing the wheel.
Is there any better, or standard, way of accomplishing the above?
Is there maybe a design pattern that I could/should use?
I think you can remove some of the boilerplate by using a std::bitset. Here is what I came up with:
#include <bitset>
#include <cstdint>
#include <iostream>
class Ingredients {
public:
enum Option : uint8_t {
Salt = 0,
Pepper = 1,
Oil = 2,
Max = 3
};
bool has(Option o) const { return value_[o]; }
Ingredients(std::initializer_list<Option> opts) {
for (const Option& opt : opts)
value_.set(opt);
}
private:
std::bitset<Max> value_ {0};
};
int main() {
Ingredients ingredients{Ingredients::Salt, Ingredients::Pepper};
// prints "10"
std::cout << ingredients.has(Ingredients::Salt)
<< ingredients.has(Ingredients::Oil) << "\n";
}
You don't get the +-style syntax, but it's pretty close. It's unfortunate that you have to keep an Option::Max, but it's not too bad. Also, I decided not to use an enum class so that the values can be accessed as Ingredients::Salt and implicitly converted to an int; you could access and cast explicitly if you wanted to use an enum class.
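If the + syntax matters to you, one possible extension of the class above (a sketch I haven't battle-tested, not part of the original code) is a free operator that builds an Ingredients from two Options:
Ingredients operator+(Ingredients::Option a, Ingredients::Option b) {
  return Ingredients{a, b};
}
// usage: auto seasoning = Ingredients::Salt + Ingredients::Pepper;
// Chaining a third flag would also need an operator+(Ingredients, Option) overload
// and a mutating member on Ingredients, which the class above doesn't expose.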
If you want to use enums as flags, the usual way is to merge them with operator | and check them with operator &:
#include <iostream>
enum Ingredients{ Salt = 1, Pepper = 2, Oil = 4, Garlic = 8 };
// If you want to use operator +
Ingredients operator + (Ingredients a,Ingredients b) {
return Ingredients(a | b);
}
int main()
{
using std::cout;
cout << bool( Salt & Ingredients::Salt ); // has salt
cout << bool( Salt & Ingredients::Pepper ); // doesn't have pepper
auto sp = Ingredients::Salt + Ingredients::Pepper;
cout << bool( sp & Ingredients::Salt ); // has salt
cout << bool( sp & Ingredients::Garlic ); // doesn't have garlic
}
Note: the current code (with only operator + defined) will not work correctly if you mix | and +, e.g. (Salt|Salt)+Salt.
You can also use an enum class; you just need to define the operators yourself, as sketched below.
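For completeness, a rough sketch of that enum class variant (my own, with hypothetical names): scoped enumerators don't convert to int, so the operators have to be spelled out.
#include <cstdint>
#include <type_traits>
enum class Ingredient : std::uint8_t { Salt = 1, Pepper = 2, Oil = 4, Garlic = 8 };
constexpr Ingredient operator|(Ingredient a, Ingredient b) {
  using U = std::underlying_type_t<Ingredient>;
  return Ingredient(U(a) | U(b));
}
constexpr bool has(Ingredient set, Ingredient flag) {
  using U = std::underlying_type_t<Ingredient>;
  return (U(set) & U(flag)) != 0;
}
// usage:
// auto mix = Ingredient::Oil | Ingredient::Garlic;
// bool salty = has(mix, Ingredient::Salt); // false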
I would look at a strong typing library like:
https://github.com/joboccara/NamedType
For a really good video talking about this:
https://www.youtube.com/watch?v=fWcnp7Bulc8
When I first saw this, I was a little dismissive, but because the advice came from people I respected, I gave it a chance. The video convinced me.
If you look at CPP Best Practices and dig deeply enough, you'll see the general advice to avoid boolean parameters, especially strings of them. And Jonathan Boccara gives good reasons why your code will be stronger if you don't directly use the raw types, for the very reason that you've already identified.
Some of my unit tests have started failing since adapting some code to enable multi-precision. Header file:
#ifndef SCRATCH_UNITTESTBOOST_INCLUDED
#define SCRATCH_UNITTESTBOOST_INCLUDED
#include <boost/multiprecision/cpp_dec_float.hpp>
// typedef double FLOAT;
typedef boost::multiprecision::cpp_dec_float_50 FLOAT;
const FLOAT ONE(FLOAT(1));
struct Rect
{
Rect(const FLOAT &width, const FLOAT &height) : Width(width), Height(height){};
FLOAT getArea() const { return Width * Height; }
FLOAT Width, Height;
};
#endif
Main test file:
#define BOOST_TEST_DYN_LINK
#define BOOST_TEST_MODULE RectTest
#include <boost/test/unit_test.hpp>
#include "SCRATCH_UnitTestBoost.h"
namespace utf = boost::unit_test;
// Failing
BOOST_AUTO_TEST_CASE(AreaTest1)
{
Rect R(ONE / 2, ONE / 3);
FLOAT expected_area = (ONE / 2) * (ONE / 3);
std::cout << std::setprecision(std::numeric_limits<FLOAT>::digits10) << std::showpoint;
std::cout << "Expected: " << expected_area << std::endl;
std::cout << "Actual : " << R.getArea() << std::endl;
// BOOST_CHECK_EQUAL(expected_area, R.getArea());
BOOST_TEST(expected_area == R.getArea());
}
// Tolerance has no effect?
BOOST_AUTO_TEST_CASE(AreaTestTol, *utf::tolerance(1e-40))
{
Rect R(ONE / 2, ONE / 3);
FLOAT expected_area = (ONE / 2) * (ONE / 3);
BOOST_TEST(expected_area == R.getArea());
}
// Passing
BOOST_AUTO_TEST_CASE(AreaTest2)
{
Rect R(ONE / 7, ONE / 2);
FLOAT expected_area = (ONE / 7) * (ONE / 2);
BOOST_CHECK_EQUAL(expected_area, R.getArea());
}
Note that when defining FLOAT as the double type, all the tests pass. What confuses me is that when printing the exact expected and actual values (see AreaTest1) we see the same result. But the error reported from BOOST_TEST is:
error: in "AreaTest1": check expected_area == R.getArea() has failed
[0.16666666666666666666666666666666666666666666666666666666666666666666666666666666 !=
0.16666666666666666666666666666666666666666666666666666666666666666666666672236366]
Compiling with g++ SCRATCH_UnitTestBoost.cpp -o utb.o -lboost_unit_test_framework.
Questions:
Why is the test failing?
Why does the use of tolerance in AreaTestTol not give outputs as documented here?
Related info:
Tolerances with floating point comparison
Gotchas with multiprecision types
Two Issues:
where does the difference come from
how to apply the epsilon?
Where The Difference Comes From
Boost Multiprecision uses expression templates to defer evaluation.
Also, you're choosing some rational fractions that cannot be exactly represented base-10 (cpp_dec_float uses decimal, so base-10).
This means that when you do
T x = ONE / 3;
T y = ONE / 7;
both fractions will actually be approximated inexactly.
Doing this:
T z = (ONE / 3) * (ONE / 7);
will actually evaluate the right-hand side as an expression template, so instead of calculating the temporaries x and y as above, the right-hand side has a type of:
expression<detail::multiplies, detail::expression<?>, detail::expression<?>, [2 * ...]>
That's shortened from the actual type:
boost::multiprecision::detail::expression<
boost::multiprecision::detail::multiplies,
boost::multiprecision::detail::expression<
boost::multiprecision::detail::divide_immediates,
boost::multiprecision::number<boost::multiprecision::backends::cpp_dec_float<50u,
int, void>, (boost::multiprecision::expression_template_option)1>, int,
void, void>,
boost::multiprecision::detail::expression<
boost::multiprecision::detail::divide_immediates,
boost::multiprecision::number<boost::multiprecision::backends::cpp_dec_float<50u,
int, void>, (boost::multiprecision::expression_template_option)1>, int,
void, void>,
void, void>
Long story short, this is what you want, because it saves you work and keeps better accuracy, since the expression is first normalized to 1/(3*7), i.e. 1/21.
This is where your difference comes from in the first place. Fix it by either:
turning off expression templates
using T = boost::multiprecision::number<
boost::multiprecision::cpp_dec_float<50>,
boost::multiprecision::et_off>;
rewriting the expression to be equivalent to your implementation:
T expected_area = T(ONE / 7) * T(ONE / 2);
T expected_area = (ONE / 7).eval() * (ONE / 2).eval();
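To make the expression-template point concrete, here is a small standalone sketch (mine, not part of the original test suite) where the eager and the lazy form can disagree in the last digits:
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <iostream>
int main() {
    using T = boost::multiprecision::cpp_dec_float_50;
    const T ONE = 1;
    T a = ONE / 2, b = ONE / 3;      // rounded intermediates, like Rect's members
    T eager = a * b;                 // multiplies the already-rounded values
    T lazy = (ONE / 2) * (ONE / 3);  // whole expression template evaluated in one go
    std::cout << std::boolalpha << (eager == lazy) << "\n"; // may well print false
}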
Applying The Tolerance
I find it hard to parse the Boost Unit Test docs on this, but here's empirical data:
BOOST_CHECK_EQUAL(expected_area, R.getArea());
T const eps = std::numeric_limits<T>::epsilon();
BOOST_CHECK_CLOSE(expected_area, R.getArea(), eps);
BOOST_TEST(expected_area == R.getArea(), tt::tolerance(eps));
This fails the first, and passes the last two. Indeed, in addition, the following two also fail:
BOOST_CHECK_EQUAL(expected_area, R.getArea());
BOOST_TEST(expected_area == R.getArea());
So it appears that something has to be done before the utf::tolerance decorator takes effect. Testing with native doubles tells me that only BOOST_TEST applies the tolerance implicitly, so I dived into the preprocessed expansion:
::boost::unit_test::unit_test_log.set_checkpoint(
::boost::unit_test::const_string(
"/home/sehe/Projects/stackoverflow/test.cpp",
sizeof("/home/sehe/Projects/stackoverflow/test.cpp") - 1),
static_cast<std::size_t>(42));
::boost::test_tools::tt_detail::report_assertion(
(::boost::test_tools::assertion::seed()->*a == b).evaluate(),
(::boost::unit_test::lazy_ostream::instance()
<< ::boost::unit_test::const_string("a == b", sizeof("a == b") - 1)),
::boost::unit_test::const_string(
"/home/sehe/Projects/stackoverflow/test.cpp",
sizeof("/home/sehe/Projects/stackoverflow/test.cpp") - 1),
static_cast<std::size_t>(42), ::boost::test_tools::tt_detail::CHECK,
::boost::test_tools::tt_detail::CHECK_BUILT_ASSERTION, 0);
} while (::boost::test_tools::tt_detail::dummy_cond());
Digging in a lot more, I ran into:
/*!@brief Indicates if a type can be compared using a tolerance scheme
*
* This is a metafunction that should evaluate to @c mpl::true_ if the type
* @c T can be compared using a tolerance based method, typically for floating point
* types.
*
* This metafunction can be specialized further to declare user types that are
* floating point (eg. boost.multiprecision).
*/
template <typename T>
struct tolerance_based : tolerance_based_delegate<T, !is_array<T>::value && !is_abstract_class_or_function<T>::value>::type {};
There we have it! But no,
static_assert(boost::math::fpc::tolerance_based<double>::value);
static_assert(boost::math::fpc::tolerance_based<cpp_dec_float_50>::value);
Both already pass. Hmm.
Looking at the decorator I noticed that the tolerance injected into the fixture context is typed.
Experimentally I have reached the conclusion that the tolerance decorator needs to have the same static type argument as the operands in the comparison for it to take effect.
This may actually be very useful (you can have different implicit tolerances for different floating point types), but it is pretty surprising as well.
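Concretely, for the question's FLOAT (cpp_dec_float_50) operands that means the decorator argument has to have that type as well; something along these lines (adapted from the question's test, not verified against every Boost version):
// Tolerance typed to match the operands; a plain double literal like 1e-40 is
// ignored for cpp_dec_float_50 comparisons.
BOOST_AUTO_TEST_CASE(AreaTestTol, *utf::tolerance(FLOAT(1e-40)))
{
    Rect R(ONE / 2, ONE / 3);
    FLOAT expected_area = (ONE / 2) * (ONE / 3);
    BOOST_TEST(expected_area == R.getArea());
}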
TL;DR
Here's the full test set fixed and live for your enjoyment:
take into account evaluation order and the effect on accuracy
use the static type in utf::tolerance(v) to match your operands
do not use BOOST_CHECK_EQUAL for tolerance-based comparison
I'd suggest using explicit test_tools::tolerance instead of relying on "ambient" tolerance. After all, we want to be testing our code, not the test framework.
Live On Coliru
template <typename T> struct Rect {
Rect(const T &width, const T &height) : width(width), height(height){};
T getArea() const { return width * height; }
private:
T width, height;
};
#define BOOST_TEST_DYN_LINK
#define BOOST_TEST_MODULE RectTest
#include <boost/multiprecision/cpp_dec_float.hpp>
using DecFloat = boost::multiprecision::cpp_dec_float_50;
#include <boost/test/unit_test.hpp>
namespace utf = boost::unit_test;
namespace tt = boost::test_tools;
namespace {
template <typename T>
static inline const T Eps = std::numeric_limits<T>::epsilon();
template <typename T> struct Fixture {
T const epsilon = Eps<T>;
T const ONE = 1;
using Rect = ::Rect<T>;
void checkArea(int wdenom, int hdenom) const {
auto w = ONE/wdenom; // could be expression templates
auto h = ONE/hdenom;
Rect const R(w, h);
T expect = w*h;
BOOST_TEST(expect == R.getArea(), "1/" << wdenom << " x " << "1/" << hdenom);
// I'd prefer an explicit tolerance
BOOST_TEST(expect == R.getArea(), tt::tolerance(epsilon));
}
};
}
BOOST_AUTO_TEST_SUITE(Rectangles)
BOOST_FIXTURE_TEST_SUITE(Double, Fixture<double>, *utf::tolerance(Eps<double>))
BOOST_AUTO_TEST_CASE(check2_3) { checkArea(2, 3); }
BOOST_AUTO_TEST_CASE(check7_2) { checkArea(7, 2); }
BOOST_AUTO_TEST_CASE(check57_31) { checkArea(57, 31); }
BOOST_AUTO_TEST_SUITE_END()
BOOST_FIXTURE_TEST_SUITE(MultiPrecision, Fixture<DecFloat>, *utf::tolerance(Eps<DecFloat>))
BOOST_AUTO_TEST_CASE(check2_3) { checkArea(2, 3); }
BOOST_AUTO_TEST_CASE(check7_2) { checkArea(7, 2); }
BOOST_AUTO_TEST_CASE(check57_31) { checkArea(57, 31); }
BOOST_AUTO_TEST_SUITE_END()
BOOST_AUTO_TEST_SUITE_END()
Prints
I am trying to reconstruct two Z bosons.
I am using this tutorial as an example. https://root.cern.ch/doc/master/df103__NanoAODHiggsAnalysis_8C.html
However, I am still not familiar with using the ROOT data frame, so I am writing it in terms of what I am familiar with.
What I do not understand is how i1, i2, and idx, which are shown in the ROOT tutorial, are defined.
In my tree I have the Muon branch, and the variables in this branch are called Muon.PT, Muon.Eta, Muon.Phi, Muon.Charge, and Muon.Mass.
My attempt to follow the tutorial constructs a muon vector and defines the variables (it generates errors, but I am still learning how to work with this):
#ifdef __CLING__
R__LOAD_LIBRARY(libDelphes)
#include "classes/DelphesClasses.h"
#include "external/ExRootAnalysis/ExRootTreeReader.h"
#include "external/ExRootAnalysis/ExRootResult.h"
#else
class ExRootTreeReader;
class ExRootResult;
#endif
#include <vector>
#include "ROOT/RDataFrame.hxx"
#include "ROOT/RVec.hxx"
#include "ROOT/RDF/RInterface.hxx"
#include "TCanvas.h"
#include "TH1D.h"
#include "TLatex.h"
#include "TLegend.h"
#include "Math/Vector4Dfwd.h"
#include "TStyle.h"
using namespace ROOT::VecOps;
using RNode = ROOT::RDF::RNode;
using rvec_f = const RVec<float> &;
using rvec_i = const RVec<int> &;
const auto z_mass = 91.2;
template<typename T>
void CollectionFilter(const TClonesArray& inColl ,vector<T*>& outColl, Double_t ptMin=30, Double_t etaMax=2.5)
{
const TObject *object;
for (Int_t i = 0; i < inColl.GetEntriesFast(); i++)
{
object = inColl.At(i);
const T *t = static_cast<const T*>(object);
if(t->P4().Pt() < ptMin) continue;
if(TMath::Abs(t->P4().Eta()) > etaMax) continue;
outColl.push_back(t);
}
}
void selectMuon(const char *inputFile)
{
gSystem->Load("libDelphes");
// Create chain of root trees
TChain chain("Delphes");
chain.Add(inputFile);
// Create object of class ExRootTreeReader
ExRootTreeReader *treeReader = new ExRootTreeReader(&chain);
Long64_t numberOfEntries = treeReader->GetEntries();
// Get pointers to branches used in this analysis
TClonesArray *branchMuon = treeReader->UseBranch("Muon");
// Book histograms
TH1F *histZMass = new TH1F("mass", "M_{inv}(Z[1]); M_inv (GeV/c^2); Events", 50, 0.0, 1500);
TH1F *histDiMuonMass = new TH1F("mass", "M_{inv}(Z[3]Z[5]); M_inv (GeV/c^2); Events", 50, 0.0, 1500);
// Define variables
float_t PT, Eta, Phi, Mass, Charge;
// Initializing the vectors
//vector<const Muon*> *muons = new vector<const Muon*>();;
//MuonVector muon(PT, Eta, Phi, Mass, Charge);
RVec<RVec<size_t>> reco_zz_to_4l(rvec_f PT, rvec_f Eta, rvec_f Phi, rvec_f Mass, rvec_i Charge);
RVec<RVec<size_t>> idx(2);
idx[0].reserve(2); idx[1].reserve(2);
auto idx_cmb = Combinations(PT, 2);
for (size_t i = 0; i < idx_cmb[0].size(); i++)
{
const auto i1 = idx_cmb[0][i];
const auto i2 = idx_cmb[1][i];
if(Charge[i1] != Charge[i2]
{
ROOT::Math::PtEtaPhiMVector m1(PT[0], Eta[0], Phi[0], Mass[0]);
ROOT::Math::PtEtaPhiMVector m2(PT[1], Eta[1], Phi[1], Mass[1]);
const auto mass = (m1 + m2).M();
}
}
histDiMuonMass->Fill(mass);
// end of event for loop
histDiMuonMass->Draw();
}
The errors that I am getting are below.
In file included from input_line_188:1:
/mnt/c/1/MG5_aMC_v2_6_6/Delphes/examples/selectMuon.C:89:16: error: subscripted value is not an array, pointer, or vector
if(Charge[i1] != Charge[i2]
~~~~~~^~~
/mnt/c/1/MG5_aMC_v2_6_6/Delphes/examples/selectMuon.C:89:30: error: subscripted value is not an array, pointer, or vector
if(Charge[i1] != Charge[i2]
~~~~~~^~~
/mnt/c/1/MG5_aMC_v2_6_6/Delphes/examples/selectMuon.C:100:28: error: use of undeclared identifier 'mass'
histDiMuonMass->Fill(mass);
How can I fix it?
Thank you.
I learnt that I had previously defined Charge as a float while I was using it as an array.
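For reference, this is roughly how the tutorial's i1, i2, and idx fit together once PT, Eta, Phi, Mass, and Charge are per-event arrays (RVecs) rather than single floats. A simplified sketch of my own (one Z candidate only, not the tutorial's full ZZ selection):
#include "ROOT/RVec.hxx"
#include "Math/Vector4D.h"
#include <cmath>
using ROOT::VecOps::RVec;
// Returns the opposite-charge dimuon mass closest to the Z mass (91.2 GeV).
double best_z_candidate_mass(const RVec<float>& PT, const RVec<float>& Eta,
                             const RVec<float>& Phi, const RVec<float>& Mass,
                             const RVec<int>& Charge) {
    const double z_mass = 91.2;
    double best = -1;
    // idx_cmb[0][k] and idx_cmb[1][k] are the indices (i1, i2) of the k-th muon pair
    const auto idx_cmb = ROOT::VecOps::Combinations(PT, 2);
    for (size_t k = 0; k < idx_cmb[0].size(); ++k) {
        const auto i1 = idx_cmb[0][k];
        const auto i2 = idx_cmb[1][k];
        if (Charge[i1] == Charge[i2]) continue; // keep opposite-charge pairs only
        ROOT::Math::PtEtaPhiMVector p1(PT[i1], Eta[i1], Phi[i1], Mass[i1]);
        ROOT::Math::PtEtaPhiMVector p2(PT[i2], Eta[i2], Phi[i2], Mass[i2]);
        const double m = (p1 + p2).M();
        if (best < 0 || std::abs(m - z_mass) < std::abs(best - z_mass)) best = m;
    }
    return best;
}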
Most of the time when I use boost::geometry::difference it does what I'd expect. However, when I have two polygons whose edges are almost coincident, I get some weird behavior. Consider this example:
#include <iostream>
#include <fstream>
#include <list>
#include <boost/geometry.hpp>
#include <boost/geometry/geometries/point_xy.hpp>
#include <boost/geometry/geometries/polygon.hpp>
#include <boost/geometry/geometries/multi_polygon.hpp>
using PointType = boost::geometry::model::d2::point_xy<double>;
using PolygonType = boost::geometry::model::polygon<PointType>;
using MultiPolygonType = boost::geometry::model::multi_polygon<PolygonType>;
template <typename TPolygon>
void printValidity(const TPolygon& polygons)
{
boost::geometry::validity_failure_type failure;
bool valid = boost::geometry::is_valid(polygons, failure);
if(!valid) {
std::cout << "not valid: " << failure << std::endl;
}
else {
std::cout << "valid." << std::endl;
}
}
template <typename TPolygon>
void WritePolygonsToSVG(const std::vector<TPolygon>& polygons, const std::string& filename)
{
std::ofstream svg(filename);
boost::geometry::svg_mapper<PointType> mapper(svg, 400, 400);
for(size_t polygonID = 0; polygonID < polygons.size(); ++polygonID) {
mapper.add(polygons[polygonID]);
int redValue = 50 * polygonID; // NOTE: This will break with more than 5 polygons
mapper.map(polygons[polygonID], "fill:rgb(" + std::to_string(redValue) + ",128,0);stroke:rgb(0,0,100);stroke-width:1");
}
}
int main()
{
MultiPolygonType polygon1;
boost::geometry::read_wkt("MULTIPOLYGON(((-23.8915 -12.2238,-23.7604 -10.2739,-23.1774 -7.83411,-22.196 -5.52561,-20.8436 -3.41293,-19.7976 -2.26009,-19.8108 -2.00306,24.8732 2.51519,26.0802 -9.42191,-23.6662 -14.452,-23.8915 -12.2238)))", polygon1);
WritePolygonsToSVG(polygon1, "polygon1.svg");
MultiPolygonType polygon2;
boost::geometry::read_wkt("MULTIPOLYGON(((-8.85138 -12.954,-8.89712 -12.7115,-8.9307 -12.5279,-3.35773 -12.078,-3.42937 -11.5832,-3.50013 -11.0882,-3.57007 -10.5932,-3.63925 -10.098,-3.70775 -9.60273,-3.77561 -9.10737,-3.84289 -8.61192,-3.90967 -8.1164,-3.976 -7.62081,-4.04194 -7.12517,-4.10756 -6.62948,-4.17291 -6.13375,-4.23805 -5.63799,-4.30305 -5.14222,-4.36797 -4.64643,-4.43287 -4.15064,-4.49781 -3.65485,-4.56285 -3.15909,-4.62806 -2.66331,-4.69349 -2.16756,-4.75921 -1.67185,-4.82528 -1.17619,-4.89175 -0.680597,-4.91655 -0.497015,17.8166 1.80166,17.8143 1.61653,17.8078 1.11656,17.8009 0.61658,17.7937 0.116605,17.7863 -0.383369,17.7786 -0.883343,17.7707 -1.3833,17.7627 -1.88324,17.7545 -2.38318,17.7463 -2.88312,17.7381 -3.38307,17.7299 -3.88301,17.7218 -4.38295,17.7139 -4.8829,17.706 -5.38285,17.6984 -5.88279,17.6911 -6.38274,17.684 -6.88269,17.6772 -7.38265,17.6709 -7.8826,17.6649 -8.38256,17.6595 -8.88251,17.6545 -9.38247,17.6501 -9.88244,17.6471 -10.2746,-8.85138 -12.954)))", polygon2);
WritePolygonsToSVG(polygon2, "polygon2.svg");
MultiPolygonType differencePolygon;
boost::geometry::difference(polygon1, polygon2, differencePolygon);
WritePolygonsToSVG(differencePolygon, "difference.svg");
printValidity(differencePolygon);
MultiPolygonType realDifference;
boost::geometry::read_wkt("MULTIPOLYGON(((-23.8915 -12.2238,-23.7604 -10.2739,-23.1774 -7.83411,-22.196 -5.52561,-20.8436 -3.41293,-19.7976 -2.26009,-19.8108 -2.00306,24.8732 2.51519,26.0802 -9.42191,-23.6662 -14.452,-23.8915 -12.2238),(-8.85138 -12.954,17.6471 -10.2746,17.6501 -9.88244,17.6545 -9.38247,17.6595 -8.88251,17.6649 -8.38256,17.6709 -7.8826,17.6772 -7.38265,17.684 -6.88269,17.6911 -6.38274,17.6984 -5.88279,17.706 -5.38285,17.7139 -4.8829,17.7218 -4.38295,17.7299 -3.88301,17.7381 -3.38307,17.7463 -2.88312,17.7545 -2.38318,17.7627 -1.88324,17.7707 -1.3833,17.7786 -0.883343,17.7863 -0.383369,17.7937 0.116605,17.8009 0.61658,17.8078 1.11656,17.8143 1.61653,17.8166 1.80166,-4.91655 -0.497015,-4.89175 -0.680597,-4.82528 -1.17619,-4.75921 -1.67185,-4.69349 -2.16756,-4.62806 -2.66331,-4.56285 -3.15909,-4.49781 -3.65485,-4.43287 -4.15064,-4.36797 -4.64643,-4.30305 -5.14222,-4.23805 -5.63799,-4.17291 -6.13375,-4.10756 -6.62948,-4.04194 -7.12517,-3.976 -7.62081,-3.90967 -8.1164,-3.84289 -8.61192,-3.77561 -9.10737,-3.70775 -9.60273,-3.63925 -10.098,-3.57007 -10.5932,-3.50013 -11.0882,-3.42937 -11.5832,-3.35773 -12.078,-8.9307 -12.5279,-8.89712 -12.7115,-8.85138 -12.954)))", realDifference);
WritePolygonsToSVG(realDifference, "realDifference.svg");
printValidity(realDifference);
return 0;
}
The 'differencePolygon' computed using boost::geometry::difference in this code is "valid", though it has an infinitesimally thin region.
The 'realDifference' polygon is what I output from my "real code" (where the input was slightly different, due to what I'm guessing is the precision of the WKT writer). You can see this polygon has two infinitesimally small regions, and it actually returns not valid (21: failure_self_intersections).
Is there some kind of tolerance parameter that can be set to avoid these infinitesimally small regions during the difference operation (which apparently sometimes break the polygon)? Or is there a function to "correct" the polygon by removing these self intersections? (I know the correct() function only fixes ring ordering.)
I am using libsvm version 3.16. I have done some training in Matlab, and created a model. Now I would like to save this model to disk and load this model in my C++ program. So far I have found the following alternatives:
This answer explains how to save a model from C++, which is based on this website. Not exactly what I need, but could be adapted. (This requires development time).
I could find the best training parameters (kernel,C) in Matlab and re-train everything in C++. (Will require doing the training in C++ each time I change a parameter. It's not scalable).
Thus, neither of these options is satisfactory.
Does anyone have an idea?
My solution was to retrain in C++ because I couldn't find a nice way to directly save the model. Here's my code. You'll need to adapt it and clean it up a bit. The biggest change you'll have to make is not hard-coding the svm_parameter values like I did. You'll also have to replace FilePath with std::string. I'm copying, pasting and making small edits here in SO, so the formatting won't be perfect:
Used like this:
auto targetsPath = FilePath("targets.txt");
auto observationsPath = FilePath("observations.txt");
auto targetsMat = MatlabMatrixFileReader::Read(targetsPath, ',');
auto observationsMat = MatlabMatrixFileReader::Read(observationsPath, ',');
auto v = MiscVector::ConvertVecOfVecToVec(targetsMat);
auto model = SupportVectorRegressionModel{ observationsMat, v };
std::vector<double> observation{ { // 32 feature observation
0.883575729725847,0.919446119013878,0.95359403450317,
0.968233630936732,0.91891307107125,0.887897763183844,
0.937588566544751,0.920582702918882,0.888864454119387,
0.890066735260163,0.87911085669864,0.903745573664995,
0.861069296586979,0.838606194934074,0.856376230548304,
0.863011311537075,0.807688936997926,0.740434984165146,
0.738498042748759,0.736410940165691,0.697228384912424,
0.608527698289016,0.632994967880269,0.66935784966765,
0.647761430696238,0.745961037635717,0.560761134660957,
0.545498063585615,0.590854855113663,0.486827902942118,
0.187128866890822,- 0.0746523069562551
} };
double prediction = model.Predict(observation);
miscvector.h
static vector<double> ConvertVecOfVecToVec(const vector<vector<double>> &mat)
{
vector<double> targetsVec;
targetsVec.reserve(mat.size());
for (size_t i = 0; i < mat.size(); i++)
{
targetsVec.push_back(mat[i][0]);
}
return targetsVec;
}
libsvmtargetobjectconvertor.h
#pragma once
#include "machinelearning.h"
struct svm_node;
class LibSvmTargetObservationConvertor
{
public:
svm_node **ConvertObservations(const vector<MlObservation> &observations, size_t numFeatures) const
{
svm_node **svmObservations = (svm_node **)malloc(sizeof(svm_node *) * observations.size());
for (size_t rowI = 0; rowI < observations.size(); rowI++)
{
svm_node *row = (svm_node *)malloc(sizeof(svm_node) * (numFeatures + 1)); // +1 for the terminating sentinel
for (size_t colI = 0; colI < numFeatures; colI++)
{
row[colI].index = colI;
row[colI].value = observations[rowI][colI];
}
row[numFeatures].index = -1; // terminating sentinel required by libsvm
svmObservations[rowI] = row;
}
return svmObservations;
}
svm_node* ConvertMatToSvmNode(const MlObservation &observation) const
{
size_t numFeatures = observation.size();
svm_node *obsNode = (svm_node *)malloc(sizeof(svm_node) * (numFeatures + 1)); // +1 for the terminating sentinel
for (size_t rowI = 0; rowI < numFeatures; rowI++)
{
obsNode[rowI].index = rowI;
obsNode[rowI].value = observation[rowI];
}
obsNode[numFeatures].index = -1; // terminating sentinel required by libsvm
return obsNode;
}
};
machinelearning.h
#pragma once
#include <vector>
using std::vector;
using MlObservation = vector<double>;
using MlTarget = double;
//machinelearningmodel.h
#pragma once
#include <vector>
#include "machinelearning.h"
class MachineLearningModel
{
public:
virtual ~MachineLearningModel() {}
virtual double Predict(const MlObservation &observation) const = 0;
};
matlabmatrixfilereader.h
#pragma once
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
using std::string;
using std::vector;
class FilePath;
// Matrix created with command:
// dlmwrite('my_matrix.txt', somematrix, 'delimiter', ',', 'precision', 15);
// In these files, each row is a matrix row. Commas separate elements on a row.
// There is no space at the end of a row. There is a blank line at the bottom of the file.
// File format:
// 0.4,0.7,0.8
// 0.9,0.3,0.5
// etc.
class MatlabMatrixFileReader
{
public:
static vector<vector<double>> Read(const FilePath &asciiFilePath, char delimiter)
{
vector<vector<double>> values;
vector<double> valueline;
std::ifstream fin(asciiFilePath.Path());
string item, line;
while (getline(fin, line))
{
std::istringstream in(line);
while (getline(in, item, delimiter))
{
valueline.push_back(atof(item.c_str()));
}
values.push_back(valueline);
valueline.clear();
}
fin.close();
return values;
}
};
supportvectorregressionmodel.h
#pragma once
#include <vector>
using std::vector;
#include "machinelearningmodel.h"
#include "svm.h" // libsvm
class FilePath;
class SupportVectorRegressionModel : public MachineLearningModel
{
public:
~SupportVectorRegressionModel()
{
svm_free_model_content(model_);
svm_destroy_param(&param_);
svm_free_and_destroy_model(&model_);
}
SupportVectorRegressionModel(const vector<MlObservation>& observations, const vector<MlTarget>& targets)
{
// assumes all observations have same number of features
size_t numFeatures = observations[0].size();
//setup targets
//auto v = ConvertVecOfVecToVec(targetsMat);
double *targetsPtr = const_cast<double *>(&targets[0]); // why aren't the targets const?
LibSvmTargetObservationConvertor conv;
svm_node **observationsPtr = conv.ConvertObservations(observations, numFeatures);
// setup observations
//svm_node **observations = BuildObservations(observationsMat, numFeatures);
// setup problem
svm_problem problem;
problem.l = targets.size();
problem.y = targetsPtr;
problem.x = observationsPtr;
// specific to our training sets
// TODO: This is hard coded.
// Bust out these values for use in constructor
param_.C = 0.4; // cost
param_.svm_type = 4; // SVR
param_.kernel_type = 2; // radial
param_.nu = 0.6; // SVR nu
// These values are the defaults used in the Matlab version
// as found in svm_model_matlab.c
param_.gamma = 1.0 / (double)numFeatures;
param_.coef0 = 0;
param_.cache_size = 100; // in MB
param_.shrinking = 1;
param_.probability = 0;
param_.degree = 3;
param_.eps = 1e-3;
param_.p = 0.1;
param_.shrinking = 1;
param_.probability = 0;
param_.nr_weight = 0;
param_.weight_label = NULL;
param_.weight = NULL;
// suppress command line output
svm_set_print_string_function([](auto c) {});
model_ = svm_train(&problem, &param_);
}
double Predict(const vector<double>& observation) const override
{
LibSvmTargetObservationConvertor conv;
svm_node *obsNode = conv.ConvertMatToSvmNode(observation);
double prediction = svm_predict(model_, obsNode);
return prediction;
}
SupportVectorRegressionModel(const FilePath & modelFile)
{
model_ = svm_load_model(modelFile.Path().c_str());
}
private:
svm_model *model_;
svm_parameter param_;
};
Option 1 is actually pretty reasonable. If you save the model in libsvm's C format through matlab, then it is straightforward to work with the model in C/C++ using functions provided by libsvm. Trying to work with matlab-formatted data in C++ will probably be much more difficult.
The main function in "svm-predict.c" (located in the root directory of the libsvm package) probably has most of what you need:
if((model=svm_load_model(argv[i+1]))==0)
{
fprintf(stderr,"can't open model file %s\n",argv[i+1]);
exit(1);
}
To predict a label for example x using the model, you can run
int predict_label = svm_predict(model,x);
The trickiest part of this will be to transfer your data into the libsvm format (unless your data is in the libsvm text file format, in which case you can just use the predict function in "svm-predict.c").
A libsvm vector, x, is an array of struct svm_node that represents a sparse array of data. Each svm_node has an index and a value, and the vector must be terminated by an index that is set to -1. For instance, to encode the vector [0,1,0,5], you could do the following:
struct svm_node *x = (struct svm_node *) malloc(3*sizeof(struct svm_node));
x[0].index=2; //NOTE: libsvm indices start at 1
x[0].value=1.0;
x[1].index=4;
x[1].value=5.0;
x[2].index=-1;
For SVM types other than the classifier (C_SVC), look at the predict function in "svm-predict.c".
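Putting the pieces together, a minimal load-and-predict program might look like this (a sketch only: the model file name is made up, and it assumes the model was written in libsvm's text format, e.g. by svm-train or an adapted export from Matlab):
#include <cstdio>
#include "svm.h"
int main()
{
    svm_model *model = svm_load_model("model.libsvm"); // hypothetical file name
    if (model == 0) {
        fprintf(stderr, "can't open model file\n");
        return 1;
    }
    // dense vector [0,1,0,5] encoded sparsely; indices start at 1, -1 terminates
    svm_node x[3];
    x[0].index = 2; x[0].value = 1.0;
    x[1].index = 4; x[1].value = 5.0;
    x[2].index = -1;
    double label = svm_predict(model, x);
    printf("predicted: %f\n", label);
    svm_free_and_destroy_model(&model);
    return 0;
}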