How can I improve the speed of my large txt processing script? - c++

I have a program that scans a very large txt file (.pts file actually) that looks like this :
437288479
-6.9465 -20.49 -1.3345 70
-6.6835 -20.82 -1.3335 83
-7.3105 -20.179 -1.3325 77
-7.1005 -20.846 -1.3295 96
-7.3645 -20.759 -1.2585 79
...
The first line is the number of points contained in the file, and every other line corresponds to an {x,y,z,intensity} point in 3D space. The file above is ~11 GB, but I have more files to process that can be up to ~50 GB.
Here's the code I use to read this file:
#include <iostream>
#include <chrono>
#include <vector>
#include <algorithm>
#include <tuple>
#include <cmath>
// boost library
#include <boost/iostreams/device/mapped_file.hpp>
#include <boost/iostreams/stream.hpp>
struct point
{
double x;
double y;
double z;
};
void readMappedFile()
{
boost::iostreams::mapped_file_source mmap("my_big_file.pts");
boost::iostreams::stream<boost::iostreams::mapped_file_source> is(mmap, std::ios::binary);
std::string line;
// get rid of the first line
std::getline(is, line);
while (std::getline(is, line))
{
point p;
sscanf(line.c_str(),"%lf %lf %lf %*d", &(p.x), &(p.y), &(p.z));
if (p.z > minThreshold && p.z < maxThreshold)
{
// do something with p and store it in the vector of tuples
// O(n) complexity
}
}
}
int main ()
{
readMappedFile();
return 0;
}
For my 11 GB file, scanning all the lines and storing the data in point p takes ~13 minutes.
Is there a way to make it much faster? Each time I scan a point, I also have to do some work with it, which will make my program take several hours to run in the end.
I started looking into using several cores, but it seems that could be problematic if some points are linked together for some reason. If you have any advice on how you would proceed, I'll gladly hear it.
Edit 1: I'm running the program on a laptop with an 8-core 2.9 GHz CPU, 16 GB of RAM and an SSD. The program has to run on similar hardware.
Edit 2: Here's the complete program so you can tell me what I've been doing wrong.
I locate each point in a sort of 2D grid called a slab. Each cell will contain a certain number of points and a mean z value.
#include <iostream>
#include <chrono>
#include <vector>
#include <algorithm>
#include <tuple>
#include <cmath>
// boost library
#include <boost/iostreams/device/mapped_file.hpp>
#include <boost/iostreams/stream.hpp>
struct point
{
double x;
double y;
double z;
};
/*
Compute Slab
*/
float slabBox[6] = {-25.,25.,-25.,25.,-1.,0.};
float dx = 0.1;
float dy = 0.1;
int slabSizeX = (slabBox[1] - slabBox[0]) / dx;
int slabSizeY = (slabBox[3] - slabBox[2]) / dy;
std::vector<std::tuple<double, double, double, int>> initSlab()
{
// initialize the slab vector according to the grid size
std::vector<std::tuple<double, double, double, int>> slabVector(slabSizeX * slabSizeY, {0., 0., 0., 0});
// fill the vector with {x,y} cells coordinates
for (int y = 0; y < slabSizeY; y++)
{
for (int x = 0; x < slabSizeX; x++)
{
slabVector[x + y * slabSizeX] = {x * dx + slabBox[0], y * dy + slabBox[2], 0., 0};
}
}
return slabVector;
}
std::vector<std::tuple<double, double, double, int>> addPoint2Slab(point p, std::vector<std::tuple<double, double, double, int>> slabVector)
{
// find the region {x,y} in the slab in which coord {p.x,p.y} is
int x = (int) floor((p.x - slabBox[0])/dx);
int y = (int) floor((p.y - slabBox[2])/dy);
// calculate the new z value
double z = (std::get<2>(slabVector[x + y * slabSizeX]) * std::get<3>(slabVector[x + y * slabSizeX]) + p.z) / (std::get<3>(slabVector[x + y * slabSizeX]) + 1);
// replace the older z
std::get<2>(slabVector[x + y * slabSizeX]) = z;
// add + 1 point in the cell
std::get<3>(slabVector[x + y * slabSizeX])++;
return slabVector;
}
/*
Parse the file
*/
void readMappedFile()
{
boost::iostreams::mapped_file_source mmap("my_big_file.pts");
boost::iostreams::stream<boost::iostreams::mapped_file_source> is(mmap, std::ios::binary);
std::string line;
std::getline(is, line);
auto slabVector = initSlab();
while (std::getline(is, line))
{
point p;
sscanf(line.c_str(),"%lf %lf %lf %*d", &(p.x), &(p.y), &(p.z));
if (p.z > slabBox[4] && p.z < slabBox[5])
{
slabVector = addPoint2Slab(p, slabVector);
}
}
}
int main ()
{
readMappedFile();
return 0;
}

If the file is stored on an HDD, just reading it at ~100 MB/s takes about 2 minutes for 11 GB, and that is the best case. Try reading the file in blocks, processing each block in another thread while the next block is being read.
Also, you have something like:
std::vector<...> addPoint2Slab(point, std::vector<...> result)
{
...
return result;
}
slabVector = addPoint2Slab(point, slabVector);
I suspect this copies the whole slabVector on every call (although a compiler might optimize some of that away).
Check the speed when you pass the vector by reference instead:
void addPoint2Slab(point, std::vector<...> & result);
And call it as:
addPoint2Slab(point, slabVector);
If that gives a speedup, you can then look into how to forward the results without the overhead.
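To make the difference concrete, here is a tiny sketch with a plain std::vector<int> standing in for the slab. The function names are made up for the illustration; only the parameter passing matters.

```cpp
#include <cstddef>
#include <vector>

// By value: the caller's vector is copied into `cells` on every call,
// the copy is modified, and it is then handed back to the caller.
std::vector<int> incrementByValue(std::vector<int> cells, std::size_t index)
{
    ++cells[index];
    return cells;
}

// By reference: the caller's vector is modified in place, no copy is made.
void incrementByReference(std::vector<int>& cells, std::size_t index)
{
    ++cells[index];
}
```

Both versions produce the same result; the second one just avoids duplicating the whole container for every point.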

Using memory maps is good. Using IOStreams isn't. Here's a complete take using Boost Spirit to do the parsing:
An Easy Starter
I'd suggest some cleanup around the typenames
using Record = std::tuple<double, double, double, int>;
std::vector<Record> initSlab()
{
// initialize the slab vector according to the grid size
std::vector<Record> slabVector(slabSizeX * slabSizeY, {0., 0., 0., 0});
// fill the vector with {x,y} cells coordinates
for (int y = 0; y < slabSizeY; y++) {
for (int x = 0; x < slabSizeX; x++) {
slabVector[x + y * slabSizeX] = {
x * dx + slabBox[0],
y * dy + slabBox[2],
0.,
0,
};
}
}
return slabVector;
}
You could just use a struct instead of the tuple, but that's an exercise for the reader
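As a rough idea of what that exercise might look like, here is a sketch with a hypothetical Cell struct and addToCell helper (neither name is from the code above). It performs the same running-mean update, just with named fields instead of std::get<N>.

```cpp
// Hypothetical replacement for the Record tuple: one grid cell with named fields.
struct Cell {
    double x = 0.0;      // x coordinate of the cell
    double y = 0.0;      // y coordinate of the cell
    double zMean = 0.0;  // running mean of z for the points in this cell
    int count = 0;       // number of points added so far
};

// Same running-mean update as in addPoint2Slab, written against named fields.
void addToCell(Cell& cell, double z)
{
    cell.zMean = (cell.zMean * cell.count + z) / (cell.count + 1);
    cell.count += 1;
}
```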
Don't Copy The SlabVector All The Time
You had addPoint2Slab taking the slabVector by value (copying) and returning the modified vector. Even if that's optimized to a couple of moves, it's still at least allocating a temporary copy each time addPoint2Slab is called. Instead, make it a mutating function as intended:
void addPoint2Slab(point const p, std::vector<Record>& slabVector)
{
// find the region {x,y} in the slab in which coord {p.x,p.y} is
int x = (int) floor((p.x - slabBox[0])/dx);
int y = (int) floor((p.y - slabBox[2])/dy);
auto& [ix, iy, iz, icount] = slabVector[x + y * slabSizeX];
iz = (iz * icount + p.z) / (icount + 1);
icount += 1;
}
Note also that the tuple handling has been greatly simplified with structured bindings. You can even see what the code is doing, which was nearly impossible before - let alone verify.
ReadMappedFile
auto readMappedFile(std::string fname)
{
auto slabVector = initSlab();
boost::iostreams::mapped_file_source mmap(fname);
auto handle = [&](auto& ctx) {
using boost::fusion::at_c;
point p{at_c<0>(_attr(ctx)), at_c<1>(_attr(ctx)), at_c<2>(_attr(ctx))};
//auto intensity = at_c<3>(_attr(ctx));
if (p.z > slabBox[4] && p.z < slabBox[5])
addPoint2Slab(p, slabVector);
};
namespace x3 = boost::spirit::x3;
static auto const line_ =
x3::double_ >> x3::double_ >> x3::double_ >> x3::int_;
auto first = mmap.data(), last = first + mmap.size();
try {
bool ok = x3::phrase_parse( //
first, last,
x3::expect[x3::uint_ >> x3::eol] //
>> line_[handle] % x3::eol //
// expect EOF here
>> *x3::eol >> x3::expect[x3::eoi], //
x3::blank);
// ok is true due to the expectation points
assert(ok);
} catch (x3::expectation_failure<char const*> const& ef) {
auto where = ef.where();
auto till = std::min(last, where + 32);
throw std::runtime_error("Expected " + ef.which() + " at #" +
std::to_string(where - mmap.data()) + " '" +
std::string(where, till) + "'...");
}
return slabVector;
}
Here we use Boost Spirit X3 to generate a parser that reads the lines and calls handle on each, much like you had before. A modicum of error handling has been added.
Let's Test It
Here's the test driver I used
#include <fmt/ranges.h>
#include <fstream>
#include <random>
#include <ranges>
using std::ranges::views::filter;
int main()
{
std::string const fname = "T032_OSE.pts";
#if 0 || defined(GENERATE)
using namespace std;
// generates a ~12Gib file
ofstream ofs(fname);
mt19937 prng{random_device{}()};
uniform_real_distribution<float> x(-25, 25), y(-25, +25), z(-1, 0);
uniform_int_distribution<> n(0, 100);
auto N = 437288479;
ofs << N << "\n";
while (N--)
ofs << x(prng) << " " << y(prng) << " " << z(prng) << " " << n(prng) << "\n";
#else
auto sv = readMappedFile(fname);
auto has_count = [](Record const& tup) { return get<3>(tup) > 0; };
fmt::print("slabVector:\n{}\n", fmt::join(sv | filter(has_count), "\n"));
#endif
}
Notice how you can use the conditionally compiled code to generate an input file (because I don't have your large file).
On this ~13GiB file (compressed copy online) it runs in 1m14s on my machine:
slabVector:
(-25, -25, -0.49556059843940164, 1807)
(-24.899999618530273, -25, -0.48971092838941654, 1682)
(-24.799999237060547, -25, -0.49731256076256386, 1731)
(-24.700000762939453, -25, -0.5006042266973916, 1725)
(-24.600000381469727, -25, -0.5000671732885645, 1784)
(-24.5, -25, -0.4940826157717386, 1748)
(-24.399999618530273, -25, -0.5045350563593015, 1720)
(-24.299999237060547, -25, -0.5088279537549671, 1812)
(-24.200000762939453, -25, -0.5065565364794715, 1749)
(-24.100000381469727, -25, -0.4933392542558793, 1743)
(-24, -25, -0.4947248105973453, 1808)
(-23.899999618530273, -25, -0.48640208470636714, 1696)
(-23.799999237060547, -25, -0.4994672590531847, 1711)
(-23.700000762939453, -25, -0.5033631130808075, 1782)
(-23.600000381469727, -25, -0.4995593140170436, 1760)
(-23.5, -25, -0.5009948279948179, 1737)
(-23.399999618530273, -25, -0.4995986820225158, 1732)
(-23.299999237060547, -25, -0.49833906199795897, 1764)
(-23.200000762939453, -25, -0.5013796942594327, 1728)
(-23.100000381469727, -25, -0.5072275248223541, 1700)
(-23, -25, -0.4949060352670081, 1749)
(-22.899999618530273, -25, -0.5026246990689665, 1740)
(-22.799999237060547, -25, -0.493411989775698, 1746)
// ... ~25k lines skipped...
(24.200000762939453, 24.900001525878906, -0.508382879738258, 1746)
(24.299999237060547, 24.900001525878906, -0.5064457874896565, 1740)
(24.400001525878906, 24.900001525878906, -0.4990733400392924, 1756)
(24.5, 24.900001525878906, -0.5063144518978036, 1732)
(24.60000228881836, 24.900001525878906, -0.49988387744959534, 1855)
(24.700000762939453, 24.900001525878906, -0.49970549673984693, 1719)
(24.799999237060547, 24.900001525878906, -0.48656442707683384, 1744)
(24.900001525878906, 24.900001525878906, -0.49267272688797675, 1705)
Remaining Notes
Beware of numerical error. You used float in some places, but with data sets this large it's very likely you will get noticeably large numeric errors in the running average calculation. Consider switching to [long] double or use a "professional" accumulator (many existing correlation frameworks or Boost Accumulator will do better).
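One possible shape for such an accumulator is sketched below. It is not what the code above does: MeanAccumulator is a hypothetical type that keeps a compensated (Kahan) sum plus a count and only divides when the mean is requested, instead of re-deriving the mean on every point.

```cpp
#include <cstddef>

// Hypothetical per-cell accumulator using Kahan (compensated) summation.
struct MeanAccumulator {
    double sum = 0.0;           // running sum of the samples
    double compensation = 0.0;  // low-order bits lost by earlier additions
    std::size_t count = 0;      // number of samples added

    void add(double value)
    {
        double y = value - compensation;  // re-inject the previously lost bits
        double t = sum + y;               // low-order bits of y may be lost here
        compensation = (t - sum) - y;     // recover what was just lost
        sum = t;
        ++count;
    }

    double mean() const
    {
        return count == 0 ? 0.0 : sum / static_cast<double>(count);
    }
};
```

Note that aggressive floating-point optimization flags such as -ffast-math can defeat the compensation step.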
Full Code
Live On Compiler Explorer
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <iostream>
#include <stdexcept>
#include <string>
#include <tuple>
#include <vector>
#include <fmt/ranges.h>
// boost library
#include <boost/iostreams/device/mapped_file.hpp>
#include <boost/iostreams/stream.hpp>
struct point { double x, y, z; };
/*
Compute Slab
*/
using Float = float; //
Float slabBox[6] = {-25.,25.,-25.,25.,-1.,0.};
Float dx = 0.1;
Float dy = 0.1;
int slabSizeX = (slabBox[1] - slabBox[0]) / dx;
int slabSizeY = (slabBox[3] - slabBox[2]) / dy;
using Record = std::tuple<double, double, double, int>;
std::vector<Record> initSlab()
{
// initialize the slab vector according to the grid size
std::vector<Record> slabVector(slabSizeX * slabSizeY, {0., 0., 0., 0});
// fill the vector with {x,y} cells coordinates
for (int y = 0; y < slabSizeY; y++) {
for (int x = 0; x < slabSizeX; x++) {
slabVector[x + y * slabSizeX] = {
x * dx + slabBox[0],
y * dy + slabBox[2],
0.,
0,
};
}
}
return slabVector;
}
void addPoint2Slab(point const p, std::vector<Record>& slabVector)
{
// find the region {x,y} in the slab in which coord {p.x,p.y} is
int x = (int) floor((p.x - slabBox[0])/dx);
int y = (int) floor((p.y - slabBox[2])/dy);
auto& [ix, iy, iz, icount] = slabVector[x + y * slabSizeX];
iz = (iz * icount + p.z) / (icount + 1);
icount += 1;
}
/* Parse the file */
#include <boost/spirit/home/x3.hpp>
auto readMappedFile(std::string fname)
{
auto slabVector = initSlab();
boost::iostreams::mapped_file_source mmap(fname);
auto handle = [&](auto& ctx) {
using boost::fusion::at_c;
point p{at_c<0>(_attr(ctx)), at_c<1>(_attr(ctx)), at_c<2>(_attr(ctx))};
//auto intensity = at_c<3>(_attr(ctx));
if (p.z > slabBox[4] && p.z < slabBox[5])
addPoint2Slab(p, slabVector);
};
namespace x3 = boost::spirit::x3;
static auto const line_ =
x3::double_ >> x3::double_ >> x3::double_ >> x3::int_;
auto first = mmap.data(), last = first + mmap.size();
try {
bool ok = x3::phrase_parse( //
first, last,
x3::expect[x3::uint_ >> x3::eol] //
>> line_[handle] % x3::eol //
// expect EOF here
>> *x3::eol >> x3::expect[x3::eoi], //
x3::blank);
// ok is true due to the expectation points
assert(ok);
} catch (x3::expectation_failure<char const*> const& ef) {
auto where = ef.where();
auto till = std::min(last, where + 32);
throw std::runtime_error("Expected " + ef.which() + " at #" +
std::to_string(where - mmap.data()) + " '" +
std::string(where, till) + "'...");
}
return slabVector;
}
#include <fmt/ranges.h>
#include <fstream>
#include <random>
#include <ranges>
using std::ranges::views::filter;
int main()
{
std::string const fname = "T032_OSE.pts";
#if 0 || defined(GENERATE)
using namespace std;
// generates a ~12Gib file
ofstream ofs(fname);
mt19937 prng{random_device{}()};
uniform_real_distribution<Float> x(-25, 25), y(-25, +25), z(-1, 0);
uniform_int_distribution<> n(0, 100);
auto N = 437288479;
ofs << N << "\n";
while (N--)
ofs << x(prng) << " " << y(prng) << " " << z(prng) << " " << n(prng) << "\n";
#else
auto sv = readMappedFile(fname);
auto has_count = [](Record const& tup) { return get<3>(tup) > 0; };
fmt::print("slabVector:\n{}\n", fmt::join(sv | filter(has_count), "\n"));
#endif
}

Get rid of std::getline. iostreams are pretty slow compared to direct in-memory processing of strings. Also, do not use sscanf.
Allocate a large chunk of memory, e.g. 128 MB or more. Fill it from the file in one call, then parse this chunk until you reach its end.
Sort of like this:
std::vector<char> huge_chunk(128*1024*1024);
std::ifstream in("my_file", std::ios::binary);
do {
in.read(huge_chunk.data(), huge_chunk.size());
parse(huge_chunk.data(), in.gcount()); // parse() is a placeholder for your own chunk parser
} while (in.good());
You get the idea.
Parse the chunk with strtof, find and the like.
Parsing the chunk will leave a few characters at the end of the chunk which do not form a complete line. You need to store them temporarily and resume parsing the next chunk from there.
Generally speaking: the fewer calls to ifstream, the better. And lower-level API functions such as strtof, strtoul etc. are usually faster than sscanf, format etc.
This usually does not matter for small files (<1 MB), but can make a huge difference with very large files.
Also: use a profiler to find out exactly where your program is waiting. Intel's VTune profiler is free, afaik. It is part of the oneAPI toolkit and is one of the best tools I know.


CPP: why is dereferencing a pointer to a template type not populating a variable?

I do a lot of modeling and simulation and I am writing a sim_logger in C++. The basics are these: a user constructs the class with a logging frequency and an output path. They can then "register" any number of variables, which gives the logger a pointer to each desired variable (it's not incredibly safe right now, but I'll work on that later; I'm focused on the issue at hand). I've created a template type called "variable" which contains three things: T *var, T last_val, and std::string ident. My problem is this: whenever I set last_val equal to *var, the last_val inside the variable does not actually change. I am setting this value in line 180 of sim_logger.h. I feel like this is a silly problem, probably due to some misunderstanding I have of pointers. However, I've tried several different things and cannot seem to solve it.
sim_logger.h
#include <iostream>
#include <iomanip>
#include <fstream>
#include <vector>
#include <variant>
#include <type_traits>
#include <math.h>
#pragma once
// a class to log simulation data
// specifically for logging time dependent differential functions
class sim_logger
{
private:
// a type that represents a variable
/*
meant to contain anything, but limited by the variadic type
"poly_var_types" below
*/
template <typename T>
struct variable
{
T *var; // pointer to the variable itself
T last_val; // the last value of the variable
std::string ident; // the identity of the variable
};
// a variadic type
template <typename ... T>
using poly_var_types = std::variant<T...>;
// defined variable types
// these are the typical types that are logged, feel free to add more
using var_types = poly_var_types<
variable<double>,
variable<float>
// variable<int>,
// variable<bool>,
// variable<std::string>
>;
// class members
std::vector<var_types> registered_variables; // container of all variables
std::ofstream file; // output file stream
double dt; // the logging time step in seconds
double clock = 0.0; // the logging clock in seconds
double last_sim_time = clock; // the last sim time for interp
bool is_time_to_log = false; // flag for log function
const double EPSILON = 0.000000001; // rounding error
// a linear interpolation method
// only returns floating point values
double lin_interp(double x, double x1, double x2, double y1, double y2)
{
return (y1+(x-x1)*((y2-y1)/(x2-x1)));
}
public:
// constructor which sets the logging frequency and output path
// log_dt is a floating point value in units of seconds
// path_to_file is a string representation of the desired output path
sim_logger(double log_dt, std::string path_to_file)
{
dt = log_dt;
file.open(path_to_file);
file << std::setprecision(16) << std::fixed;
}
// method to register a variable with the logger
template <typename T>
void register_variable(std::string ident, T *aVar)
{
variable<T> v;
v.ident = ident;
v.var = aVar;
registered_variables.push_back(v);
};
// a method to write the log file header and log sim time 0.0 data
void write_header_and_log_init_data()
{
// write header
file << "sim_time" << " ";
for (int i = 0; i < registered_variables.size(); i++)
{
std::visit([&](auto rv)
{
if (i == registered_variables.size()-1)
file << rv.ident << "\n";
else
file << rv.ident << " ";
}, registered_variables[i]);
}
// log all registered variables
file << clock << " ";
for (int i = 0; i < registered_variables.size(); i++)
{
std::visit([&](auto rv)
{
if (i == registered_variables.size()-1)
file << *rv.var << "\n";
else
file << *rv.var << " ";
}, registered_variables[i]);
}
}
// method to log all registered variables
void log_data(double sim_time)
{
// check the timing
if (sim_time > (clock + dt))
{
is_time_to_log = true;
}
// check if its time to log
if (is_time_to_log)
{
// update the clock
clock += dt;
// debug
std::cout << "\n";
// log all registered variables
file << clock << " ";
for (int i = 0; i < registered_variables.size(); i++)
{
std::visit([&](auto rv)
{
// instantiate the value to be logged
double log_val;
// debug
std::cout << rv.last_val << " " << *rv.var << std::endl;
// if sim time is even with clock time, log at time
if (fabs(sim_time - clock) < EPSILON)
// if (true)
{
log_val = *rv.var;
}
// if sim time is past clock time, interpolate
else
{
log_val = lin_interp(sim_time, last_sim_time,
clock, rv.last_val, *rv.var);
}
// if last variable in vector create new line
if (i == registered_variables.size()-1)
file << log_val << "\n";
// otherwise just whitespace
else
file << log_val << " ";
}, registered_variables[i]);
}
// debug
std::cout << "\n";
// reset flag
is_time_to_log = false;
}
// get all the last values
for (int i = 0; i < registered_variables.size(); i++)
{
std::visit([&](auto rv)
{
// have to get last value at every update call
// This works in scope but the memory does not actually change?
// I am very confuse.
rv.last_val = *rv.var;
// debug
std::cout << rv.last_val << " " << *rv.var << std::endl;
}, registered_variables[i]);
}
// set the last sim time
last_sim_time = sim_time;
}
};
main.cpp
#include <iostream>
#include "sim_logger.h"
int main()
{
sim_logger logger(0.1, "sim_logger/log.dat");
double test1 = 100.0;
double test2 = 100.0;
double test3 = 100.0;
logger.register_variable("test1", &test1);
logger.register_variable("test2", &test2);
logger.register_variable("test3", &test3);
logger.write_header_and_log_init_data();
double simTime = 0.0;
double simDt = 1.0 / 20.0;
for (int i = 0; i < 3; i++)
{
simTime += simDt;
test1 += 1.0;
test2 += 2.0;
test3 += 3.0;
logger.log_data(simTime);
}
return 0;
};
output
101 101
102 102
103 103
102 102
104 104
106 106
1.88705e-26 103
1.88705e-26 106
1.88705e-26 109
103 103
106 106
109 109
std::visit([&](auto rv)
rv is, effectively, a parameter to this function (the closure, for the purposes of this answer, is effectively a function).
As you know, in C++ function parameters are passed by value unless they are declared as references. For example, using a simple function:
void func(int x)
{
x=5;
}
This func can set x to 5 as often as it wants. Whatever actually gets passed in, by anyone that calls func(), will remain unaffected:
int z=7;
func(z);
z is still 7, even though func set its parameter to 5. This is fundamental to C++:
std::visit([&](auto rv)
{
rv.last_val = *rv.var;
So, this sets rv.last_val. Great. But this has no effect on whatever gets passed into here.
}, registered_variables[i]);
The visited instance of this variant is still what it is. It hasn't changed. Why would it change? C++ does not work this way.
So, if your intent, here, is to modify registered_variables[i], it should be passed by reference:
std::visit([&](auto &rv)
Now, the object referenced by rv gets modified.
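Here is a small self-contained sketch of the difference. Tracked and AnyTracked are hypothetical stand-ins for the logger's variable<T> and var_types; only the `auto rv` versus `auto& rv` parameter differs between the two functions.

```cpp
#include <variant>
#include <vector>

template <typename T>
struct Tracked {   // hypothetical stand-in for sim_logger::variable<T>
    T* var;        // pointer to the watched variable
    T last_val;    // last value seen
};

using AnyTracked = std::variant<Tracked<double>, Tracked<float>>;

// Takes each alternative by value: only the lambda's local copy is updated.
void snapshotByValue(std::vector<AnyTracked>& items)
{
    for (auto& item : items)
        std::visit([](auto rv) { rv.last_val = *rv.var; }, item);
}

// Takes each alternative by reference: the element stored in the vector is updated.
void snapshotByReference(std::vector<AnyTracked>& items)
{
    for (auto& item : items)
        std::visit([](auto& rv) { rv.last_val = *rv.var; }, item);
}
```

After snapshotByValue the stored last_val is unchanged; after snapshotByReference it holds the current value of the watched variable.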

Arrays to binary files and vice versa

I was trying to write an array of ints to a binary file, then read the file back into another array of the same size, but I don't understand why the second array contains the correct numbers only up to 25 (its 26th element, since the numbers I wrote in the first array start from 0).
A very weird thing I noticed is that if I replace 'x[i] = i;' with 'x[i] = i * 3;' in the first for loop in main, the correct numbers are printed up to 279 instead of 25 (and 25*3 != 279).
How can I write/read binary files to/from raw arrays in C++?
main.cpp:
#include "arrays_binary_files.hpp"
#include <iostream>
int main()
{
int x[1000];
//#if 0
for (size_t i{ 0 }; i != sizeof x / sizeof * x; ++i)
x[i] = i;
//#endif
int y[sizeof x / sizeof *x];
std::cout << "write x to x.bin? [enter to continue] _";
(void)getchar();
std::cout << "\nwriting x to x.bin...";
arrToBinFile(x, sizeof x / sizeof * x, "x.bin");
std::cout << "\nwrote x to x.bin";
std::cout << "\n\nread x.bin into y? [enter to continue] _";
(void)getchar();
std::cout << "reading x.bin into y...";
binFileToArr("x.bin", y, sizeof y / sizeof * x);
std::cout << "\nread x.bin into y";
std::cout << "\n\ndisplay y? [enter to continue] _";
for (size_t i{ 0 }; i != sizeof y / sizeof * y; ++i) {
std::cout << '\n' << y[i];//prints correctly only up to 25
(void)getchar();
}
return 0;
}
arrays_binary_files.hpp:
#ifndef arrays_binary_files_hpp_included
#define arrays_binary_files_hpp_included
#include <fstream>
#include <filesystem>
//namespace {
/*
* gives internal linkage (like specifying static for everything), so that each
* function is "local" in each translation unit which is pasted in by #include
*/
//I tried inlining in order to debug with breakpoints, but I still couldn't see where the bug is
inline char arrToBinFile(const int inputArray[], const size_t inputArrayLength, const std::string& fileName) {
std::ofstream outputData;
outputData.open(fileName);
if (outputData) {
outputData.write(reinterpret_cast<const char*>(inputArray), sizeof(int) * inputArrayLength);
outputData.close();
return 0;
}
else {
outputData.close();
return -1;
}
}
//I tried inlining in order to debug with breakpoints, but I still couldn't see where the bug is
inline char binFileToArr(const std::string& fileName, int outputArray[], const size_t outputArrayLength) {
std::ifstream inputData;
inputData.open(fileName);
if (inputData /*&& std::filesystem::file_size(fileName) <= outputArrayLength*/) {
inputData.read(reinterpret_cast<char*>(outputArray), sizeof(int) * outputArrayLength);
inputData.close();
return 0;
}
else {
inputData.close();
return -1;
}
}
//}
#endif
screenshot of the console in case of leaving the main function in main.cpp as it is:
screenshot of the console in case of replacing 'x[i] = i;' with 'x[i] = i * 3;' in the main function in main.cpp:
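The symptoms are consistent with the files being opened in text mode on Windows, where the byte 0x1A (decimal 26) is treated as end of file when reading. This is an assumption, since the question does not state the platform, but the numbers fit: with x[i] = i the first such byte belongs to the value 26, and with x[i] = i * 3 it belongs to 282 (0x11A), right after 279. A minimal sketch that opens both streams with std::ios::binary follows; writeInts and readInts are hypothetical names, not the functions from the header above.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Writes `count` ints as raw bytes; returns true on success.
bool writeInts(const std::string& fileName, const int* data, std::size_t count)
{
    std::ofstream out(fileName, std::ios::binary);  // binary: no newline or EOF translation
    out.write(reinterpret_cast<const char*>(data),
              static_cast<std::streamsize>(sizeof(int) * count));
    out.flush();  // surface write errors before reporting success
    return out.good();
}

// Reads exactly `count` ints; returns false if the file is missing or too short.
bool readInts(const std::string& fileName, int* data, std::size_t count)
{
    std::ifstream in(fileName, std::ios::binary);
    in.read(reinterpret_cast<char*>(data),
            static_cast<std::streamsize>(sizeof(int) * count));
    return static_cast<bool>(in);
}
```

The resulting file is only guaranteed to read back correctly on a machine with the same int size and byte order.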

C++ Vector of structs, read access violation

Edit: The for loop didn't have an ending condition. Newbie mistake.
I'm doing a school assignment, using MS VS, which has very specific requirements. We're reading shape names and dimensions from a txt file, creating a struct for each shape with only the dimensions as members, and using a supporting function to calculate area/volume and output the results. It has to have 4 loops:
The first will parse a txt file line by line, check the type of shape, create a dynamic object and put it into a generic bag.
The second will process the bag and output the type of shape, dimensions, and calculations to the console.
The third will do the same but output to another txt file.
The last loop will delete all dynamic objects.
My program with only the code for squares:
#include <iterator>
#include <string>
#include <sstream>
#include <vector>
#include <iostream>
#include <fstream>
#include <map>
#include <cmath>
#include <cstdlib>
using namespace std;
int main()
{
string line, str;
double d1, d2, d3;
map < string, int > shapes;
vector<void*> myBag;
vector<char> myBagType;
shapes.insert(pair<string, int>("SQUARE", 1));
ifstream shapesin("TextFile1.txt");
ofstream shapesout("TextFile2.txt");
if (!shapesin || !shapesout)
{
cout << "Unable to open file\n";
}
while (getline(shapesin, line))
{
d1 = d2 = d3 = 0;
vector<string> token = parseString(line);
if (token.size() >= 1)
{
str = token[0];
switch (shapes[str])
{
case 1: //Square
{
Square* s = new Square;
myBag.push_back(s);
myBagType.push_back('S');
if (token.size() < 2)
{
s->side = 0;
break;
}
else
{
str = token[1];
d1 = atof(str.c_str());
s->side = d1;
}
break;
}
}
}
for (unsigned int i = 0; myBag.size(); i++)
{
if (myBagType[i] == 'S')
{
Square* aSquare = reinterpret_cast<Square*>(myBag[i]);
Square& bSquare = *aSquare;
outputSquare(cout, bSquare);
}
}
for (unsigned int i = 0; myBag.size(); i++)
{
if (myBagType[i] == 'S')
{
Square* aSquare = reinterpret_cast<Square*>(myBag[i]);
Square& bSquare = *aSquare;
outputSquare(shapesout, bSquare);
}
}
for (unsigned int i = 0; myBag.size(); i++)
{
if (myBagType[i] == 'S')
{
Square* aSquare = reinterpret_cast<Square*>(myBag[i]);
delete aSquare;
}
}
shapesin.close();
shapesout.close();
}
}
vector<string> parseString(string str)
{
stringstream s(str);
istream_iterator<string> begin(s), end;
return vector<string>(begin, end);
}
void outputSquare(ostream& shapesout, const Square& x)
{
double perim, area;
perim = (x.side * 4); //exception thrown here
area = (x.side * x.side);
shapesout << "SQUARE " << "side=" << x.side;
shapesout.setf(ios::fixed);
shapesout.precision(2);
shapesout << " area=" << area << " perimeter=" << perim << endl;
shapesout.unsetf(ios::fixed);
shapesout.precision(6);
}
Txt file input is:
SQUARE 14.5 344
SQUARE
RECTANGLE 14.5 4.65
DIMENSIONS
CIRCLE 14.5
BOX x 2 9
CUBE 13
BOX 1 2 3
CYLINDER 2.3 4 56
CANDY
SPHERE 2.4
CYLINDER 1.23
CYLINDER 50 1.23
TRIANGLE 1.2 3.2
PRISM 2.199 5
EOF
I know I have a problem with the way I'm accessing the struct member x.side, but every other way I've tried won't compile, whereas this will at least output the first line. I've read other, similar questions but couldn't find one quite like this. I would really appreciate some assistance.
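for (unsigned int i = 0; myBag.size(); i++)
No terminating condition: myBag.size() is non-zero once the bag has an element, so the loop never stops and i runs past the end of the vector.
for (unsigned int i = 0; i < myBag.size(); i++)
Fixed.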
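For comparison, here is a small sketch of the two usual ways to walk a vector without running off its end. The functions are made up for the illustration and use plain ints instead of the shape bag.

```cpp
#include <cstddef>
#include <vector>

// Index-based loop: stops because the condition compares i against size().
int sumByIndex(const std::vector<int>& bag)
{
    int total = 0;
    for (std::size_t i = 0; i < bag.size(); ++i)
        total += bag[i];
    return total;
}

// Range-based loop: there is no index or condition to get wrong.
int sumByRange(const std::vector<int>& bag)
{
    int total = 0;
    for (int value : bag)
        total += value;
    return total;
}
```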

Store struct containing vector and cv::Mat to disk - Data serialization in C++

I'd like to store the structure below in a disk and be able to read it again: (C++)
struct pixels {
std::vector<cv::Point> indexes;
cv::Mat values;
};
I've tried to use ofstream and ifstream but they need the size of the variable which I don't really know how to calculate in this situation. It's not a simple struct with some int and double. Is there any way to do it in C++, preferably without using any third-party libraries.
(I'm actually coming from the Matlab language. It was easy to do it in that language using save: save(filename, variables)).
Edit:
I've just tried Boost Serialization. Unfortunately it's very slow for my use.
Several approaches come to mind, with various pros and cons.
1. Use OpenCV's XML/YAML persistence functionality
   - XML format (portable)
   - YAML format (portable)
   - JSON format (portable)
2. Use Boost.Serialization
   - Plain text format (portable)
   - XML format (portable)
   - binary format (non-portable)
3. Raw data to std::fstream
   - binary format (non-portable)
By "portable" I mean that data files written on an arbitrary platform+compiler can be read on any other platform+compiler. By "non-portable", I mean that's not necessarily the case. Endianness matters, and compilers could possibly make a difference too. You could add additional handling for such situations at the cost of performance. In this answer, I'll assume you're reading and writing on the same machine.
First here are includes, common data structures and utility functions we will use:
#include <opencv2/opencv.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/archive/xml_iarchive.hpp>
#include <boost/filesystem.hpp>
#include <boost/serialization/vector.hpp>
#include <chrono>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>
// ============================================================================
using std::chrono::high_resolution_clock;
using std::chrono::duration_cast;
using std::chrono::microseconds;
namespace ba = boost::archive;
namespace bs = boost::serialization;
namespace fs = boost::filesystem;
// ============================================================================
struct pixels
{
std::vector<cv::Point> indexes;
cv::Mat values;
};
struct test_results
{
bool matches;
double write_time_ms;
double read_time_ms;
size_t file_size;
};
// ----------------------------------------------------------------------------
bool validate(pixels const& pix_out, pixels const& pix_in)
{
bool result(true);
result &= (pix_out.indexes == pix_in.indexes);
result &= (cv::countNonZero(pix_out.values != pix_in.values) == 0);
return result;
}
pixels generate_data()
{
pixels pix;
for (int i(0); i < 10000; ++i) {
pix.indexes.emplace_back(i, 2 * i);
}
pix.values = cv::Mat(1024, 1024, CV_8UC3);
cv::randu(pix.values, 0, 256);
return pix;
}
void dump_results(std::string const& label, test_results const& results)
{
std::cout << label << "\n";
std::cout << "Matched = " << (results.matches ? "true" : "false") << "\n";
std::cout << "Write time = " << results.write_time_ms << " ms\n";
std::cout << "Read time = " << results.read_time_ms << " ms\n";
std::cout << "File size = " << results.file_size << " bytes\n";
std::cout << "\n";
}
// ============================================================================
Using OpenCV FileStorage
The first obvious choice is to use the serialization functionality OpenCV provides: cv::FileStorage, cv::FileNode and cv::FileNodeIterator. There's a nice tutorial in the 2.4.x documentation, which I can't seem to find right now in the new docs.
The advantage here is that we already have support for cv::Mat and cv::Point, so there's very little to implement.
However, all the formats provided are textual, so there will be a fairly large cost in reading and writing the values (especially for the cv::Mat). It may be advantageous to save/load the cv::Mat using cv::imread/cv::imwrite and serialize the filename. I'll leave this to the reader to implement and benchmark.
// ============================================================================
void save_pixels(pixels const& pix, cv::FileStorage& fs)
{
fs << "indexes" << "[";
for (auto const& index : pix.indexes) {
fs << index;
}
fs << "]";
fs << "values" << pix.values;
}
void load_pixels(pixels& pix, cv::FileStorage& fs)
{
cv::FileNode n(fs["indexes"]);
if (n.type() != cv::FileNode::SEQ) {
throw std::runtime_error("Input format error: `indexes` is not a sequence.");
}
pix.indexes.clear();
cv::FileNodeIterator it(n.begin()), it_end(n.end());
cv::Point pt;
for (; it != it_end; ++it) {
(*it) >> pt;
pix.indexes.push_back(pt);
}
fs["values"] >> pix.values;
}
// ----------------------------------------------------------------------------
test_results test_cv_filestorage(std::string const& file_name, pixels const& pix)
{
test_results results;
pixels pix_in;
high_resolution_clock::time_point t1 = high_resolution_clock::now();
{
cv::FileStorage fs(file_name, cv::FileStorage::WRITE);
save_pixels(pix, fs);
}
high_resolution_clock::time_point t2 = high_resolution_clock::now();
{
cv::FileStorage fs(file_name, cv::FileStorage::READ);
load_pixels(pix_in, fs);
}
high_resolution_clock::time_point t3 = high_resolution_clock::now();
results.matches = validate(pix, pix_in);
results.write_time_ms = static_cast<double>(duration_cast<microseconds>(t2 - t1).count()) / 1000;
results.read_time_ms = static_cast<double>(duration_cast<microseconds>(t3 - t2).count()) / 1000;
results.file_size = fs::file_size(file_name);
return results;
}
// ============================================================================
Using Boost Serialization
Another potential approach is to use the Boost.Serialization library, as you mention you have tried. We have three options here for the archive format: two are textual (and portable), and one is binary (non-portable, but much more efficient).
There's more work to do here. We need to provide good serialization for cv::Mat, cv::Point and our pixels structure. Support for std::vector is provided, and to handle XML, we need to generate key-value pairs.
In case of the two textual formats, it may again be advantageous to save the cv::Mat as an image, and only serialize the path. The reader is free to try this approach. For binary format it would most likely be a tradeoff between space and time. Again, feel free to test this (you could even use cv::imencode and imdecode).
// ============================================================================
namespace boost { namespace serialization {
template<class Archive>
void serialize(Archive &ar, cv::Mat& mat, const unsigned int)
{
int cols, rows, type;
bool continuous;
if (Archive::is_saving::value) {
cols = mat.cols; rows = mat.rows; type = mat.type();
continuous = mat.isContinuous();
}
ar & boost::serialization::make_nvp("cols", cols);
ar & boost::serialization::make_nvp("rows", rows);
ar & boost::serialization::make_nvp("type", type);
ar & boost::serialization::make_nvp("continuous", continuous);
if (Archive::is_loading::value)
mat.create(rows, cols, type);
if (continuous) {
size_t const data_size(rows * cols * mat.elemSize());
ar & boost::serialization::make_array(mat.ptr(), data_size);
} else {
size_t const row_size(cols * mat.elemSize());
for (int i = 0; i < rows; i++) {
ar & boost::serialization::make_array(mat.ptr(i), row_size);
}
}
}
template<class Archive>
void serialize(Archive &ar, cv::Point& pt, const unsigned int)
{
ar & boost::serialization::make_nvp("x", pt.x);
ar & boost::serialization::make_nvp("y", pt.y);
}
template<class Archive>
void serialize(Archive &ar, ::pixels& pix, const unsigned int)
{
ar & boost::serialization::make_nvp("indexes", pix.indexes);
ar & boost::serialization::make_nvp("values", pix.values);
}
}}
// ----------------------------------------------------------------------------
template <typename OArchive, typename IArchive>
test_results test_bs_filestorage(std::string const& file_name
, pixels const& pix
, bool binary = false)
{
test_results results;
pixels pix_in;
high_resolution_clock::time_point t1 = high_resolution_clock::now();
{
std::ios::openmode mode(std::ios::out);
if (binary) mode |= std::ios::binary;
std::ofstream ofs(file_name.c_str(), mode);
OArchive oa(ofs);
oa & boost::serialization::make_nvp("pixels", pix);
}
high_resolution_clock::time_point t2 = high_resolution_clock::now();
{
std::ios::openmode mode(std::ios::in);
if (binary) mode |= std::ios::binary;
std::ifstream ifs(file_name.c_str(), mode);
IArchive ia(ifs);
ia & boost::serialization::make_nvp("pixels", pix_in);
}
high_resolution_clock::time_point t3 = high_resolution_clock::now();
results.matches = validate(pix, pix_in);
results.write_time_ms = static_cast<double>(duration_cast<microseconds>(t2 - t1).count()) / 1000;
results.read_time_ms = static_cast<double>(duration_cast<microseconds>(t3 - t2).count()) / 1000;
results.file_size = fs::file_size(file_name);
return results;
}
// ============================================================================
Raw Data to std::fstream
If we don't care about portability of the data files, we can just do the minimal amount of work to dump and restore the memory. With some effort (at the cost of speed) you could make this more flexible.
// ============================================================================
void save_pixels(pixels const& pix, std::ofstream& ofs)
{
size_t index_count(pix.indexes.size());
ofs.write(reinterpret_cast<char const*>(&index_count), sizeof(index_count));
// data() is valid even when the vector is empty; &pix.indexes[0] would not be.
ofs.write(reinterpret_cast<char const*>(pix.indexes.data()), sizeof(cv::Point) * index_count);
int cols(pix.values.cols), rows(pix.values.rows), type(pix.values.type());
bool continuous(pix.values.isContinuous());
ofs.write(reinterpret_cast<char const*>(&cols), sizeof(cols));
ofs.write(reinterpret_cast<char const*>(&rows), sizeof(rows));
ofs.write(reinterpret_cast<char const*>(&type), sizeof(type));
ofs.write(reinterpret_cast<char const*>(&continuous), sizeof(continuous));
if (continuous) {
size_t const data_size(rows * cols * pix.values.elemSize());
ofs.write(reinterpret_cast<char const*>(pix.values.ptr()), data_size);
} else {
size_t const row_size(cols * pix.values.elemSize());
for (int i(0); i < rows; ++i) {
ofs.write(reinterpret_cast<char const*>(pix.values.ptr(i)), row_size);
}
}
}
void load_pixels(pixels& pix, std::ifstream& ifs)
{
size_t index_count(0);
ifs.read(reinterpret_cast<char*>(&index_count), sizeof(index_count));
pix.indexes.resize(index_count);
// data() is valid even when index_count is 0; &pix.indexes[0] would not be.
ifs.read(reinterpret_cast<char*>(pix.indexes.data()), sizeof(cv::Point) * index_count);
int cols, rows, type;
bool continuous;
ifs.read(reinterpret_cast<char*>(&cols), sizeof(cols));
ifs.read(reinterpret_cast<char*>(&rows), sizeof(rows));
ifs.read(reinterpret_cast<char*>(&type), sizeof(type));
ifs.read(reinterpret_cast<char*>(&continuous), sizeof(continuous));
pix.values.create(rows, cols, type);
if (continuous) {
size_t const data_size(rows * cols * pix.values.elemSize());
ifs.read(reinterpret_cast<char*>(pix.values.ptr()), data_size);
} else {
size_t const row_size(cols * pix.values.elemSize());
for (int i(0); i < rows; ++i) {
ifs.read(reinterpret_cast<char*>(pix.values.ptr(i)), row_size);
}
}
}
// ----------------------------------------------------------------------------
test_results test_raw(std::string const& file_name, pixels const& pix)
{
test_results results;
pixels pix_in;
high_resolution_clock::time_point t1 = high_resolution_clock::now();
{
std::ofstream ofs(file_name.c_str(), std::ios::out | std::ios::binary);
save_pixels(pix, ofs);
}
high_resolution_clock::time_point t2 = high_resolution_clock::now();
{
std::ifstream ifs(file_name.c_str(), std::ios::in | std::ios::binary);
load_pixels(pix_in, ifs);
}
high_resolution_clock::time_point t3 = high_resolution_clock::now();
results.matches = validate(pix, pix_in);
results.write_time_ms = static_cast<double>(duration_cast<microseconds>(t2 - t1).count()) / 1000;
results.read_time_ms = static_cast<double>(duration_cast<microseconds>(t3 - t2).count()) / 1000;
results.file_size = fs::file_size(file_name);
return results;
}
// ============================================================================
Complete main()
Let's run all the tests for the various approaches and compare the results.
Code:
// ============================================================================
int main()
{
namespace ba = boost::archive;
pixels pix(generate_data());
auto r_c_xml = test_cv_filestorage("test.cv.xml", pix);
auto r_c_yaml = test_cv_filestorage("test.cv.yaml", pix);
auto r_c_json = test_cv_filestorage("test.cv.json", pix);
auto r_b_txt = test_bs_filestorage<ba::text_oarchive, ba::text_iarchive>("test.bs.txt", pix);
auto r_b_xml = test_bs_filestorage<ba::xml_oarchive, ba::xml_iarchive>("test.bs.xml", pix);
auto r_b_bin = test_bs_filestorage<ba::binary_oarchive, ba::binary_iarchive>("test.bs.bin", pix, true);
auto r_b_raw = test_raw("test.raw", pix);
// ----
dump_results("OpenCV - XML", r_c_xml);
dump_results("OpenCV - YAML", r_c_yaml);
dump_results("OpenCV - JSON", r_c_json);
dump_results("Boost - TXT", r_b_txt);
dump_results("Boost - XML", r_b_xml);
dump_results("Boost - Binary", r_b_bin);
dump_results("Raw", r_b_raw);
return 0;
}
// ============================================================================
Console output (i7-4930k, Win10, MSVC 2013)
NB: We're testing this with 10000 indexes and values being a 1024x1024 BGR image.
OpenCV - XML
Matched = true
Write time = 257.563 ms
Read time = 257.016 ms
File size = 12323677 bytes
OpenCV - YAML
Matched = true
Write time = 135.498 ms
Read time = 311.999 ms
File size = 16353873 bytes
OpenCV - JSON
Matched = true
Write time = 137.003 ms
Read time = 312.528 ms
File size = 16353873 bytes
Boost - TXT
Matched = true
Write time = 1293.84 ms
Read time = 1210.94 ms
File size = 11333696 bytes
Boost - XML
Matched = true
Write time = 4890.82 ms
Read time = 4042.75 ms
File size = 62095856 bytes
Boost - Binary
Matched = true
Write time = 12.498 ms
Read time = 4 ms
File size = 3225813 bytes
Raw
Matched = true
Write time = 8.503 ms
Read time = 2.999 ms
File size = 3225749 bytes
Conclusion
Looking at the results, the textual Boost.Serialization formats are abhorrently slow -- I see what you meant. Saving values separately would definitely bring significant benefit here. The binary approach is quite good if portability is not an issue. You could still fix that at a reasonable cost.
OpenCV performs much better: XML is balanced on reads and writes, while YAML and JSON (apparently identical) are faster on writes but slower on reads. It is still rather sluggish, so writing values as an image and saving only the filename might still be of benefit.
The raw approach is the fastest (no surprise), but also inflexible. You could make some improvements, of course, but it seems to need a lot more code than using a binary Boost.Archive -- not really worth it here. Still, if you're doing everything on the same machine, this may do the job.
Personally I'd go for the binary Boost approach, and tweak it if you need cross-platform capability.
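As a rough sketch of what "making the raw format more flexible" could look like: the raw approach above writes size_t, int and bool exactly as they sit in memory, so the file depends on the machine's integer sizes and byte order. One option is to write every header field as a fixed-width little-endian value instead. The helper names below (write_u32_le, read_u32_le, round_trip_ok) are hypothetical and not part of the benchmark; they only illustrate the idea for a single 32-bit field.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: write a 32-bit unsigned value as four little-endian
// bytes, independent of the host's byte order and of sizeof(int)/sizeof(size_t).
void write_u32_le(std::ostream& os, std::uint32_t value)
{
    char bytes[4];
    for (int i = 0; i < 4; ++i) {
        bytes[i] = static_cast<char>((value >> (8 * i)) & 0xFFu);
    }
    os.write(bytes, sizeof(bytes));
}

// Hypothetical helper: read four little-endian bytes back into a 32-bit value.
// Returns false if the stream could not supply four bytes.
bool read_u32_le(std::istream& is, std::uint32_t& value)
{
    char bytes[4];
    if (!is.read(bytes, sizeof(bytes))) {
        return false;
    }
    value = 0;
    for (int i = 0; i < 4; ++i) {
        value |= static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[i])) << (8 * i);
    }
    return true;
}

// Quick self-check: encode into an in-memory stream and decode again.
bool round_trip_ok(std::uint32_t value)
{
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    write_u32_le(buffer, value);
    std::uint32_t decoded = 0;
    return read_u32_le(buffer, decoded) && decoded == value;
}
```

In save_pixels and load_pixels, index_count, cols, rows and type could each go through helpers like these (after checking they fit in 32 bits). The pixel data for an 8-bit image is already byte-oriented, so it can still be written in one block.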

Should I be attempting to return an array, or is there a better solution?

A problem set for people learning C++ is
Write a short program to simulate a ball being dropped off of a tower. To start, the user should be asked for the initial height of the tower in meters. Assume normal gravity (9.8 m/s^2), and that the ball has no initial velocity. Have the program output the height of the ball above the ground after 0, 1, 2, 3, 4, and 5 seconds. The ball should not go underneath the ground (height 0).
Before starting C++ I had a reasonable, but primarily self taught, knowledge of Java. So looking at the problem it seems like it ought to be split into
input class
output class
calculations class
Physical constants class (recommended by the question setter)
controller ('main') class
The input class would ask the user for a starting height, which would be passed to the controller. The controller would give this and a number of seconds (5) to the calculations class, which would create an array of results and return this to the controller. The controller would hand the array of results to the output class that would print them to the console.
I will put the actual code at the bottom, but it's possibly not needed.
You can probably already see the problem: attempting to return an array. I'm not asking how to get round that problem; there is a workaround here and here. I'm asking whether the problem is a result of bad design. Should my program be structured differently, for performance, maintenance or style reasons, such that I would not be attempting to return an array-like object?
Here is the code (which works apart from trying to return arrays);
main.cpp
/*
* Just the main class, call other classes and passes variables around
*/
#include <iostream>
#include "dropSim.h"
using namespace std;
int main()
{
double height = getHeight();
int seconds = 5;
double* results = calculateResults(height, seconds);
outputResults(results);
return 0;
}
getHeight.cpp
/*
* Asks the user for a height from which to start the experiment
* SI units
*/
#include <iostream>
using namespace std;
double getHeight()
{
cout << "What height should the experiment start at; ";
double height;
cin >> height;
return height;
}
calculateResults.cpp
/*
* given the initial height and the physical constants, the position of the ball
* is calculated at integer number seconds, beginning at 0
*/
#include "constants.h"
#include <cmath>
#include <iostream>
using namespace std;
double getPosition(double height, double time);
double* calculateResults(double height, int seconds)
{
double positions[seconds + 1];
for(int t = 0; t < seconds + 1; t++)
{
positions[t] = getPosition(height, t);
}
return positions;
}
double getPosition(double height, double time)
{
double position = height - 0.5*constants::gravity*pow(static_cast<double>(time), 2);
if( position < 0) position = 0;
//Commented code is for testing
//cout << position << endl;
return position;
}
outputResults.cpp
/*
* Takes the array of results and prints them in an appropriate format
*/
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
void outputResults(double* results){
string outputText = "";
//The commented code is to test the output method
//Which is working
//double results1[] = {1,2,3,4,5};
//int numResults = sizeof(results1)/sizeof(results1[0]);
int numResults = sizeof(results)/sizeof(results[0]);
//cout << numResults; //= 0 ... Oh
for(int t = 0; t < numResults; t++)
{
ostringstream line;
line << "After " << t << " seconds the height of the object is " << results[t] << "\r";
outputText.append(line.str());
}
cout << outputText;
}
And finally a couple of headers;
dropSim.h
/*
* dropSim.h
*/
#ifndef DROPSIM_H_
#define DROPSIM_H_
double getHeight();
double* calculateResults(double height, int seconds);
void outputResults(double* results);
#endif /* DROPSIM_H_ */
constants.h
/*
* Contains physical constants relevant to simulation.
* SI units
*/
#ifndef CONSTANTS_H_
#define CONSTANTS_H_
namespace constants
{
const double gravity(9.81);
}
#endif /* CONSTANTS_H_ */
I would say that you're over-engineering a big solution to a little problem, but to answer your specific question:
Should my program be structured differently, for performance, maintenance or style reasons, such that I would not be attempting to return an array like object?
Returning an array-like object is fine. But that doesn't mean returning an array, nor does it mean allocating raw memory with new.
And it's not restricted to return values either. When you're starting out with C++, it's probably best to just forget that it has built-in arrays at all. Most of the time, you should be using either std::vector or std::array (or another linear collection such as std::deque).
Built-in arrays should normally be viewed as a special-purpose item, included primarily for compatibility with C, not for everyday use.
It may, however, be worth considering writing your computation in the same style as the algorithms in the standard library. This would mean writing the code to receive an iterator to a destination, and writing its output to wherever that iterator designates.
I'd probably package the height and time together as a set of input parameters, and have a function that generates output based on those:
struct params {
double height;
int seconds;
};
template <class OutIt>
void calc_pos(params const &p, OutIt output) {
for (int i=0; i<=p.seconds; i++) { // inclusive, so t = 0 .. seconds as in the question
*output = get_position(p.height, i);
++output;
}
}
This works somewhat more clearly along with the rest of the standard library:
std::vector<double> results;
calc_pos(inputs, std::back_inserter(results));
You can go a few steps further if you like--the standard library has quite a bit to help with a great deal of this. Your calc_pos does little more than invoke another function repeatedly with successive values for the time. You could (for example) use std::iota to generate the successive times, then use std::transform to generate outputs:
std::vector<int> times(6);
std::iota(times.begin(), times.end(), 0);
std::vector<double> distances;
std::transform(times.begin(), times.end(), std::back_inserter(distances), compute_distance);
This computes the distances as the distance dropped after a given period of time rather than the height above the ground, but given an initial height, computing the difference between the two is quite trivial:
double initial_height = 5;
std::vector<double> heights;
std::transform(distances.begin(), distances.end(),
std::back_inserter(heights),
[=](double v) { return std::max(initial_height-v, 0.0); });
At least for now, this doesn't attempt to calculate the ball bouncing when it hits the ground--it just assumes the ball immediately stops when it hits the ground.
You should get rid of the self-allocated double* and use std::vector<double> instead. It's not difficult to learn, and it's a basic step in modern C++.
This is how I would solve the problem:
#include <algorithm> // std::max
#include <cmath>
#include <iostream>
#include <iomanip>
using std::cin;
using std::cout;
using std::endl;
using std::sqrt;
using std::fixed;
using std::setprecision;
using std::max;
using std::setw;
static const double g = 9.81;
class Calculator {
public:
Calculator(double inh) : h(inh)
{
}
void DoWork() const {
double tmax = sqrt(h / ( g / 2));
for (double t=0.0; t<tmax; t+=1.0) {
GenerateOutput(t);
}
GenerateOutput(tmax);
}
private:
void GenerateOutput(double t) const {
double x = g * t * t / 2;
double hremaining = max(h - x, 0.0);
cout << fixed << setprecision(2) << setw(10) << t;
cout << setw(10) << hremaining << endl;
}
double h;
};
int main() {
double h(0.0);
cout << "Enter height in meters: ";
cin >> h;
if (h > 0.0) {
const Calculator calc(h);
calc.DoWork();
} else {
return 1;
}
return 0;
}