Speeding up large file write to disk in C++

I need to write a large csv file to disk. I've reduced the problem to the code below. When I compile it with VS 2017 and run it on my Windows 7 box, it takes 26 seconds on average. Could someone suggest a way to speed this up without changing the data container or output format? Any help will be appreciated.
PS: Probably obvious, but the speedup should be measured relative to the base case on your own hardware.
I tried using fopen and fprintf but got worse results. I also played around with setting the buffer size with no success.
#include <iostream>
#include <iomanip>
#include <fstream>
#include <chrono>
#include <vector>
#include <string>
typedef std::chrono::high_resolution_clock Clock;
typedef std::vector<double> VecD;
typedef std::vector<VecD> VecVecD;
void test_file_write_stream() {
VecVecD v(10000, VecD(2000, 1.23456789));
const std::string delimiter(",");
const std::string file_path("c:\\junk\\speedtest.csv");
auto t1_stream = Clock::now();
std::ofstream ostream(file_path.c_str());
if (!ostream.good())
return;
ostream << std::setprecision(12);
for (const auto & row : v) {
for (const auto & col : row) {
ostream << col << delimiter;
}
ostream << std::endl;
}
auto t2_stream = Clock::now();
std::cout << "Stream test: " << std::chrono::duration_cast<std::chrono::microseconds>(t2_stream - t1_stream).count() / 1.0e6 << " seconds" << std::endl;
}
int main(int argc, char * argv[]) {
test_file_write_stream();
}
Stream test: 26.2086 seconds
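For reference, the buffer-size experiment mentioned above is usually done with pubsetbuf before the file is opened. This is only a sketch of what such an attempt might look like (not the exact code that was tried); the 1 MiB size is arbitrary, and whether a larger buffer helps at all is implementation-dependent:
#include <fstream>
#include <string>
#include <vector>
// Sketch: pubsetbuf generally only takes effect if called before open().
void open_with_big_buffer(const std::string& file_path, std::ofstream& ostream,
                          std::vector<char>& big_buffer) {
    big_buffer.resize(1 << 20);  // arbitrary 1 MiB buffer
    ostream.rdbuf()->pubsetbuf(big_buffer.data(),
                               static_cast<std::streamsize>(big_buffer.size()));
    ostream.open(file_path.c_str());
}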

What you want to use is memory-mapped files. According to Wikipedia:
The benefit of memory mapping a file is increasing I/O performance,
especially when used on large files.
Why? Because the data doesn't need to be copied around an extra time, and you should start seeing improvements in the ballpark of 50%-100%, or maybe more.
Boost has a very neat interface for this in Boost.Interprocess. I don't have a test bench for it at the moment, but something along the lines of:
boost::interprocess::file_mapping fm(filename, ...);
boost::interprocess::mapped_region region(fm, ...);
//mapped_region is a memory mapped file
Otherwise, you can of course use the interface for your platform:
https://learn.microsoft.com/en-us/dotnet/standard/io/memory-mapped-files
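As a rough, untested sketch of the Boost.Interprocess route (names and sizes here are placeholders, error handling omitted): pre-size the file, map it read-write, then format the output directly into the mapping.
#include <boost/interprocess/file_mapping.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <cstring>
#include <filesystem>
#include <fstream>
namespace bip = boost::interprocess;
void write_via_mapping(const char* filename, std::size_t size_needed) {
    { std::ofstream create(filename); }                    // make sure the file exists
    std::filesystem::resize_file(filename, size_needed);   // pre-size it (C++17)
    bip::file_mapping fm(filename, bip::read_write);
    bip::mapped_region region(fm, bip::read_write);        // maps the whole file
    char* dst = static_cast<char*>(region.get_address());
    std::memcpy(dst, "1.23456789,", 11);                   // format your rows here instead
    region.flush();                                        // push the dirty pages to disk
}
The main caveat for CSV output is that size_needed has to be known, or over-estimated, up front, since the mapping cannot grow on the fly.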

I tried your code and made some minor hip-shot optimizations, but got consistent results. The baseline was ~14 s for me, and it stayed close to that the whole time. Good work, -O3.
I replaced your std::vector<std::vector<double>> with a one-dimensional version and std::copy'd it to disk, with only a minor improvement.
Only when I gave up and dumped the memory as-is (not as a .csv) to disk did I get down to 1-3 seconds.
The big drawback is that what I dumped isn't portable in any way. It can hopefully be read back on the same computer on a good day, but I wouldn't recommend dumping doubles this way for real:
#include <chrono>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
typedef std::chrono::high_resolution_clock Clock;
// custom 2D array with a 1D memory layout
template<typename T>
class array2d {
public:
array2d(size_t h, size_t w, const T& value = T{}) : data_(h * w, value), w_(w) {}
inline T* operator[](size_t y) { return &data_[y * w_]; }
inline T const* operator[](size_t y) const { return &data_[y * w_]; }
inline size_t width() const { return w_; }
T const* data() const { return data_.data(); }
size_t size() const { return data_.size(); }
private:
std::vector<T> data_;
size_t w_;
};
using VecVecD = array2d<double>;
void test_file_write_stream() {
VecVecD v(10000, 2000, 1.23456789);
const std::string delimiter(",");
const std::string file_path("c:\\junk\\speedtest.csv");
auto t1_stream = Clock::now();
std::ofstream ostream(file_path.c_str(), std::ios::binary);
if(!ostream) return;
ostream << std::setprecision(12);
/* this may give a somewhat better performance than yours, but not much:
std::copy(v.data(), v.data() + v.size(),
std::ostream_iterator<double>(ostream, delimiter));
*/
// non-portable binary dump
ostream.write(reinterpret_cast<const char*>(v.data()),
static_cast<std::streamsize>(v.size() * sizeof(double)));
auto t2_stream = Clock::now();
auto elapsed_s =
std::chrono::duration_cast<std::chrono::seconds>(t2_stream - t1_stream);
std::cout << "Stream test: " << elapsed_s.count() << " seconds" << std::endl;
}
int main(int, char**) {
test_file_write_stream();
}
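For completeness, reading such a dump back (same machine, same build, same dimensions) could look roughly like this; the function name is made up and error handling is minimal:
// Hypothetical read-back of the non-portable dump above.
bool read_dump_back(const std::string& file_path, array2d<double>& out) {
    std::ifstream istream(file_path, std::ios::binary);
    if (!istream) return false;
    const auto bytes = static_cast<std::streamsize>(out.size() * sizeof(double));
    istream.read(reinterpret_cast<char*>(out[0]), bytes);  // out[0] points at the 1D storage
    return istream.gcount() == bytes;
}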

Related

How to write and call std::hash? - for gmp's mpz_class and mpz_t

I think most of the work is done here, just a little detail is missing at the end. Read on.
I'm trying to write the glue code for using MurmurHash3 to hash big integers (mpz_t and mpz_class) of the GMP library in C++. I do this in order to later use them in a std::unordered_map<mpz_class, int>.
I want the code to compile in a useful way for 32-bit and 64-bit systems and to be easily extensible when 128-bit systems are required. Therefore I've written the MurmurHash3_size_t() function, which calls the right hash function of MurmurHash3 and then converts the result to size_t. I assume that size_t is of the correct bit size in terms of 32/64/128-bit systems. (I don't know if this assumption is useful.) This part of the code compiles nicely.
The problem arises when I want to define the std::hash function. I get a compiler error for my code (see comment in code). How to write these std::hash functions correctly and how to call them?
(MurmurHash3.h omitted for brevity.)
File hash_mpz.cpp:
#include "hash_mpz.h"
#include <gmpxx.h>
#include "MurmurHash3.h"
size_t MurmurHash3_size_t(const void *key, int len, uint32_t seed) {
#if SIZE_MAX==0xffffffff
size_t result;
MurmurHash3_x86_32(key, len, seed, &result);
return result;
#elif SIZE_MAX==0xffffffffffffffff
size_t result[2];
MurmurHash3_x64_128(key, len, seed, &result);
return result[0] ^ result[1];
#else
#error cannot determine correct version of MurmurHash3, because SIZE_MAX is neither 0xffffffff nor 0xffffffffffffffff
#endif
}
namespace std {
size_t hash<mpz_t>::operator()(const mpz_t &x) const {
// found 1846872219 by randomly hitting digits on my keyboard
return MurmurHash3_size_t(x->_mp_d, x->_mp_size * sizeof(mp_limb_t), 1846872219);
}
size_t hash<mpz_class>::operator()(const mpz_class &x) const {
// compiler error in next statement
// error: no matching function for call to ‘std::hash<__mpz_struct [1]>::operator()(mpz_srcptr)’
return hash<mpz_t>::operator()(x.get_mpz_t());
}
}
Found a solution which works for me:
namespace std {
size_t hash<mpz_srcptr>::operator()(const mpz_srcptr x) const {
// found 1846872219 by randomly typing digits on my keyboard
return MurmurHash3_size_t(x->_mp_d, x->_mp_size * sizeof(mp_limb_t),
1846872219);
}
size_t hash<mpz_t>::operator()(const mpz_t &x) const {
return hash<mpz_srcptr> { }((mpz_srcptr) x);
}
size_t hash<mpz_class>::operator()(const mpz_class &x) const {
return hash<mpz_srcptr> { }(x.get_mpz_t());
}
}
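For completeness, the matching declarations in hash_mpz.h (which the question doesn't show) would presumably look roughly like this sketch:
#ifndef HASH_MPZ_H
#define HASH_MPZ_H
#include <gmpxx.h>
#include <cstddef>
#include <cstdint>
#include <functional>
size_t MurmurHash3_size_t(const void *key, int len, uint32_t seed);
namespace std {
template<> struct hash<mpz_srcptr> {
    size_t operator()(const mpz_srcptr x) const;
};
template<> struct hash<mpz_t> {
    size_t operator()(const mpz_t &x) const;
};
template<> struct hash<mpz_class> {
    size_t operator()(const mpz_class &x) const;
};
}
#endif // HASH_MPZ_H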
Then you can use the hash function as follows:
#include <iostream>
#include <gmpxx.h>
#include <unordered_map>
#include "hash_mpz.h"
using namespace std;
int main() {
mpz_class a;
mpz_ui_pow_ui(a.get_mpz_t(), 168, 16);
cout << "a : " << a << endl;
cout << "hash(a): " << (hash<mpz_class> { }(a)) << endl;
unordered_map<mpz_class, int> map;
map[a] = 2;
cout << "map[a] : " << map[a] << endl;
return 0;
}
Output:
a : 402669288768856477614113920779288576
hash(a): 11740158581999522595
map[a] : 2
Comments are appreciated.

Store struct containing vector and cv::Mat to disk - Data serialization in C++

I'd like to store the structure below in a disk and be able to read it again: (C++)
struct pixels {
std::vector<cv::Point> indexes;
cv::Mat values;
};
I've tried to use ofstream and ifstream but they need the size of the variable which I don't really know how to calculate in this situation. It's not a simple struct with some int and double. Is there any way to do it in C++, preferably without using any third-party libraries.
(I'm actually coming from the Matlab language. It was easy to do it in that language using save: save(filename, variables)).
Edit:
I've just tried Boost Serialization. Unfortunately it's very slow for my use.
Several approaches come to mind with various cons and pros.
Use OpenCV's XML/YAML persistence functionality.
XML format (portable)
YAML format (portable)
JSON format (portable)
Use Boost.Serialization
Plain text format (portable)
XML format (portable)
binary format (non-portable)
Raw data to std::fstream
binary format (non-portable)
By "portable" I mean that the data files written on an arbitrary platform+compiler can be read on any other platform+compiler. By "non-portable", I mean that's not necessarily the case. Endiannes matters, and compilers could possibly make a difference too. You could add additional handling for such situations at the cost of performance. In this answer, I'll assume you're reading and writing on the same machine.
First here are includes, common data structures and utility functions we will use:
#include <opencv2/opencv.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/archive/xml_iarchive.hpp>
#include <boost/filesystem.hpp>
#include <boost/serialization/vector.hpp>
#include <chrono>
#include <fstream>
#include <vector>
// ============================================================================
using std::chrono::high_resolution_clock;
using std::chrono::duration_cast;
using std::chrono::microseconds;
namespace ba = boost::archive;
namespace bs = boost::serialization;
namespace fs = boost::filesystem;
// ============================================================================
struct pixels
{
std::vector<cv::Point> indexes;
cv::Mat values;
};
struct test_results
{
bool matches;
double write_time_ms;
double read_time_ms;
size_t file_size;
};
// ----------------------------------------------------------------------------
bool validate(pixels const& pix_out, pixels const& pix_in)
{
bool result(true);
result &= (pix_out.indexes == pix_in.indexes);
result &= (cv::countNonZero(pix_out.values != pix_in.values) == 0);
return result;
}
pixels generate_data()
{
pixels pix;
for (int i(0); i < 10000; ++i) {
pix.indexes.emplace_back(i, 2 * i);
}
pix.values = cv::Mat(1024, 1024, CV_8UC3);
cv::randu(pix.values, 0, 256);
return pix;
}
void dump_results(std::string const& label, test_results const& results)
{
std::cout << label << "\n";
std::cout << "Matched = " << (results.matches ? "true" : "false") << "\n";
std::cout << "Write time = " << results.write_time_ms << " ms\n";
std::cout << "Read time = " << results.read_time_ms << " ms\n";
std::cout << "File size = " << results.file_size << " bytes\n";
std::cout << "\n";
}
// ============================================================================
Using OpenCV FileStorage
The first obvious choice is to use the serialization functionality OpenCV provides -- cv::FileStorage, cv::FileNode and cv::FileNodeIterator. There's a nice tutorial in the 2.4.x documentation, which I can't seem to find right now in the new docs.
The advantage here is that we already have support for cv::Mat and cv::Point, so there's very little to implement.
However, all the formats provided are textual, so there will be a fairly large cost in reading and writing the values (especially for the cv::Mat). It may be advantageous to save/load the cv::Mat using cv::imread/cv::imwrite and serialize the filename. I'll leave this to the reader to implement and benchmark.
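A very rough, untested sketch of that variant (the key name values_file and the PNG choice are mine; PNG keeps 8-bit data lossless, and benchmarking is still left to the reader):
void save_values_as_image(pixels const& pix, cv::FileStorage& fs,
                          std::string const& values_path)
{
    cv::imwrite(values_path, pix.values);   // store the Mat as an image file
    fs << "values_file" << values_path;     // only the filename goes into the archive
}
void load_values_from_image(pixels& pix, cv::FileStorage& fs)
{
    std::string values_path;
    fs["values_file"] >> values_path;
    pix.values = cv::imread(values_path, cv::IMREAD_UNCHANGED);
}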
// ============================================================================
void save_pixels(pixels const& pix, cv::FileStorage& fs)
{
fs << "indexes" << "[";
for (auto const& index : pix.indexes) {
fs << index;
}
fs << "]";
fs << "values" << pix.values;
}
void load_pixels(pixels& pix, cv::FileStorage& fs)
{
cv::FileNode n(fs["indexes"]);
if (n.type() != cv::FileNode::SEQ) {
throw std::runtime_error("Input format error: `indexes` is not a sequence.");;
}
pix.indexes.clear();
cv::FileNodeIterator it(n.begin()), it_end(n.end());
cv::Point pt;
for (; it != it_end; ++it) {
(*it) >> pt;
pix.indexes.push_back(pt);
}
fs["values"] >> pix.values;
}
// ----------------------------------------------------------------------------
test_results test_cv_filestorage(std::string const& file_name, pixels const& pix)
{
test_results results;
pixels pix_in;
high_resolution_clock::time_point t1 = high_resolution_clock::now();
{
cv::FileStorage fs(file_name, cv::FileStorage::WRITE);
save_pixels(pix, fs);
}
high_resolution_clock::time_point t2 = high_resolution_clock::now();
{
cv::FileStorage fs(file_name, cv::FileStorage::READ);
load_pixels(pix_in, fs);
}
high_resolution_clock::time_point t3 = high_resolution_clock::now();
results.matches = validate(pix, pix_in);
results.write_time_ms = static_cast<double>(duration_cast<microseconds>(t2 - t1).count()) / 1000;
results.read_time_ms = static_cast<double>(duration_cast<microseconds>(t3 - t2).count()) / 1000;
results.file_size = fs::file_size(file_name);
return results;
}
// ============================================================================
Using Boost Serialization
Another potential approach is to use the Boost.Serialization library, which you mention you have already tried. We have three options for the archive format: two are textual (and portable), and one is binary (non-portable, but much more efficient).
There's more work to do here. We need to provide good serialization for cv::Mat, cv::Point and our pixels structure. Support for std::vector is provided, and to handle XML, we need to generate key-value pairs.
In the case of the two textual formats, it may again be advantageous to save the cv::Mat as an image and only serialize the path. The reader is free to try this approach. For the binary format it would most likely be a tradeoff between space and time. Again, feel free to test this (you could even use cv::imencode and imdecode, as sketched below).
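As a sketch of the cv::imencode idea (my own, untested): compress the Mat into an in-memory PNG buffer and let Boost serialize the resulting byte vector, which boost/serialization/vector.hpp already handles.
template<class Archive>
void serialize_mat_as_png(Archive& ar, cv::Mat& mat, const unsigned int)
{
    std::vector<uchar> png;
    if (Archive::is_saving::value)
        cv::imencode(".png", mat, png);                // lossless for 8-bit data
    ar & boost::serialization::make_nvp("png", png);
    if (Archive::is_loading::value)
        mat = cv::imdecode(png, cv::IMREAD_UNCHANGED);
}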
// ============================================================================
namespace boost { namespace serialization {
template<class Archive>
void serialize(Archive &ar, cv::Mat& mat, const unsigned int)
{
int cols, rows, type;
bool continuous;
if (Archive::is_saving::value) {
cols = mat.cols; rows = mat.rows; type = mat.type();
continuous = mat.isContinuous();
}
ar & boost::serialization::make_nvp("cols", cols);
ar & boost::serialization::make_nvp("rows", rows);
ar & boost::serialization::make_nvp("type", type);
ar & boost::serialization::make_nvp("continuous", continuous);
if (Archive::is_loading::value)
mat.create(rows, cols, type);
if (continuous) {
size_t const data_size(rows * cols * mat.elemSize());
ar & boost::serialization::make_array(mat.ptr(), data_size);
} else {
size_t const row_size(cols * mat.elemSize());
for (int i = 0; i < rows; i++) {
ar & boost::serialization::make_array(mat.ptr(i), row_size);
}
}
}
template<class Archive>
void serialize(Archive &ar, cv::Point& pt, const unsigned int)
{
ar & boost::serialization::make_nvp("x", pt.x);
ar & boost::serialization::make_nvp("y", pt.y);
}
template<class Archive>
void serialize(Archive &ar, ::pixels& pix, const unsigned int)
{
ar & boost::serialization::make_nvp("indexes", pix.indexes);
ar & boost::serialization::make_nvp("values", pix.values);
}
}}
// ----------------------------------------------------------------------------
template <typename OArchive, typename IArchive>
test_results test_bs_filestorage(std::string const& file_name
, pixels const& pix
, bool binary = false)
{
test_results results;
pixels pix_in;
high_resolution_clock::time_point t1 = high_resolution_clock::now();
{
std::ios::openmode mode(std::ios::out);
if (binary) mode |= std::ios::binary;
std::ofstream ofs(file_name.c_str(), mode);
OArchive oa(ofs);
oa & boost::serialization::make_nvp("pixels", pix);
}
high_resolution_clock::time_point t2 = high_resolution_clock::now();
{
std::ios::openmode mode(std::ios::in);
if (binary) mode |= std::ios::binary;
std::ifstream ifs(file_name.c_str(), mode);
IArchive ia(ifs);
ia & boost::serialization::make_nvp("pixels", pix_in);
}
high_resolution_clock::time_point t3 = high_resolution_clock::now();
results.matches = validate(pix, pix_in);
results.write_time_ms = static_cast<double>(duration_cast<microseconds>(t2 - t1).count()) / 1000;
results.read_time_ms = static_cast<double>(duration_cast<microseconds>(t3 - t2).count()) / 1000;
results.file_size = fs::file_size(file_name);
return results;
}
// ============================================================================
Raw Data to std::fstream
If we don't care about portability of the data files, we can just do the minimal amount of work to dump and restore the memory. With some effort (at the cost of speed) you could make this more flexible.
// ============================================================================
void save_pixels(pixels const& pix, std::ofstream& ofs)
{
size_t index_count(pix.indexes.size());
ofs.write(reinterpret_cast<char const*>(&index_count), sizeof(index_count));
ofs.write(reinterpret_cast<char const*>(&pix.indexes[0]), sizeof(cv::Point) * index_count);
int cols(pix.values.cols), rows(pix.values.rows), type(pix.values.type());
bool continuous(pix.values.isContinuous());
ofs.write(reinterpret_cast<char const*>(&cols), sizeof(cols));
ofs.write(reinterpret_cast<char const*>(&rows), sizeof(rows));
ofs.write(reinterpret_cast<char const*>(&type), sizeof(type));
ofs.write(reinterpret_cast<char const*>(&continuous), sizeof(continuous));
if (continuous) {
size_t const data_size(rows * cols * pix.values.elemSize());
ofs.write(reinterpret_cast<char const*>(pix.values.ptr()), data_size);
} else {
size_t const row_size(cols * pix.values.elemSize());
for (int i(0); i < rows; ++i) {
ofs.write(reinterpret_cast<char const*>(pix.values.ptr(i)), row_size);
}
}
}
void load_pixels(pixels& pix, std::ifstream& ifs)
{
size_t index_count(0);
ifs.read(reinterpret_cast<char*>(&index_count), sizeof(index_count));
pix.indexes.resize(index_count);
ifs.read(reinterpret_cast<char*>(&pix.indexes[0]), sizeof(cv::Point) * index_count);
int cols, rows, type;
bool continuous;
ifs.read(reinterpret_cast<char*>(&cols), sizeof(cols));
ifs.read(reinterpret_cast<char*>(&rows), sizeof(rows));
ifs.read(reinterpret_cast<char*>(&type), sizeof(type));
ifs.read(reinterpret_cast<char*>(&continuous), sizeof(continuous));
pix.values.create(rows, cols, type);
if (continuous) {
size_t const data_size(rows * cols * pix.values.elemSize());
ifs.read(reinterpret_cast<char*>(pix.values.ptr()), data_size);
} else {
size_t const row_size(cols * pix.values.elemSize());
for (int i(0); i < rows; ++i) {
ifs.read(reinterpret_cast<char*>(pix.values.ptr(i)), row_size);
}
}
}
// ----------------------------------------------------------------------------
test_results test_raw(std::string const& file_name, pixels const& pix)
{
test_results results;
pixels pix_in;
high_resolution_clock::time_point t1 = high_resolution_clock::now();
{
std::ofstream ofs(file_name.c_str(), std::ios::out | std::ios::binary);
save_pixels(pix, ofs);
}
high_resolution_clock::time_point t2 = high_resolution_clock::now();
{
std::ifstream ifs(file_name.c_str(), std::ios::in | std::ios::binary);
load_pixels(pix_in, ifs);
}
high_resolution_clock::time_point t3 = high_resolution_clock::now();
results.matches = validate(pix, pix_in);
results.write_time_ms = static_cast<double>(duration_cast<microseconds>(t2 - t1).count()) / 1000;
results.read_time_ms = static_cast<double>(duration_cast<microseconds>(t3 - t2).count()) / 1000;
results.file_size = fs::file_size(file_name);
return results;
}
// ============================================================================
Complete main()
Let's run all the tests for the various approaches and compare the results.
Code:
// ============================================================================
int main()
{
namespace ba = boost::archive;
pixels pix(generate_data());
auto r_c_xml = test_cv_filestorage("test.cv.xml", pix);
auto r_c_yaml = test_cv_filestorage("test.cv.yaml", pix);
auto r_c_json = test_cv_filestorage("test.cv.json", pix);
auto r_b_txt = test_bs_filestorage<ba::text_oarchive, ba::text_iarchive>("test.bs.txt", pix);
auto r_b_xml = test_bs_filestorage<ba::xml_oarchive, ba::xml_iarchive>("test.bs.xml", pix);
auto r_b_bin = test_bs_filestorage<ba::binary_oarchive, ba::binary_iarchive>("test.bs.bin", pix, true);
auto r_b_raw = test_raw("test.raw", pix);
// ----
dump_results("OpenCV - XML", r_c_xml);
dump_results("OpenCV - YAML", r_c_yaml);
dump_results("OpenCV - JSON", r_c_json);
dump_results("Boost - TXT", r_b_txt);
dump_results("Boost - XML", r_b_xml);
dump_results("Boost - Binary", r_b_bin);
dump_results("Raw", r_b_raw);
return 0;
}
// ============================================================================
Console output (i7-4930k, Win10, MSVC 2013)
NB: We're testing this with 10000 indexes and values being a 1024x1024 BGR image.
OpenCV - XML
Matched = true
Write time = 257.563 ms
Read time = 257.016 ms
File size = 12323677 bytes
OpenCV - YAML
Matched = true
Write time = 135.498 ms
Read time = 311.999 ms
File size = 16353873 bytes
OpenCV - JSON
Matched = true
Write time = 137.003 ms
Read time = 312.528 ms
File size = 16353873 bytes
Boost - TXT
Matched = true
Write time = 1293.84 ms
Read time = 1210.94 ms
File size = 11333696 bytes
Boost - XML
Matched = true
Write time = 4890.82 ms
Read time = 4042.75 ms
File size = 62095856 bytes
Boost - Binary
Matched = true
Write time = 12.498 ms
Read time = 4 ms
File size = 3225813 bytes
Raw
Matched = true
Write time = 8.503 ms
Read time = 2.999 ms
File size = 3225749 bytes
Conclusion
Looking at the results, the textual Boost.Serialization formats are abhorrently slow -- I see what you meant. Saving the values separately would definitely bring a significant benefit here. The binary approach is quite good if portability is not an issue. You could still fix that at a reasonable cost.
OpenCV performs much better, XML being balanced on reads and writes, YAML/JSON (apparently identical) being faster on writes but slower on reads. Still rather sluggish, so writing the values as an image and saving the filename might still be of benefit.
The raw approach is the fastest (no surprise), but also inflexible. You could make some improvements, of course, but it seems to need a lot more code than using a binary Boost.Archive -- not really worth it here. Still, if you're doing everything on the same machine, this may do the job.
Personally I'd go for the binary Boost approach, and tweak it if you need cross-platform capability.

Vector push_back of one object results in a vector of enormous size

I'm working on a project in C++ where I have a vector of objects, and I want to push_back an object onto the existing vector. However, when checking the size before and after the object is added, the size goes from 0 to 12297829382473034412, which puzzles me greatly. The code in question is the addCommodity function below. (I have created a smaller example of the same problem further down, so skip to "SMALL PROBLEM".)
void Instance::addCommodity(std::vector<std::string> & tokens) {
/*if(tokens.size()!=5){
std::cerr << "Error in commodity data format"<< std::endl;
exit(-1);
}*/
// size_t so = std::atoi(tokens[1].c_str());
// size_t si = std::atoi(tokens[2].c_str());
// size_t demand = std::atoi(tokens[3].c_str());
// size_t ti = std::atoi(tokens[4].c_str());
std::cout << "size: " << this->_commodities->size() << "\n";
this->_commodities->push_back(Commodity(1,2,3,4)); // ???
std::cout << "size: " << this->_commodities->size() << "\n";
}
Here I have commented out the parts of the code which are used to read data from a string which was loaded from a file. Commodity is defined as follows:
#include "commodity.h"
Commodity::Commodity(size_t so, size_t si, size_t d, size_t ti):
_source(so),
_sink(si),
_demand(d),
_maxTime(ti)
{}
Commodity::~Commodity(){}
size_t Commodity::getSource() const{
return _source;
}
size_t Commodity::getSink() const {
return _sink;
}
size_t Commodity::getDemand() const {
return _demand;
}
size_t Commodity::getTime() const {
return _maxTime;
}
Where Instance is initialised as:
Instance::Instance(std::shared_ptr<Param> p, size_t n):
_params(p),
_nNodes(n)
{
this->_commodities.reset(new std::vector<Commodity>());
this->_arcs.reset(new std::vector<Arc>());
}
As mentioned before my issue lies in the addCommodity code, when trying to push_back a Commodity. Hopefully this is enough code to identify any stupid mistakes that I have made. I left out most of the other code for this project as it doesn't seem to have an impact on the addCommodity function.
The output received when calling the function is:
size: 0
size: 12297829382473034412
SMALL PROBLEM
Instead of showing all the code, I have run the push_back on the vector in main:
#include <iostream>
#include <memory>
#include <sys/time.h>
#include <vector>
#include "commodity.h"
int main(int argc, char* argv[]){
std::shared_ptr< std::vector<Commodity>> commodities;
commodities.reset(new std::vector<Commodity>());
std::cout << "size: " << commodities->size() << "\n";
size_t a = 1;
size_t b = 2;
size_t c = 3;
size_t d = 4;
commodities->emplace_back(Commodity(a,b,c,d));
std::cout << "size: " << commodities->size() << std::endl;
return 0;
}
This is basically a smaller instance of the same code. The commodity cpp and h files are as follows:
#include "commodity.h"
Commodity::Commodity(size_t so, size_t si, size_t d, size_t ti):
_source(so),
_sink(si),
_demand(d),
_maxTime(ti)
{}
Commodity::~Commodity(){}
size_t Commodity::getSource() const{
return _source;
}
size_t Commodity::getSink() const {
return _sink;
}
size_t Commodity::getDemand() const {
return _demand;
}
size_t Commodity::getTime() const {
return _maxTime;
}
The header file:
#ifndef CG_MCF_COMMODITY_H
#define CG_MCF_COMMODITY_H
#include <stdlib.h>
class Commodity {
public:
Commodity(size_t so, size_t si, size_t d, size_t t);
~Commodity();
size_t getSource() const;
size_t getSink() const;
size_t getDemand() const;
size_t getTime() const;
private:
size_t _source;
size_t _sink;
size_t _demand;
size_t _maxTime;
};
#endif /*CG_MCF_COMMODITY_H*/
The output received when calling the function is:
size: 0
size: 12297829382473034412
Your Commodity class violates the rule of 0/3/5.
Your code (inexplicably) does this:
commodities->emplace_back(Commodity(a,b,c,d));
This is really strange. Presumably, you're calling emplace_back to avoid having to construct a separate Commodity from the one in the vector. But you force that to happen by explicitly constructing a separate Commodity as the parameter to emplace_back.
That invokes Commodity's copy constructor to construct the Commodity in the vector as a copy of the one you explicitly created. Except Commodity doesn't have one. Most likely, the real Commodity class needs one, since it has a destructor.
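To illustrate the rule-of-three point, here is a sketch (not a claim that it explains the reported size by itself): once a class declares a destructor, declare or explicitly default its copy operations too. With only size_t members, defaulting everything is enough.
#include <cstddef>
class Commodity {
public:
    Commodity(std::size_t so, std::size_t si, std::size_t d, std::size_t t)
        : _source(so), _sink(si), _demand(d), _maxTime(t) {}
    ~Commodity() = default;                             // or omit it entirely (rule of zero)
    Commodity(const Commodity&) = default;              // copy constructor
    Commodity& operator=(const Commodity&) = default;   // copy assignment
private:
    std::size_t _source;
    std::size_t _sink;
    std::size_t _demand;
    std::size_t _maxTime;
};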

de-serialize ASCII to struct

I have come up with the following structure to declare various formats of messages that are to be received from the network:
#include <stdint.h>
#include <iostream>
#include <string.h>
template<int T>
struct uint
{
static uint<T> create(uint64_t value)
{
uint<T> r = {value};
return r;
}
uint(uint64_t value)
{
v = value;
}
uint()
{}
uint<T>& operator =(uint64_t value)
{
v = value;
return *this;
}
operator uint64_t() const
{
return (uint64_t)v;
}
unsigned long long v:T;
}__attribute__((packed));
example:
typedef uint<5> second_t;
suppose one of the message formats (which are auto-generated via some process) is like this:
struct seconds
{
char _type;
second_t _second;
} __attribute__((packed));
Now suppose I would like to populate an instance of the above message using a string:
int main()
{
seconds ii;
const char *i = "123456";
// memset, memcpy,sprintf... ??? what to use here?
std::cout << ii._type << " " << ii._second << std::endl;
}
Given a stream 123456, I expect the instance of the seconds (ii) structure to have char ii._type = '1' and integer ii._second = 23456. But I don't know how to do that. Do you have a clue how I can do that? And do you have any suggestions on how to improve the basic structure?
thanks
You have a number of easier and more reliable options available that require almost no work.
Check out Google Protocol Buffers (platform-independent message serialisation and deserialisation): https://developers.google.com/protocol-buffers/
Or boost::serialization (probably faster, but not platform-independent): http://www.boost.org/doc/libs/1_58_0/libs/serialization/doc/index.html
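To give an idea of the Protocol Buffers route, here is a sketch only: the Seconds message, its fields and the generated header name are hypothetical, and protoc generates the C++ class with its setters and wire-format methods.
// Assuming a hypothetical schema:  message Seconds { string type = 1; uint64 second = 2; }
#include "seconds.pb.h"   // header generated by protoc from the schema above
#include <string>
int main() {
    Seconds msg;
    msg.set_type("1");
    msg.set_second(23456);
    std::string wire;
    msg.SerializeToString(&wire);   // portable binary encoding
    Seconds back;
    back.ParseFromString(wire);     // readable on any platform/compiler
    return 0;
}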

Speed of associative array (map) in STL [closed]

I wrote a simple program to measure the speed of the STL map. The following code showed that it took 1.49 sec on my Core i7-2670QM PC (2.2 GHz, turbo 3.1 GHz). If I remove the Employees[buf] = i%1000; part in the loop, it only took 0.0132 sec. So the hashing part took 1.48 sec. Why is it that slow?
#include <string.h>
#include <iostream>
#include <map>
#include <utility>
#include <stdio.h>
#include <sys/time.h>
using namespace std;
extern "C" {
int get(map<string, int> e, char* s){
return e[s];
}
int set(map<string, int> e, char* s, int value) {
e[s] = value;
}
}
double getTS() {
struct timeval tv;
gettimeofday(&tv, NULL);
return tv.tv_sec + tv.tv_usec/1000000.0;
}
int main()
{
map<string, int> Employees;
char buf[10];
int i;
double ts = getTS();
for (i=0; i<1000000; i++) {
sprintf(buf, "%08d", i);
Employees[buf] = i%1000;
}
printf("took %f sec\n", getTS() - ts);
cout << Employees["00001234"] << endl;
return 0;
}
Here's a C++ version of your code. Note that you should obviously take the maps by reference when passing them in get/set.
UPDATE Taking things a bit further and seriously optimizing for the given test case:
Live On Coliru
#include <iostream>
#include <boost/container/flat_map.hpp>
#include <chrono>
using namespace std;
using Map = boost::container::flat_map<string, int>;
int get(Map &e, char *s) { return e[s]; }
int set(Map &e, char *s, int value) { return e[s] = value; }
using Clock = std::chrono::high_resolution_clock;
template <typename F, typename Reso = std::chrono::microseconds, typename... Args>
Reso measure(F&& f, Args&&... args) {
auto since = Clock::now();
std::forward<F>(f)(std::forward<Args>(args)...);
return chrono::duration_cast<Reso>(Clock::now() - since);
}
#include <boost/iterator/iterator_facade.hpp>
using Pair = std::pair<std::string, int>;
struct Gen : boost::iterators::iterator_facade<Gen, Pair, boost::iterators::single_pass_traversal_tag, Pair>
{
int i;
Gen(int i = 0) : i(i) {}
value_type dereference() const {
char buf[10];
std::sprintf(buf, "%08d", i);
return { buf, i%1000 };
}
bool equal(Gen const& o) const { return i==o.i; }
void increment() { ++i; }
};
int main() {
Map Employees;
const auto n = 1000000;
auto elapsed = measure([&] {
Employees.reserve(n);
Employees.insert<Gen>(boost::container::ordered_unique_range, {0}, {n});
});
std::cout << "took " << elapsed.count() / 1000000.0 << " sec\n";
cout << Employees["00001234"] << endl;
}
Prints
took 0.146575 sec
234
Old answer
This just used C++ where appropriate
Live On Coliru
#include <iostream>
#include <map>
#include <chrono>
#include <cstdio>
using namespace std;
int get(map<string, int>& e, char* s){
return e[s];
}
int set(map<string, int>& e, char* s, int value) {
return e[s] = value;
}
using Clock = std::chrono::high_resolution_clock;
template <typename Reso = std::chrono::microseconds>
Reso getElapsed(Clock::time_point const& since) {
return chrono::duration_cast<Reso>(Clock::now() - since);
}
int main()
{
map<string, int> Employees;
std::string buf(10, '\0');
auto ts = Clock::now();
for (int i=0; i<1000000; i++) {
buf.resize(std::sprintf(&buf[0], "%08d", i));
Employees[buf] = i%1000;
}
std::cout << "took " << getElapsed(ts).count()/1000000.0 << " sec\n";
cout << Employees["00001234"] << endl;
}
Prints:
took 0.470009 sec
234
The notion of "slow" depends, of course, on what you compare it to.
I ran your benchmark (using the standard chrono::high_resolution_clock instead of gettimeofday()) on MSVC 2013 in the release configuration on a Core i7-920 at 2.67 GHz and found very similar results (1.452 s).
In your code, you basically do 1 million of each of the following:
insertion into the map: Employees[buf]
update of the map entry (copying a new value into the existing element): = i%1000
So I tried to understand better where the time is spent:
first, the map needs to store the keys in order, which is typically implemented with a binary tree. So I tried to use an unordered_map, which uses a flatter hash table, and gave it a very large bucket count to avoid collisions and rehashing (a sketch of this variant appears after the code below). The result is then 1.198 s.
So roughly 20% of the time (here) is needed to make sorted access to the map data possible (i.e. you can iterate through your map in key order: do you need this?).
next, playing with the order of insertion can significantly influence the timing. As Thomas Matthews pointed out in the comments: for benchmarking purposes you should use a random order.
then, doing only an optimised insertion of the data (no search, no update) using emplace_hint() brings us to a time of 1.100 s.
So 75% of the time is needed to allocate and insert the data.
finally, elaborating on the previous test, if you add an additional search and update after emplace_hint(), then the time goes up slightly above the original time (1.468 s). This confirms that access to the map is only a fraction of the time and most of the execution time is needed for the insertion.
Here is the test for the point above:
chrono::high_resolution_clock::time_point ts = chrono::high_resolution_clock::now();
for (i = 0; i<1000000; i++) {
sprintf(buf, "%08d", i);
Employees.emplace_hint(Employees.end(), buf, 0);
Employees[buf] = i % 1000; // matters for 300
}
chrono::high_resolution_clock::time_point te = chrono::high_resolution_clock::now();
cout << "took " << chrono::duration_cast<chrono::milliseconds>(te - ts).count() << " millisecs\n";
Now, your benchmark does not depend only on the performance of the map: you do 1 million sprintf() calls to fill your buffer, and 1 million conversions to std::string. If you avoided the string handling (for example by keying the map on int instead), you'd notice that the whole test would take only 0.950 s instead of 1.450 s:
30% of your benchmark time is caused not by the map, but by the many strings you handle!
Of course, all this is much slower than a vector. But a vector doesn't sort its elements and cannot provide associative lookup.