Reading in from CSV to template vector - c++

I've been having difficulty all week trying to get one of my projects up and running. I'm required to read a 10,000-line CSV file from a meteorological database and output certain fields along with a few statistics (max, etc.).
I'm meant to design this using a self-made template vector and am not allowed to use the STL.
As I'm just learning and this has been a few weeks in the making, I think I've overcomplicated it for myself, and now I'm stuck not knowing how to progress.
The main issue is that I don't understand how to read each line into a struct, parsing out only the fields I need, and then store that data in the template vector.
Anyway, without further ado, here is my source code:
#include <iostream>
#include <fstream>
#include "Date.h"
#include "Time.h"
#include "Vector.h"
typedef struct {
Date d;
Time t;
float speed;
} WindLogType;
int main()
{
Vector<WindLogType> windlog;
std::string temp;
std::ifstream inputFile("MetData-31-3.csv");
int timeIndex, windSpeedIndex;
//18 Elements per line
//Need the elements at index 0 & 10
while(!inputFile.eof())
{
getline(inputFile, WindLogType.d,' ');
getline(inputFile, WindLogType.t,',');
for(int i = 0; i < 9; i++)
{
getline(inputFile, temp, ',');
}
getline(inputFile, WindLogType.speed);
windlog.push_back(WindLogType);
}
return 0;
}
Vector.h
#ifndef VECTOR_H
#define VECTOR_H
template <class elemType>
class Vector
{
public:
bool isEmpty() const;
bool isFull() const;
int getLength() const;
int getMaxSize() const;
void sort();
// T* WindLogType;
Vector(int nMaxSize = 64); //Default constructor, array size of 64.
Vector(const Vector&); //Copy constructor
~Vector(); //Destructor
void push_back(int);
int operator[](int);
int at(int i);
private:
int maxSize, length;
elemType* anArray;
void alloc_new();
};
template <class elemType>
bool Vector<elemType>::isEmpty() const
{
return (length == 0);
}
template <class elemType>
bool Vector<elemType>::isFull() const
{
return (length == maxSize);
}
template <class elemType>
int Vector<elemType>::getLength() const
{
return length;
}
template <class elemType>
int Vector<elemType>::getMaxSize() const
{
return maxSize;
}
//Constructor that takes the max size of vector
template <class elemType>
Vector<elemType>::Vector(int nMaxSize)
{
maxSize = nMaxSize;
length = 0;
anArray = new elemType[maxSize];
}
//Destructor
template <class elemType>
Vector<elemType>::~Vector()
{
delete[] anArray;
}
//Sort function
template <class elemType>
void Vector<elemType>::sort()
{
int i, j;
int min;
elemType temp;
for(i = 0; i < length; i++)
{
min = i;
for(j = i+1; j<length; ++j)
{
if(anArray[j] < anArray[min])
min = j;
}
temp = anArray[i];
anArray[i] = anArray[min];
anArray[min] = temp;
}
}
//Check if vector is full, if not add the item to the vector
template <class elemType>
void Vector<elemType>::push_back(int i)
{
if(length+1 > maxSize)
alloc_new();
anArray[length]=i;
length++;
}
template <class elemType>
int Vector<elemType>::operator[](int i)
{
return anArray[i];
}
//Return the vector at position 'i'
template <class elemType>
int Vector<elemType>::at(int i)
{
if(i < length)
return anArray[i];
throw 10;
}
//If the vector is about to get full, create a new temporary
//vector of double size and copy the contents across.
template <class elemType>
void Vector<elemType>::alloc_new()
{
maxSize = length*2;
int* tmp=new int[maxSize];
for(int i = 0; i < length; i++)
tmp[i]= anArray[i];
delete[] anArray;
anArray = tmp;
}
/**
//Copy Constructor, takes a reference to a vector and copies
//the values across to a new vector.
Vector::Vector(const Vector& v)
{
maxSize= v.maxSize;
length = v.length;
anArray = new int[maxSize];
for(int i=0; i<v.length; i++)
{
anArray[i] = v.anArray[i];
}
}**/
#endif
There are some things in the vector class that are completely unnecessary, they were just from a bit of practice.
Here is a sample of the CSV file:
WAST,DP,Dta,Dts,EV,QFE,QFF,QNH,RF,RH,S,SR,ST1,ST2,ST3,ST4,Sx,T
31/03/2016 9:00,14.6,175,17,0,1013.4,1016.9,1017,0,68.2,6,512,22.7,24.1,25.5,26.1,8,20.74
31/03/2016 9:10,14.6,194,22,0.1,1013.4,1016.9,1017,0,67.2,5,565,22.7,24.1,25.5,26.1,8,20.97
31/03/2016 9:20,14.8,198,30,0.1,1013.4,1016.9,1017,0,68.2,5,574,22.7,24,25.5,26.1,8,20.92
31/03/2016 9:30,15.1,215,27,0,1013.4,1016.8,1017,0,66.6,5,623,22.6,24,25.5,26.1,8,21.63
I require the elements in the WAST column and the S column, as WAST contains the date and S contains the wind speed.
By no means do I want people to just give me the solution; I need to understand how I would read in and parse this data using the struct and the template vector.
There's no real "error" per se, I just lack the fundamental understanding of where to go next.
Any help would be greatly appreciated!
Thank you

One easy and efficient way would be to have a vector per column, also known as column-oriented storage. Column-oriented storage minimizes space requirements and lets you easily apply linear algebra algorithms (including SIMD-optimized ones) without having to pick out individual struct members, as would be the case with row-oriented storage.
You can then parse each line using fscanf, each value into a separate variable. And then push_back the variables into the corresponding columns.
As fscanf does not parse dates, you would need to extract the date string into a char[64] and then parse that into struct tm which then can be converted to time_t.
The above assumes that you know the layout of the CSV and the types of the columns.
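As a rough sketch of the date step, here is one way to turn the timestamp text into a time_t with sscanf and mktime. It assumes the "dd/mm/yyyy h:mm" layout seen in the sample data and treats the values as local time; the name parse_date is only a placeholder matching the pseudo-code, and no range checking is done on the individual fields.

```cpp
#include <cstdio>
#include <ctime>

// Hypothetical helper for the "dd/mm/yyyy h:mm" format in the sample data.
// Returns (time_t)-1 if the text does not match that layout.
std::time_t parse_date(const char* text) {
    int day, month, year, hour, minute;
    if (std::sscanf(text, "%d/%d/%d %d:%d", &day, &month, &year, &hour, &minute) != 5)
        return static_cast<std::time_t>(-1);
    std::tm parts = {};
    parts.tm_mday = day;
    parts.tm_mon = month - 1;     // struct tm counts months from 0
    parts.tm_year = year - 1900;  // and years from 1900
    parts.tm_hour = hour;
    parts.tm_min = minute;
    parts.tm_isdst = -1;          // let mktime decide about daylight saving
    return std::mktime(&parts);   // interprets the fields as local time
}
```

Note that mktime normalizes out-of-range fields instead of rejecting them, so a stricter parser would validate the numbers first.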
Pseudo-code:
vector<time_t> timestamps;
vector<double> wind_speeds;
for(;;) {
// Parse the CSV line into variables.
char date_str[64 + 1];
double wind_speed;
fscanf(file, "%64[^,], ..., %lf,...", date_str, ..., &wind_speed, ...);
time_t timestamp = parse_date(date_str);
// Store the parsed variables into the vectors.
timestamps.push_back(timestamp);
wind_speeds.push_back(wind_speed);
}
double average_wind_speed = std::accumulate(wind_speeds.begin(), wind_speeds.end(), 0.) / wind_speeds.size();
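For comparison, here is a minimal sketch of the same column-per-vector idea written with C++ streams instead of fscanf. It assumes the layout from the question (timestamp in column 0, wind speed S in column 10, one header row) and uses std::vector for brevity; WindColumns and load_columns are made-up names, and the timestamp is kept as raw text rather than converted to time_t.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container: one vector per column of interest.
struct WindColumns {
    std::vector<std::string> timestamps;  // raw WAST text, e.g. "31/03/2016 9:00"
    std::vector<double> wind_speeds;      // S column
};

// Reads a CSV stream whose first line is a header.
// Assumes column 0 holds the timestamp and column 10 the wind speed.
WindColumns load_columns(std::istream& in) {
    WindColumns cols;
    std::string line;
    std::getline(in, line);  // skip the header row
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        std::istringstream fields(line);
        std::string field, timestamp;
        double speed = 0.0;
        bool have_speed = false;
        for (std::size_t i = 0; std::getline(fields, field, ','); ++i) {
            if (i == 0) timestamp = field;
            if (i == 10) {
                std::istringstream conv(field);
                have_speed = static_cast<bool>(conv >> speed);
                break;  // nothing of interest after column 10
            }
        }
        if (!have_speed) continue;  // skip rows that are too short or malformed
        cols.timestamps.push_back(timestamp);
        cols.wind_speeds.push_back(speed);
    }
    return cols;
}
```

With a home-made Vector the structure stays the same; only push_back and the element access need to exist on that class.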

.csv files are a text representation of a table, delimited by "," (comma) to change cell and by a line break ("\n") for the end of the line.
EDIT: If your file marks the end of a line with a different character, the algorithm below can easily be applied with that character instead of "\n".
In fact, there is no need to create a complicated program: if and while are enough. Here is an idea of how to proceed; I hope it helps you understand a method, as that is what you are asking for.
1- Read every character (store it in a char) and add it to a string (the string += the char).
1.1- If the character is a ",", increase a counter and then compare the string to the desired value (here WAST), emptying the string afterwards.
1.1.1- If the string equals the desired value, save the counter in an integer (this gives you the position of the column you want).
1.1.2- If not, continue until the end of the line "\n" (which means in your case the desired column does not exist) or until you have a match (your string == "WAST").
NB: You can do it with different saved positions so that you know the WAST position, the S position, etc.
Then:
Initialise a new counter.
2- Compare the new counter to the value saved in 1.1.1.
2.1.1- If the values match, store the characters in a string until you reach the next comma.
2.1.2- If not, read every char until you find the next comma. Then increase your counter and restart from 2.
3- Continue to read the characters until you find the end of the line "\n", then reset the counter and restart at step 2, until you have finished reading the file.
To summarise, the first step is to read the column names until you find the one you want or arrive at the end of the line, and to store its position (counted in commas) in counter1.
Then read every other line, storing the string found at the desired column position by comparing counter1 to a new counter.
It may not be the most powerful algorithm by far, but it works and is easy to understand.
I tried to avoid writing it in C so that you can understand the steps without seeing the programmed solution. I hope it suits you.
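For readers who do want to see it in code, the steps above could be sketched roughly as follows. This is only one possible reading of them: the function names find_column and field_at are invented here, positions are counted from 0, and each function works on a single line that has already been read (without its line break).

```cpp
#include <cstddef>
#include <string>

// Returns the zero-based position of `name` in a comma-separated header line,
// or -1 if no column has that name (the first part of the steps above).
int find_column(const std::string& header, const std::string& name) {
    std::string current;
    int counter = 0;
    for (std::size_t i = 0; i <= header.size(); ++i) {
        // Treat the end of the line like one final comma.
        if (i == header.size() || header[i] == ',') {
            if (current == name) return counter;
            current.clear();
            ++counter;
        } else {
            current += header[i];
        }
    }
    return -1;
}

// Returns the text of the field at position `wanted` in a data line,
// or an empty string if the line has too few fields (the second part).
std::string field_at(const std::string& line, int wanted) {
    std::string current;
    int counter = 0;
    for (std::size_t i = 0; i < line.size(); ++i) {
        if (line[i] == ',') {
            if (counter == wanted) return current;
            ++counter;
        } else if (counter == wanted) {
            current += line[i];
        }
    }
    return counter == wanted ? current : std::string();
}
```

Calling find_column once on the header for "WAST" and once for "S", then field_at on every following line, gives the two values the question asks for as strings.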

Related

Improve Time Efficiency of Driver Program

Sorry that the title is vague. Essentially I am trying to improve the time (and overall) efficiency of a C++ driver program which:
Reads in a file line by line using ifstream.
It is vital to my program that the lines are processed separately, so I currently have 4 separate calls to getline.
The program reads each line into a vector of integers using a stringstream.
Finally, it converts the vector into a linked list of integers. Is there a way or a function that can directly read the integers from the file into the linked list of integers?
Here is the driver code:
int main(int argc, char *argv[])
{
ifstream infile(argv[1]);
vector<int> vals_add;
vector<int> vals_remove;
//Driver Code
if(infile.is_open()){
string line;
int n;
getline(infile, line);
istringstream iss (line);
getline(infile, line);
istringstream iss2 (line);
while (iss2 >> n){
vals_add.push_back(n);
}
getline(infile, line);
istringstream iss3 (line);
getline(infile, line);
istringstream iss4 (line);
while (iss4 >> n){
vals_remove.push_back(n);
}
int array_add[vals_add.size()];
copy(vals_add.begin(), vals_add.end(), array_add);
int array_remove[vals_remove.size()];
copy(vals_remove.begin(), vals_remove.end(), array_remove);
Node *ptr = CnvrtVectoList(array_add, sizeof(array_add)/sizeof(int));
print(ptr);
cout << "\n";
for(int i = 0; i < vals_remove.size(); i++){
deleteNode(&ptr, vals_remove[i]);
}
print(ptr);
cout << "\n";
}
Here is a small example input:
7
6 18 5 20 48 2 97
8
3 6 9 12 28 5 7 10
Where lines 2 and 4 MUST be processed as separate lists, and lines 1 and 3 are the size of the lists (they must dynamically allocate memory so the size must remain exact to the input).
There are multiple points that can be improved.
First off, remove unnecessary code: you’re not using iss and iss3. Next, your array_add and array_remove seem to be redundant. Use the vectors directly.
If you have a rough idea of how many values you’ll read on average, reserve space in the vectors to avoid repeated resizing and copying (actually you seem to have these numbers in your input; use this information instead of throwing it away!). You can also replace your while reading loops with std::copy and std::istream_iterators.
You haven’t shown how CnvrtVectoList is implemented but in general linked lists aren’t particularly efficient to work with due to lack of locality: they throw data all over the heap. Contiguous containers (= vectors) are almost always more efficient, even when you need to remove elements in the middle. Try using a vector instead and time the performance carefully.
Lastly, can you sort the values? If so, then you can implement the deletion of values a lot more efficiently using iterative calls to std::lower_bound, or a single call to std::set_difference.
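A minimal sketch of the std::set_difference variant, assuming the values may be reordered and that a plain std::vector is acceptable in place of the linked list; remove_sorted is a name invented for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Returns the elements of `values` that remain after removing `to_remove`.
// Both inputs are taken by value so they can be sorted locally.
std::vector<int> remove_sorted(std::vector<int> values, std::vector<int> to_remove) {
    std::sort(values.begin(), values.end());
    std::sort(to_remove.begin(), to_remove.end());
    std::vector<int> kept;
    kept.reserve(values.size());
    std::set_difference(values.begin(), values.end(),
                        to_remove.begin(), to_remove.end(),
                        std::back_inserter(kept));
    return kept;
}
```

Two things differ from deleting node by node: the result comes back sorted, and for duplicated values std::set_difference removes one occurrence per matching entry in to_remove. Whether that matches the behaviour of deleteNode depends on code not shown in the question.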
If (and only if!) the overhead is actually in the reading of the numbers from a file, restructure your IO code and don’t read lines separately (that way you’ll avoid many redundant allocations). Instead, scan directly through the input file (optionally using a buffer or memory mapping) and manually keep track of how many newline characters you’ve encountered. You can then use the strtod family of functions to scan numbers from the input read buffer.
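A rough sketch of that buffer-scanning idea, assuming the whole file has already been read into a std::string and contains integers only (so std::strtol is used here rather than strtod); parse_lines is a hypothetical name.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// Splits a whole-file buffer into one vector of integers per line,
// scanning the characters once and using std::strtol for the conversion.
std::vector<std::vector<long>> parse_lines(const std::string& buffer) {
    std::vector<std::vector<long>> lines(1);
    const char* p = buffer.c_str();  // null-terminated, so the scan stops safely
    while (*p != '\0') {
        if (*p == '\n') {            // newline: start collecting a new line
            lines.emplace_back();
            ++p;
            continue;
        }
        // Only hand text to strtol when it starts a number; strtol itself
        // skips whitespace and could otherwise run past a newline.
        const bool starts_number = *p == '-' || *p == '+' ||
                                   std::isdigit(static_cast<unsigned char>(*p));
        char* end = nullptr;
        const long value = starts_number ? std::strtol(p, &end, 10) : 0;
        if (!starts_number || end == p) {  // separator or stray character
            ++p;
            continue;
        }
        lines.back().push_back(value);
        p = end;
    }
    if (lines.back().empty()) lines.pop_back();  // slot opened by a trailing '\n'
    return lines;
}
```

For the example input this yields four vectors, of which the second and fourth are the two lists.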
Or, if you can assume that the input is correct, you can avoid reading separate lines by using the information provided in the file:
int add_num;
infile >> add_num;
std::copy_n(std::istream_iterator<int>(infile), add_num, std::inserter(your_list, std::end(your_list)));
int del_num;
infile >> del_num;
std::vector<int> to_delete(del_num);
std::copy_n(std::istream_iterator<int>(infile), del_num, to_delete.begin());
for (auto const n : to_delete) {
deleteNode(&ptr, n);
}
First of all: why do you use some custom list data structure? It's very likely that it is half-baked, i.e. doesn't have support for allocators, and thus would be much harder to adapt to perform well. Just use std::list for a doubly-linked list, or std::forward_list for a singly-linked list. Easy.
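As a small illustration of how little code the standard list needs, here is a sketch that builds a std::list<int> straight from one line of text; list_from_line is a made-up helper name, and the line is assumed to contain only whitespace-separated integers.

```cpp
#include <iterator>
#include <list>
#include <sstream>
#include <string>

// Builds a std::list<int> directly from one whitespace-separated line of text.
std::list<int> list_from_line(const std::string& line) {
    std::istringstream in(line);
    return std::list<int>(std::istream_iterator<int>(in), std::istream_iterator<int>());
}
```

Deleting a value afterwards is a single call such as values.remove(5), which erases every element equal to 5.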
There are several requirements that you seem to imply:
The values of type T (for example: an int) are to be stored in a linked list - either std::list<T> or std::forward_list<T> (not a raw list of Nodes).
The data shouldn't be unnecessarily copied - i.e. the memory blocks shouldn't be reallocated.
The parsing should be parallelizable, although this makes sense only on fast data sources where the I/O won't dwarf CPU time.
The idea is then:
Use a custom allocator to carve memory in contiguous segments that can store multiple list nodes.
Parse the entire file into linked lists that use the above allocator. The lists will allocate memory segments on demand. A new list is started on each newline.
Return the 2nd and 4th list (i.e. lists of elements in the 2nd and 4th line).
It's worth noting that the lines that contain element counts are unnecessary. Of course, that data could be passed to the allocator to pre-allocate enough memory segments, but this disallows parallelization, since parallel parsers don't know where the element counts are - these get found only after the parallel-parsed data is reconciled. Yes, with a small modification, this parsing can be completely parallelized. How cool is that!
Let's start simple and minimal: parse the file to produce two lists. The example below uses a std::istringstream over the internally generated text view of the dataset, but parse could also be passed a std::ifstream of course.
// https://github.com/KubaO/stackoverflown/tree/master/questions/linked-list-allocator-58100610
#include <algorithm>
#include <cassert>
#include <cctype>
#include <forward_list>
#include <iostream>
#include <memory>
#include <sstream>
#include <string>
#include <vector>
using element_type = int;
template <typename allocator> using list_type = std::forward_list<element_type, allocator>;
template <typename allocator>
std::vector<list_type<allocator>> parse(std::istream &in, allocator alloc)
{
using list_t = list_type<allocator>;
std::vector<list_t> lists;
element_type el;
list_t *list = {};
do {
in >> el;
if (in.good()) {
if (!list) list = &lists.emplace_back(alloc);
list->push_front(std::move(el));
}
while (in.good()) {
int c = in.get();
if (!isspace(c)) {
in.unget();
break;
}
else if (c=='\n') list = {};
}
} while (in.good() && !in.eof());
for (auto &list : lists) list.reverse();
return lists;
}
And then, to test it:
const std::vector<std::vector<element_type>> test_data = {
{6, 18, 5, 20, 48, 2, 97},
{3, 6, 9, 12, 28, 5, 7, 10}
};
template <typename allocator = std::allocator<element_type>>
void test(const std::string &str, allocator alloc = {})
{
std::istringstream input{str};
auto lists = parse(input, alloc);
assert(lists.size() == 4);
lists.erase(lists.begin()+2); // remove the 3rd list
lists.erase(lists.begin()+0); // remove the 1st list
for (int i = 0; i < test_data.size(); i++)
assert(std::equal(test_data[i].begin(), test_data[i].end(), lists[i].begin()));
}
std::string generate_input()
{
std::stringstream s;
for (auto &data : test_data) {
s << data.size() << "\n";
for (const element_type &el : data) s << el << " ";
s << "\n";
}
return s.str();
}
Now, let's look at a custom allocator:
class segment_allocator_base
{
protected:
static constexpr size_t segment_size = 128;
using segment = std::vector<char>;
struct free_node {
free_node *next;
free_node() = delete;
free_node(const free_node &) = delete;
free_node &operator=(const free_node &) = delete;
free_node *stepped_by(size_t element_size, int n) const {
auto *p = const_cast<free_node*>(this);
return reinterpret_cast<free_node*>(reinterpret_cast<char*>(p) + (n * element_size));
}
};
struct segment_store {
size_t element_size;
free_node *free = {};
explicit segment_store(size_t element_size) : element_size(element_size) {}
std::forward_list<segment> segments;
};
template <typename T> static constexpr size_t size_for() {
constexpr size_t T_size = sizeof(T);
constexpr size_t element_align = std::max(alignof(free_node), alignof(T));
constexpr auto padding = T_size % element_align;
return T_size + padding;
}
struct pimpl {
std::vector<segment_store> stores;
template <typename T> segment_store &store_for() {
constexpr size_t element_size = size_for<T>();
for (auto &s : stores)
if (s.element_size == element_size) return s;
return stores.emplace_back(element_size);
}
};
std::shared_ptr<pimpl> dp{new pimpl};
};
template<typename T>
class segment_allocator : public segment_allocator_base
{
segment_store *d = {};
static constexpr size_t element_size = size_for<T>();
static free_node *advanced(free_node *p, int n) { return p->stepped_by(element_size, n); }
static free_node *&advance(free_node *&p, int n) { return (p = advanced(p, n)); }
void mark_free(free_node *free_start, size_t n)
{
auto *p = free_start;
for (; n; n--) p = (p->next = advanced(p, 1));
advanced(p, -1)->next = d->free;
d->free = free_start;
}
public:
using value_type = T;
using pointer = T*;
template <typename U> struct rebind {
using other = segment_allocator<U>;
};
segment_allocator() : d(&dp->store_for<T>()) {}
segment_allocator(segment_allocator &&o) = default;
segment_allocator(const segment_allocator &o) = default;
segment_allocator &operator=(const segment_allocator &o) {
dp = o.dp;
d = o.d;
return *this;
}
template <typename U> segment_allocator(const segment_allocator<U> &o) :
segment_allocator_base(o), d(&dp->store_for<T>()) {}
pointer allocate(const size_t n) {
if (n == 0) return {};
if (d->free) {
// look for a sufficiently long contiguous region
auto **base_ref = &d->free;
auto *base = *base_ref;
do {
auto *p = base;
for (auto need = n; need; need--) {
auto *const prev = p;
auto *const next = prev->next;
advance(p, 1);
if (need > 1 && next != p) {
base_ref = &(prev->next);
base = next;
break;
} else if (need == 1) {
*base_ref = next; // remove this region from the free list
return reinterpret_cast<pointer>(base);
}
}
} while (base);
}
// generate a new segment, guaranteed to contain enough space
size_t count = std::max(n, segment_size);
auto &segment = d->segments.emplace_front(count * element_size); // bytes, not elements
auto *const start = reinterpret_cast<free_node*>(segment.data());
if (count > n)
mark_free(advanced(start, n), count - n);
else
d->free = nullptr;
return reinterpret_cast<pointer>(start);
}
void deallocate(pointer ptr, std::size_t n) {
mark_free(reinterpret_cast<free_node*>(ptr), n);
}
using propagate_on_container_copy_assignment = std::true_type;
using propagate_on_container_move_assignment = std::true_type;
};
For the little test data we've got, the allocator will only allocate a segment... once!
To test:
int main()
{
auto test_input_str = generate_input();
std::cout << test_input_str << std::endl;
test(test_input_str);
test<segment_allocator<element_type>>(test_input_str);
return 0;
}
Parallelization would leverage the allocator above, starting multiple threads and in each invoking parse on its own allocator, each parser starting at a different point in the file. When the parsing is done, the allocators would have to merge their segment lists, so that they'd compare equal. At that point, the linked lists could be combined using usual methods. Other than thread startup overhead, the parallelization would have negligible overhead, and there'd be no data copying involved to combine the data post-parallelization. But I leave this exercise to the reader.

Creating a hashtable using vectors of vectors?

I'm currently trying to write a program that creates a hash table, using vectors of vectors for my collision resolution method.
The problem I am facing is that during runtime, a vector of vectors is created, but all of the Entry vectors inside remain of size 0. I know my put functions are faulty but I don't know where/why.
This is my first time creating a hash table and I'd appreciate any assistance in what the problem might be. My goal is to create a vector of Entry vectors, and each Entry has its associated key and value. After finding the hash value for a new Entry key, it should check the Entry vectors' key values to see if the key already exists. If it does, it updates that key's value.
This is a segment of table.cpp:
Table::Table(unsigned int maximumEntries) : maxEntries(100){
this->maxEntries = maximumEntries;
this->Tsize = 2*maxEntries;
}
Table::Table(unsigned int entries, std::istream& input){ //do not input more than the specified number of entries.
this->maxEntries = entries;
this->Tsize = 2*maxEntries;
std::string line = "";
int numEntries = 0;
getline(input, line);
while(numEntries<maxEntries || input.eof()){ // reads to entries or end of file
int key;
std::string strData = "";
convertToValues(key, strData, line);
put(key, strData); // adds each of the values to the tab;e
numEntries++;
getline(input,line);
}
}
void Table::put(unsigned int key, std::string data){
Entry newEntryObj(key,data); //create a new Entry obj
put(newEntryObj);
}
void Table::put(Entry e){ // creating the hash table
assert(currNumEntries < maxEntries);
int hash = (e.get_key() % Tsize);
Entry newEntry = Entry(e.get_key(), e.get_data());
for(int i = 0; i < hashtable[hash].size(); i++){
if (e.get_key() == hashtable[hash][i].get_key()){
hashtable[hash][i].set_data(e.get_data());
}
else{
hashtable[hash].push_back(newEntry); // IF KEY DOESNT EXIST, ADD TO THE VECTOR
}
}
}
This is Table.h
#ifndef table_h
#define table_h
#include "entry.h"
#include <string>
#include <istream>
#include <fstream>
#include <iostream>
#include <vector>
class Table{
public:
Table(unsigned int max_entries = 100); //Builds empty table with maxEntry value
Table(unsigned int entries, std::istream& input); //Builds table designed to hold number of entires
void put(unsigned int key, std::string data); //creates a new Entry to put in
void put(Entry e); //puts COPY of entry into the table
std::string get(unsigned int key) const; //returns string associated w/ param, "" if no entry exists
bool remove(unsigned int key); //removes Entry containing the given key
friend std::ostream& operator<< (std::ostream& out, const Table& t); //overloads << operator to PRINT the table.
int getSize();
std::vector<std::vector<Entry>> getHashtable();
private:
std::vector<std::vector<Entry>> hashtable; //vector of vectors
int Tsize; //size of table equal to twice the max number of entries
int maxEntries;
int currNumEntries;
#endif /* table_h */
};
and Entry.h:
#include <string>
#include <iosfwd>
class Entry {
public:
// constructor
Entry(unsigned int key = 0, std::string data = "");
// access and mutator functions
unsigned int get_key() const;
std::string get_data() const;
static unsigned int access_count();
void set_key(unsigned int k);
void set_data(std::string d);
// operator conversion function simplifies comparisons
operator unsigned int () const;
// input and output friends
friend std::istream& operator>>
(std::istream& inp, Entry &e);
friend std::ostream& operator<<
(std::ostream& out, Entry &e);
private:
unsigned int key;
std::string data;
static unsigned int accesses;
};
There are various problems with your code, but the answer for your question would be this:
In
void Table::put(Entry e){ // creating the hash table
Have a look at the loop.
for(int i = 0; i < hashtable[hash].size(); i++){
Now, hashtable[hash] is a vector. But initially it doesn't have any elements. So hashtable[hash].size() is 0. So you don't enter the loop.
On top of this, trying to access hashtable[hash] in the first place results in undefined behaviour due to hashtable not being properly resized to Tsize. Try this in your constructor(s):
this->maxEntries = maximumEntries;
this->Tsize = 2*maxEntries;
this->hashtable.resize(this->Tsize);
EDIT:
It would be easier for you to understand if you use std::vector::at function instead of std::vector::operator[]. For example:
void Table::put(Entry e){ // creating the hash table
assert(currNumEntries < maxEntries);
int hash = (e.get_key() % Tsize);
Entry newEntry = Entry(e.get_key(), e.get_data());
for(int i = 0; i < hashtable.at(hash).size(); i++){
if (e.get_key() == hashtable.at(hash).at(i).get_key()){
hashtable.at(hash).at(i).set_data(e.get_data());
}
else{
hashtable.at(hash).push_back(newEntry); // IF KEY DOESNT EXIST, ADD TO THE VECTOR
}
}
}
Without resizing hashtable, this code would throw an out_of_range exception when you try to do hashtable.at(hash) the first time.
P.S. None of this is tested.
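To show the two fixes together (sizing the outer vector up front, and appending only after the whole bucket has been searched), here is a simplified, self-contained sketch. SimpleTable is a stand-in invented for this example and is not the Table/Entry interface from the question; it stores key/value pairs directly.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the Table/Entry classes in the question.
class SimpleTable {
public:
    // Size the outer vector up front so every bucket index is valid.
    explicit SimpleTable(std::size_t max_entries)
        : buckets_(max_entries == 0 ? 1 : 2 * max_entries) {}

    void put(unsigned int key, const std::string& data) {
        auto& bucket = buckets_[key % buckets_.size()];
        for (auto& entry : bucket) {
            if (entry.first == key) {  // key already present: update in place
                entry.second = data;
                return;
            }
        }
        bucket.emplace_back(key, data);  // searched the whole bucket: append once
        ++count_;
    }

    // Returns the stored string, or "" if the key is absent.
    std::string get(unsigned int key) const {
        for (const auto& entry : buckets_[key % buckets_.size()])
            if (entry.first == key) return entry.second;
        return "";
    }

    std::size_t size() const { return count_; }

private:
    std::vector<std::vector<std::pair<unsigned int, std::string>>> buckets_;
    std::size_t count_ = 0;
};
```

The important detail is that the push happens after the loop, not in an else branch inside it, so an empty bucket still receives its first entry and a full bucket does not get one copy per non-matching element.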

C++ Declaring arrays in class and declaring 2d arrays in class

I'm new to using classes and I encountered a problem while declaring an array in a class. I want to declare a char array for text limited to 50 characters and then replace the text with a function.
#ifndef MAP_H
#define MAP_H
#include "Sprite.h"
#include <SFML/Graphics.hpp>
#include <iostream>
class Map : public sprite
{
private:
char mapname[50];
int columnnumber;
int linenumber;
char casestatematricia[];
public:
void setmapname(char newmapname[50]);
void battlespace(int column, int line);
void setcasevalue(int col, int line, char value);
void printcasematricia();
};
#endif
By the way I could initialize my 2d array like that
char casestatematricia[][];
Later I want to make this 2D array dynamic, so that I can enter a column number and a line number like this
casestatematricia[linenumber][columnnumber]
to create a battlefield.
This is the cpp code, so that you have an idea of what I want to do.
#include "Map.h"
#include <SFML/Graphics.hpp>
#include <iostream>
using namespace sf;
void Map::setmapname(char newmapname[50])
{
this->mapname = newmapname;
}
void Map::battlespace(int column, int line)
{
}
void Map::setcasevalue(int col, int line, char value)
{
}
void Map::printcasematricia()
{
}
thank you in advance.
Consider following common practice on this one.
Most (e.g. numerical) libraries don't use 2D arrays inside classes.
They use dynamically allocated 1D arrays and overload the () or [] operator to access the right elements in a 2D-like fashion.
So on the outside you never can tell that you're actually dealing with consecutive storage, it looks like a 2D array.
In this way arrays are easier to resize, more efficient to store, transpose and reshape.
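A minimal sketch of that practice, assuming row-major order; the class name Grid and its members are invented for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical Grid class: rows * cols cells stored in one contiguous vector.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols, char fill = 0)
        : rows_(rows), cols_(cols), cells_(rows * cols, fill) {}

    // 2D-style access: g(row, col). Row-major, so one row occupies cols_ cells.
    char& operator()(std::size_t row, std::size_t col) { return cells_[row * cols_ + col]; }
    char operator()(std::size_t row, std::size_t col) const { return cells_[row * cols_ + col]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<char> cells_;
};
```

From the outside, g(1, 2) = 'X' reads like 2D indexing, while the storage stays a single block that the vector frees automatically. No bounds checking is done here.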
Just a proposition for your problem:
class Map : public sprite
{
private:
std::string mapname;
int columnnumber;
int linenumber;
std::vector<char> casestatematricia;
static constexpr std::size_t maxRow = 50;
static constexpr std::size_t maxCol = 50;
public:
Map():
casestatematricia(maxRow * maxCol, 0)
{}
void setmapname(std::string newmapname)
{
if (newmapname.size() > 50)
{
// Handle the error if you really need no more than 50 characters...
// Or just truncate when you serialize!
}
mapname = newmapname;
}
void battlespace(int col, int row);
void setcasevalue(int col, int row, char value)
{
// check that col and row are between 0 and max{Col|Row} - 1
casestatematricia[row * maxCol + col] = value; // row-major: each row holds maxCol cells
}
void printcasematricia()
{
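for (std::size_t row = 0; row < maxRow; ++row)
{
for (std::size_t col = 0; col < maxCol; ++col)
{
char currentCell = casestatematricia[row * maxCol + col];
std::cout << static_cast<int>(currentCell) << " "; // print the cell as a number
}
std::cout << "\n"; // one line of output per row
}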
}
};
To access a 1D array like a 2D array, take a look at Access a 1D array as a 2D array in C++.
When you think about serialization, I guess you want to save the map to a file. Just a piece of advice: don't dump raw memory to a file just to "save" time when you relaunch your software. You end up with a non-portable solution! And seriously, with the power of today's computers, you don't have to worry about the time it takes to load from a file!
I suggest adding 2 methods to your class to save the Map to a file and load it back:
void dump(std::ostream &os)
{
os << mapname << "\n";
std::size_t currentCol = 0;
for(auto c: casestatematricia)
{
os << static_cast<int>(c) << " ";
++currentCol;
if (currentCol >= maxCol) // end of a row reached
{
currentCol = 0;
os << "\n";
}
}
}
void load(std::istream &is)
{
std::string line;
std::getline(is, line);
mapname = line;
std::size_t current_cell = 0;
while(std::getline(is, line))
{
std::istringstream row(line); // needs #include <sstream>
int value; // dump() wrote each cell as an integer
while(row >> value && current_cell < casestatematricia.size())
{
casestatematricia[current_cell] = static_cast<char>(value);
++current_cell;
}
}
}
This solution is only an example. It doesn't handle errors, and I have chosen to store the data as ASCII text in the file. You can switch to a binary format, but don't write raw memory directly. You can take a look at C - serialization techniques (you just have to translate it to C++). But please, don't use memcpy or a similar technique to serialize.
I hope I get this right. You have two questions. You want to know how to assign the value of char mapname[50]; via void setmapname(char newmapname[50]);. And you want to know how to create a dynamically sized 2D array.
I hope you are comfortable with pointers, because you need them in both cases.
For the first question, I would like to first correct your understanding of void setmapname(char newmapname[50]);. C++ functions do not take arrays by value; an array parameter is really a pointer to the array's first element. So it is as good as writing void setmapname(char *newmapname);. For better understanding, go to Passing Arrays to Function in C++.
With that, I am going to change the function to also take the length of the new map name. To assign mapname, just use a loop to copy each char.
void setmapname(char *newmapname, int length) {
// ensure that the string passed in is not longer than
// what mapname can hold, leaving room for the terminating '\0'.
length = length < 49 ? length : 49;
// loop over each value and assign one by one.
for(int i = 0; i < length; ++i) {
mapname[i] = newmapname[i];
}
mapname[length] = '\0'; // assumption: mapname is used as a C string
}
For the second question, you can use a vector as proposed by Garf365, but I prefer to just use a pointer, and I will use a 1D array to represent the 2D battlefield. (You can read the link Garf365 provided.)
// Declare like this
char *casestatematricia; // remember to initialize this to 0.
// Create the battlefield
void Map::battlespace(int column, int line) {
columnnumber = column;
linenumber = line;
// Clear the previous battlefield.
clearspace();
// Creating the battlefield
casestatematricia = new char[column * line];
// initialise casestatematricia...
}
// Call this after you done using the battlefield
void Map::clearspace() {
if (!casestatematricia) return;
delete [] casestatematricia;
casestatematricia = 0;
}
Just remember to call clearspace() when you are no longer using it.
Just for your benefit, this is how you create a dynamic size 2D array
// Declare like this
char **casestatematricia; // remember to initialize this to 0.
// Create the battlefield
void Map::battlespace(int column, int line) {
// Clear the previous battlefield first, while columnnumber
// still describes the old allocation.
clearspace();
columnnumber = column;
linenumber = line;
// Creating the battlefield
casestatematricia = new char*[column];
for (int i = 0; i < column; ++i) {
casestatematricia[i] = new char[line];
}
// initialise casestatematricia...
}
// Call this after you done using the battlefield
void Map::clearspace() {
if (!casestatematricia) return;
for(int i = 0; i < columnnumber; ++i) {
delete [] casestatematricia[i];
}
delete [] casestatematricia;
casestatematricia = 0;
}
Hope this helps.
PS: If you need to serialize the string, you can use the Pascal string format so that you can support strings of variable length, e.g. "11hello world" or "3foo".
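A small sketch of the length-prefixed format from the PS, using a decimal length as in the "11hello world" example. The names encode_prefixed and decode_prefixed are invented here. One loud assumption: the text itself must not start with a digit, otherwise the decimal prefix becomes ambiguous; a fixed-width or binary length field would avoid that.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Writes the length in decimal, immediately followed by the text.
// Assumption: `text` does not begin with a digit.
std::string encode_prefixed(const std::string& text) {
    return std::to_string(text.size()) + text;
}

// Reads one length-prefixed string starting at `pos`. On success stores it in
// `out`, advances `pos` past it and returns true; otherwise returns false.
bool decode_prefixed(const std::string& data, std::size_t& pos, std::string& out) {
    std::size_t i = pos;
    std::size_t length = 0;
    while (i < data.size() && std::isdigit(static_cast<unsigned char>(data[i]))) {
        length = length * 10 + static_cast<std::size_t>(data[i] - '0');
        ++i;
    }
    if (i == pos || length > data.size() - i) return false;  // no digits, or truncated
    out = data.substr(i, length);
    pos = i + length;
    return true;
}
```

Because every record carries its own length, several strings can be written back to back and read again by calling decode_prefixed repeatedly with the same pos.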

Constructor issue <Unable to read memory>

I have to create a class Histogram and perform operations on this class. The input can be a one-dimensional array or a two-dimensional array. The problem appears when I convert the array into a matrix. This is what I have tried so far. The error is <Unable to read memory>
histogram.h
#ifndef HISTOGRAM_H
#define HISTOGRAM_H
#include<iostream>
class Histogram
{
private:
int** matrix;
int lines;
void SortMatrix();
public:
Histogram(){ }
Histogram(int elements[], int elementsNr);
Histogram(int** m, int l);
void Print();
};
#endif
histogram.cpp
#include"histogram.h"
using namespace std;
Histogram::Histogram(int** m, int l)
{
matrix=m;
lines=l;
SortMatrix();
}
Histogram::Histogram(int elements[], int elementsNr)
{
lines=0;
//initialize matrix : elementrNr lines and 2 columns
int** matrix=new int*[elementsNr];
for(int i=0;i<elementsNr;i++)
{
matrix[i]=new int[2];
matrix[i][0]=INT_MIN;
matrix[i][1]=INT_MIN;
}
//search each element from the array in the matrix
bool found=false;
for(int i=0;i<elementsNr;i++)
{
found=false;
for(int j=0;j<elementsNr;j++)
{
//the element was found in the matrix ( on the first column )
if(matrix[j][0] == elements[i])
{
matrix[j][1]++;
found=true;
break;
}
}
if(!found)
{
matrix[lines][0]=elements[i];
matrix[lines][1]=1;
lines++;
}
}
SortMatrix();
}
void Histogram::SortMatrix()
{
bool flag=true;
int temp;
for(int i=0;(i<lines) && flag;i++)
{
flag=false;
if(matrix[i+1][0]>matrix[i][0])
{
temp=matrix[i][0];
matrix[i][0]=matrix[i+1][0];
matrix[i+1][0]=temp;
flag=true;
}
}
}
void Histogram::Print()
{
for(int i=0;i<lines;i++)
{
cout<<matrix[i][0]<<" : " <<matrix[i][1]<<endl;
}
}
main.cpp
#include"histogram.h"
#include<iostream>
using namespace std;
int main()
{
int arr[]={6,7,3,1,3,2,4,4,7,5,1,1,5,6,6,4,5};
Histogram h(arr,17);
h.Print();
}
Here
int** matrix=new int*[elementsNr];
replace with
matrix=new int*[elementsNr];
because matrix is already a member variable. You are creating a new local double pointer named matrix and allocating memory for it, rather than for your member variable matrix, so the member is never initialized.
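The effect of that redeclaration can be seen in isolation with a small sketch. The struct names Shadowed and Fixed are made up for this illustration and are not from the question.

```cpp
#include <cstddef>

// The constructor declares a NEW local variable called data,
// which hides the member of the same name.
struct Shadowed {
    int* data;
    Shadowed() : data(NULL) {
        int* data = new int[4];  // local variable, not this->data
        data[0] = 42;
        delete[] data;           // the local is released; the member is still NULL
    }
};

// Without the type in front, the statement assigns to the member.
// (Copying is not handled; this sketch only shows the initialization.)
struct Fixed {
    int* data;
    Fixed() : data(NULL) {
        data = new int[4];       // assigns to this->data
        data[0] = 42;
    }
    ~Fixed() { delete[] data; }
};
```

Many compilers can flag this kind of mistake when shadowing warnings are enabled.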
A couple of people have already given you advice about how to fix some of the problems with this code. I'll give slightly different advice that may initially seem a bit brutal by comparison, but I'll try to demonstrate how it's honestly useful rather than nasty.
I would throw out your existing code with the possible exception of what you have in main, and start over, using an std::map. What you're doing right now is basically trying to re-create the capabilities that std::map already provides (and even when your code is fixed, it's not doing the job as well as std::map does right out of the box).
Using map, your whole program comes out to something like this:
#include <iostream>
#include <map>
int main() {
    int arr[] = {6,7,3,1,3,2,4,4,7,5,1,1,5,6,6,4,5};
    std::map<int, int> h;
    // Count occurrences; operator[] starts a missing key at zero.
    for (int i=0; i<17; i++)
        ++h[arr[i]];
    // A map iterates in ascending key order.
    for (std::map<int, int>::const_iterator it = h.begin(); it != h.end(); ++it)
        std::cout << it->first << " : " << it->second << "\n";
    return 0;
}
If you want to maintain virtually the same interface as your histogram class provided, it's pretty easy to do that -- the counting loop goes into the constructor, the output loop into Print (and SortMatrix disappears, because a map is always sorted).
By doing this, you change from an O(N^2) algorithm to an O(N log N) algorithm. The bugs others have pointed out disappear completely, because the code that contained them is no longer needed. The only real disadvantage I can see is that the result will probably use a bit more memory -- it uses a balanced tree with individually allocated nodes, which is likely to introduce a fair amount of overhead for nodes that only contain 2 ints (and a bit for balancing). I can't quite imagine worrying about this though -- long before you have enough nodes for the memory usage to become significant, you have far too many to even consider presenting to the user.
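A wrapper that keeps roughly the interface from the question might look like the sketch below. It covers only the one-dimensional constructor, and CountOf is a hypothetical helper added here for illustration; neither the helper nor the optional stream parameter of Print appears in the original class.

```cpp
#include <iostream>
#include <map>
#include <ostream>

class Histogram {
public:
    // Count how often each value occurs in the input array.
    Histogram(const int elements[], int elementsNr) {
        for (int i = 0; i < elementsNr; ++i)
            ++counts[elements[i]];  // operator[] starts a missing key at 0
    }

    // Print "value : count" pairs in ascending order of value.
    void Print(std::ostream& os = std::cout) const {
        for (std::map<int, int>::const_iterator it = counts.begin();
             it != counts.end(); ++it)
            os << it->first << " : " << it->second << '\n';
    }

    // Hypothetical helper: occurrences of one value (0 if absent).
    int CountOf(int value) const {
        std::map<int, int>::const_iterator it = counts.find(value);
        return it == counts.end() ? 0 : it->second;
    }

private:
    std::map<int, int> counts;  // owns its memory, so no destructor is needed
};
```

Because the map manages its own storage, the class needs no manual new or delete, and copying a Histogram is safe by default.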
@mathematician1975 already provided an answer for the main problem. There's another bug in SortMatrix(): you only swap the elements of the first column, therefore after sorting, the counts (in the second column) will not be correct anymore. You'll have to insert
temp=matrix[i][1];
matrix[i][1]=matrix[i+1][1];
matrix[i+1][1]=temp;
to get it working.
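An alternative to swapping both columns by hand is to swap the row pointers, which moves a value and its count together. The sketch below is a free function rather than the member from the question, and the name SortRows is made up. It assumes descending order by value was intended (as the comparison in the question suggests), makes repeated passes, and stops at lines - 1 so that matrix[i + 1] is never read past the last used row.

```cpp
#include <algorithm>
#include <utility>

// Bubble-sort the first `lines` rows of a two-column matrix by column 0,
// largest value first. Each row is {value, count}.
void SortRows(int** matrix, int lines) {
    bool swapped = true;
    for (int pass = 0; pass < lines - 1 && swapped; ++pass) {
        swapped = false;
        // After each pass the smallest remaining value has sunk to the end,
        // so the inner range shrinks by one.
        for (int i = 0; i + 1 < lines - pass; ++i) {
            if (matrix[i + 1][0] > matrix[i][0]) {
                std::swap(matrix[i], matrix[i + 1]);  // swap whole rows
                swapped = true;
            }
        }
    }
}
```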

C++: Program crash while adding object to custom vector class

I'm working on an email validation program for my cmpsci class and am having trouble with this one part.
What I'm doing is reading a list of valid top level domains from a text file into a vector class I wrote myself (I have to use a custom vector class unfortunately). The problem is that the program reads in and adds the first few domains to the vector all well and fine, but then crashes when it gets to the "org" line. I'm completely stumped why it works for the first few and then crashes.
Also, I have to use a custom string class; that's why I have the weird getline function (so I get the input in a char* for my String constructor). I've tried using the standard string class with this function and it still crashed in the same way so I can rule out the source of the problem being my string class. The whole program is quite large so I am only posting the most relevant parts. Let me know if more code is needed please. Any help would be awesome since I have no clue where to go from here. Thanks!
The ReadTlds function:
void Tld::ReadTlds() {
// Load the TLD's into the vector
validTlds = Vector<String>(0); // Init vector; declaration from header file: "static Vector<String>validTlds;"
ifstream in(TLD_FILE);
while(!in.eof()) {
char tmpInput[MAX_TLD_LENGTH]; // MAX_TLD_LENGTH equals 30
in.getline(tmpInput, MAX_TLD_LENGTH);
validTlds.Add(String(tmpInput)); // Crashes here!
}
}
My custom vector class:
#pragma once
#include <sstream>
#define INIT_CAPACITY 100
#define CAPACITY_BOOST 100
template<typename T> class Vector {
public:
// Default constructor
Vector() {
Data=NULL;
size=0;
capacity=INIT_CAPACITY;
}
// Init constructor
Vector(int Capacity) : size(0), capacity(Capacity) {
Data = new T[capacity];
}
// Destructor
~Vector() {
size=0;
Data = NULL;
delete[] Data;
}
// Accessors
int GetSize() const {return size;}
T* GetData() {return Data;}
void SetSize(const int size) {this->size = size;}
// Functions
void Add(const T& newElement) {
Insert(newElement, size);
}
void Insert(const T& newElement, int index) {
// Check if index is in bounds
if((index<0) || (index>capacity)) {
std::stringstream err;
err << "Vector::Insert(): Index " << index << " out of bounds (0-" << capacity-1 << ")";
throw err.str();
}
// Check capacity
if(size>=capacity)
Grow();
// Move all elements right of index to the right
for(int i=size-1; i>=index; i--)
Data[i+1]=Data[i];
// Put the new element at the specified index
Data[index] = newElement;
size++;
}
void Remove(int index) {
// Check if index is in bounds
if((index<0) || (index>capacity-1)) {
std::stringstream err;
err << "Vector::Remove():Index " << index << " out of bounds (0-" << capacity-1 << ")";
throw err.str();
}
// Move all elements right of index to the left
for(int i=index+1; i<size; i++)
Data[i-1]=Data[i];
}
// Index operator
T& operator [] (int index) const {
// Check if index is in bounds
if((index<0) || (index>capacity-1)) {
std::stringstream err;
err << "Vector operator[]:Index " << index << " out of bounds (0-" << capacity-1 << ")";
throw err.str();
}
return Data[index];
}
// Assignment oper
Vector<T>& operator = (const Vector<T>& right) {
Data = new T[right.GetSize()];
for(int i=0; i<right.GetSize(); i++)
Data[i] = right[i];
size = right.GetSize();
return *this;
}
private:
T *Data;
int size; // Current vector size
int capacity; // Max size of vector
void Grow() {
capacity+=CAPACITY_BOOST;
T* newData = new T[capacity];
for(int i=0; i<capacity; i++)
newData[i] = Data[i];
// Dispose old array
Data = NULL;
delete[] Data;
// Assign new array to the old array's variable
Data = newData;
}
};
The input file:
aero
asia
biz
cat
com
coop
edu
gov
info
int
jobs
mil
mobi
museum
name
net
org <-- crashes when this line is read
pro
tel
travel
The error Visual Studio throws is:
Unhandled exception at 0x5fb04013 (msvcp100d.dll) in Email4.exe: 0xC0000005: Access violation reading location 0xabababbb.
The problem is in your grow function:
void Grow() {
capacity+=CAPACITY_BOOST;
T* newData = new T[capacity];
for(int i=0; i<capacity; i++)
newData[i] = Data[i];
You increase the capacity, but then copy elements that didn't exist in the old array. It should be something like:
void Grow() {
int old_capacity = capacity;
capacity+=CAPACITY_BOOST;
T* newData = new T[capacity];
for(int i=0; i<old_capacity; i++)
newData[i] = Data[i];
You also NULL out Data before deleting it in both Grow and the destructor, which causes a memory leak: delete[] on a null pointer does nothing, so the old array is never freed. In both cases, you really don't need to set it to NULL at all, since there's no chance of it being accidentally double-deleted (in Grow it's set to a new pointer immediately, in the destructor the object's lifetime is over). So just
delete[] Data;
alone is fine.
Also I think
if(size>=capacity)
can be:
if(size == capacity)
since size should never be over capacity. That would mean you'd already overflowed the buffer.
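Putting these points together, a stripped-down vector might look like the following sketch. MiniVector and kBoost are names invented for this illustration; the class copies only the elements that actually exist, deletes the old array before replacing the pointer, and disables copying to stay short. It is not a drop-in replacement for the class in the question.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
class MiniVector {
public:
    explicit MiniVector(int initialCapacity = 4)
        : data(new T[initialCapacity > 0 ? initialCapacity : 1]),
          size(0),
          capacity(initialCapacity > 0 ? initialCapacity : 1) {}

    // Free the array directly; do not null the pointer first.
    ~MiniVector() { delete[] data; }

    void Add(const T& value) {
        if (size == capacity) Grow();
        data[size++] = value;
    }

    T& operator[](int index) {
        assert(index >= 0 && index < size);  // bounds are checked against size
        return data[index];
    }

    int GetSize() const { return size; }
    int GetCapacity() const { return capacity; }

private:
    enum { kBoost = 100 };  // growth step, mirroring CAPACITY_BOOST

    // Copying is disabled in this sketch (declared, never defined).
    MiniVector(const MiniVector&);
    MiniVector& operator=(const MiniVector&);

    void Grow() {
        int newCapacity = capacity + kBoost;
        T* newData = new T[newCapacity];
        for (int i = 0; i < size; ++i)  // copy only elements that exist
            newData[i] = data[i];
        delete[] data;                   // release the old array first
        data = newData;
        capacity = newCapacity;
    }

    T* data;
    int size;
    int capacity;
};
```

Updating capacity only after the new array has been allocated means the object is left unchanged if new throws.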
Matthew is probably right. Still, there's a valuable lesson to be learned here.
When you hit a problem like this, don't stop walking your code in your ReadTlds function. Keep walking inside the Vector class. Functions like Insert and Grow probably hold the error, but if you don't walk through them, you'll never find it.
Debugging is its own very special skill. It takes a long time to get it down pat.
Edit: it's late at night and I misread your code, but I have left my post up so I can reply to comments.
Also in the default ctor you do
Data = NULL;
capacity=INIT_CAPACITY;
(EDIT: expanded explanation here)
But never allocate the memory for Data. Shouldn't it be:
Vector() {
Data= new T[INIT_CAPACITY];
size=0;
capacity=INIT_CAPACITY;
}
And Remove is missing
--size;
EDIT:
Fellow readers help me out here:
Data is of type T*, but everywhere else you are assigning and allocating it just like T instead of T*. My C++ days are too long gone to remember whether using a T& actually resolves this.
Also, I can't remember whether, when you have an array of pointers and destroy it, the destructors of the individual instances in the array are called.
Also, in the assignment operator, wouldn't you be copying the pointers? Then you would have to rely on the instance you copied from never being deleted (because then your objects would be dead too).
hth Mario
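The open questions above can be checked with a small experiment. In the sketch below, Tracked is a made-up type that counts its live instances. Data in the question is a T* pointing at an array of T objects (not an array of pointers), so Data[i] = right[i] copies objects, and delete[] runs the destructor of every element.

```cpp
#include <cstddef>

// Counts how many instances currently exist.
struct Tracked {
    static int alive;
    int value;
    Tracked() : value(0) { ++alive; }
    Tracked(const Tracked& other) : value(other.value) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the number of live objects after new[] followed by delete[].
int aliveAfterDeleteArray() {
    Tracked* arr = new Tracked[5];  // five default constructions
    delete[] arr;                   // runs ~Tracked() on all five elements
    return Tracked::alive;
}

// Returns true if element-wise assignment produced an independent copy.
bool copiesAreIndependent() {
    Tracked* a = new Tracked[2];
    Tracked* b = new Tracked[2];
    a[0].value = 7;
    b[0] = a[0];       // copies the object, not a pointer to it
    a[0].value = 99;   // changing the source does not affect the copy
    bool ok = (b[0].value == 7);
    delete[] a;
    delete[] b;
    return ok;
}
```

The copies are only as deep as T's own assignment operator makes them, so a custom String class needs a correct operator= of its own. Separately, the posted Vector::operator= never frees the old array and never updates capacity, so after validTlds = Vector<String>(0) the vector appears to hold a zero-length array while capacity still reports INIT_CAPACITY; that seems worth checking alongside Grow.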