Sort .csv in multidimensional arrays - c++

I'm trying to read specific values (i.e. values#coordinate XY) from a .csv file and struggle with a proper way to define multidimensional arrays within that .csv.
Here's an example of the form from my .csv file
NaN,NaN,1.23,2.34,9.99
1.23,NaN,2.34,3.45,NaN
NaN,NaN,1.23,2.34,9.99
1.23,NaN,2.34,3.45,NaN
1.23,NaN,2.34,3.45,NaN
NaN,NaN,1.23,2.34,9.99
1.23,NaN,2.34,3.45,NaN
NaN,NaN,1.23,2.34,9.99
1.23,NaN,2.34,3.45,NaN
1.23,NaN,2.34,3.45,NaN
NaN,NaN,1.23,2.34,9.99
1.23,NaN,2.34,3.45,NaN
NaN,NaN,1.23,2.34,9.99
1.23,NaN,2.34,3.45,NaN
1.23,NaN,2.34,3.45,NaN
...
Ok, in reality, this file becomes very large. You can interpret rows=latitudes and columns=longitudes and thus each block is an hourly measured coordinate map. The blocks usually have the size of row[361] column[720] and time periods can range up to 20 years (=24*365*20 blocks), just to give you an idea of the data size.
To structure this, I thought of scanning through the .csv and define each block as a vector t, which I can access by choosing the desired timestep t=0,1,2,3...
Then, within this block I would like to go to a specific line (i.e. latitude) and define it as a vector longitudeArray.
The outcome shall be a specified value from coordinate XY at time Z.
As you might guess, my coding experience is rather limited and this is why my actual question might be very simple: How can I arrange my vectors in order to be able to call any random value?
This is my code so far (sadly it is not much, cause I don't know how to continue...)
#include <fstream>
#include <iostream>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
int main()
{
int longitude, latitude; //Coordinates used to specify desired value
int t; //Each array is associated to a specific time t=0,1,2,3... (corresponds to hourly measured data)
string value;
vector<string> t; //Vector of each block
vector<string> longitudeArray; //Line of array, i.e. latitude
ifstream file("swh.csv"); //Open file
if (!file.is_open()) //Check if file is opened, if not print "File could..."
{
cout << "File could not open..." << endl;
return 1;
}
while (getline(file, latitude, latitude.empty())) //Scan .csv (vertically) and delimit every time a white line occurs
{
longitudeArray.clear();
stringstream ss(latitude);
while(getline(ss,value,',') //Breaks line into comma delimited fields //Specify line number (i.e. int latitude) here??
{
latitudeArray.push_back(value); //Adds each field to the 1D array //Horizontal vector, i.e. latitude
}
t.push_back(/*BLOCK*/) //Adds each block to a distinct vector t
}
cout << t(longitudeArray[5])[6] << endl; //Output: 5th element of longitudeArray in my 6th block
return 0;
}
If you have any hint, especially if there is a better way handling large .csv files, I'd be very grateful.
Ps: C++ is inevitable for this project...
Tüdelüü,
jtotheakob

As usual you should first think in terms of data and data usage. Here you have floating point values (that can be NaN) that should be accessible as a 3D thing along latitude, longitude and time.
If you can accept simple (integer) indexes, the standard ways in C++ would be raw arrays, std::array and std::vector. The rule of thumb then says: if the sizes are known at compile time, arrays (or std::array if you want operations on whole arrays) are fine; otherwise go with vectors. And if unsure, std::vector is your workhorse.
So you will probably end with a std::vector<std::vector<std::vector<double>>> data, that you would use as data[timeindex][latindex][longindex]. If everything is static you could use a double data[NTIMES][NLATS][NLONGS] that you would access more or less the same way. Beware if the array is large, most compilers will choke if you declare it inside a function (including main), but it could be a global inside one compilation unit (C-ish but still valid in C++).
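To get a feel for how large "large" is here, the figures from the question (361 x 720 values per block, 24*365*20 blocks) can be multiplied out. This is only a small sketch, assuming 8-byte doubles; the constant names are illustrative and not part of the original code:

```cpp
#include <cstdint>

// Sizes taken from the question; the names are illustrative only.
constexpr std::uint64_t kTimes = 24ULL * 365 * 20;  // hourly blocks in 20 years
constexpr std::uint64_t kLats  = 361;
constexpr std::uint64_t kLongs = 720;

// Number of values and bytes needed if everything is held in memory at once.
constexpr std::uint64_t kValues = kTimes * kLats * kLongs;
constexpr std::uint64_t kBytes  = kValues * sizeof(double);

static_assert(sizeof(double) == 8, "sketch assumes 8-byte doubles");
static_assert(kValues == 45537984000ULL, "about 4.5e10 values");
static_assert(kBytes == 364303872000ULL, "about 364 GB");
```

At roughly 364 GB for the full 20 years, neither the static array nor the nested vectors will fit into the memory of a typical machine, so in practice you would probably load only the time range you need.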
So read the file line by line, feeding values in your container. If you use statically defined arrays just assign each new value in its position, if you use vectors, you can dynamically add new elements with push_back.
This is too far from your current code for me to show you more than trivial code.
The static (C-ish) version could contain:
#include <cctype>
#include <cstdio>
#include <iostream>
#define NTIMES (24*365*20)
#define NLATS 361
#define NLONGS 720
double data[NTIMES][NLATS][NLONGS];
...
int time, lat, lon; // "long" is a reserved word, so the index is called lon
for(time=0; time<NTIMES; time++) {
    for (lat=0; lat<NLATS; lat++) {
        for (lon=0; lon<NLONGS; lon++) {
            std::cin >> data[time][lat][lon];
            for (;;) { // skip the separators (commas, whitespace) after the value
                if (! std::cin) break;
                int c = std::cin.peek();
                if (c != EOF && (std::isspace(c) || (c == ','))) std::cin.get();
                else break;
            }
            if (! std::cin) break;
        }
        if (! std::cin) break;
    }
    if (! std::cin) break;
}
if (time != NTIMES) {
    //Not enough values or read error
    ...
}
A more dynamic version using vectors could be:
int ntimes = 0;
const int nlats = 361;  // may be a non compile-time value
const int nlongs = 720; // ditto
std::vector<std::vector<std::vector<double>>> data;
int lat = 0, lon = 0;   // "long" is a reserved word, so the index is called lon
for (;;) {
    data.push_back(std::vector<std::vector<double>>());
    for (lat = 0; lat < nlats; lat++) {
        data[ntimes].push_back(std::vector<double>(nlongs));
        for (lon = 0; lon < nlongs; lon++) {
            std::cin >> data[ntimes][lat][lon];
            for (;;) { // skip the separators (commas, whitespace) after the value
                if (! std::cin) break;
                int c = std::cin.peek();
                if (c != EOF && (std::isspace(c) || (c == ','))) std::cin.get();
                else break;
            }
            if (! std::cin) break;
        }
        if (! std::cin) break;
    }
    if (! std::cin) break; // end of input: data[ntimes] is incomplete here
    if (lat!=nlats || lon!=nlongs) {
        //Not enough values or read error
        ...
    }
    ntimes += 1;
}
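Be careful with the NaN fields: with the common standard library implementations, operator>> for double does not parse the text NaN, so the extraction fails at the first such field. The code also does not check the number of fields per line. To handle both, read each line with std::getline, split it with a std::istringstream, and convert each field yourself, for example with std::strtod, which does accept NaN.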
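The line-based approach could look roughly like the sketch below. The function name parse_row and its signature are made up for illustration; the sketch assumes fields without surrounding spaces and lines without a trailing carriage return:

```cpp
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Parses one comma-separated line into doubles.
// Returns false if the line does not hold exactly `expected` numeric fields.
// std::strtod accepts "NaN", so such fields are stored as NaN values.
bool parse_row(const std::string& line, std::size_t expected,
               std::vector<double>& out)
{
    out.clear();
    std::istringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) {
        char* end = nullptr;
        const double v = std::strtod(field.c_str(), &end);
        // Reject empty fields and fields with trailing garbage.
        if (end == field.c_str() || *end != '\0') return false;
        out.push_back(v);
    }
    return out.size() == expected;
}
```

Each successfully parsed row can then be appended to the block for the current time step, and a return value of false tells you which line of the file is malformed.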

Thanks, I tried to transfer both versions to my code, but I couldn't make it run.
Guess my poor coding skills aren't able to see what's obvious to everyone else. Can you name the additional libs I might require?
For std::isspace I do need #include <cctype>, anything else missing which is not mentioned in my code from above?
Can you also explain how if (std::isspace(c) || (c == ',')) std::cin.get(); works? From what I understand, it will check whether c (which is the input field?) is a whitespace, and if so, the right term becomes automatically "true" because of ||? What consequence results from that?
At last, if (! std::cin) break is used to stop the loop after we reached the specified array[time][lat][long]?
Anyhow, thanks for your response. I really appreciate it and I have now an idea how to define my loops.

Again thank you all for your ideas.
Unfortunately, I was not able to run the script... but my task changed slightly, thus the need to read very large arrays is not required anymore.
However, I've got an idea of how to structure such operations and most probably will transfer it to my new task.
You may close this topic now ;)
Cheers
jtothekaob

Related

Input two matrices of unspecified size

I need to input two matrices with their sizes unfixed, using a blank row to declare the end of inputting each matrix.
For example, input:
1 2
3 4
(blank row here, end of input matrix 1)
5 6 7
8 9 10
(blank row here, end of input matrix 2)
will get a 2*2 matrix and a 2*3 matrix.
My current idea is to build a matrix large enough (like 1000*1000), then set loops and use cin to input each element (the code only shows how I input matrix 1):
int matx1[1000][1000];
for (i = 0;i < 1000;i++)
{
for (j = 0;j < 1000;j++)
{
temp = getchar();
if (temp == '\n')
{
mat1.col = j;
break;
}
else
{
putchar(temp);
}
cin>>matx1[i][j];
}
temp = getchar();
if (temp == '\n')
{
mat1.row = i;
break;
}
else
{
putchar(temp);
}
}
When I run this in Xcode, it goes wrong: the putchar() function interrupts my input in the terminal by printing a number each time I press Enter, and the input ends up scrambled.
I also tried the following code to avoid use of putchar():
for (i = 0; i < 1000; i++)
{
temp = getchar();
if (temp == '\n')
{
break;
}
else
{
matx1[i][0] = temp;
for (j = 1; j < 1000; j++)
{
cin >> matx1[i][j];
if (getchar() == '\n')
{
break;
}
}
}
}
Still, there are serious problems. The temp variable stores a char, and even if I convert it to int using its ASCII code, this works only if the first element of each line is smaller than 10; otherwise the first element of the line is stored incorrectly.
So, the main question is:
How can I move on to the next row of the same matrix after pressing Enter once, and switch to the next matrix after pressing Enter again?
In other words: how can I detect the '\n' without interfering with the original input stream?
To solve the problem at hand there is a more or less standard approach. You want to read csv data.
In your case it is a little more difficult, because your data has a special format: first a list separated by " ", and then an empty line between two entries.
Now, how could this be done? C++ is an object-oriented language with many existing algorithms. You can define a proxy class and overload the extraction operator for it. The proxy class, and especially the extractor, will do all the work.
The extractor, which is the core of the question, is a little more tricky. How can this be done?
In the extractor we first read a complete line from a std::istream using the function std::getline. After that we have a std::string containing "data fields", delimited by a space. The std::string needs to be split up and the contents of the "data fields" need to be stored.
The process of splitting up strings is also called tokenizing, and the content of a "data field" is also called a "token". C++ has a standard facility for this purpose: std::sregex_token_iterator.
And because we have something that has been designed for this purpose, we should use it.
This thing is an iterator for iterating over a std::string, hence the s in sregex. The first two arguments define the range of input we operate on, then comes a std::regex for what should be matched (or not matched) in the input string. The matching strategy is given with the last parameter:
0 --> give me the stuff that matches the regex (1, 2, ... would select a capture group) and
-1 --> give me what is NOT matched by the regex.
We can use this iterator for storing the tokens in a std::vector. The std::vector has a range constructor, which takes 2 iterators as parameters and copies the data between the first iterator and the 2nd iterator to the std::vector.
The statement
std::vector token(std::sregex_token_iterator(line.begin(), line.end(), separator, -1), {});
defines a variable "token", splits up the std::string and puts the tokens into the std::vector. Because the element type is deduced from the iterator, the elements are std::ssub_match objects, each of which converts implicitly to std::string. For your case we will use std::transform to change your strings into integers.
Very simple.
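As a minimal, stand-alone sketch of this tokenizing step (the helper name split_ints is made up for illustration; it assumes a non-empty line whose values are separated by exactly one space and are valid integers):

```cpp
#include <algorithm>
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Splits a line at single spaces and converts every token to int.
// std::stoi throws if a token is empty or not a number.
std::vector<int> split_ints(const std::string& line)
{
    static const std::regex separator(" ");
    std::vector<int> result;
    std::transform(
        // -1: deliver the parts between the separator matches
        std::sregex_token_iterator(line.begin(), line.end(), separator, -1),
        std::sregex_token_iterator(),  // default-constructed = end iterator
        std::back_inserter(result),
        [](const std::string& s) { return std::stoi(s); });
    return result;
}
```

Calling split_ints("5 6 7") yields a vector holding 5, 6 and 7. The full example further down performs the same transformation inside the extraction operator.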
Next step. We want to read from a file. The file also contains repeated data of the same kind: the rows.
And as above, we can iterate over such repeated data, whether it comes from a file or from anything else. For this purpose C++ has the std::istream_iterator. This is a template: as template parameter it gets the type of data that it should read and, as constructor parameter, a reference to an input stream. It doesn't matter whether the input stream is std::cin, a std::ifstream or a std::istringstream. The behaviour is identical for all kinds of streams.
And since we do not have files on SO, I use (in the example below) a std::istringstream to hold the input csv file. But of course you can open a file by defining a std::ifstream csvFile(filename). No problem.
We can now read the complete csv-file and split it into tokens and get all data, by simply defining a new variable and use again the range constructor.
Matrix matrix1( std::istream_iterator<ColumnProxy>(testCsv), {} );
This very simple one-liner will read the complete csv-file and do all the expected work.
Please note: I am using C++17 and can define the std::vector without template argument. The compiler can deduce the argument from the given function parameters. This feature is called CTAD ("class template argument deduction").
Additionally, you can see that I do not use the "end()" iterator explicitly.
This iterator will be constructed from the empty brace-enclosed initializer list with the correct type, because it will be deduced to be the same as the type of the first argument, as the std::vector constructor requires.
I hope I could answer your basic question. Please see the full-blown C++ example below:
#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <vector>
#include <iterator>
#include <regex>
#include <algorithm>
std::istringstream testCsv{ R"(1 2
3 4

5 6 7
8 9 10

)" };
// Define Alias for easier Reading
//using Columns = std::vector<std::string>;
using Columns = std::vector<int>;
using Matrix = std::vector<Columns>;
// The delimiter
const std::regex re(" ");
// Proxy for the input Iterator
struct ColumnProxy {
// Overload extractor. Read a complete line
friend std::istream& operator>>(std::istream& is, ColumnProxy& cp) {
// Read a line
cp.columns.clear();
if (std::string line; std::getline(is, line)) {
if (!line.empty()) {
// Split values and copy into resulting vector
std::transform(std::sregex_token_iterator(line.begin(), line.end(), re, -1),
std::sregex_token_iterator(),
std::back_inserter(cp.columns),
[](const std::string & s) {return std::stoi(s); });
}
else {
// Notify the caller. End of matrix
is.setstate(std::ios::eofbit | std::ios::failbit);
}
}
return is;
}
// Type cast operator overload. Cast the type 'ColumnProxy' to 'Columns' (std::vector<int>)
operator Columns() const { return columns; }
protected:
// Temporary to hold the read vector
Columns columns{};
};
int main()
{
// Define variable matrix with its range constructor. Read complete CSV in this statement, So, one liner
Matrix matrix1( std::istream_iterator<ColumnProxy>(testCsv), {} );
// Reset failbit and eofbit
testCsv.clear();
// Read 2nd matrix
Matrix matrix2(std::istream_iterator<ColumnProxy>(testCsv), {});
return 0;
}
Again:
What a pity that nobody will read this . . .

Reading key-value pairs as fast as possible in C++ from file

I have a file with roughly 2 million lines like this:
2s,3s,4s,5s,6s 100000
2s,3s,4s,5s,8s 101
2s,3s,4s,5s,9s 102
The first comma separated part indicates a poker result in Omaha, while the latter score is an example "value" of the cards. It is very important for me to read this file as fast as possible in C++, but I cannot seem to get it to be faster than a simple approach in Python (4.5 seconds) using the base library.
Using the Qt framework (QHash and QString), I was able to read the file in 2.5 seconds in release mode. However, I do not want to have the Qt dependency. The goal is to allow quick simulations using those 2 million lines, i.e. some_container["2s,3s,4s,5s,6s"] to yield 100 (though if applying a translation function or any non-readable format will allow for faster reading that's okay as well).
My current implementation is extremely slow (8 seconds!):
std::map<std::string, int> get_file_contents(const char *filename)
{
std::map<std::string, int> outcomes;
std::ifstream infile(filename);
std::string c;
int d;
while (infile.good())
{
infile >> c;
infile >> d;
//std::cout << c << d << std::endl;
outcomes[c] = d;
}
return outcomes;
}
What can I do to read this data into some kind of a key/value hash as fast as possible?
Note: The first 16 characters are always going to be there (the cards), while the score can go up to around 1 million.
Some further informations gathered from various comments:
sample file: http://pastebin.com/rB1hFViM
ram restrictions: 750MB
initialization time restriction: 5s
computation time per hand restriction: 0.5s
As I see it, there are two bottlenecks in your code.
1 Bottleneck
I believe that the file reading is the biggest problem there. Having a binary file is the fastest option. Not only can you read it directly into an array with a raw istream::read in a single operation (which is very fast), but you can even map the file into memory if your OS supports it. Here is a link that's very informative on how to use memory mapped files.
2 Bottleneck
The std::map is usually implemented with a self-balancing BST that stores all the data in order. This makes insertion an O(log n) operation. You can change it to std::unordered_map, which uses a hash table instead. A hash table has constant-time insertion if the number of collisions is low. As the number of elements that you need to read is known, you can reserve a suitable number of buckets before inserting the elements. Keep in mind that you need more buckets than the number of elements that will be inserted in order to keep the number of collisions low.
Ian Medeiros already mentioned the two major bottlenecks.
a few thoughts about data structures:
the number of different cards is known: 4 suits (colors) of 13 cards each -> 52 cards.
so 6 bits are enough to store a card. your file format currently uses 24 bits per card (including the comma).
so by simply enumerating the cards and omitting the comma you can save ~2/3 of the file size, and you can determine a card by reading only one character per card.
if you want to keep the file text based you may use a-m, n-z, A-M and N-Z for the four suits.
another thing that bugs me is the string based map. string operations are inefficient.
One hand contains 5 cards.
that means 52^5 possibilities if we keep it simple and do not consider the already drawn cards.
--> 52^5 = 380.204.032 < 2^32
that means we can enumerate every possible hand with a uint32 number. by defining a special sorting scheme for the cards (since order is irrelevant), we can assign a number to the hand and use this number as the key in our map, which is a lot faster than using strings.
if we have enough memory (1.5 GB) we do not even need a map but can simply use an array.
of course most cells are unused, but access may be very fast. we can even omit the ordering of the cards, since the cells are present whether we fill them or not. So we can use them. but in this case you should not forget to fill in all possible permutations of the hand read from the file.
with this scheme we may also be able to optimize the file reading speed further, if we only store the hand's number and the rating, so that only 2 values need to be parsed.
in fact we can reduce the required storage space by using a more complex addressing scheme for the different hands, since in reality there are only 52*51*50*49*48 = 311.875.200 possible hands. in addition, the ordering is irrelevant as mentioned, but i think that this saving is not worth the increased complexity of the encoding of the hands.
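The enumeration described above could be sketched as follows. The rank and suit alphabets are an assumption (only cards like "2s" are visible in the question), and the function names are made up for illustration:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Maps a two-character card such as "2s" to a number from 0 to 51.
// ASSUMPTION: ranks are "23456789TJQKA" and suits are "cdhs".
int card_index(char rank, char suit)
{
    static const std::string ranks = "23456789TJQKA";
    static const std::string suits = "cdhs";
    const std::size_t r = ranks.find(rank);
    const std::size_t s = suits.find(suit);
    if (r == std::string::npos || s == std::string::npos)
        throw std::invalid_argument("unknown card");
    return static_cast<int>(s * 13 + r);
}

// Encodes a hand like "2s,3s,4s,5s,6s" as a base-52 number below 52^5.
// The cards are sorted first, so the order in the text does not matter.
std::uint32_t encode_hand(const std::string& hand)
{
    if (hand.size() != 14) throw std::invalid_argument("expected 5 cards");
    std::array<int, 5> cards{};
    for (std::size_t i = 0; i < cards.size(); ++i)
        cards[i] = card_index(hand[3 * i], hand[3 * i + 1]);
    std::sort(cards.begin(), cards.end());
    std::uint32_t key = 0;
    for (int c : cards) key = key * 52 + static_cast<std::uint32_t>(c);
    return key;
}
```

The resulting number can serve as the key of an unordered_map or, given enough memory, as an index into a plain array.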
A simple idea might be to use the C API, which is considerably simpler:
#include <cstdio>
int n;
char s[128];
while (std::fscanf(stdin, "%127s %d", s, &n) == 2)
{
outcomes[s] = n;
}
A rough test showed a considerable speedup for me compared to the iostreams library.
Further speedups may be achieved by storing the data in a contiguous array, e.g. a vector of std::pair<std::string, int>; it depends on whether your data is already sorted and how you need to access it later.
For a serious solution, though, you should probably step back further and think of a better way to represent your data. For example, a fixed-width, binary encoding would be much more space-efficient and faster to parse, since you won't need to look ahead for line endings or parse strings.
Update: From some quick experimentation I've found it fairly fast to first read the entire file into memory and then perform alternating strtok calls with either " " or "\n" as the delimiter; whenever a pair of calls succeed, apply strtol on the second pointer to parse the integer. Here's a skeleton:
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>
int main()
{
std::vector<char> data;
// Read entire file to memory
{
data.reserve(100000000);
char buf[4096];
for (std::size_t n; (n = std::fread(buf, 1, sizeof buf, stdin)) > 0; )
{
data.insert(data.end(), buf, buf + n);
}
data.push_back('\0');
}
// Tokenize the in-memory data
char * p = &data.front();
for (char * q = std::strtok(p, " "); q; q = std::strtok(nullptr, " "))
{
if (char * r = std::strtok(nullptr, "\n"))
{
char * e;
errno = 0;
int const n = std::strtol(r, &e, 10);
if (*e != '\0' || errno != 0) { continue; }
// At this point we have data:
// * the string is "q"
// * the integer is "n"
}
}
}

Clarification required regarding Arrays, Vectors and Maps in usage of a C++ Application

I want to know the right algorithm and a container class for my application. I am trying to build one Client-Server communication system where the Server contains group of files (.txt). The file structure (prototype) is like:
A|B|C|D....|Z$(some integer value)#(some integer value). Again the contents of A to Z are a1_a2_a3_a4......aN|b1_b2_b3_b4......bN|......|z1_z2_z3_z4.....zN. So what I wanted to do is when Server application has started, it has to load these files one-by-one and save the contents of each file in a Container class and again the contents of the file into particular variables based on the delimiters i.e.
for (int i=0; i< (Number of files); i++)
{
1) Load the file[0] in Container class[0];
2) Read the Container class[0] search for occurences of delimiters "_" and "|"
3) Till next "|" occurs, save the value occurred at "_" to an array or variable (save it in a buffer)
4) Do this till the file length completes or reaches EOF
5) Next read the second file, save it in Container class[1] and follow the steps as in 2),3) and 4)
}
I want to know whether a vector or a map suits my requirement, as I need to search for occurrences of the delimiters, push_back the values and access them when needed.
Can I read a whole file as one block and work on the buffer, or should I push the values onto a stack while reading the file with seekg? Which one will be better and easier to implement? What are the possibilities of using regex?
According to the format of input, and its size, I'd suggest doing something along these lines for reading and parsing the input:
#include <cassert>
#include <istream>
#include <string>
#include <utility>
#include <vector>
void ParseOneFile (std::istream & inp)
{
    std::vector<std::vector<std::string>> data;
    int some_int_1 = 0, some_int_2 = 0;
    std::string temp;
    data.push_back ({});
    while (0 == 0)
    {
        int c = inp.get();
        if ('$' == c)
        {
            data.back().emplace_back (std::move(temp));
            break;
        }
        else if ('|' == c)
        {
            data.back().emplace_back (std::move(temp));
            temp.clear(); // reset the moved-from string before reusing it
            data.push_back ({});
        }
        else if ('_' == c)
        {
            data.back().emplace_back (std::move(temp));
            temp.clear();
        }
        else
            temp += char(c);
    }
    char sharp = 0;
    inp >> some_int_1 >> sharp >> some_int_2;
    assert ('#' == sharp);
    // Here, you have your data and your two integers...
}
The above function does not return the information it extracts, so you will want to change that. But it does read one of your files into a vector of vector of strings called data and two integers (some_int_1 and some_int_2.) It uses C++11 and does this reading and parsing quite efficiently, both in terms of processing and memory.
And, the above code does not check for any errors and inconsistent formatting in the input file.
Now, for your data structure problem. Since I have no idea about the nature of your data, I can't say for sure. All I can say is that a two-dimensional array and two integers on the side feels like a natural fit for this data. Since you have several files, you can store them all in another dimension of vector (or perhaps in a map, mapping a file name to a data structure) like the following:
struct OneFile
{
vector<vector<string>> data;
int i1, i2;
};
vector<OneFile> all_files;
// or...
// map<string, OneFile> all_files;
The above function would fill one instance of the OneFile struct above.
As an example, all_files[0].data[0][0] will be a string referring to data item A0 in the first file, and all_files[7].data[25][3] will be another string referring to data item Z3 in the 8th file.
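A variant of the parsing function that actually fills such a struct might look like the sketch below. The bool return value and the use of std::getline with custom delimiters are choices made for this illustration, not part of the code above:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct OneFile {
    std::vector<std::vector<std::string>> data;
    int i1 = 0, i2 = 0;
};

// Parses input such as "a1_a2|b1_b2$12#34" into `result`.
// Returns false if '$' or '#' is missing or the integers cannot be read.
bool ParseOneFile(std::istream& inp, OneFile& result)
{
    std::string body;
    if (!std::getline(inp, body, '$')) return false;
    if (inp.eof()) return false;  // reached the end without seeing '$'

    result.data.clear();
    std::istringstream groups(body);
    std::string group;
    while (std::getline(groups, group, '|')) {   // A, B, C, ...
        std::vector<std::string> items;
        std::istringstream fields(group);
        std::string item;
        while (std::getline(fields, item, '_'))  // a1, a2, a3, ...
            items.push_back(item);
        result.data.push_back(items);
    }

    char sharp = 0;
    if (!(inp >> result.i1 >> sharp >> result.i2)) return false;
    return sharp == '#';
}
```

In a loop over all files, each successfully parsed OneFile can then be appended to all_files with push_back.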

Very large look up table C++ - can I avoid typing the whole thing out?

I am not a programmer, but am an engineer who needs to use C++ coding on this occasion, so sorry if this question is a little basic.
I need to use a look up table as I have some highly non-linear dynamics going on that I need to model. It consists of literally 1000 paired values, from a pair of (0.022815, 0.7) up to (6.9453, 21.85).
I don't want to have to type all these values out in my C code. The values are currently stored in Matlab. Can I read them from a .dat file or something similar?
I will have calculated a value and simply want the program to kick out the paired value.
Thanks,
Adam
You can't read something stored in Matlab directly, unless you want to
write a parser for whatever format Matlab stores its data in. I'm not
familiar with Matlab, but I would be very surprised if it didn't have a
function to output this data to a file, in some text format, which you
could read and parse.
Assuming this is constant data, if it could output something along the
lines of:
{ 0.022815, 0.7 },
...
{ 6.9453, 21.85 },
you could include it as the initializer of a table in C++. (It may look
strange to have a #include in the middle of a variable definition, but
it's perfectly legal, and in such cases, perfectly justified.) Or just
copy/paste it into your C++ program.
If you can't get exactly this format directly, it should be trivial to
write a small script that would convert whatever format you get into
this format.
This program defines a map, reads from a.txt and inserts into the map, iterates over the map for whatever purpose you have, and finally writes the map to a file.
just a simple practice:
#include <fstream>
#include <iostream>
#include <map>
#include <utility>
using namespace std;
int main(){
    ifstream inFile("a.txt", ios::in);
    if (! inFile ){
        cout<<"unable to open";
        return 1;
    }
    //reading a file and inserting in a map
    map<double,double> mymap;
    double a,b;
    while( inFile>>a>>b ){ //stops at end of file or at the first unreadable value
        mymap.insert ( make_pair(a,b) );
    }
    inFile.close(); //be sure to close the file
    //iterating on map
    map<double,double>::iterator it;
    for ( it=mymap.begin() ; it != mymap.end(); it++ ){
        // (*it).first
        // (*it).second
    }
    //writing the map into a file
    ofstream outFile;
    outFile.open ("a.txt", ios::out); // or ios::app if you want to append
    for ( it=mymap.begin() ; it != mymap.end(); it++ ){
        outFile << (*it).first << " - " << (*it).second << endl; //what ever!
    }
    outFile.close();
    return 0;
}
What I would do is the following, as I think it is faster than opening and closing a file. First of all, create a header file which contains all the data in an array. You could use the "replace all" function available in Notepad or a similar editor to replace the ( ) parentheses with { } braces. Later on you could even write a script that generates the header file from the Matlab file.
>> cat import_data.h
#define TBL_SIZE 4 // In your case it is 1000
const double table[TBL_SIZE][2] =
{
{ 0.022815, 0.7 },
{ 6.9453, 21.85 },
{ 4.666, 565.9},
{ 567.9, 34.6}
};
Now in the main program you include this header also for the data
>> cat lookup.c
#include <stdio.h>
#include "import_data.h"
double lookup(double key)
{
int i=0;
for(;i<TBL_SIZE; i++) {
if(table[i][0] == key)
return table[i][1];
}
return -1; //error
}
int main() {
printf("1. Value is %f\n", lookup(6.9453));
printf("2. Value is %f\n", lookup(4.666));
printf("3. Value is %f\n", lookup(4.6));
return 0;
}
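The lookup above only finds keys that match a table entry exactly, which rarely happens with a calculated floating point value. A different technique, not taken from the answer above, is to keep the table sorted by key and interpolate linearly between the two neighbouring entries. A sketch under these assumptions (Pair and lookup_interp are illustrative names):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct Pair { double key; double value; };

// Linear interpolation in a table sorted by ascending key.
// Keys outside the table are clamped to the first or last value.
double lookup_interp(const Pair* table, std::size_t n, double key)
{
    assert(n > 0);
    const Pair* end = table + n;
    // First entry whose key is not less than the requested key.
    const Pair* hi = std::lower_bound(table, end, key,
        [](const Pair& p, double k) { return p.key < k; });
    if (hi == table) return table[0].value;
    if (hi == end)   return table[n - 1].value;
    const Pair* lo = hi - 1;
    const double t = (key - lo->key) / (hi->key - lo->key);
    return lo->value + t * (hi->value - lo->value);
}
```

The binary search also keeps the lookup at O(log n) instead of scanning all 1000 entries. Whether linear interpolation is accurate enough depends on how densely the non-linear curve was sampled.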
Yes, you can read them from the dat file. The question is, what format is the dat file? Once you know that, you want to use:
fopen
fread
fclose
for C and
ifstream
for C++ (or something similar).
The program still has to get those pairs from the file and load them in memory. You can loop through the lines in the file, parse the pairs and shove them in a std::map.
Something like this:
#include <fstream>
#include <map>
#include <string>
...
std::ifstream infile("yourdatfile.dat");
std::string str;
std::map<double, double> m; //use appropriate type(s)
while(getline(infile, str)){
//split str by comma or some delimiter and get the key, value
//put key, value in m
}
//use m
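Filled in, the loop could look like this sketch. It assumes one pair per line in the form key,value; the real layout depends on how the data was exported, and the function name load_table is made up:

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Reads lines of the form "key,value" into a map.
// Lines that do not match this form are skipped.
std::map<double, double> load_table(std::istream& in)
{
    std::map<double, double> m;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        double key, value;
        char comma;
        if (fields >> key >> comma >> value && comma == ',') {
            m[key] = value;
        }
    }
    return m;
}
```

It would be called with the opened file, for example load_table(infile). Since std::map keeps its keys sorted, m.lower_bound(x) afterwards gives the first entry whose key is not less than x.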
For the signal processing toolbox you can export data to C header files
directly from Matlab(don't know if it's your particular case):
Matlab export to C header
Or maybe the following article could be of help:
Exporting/Importing Data To/From MATLAB
One option is to generate the C++ lookup table in Matlab: read the table there and write it out as C++ source to a text file (lookup.cpp).

read in values and store in list in c++

i have a text file with data like the following:
name
weight
groupcode
name
weight
groupcode
name
weight
groupcode
now i want to write the data of all persons into an output file until the maximum weight of 10000 kg is reached.
currently i have this:
void loadData(){
ifstream readFile( "inFile.txt" );
if( !readFile.is_open() )
{
cout << "Cannot open file" << endl;
}
else
{
cout << "Open file" << endl;
}
char row[30]; // max length of a value
while(readFile.getline (row, 50))
{
cout << row << endl;
// how can i store the data into a list and also calculating the total weight?
}
readFile.close();
}
i work with visual studio 2010 professional!
because i am a c++ beginner there could be a better way! i am open to any ideas and suggestions
thanks in advance!
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <limits>
struct entry
{
entry()
: weight()
{ }
std::string name;
int weight; // kg
std::string group_code;
};
// content of data.txt
// (without leading space)
//
// John
// 80
// Wrestler
//
// Joe
// 75
// Cowboy
int main()
{
std::ifstream stream("data.txt");
if (stream)
{
std::vector<entry> entries;
const int limit_total_weight = 10000; // kg
int total_weight = 0; // kg
entry current;
while (std::getline(stream, current.name) &&
stream >> current.weight &&
stream.ignore(std::numeric_limits<std::streamsize>::max(), '\n') && // skip the rest of the line containing the weight
std::getline(stream, current.group_code))
{
entries.push_back(current);
total_weight += current.weight;
if (total_weight > limit_total_weight)
{
break;
}
// ignore empty line
stream.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}
}
else
{
std::cerr << "could not open the file" << std::endl;
}
}
Edit: Since you want to write the entries to a file, just stream out the entries instead of storing them in the vector. And of course you could overload the operator >> and operator << for the entry type.
Well here's a clue. Do you see the mismatch between your code and your problem description? In your problem description you have the data in groups of four lines, name, weight, groupcode, and a blank line. But in your code you only read one line each time round your loop, you should read four lines each time round your loop. So something like this
char name[30];
char weight[30];
char groupcode[30];
char blank[30];
while (readFile.getline (name, 30) &&
readFile.getline (weight, 30) &&
readFile.getline (groupcode, 30) &&
readFile.getline (blank, 30))
{
// now do something with name, weight and groupcode
}
Not perfect by a long way, but hopefully will get you started on the right track. Remember the structure of your code should match the structure of your problem description.
Have two file pointers, try reading input file and keep writing to o/p file. Meanwhile have a counter and keep incrementing with weight. When weight >= 10k, break the loop. By then you will have required data in o/p file.
Use this link for list of I/O APIs:
http://msdn.microsoft.com/en-us/library/aa364232(v=VS.85).aspx
If you want to struggle through things to build a working program on your own, read this. If you'd rather learn by example and study a strong example of C++ input/output, I'd definitely suggest poring over Simon's code.
First things first: You created a row buffer with 30 characters when you wrote, "char row[30];"
In the next line, you should change the readFile.getline(row, 50) call to readFile.getline(row, 30). Otherwise, it will try to read in 50 characters, and if someone has a name longer than 30, the memory past the buffer will become corrupted. So, that's a no-no. ;)
If you want to learn C++, I would strongly suggest that you use the standard library for I/O rather than the Microsoft-specific libraries that rplusg suggested. You're on the right track with ifstream and getline. If you want to learn pure C++, Simon has the right idea in his comment about switching out the character array for an std::string.
Anyway, john gave good advice about structuring your program around the problem description. As he said, you will want to read four lines with every iteration of the loop. When you read the weight line, you will want to find a way to get numerical output from it (if you're sticking with the character array, try http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/, or try http://www.cplusplus.com/reference/clibrary/cstdlib/atof/ for non-whole numbers). Then you can add that to a running weight total. Each iteration, output data to a file as required, and once your weight total >= 10000, that's when you know to break out of the loop.
However, you might not want to use getline inside of your while condition at all: Since you have to use getline four times each loop iteration, you would either have to use something similar to Simon's code or store your results in four separate buffers if you did it that way (otherwise, you won't have time to read the weight and print out the line before the next line is read in!).
Instead, you can also structure the loop to be while(total <= 10000) or something similar. In that case, you can use four sets of if(readFile.getline(row, 30)) inside of the loop, and you'll be able to read in the weight and print things out in between each set. The loop will end automatically after the iteration that pushes the total weight over 10000...but you should also break out of it if you reach the end of the file, or you'll be stuck in a loop for all eternity. :p
Good luck!