Reading data from a file using fstream object - c++

The format of the file is:
ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
32768
ITEM: BOX BOUNDS pp pp ff
0.0000000000000000e+00 3.2000000000000000e+01
0.0000000000000000e+00 3.2000000000000000e+01
0.0000000000000000e+00 3.2000000000000000e+01
ITEM: ATOMS type x y z
1 0.292418 1.13983 1.28999
......
I read the header of each time step into a dummy string and the time step value into an array. My code reads two time steps correctly (65554 lines), but then tellg() starts returning -1 and my output only repeats the last values read. Also, the loop never detects EOF, so my code runs forever.
#include <bits/stdc++.h>
using namespace std;
int main(int argc, char** argv)
{
string f = argv[1];int ens=1;string file;
fstream xyz("readas.xyz",ios_base::out); //to write data to cross-check what I read
fstream* Pxyz = &xyz;
float x,y,z,type;
int* time = (int*) malloc(100*sizeof(int));
int step;
string dummy;
while(ens < atoi(argv[2])) //this is to open different files to read
{
file=f+to_string(ens);
cout<<"Reading "+file<<endl;
fstream fobj(file,ios_base::in);
fstream* f= &fobj; //reading from this file
if(f->is_open())
{
*time=0;step=0;
while(true)
{
*f>>dummy>>dummy;
if(f->fail())break;
*f>>*(time+(++step));
*Pxyz << "32768" << "\n" << *(time+step) << endl;
*f>>dummy>>dummy>>dummy>>dummy;
*f>>dummy;
*f>>dummy>>dummy>>dummy>>dummy>>dummy>>dummy;
*f>>dummy>>dummy;
*f>>dummy>>dummy;
*f>>dummy>>dummy;
*f>>dummy>>dummy>>dummy>>dummy>>dummy>>dummy;
for(int i=0;i<32768;i++)
{
*f>>type>>x>>y>>z;
*Pxyz <<f->tellg() << " " << *(time+step) << " " << i << " " << type << " " << x << " " << y << " " << z << " " <<endl;
}
}
f->close(); //closing this file
}
++ens;
}
return 0;
}
The point at which the problem starts:
tellg time i and other values
1754887 10 32766 2 31.3309 31.9485 31.6061
1754914 10 32767 1 31.6358 31.1965 30.9986
32768
10
-1 10 0 0 31.6358 31.1965 30.9986
-1 10 1 0 31.6358 31.1965 30.9986
.......
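The output above (tellg() at -1, type printed as 0, and x, y, z frozen at their previous values) is what a stream in a failed state looks like: once one extraction fails, failbit stays set, later >> calls do nothing, and tellg() returns -1. What triggers the first failure in this particular file cannot be determined from the post. The sketch below only illustrates the stream-state behaviour and a per-line check; the helper names tellgAfterFailure and readAtom are hypothetical.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Reads whitespace-separated integers from `text` until an extraction fails,
// then returns what tellg() reports in that failed state.
// (Hypothetical helper, only to illustrate the behaviour.)
long long tellgAfterFailure(const std::string& text)
{
    std::istringstream in(text);
    int value = 0;
    while (in >> value) {
        // Each successful read leaves the stream usable.
    }
    // The loop ended because failbit is set (bad token or end of input).
    // In that state tellg() signals the error by returning pos_type(-1).
    return static_cast<std::streamoff>(in.tellg());
}

// Returns true only if all four fields of one atom line were extracted,
// so the caller can stop instead of reusing stale values.
bool readAtom(std::istream& in, float& type, float& x, float& y, float& z)
{
    return static_cast<bool>(in >> type >> x >> y >> z);
}
```

In the loop from the question, replacing the unchecked `*f>>type>>x>>y>>z;` with `if (!readAtom(*f, type, x, y, z)) break;` would stop at the first bad record rather than writing 32768 repeated lines, which makes the offending position in the file easier to find.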

Related

Reading certain Data from a txt file and ignore the rest

This a problem from introduction to C++ course. We need to read a Data from the file jan91.dat. Data looks like that.
1 59 26 43 .. .. .. .. ..
2 40 12 24 .. .. .. .. ..
3 21 14 18 .. .. .. .. ..
4 .. .. .. .. .. .. .. ..
The first number is the date; we want to read only the second and third numbers from each line.
The second number is the maximum temperature and the third is the minimum temperature. Later we may need to read other numbers in a line, such as humidity, but that is not required yet.
// This program determines the number of days in each of six
// temperature categories for the days of January 1991.
#include <iostream> // required for cin and cout
#include <iomanip> // required for set precision, setw
#include <fstream> // required for ifstream and ofstream
#include <string> // required for string
using namespace std; // standard library
int main()
{
// Name of Variables
int i, below0 = 0, from0to32 = 0, from33to75 = 0, above75 = 0;
double min_temp, max_temp, date = 0;
const string FILENAME = "jan91.dat";
// Open input file
ifstream jan91;
jan91.open(FILENAME.c_str());
if (jan91.fail()) {
cerr << "Error opening file" << endl;
exit(1);
}
// Read and check temperature per day
while (!jan91.eof()) {
(jan91 >> date >> max_temp >> min_temp);
if (min_temp < 0);
below0++;
if (max_temp > 75) {
from0to32++;
from33to75++;
above75++;
}
else if (max_temp > 33 && max_temp < 75) {
from0to32++;
from33to75++;
}
else if (max_temp > 0 && max_temp < 33) {
from0to32++;
}
}
// Print results
cout << "January of 1991" << endl;
cout << "Temperature Ranges \t Number of Days " << endl;
cout << "Below 0 \t \t\t" << below0 << endl;
cout << "0 to 32 \t \t\t" << from0to32 << endl;
cout << "33 to 75\t \t\t" << from33to75 << endl;
cout << "Above 75 \t \t\t" << above75 << endl;
// Close file
jan91.close();
// Exit program.
return 0;
}
Some people suggested using "getline", but for some reason I could not get it to work in Visual Studio 17.3.3.
Please advise.
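One way to read only the first three numbers of each line and ignore the rest is to read a whole line with std::getline (declared in <string>) and then extract from a std::istringstream built from that line. The following is a minimal sketch under that assumption; DayTemps and parseDay are hypothetical names, not part of the course code.

```cpp
#include <sstream>
#include <string>

// One day's record; only the fields the exercise needs so far.
struct DayTemps {
    int day = 0;
    double max_temp = 0.0;
    double min_temp = 0.0;
};

// Parses the first three numbers of a line and ignores whatever follows.
// Returns false if the line does not start with three numbers.
bool parseDay(const std::string& line, DayTemps& out)
{
    std::istringstream fields(line);
    return static_cast<bool>(fields >> out.day >> out.max_temp >> out.min_temp);
}
```

A reading loop could then look like `std::string line; while (std::getline(jan91, line)) { DayTemps d; if (parseDay(line, d)) { /* count d */ } }`. Driving the loop with getline instead of `!jan91.eof()` avoids processing a stale record after the last line. Note also that the posted code has a stray semicolon in `if (min_temp < 0);`, so `below0++` runs on every iteration.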

ta-lib c++, calculated macd not matched web

I'm using ta-lib c++ library to calculate MACD, but the result is totally different from what the website shows,
the real MACD is [444.39, 505.05, 248.02, -232.33, 100.39, -13.18],
but my result is [282.10, -74.12, -211.27, -460.82, -850.86]
I have set every MAType to TA_MAType_EMA, but the results still do not match.
#include <iostream>
#include <cassert>
#include <ta-lib/ta_libc.h>
using namespace std;
int main()
{
// init ta-lib context
TA_RetCode retcode;
retcode = TA_Initialize();
assert(retcode == TA_SUCCESS);
// compute MACD from the closing prices
TA_Real close_price_array[100] = { 37924.41, 40849.89, 37952.37, 36564.58, 36844.22, 34719.71, 33156.65, 32858.22,
34212.01, 37118.35, 31924.17, 30327.18, 31757.38, 34459.95, 31952.8 , 31876.57,
32457.32, 31392.34, 34183.43, 37328.12, 36408.31, 35732.04, 37460.76, 35627.27,
39551.87, 34677.01, 33834.78, 31580.01, 39674.77, 40513.11, 40829.87, 38950.0 ,
34555.33, 32091.45, 31737.83, 33506.67, 31695.17, 29190.91, 28779.14, 28153.95,
26617.04, 26911.93, 27360.51, 25625.24, 24019.43, 23230.15, 23450.3 , 23341.65,
23099.56, 23873.04, 23551.1 , 22553.6 , 23329.31, 20659.69, 19406.28, 19198.7 ,
19215.36, 18401.98, 18106.72, 18134.91, 18347.36, 18806.82, 19213.0 , 19126.33,
19107.67, 18945.51, 19533.84, 18891.06, 19265.5 , 19306.92, 18116.34, 17505.0 ,
16502.76, 16905.43, 19129.39, 19358.42, 18269.55, 18294.73, 18784.06, 18655.81,
18046.78, 17871.06, 17318.57, 16450.98, 16026.15, 15950.15, 16098.79, 16122.33,
15666.22, 15168.03, 15004.24, 15354.6 , 15342.63, 15411.23, 15077.18, 13911.95,
13708.92, 13492.15, 13797.96, 13854.39 };
TA_Real *p = close_price_array;
cout.precision(8);
TA_Integer out_begin = 0;
TA_Integer out_nb_element = 0;
TA_Real outMACD[100] = { 0 };
TA_Real outMACDSignal[100] = { 0 };
TA_Real outMACDHist[100] = { 0 };
retcode = TA_MACDEXT(0, 99,
&close_price_array[0],
12, TA_MAType_EMA ,
26, TA_MAType_EMA ,
9, TA_MAType_EMA ,
&out_begin, &out_nb_element,
outMACD, outMACDSignal, outMACDHist);
assert(retcode == TA_SUCCESS);
cout << "out_begin_index: " << out_begin << endl;
cout << "out_nb_element: " << out_nb_element << endl;
cout << "outMACD array: " << endl;
for (auto &i : outMACD)
cout << i << " ";
cout << endl;
cout << "outMACDSignal array: " << endl;
for (auto &i : outMACDSignal)
cout << i << " ";
cout << endl;
cout << "outMACDHist array: " << endl;
for (auto &i : outMACDHist)
cout << i << " ";
cout << endl;
retcode = TA_Shutdown();
assert(retcode == TA_SUCCESS);
return 0;
}
[After comparing TA-Lib results with Excel calculations]: In your Excel sheet, the 12-day EMA is calculated from the first day, so its first value is the average on the 12th day (8/13/2020), and the 26-day EMA is also calculated from the first day, so its first value is the average on the 26th day (26/13/2020). TA-Lib postpones the start of the 12-day EMA so that its first value falls on the same day as the first value of the 26-day EMA. That means the 12-day EMA is calculated from 8/16/2020 and its first value is the average on (26/13/2020), the same day as for the 26-day EMA. So to adjust your Excel sheet to match TA-Lib, copy the formula =AVERAGE(B19:B30) into cell C30.
Another point is that TA-Lib's MACD outputs three arrays at once: macd, signal and histogram. TA-Lib therefore starts the output at the point where it has meaningful values for all three arrays. So your results start not where the 26-day EMA can first be calculated, but where the signal can first be calculated (8 days later). You should therefore compare talib_macd[1] with Excel starting from cell E38 instead of E30.

C++ Reading back "incorrect" values from binary file?

The project I'm working on has a custom file format consisting of a header of a few different variables, followed by the pixel data. My colleagues have developed a GUI in which processing, writing, reading and displaying this file format works fine.
But my problem is that, while I helped write the code that writes the data to disk, I cannot read this kind of file myself and get satisfactory values back. I am able to read the first variable back (a char array) but not the value(s) that follow.
So the file format matches the following structure:
typedef struct {
char hxtLabel[8];
u64 hxtVersion;
int motorPositions[9];
int filePrefixLength;
char filePrefix[100];
..
} HxtBuffer;
In the code, I create an object of the above structure and then set these example values:
setLabel("MY_LABEL");
setFormatVersion(3);
setMotorPosition( 2109, 5438, 8767, 1234, 1022, 1033, 1044, 1055, 1066);
setFilePrefixLength(7);
setFilePrefix( string("prefix_"));
setDataTimeStamp( string("000000_000000"));
My code for opening the file:
// Open data file, binary mode, reading
ifstream datFile(aFileName.c_str(), ios::in | ios::binary);
if (!datFile.is_open()) {
cout << "readFile() ERROR: Failed to open file " << aFileName << endl;
return false;
}
// How large is the file?
datFile.seekg(0, datFile.end);
int length = datFile.tellg();
datFile.seekg(0, datFile.beg);
cout << "readFile() file " << setw(70) << aFileName << " is: " << setw(15) << length << " long\n";
// Allocate memory for buffer:
char * buffer = new char[length];
// Read data as one block:
datFile.read(buffer, length);
datFile.close();
/// Looking at the start of the buffer, I should be seeing "MY_LABEL"?
cout << "buffer: " << buffer << " " << *(buffer) << endl;
int* mSSX = reinterpret_cast<int*>(*(buffer+8));
int* mSSY = reinterpret_cast<int*>(&buffer+9);
int* mSSZ = reinterpret_cast<int*>(&buffer+10);
int* mSSROT = reinterpret_cast<int*>(&buffer+11);
int* mTimer = reinterpret_cast<int*>(&buffer+12);
int* mGALX = reinterpret_cast<int*>(&buffer+13);
int* mGALY = reinterpret_cast<int*>(&buffer+14);
int* mGALZ = reinterpret_cast<int*>(&buffer+15);
int* mGALROT = reinterpret_cast<int*>(&buffer+16);
int* filePrefixLength = reinterpret_cast<int*>(&buffer+17);
std::string filePrefix; std::string dataTimeStamp;
// Read file prefix character by character into stringstream object
std::stringstream ss;
char* cPointer = (char *)(buffer+18);
int k;
for(k = 0; k < *filePrefixLength; k++)
{
//read string
char c;
c = *cPointer;
ss << c;
cPointer++;
}
filePrefix = ss.str();
// Read timestamp character by character into stringstream object
std::stringstream timeStampStream;
/// Need not increment cPointer, already pointing # 1st char of timeStamp
for (int l= 0; l < 13; l++)
{
char c;
c = * cPointer;
timeStampStream << c;
}
dataTimeStamp = timeStampStream.str();
cout << 25 << endl;
cout << " mSSX: " << mSSX << " mSSY: " << mSSY << " mSSZ: " << mSSZ;
cout << " mSSROT: " << mSSROT << " mTimer: " << mTimer << " mGALX: " << mGALX;
cout << " mGALY: " << mGALY << " mGALZ: " << mGALZ << " mGALROT: " << mGALROT;
Finally, what I see is shown below. I added the 25 just to double-check that not everything was coming out in hexadecimal. As you can see, I am able to see the label "MY_LABEL" as expected. But the 9 motorPositions all come out looking suspiciously like addresses, not values. The file prefix and the data timestamp (which should be strings, or at least characters) are just empty.
buffer: MY_LABEL M
25
mSSX: 0000000000000003 mSSY: 00000000001BF618 mSSZ: 00000000001BF620 mSSROT: 00000000001BF628 mTimer: 00000000001BF630 mGALX: 00000000001BF638 mGALY: 00000000001BF640 mGALZ: 00000000001BF648 mGALROT: 00000000001BF650filePrefix: dataTimeStamp:
I'm sure the solution can't be too complicated, but I reached a stage where I had this just spinning and I cannot make sense of things.
Many thanks for reading this somewhat long post.
-- Edit--
I might hit the maximum length allowed for a post, but just in case, here is the code that generates the data I'm trying to read back:
bool writePixelOutput(string aOutputPixelFileName) {
// Write pixel histograms out to binary file
ofstream pixelFile;
pixelFile.open(aOutputPixelFileName.c_str(), ios::binary | ios::out | ios::trunc);
if (!pixelFile.is_open()) {
LOG(gLogConfig, logERROR) << "Failed to open output file " << aOutputPixelFileName;
return false;
}
// Write binary file header
string label("MY_LABEL");
pixelFile.write(label.c_str(), label.length());
pixelFile.write((const char*)&mFormatVersion, sizeof(u64));
// Include File Prefix/Motor Positions/Data Time Stamp - if format version > 1
if (mFormatVersion > 1)
{
pixelFile.write((const char*)&mSSX, sizeof(mSSX));
pixelFile.write((const char*)&mSSY, sizeof(mSSY));
pixelFile.write((const char*)&mSSZ, sizeof(mSSZ));
pixelFile.write((const char*)&mSSROT, sizeof(mSSROT));
pixelFile.write((const char*)&mTimer, sizeof(mTimer));
pixelFile.write((const char*)&mGALX, sizeof(mGALX));
pixelFile.write((const char*)&mGALY, sizeof(mGALY));
pixelFile.write((const char*)&mGALZ, sizeof(mGALZ));
pixelFile.write((const char*)&mGALROT, sizeof(mGALROT));
// Determine length of mFilePrefix string
int filePrefixSize = (int)mFilePrefix.size();
// Write prefix length, followed by prefix itself
pixelFile.write((const char*)&filePrefixSize, sizeof(filePrefixSize));
size_t prefixLen = 0;
if (mFormatVersion == 2) prefixLen = mFilePrefix.size();
else prefixLen = 100;
pixelFile.write(mFilePrefix.c_str(), prefixLen);
pixelFile.write(mDataTimeStamp.c_str(), mDataTimeStamp.size());
}
// Continue writing header information that is common to both format versions
pixelFile.write((const char*)&mRows, sizeof(mRows));
pixelFile.write((const char*)&mCols, sizeof(mCols));
pixelFile.write((const char*)&mHistoBins, sizeof(mHistoBins));
// Write the actual data - taken out for brevity's sake
// ..
pixelFile.close();
LOG(gLogConfig, logINFO) << "Written output histogram binary file " << aOutputPixelFileName;
return true;
}
-- Edit 2 (11:32 09/12/2015) --
Thank you for all the help, I'm closer to solving the issue now. Going with the answer from muelleth, I try:
/// Read into char buffer
char * buffer = new char[length];
datFile.read(buffer, length);// length determined by ifstream.seekg()
/// Let's try HxtBuffer
HxtBuffer *input = new HxtBuffer;
cout << "sizeof HxtBuffer: " << sizeof *input << endl;
memcpy(input, buffer, length);
I can then display the different struct variables:
qDebug() << "Slice BUFFER label " << QString::fromStdString(input->hxtLabel);
qDebug() << "Slice BUFFER version " << QString::number(input->hxtVersion);
qDebug() << "Slice BUFFER hxtPrefixLength " << QString::number(input->filePrefixLength);
for (int i = 0; i < 9; i++)
{
qDebug() << i << QString::number(input->motorPositions[i]);
}
qDebug() << "Slice BUFFER filePrefix " << QString::fromStdString(input->filePrefix);
qDebug() << "Slice BUFFER dataTimeStamp " << QString::fromStdString(input->dataTimeStamp);
qDebug() << "Slice BUFFER nRows " << QString::number(input->nRows);
qDebug() << "Slice BUFFER nCols " << QString::number(input->nCols);
qDebug() << "Slice BUFFER nBins " << QString::number(input->nBins);
The output is then mostly as expected:
Slice BUFFER label "MY_LABEL"
Slice BUFFER version "3"
Slice BUFFER hxtPrefixLength "2"
0 "2109"
1 "5438"
...
7 "1055"
8 "1066"
Slice BUFFER filePrefix "-1"
Slice BUFFER dataTimeStamp "000000_000000P"
Slice BUFFER nRows "20480"
Slice BUFFER nCols "256000"
Slice BUFFER nBins "0"
EXCEPT, dataTimeStamp, which is 13 chars long, displays instead 14 chars. The 3 variables that follow: nRows, nCols and nBins are then incorrect. (Should be nRows=80, nCols=80, nBins=1000). My guess is that the bits belonging to the 14th char of dataTimeStamp should be read along with nRows, and so cascade on to produce the correct nCols and nBins.
I have separately verified (not shown here) using qDebug that what I'm writing into the file, really are the values I expect, and their individual sizes.
I personally would try to read exactly the number of bytes your struct is from the file, i.e. something like
int length = sizeof(HxtBuffer);
and then simply use memcpy to assign a local structure from the read buffer:
HxtBuffer input;
memcpy(&input, buffer, length);
You can then access your data e.g. like:
std::cout << "Data: " << input.hxtLabel << std::endl;
Why do you read to buffer, instead of using the structure for reading?
HxtBuffer data;
datFile.read(reinterpret_cast<char *>(&data), sizeof data);
if(!datFile || datFile.gcount()!=sizeof data)
throw io_exception();
// Can use data.
If you want to read into a character buffer, then your way of getting the data out of it is just wrong. You probably want to do something like this.
char *buf_offset=buffer+8+sizeof(u64); // Skip label (8 chars) and version (int64)
int mSSX = *reinterpret_cast<int*>(buf_offset);
buf_offset+=sizeof(int);
int mSSY = *reinterpret_cast<int*>(buf_offset);
buf_offset+=sizeof(int);
int mSSZ = *reinterpret_cast<int*>(buf_offset);
/* etc. */
Or, a little better (provided you don't change the contents of the buffer).
int *ptr_motors=reinterpret_cast<int *>(buffer+8+sizeof(u64));
int &mSSX = ptr_motors[0];
int &mSSY = ptr_motors[1];
int &mSSZ = ptr_motors[2];
/* etc. */
Notice that I don't declare mSSX, mSSY etc. as pointers. Your code was printing them as addresses because you told the compiler that they were addresses (pointers).
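The offset-based reading above can also be done with memcpy at a running offset. That sidesteps two problems: the compiler may insert padding between struct members that the file does not contain (a likely reason a whole-struct memcpy goes wrong after the 13-byte timestamp), and casting an arbitrary buffer address to int* may be misaligned. The following is a minimal sketch; BufferReader is a hypothetical helper, not part of the project's code.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <type_traits>

// Sequential reader over a byte buffer. Each read copies bytes out with
// memcpy, so alignment and struct padding never come into play.
class BufferReader {
public:
    BufferReader(const char* data, std::size_t size) : data_(data), size_(size) {}

    // Copies sizeof(T) bytes at the current offset into a T and advances.
    template <typename T>
    T read()
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "read<T>() is only meant for plain data types");
        require(sizeof(T));
        T value;
        std::memcpy(&value, data_ + pos_, sizeof(T));
        pos_ += sizeof(T);
        return value;
    }

    // Reads exactly n bytes as a string (no terminating NUL expected).
    std::string readString(std::size_t n)
    {
        require(n);
        std::string s(data_ + pos_, n);
        pos_ += n;
        return s;
    }

    std::size_t position() const { return pos_; }

private:
    // Throws if fewer than n bytes remain.
    void require(std::size_t n) const
    {
        if (n > size_ - pos_) throw std::out_of_range("read past end of buffer");
    }

    const char* data_;
    std::size_t size_;
    std::size_t pos_ = 0;
};
```

Following the order used in writePixelOutput for format version 3, the calls would be readString(8) for the label, read<std::uint64_t>() for the version (assuming u64 is a 64-bit unsigned type), nine read<int>() calls for the motor positions, read<int>() for the prefix length, readString(100) for the prefix field and readString(13) for the timestamp. The types of mRows, mCols and mHistoBins are not shown in the post, so the template argument for those reads has to match whatever the writer uses.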

Values not written to vector

I'm trying to read pairs of values from a file in the constructor of an object.
The file looks like this:
4
1 1
2 2
3 3
4 4
The first number is number of pairs to read.
On some lines the values seem to have been written into the vector correctly. On the next they are gone. I am totally confused.
inline
BaseInterpolator::BaseInterpolator(std::string data_file_name)
{
std::ifstream in_file(data_file_name);
if (!in_file) {
std::cerr << "Can't open input file " << data_file_name << std::endl;
exit(EXIT_FAILURE);
}
size_t n;
in_file >> n;
xs_.reserve(n);
ys_.reserve(n);
size_t i = 0;
while(in_file >> xs_[i] >> ys_[i])
{
// this line prints correct values i.e. 1 1, 2 2, 3 3, 4 4
std::cout << xs_[i] << " " << ys_[i] << std::endl;
// this lines prints xs_.size() = 0
std::cout << "xs_.size() = " << xs_.size() << std::endl;
if(i + 1 < n)
i += 1;
else
break;
// this line prints 0 0, 0 0, 0 0
std::cout << xs_[i] << " " << ys_[i] << std::endl;
}
// this line prints correct values i.e. 4 4
std::cout << xs_[i] << " " << ys_[i] << std::endl;
// this lines prints xs_.size() = 0
std::cout << "xs_.size() = " << xs_.size() << std::endl;
}
The class is defined thus:
class BaseInterpolator
{
public:
~BaseInterpolator();
BaseInterpolator();
BaseInterpolator(std::vector<double> &xs, std::vector<double> &ys);
BaseInterpolator(std::string data_file_name);
virtual int interpolate(std::vector<double> &x, std::vector<double> &fx) = 0;
virtual int interpolate(std::string input_file_name,
std::string output_file_name) = 0;
protected:
std::vector<double> xs_;
std::vector<double> ys_;
};
You're experiencing undefined behaviour. It seems like it's half working, but that's twice as bad as not working at all.
The problem is this:
xs_.reserve(n);
ys_.reserve(n);
reserve() only allocates capacity; it does not create any elements, so size() stays 0.
Replace it with:
xs_.resize(n);
ys_.resize(n);
Now, xs_[i] with i < n is actually valid.
If in doubt, use xs_.at(i) instead of xs_[i]. It performs an additional bounds check, which saves you from debugging without knowing where to start.
You're using reserve(), which increases capacity (storage space), but does not increase the size of the vector (i.e. it does not add any objects into it). You should use resize() instead. This will take care of size() being 0.
You're printing the xs_[i] and ys_[i] after you increment i. It's natural those will be 0 (or perhaps a random value) - you haven't initialised them yet.
vector::reserve reserves space for later operations but doesn't change the size of the vector; you should use vector::resize.

How do I use a string in a struct instead of a char array for reading binary data

I am reading binary data into a struct, which is working just fine. Here is the code:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
struct TaqIdx {
char symbol[10];
int tdate;
int begrec;
int endrec;
}__attribute__((packed));
int main()
{
ifstream fin("T201010A.IDX", ios::in | ios::binary);
if(!fin) {
cout << "Cannot open file." << endl;
return 1;
}
int cnt = 0;
TaqIdx idx;
while(fin.read((char *) &idx,sizeof(idx))) {
if(!fin.good()) {
cout << "A file error occurred." << endl;
return 1;
}
idx.symbol[10] = '\0';
cout << "(" << idx.symbol << ", " << idx.tdate << ", "
<< idx.begrec << ", " << idx.endrec << ") "
<< cnt++ << endl;
}
fin.close();
return 0;
}
The first few lines of output are the following:
(A , 20100864, 1, 35981) 0
(AA , 20100864, 35982, 89091) 1
(AAPR , 20100864, 89092, 89093) 2
(AACC , 20100864, 89094, 89293) 3
(AADR , 20100864, 89294, 89301) 4
(AAI , 20100864, 89302, 99242) 5
(AAME , 20100864, 99243, 99252) 6
(AAN , 20100864, 99253, 102275) 7
(AANA , 20100864, 102276, 102280) 8
(AAON , 20100864, 102281, 102592) 9
My question is this: is it possible to replace the C-style character array in the structure with a C++ string? If so, can you provide an example of how I would do that. Many thanks!
The code appears to be designed to read data serialized in a particular binary format into the TaqIdx struct. You could certainly modify the reader to supply the data in a different form (including std::string), but you'd either have to rewrite the reader or convert the data after it has been loaded. Alternatively, you could use an entirely different format for the data, but that might not be compatible with the files you have.
I can't "comment everywhere" yet, so I apologize for this being kind of out of standard protocol around these parts. Why doesn't this break?
idx.symbol[10] = '\0';
The length of symbol is 10, so index 10 is one past the end of the array. With __attribute__((packed)) in there, won't that put a null byte into the first byte of tdate?
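The "convert after loading" option could look like the sketch below: keep the packed struct with the char array for the raw read, then build a std::string from exactly the 10 bytes of the field. This assumes the symbol is padded with spaces (as the output suggests) or NUL bytes; fixedToString and TaqRecord are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Builds a std::string from exactly n bytes of a fixed-width field and
// strips trailing spaces and NUL bytes used as padding.
std::string fixedToString(const char* raw, std::size_t n)
{
    std::string s(raw, n);
    const std::size_t last = s.find_last_not_of(std::string(" \0", 2));
    // npos means the field was all padding, so the result is empty.
    s.erase(last == std::string::npos ? 0 : last + 1);
    return s;
}

// In-memory record with a std::string, filled in after the raw read.
struct TaqRecord {
    std::string symbol;
    int tdate;
    int begrec;
    int endrec;
};
```

After each successful fin.read into idx, a record can be built with `TaqRecord rec{fixedToString(idx.symbol, sizeof idx.symbol), idx.tdate, idx.begrec, idx.endrec};`. Because the string is constructed from an explicit length, the `idx.symbol[10] = '\0';` line, which writes one element past the array, is no longer needed.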