Execution order of c++ - c++

I created a program that tests CArchive. I wanted to see how long it takes to save a million data points:
#include "stdafx.h"
#include "TestData.h"
#include <iostream>
#include <vector>
using namespace std;
void pause() {
cin.clear();
cout << endl << "Press any key to continue...";
cin.ignore();
}
int _tmain(int argc, _TCHAR* argv[])
{
int numOfPoint = 1000000;
printf("Starting test...\n\n");
vector<TestData>* dataPoints = new vector<TestData>();
printf("Creating %i points...\n", numOfPoint);
for (int i = 0; i < numOfPoint; i++)
{
TestData* dataPoint = new TestData();
dataPoints->push_back(*dataPoint);
}
printf("Finished creating points.\n\n");
printf("Creating archive...\n");
CFile* pFile = new CFile();
CFileException e;
TCHAR* fileName = _T("foo.dat");
ASSERT(pFile != NULL);
if (!pFile->Open(fileName, CFile::modeCreate | CFile::modeReadWrite | CFile::shareExclusive, &e))
{
return -1;
}
bool bReading = false;
CArchive* pArchive = NULL;
try
{
pFile->SeekToBegin();
UINT uMode = (bReading ? CArchive::load : CArchive::store);
pArchive = new CArchive(pFile, uMode);
ASSERT(pArchive != NULL);
}
catch (CException* pException)
{
return -2;
}
printf("Finished creating archive.\n\n");
//SERIALIZING DATA
printf("Serializing data...\n");
for (int i = 0; i < dataPoints->size(); i++)
{
dataPoints->at(i).serialize(pArchive);
}
printf("Finished serializing data.\n\n");
printf("Cleaning up...\n");
pArchive->Close();
delete pArchive;
pFile->Close();
delete pFile;
printf("Finished cleaning up.\n\n");
printf("Test Complete.\n");
pause();
return 0;
}
When I run this code, it takes some time to create the data points, but then it runs through the rest of the code almost instantly. However, I then have to wait about 4 minutes for the application to actually finish running. I would have expected the application to hang at the serializing step, just as it did while creating the data points.
So my question is about how this actually works. Does CArchive do its work on a separate thread and allow the rest of the code to execute?
I can provide more information if necessary.

If you want to create a vector with a million default-constructed elements, you can just use this version of the constructor (with parentheses, so that the size constructor is selected and the int is not treated as a braced initializer):
vector<TestData> dataPoints(numOfPoint);
You should stop newing everything; let RAII handle the cleanup for you.
Also, note that push_back has to reallocate the vector's storage when its capacity isn't large enough, so if you start with an empty vector and know how big it is going to be at the end, you can call reserve ahead of time.
vector<TestData> dataPoints;
dataPoints.reserve(numOfPoint);
for (int i = 0; i < numOfPoint; i++)
{
dataPoints.push_back(TestData{});
}
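To make the effect of reserve visible, here is a minimal sketch that counts how often the vector's capacity changes while elements are appended. TestData below is a hypothetical stand-in (the real class from the question is not shown), and count_reallocations is a name made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the TestData class from the question.
struct TestData {
    double x = 0.0;
    double y = 0.0;
};

// Appends `count` default-constructed elements and returns how many times
// the vector grew its storage (observed as a change in capacity()).
std::size_t count_reallocations(std::size_t count, bool reserve_first) {
    std::vector<TestData> points;      // owns its elements; no new/delete
    if (reserve_first) {
        points.reserve(count);         // one allocation up front
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = points.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        points.push_back(TestData{});
        if (points.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = points.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set to true the function returns 0, because the capacity never has to change. Without it, the count depends on the growth policy of the standard library in use.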

Related

Write/Read a stream of data (double) using named pipes in C++

I am trying to develop a little application in C++, within a Linux environment, which does the following:
1) gets a data stream (a series of arrays of doubles) from the output of a 'black-box' and writes it to a pipe. The black-box can be thought of as an ADC;
2) reads the data stream from the pipe and feeds it to another application which requires these data as stdin;
Unfortunately, I was not able to find tutorials or examples. The best way I found to realize this is summarized in the following test-bench example:
#include <iostream>
#include <fcntl.h>
#include <sys/stat.h>
#include <stdio.h>
#define FIFO "/tmp/data"
using namespace std;
int main() {
int fd;
int res = mkfifo(FIFO,0777);
float *writer = new float[10];
float *buffer = new float[10];
if( res == 0 ) {
cout<<"FIFO created"<<endl;
int fres = fork();
if( fres == -1 ) {
// throw an error
}
if( fres == 0 )
{
fd = open(FIFO, O_WRONLY);
int idx = 1;
while( idx <= 10) {
for(int i=0; i<10; i++) writer[i]=1*idx;
write(fd, writer, sizeof(writer)*10);
}
close(fd);
}
else
{
fd = open(FIFO, O_RDONLY);
while(1) {
read(fd, buffer, sizeof(buffer)*10);
for(int i=0; i<10; i++) printf("buf: %f",buffer[i]);
cout<<"\n"<<endl;
}
close(fd);
}
}
delete[] writer;
delete[] buffer;
}
The problem is that, when I run this example, I do not get a printout of all 10 arrays that I feed to the pipe; instead I keep getting the first array (filled with 1s) over and over.
Any suggestion/correction/reference is very welcome to make it work and learn more about the behavior of pipes.
EDIT:
Sorry guys! I found a very trivial error in my code: in the while loop within the writer part, I am not incrementing the index idx......once I correct it, I get the printout of all the arrays.
But now I am facing another problem: when using a lot of large arrays, these are randomly printed out (the whole sequence is not printed); as if the reader part is not able to cope with the speed of the writer. Here is the new sample code:
#include <iostream>
#include <fcntl.h>
#include <sys/stat.h>
#include <stdio.h>
#define FIFO "/tmp/data"
using namespace std;
int main(int argc, char** argv) {
int fd;
int res = mkfifo(FIFO,0777);
int N(1000);
float writer[N];
float buffer[N];
if( res == 0 ) {
cout<<"FIFO created"<<endl;
int fres = fork();
if( fres == -1 ) {
// throw an error
}
if( fres == 0 )
{
fd = open(FIFO, O_WRONLY | O_NONBLOCK);
int idx = 1;
while( idx <= 1000 ) {
for(int i=0; i<N; i++) writer[i]=1*idx;
write(fd, &writer, sizeof(float)*N);
idx++;
}
close(fd);
unlink(FIFO);
}
else
{
fd = open(FIFO, O_RDONLY);
while(1) {
int res = read(fd, &buffer, sizeof(float)*N);
if( res == 0 ) break;
for(int i=0; i<N; i++) printf(" buf: %f",buffer[i]);
cout<<"\n"<<endl;
}
close(fd);
}
}
}
Is there some mechanism I should implement to make write() wait while read() is still reading data from the FIFO, or am I missing something trivial in this case as well?
Thank you for those who have already given answers to the previous version of my question, I have implemented the suggestions.
The size arguments to read and write are incorrect: sizeof(writer) is the size of a pointer, not the size of one float. Correct ones:
write(fd, writer, 10 * sizeof *writer);
read(fd, buffer, 10 * sizeof *buffer);
Also, these functions may do partial reads/writes, so the code needs to check the return values to determine whether the operation must be continued.
I'm not sure why the writer uses a while( idx <= 10) loop: idx is never incremented, so this loop never ends. Even on a 5GHz CPU. The same applies to the reader, whose while(1) loop has no exit condition.
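The return-value checking mentioned above could look roughly like the following sketch. The helper names write_all and read_all are made up for this illustration and are not part of POSIX; the only system calls used are read and write.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Writes exactly `count` bytes unless an error occurs.
// Returns true on success, false on error (errno is left as set by write).
bool write_all(int fd, const void* data, std::size_t count) {
    const char* p = static_cast<const char*>(data);
    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return false;
        }
        p += n;                            // skip what was already written
        count -= static_cast<std::size_t>(n);
    }
    return true;
}

// Reads up to `count` bytes, looping over short reads.
// Returns the number of bytes read (less than `count` only at end of
// stream) or -1 on error.
ssize_t read_all(int fd, void* data, std::size_t count) {
    char* p = static_cast<char*>(data);
    std::size_t total = 0;
    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        if (n == 0) break;                 // the writer closed its end
        total += static_cast<std::size_t>(n);
    }
    return static_cast<ssize_t>(total);
}
```

In the question's code the calls would become write_all(fd, writer, N * sizeof *writer) and read_all(fd, buffer, N * sizeof *buffer). Note that the second listing opens the writing end with O_NONBLOCK; on a non-blocking descriptor write can fail with EAGAIN when the FIFO is full, and this sketch reports that as an error rather than waiting. Opening the FIFO without O_NONBLOCK makes write block until the reader has made room.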

Why am I segfaulting in this specific code?

This program is supposed to take a file name and argument(s) and create a process that executes the code while outputting the result to the terminal (which, for reasons I don't understand, isn't working either).
I have found that the seg fault is coming from my attempt to free the argvNew array of strings:
#include <iostream>
using namespace std;
#include <unistd.h>
#include <sys/wait.h>
main(int argc, char **argv){
int pid;
int i;
char *argvNew[argc-1];
do{
//Check for failure
if ((pid = fork()) < 0) {
cerr << "Fork error\n";
exit(1);
}
//Check if parent
else if (pid == 0) {
/* child process */
//Add arguments to new array
for(i = 0; i < argc-2; i++){
argvNew[i] = argv[i+1];
}
argvNew[argc-2] = NULL;
if (execvp(argv[1], argvNew) < 0) {
cerr << "Execve error\n";
exit(1);
}
}
else {
/* parent */
waitpid(pid, NULL, 0);/* wait for the child to finish */
//Free argvNew
for(i = 0; i < argc-1;i++){
free(argvNew[i]);
}
free(argvNew);
}
//if we're need to create a new list of args in the future put it here
}while(!argc);
}
test input: ./myProgram /bin/ls -l
argvNew is automatically allocated, which means that the resources it holds are released automatically when it goes out of scope. You only need to free dynamically allocated arrays:
char a[50]; // the [50] means automatic allocation
// ...
// no need to free
char* b = (char*)malloc(50); // dynamic allocation
// ...
// need to free later, or memory leak
free(b);
for(i = 0; i < argc-1;i++){
free(argvNew[i]);
}
//This next call passes the address of the stack array itself (the same
//address as &argvNew[0]) to free; it never came from malloc, so this is
//probably where you're segfaulting
free(argvNew);
Note that you don't need to call free() at all: argvNew[] contains pointers that do not point to new/malloc'ed data, but rather to the strings of the argv array, which are set up for the process at startup and should not be explicitly free()'d by you. In the parent the elements are not even initialized, because only the child's copy of the array is filled in after fork().
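As a sketch of how the argument list could be built without any manual memory management, the helper below (build_exec_args is a name made up for this illustration) copies the pointers into a std::vector. It also copies every argument after the program name. The loop in the question stops at argc-2, so with the test input it leaves "-l" out of the list handed to execvp.

```cpp
#include <vector>

// Builds the argument list for execvp from main's argv: everything after
// the program's own name, followed by the terminating null pointer.
// The vector owns only the array of pointers; the strings still belong to
// argv, so nothing here is ever passed to free().
std::vector<char*> build_exec_args(int argc, char** argv) {
    std::vector<char*> args;
    for (int i = 1; i < argc; ++i) {
        args.push_back(argv[i]);
    }
    args.push_back(nullptr);   // execvp requires a null-terminated array
    return args;
}
```

In the child branch this would be used as auto args = build_exec_args(argc, argv); followed by execvp(args[0], args.data());, after checking that argc is at least 2. The vector releases its storage on its own when it goes out of scope.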

Why does Windows task manager show memory increases while writing very large files? Should I be worried? [closed]

I am developing a C++ application (in VS2012, Windows Server 2012 R2) that writes large volumes of binary data, from cyclical arrays of buffers that have been allocated, to raw files. The thing is that system RAM usage as reported by Windows Task Manager increases at a linear rate as fwrite writes the data to the files, until it reaches a certain point where it remains almost constant (also see the following image). Also, the memory used by my application remains constant the whole time.
I call fflush periodically and it has no effect. Although it seems to be a harmless case, I am concerned about this issue in terms of performance, as a Java application will also be running during normal operation.
Therefore, I would like to ask whether I should worry about this and whether there is a way to avoid it, in order to achieve the best performance for a real-time data recording system.
Similar questions have been asked here and here for linux operating systems and it has been said that the system can devote an amount of memory for caching the data, as long as there is enough memory available.
A part of the application is presented next. In short, the application controls a pair of cameras and each of them acquires frames and stores them in properly aligned allocated buffers. There are i) a CameraInterface class, which creates two "producer" threads, ii) a Recorder class, which creates two "consumer" threads and iii) a SharedMemoryManager class that provides a producer with an available buffer for storing data and a consumer with the next buffer to be written to the file. The SharedMemoryManager holds two arrays of buffers (one for each producer-consumer pair) and two respective arrays of flags that indicate the status of each buffer. It also holds two std::queue objects for quick access to the next buffers to be recorded. Parts of the Recorder and the SharedMemoryManager are shown next.
// somewhere in file "atcore.h"...
typedef unsigned char AT_U8;
// File: SharedMemoryManager.h
#ifndef __MEM_MANAGER__
#define __MEM_MANAGER__
#pragma once
#include "atcore.h"
#include <queue>
#include <mutex>
#define NBUFFERS 128
#define BUFFER_AVAILABLE 0
#define BUFFER_QUEUED 1
#define BUFFER_FULL 2
#define BUFFER_RECORDING_PENDING 3
// the status flag cycle is
// EMPTY -> QUEUED -> FULL -> RECORDING_PENDING -> EMPTY
using namespace std;
typedef struct{
AT_U8** buffers;
int* flags;
int acquiredCounter;
int consumedCounter;
int queuedCounter;
mutex flagMtx;
} sharedMemory;
typedef struct{
AT_U8* buffer;
int bufSize;
int index;
} record;
class SharedMemoryManager
{
public:
SharedMemoryManager();
~SharedMemoryManager(void);
void enableRecording();
void disableRecording();
int setupMemory(int cameraIdentifier, int bufferSize);
void freeMemory();
void freeCameraMemory(int cameraIdentifier);
int getBufferSize(int cameraIdentifier);
AT_U8* getBufferForCameraQueue(int cameraIdentifier); // get pointer to the next available buffer for queueing in the camera
int hasFramesForRecording(int cameraIdentifier); // ask how many frames for recording are there in the respective queue
AT_U8* getNextFrameForRecording(int cameraIdentifier); // get pointer to the next buffer to be recorded to a file
void copyMostRecentFrame(unsigned char* buffer, int cameraIdentifier); // TODO // get a copy of the most recent frame on the buffer
void notifyAcquiredFrame(AT_U8* buffer, int bufSize, int cameraIdentifier); // use this function to notify the manager that the buffer has just been filled with data
void notifyRecordedFrame(AT_U8* buffer, int cameraIdentifier); // use this function to notify the manager that the buffer has just been written to file and can be used again
private:
bool useMem0, useMem1;
int bufSize0, bufSize1;
sharedMemory* memory0;
sharedMemory* memory1;
queue<record*> framesForRecording0;
queue<record*> framesForRecording1;
bool isRecording;
int allocateBuffers(sharedMemory* mem, int bufSize);
void freeBufferArray(sharedMemory* mem);
};
#endif // !__MEM_MANAGER
// File: SharedMemoryManager.cpp
...
int SharedMemoryManager::hasFramesForRecording(int cameraIdentifier){
if (cameraIdentifier!=0 && cameraIdentifier!=1){
cout << "Could not get the number of frames in the shared memory. Invalid camera id " << cameraIdentifier << endl;
return -1;
}
if (cameraIdentifier==0){
return (int)framesForRecording0.size();
}
else{
return (int)framesForRecording1.size();
}
}
AT_U8* SharedMemoryManager::getNextFrameForRecording(int cameraIdentifier){
if (cameraIdentifier!=0 && cameraIdentifier!=1){
cout << "Error in getNextFrameForRecording. Invalid camera id " << cameraIdentifier << endl;
return NULL;
}
sharedMemory* mem;
if (cameraIdentifier==0) mem=memory0;
else mem=memory1;
queue<record*>* framesQueuePtr;
if (cameraIdentifier==0) framesQueuePtr = &framesForRecording0;
else framesQueuePtr = &framesForRecording1;
if (framesQueuePtr->empty()){ // no frames to be recorded at the moment
return NULL;
}
record* item;
int idx;
AT_U8* buffer = NULL;
item = framesQueuePtr->front();
framesQueuePtr->pop();
idx = item->index;
delete item;
mem->flagMtx.lock();
if (mem->flags[idx] == BUFFER_FULL){
mem->flags[idx] = BUFFER_RECORDING_PENDING;
buffer = mem->buffers[idx];
}
else{
cout << "PROBLEM. Buffer in getBufferForRecording. Buffer flag is " << mem->flags[idx] << endl;
cout << "----- BUFFER FLAGS -----" << endl;
for (int i=0; i<NBUFFERS; i++){
cout << "[" << i << "] " << mem->flags[i] << endl;
}
cout << "----- -----" << endl;
}
mem->flagMtx.unlock();
return buffer;
}
int SharedMemoryManager::allocateBuffers(sharedMemory* mem, int bufSize){
// allocate the array for the buffers
mem->buffers = (AT_U8**)calloc(NBUFFERS,sizeof(AT_U8*));
if (mem->buffers==NULL){
cout << "Could not allocate array of buffers." << endl;
return -1;
}
// allocate the array for the respective flags
mem->flags = (int*)malloc(NBUFFERS*sizeof(int));
if (mem->flags==NULL){
cout << "Could not allocate array of flags for the buffers." << endl;
free(mem->buffers);
return -1;
}
int i;
for (i=0; i<NBUFFERS; i++){ // allocate the buffers
mem->buffers[i] = (AT_U8*)_aligned_malloc((size_t)bufSize,8);
if (mem->buffers[i] == NULL){
cout << "Could not allocate memory for buffer no. " << i << endl;
for (int j=0; j<i; j++){ // free the previously allocated buffers
_aligned_free(mem->buffers[j]);
}
free(mem->buffers);
free(mem->flags);
return -1;
}
else{
mem->flags[i]=BUFFER_AVAILABLE;
}
}
return 0;
}
void SharedMemoryManager::freeBufferArray(sharedMemory* mem){
if (mem!=NULL){
for(int i=0; i<NBUFFERS; i++){
_aligned_free(mem->buffers[i]);
mem->buffers[i]=NULL;
}
free(mem->buffers);
mem->buffers = NULL;
free(mem->flags);
mem->flags = NULL;
free(mem);
mem = NULL;
}
}
// File: Recorder.h
#ifndef __RECORDER__
#define __RECORDER__
#pragma once
#include <string>
#include <queue>
#include <future>
#include <thread>
#include "atcore.h"
#include "SharedMemoryManager.h"
using namespace std;
class Recorder
{
public:
Recorder(SharedMemoryManager* memoryManager);
~Recorder();
void recordBuffer(AT_U8 *buffer, int bufsize);
int setupRecording(string filename0, string filename1, bool open0, bool open1);
void startRecording();
void stopRecording();
int testWriteSpeed(string directoryPath, string filename);
void insertFrameItem(AT_U8* buffer, int bufSize, int chunkID);
private:
FILE *chunk0, *chunk1;
string chunkFilename0, chunkFilename1;
int frameCounter0, frameCounter1;
bool writes0, writes1;
int bufSize0, bufSize1;
static SharedMemoryManager* manager;
bool isRecording;
promise<int> prom0;
promise<int> prom1;
thread* recordingThread0;
thread* recordingThread1;
static void performRecording(promise<int>* exitCode, int chunkIdentifier);
void writeNextItem(int chunkIdentifier);
void closeFiles();
};
#endif //!__RECORDER__
// File: Recorder.cpp
#include "Recorder.h"
#include <ctime>
#include <iostream>
using namespace std;
Recorder* recorderInstance; // keep a pointer to the current instance, for accessing static functions from (non-static) objects in the threads
SharedMemoryManager* Recorder::manager; // the same reason
...
void Recorder::startRecording(){
if (isRecording == false){ // do not start new threads if some are still running
isRecording = true;
if (writes0==true) recordingThread0 = new thread(&Recorder::performRecording, &prom0, 0);
if (writes1==true) recordingThread1 = new thread(&Recorder::performRecording, &prom1, 1);
}
}
void Recorder::writeNextItem(int chunkIdentifier){
FILE* chunk;
AT_U8* buffer;
int* bufSize;
if (chunkIdentifier==0){
chunk = chunk0;
bufSize = &bufSize0;
buffer = manager->getNextFrameForRecording(0);
}
else {
chunk = chunk1;
bufSize = &bufSize1;
buffer = manager->getNextFrameForRecording(1);
}
size_t nbytes = fwrite(buffer, 1, (*bufSize)*sizeof(unsigned char), chunk);
if (nbytes<=0){
cout << "No data were written to file." << endl;
}
manager->notifyRecordedFrame(buffer,chunkIdentifier);
if (chunkIdentifier==0) frameCounter0++;
else frameCounter1++;
}
void Recorder::performRecording(promise<int>* exitCode, int chunkIdentifier){
bool flag = true;
int remaining = manager->hasFramesForRecording(chunkIdentifier);
while( recorderInstance->isRecording==true || remaining>0 ){
if (remaining>0){
if (recorderInstance->isRecording==false){
cout << "Acquisition stopped, still " << remaining << " frames are to be recorded in chunk " << chunkIdentifier << endl;
}
recorderInstance->writeNextItem(chunkIdentifier);
}
else{
this_thread::sleep_for(chrono::milliseconds(10));
}
remaining = manager->hasFramesForRecording(chunkIdentifier);
}
cout << "Done recording." << endl;
}
In the Windows memory use screen shot you show, the biggest chunk (45GB) is "cached" of which 27GB is "modified", meaning "dirty pages waiting to be written to disk". This is normal behavior because you are writing faster than the disk I/O can keep up. flush/fflush has no effect on this because it is not in your process. As you note: "the memory used by my application remains constant the whole time". Do not be concerned. However, if you really don't want the OS to buffer dirty output pages, consider using "unbuffered I/O" available on Windows, as it will write through immediately to disk.
Edit: Some links to unbuffered I/O on Windows. Note that unbuffered I/O places memory-alignment constraints on your reads and writes.
File Buffering
CreateFile function
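The alignment constraints mentioned above mean that, with unbuffered I/O, transfer sizes and file offsets have to be multiples of the volume's sector size, and the buffers have to be aligned accordingly (the 8-byte alignment passed to _aligned_malloc in the question would not be enough). A minimal sketch of the rounding involved follows; align_up and kAssumedSectorSize are names made up for this illustration, and the value 4096 is only an assumption, since real code should query the sector size of the target volume.

```cpp
#include <cstddef>

// Rounds `value` up to the next multiple of `alignment`.
// `alignment` must be non-zero; for unbuffered I/O it would be the
// sector size of the volume being written to.
constexpr std::size_t align_up(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Assumed sector size, for illustration only; query the volume instead.
constexpr std::size_t kAssumedSectorSize = 4096;
```

A frame buffer of bufSize bytes would then be allocated and written as align_up(bufSize, kAssumedSectorSize) bytes, with the padding ignored when the file is read back.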

boost::interprocess message queue Race Condition on creation?

I am trying to debug sporadic access violations that occur inside a boost::interprocess message queue. (access violation reading an address in the shared memory region).
Environment: boost 1.54, VC++2010. Occurs in both Debug & Release builds.
It always occurs on or about line 854 (in case of reception) in message_queue.hpp:
Comments were added by me
recvd_size = top_msg.len; // top_msg points to invalid location
Or line 756 (in case of sending)
BOOST_ASSERT(free_msg_hdr.priority == 0); // free_msg_hdr points to invalid location
It appears as though this is related to the message queue creation. If a message queue is created "properly" (i.e. without the possible race condition), the error never occurs.
Otherwise it might occur on timed_receive() or timed_send() on the queue at seemingly random times.
I came up with a short example that represents the problem:
Unfortunately I cannot run it on Coliru, since it requires two processes.
One has to be started without any parameters, the second with any single parameter.
After a number of runs, one of the processes will crash in message_queue.
#include <iostream>
#include <boost/interprocess/ipc/message_queue.hpp>
#include <boost/thread.hpp>
#include <boost/assert.hpp>
#include <boost/date_time.hpp>
using namespace boost::interprocess;
using namespace boost::posix_time;
using boost::posix_time::microsec_clock; // microsec_clock is ambiguous between boost::posix_time and boost::interprocess. What are the odds?
int main(int argc, wchar_t** argv)
{
while(true)
{
int proc = 0;
message_queue* queues[2] = {NULL, NULL};
std::string names[] = {"msgq0", "msgq1"};
if(1 == argc)
{
proc = 0;
message_queue::remove(names[0].c_str());
if(NULL != queues[0]) { delete queues[0]; queues[0] = NULL; }
queues[0] = new message_queue(open_or_create, names[0].c_str(), 128, 10240);
bool bRet = false;
do
{
try
{
if(NULL != queues[1]) { delete queues[1]; queues[1] = NULL; }
queues[1]=new message_queue(open_only, names[1].c_str());
bRet = true;
}
catch(const interprocess_exception&)
{
//boost::this_thread::sleep(boost::posix_time::milliseconds(2));
delete queues[1];
queues[1] = NULL;
continue;
}
}while(!bRet);
}
else
{
proc = 1;
message_queue::remove(names[1].c_str());
if(NULL != queues[1]) { delete queues[1]; queues[1] = NULL; }
queues[1] = new message_queue(open_or_create, names[1].c_str(), 128, 10240);
bool bRet = false;
do
{
try
{
if(NULL != queues[0]) { delete queues[0]; queues[0] = NULL; }
queues[0]=new message_queue(open_only, names[0].c_str());
bRet = true;
}
catch(const interprocess_exception&)
{
//boost::this_thread::sleep(boost::posix_time::milliseconds(2));
delete queues[0];
queues[0] = NULL;
continue;
}
}while(!bRet);
}
long long nCnt = 0;
for(int i = 0; i < 1; ++i)
{
if(proc)
{
std::string sOut;
sOut = "Proc1 says: Hello ProcA " + std::to_string(nCnt) + " ";
sOut.resize(10230, ':');
for(int n = 0; n < 3; ++n)
{
queues[1]->timed_send(sOut.data(), sOut.size(), 0, ptime(boost::posix_time::microsec_clock::universal_time()) + milliseconds(1));
}
bool bMessage = false;
for(int n = 0; n < 3; ++n)
{
size_t nRec; unsigned int nPrio;
std::string sIn; sIn.resize(10240);
bMessage = queues[0]->timed_receive(&sIn[0], 10240, nRec, nPrio, ptime(boost::posix_time::microsec_clock::universal_time()) + milliseconds(1));
if(bMessage)
{
sIn.resize(nRec);
//std::cout << sIn << " ";
}
}
if(bMessage)
{
//std::cout << std::endl;
}
}
else
{
std::string sOut;
sOut = "Proc0 says: Hello Procccccccdadae4325a " + std::to_string(nCnt);
sOut.resize(10240, '.');
for(int n = 0; n < 3; ++n)
{
queues[0]->timed_send(sOut.data(), sOut.size(), 0, ptime(boost::posix_time::microsec_clock::universal_time()) + milliseconds(1));
}
bool bMessage = false;
for(int n = 0; n < 3; ++n)
{
size_t nRec; unsigned int nPrio;
std::string sIn; sIn.resize(10240);
bMessage = queues[1]->timed_receive(&sIn[0], 10240, nRec, nPrio, ptime(boost::posix_time::microsec_clock::universal_time()) + milliseconds(1));
if(bMessage)
{
sIn.resize(nRec);
//std::cout << sIn << " ";
}
}
if(bMessage)
{
//std::cout << std::endl;
}
}
nCnt++;
boost::this_thread::sleep(boost::posix_time::milliseconds(10));
}
}
return 0;
}
I am still thinking I might be doing something wrong, since I cannot find anything about this problem anywhere else, and the boost libraries are normally very good.
Is there anything I might be doing wrong with the usage of the message_queue in this example?
I don't think that both processes using open_or_create is a supported idiom. Are you aware of this thread on the mailing list? I can't find more discussions so it looks to me like lifetime management wasn't eventually considered necessary to add.
Thus you'll need to synchronise the creation manually with boost::interprocess, or possibly have one of the processes retry open_only on the queue until the other process has created it.
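The retry approach can be sketched generically, without depending on Boost. retry_with_delay is a hypothetical helper name, and the callable passed to it is assumed to wrap the open_only attempt from the question, returning false when interprocess_exception is caught.

```cpp
#include <chrono>
#include <thread>

// Calls `attempt` until it returns true or `max_attempts` calls were made,
// sleeping `delay` between failed attempts. Returns whether it succeeded.
template <typename Attempt>
bool retry_with_delay(Attempt attempt, int max_attempts,
                      std::chrono::milliseconds delay) {
    for (int i = 0; i < max_attempts; ++i) {
        if (attempt()) {
            return true;
        }
        if (i + 1 < max_attempts) {
            std::this_thread::sleep_for(delay);  // give the creator time
        }
    }
    return false;
}
```

Compared with the busy loop in the question, sleeping between attempts and capping their number gives the creating process time to finish constructing the queue and lets the opener give up cleanly if the creator never starts.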

Why is Windows C++ multi-threading IOPS much faster than IOMeter?

I have an SSD and I am trying to use it to simulate my program's I/O performance. However, the IOPS calculated from my program is much, much higher than what IOMeter reports.
My SSD is a PLEXTOR PX-128M3S; according to IOMeter, its max 512B random read IOPS is around 94k (queue depth is 32).
However, my program (32 Windows threads) can reach around 500k 512B IOPS, around 5 times what IOMeter reports! I did data validation but didn't find any error in data fetching. Is it because my data fetches are in order?
I paste my code below (it mainly fetches 512B from the file and releases it; I did use 4 bytes (an int) to validate the program logic and didn't find a problem). Can anybody help me figure out where I am going wrong?
Thanks so much in advance!!
#include <stdio.h>
#include <Windows.h>
//Global variables
long completeIOs = 0;
long completeBytes = 0;
int threadCount = 32;
unsigned long long length = 1073741824; //4G test file
int interval = 1024;
int resultArrayLen = 320000;
int *result = new int[resultArrayLen];
//Method declarations
double GetSecs(void); //Calculate out duration
int InitPool(long long,char*,int); //Initialize test data for testing, if successful, return 1; otherwise, return a non 1 value.
int * FileRead(char * path);
unsigned int DataVerification(int*, int sampleItem); //Verify data fetched from pool
int main()
{
int sampleItem = 0x1;
char * fPath = "G:\\workspace\\4G.bin";
unsigned int invalidIO = 0;
if (InitPool(length,fPath,sampleItem)!= 1)
printf("File write err... \n");
//start do random I/Os from initialized file
double start = GetSecs();
int * fetchResult = FileRead(fPath);
double end = GetSecs();
printf("File read IOPS is %.4f per second.. \n",completeIOs/(end - start));
//start data validation, for 4 bytes fetch only
// invalidIO = DataVerification(fetchResult,sampleItem);
// if (invalidIO !=0)
// {
// printf("Total invalid data fetch IOs are %d", invalidIO);
// }
return 0;
}
int InitPool(long long length, char* path, int sample)
{
printf("Start initializing test data ... \n");
FILE * fp = fopen(path,"wb");
if (fp == NULL)
{
printf("file open err... \n");
exit (-1);
}
else //initialize file for testing
{
fseek(fp,0L,SEEK_SET);
for (int i=0; i<length; i++)
{
fwrite(&sample,sizeof(int),1,fp);
}
fclose(fp);
fp = NULL;
printf("Data initialization is complete...\n");
return 1;
}
}
double GetSecs(void)
{
LARGE_INTEGER frequency;
LARGE_INTEGER start;
if(! QueryPerformanceFrequency(&frequency))
printf("QueryPerformanceFrequency Failed\n");
if(! QueryPerformanceCounter(&start))
printf("QueryPerformanceCounter Failed\n");
return ((double)start.QuadPart/(double)frequency.QuadPart);
}
class input
{
public:
char *path;
int starting;
input (int st, char * filePath):starting(st),path(filePath){}
};
//Workers
DWORD WINAPI FileReadThreadEntry(LPVOID lpThreadParameter)
{
input * in = (input*) lpThreadParameter;
char* path = in->path;
FILE * fp = fopen(path,"rb");
int sPos = in->starting;
// int * result = in->r;
if(fp != NULL)
{
fpos_t pos;
for (int i=0; i<resultArrayLen/threadCount;i++)
{
pos = i * interval;
fsetpos(fp,&pos);
//For 512 bytes fetch each time
unsigned char *c =new unsigned char [512];
if (fread(c,512,1,fp) ==1)
{
InterlockedIncrement(&completeIOs);
delete[] c;
}
//For 4 bytes fetch each time
/*if (fread(&result[sPos + i],sizeof(int),1,fp) ==1)
{
InterlockedIncrement(&completeIOs);
}*/
else
{
printf("file read err...\n");
exit(-1);
}
}
fclose(fp);
fp = NULL;
}
else
{
printf("File open err... \n");
exit(-1);
}
}
int * FileRead(char * p)
{
printf("Starting reading file ... \n");
HANDLE mWorkThread[256]; //max 256 threads
completeIOs = 0;
int slice = int (resultArrayLen/threadCount);
for(int i = 0; i < threadCount; i++)
{
mWorkThread[i] = CreateThread(
NULL,
0,
FileReadThreadEntry,
(LPVOID)(new input(i*slice,p)),
0,
NULL);
}
WaitForMultipleObjects(threadCount, mWorkThread, TRUE, INFINITE);
printf("File read complete... \n");
return result;
}
unsigned int DataVerification(int* result, int sampleItem)
{
unsigned int invalid = 0;
for (int i=0; i< resultArrayLen/interval;i++)
{
if (result[i]!=sampleItem)
{
invalid ++;
continue;
}
}
return invalid;
}
I didn't look in enough detail to be certain, but I didn't see any code there to flush the data to the disk and/or ensure your reads actually came from the disk. That being the case, it appears that what you're measuring is primarily the performance of the operating system's disk caching. While the disk might contribute a little to the performance you're measuring, it's probably only a small contributor, with other factors dominating.
Since the code is apparently written for Windows, you might consider (for one example) opening the file with CreateFile, and passing the FILE_FLAG_NO_BUFFERING flag when you do so. This will (at least mostly) remove the operating system cache from the equation, and force each read or write to deal directly with the disk itself.
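A related point is the access pattern. In the question every thread reads the same offsets, i * interval for i below resultArrayLen/threadCount, so all 32 threads touch the same first 10 MB or so of the file, which is very friendly to the cache. A random-read benchmark would normally spread its reads over the whole file at block-aligned offsets, which is also what unbuffered I/O requires. The following is a sketch of generating such offsets; random_block_offsets and its parameters are names made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Returns `count` random file offsets, each a multiple of `block_size` and
// leaving room for one full block before `file_size`.
// Requires block_size > 0 and file_size >= block_size.
std::vector<std::uint64_t> random_block_offsets(std::uint64_t file_size,
                                                std::uint64_t block_size,
                                                std::size_t count,
                                                std::uint64_t seed) {
    const std::uint64_t block_count = file_size / block_size;
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<std::uint64_t> pick(0, block_count - 1);
    std::vector<std::uint64_t> offsets;
    offsets.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        offsets.push_back(pick(rng) * block_size);  // block-aligned offset
    }
    return offsets;
}
```

Giving each thread its own seed keeps the threads from reading identical positions. Whether the resulting numbers then approach the IOMeter figure still depends on bypassing the cache as described above.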