Read multiple .dat files by GPU - c++

I understand that reading files on the GPU is inefficient, because the work is bound by the slowest part of the system: I/O. Instead, I want the CPU to read the files and the GPU to handle the processing. I wrote the following code in C++, but I'm stuck at the integration point: how do I hand the data to the GPU after the CPU has read the files? In other words, where in this code should C++ AMP be introduced, or should I rewrite the whole thing from scratch?
/* This code reads multiple .dat files from the directory that contains the program (taken from my Stack Overflow account) */
#include <Windows.h>
#include <ctime>
#include <stdint.h>
#include <iostream>
using std::cout;
using std::endl;
#include <fstream>
using std::ifstream;
#include <cstring>
/* Returns the amount of milliseconds elapsed since the UNIX epoch. Works on both
* windows and linux. */
uint64_t GetTimeMs64()
{
FILETIME ft;
LARGE_INTEGER li;
/* Get the amount of 100 nano seconds intervals elapsed since January 1, 1601 (UTC) and copy it
* to a LARGE_INTEGER structure. */
GetSystemTimeAsFileTime(&ft);
li.LowPart = ft.dwLowDateTime;
li.HighPart = ft.dwHighDateTime;
uint64_t ret;
ret = li.QuadPart;
ret -= 116444736000000000LL; /* Convert from file time to UNIX epoch time. */
ret /= 10000; /* From 100 nano seconds (10^-7) to 1 millisecond (10^-3) intervals */
return ret;
}
const int MAX_CHARS_PER_LINE = 512;
const int MAX_TOKENS_PER_LINE = 20;
const char* const DELIMITER = "|";
int main()
{
// create a file-reading object
uint64_t a = GetTimeMs64();
cout << a << endl;
HANDLE h;
WIN32_FIND_DATA find_data;
h = FindFirstFile( "*.dat", & find_data );
if( h == INVALID_HANDLE_VALUE ) {
cout<<"error"<<endl;
return 1; // no matching file, or the search failed; the handle must not be used
}
do {
char * s = find_data.cFileName;
ifstream fin;
fin.open(s); // open a file
if (!fin.good())
return 1; // exit if file not found
// read each line of the file
while (!fin.eof())
{
// read an entire line into memory
char buf[MAX_CHARS_PER_LINE];
fin.getline(buf, MAX_CHARS_PER_LINE);
// parse the line into blank-delimited tokens
int n = 0; // a for-loop index
// array to store memory addresses of the tokens in buf
const char* token[MAX_TOKENS_PER_LINE] = {}; // initialize to 0
// parse the line
token[0] = strtok(buf, DELIMITER); // first token
if (token[0]) // zero if line is blank
{
for (n = 1; n < MAX_TOKENS_PER_LINE; n++)
{
token[n] = strtok(0, DELIMITER); // subsequent tokens
if (!token[n]) break; // no more tokens
}
}
// process (print) the tokens
for (int i = 0; i < n; i++) // n = #of tokens
cout << "Token[" << i << "] = " << token[i] << endl;
cout << endl;
}
// Your code here
} while( FindNextFile( h, & find_data ) );
FindClose( h );
uint64_t b = GetTimeMs64();
cout << a << endl;
cout << b << endl;
uint64_t c = b - a;
cout << c << endl;
system("pause");
}

The GPU cannot read the files by itself. As you assumed, the CPU handles the I/O.
So you need to read the data into main memory, send it to the GPU, compute there, and copy the results back.
One good way to work with files is to keep the data archived and let the GPU do the extracting and archiving.
So you read the file with the CPU, extract > compute > archive with the GPU, and store the result with the CPU.
Update:
(CPU I/O READ from file, which should already hold archived data) -> main memory
(CPU SEND) main memory -> GPU global memory
(GPU EXTRACT, if the data is archived)
(GPU COMPUTE: your work goes here)
(GPU ARCHIVE)
(CPU RETRIEVE) GPU global memory -> main memory
(CPU I/O WRITE to file)
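The read and compute steps above can be sketched roughly as follows. This is a minimal sketch, not the asker's code: stageFields and computeOnDevice are hypothetical names, the fields are assumed to be numeric, and the "kernel" is a plain CPU loop standing in for the device work. With C++ AMP, that loop is the part that would move into a concurrency::parallel_for_each over a concurrency::array_view<float, 1> wrapping the staged vector.

```cpp
#include <cstddef>
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical staging step: parse '|'-delimited numeric fields from a text
// stream into one contiguous buffer. A flat std::vector<float> is the shape
// that GPU APIs generally expect for a host-to-device copy.
std::vector<float> stageFields(std::istream& in)
{
    std::vector<float> staged;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string token;
        while (std::getline(fields, token, '|')) {
            try {
                staged.push_back(std::stof(token));
            } catch (const std::exception&) {
                // Non-numeric or empty field: skipped in this sketch.
            }
        }
    }
    return staged;
}

// CPU stand-in for the device kernel: doubles every element in place.
// Assumption: with C++ AMP, `data` would be wrapped in an array_view,
// processed with parallel_for_each, and copied back with synchronize().
void computeOnDevice(std::vector<float>& data)
{
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] *= 2.0f;
    }
}
```

In the question's program, stageFields would be called once per opened file (or the results appended to one shared vector), and computeOnDevice once per batch, so that the number of host-to-device transfers stays small.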

Related

Decreasing Latency of playing sound using Playsound in C++ (windows)

Currently, we play 5 sounds one after another using wave output, with the data fetched from a TCP socket. We use playBuffer (below) to play the sounds, but there is a noticeable delay between the end of one sound and the start of the next. I don't want any gap between the 5 sounds; each should start immediately after the previous one. Is there a way to do that with PlaySound, or can I achieve it with another C++ library? I am on Windows. I've been searching for hours for a solution and would really appreciate some help.
// AudioTask.cpp : Defines the entry point for the console application.
// Adapted from http://www.cplusplus.com/forum/beginner/88542/
#include "stdafx.h"
#define _WIN32_WINNT 0x0500
#include <windows.h>
#include <mmsystem.h>
#include <iostream>
#include <fstream>
#include <conio.h>
#include <math.h>
#include <stdint.h>
#define PI 3.14159265
using namespace std;
typedef struct WAV_HEADER1 {
uint8_t RIFF[4]; // = { 'R', 'I', 'F', 'F' };
uint32_t ChunkSize;
uint8_t WAVE[4]; // = { 'W', 'A', 'V', 'E' };
uint8_t fmt[4]; // = { 'f', 'm', 't', ' ' };
uint32_t Subchunk1Size = 16;
uint16_t AudioFormat = 1;
uint16_t NumOfChan = 1;
uint32_t SamplesPerSec = 16000;
uint32_t bytesPerSec = 16000 * 2;
uint16_t blockAlign = 2;
uint16_t bitsPerSample = 16;
uint8_t Subchunk2ID[4]; // = { 'd', 'a', 't', 'a' };
uint32_t Subchunk2Size;
} wav_hdr1;
void playBuffer(short* audioSamplesData1, short* audioSamplesData2, int count)
{
static_assert(sizeof(wav_hdr1) == 44, "");
wav_hdr1 wav;
wav.NumOfChan = 2;
wav.SamplesPerSec = 44100;
wav.bytesPerSec = 176400;
wav.blockAlign = 4;
wav.bitsPerSample = 16;
// Fixed values
wav.RIFF[0] = 'R';
wav.RIFF[1] = 'I';
wav.RIFF[2] = 'F';
wav.RIFF[3] = 'F';
wav.WAVE[0] = 'W';
wav.WAVE[1] = 'A';
wav.WAVE[2] = 'V';
wav.WAVE[3] = 'E';
wav.fmt[0] = 'f';
wav.fmt[1] = 'm';
wav.fmt[2] = 't';
wav.fmt[3] = ' ';
wav.Subchunk2ID[0] = 'd';
wav.Subchunk2ID[1] = 'a';
wav.Subchunk2ID[2] = 't';
wav.Subchunk2ID[3] = 'a';
wav.ChunkSize = (count * 2 * 2) + sizeof(wav_hdr1) - 8;
wav.Subchunk2Size = wav.ChunkSize - 36; // sample data bytes only: ChunkSize = 36 + Subchunk2Size
char* data = new char[44 + (count * 2 * 2)];
memcpy(data, &wav, sizeof(wav));
int index = sizeof(wav);
//constexpr double max_amplitude = 32766;
for (int i = 0; i < count; i++)
{
short value = audioSamplesData1 ? audioSamplesData1[i] : 0;
memcpy(data + index, &value, sizeof(short));
index += sizeof(short);
value = audioSamplesData2 ? audioSamplesData2[i] : 0;
memcpy(data + index, &value, sizeof(short));
index += sizeof(short);
}
PlaySound((char*)data, GetModuleHandle(0), SND_MEMORY | SND_SYNC);
delete[] data; // SND_SYNC returns only after playback ends, so the buffer can be released here
}
void performAction(short audioSamplesData1[], short audioSamplesData2[], int count)
{
playBuffer(audioSamplesData1, audioSamplesData1, count);
playBuffer(audioSamplesData2, audioSamplesData2, count);
playBuffer(audioSamplesData1, NULL, count);
playBuffer(NULL, audioSamplesData2, count);
playBuffer(audioSamplesData1, audioSamplesData2, count);
}
class Wave {
public:
Wave(char * filename);
~Wave();
void play(bool async = true);
bool isok();
private:
char * buffer;
bool ok;
HINSTANCE HInstance;
int numberOfAudioBytes;
};
Wave::Wave(char * filename)
{
ok = false;
buffer = 0;
HInstance = GetModuleHandle(0);
numberOfAudioBytes = 0;
ifstream infile(filename, ios::binary);
if (!infile)
{
std::cout << "Wave::file error: " << filename << std::endl;
return;
}
infile.seekg(0, ios::end); // get length of file
int length = infile.tellg();
buffer = new char[length]; // allocate memory
infile.seekg(0, ios::beg); // position to start of file
infile.read(buffer, length); // read entire file
std::cout << "Number of elements in buffer : " << length << std::endl;
numberOfAudioBytes = length;
infile.close();
ok = true;
}
Wave::~Wave()
{
PlaySound(NULL, 0, 0); // STOP ANY PLAYING SOUND
delete[] buffer; // before deleting buffer.
}
void Wave::play(bool async)
{
if (!ok)
return;
// Create two arrays of sound data to use as a test for performing the task we need to do.
const int SAMPLE_RATE = 44100; // 44.1 kHz
const int FILE_LENGTH_IN_SECONDS = 3;
const int NUMBER_OF_SAMPLES = SAMPLE_RATE*FILE_LENGTH_IN_SECONDS; // Number of elements of audio data in the array, 132300 in this case.
std::cout << "NUMBER_OF_SAMPLES : " << NUMBER_OF_SAMPLES << std::endl;
short audioSamplesData_A[NUMBER_OF_SAMPLES];
short audioSamplesData_B[NUMBER_OF_SAMPLES];
float maxVolume = 32767.0; // 2^15 - 1
float frequencyHz_A = 500.0;
float frequencyHz_B = 250.0;
for (int i = 0; i < NUMBER_OF_SAMPLES; i++)
{
float pcmValue_A = sin(i*frequencyHz_A / SAMPLE_RATE * PI * 2);
float pcmValue_B = sin(i*frequencyHz_B / SAMPLE_RATE * PI * 2);
short pcmValueShort_A = (short)(maxVolume * pcmValue_A);
short pcmValueShort_B = (short)(maxVolume * pcmValue_B);
//short pcmValueShort_B = (short)(0.5*maxVolume*(pcmValue_A + pcmValue_B));
audioSamplesData_A[i] = pcmValueShort_A; // This is what you need to play.
audioSamplesData_B[i] = pcmValueShort_B; // This is what you need to play.
// waveData += pack('h', pcmValueShort_A) - Python code from Python equivalent program, perhaps we need something similar.
// See enclosed "Py Mono Stereo.py" file or visit https://swharden.com/blog/2011-07-08-create-mono-and-stereo-wave-files-with-python/
}
// The task that needs to be done for this project:
// The audio data is available in the form of an array of shorts (audioSamplesData_A and audioSamplesData_B created above).
// What needs to happen is this audio data (audioSamplesData_A and audioSamplesData_B) must each be played so we can hear them.
// When this task is over, there will be no need for any WAV file anywhere, the goal is NOT to produce a WAV file. The goal is
// to take the audio data in the form of audioSamplesData_A and play it from memory somehow.
// We need to take the input data (audioSamplesData_A and audioSamplesData_B) and play the same sounds that the 5 WAV files are currently playing, but
// in the end, we will no longer need those WAV files.
// You do NOT need to create any new files.
// In the end, you do not need to read any files either.
// In the final project, all you will need is this current main.cpp file. You run main.cpp and you hear the 5 sounds.
// The 5 sounds, are created BY C++ here in this file (see loop above).
// Display the first 100 elements for one of the audio samples array
for (int i = 0; i < 100; i++)
{
//std::cout << "i = " << i << ", audioSamplesData_B[i] : " << audioSamplesData_B[i] << std::endl;
}
// Display the first 100 elements for the serialized buffer of WAV header data + some audio data, all coming from one of the WAV files on the disk.
for (int i = 0; i < 100; i++) // Last 6 elements is where audio data begins. First 44 elements are WAV header data.
{
//std::cout << "i = " << i << ", buffer[i] : " << (int) buffer[i] << std::endl;
}
performAction(audioSamplesData_A, audioSamplesData_B, NUMBER_OF_SAMPLES);
// Play the sample sound, the one obtained from the WAV file on the disk, not the one created from the audio samples created above.
//PlaySound((char*)(&audioSamplesData_A[0]), HInstance, SND_MEMORY | SND_SYNC);
//PlaySound((char*)audioSamplesData_B, HInstance, SND_MEMORY | SND_SYNC);
//PlaySound((char*)audioSamplesData_AB, HInstance, SND_MEMORY | SND_SYNC);
//PlaySound((char*)buffer, HInstance, SND_MEMORY | SND_SYNC);
}
bool Wave::isok()
{
return ok;
}
int main(int argc, char *argv[]) {
std::cout << "Trying to play sound ...\n";
// Load the WAV files from the disk. These files are here only to help you understand what we need. In the end, we will no longer need them.
Wave outputA("outputA.WAV"); // Audio file equivalent to audioSamplesData_A curve generated in the loop above.
Wave outputB("outputB.WAV"); // Audio file equivalent to audioSamplesData_B curve generated in the loop above.
Wave outputALeftOnly("outputALeftOnly.WAV"); // Audio file that plays sound A on the left only, must be able to take audioSamplesData_A and somehow make it left only.
Wave outputBRightOnly("outputBRightOnly.WAV"); // Audio file that plays sound B on the right only, must be able to take audioSamplesData_B and somehow make it right only.
Wave outputALeftOutputBRight("outputALeftOutputBRight.WAV"); // Must be able to take both audioSamplesData_A and audioSamplesData_B and make it play different sounds in left and right.
// Play the WAV files from the disk, either all of them or a subset of them.
outputA.play(0);
//outputB.play(0);
//outputALeftOnly.play(0);
//outputBRightOnly.play(0);
//outputALeftOutputBRight.play(0);
std::cout << "press key to exit";
while (1) {} // Loop to prevent command line terminal from closing automatically.
return 0;
}

Why does Windows task manager show memory increases while writing very large files? Should I be worried? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 6 years ago.
I am developing a C++ application (VS2012, Windows Server 2012 R2) that writes large volumes of binary data to raw files from cyclic arrays of preallocated buffers. System RAM usage, as reported by Windows Task Manager, increases at a linear rate while fwrite writes the data to the files, until it reaches a certain point where it remains almost constant (also see the following image). The memory used by my application itself remains constant the whole time.
I call fflush periodically and it has no effect. Although this seems harmless, I am concerned about its effect on performance, as a Java application will also be running during nominal operation.
Therefore, I would like to ask whether I should worry about this, and whether there is a way to avoid it in order to get the best performance for a real-time data recording system.
Similar questions have been asked here and here for Linux, and the answers say that the system can devote some memory to caching the data as long as enough memory is available.
Part of the application is presented next. In short, the application controls a pair of cameras; each camera acquires frames and stores them in properly aligned, preallocated buffers. There are i) a CameraInterface class, which creates two "producer" threads, ii) a Recorder class, which creates two "consumer" threads, and iii) a SharedMemoryManager class, which provides a producer with an available buffer for storing data and a consumer with the next buffer to be written to the file. The SharedMemoryManager holds two arrays of buffers (one for each producer-consumer pair) and two respective arrays of flags that indicate the status of each buffer. It also holds two std::queue objects for quick access to the next buffers to be recorded. Parts of the Recorder and the SharedMemoryManager are shown next.
// somewhere in file "atcore.h"...
typedef unsigned char AT_U8;
// File: SharedMemoryManager.h
#ifndef __MEM_MANAGER__
#define __MEM_MANAGER__
#pragma once
#include "atcore.h"
#include <queue>
#include <mutex>
#define NBUFFERS 128
#define BUFFER_AVAILABLE 0
#define BUFFER_QUEUED 1
#define BUFFER_FULL 2
#define BUFFER_RECORDING_PENDING 3
// the status flag cycle is
// AVAILABLE -> QUEUED -> FULL -> RECORDING_PENDING -> AVAILABLE
using namespace std;
typedef struct{
AT_U8** buffers;
int* flags;
int acquiredCounter;
int consumedCounter;
int queuedCounter;
mutex flagMtx;
} sharedMemory;
typedef struct{
AT_U8* buffer;
int bufSize;
int index;
} record;
class SharedMemoryManager
{
public:
SharedMemoryManager();
~SharedMemoryManager(void);
void enableRecording();
void disableRecording();
int setupMemory(int cameraIdentifier, int bufferSize);
void freeMemory();
void freeCameraMemory(int cameraIdentifier);
int getBufferSize(int cameraIdentifier);
AT_U8* getBufferForCameraQueue(int cameraIdentifier); // get pointer to the next available buffer for queueing in the camera
int hasFramesForRecording(int cameraIdentifier); // ask how many frames for recording are there in the respective queue
AT_U8* getNextFrameForRecording(int cameraIdentifier); // get pointer to the next buffer to be recorded to a file
void copyMostRecentFrame(unsigned char* buffer, int cameraIdentifier); // TODO // get a copy of the most recent frame on the buffer
void notifyAcquiredFrame(AT_U8* buffer, int bufSize, int cameraIdentifier); // use this function to notify the manager that the buffer has just been filled with data
void notifyRecordedFrame(AT_U8* buffer, int cameraIdentifier); // use this function to notify the manager that the buffer has just been written to file and can be used again
private:
bool useMem0, useMem1;
int bufSize0, bufSize1;
sharedMemory* memory0;
sharedMemory* memory1;
queue<record*> framesForRecording0;
queue<record*> framesForRecording1;
bool isRecording;
int allocateBuffers(sharedMemory* mem, int bufSize);
void freeBufferArray(sharedMemory* mem);
};
#endif // !__MEM_MANAGER
// File: SharedMemoryManager.cpp
...
int SharedMemoryManager::hasFramesForRecording(int cameraIdentifier){
if (cameraIdentifier!=0 && cameraIdentifier!=1){
cout << "Could not get the number of frames in the shared memory. Invalid camera id " << cameraIdentifier << endl;
return -1;
}
if (cameraIdentifier==0){
return (int)framesForRecording0.size();
}
else{
return (int)framesForRecording1.size();
}
}
AT_U8* SharedMemoryManager::getNextFrameForRecording(int cameraIdentifier){
if (cameraIdentifier!=0 && cameraIdentifier!=1){
cout << "Error in getNextFrameForRecording. Invalid camera id " << cameraIdentifier << endl;
return NULL;
}
sharedMemory* mem;
if (cameraIdentifier==0) mem=memory0;
else mem=memory1;
queue<record*>* framesQueuePtr;
if (cameraIdentifier==0) framesQueuePtr = &framesForRecording0;
else framesQueuePtr = &framesForRecording1;
if (framesQueuePtr->empty()){ // no frames to be recorded at the moment
return NULL;
}
record* item;
int idx;
AT_U8* buffer = NULL;
item = framesQueuePtr->front();
framesQueuePtr->pop();
idx = item->index;
delete item;
mem->flagMtx.lock();
if (mem->flags[idx] == BUFFER_FULL){
mem->flags[idx] = BUFFER_RECORDING_PENDING;
buffer = mem->buffers[idx];
}
else{
cout << "PROBLEM. Buffer in getBufferForRecording. Buffer flag is " << mem->flags[idx] << endl;
cout << "----- BUFFER FLAGS -----" << endl;
for (int i=0; i<NBUFFERS; i++){
cout << "[" << i << "] " << mem->flags[i] << endl;
}
cout << "----- -----" << endl;
}
mem->flagMtx.unlock();
return buffer;
}
int SharedMemoryManager::allocateBuffers(sharedMemory* mem, int bufSize){
// allocate the array for the buffers
mem->buffers = (AT_U8**)calloc(NBUFFERS,sizeof(AT_U8*));
if (mem->buffers==NULL){
cout << "Could not allocate array of buffers." << endl;
return -1;
}
// allocate the array for the respective flags
mem->flags = (int*)malloc(NBUFFERS*sizeof(int));
if (mem->flags==NULL){
cout << "Could not allocate array of flags for the buffers." << endl;
free(mem->buffers);
return -1;
}
int i;
for (i=0; i<NBUFFERS; i++){ // allocate the buffers
mem->buffers[i] = (AT_U8*)_aligned_malloc((size_t)bufSize,8);
if (mem->buffers[i] == NULL){
cout << "Could not allocate memory for buffer no. " << i << endl;
for (int j=0; j<i; j++){ // free the previously allocated buffers
_aligned_free(mem->buffers[j]);
}
free(mem->buffers);
free(mem->flags);
return -1;
}
else{
mem->flags[i]=BUFFER_AVAILABLE;
}
}
return 0;
}
void SharedMemoryManager::freeBufferArray(sharedMemory* mem){
if (mem!=NULL){
for(int i=0; i<NBUFFERS; i++){
_aligned_free(mem->buffers[i]);
mem->buffers[i]=NULL;
}
free(mem->buffers);
mem->buffers = NULL;
free(mem->flags);
mem->flags = NULL;
free(mem);
mem = NULL;
}
}
// File: Recorder.h
#ifndef __RECORDER__
#define __RECORDER__
#pragma once
#include <string>
#include <queue>
#include <future>
#include <thread>
#include "atcore.h"
#include "SharedMemoryManager.h"
using namespace std;
class Recorder
{
public:
Recorder(SharedMemoryManager* memoryManager);
~Recorder();
void recordBuffer(AT_U8 *buffer, int bufsize);
int setupRecording(string filename0, string filename1, bool open0, bool open1);
void startRecording();
void stopRecording();
int testWriteSpeed(string directoryPath, string filename);
void insertFrameItem(AT_U8* buffer, int bufSize, int chunkID);
private:
FILE *chunk0, *chunk1;
string chunkFilename0, chunkFilename1;
int frameCounter0, frameCounter1;
bool writes0, writes1;
int bufSize0, bufSize1;
static SharedMemoryManager* manager;
bool isRecording;
promise<int> prom0;
promise<int> prom1;
thread* recordingThread0;
thread* recordingThread1;
static void performRecording(promise<int>* exitCode, int chunkIdentifier);
void writeNextItem(int chunkIdentifier);
void closeFiles();
};
#endif //!__RECORDER__
// File: Recorder.cpp
#include "Recorder.h"
#include <ctime>
#include <iostream>
using namespace std;
Recorder* recorderInstance; // keep a pointer to the current instance, for accessing static functions from (non-static) objects in the threads
SharedMemoryManager* Recorder::manager; // the same reason
...
void Recorder::startRecording(){
if (isRecording == false){ // do not start new threads if some are still running
isRecording = true;
if (writes0==true) recordingThread0 = new thread(&Recorder::performRecording, &prom0, 0);
if (writes1==true) recordingThread1 = new thread(&Recorder::performRecording, &prom1, 1);
}
}
void Recorder::writeNextItem(int chunkIdentifier){
FILE* chunk;
AT_U8* buffer;
int* bufSize;
if (chunkIdentifier==0){
chunk = chunk0;
bufSize = &bufSize0;
buffer = manager->getNextFrameForRecording(0);
}
else {
chunk = chunk1;
bufSize = &bufSize1;
buffer = manager->getNextFrameForRecording(1);
}
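if (buffer == NULL){ // no frame is ready (or its flag was inconsistent), so there is nothing to write
return;
}
size_t nbytes = fwrite(buffer, 1, (size_t)(*bufSize), chunk);
if (nbytes != (size_t)(*bufSize)){ // size_t is unsigned, so compare against the requested size
cout << "Short write: " << nbytes << " of " << *bufSize << " bytes were written to file." << endl;
}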
manager->notifyRecordedFrame(buffer,chunkIdentifier);
if (chunkIdentifier==0) frameCounter0++;
else frameCounter1++;
}
void Recorder::performRecording(promise<int>* exitCode, int chunkIdentifier){
bool flag = true;
int remaining = manager->hasFramesForRecording(chunkIdentifier);
while( recorderInstance->isRecording==true || remaining>0 ){
if (remaining>0){
if (recorderInstance->isRecording==false){
cout << "Acquisition stopped, still " << remaining << " frames are to be recorded in chunk " << chunkIdentifier << endl;
}
recorderInstance->writeNextItem(chunkIdentifier);
}
else{
this_thread::sleep_for(chrono::milliseconds(10));
}
remaining = manager->hasFramesForRecording(chunkIdentifier);
}
cout << "Done recording." << endl;
}
In the Windows memory-use screenshot you show, the biggest chunk (45GB) is "cached", of which 27GB is "modified", meaning "dirty pages waiting to be written to disk". This is normal behavior because you are writing faster than the disk I/O can keep up. fflush has no effect on this because those pages belong to the operating system's file cache, not to your process; as you note, "the memory used by my application remains constant the whole time". Do not be concerned. However, if you really don't want the OS to buffer dirty output pages, consider using the "unbuffered I/O" available on Windows, which writes through to disk immediately.
Edit: Some links to unbuffered I/O on Windows. Note that unbuffered I/O places memory-alignment constraints on your reads and writes.
File Buffering
CreateFile function
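To make the alignment constraint concrete: with FILE_FLAG_NO_BUFFERING passed to CreateFile, transfer sizes must be multiples of the volume sector size and buffer addresses should be sector-aligned. The helpers below are a small portable sketch of that arithmetic; alignUp and isAligned are hypothetical names, and the sector sizes used with them (512 or 4096) are assumptions that should be replaced by the value queried for the actual volume.

```cpp
#include <cstddef>
#include <cstdint>

// Rounds `size` up to the next multiple of `alignment`.
// `alignment` must be a power of two (for example a sector size of 512 or 4096).
std::size_t alignUp(std::size_t size, std::size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

// Reports whether a buffer address is a multiple of `alignment`.
bool isAligned(const void* ptr, std::size_t alignment)
{
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}
```

In the question's allocateBuffers, the buffers come from _aligned_malloc(bufSize, 8); for unbuffered writes the alignment argument would likely need to be the sector size instead, and each write length would need to be rounded with something like alignUp(bufSize, sectorSize).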

OpenSSL SHA256 Wrong result

I have following piece of code that is supposed to calculate the SHA256 of a file. I am reading the file chunk by chunk and using EVP_DigestUpdate for the chunk. When I test the code with the file that has content
Test Message
Hello World
in Windows, it gives me SHA256 value of 97b2bc0cd1c3849436c6532d9c8de85456e1ce926d1e872a1e9b76a33183655f but the value is supposed to be 318b20b83a6730b928c46163a2a1cefee4466132731c95c39613acb547ccb715, which can be verified here too.
Here is the code:
#include <openssl\evp.h>
#include <iostream>
#include <string>
#include <fstream>
#include <cstdio>
const int MAX_BUFFER_SIZE = 1024;
std::string FileChecksum(std::string, std::string);
int main()
{
std::string checksum = FileChecksum("C:\\Users\\Dell\\Downloads\\somefile.txt","sha256");
std::cout << checksum << std::endl;
return 0;
}
std::string FileChecksum(std::string file_path, std::string algorithm)
{
EVP_MD_CTX *mdctx;
const EVP_MD *md;
unsigned char md_value[EVP_MAX_MD_SIZE];
int i;
unsigned int md_len;
OpenSSL_add_all_digests();
md = EVP_get_digestbyname(algorithm.c_str());
if(!md) {
printf("Unknown message digest %s\n",algorithm.c_str());
exit(1);
}
mdctx = EVP_MD_CTX_create();
std::ifstream readfile(file_path,std::ifstream::in|std::ifstream::binary);
if(!readfile.is_open())
{
std::cout << "COuldnot open file\n";
return ""; // returning 0 would construct the std::string from a null pointer
}
readfile.seekg(0, std::ios::end);
long filelen = readfile.tellg();
std::cout << "LEN IS " << filelen << std::endl;
readfile.seekg(0, std::ios::beg);
if(filelen == -1)
{
std::cout << "Return Null \n";
return ""; // returning 0 would construct the std::string from a null pointer
}
EVP_DigestInit_ex(mdctx, md, NULL);
long temp_fil = filelen;
while(!readfile.eof() && readfile.is_open() && temp_fil>0)
{
int bufferS = (temp_fil < MAX_BUFFER_SIZE) ? temp_fil : MAX_BUFFER_SIZE;
char *buffer = new char[bufferS+1];
buffer[bufferS] = 0;
readfile.read(buffer, bufferS);
std::cout << strlen(buffer) << std::endl;
EVP_DigestUpdate(mdctx, buffer, strlen(buffer));
temp_fil -= bufferS;
delete[] buffer;
}
EVP_DigestFinal_ex(mdctx, md_value, &md_len);
EVP_MD_CTX_destroy(mdctx);
printf("Digest is: ");
//char *checksum_msg = new char[md_len];
//int cx(0);
for(i = 0; i < md_len; i++)
{
//_snprintf(checksum_msg+cx,md_len-cx,"%02x",md_value[i]);
printf("%02x", md_value[i]);
}
//std::string res(checksum_msg);
//delete[] checksum_msg;
printf("\n");
/* Call this once before exit. */
EVP_cleanup();
return "";
}
I tried to write the hash generated by the program into a string using _snprintf, but it didn't work. How can I generate the correct hash and return it as a string from the FileChecksum function? The platform is Windows.
EDIT: It seems the problem was caused by CRLF line endings. Because Windows saves the file using \r\n, the calculated checksum was different. How should I handle this?
MS-DOS used the CR-LF convention, so when a text file is saved on Windows, each line break is stored as \r\n (carriage return plus newline). The online tool you linked hashes the text with only \n as the line break.
So to reproduce your program's result online, you have to compute the checksum of the string Test Message\r\nHello World\r\n, which is what the file created on Windows contains.
However, the checksum of a given file will be the same wherever it is computed, as long as its bytes are unchanged.
Note: your code works fine :)
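If the goal is a digest that matches the LF form of a text file, one option is to normalize the line endings before hashing. The function below is a minimal sketch of that idea; normalizeLineEndings is a hypothetical helper, and note that the resulting digest is then that of the normalized text, not of the bytes stored on disk, so this only makes sense for text files.

```cpp
#include <cstddef>
#include <string>

// Returns a copy of `text` with every "\r\n" pair replaced by "\n".
// A lone '\r' that is not followed by '\n' is left untouched.
std::string normalizeLineEndings(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r' && i + 1 < text.size() && text[i + 1] == '\n') {
            continue; // drop the '\r'; the '\n' is copied on the next iteration
        }
        out.push_back(text[i]);
    }
    return out;
}
```

The normalized string (its data() and size()) would then be passed to the digest update call in place of the raw file contents.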
It seems the problem was associated with the length value I passed to EVP_DigestUpdate. I had passed the value from strlen, which stops at the first zero byte and so undercounts binary data; replacing it with bufferS fixed the issue.
The code was modified as:
while(!readfile.eof() && readfile.is_open() && temp_fil>0)
{
int bufferS = (temp_fil < MAX_BUFFER_SIZE) ? temp_fil : MAX_BUFFER_SIZE;
char *buffer = new char[bufferS+1];
buffer[bufferS] = 0;
readfile.read(buffer, bufferS);
EVP_DigestUpdate(mdctx, buffer, bufferS);
temp_fil -= bufferS;
delete[] buffer;
}
and to return the checksum as a string, I modified the code as:
EVP_DigestFinal_ex(mdctx, md_value, &md_len);
EVP_MD_CTX_destroy(mdctx);
char str[2 * EVP_MAX_MD_SIZE + 1] = { 0 }; // two hex digits per digest byte plus the terminator
char *ptr = str;
std::string ret;
for(i = 0; i < md_len; i++)
{
//_snprintf(checksum_msg+cx,md_len-cx,"%02x",md_value[i]);
sprintf(ptr,"%02x", md_value[i]);
ptr += 2;
}
ret = str;
/* Call this once before exit. */
EVP_cleanup();
return ret;
As for the wrong checksum earlier, the cause was how Windows stores line breaks. As suggested by Zangetsu, Windows saved the text file with CRLF line endings, while Linux and the site I mentioned earlier use LF, so the checksum values differed. For non-text files such as DLLs, the code now computes the correct checksum and returns it as a string.

Read multiple .dat files in C++

I used the code below to read one .dat file and measure the execution time, and it worked very well. I tried to build a loop to read multiple files, as I have more than 20 files with different names (I need to keep their names), but it did not work. How can I extend this code to read all files located in a certain folder, no matter how many there are? (Based on the following code.)
#include <Windows.h>
#include <ctime>
#include <stdint.h>
#include <iostream>
using std::cout;
using std::endl;
#include <fstream>
using std::ifstream;
#include <cstring>
/* Returns the amount of milliseconds elapsed since the UNIX epoch. Works on both
* windows and linux. */
uint64_t GetTimeMs64()
{
FILETIME ft;
LARGE_INTEGER li;
/* Get the amount of 100 nano seconds intervals elapsed since January 1, 1601 (UTC) and copy it
* to a LARGE_INTEGER structure. */
GetSystemTimeAsFileTime(&ft);
li.LowPart = ft.dwLowDateTime;
li.HighPart = ft.dwHighDateTime;
uint64_t ret;
ret = li.QuadPart;
ret -= 116444736000000000LL; /* Convert from file time to UNIX epoch time. */
ret /= 10000; /* From 100 nano seconds (10^-7) to 1 millisecond (10^-3) intervals */
return ret;
}
const int MAX_CHARS_PER_LINE = 512;
const int MAX_TOKENS_PER_LINE = 20;
const char* const DELIMITER = "|";
int main()
{
// create a file-reading object
ifstream fin;
fin.open("promotion.txt"); // open a file
if (!fin.good())
return 1; // exit if file not found
// read each line of the file
while (!fin.eof())
{
// read an entire line into memory
char buf[MAX_CHARS_PER_LINE];
fin.getline(buf, MAX_CHARS_PER_LINE);
// parse the line into blank-delimited tokens
int n = 0; // a for-loop index
// array to store memory addresses of the tokens in buf
const char* token[MAX_TOKENS_PER_LINE] = {}; // initialize to 0
// parse the line
token[0] = strtok(buf, DELIMITER); // first token
if (token[0]) // zero if line is blank
{
for (n = 1; n < MAX_TOKENS_PER_LINE; n++)
{
token[n] = strtok(0, DELIMITER); // subsequent tokens
if (!token[n]) break; // no more tokens
}
}
// process (print) the tokens
for (int i = 0; i < n; i++) // n = #of tokens
cout << "Token[" << i << "] = " << token[i] << endl;
cout << endl;
}
uint64_t z = GetTimeMs64();
cout << z << endl;
system("pause");
}
For listing files in a directory on Windows, refer to this link:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365200(v=vs.85).aspx
Notes about your code:
Don't use fin.eof() to test for the end of input; see why: eof of istream in C++.
If you reuse the same fin to read multiple files, remember to call fin.clear() before fin.close().
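As an illustration of the first note, the read itself can serve as the loop condition, so the body never runs after a failed read. This is a minimal sketch using std::string and std::getline instead of the fixed-size buffer and strtok from the question; readTokens is a hypothetical name.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits every line of `in` on `delimiter` and returns one token list per line.
// The loop condition is the read itself, so the body never runs on a failed read.
std::vector<std::vector<std::string>> readTokens(std::istream& in, char delimiter = '|')
{
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::vector<std::string> tokens;
        std::istringstream fields(line);
        std::string token;
        while (std::getline(fields, token, delimiter)) {
            tokens.push_back(token);
        }
        rows.push_back(tokens);
    }
    return rows;
}
```

One behavioral difference from strtok: std::getline keeps empty fields between two consecutive delimiters, while strtok skips them.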
UPDATE:
The following code prints the names of the files in the directory given by path. If you need the absolute path of every file, or the files in subfolders, change GetFiles to do that; this is straightforward given the link I provided. The code was tested on VS2012, Win7 Pro.
#include <windows.h>
#include <Shlwapi.h>
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
#pragma comment(lib, "Shlwapi.lib")
int GetFiles(const string &path, vector<string> &files, const string &wildcard = "\\*")
{
wstring basepath(path.begin(), path.end());
wstring wpath = basepath + wstring(wildcard.begin(), wildcard.end());
WIN32_FIND_DATA ffd;
HANDLE hFind = INVALID_HANDLE_VALUE;
DWORD dwError = 0;
hFind = FindFirstFile(wpath.c_str(), &ffd);
if (INVALID_HANDLE_VALUE == hFind) {
// display error messages
return dwError;
}
TCHAR buf[MAX_PATH];
do {
if (ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
// directory
} else {
PathCombine(buf, basepath.c_str(), ffd.cFileName);
wstring tmp(buf);
files.push_back(string(tmp.begin(), tmp.end()));
}
} while (FindNextFile(hFind, &ffd));
dwError = GetLastError();
if (ERROR_NO_MORE_FILES != dwError) {
// some errors
}
FindClose(hFind);
return dwError;
}
int main()
{
string path("D:\\Documents\\Visual Studio 2012\\Projects\\SigSpatial2013");
vector<string> files;
GetFiles(path, files);
string line;
ifstream fin;
for (int i = 0; i < files.size(); ++i) {
cout << files[i] << endl;
fin.open(files[i].c_str());
if (!fin.is_open()) {
// error occurs!!
// break or exit according to your needs
}
while (getline(fin, line)) {
// now process every line
}
fin.clear();
fin.close();
}
}
I think it's easier if you:
1- factor out the code that reads a file and processes its content into its own function: void process_file( char* filename );
2- add a new function to list a directory's content: char** list_dir( char* dir );
3- combine the 2 functions in your main()
This makes for cleaner and more testable code.
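A rough sketch of that split is shown below. It assumes a C++17 compiler and uses std::filesystem in place of the Win32 FindFirstFile API, with fs::path and std::vector replacing the char* signatures suggested above; the body of process_file is only a placeholder that counts lines.

```cpp
#include <algorithm>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Lists regular files in `dir` whose extension equals `ext` (for example ".dat"),
// sorted by path so the processing order is deterministic.
std::vector<fs::path> list_dir(const fs::path& dir, const std::string& ext)
{
    std::vector<fs::path> files;
    for (const fs::directory_entry& entry : fs::directory_iterator(dir)) {
        if (entry.is_regular_file() && entry.path().extension().string() == ext) {
            files.push_back(entry.path());
        }
    }
    std::sort(files.begin(), files.end());
    return files;
}

// Placeholder for the per-file work: here it only counts the lines.
std::size_t process_file(const fs::path& file)
{
    std::ifstream fin(file);
    std::string line;
    std::size_t lines = 0;
    while (std::getline(fin, line)) {
        ++lines;
    }
    return lines;
}
```

main() then reduces to a loop over list_dir(folder, ".dat") that calls process_file on each path, and each function can be tested on its own.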
I agree with the suggestions to encapsulate this.
On Windows the code looks like this
HANDLE h;
WIN32_FIND_DATA find_data;
h = FindFirstFile( "*.dat", & find_data );
if( h == INVALID_HANDLE_VALUE ) {
// Error
return;
}
do {
char * s = find_data.cFileName;
// Your code here
} while( FindNextFile( h, & find_data ) );
FindClose( h );

uint64 does not name a type

The compiler displays the message 'uint64 does not name a type' every time I try to compile code that uses uint64, and the same goes for uint or uint32. I have included stdint.h, but it did not help. My other question: when I use int instead, I get a different value for the variable z on every run, a negative value like -160000, then -140000, and so on with every subsequent execution. How can I solve that? Here is the code:
#include <Windows.h>
#include <ctime>
#include <stdint.h>
#include <iostream>
using std::cout;
using std::endl;
#include <fstream>
using std::ifstream;
#include <cstring>
/* Returns the amount of milliseconds elapsed since the UNIX epoch. Works on both
* windows and linux. */
uint64 GetTimeMs64()
{
FILETIME ft;
LARGE_INTEGER li;
/* Get the amount of 100 nano seconds intervals elapsed since January 1, 1601 (UTC) and copy it
* to a LARGE_INTEGER structure. */
GetSystemTimeAsFileTime(&ft);
li.LowPart = ft.dwLowDateTime;
li.HighPart = ft.dwHighDateTime;
uint64 ret;
ret = li.QuadPart;
ret -= 116444736000000000LL; /* Convert from file time to UNIX epoch time. */
ret /= 10000; /* From 100 nano seconds (10^-7) to 1 millisecond (10^-3) intervals */
return ret;
}
const int MAX_CHARS_PER_LINE = 512;
const int MAX_TOKENS_PER_LINE = 20;
const char* const DELIMITER = "|";
int main()
{
// create a file-reading object
ifstream fin;
fin.open("promotion.txt"); // open a file
if (!fin.good())
return 1; // exit if file not found
// read each line of the file
while (!fin.eof())
{
// read an entire line into memory
char buf[MAX_CHARS_PER_LINE];
fin.getline(buf, MAX_CHARS_PER_LINE);
// parse the line into blank-delimited tokens
int n = 0; // a for-loop index
// array to store memory addresses of the tokens in buf
const char* token[MAX_TOKENS_PER_LINE] = {}; // initialize to 0
// parse the line
token[0] = strtok(buf, DELIMITER); // first token
if (token[0]) // zero if line is blank
{
for (n = 1; n < MAX_TOKENS_PER_LINE; n++)
{
token[n] = strtok(0, DELIMITER); // subsequent tokens
if (!token[n]) break; // no more tokens
}
}
// process (print) the tokens
for (int i = 0; i < n; i++) // n = #of tokens
cout << "Token[" << i << "] = " << token[i] << endl;
cout << endl;
}
uint64 z = GetTimeMs64();
cout << z << endl;
system("pause");
}
The type is named uint64_t. Same goes for uint32_t, uint16_t, uint8_t, etc.
uint doesn't exist. You might have intended simply unsigned int.
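As for the negative values seen with int: a millisecond count since the UNIX epoch is on the order of 10^12, which does not fit in a 32-bit int, so the stored value is most likely a truncated one. The sketch below illustrates this with std::chrono rather than the Windows API; nowMs64 and overflowsInt32 are hypothetical names, and it assumes that system_clock counts from the UNIX epoch, which holds on mainstream implementations and is guaranteed from C++20.

```cpp
#include <chrono>
#include <cstdint>
#include <limits>

// Milliseconds since the UNIX epoch as a 64-bit unsigned value.
std::uint64_t nowMs64()
{
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count());
}

// True when `ms` is too large to be stored in a 32-bit signed int.
bool overflowsInt32(std::uint64_t ms)
{
    return ms > static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}
```

For measuring how long the file reading takes, the difference between two such readings is small and fits easily in any integer type; it is only the absolute timestamp that needs 64 bits.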