I have a program that records data from a serial port. Every so often, I want to split the files so that the data logs don't grow too large. The problem is that after I recreate the FILE* and try to write to it, the program crashes. There are no compiler errors or warnings beforehand either...
The program does create one log for the first time interval, but once it's time to create a new data log, it crashes at the fwrite.
First off, initializations/declarations.
char * DATA_DIR = "C:\DATA";
sprintf(path,"%s%s%s",DATA_DIR,curtime,".log"); //curtime is just the current time in a string
FILE * DATA_LOG = fopen(path, "wb+");
And later on in a while loop
if(((CURRENT_TIME-PREVIOUS_TIME) > (SEC_IN_MINUTE * MINUTE_CHUNKS) ) && (MINUTE_CHUNKS != 0) && FIRST_TIME == 0) //all this does is just check if it's time to make a new file
{
fclose(DATA_LOG); //end the current fileread
char * path;
char curtime[16];
//gets the current time and saves it to a file name
sprintf(curtime , "%s" , currentDateTime());
sprintf(path,"%s%s%s",DATA_DIR,curtime,".log");
DATA_LOG = fopen(path, "wb+"); //open the new file
//just some logic (not relevant to problem)
PREVIOUS_TIME = CURRENT_TIME;
newDirFlag = 1;
}
fwrite(cdata , sizeof(char) , numChars , DATA_LOG); //crashes here. cdata, sizeof, and numChars don't change values
Any ideas why this is happening? I'm stumped.
A couple of problems: path has no memory allocated, so you're writing to some random memory address, which is bad. You should also check the return values of fopen and fwrite for errors; if there is one, use perror so you know what the problem is. It's likely the fopen is failing, or you're corrupting your stack by writing to path.
Also, use snprintf; it's much safer than plain sprintf, which is vulnerable to buffer overflow.
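For illustration, a minimal sketch of those two points, assuming currentDateTime() returns a C string as it does in your snippet (the buffer sizes are arbitrary):
char path[260]; //fixed-size buffer instead of an unallocated pointer
char curtime[16];
snprintf(curtime, sizeof(curtime), "%s", currentDateTime());
snprintf(path, sizeof(path), "%s%s%s", DATA_DIR, curtime, ".log");
FILE *log = fopen(path, "wb+");
if (log == NULL) {
    perror("fopen"); //tells you why the open failed
} else if (fwrite(cdata, sizeof(char), numChars, log) != numChars) {
    perror("fwrite");
}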
EDIT: just saw your comment that it's C++. Why not use std::string and fstream instead? They are much safer than what you're currently doing (and probably easier).
Your MAIN problem is that char * path; has no memory assigned to it. This means that you are writing to some RANDOM [1] location in memory.
I would suggest that you use char path[PATH_MAX]; - that way you don't have to worry about allocating and later deallocating the storage for your path.
Alternatively, you could use:
stringstream ss;
ss << DATA_DIR << currentDateTime() << ".log";
string path = ss.str();
DATA_LOG = fopen(path.c_str(), "wb+");
which is a more C++ style solution.
[1] By random, I don't mean truly a random number, but some unknown value that happens to be in that location on the stack. It is almost always NOT a good place to store a string.
I am writing an HTTP server in C++, and serving static files is mostly OK; however, when reading .PNG files or other binaries, every method I have tried fails. My main problem is that when I open up Dev tools, requesting an example image gives a transferred size of 29.56 kb and a size of 29.50 kb with my current method. The sizes given also do not match what du -sh gives, which is 32 kb.
My first method was to push the contents of the file onto a string and call a function to serve that. However, this would also only serve ~6 kb, if memory serves correctly.
My current method is to read the file using std::ifstream in binary mode. I am getting the size of the file using C++17's filesystem header and using std::filesystem::file_size. I read the contents into a buffer and then call a function to send the buffer contents 1 byte at a time
void WebServer::sendContents(std::string contents) {
if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
throw std::runtime_error("Server accept: " + std::string(strerror(errno)));
}
}
void WebServer::sendFile(std::string path) {
path = "./" + path;
std::string fileCont; //File contents
std::string mimeType; //Mime type of the file
std::string contLength;
std::string::size_type idx = path.rfind('.');
if (idx != std::string::npos) mimeType = this->getMimeType(path.substr(idx + 1));
else mimeType = "text/html";
std::filesystem::path reqPath = std::filesystem::path("./" + path).make_preferred();
std::filesystem::path parentPath = std::filesystem::path("./");
std::filesystem::path actualPath = std::filesystem::canonical(parentPath / reqPath);
if (!this->isSubDir(actualPath, parentPath)) { this->sendRoute("404"); return; }
std::ifstream ifs;
ifs.open(actualPath, std::ios::binary);
if (ifs.is_open()) {
//Get the size of the static file being server
std::filesystem::path staticPath{path};
std::size_t length = std::filesystem::file_size(staticPath);
char* buffer = new char[length];
*buffer = { 0 }; //Initalize the buffer that will send the static file
ifs.read(buffer, sizeof(char) * length); //Read the buffer
std::string resp = "HTTP/1.0 200 OK\r\n"
"Server: webserver-c\r\n"
"Content-Length" + std::to_string(length) + "\r\n"
"Content-type: " + mimeType + "\r\n\r\n";
if (!ifs) std::cout << "Error! Only " << std::string(ifs.gcount()) << " could be read!" << std::endl;
this->sendContents(resp); //Send the headers
for (size_t i=0; i < length; i++) {
std::string byte = std::string(1, buffer[i]);
this->sendContents(byte);
}
delete buffer; //We do not need megs of memory stack up, that shit will grow quick
buffer = nullptr;
} else {
this->sendContents("HTTP/1.1 500 Error\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n"); return;
}
ifs.close();
}
It should be noted that this->newFd is a socket descriptor
It should also be noted that I have tried to take a look at this question here, however the same problem still occurs for me
if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
There are two bugs for the price of one, here.
This is used to send the contents of the binary file, one byte at a time. That is horribly inefficient, but it's not the bug. The first bug is as follows.
Your binary file has plenty of bytes that are 00.
In that case, contents will proudly contain this 00 byte, and c_str() returns a pointer to it. strlen() then reaches the conclusion that it is receiving an empty string for input, and makes the grandiose announcement that the string contains 0 characters.
In the end, send's third parameter will be 0.
No bytes will get sent, at all, instead of the famous 00 byte.
The second bug will come into play once the inefficient algorithm gets fixed, and sendContents gets used to send more than one byte at a time.
send() holds a secret: this system call may return values other than -1 to indicate failure, such as the actual number of bytes that were sent. So, if send() was called to send, say, 100 bytes, it may decide to send only 30 bytes, return 30, and leave you holding the bag with the remaining 70 unsent bytes.
This is actually, already, an existing bug in the shown code. sendContents() also gets used to send the entire resp string, which is, eh, in the neighborhood of 100 bytes. Give or take a dozen.
You are relying on this house of cards: on send() always doing its complete job, in this particular case, not slacking off, and actually sending the entire HTTP/1.0 response string.
But, send() is a famous slacker, and you have no guarantees, whatsoever, that this will actually happen. And I have it on good authority that an upcoming Friday the 13th your send() will decide to slack off, all of a sudden.
So, to fix the shown code:
Implement the appropriate logic to handle the return value from send().
Do not use c_str() followed by strlen(), because: A) it's broken for strings containing binary data, and B) this elaborate routine simply reinvents a wheel called size(). You will be happy to know that size() does exactly what its name claims.
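A minimal sketch of a fixed sendContents() along those lines, using size() and looping until everything has been written (untested, keeping your member names):
void WebServer::sendContents(const std::string &contents) {
    const char *data = contents.data();
    size_t remaining = contents.size(); //size() is binary-safe, unlike strlen()
    while (remaining > 0) {
        ssize_t sent = send(this->newFd, data, remaining, 0);
        if (sent == -1) {
            throw std::runtime_error("send: " + std::string(strerror(errno)));
        }
        data += sent;      //advance past the bytes that actually went out
        remaining -= sent; //and keep going until nothing is left
    }
}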
One other bug:
char* buffer = new char[length];
It is possible for an exception to get thrown from the subsequent code. This memory gets leaked, because delete never gets called.
C++ gurus know a weird trick: they rarely use new, but instead use containers, like std::vector, and they don't have to worry about leaking memory, because of that.
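For example, the buffer in the shown code could become something like this (just a sketch; the surrounding logic stays as it is):
std::vector<char> buffer(length); //freed automatically, even if an exception is thrown
ifs.read(buffer.data(), buffer.size());
//... build resp and send it ...
this->sendContents(std::string(buffer.begin(), buffer.end()));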
RESOLVED
I'm trying to make a simple file loader.
I aim to get the text from a shader file (plain text file) into a char* that I will compile later.
I've tried this function:
char* load_shader(char* pURL)
{
FILE *shaderFile;
char* pShader;
// File opening
fopen_s( &shaderFile, pURL, "r" );
if ( shaderFile == NULL )
return "FILE_ER";
// File size
fseek (shaderFile , 0 , SEEK_END);
int lSize = ftell (shaderFile);
rewind (shaderFile);
// Allocating size to store the content
pShader = (char*) malloc (sizeof(char) * lSize);
if (pShader == NULL)
{
fputs ("Memory error", stderr);
return "MEM_ER";
}
// copy the file into the buffer:
int result = fread (pShader, sizeof(char), lSize, shaderFile);
if (result != lSize)
{
// size of file 106/113
cout << "size of file " << result << "/" << lSize << endl;
fputs ("Reading error", stderr);
return "READ_ER";
}
// Terminate
fclose (shaderFile);
return 0;
}
But as you can see in the code, I have a strange size difference at the end of the process, which makes my function crash.
I must say I'm quite a beginner in C, so I might have missed some subtleties regarding memory allocation, types, pointers...
How can I solve this size issue?
EDIT 1:
First, I shouldn't return 0 at the end but pShader; that seemed to be what crashed the program.
Then, I changed the type of result to size_t, and added an end character to pShader by adding pShader[result] = '\0'; after its declaration, so I can display it correctly.
Finally, as @JamesKanze suggested, I turned fopen_s into fopen, as the former was not useful in my case.
First, for this sort of raw access, you're probably better off using the system level functions: CreateFile or open, ReadFile or read and CloseHandle or close, with GetFileSize or stat to get the size. Using FILE* or std::filebuf will only introduce an additional level of buffering and processing, for no gain in your case.
As to what you are seeing: there is no guarantee that an ftell will return anything exploitable as a numeric value; it could very well be just a magic cookie. On most current systems, it is a byte offset into the physical file, but on any non-Unix system, the offset into the physical file will not map directly to the logical file you are reading unless you open the file in binary mode. If you use "rb" to open the file, you'll probably see the same values. (Theoretically, you could get extra 0's at the end of the file, but practically, the OS's where that happened are either extinct, or only used on legacy mainframes.)
EDIT:
Since the answer stating this has been deleted: you should loop on the fread until it returns 0 (setting errno to 0 before each call, and checking it after the return to see whether the function returned because of an error or because it reached the end of file). Having said this: if you're on one of the usual Windows or Unix systems, and the file is local to the machine, and not too big, fread will read it all in one go. The difference in size you are seeing (given the numerical values you posted) is almost certainly due to the fact that the two byte Windows line endings are being mapped to a single '\n' character. To avoid this, you must open in binary mode; alternatively, if you really are dealing with text (and want this mapping), you can just ignore the extra bytes in your buffer, setting the '\0' terminator after the last byte actually read.
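A corrected sketch along those lines (binary mode, looping on fread, and a '\0' terminator; most error handling trimmed for brevity, and the caller is assumed to free() the result):
char* load_shader(const char* pURL)
{
    FILE* shaderFile = fopen(pURL, "rb"); //binary mode: no CRLF translation
    if (shaderFile == NULL)
        return NULL;
    fseek(shaderFile, 0, SEEK_END);
    long lSize = ftell(shaderFile); //byte size of the physical file
    rewind(shaderFile);
    char* pShader = (char*) malloc(lSize + 1); //+1 for the terminating '\0'
    if (pShader == NULL) {
        fclose(shaderFile);
        return NULL;
    }
    size_t total = 0;
    while (total < (size_t) lSize) { //loop until fread stops producing bytes
        size_t n = fread(pShader + total, 1, lSize - total, shaderFile);
        if (n == 0)
            break; //error or end of file
        total += n;
    }
    pShader[total] = '\0';
    fclose(shaderFile);
    return pShader;
}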
I have this code that works fine:
FILE *fp;
fp = fopen(filename.c_str(), "rb");
char id[5];
fread(id,sizeof(char),4,fp);
Now I've changed something in my architecture: instead of the filename as the full path of the file, I have a char pointer that contains the data of the file... so I don't need to open anything (fopen, etc.), only to read from the char* buffer...
how can I do this?
thanks in advance
If I'm understanding your question correctly, you want to access a four character ID somewhere in the middle of your buffer. The easiest way to do this is just to copy the data into a new buffer and add a NULL terminator.
size_t index = 0;
// ...
char id[5];
memcpy(id, &myData[index], 4);
id[4] = '\0';
index += 4;
You can then read through your buffer sequentially by updating the index value every time you read something.
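A small helper along those lines (purely illustrative; myData is assumed to be the char* holding your file data):
size_t index = 0; //cursor into the in-memory file data

//Copy 'count' bytes from the buffer into 'dest' and advance the cursor,
//mimicking what fread() did on the FILE*.
void readFromBuffer(const char* myData, void* dest, size_t count) {
    memcpy(dest, myData + index, count);
    index += count;
}

//usage, inside whatever function does the reading; equivalent to fread(id, sizeof(char), 4, fp):
char id[5];
readFromBuffer(myData, id, 4);
id[4] = '\0';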
char id[5];
strncpy(id,bfr,4);
id[4]='\0';
Where bfr is the buffer with your file data.
I'd also strongly advise you to read the chapter on pointers and strings in K&R: The C Programming Language.
I wrote an application which processes data on the GPU. The code works well, but I have the problem that reading the input file (~3GB, text) is the bottleneck of my application. (The read from the HDD is fast, but the line-by-line processing is slow.)
I read a line with getline(), copy line 1 to a vector, copy line 2 to a vector, and skip lines 3 and 4. And so on for the rest of the 11 million lines.
I tried several approaches to read the file in the best time possible:
Fastest method I found is using boost::iostreams::stream
Others were:
Reading the file as gzip, to minimize IO, but this is slower than reading it directly.
Copying the file to RAM with read(filepointer, chararray, length) and processing it with a loop to split the lines (also slower than boost).
Any suggestions how to make it run faster?
void readfastq(char *filename, int SRlength, uint32_t blocksize){
_filelength = 0; //total datasets (each 4 lines)
_SRlength = SRlength; //length of the 2. line
_blocksize = blocksize;
boost::iostreams::stream<boost::iostreams::file_source>ins(filename);
in = ins;
readNextBlock();
}
void readNextBlock() {
timeval start, end;
gettimeofday(&start, 0);
string name;
string seqtemp;
string garbage;
string phredtemp;
_seqs.empty();
_phred.empty();
_names.empty();
_filelength = 0;
//read only a part of the file i.e the first 4mio lines
while (std::getline(in, name) && _filelength<_blocksize) {
std::getline(in, seqtemp);
std::getline(in, garbage);
std::getline(in, phredtemp);
if (seqtemp.size() != _SRlength) {
if (seqtemp.size() != 0)
printf("Error on read in fastq: size is invalid\n");
} else {
_names.push_back(name);
for (int k = 0; k < _SRlength; k++) {
//handle special letters
if(seqtemp[k]== 'A') ...
else{
_seqs.push_back(5);
}
}
_filelength++;
}
}
EDIT:
The source-file is downloadable under https://docs.google.com/open?id=0B5bvyb427McSMjM2YWQwM2YtZGU2Mi00OGVmLThkODAtYzJhODIzYjNhYTY2
I changed the function readfastq to read the file, because of some pointer problems. So if you call readfastq the blocksize (in lines) must be bigger than the number of lines to read.
SOLUTION:
I found a solution which gets the read-in time for the file down from 60 sec to 16 sec. I removed the inner loop which handles the special characters and do this on the GPU instead. This decreases the read-in time and only minimally increases the GPU running time.
Thanks for your suggestions.
void readfastq(char *filename, int SRlength) {
_filelength = 0;
_SRlength = SRlength;
size_t bytes_read, bytes_expected;
FILE *fp;
fp = fopen(filename, "r");
fseek(fp, 0L, SEEK_END); //go to the end of file
bytes_expected = ftell(fp); //get filesize
fseek(fp, 0L, SEEK_SET); //go to the beginning of the file
fclose(fp);
if ((_seqarray = (char *) malloc(bytes_expected/2)) == NULL) //allocate space for file
err(EX_OSERR, "data malloc");
string name;
string seqtemp;
string garbage;
string phredtemp;
boost::iostreams::stream<boost::iostreams::file_source>file(filename);
while (std::getline(file, name)) {
std::getline(file, seqtemp);
std::getline(file, garbage);
std::getline(file, phredtemp);
if (seqtemp.size() != SRlength) {
if (seqtemp.size() != 0)
printf("Error on read in fastq: size is invalid\n");
} else {
_names.push_back(name);
strncpy( &(_seqarray[SRlength*_filelength]), seqtemp.c_str(), seqtemp.length()); //do not handle special letters here, do on GPU
_filelength++;
}
}
}
First, instead of reading the file into memory, you may work with file mappings. You just have to build your program as 64-bit to fit 3GB into your virtual address space (for a 32-bit application only 2GB is accessible in user mode). Or alternatively you may map and process your file in parts.
Next, it sounds to me like your bottleneck is "copying a line to a vector". Dealing with vectors involves dynamic memory allocation (heap operations), which in a critical loop hurts performance very seriously. If this is the case, either avoid using vectors, or make sure they're declared outside the loop. The latter helps because when you reallocate/clear vectors, they do not free memory.
Post your code (or a part of it) for more suggestions.
EDIT:
It seems that all your bottlenecks are related to string management.
std::getline(in, seqtemp); reading into an std::string deals with the dynamic memory allocation.
_names.push_back(name); This is even worse. First, the std::string is placed into the vector by value, meaning the string is copied, hence another dynamic allocation/freeing happens. Moreover, when the vector is eventually reallocated internally, all the contained strings are copied again, with all the consequences.
I recommend using neither standard formatted file I/O functions (Stdio/STL) nor std::string. To achieve better performance you should work with pointers to strings (rather than copied strings), which is possible if you map the entire file. Plus you'll have to implement the file parsing (division into lines).
Like in this code:
class MemoryMappedFileParser
{
const char* m_sz;
size_t m_Len;
public:
struct String {
const char* m_sz;
size_t m_Len;
};
bool getline(String& out)
{
out.m_sz = m_sz;
const char* sz = (char*) memchr(m_sz, '\n', m_Len);
if (sz)
{
size_t len = sz - m_sz;
m_sz = sz + 1;
m_Len -= (len + 1);
out.m_Len = len;
// for Windows-format text files remove the '\r' as well
if (len && '\r' == out.m_sz[len-1])
out.m_Len--;
} else
{
out.m_Len = m_Len;
if (!m_Len)
return false;
m_Len = 0;
}
return true;
}
};
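The mapping itself isn't shown above; on a POSIX system it could be set up roughly like this (Windows would use CreateFileMapping/MapViewOfFile instead; error handling kept minimal, and a parser constructor taking the pointer and length is assumed):
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

//Map the whole file read-only; the returned pointer plus 'length' is what the
//parser above would be constructed from. Unmap with munmap when done.
const char* map_file(const char* filename, size_t& length)
{
    int fd = open(filename, O_RDONLY);
    if (fd == -1)
        return NULL;
    struct stat st;
    fstat(fd, &st);
    length = st.st_size;
    const char* data = (const char*) mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); //the mapping stays valid after closing the descriptor
    return data == MAP_FAILED ? NULL : data;
}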
If _seqs and _names are std::vectors and you can guess their final size before processing the whole 3GB of data, you can use reserve to avoid most of the memory re-allocation while pushing back new elements in the loop.
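For example, with the ~11 million input lines (one record per 4 lines) mentioned in the question, something like this before the loop (the counts are only estimates):
_names.reserve(11000000 / 4); //roughly one name per 4-line record
_seqs.reserve((11000000 / 4) * _SRlength); //one element per base of line 2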
You should be aware of the fact that the vectors effectively produce another copy of parts of the file in main memory. So unless you have a main memory sufficiently large to store the text file plus the vector and its contents, you will probably end up with a number of page faults that also have a significant influence on the speed of your program.
You are apparently using <stdio.h>, since you are using getline.
Perhaps fopen-ing the file with fopen(path, "rm"); might help, because the m (it is a GNU extension) tells it to use mmap for reading.
Perhaps setting a big buffer (e.g. half a megabyte) with setbuffer could also help; see the sketch below.
Probably, using the readahead system call (in a separate thread perhaps) could help.
But all these are guesses. You should really measure things.
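A sketch of those two suggestions together (setvbuf is the portable cousin of setbuffer; the "m" flag is glibc-specific, and path stands for your filename):
FILE *fp = fopen(path, "rm"); //"m": glibc hint to back reads with mmap
if (fp != NULL) {
    static char iobuf[512 * 1024]; //half-megabyte stdio buffer
    setvbuf(fp, iobuf, _IOFBF, sizeof(iobuf));
    //... getline()/fread() on fp as before ...
}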
General suggestions:
Code the simplest, most straight-forward, clean approach,
Measure,
Measure,
Measure,
Then if all else fails:
Read raw bytes (read(2)) in page-aligned chunks; see the sketch after this list. Do so sequentially, so the kernel's read-ahead plays to your advantage.
Re-use the same buffer to minimize cache flushing.
Avoid copying data, parse in place, pass around pointers (and sizes).
mmap(2)-ing [parts of the] file is another approach. This also avoids kernel-userland copy.
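A rough shape of that read(2) loop (POSIX calls; filename and the chunk size are placeholders):
#include <fcntl.h>
#include <unistd.h>

int fd = open(filename, O_RDONLY);
const size_t CHUNK = 64 * 4096; //a multiple of the page size
static char buf[CHUNK]; //one buffer, re-used for every chunk
ssize_t n;
while ((n = read(fd, buf, CHUNK)) > 0) {
    //scan the n bytes in place for '\n', hand out pointer + length pairs,
    //and carry any partial trailing line over to the next iteration
}
close(fd);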
Depending on your disk speed, using a very fast decompression algorithm might help, like fastlz (there are at least two others that might be more efficient, but they are under the GPL, so the licence can be a problem).
Also, using C++ data structures and functions can increase the speed, as you can maybe achieve better compile-time optimization. Going the C way isn't always the fastest! In some bad cases, using char* means you need to parse the whole string to reach the \0, yielding disastrous performance.
For parsing your data, using boost::spirit::qi is also probably the most optimized approach http://alexott.blogspot.com/2010/01/boostspirit2-vs-atoi.html
I am trying to get the entire raw header into a file, but every time I attempt to write the contents I get a file full of ÌÌÌÌÌÌÌÌÌÌÌ. What am I doing wrong?
DWORD CTryISAPIFilter::OnPreprocHeaders(CHttpFilterContext* httpContext,
PHTTP_FILTER_PREPROC_HEADERS headerInformation)
{
char buffer[4096];
DWORD bufferSize = sizeof(buffer);
BOOL HeaderBoolean = headerInformation->GetHeader(httpContext->m_pFC, "ALL_RAW", buffer, &bufferSize);
char * ptrIn = (char *) buffer;
std::string postData2 = ptrIn;
char * outputString = new char[4096];
int i = 0;
for(i=0;i<4096;i++){
outputString[i] = postData2[i];
}
outputString[i+1] = NULL;
std::ofstream outfile ("D:\\WebSites\\wwwroot\\test.txt",std::ios::app);
outfile << outputString << std::endl;
outfile.close();
return SF_STATUS_REQ_NEXT_NOTIFICATION;
}
Is headerInformation->GetHeader() returning success?
If so, how much is it actually writing into buffer (presumably it tells you this in the value it places in bufferSize)?
I suspect that GetHeader() is failing, and nothing is being written to buffer because:
you're getting all "ÌÌÌÌÌÌÌÌÌÌÌ" characters (which is what the debug builds of VC will set uninitialized memory to), and
you're not getting an exception thrown when you index postData2 well past what should usually be the end of the string (in most cases anyway). So there's apparently no '\0' terminator in buffer (which GetHeader() will write if it succeeds).
You need to check for this failure and examine GetLastError() to get more information on what the failure is.
Update: Your buffer might not be large enough. See http://msdn.microsoft.com/en-us/magazine/cc163939.aspx for how to appropriately size the buffer.
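A sketch of that check-and-retry pattern (the ISAPI details here are from memory, so treat the error-code handling as an assumption to verify against the article above):
char buffer[4096];
DWORD bufferSize = sizeof(buffer);
if (!headerInformation->GetHeader(httpContext->m_pFC, "ALL_RAW", buffer, &bufferSize)) {
    DWORD err = GetLastError();
    if (err == ERROR_INSUFFICIENT_BUFFER) {
        //bufferSize now holds the required length: retry with a bigger buffer
        std::vector<char> bigBuffer(bufferSize);
        if (headerInformation->GetHeader(httpContext->m_pFC, "ALL_RAW", &bigBuffer[0], &bufferSize)) {
            std::ofstream outfile("D:\\WebSites\\wwwroot\\test.txt", std::ios::app);
            outfile.write(&bigBuffer[0], bufferSize);
        }
    }
    //otherwise, log err and bail out
}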
Update 2: It's been a while since I've done web stuff, but isn't "ALL_RAW" a CGI-style server environment variable rather than a header? Shouldn't you retrieve this using GetServerVariable() instead of GetHeader()?