I have a multithreaded application running on Windows XP. At a certain stage one of the threads fails to open an existing file using the fopen function. _get_errno returns EMFILE, which means "Too many open files. No more file descriptors are available." FOPEN_MAX for my platform is 20, and _getmaxstdio returns 512. I checked with WinDbg and I see that about 100 files are open:
788 Handles
Type Count
Event 201
Section 12
File 101
Port 3
Directory 3
Mutant 32
WindowStation 2
Semaphore 351
Key 12
Thread 63
Desktop 1
IoCompletion 6
KeyedEvent 1
What is the reason that fopen fails?
EDIT:
I wrote a simple single-threaded test application. This app can open 510 files. I don't understand why it can open more files than the multithreaded app. Could this be caused by file handle leaks?
#include <cstdio>
#include <cassert>
#include <cerrno>

int main()
{
    int counter(0);
    while (true)
    {
        char buffer[256] = {0};
        sprintf(buffer, "C:\\temp\\abc\\abc%d.txt", counter++);
        FILE* hFile = fopen(buffer, "wb+");
        if (0 == hFile)
        {
            // fopen failed: check the error code and the stdio limit
            int err(0);
            errno_t ret = _get_errno(&err);
            assert(0 == ret);
            int maxAllowed = _getmaxstdio();
            assert(hFile); // stop here so err and maxAllowed can be inspected
        }
    }
    return 0;
}
I guess this is a limitation of your operating system. It can depend on many things: the way the file descriptors are represented, the memory they consume, and so on.
And I suppose there isn't much you can do about it. Perhaps there is some parameter to tweak that limit.
The real question is, do you really need to open that many files simultaneously? I mean, even if you have 100+ threads trying to read 100+ different files, they probably won't be able to read them all at the same time, and you'll probably not get any better result than you would with, say, 50 threads.
It's difficult to be more precise since we don't know what you are trying to achieve.
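If the limit that matters here is the CRT stream table, a minimal sketch of the "parameter to tweak" idea, assuming the MSVC CRT, would be _setmaxstdio (the value 2048 is just an illustration):

#include <cstdio>   // on MSVC this also declares _getmaxstdio / _setmaxstdio

int raise_stdio_limit()
{
    int old_limit = _getmaxstdio();    // e.g. 512 by default
    if (_setmaxstdio(2048) == -1)      // ask the CRT for a larger stream table
        return old_limit;              // request refused, keep the old limit
    return _getmaxstdio();
}

This only affects the stdio layer; the underlying Win32 handle limit is separate and much higher.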
I think that on Win32 all the CRT functions eventually end up using the Win32 API underneath. So in this case fopen is most probably using CreateFile/OpenFile. Now the CreateFile/OpenFile APIs are not meant only for files (they also handle directories, communication ports, pipes, mail slots, drive volumes, etc.), so in a real application, depending on the number of these resources, your maximum number of open files may vary. Since you have not described much about the application, this is my first guess. If time permits, go through this: http://blogs.technet.com/b/markrussinovich/archive/2009/09/29/3283844.aspx
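To illustrate the layering described above, here is a hedged sketch of writing through the Win32 layer directly; handles opened this way never occupy a CRT FILE slot, so FOPEN_MAX and _getmaxstdio do not apply to them (the function name and parameters are made up for illustration):

#include <windows.h>

bool append_line(const wchar_t* path, const char* text, DWORD len)
{
    HANDLE h = CreateFileW(path, FILE_APPEND_DATA, FILE_SHARE_READ, NULL,
                           OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return false;                   // the Win32 error is in GetLastError()
    DWORD written = 0;
    BOOL ok = WriteFile(h, text, len, &written, NULL);
    CloseHandle(h);                     // every CreateFile needs a CloseHandle
    return ok != FALSE;
}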
Related
I have a shell script running on Linux that takes a photo with a webcam every 30 s and stores the result in a directory, overwriting the old image.
This photo is then read by a program written in C++, but since that program runs asynchronously to the script, I would like to use a semaphore-like flag in the script: shortly before updating the photo (say 5 s before), the script would mark the picture as unavailable to any other program.
The first thing that came to my mind was having the script simply write this information to a text file and the C++ program read it back to know the status, but that struck me as a rough solution.
Another thing I thought of was having the C++ program itself launch the script at 30 s intervals, for example with exec("./<script>.sh"), but since this program is not always running, that won't work.
So I wonder how to use some kind of "dynamic" system variable (if such a thing exists), something that acts like RAM, so that I can use it as a flag where the script indicates a status and the program then reads it.
Is it possible?
Based on the tips above, I arrived at a solution based on named pipes. The idea behind this is that FIFOs are essentially treated by the operating system as files, so they can be manipulated with the same standard system libraries that are already available. Just sharing the CPP code used for testing:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#define FIFO_FILE "MYFIFO"

int main(void)
{
    FILE *fp;
    char readbuf[80];
    int StrCnt = 0;

    /* Create the FIFO if it does not exist */
    umask(0);
    mknod(FIFO_FILE, S_IFIFO | 0666, 0);

    while (1)
    {
        /* fopen blocks until a writer opens the FIFO */
        fp = fopen(FIFO_FILE, "r");
        if (fp != NULL)
        {
            fgets(readbuf, 80, fp);
            printf("Received %dth string : %s\n", StrCnt++, readbuf);
            fclose(fp);   /* closing the read end leaves the FIFO empty */
            // remove(FIFO_FILE);
        }
    }
    return 0;
}
I can put values into MYFIFO with the following command:
echo "abcde12345" > MYFIFO
And the ls -al command shows the content, 10 characters plus the trailing newline added by echo, if the code is not running:
-rw-rw-r-- 1 esp8266 esp8266 11 10 22 16:10 MYFIFO
Whereas if the code is running, the reader drains the FIFO as soon as data arrives, so it shows as empty:
prw-rw-rw- 1 esp8266 esp8266 0 10 22 16:12 MYFIFO
I am currently working on porting some code to VxWorks, so I use the simulator to validate my changes.
This code requires opening many pipes and sockets. I have a problem with opening these file descriptors: I can open 17 file descriptors (sockets and pipes cause the same error), but the next open returns the error "EMFILE: too many open files".
After some research on the net, I modified the global variable NUM_FILES, but this change had no effect.
Do you know if it is the simulator that limits the number of file descriptors opened simultaneously?
Thank you for your help.
I also had problems with not enough file descriptors being available. Setting NUM_FILES to 50 or so solved the problem. The limitation is within the VxWorks kernel, which statically allocates the file descriptor table.
As far as I know, changing NUM_FILES requires the kernel to be recompiled, since it is a kernel configuration value.
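As a hedged sketch of where that change would go (assuming a classic BSP-style build where kernel parameters are overridden in config.h, or the equivalent entry in the project facility):

/* config.h (or the corresponding kernel configuration parameter):
   override the default and rebuild the kernel / simulator image. */
#undef  NUM_FILES
#define NUM_FILES 50    /* size of the statically allocated fd table */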
You can count the number of free file descriptors by compiling and executing the following function on the VxWorks shell:
int countFreeFds(void)
{
    int count = 0;
    int i;
    FILE *fd[100];

    for (count = 0; count < 100; count++)
    {
        fd[count] = fopen("somefile", "r"); /* use any existing file here */
        if (fd[count] == NULL)
        {
            break;
        }
    }
    for (i = (count - 1); i >= 0; i--)
    {
        fclose(fd[i]);
    }
    return (count);
}
If you do that on a freshly started VxWorks with no further binaries loaded and no extra tasks started, countFreeFds will return a number close to NUM_FILES.
(Also note that I haven't tested the function above, since right now I don't have access to the source I used some years ago. You may also want to modify the code to use sockets or pipes instead, but as far as free file descriptors are concerned it makes no difference.)
I found the problem: I had to modify RTP_FD_NUM_MAX. It is an RTP-specific value.
The company I'm working with has a program written in ye olde VB6, which is updated pretty frequently, and most clients run the executable from a mapped network drive. This actually has surprisingly few issues, the biggest of which is automatic updates. Currently the updater program (written in C++) renames the existing exe, then downloads and places the new version in the old version's place. This generally works fine, but in some environments it simply fails.
The solution is running this command from Microsoft:
for /f "skip=4 tokens=1" %a in ('net files') do net files %a /close
This command closes all network files that are shared (well... most) and then the updater can replace the exe.
In C++ I can use the system() function to run that command, or I could redirect the output of net files and iterate through the results looking for the particular file in question, then run net file /close to close it. But it would be much nicer if there were WinAPI functions with similar capabilities, for better reliability and future safety.
Is there any way for me to programmatically find all network shared files and close relevant ones?
You can programmatically do what net file /close does. Just include lmshare.h and link against Netapi32.lib. You have two functions to use: NetFileEnum to enumerate all open network files (on a given computer) and NetFileClose to close them.
Quick (it assumes the program is running on the same server and there are not too many open connections, see the last paragraph) and dirty (no error checking) example:
FILE_INFO_2* pFiles = NULL;
DWORD nRead = 0, nTotal = 0;

NetFileEnum(
    NULL,                   // servername, NULL means localhost
    L"c:\\directory\\path", // basepath, directory where the VB6 program lives
    NULL,                   // username, searches for all users
    2,                      // level, we just need the resource ID
    (LPBYTE*)&pFiles,       // bufptr, a double pointer used to receive the buffer
    MAX_PREFERRED_LENGTH,   // prefmaxlen, collect as much as possible
    &nRead,                 // entriesread, number of entries stored in pFiles
    &nTotal,                // totalentries, ignore this
    NULL                    // resume_handle, ignore this
);

for (DWORD i = 0; i < nRead; ++i)
    NetFileClose(NULL, pFiles[i].fi2_id);

NetApiBufferFree(pFiles);
Refer to MSDN for details about NetFileEnum and NetFileClose. Note that NetFileEnum may return ERROR_MORE_DATA if more data is available.
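If ERROR_MORE_DATA does come back, a hedged sketch of the enumeration loop with a resume handle might look like this (same lmshare.h / Netapi32 setup as above, error handling still minimal; the function name is made up for illustration):

#include <windows.h>
#include <lm.h>    // NetFileEnum, NetFileClose, NetApiBufferFree

void close_all_open_network_files(LPWSTR basePath)
{
    DWORD_PTR resume = 0;
    NET_API_STATUS status;
    do
    {
        FILE_INFO_2* pFiles = NULL;
        DWORD nRead = 0, nTotal = 0;
        status = NetFileEnum(NULL, basePath, NULL, 2,
                             (LPBYTE*)&pFiles, MAX_PREFERRED_LENGTH,
                             &nRead, &nTotal, &resume);
        if (status == NERR_Success || status == ERROR_MORE_DATA)
        {
            for (DWORD i = 0; i < nRead; ++i)
                NetFileClose(NULL, pFiles[i].fi2_id);
        }
        if (pFiles != NULL)
            NetApiBufferFree(pFiles);
    } while (status == ERROR_MORE_DATA);
}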
Guys, I am a beginner in threading and logging.
By the way, I am not a native English speaker, so pardon me if there are any mistakes in my English.
I have created a multithreaded program, where each thread uses a logging module like the following.
Each thread uses a different log file, so I believe the chance of a data conflict is zero.
__inline void print_logW(int _level, const wchar_t *domain, const wchar_t *msg)
{
    wchar_t mess[200] = _T("");
    if (_level <= traceLevel)
    {
        __time64_t timer;
        struct tm t_st;
        _time64(&timer);
        localtime_s(&t_st, &timer);

        if (domain == NULL)
        {
            domain = _T("");
        }
        if (msg != NULL)
        {
            if (showTimeStampFlag == true)
            {
                swprintf_s(mess, 200, _T("%s : %ld"), msg, GetTickCount());
            }
            else
            {
                wcscpy_s(mess, 200, msg);
            }
        }
        if (oldTime.tm_year != t_st.tm_year || oldTime.tm_mon != t_st.tm_mon || oldTime.tm_mday != t_st.tm_mday)
        {
            oldTime = t_st;
            print_log_preparebyDateW();
        }

        FILE* fp;
        errno_t err = _wfopen_s(&fp, this->m_pathW, _T("at+, ccs=UTF-8"));
        if (err != 0)
        {
            // error: could not open the log file
            return;
        }
        fwprintf_s(fp, m_logFormatW,
                   _level,
                   1900 + t_st.tm_year, t_st.tm_mon + 1, t_st.tm_mday,
                   t_st.tm_hour, t_st.tm_min, t_st.tm_sec,
                   domain, mess);
        fflush(fp);
        fclose(fp);
    }
}
When I look at the log output of the software, I found a problem where sometimes a thread becomes very slow (a step, such as getting a pointer to an image, that usually takes 16 ms at most would take 0.2 seconds or more to finish). I am still investigating the cause of this problem, but first I would like to know whether this logging module is thread safe or not.
By the way, for the parameters,
"_level" is the logging level to print or unprint the details of the process
I use "domain" to show the class where the logging is performed
"msg" is the content of the log (e.g. "process 1 started")
And as for the m_logFormatW,
m_logFormatW = _T("[%.2d][%.4d-%.2d-%.2dT%.2d:%.2d:%.2d][%s] %s\n");
If there is any question or anything unclear, feel free to ask.
As long as you are linking against the multithreaded runtime libraries and oldTime is not a global or static variable, your log function is thread safe. If oldTime is global or static, you will need to serialize access to it whenever you read or modify it, otherwise you risk a race condition. The only other thing that may not be thread safe is print_log_preparebyDateW, but it's hard to say since you haven't included its code. As long as oldTime is not global or static and all the runtime library functions you use are documented as thread safe in MSDN, you'll be OK.
The other problem I can see is when you open the file. If the file is already open and another thread attempts to log, the open call will fail and the information will be lost, because _wfopen_s opens the file without any sharing mode. You can fix this with a std::mutex that is locked while the file is open and unlocked after the file is closed.
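A minimal sketch of that idea, assuming a C++11 compiler; the mutex and function names are made up for illustration, and holding the lock across the whole call also protects shared state such as oldTime:

#include <mutex>
#include <cstdio>

std::mutex g_logMutex;   // one mutex shared by all threads that log to this file

void write_log_line(const wchar_t* path, const wchar_t* line)
{
    std::lock_guard<std::mutex> lock(g_logMutex);  // held while the file is open
    FILE* fp = NULL;
    if (_wfopen_s(&fp, path, L"at+, ccs=UTF-8") != 0)
        return;                                    // could not open, message is dropped
    fwprintf_s(fp, L"%s\n", line);
    fflush(fp);
    fclose(fp);                                    // lock released after this returns
}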
One possible reason your worker threads are taking longer to execute than expected is that opening the log file, writing the information, flushing the file and closing it can take a bit of extra time. This can happen any time file I/O occurs even when caching is involved. Usually you can reduce the time by opening the log file once and then closing it when your application terminates.
Another possible solution to reduce the time it takes your worker threads to execute is to use a pipe. In this scenario you write the log text to a pipe and have an additional thread that reads from the pipe and writes to the log file. This will eliminate any disk I/O that may occur when your worker threads log information. You may encounter some instances where the logging takes a bit of extra time if the pipe is full but it won't happen as often.
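The same hand-off idea can be sketched with a mutex-protected queue and a dedicated writer thread instead of an OS pipe; this is an assumption-laden illustration (C++11, made-up class name), not the only way to do it:

#include <condition_variable>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

class AsyncLogger
{
public:
    explicit AsyncLogger(const std::string& path)
        : m_out(path.c_str(), std::ios::app),
          m_stop(false),
          m_worker(&AsyncLogger::run, this) {}

    ~AsyncLogger()
    {
        { std::lock_guard<std::mutex> lock(m_mutex); m_stop = true; }
        m_cv.notify_one();
        m_worker.join();              // writer drains the queue before exiting
    }

    void log(const std::string& line) // called by the producer threads
    {
        { std::lock_guard<std::mutex> lock(m_mutex); m_queue.push(line); }
        m_cv.notify_one();
    }

private:
    void run()                        // only this thread ever touches the disk
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        for (;;)
        {
            m_cv.wait(lock, [this] { return m_stop || !m_queue.empty(); });
            while (!m_queue.empty())
            {
                std::string line = m_queue.front();
                m_queue.pop();
                lock.unlock();
                m_out << line << '\n';   // disk I/O happens outside the lock
                lock.lock();
            }
            if (m_stop)
                return;
        }
    }

    std::ofstream m_out;
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::queue<std::string> m_queue;
    bool m_stop;
    std::thread m_worker;
};

Worker threads then call log() instead of opening the file themselves, and the destructor joins the writer so queued messages are flushed before shutdown.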
Your approach is fine for a single-threaded application but will not work in a multithreaded environment, because you are not serializing the log message requests.
It's better to look at some well-written open-source logger classes, such as:
a) Asynchronous and synchronous logger - http://www.codeproject.com/Articles/288827/g2log-An-efficient-asynchronous-logger-using-Cplus
b) Simple thread-safe logger - http://cpplogging.codeplex.com/
c) Log4Cpp - http://log4cpp.sourceforge.net/
I have a C++ program that creates an output file "A" with ofstream. This file is then read by some legacy C code that opens the file with _iobuf. The legacy code then creates its own output file "B" using _iobuf, and this file is then read by the C++ program using ifstream. This sequence is iterated many times, with the same file names for A and B in each iteration.
Occasionally, the C++ program cannot open the output file A for writing, and I must try several times before it succeeds. This occurs nondeterministically, and maybe once in a thousand iterations. Note that the C program never has to wait to open its input or output file, nor does the C++ program ever have to wait to open its input file. This informal observation is based on hundreds of thousands of iterations.
I'm wondering if this has something to do with mixing ofstream and _iobuf in the same program? Both the C++ code and the C code are linked into the same program. And the legacy C code is technically C++ code, but was written in a very C-like style. Is there anything I can do to eliminate this waiting to open the ofstream file? I do not want to change the legacy code if I can possibly avoid it.
Pseudo code (not compiled):
void someObject::someMethod()
{
    for (int count = 0; count < someLimit; ++count)
    {
        newerObject::firstMethod();
        olderObject::secondMethod();
        newerObject::thirdMethod();
    }
}

void newerObject::firstMethod()
{
    // do some processing first
    // then write the results of the processing to a file
    ofstream A;
    A.open("A", ofstream::out); // this sometimes must be tried multiple times
    // write data to file A
    A.close();
}

void olderObject::secondMethod()
{
    FILE* f;
    f = fopen("A", "rt"); // this always works the first time
    // read data from file A
    fclose(f);
    // do some processing
    f = fopen("B", "w");
    // write data to file B
    fclose(f);
}

void newerObject::thirdMethod()
{
    ifstream B;
    B.open("B"); // this always works the first time
    // read data from file B
    B.close();
    // do some processing
}
Currently, as a work around, I put the ofstream::open in a do-while loop. I would love to get rid of this awkwardness. Thanks in advance for any advice you can give.
First off, the problem is almost certainly not the use of different methods to access the files: under the hood, the C and C++ I/O functions use the same system I/O facilities. You seem to be using Windows (on other systems files can typically be opened multiple times simultaneously), and I don't know much about that system, but I would suspect that the file system hasn't yet been updated to reflect that the file is closed when you try to open it. This may have to do with the "t" open flag: I don't know what it is about.
On UNIXes you can force the I/O operations to wait until the actual change on disk has completed. Something like this could help avoid the problem, but at the significant cost that operations become hideously slow. On UNIXes one approach would be to blow away the file system entry the moment the file has been opened successfully (after all, at that point its name isn't needed anymore):
if (FILE* fp = fopen("file", "r")) {
    remove("file");
    // do processing
}
However, if I recall correctly, on Windows you can neither remove the file nor rename it while it is open. Personally, in solving the problem I would proceed as follows:
Determine under which situations the file can't be opened, e.g. by keeping the file open and trying to open it again. This is mainly intended to create a setup where the problem is reproducible, so you can verify later that you have indeed found a solution.
Once I had found a way to reproduce the problem I would probably have a better idea of the actual root cause, and possibly googling would help. In any case this is the point where researching the root cause comes in.
Once the cause is understood it is hopefully easy to devise a solution. If not, retrying the open until it succeeds may very well be the right solution.
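If retrying does end up being the chosen fix, a hedged sketch of a bounded retry (assuming a C++11 compiler; the attempt count and sleep length are arbitrary illustration values) could replace the open-ended do-while:

#include <chrono>
#include <fstream>
#include <thread>

std::ofstream open_with_retry(const char* name, int attempts = 10)
{
    std::ofstream out;
    for (int i = 0; i < attempts; ++i)
    {
        out.open(name, std::ofstream::out);   // on success, open() clears the failbit
        if (out.is_open())
            break;                            // success
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    return out;   // caller still checks is_open() in case all attempts failed
}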