"too many open files" error after deleting many files - c++

My program creates a log file every 10 seconds in a specified directory. A different thread then iterates over the files in that directory: if a file has content, it compresses it and uploads it to external storage; if the file is empty, it deletes it. After the program runs for a while I get the error "too many open files" (gzopen failed, errno = 24).
When I looked inside /proc/<pid>/fd I saw many broken symlinks pointing to files in the same directory where the logs are created, with the word (deleted) next to each link.
Any idea what I am doing wrong? I checked the return values in both threads: of the close function in the thread that writes the logs, and of boost::filesystem::remove in the thread that compresses and uploads the non-empty log files and deletes the empty ones. All the return values are zero, yet the list of (deleted) links grows by 1 every 10 seconds.
I don't think this problem ever happened to me on 32-bit, but I recently moved to 64-bit and now I got this surprise.

You are neglecting to close files you open.
From your description, it sounds like you close the files you open for logging in your logging thread, but you go on to say that you just boost::filesystem::remove the files after compressing and/or uploading them.
Remember that:
Any compressed file you opened with gzopen has to be gzclosed (see the RAII sketch after this list).
Any uncompressed file you open in order to compress it has to be closed.
If you open a file to check whether it's empty, you have to close it.
If you open a file to transfer it, you have to close it.
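One way to make the gzclose automatic is a small RAII wrapper, so the handle is released on every path out of the compression routine. This is a minimal sketch assuming zlib's C API, not code from your program:

#include <zlib.h>
#include <stdexcept>
#include <string>

// Owns a gzFile; gzclose() runs on every exit path, including
// early returns and exceptions.
class GzFile {
public:
    GzFile(const std::string& path, const char* mode)
        : f_(gzopen(path.c_str(), mode)) {
        if (!f_) throw std::runtime_error("gzopen failed: " + path);
    }
    ~GzFile() { gzclose(f_); }
    GzFile(const GzFile&) = delete;            // one owner per handle
    GzFile& operator=(const GzFile&) = delete;
    gzFile get() const { return f_; }
private:
    gzFile f_;
};

The same idea applies to the uncompressed input: read it through a std::ifstream, which closes itself, rather than through a raw descriptor.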
The output of /proc/<pid>/fd would be very helpful in narrowing this down, but unfortunately you don't post it. Examples of how seemingly unhelpful output gives subtle hints:
# You forgot to gzclose the output file after compressing it
l-wx------ user group 64 Apr 9 10:17 43 -> /tmp/file.gz (deleted)
# You forgot to close the input file after compressing it
lr-x------ user group 64 Apr 9 10:17 43 -> /tmp/file (deleted)
# You forgot to close the log file after writing it
l-wx------ user group 64 Apr 9 10:17 43 -> /tmp/file (deleted)
# You forgot to close the input file after transferring it
lr-x------ user group 64 Apr 9 10:17 43 -> /tmp/file.gz (deleted)

Related

Concatenating text and binary data into one file

I am developing an application, and I have several pieces of data that I want to be able to save to and open from the same file. The first is several lines of essentially human-readable text that store simple data about certain attributes. This data is stored in AttributeList objects that support operator<< and operator>>. The rest are .png images which I have loaded into memory.
How can I save all this data to one file in such a way that I can then easily read it back into memory? Is there a way to store the image data in memory that will make this easier?
How can I save all this data to one file in such a way that I can then easily read it back into memory? Is there a way to store the image data in memory that will make this easier?
Yes.
In an embedded system I once worked on, the requirement was to capture the system configuration into a RAM file system (1 megabyte).
We used zlib to compress and 'merge' multiple files into a single storage file.
Perhaps any compression system can work for you. On Linux, I would use popen() to run gzip or gunzip, etc.
update 2017-08-07
In my popen demo (for this question), I build the command string with standard shell commands:
std::string cmd;
cmd += "tar -cvf dumy514DEMO.tar dumy*V?.cc ; gzip dumy514DEMO.tar ; ls -lsa *.tar.gz";
// tar without compression ; next do compress
Then I construct my popen-wrapped-in-a-class instance and invoke the popen read action. There is normally very little feedback to the user (in the style of the UNIX philosophy, i.e. no success messages), so for this demo I included the -v (verbose) option. The resulting feedback lists the 4 files tar'd together, and I list the resulting .gz file.
dumy514V0.cc
dumy514V1.cc
dumy514V2.cc
dumy514V3.cc
8 -rw-rw-r-- 1 dmoen dmoen 7983 Aug 7 17:23 dumy514DEMO.tar.gz
And a snippet from the directory listing shows my executable, my source code, and the newly created tar.gz:
-rwxrwxr-x 1 dmoen dmoen 86416 Aug 7 17:18 dumy514DEMO
-rw-rw-r-- 1 dmoen dmoen 13576 Aug 7 17:18 dumy514DEMO.cc
-rw-rw-r-- 1 dmoen dmoen 7983 Aug 7 17:23 dumy514DEMO.tar.gz
As you can see, the tar.gz is about 8,000 bytes, while the 4 files add up to about 70,000 bytes.
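For reference, here is a minimal sketch of the read action of such a popen wrapper (my actual class is longer; this shows just the essential shape):

#include <stdio.h>
#include <stdexcept>
#include <string>

// Runs a shell command line and captures everything it writes to stdout.
std::string runCommand(const std::string& cmd) {
    FILE* pipe = popen(cmd.c_str(), "r");  // "r": read the child's stdout
    if (!pipe) throw std::runtime_error("popen failed");
    std::string output;
    char buf[4096];
    while (fgets(buf, sizeof buf, pipe))
        output += buf;
    pclose(pipe);                          // waits for and reaps the child
    return output;
}

Feeding it the cmd string built above returns the tar verbose file list and the final ls output as one string.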

Can I redirect additional file descriptors from the command line while launching a program?

From C++, you can write to cout and cerr, which correspond to file descriptors 1 and 2 respectively. Outside the C++ program, I can then redirect this output wherever I want (in this case, writing it to two separate files):
$ my-program 1>output 2>errors;
Am I stuck with only file descriptors 1 and 2, or can I "create my own"? Let's say, I wanted a third output that saves debug information, or a fourth output that mails the administrator?
$ my-program 1>output 2>errors 3>>/logs/my-program.log 4>&1 | scripts/email-admin.sh;
Can I write to file descriptors 3 and 4 within my program?
Opening all your files in a wrapper script is not usually a good design. When you want your program to be smarter, and able to close a big log file and start a new one, you'll need the logic in your program.
But to answer the actual question:
Yes, you can have the shell open whatever numbered file descriptors you like, for input, output or both. Having the parent open them before execve(2) is identical to what you'd get from opening them with code in the child process (at least on a POSIX system, where stdin/out/err are just like other file descriptors, and not special). File descriptors can be marked as close-on-exec or not: use open(2) with O_CLOEXEC, or after opening use fcntl(2) to set FD_CLOEXEC.
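As a minimal sketch of the child's side (assuming an invocation like the question's my-program 3>>/logs/my-program.log 4>&1): check that the descriptor is actually open, then write(2) to it:

#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

// Returns true if fd is open and the whole message was written to it.
static bool writeToFd(int fd, const char* msg) {
    if (fcntl(fd, F_GETFD) == -1)   // EBADF: the shell didn't open this fd
        return false;
    ssize_t len = (ssize_t)strlen(msg);
    return write(fd, msg, len) == len;
}

int main() {
    if (!writeToFd(3, "debug: starting up\n"))
        fprintf(stderr, "fd 3 not open; launch as: my-program 3>>debug.log\n");
    if (!writeToFd(4, "Subject: my-program started\n"))
        fprintf(stderr, "fd 4 not open\n");
    return 0;
}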
They don't have to refer to regular files, either. Any of them can be ttys, pipes, block or character device files, sockets, or even directories. (There's no shell redirection syntax for opening directories, because you can only use readdir(3) on them, not read(2) or write(2).)
See this bash redirection tutorial. And just as a quick example of my own:
peter@tesla:~$ yes | sleep 60 4> >(cat) 5</etc/passwd 9>/dev/tcp/localhost/22 42<>/tmp/insecure_temp &
[1] 25644
peter@tesla:~$ ll /proc/$!/fd
total 0
lr-x------ 1 peter peter 64 Sep 9 21:31 0 -> pipe:[46370587]
lrwx------ 1 peter peter 64 Sep 9 21:31 1 -> /dev/pts/19
lrwx------ 1 peter peter 64 Sep 9 21:31 2 -> /dev/pts/19
l-wx------ 1 peter peter 64 Sep 9 21:31 4 -> pipe:[46372222]
lrwx------ 1 peter peter 64 Sep 9 21:31 42 -> /tmp/insecure_temp
lr-x------ 1 peter peter 64 Sep 9 21:31 5 -> /etc/passwd
l-wx------ 1 peter peter 64 Sep 9 21:31 63 -> pipe:[46372222]
lrwx------ 1 peter peter 64 Sep 9 21:31 9 -> socket:[46372228]
# note the rwx permissions: some are read-write, some are one-way
The >(cmd) process substitution syntax expands to a filename like /dev/fd/63. Using 4> >(cmd) opens that fd as fd 4, as well.
Redirecting stderr to a pipe takes some juggling of file descriptors, because there's no 2| cmd syntax. 2> >(cmd) works, but the cmd runs in the background:
peter@tesla:~$ (echo foo >&2 ) 2> >(wc) # next prompt printed before wc output
peter@tesla:~$ 1 1 4
peter@tesla:~$ ( echo foo >&2; ) 2>&1 | wc
1 1 4
The usual way to handle something like sending debug information somewhere else would be to choose a logging system that writes directly to a file (rather than sending it to another file descriptor and then redirecting that descriptor to a file in bash).
One option would be to use mkfifo, and have your program write to the fifo as a file, then use some other means to direct the fifo to other locations.
Another option for the mail script would be to run the mail script as a subprocess from in the C++ program and write to its stdin with internal piping. Piping into sendmail is the traditional way for a program to send mail on a Unix system.
The best thing to do is to find libraries that handle the things that you want to do.
Examples
mkfifo
http://linux.die.net/man/3/mkfifo
You use the mkfifo command to create a special file, and then you can fopen and fprint* to it as usual. You can pass the file name to the program like any other.
The bash part looks something like this:
mkfifo mailfifo
yourprogram mailfifo &
cat mailfifo | scripts/email-admin.sh
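On the C++ side, the fifo is opened like any ordinary file; note that fopen() blocks until a reader (the cat above) opens the other end. A minimal sketch, with the fifo name passed as argv[1]:

#include <stdio.h>

int main(int argc, char** argv) {
    if (argc < 2) return 1;            // expects e.g. "mailfifo"
    FILE* fifo = fopen(argv[1], "w");  // blocks until a reader appears
    if (!fifo) return 1;
    fprintf(fifo, "Subject: alert from yourprogram\n");
    fclose(fifo);
    return 0;
}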
Pipe to subprocess
http://www.gnu.org/software/libc/manual/html_node/Pipe-to-a-Subprocess.html
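A minimal sketch of piping into a subprocess with popen() in write mode, using the email script name from the question as a stand-in:

#include <stdio.h>

int main() {
    // "w": whatever we fprintf here arrives on the script's stdin
    FILE* mail = popen("scripts/email-admin.sh", "w");
    if (!mail) return 1;
    fprintf(mail, "admin: something happened\n");
    return pclose(mail) == 0 ? 0 : 1;   // pclose reports the exit status
}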

Windows: Get list of ALL files on volume with size

Question: how do I list all files on a volume, with the size they occupy on disk?
Applicable solutions:
cmd script
free tool with sqlite/txt/xls/xml/json output
C++ / winapi code
The problem:
There are many tools and APIs to list files, but their results don't match chkdsk and the actual free-space info:
                                   Size    Count (x1000)
chkdsk c:                          67 GB   297
dir /S                             42 GB   267
FS Inspect                         47 GB   251
Total Commander (Ctrl+L)           47 GB   251
explorer (selection size)          44 GB   268
explorer (volume info)             67 GB   -
WinDirStat                         45 GB   245
TreeSize                           couldn't download it - site unavailable
C++ FindFirstFile/FindNextFile     50 GB   288
C++ GetFileInformationByHandleEx   50 GB   288
The total volume size is 70 GB; about 3 GB is actually free.
I'm aware that:
A file can occupy more space on disk than its actual size; I need the size it occupies (i.e. the greater one).
Symlinks, junctions etc. - it would be good to see them (though I don't think these alone can account for a 20 GB difference in my case).
The filesystem uses some space for indexes and system info (chkdsk shows this as negligible; it doesn't explain 20 GB).
I run all tools with admin privileges, and hidden files are shown.
The FindFirstFile/FindNextFile C++ solution doesn't give correct results - I don't know why, but it gives the same numbers as Total Commander, NOT the same as chkdsk.
Practical problem:
I have a 70 GB SSD; all the tools report about 50 GB occupied, but in fact it's almost full.
Formatting everything and reinstalling is not an option, since this will happen again quite soon.
I need a report of file sizes whose total matches the actual used and free space. I'm looking for an existing solution - a tool, a script, a C++ library, or C++ code.
(Actual output below)
chkdsk c:
Windows has scanned the file system and found no problems.
No further action is required.
73715708 KB total disk space.
70274580 KB in 297259 files.
167232 KB in 40207 indexes.
0 KB in bad sectors.
463348 KB in use by the system.
65536 KB occupied by the log file.
2810548 KB available on disk.
4096 bytes in each allocation unit.
18428927 total allocation units on disk.
702637 allocation units available on disk.
dir /S
Total Files Listed:
269966 File(s) 45 071 190 706 bytes
143202 Dir(s) 3 202 871 296 bytes free
FS Inspect http://sourceforge.net/projects/fs-inspect/
47.4 GB 250916 Files
Total Commander
49709355k, 48544M 250915 Files
On a POSIX system, the answer would be to use the stat function. Unfortunately, on Windows it does not give the number of allocated blocks, so it does not meet your requirements.
The correct function from the Windows API is GetFileInformationByHandleEx. You can use FindFirstFile/FindNextFile to browse the full disk, then ask for FileStandardInfo to get a FILE_STANDARD_INFO that contains, for each file (among other fields): LARGE_INTEGER AllocationSize for the allocated size and LARGE_INTEGER EndOfFile for the used size.
Alternatively, you can use GetFileInformationByHandleEx directly on directories, asking for FileIdBothDirectoryInfo to get FILE_ID_BOTH_DIR_INFO structures. This lets you get information on many files in a single call. My advice would be to use that one, even though it is less commonly used.
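As a minimal sketch of the first approach (FindFirstFile/FindNextFile recursion, summing AllocationSize via FileStandardInfo); error handling and reparse-point/hard-link detection are deliberately left out, and files that cannot be opened are silently skipped, which is itself one source of mismatch against chkdsk:

#include <windows.h>
#include <iostream>
#include <string>

// Recursively sums the allocated (on-disk) size of every file under dir.
ULONGLONG sumAllocated(const std::wstring& dir) {
    ULONGLONG total = 0;
    WIN32_FIND_DATAW fd;
    HANDLE find = FindFirstFileW((dir + L"\\*").c_str(), &fd);
    if (find == INVALID_HANDLE_VALUE) return 0;
    do {
        const std::wstring name = fd.cFileName;
        if (name == L"." || name == L"..") continue;
        const std::wstring path = dir + L"\\" + name;
        if (fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
            total += sumAllocated(path);            // recurse
        } else {
            HANDLE h = CreateFileW(path.c_str(), FILE_READ_ATTRIBUTES,
                FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                nullptr, OPEN_EXISTING, 0, nullptr);
            if (h != INVALID_HANDLE_VALUE) {
                FILE_STANDARD_INFO info;
                if (GetFileInformationByHandleEx(h, FileStandardInfo,
                                                 &info, sizeof info))
                    total += info.AllocationSize.QuadPart;
                CloseHandle(h);
            }
        }
    } while (FindNextFileW(find, &fd));
    FindClose(find);
    return total;
}

int main() {
    std::wcout << sumAllocated(L"C:") / (1024 * 1024) << L" MB allocated\n";
}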
To get a list of all files (including hidden and system files), sorted within directories by descending size, you can go to cmd.exe and type:
dir /s/a:-d/o:-s C:\* > "list_of_files.txt"
Where:
/s lists files within the specified directory and all subdirectories,
/a:-d lists only files (no directories),
/o:-s puts the files within each directory in descending size order,
C:\* means all directories on disk C,
> "list_of_files.txt" means save the output to the list_of_files.txt file.
Listing files grouped by directory may be a little inconvenient, but it's the easiest way to list all files. For more information, take a look at technet.microsoft.com
Checked on Win7 Pro.

How to handle FTE queued transfers

I have an FTE monitor with '*.txt' as the trigger condition: whenever a text file lands at the source, FTE transfers the file to the destination. But when 10 files land at the source at the same time, FTE triggers 10 transfer requests simultaneously and all the transfers get queued and stuck.
Please suggest how to handle this scenario.
OK, I have just tested this case:
I want to transfer four *.xml files from a directory as soon as they appear in that directory. So I have the monitor set to *.xml and the transfer pattern set to *.xml (see the commands below).
Created with the following commands:
fteCreateTransfer -sa AGENT1 -sm QM.FTE -da AGENT2 -dm QM.FTE -dd c:\\workspace\\FTE_tests\\OUT -de overwrite -sd delete -gt /var/IBM/WMQFTE/config/QM.FTE/FTE_TEST_TRANSFER.xml c:\\workspace\\FTE_tests\\IN\\*.xml
fteCreateMonitor -ma AGENT1 -mn FTE_TEST_TRANSFER -md c:\\workspace\\FTE_tests\\IN -mt /var/IBM/WMQFTE/config/TQM.FTE/FTE_TEST_TRANSFER.xml -tr match,*.xml
I got three different results depending on configuration changes:
1) Just as the commands are, with the default agent.properties:
four transfers appeared in the transfer log
all 4 transfers tried to transfer all four XML files
3 of them ended with partial success, because the agent couldn't delete the source files
one succeeded, transferring all files and deleting all source files
Well, with transfer type File to File the final state is in fact OK - four files in the destination directory, because the previous files are overwritten. But with File to Queue I got 16 messages in the destination queue.
2) The fteCreateMonitor command modified with the parameter "-bs 100", default agent.properties:
in the transfer log there is only one transfer
this transfer ended with a partial-success result
this transfer tried to transfer 16 files (each XML file four times)
the agent was not able to delete any file, so the source files remained in the source directory
So in sum I got the same total number of files transferred (16) as in the first result, and the source files were not even deleted.
3) Just as the commands are, with agent.properties modified with the parameter "monitorMaxResourcesInPoll=1":
in the transfer log there is only one transfer
this transfer ended with a success result
this transfer tried to transfer four files and succeeded
the agent was able to delete all source files
So I was able to get the expected result only with these settings. But I am still not sure about the appropriateness of setting the monitorMaxResourcesInPoll parameter to "1".
Therefore for me the answer is: add
monitorMaxResourcesInPoll=1
to agent.properties. But this is in conflict with the other answers posted here, so I am a little bit confused now.
Tested on version 7.0.4.4.
Check the box that says "Batch together the file transfers when multiple trigger files are found in one poll interval" (screen three).
Make sure that you set the maxFilesForTransfer in the agent.properties file to a value that is large enough for you, but be careful as this will affect all transfers.
You can also set monitorMaxResourcesInPoll=1 in the agent.properties file. I don't recommend this for 2 reasons: 1) it will affect all monitors 2) it may make it so that you can never catch up on all the files you have to transfer depending on your volume and poll interval.
Set your "Batch together the file transfers..." to a value more than 10:
Max Batch Size = 100

What would cause a rare lchown() failure: Operation not permitted

I have a test of some C++ code that passes in most runs, but in rare instances fails because the call to lchown() in my application under test fails with errno EPERM and strerror:
Operation not permitted.
The code in question in my application is like this:
::lchown("pathnameToFile", uid_t(500), static_cast<unsigned>(-1)); // This line works
::lchown("pathnameToFile", static_cast<unsigned>(-1), gid_t(500)); // This fails rarely
Also, in the test-case iteration that failed, the prior attempt to create a symbolic link to "pathnameToFile" had also failed to create it, yet the code did not detect any error (the following returned 0):
::symlink("pathnameToFile", "linkToPathname");
I imagine these two things are related. This is running on a 32-bit CentOS 4 machine.
The "pathnameToFile" exists on an NFS-mounted partition. Could there be some kind of race condition between the file being created, the link being made, and lchown failing because NFS does not reflect the file's existence yet?
After some time elapsed, however, the symbolic link appeared, although the chown remained without effect.
The "pathnameToFile" lives in a directory with permissions:
drwxrwxr-x 2 me me 4096 Jun 22 17:33 .
-rw-rw-r-- 1 me root 33 Jun 22 17:33 pathnameToFile
lrwxrwxrwx   1 me root    8 Jun 22 17:33 LinkToPathname -> pathnameToFile
The gid 500 is the primary group for 'me', the other group being 'wheel'.
> groups
me wheel
It is a race condition; add a short sleep when the lchown() fails and try again.
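A minimal sketch of that retry approach (the attempt count and the 100 ms delay are arbitrary illustration values):

#include <unistd.h>
#include <sys/types.h>
#include <time.h>
#include <errno.h>

// Retries lchown() a few times, sleeping briefly between attempts, to
// ride out the window where NFS has not yet made the new file visible.
bool lchownWithRetry(const char* path, uid_t uid, gid_t gid) {
    for (int attempt = 0; attempt < 5; ++attempt) {
        if (::lchown(path, uid, gid) == 0)
            return true;
        if (errno != EPERM && errno != ENOENT)
            return false;                         // a genuine failure
        timespec delay = {0, 100 * 1000 * 1000};  // 100 ms
        nanosleep(&delay, nullptr);
    }
    return false;
}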