I have the details of the mount path (specifically mount prefix) as obtained using getmntent
in the structure as defined below:
struct mntent {
char *mnt_fsname; /* name of mounted file system */
char *mnt_dir; /* file system path prefix */
char *mnt_type; /* mount type (see mntent.h) */
char *mnt_opts; /* mount options (see mntent.h) */
int mnt_freq; /* dump frequency in days */
int mnt_passno; /* pass number on parallel fsck */
};
Using mnt_dir, I want to check whether the path is still mounted some time later, since it may have been unmounted before my processing runs on it.
What is the most efficient way to check if the path is still mounted?
Also, is there a way to get a callback if the path gets unmounted?
I'd say that the most efficient way is to cache st_dev and st_ino returned by stat() (although probably caching just st_dev should be enough).
If the volume gets unmounted, the mount point reverts to the empty subdirectory in the parent filesystem where the volume was originally mounted, and stat() will return a different device+inode, for the same file path.
As far as being notified, poke around the inotify(7) interface, paying attention to the IN_UNMOUNT event.
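The st_dev/st_ino comparison described above could be sketched roughly as follows. The struct and function names (mount_id, mount_id_capture, mount_id_still_valid) are made up for this illustration and are not part of any library; the only real calls are stat() and the fields of struct stat.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper type: identity of a mount point when it was captured. */
struct mount_id {
    dev_t dev;  /* device ID of the filesystem visible at the path */
    ino_t ino;  /* inode number of the directory visible at the path */
};

/* Record st_dev/st_ino for mnt_dir. Returns 0 on success, -1 on error. */
int mount_id_capture(const char *mnt_dir, struct mount_id *id)
{
    struct stat sb;

    if (stat(mnt_dir, &sb) != 0)
        return -1;
    id->dev = sb.st_dev;
    id->ino = sb.st_ino;
    return 0;
}

/* True if mnt_dir still resolves to the same device and inode as before. */
bool mount_id_still_valid(const char *mnt_dir, const struct mount_id *id)
{
    struct stat sb;

    if (stat(mnt_dir, &sb) != 0)
        return false;  /* path vanished or is no longer accessible */
    return sb.st_dev == id->dev && sb.st_ino == id->ino;
}
```

The idea would be to call mount_id_capture() right after reading the entry with getmntent, and mount_id_still_valid() just before the processing step. Note that this only detects the change at the moment of the check; the volume can still be unmounted between the check and the processing.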
According to documentation, the structure fields explanation follows:
struct statfs {
__SWORD_TYPE f_type; /* type of file system (see below) */
__SWORD_TYPE f_bsize; /* optimal transfer block size */
fsblkcnt_t f_blocks; /* total data blocks in file system */
fsblkcnt_t f_bfree; /* free blocks in fs */
fsblkcnt_t f_bavail; /* free blocks available to
unprivileged user */
fsfilcnt_t f_files; /* total file nodes in file system */
fsfilcnt_t f_ffree; /* free file nodes in fs */
fsid_t f_fsid; /* file system id */
__SWORD_TYPE f_namelen; /* maximum length of filenames */
__SWORD_TYPE f_frsize; /* fragment size (since Linux 2.6) */
__SWORD_TYPE f_spare[5];
};
Does "total file nodes in file system" mean how many existing files we have? Does it include directories and links?
What does "free file nodes in fs" mean?
What is f_spare?
In some Linux forks (for example, in Android) I see that f_spare size is 4, and additional field f_flags is defined. What flags are defined for f_flags?
Is f_fsid just random number that uniquely identifies the file system, or what is it?
Does "total file nodes in file system" mean how many existing files we have? Does it include directories and links?
Almost. Yes, it includes directories and softlinks, but two files can share the same inode. In that case they are hardlinked and share the same space on disk, but are viewed as different files in the filesystem. An example to illustrate:
% echo Hello > test1.txt
% ln test1.txt test2.txt
% ls -i test1.txt test2.txt
14946320 test1.txt 14946320 test2.txt
The numbers to the left of the filenames are the inode numbers (you'll have a different number than in my example). As you can see, both names have the same inode. If you make a change to one file, the same change will be visible through the other file.
What does "free file nodes in fs" mean?
A filesystem often has an upper limit on the number of inodes it can keep track of. f_files is that total and f_ffree is how many of them are still unallocated, so f_files - f_ffree is the number currently in use. The type fsfilcnt_t itself sets one limit (18446744073709551615 on my system), but the real limit is most probably something lower. Unless you use your filesystem in very special ways, this limit is usually not a problem.
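As a rough illustration of how these two fields relate, here is a minimal sketch that reads them with statfs(). The function name inode_counts is hypothetical; f_files and f_ffree are the fields from the struct shown in the question. Some filesystems that allocate inodes dynamically may report 0 for both, so treat the result as a hint rather than an exact count.

```c
#include <sys/vfs.h>

/* Hypothetical helper: fetch inode totals for the filesystem containing path.
 * Returns 0 on success, -1 if statfs() fails. */
int inode_counts(const char *path,
                 unsigned long long *total,
                 unsigned long long *free_nodes,
                 unsigned long long *used)
{
    struct statfs sf;

    if (statfs(path, &sf) != 0)
        return -1;
    *total = sf.f_files;           /* all inodes, allocated or not */
    *free_nodes = sf.f_ffree;      /* inodes still unallocated */
    *used = *total - *free_nodes;  /* inodes in use: files, dirs, symlinks... */
    return 0;
}
```

Because hardlinked names share one inode, the "used" figure counts inodes, not directory entries.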
What is f_spare? In some Linux forks (for example, in Android) I see that f_spare size is 4, and additional field f_flags is defined.
f_spare is just spare bytes to pad the struct itself. The padding bytes are reserved for future use. If one __fsword_t of info is added to the struct in the future, they'll remove one spare __fsword_t from f_spare. My system only has 4 spare __fsword_ts for example (32 bytes).
What flags are defined for f_flags?
The mount flags defined for your system may be different, but my man statfs64 page shows these:
ST_MANDLOCK
Mandatory locking is permitted on the filesystem (see fcntl(2)).
ST_NOATIME
Do not update access times; see mount(2).
ST_NODEV
Disallow access to device special files on this filesystem.
ST_NODIRATIME
Do not update directory access times; see mount(2).
ST_NOEXEC
Execution of programs is disallowed on this filesystem.
ST_NOSUID
The set-user-ID and set-group-ID bits are ignored by exec(3) for executable files on this filesystem
ST_RDONLY
This filesystem is mounted read-only.
ST_RELATIME
Update atime relative to mtime/ctime; see mount(2).
ST_SYNCHRONOUS
Writes are synched to the filesystem immediately (see the description of O_SYNC in open(2)).
Is f_fsid just random number that uniquely identifies the file system, or what is it?
Directly from the man statfs64 page: "Nobody knows what f_fsid is supposed to contain (but see below)" and further below:
The f_fsid field
Solaris, Irix and POSIX have a system call statvfs(2) that returns a struct statvfs (defined in <sys/statvfs.h>) containing an unsigned long f_fsid. Linux, SunOS, HP-UX, 4.4BSD have a system call statfs() that returns a struct statfs (defined in <sys/vfs.h>) containing a fsid_t f_fsid, where fsid_t is defined as struct { int val[2]; }. The same holds for FreeBSD, except that it uses the include file <sys/mount.h>.
The general idea is that f_fsid contains some random stuff such that the pair (f_fsid,ino) uniquely determines a file. Some operating systems use (a variation on) the device number, or the device number combined with the filesystem type. Several operating systems restrict giving out the f_fsid field to the superuser only (and zero it for unprivileged users), because this field is used in the filehandle of the filesystem when NFS-exported, and giving it out is a security concern.
Under some operating systems, the fsid can be used as the second argument to the sysfs(2) system call.
I have a target board running Linux with a directory that contains more than 5 million files. (This directory doesn't have any subdirectories.)
If I execute this program, it takes several minutes to get the total space information.
Is there a faster way to accomplish this? Thanks
#include <stdio.h>
#include <dirent.h>
#include <string.h>
#include <stdlib.h>
#include <limits.h>
#include <sys/stat.h>
#include <errno.h>
void calcSpace(char *path, long long int *totalSpace)
{
    DIR *dir;               /* dir structure we are reading */
    struct dirent *ent;     /* directory entry currently being processed */
    char absPath[PATH_MAX]; /* large enough for any valid path */
    struct stat statbuf;    /* buffer for stat() */
    long long int fileCount = 0;

    fprintf(stderr, "Opening dir %s\n", path);
    dir = opendir(path);
    if (NULL == dir) {
        perror(path);
        return;
    }
    while ((ent = readdir(dir))) {
        fileCount++;
        /* snprintf cannot write past the end of absPath, unlike sprintf */
        snprintf(absPath, sizeof(absPath), "%s/%s", path, ent->d_name);
        if (stat(absPath, &statbuf)) {
            perror(absPath);
            closedir(dir); /* do not leak the directory stream */
            return;
        }
        *totalSpace = (*totalSpace) + statbuf.st_size;
    }
    fprintf(stderr, "Closing dir %s\n", path);
    printf("fileCount=%lld.\n", fileCount);
    closedir(dir);
}

int main(int argc, char *argv[])
{
    char *dir;
    long long int totalSpace = 0;

    if (argc > 1)
        dir = argv[1];
    else
        dir = ".";
    calcSpace(dir, &totalSpace);
    printf("totalSpace=%lld\n", totalSpace);
    return 0;
}
As stated in the comments, the main costs appear to be the calls to stat and readdir.
Optimizing readdir calls
We can cut the directory-reading cost by using the getdents(2) syscall directly. readdir is built on top of it, but reads entries through an internal buffer whose size you do not control; calling getdents yourself with a large buffer returns many more directory entries per syscall and so reduces the syscall overhead.
A code example can be found in the man page I linked to. The important thing to note is that you should experiment with the amount of data read per getdents call (the count parameter, which is the buffer size in bytes; in the example it is 1024) to find the sweet spot for your configuration and machine. This tuning is what will probably give you the performance improvement you want over readdir.
Optimizing stat calls
It is recommended to use fstatat(2) instead of the regular stat you used (both are described on the same man page). The first parameter of fstatat(2) is a dirfd: a file descriptor for the directory that contains the file you are stat-ing.
That means you can open a file descriptor for the directory once (using open(2)) and then make every fstatat call with this dirfd. The kernel then no longer has to resolve the full path to the directory for every stat call, and your code will probably be simpler and a little faster, as path concatenation is no longer needed.
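Putting the two suggestions together, a minimal sketch might look like the following. It is not the code from the man page: the function name dir_total_size, the 1 MiB buffer size, and the decision to skip "." and ".." and to ignore entries that fail to stat are all assumptions made for this illustration. The struct linux_dirent64 layout follows getdents(2), and the syscall is invoked through syscall(2) so the sketch does not depend on a libc wrapper being present.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout returned by the raw getdents64 syscall (see getdents(2)). */
struct linux_dirent64 {
    uint64_t       d_ino;
    int64_t        d_off;
    unsigned short d_reclen;
    unsigned char  d_type;
    char           d_name[];
};

#define DENTS_BUF_SIZE (1024 * 1024) /* assumed starting point; tune per target */

/* Sum st_size of every entry in a flat directory ("." and ".." are skipped).
 * Returns 0 on success, -1 on error. */
int dir_total_size(const char *path, long long *total, long long *count)
{
    int rc = 0;
    char *buf;
    int dfd = open(path, O_RDONLY | O_DIRECTORY);

    if (dfd < 0)
        return -1;
    buf = malloc(DENTS_BUF_SIZE);
    if (buf == NULL) {
        close(dfd);
        return -1;
    }
    *total = 0;
    *count = 0;
    for (;;) {
        /* One syscall fills buf with as many entries as fit. */
        long n = syscall(SYS_getdents64, dfd, buf, DENTS_BUF_SIZE);
        if (n < 0) { rc = -1; break; }
        if (n == 0) break; /* end of directory */
        for (long pos = 0; pos < n; ) {
            struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
            struct stat sb;

            pos += d->d_reclen;
            if (strcmp(d->d_name, ".") == 0 || strcmp(d->d_name, "..") == 0)
                continue;
            /* Name is resolved relative to dfd; no path concatenation. */
            if (fstatat(dfd, d->d_name, &sb, 0) != 0)
                continue; /* entry vanished or is a dangling symlink */
            *total += sb.st_size;
            (*count)++;
        }
    }
    free(buf);
    close(dfd);
    return rc;
}
```

A caller would use it as dir_total_size(dir, &totalSpace, &fileCount) in place of calcSpace. Whether this is actually faster on a given board depends on the filesystem and on how much of the inode data is already cached, so it is worth measuring with a few buffer sizes.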
While technically this is not an 'answer', some comments about the issue here:
Behind the scenes, most Linux file systems represent files as 'inodes', with a directory holding a list of name -> inode mappings. Depending on the file system, the list may be a simple linear list, a balanced tree, or a hash. The latter two improve lookup performance and reduce the penalty for working with folders that contain a large number of files.
Older operating systems (VAX VMS and its predecessors' FILES-11) had the ability to open a file by a unique ID (file number, sequence number), in addition to opening a file by file path. In Linux terms this would be equivalent to opening a file by inode number. This approach makes it possible to open a file, or query its metadata, with little overhead. Unfortunately, Unix/Linux does not have a similar feature at the application level, and as far as I know, there are no plans to create such an interface. It would require a significant upgrade to all file system drivers.
Another approach would be to implement multi-file stat system calls, which would benefit from being able to perform the lookup for all files during a single scan. While this type of system call would speed up various utilities, it would not benefit most applications, which usually stick to POSIX calls. It would be hard to implement such a feature without changes to various file system drivers. I believe it's unlikely to be available soon.
On Windows I can get a (more or less unique) serial number of an HDD partition by calling
GetVolumeInformation()
My question: is there something similar available for Linux, meaning a number that changes only when somebody formats a partition and that can be retrieved programmatically?
Thanks!
In linux, you could use the blkid command to get the UUID of the partition:
# blkid /dev/sda1
/dev/sda1: UUID="15677362-cef3-4a53-aca3-3bace1b0d92a" TYPE="ext4"
This info is stored in the formatting of specific partition types like ext4, xfs and changes when reformatted. There is no info available for unformatted partitions.
If you need to call it from code, calling out to a shell to run this command isn't exactly the prettiest way to do it, but it could work:
#include <stdio.h>

int main(void) {
    /* device you are looking for */
    char device[] = "/dev/sda1";
    /* buffer to hold info */
    char buffer[1024];
    /* format into a single command; -s UUID limits the output to the UUID tag */
    snprintf(buffer, sizeof(buffer), "/sbin/blkid -s UUID -o value %s", device);
    /* run the command via popen */
    FILE *f = popen(buffer, "r");
    if (f == NULL) {
        perror("popen");
        return 1;
    }
    /* read the first line of output */
    if (fgets(buffer, sizeof(buffer), f) != NULL) {
        /* print the results (note, newline is included in string) */
        fprintf(stdout, "uuid is %s", buffer);
    }
    pclose(f);
    return 0;
}
You can use udev to get the serial number of the device. (You'll need to know the device name though)
struct udev *context = udev_new();
struct udev_device *device = udev_device_new_from_syspath(context, "/sys/block/sda");
const char *id = udev_device_get_property_value(device, "ID_SERIAL");
// Cleanup
udev_device_unref(device);
udev_unref(context);
Partitions have (at least) three identities in linux:
The raw device identifier (think cat /proc/partitions) - this is not a unique serial number
The UUID of the partition - can be found with blkid, and stored within the partition itself. You can also manually parse /dev/.blkid.tab - the format being obvious.
The disk label - also stored within the partition. EG:
lsblk -o name,mountpoint,label,uuid
NAME MOUNTPOINT LABEL UUID
sda
├─sda1 / 315eaf50-adcc-4f0d-b767-f008f3f1c194
├─sda2
└─sda5 [SWAP] 1ff31705-f488-44a4-ba5f-e2fe9eff4b96
sr0
Of these, the second is closest to what you want. To read it programmatically, use libblkid.
# blkid /dev/sdX gives the filesystem type of the partition whether mounted or unmounted. How can I do it from C/C++ without invoking a system call and parsing the output? How can I do it programmatically? Is there any blkid-dev package?
You always can use the blkid libraries (for ubuntu it's as easy as installing libblkid-dev). And for the real usage see: https://github.com/fritzone/sinfonifry/blob/master/plugins/disk_status/client/disk_status.cpp (sorry for advertising code from my own repository, but it has exactly this functionality developed there). And do not forget, that you will need to run the application with sudo in order to have full access to the disk.
For a mounted partition, instead of reading /proc/self/mounts, you can do this (assuming you know the path where the partition is mounted):
#include <sys/vfs.h>
#include <stdio.h>
#include <linux/magic.h>

/* Map of f_type magic numbers to names; extend as needed */
static const struct {
    unsigned long magic;
    const char *type;
} types[] = {
    {EXT4_SUPER_MAGIC, "ext4"},
    {TMPFS_MAGIC, "tmpfs"},
};

const char *get_type(unsigned long magic) {
    static const char *unknown = "unknown";
    unsigned int i;
    for (i = 0; i < sizeof(types) / sizeof(types[0]); i++)
        if (types[i].magic == magic)
            return types[i].type;
    return unknown;
}

int main(void) {
    struct statfs buf;
    /* statfs returns 0 on success; skip the path otherwise */
    if (statfs("/", &buf) == 0)
        printf("/ is %s\n", get_type((unsigned long)buf.f_type));
    if (statfs("/tmp", &buf) == 0)
        printf("/tmp is %s\n", get_type((unsigned long)buf.f_type));
    return 0;
}
In my case it displays:
/ is ext4
/tmp is tmpfs
For more details see
man statfs
You can obviously add all the types you need. They are listed by the statfs manpage.
It is said that statfs is deprecated, but I don't know of another call that would return the filesystem type.
For mounted partitions, your C++ program could read sequentially and parse the /proc/self/mounts pseudo-file, see proc(5)
For unmounted partitions, they could contain anything (including no file system at all, or swap data, or raw data - e.g. for some database system). So the question may even be meaningless. You might popen some file -s command.
You should study the source code of /bin/mount since it is free software (and it does similar things for the auto case). You may want to use libmagic(3) (which is used by file(1) command)
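For the mounted case, /proc/self/mounts does not have to be parsed by hand: glibc's getmntent(3) family can read it, since the file uses the same format as /etc/mtab. The sketch below is one possible way to do it; the function name fs_type_of_mount is hypothetical, while setmntent, getmntent and endmntent are the real calls from <mntent.h>.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the filesystem type of the mount at mount_dir
 * into type. Returns 0 if the mount point was found, -1 otherwise. */
int fs_type_of_mount(const char *mount_dir, char *type, size_t type_len)
{
    FILE *fp = setmntent("/proc/self/mounts", "r");
    struct mntent *ent;
    int found = -1;

    if (fp == NULL)
        return -1;
    /* Later entries shadow earlier ones at the same path,
     * so keep scanning and remember the last match. */
    while ((ent = getmntent(fp)) != NULL) {
        if (strcmp(ent->mnt_dir, mount_dir) == 0) {
            snprintf(type, type_len, "%s", ent->mnt_type);
            found = 0;
        }
    }
    endmntent(fp);
    return found;
}
```

This only answers the question for mount points that are currently mounted; for unmounted partitions the content has to be probed instead, as described above.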
I need help with one of my programs that pulls out available drives on a system and prints various information about the drives. I am using VC++ and am fairly new to C++ and need some high level inputs or example code from experienced programmers.
Here is my current source code:
#include "stdafx.h"
#include <Windows.h>
#include <stdio.h>
#include <iostream>
using namespace std;
int main()
{
// Initial Dummy drive
WCHAR myDrives[] = L" A";
// Get the logical drive bitmask (1st drive at bit position 0, 2nd drive at bit position 1... so on)
DWORD myDrivesBitMask = GetLogicalDrives();
// Verifying the returned drive mask
if(myDrivesBitMask == 0)
wprintf(L"GetLogicalDrives() failed with error code: %d\n", GetLastError());
else {
wprintf(L"This machine has the following logical drives:\n");
while(myDrivesBitMask) {
// Use the bitwise AND with 1 to identify
// whether there is a drive present or not.
if(myDrivesBitMask & 1) {
// Printing out the available drives
wprintf(L"drive %s\n", myDrives);
}
// increment counter for the next available drive.
myDrives[1]++;
// shift the bitmask binary right
myDrivesBitMask >>= 1;
}
wprintf(L"\n");
}
system("pause");
}
-Here is the output-
This machine has the following logical drives:
drive C
drive D
drive E
drive F
drive G
drive H
drive I
I need to output additional information about each drive (perhaps an example will tell the story in a shorter amount of time):
Drive – C:\
Drive Type: Fixed
Drive Ready Status: True
Volume Label: Boot Drive
File System Type : NTFS
Free Space: 30021926912
Total Drive Size: 240055742464
Drive – D:\
Drive Type: Fixed
Drive Ready Status: True
Volume Label: Application Data
File System Type : NTFS
Free Space: 42462507008
Total Drive Size: 240054693888
Which methods, libs api, etc. can I use to pull out drive type, drive status, volume label, file system type, free space, and total drive size?
*Side note: I noticed a defect with my pre-processor directives, specifically with the standard I/O header files. I know printf is not the recommended way, and that cout is type safe and the proper route to go, but I couldn't figure out how to format output with cout as you would with wprintf(L"drive %s\n", myDrives);. How would you do this with cout?
Thanks in advance.
You want to look at functions such as GetVolumeInformation to retrieve volume information such as the volume label, serial number and file system name. Free space and total size come from GetDiskFreeSpaceEx.
GetDriveType will give you some basic information about the drive type, but USB thumb sticks and flash readers can give surprising results.
I'm not sure what you mean by "ready status". If you mean whether there is a valid volume in the drive, then you can try CreateFile with a path of "\\.\C:" to try and open the volume. If it fails then there is no volume (disk) present. This will be of use for SD card readers. To do this without an error dialog appearing you will need to call SetErrorMode(SEM_NOOPENFILEERRORBOX) first.
To check whether a drive is ready you may also use GetDiskFreeSpaceEx. If this fails, the drive is not ready/usable.
Here is some example code: http://pinvoke.net/default.aspx/coredll/GetDiskFreeSpaceEx.html