More explanation on `statfs64` - c++

According to documentation, the structure fields explanation follows:
struct statfs {
__SWORD_TYPE f_type; /* type of file system (see below) */
__SWORD_TYPE f_bsize; /* optimal transfer block size */
fsblkcnt_t f_blocks; /* total data blocks in file system */
fsblkcnt_t f_bfree; /* free blocks in fs */
fsblkcnt_t f_bavail; /* free blocks available to
unprivileged user */
fsfilcnt_t f_files; /* total file nodes in file system */
fsfilcnt_t f_ffree; /* free file nodes in fs */
fsid_t f_fsid; /* file system id */
__SWORD_TYPE f_namelen; /* maximum length of filenames */
__SWORD_TYPE f_frsize; /* fragment size (since Linux 2.6) */
__SWORD_TYPE f_spare[5];
};
Does "total file nodes in file system" mean how many existing files we have? Does it include directories and links?
What does "free file nodes in fs" mean?
What is f_spare?
In some Linux forks (for example, in Android) I see that f_spare size is 4, and additional field f_flags is defined. What flags are defined for f_flags?
Is f_fsid just random number that uniquely identifies the file system, or what is it?

Does "total file nodes in file system" mean how many existing files we have? Does it include directories and links?
Almost. f_files is the total number of inodes (file nodes) the filesystem has, used and free together, so the number currently in use is f_files - f_ffree. Directories and soft links each take an inode, so they are included, but two file names can share the same inode. In that case they are hard-linked: they share the same space on disk but are seen as different files in the filesystem. Example to illustrate:
% echo Hello > test1.txt
% ln test1.txt test2.txt
% ls -i test1.txt test2.txt
14946320 test1.txt 14946320 test2.txt
The numbers to the left of the filenames are the inode numbers (yours will differ from my example). As you can see, both names have the same inode. If you make a change through one name, the same change is visible through the other.
What does "free file nodes in fs" mean?
A filesystem often has an upper limit on the number of inodes it can keep track of, and f_ffree is how many of those are still unallocated. The type fsfilcnt_t sets one limit (18446744073709551615 on my system), but the real limit is most probably much lower. Unless you use your filesystem in very special ways, this limit is usually not a problem.
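To see the two inode counters on a real system, something like the following should work. It is only a sketch, assuming Linux with glibc (or a compatible libc) where statfs() is declared in <sys/vfs.h>; the names InodeUsage and query_inode_usage are made up for this example.

```cpp
#include <sys/vfs.h>  // statfs(), struct statfs (Linux-specific header)

// Hypothetical helper type: the two inode counters from struct statfs.
struct InodeUsage {
    unsigned long long total;  // f_files: all inodes, used and free
    unsigned long long free;   // f_ffree: inodes not yet allocated
};

// Fills `out` for the filesystem containing `path`.
// Returns false if statfs() fails (for example, the path does not exist).
bool query_inode_usage(const char* path, InodeUsage& out) {
    struct statfs fs {};
    if (statfs(path, &fs) != 0) {
        return false;
    }
    out.total = static_cast<unsigned long long>(fs.f_files);
    out.free = static_cast<unsigned long long>(fs.f_ffree);
    return true;
}
```

Calling query_inode_usage("/", usage) and computing usage.total - usage.free gives the number of inodes in use. Some filesystems report 0 for both fields, so treat the result as informational.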
What is f_spare? In some Linux forks (for example, in Android) I see that f_spare size is 4, and additional field f_flags is defined.
f_spare is just spare bytes to pad the struct itself. The padding bytes are reserved for future use. If one __fsword_t of info is added to the struct in the future, they'll remove one spare __fsword_t from f_spare. My system only has 4 spare __fsword_ts for example (32 bytes).
What flags are defined for f_flags?
The mount flags defined for your system may be different, but my man statfs64 page shows these:
ST_MANDLOCK
Mandatory locking is permitted on the filesystem (see fcntl(2)).
ST_NOATIME
Do not update access times; see mount(2).
ST_NODEV
Disallow access to device special files on this filesystem.
ST_NODIRATIME
Do not update directory access times; see mount(2).
ST_NOEXEC
Execution of programs is disallowed on this filesystem.
ST_NOSUID
The set-user-ID and set-group-ID bits are ignored by exec(3) for executable files on this filesystem
ST_RDONLY
This filesystem is mounted read-only.
ST_RELATIME
Update atime relative to mtime/ctime; see mount(2).
ST_SYNCHRONOUS
Writes are synched to the filesystem immediately (see the description of O_SYNC in open(2)).
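A rough sketch of testing those bits, assuming your libc's struct statfs has the f_flags member and that the ST_* constants come from <sys/statvfs.h> (true for glibc; ST_NOEXEC and ST_NODEV are extensions there, hence the #ifdef guards). The helper names are invented for this example.

```cpp
#include <sys/statvfs.h>  // ST_RDONLY, ST_NOSUID, ...
#include <sys/vfs.h>      // statfs(), struct statfs (Linux-specific header)
#include <string>

// Turns an f_flags bit mask into a comma-separated list of flag names.
// Only a few of the flags are checked to keep the sketch short.
std::string describe_mount_flags(unsigned long flags) {
    std::string out;
    auto add = [&out](const char* name) {
        if (!out.empty()) out += ',';
        out += name;
    };
    if (flags & ST_RDONLY) add("ST_RDONLY");
    if (flags & ST_NOSUID) add("ST_NOSUID");
#ifdef ST_NODEV  // extension, may be missing on other systems
    if (flags & ST_NODEV) add("ST_NODEV");
#endif
#ifdef ST_NOEXEC  // extension, may be missing on other systems
    if (flags & ST_NOEXEC) add("ST_NOEXEC");
#endif
    return out;
}

// Describes the mount flags of the filesystem containing `path`.
// Returns false if statfs() fails.
bool mount_flags_of(const char* path, std::string& out) {
    struct statfs fs {};
    if (statfs(path, &fs) != 0) {
        return false;
    }
    out = describe_mount_flags(static_cast<unsigned long>(fs.f_flags));
    return true;
}
```

On a filesystem mounted with ro,nosuid this should yield "ST_RDONLY,ST_NOSUID"; an empty string just means none of the checked bits are set.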
Is f_fsid just random number that uniquely identifies the file system, or what is it?
Directly from the man statfs64 page: "Nobody knows what f_fsid is supposed to contain (but see below)" and further below:
The f_fsid field
Solaris, Irix and POSIX have a system call statvfs(2) that returns a struct statvfs (defined in <sys/statvfs.h>) containing an unsigned long f_fsid. Linux, SunOS, HP-UX, 4.4BSD have a system call statfs() that returns a struct statfs (defined in <sys/vfs.h>) containing a fsid_t f_fsid, where fsid_t is defined as struct { int val[2]; }. The same holds for FreeBSD, except that it uses the include file <sys/mount.h>.
The general idea is that f_fsid contains some random stuff such that the pair (f_fsid,ino) uniquely determines a file. Some operating systems use (a variation on) the device number, or the device number combined with the filesystem type. Several operating systems restrict giving out the f_fsid field to the superuser only (and zero it for unprivileged users), because this field is used in the filehandle of the filesystem when NFS-exported, and giving it out is a security concern.
Under some operating systems, the fsid can be used as the second argument to the sysfs(2) system call.

Related

Get the size of a directory (not its content)

(please before dismissing this as already answered, read the full question)
In C++ we can get the file size of a regular file by running std::filesystem::file_size(PATH); But this function does not work on directories, which is my problem.
I am in a situation where I need to know the size of a directory "inode". On (most) Linux systems the standard size of a directory is 4 kB, or one block:
$:~/tmp/test$ mkdir ex
$:~/tmp/test$ ls -l
total 4
drwxrwxr-x 2 secret secret 4096 Oct 15 08:43 ex
These 4 kB include space for a "list" of the files in that directory.
But if the number of files in the directory becomes significantly large, the size of the folder can increase (which is where I am).
I need to be able to track this increase.
So my question is that besides calling ls -l or du from C++ is there a C++-native way of getting the size of the directory?
I am aware that the reason it does not work with std::filesystem::file_size(path) is that file systems represent directories in different ways.
https://en.cppreference.com/w/cpp/filesystem/file_size
For a regular file p, returns the size determined as if by reading the
st_size member of the structure obtained by POSIX stat (symlinks are
followed)
The result of attempting to determine the size of a directory (as well
as any other file that is not a regular file or a symlink) is
implementation-defined.
file_size on a directory may actually work in some implementations, but I guess that yours doesn't. There doesn't appear to be a pure C++ alternative, but you can call the POSIX stat function yourself. That does work on directories and reports the number you want.
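A minimal sketch of that stat call, assuming a POSIX system; directory_entry_size is just a name picked for this example, and returning -1 on failure is an arbitrary choice.

```cpp
#include <sys/stat.h>  // POSIX stat(), struct stat, S_ISDIR

// Returns st_size of the directory itself (not of its contents),
// or -1 if stat() fails or `path` is not a directory.
long long directory_entry_size(const char* path) {
    struct stat st {};
    if (stat(path, &st) != 0) {
        return -1;
    }
    if (!S_ISDIR(st.st_mode)) {
        return -1;
    }
    return static_cast<long long>(st.st_size);
}
```

Calling it periodically on the same path and comparing the values is one way to track the growth described in the question. What st_size means for a directory still depends on the filesystem, so the number is only comparable on the same filesystem type.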

How to get known paths for linux

Windows has a concept of a Known Path with functions to retrieve them without hard-coding a path:
#include <filesystem>
#include <windows.h>
#include <ShlObj.h>
//...
std::filesystem::path GetAppDataPath() {
    namespace FS = std::filesystem;
    PWSTR ppszPath = nullptr;
    auto hr_path = ::SHGetKnownFolderPath(FOLDERID_RoamingAppData, KF_FLAG_DEFAULT, nullptr, &ppszPath);
    bool success = SUCCEEDED(hr_path);
    if (success) {
        auto p = FS::path(ppszPath);
        ::CoTaskMemFree(ppszPath);
        p = FS::canonical(p);
        return p;
    }
    return {};
}
Is there an equivalent for linux?
Linux is an operating system kernel. It does not have a concept of user directories.
There are several Linux distributions. The filesystem structure is determined by the distro. Most distros conform to the POSIX standard and follow (to varying degrees) the Filesystem Hierarchy Standard by the Linux Foundation, which is similar to the directory structures of other UNIX-like systems. That said, distributions generally allow the user to use the file system in unconventional configurations. For example, they don't typically force a user's home directory to be under /home.
POSIX specifies a few environment variables that are relevant to this context:
HOME
The system shall initialize this variable at the time of login to be a pathname of the user's home directory.
TMPDIR
This variable shall represent a pathname of a directory made available for programs that need a place to create temporary files.
Environment variables can be accessed using std::getenv in C++.
On desktop systems, the directory structure is also determined to some degree by the desktop environment, of which there are several available. freedesktop.org produces unofficial specifications for interoperability of different desktop environments. On DEs conforming to the XDG Base Directory Specification, the following environment variables should be available:
$XDG_DATA_HOME defines the base directory relative to which user specific data files should be stored. If $XDG_DATA_HOME is either not set or empty, a default equal to $HOME/.local/share should be used.
$XDG_CONFIG_HOME defines the base directory relative to which user specific configuration files should be stored. If $XDG_CONFIG_HOME is either not set or empty, a default equal to $HOME/.config should be used.
$XDG_DATA_DIRS defines the preference-ordered set of base directories to search for data files in addition to the $XDG_DATA_HOME base directory. The directories in $XDG_DATA_DIRS should be seperated with a colon ':'.
If $XDG_DATA_DIRS is either not set or empty, a value equal to /usr/local/share/:/usr/share/ should be used.
freedesktop.org also provides a utility xdg-user-dirs:
xdg-user-dirs is a tool to help manage "well known" user directories like the desktop folder and the music folder. It also handles localization (i.e. translation) of the filenames.
$(XDG_CONFIG_HOME)/user-dirs.dirs specifies the current set of directories for the user. This file is in a shell format, so its easy to access from a shell script. This file can also be modified by users (manually or via applications) to change the directories used.
So, in case of FOLDERID_RoamingAppData, you should probably use one of $XDG_x depending on the use case, falling back to the appropriate default relative to $HOME as specified.
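The "use the variable, else fall back to the default under $HOME" rule quoted above could be sketched like this. The function names are made up for the example, and it deliberately ignores other details of the specification.

```cpp
#include <cstdlib>  // std::getenv
#include <string>

// Resolves one XDG base directory: use `variable` when it is set and
// non-empty, otherwise fall back to $HOME/<default_suffix>.
// Returns an empty string if neither the variable nor HOME is usable.
std::string xdg_base_dir(const char* variable, const char* default_suffix) {
    const char* value = std::getenv(variable);
    if (value != nullptr && *value != '\0') {
        return value;
    }
    const char* home = std::getenv("HOME");
    if (home != nullptr && *home != '\0') {
        return std::string(home) + "/" + default_suffix;
    }
    return {};
}

// Rough counterparts of a per-user application data and config folder.
std::string user_data_dir() {
    return xdg_base_dir("XDG_DATA_HOME", ".local/share");
}

std::string user_config_dir() {
    return xdg_base_dir("XDG_CONFIG_HOME", ".config");
}
```

An application would then typically append its own subdirectory to the result, for example user_data_dir() + "/myapp" (a placeholder name).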

C++ Check if mount path is still mounted

I have the details of the mount path (specifically the mount prefix) as obtained using getmntent, in the structure defined below:
struct mntent {
char *mnt_fsname; /* name of mounted file system */
char *mnt_dir; /* file system path prefix */
char *mnt_type; /* mount type (see mntent.h) */
char *mnt_opts; /* mount options (see mntent.h) */
int mnt_freq; /* dump frequency in days */
int mnt_passno; /* pass number on parallel fsck */
};
Using mnt_dir I want to check if the mount path is still mounted after a while as it is possible that before some processing is done on it, it might have been unmounted.
What is the most efficient way to check if the path is still mounted?
Also, is there a way to get a callback in case the path gets unmounted?
I'd say that the most efficient way is to cache st_dev and st_ino returned by stat() (although probably caching just st_dev should be enough).
If the volume gets unmounted, the mount point reverts to the empty subdirectory in the parent filesystem where the volume was originally mounted, and stat() will return a different device+inode, for the same file path.
As far as being notified, poke around the inotify(7) interface, paying attention to the IN_UNMOUNT event.
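The caching idea could look roughly like the sketch below. MountWatch is an invented name; it assumes the mount point is captured while the volume is still mounted, and it covers only the polling check, not inotify.

```cpp
#include <sys/stat.h>   // stat(), struct stat
#include <sys/types.h>  // dev_t, ino_t
#include <string>
#include <utility>      // std::move

// Remembers which device and inode a mount point resolved to when the
// object was created, and re-checks that identity later.
class MountWatch {
public:
    explicit MountWatch(std::string dir) : dir_(std::move(dir)) {
        struct stat st {};
        valid_ = (stat(dir_.c_str(), &st) == 0);
        if (valid_) {
            dev_ = st.st_dev;
            ino_ = st.st_ino;
        }
    }

    // False if the initial stat() failed.
    bool valid() const { return valid_; }

    // True while the path still resolves to the same device and inode.
    bool still_mounted() const {
        struct stat st {};
        return valid_ && stat(dir_.c_str(), &st) == 0 &&
               st.st_dev == dev_ && st.st_ino == ino_;
    }

private:
    std::string dir_;
    dev_t dev_ = 0;
    ino_t ino_ = 0;
    bool valid_ = false;
};
```

You would construct it from mnt_dir right after getmntent and call still_mounted() just before the later processing. A volume that was unmounted and then mounted again at the same place may look unchanged to this check.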

How to get USB Drive Label in Linux?

I am trying to get a USB drive's label in my C/C++ application. I am using libudev to get the USB details, but it doesn't provide the drive's label. Does anyone have an idea how to get the drive label? I am working on an embedded platform that doesn't have a /dev/disk folder.
Please Help.
Kernel Version : 3.3.8
Normally, a USB drive has a vfat partition on it to make it compatible between MS-DOS, Windows, Linux and Mac.
The label is a property of the vfat filesystem. It normally appears as the first directory entry in the root directory, marked as a volume label. Recent implementations of the MS-DOS filesystems (namely vfat/FAT32) also write it in a fixed part of the boot record for that partition, so you can read it from there.
The volume serial number is at offset 0x43 (4 bytes) in the first sector of the partition.
A copy of the volume label follows at offset 0x47 in that same sector (11 bytes). These are the FAT32 offsets; FAT12/FAT16 boot sectors keep the same two fields at 0x27 and 0x2B.
The trick is: since a USB stick is normally partitioned (with only one partition), you have to:
look in the first sector of the USB stick for the partition table and locate the first partition.
then look in the first sector of that partition, locate byte offset 0x43, and use those four bytes as the volume serial number (it matches UUID="..." in the Linux /etc/fstab file) and the eleven bytes that follow as the volume label.
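The second step can be sketched as below, under the assumption that the partition is FAT32 and that you already have its first 512-byte sector in memory. The struct and function names are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr std::size_t kSectorSize = 512;
constexpr std::size_t kFat32SerialOffset = 0x43;  // 4 bytes, little-endian
constexpr std::size_t kFat32LabelOffset = 0x47;   // 11 bytes, space-padded
constexpr std::size_t kFat32LabelLength = 11;

struct Fat32VolumeInfo {
    std::uint32_t serial;
    std::string label;
};

// Extracts serial number and label from a FAT32 boot sector.
// Assumes the caller has checked that the partition really is FAT32.
Fat32VolumeInfo parse_fat32_boot_sector(
        const std::array<unsigned char, kSectorSize>& sector) {
    Fat32VolumeInfo info{};
    // Assemble the little-endian serial, most significant byte first.
    for (int i = 3; i >= 0; --i) {
        info.serial = (info.serial << 8) | sector[kFat32SerialOffset + i];
    }
    info.label.assign(
        reinterpret_cast<const char*>(&sector[kFat32LabelOffset]),
        kFat32LabelLength);
    // The on-disk label is padded with spaces; strip them.
    const std::size_t last = info.label.find_last_not_of(' ');
    info.label.erase(last == std::string::npos ? 0 : last + 1);
    return info;
}
```

If you open the partition device itself (something like /dev/sdb1, a placeholder here) and read its first 512 bytes, you are already at the partition's first sector, so the partition-table lookup is only needed when reading from the whole-disk device.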
Note
Be careful: NTFS doesn't use that location for this purpose, and you can damage an NTFS partition by writing there. Only read from it.
Note 2
Also, don't try to write to that place even in vfat filesystems, as they also maintain a copy of the volume label in the root directory of the filesystem.
Note 3
The easiest way to get the label of a DOS filesystem (and ext[234], NTFS, etc.) in Linux is the blkid(8) command. It gives the following output:
/dev/sda1: UUID="0b2741c0-90f5-48d7-93ce-6a03d2e8e9aa" TYPE="ext4"
/dev/sda5: UUID="62e2cbf2-d847-4048-856a-a90b91116285" TYPE="crypto_LUKS"
/dev/mapper/sda5_crypt: UUID="vnBDh3-bcaR-Cu7E-ok5D-oeFp-5SyP-MmAEsb" TYPE="LVM2_member"
/dev/mapper/my_vg-root: UUID="1b9f158b-35b5-490e-b914-bdc70e7f5c28" TYPE="ext4"
/dev/mapper/my_vg-swap_1: UUID="36b8ac81-7043-42ae-9f2a-908d53e2a2b3" TYPE="swap"
/dev/sdb1: LABEL="K003_1G" UUID="641B-80BF" TYPE="vfat"
As you can see, the last entry is for a vfat USB pen drive, but you have to parse this output (I think that is not difficult to do).
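Parsing one such line could be as simple as the sketch below. It assumes the KEY="value" layout shown in the output above; blkid_field is a name invented for this example.

```cpp
#include <cstddef>
#include <string>

// Extracts the value of KEY="value" from one line of blkid output.
// Returns an empty string when the key is not present on the line.
std::string blkid_field(const std::string& line, const std::string& key) {
    // The leading space keeps LABEL from matching inside PARTLABEL,
    // and UUID from matching inside PARTUUID.
    const std::string needle = " " + key + "=\"";
    const std::size_t start = line.find(needle);
    if (start == std::string::npos) {
        return {};
    }
    const std::size_t value_begin = start + needle.size();
    const std::size_t value_end = line.find('"', value_begin);
    if (value_end == std::string::npos) {
        return {};
    }
    return line.substr(value_begin, value_end - value_begin);
}
```

Applied to the /dev/sdb1 line above, asking for LABEL should give K003_1G. Getting the lines into the program (for instance by running blkid through popen) is left out of the sketch.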
I believe the "label" of a disk is a property maintained by the file system it's using, i.e. it's not at the USB level.
You're going to need the proper file system implementation, i.e. "mount" the disk.
You can use blkid to read the USB device label:
blkid USB_PATH | grep -o "LABEL.*" | cut -d'"' -f2

Efficiently List All Sub-Directories in a Directory

Please see edit with advice taken so far...
I am attempting to list all the directories(folders) in a given directory using WinAPI & C++.
Right now my algorithm is slow & inefficient:
- Use FindFirstFileEx() to open the folder I am searching
- I then look at every file in the directory (using FindNextFile()); if it's a directory I store its absolute path in a vector, if it's just a file I do nothing.
This seems extremely inefficient because I am looking at every file in the directory.
Is there a WinAPI function that I can use that will tell me all the sub-directories in a given directory?
Do you know of an algorithm I could use to efficiently locate & identify folders in a directory(folder)?
EDIT:
So after taking the advice I have searched using FindExSearchLimitToDirectories but for me it still prints out all the files(.txt, etc.) & not just folders. Am I doing something wrong?
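WIN32_FIND_DATA dirData;
HANDLE dir = FindFirstFileEx( "c:/users/soribo/desktop\\*", FindExInfoStandard, &dirData,
                              FindExSearchLimitToDirectories, NULL, 0 );
if ( dir != INVALID_HANDLE_VALUE )
{
    do  // do/while so the entry returned by FindFirstFileEx itself is not skipped
    {
        printf( "FileName: %s\n", dirData.cFileName );
    } while ( FindNextFile( dir, &dirData ) != 0 );
    FindClose( dir );
}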
In order to see a performance boost there must be support at the file system level. If this does not exist then the system must enumerate every single object in the directory.
In principle, you can use FindFirstFileEx specifying the FindExSearchLimitToDirectories flag. However, the documentation states (emphasis mine):
This is an advisory flag. If the file system supports directory filtering, the function searches for a file that matches the specified name and is also a directory. If the file system does not support directory filtering, this flag is silently ignored.
If directory filtering is desired, this flag can be used on all file systems, but because it is an advisory flag and only affects file systems that support it, the application must examine the file attribute data stored in the lpFindFileData parameter of the FindFirstFileEx function to determine whether the function has returned a handle to a directory.
However, from what I can tell (and information is sparse), the FindExSearchLimitToDirectories flag is not widely supported on desktop file systems.
Your best bet is to use FindFirstFileEx with FindExSearchLimitToDirectories. You must still perform your own filtering in case you meet a file system that doesn't support directory filtering at file system level. If you get lucky and hit upon a file system that does support it then you will get the performance benefit.
If you're using FindFirstFileEx, then you should be able to specify the _FINDEX_SEARCH_OPS::FindExSearchLimitToDirectories option (to be used as the fSearchOp param in FindFirstFileEx) to limit the first search (and any subsequent FindNextFile() calls) to directories.