How do I read the effective cgroups limits for the current process using /sys/fs/cgroup - cgroups

I'm looking for the simplest recipe for reading the effective cgroup v1 and v2 CPU limits for my own process (using file-system operations, not OS commands).
From the kernel docs and looking around, I think the main steps are something like this, but I really want to get it right so it doesn't break on particular OSes/cgroup versions:
Locate the cgroup root mount directory (typically /sys/fs/cgroup/cpu,cpuacct for v1 and /sys/fs/cgroup for v2, though in theory it could be mounted somewhere else?) --> probably grepping for cgroup in /proc/self/mountinfo is needed for this?
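To make that step concrete, here's the sort of thing I have in mind (an untested sketch that scans the filesystem-type field of each /proc/self/mountinfo line for cgroup/cgroup2 entries):

    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>

    // Sketch: find cgroup mount points by scanning /proc/self/mountinfo.
    // A line looks like:
    //   36 35 98:0 / /sys/fs/cgroup rw,nosuid ... - cgroup2 cgroup2 rw
    // Fields after the " - " separator are: fstype, source, super-options.
    struct CgroupMount {
        std::string mountPoint;   // 5th field of the line
        std::string fsType;       // "cgroup" (v1) or "cgroup2" (v2)
        std::string superOptions; // for v1 this names the controllers, e.g. "rw,cpu,cpuacct"
    };

    std::vector<CgroupMount> findCgroupMounts() {
        std::vector<CgroupMount> mounts;
        std::ifstream in("/proc/self/mountinfo");
        std::string line;
        while (std::getline(in, line)) {
            auto sep = line.find(" - ");
            if (sep == std::string::npos) continue;
            std::istringstream pre(line.substr(0, sep));
            std::string id, parent, dev, root, mountPoint;
            pre >> id >> parent >> dev >> root >> mountPoint;
            std::istringstream post(line.substr(sep + 3));
            std::string fsType, source, superOpts;
            post >> fsType >> source >> superOpts;
            if (fsType == "cgroup" || fsType == "cgroup2")
                mounts.push_back({mountPoint, fsType, superOpts});
        }
        return mounts;
    }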
Since the above is the root cgroup and my own process might be in a more specific child cgroup, I think I need to locate the subdirectory of the /sys/fs/cgroup[/cpu,cpuacct] root corresponding to the cgroup my process is actually in. --> My best guess is to use /proc/self/cgroup to find out what this subdir is, though if I'm running inside a container the resulting path (e.g. /sys/fs/cgroup/cpu,cpuacct/docker/xxxxxxxx) appears not to exist, so maybe I should just fall back to the root cgroup? Is that a safe thing to do?
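And for this step, my rough sketch for parsing /proc/self/cgroup (lines look like 0::/some/path for v2, or 4:cpu,cpuacct:/some/path for v1):

    #include <fstream>
    #include <string>

    // Sketch: get this process's cgroup path for a controller.
    // /proc/self/cgroup lines have the form  hierarchy-ID:controller-list:path
    //   v2 (unified):  "0::/user.slice/user-1000.slice"
    //   v1 example:    "4:cpu,cpuacct:/docker/abc123"
    // Returns the path (relative to the cgroup mount), or "" if not found.
    std::string cgroupPathFor(const std::string& controller /* "" means v2 */) {
        std::ifstream in("/proc/self/cgroup");
        std::string line;
        while (std::getline(in, line)) {
            auto c1 = line.find(':');
            auto c2 = line.find(':', c1 + 1);
            if (c1 == std::string::npos || c2 == std::string::npos) continue;
            std::string controllers = line.substr(c1 + 1, c2 - c1 - 1);
            std::string path = line.substr(c2 + 1);
            if (controller.empty() && controllers.empty())
                return path;                     // v2 unified line "0::..."
            if (("," + controllers + ",").find("," + controller + ",")
                    != std::string::npos)
                return path;                     // v1 line listing the controller
        }
        return "";
    }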
Once I've found the directory for my own process's cgroup, I need to read the limit files (e.g. cpu.max for cgroup v2). I'm not quite clear on whether these files (e.g. cpu.max) are always present in every cgroup directory, or only at the levels of the hierarchy where a limit is set. Do I need to start at the deepest cgroup and work backwards up the tree until I find a cpu.max file that exists?
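Here's the walk-upwards version I'm imagining for this step, for the v2 cpu.max case (just a sketch, and whether walking up is even the right semantics is exactly what I'm asking):

    #include <fstream>
    #include <string>

    // Sketch: starting at my own cgroup directory, read cpu.max; if the file
    // is absent (controller not enabled at that level?) or says "max", try
    // the parent directory, stopping at the cgroup mount root.
    // NOTE: I'm assuming the *effective* limit is the most restrictive value
    // among ancestors, so possibly all levels should be checked rather than
    // stopping at the first hit - this is part of my question.
    bool readCpuMax(const std::string& mountRoot, std::string cgroupPath,
                    long long& quota, long long& period) {
        while (true) {
            std::ifstream in(mountRoot + cgroupPath + "/cpu.max");
            if (in) {
                std::string q;
                in >> q >> period;               // format: "$QUOTA $PERIOD" or "max $PERIOD"
                if (q != "max") { quota = std::stoll(q); return true; }
            }
            if (cgroupPath.empty() || cgroupPath == "/") return false; // hit the root
            auto slash = cgroupPath.find_last_of('/');
            cgroupPath.erase(slash);             // go one level up
        }
    }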
If cgroup v1 and v2 are both active on the same system (this seems to be allowed), should I prefer the v2 files when present and fall back to v1 otherwise, or the other way around?
I couldn't figure it out from the OS/kernel docs, and I was surprised not to find a short blog post or code snippet answering this question of how to get the /sys/fs directory containing the effective values for my own process... if there is one then great; if not, perhaps this StackOverflow question is a good place for that information to live!

Related

Is there a faster alternative to enumerating folders than FindFirstFile/FindNextFile with C++?

I need to get all paths to subfolders within a folder (with WinAPI and C++). So far the only solution I've found is recursively calling FindFirstFile / FindNextFile, but it takes a significant amount of time on a folder with a deep hierarchy.
So I was wondering, just to get folder names, is there a faster approach?
If you really just need subfolders, you should be able to use FindFirstFileEx with search options to filter out non-directories. The docs suggest this is an advisory flag only, but your file system may support this optimization - give it a try.

FindExSearchLimitToDirectories
This is an advisory flag. If the file system supports directory filtering, the function searches for a file that matches the specified name and is also a directory. If the file system does not support directory filtering, this flag is silently ignored.
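For example, a minimal sketch (since the flag is only advisory, it still checks the directory attribute itself):

    #include <windows.h>
    #include <iostream>

    // Sketch: enumerate subdirectories with FindFirstFileEx, asking the file
    // system to filter out non-directories. Because the flag is advisory,
    // we still check dwFileAttributes ourselves.
    void listSubdirectories(const wchar_t* pattern /* e.g. L"C:\\some\\dir\\*" */) {
        WIN32_FIND_DATAW fd;
        HANDLE h = FindFirstFileExW(pattern,
                                    FindExInfoBasic,                // skip short names (Win7+)
                                    &fd,
                                    FindExSearchLimitToDirectories, // advisory filter
                                    nullptr, 0);
        if (h == INVALID_HANDLE_VALUE) return;
        do {
            if ((fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) &&
                wcscmp(fd.cFileName, L".") != 0 && wcscmp(fd.cFileName, L"..") != 0)
            {
                std::wcout << fd.cFileName << L"\n";
            }
        } while (FindNextFileW(h, &fd));
        FindClose(h);
    }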
A faster approach is to bypass the FindFirstFile...() API and go straight to the file system. You can use DeviceIoControl() with the FSCTL_ENUM_USN_DATA control code to access the master file table, at least on NTFS-formatted volumes. With that information you can directly access the records for files/folders, which include their attributes, parent info, etc. Yes, it is more work, but it should also be faster, since you can optimize the code to access just the pieces you need.
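For illustration, a minimal sketch of such an enumeration loop, assuming an NTFS volume opened with admin rights (error handling trimmed):

    #include <windows.h>
    #include <winioctl.h>
    #include <iostream>
    #include <string>
    #include <vector>

    // Sketch: walk every record in the NTFS MFT via FSCTL_ENUM_USN_DATA.
    // Each USN_RECORD carries a file/dir name plus its parent's file
    // reference number, so the tree can be rebuilt without FindFirstFile.
    void enumerateMft(const wchar_t* volume /* e.g. L"\\\\.\\C:" */) {
        HANDLE hVol = CreateFileW(volume, GENERIC_READ,
                                  FILE_SHARE_READ | FILE_SHARE_WRITE,
                                  nullptr, OPEN_EXISTING, 0, nullptr);
        if (hVol == INVALID_HANDLE_VALUE) return;   // typically needs admin

        MFT_ENUM_DATA_V0 med = {};                  // start at the first record
        med.StartFileReferenceNumber = 0;
        med.LowUsn  = 0;
        med.HighUsn = MAXLONGLONG;

        std::vector<BYTE> buf(1 << 16);
        DWORD bytes = 0;
        while (DeviceIoControl(hVol, FSCTL_ENUM_USN_DATA, &med, sizeof(med),
                               buf.data(), (DWORD)buf.size(), &bytes, nullptr)) {
            // The first 8 bytes of the output are the next StartFileReferenceNumber.
            BYTE* p = buf.data() + sizeof(USN);
            while (p < buf.data() + bytes) {
                auto* rec = reinterpret_cast<USN_RECORD*>(p);
                std::wstring name(
                    reinterpret_cast<wchar_t*>(
                        reinterpret_cast<BYTE*>(rec) + rec->FileNameOffset),
                    rec->FileNameLength / sizeof(wchar_t));
                if (rec->FileAttributes & FILE_ATTRIBUTE_DIRECTORY)
                    std::wcout << name << L"\n";
                p += rec->RecordLength;
            }
            med.StartFileReferenceNumber = *reinterpret_cast<DWORDLONG*>(buf.data());
        }
        CloseHandle(hVol);
    }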
That is the fastest approach you will come across. You might also consider using another thread to manage the directory enumeration, as it takes a lot of time; even Microsoft's File Explorer spends a while when a directory has a lot of subfolders/files.
One more thing: you can enumerate the directories once and then register for any updates, so the cost of enumerating the folder is paid only once, during startup.
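One way to do the register-for-updates part is ReadDirectoryChangesW; a minimal synchronous sketch (a real app would run this on a worker thread, likely with overlapped I/O):

    #include <windows.h>
    #include <iostream>
    #include <string>

    // Sketch: watch a directory tree for changes so the initial enumeration
    // only happens once; afterwards, apply incremental updates.
    void watchTree(const wchar_t* root) {
        HANDLE h = CreateFileW(root, FILE_LIST_DIRECTORY,
                               FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                               nullptr, OPEN_EXISTING,
                               FILE_FLAG_BACKUP_SEMANTICS, nullptr);
        if (h == INVALID_HANDLE_VALUE) return;

        alignas(DWORD) BYTE buf[64 * 1024];
        DWORD bytes;
        while (ReadDirectoryChangesW(h, buf, sizeof(buf), TRUE /* watch subtree */,
                                     FILE_NOTIFY_CHANGE_DIR_NAME |
                                     FILE_NOTIFY_CHANGE_FILE_NAME,
                                     &bytes, nullptr, nullptr)) {
            auto* info = reinterpret_cast<FILE_NOTIFY_INFORMATION*>(buf);
            for (;;) {
                std::wstring name(info->FileName,
                                  info->FileNameLength / sizeof(wchar_t));
                std::wcout << info->Action << L" " << name << L"\n";
                if (info->NextEntryOffset == 0) break;
                info = reinterpret_cast<FILE_NOTIFY_INFORMATION*>(
                    reinterpret_cast<BYTE*>(info) + info->NextEntryOffset);
            }
        }
        CloseHandle(h);
    }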

Is there a way to completely remove an inode when the Link count is 2?

Currently my data is organised in a volume which has a cache directory (where all the files are first created or transferred). After that there are suitable directories on the volume which in their subdirs, contain files hardlinked to files in the cache.
This is done so that the same inode (file) can be hardlinked multiple times in multiple directories.
Now, when trying to clean up the volume, I recursively go through the dirs (not the cache) and, based on certain criteria, unlink the files (which reduces the link count of the cache entry's inode by 1). Is there a way for me to delete the cache entry directly when I am deleting the last hardlink (that is, bringing the count down from 2 to 1)? That way I would not have to manually parse the whole cache directory to clear any inodes with a link count of just 1.
I have gone through the unlink/remove functions and could not find anything of use. If there is some purging algorithm that internally takes care of this, I could try to implement that.
Any help on this would be highly appreciated; thanks in anticipation of a prompt reply.
I saw this and a few other places which explain how to delete all hardlinks from the shell (use find -samefile and call remove on each file). You could call that via system(), although that might be frowned upon by some people.
No, there isn't anything that does what you want out of the box.
It might be useful to do the deletion when unlinking the hardlink and noticing that the link count is 1, since at that point the inode should be in the page cache; this of course depends on knowing the name of the file in the cache directory.
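Building on that, a sketch of the link-count check around unlink() (POSIX calls; cachePathFor() is a hypothetical helper, since you need some way to derive the cache entry's name):

    #include <string>
    #include <sys/stat.h>
    #include <unistd.h>

    // Sketch of the "check the link count while unlinking" idea: if the file
    // we are about to unlink has exactly two links, the only remaining one
    // after unlinking is the cache entry, so remove that too.
    // cachePathFor() is a hypothetical helper mapping a file to the name of
    // its hardlink in the cache directory - you must know that name somehow.
    std::string cachePathFor(const std::string& path); // hypothetical

    bool unlinkWithCacheCleanup(const std::string& path) {
        struct stat st;
        if (stat(path.c_str(), &st) != 0) return false;
        if (unlink(path.c_str()) != 0) return false;
        if (st.st_nlink == 2)                   // last non-cache link just went away
            unlink(cachePathFor(path).c_str()); // drop the cache entry as well
        return true;
    }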

System Volume Information: Access Denied

I have written a program that can search a physical device at the sector level, from start to finish. Now I want to create a routine that will deal only with logical files.
Therefore, I need to recursively list all directories and files on an NTFS volume (or FAT32) using native C++. The problem I am running into is an "access denied" error whenever I encounter a Windows System folder.
C:\System Volume Information and
C:\Users
are just two examples of these folders.
I am NOT looking to "skip over" these directories. On the contrary, they are most important to the project at hand.
I have tried a variety of options offered in C++ forums etc., and all of them either fail (access denied) or come with the quick answer to "skip over" those folders.
At this point I am wondering if I need to somehow look up the physical sectors for these folders and systematically trace through each one's extents at the physical-sector level?
Looking for some help here and I would appreciate any ideas. Thank you!
NOTE: I saw no point in posting sample code, only because I've tried way too many combinations (most of which could read ordinary directories) and all of which failed to navigate the system directories.
Not that I recommend this, but since you're very determined, why don't you just temporarily change the security descriptors on those folders with SetFileSecurity so you can open a handle, then change them back again? That should work.
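A sketch of that idea using GetNamedSecurityInfo / SetEntriesInAcl / SetNamedSecurityInfo (close relatives of SetFileSecurity); it assumes the process already has the right to rewrite the DACL, e.g. running elevated, and omits taking ownership:

    #include <windows.h>
    #include <aclapi.h>

    // Sketch: temporarily grant the current user read/list access to a folder
    // by merging an allow-ACE into its existing DACL, then restore the old DACL.
    bool withTemporaryAccess(const wchar_t* folder, void (*work)(const wchar_t*)) {
        PACL oldDacl = nullptr;
        PSECURITY_DESCRIPTOR sd = nullptr;
        if (GetNamedSecurityInfoW(folder, SE_FILE_OBJECT,
                                  DACL_SECURITY_INFORMATION,
                                  nullptr, nullptr, &oldDacl, nullptr, &sd)
                != ERROR_SUCCESS)
            return false;

        EXPLICIT_ACCESSW ea = {};
        ea.grfAccessPermissions = GENERIC_READ | FILE_LIST_DIRECTORY;
        ea.grfAccessMode = GRANT_ACCESS;
        ea.grfInheritance = SUB_CONTAINERS_AND_OBJECTS_INHERIT;
        ea.Trustee.TrusteeForm = TRUSTEE_IS_NAME;
        ea.Trustee.ptstrName = const_cast<wchar_t*>(L"CURRENT_USER");

        PACL newDacl = nullptr;
        if (SetEntriesInAclW(1, &ea, oldDacl, &newDacl) == ERROR_SUCCESS) {
            SetNamedSecurityInfoW(const_cast<wchar_t*>(folder), SE_FILE_OBJECT,
                                  DACL_SECURITY_INFORMATION,
                                  nullptr, nullptr, newDacl, nullptr);
            work(folder);                                   // do the enumeration
            SetNamedSecurityInfoW(const_cast<wchar_t*>(folder), SE_FILE_OBJECT,
                                  DACL_SECURITY_INFORMATION,
                                  nullptr, nullptr, oldDacl, nullptr); // restore
            LocalFree(newDacl);
        }
        LocalFree(sd);
        return true;
    }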

Out of Core Implementation of a Quadtree

I am trying to build a quadtree data structure (or let's just say a tree) in secondary memory (the hard disk).
I have a C++ program to do so, and I use fopen to create the files. I am using tesseral coding, so each cell is stored in a file named with its corresponding code, all in one directory on the disk.
The problem is that after creating about 1,100 files, fopen just returns NULL and stops creating new files. I can create further files manually in that directory; it's only through C++ that no more can be created.
I know about the ext3 limit of roughly 32,000 links/subdirectories per directory (from Wikipedia), but I am way below that; also note that I can create files manually on the disk, just not through fopen.
Also, I would really appreciate any ideas on the best way to store a very dynamic quadtree on disk (I need the nodes to be in separate files, and the quadtree might reach a depth of 50).
Using nested directories is one idea, but I think it would hurt performance, because of following the links on the filesystem to access each file.
Thanks,
Nima
What's the errno value of the failed fopen() call?
Do you keep the files you have created open? If so, you are most probably exceeding the maximum number of open files per process.
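For example, a quick way to surface the reason (EMFILE is the usual suspect for too many open files per process):

    #include <cerrno>
    #include <cstdio>
    #include <cstring>

    // Sketch: report why fopen failed - EMFILE means too many open files
    // in this process; ENOSPC/EDQUOT point at the filesystem instead.
    FILE* openChecked(const char* path, const char* mode) {
        FILE* f = std::fopen(path, mode);
        if (!f)
            std::fprintf(stderr, "fopen(%s) failed: %s (errno=%d)\n",
                         path, std::strerror(errno), errno);
        return f;
    }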
When you use directories as data structures, you delegate the work of maintaining that structure to the file system, which is not necessarily designed to do that.
Edit: Frank is probably right that you've exceeded the number of available file descriptors. You can increase those limits, but that shows you're using internals of your ABI as a data structure, which is slow and (as resources are exhausted) unstable.
Either code for a very specific OS installation, or use a SQL database.
I have no idea why fopen wouldn't work. Look at errno.
However, storing everything in one directory is a bad idea: once you add a lot of files, it will get slow. Having a directory for every level of the tree will also be slow.
Instead, combine multiple levels into one directory. You could, for example, have one directory for every four levels of the tree. This limits the number of directories, the amount of nesting, and the number of files per directory, giving very good performance.
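For example, a sketch of that layout, assuming each node is named by its tesseral code (digits 0-3), with one directory component per four tree levels:

    #include <string>

    // Sketch: map a tesseral code (e.g. "0123201...") to a path where every
    // four tree levels share one directory, e.g. "0123/2011/30" for depth 10.
    // A depth-50 tree then needs ~13 path components instead of 50.
    std::string nodePath(const std::string& root, const std::string& code) {
        std::string path = root;
        for (std::size_t i = 0; i < code.size(); i += 4) {
            path += '/';
            path += code.substr(i, 4);
        }
        return path;   // the final component doubles as the node's file name
    }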
The limitation could come from:
- stdio (the C library): at most 256 handles by default; this can be increased to 1024 (in VC, call _setmaxstdio)
- the OS kernel's limit on file handles per process (usually 1024)

How can I quickly enumerate directories on Win32?

I'm trying to speed up directory enumeration in C++, where I'm recursing into subdirectories. I currently have an app which spends 95% of its time in the FindFirst/FindNextFile APIs, and it takes several minutes to enumerate all the files on a given volume. I know it's possible to do this faster because there is an app that does: Everything. It enumerates my entire drive in seconds.
How might I accomplish something like this?
I realize this is an old post, but there is a project on SourceForge that does exactly what you are asking, and the source code is available.
You can find the project here: NTFS-Search
"Everything" builds an index in the background, so queries are against the index not the file system itself.
There are a few improvements to be made, at least over the straightforward algorithm:
First, breadth-first search over depth-first search. That is, enumerate and process all files in a single folder before recursing into the subfolders you found. This improves locality, usually a lot. (A sketch combining this and the next point follows after this list.)
On Windows 7 / W2K8R2, you can use FindFirstFileEx with FindExInfoBasic, the main speedup being the omission of the short file name on NTFS file systems where that feature is enabled.
Separate threads help if you enumerate different physical disks (not just drives). For the same disk it only helps if it's an SSD ("zero seek time"), or you spend significant time processing a file name (compared to the time spent on disk access).
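A sketch combining the first two points above, assuming Windows 7+ for FindExInfoBasic:

    #include <windows.h>
    #include <deque>
    #include <iostream>
    #include <string>

    // Sketch: breadth-first enumeration - fully process one directory before
    // descending, queuing subdirectories for later. Uses FindExInfoBasic so
    // NTFS does not have to produce 8.3 short names.
    void enumerateBreadthFirst(const std::wstring& root) {
        std::deque<std::wstring> pending{root};
        while (!pending.empty()) {
            std::wstring dir = pending.front();
            pending.pop_front();

            WIN32_FIND_DATAW fd;
            HANDLE h = FindFirstFileExW((dir + L"\\*").c_str(), FindExInfoBasic,
                                        &fd, FindExSearchNameMatch, nullptr, 0);
            if (h == INVALID_HANDLE_VALUE) continue;
            do {
                if (wcscmp(fd.cFileName, L".") == 0 ||
                    wcscmp(fd.cFileName, L"..") == 0)
                    continue;
                if (fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
                    pending.push_back(dir + L"\\" + fd.cFileName); // visit later
                else
                    std::wcout << dir << L"\\" << fd.cFileName << L"\n";
            } while (FindNextFileW(h, &fd));
            FindClose(h);
        }
    }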
[edit] Wikipedia actually has some comments - basically, they are skipping the file system abstraction layer and accessing NTFS directly. This way they can batch calls and skip expensive services of the file system, such as checking ACLs.
A good starting point would be the NTFS Technical Reference on MSDN.
"Everything" accesses directory information at a lower level than the Win32 FindFirst/FindNext APIs.
I believe it reads and interprets the NTFS MFT structures directly, and that this is one of the main reasons for its performance. It's also why it requires admin privileges and why "Everything" only indexes local or removable NTFS volumes (not network drives, for example).
A couple of other utilities that do similar things are:
FindOnClick by 2Brightsparks
Search GT
A little reverse engineering with a debugger on these tools might give you some insight on the techniques they use.
Don't recurse immediately; save a list of the directories you find and dive into them when finished. You want linear access within each directory, to take advantage of locality of reference and any caching the OS is doing.
If you're already doing the best you can to get the maximum speed from the API, the next step is to do low-level disk accesses and bypass Windows altogether. You might get some guidance from the NTFS drivers for Linux, or perhaps you can use one directly.
If you are doing this on NTFS, here's a lib for low level access: NTFSLib.
You can enumerate through all file records in $MFT, each representing a real file on disk. You can get all file attributes from the record, including $DATA.
This may be the fastest way to enumerate all files/directories on NTFS volumes; I measured 200k-300k files per minute in my tests.