I have a test of C++ code that passes in most runs, but in some rare instances fails because a call to lchown() in my application under test fails with errno EPERM (strerror: "Operation not permitted").
The code in question in my application is like this:
::lchown("pathnameToFile", uid_t(500), static_cast<unsigned>(-1)); // This line works
::lchown("pathnameToFile", static_cast<unsigned>(-1), gid_t(500)); // This fails rarely
Also, in the test case iteration that failed, a prior attempt to create a symbolic link to "pathnameToFile" failed to create it, yet the code did not detect any error (the following call returned 0):
::symlink("pathnameToFile", "linkToPathname");
I imagine these two things are related. This is running on a 32-bit CentOS 4 machine.
The "pathnameToFile" exists on an NFS mounted partition. Could there be some kind of race condition between the file being created and the link to it and lchown failing because the NFS does not reflect its existence yet?
After some time elapsed however, the symbolic link appeared although the chown remained without effect.
The "pathnameToFile" lives in a directory with permissions:
drwxrwxr-x 2 me me 4096 Jun 22 17:33 .
-rw-rw-r-- 1 me root 33 Jun 22 17:33 pathnameToFile
lrwxrwxrwx 1 me root 8 Jun 22 17:33 LinkToPathname -> pathnameToFile
The gid 500 is the primary group for 'me', the other group being 'wheel'.
> groups
me wheel
It is a race condition; add a short sleep when the lchown() fails and try again.
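A minimal sketch of that suggestion (the retry count and the 100 ms delay are my own assumptions, not part of the answer):
#include <unistd.h>
#include <cerrno>
#include <cstdio>
// Retry the group change a few times, backing off briefly,
// before treating EPERM as fatal.
bool chownGroupWithRetry(const char *path, gid_t gid, int attempts = 5) {
    for (int i = 0; i < attempts; ++i) {
        if (::lchown(path, static_cast<uid_t>(-1), gid) == 0)
            return true;          // group change took effect
        if (errno != EPERM)
            break;                // a different error; retrying won't help
        ::usleep(100 * 1000);     // 100 ms before the next attempt
    }
    std::perror("lchown");
    return false;
}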
Under Linux:
I am trying to access a folder using 'recursive_directory_iterator' and it is throwing a "Permission denied" exception.
Here are the permissions for the given folder:
ls -l /my_folder/
total 4
drwxr-xr-x 8 root root 4096 Nov 2 2021 my_subfolder
This means that "others" have exec and thus should be able to view and enter 'my_subfolder'.
Why then does the iterator throw an exception on it, if viewing is still permitted?
How can I just iterate through a tree as a user with only 'x' permission on the folder?
My file manager run with my regular user (same user that runs my application) can do this with no problem on the same folder...
Thanks!
Some clarification:
With both my file manager and 'ls' I can list the contents of these folders and their sub-folders as a regular user. This is what I am trying to achieve with my C++ application as well, using std::filesystem::recursive_directory_iterator.
Here is an output of my console:
ls -l /media/my_user/my_folder/
total 36
drwxr-xr-x 4 root root 4096 Jul 10 2020 install-logs-2020-07-10.0
drwxr-xr-x 4 root root 4096 Jul 27 2020 install-logs-2020-07-27.0
drwxr-xr-x 4 root root 4096 Jul 29 2020 install-logs-2020-07-29.0
drwxr-xr-x 4 root root 4096 Nov 2 2021 install-logs-2021-11-02.0
drwxr-xr-x 4 root root 4096 Nov 2 2021 install-logs-2021-11-02.1
drwx------ 2 root root 16384 Jul 10 2020 lost+found
~$ ls -l /media/my_user/
total 4
drwxr-xr-x 8 root root 4096 Nov 2 2021 my_folder
~$ ls /media/my_user/my_folder/
install-logs-2020-07-10.0 install-logs-2020-07-27.0 install-logs-2020-07-29.0 install-logs-2021-11-02.0 install-logs-2021-11-02.1 lost+found
~$ ls /media/my_user/my_folder/install-logs-2020-07-
install-logs-2020-07-10.0/ install-logs-2020-07-27.0/ install-logs-2020-07-29.0/
~$ ls /media/my_user/my_folder/install-logs-2020-07-10.0/
crash log
As I said, I can do the same with my UI file manager too. I just want to do the same using C++.
Based on some comments, and in particular the answer by @Homer512, I made the code even simpler, and I still have trouble figuring this out.
Now all I want is to get the permissions of a given directory, but it seems that even that is not possible??
The directory I am passing is NOT the 'lost+found' directory, but its parent.
The current code I am trying is this:
void findPackagesRecursivelyIn(const std::filesystem::path &dir) {
    namespace fs = std::filesystem;
    fs::directory_iterator dirIt{dir};
    std::cout << "getting status" << std::endl;
    auto permissions = fs::status(dirIt->path()).permissions();
    std::cout << "got status" << std::endl;
    if ((permissions & fs::perms::others_exec) != fs::perms::none) {
        std::cout << dirIt->path() << std::endl;
    }
    std::cout << "after loop" << std::endl;
}
Here is the output for this code:
getting status
terminate called after throwing an instance of 'std::filesystem::__cxx11::filesystem_error'
what(): filesystem error: directory iterator cannot open directory: Permission denied [/media/my_user/my_folder]
Based on this, it seems that even initializing the iterator with a directory that has drwxr-xr-x permissions is not possible...
What am I missing here??
Many thanks for your help!
The directory contains a subdirectory lost+found that can only be accessed by root. This is typical for an ext2/3/4 filesystem.
You can skip permission errors by constructing the iterator with the skip_permission_denied flag:
recursive_directory_iterator(path,
    std::filesystem::directory_options::skip_permission_denied)
Alternatively, you can check permissions when the iterator returns the directory, then call recursive_directory_iterator::disable_recursion_pending() if you find that the access would fail. I don't see another way to report permission errors without invalidating the iterator.
Also, this workaround has a race condition, because the directory permissions could change between testing and recursing into it.
I guess keeping a copy of the iterator between iterations would also allow recovery, but at a higher cost.
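For reference, a minimal self-contained sketch of the skip_permission_denied approach, using the /media/my_user/my_folder path from the question:
#include <filesystem>
#include <iostream>

int main() {
    namespace fs = std::filesystem;
    // Recurse from the root, silently skipping directories (such as
    // lost+found) that the process is not permitted to open.
    const fs::path root{"/media/my_user/my_folder"};
    for (const auto &entry : fs::recursive_directory_iterator(
             root, fs::directory_options::skip_permission_denied)) {
        std::cout << entry.path() << '\n';
    }
}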
OK, I figured out what the problem is, though I am not sure yet why it behaves like that.
The issue was not with the permissions.
The folder I was accessing was a mount point that is mounted when a USB stick is inserted.
The issue was that while I was notified (using Qt's QFileSystemWatcher) that the mounted folder was added, apparently it is still not available at that time.
So, if I output the available storage list (in the slot triggered by QFileSystemWatcher):
for (const auto &storage : QStorageInfo::mountedVolumes()) {
    std::cout << "Storage:" << storage.displayName().toStdString() << std::endl;
}
The very same mounted folder that QFileSystemWatcher was notifying me about was not in the list of available storage.
If I restart the application, and list the storage again, it is in the list and I can access it.
As this is a different problem to the original I posted, I will close this question with this answer.
If I see that I can't figure out the problem I just explained, I'll open a new question for it.
Many thanks for all of you who helped!!
P.S.
For whatever reason, QFileSystemWatcher notifies about a new mount even when that mount is not fully available yet.
The solution was to build a little "wait and poll" loop in the slot.
That did the trick.
Once the mount is fully "there", the std::filesystem::recursive_directory_iterator code works as expected.
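For anyone hitting the same thing, here is a sketch of such a loop (the function name, the mount-point argument, and the retry limits are illustrative assumptions, not my exact code):
#include <QStorageInfo>
#include <QThread>

// Poll QStorageInfo until the mount point reported by QFileSystemWatcher
// is actually listed and ready, or give up after ~5 seconds.
bool waitForMount(const QString &mountPoint, int attempts = 50) {
    for (int i = 0; i < attempts; ++i) {
        const auto volumes = QStorageInfo::mountedVolumes();
        for (const QStorageInfo &storage : volumes) {
            if (storage.rootPath() == mountPoint && storage.isReady())
                return true;   // the mount is fully "there"
        }
        QThread::msleep(100);  // short sleep before polling again
    }
    return false;              // gave up; mount never became available
}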
I have a Squish Test Suite with two test cases inside. It runs fine normally.
However, when Valgrind is enabled, the second test case fails maybe 50% of the time, with the error:
Script Error RuntimeError: startApplication() failed
I see in the log that Squish only waited 20 seconds between attempting to start the second test case and giving up.
Thinking that this is a timeout issue due to Valgrind running, I've tried increasing the following timeouts:
squishserver --config setAUTTimeout 600
squishserver --config setResponseTimeout 1200
squishserver --config setSoftExitTimeout 20000
squishrunner --config setAUTTimeout 600
squishrunner --config setAUTPostMortemTimeout 300000
I also call startApplication in the second test case with a much longer timeout:
app = startApplication("MyApp", 600000)
However, none of these seem to increase the startup wait time reported in the log beyond 20 seconds.
Any advice on how to deal with a second test case that fails on startup when the CPU is heavily loaded (by Valgrind)?
--- EDIT ---
server.ini (local to each linux account)
[General]
AUT/MyApp= "/vobs/ui/squish_utils"
AUTPMTimeout = "20000"
AUTSoftExitTimeout = "2000"
AUTTimeout = "20"
ResponseTimeout = "300"
sponseTimeout = "1200"
Unit test output (note the :17 and :37 timestamps, i.e. a 20-second wait):
92 2021-11-23T21:00:17 END_TEST_CASE End 'tst_case1' End of test 'tst_case1'
93 21:00:17:224 Debug: Connection closed from 127.0.0.1:34460
94 2021-11-23T21:00:17 START_TEST_CASE Start 'tst_case2' Test 'tst_case2' started (tst_case2)
95 2021-11-23T21:00:17 LOG /vobs/ui/tools/MyApp/unit_test/suite_WS/tst_case2/test.py:42: Timeout set to: 300000
96 21:00:17:467 Debug: Connection established from 127.0.0.1:34494
97 RUNNERID: 163488749
98 21:00:17:470 Debug: Connection established from 127.0.0.1:34495
99 AUTID: 163488751
100 AUTHOST: 127.0.0.1
101 AUTPORT: 4322
102 RUNNERID: -1
103 QT_NO_GLIB
104
105 21:00:19:236 Debug: SoftExitTimeout reached (2000ms). Some AUT processes are still running and will now be shutdown.
106 21:00:19:236 Debug: Trying to close process: /vobs/ui/squish_utils/MyApp
107 AUT stderr (163488749): QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-build'
108 AUT stderr (163488749): INFO: Loading Qt Wrapper configuration from "/user/tools/LINUX/Development/squish/squish-6.3.1-qt59x-linux64/etc/qtwrapper.ini"
109 AUT stderr (163488749): INFO: Installing atexit() handler to catch AUT shutdown
110 AUT stderr (163488749): INFO: Use of native dialogs in Qt is disabled (if possible)
111 AUT stderr (163488749): QSettings path override to: "/vobs/ui/tools/MyApp/unit_test"
112 AUT stderr (163488749):
113 AUTID: -1
114 2021-11-23T21:00:37 ERROR /vobs/ui/tools/MyApp/unit_test/suite_WS/tst_case2/test.py:44: Script Error RuntimeError: startApplication() failed
115 2021-11-23T21:00:37 END_TEST_CASE End 'tst_case2' End of test 'tst_case2'
116 21:00:37:479 Debug: Connection closed from 127.0.0.1:34495
117 21:00:37:481 Debug: Connection closed from 127.0.0.1:34494
118 2021-11-23T21:00:37 END End 'suite_WS' End of test 'suite_WS'
I have created a disk and added it to an instance already. I SSH'ed into the instance, issued the following command, and confirmed the disk has been added to the instance:
xenonxie#us-central1-a:~$ ls -l /dev/disk/by-id/
total 0
lrwxrwxrwx 1 root root 9 Dec 25 03:39 google-persistent-disk-0 -> ../../sda
lrwxrwxrwx 1 root root 10 Dec 25 03:39 google-persistent-disk-0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 Dec 25 19:38 google-persistent-disk-1 -> ../../sdb
lrwxrwxrwx 1 root root 9 Dec 25 03:39 scsi-0Google_PersistentDisk_persistent-disk-0 -> ../../sda
lrwxrwxrwx 1 root root 10 Dec 25 03:39 scsi-0Google_PersistentDisk_persistent-disk-0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 Dec 25 19:38 scsi-0Google_PersistentDisk_persistent-disk-1 -> ../../sdb
The last line scsi-0Google_PersistentDisk_persistent-disk-1 is the disk I created/attached.
I would like to validate the size of the disk (200 GB), but something seems wrong, as I do not get the result I expect:
xenonxie#us-central1-a:~$ du -hc /dev/disk/by-id
0 /dev/disk/by-id
0 total
xenonxie#us-central1-a:~$
Thanks.
When we create a disk in GCP and associate it with a Compute Engine instance, we may wish to ask questions about the nature of the disk. To see the raw structure of the disk, we can use the fdisk command, which lists the partitions and their sizes; a good choice is fdisk -l. Since this is a low-level command, you might have to run it with sudo.
If you have mounted the disk as a file system, you can then use file system commands to examine the Linux filesystem. Two commands of interest are du, which examines disk utilization, and df, which shows how much disk space is free.
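For example (the device name /dev/sdb comes from the by-id listing above; the mount point is only a placeholder for illustration):
sudo fdisk -l /dev/sdb
df -h /path/to/mountpoint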
My requirement
My python server runs as a regular user on RHEL
But it needs to create files/directories in places it doesn't have access to.
It also needs to chown those files to arbitrary UID/GID values.
My approach
I am trying this in a capability-only environment, no setuid.
I am trying to make use of the cap_chown and cap_dac_override capabilities,
but I am totally lost on how to get this working in a systemd (systemctl) kind of environment.
At present I have following in the service file:
#cat /usr/lib/systemd/system/my_server.service
[Service]
Type=simple
SecureBits=keep-caps
User=testuser
CapabilityBoundingSet=~
Capabilities=cap_dac_override,cap_chown=eip
ExecStart=/usr/bin/linux_capability_test.py
And following on the binary itself:
# getcap /usr/bin/linux_capability_test.py
/usr/bin/linux_capability_test.py = cap_chown,cap_dac_override+ei
But this says that it will never work on scripts:
Is there a way for non-root processes to bind to "privileged" ports on Linux?
With the current setting, the capabilities I have for the running process are:
# ps -ef | grep lin
testuser 28268 1 0 22:31 ? 00:00:00 python /usr/bin/linux_capability_test.py
# getpcaps 28268
Capabilities for `28268': = cap_chown,cap_dac_override+i
But if I try to create a file in /etc/ from within that script:
import os

try:
    file_name = '/etc/junk'
    with open(file_name, 'w') as f:
        os.utime(file_name, None)
except IOError as e:
    print(e)
It fails with 'Permission denied'
Is this the same situation for me, meaning it won't work?
Can I use the python-prctl module here to get it working?
setuid will not work with scripts because it is a security hole, due to the way that scripts execute. There are several documents on this; you can start by looking at the Wikipedia page.
A really good workaround is to write a small C program that launches your Python script with hard-coded paths to python and the script. A really good discussion of all the issues may be found here.
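A minimal sketch of such a launcher (the interpreter path /usr/bin/python is an assumption; the script path is taken from the service file above):
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Hard-coded interpreter and script paths, as the workaround requires;
       execl() only returns if it fails. */
    execl("/usr/bin/python", "python",
          "/usr/bin/linux_capability_test.py", (char *)NULL);
    perror("execl");
    return 1;
}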
Update: Here is a method to do this, though I am not sure if it is the best one. Using the 'python-prctl' module:
1. Ditch 'User=testuser' from my_server.service
2. Start the server as root
3. Set the 'keep_caps' flag to True
4. Do 'setgroups', 'setgid' and 'setuid'
5. Immediately limit the permitted capability set to the 'DAC_OVERRIDE' and 'CHOWN' capabilities only
6. Set the effective capability for both to True
Here is the code for the same:
import os
import prctl

prctl.securebits.keep_caps = True
os.setgroups([160])
os.setgid(160)
os.setuid(160)
prctl.cap_permitted.limit(prctl.CAP_CHOWN, prctl.CAP_DAC_OVERRIDE)
prctl.cap_effective.dac_override = True
prctl.cap_effective.chown = True
DONE !!
Based upon our discussion above, I did the following:
[Service]
Type=simple
User=testuser
SecureBits=keep-caps
Capabilities=cap_chown,cap_dac_override=i
ExecStart=/usr/bin/linux_capability_test.py
This starts the server with both those capabilities as inheritable.
I wrote a small C test program to chown the file:
#include <unistd.h>

int main()
{
    /* chown /etc/junk to UID/GID 160; returns 0 on success, -1 on failure */
    int ret = chown("/etc/junk", 160, 160);
    return ret;
}
Set the following on the gcc'ed binary:
chown testuser:testuser /usr/bin/chown_c
chmod 550 /usr/bin/chown_c
setcap cap_chown,cap_dac_override=ie /usr/bin/chown_c
The server does the following to call the binary:
import os
import prctl

prctl.cap_inheritable.chown = True
prctl.cap_inheritable.dac_override = True
os.execve('/usr/bin/chown_c', ['chown_c'], os.environ)
And I was able to get the desired result:
# ll /etc/junk
-rw-r--r-- 1 root root 0 Aug 8 22:33 /etc/junk
# python capability_client.py
# ll /etc/junk
-rw-r--r-- 1 testuser testuser 0 Aug 8 22:33 /etc/junk
My program creates a log file every 10 seconds in a specified directory. In a different thread, it iterates over the files in that directory; if a file has content it compresses it and uploads it to external storage, and if the file is empty it deletes it. After the program runs for a while I get the error "too many open files" (gzopen failed, errno = 24).
When I look inside /proc/<pid>/fd I see many broken links to files in the same directory where the logs are created, with the word (deleted) next to each link.
Any idea what I am doing wrong? I checked the return values in both threads: of the close function (in the thread which writes the logs) and of boost::filesystem::remove (in the thread which compresses and uploads the non-empty log files and deletes the empty ones). All the return values are zero, while the list of (deleted) links gets longer by 1 every 10 seconds.
I don't think this problem ever happened on 32-bit, but recently I moved to 64-bit and now I got this surprise.
You are neglecting to close files you open.
From your description, it sounds like you close the files you open for logging in your logging thread, but you go on to say that you just boost::filesystem::remove files after compressing and/or uploading.
Remember that:
Any compressed file you opened with gzopen has to be gzclose()d.
Any uncompressed file you open in order to compress it has to be closed.
If you open a file to check whether it's empty, you have to close it.
If you open a file to transfer it, you have to close it. (See the RAII sketch right after this list.)
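One way to make the gzclose pairing automatic is a small RAII wrapper around zlib's gzFile; this is a sketch, not taken from your code:
#include <zlib.h>
#include <stdexcept>

// Owns a gzFile handle; the destructor guarantees gzclose() runs even on
// early returns or exceptions, preventing the fd leak described above.
class GzFile {
public:
    GzFile(const char *path, const char *mode) : f_(gzopen(path, mode)) {
        if (!f_) throw std::runtime_error("gzopen failed");
    }
    ~GzFile() { if (f_) gzclose(f_); }
    GzFile(const GzFile &) = delete;
    GzFile &operator=(const GzFile &) = delete;
    gzFile get() const { return f_; }  // raw handle for gzwrite/gzread
private:
    gzFile f_;
};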
Output of /proc/pid/fd would be very helpful in narrowing this down, but unfortunately you didn't post it. Examples of how seemingly unhelpful output gives subtle hints:
# You forgot to gzclose the output file after compressing it
l-wx------ user group 64 Apr 9 10:17 43 -> /tmp/file.gz (deleted)
# You forgot to close the input file after compressing it
lr-x------ user group 64 Apr 9 10:17 43 -> /tmp/file (deleted)
# You forgot to close the log file after writing to it
l-wx------ user group 64 Apr 9 10:17 43 -> /tmp/file (deleted)
# You forgot to close the input file after transferring it
lr-x------ user group 64 Apr 9 10:17 43 -> /tmp/file.gz (deleted)