How can I get ring 0 operating mode for my process in Windows 7 (or Vista)?
Allowing arbitrary code to run in ring 0 violates basic OS security principles.
Only the OS kernel and device drivers run in ring 0. If you want to write ring 0 code, write a Windows device driver. This may be helpful.
Certain security holes may allow your code to run in ring 0 also, but this isn't portable because the hole might be fixed in a patch :P
Technically speaking, all processes have some threads spending some of their time in Kernel-Mode (ring 0). Whenever a user-mode process makes a syscall into the OS, there is a transition where the thread gets into ring 0 via a 'gate'. Whenever a process needs to talk to a device, allocate more process-wide memory, or spawn new threads, a syscall is used to ask the OS to provide this service.
Therefore, if you want a process to run some code in ring 0, you'll need to write a driver and then communicate with it through syscalls. The most common syscall for this is called ioctl (short for I/O Control); on Windows, the user-mode equivalent is DeviceIoControl.
Another thing to look at on the Windows platform is the UMDF (User-Mode Driver Framework). This allows you to write, debug, and test a driver in user-mode (running in ring 3) but it is still accessible to other drivers or other processes in the system.
You cannot set kernel mode from a user mode process. That's how security works.
Not sure whether I should post this here or not but I gotta ask.
Context :
Linux on an embedded platform (CPU ~500 MHz)
One team working on the single userspace software
One team working on Linux + driver + uboot etc.
The software has to handle GPIO, some are output (write when needed), some are input (read when needed for some, preferably interrupt-like for others).
The software is a multi-threaded app with ~10-15 threads in SCHED_FIFO scheduling policy.
Let's say I have a module called WGPIO which is a wrapper handling GPIO. (this is developed by the Linux team btw. WGPIO is still in user-space, but they could develop a driver if needed)
Here is some pseudo_code of what is designed as we speak.
gpio_state state = ON;
// IO_O is output. Set to ON, don't care if it's active_high or active_low btw
WGPIO_WriteOutput(IO_O,state);
// IO_I is input, read when needed
WGPIO_ReadInput(IO_I,&state);
// register callback when rising edge occurs on IO named IO_IT
WGPIO_SetCallback(IO_IT,EDGE_RISING,my_callback);
// Unmask to enable further IT-like processing
WGPIO_UnmaskIRQ(IO_IT);
I must be able to handle some of the GPIO changes in 5 to 10ms.
Is userspace polling on multiple FDs (WGPIO would then have a SCHED_FIFO thread) enough to simulate "interrupt-like" handling in my app? This looks like the simplest idea.
If you need more details, feel free to ask.
Thanks in advance.
From kernel gpio/sysfs.txt:
"value" ... reads as either 0 (low) or 1 (high). If the GPIO
is configured as an output, this value may be written;
any nonzero value is treated as high.
If the pin can be configured as interrupt-generating interrupt
and if it has been configured to generate interrupts (see the
description of "edge"), you can poll(2) on that file and
poll(2) will return whenever the interrupt was triggered. If
you use poll(2), set the events POLLPRI and POLLERR. If you
use select(2), set the file descriptor in exceptfds. After
poll(2) returns, either lseek(2) to the beginning of the sysfs
file and read the new value or close the file and re-open it
to read the value.
"edge" ... reads as either "none", "rising", "falling", or
"both". Write these strings to select the signal edge(s)
that will make poll(2) on the "value" file return.
This file exists only if the pin can be configured as an
interrupt generating input pin.
The preferred way is usually to configure the interrupt with /sys/class/gpio/gpioN/edge and poll(2) for POLLPRI | POLLERR (importantly, not POLLIN!) on /sys/class/gpio/gpioN/value. If your process is a "real-time" process that needs to handle the events in real time, consider decreasing its niceness.
You can even find example code on GitHub that uses poll, e.g. this repo.
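A minimal sketch of that poll(2) approach in C, assuming the GPIO has already been exported and its edge file written. The helper names gpio_read_value and gpio_wait_edge, and the gpio17 path in the comment, are made up for illustration. An initial read before the first poll is commonly needed so that a stale pending state is not reported as an edge; treat that as an assumption to verify on your kernel.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Re-read a sysfs "value" file from its start.
 * Returns 0 or 1 for the pin level, -1 on error or unexpected content. */
int gpio_read_value(int fd)
{
    char buf[4] = {0};

    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;
    if (read(fd, buf, sizeof(buf) - 1) <= 0)
        return -1;
    if (buf[0] == '0')
        return 0;
    if (buf[0] == '1')
        return 1;
    return -1;
}

/* Wait for the configured edge on an open "value" file descriptor.
 * Returns 1 on an edge, 0 on timeout, -1 on error.
 * timeout_ms < 0 waits forever, as with poll(2). */
int gpio_wait_edge(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI | POLLERR };
    int rc;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);   /* retry if a signal interrupted us */

    if (rc <= 0)
        return rc;                        /* 0 = timeout, -1 = error */
    return (pfd.revents & POLLPRI) ? 1 : -1;
}

/* Usage sketch (hypothetical pin number):
 *
 *   int fd = open("/sys/class/gpio/gpio17/value", O_RDONLY);
 *   gpio_read_value(fd);                   // initial read before polling
 *   while (gpio_wait_edge(fd, -1) == 1)
 *       handle_level(gpio_read_value(fd)); // handle_level is yours
 */
```

With several pins, the same idea extends to one poll(2) call over an array of pollfd entries, which fits a single dedicated thread in the wrapper.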
How can I execute a script (in my case, one that copies logs to flash or to a remote host) before the watchdog resets the device?
Should I modify the Linux kernel watchdog driver? If so, in which method?
Or maybe it is possible somehow to configure this by:
/etc/default/watchdog
/etc/watchdog.conf
However, we have busybox installed, where watchdog configuration is limited.
I cannot find anything on Google, which is surprising, as this seems a basic problem that needs solving: everybody wants the logs preserved after a watchdog reset, in persistent memory (flash) rather than under /var/log/.
Of course, periodically copying the logs to flash during normal device operation is not a good idea; there should be some way to do this when the timeout for feeding /dev/watchdog expires.
On a Linux kernel newer than 4.9, the pretimeout governor framework should be available, which lets you write a kernel driver that reacts when a pre-timeout is detected. A solution like this is well beyond the scope of a simple question and answer, so I'm letting my original answer stand.
TL;DR:
If the problem is detectable while the OS is still running you can flush the logs. If the problem is caused by the OS locking up then you won't have an opportunity to fix the issue as hardware will reset the box.
There are two things here:
Watchdog device
Watchdog program
The watchdog device is typically a hardware timer that will do 'something specifically low level' when its timer expires. The most common low level thing to do is reset the box. There is no OS involvement in this if it happens in hardware. You will have no opportunity to do anything high level once that timer runs out, e.g. writing log files somewhere.
The watchdog program is a tool that reassures the watchdog device periodically as long as its check conditions are met.
The busybox watchdog timer's condition is a simple loop (pseudo code):
while (1) {
# reassure watchdog
# sleep some time
}
so if the program stops running - e.g. by an OS lockup or termination of the program then the underlying hardware will simply kick the box.
The 'bigger' watchdog binary provides a bunch of checks, and if they fail, then it will trigger the repair-binary option in the /etc/watchdog.conf to try to recover. This would be a potential point to flush the logs.
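To illustrate the "flush while the OS is still running" case, here is a rough C sketch of one supervision step: feed the device when a health check passed, otherwise copy the log to persistent storage and stop feeding. The function names, the healthy flag and the paths are assumptions for illustration, not part of busybox or the watchdog daemon. It relies only on the documented behaviour that any write to /dev/watchdog restarts the countdown.

```c
#include <fcntl.h>
#include <unistd.h>

/* Any write to the watchdog device restarts its countdown. */
int watchdog_feed(int wd_fd)
{
    return write(wd_fd, "\0", 1) == 1 ? 0 : -1;
}

/* Copy src to dst and fsync dst so the data reaches persistent storage
 * before the reset. Returns 0 on success, -1 on failure. */
int copy_file_sync(const char *src, const char *dst)
{
    char buf[4096];
    ssize_t n = 0;
    int rc = -1;
    int in, out;

    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    while ((n = read(in, buf, sizeof(buf))) > 0) {
        char *p = buf;
        while (n > 0) {                      /* handle short writes */
            ssize_t w = write(out, p, (size_t)n);
            if (w < 0)
                goto done;
            p += w;
            n -= w;
        }
    }
    if (n == 0 && fsync(out) == 0)           /* n == 0 means clean EOF */
        rc = 0;
done:
    close(in);
    close(out);
    return rc;
}

/* One supervision step.
 * healthy != 0: feed the watchdog, return 0 (or -1 if feeding failed).
 * healthy == 0: flush the log and return 1 (or -1 if the copy failed);
 *               the caller should stop feeding so the hardware resets. */
int watchdog_step(int wd_fd, int healthy,
                  const char *log_src, const char *log_dst)
{
    if (healthy)
        return watchdog_feed(wd_fd);
    return copy_file_sync(log_src, log_dst) == 0 ? 1 : -1;
}
```

A caller would run watchdog_step in a loop with a sleep shorter than the hardware timeout. This only helps when the failure is detected in software; after a hard lockup nothing in user space runs.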
I'm writing a kernel driver, which should read (and in some cases, also write) some memory addresses in kernel session space (win32k.sys). I've read in another topic that, for example, in WinDbg I should change the context to a random user process to read the memory of kernel session space (with .process /p). How can I do that in a kernel driver? Should I create a user process which communicates with the driver (that's my idea now, but I hope that there is a better solution), or is there a simpler solution for this?
Session space is not mapped in the system address space (which drivers share when not attached to any process). That's why you get a BSOD when accessing win32k.
You need to attach to a process in the target session (its EPROCESS) via KeStackAttachProcess before performing this operation. You can get the session ID with ZwQueryInformationProcess, using the ProcessSessionInformation class.
Kernel memory space is shared among all kernel objects (just like real/unprotected mode in DOS and early Windows versions). A kernel driver can access any address within kernel space, whether it belongs to that driver or not.
You must find and attach to the csrss process!
win32k.sys is not mapped into the system address space of every process, only that of csrss.
You should do a stack attach (KeStackAttachProcess) to the csrss process.
I have a Qt application that runs on Linux.
The user can switch the system to mem sleep using this application.
Switching to mem sleep is trivial, but catching the wake up event in user space isn't.
My current solution is to use an infinite loop to trap the mem sleep, so that when the system wakes up, my application always continues from a predictable point.
Here is my code:
void MainWindow::memSleep()
{
    int fd;
    fd = ::open("/sys/power/state", O_RDWR); // see update 1)
    QTime start = QTime::currentTime();
    write(fd, "mem", 3); // command that triggers mem sleep
    while (1) {
        usleep(5000); // delay 5ms
        const QTime &end = QTime::currentTime(); // check system clock
        if (start.msecsTo(end) > 5 * 2) { // if time gap is more than 10ms
            break; // it means this thread was frozen for more
        }          // than 5ms, indicating a wake up after a sleep
        start = end;
    }
    ::close(fd); // the end of this function marks a wake up event
}
I described this method as a comment on this question, and it was pointed out that it's not a good solution, which I agree with.
Question: Is there a C API that I can use to catch the wake up event?
Update:
1) what is mem sleep?
https://www.kernel.org/doc/Documentation/power/states.txt
The kernel supports up to four system sleep states generically, although three
of them depend on the platform support code to implement the low-level details
for each state.
The states are represented by strings that can be read or written to the
/sys/power/state file. Those strings may be "mem", "standby", "freeze" and
"disk", where the last one always represents hibernation (Suspend-To-Disk) and
the meaning of the remaining ones depends on the relative_sleep_states command
line argument.
2) why do I want to catch the wake up event?
Because some hardware needs to be reset after a wake up. A hardware input device generates erroneous input events after the system wakes up, so it has to be disabled before sleep (easy) and enabled after wake up (this question).
This should/could be handled by the driver in the kernel, which I have access to, or fixed in hardware, which my team could do but does not have the time for. (That is why I, an app developer, need to fix it in user space.)
3) constraints
This is embedded Linux, kernel 2.6.37, arch:arm, march:omap2, distro:arago. It's not as convenient as PC distros for adding packages, nor does it have ACPI. And mem sleep support in kernel 2.6.37 isn't mature at all.
Linux device drivers for PCI devices can optionally handle suspend and resume which, presumably, the kernel calls, respectively, just before the system is suspended, and just after resuming from a suspend. The PCI entrypoints are in struct pci_driver.
You could write and install a trivial device driver which does nothing more than sense resume operations and provide an indication to any interested processes. The simplest might be to support a file read() which returns a single byte whenever a resume is sensed. The program need only open the device and leave a thread blocked reading a single character. Whenever the read succeeds, the system has just resumed.
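The user-space half of that idea could look roughly like the C sketch below. The device node name in the comment and the function name wait_for_resume are hypothetical, since the driver described above does not exist yet. The function simply blocks in read() until the driver hands over one byte.

```c
#include <errno.h>
#include <unistd.h>

/* Block until the (hypothetical) resume-notifier device delivers one byte.
 * Returns 1 when a resume was signalled, 0 on EOF, -1 on error. */
int wait_for_resume(int fd)
{
    unsigned char token;
    ssize_t n;

    do {
        n = read(fd, &token, 1);
    } while (n < 0 && errno == EINTR);   /* restart if a signal interrupted */

    if (n == 1)
        return 1;
    return n == 0 ? 0 : -1;
}

/* Usage sketch, run in a dedicated thread:
 *
 *   int fd = open("/dev/resume_notify", O_RDONLY);  // hypothetical node
 *   while (wait_for_resume(fd) == 1)
 *       reenable_input_device();                    // your own handler
 */
```

In a Qt application the same descriptor could instead be watched from the event loop, which avoids the extra thread.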
More to the point, if the devices your application is handling have device drivers, the drivers should be updated to react appropriately to a resume.
When the system wakes from sleep, it should generate an ACPI event, so acpid should let you detect and handle that: via an /etc/acpi/events script, by connecting to /var/run/acpid.socket, or by using acpi_listen. (acpi_listen should be an easy way to test if this will work.)
Check pm-utils, which lets you place a hook in /etc/pm/sleep.d.
In the hook you can deliver a signal to your application, e.g. by kill or any IPC.
You can also let pm-utils perform the suspend, which IMO is far more compatible with different configurations.
EDIT:
I'm not familiar with arago, but pm-utils comes with Arch and Ubuntu.
Also note that on newer systems that use systemd, pm-utils is obsolete and you should put the hooks in systemd instead.
REF: systemd power events
I'm currently working on a C++ server that takes requests and spawns off new processes to handle them. Those child processes then (sometimes) have to execute calls to system(3) to invoke other programs (third-party ones over which I have no control). This server is being ported over to a new hardware platform, so I have to retain compatibility between multiple systems, going back to kernel 2.4.20. I'm currently ignoring children (signal(SIGCHLD, SIG_IGN)) and this works fine on the old kernel. However, when I run the server on the newer kernels (2.6, 3.2) on different hardware, the system(3) call fails, setting errno to ECHILD. What has changed in the kernel, and what's the proper way of handling children if I can't ignore them? (Note: when I register a handler for SIGCHLD following Beej's example, it works fine.)