I am trying to build a minimal kernel. But I am not sure how to load a function from my custom bootloader into the kernel. Can anybody solve this problem?
Typically a boot loader's code is written for a different environment than the kernel's, so it doesn't make sense to use code from the boot loader in the kernel.
For the rare cases where that doesn't apply, you might pass a function pointer from the boot loader to the kernel (possibly as a parameter to the kernel's entry point, or possibly in some kind of table or other data structure that's passed to the kernel).
However, even when it is possible, it's likely to be easier to "cut & paste" the function into kernel's code (or use #include or..) instead of calling code in the boot loader. This is especially true if kernel frees/re-uses the memory that the boot loader consumed after the boot loader has done its job.
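As a rough illustration of the function-pointer approach, here is a minimal C++ sketch. The names BootInfo, put_string, kernel_main and boot_loader_start are hypothetical, and the "screen" is just a string so the sketch can run in a hosted environment; a real boot loader and kernel would be separate binaries that agree on the table layout.

```cpp
#include <cassert>
#include <string>

// Hypothetical table the boot loader fills in and passes to the kernel.
struct BootInfo {
    // Address of a function that lives in the boot loader's memory.
    void (*put_string)(const char* text);
};

// Stand-in for the screen so the sketch can run hosted.
std::string g_console;

// Boot loader side: the function whose address is handed over.
void loader_put_string(const char* text) { g_console += text; }

// Kernel side: the entry point receives the table as a parameter.
void kernel_main(const BootInfo* info) {
    assert(info != nullptr && info->put_string != nullptr);
    // This call only works while the boot loader's memory is still intact.
    info->put_string("kernel: hello via boot loader\n");
}

// Boot loader side: build the table, then enter the kernel.
void boot_loader_start() {
    static const BootInfo info = {loader_put_string};
    kernel_main(&info);
}
```

Note that the pointer is only valid as long as the kernel has not freed or reused the boot loader's memory, which is the main reason copying the function into the kernel tends to be easier.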
Related
I am trying to use Tensorflow for inference within my C++ application. Other parts of the application need access to large amounts of GPU memory (not at exactly the same time as Tensorflow). However, once Tensorflow has been used to perform inference, it hogs the GPU memory and does not release it until the application ends. Ideally, after inference, I would be able to free the GPU memory used by Tensorflow to allow other algorithms to use the GPU.
Has anyone else faced this problem, and did you find a solution?
Tensorflow allocates memory for the lifetime of the process. There is unfortunately no way around that; you only get the memory back once the process finishes.
One way to solve this would be to "modularize" your application into multiple distinct processes. Have one process for performing inference, and a parent process (your application) which calls it. You can run the child process blocking, so your entire app behaves as if it was executing the code itself (apart from handling resource sharing of course).
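A minimal sketch of the blocking child-process idea on a POSIX system follows. It uses fork() for brevity and assumes the parent has not touched the GPU before forking; in practice you would more likely launch a separate inference executable. run_in_child and run_inference are hypothetical names, and run_inference is only a placeholder for the real model code.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `work` in a child process and blocks until the child exits.
// Returns the child's exit code (0..255), or -1 if the child could not
// be started or did not exit normally.
int run_in_child(int (*work)()) {
    assert(work != nullptr);
    pid_t pid = fork();
    if (pid < 0) return -1;  // fork failed
    if (pid == 0) {
        // Child: do the work and terminate. Everything the child
        // allocated is returned to the system when it exits.
        _exit(work());
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Hypothetical placeholder for loading the model and running inference.
int run_inference() { return 0; }
```

The parent would call run_in_child(run_inference), wait for the result, and only then start the other GPU work. Results larger than an exit code would have to be passed back through a file, pipe, or shared memory.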
In order to identify code injection into memory (on Windows systems), I want to hash the memory of all processes on the system. For example, if the memory of calc.exe always hashes to x and now it hashes to y, I know that someone injected code into calc.exe.
1: Is this thinking correct? What part of the process memory always stays the same and what part is changing?
2: Does a DLL have separate memory, or is it in the memory of the exe? In other words, can I generate a hash for the memory of a DLL?
3: How can I dump the memory of a process or of a dll in c++?
Code is continually being injected in processes when running windows.
One example is delay-loaded DLLs. When a process starts up, only the core DLLs are loaded. When certain features get exercised, the process first loads the new DLLs (code) from disk and then executes them.
Another example is .NET managed applications. Most code sits as uncompiled code on disk. When new parts of the application need to be run, the .NET runtime loads that uncompiled code, compiles it (aka JITs it) and then executes it.
The problem you are trying to solve is worthwhile, but extremely hard. The OS itself tries to solve this problem to protect your processes.
If you are trying to do something more advanced than what Windows is doing for you behind the scenes, the first thing to do will be to understand all the steps Windows takes to protect processes and validate the code being injected into them, while still enabling processes to load code dynamically (which is a necessity).
Good luck.
Or maybe you have a more specific problem you are trying to solve?
1) The idea is nice, but a running process keeps changing its own memory (unless it is doing nothing), so hashing all of it won't work. What you could do is hash the code part of the memory.
2) No, DLLs are libraries linked to your code, not a separate process; they are loaded into the memory of the process that uses them. They are just loaded dynamically instead of statically (http://msdn.microsoft.com/en-us/library/windows/desktop/ms681914%28v=vs.85%29.aspx)
3) Normally your OS prohibits you from accessing the memory of neighbouring processes. If it allowed that for any process, it would be very easy for malware to propagate, and your system would be very unstable, as one crashing process could crash all the others. So it will be very tricky to do this kind of dump. But if your process has the right privileges, you could have a look at ReadProcessMemory()
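To make the "hash the code part" idea concrete, here is a small, portable C++ sketch. It assumes the bytes of the code section have already been copied into a buffer (on Windows that copy would come from something like ReadProcessMemory()), and it uses 64-bit FNV-1a only because it is short; a cryptographic hash would be the safer choice for tamper detection. hash_region and region_changed are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 64-bit FNV-1a hash over a byte range.
std::uint64_t hash_region(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= data[i];
        hash *= 1099511628211ULL;                  // FNV prime
    }
    return hash;
}

// True if the bytes no longer match the hash recorded earlier.
bool region_changed(const std::vector<std::uint8_t>& bytes,
                    std::uint64_t baseline) {
    return hash_region(bytes.data(), bytes.size()) != baseline;
}
```

You would record a baseline hash per code region once, then re-read and re-hash the same region later. Keep in mind that legitimately loaded or JIT-compiled code will also show up as new or changed regions.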
I have just done something similar: I basically ported these scripts to C#:
http://www.exploit-monday.com/2012/03/powershell-live-memory-analysis-tools.html
When writing code that is to be injected into a running process, and subsequently call functions from within that application, sometimes you need to create a function pointer if you're wanting to call a function provided by that application itself - in the manner of a computer-based training application, or a computer game hack, etc.
Function pointers are easy in C++ if you know the offset of the function. Finding those offsets is what becomes the time-consuming part if the application you're working with is frequently updated, because updates to the application may change the offsets.
Are there any methods of automatically tracking these offsets? I seem to recall hearing about fingerprinting methods or something that would attempt to automatically locate the functions for you. Any ideas about those?
This is very dependent on what you're injecting into and the environment you're running.
If you're in a Windows environment, I'd give this a read through:
x86 code injection into an x86 proccess from a x64 process
In a Linux-type environment you could do something with the global offset table.
You could always use a signature-based approach to find the function. Or perhaps there is an exported function that calls the function you want to hook; you could then trace the logic from that export to your target.
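A signature-based search usually means scanning the module's bytes for a pattern taken from the function's prologue, with wildcards for bytes that change between builds (addresses, call targets). The following is a minimal C++ sketch of such a scan; PatternByte, kNotFound and find_signature are hypothetical names, and the image is assumed to be a copy of the module's code already held in a buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One element of a signature: a concrete byte, or a wildcard that
// matches anything (used for bytes that differ between builds).
struct PatternByte {
    std::uint8_t value;
    bool wildcard;
};

const std::size_t kNotFound = static_cast<std::size_t>(-1);

// Returns the offset of the first match of `pattern` in `image`,
// or kNotFound if there is none.
std::size_t find_signature(const std::vector<std::uint8_t>& image,
                           const std::vector<PatternByte>& pattern) {
    if (pattern.empty() || pattern.size() > image.size()) return kNotFound;
    for (std::size_t start = 0; start + pattern.size() <= image.size(); ++start) {
        bool match = true;
        for (std::size_t i = 0; i < pattern.size(); ++i) {
            if (!pattern[i].wildcard && pattern[i].value != image[start + i]) {
                match = false;
                break;
            }
        }
        if (match) return start;
    }
    return kNotFound;
}
```

The returned offset, added to the module's base address, gives the value to cast to your function pointer type. A signature should be long enough to match exactly one place in the module, so it is worth checking that there is no second match.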
I'm wondering what happens in a CUDA program when a line like
myKernel<<<16,4>>>(arg1,arg2);
is encountered.
What happens then? Is the CUDA driver invoked and the ptx code passed to it or what?
"It just works". Just kidding. Probably I will get flamed for posting this answer, as my knowledge is not extensive in this area. But this is what I can say:
The nvcc code processor is a compiler driver, meaning it uses multiple compilers and steers pieces of code in one direction or another. You might want to read more about the nvcc toolchain here if you have questions like these. Anyway, one of the things the nvcc tool will do is replace the kernel launch syntax mykernel<<<...>>> with a sequence of api calls (served by various cuda and GPU api libraries). This is how the cuda driver gets "invoked" under the hood.
As part of this invocation sequence, the driver will perform a variety of tasks. It will inspect the executable to see if it contains appropriate SASS (device assembly) code. The device does not actually execute PTX, which is an intermediate code, but SASS. If no appropriate SASS is available, but PTX code is available in the image, the driver will do a JIT-compile step to create the SASS. (In fact, some of this actually happens at context creation time/CUDA lazy initialization, rather than at the point of the kernel launch.)
Additionally, in the invocation sequence, the driver will do various types of device status checking, data validity checking (e.g. kernel launch configuration parameters), and data copying (e.g. kernel sass code, kernel parameters) to the device.
Finally, the driver will initiate execution on the device, and then immediately return control to the host thread.
Additional insight into kernel execution may be obtained by studying kernel execution in the driver API. To briefly describe the driver API, I could call it a "lower level" API than the cuda runtime API. However, the point of mentioning it is that it may give some insight into how a kernel launch syntax (runtime API) could be translated into a C-level API that actually looks like library calls.
Someone else may come along with a better/more detailed explanation.
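For intuition about what the launch configuration in myKernel<<<16,4>>> asks for, here is a plain C++ host-side emulation: 16 blocks of 4 threads each, so 64 kernel instances. This is only a sequential model of the index space, not what the driver or GPU does internally; LaunchIndex, emulate_launch and fill_block_ids are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// The indices one kernel instance would see; field names mirror
// CUDA's built-in variables.
struct LaunchIndex {
    unsigned blockIdx;
    unsigned threadIdx;
    unsigned blockDim;
};

// Calls `kernel` once per (block, thread) pair, sequentially.
// A real GPU runs these instances in parallel and in no fixed order.
template <typename Kernel>
void emulate_launch(unsigned gridDim, unsigned blockDim, Kernel kernel) {
    assert(gridDim > 0 && blockDim > 0);
    for (unsigned b = 0; b < gridDim; ++b)
        for (unsigned t = 0; t < blockDim; ++t)
            kernel(LaunchIndex{b, t, blockDim});
}

// Example "kernel": each instance records which block handled its element.
std::vector<unsigned> fill_block_ids(unsigned gridDim, unsigned blockDim) {
    std::vector<unsigned> out(gridDim * blockDim, 0);
    emulate_launch(gridDim, blockDim, [&out](LaunchIndex idx) {
        unsigned global = idx.blockIdx * idx.blockDim + idx.threadIdx;
        out[global] = idx.blockIdx;
    });
    return out;
}
```

Calling fill_block_ids(16, 4) corresponds to the <<<16,4>>> configuration from the question.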
Is there a way I could make a C or C++ program that would run without an operating system and that would draw something like a red pixel to the top left corner? I have always wondered how these types of applications are made. Since Windows is written in C I imagine there is a way to do this.
Thanks
If you're writing for a bare processor, with no library support at all, you'll have to get all the hardware manuals, figure out how to access your video memory, and perform whatever operations that hardware requires to get a pixel drawn onto the display (or a sound on the beeper, or a block of memory read from the disk, or whatever).
When you're using an operating system, you'll rely on device drivers to know all this for you. Programs are still written, every day, for platforms without operating systems, but rarely for a bare processor. Many small MPUs come with a support library, usually a set of routines that lets you manipulate whatever peripheral devices they support.
It can certainly be done. You typically write the code in C, and you pretty much have to do everything on your own, with no standard library. To set your pixel, you'd usually load a pointer to the physical address of the screen, and write the correct value to that pointer. Alternatively, on a PC you could consider using the VESA BIOS. In all honesty, it's fairly similar to the way most code for MS-DOS was written (most used MS-DOS to read and write data on disk, but little else).
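Writing a pixel through a pointer could look like the C++ sketch below. It assumes a linear framebuffer with 32 bits per pixel in a 0x00RRGGBB layout; the real base address, dimensions and pixel format depend on the hardware and video mode (for example, what the VESA BIOS reports). Framebuffer, kRed and put_pixel are hypothetical names, and the assert is only there for hosted testing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Description of a linear framebuffer with 32 bits per pixel.
struct Framebuffer {
    volatile std::uint32_t* base;  // address of the first pixel
    std::size_t width;             // visible pixels per row
    std::size_t height;            // number of rows
    std::size_t pitch;             // pixels per row in memory (>= width)
};

const std::uint32_t kRed = 0x00FF0000;  // assumes 0x00RRGGBB layout

// Writes one pixel. volatile keeps the compiler from optimizing
// away stores to memory-mapped video memory.
void put_pixel(const Framebuffer& fb, std::size_t x, std::size_t y,
               std::uint32_t color) {
    assert(x < fb.width && y < fb.height);  // a freestanding build would drop this
    fb.base[y * fb.pitch + x] = color;
}
```

With a Framebuffer describing the real screen, put_pixel(fb, 0, 0, kRed) would draw the red pixel in the top left corner.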
The core bootloader and the part of the Kernel that bootstraps the OS are written in assembly. See http://en.wikipedia.org/wiki/Booting for a brief writeup of how an operating system boots. There's no way I'm aware of to write a bootloader or Kernel purely in a higher level language such as C or C++ without using assembly.
You need to write a bootstrapper and a loader combination followed by a payload which involves setting the VGA mode manually by interrupt, grabbing a handle to the basic video buffer and then writing a value to the 0th byte.
Start here: http://en.wikipedia.org/wiki/Bootstrapping_(computing)
Without an OS it's difficult to have a loader, which means no dynamic libc. You'd have to link statically, as well as have a decent amount of bootstrap code written in assembly (although it could be provided as object files which you could then link with). Also, since you'd be at the mercy of whatever the system has, you'd be stuck with the VESA video modes (unless you want to write your own graphics driver and subsystem, which you don't).
There is, but not generally from within the OS. Initially, such programs are an asm stub that's executed from the MBR on the drive. See MBR. For x86 processors, this stub is generally 16-bit code; it then jumps into the operating system code, which switches to 32-bit/64-bit mode depending on the operating system and chipset.
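As a small illustration of the MBR layout, the C++ sketch below builds a 512-byte boot sector image in memory. It relies on the classic layout: up to 446 bytes of boot code, the partition table after it, and the signature bytes 0x55 0xAA at offsets 510 and 511. make_boot_sector and has_boot_signature are hypothetical helper names; the stub bytes themselves would come from your assembler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const std::size_t kSectorSize = 512;    // size of the boot sector
const std::size_t kCodeAreaSize = 446;  // room for boot code in a classic MBR

// Copies the stub's machine code into a zeroed sector and appends the
// boot signature (partition table left empty in this sketch).
std::vector<std::uint8_t> make_boot_sector(const std::vector<std::uint8_t>& stub) {
    assert(stub.size() <= kCodeAreaSize);
    std::vector<std::uint8_t> sector(kSectorSize, 0);
    std::copy(stub.begin(), stub.end(), sector.begin());
    sector[510] = 0x55;
    sector[511] = 0xAA;
    return sector;
}

// True if the sector is 512 bytes long and ends with the boot signature.
bool has_boot_signature(const std::vector<std::uint8_t>& sector) {
    return sector.size() == kSectorSize &&
           sector[510] == 0x55 && sector[511] == 0xAA;
}
```

For example, a stub of {0xEB, 0xFE} (an x86 jump-to-self) yields a sector that a BIOS would accept as bootable and that simply hangs.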