Read data in executable on run - c++

G'Day!
I have an executable (Unix or Windows; it should be cross-platform). If one opens this executable in any editor and writes some data to the end, the application still runs perfectly.
On execution, the application with all its data is loaded into RAM. So the user-written part of the file is also loaded into memory.
Is there any chance to read this data?
I need fast access to this data. Other workarounds are not OK, because they take too much time:
Reading directly from the file (on hard disk) or mapping it is not fine, because the application has to read this file on each run, and this application is launched many times per second.
Using shared memory with another process (something like a server that holds the data) is not cross-platform.
Using pipes between the app and the so-called server is not fast enough, imho.
That's why I decided to write some stuff to the end of application.
Thanks in advance!

Are you re-inventing exe packers (see http://en.wikipedia.org/wiki/Executable_compression) or embedded resources? A portable approach was described in "Is there any standard way of embedding resources into Linux executable image?"
I also think you might be optimizing the wrong things.
Reading directly from the file (on hard disk) or mapping it is not fine, because the application has to read this file on each run, and this application is launched many times per second.
The kernel[1] is way smarter than we are and is perfectly capable of caching the mapped stuff. Heck, if you map it READ-ONLY there will be no difference with directly accessing data from your program's base image.
[1]: this goes for both Windows and Unix
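For what it's worth, bytes appended after the end of the image are generally not mapped by the loader, so they still have to be read from the file (which the kernel will normally serve from its cache). A minimal sketch of that idea, assuming a made-up trailer layout: the payload is followed by an 8-byte little-endian size and the 4-byte tag "PAYL". The names `append_payload`, `read_payload` and the trailer format are hypothetical, not any standard.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <optional>
#include <string>

// Hypothetical trailer layout (an assumption, not a standard):
//   [executable image][payload bytes][8-byte little-endian payload size]["PAYL"]
constexpr char kMagic[4] = {'P', 'A', 'Y', 'L'};
constexpr std::streamoff kTrailerSize = 12;

// Appends `payload` followed by the trailer to the file at `path`.
bool append_payload(const std::string& path, const std::string& payload) {
    assert(!path.empty());
    std::ofstream out(path, std::ios::binary | std::ios::app);
    if (!out) return false;
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    unsigned char size_le[8];
    const std::uint64_t n = payload.size();
    for (int i = 0; i < 8; ++i) size_le[i] = static_cast<unsigned char>(n >> (8 * i));
    out.write(reinterpret_cast<const char*>(size_le), 8);
    out.write(kMagic, 4);
    return static_cast<bool>(out);
}

// Reads the payload back, or returns nullopt if no valid trailer is present.
std::optional<std::string> read_payload(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return std::nullopt;
    const std::streamoff file_size = in.tellg();
    if (file_size < kTrailerSize) return std::nullopt;

    // The trailer sits in the last 12 bytes of the file.
    unsigned char trailer[12];
    in.seekg(file_size - kTrailerSize);
    in.read(reinterpret_cast<char*>(trailer), 12);
    if (!in || std::memcmp(trailer + 8, kMagic, 4) != 0) return std::nullopt;

    // Decode the little-endian size and make sure it fits inside the file.
    std::uint64_t n = 0;
    for (int i = 0; i < 8; ++i) n |= static_cast<std::uint64_t>(trailer[i]) << (8 * i);
    if (n > static_cast<std::uint64_t>(file_size - kTrailerSize)) return std::nullopt;

    std::string payload(static_cast<std::size_t>(n), '\0');
    in.seekg(file_size - kTrailerSize - static_cast<std::streamoff>(n));
    in.read(payload.data(), static_cast<std::streamsize>(n));
    if (!in) return std::nullopt;
    return payload;
}
```

At run time the program would pass the path of its own executable to `read_payload`. How to get that path is platform specific (for example /proc/self/exe on Linux or GetModuleFileName on Windows); argv[0] alone is not always reliable.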

Related

Retrieve physical address of file on disk

Using the Windows API, I'm trying to write a program to read data from a disk. I managed to get access to the content of the drive using CreateFile and I'm able to search through it. Let's say there are some files on that disk and I know their paths, but I'm actually interested in their physical location.
My question is:
Is it possible to retrieve the physical location or address of the files (or the sectors they're located in), and where they are stored on the drive, without searching the whole drive? If so, what functions should I use? Using SetFilePointer or FindFirstFile doesn't seem to solve the problem either.
The whole point of any file system is to abstract the physical disk sectors and provide you a higher level abstraction (called files). So the answer to "Is it possible to retrieve the physical location" should be no! (in general); some code might even move the sectors of a file (e.g. a disk defragmenter and you could imagine it is running concurrently with your program, even if that is not recommended..)
For more, read wikipages on file systems and files, then read a good book such as Operating systems: Three Easy Pieces
Notice that by using files, you are expecting that your program behaves similarly after a file system has been moved to a different disk, provided the file paths, contents, and metadata remain the same. In particular, you could have two external USB disk enclosures with different geometries or capacities holding the same file contents (perhaps even in different file systems, e.g. VFAT on one and NTFS on another), and you then expect your program to behave identically when accessing such files (in the first box or the second one). Whichever box is plugged in, your program would (for example) access the same F:\MyDir\MyFile.dat file. As file systems, both boxes would appear identical. At the physical sector level, data would be organized very differently.
BTW, the physical organization of files inside a file system varies greatly from one file system to another. You could use some Ext3 file system on your machine (since there are Ext3 drivers for Windows), which is actually useful for sharing data between Linux & Windows on a dual-boot PC, and the file organization is different from a FAT one or an NTFS one.
You might find some way to query the kernel for the actual physical sector location. But I am not sure it works for all file systems (what would a sector location even mean for a remote NFS one?). And that information could be stale before your program gets it (e.g. if some defragmenter is working in parallel). Also, other processes could access and modify the same file system at the same time (so that metadata, e.g. the sector location, would be obsolete by the time your process is scheduled to run again).
On Windows and on Unix like systems, file system code runs in the kernel. And other processes could use that same code (and the same file system) while your process is not running. Both Windows and Unix have preemptive scheduling, so you have no guarantee that your process runs again in user mode before some other process is using the same file system.
Remember that in practice, your file data often stays in the page cache. And that is why you might not hear your disk working -if you still have a rotating hard disk- when accessing the same file several times in a row (e.g. running the same program on the same file twice, a few seconds apart; usually the second run is keeping the disk silent, because the file data is already in RAM).
In a comment you mention that you want
To watch the data of the file and for example see what happens to the data when it gets deleted or modified.
but that should work at the file system level. Linux has inotify(7) facilities for that (they work on most local file systems, e.g. Ext4 or BTRFS, but not on remote file systems à la nfs(5), and neither on pseudo file systems à la proc(5)). I don't know if Windows has something similar to Linux inotify (but probably yes, at least in some cases).
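As a rough illustration of the inotify approach (Linux only), here is a minimal sketch that watches one directory and reports the names of entries that were created, modified, or deleted. The class name `DirWatcher` and its `pending()` method are made up for this example; only `inotify_init1`, `inotify_add_watch`, `read`, and `close` are actual system calls.

```cpp
#include <sys/inotify.h>
#include <unistd.h>

#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Linux-only sketch: watches a single directory (not recursive).
class DirWatcher {
public:
    explicit DirWatcher(const std::string& dir)
        : fd_(inotify_init1(IN_NONBLOCK)) {
        if (fd_ < 0) throw std::runtime_error("inotify_init1 failed");
        if (inotify_add_watch(fd_, dir.c_str(),
                              IN_CREATE | IN_MODIFY | IN_DELETE) < 0) {
            close(fd_);
            throw std::runtime_error("inotify_add_watch failed");
        }
    }
    ~DirWatcher() { close(fd_); }
    DirWatcher(const DirWatcher&) = delete;
    DirWatcher& operator=(const DirWatcher&) = delete;

    // Returns the names from all queued events without blocking
    // (empty if nothing has happened since the last call).
    std::vector<std::string> pending() {
        assert(fd_ >= 0);
        std::vector<std::string> names;
        alignas(inotify_event) char buf[4096];
        for (;;) {
            const ssize_t len = read(fd_, buf, sizeof buf);
            if (len <= 0) break;  // nothing (more) queued, or an error
            // One read may return several variable-length events.
            for (ssize_t off = 0; off < len;) {
                const auto* ev = reinterpret_cast<const inotify_event*>(buf + off);
                if (ev->len > 0) names.emplace_back(ev->name);
                off += static_cast<ssize_t>(sizeof(inotify_event) + ev->len);
            }
        }
        return names;
    }

private:
    int fd_;
};
```

A program would typically create one `DirWatcher` per directory of interest and call `pending()` from its main loop, or wait on the descriptor with poll(2) instead of polling.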
You probably should consider using some database (maybe as simple as SQLite), and perhaps you want ACID properties (then use some real RDBMS like PostgreSQL). With PostgreSQL you might use TRIGGERs to be aware that some data changed, even if some other program changes the same database.
You could also do some file locking, and adopt the convention that every program accessing your particular file should lock it appropriately.

DBF file parallel reading

A FoxPro application reads, writes, and updates records in a DBF file. In parallel, I read that same DBF in a C++ application. Will there be any issues if I keep my C++ application reading the DBF file for a long time?
Yes, the DBF format is multi-user -- nearly every real-world application that uses them is multi-user; we have apps used by hundreds of users for example.
There may be a problem in that your C++ application does not respect the locking mechanism that FoxPro would use, but that's not the same thing. If you use the Microsoft Visual FoxPro OLEDB driver properly on the C++ side you won't have an issue, but yes, as with anything like this, open and close the DBF as quickly as you can.
VFP tables are file based and use a shared lock while being updated. If you read the file directly (low level) and only read, there wouldn't be a problem. That said, it is data, and the best-optimised readers are the OLEDB\ODBC drivers (ODBC drivers exist up to version 6; for later versions, Sybase Advantage Server has a driver: local is free, remote is paid AFAIK, but I don't use it).
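To illustrate "open and close the DBF as quickly as you can" for a low-level reader, here is a minimal sketch that opens the file, reads only the fixed part of the header, and closes it again at once. The struct and function names are made up for this example. It takes no FoxPro-compatible locks, so the values it returns may be momentarily out of date while the other application is writing.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <optional>
#include <string>

// The fields this sketch takes from the first 32 bytes of a DBF file.
struct DbfHeader {
    std::uint8_t  version;       // byte 0: file type / version flag
    std::uint32_t record_count;  // bytes 4-7, little-endian
    std::uint16_t header_size;   // bytes 8-9, little-endian
    std::uint16_t record_size;   // bytes 10-11, little-endian
};

// Opens the file, reads the header, and closes the file before decoding.
std::optional<DbfHeader> read_dbf_header(const std::string& path) {
    assert(!path.empty());
    unsigned char raw[32];
    {
        std::ifstream in(path, std::ios::binary);
        if (!in.read(reinterpret_cast<char*>(raw), sizeof raw)) return std::nullopt;
    }  // the stream is closed here, before any further work

    const std::uint32_t count = static_cast<std::uint32_t>(raw[4]) |
                                (static_cast<std::uint32_t>(raw[5]) << 8) |
                                (static_cast<std::uint32_t>(raw[6]) << 16) |
                                (static_cast<std::uint32_t>(raw[7]) << 24);
    const auto header_size = static_cast<std::uint16_t>(raw[8] | (raw[9] << 8));
    const auto record_size = static_cast<std::uint16_t>(raw[10] | (raw[11] << 8));
    return DbfHeader{raw[0], count, header_size, record_size};
}
```

Reading actual records would follow the same pattern: open, seek to `header_size + index * record_size`, read `record_size` bytes, close. Going through the OLEDB driver remains the safer route where it is available.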
I have been using DBF tables from VFP (yes from VFP) and C# via VFPOLEDB for a long time and I can say I had no issues. Actually, the driver works better from C# vs from within VFP, I don't know why.
Also note that, when you are using VFPOLEDB driver, you are using ANSI mode by default (and shared for read\write unless you change the mode).
If you keep the DBF file open for a long time, you prevent data structure updates to the table. If the VFP application is updated and needs new columns in the DBF file that you are reading, the update would fail until your program stops.
To fix this you or your client would need to change the update process to include your application.
Sometimes VFP applications contain a mechanism that terminates the application. Usually this is some sort of timer that checks a specific field or some file, and then terminates the application. This is often used because users keep applications open when they leave work. The database is therefore still locked, impacting updates and sometimes consistent backups.
You can fix this by implementing the same mechanism in your application.

State of system registry if the computer suddenly loses power while being in sleep mode?

I'm testing my Windows local service in a situation where the system was previously suspended (put into Sleep mode) and then suddenly loses power (without resuming from suspension).
For simplicity, at the moment, I'm testing it using a virtual machine (VMware Workstation 10) and its "Reset" option, which is supposed to simulate this.
And what I am observing is somewhat strange. For instance, when I log the data, that was supposed to be saved into system registry before the system is suspended, and then check that same registry key after the system is booted back up (after the power reset) the data in the system registry seems to be missing. Or the value of the key is just not there. While my log (which is just a text file) has everything saved correctly.
So I'm curious whether it's something specific to the Windows System Registry, or some bug in the VMware software?
PS. The OS that I'm testing it on is Windows 7.
It's not specific to the Registry. You have to understand that "the Registry" is actually an abstraction. It's a shared database, backed by multiple files with non-trivial structure. The shared abstraction lives in RAM and doesn't mirror the disk structure directly.
On the other hand, your file is almost certainly not shared. File access, the file cache and virtual memory are pretty well integrated. Your write initially ends in the file cache (RAM). When you suspend your PC, Windows isn't going to copy the file cache to the hibernation file. That's a bit pointless - the dirty file cache contents can be written out to disk, and the clean part can be discarded outright.

How can I run a code directly into a processor with a File System?

I have simple anisotropic filter C/C++ code that processes a .pgm image, which is a text file with greyscale information for each pixel; when processing is done, it generates an output image with the filter applied.
This program takes up to a few seconds to do about 10 iterations on an x86 CPU running Windows.
My colleague, an academic finishing his master's degree in applied computing, and I need to run the code on an FPGA (Altera DE2-115) to see whether there is a considerable performance gain when running the code directly on the processor (Nios II).
We have successfully booted the uClinux OS on the FPGA, but there are some errors with the device hardware, so we can't access the SD card or even Ethernet, and therefore we can't get the code and image onto the FPGA to test its performance.
So I am asking for an alternative way to test our code's performance directly on a CPU with a file system, so the code can read the image and generate another one.
The alternative can be either a product that is low cost and easy to use (I was thinking of a Raspberry Pi), or a service where I could upload the code, have it run automatically, and get the reports.
Thanks in advance.
What you're trying to do is benchmark some software on a multi-GHz x86 processor vs. a soft-core processor running at 50 MHz? (as far as I can tell from the Altera docs)
I can guarantee that it will be even slower on the FPGA! Since it is also running an OS (even embedded Linux) it also has threading overhead and what not. This can not be considered running it "directly" on CPU (whatever you mean by this)
If you really want to leverage the performance of an FPGA you should "convert" your C code into an HDL and run it directly in hardware. Accessing the data should be possible. I don't know how it's done with an Altera board, but Xilinx has some libraries for accessing data from an SD card with FAT.
You can use the on-board SRAM or DDR2 RAM to run the OS and your application.
The hardware design in your FPGA must include a memory controller. In SOPC Builder or Qsys, select the external memory as the reset vector and compile the design.
Then open the Nios II build tools for Eclipse.
In Eclipse, create a new project by selecting Nios II Application and BSP project.
Once the project is created, go to the BSP properties, type the offset of the external memory in the linker tab, and generate the BSP.
Compile the project and run it as Nios II Hardware.
This will run your application from external memory.
You won't be able to see the image, but the 2-D array representing the image in memory can be printed on the console.
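A minimal sketch of that last step, assuming the image is held as a row-major array of 8-bit grey values (the function names are made up for this example). It writes the array in the same plain-text PGM ("P2") layout the question describes, so console output captured on the host can be saved as a .pgm file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Writes a width x height greyscale buffer (row-major, 0..255) as plain PGM.
void print_pgm(std::ostream& out, const std::uint8_t* pixels,
               std::size_t width, std::size_t height) {
    assert(pixels != nullptr || width * height == 0);
    out << "P2\n" << width << ' ' << height << "\n255\n";
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            if (x > 0) out << ' ';
            // Cast so the value is printed as a number, not as a character.
            out << static_cast<unsigned>(pixels[y * width + x]);
        }
        out << '\n';
    }
}

// Convenience wrapper that returns the same text as a string.
std::string pgm_to_string(const std::uint8_t* pixels,
                          std::size_t width, std::size_t height) {
    std::ostringstream s;
    print_pgm(s, pixels, width, height);
    return s.str();
}
```

On the board one would call `print_pgm(std::cout, image, width, height)` with the input image compiled into the program as a constant array. Some PGM readers expect text lines of at most 70 characters, so for wide images it may be safer to print one value per line.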

Reading and interpreting memory page file in C++

I need to analyse some malware that I have on a VMware image (VMware is a virtual machine); in particular, I need to do a full dump of a certain process. I know that VMware, on pausing, writes the whole RAM into a .vmem file. The platform the image was taken from is Windows XP. I know that there are certain tools that do this, but they are mostly closed source or don't work for Windows XP. I need it to be done in reasonable time (under one second, if that is possible somehow) and to run it from my own C++ program. Any help would be really appreciated.
You seem to be asking to interact with processes and their memory from a suspended VM.
Give some forensic tools a shot. This one looks promising:
http://code.google.com/p/volatility/