In windows is it possible to know what kind of disk we are dealing with from a c/c++ program? forget about gpt or mbr, how to know whether it is basic or dynamic? Program input can be drive letter or any info related to disk, output should be dynamic or basic.
No need of a direct way of doing, even if it is lengthy process, its okay.
I couldn't find much in msdn. Please help me out.
There is a way in windows, but it's not straight forward.
There is no direct API to determine if a disk is Basic or Dynamic, however all dynamic disks will have LDM Information.
So if a drive has a partion with LDM information on it, then it's going to be a dynamic disk.
the DeviceIoControl() method with the IOCTL_DISK_GET_DRIVE_LAYOUT_EX control code can be used to get this information.
Here is a post with a sample console application to do what you're asking for.
As per from MSDN http://msdn.microsoft.com/en-us/library/aa363785(VS.85).aspx
Detecting the Type of Disk
There is no specific function to programmatically detect the type of disk a particular file or directory is located on. There is an indirect method.
First, call GetVolumePathName. Then, call CreateFile to open the volume using the path. Next, use IOCTL_VOLUME_GET_VOLUME_DISK_EXTENTS with the volume handle to obtain the disk number and use the disk number to construct the disk path, such as "\?\PhysicalDriveX". Finally, use IOCTL_DISK_GET_DRIVE_LAYOUT_EX to obtain the partition list, and check the PartitionType for each entry in the partition list.
check out for GetDriveType() .
Related
Goal
I'm porting a filesystem to Windows, and am writing a more Windows-like interface for the mounter executable. Part of this process is letting the user locate a partition and pick a drive letter. Ultimately the choice of partition has to result in something I can open using CreateFile(), open(), fopen() or similar.
Leads
Windows seems to revolve around the concept of volumes, which don't seem quite analogous to disks, and only occur for already mounted filesystems.
Promising leads I've had include:
IOCTL_DISK_GET_DRIVE_LAYOUT_EX
Physical Disks and Volumes
Displaying Volume Paths
However these all end in volumes or offsets thereof, not the /dev/sda1 partition-specific-style handle I'm after.
This question is after a very similar thing, I considered a bounty until I observed the OP is after physical disk names, not partitions. This answer contains a method to brute force partition names, I'd like to avoid that (or see documentation containing bounds for the possible paths).
Question
I'd like:
Correct terminology and documentation for unmounted partitions in Windows.
An effective and documented method to reliably retrieve all available partitions.
The closest fit to the partition file abstraction as available in Linux, wherein all IO is bound to the appropriate area of the disk for the partition opened.
Update0
While the main goal is still opening raw partitions, it appears the solution may involve first acquiring a handle to each disk drive, and then using that in turn to acquire each partition. How to enumerate all the disk drives (even those without mounted volumes on them already) is required.
As you noted, you can use IOCTL_DISK_GET_DRIVE_LAYOUT_EX to get a list of partitions.
There's a good overview of the related concepts here. I wonder if the missing link for you is
Detecting the Type of Disk
There is no specific function to
programmatically detect the type of
disk a particular file or directory is
located on. There is an indirect
method.
First, call GetVolumePathName. Then,
call CreateFile to open the volume
using the path. Next, use
IOCTL_VOLUME_GET_VOLUME_DISK_EXTENTS
with the volume handle to obtain the
disk number and use the disk number to
construct the disk path, such as
"\?\PhysicalDriveX". Finally, use
IOCTL_DISK_GET_DRIVE_LAYOUT_EX to
obtain the partition list, and check
the PartitionType for each entry in
the partition list.
The full list of disk management control codes may have more that would be useful. To be honest I'm not sure how the Unix partition name maps onto Windows, maybe it just doesn't directly.
If you can imagine moving from safe haven of userspace and the Windows API (win32) to coding a device driver with NTTDK, you could try IoReadPartitionTableEx or some other low level disk function.
To be blunt, the best way to reliably get all mounted/unmounted disk partitions is to parse the mbr/gpt yourself.
First to clear a few things up: Disks contain partitions and partitions combine to create volumes. Therefore, you can have one volume which consists of two partitions from two different disks.
IOCTL_DISK_GET_DRIVE_LAYOUT_EX is the closest solution you're going to get without doing it manually. The problem with this is that it relies on windows which can incorrectly parse the MBR for god knows what reason. My current working theory is that if Windows was installed via EFI but is being booted via MBR, youll see this sort of issue. Windows manages to get away with this because most partition managers copy the important partition information to the MBR alongside the GPT. But this means that you wont get important information like the partition UUID (which is only stored in the GPT).
All of the other solutions involve getting the Volume information which is completely different from the partition information.
Side Note: a Volume id will usually be of the form \\.\Volume{PARTITION_UUID}. Cases where this would not hold: if the drive is partitioned with MBR and not GPT (MBR does not have a partition UUID, therefore windows makes one up), if you have a raid drive, or if you have a volume consisting of partitions from multiple disks (kinda the same thing as raid). Those are just the cases that come to my mind, dont hold me to them.
I think you're slightly mistaken in an earlier phase. For instance, you seem to assume that "mounting" works in Windows like it works in Unix. It's a bit different.
Let's start at the most familiar end. Paths like C:\ use drive letters. Those are essentially just a set of symbolic links nowadays (On Windows, they're more formally known as "junctions"). There's a base set for all users, and each user can add their own. Even if there is no drive letter for a volume, there will still be a volume name like \\?\Volume{4c1b02c1-d990-11dc-99ae-806e6f6e6963}\. You can use this volume name in calls to CreateFile() etc. I'm not sure if fopen() likes them, though.
The function QueryDosDevice will get you the Windows device name for a drive letter or a volume name. A device name looks like "\Device\HarddiskVolume1", but you can't pass it to CreateFile
Microsoft has example code to enumerate all partitions.
On Windows, like on Linux, you can open the partition itself as if it were a file. This is quite well documented under CreateFile.
I want to verify if the text log files created by my program being run at my customer's site have been tampered with. How do you suggest I go about doing this? I searched a bunch here and google but couldn't find my answer. Thanks!
Edit: After reading all the suggestions so far here are my thoughts. I want to keep it simple, and since the customer isn't that computer savy, I think it is safe to embed the salt in the binary. I'll continue to search for a simple solution using the keywords "salt checksum hash" etc and post back here once I find one.
Obligatory preamble: How much is at stake here? You must assume that tampering will be possible, but that you can make it very difficult if you spend enough time and money. So: how much is it worth to you?
That said:
Since it's your code writing the file, you can write it out encrypted. If you need it to be human readable, you can keep a second encrypted copy, or a second file containing only a hash, or write a hash value for every entry. (The hash must contain a "secret" key, of course.) If this is too risky, consider transmitting hashes or checksums or the log itself to other servers. And so forth.
This is a quite difficult thing to do, unless you can somehow protect the keypair used to sign the data. Signing the data requires a private key, and if that key is on a machine, a person can simply alter the data or create new data, and use that private key to sign the data. You can keep the private key on a "secure" machine, but then how do you guarantee that the data hadn't been tampered with before it left the original machine?
Of course, if you are protecting only data in motion, things get a lot easier.
Signing data is easy, if you can protect the private key.
Once you've worked out the higher-level theory that ensures security, take a look at GPGME to do the signing.
You may put a checksum as a prefix to each of your file lines, using an algorithm like adler-32 or something.
If you do not want to put binary code in your log files, use an encode64 method to convert the checksum to non binary data. So, you may discard only the lines that have been tampered.
It really depends on what you are trying to achieve, what is at stakes and what are the constraints.
Fundamentally: what you are asking for is just plain impossible (in isolation).
Now, it's a matter of complicating the life of the persons trying to modify the file so that it'll cost them more to modify it than what they could earn by doing the modification. Of course it means that hackers motivated by the sole goal of cracking in your measures of protection will not be deterred that much...
Assuming it should work on a standalone computer (no network), it is, as I said, impossible. Whatever the process you use, whatever the key / algorithm, this is ultimately embedded in the binary, which is exposed to the scrutiny of the would-be hacker. It's possible to deassemble it, it's possible to examine it with hex-readers, it's possible to probe it with different inputs, plug in a debugger etc... Your only option is thus to make debugging / examination a pain by breaking down the logic, using debug detection to change the paths, and if you are very good using self-modifying code. It does not mean it'll become impossible to tamper with the process, it barely means it should become difficult enough that any attacker will abandon.
If you have a network at your disposal, you can store a hash on a distant (under your control) drive, and then compare the hash. 2 difficulties here:
Storing (how to ensure it is your binary ?)
Retrieving (how to ensure you are talking to the right server ?)
And of course, in both cases, beware of the man in the middle syndroms...
One last bit of advice: if you need security, you'll need to consult a real expert, don't rely on some strange guys (like myself) talking on a forum. We're amateurs.
It's your file and your program which is allowed to modify it. When this being the case, there is one simple solution. (If you can afford to put your log file into a seperate folder)
Note:
You can have all your log files placed into a seperate folder. For eg, in my appplication, we have lot of DLLs, each having it's own log files and ofcourse application has its own.
So have a seperate process running in the background and monitors the folder for any changes notifications like
change in file size
attempt to rename the file or folder
delete the file
etc...
Based on this notification, you can certify whether the file is changed or not!
(As you and others may be guessing, even your process & dlls will change these files that can also lead to a notification. You need to synchronize this action smartly. That's it)
Window API to monitor folder in given below:
HANDLE FindFirstChangeNotification(
LPCTSTR lpPathName,
BOOL bWatchSubtree,
DWORD dwNotifyFilter
);
lpPathName:
Path to the log directory.
bWatchSubtree:
Watch subfolder or not (0 or 1)
dwNotifyFilter:
Filter conditions that satisfy a change notification wait. This parameter can be one or more of the following values.
FILE_NOTIFY_CHANGE_FILE_NAME
FILE_NOTIFY_CHANGE_DIR_NAME
FILE_NOTIFY_CHANGE_SIZE
FILE_NOTIFY_CHANGE_SECURITY
etc...
(Check MSDN)
How to make it work?
Suspect A: Our process
Suspect X: Other process or user
Inspector: The process that we created to monitor the folder.
Inpector sees a change in the folder. Queries with Suspect A whether he did any change to it.
if so,
change is taken as VALID.
if not
clear indication that change is done by *Suspect X*. So NOT VALID!
File is certified to be TAMPERED.
Other than that, below are some of the techniques that may (or may not :)) help you!
Store the time stamp whenever an application close the file along with file-size.
The next time you open the file, check for the last modified time of the time and its size. If both are same, then it means file remains not tampered.
Change the file privilege to read-only after you write logs into it. In some program or someone want to tamper it, they attempt to change the read-only property. This action changes the date/time modified for a file.
Write to your log file only encrypted data. If someone tampers it, when we decrypt the data, we may find some text not decrypted properly.
Using compress and un-compress mechanism (compress may help you to protect the file using a password)
Each way may have its own pros and cons. Strength the logic based on your need. You can even try the combination of the techniques proposed.
I'm trying to speedup directory enumeration in C++, where I'm recursing into subdirectories. I currently have an app which spends 95% of it's time in FindFirst/FindNextFile APIs, and it takes several minutes to enumerate all the files on a given volume. I know it's possible to do this faster because there is an app that does: Everything. It enumerates my entire drive in seconds.
How might I accomplish something like this?
I realize this is an old post, but there is a project on source forge that does exactly what you are asking and the source code is available.
You can find the project here: NTFS-Search
"Everything" builds an index in the background, so queries are against the index not the file system itself.
There are a few improvements to be made - at least over the straight-forward algorrithm:
First, breadth search over depth search. That is, enumerate and process all files in a single folder before recursing into the sub folders you found. This improves locality - usually a lot.
On Windows 7 / W2K8R2, you can use FindFirstFileEx with FindExInfoBasic, the main speedup being omitting the short file name on NTFS file systems where this is enabled.
Separate threads help if you enumerate different physical disks (not just drives). For the same disk it only helps if it's an SSD ("zero seek time"), or you spend significant time processing a file name (compared to the time spent on disk access).
[edit] Wikipedia actually has some comments -
Basically, they are skipping the file system abstraction layer, and access NTFS directly. This way, they can batch calls and skip expensive services of the file system - such as checking ACL's.
A good starting point would be the NTFS Technical Reference on MSDN.
"Everything" accesses directory information at a lower level than the Win32 FindFirst/FindNext APIs.
I believe it reads and interprets the NTFS MFT structures directly, and that this is one of the main reasons for its performance. It's also why it requires admin privileges and why "Everything" only indexes local or removable NTFS volumes (not network drives, for example).
A couple other utilities that do the similar things are:
FindOnClick by 2Brightsparks
Search GT
A little reverse engineering with a debugger on these tools might give you some insight on the techniques they use.
Don't recurse immediately, save a list of directories you find and dive into them when finished. You want to do linear access to each directory, to take advantage of locality of reference and any caching the OS is doing.
If you're already doing the best you can to get the maximum speed from the API, the next step is to do low-level disk accesses and bypass Windows altogether. You might get some guidance from the NTFS drivers for Linux, or perhaps you can use one directly.
If you are doing this on NTFS, here's a lib for low level access: NTFSLib.
You can enumerate through all file records in $MFT, each representing a real file on disk. You can get all file attributes from the record, including $DATA.
This may be the fastest way to enumerate all files/directories on NTFS volumes, 200k~300k files per minute as I tested.
I'm looking for a way to monitor which processes are using (or attempting to access) a file over a duration of time. What are some good Windows APIs or tools to achieve this?
You could replace the file by a reparse point. The reparse point invokes a custom file system filter, which can redirect the access to anothe file. This is for instance how NTFS junctions work. If you let your file system filter handle reparse points in the same way, you can intercept all attempts by all processes to open the underlying file. It's a rather heavy-handed approach though, as it involves modifying the file system itself.
You can use the FileSystemWatcher.
Here's a nice tut.
http://www.geekpedia.com/tutorial173_File-monitoring-using-FileSystemWatcher.html
FileSystemWatcher is not suitable for determining the process.
There already was a different question. look here, this solution fits your needs.
I want to create a simple program which is working very similar to RAID1. It should work like this:
First i want to give the primary HDD-s drive letter and than the secondary one. I will only write to the primary HDD! If any new data is copied to the primary HDD it should automatically copy it to the secondary one.
I need some help where should i start all this? How to monitor the written data in the primary HDD? Obviously there are many ways to do what i want (i think), but i need the simpliest way.
If this isn't so complicated, than how can i handle that case if the primary HDD has two or more partition, because then i should check the secondary HDD's partition too, and then create/resize them if necessary?
Thanks in advance!
kampi
The concept of mirroring disk writes to another disk in real time is the basis for high availability, and implementing these schemes are not trivial.
The company I work for makes DoubleTake, which does real time mirroring & replication of file based IO to local or remote volumes. This is a little different than what you are describing, which appears to be block based disk/volume replication, but many of the concepts are similar.
For file based replication, there are a quite a few nasty scenarios, i'll describe a few:
Synchronizing the contents of one volume to another volume, keeping in mind that changes can occur while you are doing this. I suppose you could simply this by requiring that volumes start out totally formatted. But for people that have data that will not be a good solution!
keeping up with disk changes: What if the volume you are mirroring to is slower than the source volume? Where do you buffer? To Disk? Memory?
Anyways we use a kernel mode file system filter driver to capture the disk IO, and then our user mode service grabs this IO and forwards it to a local or remote disk.
If you want to learn about file system filtering, one of the best books (its old but good) is File System Internals, by Rajeev Nagar. Its a must read for doing any serious work with file system filters.
Also take a look at the file system filter samples on the Windows 7 WDK, its free, and they have good file mon examples that will get you seeing disk changes pretty quickly.
Good Luck!