I want to obtain information about the physical hard disks on a Linux system. The required information includes each physical drive's name (caption), device path (such as /dev/sda), read/write rate and read/write throughput, total capacity, and remaining available capacity. I ultimately need to get this information from a C++ program, but a command-line method is also acceptable.
I have tried lshw, smartctl, hdparm, lsblk, fdisk, etc., but some of these commands are not available on every Linux distribution, and none of them gives me all the information I want. I also tried reading /proc/diskstats, /sys/block, etc., but that didn't solve the problem either. The confusion between logical disks and physical disks also makes processing difficult. It is important to emphasize that the information I need is per physical disk.
The fdisk -l command will show you the details.
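For the read/write throughput part, which fdisk does not report, one possible approach is to sample /proc/diskstats twice and take the difference. The following is only a sketch under stated assumptions: the names DiskCounters, parse_diskstats_line and read_bytes_per_second are made up for illustration, and only the first ten fields of each line are used. Whole disks appear as top-level entries under /sys/block, which can help to tell them apart from partitions.

```cpp
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// Cumulative counters for one device, taken from one line of /proc/diskstats.
// (Hypothetical struct, for illustration only.)
struct DiskCounters {
    std::string name;                   // e.g. "sda"
    std::uint64_t reads = 0;            // reads completed
    std::uint64_t sectors_read = 0;     // in 512-byte units
    std::uint64_t writes = 0;           // writes completed
    std::uint64_t sectors_written = 0;  // in 512-byte units
};

// Parse one line of /proc/diskstats; returns nullopt if the line is malformed.
std::optional<DiskCounters> parse_diskstats_line(const std::string& line) {
    std::istringstream in(line);
    unsigned major = 0, minor = 0;
    std::uint64_t reads_merged = 0, ms_reading = 0, writes_merged = 0;
    DiskCounters c;
    if (!(in >> major >> minor >> c.name >> c.reads >> reads_merged
             >> c.sectors_read >> ms_reading >> c.writes >> writes_merged
             >> c.sectors_written)) {
        return std::nullopt;
    }
    return c;
}

// Average read throughput in bytes per second between two samples taken
// `seconds` apart. The sector counters always use 512-byte units,
// whatever the real sector size of the device is.
double read_bytes_per_second(const DiskCounters& before,
                             const DiskCounters& after, double seconds) {
    return (after.sectors_read - before.sectors_read) * 512.0 / seconds;
}
```

A caller would read the file line by line, keep the entries whose name matches a whole disk, wait for an interval, and repeat; write throughput follows the same pattern with sectors_written.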
Related
Using the Windows API, I'm trying to write a program to read data from a disk. I managed to get access to the content of the drive using CreateFile and I'm able to search through it. Let's say there are some files on that disk and I know their paths, but I'm actually interested in their physical location.
My question is:
Is it possible to retrieve the physical location or address of the files (or the sectors they're located in), i.e. where they are stored on the drive, without searching the whole drive? If so, what functions should I use? Using SetFilePointer or FindFirstFile doesn't seem to solve the problem either.
The whole point of any file system is to abstract away the physical disk sectors and give you a higher-level abstraction (called files). So the answer to "Is it possible to retrieve the physical location" should, in general, be no. Some code might even move the sectors of a file (e.g. a disk defragmenter, which you could imagine running concurrently with your program, even if that is not recommended).
For more, read the wiki pages on file systems and files, then read a good book such as Operating Systems: Three Easy Pieces.
Notice that by using files, you expect your program to behave the same way after a file system has been moved to a different disk, provided the file paths, contents, and metadata remain the same. In particular, you could have two external USB disk enclosures with different geometries or capacities holding the same file contents (perhaps even in different file systems, e.g. VFAT on one and NTFS on the other), and you then expect your program to behave identically when accessing such files (in the first box or the second one). Whichever box is plugged in, your program would (for example) access the same F:\MyDir\MyFile.dat file. As file systems, both boxes would appear identical. At the physical sector level, data would be organized very differently.
BTW, the physical organization of files inside a file system varies greatly from one file system to another. You could use an Ext3 file system on your machine (since there are Ext3 drivers for Windows), which is actually useful for sharing data between Linux and Windows on a dual-boot PC, and its file organization is different from that of FAT or NTFS.
There might be some way to query the kernel for the actual physical sector location, but I am not sure it works for all file systems (what would a sector location mean for a remote NFS one?). That information could also be stale before your program gets it (e.g. if some defragmenter is working in parallel). Other processes could also access and modify the same file system at the same time, so metadata such as the sector location could be obsolete by the time your process is scheduled to run again.
On Windows and on Unix-like systems, file system code runs in the kernel. And other processes could use that same code (and the same file system) while your process is not running. Both Windows and Unix have preemptive scheduling, so you have no guarantee that your process runs again in user mode before some other process uses the same file system.
Remember that in practice, your file data often stays in the page cache. That is why you might not hear your disk working (if you still have a rotating hard disk) when accessing the same file several times in a row (e.g. running the same program on the same file twice, a few seconds apart; usually the second run keeps the disk silent, because the file data is already in RAM).
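As an illustration of such a kernel query, Linux offers the FS_IOC_FIEMAP ioctl, and Windows has a comparable control code, FSCTL_GET_RETRIEVAL_POINTERS, used through DeviceIoControl. The Linux sketch below is only that, a sketch: the Extent struct, the query_extents name and the cap of 32 extents are assumptions made for the example, and the call simply fails on file systems that do not support the query. Everything said above about stale results still applies to whatever it returns.

```cpp
#include <fcntl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// One contiguous run of a file (hypothetical struct, for illustration only).
struct Extent {
    std::uint64_t logical;   // byte offset within the file
    std::uint64_t physical;  // byte offset on the underlying device
    std::uint64_t length;    // length of the run in bytes
};

// Ask the kernel where the file's data currently lives.
// Returns nullopt if the file cannot be opened or the query is unsupported.
std::optional<std::vector<Extent>> query_extents(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return std::nullopt;

    constexpr unsigned kMaxExtents = 32;  // arbitrary cap for this sketch
    std::vector<char> buf(sizeof(struct fiemap) +
                          kMaxExtents * sizeof(struct fiemap_extent), 0);
    auto* fm = reinterpret_cast<struct fiemap*>(buf.data());
    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;   // map the whole file
    fm->fm_flags = FIEMAP_FLAG_SYNC;     // flush dirty data before mapping
    fm->fm_extent_count = kMaxExtents;

    int rc = ioctl(fd, FS_IOC_FIEMAP, fm);
    close(fd);
    if (rc < 0) return std::nullopt;

    std::vector<Extent> out;
    for (unsigned i = 0; i < fm->fm_mapped_extents; ++i) {
        const struct fiemap_extent& e = fm->fm_extents[i];
        out.push_back({e.fe_logical, e.fe_physical, e.fe_length});
    }
    return out;
}
```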
In a comment you mention that you want
To watch the data of the file and for example see what happens to the data when it gets deleted or modified.
but that should work at the file system level. Linux has inotify(7) facilities for that (they work on most local file systems, e.g. Ext4 or BTRFS, but not on remote file systems à la nfs(5), nor on pseudo file systems à la proc(5)). I don't know if Windows has something similar to Linux inotify (but probably yes, at least in some cases).
You probably should consider using some database (maybe as simple as SQLite), and perhaps you want ACID properties (then use some real RDBMS like PostgreSQL). With PostgreSQL you might use TRIGGERs to be aware that some data changed, even if some other program changes the same database.
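A minimal sketch of the inotify approach on Linux might look as follows. The function names watch_directory, read_events and describe_mask are invented for this example, and only three event types are requested; a real watcher would loop on read_events and probably ask for more.

```cpp
#include <sys/inotify.h>
#include <unistd.h>

#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Start watching `dir` for created, modified and deleted entries.
// Returns the inotify file descriptor, or -1 on failure.
int watch_directory(const char* dir) {
    int fd = inotify_init1(IN_CLOEXEC);
    if (fd < 0) return -1;
    if (inotify_add_watch(fd, dir, IN_CREATE | IN_MODIFY | IN_DELETE) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Block until at least one event arrives, then return (file name, mask) pairs.
std::vector<std::pair<std::string, std::uint32_t>> read_events(int fd) {
    alignas(inotify_event) char buf[4096];
    std::vector<std::pair<std::string, std::uint32_t>> events;
    ssize_t n = read(fd, buf, sizeof buf);
    for (char* p = buf; n > 0 && p < buf + n;) {
        auto* ev = reinterpret_cast<inotify_event*>(p);
        events.emplace_back(ev->len ? ev->name : "", ev->mask);
        p += sizeof(inotify_event) + ev->len;  // records are variable-length
    }
    return events;
}

// Human-readable summary of the event bits requested above.
std::string describe_mask(std::uint32_t mask) {
    std::string s;
    if (mask & IN_CREATE) s += "created ";
    if (mask & IN_MODIFY) s += "modified ";
    if (mask & IN_DELETE) s += "deleted ";
    return s;
}
```

Note that this reports which file changed, not what happened to its sectors, which is consistent with the point that such monitoring belongs at the file system level.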
You could also do some file locking, and adopt the convention that every program accessing your particular file should lock it appropriately.
I have a large storage device (flash memory) plugged into my computer via the PCIe bus. I want to access this device directly, i.e., without any file system (e.g., NTFS or ext4) on it.
How can I do this using C/C++? (on both Windows 7 and Linux)
I am wondering if I can 1) open the device just like a file, and then read and write binary data to it, or 2) allocate the whole device using some function like malloc, so that each byte on the device has an address and I can access the bytes by address.
I prefer the second way if it is possible, but I don't know if the OS supports this, since it seems the address space would need to be shared with main memory.
According to Microsoft documentation:
On Windows you can open a physical drive using CreateFile using a path of the form
\\.\PhysicalDriveN
where N is the device number or a logical drive using a path of the form
\\.\X:
You will need to seek, read and write in multiples of the sector size which can be retrieved using DeviceIoControl() with IOCTL_DISK_GET_DRIVE_GEOMETRY.
On Linux each storage device ends up getting a device entry in /dev. The first storage device is typically /dev/sda, the second storage device, if one is present, is /dev/sdb. Note that an optical disk is a storage device, so a CD-ROM or a DVD-ROM drive, if one is present, would get a device node entry.
Some Linux distributions may use a different naming convention, but this is the usual one. So you'll need to figure out which device corresponds to your flash disk, open the /dev/sdX device, and simply read from and write to it. It is simplest to read and write in whole multiples of the block (sector) size (this is mandatory if you open the device with O_DIRECT), and seeking within the opened file determines which disk blocks/sectors the subsequent read or write will affect.
Generally, /dev/sdX will be owned by root, but there are usually some Linux distribution-specific ways to fiddle the userid that owns a particular device node.
I'm trying to write software that backs up my entire hard drive.
I've managed to write code that reads the raw data from hard disk sectors. However, I want to have incremental backups. For that I need to know about the changes made to OS settings, file changes, everything.
My question is -
Using FileSystemWatcher and Inotify, will I be able to know about every change made to every sector on the hard drive (OS settings, etc.)?
I'm coding it in C++ for Linux and Windows.
(I saw this question on Stack Overflow, which gave me some ideas.)
Inotify detects changes while your program is running; I'm guessing that FileSystemWatcher is similar.
One way to solve this is to keep a checksum for each sector (or group of sectors), and when making a backup, compare the current checksums against the list you have and back up only the blocks that have changed.
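The checksum idea can be sketched roughly as follows. This is an illustration under assumptions, not a backup tool: the disk is represented by an in-memory byte vector, the function names are made up, and FNV-1a is used only because it is short; it is a simple non-cryptographic hash, and a real tool would likely want a stronger one.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// 64-bit FNV-1a hash of a byte range (stand-in for a stronger checksum).
std::uint64_t fnv1a(const unsigned char* data, std::size_t len) {
    std::uint64_t h = 14695981039346656037ULL;
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 1099511628211ULL;
    }
    return h;
}

// One checksum per block of `block_size` bytes (the last block may be short).
std::vector<std::uint64_t> block_checksums(const std::vector<unsigned char>& image,
                                           std::size_t block_size) {
    std::vector<std::uint64_t> sums;
    for (std::size_t off = 0; off < image.size(); off += block_size) {
        const std::size_t len = std::min(block_size, image.size() - off);
        sums.push_back(fnv1a(image.data() + off, len));
    }
    return sums;
}

// Indices of blocks whose checksum differs from the previous backup,
// plus any blocks that did not exist in the previous list.
std::vector<std::size_t> changed_blocks(const std::vector<std::uint64_t>& previous,
                                        const std::vector<std::uint64_t>& current) {
    std::vector<std::size_t> changed;
    for (std::size_t i = 0; i < current.size(); ++i) {
        if (i >= previous.size() || previous[i] != current[i]) changed.push_back(i);
    }
    return changed;
}
```

The trade-off is that every backup run still has to read the whole disk to recompute the checksums; only the amount written to the backup shrinks.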
The MS Windows FileSystemWatcher mechanism is more limited than Linux's Inotify, but both probably will do what you need. The Linux mechanism provides (optional) notification for file reads, which causes the "access timestamp" to be updated.
However, the weakness from your application's perspective is that all file modifications made from system boot up to your program getting loaded (and unload to shutdown) will not be monitored. Your application might need to look through file modification timestamps of many files to identify changed files, depending on the level of monitoring you are targeting.
Both platforms maintain a timestamp for each file recording when it was last accessed. If an update to that timestamp is meant to trigger a backup notification, the Windows mechanism, which lacks such a notification, will behave differently from the Linux one. The Windows mechanism can also drop notifications because of buffer size limitations. Here is a real gem from the documentation:
Note that a FileSystemWatcher does not raise an Error event when an event is missed or when the buffer size is exceeded, due to dependencies with the Windows operating system. To keep from missing events, follow these guidelines:
Increasing the buffer size with the InternalBufferSize property can prevent missing file system change events.
Avoid watching files with long file names. Consider renaming using shorter names.
Keep your event handling code as short as possible.
At least you can control two out of three of these....
Can I change the ECC code for a block of a file stored on a flash drive by any means?
What about a file stored on an HDD (though I don't think there would be a difference between the two)?
Maybe through some hardware interrupts or anything like that?
Also I need the solution to be in C/C++.
A NAND flash drive is composed of a number of data pages and a flash controller. The ECC code on the NAND flash is used by the flash controller, which uses it to determine whether the associated data page has any errors. A filesystem (like FAT32, NTFS, or ext3) is usually implemented on top of the raw data page structure. A file may be spread across one or more flash pages. Please note that there is an error code for each flash page, meaning that a file larger than one data page will have an error code for each page it uses.
A hard disk drive is composed of one or more platters, heads which read and write data on the platters, and a disk controller. Each platter is divided into sectors. Hard disk drives also have error codes to check the integrity of sectors on the platters. Again, a filesystem is typically implemented on top of the raw disk sectors. If a file is larger than a sector then there will be multiple sectors associated with the file, each sector having its own error code.
To access a data page or sector error code you will need to interface with the flash or hard drive controller directly. This will require interfacing with the device driver for the device. You will need to read the documentation for the device driver in order to discover what functions it offers to give you access to error codes. In either case, unless a file occupies only one data page or sector, it will have more than one error code associated with it.
Some filesystems create error codes for files, regardless of their length, but accessing a filesystem-level error code typically does not require hardware access.
I was wondering how hard disk access works. For example, how could I view/modify sectors? I'm targeting Windows, if that helps.
Thanks
This page seems to have some relevant information on the subject:
You can open a physical or logical drive using the CreateFile() application programming interface (API) with these device names provided that you have the appropriate access rights to the drive (that is, you must be an administrator). You must use both the CreateFile() FILE_SHARE_READ and FILE_SHARE_WRITE flags to gain access to the drive.
Once the logical or physical drive has been opened, you can then perform direct I/O to the data on the entire drive. When performing direct disk I/O, you must seek, read, and write in multiples of sector sizes of the device and on sector boundaries. Call DeviceIoControl() using IOCTL_DISK_GET_DRIVE_GEOMETRY to get the bytes per sector, number of sectors, sectors per track, and so forth, so that you can compute the size of the buffer that you will need.
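The buffer-size computation mentioned in that quote is plain arithmetic once the bytes-per-sector value is known. The sketch below is platform-neutral and assumes that value has already been obtained (on Windows, from the IOCTL named above); AlignedRange and align_to_sectors are names invented for the example.

```cpp
#include <cstdint>

// A byte range widened so that it starts and ends on sector boundaries.
// (Hypothetical struct, for illustration only.)
struct AlignedRange {
    std::uint64_t offset;  // sector-aligned starting offset, in bytes
    std::uint64_t size;    // buffer size in bytes, a whole number of sectors
};

// Widen [offset, offset + length) to the enclosing whole sectors.
AlignedRange align_to_sectors(std::uint64_t offset, std::uint64_t length,
                              std::uint64_t bytes_per_sector) {
    // Round the start down and the end up to a multiple of the sector size.
    const std::uint64_t start = offset / bytes_per_sector * bytes_per_sector;
    const std::uint64_t end = (offset + length + bytes_per_sector - 1)
                              / bytes_per_sector * bytes_per_sector;
    return {start, end - start};
}
```

For instance, wanting 100 bytes at offset 1000 on a disk with 512-byte sectors means seeking to 512 and reading 1024 bytes; the requested data then starts 488 bytes into the buffer.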
The documentation of CreateFile also offers some clues:
You can use the CreateFile function to open a physical disk drive or a volume, which returns a direct access storage device (DASD) handle that can be used with the DeviceIoControl function. This enables you to access the disk or volume directly, for example such disk metadata as the partition table. However, this type of access also exposes the disk drive or volume to potential data loss, because an incorrect write to a disk using this mechanism could make its contents inaccessible to the operating system. To ensure data integrity, be sure to become familiar with DeviceIoControl and how other APIs behave differently with a direct access handle as opposed to a file system handle.
You can open a logical volume (e.g. the C: drive) or a physical drive using Win32's CreateFile() function. With the returned handle you can read and write sectors as needed.
This page at MSDN should get you started: CreateFile Function
I take no responsibility for any damage caused :-)