Inotify-like feature in a distributed file system - HDFS

As the title says, I want to trigger a notification when certain events happen.
An event can be user-defined, such as specified files being updated within one minute.
If the files were stored locally, I could easily do this with the inotify system call, but the files are located on a distributed file system such as MFS.
How can I do this? I'd like to know whether there are existing solutions or open-source projects that solve this problem. Thanks.

If you have only black-box access (e.g. the NFS protocol) to the remote system(s), you don't have many options unless the protocol supports what you need. So I'll assume you have control over the remote systems.
The "trivial" approach is running a local inotify/fanotify listener on each computer that would forward the notification over the network. FAM can do this over NFS.
A problem with all notification-based systems is the risk of lost notifications in various edge cases. This becomes much more acute over a network - e.g. a client confirms receipt of a notification, then immediately crashes. There are reliable message queues you could build on, but IMHO this way lies madness...
A saner approach is a stateless, hash-based scan.
I like to call the following design "hnotify" but that's not an established term. The ideas are widely used by many version control and backup systems, dating back to Plan 9.
The core idea is that if you know cryptographic hashes for files, you can compose a single hash that represents a directory of files - it changes if any of the files change - and you can build these hashes bottom-up to represent the whole filesystem's state.
(Git stores things this way and is very efficient at it.)
Why are hash trees cool? If you have 2 hash trees — one representing the filesystem state you saw at some point in the past, one representing the current state — you can easily find out what changed between them:
1. Start at the roots. If they differ, read the 2 root directories and compare the hashes of their subdirectories.
2. If a subdirectory has the same hash in both trees, nothing under it changed. No point going there.
3. If a subdirectory's hash changed, compare its contents recursively, i.e. repeat step (1) for it.
4. If one tree has a subdirectory the other doesn't, well, that's a change. With some global table you can also detect moves/renames.
Note that if only a few files changed, you only read a small portion of the current state. So the remote system doesn't have to send you the whole tree of hashes; it can be an interactive ping-pong of "give me the hashes for this directory; OK, now for this one...".
(This is akin to how Git's dumb HTTP protocol worked; there is a newer protocol with fewer round trips.)
This is as robust and bug-proof as polling the whole filesystem for changes — you can't miss anything — but reasonably efficient!
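A minimal sketch of that walk, assuming an illustrative Node type (a hash plus, for directories, a name-to-child-hash map) and fetch callbacks standing in for "give me the node at this path" queries against the old snapshot and the live remote system; none of these names are an established API:

    #include <functional>
    #include <iostream>
    #include <map>
    #include <string>

    // Illustrative tree node: a file has an empty `children` map; a directory
    // maps entry names to the hashes of its children.
    struct Node {
        std::string hash;                             // hash of content, or of the child list
        std::map<std::string, std::string> children;  // name -> child hash (directories only)
    };

    // Stand-ins for "give me the node at this path" queries against the stored
    // snapshot and the live remote system, respectively.
    using Fetch = std::function<Node(const std::string& path)>;

    void diff(const std::string& path, const Node& oldNode, const Node& newNode,
              const Fetch& fetchOld, const Fetch& fetchNew) {
        if (oldNode.hash == newNode.hash) return;  // identical subtree: skip it entirely
        for (const auto& [name, newHash] : newNode.children) {
            const std::string childPath = path + "/" + name;
            auto it = oldNode.children.find(name);
            if (it == oldNode.children.end()) {
                std::cout << "added:   " << childPath << "\n";      // reported at this level only
            } else if (it->second != newHash) {
                Node o = fetchOld(childPath), n = fetchNew(childPath);
                if (o.children.empty() && n.children.empty())
                    std::cout << "changed: " << childPath << "\n";  // a file changed
                else
                    diff(childPath, o, n, fetchOld, fetchNew);      // a directory changed: recurse
            }
        }
        for (const auto& entry : oldNode.children)
            if (!newNode.children.count(entry.first))
                std::cout << "removed: " << path + "/" + entry.first << "\n";
    }

Each recursion step only touches directories whose hashes differ, which is why only a small part of the tree ever has to cross the network.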
But how does the server track current hashes?
Unfortunately, fully hashing all disk writes is too expensive for most people. You may get it for free if you're lucky enough to be running a deduplicating filesystem, e.g. ZFS or Btrfs.
Otherwise you're stuck with re-reading all changed files (which is even more expensive than doing it in the filesystem layer) or using fake file hashes: upon any change to a file, invent a new random "hash" to invalidate it (and try to keep the fake hashes on moves). Still compute real hashes up the tree. Now you may have false positives — you "detect a change" when the content is the same — but never false negatives.
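A sketch of that fake-hash bookkeeping, driven by whatever local change notifications you have (e.g. inotify); the function names here are made up:

    #include <random>
    #include <string>
    #include <unordered_map>

    // path -> current "hash"; for plain files this may be a random token
    // rather than a real content hash.
    std::unordered_map<std::string, std::string> fileHash;

    // Invent a new random token. Any change notification invalidates the old
    // value, so comparisons can never produce a false negative.
    std::string newFakeHash() {
        static std::mt19937_64 rng{std::random_device{}()};
        return std::to_string(rng());
    }

    void onFileChanged(const std::string& path) {
        fileHash[path] = newFakeHash();
        // ...then recompute the *real* hashes of the enclosing directories
        // bottom-up, hashing their lists of (name, child hash) pairs.
    }

    void onFileMoved(const std::string& from, const std::string& to) {
        // Try to carry the fake hash across a rename so an unchanged file
        // doesn't look modified after a move.
        fileHash[to] = fileHash.count(from) ? fileHash[from] : newFakeHash();
        fileHash.erase(from);
    }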
Anyway, the point is that whatever stateful hacks you do (e.g. inotify with periodic scans to be sure), you only do them locally on the server. Across the network, you only ever send hashes that represent snapshots of current state (or its subtrees)! This way you can have a distributed system with many servers and clients, intermittent connectivity, and still keep your sanity.
P.S. Btrfs can efficiently find differences from an older snapshot. But that is a snapshot taken on the server (and it causes all the data to be preserved!), which is less flexible than a lightweight client-side tree of hashes.
P.S. One of your tags is HadoopFS. I'm not really familiar with it, but I suspect a lot of its files are write-once-then-immutable, and it might be able to natively give you some kind of file/chunk ids that can serve as fake hashes?
Existing tools
The first tool that springs to my mind is bup index. bup is a very clever deduplicating backup tool built on git (and scalable to huge amounts of data), so it sits on the foundation described above. In theory, indexing data in bup on the server and doing git fetch over the network would even implement the hash-walking comparison of what's new that I described above — unfortunately the git repositories that bup produces are too big for git itself to cope with. Also, you probably don't want bup to read and store all your data. But bup index is a separate subsystem that quickly scans a filesystem for potential changes, without yet reading the changed files.
Currently bup doesn't use inotify but it's been discussed in depth.
Oh, and bup uses Bloom filters, which are a nearly optimal way to represent sets with false positives. I'm almost certain Bloom filters have a role to play in optimizing stateless notification protocols ("here is a compressed bitmap of all I have; you should be able to narrow your queries with it" or "here is a compressed bitmap of what I want to be notified about"). Not sure if the way bup uses them is directly useful to you, but this data structure should definitely be in your toolbelt.
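For reference, a tiny Bloom filter sketch (a fixed bit array with k salted hashes); bup's on-disk format is different, this only shows the data structure and its no-false-negatives property:

    #include <bitset>
    #include <functional>
    #include <string>

    // A tiny Bloom filter: inserts set k bits, lookups may yield false
    // positives but never false negatives -- the same asymmetry as the
    // fake-hash trick above.
    class BloomFilter {
        static const size_t kBits = 1 << 20;   // ~1 Mbit of state
        static const int kHashes = 4;
        std::bitset<kBits> bits_;

        size_t slot(const std::string& key, int i) const {
            // Derive k different hash values by salting the key; good enough for a sketch.
            return std::hash<std::string>{}(std::to_string(i) + key) % kBits;
        }
    public:
        void add(const std::string& key) {
            for (int i = 0; i < kHashes; ++i) bits_.set(slot(key, i));
        }
        bool possiblyContains(const std::string& key) const {
            for (int i = 0; i < kHashes; ++i)
                if (!bits_.test(slot(key, i))) return false;  // definitely absent
            return true;                                      // present, or a false positive
        }
    };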
Another tool is git annex. It's also based on Git (are you noticing a trend?) but is designed to keep the data itself out of Git repos (so git fetch should just work!) and has a "WORM" option that uses fake hashes for faster performance.
Alternative design: compressed replayable journal
I used to think the above was the only sane stateless approach for clients to check what has changed. But I just read http://arstechnica.com/apple/2007/10/mac-os-x-10-5/7/ about OS X's FSEvents framework, which has a perhaps simpler design:
ALL changes are logged to a file. It's kept forever.
Clients can ask "replay for me everything since event 51348".
The magic trick is the log has coarse granularity ("something in this directory changed, go re-scan it to find out what", repeated changes within 30 seconds are combined) so this journal file is very compact.
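A sketch of such a coarse, coalescing journal; the 30-second window and the directory granularity come from the FSEvents description above, everything else is made up:

    #include <chrono>
    #include <cstdint>
    #include <string>
    #include <vector>

    struct Event {
        std::uint64_t id;                          // monotonically increasing event ID
        std::string dir;                           // "something in this directory changed"
        std::chrono::steady_clock::time_point at;  // time of the last coalesced change
    };

    class Journal {
        std::vector<Event> log_;      // append-only, kept "forever"
        std::uint64_t nextId_ = 1;
    public:
        void record(const std::string& dir) {
            auto now = std::chrono::steady_clock::now();
            // Coalesce: if the same directory changed within the last 30 seconds,
            // refresh the existing entry instead of appending a new one.
            for (auto it = log_.rbegin(); it != log_.rend(); ++it) {
                if (it->dir == dir && now - it->at < std::chrono::seconds(30)) {
                    it->at = now;
                    return;
                }
            }
            log_.push_back({nextId_++, dir, now});
        }
        // "Replay for me everything since event N."
        std::vector<Event> since(std::uint64_t lastSeenId) const {
            std::vector<Event> out;
            for (const auto& e : log_)
                if (e.id > lastSeenId) out.push_back(e);
            return out;
        }
    };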
At the low level you might resort to similar techniques — e.g. hashes — but the top-level interface is different: instead of snapshots you deal with a timeline of events. It may be an easier fit for some applications.

Related

Creating a C++ application where secret information can be stored

I want to create a portable C++ application for myself [CLI] which will store my secret project information.
But I am not sure how I can store the information in my program: whatever I update while using the program is held in a buffer, and when I close the program it is deleted, so the same information is not available anywhere.
I want to store the information persistently; what is the best way to do it? [Considering that my application will be portable, i.e. I can carry it on my pen drive anywhere and fetch my information from the program.]
One option I found was a database, but I have certain problems with databases:
1) SQLite => if anyone gets my sqlite.db file, they will know all my secret projects.
2) MySQL/SQL or any other database => they are not portable; they need to be installed on the system, and I would need to import and export every time on whichever system I use.
How do such applications store information in encrypted form, so that no one can read it easily?
Any help will be great.
Thanks
If you want your data to remain secret then you must encrypt it.
How you persist the data (sqlite, text file, etc.) makes no difference whatsoever.
See also:
encrypt/decrypt with AES using C/C++
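For illustration, a minimal AES-256-CBC sketch using OpenSSL's EVP interface (assuming OpenSSL is available); in a real tool you would derive the key from a passphrase (e.g. with PBKDF2) rather than hold a raw key like this, and you would check every return value:

    #include <openssl/evp.h>
    #include <openssl/rand.h>
    #include <string>
    #include <vector>

    // Encrypt `plain` with AES-256-CBC. `key` must be 32 bytes; a random
    // 16-byte IV is generated and prepended to the returned ciphertext.
    std::vector<unsigned char> encrypt(const std::string& plain,
                                       const std::vector<unsigned char>& key) {
        std::vector<unsigned char> iv(16);
        RAND_bytes(iv.data(), static_cast<int>(iv.size()));

        std::vector<unsigned char> out(iv);              // ciphertext starts with the IV
        out.resize(iv.size() + plain.size() + 16);       // room for CBC padding

        EVP_CIPHER_CTX* ctx = EVP_CIPHER_CTX_new();
        EVP_EncryptInit_ex(ctx, EVP_aes_256_cbc(), nullptr, key.data(), iv.data());

        int len = 0, total = 0;
        EVP_EncryptUpdate(ctx, out.data() + iv.size(), &len,
                          reinterpret_cast<const unsigned char*>(plain.data()),
                          static_cast<int>(plain.size()));
        total = len;
        EVP_EncryptFinal_ex(ctx, out.data() + iv.size() + total, &len);
        total += len;
        EVP_CIPHER_CTX_free(ctx);

        out.resize(iv.size() + total);
        return out;
    }

Decryption is the mirror image with the EVP_Decrypt* calls, reading the IV back from the front of the ciphertext.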
This is not REALLY an answer, but it's far too long a "discussion about your subject" to fit as a comment, and I'd rather break the rules by writing one "non-answer answer" (especially now that you have already accepted another answer) than write six comments.
First of all, if it's written in C++, it won't be truly portable in the sense that you can carry it around, plug it in anywhere you like, and just access the information, because different systems have different operating systems and processor architectures. That's fine if you restrict being able to "plug in" to Windows and Linux on x86 - you only need to build two copies of your code. But covering more architectures - e.g. being able to plug into an iPad or a MacBook - will require two more builds of the software. Soon you'll be looking at quite a lot of binaries to carry around (never mind that you need the relevant C++ compiler and development environment to build each copy). Yes, C++ is a portable language, but that doesn't mean the executable file will "work on anything" directly - it needs to be compiled for each architecture.
One solution here may of course be to use something other than C++ - for example Java, which only needs a Java VM on the target system; that is often already available on a customer's system, so it is less of an issue. But that won't work on, for example, an iPad.
Another solution is to have your own webserver at home, and just connect to your server from your customer's site. That way, none of the information (except the parts you actually show the customer) ever leaves your house. Make it secure by studying internet/web-site security, and using good passwords [and of course, you could even set it up so that it's only available at certain times when you need it, and not available 24/7]. Of course, if the information is really top-secret (nuclear weapons, criminal activities, etc), you may not want to do that for fear of someone accessing it when you don't want it to be accessed. But it's also less likely to "drop out of your pocket" if it's well protected with logins and passwords.
Encrypting data is not very hard - just download the relevant library, and go from there - crypt++ is one of those libraries.
If you store the data in a database, you will need either a database that encrypts its contents itself, or a very good way to avoid "leaking" clear-text information (e.g. via files in /tmp on a Linux machine); or, worse, you may need to decrypt the whole database before you can access it, which means that something could, at least in theory, "slurp" your entire database.
Depending on how secret your projects are, you may also need to consider that anything you enter, a password for example, can be read by the computer you are using - unless you bring your own computer as well [and in that case, there is some really good "encrypt my entire disk" software out there that is pretty much ready to use].
Also, if someone says "Can I plug my memory stick into your computer and run some of my software from it?", I'm not sure I'd let that person do that.
In other words, the TECHNICAL challenge of writing the code itself may not be the hardest nut to crack in your project - although it is interesting and challenging.

prioritizing torrent download sequences using libtorrent

Suppose I have 2+ clients (developed by me) ALL using libtorrent ( http://www.rasterbar.com/products/libtorrent/manual.html#queuing )
Can I effectively prioritize the download of a file from the other clients so that they download the file's pieces/chunks (whatever the torrent terminology is here) from the beginning of the file towards its end, and not in a more or less random order?
(Of course I'm allowing some "multiplexing"/"intertwining" of pieces for reasons of availability and performance, but the goal here is to download as linearly and as quickly as possible from the start of the file towards the end.)
The goal I'm thinking about here is obviously previewing the file quickly. How do I do this most effectively using libtorrent, or possibly another C++ torrent library?
(I'm not interested in torrent implementations in non-native languages like Java or Python - I need machine code for reasons of performance and security, so C, C++, or possibly D would all fit the bill.)
You can certainly prioritize pieces and files with torrent_handle::prioritize_pieces() and torrent_handle::prioritize_files(). See the documentation.
This won't be enough to download in order, though. To do that, you can enable sequential download with torrent_handle::set_sequential_download(). This will issue new piece requests in order. Keep in mind that the time a request takes to be satisfied varies a lot depending on which peer you talk to. Making the requests in order does not necessarily mean receiving the pieces in order.
There is another mechanism to attempt to do that. torrent_handle::set_piece_deadline() is used to set a target completion time for a piece. Such pieces are considered time-critical; they are ordered by deadline, and the fastest peers are used to request blocks from those pieces, attempting to download them in deadline order.
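A sketch of how those calls fit together, written against the libtorrent 1.x API linked in the question (names and defaults vary a little between versions):

    #include <libtorrent/torrent_handle.hpp>

    // Assumes a torrent has already been added to a session and `h` is its handle.
    void prefer_start_of_file(libtorrent::torrent_handle& h) {
        // Issue new piece requests in order (does not guarantee in-order arrival).
        h.set_sequential_download(true);

        // Additionally mark the first pieces as time-critical so the fastest
        // peers are used for them, in deadline order.
        int deadline_ms = 0;
        for (int piece = 0; piece < 20; ++piece) {   // "20" is an arbitrary preview window
            h.set_piece_deadline(piece, deadline_ms);
            deadline_ms += 500;                      // half a second apart, for example
        }
    }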
Now, I also get the impression that you want two separate clients (presumably running on different machines) to coordinate which pieces they download. Is that right? It's not entirely clear what you're asking about, but there's no simple way of asking libtorrent to do that.
You could write a plugin for libtorrent that implements a new extension message for these clients to chat and coordinate, which could de-select certain pieces the other client is downloading by setting their priority to 0.

Easiest way to sign/certify text file in C++?

I want to verify whether the text log files created by my program, run at my customer's site, have been tampered with. How do you suggest I go about doing this? I searched a bunch here and on Google but couldn't find my answer. Thanks!
Edit: After reading all the suggestions so far, here are my thoughts. I want to keep it simple, and since the customer isn't that computer savvy, I think it is safe to embed the salt in the binary. I'll continue to search for a simple solution using the keywords "salt checksum hash" etc. and post back here once I find one.
Obligatory preamble: How much is at stake here? You must assume that tampering will be possible, but that you can make it very difficult if you spend enough time and money. So: how much is it worth to you?
That said:
Since it's your code writing the file, you can write it out encrypted. If you need it to be human readable, you can keep a second encrypted copy, or a second file containing only a hash, or write a hash value for every entry. (The hash must contain a "secret" key, of course.) If this is too risky, consider transmitting hashes or checksums or the log itself to other servers. And so forth.
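One way to make the per-entry hash depend on a "secret" key is an HMAC. A minimal sketch using OpenSSL; the "tag then entry" log-line format is just an illustration:

    #include <openssl/evp.h>
    #include <openssl/hmac.h>
    #include <cstdio>
    #include <string>

    // Returns an HMAC-SHA256 tag (hex-encoded) for one log entry. Anyone without
    // `key` can still read the entry but cannot forge a matching tag.
    std::string tag_entry(const std::string& entry, const std::string& key) {
        unsigned char mac[EVP_MAX_MD_SIZE];
        unsigned int macLen = 0;
        HMAC(EVP_sha256(),
             key.data(), static_cast<int>(key.size()),
             reinterpret_cast<const unsigned char*>(entry.data()), entry.size(),
             mac, &macLen);
        std::string hex;
        char buf[3];
        for (unsigned int i = 0; i < macLen; ++i) {
            std::snprintf(buf, sizeof buf, "%02x", mac[i]);
            hex += buf;
        }
        return hex;   // write "<hex> <entry>" as the log line
    }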
This is quite a difficult thing to do, unless you can somehow protect the key pair used to sign the data. Signing the data requires a private key, and if that key is on a machine, a person can simply alter the data or create new data and use that private key to sign it. You can keep the private key on a "secure" machine, but then how do you guarantee that the data wasn't tampered with before it left the original machine?
Of course, if you are protecting only data in motion, things get a lot easier.
Signing data is easy, if you can protect the private key.
Once you've worked out the higher-level theory that ensures security, take a look at GPGME to do the signing.
You could put a checksum as a prefix to each of your file's lines, using an algorithm like Adler-32 or something similar.
If you do not want to put binary data in your log files, use a base64 encoding to convert the checksum to non-binary data. That way, you can discard only the lines that have been tampered with.
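A sketch of this suggestion using zlib's adler32(); hex encoding stands in for base64 here just to keep the example short:

    #include <zlib.h>
    #include <cstdio>
    #include <cstdlib>
    #include <string>

    // Prefix a log line with its Adler-32 checksum so individually damaged
    // lines can be detected (and discarded) later.
    std::string checksummed_line(const std::string& line) {
        uLong sum = adler32(0L, Z_NULL, 0);   // initial value
        sum = adler32(sum, reinterpret_cast<const Bytef*>(line.data()),
                      static_cast<uInt>(line.size()));
        char prefix[16];
        std::snprintf(prefix, sizeof prefix, "%08lx ", sum);
        return prefix + line;
    }

    bool line_is_intact(const std::string& stored) {
        if (stored.size() < 9) return false;
        uLong expected = std::strtoul(stored.substr(0, 8).c_str(), nullptr, 16);
        std::string body = stored.substr(9);
        uLong actual = adler32(adler32(0L, Z_NULL, 0),
                               reinterpret_cast<const Bytef*>(body.data()),
                               static_cast<uInt>(body.size()));
        return expected == actual;
    }

Note that a plain checksum only catches accidental or careless edits; a deliberate attacker can recompute it, which is where the keyed hash suggested in the earlier answer comes in.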
It really depends on what you are trying to achieve, what is at stakes and what are the constraints.
Fundamentally: what you are asking for is just plain impossible (in isolation).
Now, it's a matter of complicating the life of the people trying to modify the file so that it costs them more to modify it than they could gain by doing so. Of course, that means hackers motivated by the sole goal of cracking your protection measures will not be deterred that much...
Assuming it should work on a standalone computer (no network), it is, as I said, impossible. Whatever process you use, whatever the key or algorithm, it is ultimately embedded in the binary, which is exposed to the scrutiny of the would-be hacker. It's possible to disassemble it, to examine it with hex editors, to probe it with different inputs, to plug in a debugger, etc... Your only option is thus to make debugging and examination a pain by breaking down the logic, using debugger detection to change the code paths, and, if you are very good, using self-modifying code. It does not mean it becomes impossible to tamper with the process; it merely means it should become difficult enough that any attacker gives up.
If you have a network at your disposal, you can store a hash on a remote drive (under your control), and then compare against that hash. There are 2 difficulties here:
Storing (how do you ensure it is your binary doing the storing?)
Retrieving (how do you ensure you are talking to the right server?)
And of course, in both cases, beware of man-in-the-middle attacks...
One last bit of advice: if you need security, you'll need to consult a real expert, don't rely on some strange guys (like myself) talking on a forum. We're amateurs.
It's your file, and your program is the one allowed to modify it. That being the case, there is one simple solution (if you can afford to put your log files into a separate folder).
Note:
You can have all your log files placed in a separate folder. For example, in my application we have a lot of DLLs, each with its own log files, and of course the application has its own.
So have a separate process running in the background that monitors the folder for change notifications such as:
a change in file size
an attempt to rename the file or folder
deletion of the file
etc...
Based on these notifications, you can certify whether the file has been changed or not!
(As you and others may be guessing, even your own process and DLLs will change these files, which will also lead to notifications. You need to synchronize this action smartly. That's it.)
The Windows API to monitor a folder is given below (a short usage sketch follows the parameter list):
HANDLE FindFirstChangeNotification(
LPCTSTR lpPathName,
BOOL bWatchSubtree,
DWORD dwNotifyFilter
);
lpPathName:
Path to the log directory.
bWatchSubtree:
Watch subfolder or not (0 or 1)
dwNotifyFilter:
Filter conditions that satisfy a change notification wait. This parameter can be one or more of the following values.
FILE_NOTIFY_CHANGE_FILE_NAME
FILE_NOTIFY_CHANGE_DIR_NAME
FILE_NOTIFY_CHANGE_SIZE
FILE_NOTIFY_CHANGE_SECURITY
etc...
(Check MSDN)
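A minimal usage sketch of this API (the folder path is hypothetical; error handling is omitted):

    #include <windows.h>
    #include <stdio.h>

    int main(void) {
        HANDLE h = FindFirstChangeNotification(
            TEXT("C:\\MyApp\\Logs"),             // lpPathName: the log directory
            FALSE,                               // bWatchSubtree: top level only
            FILE_NOTIFY_CHANGE_FILE_NAME |
            FILE_NOTIFY_CHANGE_SIZE |
            FILE_NOTIFY_CHANGE_LAST_WRITE);      // dwNotifyFilter
        if (h == INVALID_HANDLE_VALUE) return 1;

        for (;;) {
            DWORD w = WaitForSingleObject(h, INFINITE);
            if (w != WAIT_OBJECT_0) break;
            printf("Something in the log folder changed - check with Suspect A.\n");
            if (!FindNextChangeNotification(h)) break;  // re-arm the notification
        }
        FindCloseChangeNotification(h);
        return 0;
    }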
How to make it work?
Suspect A: Our process
Suspect X: Other process or user
Inspector: The process that we created to monitor the folder.
The Inspector sees a change in the folder. It queries Suspect A about whether it made any change to it.
If so,
the change is taken as VALID.
If not,
that is a clear indication that the change was made by Suspect X, so it is NOT VALID!
The file is certified to be TAMPERED.
Other than that, below are some of the techniques that may (or may not :)) help you!
Store a timestamp, along with the file size, whenever the application closes the file.
The next time you open the file, check its last-modified time and its size. If both are the same, the file has not been tampered with. (A small sketch of this check appears at the end of this answer.)
Change the file's privileges to read-only after you write logs into it. If some program or someone wants to tamper with it, they will attempt to change the read-only attribute, and that action changes the modified date/time of the file.
Write only encrypted data to your log file. If someone tampers with it, then when you decrypt the data you will find some text that does not decrypt properly.
Use a compress/uncompress mechanism (compression may help you protect the file with a password).
Each way has its own pros and cons. Strengthen the logic based on your needs. You can even try a combination of the proposed techniques.
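A small sketch of the first technique (the timestamp-plus-size check), using std::filesystem (C++17); where the recorded values are stored, ideally somewhere the user cannot easily edit, is left open:

    #include <cstdint>
    #include <filesystem>

    namespace fs = std::filesystem;

    // Record these two values when the application closes the log file...
    struct LogFingerprint {
        std::uintmax_t size;
        fs::file_time_type lastWrite;
    };

    LogFingerprint record(const fs::path& logFile) {
        return {fs::file_size(logFile), fs::last_write_time(logFile)};
    }

    // ...and compare them the next time the file is opened.
    bool looksUntouched(const fs::path& logFile, const LogFingerprint& saved) {
        return fs::file_size(logFile) == saved.size &&
               fs::last_write_time(logFile) == saved.lastWrite;
    }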

How to create a program which works similarly to RAID1 (mirroring)?

I want to create a simple program which works very similarly to RAID1. It should work like this:
First I want to give it the primary HDD's drive letter and then the secondary one's. I will only write to the primary HDD! If any new data is copied to the primary HDD, it should automatically be copied to the secondary one.
I need some help with where to start. How do I monitor the data written to the primary HDD? Obviously there are many ways to do what I want (I think), but I need the simplest way.
If this isn't too complicated: how can I handle the case where the primary HDD has two or more partitions? Then I would have to check the secondary HDD's partitions too, and create/resize them if necessary.
Thanks in advance!
kampi
The concept of mirroring disk writes to another disk in real time is the basis for high availability, and implementing these schemes is not trivial.
The company I work for makes DoubleTake, which does real-time mirroring and replication of file-based I/O to local or remote volumes. This is a little different from what you are describing, which appears to be block-based disk/volume replication, but many of the concepts are similar.
For file-based replication, there are quite a few nasty scenarios; I'll describe a couple:
Synchronizing the contents of one volume to another volume, keeping in mind that changes can occur while you are doing this. I suppose you could simplify this by requiring that volumes start out totally formatted, but for people who already have data that will not be a good solution!
Keeping up with disk changes: what if the volume you are mirroring to is slower than the source volume? Where do you buffer? To disk? In memory?
Anyway, we use a kernel-mode file system filter driver to capture the disk I/O, and then our user-mode service grabs this I/O and forwards it to a local or remote disk.
If you want to learn about file system filtering, one of the best books (it's old but good) is Windows NT File System Internals by Rajeev Nagar. It's a must-read for doing any serious work with file system filters.
Also take a look at the file system filter samples in the Windows 7 WDK; it's free, and it has good file-monitor examples that will get you seeing disk changes pretty quickly.
Good Luck!

Pattern for synching object lists among computers (in C++)?

I've got an app that has about 10 types of objects. There will be potentially a few thousand object instances of each type. These lists of objects need to stay synchronized between apps running on different machines. If an object is added, changed or deleted, that needs to propagate to the other machines.
This will be a star topology -- there is a central master, and the rest are clients.
I DO have the concept of a session, so I can store data about each client.
Is there a good design pattern to follow for this? Even better, is there a (template based?) library that would handle asking the container what has changed since client X came by and getting that delta to send out?
Right now I'm thinking every object-type container has an update counter. When something is added/changed/removed, the update counter is incremented, and the changed object(s) are tagged with that value. Each client saves the value of the update counter when it gets an update. Later it comes back and asks for any changes since its saved update-counter value. Finally, deletes are kept as tombstone records (although I'm not exactly sure when to clear them out).
One thing that makes this harder is that clients can come and go without the central server necessarily knowing, although I guess there could be a timeout concept (if the server hasn't heard from a client in 5 minutes, it assumes the client is gone).
Is this a well-known pattern? Any additional suggestions?
How you implement synchronization very much depends on your needs. Do the changes need to be pushed to the clients, or is it sufficient that a client checks whether an object is up to date whenever it uses that object? How about using the Proxy pattern? This pattern allows you to create a proxy implementation of your objects that can check whether they are up to date, update them if they are not, and then return the result. I would do this by having a lastChanged timestamp on the objects on the master and a lastUpdated timestamp on the client objects. If latency is an issue, checking whether an object is up to date on each call is probably not a good idea. Consider having a separate thread that queries the master for changed objects and marks them "dirty". This could dramatically reduce the network traffic as well.
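A rough sketch of that proxy idea; the Master interface and the Widget type are hypothetical stand-ins for your own object types, and lastChanged/lastUpdated are the timestamps described above:

    #include <cstdint>
    #include <string>

    // Hypothetical object type; the master stamps it on every modification.
    struct Widget {
        std::string data;
        std::uint64_t lastChanged = 0;
    };

    // Hypothetical interface to the central master.
    struct Master {
        virtual std::uint64_t lastChanged(int id) = 0;  // master's stamp for this object
        virtual Widget fetch(int id) = 0;               // fresh copy of the object
        virtual ~Master() = default;
    };

    // Proxy: refreshes its local copy only when the master has a newer version.
    class WidgetProxy {
        Master& master_;
        int id_;
        Widget local_;
        std::uint64_t lastUpdated_ = 0;  // stamp of the version we currently hold
    public:
        WidgetProxy(Master& m, int id) : master_(m), id_(id) {}

        const Widget& get() {
            std::uint64_t current = master_.lastChanged(id_);
            if (current > lastUpdated_) {        // stale: pull the new version
                local_ = master_.fetch(id_);
                lastUpdated_ = current;
            }
            return local_;
        }
    };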
You could also look into the Observer pattern and Publish/Subscribe.
An option that might be simple to implement and still pretty efficient is to treat the pile of objects as an opaque blob and use librsync to synchronize them. It sounds like all of the updates flow one direction, from master to clients, and there's probably some persistent representation of the objects on the clients -- a file or something. I'm assuming it's a file for the rest of this answer, though any sequence of bytes can be used.
The way it would work is that each client would generate a librsync "signature" of its local copy of the blob and send that signature to the master. The signature is about 1% of the size of the blob. The master would then use librsync to compute a delta between that signature and the current data, and send the delta to the client, which would use librsync to apply the delta to its local copy of the blob.
The librsync API is simple, and the signature/delta data transfer is relatively efficient.
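A sketch of that signature/delta/patch flow using librsync's whole-file helpers. The exact parameter lists differ between librsync versions (2.x adds a signature-magic argument to rs_sig_file, for example), so treat this as the shape of the calls rather than copy-paste code:

    #include <librsync.h>
    #include <cstdio>

    // Client side: produce a signature of the local blob to send to the master.
    void make_signature(const char* blobPath, const char* sigPath) {
        FILE* blob = std::fopen(blobPath, "rb");
        FILE* sig  = std::fopen(sigPath, "wb");
        rs_sig_file(blob, sig, RS_DEFAULT_BLOCK_LEN, RS_DEFAULT_STRONG_LEN, nullptr);
        std::fclose(blob);
        std::fclose(sig);
    }

    // Master side: given the client's signature and the current blob, emit a delta.
    void make_delta(const char* sigPath, const char* blobPath, const char* deltaPath) {
        FILE* sig = std::fopen(sigPath, "rb");
        rs_signature_t* sumset = nullptr;
        rs_loadsig_file(sig, &sumset, nullptr);
        rs_build_hash_table(sumset);

        FILE* blob  = std::fopen(blobPath, "rb");
        FILE* delta = std::fopen(deltaPath, "wb");
        rs_delta_file(sumset, blob, delta, nullptr);

        rs_free_sumset(sumset);
        std::fclose(sig); std::fclose(blob); std::fclose(delta);
    }

    // Client side again: apply the delta to the old blob to produce the new one.
    void apply_delta(const char* oldPath, const char* deltaPath, const char* newPath) {
        FILE* oldBlob = std::fopen(oldPath, "rb");
        FILE* delta   = std::fopen(deltaPath, "rb");
        FILE* newBlob = std::fopen(newPath, "wb");
        rs_patch_file(oldBlob, delta, newBlob, nullptr);
        std::fclose(oldBlob); std::fclose(delta); std::fclose(newBlob);
    }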
If that's not workable, it may still be useful to take a more manual "delta-based" approach, to avoid having to do per-object versioning. Each time the master makes a change, it should log that change to a journal, recording what was done and to which object. Versioning is done at the whole-database level, so in effect a version number is assigned to each journal entry.
When a client connects, it should send its version of the whole object collection, and the server can then respond with the contents of the journal between the client's version and the newest entry. If updates on a given object are done by completely replacing the object contents, then you can optimize this by filtering out all but the most recent version of each object. If the master also keeps track of which versions it has sent to which client, it can know when it is safe to discard old journal entries. Even if it doesn't track that, you can still discard old journal entries according to some heuristic (probably just age) and if you receive a connection from a client whose last version is older than your oldest journal entry, then you just have to send the entire set of objects to that client.
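A condensed, in-memory sketch of that journal approach; the types and names are illustrative only:

    #include <cstdint>
    #include <map>
    #include <string>
    #include <vector>

    enum class Op { Upsert, Delete };

    struct JournalEntry {
        std::uint64_t version;   // whole-database version number
        Op op;
        int objectId;
        std::string payload;     // full new contents on Upsert, empty on Delete
    };

    class Master {
        std::uint64_t version_ = 0;
        std::vector<JournalEntry> journal_;
    public:
        void upsert(int id, const std::string& payload) {
            journal_.push_back({++version_, Op::Upsert, id, payload});
        }
        void remove(int id) {
            journal_.push_back({++version_, Op::Delete, id, {}});
        }
        // What a reconnecting client receives: every entry newer than its version,
        // filtered so only the latest entry per object survives (valid because an
        // update replaces the whole object).
        std::vector<JournalEntry> changesSince(std::uint64_t clientVersion) const {
            std::map<int, JournalEntry> latest;
            for (const auto& e : journal_)
                if (e.version > clientVersion) latest[e.objectId] = e;
            std::vector<JournalEntry> out;
            for (const auto& kv : latest) out.push_back(kv.second);
            return out;
        }
        std::uint64_t currentVersion() const { return version_; }
    };

Discarding old journal entries then amounts to trimming the front of the vector once every known client's version is past it, as described above.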