How are programs able to verify an encrypted file's password? - c++

Consider, as an example, that we have encrypted a file (sample.txt) using WinZip 9 with the password "agoodpassword".
Now if we try to open the file with a wrong password, we get an error message saying that the password typed is incorrect.
The question:
How can software verify whether the password typed in is correct? The content of the file could be any random data, so checking for errors in the file after decryption is not going to work. Still, the software needs some reference to verify the password against, so how does WinZip verify whether the decryption was successful?
What I suspect is that the password is also stored in the same encrypted file. Is that true, or does the software use some other method?

Instead of just encrypting, many applications that create a ciphertext also create an authentication tag. This authentication tag can be checked before decryption; if the authentication tag is incorrect, then one of the parameters (key, IV or ciphertext) is incorrect.
To encrypt with a password it is common to use PKCS#5 (password-based encryption). PKCS#5 contains a password hashing method that uses "key stretching", making it harder for an attacker to test many passwords using brute force or dictionary attacks. Such a password hashing method is called a Password-Based Key Derivation Function, or PBKDF. The latest PKCS#5 describes PBKDF2.
Now if you want to create a new password based encryption method, I would propose to do the following:
Perform a PBKDF2 with a (very) high iteration count and a 128-bit salt;
Make sure that the user gets feedback about the strength of the password;
Perform a KBKDF (key-based key derivation function) on the result of PBKDF2, creating a check value, a data encryption key, and a data authentication key;
Use the data encryption key for an encryption method, say AES-128-CBC with a random IV;
Use the data authentication key for an HMAC over the IV and the ciphertext;
Store the check value;
To verify the correct password during decryption, use the check value.
Note that I did not discuss the KBKDF yet. You may use a hash over the output of the PBKDF2 and a simple counter or string for that, say SHA-256(key seed, "ENC").
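The derivation and check-value steps above can be sketched with Python's standard library. This is only a sketch under assumptions: the iteration count, the labels "CHK", "ENC" and "MAC", and the function names are illustrative and not taken from any particular product. The AES step itself is left out because the standard library has no block cipher.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed value; tune it to your hardware


def derive_keys(password: str, salt: bytes) -> dict:
    """Stretch the password with PBKDF2, then split the seed into labelled keys."""
    seed = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Simple hash-based KBKDF: SHA-256 over the seed plus a distinct label.
    return {
        "check": hashlib.sha256(seed + b"CHK").digest(),
        "enc": hashlib.sha256(seed + b"ENC").digest()[:16],  # 128-bit AES key
        "mac": hashlib.sha256(seed + b"MAC").digest(),
    }


def password_matches(password: str, salt: bytes, stored_check: bytes) -> bool:
    """Recompute the check value and compare it in constant time."""
    return hmac.compare_digest(derive_keys(password, salt)["check"], stored_check)


def authentication_tag(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """HMAC-SHA-256 over the IV and the ciphertext."""
    return hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()


salt = os.urandom(16)  # 128-bit salt, stored next to the ciphertext
stored_check = derive_keys("agoodpassword", salt)["check"]
```

Only the salt, the check value, the IV, the ciphertext and the tag would be stored; the password and the derived keys never are. A wrong password yields a different check value, so it can be rejected before any decryption is attempted.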

You can use a hash value to provide a very high probability that anything other than the correct password will be rejected. Basically, hashing a password produces a number with a fixed number of binary digits, and a good cryptographic hash will produce a completely different number (inasmuch as random things tend to differ) if you type something even the tiniest bit different (for example, swapping two characters, or using uppercase instead of lowercase).
There's still a very small chance that two different passwords will produce the same hash value. For example, with only a 32-bit hash value there's about a 1 in 2^32 (roughly 4 billion) chance that a given wrong password is accepted. It gets quite mathematically complex to create a hash function that doesn't let an attacker recover the password (especially if it is a short password, since someone can pre-generate a list of short words and their hash values), so you probably want a pretty weak hash (just good enough to avoid returning corrupt data for 99.99% of typos) and/or one that's known to be resistant to such attacks.
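A small Python sketch of both points, offered as an illustration rather than a recommended design: a tiny change in the password changes the hash, and a deliberately short check value still rejects almost every typo. The helper name short_check and the 2-byte length are assumptions made for this example.

```python
import hashlib


def short_check(password: str, nbytes: int = 2) -> bytes:
    """Return the first nbytes of SHA-256(password) as a deliberately weak check value."""
    return hashlib.sha256(password.encode("utf-8")).digest()[:nbytes]


a = hashlib.sha256(b"agoodpassword").hexdigest()
b = hashlib.sha256(b"agoodpasswrod").hexdigest()  # two characters swapped
differing = sum(x != y for x, y in zip(a, b))  # number of hex digits that changed
```

With a 2-byte check value, a random wrong password slips through with probability about 1 in 65,536, while an attacker who sees the check value learns very little about the password itself.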

Related

Working with Password_hash and password_verify

I'm new to password_hash and password_verify, and they appear to be the most efficient way of storing passwords securely!
I noticed that password_hash produces different hash for the same plain-text value every time!
This means that if a user tried to create an account with the password (thisIsMyPassword) it will generate a hash like this $2y$10$VCNH8ndve8hwbvLJ2nMHtOsEiigE4zA7ViADxCJfq9bmUCmkNkcce,
And if another or the same user tried to create another account with the same password i.e. (thisIsMyPassword) the account will be created and the hash value of the password will be something like $2y$10$Hqssc5nn3pzgfwqVwQrQz.Ny71q972RXmCmyV9ykywG8iELbsf47a!
Now you see the same value i.e. (thisIsMyPassword) resulted in different hashes!
Is this OK?
Is it OK to let the users use same passwords, as long as the password hash is different in the database?
The password hash includes a so-called salt, a small random value that is there to defeat precomputed dictionary (rainbow table) attacks. Here is what the PHP manual says:
If omitted, a random salt will be generated by password_hash()
for each password hashed. This is the intended mode of operation.
The value you get as the output is not really a plain hash, but a
string made of the algorithm id, the cost, the salt and HASH(password, salt).
The used algorithm, cost and salt are returned as part of the hash.
Therefore, all information that's needed to verify the hash is included
in it. This allows the password_verify() function to verify the hash
without needing separate storage for the salt or algorithm information.
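The same idea can be sketched outside PHP. The Python example below is an illustration only: it uses PBKDF2 from the standard library instead of bcrypt, and the "algorithm$iterations$salt$hash" layout and the function names are assumptions for this sketch, not PHP's actual format.

```python
import base64
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed cost setting


def hash_password(password: str) -> str:
    """Return one string holding algorithm, cost, salt and hash."""
    salt = os.urandom(16)  # fresh random salt for every call
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    salt_b64 = base64.b64encode(salt).decode("ascii")
    digest_b64 = base64.b64encode(digest).decode("ascii")
    return f"pbkdf2-sha256${ITERATIONS}${salt_b64}${digest_b64}"


def verify_password(password: str, stored: str) -> bool:
    """Re-hash with the salt and cost read back from the stored string."""
    algorithm, iterations, salt_b64, digest_b64 = stored.split("$")
    if algorithm != "pbkdf2-sha256":
        return False
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, int(iterations))
    return hmac.compare_digest(actual, expected)
```

Hashing thisIsMyPassword twice gives two different strings, yet both verify, because each string carries its own salt. That is why identical passwords with different stored hashes are expected and fine.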

A simple credentials table for mySQL

Here is my simple table definition for a mysql credentials table.
case "credentials":
self::create('credentials', 'identifier INT NOT NULL AUTO_INCREMENT, flname VARCHAR(60), email VARCHAR(32), pass VARCHAR(40), PRIMARY KEY(identifier)');
break;
Please ignore all but the inner arguments; the syntax is good, I just want to verify the form. Basically, I have an auto-incrementing int for the PRIMARY KEY and 3 fields: the user's name, email, and password.
I want this to be as simple as possible. Searches will be based upon the id.
Question: Will this work for a basic credentials table?
Please please please do not store passwords in plaintext.
Use a well known iterated hashing function, such as bcrypt or PBKDF2. Don't store a raw MD5 hash, or even a raw SHA or SHA-2 hash. You should always salt and iterate your hashes to be secure.
You'll need one extra column to store the salt, and if you want to be flexible you could also have per-user iteration counts and maybe even per-user hash functions. That gives you the flexibility to change to a different hash function in the future without requiring all users to immediately change their passwords.
Apart from that the table looks fine.
I would suggest that you increase the size of the email field (an email address can be up to 254 characters long). Also, you should store your passwords as a hash (e.g. bcrypt), not as a plain string.
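A sketch of what the answers suggest, using Python's built-in sqlite3 as a stand-in for MySQL so that it runs anywhere. The column names salt, iterations and pass_hash, the widened email column and the helper functions are assumptions for illustration, and PBKDF2 is used because it ships with the standard library.

```python
import hashlib
import hmac
import os
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE credentials (
           identifier INTEGER PRIMARY KEY AUTOINCREMENT,
           flname     VARCHAR(60),
           email      VARCHAR(254),      -- widened from the original 32
           salt       BLOB NOT NULL,     -- per-user random salt
           iterations INTEGER NOT NULL,  -- per-user cost, so it can be raised later
           pass_hash  BLOB NOT NULL      -- never the plaintext password
       )"""
)


def add_user(flname: str, email: str, password: str, iterations: int = 100_000) -> int:
    """Store a salted, iterated hash and return the new row id."""
    salt = os.urandom(16)
    pass_hash = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    cur = conn.execute(
        "INSERT INTO credentials (flname, email, salt, iterations, pass_hash) "
        "VALUES (?, ?, ?, ?, ?)",
        (flname, email, salt, iterations, pass_hash),
    )
    return cur.lastrowid


def check_user(identifier: int, password: str) -> bool:
    """Recompute the hash with the stored salt and cost, then compare."""
    row = conn.execute(
        "SELECT salt, iterations, pass_hash FROM credentials WHERE identifier = ?",
        (identifier,),
    ).fetchone()
    if row is None:
        return False
    salt, iterations, pass_hash = row
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, pass_hash)
```

With bcrypt the salt and cost are embedded in the hash string itself, so a single column of about 60 characters would be enough; the original VARCHAR(40) would be too short for that.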

Google Plus user id representation technique

What kind of technique does Google Plus use to generate users' unique ids?
Example
https://plus.google.com/102766325060234825733/posts
You can only assume that they are randomly generated IDs, large enough to be generated non-sequentially with sufficient entropy.
The IDs are too big to be stored in a bigint field, which is interesting; again, this is probably due to the required entropy and the non-sequential requirement (so that nothing can be inferred by comparing user IDs).
Simple encryption of a serially generated number with a secret key can be used to generate the IDs. It can be a one-way hash or a reversible encryption.
The reason for not using serial numbers directly is obvious: you could easily guess the user IDs of other users on the network, which would let bots scrape the content of the network.
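Both guesses can be sketched in Python. Nothing here reflects how Google actually generates its IDs; the secret key, the 80-bit width and the function names are assumptions chosen for illustration.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = secrets.token_bytes(32)  # hypothetical server-side secret


def public_id(serial: int, key: bytes = SECRET_KEY) -> int:
    """One-way variant: keyed hash of a serial number, truncated to 80 bits."""
    mac = hmac.new(key, str(serial).encode("ascii"), hashlib.sha256).digest()
    return int.from_bytes(mac[:10], "big")


def random_id() -> int:
    """Alternative: draw the ID directly from a cryptographically secure generator."""
    return secrets.randbits(80)


# The ID from the example URL does not fit into an unsigned 64-bit integer.
EXAMPLE_ID = 102766325060234825733
fits_in_bigint = EXAMPLE_ID <= 2**64 - 1
```

Consecutive serial numbers map to unrelated-looking IDs, so neighbouring accounts cannot be guessed. Since a truncated hash or a random draw can collide in principle, a real system would still enforce uniqueness in the database.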

Create a single use link

I'm writing a database front end for a website. Next to the records I want to include links like this:
Record 1 - [Add][1] [Edit][2] [Delete][3]
But I want to protect these links from being used more than once. My thinking is to pass a hash value, store a list of valid hash values in a table somewhere, and only process requests with valid hash values. Is there a better way to do this?
Update: The answer to this question led me to ask this question: What is the difference between a "nonce" and a "GUID"?. Why exactly would you use a nonce instead of a GUID?
Your idea is correct, except that you should use cryptographically secure random bytes (a "nonce") instead of a hash.
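A minimal Python sketch of that approach, assuming an in-memory dictionary in place of the database table; the URL layout and the function names are made up for illustration.

```python
import secrets

_valid_tokens = {}  # token -> (action, record_id); a database table in practice


def issue_link(action: str, record_id: int) -> str:
    """Create a link carrying a fresh, unguessable single-use token."""
    token = secrets.token_urlsafe(32)  # cryptographically secure random bytes
    _valid_tokens[token] = (action, record_id)
    return f"/records/{record_id}/{action}?token={token}"


def redeem(token: str):
    """Return (action, record_id) the first time; None for reused or unknown tokens."""
    return _valid_tokens.pop(token, None)
```

Removing the token in the same step that validates it is what makes the link single-use. With a real table, the lookup and the delete should happen in one transaction so that two simultaneous requests cannot both succeed.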

Suitable alternative to CryptEncrypt

We have a situation in our product where for a long time some data has been stored in the application's database as a SQL string (choice of MS SQL Server or Sybase SQL Anywhere) which was encrypted via the Windows API function CryptEncrypt (direct and decryptable).
The problem is that CryptEncrypt can produce NULL bytes in the output, meaning that when it's stored in the database, the string manipulations will at some point truncate the ciphertext.
Ideally we'd like to use an algorithm that produces ciphertext without NULL bytes, as that will cause the least amount of change to the existing databases (changing a column from string to binary, and code to deal with binary instead of strings), and just decrypt existing data and re-encrypt with the new algorithm at database upgrade time.
The algorithm doesn't need to be the most secure, as the database is already in a reasonably secure environment (not an open network / the inter-webs) but does need to be better than ROT13 (which I can almost decrypt in my head now!)
edit: btw, any particular reason for changing ciphertext to cyphertext? ciphertext seems more widely used...
Any semi-decent algorithm will end up with a strong chance of generating a NULL value somewhere in the resulting ciphertext.
Why not do something like base-64 encode your resulting binary blob before persisting to the DB? (sample implementation in C++).
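A short Python sketch of the base-64 suggestion; the byte string is made-up stand-in data, not real CryptEncrypt output.

```python
import base64

# Made-up ciphertext with embedded NUL bytes, standing in for real encrypted output.
ciphertext = bytes([0x8F, 0x00, 0x41, 0x00, 0xFF, 0x10])

stored = base64.b64encode(ciphertext).decode("ascii")  # safe for a string column
restored = base64.b64decode(stored)  # exact original bytes, ready to decrypt
```

The encoded text contains only letters, digits, "+", "/" and "=", so string handling will not truncate it. It is about a third longer than the raw bytes, so the column width may need to grow.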
Storing a hash is a good idea. However, please definitely read Jeff's You're Probably Storing Passwords Incorrectly.
That's an interesting route OJ.
We're looking at the feasibility of a non-reversible method (still making sure we don't explicitly retrieve the data to decrypt), e.g. just storing a hash to compare on submission.
It seems that the developer handling this is going to wrap the existing encryption with yEnc to preserve the table integrity, as the data needs to be retrievable, and this saves all that messy mucking about with infinite-improbab.... uhhh changing column types on entrenched installations.
Cheers Guys