open addressing vs chained hashing - c++

Open addressing is usually faster than chained hashing. I am timing successful lookups at a low load factor (0.1), but I keep getting better times for chained hashing than for open addressing. The difference is very small, and sometimes open addressing is even faster, but averaged over 100 inserts the time is better with chained hashing.
I used a vector<string> as the hash table for open addressing and a vector<list<string> > for chained hashing, with universal hashing for both.
This is the part of the code for successful lookups that I am timing:
Chained hashing lookup:
for (auto it_ric = this->hash[key].begin(); it_ric != this->hash[key].end(); it_ric++)
{
if ((*it_ric) == string)
{
found = true;
}
}
Open addressing:
while (found == false && key != hash.size())
{
if (hash[key] == string)
{
key = hash.size();
found = true;
}
else if (hash[key] == "")
{
key = hash.size();
found = true;
}
else
{
key++;
}
}
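For comparison, here is a minimal sketch of a linear-probing lookup. It assumes, as in the code above, that an empty string marks an unused slot and that the starting index comes from the hash function; the name probeFind is hypothetical. Unlike the loop above, it reports a miss when it reaches an empty slot, and it wraps around the end of the table instead of stopping there.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: linear-probing lookup in an open-addressing table.
// Assumption: "" marks an unused slot and `target` is never empty.
bool probeFind(const std::vector<std::string>& table,
               std::size_t start,
               const std::string& target)
{
    const std::size_t n = table.size();
    for (std::size_t step = 0; step < n; ++step) {
        // Wrap around so probing continues from slot 0 after the last slot.
        const std::string& slot = table[(start + step) % n];
        if (slot == target) {
            return true;   // successful lookup
        }
        if (slot.empty()) {
            return false;  // an unused slot ends the probe sequence: not present
        }
    }
    return false;          // every slot was probed without a match
}
```

Both loops being timed should stop as soon as the answer is known; otherwise the comparison measures the extra iterations rather than the data structure.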

Related

Fastest solution to TestDome exercise about repeating playlist

I tried an exercise on TestDome about detecting whether a playlist has repetitions (TestDome C++ Playlist).
I tried to solve it this way:
bool isRepeatingPlaylist()
{
std::map<std::string, int> songs;
Song* pSong = this;
while (pSong != nullptr) {
if (songs[pSong->name] > 0)
return true;
songs[pSong->name]++;
pSong = pSong->nextSong;
}
return false;
}
The feedback is that I passed 3 out of 4 test cases. The test case I'm not passing is the one about performance. Can you help me improve it?
The 4th test is "Performance test on a large playlist".
When efficiency matters and you don't need ordered data, you should use unordered_set<> or unordered_map<>.
Lookup in map<> is O(log n), whereas for unordered_set<> and unordered_map<> it is O(1) on average and O(n) in the worst case.
The code below passed all 4 tests of TestDome C++ Playlist:
bool isRepeatingPlaylist()
{
std::unordered_set<std::string> playedSongs;
Song *pSong = this;
while (pSong != nullptr)
{
if (playedSongs.find(pSong->name) == playedSongs.end())
playedSongs.insert(pSong->name);
else
return true;
pSong = pSong->nextSong;
}
return false;
}
To use it, just add #include <unordered_set>.
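The find-then-insert pair can also be collapsed into a single call, because insert() on std::unordered_set returns a pair whose second member is false when the element was already present. The sketch below shows this; the Song struct is a hypothetical stand-in for the exercise's class, assumed to have only a name and a nextSong pointer.

```cpp
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for the exercise's Song class.
struct Song {
    std::string name;
    Song* nextSong = nullptr;

    explicit Song(std::string n) : name(std::move(n)) {}

    bool isRepeatingPlaylist() const
    {
        std::unordered_set<std::string> playedSongs;
        for (const Song* s = this; s != nullptr; s = s->nextSong) {
            // insert() reports whether the name was newly added;
            // false means this name has been seen before.
            if (!playedSongs.insert(s->name).second) {
                return true;
            }
        }
        return false;
    }
};
```

This hashes each name once per song instead of twice; whether that matters for the performance test is not something the source states.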

Depth first search to find a shortest path c++

void existsInNextMapDFS(int currMapID, int startMapID, int destMapID, int numRecursions, cliext::vector<MapPath^>^ searchList, cliext::vector<MapPath^>^ finalPath,bool &destFound,int &i) {
if (currMapID == destMapID) {
if ((int)(finalPath->size()) == 0 || finalPath->size() > searchList->size())
*finalPath = searchList; //Current path is the shortest path to destination map
return; //Returning so that no further maps from this one are searched
}
if (getMap(currMapID)->portals->Count == 0 || numRecursions > 300) {
return;
}
//If current map is an endpoint or if number of recursions are over 300, no further maps are searched
for each(PortalData^ portalData in getMap(currMapID)->portals) {
Log::WriteLine("for loop portaldata");
bool existsInSearchList = false;
for each (MapPath ^ mapData in searchList) {
if (mapData->mapID == portalData->toMapID) {
Log::WriteLine("for loop mapdata");
existsInSearchList = true;
break;
}
}
if (getMap(portalData->toMapID) == nullptr) {
continue; //Skips portals where the portal's map is not found
}
if (existsInSearchList) {
continue; //Skip portals where it goes to maps already in search path to prevent loop backs
}
MapPath^ mapPath = gcnew MapPath(currMapID, portalData);
searchList->push_back(mapPath);
existsInNextMapDFS(portalData->toMapID, startMapID, destMapID, numRecursions + 1, searchList, finalPath,destFound,i); //Recursive call
searchList->pop_back();
}
}
Hey, I'm trying to write a map skipper for a 2D platform game. Each map has a number of portals, which are stored in the map data, and I need the shortest route to reach a destination map.
The idea is to go through all portals in each map until the destination is reached and return the route.
This is what I have so far, but it sometimes gets stuck and doesn't search new maps.
I tried debugging it for hours but still couldn't find a solution.
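For reference, when every portal counts as one step, a breadth-first search finds a shortest route without exploring every possible path. The sketch below uses BFS rather than the DFS above, and plain standard C++ rather than C++/CLI. The Graph alias and the function name shortestRoute are hypothetical; the graph maps each map ID to the IDs its portals lead to.

```cpp
#include <algorithm>
#include <queue>
#include <unordered_map>
#include <vector>

// Hypothetical representation: map ID -> IDs of maps reachable through its portals.
using Graph = std::unordered_map<int, std::vector<int>>;

// Returns the map IDs on a shortest route from start to dest (inclusive),
// or an empty vector if dest cannot be reached.
std::vector<int> shortestRoute(const Graph& portals, int start, int dest)
{
    std::unordered_map<int, int> parent{{start, start}};  // also serves as the visited set
    std::queue<int> frontier;
    frontier.push(start);

    while (!frontier.empty()) {
        const int current = frontier.front();
        frontier.pop();
        if (current == dest) {
            break;  // the first time dest is dequeued, its route is shortest
        }
        const auto it = portals.find(current);
        if (it == portals.end()) {
            continue;  // unknown map or dead end
        }
        for (const int next : it->second) {
            // emplace succeeds only for maps not seen before
            if (parent.emplace(next, current).second) {
                frontier.push(next);
            }
        }
    }

    if (parent.find(dest) == parent.end()) {
        return {};
    }
    std::vector<int> route;
    for (int at = dest; at != start; at = parent.at(at)) {
        route.push_back(at);
    }
    route.push_back(start);
    std::reverse(route.begin(), route.end());
    return route;
}
```

Because each map is enqueued at most once, no recursion limit is needed and the search cannot loop back on itself.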

Using getopt when options have options C++

I am using getopt to parse command line arguments and my issue is that some of my options have options. My project is to test different backend implementations of maps and the -b flag specifies which implementation to use. Most of the options are straightforward, but for the backends that use hash tables (chained and open) there is an additional -number that can be added to the end to specify the load factor. So it would be -b chained-0.75.
My idea is that I would take the substring from 8 to the end (or 5 for the "open" option) because that would ignore the "chained-" part of the string and then use atof() to convert it to a double and then declare my map. I believe optarg is a char array (?) and I keep running into type mismatch errors even though I have tried std::string str(optarg); I also don't know what to write in place of else if (strcasecmp(optarg, "chained") == 0) because there could be any number at the end of it. So right now when I do -b chained-0.75 it calls the usage function.
Here is what I have so far:
while ((c = getopt(argc, argv, "hb:n:p:")) != -1) {
switch (c) {
case 'b':
if (strcasecmp(optarg, "unsorted") == 0) {
map = new UnsortedMap();
} else if (strcasecmp(optarg, "sorted") == 0) {
map = new SortedMap();
} else if (strcasecmp(optarg, "bst") == 0) {
map = new BSTMap();
} else if (strcasecmp(optarg, "unordered") == 0) {
map = new UnorderedMap();
} else if (strcasecmp(optarg, "chained") == 0) {
double load_factor;
std::string str(optarg);
std::string ld_str = str.substr(8, str.length()-1);
load_factor = atof(ld_str);
map = new ChainedMap(load_factor);
} else if (strcasecmp(optarg, "open") == 0) {
map = new OpenMap();
} else {
usage(1);
}
break;
Any hints or ideas would be appreciated!
strcasecmp() compares whole strings, so this strcasecmp() call will never match "chained-0.75". The only thing that strcasecmp() will match against the string "chained" is "chained" itself (ignoring case), not "chained-0.75" and not "chained-anything".
The right function is strncasecmp:
} else if (strncasecmp(optarg, "chained-", 8) == 0) {
Note that you're comparing against "chained-", and not just "chained". A few moments of thought should make it clear why.
The existing code also fails to take into account the possibility that the string after "chained-" is not a number, since atof() does not handle parsing errors. If you need to be able to detect and handle an error here, use strtod() instead of atof().
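A minimal sketch of the two points above, combining strncasecmp() for the prefix with strtod() for the number. The helper name parseLoadFactor is hypothetical, and treating trailing characters after the number as an error is an assumption about the desired behavior.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <strings.h>  // strncasecmp (POSIX)

// Hypothetical helper: if arg starts with prefix (case-insensitive) and the
// rest is a complete floating-point number, stores it in out and returns true.
bool parseLoadFactor(const char* arg, const char* prefix, double& out)
{
    const std::size_t prefixLen = std::strlen(prefix);
    if (strncasecmp(arg, prefix, prefixLen) != 0) {
        return false;  // prefix does not match
    }
    const char* numberStart = arg + prefixLen;
    char* end = nullptr;
    const double value = std::strtod(numberStart, &end);
    // end == numberStart: no digits were consumed.
    // *end != '\0': something other than a number follows.
    if (end == numberStart || *end != '\0') {
        return false;
    }
    out = value;
    return true;
}
```

In the switch, a call such as parseLoadFactor(optarg, "chained-", load_factor) could then replace the substr() and atof() lines, with usage(1) on failure.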

check if a value belongs to a certain object

I have a map with two different kinds of objects: deposit account and checking account. I want to write a money transfer method to transfer money between two checking accounts only. Is there a way to check if both account numbers belong to the same checking account object?
bool Bank::moneyTransfer(long fromAccount,long toAccount, double amount)
{
map<long, account*>::iterator iterFrom;
map<long, account*>::iterator iterTo;
iterFrom = m_accountList.find(fromAccount);
if (iterFrom == m_accountList.end()) {
return false;
}
iterTo = m_accountList.find(toAccount);
if (iterTo == m_accountList.end()) {
return false;
}
account *from = iterFrom->second;
account *to = iterTo->second;
if (!from->drawMoney(amount)) {
return false;
}
to->payIn(amount);
return true;
}
Q. Is there a way to check if both account numbers belong to the same checking account object?
A. Yes
As Shaktal says, you're passing in the account numbers, so just compare them.
There are a couple of things that need to be cleaned up in your code:
The fact that you're asking this question indicates that you think you could have the same key in a map with 2 values. This is not the case; this code will result in the key 13 mapping to only a DepositAccount:
m_accountList[13] = CheckingAccount();
m_accountList[13] = DepositAccount();
Please use auto to declare your variables, especially instead of map<long, account*>::iterator. Aside from being easier to read, you won't have to come back and edit your logic when you change m_accountList's type. For more information on auto, I believe the definitive article is: https://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/
After making these corrections, your code should look something like:
bool Bank::moneyTransfer(long fromAccount, long toAccount, double amount)
{
if(fromAccount != toAccount) {
auto iterFrom = m_accountList.find(fromAccount);
if (iterFrom != m_accountList.end()) {
auto iterTo = m_accountList.find(toAccount);
if (iterTo != m_accountList.end() && iterFrom->second->drawMoney(amount)) {
iterTo->second->payIn(amount);
return true;
}
}
}
return false;
}
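If the transfer must also be limited to checking accounts, one possible approach is a dynamic_cast on the stored pointers. This is a sketch under assumptions, not the asker's actual classes: the Account, CheckingAccount and DepositAccount types below are hypothetical, and the technique requires the base class to be polymorphic (here via a virtual destructor).

```cpp
#include <map>

// Hypothetical class hierarchy standing in for the asker's account types.
struct Account {
    double balance = 0.0;
    virtual ~Account() = default;  // makes the type polymorphic for dynamic_cast

    bool drawMoney(double amount)
    {
        if (amount <= 0 || amount > balance) {
            return false;
        }
        balance -= amount;
        return true;
    }
    void payIn(double amount) { balance += amount; }
};
struct CheckingAccount : Account {};
struct DepositAccount : Account {};

// Transfers only when both numbers exist, differ, and refer to checking accounts.
bool transferBetweenChecking(std::map<long, Account*>& accounts,
                             long from, long to, double amount)
{
    if (from == to) {
        return false;
    }
    const auto itFrom = accounts.find(from);
    const auto itTo = accounts.find(to);
    if (itFrom == accounts.end() || itTo == accounts.end()) {
        return false;
    }
    // dynamic_cast yields nullptr when the object is not a CheckingAccount.
    auto* source = dynamic_cast<CheckingAccount*>(itFrom->second);
    auto* target = dynamic_cast<CheckingAccount*>(itTo->second);
    if (source == nullptr || target == nullptr) {
        return false;
    }
    if (!source->drawMoney(amount)) {
        return false;
    }
    target->payIn(amount);
    return true;
}
```

A virtual member function that reports the account kind would avoid the cast; which of the two fits better depends on how the real classes are designed.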

Why, after using 'CryptSetHashParam', can I no longer add data to my MD5 hash object?

I am trying to use the Microsoft 'Crypt...' functions to generate an MD5 hash key from the data that is added to the hash object. I am also trying to use the 'CryptSetHashParam' to set the hash object to a particular hash value before adding data to it.
According to the Microsoft documentation (if I am interpreting it correctly), you should be able to do this by creating a duplicate hash of the original object, use the 'CryptGetHashParam' function to retrieve the hash size then use 'CryptSetHashParam' on the original object to set the hash value accordingly. I am aware that after using 'CryptGetHashParam' you are unable to add additional data to a hash object (which is why I thought you needed to create a duplicate), but I can't add data to either the original hash object or the duplicate hash object after using either 'CryptGetHashParam' (as expected), or 'CryptSetHashParam' (which I didn't expect).
Below are code extracts of the class I am writing and an example of how I am using the class functions:
The result I get after running the code is:
"AddDataToHash function failed - Errorcode: 2148073484.", which translates to: "Hash not valid for use in specified state.".
I've tried many different ways to try and get this working as intended, but the result is always the same. I accept that I am doing something wrong, but I can't see what it is I'm doing wrong. Any ideas please?
CLASS CONSTRUCTOR INITIALISATION.
CAuthentication::CAuthentication()
{
m_dwLastError = ERROR_SUCCESS;
m_hCryptProv = NULL;
m_hHash = NULL;
m_hDuplicateHash = NULL;
if(!CryptAcquireContext(&m_hCryptProv, NULL, NULL, PROV_RSA_FULL, CRYPT_MACHINE_KEYSET))
{
m_dwLastError = GetLastError();
if (m_dwLastError == 0x80090016 )
{
if(!CryptAcquireContext(&m_hCryptProv, NULL, NULL, PROV_RSA_FULL, CRYPT_NEWKEYSET | CRYPT_MACHINE_KEYSET))
{
m_dwLastError = GetLastError();
m_hCryptProv = NULL;
}
}
}
if(!CryptCreateHash(m_hCryptProv, CALG_MD5, 0, 0, &m_hHash))
{
m_dwLastError = GetLastError();
m_hHash = NULL;
}
}
FUNCTION USED TO SET THE HASH VALUE OF THE HASH OBJECT.
bool CAuthentication::SetHashKeyString(char* pszKeyBuffer)
{
bool bHashStringSet = false;
DWORD dwHashSize = 0;
DWORD dwHashLen = sizeof(DWORD);
BYTE byHash[DIGITAL_SIGNATURE_LENGTH / 2]={0};
if(pszKeyBuffer != NULL && strlen(pszKeyBuffer) == DIGITAL_SIGNATURE_LENGTH)
{
if(CryptDuplicateHash(m_hHash, NULL, 0, &m_hDuplicateHash))
{
if(CryptGetHashParam(m_hDuplicateHash, HP_HASHSIZE, reinterpret_cast<BYTE*>(&dwHashSize), &dwHashLen, 0))
{
if (dwHashSize == DIGITAL_SIGNATURE_LENGTH / 2)
{
char*pPtr = pszKeyBuffer;
ULONG ulTempVal = 0;
for(ULONG ulIdx = 0; ulIdx < dwHashSize; ulIdx++)
{
sscanf(pPtr, "%02lX", &ulTempVal);
byHash[ulIdx] = static_cast<BYTE>(ulTempVal);
pPtr+= 2;
}
if(CryptSetHashParam(m_hHash, HP_HASHVAL, &byHash[0], 0))
{
bHashStringSet = true;
}
else
{
pszKeyBuffer = "";
m_dwLastError = GetLastError();
}
}
}
else
{
m_dwLastError = GetLastError();
}
}
else
{
m_dwLastError = GetLastError();
}
}
if(m_hDuplicateHash != NULL)
{
CryptDestroyHash(m_hDuplicateHash);
}
return bHashStringSet;
}
FUNCTION USED TO ADD DATA FOR HASHING.
bool CAuthentication::AddDataToHash(BYTE* pbyHashBuffer, ULONG ulLength)
{
bool bHashDataAdded = false;
if(CryptHashData(m_hHash, pbyHashBuffer, ulLength, 0))
{
bHashDataAdded = true;
}
else
{
m_dwLastError = GetLastError();
}
return bHashDataAdded;
}
MAIN FUNCTION CLASS USAGE:
CAuthentication auth;
.....
auth.SetHashKeyString("0DD72A4F2B5FD48EF70B775BEDBCA14C");
.....
if(!auth.AddDataToHash(pbyHashBuffer, ulDataLen))
{
TRACE("CryptHashData function failed - Errorcode: %lu.\n", auth.GetAuthError());
}
You can't do it because it doesn't make any sense. CryptGetHashParam with the HP_HASHVAL option finalizes the hash, so there is no way to add data to it. If you want to "fork" the hash so that you can finalize it at some point as well as add data to it, you must duplicate the hash object prior to finalizing. Then you add the data to one of the hash objects and finalize the other. For example, you might do this if you wanted to record a cumulative hash after every 1024 bytes of a data stream. You should not call CryptSetHashParam on the hash object that you are continuing to add data to.
CryptSetHashParam with the HP_HASHVAL option is a brutal hack to overcome a limitation in the CryptoAPI. The CryptoAPI will only sign a hash object, so if you want to sign some data that might have been hashed or generated outside of CAPI, you have to "jam" it into a hash object.
EDIT:
Based on your comment, I think you are looking for a way to serialize the hash object. I cannot find any evidence that CryptoAPI supports this. There are alternatives, however, that are basically variants of my "1024 bytes" example above. If you are hashing a sequence of files, you could simply compute and save the hash of each file. If you really need to boil it down to one value, then you can compute a modified hash where the first piece of data you hash for file i is the finalized hash for files 0, 1, 2, ..., i-1. So:
H_{-1} = empty,
H_i = MD5(H_{i-1} || file_i)
As you go along, you can save the last successfully computed H_i value. In case of interruption, you can restart at file i+1. Note that, like any message digest, the above is completely sensitive to both order and content. This is something to consider on a dynamically changing file system. If files can be added or changed during the hashing operation, the meaning of the hash value will be affected. It might be rendered meaningless. You might want to be certain that both the sequence and content of the files you are hashing are frozen during the entire duration of the hash.
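The chaining scheme can be sketched independently of CryptoAPI. In the sketch below the digest function is passed in as a parameter, since standard C++ has no MD5; in practice it would wrap a real MD5 implementation. The names chainedDigest, Digest and HashFn are hypothetical, and file contents are represented as in-memory strings for brevity.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Digest = std::string;
// Assumption: the caller supplies the digest function (for example an MD5 wrapper).
using HashFn = std::function<Digest(const std::string&)>;

// Computes H_i = hash(H_{i-1} || file_i), starting from `previous`
// (empty for H_{-1}) at index `firstIndex`, and returns the final value.
Digest chainedDigest(const std::vector<std::string>& files,
                     const HashFn& hash,
                     std::size_t firstIndex = 0,
                     Digest previous = {})
{
    for (std::size_t i = firstIndex; i < files.size(); ++i) {
        // The previous finalized digest is the first data hashed for file i.
        previous = hash(previous + files[i]);
        // A real implementation would persist (i, previous) here so that an
        // interrupted run can resume at file i + 1.
    }
    return previous;
}
```

Resuming means calling chainedDigest again with the saved index plus one and the saved digest; the result matches an uninterrupted run as long as the files and their order have not changed.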