Find a User by name quickly - c++

I have a C++ server that manages Users for a game. These Users have unique AccountIDs and almost every look-up for Users on the server involves finding a User from a global map of
std::map<unsigned int, User*>
where unsigned int is the AccountID. This works great except for a new case: I am implementing a friends list, and adding a friend to someone's friends list needs to be done by Username. I run into the same problem when inviting people by Username to a chatroom or other "party"-type events.
My two current options are:
1) Iterate through the entire Users map, doing a string comparison by Username.
2) Do a database look-up on an indexed Username column and return the AccountID, then do a map find for the User*.
Both of these solutions are very inefficient. I am looking for a more optimized solution of finding a User by Username.
The first idea that comes to mind is a Hashtable that hashes on the Username, but then I have two different data structures (the Hashtable and the Map) that are doing the same thing except one is by AccountID and one is by name.
A second option could be to use the Username as the key for the map, although I can't imagine having a string for a key being too efficient.
Any suggestions on what I should do here? As for some more information on the server, there will be around 1000+ Users and they will be leaving and joining constantly.

C++11 has std::unordered_map which will automagically handle hashing for you, e.g. std::unordered_map<std::string, User*>.

I would suggest just using another map, std::map<std::string, User*>. I believe that for an application with ~1000 users, hash maps or more complicated solutions are over-engineering. The string-based lookup in a map will not be that expensive; it is practically zero compared to a lookup in the database.
You may also be able to use the by-product, an alphabetically sorted collection of users, somewhere else.
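Both answers amount to keeping a second index keyed by Username next to the existing AccountID map. A minimal sketch of keeping the two in sync as users join and leave, assuming a hypothetical User struct with accountId and username fields and a hypothetical UserIndex wrapper (neither name comes from the question):

```cpp
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical minimal User; the real class will have more fields.
struct User {
    unsigned int accountId;
    std::string username;
};

// Non-owning index: the caller still owns the User objects, as with the
// original std::map<unsigned int, User*>.
class UserIndex {
public:
    // Registers the user under both keys; returns false if either key is taken.
    bool add(User* user) {
        if (byId_.count(user->accountId) || byName_.count(user->username)) return false;
        byId_[user->accountId] = user;
        byName_[user->username] = user;
        return true;
    }

    // Removes the user from both indexes; returns false if the ID is unknown.
    bool remove(unsigned int accountId) {
        auto it = byId_.find(accountId);
        if (it == byId_.end()) return false;
        byName_.erase(it->second->username);
        byId_.erase(it);
        return true;
    }

    User* findById(unsigned int accountId) const {
        auto it = byId_.find(accountId);
        return it == byId_.end() ? nullptr : it->second;
    }

    User* findByName(const std::string& username) const {
        auto it = byName_.find(username);
        return it == byName_.end() ? nullptr : it->second;
    }

private:
    std::map<unsigned int, User*> byId_;
    std::unordered_map<std::string, User*> byName_;
};
```

Routing every join and leave through one class is what keeps the two structures from drifting apart. Swapping std::unordered_map for std::map on the name index only changes the container type.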

Related

User Friendly Unique Identifier For DynamoDB

In my DynamoDB table named users, I need a unique identifier, which is easy for users to remember.
In an RDBMS I can use an auto-increment ID to meet the requirement.
As there is no way to have an auto-increment ID in DynamoDB, is there a way to meet this requirement?
I could keep the last used ID in another table (lastIdTable), retrieve it before adding a new document, increment it, and save the updated number in both tables (lastIdTable and users), but that would be very inefficient.
UPDATE
Please note that there's no way of using an existing attribute or getting users input for this purpose.
Since it seems you must create a memorable userId without any information about the user, I’d recommend that you create a random phrase of 2-4 simple words from a standard dictionary.
For example, you might generate the phrase correct horse battery staple. (I know this is a userId and not a password, but the memorability consideration still applies.)
Whether you use a random number (which has similar memorability to a sequential number) or a random phrase (which I think is much more memorable), you will need to do a conditional write with the condition that the ID does not already exist, and if it does exist, you should generate a new ID and try again.
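The generate-then-retry loop can be sketched as follows. This is only an illustration under stated assumptions: FakeUserTable is a hypothetical in-memory stand-in for the users table, and its putIfAbsent method mimics a conditional write that succeeds only when the ID does not exist yet. It is not a DynamoDB API. The word list, the '-' separator, and the attempt limit are also assumptions.

```cpp
#include <cstddef>
#include <random>
#include <stdexcept>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the users table. insert() on a set fails when the
// key already exists, which mimics a write conditioned on the ID being absent.
class FakeUserTable {
public:
    bool putIfAbsent(const std::string& userId) {
        return ids_.insert(userId).second;
    }

private:
    std::unordered_set<std::string> ids_;
};

// Joins wordCount randomly chosen words with '-'. words must not be empty.
template <typename Rng>
std::string randomPhrase(const std::vector<std::string>& words,
                         std::size_t wordCount, Rng& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, words.size() - 1);
    std::string phrase;
    for (std::size_t i = 0; i < wordCount; ++i) {
        if (i > 0) phrase += '-';
        phrase += words[pick(rng)];
    }
    return phrase;
}

// Generates candidate IDs until the conditional write succeeds
// or the attempt limit is reached.
template <typename Rng>
std::string createUserId(FakeUserTable& table,
                         const std::vector<std::string>& words,
                         Rng& rng, int maxAttempts = 10) {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        std::string id = randomPhrase(words, 3, rng);
        if (table.putIfAbsent(id)) return id;
    }
    throw std::runtime_error("could not find a free user ID");
}
```

The check and the write have to be one atomic operation on the real table. A separate read followed by a write would let two clients claim the same ID.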
email address seems the best choice...
Either as a partition key, or use a GUID as the partition key and have a Global Secondary Index over email address.
Or as Matthew suggested in a comment, let the users pick a user name.
Docker container naming strategy might give you some idea. https://github.com/moby/moby/blob/master/pkg/namesgenerator/names-generator.go
It will result in unique (within a limited pool) yet human-friendly names.
Examples
awesome_einstein
nasty_weinstein
perv_epstein
A similar one: https://github.com/jjmontesl/codenamize

How do you implement multi-tenancy on CouchBase? Can it be performant?

I'm considering an app which will store customer data. Given the way buckets work in CouchBase, all customer data will be in one bucket. It appears that I have two choices:
Implement multi-tenancy in views, by assigning a field to each record that indicates the customer it belongs to.
Implement it by putting a factor on every key that is a customer ID.
It seems, though, that since I will be using views, I'll really want to do both. In case number 2, I need to have the data in the record so that it can be indexed on (or maybe I can pull out part of the key in the map phase and index on customer) and in option 1, I'd want it to be part of the key as a check when retrieving data to make sure I don't send the wrong customer's data down the line.
The problem is, this is a service where multiple customers will interact, and sometimes one customer will create some data and the other will view it, at the first customer's request. But putting an ACL on each record that lists everyone who's authorized to view it would be problematic, to say the least.
I bet there is a common methodology or design pattern to answer this question, and would appreciate some pointers to best practices.
I'm also concerned about the performance if the indexes are indexing both on the particular piece of relevant data, and the customer id... a large number of different customers would presumably make the indexes much less efficient. (but maybe not.)
Here are my thoughts on your questions:
[Concerning items #1 and 2] - It seems, though, that since I will be using views, I'll really want to do both.
This doesn't seem to make sense to me. In Couchbase, the map phase can include content from both the key and the value. It makes little sense to store the data in both the key and the value, as you are guaranteed to have 1:1 duplication there. Store it wherever it makes the most sense to store it; in this case, probably the value.
The problem is, this is a service where multiple customers will interact, and sometimes one customer will create some data and the other will view it, at the first customer's request. But putting an ACL on each record that lists everyone who's authorized to view it would be problematic, to say the least.
My site also has multi-tenant data stored in a single database. In my case, I use object unique identifiers as my keys. By default, customers can access all objects that belong to them (I have a user object, and the user is associated with a customer account). Users may also have additional permissions assigned to them, whereby a single object from another customer could be added to their user account, and they would thereby be granted access to view the object.
The alternative is "security through obscurity": use GUIDs as random identifiers, giving customers access to view any object that they have the GUID for.
I would not, however, try to store the permissions on the objects themselves. That would quickly become unwieldy. You need to think about your specific use case, and decide what simple approach would work for the majority of the cases, and just not support the other 1-2% of the cases.

How to generate unique keys for caching items in ColdFusion

I posted a similar question over on the Adobe Community forums, but it was suggested to ask over here as well.
I'm trying to cache distinct queries associated with a particular database, and need to be able to flush all of the queries for that database while leaving other cached queries intact. So I figured I'd take advantage of ColdFusion's ehcache capabilities. I created a specific cache region to use for queries from this particular database, so I can use cacheRemoveAll(myRegionName) to flush those stored queries.
Since I need each distinct query to be cached and retrievable easily, I figured I'd hash the query parameters into a unique string that I would use for the cache key for each query. Here's the approach I've tried so far:
Create a Struct containing key value pairs of the parameters (parameter name, parameter value).
Convert the Struct to a String using SerializeJSON().
Hash the String using Hash().
Does this approach make sense? I'm wondering how others have approached cache key generation. Also, is the "MD5" algorithm adequate for this purpose, and will it guarantee unique key generation, or do I need to use "SHA"?
UPDATE: use cacheRegion attribute introduced in CF10!
http://help.adobe.com/en_US/ColdFusion/10.0/CFMLRef/WSc3ff6d0ea77859461172e0811cbec22c24-7fae.html
Then all you need to do is specify cachedAfter or cachedWithin, and forget about how to generate unique keys. CF will do it for you by 'hashing':
query "Name"
SQL statement
Datasource
Username and password
DBTYPE
reference: http://www.coldfusionmuse.com/index.cfm/2010/9/19/safe.caching
I think this would be the easiest, unless you really need to fetch a specific query by a key; then you can feed your own hash using cacheID, another new attribute introduced in CF10.

Unique alpha-numeric string from a unique integer? (masking IDs in a game server?)

I have a multiplayer mobile game out in the wild, it's backed by a sql database. Each game gets an ID which is just an auto-increment field. I can look up a game with a url like:
http://www.example.com/gameId=123
That url is not visible to players at the moment, but I was thinking of displaying it so users can invite friends and let non-players look on in the game as they play (through a browser - at the moment everyone plays through a native app).
But the fact that I'm putting the game ID out there in the open seems like a bad idea. If someone guessed an endpoint for say deleting a game, they could do bad stuff knowing the ID (of course my endpoints are protected by user auth, but still).
Do most services mask IDs of this sort? Should I send out a URL like:
http://www.example.com/gameId=maskedIdAbc
and then my game server has to translate that ID into the corresponding ID in my database?
Not sure if that's overkill. If not, what's a good way to generate a unique alpha-numeric string based off a unique integer?
Thanks
Why not change the primary key of the game from an incremental ID to a GUID? The game is out in the wild but you should be able to get there in a number of steps. Add the Guid as a Field and allow games to be looked up either by ID or GUID. Update your clients to use the GUID, phase out the ID, and finally change the primary key to be the GUID.
You could hash the int, or even use the hex, but it's breakable. Better to implement a complete fix. If you don't want to use a GUID, you could generate your own random characters and store them against each DB record, but why go to the trouble when GUIDs are usually natively supported by databases?
If the range of the integers is not big, you may define a table with unique, random alphanumeric strings. I think it's the best way.
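The lookup-table idea can be sketched in memory as follows. This is a hedged illustration, not anyone's production scheme: GameTokenTable, the 12-character token length, and the lowercase-plus-digits alphabet are all assumptions, and in the real system the mapping would live in a database column with a unique index. std::mt19937 is not a cryptographically secure generator, so a real deployment that relies on tokens being unguessable would need a secure random source instead.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <unordered_map>

// Maps public random tokens to internal integer game IDs.
class GameTokenTable {
public:
    // Creates a fresh token for gameId, stores the mapping, and returns the token.
    std::string issue(int gameId) {
        for (;;) {
            std::string token = randomToken(12);
            // emplace fails if the token already exists, so retry on collision.
            if (tokenToId_.emplace(token, gameId).second) return token;
        }
    }

    // Returns true and sets gameId if the token is known.
    bool resolve(const std::string& token, int& gameId) const {
        auto it = tokenToId_.find(token);
        if (it == tokenToId_.end()) return false;
        gameId = it->second;
        return true;
    }

private:
    std::string randomToken(std::size_t length) {
        static const std::string alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
        std::uniform_int_distribution<std::size_t> pick(0, alphabet.size() - 1);
        std::string token;
        for (std::size_t i = 0; i < length; ++i) token += alphabet[pick(rng_)];
        return token;
    }

    // Seeded once from std::random_device; not suitable for security-critical use.
    std::mt19937 rng_{std::random_device{}()};
    std::unordered_map<std::string, int> tokenToId_;
};
```

Because the token is random rather than derived from the integer, knowing one token says nothing about the ID of the next game.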
I had a similar situation and did not want to use the gameID (using your example here) in the URL, as someone can try any number. I can still use the IDs, but need to add additional checks to authorize the users.
You could use a UUID to generate gameIDs, but I see a few problems with this:
- a non-numeric ID will have an impact on performance
- if this is the primary key and you want to use it as an FK in other tables, it takes more space
What I did:
In addition to gameID in my table, I added another column, WebGameID varchar(32). After the game ID was generated, I set WebGameID = MD5(gameID). This will be a unique 32-character string for the specific gameID. With this I was able to use gameID for internal keys and FKs and only use WebGameID in the URL, to limit user manipulation.

Storing and Searching Large Data Set

I'm relatively new to programming in C++ and I'm trying to create a data set that just has two values: an ID number and a string. There will be about 100,000 pairs of these. I'm just not sure what data structure would best suit my needs.
The data set has the following requirements:
-the ID number corresponding to a string is 6 digits (so 000000 to 999999)
-not all ID values between 000000 and 999999 will be used
-the user will not have permission to modify the data set
-I wish to search by ID or words in the String and return to the user ID and String
-speed of searching is important
So basically I'm wondering what I should be using (vector, list, array, SQL database, etc) to construct this data set and quickly search it?
the ID number corresponding to a string is 6 digits (so 000000 to 999999)
Good, use an int, or more precisely int32_t for the ID
-not all ID values between 000000 and 999999 will be used
Not a problem...
-the user will not have permission to modify the data set
Encapsulate your data within a class and you are good to go
-I wish to search by ID or words in the String and return to the user ID and String
Good, use Boost.Bimap
-speed of searching is important
I know, that's why you are using C++... :-)
You may also want to check SQLite, which can also function as an in-memory database.
use std::map
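#include <map>
#include <string>

int main()
{
    // The 6-digit ID is kept as a string so leading zeros are preserved.
    std::map<std::string /*id*/, std::string> m;
    m["000000"] = "any string you want";
}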
Vectors and lists are the worst choice if you don't sort them, since you don't want to loop through every element.
I suggest you use a map, even though building the entire map might take longer, O(n log n). I still recommend it, since a search takes O(log n), which is pretty fast!
"speed of searching is important"
I'd suggest something like a class which contains a vector of your id/string pairs, an unordered_map which maps id to an iterator or reference into that vector, and an unordered_map which maps a string to an iterator or reference into that vector. Then, two search functions in the class which look up the id/string pair based on the id or a string.
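A minimal sketch of that class, with some assumptions made for illustration: the names Entry and Dataset are hypothetical, the maps store indexes into the vector rather than iterators, and the string lookup matches the whole string exactly. Searching for individual words inside the strings would need an additional word-to-entries index.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Entry {
    std::int32_t id;
    std::string text;
};

// Read-only data set: built once, then only searched.
class Dataset {
public:
    explicit Dataset(std::vector<Entry> entries) : entries_(std::move(entries)) {
        for (std::size_t i = 0; i < entries_.size(); ++i) {
            byId_[entries_[i].id] = i;
            // If two entries share the same text, the later one wins.
            byText_[entries_[i].text] = i;
        }
    }

    // Returns nullptr when the ID is not present.
    const Entry* findById(std::int32_t id) const {
        auto it = byId_.find(id);
        return it == byId_.end() ? nullptr : &entries_[it->second];
    }

    // Returns nullptr when no entry has exactly this text.
    const Entry* findByText(const std::string& text) const {
        auto it = byText_.find(text);
        return it == byText_.end() ? nullptr : &entries_[it->second];
    }

private:
    std::vector<Entry> entries_;
    std::unordered_map<std::int32_t, std::size_t> byId_;
    std::unordered_map<std::string, std::size_t> byText_;
};
```

Both lookups are average-case constant time. Exposing only const accessors also covers the requirement that users cannot modify the data set.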
You have a couple of options.
Use a database such as MySQL or SQLite. Performance depends on the database you use.
Or, if you want to do it in C++ code, you can use vectors: one vector for the keys and another for the strings. You also need to map the related indexes between the two vectors.
Sort both vectors after adding a new item, and remember to update the map of related indexes.
Then use binary search to find either a key or a value. It should be fast enough.
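The sorted-vector approach can be sketched as below. It is a slight variation on the description above, made for illustration: the records sit in one vector sorted by ID, and the second vector holds positions into the first, sorted by string, which takes the place of the separate index map. The names Record and SortedDataset are hypothetical, and the data is sorted once at construction because the data set is read-only.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Record = std::pair<int, std::string>;  // (id, text)

class SortedDataset {
public:
    explicit SortedDataset(std::vector<Record> records) : byId_(std::move(records)) {
        // Sort the records by ID once; the data set is read-only afterwards.
        std::sort(byId_.begin(), byId_.end(),
                  [](const Record& a, const Record& b) { return a.first < b.first; });
        // Second vector: positions into byId_, ordered by text.
        byText_.resize(byId_.size());
        for (std::size_t i = 0; i < byText_.size(); ++i) byText_[i] = i;
        std::sort(byText_.begin(), byText_.end(),
                  [this](std::size_t a, std::size_t b) {
                      return byId_[a].second < byId_[b].second;
                  });
    }

    // Binary search by ID; returns nullptr when not found.
    const Record* findById(int id) const {
        auto it = std::lower_bound(byId_.begin(), byId_.end(), id,
                                   [](const Record& r, int key) { return r.first < key; });
        return (it != byId_.end() && it->first == id) ? &*it : nullptr;
    }

    // Binary search by exact text; returns nullptr when not found.
    const Record* findByText(const std::string& text) const {
        auto it = std::lower_bound(byText_.begin(), byText_.end(), text,
                                   [this](std::size_t i, const std::string& key) {
                                       return byId_[i].second < key;
                                   });
        return (it != byText_.end() && byId_[*it].second == text) ? &byId_[*it] : nullptr;
    }

private:
    std::vector<Record> byId_;
    std::vector<std::size_t> byText_;
};
```

Each lookup is O(log n), the same as std::map, but the contiguous storage tends to be friendlier to the cache for a data set that never changes.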