There are two approaches I've been thinking about for storing data in cookies. One way is to use one cookie for all the data and store it as a JSON string.
The other approach is to use a separate cookie for each piece of data.
The negatives I see with the first approach are that it will take up more space in the headers because of the extra JSON characters in the cookie, and that I'll have to parse and stringify the JSON, which will take a little processing time. The positive is that only one cookie is being used. Are there other positives I am missing?
The negative I see with the second approach is that there will be more cookie key/value pairs used.
There are about 15-20 cookies that I will be storing. The expires date will be the same for each cookie.
From what I understand, the limit is around 4096 bytes per cookie (and browsers typically allow on the order of 50 cookies per domain). We are not close to those limits yet.
Are there any issues I am overlooking? Which approach would be best?
Edit: these cookies are managed by JavaScript on the client.
If you hand out any data for storage to your users (which is what cookies do), you should encrypt the data, or at the very very least sign it.
This is needed to protect the data from tampering.
At this point, the size comparison between the two approaches becomes moot (padding and encoding dominate), and so does the performance overhead of parsing the JSON (encryption adds significantly greater overhead).
Conclusion: store your data as JSON, (optionally) encrypt it, sign it, encode it as base64, and store it in a single cookie. Keep in mind that there is a maximum size for cookies (roughly 4 KB).
Reference: among numerous other frameworks and applications, this is what Rails does.
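To make the conclusion concrete, here is a minimal sketch of sign-then-base64 for a JSON cookie payload, using only Python's standard library (the secret, function names, and the dot-separated token format are illustrative assumptions; and since the question's cookies are written by client-side JavaScript, signing like this only makes sense on the server that issues the cookie):

    import base64
    import hashlib
    import hmac
    import json

    SECRET = b"change-me"  # server-side secret; never exposed to the client

    def encode_cookie(data):
        """JSON-encode, base64-encode, and HMAC-sign a dict for use as a single cookie value."""
        payload = base64.urlsafe_b64encode(json.dumps(data).encode()).decode()
        signature = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        return payload + "." + signature

    def decode_cookie(value):
        """Verify the signature and return the dict, or None if the cookie was tampered with."""
        payload, _, signature = value.rpartition(".")
        expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(signature, expected):
            return None
        return json.loads(base64.urlsafe_b64decode(payload))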
A best-practice for cookies is to minimize their use. For instance, limit your cookie usage to just remembering the session id, and then store your data on the server side.
In the EU, cookies are subject to legal regulations, and using cookies for almost anything but session ids requires explicit client consent.
Good morning.
I think I understand your situation. Some time ago I used cookies stored as encrypted JSON data, but only for intranet or administration accounts; for shop users I used this same practice. However, to store products on the shop site, I don't use encryption.
Important: sometimes I have had problems with JSON decoding before decrypting the data. Depending on your use case, you can adopt a scheme that stores the data separated by ; and : and then encrypts it, like:
encrypt_function($key, "product:K10072;qtd:1|product:1042;qtd:1|product:3790;qtd:1") to store products; and
encrypt_function($key, "cad_products:1;mdf_products:2;cad_collabs:0") to store security grants.
Any system can be hacked. You need to build an application with constant verification of user data and log analysis. That system, yes, needs to be fast.
I'm new to learning about Django sessions (and Django in general). It seems to me that request.session functions like a dictionary, but I'm not sure how much data I can save on it. Most of the examples I have looked at so far have been using request.session to store relatively small data such as a short string or integer. So is there a limit to the amount of data I can save on a request.session or is it more related to what database I am using?
Part of the reason why I have this question is because I don't fully understand how the storage of request.session works. Does it work like another Model? If so, how can I access the keys/items on the admin page?
Thanks for any help in advance!
In short: it depends on the backend you use, which you specify with the SESSION_ENGINE setting [Django-doc]. The backends can be (but are not limited to):
'django.contrib.sessions.backends.db'
'django.contrib.sessions.backends.file'
'django.contrib.sessions.backends.cache'
'django.contrib.sessions.backends.cached_db'
'django.contrib.sessions.backends.signed_cookies'
Depending on how each backend is implemented, different maximums are applied.
Furthermore, the SESSION_SERIALIZER setting matters as well, since this determines how the data is encoded. There are two built-in serializers:
'django.contrib.sessions.serializers.JSONSerializer'; and
'django.contrib.sessions.serializers.PickleSerializer'.
Serializers
The serializer determines how the session data is converted to a byte stream, and thus has some impact on the size of the stored data.
For the JSONSerializer, Django makes a JSON dump that is then base64-encoded and signed with HMAC/SHA1. Base64 is an encoding rather than a compression, so expect roughly 33% overhead compared to the original JSON blob.
The PickleSerializer first pickles the object, then base64-encodes and signs it as well. Pickling tends to be less compact than JSON encoding, but on the other hand it can serialize objects that are not just dictionaries, lists, etc.
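For illustration, a minimal settings.py sketch wiring up one backend and one serializer from the lists above (the cookie age shown is simply Django's default):

    # settings.py (sketch)
    SESSION_ENGINE = "django.contrib.sessions.backends.db"   # store session data in the database
    SESSION_SERIALIZER = "django.contrib.sessions.serializers.JSONSerializer"
    SESSION_COOKIE_AGE = 60 * 60 * 24 * 14                   # two weeks, the Django default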
Backends
Once the data is serialized, the backend determines where it is stored. Some backends have limitations.
django.contrib.sessions.backends.db
Here Django uses a database model to store session data. If the database can store text values up to 4 GiB (like MySQL's LONGTEXT, for example), then after the ~33% base64 overhead it can probably store JSON blobs of up to roughly 3 GiB per session. Note that of course there should be sufficient disk space to store the table.
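As a side note on the question about inspecting the stored keys/items: with the db backend the data lives in the django_session table, and a sketch of reading it back (assuming the default model) looks like this:

    from django.contrib.sessions.models import Session

    # e.g. from `python manage.py shell`
    session = Session.objects.first()
    print(session.session_key, session.expire_date)
    print(session.get_decoded())   # the dict you stored via request.session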
django.contrib.sessions.backends.file
Here the data is written to a file. There are no limitations implemented, but of course there should be sufficient disk space. Some operating systems also limit how much disk space the files in a directory can allocate.
django.contrib.sessions.backends.cache
Here the data is stored in one of the caches you specified in the CACHES setting [Django-doc]; depending on the cache system you pick, certain limitations can apply.
django.contrib.sessions.backends.cached_db
Here you use a combination of cache and db: you use the cache, but the data is backed by the database, such that if the cache is invalidated, the database still contains the data. This thus means that the limitations of both backends apply.
django.contrib.sessions.backends.signed_cookies
Here you store signed cookies in the client's browser. The limitations on the cookies are imposed by the browser.
RFC 2965 on the HTTP State Management Mechanism specifies that a browser should normally be capable of storing at least 4096 bytes per cookie. But with the signing and encoding overhead included, that threshold may turn out not to be sufficient.
If you use the cookies of the browser, you thus can only store very limited amounts of data.
I am trying to design a service to send emails to users. This service is pretty much similar to Amazon SES.
One of the requirements is to keep track of all the emails that this system will send. I am confused about how to design this so that I can keep each sent email associated with the parent user (known at the time of sending) who sent it.
If I start dumping all the email-related data into a relational DB, it will grow exponentially over time and create a lot of problems. Similarly, if I store these things in Cassandra, it will grow at a good speed and create problems.
Need for storing this information:
1) In the future I need to know whether an email was sent to a particular user, and when.
2) If the feedback loop produces a complaint mail, I will need to map it back to a particular email id (which will be present in the complaint email) and to the parent user who sent it (which will be stored at the time the email was sent).
Can someone give me pointers on how to store this data, or build some cache, in a way that achieves this?
It's unlikely to grow "exponentially." Seems like it will grow linearly. Regardless, if you need the ability to look up who sent what to whom, then you have no choice but to store it.
What you need to do is estimate how many emails you send per day, and how much data you need to save with each of those emails. Do the math and determine how much data you expect to be generating each day. Then at least you can figure out how large your database will get over time.
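As a back-of-the-envelope example (the numbers here are made up, not taken from the question):

    # Rough storage estimate with illustrative numbers.
    emails_per_day = 1_000_000
    bytes_per_record = 1_000          # metadata plus a short body, say ~1 KB
    days = 365

    total_bytes = emails_per_day * bytes_per_record * days
    print(total_bytes / 1024 ** 3, "GiB per year")   # roughly 340 GiB per year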
You'll also need to consider how you want to index the data. Seems like you'll want to index by email id, at least. You might also want to index by sender, and also possibly by recipient. Those indexes will create additional per-email data storage requirements. How much is something you'll have to determine through analysis.
How much actual disk space this will occupy per email is hard to determine. If the messages are short, you could probably get more than a million emails per gigabyte in a relational database. You could potentially do much better than that if you compress the message data, or apply other techniques that take advantage of similarities in the messages. For example, if you send the exact same message to a thousand recipients, you can store a single copy of the message and just store a reference to that message in the individual email records.
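A small Python sketch of that deduplication idea, with in-memory dicts standing in for hypothetical database tables:

    import hashlib

    bodies = {}        # body_hash -> message text   (stands in for a "message_bodies" table)
    sent_emails = []   # one record per recipient    (stands in for a "sent_emails" table)

    def record_send(email_id, sender_id, recipient, subject, body):
        # Store each distinct message body once, keyed by its content hash.
        body_hash = hashlib.sha256(body.encode()).hexdigest()
        bodies.setdefault(body_hash, body)
        # The per-recipient record keeps only a reference to that body.
        sent_emails.append({
            "email_id": email_id,      # the id that later appears in complaint mails
            "sender_id": sender_id,    # the parent user who sent it
            "recipient": recipient,
            "subject": subject,
            "body_hash": body_hash,
        })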
You might also want to consider how long you need to store each message. Do you need to store everything forever, or can you periodically remove all messages that are older than a year (or some other relatively long amount of time)?
Good day, I've implemented a REST service. In the URL of a resource end-point I use IDs which are primary keys of tables in the database, for example http://host/myapp/items/item/4. I've learned that using the database ID in the URL is a bad practice and that I should use a UUID instead. On the other hand, I've learned that using UUIDs in indexes is a performance issue if there are many records in the database, because they are not sequential (1, 2, 3, ...). So I've got an idea to encrypt the database ID. This is how it could work:
1) Client POSTs an item to `http://host/myapp/items`.
2) The back-end creates a new item in the database.
3) Autoincremented ID '4' is generated by the database.
4) The back-end encrypts the ID '4' to 'fa4ce3178a045b2a' using a cipher key and returns the encrypted ID of the created resource.
And then:
5) Client sends a request to GET `http://myapp/items/item/fa4ce3178a045b2a`.
6) The back-end decrypts 'fa4ce3178a045b2a' to '4' using the cipher key.
7) The back-end fetches item with primary key '4' and sends it to the client.
What are the cons of such a solution? Will the encryption/decryption be fast enough that it isn't worse than using a UUID? And what encryption algorithm should I use so that it is fast and doesn't consume many resources? Could someone more experienced advise or recommend a better solution? Thank you in advance. Vojtech
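A minimal sketch of steps 4 and 6, assuming a symmetric block cipher (AES via the `cryptography` package here; the question does not name an algorithm, and encrypting a single 16-byte block like this is only meant to show the shape of the idea, not a vetted design):

    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    KEY = b"0123456789abcdef"   # 16-byte demo key; load a real key from configuration

    def encrypt_id(item_id):
        block = item_id.to_bytes(16, "big")   # pad the integer to one AES block
        encryptor = Cipher(algorithms.AES(KEY), modes.ECB()).encryptor()
        return (encryptor.update(block) + encryptor.finalize()).hex()

    def decrypt_id(token):
        decryptor = Cipher(algorithms.AES(KEY), modes.ECB()).decryptor()
        block = decryptor.update(bytes.fromhex(token)) + decryptor.finalize()
        return int.from_bytes(block, "big")

    token = encrypt_id(4)        # e.g. a 32-character hex string for the URL
    assert decrypt_id(token) == 4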
Yes, plain IDs are sometimes undesirable: a client can predict and generate the links, can estimate how big your database is, etc. But sometimes it doesn't matter, and then IDs are perfectly OK.
The fastest ciphers are symmetric ones. For details you have to find or run some benchmarks; an example is here: http://stateless.geek.nz/2004/10/13/scp-performance/
But I don't think anyone can tell you whether the encryption approach will be faster than using UUIDs. It depends on your database size, the indexes you use, caching, hardware, etc. Do the performance tests. If speed is critical for you, you can consider keeping a translation map/table (uuid -> id) in memory.
I don't think we can predict which is faster: using UUID in your database or encrypting and decrypting the ids. It can depend on the type of the database, the computer the database is on and the actual request as well.
For example, when you want to list many resources and add links to the detailed views, you have to encrypt the id of each resource in order to compose the response. With a long list this can take much longer than a slightly slower select, so I would not use it.
I don't think this is a real bottleneck; the HTTP communication is the bottleneck, so in order to make things faster you should consider setting up HTTP caching properly instead. By the way, if you really want to encrypt your ids, you should measure the speeds instead of asking us to guess them.
Suppose that a staff member using a web site can exchange tickets for a customer. It is convenient to store data about the multi-view exchange in the session. But more than one exchange might be going on at the same time.
One way to keep track of the separate data in the session is to create a sub-session key and use that to access the session data. This key would need to be part of the view as a hidden input or it would need to be in the URL. This all gets pretty messy and the hidden variable method isn't great since redirects might occur during the exchange.
Is there a clean way to do this?
Use a database table that tracks information for a particular exchange and read/write from it when opening/submitting your wizard pages. Sessions are much more volatile by nature.
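A minimal sketch of such a table, shown as a Django-style model purely for concreteness (the question does not name a framework, and all names here are made up):

    from django.db import models

    class Exchange(models.Model):
        """One in-progress ticket exchange; its primary key goes into the URL
        (e.g. /exchanges/42/step-2/) instead of hiding a sub-session key in the session."""
        staff_member = models.CharField(max_length=100)
        customer = models.CharField(max_length=100)
        state = models.JSONField(default=dict)        # partial data gathered across the views
        created_at = models.DateTimeField(auto_now_add=True)
        completed = models.BooleanField(default=False)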
I run a REST service on AppEngine (which may not be relevant). Each REST request is accompanied by a user id and password, and at the beginning of each request I hash the password to see if it matches my records before proceeding.
This is working well, theoretically, but in practice I get bursts of requests from users - 4 or 5 a second. It's taking BCrypt 500ms to hash the password for each request! What a waste!
Clearly, I don't want to optimize the BCrypt time down. Are there standard practices for caching hashes? Would memcache be a safe place to store a table of recently hashed passwords and their hashes? I guess at that point I might as well store the users' plain-text passwords in Memcache. I'd love to do a 3ms memcache lookup instead of a 500ms hash, but security is paramount. Would it make more sense to implement some sort of session abstraction?
Thanks for any advice!
Edit for extra context: this is a gradebook application that stores sensitive student data (grades). Teachers and students log in from everywhere, including over wifi, etc. Every successful request is sent over https.
The usual rule of thumb with REST APIs is to have them remain fully stateless, however, as with goto there is a time and a place depending on your requirements. If you're not averse to the idea of having the client store a temporary session key which you only need to regenerate occasionally then you might try that out.
When a client makes a request, check whether they're sending a session key variable along with the user ID and password. If they are not, generate one based on the current server time and their password, then pass it back to the client. Store it in your own database along with its creation time. Then, when a client makes a request that includes a session key, you can verify it by directly comparing it to the session key stored in your database without requiring a hash. As long as you invalidate the key every few hours, it shouldn't be much of a security concern. Then again, if you're currently sending the password and user ID in the clear then you already have security issues.
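A rough sketch of that flow in Python (the in-memory dict stands in for a datastore table, and I generate the key with a random token rather than deriving it from time and password, which is a deliberate substitution):

    import datetime
    import secrets

    SESSION_TTL = datetime.timedelta(hours=3)
    sessions = {}   # session_key -> (user_id, created_at); stands in for a datastore table

    def issue_session_key(user_id):
        key = secrets.token_urlsafe(32)
        sessions[key] = (user_id, datetime.datetime.utcnow())
        return key

    def check_session_key(user_id, key):
        entry = sessions.get(key)
        if entry is None:
            return False
        owner, created = entry
        if owner != user_id or datetime.datetime.utcnow() - created > SESSION_TTL:
            sessions.pop(key, None)   # expired or mismatched: force a full login again
            return False
        return True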
Your best bet given your current approach is to keep a mapping of attempted passwords to their bcrypted form in memcache. If you're concerned for some reason about storing a plaintext password in memcache, then use an md5 or sha1 hash of the attempted password as a key instead.
The extra step isn't really necessary. Items stored in memcache don't leak to other apps.
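A minimal sketch of the caching idea from the answer above, assuming the `bcrypt` package and App Engine's Python memcache client (the key format and TTL are made-up choices):

    import hashlib
    import bcrypt
    from google.appengine.api import memcache

    CACHE_SECONDS = 15 * 60   # re-verify against the real bcrypt hash every 15 minutes

    def check_password(user_id, password, stored_bcrypt_hash):
        # Key the cache on a fast hash of the attempt rather than the plaintext itself.
        attempt_digest = hashlib.sha256(password.encode()).hexdigest()
        cache_key = "pwok:%s:%s" % (user_id, attempt_digest)
        if memcache.get(cache_key):
            return True                          # ~3 ms hit instead of a ~500 ms bcrypt check
        ok = bcrypt.checkpw(password.encode(), stored_bcrypt_hash)
        if ok:
            memcache.set(cache_key, True, time=CACHE_SECONDS)
        return ok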