Persistent data with finer granularity than a session in django

Suppose that a staff member using a web site can exchange tickets for a customer. It is convenient to store data about the multi-view exchange in the session. But more than one exchange might be going on at the same time.
One way to keep track of the separate data in the session is to create a sub-session key and use that to access the session data. This key would need to be part of the view as a hidden input, or it would need to be in the URL. This all gets pretty messy, and the hidden-input method isn't great since redirects might occur during the exchange.
Is there a clean way to do this?

Use a database table that tracks information for a particular exchange and read/write from it when opening/submitting your wizard pages. Sessions are much more volatile by nature.
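For example, a minimal sketch of what such a table might look like as a Django model (the model and field names here are illustrative, not prescriptive):

import uuid
from django.db import models

class TicketExchange(models.Model):
    # Opaque token identifying one in-progress exchange; carry it in the
    # URL (e.g. /exchange/<token>/step-2/) so it survives redirects.
    token = models.UUIDField(default=uuid.uuid4, unique=True, editable=False)
    staff_member = models.ForeignKey('auth.User', on_delete=models.CASCADE)
    state = models.JSONField(default=dict)  # accumulated wizard data
    created_at = models.DateTimeField(auto_now_add=True)

Each wizard view looks the exchange up by its token and reads/writes state, so several concurrent exchanges for the same staff member stay isolated.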


What is the best practice to write an API for an action that affects multiple tables?

Consider the example use case below.
You need to invite a Company as your connection. The sub-actions that need to happen in this situation are:
A Company needs to be created by adding an entry to the Company table.
A User account needs to be created for the staff member to login by creating an entry in the User table.
A Staff object is created to ensure that the User has access to the Company by creating an entry in the Staff table.
The invited company is related to the invitee company, so a relation similar to friendship is created to connect the two companies by creating an entry in the Connection table.
An Invitation object is created to store the information as to who invited whom onto the system, along with other information like invitation time, invite message, etc. For this, an entry is created in the Invitation table.
An email needs to be sent to the user to accept invitation and join by setting password.
As you can see, entries are to be made in 5 tables.
Is it good practice to do all this in a single API call?
If not, what are the other options?
How do I maintain data integrity if it is split across multiple API calls?
If the actions need to be atomic, then it's definitely best to do this in a single API call. Otherwise, you run the risk of someone not completing all the tasks required and leaving the resources in a potentially conflicting state.
That said, you're not updating a single resource, so this isn't a good fit for a single RESTful resource creation call (e.g., POST /companyInvitations) -- as all these other things being created and stitched together might lead to quite a bit of confusion.
If the action you're doing is "inviting a Company", then one option is to use Google's "custom method" syntax (POST /resources/1234:action) as defined in AIP-136. In this case, you might do POST /companies/1234:invite which says "I want to invite Company #1234 to be my connection".
Under the hood, this might atomically upsert (create if resources don't already exist) all the right things that you've listed out.
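As a rough sketch of what that atomic handler could look like in Django (model names are taken from the question; invite_company and send_invitation_email are hypothetical):

from django.db import transaction

def invite_company(inviter_company, invitee_name, staff_email):
    with transaction.atomic():
        # get_or_create gives the upsert behaviour: create only if missing.
        company, _ = Company.objects.get_or_create(name=invitee_name)
        user, _ = User.objects.get_or_create(email=staff_email)
        Staff.objects.get_or_create(user=user, company=company)
        Connection.objects.get_or_create(
            from_company=inviter_company, to_company=company)
        invitation = Invitation.objects.create(
            inviter=inviter_company, invitee=company, invited_user=user)
        # Only send the email once the whole transaction has committed.
        transaction.on_commit(lambda: send_invitation_email(invitation))
    return invitation

If any step fails, the transaction rolls back and no partial state is left behind.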
Something to consider when approaching an API call where multiple things happen is how long those downstream actions take. Leaving the API call blocked while things process in the background isn't the best idea in the world.
You could consider (depending on your use case) taking in the API request, immediately responding with a 202 (Accepted) status, and dropping the request onto an internal queue for processing. When your background service picks up the request, it can update whatever needs to be updated and manage the transactions appropriately. This also caters for horizontal scaling scenarios where lots of "worker" services can be deployed to process the requests.
As part of this, you could consider adding another "status" endpoint where requests can be made to find out how things are going. To avoid lots of polling status requests, you could also take in callback details as part of the original API call, which then get called when the background processing is complete. Or you could do both!
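A minimal sketch of the accept-and-queue shape, assuming a Django view and a Celery task (process_invitation and the status route are hypothetical names):

from celery import shared_task
from django.http import JsonResponse

@shared_task
def process_invitation(company_id, inviter_user_id):
    # Do the multi-table work here, inside a transaction, then email.
    ...

def invite_company_view(request, company_id):
    # Enqueue the work and return immediately instead of blocking.
    job = process_invitation.delay(company_id, request.user.id)
    return JsonResponse(
        {"status_url": f"/invitations/{job.id}/status"},  # hypothetical route
        status=202,  # accepted; processing continues in the background
    )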

What's the best practice to implement "read receipts" on group chats in AWS AppSync and Amplify?

I'm building an Angular 11 web app using AppSync for the backend.
I've mentioned group chat, but basically it's an announcement feature: a person creates announcements for a specific audience (individual members or groups of members), and whenever a receiving user opens an announcement, it has to be marked as read for that user in their UI, and the sender has to be told that it has been opened by that particular member.
I have an idea for implementing this:
Each announcement needs a "seenBy" attribute which aggregates the user IDs of the members who open it.
Each member also has an attribute in their user object named "announcementsRead", which is an array of IDs of the announcements they have opened.
In the UI, when I gather the list of announcements for the user, the ones whose IDs are not in the member's own announcementsRead array are marked as unread.
When an announcement is clicked and opened, I make two updates: a) to the announcement object, I push the member's user ID onto the "seenBy" attribute and write it to the DB; b) to the member's user object, I add the announcement's ID to the "announcementsRead" attribute and write it to the DB.
This is just something that I came up with.
Please let me know if there are any pitfalls to this approach, or if there are simpler ways to achieve this functionality.
I have a few concerns as well:
Let's say two users open an announcement at the same time, and both clients try to update the announcement with an updated seenBy containing their user's ID. What happens when the two requests from two different clients happen concurrently? It's possible that the first user fetches the object and the second user fetches it immediately after; by the time the second user has updated the attribute and sent it back to the DB, the first user has already written their update, so the second user's write overwrites the first user's change. I am not sure of the internal mechanisms of the Amplify DataStore, but I can imagine this happening. Is this possible? If so, how do we ensure that it is prevented?
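(Aside: if the announcements end up in a DynamoDB table, one mitigation I can imagine is an atomic ADD to a string set, which appends server-side without fetching first. A hedged boto3 sketch, with the table and attribute names assumed from above:)

import boto3

table = boto3.resource("dynamodb").Table("Announcements")  # assumed name

def mark_seen(announcement_id, user_id):
    # ADD on a string set is atomic and idempotent: concurrent callers
    # each append their own ID without clobbering anyone else's.
    table.update_item(
        Key={"id": announcement_id},
        UpdateExpression="ADD seenBy :u",
        ExpressionAttributeValues={":u": {user_id}},  # one-element set
    )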
Is it really necessary for me to maintain the "announcementsRead" attribute on the user? I could generate that list in the UI each time I fetch the announcements, by checking whether the current user's ID exists in each announcement's "seenBy". That would eliminate redundant info in the DB, and it would also avoid accumulating IDs of extremely old announcements that may have since been deleted. But I'm wondering if having this on the member actually helps in an indispensable way.
Hope my questions are clear.

Handling multiple users concurrently populating a PostgreSQL database

I'm currently trying to build a web app that would allow many users to query an external API (for various reasons, I cannot retrieve all the data served by this API at regular intervals to populate my PostgreSQL database). I've read several things about ACID and MVCC, but I'm still not sure there won't be any problems if several users populate/read my PostgreSQL database at the very same time. So I'm asking for advice (I'm very new to this field)!
Let's say my users query the external API to retrieve articles. They make their search via a form; the back end gets it, queries the API, populates the database, then queries the database to return some data to the front end.
Would it be okay to simply create a single table to store the articles returned by the API when users query it?
Shall I rather store the articles returned by the API and associate each of them to the user that requested it (the Article model will contain a foreign key mapping to a User model)?
Or shall I give each user a table (data isolation would be good but that sounds very inefficient)?
Thanks for your help!
Would it be okay to simply create a single table to store the articles returned by the API when users query it?
Yes. If the articles have unique keys (doi?) you could use INSERT...ON CONFLICT DO NOTHING to handle the (presumably very rare) case that an article is requested by two people nearly simultaneously.
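A hedged sketch of that insert, assuming psycopg2 and an articles table with doi as its unique key:

import psycopg2

conn = psycopg2.connect("dbname=myapp")  # hypothetical connection string

def store_article(doi, title, body):
    # If two users trigger the same insert almost simultaneously, the
    # second one quietly does nothing instead of raising a unique violation.
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO articles (doi, title, body)
            VALUES (%s, %s, %s)
            ON CONFLICT (doi) DO NOTHING
            """,
            (doi, title, body),
        )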
Shall I rather store the articles returned by the API and associate each of them to the user that requested it (the Article model will contain a foreign key mapping to a User model)?
Do you want to? Is there a reason to? Do you care who requested each article? It sounds like you anticipate storing only the first person to request each article, and not every request.
Or shall I give each user a table (data isolation would be good but that sounds very inefficient)?
Right, you would be hitting the API a lot more often (assuming some large fraction of articles are requested more than once) and storing a lot of duplicates. It might not even solve the problem, if one person hits "submit" twice in a row, or has multiple tabs open, or writes a bot to hit your service in parallel.

How can I approach data split across multiple databases?

I'm putting together a proposal for the development of a web application.
The app is to be launched in multiple countries, and some of the client's partners and (allegedly; I'm no lawyer) some of the countries involved have rules about where personal data can be stored. The upshot is that there is a hard requirement that particular data about certain countries' users is stored on servers in that country. (It sounds like they're OK with me caching data in any country, though -- so I intend to have a Redis in-memory store in the main data centre.) Some of the data (credit card details, for example) will additionally be encrypted, but this seems to make no difference to them in terms of where it can be stored.
With the current set of requirements, users from one country won't actually ever interact with users from another country, so one obvious option is to run different instances of the application in each country, entirely self-contained. This is simpler from an architectural point of view, but harder to manage, and would have overall higher server costs. It might get complicated if for example the client wants reports on all users across all countries, or eventually they want to merge the databases, and users' primary keys have to change. Not impossible, but it'd likely be a pain.
Probably better would be to have a central database with all information the client deems it acceptable to host in a single spot (North America somewhere), and then satellite databases in each country holding the information the client needs to be kept "at home".
So the main database would have the main users table, consisting of only a PK and a country code, and would have lots of other tables. Each local database would have a "user details" table, with a foreign key (to the main users table on the main database) and a bunch of other columns of personally identifiable information, as well as username, email address, password, etc.
The client may then push to have other data stored in the satellite locations, some of which may be one-to-many with a user or many-to-many with a user.
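To make that concrete, roughly what I have in mind, sketched as Django models (field names are illustrative; as I understand it, Django won't enforce a foreign key across databases, so the satellite row would store the central PK as a plain integer):

from django.db import models

class User(models.Model):
    # Lives in the central (North American) database.
    country_code = models.CharField(max_length=2)

class UserDetails(models.Model):
    # Lives in the satellite database for the user's home country.
    user_id = models.PositiveIntegerField(unique=True)  # central User PK
    username = models.CharField(max_length=150, unique=True)
    email = models.EmailField()
    # ...password hash and other personally identifiable columns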
My questions:
How can this be handled with Django? Can it, or should I look at other frameworks?
Can the built-in User model be adapted to look in all the satellite databases for the matching user on login, and, once logged in, to retrieve the user's data from those databases without too much trouble?
Are there any guidelines you can give me to make sure code stays simple and things stay efficient?
Will this be significantly easier if the satellite database only has one-to-one data with the main User table? I imagine that having one-to-many or many-to-many data in those satellite databases would be a major pain (or at least inefficient), or am I wrong?
To answer your questions accordingly:
1. This looks like something you could do in Django (I like Django, so I may not be the most objective here); maybe the following points will convince you (or not).
2. A microservice approach? Multiple instances of the "user" microservice, each with its own database (I heard you about the costs, but maybe?).
3. You can do plenty with Django authentication backends (including writing your own); there is a "remote user" auth backend you could use as an example, and the sketch after this list shows the multi-database idea. Also read about stateless authentication (JWT).
4. Look at points 1 and 2.
5. Consider not using the built-in Django User model if it doesn't suit you.
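As a hedged sketch of point 3: a custom backend that tries each satellite database in turn (the database aliases are assumptions; you would wire it up via the AUTHENTICATION_BACKENDS setting):

from django.contrib.auth import get_user_model
from django.contrib.auth.backends import ModelBackend

SATELLITE_DBS = ["satellite_us", "satellite_de"]  # hypothetical aliases

class SatelliteBackend(ModelBackend):
    def authenticate(self, request, username=None, password=None, **kwargs):
        UserModel = get_user_model()
        for alias in SATELLITE_DBS:
            try:
                user = UserModel.objects.using(alias).get(username=username)
            except UserModel.DoesNotExist:
                continue
            if user.check_password(password) and self.user_can_authenticate(user):
                return user
        return None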

How can I push data to a user session?

I need to push changes to my app's session scopes in real time. Each user session in my app has a struct similar to this:
session.user =
{
    name = "Foo",
    mojo = "100"
};
Users can modify each others' "mojo." For example, if user Foo received 10 mojo points, and he now has 110, I need to update his session.user.mojo to reflect the additional "mojo" received. I need to modify his session struct, in other words.
Example 2: User in session 1 does something where user in session 2 receives "mojo." The session.user.mojo in session 2 needs to be updated to reflect this change.
Some info:
The initial mojo value is pulled from the database and stored in the session when a user logs in.
"Mojo" updates always take place in the database. "Mojo" stored in the session is used to govern user privileges.
What are my options? Is this even possible? I have absolutely no idea on how to do something like that.
UPDATE: I don't want to pass the updated values back to the user (the data will refresh when the user navigates between pages). I only want to change them in the appropriate user's session scope.
This answer is ColdFusion 9 specific.
Cache user data (e.g. with cachePut()) by user ID, and keep track of the user ID in the session. Every update to mojo should retrieve the user data from the cache, if present, and update it there as well. Finally, if this is a multi-server environment, set up messaging between the machines that broadcasts the user ID of any change to mojo; servers receiving the message then update their own cached user data.
What this buys you is limited database activity, pretty good liveness, and a mojo value that is available globally, which has the added benefit of being usable for purposes other than the user session (e.g. another user can view a profile and see its mojo score).
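The pattern itself is language-agnostic; here it is sketched in Python for illustration (a dict stands in for CF9's cacheGet()/cachePut(), and broadcast is a hypothetical cross-server messaging hook):

cache = {}  # user_id -> cached user data, e.g. {"name": "Foo", "mojo": 100}

def update_mojo(cursor, user_id, delta, broadcast=None):
    # The database stays the source of truth...
    cursor.execute("UPDATE users SET mojo = mojo + %s WHERE id = %s",
                   (delta, user_id))
    # ...and the cached copy, if present, is kept in step.
    if user_id in cache:
        cache[user_id]["mojo"] += delta
    # In a multi-server setup, tell the other machines to do the same.
    if broadcast:
        broadcast(user_id)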
If you really need to change vars in a particular session, there's no built-in way to do that. Maybe you can abstract the logic out: instead of reading mojo from the session, always read mojo from the DB?
update: Why session? How about a big struct in the Application scope, with userID or sessionID as key and mojo as value? You can also store a timestamp like lastUpdated and delete the entries that haven't been updated recently to reclaim memory. Then, from time to time, update your DB? Or update your DB asynchronously if you're worried about performance.