I'm encountering an issue where I have a function that is intended to require serialized access dependent on some circumstances. This seemed like a good case for using advisory locks. However, under fairly heavy load, I'm finding that the serialized access isn't occurring and I'm seeing concurrent access to the function.
The intention of this function is to provide "inventory control" for a event. Meaning, it is intended to limit concurrent ticket purchases for a given event such that the event is not oversold. These are the only advisory locks used within the application/database.
I'm finding that occasionally there are more tickets in an event than the eventTicketMax value. This doesn't seem like it should be possible because of the advisory locks. When testing with low volume (or manually introduced delays such as pg_sleep after acquiring the lock), things work as expected.
CREATE OR REPLACE FUNCTION createTicket(
userId int,
eventId int,
eventTicketMax int
) RETURNS integer AS $$
DECLARE insertedId int;
DECLARE numTickets int;
BEGIN
-- first get the event lock
PERFORM pg_advisory_lock(eventId);
-- make sure we aren't over ticket max
numTickets := (SELECT count(*) FROM api_ticket
WHERE event_id = eventId and status <> 'x');
IF numTickets >= eventTicketMax THEN
-- raise an exception if this puts us over the max
-- and bail
PERFORM pg_advisory_unlock(eventId);
RAISE EXCEPTION 'Maximum entries number for this event has been reached.';
END IF;
-- create the ticket
INSERT INTO api_ticket (
user_id,
event_id,
created_ts
)
VALUES (
userId,
eventId,
now()
)
RETURNING id INTO insertedId;
-- update the ticket count
UPDATE api_event SET ticket_count = numTickets + 1 WHERE id = eventId;
-- release the event lock
PERFORM pg_advisory_unlock(eventId);
RETURN insertedId;
END;
$$ LANGUAGE plpgsql;
Here's my environment setup:
Django 1.8.1 (django.db.backends.postgresql_psycopg2 w/ CONN_MAX_AGE 300)
PGBouncer 1.7.2 (session mode)
Postgres 9.3.10 on Amazon RDS
Additional variables which I tried tuning:
setting CONN_MAX_AGE to 0
Removing pgbouncer and connecting directly to DB
In my testing, I have noticed that, in cases where an event was oversold, the tickets were purchased from different webservers so I don't think there is any funny business about a shared session but I can't say for sure.
As soon as PERFORM pg_advisory_unlock(eventId)is executed, another session can grab that lock, but as the INSERT of session #1 is not yet commited, it will not be counted in the COUNT(*)of session #2, resulting in the over-booking.
If keeping the advisory lock strategy, you must use transaction-level advisory locks (pg_advisory_xact_lock), as opposed to session-level. Those locks are automatically released at COMMIT time.
Related
Hi everyone,
I'm a little bit lost with a problem thinking in ddd way.
Imagine you have an application to sell concert ticket. So you have an entity which is called Concert with the quantity number and a method to buy a ticket.
class Concert {
constructor(
public id: string,
public name: string,
public ticketQuantity: number,
) {}
buyTicket() {
this.ticketQuantity = this.ticketQuantity - 1;
}
}
The command looks like this:
async execute(command: BookConcertCommand): Promise<void> {
const concert = await this.concertRepository.findById(command.concertId);
concert.buyTicket();
await this.concertRepository.save(concert);
}
Imagine, your application has to carry a lot of users and 1000 users try to buy a ticket at the same when the ticketQuantity is 500.
How can you ensure the invariant of the quantity can't be lower than 0 ?
How can you deal with concurrency here because even if two users try to buy a ticket at the same time the data can be false ?
What are the patterns we can use to ensure consistency and concurrency ?
Optimistic or pessismistic concurrency can't be a solution because it will frustrate a lot of users and we try to put all our logic domain into our domain so we can't put any logic inside sql/db or use a transactional script approach.
How can you ensure the invariant of the quantity can't be lower than 0
You include logic in your domain model that only assigns a ticket if at least one unassigned ticket is available.
You include locking (either optimistic or pessimistic) to ensure "first writer wins" -- the loser(s) in a data race should abort or retry.
If your book of record was just data in memory, then you would ensure that all attempts to buy tickets for concert 12345 must first acquire the same lock. In effect, you serialize the requests so that the business logic is running one at a time only.
If your book of record was a relational database, then within the context of each transaction you might perform a "select for update" to get a copy of the current data, and perform the update in the same transaction. The database will raise it's flavor of concurrent modification exception to the connections that lose the race.
Alternatively, you use something like the semantics of a conditional-write / compare and swap: you get an unlocked copy of the concert from the book of record, make your changes, then send a "update this record if it still looks like the unlocked copy" message, if you get the response announcing you've won the race, congratulations - you're done. If not, you retry or fail.
Optimistic or pessismistic concurrency can't be a solution because it will frustrate a lot of users
Of course it can
If the concert is overbooked, they are going to be frustrated anyway
The business logic doesn't have to run synchronously with the request - it might be acceptable to write down that they want a ticket, and then contact them asynchronously to let them know that a ticket has been assigned to them
It may be helpful to review some of Udi Dahan's writing on collaborative and competitive domains; for instance, this piece from 2011.
In a collaborative domain, an inherent property of the domain is that multiple actors operate in parallel on the same set of data. A reservation system for concerts would be a good example of a collaborative domain – everyone wants the “good seats” (although it might be better call that competitive rather than collaborative, it is effectively the same principle).
You might be following these steps:
1- ReserveRequested -> ReserveRequestAccepted -> TicketReserved
2- ReserveRequested -> ReserveRequestRejected
When somebody clicks on the buy ticket button, you should create a reserve request entity, and then you can process the reservation in the background and by a queue system.
On the user side, you can return a unique reserve request-id to check the result of the process. So the frontend developer should fetch the result of process periodically until it succeeds or fails.
This is an answer rather than a question which I need to state in SO anyway. I was struggle with this question ("how to turn off autocommit when using soci library with PostgreSQL databases") for a long time and came up with several solutions.
In Oracle, by default the auto commit option is turned off and we have to call soci::session::commit explicitly to commit the transactions we have made but in PostgreSQL this is other way around and it will commit as soon as we execute a sql statement (correct me, if I'm wrong). This will introduce problems when we write applications database independently. The soci library provide soci::transaction in order to address this.
So, when we initialize a soci::transaction by providing the soci::session to that, it will hold the transaction we have made without commiting to the database. At the end when we call soci::transaction::commit it will commit the changes to the database.
soci::session sql(CONNECTION_STRING);
soci::transaction tr(sql);
try {
sql << "insert into soci_test(id, name) values(7, \'John\')";
tr.commit();
}
catch (std::exception& e) {
tr.rollback();
}
But, performing commit or rollback will end the transaction tr and we need to initialize another soci::transaction in order to hold future transactions (to create an active in progress transaction) we are about to make. Here are more fun facts about soci::transaction.
You can have only one soci::transaction instance per soci::session. The second one will replace the first one, if you initialize another.
You can not perform more than a single commit or rollback using a soci::transaction. You will receive an exception, at the second time you do commit or rollback.
You can initialize a transaction, then use session::commit or session::rollback. It will give the same result as transaction::commit or transaction::rollback. But the transaction will end as soon as you perform single commit or rollback as usual.
It doesn't matter the visibility of the soci::transaction object to your scope (where you execute the sql and call commit or rollback) in order to hold the db transactions you made until explicitly commit or rollback. In other words, if there is an active transaction in progress for a session, db transactions will hold until we explicitly commit or rollback.
But, if the lifetime of the transaction instance which created for the session was end, we cannot expect the db transactions will be halt.
If you every suffer with "WARNING: there is no transaction in progress", you have to perform commit or rollback only using soci::transaction::commit or soci::transaction::rollback.
Now I will post the solution which I came up with, in order to enable the explicit commit or rollback with any database backend.
This is the solution I came up with.
namespace mysociutils
{
class session : public soci::session
{
public:
void open(std::string const & connectString)
{
soci::session::open(connectString);
tr = std::unique_ptr<soci::transaction>(new soci::transaction(*this));
}
void commit()
{
tr->commit();
tr = std::unique_ptr<soci::transaction>(new soci::transaction(*this));
}
void rollback()
{
tr->rollback();
tr = std::unique_ptr<soci::transaction>(new soci::transaction(*this));
}
void ~session()
{
tr->rollback();
}
private:
std::unique_ptr<soci::transaction> tr;
};
}
When ever commit or rollback is performed, initialize a new soci::transaction. Now you can replace your soci::session sql with mysociutils::session sql and enjoy SET AUTOCOMMIT OFF.
The application Im working on needs to enforce the following rules (among others):
We cannot register a new user to the system if the active user quota for the tenant is exceeded.
We cannot make a new project if the project quota for the tenant is exceeded.
We cannot add more multimedia resources to any project that belongs to a tenant if the maximum storage quota defined in the tenant is exceeded
The main entities involved in this domain are:
Tenant
Project
User
Resource
As you can imagine, these are the relationship between entities:
Tenant -> Projects
Tenant -> Users
Project -> Resources
As a first glance, It seems the aggregate root that will enforce those rules is the tenant:
class Tenant
attr_accessor :users
attr_accessor :projects
def register_user(name, email, ...)
raise QuotaExceededError if active_users.count >= #users_quota
User.new(name, email, ...).tap do |user|
active_users << user
end
end
def activate_user(user_id)
raise QuotaExceededError if active_users.count >= #users_quota
user = users.find {|u| u.id == user_id}
user.activate
end
def make_project(name, ...)
raise QuotaExceededError if projects.count >= #projects_quota
Project.new(name, ...).tap do |project|
projects << project
end
end
...
private
def active_users
users.select(&:active?)
end
end
So, in the application service, we would use this as:
class ApplicationService
def register_user(tenant_id, *user_attrs)
transaction do
tenant = tenants_repository.find(tenant_id, lock: true)
tenant.register_user(*user_attrs)
tenants_repository.save(tenant)!
end
end
...
end
The problem with this approach is that aggregate root is quite huge because it needs to load all users, projects and resources and this is not practical. And also, in regards to concurrency, we would have a lot of penalties due to it.
An alternative would be (I'll focus on user registration):
class Tenant
attr_accessor :total_active_users
def register_user(name, email, ...)
raise QuotaExceededError if total_active_users >= #users_quota
# total_active_users += 1 maybe makes sense although this field wont be persisted
User.new(name, email, ...)
end
end
class ApplicationService
def register_user(tenant_id, *user_attrs)
transaction do
tenant = tenants_repository.find(tenant_id, lock: true)
user = tenant.register_user(*user_attrs)
users_repository.save!(user)
end
end
...
end
The case above uses a factory method in Tenant that enforces the business rules and returns the User aggregate. The main advantage compared to the previous implementation is that we dont need to load all users (projects and resources) in the aggregate root, only the counts of them. Still, for any new resource, user or project we want to add/register/make, we potentially have concurrency penalties due to the lock acquired. For example, if Im registering a new user, we cannot make a new project at the same time.
Note also that we are acquiring a lock on Tenant and however we are not changing any state in it, so we dont call tenants_repository.save. This lock is used as a mutex and we cannot take advantage of optimistic concurrency unless we decide to save the tenant (detecting a change in the total_active_users count) so that we can update the tenant version and raise an error for other concurrent changes if the version has changed as usual.
Ideally, I'd like to get rid of those methods in Tenant class (because it also prevents us from splitting some pieces of the application in their own bounded contexts) and enforce the invariant rules in any other way that does not have a big impact with the concurrency in other entities (projects and resources), but I don't really know how to prevent two users to be registered simultaneously without using that Tenant as aggregate root.
I'm pretty sure that this is a common scenario that must have a better way to be implemented that my previous examples.
I'm pretty sure that this is a common scenario that must have a better way to be implemented that my previous examples.
A common search term for this sort of problem: Set Validation.
If there is some invariant that must always be satisfied for an entire set, then that entire set is going to have to be part of the "same" aggregate.
Often, the invariant itself is the bit that you want to push on; does the business need this constraint strictly enforced, or is it more appropriate to loosely enforce the constraint and charge a fee when the customer exceeds its contracted limits?
With multiple sets -- each set needs to be part of an aggregate, but they don't necessarily need to be part of the same aggregate. If there is no invariant that spans multiple sets, then you can have a separate aggregate for each. Two such aggregates may be correlated, sharing the same tenant id.
It may help to review Mauro Servienti's talk All our aggregates are wrong.
An aggregate shoud be just a element that check rules. It can be from a stateless static function to a full state complex object; and does not need to match your persistence schema nor your "real life" concepts nor how you modeled your entities nor how you structure your data or your views. You model the aggregate with just the data you need to check rules in the form that suits you best.
Do not be affraid about precompute values and persist them (total_active_users in this case).
My recommendation is keep things as simple as possible and refactor (that could mean split, move and/or merge things) later; once you have all behavior modelled, is easier to rethink and analyze to refactor.
This would be my first approach without event sourcing:
TenantData { //just the data the aggregate needs from persistence
int Id;
int total_active_users;
int quota;
}
UserEntity{ //the User Entity
int id;
string name;
date birthDate;
//other data and/or behaviour
}
public class RegistrarionAggregate{
private TenantData fromTenant;//data from persistence
public RegistrationAggregate(TenantData fromTenant){ //ctor
this.fromTenant = fromTenant;
}
public UserRegistered registerUser(UserEntity user){
if (fromTenant.total_active_users >= fromTenant.quota) throw new QuotaExceededException
fromTeant.total_active_users++; //increase active users
return new UserRegisteredEvent(fromTenant, user); //return system changes expressed as a event
}
}
RegisterUserCommand{ //command structure
int tenantId;
UserData userData;// id, name, surname, birthDate, etc
}
class ApplicationService{
public void registerUser(RegisterUserCommand registerUserCommand){
var user = new UserEntity(registerUserCommand.userData); //avoid wrong entity state; ctor. fails if some data is incorrect
RegistrationAggregate agg = aggregatesRepository.Handle(registerUserCommand); //handle is overloaded for every command we need. Use registerUserCommand.tenantId to bring total_active_users and quota from persistence, create RegistrarionAggregate fed with TenantData
var userRegisteredEvent = agg.registerUser(user);
persistence.Handle(userRegisteredEvent); //handle is overloaded for every event we need; open transaction, persist userRegisteredEvent.fromTenant.total_active_users where tenantId, optimistic concurrency could fail if total_active_users has changed since we read it (rollback transaction), persist userRegisteredEvent.user in relationship with tenantId, commit transaction
eventBus.publish(userRegisteredEvent); //notify external sources for eventual consistency
}
}
Read this and this for a expanded explanation.
When you do:
#transaction.atomic
def update_db():
do_bulk_update()
while the function is running, does it lock the database?
I'm asking regarding django's atomic transaction:
https://docs.djangoproject.com/en/1.10/topics/db/transactions/#autocommit-details
(I'm assuming modern SQL databases in this answer.)
tl;dr
Transactions are not locks, but hold locks that are acquired automatically during operations. And django does not add any locking by default, so the answer is No, it does not lock the database.
E.g. if you were do:
#transaction.atomic
def update_db():
cursor.execute('UPDATE app_model SET model_name TO 'bob' WHERE model_id = 1;')
# some other stuff...
You will have locked the app_model row with id 1 for the duration of "other stuff". But it is not locked until that query. So if you want to ensure consistency you should probably use locks explicitly.
Transactions
As said, transactions are not locks because that would be awful for perfomance. In general they are lighter-weight mechanisms in the first instance for ensuring that if you make a load of changes that wouldn't make sense one at a time to other users of the database, those changes appear to happen all at once. I.e. are atomic. Transactions do not block other users from mutating the database, and indeed in general do not block other users from mutating the same rows you may be reading.
See this guide and your databases docs (e.g. postgres) for more details on how transactions are protected.
Django implementation of atomic.
Django itself does the following when you use the atomic decorator (referring to the code).
Not already in an atomic block
Disables autocommit. Autocommit is an application level feature which will always commit transactions immediately, so it looks to the application like there is never a transaction outstanding.
This tells the database to start a new transaction.
At this point psycopg2 for postgres sets the isolation level of the transaction to READ COMMITTED, which means that any reads in the transaction will only return committed data, which means if another transaction writes, you won't see that change until it commits it. It does mean though that if that transaction commits during your transaction, you may read again and see that the value has changed during your transaction.
Obviously this means that the database is not locked.
Runs your code. Any queries / mutations you make are not committed.
Commits the transaction.
Re-enables autocommit.
In an earlier atomic block
Basically in this case we try to use savepoints so we can revert back to them if we "rollback" the "transaction", but as far as the database connection is concerned we are in the same transaction.
Automatic locking
As said, the database may give your transaction some automatic locks, as outlined in this doc. To demonstrate this, consider the following code that operates on a postgres database with one table and one row in it:
my_table
id | age
---+----
1 | 50
And then you run this code:
import psycopg2 as Database
from multiprocessing import Process
from time import sleep
from contextlib import contextmanager
#contextmanager
def connection():
conn = Database.connect(
user='daphtdazz', host='localhost', port=5432, database='db_test'
)
try:
yield conn
finally:
conn.close()
def connect_and_mutate_after_seconds(seconds, age):
with connection() as conn:
curs = conn.cursor()
print('execute update age to %d...' % (age,))
curs.execute('update my_table set age = %d where id = 1;' % (age,))
print('sleep after update age to %d...' % (age,))
sleep(seconds)
print('commit update age to %d...' % (age,))
conn.commit()
def dump_table():
with connection() as conn:
curs = conn.cursor()
curs.execute('select * from my_table;')
print('table: %s' % (curs.fetchall(),))
if __name__ == '__main__':
p1 = Process(target=connect_and_mutate_after_seconds, args=(2, 99))
p1.start()
sleep(0.6)
p2 = Process(target=connect_and_mutate_after_seconds, args=(1, 100))
p2.start()
p2.join()
dump_table()
p1.join()
dump_table()
You get:
execute update age to 99...
sleep after update age to 99...
execute update age to 100...
commit update age to 99...
sleep after update age to 100...
commit update age to 100...
table: [(1, 100)]
table: [(1, 100)]
and the point is that the second process is started before the first command completes, but after it has called the update command, so the second process has to wait for the lock which is why we don't see sleep after update age to 100 until after the commit for age 99.
If you put the sleep before the exec, you get:
sleep before update age to 99...
sleep before update age to 100...
execute update age to 100...
commit update age to 100...
table: [(24, 3), (100, 2)]
execute update age to 99...
commit update age to 99...
table: [(24, 3), (99, 2)]
Indicating the lock was not acquired by the time the second process gets to its update, which happens first but during the first process's transaction.
As described by #daphtdazz answer, Django doesn't acquire any locks when you open a transaction, but as you update the data, the database may acquire automatic locks. The type and scope of the locks are database dependant and may also depend on the transaction isolation levels. Refer to your database documentations for the details of these automatic locks.
There are a few options if you want to take locks manually.
The main and simplest one is doing a select_for_update() query. This will acquire an update lock that will block all other updates to the rows matching the query. This is the same lock that is automatically acquired when you update a row in a transaction, but select_for_update() allows you to acquire the update lock before actually making the update, which can often be useful.
If row locking isn't suitable for your situation, you can acquire advisory locks in databases that supports them (e.g. Postgres). Out of the box, Django doesn't support this, but there are third party packages that adds support for advisory locks to Django, or you can simply issue the appropriate raw SQL query.
I have a question about a specific functionality in Siebel, regarding service requests.
Is there a way to track time when certain service request is in a given status/substatus, for example "Waiting on Customer"? When the service request is changed again to another status that isn't "Wait for somebody" anymore, I have to stop counting the time.
I don't know of any out of the box solution to your needs, however there are many ways to achieve it with a bit of customisation. For example:
Create two new fields, Waiting Time (with predefault value: 0) and Waiting Date.
Create the following BC user properties:
On Field Update Set x = "Status", "Waiting Time", "IIF([Waiting Date] IS NULL, [Waiting Time], [Waiting Time] + (Timestamp() - [Waiting Date]))
On Field Update Set y = "Status", "Waiting Date", "IIF([Status]='Waiting on Customer',Timestamp(),NULL)"
Your Waiting Date field will store the last time the service request changed to "Waiting on Customer", or NULL if it's on another status. Then, Waiting Time will accumulate the total time the request has been in that status.
I have not tested the solution, it might need some more work, for example, it's possible that Siebel doesn't allow you to use the expression [Waiting Time] + (Timestamp() - [Waiting Date]) directly and you'll have to decompose it using auxiliary calculated fields.
Note also that the On Field Update Set user property has changed its syntax from Siebel 7.7-7.8 to Siebel 8.x.
If you're familiar with server scripting, you could implement something similar quite easily, on the BusComp_PreSetFieldValue event. If the field being changed is Status, check if you're entering or exiting (or not) the "Waiting on Customer" status, and update the two fields accordingly.