Understanding solution to Hilzer's Barbershop Problem - concurrency

Problem
This is a problem from "The Little Book Of Semaphores".
Our barbershop has three chairs, three barbers, and a waiting area
that can accommodate four customers on a sofa and that has standing
room for additional customers. Fire codes limit the total number of
customers in the shop to 20.
A customer will not enter the shop if it is filled to capacity with
other customers. Once inside, the customer takes a seat on the sofa or
stands if the sofa is filled. When a barber is free, the customer that
has been on the sofa the longest is served and, if there are any
standing customers, the one that has been in the shop the longest
takes a seat on the sofa. When a customer’s haircut is finished, any
barber can accept payment, but because there is only one cash
register, payment is accepted for one customer at a time. The bar-
bers divide their time among cutting hair, accepting payment, and
sleeping in their chair waiting for a customer.
Book's solution
Below is the book's solution:
Shared variables
customers = 0
mutex = Semaphore(1)
mutex2 = Semaphore(1)
sofa = Semaphore(4)
customer1 = Semaphore(0)
customer2 = Semaphore(0)
payment = Semaphore(0)
receipt = Semaphore(0)
queue1 = []
queue2 = []
Customer thread
self.sem1 = Semaphore(0)
self.sem2 = Semaphore(0)
mutex.wait()
if customers == 20:
mutex.signal()
balk()
customers += 1
queue1.append(self.sem1)
mutex.signal()
# enterShop()
customer1.signal()
self.sem1.wait()
sofa.wait()
# sitOnSofa()
self.sem1.signal()
mutex2.wait()
queue2.append(self.sem2)
mutex2.signal()
customer2.signal()
self.sem2.wait()
sofa.signal()
# getHairCut()
mutex.wait()
# pay()
payment.signal()
receipt.wait()
customers -= 1
mutex.signal()
Barber thread
customer1.wait()
mutex.wait()
sem = queue1.pop(0)
sem.signal()
sem.wait()
mutex.signal()
customer2.wait()
mutex2.wait()
sem2 = queue2.pop(0)
sem2.signal()
mutex2.signal()
# cutHair()
payment.wait()
# acceptPayment()
receipt.signal()
Question
As I understood it, the following steps can happen when the first customer (Customer 1) comes in and a barber (Barber 1) is sleeping on customer1:
Customer 1 adds its own semaphore to queue1, signals customer1, and waits on its own semaphore.
Barber 1 wakes up, pops Customer 1's semaphore from the customer1 queue, signals it (waking up Customer 1), and waits on it. At this moment, Barber 1 holds mutex, so no other free barber can let a customer in before Barber 1 gets to wake up and release mutex.
Customer 1 wakes up, passes through sofa (subtracting it and making it 3), and signals its own semaphore (waking up Barber 1).
Barber 1 wakes up and signals mutex.
At this point, there is nothing stopping another customer from going through the same process, and eventually getting to queue2 before Customer 1.
So, I'm having trouble understanding how the book's solution enforces the constraint that "if there are any standing customers, the one that has been in the shop the longest takes a seat on the sofa".
To enforce the order, it seems like the Customer thread should signal its self.sem1 only after it adds itself to queue2, not before.
Could someone help me understand this solution?

First, note that the barber thread pulls customers from the head of queue2, which represents the sofa, so once the barber pulls a customer, that customer is the one waiting for the longest time on the sofa.
Second, note that once a customer is pulled out of queue2, the hair cut itself does not take any time, and the customer moves to payment. So the way it is written, customers are pulled from the queue and moved the payment immediately. Haircut is instantaneous.
So if another customer comes and moves through the queues, the only way it can be picked is if there is another barber and the second customer is at the head of the queue. But, if the second customer is at the head of the queue, that means another barber already picked the first customer. So, that really cannot happen.
About your question regarding self.sem1: its purpose is to wake up the customer once the customer is picked from the queue. sem1 corresponds to the queue for customers standing, sem2 corresponds to the sofa. The customer adds the semaphore to the queue, and waits on it, so when the barber thread picks it up, it can wake up the customer. So it can't really enforce ordering, it is there to wake up the customer when he/she changes location. The ordering is enforced by the queue.

Related

Database: new table or new field for order attributes

Current table schema:
User
- id
- default_handling_fee(integer, null=True)
Order
- id
- buyer(user_id)
- order_handling_fee(integer, null=True)
We now want to add a second type of fee that can apply to orders. Some rules:
Fees apply to < 10% of orders.
An order can have 0, either, or both fees.
Every time order info is displayed to users, we need to
highlight fees and show the types of fees.
Fees must be editable on an order-by-order basis.
We may add additional fees going forward.
The db is currently pretty small (a few thousand rows) and unlikely
to grow beyond 100k in the next few years.
I can see two options:
A: Add a new field on order and one on user:
User
- id
- default_handling_fee(integer, null=True)
- default_processing_fee(integer, null=True)
Order
- id
- buyer(user_id)
- order_handling_fee(integer, null=True)
- order_processing_fee(integer, null=True)
or B, add a new 'fees table':
User
- id
Order
- id
- buyer(user_id)
- order_fees(many_to_many_field to OrderFees)
OrderFees
- id
- buyer(user_id)
- price(integer)
- fee_type(choices=['handling', 'processing'])
- is_default_value(boolean)
If a user creates an order and applies one or both of the fees, in option B (new table), we would first look for existing fees that match the user and the price. If that combination exists, we would add it to the order_fees field. If it did not exist (a new price), we would create a new row and add that row to the order_fees field.
I recognize that option A is a lot simpler: no joins when looking up fees, no creating new rows, no stale rows that get created once and never used again.
I see two downsides of A. For one, it's not extensible. If we add a gift_wrapping_fee in a few months, that will mean adding a third field with null on nearly every order, and so on for additional types of order fees. The second disadvantage is that we have to remember to add checks in-app in every place that order info is displayed.
if order.order_processing_fee:
show fee
if order.order_handling_fee:
show fee
With option B, it's just
for fee in order.order_fees:
show fee
There is much less chance of errors in option B at the cost of at least one additional query per order shown to users.
One additional point: since this is all being done with Django as a backend, we can define methods on the model such that all the price fields are defined in one place.
Which option is better? Is there a third option I haven't considered?
(edited for clarity)
i think that if, in a near future, you will need to add new fee types, than the B option is better.
Have you considered a hybrid option of A and B? If i've understand the problem, you don't have only an order fee but a default fee for the users too.
You can have a Fee table:
Fees
-id
-type(choices=[your fee type list])
-price
...(other attributes)
And then you can have two many to many relation, one for the Order
Order
-id
-buyer(user)
-order_fees(many to many)
And a second for the user
User
-id
-default_fees(many to many)

How to apply windowing function before filter in Django

I have these models:
class Customer(models.Model):
....
class Job(models.Model):
customer = models.ForeignKey('Customer')
payment_status = models.ForeignKey('PaymentStatus')
cleaner = models.ForeignKey(settings.AUTH_USER_MODEL,...)
class PaymentStatus(models.Model):
is_owing = models.NullBooleanField()
I need to find out, for each job, how many total owed jobs the parent customer has, but only display those jobs belonging to the current user. The queryset should be something like this:
user = self.request.user
queryset = Job.objects.select_related('customer'
).filter(payment_status__is_owing=True).annotate(
num_owings=RawSQL('count(jobs_job.id) over (partition by customer_id)', ())
).filter(cleaner=user)
I am using 'select_related' to display fields from the customer related to the job.
Firstly I haven't found a way to do this without the windowing function/raw SQL.
Secondly, regardless of where I place the .filter(window_cleaner=user) (before or afer the annotate()), the final result is always to exclude the jobs that do not belong to the current user in the total count. I need to exclude the jobs from displaying, but not from the count in the windowing function.
I could do the whole thing as raw SQL, but I was hoping there was a nicer way of doing it in Django.
Thanks!
I don't know if this helps and it really depends on how you are wanting to display the results to your user. However if it were me with a free hand to the design aspect I would probably split my window. Perhaps having the total of owed jobs for the parent customer at the top and a separate list for the jobs that belong to the current user below. Then I would split the construction of the data doing a normal query, as you have, for the jobs relating to the current user but then use a custom template tag to calculate the total number of jobs for the parent customer.
I use custom template tags quite a bit. I find they are very cool for those quick snapshot totals that we all want to display to our users. For example....the total number of points accumulated, the number of outstanding tasks, etc etc.
If you've not looked at them previously check out the docs at https://docs.djangoproject.com/en/1.11/howto/custom-template-tags/
They are really easy to use.

Add a negative price line item or discount

I'm using coupons in a storefront to offer discounts. Some coupons are for a flat dollar amount for orders greater than a specific value. Like, $10 off an order of $40 or more. Other coupons give a discounted rate, say, 20% off your first order this month (storefront is handling the limit, so can ignore). I want to use authorize.net to process the transactions, and send receipts to customers.
My first thought was to modify the unit price of things that are discounted. This would work fine for rate discounts, though doesn't show all the information. The problem would be for flat discounts. Where do you take the $10 off if there are a few kinds of items.
My second thought was to add a line item with a negative value/price to the order receipt. Authorize doesn't seem to accept negative values for anything, so that was a failure.
We're using the AIM transaction libraries for Java.
Order anetOrder = Order.createOrder();
anetOrder.setInvoiceNumber(sanitize(order.getOrderNumber(), 20));
anetOrder.setShippingCharges(shippingCharges);
anetOrder.setTotalAmount(total);
for (OrderProductIf op : order.getOrderProducts()) {
OrderItem item = OrderItem.createOrderItem();
item.setItemTaxable(true);
item.setItemId(sanitize(op.getSku(), 31));
item.setItemName(sanitize(op.getName(), 31));
item.setItemDescription(sanitize(op.getModel(), 255));
item.setItemPrice(op.getPrice());
item.setItemQuantity(new BigDecimal(op.getQuantity()));
anetOrder.addOrderItem(item);
}
sanitize is a function that limits the length of strings.
Transaction transaction = merchant.createAIMTransaction(TransactionType.AUTH_CAPTURE, total);
transaction.setCreditCard(creditCard);
transaction.setCustomer(customer);
transaction.setOrder(anetOrder);
transaction.setShippingAddress(shippingAddress);
transaction.setShippingCharges(shippingCharges);
Result<Transaction> result = (Result<Transaction>) merchant.postTransaction(transaction);
return getPaymentResult(result);
I'm out of ideas here.
One way would be to calculate the total amount with the discount without modifying the line items, for a $60 sale with a $10 discount below:
<transactionRequest>
<transactionType>authCaptureTransaction</transactionType>
<amount>50</amount>
Then add
<userFields>
<userField>
<name>Discount</name>
<value>$10.00</value>
</userField>
The userField value is arbitrary, make it -$10.00, if you like it better.
The line items are not totaled. SO
Add line item for credit in your system.
When you submit to Authorize.net, check for negative numbers and change description to indicate it is a credit, then change value to positive
Make sure the TOTAL you submit is correct. Authorize.net will not check that your line items add up to the correct total

Best approach to handle concurrency in Django for eauction toy-app

I am implementing an eauction toy-app in Django and am confused on how to best handle concurrency in the code below. I am uncertain which of my solution candidates (or any other) fits best with the design of Django. I am fairly new to Django/python and my SQL know-how is rusty so apologies if this is a no-brainer.
Requirement: Users may bid on products. Bids are only accepted if they are higher than the previous bids on the same product.
Here is a stripped down version of the models:
class Product(models.Model):
name = models.CharField(max_length=20)
class Bid(models.Model):
amount = models.DecimalField(max_digits=5, decimal_places=2)
product = models.ForeignKey(Product)
and the bid view. This is where the race conditions occur (see comments):
def bid(request, product_id):
p = get_object_or_404(Product, pk=product_id)
form = BidForm(request.POST)
if form.is_valid():
amount = form.cleaned_data['amount']
# the following code is subject to race conditions
highest_bid_amount = Bid.objects.filter(product=product_id).aggregate(Max('amount')).get('amount__max')
# race condition: a bid might have been inserted just now by another thread so highest_bid_amount is already out of date
if (amount > highest_bid_amount):
bid = Bid(amount=amount, product_id=product_id)
# race condition: another user might have just bid on the same product with a higher amount so the save() below is incorrect
b.save()
return HttpResponseRedirect(reverse('views.successul_bid)'
Solution candidates I considered so far:
I have read the Django doc about transactions but I wouldn't know how to apply them to my problem. Since the database does not know about the requirement that bids must be ascending it cannot cause Django to throw an IntegrityError. Is there a way to define this constraint during model definition? Or did it misunderstand the transaction API?
A stored procedure could take care of the bid logic. This is seems to me the "best" choice so far but it shifts handling the race condition to the underlying database system. If this is a good approach, though, this solution might be combined with solution 1?
I considered using a select_for_update call to lock the bids for this product. However, this does not seem to be a solution since in my understanding it would not affect any new bids being created?
Wish list:
If in any way possible, I would like to refrain from locking the entire bid table, since bids on other products can not be affected anyway.
If there is a good solution on application level, I would like to keep the code independent from the underlying database system.
Many thanks for your thoughts!
Would it be possible for you to add a highest_bid column to Products. If my logic is not off, you could then update the highest bid where product_id = x and highest < current_bid. If this query indicates that a row has been updated then you add the new record to the bid table. This would probably mean that you would have to have a default value for highest_bid column.
Have you checked out Celery? You might process your queries asynchronously, queuing the queries and then handing results or errors back when they're available. That seems like a likely path to take if you want to avoid locking.
Otherwise, it does seem like some locking would need to occur.

Distinct-style filtering on a Django model

Distinct might be the wrong word for what I want but I have a Message class like the following for a simple flat messaging system between users:
class Message(models.Model):
thread = models.ForeignKey(Thread)
from_user = models.ForeignKey(User, related_name='messagefromuser')
to_user = models.ForeignKey(User, related_name='messagetouser')
when = models.DateTimeField(auto_now_add=True)
message = models.TextField()
This allows two users to chat about a single Thread object. The system is designed to allow two users to have separate conversations on separate Threads.
So as it is, I can grab the messages a given user is involved in with the following query:
Message.objects.filter( Q(from_user=u) | Q(to_user=u) )
That outputs every message a user has sent or received. I'm building a page where users can see all their conversations with other users, grouped by thread. This is the ideal output that I can imagine getting to:
[
{
'thread': thread_instance,
'conversations': [
{
'other_user': user_instance
'latest_reply': date_time_instance
},
....
]
},
...
]
I have thought about iterating this from the top, but unless there's a way to filter through Thread into Message's to_user, from_user fields, there are just too many threads. The DB server would melt.
"Group" the messages by Thread
"Group" those by the other user so each group is between two distinct users, per Thread
Pluck the most recent to_user=u and annotate something with that.
I'm going a little crazy trying to warp my brain around the particulars. In my head it feels like something you should be able to do in a couple of lines but I just can't see how.
threads = Thread.objects.filter(
Q(message_set__from_user=u) | Q(message_set__to_user=u)
).order_by('id')
messages = Message.objects.filter(
Q(thread__in=threads) & Q(Q(thread__from_user=u) | Q(thread_to_user=u))
).order_by('thread__id', '-when').select_related('from_user', 'to_user')
from itertools import groupby
t_index = 0
for thread_messages in groupby(messages, lambda x: x.thread_id):
if threads[t_index].id is thread_messages[0].thread_id:
threads[t_index].messages = thread_messages
t_index += 1
That might look a bit complex or scary but it should do what you are after. Essentially it queries all your threads first so we can find out what threads we've messaged about. Then it finds all the related messages to those threads.
Both of the queries are ordered by the same field so that in the lower part of the code we can iterate through the list only once, instead of needing a nested for loop to find each thread with the correct id. They are also both filtered by the same query (regarding thread objects at least) to ensure we are getting back only the relevant results to this query.
Lastly, the messages that we received back are grouped together and attached to each thread; they will show up in descending order for each thread.
Right at the end, you may also want to re-sort your threads to show the latest ones first, easy to do assuming a 'when' field on the Thread model:
threads = sorted(threads, key=lambda x: x.when, reverse=True)
By using the aforementioned method, you will have to do 2 queries every time, regardless, first the threads, then the messages. But it will never go above this (watch out for joins on a select_related though or recursive queries on related objects).