Connecting multithreaded task runner with django web application

I’m developing a web service using django. In addition to the web application, I have a separate module with about 40 functions that take in some parameters, perform some network-bound tasks and return results. These functions (or an entry point function) can be called from django views.
Here is the flow I’m trying to achieve.
1. From the web application, users can submit a URL to start the operation.
2. That request should start those functions in parallel on the server, with the URL as an argument (not necessarily all at once).
3. The user can send a request from the web application to get a list of completed tasks and the results of the ongoing operation.
4. Multiple users can submit URLs to the web application and start the operation separately (each user gets a list of 40 results).
Currently I am experimenting with the Thread and Queue classes to achieve this. What I want to know is: how can I manage this flow without creating so many threads? How should I maintain the separation between two operation sessions? Is there any way I can incorporate the capabilities of django for this?
All I ask is a basic guideline of how things should be organized to achieve this.

It sounds like you could run your functions as Celery tasks. Celery is a distributed task queue for Python. Take a look at the docs for integration with django here: http://docs.celeryproject.org/en/latest/django/first-steps-with-django.html
There is also a package named django-celery-beat if you need to schedule tasks.

Related

How to implement User Interface for Microservice Application based on AWS

There are numerous articles, such as this one, which insist that when implementing microservices each team should build its own UI. These fragments can be composed into a single page using iframes or divs.
My question is how to implement this on AWS.
This could be done at deployment time, or runtime.
1. Deployment time: HTML fragments need to be collected as part of the CI/CD process and put together in S3, which serves as the web server for static pages (API GW for dynamic JSON content).
2. Run time: the HTML fragment (div) has to be delivered to the client browser on demand, as and when the client clicks through (it can be cached in the client browser and/or API GW).
What is the technology for this? I don't think Lambdas can deliver HTML fragments through API GW. Also, the CI/CD approach seems fuzzy to me.
By the way, is there any other way? Thanks

Use an SPA framework such as Angular or React (you are not limited to these, and the same method can be implemented in an MPA).
An SPA helps us separate the UI from the back-end, so that both run independently.
In the service layer of the UI application, call the respective microservice, say M1 or M2.
In monolithic applications such as an MPA (multi-page application: Spring/Struts/JSP), where the UI is generated by the back-end and the two are tightly coupled, you need to call the different REST APIs using jQuery/AJAX and process the JSON responses.
In an SPA, scenarios such as triggering all microservices in parallel, waiting for M1 and passing its payload to M2, or racing M1 against M2 can be achieved using promises in JavaScript.
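
The three call patterns named above (all in parallel, M1 then M2, race between M1 and M2) can be sketched in Python with asyncio, whose gather, await and wait roughly mirror Promise.all, promise chaining and Promise.race. This is only an illustration of the patterns: call_m1 and call_m2 are hypothetical stand-ins that sleep instead of making real HTTP requests.

```python
import asyncio


async def call_m1(payload):
    # Hypothetical stand-in for a request to microservice M1.
    await asyncio.sleep(0.1)
    return {"m1": payload}


async def call_m2(payload):
    # Hypothetical stand-in for a request to microservice M2.
    await asyncio.sleep(0.01)
    return {"m2": payload}


async def in_parallel(payload):
    # Start both calls at once and wait for both results (like Promise.all).
    return await asyncio.gather(call_m1(payload), call_m2(payload))


async def chained(payload):
    # Wait for M1, then pass its response to M2 as the payload.
    m1_response = await call_m1(payload)
    return await call_m2(m1_response)


async def race(payload):
    # Use whichever call finishes first (like Promise.race); unlike
    # Promise.race, the slower call is also cancelled here.
    tasks = [
        asyncio.create_task(call_m1(payload)),
        asyncio.create_task(call_m2(payload)),
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    return done.pop().result()
```

Each coroutine can be driven with asyncio.run, for example asyncio.run(chained("order-1")).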

REST API or "direct" database access for remote Celery/Django workers?

I'm working on a project that will have multiple celery workers on machines in different locations in the US that will communicate over the internet.
Am I better off distributing my Django project to each machine and configuring them with the database credentials to my database host, or should I have a "main" Django/database host that presents a REST API for remote celery tasks and workers to hit for the database access?
Mostly looking for pros/cons and any factors I haven't thought of.
I can provide single simple API endpoints that provide all the data my tasks need to query and simple POST API endpoints that can create all the database entries my tasks need to create.
I'll have no more than, say, 10 remote workers, which altogether would be doing maybe 1 request per minute.
I'm thinking this probably means my concerns are not so much about the request/response overhead but more about maintainability, architecture, security...

The answer depends on too many variables, concerns, forces and whatnots to be anything else than "it depends...".
I assume you already thought of the following pros and cons, but anyway:
Using an API will make for longer request/response cycles (obviously) and put quite some load on the Django project (front server, app server etc). Also it means that your tasks won't be able to use all of the database features (complex queries, aggregations, whatever).
OTOH, adding an API layer will isolate the workers from the inner db schema, which can make migrations (on the Django side) and deployment easier, as you won't have to stop all the workers, deploy to everyone and restart the workers. It even makes it possible to change the API-side technology without impacting the workers (not that I see much reason to do so, but anyway...). But it also means you have a whole API to maintain, and chances are that model changes, or at least part of them, will impact your API and/or your task code anyway (if the changes are about adding features that the workers should use, etc.).
IOW, it really depends (yeah, I already said so, didn't I?) on your project's needs and constraints, and only you/your team know which solution will best match your project.

Running continually tasks alongside the django app

I'm building a django app which lists the hot (according to a specific algorithm) twitter trending topics.
I'd like to run some processes indefinitely to make twitter API calls and update the database (PostgreSQL) with the new information. This way the list of hot trending topics gets updated asynchronously.
At first it seemed to me that celery+rabbitmq was the solution to my problem, but from what I understand they are used within django to launch scheduled or user-triggered tasks, not indefinitely running tasks.
The solution that comes to mind is to write a .py file that continually puts trending topics in a queue, and independent, continually running .py files that read from the queue and save the data in the db used by django with raw SQL or SQLAlchemy. I think this could work, but I'm pretty sure there is a much better way to do it.

If you just need to keep some processes running continually, supervisor is a nice solution.
You can combine it with any queuing technology you like to push things into your queues.
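
As a rough sketch of the kind of long-running process Supervisor would keep alive, the loop below polls a source and stores each result until it is asked to stop. It uses only the standard library; run_worker, fetch and store are hypothetical names, where fetch would wrap the twitter API call and store would write to the database.

```python
import signal
import threading


def run_worker(fetch, store, interval_seconds, stop_event):
    """Call `fetch` and pass each result to `store` until `stop_event` is set."""
    while not stop_event.is_set():
        try:
            store(fetch())
        except Exception as exc:
            # Keep the loop alive on transient errors (network, database).
            print(f"poll failed: {exc!r}")
        # Event.wait doubles as a sleep that ends early when stop is requested.
        stop_event.wait(interval_seconds)


def stop_on_sigterm(stop_event):
    """Set `stop_event` on SIGTERM so the loop can exit cleanly."""
    signal.signal(signal.SIGTERM, lambda signum, frame: stop_event.set())
```

A script would create a threading.Event, pass it to stop_on_sigterm, and then call run_worker with its own fetch and store functions. Supervisor sends SIGTERM by default when stopping a program, so the loop finishes its current iteration instead of being killed mid-write.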

Are web services processed sequentially or in parallel?

I am just getting started with web services in Lotus Notes. What I would like to do is create a web service that generates a sequential number. The code to generate the number is based on existing code we have used for some time within our databases (just straight LotusScript, no web services). Basically, there is a document that stores the next number. The next number is returned and the document is updated for the next call. Save conflicts are detected, and the number is tried again if there was an issue saving it.
I thought I might use a web service to generate the number. So are web services processed sequentially or in parallel? If they are serial, I won't need to deal with two people trying to save the number at the same time.

Web services are a way for two systems to communicate with each other where they would not have a common language.
For example LotusScript agent connecting to a .Net server.
When creating a web service provider (server) on Domino you can code it in LotusScript or Java. The server then provides a WSDL file for the consumer (client) to write the code required to talk to that web service.
This tutorial should explain it better for you:
http://www-10.lotus.com/ldd/ddwiki.nsf/dx/Creating_your_first_Web_Service_provider_and_consumer_in_LotusScript_and_Java.
Now as for Domino: web services run in the order they are requested from the server. However, there is no control to say "Don't start until web service X has finished".
You could also code this into an application, but you run a serious risk of deadlocks or memory/performance issues for other users unless you account for that.
The Domino server can also be set to not run web services/agents in parallel, but again you risk the same issues.
If all you need is a unique ID, you could use the UNID of the document you create from the web service, or you can use @Unique via an Evaluate, but both only return text.
http://publib.boulder.ibm.com/infocenter/domhelp/v8r0/topic/com.ibm.designer.domino.main.doc/H_UNIQUE.html

From the Lotus Designer Documentation:
To enable concurrent Web services on a server, you must enable concurrent Web Agents on that server. Open the Server document you want to edit. Click the Internet Protocols - Domino Web Engine tab. Enable Run Web Agents concurrently.
The maximum number of concurrent Web service calls is determined by the "Max concurrent agents" setting. From the Lotus Administration Documentation:
Max concurrent agents: Specifies the number of agents allowed to run concurrently. Valid values are 1 through 10. Default values are 1 for daytime and 2 for nighttime. Enabling a higher number of concurrent agents can relieve a heavily loaded Agent Manager, but also reduces the resources available to run other server tasks.
Lotus Notes Domino Version 8.5.x

Yes, web services will run in parallel. But since you wrote that your code deals with save conflicts, you should NOT have a problem.
It is the same as standard Notes calls by two users: the first gets the doc, then the second gets the doc and saves it (the speedy one), and then the first gets a save conflict.
In conclusion: yes, it's parallel, BUT it's not a problem.
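
The read, save, retry-on-conflict pattern described in the question can be sketched as follows. This is a Python simulation, not Domino code: CounterDocument is a hypothetical in-memory stand-in for the document that stores the next number, and its version check plays the role of Domino's save-conflict detection.

```python
import threading


class SaveConflict(Exception):
    """Raised when the document changed between read and save."""


class CounterDocument:
    """In-memory stand-in for the document that stores the next number."""

    def __init__(self, next_number=1):
        self._lock = threading.Lock()
        self._version = 0
        self._next_number = next_number

    def read(self):
        with self._lock:
            return self._version, self._next_number

    def save(self, expected_version, next_number):
        # Reject the save if someone else saved since we read the document.
        with self._lock:
            if self._version != expected_version:
                raise SaveConflict()
            self._version += 1
            self._next_number = next_number


def allocate_number(doc, max_attempts=1000):
    """Return the next number, retrying whenever the save conflicts."""
    for _ in range(max_attempts):
        version, number = doc.read()
        try:
            doc.save(version, number + 1)
        except SaveConflict:
            continue  # another caller won; read again and retry
        return number
    raise RuntimeError("could not allocate a number")
```

Because a number is only handed out after its save succeeds, two parallel callers can never receive the same value; the loser of each race simply reads again and takes the next one.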

I would have thought that they run sequentially by default, since asynchronous web agents are off unless you switch them on. So although allocating sequential numbers 'safely' is a good design pattern, you'll be fine if you only allocate numbers via the web service and you haven't changed the asynchronous setting.

Let me also add:
Employ document locking to assure number uniqueness in sequential document numbering solution

There is a simple solution that avoids synchronicity considerations.
You should generate a temporary number using @Unique, then use a scheduled agent to assign sequential numbers in order of document creation, selecting only unprocessed documents using a properly constructed view. If you're not concerned about the order in which documents were created and only need all numbers to be unique, a view is not necessary, and you can just trigger the agent on unprocessed documents.
The temporary number can be used for reference temporarily until a proper sequential number is assigned.
When the scheduled agent runs, it should send authors confirmation with the correct reference number.

Or, you could export to DXL and get the sequence= attribute of the tag. This only works if you're accessing a single instance of the database, though. And the DXL export/XML import is a huge amount of overhead.
Unfortunately, I can't see a way to easily get the sequence number of the note from LotusScript NotesDocument. If you have an active support contract, you could open a Problem Management Report for a software enhancement request ("APAR", in IBM's parlance, though I do not know what its acronym expands to).
Good luck!

Asynchronous message queues and processing like Amazon Simple Queue service in django

There are many activities in an application that involve things like:
sending email or posting to twitter
thumbnailing an image into several sizes
calling a cron job to find connected relationships
A good way to handle these tasks is to write them into an asynchronous queue, from which the operations are then performed.
What django application can be used to implement this kind of functionality locally, like what Amazon Simple Queue Service offers?
I have come across celery. Is it the right thing? Does anything else like it exist?

Beanstalkd can also do what you want, and I've used it (though not from Python) to do some similar things (resizing images, and running other background tasks). There are a couple of Python client libraries to interface with it.
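
The "write into a queue, process elsewhere" idea can be sketched with the standard library alone. The example below is an in-process stand-in, not a Beanstalkd or Celery client: jobs, HANDLERS, enqueue and worker are hypothetical names, and the two handlers only return strings instead of sending mail or resizing images.

```python
import queue
import threading

jobs = queue.Queue()
HANDLERS = {}


def handler(name):
    """Register a function as the handler for jobs called `name`."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@handler("send_email")
def send_email(to):
    return f"email sent to {to}"


@handler("thumbnail")
def thumbnail(path, sizes):
    return [f"{path}@{size}" for size in sizes]


def enqueue(name, **kwargs):
    """Called from the web request: record the job and return immediately."""
    jobs.put((name, kwargs))


def worker(results):
    """Run queued jobs one by one until a None sentinel arrives."""
    while True:
        item = jobs.get()
        try:
            if item is None:
                break
            name, kwargs = item
            results.append(HANDLERS[name](**kwargs))
        except Exception as exc:
            # A failing job is reported and skipped so the worker keeps going.
            print(f"job failed: {exc!r}")
        finally:
            jobs.task_done()
```

The request side only calls enqueue, so it returns without waiting for the work. With Beanstalkd or Celery the queue lives in a separate service and the worker runs in its own process, but the division of roles is the same.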
