gunicorn + django + telegram + mqtt client

We use gunicorn with Django and django-telegrambot. We also have an MQTT client in a separate app. When certain MQTT messages arrive we send Telegram messages, and vice versa. The problem is that when we run gunicorn with multiple workers, we get multiple MQTT clients, so when an MQTT message arrives we send the same Telegram message multiple times.
When we use gunicorn's preload option with multiple workers, we only have one MQTT client, but then all processes share the same Telegram TCP connection and we get weird SSL errors. As an alternative we could use only one process with multiple threads, but then sometimes MQTT and Telegram messages are not processed (we don't know why).
Is there a way to get this running?
Instead of using webhooks one could use bot polling, but django-telegrambot says:
Polling mode by management command (an easy way to run the bot on a local machine, not recommended in production!)

I'm not familiar with the django-telegrambot library, so I can't judge why the authors chose to make this statement (maybe ask on the GitHub repository …). However, both polling and webhooks are officially supported by Telegram (see here). IMHO both have pros and cons. Webhooks may have a slight performance benefit over polling, but they also require more work to set up. Polling requires you to continuously fetch updates, which can be seen as a downside. OTOH with webhooks you have to have a webserver running. For small to medium sized bots (in terms of user numbers), polling should be fine - I'm using polling without problems for my (rather small) bots.
Please take this with a grain of salt as I'm far from being an expert on networking topics.
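If you do try polling, a minimal sketch using python-telegram-bot's pre-v20 Updater API might look like the following. The idea (an assumption on my part, not something django-telegrambot prescribes) is to run it in one dedicated process, e.g. a management command, so there is exactly one bot and one MQTT client regardless of how many gunicorn workers serve web requests:

```python
# Minimal polling sketch with python-telegram-bot (pre-v20 API).
# Running this in a single dedicated process avoids the
# one-bot-per-gunicorn-worker problem described above.
from telegram.ext import Updater, CommandHandler

def start(update, context):
    # Reply to /start; replace with your real handlers.
    update.message.reply_text("Bot is alive")

def main():
    updater = Updater(token="YOUR_BOT_TOKEN", use_context=True)  # placeholder token
    updater.dispatcher.add_handler(CommandHandler("start", start))
    updater.start_polling()   # long-polls getUpdates in a background thread
    updater.idle()            # block until SIGINT/SIGTERM

if __name__ == "__main__":
    main()
```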

Related

How to create a full-duplex communication between API and various clients?

In my website, I'd like to create a public API that would allow clients (unknown people) to interact with my services. A classic REST API would work well in that case.
However, I need to be able to send events to the clients too. These events are not related to client HTTP requests. I saw "webhooks" are a way to deal with this. If I understood well, with webhooks, my service would send HTTP POST requests to a URL specified by the client, with event data inside this request.
I think websocket can be used too as a solution for this full-duplex communication need.
What I want to know, is which method would be the simplest for clients to implement to talk to my services? Simplicity is the key point here.
The hard thing is that my clients can use various technologies (full websites with HTTP servers, iOS/Android apps without server, etc.)
What are the implications for clients if I use REST API + webhooks? Websockets? etc?
How to make a choice?
Hope it's clear (but not sure). Thanks :)
I would consider webhooks the simpler solution. And yes, you understood it well: with webhooks, a developer using your API registers a URL where your backend POSTs event data. It's a common pattern used in APIs.
A great benefit of using a webhooks design is that a client/server connection does not need to stay open. After all, if events occur infrequently (i.e. only a few times per hour, or per day) or keeping a consistent connection open is a challenge, establishing a connection only when it's needed is rather efficient.
The challenge of using webhooks for you, the API provider, is designing an evented backend system that deals with change of state detection and reliable webhook calling mechanisms (i.e. dealing with webhook receiver URLs that are unresponsive or throw errors).
The challenge of using webhooks on the developer end is that they need to stand up a reliable web server that listens for the event POST data from your server.
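To give a feel for that developer-side work, a webhook receiver can be a single endpoint. Here is a minimal sketch in Python/Flask; the endpoint path and payload fields are made up for illustration, not part of any specific API:

```python
# Minimal webhook receiver sketch (Flask).
# The /webhooks/events path and payload fields are illustrative only.
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/events", methods=["POST"])
def receive_event():
    event = request.get_json(force=True)
    # Persist or process the event, then acknowledge quickly so the
    # sender does not treat this hook as unresponsive.
    print("received event:", event.get("type"))
    return "", 204

if __name__ == "__main__":
    app.run(port=8000)
```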
Realtime APIs (i.e. based on Websockets, Bayeux/CometD) are really swell because that live connection means that new connections do not have to be established, which is particularly useful with very chatty sessions. Additionally, there are a lot of projects and companies out there that have taken care of the heavy lifting on the server and client with fully-baked libraries. One of those is Fanout.io which makes pushing messages between the client/server possible with just a few lines of code, utilizing XMPP, Bayeux, and Websockets when possible.
(I am not affiliated with Fanout, but I have used it)
So, to sum it up, webhooks are simple mostly because you are already familiar with the architecture needed to implement them, and the pattern is a well traveled one. If you are leaning toward a persistent connection approach, I would look at tools/platforms like Fanout because it takes care of the heavy lifting (i.e. subscribe/publish, concurrent connection scale, client/server libraries).

Ideas for scaling chat in AWS?

I'm trying to come up with the best solution for scaling a chat service in AWS. I've come up with a couple potential solutions:
Redis Pub/Sub - When a user establishes a connection to a server that server subscribes to that user's ID. When someone sends a message to that user, a server will perform a publish to the channel with the user's id. The server the user is connected to will receive the message and push it down to the appropriate client.
SQS - I've thought of creating a queue for each user. The server the user is connected to will poll (or use SQS long-polling) that queue. When a new message is discovered, it will be pushed to the user from the server.
SNS - I really liked this solution until I discovered the 100 topic limit. I would need to create a topic for each user, which would only support 100 users.
Are there any other ways chat could be scaled using AWS? Is the SQS approach viable? How long does it take AWS to add a message to a queue?
Building a chat service isn't as easy as you would think.
I've built full XMPP servers, clients, and SDKs and can attest to some of the subtle and difficult problems that arise. A prototype where users see each other and chat is easy. A full-featured system with account creation, security, discovery, presence, offline delivery, and friend lists is much more of a challenge. To then scale that across an arbitrary number of servers is especially difficult.
PubSub is a feature offered by chat services (see XEP-0060) rather than a traditional means of building a chat service. I can see the allure, but PubSub can have drawbacks.
Some questions for you:
Are you doing this over the Web? Are users going to be connecting and long-polling, or do you have a WebSockets solution?
How many users? How many connections per user? Ratio of writes to reads?
Your idea for using SQS that way is interesting, but probably won't scale. It's not unusual to have 50k or more users on a chat server. If you're polling an SQS queue for each user, you're not going to get anywhere near that. You would be better off having a queue for each server, with the server polling only that queue. Then it's on you to figure out which server a user is on and put the message into the right queue.
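A rough boto3 sketch of that queue-per-server idea; the queue URL, region, message fields, and the in-memory connection registry are assumptions for illustration:

```python
# Sketch: each chat server long-polls its own SQS queue and routes the
# message to the locally connected recipient.
import json
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/chat-server-1"  # placeholder

connections = {}  # user_id -> connection handle (app-specific)

def poll_forever():
    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,  # long polling
        )
        for msg in resp.get("Messages", []):
            body = json.loads(msg["Body"])
            conn = connections.get(body["to_user"])
            if conn is not None:
                conn.send(body["text"])  # push to the connected client
            sqs.delete_message(QueueUrl=QUEUE_URL,
                               ReceiptHandle=msg["ReceiptHandle"])
```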
I suspect you'll want to go with something like:
A big RDS database on the backend.
A bunch of front-end servers handling the client connections.
Some middle tier Java / C# code tracking everything and routing messages to the right place.
To get an idea of the complexity of building a chat server, read the XMPP RFCs:
RFC 3920
RFC 3921
SQS/SNS might not fit your chatty requirement. We have observed some latency in SQS which might not be suitable for a chat application. Also, SQS does not guarantee FIFO. I have worked with Redis on AWS. It is quite easy and stable if it is configured with all the best practices in mind.
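For comparison, the Redis Pub/Sub option from the question maps almost directly onto redis-py. A minimal sketch, where the host and the "user:<id>" channel naming are just examples:

```python
# Sketch of the per-user Redis Pub/Sub channel approach with redis-py.
import redis

r = redis.Redis(host="my-redis.example.com", port=6379)  # placeholder host

# On the server holding the user's connection: subscribe to their channel.
pubsub = r.pubsub()
pubsub.subscribe("user:42")

# On whichever server receives a message for that user: publish it.
r.publish("user:42", "hello from another server")

# The subscribing server consumes and pushes to its connected client.
for message in pubsub.listen():
    if message["type"] == "message":
        print("deliver to local client:", message["data"])
```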
I've thought about building a chat server using SNS, but instead of doing one topic per user, as you describe, doing one topic for the entire chat system and having each server subscribe to the topic - where each server is running some sort of long polling or web sockets chat system. Then, when an event occurs, the data is sent in the payload of the SNS notification. The server can then use this payload to determine what clients in its queue should receive the response, leaving any unrelated clients untouched. I actually built a small prototype for this, but haven't done a ton of testing to see if it's robust enough for a large number of users.
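A rough boto3 sketch of that single-topic prototype; the topic ARN, endpoint URL, and payload fields are placeholders:

```python
# Sketch of the one-topic-for-the-whole-system idea: every chat server
# subscribes its own HTTP endpoint, and events carry enough payload for
# each server to decide which of its local clients should receive them.
import json
import boto3

sns = boto3.client("sns", region_name="us-east-1")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:chat-events"  # placeholder

# One-time, per server: subscribe this server's HTTP endpoint to the topic.
sns.subscribe(TopicArn=TOPIC_ARN, Protocol="http",
              Endpoint="http://chat-server-1.example.com/sns")

# When an event occurs, publish it once; every server receives it and
# forwards it only to the clients it is holding connections for.
sns.publish(TopicArn=TOPIC_ARN,
            Message=json.dumps({"to_user": 42, "text": "hi"}))
```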
Hi, realtime chat doesn't work well with SNS. It's designed for email/SMS or services where a latency of one or a few seconds is acceptable. In realtime chat, a latency of one or a few seconds is not acceptable.
Check this link:
Latency (i.e. “Realtime”) for PubNub vs SNS
Amazon SNS provides no latency guarantees, and the vast majority of latencies are measured over 1 second, and often many seconds slower. Again, this is somewhat irrelevant; Amazon SNS is designed for server-to-server (or email/SMS) notifications, where a latency of many seconds is often acceptable and expected.
Because PubNub delivers data via an existing, established open network socket, latencies are under 0.25 seconds from publish to subscribe in the 95th percentile of the subscribed devices. Most humans perceive something as “realtime” if the event is perceived within 0.6 – 0.7 seconds.
The way I would implement such a thing (if not using some framework) is the following:
Have a webserver (on EC2) which accepts the messages from the users.
Use an Auto Scaling group for this webserver. The webserver can update any DB on Amazon RDS, which can scale easily.
If you are using your own DB, you might consider decoupling the DB from the webserver using SQS (by sending all requests to the same queue); then you can have a consumer which consumes the queue. This consumer can also be placed behind an Auto Scaling group, so that if the queue grows beyond X messages, it will scale (you can set this up with alarms).
SQS normally updates pretty fast, i.e. less than one second from the moment you send a message to the moment it appears on the queue, and rarely more than that.
Since the new AWS IoT service started to support WebSockets, keepalive, and Pub/Sub a couple of months ago, you can easily build an elastic chat on it. AWS IoT is a managed service with lots of SDKs for different languages, including JavaScript, that was built to handle monster loads (billions of messages) with zero administration.
You can read more about the update here:
https://aws.amazon.com/ru/about-aws/whats-new/2016/01/aws-iot-now-supports-websockets-custom-keepalive-intervals-and-enhanced-console/
Edit:
Last SQS update (2016/11): you can now use Amazon Simple Queue Service (SQS) for applications that require messages to be processed in a strict sequence and exactly once using First-in, First-out (FIFO) queues. FIFO queues are designed to ensure that the order in which messages are sent and received is strictly preserved and that each message is processed exactly once.
Source:
https://aws.amazon.com/about-aws/whats-new/2016/11/amazon-sqs-introduces-fifo-queues-with-exactly-once-processing-and-lower-prices-for-standard-queues/
From now on, implementing SQS + SNS looks like a good idea too.
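For reference, a minimal boto3 sketch of sending to a FIFO queue; the queue URL and the group/deduplication IDs are illustrative:

```python
# Sketch: sending chat messages through an SQS FIFO queue, which preserves
# order per MessageGroupId and deduplicates by MessageDeduplicationId.
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/chat.fifo"  # placeholder

sqs.send_message(
    QueueUrl=QUEUE_URL,
    MessageBody='{"to_user": 42, "text": "hi"}',
    MessageGroupId="user-42",            # ordering scope, e.g. per conversation
    MessageDeduplicationId="msg-0001",   # unique per logical message
)
```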

Django node.js socket.io

I am trying to make a realtime messaging application. There will be two distinct servers (node.js and Django). When a user sends a message to another user, the message will be stored in the database, and then node.js will send a message to the receiver like "You have a new message!". For that I am planning to call a URL which node.js serves. So node.js and Django will interact with each other. And what is the best way to send a message to a specific client? (I keep clients with their IDs in an associative array.)
What do you think about that? Is it efficient, or do you suggest a better way to do this?
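For illustration, a minimal Django-side sketch of the plan described above; the /notify URL and the Message model fields are hypothetical:

```python
# Sketch: a Django view saves the message, then asks the node.js server to
# push a notification to the recipient.
import requests
from django.http import JsonResponse
from myapp.models import Message  # hypothetical app/model

NODE_NOTIFY_URL = "http://localhost:3000/notify"  # hypothetical node.js endpoint

def send_message(request):
    msg = Message.objects.create(
        sender_id=request.user.id,
        recipient_id=request.POST["recipient_id"],
        text=request.POST["text"],
    )
    # Fire the notification; node.js looks up the recipient's socket
    # in its associative array and emits "You have a new message!".
    requests.post(NODE_NOTIFY_URL,
                  json={"recipient_id": msg.recipient_id}, timeout=2)
    return JsonResponse({"id": msg.id})
```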
Now that I understand more about what you're trying to do, here's my answer. Just keep in mind that this only reflects my opinion, and I bet that many others would argue about it.
It all depends on how much traffic you expect your application to have. If it's not a high-traffic application, then run-time efficiency is insignificant compared to development efficiency, so choose the technology you feel most comfortable with.
If, though, you do aim for a high-traffic application, then I believe this setup is not a good one.
First of all, while HTTP-based communication between servers might seem comfortable, you are dealing with the overhead of HTTP over TCP (since HTTP is based on TCP). So plain TCP sockets scale better; on the other hand, if you write the socket server in Python you can run it in the same process as Django and just use it as an object from Django (you're entering the realm of threads here). But that's problematic if you have a few web instances; again, it depends on how much traffic you expect.
As for your choice for implementing the messaging server, I've never tested node.js, but I believe that in benchmark tests it won't compare to something written in Erlang or Java NIO. For example: JAVA AIO (NIO.2) VS NODEJS

django and comet

What should I use to implement Comet on Django?
Everything I've found on Google seems outdated. Some people point to orbited.org or hookbox.org, but both of them are just dead now. How do people solve this problem nowadays?
I would check out Pusher which is a third party service that will allow you to push events to your browser with a drop-dead simple API. They have a free sandbox plan that comes with 100k messages per day and 20 connections.
Alternatively, you could run APE on your server to push events down.
Django isn't really designed for long polling and comet.
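To give a sense of how small the Pusher integration is on the Django side, here is a sketch using the pusher Python package; the credentials, cluster, and channel/event names are placeholders, and the exact API may differ between package versions:

```python
# Sketch: pushing an event to the browser from Django via Pusher.
import pusher

pusher_client = pusher.Pusher(app_id="APP_ID", key="KEY",
                              secret="SECRET", cluster="eu")  # placeholders

def notify_new_comment(comment_text):
    # Browsers subscribed to the "comments" channel via pusher-js
    # receive this "new-comment" event.
    pusher_client.trigger("comments", "new-comment", {"text": comment_text})
```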

Message queuing solutions?

(Edited to try to explain better)
We have an agent, written in C++ for Win32. It needs to periodically post information to a server. It must support disconnected operation. That is: the client doesn't always have a connection to the server.
Note: This is for communication between an agent running on desktop PCs, to communicate with a server running somewhere in the enterprise.
This means that the messages to be sent to the server must be queued (so that they can be sent once the connection is available).
We currently use an in-house system that queues messages as individual files on disk, and uses HTTP POST to send them to the server when it's available.
It's starting to show its age, and I'd like to investigate alternatives before I consider updating it.
It must be available by default on Windows XP SP2, Windows Vista and Windows 7, or must be simple to include in our installer.
This product will be installed (by administrators) on a couple of hundred thousand PCs. They'll probably use something like Microsoft SMS or ConfigMgr. In this scenario, "frivolous" prerequisites are frowned upon. This means that, unless the client-side code (or a redistributable) can be included in our installer, the administrator won't be happy. This makes MSMQ a particularly hard sell, because it's not installed by default with XP.
It must be relatively simple to use from C++ on Win32.
Our client is an unmanaged C++ Win32 application. No .NET or Java on the client.
The transport should be HTTP or HTTPS. That is: it must go through firewalls easily; no RPC or DCOM.
It should be relatively reliable, with retries, etc. Protection against replays is a must-have.
It must be scalable -- there's a lot of traffic. Per-message impact on the server should be minimal.
The server end is C#, currently using ASP.NET to implement a simple HTTP POST mechanism.
(The slightly odd one). It must support client-side in-memory queues, so that we can avoid spinning up the hard disk. It must allow flushing to disk periodically.
It must be suitable for use in a proprietary product (i.e. no GPL, etc.).
How is your current solution showing its age?
I would push the logic on to the back end, and make the clients extremely simple.
Messages are simply stored in the file system. Have the client write to c:/queue/{uuid}.tmp. When the file is written, rename it to c:/queue/{uuid}.msg. This makes writing messages to the queue on the client "atomic".
A C++ thread wakes up, scans c:\queue for "*.msg" files, and if it finds one it checks for the server and HTTP POSTs the message to it. When it receives the 200 status back from the server (i.e. the server has got the message), it can delete the file. It only scans for *.msg files. The *.tmp files may still be being written to, and you'd have a race condition trying to send a msg file that was still being written. That's what the rename from .tmp is for. I'd also suggest scanning by creation date so early messages go first.
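The pattern is small enough to sketch; here it is in Python purely to illustrate the write/rename/scan/send flow described above (the original client is C++, and the queue directory and server URL are placeholders):

```python
# Illustration of the file-based queue: write .tmp, atomically rename to
# .msg, then a sender loop scans for .msg files and POSTs them.
import os, glob, uuid
import requests

QUEUE_DIR = r"c:\queue"
SERVER_URL = "http://server.example.com/post"  # placeholder

def enqueue(payload: bytes):
    name = str(uuid.uuid4())
    tmp = os.path.join(QUEUE_DIR, name + ".tmp")
    msg = os.path.join(QUEUE_DIR, name + ".msg")
    with open(tmp, "wb") as f:
        f.write(payload)
    os.rename(tmp, msg)  # atomic "commit" into the queue

def send_pending():
    # Oldest first, so early messages go first.
    for path in sorted(glob.glob(os.path.join(QUEUE_DIR, "*.msg")),
                       key=os.path.getctime):
        with open(path, "rb") as f:
            resp = requests.post(SERVER_URL, data=f.read())
        if resp.status_code == 200:
            os.remove(path)  # only delete after the server acknowledged
```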
Your server receives the message, and here it can do any necessary dupe checking. Push this burden onto the server to centralize it. You could simply record every uuid for every message to do duplicate elimination. If that list gets too long (I don't know your traffic volume), perhaps you can cull items older than 30 days (I also don't know how long your clients can remain offline).
This system is simple, but pretty robust. If the file sending thread gets an error, it will simply try to send the file next time. The only time you should be getting a duplicate message is in the window between when the client gets the 200 ack from the server and when it deletes the file. If the client shuts down or crashes at that point, you will have a file that has been sent but not removed from the queue.
If your clients are stable, this is a pretty low risk. With dupe checking based on the message ID, you can mitigate that at the cost of some bookkeeping; maintaining a list of uuids isn't spectacularly daunting, but again it depends on your message volume and other performance requirements.
The fact that you are allowed to work "offline" suggests you have some "slack" in your absolute messaging performance.
To be honest, the requirements listed don't make a lot of sense and show you have a long way to go in your MQ learning. Given that, if you don't want to use MSMQ (probably the easiest overall on Windows -- but with [IMO severe] limitations), then you should look into:
qpid - Decent use of AMQP standard
zeromq - (the best, IMO, technically but also requires the most familiarity with MQ technologies)
I'd recommend rabbitmq too, but that's an Erlang server and last I looked it didn't have usable C or C++ libraries. Still, if you are shopping for MQ, take a look at it...
[EDIT]
I've gone back and reread your reqs as well as some of your comments and think, for you, that perhaps client MQ -> server is not your best option. I would maybe consider letting your client -> server operations be HTTP POST or SOAP and allow the HTTP endpoint in turn queue messages on your MQ backend. IOW, abstract away the MQ client into an architecture you have more control over. Then your C++ client would simply be HTTP (easy), and your HTTP service (likely C# / .Net from reading your comments) can interact with any MQ backend of your choice. If all your HTTP endpoint does is spawn MQ messages, it'll be pretty darned lightweight and can scale through all the traditional load balancing techniques.
Last time I wanted to do any messaging I used C# and MSMQ. There are MSMQ libraries available that make using MSMQ very easy. It's free to install on both your servers, and it has never lost a message to this day. It handles reboots etc. all by itself. It's a thing of beauty, and hundreds of thousands of messages are processed daily.
I'm not sure why you ruled out MSMQ and I didn't get point 2.
Quite often for queues we just dump record data into a database table and another process lifts rows out of the table periodically.
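A bare-bones sketch of that table-as-queue pattern; the table name and columns are made up, and SQLite is used here only to show the shape:

```python
# Sketch: a database table used as a queue. A producer inserts rows; a
# separate process periodically lifts unprocessed rows and marks them done.
import sqlite3, time

db = sqlite3.connect("queue.db")
db.execute("""CREATE TABLE IF NOT EXISTS outbox (
                 id INTEGER PRIMARY KEY AUTOINCREMENT,
                 body TEXT NOT NULL,
                 sent INTEGER NOT NULL DEFAULT 0)""")

def enqueue(body):
    db.execute("INSERT INTO outbox (body) VALUES (?)", (body,))
    db.commit()

def drain_once():
    rows = db.execute("SELECT id, body FROM outbox WHERE sent = 0 "
                      "ORDER BY id LIMIT 100").fetchall()
    for row_id, body in rows:
        print("sending:", body)        # replace with the real HTTP POST
        db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    db.commit()

if __name__ == "__main__":
    while True:          # the "other process" that lifts rows periodically
        drain_once()
        time.sleep(5)
```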
How about using the Asynchronous Agents Library that ships with Visual Studio 2010 (part of the Concurrency Runtime)? It is still in beta, though.
http://msdn.microsoft.com/en-us/library/dd492627(VS.100).aspx