I have a Java/Spring application running in the Amazon AWS cloud.
My server instances sit behind a load balancer and run the same image of a Linux OS, with a Tomcat application server.
They are also connected to S3 as a shared file system (via s3fs) and to an RDS database.
My concern is making sure the state of the different application instances stays synchronized. Today, the point of synchronization is the database, but as soon as in-memory caching is needed, out-of-sync problems appear.
The solution I would like to use is to put a messaging system in place between the applications. For specific reasons I cannot use the Amazon SQS service, so JMS seems to fit my needs, and after some reading, HornetQ also looks like a very good implementation of it. Whenever an application's state changes, it communicates the change to all the other applications. Each application is both a producer and a consumer of the same queue.
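For reference, here is a minimal JMS sketch of that pattern, assuming a ConnectionFactory obtained however your deployment provides one; the topic name is hypothetical. One JMS detail worth noting: a queue delivers each message to exactly one consumer, so for "one change notifies all instances" fan-out, a topic (publish/subscribe) is the matching primitive.

```java
import javax.jms.*;

public class StateChangeBus {
    private static final String TOPIC = "app.state.changes"; // hypothetical name

    // Publish a state change so every other instance sees it.
    public static void publish(ConnectionFactory cf, String payload) throws JMSException {
        Connection conn = cf.createConnection();
        try {
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(session.createTopic(TOPIC));
            producer.send(session.createTextMessage(payload));
        } finally {
            conn.close();
        }
    }

    // Subscribe; each instance receives its own copy of every published change.
    public static Connection subscribe(ConnectionFactory cf) throws JMSException {
        Connection conn = cf.createConnection();
        Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(session.createTopic(TOPIC));
        consumer.setMessageListener(msg -> {
            try {
                String change = ((TextMessage) msg).getText();
                // apply the change to the local in-memory cache here
            } catch (JMSException e) {
                e.printStackTrace();
            }
        });
        conn.start();
        return conn; // keep open for the lifetime of the app
    }
}
```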
Since we are in a dynamic system where servers and IP addresses are created and deleted automatically, automatic discovery of instances seems to be the best approach.
But in AWS, broadcast is not possible!
For HornetQ, I saw a kind of workaround that uses JGroups in addition. But for me that is a second framework to investigate and learn: twice the work, and no longer an out-of-the-box solution.
What is your opinion? Has anyone already built a solution for similar needs?
Maybe other out-of-the-box solutions exist?
Thanks in advance for your answer!
In my experience you could try TCPGOSSIP: it is a JGroups discovery protocol that does not rely on multicast, and HornetQ can be configured to use it for cluster discovery.
See https://docs.jboss.org/jbossclustering/cluster_guide/5.1/html/jgroups.chapt.html
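As a rough illustration, a JGroups stack using TCPGOSSIP might look like the sketch below; the GossipRouter hosts and ports are hypothetical, and the exact protocol list depends on your JGroups version:

```xml
<!-- jgroups-aws.xml: TCP transport + TCPGOSSIP discovery, no multicast required.
     initial_hosts points at one or more GossipRouter processes you run yourself. -->
<config xmlns="urn:org:jgroups">
    <TCP bind_port="7800"/>
    <TCPGOSSIP initial_hosts="10.0.0.5[12001],10.0.0.6[12001]"/>
    <MERGE3/>
    <FD_ALL/>
    <VERIFY_SUSPECT/>
    <pbcast.NAKACK2 use_mcast_xmit="false"/>
    <UNICAST3/>
    <pbcast.STABLE/>
    <pbcast.GMS/>
    <FRAG2/>
</config>
```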
I have a NodeJs app that runs as an event listener: it listens for events external to my application over a websocket.
I need each incoming event to be processed exactly once by my NodeJs app.
However, it's also crucial that this NodeJs app can auto-scale up/down when needed and is highly available, so that it doesn't become a bottleneck.
Usually, when it comes to scaling and HA, the first thing that comes to mind is to run a few instances behind a load balancer, or multiple containers on something like ECS. But doing so introduces multiple instances of the NodeJs app, which means each event from the websocket would be processed more than once, once by every instance/container that receives it.
What would be a good solution and design to tackle such a problem?
Not sure I fully understand the situation here, but I think what you are saying is that you have a socket server that emits to other services, and that a single instance, even with dedicated resources, is subject to bottlenecks.
Assuming that is in line with the question, what you probably want to look at (not sure if you're using socket.io or not) is the Redis socket.io adapter. It essentially uses Redis to share socket state, so you can cluster your socket server without it sending duplicates or missing users.
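Independent of socket.io, the underlying trick for "process each event exactly once across N instances" is to claim each event atomically in Redis before handling it. A minimal sketch of that idea, shown in Java with Jedis for concreteness (key names are hypothetical, and it assumes every event carries a unique ID):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class EventClaimer {
    // Returns true only for the first instance that claims this event ID;
    // every other instance sees the key already set and skips the event.
    public static boolean claim(Jedis jedis, String eventId) {
        // SET key value NX EX 3600: set only if absent, expire after an hour
        String reply = jedis.set("event:" + eventId, "claimed",
                SetParams.setParams().nx().ex(3600));
        return "OK".equals(reply);
    }
}
```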
On your question about scale, you would definitely want to use containers for this. We actually use DigitalOcean Apps as an easy way to deploy our containers without having to manage Kubernetes and Docker images. The only downside right now is no auto-scaling, but scaling out is just a click of a button, and with alerts set up we know when to scale up or down.
With this setup, our socket server runs against a managed Redis server; when we need more socket servers we just tick the count up and get more throughput.
I would like to implement a queuing mechanism for sending email via PHPMailer on Amazon EC2. I have set up Beanstalkd correctly on the server and can access it via a console, but the mail doesn't seem to go through (I have tried various combinations of the sample code). In addition, do I need to set up a cron job that would call the producer or consumer scripts?
Does anyone have working code for sending email via PHPMailer/Pheanstalk on Amazon EC2?
Thanks.
Beanstalkd is great, and I use it myself; however, don't use it for this: it's reinventing the wheel in a bad way. Instead, install a local mail server such as Postfix and get that to do your queuing for you. This is also much, much simpler, faster, and easier to control. Mail servers are built for managing queues, and they are extremely good at it.
Before you do so, get your mail sending script working – there's no point in even attempting to get something more complex working until you've done that. Also be aware that sending email from EC2 is difficult – Amazon wants you to use their SES service rather than sending directly – you may find sending is blocked altogether. Read the PHPMailer troubleshooting guide to see how to diagnose that.
I'm designing a web log analytics system.
I found an architecture with Django (back end & front end) + Kafka + Spark.
I also found a similar system at this link: http://thevivekpandey.github.io/posts/2017-09-19-high-velocity-data-ingestion.html which describes a similar architecture.
But I'm confused about the role of the Kafka consumer. It would be a service independent of Django, right?
So if I want to plot real-time data on a front-end chart, how do I attach the consumer to Django?
It seems ridiculous to place both the Kafka consumer and the producer inside Django: a request from the SDK would come to Django, be passed to a Kafka topic (producer), and then come back to Django (consumer) for processing. Why don't we go directly? That looks simpler and better.
Please help me understand the role of the Kafka consumer: where should it live, and how do I connect it to my front end?
Thanks & best Regards,
Jame
The article describes what happened without Kafka:
We saw that in times of peak load, data ingestion was not working properly: it was taking too long to connect to MongoDB and requests were timing out. This was leading to data loss.
So the main point of introducing Kafka and a Kafka consumer is to avoid putting too much load on the DB layer, handling it gracefully with a messaging layer in between. To be honest, any message queue could be used in this case, not only Kafka.
The Kafka consumer can be part of the web layer. That wouldn't be optimal, though, because you want separation of concerns (which makes the system more reliable in case of failures) and the ability to scale things independently.
It's better to implement the Kafka Consumer as a separate service if the concerns mentioned above really matter (scalability and reliability) and it's easy for you to do operationally (because you need to deploy, monitor, etc. a new service now). In the end it's a classic monolith vs. microservices dilemma.
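To make the "separate service" shape concrete, here is a minimal sketch of a standalone Kafka consumer in Java (topic name, group ID, and the storage call are hypothetical):

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class IngestWorker {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "weblog-ingest"); // hypothetical group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("weblog-events")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    saveToDb(record.value()); // write to MongoDB at a pace the DB can absorb
                }
            }
        }
    }

    private static void saveToDb(String event) {
        // stub: insert into MongoDB (or any store) here
    }
}
```

Django only produces to the topic; this worker runs, scales, and fails independently. For the real-time chart, a common approach is to have a consumer (or a second consumer group) push updates to the browser, e.g. over websockets or Django Channels, rather than routing them back through the normal request cycle.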
So, here's the thing. I really like the idea of microservices and want to set it up and test it before deciding whether I want to use it in production. If I do, I want to slowly chip pieces off my old Rails app and move the logic into microservices. I think I can do this with HAProxy, setting up different routing based on URLs, so this part should be covered.
My next biggest concern is that I don't want too much overhead on the infrastructure side to keep everything running smoothly. I would prefer low configuration and ease of development, testing, and deployment.
Now, I want to know the benefits and downsides of each style: Akka (Cluster) vs. something like Kubernetes (maybe even fabric8 on top of it).
I also worry about fault tolerance. I don't know how you do that with Kubernetes. Do you then have to include some message queue to ensure your messages don't get lost? And also run multiple queues in case one goes down? Or just retry until the queue comes back up? Akka actors already have that, right? Retrying and mailboxes? What are the strategies for fault tolerance in microservices? Do they differ for each approach?
Someone please enlighten me! ;)
I don't know much about Akka, but from a quick read it seems to be an application framework. Kubernetes sits at a somewhat lower level: it runs your containers and manages them for you. We don't have a concept of queues or mailboxes.
Kubernetes will soon have L7 load balancing so you can do URL maps.
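For illustration, that URL-based routing is expressed in today's Kubernetes API as an Ingress; a minimal sketch with hypothetical paths and service names:

```yaml
# Route URL prefixes to different backend services (names are hypothetical).
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: url-map
spec:
  rules:
  - http:
      paths:
      - path: /api               # new microservice handles this prefix
        pathType: Prefix
        backend:
          service:
            name: new-microservice
            port:
              number: 80
      - path: /                  # everything else still goes to the old app
        pathType: Prefix
        backend:
          service:
            name: legacy-rails-app
            port:
              number: 80
```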
As for fault tolerance: Kubernetes ensures that your stated intentions remain true, e.g. "run N copies of this container". That container might be an Akka app or might be MySQL; it doesn't matter.
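That "stated intention" is literally a field in the spec; a minimal sketch of a Deployment asking for three replicas (names and image are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: akka-service            # hypothetical
spec:
  replicas: 3                   # the stated intention: keep 3 copies running
  selector:
    matchLabels:
      app: akka-service
  template:
    metadata:
      labels:
        app: akka-service
    spec:
      containers:
      - name: app
        image: example/akka-service:1.0   # hypothetical image
```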
There are a bunch of guides on Docker + Akka. Kubernetes makes managing docker containers easier, but the app is still yours :)
I'm working on the design of a remote control application. From my iPhone or a web browser, I'll send a few commands. Soon my home computer will perform the commands and send back results. I know there are remote desktop apps, but I want something programmable, something simpler, and something that I wrote.
My current direction is to use Amazon Simple Queue Service (SQS) as the message bus. The iPhone places some messages in a queue. My local Java/JRuby program notices the messages on the queue, performs the work and sends back status via a different queue.
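A minimal sketch of the home-computer side with the AWS SDK for Java (queue URLs and the command dispatch are hypothetical):

```java
import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;

public class RemoteControlWorker {
    // Hypothetical queue URLs
    private static final String COMMANDS = "https://sqs.us-east-1.amazonaws.com/123456789012/commands";
    private static final String STATUS   = "https://sqs.us-east-1.amazonaws.com/123456789012/status";

    public static void main(String[] args) {
        AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
        while (true) {
            ReceiveMessageRequest req = new ReceiveMessageRequest(COMMANDS)
                    .withWaitTimeSeconds(20)       // long polling: one request per 20s when idle
                    .withMaxNumberOfMessages(1);
            for (Message m : sqs.receiveMessage(req).getMessages()) {
                String result = handle(m.getBody());               // dispatch to a short whitelist
                sqs.sendMessage(STATUS, result);                   // report back on the other queue
                sqs.deleteMessage(COMMANDS, m.getReceiptHandle()); // ack so it isn't redelivered
            }
        }
    }

    private static String handle(String command) {
        // stub: recognize only a short, whitelisted set of commands
        return "OK: " + command;
    }
}
```

With 20-second long polling, an idle worker makes roughly 4,320 requests per day per queue, which stays comfortably inside the pricing mentioned above.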
This will be a very low-volume application. At $1.00 for a million requests (plus a handful of data transfer charges), Amazon SQS looks a lot more affordable than having my own server of any type. And super reliable, that's important for me too.
Are there better/standard toolkits or architectures for this kind of remote control? Cost is not a big issue, and I like how much I learn by doing it myself.
I'm moderately concerned about security, but doubt it will be a problem. The list of commands recognized will be very short, and only recognized in specific contexts. No "erase hard drive" stuff.
update: I'll probably distribute these programs to some other people who want the same function, but who don't have Amazon SQS accounts. For now, they'll use anonymous access to my queues, with random 80-character queue names.
Well, I think it's a clever approach -- and as you said, the costs for your little traffic aren't even worth mentioning.
As I mentioned in the comment, it's a good way to leave your home machine behind your firewall and not have an open port on the internet.
I would suggest using OnlineMQ.com as a start; they have a free package.