How to restrict the resources within one JVM - WSO2

I'm experimenting with WSO2 products, and I'm thinking about a scenario where bad code could take up all the CPU time (e.g. an infinite loop). I tried it with WSO2 AS and two tenants, A and B: A's bad code does affect B, and B's app sees very long response delays or even gets stuck. Is there a way to restrict the CPU usage of a tenant? Thanks!

At the moment, if you need total isolation, you will have to set up your environment in what is known as private jet mode, where each tenant gets its own JVM.
In a shared environment, we have stuck-thread detection, which ensures that critical threads will not run for more than a specified time period. We have plans for CPU usage limiting on a per-tenant basis; this will be available in a future release.

My suggestion would be to not run two tenants in one application server. Run two separate processes on the same machine. Better yet, run two separate processes in separate OS-level containers (such as a jail or an LXC container). Or separate virtual machines if you can't use containers.
Operating systems give you tools for controlling CPU use - rlimit and nice for processes, and implementation-specific facilities for containers and VMs. Because they're implemented in the OS (or virtual machine manager), they are capable of doing this job correctly and reliably. There's no way an application server can do it anywhere near as well.
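As a minimal illustration of those OS-level knobs, here is a Python sketch that launches a child process with a CPU-time cap (rlimit) and a lower scheduling priority (nice). The command and limit values are placeholders, not a recommendation for any particular product.

    import os
    import resource
    import subprocess

    def launch_limited(cmd, cpu_seconds=60, niceness=10):
        """Start `cmd` with an OS-enforced CPU-time cap and a lower priority."""
        def apply_limits():
            # Hard cap on CPU seconds; the kernel terminates the process if exceeded.
            resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
            # Raise the nice value so other processes get scheduled first.
            os.nice(niceness)

        return subprocess.Popen(cmd, preexec_fn=apply_limits)

    # Hypothetical usage: run a per-tenant JVM with a 60-second CPU budget.
    # proc = launch_limited(["java", "-jar", "tenant-app.jar"], cpu_seconds=60)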
In any case, having separate applications share an application server and JVM is a terrible idea which should have been put to death in the '90s. There's just no need for it, and it introduces so many potential headaches.

Related

Threading vs Containers in Orchestration

I am working on re-designing an existing Spring Boot application which is trying to automate a series of pieces of work that the company currently has done manually by its staff. The main application works almost as an orchestrator, in the sense that it is the service that calls other applications to get the overall piece of work done. It invokes 7 sub-systems; 3 of them need to be invoked in some order and complete before the other 4 are invoked, and those 4 can be invoked asynchronously.
All of these sub-systems have now been moved to Spring microservices, and the application I'm working on must invoke these microservices (some in order and some asynchronously). It is possible that my application will be called more than once at the same time, so I need to consider that multiple containers may be needed for each sub-system. I've implemented OpenFeign to invoke each of the microservices.
They also plan, in the not so distant future, to move this to AWS ECS/Fargate; for the time being, however, it will run on Linux VMs, with the containers created on the same private network for communication. I'm wondering if I should remove ThreadPoolTaskExecutor completely and just invoke a new container for each simultaneous request to my application. However, I've read that threads within a process are still faster and have less overhead than creating a process in a container, and considering there are not going to be many containers invoked simultaneously, I'm unsure of the best approach.
Any advice would be appreciated.
Unless each request increases the application's memory consumption by at least an additional 1 GB (in which case the new pod's base memory would be small by comparison), it is overkill to spin up a new pod for each request: that is roughly 200 MB of additional memory per request, and I see neither a benefit nor a need.
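To illustrate the thread-pool alternative the answer favours, here is a rough, language-agnostic sketch in Python (the actual app is Spring Boot, where this would map to a ThreadPoolTaskExecutor and OpenFeign clients). The service names and pool size are hypothetical.

    from concurrent.futures import ThreadPoolExecutor, wait

    # Hypothetical stand-ins for the OpenFeign-invoked microservices.
    def call_service(name: str) -> str:
        ...  # HTTP call to the sub-system
        return f"{name} done"

    ORDERED = ["svc-a", "svc-b", "svc-c"]            # must run in sequence
    PARALLEL = ["svc-d", "svc-e", "svc-f", "svc-g"]  # may run concurrently

    # One bounded pool shared by all simultaneous requests; far cheaper than
    # starting a new container per request.
    pool = ThreadPoolExecutor(max_workers=8)

    def handle_request():
        # Phase 1: the three ordered sub-systems, one after another.
        for name in ORDERED:
            call_service(name)
        # Phase 2: fan out the remaining four and wait for all of them.
        futures = [pool.submit(call_service, name) for name in PARALLEL]
        wait(futures)
        return [f.result() for f in futures]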

Running multiple app instances on a single container in PCF

We have an internal installation of PCF.
A developer wants to push a stateless (obeys 12-factor rules) Node.js app which will spawn other app instances, i.e. leverage Node.js clustering as per https://nodejs.org/api/cluster.html. Hence there would be multiple processes running in each container. Any known issues with this from a PCF perspective? I appreciate it violates the rule/suggestion of one app instance per container, but that is just a suggestion :) All info welcome.
Regards
John
When running an application on Cloud Foundry that spawns child processes, the number one thing you need to watch out for is memory consumption. You set a memory limit when you push your application which is for the entire container. That includes the parent process, whatever child processes are spawned and a little overhead (currently init process, sshd process & health check).
Why is this a problem? Most buildpacks make the assumption that only one process will be running and that it will consume as much memory as possible while staying under the defined memory limit. They attempt to configure the software which is running your application to do this. When you spawn child processes, this breaks the buildpack's assumptions and can create scenarios where your application will exceed the defined memory limit. When this happens, even by one byte, the process will be killed and restarted.
If you're concerned with scaling your application, you should not try to spin off child processes in one extra large container. Instead, let the platform help you and scale up the number of application instances. The platform can easily do this and by using multiple smaller containers you can scale just as well. In fact, if you already have a 12-factor app, it should be well positioned to work in this manner.
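As a rough illustration of the memory arithmetic above, here is a small Python sketch that sizes the number of child processes against the container limit (Cloud Foundry exposes it to the app via the MEMORY_LIMIT environment variable, e.g. '1024m'). The per-process figures are assumptions; measure your own app.

    import os

    def max_child_processes(per_child_mb=300, parent_mb=150, overhead_mb=50):
        """How many child processes fit under the container's memory limit?"""
        limit_mb = int(os.environ.get("MEMORY_LIMIT", "1024m").rstrip("mM"))
        budget = limit_mb - parent_mb - overhead_mb
        return max(0, budget // per_child_mb)

    # With a 1 GB container and the assumptions above, only 2 children fit.
    # Exceeding the limit by even one byte gets the instance killed, so the
    # safer route is usually to scale instances (cf scale -i N) with one
    # process per container.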
Good luck!

Running RabbitMQ+Celery in the same server as production environment

I'm running a Django app in an EC2 instance, which uses RabbitMQ + Celery for task queuing. Are there any drawbacks to running my RabbitMQ node from the same EC2 instance as my production app?
The answer to this question really depends on the context of your application.
When you're faced with scenarios like this, you should always consider a few things.
Separation of concerns
Here, we want to make sure that one system is not responsible for the running of the other systems. This includes things like:
If the EC2 instance running everything goes down, will the remaining tasks in the queue continue running?
If my RAM is full, will all systems remain functional?
Can I scale just one segment of my app without having to redesign the infrastructure?
By having RabbitMQ and Django (behind some kind of WSGI server - Gunicorn, Waitress, etc.) all on one box, you lose a lot of resource contingency.
Although RAM and CPU may be abundant, there is a limit to IO, disk writes, network writes, etc. This means that if for some reason you have a heavy write function, all other systems may suffer as a result. If you have a function that writes heavily to RAM, the same applies.
So really, the downsides of keeping everything on one system, as I see them from your question and my own experience, are as follows.
Multiple points of failure: if your one instance of RabbitMQ fails, your queues and tasks stop working.
If your app starts generating heavy traffic, other systems start to contend for resources.
If any component goes down, that could mean downtime for other services.
System downtime means complete downtime of all components.
Lots of headaches when your application demands more resources and you need minimal downtime.
Lots of web traffic will slow down task running
Lots of task running will slow down web requests
Lots of IO will slow down all the things
The rule of thumb I usually follow is to keep single points of failure far from each other - that way you only need to manage those components. A good use case for this would be to use one EC2 instance for your app, another for your workers and another for your RabbitMQ node. That way you can apply smaller or bigger instances to just the components that need it. You can even create AMIs and autoscaling groups, if that fits your use case.
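To make that split concrete, the only change on the Django/Celery side is to point the broker URL at the dedicated RabbitMQ instance instead of localhost. A minimal sketch, where the project name, host and credentials are placeholders:

    from celery import Celery

    # The broker lives on its own EC2 instance ("rabbit-host.internal" is a
    # placeholder); web and worker instances only need network access to it.
    app = Celery(
        "myproject",
        broker="amqp://user:password@rabbit-host.internal:5672//",
        backend="rpc://",
    )

    @app.task
    def send_report(report_id):
        ...  # heavy work runs on the dedicated worker instance, not the web box

Workers started on the worker instance (e.g. with celery -A myproject worker) then pull from that same broker, so each tier can be resized independently.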
Here are some articles for reference
Separation of concerns
Modern design architectures
Single points of failure
TL;DR: If you can run on one EC2 instance, you should - but make it easy to scale from day one.
Both Joshnidhin and Giannis covered the RAM, IO and CPU aspects.
I have run production apps on single instances with containerization and slept with peace of mind, knowing that if tomorrow lots of people suddenly want what I have built, I can scale pretty quickly by deploying those containers on different instances instead of one single instance.
Docker allows you to put a limit on CPU consumption and memory usage for each container, so you can also be sure that they will not step on each other.
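For example, using the Docker SDK for Python (the CLI equivalents are the --memory and --cpus flags on docker run), per-container limits might look roughly like this; the image name and values are placeholders:

    import docker  # pip install docker

    client = docker.from_env()

    # Placeholder image; the point is the per-container resource limits.
    container = client.containers.run(
        "myapp-worker:latest",
        detach=True,
        mem_limit="512m",      # hard memory cap for this container
        cpu_period=100_000,    # together with cpu_quota below, this caps the
        cpu_quota=150_000,     # container at 1.5 CPUs (quota / period)
    )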
If we take EC2 instance out of this question it becomes:
Are there any drawbacks to running a RabbitMQ node on the same server as my production app?
I would say it depends on various things: the kind of workloads and their composition, the complexity of the workload, whether you expect growth in usage, and so on.
If your workload is well behaved and the server is big enough for both (app + task queue), then why not - there will be only one server to manage. Make sure to protect these two processes from each other by limiting their system resource usage.
If your traffic is not well behaved, then you might want more than one server. In that case having dedicated servers is better (separation of concerns), even though you will have to manage more than one server.
Now back to EC2: all of the above still applies. EC2 makes horizontal scaling of applications easier, so if you have them on separate instances you can scale them individually and cost-effectively. If not, you will waste resources when you scale.

C++ frameworks for distributed computer applications

I have a C++/MFC application ("MyApp") that has always run standalone, but which now needs to be run simultaneously on at least two PCs, and eventually on perhaps up to 20 PCs. What is desirable is:
The system will be deployed with fixed PC names ("host1", "host2", etc) and IP addresses (192.168.1.[host number]) on an isolated network
Turn on the first PC, and it will start all the others using wake-on-lan
One or more instances of "MyApp" automatically start on each node, configure themselves, and do their stuff more-or-less independently (there is no critical inter-node communication once started)
The first node in the system also hosts a user interface that provides some access to other nodes. Communication to/from the other nodes only occurs at the user's request, which is generally sporadic for normal users, and occasionally intense for users who have to test that new features are working as designed across the system.
For simulation/test purposes, MyApp should also be able to be started on a specified subset of arbitrarily-named computers on a LAN.
The point of the last requirement is that when trying to reproduce a problem occurring on a 20-PC system on the other side of the world, I don't have to find 20 PCs from somewhere and hook them up; I can instead start by (politely) stealing some spare CPU cycles from other PCs in my office.
I can imagine building this from the ground up in C++/MFC along these lines:
write a new service that automatically starts with Windows, and which can start/stop instances of MyApp when it receives the corresponding command via the network, or when configured to automatically start MyApp in non-test deployments.
have MyApp detect which node it is in the wider system, and if it is the first node, ensure all other nodes are turned on via wake-on-lan
design and implement a communications protocol between the various local and remote services, and MyApp instances
This really seems quite basic, but also like it would be reinventing wheels if I did it from the ground up. So the central question is are there frameworks with C++ classes that make this kind of thing easy? What options should I investigate? And is referring to this as "distributed computing" misleading? What else would you call it? (Please suggest additional or different tags)

How to serve CPU intensive webservice requests in the cloud?

Background: I'm running a webservice in which each request involves a fair amount of computations (up to 10 seconds on a quadcore machine).
Each request can be broken down to about 150 independent (and equally small) subtasks.
What I'm after: I'm looking for a hosting service that allows me to serve these kinds of requests efficiently and in a scalable manner.
What I've considered: I've looked into Google App Engine and Rackspace.
It seems to me as if GAE is intended for simple requests requiring little resources to process. The problem with something like Rackspace is that I can't tell in advance how many vCPUs I may need (and even if I knew how big future spikes would be, I don't want to sit with, say, 40 servers idling the rest of the time).
Questions:
Would it be possible to use GAE in the following way:
For each request, split it up into 150 subtasks
Process all subtasks independently by making 150 concurrent HTTP requests to the same webapp (but through a different method)
Collect the results from the "subresults" and return a response to the original request.
Is there any possibility that Map Reduce for GAE could be of any help?
Is there any other service better suited for this task?
Yes, this is possible. The usual way would be to use Task Queue, possibly via DeferredTask helper class.
Note that normal web requests (to frontend instances) are limited to 30 s, so doing this synchronously is not guaranteed to succeed. Also note that instances are artificially limited to 10 parallel requests (if multithreading is enabled).
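For illustration, in the first-generation Python runtime the same idea is available through the deferred library (which wraps the Task Queue API, assuming the deferred builtin is enabled in app.yaml). A minimal sketch; the function and queue names are placeholders:

    from google.appengine.ext import deferred

    def process_subtask(request_id, index):
        ...  # one of the ~150 independent computations

    def fan_out(request_id):
        # Each call enqueues a task; App Engine runs them on push-queue
        # workers, outside the 30-second limit of the originating request.
        for i in range(150):
            deferred.defer(process_subtask, request_id, i, _queue="subtasks")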
Yes, this is a job for MapReduce. But note that MapReduce is asynchronous - you give it tasks to do and they will be done sometime in the future.
Given the processing you need, you might want to look at GAE backends (they are long-running, support multithreading and come in different sizes). If you need even more processing power, then you might want to look at Compute Engine.
Unless all of these 150 subtasks are read-only activities, trying to run them all in a single thread is just not safe. Web requests are unreliable - people can cancel, hit refresh if it takes too long, close windows in the middle, or just time out due to network issues. The background HTTP requests, likewise, can have a whole mess of problems. The standard solution is to have your front-end code simply build a list of things that need to be done, so it can get back to the user quickly, and have a back-end 'worker' process handle the (potentially unreliable) subtasks. Depending on what your application is doing, you might bounce the user to a "working" screen (like searching for airfare) where they can safely wait for the results to come up, or it might just be stuffed away as a "pending" job (like ordering something from Amazon).
There are countless ways to handle this basic workflow. If you stick with Google App Engine, they have a "task queue" as part of the platform, providing a simple mechanism for creating and dispatching background tasks. If you go with Rackspace, their cloud offering is less of a unified platform, so you'll have to either roll your own queue or find one that plugs into your setup.