Questions regarding a Cloud Foundry application with multiple processes

I am reading about the concepts of sidecars and multi-process applications in Cloud Foundry:
https://docs.cloudfoundry.org/devguide/multiple-processes.html
https://docs.cloudfoundry.org/devguide/sidecars.html
I have a few questions which I could not figure out myself.
Q1: When to use a CF application with a sidecar vs. when to use a CF application with multiple processes
I understand that the primary difference between a sidecar and a multi-process application is the container: a sidecar process runs in the same container as the main process, whereas in a multi-process application each process runs in a separate container.
What I could not figure out is in which scenarios we should use a sidecar and in which scenarios we should use a multi-process application.
Q2: Different processes in different technologies
In an application with multiple processes, if I want to run two processes in two different technologies (one process in Java, another in Go or anything else), how do I do so?
This question comes to my mind because the buildpack configuration goes with the application, not with the process. So I get the impression that all processes must use the same technology (or can we provide multiple buildpacks here?).
Here is a sample manifest.yml that I am using:
applications:
- name: multi-process1
  disk_quota: 1G
  path: target/SampleApp-1.jar
  instances: 1
  memory: 2G
  buildpacks:
  - java_buildpack
  env:
    CONFIG_SERVER_PORT: 8080
  processes:
  - type: 'task'
    command: '$PWD/.java-buildpack/open_jdk_jre/bin/java -jar $PWD/BOOT-INF/lib/mycustomlogger-1.jar'
    memory: 512MB
    instances: 1
    health_check:
      type: http
  - type: 'sampleProcess2'
    command: '$PWD/.java-buildpack/open_jdk_jre/bin/java -jar $PWD/BOOT-INF/lib/mycustomlogger-1.jar'
    memory: 512MB
    instances: 1
    health_check:
      type: http
  - type: 'web'
    #command: '$PWD/.java-buildpack/open_jdk_jre/bin/java $PWD/BOOT-INF/classes/com/example/SampleApp'
    memory: 1G
    health_check:
      type: http
Q3: Interacting processes
In this context, how can one process call/talk to/interact with the other processes within the application? What options are available here? I could not find any sample that demonstrates multiple interacting processes within an app; any sample would be really helpful.
Q4: Difference between a Multi-Target Application and a Multi-Process Application
I came across a concept called Multi-Target Application; reference: https://www.cloudfoundry.org/blog/accelerating-deployment-distributed-cloud-applications/
I did not find such a possibility in standard Cloud Foundry, but I felt it might be "similar" to a multi-process app (since the parts run in independent containers and do not impact each other).
My questions are:
What are the differences between a Multi-Target Application and a Multi-Process Application?
What are the fundamental Cloud Foundry concepts on which Multi-Target Applications are built?
Any guidance would be really appreciated.

Q1: When to use a CF application with a sidecar vs. when to use a CF application with multiple processes
Different process types are helpful when you have well-separated applications. They might talk to each other, they might interact, but it's done through some sort of published interface like a REST API or through a message queue.
A typical example is a work queue. You might have one process that's running your web application & handling traffic, but if a big job comes in then it'll instruct a worker process, running separately, to handle that job. This is often done through a message queue. The advantage is you can scale each individually.
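For example (a minimal sketch; the app name, commands, and instance counts are illustrative, not taken from the question above), the manifest could declare both process types so that each gets its own containers:

    applications:
    - name: queue-demo
      memory: 1G
      processes:
      - type: web                    # serves HTTP traffic
        command: ./bin/server
        instances: 2
      - type: worker                 # drains the job queue
        command: ./bin/worker
        health-check-type: process   # no HTTP endpoint to probe
        instances: 5

Because the processes are independent, you can later resize just the workers, e.g. with cf scale queue-demo --process worker -i 10 (the --process flag is available in newer cf CLI versions).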
With a sidecar, the processes run in the same container. This works well for scenarios where you need tight coupling between two or more processes, for example if they need to share the same file system, or if you have a proxy process that intercepts traffic.
The two processes are often linked in a way that they are scaled in tandem, i.e. there is a one-to-one relationship between them. This relationship holds because if you scale up the application instance count, you're scaling up both. You cannot scale them independently.
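In the manifest, a sidecar is declared alongside the process it accompanies. A minimal sketch, assuming a hypothetical config-server binary that is pushed together with the app:

    applications:
    - name: app-with-sidecar
      memory: 1G
      sidecars:
      - name: config-server        # runs in the same container as the 'web' process
        process_types:
        - web
        command: ./config-server   # illustrative binary shipped with the app
        memory: 256MB              # carved out of the app's 1G limit

Scaling app-with-sidecar to three instances gives you three containers, each running both the web process and its config-server sidecar; there is no way to get more sidecars than web instances.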
Q2: In an application with multiple processes, if I want to run two processes in two different technologies (one process in Java, another in Go or anything else), how do I do so?
You're correct. You'd need to rely on multi-buildpack support.
Your application is only staged once and a single droplet is produced. Each process is spun up from the same droplet. Everything you need must be built into that one droplet. Thus you need to push everything all together and you need to run multiple buildpacks.
In your manifest.yml, the buildpacks entry is a list. You can specify the buildpacks you want to run and they will run in that order.
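A minimal sketch of what that could look like, assuming hypothetical Go and Java components pushed together (the names, commands, and buildpack pairing are illustrative, not a tested recipe):

    applications:
    - name: multi-tech-app
      buildpacks:          # run in the listed order; the last one is the
      - go_buildpack       # "final" buildpack and sets the default start command
      - java_buildpack
      processes:
      - type: web
        command: java -jar app.jar   # illustrative
      - type: worker
        command: ./go-worker         # illustrative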
Alternatively, you could pre-compile. If you're using Go, cross-compilation is often trivial, so you could compile ahead of time and just push the binary you generate with your application.
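For example (a sketch; the package path and binary name are illustrative), you could target the linux/amd64 cells that Cloud Foundry typically runs on:

    # Cross-compile the worker on your workstation...
    GOOS=linux GOARCH=amd64 go build -o bin/worker ./cmd/worker
    # ...then push bin/worker with the rest of the app and point a
    # process's 'command' at $PWD/bin/worker in the manifest.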
Q3: In this context, how can one process call/talk to/interact with the other processes within the application? What options are available here?
I'll split this in two parts:
If you're talking about apps running as a sidecar, it depends on what the sidecar does, but you have options. Basically, you can do anything you could do in a Linux environment, keeping in mind that you're running as a non-root user: coordinate through a shared file/folder, intercept the network port and proxy to another port, or use other IPC mechanisms.
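As a trivial sketch of the shared-filesystem option (the file name and protocol are purely illustrative), the sidecar could drop a status file that the main process waits on:

    # in the sidecar's start script:
    echo "ready" > "$HOME/tmp/sidecar.status"
    # in the main process, before serving traffic:
    until grep -q ready "$HOME/tmp/sidecar.status" 2>/dev/null; do sleep 1; done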
If you're talking about multiple processes (the CF terminology), such that each one runs in a separate container, then you are more limited. You would need to use some external method to communicate. This could be a service broker, a database (not recommended), or an API (REST, gRPC, and RSocket are all options).
Note that this "external" method of communication doesn't necessarily need to be public, just external to the container. You could use a private service like a broker/DB, or you could use internal routes & container networking.
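For the internal-route option, the wiring looks roughly like this with the cf CLI (app names and port are illustrative, and exact flags vary between CLI versions):

    # Give the destination app an internal route on the apps.internal domain
    cf map-route backend apps.internal --hostname backend
    # Allow container-to-container traffic from 'frontend' to 'backend'
    cf add-network-policy frontend backend --protocol tcp --port 8080

The frontend process can then call http://backend.apps.internal:8080 and the traffic never leaves the platform.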
Q4: Difference between a Multi-Target Application and a Multi-Process Application
Sorry, I'm not sure about this one. Multi Target apps are not a core concept in CF. I'll leave it for someone else to answer.

Related

Running multiple app instances on a single container in PCF

We have an internal installation of PCF.
A developer wants to push a stateless (obeys 12-factor rules) Node.js app which will spawn other app instances, i.e. leverage Node.js clustering as per https://nodejs.org/api/cluster.html. Hence there would be multiple processes running in each container. Any known issues with this from a PCF perspective? I appreciate it violates the rule/suggestion of one app instance per container, but that is just a suggestion :) All info welcome.
Regards
John
When running an application on Cloud Foundry that spawns child processes, the number one thing you need to watch out for is memory consumption. You set a memory limit when you push your application which is for the entire container. That includes the parent process, whatever child processes are spawned and a little overhead (currently init process, sshd process & health check).
Why is this a problem? Most buildpacks make the assumption that only one process will be running and that it will consume as much memory as possible while staying under the defined memory limit. They attempt to configure the software which is running your application to do this. When you spawn child processes, this breaks the buildpack's assumptions and can create scenarios where your application will exceed the defined memory limit. When this happens, even by one byte, the process will be killed and restarted.
If you're concerned with scaling your application, you should not try to spin off child processes in one extra large container. Instead, let the platform help you and scale up the number of application instances. The platform can easily do this and by using multiple smaller containers you can scale just as well. In fact, if you already have a 12-factor app, it should be well positioned to work in this manner.
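In cf CLI terms the scale-out looks like this (app name and sizes are illustrative):

    # Instead of one 2G container running four clustered Node.js workers,
    # run four 512M containers with a single process each:
    cf scale my-node-app -i 4 -m 512M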
Good luck!

Multiprocess web server with OCaml

I want to make a web server with OCaml. It will have a REST interface and no dependencies (it just searches constant data loaded into RAM at process startup) and will serve read-only queries (which can be served from any node; the result will be the same).
I love OCaml; however, I have one problem: it can only run on one thread at a time.
I'm thinking of scaling simply by having nginx in front of it and load balancing across multiple process instances running on different ports on the same server.
I don't think I'm the only one running into this issue. What would be the best tool to keep a few OCaml processes running at a time, ensure they are restarted if any of them crash, and give them different ports from each other (to load balance between them)?
I was thinking about a standard Linux service, but I don't want to create something like 4 hardcoded entries and call service start webserver1 on each of them.
Is there a strong requirement for multiple operating system processes? Otherwise, it seems like you could just use something like cohttp with either lwt or async to handle concurrent requests in the same OS process, using multiple threads (and an event-loop).
As you mentioned REST, you might be interested in ocaml-webmachine which is based on cohttp and comes with well-commented examples.
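To make the cohttp suggestion concrete, here is a minimal sketch of a single-process concurrent server using cohttp-lwt-unix (the port and response are illustrative; install with opam install cohttp-lwt-unix):

    (* Every request is served from one OS process; Lwt's event loop
       interleaves concurrent connections without extra threads. *)
    let callback _conn _req _body =
      Cohttp_lwt_unix.Server.respond_string ~status:`OK ~body:"hello" ()

    let () =
      let server = Cohttp_lwt_unix.Server.make ~callback () in
      Lwt_main.run
        (Cohttp_lwt_unix.Server.create ~mode:(`TCP (`Port 8080)) server)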

Can the same dyno run multiple processes?

I am creating a small app running multiple microservices. I would like to have this app available 24/7, so free dyno hours are not enough for me. If I upgrade to a hobby plan I would get 10 process types.
Can I run another microservice on each of the processes (web), or does Heroku only give me the ability to install one web process per dyno, with the other 10 process types being for scaling my app? In other words, if I need 6 microservices running 24/7, should I buy 6 hobby dynos?
You can only have 1 web process type. You can horizontally scale your web process to run on multiple dynos ("horizontal scalability"), however you will need to upgrade to at least standard-1x dyno types to do that (i.e. you can only run 1 web dyno instance if you are using free or hobby dyno types).
However, in addition to your web process, you can instantiate multiple additional process types (e.g. "worker" processes). These will NOT be able to listen for HTTP/S requests from your clients, but can be used for offloading long-running jobs from your web process.
So, if you map each of your 4-6 microservices to a different Process Type in your Procfile, and if your microservices are not themselves web servers, you might be able to make do with hobby dynos.
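A sketch of such a Procfile, assuming hypothetical Node.js services (one web process plus worker-style process types; names and commands are illustrative):

    web: node gateway/server.js
    mailer: node services/mailer/index.js
    billing: node services/billing/index.js
    reports: node services/reports/index.js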
Heroku's default model is to map a process type to its own dyno: https://devcenter.heroku.com/articles/procfile states
"Every dyno in your application will belong to one of the process
types, and will begin executing by running the command associated with
that process type."
e.g. heroku ps:scale worker=1 for a process type named worker.
Other people have written about how to use foreman or Honcho to run multiple python processes in a single dyno, which utilize a secondary Procfile and possibly other slug setup in a post_compile step. Presumably you could do something similar depending on your chosen language and its buildpack; the official buildpack API doesn't list this step though :/. That said, given Heroku's base Ubuntu stacks, you may be able to get away with a web: script.sh to do any setup and exec of processes.
For grouping multiple processes in a Docker container, you could check out https://runnable.com/docker/rails/run-multiple-processes-in-a-container, which relies again on a custom CMD [ "/start.sh" ]. Note that it's contrary to Docker's single-process-per-container philosophy, and can give you more headaches, e.g. around forwarding signals to child processes, an ugly Dockerfile to set up all microservices, etc. (If you do decide to use multiple containers, Heroku has a writeup on using Docker Compose for local dev.)
Also, don't forget you're bounded by the performance of your dyno and the process/thread limits.
Of course, multiple processes in a given dyno is generally not recommended for non-toy production or long term dev maintenance. ;)
There's the runit buildpack, which makes it easy to combine multiple processes within a single dyno, as long as they all fit within your dyno memory limit (512M for a hobby dyno).
You still have only one HTTP endpoint per Heroku app, which means your microservices will need to communicate via queues, pub/sub or some RPC hub (like deepstream.io).

C++ frameworks for distributed computer applications

I have a C++/MFC application ("MyApp") that has always run standalone, but which now needs to be run simultaneously on at least two PCs, and eventually on perhaps up to 20 PCs. What is desirable is:
- The system will be deployed with fixed PC names ("host1", "host2", etc) and IP addresses (192.168.1.[host number]) on an isolated network
- Turn on the first PC, and it will start all the others using wake-on-lan
- One or more instances of "MyApp" automatically start on each node, configure themselves, and do their stuff more-or-less independently (there is no critical inter-node communication once started)
- The first node in the system also hosts a user interface that provides some access to other nodes. Communication to/from the other nodes only occurs at the user's request, which is generally sporadic for normal users, and occasionally intense for users who have to test that new features are working as designed across the system.
- For simulation/test purposes, MyApp should also be able to be started on a specified subset of arbitrarily-named computers on a LAN.
The point of the last requirement is so that when trying to reproduce the problem that is occurring on a 20-PC system on the other side of the world, I don't have to find 20 PCs from somewhere and hook them up, I can instead start by (politely) stealing some spare CPU cycles from other PCs in my office.
I can imagine building this from the ground up in C++/MFC along these lines:
- write a new service that automatically starts with Windows, and which can start/stop instances of MyApp when it receives the corresponding command via the network, or when configured to automatically start MyApp in non-test deployments
- have MyApp detect which node it is in the wider system, and if it is the first node, ensure all other nodes are turned on via wake-on-lan
- design and implement a communications protocol between the various local and remote services, and MyApp instances
This really seems quite basic, but also like it would be reinventing wheels if I did it from the ground up. So the central question is: are there frameworks with C++ classes that make this kind of thing easy? What options should I investigate? And is referring to this as "distributed computing" misleading? What else would you call it? (Please suggest additional or different tags)

Django and Celery Confusion

After reading a lot of blog posts, I decided to switch from crontab to Celery for my middle-scale Django project. There are a few things I didn't understand:
1- I'm planning to start a micro EC2 instance which will be dedicated to RabbitMQ; would this be sufficient for small-to-medium-heavy tasking (such as dispatching periodic e-mails via Amazon SES)?
2- Does the computation of tasks occur on the Django server or on the RabbitMQ server (assuming RabbitMQ is on a separate server)?
3- When I need to grow my system and have 2 or more application servers behind a load balancer, do these Celery machines need to connect to the same RabbitMQ vhost? Assume the application servers are carbon copies, the tasks are the same, and everything is in sync at the database level.
I don't know the answer to this question, but you can definitely configure it to be suitable (e.g. use -c1 for a single-process worker to avoid using much memory, or use the eventlet/gevent pools); see also the --autoscale option. The choice of broker transport also matters here; the ones that are not polling are more CPU-efficient (RabbitMQ/Redis/Beanstalk).
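For instance (a sketch; the app name proj is illustrative, and flag spellings vary slightly across Celery versions):

    celery -A proj worker -c 1                  # single-process worker, minimal memory
    celery -A proj worker -P eventlet -c 100    # eventlet pool for I/O-bound tasks
    celery -A proj worker --autoscale=10,3      # between 3 and 10 processes, on demand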
Computing happens on the workers; the broker is only responsible for accepting, routing, and delivering messages (and persisting messages to disk when necessary).
To add additional workers, these should indeed connect to the same virtual host. You would only use separate virtual hosts if you wanted applications to have separate message buses.
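Concretely, every app server's Celery configuration would point at the same broker URL, for example (host, credentials, and vhost are illustrative; the setting is spelled BROKER_URL in older Celery releases and broker_url/CELERY_BROKER_URL in newer ones):

    BROKER_URL = "amqp://user:password@rabbitmq.example.com:5672/production"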