I need to start multiple wso2 apimanager profiles in the same machine. How can this be done?
Particularly I have to start all components on one host except the gateway, which is on another node. I cannot understand how to use the profiling tool to match my needs.
Is it correct to start the gateway on the other node with the command -Dprofile=gateway-worker given that it's not a cluster?
Thank you
I need to start multiple wso2 apimanager profiles in the same machine. How can this be done?
You can copy the installation to different folders (e.g. one for the gateway), set the Offset parameter in carbon.xml for each copy (instance), and change the ports that the other instances use to reach it so that they match the offset. That way you can start multiple instances on a single host.
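To illustrate what the Offset does, here is a minimal sketch. It assumes the stock default ports (9443/9763 for management, 8243/8280 for the gateway), so check your own carbon.xml and axis2.xml if you have changed them. `effective_ports` is only an illustrative helper, not something WSO2 ships.

```python
# Default ports of a stock API Manager pack (an assumption; verify against
# your own configuration files).
DEFAULT_PORTS = {
    "management_https": 9443,
    "management_http": 9763,
    "gateway_https": 8243,
    "gateway_http": 8280,
}


def effective_ports(offset):
    """Return the ports an instance listens on for a given <Offset> value."""
    return {name: port + offset for name, port in DEFAULT_PORTS.items()}


# Two copies on the same host: offsets 0 and 1 never collide.
for offset in (0, 1):
    print(offset, effective_ports(offset))
```

Any URL in another instance's configuration that points at the copy with offset 1 then has to use the shifted port (e.g. 9444 instead of 9443).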
Particularly I have to start all components on one host except the gateway
Nope, you either specify a single profile, or you start an "all in one" node. There's no simple way of saying you want all profiles except the gateway.
If you want a default instance (publisher, store, tm, km, ..) plus a dedicated gateway, you can start the default (all-in-one) instance and simply configure it (in api-manager.xml) to use the dedicated gateway instance instead of the local one.
You can create your own profile with selected modules, e.g. effectively starting all except the gateway module, though I don't see any benefit in doing so. Disabling a few modules won't save you any considerable memory or subscription charges.
Is it correct to start the gateway on the other node with the command -Dprofile=gateway-worker given that it's not a cluster?
The profile parameter has nothing to do with the cluster. The gateway instance may or may not be in a cluster with other gateway instances.
I have a project on GCP which contains a compute node, DNS, router, load balancer and the Dialogflow API. The connection from the DF fulfillment (webhook) to the compute node goes through the DNS and the load balancer, and it works.
I detected some random and infrequent latency problems between the DF fulfillment (webhook) and the node, and I suppose that if I could connect the webhook directly I would reduce the timings.
I want to connect the DF fulfillment (webhook) directly to the internal IP of the node, but it seems that is not possible. The DF API and the compute node are in the same GCP project, so why can't I connect the fulfillment to the internal IP of the node?
So, the Dialogflow webhook service has some requirements as below:
It must handle HTTPS requests (I think with Compute Engine you can implement this using Ngrok)
The URL for requests must be publicly accessible
...
and a few more.
Although your logic that an internal IP could reduce latency is correct, the problem is that it is not publicly accessible. I guess that is why it is not working. Additionally, DF's waiting time is 5 seconds, and that should be enough unless you are doing some complicated DB queries. Even in that case, I have seen people discussing workarounds to extend the waiting time.
Here's the link for more details
My scenario is mentioned below, please provide the solution.
I need to run 17 HTTP REST APIs for 30K users.
I will create 6 AWS instances (slaves) to run the 30K users (6 instances * 5000 users).
Each AWS instance (slave) needs to handle 5K users.
I will create 1 AWS instance (master) to control the 6 AWS slaves.
1) For the master AWS instance, what instance type and storage do I need to use?
2) For the slave AWS instances, what instance type and storage do I need to use?
3) The main objective is that a single AWS instance needs to handle 5000 (5K) users. What instance type and storage do I need for this, at the lowest possible cost?
Full ELB DNS Name:
The answer is: I don't know. You need to find out for yourself how many users you can simulate on a given AWS instance, as it depends on the nature of your test: what it is doing, the response size, the number of postprocessors/assertions, etc.
So I would recommend the following approach:
First of all, make sure you are following the recommendations from the 9 Easy Solutions for a JMeter Load Test “Out of Memory” Failure
Start with a single AWS server, e.g. t2.large, and a single virtual user. Gradually increase the load and at the same time monitor the instance health (CPU, RAM, disk, etc.) using either Amazon CloudWatch or the JMeter PerfMon Plugin. Once one of the monitored resources starts running out (e.g. CPU usage exceeds 90%), stop your test and note the number of virtual users at this stage (you can use e.g. the Active Threads Over Time listener for this)
Depending on the outcome, either switch to another instance type (e.g. Compute Optimized if CPU is the bottleneck, or Memory Optimized if RAM is) or go for a higher-spec instance of the same tier (e.g. t2.xlarge)
Once you get the number of users you can simulate on a single host you should be able to extrapolate it to other hosts.
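The extrapolation itself is simple arithmetic. A small sketch follows; the 20% headroom is my own assumption so that the load generators are not driven at their measured limit, and `hosts_needed` is just an illustrative name.

```python
def hosts_needed(total_users, users_per_host, headroom_pct=20):
    """Number of load generators needed for total_users.

    users_per_host is the figure measured in the single-host ramp-up test;
    headroom_pct keeps each host below that measured limit.
    """
    usable = users_per_host * (100 - headroom_pct) // 100
    if usable <= 0:
        raise ValueError("users_per_host is too small for this headroom")
    # Integer ceiling division.
    return -(-total_users // usable)


# If one host really sustains 5000 users, 30K users need 6 hosts with no
# headroom, or 8 hosts when each host is kept at 80% of its measured limit.
print(hosts_needed(30000, 5000, headroom_pct=0))
print(hosts_needed(30000, 5000))
```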
JMeter master host doesn't need to be as powerful as slave machines, just make sure it has enough memory to handle incoming results.
I have a web app which runs behind an Amazon AWS Elastic Load Balancer with 3 instances attached. The app has a /refresh endpoint to reload reference data. It needs to be called whenever new data is available, which happens several times a week.
What I have been doing is assigning a public address to all instances and refreshing each one independently (using ec2-url/refresh). I agree with Michael's answer on a different topic: EC2 instances behind an ELB shouldn't allow direct public access. Now my problem is, how can I make an elb-url/refresh call reach all instances behind the load balancer?
And it would be nice if I can collect HTTP responses from multiple instances. But I don't mind doing the refresh blindly for now.
one of the ways I'd solve this problem is by
writing the data to an AWS s3 bucket
triggering a AWS Lambda function automatically from the s3 write
using the AWS SDK to identify the instances attached to the ELB from the Lambda function, e.g. using boto3 from python or the AWS Java SDK
call /refresh on individual instances from Lambda
ensuring when a new instance is created (due to autoscaling or deployment), it fetches the data from the s3 bucket during startup
ensuring that the private subnets the instances are in allows traffic from the subnets attached to the Lambda
ensuring that the security groups attached to the instances allow traffic from the security group attached to the Lambda
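The steps above could be sketched roughly as follows with boto3. This assumes a Classic ELB (for an ALB/NLB you would query target health on the target group instead), instances reachable on port 80 over their private IPs, and an `ELB_NAME` environment variable that I made up for the example. Pagination and retries are left out.

```python
import os
import urllib.request


def instance_ids_behind_elb(elb_client, load_balancer_name):
    """Instance IDs currently registered with a Classic ELB."""
    resp = elb_client.describe_load_balancers(
        LoadBalancerNames=[load_balancer_name]
    )
    description = resp["LoadBalancerDescriptions"][0]
    return [inst["InstanceId"] for inst in description["Instances"]]


def private_ips(ec2_client, instance_ids):
    """Private IPs of the given instances (skips ones without an IP)."""
    if not instance_ids:
        return []
    resp = ec2_client.describe_instances(InstanceIds=instance_ids)
    ips = []
    for reservation in resp["Reservations"]:
        for inst in reservation["Instances"]:
            ip = inst.get("PrivateIpAddress")
            if ip:
                ips.append(ip)
    return ips


def http_get(url, timeout=10):
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def refresh_all(elb_client, ec2_client, load_balancer_name, fetch=http_get):
    """Call /refresh on every instance and collect one result per IP."""
    results = {}
    ids = instance_ids_behind_elb(elb_client, load_balancer_name)
    for ip in private_ips(ec2_client, ids):
        try:
            results[ip] = fetch(f"http://{ip}/refresh")
        except OSError as exc:  # URLError and timeouts are OSError subclasses
            results[ip] = f"error: {exc}"
    return results


def lambda_handler(event, context):
    # boto3 is available in the Lambda Python runtime; ELB_NAME is a
    # hypothetical environment variable set on the function.
    import boto3

    return refresh_all(
        boto3.client("elb"), boto3.client("ec2"), os.environ["ELB_NAME"]
    )
```

Returning the per-instance results also covers the "collect HTTP responses from multiple instances" part of the question.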
the key wins of this solution are
the process is fully automated from the instant the data is written to s3,
avoids data inconsistency due to autoscaling/deployment,
simple to maintain (you don't have to hardcode instance ip addresses anywhere),
you don't have to expose instances outside the VPC
highly available (AWS ensures the Lambda is invoked on s3 write, you don't worry about running a script in an instance and ensuring the instance is up and running)
hope this is useful.
While this may not be possible given the constraints of your application & circumstances, it's worth noting that best practice application architecture for instances running behind an AWS ELB (particularly if they are part of an AutoScalingGroup) is to ensure that the instances are not stateful.
The idea is to make it so that you can scale out by adding new instances, or scale-in by removing instances, without compromising data integrity or performance.
One option would be to change the application to store the results of the reference data reload into an off-instance data store, such as a cache or database (e.g. Elasticache or RDS), instead of in-memory.
If the application was able to do that, then you would only need to hit the refresh endpoint on a single server - it would reload the reference data, do whatever analysis and manipulation is required to store it efficiently in a fit-for-purpose way for the application, store it to the data store, and then all instances would have access to the refreshed data via the shared data store.
While there is a latency increase adding a round-trip to a data store, it is often well worth it for the consistency of the application - under your current model, if one server lags behind the others in refreshing the reference data, if the ELB is not using sticky sessions, requests via the ELB will return inconsistent data depending on which server they are allocated to.
You can't make these requests through the load balancer, so you will have to open up the security group of the instances to allow incoming traffic from a source other than the ELB. That doesn't mean you need to open it to all direct traffic though. You could simply whitelist an IP address in the security group to allow requests from your specific computer.
If you don't want to add public IP addresses to these servers then you will need to run something like a curl command on an EC2 instance inside the VPC. In that case you would only need to open the security group to allow traffic from some server (or group of servers) that exist in the VPC.
I solved it differently, without opening up new traffic in security groups or resorting to external resources like S3. It's flexible in that it will dynamically notify instances added through ECS or ASG.
The ELB's Target Group offers a feature of periodic health check to ensure instances behind it are live. This is a URL that your server responds on. The endpoint can include a timestamp parameter of the most recent configuration. Every server in the TG will receive the health check ping within the configured Interval threshold. If the parameter to the ping changes it signals a refresh.
A URL may look like:
/is-alive?last-configuration=2019-08-27T23%3A50%3A23Z
Above I passed a UTC timestamp of 2019-08-27T23:50:23Z
A service receiving the request will check if the in-memory state is at least as recent as the timestamp parameter. If not, it will refresh its state and update the timestamp. The next health-check will result in a no-op since your state was refreshed.
Implementation notes
If refreshing the state can take more time than the interval window or the TG health timeout, you need to offload it to another thread to prevent concurrent updates or outright service disruption as the health-checks need to return promptly. Otherwise the node will be considered off-line.
If you are using traffic port for this purpose, make sure the URL is secured by making it impossible to guess. Anything publicly exposed can be subject to a DoS attack.
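A minimal sketch of the handler logic, framework-agnostic and using only the standard library. The class and method names are my own, the timestamp format is the one from the example URL above, and the reload is pushed to a background thread so the health check itself always returns promptly.

```python
import threading
from datetime import datetime
from urllib.parse import parse_qs, urlparse

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


class RefreshOnHealthCheck:
    def __init__(self, reload_fn):
        self._reload_fn = reload_fn        # does the actual (slow) refresh
        self._loaded_at = datetime.min     # timestamp of the state in memory
        self._lock = threading.Lock()
        self._worker = None

    def handle(self, path):
        """Handle one health-check request path and return the HTTP status."""
        values = parse_qs(urlparse(path).query).get("last-configuration")
        if values:
            wanted = datetime.strptime(values[0], TIMESTAMP_FORMAT)
            with self._lock:
                stale = wanted > self._loaded_at
                busy = self._worker is not None and self._worker.is_alive()
                # Start at most one reload at a time; later pings are no-ops
                # until it finishes.
                if stale and not busy:
                    self._worker = threading.Thread(
                        target=self._reload, args=(wanted,), daemon=True
                    )
                    self._worker.start()
        return 200

    def _reload(self, wanted):
        self._reload_fn()
        with self._lock:
            self._loaded_at = wanted
```

You would call `handle(request_path)` from whatever route serves /is-alive. Updating the timestamp only after the reload has finished means a failed reload is retried on the next ping.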
As you are using S3 you can automate your task by using the ObjectCreated notification for S3.
https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html
https://docs.aws.amazon.com/cli/latest/reference/s3api/put-bucket-notification.html
You can install the AWS CLI and write a simple Bash script that watches for that ObjectCreated notification. Start a cron job that runs the script to check whether a new object has been created in S3.
Set up a condition in that script to curl "http://127.0.0.1/refresh": when the script detects a new object in S3, it curls 127.0.0.1/refresh, and you don't have to do that manually each time.
I personally like the answer by #redoc, but wanted to give another alternative for anyone that is interested, which is a combination of his and the accepted answer. Using S3 object creation events, you can trigger a lambda, but instead of discovering the instances and calling them, which requires the lambda to be in the vpc, you could have the lambda use SSM (aka Systems Manager) to execute commands via a powershell or bash document on EC2 instances that are targeted via tags. The document would then call 127.0.0.1/refresh like the accepted answer does.
The benefit of this is that your lambda doesn't have to be in the vpc, and your EC2s don't need inbound rules to allow the traffic from lambda. The downside is that it requires the instances to have the SSM agent installed, which sounds like more work than it really is. There are AWS AMIs that already ship with the SSM agent, and installing it yourself in the user data is very simple.
Another potential downside, depending on your use case, is that it uses an exponential ramp up for simultaneous executions, which means if you're targeting 20 instances, it runs 1, then 2 at once, then 4 at once, then 8, until they are all done, or it reaches what you set for the max. This is because of the error recovery it has built in. It doesn't want to destroy all your stuff if something is wrong, like slowly putting your weight on some ice.
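A rough sketch of what the lambda could do with boto3's SSM client. It uses the AWS-managed AWS-RunShellScript document; the tag key/value and the function names are placeholders I picked for the example, and it assumes the app listens on port 80 locally with the /refresh endpoint from the question.

```python
def refresh_via_ssm(ssm_client, tag_key, tag_value, max_concurrency="50%"):
    """Ask SSM to curl the local /refresh endpoint on all tagged instances.

    Returns the SSM command ID, which can be used afterwards to look up the
    per-instance results.
    """
    resp = ssm_client.send_command(
        Targets=[{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": ["curl -fsS http://127.0.0.1/refresh"]},
        MaxConcurrency=max_concurrency,  # cap on simultaneous executions
        MaxErrors="0",                   # stop sending after the first failure
    )
    return resp["Command"]["CommandId"]


def lambda_handler(event, context):
    # boto3 is available in the Lambda Python runtime. The tag below is a
    # hypothetical example; use whatever tags identify your app servers.
    import boto3

    return refresh_via_ssm(boto3.client("ssm"), "Role", "web")
```

The lambda's role needs permission to call ssm:SendCommand, and the instances need an instance profile that lets the SSM agent register.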
You could make the call multiple times in rapid succession to call all the instances behind the Load Balancer. This would work because the AWS Load Balancers use round-robin without sticky sessions by default, meaning that each call handled by the Load Balancer is dispatched to the next EC2 Instance in the list of available instances. So if you're making rapid calls, you're likely to hit all the instances.
Another option is that if your EC2 instances are fairly stable, you can create a Target Group for each EC2 Instance, and then create a listener rule on your Load Balancer to target those single instance groups based on some criteria, such as a query argument, URL or header.
Is it possible to do AutoScaling with Static IPs in AWS ? The newly created instances should either have a pre-defined IP or pick from a pool of pre-defined IPs.
We are trying to set up ZooKeeper in production, with 5 ZooKeeper instances. Each one should have a static IP, and these IPs are to be hard-coded in the Kafka AMI/Databag that we use. It should also support AutoScaling, so that if one of the ZooKeeper nodes goes down, a new one is spawned with the same IP or with one from a pool of IPs. For this we have decided to go with 1 ZooKeeper instance per AutoScaling group, but the problem is with the IP.
If this is the wrong way, please suggest the right way. Thanks in advance !
One method would be to maintain a user data script on each instance, and have each instance assign itself an Elastic IP from a set of EIPs reserved for this purpose. This user data script would be referenced in the ASG's Launch Configuration, and would run on launch.
Say the user script is called "/scripts/assignEIP.sh". Using the AWS CLI, you would have it consult the pool to see which EIPs are available and which are not (already in use). Then it would assign itself one of the available EIPs.
For ease of IP management, you could keep the pool of IPs in a simple text properties file on S3, and have the instance download and consult that list when the instance starts.
Keep in mind that each instance will need to be assigned an IAM instance profile that allows it to look up EIPs and associate one with itself.
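The answer describes a shell script with the AWS CLI; the same logic in Python with boto3 could look roughly like this. It is a sketch assuming VPC Elastic IPs identified by allocation ID; `claim_free_eip` and the pool argument are illustrative names. Note that two instances starting at the same moment can race for the same address, so in practice you would retry with the next free one on failure.

```python
def claim_free_eip(ec2_client, instance_id, pool_allocation_ids):
    """Associate the first unused EIP from the pool with this instance.

    Returns the public IP that was claimed, or None if the pool is exhausted.
    """
    resp = ec2_client.describe_addresses(
        AllocationIds=list(pool_allocation_ids)
    )
    for address in resp["Addresses"]:
        # An address that is in use carries an AssociationId.
        if "AssociationId" not in address:
            ec2_client.associate_address(
                InstanceId=instance_id,
                AllocationId=address["AllocationId"],
                AllowReassociation=False,  # do not steal an address in use
            )
            return address["PublicIp"]
    return None
```

The instance can find its own ID from the instance metadata service, and the pool of allocation IDs can come from the properties file on S3 mentioned above.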
When writing a web app with Django or such, what's the best way to connect to dynamic EC2 instances, such as a cluster of Redis or memcache instances? IP addresses change between reboots, etc. Elastic IPs are limited to 5 by default - what are some other options for auto-discovering/auto-updating which machines are available?
Late answer, but use Boto: http://boto.cloudhackers.com/en/latest/index.html
You can use security groups, tags, and other means to hit the EC2 API and pick the instances/IPs for each thing (DB Server, caching server, etc.) at load-time. We do this with great success in deployment, and are moving that way with our Django settings.py, as well.
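For illustration, a lookup by tag could look like this with boto3 (the successor of the Boto library linked above). The tag name and the memcached port are assumptions for the example, and pagination is left out.

```python
def running_private_ips(ec2_client, tag_key, tag_value):
    """Private IPs of running instances carrying the given tag."""
    resp = ec2_client.describe_instances(
        Filters=[
            {"Name": f"tag:{tag_key}", "Values": [tag_value]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    ips = []
    for reservation in resp["Reservations"]:
        for instance in reservation["Instances"]:
            ip = instance.get("PrivateIpAddress")
            if ip:
                ips.append(ip)
    return sorted(ips)


def memcached_locations(ec2_client, port=11211):
    """Build a list of "host:port" strings, e.g. for a cache LOCATION setting."""
    ips = running_private_ips(ec2_client, "Role", "memcached")
    return [f"{ip}:{port}" for ip in ips]
```

In settings.py you would call `memcached_locations(boto3.client("ec2"))` once at load time and use the result as the cache LOCATION.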
One method that I heard mentioned recently in an AWS webinar was to store this sort of information in SimpleDB. Essentially, you would use SimpleDB as the central configuration location, and each instance that you launch would register its IP etc. with this configuration, so you would always have a complete description of all of your instances in one place. I haven't seen this in practice so I don't know what the best practices would be exactly, but the idea sounds reasonable. I suppose you could use SNS or something to signal all the other instances whenever the configuration changes, so everyone could refresh their in-memory cache of the configuration.
I don't know the AWS administrative APIs yet really, but there's probably an API call to list your EC2 instances, at which point you could use some sort of custom protocol to ping each of them and ask it what it is -- part of the memcache cluster, Redis, etc.
I'm having a similar problem and haven't found a solution yet because we also need to map Load Balancer addresses.
For your problem, there are two good alternatives:
If you are not using EC2 micro instances or load balancers, you should definitely use Amazon Virtual Private Cloud, because it lets you control instances IPs and routing tables (check all limitations before using this service).
If you are only using EC2 instances, you could write a script that uses the EC2 API tools to run the command ec2-describe-instances to find all instances and their public/private IPs. Then, the script could parameterize instances names to hosts and update /etc/hosts. Finally, you should put the script in the crontab of every computer/instance that need to access the EC2 instances (see ec2-describe-instances).
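For the second alternative, the part that rewrites /etc/hosts could be sketched like this. The marker comments and the function name are my own invention, and the name-to-IP mapping is assumed to come from whatever you parse out of ec2-describe-instances. Only the block between the markers is replaced, so hand-written entries survive and the cron job can run repeatedly.

```python
BEGIN = "# BEGIN ec2-hosts"
END = "# END ec2-hosts"


def render_hosts(existing, name_to_ip):
    """Return new /etc/hosts content with the managed block regenerated."""
    kept = []
    inside = False
    for line in existing.splitlines():
        if line.strip() == BEGIN:
            inside = True
        elif line.strip() == END:
            inside = False
        elif not inside:
            kept.append(line)  # keep everything outside the managed block
    managed = [BEGIN]
    managed += [f"{ip}\t{name}" for name, ip in sorted(name_to_ip.items())]
    managed.append(END)
    return "\n".join(kept + managed) + "\n"


current = "127.0.0.1\tlocalhost\n"
print(render_hosts(current, {"redis-1": "10.0.0.5", "memcache-1": "10.0.0.6"}))
```

The script would then write the result to a temporary file and move it over /etc/hosts, so readers never see a half-written file.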
If you want to stay with EC2 instances: I'm in the same boat. I've read that you can do such things with their VPC, or use an S3 bucket or something like that, but with plain EC2 I'm in the middle of writing stuff like this...it's all really simple up till the part where you need to contact the server from a server in your data center or something. The way I'm doing it currently is using the API to create the instance and start it...then once it's ready, I contact the server to execute a powershell script that I have on the server....the powershell renames the computer and reboots it...that takes care of needing the hostname and MAC for our data center firewalls. I haven't found a way yet to remotely rename a computer.
As far as knowing the IP, the elastic IPs are the way to go. They say you're only allowed 5 and gotta apply for more but we've been regularly requesting more and they give em to us..we're up to like 15 now and they haven't complained yet.
Another option if you don't want to do all the computer renaming and such...you could use DHCP and set your computer up so when it boots it gets the computer name and everything from DHCP....I'm not sure how to do this exactly, but I've come across very smart people telling me that's the way to do it during my research for Amazon.
I would definitely recommend that you get into the Amazon API...I've been working with it for less than a month and I can do all kinds of crazy things. My code can detect areas of our system that are getting stressed, spin up 10 amazon servers all configured to act as whatever needs stress relief, and be ready to send jobs to all in less than 7 minutes. Brings a tear to my eye.
The documentation is very complete...the API itself is a work of art and a joy to program against...I've very much enjoyed working with it. (and no, I don't work for them lol)
Do it the traditional way: with DNS. This is what it was built for, so use it! When a machine boots, have it ask for the domain name(s) related to its function, and use that for your configuration. If it stops responding, re-resolve the DNS (or just do that periodically anyway).
I think route53 and the elastic load balancing stuff can be used to do this, if you want to stick to Amazon solutions.
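A small sketch of the "re-resolve periodically" idea using only the standard library. The class name and the 30 second refresh interval are arbitrary choices for the example; the resolver and clock are parameters mainly so the behaviour can be exercised without a real DNS server.

```python
import socket
import time


class ResolvedHost:
    """Caches the addresses behind a DNS name and re-resolves them on a timer."""

    def __init__(self, hostname, ttl_seconds=30,
                 resolver=socket.getaddrinfo, clock=time.monotonic):
        self._hostname = hostname
        self._ttl = ttl_seconds
        self._resolver = resolver
        self._clock = clock
        self._addresses = None
        self._resolved_at = 0.0

    def addresses(self):
        now = self._clock()
        expired = now - self._resolved_at >= self._ttl
        if self._addresses is None or expired:
            infos = self._resolver(
                self._hostname, None, proto=socket.IPPROTO_TCP
            )
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address.
            self._addresses = sorted({info[4][0] for info in infos})
            self._resolved_at = now
        return self._addresses

    def invalidate(self):
        """Force a fresh lookup, e.g. after a connection error."""
        self._addresses = None
```

The app asks `addresses()` each time it builds a connection, and calls `invalidate()` when a machine stops responding.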