I am planning to set up a Solr server on an EC2 instance. As traffic grows, I might have to move the Solr server from a smaller instance to a bigger one. This change will need to happen in real time, while the old Solr instance is still serving traffic, so I am concerned that some valuable data indexed during the switch could get lost. The data from the old server will also need to be moved to the new server, and that would take a significant amount of time.
Also, once the traffic can no longer be handled by the largest instance, SolrCloud will need to be deployed across multiple servers, and the same data migration issue could occur.
Is there a more efficient and robust way to do this?
You could probably:
Start using SolrCloud from the get-go, but with just a single node/one shard. At this point there is nothing 'Cloud-y' about it, but no harm done either.
When traffic grows, you can create the new, bigger EC2 instance and add it to the cluster. Now you have a 'working' SolrCloud cluster with a replica.
As needed, keep adding nodes and creating more shards/replicas.
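As a rough sketch of what those steps look like against the Solr Collections API (this assumes a local single-node SolrCloud on port 8983; the collection name and node address are made up):

    import requests

    SOLR = "http://localhost:8983/solr/admin/collections"  # assumed single-node SolrCloud endpoint

    # Step 1: create the collection on the single node, one shard, one replica.
    requests.get(SOLR, params={
        "action": "CREATE",
        "name": "products",        # hypothetical collection name
        "numShards": 1,
        "replicationFactor": 1,
    }).raise_for_status()

    # Later, once the bigger EC2 node has joined the same cluster (same ZooKeeper),
    # add a replica of the shard on that node; SolrCloud copies the index over itself.
    requests.get(SOLR, params={
        "action": "ADDREPLICA",
        "collection": "products",
        "shard": "shard1",
        "node": "10.0.0.12:8983_solr",  # hypothetical node name of the new instance
    }).raise_for_status()

    # When the new replica is active, the old one can be dropped with DELETEREPLICA
    # and the small instance retired, without stopping indexing.

Because the new replica syncs from the existing one while it keeps serving traffic, there is no separate "copy the data over" step and no indexing downtime.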
Just wondering what is the best practice here.
I have these dev environments, dev/QA/UAT/ab/monkey and so on, which are used only during the daytime. We would like to save some cost by shutting them down at night.
Each environment consists of frontend/API/caching/queueing/DB servers/Docker images.
Is using Terraform to create/destroy them daily the right approach here?
The first thing I noticed is that the IP address changes when EC2 instances are recreated: every day, after destroying the env, I would have to re-map DNS. This could be solved with an EC2 Elastic IP, but then I read somewhere:
if you’re using an EIP to just provide a public IP and not to rapidly and seamlessly distribute traffic in the event of an outage while keeping DNS records the same, it’s best to just use the AWS non-EIP pub IP and DNS records for pub access
Does AWS provide a public DNS name that doesn't go away when I shut down the EC2 instance?
Next are, of course, the data backups I have to do. I have to back up all DBs and assets like images and videos. Logs are not a concern, since I push them off to another server with a log collector agent, but all other data needs to be backed up before removal via terraform destroy. I will also have tons of ECR images; I guess I need to back those up as well.
This feels like a lot of work. What is the best practice?
Just to add: almost all environments will run throughout the year.
You definitely could destroy these environments every day. Depending on where your infrastructure as code lives, you could do this in a number of ways. For example, if it's in a GitHub repo, you could use GitHub Actions workflows to create a scheduled job that runs a little while after you finish each day and destroys everything. Other options would be GitLab, which has its own way of doing this, or something like Jenkins/TeamCity/Bamboo/CircleCI, which could automate the job for you.
In theory you could set up another job that applies them again each weekday morning, so you save money and don't waste time every morning setting up your dev envs.
With regards to your DNS issues: if you are managing your DNS records with Route 53, you can add a resource for your records that points to the public IP of your instance (an A record) or to the public DNS name of your instance, for example. Then, when you create the new resources each morning, the records will be updated to point at your new instances.
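If you ever need to do that re-mapping outside of Terraform, a minimal boto3 sketch of the same idea looks like this (the hosted zone ID, record name, and instance ID are placeholders):

    import boto3

    ec2 = boto3.client("ec2")
    route53 = boto3.client("route53")

    # Look up the public IP of the freshly created instance (placeholder instance ID).
    reservations = ec2.describe_instances(InstanceIds=["i-0123456789abcdef0"])
    public_ip = reservations["Reservations"][0]["Instances"][0]["PublicIpAddress"]

    # UPSERT an A record so dev.example.com always points at today's instance.
    route53.change_resource_record_sets(
        HostedZoneId="Z0123456789ABC",  # placeholder hosted zone ID
        ChangeBatch={
            "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "dev.example.com",
                    "Type": "A",
                    "TTL": 60,
                    "ResourceRecords": [{"Value": public_ip}],
                },
            }]
        },
    )

Terraform's route53_record resource effectively does the same UPSERT for you on every apply, which is why letting Terraform own the record is the simpler option.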
Simply shutting down the instances isn't always going to cut all of your costs: you will still be paying for some resources like EBS volumes, unattached Elastic IPs are charged for, load balancers generate charges even when not in use, etc.
Here's how the story goes.
We started transforming a monolith, single-machine, e-commerce application (Apache/PHP) to cloud infrastructure. Obviously, the application and the database (MySQL) were on the same machine.
We decided to move to AWS. As the first step of the transformation, we split the database and the application: the application is hosted on a c4.xlarge machine, and the database on RDS Aurora MySQL on a db.r5.large instance, with default options.
This setup performed well; database performance in particular improved noticeably.
Unfortunately, when traffic spiked, we started experiencing long response times. It looked like RDS, although really fast at executing queries, wasn't returning results to the EC2 machine fast enough over the network.
That was our conclusion after an in-depth analysis of the setup, including Apache/MySQL/PHP tuning parameters: the delayed response time was due to network latency between the EC2 and RDS/Aurora machines, even though both are in the same region.
Before adding additional resources (e.g. ElastiCache), we'd first like to look into any default configuration we can tune to solve this problem.
What do you think we missed there?
One of the biggest strengths of the cloud is scalability, and you should always design your application to take advantage of it. It sounds like your RDS instance is getting choked by the number of requests rather than by the processing time of the queries. So rather than one big instance doing all the work, go for more small instances with load balancing. With load balancing across database replicas you also get away from a single point of failure, since the replicas can even be placed in different AZs.
Here is a blog post you can read on the topic:
https://aws.amazon.com/blogs/database/scaling-your-amazon-rds-instance-vertically-and-horizontally/
Good luck on your AWS journey.
The best answer to your question is to use read replicas, but remember that only read requests can be sent to your read replicas, so you would need to design your application that way (see the sketch after these suggestions).
Also, for some cost savings, you could try Aurora Serverless.
One more option is to pass traffic between EC2 and RDS over a private network rather than connecting to RDS over the public internet; that can be one of the mistakes causing extra latency.
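To illustrate the read-replica point, here is a minimal read/write-splitting sketch using PyMySQL; the writer/reader endpoints, credentials, and table names are placeholders, not anything from your actual setup:

    import pymysql

    # Aurora exposes a writer endpoint and a read-only (reader) endpoint; both names are placeholders.
    WRITER_HOST = "mycluster.cluster-xxxx.us-east-1.rds.amazonaws.com"
    READER_HOST = "mycluster.cluster-ro-xxxx.us-east-1.rds.amazonaws.com"

    def get_connection(readonly=False):
        """Route read-only work to the reader endpoint, everything else to the writer."""
        host = READER_HOST if readonly else WRITER_HOST
        return pymysql.connect(host=host, user="app", password="secret", database="shop")

    # Reads (product pages, listings) can go to the replicas...
    conn = get_connection(readonly=True)
    with conn.cursor() as cur:
        cur.execute("SELECT id, name FROM products LIMIT 10")
        rows = cur.fetchall()
    conn.close()

    # ...while writes (orders, stock updates) must always hit the writer.
    conn = get_connection(readonly=False)
    with conn.cursor() as cur:
        cur.execute("INSERT INTO orders (product_id, qty) VALUES (%s, %s)", (1, 2))
    conn.commit()
    conn.close()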
Yes, fellow SOers, I'm doing it backwards. I tried AWS RDS, but the CPU seems to be spiking so often that I need the flexibility of an EC2 instance to do some fine-tuning. I'm not a MySQL expert, so I'm asking:
How can I set up MySQL on the EC2 instance so that it reads from and replicates my RDS instance?
Ideally I'd do the switch in real time via DNS, but first I need the EC2 instance to act like a clone of the RDS instance, picking up any new data written between now and the actual migration.
Any pointers are much appreciated. Thanks!
Why can't you use MySQLTuner with RDS?
You shouldn't need to run sysbench, since Amazon handles OS-level tuning for you on RDS.
Aurora is a drop-in replacement for MySQL and will scale better than any MySQL cluster you could set up on EC2.
You should be addressing why your WordPress instance is hammering the database so much instead of trying to optimize the database.
You should put a CDN in front of your WordPress site and cache as much as you can to reduce the load on both your web server and your database server. There are also solutions out there for using Redis to cache data so that WordPress doesn't have to constantly go back to MySQL.
Amazon provides the CloudFront CDN, but I would also recommend looking into CloudFlare.
Honestly, given your number of concurrent users, unless you have tons of dynamic, constantly changing content, you should be able to run your entire site on a t2.micro with CloudFlare in front of it and "cache everything" enabled.
I'd like to offer an update:
Mark B's input has been extremely valuable: I discovered that I can run MySQLTuner remotely against the RDS instance, so there was no need to migrate after all.
The RDS CPU spikes were due to a large number of JOINs on non-indexed columns.
I have added indexes and the results are fantastic.
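For anyone hitting the same symptom, the change was along these lines; the endpoint, table, and column names below are made up for illustration, not my real schema:

    import pymysql

    # Made-up example: a JOIN on orders.customer_id that had no index to use.
    conn = pymysql.connect(host="my-db.xxxx.us-east-1.rds.amazonaws.com",  # placeholder endpoint
                           user="admin", password="secret", database="shop")
    with conn.cursor() as cur:
        # EXPLAIN first to confirm the joined table is being scanned in full.
        cur.execute("EXPLAIN SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id")
        print(cur.fetchall())
        # Add the missing index so the join can be resolved through it instead.
        cur.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")
    conn.close()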
I can't seem to find much documentation on how clusters in RethinkDB actually work.
In Cassandra I connect to a cluster by defining one or more hosts, so if one of them is down or has even been removed, I can still connect to the whole cluster until the code/configuration is updated to reflect the changed host IP addresses.
As far as I understand it, RethinkDB doesn't have such logic and I'd need to implement it myself, but I'd still be connected to the whole cluster at all times; is that correct?
When creating a database, it is "kind of" created for the whole cluster; there is no way and no need to specify the exact servers that will take care of it. When creating a table and I don't specify a primary replica tag, which server will be the primary replica? If I specify a tag which is assigned to multiple servers - same question applies. How is the final server which will be the main replica selected?
In Cassandra I connect to a cluster by defining one or more hosts, so if one of them is down or has even been removed, I can still connect to the whole cluster until the code/configuration is updated to reflect the changed host IP addresses.
In RethinkDB, you connect to the cluster by connecting to a node in the cluster. That node takes care of communicating with all the other nodes. If that node disconnects from the cluster, you might not be able to do writes or reads, depending on your cluster's sharding and replication. If that node fails, you won't be able to do anything; at that point, you can try connecting to another node.
As far as I understand it, RethinkDB doesn't have such logic and I'd need to implement it myself
Yes, RethinkDB won't automatically reconnect you to another node in the cluster if your node fails. That being said, handling it yourself might be as simple as keeping multiple connections open and switching between them (unless I'm missing something!).
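A minimal sketch of that idea with the official Python driver, assuming a made-up list of node addresses:

    from rethinkdb import RethinkDB
    from rethinkdb.errors import ReqlDriverError

    r = RethinkDB()

    # Hypothetical list of cluster nodes; the order is the preference order.
    NODES = [("db1.internal", 28015), ("db2.internal", 28015), ("db3.internal", 28015)]

    def connect_to_cluster():
        """Try each known node in turn and return the first connection that succeeds."""
        for host, port in NODES:
            try:
                return r.connect(host=host, port=port)
            except ReqlDriverError:
                continue  # node is down or unreachable, try the next one
        raise RuntimeError("No RethinkDB node reachable")

    conn = connect_to_cluster()
    print(r.db_list().run(conn))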
When creating a database, it is "kind of" created for the whole cluster; there is no way and no need to specify the exact servers that will take care of it.
Yes, when you create a database it's created for the whole cluster. A database doesn't really 'live' on a specific node; only tables live on specific nodes.
When creating a table and I don't specify a primary replica tag, which server will be the primary replica?
RethinkDB will automatically take care of that. It will pick the server for the primary replica based on the following:
Server load distribution (which servers already have more tables and data).
Whether a specific server was already a primary/secondary replica for that table.
If you want to manually control which server the primary or secondary replica ends up on, you can set it through the table_config table in the rethinkdb database. (Take a peek at that database; it gives you a better view into how RethinkDB works!)
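For example, with the Python driver you can inspect or pin replica placement through that system table (the database, table, and server names below are made up):

    from rethinkdb import RethinkDB

    r = RethinkDB()
    conn = r.connect(host="localhost", port=28015)

    # Inspect where each shard's primary and secondary replicas currently live.
    for cfg in r.db("rethinkdb").table("table_config").run(conn):
        print(cfg["db"], cfg["name"], cfg["shards"])

    # Pin the primary replica of a (hypothetical) table to a specific server.
    r.db("rethinkdb").table("table_config").filter(
        {"db": "test", "name": "users"}
    ).update({
        "shards": [{
            "primary_replica": "server_big_1",              # hypothetical server name
            "replicas": ["server_big_1", "server_small_1"]  # hypothetical replica set
        }]
    }).run(conn)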
If I specify a tag which is assigned to multiple servers - same question applies.
Same as above.
How is the final server which will be the main replica selected?
Same as above.
In terms of documentation, I would suggest the following:
Sharding and replication: http://rethinkdb.com/docs/sharding-and-replication/ (Although your questions suggest you probably already read this :))
I'm currently running a site which uses Redis through Elasticache. We want to move to a larger instance with more RAM since we're getting to around 70% full on our current instance type.
Is there a way to scale up an ElastiCache instance in the same way an RDS instance can be scaled?
Alternatively, I wanted to create a replication group and add a bigger instance to it; then, once it's replicated and running, promote the new instance to be the master. This doesn't seem possible through the AWS console, as replicas are created with the same instance type as the primary node.
Am I missing something, or is this simply a use case that can't be achieved? I understand that I could start a bigger instance, handle replication manually, and then move the web servers over to the new server, but this would require some downtime due to DNS migration, etc.
Thanks,
Alan
ElastiCache feels more like a cache solution in the memcached sense of the word, meaning that to scale up you would indeed fire up a new cluster and switch your application over to it. Performance will degrade for a moment because the cache has to be rebuilt, but nothing more.
For many people (I suspect you included), however, Redis is more of a NoSQL database solution in which data loss is unacceptable. Amazon offers read replicas as a "solution" to that problem, but it's still a bit iffy. Replication does reduce the risk of data loss, but for Redis used as a database (as opposed to a cache, for which ElastiCache is quite perfect) it's nowhere near as production-safe or mature as RDS is for relational data, with its backup and restore procedures and well-structured change management to support scaling up. To my knowledge, ElastiCache does not support changing the instance type of a running cluster, which suggests it's merely an in-memory solution that would lose all its data on a reboot.
I'd go as far as saying that if data loss concerns you, you should look at a self-rolled Redis solution on EC2 instead of simply using ElastiCache. Not only is it marginally cheaper to run, it lets you change the instance type like you would for any other EC2 instance (after stopping it, of course), and it lets you use RDB or AOF persistence.
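As a small sketch of what the self-managed route lets you do (ElastiCache blocks the CONFIG command, but on your own Redis on EC2 you can flip persistence on at runtime); the host and settings here are just illustrative:

    import redis

    # Connect to a self-managed Redis on EC2 (hostname is a placeholder).
    r = redis.Redis(host="redis.internal", port=6379)

    # Turn on AOF persistence at runtime and trigger an RDB snapshot.
    r.config_set("appendonly", "yes")
    r.config_set("appendfsync", "everysec")
    r.bgsave()

    print(r.config_get("appendonly"), r.lastsave())

For durability you would normally also set these in redis.conf so they survive a restart.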
You can now scale up to a larger node type while ElastiCache preserves your data:
https://aws.amazon.com/blogs/aws/elasticache-for-redis-update-upgrade-engines-and-scale-up/
Yes, you can scale a running ElastiCache instance up to a larger node type in place. I've tested it and experienced very little actual downtime: a few seconds at first, and it was back online very quickly, even though the console showed the process taking roughly a few minutes to actually finish. I went from a t2.micro to an m3.medium with no problem.
You can scale up or down:
Go to the ElastiCache service.
Select the cluster.
From the Actions menu at the top, choose Modify.
Change the Node Type and apply the modification.
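If you'd rather script it, a sketch of the same change with boto3 (the cluster/replication group IDs and target node type are placeholders):

    import boto3

    elasticache = boto3.client("elasticache")

    # Ask ElastiCache to move a (non cluster mode) cluster to a bigger node type.
    elasticache.modify_cache_cluster(
        CacheClusterId="my-redis",        # placeholder cluster ID
        CacheNodeType="cache.m5.large",   # target node type
        ApplyImmediately=True,            # apply now instead of at the next maintenance window
    )

    # For a replication group (primary + replicas) the equivalent call is:
    elasticache.modify_replication_group(
        ReplicationGroupId="my-redis-rg",  # placeholder replication group ID
        CacheNodeType="cache.m5.large",
        ApplyImmediately=True,
    )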
If you have a cluster (cluster mode enabled), you can add more shards, decrease the number of shards, rebalance slot distributions, or add more read replicas; just click on the cluster itself to see those options.
Be aware that when you delete shards, ElastiCache automatically redistributes the data to the remaining shards, so this affects traffic and can overload the other shards; when you try to delete a shard you will get a warning about this.
If you still need more help, please feel free to leave a comment and I would be more than happy to help.