Unreliable discovery for elasticsearch nodes on ec2

I'm using elasticsearch (0.90.x) with the cloud-aws plugin. Sometimes nodes running on different machines aren't able to discover each other ("waited for 30s and no initial state was set by the discovery"). I've set "discovery.ec2.ping_timeout" to "15s", but this doesn't seem to help. Are there other settings that might make a difference?
discovery:
  type: ec2
  ec2:
    ping_timeout: 15s

Not sure if you are aware of this blog post: http://www.elasticsearch.org/tutorials/elasticsearch-on-ec2/. It explains the plugin settings in depth.
Adding the cluster name, like so:
cluster.name: your_cluster_name
discovery:
  type: ec2
  ...
might help.
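If that alone doesn't help, the cloud-aws plugin also supports narrowing discovery by security group and tag, and pinning the region; filtering out unrelated instances tends to make EC2 discovery more reliable. A hedged sketch (the region, group name, and tag here are placeholder values, not from the question):

cluster.name: your_cluster_name
cloud:
  aws:
    region: us-east-1
discovery:
  type: ec2
  ec2:
    ping_timeout: 15s
    groups: my-es-group
    tag:
      es_cluster: your_cluster_name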

Related

EksCtl : Update node-definitions via cluster config file not working

I am using eksctl to create our EKS cluster.
It works fine on the first run, but when I want to update the cluster config later, it doesn't work.
I have a cluster config file, but any changes made to it are not reflected by the update/upgrade command.
What am I missing?
Cluster.yaml :
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: supplier-service
  region: eu-central-1
vpc:
  subnets:
    public:
      eu-central-1a: {id: subnet-1}
      eu-central-1b: {id: subnet-2}
      eu-central-1c: {id: subnet-2}
nodeGroups:
  - name: ng-1
    instanceType: t2.medium
    desiredCapacity: 3
    ssh:
      allow: true
    securityGroups:
      withShared: true
      withLocal: true
      attachIDs: ['sg-1', 'sg-2']
    iam:
      withAddonPolicies:
        autoScaler: true
Now, if in the future I want to change the instance type or the number of replicas, I have to destroy the entire cluster and recreate it, which becomes quite cumbersome.
How can I do in-place upgrades with clusters created by EksCtl? Thank you.
I'm looking into the exact same issue as yours.
After a bunch of searching, I found that it is not yet possible to in-place upgrade an existing node group in EKS.
First, eksctl update has become deprecated. When I executed eksctl update --help, it gave a warning like this:
DEPRECATED: use 'upgrade cluster' instead. Upgrade control plane to the next version.
Second, as mentioned in this GitHub issue and the eksctl documentation, so far eksctl upgrade nodegroup is used only for upgrading the version of a managed node group.
So unfortunately, you'll have to create a new node group to apply your changes, migrate your workload/switch your traffic to the new node group, and decommission the old one. In your case, it's not necessary to nuke the entire cluster and recreate it.
If you're looking for a seamless upgrade/migration with minimal/zero downtime, I suggest you try a managed node group, whose graceful draining of workloads seems promising:
Node updates and terminations gracefully drain nodes to ensure that your applications stay available.
Note: in your config file above, if you specify nodeGroups rather than managedNodeGroups, an unmanaged node group will be provisioned.
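For illustration, a hedged sketch of the same node group declared as managed (the -managed name suffix is hypothetical; the other fields are carried over from the question's config):

managedNodeGroups:
  - name: ng-1-managed
    instanceType: t2.medium
    desiredCapacity: 3
    ssh:
      allow: true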
However, don't lose hope. An active issue in the eksctl GitHub repository has been lodged to add an eksctl apply option. At this stage it's not yet released. It would be really nice if this came true.
To upgrade the cluster using eksctl (a sketch of the commands follows the list):
Upgrade the control plane version
Upgrade coredns, kube-proxy and aws-node
Upgrade the worker nodes
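A hedged sketch of the corresponding commands, assuming the cluster name supplier-service from the question (flags have shifted between eksctl releases, so check eksctl --help for your version):

eksctl upgrade cluster --name=supplier-service --approve
eksctl utils update-coredns --cluster=supplier-service --approve
eksctl utils update-kube-proxy --cluster=supplier-service --approve
eksctl utils update-aws-node --cluster=supplier-service --approve
eksctl upgrade nodegroup --name=ng-1 --cluster=supplier-service   # managed node groups only; for unmanaged ones, see below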
If you just want to update a nodegroup while keeping the same configuration, you can simply change the nodegroup name, e.g. append -v2 to it. [0]
If you want to change the node group configuration, such as the instance type, you need to create a new node group: eksctl create nodegroup --config-file=dev-cluster.yaml [1]
[0] https://eksctl.io/usage/cluster-upgrade/#updating-multiple-nodegroups-with-config-file
[1] https://eksctl.io/usage/managing-nodegroups/#creating-a-nodegroup-from-a-config-file
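Putting the two references together, a hedged sketch of the replace-and-drain flow (dev-cluster.yaml is the file name from the answer; rename the node group in it first, e.g. ng-1 to ng-1-v2):

eksctl create nodegroup --config-file=dev-cluster.yaml
eksctl delete nodegroup --config-file=dev-cluster.yaml --only-missing --approve

The --only-missing flag drains and deletes the node groups that exist in the cluster but are no longer present in the config file.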

AWS Codestar Proper Way to Add RDS Postgres Database without Breaking Anything

I'm using an AWS CodeStar setup, and I would like to add a database.config to the .ebextensions folder in my Rails project.
If you're wondering why I'm not adding the database through the console: CodeStar's pipeline fails at the final ExecuteChangeSet stage for the CloudFormation changes and throws a 404 error; I assume CodePipeline is looking for the previous instance.
Below is the error message I've been receiving, in which AWS suggests I edit Elastic Beanstalk directly. I'm somewhat lost as to how I can add a database to my project using Elastic Beanstalk without breaking CodeStar's CodePipeline ExecuteChangeSet stage.
You specified the 'AWSEBRDSDBInstance' resource in your configuration to create a database instance,
without the corresponding database security group 'AWSEBRDSDBSecurityGroup'. For a better way to add
and configure a database to your environment, use 'eb create --db' or the Elastic Beanstalk console
instead of using a configuration file.
My .ebextensions/database.config file so far.
Resources:
  AWSEBRDSDatabase:
    Type: AWS::RDS::DBInstance
    Properties:
      AllocatedStorage: 5
      DBInstanceClass: db.t2.micro
      DBName: phctest
      Engine: postgres  # note: the RDS engine name is "postgres", not "postgresql"
      EngineVersion: 10.4
      MasterUsername: username
      MasterUserPassword: password
I could also create a separate RDS database on its own (I've thought about that), but I'd like to leave it to Elastic Beanstalk.

Autoclustering does not work on AWS with RabbitMQ

We are using the latest version of RabbitMQ, v3.7.2, on a few EC2 instances on AWS. We want to use the auto-clustering that ships with the product: Cluster Formation and Peer Discovery.
After we start RabbitMQ, it fails to do this (or silently skips it). The only message we see in the log file is:
[info] <0.229.0> Peer discovery backend rabbit_peer_discovery_aws does not support registration, skipping registration.
On our RabbitMQ EC2 instances an IAM role is attached with the correct policy. The RabbitMQ config is:
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_aws
cluster_formation.aws.region = eu-west-1
cluster_formation.aws.use_autoscaling_group = true
cluster_formation.aws.use_private_ip = true
Did anyone face this issue?
Add the following to your rabbitmq.conf and restart rabbitmq-server:
log.file.level = debug
This lets you see the discovery requests to AWS in the logs.
Then do this on any rabbitmq node:
rabbitmqctl stop_app
rabbitmqctl reset
rabbitmqctl start_app
It'll execute the discovery again. Check the RabbitMQ logs for 'AWS Request'; you'll see the corresponding response, so you can verify whether your EC2 instances were found by the specified tags. If not, something is wrong with your tags.
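If the 'AWS Request' entries fail with permission errors instead, the peer discovery backend needs describe access to EC2 and Auto Scaling. A minimal IAM policy sketch, assuming autoscaling-group-based discovery as in the question (my addition, not part of the original answer):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingInstances",
        "ec2:DescribeInstances"
      ],
      "Resource": "*"
    }
  ]
}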
Not an answer (not enough reputation points to comment), but I'm dealing with the same thing. I've double-checked that the security groups are correct and allow ports 4369, 5672 and 15672 (confirmed via telnet/netcat), and that the IAM policies are correct. Debug logging shows nothing else. I'm at a loss as to how to figure this one out.

Unable to add an RDS instance to Elastic Beanstalk

Suddenly I can't add an RDS instance to my EB environment, and I'm not sure why. Here's the full error message:
Unable to retrieve RDS configuration options.
Configuration validation exception: Invalid option value: 'db.t1.micro' (Namespace: 'aws:rds:dbinstance', OptionName: 'DBInstanceClass'): DBInstanceClass db.t1.micro not supported for mysql db
I am not sure if this is due to the default AMI that I am using or something else.
Note that I didn't choose to launch a t1.micro RDS instance. It seems like EB is trying to use that by default, but this type has been removed from the available RDS instance classes.
Just found this link in the community forum: https://forums.aws.amazon.com/ann.jspa?annID=4840. It looks like Elastic Beanstalk has not updated its CloudFormation templates yet.
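Until the templates were fixed, one possible stopgap (my own suggestion, untested against this exact failure) was to pin a supported instance class explicitly, using the namespace and option name from the error message, e.g. in an .ebextensions config:

option_settings:
  aws:rds:dbinstance:
    DBInstanceClass: db.t2.micro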
I think it's resolved now. But as a side note, AWS should not make things like this a community announcement.

change ElastiCache node DNS record in cloud formation template

I need to create a CNAME record for an ElastiCache cluster. However, I'm building a Redis cluster and there is only one node. As far as I can tell, there is no ConfigurationEndpoint.Address for a Redis cluster. Is there any way to set a DNS name for the node in the cluster, and how would I do it?
Currently the template looks like:
"ElastiCahceDNSRecord" : {
"Type" : "AWS::Route53::RecordSetGroup",
"Properties" : {
"HostedZoneName" : "example.com.",
"Comment" : "Targered to ElastiCache",
"RecordSets" : [{
"Name" : "elche01.example.com.",
"Type" : "CNAME",
"TTL" : "300",
"ResourceRecords" : [
{
"Fn::GetAtt": [ "myelasticache", "ConfigurationEndpoint.Address" ]
}
]
}]
}
}
For folks coming to this page for a solution: there is now a way to get the Redis endpoint directly from within the CFN template.
You can get RedisEndpoint.Address from an AWS::ElastiCache::CacheCluster, or PrimaryEndPoint.Address from an AWS::ElastiCache::ReplicationGroup.
Per the documentation (http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-elasticache-cache-cluster.html):
RedisEndpoint.Address - The DNS address of the configuration endpoint for the Redis cache cluster.
RedisEndpoint.Port - The port number of the configuration endpoint for the Redis cache cluster.
or
Per the documentation (http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-elasticache-replicationgroup.html):
PrimaryEndPoint.Address - The DNS address of the primary read-write cache node.
PrimaryEndPoint.Port - The number of the port that the primary read-write cache engine is listening on.
An example CFN (other bits not included):
Resources:
  DnsRedis:
    Type: 'AWS::Route53::RecordSetGroup'
    Properties:
      HostedZoneName: 'a.hosted.zone.name.'
      RecordSets:
        - Name: 'a.record.set.name'
          Type: CNAME
          TTL: '300'
          ResourceRecords:
            - !GetAtt
              - RedisCacheCluster
              - RedisEndpoint.Address
    DependsOn: RedisCacheCluster
  RedisCacheCluster:
    Type: 'AWS::ElastiCache::CacheCluster'
    Properties:
      ClusterName: cluster-name-redis
      AutoMinorVersionUpgrade: 'true'
      AZMode: single-az
      CacheNodeType: cache.t2.small
      Engine: redis
      EngineVersion: 3.2.4
      NumCacheNodes: 1
      CacheSubnetGroupName: !Ref ElastiCacheSubnetGroupId
      VpcSecurityGroupIds:
        - !GetAtt
          - elasticacheSecGrp
          - GroupId
Looks like the ConfigurationEndpoint.Address is only supported for Memcached clusters, not for Redis. Please see this relevant discussion in the AWS forums.
Also, the AWS Auto Discovery docs (still) state:
Note: Auto Discovery is only available for cache clusters running the Memcached engine. Redis cache clusters are single node clusters, thus there is no need to identify and track all the nodes in a Redis cluster.
It looks like your 'best' solution is to query the individual endpoint(s) in use, in order to determine the addresses to connect to, using AWS::CloudFormation::Init as suggested in the AWS forums thread.
UPDATE
As @slimdrive pointed out below, this IS now possible, through AWS::ElastiCache::CacheCluster. Please read further below for more details.
You should be able to use PrimaryEndPoint.Address instead of ConfigurationEndpoint.Address in the template provided to get the DNS address of the primary read-write cache node as documented on the AWS::ElastiCache::ReplicationGroup page.
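In the question's template, that would mean swapping the attribute inside the Fn::GetAtt call, assuming myelasticache is an AWS::ElastiCache::ReplicationGroup (for a single-node AWS::ElastiCache::CacheCluster, RedisEndpoint.Address applies instead, as noted above):

"ResourceRecords" : [
  { "Fn::GetAtt": [ "myelasticache", "PrimaryEndPoint.Address" ] }
]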
This can be extremely confusing: depending on what you're trying to do, you use either ConfigurationEndpoint or PrimaryEndpoint. I'm adding my findings here, as this was one of the first posts I found when searching. I'll also detail some other issues I had with the ElastiCache Redis engine setup in CloudFormation. I was trying to set up a CloudFormation resource of type AWS::ElastiCache::ReplicationGroup.
Let me preface this with the fact that I had previously set up a clustered Redis ElastiCache instance on a t2.micro node type with no problems. In fact, I received an error from the node-redis npm package saying that clusters weren't supported, so I also implemented the redis-clustr wrapper around it. Anyway, all of that was working fine.
We then moved forward with trying to create a CloudFormation template for this, and I ran into all sorts of limitations that the AWS console UI must be hiding from people. In chronological order, here were my struggles:
t2.micro instances are not supported with auto-failover, so I set AutomaticFailoverEnabled to false.
Fix: t2.micro instances actually can use auto-failover. Use a parameter group that has cluster mode enabled; the default one for me was default.redis3.2.cluster.on (I used version 3.2.6, as this is the most recent version that supports encryption at rest and in transit). The parameter group cannot be changed after the instance is created, so don't forget this part.
We received an error from the redis-clustr/node-redis package: this instance has cluster support disabled.
(This is how I found that the parameter group needed cluster mode set to on.)
We received an error in the CF template that cluster mode cannot be used if auto-failover is off.
This is what made me try a t2.micro instance again, since I knew I had auto-failover turned on in my other instance, which was also a t2.micro. Sure enough, this combination does work together.
I had stack outputs, and Parameter Store parameters being created, for the connection URL and port. This failed with 'x attribute/property does not exist' on the ReplicationGroup.
Fix: It turns out that if cluster mode is disabled (using parameter group default.redis3.2, for example), you must use the PrimaryEndPoint.Address and PrimaryEndPoint.Port values. If cluster mode is enabled, use ConfigurationEndPoint.Address and ConfigurationEndPoint.Port. I had tried RedisEndpoint.Address and RedisEndpoint.Port with no luck, though those may work with a single Redis node with no replica (I also could have had the casing wrong; see the note below).
NOTE
Also, a major issue that affected me is the casing: the P in EndPoint must be capitalized in the PrimaryEndPoint and ConfigurationEndPoint variations if you are creating an AWS::ElastiCache::ReplicationGroup, but the p is lower case if you are creating an AWS::ElastiCache::CacheCluster: RedisEndpoint, ConfigurationEndpoint. I'm not sure why there's a discrepancy there, but it may be the cause of some problems.
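To make the casing concrete, a minimal outputs sketch (the resource names MyReplicationGroup and MyCacheCluster are hypothetical):

Outputs:
  ReplicationGroupAddress:
    # ReplicationGroup attributes spell it EndPoint (capital P)
    Value: !GetAtt MyReplicationGroup.PrimaryEndPoint.Address
  CacheClusterAddress:
    # CacheCluster attributes spell it Endpoint (lower-case p)
    Value: !GetAtt MyCacheCluster.RedisEndpoint.Address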
Link to AWS docs for GetAtt, which lists the available attributes for different CloudFormation resources: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/intrinsic-function-reference-getatt.html