EC2 - Huge latency in saving streaming data to local file - amazon-web-services

I've written some Python code to receive streaming tick trading data from an API (the TWS API of Interactive Brokers, via IB Gateway) and append the data to a file on the local machine. On a daily basis, the amount of data is roughly no more than 1 GB. However, that 1 GB of streaming data per day is made up of several million read/write operations of a few hundred bytes each.
When I run the code on my local machine, the latency between the timestamp associated with the received tick data and the moment the data is appended to the file is on the order of 0.5 to 2 seconds. However, when I run the same code on an EC2 instance, the latency explodes to minutes or even hours.
The markets open at 4:30 UTC. The first chart shows that the latency is not due to RAM, CPU or, presumably, IOPS. The volume type is gp2, with 100 IOPS for the t2.micro and 900 IOPS for the m5.large.
How can I find out what's causing the huge latency?
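One first step is usually to separate delivery latency from write latency. Below is a minimal sketch (not the poster's actual code; `tick.timestamp` and `tick.raw` are assumed names, and the timestamp is assumed to be a Unix epoch float) that times both halves of each append, so you can see whether ticks arrive late on EC2 or whether the local file writes themselves are slow.

```python
import time

# Minimal sketch (assumed field names, not the original code): time each tick
# from API-reported timestamp -> arrival in Python -> flushed to the file.
def handle_tick(tick, fh):
    arrived = time.time()
    fh.write(tick.raw + b"\n")   # append the raw tick bytes
    fh.flush()                   # push the data out of Python's userspace buffer
    written = time.time()

    delivery = arrived - tick.timestamp  # API / network / event-loop delay
    write = written - arrived            # local file-append delay
    print(f"delivery={delivery:.3f}s write={write:.3f}s")

# Usage sketch: open in binary append mode and register handle_tick as the
# callback in the TWS API event loop.
# with open("ticks.dat", "ab") as fh:
#     ...
```

If the delivery figure is already minutes on EC2, the problem is upstream of the disk; if the write figure is large, the appends themselves are the bottleneck.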

Related

AWS EC2 EBS and instance store IOPS usage

We're gathering logs into Graylog and then storing them in Elasticsearch. There are 3 i3.xlarge nodes holding the last 30 days of logs and 3 r5a.xlarge nodes holding another 700 days of logs older than one month. The warm-storage drives are 6144 GiB gp3 EBS volumes with 3000 IOPS and 125 MiB/s throughput.
Just these 6 instances take up about two thirds of our monthly budget.
I'd like to know whether the storage IOPS on either instance type are close to saturation, as I'm looking for possible savings.
CloudWatch gives me numbers like:
Up to 40k DiskReadOps on instance store volumes
Up to 29k DiskWriteOps on instance store volumes
Up to 30k EBSReadOps on warm logs EBSs
Up to 27k EBSWriteOps on warm logs EBSs
Given the EBS limits, I'm not sure whether these numbers mean that only 3k of the 30k requests per second get processed and the rest are queued, or whether there is some burst capacity even with gp3.
Is my storage saturated IOPS-wise, or is there any space for optimization?
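One thing worth checking before drawing conclusions: as far as I know, the CloudWatch EBS ops metrics are counts per reporting period, not rates, so a 30k datapoint at a 5-minute period is only about 100 ops/s. A rough boto3 sketch along these lines (region, volume ID and period are placeholders, not taken from the question) converts the per-volume sums into IOPS that can be compared against the 3000 provisioned:

```python
import boto3
from datetime import datetime, timedelta, timezone

# Rough sketch: pull per-volume write-op counts for one warm-storage volume
# and convert each datapoint into average IOPS. Volume ID, region and period
# are placeholders.
cw = boto3.client("cloudwatch", region_name="eu-west-1")

period = 300  # seconds per CloudWatch datapoint
resp = cw.get_metric_statistics(
    Namespace="AWS/EBS",
    MetricName="VolumeWriteOps",
    Dimensions=[{"Name": "VolumeId", "Value": "vol-0123456789abcdef0"}],
    StartTime=datetime.now(timezone.utc) - timedelta(days=1),
    EndTime=datetime.now(timezone.utc),
    Period=period,
    Statistics=["Sum"],
)

for dp in sorted(resp["Datapoints"], key=lambda d: d["Timestamp"]):
    print(dp["Timestamp"], f"{dp['Sum'] / period:.0f} IOPS")  # vs 3000 provisioned
```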

Amazon EBS High Usage Costs

I have a SpringBoot application running on an Amazon EC2 instance that uses Amazon EBS disk storage. The application receives JSON data over HTTP and stores it in an Amazon RDS (MySQL) database.
In total, the application received data from 600 processes. Each process sent approximately 0.1 MB of JSON, so ~60 MB of data were sent in total.
However, my EBS charges were ~$36 of USE2-EBS:VolumeUsage.gp2, which is priced at $0.10 per GB-month. If I look at my billing summary, I find:
$0.10 per GB-month of General Purpose SSD (gp2) provisioned storage - US East (Ohio): 1,048.230 GB-Mo, $104.82
This is cumulative over the month, during which I ran my processes several more times, but I'm really struggling to understand how I'm generating this much data given that the JSON I sent over HTTP is far smaller.
Any advice on how I could get further insight into this?
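Note that gp2 VolumeUsage is billed on provisioned volume size over time, not on the data the application writes: 1,048 GB-Mo corresponds roughly to 1 TB of volumes provisioned for a full month, however empty they are. A quick boto3 sketch (region is a placeholder) to see what is actually provisioned, including unattached volumes left behind by terminated instances:

```python
import boto3

# Hedged sketch: list every EBS volume in the region with its provisioned
# size, since GB-Mo charges accrue per provisioned GiB per month whether or
# not data is ever written to the volume.
ec2 = boto3.client("ec2", region_name="us-east-2")  # US East (Ohio)

total_gib = 0
for page in ec2.get_paginator("describe_volumes").paginate():
    for vol in page["Volumes"]:
        attached = [a["InstanceId"] for a in vol["Attachments"]] or ["unattached"]
        print(vol["VolumeId"], vol["Size"], "GiB", vol["State"], ",".join(attached))
        total_gib += vol["Size"]

print("Total provisioned:", total_gib, "GiB")
```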

100 GiB Files upload to AWS Ec2 Instance

We have n files with a total size of around 100 GiB. We need to upload all of the files to an EC2 Linux instance hosted in AWS (US region).
My office (in India) internet connection is a 4 Mbps dedicated leased line. It's taking more than 45 minutes to upload a 500 MB file to the EC2 instance, which is too slow.
How can we do this kind of bulk upload in the minimum amount of time?
If it were hundreds of TB we could go with Snowball import/export, but this is only 100 GiB.
It should be about 3x faster than what you're experiencing.
If there are many small files, you can try to zip them so you send fewer, larger files.
And make sure you don't bottleneck the Linux server by encrypting the data (SSH/SFTP); FTP may be your fastest option.
But 100 GB will always take at least 57 hours at your maximum speed.
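To make the arithmetic explicit (assuming the full 4 Mbps line is available to the upload and ignoring protocol overhead), the lower bound is easy to compute:

```python
# Back-of-the-envelope lower bound for the transfer time on a 4 Mbps line,
# ignoring protocol overhead and assuming the whole line is available.
size_bits = 100 * 1024**3 * 8   # 100 GiB expressed in bits
link_bps = 4 * 10**6            # 4 Mbps leased line

hours = size_bits / link_bps / 3600
print(f"{hours:.1f} hours")     # roughly 60 hours at the full line rate
```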

RDS connection time out in rush hours (with ~50,000 HTTP requests)

We're using RDS on a db.t2.large instance, and an auto-scaling group of EC2 instances writes data to the database during the day. In rush hours we're handling about 50,000 HTTP requests per hour, each of which reads/writes MySQL data.
This varies each day, but for today's example, during an hour:
We're seeing "Connect Error (2002) Connection timed out" from our PHP instances, about 187 times a minute.
RDS CPU won't rise above 50%.
DB Connections won't go above 30 (max is set to 5000).
Free storage is ~300 GB (the disk is deliberately large to provide high IOPS).
Write IOPS hit 1,500 while bursting but drop to 900 once the burst limit expires, after rush hours.
Read IOPS hit 300 every 10 minutes and are around 150 in between.
Disk write throughput averages between 20 and 25 MB/s.
Disk read throughput is between 0.75 and 1.5 MB/s.
The CPU credit balance is around 500, so we're not short on CPU burst capacity.
And when it comes to the network, I see a potential limit we're hitting:
Network receive throughput reaches 1.41 MB/s and stays around 1.5 MB/s for an hour.
During this time, network transmit throughput is 5 to 5.2 MB/s, with drops to 4 MB/s every 10 minutes, which coincides with our cron jobs that process data (mainly reading).
I've tried placing the EC2 instances in different AZs and in the same AZ, but this has no effect.
During this time I can connect fine from my local workstation via an SSH tunnel (EC2 -> RDS), and from the EC2 instances to RDS as well.
The PHP scripts are set to time out after 5 seconds of trying to connect, to ensure a fast response. I've now increased this limit to 15 seconds for some scripts.
But which limit are we hitting on RDS? Before we start migrating or changing instance types, we'd like to know the source of this problem. I've also just enabled Enhanced Monitoring to get more detail on this issue.
If more info needed, I'll gladly elaborate where needed.
Thanks!
Update 25/01/2016
On the recommendation of datasage we increased the RDS disk size to 500 GB, which gives us 1,500 IOPS with a 3,600 IOPS burst. It's using around 1,200 IOPS (so it's not even bursting now), and the time-outs still occur.
Connection time-outs are set to 5 and 15 seconds as mentioned before; this makes no difference.
Update 26/01/2016
RDS Screenshot from our peak hours:
Update 28/01/2016
I've changed the sync_binlog setting to 0, because initially I thought we were hitting the EBS throughput limit (GP-SSD, 160 Mbit/s). This gives us a significant drop in disk throughput, and the IOPS are lower as well, but we still see the connection time-outs occur.
When we plot the times at which the errors occur, we see that each minute, at around the :40-second mark, the time-outs start and continue for about 25 seconds; then there are no errors for about 35 seconds, and it starts again. This is during the peak hour of our incoming traffic.
Apparently it was the network performance holding us back. When we upgraded our RDS instance to an m4.xlarge (with high network performance), the issues were resolved.
This was a last resort for us, but it solved our problem in the end.
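For anyone trying to reproduce the per-minute pattern described in the 28/01 update, a small sketch like the one below (assuming a hypothetical log file with one ISO-8601 error timestamp per line) makes clustering around a particular second of the minute easy to see:

```python
from collections import Counter
from datetime import datetime

# Hedged sketch of the diagnostic described above: bucket connection-error
# timestamps by second-within-minute to spot clustering (e.g. around :40).
# Assumes a hypothetical log file with one ISO-8601 timestamp per line.
buckets = Counter()
with open("connect_errors.log") as fh:
    for line in fh:
        ts = datetime.fromisoformat(line.strip())
        buckets[ts.second] += 1

for second in range(60):
    print(f":{second:02d}  {'#' * buckets[second]}")
```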

How does Amazon RDS calculate I/O rate?

I browsed the Amazon RDS pricing site today and want to know how they actually calculate the I/O rate. What does "$0.10 per 1 million requests" really mean?
Can anyone give some simple examples of how many I/Os a simple query from EC2 to MySQL on RDS produces?
In general, it is the price for the EBS storage service. Amazon says something like this for EBS (in the "Projecting Costs" section):
As an example, a medium sized website database might be 100 GB in size and expect to average 100 I/Os per second over the course of a month. This would translate to $10 per month in storage costs (100 GB x $0.10/month), and approximately $26 per month in request costs (~2.6 million seconds/month x 100 I/O per second x $0.10 per million I/O).
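The same arithmetic, written out with the numbers from the quote above:

```python
# Worked version of the example in the quote: 100 I/O per second sustained
# for a month, at $0.10 per million I/O requests and $0.10 per GB-month.
seconds_per_month = 30 * 24 * 3600           # ~2.6 million seconds
io_per_month = 100 * seconds_per_month        # ~260 million I/O requests
request_cost = io_per_month / 1_000_000 * 0.10
storage_cost = 100 * 0.10                     # 100 GB at $0.10 per GB-month

print(f"requests: ${request_cost:.2f}/month")  # about $26
print(f"storage:  ${storage_cost:.2f}/month")  # $10
```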
If you have a running application on Linux, here is an article on how to measure the cost for EBS: