I'm having trouble downloading files from S3. If I download a file of around 200 MB and then start downloading another file, the second download crawls at about 40 KB/s.
Even when the first download finishes, the second one stays at around 40 KB/s...
Any ideas about that?
Amazon S3 has huge bandwidth.
If you are downloading from Amazon S3 to your own computer (outside of AWS), then the only limitations that would impact you are your own Internet bandwidth, and any speed limitations imposed within your own network.
I will presume that you are downloading an object from Amazon S3 to an Amazon EC2 instance in the same region as the S3 bucket.
In this scenario, the only bandwidth limitation is the Network Performance on the Amazon EC2 instance. Basically, the bigger the instance, the more bandwidth is available.
In the Launch screen, a t2.large is listed as Low to Moderate Network Performance. This is reasonably good, but not as good as larger instance types.
See:
Amazon EC2 Instance Configuration - Amazon Elastic Compute Cloud
EC2 Network Performance Cheat Sheet | cloudonaut
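If you want to check which network performance class an instance type advertises without digging through the console, a small boto3 sketch can query it; the region and instance types below are just examples:

```python
import boto3

# Look up the advertised network performance for a couple of instance types.
ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an assumption

resp = ec2.describe_instance_types(InstanceTypes=["t2.large", "m5.xlarge"])
for it in resp["InstanceTypes"]:
    print(it["InstanceType"], "->", it["NetworkInfo"]["NetworkPerformance"])
```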
It might also be a result of the software you are using to download the files and how it multi-tasks and shares network bandwidth between the downloads.
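For example, if the downloads happen to go through the AWS SDK for Python, you can control how many threads each transfer uses. This is only a sketch, with placeholder bucket and key names:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Multipart, multi-threaded download; bucket and key are placeholders.
config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,  # split downloads larger than 8 MB
    max_concurrency=10,                   # 10 parallel threads per download
)
s3.download_file("my-bucket", "big-file-200mb.bin", "big-file-200mb.bin", Config=config)
```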
If your laptop is connected to the Internet via a WiFi router, try a wired connection instead.
Related
I have some datasets (possibly up to 10 GB zipped altogether) for my Machine Learning applications.
In order to expose these datasets to others, I believe I have to host a server and let others download them over the network.
What is the cheapest server I can use for this? (I checked the AWS free tier; can it be used?)
Do I need to write a web server myself, or is there a premade tool I can use for my use case?
You haven't indicated how much data will be downloaded (GB/month) and that's important because you pay for data transfer out to the internet (about $0.09 per GB) beyond an initial free amount (1 GB/month, I believe, but check if free tier offers more), and that's relevant to both S3 and EC2.
That said, I'd consider a few options.
1. Storing the files in S3 and serving them from S3 via CloudFront may be cheaper than running a server 24x7 to host and serve the files (see the presigned-URL sketch after this list).
2. A small EC2 server that fits into the free tier usage plan, running a web or FTP server, serving up your files.
3. Similar to #1, but with requester pays configured for S3 downloads. This option requires your downloaders to have AWS credentials and for you to manage their access, so it may not be feasible in your case.
4. Create an EBS volume containing your data, take a snapshot of that volume, and share the snapshot with other AWS accounts, then shut down your EC2 instance. This option requires your users to be AWS account holders and to share their AWS account numbers with you, so it may not be feasible in your case.
5. AWS SFTP serving up data stored in S3.
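For the S3 option, you don't necessarily need a server at all to hand out downloads. A minimal sketch (the bucket and object names are made up) that generates a time-limited presigned URL with boto3:

```python
import boto3

s3 = boto3.client("s3")

# Presigned URL valid for 1 hour; bucket/key are placeholders for your dataset.
url = s3.generate_presigned_url(
    ClientMethod="get_object",
    Params={"Bucket": "my-datasets-bucket", "Key": "datasets/training-data.zip"},
    ExpiresIn=3600,
)
print(url)  # share this link with whoever needs the file
```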
I have AWS set up for my website. When a user uploads an image, we save it to a folder on EC2 and then transfer it to S3, after which we fetch the images from S3.
I have also stored all the JS and CSS on EC2 and serve them from EC2 itself.
My data transfer cost is very high now. Is storing images on EC2 costing me more? Should I store them directly on S3?
Always consider a CDN or a dedicated web hosting service if your web traffic is high. EC2 is better suited to back-office processing than to serving web pages. There is no free lunch in AWS if you are not careful: always check AWS bandwidth pricing before hosting anything inside AWS. In some cases, the data transfer costs can be many times more expensive than the EC2 server and the (S3, EBS) storage.
AWS only gives EC2 1 GB of free data transfer to the Internet per month. After that, it is $0.09/GB. If you open your web server to everyone and 20 bots download 100 GB of data daily from your EC2 web server, you will get a hefty bill: 100 GB x $0.09 x 30 days = $270, minus the free 1 GB ($0.09), is about $269.91.
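That back-of-the-envelope calculation is easy to script if you want to plug in your own numbers; the rate and free allowance below are just the figures assumed in this answer, not authoritative pricing:

```python
# Rough EC2 data-transfer-out cost estimate (figures assumed from the example above).
PRICE_PER_GB = 0.09    # USD per GB out to the Internet (assumed)
FREE_GB_PER_MONTH = 1  # free allowance assumed here

gb_per_day = 100
days = 30

billable_gb = max(gb_per_day * days - FREE_GB_PER_MONTH, 0)
print(f"Estimated bill: ${billable_gb * PRICE_PER_GB:.2f}")  # ~ $269.91
```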
Also remember, S3 data transfer out to the Internet is NOT free. You only get free unlimited data transfer from S3 to your EC2/Lambda resources within the same region. If you hand out an S3 object as a signed URL so people can download the file, you get billed for the "Internet out" bandwidth.
Data Transfer charges only apply to data going from an AWS Region to the Internet. There is no charge for uploading to AWS, nor for moving data between S3 and EC2 in the same region.
If your data transfer costs are high, it suggests that you are serving a lot of traffic to the Internet, either from EC2 or S3.
I have an AWS Windows instance with SQL Server running on it.
I took a database backup and the resultant file is of size 175 GB.
What is the fastest and most efficient way of downloading this file from AWS to my local machine?
Network bandwidth varies with the size of Amazon EC2 instances. Put simply, larger instances have larger bandwidth.
Your own Internet bandwidth will also be a limiting factor.
To fully utilize the available bandwidth, you could use the Tsunami UDP protocol. This is similar in concept to Bittorrent, in that it has large windows and does not wait for error correction.
Amazon S3 actually supports the Bittorrent protocol, so you could copy the file to S3, then use Bittorrent to download it. This would be great at recovering from transmission errors. However, it means you are sending the file twice through constrained resources (the EC2 instance to S3, then S3 to your computer), which would be less efficient.
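If you do stage the backup through S3, a multipart upload from the instance is the usual first step. A sketch with boto3, where the file path and bucket name are assumptions:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Multipart upload tuned for a very large file; path and bucket are placeholders.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # use multipart above 64 MB
    multipart_chunksize=64 * 1024 * 1024,  # 64 MB parts
    max_concurrency=16,                    # parallel part uploads
)
s3.upload_file(r"D:\backups\mydb.bak", "my-backup-bucket", "backups/mydb.bak", Config=config)
```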
My company is looking for a solution for file sharing via FTP - currently, we share one server for client/admin FTP file sharing and serving multiple sites, and are looking to split off our roles so that we have one server dedicated to FTP and one for serving websites.
I have tried to find a good solution with AWS, but cannot find any detailed information regarding EBS and EC2 servers, or whether an EC2 package will be able to handle FTP storage. For example, a t2.nano instance seems ideal with 1 vCPU and minimal RAM, but I see no information regarding EBS storage limits.
We need around 500GiB at most, and will have transfers happening daily in the neighborhood of 1GiB in and out. We don't need to run a database or http server. We may run services for file cleanup in the background weekly.
EDIT:
I mis-worded the question, which stemmed from a fundamental misunderstanding of AWS EC2 and EBS that I have since cleared up. I know EC2 can run FTP services; the question was more about a cost-effective solution with dynamic storage. Thanks for the input!
As others here on SO will tell you: don't bother with EBS. It can be made to work but does not make much sense in the long run. It's also more expensive and trickier to operate (backups/disaster recovery/having multiple ftp server machines).
Go with S3 for storing your files and use something that can expose S3 over FTP (like s3fs).
See:
http://resources.intenseschool.com/amazon-aws-howto-configure-a-ftp-server-using-amazon-s3/
Setting up FTP on Amazon Cloud Server
http://cloudacademy.com/blog/s3-ftp-server/
If FTP is not a strong requirement you can also look at migrating people to using S3 directly (either initially or after you do the setup and give them the option of both FTP and S3 directly)
This question is among the most common on SO for AWS: you can install an FTP server on any EC2 instance type.
There's no EBS limit that matters for your use case (a single volume can go up to 16 TiB), and you can always increase the size later, so the best rule is: start low and grow when needed (see the resize sketch after this answer).
The only other point to mention is that network performance comes with the instance type, so if you care about speed, a t2.nano (low network performance) might not be sufficient.
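Growing the EBS volume later can be done in place; a hedged sketch using boto3 (the volume ID is a placeholder, and you still need to grow the partition and filesystem inside the instance afterwards, e.g. growpart + resize2fs on Linux):

```python
import boto3

ec2 = boto3.client("ec2")

# Resize an existing volume in place; the volume ID is a placeholder.
ec2.modify_volume(VolumeId="vol-0123456789abcdef0", Size=200)  # new size in GiB
```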
I have a website that backs up content from different social media services, stores the data on the server, and then displays it on my website. The content includes videos, images, and text data.
Currently I am using an EC2 instance with RDS and EBS. Data is stored in EBS volumes, but the amount of data is already more than 1 TB and growing. Every time my EBS volume fills up, I attach another volume.
Then I added S3 to my setup. Cron jobs run and store data on S3, and the EC2 instance serves data from S3. I am using the PHP SDK for this purpose.
The problem I am facing is that S3 is very slow in my current setup.
Please suggest whether my setup is good or needs changes, how I can speed up S3, or whether I should take a different approach.
The EC2 instance is a large reserved instance running CentOS.
I have heard about s3fs, which mounts an S3 bucket on EC2 as a volume. Is this a good choice? When I mounted an S3 bucket on the EC2 instance, the transfer rate was very slow.
I am new to AWS. My users do not access files directly from S3; they access them through my website, which is running on an EC2 instance.
RDS is a good choice for storing metadata such as tags, comments and other relevant information about your multimedia files. S3 is good for storing static content such as Video, Audio and Pictures. I think your approach with RDS and S3 is good enough.
EBS backed instances are good for persistence. If you store your metadata on RDS and static content on S3, the only reason why you should use EBS backed EC2 instances is that you have some configuration files which are unversioned right now. If that's not the case, assuming that your configuration is checked into version control and can be pulled on-demand for a fresh instance every time, then you might want to ditch EBS volumes in favor of ephemeral storage. That may give you some performance boost, nothing significant though.
Regarding your concern with S3's latency, yes, S3 is slow. While all your writes may happen directly to S3, I would highly recommend that you set up Amazon CloudFront for your S3 buckets and let your website consume multimedia content from the CloudFront. CloudFront is a Content Delivery Network (CDN) which works with disk volumes (EBS backed or ephemeral) as well as with S3. Setting it up would take not more than a few minutes. CloudFront also supports streaming media files over RTMP. You may need a library like GPAC for hinting multimedia files to make them streamable if not being done already. You might then want to consider creating one distribution for Video/Audio files for streaming and another distribution for Images, Javascript, Stylesheets and other text files.
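Once a CloudFront distribution points at the S3 bucket, the only change on the website side is which hostname the media URLs are built from. A tiny illustrative sketch in Python, where the distribution domain and object key are assumptions:

```python
# Hypothetical helper: build media URLs against the CloudFront domain instead of S3.
CLOUDFRONT_DOMAIN = "d1234abcd.cloudfront.net"  # your distribution's domain (placeholder)

def media_url(key: str) -> str:
    """Return the URL the site should embed for a given object key."""
    return f"https://{CLOUDFRONT_DOMAIN}/{key}"

print(media_url("videos/clip-001.mp4"))
```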
Hope this helps.
For faster downloading and uploading of files from Amazon S3, I use batch() found here.
You can also use CloudFront to fetch files faster. I think 9gag uses CloudFront too.
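The same batching idea works in any SDK; for instance, a hedged Python sketch that downloads several objects in parallel with a thread pool (the bucket and keys are placeholders):

```python
import boto3
from concurrent.futures import ThreadPoolExecutor

s3 = boto3.client("s3")
BUCKET = "my-media-bucket"                      # placeholder bucket
keys = ["img/1.jpg", "img/2.jpg", "img/3.jpg"]  # placeholder object keys

def fetch(key: str) -> str:
    # Download each object to the current directory, keeping the base name.
    local_name = key.rsplit("/", 1)[-1]
    s3.download_file(BUCKET, key, local_name)
    return local_name

# Run the downloads concurrently instead of one after another.
with ThreadPoolExecutor(max_workers=8) as pool:
    for name in pool.map(fetch, keys):
        print("downloaded", name)
```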