I am using MediaIoBaseDownload to implement downloads from GCS.
But I found that there is always a gap of about 5 seconds between each download response.
If I download two files at the same time, the gap between responses grows to around 10 seconds.
Upload speed is fine; the problem only occurs while downloading.
Is there any limitation on the download API? I could not find a documented limit.
After adding some log statements I found that most of the time is spent in response.read() inside httplib2.
Could this be a limit enforced by the GCS server, or is there some bucket setting (e.g. DRA) that affects download speed?
I am using Python 2.7.8.
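For reference, the download loop looks roughly like this (a minimal sketch; the bucket and object names are placeholders), with timing logged around each next_chunk() call:

import io
import time

import httplib2
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload
from oauth2client.client import GoogleCredentials

# Build an authorized Cloud Storage JSON API client.
credentials = GoogleCredentials.get_application_default()
service = build("storage", "v1", http=credentials.authorize(httplib2.Http()))

request = service.objects().get_media(bucket="my-bucket", object="my-object")
buf = io.BytesIO()

# A larger chunksize means fewer request/response round trips per object.
downloader = MediaIoBaseDownload(buf, request, chunksize=8 * 1024 * 1024)

done = False
while not done:
    start = time.time()
    status, done = downloader.next_chunk()
    print("chunk took %.1fs, %d%% done"
          % (time.time() - start, int(status.progress() * 100)))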
thanks!
Related
Using AWS Amplify Storage, uploading a file to AWS S3 should be simple:
Storage.put(key, blob, options)
The above works without problem for smaller files (no larger than around 4 MB).
Uploading anything larger, ex. a 25MB video, does not work: Storage just freezes (app does not freeze, only Storage). No error is returned.
Question: How can I upload larger files using AWS Amplify Storage?
Side note: Described behaviour appears both on Android and iOS.
Amplify now automatically segments large files into 5 MB chunks and uploads them using the Amazon S3 multipart upload process; a sketch of that process follows the links below.
https://aws.amazon.com/about-aws/whats-new/2021/10/aws-amplify-javascript-file-uploads-storage/
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html#mpu-process
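To make the multipart process concrete, here is a rough sketch of what it looks like at the plain S3 API level (Python with boto3 as an illustration, not Amplify's own code; the bucket, key, and file names are placeholders):

import boto3

s3 = boto3.client("s3")
CHUNK = 5 * 1024 * 1024  # 5 MB minimum part size (the last part may be smaller)

# 1. Start the multipart upload and remember its UploadId.
mpu = s3.create_multipart_upload(Bucket="my-bucket", Key="video.mp4")

# 2. Upload the file in 5 MB parts, collecting the ETag of each part.
parts = []
with open("video.mp4", "rb") as f:
    part_number = 1
    while True:
        data = f.read(CHUNK)
        if not data:
            break
        resp = s3.upload_part(
            Bucket="my-bucket",
            Key="video.mp4",
            PartNumber=part_number,
            UploadId=mpu["UploadId"],
            Body=data,
        )
        parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
        part_number += 1

# 3. Tell S3 to assemble the parts into the final object.
s3.complete_multipart_upload(
    Bucket="my-bucket",
    Key="video.mp4",
    UploadId=mpu["UploadId"],
    MultipartUpload={"Parts": parts},
)

Newer Amplify versions do the equivalent of this for you whenever the file is large enough.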
After updating to
"aws-amplify": "ˆ4.3.11",
"aws-amplify-react-native": "^6.0.2"
uploads over 100 MB no longer freeze the UI, and we also migrated to resumable uploads. When we used the older version "aws-amplify": "^3.1.1", the problems you mention were present.
Here is the pull request from December 2021 for the mentioned fixes:
https://github.com/aws-amplify/amplify-js/pull/8336
So the solution is really to upgrade AWS Amplify library.
However, this approach works only on iOS.
Uploading big media files on Android results in a network error when calling fetch (a required step before calling the Storage.put method).
Although the same method works perfectly on the web, in React Native uploading big files was/is not implemented optimally (bearing in mind that fetch() loads the whole file into memory).
I have an 8 GB CSV file of 104 million rows sitting on my local hard drive. I need to upload this either directly to BigQuery as a table, or to Google Cloud Storage and then point BigQuery at it. What's the quickest way to accomplish this? After trying the web console upload and the Google Cloud SDK, both are quite slow (moving at 1% progress every few minutes).
Thanks in advance!
All 3 existing answers are right, but if you have low bandwidth none of them will help you; you will be physically limited.
My recommendation is to gzip your file before sending it. Text files have a high compression ratio (up to 100 times), and you can ingest gzip files directly into BigQuery without unzipping them.
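The compression step is just the equivalent of running gzip data.csv; a minimal Python sketch (the file names are placeholders):

import gzip
import shutil

# Compress the CSV; text compresses very well, so expect a large reduction.
with open("data.csv", "rb") as src, gzip.open("data.csv.gz", "wb") as dst:
    shutil.copyfileobj(src, dst)

The resulting data.csv.gz can then be uploaded and loaded into BigQuery as-is, since BigQuery reads gzip-compressed CSV sources directly.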
Using the gsutil tool is going to be much faster and more fault tolerant than the web console (which will probably time out before finishing anyway). You can find detailed instructions here (https://cloud.google.com/storage/docs/uploading-objects#gsutil), but essentially, once you have the gcloud tools installed on your computer, you'll run:
gsutil cp [OBJECT_LOCATION] gs://[DESTINATION_BUCKET_NAME]/
From there, you can load the file into BigQuery (https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-csv), which will all happen on Google's network.
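For that second step, a minimal sketch assuming the google-cloud-bigquery client library (the bq CLI or the console work just as well; bucket, dataset and table names are placeholders):

from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    autodetect=True,        # let BigQuery infer the schema
    skip_leading_rows=1,    # skip the CSV header row, if there is one
)

load_job = client.load_table_from_uri(
    "gs://my-bucket/data.csv",          # the object uploaded with gsutil
    "my-project.my_dataset.my_table",   # destination table
    job_config=job_config,
)
load_job.result()  # the load itself runs entirely on Google's side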
The bottleneck you're going to face is your internet upload speed during the initial upload. What we've done in the past to bypass this is spin up a compute box, run whatever process generated the file, and have it output onto the compute box. Then we use the built-in gsutil tool to upload the file to Cloud Storage. This has the benefit of running entirely on Google's network and will be pretty quick.
I would recommend you take a look at this article, where there are several points to take into consideration.
Basically the best option is to upload your object using the parallel upload feature of gsutil; in the article you can find this command:
gsutil -o GSUtil:parallel_composite_upload_threshold=150M cp ./localbigfile gs://your-bucket
There you will also find several tips to improve your upload, like tuning the chunk size of the objects to upload.
Once uploaded, I'd go with the option that dweling has provided for the BigQuery part, looking further at this document.
Have you considered using the BigQuery command-line tool, as per the example provided below?
bq load --autodetect --source-format=CSV PROJECT_ID:DATASET.TABLE ./path/to/local/file/data.csv
The above command will directly load the contents of the local CSV file data.csv into the specified table, with the schema automatically detected. Alternatively, details on how you can customise the load job to your requirements by passing additional flags can be found here: https://cloud.google.com/bigquery/docs/loading-data-local#bq
I am uploading a file to my Google Cloud Platform VM using scp on Linux. However, after initially uploading at a speed of 900 kb/s it quickly falls to 20 kb/s. My internet upload speed should be around 20 Mbps. I wanted to upload an SQLite database clocking in at 20 GB, but this is unfeasible at this point.
Just now it took 54 minutes to upload a 94 MB file. Surely it cannot be that slow?
I had the same issue multiple times with GCP. The solution I use is to compress all my files, upload them to Dropbox, and then wget the file from there. The speeds should go back to normal.
This answer should help you as well, though I don't know if your particular issue is related to GCP, scp, or both.
https://askubuntu.com/questions/760509/how-to-make-scp-go-faster
G'day,
I am playing around with django-skel on a recent project and have used most of its defaults: Heroku for hosting and S3 for file storage. I'm mostly serving a static-y site, except for using sorl for thumbnail generation; however, the response times are pathetic.
You can visit the site: http://bit.ly/XlzkXp
My template looks like: https://gist.github.com/cd15e320be6f4454a7fb
I'm serving the template using a shortcut from the URL conf, no database usage at all: https://gist.github.com/f9d1a9a191959dcff1b5
However, it's consistently taking 15+ seconds to respond. New Relic shows this is because of requests going to S3 while processing the view. This does not make any sense to me.
New Relic data: http://i.imgur.com/vs9ZTLP.png?1
Why is something using httplib to request things from S3? I can see how collectstatic might be doing it, but not the processing of the view itself.
What am I not understanding about Django-skel and this setup?
I have the same issue. My guess is that django-compress and django-storages are both in use, which results in the former saving the cache it needs to render templates to the S3 bucket, and then reading it back (over the network, hence httplib) while rendering each template.
My second guess was that following the django-compress instructions for remote storage (implementing an "S3 storage backend which caches files locally, too") would resolve this issue.
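Those instructions boil down to a storage subclass roughly like the following sketch (adapted from the django-compressor remote-storage docs; where you put the module is up to you):

from django.core.files.storage import get_storage_class
from storages.backends.s3boto import S3BotoStorage

class CachedS3BotoStorage(S3BotoStorage):
    """S3 storage backend that saves the files locally, too."""

    def __init__(self, *args, **kwargs):
        super(CachedS3BotoStorage, self).__init__(*args, **kwargs)
        self.local_storage = get_storage_class(
            "compressor.storage.CompressorFileStorage")()

    def save(self, name, content):
        # Save the compressed file locally first, then hand the local copy
        # to the regular S3 backend so both copies stay in sync.
        self.local_storage._save(name, content)
        super(CachedS3BotoStorage, self).save(
            name, self.local_storage._open(name))
        return name

with COMPRESS_STORAGE (and STATICFILES_STORAGE) in settings.py pointed at that class.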
Though it makes sense to me (saving the cache to both locations, local and S3, and reading from the local filesystem first should speed things up), it somehow does not work this way; the response time is still around 8+ seconds.
By disabling django-compress with COMPRESS_ENABLED = False I managed to get an average response time of 1-1.3 seconds.
Any ideas?
(I will update this answer in case of any progress)
Does anyone know how to increase the request time length in Drupal? I'm trying to download large files via the web services module, but my token keeps expiring because the request takes so long. I know there is a setting in Drupal to do this, but I just can't find it.
UPDATE
So I found out how to increase the request time (/admin/build/services/settings), but that didn't work. I'm still getting "The request timed out" on files about 10 MB large. Does anyone have any ideas? Also, I'm using ASIHTTPRequest and drupal-ios-sdk, and downloading the files to an iPad.
Turns out the default timeOutSeconds property on the ASIHTTPRequest was too small (10 seconds). When I upped it, my large files downloaded OK.