I have a script that presents the user with a list of files stored in an S3 bucket; when they select one, the file is downloaded and something is then done with it.
This method works on files up to 600 MB, but when the user chooses another file which is 2 GB, I get a Boto exception stating the file is being used by another process.
listname = self.list_ctrl.GetItemText(i)
conn = boto.connect_s3(access_key, secret_key)
bucket = conn.get_bucket('data')
key = bucket.get_key(listname)
key.get_contents_to_filename(key.name)
It is really puzzling, as it works great for smaller files.
Any ideas what may be causing it to fail?
The user records audio, the audio gets saved into an audio Blob and sent to the backend. I want to get the audio file and send it to the OpenAI Whisper API.
files = request.FILES.get('audio')
audio = whisper.load_audio(files)
I've tried different ways to send the audio file, but none of them seemed to work, and I don't understand how it should be sent. I would prefer not to save the file; I want the user-recorded audio sent to the Whisper API from the backend.
Edit:
The answer by AKX seems to work, but now there is another error.
Edit 2:
He has edited his answer and everything works perfectly now. Thanks a lot to @AKX!
load_audio() requires a file on disk, so you'll need to cater to it – but you can use a temporary file that's automagically deleted outside the with block. (On Windows, you may need to use delete=False because of sharing permission reasons.)
import os
import tempfile

file = request.FILES.get('audio')

# delete=False so the file can be reopened by name (required on Windows,
# where an open temporary file can't be shared); we delete it ourselves.
with tempfile.NamedTemporaryFile(suffix=os.path.splitext(file.name)[1], delete=False) as f:
    for chunk in file.chunks():
        f.write(chunk)

try:
    # load_audio() reopens the (now closed) file by name
    audio = whisper.load_audio(f.name)
finally:
    os.unlink(f.name)
I want to load the AWS config file and edit the contents of the file.
I found @aws-sdk/shared-ini-file-loader, which works well for loading the config file data as a JSON object.
import { loadSharedConfigFiles } from '@aws-sdk/shared-ini-file-loader'
let awsFileContents = await loadSharedConfigFiles({ configFilepath: '~/.aws/config' })
console.log(awsFileContents.configFile)
Now I want to perform some changes in the awsFileContents.configFile object, parse it back to the correct format, and write it back to the ~/.aws/config file.
Is there an AWS module available that can do that?
I have tried ini, multi-ini, and conf-cfg-ini. But they have issues while parsing the JSON back to the correct format.
You do not need the SDK to read/write the config file. It is a normal INI file you can modify with standard tools for INI files.
You can use e.g. the ini package for this: https://www.npmjs.com/package/ini
In other languages, such as Python, and for my "AWS Manager" (a Windows application written in Delphi), I use simple INI functions to read and write the config file without any issues.
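As a rough sketch of that approach in Python, the standard library's configparser can round-trip the file. Note that configparser is not AWS-aware (section names such as [profile xyz] are just ordinary INI sections to it), and the path and values below are only examples; in practice you would point path at ~/.aws/config.

```python
import configparser
import os
import tempfile

# Stand-in for ~/.aws/config; point `path` at the real file in practice.
path = os.path.join(tempfile.mkdtemp(), "config")
with open(path, "w") as f:
    f.write("[default]\nregion = us-east-1\n")

config = configparser.ConfigParser()
config.read(path)

# Edit a value, then serialize the object back to INI format.
config["default"]["region"] = "eu-west-1"
with open(path, "w") as f:
    config.write(f)
```

One caveat of this round-trip: configparser normalizes whitespace and drops comments, so if you need to preserve those exactly, a comment-preserving INI library is the better fit.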
I can't figure out why I keep getting 'Access Denied'.
I am using wxWidgets to create a file on Windows in the following directory: C:\Users\username\Documents\MyApp. Initially the file won't exist; it gets created and all is well. Doing this a second (or subsequent) time results in the error: error 5: Access Denied. This even happens when running the application as administrator...
The file in question is a backup of a sqlite3 database file and backup can be run multiple times in a day and thus can overwrite the previous file. The filename gets today's date appended to it.
Creation of the file is as follows:
bool DatabaseBackup::CreateBackupFile(const wxString& fileName)
{
    wxFile file;
    bool success = file.Create(fileName, true, wxFile::read_write);
    if (!success) {
        pLogger->error("Failed to create file {0}", fileName.ToStdString());
    }
    file.Close();
    return success;
}
There is another function which appends the date as well as attaches the full path to the file name so the result is like so: C:\Users\username\Documents\MyApp\myapp.2020-03-29.db.
I have also tried checking if the file exists beforehand and using wxRemoveFile(fileName), but this also results in the Access Denied error... Creating files in Notepad and Notepad++ works fine.
Am I missing something? I can't figure this out, especially since it creates the file the first time round.
Remove the 3rd parameter of wxFile::Create(..., wxFile::read_write) so that it takes its default value, wxS_DEFAULT, i.e. call file.Create(fileName, true).
The 3rd parameter expects a value (or combination) of the wxPosixPermissions enum, not of the wxFile::OpenMode enum, so passing wxFile::read_write creates the file with bogus permissions that then deny you access on later runs.
I'm writing a web application that generates reports from a local database. I want to generate an excel spreadsheet and immediately cause the user to download it. However, when I try to return the file via HttpResponse, I can not open the file. However, if I try to open the file in storage, the file opens perfectly fine.
This is using Django 2.1 (for database reasons, I'm not using 2.2) and I'm generating the file with xlrd. There is another excel spreadsheet that will need to be generated and downloaded that uses the openpyxl library (both libraries serve very distinct purposes IMO).
This spreadsheet is not very large (5x6, columns x rows).
I've looked at other similar stack overflow questions and followed their instructions. Specifically, I am talking about this answer:
https://stackoverflow.com/a/36394206/6411417
As you can see in my code, the logic is nearly the same and yet I can not open the downloaded excel spreadsheets. The only difference is that my file name is generated when the file is generated and returned into the file_name variable.
def make_lrm_summary_file(request):
    file_path = make_lrm_summary()
    if os.path.exists(file_path):
        with open(file_path, 'rb') as fh:
            response = HttpResponse(fh.read(), content_type="application/vnd.ms-excel")
            response['Content-Disposition'] = f'inline; filename="{ os.path.basename(file_path) }"'
            return response
    raise Http404
Again, the file is properly generated and stored on my server but the download itself is providing an excel file that can not be opened. Specifically, I get the error message:
EXCEL.EXE - Application Error | The application was unable to start correctly (0x0000005). Click OK to close the application.
I'm new to Google Cloud Platform. I'm trying to read a CSV file present in Google Cloud Storage (non-public bucket accessed via Service Account key) line by line which is around 1GB.
I couldn't find any option to read the file present in the Google Cloud Storage (GCS) line by line. I only see the read by chunksize/byte size options. Since I'm trying to read a CSV, I don't want to use read by chunksize since it may split a record while reading.
Solutions tried so far:
Tried copying the contents of the CSV file in GCS to a temporary local file and reading the temp file with the code below. The code works as expected, but I don't want to copy the huge file to my local instance; instead, I want to read it line by line from GCS.
StorageOptions options = StorageOptions.newBuilder()
        .setProjectId(GCP_PROJECT_ID)
        .setCredentials(gcsConfig.getCredentials())
        .build();
Storage storage = options.getService();
Blob blob = storage.get(BUCKET_NAME, FILE_NAME);
ReadChannel readChannel = blob.reader();
FileOutputStream fileOutputStream = new FileOutputStream(TEMP_FILE_NAME);
fileOutputStream.getChannel().transferFrom(readChannel, 0, Long.MAX_VALUE);
fileOutputStream.close();
Please suggest the approach.
Since I'm doing batch processing, I'm using the code below in my ItemReader's init() method, which is annotated with @PostConstruct. In my ItemReader's read(), I build a List whose size equals the chunk size. This way I can read lines based on my chunkSize instead of reading all the lines at once.
StorageOptions options = StorageOptions.newBuilder()
        .setProjectId(GCP_PROJECT_ID)
        .setCredentials(gcsConfig.getCredentials())
        .build();
Storage storage = options.getService();
Blob blob = storage.get(BUCKET_NAME, FILE_NAME);
ReadChannel readChannel = blob.reader();
BufferedReader br = new BufferedReader(Channels.newReader(readChannel, "UTF-8"));
// read() then calls br.readLine() repeatedly until a chunk-sized List is filled
One of the easiest ways might be to use the google-cloud-nio package, part of the google-cloud-java library that you're already using: https://github.com/googleapis/google-cloud-java/tree/v0.30.0/google-cloud-contrib/google-cloud-nio
It incorporates Google Cloud Storage into Java's NIO, and so once it's up and running, you can refer to GCS resources just like you'd do for a file or URI. For example:
Path path = Paths.get(URI.create("gs://bucket/lolcat.csv"));
try (Stream<String> lines = Files.lines(path)) {
    lines.forEach(s -> System.out.println(s));
} catch (IOException ex) {
    // do something or re-throw...
}
Brandon Yarbrough is right, and to add to his answer:
If you use gcloud to log in with your credentials, then Brandon's code will work: google-cloud-nio will use your login to access the files (and that will work even if they are not public).
If you prefer to do it all in software, you can use this code to read credentials from a local file and then access your file from Google Cloud:
String myCredentials = "/path/to/my/key.json";
CloudStorageFileSystem fs = CloudStorageFileSystem.forBucket(
        "bucket",
        CloudStorageConfiguration.DEFAULT,
        StorageOptions.newBuilder()
                .setCredentials(ServiceAccountCredentials.fromStream(
                        new FileInputStream(myCredentials)))
                .build());
Path path = fs.getPath("/lolcat.csv");
List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
Edit: you don't want to read all the lines at once, so don't use readAllLines; but once you have the Path, you can use any of the other techniques discussed above to read just the part of the file you need: you can read one line at a time or get a Channel object.