Keeping Original Filename for FileField in Django - django

I would like to keep the original file name of an UploadedFile in Django that has its location stored in a FileField. Right now I am observing that if two files have the same name, the first file uploaded keeps its original name, but the second time a file with that name is uploaded, it has a random string appended to make the file name unique. One solution is to add an additional field to the model (see "Django: How to save original filename in FileField?" or "Saving Original File Name in Django with FileField"), but these solutions seem suboptimal as they require changing the model fields.
An alternative would be to prepend a random directory path to the file name, so that the file name is unique within its directory and the basename remains unchanged. One way to do this would be to pass in a callable upload_to that does just that. Another option would be to subclass FileField and override get_filename so that it does not strip the input filename down to its basename, allowing the caller to pass in a filename with a prepended path. The latter option is not ideal if I want to use an ImageField, as I would have to subclass that as well.

Looking at the code that actually generates the unique filename by appending the random string, it seems the best solution to this problem might be to subclass the Storage class in use and override its get_available_name method to create unique filenames by prepending a directory rather than appending the string to the base name.
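A minimal sketch of that idea, assuming a short random hex string is acceptable as the directory name. The helper name prepend_unique_dir and the class name UniqueDirStorage are hypothetical; the commented wiring assumes Django's FileSystemStorage and its get_available_name hook, and is not executed here.

```python
import posixpath
import uuid


def prepend_unique_dir(name):
    """Insert a random directory before the basename, leaving the basename intact."""
    directory, basename = posixpath.split(name)
    # 10 hex characters: collisions are unlikely, but not impossible.
    unique_dir = uuid.uuid4().hex[:10]
    return posixpath.join(directory, unique_dir, basename)


# Hypothetical wiring (assumption, requires Django):
# class UniqueDirStorage(FileSystemStorage):
#     def get_available_name(self, name, max_length=None):
#         return prepend_unique_dir(name)

result = prepend_unique_dir("docs/report.pdf")
print(result)  # e.g. docs/3f2a9c1b7e/report.pdf
```

With this approach two uploads named report.pdf land in different directories, so neither basename needs to be altered.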

Sorry for the quick answer; here is another approach to your question:
The idea here is to create a unique folder for each uploaded file.
# in your settings.py file
MY_FILE_PATH = 'stored_files/'
The path where your files will be stored: /public/media/stored_files
# somewhere in your project, create a utils.py file
import random
from hashlib import sha1


def generate_sha1(string, salt=None):
    """
    Generates a sha1 hash for the supplied string.

    :param string:
        The string that needs to be hashed.
    :param salt:
        Optionally define your own salt. If none is supplied, a random
        string of 5 characters is used.
    :return: Tuple containing the salt and hash.
    """
    if not isinstance(string, str):
        string = str(string)
    if not salt:
        salt = sha1(str(random.random()).encode("utf-8")).hexdigest()[:5]
    # hashlib works on bytes, so encode before hashing
    hash = sha1((salt + string).encode("utf-8")).hexdigest()
    return (salt, hash)
In your models.py
from datetime import datetime

from django.conf import settings
from django.db import models

from utils import generate_sha1


def upload_to_unique_folder(instance, filename):
    """
    Uploads a file to a uniquely generated path to keep the original filename
    """
    salt, hash = generate_sha1('{}{}'.format(filename, datetime.now()))
    # the '/' after the hash makes it a folder rather than a filename prefix
    return '%(path)s%(hash_path)s/%(filename)s' % {'path': settings.MY_FILE_PATH,
                                                   'hash_path': hash[:10],
                                                   'filename': filename}


# And then pass the upload_to function to your model's FileField
class MyModel(models.Model):
    file = models.FileField(upload_to=upload_to_unique_folder)
The file will be uploaded to this location:
public/media/stored_files/unique_hash_folder/my_file.extension
Note: I got the code from the django-userena sources and adapted it to my needs.
Note 2: For more information, take a look at this great post on Django file upload: File upload example
Have a good day.
Edit : Trying to provide a working solution :)

To my understanding, during the form submission/file upload process, you can add form validation functions.
During the validation and cleaning process, you could check that the database does not already have a duplicate name (i.e. query to see if that file name exists).
If it is a duplicate, you could just rename it xyz_1, xyz_2, etc.
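A rough sketch of that renaming step, assuming the set of names already in use has been fetched beforehand. The function name next_available_name and the existing_names argument are hypothetical stand-ins for whatever database query the form validation would run.

```python
import os


def next_available_name(filename, existing_names):
    """Return filename unchanged, or with _1, _2, ... inserted before the extension."""
    if filename not in existing_names:
        return filename
    root, ext = os.path.splitext(filename)
    counter = 1
    while True:
        candidate = "{0}_{1}{2}".format(root, counter, ext)
        if candidate not in existing_names:
            return candidate
        counter += 1


taken = {"xyz.pdf", "xyz_1.pdf"}
print(next_available_name("xyz.pdf", taken))  # xyz_2.pdf
print(next_available_name("new.pdf", taken))  # new.pdf
```

Note that checking for a name and then saving is not atomic, so two simultaneous uploads could still pick the same name; this approach also changes the stored file name, which the original question wanted to avoid.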

Related

What is the appropriate way to use a ModelForm field as an argument to upload_to?

I have a ModelForm, where two of the fields are lastname and firstname. I also have a file field for file uploading. As several files are being uploaded by many different people, I would like to group the files into a directory based on their names.
I've been trying to use a custom formatted string to do this, but so far I'm getting an error that there are not enough arguments for format string, and I am wondering if it has something to do with the form not being saved yet.
My attempt to generate a filename based on form fields is:
def filename_path(instance, filename):
    return os.path.join('applicant_documents/%s_%s/%s' % instance.last_name, instance.first_name, filename)
and the field from my model is defined as:
documents = models.FileField(upload_to=filename_path)
Am I doing something wrong, or is this not possible?
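As the error suggests, you need to pass the values to the % operator as a tuple. Without the parentheses, % binds only to instance.last_name, and the other two values become extra arguments to os.path.join.
def filename_path(instance, filename):
    return os.path.join('applicant_documents/%s_%s/%s' % (instance.last_name, instance.first_name, filename))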
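To see the difference outside of Django, here is a small self-contained sketch. The Applicant class is a hypothetical stand-in for the model instance, and posixpath is used only so the output looks the same on every platform.

```python
import posixpath


class Applicant:
    """Hypothetical stand-in for the model instance."""
    last_name = "Doe"
    first_name = "Jane"


def filename_path(instance, filename):
    # Format only the folder name, then let join() add the separators.
    folder = "%s_%s" % (instance.last_name, instance.first_name)
    return posixpath.join("applicant_documents", folder, filename)


# Without a tuple, % receives a single value for three placeholders.
try:
    "applicant_documents/%s_%s/%s" % Applicant.last_name
except TypeError as exc:
    error_message = str(exc)

print(error_message)                         # not enough arguments for format string
print(filename_path(Applicant(), "cv.pdf"))  # applicant_documents/Doe_Jane/cv.pdf
```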

Directing Output Paths of Altered Files

How can I direct the destination of the output file to my db?
My models.py is structured like so:
class Model(models.Model):
    char = models.CharField(max_length=50, null=False, blank=False)
    file = models.FileField(upload_to=upload_location, null=True, blank=True)
I have the user enter a value for 'char', and then the value of 'char' is printed on to a file. The process of successfully printing onto the file is working, however, the file is outputting to my source directory.
My goal is to have the output file 'pdf01.pdf' output to my db and be represented as 'file' so that the admin can read it.
Much of the information in the Django docs focuses on directing the path of files uploaded directly by the user, not on files that have been created internally. I have been reading mostly from these docs:
Models-Fields
Models
File response objects
Outputting PDFs
I have seen it recommended to write to a buffer, not a file, and then save the buffer contents to my db; however, I haven't been able to find many examples online of how to do that that are relevant to my situation.
Perhaps there is a relevant gap in my knowledge regarding buffers and BytesIO? Here is the function I have been using to alter the pdf, I have been using BytesIO to temporarily store files throughout the process but have not been able to figure out how to use it to direct the output anywhere specific.
packet = BytesIO()
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(10, 10, char)
can.save()
packet.seek(0)
text_pdf = PdfFileReader(packet)
base_file = PdfFileReader(open("media/01.pdf", "rb"))
page = base_file.getPage(0)
page.mergePage(text_pdf.getPage(0))
output = PdfFileWriter()
output.addPage(page)
output.write(open("pdf01.pdf", "wb"))
FileField does not store files directly in the database. Files get uploaded in a location on the filesystem determined by the upload_to argument. Only some metadata are stored in the DB, including the path of the file in your filesystem.
If you want to have the contents of the files in the database, you could create a new File model that includes a BinaryField to store the data and a CharField to store the URL from which the file can be fetched. To feed the data of PdfFileWriter to the binary field of Django, perhaps the most appropriate would be to use BytesIO.
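As a rough illustration of the buffer approach, the sketch below collects a writer's output in a BytesIO instead of a file on disk. FakeWriter is a hypothetical stand-in so the example runs without PyPDF2; anything with a write(stream) method, such as a PdfFileWriter, should behave the same way. The commented Django lines are assumptions about how the bytes could then be stored and are not executed here.

```python
from io import BytesIO


def writer_to_bytes(writer):
    """Collect the output of any object that has a write(stream) method."""
    buffer = BytesIO()
    writer.write(buffer)
    return buffer.getvalue()


class FakeWriter:
    """Stand-in for PdfFileWriter so the sketch runs on its own."""

    def write(self, stream):
        stream.write(b"%PDF-1.4 placeholder")


pdf_bytes = writer_to_bytes(FakeWriter())
print(len(pdf_bytes))

# Hypothetical wiring in Django (assumption, not executed):
# from django.core.files.base import ContentFile
# instance.file.save("pdf01.pdf", ContentFile(pdf_bytes))  # FileField on storage
# record.data = pdf_bytes                                  # BinaryField in the DB
```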
I found this workaround to direct the file to a desired location (in this case, to my media_cdn folder and also to the admin as a download).
I set up an admin action to perform the function that outputs the file so the admin will have access to both the output version in the form of both an HTTP response and through the media_cdn storage.
Hope this helps anyone who struggles with the same problem.
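#admin.py
class edit_and_output():
    def output(self):
        author = Account.email
        # alter file . . . (this step builds `output`, the PDF writer)
        # write a copy into the media_cdn storage folder
        with open('media_cdn/account/{0}.pdf'.format(author), 'wb') as out_file:
            output.write(out_file)
        # and send the same PDF back to the admin as a download
        response = HttpResponse(content_type='application/pdf')
        response['Content-Disposition'] = 'attachment;filename="{0}.pdf"'.format(author)
        output.write(response)
        return response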

Django rest Framework, change filename of ImageField

I have an API endpoint with Django Rest Framework to upload an image.
class MyImageSerializer(serializers.ModelSerializer):
    image = serializers.ImageField(source='image')
I can upload images, but they are saved with the filename that is sent from the client, which can result in collisions. I would like instead to upload the file to my CDN with a timestamp filename.
Generating the filename is not the problem, just saving the image with it.
Does anyone know how to do that?
Thanks.
If your image field is a Django ImageField, then you don't really have to do anything, not even declare it in your serializer like you did. It's enough to add it to the fields attribute, and Django will handle collisions. This means Django will append an _index to each new file name that would otherwise collide, so if you upload a file named 'my_pic.jpg' 5 times, you will actually have the files 'my_pic.jpg', 'my_pic_1.jpg', 'my_pic_2.jpg', 'my_pic_3.jpg', 'my_pic_4.jpg' on your server.
Now, this is done by Django's FileSystemStorage implementation, but if you want it to append a timestamp to your filename instead, all you have to do is write a storage class that overrides the get_available_name(name) method. Example:
class MyFileSystemStorage(FileSystemStorage):
    def get_available_name(self, name, max_length=None):
        ''' name is the current file name '''
        now = time.time()
        stamp = datetime.datetime.fromtimestamp(now).strftime('%Y-%m-%d-%H-%M-%S')
        # put the stamp before the extension so 'my_pic.jpg' stays a .jpg
        root, ext = os.path.splitext(name)
        return '{0}_{1}{2}'.format(root, stamp, ext)
And the image field in your model:
image = models.ImageField(upload_to='your upload dir', storage=MyFileSystemStorage())
Important update
As of August 20, 2014 this is no longer an issue, since Django found a vulnerability related to this behaviour (thanks @mlissner for pointing it out). Important excerpt:
We’ve remedied the issue by changing the algorithm for generating file
names if a file with the uploaded name already exists.
Storage.get_available_name() now appends an underscore plus a random 7
character alphanumeric string (e.g. "_x3a1gho"), rather than iterating
through an underscore followed by a number (e.g. "_1", "_2", etc.).
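To make the naming scheme in that excerpt concrete, here is an illustrative sketch. It is not Django's actual implementation, only an approximation of the described behaviour; the function name with_random_suffix and the choice of alphabet are assumptions.

```python
import os
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits


def with_random_suffix(name, length=7):
    """Append an underscore plus a random alphanumeric string before the extension."""
    root, ext = os.path.splitext(name)
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return "{0}_{1}{2}".format(root, suffix, ext)


print(with_random_suffix("my_pic.jpg"))  # e.g. my_pic_x3a1gho.jpg
```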

File upload after model save on Django admin

I am using a file upload in my Django model like this :
def upload_path(self, filename):
    return 'upload/actualities/%s/%s' % (self.id, filename)

photo = models.ImageField(upload_to=upload_path)
and my adminModel is:
from actualities.models import *
from django.contrib import admin


class ActualityAdmin(admin.ModelAdmin):
    class Media:
        js = ('/static/js/tiny_mce/tiny_mce.js', '/static/js/textareas.js')


admin.site.register(Actuality, ActualityAdmin)
Everything works fine when I edit my model, because it already has an id. But when I create it, the file upload happens before the model is saved, so my file ends up in /media/actualities/None/filename.jpg, and I want /media/2/filename.jpg
How can I force the file upload to happen after the model is saved?
Thank you!!!
You will probably want to override the Model's save() method, and maybe come up with a custom "don't do anything" UploadHandler, then switch back to the original one and call save again.
https://docs.djangoproject.com/en/dev/topics/http/file-uploads/
https://docs.djangoproject.com/en/dev/topics/db/models/
What I would do in this situation however, is make a custom upload handler that saves the file off into some temp space. Then I'd override the save method (in a mixin or something) that moves the file from temp to wherever you wanted it.
@Tomek's answer is another way. If you have your model generate its own id, then you can use that.
A second-to-last suggestion, which is what I do with my photo blog: instead of saving all the images in a directory like media/2/filename.jpg, I save the image by date uploaded, e.g. 2011/10/2/image.jpg. This helps keep any one directory from getting too unwieldy.
Finally, you could hash the file names and store them in directories of hash name to kind of equally spread out the images in a directory.
I've picked the date style because that's meaningful for me with that project. Perhaps there is another way you can name an image for saving that would mean something more than "model with id 2's pics" that you could use for this problem.
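The date-based and hash-based layouts could be sketched roughly as follows. The function names date_path and hashed_path are hypothetical, and hashing the file name (rather than, say, the file contents) is an assumption made to keep the example short.

```python
import hashlib
from datetime import date


def date_path(filename, today=None):
    """Build a year/month/day path like the photo blog example."""
    today = today or date.today()
    return "{0}/{1}/{2}/{3}".format(today.year, today.month, today.day, filename)


def hashed_path(filename):
    """Spread files over directories named after the first hash characters."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    return "{0}/{1}/{2}".format(digest[:2], digest[2:4], filename)


print(date_path("image.jpg", date(2011, 10, 2)))  # 2011/10/2/image.jpg
print(hashed_path("image.jpg"))
```

Neither path depends on the model's id, so both can be computed before the model is saved. For the date layout alone, Django's upload_to also accepts strftime-style strings such as 'photos/%Y/%m/%d'.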
Good Luck!
As a workaround, try generating a UUID for the file name (instead of using self.id).
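A minimal sketch of the UUID workaround, reusing the path layout from the question. Here the UUID replaces the id directory so the original file name is kept; that placement is an assumption, and the UUID could equally replace the file name itself.

```python
import uuid


def upload_path(instance, filename):
    # A UUID is available before the model is saved, unlike instance.id.
    return "upload/actualities/{0}/{1}".format(uuid.uuid4().hex, filename)


path = upload_path(None, "filename.jpg")
print(path)  # e.g. upload/actualities/<32 hex characters>/filename.jpg
```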

Django upload file into specific directory that depends on the POST URI

I'd like to store uploaded files into a specific directory that depends on the URI of the POST request. Perhaps, I'd also like to rename the file to something fixed (the name of the file input for example) so I have an easy way to grep the file system, etc. and also to avoid possible security problems.
What's the preferred way to do this in Django?
Edit: I should clarify that I'd be interested in possibly doing this as a file upload handler to avoid writing a large file twice to the file system.
Edit2: I suppose one can just 'mv' the tmp file to a new location. That's a cheap operation if on the same file system.
Fixed olooney's example. It is working now:
@csrf_exempt
def upload_video_file(request):
    folder = 'tmp_dir2/'  # request.path.replace("/", "_")
    uploaded_filename = request.FILES['file'].name
    BASE_PATH = '/home/'
    # create the folder if it doesn't exist.
    try:
        os.mkdir(os.path.join(BASE_PATH, folder))
    except OSError:
        pass
    # save the uploaded file inside that folder.
    full_filename = os.path.join(BASE_PATH, folder, uploaded_filename)
    fout = open(full_filename, 'wb+')
    file_content = ContentFile(request.FILES['file'].read())
    try:
        # Iterate through the chunks.
        for chunk in file_content.chunks():
            fout.write(chunk)
        fout.close()
        html = "<html><body>SAVED</body></html>"
        return HttpResponse(html)
    except Exception:
        html = "<html><body>NOT SAVED</body></html>"
        return HttpResponse(html)
Django gives you total control over where (and if) you save files. See: http://docs.djangoproject.com/en/dev/topics/http/file-uploads/
The below example shows how to combine the URL and the name of the uploaded file and write the file out to disk:
def upload(request):
    folder = request.path.replace("/", "_")
    uploaded_filename = request.FILES['file'].name
    # create the folder if it doesn't exist.
    try:
        os.mkdir(os.path.join(BASE_PATH, folder))
    except OSError:
        pass
    # save the uploaded file inside that folder.
    full_filename = os.path.join(BASE_PATH, folder, uploaded_filename)
    fout = open(full_filename, 'wb+')
    # Iterate through the chunks of the uploaded file.
    for chunk in request.FILES['file'].chunks():
        fout.write(chunk)
    fout.close()
Edit: How to do this with a FileUploadHandler? I traced down through the code, and it seems like you need to do four things to repurpose the TemporaryFileUploadHandler to save outside of FILE_UPLOAD_TEMP_DIR:
1. Extend TemporaryUploadedFile and override __init__() to pass through a different directory to NamedTemporaryFile. It can use the try/mkdir/except/pass pattern I showed above.
2. Extend TemporaryFileUploadHandler and override new_file() to use the above class.
3. Also extend the handler's __init__() to accept the directory where you want the folder to go.
4. Dynamically add the request handler, passing through a directory determined from the URL:
request.upload_handlers = [ProgressBarUploadHandler(request.path.replace('/', '_'))]
While non-trivial, it's still easier than writing a handler from scratch: In particular, you won't have to write a single line of error-prone buffered reading. Steps 3 and 4 are necessary because FileUploadHandlers are not passed request information by default, I believe, so you'll have to tell it separately if you want to use the URL somehow.
I can't really recommend writing a custom FileUploadHandler for this. It's really mixing layers of responsibility. Relative to the speed of uploading a file over the internet, doing a local file copy is insignificant. And if the file's small, Django will just keep it in memory without writing it out to a temp file. I have a bad feeling that you'll get all this working and find you can't even measure the performance difference.
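For comparison, the plain "write the chunks to a folder derived from the URL" approach can be sketched without any upload handler. The function name save_upload and its parameters are hypothetical; in a Django view the chunks would presumably come from request.FILES['file'].chunks() and the path from request.path. Unlike the example above, this sketch strips the leading and trailing slashes before replacing the rest.

```python
import os
import tempfile


def save_upload(chunks, base_path, request_path, filename):
    """Write an iterable of byte chunks into a folder derived from the URL path."""
    folder = request_path.strip("/").replace("/", "_") or "root"
    target_dir = os.path.join(base_path, folder)
    os.makedirs(target_dir, exist_ok=True)
    # Keep only the basename so a crafted name cannot escape target_dir.
    full_filename = os.path.join(target_dir, os.path.basename(filename))
    with open(full_filename, "wb") as fout:
        for chunk in chunks:
            fout.write(chunk)
    return full_filename


base = tempfile.mkdtemp()
saved = save_upload([b"abc", b"def"], base, "/videos/upload/", "clip.mp4")
print(saved)
```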