Django FileField save the url instead of the path in the database - django

I have a FileField on my form.
I like the way it behaves, writes the file to MEDIA_ROOT, etc.
I'd like to change what it writes to the database.
If I look in the database I see the path, /home/user/media/path/to/file.txt. What I would like it to write is the URL path without the domain: /media/file.txt
Is there an argument to pass into the ModelForm, or Model?
If not, which class do I override?
Seems like bad practice to write absolute paths to a database. I need to be able to dynamically change MEDIA_ROOT & MEDIA_URL, and possibly share this database with other non-Django applications that would only need the URL of the media.

I don't see a way to save the URL in the database, but you can save a relative path in the database. I suppose you have set upload_to to an absolute path. You can change that to a path relative to your MEDIA_ROOT, for example `upload_to="path/to/"`. See also the official documentation on that matter.
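To illustrate why a relative path is enough, here is a minimal sketch in plain Python (not Django's own code) of how a stored relative name can be combined with MEDIA_ROOT and MEDIA_URL at read time. The helper names media_path and media_url and the example settings values are assumptions for illustration; in Django itself, the field's .path and .url attributes do this for you.

```python
import posixpath
from urllib.parse import urljoin

# Hypothetical settings values; they can change without touching the database.
MEDIA_ROOT = "/home/user/media"
MEDIA_URL = "/media/"


def media_path(name, media_root=MEDIA_ROOT):
    """Absolute filesystem path for a stored relative name."""
    return posixpath.join(media_root, name)


def media_url(name, media_url=MEDIA_URL):
    """URL path (no domain) for a stored relative name."""
    return urljoin(media_url, name)


stored_name = "path/to/file.txt"  # the relative value kept in the database
print(media_path(stored_name))
print(media_url(stored_name))
```

Because only the relative name is stored, another application sharing the database can build the URL the same way, as long as it knows the media URL prefix.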

Related

How does Django store the reference of media file (or images) in database

My question is not how to store files/images or how to fetch/show/access them in templates.
My question is one level deeper, I want to know
When we define a file/image field in a Django model and upload a file to it, the file/image is stored in the media root, and we then access it through modelInstance.file.url. So does Django store the URL of the file/image, its name, or just its location inside the media root (to which .url prepends the media URL) in the database?
I just want to know what reference Django saves in the database for a media file.
Generally it saves the location of the file inside the media root. You can see this in the database as well.
Here are the official docs: https://docs.djangoproject.com/en/3.0/topics/files/

Get absolute file path of FileField of a model instance in Django

I have a page where people can upload files to my server.
I want to do something with the file in Celery. So, I need to know the absolute file path of the uploaded FileField of my model.
Let's say I queried the model and got the instance.
Now I need to get the absolute file path, including the file name.
obj = Audio.objects.get(pk=1)
I'm currently trying obj.filename and it's only printing the file name and not the absolute path.
I know I can get the upload path I input into upload_to and media directory, but I was wondering if there was a more DRY and automatic approach.
How do I get the absolute path of the file that is a file field in obj?
Found an answer.
I have to use .path on the FileField:
obj.audio_file.path
Here obj is the model instance I queried and audio_file is the FileField.

Django FilePathField Best Practice

I've looked over the questions on SO and none explains the proper usage of Django's FilePathField. The Django documentation about it is a little short, and a web search does not yield good tutorials about it either. In addition, must non-uploaded files be collected into the static directory, reside inside the apps where they are used, or live at the project level?
If you want to handle uploaded files, consider using FileField instead of FilePathField. FileField also stores the path to the uploaded file, but it is designed to handle new file creation for uploads. FilePathField is just used to point to a path in your file system.
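As a rough illustration of "pointing to a path in your file system", the sketch below lists files that already exist under a directory and filters them with a regular expression on the base file name. As far as I understand, this is similar in spirit to what FilePathField does with its path and match arguments. It is plain Python, not Django's implementation, and the function name file_path_choices is made up for this example.

```python
import os
import re
import tempfile


def file_path_choices(path, match=None):
    """List files directly under `path` whose base name matches `match`."""
    pattern = re.compile(match) if match else None
    choices = []
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        if not os.path.isfile(full):
            continue  # skip sub-directories in this non-recursive sketch
        if pattern is None or pattern.search(entry):
            choices.append(full)
    return choices


# Demo on a throwaway directory with two hypothetical files.
with tempfile.TemporaryDirectory() as demo_dir:
    for fname in ("report.pdf", "notes.txt"):
        open(os.path.join(demo_dir, fname), "w").close()
    demo_choices = [
        os.path.basename(p) for p in file_path_choices(demo_dir, r"\.pdf$")
    ]
print(demo_choices)
```

Nothing is uploaded or created by the field in this model of it; it only selects among files that are already there.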

How to tell Django to put media (pdf) files urls in sitemap.xml?

I've put static views and model views in my Django-generated sitemap.xml file, but I do not know how to tell Django to put all of the media files into it. I have a hundred PDF files with SEO-friendly links and I want them in my sitemap.xml, but as they are not related to any of my models I don't know how to manage this.
EDIT: I almost forgot one important thing: my media (PDF) files are served through CloudFront, so even if I somehow manage to list them in my Django sitemap.xml I'll have an additional problem, because they have 'something.cloudfront.com' in their URLs and not my web site's URL 'example.com'.
Is this even possible to solve? How does this reflect on SEO?
SOLVED:
@kb, thanks for a great answer! I've used RewriteRule in my htaccess as you suggested in the first part of your answer, and it works fine.
As for the second part, instead of creating a model for my media files (which would work just fine, the only downside being that every new PDF file has to be added manually),
I decided to add some lines to my items() method so I could list the bucket content and filter the PDF files. With that I can have all of my files up to date all the time, easily:
# sitemap.py
import re

import boto
from boto.s3.key import Key
from boto.s3.connection import S3Connection

# items() is a method of the Sitemap subclass
def items(self):
    # Placeholders: fill in your own credentials and bucket name.
    AWS_ACCESS_KEY_ID = 'my_access_key_number'
    AWS_SECRET_ACCESS_KEY = 'my_secret_access_key'
    Bucketname = 'my_bucket_name'
    conn = boto.s3.connect_to_region(
        'eu-central-1',
        aws_access_key_id=AWS_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
        is_secure=False,
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),
    )
    bucket = conn.get_bucket(Bucketname)
    new_list = []
    # I hold my media files in bucketsubfolder, so a URL is for example
    # somedomain.cloudfront.net/bucketsubfolder/media/somefile.pdf
    regex = re.compile(r'bucketsubfolder/media.*\.pdf$', re.I)
    for item in bucket.list():
        if regex.match(item.name):
            new_list.append(item.name)
    return new_list
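The filtering step of that method can be tried without an S3 connection. The following is a small sketch using the same regular expression on a hand-written list of key names; the sample names and the helper filter_pdf_keys are invented for illustration. Note that re.match anchors at the start of the string, so keys that merely contain the folder name further along are skipped.

```python
import re

# Same pattern as in items(): keys under bucketsubfolder/media ending in .pdf
PDF_KEY = re.compile(r'bucketsubfolder/media.*\.pdf$', re.I)


def filter_pdf_keys(names):
    """Keep only the key names that match the PDF pattern from the start."""
    return [name for name in names if PDF_KEY.match(name)]


# Hypothetical key names standing in for bucket.list() results.
sample = [
    "bucketsubfolder/media/somefile.pdf",
    "bucketsubfolder/media/docs/Other.PDF",
    "bucketsubfolder/media/image.png",
    "other/bucketsubfolder/media/x.pdf",
]
print(filter_pdf_keys(sample))
```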
You are not allowed to use external urls in your sitemap (or rather, they won't have the desired effect being indexed by Google as part of your site content).
I think your best option is to dedicate a path on your site like /hosted/pdf/xxxx.pdf that rewrites everything to cloudfront.com/pdf/xxxx.pdf or similar using mod_rewrite/location patterns/regex.
That way you can use a local site URL in your sitemap but still have the browser sent to the CloudFront-served content directly. I think this might even be a good use of the 302 HTTP status code.
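The mapping itself is simple. Here is a sketch in Python of what such a rewrite would compute, assuming a local prefix of /hosted/pdf/ and a CDN base built from the 'something.cloudfront.com' host mentioned in the question; both constants and the function name are placeholders. In practice this would live in the web server's rewrite rules, or in a view that answers with a 302 redirect to the returned URL.

```python
LOCAL_PREFIX = "/hosted/pdf/"
# Hypothetical CDN base; substitute your own distribution host and folder.
CDN_BASE = "https://something.cloudfront.com/pdf/"


def cdn_redirect_target(path):
    """Return the CDN URL for a local PDF path, or None if it does not apply."""
    if not path.startswith(LOCAL_PREFIX):
        return None
    filename = path[len(LOCAL_PREFIX):]
    # Only plain PDF file names directly under the prefix are redirected.
    if not filename or "/" in filename or not filename.lower().endswith(".pdf"):
        return None
    return CDN_BASE + filename


print(cdn_redirect_target("/hosted/pdf/xxxx.pdf"))
print(cdn_redirect_target("/about/"))
```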
In the Sitemap class there is an items() method that returns what is to be included in the sitemap.xml, and you could create your own class that extends it and adds additional data.
You can manually hardcode the data in the method, but I think the preferred option is to create a Model that represents each remotely hosted file and that contains the information necessary to output it in the sitemap. (This also lets you add properties such as visibility on a per-file basis and lets you manage it via the admin, assuming you set up a ModelAdmin for it.)
I think you might be able to do something similar to what they show in http://docs.djangoproject.com/en/1.9/ref/contrib/sitemaps with the BlogSitemap class that extends Sitemap. Be sure to check the heading "Sitemap for static views" on that page as well.
My suggestion is that you choose the model approach to represent the files, so you have your hosted PDFs (or other CDN content) as a model called StaticHostedFile or similar and you iterate through all of them in the items() method. It does require you to index all the current PDFs to create model instances for them, as well as to create a new instance whenever a new PDF is added (but that could be automated).
It can be good to know that you can add "includes" in a sitemap.xml so you might be able to split the site content into two sitemaps (content+pdfs) and include both in sitemap.xml, for instance:
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/original_sitemap.xml</loc>
    <lastmod>2016-07-12T09:12Z</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/pdf_sitemap.xml</loc>
    <lastmod>2016-07-15T08:55Z</lastmod>
  </sitemap>
</sitemapindex>
This still requires local URLs and rewrites as per above though, but it can be a nifty trick for when you have several separate sitemaps to combine. (For instance if running a Django site under one subdir and a Wordpress site under another or whatnot.)
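If you would rather generate such an index than write it by hand, a minimal sketch with Python's standard library could look like this. The function name build_sitemap_index is made up for the example, and the entries are the two URLs from the XML above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries):
    """Build a sitemap index from (loc, lastmod) pairs and return it as text."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(root, encoding="unicode")


index_xml = build_sitemap_index([
    ("http://www.example.com/original_sitemap.xml", "2016-07-12T09:12Z"),
    ("http://www.example.com/pdf_sitemap.xml", "2016-07-15T08:55Z"),
])
print(index_xml)
```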

How to make Django url dispatcher use subdomain?

I have a vague idea on how to solve this, but really need a push :)
I have a Django app running with apache (mod_wsgi). Today urls look like this:
http://site.com/category/A/product/B/
What I would like to do is this:
http://A.site.com/product/B
This means that the URL dispatcher somehow needs to pick up the value found in the subdomain and understand the context of this instead of only looking at the path. I see two approaches:
1. Use .htaccess and rewrites so that a.site.com is a rewrite. Not sure if this does the trick, since I don't fully understand what the Django URL dispatcher will see in that case.
2. Once I understand how the URL dispatcher works, I could write a filter that looks at valid subdomains and provides this in a rewritten format to the URL dispatcher code.
Any hints or solutions are very much appreciated! Thanks.
Have you looked at django.contrib.sites? I think a combination of that, setting SITE_ID in your settings.py, and having one WSGI file per "site" can take care of things.
EDIT: -v set.
django.contrib.sites is meant to let you run multiple sites from the same Django project and database. It adds a table (django.contrib.sites.models.Site) that has domain and name fields. From what I can tell, the name can mean whatever you want it to, but it's usually the English name for the site. The domain is what should show up in the host part of the URL.
SITE_ID is set in settings.py to the id of the site being served. In the initial settings.py file, it is set to 1 (with no comments). You can replace this with whatever code you need to set it to the right value.
The obvious thing to do would be to check an environment variable, and look up that in the name or domain field in the Site table, but I'm not sure that will work from within the settings.py file, since that file sets up the database connection parameters (circular dependency?). So you'll probably have to settle for something like:
SITE_ID = int(os.environ.get('SITE_ID', 1))
Then in your WSGI file, you do something like:
os.environ['SITE_ID'] = '2'
and set that last number to the appropriate value. You'll need one WSGI file per site, or maybe there's a way to set SITE_ID from within the Apache setup. Which path to choose depends on the site setup in question.
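A small sketch of the environment lookup, written as a function so it can be tried outside settings.py. The name site_id_from_env is made up for this example. Values in os.environ must be strings (assigning an integer raises TypeError), which is why the value is converted with int() when read.

```python
import os


def site_id_from_env(environ=os.environ, default=1):
    """Read SITE_ID from an environment mapping, falling back to `default`."""
    return int(environ.get("SITE_ID", default))


# Plain dicts stand in for os.environ in this demo.
print(site_id_from_env({"SITE_ID": "2"}))
print(site_id_from_env({}))
```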
The sites framework is most powerful where you use Site as the target of a ForeignKey or ManyToManyField so that you can link your model instances (i.e. records) to specific sites.
Mike's solution is correct if you want to have multiple sites with the same apps but different content (sites module) on multiple domains or subdomains, but it has the drawback that you need to be running multiple instances of the Django process.
A better solution for the main problem of multiple domains or subdomains is to use a simple middleware that handles incoming requests with the process_request() function and changes the documented urlconf attribute of the request object to the URLconf you want to use.
More details and an example of the per-request or per-domain URL dispatcher can be found at:
http://gw.tnode.com/0483-Django/
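A minimal sketch of such a middleware follows. It is written in the newer callable style (a class taking get_response) rather than with process_request(), and it relies only on request.get_host() and the request.urlconf attribute. The base domain, the subdomain-to-URLconf mapping and the module paths are all hypothetical.

```python
# Hypothetical mapping from subdomain to the URLconf module to use.
SUBDOMAIN_URLCONFS = {
    "a": "myproject.urls_category_a",
    "b": "myproject.urls_category_b",
}


def subdomain_of(host, base_domain="site.com"):
    """Return the part of `host` in front of `base_domain`, or None."""
    host = host.split(":")[0].lower()  # drop an optional port
    suffix = "." + base_domain
    if host.endswith(suffix):
        return host[: -len(suffix)]
    return None


class SubdomainURLconfMiddleware:
    """Pick a per-request URLconf based on the subdomain of the host."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        urlconf = SUBDOMAIN_URLCONFS.get(subdomain_of(request.get_host()))
        if urlconf is not None:
            # Django uses request.urlconf as the root URLconf for this request.
            request.urlconf = urlconf
        return self.get_response(request)
```

Requests for unknown hosts are left untouched, so they fall back to the project's default ROOT_URLCONF. The class would still need to be listed in the MIDDLEWARE setting to take effect.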
Try adding a wildcard subdomain: usually *.