MySQL replication and Django FileField

I have MySQL replication set up, and it replicates the database data nicely. However, I also use FileField and ImageField, and have files uploaded onto the filesystem. I will probably just use rsync to replicate these manually, but is there a better way?
I know about key-value storage, but for this project I am looking to minimize the number of technologies involved and stick with simple options. I've successfully used rsync for this before, but I was wondering if others who have done this have any new cool tools (or even rsync wrappers) that work better.
Your experiences are appreciated.

I haven't searched to see if anyone has already done this, but you can write your own code in Django to remotely copy the file to your target server (e.g. over SFTP).
Option 1 on this front: create your own form field that extends the image and file fields and does this uploading.
Option 2: in your form/view, call an additional function that does the uploading (a sketch of this is shown below).
Option 3: override something in Django's code to handle this automatically for image and file fields (probably not recommended, unless there is some slick way I'm not thinking of).
Here's info on using SFTP in Python: SFTP in Python? (platform independent)
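For option 2, here is a minimal sketch of such an upload helper using paramiko; the hostname, credentials, and remote path are placeholders, not anything your project requires:

```python
import os
import paramiko

def push_to_mirror(local_path, remote_root="/srv/media"):
    """Copy a freshly saved upload to a mirror server over SFTP.

    Host, credentials, and paths are placeholders; adjust for your setup.
    """
    transport = paramiko.Transport(("media-mirror.example.com", 22))
    transport.connect(username="deploy", password="secret")  # or use a key file
    sftp = paramiko.SFTPClient.from_transport(transport)
    try:
        remote_path = os.path.join(remote_root, os.path.basename(local_path))
        sftp.put(local_path, remote_path)
    finally:
        sftp.close()
        transport.close()
```

You could call push_to_mirror(instance.myfile.path) in the view right after the model is saved, or from a post_save signal.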
If you're using something like Amazon's CloudFront or S3 buckets, then you can use boto to handle the uploading (I believe): http://aws.amazon.com/code/827?_encoding=UTF8&jiveRedirect=1 (if not, there are probably other Python libraries to help).
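If the target is an S3 bucket, a rough sketch with boto's classic API looks like this; the bucket name and key prefix are made up:

```python
import os

import boto
from boto.s3.key import Key

def push_to_s3(local_path, bucket_name="my-media-bucket"):
    """Upload a local file to S3 with boto (bucket name is a placeholder)."""
    conn = boto.connect_s3()  # reads AWS credentials from the environment or boto config
    bucket = conn.get_bucket(bucket_name)
    key = Key(bucket)
    key.key = "uploads/" + os.path.basename(local_path)
    key.set_contents_from_filename(local_path)
```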

Related

How do I create a migration for a content type?

I want to migrate 3 fields to the Article content type, so I'd like to know the initial steps to follow.
How you do this depends on how you want to migrate the data. Also, is this a migration from a previous Drupal version? Basically, you create those fields in the new database and, using the data type you choose for export (JSON being the easiest), you set up your source, process, and destination module/plugin to manage the content. You can find this info here. Then define the field attributes, and after exporting the data, Drush will pretty much handle the rest.
Start by downloading and installing migrate_plus and migrate_tools into the destination site.
Create new fields in destination site.
Identify the source export type.
Export the data.
Set up the module with process (if needed) and destination configuration.
Load the module in the new site, following the source type directive.
Run Drush migrate commands to manage the import.
As I said, JSON has been the easiest format I have found, and it is done fairly quickly. Setting up the module is where most of the time is spent, but there are enough community modules out there to support you.
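To make that concrete, here is a rough sketch of a migrate_plus migration config; the migration id, source URL, and field names are hypothetical, so adjust them to your export and to your Article fields:

```yaml
# migrate_plus.migration.article_extra_fields.yml (hypothetical id and fields)
id: article_extra_fields
label: Import extra Article fields from a JSON export
source:
  plugin: url
  data_fetcher_plugin: http
  data_parser_plugin: json
  urls:
    - 'https://example.com/export/articles.json'
  item_selector: /articles
  fields:
    - name: id
      selector: id
    - name: title
      selector: title
    - name: subtitle
      selector: subtitle
  ids:
    id:
      type: integer
process:
  title: title
  field_subtitle: subtitle
destination:
  plugin: 'entity:node'
  default_bundle: article
```

With migrate_tools installed you can then run the import with drush migrate:import article_extra_fields (or drush migrate-import on older Drush) and roll it back with drush migrate:rollback.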
Don't forget to always back up; luckily, if you're using Drush, it can roll back the migration for you if needed. It's just good practice.
Good luck.

Foswiki: Uploading and downloading topics without FTP

I have a Foswiki wiki on a server. Is it possible to script the following without FTP access (for various reasons I can't use it):
Download a topic's wikitext, modify it locally, then upload it again (overwriting the topic)
Upload wikitext to a new topic
I've been doing these tasks manually, but I'd like to automate them. I've looked into the Foswiki API and a few plugins, but nothing seems capable of doing this.
Is there a way? (any programming language)
If you have web access, you could drive the bin/view and bin/save scripts remotely from a script.
Take a look at our BuildContrib upload target for an example. It gets a strikeone key and downloads the original topic to recover any form data. It then uploads the topic text, creating a new version. It's written in Perl and uses LWP.
https://github.com/foswiki/distro/blob/master/BuildContrib/lib/Foswiki/Contrib/BuildContrib/Targets/upload.pm
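The same flow can also be driven from any language that can make HTTP requests. Here is a rough Python illustration; the host, web, and topic names are placeholders, and it assumes the site's {Validation}{Method} does not require a strikeone key (if it does, you must first scrape the validation_key from the edit form, as the upload target above does):

```python
import requests

BASE = "https://wiki.example.com/bin"   # placeholder host
WEB, TOPIC = "Sandbox", "MyTopic"       # placeholder web and topic

session = requests.Session()

# Log in (Foswiki's template login takes username/password POST fields)
session.post(f"{BASE}/login", data={"username": "WikiUser", "password": "secret"})

# Download the topic's raw wikitext
text = session.get(f"{BASE}/view/{WEB}/{TOPIC}", params={"raw": "text"}).text

# ... modify the wikitext locally ...
text += "\n\n   * edited by script\n"

# Save it back, creating a new revision (add a validation_key here if strikeone is enabled)
session.post(f"{BASE}/save/{WEB}/{TOPIC}", data={"text": text, "forcenewrevision": "1"})
```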
The following isn't(!) the right solution (surely a nicer Foswiki-native approach exists), but if you know Perl, you can do almost anything with the following:
Install Firefox
Install the MozRepl add-on into it
Install the WWW::Mechanize::Firefox Perl module
Now you can script anything you could do directly in the browser, e.g. logging into Foswiki, clicking buttons, saving topics, and so on. The drawback: it isn't an easy route, and you need to know many details.
I use this technique myself for testing.

Is it better to pull files from remote locations or grant users FTP access to my system?

I need to set up a process to update a database table with user-supplied CSV data (running ColdFusion 8 / MySQL 5.0.88).
I'm not sure about the best way to do this.
Should I give users FTP access to my system, generate a directory for every user, and upload files from there, or should I pick files up from external locations, so the user has to set up an FTP folder my system can access? I'm sort of leaning towards the second way and wanted to set this up using cfschedule and cfftp, but I'm not sure this is the best way to go forward. Security-wise, I'm more inclined to have users specify an FTP location from which I pull, rather than handing out and maintaining FTP folders for every user.
Question:
Which approach is better both in terms of security and automation?
Thanks for input!
I wouldn't use either approach. I would give the users a web page to upload their CSV files. The ColdFusion page that accepts the files would place them into a specific folder and make sure they have unique filenames. The cffile tag will help you with that.
The scheduled job would start with a cfdirectory tag on the target folder. This creates a query object. Loop through it and do what you have to do with each file.
Remember to check for the correct file extension. Then look at the first line of the file to ensure it matches the expected format.
Once you have finished processing the file, do something with it so that you don't process it again on the next scheduled job.
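If it helps to see the shape of that scheduled job outside of CFML, here is a rough Python outline of the same scan/validate/process/move flow; the folder names and expected header row are made up:

```python
import csv
import os
import shutil

INCOMING = "/data/uploads/incoming"        # hypothetical folder the upload page writes to
PROCESSED = "/data/uploads/processed"      # hypothetical folder for files already handled
EXPECTED_HEADER = ["id", "name", "email"]  # made-up expected first line

def run_scheduled_import():
    """Equivalent of the cfdirectory/loop approach: scan, validate, import, move."""
    os.makedirs(PROCESSED, exist_ok=True)
    for name in os.listdir(INCOMING):
        path = os.path.join(INCOMING, name)
        if not name.lower().endswith(".csv"):  # check the file extension
            continue
        with open(path, newline="") as fh:
            reader = csv.reader(fh)
            if next(reader, []) != EXPECTED_HEADER:  # check the first line
                continue  # skip (or move aside) files with an unexpected header
            for row in reader:
                pass  # insert the row into MySQL here
        # Move the file so the next scheduled run doesn't process it again.
        shutil.move(path, os.path.join(PROCESSED, name))
```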
Setting up a custom FTP server is certainly a possibility, since you are able to create users and give them privileges (automated). It is also secure.
But I don't know the best place to start if you don't have any experience with setting up an FTP server.
Try https://www.dropbox.com/
a) Create a Dropbox account and send invites to your users/clients.
b) You can upload files/folders into Dropbox, and your clients/users can access them from their Dropbox account or the Dropbox desktop app.
c) Your users/clients can upload files/folders, and you can access them from your Dropbox website account or desktop app.
Dropbox is a top-ranked service and does well on both security and automation.
Other solutions:
The best is Google Drive (5 GB free): create a new Gmail account, give the ID and password to your users, and ask them to open Google Drive and import/export files. Or try SkyDrive (25 GB free).
http://www.syncplicity.com/
https://www.cubby.com/
http://www.huddle.com/?source=cj&aff=4003003
http://www.egnyte.com/
http://www.sharefile.com/

Django + S3 (boto) + Sorl Thumbnail: Suggestions for optimisation

I am using the S3 storage backend across a Django site I am developing, both to reduce load on the EC2 server(s) and to allow multiple webservers (redundancy, load balancing) to access the same set of uploaded media.
Sorl.thumbnail (v11) template tags are being used in our templates to allow flexible image resizing/cropping.
Performance on media-rich pages is not very good, and when a page containing thumbnails that need to be generated for the first time is accessed, the requests even time out.
I understand that this is due to sorl thumbnail checking/downloading the original image from S3 (which could be quite large and high resolution), and rendering/checking/uploading the thumbnail.
What would you suggest is the best solution to this setup?
I have seen suggestions of storing a local copy of the files in addition to the S3 copy (not too great when a couple of servers are being used for load balancing). I've also seen it suggested to store 0-byte files to fool sorl.thumbnail.
Are there any other suggestions or better ways of approaching this?
sorl-thumbnail is now built with slow remote storages in mind. The first creation of a thumbnail still queries the storage (for example, the first time it is accessed from a template), but after that the references are cached in a key-value store. You still need that first query and creation; one solution is to use the low-level API sorl.thumbnail.get_thumbnail with the same options at the moment the image is uploaded, and to push that thumbnail-creation job onto a queue such as Celery.
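A rough sketch of that idea: a Celery task that calls sorl's low-level API with the same geometry/options your templates use, queued right after upload. The model, field name, and geometry here are examples, not anything prescribed by sorl:

```python
# tasks.py -- pre-generate thumbnails outside the request/response cycle
from celery import shared_task
from sorl.thumbnail import get_thumbnail

from myapp.models import Photo  # hypothetical model with an `image` field

@shared_task
def warm_thumbnails(photo_id):
    photo = Photo.objects.get(pk=photo_id)
    # Use the same geometry/options as the {% thumbnail %} tags in your templates,
    # so the cached reference is the one the templates look up later.
    get_thumbnail(photo.image, "300x200", crop="center", quality=80)

# Then queue it when the image is uploaded, e.g. in the model's save() or a
# post_save signal:
#     warm_thumbnails.delay(instance.pk)
```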
You can use Sorlery. It combines sorl and celery to create thumbnails via workers. It's very careful not to do any filesystem access outside of the worker thread.
The thumbnail returned immediately (before the worker has had a chance) can be controlled by setting your THUMBNAIL_DUMMY_SOURCE to an appropriate placeholder.
The job is created the first time the thumbnail is requested; subsequent requests are served the dummy image until the worker thread completes.
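For example, the placeholder can be configured with sorl-thumbnail's dummy settings (the URL below is just an example):

```python
# settings.py -- served while the worker is still generating the real thumbnail
THUMBNAIL_DUMMY = True
THUMBNAIL_DUMMY_SOURCE = "https://dummyimage.com/%(width)sx%(height)s"
```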
Almost the same as Aidan's solution: I have made some tweaks to sorl-thumbnail, and I also pre-generate thumbnails with Celery. My code is here: sorl_thumbnail-async
But I came to learn that easy_thumbnails does exactly what I was trying to do, so I am using it in my current project. You might find it useful; a short post on the topic is here
The easiest solution I've found so far is actually this third party service: http://cloudinary.com/

Django action after file upload

We have an extensive existing codebase and we've added load-balanced servers with a single master server to the equation now. There are various apps that contain models with uploaded files and images which all work fine... However, this raises the obvious problem of the rsync delay. Rsync is in the crontab and set to run every minute but this still means there's a potential 59 second wait between content being created and it actually existing on the webservers.
What I would like, is to be able to register some kind of 'post file changed' handler that triggers rsync whenever a new file is uploaded. I can't find anything of the sort though! Django has file upload handlers, but these appear to only deal with the actual upload stream, not the file as it is saved to the filesystem thereafter.
The best approach I can see is to create simple extensions to FileField, FieldFile, ImageField and ImageFieldFile as part of my project and hook into the save and delete methods in the FileField. Essentially, to create custom file and image fields with this behaviour added. This isn't massively complicated to do, but it doesn't seem like the most elegant solution to me. I'll need to teach South about my new fields, update every model that is affected, and then create hordes of South migrations (which I'm pretty sure will clash with some code we have pending).
I'm also looking into creating a custom Storage class for the project, but I'm nervous about this having far-reaching effects on other pieces of code.
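To make the storage idea concrete, something like this rough, untested sketch is what I have in mind (the rsync destination is a placeholder, and it assumes a recent Django/Python 3):

```python
# storage.py -- a FileSystemStorage that pushes changes out to the web servers
import subprocess

from django.conf import settings
from django.core.files.storage import FileSystemStorage

RSYNC_TARGET = "web1:/srv/media/"  # placeholder destination

class RsyncedStorage(FileSystemStorage):
    def _push(self):
        # Fire-and-forget rsync of MEDIA_ROOT; a queue would be more robust.
        subprocess.Popen(["rsync", "-az", f"{settings.MEDIA_ROOT}/", RSYNC_TARGET])

    def save(self, name, content, max_length=None):
        name = super().save(name, content, max_length=max_length)
        self._push()
        return name

    def delete(self, name):
        super().delete(name)
        self._push()
```

Fields could then opt in with models.FileField(upload_to=..., storage=RsyncedStorage()), keeping the stock field classes.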
I can't believe no one has come across this issue before. Is there a canonical approach?
Thanks very much!
If you want to tackle this problem from the server side (e.g. a solution similar to rsync) and you're running Linux, you might want to check out lsyncd:
http://code.google.com/p/lsyncd/
lsyncd uses inotify in the Linux kernel to watch directories and invoke an rsync as soon as files are modified. Fairly simple to drop in.