Storing and downloading files from Redis in Django Celery Queue - django

I'm zipping a large number of files in my app, which causes performance problems. So I've decided to zip the files in a separate queue, store the result in Redis, and make it available to the user as soon as the process is done. I'm storing the data in Redis to make it faster, and because I don't need the files to be stored on the server's hard drive.
Here is my task.py:
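import os
import zipfile
from io import BytesIO

from celery import shared_task
from django.core.cache import caches


@shared_task
def zip_files(filenames, key):
    compression = zipfile.ZIP_STORED
    s = BytesIO()
    zf = zipfile.ZipFile(s, mode="w")
    for fpath in filenames:
        fdir, fname = os.path.split(fpath)
        zf.write(fpath, fname, compress_type=compression)
    zf.close()
    # Store the finished archive under the key passed in by the caller
    caches['redis'].set(key, {'file': s.getvalue()})
    return key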
And then here is my simple download view:
def folder_cache(request, folder_hash):
    cache_data = caches['redis'].get(folder_hash)
    if cache_data:
        response = FileResponse(cache_data['file'], content_type="application/x-zip-compressed")
        response['Content-Disposition'] = 'attachment; filename=hello.zip'
        response['Content-Length'] = len(cache_data['file'])
        return response
    return HttpResponse("Cache expired.")
The problem is that I can only download part of the file; the download then stops with a "Network connection was lost" message. The downloaded file seems to contain a series of numbers (not binary data). Am I using FileResponse incorrectly, or do I need to serialize the data before or after putting it in the Redis cache?
I also tried the same code in a shell: it works when I open a file and write the data from the Redis cache directly to the server's hard drive.

Finally, I found out that I only had to wrap file data into ContentFile class. So here is the latest working code:
def folder_cache(request, folder_hash):
    cache_data = caches['redis'].get(folder_hash)
    if cache_data:
        if (cache_data['status'] == 'complete'):
            ...
            response = FileResponse(ContentFile(cache_data['file']), content_type="application/x-zip-compressed")
            response['Content-Disposition'] = 'attachment; filename={}'.format(filename)
            response['Content-Length'] = len(cache_data['file'])
            return response
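A likely explanation for the "set of numbers" (an assumption based on how Python iterates over bytes, not something confirmed in the post): when the response receives a raw bytes object instead of a file-like object, it iterates over it, and iterating over bytes yields one integer per byte. A file-like wrapper such as ContentFile is read in blocks instead. The sketch below uses only the standard library, with io.BytesIO standing in for ContentFile:

```python
import io

data = b"PK\x03\x04"  # the first four bytes of a typical zip archive

# Iterating over a bytes object yields one int per byte, not binary chunks.
as_ints = list(data)

# A file-like object is read in blocks of bytes instead.
stream = io.BytesIO(data)
chunks = list(iter(lambda: stream.read(2), b""))

print(as_ints)
print(chunks)
```

This is why wrapping the cached bytes in a file-like object is a reasonable fix here.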

Related

Django - Create and add multiple xml to zip, and download as attachment

I am a learner, trying to write code that lets the user download a zip file containing multiple .xml files, which are created from the database.
I have been able to write the code below to download a single XML file, but I am struggling to get multiple files (one for each database row) packed into a zip.
import xml.etree.ElementTree as ET

def export_to_xml(request):
    listings = mydatabase.objects.all()
    root = ET.Element('listings')
    for item in listings:
        price = ET.Element('price')
        price.text = str(item.Name)
        offer = ET.Element('offer', attrib={'id': str(item.pk)})
        offer.append(price)
        root.append(offer)
    tree = ET.ElementTree(root)
    response = HttpResponse(ET.tostring(tree.getroot()), content_type='application/xhtml+xml')
    response['Content-Disposition'] = 'attachment; filename="data.xml"'
    return response
I found a solution using the following approach:
byteStream = io.BytesIO()
with zipfile.ZipFile(byteStream, mode='w') as zf:
    # your code
    zf.writestr()  # takes the archive member name and its data
response = HttpResponse(byteStream.getvalue(), content_type='application/x-zip-compressed')
response['Content-Disposition'] = "attachment; filename=finename.zip"
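To make the "your code" part concrete, here is a minimal sketch of writing one XML document per row into an in-memory zip. The function name build_zip, the (pk, name) tuples standing in for database rows, and the offer_<pk>.xml naming are all assumptions made for illustration; they are not from the original posts.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_zip(rows):
    """Return the bytes of a zip archive holding one XML file per (pk, name) row."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for pk, name in rows:
            # Build a small XML document for this row.
            offer = ET.Element("offer", attrib={"id": str(pk)})
            price = ET.SubElement(offer, "price")
            price.text = str(name)
            # writestr() adds an archive member directly from bytes.
            zf.writestr("offer_{}.xml".format(pk), ET.tostring(offer))
    # Read the value only after the archive has been closed and finalized.
    return buffer.getvalue()


zip_bytes = build_zip([(1, "first"), (2, "second")])
```

In a view, the returned bytes could then be passed to HttpResponse with the zip content type, as in the answer above.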

Django send excel file to Celery Task. Error InMemoryUploadedFile

I have a background process that reads an Excel file and saves the data from it. The file needs to be read in the background process, but I get an InMemoryUploadedFile error.
My code:
def create(self, validated_data):
    company = ''
    file_type = ''
    email = ''
    file = validated_data['file']
    import_data.delay(file=file,
                      company=company,
                      file_type=file_type,
                      email=email)
My task looks like this:
@app.task
def import_data(
        file,
        company,
        file_type,
        email):
    ...  # some code
But I get an InMemoryUploadedFile error.
How can I send a file to Celery without errors?
When you delay a task, Celery tries to serialize its arguments, which in your case include a file.
File objects, and in-memory files in particular, can't be serialized.
To fix the problem, save the file to disk, pass its path to the delayed function, and then read the file there and do your processing.
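A rough sketch of the save-then-pass-the-path idea, using only the standard library. The helper name save_upload and the .xlsx suffix are hypothetical, and import_data here is a plain function standing in for the Celery task; in the real code the path would be handed to import_data.delay(...) instead of the file object.

```python
import io
import os
import tempfile


def save_upload(uploaded_file, suffix=".xlsx"):
    """Copy a file-like upload to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as out:
        # Read in blocks so a large upload is never held in memory at once.
        for block in iter(lambda: uploaded_file.read(64 * 1024), b""):
            out.write(block)
    return path


def import_data(path):
    """Stand-in for the task body: read the file, then remove it."""
    try:
        with open(path, "rb") as f:
            return len(f.read())
    finally:
        os.remove(path)


# io.BytesIO plays the role of the uploaded file here.
path = save_upload(io.BytesIO(b"fake spreadsheet bytes"))
size = import_data(path)
```

A path is a plain string, so it serializes without trouble. Note that this only works if the worker can read the same filesystem as the web process.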
Celery does not know how to serialize complex objects such as file objects. However, this can be solved fairly easily. What I do is encode the file to its Base64 string representation before sending it and decode it again in the worker. This allows me to send the file directly through Celery.
The following example shows how (I intentionally wrote each conversion as a separate step, though this could be arranged in a more Pythonic way):
import base64
import os
import tempfile

# (Django, HTTP server)
file = request.FILES['files'].file
file_bytes = file.read()
file_bytes_base64 = base64.b64encode(file_bytes)
file_bytes_base64_str = file_bytes_base64.decode('utf-8')  # this is a str

# (...send string through Celery...)

# (Celery worker task)
file_bytes_base64 = file_bytes_base64_str.encode('utf-8')
file_bytes = base64.b64decode(file_bytes_base64)

# Write the file to a temporary location, deletion is guaranteed
with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_file = os.path.join(tmp_dir, 'something.zip')
    with open(tmp_file, 'wb') as f:
        f.write(file_bytes)
    # Process the file
This can be inefficient for large files but it becomes pretty handy for small/medium sized temporary files.
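To put a number on the inefficiency: Base64 emits four output characters for every three input bytes, so the message grows by about a third before any broker overhead. A small check, with an arbitrary 3000-byte payload chosen for illustration:

```python
import base64

raw = b"\x00" * 3000  # arbitrary payload, for illustration only
encoded = base64.b64encode(raw)

# Four output characters per three input bytes.
overhead = len(encoded) / len(raw)
print(len(raw), len(encoded), round(overhead, 2))
```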

Return Zip file with HttpResponse using StringIO, Django, Python

I'm trying to return a zip file with HttpResponse, using StringIO() because I'm not storing it in the DB or on the hard drive.
My issue is that the response returns 200 when I request the file, but the OS never asks me whether I want to save the file, and the file is never saved. I think the browser is receiving the file, because I can see in the Network Activity (inspect panel) that a 6.4 MB file of type zip is returned.
I'm taking a .step file (a text file) from a URL stored in the DB, extracting the content, zipping it, and returning it; that's all.
This is my code:
def function(request, url_file = None):
    #retrieving info
    name_file = url_file.split('/')[-1]
    file_content = urllib2.urlopen(url_file).read()
    stream_content = StringIO(file_content)
    upload_name = name_file.split('.')[0]
    # Create a new stream and write to it
    write_stream = StringIO()
    zip_file = ZipFile(write_stream, "w")
    try:
        zip_file.writestr(name_file, stream_content.getvalue().encode('utf-8'))
    except:
        zip_file.writestr(name_file, stream_content.getvalue().encode('utf-8', 'ignore'))
    zip_file.close()
    response = HttpResponse(write_stream.getvalue(), mimetype="application/x-zip-compressed")
    response['Content-Disposition'] = 'attachment; filename=%s.zip' % upload_name
    response['Content-Language'] = 'en'
    response['Content-Length'] = write_stream.tell()
    return response

csv download doesn't occur without error in Django

I am implementing a CSV download function, referring to this page.
I don't get any error message, but the CSV file is not downloaded.
Does anyone know what is wrong with this implementation?
Below is the code to download the CSV file from the database.
class timeCSVexport(View):
    def get(self,request,pk,keyword):
        key=keyword.replace("_"," ")
        queryset=timeseries.objects.filter(html__pk=pk).filter(keyword=key)
        bio = BytesIO()
        data=json.loads(list(queryset)[0].df)
        df=pd.DataFrame.from_dict(data,orient='index').T
        df.index=pd.to_datetime(df.index)
        df1=df.sort_index()
        sheet=key[:31] if len(key)>31 else key
        print (sheet)
        writer=pd.ExcelWriter(bio,engine='xlsxwriter')
        df1.to_excel(writer,sheet_name=sheet)
        writer.save()
        bio.seek(0)
        workbook=bio.getvalue()
        response = StreamingHttpResponse(workbook,content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
        response['Content-Disposition'] = 'attachment; filename=%s' % pk
        return response

Django download file empty

I am writing a simple function for downloading a certain file from the server to my machine.
The file is uniquely identified by its id. The file is located correctly and the download completes, but the downloaded file (though named like the one on the server) is empty.
My download function looks like this:
def download_course(request, id):
    course = Courses.objects.get(pk = id).course
    path_to_file = 'root/cFolder'
    filename = __file__ # Select your file here.
    wrapper = FileWrapper(file(filename))
    content_type = mimetypes.guess_type(filename)[0]
    response = HttpResponse(wrapper, content_type = content_type)
    response['Content-Length'] = os.path.getsize(filename)
    response['Content-Disposition'] = 'attachment; filename=%s/' % smart_str(course)
    return response
Where could I be going wrong? Thanks!
I answered this question here, hope it helps.
Looks like you're not sending any data (you don't even open the file).
Django has a nice wrapper for sending files (code taken from djangosnippets.org):
def send_file(request):
    """
    Send a file through Django without loading the whole file into
    memory at once. The FileWrapper will turn the file object into an
    iterator for chunks of 8KB.
    """
    filename = __file__ # Select your file here.
    wrapper = FileWrapper(file(filename))
    response = HttpResponse(wrapper, content_type='text/plain')
    response['Content-Length'] = os.path.getsize(filename)
    return response
so you could use something like response = HttpResponse(FileWrapper(file(path_to_file)), mimetype='application/force-download').
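To illustrate the chunking described in the docstring, the sketch below uses the standard library's wsgiref.util.FileWrapper, which follows the same idea of turning a file object into an iterator of fixed-size blocks. Treat it as an illustration of the technique rather than the exact class imported in the answer; the 20000-byte payload is arbitrary.

```python
import io
from wsgiref.util import FileWrapper

payload = b"x" * 20000  # arbitrary in-memory "file", for illustration only

# FileWrapper reads the underlying file object in blocks of blksize bytes.
wrapper = FileWrapper(io.BytesIO(payload), blksize=8192)
chunks = list(wrapper)
chunk_sizes = [len(chunk) for chunk in chunks]

print(chunk_sizes)
```

Note that the file must actually be opened, in binary mode for non-text content, before it is wrapped; on Python 3 that means open(path, 'rb') rather than the Python 2 file() builtin used above.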
If you are really using lighttpd (because of the "X-Sendfile" header), you should check the server and FastCGI configuration, I guess.
Try one of these approaches:
1) Disable GZipMiddleware if you are using it;
2) Apply a patch to django/core/servers/basehttp.py described in
https://code.djangoproject.com/ticket/6027