Cannot read csv files uploaded using pandas in Django - django

I have the following view defined in my Django views:-
def csv_file_upload(request):
    if request.method == "POST" and request.FILES['file_upload']:
        registry = request.POST.get('reg_select').lower()
        csv_file = request.FILES['file_upload']
        data = pd.read_csv(csv_file, delimiter="\|\|")
        print(data.head())
    return render(request, "csv_file_upload.html", {})
But the pd.read_csv part is giving me this error:-
cannot use a string pattern on a bytes-like object
The sample csv file that I have is like this:
Col_A||Col_B||Col_C
A0||B0||C0
A1||B1||C1
I can read the same file using pd.read_csv() outside Django and do not get this error.
Why is this error being caused when using Django?

Files are uploaded as bytes, not as the string that the delimiter pattern expects.
You should read the file and decode its content to a string first:
import io

csv_bytes = request.FILES['file_upload'].read()
csv_text = csv_bytes.decode('utf-8')
string_buffer = io.StringIO(csv_text)
data = pd.read_csv(string_buffer, delimiter=r"\|\|")
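To see where the message comes from, here is a minimal sketch using only the standard library. It assumes the multi-character delimiter is matched as a regular expression; the helper name split_row and the sample bytes are made up for illustration. A text pattern applied to bytes raises this TypeError, and decoding first avoids it.

```python
import io
import re

# Hypothetical stand-in for the bytes that arrive with an upload.
uploaded_bytes = b"Col_A||Col_B||Col_C\nA0||B0||C0\nA1||B1||C1\n"

DELIMITER = re.compile(r"\|\|")  # a text (str) pattern, like the regex delimiter


def split_row(line):
    """Split one line on the '||' delimiter."""
    return DELIMITER.split(line)


# Applying a str pattern to bytes fails in the same way as in the question.
try:
    split_row(uploaded_bytes.splitlines()[0])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Decoding first gives the pattern text to work on.
text_buffer = io.StringIO(uploaded_bytes.decode("utf-8"))
rows = [split_row(line.rstrip("\n")) for line in text_buffer]
```

The same idea applies to the view: decode the upload, wrap it in io.StringIO, and hand that buffer to pd.read_csv.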

Related

How to zip multiple uploaded file in Django before saving it to database?

I am trying to compress a folder before saving it to database/file storage system using Django. For this task I am using ZipFile library. Here is the code of view.py:
class BasicUploadView(View):
    def get(self, request):
        file_list = file_information.objects.all()
        return render(self.request, 'fileupload_app/basic_upload/index.html', {'files': file_list})

    def post(self, request):
        zipfile = ZipFile('test.zip', 'w')
        if request.method == "POST":
            for upload_file in request.FILES.getlist('file'):  ## index.html name
                zipfile.write(io.BytesIO(upload_file))
                fs = FileSystemStorage()
                content = fs.save(upload_file.name, upload_file)
                data = {'name': fs.get_available_name(content), 'url': fs.url(content)}
        zipfile.close()
        return JsonResponse(data)
But I am getting the following error:
TypeError: a bytes-like object is required, not 'InMemoryUploadedFile'
Is there any solution for this problem? Since I may have to upload folder with large files, do I have to write a custom TemporaryFileUploadHandler for this purpose? I have recently started working with Django and it is quite new to me. Please help me with some advice.
InMemoryUploadedFile is an object that contains more than just the file, so you should open it and read its content (InMemoryUploadedFile.file is the underlying file):
InMemoryUploadedFile.open()
Open the file with open() and then read() its content. Also check that the files were uploaded correctly. You could use the with syntax for both the zip and the file:
https://www.pythonforbeginners.com/files/with-statement-in-python
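As a rough sketch of that advice, the snippet below reads each file-like object and writes its bytes into a zip archive using with blocks. The function name zip_uploads and the io.BytesIO stand-ins are hypothetical; in a view, the pairs would come from request.FILES.getlist('file').

```python
import io
import zipfile


def zip_uploads(uploads):
    """Pack (name, file-like) pairs into an in-memory zip and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, upload in uploads:
            # read() yields the raw bytes that the archive needs
            archive.writestr(name, upload.read())
    return buffer.getvalue()


# Hypothetical stand-ins for the uploaded file objects
fake_uploads = [
    ("a.txt", io.BytesIO(b"first file")),
    ("b.txt", io.BytesIO(b"second file")),
]
zip_bytes = zip_uploads(fake_uploads)

# Re-open the archive to check what was stored
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    names = archive.namelist()
    first = archive.read("a.txt")
```

Note that read() loads each whole file into memory, so for very large uploads a chunked approach may be a better fit.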

Django Doc Convert to HTML

I am looking for an HTML converter that allows me to convert .doc files to HTML in my Django project.
In my project, .docx files can be converted but not .doc files.
.docx file processing was done as follows.
view.py:
@csrf_exempt
@api_view(['POST'])
def fileload(request):
    if request.method == 'POST' and request.FILES['file']:
        file = request.FILES['file']  # the uploaded file object
        urls = settings.MEDIA_ROOT+'fileload/'
        fs = FileSystemStorage(location=urls, base_url=urls)
        filename = fs.save(file.name, file)
        filepath = urls + file.name
        ext = os.path.splitext(filepath)[1]
        print(ext)
        html = None
        code = '0'
        if ext == '.docx':
            html = get_docx_html(filepath)
            code = '1'
        fs.delete(file.name)
        data = {
            'code': code,
            'html': html
        }
        response = JsonResponse(data)
        return response

def get_docx_html(path):
    with open(path, "rb") as docx_file:
        result = mammoth.convert_to_html(docx_file)
        html = result.value
        messages = result.messages
    return html
In the same way, doc files are not converted.
I'd like to have the .doc file converted.
Any idea of approach that can be recommended or sample code? Thanks a lot.

Cannot upload file to Amazon S3 using Boto

I'm using Flask / Heroku and the Boto library. I want the uploaded file to be saved in my S3...
@app.route("/step3/", methods = ["GET", "POST"])
def step3():
    if request.method == "GET":
        return render_template("step3.html")
    else:
        file = request.files['resume']
        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
            k = Key(S3_BUCKET)
            k.key = "TEST"
            k.set_contents_from_filename(file)
            return redirect(url_for("preview"))
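but this gives me the following error:
TypeError: coercing to Unicode: need string or buffer, FileStorage found
set_contents_from_filename expects a path string, but file is a FileStorage object. To write it, you need to pass the file's contents as a string, which means reading the file after it has been opened, for example with k.set_contents_from_string(file.read()).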

How to validate contents of a CSV file using Django forms

I have a web app that needs to do the following:
Present a form to request a client side file for CSV import.
Validate the data in the CSV file or ask for another filename.
At one point, I was doing the CSV data validation in the view, after the form.is_valid() call from getting the filename (i.e. I have the imported CSV file into memory in a dictionary using csv.DictReader). After running into problems trying to pass errors back to the original form, I'm now trying to validate the CONTENTS of the CSV file in the form's clean() method.
I'm currently stumped on how to access the in memory file from clean() as the request.FILES object isn't valid. Note that I have no problems presenting the form to the client browser and then manipulating the resulting CSV file. The real issue is how to validate the contents of the CSV file - if I assume the data format is correct I can import it to my models. I'll post my forms.py file to show where I currently am after moving the code from the view to the form:
forms.py
import csv
from django import forms
from io import TextIOWrapper

class CSVImportForm(forms.Form):
    filename = forms.FileField(label='Select a CSV file to import:',)

    def clean(self):
        cleaned_data = super(CSVImportForm, self).clean()
        f = TextIOWrapper(request.FILES['filename'].file, encoding='ASCII')
        result_csvlist = csv.DictReader(f)
        # first line (only) contains additional information about the event
        # let's validate that against its form definition
        event_info = next(result_csvlist)
        f_eventinfo = ResultsForm(event_info)
        if not f_eventinfo.is_valid():
            raise forms.ValidationError("Error validating 1st line of data (after header) in CSV")
        return cleaned_data

class ResultsForm(forms.Form):
    RESULT_CHOICES = (('Won', 'Won'),
                      ('Lost', 'Lost'),
                      ('Tie', 'Tie'),
                      ('WonByForfeit', 'WonByForfeit'),
                      ('LostByForfeit', 'LostByForfeit'))
    Team1 = forms.CharField(min_length=10, max_length=11)
    Team2 = forms.CharField(min_length=10, max_length=11)
    Result = forms.ChoiceField(choices=RESULT_CHOICES)
    Score = forms.CharField()
    Event = forms.CharField()
    Venue = forms.CharField()
    Date = forms.DateField()
    Div = forms.CharField()
    Website = forms.URLField(required=False)
    TD = forms.CharField(required=False)
I'd love input on what's the "best" method to validate the contents of an uploaded CSV file and present that information back to the client browser!
I assume the line where you want to access the file is this one inside the clean method:
f = TextIOWrapper(request.FILES['filename'].file, encoding='ASCII')
You can't use that line because request doesn't exist inside the form, but you can access your form's fields, so you can try this instead:
f = TextIOWrapper(self.cleaned_data.get('filename'), encoding='ASCII')
Since you call the parent clean() on the first line of your method, that should work. Then, if you want to add a custom error message to your form, you can do it like this:
from django.forms.utils import ErrorList  # django.forms.util on old Django versions
errors = form._errors.setdefault("filename", ErrorList())
errors.append(u"CSV file incorrect")
Hope it helps.
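Independent of Django, the core of the check (wrap the binary upload in a text layer, read the first data row, collect errors) might look like the sketch below. The function name validate_first_row, the reduced column set, and the io.BytesIO stand-ins are assumptions for illustration; in the form, the returned messages would be attached to the filename field.

```python
import csv
import io

REQUIRED_COLUMNS = {"Team1", "Team2", "Result"}  # hypothetical subset of the fields
VALID_RESULTS = {"Won", "Lost", "Tie", "WonByForfeit", "LostByForfeit"}


def validate_first_row(binary_file, encoding="ascii"):
    """Return a list of error messages for the first data row of a CSV upload."""
    text = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    reader = csv.DictReader(text)
    try:
        row = next(reader)
    except StopIteration:
        return ["CSV file has no data rows"]
    errors = []
    missing = REQUIRED_COLUMNS - set(row)
    if missing:
        errors.append("Missing columns: " + ", ".join(sorted(missing)))
    elif row["Result"] not in VALID_RESULTS:
        errors.append(f"Invalid Result: {row['Result']!r}")
    return errors


# io.BytesIO stands in for the uploaded file's binary stream
good = io.BytesIO(b"Team1,Team2,Result\nLions,Tigers,Won\n")
bad = io.BytesIO(b"Team1,Team2,Result\nLions,Tigers,Draw\n")
good_errors = validate_first_row(good)
bad_errors = validate_first_row(bad)
```

Returning a list of messages rather than raising on the first problem makes it easier to report everything back to the browser in one pass.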

django RequestFactory file upload

I try to create a request, using RequestFactory and post with file, but I don't get request.FILES.
from django.test.client import RequestFactory
from django.core.files import temp as tempfile
tdir = tempfile.gettempdir()
file = tempfile.NamedTemporaryFile(suffix=".file", dir=tdir)
file.write(b'a' * (2 ** 24))
file.seek(0)
post_data = {'file': file}
request = self.factory.post('/', post_data)
print(request.FILES)  # get an empty request.FILES : <MultiValueDict: {}>
How can I get request.FILES with my file ?
If you open the file first and then assign the open file object to request.FILES, you can access your file.
request = self.factory.post('/')
with open(file, 'r') as f:
    request.FILES['file'] = f
    request.FILES['file'].read()
Now you can access request.FILES like you normally would. Remember that once you leave the with block, request.FILES['file'] will be a closed file object.
I made a few tweaks to @Einstein's answer to get it to work for a test that saves the uploaded file in S3:
request = request_factory.post('/')
with open('my_absolute_file_path', 'rb') as f:
    request.FILES['my_file_upload_form_field'] = f
    request.FILES['my_file_upload_form_field'].read()
    f.seek(0)
    ...
Without opening the file as 'rb', I was getting some unusual encoding errors with the file data.
Without f.seek(0), the file that I uploaded to S3 was zero bytes.
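The zero-byte result is consistent with how file positions work: after a read() the position sits at the end of the stream, so a second read returns nothing until the file is rewound. A small sketch, with io.BytesIO standing in for the file opened in 'rb' mode:

```python
import io

# io.BytesIO stands in for the file opened in 'rb' mode
f = io.BytesIO(b"example upload contents")

first_read = f.read()    # consumes the stream; position is now at the end
second_read = f.read()   # nothing left to read, so this is empty
f.seek(0)                # rewind to the start
after_rewind = f.read()  # full contents are available again
```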
You need to provide a proper content type and a proper file object before updating FILES.
from django.core.files.uploadedfile import File
# Let django know we are uploading files by stating content type
content_type = "multipart/form-data; boundary=------------------------1493314174182091246926147632"
request = self.factory.post('/', content_type=content_type)
# Create file object that contain both `size` and `name` attributes
my_file = File(open("/path/to/file", "rb"))
# Update FILES dictionary to include our new file
request.FILES.update({"field_name": my_file})
The boundary=------------------------1493314174182091246926147632 part belongs to the multipart form content type. I copied it from a POST request made by my web browser.
All the previous answers didn't work for me. This seems to be an alternative solution:
from django.core.files.uploadedfile import SimpleUploadedFile
with open(file, "rb") as f:
    file_upload = SimpleUploadedFile("file", f.read(), content_type="text/html")
data = {
    "file": file_upload
}
request = request_factory.post("/api/whatever", data=data, format='multipart')
Be sure that 'file' is really the name of your file input field in your form.
I got that error when it was not (use name, not id_name)