I have some CSV data and I want to import it into Django models. Example CSV data:
1;"02-01-101101";"Worm Gear HRF 50";"Ratio 1 : 10";"input shaft, output shaft, direction A, color dark green";
2;"02-01-101102";"Worm Gear HRF 50";"Ratio 1 : 20";"input shaft, output shaft, direction A, color dark green";
3;"02-01-101103";"Worm Gear HRF 50";"Ratio 1 : 30";"input shaft, output shaft, direction A, color dark green";
4;"02-01-101104";"Worm Gear HRF 50";"Ratio 1 : 40";"input shaft, output shaft, direction A, color dark green";
5;"02-01-101105";"Worm Gear HRF 50";"Ratio 1 : 50";"input shaft, output shaft, direction A, color dark green";
I have a Django model named Product, with fields such as name, description and price. I want something like this:
product=Product()
product.name = "Worm Gear HRF 70(02-01-101116)"
product.description = "input shaft, output shaft, direction A, color dark green"
product.price = 100
You want to use the csv module that is part of the Python standard library, and you should use Django's get_or_create method:
import csv

with open(path) as f:
    reader = csv.reader(f)
    for row in reader:
        _, created = Teacher.objects.get_or_create(
            first_name=row[0],
            last_name=row[1],
            middle_name=row[2],
        )
        # creates a tuple of the new object or
        # current object and a boolean of if it was created
In my example the model Teacher has three fields: first_name, last_name and middle_name.
See the Django documentation for the get_or_create method.
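Adapted to the Product model from the question, a rough sketch could look like the following (the CSV file name, the app import path and the choice of lookup fields are assumptions):
import csv
from myapp.models import Product  # assumed app name

with open('products.csv') as f:
    reader = csv.reader(f, delimiter=';')
    for row in reader:
        _, created = Product.objects.get_or_create(
            name='%s(%s)' % (row[2], row[1]),
            description=row[4],
            defaults={'price': 100},  # price is not in the CSV; 100 taken from the question
        )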
If you want to use a library, a quick google search for csv and django reveals two libraries - django-csvimport and django-adaptors. Let's read what they have to say about themselves...
django-adaptors:
Django adaptor is a tool which allow you to transform easily a CSV/XML
file into a python object or a django model instance.
django-csvimport:
django-csvimport is a generic importer tool to allow the upload of CSV
files for populating data.
The first requires you to write a model to match the csv file, while the second is more of a command-line importer. That is a huge difference in the way you work with them, and each is better suited to a different type of project.
So which one to use? That depends on which of those will be better suited for your project in the long run.
However, you can also avoid a library altogether by writing your own Django script to import your csv file, something along the lines of the following sketch:
# open file & create csv reader
import csv
# import the relevant model
from myproject.models import Foo

with open('myfile.csv') as f:
    for i, line in enumerate(csv.reader(f, delimiter=';'), start=1):
        # add some custom validation/parsing for some of the fields here
        foo = Foo(fieldname1=line[1], fieldname2=line[2])  # etc.
        try:
            foo.save()
        except Exception:
            # if there's a problem anywhere, you want to know about it
            print("there was a problem with line", i)
It's super easy. Hell, you can do it interactively through the Django shell if it's a one-time import. Just figure out what you want to do with your project and how many files you need to handle, and then, if you decide to use a library, try figuring out which one better suits your needs.
Use the Pandas library to create a dataframe of the csv data.
Name the fields either by including them in the csv file's first line, or in code by using the dataframe's columns attribute.
Then create a list of model instances.
Finally use the django method .bulk_create() to send your list of model instances to the database table.
The read_csv function in pandas is great for reading csv files and gives you lots of parameters to skip lines, omit fields, etc.
import pandas as pd
from app.models import Product

tmp_data = pd.read_csv('file.csv', sep=';')
# ensure fields are named: ID, Product_ID, Name, Ratio, Description
# concatenate Name and Product_ID to make a new field, a la Dr.Dee's answer
products = [
    Product(
        name=tmp_data.ix[row]['Name'],
        description=tmp_data.ix[row]['Description'],
        price=tmp_data.ix[row]['price'],
    )
    for row in tmp_data['ID']
]
Product.objects.bulk_create(products)
I was using the answer by mmrs151, but saving each row (instance) was very slow, and any fields containing the delimiting character (even inside quotes) were not handled by the open() / line.split(';') approach.
Pandas has so many useful features that it is worth getting to know.
You can also use django-adaptors:
>>> from adaptor.model import CsvModel
>>> class MyCsvModel(CsvModel):
...     name = CharField()
...     age = IntegerField()
...     length = FloatField()
...
...     class Meta:
...         delimiter = ";"
You declare a MyCsvModel which will match to a CSV file like this:
Anthony;27;1.75
To import the file or any iterable object, just do:
>>> my_csv_list = MyCsvModel.import_data(data = open("my_csv_file_name.csv"))
>>> first_line = my_csv_list[0]
>>> first_line.age
27
Without an explicit declaration, data and columns are matched in the same order:
Anthony --> Column 0 --> Field 0 --> name
27 --> Column 1 --> Field 1 --> age
1.75 --> Column 2 --> Field 2 --> length
For Django 1.8, which I'm using, I made a command that you can use to create objects dynamically.
You just pass the file path of the csv, the model name and the app name of the relevant Django application, and it will populate the relevant model without you having to specify the field names.
So if we take, for example, the following csv:
field1,field2,field3
value1,value2,value3
value11,value22,value33
it will create the objects
[{field1: value1, field2: value2, field3: value3}, {field1: value11, field2: value22, field3: value33}]
for the model name you pass to the command.
The command code:
from django.core.management.base import BaseCommand
from django.db.models.loading import get_model
import csv


class Command(BaseCommand):
    help = 'Creating model objects according to the file path specified'

    def add_arguments(self, parser):
        parser.add_argument('--path', type=str, help="file path")
        parser.add_argument('--model_name', type=str, help="model name")
        parser.add_argument('--app_name', type=str, help="django app name that the model is connected to")

    def handle(self, *args, **options):
        file_path = options['path']
        _model = get_model(options['app_name'], options['model_name'])
        with open(file_path, 'rb') as csv_file:
            reader = csv.reader(csv_file, delimiter=',', quotechar='|')
            header = next(reader)
            for row in reader:
                _object_dict = {key: value for key, value in zip(header, row)}
                _model.objects.create(**_object_dict)
Note that in later Django versions
from django.db.models.loading import get_model
is deprecated (and eventually removed) and needs to be changed to
from django.apps import apps
with the lookup becoming apps.get_model(options['app_name'], options['model_name']).
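Assuming you save the command as yourapp/management/commands/load_csv.py (the file and app names here are just placeholders), you would invoke it like this:
python manage.py load_csv --path=/path/to/file.csv --model_name=Product --app_name=yourapp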
The Python csv library can do your parsing, and your code can translate the rows into Product instances.
Something like this:
f = open('data.txt', 'r')
for line in f:
    line = line.split(';')
    product = Product()
    product.name = line[2] + '(' + line[1] + ')'
    product.description = line[4]
    product.price = ''  # data is missing from file
    product.save()
f.close()
Write a management command in your Django app, where you provide a CSV file, loop over it, and create a model instance for every row.
your_app_folder/management/commands/ProcessCsv.py
import csv
import os

from django.core.management.base import BaseCommand
from django.conf import settings
from your_app_name.models import Product


class Command(BaseCommand):
    def handle(self, *args, **options):
        with open(os.path.join(settings.BASE_DIR, 'your_csv_file.csv'), 'r') as csv_file:
            csv_reader = csv.reader(csv_file, delimiter=';')
            for row in csv_reader:
                Product.objects.create(name=row[2], description=row[3], price=row[4])
At the end, just run the command to process your CSV file and insert the rows into the Product model.
Terminal:
python manage.py ProcessCsv
That's it.
If you're working with newer versions of Django (1.10+) and don't want to spend time writing the model definition, you can use the ogrinspect tool that ships with GeoDjango.
This will create a code definition for the model.
python manage.py ogrinspect [/path/to/thecsv] Product
The output will be the class (model) definition. In this case the model will be called Product.
You need to copy this code into your models.py file.
Afterwards you need to migrate (in the shell) the new Product table with:
python manage.py makemigrations
python manage.py migrate
More information here:
https://docs.djangoproject.com/en/1.11/ref/contrib/gis/tutorial/
Do note that the example has been done for ESRI Shapefiles, but it works pretty well with standard CSV files as well.
For ingesting your data (in CSV format) you can use pandas.
import pandas as pd
your_dataframe = pd.read_csv(path_to_csv)
# Make a row iterator (this will go row by row)
iter_data = your_dataframe.iterrows()
Now, every row needs to be transformed into a dictionary, and that dict is used to instantiate your model (in this case, Product()):
# python 2.x
map(lambda (i, data): Product.objects.create(**dict(data)), iter_data)
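On Python 3, tuple unpacking in lambda parameters is no longer supported and map() is evaluated lazily, so the equivalent is a plain loop:
# python 3.x: same logic as the map/lambda above, written as an explicit loop
for _, data in iter_data:
    Product.objects.create(**dict(data))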
Done, check your database now.
You can use the django-csv-importer package.
http://pypi.python.org/pypi/django-csv-importer/0.1.1
It works like a Django model:
class MyCsvModel(CsvModel):
    field1 = IntegerField()
    field2 = CharField()
    # etc.

    class Meta:
        delimiter = ";"
        dbModel = Product
And you just have to:
MyCsvModel.import_from_file("my file")
That will automatically create your products.
You can give django-import-export a try. It has nice admin integration with a preview of changes, and it can create, update and delete objects.
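As a minimal sketch of the usual starting point with that library (the app.models import path is an assumption), you define a ModelResource describing what to import:
from import_export import resources
from app.models import Product  # assumed app path


class ProductResource(resources.ModelResource):
    class Meta:
        model = Product
Registering the model in the admin with import_export.admin.ImportExportModelAdmin then gives you the CSV upload and preview screens.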
This is based off of Erik's answer from earlier, but I've found it easiest to read in the .csv file using pandas and then create a new instance of the class for every row in the data frame.
This example is updated to use iloc, since pandas no longer provides ix in the most recent versions. I don't know about Erik's situation, but you need to create the list outside of the for loop, otherwise it will not append to your list but simply overwrite it.
import pandas as pd

df = pd.read_csv('path_to_file', sep=';')  # ';' is the delimiter used in the question's data
products = []
for i in range(len(df)):
    products.append(
        Product(
            name=df.iloc[i][0],
            description=df.iloc[i][1],
            price=df.iloc[i][2],
        )
    )
Product.objects.bulk_create(products)
This just breaks the DataFrame into rows and then selects each column out of each row by zero-based index (i.e. name is the first column, description the second, etc.).
Hope that helps.
Here's a django egg for it:
django-csvimport
Consider using Django's built-in deserializers. Django's docs are well-written and can help you get started. Consider converting your data from csv to XML or JSON and using a deserializer to import the data. If you're doing this from the command line (rather than through a web request), the loaddata manage.py command will be especially helpful.
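For example, a small conversion script (a sketch; the app label catalog and the file names are assumptions) can turn the question's CSV into a JSON fixture that loaddata understands:
import csv
import json

fixture = []
with open('products.csv', newline='') as f:
    for pk, row in enumerate(csv.reader(f, delimiter=';'), start=1):
        fixture.append({
            "model": "catalog.product",  # assumed app_label.model_name
            "pk": pk,
            "fields": {
                "name": "%s(%s)" % (row[2], row[1]),
                "description": row[4],
                "price": 100,  # price is not present in the CSV
            },
        })

with open('products.json', 'w') as out:
    json.dump(fixture, out, indent=2)
Then python manage.py loaddata products.json imports it.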
Define a class in models.py and a function in it:
import csv

class all_products(models.Model):

    def get_all_products():
        items = []
        with open('EXACT FILE PATH OF YOUR CSV FILE', 'r') as fp:
            # You can also put the relative path of the csv file
            # with respect to the manage.py file
            reader1 = csv.reader(fp, delimiter=';')
            for value in reader1:
                items.append(value)
        return items
You can access the i-th element of the list as items[i].
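As a quick usage example (assuming the class above lives in an app called myapp; calling the function on the class like this works on Python 3, on Python 2 you would mark it as a @staticmethod):
from myapp.models import all_products  # assumed app name

rows = all_products.get_all_products()
print(rows[0])  # the first CSV row, as a list of field values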
Related
I have tabulated text files of my data, and there are too many records to upload them one by one into the database.
Is there any way to import that data into the table I created with my model?
I have created a simple script that might be a start for you. This script will read in a csv file and store it in the database. You should be able to modify it to meet your needs by replacing filename.csv with the location of your file, and YourModel with the actual model it represents. You will also need to change obj.field1 = line[0] to match how your columns and fields correspond to each other.
import csv

# Open the csv file and read it into a two-dimensional list
with open('filename.csv', 'r') as f:
    reader = csv.reader(f)
    lines = list(reader)

# Create an empty list of objects of your model
objects = []

# Iterate over each record of the csv file
for line in lines:
    # Create an empty instance of your model
    obj = YourModel()
    # Populate the fields of the model based on the record line of your file
    obj.field1 = line[0]  # The first column
    obj.field2 = line[1]  # The second column
    # Add the model to the list of objects
    objects.append(obj)

# Save all objects simultaneously, instead of saving each line separately
YourModel.objects.bulk_create(objects)
I found some information on extending and changing the save() method on my model, but a few other people mentioned that it was bad practice to do that, and one should instead modify the admin form.
Extracting the audio from a mp4 is easy using moviepy, I just need to run these lines:
from moviepy.editor import *
audio = VideoFileClip("test-file.mp4").audio
audio.write_audiofile("audio.mp3")
However, I do not know where to put this within my model to ensure it gets run and saves the correct file.
My model looks like this:
class MyModel(models.Model):
    audio = models.FileField(upload_to=update_filename)
It's important this code is executed before the file gets saved, and the audio file is the one getting saved to the audio attribute of my model.
Instead of changing the save method, I needed to change the clean method, which is where the data for a model is validated and where one can modify the attributes of a model.
from django.db import models
from django.core.exceptions import ValidationError
from moviepy.editor import *


class MyModel(models.Model):
    audio = models.FileField(upload_to=lambda i, f: f[0:-4] + ".mp3")

    def clean(self):
        super().clean()
        extension = self.audio.name[-4:]
        file = self.audio.file
        if extension != ".mp3" and extension != ".mp4":
            raise ValidationError("Incorrect File Format Uploaded. Valid Formats: (MP3, MP4)")
        elif extension == ".mp4":
            # extract the audio track and point the field at the new file
            file_audio = VideoFileClip(file.temporary_file_path()).audio
            new_file_path = file.temporary_file_path()[:-4] + ".mp3"
            file_audio.write_audiofile(new_file_path)
            file.file.name = new_file_path
It's important to run super().clean() before modifying the model's attributes, because if you run it after modifying an attribute, it will raise a ValidationError.
I have the following file, say prof.xml:
<include>
<param name="xxx" value="yyy"/>
<param name="mmm" value="nnn"/>
</include>
Now I want to create a Django model where the model should look like this:
class prof:
    xxx = models.CharField(verbose_name="XXX", max_length=45)
    mmm = models.CharField(verbose_name="YYY", max_length=100)
i.e. the model fields should have the names of the param names in the XML file, and the values from the XML file should be inserted into the database. How can this be done?
I have done something like this to get the param name from the XML, but I don't know how to create model field names out of that:
import os

files = [file for file in os.listdir(os.path.join(path, 'prof.xml')) if os.path.isfile(file)]
for file in files:
    f = open((os.path.join(path, 'prof.xml')), 'r')
    for line in f.readlines():
        pos1 = line.find("param name")
        pos2 = line.find("value")
        if pos1 >= 0 and pos2 >= 0:
            field_name = line[pos1+12:pos2-2]
I'm not sure you can do that dynamically, as after creating a model you need to run syncdb to create the appropriate tables, etc.
Maybe you could change your design a bit and have a model with key and value fields:
class DataContainer(models.Model):
    key = models.CharField(verbose_name="key", max_length=45)
    value = models.CharField(verbose_name="value", max_length=100)
And have a ManyToMany or ForeignKey relation with your model, like:
class SomeModel(models.Model):
    data = models.ManyToManyField(DataContainer)
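A minimal sketch of populating those two models from prof.xml with the standard library's ElementTree (the myapp import path is an assumption):
import xml.etree.ElementTree as ET

from myapp.models import DataContainer, SomeModel  # assumed app name

tree = ET.parse('prof.xml')
instance = SomeModel.objects.create()
for param in tree.getroot().findall('param'):
    container = DataContainer.objects.create(
        key=param.get('name'),
        value=param.get('value'),
    )
    instance.data.add(container)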
First of all, you shouldn't be parsing your XML by hand. That's a recipe for disaster. Use a library.
Also, I'm going to second Rohan's advice on not trying to create your models dynamically, but it is possible. I do it in tests for libraries, as seen here, but I've never tried it for making permanent tables. I haven't tested this, but something like this might work:
from django.core.management import call_command
from django.db import models


def create_new_model(name, fields):
    new_model = type(name, (models.Model,), fields)
    models.register_models('myapp', new_model)
    call_command('syncdb')
If anyone's crazy enough to try this, please comment and let me know how it goes.
I am trying to use a ModelChoiceField to get the values populated from an external database.
I have added an additional database in my settings.py and have set up an externaldb.py file in my app as follows:
from django.db import connections

def Location():
    rs = []
    cursor = connections['mydb'].cursor()
    cursor.execute("SELECT city FROM db.data_center WHERE removed is null;")
    zones = cursor.fetchall()
    for v in zones:
        rs.append(v)
    return rs
Then, using python manage.py shell, I can do this:
>>>from platform.externaldb import Location
>>>print Location()
>>>[(u'India-01',), (u'Singapore-01',), (u'Europe-01',)]
So I am getting values, but how do I get them to appear in a drop-down box? This is my forms.py:
forms.py
from platform.externaldb import Location
zone = forms.ModelChoiceField(Location(), label='Zone')
But this doesn't work for me. How do I do this so the three values appear in the ModelChoiceField drop-down list?
Thanks - Oli
You could make use of the ChoiceField form field rather than the ModelChoiceField. The problem with using a ModelChoiceField is that it expects a QuerySet. The ChoiceField allows you to add items via a list instead.
locations = forms.ChoiceField(choices=Location(), label="Zone")
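Note that ChoiceField expects an iterable of (value, label) pairs, so the 1-tuples returned by Location() may need reshaping; a minimal sketch with a hypothetical ZoneForm:
from django import forms
from platform.externaldb import Location


class ZoneForm(forms.Form):
    # turn [(u'India-01',), ...] into [(u'India-01', u'India-01'), ...]
    zone = forms.ChoiceField(
        choices=[(city, city) for (city,) in Location()],
        label="Zone",
    )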
EDIT
Previously, I had used the ModelChoiceField:
locations = forms.ModelChoiceField(queryset=Location.objects.all(), label="Zone")
which will work as long as Location is a Model (which I wasn't sure of based on your code)
I've a large number of models (120+) and I would like to let users of my application export all of the data from them in XML format.
I looked at django-piston, but I would like to do this with minimum code. Basically I'd like to have something like this:
GET /export/applabel/ModelName/
This would stream all instances of ModelName in applabel, together with their tree of related objects.
I'd like to do this without writing code for each model.
What would be the best way to do this?
The standard Django dumpdata command is not flexible enough to export single models like this along with their related objects. You can use the makefixture command to do that: http://github.com/ericholscher/django-test-utils/blob/master/test_utils/management/commands/makefixture.py
If I had to do this, as a basic starting point I'd start from something like:
from django.core.management import call_command

def export_view(request, app_label, model_slug):
    # You can do some stuff here with the specified model if needed
    # ct = ContentType.objects.get(app_label=app_label, model=model_slug)
    # model_class = ct.model_class()
    # I don't know if this is a correct specification of params;
    # command line example: python manage.py makefixture --format=xml --indent=4 YourModel[3] auth.User[:5]
    # You'll have to test it out and tweak it
    call_command("makefixture", "file.xml", '%s.%s[:]' % (app_label, model_slug), format='xml')