Using Scrapy DjangoItem with Django best way - django

I am new to Django and Scrapy, and to programming in general. I am trying to build a Django site to help me learn.
What I want to do is scrape product information from different sites and store it in my Postgres database using DjangoItem from Scrapy.
I have read all the docs for both Scrapy and Django. I have searched here and on other sites for a couple of days and couldn't find anything that made the light bulb go on.
Anyway, my question is: what is the standard way of deploying Scrapy and Django together? Ideally I would like to scrape 5-10 different sites and store their information in my database.
Scrapy's docs are a little short on information on the best way to implement DjangoItem.
1) Should the Scrapy project be inside my Django app, at the root level of my Django project, or outside altogether?
2) Other than setting DjangoItem to my Django model, do I need to change any other settings?
Thanks
Brian

I generally put my Scrapy project somewhere inside my Django project root folder. Just remember that both projects need to be on the Python path. This is easy to do if you are using virtualenv properly.
Aside from that, as long as you can import your Django models from Scrapy, I think everything else in the Scrapy docs is very clear. When you import your Django models, Django reads its settings, so your database connection etc. should work fine as long as it already works in Django. (On Django 1.7 and later, DJANGO_SETTINGS_MODULE must be set and django.setup() called before the models are imported.)
The only real trick is getting the Python path set up properly (which is probably a topic for another question).
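A minimal sketch of the path setup described above, assuming a hypothetical layout in which the Scrapy project sits inside the Django project root and the settings module is called mysite.settings (both names are illustrative, not from the original post). Placing these calls near the top of the Scrapy project's settings.py is one common choice, not the only one.

```python
import os
import sys
from pathlib import Path


def add_django_project_to_path(project_root, settings_module):
    """Put the Django project root on sys.path and name its settings module.

    Both arguments are placeholders for your own layout.
    """
    root = str(Path(project_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    # setdefault keeps any value that was already exported in the shell
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)
    return root


def setup_django():
    """Initialise Django's app registry (needed on Django 1.7 and later)."""
    import django  # imported lazily so the path helper works without Django

    django.setup()


# Possible use near the top of the Scrapy settings.py (hypothetical names):
# add_django_project_to_path(Path(__file__).resolve().parents[2], "mysite.settings")
# setup_django()
```

After both calls have run, the Django models can be imported from the Scrapy items module as usual.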

Related

How to make Django-Oscar appear as a separate app in Django admin panel?

Good day y'all!
Here comes a basic Django question:
I was running a Django project with one app in it and installed Django-Oscar to cover my ecommerce needs. So, I pip installed it in my main project and set everything up the way they explain it on readthedocs. Now, the structure of my admin panel looks like this:
Main project
My app
Oscar Address
Oscar Analytics
Oscar ...
And I'd like it to be:
Main project
My app
Shop
Oscar Address
Oscar Analytics
Oscar ...
I already did django-admin startapp shop for that matter.
Apparently the question is so obvious that I can't find any tutorials for dummies to do this.
Can someone point me in the right direction? Maybe a generic tutorial about including apps in apps the right way is lying around somewhere?
Thank you in advance.
It isn't clear from your question what you're actually trying to achieve, but Oscar does not really 'play nice' with the Django admin. From the docs:
But please note that Oscar makes no attempts at having [admin] be a workable interface; admin integration exists to ease the life of developers.
However, if you're just looking to configure the basic oscar functionality there is the Oscar dashboard (at /dashboard) which is where the ecommerce functionality is configurable. That can also be customised with additional views.

Auto populate django models and views while creating an app

I am pretty new to Django and still learning. I know that an empty app can be created in Django and that I can then add models and views. But over a long time of making websites, I have noticed that most of the time I am performing database operations such as update, delete, or insert. My question is: is there any way to pass the model fields when creating the app in Django, so that it automatically generates the models, forms, and views for the basic operations I mentioned above?
That way I could define the database architecture while creating the app, and all that would be left is to review the models and make changes if needed.
To make sure I am understood, here is a snippet of what I am trying to say. It is not valid, but I hope it gives a better idea:
// Not real code; its only purpose is to illustrate the question
python manage.py startapp test --fields id=int verified=bool
Or, if there is not, is there any zsh plugin or third-party plugin that might be helpful?
I should also mention that the need comes from a current project, a one-page application that has lots of tables but no operations on them. Writing the models by hand every time would be tedious and not the right way. Any other suggestions are also greatly appreciated.

Wordpress database integration/sync with Django

My company will be rolling out a new website to accompany our product launch and would like to switch over to Wordpress as our content management system. We will be utilizing a Wordpress theme that will allow users to create their own virtual events without having to log into the Wordpress dashboard (back-end). This event information will be displayed on the website for other users to view and register - this is all built into the theme we have purchased.
These virtual events will be held on our software platform, which is built on Django. We would like to utilize Wordpress to manage the login and event creation process, but would also like to have event information displayed on the Wordpress site AND imported to the Django database as well.
For example: Users will need to submit three items on the front-end Wordpress site to create an event: Title, Host Name, and Start Time. When that information is submitted can it be automatically duplicated to the Django database in addition to it being sent to the WP database?
I have already done some research on this matter, but what I have found thus far might not work for our needs. I found this presentation by Collin Anderson - it is similar to what we want to achieve, but I believe the application is a little different: http://www.confreaks.com/videos/4493-DjangoCon2014-integrating-django-and-wordpress-can-be-simple.
I have a lot of experience with Wordpress, but very limited experience with Django. This question is more for research purposes than a "how-to". We want to know if we can continue to plan on heading in the Wordpress direction or if we should seek alternative methods for our site. I appreciate you taking a moment to answer my question.
I'm working on something similar at the moment and found a good starting point was this:
http://agiliq.com/blog/2010/01/wordpress-and-django-best-buddies/
That way, as dan-klasson suggests, you can use the same database for both the WP side and the Django side.
In short: first of all, take a backup of the WP database in case anything goes wrong.
Create a new Django project and set your settings.py to use the WP database.
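As a rough sketch, the settings fragment for this step might look like the following. WordPress normally runs on MySQL, so the MySQL backend is assumed here; the database name, user, password, host, and port are all placeholders.

```python
# settings.py fragment (every connection value below is a placeholder)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # assumes the WP site uses MySQL
        "NAME": "wordpress",      # name of the existing WP database
        "USER": "wp_user",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "3306",
    }
}
```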
In this new Django project you can use ./manage.py inspectdb > models.py to autogenerate a models.py file from the WP database. Be careful here, as there are differences between WP and Django conventions. You will need to manually alter some of the autogenerated models.py. Django supplies the db_table Meta option and the db_column field argument, which let you use your own model and field names on the Django side while still pointing at the existing tables and columns.
You can then create a new Django app in your Django project and place the models.py you've created in there. This new app will be using the same data as your Wordpress site. I'm not sure exactly what you want to do, but I would be very, very careful about having Wordpress and Django access the same data simultaneously. You may want to set the Django side as read only.
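One possible way to approximate the read-only idea is a database router. The sketch below assumes the inspectdb models live in an app with the hypothetical label "wp". It is a blunt guard rather than a guarantee, since raw SQL bypasses routers; a database user with read-only privileges is a more robust option.

```python
class ReadOnlyWordPressRouter:
    """Database router sketch that keeps Django from writing to the WP tables.

    "wp" is a hypothetical app label for the app holding the inspectdb models.
    """

    app_label = "wp"

    def db_for_read(self, model, **hints):
        # None means "no opinion": Django falls back to the default database
        return None

    def db_for_write(self, model, **hints):
        # Refuse ORM writes (save, update, delete) for models of the WP app
        if model._meta.app_label == self.app_label:
            raise RuntimeError("WordPress tables are read-only from Django")
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Never create or alter the WP tables through Django migrations
        if app_label == self.app_label:
            return False
        return None


# settings.py would then list the router (the dotted path is a placeholder):
# DATABASE_ROUTERS = ["myproject.routers.ReadOnlyWordPressRouter"]
```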
You can then add other apps to extend the django side of things as you wish.
I should point out that I haven't completed my work on this yet but so far so good. I'll update as I find sticking points etc.

django admin django.contrib.staticfiles

I'm following the tutorial contained here:
http://www.djangobook.com/en/2.0/chapter06.html
They say that the admin site should look like this:
http://www.djangobook.com/en/2.0/_images/admin_index.png
When I start the admin site, though, it looks really simplistic, just plain text and links:
Django administration
Welcome, admin. Change password / Log out
Site administration
Auth
Groups Add Change
Users Add Change
Recent Actions
My Actions
None available
I noticed that it looks all nice, like in the link, when I uncomment django.contrib.staticfiles in INSTALLED_APPS, although that wasn't mentioned in the tutorial... Can someone please explain this behavior to me?
Thank you for your help!
The Django Book is a little out of date (although an update is in the works I believe):
This book was originally published by Apress in 2009, and covered Django 1.0. Since then, it’s languished. We’re working on getting the book updated to cover Django 1.4, 1.5, and beyond
Static files are all the CSS, JS, and images that your site (and the Django admin) uses. They need to be collected and placed somewhere that your server (or development server) can serve them. This is the job of django.contrib.staticfiles.
You can read more about this in the 'Managing Static Files' documentation:
Websites generally need to serve additional files such as images, JavaScript, or CSS. In Django, we refer to these files as “static files”. Django provides django.contrib.staticfiles to help you manage them.
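For reference, a minimal settings fragment that enables static file handling might look like this. The app list mirrors a typical default project and /static/ is the conventional URL prefix; both are assumptions to adjust for your own project.

```python
# settings.py fragment
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    # With this app enabled, runserver serves the admin's CSS and JS while DEBUG is on
    "django.contrib.staticfiles",
]

# URL prefix under which static files are served
STATIC_URL = "/static/"
```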

How to port from Drupal to Django?

What would be the best way to port an existing Drupal site to a Django application?
I have around 500 pages (mostly books module) and around 50 blog posts. I'm not using any 3rd party modules.
I would like to keep the current URLs (for SEO purposes) and migrate the database to Django. I will create a simple blog application, so migrating blog posts should be ok. What would be the best way to serve 500+ pages with Django? I would like to use the Admin to edit/add new pages.
All Django development is similar, and yours will fit the pattern.
Define the Django model for your books and blog posts.
Unit test that model using Django's built-in testing capabilities.
Write some small utilities to load your legacy data into Django. At this point, you'll realize that your Django model isn't perfect. Good. Fix it. Fix the tests. Redo the loads.
Configure the default admin interface to your model. At this point, you'll spend time tweaking the admin interface. You'll realize your data model is wrong. Which is a good thing. Fix your model. Fix your tests. Fix your loads.
Now that your data is correct, you can create templates from your legacy pages.
Create URL mappings and view functions to populate the templates from the data model.
Take the time to get the data model right. It really matters, because everything else is very simple if your data model is solid.
It may be possible to write Django models which work with the legacy database (I've done this in the past; see docs on manage.py inspectdb).
However, I'd follow the advice above and design a clean database using Django conventions, and then migrate the data over. I usually write migration scripts which write to the new database through Django and read the old one using the raw Python DB APIs (although it is also possible to tie Django to multiple databases simultaneously).
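A minimal sketch of the read side of such a migration script, assuming the legacy rows can be reached through any DB-API connection. Here sqlite3 stands in for the real driver so the example runs anywhere, and the table name, columns, and save callback are hypothetical; in a real script save would create a Django model instance.

```python
import sqlite3


def migrate_rows(legacy_conn, query, save):
    """Read rows from a DB-API connection and hand each one to `save` as a dict."""
    cursor = legacy_conn.cursor()
    cursor.execute(query)
    # cursor.description holds one sequence per column; item 0 is the column name
    columns = [col[0] for col in cursor.description]
    count = 0
    for row in cursor.fetchall():
        save(dict(zip(columns, row)))
        count += 1
    return count


# Demonstration with an in-memory stand-in for the legacy database
legacy = sqlite3.connect(":memory:")
legacy.execute("CREATE TABLE legacy_page (id INTEGER, title TEXT, body TEXT)")
legacy.executemany(
    "INSERT INTO legacy_page VALUES (?, ?, ?)",
    [(1, "Intro", "Hello"), (2, "Chapter 1", "Text")],
)

imported = []
# In a real script `save` could be: lambda data: Page.objects.create(**data)
migrated = migrate_rows(
    legacy, "SELECT id, title, body FROM legacy_page ORDER BY id", imported.append
)
```

Passing save in as a callback keeps the reader independent of Django, so the same function can be exercised against test data before it touches the new database.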
I also suggest taking a look at the available blogging apps for Django. If the one included in Pinax suits your need, go ahead and use Pinax as a starting point.
S.Lott's answer is still valid years later; I will try to complete the analysis with the tools and formats to do the job.
There are many Drupal export tools out there by now, but for this very same task I went with Views Datasource, choosing JSON as the format. This module is very solid and available for the latest version of Drupal. The JSON format is fast to parse and encode, easy to read, and very Python-friendly (import json).
Using Views Datasource you can create a node view sorted by node id (nid), show a limited number of elements per page, configure a view path, add a filter identifier to it, and pass it the nid so that you read all elements until you get an empty JSON response.
When importing into Django you have a wide set of tools as well, starting with loaddata to load fixtures. Views Datasource exports JSON, but not in the format Django expects for fixtures, so you can write a custom management command to do the import, which gives you full control of the import flow.
You can start your command by passing nid=0 as an argument and then let the procedure read, import, and fetch the next page by simply passing the last nid read in the previous HTTP request. You can even restrict access to the view's path, but that needs additional configuration on the import side.
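The paging loop described above could be sketched roughly as follows. The fetch_page and save_node callables are hypothetical stand-ins for the HTTP request and the Django model save, and the flat list of node dicts is an assumed payload shape; the real JSON depends on how the view is configured.

```python
import json


def import_nodes(fetch_page, save_node, start_nid=0):
    """Fetch pages of nodes until an empty page comes back.

    fetch_page(nid) is assumed to return a JSON string holding a list of
    node dicts whose nid is greater than the given one, sorted by nid.
    """
    last_nid = start_nid
    total = 0
    while True:
        nodes = json.loads(fetch_page(last_nid))
        if not nodes:
            break  # an empty response means everything has been read
        for node in nodes:
            save_node(node)
            total += 1
        # The next request starts after the last nid seen on this page
        last_nid = int(nodes[-1]["nid"])
    return total


# Fake fetcher standing in for the HTTP request to the Drupal view
_FAKE_NODES = [{"nid": n, "title": f"Node {n}"} for n in (1, 2, 5, 8, 9)]


def fake_fetch(nid, page_size=2):
    page = [n for n in _FAKE_NODES if n["nid"] > nid][:page_size]
    return json.dumps(page)


saved = []
count = import_nodes(fake_fetch, saved.append)
```

Inside a management command, the body of handle() would call import_nodes with a fetcher that performs the real HTTP request.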
Regarding performance, as an example, I parsed and imported 15.000+ nodes in less than 10 minutes via a Django 1.8 custom management command on an 8-core / 8 GB Linux virtual machine with PostgreSQL as the DBMS, logging success and error information into a custom model for each node.
These are the basics of import/export between these two platforms. For detailed information, I described all the major steps for exporting from Drupal and then importing into Django in this guide.