Django 1.6: creating cube data (OLAP) from PostgreSQL - django

I am not very familiar with OLAP reporting. Is there a Django app or Python package that converts RDBMS (PostgreSQL) data to cube data that can be queried using the Django ORM? I have searched and found solutions such as http://cubes.databrewery.org/ and Jasper, but these seem to be overkill for my use case.

Python Cubes is one of the most lightweight OLAP servers you can find, and it can map your existing relational database into an OLAP schema (without forcing you to transform your data).
In order to query your OLAP cubes, you can use a tool like http://jjmontesl.github.io/cubesviewer/ (disclaimer, I'm the main developer).
Jasper is not really an OLAP server. It's more of a report generation tool.
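To make "cube data" a bit more concrete: at its simplest, a cube query is an aggregate of a measure over a chosen set of dimensions. The sketch below is not the Cubes API; it only illustrates the idea with plain SQL, using the standard-library sqlite3 module in place of PostgreSQL. The `sales` table, its columns, and the `aggregate` helper are hypothetical.

```python
import sqlite3

# Hypothetical fact table; sqlite3 stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (2013, "EU", 100.0),
        (2013, "US", 150.0),
        (2014, "EU", 120.0),
        (2014, "US", 80.0),
    ],
)

ALLOWED_DIMENSIONS = {"year", "region"}


def aggregate(conn, dimensions):
    """Sum the `amount` measure grouped by the given dimensions (a roll-up)."""
    if not set(dimensions) <= ALLOWED_DIMENSIONS:
        raise ValueError("unknown dimension")
    cols = ", ".join(dimensions)
    if cols:
        sql = "SELECT {0}, SUM(amount) FROM sales GROUP BY {0} ORDER BY {0}".format(cols)
    else:
        # No dimensions: the grand total over the whole table.
        sql = "SELECT SUM(amount) FROM sales"
    return conn.execute(sql).fetchall()


by_year = aggregate(conn, ["year"])
by_year_region = aggregate(conn, ["year", "region"])
total = aggregate(conn, [])
```

If this kind of GROUP BY query covers the reporting you need, the Django ORM's `values()` and `annotate()` may be enough on their own; an OLAP server adds things like dimension hierarchies and a query API on top.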

Related

The best way to integrate Django and Scrapy

I know of some approaches, such as scrapy-djangoitem, but as its documentation mentions:
DjangoItem is a rather convenient way to integrate Scrapy projects with Django models, but bear in mind that Django ORM may not scale well if you scrape a lot of items (ie. millions) with Scrapy. This is because a relational backend is often not a good choice for a write intensive applications (such as a web crawler), specially if the database is highly normalized and with many indices.
So what is the best way to use scraped items in db and django models?
It is not about the Django ORM but rather about the database you choose as the backend. What it says is that if you expect to write millions of items to your tables, relational database systems (MySQL, Postgres, ...) might not be your best choice. Performance can get even worse if you add many indices, because your application is write-heavy: the database must update its B-trees or other index structures on every write.
I would suggest sticking with Postgres or MySQL for now and looking for another solution if you start to have performance issues at the database level.
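One common way to reduce per-write overhead on a relational backend is to insert scraped items in batches instead of one row at a time. Below is a minimal sketch of that idea, using the standard-library sqlite3 module rather than Postgres or MySQL; the `page` table and the `insert_in_batches` helper are hypothetical. In Django itself, `Model.objects.bulk_create()` plays a similar role.

```python
import sqlite3


def insert_in_batches(conn, items, batch_size=1000):
    """Insert (url, title) tuples in batches, one transaction per batch."""
    sql = "INSERT INTO page (url, title) VALUES (?, ?)"
    batch = []
    inserted = 0
    for item in items:
        batch.append(item)
        if len(batch) >= batch_size:
            with conn:  # commits on success, rolls back on error
                conn.executemany(sql, batch)
            inserted += len(batch)
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        with conn:
            conn.executemany(sql, batch)
        inserted += len(batch)
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (url TEXT, title TEXT)")

# Stand-in for scraped items; a real crawler would yield these from a pipeline.
items = (("http://example.com/%d" % i, "Page %d" % i) for i in range(2500))
count = insert_in_batches(conn, items, batch_size=1000)
row_count = conn.execute("SELECT COUNT(*) FROM page").fetchone()[0]
```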

Data Warehouse and Django

This is more of an architectural question than a technological one per se.
I am currently building a business website/social network that needs to store large volumes of data and use that data to draw analytics (consumer behavior).
I am using Django and a PostgreSQL database.
Now my question is: I want to expand this architecture to include a data warehouse. The ideal would be: the operational DB would be the current Django PostgreSQL database, and the data warehouse would be something additional, preferably in a multidimensional model.
We are still in a very early phase, we are going to test with 50 users, so something primitive such as a one-column table for starters would be enough.
I would like to know if somebody has experience with this situation and could recommend a framework for creating a data warehouse, all while maintaining the operational DB with the Django models for ease of use (if possible).
Thank you in advance!
Here are some cool Open Source tools I used recently:
Kettle - great ETL tool, you can use this to extract the data from your operational database into your warehouse. Supports any database with a JDBC driver and makes it very easy to build e.g. a star schema.
Saiku - nice Web 2.0 frontend built on Pentaho Mondrian (MDX implementation). This allows your users to easily build complex aggregation queries (think Pivot table in Excel), and the Mondrian layer provides caching etc. to make things go fast. Try the demo here.
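For readers unfamiliar with the term, a star schema keeps measures in a central fact table that points at small dimension tables. The sketch below shows one tiny extract-and-load step in plain SQL, with the standard-library sqlite3 module standing in for both the operational database and the warehouse. It is not how Kettle works internally, and every table and column name here is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Operational table (what the Django models would write to).
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, country TEXT, amount REAL);
INSERT INTO orders (customer, country, amount) VALUES
  ('alice', 'PT', 10.0), ('bob', 'PT', 5.0), ('carol', 'DE', 7.5);

-- Warehouse tables: one dimension and one fact table.
CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE fact_order (
  order_id INTEGER,
  country_id INTEGER REFERENCES dim_country(country_id),
  amount REAL
);
""")


def load_warehouse(conn):
    """Rebuild the fact table from the operational data (a full reload)."""
    with conn:
        # Add any countries not yet present in the dimension table.
        conn.execute(
            "INSERT OR IGNORE INTO dim_country (name) "
            "SELECT DISTINCT country FROM orders"
        )
        conn.execute("DELETE FROM fact_order")
        conn.execute(
            "INSERT INTO fact_order (order_id, country_id, amount) "
            "SELECT o.id, d.country_id, o.amount "
            "FROM orders o JOIN dim_country d ON d.name = o.country"
        )


load_warehouse(conn)
totals = conn.execute(
    "SELECT d.name, SUM(f.amount) "
    "FROM fact_order f JOIN dim_country d USING (country_id) "
    "GROUP BY d.name ORDER BY d.name"
).fetchall()
```

A full reload like this is only reasonable for small data sets such as a 50-user trial; larger warehouses usually load incrementally.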
My answer does not necessarily apply to data warehousing. In your case I see the possibility to implement a NoSQL database solution alongside an OLTP relational storage, which in this case is PostgreSQL.
Why consider NoSQL? In addition to the obvious scalability benefits, NoSQL databases offer a number of advantages that will probably apply to your scenario, for instance the flexibility of having records with different sets of fields, and key-based access.
Since you're still in the "trial" stage, you might find it easier to choose a NoSQL database solution depending on your hosting provider. For instance, AWS has SimpleDB, Google App Engine provides its own Datastore, etc. However, there are plenty of other NoSQL solutions you can go for that have nice Python bindings.

Django nonrel: access to different NoSQL databases at the same time?

I'm new to the NoSQL world. From the forums and articles I've read, many users try to "mix" NoSQL tools; for example, they use Cassandra and MongoDB together to build a "powerful system". Because I'm starting with MongoDB, I downloaded the DjanMon project (I'm a Django fan ^_^) along with Django NonRel, the special version of Django that supports NoSQL. I noticed that the settings file doesn't "oblige" you to use one specific NoSQL solution, unlike Django with an RDBMS, where you must specify MySQL, PostgreSQL, or another backend. So, is it possible to mix several (or at least two) NoSQL solutions using Django (for example MongoDB + Cassandra)?
There's nothing to stop you using multiple storage solutions, whether SQL or NoSQL, but the NoSQL solutions all have different architectures, data models and APIs (for example, MongoDB is a document-oriented database, whereas Cassandra is column-oriented), so you can't usually swap one for another without some effort.
Can you clarify what you are actually trying to achieve? I.e. why are you interested in mixing these two specific solutions?
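As a rough illustration of why the data models don't swap cleanly, here is the same record in a document shape and in a wide-column shape, modelled with plain Python dicts. This does not use the MongoDB or Cassandra drivers; the record, the `name:index` column naming, and the `document_to_columns` helper are all made up for the example.

```python
# Document model (MongoDB-style): one nested record per entity.
user_document = {
    "_id": "u1",
    "name": "Ada",
    "emails": ["ada@example.com"],
    "address": {"city": "London"},
}

# Wide-column model (Cassandra-style): row key -> {column name: value}.
user_rows = {
    "u1": {
        "name": "Ada",
        "emails:0": "ada@example.com",
        "address:city": "London",
    },
}


def document_to_columns(doc, prefix=""):
    """Flatten a nested document into flat column-name/value pairs."""
    columns = {}
    for key, value in doc.items():
        name = prefix + key
        if isinstance(value, dict):
            # Nested documents become prefixed columns.
            columns.update(document_to_columns(value, name + ":"))
        elif isinstance(value, list):
            # List elements become one column per index.
            for i, element in enumerate(value):
                columns["%s:%d" % (name, i)] = element
        else:
            columns[name] = value
    return columns


body = {k: v for k, v in user_document.items() if k != "_id"}
flat = document_to_columns(body)
```

Even in this toy case, moving between the two shapes needs a mapping layer, which is the kind of effort the answer refers to.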

Django App Engine bforms

I am trying to understand how Django and App Engine work together.
First question: is this a good team?
Experience reports on what is possible and what is not would be great.
I also read that some modules, like auth and admin, won't work.
But the article is rather old, so maybe there is an update.
And in that tutorial one has to import bforms.
What is that?
Django Module? Appengine? Python? Bigtable?
How is Bigtable different from regular SQL, MySQL?
Thanks
Traditional SQL databases such as MySQL were designed primarily to run on a single server and are hard to scale in cloud computing, where one database may need to span 1,000 computers. Thus the next generation of databases, like Bigtable, was created to distribute the data over many database servers. They are called NoSQL databases, for "Not Only SQL." See http://nosql-database.org/ for a list of NoSQL databases. Google App Engine apparently allows you to use the Bigtable structure, so your data is distributed over a dozen database servers in the cloud. So does Amazon's SimpleDB.
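To give a feel for what "distributing the data over many servers" means, here is a generic sketch of key-based partitioning, where each key is hashed to pick one of several shards. This is an illustration of the general idea only, not how Bigtable or SimpleDB actually assign data to servers; the shards are plain in-process dicts and all names are hypothetical.

```python
import hashlib

NUM_SHARDS = 4
# Each dict stands in for one database server.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(key, num_shards=NUM_SHARDS):
    """Pick a shard index deterministically from the key's hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def put(key, value):
    shards[shard_for(key)][key] = value


def get(key):
    # Only the shard that owns the key has to be consulted.
    return shards[shard_for(key)].get(key)


for i in range(100):
    put("user:%d" % i, {"id": i})

stored = sum(len(shard) for shard in shards)
sample = get("user:42")
```

Key-based lookups like `get` touch a single shard, which is part of why these systems favour key access over joins across arbitrary rows.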

Persistence solutions for C++ (with a SQL database)?

I'm wondering what kind of persistence solutions are there for C++ with a SQL database? In addition to doing things with custom SQL (and encapsulating the data access to DAOs or something similar), are there some other (more general) solutions?
Like some general libraries or frameworks (something like Hibernate & co for Java and .NET) or something else? (Something that I haven't even thought of can also be welcome to be suggested)
EDIT: Yep, I was searching more for an ORM solution or something similar to handle sql queries and the relationships between tables and objects than for the db engine itself. Thanks for all the answers anyway!
SQLite is great: it's fast, stable, proven, and easy to use and integrate.
There is also Metakit although the learning curve is a bit steep. But I've used it with success in a professional project.
It sounds like you are looking for some ORM so that you don't have to bother with hand written SQL code.
There is a post here that goes over ORM solutions for C++.
You also did not mention the type of application you are writing, if it is a desktop application, mobile application, server application.
Mobile: You are best off using SQLite as your database engine because it can be embedded and has a small footprint.
Desktop App: You should still consider using SQLite here, but you also have the option with most desktop applications to have an always on connection to the internet in which case you may want to provide a network server for this task. I suggest using Apache + MySQL + PHP and using a lightweight ORM such as Outlet ORM, and then using standard HTTP post calls to access your resources.
Server App: You have many more options here but I still suggest using Apache + MySQL + PHP + ORM because I find it is much easier to maintain this layer in a script language than in C++.
MySQL Connector/C++ offers a C++ API modeled on JDBC 4.0.
The reference customers who use MySQL Connector/C++ are:
- OpenOffice
- MySQL Workbench
Learn more: http://forums.mysql.com/read.php?167,221298
SQLite + Hiberlite is a nice and promising project, though I hope to see it more actively developed. See http://code.google.com/p/hiberlite/
I use MySQL or SQLite.
MySQL: provides a server-based DB that your application must dynamically connect to.
SQLite: provides an in-memory or file-based DB.
Using the in-memory DB is useful for quick development, as setting up and configuring a DB server just for a single project is a big task. But once you have a DB server up and running, it's just as easy to use that.
In memory DB is useful for holding small DB such as configuration etc.
While for larger data sets a DB server is probably more practical.
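The difference between SQLite's in-memory and file-based modes is easy to see in a few lines. The sketch below uses Python's standard-library sqlite3 module for brevity; the same `:memory:` filename works with the C API's `sqlite3_open()` from C++. The `config` table and the helper functions are hypothetical.

```python
import os
import sqlite3
import tempfile

CREATE_SQL = "CREATE TABLE IF NOT EXISTS config (key TEXT PRIMARY KEY, value TEXT)"


def write_setting(path, key, value):
    """Open the database at `path`, store one setting, and close it."""
    conn = sqlite3.connect(path)
    with conn:
        conn.execute(CREATE_SQL)
        conn.execute("INSERT OR REPLACE INTO config VALUES (?, ?)", (key, value))
    conn.close()


def read_setting(path, key):
    """Open the database at `path` again and look the setting up."""
    conn = sqlite3.connect(path)
    conn.execute(CREATE_SQL)
    row = conn.execute("SELECT value FROM config WHERE key = ?", (key,)).fetchone()
    conn.close()
    return row[0] if row else None


# Each connection to ":memory:" is a fresh, private database,
# so the data is gone once the writing connection closes.
write_setting(":memory:", "theme", "dark")
in_memory_value = read_setting(":memory:", "theme")

# A file-based database persists across connections.
db_file = os.path.join(tempfile.mkdtemp(), "app.db")
write_setting(db_file, "theme", "dark")
file_value = read_setting(db_file, "theme")
```

So an in-memory database only helps while a single connection stays open; anything that must survive a restart needs the file-based form or a DB server.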
Download MySQL from here: http://dev.mysql.com/
Download SQLite from here: http://www.sqlite.org/