RavenDB import from relational database - database-migration

Is it possible to import data from a relational database to RavenDB using a complex query?
The query joins several tables, more than 20, in a very complex way, something that is not achievable with the Migration tool in RavenDB Studio.
I was thinking of ETL, but there is no path from SQL to RavenDB.
Any help is appreciated.
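RavenDB's ETL runs from RavenDB towards SQL, not the other way around, so one practical option is a small hand-rolled loader: run the complex SQL query yourself and bulk-store the resulting rows as documents through RavenDB's client API. Below is a minimal sketch assuming the pyodbc and ravendb Python packages; the connection details, query text and Order class are placeholders, and the exact client calls may differ between client versions.

# Hand-rolled SQL -> RavenDB loader (sketch). All names below are placeholders.
import pyodbc
from ravendb import DocumentStore

COMPLEX_SQL = """
    SELECT o.Id AS order_id, c.Name AS customer, SUM(l.Total) AS total
    FROM Orders o
    JOIN Customers c ON c.Id = o.CustomerId
    JOIN OrderLines l ON l.OrderId = o.Id
    -- ... the remaining joins go here ...
    GROUP BY o.Id, c.Name
"""

class Order:
    # Plain class; the RavenDB client derives the collection name from it.
    def __init__(self, order_id, customer, total):
        self.Id = "orders/%s" % order_id
        self.customer = customer
        self.total = total

store = DocumentStore(urls=["http://localhost:8080"], database="MyDatabase")
store.initialize()

sql = pyodbc.connect("DSN=MySqlServer")   # any DB-API connection works here
cursor = sql.cursor()
cursor.execute(COMPLEX_SQL)

BATCH = 1000
rows = cursor.fetchmany(BATCH)
while rows:
    with store.open_session() as session:
        for order_id, customer, total in rows:
            session.store(Order(order_id, customer, total))
        session.save_changes()            # one round trip per batch
    rows = cursor.fetchmany(BATCH)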

Related

Handle large amounts of time series data in Django while preserving Django's ORM

We are using Django with its ORM in connection with an underlying PostgreSQL database and want to extend the data model and technology stack to store massive amounts of time series data (~5 million entries per day and growing).
The closest questions I found were this and this, which propose combining Django with databases such as TimescaleDB or InfluxDB. But this creates parallel structures to Django's built-in ORM and thus does not seem to be straightforward.
How can we handle large amounts of time series data while preserving or staying really close to Django's ORM?
Any hints on proven technology stacks and implementation patterns are welcome!
Your best option is to keep your relational data in Postgres and your time series data in a separate database, and combine them when needed in your code.
With InfluxDB you can do this join with a Flux script by passing it the SQL that Django's ORM would execute, along with your database connection info. This will return your data in InfluxDB's format though, not Django models.
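If you take the InfluxDB route, the combining can also be done in application code instead of Flux. A rough sketch, assuming the official influxdb-client Python package and a hypothetical Sensor Django model; the URL, token, bucket, measurement and tag names are placeholders.

# Combine InfluxDB time series with Django/Postgres rows in application code.
from influxdb_client import InfluxDBClient
from myapp.models import Sensor   # hypothetical Django model stored in Postgres

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")

flux = '''
from(bucket: "metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
'''
tables = client.query_api().query(flux)

# Index the relational side by the tag stored in InfluxDB, then merge in memory.
sensors = {s.external_id: s for s in Sensor.objects.all()}
combined = []
for table in tables:
    for record in table.records:
        sensor = sensors.get(record.values.get("sensor_id"))
        combined.append((sensor, record.get_time(), record.get_value()))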
Why not use a TimescaleDB instance in parallel to your existing Postgres for the time series data, and use this Django integration for it: https://pypi.org/project/django-timescaledb/.
Using multiple databases in Django is possible, although I have not done it myself so far. Have a look here for a convenient way to do it (rerouting certain models to another database instead of the default Postgres one); a minimal router sketch follows below.
Using Multiple Databases with django
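For the rerouting part, a router along these lines is usually enough; the "timeseries" database alias and the "metrics" app label are assumptions made for the sketch.

# settings.py would define DATABASES["timeseries"] and
# DATABASE_ROUTERS = ["myproject.routers.TimeSeriesRouter"]
class TimeSeriesRouter:
    route_app_labels = {"metrics"}

    def db_for_read(self, model, **hints):
        if model._meta.app_label in self.route_app_labels:
            return "timeseries"
        return None   # fall through to the default database

    def db_for_write(self, model, **hints):
        if model._meta.app_label in self.route_app_labels:
            return "timeseries"
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label in self.route_app_labels:
            return db == "timeseries"
        return db == "default"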

How are cross-database joins performed in Superset?

How are cross-database joins performed in Superset? For example, are the two datasources pulled into a pandas DataFrame, or into a SQLite/Postgres DB, and then joined in memory? Or do you have to provide a database instance for Superset to perform operations like these?
Superset provides the possibility of creating Virtual Datasets with custom SQL queries, so you need to have your datasources as tables in the same database to perform the joins, and then create charts using the Virtual Datasets.
If I understand the question correctly, I believe Superset is an open-source equivalent of the ADO.NET DataSet from Microsoft. If so, then the selected data from both DBs is pulled into memory (data tables) using separate connections (because each connection string is going to be different), and then the operations are performed on the fly, in memory.
In that scenario, no external database would be required.
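For what it's worth, the "pulled into memory and joined there" idea from the question is easy to illustrate outside of Superset. The sketch below is not Superset's internal mechanism, just a self-contained demonstration of joining two separate datasources in memory with pandas.

# Two independent connections stand in for two datasources; the join happens
# in pandas, in memory, not inside either database.
import sqlite3
import pandas as pd

db_a = sqlite3.connect(":memory:")
db_b = sqlite3.connect(":memory:")

db_a.execute("CREATE TABLE users (id INTEGER, name TEXT)")
db_a.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ada"), (2, "Bob")])
db_b.execute("CREATE TABLE orders (user_id INTEGER, total REAL)")
db_b.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (1, 5.5), (2, 7.0)])

users = pd.read_sql_query("SELECT * FROM users", db_a)
orders = pd.read_sql_query("SELECT * FROM orders", db_b)

joined = users.merge(orders, left_on="id", right_on="user_id")
print(joined)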

Should I use Data Warehouse or database or something else?

On our current project we have a web app with an analytics module. The users select some filters, and based on those filters a table or graph is shown. We want the module to be responsive, so that when the users select the filters they get data in a matter of seconds.
The user filters query a large table of ~1,000,000,000 rows and 20 columns (over the next few years it should grow about 2x/year in rows). 18 of the 20 columns are filterable, and the queries will mostly be SELECT + WHERE.
We are not sure whether we should use a data warehouse or a classical DB.
Current research suggests we should choose between ClickHouse, DynamoDB, Snowflake, BigQuery or Redshift. Has anyone had similar use cases, and which database solution would you recommend?
Since you are using the database for analytics purposes, it is recommended to use an OLAP database (e.g. Redshift).
An OLAP database is designed to process large datasets quickly to answer questions about the data.
You can compare the pricing here
https://medium.com/2359media/redshift-vs-bigquery-vs-snowflake-a-comparison-of-the-most-popular-data-warehouse-for-data-driven-cb1c10ac8555

DBFlow: How to migrate tables from another database?

I have some columns of a table in an "old" database that I want to migrate to a new one, using DBFlow. DBFlow provides the @Migration annotation for databases, but it seems it only works to migrate tables within the same database.
What is the best approach to import columns into a new/different database using DBFlow?
It is not possible to migrate between different databases with DBFlow's migrations. One needs to copy/convert/migrate the data by hand.
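Since DBFlow databases are ordinary SQLite files, the hand-copying can be done with plain SQL by attaching the old file to the new database and selecting the columns you need. The sketch below uses Python's sqlite3 module purely for illustration; the file, table and column names are placeholders, and the new table is assumed to already exist (e.g. created by DBFlow).

# Copy selected columns from an old SQLite database file into a new one.
import sqlite3

new_db = sqlite3.connect("new_database.db")
new_db.execute("ATTACH DATABASE 'old_database.db' AS old")

new_db.execute("""
    INSERT INTO items (id, name, created_at)
    SELECT id, name, created_at FROM old.items
""")
new_db.commit()
new_db.execute("DETACH DATABASE old")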

Using database routers to shard a table

I am trying to use Django's database routers to shard my database, but I am not able to find a solution for that.
I'd like to define two databases, create the same table in both, and then save the even rows in one DB and the odd ones in the other. The examples in the documentation show how to write to a master DB and read from the read-only slaves, which is not what I want, because I don't want to store the whole dataset in both DBs.
Do you know of any webpage explaining what I am trying to do?
Thank you
PS: I am using PostgreSQL and I know there are tools to achieve the same goal at the DB level. My goal is to study whether it can also be done in Django and to explore whether there are advantages to doing it this way.
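Not a full answer, but to illustrate the idea: Django's routers decide per model, and for writes they also receive the instance being saved via hints, so primary-key parity can pick the target database. This only works if the primary key is set before saving, and reads driven by arbitrary filters still have to query both databases and merge the results in application code. A sketch, assuming two extra aliases "shard_even" and "shard_odd" in settings.DATABASES:

# Row-parity router sketch; Django routers were not designed for this, so
# treat it as an experiment rather than a production pattern.
class ParityRouter:
    def _shard_for(self, hints):
        instance = hints.get("instance")
        if instance is not None and instance.pk is not None:
            return "shard_even" if instance.pk % 2 == 0 else "shard_odd"
        return None

    def db_for_write(self, model, **hints):
        return self._shard_for(hints)

    def db_for_read(self, model, **hints):
        # Only resolvable when a concrete instance is in the hints; otherwise
        # query both shards explicitly with Model.objects.using(...) and merge.
        return self._shard_for(hints)

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Create the table on both shards so either can hold a given row.
        return db in ("shard_even", "shard_odd")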