How can I create a model with ActiveRecord capabilities but without an actual table behind it? - ruby-on-rails-4

I think this is a recurring question on the Internet, but unfortunately I still haven't found a satisfactory answer.
I'm using Ruby on Rails 4 and I would like to create a model that interfaces with a SQL query, not with an actual table in the database. For example, let's suppose I have two tables in my database: Questions and Answers. I want to make a report that contains statistics from both tables. For that purpose, I have a complex SQL statement that takes data from these tables to build up the statistics. However, the SELECT used in the SQL statement does not take its values directly from either the Answers or the Questions table, but from nested SELECTs.
So far I've been able to create the StatItem model without any migration, but when I try StatItem.find_by_sql("...nested selects...") the system complains that the table stat_items does not exist in the database.
How can I create a model whose instances' data is retrieved from a complex query and not from a table? If that's not possible, I could create a temporary table to store the data. In that case, how can I tell the migration file not to create the table (it would be created by the query)?

How about creating a materialized view from your complex query and following this tutorial:
ActiveRecord + PostgreSQL Materialized Views

Michael Kohl's proposal of materialized views gave me an idea. I initially discarded it because I wrongly thought that a single database connection could be shared by two processes, but after reading about how Rails processes requests, I think my solution is fine.
STEP 1 - Create the model without migration
rails g model StatItem --migration=false
STEP 2 - Create a temporary table called stat_items
# First, drop any existing table created by older requests (database connections are kept open by the server processes).
ActiveRecord::Base.connection.execute('DROP TABLE IF EXISTS stat_items')
# Second, create the temporary table with the desired columns (note: a dummy column called 'id:integer' should exist in the table).
ActiveRecord::Base.connection.execute('CREATE TEMP TABLE stat_items (id integer, ...)')
STEP 3 - Execute an SQL statement that inserts rows in stat_items
STEP 4 - Access the table using the model, as usual
For example:
StatItem.find_by_...
Any comments/improvements are highly appreciated.
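The four steps above are not specific to Rails. As a rough, language-neutral illustration, here is a minimal sketch of the same drop / create temp table / fill / read cycle using Python's standard sqlite3 module. The questions and answers tables, the answer_count column and the build_stat_items helper are hypothetical stand-ins for the real schema and report query, and SQLite stands in for whatever database the application actually uses.

```python
import sqlite3

# In-memory SQLite database standing in for the application's database.
conn = sqlite3.connect(":memory:")

# Hypothetical source tables, reduced to the minimum needed for the sketch.
conn.executescript("""
    CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE answers (id INTEGER PRIMARY KEY, question_id INTEGER);
    INSERT INTO questions (id, title) VALUES (1, 'q1'), (2, 'q2');
    INSERT INTO answers (question_id) VALUES (1), (1), (2);
""")


def build_stat_items(conn):
    # Step 2: drop any table left over from an earlier run on this
    # connection, then recreate it as a temporary table with an id column.
    conn.execute("DROP TABLE IF EXISTS stat_items")
    conn.execute(
        "CREATE TEMP TABLE stat_items "
        "(id INTEGER, title TEXT, answer_count INTEGER)"
    )
    # Step 3: fill the temporary table from a query with a nested SELECT.
    conn.execute("""
        INSERT INTO stat_items (id, title, answer_count)
        SELECT q.id, q.title,
               (SELECT COUNT(*) FROM answers a WHERE a.question_id = q.id)
        FROM questions q
    """)


build_stat_items(conn)
build_stat_items(conn)  # repeating on the same connection must not fail

# Step 4: read the temporary table like any other table.
rows = conn.execute(
    "SELECT id, title, answer_count FROM stat_items ORDER BY id"
).fetchall()
```

Calling the helper twice mirrors a second request arriving on a connection that the server process kept open, which is why the DROP TABLE IF EXISTS comes first.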

Related

Creating tables with dynamic columns in apache calcite

There are two tables in graph database.
User { id, name}
Group { id, name}
User is connected to Group via an edge. Now I want to query this via Apache Calcite with a WHERE clause such as:
select * from User where User.Group.id="Foo"
Since Apache Calcite accepts a Schema with predefined Tables with predefined columns, the above query fails in the validation step. One way to make it work is to define User with four columns: {id, name, Group.id, Group.name}. The problem is that in my case a table can be connected to more than one other table, and the nesting can go up to 6 levels deep. Creating a table with all the columns of its child classes will lead to a table with a lot of dynamic columns.
Is there a way to define the columns of a table as they appear in the query?
Look at resolved issue https://issues.apache.org/jira/browse/CALCITE-1150.
It introduces DynamicRecordType to Apache Calcite. Here is the proposed specification: https://docs.google.com/document/d/1vCWlqRyJQCtYbtVAjGOKP-8BD4_hrhoM9-4qbdoJs6k/edit.
I think it's used by Apache Drill project, see https://github.com/apache/drill/search?q=DynamicRecordType.

WSO2 DAS spark script

I'm trying to deploy a new data publisher car. I looked at the APIM_LAST_ACCESS_TIME_SCRIPT.xml Spark script (used by API Manager) and didn't understand the difference between the two temporary tables it creates: API_LAST_ACCESS_TIME_SUMMARY_FINAL and APILastAccessSummaryData.
The two Spark temporary tables represent different JDBC tables (possibly in different datasources), where one of them acts as the source for Spark and the other acts as the destination.
To illustrate this better, have a look at the simplified script in question:
create temporary table APILastAccessSummaryData using CarbonJDBC options (dataSource "WSO2AM_STATS_DB", tableName "API_LAST_ACCESS_TIME_SUMMARY", ... );
CREATE TEMPORARY TABLE API_LAST_ACCESS_TIME_SUMMARY_FINAL USING CarbonAnalytics OPTIONS (tableName "API_LAST_ACCESS_TIME_SUMMARY", ... );
INSERT INTO TABLE APILastAccessSummaryData select ... from API_LAST_ACCESS_TIME_SUMMARY_FINAL;
As you can see, we're first creating a temporary table in Spark with the name APILastAccessSummaryData, which represents an actual relational DB table with the name API_LAST_ACCESS_TIME_SUMMARY in the WSO2AM_STATS_DB datasource. Note the using CarbonJDBC keyword, which can be used to directly map JDBC tables within Spark. Such tables (and their rows) are not encoded, and can be read by the user.
Second, we're creating another Spark temporary table with the name API_LAST_ACCESS_TIME_SUMMARY_FINAL. Here however, we're using the CarbonAnalytics analytics provider, which will mean that this table will not be a vanilla JDBC table, but an encoded table similar to the one from your previous question.
Now, from the third statement, you can see that we're reading (SELECT) a number of fields from the second table API_LAST_ACCESS_TIME_SUMMARY_FINAL and inserting them (INSERT INTO) into the first, which is APILastAccessSummaryData. This represents the Spark summarisation process.
For more details on the differences between the CarbonAnalytics and CarbonJDBC analytics providers or on how Spark handles such tables in general, have a look at the documentation page for Spark Query Language.

Inserting data into a Django DB

I'm trying to use a regular python script to add data to a table that was created via the standard django process (start project/app/create model etc).
In the same DB, I set up another table with the same columns as the django DB to test on, and wrote a script that successfully parsed data and inserted it into that DB.
When I changed the table name so that the data would be written to the standard Django table, nothing was inserted and no error was thrown.
Is there something that prevents access to the Django tables that I'm unaware of?

How RedShift Sessions are handled from a Server Connection for TEMP tables

I'm using ColdFusion to connect to a RedShift database and I'm trying to understand how to test and assure myself of how the connections work in relation to TEMP tables in RedShift.
In my CFADMIN for the datasource I have unchecked Maintain connections across client requests. I would assume then each user who is using my website would have their own "Connection" to the DB? Is that correct?
Per the RedShift docs about temp tables:
TEMP: Keyword that creates a temporary table that is visible only within the current session. The table is automatically dropped at the end of the session in which it is created. The temporary table can have the same name as a permanent table. The temporary table is created in a separate, session-specific schema. (You cannot specify a name for this schema.) This temporary schema becomes the first schema in the search path, so the temporary table will take precedence over the permanent table unless you qualify the table name with the schema name to access the permanent table.
Am I to understand that if #1 is true and each user has their own connection to the database, and thereby their own session, then per #2 any tables that are created will exist only in that session, even though the "user" is the same because every connection comes from my server using the same credentials?
3. If my assumptions in #1 and #2 are correct, then suppose I have ColdFusion code that runs a query like so:
drop if exists tablea
create temp table tablea
insert into tablea
select * from realtable inner join
drop tablea
And multiple users are using that same function that does this. They should never run into any conflicts where one table gets dropped as another request is trying to use it correct?
How do I test that this is the case? Besides throwing it into production and waiting for an error how can I know. I tried running a few windows side by side in different browsers and stuff and didn't notice an issue, but I don't know how to know if the temp tables truly are different between clients. (as they should be.) I imagine I could query some meta data but what meta data about the table would tell me that?
I have a similar situation, but with redbrick database software. I handle it by creating unique table names. The general idea is:
Create a table name something like this:
<cfset tablename = TableText & randrange(1, 100000)>
Try to create a table with that name. If you fail try again with a different name.
If you fail 3 times stop trying and mail the cfcatch information to someone.
I have all this code in a custom tag.
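The retry logic described above could look roughly like the following sketch. It is written in Python against SQLite rather than as a ColdFusion custom tag, so the function name create_unique_temp_table, the report_ prefix and the exception type are assumptions for illustration only; the original tag mails the cfcatch details on final failure, whereas this sketch simply raises.

```python
import random
import sqlite3


def create_unique_temp_table(conn, prefix, columns_sql, max_attempts=3):
    """Try up to max_attempts random names; return the name that worked."""
    last_error = None
    for _ in range(max_attempts):
        # Same idea as TableText & randrange(1, 100000) in the answer.
        name = f"{prefix}{random.randint(1, 100000)}"
        try:
            conn.execute(f"CREATE TEMP TABLE {name} ({columns_sql})")
            return name
        except sqlite3.OperationalError as exc:
            # Name already taken in this session: try another one.
            last_error = exc
    # After the last failed attempt, give up and report the error.
    raise RuntimeError(
        f"could not create a temp table after {max_attempts} attempts"
    ) from last_error


conn = sqlite3.connect(":memory:")
first = create_unique_temp_table(conn, "report_", "f1 INTEGER")
second = create_unique_temp_table(conn, "report_", "f1 INTEGER")
```

Both calls run on the same connection, which is the situation where a fixed table name would collide.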
Edit starts here
Based on the comments, here is some more information about my situation. In CFAdmin, for the datasource being discussed, the Maintain Connections box is checked.
I put this code on a ColdFusion page:
<cfquery datasource="dw">
create temporary table dan (f1 int)
</cfquery>
I ran the page and then refreshed it. The page executed successfully the first time. When refreshed, I got this error.
Error Executing Database Query.
** ERROR ** (7501) Name defined by CREATE TEMPORARY TABLE already exists.
That's why I use unique tablenames. I don't cache the queries though. Ironically, my most frequent motivation for using temporary tables is because there are situations where they make things run faster than using the permanent tables.
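The behaviour described in this thread (a name clash when the same connection is reused, no clash between separate sessions) can be reproduced locally with a small experiment. The sketch below uses Python's sqlite3 module with a throwaway database file as a stand-in; it is not Redshift or Red Brick, and it only assumes that those systems scope temporary tables to a session in a comparable way, as the quoted Redshift documentation states.

```python
import os
import sqlite3
import tempfile

# Two separate connections to the same database file play the role of
# two sessions opened by two different requests.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn_a = sqlite3.connect(path)
conn_b = sqlite3.connect(path)

conn_a.execute("CREATE TEMP TABLE dan (f1 INTEGER)")

# Same connection, same name: this fails, much like the refreshed page
# when the connection is maintained across requests.
try:
    conn_a.execute("CREATE TEMP TABLE dan (f1 INTEGER)")
    reuse_failed = False
except sqlite3.OperationalError:
    reuse_failed = True

# Different connection: the name is free, because each connection has
# its own temporary tables.
conn_b.execute("CREATE TEMP TABLE dan (f1 INTEGER)")

conn_a.execute("INSERT INTO dan VALUES (1)")
conn_a.commit()
conn_b.execute("INSERT INTO dan VALUES (2)")
conn_b.commit()

# Each connection only sees the rows in its own temporary table.
rows_a = conn_a.execute("SELECT f1 FROM dan").fetchall()
rows_b = conn_b.execute("SELECT f1 FROM dan").fetchall()
```

Running the equivalent statements from two real sessions against the actual database would be the way to confirm the same behaviour there.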

Syncing db with existing tables through django for an existing schema table and also updating few columns for the tables and the rest automatically

I am doing a POC in Django. I was trying to build an admin console module for inserting, updating and deleting records through the Django admin console via models, and it was working fine.
I have 2 questions.
1. I need model objects for existing tables that live in a particular schema, say schema1.table1.
So far I have only done the POC in the public schema.
Can this be done in a fixed, predefined schema, and if so, how? Any reference would be very helpful.
2. I also want to update a few columns in the table through the console and have the rest filled in automatically, such as the current timestamp, created date, etc. Is this possible through the default Django console? If so, kindly share a reference.
Steps for 1
What I have done so far is create a class in model.py with the attributes author, title, body and timeofpost.
Then I ran sqlmigrate after makemigrations app to create the table, and after migrating I have been using the Django admin console to insert and update records in the created table. But this is for the POC only.
Now I need to do the same for existing tables, so that I can insert or update records in them through the admin console.
Also, tables are created in the public schema by default. But I am using Postgres, the existing tables live in different schemas, and I want to insert, update and delete records in those existing tables.
I am stuck here because I don't know how to configure models for existing tables in schemas other than public so that we can work with them through the Django console.
Steps for 2:
I also want the user to provide input for only a few columns. For example, in this case the time of creation should not have to be entered by the user; rather, it should be taken care of when the database row is created or updated.
Thanks
In order for Django to "interact" with an existing database you need to create a model for it which can be done automatically as shown here. This assumes that your "external" database isn't going to be changed often because you'll have to keep your models in sync which is tricky - there are other approaches if you need that.
As for working with multiple database schemas: is there a reason you can't put your POC table in the same database as the others? Django supports multiple databases, but it will be harder to set up. See here.
Finally, it sounds like you are interested in setting the Django default field attribute. For an example of current time see here.