Using RelBuilder how to create SELECT and WHEREclause? - apache-calcite

I have the following code where I am trying to create a relational algebra:
final RelNode relNode = relBuilder.scan(index).build();
I am looking for a sample code where I can create a SELECE and WHERE clause in the above code.

Have you read the documentation? There are several examples there.
final RelNode node = builder
.scan("EMP")
.aggregate(builder.groupKey("DEPTNO"),
builder.count(false, "C"),
builder.sum(false, "S", builder.field("SAL")))
.filter(
builder.call(SqlStdOperatorTable.GREATER_THAN,
builder.field("C"),
builder.literal(10)))
.build();
System.out.println(RelOptUtil.toString(node));
is equivalent to SQL
SELECT deptno, count(*) AS c, sum(sal) AS s
FROM emp
GROUP BY deptno
HAVING count(*) > 10

Related

Query for listing Datasets and Number of tables in Bigquery

So I'd like make a query that shows all the datasets from a project, and the number of tables in each one. My problem is with the number of tables.
Here is what I'm stuck with :
SELECT
smt.catalog_name as `Project`,
smt.schema_name as `DataSet`,
( SELECT
COUNT(*)
FROM ***DataSet***.INFORMATION_SCHEMA.TABLES
) as `nbTable`,
smt.creation_time,
smt.location
FROM
INFORMATION_SCHEMA.SCHEMATA smt
ORDER BY DataSet
The view INFORMATION_SCHEMA.SCHEMATA lists all the datasets from the project the query is executed, and the view INFORMATION_SCHEMA.TABLES lists all the tables from a given dataset.
The thing is that the view INFORMATION_SCHEMA.TABLES needs to have the dataset specified like this give the tables informations : dataset.INFORMATION_SCHEMA.TABLES
So what I need is to replace the *** DataSet*** by the one I got from the query itself (smt.schema_name).
I am not sure if I can do it with a sub query, but I don't really know how to manage to do it.
I hope I'm clear enough, thanks in advance if you can help.
You can do this using some procedural language as follows:
CREATE TEMP TABLE table_counts (dataset_id STRING, table_count INT64);
FOR record IN
(
SELECT
catalog_name as project_id,
schema_name as dataset_id
FROM `elzagales.INFORMATION_SCHEMA.SCHEMATA`
)
DO
EXECUTE IMMEDIATE
CONCAT("INSERT table_counts (dataset_id, table_count) SELECT table_schema as dataset_id, count(table_name) from ", record.dataset_id,".INFORMATION_SCHEMA.TABLES GROUP BY dataset_id");
END FOR;
SELECT * FROM table_counts;
This will return something like:

RelNode of a query in which FROM clause itself has a query

I want to achieve result from a table where I ORDER BY column id and I don't want id to be present in the result. I can achieve this using the following query.
SELECT COALESCE (col1, '**')
FROM (select col1, id FROM myDataSet.myTable WHERE col4 = 'some filter' ORDER BY id);
Now, I want to create a RelNode of the above query. As far as I know, in calcite, to perform table scan, there are only two methods scan(String tableName) and scan(Iterable<String> tableNames). Is there a way to scan(RelNode ) ? How to do this ?
The query
select col1, col2, col2 FROM myDataSet.myTable WHERE col4 = 'some filter' ORDER BY id
should also give you the desired result.
If you want to represent the query you have written more directly, you would start by constructing a RelNode for the query in the from clause, starting with a scan of myDataSet.myTable, adding the filter, and the order. Then you can project the specific set of columns you want.
Just simply create a RelNode of inner subquery and create another projection on top of it. Like so.
builder.scan('myTable')
.filter(builder.call(SqlStdOperator.EQUALS, builder.field(col4), builder.literal('some filter') )))
.project(builder.field('col1'), builder.field('id'))
.sort(builder.field('id'))
.project(builder.call(SqlStdOperator.COALESCE(builder.field('col1'), builder.literal('**'))))
.build()

Create RelNode of a select query with concat

I went through the documentation of Apache Calcite. Is the relNode correct for the following query in BigQuery?
SELECT CONCAT('a or b',' ', '\n', first_name)
FROM foo.schema.employee
WHERE first_name = 'name';
relNode = builder
.scan("schema.employee")
.filter(builder.call(SqlStdOperatorTable.EQUALS,
builder.field("first_name"),
builder.literal("name"))
.project(builder.call(SqlStdOperatorTable.CONCAT,
builder.literal("a or b"),
builder.literal(" "),
builder.literal("\\n"),
builder.field(first_name)))
.build()
That looks correct at a glance. I would suggest you confirm by looking at the query results and also by converting your RelNode to SQL.

Doctrine2 Query Builder Left Join with Select

I want to implement this SQL using doctrine2 query builder:
SELECT c.*, COUNT(s.id) AS studentCount
FROM classes c
LEFT JOIN (
SELECT *
FROM student_classes
WHERE YEAR = '2012'
) sc ON c.id = sc.class_id
LEFT JOIN students s ON sc.student_id = s.id
GROUP BY c.id
I tried this one but didn't work
$qb = $this->getEntityManager()
->getRepository('Classes')
->createQueryBuilder('c');
$qb->select('c.id AS id, c.name AS name, COUNT(s) AS studentCount');
$qb->leftJoin(
$qb->select('sc1')
->from('StudentClasses', 'sc1')
->where('sc1.year = :year')
->setParameter('year', $inputYear),
'sc2'
);
$qb->leftJoin('sc2.students', 's');
$qb->groupBy('c.id');
return $qb->getQuery()->getScalarResult();
or should I use nativeSQL instead?
any help would be appreciated,
thanks.
What are you trying to do is really interesting, because JOIN on a SELECT seems to not be supported by Doctrine2 with DQL or QueryBuilder. Of course, you can try with a native query.
However, to answer to your question, I believe that you don't need to make a JOIN on a SELECT. Simply, JOIN on StudentClasses and then add a condition in the WHERE about the $year! The WHERE clause is made for that.
You can use WITH clause to join entity with additional check, For your subquery you can write the same using left join with year filter, In join part i have used c.studentClasses based on the assumption that in Classes entity you have some mapped property for StudentClasses entity
$qb = $this->getEntityManager()
->getRepository('Classes')
->createQueryBuilder('c');
$qb->select('c.id AS id, c.name AS name, COUNT(s) AS studentCount');
$qb->leftJoin('c.studentClasses','sc2', 'WITH', 'sc2.year = :year');
$qb->leftJoin('sc2.students', 's');
$qb->setParameter('year', $inputYear);
$qb->groupBy('c.id');

How to integrate a CTE query in Entity Framework 5

I have an SQL query that I have written using CTE. Now, I am moving the repository to use Entity Framework 5.
I am at a loss as to how to integrate (or rewrite) the CTE-based query using Entity Framework 5.
I am using POCO entities with the EF5 and have a bunch of Map classes. There is no EDMX file etc.
I feel like a total noob right now and would appreciate any help pointing me in the right direction.
The CTE query is as following
WITH CDE AS
(
SELECT * FROM collaboration.Workspace AS W WHERE W.Id = #WorkspaceId
UNION ALL
SELECT W.* FROM collaboration.Workspace AS W INNER JOIN CDE ON W.ParentId = CDE.Id AND W.ParentId <> '00000000-0000-0000-0000-000000000000'
)
SELECT
W.Id AS Id,
W.Name AS Name,
W.Description AS Description,
MAX(WH.ActionedTimeUtc) AS LastUpdatedTimeUtc,
WH.ActorId AS LastUpdateUserId
FROM
collaboration.Workspace AS W
INNER JOIN
collaboration.WorkspaceHistory AS WH ON W.Id = WH.WorkspaceId
INNER JOIN
(
SELECT TOP 10
CDE.Id
FROM
CDE
INNER JOIN
collaboration.WorkspaceHistory AS WH ON WH.WorkspaceId = CDE.Id
WHERE
CDE.Id <> #WorkspaceId
GROUP BY
CDE.Id,
CDE.ParentId,
WH.ActorId,
WH.Action
HAVING
WH.ActorId = #UserId
AND
WH.Action <> 4
ORDER BY
COUNT(*) DESC
) AS Q ON Q.Id = WH.WorkspaceId
GROUP BY
W.Id,
W.Name,
W.Description,
WH.ActorId
HAVING
WH.ActorId = #UserId
You must create stored procedure for your SQL query (or use that query directly) and execute it through dbContext.Database.SqlQuery. You are using code-first approach where you don't have any other options. In EDMX you could use mapped table valued function but code-first doesn't have such option yet.
I have built a stored procedure which takes array of ids as input parameter and return data table using recursive cte query Take a look at the code here it's using EF and code first approach