I am trying to convert a SQL query to TinkerPop Gremlin. The sql2gremlin library does this, but it treats a join as a relation, whereas I rely on a no-join approach in which relations are referenced with a dot as the delimiter between two entities.
I have parsed and validated the query, and I have a RelRoot object.
Apache Calcite returns a RelRoot object, which is the root of the algebraic expression.
Let's say I don't want to apply any query optimization. How do I use my RelNode visitor to transform the RelRoot into the TinkerPop Gremlin DSL?
Ideally I would first use the FROM clause and then apply the filters defined in the WHERE clause. How are the SELECT list, the filters, and the FROM clause represented in the RelRoot tree?
What does Apache Calcite mean by a relational expression, or RelNode?
Rephrasing the same question without the TinkerPop Gremlin context:
How should I use a RelRoot visitor to visit the RelRoot and transform the query into another DSL?
I don't know why you insist on RelRoot rather than the RelNode tree, but Apache Calcite performs its relational algebra optimizations on the RelNode tree. There is a class called RelVisitor that you might find interesting, since it can do exactly what you need: visit all RelNodes. You can then extract the information you need from them and build your DSL with it.
EDIT: In RelVisitor, you have access to the parent node and the child nodes of the currently visited node. You can extract all the information usually available on a RelNode object (see docs), and if you cast it to a specific relational algebra operation, for example Project, you can extract the fields inside the Project operation by doing node.getRowType().getFieldList().forEach(field -> names.add(field.getName())), where names is a previously defined Set<String>. You can find the full code here.
You should also take a look at the algebra docs to understand how SQL maps to relational algebra in Calcite before attempting this.
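To make the mapping concrete: a query such as SELECT name FROM person WHERE age > 30 is typically represented as a Project on top of a Filter on top of a TableScan, so the FROM clause sits at the leaf and the SELECT list at the root. The sketch below is not Calcite code. It is a toy Python model of that three-operator tree (the Node class, the to_gremlin function, and the person/age/name names are all hypothetical) showing how a post-order walk emits the FROM step first, then the filter, then the projection:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Toy stand-in for a relational operator; NOT Calcite's RelNode."""
    op: str                       # "TableScan", "Filter", or "Project"
    detail: object                # table name, (field, predicate, value), or list of fields
    input: Optional["Node"] = None


def to_gremlin(node: Node) -> str:
    """Post-order walk: translate the input first, then append this node's step."""
    if node.op == "TableScan":
        # Leaf of the tree: corresponds to the FROM clause.
        return f"g.V().hasLabel('{node.detail}')"
    upstream = to_gremlin(node.input)
    if node.op == "Filter":
        # Corresponds to the WHERE clause.
        field, predicate, value = node.detail
        return f"{upstream}.has('{field}', {predicate}({value!r}))"
    if node.op == "Project":
        # Corresponds to the SELECT list.
        fields = ", ".join(f"'{f}'" for f in node.detail)
        return f"{upstream}.values({fields})"
    raise ValueError(f"unsupported operator: {node.op}")


# SELECT name FROM person WHERE age > 30
plan = Node("Project", ["name"],
            Node("Filter", ("age", "gt", 30),
                 Node("TableScan", "person")))
query = to_gremlin(plan)
```

In a real implementation the same walk would presumably live in a RelVisitor subclass, with the operator type checked on each visited RelNode instead of the op string used here.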
Related
I am using Apache Calcite to validate and rewrite SQL based on policies that put certain restrictions on these SQL queries. I am trying to modify a RelNode tree in order to rewrite the query to enforce these restrictions. I want to be able to remove certain parts from a query (after it has been validated). For example, I want to be able to remove projection fields (which I managed to do using RelBuilder.projectExcept) and to remove a table scan and its corresponding column references from a query.
Simple example:
SELECT a.foo, b.bar, c.baz
FROM a, b, c
WHERE a.index = b.index AND b.index = c.index
Let's say we want to remove table c from the query, to get to the following:
SELECT a.foo, b.bar
FROM a, b
WHERE a.index = b.index
I have tried using RelBuilder but this does not support removing nodes from the tree. I have also thought about an approach using RelVisitor but this seems quite complicated for this purpose. I think it would essentially require building a new RelNode tree. Lastly, implementing rules using RelRule seems like it would be a suitable option, but I cannot figure out from the Calcite documentation how to remove a particular RelNode and how to parameterize this (e.g. conditionally apply the rule if the table name is c).
Can anyone point me to a good approach? Alternatively, would it be easier to just modify the SqlNode parse tree?
A rule (in this case a TransformationRule) transforms a RelNode into an equivalent RelNode, i.e. both should have the same row type. Assuming you want to use HepPlanner with your custom rule registered, if the rule matches it will eventually check whether the original rel and the transformed rel have the same row type using RelOptUtil#verifyTypeEquivalence. Removing a table changes the row type, so a rule is a poor fit here. I think mutating the RelNode via RelVisitor or mutating the SqlNode via SqlVisitor is your best bet.
I'm trying to use regular expressions in a Cypher WHERE clause. I would like to match tables (nodes) which contain a specific property.
MATCH (n)
WHERE n.Text =~ '*'
RETURN n;
I want to find all nodes which contain the "UName" property.
Please suggest what I should put in the WHERE clause.
To get all nodes that have the UName property you can use the keys() function, like this:
MATCH (n)
WHERE 'UName' IN keys(n)
RETURN n;
Also, remember that Neo4j has no table concept. The data is stored as nodes and relationships, both with properties. Take a look at this Property Graph Model intro.
I am writing a simple app in Django that searches for records in a database.
The user enters a name in the search field, and that query is used to filter records on a particular field, like -
Result = Users.objects.filter(name__icontains=query_from_searchbox)
E.g. -
The database consists of the names Shiv, Shivam, Shivendra, Kashiva, Varun... etc.
A search query 'shiv' returns records in the following order -
Kashiva, Shivam, Shiv and Shivendra
Ordered by primary key.
My question is how can I achieve the order -
Shiv, Shivam, Shivendra and Kashiva.
I mean the most relevant result first, then the less relevant ones.
Standard Django has no built-in relevance ordering for this, as that type of thing is specific to a search app.
When you're interacting with the ORM, consider what you're actually doing with the database - it's all just SQL queries.
If you wanted to rearrange the results you'd have to manipulate the queryset yourself: check for exact matches first, then use regular expressions (or prefix and substring checks) to rank the partial matches.
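As a rough illustration of that manual re-ranking, here is a minimal sketch in plain Python. It assumes the matching names have already been fetched (for example from the icontains filter in the question) and that "relevance" means exact match first, then prefix match, then any other substring match. The rank_by_relevance function and that scoring scheme are assumptions for illustration, not something Django provides:

```python
def rank_by_relevance(names, query):
    """Order substring matches: exact match, then prefix match, then the rest.

    Hypothetical helper; ties are broken alphabetically (case-insensitive).
    """
    q = query.casefold()

    def score(name):
        n = name.casefold()
        if n == q:
            return 0          # exact match is most relevant
        if n.startswith(q):
            return 1          # name begins with the query
        return 2              # query appears somewhere inside the name

    # Mirror icontains: keep only names that contain the query at all.
    matches = [name for name in names if q in name.casefold()]
    return sorted(matches, key=lambda name: (score(name), name.casefold()))


names = ["Kashiva", "Shivam", "Shiv", "Shivendra", "Varun"]
ranked = rank_by_relevance(names, "shiv")
```

This sorts in application memory, so it only suits result sets small enough to load in full; for large tables a dedicated search index is the more realistic option.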
Search isn't really the kind of thing that is best suited to the ORM, however, so you may wish to consider looking at dedicated search applications. They will usually maintain an index, which avoids database hits and may also offer a percentage-match ordering like the one you're looking for.
A good place to start may be Haystack.
I'm trying to explore an Elasticsearch cluster using Python, and I'm new to Elasticsearch. If I use Marvel/Sense, I can see the cluster's schema using GET _mapping. Is there an equivalent way to do this in Python? If so, I can see the "schema" of the cluster!
More generally, I'd like to programmatically discover all the indices, each index's doc_types, and classify the doc_types' fields (are they text strings, ints, floats, what range do the numeric ones take, ...): basically learn the schema and basic statistics of each field. If there is a better way than GET _mapping to start this project, I'm all ears.
This is related to this question, where they are looking for a list of indices using Python, but is more general.
You can do that with pyelasticsearch. This is how you can do GET _mapping in Python.
From the docs:
get_mapping(index=None, doc_type=None)
Fetch the mapping definition for a specific index and type.
Parameters:
index – An index or iterable thereof
doc_type – A document type or iterable thereof
Omit both arguments to get mappings for all types and indexes.
Explore the API to learn more.
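Once you have the mapping, classifying the fields is mostly a matter of walking a nested dict. The sketch below works on a hand-written sample rather than a live cluster. It assumes the response has the shape {index: {"mappings": {doc_type: {"properties": {...}}}}}, which is how older Elasticsearch versions with doc_types lay it out; the blog/post names and the field_types and summarize_mapping helpers are hypothetical:

```python
def field_types(properties, prefix=""):
    """Flatten a mapping 'properties' dict into {dotted.field.name: type}."""
    result = {}
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Nested object: recurse and prefix the child field names.
            result.update(field_types(spec["properties"], path + "."))
        else:
            result[path] = spec.get("type", "object")
    return result


def summarize_mapping(mapping):
    """Return {(index, doc_type): {field: type}} for a GET _mapping style dict."""
    summary = {}
    for index, body in mapping.items():
        for doc_type, type_body in body.get("mappings", {}).items():
            summary[(index, doc_type)] = field_types(type_body.get("properties", {}))
    return summary


# Hand-written sample in the assumed response shape (not real cluster output).
sample_mapping = {
    "blog": {
        "mappings": {
            "post": {
                "properties": {
                    "title": {"type": "string"},
                    "views": {"type": "integer"},
                    "author": {"properties": {"name": {"type": "string"}}},
                }
            }
        }
    }
}
schema = summarize_mapping(sample_mapping)
```

The field types only give you the schema; ranges and other statistics for numeric fields would still need separate aggregation queries per field.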
I have my own data store mechanism for storing data, but I want to implement a standard data manipulation and query interface for end users, so I thought Qt SQL would be suitable for my case.
However, I still cannot understand how to involve my indexes in a SQL query.
For example,
I have a table with columns A (int), B (int), C (int), D (int), and column A is indexed. Assume I execute a query like select * from Foo where A = 10;
How do I involve my index when searching for the results?
You have written your own storage system and want to manipulate it using an SQL-like syntax? I don't think Qt SQL is the right tool for that job. It offers connectivity to various SQL servers and is not meant for parsing SQL statements. Qt expects to "pass through" the queries to the server and then transform the result set into a Qt-friendly representation.
So if you only want a Qt-friendly representation, I don't see a reason to take the detour through SQL.
But regarding your problem:
In SQL, indexes are usually not stated in the queries, but are defined when the table schema is created. SQL Server, however, lets you "hint" indexes in a query; is that what you are looking for?
SELECT column_list FROM table_name WITH (INDEX (index_name) [, ...]);