Where is the BigData datastore INSERT statement and Java API library?

I was given a task to create graph data in the BigData datastore (here BigData is an RDF datastore). The problem is that I couldn't even find an INSERT statement for it. Is there any sample showing how to INSERT and store data?
And where is the Java API library to it?
More about BigData can be found here: http://www.systap.com/bigdata.htm

The INSERT statement is not supported in SPARQL versions before 1.1, so make sure your datastore supports at least SPARQL 1.1.
In some datastores, such as Stardog, even though SPARQL 1.1 is supported, the UPDATE, INSERT, and DELETE features were removed for security reasons.
If your datastore supports SPARQL 1.1 and INSERT, here is the query:
INSERT DATA
{
  <http://example.org/subject> <http://example.org/predicate> <http://example.org/object> .
}
or
INSERT DATA
{
  GRAPH <http://example.org/myGraph>
  {
    <http://example.org/subject> <http://example.org/predicate> <http://example.org/object> .
  }
}
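As for the Java API, Bigdata ships with a Sesame-compatible (openrdf) Repository API, so you can either embed the store or talk to its SPARQL endpoint over HTTP. Here is a rough sketch, not a definitive recipe, using the generic Sesame SPARQLRepository client against a Bigdata SPARQL endpoint; the endpoint URL is an assumption and depends on how your instance is deployed, so adjust it accordingly.

import org.openrdf.query.QueryLanguage;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.sparql.SPARQLRepository;

public class BigdataInsertExample {
    public static void main(String[] args) throws Exception {
        // Endpoint URL is an assumption -- adjust it to your Bigdata deployment.
        Repository repo = new SPARQLRepository("http://localhost:9999/bigdata/sparql");
        repo.initialize();

        RepositoryConnection conn = repo.getConnection();
        try {
            String update =
                  "INSERT DATA { <http://example.org/subject> "
                + "<http://example.org/predicate> <http://example.org/object> . }";
            // Sends a SPARQL 1.1 UPDATE request to the endpoint.
            conn.prepareUpdate(QueryLanguage.SPARQL, update).execute();
        } finally {
            conn.close();
            repo.shutDown();
        }
    }
}

For embedded use, the Bigdata distribution also provides its own Sail/Repository implementations (see its documentation), so the same Sesame connection code can be pointed at a local store as well.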

Related

What is the equivalent of the SQL function REGEXP_EXTRACT in Azure Synapse?

I want to convert my code that I was running in Netezza (SQL) to Azure Synapse (T-SQL). I was using the built-in Netezza SQL function REGEXP_EXTRACT, but this function is not built into Azure Synapse.
Here's the code I'm trying to convert
-- Assume that "column_v1" has datatype Character Varying(3) and can take value between 0 to 999 or NULL
SELECT
column_v1
, REGEXP_EXTRACT(column_v1, '[0-9]+') as column_v2
FROM INPUT_TABLE
;
Thanks,
John
The regexExtract() function is supported in Synapse mapping data flows.
To implement it you need to set up a couple of things. Here is a demo I built, using the SalesLT.Customer data that Microsoft provides as a sample dataset:
In Synapse -> Integrate tab:
Create a new pipeline.
Add a Data flow activity to your pipeline.
In the Data flow activity, under the Settings tab, create a new data flow.
Double-click the data flow (it should open it) and add a source (it can be blob storage, on-prem files, etc.).
Add a derived column transformation.
In the derived column, add a new column (or override an existing one) and set its Expression to regexExtract(Phone,'(\\d{3})'), which selects the first 3 digits. Since my data has dashes in it, it makes more sense to replace all characters that are not digits using the regexReplace method: regexReplace(Phone,'[^0-9]', '').
Add a sink.
(Screenshots of the data flow activities, the derived column transformation, and the resulting output are omitted here.)
Please check the MS docs:
https://learn.microsoft.com/en-us/azure/data-factory/data-flow-derived-column
https://learn.microsoft.com/en-us/azure/data-factory/data-flow-expression-functions
REGEXP_EXTRACT is not available in T-SQL, so we can achieve similar functionality using the SUBSTRING/LEFT/RIGHT functions along with the PATINDEX function:
SELECT input = '789A',
       extract = SUBSTRING('789A', PATINDEX('%[0-9][0-9][0-9]%', '789A'), 3);
Result: extract = '789'
Refer to the Microsoft documentation for PATINDEX (Transact-SQL) and SUBSTRING (Transact-SQL) for additional information.

Parse JSON as key value in Dataflow job

How do I parse JSON data in Apache Beam and store it in a BigQuery table?
For example, this JSON data:
[{ "name":"stack"},{"id":"100"}]
How do I parse the JSON and convert it to a PCollection of key/value pairs that can be stored in a BQ table?
Appreciate your help!!
Typically you would use a built-in JSON parser in the programming language (are you using Beam's Java or Python SDK?), then create a TableRow object and use that for the PCollection which you are passing to the BQ table.
Note: Some JSON parsers disallow JSON which starts with a root list, as you have shown in your example. They tend to prefer something like this, with a root map. I believe this is the case in python's json library.
{"name":"stack", "id":"100"}
Please see this example pipeline for an example of how to create the PCollection and use BigQueryIO.
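As a rough sketch of that flow with the Beam Java SDK (the table name and field names below are placeholders, and any JSON parser will do; org.json is used here purely for illustration):

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.json.JSONObject;

public class JsonToBigQuery {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        // One JSON object per element, using a root map rather than a root list.
        PCollection<String> json = p.apply(Create.of("{\"name\":\"stack\", \"id\":\"100\"}"));

        // Parse each JSON string and turn it into a BigQuery TableRow.
        PCollection<TableRow> rows = json.apply(ParDo.of(new DoFn<String, TableRow>() {
            @ProcessElement
            public void processElement(ProcessContext c) {
                JSONObject obj = new JSONObject(c.element());
                c.output(new TableRow()
                        .set("name", obj.optString("name"))
                        .set("id", obj.optString("id")));
            }
        }));

        // "my-project:my_dataset.my_table" is a placeholder; the table is assumed to exist.
        rows.apply(BigQueryIO.writeTableRows()
                .to("my-project:my_dataset.my_table")
                .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));

        p.run().waitUntilFinish();
    }
}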
You may also want to consider using one of the X to BigQuery template pipelines.

How do I query for relationship data in spring data neo4j 4?

I have a cypher query that is supposed to return nodes and edges so that I can render a representation of my graph in a web app. I'm running it with the query method in Neo4jOperations.
start n=node({id}) match n-[support:SUPPORTED_BY|INTERPRETS*0..5]->(argument:ArgumentNode)
return argument, support
Earlier, I was using spring data neo4j 3.3.1 with an embedded database, and this query did a fine job of returning relationship proxies with start nodes and end nodes. I've upgraded to spring data neo4j 4.0.0 and switched to using a remote server, and now it returns woefully empty LinkedHashMaps.
This is the json response from the server:
{"commit":"http://localhost:7474/db/data/transaction/7/commit","results":[{"columns":["argument","support"],
"data":[
{"row":[{"buildVersion":-1},[]]},
{"row":[{"buildVersion":-1},[{}]]}
]}],"transaction":{"expires":"Mon, 12 Oct 2015 06:49:12 +0000"},"errors":[]}
I obtained this json by putting a breakpoint in DefaultRequest.java and executing EntityUtils.toString(response.getEntity()). The query is supposed to return two nodes which are related via an edge of type INTERPRETS. In the response you see [{}], which is where data about the edge should be.
How do I get a response with the data I need?
Disclaimer: this is not a definitive answer, just what I've pieced together so far.
You can use the queryForObjects method in Neo4jOperations, and make sure that your query returns a path. Example:
neo4jOperations.queryForObjects(ArgumentNode.class, "start n=node({id}) match path=n-[support:SUPPORTED_BY|INTERPRETS*0..5]->(argument:ArgumentNode) return path", params);
The POJOs that come back should be hooked together properly based on their relationship annotations. Now you can poke through them and manually build a set of edges that you can serialize. Not ideal, but workable.
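For example, the rough shape of that is shown below (purely illustrative: the getSupportedBy() accessor and the Edge DTO are hypothetical stand-ins for whatever your ArgumentNode exposes, they are not from the question):

Map<String, Object> params = Collections.singletonMap("id", rootId);
Iterable<ArgumentNode> nodes = neo4jOperations.queryForObjects(
        ArgumentNode.class,
        "start n=node({id}) match path=n-[:SUPPORTED_BY|INTERPRETS*0..5]->(argument:ArgumentNode) return path",
        params);

// Walk the mapped POJOs and collect the relationships they expose
// through their @Relationship-annotated fields (getSupportedBy() is hypothetical).
Set<Edge> edges = new HashSet<>();
for (ArgumentNode node : nodes) {
    for (ArgumentNode target : node.getSupportedBy()) {
        edges.add(new Edge(node.getId(), target.getId()));
    }
}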
Docs suggesting that you return a path:
From http://docs.spring.io/spring-data/data-neo4j/docs/4.0.0.RELEASE/reference/html/#_cypher_queries:
For the query methods that retrieve mapped objects, the recommended
query format is to return a path, which should ensure that known types
get mapped correctly and joined together with relationships as
appropriate.
Explanation of why queryForObjects helps:
Under the hood, there is a distinction between different types of queries. They have GraphModelQuery, RowModelQuery, and GraphRowModelQuery, each of which pass a different permutation of resultDataContents: ["row", "graph"] to the server. If you want data sufficient to reconstruct the graph, you need to make sure "graph" is in the list.
You can find this code inside ExecuteQueriesDelegate:
if (type != null && session.metaData().classInfo(type.getSimpleName()) != null) {
    Query qry = new GraphModelQuery(cypher, parameters);
    ...
} else {
    RowModelQuery qry = new RowModelQuery(cypher, parameters);
    ...
}
Using queryForObjects allows you to provide a type, and kicks things over into GraphModelQuery mode.

How to insert data into RDF data source in WSO2 DSS

I can query data using a SPARQL query as explained here; however, when I try to write an INSERT statement in SPARQL like the one below:
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX relevance: <http://a9.com/-/opensearch/extensions/relevance/1.0/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
INSERT DATA
{
<http://nasa.dataincubator.org/spacecraft/1968-009B> space:internationalDesignator "1968-009B"
}
DSS throws this exception:
Nested Exception:-
com.hp.hpl.jena.query.QueryParseException: Lexical error at line 10, column 101. Encountered: " " (32), after : "INSERT"
Because I can write insert SQL with RDBMS data source, so I think RDF also supports insert functionality.
Could you help me to solve it?
By the looks of it, I feel that the problem is with the SPARQL query itself. Although I'm aware that the query is syntactically correct and conforms to the SPARQL specification, I wonder whether the Apache Jena version used in DSS supports the "INSERT DATA" syntax (just a wild guess based on the reported error log). Can you try the "INSERT (INTO)" clause and check whether it works? Ideally, DSS doesn't modify the query apart from input/output mapping processing, so if your query format is right it should work out of the box.
Cheers,
Prabath
Insert functionality is not yet supported in WSO2 DSS.

Testing neo4j with indexes

I want to test my neo4j project with NoSQLUnit. This works fine as long as I don't need a Lucene index. Is there a way to create a test database with an index?
I think GraphML offers no way to define indexes, so I tried to use the auto-index like this:
@Before
public void startAutoIndex() {
    // Auto-index the "id" and "refname" properties on nodes.
    AutoIndexer<Node> nodeAutoIndexer = graphDb.index().getNodeAutoIndexer();
    nodeAutoIndexer.startAutoIndexingProperty("id");
    nodeAutoIndexer.startAutoIndexingProperty("refname");
    nodeAutoIndexer.setEnabled(true);
}
This doesn't work for me.
Is there another way to implement the auto-index?
Best regards
Jan
Generally, there are two ways:
either you use the Geoff export format,
or use your GraphML but set up auto-indexing on the server side using the conf/server.properties file. There, set these rows:
node_auto_indexing=true
node_keys_indexable=id,refname
Restart the database and do the GraphML import (assuming the imported nodes have id and refname as their properties; if you only need the internal Neo4j node id and not your own unique one, there is no need to index id).