Spanner setAllowPartialRead(true) usage and purpose

From the official code snippet example of the Spanner Java client:
https://github.com/GoogleCloudPlatform/java-docs-samples/blob/HEAD/spanner/spring-data/src/main/java/com/example/spanner/SpannerTemplateSample.java
I can see the usage of
new SpannerQueryOptions().setAllowPartialRead(true):
@Component
public class SpannerTemplateSample {

    @Autowired
    SpannerTemplate spannerTemplate;

    public void runTemplateExample(Singer singer) {
        // Delete all of the rows in the Singer table.
        this.spannerTemplate.delete(Singer.class, KeySet.all());
        // Insert a singer into the Singers table.
        this.spannerTemplate.insert(singer);
        // Read all of the singers in the Singers table.
        List<Singer> allSingers = this.spannerTemplate
            .query(Singer.class, Statement.of("SELECT * FROM Singers"),
                new SpannerQueryOptions().setAllowPartialRead(true));
    }
}
I didn't find any explanation of it. Can anyone help?

Quoting from the documentation:
Partial read is only possible when using queries. In case the rows returned by a query have fewer columns than the entity that it will be mapped to, Spring Data will map the returned columns and leave the rest of the columns as they are.
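In practice, this means a query selecting only a subset of the entity's columns can still be mapped. A minimal sketch, following the Singer entity above (the exact column names are assumptions):

// Select only a subset of Singer's mapped columns; this is allowed because
// allowPartialRead is enabled. Unmatched entity fields keep their defaults.
List<Singer> partialSingers = spannerTemplate.query(
    Singer.class,
    Statement.of("SELECT SingerId, FirstName FROM Singers"),
    new SpannerQueryOptions().setAllowPartialRead(true));

Without setAllowPartialRead(true), mapping rows that do not cover every column of the Singer entity would be rejected.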

How to create a StreamableTable which is also a ScannableTable (Apache Calcite)?

I am looking to implement an org.apache.calcite.schema.Table which can be used as a stream as well as a table.
I was going through the Calcite documentation, and here it mentions an example of an Orders table which is a stream as well as a table. It also mentions that both of the following queries are applicable to this Orders table/stream:
SELECT STREAM * FROM Orders;
and
SELECT * FROM Orders;
I am trying to implement a class whose instances are such tables. I implemented the StreamableTable interface as well as the ScannableTable interface, but am still not able to get it to work both ways. When I try to execute a non-stream query (like SELECT * FROM TEST_TABLE), I get the following error:
Caused by: org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to line 1, column 38: Cannot convert stream 'TEST_TABLE' to relation
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:467)
at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:883)
at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:868)
at org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:5043)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateModality(SqlValidatorImpl.java:3739)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateModality(SqlValidatorImpl.java:3664)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:1048)
at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:232)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:1016)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:724)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:567)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:242)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:208)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:642)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:508)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:478)
at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231)
at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:556)
at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
... 3 more
Queries like SELECT STREAM * FROM TEST_TABLE work as expected.
Can someone help me create such a table?
The modality of SELECT * FROM Orders is RELATION, but the table is an instance of StreamableTable, and that's why the above exception is thrown. I changed RelOptTableImpl#supportsModality a bit, as follows:
@Override public boolean supportsModality(SqlModality modality) {
  switch (modality) {
  case STREAM:
    return table instanceof StreamableTable;
  default:
    // check whether the table is scannable
    if (table instanceof ScannableTable) {
      return true;
    }
    return !(table instanceof StreamableTable);
  }
}
With this change, the plans generated for the above SQL were as usual:
Logical:
LogicalProject(ROWTIME=[$0], ID=[$1], PRODUCT=[$2], UNITS=[$3])
  LogicalTableScan(table=[[STREAMS, ORDERS]])
Physical:
EnumerableTableScan(table=[[STREAMS, ORDERS]])
With a result set starting with:
ROWTIME=2015-02-15 10:15:00; ID=1; PRODUCT=paint; UNITS=10
ROWTIME=2015-02-15 10:24:15; ID=2; PRODUCT=paper; UNITS=5
You can create a test case for this in StreamTest with the above changes.
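For completeness, a minimal sketch of a table that supports both modalities could look like the following (class name, schema, and row data are assumptions; combined with the supportsModality change above, both query forms should validate):

public class OrdersTable extends AbstractTable
        implements ScannableTable, StreamableTable {

    private final Object[][] rows = { /* order rows */ };

    @Override public RelDataType getRowType(RelDataTypeFactory typeFactory) {
        return typeFactory.builder()
            .add("ROWTIME", SqlTypeName.TIMESTAMP)
            .add("ID", SqlTypeName.INTEGER)
            .add("PRODUCT", SqlTypeName.VARCHAR)
            .add("UNITS", SqlTypeName.INTEGER)
            .build();
    }

    // Relational access: SELECT * FROM Orders
    @Override public Enumerable<Object[]> scan(DataContext root) {
        return Linq4j.asEnumerable(rows);
    }

    // Streaming access: SELECT STREAM * FROM Orders
    @Override public Table stream() {
        return this;
    }
}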

ML.NET Dynamic InputModel

I'm using ML.NET to do multiclass classification. I have 3 use cases with different input models (different numbers of columns and data types), and there will be more to come, so it doesn't make sense to have to create a physical file for each input model for every new use case. I'd like to have preferably just ONE physical file that can adapt to any model if possible, and if not, to dynamically create the input model at runtime based on the column definitions defined in a JSON string retrieved from a table in a SQL Server DB. Is this even possible? If so, can you share sample code?
Here are some snippets of the prediction code that I'd like to make generic:
public class DynamicInputModel
{
    [ColumnName("ColumnA"), LoadColumn(0)]
    public string ColumnA { get; set; }

    [ColumnName("ColumnB"), LoadColumn(1)]
    public string ColumnB { get; set; }
}

PredictionEngine<DynamicInputModel, MulticlassClassificationPrediction> predEngine =
    _predEnginePool.GetPredictionEngine(modelName: modelName);

IDataView dataView = _mlContext.Data.LoadFromTextFile<DynamicInputModel>(
    path: testDataPath,
    hasHeader: true,
    separatorChar: ',',
    allowQuoting: true,
    allowSparse: false);

var testDataList = _mlContext.Data.CreateEnumerable<DynamicInputModel>(dataView, false).ToList();
I don't think you can do dynamic input; however, you can create pipelines from one input schema and create multiple different models based on the labels/features. I have an example below that does that: two label columns, and you can pass in an array of the feature columns to use for the model. The one downside to this approach is that the input schema (CSV/database) has to be static (not change on load):
https://github.com/bartczernicki/MLDotNet-BaseballClassification

Can I query DynamoDBMapper with partition key only?

I've seen this page about how to query with partition keys only. However, my case uses the DynamoDBMapper class to make the query, and what seemed to work there does not apply.
Here's a part of my code:
private final DynamoDBMapper mapper;
List<QueryResult> queryResult = mapper.query(QueryResult.class, queryExpression);
The table I query has a primary partition key id and primary sort key timestamp.
I wanted to query all the rows with a designated id; the expression attribute values (eav) look like:
{:id={S: 0123456,}}
but if the id has duplicates (which makes sense, since it's the partition key), it always gives me
"The provided key element does not match the schema"
Not sure how to resolve this. Because the code is shared with other tables, the DynamoDBMapper class is a must.
Any help appreciated! Thanks.
Does the below work?
final DynamoDBQueryExpression<QueryResult> queryExpression = new DynamoDBQueryExpression<>();
queryExpression.withKeyConditionExpression("id = :id");
queryExpression.withExpressionAttributeValues(ImmutableMap.of(":id", new AttributeValue("0123456")));
Here is a working example:
final MyItem hashKeyValues = MyItem.builder()
        .hashKeyField("abc")
        .build();
final DynamoDBQueryExpression<MyItem> queryExpression = new DynamoDBQueryExpression<>();
queryExpression.withHashKeyValues(hashKeyValues);
queryExpression.setConsistentRead(false); // or true
final PaginatedQueryList<MyItem> response = dynamoDBMapper.query(MyItem.class, queryExpression);
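For reference, a minimal sketch of what the mapped MyItem class could look like (class, attribute, and table names are assumptions; the builder in the example above would come from something like Lombok). The point is that withHashKeyValues only needs the partition key set, so the sort key can be omitted:

// Hypothetical mapped item class; adjust names to your schema.
@DynamoDBTable(tableName = "MyTable")
public class MyItem {
    private String hashKeyField;  // partition key ("id" in the question)
    private String rangeKeyField; // sort key ("timestamp" in the question)

    @DynamoDBHashKey(attributeName = "id")
    public String getHashKeyField() { return hashKeyField; }
    public void setHashKeyField(String v) { hashKeyField = v; }

    @DynamoDBRangeKey(attributeName = "timestamp")
    public String getRangeKeyField() { return rangeKeyField; }
    public void setRangeKeyField(String v) { rangeKeyField = v; }
}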

Update and Delete Map/Reduce in HBase

I have a table that contains about half a billion records. I want to change the key of these records, i.e., fetch a record, change its key somehow, delete what was fetched, and save the new record. Let us say, for example, my key is [time-accountId] and I want to change it to [accountId-time].
I want to fetch each entity, create a new one with the different key, delete the entity with [time-accountId], and save the new entity with [accountId-time].
What is the best way to accomplish this task?
I am thinking of M/R, but how can I delete entities with M/R?
You need a MapReduce job which will produce a Put and a Delete for each row of your table. Only a mapper is needed here, since you don't need aggregation on your data, so skip the reducer:
TableMapReduceUtil.initTableReducerJob(
    table, // output table
    null,  // reducer class
    job);
Your mapper has to generate both Puts and Deletes, so the output value class to use is Mutation (https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/Mutation.html):
TableMapReduceUtil.initTableMapperJob(
    table,                        // input table
    scan,                         // Scan instance to control CF and attribute selection
    MyMapper.class,               // mapper class
    ImmutableBytesWritable.class, // mapper output key
    Mutation.class,               // mapper output value
    job);
Then your mapper will look like this:
Delete delete = ...
context.write(oldKey, delete);
Put put = ...
context.write(newKey, put);
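Fleshing that out, a minimal mapper sketch could look like the following (this assumes a reasonably recent HBase client API; swapKeyParts is a hypothetical helper, shown here assuming an 8-byte time prefix, which you would adapt to your actual key layout):

public static class KeySwapMapper
        extends TableMapper<ImmutableBytesWritable, Mutation> {

    @Override
    protected void map(ImmutableBytesWritable rowKey, Result result, Context context)
            throws IOException, InterruptedException {
        byte[] oldKey = rowKey.copyBytes();
        byte[] newKey = swapKeyParts(oldKey);

        // Re-create the row under the new key, copying every cell.
        Put put = new Put(newKey);
        for (Cell cell : result.rawCells()) {
            put.addColumn(CellUtil.cloneFamily(cell),
                          CellUtil.cloneQualifier(cell),
                          cell.getTimestamp(),
                          CellUtil.cloneValue(cell));
        }
        context.write(new ImmutableBytesWritable(newKey), put);

        // Delete the row under the old key.
        context.write(rowKey, new Delete(oldKey));
    }

    // Hypothetical: assumes the old key is an 8-byte time followed by the
    // accountId bytes; swaps the two parts to get [accountId-time].
    private static byte[] swapKeyParts(byte[] oldKey) {
        byte[] newKey = new byte[oldKey.length];
        System.arraycopy(oldKey, 8, newKey, 0, oldKey.length - 8); // accountId first
        System.arraycopy(oldKey, 0, newKey, oldKey.length - 8, 8); // time last
        return newKey;
    }
}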

Qt/SQL - Get column type and name from table without record

Using Qt, I have to connect to a database and list the column types and names from a table. I have two constraints:
1. The database type must not be a problem (this has to work on PostgreSQL, SQL Server, MySQL, ...).
2. When I looked on the internet, I found solutions that work, but only if there are one or more records in the table. I have to get the column types and names with or without records in the database.
I searched a lot on the internet but didn't find any solution.
I am looking for an answer in Qt/C++ or a query that can do that.
Thanks for your help!
QSqlDriver::record() takes a table name and returns a QSqlRecord, from which you can fetch the fields using QSqlRecord::field().
So, given a QSqlDatabase db,
fetch the driver with db.driver(),
fetch the list of tables with db.tables(),
fetch a QSqlRecord for each table from driver->record(tableName), and
fetch the number of fields with record.count() and the name and type with record.field(x).
Following the previous answers, I implemented it as below. It works well; I hope it helps.
{
    QSqlDatabase db = QSqlDatabase::addDatabase("QSQLITE", "demo_conn"); // create a db connection
    QString strDBPath = "db_path";
    db.setDatabaseName(strDBPath); // set the db file
    db.open();                     // the connection must be open before reading the record
    QSqlRecord record = db.record("table_name"); // get the record of the given table
    int n = record.count();
    for (int i = 0; i < n; i++)
    {
        QString strField = record.fieldName(i);
    }
}
QSqlDatabase::removeDatabase("demo_conn"); // remove the db connection
Getting column names and types is a database-specific operation, but you can have a single C++ function that uses the correct SQL query according to the QSqlDriver you currently use:
QStringList getColumnNames(const QSqlDatabase& db)
{
    QString sql;
    if (db.driverName().contains("QOCI", Qt::CaseInsensitive))
    {
        sql = ...
    }
    else if (db.driverName().contains("QPSQL", Qt::CaseInsensitive))
    {
        sql = ...
    }
    else
    {
        qCritical() << "unsupported db";
        return QStringList();
    }
    QSqlQuery res = db.exec(sql);
    ...
    // getting names from db-specific sql query results
}
I don't know of any existing mechanism in Qt which allows that (though it might exist - maybe by using QSqlTableModel). If noone else knows of such a thing, I would just do the following:
Create data classes to store the information you require, e.g. a class TableInfo which stores a list of ColumnInfo objects which have a name and a type.
Create an interface e.g. ITableInfoReader which has a pure virtual TableInfo* retrieveTableInfo( const QString& tableName ) method.
Create one subclass of ITableInfoReader for every database you want to support. This allows doing queries which are only supported on one or a subset of all databases.
Create a TableInfoReaderFactory class which allows creation of the appropriate ITableInfoReader subclass depending on the database used.
This allows you to have your main code independent from the database, by using only the ITableInfoReader interface.
Example:
Input:
database: The QSqlDatabase which is used for executing queries
tableName: The name of the table to retrieve information about
ITableInfoReader* tableInfoReader =
    _tableInfoReaderFactory.createTableReader( database );
TableInfo* tableInfo = tableInfoReader->retrieveTableInfo( tableName );
foreach( ColumnInfo* columnInfo, tableInfo->columnInfos() )
{
    qDebug() << columnInfo->name() << columnInfo->type();
}
I found the solution. You just have to call the record function on QSqlDatabase. You get an empty record, but you can still read the column types and names.