How to create a table with indexes using AWS Glue? - amazon-web-services

Consider a snippet:
glueContext.getSinkWithFormat(
  connectionType = "postgresql",
  options = JsonOptions(Map(
    "url" -> "jdbc://myurl",
    "dbtable" -> "myTable",
    "user" -> "user",
    "password" -> "password"
  ))
).writeDynamicFrame(frame)
This creates the table automatically if it does not exist, but without any indexes or id columns. Is there a way to set up Glue to
create them?

Related

Querying DynamoDb with Global Secondary Index Error

I'm new to DynamoDb and the intricacies of querying it - I understand (hopefully correctly) that I need to either have a partition Key or Global Secondary Index (GSI) in order to query against that value in the table.
I know I can use Appsync to query on a GSI by setting up a resolver - and this works. However, I have a setup using the Java AWS CDK (I'm writing in Kotlin) where I'm using Appsync and routing my queries into lambda Resolvers (so that once this works, I can do more complicated things later).
The crux of the issue is that when I set up a Lambda to resolve my query, the Lambda returns this error message: com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: Query condition missed key schema element: testName
I think these should be the key snippets..
My DynamoDbBean..
@DynamoDbBean
data class Test(
    @get:DynamoDbPartitionKey var id: String = "",
    @get:DynamoDbSecondaryPartitionKey(indexNames = ["testNameIndex"])
    var testName: String = "",
)
Using the CDK, I created the GSI on the table:
testTable.addGlobalSecondaryIndex(
    GlobalSecondaryIndexProps.builder()
        .indexName("testNameIndex")
        .partitionKey(
            Attribute.builder()
                .name("testName")
                .type(AttributeType.STRING)
                .build()
        )
        .projectionType(ProjectionType.ALL)
        .build())
Then, within my Lambda I am trying to query my DynamoDb table, using a fixed value here testName = A.
My Sample data in the Test table would be like so..
{
"id" : "SomeUUID",
"testName" : "A"
}
private var client: AmazonDynamoDB = AmazonDynamoDBClientBuilder.standard().build()
private var dynamoDB: DynamoDB = DynamoDB(client)
Lambda Resolver Snippets...
val table: Table = dynamoDB.getTable(TABLE_NAME)
val index: Index = table.getIndex("testNameIndex")
...
QuerySpec().withKeyConditionExpression("testNameIndex = :testName")
.withValueMap(ValueMap().withString(":testName", "A"))
val iterator: Iterator<Item> = index.query(querySpec).iterator()
while (iterator.hasNext()) {
logger.info(iterator.next().toJSONPretty())
}
This is what results in this Error msg: com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: Query condition missed key schema element: testName
Am I on the wrong lines here? I know there is some mixing of libs between the 'enhanced' Dynamo sdk and the dynamodbv2 sdk - so if there is a better way of doing this query, I'd love to know!
Thanks!
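The key condition expression in your QuerySpec is wrong. It must reference the key attribute of the index (testName), not the name of the index (testNameIndex); the index itself is already selected by table.getIndex("testNameIndex").
It should look like this: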
QuerySpec().withKeyConditionExpression("testName = :testName")
.withValueMap(ValueMap().withString(":testName", "A"))
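The same distinction between index name and key attribute shows up in the low-level Query parameters, where they travel in separate fields. The following is a minimal Python sketch, not the asker's Kotlin code: the helper name build_gsi_query and the placeholders #pk and :pk are made up for illustration, and the table name "Test" is taken from the question.

```python
def build_gsi_query(table_name, index_name, key_attribute, value):
    """Build keyword arguments for a low-level DynamoDB Query on a GSI.

    The index name goes into IndexName; the key condition only ever
    mentions the key attribute of that index.
    """
    return {
        "TableName": table_name,
        "IndexName": index_name,
        # "#pk" and ":pk" are arbitrary placeholder names (assumption).
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": key_attribute},
        "ExpressionAttributeValues": {":pk": {"S": value}},
    }


params = build_gsi_query("Test", "testNameIndex", "testName", "A")
print(params["IndexName"])
print(params["KeyConditionExpression"], params["ExpressionAttributeNames"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as boto3.client("dynamodb").query(**params); the sketch itself only builds the parameters and does not call AWS.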

Enforcing AWS Glue table properties order

I'm using boto3 to update a glue table's table parameters.
I'm doing this with the method update_table.
However, whatever order I push the TableInput parameter in, it gets ignored.
For example I push the following dict:
...
"Parameters": {
"AVTOR": "",
"PROJEKT": "",
"PODROCJE": "",
"CrawlerSchemaDeserializerVersion": "1.0",
"CrawlerSchemaSerializerVersion": "1.0",
...
After updating the table, the table properties appear in a random order.
Is there a way to enforce the table properties order with update_table?
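No answer is recorded for this question. Assuming the service treats Parameters as a plain key-value map of strings and therefore does not preserve insertion order, one option is to impose the order on the client side whenever the properties are read back. The sketch below is pure Python and does not call Glue; the helper name order_parameters and the preferred key list are hypothetical.

```python
def order_parameters(parameters, preferred_keys):
    """Return (key, value) pairs: preferred keys first, in the given
    order, followed by all remaining keys sorted alphabetically."""
    head = [(k, parameters[k]) for k in preferred_keys if k in parameters]
    seen = {k for k, _ in head}
    tail = sorted((k, v) for k, v in parameters.items() if k not in seen)
    return head + tail


# Parameters as they might come back from a get_table call (assumed order).
table_parameters = {
    "CrawlerSchemaSerializerVersion": "1.0",
    "PODROCJE": "",
    "AVTOR": "",
    "CrawlerSchemaDeserializerVersion": "1.0",
    "PROJEKT": "",
}

ordered = order_parameters(table_parameters, ["AVTOR", "PROJEKT", "PODROCJE"])
for key, value in ordered:
    print(f"{key}={value}")
```

This only fixes the order in which your own code displays or processes the properties; it does not change how they are stored or shown in the console.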

How to create a HANA temp table in Vora programmatically (Scala)

At the moment a temp table is created using the following statement
val HANA_TABLE = s"""
CREATE TEMPORARY TABLE TEMP_HANA
USING com.sap.spark.hana
OPTIONS (
path "TABLE",
host "HANA1",
dbschema "SCHEMA",
user "USER",
passwd "PASSWD",
instance "22"
)"""
vc.sql(HANA_TABLE);
Is there a way to do this programmatically in Scala? Something like:
vc.read.format("com.sap.spark.hana").options(options).loadTemp()
On a side note, is there an API for Vora?
Please see the Vora Developer Guide, chapter "8 Accessing Data in SAP HANA".
Your example could be written this way:
val options = Map(
"dbschema" -> "SCHEMA",
"path" -> "TABLE",
"host" -> "HANA1",
"instance" -> "22",
"user" -> "USER",
"passwd" -> "PASSWD"
)
val inputDF = vc.read.format("com.sap.spark.hana").options(options).load()
inputDF.registerTempTable("TEMP_HANA")
vc.sql("select * from TEMP_HANA").show

How do you export a Map data type column on DynamoDB to S3 with JSON data type using HiveQL on EMR?

I have records in DynamoDB that contain an attribute of the Map data type, and I want to export these records to S3 in JSON format using HiveQL on EMR.
How can this be done? Is it possible?
I read the following documentation, but it did not contain the information I needed.
DynamoDB DataFormat Documentation: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DataFormat.html
Hive Command Examples for Exporting... Documentation: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/EMR_Hive_Commands.html
I tried the following steps:
Create a table on DynamoDB
TableName: DynamoDBTable1
HashKey: user_id
Insert two records to DynamoDB
# record1
user_id: "0001"
json: {"key1": "value1", "key2": "value2"}
# record2
user_id: "0001"
json: {"key1": "value1", "key2": "value2"}
Create a table on EMR from DynamoDB
CREATE EXTERNAL TABLE test (user_id string, json map<string, string>)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "DynamoDBTable",
"dynamodb.column.mapping" = "user_id:user_id,json:json");
Export records to S3
INSERT OVERWRITE DIRECTORY 's3://some-bucket/exports/' select json from test where user_id = '0001';
Check the S3 bucket: the exported data is not in JSON format.
# Expected
[
{"key1": "value1", "key2": "value2"},
{"key1": "value1", "key2": "value2"}
]
# Actual
key1^C{"s":"value1"}^Bkey2^C{"s":"value2"}
key1^C{"s":"value1"}^Bkey2^C{"s":"value2"}
The following DynamoDB data types are not supported by the DynamoDBStorageHandler class, so they cannot be used with dynamodb.column.mapping:
Map,
List,
Boolean,
Null
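If the export shown in the question is kept as an intermediate file, it could be converted to JSON in a separate post-processing step. The sketch below is in Python and assumes that the ^B and ^C in the "Actual" output are the control characters \x02 and \x03, and that every map value is DynamoDB-typed JSON of the form {"s": "..."} as in the sample; the function name hive_map_line_to_dict is made up for illustration.

```python
import json

ITEM_SEP = "\x02"  # Ctrl-B, shown as ^B in the exported file (assumption)
KV_SEP = "\x03"    # Ctrl-C, shown as ^C in the exported file (assumption)


def hive_map_line_to_dict(line):
    """Convert one exported line into a plain dict."""
    result = {}
    for entry in line.rstrip("\n").split(ITEM_SEP):
        if not entry:
            continue
        key, typed_value = entry.split(KV_SEP, 1)
        decoded = json.loads(typed_value)
        # Unwrap string values such as {"s": "value1"}; anything else
        # is kept in its decoded form.
        if isinstance(decoded, dict) and "s" in decoded:
            result[key] = decoded["s"]
        else:
            result[key] = decoded
    return result


line = 'key1\x03{"s":"value1"}\x02key2\x03{"s":"value2"}'
print(json.dumps(hive_map_line_to_dict(line)))
# prints {"key1": "value1", "key2": "value2"}
```

Applying this to every line of the exported file and collecting the results in a list would give the "Expected" structure from the question.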

Symfony2 + doctrine2 table relation design

I'm using Symfony2 with Doctrine2 and I need to design the table relations in a yml mapping file.
The tables are users, accounts and roles; a user can be a member of many accounts and have a different role in each of them.
Without Doctrine I would create the three tables and one joining table with user_id, account_id and role_id.
With Doctrine I currently have the mapping below, and I'm looking for a hint on how to add one more relation, to the roles table.
User:
    type: entity
    manyToMany:
        accounts:
            targetEntity: Accounts
            joinTable:
                name: UserAccount
                joinColumns:
                    user_id:
                        referencedColumnName: id
                inverseJoinColumns:
                    account_id:
                        referencedColumnName: id
In that case, the only way to go is to create another entity, say UserAccountRole, and connect each of the three entities to it with a OneToMany relation:
User -> (OneToMany) -> UserAccountRole -> (ManyToOne) -> User
Account -> (OneToMany) -> UserAccountRole -> (ManyToOne) -> Account
Role -> (OneToMany) -> UserAccountRole -> (ManyToOne) -> Role