Issue while querying on DynamoDB table using GSI - amazon-web-services

I have a DynamoDB table, call it ReportingTable. It has the following keys to uniquely identify items:
reportingtablePrimaryKey - partition key of table.
merchantId - sort Key of table.
transactionType-timestamp-index - Global Secondary Index of table containing following attributes.
transactionType - partition key of our GSI. We always store one of four values here: Cancel, Refund, Shipment, MFNShipment.
timestamp - epoch timestamp of when the item came into our system and was saved in DynamoDB.
Now, what I am trying to achieve is to count the items in the DynamoDB table that lie between two timestamps (start and end timestamp).
For that, I came up with the approach of using our GSI transactionType-timestamp-index: for the list of transactionType values and the timestamp range, I pass a key condition that reads all matching records. To get around the limit on the size of a single response, I use lastEvaluatedKey in a loop to fetch the remaining records until the end.
Following is the code I am using:
private static int getNumberOfRecordsFromTable(final AmazonDynamoDB dynamoDBclient, final String tableName,
                                               final String gsiIndex, final List<String> transactionTypes,
                                               final long startTimeEpoch, final long endTimeEpoch) {
    int numberOfRecords = 0;
    Map<String, AttributeValue> lastEvaluatedKey = null;
    Map<String, AttributeValue> valueMap = new HashMap<>();
    valueMap.put(":transaction_type", new AttributeValue().withSS(transactionTypes));
    valueMap.put(":start_time_epoch", new AttributeValue().withN(String.valueOf(startTimeEpoch)));
    valueMap.put(":end_time_epoch", new AttributeValue().withN(String.valueOf(endTimeEpoch)));
    Map<String, String> nameMap = new HashMap<>();
    nameMap.put("#timestamp", "timestamp");
    nameMap.put("#transactionType", "transactionType");
    final String conditionExpression = "(#transactionType = :transaction_type) " +
            "AND (#timestamp BETWEEN :start_time_epoch AND :end_time_epoch)";
    QueryRequest queryRequest = new QueryRequest()
            .withTableName(tableName)
            .withIndexName(gsiIndex)
            .withKeyConditionExpression(conditionExpression)
            .withExpressionAttributeNames(nameMap)
            .withExpressionAttributeValues(valueMap)
            .withProjectionExpression("#transactionType, #timestamp")
            .withExclusiveStartKey(lastEvaluatedKey)
            .withConsistentRead(false);
    do {
        int numberOfRecordsFetched = 0;
        QueryResult queryResult = dynamoDBclient.query(queryRequest);
        lastEvaluatedKey = queryResult.getLastEvaluatedKey();
        numberOfRecordsFetched = queryResult.getScannedCount();
        queryRequest.setExclusiveStartKey(lastEvaluatedKey);
        numberOfRecords = numberOfRecords + numberOfRecordsFetched;
    } while (lastEvaluatedKey != null);
    log.info("Number of {} type messages fetched :: {}", transactionTypes, numberOfRecords);
    return numberOfRecords;
}
I am getting the following error:
Exception in thread "main" com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: One or more parameter values were invalid: Condition parameter type does not match schema type (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; Request ID: FJLJTP7NFVKPTSDF2AJRUL0PTJVV4KQNSO5AEMVJF66Q9ASUAAJG)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1640)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1058)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:3443)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:3419)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeQuery(AmazonDynamoDBClient.java:2318)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.query(AmazonDynamoDBClient.java:2293)
at com.amazon.gstreporting.mtrschedulercli.scripts.GSTDatabaseRecordsCount.getNumberOfRecordsFromTable(GSTDatabaseRecordsCount.java:231)
at com.amazon.gstreporting.mtrschedulercli.scripts.GSTDatabaseRecordsCount.countDynamoDBRecords(GSTDatabaseRecordsCount.java:192)
at com.amazon.gstreporting.mtrschedulercli.scripts.GSTDatabaseRecordsCount.main(GSTDatabaseRecordsCount.java:123)
Could anyone help me with this?

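The reason I was getting the error: I was passing the whole list of transactionType values, when they should have been passed into the query one at a time. A key condition on the partition key must compare against a single value of the key's type (a string here), and a string set does not match that schema type.
After adding an outer for-loop that goes over every transactionType, I was able to fix it.
Please see the code change for reference: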
private static int getNumberOfRecordsFromTable(final AmazonDynamoDB dynamoDBclient, final String tableName,
                                               final String gsiIndex, final List<String> transactionTypes,
                                               final long startTimeEpoch, final long endTimeEpoch) {
    int numberOfRecords = 0;
    for (String transactionType : transactionTypes) {
        Map<String, AttributeValue> lastEvaluatedKey = null;
        Map<String, AttributeValue> valueMap = new HashMap<>();
        valueMap.put(":transaction_type", new AttributeValue().withS(transactionType));
        valueMap.put(":start_time_epoch", new AttributeValue().withN(String.valueOf(startTimeEpoch)));
        valueMap.put(":end_time_epoch", new AttributeValue().withN(String.valueOf(endTimeEpoch)));
        Map<String, String> nameMap = new HashMap<>();
        nameMap.put("#timestamp", "timestamp");
        nameMap.put("#transactionType", "transactionType");
        final String conditionExpression = "(#transactionType = :transaction_type) " +
                "AND (#timestamp BETWEEN :start_time_epoch AND :end_time_epoch)";
        QueryRequest queryRequest = new QueryRequest()
                .withTableName(tableName)
                .withIndexName(gsiIndex)
                .withKeyConditionExpression(conditionExpression)
                .withExpressionAttributeNames(nameMap)
                .withExpressionAttributeValues(valueMap)
                .withProjectionExpression("#transactionType, #timestamp")
                .withExclusiveStartKey(lastEvaluatedKey)
                .withConsistentRead(false);
        do {
            int numberOfRecordsFetched = 0;
            QueryResult queryResult = dynamoDBclient.query(queryRequest);
            lastEvaluatedKey = queryResult.getLastEvaluatedKey();
            numberOfRecordsFetched = queryResult.getScannedCount();
            queryRequest.setExclusiveStartKey(lastEvaluatedKey);
            numberOfRecords = numberOfRecords + numberOfRecordsFetched;
        } while (lastEvaluatedKey != null);
        log.info("Number of {} type messages fetched :: {}", transactionType, numberOfRecords);
    }
    return numberOfRecords;
}
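The control flow of the fix is two nested loops: one query per partition-key value, and within each query one request per page until no resume key comes back. Below is a minimal sketch of just that flow using only the Java standard library. Page, PageFetcher, PagedCountSketch and fakeFetcher are hypothetical stand-ins invented for illustration; they are not AWS SDK types, and the fetcher only imitates what dynamoDBclient.query(...) and getLastEvaluatedKey() do in the code above.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a query result: how many items the page held and
// the key to resume from (null when there are no more pages).
final class Page {
    final int count;
    final String lastEvaluatedKey;

    Page(int count, String lastEvaluatedKey) {
        this.count = count;
        this.lastEvaluatedKey = lastEvaluatedKey;
    }
}

// Hypothetical stand-in for one query call: a single partition-key value
// plus the key to resume from (null for the first page).
interface PageFetcher {
    Page fetch(String partitionKeyValue, String exclusiveStartKey);
}

final class PagedCountSketch {
    // One query per partition-key value; each query is paged until the
    // fetcher stops returning a resume key.
    static int countAll(PageFetcher fetcher, List<String> partitionKeyValues) {
        int total = 0;
        for (String value : partitionKeyValues) {
            String startKey = null;
            do {
                Page page = fetcher.fetch(value, startKey);
                total += page.count;
                startKey = page.lastEvaluatedKey;
            } while (startKey != null);
        }
        return total;
    }

    // In-memory fake for trying the loop out: each value maps to a non-empty
    // list of page sizes, and the resume key is simply the next page index.
    static PageFetcher fakeFetcher(Map<String, List<Integer>> pagesByValue) {
        return (value, startKey) -> {
            List<Integer> pages = pagesByValue.getOrDefault(value, List.of(0));
            int index = (startKey == null) ? 0 : Integer.parseInt(startKey);
            String next = (index + 1 < pages.size()) ? String.valueOf(index + 1) : null;
            return new Page(pages.get(index), next);
        };
    }
}
```

With a fake that serves Cancel in two pages (3 and 2 items) and Refund in one page (4 items), countAll over Cancel, Refund and Shipment adds up the pages of each type in turn, and a type with no data contributes zero.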

Related

How to solve the "too many connection" problem in zookeeper when I want to query too many times in reduce stage?

Sorry for my stupid question and thank you in advance.
I need to replace the output value in the reduce stage (or map stage). However, it causes too many connections in ZooKeeper, and I don't know how to deal with it.
This is my reduce method:
public static class HbaseToHDFSReducer extends Reducer<Text, Text, Text, Text> {
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        HashSet<String> address = new HashSet<>();
        for (Text item : values) {
            String city = getDataByRowKey("A1", item.toString());
            address.add(city);
        }
        context.write(key, new Text(String.valueOf(address).replace("\"", "")));
    }
}
This is the query method:
public static String getDataByRowKey(String tableName, String rowKey) throws IOException {
    Table table = ConnectionFactory.createConnection(conf).getTable(TableName.valueOf(tableName));
    Get get = new Get(rowKey.getBytes());
    String data = new String();
    if (!get.isCheckExistenceOnly()) {
        Result result = table.get(get);
        for (Cell cell : result.rawCells()) {
            String colName = Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength());
            String value = Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength());
            if (colName.equals(rowKey)) {
                data = value;
            }
        }
    }
    table.close();
    return data;
}
What should I do to solve it?
Thank you again
You create a new connection for every query, and creating a connection is a heavyweight operation. Maybe you can get a single Connection in your reduce method and change
getDataByRowKey(String tableName, String rowKey) to this:
getDataByRowKey(Connection connection, String tableName, String rowKey).
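To make the difference concrete, here is a small sketch that uses only the Java standard library. FakeConnection is a hypothetical stand-in that counts how often it is opened; it is not the HBase Connection class, and the lookup methods only imitate the shape of getDataByRowKey before and after the suggested signature change.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an expensive client connection (not the HBase API).
// It counts how many times a connection is opened.
final class FakeConnection implements AutoCloseable {
    static int opened = 0;

    FakeConnection() {
        opened++;
    }

    String get(String rowKey) {
        return "city-of-" + rowKey;
    }

    @Override
    public void close() {
        // Nothing to release in the fake.
    }
}

final class ConnectionReuseSketch {
    // Pattern from the question: every lookup opens its own connection,
    // so N lookups open N connections.
    static String lookupWithOwnConnection(String rowKey) {
        try (FakeConnection connection = new FakeConnection()) {
            return connection.get(rowKey);
        }
    }

    // Suggested shape: the caller owns the connection and passes it in.
    static String lookup(FakeConnection connection, String rowKey) {
        return connection.get(rowKey);
    }

    // The caller opens one connection, reuses it for all lookups, then closes it.
    static List<String> lookupAll(List<String> rowKeys) {
        List<String> results = new ArrayList<>();
        try (FakeConnection connection = new FakeConnection()) {
            for (String rowKey : rowKeys) {
                results.add(lookup(connection, rowKey));
            }
        }
        return results;
    }
}
```

In a Hadoop job, a natural place to hold such a shared connection is the reducer itself: Reducer has setup(Context) and cleanup(Context) hooks that run once per task, so the connection could be created in the first and closed in the second. That placement is a suggestion on top of the answer, not something it states.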

Save Expression on Encrypted attribute in DynamoDB

I'm using a save expression on an encrypted attribute named transactionAmount while updating data in DynamoDB. However, the update fails with ConditionalCheckFailedException. The data is encrypted on the client side during initial persistence in DynamoDB, in the same way as described here. Following is the code:
Data Transfer Object:
public final class SampleDTO {
    @DynamoDBHashKey(attributeName = CommonDynamoDBSchemaConstants.UNIQUE_KEY)
    @Getter(onMethod = @__({ @DoNotTouch }))
    private String uniqueKey;
    @DynamoDBAttribute(attributeName = CommonDynamoDBSchemaConstants.EVENT_RUNNING_TIME_EPOCH)
    @Getter(onMethod = @__({ @DoNotTouch }))
    private Long eventRunningTimeInEpoch;
    @DynamoDBAttribute(attributeName = CommonDynamoDBSchemaConstants.INSTRUMENT_TYPE)
    @DynamoDBTypeConverted(converter = InstrumentTypeConverter.class)
    @Getter(onMethod = @__({ @DoNotTouch }))
    private InstrumentType instrumentType;
    @DynamoDBAttribute(attributeName = CommonDynamoDBSchemaConstants.TRANSACTION_AMOUNT)
    private String transactionAmount;
}
Data Access Code:
// Fetches data from DynamoDB based on the unique key passed to it.
SampleDTO sampleDTO = getSampleDTO("testLedgerUniqueKey");
sampleDTO.setInstrumentType(InstrumentType.MACHINE);
DynamoDBSaveExpression saveExpression = new DynamoDBSaveExpression();
Map<String, ExpectedAttributeValue> expressionAttributeValues =
        new HashMap<String, ExpectedAttributeValue>();
expressionAttributeValues.put(
        CommonDynamoDBSchemaConstants.LEDGER_UNIQUE_KEY,
        new ExpectedAttributeValue(true)
                .withValue(new AttributeValue(sampleDTO.getLedgerUniqueKey())));
expressionAttributeValues.put(
        CommonDynamoDBSchemaConstants.TRANSACTION_AMOUNT,
        new ExpectedAttributeValue(true).withValue(
                new AttributeValue(sampleDTO.getTransactionAmount())));
saveExpression.setExpected(expressionAttributeValues);
saveExpression.setConditionalOperator(ConditionalOperator.AND);
dynamoDBMapper.save(sampleDTO, saveExpression, null /*dynamoDBMapperConfig*/);
ConditionalCheckFailedException:
You are trying to update a record that does not exist with your query condition. Please verify your query condition to make sure your query returns a record.
Reference:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html#Programming.Errors.MessagesAndCodes
"You specified a condition that evaluated to false. For example, you might have tried to perform a conditional update on an item, but the actual value of the attribute did not match the expected value in the condition."
Hope it helps.

Amazon AWS DynamoDB converting List of Map to ArrayList of objects

I am trying to learn DynamoDB from Amazon AWS, and have been able to retrieve data successfully; however, I am having a hard time converting it to a usable form.
My goal is to convert the result to an ArrayList of my Data data type, which is a ValueObject class with attributes, getters and setters.
Thanks!
Map<String, String> expressionAttributesNames = new HashMap<>();
expressionAttributesNames.put("#network_asset_code", "network_asset_code");
expressionAttributesNames.put("#temperature", "temperature");
Map<String, AttributeValue> expressionAttributeValues = new HashMap<>();
expressionAttributeValues.put(":network_asset_codeValue", new AttributeValue().withS("17AB05"));
expressionAttributeValues.put(":temperature", new AttributeValue().withN("21"));
ScanRequest scanRequest = new ScanRequest()
        .withTableName("things")
        .withFilterExpression("#network_asset_code = :network_asset_codeValue and #temperature = :temperature")
        .withExpressionAttributeNames(expressionAttributesNames)
        .withExpressionAttributeValues(expressionAttributeValues);
ScanResult scanResult = client.scan(scanRequest);
List<Map<String, AttributeValue>> attributeValues = scanResult.getItems();
ArrayList<Data> dataArray = new ArrayList<>();
for (Map map: attributeValues) {
    Data d = map.values();
    dataArray.add(d);
}
You can use DynamoDBMapper to automagically convert DynamoDB items to Java objects (POJO) using annotations.
After a while I could get it right:
for (Map<String, AttributeValue> map : attributeValues) {
    // Typed map, so no cast is needed when reading the attribute.
    AttributeValue attr = map.get("network_asset_code");
    Data d = new Data();
    d.network_asset_code = attr.getS();
    dataArray.add(d);
}

How to update item by Composite Primary Key in Dynamodb

I have a table called friends:
Friend 1 | Friend 2 | Status
Friend 1 is my HASH attribute and Friend 2 is my range attribute.
I would like to update an item's status attribute where friend 1 = 'Bob' and friend 2 = 'Joe'. Reading through the documentation at http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/JavaDocumentAPICRUDExample.html, I can only see how to update an item by one key. How do I include the other key?
Here you go:
DynamoDBQueryExpression<Reply> queryExpression = new DynamoDBQueryExpression<Reply>()
.withKeyConditionExpression("Id = :val1 and ReplyDateTime > :val2")
.withExpressionAttributeValues(
...
where Id is the Hash Key and ReplyDateTime is the Range Key.
Reference:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBMapper.QueryScanExample.html
Here is an example that updates multiple items in a single table. My table has id as the hash key and Datetime as the range key.
DynamoDB has no built-in feature for this, so I first query all the items, by hash key and range key, that I want to update. Once the results are stored in a List, I load each item by its hash key and range key, change the field with its setter, and save it.
If you edit the hash key, the item with the original hash key stays in the table and a new item is added under the new hash key, so the original has to be deleted. If you only update another attribute, no delete is needed. I haven't included the delete code; write it yourself. If in doubt, run a query afterwards: you will see the entry with the old hash key is still there next to the new one.
Code is below:
public static void main(String[] args) {
    AmazonDynamoDBClient client = new AmazonDynamoDBClient();
    DynamoDBMapper mapper = new DynamoDBMapper(client);
    client.setEndpoint("http://localhost:8000/");
    String fromDate = "2016-01-13";
    String toDate = "2016-02-05";
    User user = new User();
    user.setId("YourHashKey");
    LocalDate frmdate = LocalDate.parse(fromDate, DateTimeFormatter.ISO_LOCAL_DATE);
    LocalDate todate = LocalDate.parse(toDate, DateTimeFormatter.ISO_LOCAL_DATE);
    LocalDateTime startfrm = frmdate.atStartOfDay();
    LocalDateTime endto = todate.atTime(23, 59, 59);
    Condition rangeCondition = new Condition()
            .withComparisonOperator(ComparisonOperator.BETWEEN.toString())
            .withAttributeValueList(new AttributeValue().withS(startfrm.toString()),
                    new AttributeValue().withS(endto.toString()));
    DynamoDBQueryExpression<User> queryExpression = new DynamoDBQueryExpression<User>()
            .withHashKeyValues(user)
            .withRangeKeyCondition("DATETIME", rangeCondition);
    List<User> latestReplies = mapper.query(User.class, queryExpression);
    for (User in : latestReplies) {
        System.out.println(" Hashid: " + in.getId() + " DateTime: " + in.getDATETIME() + "location:" + in.getLOCID());
        User ma = mapper.load(User.class, in.getId(), in.getDATETIME());
        ma.setLOCID("Ohelig");
        mapper.save(ma);
    }
}

How to get all rows containing (or equaling) a particular ID from an HBase table?

I have a method that selects the rows whose row key contains the parameter passed in.
HTable table = new HTable(Bytes.toBytes(objectsTableName), connection);
public List<ObjectId> lookUp(String partialId) {
    if (partialId.matches("[a-fA-F0-9]+")) {
        // create a regular expression from partialId, which can
        // match any rowkey that contains partialId as a substring,
        // and then get all the rows with the specified rowkey
    } else {
        throw new IllegalArgumentException(
                "query must be done with hexadecimal values only");
    }
}
I don't know how to finish the code above.
I only know that the following code gets the row with a specified row key in HBase.
String rowkey = "123";
Get get = new Get(Bytes.toBytes(rowkey));
Result result = table.get(get);
You can use RowFilter with RegexStringComparator to do that. Or, if you just need to fetch the rows that match a given substring, you can use RowFilter with SubstringComparator. This is how you use HBase filters:
public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "demo");
    Scan s = new Scan();
    Filter f = new RowFilter(CompareOp.EQUAL, new SubstringComparator("abc"));
    s.setFilter(f);
    ResultScanner rs = table.getScanner(s);
    for (Result r : rs) {
        System.out.println("RowKey : " + Bytes.toString(r.getRow()));
        // rest of your logic
    }
    rs.close();
    table.close();
}
The above piece of code will give you all the rows which contain abc as a part of their rowkeys.
HTH
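For the RegexStringComparator route, the remaining piece is the pattern string itself. The sketch below builds one from partialId and checks it locally with java.util.regex, without touching HBase. RowKeyPatternSketch, containsRegex and filterKeys are hypothetical names, and the sketch assumes the comparator accepts ordinary java.util.regex syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class RowKeyPatternSketch {
    // Same validation as in the question: hexadecimal characters only.
    private static final Pattern HEX = Pattern.compile("[a-fA-F0-9]+");

    // Builds a regex that matches any row key containing partialId as a
    // substring. Hex input contains no regex metacharacters, so it can be
    // embedded in the pattern as-is.
    static String containsRegex(String partialId) {
        if (!HEX.matcher(partialId).matches()) {
            throw new IllegalArgumentException(
                    "query must be done with hexadecimal values only");
        }
        return ".*" + partialId + ".*";
    }

    // Local check of the pattern against sample row keys; no HBase involved.
    // Matching is case-sensitive.
    static List<String> filterKeys(List<String> rowKeys, String partialId) {
        Pattern pattern = Pattern.compile(containsRegex(partialId));
        List<String> matches = new ArrayList<>();
        for (String key : rowKeys) {
            if (pattern.matcher(key).matches()) {
                matches.add(key);
            }
        }
        return matches;
    }
}
```

The string returned by containsRegex is what would be handed to the comparator in place of the "abc" substring used in the answer. For a plain substring match like this one, SubstringComparator does the same job without any regex.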