Assume my data class has several data members:
class Data {
    @DynamoDBHashKey
    @DynamoDBAutoGeneratedKey
    String id;

    String id1;
    String id2;
    // ...
}
Now I want to query based on id1 and id2, like SQL's WHERE id1 = 'id1' AND id2 = 'id2'. I know I should somehow make them global secondary keys, in either of the two ways below:
Make id1 and id2 hash keys of two separate indexes:
#DynamoDBIndexHashKey(globalSecondaryIndexName = "INDEX_ID1")
String id1;
#DynamoDBIndexHashKey(globalSecondaryIndexName = "INDEX_ID2")
String id2;
// And then query it by either index name:
DynamoDBQueryExpression<Data> query = new DynamoDBQueryExpression<Data>()
.withIndexName("INDEX_ID1") //or INDEX_ID2 here
.withConsistentRead(false) //GSIs do not support consistent reads
.withKeyConditionExpression("id1 = :id1 AND id2 = :id2")
.withExpressionAttributeValues(ImmutableMap.of(":id1", new AttributeValue(id1), ":id2", new AttributeValue(id2)));
Make id1 and id2 the partition key and sort key under the same index name:
#DynamoDBIndexHashKey(globalSecondaryIndexName = "INDEX_ID1_ID2")
String id1;
#DynamoDBIndexRangeKey(globalSecondaryIndexName = "INDEX_ID1_ID2")
String id2;
// And then query it by the single index name:
DynamoDBQueryExpression<Data> query = new DynamoDBQueryExpression<Data>()
.withIndexName("INDEX_ID1_ID2")
.withConsistentRead(false) //GSIs do not support consistent reads
.withKeyConditionExpression("id1 = :id1 AND id2 = :id2")
.withExpressionAttributeValues(ImmutableMap.of(":id1", new AttributeValue(id1), ":id2", new AttributeValue(id2)));
Which way is right or better?
Besides, if I want to query on more than two conditions (say there is an id3), how can I do that?
Make id1 and id2 hash keys of two separate indexes
With this, you can't put both id1 and id2 in the key condition; you can only query on one of them. That is because DynamoDB doesn't allow a single query to use two different indexes simultaneously.
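Note that with a single GSI you can still narrow on the other attribute through a filter expression; a minimal sketch, assuming the same mapper setup as in the question (the filter is applied after the read, so capacity is consumed for every item matching the key condition, not just the filtered result):
// Query INDEX_ID1 on its own key; id2 goes into a filter expression.
DynamoDBQueryExpression<Data> query = new DynamoDBQueryExpression<Data>()
    .withIndexName("INDEX_ID1")
    .withConsistentRead(false) // GSIs do not support consistent reads
    .withKeyConditionExpression("id1 = :id1")
    .withFilterExpression("id2 = :id2")
    .withExpressionAttributeValues(ImmutableMap.of(":id1", new AttributeValue(id1), ":id2", new AttributeValue(id2)));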
Make id1 and id2 the partition key and sort key under the same index name
This will work for SQL-like WHERE id1 = 'id1' AND id2 = 'id2' queries. But the approach won't scale if you add another id, because an index can have only one partition key and one sort key.
As mentioned in a comment above, you can Scan for more filters and conditions. But if you think you'll need a lot more complexity in the future, DynamoDB might not be the right tool for you.
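As an illustration of that Scan fallback, here is a minimal sketch, assuming a DynamoDBMapper instance named mapper and a hypothetical id3 attribute; a Scan reads the whole table and filters afterwards, so it is the least efficient option:
// Scan with a filter on three attributes (reads the full table).
DynamoDBScanExpression scan = new DynamoDBScanExpression()
    .withFilterExpression("id1 = :id1 AND id2 = :id2 AND id3 = :id3")
    .withExpressionAttributeValues(ImmutableMap.of(":id1", new AttributeValue(id1), ":id2", new AttributeValue(id2), ":id3", new AttributeValue(id3)));
List<Data> results = mapper.scan(Data.class, scan);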
I wasn't sure how to word the title, but let me explain my problem.
I have two tables, both of which use a uniqueidentifier as the Id, auto-generated by newsequentialid().
Now, when I insert into table B, its INSERT trigger runs. I do some specific things inside this trigger, and I also insert some values into another table, called A. I need to retrieve table A's inserted Id, but I have been unable to find a solution.
Also, let me explain why I need such a trigger. When a user creates an invoice with products in it that carry stock information, this trigger is responsible for creating a stock transaction, with a header and details for the inserted products (the stock detail table also has a trigger, which updates the warehouses), etc.
I hope this gives some hint of what I am trying to do:
CREATE TRIGGER [dbo].[IT_TBLDebitInvoiceDetails] ON [dbo].[TBLDebitInvoiceDetails] AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    declare @cnt int;

    --HEADER
    declare @DocumentId uniqueidentifier;
    declare @OrganizationId uniqueidentifier;
    declare @Date date;
    declare @TermDate date;
    declare @DespatchDate date;
    declare @DespatchNo nvarchar(20);
    declare @WarehouseId uniqueidentifier;
    declare @CustomerId uniqueidentifier;
    declare @StockId uniqueidentifier;
    declare @CurrencyTypeId uniqueidentifier;
    declare @GrandTotal money;

    --Auditable
    declare @WhoCreated uniqueidentifier;
    declare @DateCreated datetime;
    declare @WhoUpdated uniqueidentifier;
    declare @DateUpdated datetime;

    --DETAIL
    declare @ProductId uniqueidentifier;
    declare @Quantity decimal(18, 2);
    declare @ProductType int;

    SELECT TOP(1) @OrganizationId = OrganizationId, @DocumentId = A.Id, @Date = A.[Date], @TermDate = A.TermDate, @DespatchDate = A.DespatchDate, @DespatchNo = A.DespatchNo,
        @WarehouseId = A.WarehouseId, @CustomerId = A.CustomerId, @GrandTotal = A.GrandTotal,
        @WhoCreated = A.WhoCreated, @DateCreated = A.DateCreated, @WhoUpdated = A.WhoUpdated, @DateUpdated = A.DateUpdated
    FROM TBLDebitInvoices AS A
    INNER JOIN inserted AS B ON A.Id = B.InvoiceId

    /* CHECK STOCK TRANSACTION */
    SELECT @cnt = COUNT(*) FROM inserted AS A
    INNER JOIN TBLProducts AS B ON B.Id = A.ProductId
    WHERE B.ProductType != 1

    --we have products for stock, create stock header
    IF (@cnt > 0)
    BEGIN
        INSERT INTO TBLStocks (OrganizationId, TransactionType, DocumentType, DocumentId, [Date], DeliveryDate, DeliveryNo, SourceWarehouseId, CustomerId, [Description], WhoCreated, DateCreated, WhoUpdated, DateUpdated, IsDeleted)
        VALUES (@OrganizationId, 5, 0, @DocumentId, @Date, @DespatchDate, @DespatchNo, @WarehouseId, @CustomerId, '', @WhoCreated, @DateCreated, @WhoUpdated, @DateUpdated, 0);

        SELECT @StockId = ???????;
    END

    INSERT INTO TBLStockDetails (StockId, ProductId, [Value])
    SELECT @StockId, ProductId, SUM(Quantity) FROM (
        SELECT A.ProductId AS ProductId, A.Quantity FROM inserted AS A
        INNER JOIN TBLProducts AS B ON B.Id = A.ProductId
        WHERE B.ProductType = 0
        UNION ALL
        SELECT C.IngredientID, A.Quantity * C.Quantity FROM inserted AS A
        INNER JOIN TBLProducts AS B ON B.Id = A.ProductId
        INNER JOIN TBLProductRecipes AS C ON C.ProductId = B.Id
        WHERE B.ProductType = 2
    ) AS T1
    GROUP BY ProductId;

    UPDATE TBLDebitInvoices SET StockId = @StockId WHERE Id = @DocumentId;

    /* CHECK DC TRANSACTION */
    INSERT INTO TBLDebitCreditTransactions (TransactionType, DocumentType, DocumentId, PaymentStatus, [Date], Amount, AccountType, AccountId, CurrencyTypeId)
    VALUES (1, 0, @DocumentId, 0, @TermDate, @GrandTotal, 0, @CustomerId, @CurrencyTypeId);
END
GO
Inside this trigger I have this insert:
INSERT INTO TBLStocks (OrganizationId, TransactionType, DocumentType, DocumentId, [Date], DeliveryDate, DeliveryNo, SourceWarehouseId, CustomerId, [Description], WhoCreated, DateCreated, WhoUpdated, DateUpdated, IsDeleted)
VALUES (@OrganizationId, 5, 0, @DocumentId, @Date, @DespatchDate, @DespatchNo, @WarehouseId, @CustomerId, '', @WhoCreated, @DateCreated, @WhoUpdated, @DateUpdated, 0);
SELECT @StockId = ???????;
and I need the Id inserted into this table, so I can use it to insert the related detail rows.
It seems I found the answer. I wasn't sure OUTPUT Inserted would give a separate result for the second insert, but it works:
DECLARE @IdTable TABLE (StockId uniqueidentifier);

INSERT INTO TBLStocks (OrganizationId, TransactionType, DocumentType, DocumentId, [Date], DeliveryDate, DeliveryNo, SourceWarehouseId, CustomerId, [Description], WhoCreated, DateCreated, WhoUpdated, DateUpdated, IsDeleted)
OUTPUT Inserted.Id INTO @IdTable(StockId)
SELECT @OrganizationId, 5, 0, @DocumentId, @Date, @DespatchDate, @DespatchNo, @WarehouseId, @CustomerId, '', @WhoCreated, @DateCreated, @WhoUpdated, @DateUpdated, 0;

SELECT @StockId = StockId FROM @IdTable;
This will give the uniqueidentifier inserted into the second table. (SCOPE_IDENTITY() would be no help here, since Id is a uniqueidentifier filled by a newsequentialid() default rather than an identity column.)
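For anyone who wants to try the pattern in isolation, here is a hypothetical, self-contained demo (table and column names are invented); note that OUTPUT ... INTO captures one row per inserted row, so it also works for multi-row inserts:
CREATE TABLE dbo.DemoStocks
(
    Id uniqueidentifier NOT NULL DEFAULT NEWSEQUENTIALID() PRIMARY KEY,
    Name nvarchar(50) NOT NULL
);
GO

DECLARE @Ids TABLE (Id uniqueidentifier);

INSERT INTO dbo.DemoStocks (Name)
OUTPUT Inserted.Id INTO @Ids(Id)
VALUES (N'first'), (N'second'); -- one captured Id per inserted row

SELECT Id FROM @Ids; -- the generated uniqueidentifiers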
File.txt
123,abc,4,Mony,Wa
123,abc,4, ,War
234,xyz,5, ,update
234,xyz,5,Rheka,sild
179,ijo,6,all,allSingle
179,ijo,6,ball,ballTwo
1) column1, column2, column3 are primary keys
2) column4, column5 are comparison keys
I have a file with duplicate records, like the ones above. Among the duplicates I need to keep only one record, chosen by a sorting order.
Expected Output:
123,abc,4, ,War
234,xyz,5, ,update
179,ijo,6,all,allSingle
Please help me. Thanks in advance.
You can try the below code:
data = LOAD 'path/to/file' using PigStorage(',') AS (col1:chararray,col2:chararray,col3:chararray,col4:chararray,col5:chararray);
B = group data by (col1,col2,col3);
C = foreach B {
sorted = order data by col4 desc;
first = limit sorted 1;
generate group, flatten(first);
};
In the above code, you can change the sorted relation to choose the column you want to sort on and the sort direction. Also, in case you need more than one record per group, you can change the limit to a value greater than 1.
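For instance, a hypothetical variation that sorts on col5 ascending and keeps two records per group:
C = foreach B {
    sorted = order data by col5 asc;
    firstTwo = limit sorted 2;
    generate group, flatten(firstTwo);
};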
Hope this helps.
The question isn't very clear, but I understand this is what you need:
A = LOAD 'file.txt' using PigStorage(',') as (column1,column2,column3,column4,column5);
B = GROUP A BY (column1,column2,column3);
C = FOREACH B GENERATE FLATTEN(group) as (column1,column2,column3);
DUMP C;
Or
A = LOAD 'file.txt' using PigStorage(',') as (column1,column2,column3,column4,column5);
B = DISTINCT(FOREACH A GENERATE column1,column2,column3);
DUMP B;
I am using a hash join on some sample data to join a small table onto a larger one. In this example '_1080544_27_08_2016' is the larger table and '_2015_2016_playerlistlookup' is the smaller one. Here is my code:
data both(drop=rc);
declare Hash Plan
(dataset: 'work._2015_2016_playerlistlookup'); /* declare the name Plan for hash */
rc = plan.DefineKey ('Player_ID'); /* identify fields to use as keys */
rc = plan.DefineData ('Player_Full_Name',
'Player_First_Name', 'Player_Last_Name',
'Player_ID2'); /* identify fields to use as data */
rc = plan.DefineDone (); /* complete hash table definition */
do until (eof1) ; /* loop to read records from _1080544_27_08_2016 */
set _1080544_27_08_2016 end = eof1;
rc = plan.add (); /* add each record to the hash table */
end;
do until (eof2) ; /* loop to read records from _2015_2016_playerlistlookup */
set _2015_2016_playerlistlookup end = eof2;
call missing(Player_Full_Name,
Player_First_Name, Player_Last_Name); /* initialize the variable we intend to fill */
rc = plan.find (); /* lookup each plan_id in hash Plan */
output; /* write record to Both */
end;
stop;
run;
This produces a table with the same number of rows as the smaller lookup table. What I would like to see is a table the same size as the larger one, with the additional fields from the lookup table joined on via the primary key.
The larger table has repeating primary keys; that is to say, the primary key is not unique (based on row number, for example).
Can someone please tell me what I need to amend in the code?
Thanks
You are loading both datasets into your hash object - the small one when you declare it, and then the large one as well in your first do-loop. This makes no sense to me, unless you have lookup values already populated for some but not all of the rows in your large dataset, and you are trying to carry them over between rows.
You are then looping through the lookup dataset and producing 1 output row for each row of that dataset.
It is unclear exactly what you are trying to do here, as this is not a standard use case for hash objects.
Here's my best guess - if this isn't what you're trying to do, please post sample input and intended output datasets.
data want;
    set _1080544_27_08_2016; /* read the large table row by row */
    if 0 then set _2015_2016_playerlistlookup; /* define the lookup variables without reading any rows */
    if _n_ = 1 then do; /* build the hash once, on the first iteration */
        declare Hash Plan(dataset: 'work._2015_2016_playerlistlookup');
        rc = plan.DefineKey ('Player_ID');
        rc = plan.DefineData ('Player_Full_Name', 'Player_First_Name', 'Player_Last_Name', 'Player_ID2');
        rc = plan.DefineDone ();
    end;
    call missing(Player_Full_Name, Player_First_Name, Player_Last_Name); /* clear values carried over from the previous row */
    rc = plan.find(); /* copy the lookup fields for this Player_ID, if found */
    drop rc;
run;
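If you instead want only the rows of the large table that have a match in the lookup table (an inner join rather than a left join), here is a minimal sketch of the same step, using find()'s zero return code as a subsetting IF:
data want_matched;
    set _1080544_27_08_2016;
    if 0 then set _2015_2016_playerlistlookup;
    if _n_ = 1 then do;
        declare Hash Plan(dataset: 'work._2015_2016_playerlistlookup');
        rc = plan.DefineKey ('Player_ID');
        rc = plan.DefineData ('Player_Full_Name', 'Player_First_Name', 'Player_Last_Name', 'Player_ID2');
        rc = plan.DefineDone ();
    end;
    if plan.find() = 0; /* subsetting IF: keep only rows with a matching Player_ID */
    drop rc;
run;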
I have a group of tables that I need to get the integer key from, and I would like to be able to pass any of them into a single function and get the next value for the key.
I believe that RecordRef is the way to do this, but the code so far doesn't seem quite right.
I am trying to build a function that takes a table record and returns an integer value; that integer will be the next value for the primary key. I.e., if the last record's key is 62825, the function will return 62826.
FunctionA
BEGIN
Id := GetNextId(SalesRecord); //Assignment not allowed
END;
FunctionB
BEGIN
Id := GetNextId(CreditMemoRecord); //Assignment not allowed
END;
GetNextId(pTableReference: RecordRef) rNextId : Integer
BEGIN
CASE pTableReference.NUMBER OF
DATABASE::SalesRecord: BEGIN
//Find last Record
pTableReference.FINDLAST;
lFieldRef := pTableReference.FIELD(1); //Set to the PK field
END;
DATABASE::CreditMemoRecord: BEGIN
//Find last Record
pTableReference.FINDLAST;
lFieldRef := pTableReference.FIELD(10); //Set to the PK field
END;
... //do more here
END; //CASE
EVALUATE(rNextId,FORMAT(lFieldRef.VALUE)); //Get the integer value from FieldRef
rNextId := rNextId + 1; //Add one for the next value
EXIT(rNextId); //return the value
END;
With this code I am getting the error "Assignment is not allowed for this variable." on the call to GetNextId.
Idea of the Table Structure:
Table - SalesRecord
FieldId, Fieldname, Type, Description
1 id integer PK
2 text1 text(30)
3 text2 text(30)
4 dec1 decimal
5 dec2 decimal
Table - CreditMemoRecord
FieldId, Fieldname, Type, Description
10 id integer PK
20 text1 text(30)
30 text2 text(30)
40 dec1 decimal
50 dec2 decimal
Just put a function like this in both tables:
GetNextId() rNextId : Integer
BEGIN
RESET;
FINDLAST;
EXIT(id+1);
END;
and then call it from a record variable:
FunctionA
BEGIN
Id := SalesRecord.GetNextId();
END;
FunctionB
BEGIN
Id := CreditMemoRecord.GetNextId();
END;
This is common practice I believe.
You mean "GetNextValue" get next record? I don't quite understand your use-case.
If you want to pass in a generic record, then you'll want to use the VARIANT data type. This is a wildcard type that will accept Records from any table, and allow you to return records from any table.
This is untested, but hopefully it gives you an idea of how it could work:
LOCAL NextRecord(VAR RecVariant : Variant)
// RecRef is a local variable of type RecordRef
IF RecVariant.ISRECORD THEN BEGIN
  RecRef.GETTABLE(RecVariant);
  // RecRef.NUMBER is useful for Database::"Customer" style comparisons
  RecRef.NEXT;
  RecRef.SETTABLE(RecVariant); // Might not be necessary
END;
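A hypothetical call site, with SalesRecord an ordinary record variable and RecVariant a local Variant:
FunctionC
BEGIN
  SalesRecord.FINDFIRST; // position on some row
  RecVariant := SalesRecord; // a Record assigns into a Variant directly
  NextRecord(RecVariant); // advances the wrapped record by one row
END;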
Given:
an SQLite database with a table T
the table T contains 10 columns - C0, C1 ... C9.
an sqlite3_stmt pointer corresponding to select C3,C2 from T
OK, so I can fetch the selected column values using the sqlite3_column_XXX family of methods (http://www.sqlite.org/capi3ref.html#sqlite3_column_blob), like this:
sqlite3_stmt *s;
sqlite3_prepare_v2(db, query, sizeof(query), &s, NULL);
while ((result = sqlite3_step(s)) == SQLITE_ROW)
{
    const char *v3 = reinterpret_cast<const char *>(sqlite3_column_text(s, 0));
    const char *v2 = reinterpret_cast<const char *>(sqlite3_column_text(s, 1));
}
What I need is the real index of the selected columns, i.e. 3 for v3 and 2 for v2.
Motivation: I want to be able to parse the returned string value into the real column type. Indeed, my schema says that c3 is a datetime, which sqlite treats as TEXT. So, sqlite3_column_type(s, 0) returns SQLITE3_TEXT, but the table metadata (available from pragma table_info(T)) retains the string datetime, which is the intended type of the column. Knowing it, I can parse the returned string into the respective unix time since the epoch, for instance.
But how can I map the query column index to the table column index:
query column 0 -> table column 3
query column 1 -> table column 2
Thanks.
You could use the SQLite C function sqlite3_column_decltype to get the declared column data type from the result statement. It doesn't specifically answer your question (getting the original column's index), but it could be an alternative way to achieve what you need.
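For illustration, a minimal sketch of that idea, reusing the statement s from the question (sqlite3_column_decltype returns the type string declared in the schema, or NULL for computed expressions):
#include <cstring>
#include <sqlite3.h>

// Report which result columns were declared as "datetime" in the schema.
void inspect_decltypes(sqlite3_stmt *s)
{
    for (int i = 0; i < sqlite3_column_count(s); ++i)
    {
        const char *decl = sqlite3_column_decltype(s, i);
        if (decl != NULL && std::strcmp(decl, "datetime") == 0)
        {
            // Column i holds TEXT that should be parsed as a datetime,
            // e.g. via sqlite3_column_text(s, i).
        }
    }
}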