Group by, sum, then sort a list of transaction objects in Java

I have a list of transactions by day; each transaction has the following attributes:
Transaction(int transactionID,
            DateTime transactionDate,
            String shopId,
            int productReference,
            int quantity,
            float price);
Given a List<Transaction>, I want to extract the top 100 best-selling products per shop.
So I need to group transactions by shopId, then by productReference, sum the quantities, and finally sort from most sold to least sold.
Thanks for your help.

private static Collector<Transaction, ?, List<Transaction>> limit(int limit) {
    return Collector.of(
            ArrayList::new,
            (list, transaction) -> { if (list.size() < limit) list.add(transaction); },
            (list1, list2) -> {
                list1.addAll(list2.subList(0, Math.min(list2.size(), Math.max(0, limit - list1.size()))));
                return list1;
            }
    );
}

public static void main(String[] args) {
    Map<String, List<Transaction>> groupedMap = listOfTransactions
            .stream()
            .sorted((t1, t2) -> Integer.compare(t2.getQuantity(), t1.getQuantity()))
            .collect(
                    Collectors.groupingBy(
                            Transaction::getShopId,
                            limit(100)
                    )
            );
}
As a result you'll get a map with shopId as the key and, as values, lists of transactions sorted by quantity. Is that the behavior you expected?

I'd suggest introducing an additional Product type, consisting only of shopId and productReference, with equals() and hashCode() overridden. The new type serves as the output and makes the whole transformation more obvious. Consider my version, which uses the Lombok library:
import lombok.*;
@Data
@RequiredArgsConstructor(staticName = "of")
@ToString
public class Product {
    final String shopId;
    final int productReference;
}
and the function code itself:
List<Product> products = transactions.stream()
        // grouping transactions by the same product
        .collect(Collectors.groupingBy(transaction -> Product.of(
                transaction.getShopId(),
                transaction.getProductReference())))
        .entrySet().stream()
        // summing all price * quantity occurrences to find the top sellers
        .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().stream()
                        .mapToDouble(t -> t.getQuantity() * t.getPrice())
                        .sum()))
        .entrySet().stream()
        // sorting with the most expensive ones at the top,
        // limiting to 10 and collecting to the list
        .sorted(Collections.reverseOrder(Map.Entry.comparingByValue()))
        .map(Map.Entry::getKey)
        .limit(10)
        .collect(Collectors.toList());
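Putting the two ideas together to match the original ask (top 100 per shop, ranked by total quantity rather than a flat top 10 by revenue) could look roughly like this. This is a sketch only: a trimmed record stands in for the question's Transaction class, and the accessor names differ from the original getters.

```java
import java.util.*;
import java.util.stream.*;

public class TopProducts {
    // Minimal stand-in for the question's Transaction class.
    record Transaction(String shopId, int productReference, int quantity) {}

    /** Top-N product references per shop, ordered by total quantity sold. */
    static Map<String, List<Integer>> topProducts(List<Transaction> txs, int n) {
        return txs.stream()
                // shop -> (product -> total quantity)
                .collect(Collectors.groupingBy(
                        Transaction::shopId,
                        Collectors.groupingBy(
                                Transaction::productReference,
                                Collectors.summingInt(Transaction::quantity))))
                .entrySet().stream()
                // per shop: sort products by summed quantity, keep the top n
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> e.getValue().entrySet().stream()
                                .sorted(Map.Entry.<Integer, Integer>comparingByValue().reversed())
                                .limit(n)
                                .map(Map.Entry::getKey)
                                .collect(Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Transaction> txs = List.of(
                new Transaction("shopA", 1, 5),
                new Transaction("shopA", 2, 9),
                new Transaction("shopA", 1, 3),
                new Transaction("shopB", 7, 1));
        // shopA: product 1 sold 8 in total, product 2 sold 9 in total
        System.out.println(topProducts(txs, 100).get("shopA")); // [2, 1]
    }
}
```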


DynamoDB - Get all items which overlap a search time interval

My application manages a user's bookings. Each booking consists of a start_date and an end_date, and their current partition in DynamoDB is the following:
PK SK DATA
USER#1#BOOKINGS BOOKING#1 {s: '20190601', e: '20190801'}
[GOAL] I would like to query all reservations which overlap a search time interval.
I tried to find a solution for this issue, but I only found a way to query all items inside a search time interval, which solves only that narrower problem.
I decided to implement it myself and to try to tweak it to solve my problem, but I didn't find a solution. Below is my implementation of "query inside interval" (this is not a DynamoDB implementation, but I will replace the isBetween function with the BETWEEN operand):
import { zip } from 'lodash';
const bookings = [
{ s: '20190601', e: '20190801', i: '' },
{ s: '20180702', e: '20190102', i: '' }
];
const search_start = '20190602'.split('');
const search_end = '20190630'.split('');
// s:20190601 e:20190801 -> i:2200119900680011
for (const b of bookings) {
  b['i'] = zip(b.s.split(''), b.e.split(''))
    .reduce((p, c) => p + c.join(''), '');
}
// (start_search: 20190502, end_search: 20190905) => 22001199005
const start_clause: string[] = [];
for (let i = 0; i < search_start.length; i += 1) {
  if (search_start[i] === search_end[i]) {
    start_clause.push(search_start[i] + search_end[i]);
  } else {
    start_clause.push(search_start[i]);
    break;
  }
}
const s_index = start_clause.join('');
// (end_search: 20190905, start_search: 20190502) => 22001199009
const end_clause: string[] = [];
for (let i = 0; i < search_end.length; i += 1) {
  if (search_end[i] === search_start[i]) {
    end_clause.push(search_end[i] + search_start[i]);
  } else {
    end_clause.push(search_end[i]);
    break;
  }
}
const e_index = (parseInt(end_clause.join('')) + 1).toString();
const isBetween = (s: string, e: string, v: string) => {
  const sorted = [s, e, v].sort();
  console.info(`sorted: ${sorted}`);
  return sorted[1] === v;
};
const filtered_bookings = bookings
  .filter(b => isBetween(s_index, e_index, b.i));
console.info(`filtered_bookings: ${JSON.stringify(filtered_bookings)}`);
There’s not going to be a beautiful and simple yet generic answer.
Probably the best approach is to pre-define your time period size (days, hours, minutes, seconds, whatever) and use the value of that as the PK so for each day (or hour or whatever) you have in that item collection a list of the items touching that day with the sort key of the start time (so you can do the inequality there) and you can use a filter on the end time attribute.
If your chosen time period is days and you need to query across a week then you’ll issue seven queries. So pick a time unit that’s around the same size as your selected time periods.
Remember you need to put all items touching that day (or whatever) into the day collection. If an item spans a week it needs to be inserted 7 times.
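The "write the item once per day it touches" step above can be sketched in plain Java, without the DynamoDB SDK. The "DAY#yyyyMMdd" key layout is made up here for illustration; only the key generation is shown, not the actual writes.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

public class DayBuckets {
    private static final DateTimeFormatter DAY_FMT = DateTimeFormatter.BASIC_ISO_DATE; // yyyyMMdd

    /** One partition key per day the booking touches, e.g. "DAY#20190601". */
    static List<String> bucketKeys(LocalDate start, LocalDate end) {
        List<String> keys = new ArrayList<>();
        for (LocalDate d = start; !d.isAfter(end); d = d.plusDays(1)) {
            keys.add("DAY#" + DAY_FMT.format(d));
        }
        return keys;
    }

    public static void main(String[] args) {
        // A booking spanning a week gets written under 7 day buckets.
        List<String> keys = bucketKeys(LocalDate.of(2019, 6, 1), LocalDate.of(2019, 6, 7));
        System.out.println(keys.size());
        System.out.println(keys.get(0));
    }
}
```

Querying a search interval then means one query per day bucket, with the sort key (start time) handling the inequality and a filter on the end-time attribute.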
Disclaimer: This is a very use-case-specific, non-general approach I took when trying to solve the same problem; it builds on @hunterhacker's approach.
Observations from my use case:
The data I'm dealing with is financial/stock data, which spans back roughly 50 years in the past up to 150 years into the future.
I have many thousands of items per year, and I would like to avoid pulling in all 200 years of information.
The vast majority of the items I want to query span a time that fits within a year (i.e. most items don't go from 30-Dec-2001 to 02-Jan-2002, but rather from 05-Mar-2005 to 10-Mar-2005).
Based on the above, I decided to add an LSI and store the relevant year for every item whose start-to-end range fits within a single year. For items that straddle a year (or more), I set that LSI value to 0.
The querying looks like:
if query_straddles_year:
# This doesn't happen often in my use case
result = query_all_and_filter_after()
else:
# Most cases end up here (looking for a single day, for instance)
year_constrained_result = query_using_lsi_for_that_year()
result_on_straddling_bins = query_using_lsi_marked_with_0() # <-- this is to get any of the indexes that do straddle a year
filter_and_combine(year_constrained_result, result_on_straddling_bins)
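The LSI value described above (the calendar year when the item fits in one, otherwise 0) is simple to compute; a sketch in Java, with the method name being an assumption:

```java
import java.time.LocalDate;

public class YearBucket {
    /** LSI value: the calendar year if the range fits in one, else 0 (the "straddling" bin). */
    static int lsiYear(LocalDate start, LocalDate end) {
        return start.getYear() == end.getYear() ? start.getYear() : 0;
    }

    public static void main(String[] args) {
        // Fits in 2005 -> indexed under that year.
        System.out.println(lsiYear(LocalDate.of(2005, 3, 5), LocalDate.of(2005, 3, 10)));
        // Straddles a year boundary -> goes into the 0 bin.
        System.out.println(lsiYear(LocalDate.of(2001, 12, 30), LocalDate.of(2002, 1, 2)));
    }
}
```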

Kotlin aggregation function

I need to somehow write a function that aggregates results in a list.
I'm working with an Order DTO (a Java class):
public class Order {
    private Long orderId;
    private String description;
    ...
}
I have two APIs: one that returns orders and another that returns suborders. So I retrieve all orders and get all suborders in a loop over predefined ids:
// for example I have a predefined list of order ids
val orderIds = listOf(1L, 2L, 3L, 4L, 5L)
val allOrders = orderIds.map {
    // first I retrieve an order
    val order = orderService.getOrderById(it)
    // then I get a list of suborders
    val suborders = suborderService.getSubordersByOrderId(it)
    // ?
}
How can I combine each order (Order) with its suborders (List<Order>), and then flatten all the nested lists into a single list?
I think flatMap is what you want:
val allOrders: List<Order> = orderIds.flatMap {
    val order = orderService.getOrderById(it)
    val suborders = suborderService.getSubordersByOrderId(it)
    suborders + order
}
It flattens all the items of the returned lists into one single list.
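Since Order is a Java class, the same combine-and-flatten step can also be expressed with Java streams. A sketch; the two service methods here are hypothetical stand-ins for the question's APIs:

```java
import java.util.*;
import java.util.stream.*;

public class CombineOrders {
    record Order(long orderId, String description) {}

    // Hypothetical stand-ins for the order and suborder APIs.
    static Order getOrderById(long id) { return new Order(id, "order " + id); }
    static List<Order> getSubordersByOrderId(long id) {
        return List.of(new Order(id * 100 + 1, "suborder"), new Order(id * 100 + 2, "suborder"));
    }

    public static void main(String[] args) {
        List<Long> orderIds = List.of(1L, 2L);
        // flatMap plays the same role as Kotlin's flatMap: each id expands to
        // its suborders plus the order itself, flattened into one list.
        List<Order> allOrders = orderIds.stream()
                .flatMap(id -> Stream.concat(
                        getSubordersByOrderId(id).stream(),
                        Stream.of(getOrderById(id))))
                .collect(Collectors.toList());
        System.out.println(allOrders.size()); // 2 ids x (2 suborders + 1 order)
    }
}
```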

What is the meaning of the input variables when registering a new group?

The (smart) contract function to register a new group looks as follows:
async registerGroup(name, members, min, max, m, updateInterval) {
...
}
What is the meaning of min,max, m and updateInterval in the above?
name is the name of the group.
members is the list of members added to the group at initialization; it probably contains their public keys.
min and max set the minimum and maximum number of members; min should be >= 3.
m is the minimum total vote weight a request transaction must reach: it sets the weight of votes required to activate a group transaction. The check can be found in the asch/src/contract/group.js file in the activate() function:
const group = await app.sdb.load('Group', account.name)
if (totalWeight < group.m) return 'Vote weight not enough'
Notice that m also can be set when adding a new group member with group.addMember:
async addMember(address, weight, m) {
  ...
  if (m) {
    const group = await app.sdb.load('Group', this.sender.name)
    if (!group) return 'Group not found'
    group.m = m
    app.sdb.update('Group', { m }, { name: this.sender.name })
  }
  ...
}
The purpose of updateInterval is still unclear. It is possibly related to the time a group member should lock its XAS.

How to find sum of a field in nested lists with a where condition?

I have two classes, one holding a nested list, and I need to sum over the nested list with a filter applied to the outer list.
Ex:
class Customer {
    string Name;
    List<Order> Orders;
    string State;
}
class Order {
    int OrderID;
    int OrderTotal;
    int ItemCode;
}
I need to find the sum of the orders for customers in a particular state; I am looking for a lambda expression for this.
Below is a lambda expression which can be used to get the sum of OrderTotal with a filter on State. Start from a populated list of customers:
List<Customer> customers = new List<Customer>();
// add some data to your Customer and Order objects here
int total = customers
    .Where(cust => cust.State.Equals("Alaska"))
    .SelectMany(cust => cust.Orders)
    .Sum(order => order.OrderTotal);
SelectMany flattens each customer's Orders, so Sum runs over the individual orders rather than over the customers.
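For comparison, the same filter-flatten-sum pipeline in Java streams (minimal stand-in classes; the field names mirror the question):

```java
import java.util.*;

public class OrderSums {
    record Order(int orderId, int orderTotal, int itemCode) {}
    record Customer(String name, List<Order> orders, String state) {}

    static int sumForState(List<Customer> customers, String state) {
        return customers.stream()
                .filter(c -> c.state().equals(state))   // the "where" condition on the outer list
                .flatMap(c -> c.orders().stream())      // flatten the nested order lists
                .mapToInt(Order::orderTotal)            // pick the field to sum
                .sum();
    }

    public static void main(String[] args) {
        List<Customer> customers = List.of(
                new Customer("a", List.of(new Order(1, 10, 0), new Order(2, 5, 0)), "Alaska"),
                new Customer("b", List.of(new Order(3, 99, 0)), "Texas"));
        System.out.println(sumForState(customers, "Alaska")); // 10 + 5
    }
}
```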

Scala objects not changing their internal state

I am seeing a problem with some Scala 2.7.7 code I'm working on that would not happen if the equivalent were written in Java. Loosely, the code creates a bunch of card players and assigns them to tables.
class Player(val playerNumber: Int)

class Table(val tableNumber: Int) {
  var players: List[Player] = List()

  def registerPlayer(player: Player) {
    println("Registering player " + player.playerNumber + " on table " + tableNumber)
    players = player :: players
  }
}

object PlayerRegistrar {
  def assignPlayersToTables(playSamplesToExecute: Int, playersPerTable: Int) = {
    val numTables = playSamplesToExecute / playersPerTable
    val tables = (1 to numTables).map(new Table(_))
    assert(tables.size == numTables)
    (0 until playSamplesToExecute).foreach { playSample =>
      val tableNumber: Int = playSample % numTables
      tables(tableNumber).registerPlayer(new Player(playSample))
    }
    tables
  }
}
The PlayerRegistrar assigns a number of players between tables. First, it works out how many tables it will need to break up the players between and creates a List of them.
Then in the second part of the code, it works out which table a player should be assigned to, pulls that table from the list and registers a new player on that table.
The list of players on a table is a var, and is overwritten each time registerPlayer() is called. I have checked that this works correctly through a simple TestNG test:
@Test def testRegisterPlayer_multiplePlayers() {
  val table = new Table(1)
  (1 to 10).foreach { playerNumber =>
    val player = new Player(playerNumber)
    table.registerPlayer(player)
    assert(table.players.contains(player))
    assert(table.players.length == playerNumber)
  }
}
I then test the table assignment:
@Test def testAssignPlayerToTables_1table() = {
  val tables = PlayerRegistrar.assignPlayersToTables(10, 10)
  assertEquals(tables.length, 1)
  assertEquals(tables(0).players.length, 10)
}
The test fails with "expected:<10> but was:<0>". I've been scratching my head, but can't work out why registerPlayer() isn't mutating the table in the list. Any help would be appreciated.
The reason is that in the assignPlayersToTables method, you are creating a new Table object. You can confirm this by adding some debugging into the loop:
val tableNumber : Int = playSample % numTables
println(tables(tableNumber))
tables(tableNumber).registerPlayer(new Player(playSample))
Yielding something like:
Main$$anon$1$Table@5c73a7ab
Registering player 0 on table 1
Main$$anon$1$Table@21f8c6df
Registering player 1 on table 1
Main$$anon$1$Table@53c86be5
Registering player 2 on table 1
Note how the memory address of the table is different for each call.
The reason for this behaviour is that a Range is non-strict in Scala (until Scala 2.8, anyway). This means that the call to the range is not evaluated until it's needed. So you think you're getting back a list of Table objects, but actually you're getting back a range which is evaluated (instantiating a new Table object) each time you call it. Again, you can confirm this by adding some debugging:
val tables = (1 to numTables).map(new Table(_))
println(tables)
Which gives you:
RangeM(Main$$anon$1$Table@5492bbba)
To do what you want, add a toList to the end:
val tables = (1 to numTables).map(new Table(_)).toList
val tables = (1 to numTables).map(new Table(_))
This line seems to be causing all the trouble: mapping over 1 to n gives you a RandomAccessSeq.Projection, and to be honest, I don't know exactly how those work, but a less clever initialisation technique does the job.
var tables: Array[Table] = new Array(numTables)
for (i <- 0 until numTables) tables(i) = new Table(i) // until, not to: valid indices are 0 to numTables - 1
Using the first initialisation method I wasn't able to change the objects (just like you), but using a simple array everything seems to be working.
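The underlying pitfall (a lazily re-evaluated view versus a materialized collection) can be sketched in Java as well. This is only an analogy to Scala 2.7's non-strict Range.map, not the same mechanism; the names below are illustrative:

```java
import java.util.*;
import java.util.function.IntFunction;
import java.util.stream.*;

public class ViewPitfall {
    static class Table {
        final int number;
        final List<Integer> players = new ArrayList<>();
        Table(int number) { this.number = number; }
    }

    public static void main(String[] args) {
        // "View": every access re-runs the mapping and builds a fresh Table,
        // so a mutation made through one access is invisible to the next.
        IntFunction<Table> view = Table::new;
        view.apply(0).players.add(42);
        System.out.println(view.apply(0).players.size()); // mutation lost

        // Materialized list: the Tables are created once and reused.
        List<Table> tables = IntStream.range(0, 3)
                .mapToObj(Table::new)
                .collect(Collectors.toList());
        tables.get(0).players.add(42);
        System.out.println(tables.get(0).players.size()); // mutation sticks
    }
}
```

This is the same reason the accepted answer's .toList fix works: it forces the range to be evaluated once, materializing the Table objects.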