DynamoDB - Get all items which overlap a search time interval - amazon-web-services

My application manages bookings of a user. These bookings are composed by a start_date and end_date, and their current partition in dynamodb is the following:
PK SK DATA
USER#1#BOOKINGS BOOKING#1 {s: '20190601', e: '20190801'}
[GOAL] I would query all reservations which overlap a search time interval as the following:
I tried to find a solution for this issue but I found only a way to query all items inside a search time interval, which solves only this problem:
I decided to make an implementation of it to try to make some change to solve my problem but I didn't found a solution, following you can find my implementation of "query inside interval" (this is not a dynamodb implementation, but I will replace isBetween function with BETWEEN operand):
import { zip } from 'lodash';
const bookings = [
{ s: '20190601', e: '20190801', i: '' },
{ s: '20180702', e: '20190102', i: '' }
];
const search_start = '20190602'.split('');
const search_end = '20190630'.split('');
// s:20190601 e:20190801 -> i:2200119900680011
for (const b of bookings) {
b['i'] = zip(b.s.split(''), b.e.split(''))
.reduce((p, c) => p + c.join(''), '');
}
// (start_search: 20190502, end_search: 20190905) => 22001199005
const start_clause: string[] = [];
for (let i = 0; i < search_start.length; i += 1) {
if (search_start[i] === search_end[i]) {
start_clause.push(search_start[i] + search_end[i]);
} else {
start_clause.push(search_start[i]);
break;
}
}
const s_index = start_clause.join('');
// (end_search: 20190905, start_search: 20190502) => 22001199009
const end_clause: string[] = [];
for (let i = 0; i < search_end.length; i += 1) {
if (search_end[i] === search_start[i]) {
end_clause.push(search_end[i] + search_start[i]);
} else {
end_clause.push(search_end[i]);
break;
}
}
const e_index = (parseInt(end_clause.join('')) + 1).toString();
const isBetween = (s: string, e: string, v: string) => {
const sorted = [s,e,v].sort();
console.info(`sorted: ${sorted}`)
return sorted[1] === v;
}
const filtered_bookings = bookings
.filter(b => isBetween(s_index, e_index, b.i));
console.info(`filtered_bookings: ${JSON.stringify(filtered_bookings)}`)

There’s not going to be a beautiful and simple yet generic answer.
Probably the best approach is to pre-define your time period size (days, hours, minutes, seconds, whatever) and use the value of that as the PK so for each day (or hour or whatever) you have in that item collection a list of the items touching that day with the sort key of the start time (so you can do the inequality there) and you can use a filter on the end time attribute.
If your chosen time period is days and you need to query across a week then you’ll issue seven queries. So pick a time unit that’s around the same size as your selected time periods.
Remember you need to put all items touching that day (or whatever) into the day collection. If an item spans a week it needs to be inserted 7 times.

Disclaimer: This is a very use-case-specific and non-general approach I took when trying to solve the same problem; it picks up on #hunterhacker 's approach.
Observations from my use case:
The data I'm dealing with is financial/stock data, which spans back roughly 50 years in the past up to 150 years into the future.
I have many thousands of items per year, and I would like to avoid pulling in all 200 years of information
The vast majority of the items I want to query spans a time that fits within a year (ie. most items don't go from 30-Dec-2001 to 02-Jan-2002, but rather from 05-Mar-2005 to 10-Mar-2005)
Based on the above, I decided to add an LSI and save the relevant year for every item whose start-to-end time is within a single year. The items that straddle a year (or more) I set that LSI with 0.
The querying looks like:
if query_straddles_year:
# This doesn't happen often in my use case
result = query_all_and_filter_after()
else:
# Most cases end up here (looking for a single day, for instance)
year_constrained_result = query_using_lsi_for_that_year()
result_on_straddling_bins = query_using_lsi_marked_with_0() # <-- this is to get any of the indexes that do straddle a year
filter_and_combine(year_constrained_result, result_on_straddling_bins)

Related

How to delete a row, if condition on a date is met, through script (half solved)

I have soma data, starting from A10 to column M, until the 59th row.
I have some dates in column F10:F that are text strings, converted to official dates in column N (here the question with the process)
M3 is set to =NOW().
In cell N3 I have: =M3+14.
I want to delete all the rows, with a date in column N10:N that comes before [today + 2 weeks] (so cell N3).
When I create a script in Apps Script, it doesn't run the if statement, but if I leave it in comments, it can go in the for loop and deletes the rows, so I'm pretty sure the problem is, again, date formatting.
In this question I ask: how do I compare the values of N10:N with N3, in order to delete all the rows that don't meet the condition if(datesNcol <= targetDate)? (in code is written as if (rowData[i] < flatArray))
I leave also a demo sheet with this problem explained in detail and two alternatives (getBackground condition and numeric days condition).
Attempts:
This is a simplified code example:
const gen = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Generatore');
const bVals = gen.getRange('B10:B').getValues();
const bFilt = bVals.filter(String);
const dataLastRow = bFilt.length;
function deleteExpired() {
dateCorrette(); //ignore, formula that puts corrected dates from N10 to dataLastRow
var dateCorrect = gen.getRange(10,14,dataLastRow,1).getValues();
var targetDate = gen.getRange('N3').getValues();
var flatArray = [].concat.apply([], targetDate);
for (var i = dateCorrect.length - 1; i >= 0; i--) {
var rowData = dateCorrect[i];
if (rowData[i] < flatArray) {
gen.deleteRow(i+10);
}
}
};
If run the script, nothing is deleted.
If I //comment the if function and the closing bracket, it delets all the rows of the list one by one.
I can't manage to meet that condition.
Right now, it logs this [Sun Jan 01 10:33:20 GMT-05:00 2023] as flatArray
and this [Wed Dec 21 03:00:00 GMT-05:00 2022] as dateCorrect[49], so the first row to delete, that is the 50th (is correct for all the dateCorrect[i] dates).
I tried putting a getTime() method in the targetDate variable, but it only functions if there is the getValue() method, not getValues(), so I then don't know how to use getTime() method on rowData, which is based on dateCorrected[i], which have to use the getValues() method. And then it also doesn't accept the flatArray variable, that has to be commented out (or it logs [ ] for flatArray, not the corrected date)
I leave the other attempts in the demo sheet, because I want to prioritize this problem around the date and make it clear in my head.
Thanks for all the help.
DEMO SHEET, ITA Locale time
I don't know how the demo sheet works with Apps Script, I suggest to copy the code in a personal sheet
UPDATE:
I've also tried putting an extra column, with an IF built-in function that writes "del" if the function has to be deleted.
=IF(O10>14;"del";"")
And then
var boba = gen.getRange(10,16,bLast,1).getDisplayValues();
.
.
if (boba[i] == 'del')
This does the job. But I can't understand why the other methods don't work.
Try this. It seems like you do a lot of things that aren't necessary. Unless I'm missing something.
A few notes. I typically do not use global variable, unless absolutely necessary. I don't create a variable for last row unless I have to use that value multiple times in my script. I use the method Sheet.getLastRow(). dataCorrect is a 2D array of 1 column so the second index can only be [0]. And getRange('N4') is a single cell so getValue() is good enough.
function deleteExpired() {
const gen = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Generatore');
var dateCorrect = gen.getRange(10,14,gen.getLastRow()-9,1).getValues();
var targetDate = gen.getRange('N3').getValue();
for (var i = dateCorrect.length - 1; i >= 0; i--) {
if (dataCorrect[i][0] < targetDate) {
gen.deleteRow(i+10);
}
}
}
Try this:
function delRows() {
const ss = SpreadsheetApp.getActive();
const gsh = ss.getSheetByName('Generatore');
const colB = gsh.getRange('B10:B' + gsh.getLastRow()).getValues();
var colN = gsh.getRange('N10:N' + gsh.getLastRow()).getValues();
var tdv = new Date(new Date().getFullYear(), new Date().getMonth(), new Date().getDate() + 14).valueOf();//current date + 14
let d = 0;
colN.forEach((n, i) => {
if (new Date(n).valueOf() < tdv) {
gsh.deleteRow(i + 10 - d++);
}
});
}

Grouping the output of a CouchDB View

I have a map reduce view:
.....
emit( diffYears, doc.xyz );
reduced with _sum.
xyz is then a number which is summed per integer(diffYears).
The output looks roughly like this:
4 1204.9
5 796.19
6 1124.8
7 1112.6
8 1993.62
9 159.26
10 395.41
11 456.05
12 457.97
13 39.80
14 483.68
15 269.469
etc..
What I would like to do is group the results as follows:
Grouping Total per group
0-4 1959.2 i.e add up the xyz's for years 0,1,2,3,4
5-9 3998.5 same for 5,6,7,8,9 ...etc.
10-14 3566.3
I saw a suggestion where a list was used on a view output here: Using a CouchDB view, can I count groups and filter by key range at the same time?
but have been unable to adapt it to get any kind of result.
The code given is:
{
_id: "_design/authors",
views: {
authors_by_date: {
map: function(doc) {
emit(doc.date, doc.author);
}
}
},
lists: {
count_occurrences: function(head, req) {
start({ headers: { "Content-Type": "application/json" }});
var result = {};
var row;
while(row = getRow()) {
var val = row.value;
if(result[val]) result[val]++;
else result[val] = 1;
}
return result;
}
}
}
I substituted var val = row.key in this section:
while(row = getRow()) {
var val = row.value;
if(result[val]) result[val]++;
else result[val] = 1;
}
(although in this case the result is a count.)
This seems to be the way to do it.
(It is like having a startkey and endkey for each grouping which I can do manually, naturally, but not inside a process. Or is there a way of entering multiple start- and endkeys into one GET command???? )
This must be a fairly normal thing to do especially for researchers using statistical analysis.
I assume therefore that it does get done but I cannot locate examples
as far as CouchDB is concerned.
I would appreciate some help with this please or a pointer in the right direction.
Many thanks.
EDIT:
Perhaps the answer lies in a process in 'reduce' to group the output??
You can accomplish what you want using a complex key. The limitation is that the group size is static and needs to be defined in the view.
You'll need a simple step function to create your groups within map like:
var size = 5;
var group = ( doc.diffYears - (doc.diffYears % size)) / size;
emit( [group, doc.diffYears], doc.xyz);
The reduce function can remain _sum.
Now when you query the view use group_level to control the grouping. At group_level=0, everything will be summed and one value will be returned. At group_level=1 you'll receive your desired sums of 0-4, 5-9 etc. At group_level=2 you'll get your original output.

Retrieve values from an array - get "cannot call value of non-function type String"

I'm trying to retrieve a value from an array, based on an index parsed from a string of digits. I'm stuck on this error, and the other answers to similar questions in this forum appear to be for more advanced developers (this is my first iOS app).
The app will eventually look up weather reports ("MAFOR" groupings of 5 digits each) from a web site, parse each group and lookup values from arrays for wind direction, speed, forecast period etc using each character.
The playground code is below, appreciate any help on where I am going wrong (look for ***)
//: Playground - noun: a place where people can play
import UIKit
var str = "Hello, playground"
// create array for Forecast Period
let forecastPeriodArray = ["Existing conditions at beginning","3 hours","6 hours","9 hours","12 hours","18 hours","24 hours","48 hours","72 hours","Occasionally"]
// create array for Wind Direction
let windDirectionArray = ["Calm","Northeast","East","Southeast","South","Southwest","West","Northwest","North","Variable"]
// create array for Wind Velocity
let windVelocityArray = ["0-10 knots","11-16 knots","17-21 knots","22-27 knots","28-33 knots","34-40 knots","41-47 knots","48-55 knots","56-63 knots","64-71 knots"]
// create array for Forecast Weather
let forecastWeatherArray = ["Moderate or good visibility (> 3 nm.","Risk of ice accumulation (temp 0C to -5C","Strong risk of ice accumulkation (air temp < -5C)","Mist (visibility 1/2 to 3 nm.)","Fog (visibility less than 1/2 nm.)","Drizzle","Rain","Snow, or rain and snow","Squally weather with or without showers","Thunderstorms"]
// retrieve full MAFOR line of several information groups (this will be pulled from a web site)
var myMaforLineString = "11747 19741 13757 19751 11730 19731 11730 13900 11630 13637"
// split into array components wherever " " is encountered
var myMaforArray = myMaforLineString.components(separatedBy: " ")
let count = myMaforArray.count
print("There are \(count) items in the array")
// Go through each group and parse out the needed digits
for maforGroup in myMaforArray {
print("MAFOR group \(maforGroup)")
// get Forecast Period
var idx = maforGroup.index(maforGroup.startIndex, offsetBy: 1)
var periodInt = maforGroup[idx]
print("periodInt is \(periodInt)")
// *** here is where I am stuck... trying to use the periodInt index value to retrieve the description from the ForecastPeriodArray
var periodDescription = forecastPeriodArray(periodInt)
print("Forecast period = (forecastPeriodArray(periodInt)")
// get Wind Direction
idx = maforGroup.index(maforGroup.startIndex, offsetBy: 2)
var directionInt = maforGroup[idx]
print("directionInt is \(directionInt)")
// get Wind Velocity
idx = maforGroup.index(maforGroup.startIndex, offsetBy: 3)
var velocityInt = maforGroup[idx]
print("velocityInt is \(velocityInt)")
// get Weather Forecast
idx = maforGroup.index(maforGroup.startIndex, offsetBy: 4)
var weatherInt = maforGroup[idx]
print("weatherInt is \(weatherInt)")
}
#shallowThought was close.
You are trying to access an array by its index, therefore use the array[index] notation. But your index has to be of the correct type. forecastPeriodArray[periodInt] therefore does not work since periodInt is not an Int as the name would suggest. Currently it is of type Character which does not make much sense.
What you are probably trying to achieve is convert the character to an integer and use that to access the array:
var periodInt = Int(String(maforGroup[idx]))!
You might want to add error handling for the case when the character does not actually represent an integer.

Insertion order of a list based on order of another list

I have a sorting problem in Scala that I could certainly solve with brute-force, but I'm hopeful there is a more clever/elegant solution available. Suppose I have a list of strings in no particular order:
val keys = List("john", "jill", "ganesh", "wei", "bruce", "123", "Pantera")
Then at random, I receive the values for these keys at random (full-disclosure, I'm experiencing this problem in an akka actor, so events are not in order):
def receive:Receive = {
case Value(key, otherStuff) => // key is an element in keys ...
And I want to store these results in a List where the Value objects appear in the same order as their key fields in the keys list. For instance, I may have this list after receiving the first two Value messages:
List(Value("ganesh", stuff1), Value("bruce", stuff2))
ganesh appears before bruce merely because he appears earlier in the keys list. Once the third message is received, I should insert it into this list in the correct location per the ordering established by keys. For instance, on receiving wei I should insert him into the middle:
List(Value("ganesh", stuff1), Value("wei", stuff3), Value("bruce", stuff2))
At any point during this process, my list may be incomplete but in the expected order. Since the keys are redundant with my Value data, I throw them away once the list of values is complete.
Show me what you've got!
I assume you want no worse than O(n log n) performance. So:
val order = keys.zipWithIndex.toMap
var part = collection.immutable.TreeSet.empty[Value](
math.Ordering.by(v => order(v.key))
)
Then you just add your items.
scala> part = part + Value("ganesh", 0.1)
part: scala.collection.immutable.TreeSet[Value] =
TreeSet(Value(ganesh,0.1))
scala> part = part + Value("bruce", 0.2)
part: scala.collection.immutable.TreeSet[Value] =
TreeSet(Value(ganesh,0.1), Value(bruce,0.2))
scala> part = part + Value("wei", 0.3)
part: scala.collection.immutable.TreeSet[Value] =
TreeSet(Value(ganesh,0.1), Value(wei,0.3), Value(bruce,0.2))
When you're done, you can .toList it. While you're building it, you probably don't want to, since updating a list in random order so that it is in a desired sorted order is an obligatory O(n^2) cost.
Edit: with your example of seven items, my solution takes about 1/3 the time of Jean-Philippe's. For 25 items, it's 1/10th the time. 1/30th for 200 (which is the difference between 6 ms and 0.2 ms on my machine).
If you can use a ListMap instead of a list of tuples to store values while they're gathered, this could work. ListMap preserves insertion order.
class MyActor(keys: List[String]) extends Actor {
def initial(values: ListMap[String, Option[Value]]): Receive = {
case v # Value(key, otherStuff) =>
if(values.forall(_._2.isDefined))
context.become(valuesReceived(values.updated(key, Some(v)).collect { case (_, Some(v)) => v))
else
context.become(initial(keys, values.updated(key, Some(v))))
}
def valuesReceived(values: Seq[Value]): Receive = { } // whatever you need
def receive = initial(keys.map { k => (k -> None) })
}
(warning: not compiled)

Scala objects not changing their internal state

I am seeing a problem with some Scala 2.7.7 code I'm working on, that should not happen if it the equivalent was written in Java. Loosely, the code goes creates a bunch of card players and assigns them to tables.
class Player(val playerNumber : Int)
class Table (val tableNumber : Int) {
var players : List[Player] = List()
def registerPlayer(player : Player) {
println("Registering player " + player.playerNumber + " on table " + tableNumber)
players = player :: players
}
}
object PlayerRegistrar {
def assignPlayersToTables(playSamplesToExecute : Int, playersPerTable:Int) = {
val numTables = playSamplesToExecute / playersPerTable
val tables = (1 to numTables).map(new Table(_))
assert(tables.size == numTables)
(0 until playSamplesToExecute).foreach {playSample =>
val tableNumber : Int = playSample % numTables
tables(tableNumber).registerPlayer(new Player(playSample))
}
tables
}
}
The PlayerRegistrar assigns a number of players between tables. First, it works out how many tables it will need to break up the players between and creates a List of them.
Then in the second part of the code, it works out which table a player should be assigned to, pulls that table from the list and registers a new player on that table.
The list of players on a table is a var, and is overwritten each time registerPlayer() is called. I have checked that this works correctly through a simple TestNG test:
#Test def testRegisterPlayer_multiplePlayers() {
val table = new Table(1)
(1 to 10).foreach { playerNumber =>
val player = new Player(playerNumber)
table.registerPlayer(player)
assert(table.players.contains(player))
assert(table.players.length == playerNumber)
}
}
I then test the table assignment:
#Test def testAssignPlayerToTables_1table() = {
val tables = PlayerRegistrar.assignPlayersToTables(10, 10)
assertEquals(tables.length, 1)
assertEquals(tables(0).players.length, 10)
}
The test fails with "expected:<10> but was:<0>". I've been scratching my head, but can't work out why registerPlayer() isn't mutating the table in the list. Any help would be appreciated.
The reason is that in the assignPlayersToTables method, you are creating a new Table object. You can confirm this by adding some debugging into the loop:
val tableNumber : Int = playSample % numTables
println(tables(tableNumber))
tables(tableNumber).registerPlayer(new Player(playSample))
Yielding something like:
Main$$anon$1$Table#5c73a7ab
Registering player 0 on table 1
Main$$anon$1$Table#21f8c6df
Registering player 1 on table 1
Main$$anon$1$Table#53c86be5
Registering player 2 on table 1
Note how the memory address of the table is different for each call.
The reason for this behaviour is that a Range is non-strict in Scala (until Scala 2.8, anyway). This means that the call to the range is not evaluated until it's needed. So you think you're getting back a list of Table objects, but actually you're getting back a range which is evaluated (instantiating a new Table object) each time you call it. Again, you can confirm this by adding some debugging:
val tables = (1 to numTables).map(new Table(_))
println(tables)
Which gives you:
RangeM(Main$$anon$1$Table#5492bbba)
To do what you want, add a toList to the end:
val tables = (1 to numTables).map(new Table(_)).toList
val tables = (1 to numTables).map(new Table(_))
This line seems to be causing all the trouble - mapping over 1 to n gives you a RandomAccessSeq.Projection, and to be honest, I don't know how exactly they work, but a bit less clever initialising technique does the job.
var tables: Array[Table] = new Array(numTables)
for (i <- 0 to numTables) tables(i) = new Table(i)
Using the first initialisation method I wasn't able to change the objects (just like you), but using a simple array everything seems to be working.