Grouping the output of a CouchDB View - mapreduce

I have a map reduce view:
.....
emit( diffYears, doc.xyz );
reduced with _sum.
xyz is then a number which is summed per integer(diffYears).
The output looks roughly like this:
4 1204.9
5 796.19
6 1124.8
7 1112.6
8 1993.62
9 159.26
10 395.41
11 456.05
12 457.97
13 39.80
14 483.68
15 269.469
etc..
What I would like to do is group the results as follows:
Grouping Total per group
0-4 1959.2 i.e add up the xyz's for years 0,1,2,3,4
5-9 3998.5 same for 5,6,7,8,9 ...etc.
10-14 3566.3
I saw a suggestion where a list was used on a view output here: Using a CouchDB view, can I count groups and filter by key range at the same time?
but have been unable to adapt it to get any kind of result.
The code given is:
{
_id: "_design/authors",
views: {
authors_by_date: {
map: function(doc) {
emit(doc.date, doc.author);
}
}
},
lists: {
count_occurrences: function(head, req) {
start({ headers: { "Content-Type": "application/json" }});
var result = {};
var row;
while(row = getRow()) {
var val = row.value;
if(result[val]) result[val]++;
else result[val] = 1;
}
return result;
}
}
}
I substituted var val = row.key in this section:
while(row = getRow()) {
var val = row.value;
if(result[val]) result[val]++;
else result[val] = 1;
}
(although in this case the result is a count.)
This seems to be the way to do it.
(It is like having a startkey and endkey for each grouping, which I can naturally do manually, but not inside a process. Or is there a way of entering multiple start and end keys into one GET request?)
This must be a fairly normal thing to do especially for researchers using statistical analysis.
I assume therefore that it does get done but I cannot locate examples
as far as CouchDB is concerned.
I would appreciate some help with this please or a pointer in the right direction.
Many thanks.
EDIT:
Perhaps the answer lies in a process in 'reduce' to group the output?

You can accomplish what you want using a complex key. The limitation is that the group size is static and needs to be defined in the view.
You'll need a simple step function to create your groups within map like:
var size = 5;
var group = ( doc.diffYears - (doc.diffYears % size)) / size;
emit( [group, doc.diffYears], doc.xyz);
The reduce function can remain _sum.
Now when you query the view use group_level to control the grouping. At group_level=0, everything will be summed and one value will be returned. At group_level=1 you'll receive your desired sums of 0-4, 5-9 etc. At group_level=2 you'll get your original output.
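As a rough illustration of what group_level=1 would return, the step function and the _sum reduce can be emulated in plain JavaScript outside CouchDB. The documents below are made up, and groupOf and sumByGroup are hypothetical helpers, not CouchDB APIs; they only mimic summing on the first element of the [group, diffYears] key.

```javascript
// Bucket size, matching `size` in the map function above.
const size = 5;

// Same step function as in the map: years 0-4 -> 0, 5-9 -> 1, 10-14 -> 2, ...
function groupOf(diffYears) {
  return (diffYears - (diffYears % size)) / size;
}

// Made-up documents standing in for the real data (assumption).
const docs = [
  { diffYears: 3, xyz: 10 },
  { diffYears: 4, xyz: 20 },
  { diffYears: 5, xyz: 1 },
  { diffYears: 9, xyz: 2 },
  { diffYears: 12, xyz: 7 },
];

// Emulates _sum at group_level=1: one total per group index.
function sumByGroup(rows) {
  const totals = {};
  for (const row of rows) {
    const group = groupOf(row.diffYears);
    totals[group] = (totals[group] || 0) + row.xyz;
  }
  return totals;
}

const totals = sumByGroup(docs);
console.log(totals);
```

Group 0 covers years 0-4, group 1 covers 5-9, and so on, so the group index can be turned back into a label such as "5-9" on the client side.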

How to delete a row, if condition on a date is met, through script (half solved)

I have some data, starting from A10 to column M, until the 59th row.
I have some dates in column F10:F that are text strings, converted to official dates in column N (here the question with the process)
M3 is set to =NOW().
In cell N3 I have: =M3+14.
I want to delete all the rows, with a date in column N10:N that comes before [today + 2 weeks] (so cell N3).
When I create a script in Apps Script, it doesn't run the if statement, but if I leave it in comments, it can go in the for loop and deletes the rows, so I'm pretty sure the problem is, again, date formatting.
In this question I ask: how do I compare the values of N10:N with N3, in order to delete all the rows that don't meet the condition if(datesNcol <= targetDate)? (in code is written as if (rowData[i] < flatArray))
I leave also a demo sheet with this problem explained in detail and two alternatives (getBackground condition and numeric days condition).
Attempts:
This is a simplified code example:
const gen = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Generatore');
const bVals = gen.getRange('B10:B').getValues();
const bFilt = bVals.filter(String);
const dataLastRow = bFilt.length;
function deleteExpired() {
dateCorrette(); //ignore, formula that puts corrected dates from N10 to dataLastRow
var dateCorrect = gen.getRange(10,14,dataLastRow,1).getValues();
var targetDate = gen.getRange('N3').getValues();
var flatArray = [].concat.apply([], targetDate);
for (var i = dateCorrect.length - 1; i >= 0; i--) {
var rowData = dateCorrect[i];
if (rowData[i] < flatArray) {
gen.deleteRow(i+10);
}
}
};
If I run the script, nothing is deleted.
If I comment out the if statement and its closing bracket, it deletes all the rows of the list one by one.
I can't manage to meet that condition.
Right now, it logs this [Sun Jan 01 10:33:20 GMT-05:00 2023] as flatArray
and this [Wed Dec 21 03:00:00 GMT-05:00 2022] as dateCorrect[49], so the first row to delete, that is the 50th (is correct for all the dateCorrect[i] dates).
I tried calling getTime() on the targetDate variable, but that only works with getValue(), not getValues(). I then don't know how to call getTime() on rowData, which comes from dateCorrect[i] and so has to use getValues(). It also doesn't accept the flatArray variable, which has to be commented out (otherwise it logs [ ] for flatArray, not the corrected date).
I leave the other attempts in the demo sheet, because I want to prioritize this problem around the date and make it clear in my head.
Thanks for all the help.
DEMO SHEET, ITA Locale time
I don't know how the demo sheet works with Apps Script, so I suggest copying the code into a personal sheet.
UPDATE:
I've also tried adding an extra column, with a built-in IF function that writes "del" if the row has to be deleted.
=IF(O10>14;"del";"")
And then
var boba = gen.getRange(10,16,bLast,1).getDisplayValues();
.
.
if (boba[i] == 'del')
This does the job. But I can't understand why the other methods don't work.
Try this. It seems like you do a lot of things that aren't necessary, unless I'm missing something.
A few notes. I typically do not use global variables unless absolutely necessary. I don't create a variable for the last row unless I have to use that value multiple times in my script; I use the method Sheet.getLastRow() instead. dateCorrect is a 2D array of 1 column, so the second index can only be [0]. And getRange('N3') is a single cell, so getValue() is good enough.
function deleteExpired() {
const gen = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Generatore');
var dateCorrect = gen.getRange(10,14,gen.getLastRow()-9,1).getValues();
var targetDate = gen.getRange('N3').getValue();
for (var i = dateCorrect.length - 1; i >= 0; i--) {
if (dateCorrect[i][0] < targetDate) {
gen.deleteRow(i+10);
}
}
}
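To see why the comparison needs the [0] index and a single target value, here is a minimal sketch that runs outside Apps Script. The arrays are made-up stand-ins for what getValues() and getValue() return, and rowsToDelete is a hypothetical helper; it only collects the sheet row numbers instead of calling deleteRow().

```javascript
// Stand-in for getRange(10, 14, n, 1).getValues(): a 2D array with one column.
const dateCorrect = [
  [new Date(2022, 11, 21)],
  [new Date(2023, 0, 10)],
  [new Date(2022, 11, 30)],
];

// Stand-in for getRange('N3').getValue(): a single Date, not an array.
const targetDate = new Date(2023, 0, 1);

// Walk the column bottom-up and collect the sheet rows whose date is before the target.
function rowsToDelete(values, target, firstRow) {
  const rows = [];
  for (let i = values.length - 1; i >= 0; i--) {
    const cell = values[i][0]; // the only column in each row
    if (cell instanceof Date && cell.getTime() < target.getTime()) {
      rows.push(i + firstRow);
    }
  }
  return rows;
}

const expired = rowsToDelete(dateCorrect, targetDate, 10);
console.log(expired); // [ 12, 10 ]
```

Comparing getTime() values makes the numeric comparison explicit, and the instanceof check skips cells that still hold text instead of a real date.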
Try this:
function delRows() {
const ss = SpreadsheetApp.getActive();
const gsh = ss.getSheetByName('Generatore');
const colB = gsh.getRange('B10:B' + gsh.getLastRow()).getValues();
var colN = gsh.getRange('N10:N' + gsh.getLastRow()).getValues();
var tdv = new Date(new Date().getFullYear(), new Date().getMonth(), new Date().getDate() + 14).valueOf();//current date + 14
let d = 0;
colN.forEach((n, i) => {
if (new Date(n[0]).valueOf() < tdv) {
gsh.deleteRow(i + 10 - d++);
}
});
}
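The forward loop above works because every deletion shifts the remaining rows up by one, which the counter d compensates for. A small simulation of that bookkeeping, using a plain array in place of the sheet (deleteForward and the sample values are hypothetical):

```javascript
// Simulated sheet: index 0 of `rows` corresponds to sheet row `firstRow`.
function deleteForward(rows, firstRow, isExpired) {
  const sheet = rows.slice();    // the "live" sheet that shrinks as rows are deleted
  const snapshot = rows.slice(); // like getValues(), read once before any deletion
  let d = 0;                     // number of rows deleted so far
  snapshot.forEach((value, i) => {
    if (isExpired(value)) {
      const sheetRow = i + firstRow - d++; // current position of this row in the live sheet
      sheet.splice(sheetRow - firstRow, 1);
    }
  });
  return sheet;
}

const remaining = deleteForward([1, 9, 2, 8, 3], 10, (v) => v < 5);
console.log(remaining); // [ 9, 8 ]
```

Without the `- d++` correction, the second and later deletions would hit the wrong rows, which is why the other answer iterates from the bottom instead.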

Replacing string values in a FeatureCollection with numbers in google earth engine

I have a FeatureCollection with a column named Dominance which has classified regions into stakeholder dominance. In this case, Dominance contains values as strings; specifically 'Small', 'Medium', 'Large' and 'Others'.
I want to replace these values/strings with 1,2,3 and 4. For that, I use the codes below:
var Shape = ee.FeatureCollection('XYZ')
var Shape_custom = Shape.select(['Dominance'])
var conditional = function(feat) {
return ee.Algorithms.If(feat.get('Dominance').eq('Small'),
feat.set({class: 1}),
feat)
}
var test = Shape_custom.map(conditional)
// This I plan to repeat for all classes
However, I am not able to change the values. The error I am getting is feat.get(...).eq is not a function.
What am I doing wrong here?
The simplest way to do this kind of mapping is using a dictionary. That way you do not need more code for each additional case.
var mapping = ee.Dictionary({
'Small': 1,
'Medium': 2,
'Large': 3,
'Others': 4
});
var mapped = Shape
.select(['Dominance'])
.map(function (feature) {
return feature.set('class', mapping.get(feature.get('Dominance')));
});
https://code.earthengine.google.com/8c58d9d24e6bfeca04e2a92b76d623a2

DynamoDB - Get all items which overlap a search time interval

My application manages a user's bookings. Each booking is composed of a start_date and an end_date, and their current partition in DynamoDB is the following:
PK SK DATA
USER#1#BOOKINGS BOOKING#1 {s: '20190601', e: '20190801'}
[GOAL] I would query all reservations which overlap a search time interval as the following:
I tried to find a solution for this issue but I found only a way to query all items inside a search time interval, which solves only this problem:
I implemented it in order to adapt it to my problem, but I didn't find a solution. Below is my implementation of "query inside interval" (this is not a DynamoDB implementation, but I will replace the isBetween function with the BETWEEN operator):
import { zip } from 'lodash';
const bookings = [
{ s: '20190601', e: '20190801', i: '' },
{ s: '20180702', e: '20190102', i: '' }
];
const search_start = '20190602'.split('');
const search_end = '20190630'.split('');
// s:20190601 e:20190801 -> i:2200119900680011
for (const b of bookings) {
b['i'] = zip(b.s.split(''), b.e.split(''))
.reduce((p, c) => p + c.join(''), '');
}
// (start_search: 20190502, end_search: 20190905) => 22001199005
const start_clause: string[] = [];
for (let i = 0; i < search_start.length; i += 1) {
if (search_start[i] === search_end[i]) {
start_clause.push(search_start[i] + search_end[i]);
} else {
start_clause.push(search_start[i]);
break;
}
}
const s_index = start_clause.join('');
// (end_search: 20190905, start_search: 20190502) => 22001199009
const end_clause: string[] = [];
for (let i = 0; i < search_end.length; i += 1) {
if (search_end[i] === search_start[i]) {
end_clause.push(search_end[i] + search_start[i]);
} else {
end_clause.push(search_end[i]);
break;
}
}
const e_index = (parseInt(end_clause.join('')) + 1).toString();
const isBetween = (s: string, e: string, v: string) => {
const sorted = [s,e,v].sort();
console.info(`sorted: ${sorted}`)
return sorted[1] === v;
}
const filtered_bookings = bookings
.filter(b => isBetween(s_index, e_index, b.i));
console.info(`filtered_bookings: ${JSON.stringify(filtered_bookings)}`)
There’s not going to be a beautiful and simple yet generic answer.
Probably the best approach is to pre-define your time period size (days, hours, minutes, seconds, whatever) and use the value of that period as the PK. Each day (or hour, or whatever) then has an item collection listing the items that touch that day, with the start time as the sort key so you can apply the inequality there, and you can use a filter on the end time attribute.
If your chosen time period is days and you need to query across a week, you'll issue seven queries. So pick a time unit that's around the same size as the time periods you query.
Remember that you need to put every item touching that day (or whatever) into that day's collection. If an item spans a week, it needs to be inserted seven times.
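A minimal sketch of the two pieces this needs, assuming the YYYYMMDD strings from the question: enumerating the day buckets an interval touches (for writing a booking into each bucket and for deciding which buckets to query), and the overlap test itself. dayKeys and overlaps are hypothetical helper names, and nothing here calls DynamoDB.

```javascript
// List every day (as YYYYMMDD) from start to end, inclusive.
function dayKeys(start, end) {
  const toDate = (s) =>
    new Date(Date.UTC(Number(s.slice(0, 4)), Number(s.slice(4, 6)) - 1, Number(s.slice(6, 8))));
  const format = (d) => d.toISOString().slice(0, 10).replace(/-/g, '');
  const last = toDate(end);
  const keys = [];
  for (let d = toDate(start); d <= last; d.setUTCDate(d.getUTCDate() + 1)) {
    keys.push(format(d));
  }
  return keys;
}

// Two closed intervals overlap when each one starts no later than the other ends.
// Plain string comparison is enough because YYYYMMDD sorts chronologically.
function overlaps(booking, searchStart, searchEnd) {
  return booking.s <= searchEnd && booking.e >= searchStart;
}

// The two bookings from the question, tested against the search 20190602..20190630.
const bookings = [
  { s: '20190601', e: '20190801' },
  { s: '20180702', e: '20190102' },
];
const hits = bookings.filter((b) => overlaps(b, '20190602', '20190630'));
console.log(hits);
console.log(dayKeys('20190629', '20190702'));
```

In a real table the sort key condition would cover `s <= searchEnd` and the filter expression would cover `e >= searchStart`; results from several day buckets would also need de-duplicating, since a long booking appears in every bucket it touches.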
Disclaimer: This is a very use-case-specific and non-general approach I took when trying to solve the same problem; it picks up on @hunterhacker's approach.
Observations from my use case:
The data I'm dealing with is financial/stock data, which spans back roughly 50 years in the past up to 150 years into the future.
I have many thousands of items per year, and I would like to avoid pulling in all 200 years of information
The vast majority of the items I want to query spans a time that fits within a year (ie. most items don't go from 30-Dec-2001 to 02-Jan-2002, but rather from 05-Mar-2005 to 10-Mar-2005)
Based on the above, I decided to add an LSI and save the relevant year for every item whose start-to-end time is within a single year. For the items that straddle a year boundary (or more), I set that LSI attribute to 0.
The querying looks like:
if query_straddles_year:
    # This doesn't happen often in my use case
    result = query_all_and_filter_after()
else:
    # Most cases end up here (looking for a single day, for instance)
    year_constrained_result = query_using_lsi_for_that_year()
    result_on_straddling_bins = query_using_lsi_marked_with_0()  # <-- this is to get any of the items that do straddle a year
    result = filter_and_combine(year_constrained_result, result_on_straddling_bins)
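For the write side, the value stored in that LSI attribute could be derived along these lines. This is only a sketch: yearBin is a hypothetical helper, and the YYYYMMDD string format is borrowed from the question rather than from my own data.

```javascript
// Returns the year when the item starts and ends in the same year,
// and 0 when it straddles a year boundary.
function yearBin(start, end) {
  const startYear = start.slice(0, 4);
  const endYear = end.slice(0, 4);
  return startYear === endYear ? Number(startYear) : 0;
}

console.log(yearBin('20050305', '20050310'));
console.log(yearBin('20011230', '20020102'));
```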

power bi DAX hierarchical table concatenation of names

I've been stuck on a problem for two days and can't solve it, so I've come here to ask for some help.
I have this bit of DAX that basically takes the path of a hierarchical table (integers) and looks up the string names of the first two items in the path.
the names I use:
'HIERARCHY' the hierarchical table with names, id, path, nbrItems, string
mytable / addedcolumn1/2 the new table used to emulate the for loop
DisplayPath =
var __Path =PATH(ParentChild[id], ParentChild[parent_id])
var __P1 = PATHITEM(__Path,1)
var __P2 = PATHITEM(__Path,2)
var l1 = LOOKUPVALUE(ParentChild[Place],ParentChild[id],VALUE(__P1))
var l2a = LOOKUPVALUE(ParentChild[Place],ParentChild[id],VALUE(__P2))
var l2 = if(ISBLANK(l2a), "", " -> " & l2a)
return CONCATENATE(l1,l2)
My problem is... I don't know the number of indexes in my path, can go from 0 to I guess 15...
I've tried some things but can't figure out a solution.
First I added a new column called nbrItems which calculate the number of items in the list of the path.
The two columns:
Then I added that bit of code that emulates a for loop depending on the number of items in the path list, and I'd like in it to
get name of parameters
concatenate them in one string that I can return and get
string =
var n = 'HIERARCHY'[nbrItems]
var mytable = GENERATESERIES(1, n)
var addedcolumn1 = ADDCOLUMNS(mytable, "nom", /* missing part: get name */)
var addedcolumn2 = ADDCOLUMNS(addedcolumn1, "string", /* missing part: concatenate previous concatenated and new name */)
var mymax = MAXX(addedcolumn2, [Value])
RETURN MAXX(FILTER(addedcolumn2, [Value] = mymax), [string])
Full table:
Thanks for your help in advance!
OK, so after some research and a lot of trial and error, I came up with a nice and simple solution.
The original problem was that I had a hierarchical table, but with all the data in the same table,
like so
What I did was, adding a new "parent" column with this dax:
parent =
var a = 'HIERARCHY'[id_parent]
var b = CALCULATE(MIN('HIERARCHY'[libelle]), FILTER(ALL('HIERARCHY'), 'HIERARCHY'[id_h] = a))
RETURN b
This gets the parent name from the id_parent (ref. screen).
then I could just use the path function, not on the id's but on the names... like so:
path = PATH('HIERARCHY'[libelle], 'HIERARCHY'[parent])
It made the problem easy because I didn't need to replace the ids with their names afterwards.
and finally to make it look nice, I used some substitution to remove the pipes:
formated_path = SUBSTITUTE('HIERARCHY'[path], "|", " -> ")
final result

Retrieve values from an array - get "cannot call value of non-function type String"

I'm trying to retrieve a value from an array, based on an index parsed from a string of digits. I'm stuck on this error, and the other answers to similar questions in this forum appear to be for more advanced developers (this is my first iOS app).
The app will eventually look up weather reports ("MAFOR" groupings of 5 digits each) from a web site, parse each group and lookup values from arrays for wind direction, speed, forecast period etc using each character.
The playground code is below, appreciate any help on where I am going wrong (look for ***)
//: Playground - noun: a place where people can play
import UIKit
var str = "Hello, playground"
// create array for Forecast Period
let forecastPeriodArray = ["Existing conditions at beginning","3 hours","6 hours","9 hours","12 hours","18 hours","24 hours","48 hours","72 hours","Occasionally"]
// create array for Wind Direction
let windDirectionArray = ["Calm","Northeast","East","Southeast","South","Southwest","West","Northwest","North","Variable"]
// create array for Wind Velocity
let windVelocityArray = ["0-10 knots","11-16 knots","17-21 knots","22-27 knots","28-33 knots","34-40 knots","41-47 knots","48-55 knots","56-63 knots","64-71 knots"]
// create array for Forecast Weather
let forecastWeatherArray = ["Moderate or good visibility (> 3 nm.)","Risk of ice accumulation (temp 0C to -5C)","Strong risk of ice accumulation (air temp < -5C)","Mist (visibility 1/2 to 3 nm.)","Fog (visibility less than 1/2 nm.)","Drizzle","Rain","Snow, or rain and snow","Squally weather with or without showers","Thunderstorms"]
// retrieve full MAFOR line of several information groups (this will be pulled from a web site)
var myMaforLineString = "11747 19741 13757 19751 11730 19731 11730 13900 11630 13637"
// split into array components wherever " " is encountered
var myMaforArray = myMaforLineString.components(separatedBy: " ")
let count = myMaforArray.count
print("There are \(count) items in the array")
// Go through each group and parse out the needed digits
for maforGroup in myMaforArray {
print("MAFOR group \(maforGroup)")
// get Forecast Period
var idx = maforGroup.index(maforGroup.startIndex, offsetBy: 1)
var periodInt = maforGroup[idx]
print("periodInt is \(periodInt)")
// *** here is where I am stuck... trying to use the periodInt index value to retrieve the description from the ForecastPeriodArray
var periodDescription = forecastPeriodArray(periodInt)
print("Forecast period = (forecastPeriodArray(periodInt)")
// get Wind Direction
idx = maforGroup.index(maforGroup.startIndex, offsetBy: 2)
var directionInt = maforGroup[idx]
print("directionInt is \(directionInt)")
// get Wind Velocity
idx = maforGroup.index(maforGroup.startIndex, offsetBy: 3)
var velocityInt = maforGroup[idx]
print("velocityInt is \(velocityInt)")
// get Weather Forecast
idx = maforGroup.index(maforGroup.startIndex, offsetBy: 4)
var weatherInt = maforGroup[idx]
print("weatherInt is \(weatherInt)")
}
@shallowThought was close.
You are trying to access an array by its index, therefore use the array[index] notation. But your index has to be of the correct type. forecastPeriodArray[periodInt] therefore does not work since periodInt is not an Int as the name would suggest. Currently it is of type Character which does not make much sense.
What you are probably trying to achieve is convert the character to an integer and use that to access the array:
var periodInt = Int(String(maforGroup[idx]))!
You might want to add error handling for the case when the character does not actually represent an integer.