RavenDB Map/Reduce with grouping by date - mapreduce

I need a query that returns post statistics per year/month, i.e. grouped by date. I created an index:
public class Posts_Count : AbstractIndexCreationTask<Post, ArchiveItem>
{
public Posts_Count()
{
Map = posts => from post in posts
select new
{
Year = post.PublishedOn.Year,
Month = post.PublishedOn.Month,
Count = 1
};
Reduce = results => from result in results
group result by new {
result.Year,
result.Month
}
into agg
select new
{
Year = agg.Key.Year,
Month = agg.Key.Month,
Count = agg.Sum(x => x.Count)
};
}
}
In the Studio, the index shows the following map and reduce functions:
Map:
docs.Posts.Select(post => new {Year = post.PublishedOn.Year, Month = post.PublishedOn.Month, Count = 1})
Reduce:
results
.GroupBy(result => new {Year = result.Year, Month = result.Month})
.Select(agg => new {Year = agg.Key.Year, Month = agg.Key.Month, Count = agg.Sum(x => ((System.Int32)(x.Count)))})
The problem is that I always get null values for the Year and Month properties:
{
"Year": null,
"Month": null,
"Count": "1"
}
Can anybody help me to resolve the issue with my code? Thank You!

Your code looks fine. I tested it and it works in the current unstable build, 1.2.2096. There has been some discussion around this lately on the RavenDB Google group, so perhaps it was broken previously. Try again with the current build and see if it works for you now.
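To check what the index is expected to return, the same map and reduce steps can be simulated outside RavenDB. The following is only a plain JavaScript sketch of the grouping logic, not RavenDB code; the sample posts are made up for illustration.

```javascript
// Hypothetical sample documents; only PublishedOn matters here.
const posts = [
  { PublishedOn: "2012-09-03T10:00:00Z" },
  { PublishedOn: "2012-09-17T08:30:00Z" },
  { PublishedOn: "2012-10-01T12:00:00Z" },
];

// Map step: one entry per post, mirroring the index's Map.
function mapPost(post) {
  const date = new Date(post.PublishedOn);
  return {
    Year: date.getUTCFullYear(),
    Month: date.getUTCMonth() + 1, // getUTCMonth() is zero-based
    Count: 1,
  };
}

// Reduce step: group by (Year, Month) and sum Count. The output has the
// same shape as the input, so it can be fed into the reduce again.
function reduceEntries(entries) {
  const groups = new Map();
  for (const { Year, Month, Count } of entries) {
    const key = `${Year}-${Month}`;
    const agg = groups.get(key) || { Year, Month, Count: 0 };
    agg.Count += Count;
    groups.set(key, agg);
  }
  return [...groups.values()];
}

const archive = reduceEntries(posts.map(mapPost));
console.log(archive);
```

Every reduced entry should carry a non-null Year and Month, so null values in the real index output point at the server build rather than at the index definition.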

How to subtract the value of another row depends on Category column in PowerBI

Here is the photo explanation
I want NewLinePrice for "Other Service" to show its value minus the value of "Graphic Design", so the first column would show 1152245, and so on.
So far I have tried to define the new column "NewLinePrice" with the following formula, but it does not work:
NewLinePrice =
Var category = 'Group-dtl'[ProductCategoryName]
var graphic_line_price = CALCULATE(SUM('Group-dtl'[LinePrice]),FILTER('Group-dtl', 'Group-dtl'[ProductCategoryName] = "Graphic Design"))
var graphic_line_price_temp = IF(category = "Graphic Design", 'Group-dtl'[LinePrice], 0)
//var graphic_line_price = 1
Var pre_result = IF(category = "Other Service", 'Group-dtl'[LinePrice] - graphic_line_price, BLANK())
Var result = IF(pre_result > 0, pre_result, BLANK())
return result
Does anyone have an idea how to do that?
I spent some time on your question and found that, to get this outcome, the calculation cannot be done within the original table; you have to create a new table instead. I don't know the reason, but at least it is achievable. See my answer below, and please accept it if it helps :)
My original table is named [Sheet1].
First I create a new table based on the original table:
Table =
ADDCOLUMNS(VALUES(Sheet1[Product Name]),
"Sales",CALCULATE(SUM(Sheet1[Amount]),
FILTER(ALL(Sheet1),Sheet1[Product Name]=EARLIER(Sheet1[Product Name]))))
In the new table, I add a new column using the following formula, which returns a different value only for "Other Service":
New Line1 =
Var ServiceValue = CALCULATE(MAX(Sheet1[Amount]),Sheet1[Product Name] = "Other Service")
Var graphicValue = CALCULATE(MAX(Sheet1[Amount]),Sheet1[Product Name] = "Graphic Design")
Var charge = ServiceValue - graphicValue
return
if('Table'[Product Name] = "Other Service", charge,'Table'[Sales])
The new table then shows the updated value for "Other Service".
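The two steps above (summarize per product, then adjust only the "Other Service" row) can be sketched in plain JavaScript to make the intended arithmetic explicit. This is an illustration and not DAX: the rows and amounts are invented, and the sketch subtracts the summed "Graphic Design" value, which matches the MAX-based formula only when each product has a single row.

```javascript
// Hypothetical rows standing in for Sheet1.
const sheet1 = [
  { productName: "Graphic Design", amount: 300 },
  { productName: "Other Service", amount: 1000 },
  { productName: "Printing", amount: 250 },
];

// Step 1: one row per product with its summed amount
// (the role of the ADDCOLUMNS/VALUES table).
function summarize(rows) {
  const totals = new Map();
  for (const { productName, amount } of rows) {
    totals.set(productName, (totals.get(productName) || 0) + amount);
  }
  return [...totals].map(([productName, sales]) => ({ productName, sales }));
}

// Step 2: for "Other Service" only, subtract the "Graphic Design" value;
// every other row keeps its own sales figure.
function addNewLine(table) {
  const graphic = table.find(r => r.productName === "Graphic Design");
  const graphicValue = graphic ? graphic.sales : 0;
  return table.map(r => ({
    ...r,
    newLine: r.productName === "Other Service" ? r.sales - graphicValue : r.sales,
  }));
}

const newTable = addNewLine(summarize(sheet1));
console.log(newTable);
```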

Replacing blank with zero for a measure gives some unexpected extra rows

I have a little puzzle that annoys me in PowerBI/DAX. I'm not looking for workarounds - I am looking for an explanation of what is going on.
I've created some sample data to reconstruct the problem.
Here are my two sample tables written in DAX:
Events =
DATATABLE (
"Course", STRING,
"WeekNo", INTEGER,
"Name", STRING,
"Status", STRING,
{
{ "Python", 1, "Joe", "OnSite" },
{ "Python", 1, "Donald", "Video" },
{ "DAX", 2, "Joe", "OnSite" },
{ "DAX", 2, "Hillary", "Video" },
{ "DAX", 3, "Joe", "OnSite" },
{ "DAX", 3, "Hillary", "OnSite" },
{ "DAX", 3, "Donald", "OnSite" }
}
)
WeekNumbers =
DATATABLE ( "WeekNumber", INTEGER, { { 1 }, { 2 }, { 3 }, { 4 } } )
I have a table with events and another table with all weeknumbers and there is a (m:1) relation between them on the weekNo/weeknumber (I've given them different names to easily distinguish them in this example). I have a slicer in PowerBI on the weeknumber. And I have a table which shows aggregation and counts the participants based on the status with the following measures:
#OnSite = COUNTROWS(FILTER(events,Events[Status]="OnSite"))
#Video = COUNTROWS(FILTER(events,Events[Status]="Video"))
I visualize the two measures in a table together with the Course and the weekNo. With the slicer on weekNumber 3, nobody has status Video, so #Video is blank. See screenshot.
Then I decided to create a new measure which should show a 0 instead of blank for the #video:
#VideoWithZero = VAR counter=COUNTROWS(FILTER(events,Events[Status]="Video"))
RETURN IF(ISBLANK(counter),0,counter)
I add the #VideoWithZero to the table and get a lot of extra rows in the table for the other weekNo's:
So my question is - Why do I get the extra rows for week 1 and 2 in the table? I would expect my filter on WeekNumber to filter them out.
The filter is being applied to the context of the query executed, and then the measures are calculated. Now the issue is that one of your measures is always returning a value (0), so regardless of your context it will always show a result, thus making it seem that it is ignoring the filter.
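A rough way to see this is to simulate the table visual. The sketch below is a simplified model in plain JavaScript, not how the engine is implemented: it assumes the candidate rows are the existing (Course, WeekNo) combinations in Events, that each measure is evaluated under the slicer filter, and that a row is dropped only when every measure is blank (modelled as null).

```javascript
const events = [
  { course: "Python", weekNo: 1, name: "Joe", status: "OnSite" },
  { course: "Python", weekNo: 1, name: "Donald", status: "Video" },
  { course: "DAX", weekNo: 2, name: "Joe", status: "OnSite" },
  { course: "DAX", weekNo: 2, name: "Hillary", status: "Video" },
  { course: "DAX", weekNo: 3, name: "Joe", status: "OnSite" },
  { course: "DAX", weekNo: 3, name: "Hillary", status: "OnSite" },
  { course: "DAX", weekNo: 3, name: "Donald", status: "OnSite" },
];

// COUNTROWS returns BLANK for an empty table; null stands in for BLANK.
const countRows = rows => (rows.length === 0 ? null : rows.length);

const measures = {
  onSite: rows => countRows(rows.filter(e => e.status === "OnSite")),
  video: rows => countRows(rows.filter(e => e.status === "Video")),
  videoWithZero: rows => {
    const counter = countRows(rows.filter(e => e.status === "Video"));
    return counter === null ? 0 : counter;
  },
};

// Simplified visual: evaluate each measure per (course, weekNo) combination
// under the slicer filter and keep the row if any measure is non-blank.
function tableVisual(slicerWeek, measureNames) {
  const combos = [...new Set(events.map(e => `${e.course}|${e.weekNo}`))];
  const rows = [];
  for (const combo of combos) {
    const [course, weekNoText] = combo.split("|");
    const weekNo = Number(weekNoText);
    const visible = events.filter(
      e => e.course === course && e.weekNo === weekNo && e.weekNo === slicerWeek
    );
    const values = measureNames.map(name => measures[name](visible));
    if (values.some(v => v !== null)) rows.push({ course, weekNo, values });
  }
  return rows;
}

console.log(tableVisual(3, ["onSite", "video"]).length);                  // 1
console.log(tableVisual(3, ["onSite", "video", "videoWithZero"]).length); // 3
```

In this model the rows for weeks 1 and 2 are still evaluated with the slicer applied (their counts are empty), but the measure that returns 0 keeps them from being dropped.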
One way I tend to implement this is by providing some additional context for when I want to show 0 instead of blank. In your case, that would be when one of the counts is not blank:
#OnSite =
VAR video = COUNTROWS(FILTER(events,Events[Status]="Video"))
VAR onsite = COUNTROWS(FILTER(events,Events[Status]="OnSite"))
RETURN IF(ISBLANK(video), onsite, onsite + 0) //+0 will force it not to be blank
#Video =
VAR video = COUNTROWS(FILTER(events,Events[Status]="Video"))
VAR onsite = COUNTROWS(FILTER(events,Events[Status]="OnSite"))
RETURN IF(ISBLANK(onsite), video, video + 0)
So on the OnSite measure it will check if there are Videos and if so, it adds +0 to the result of the OnSite count to force it not to be blank (and vice versa)
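The logic of the #Video measure above can be sketched in plain JavaScript, with null standing in for BLANK. This is only an illustration of the branching; the row objects are hypothetical.

```javascript
const BLANK = null;

// Counts rows with the given status; an empty result is BLANK, like COUNTROWS.
function countStatus(rows, status) {
  const n = rows.filter(r => r.status === status).length;
  return n === 0 ? BLANK : n;
}

// Mirrors: IF(ISBLANK(onsite), video, video + 0)
function videoMeasure(rows) {
  const video = countStatus(rows, "Video");
  const onsite = countStatus(rows, "OnSite");
  if (onsite === BLANK) return video;  // no OnSite rows: leave the video count as it is
  return video === BLANK ? 0 : video;  // BLANK + 0 evaluates to 0 in DAX
}

console.log(videoMeasure([]));                     // null: the row can still be hidden
console.log(videoMeasure([{ status: "OnSite" }])); // 0: zero shown next to an OnSite count
console.log(videoMeasure([{ status: "Video" }]));  // 1
```

Because the measure stays blank when the context contains no events at all, rows outside the slicer selection are no longer forced into the table.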
Another way could be to count the total rows and subtract the ones whose status differs from the one you need:
#OnSite =
VAR total = COUNTROWS(Events) // COUNTROWS expects a table, not a column
VAR notOnsite = COUNTROWS(FILTER(events,Events[Status]<>"OnSite"))
RETURN total - notOnsite
#Video =
VAR total = COUNTROWS(Events)
VAR notVideo = COUNTROWS(FILTER(events,Events[Status]<>"Video"))
RETURN total - notVideo

How to do a MaxBy in RavenDb MapReduce

Using the Northwind database from the RavenDB tutorial, I'm trying to group orders by employee and get the most recent order for every employee.
Map:
from order in docs.Orders
select new {
Employee = order.Employee,
Count = 1,
MostRecent = order.OrderedAt,
MostRecentOrderId = order.Id
}
Reduce with nonexisting MaxBy:
from result in results
group result by result.Employee into grp
select new {
Employee = grp.Key,
Count = grp.Sum(result => result.Count),
MostRecent = grp.Max(result => result.MostRecent),
MostRecentOrderId = grp.MaxBy(result => result.MostRecent).MostRecentOrderId,
}
Reduce attempt:
from result in results
group result by result.Employee into grp
let TempMostRecent = grp.Max(result => result.MostRecent)
select new {
Employee = grp.Key,
Count = grp.Sum(result => result.Count),
MostRecent = TempMostRecent,
MostRecentOrderId = grp.First(result => result.MostRecent == TempMostRecent).MostRecentOrderId
}
However my reduce attempt returns 0 results.
Also: will RavenDB treat Order.OrderedAt as a proper DateTime value and order the results correctly?
You need to do it like this.
Map:
from order in docs.Orders
select new {
Employee = order.Employee,
Count = 1,
MostRecent = order.OrderedAt,
MostRecentOrderId = order.Id
}
Reduce:
from result in results
group result by result.Employee into grp
let maxOrder = grp.OrderByDescending(x=>x.MostRecent).First()
select new {
Employee = grp.Key,
Count = grp.Sum(result => result.Count),
MostRecent = maxOrder.MostRecent,
MostRecentOrderId = maxOrder.MostRecentOrderId,
}
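The idea behind this reduce (sort each group by date descending and take the first entry, instead of calling a MaxBy) can be tried out in plain JavaScript. This is a sketch of the logic only, with made-up entries; dates are compared as strings here, which is chronological only because they are ISO 8601 timestamps in one format, and says nothing about how RavenDB itself compares them.

```javascript
// Hypothetical map output: one entry per order.
const mapped = [
  { Employee: "employees/1", Count: 1, MostRecent: "1996-07-04T00:00:00.0000000", MostRecentOrderId: "orders/1" },
  { Employee: "employees/2", Count: 1, MostRecent: "1996-07-05T00:00:00.0000000", MostRecentOrderId: "orders/2" },
  { Employee: "employees/1", Count: 1, MostRecent: "1996-07-08T00:00:00.0000000", MostRecentOrderId: "orders/3" },
  { Employee: "employees/2", Count: 1, MostRecent: "1996-07-02T00:00:00.0000000", MostRecentOrderId: "orders/4" },
];

// Descending comparison of two entries by their MostRecent string.
function byMostRecentDesc(a, b) {
  if (a.MostRecent < b.MostRecent) return 1;
  if (a.MostRecent > b.MostRecent) return -1;
  return 0;
}

function reduce(results) {
  // Group entries by employee.
  const groups = new Map();
  for (const result of results) {
    if (!groups.has(result.Employee)) groups.set(result.Employee, []);
    groups.get(result.Employee).push(result);
  }
  return [...groups].map(([employee, grp]) => {
    // Equivalent of grp.OrderByDescending(x => x.MostRecent).First()
    const maxOrder = [...grp].sort(byMostRecentDesc)[0];
    return {
      Employee: employee,
      Count: grp.reduce((sum, x) => sum + x.Count, 0),
      MostRecent: maxOrder.MostRecent,
      MostRecentOrderId: maxOrder.MostRecentOrderId,
    };
  });
}

console.log(reduce(mapped));
```

The date and the order id are taken from the same entry, so the result stays consistent when the reduce is applied again to its own partial results.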

Google chart Showing zero in case of missing date in bar graph

I have an object which contains "Date" and "Amount". The object holds the data for the last seven days. If any date is missing from the object, I want the bar graph to show 0 for that date.
Can someone help me with this issue?
I found the answer. In case anyone needs it, have a look at the code below:
var orders = _orderService
.GetAll(c => c.RestaurantId == restaurantId && (c.Date > DateTime.Now.AddDays(-7)))
.OrderBy(x => x.Date)
.GroupBy(item => item.Date.Date)
.OrderBy(g => g.Key)
.Select(i => new Order { Date = i.Key.Date, GrossAmount = i.Sum(w => w.GrossAmount) })
.ToList();
var from = DateTime.Now.AddDays(-7);
var to = DateTime.Now.AddDays(-1);
var days = Enumerable.Range(0, 1 + to.Subtract(from).Days)
.Select(offset => from.AddDays(offset))
.ToArray();
var data = days.Select(i =>new Order{ Date=i.Date,GrossAmount=orders.Where(p=>p.Date==i.Date).Sum(w=>w.GrossAmount)}).ToList();
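If the gap filling is done on the client side instead, the same approach (sum per day, then walk every day in the range and default to 0) might look like the JavaScript sketch below. The function name fillMissingDays and the order shape { date, grossAmount } are assumptions made for this example; days are taken in UTC.

```javascript
// Returns the UTC calendar day of a Date as "YYYY-MM-DD".
const dayKey = date => date.toISOString().slice(0, 10);

// orders: [{ date: ISO string, grossAmount: number }]
// today: Date; the result covers the numDays days before it, oldest first.
function fillMissingDays(orders, today, numDays = 7) {
  // Sum the amounts per calendar day.
  const totals = new Map();
  for (const { date, grossAmount } of orders) {
    const key = dayKey(new Date(date));
    totals.set(key, (totals.get(key) || 0) + grossAmount);
  }
  // Emit one entry per day, using 0 where no order exists.
  const result = [];
  for (let offset = numDays; offset >= 1; offset--) {
    const day = new Date(today);
    day.setUTCDate(day.getUTCDate() - offset);
    const key = dayKey(day);
    result.push({ date: key, grossAmount: totals.get(key) || 0 });
  }
  return result;
}

const sample = fillMissingDays(
  [
    { date: "2024-03-05T09:00:00Z", grossAmount: 10 },
    { date: "2024-03-05T18:30:00Z", grossAmount: 5 },
    { date: "2024-03-09T12:00:00Z", grossAmount: 20 },
  ],
  new Date("2024-03-10T12:00:00Z")
);
console.log(sample);
```

Each entry can then be turned into a [date, amount] row for the chart's data table, so every day in the range gets a bar, including the zero ones.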

Couchbase custom reduce function

I have some documents in my Couchbase with the following template:
{
"id": 102750,
"status": 5,
"updatedAt": "2014-09-10T10:50:39.297Z",
"points1": 1,
"points2": -3,
"user1": {
"id": 26522,
...
},
"user2": {
"id": 38383,
...
},
....
}
What I want to do is group the documents by user, sum the points for each user, and then show the top 100 users of the last week. I have been going in circles but haven't come up with a solution.
I have started with the following map function:
function (doc, meta) {
if (doc.user1 && doc.user2) {
emit(doc.user1.id, doc.points1);
emit(doc.user2.id, doc.points2);
}
}
and then tried sum as the reduce, but clearly I was wrong, because I wasn't able to sort on the points and I also couldn't include the date parameter.
Have a look at my example: I was able to group by date and show the values with a reduce, but I calculated the sum in my program.
See the answer to "How can I groupBy and change content of the value in couchbase?"
I have solved this issue with the help of a server-side script.
What I have done is I changed my map function to be like this:
function (doc, meta) {
if (doc.user1 && doc.user2) {
emit(dateToArray(doc.createdAt), { 'userId': doc.user1.id, 'points': doc.points1});
emit(dateToArray(doc.createdAt), { 'userId': doc.user2.id, 'points': doc.points2});
}
}
In the script, I query the view with the desired parameters, then group and sort the results and send back the top 100 users.
I am using Node.js, so my script looks like this (results is what I read from the Couchbase view):
function filterResults(results) {
debug('filtering ' + results.length + ' entries..');
// get the values
var values = _.pluck(results, 'value');
var groupedUsers = {};
// grouping users and sum their points in the games
// groupedUsers will be like the following:
// {
// '443322': 33,
// '667788': 55,
// ...
// }
// iterate over the array elements directly instead of using for...in on an array
values.forEach(function (value) {
var userId = value.userId;
var points = value.points;
if (_.has(groupedUsers, userId)) {
groupedUsers[userId] += points;
}
else {
groupedUsers[userId] = points;
}
});
// changing the groupedUsers to array form so it can be sorted by points:
// [['443322', 33], ['667788', 55], ...]
var topUsers = _.pairs(groupedUsers);
// sort descending
topUsers.sort(function(a, b) {
return b[1] - a[1];
});
debug('Number of users: ' + topUsers.length + '. Returning top 100 users');
return _.first(topUsers, 100);
}
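The "desired parameters" for the view query are not shown above. Since the view key is a date array, one plausible choice is a startkey/endkey range covering the last seven days. The sketch below builds such a range; it assumes that dateToArray emits [year, month, day, hour, minute, second] with a 1-based month in UTC, and dateToKey and lastWeekRange are hypothetical helper names, not part of any SDK.

```javascript
// Local stand-in for the view-side dateToArray helper. Assumed shape:
// [year, month (1-based), day, hour, minute, second], all in UTC.
function dateToKey(date) {
  return [
    date.getUTCFullYear(),
    date.getUTCMonth() + 1,
    date.getUTCDate(),
    date.getUTCHours(),
    date.getUTCMinutes(),
    date.getUTCSeconds(),
  ];
}

// Key range covering the seven days that end at `now`.
function lastWeekRange(now) {
  const weekInMs = 7 * 24 * 60 * 60 * 1000;
  const start = new Date(now.getTime() - weekInMs);
  return { startkey: dateToKey(start), endkey: dateToKey(now) };
}

const range = lastWeekRange(new Date("2014-09-10T10:50:39.297Z"));
console.log(JSON.stringify(range));
```

How startkey and endkey are handed to the view query depends on the SDK version in use, so that part is left out here.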