Creating Relationships while avoiding ambiguities - powerbi

I have a flat table like this,
R#  Cat  SWN  CWN  CompBy  ReqBy   Department
1   A    1    1    Team A  Team B  Department 1
2   A    1    3    Team A  Team B  Department 1
3   B    1    3    Team A  Team B  Department 1
4   B    2    3    Team A  Team C  Department 1
5   B    2    3    Team D  Team C  Department 2
6   C    2    2    Team D  Team C  Department 2
R# indicates the Request Number,
Cat indicates the Category,
SWN indicates the Submitted Week Number,
CWN indicates the Completed Week Number,
CompBy indicates Completed By,
ReqBy indicates Requested By,
Department indicates the Department Name.
I would like to create a data model that avoids ambiguity and at the same time allows me to report on Category, SWN, CWN (needs to be only a week number), CompBy, ReqBy, and Department through a single filter.
For example, the dashboard will have a single filter choice to select a week number. If that week number is selected, it will show the details of these requests from submitted and completed week number. I understand this requires the creation of a calendar table or something like that.
I am looking for a data model that explains the cardinality and direction (single or both). If possible, kindly post the PBIX file and share the link here.
What I have tried: I was not able to establish one of the four connections.
Update: I am offering a bounty for this question because I would like to see what the star schema will look like for this flat table.
One of the reasons I am looking for a star schema over a flat table is this: for example, a restaurant menu is a dimension and the purchased food is a fact. If you combined these into one table, how would you identify which food has never been ordered? For that matter, prior to your first order, how would you identify what food was available on the menu?

The scope of your question is rather unclear, so I'm just addressing this part of the post:
the dashboard will have a single filter choice to select a week number. If that week number is selected, it will show the details of these requests from submitted and completed week number.
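One way to get OR logic is to use a disconnected parameter table and write your measures using the parameters selected. For example, consider a schema in which WeekDimension, with its WN column, is a disconnected table that has no relationship to MasterTable.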
If you put WN on a slicer, then you can write a measure to filter the table based on the number selected.
WN Filter =
IF(
    COUNTROWS(
        INTERSECT(
            VALUES(WeekDimension[WN]),
            UNION(VALUES(MasterTable[SWN]), VALUES(MasterTable[CWN]))
        )
    ) > 0,
    1, 0
)
Then, if you use that measure as a visual-level filter (keeping only rows where WN Filter is 1), you can see all the records whose SWN or CWN corresponds to your WN selection.
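To make the OR logic concrete, here is a minimal Python sketch that mirrors what the measure does row by row, using the six sample rows from the question. It is only an illustration of the filtering rule, not Power BI code; the function names `wn_filter` and `visible_requests` are hypothetical.

```python
# Sample rows from the question's flat table: (R#, SWN, CWN).
requests = [
    (1, 1, 1),
    (2, 1, 3),
    (3, 1, 3),
    (4, 2, 3),
    (5, 2, 3),
    (6, 2, 2),
]


def wn_filter(selected_weeks, swn, cwn):
    """Return 1 if the row's SWN or CWN is among the selected weeks, else 0.

    This mirrors INTERSECT(selected WN, UNION(SWN, CWN)) being non-empty.
    """
    return 1 if selected_weeks & {swn, cwn} else 0


def visible_requests(selected_weeks):
    """Request numbers that survive a visual-level filter of WN Filter = 1."""
    return [r for r, swn, cwn in requests if wn_filter(selected_weeks, swn, cwn) == 1]


print(visible_requests({1}))  # [1, 2, 3]
print(visible_requests({3}))  # [2, 3, 4, 5]
```

Selecting week 3 returns requests 2 to 5 even though none of them was submitted in week 3, which is the behaviour a single relationship to either SWN or CWN could not give.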
If you can clarify your question so that it comes closer to an MCVE (minimal, complete, verifiable example), then you'll likely get better responses. I can't quite determine the specific idea you're having trouble with.

Related

Unable to get desired results when trying to slice for multiple accounts

In my sales report, I have run into a problem where sales for some customers are recorded under another customer, and I need to attribute those sales to both in order to show accurate values.
Pictured below are the tables in question. tblTicketDetails is my fact table, and the other two are my dimension tables.
Here is an example of the accounts in question:
In the above example, all values in [Account #] belong to customer 1 and values in [LookupAccount] belong to others (let's say customers 2 and 3).
I typically use [Customer Name] as the value in my slicer, as all accounts for a customer are named the same.
The outcome I have been trying to obtain is this: when I slice by customer 1, I get all values shown in [Account #] that have customer 1 as the name, but if I slice by customer 2 (who, let's say, has one account, 5809), I get all [Account #] values that equal 5809 as well as all [Account #] values whose [LookupAccount] is also 5809 (so I would also get 23634, 37765, 67804 and 95511).
The same applies to customers with multiple accounts, so if I slice by customer 3, whose accounts are, let's say, 14650, 89670 and 47900, I would get those results from [Account #] as well as the accounts where [LookupAccount] matches, which in this case would be 19734, 28199 and 64218.
I have tried changing the relationship between tblCustomers and tblTicketDetails. I have also added the table LookupAccount as an in-between for the many-to-many relationship, but none of those changes has affected the results I get (or they cause the visuals to go blank).
What am I missing?
Based on clarification from the author, here is my updated answer.
There are three objects in this model:
Sales in tblTicketDetails
Customers in tblCustomers
A matching table between the two, currently LookupAccount (the table)
Here is a summary of the rules at stake:
A sale has an account number.
A customer is identified by an account number; let's call it the primary account.
A customer might use another account number in some situations; let's call it the secondary account.
The goal is: when filtering on a customer, all sales are returned, whether they were made through the primary account or a secondary account.
The matching table LookupAccount needs rework for the model to function properly. For the sake of clarity, its columns are renamed:
Account# → Primary
LookupAccount → Secondary
Here is an example:
Primary  Secondary
1        2
1        3
2        4
In that example:
Client account 1 has sales on accounts: 1, or 2, or 3.
Client account 2 has sales on accounts: 2, or 4.
In Power Query, we want to apply the following transformations:
Duplicate the Primary column and call the copy CustomerAccount. This is the foreign key that will link to tblCustomers.
Unpivot the Primary and Secondary columns, renaming the value column to ReportingAccount. This will be the foreign key to tblTicketDetails.
The end result should look like this:
CustomerAccount  Attribute  ReportingAccount
1                Primary    1
1                Secondary  2
1                Primary    1
1                Secondary  3
2                Primary    2
2                Secondary  4
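The two Power Query steps can be sketched in plain Python to check that they produce the table above. This is only an illustration of the reshaping under the example data, not Power Query (M) code; the helper names `unpivot_lookup` and `reporting_accounts` are hypothetical.

```python
# Matching table from the example: (Primary, Secondary).
lookup = [(1, 2), (1, 3), (2, 4)]


def unpivot_lookup(rows):
    """Duplicate Primary as CustomerAccount, then unpivot Primary/Secondary.

    Each input row yields two output rows:
    (CustomerAccount, Attribute, ReportingAccount).
    """
    out = []
    for primary, secondary in rows:
        customer_account = primary  # the duplicated Primary column
        out.append((customer_account, "Primary", primary))
        out.append((customer_account, "Secondary", secondary))
    return out


bridge = unpivot_lookup(lookup)


def reporting_accounts(customer_account):
    """Distinct accounts whose sales should roll up to the given customer."""
    return sorted({rep for cust, _, rep in bridge if cust == customer_account})


print(reporting_accounts(1))  # [1, 2, 3]
print(reporting_accounts(2))  # [2, 4]
```

The result for each customer matches the account lists stated earlier (1, 2, 3 for client account 1 and 2, 4 for client account 2). The repeated (1, Primary, 1) row is a by-product of unpivoting two rows that share the same primary account.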
Then, you modify the relationships in your model:
tblTicketDetails is connected to LookupAccount (table) from field Account# to ReportingAccount
LookupAccount (table) is connected to tblCustomers from field CustomerAccount to Account#
Normally, filters should flow better through this model.
Now you can create a measure to calculate the amount of sales per customer:
SalesperCustomer =
CALCULATE(
    SUM(tblTicketDetails[Total])
)
Or, for example, just the sales through the primary account:
SalesperCustomer_primary =
CALCULATE(
    SUM(tblTicketDetails[Total]),
    LookupAccount[Attribute] = "Primary"
)

Changing organization of data so that each observation represents a new variable (I tried)

I am working in Stata with a dataset on electric vehicle charging stations. Variables include
station_name name of charging station
review_text all of the customer reviews for a specific station delimited by }{
num_reviews number of customer reviews.
I'm trying to make a new file where each observation represents one customer review in a new variable customer_review and another variable station_id has the name of the corresponding station. So, if the original dataset had 100 observations (one per station) with 5 reviews each, the new file should have 500 observations.
How can I do this? I would include some code I have tried but I have no idea how to start.
If your data look like this:
     station   reviews                n
  1.       1   {good}{bad}{great}     3
  2.       2   {poor}{excellent}      2
Then the following:
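* split the review string into reviews1, reviews2, ... at each "}{" boundary
split reviews, parse("}{")
drop reviews n
* go from one row per station to one row per (station, review)
reshape long reviews, i(station) j(review_num)
drop if reviews==""
* strip the leftover outer braces
replace reviews = subinstr(reviews, "}","",.)
replace reviews = subinstr(reviews, "{","",.)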
will produce:
     station   review~m   reviews
  1.       1          1   good
  2.       1          2   bad
  3.       1          3   great
  4.       2          1   poor
  5.       2          2   excellent

Power BI DAX - sum all of one column but keyed off different table

This FEELS like something that can be done but I am at a loss for how to do it.
I have a table that has applicants for jobs...
name, requisition id, division, date applied, date hired
Each row is an applicant. Obviously not all applicants are hired. So in every row all fields are filled out with the exception of date hired for applicants that have not been hired.
I have slicers for month/quarter/year and division.
The date slicers all key off a field in every table called data_as_of which is the last day of the month with a one-to-many relationship with a date dimension table.
Here is a sample table: https://i.stack.imgur.com/XQO9d.png
So here is what I'd like to do.
I'd like to slice by year and show a visual of all people hired in that year. The same goes for quarter and month (i.e. count all people hired in that quarter or month, as appropriate). So far so good. That's easy.
Now on the same report page I'd like to show a visual (assume bar charts) that shows me a count of all the people that applied to the same requisition id prior to the date hired of whomever was hired in that requisition id.
Using the example above...
All of these examples assume 2021.
So if I set the month slicer to December, I'd get 2 hires in HR, Diane and Mel. In the second visual I'd get 7 applicants.
If I set the month slicer to November, I'd get two hires, Rhys and Jody. The applicant visual would show me 8 applicants: all 6 from requisition id 4, and 2 from requisition id 2, because one applied after Rhys was hired.
Likewise, if I sliced for April of 2021, I'd get 1 hire, Remi. In the applicant visual I'd get 4 applicants who all applied prior to Remi's hire date (including Morgan, who applied in March but wasn't hired until May).
Does that all make sense?
I very much appreciate your help.
Best regards,
~Don

How to deal with multiple ids multiple categories table to reach THIS on Power BI

I have a problem that I have been trying to solve for 3 days without success.
I have the following tables:
Companies
company_id  sales
1           2000
2           3000
3           4000
4           1000
Categories
company_id  category
1           medical
1           sports
2           industrial
3           consumption
4           medical
4           consumption
All I want is a COLUMN CHART with a CATEGORY SLICER where I choose the category and see the TOP 5 companies by sales within that category. Yes, in this example the top N is not needed, but in my real case I have 400 companies, so I want to:
Select and show only the required category.
In that category, show only the 5 best companies by sales.
The problem is that Power BI evaluates the Top N filter over all companies, so if I choose a category and also apply a top 5, any companies in that category that are not in the overall top 5 are not shown at all.
Thanks!
If you always want to show the same Top N values in your visual, you can use the filter pane to achieve that.
Below is a walkthrough:
To add the Top N filtering, I added the following:
The interface is in Dutch, so a little clarification:
I added a 'filter on this visual'.
I selected Populairste N, which is Top N.
As the value, I dragged and dropped the maximum of sales.
Results:
Things to keep in mind:
You are using a many-to-many relationship, so make sure it is active in the Power BI model.
Make sure the direction of filtering is from category to sales; otherwise the slicer will not work.
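Independently of the relationship settings, the behaviour the question asks for (filter to the chosen category first, then rank only the remaining companies) can be sketched in Python on the sample tables. This is a rough illustration of the intended evaluation order, not Power BI code; the function name `top_n_in_category` is hypothetical.

```python
# Sample tables from the question.
companies = {1: 2000, 2: 3000, 3: 4000, 4: 1000}  # company_id -> sales
categories = [
    (1, "medical"),
    (1, "sports"),
    (2, "industrial"),
    (3, "consumption"),
    (4, "medical"),
    (4, "consumption"),
]


def top_n_in_category(category, n=5):
    """Filter companies to one category first, then keep the n with highest sales."""
    ids = {cid for cid, cat in categories if cat == category}
    ranked = sorted(ids, key=lambda cid: companies[cid], reverse=True)
    return [(cid, companies[cid]) for cid in ranked[:n]]


print(top_n_in_category("medical", n=1))  # [(1, 2000)]
print(top_n_in_category("consumption"))   # [(3, 4000), (4, 1000)]
```

Company 1 is the top medical company even though it is only third by sales overall, which is the case the question describes: ranking has to happen after the category filter, not before it.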

Creating and doing Market basket analysis from raw data

I have a data set containing many items and their sales data, in terms of amount and quantity sold, rolled up per week. I want to find out whether there is any correlation between items, that is, to assess whether the sales of one item affect the sales of another, positively or negatively.
Consider the following type of data:
Week #  Product #  Sale($)  Quantity
Week 1  Product 1  1        1
        Product 2  2        1
        Product 3  3        1
Week 2  Product 1  3        2
        Product 3  2        1
        Product 6  2        2
Week 3  Product 4  2        1
        Product 3  1        2
        Product 5  4        2
So, from the above weekly data, I want to figure out how to convert it into a form of market basket data using the parameters available to me, since no actual market basket data is available.
The parameters I could think of are:
To use the count of occurrences of each product in a given week.
To use the total quantity sold.
To use the total sales to find correlation.
So, basically, I have to come up with a way to measure how one item is correlated with another, or the affinity of one product with another, regardless of whether the correlation is positive or negative. The only issue is that I do not have any primary key to bind the items to a basket or an order number, since the sales are rolled up.
Any answers or help on this topic would be highly appreciated. If you find the question incomplete, let me know and I will clarify further.
You can't do this because you have no information about the co-occurrence. You also have data muddled from daily grain to weekly grain. Aggregates won't permit this.