I need a little help since I'm new to Power BI. I have a data set that records what was eaten on a specific day. At the end there are columns that show whether the overall feeling was better the day before (so on this specific day it got worse). There are up to 30 ingredients and flags for up to 5 days before. A "1" in, e.g., Day2 means the condition "2 days before it got worse" is TRUE.
The data looks like this:
Data set example
Now I want to retrieve and add up all ingredients that were eaten on "Day1", "Day2" and so on, so I can see which food may be causing problems: it should appear more often in the days before, or at least appear in every such case. How do I achieve this?
For example, I could then see that on Day2 the ingredient "Apple" appears very often, so one could assume that Apple is not good for this person.
I tried to pivot the table, and also to split the "DayX" columns into another table and create a relationship between the two, but nothing adds things up the way I want.
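The aggregation described above can be sketched outside Power BI to make the intended result concrete. This is only an illustration in plain Python: the row layout (one 0/1 column per ingredient plus the DayN flags), the sample values, and the name `count_by_flag` are all assumptions, not taken from the actual data set.

```python
from collections import Counter

# Hypothetical sample rows: one 0/1 column per ingredient, plus DayN flags.
# A 1 in DayN means this day was N days before a day on which it got worse.
rows = [
    {"Apple": 1, "Rice": 1, "Chicken": 0, "Day1": 1, "Day2": 0},
    {"Apple": 1, "Rice": 0, "Chicken": 1, "Day1": 0, "Day2": 1},
    {"Apple": 1, "Rice": 1, "Chicken": 0, "Day1": 0, "Day2": 1},
    {"Apple": 0, "Rice": 1, "Chicken": 0, "Day1": 0, "Day2": 0},
]

def count_by_flag(rows, ingredient_cols, flag_cols):
    """For each DayN column, sum every ingredient over the rows flagged with 1."""
    totals = {flag: Counter() for flag in flag_cols}
    for row in rows:
        for flag in flag_cols:
            if row.get(flag, 0) == 1:
                for ingredient in ingredient_cols:
                    totals[flag][ingredient] += row.get(ingredient, 0)
    return totals

totals = count_by_flag(rows, ["Apple", "Rice", "Chicken"], ["Day1", "Day2"])
for flag, counter in totals.items():
    print(flag, counter.most_common())
```

The same shape (one total per ingredient per DayN flag) is what the report would need to show, whichever tool produces it.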
I have a simple django project and I am trying to keep track of ranks for certain objects to see how they change over time. For example, what was the rank of US GDP (compared to other countries) over last 3 years. Below is the postgres db structure I am working with:
Below is what I am trying to achieve:
What I find challenging is that the previous period's value may not exist, and the entity itself may not even be present in the previous period. The period can be a year, quarter or month, but for a specific record it is only one of these and stays the same across all years for that record.
Can someone guide me in the right direction to write a query to achieve those tables? I am trying to avoid writing heavy for-loop queries because there may be hundreds of entities and many years of data.
So far I have only been able to achieve the below output:
I am just trying to figure out how to use annotate to fetch previous period values and ranks but I am pretty much stuck.
I'm building a staff allocation sheet for our production teams so that management can see (graphically) the peaks and troughs for each department. I've anonymised the data and shared it here: https://docs.google.com/spreadsheets/d/140w_v_ApksXH2q7h_dK5Iglm0VOTF1Zk9fq9jPxF5BM/edit?usp=sharing
What I am attempting to achieve is to have the week commencing dates across the top, the employees down the left (based on a unique list of employees from the department tabs) and the individual cells populated with the relevant show number (I've manually entered the first two employees).
I had done this successfully by having hidden proxy columns for each show that pull a start and end date from the department sheet via an indirect lookup. Where this falls down is if an employee works on a production at two separate times, e.g. all of May and all of August but not between those months.
I initially attempted a series of nested ifs:
=if(and(indirect($A3&"!$B:$B")=$B:$B,
indirect($A3&"!$E:$E")<=AK$2,
indirect($A3&"!$F:$F")>AK$2,
indirect($A3&"!$C:$C")="SHOW #1"
),1,
if(and(indirect($A3&"!$B:$B")=$B:$B,
indirect($A3&"!$E:$E")<=AK$2,
indirect($A3&"!$F:$F")>AK$2,
indirect($A3&"!$C:$C")="SHOW #2"
),2,
if(and(indirect($A3&"!$B:$B")=$B:$B,
indirect($A3&"!$E:$E")<=AK$2,
indirect($A3&"!$F:$F")>AK$2,
indirect($A3&"!$C:$C")="SHOW #3"
),3,
. . .
if(and(indirect($A3&"!$B:$B")=$B:$B,
indirect($A3&"!$E:$E")<=AK$2,
indirect($A3&"!$F:$F")>AK$2,
indirect($A3&"!$C:$C")="SHOW #17"
),17,
0)))))))))))))))))
I was hoping this would be calculated in each cell and return the correct result, but it did not work. That's when I discovered that if() doesn't work with ranges in this context! I've effectively hit a wall on this.
I feel like I'm missing something obvious in this and would appreciate any help!!!
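To pin down what each cell is meant to return, here is the per-cell logic written in plain Python rather than as a Sheets formula. It is a sketch under assumptions: the booking tuples, names, and dates are invented, and `show_for_week` is a hypothetical helper. It applies the same test as the formula above (start <= week commencing < end), but checks one booking row at a time, so two separate stints on the same show are handled naturally.

```python
from datetime import date

# Hypothetical bookings: (employee, show number, start date, end date).
# An employee may appear more than once for the same show.
bookings = [
    ("Alice", 1, date(2024, 5, 6), date(2024, 6, 3)),
    ("Alice", 1, date(2024, 8, 5), date(2024, 9, 2)),
    ("Bob", 2, date(2024, 5, 13), date(2024, 5, 27)),
]

def show_for_week(employee, week_commencing, bookings):
    """Return the show number booked for that week, or 0 if there is none."""
    for name, show, start, end in bookings:
        # Row-by-row version of the formula's test: start <= week < end.
        if name == employee and start <= week_commencing < end:
            return show
    return 0

print(show_for_week("Alice", date(2024, 5, 13), bookings))
print(show_for_week("Alice", date(2024, 7, 1), bookings))
```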
I am trying to predict the match winner based on a historical data set, as shown below.
The data set comprises IPL seasons, and Team_Name_id and Opponent Team are the team names in IPL. I have set the match id as the Row id and created the model. When running real-time testing, the result is not as expected (shown below).
Target is set as Match_winner_id.
Am I missing any configuration? Please help.
The model is working correctly. There are just two problems:
Your input data is not very good
There's no way for the model to know that only one of those two teams should win
Data Quality
A predictive model needs good quality input data on which to reverse-engineer a model that explains a given result. This input data should contain information that can be used to predict a result given a different set of input data.
For example, when predicting house prices, it would need to know the suburb (category), number of bedrooms/bathrooms/parking spaces, age of the building and selling price. It could then predict the selling price for other houses with a slightly different mix of variables.
However, based on your screenshot, you are giving the following information (and probably more) on which to make your prediction:
Teams: Not great, because you are separating Column C and Column D. The model will assume they are unrelated information. It doesn't realise that those two values could be swapped.
Match date: Useless information unless the outcome varies in proportion to time (eg a team continually gets better)
Season: As with Match Date, this is probably useless because you're always predicting the future -- you won't be predicting for a past season
Venue: Only relevant if a particular team always wins at a given venue
Toss Decision: Would this really influence the outcome? Also, it's only known once the game begins, so not great for predicting a future game.
Win Type: You won't know the win type until a game is over, so it's not suitable for predicting a future game.
Score: Again, not known until the actual game, so no good for future predictions.
Man of the Match: Not known for future games.
Umpire: How does an umpire influence the result of a game?
City: Yes, given that home teams often have an advantage.
You have provided very little information that could be used to predict a future game. There is really only the teams and the venue. Everything else is either part of the game itself or irrelevant.
Picking only one of the two teams
When the ML model looks at your data and tries to make a prediction, it will look at all the data you have provided. For example, it might notice that for a given venue and season, Team 8 has a higher propensity to win. Therefore, given that venue and season, it will favour a win by Team 8. The model has no concept that the only possible outcome is one of the two teams given in columns C and D.
You are predicting for two given teams and you are listing the teams in either Column C or Column D and this makes no sense -- the result is the same if you swapped the teams between columns, but the model has no concept of this. Also, information about Team 1 vs Team 2 is totally irrelevant for Team 3 vs Team 4.
What you should do is create one dataset per team, listing all their matches, plus a column that shows the outcome -- either a boolean (Win/Lose) or a value that represents the number of runs by which they won (where negative is a loss). You would then ask the model to predict the result for that team, given the input data, which would be win/lose or the number of points above/below the other team.
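The restructuring suggested above could look roughly like this. It is a minimal sketch in plain Python: the field names (`team`, `opponent`, `venue`, `winner`) and the sample matches are assumptions, not the columns of the actual data set. Each match becomes two rows, one from each team's point of view, so swapping the two team columns no longer changes what the model sees.

```python
# Hypothetical match records; the field names are assumptions.
matches = [
    {"match_id": 1, "team": 1, "opponent": 2, "venue": "A", "winner": 1},
    {"match_id": 2, "team": 3, "opponent": 1, "venue": "B", "winner": 1},
    {"match_id": 3, "team": 2, "opponent": 3, "venue": "A", "winner": 3},
]

def per_team_rows(matches):
    """Emit two rows per match, one from each team's point of view."""
    rows = []
    for m in matches:
        pairings = ((m["team"], m["opponent"]), (m["opponent"], m["team"]))
        for team, opponent in pairings:
            rows.append({
                "match_id": m["match_id"],
                "team": team,
                "opponent": opponent,
                "venue": m["venue"],
                "won": m["winner"] == team,  # boolean target column
            })
    return rows

def rows_for_team(rows, team):
    """Select the per-team dataset for a single team."""
    return [r for r in rows if r["team"] == team]

rows = per_team_rows(matches)
team_1 = rows_for_team(rows, 1)
print(team_1)
```

Each per-team dataset can then be used to train a model whose target is the `won` column, instead of a team id.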
But at the core, I think that your input data doesn't have enough rich content to be able to make a sensible prediction. Just ask yourself: "What data would I like to know if I were to guess which team would win?" It would probably be past results, weather conditions, which players were on each team, how many matches they played in the last week, etc. None of this information is being provided as input on each line of your input data.
I need to create a stats sheet that records the number of touchdowns scored by each team. I have it working for the first round, when it records all 8 teams. However, in the semifinals only 4 teams make it, and it does not keep track of which index the winning teams were at; it just couts the stats in regular order from 0 - 4. I've been thinking for a couple of days about how I could overcome this, but I haven't been able to find a solution yet.
the stats table that is outputted
Please let me know if I can contribute any more information to make my question less vague and easier for you to understand. I appreciate all the help, thank you.
I'm designing a project that will be developed in Django and I had a design philosophy question. In my app I need to track information like current week. This is related to the current week in the NFL (1-17) and can be calculated based on other models in the system (schedule and the current day for example). Since this information gets updated once a week, and will be used quite often in the app, does it make sense to store this information in a model (db table) of its own and just run the update weekly?
There is other information that might be useful to store as well (date/time of first and last games of the current week) so would a model of something like "current weeks information" be appropriate for this, even though the data can be calculated on the fly?
would a model of something like "current weeks information" be appropriate for this, even though the data can be calculated on the fly?
It might be. You can calculate the date Easter falls on, but few applications do that. The calculation is far from dead simple, and any error would have to be treated as a bug fix. But if you store Easter dates in a table, any error can be fixed by anyone who can update calendar data.
It's simple to calculate USA holidays like Martin Luther King Day (observed on the 3rd Monday in January), President's Day (observed on the 3rd Monday in February), and Labor Day (observed on the 1st Monday in September). It's also pretty easy to calculate factory production weeks, which parallels your problem in some ways.
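As an illustration of how simple the "n-th Monday" rule is to compute, here is a minimal sketch using only Python's standard library. The helper name `nth_weekday` is hypothetical, and it does no validation (asking for a 6th Monday would silently run into the next month).

```python
from datetime import date, timedelta

MONDAY = 0  # date.weekday() numbers Monday as 0

def nth_weekday(year, month, weekday, n):
    """Return the n-th occurrence of a weekday in the given month."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first matching weekday.
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))

mlk_day = nth_weekday(2024, 1, MONDAY, 3)         # 3rd Monday in January
presidents_day = nth_weekday(2024, 2, MONDAY, 3)  # 3rd Monday in February
labor_day = nth_weekday(2024, 9, MONDAY, 1)       # 1st Monday in September
print(mlk_day, presidents_day, labor_day)
```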
But when I'm building tables for businesses to use for scheduling, estimating, process control, and so on, I like to have the dates that are important to the business--holidays, for example--stored in a table rather than in procedural (calculating) code. The main advantage is that they can be collected, reviewed, and approved or corrected by relatively unskilled employees instead of needing a programmer.
So, if I were in your shoes, I would probably store the weeks in a table. A secondary advantage (or maybe the main advantage, in your case) is that most queries involving weeks might take advantage of indexes on the start and end dates.
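A stored weeks table with an index on the dates might be sketched like this. It uses SQLite from Python's standard library purely for illustration; in a Django project the table would be a model instead. The table name `season_week`, the column names, and the dates are all made up for the example and are not a real NFL schedule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE season_week (
        week      INTEGER PRIMARY KEY,
        starts_on TEXT NOT NULL,  -- ISO dates, so text comparison orders correctly
        ends_on   TEXT NOT NULL
    )
""")
# Index on the date range, so lookups by date can use it.
conn.execute("CREATE INDEX idx_week_dates ON season_week (starts_on, ends_on)")
conn.executemany(
    "INSERT INTO season_week VALUES (?, ?, ?)",
    [
        (1, "2024-09-03", "2024-09-09"),  # illustrative dates only
        (2, "2024-09-10", "2024-09-16"),
        (3, "2024-09-17", "2024-09-23"),
    ],
)

def current_week(conn, today):
    """Return the week number containing `today` (ISO string), or None."""
    row = conn.execute(
        "SELECT week FROM season_week WHERE starts_on <= ? AND ends_on >= ?",
        (today, today),
    ).fetchone()
    return row[0] if row else None

print(current_week(conn, "2024-09-12"))
```

If a week's dates turn out to be wrong, the fix is a data update to one row rather than a code change.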