Fill in blank cells in ={ARRAYFORMULA()} - if-statement

I have a human-friendly sheet with sparse data:
PART | FRUIT
---------------
Alpha |
| Apples
| Pears
Beta |
| Lemons
| Oranges
I want to create a second automatically updated machine-friendly sheet, which would have all empty cells in column PART filled:
PART | FRUIT
---------------
Alpha |
Alpha | Apples
Alpha | Pears
Beta |
Beta | Lemons
Beta | Oranges
I am OK to have empty cells in the column FRUIT on the machine-friendly sheet. But ideally I would like such rows removed:
PART | FRUIT
---------------
Alpha | Apples
Alpha | Pears
Beta | Lemons
Beta | Oranges
If I wanted to use interpolation in the machine-friendly sheet, I would rely on the MATCH trick or the FILTER paste-anywhere formula.
But I really want to avoid updating the machine-friendly sheet when I add, change or remove rows in the original sheet. (I'm OK if I will have to update it if I add new columns to the original sheet.) This means that using manual interpolation is off-limits.
Ideally on the second sheet I would type in a magic ={ARRAYFORMULA()} or a =QUERY of some kind, and then leave it alone.
={ ARRAYFORMULA(MAGIC(PART)), FRUIT }
But so far I cannot wrap my head around how to approach this. Any suggestions?

Use this in row 2:
=ARRAYFORMULA(IF(B2:B="",, VLOOKUP(ROW(A2:A), IF(A2:A<>"", {ROW(A2:A), A2:A}), 2, 1)))
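Roughly, the formula builds a lookup table of (row number, PART) for the rows where PART is filled in, then does an approximate-match VLOOKUP on each row number, which returns the most recent PART at or above that row. Rows with an empty FRUIT are left blank. The same logic as a minimal Python sketch (the function name `fill_down` and the sample rows are illustrative only, not part of the sheet):

```python
def fill_down(rows):
    """Carry the last non-empty PART forward; leave PART blank when FRUIT is empty."""
    filled = []
    last_part = ""
    for part, fruit in rows:
        if part != "":
            last_part = part  # remember the most recent PART header
        # Mirror IF(B2:B="",, ...): no PART is emitted for rows without a FRUIT
        filled.append((last_part if fruit != "" else "", fruit))
    return filled


sheet = [
    ("Alpha", ""), ("", "Apples"), ("", "Pears"),
    ("Beta", ""), ("", "Lemons"), ("", "Oranges"),
]
result = fill_down(sheet)
# Keep only rows that have a FRUIT, which is the "ideal" output in the question
compact = [row for row in result if row[1] != ""]
```

Note that the formula itself only produces the PART column; dropping the header rows, as `compact` does here, is a separate step.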


Plot Min, Max, Average, Median values into a horizontal enumerated line in Power BI Desktop

I am trying to do a trivial task with Power BI Desktop. I have the following kind of data
| Name | Min | Max | Average | Median |
|-------- |----- |------- |--------- |-------- |
| team A | 0 | 3,817 | 120 | 120 |
| team B | -10 | 1,050 | 25 | 89 |
| team C | 5 | 14,320 | 50 | 48 |
I want to draw a horizontal line with predefined (Start, End) points and plot the Min, Max, Average, and Median values on it for each team name. Filtering on the team name should adjust the numbers and the visual accordingly.
So far I have done the following static approach
The example above is totally non-dynamic because every point on the line is set by me. Also if for example, I select Team B with a higher median than average then the above visual line does not change the position of the relative spheres (in the image I posted, I have placed average always higher than the median which is not true for all the teams).
Thus, I would like to know if there is any fancy and well-plotted way to represent those 4 descriptive measures for a team name in a horizontal line that will respond when I use a different team. As I have noted on the image attached, the card visuals change when I change the team name. But the spheres do not move across the line.
My desired output
For Team B
While for Team C
I literally don't know if this is feasible in Power BI apart from the static approach I already did. Thank you in advance.
Regards.

Match if partial part of string exists in another column e.g. if AUD exists in one column highlight AUDUSD

Hopefully someone can help me with a Google Sheets formula to look up a currency in a column of currency pairs. I'm sure it's fairly basic, but I am struggling because it needs a partial match: the news will always be a single currency, and the list of pairs will always be two currencies put together.
How my sheet is structured: I have a list of currency pairs in one column, and the next column is what I'm trying to fill in. In the "News ?" column, the value is "News" if the currency pair in that row contains any of the values in H20 to H25, and "No News" if it does not. In this example, these values are AUD and CAD, as we have news on these currencies today to be wary of.
1 | Currency pairs | News ?
2 | AUDUSD | News
3 | EURUSD | No News
4 | GBPUSD | No News
5 | USDCAD | News
6 | USDCHF | No News
7 | USDJPY | No News
8 | AUDCAD | News
9 | AUDCHF | News
10 | AUDJPY | News
11 | AUDNZD | News
12 | AUDSGD | News
13 | CADCHF | News
14 | CADJPY | News
15 | CHFJPY | No News
etc...
And I have a column of currencies which has news occurring today e.g.
H19 | Today we have news on
H20 | AUD
H21 | CAD
H22 |
H23 |
H24 |
H25 |
My question is: How do I highlight via a formula if one part of the currency pair appears in the news column. Can be conditional formatting or a value in the next column as per the example which says "News" or "No News" (or 1 or 0, tick or cross, doesn't really matter as long as it flags if it matches with the news)
The tricky part is the currency pairs column will always be the six character pairs and the news column will always just be one single three-letter currency.
The news column will have anywhere between 0 and, say, 5 rows of currencies that have news.
I have tried things like this with no success so far:
=VLOOKUP(H20:H25&"*",A2,1,0)
=IF(REGEXMATCH(A2, H20:H25&"*"), 1, 0)
=if(COUNT(find(H20:H25,A2))=1,CHAR(10004))
Formula in cell B2:
=ARRAYFORMULA(IF(REGEXMATCH(A2:A15, TEXTJOIN("|", 1, H20:H25)), "News", "No News"))
Custom conditional formatting formula:
=REGEXMATCH($A2, TEXTJOIN("|", 1, $H$20:$H$25))
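Both formulas rely on the same idea: TEXTJOIN builds a regular expression such as AUD|CAD from the non-empty news cells, and REGEXMATCH tests each pair against it. A minimal Python sketch of that idea follows. The name `flag_news` is hypothetical, and the empty-list guard is an addition of this sketch, because in Python's `re` an empty pattern matches every string.

```python
import re


def flag_news(pairs, news_currencies):
    """Return "News"/"No News" per pair, like REGEXMATCH over a TEXTJOIN pattern."""
    # TEXTJOIN("|", 1, ...) skips blank cells; do the same here
    active = [c for c in news_currencies if c]
    if not active:
        # Guard: an empty pattern would match every string in Python's re
        return ["No News" for _ in pairs]
    pattern = re.compile("|".join(re.escape(c) for c in active))
    return ["News" if pattern.search(p) else "No News" for p in pairs]


pairs = ["AUDUSD", "EURUSD", "GBPUSD", "USDCAD", "USDCHF", "CHFJPY"]
flags = flag_news(pairs, ["AUD", "CAD", "", "", "", ""])
```

It may be worth checking how the sheet behaves on a day with no news at all, since the joined pattern is then empty.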

How to plot multiple measures on a line chart with shared axis and legend

I have two line plots, each driven by a different measure, that share the same time axis.
This is the first plot:
The second one:
In this case the second plot is just a scaled version of the first one, but the principle remains the same. I'm not able to just drag and drop the values from both measures into the same "valores" box; it just overwrites:
How can I plot these two measures in the same chart?
Something like the stacked areas plot but without the offset.
This is my data:
And the expressions for the measures I'm using:
Medida = CALCULATE(sum(test_data[Percentage_By_Class]);filter(test_data;test_data[Date]=max(test_data[Date]));
ALLEXCEPT(test_data;test_data[Score]))/ CALCULATE(sum(test_data[Percentage_By_Class]);
filter(all(test_data);test_data[Date]=max(test_data[Date])))
Medida2 = CALCULATE(sum(test_data[Percentage_By_Class]);filter(test_data;test_data[Date]=max(test_data[Date]));
ALLEXCEPT(test_data;test_data[Score]))/ CALCULATE(1.3*sum(test_data[Percentage_By_Class]);
filter(all(test_data);test_data[Date]=max(test_data[Date])))
And a google drive link to download the used data in CSV format:
https://drive.google.com/file/d/1dEdUwwofv1OQ9rOGQMuyfYKO9_YJDTcl/view?usp=sharing
I don't think there's a nice built-in way to do this, but here's a possible workaround:
Create a new table for the legend which will be the Cartesian product of scores and measures.
Legend =
ADDCOLUMNS(
    CROSSJOIN(VALUES(test_data[Score]), {1,2}),
    "Legend", [Score] & [Value]
)
This table should look like this:
| Score | Value | Legend |
|-------|-------|--------|
| A | 1 | A1 |
| C | 1 | C1 |
| B | 1 | B1 |
| A | 2 | A2 |
| C | 2 | C2 |
| B | 2 | B2 |
Now create a combined measure that switches between [Medida] and [Medida2]:
Combo =
IF(
    SELECTEDVALUE(Legend[Value]) = 1,
    CALCULATE([Medida], test_data[Score] in VALUES(Legend[Score])),
    CALCULATE([Medida2], test_data[Score] in VALUES(Legend[Score]))
)
Then if you put Legend in the legend box and Combo in the values box, you should get a chart like this:
You can change the colors too if you want to visually group the lines.
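The shape of this workaround, sketched in Python rather than DAX, purely to illustrate the idea: build one legend entry per (score, measure) pair, then dispatch to the matching measure. The `placeholder` values and the two stand-in callables are hypothetical; they only mimic the fact that Medida2 in the question is Medida divided by 1.3.

```python
from itertools import product

scores = ["A", "B", "C"]  # stands in for VALUES(test_data[Score])
measure_ids = [1, 2]      # stands in for {1,2}

# CROSSJOIN + ADDCOLUMNS: one legend entry per (score, measure) pair
legend = [
    {"Score": s, "Value": v, "Legend": f"{s}{v}"}
    for s, v in product(scores, measure_ids)
]


def combo(entry, medida, medida2):
    """Pick the measure by the legend entry's Value, evaluated for its Score."""
    chosen = medida if entry["Value"] == 1 else medida2
    return chosen(entry["Score"])


# Hypothetical stand-ins for [Medida] and [Medida2]; the numbers are placeholders
placeholder = {"A": 0.5, "B": 0.3, "C": 0.2}
series = {
    e["Legend"]: combo(e, placeholder.get, lambda s: placeholder[s] / 1.3)
    for e in legend
}
```

Each key of `series` corresponds to one line in the chart, which is why the legend table needs a row for every score and measure combination.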

How to visualize multiple lines from two measures

I have a challenge in Power BI Desktop: modeling and displaying a line chart that shows multiple lines in the same visualization, where each x,y pair consists of two measures. The X axis holds the measure Average weight and the Y axis holds Price per Kilo. There is a Normal line showing the optimal curve, whereas a number of projects display other curves in the same chart (as legends). Below you see the coordinates for the normal curve; the project curves can have other x,y values. This is easy in Excel but not that easy in Power BI.
To make lines, it seems that every x coordinate in the Line chart must be in the same interval. Otherwise I only get points, not separate lines. Maybe the line chart component is not suitable for showing this. I think a scatter chart is more suitable, but I don't think it can show lines between the points.
I hope some of you have solved this, or maybe have a pbix file to share showing how it was solved.
Regards Geir
Sample data:
| Avg weight | Price pr kg |
|------------|-------------|
| 100 | 129.39 |
| 500 | 63.65 |
| 1000 | 40.13 |
| 1500 | 33.41 |
| 2000 | 30.05 |
| 2500 | 27.53 |
| 3000 | 25.43 |
| 3500 | 23.582 |
| 4000 | 22.91 |
| 4500 | 22.322 |
| 5000 | 21.902 |
| 5500 | 21.734 |
| 6000 | 21.65 |
Plot example:
This seems quite straightforward, although perhaps your actual data is more complex?
With Avg Weight and 2 data series in one Table, I can use Avg Weight as the X Axis and the 2 data series as Values to achieve something similar to your requirement:

How to store data with large number (constant) of properties in SQL

I am parsing the USDA's food database and storing it in SQLite for query purposes. Each food has associated with it the quantities of the same 162 nutrients. It appears that the list of nutrients (name and units) has not changed in quite a while, and since this is a hobby project I don't expect to follow any sudden changes anyway. But each food does have a unique quantity associated with each nutrient.
So, how does one go about storing this kind of information sanely. My priorities are multi-programming language friendly (Python and C++ having preference), sanity for me as coder, and ease of retrieving nutrient sets to sum or plot over time.
The two things that I had thought of so far were 162 columns (which I'm not particularly fond of, but it does make the queries simpler), or a food table that has a link to a nutrient_list table that then links to a static table with the nutrient name and units. The second seems more flexible in case my expectations are wrong, but I wouldn't even know where to begin on writing the queries for sums and time series.
Thanks
You should read up a bit on database normalization. Most of it is quite intuitive, but going through the definition of each step and seeing an example helps you understand the concepts and will help you greatly if you want to design a database in the future.
As for this problem, I would suggest you use 3 tables: one for the foods (let's call it foods), one for the nutrients (nutrients), and one for the specific nutrients of each food (foods_nutrients).
The foods table should have a unique index for referencing and the food's name. If the food has other data associated to it (maybe a link to a picture or a description), this data should also go here. Each separate food will get a row in this table.
The nutrients table should also have a unique index for referencing and the nutrient's name. Each of your 162 nutrients will get a row in this table.
Then you have the crossover table containing the nutrient values for each food. This table has three columns: food_id, nutrient_id and value. Each food gets 162 rows inside this table, one for each nutrient.
This way, you can add or delete nutrients and foods as you like and query everything independent of programming language (well, using SQL, but you'll have to use that anyway :) ).
Let's try an example. We have 2 foods in the foods table and 3 nutrients in the nutrients table:
+------------------+
| foods |
+---------+--------+
| food_id | name |
+---------+--------+
| 1 | Banana |
| 2 | Apple |
+---------+--------+
+-------------------------+
| nutrients |
+-------------+-----------+
| nutrient_id | name |
+-------------+-----------+
| 1 | Potassium |
| 2 | Vitamin C |
| 3 | Sugar |
+-------------+-----------+
+-------------------------------+
| foods_nutrients |
+---------+-------------+-------+
| food_id | nutrient_id | value |
+---------+-------------+-------+
| 1 | 1 | 1000 |
| 1 | 2 | 12 |
| 1 | 3 | 1 |
| 2 | 1 | 3 |
| 2 | 2 | 7 |
| 2 | 3 | 98 |
+---------+-------------+-------+
Now, to get the potassium content of a banana, you'd query:
SELECT foods_nutrients.value
FROM foods_nutrients, foods, nutrients
WHERE foods_nutrients.food_id = foods.food_id
AND foods_nutrients.nutrient_id = nutrients.nutrient_id
AND foods.name = 'Banana'
AND nutrients.name = 'Potassium';
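Since the question mentions SQLite and Python, here is a minimal, self-contained sketch of the three-table layout using Python's built-in `sqlite3` module. The column types, key constraints, and the in-memory database are assumptions of this sketch; the data is the example above. It runs the potassium query with explicit JOINs and adds a per-nutrient sum, which is one way the "sums" from the question could be written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: throwaway in-memory database
conn.executescript("""
    CREATE TABLE foods (food_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE nutrients (nutrient_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE foods_nutrients (
        food_id INTEGER NOT NULL REFERENCES foods(food_id),
        nutrient_id INTEGER NOT NULL REFERENCES nutrients(nutrient_id),
        value REAL NOT NULL,
        PRIMARY KEY (food_id, nutrient_id)
    );
""")
conn.executemany("INSERT INTO foods VALUES (?, ?)", [(1, "Banana"), (2, "Apple")])
conn.executemany("INSERT INTO nutrients VALUES (?, ?)",
                 [(1, "Potassium"), (2, "Vitamin C"), (3, "Sugar")])
conn.executemany("INSERT INTO foods_nutrients VALUES (?, ?, ?)",
                 [(1, 1, 1000), (1, 2, 12), (1, 3, 1),
                  (2, 1, 3), (2, 2, 7), (2, 3, 98)])

# Potassium content of a banana, written with explicit JOINs
potassium = conn.execute("""
    SELECT fn.value
    FROM foods_nutrients AS fn
    JOIN foods AS f ON f.food_id = fn.food_id
    JOIN nutrients AS n ON n.nutrient_id = fn.nutrient_id
    WHERE f.name = ? AND n.name = ?
""", ("Banana", "Potassium")).fetchone()[0]

# Per-nutrient totals across all foods in the table (one banana plus one apple)
totals = dict(conn.execute("""
    SELECT n.name, SUM(fn.value)
    FROM foods_nutrients AS fn
    JOIN nutrients AS n ON n.nutrient_id = fn.nutrient_id
    GROUP BY n.name
""").fetchall())
```

The same SQL strings can be used unchanged from C++ through the SQLite C API, which is part of the appeal of keeping the logic in queries rather than in application code.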
Use the second (more normalized) approach.
You could even get away with fewer tables than you mentioned:
tblNutrients
-- NutrientID
-- NutrientName
-- NutrientUOM (unit of measure)
-- Otherstuff
tblFood
-- FoodId
-- FoodName
-- Otherstuff
tblFoodNutrients
-- FoodID (FK)
-- NutrientID (FK)
-- UOMCount
A table with 160+ fields will be a nightmare to maintain.
If there is a time element involved too (can measurements change?) then you could add a date field to the nutrient and/or the foodnutrient table depending on what could change.