So I have the following problem: I'm trying to import some data from my database into my report. The data is provided by views, which already have predetermined data types for my columns. However, not all of them are correct, and here lies the problem.
An example would be:
items (type = int)|itemDetails (type = string)
1|[[[Fruit], Fruit, Apple, 0.90]]
2|[[[Fruit], Fruit, Apple, 0.90],[[Veggie], Veggie, Salat, 1.50]]
Now the problem is that Power BI doesn't recognize the data in the second column as a list or an array.
My goal is to extract the information in these nested lists, so the desired output would be something like:
items (type = int)|itemDetails|itemCategory|itemName|itemPrice
1|[[[Fruit], Fruit, Apple, 0.90]]|Fruit|Apple|0.9
2|[[[Fruit], Fruit, Apple, 0.90],[[Veggie], Veggie, Salat, 1.50]]|Fruit, Veggie|Apple, Salat|0.9, 1.5
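Power BI reads itemDetails as plain text, so any solution has to parse the string itself (in practice you would write this in Power Query M, e.g. with Text.Split or a custom column). As a hedged illustration of the parsing logic only, here is a Python sketch; `parse_item_details` is a hypothetical helper name, and the regex assumes the exact bracket/comma layout shown above:

```python
import re

def parse_item_details(details):
    """Split a string like
    '[[[Fruit], Fruit, Apple, 0.90],[[Veggie], Veggie, Salat, 1.50]]'
    into (categories, names, prices) lists."""
    # Each inner record looks like [[Category], Category, Name, Price]
    records = re.findall(r"\[\[(\w+)\],\s*\w+,\s*(\w+),\s*([\d.]+)\]", details)
    categories = [r[0] for r in records]
    names = [r[1] for r in records]
    prices = [float(r[2]) for r in records]
    return categories, names, prices

cats, names, prices = parse_item_details(
    "[[[Fruit], Fruit, Apple, 0.90],[[Veggie], Veggie, Salat, 1.50]]")
print(", ".join(cats))    # Fruit, Veggie
print(", ".join(names))   # Apple, Salat
print(prices)             # [0.9, 1.5]
```

The same split-on-structure idea carries over to M: extract each inner record, then take fields 1, 2, and 3 as category, name, and price.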
I have table data in Google Spreadsheet something like this:
Date|Diet
4-Jan-2020|Coffee
4-Jan-2020|Snacks
4-Jan-2020|xyz
4-Jan-2020|Coffee
5-Jan-2020|Snacks
5-Jan-2020|abc
6-Jan-2020|Coffee
6-Jan-2020|Snacks
This table is a list of food items I had on a daily basis. I would like to get the number of times I had coffee on a daily basis. So I would like to get the output like this:
Date | No of times I had Coffee
4-Jan-2020| 2
5-Jan-2020| 0
6-Jan-2020| 1
I used this query to get the output.
=query(A1:B1425,"select A, COUNT(B) where B='Coffee' group by A")
With this query, I get the output below. Note that the days on which I didn't have coffee are missing:
4-Jan-2020| 2
6-Jan-2020| 1
So count for 5-Jan-2020 is missing because there is no string "Coffee" for that day.
How do I get the desired output including the count 0? Thank you.
try:
=ARRAYFORMULA({UNIQUE(FILTER(A1:A, A1:A<>"")),
IFNA(VLOOKUP(UNIQUE(FILTER(A1:A, A1:A<>"")),
QUERY(A1:B,
"select A,count(B)
where B='Coffee'
group by A
label count(B)''"), 2, 0))*1})
or try:
=ARRAYFORMULA(QUERY({A1:B, IF(B1:B="coffee", 1, 0)},
"select Col1,sum(Col3)
where Col1 is not null
group by Col1
label sum(Col3)''"))
You might want to change the counter into an IF statement, something like IF(COUNT(B) > 0, COUNT(B), 0) around the grouped count.
That will force the counter to have an actual value (0) even when nothing is found.
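The trick both formulas rely on, seeding every date with a zero before counting so that coffee-free days survive, can be sketched in plain Python (a hedged illustration of the logic only; the variable names are made up):

```python
# Sample of the diet log from the question: (date, item) pairs
rows = [("4-Jan-2020", "Coffee"), ("4-Jan-2020", "Snacks"),
        ("4-Jan-2020", "xyz"),    ("4-Jan-2020", "Coffee"),
        ("5-Jan-2020", "Snacks"), ("5-Jan-2020", "abc"),
        ("6-Jan-2020", "Coffee"), ("6-Jan-2020", "Snacks")]

# Seed every date with 0 first, so days without coffee still appear.
counts = {}
for date, item in rows:
    counts.setdefault(date, 0)
    if item == "Coffee":
        counts[date] += 1

for date, n in counts.items():
    print(date, n)
# 4-Jan-2020 2
# 5-Jan-2020 0
# 6-Jan-2020 1
```

This is exactly what the second formula does by turning each row into a 0/1 flag and summing per date.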
I have two Pandas data frames representing an inventory of items. Both data frames have four columns:
df1
id, item, colour, year
1, car, red, 2015
2, truck,, 2016
3, house, blue,
4, car, blue,
5, truck, red, 2015
df2
id, item, colour, year
1, house, blue, 2015
2, truck,, 2015
3, car, blue,
4, house,,
5, car, red, 2015
I know that these inventories are likely to represent the same object, so I would like to relate both of these.
For instance,
df1[1] = df2[5] (3 identical variables)
df1[4] = df2[3] (2 identical variables)
df1[3] (house, blue,) is probably the same as df2[1] (house, blue, 2015).
I have 2 main issues: how to do it efficiently, and how to give a reliability to the link.
I've thought of creating a common field which would be a combination of all the columns [item, colour, year] and merge on this. I would get the two first matches above; but they don't have the same reliability. I wonder if there would be an easy way to 'score' this reliability (at the moment I'm thinking of doing two merges, depending on variable availability).
Then I would create another common field, with only 2 variables (item, colour), and merge on this. That would give me the link between (house, blue,) and (house, blue, 2015). This would obviously be a weaker link.
Any idea how to do this without merging sequentially? My current plan is to merge with 3 attributes (when they are present), then 2 attributes (there are 3 permutations) on what is left and has at least 2 attributes, and then 1 only. I would give a reliability score to the link based on the number of attributes I used to merge.
df = pd.DataFrame(
    (df1.values[:, None] == df2.values).sum(2),
    df1.index, df2.index)

matches = df.mask(df.lt(2)).stack()

def f(df):
    # pair row i of df1 with row j of df2
    i, j = df.name
    return pd.concat([df1.loc[i], df2.loc[j]], axis=1, keys=['df1', 'df2']).T

matches.groupby(level=[0, 1]).apply(f).stack().unstack([-2, -1])
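To see what the scoring step produces, here is a self-contained run of the pairwise comparison on the sample inventories from the question (NaN stands in for the blank cells; NaN never compares equal, so missing values don't inflate the score):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    {"item":   ["car", "truck", "house", "car", "truck"],
     "colour": ["red", np.nan, "blue", "blue", "red"],
     "year":   [2015, 2016, np.nan, np.nan, 2015]},
    index=[1, 2, 3, 4, 5])
df2 = pd.DataFrame(
    {"item":   ["house", "truck", "car", "house", "car"],
     "colour": ["blue", np.nan, "blue", np.nan, "red"],
     "year":   [2015, 2015, np.nan, np.nan, 2015]},
    index=[1, 2, 3, 4, 5])

# Count equal cells between every df1 row and every df2 row.
score = pd.DataFrame((df1.values[:, None] == df2.values).sum(2),
                     df1.index, df2.index)

# Keep pairs agreeing on at least 2 attributes; the score itself
# doubles as the reliability of the link.
matches = score.mask(score.lt(2)).stack()
print(matches)
```

On this data the strongest link is df1[1] ↔ df2[5] with a score of 3, and the (house, blue) pair comes out with the weaker score of 2, which is the reliability ranking the question asked for.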
I have a frequency table of words which looks like below
> head(freqWords)
employees work bose people company
1879 1804 1405 971 959
employee
100
> tail(freqWords)
youll younggood yoyo ytd yuorself zeal
1 1 1 1 1 1
I want to create another frequency table which will combine similar words and add their frequencies.
In the above example, my new table should contain both employee and employees as one element with a frequency of 1979. For example:
> head(newTable)
employee,employees work bose people
1979 1804 1405 971
company
959
I know how to find similar words (using adist, stringdist), but I am unable to create the frequency table. For instance, I can use the following to get a list of similar words:
words <- names(freqWords)
lapply(words, function(x) words[stringdist(x, words) < 3])
and the following to get a list of similar phrases of two words:
lapply(words, function(x) words[stringdist2(x, words) < 3])
where stringdist2 is the following:
stringdist2 <- function(word1, word2){
min(stringdist(word1, word2),
stringdist(word1, gsub(word2,
pattern = "(.*) (.*)",
repl="\\2,\\1")))
}
I do not have any punctuation/special symbols in my words/phrases. (I do not know a lot of R; I created stringdist2 by tweaking an implementation of adist2 I found here, but I do not understand everything about how pattern and repl work.)
So I need help creating the new frequency table.
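The merge-similar-words-and-sum step itself is language-agnostic, so here is a hedged Python sketch of one way to do it, using the standard library's difflib (a similarity ratio stands in for stringdist, and the 0.85 threshold is an assumption you would tune; the same greedy loop translates directly back to R):

```python
from difflib import SequenceMatcher

freq = {"employees": 1879, "work": 1804, "bose": 1405,
        "people": 971, "company": 959, "employee": 100}

def similar(a, b, threshold=0.85):
    # Rough stand-in for stringdist(a, b) < 3
    return SequenceMatcher(None, a, b).ratio() >= threshold

# Greedily merge each word into the first existing group it resembles.
merged = []  # list of [members, total] groups
for word, n in freq.items():
    for entry in merged:
        if any(similar(word, m) for m in entry[0]):
            entry[0].append(word)
            entry[1] += n
            break
    else:
        merged.append([[word], n])

for members, total in merged:
    print(",".join(members), total)
# employees,employee 1979
# work 1804
# ...
```

Note the greedy pass is order-dependent: a word joins the first matching group, which is usually fine for near-duplicates like employee/employees but can chain loosely related words together at low thresholds.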
For example, there are two text files with words, and I need to produce the output format below. How can I do this? Please give me an idea.
1. text1 with words:
apple
apple
mango
2. text2 with words:
apple
apple
mango
I need to show output like this
text1
apple 2
mango 1
text2
apple 2
mango 1
total
apple 4
mango 2
In the mapper, set the key as filename + '|' + word and emit it to the reducer. In your case, the output of the mapper will be like this:
(text1|apple,1)
(text1|apple,1)
(text1|mango,1)
(text2|apple,1)
(text2|apple,1)
(text2|mango,1)
After the shuffle and sort phase, the output will be like this:
(text1|apple,{1,1})
(text1|mango,{1})
(text2|apple,{1,1})
(text2|mango,{1})
In the reducer you can write the logic to count the number of apples and mangos in each text file (count the number of values inside each list).
To find the global sum, declare static variables for mangos and apples. Split the key on the '|' symbol, calculate the sum, and add it to the static variables. Finally, write the output to the text file.
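Hadoop itself would run this as a Java job, but the map / shuffle / reduce flow described above can be simulated in a few lines of Python to check the logic (the `files` dictionary is made-up sample input standing in for the two text files):

```python
from collections import defaultdict

files = {
    "text1": ["apple", "apple", "mango"],
    "text2": ["apple", "apple", "mango"],
}

# Map phase: emit (filename|word, 1) for every word.
mapped = [(f"{name}|{word}", 1)
          for name, words in files.items()
          for word in words]

# Shuffle/sort phase: group the values by key.
grouped = defaultdict(list)
for key, value in sorted(mapped):
    grouped[key].append(value)

# Reduce phase: per-file counts, plus running global totals
# (playing the role of the static variables in the answer).
totals = defaultdict(int)
for key, values in grouped.items():
    name, word = key.split("|")
    print(name, word, sum(values))
    totals[word] += sum(values)

for word, n in totals.items():
    print("total", word, n)
```

The per-file lines reproduce the text1/text2 sections of the desired output, and the totals dictionary reproduces the "total" section.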
I want a feature for my Google Chart. Such that, if the number of series is greater than a specified number, the remaining series will be grouped together into one series.
For example, I have 5 series of fruits: Apple, Banana, Orange, Mango, Avocado
And my series limit is 3.
My graph will show: Apple, Banana, Orange, Others.
What should happen is that the remaining series (Mango and Avocado) are grouped together into 'Others'.
Does Google Charts have a feature for this (e.g. aggregators, probably)?
Use a DataView to combine your extra columns into one:
// assumes you have a DataTable "data" with 6 columns (one domain, plus your 5 fruits)
var view = new google.visualization.DataView(data);
view.setColumns([0, 1, 2, 3, {
type: 'number',
label: 'Others',
calc: function (dt, row) {
// combine columns 4 and 5 together:
return dt.getValue(row, 4) + dt.getValue(row, 5);
}
}]);