I am trying to plot a line chart in Power BI with a reference line based on another column's value. I have data that represents the journeys of different cars on different sections of road. I am plotting those journeys that travel over the same section of road, e.g. RoadId 10001.
Distance JourneyNum Speed ThresholdSpeed RoadId
1 10 50 60 10001
2 10 51 60 10001
3 10 52 60 10001
1 11 45 60 10001
2 11 46 60 10001
3 11 47 60 10001
7 12 20 30 10009
8 12 21 30 10009
9 12 22 30 10009
10 12 23 30 10009
So currently I have:
Distance on x-axis (Axis),
Speed on y-axis (Values),
JourneyNum as the Legend (Legend),
filter to roadId 10001
I also want to add ThresholdSpeed as a reference line, or just as another line. Any help?
I don't think it's possible (yet) to pass a measure to a constant line, so you'll need a different approach.
One possibility is to reshape your data so that ThresholdSpeed appears as part of your Legend. You can do this in DAX like so:
Table2 =
VAR NewRows =
    SELECTCOLUMNS (
        Table1,
        "Distance", Table1[Distance],
        "JourneyNum", "Threshold",
        "Speed", Table1[ThresholdSpeed],
        "ThresholdSpeed", Table1[ThresholdSpeed],
        "RoadId", Table1[RoadId]
    )
RETURN
    UNION ( Table1, DISTINCT ( NewRows ) )
Which results in a table like this:
Distance JourneyNum Speed ThresholdSpeed RoadId
1 10 50 60 10001
2 10 51 60 10001
3 10 52 60 10001
1 11 45 60 10001
2 11 46 60 10001
3 11 47 60 10001
1 Threshold 60 60 10001
2 Threshold 60 60 10001
3 Threshold 60 60 10001
7 12 20 30 10009
8 12 21 30 10009
9 12 22 30 10009
10 12 23 30 10009
7 Threshold 30 30 10009
8 Threshold 30 30 10009
9 Threshold 30 30 10009
10 Threshold 30 30 10009
Then you make a line chart on this table instead:
Note: It's probably preferable to do this transformation in the query editor, though, so you don't end up with redundant tables.
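For illustration only (the real transformation would live in DAX as above, or in the query editor), here is the same reshape sketched in pandas, assuming a tiny table with the question's columns; it may make the intent of the UNION clearer:

import pandas as pd

# Original table, with the same columns as in the question.
table1 = pd.DataFrame({
    "Distance":       [1, 2, 3, 1, 2, 3],
    "JourneyNum":     [10, 10, 10, 11, 11, 11],
    "Speed":          [50, 51, 52, 45, 46, 47],
    "ThresholdSpeed": [60, 60, 60, 60, 60, 60],
    "RoadId":         [10001] * 6,
})

# Build the extra rows: a synthetic "Threshold" journey whose Speed is the
# threshold itself, then drop duplicates (mirrors DISTINCT(NewRows)).
threshold_rows = (
    table1.assign(JourneyNum="Threshold", Speed=table1["ThresholdSpeed"])
          .drop_duplicates()
)

# Append them to the original rows (mirrors UNION(Table1, ...)).
table2 = pd.concat([table1, threshold_rows], ignore_index=True)
print(table2)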
Link to CausalFS GitHub. I'm using v2.0 of the CausalFS C++ package by Kui Yu.
Upon running the structure learning algorithms, my DAG and Markov blanket (MB) outputs do not match.
I'm trying to generate a DAG from the data given in CDD/data/data.txt and CDD/data.txt via some of the local-to-global structure learning algorithms mentioned in the manual (PCMB-CSL, STMB-CSL, etc.), running the commands as given by the manual (pg. 18 of 26).
But my resulting DAG is mostly filled with zeros, which looks suspicious given that this is an example dataset. Upon then checking CDD/mb/mb.out, I find that the Markov blankets for the variables do not agree with the DAG output.
For example, running ./main ./data/data.txt ./data/net.txt 0.01 PCMB-CSL "" "" -1 gives a 1 at position (1,22) (one-indexed) only (relaxing the alpha value to 0.1, kept at 0.01 in the example, gives just one more 1). However, this doesn't agree with the output MB for each variable, which looks like the following (upon running IAMB as ./main ./data/data.txt ./net/net.txt 0.01 IAMB all "" ""):
0 21
1 22 26 28
2 29
3 14 21
4 5 12
5 4 12
6 8 12
7 8 12
8 6 7 12
9 11 15
10 35
11 9 12 15 33
12 4 6 7 8 11 13
13 8 12 14 15 17 30 34
14 3 13 20
15 8 9 11 13 14 17 30
16 15
17 13 15 18 27 30
18 17 19 20 27
19 18 20
20 14 18 21 28
21 0 3 20 26
22 1 21 23 24 28
23 1 22 24
24 5 22 23 25
25 24
26 1 21 22
27 17 18 28 29
28 1 18 21 22 27 29
29 2 27
30 13 14 15 17
31 34
32 15 18 34
33 11 12 32 35 36
34 30 31 32 35 36
35 10 33 34
36 33 34 35
Such an MB profile suggests that the DAG should be much more connected.
I would love to hear suggestions from people who've managed to get the package to behave appropriately. I just do not understand the error here on my side. (I'm running Pop!_OS 20.04.)
Thanks a bunch <3
P.S. The output files keep being appended to on each rerun of the code, so make sure to delete them appropriately.
I have a grouped dataframe which is this:
Speed (mph)
Label Hour
5 13 18.439730
14 24.959555
15 33.912493
7 13 23.397055
14 18.497228
15 33.493978
12 13 32.851146
14 33.187193
15 32.597150
14 13 14.491841
14 12.397724
15 19.581669
21 13 34.985289
14 34.817009
15 34.888187
26 13 35.813901
14 36.622450
15 36.540348
28 13 33.761174
14 33.951116
15 33.736014
29 13 34.545862
14 34.227974
15 34.435377
I am trying to plot a bar plot where the bars are grouped by Label and Hour.
An example (a grouped bar chart image found online):
The graph above is just an example I found on the internet; I don't really need the lines or the numbers over the bars.
I tried plotting like this:
newdf.plot.bar()
plt.show()
which gives me a single chart with one bar per (Label, Hour) row, not grouped the way I want.
Question
How can I plot the graph so that Label 5 with Hours 13, 14, 15 are close together, then some space, then Label 7 with Hours 13, 14, 15 close together, and so on?
It seems you need to unstack:
df.unstack().plot.bar()
plt.show()
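A minimal sketch of why this works, assuming the grouped frame has the (Label, Hour) MultiIndex shown in the question (only the first three labels are reproduced here, with rounded values):

import pandas as pd
import matplotlib.pyplot as plt

# Rebuild a small frame with the same (Label, Hour) MultiIndex shape as the question.
index = pd.MultiIndex.from_product([[5, 7, 12], [13, 14, 15]], names=["Label", "Hour"])
newdf = pd.DataFrame(
    {"Speed (mph)": [18.4, 25.0, 33.9, 23.4, 18.5, 33.5, 32.9, 33.2, 32.6]},
    index=index,
)

# unstack() pivots the inner index level (Hour) into columns,
# leaving one row per Label...
grouped = newdf["Speed (mph)"].unstack()

# ...so plot.bar() draws one cluster of bars per Label,
# with one bar per Hour inside each cluster.
grouped.plot.bar()
plt.ylabel("Speed (mph)")
plt.show()

Each row of the unstacked frame becomes one cluster on the x-axis, which gives the spacing between labels that the question asks for.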
I am doing some research on image compression via discrete cosine transformations, and I want to change the quantization table sizes so that I can study what happens when I change the size of the sub-matrices that I divide my pictures into. The standard sub-matrix size is 8x8 and there are a lot of tables for those dimensions. For example, the standard JPEG quantization table (that I use) is:
standardmatrix8 = np.matrix('16 11 10 16 24 40 51 61;\
12 12 14 19 26 58 60 55;\
14 13 16 24 40 57 69 56;\
14 17 22 29 51 87 80 62;\
18 22 37 56 68 109 103 77;\
24 35 55 64 81 104 103 92;\
49 64 78 77 103 121 120 101;\
72 92 95 98 112 100 103 99').astype('float')
I have assumed that the quantization tables for 2x2 and 4x4 would be:
standardmatrix2 = np.matrix('16 11;\
                             12 12').astype('float')
standardmatrix4 = np.matrix('16 11 10 16;\
                             12 12 14 19;\
                             14 13 16 24;\
                             18 22 37 56').astype('float')
since the entries in the standard table should correspond to the same frequencies in the smaller matrices.
But what about quantization tables with dimensions 16x16, 24x24 and so on? I know that the standard quantization tables were worked out by experiments and can't be calculated from some formula, but I assume someone has tried changing the matrix sizes before me! Where can I find these tables? Or can I just make something up and scale the last entries to higher frequencies?
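To make the idea concrete, here is a rough sketch (purely my own assumption, not an established standard) of extracting smaller tables from the top-left (lowest-frequency) corner of the 8x8 table, and extrapolating larger ones by upscaling it through entry repetition:

import numpy as np

# Standard 8x8 JPEG luminance quantization table (same values as above).
standardmatrix8 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 103,  92],
    [49, 64, 78, 77, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
], dtype=float)

def shrink_table(table, n):
    """Smaller table: keep the top-left n x n corner (lowest frequencies)."""
    return table[:n, :n].copy()

def grow_table(table, factor):
    """Larger table: repeat each entry factor x factor times (nearest-neighbour
    upscaling). This is only a guess at how one might extrapolate; a smoother
    interpolation would be another option."""
    return np.kron(table, np.ones((factor, factor)))

standardmatrix4 = shrink_table(standardmatrix8, 4)     # 4x4
standardmatrix16 = grow_table(standardmatrix8, 2)      # 16x16
standardmatrix24 = grow_table(standardmatrix8, 3)      # 24x24
print(standardmatrix16.shape, standardmatrix24.shape)  # (16, 16) (24, 24)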
I've encountered an interesting situation while calculating the inter-quartile range. Assuming we have a dataframe such as:
import pandas as pd
import numpy as np

index = pd.date_range('2014-01-01', periods=10, freq='D')
data = np.random.randint(0, 100, (10, 5))
data = pd.DataFrame(index=index, data=data)
data
Out[90]:
0 1 2 3 4
2014-01-01 33 31 82 3 26
2014-01-02 46 59 0 34 48
2014-01-03 71 2 56 67 54
2014-01-04 90 18 71 12 2
2014-01-05 71 53 5 56 65
2014-01-06 42 78 34 54 40
2014-01-07 80 5 76 12 90
2014-01-08 60 90 84 55 78
2014-01-09 33 11 66 90 8
2014-01-10 40 8 35 36 98
# test for q1 values (this works)
data.quantile(0.25)
Out[111]:
0 40.50
1 8.75
2 34.25
3 17.50
4 29.50
# break it by inserting a row of NaNs
data.iloc[-1] = np.nan
data.quantile(0.25)
Out[115]:
0 42
1 11
2 34
3 12
4 26
The first quartile can be calculated by taking the median of values in the dataframe that fall below the overall median, so we can see what data.quantile(0.25) should have yielded. e.g.
med = data.median()
q1 = data[data<med].median()
q1
Out[119]:
0 37.5
1 8.0
2 19.5
3 12.0
4 17.0
It seems that quantile is failing to provide an appropriate representation of q1 etc. since it is not doing a good job of handling the NaN values (i.e. it works without NaNs, but not with NaNs).
I thought this may not be a "NaN" issue, rather it might be quantile failing to handle even-numbered data sets (i.e. where the median must be calculated as the mean of the two central numbers). However, after testing with dataframes with both even and odd-numbers of rows I saw that quantile handled these situations properly. The problem seems to arise only when NaN values are present in the dataframe.
I would like to use quantile to calculate the rolling q1/q3 values in my dataframe; however, this will not work with NaNs present. Can anyone provide a solution to this issue?
Internally, quantile uses numpy.percentile over the non-null values. When you change the last row of data to NaNs, you're essentially left with the array array([33., 46., 71., 90., 71., 42., 80., 60., 33.]) in the first column.
Calculating np.percentile(array([33., 46., 71., 90., 71., 42., 80., 60., 33.]), 25) gives 42.
From the docstring:
Given a vector V of length N, the qth percentile of V is the qth ranked
value in a sorted copy of V. A weighted average of the two nearest
neighbors is used if the normalized ranking does not match q exactly.
The same as the median if q=50, the same as the minimum if q=0
and the same as the maximum if q=100.
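As a quick check (a sketch using the same nine non-null values shown above), you can reproduce pandas' result by dropping the NaNs and calling numpy.percentile directly:

import numpy as np
import pandas as pd

col = pd.Series([33, 46, 71, 90, 71, 42, 80, 60, 33, np.nan], dtype=float)

# pandas drops the NaN, then interpolates the 25th percentile...
pandas_q1 = col.quantile(0.25)

# ...which matches numpy.percentile over the non-null values.
numpy_q1 = np.percentile(col.dropna().values, 25)

print(pandas_q1, numpy_q1)  # both 42.0

Note that the "median of the values below the overall median" approach from the question is a different quartile definition, so it won't generally match the linear-interpolation result that numpy uses, with or without NaNs.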
Can anyone help me with this practice question: 33 31 11 47 2 20 24 12 2 43. I am trying to figure out what the contents of the two output lists would be after the first pass of the merge sort.
The answer is supposedly:
List 1: 33 11 47 12
List 2: 31 2 20 24 2 43
This is not making any sense to me, since I was under the impression that the first pass was where the list is divided into two lists at the middle...
33 31 11 47 2 20 24 12
At first the list is divided into individual elements; once singleton lists are formed, each element is compared (and merged) with the one next to it. So after the first pass we have
31 33 11 47 2 20 12 24
After that
11 31 33 47 2 12 20 24
and then
2 11 12 20 24 31 33 47
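To make these passes concrete, here is a small sketch (my own illustration, run on the full ten-number list from the question) of a bottom-up merge sort that prints the list after each pass:

def bottom_up_merge_sort_passes(values):
    """Bottom-up merge sort: merge sorted runs of width 1, 2, 4, ...
    and print the whole list after each pass."""
    values = list(values)
    width = 1
    while width < len(values):
        merged = []
        for start in range(0, len(values), 2 * width):
            left = values[start:start + width]
            right = values[start + width:start + 2 * width]
            # merge the two sorted runs
            run, i, j = [], 0, 0
            while i < len(left) and j < len(right):
                if left[i] <= right[j]:
                    run.append(left[i]); i += 1
                else:
                    run.append(right[j]); j += 1
            run.extend(left[i:])
            run.extend(right[j:])
            merged.extend(run)
        values = merged
        print(f"after pass (run width {width}):", values)
        width *= 2
    return values

bottom_up_merge_sort_passes([33, 31, 11, 47, 2, 20, 24, 12, 2, 43])
# after pass (run width 1): [31, 33, 11, 47, 2, 20, 12, 24, 2, 43]
# after pass (run width 2): [11, 31, 33, 47, 2, 12, 20, 24, 2, 43]
# after pass (run width 4): [2, 11, 12, 20, 24, 31, 33, 47, 2, 43]
# after pass (run width 8): [2, 2, 11, 12, 20, 24, 31, 33, 43, 47]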