Median on a set of non-integer values is being displayed as integers - Power BI

I'm presenting visualizations of the time taken to complete different tasks. Some of the data is heavily skewed by certain tasks which take much longer than the rest, so I thought it would be a good idea to show both the means and the medians, to demonstrate where that skew is present. I have one page of visualizations for the mean, and an identical page where the mean has been replaced by the median. However, when Power BI calculates the medians, it seems to be giving me integer values, where I would like it to display the full decimal value.
Here are screenshots of each page (I've had to black out the labels for confidentiality reasons).
And a snippet of the data so you can see it's being read in as decimal numbers.

Related

Show min/max values with corresponding steps in TensorBoard scalar plots

I'm wondering if there's a tool/workaround to show min/max values on scalar plots in TensorBoard.
A pretty common scenario involves zooming in and finding the optimal point manually for each series, as it is not necessarily the last one.
For example, it is epoch 22 with IoU=75.08 (the maximum value) for the grey series below.
I'd like to have these numbers displayed somewhere (for example in the tooltip or in the chart) or at least a marker in the optimal point.
I've found an open ticket relating to this issue, but it seems that it is still not resolved.
Maybe someone is aware of some sort of a frontend script/plugin extracting these values? Preferably for Safari or Chrome.

How to manage custom number formatting in Power BI?

How can I do custom number formatting in a Power BI visual?
I don't want to show every value in millions. I want to use thousands for the 1-day value, millions for the 1-week value, and a different unit for the 1-year value.
Power BI charts follow the principles of good data visualisation. That includes a scale that is relevant to the data with labels that relate to the scale.
In the visualisation, the differences for the values less than 1M are not discernible. The label with the 0M supports that approach, although it doesn't look great. But that happens when you have a chart with very large AND very small values. Power BI only supports one display unit and you selected Millions.
You may want to consider using a different visual for the data. Not every visual has to be a chart. If you want to show the exact numbers, then a simple table might be a better approach. In a sorted list of numbers, the digits in a number act very much like a horizontal bar.
Or split the chart in two and show one chart for values above 1M and another for values below 1M.
Or use Thousands as display units instead of Millions.

Clustering a list of dates

I have a list of dates I'd like to cluster into 3 clusters. Now, I can see hints that I should be looking at k-means, but all the examples I've found so far are related to coordinates, in other words, pairs of list items.
I want to take this list of dates and append them to three separate lists indicating whether they were before, during or after a certain event. I don't have the time for this event, but that's why I'm guessing it by breaking the date/times into three groups.
Can anyone please help with a simple example on how to use something like numpy or scipy to do this?
k-means is exclusively for coordinates. And more precisely: for continuous and linear values.
The reason is the mean function. Many people overlook the role of the mean in k-means (despite it being in the name...)
On non-numerical data, how do you compute the mean?
There exist some variants for binary or categorical data. IIRC there is k-modes, for example, and there is k-medoids (PAM, partitioning around medoids).
It's unclear to me what you want to achieve overall... your data seems to be 1-dimensional, so you may want to look at the many questions here about 1-dimensional data (as the data can be sorted, it can be processed much more efficiently than multidimensional data).
In general, even if you projected your data into unix time (seconds since 1.1.1970), k-means will likely only return mediocre results for you. The reason is that it will try to make the three intervals have the same length.
Do you have any reason to suspect that "before", "during" and "after" have the same duration? If not, don't use k-means.
You may however want to have a look at KDE; and plot the estimated density. Once you have understood the role of density for your task, you can start looking at appropriate algorithms (e.g. take the derivative of your density estimation, and look for the largest increase / decrease, or estimate an "average" level, and look for the longest above-average interval).
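To illustrate the KDE idea on 1-dimensional timestamps, here is a minimal Python sketch. The synthetic data and the simple "largest increase / largest decrease of the density" rule are my own illustrative assumptions, not a drop-in solution:

import numpy as np
from scipy.stats import gaussian_kde

# Synthetic 1-D timestamps (seconds): sparse "before", a dense burst "during", sparse "after"
rng = np.random.default_rng(0)
before = rng.uniform(0, 3600, 20)
during = rng.uniform(3600, 4200, 200)
after = rng.uniform(4200, 9000, 30)
t = np.sort(np.concatenate([before, during, after]))

# Estimate the density and look at where it rises and falls most sharply
kde = gaussian_kde(t)
grid = np.linspace(t.min(), t.max(), 1000)
density = kde(grid)
slope = np.gradient(density, grid)

start = grid[np.argmax(slope)]   # largest increase ~ start of the event
end = grid[np.argmin(slope)]     # largest decrease ~ end of the event
print("estimated event interval:", start, "to", end)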
Here are some workaround methods that may not be the best answer but should help.
You can convert the dates to numeric representations, for example the number of minutes or hours elapsed since a starting date (such as the start of a week), and plot them as durations along an x-axis.
These 1-D values can all be graphed along the x-axis, k-means is still possible on them, and the clustering will still be visible on a graph (see the sketch after the link below).
Here are more numpy examples: Python k-means algorithm
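As a rough sketch of that workaround, here is how the date-to-number conversion plus k-means could look in Python. The example dates are invented, and using scipy's kmeans2 is my own choice of implementation:

from datetime import datetime
import numpy as np
from scipy.cluster.vq import kmeans2

# Invented example dates, just to have something runnable
dates = [datetime(2013, 1, 1, 8), datetime(2013, 1, 1, 9), datetime(2013, 1, 2, 8),
         datetime(2013, 1, 5, 12), datetime(2013, 1, 5, 13), datetime(2013, 1, 5, 14),
         datetime(2013, 1, 9, 7), datetime(2013, 1, 10, 18)]

# Convert each date to hours elapsed since the earliest date (1-D numeric data)
t0 = min(dates)
hours = np.array([(d - t0).total_seconds() / 3600.0 for d in dates])

# Three clusters, intended as "before", "during", "after"
centroids, labels = kmeans2(hours, 3, minit='points')
for d, label in zip(dates, labels):
    print(d, "-> cluster", label)

As noted above, though, k-means will tend to produce intervals of similar length, so treat this only as a starting point.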

How to combine two images with different gains in Matlab/ C++

I have two images, both taken at the same time from the same detector.
Both images have 11-bit resolution (yes, it's odd, but that is the case here). The difference between the two images is that one image has been amplified by a factor of 1 and the other has been amplified by a factor of 10.
How can I take these two 11 bit images, and combine their pixel values to get a single 16 bit image? Basically, this increases the dynamic range of the final image.
I am fairly new to image processing. I know there is a solution for this, since other systems do this on the fly pixel-by-pixel in an FPGA. I was just hoping to be able to do this in Matlab post processing instead of live. I know doing bitwise operations in Matlab can be kinda difficult, but we do have an educational license with every toolbox available.
As mentioned below, this looks an awful lot like HDR processing. The goal isn't artistic, but rather data preservation. This is eventually going to be ported to C++ and flown on an autonomous flight computer, and running standard bloated HDR software on the fly would kill our timing requirements.
Thanks for the help!
As a side note, I'd like to be able to do this for any combination of gains, i.e. 2x and 30x, 4x and 8x, etc. In my gut I feel like this is a deceptively simple algorithm or interpolation, but I just don't know where to start.
Gains
Since there is some confusion on what the gains mean, I'll try to explain. The image sensor (CMOS) being used on our custom camera has the capability to simultaneously output two separate images, both taken from the same exposure. It can do this because the sensor has 2 different electrical amplifiers along its data path.
In photography terms, it would be like your DSLR being able to take a picture using 2 different ISO values at the same time.
Sorry for the confusion
The problem you pose is known as "High Dynamic Range Imaging" and "Tone Mapping". I suggest you start with those Wikipedia articles, then drill down to the bibliography cited therein.
You don't provide enough details about your imagery to give a more specific answer. What is the "gain" you mention? Did you crank up the sensor's gain (to what ISO-equivalent number?), or did you use a longer exposure time? Are the 11-bit pixel values linear or already gamma-compressed?
To upscale an 11-bit range to a 16-bit range, multiply by (2^16-1)/(2^11-1).
(Assuming you want linear scaling, which is reasonable when scaling up.)
If the gain was discrete (applied within the 11-bit range), then you have two 11-bit images which may have some values saturated.
If the gain was applied in a continuous (analog) or floating point range, then your values can go beyond the original 11 bits. In that case the values were probably also scaled to another range first, e.g. [0,1] (by dividing by (2^11-1)).
If the values were scaled to another range, you will have to divide by the maximum of the new range instead of by (2^11-1).
Either way (whether the gain was applied in the 11-bit range or not), due to the gain and due to the addition, the resulting values may be larger than the original range. In this case, you need to decide how you want to scale them:
Do you want to scale the original 11-bit range to 16 bits (possibly causing saturation)?
If so, multiply by (2^16-1)/(2^11-1).
Do you want to scale the maximum possible value to 2^16-1?
If so, multiply by (2^16-1)/((2^11-1) * (G1+G2)).
Do you want to scale the actual maximum value to 2^16-1?
If so, multiply by (2^16-1)/max(I1+I2).
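Here is a small numpy sketch of those three scaling options applied to a simple additive combination. The gains and images are placeholders I made up; in practice you would load the two real 11-bit captures:

import numpy as np

# Placeholder gains and images (assumptions for illustration only)
G1, G2 = 1.0, 10.0
rng = np.random.default_rng(0)
I1 = rng.integers(0, 2**11, (480, 640)).astype(np.float64)   # low-gain capture
I2 = np.clip(I1 * (G2 / G1), 0, 2**11 - 1)                   # crude stand-in for the high-gain capture

S = I1 + I2   # combined image before rescaling

# Option 1: map the original 11-bit range onto 16 bits (may saturate)
out1 = np.clip(S * (2**16 - 1) / (2**11 - 1), 0, 2**16 - 1).astype(np.uint16)

# Option 2: map the maximum possible value onto 2^16-1
out2 = (S * (2**16 - 1) / ((2**11 - 1) * (G1 + G2))).astype(np.uint16)

# Option 3: map the actual maximum of the combined image onto 2^16-1
out3 = (S * (2**16 - 1) / S.max()).astype(np.uint16)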
Edit:
Since you do not want to add the images, but rather use the different details in them, perhaps this article will help you:
Digital Photography with Flash and No-Flash Image Pairs

Google Visualization Annotated Time Line, removing data points

I am trying to build a graph that will change resolution depending on how far you are zoomed in. Here is what it looks like when you are completely zoomed out.
This looks good, so when I zoom in I get higher resolution data and my graph looks like this:
The problem is when I zoom out the higher resolution data does not get cleared out of the graph:
The tables below the graphs display what is in the DataTable. This is what the drawing code looks like:
var g_graph = new google.visualization.AnnotatedTimeLine(document.getElementById('graph_div_json'));
var table = new google.visualization.Table(document.getElementById('table_div_json'));
function handleQueryResponse(response) {
    log("Drawing graph");
    var data = response.getDataTable();
    // Redraw the chart and the table with whatever is in the new DataTable
    g_graph.draw(data, {allowRedraw: true, thickness: 2, fill: 50, scaleType: 'maximized'});
    table.draw(data, {allowRedraw: true});
}
I am trying to find a way for it to only display the data that is in the DataTable. I have tried removing the allowRedraw flag, but then it breaks the zooming operation.
Any help would be greatly appreciated.
Thanks
See also
Annotated TimeLine when zoomed-out, Too Many Datapoints.
You can remove the allowRedraw flag. In that case you have to manually add two data points to your data table:
The latest date in the actual whole data.
The earliest date in the actual whole data.
This will retain your zooming operation.
I think you have already seen that removing the allowRedraw flag works, but with a small problem: the whole chart flickers.
It seems to me that the best solution would be to draw every nth data point, depending on your level of zoom. On the Google Finance graph(s), the zoom levels are pre-determined at the top: 1m, 5m, 1h, 1 day, 5 days, etc. It seems evident that this is exactly what Google is doing. At the max view level, they're plotting points that fall on the month. If you're polling 1000 times a day (with each poll generating a single point), then you'd be taking every 30,000th point (the first point being the very first one of the month, and the 30,000th one being the last point).
Each of these zoom levels would implement a different plot of the data points. Each point should have a time stamp with accuracy to the second, so you'll easily be able to scale the plot based on the level of detail.
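As an illustrative Python sketch of that "every nth point" idea for preparing the data before it is handed to the chart. The zoom-to-stride mapping and the poll rate are assumptions of mine, not part of the question:

def downsample(points, stride):
    """Keep every `stride`-th point, always including the last one."""
    if stride <= 1 or len(points) <= 2:
        return list(points)
    kept = points[::stride]
    if kept[-1] != points[-1]:
        kept.append(points[-1])
    return kept

# Assumed mapping from zoom level to stride, for roughly 1000 points per day
STRIDE_BY_ZOOM = {'1 day': 1, '5 days': 5, '1 month': 30, '6 months': 180, 'max': 30000}

points = [(i, i % 7) for i in range(100000)]   # placeholder (timestamp, value) pairs
coarse = downsample(points, STRIDE_BY_ZOOM['max'])
print(len(points), "->", len(coarse), "points")

Each zoom level would then request (or select) its own pre-thinned series, which keeps the DataTable small when zoomed out and detailed when zoomed in.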