How to plot a clustered bar chart from raw Qualtrics Excel data? - powerbi

My raw Qualtrics data looks something like this. Basically there are just 2 questions: Q1 - where did you learn your tech skills from? and Q2 - do you agree with the following statement?
I want to plot a clustered bar chart in Power BI that looks something like the link below. Basically, for Q1 (where did you learn your tech skills from?), each cluster is a channel of learning, and within each cluster there is a standard response scale: not at all, to a small extent, to a moderate extent, to a great extent, entirely.
I figured out that I cannot plot straight from the raw Qualtrics data. However, if I unpivot just the columns for Q1, I can get the clustered bar chart above.
But here comes the problem: I have other questions in the same raw Qualtrics format. So I tried to unpivot the columns for Q1 first, and then unpivot the columns for Q2, and got the following, which does not make sense because Q1 has 4 sub-questions while Q2 has 5 sub-questions. It is effectively a many-to-many join.
So I thought maybe I could unpivot all the columns except for the Response ID column, and I got this:
Doing the above has several issues:
the number of rows grows very quickly (roughly respondents × sub-questions), and with many more questions and many more respondents the table just gets too large;
when I want to plot the clustered bar chart, I have no way to restrict the rows to just Q1, or just Q2, etc.
I tried googling and was surprised there isn't a similar question already, given how widely Qualtrics is used for survey data.
Appreciate all your help in advance!

Your first step should be to split the data for the different questions into separate tables, then use "Unpivot Other Columns" on everything except Response ID. You can later relate the tables in the report via this Response ID.
From here creating your column charts should be a no-brainer.
Starting with a list of question identifiers ("Q1", "Q2", ...), you can automate the splitting via a custom function.
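For example, a minimal sketch of such a function in M, assuming the export has a "Response ID" column and question columns named with a consistent prefix such as "Q1_1", "Q1_2" (adjust the prefix test to your actual headers):

    // Hypothetical function: keep Response ID plus one question's columns and unpivot them
    (Source as table, questionPrefix as text) as table =>
    let
        // Response ID plus every column whose name starts with the prefix, e.g. "Q1"
        keepCols = Table.SelectColumns(
            Source,
            {"Response ID"} & List.Select(Table.ColumnNames(Source), each Text.StartsWith(_, questionPrefix))
        ),
        // Unpivot everything except Response ID into (Sub-question, Response) pairs
        unpivoted = Table.UnpivotOtherColumns(keepCols, {"Response ID"}, "Sub-question", "Response")
    in
        unpivoted

Invoking it once per question identifier (e.g. with "Q1", then "Q2") gives you one narrow table per question, each still carrying Response ID for the relationships.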
I doubt that with Qualtrics Excel imports you'll come anywhere close to Power BI's data limits. However, I am surprised that they provide such an awkward interface.

Related

Best way to organize data from multiple companies in Power BI

Basically I have a big Excel dataset, about 500x500, with economic information from various companies.
Each row represents a different company and the columns hold the information. A little of it is qualitative, like ZIP code, type, etc., but most of it is quantitative. For each piece of quantitative info we have data for 5 years, so there is one column per year per item, i.e. Debt 2019, Debt 2020, etc.
So my question is: what is the best way to preprocess this data to work with it, and how should it be done? Should the preprocessing happen in Excel, via a script in Power BI, using Power Query, SQL, ...?
The objective is to have a report that will be accessible online; the user will type the name of a company and the dashboard will show the information for that company (only that one), so they can navigate through it.
The structure and the information shown are the same for every company; the only thing that changes is the numbers each company has. So it has to be possible to switch which data is shown (to use the data from the company they want).
It also needs to be able to show comparisons against other groups of companies or against the total.
I want to get this right from the start, because changes get complicated later.
I thought about doing a sort of "relational model", with one table per company for the quantitative data (one row per year and one column per info point) and then a general table with the qualitative data (one row per company and one column per info point). But I am not really sure.
I know how to use Power BI, but I have never used it for something this big. I would like to know which way of organizing this data is better and how to go about it.
Many thanks to everyone.
I thought about doing a sort of "relational model", with one table per company for the quantitative data (one row per year and one column per info point) and then a general table with the qualitative data (one row per company and one column per info point).
Yes, do that.
The general guidance is to use Power Query in Power BI to transform the data into a star schema model. See Understand star schema and the importance for Power BI.
That would typically result in one table with the "dimension" data for each company, a date table, and a "fact" table at the grain of (CompanyId, Date) with the quantitative data.
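As a rough sketch of the fact-table side in Power Query (M), assuming the wide sheet is loaded as a query called CompaniesRaw with a CompanyId key, qualitative columns like ZIP and Type, and quantitative columns named like "Debt 2019" (all of these names are placeholders to adjust):

    let
        Source = CompaniesRaw,
        // The qualitative columns belong in the company dimension table, not the fact table
        dropQualitative = Table.RemoveColumns(Source, {"ZIP", "Type"}),
        // One column per (measure, year) becomes one row per (company, measure, year)
        unpivoted = Table.UnpivotOtherColumns(dropQualitative, {"CompanyId"}, "Attribute", "Value"),
        // "Debt 2019" -> Measure = "Debt", Year = 2019 (split on the last space)
        split = Table.SplitColumn(
            unpivoted, "Attribute",
            Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.None, true),
            {"Measure", "Year"}
        ),
        typed = Table.TransformColumnTypes(split, {{"Year", Int64.Type}, {"Value", type number}})
    in
        typed

The company dimension (CompanyId plus the qualitative columns) and a date/year table then relate to this fact table, which gives you the per-company filtering and the group comparisons you describe.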

Converting a Tableau report with different measures for each row to Power BI

I'm working on a project where we are converting a client from Tableau to PBI. One of the Tableau reports I'm converting looks like this:
Each row is a different calculation (measure). I can achieve a similar look, with regard to the column headers, in PBI by using a matrix. However, there isn't a way that I know of to apply a different measure to each row. The only way I can think of to do this is to create three matrix tables and stack them on top of each other. It won't look nearly as good, but I can generate the same results. Does anyone have a better solution?
Put the Measure Names pill on Rows, Measure Values on Text, and your date fields on Columns. That should give you what you want.

What is the difference between edits performed in the Query Editor vs during modelling?

When I get data into Power BI, I can edit the query as well as make edits to the model.
What is the difference between edits performed in the Query Editor and edits made during modelling?
When you edit the query, you use Power Query, with its own Query Editor user interface. The steps you apply are recorded in the "M" language. Use Power Query to extract, transform, and finally load data into the Data Model.
Once the data is in the Data Model, you use DAX to create measures that you use in visuals. You can also use DAX to add more columns or even tables to the data model.
Whether to use Power Query or DAX to add columns or tables to the data model depends on a variety of factors. Some things are dead easy to do in Power Query, but harder to achieve with DAX, and vice versa. If you create a column with a formula that depends on a DAX measure, then you can only do that with DAX, because Power Query is not aware of the measures that are created after the load into the data model.
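For a feel of the difference, a column added in the Query Editor is recorded as an M step roughly like the following (the step, table, and column names here are made-up examples):

    // Adds a "Margin" column to the result of the previous step ("Sales" here)
    AddedMargin = Table.AddColumn(Sales, "Margin", each [Revenue] - [Cost], type number)

The DAX counterpart would be a calculated column along the lines of Margin = Sales[Revenue] - Sales[Cost], defined in the data model after the load.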
Power Query is very powerful, but the M code syntax is very different to the Excel formula syntax, or the VBA macro language. Learning to write advanced M code can be quite challenging.
DAX, on the other hand, behaves very similarly to Excel formulas. Many Excel functions can even be used in DAX verbatim. If you know Excel, you've already got a head start on DAX, and you can ease your way into it by learning additional functions and then expanding into more complex formulas.
The latter is probably the reason why many data manipulations are done in DAX, even though they could just as well have been done in Power Query.
There are also differences in data storage and performance. Power Query makes use of query folding with SQL sources, for example, where its transformations are actually performed at the data source, i.e. on the SQL Server side, rather than in the desktop client, and only the final query result is transferred to the client.
Edit after comment: When the data is loaded into the data model, an algorithm processes it and sorts it in a way that is most efficient for maximum compression and minimum storage. I don't have any concrete examples, but adding a column in Power Query will generally result in a smaller footprint than adding the same column with DAX. Read more about VertiPaq compression here: https://towardsdatascience.com/inside-vertipaq-in-power-bi-compress-for-success-68b888d9d463
But apart from that, it mainly comes down to personal preference based on skill and experience.
By the way, many of your questions can be answered by reading through the Microsoft documentation, e.g. https://learn.microsoft.com/en-us/power-bi/guidance/import-modeling-data-reduction

Having trouble transforming this dataset for ETL

I'm playing around with some datasets on Kaggle.com, trying to learn better practices for ETL, as I tend to get stuck on specific things in the transform part. For this question, I am dealing with the survey results from Stack Overflow 2018: https://www.kaggle.com/stackoverflow/stack-overflow-2018-developer-survey - specifically the LanguageWorkedWith column.
Currently I am using a combination of RapidMiner and Excel to attempt to change the data. I am not well versed enough in R or Python to solve this problem with code.
The problem with the current column is that it lists all the languages a user has chosen, separated by semicolons. I can easily split a column on a semicolon, but then one of two things occurs:
I get 31 columns, LanguageWorkedWith1 - LanguageWorkedWith31, which makes gathering a count of languages by salary unworkable.
A Cartesian effect where each row is duplicated to accommodate each choice of language. So you'll have a lot of duplicate rows, which definitely affects the integrity of the data. I have also tried using Power BI (the load destination) to remove duplicates on the respondent ID and language, but that didn't work.
Ideally I'd like to do a language-by-salary visual in Power BI, similar to how many kernels have it, but I can't figure out the process for making this happen outside of code. I'm not sure how this would look exactly, but if I can split all the languages and count them, I can at least do something like this:
But I'm not sure if I can relate this back to salary given how the data is structured.
I just want to understand these transformation processes better! Appreciate any help!
The key here is to split into rows instead of columns.
So that you end up with a table like this:
You can keep that row expansion in its own related table in your data model so you aren't creating a giant table.
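In Power Query, a sketch of that split-to-rows step might look like this (assuming the CSV has been loaded as survey_results_public; the query and column names follow the ones used in the measure below):

    let
        Source = survey_results_public,
        // Only the respondent key and the multi-value column are needed in this helper table
        keepCols = Table.SelectColumns(Source, {"Respondent", "LanguageWorkedWith"}),
        // Turn the semicolon-delimited text into a list, then expand the list into rows
        toList = Table.TransformColumns(
            keepCols,
            {{"LanguageWorkedWith", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv)}}
        ),
        splitRows = Table.ExpandListColumn(toList, "LanguageWorkedWith")
    in
        splitRows

Loading this as a table named Language and relating it to survey_results_public on Respondent matches the table references used in the measure below.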
From there it's pretty easy to make visuals provided you know a little bit of DAX. For example, I created an AvgSalary measure (after converting that column to a numeric type) like this:
AvgSalary =
CALCULATE (
    AVERAGE ( survey_results_public[ConvertedSalary] ),
    FILTER (
        survey_results_public,
        survey_results_public[Respondent] IN VALUES ( 'Language'[Respondent] )
    )
)
and was then able to create interesting charts like the following:

Plain report in Power BI similar to a repeatable section in Cognos

I'm new to Power BI but am not seeing something that I feel should be pretty common report functionality. I have a Cognos report that has a list grouped by specific fields; each item in the list has fields, etc. Each "item" is repeated in the list.
Can Power BI do something similar to this functionality? I have been looking at multi-row cards, tables, etc., but I'm not seeing a repeater control or anything that would allow me to mimic this functionality. The multi-row cards would work, but I can't style them the way the customer wants or needs, because the reports are printed and need to match a certain format.
Even single cards could work: if I could drop all the fields I need as single cards and format them how I want, is there a way to have all rows repeated as a "list/set" of those single cards? Right now when I drop a bunch of single cards and a slicer, it displays the first record and that is it. Surely there is a way to get all the records.
Here is an example (I need the formatting to remain basically the same; each row from the data source represents one page that looks like this):
Thanks,
Tim
I don't think this is possible in Power BI yet. A multi-row card has similar functionality, but is not customizable enough to match what you are trying to do. Custom formatting is one of the drawbacks of Power BI at this time.
You can vote for this idea on the Power BI Ideas site, but I'm guessing it's not a high priority for Microsoft for now.