I have a table with a field that contains lots of XML data, and that data comes back on multiple lines, making it difficult to scroll through.
I have only been able to figure out how to turn off wrapping in the editor, but not in the output table. Is this possible?
I have data like this:
It comes from REDCap, and as you may be able to tell, the data in the far-right columns are repeated variables about each "protocol_title" (the far-left column), i.e. "Love it" and "I want a disc instead" are both about "study 2".
I've imported the data into Power BI and currently I have this:
What I'd like is for the top-left visual to have only one row per study (with columns such as principal investigator and method of image transfer, i.e. columns that had data in the first row) and a visual on the lower left with all the rightmost columns.
By switching the top visual from a table to a matrix I can kinda accomplish this:
But it adds a bunch of unnecessary columns. As an alternative I thought I could add a filter to the top visual that filters to "redcap_event_name" == "protocol_information", which would keep only those top rows... but since the visuals are linked, if I do that it removes everything from the bottom visual. I'd like to keep the link between the visuals so that if I select "study2" in the top visual, it highlights the relevant study 2 information in the bottom one.
So my question is: what's the best approach for making the visuals I want? Are there special settings for visuals? Do I need to do something to the data first in the query? How should I go about this?
You might want to rework your data structure. At first glance, your flat source table could be parsed into two tables:
Protocol
Survey
This can be done in Power Query (a rough M sketch follows the steps below).
For Protocol:
Select columns A to R.
Filter rows where redcap_event_name starts with "protocol_info".
Delete empty rows.
For Survey:
Select columns A (to keep the protocol ID and be able to link both tables), T and U.
Filter rows where redcap_event_name starts with "survey".
Delete empty rows.
You should end up with two tables and a one-to-many relationship between Protocol[Protocol_ID] (column A) and Survey[Protocol_ID] (the same column).
And it should make everything much easier: visuals, calculations...
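Here's a minimal Power Query (M) sketch of the Protocol query, assuming the export is a CSV file and the event column is named redcap_event_name; the file path and the selected column names are placeholders you'd adjust to the real export. Survey follows the same pattern with columns A, T and U and the "survey" filter:

```
// Hypothetical sketch: the file path and column names are assumptions.
let
    Source   = Csv.Document(File.Contents("C:\data\redcap_export.csv"), [Delimiter = ","]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // keep only the protocol-level columns (columns A to R in the flat export)
    Kept     = Table.SelectColumns(Promoted,
                   {"protocol_title", "redcap_event_name",
                    "principal_investigator", "method_of_image_transfer"}),
    // keep only the protocol_information event rows
    Filtered = Table.SelectRows(Kept,
                   each Text.StartsWith([redcap_event_name], "protocol_info")),
    // drop rows where the key column is empty
    NonEmpty = Table.SelectRows(Filtered,
                   each [protocol_title] <> null and [protocol_title] <> "")
in
    NonEmpty
```

Once both queries load, relate Protocol to Survey on the protocol ID column in the model view, and the cross-highlighting between the two visuals works the way you describe.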
I have a CSV file in GCS with hundreds of columns, each field enclosed in quotes, like below:
"John","Doe","5/15/2021 7:18:26 PM"
I need to load this into BigQuery using Data Fusion, and I have created a pipeline. My questions are:
How do I trim the quotes from these columns in the Wrangler? I can't find much documentation for this beyond the basics.
How do I apply this rule to all the columns in one shot?
Please guide me; any good reading on these kinds of operations would also be helpful.
For testing purposes I used your sample data and added a few more entries.
Remove quotes
If your data looks like this and your objective is just to remove the quotes, what you can do is:
Click the drop down arrow beside body
Select Find and replace
In Find, put " and leave Replace blank
Your output will look like this:
Parse CSV to split into columns
You can then convert your CSV to columns:
Click the drop down beside body
Select Parse -> CSV
A pop-up will appear; select "Comma" as the delimiter
This tells the Wrangler to read the data as CSV and split on the commas into columns. But the original data will remain in the body column.
To delete body:
Select body by ticking the check box at the right
Click the drop down beside body
Select Delete column
Your data should now look like this:
So, I've got 3 xlsx files full of data that's already been processed, so I pretty much just have to display the data using graphs. The problem seems to be that Power BI aggregates all numeric data (using count, sum, etc.). In their community they suggest creating new measures; the thing is, in that case I HAVE TO CREATE A LOT OF MEASURES... Also, I tried converting the data to text and even so, Power BI counts it!
Any help, please?
There are several ways to tackle this:
When you pull a field into the field well for a visualisation, you can click the drop down in the field well and select "Don't summarize"
In the data model, select the column and on the ribbon select "Don't summarize" as the summarization option in the Properties group.
The screenshot shows the field well option on the left and the data model options on the right, one for a numeric and one for a text field.
And, yes, you never want to use the implicit measures, i.e. the automatic calculations that Power BI creates. If you want to keep on top of what is being calculated, create your own measures, and yes, there will be many.
Edit: If by "aggregating" you are referring to the fact that text values will be grouped in a table (you don't see any duplicates), then you need to add a column with unique values to the table so all the duplicates of the text values show up. This can be done in the data source by adding an Index column, then using that Index column in the table and setting it to a very narrow width to make it practically invisible.
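A minimal Power Query sketch of that Index approach, assuming the data comes from an Excel sheet (the file path and sheet name here are placeholders):

```
// Hypothetical sketch: the path and sheet name are placeholders.
let
    Source    = Excel.Workbook(File.Contents("C:\data\report.xlsx")){[Item = "Sheet1", Kind = "Sheet"]}[Data],
    Promoted  = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // a unique Index per row stops the table visual from grouping
    // identical text values together
    WithIndex = Table.AddIndexColumn(Promoted, "Index", 1, 1)
in
    WithIndex
```

In the table visual, add the Index column alongside the text columns, then drag its width down so it is effectively hidden.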
In Power BI, I've got some query tables generated from imported data. All the data comes in as type 'Any', and I'm trying to automatically detect the type of the data in each column.
Some of the queries generate tables with columns based on the incoming data - I don't know what the columns are going to be until the query runs and sets up the table (the data comes from an Azure blob). As I will have quite a few tables to maintain, whose columns can change (possibly with new columns being added) on any data refresh, it would be unmanageable to go through all of them each time and press 'Detect Data Type' on the columns.
So I'm trying to figure out how to do a 'Detect Data Type' in the query formula language and attach it to the end of the query that generates the table columns. I've tried grabbing the first entry in a column and doing Value.Type(column{0}); however, this comes out as 'Text' for a column that contains integers. Pressing 'Detect Data Type' does, however, correctly identify the type as 'Whole Number'.
Does anyone know how to detect a column's entry types?
P.S. I'm not too worried about a column possibly holding values of different data types
You seem to have multiple issues here, and your solution will be fragile; there's a better way. But let's first deal with column type detection. Power Query uses the 'any' data type as its go-to data type. You can write a function that samples the rows of a column in a table, does a best-match data type detection, and then explicitly sets the data type of the column. This is messy and tricky since you need to do it once per column.
This might be workable for a fixed schema, but for a dynamic schema you'll run into a couple of things very quickly. First, you'll need to write some crazy PQ code to list all the columns and run your function on each. This will work the first time, but it might break on subsequent refreshes because data model changes are not allowed during refresh. If you're using a tool like Power BI Desktop, you'll be able to fix things up. If you publish your report to the Power BI service, you'll just see refresh errors.
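A rough sketch of such a sampling function in M - the detection logic here is a naive assumption (check the first non-null value with the is operator), not the built-in 'Detect Data Type' algorithm:

```
// Hypothetical sketch: naive per-column type detection by sampling
// the first non-null value of each column.
let
    DetectColumnTypes = (tbl as table) as table =>
        let
            GuessType = (name as text) as type =>
                let
                    Sample = List.First(List.RemoveNulls(Table.Column(tbl, name)), null)
                in
                    if Sample is number then type number
                    else if Sample is date then type date
                    else if Sample is datetime then type datetime
                    else if Sample is logical then type logical
                    else type text,
            Transforms = List.Transform(Table.ColumnNames(tbl), each {_, GuessType(_)})
        in
            Table.TransformColumnTypes(tbl, Transforms)
in
    DetectColumnTypes
```

Note that if the source delivers every value as text (which would explain Value.Type reporting 'Text' for an integer column), the sampled value is itself text, so you'd have to attempt conversions (e.g. Number.FromText wrapped in try) rather than rely on the is checks.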
Dynamic Schemas will suffer the same data model change issue I mentioned above.
The alternative solution that you won't have problems with is using a DirectQuery data source instead of Power Query. If you load your data into Azure SQL or a Tabular Model, the reporting layer will pick up the updated fields automatically, so you don't have to work around this in PQ.
I'm trying to tidy up a sheet with the following problem, and would appreciate any advice.
My sheet has 7 "master columns" and about 4000 rows. It was compiled by converting a load of PDF documents.
The master columns are made up of merged minor columns, but at various parts of the data, the minor columns that make up each master column are different.
e.g. the first master column is made up of merged columns A-H for the first 30 rows, but for the next 25 rows it's made up of merged columns A-G, etc.
As I said, overall there are still the same 7 master columns from top to bottom, but the merging is different throughout...
Can anyone think of a way to fix this without doing it all manually?
Copy your horrible spreadsheet into Word with Home > Clipboard – Paste, Paste Special, Unformatted Text, and replace ^t^t with ^t (empty minor columns leave consecutive tabs behind, and this collapses them). Replace All repeatedly, until Word has completed its search of the document and has made 0 replacements. Copy back into Excel.
This is not tested on your image so there might be some issues – perhaps column misalignments (where even Word's limited regex may help to add back tabs where suitable). The result should be no merged cells – mind you, someone on SE described merged cells along the lines of "a creation of the Devil to test us beyond endurance" (i.e. best avoided).
Try selecting the full sheet and clicking the Unmerge button on the ribbon (Home > Merge & Center > Unmerge Cells).
As per the screenshot you provided, you can select all and unmerge, but getting the corresponding fields in order might be challenging.
Try using a macro to combine these steps into a single key press.