QuickSight "Geographical fields aren't supported in joins between data sources" - amazon-web-services

I've been trying to work around this issue for a couple of days now without success.
Imagine you have these two dummy datasets
dataset_1
id,latitud,longitude
1,-0.023437,-0.070068
2,-0.069099,-0.069099
dataset_2
id,name
1,"site one"
2,"site two"
and you want to JOIN them by id. This is very straightforward with the QuickSight dataset editor. The issue happens when you change the data type of latitud and longitude to their geospatial types, since the error shown in the title pops up and won't let you save the dataset.
The weird thing is that the error suggests the latitude and/or longitude fields are being used to make the JOIN instead of id.
Before contacting AWS about a possible bug, has anyone had and solved this issue before?

In the end we contacted AWS support. It seems they have this feature under consideration, but it's still not addressed. They did suggest a workaround, though:
Change the datatype of the Geo-spatial field to string and perform the join
Once the join is successful, go back to the dataset page, click on the dataset and select "Use in a new Dataset" option
This will create a new child dataset for the main dataset
Here you can change the datatype back to Geo-spatial and save it
Keep in mind that the "Use in a new Dataset" option is disabled if your dataset has row-level security or if it exceeds 3 levels of JOIN (in which case you'd have to follow darcoli's answer first)

This seems to be a limitation of QuickSight. Can you do the join in custom SQL and then add the fields as geographical coordinates in data preparation?

Related

PowerBI - Filter Table/Report based on User Selection of Slicer falls between Valid From Date & Valid To Date

I have been trying to find an answer to my problem but to no avail, so I'm hoping someone is able to provide some ideas / advice on whether this is possible and, if so, how to go about it. I've tried various things and none have worked.
We create views within SQL and then connect to them using 'Import' from the Data Connectivity Mode when connecting to the SQL Server from within Power BI. Within the view, we use tables that contain a 'Valid From Date' and 'Valid To Date' for each row of data, so when a change occurs a row is closed off and a new row is created. This is so we can limit the rows of data within the table.
When trying to create a report within Power BI, we need to make it so that the end user can use a date drop-down list to select a date, and the data within the whole report then shows any rows where the selected date falls on or between the Valid From and Valid To dates. We use Cards, Tables, Matrix and Charts within our reports, so all would need to reflect the date selected by the user.
I have tried various methods that I could think of to get this to work, but each has limitations where it either doesn't work or only partly works.
Any help / advice on this would be really appreciated.
Many Thanks
Jon

Most efficient Snowflake connection type from PowerBI?

We're trialling PowerBI on a Snowflake dimensional model and performance seems very non-optimised. Can anyone point me to information on best practices for this connection? I've previously used Tableau and there's an excellent white paper describing the pros/cons of each connection type and how to set this up so that as much heavy lifting as possible is done in Snowflake, with minimal load on the viz tool.
e.g. when you summarise 1 million invoices to get a chart of sales volume by year that distils this to 10 data points, Tableau would send 'SELECT year, sum(volume) FROM t GROUP BY year' (~10 rows), but in PowerBI we see SF receiving a query like 'SELECT invoice_id, sum(volume) FROM t GROUP BY invoice_id' (~1M rows) - leaving the viz tool to do a lot more work.
So far, we've tried mapping the individual facts and dimensions within PowerBI, and also using a mix of direct query and import, but without significant improvement. Is there any guidance on best practice?
Thanks in advance!
I've never used Snowflake, and I have no clue about how PowerBI interfaces with it. That said, on the PowerBI side you may be interested in composite models and aggregations.
MS Docs:
https://learn.microsoft.com/en-us/power-bi/desktop-composite-models
https://learn.microsoft.com/en-us/power-bi/desktop-storage-mode
https://learn.microsoft.com/en-us/power-bi/desktop-aggregations
Radacad's blog about aggregations:
https://radacad.com/power-bi-fast-and-furious-with-aggregations
https://radacad.com/dual-storage-mode-the-most-important-configuration-for-aggregations-step-2-power-bi-aggregations
In practice, when you are using a composite model the aggregation functionality allows you to create a hidden table (in import mode) in your model with aggregated data (by year, month, customer, etc).
Now when you query your data, PowerBI will check whether this table can answer the query; if it can, it just picks the data from this table, otherwise it runs a query against the source (direct query).
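For illustration, here is a minimal sketch of what such an aggregated table could look like in Power Query (M). The query name Invoices and the columns InvoiceDate and Volume are assumptions for the example, not names from the original post; the result would be loaded in import mode and then mapped through the aggregations feature.

    let
        // "Invoices" stands for an existing invoice-grain query (hypothetical name)
        Source = Invoices,
        // Derive the grouping key
        AddYear = Table.AddColumn(Source, "Year", each Date.Year([InvoiceDate]), Int64.Type),
        // Pre-aggregate so year-level visuals can be answered from this table
        SalesByYear = Table.Group(AddYear, {"Year"}, {{"TotalVolume", each List.Sum([Volume]), type number}})
    in
        SalesByYear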
The example you shared about PowerBI querying the source without asking for aggregation (but instead asking for every single InvoiceId) might be caused by not setting up the composite model correctly.
A table in "direct query" cannot reference other tables in its query (in this case the calendar) unless that table is also in "Direct query" or "dual" mode.
What does the model look like in the case you shared, and what is the storage mode of each table?

Power BI Visualize Many to Many

I currently have two tables: A "Send ID" table and an "Affiliation Table" each based on a column of customer IDs.
Neither column has purely distinct values, so I cannot create a many-to-one relationship.
I would like to visualize the Send IDs based on the Affiliations as shown here:
Desired Output
I can work with either having the Send IDs repeat per affiliation in the new desired table or have them unique per affiliation - either way works with me.
Any help would be appreciated.
Thank you
noyraz's solution of establishing a many-to-many relationship based on the customerID should suit your needs.
If you need to find out where a customer appears in the affiliation table or sendID table, I highly recommend performing a full outer join in the query editor (a rough sketch of the generated M code follows the steps below).
Using the picture below, right-click on any of the tables and select Reference.
Reference Screenshot
Then rename the table if you like.
Click on Merge Queries.
In the drop-down, select the other table you didn't reference, then click on both customerIDs.
Select Full Outer Join.
Full Outer Join labeled screenshot
Expand the new table column.
Deselect the ID if you like.
Expanding Column Screenshot
If there are occurrences where they don't appear (useful for sending and delivered tables), you can do visual level filters to see where either the Affiliation or SendID is null/blank.
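For reference, the steps above correspond roughly to the following M in the Advanced Editor. This is a sketch only: the query names SendID and Affiliation, the key CustomerID, and the expanded column AffiliationName are placeholders, not names from the question.

    let
        // Start from the referenced query
        Source = SendID,
        // Full outer join on the shared customer key
        Merged = Table.NestedJoin(Source, {"CustomerID"}, Affiliation, {"CustomerID"}, "Affiliation", JoinKind.FullOuter),
        // Expand only the columns you want to keep
        Expanded = Table.ExpandTableColumn(Merged, "Affiliation", {"AffiliationName"})
    in
        Expanded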
When you create a many-to-many relationship like here, all you have to do next is visualize this the way you want.
Hope I understood your question right.

Get "not allowed for columns on the one side of a many-to-one relationship" errors

I have created a Power BI project. It was working fine in the beginning, but when I refresh my data source I get this error: "not allowed for columns on the one side of a many-to-one relationship". Can anyone help me?
I resolved this issue by going into the relationship, right clicking to view properties,
and making it a Many to One Relationship.
Power BI sometimes automatically creates relationships between the queries that are being used to drive the data in the reports. When I have encountered this error or errors like it in the past I:
Go into Manage Relationships
Verify that there is a relationship listed
Evaluate the From and To relationships that are listed as active
Delete any invalid From and To relationships between separate data sets
My most common issue in the past has been that I will have two very different queries pulling data from separate sources with similar column names and Power BI will generate a relationship between them that is invalid. After removing the relationship it has always resolved my issue.
In my case the issue was related to the fact that Power BI was treating a "SQL View" as a "Table" and as a result was creating relationships for it.
Although I checked "Manage Relationships":
and removed one relationship which was not reasonable, the issue still persisted.
The issue was solved when I looked in the "Relations" tab, found unreasonable relationships related to my view, and removed them.
None of these answers helped me. For me I received this error when trying to refresh my dataset which had previously worked for some time. After investigating I found the schema of the source database had changed. Two fields that previously didn't allow nulls now allowed nulls and had null values for some rows. My Power BI model still expected these fields not to contain nulls but was throwing this same and very misleading error:
Data source error: Column 'x' in Table 'y' contains blank values and this is not allowed for columns on the one side of a many-to-one relationship or for columns that are used as the primary key of a table. Table: y.
Initially, on seeing this error, I opened up my report in PowerBI Desktop and went to Modeling > Manage relationships. I looked for a relationship on table y involving column x, but no such relationship existed!? Was I confused? You bet.
After investigating further I discovered the database schema change and resolved the error by updating my Power BI model: in the data model editor, I expanded table y in the Fields panel on the right-hand side, selected field x, expanded "Advanced" in the Properties panel, and changed "Is nullable" from No to Yes. I then applied the changes, saved the report and refreshed the dataset.
I followed these steps
Step 1: Go to the Model section from the left side of the Power BI Desktop
Step 2: Delete all the relationships (or connections) amongst the tables that have been created by Power BI itself while you were working with the Power Query Editor
Step 3: Click the 'Refresh visual and data' option in Home (beside the Transform Data button)
It worked and loaded the new data and also applied the automations done in the query editor.
I got this error on a completely new table made in Power Query, which was weird since I hadn't had a chance to create a relationship yet.
Easy fix: apply a filter that removes all blanks on that column, then delete this new filter again (a rough sketch of the generated step is shown below).
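As a sketch, the step that filter adds looks something like the following in M; PreviousStep and KeyColumn are placeholders for the prior step and the offending column.

    let
        Source = PreviousStep,  // placeholder for the prior step or query
        // Keep only rows where the key column is neither null nor empty
        RemovedBlanks = Table.SelectRows(Source, each [KeyColumn] <> null and [KeyColumn] <> "")
    in
        RemovedBlanks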
Another problem you might have is that "Autodetect new relationships" is turned on:
You can disable auto-detection of relationships under "File/Options & settings/Options/Current File/Data Load/Relationship/Autodetect new relationships after data is loaded"
I had the same issue. After spending hours searching for a fix and not finding anything, I started scratching around and found that the problem took one second to fix. My issue, specifically, was an additional relationship created within the model. The connection showed up as a "dotted" line on one of my tables. I deleted the relationship, refreshed. Done.
This happens when your table or connected tables in Excel contain a blank row. To resolve this, click anywhere in the Excel table, then click Table Tools, then Resize Table, and select the entire range of cells to include all rows, making sure no blank rows are included. Save, go back to Power BI and refresh again; it should all work.

Power Query Formula Language - Detect type of columns

In Power BI, I've got some query tables generated from imported data. All the data comes in as type 'Any', and I'm trying to automatically detect the type of the data in each column.
Some of the queries generate tables with columns based on the incoming data - I don't know what the columns are going to be until the query runs and sets up the table (the data comes from an Azure blob). As I will have quite a few tables to maintain, whose columns can change (possibly with new columns being added) on any data refresh, it would be unmanageable to go through all of them each time and press 'Detect Data Type' on the columns.
So I'm trying to figure out how I can do a 'Detect Data Type' in the query formula language to attach to the end of the query that generates the table columns. I've tried grabbing the first entry in a column and doing Value.Type(column{0}); however, this comes out as 'Text' for a column which has integers in it. Pressing 'Detect Data Type' does, however, correctly identify the type as 'Whole Number'.
Does anyone know how to detect a column's entry types?
P.S. I'm not too worried about a column possibly holding values of different data types
You seem to have multiple issues here, and your solution will be fragile; there's a better way. But let's first deal with column type detection. Power Query uses the 'any' data type as its go-to data type. You can write a function that samples the rows of a column in a table, does a best-match data type detection, then explicitly sets the data type of the column (a rough sketch follows below). This is probably messy and tricky since you need to do it once per column. It might be workable for a fixed schema, but for a dynamic schema you'll run into a couple of things very quickly. First, you'll need to write some crazy PQ code to list all the columns and run your function on each. This will work the first time, but might break on subsequent refreshes because data model changes are not allowed during refresh. If you're using a tool like Power BI Desktop, you'll be able to fix things up. If you publish your report to the Power BI service, you'll just see refresh errors.
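For illustration only, a sketch of that sampling approach might look like the following in M. It samples up to 100 non-null values per column, tries number and date before falling back to text, and every name in it is made up for the example.

    let
        DetectColumnTypes = (tbl as table) as table =>
            let
                // Guess a column's type from a sample of its values
                GuessType = (values as list) as type =>
                    let
                        sample = List.FirstN(List.RemoveNulls(values), 100)
                    in
                        if List.Count(sample) = 0 then type text
                        else if List.MatchesAll(sample, each not (try Number.From(_))[HasError]) then type number
                        else if List.MatchesAll(sample, each not (try Date.From(_))[HasError]) then type date
                        else type text,
                // Build {column name, guessed type} pairs for every column
                transforms = List.Transform(
                    Table.ColumnNames(tbl),
                    (name) => {name, GuessType(Table.Column(tbl, name))})
            in
                Table.TransformColumnTypes(tbl, transforms)
    in
        DetectColumnTypes

You could invoke it as the last step of a query (e.g. DetectColumnTypes(PreviousStep)), but as noted above, a changing schema can still break a scheduled refresh in the service.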
Dynamic Schemas will suffer the same data model change issue I mentioned above.
The alternative solution that you won't have these problems with is using a Direct Query data source instead of Power Query. If you load your data into Azure SQL or a Tabular Model, the reporting layer will pick up the updated fields automatically, so you don't have to work around this in PQ.