I am using the SAS Enterprise Miner 13.2.
I have a SAS table as a data source. In this table i have a binary variable D_TYP ( "I" and "P" ) and other categorical variables.
I want to split the data by D_TYP so i got two tables. One with all "I" and the other with "P". The problem i don’t know how.
I have been looking in the taskbar and i tried Filter and Data Partition. I can probably use SAS Code to split the Data but i think there is an other way with the taks.
You could use two filter nodes to do the job, with one filtering out I and the another filtering out P. The resulted data set should only consist of one type of the binary variable. In case you are not familiar with the filter node, click on the option Class Variable at properties panel and apply User specified filter. You have to manually select the group by clicking on its corresponding bar.
Related
I want to implement on PowerBI a calculator that I developed in Excel.
Basically, it works this way:
I have a list of Dates:
I have a Database that combines in a key the name of the source with a date:
I have a calculation table where I apply into a Dropdown menu value an entire column, it combines with the Source, forming a key, where I can calculate the Source variation choosing two dates (an initial and an end date).
I would like to know how can I apply it into Power Bi, specially step 3. With a Dropdown menu that is applied to an entire column "dynamically"
You can do this with Calculation Groups. To use them, you have to use the free external tool Tabular Editor.
Setting dynamic date ranges like this is a very common use for Calculation Groups.
Here's an example of a prior year Calculation Item:
CALCULATE(SELECTEDMEASURE(), SAMEPERIODLASTYEAR(cal[date]))
You can create multiple Calculation Items to define all the various periods you need.
You can then set the calculation group as the field for a dropdown menu, and each calculation item you defined will be an option in the menu.
Here's an intro article. This same website has the best training on calculation groups you can find online. (And it's all free.)
I have two sets of identical data that I filter differently. One shows sales by location in test locations and the other in control locations. Is there a way to append the results in a table with a "Test/Control" flag based on the first set of slicers so that I can show all the locations color coded by the flag?
You have two options to achieve this. In the Model (DAX) you can create a calculated Table, and use the UNION function to append the two sets of rows together.
https://dax.guide/union/
However UNION is quite fussy - the two parameter tables must have the same set of columns. Sometimes you can overcome small differences by adding other functions, but complex transforms are harder and you cant debug.
For complex requirements, you can use the Power Query Editor - it has an Append Query button on the Home Ribbon. Each query you feed in can have complex transformations.
so, I got 3 xlsx full of data already treated, so I pretty much just got to display the data using the graphs. The problem seems to be, that Powerbi aggregates all numeric data (using: count, sum, etc.) In their community they suggest to create new measures, the thing is, in that case I HAVE TO CREATE A LOT OF MEASURES...Also, I tried to convert the data to text and even so, Powerbi counts it!!!
any help, pls?
There are several ways to tackle this:
When you pull a field into the field well for a visualisation, you can click the drop down in the field well and select "Don't summarize"
in the data model, select the column and on the ribbon select "don't summarize" as the summarization option in the Properties group.
The screenshot shows the field well option on the left and the data model options on the right, one for a numeric and one for a text field.
And, yes, you never want to use the implicit measures, i.e. the automatic calculations that Power BI creates. If you want to keep on top of what is being calculated, create your own measures, and yes, there will be many.
Edit: If by "aggregating" you are referring to the fact that text values will be grouped in a table (you don't see any duplicates), then you need to add a column with unique values to the table so all the duplicates of the text values show up. This can be done in the data source by adding an Index column, then using that Index column in the table and setting it to a very narrow with to make it invisible.
i need to use a append object after a series of join that have a conditional run... So the join step may be not execute if the condition is not verified and his work physical dataset will not be created.
The problem is that the append step take an error if one o more input physical dataset are not created.
Is there a smart way to create a physical empty table from a metadata structure of the works table of the joins or to use the append with some non-created datasets?
The create table with the list of all field is not a real solution because i've to replicate it per 8 different joins and then replicate the job 10 times...
Thanks to all
Roberto
Thank you for your comments.
What you should do:
Amend your conditional node so that it would on positive condition to create a global macro variable with value of MAX. On negative condition to create the same variable with value of 0.
Replace offending SQL step with "CREATE TABLE" node
In the options for "CREATE TABLE", specify macro variable for "MAXIMUM OUTPUT ROWS (OUTOBS)". See the picture below for example of those options.
So now when your condition is not met, you will always end up with an empty table. When condition is met, the step executes normally.
I must say my version of DI Studio is a bit old. In my version SQL node doens't allow passing macro variables to SQL options, only integers can be typed in. Check if your version allows it because if it does, then you can amend existing SQL step and avoid replacing it with another node.
One more thing, you will get a warning when OUTOBS options is less then the resulting would be dataset.
Let me know if you have any questions.
See the picture for create table options:
At the end i've created another step that extract 0 row from the source table by the condition 1=0 in the where tab. In this way i have a empty table that i can use with a data/set in the post sql of the conditional run if the work table of the join does not exist.
This is not a solution but a valid work around.
In Power BI, I've got some query tables generated from imported data. All the data comes in as type 'Any', and I'm trying to automatically detect the type of the data in each column.
Some of the queries generate tables with columns based on the in-coming data - I don't know what the columns are going to be until the query runs and sets up the table (data comes from an Azure blob). As I will have quite a few tables to maintain, which columns can change (possibly new columns being added) with any data refresh, it would be unmanageable to go through all of them each time and press 'Detect Data Type' on the columns.
So I'm trying to figure out how I can do a 'Detect Data Type' in the query formula language to attach to the end of the query that generates the table columns. I've tried grabbing the first entry in a column and do Value.Type(column{0}), however this seems to come out as 'Text' for a column which has integers in it. Pressing 'Detect Data Type' does however correctly identifies the type as 'Whole Number'.
Does anyone know how to detect a column's entry types?
P.S. I'm not too worried about a column possibly holding values of different data types
You seem to have multiple issues here. And your solution will be fragile, there's a better way. But let's first deal with column type detection. Power Query uses the 'any' data type as it's go to data type. You can write a function that samples the rows of a column in a table does a best match data type detection then explicitly sets the data type of the column. This is probably messy and tricky since you need to do it once per column. This might be workable for a fixed schema but for a dynamic schema you'll run into a couple of things very quickly. First you'll need to write some crazy PQ code to list all the columns and run you function on each. This will work the first time, but might break in subsequent refreshes because data model changes are not allowed during refresh. If you're using a tool like Power BI Desktop, you'll be able to fix things up. If you publish your report to the Power BI service, you'll just see refresh errors.
Dynamic Schemas will suffer the same data model change issue I mentioned above.
The alternate solution that you won't have problems with is using a Direct Query data source instead of using Power Query. If you load your data into Azure SQL or a Tabular Model, the reporting layer will get the updated fields automatically so you don't have to try to work around using PQ.