PowerBI: how to create several measures at once?

Is it possible to create several PowerBI measures at once?
Not only in two separate steps like
Measure1 = SUM(df[var1])
and then a 2nd measure
Measure2 = SUM(df[var2])
but in one step like
Measure1 = SUM(df[var1]),
Measure2 = SUM(df[var2])
(this doesn't work)
maybe even in bulk or with some kind of loop?

Have a look at Tabular Editor's Advanced Scripting functionality.
You can write a script in C# to automate the creation of measures:
https://docs.tabulareditor.com/Useful-script-snippets.html
e.g.
var meas1 = c.Table.AddMeasure("Measure1", "SUM(df[var1])", "");
meas1.FormatString = "0.00";
meas1.Description = "This measure is the sum of column var1";
var meas2 = c.Table.AddMeasure("Measure2", "SUM(df[var2])", "");
meas2.FormatString = "0.00";
meas2.Description = "This measure is the sum of column var2";
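Since the question also asks about doing this in bulk or with a loop, the snippet collection linked above follows a pattern along these lines, which creates one SUM measure per column currently selected in Tabular Editor (a sketch; the names, format string and descriptions are just examples):
// Create a SUM measure for every column currently selected in Tabular Editor
foreach(var c in Selected.Columns)
{
    var newMeasure = c.Table.AddMeasure(
        "Sum of " + c.Name,                  // measure name
        "SUM(" + c.DaxObjectFullName + ")",  // DAX expression
        c.DisplayFolder                      // display folder
    );
    newMeasure.FormatString = "0.00";
    newMeasure.Description = "This measure is the sum of column " + c.DaxObjectFullName;
}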

Here is the ugly way to do it.
Save your pbix file using the option Save as and choose the pbit extension (Power BI template). Say the full name will be test.pbit.
In Explorer change the extension of the file from test.pbit to test.zip.
Extract the zipped data from test.zip to a new folder.
Go inside the folder and find the file DataModelSchema.
Open DataModelSchema with Notepad++ or any other favourite text editor.
Ctrl+F and search for your measure. It will take some time to get used to the structure of this file, but hopefully you can grasp it. You may copy and paste measures as you like.
When you are done, save the file.
Copy the saved DataModelSchema file, open test.zip, and paste DataModelSchema inside it, replacing the original.
Follow the reverse path: change the extension back to pbit and open the file in Power BI. That's it.
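For orientation, each measure in DataModelSchema is a small JSON object inside the owning table's measures array, roughly like the following (a sketch; the exact properties depend on the model version):
"measures": [
  { "name": "Measure1", "expression": "SUM(df[var1])" },
  { "name": "Measure2", "expression": "SUM(df[var2])" }
]
Duplicating and renaming one of these entries is effectively what the copy-and-paste step above amounts to.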

Related

Power BI: Power Query slowly loading multiple Excel worksheets

I have a small (1.5 MB) Excel file that contains multiple worksheets. I need to transform each worksheet (two significant transformations are created as separate functions) and then expand the results dynamically (i.e. each worksheet can have a different number of columns, so I needed to extract the list of all distinct column names beforehand).
It's all working fine and the output is meeting my expectations; however, I noticed that when refreshing, Power Query loads over 10 MB of data. After it is done, the Load window resets and more than 10 MB of data is loaded again.
Here's the M code that I am using. I have tested each section and it seems like the Expanded step might be the slowest one.
let
    Source = Excel.Workbook(File.Contents("xxx.xlsx"), null, true),
    Split = Table.SplitColumn(Source, "Name", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Name.1", "Name.2", "Name.3"}),
    TidyUp = Table.RenameColumns(
        Table.RemoveColumns(Split, {"Item", "Kind", "Hidden", "Name.3"}),
        {{"Name.1", "PORTFOLIO"}, {"Name.2", "DATE"}}),
    GetCurrency = Table.AddColumn(TidyUp, "CURRENCY", each GetCurrencyFromSpreadsheet([Data])),
    GetStresses = Table.AddColumn(GetCurrency, "fx", each GetStresses([Data])),
    ColNames = Table.ColumnNames(Table.Combine(GetStresses[fx])),
    Expanded = Table.ExpandTableColumn(GetStresses, "fx", ColNames),
    RemoveData = Table.RemoveColumns(Expanded, {"Data"})
in
    RemoveData
As a result, it takes about 5 minutes to process a single small Excel file. As we expect to receive more similar files in the future, I would like to check with you if you have any ideas about what I can do to improve the code. Thanks.
I would rebuild this using the Power Query Editor UI. That should lead to cleaner code with fewer redundant steps, like your use of Table.Combine with a single input table.
The Table.AddColumn steps would probably be rebuilt as separate queries that are combined using Merge Queries. Set-based logic like that will usually outperform row-by-row function calls.
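Separately, if the double load really comes from GetStresses being evaluated twice (once for ColNames and once for Expanded), buffering that step may help. This is only a guess, not something the answer above suggests:
    // Keep GetStresses in memory so the workbook is only read once,
    // then reuse the buffered table for the column-name scan and the expansion.
    Buffered = Table.Buffer(GetStresses),
    ColNames = Table.ColumnNames(Table.Combine(Buffered[fx])),
    Expanded = Table.ExpandTableColumn(Buffered, "fx", ColNames),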

Using M query to extract an un-named range from multiple excel files?

The M query code below extracts a range of rows from a sheet called "Survey Information" in an Excel file called Paris. But what if I wish to do this not just for a single Excel file, but for all of the Excel files located in Folder1 (Berlin, Milan, etc.)? (Whilst each Excel file contains multiple sheets, each file has a single sheet called "Survey Information".)
Unfortunately there are no named ranges. I have a large number of excel files to pull the data out of.
Very grateful for any insight,
Chris
let
    Source = Excel.Workbook(File.Contents("Q:\Folder1\Paris.xls"), null, true),
    #"Survey Information1" = Source{[Name="Survey Information"]}[Data],
    #"Kept Range of Rows" = Table.Range(#"Survey Information1", 19, 14)
in
    #"Kept Range of Rows"
I think I may have found an answer to my own question: I need to turn the code into a function by prefixing it with (filepath)=>.
I found a step-by-step guide here:
https://www.excelguru.ca/blog/2015/02/25/combine-multiple-excel-workbooks-in-power-query/
All the best.
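In case the link goes stale, here is a minimal sketch of the function version (assuming the same sheet name and row range apply to every workbook the function is invoked against):
// Parameterise the original query on the file path; invoke this function
// for each file returned by a folder query over Q:\Folder1.
(filepath as text) =>
let
    Source = Excel.Workbook(File.Contents(filepath), null, true),
    #"Survey Information1" = Source{[Name="Survey Information"]}[Data],
    #"Kept Range of Rows" = Table.Range(#"Survey Information1", 19, 14)
in
    #"Kept Range of Rows"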

adjusting the project.xml file in a SAS Enterprise Guide project outside SAS EG

We are going to migrate our EG projects (over 1000 projects) to a new environment.
In the old environment we use "W-Latin" as encoding on the Teradata database.
In the new environment we will start using "UTF-8" as encoding on the Teradata database.
There are a lot of other changes which I believe are not relevant for this question.
To prevent data issues we will have to replace functions like REVERSE, etc. with KREVERSE, etc.
We could do this by opening all projects and clicking through them to change the functions in the expression builder.
This would be really time consuming, considering that we have over 1000 .egp files.
We already have a code scanner that unzips the .egp file and detects all uses of these functions in the project.xml file.
The next step could be that we find and replace the functions and put the project.xml file back in the .egp file.
Who can tell me how to put the project.xml file back in the .egp file without corrupting the .egp file?
I was able to do this.
tl;dr -- Zip the files back up and change the extension to .egp.
Created a new EG project and added a code node to create sample data:
data test;
    do cat = "A", "B", "C";
        do i = 1 to 10;
            r = rannor(123);
            output;
        end;
    end;
    drop i;
run;
I then added a Query node to the output to do a "SUM" of the r column by cat.
Ran the flow and got expected output.
Saved the EG project.
Opened the EG Project in 7zip and extracted the archive to a location.
In project.xml, I found the section for the Query and changed the SUM to MEAN
<Expression>
<LHS_TYPE>LHS_FUNCTION</LHS_TYPE>
<LHS_DMCOLGROUP>Numeric</LHS_DMCOLGROUP>
<RHS_TYPE>RHS_COLUMN</RHS_TYPE>
<RHS_DMCOLGROUP>Numeric</RHS_DMCOLGROUP>
<InFormat />
<LHS_String>MEAN</LHS_String>
<LHS_Calc />
<OutputType>OPTYPE_NOTSET</OutputType>
<RHS_StringOne>r</RHS_StringOne>
<RHS_StringTwo />
</Expression>
Selected the files and added them to an archive using 7zip. Selected "zip" compression and saved the file with the ".egp" extension.
I opened the project in EG and ran the flow. The output was now the MEAN of r and not the SUM.
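If you would rather script the re-packaging than use the 7zip UI (useful across 1000+ projects), something along these lines should work. This is a sketch in Python with placeholder paths, not something tested against EG:
import os
import zipfile

extracted = r"C:\temp\MyProject_extracted"  # folder containing the edited project.xml
new_egp = r"C:\temp\MyProject_new.egp"      # an .egp file is just a zip archive

# Re-zip everything in the extracted folder, preserving relative paths,
# and write it straight out with the .egp extension.
with zipfile.ZipFile(new_egp, "w", zipfile.ZIP_DEFLATED) as archive:
    for root, _, files in os.walk(extracted):
        for name in files:
            full_path = os.path.join(root, name)
            archive.write(full_path, os.path.relpath(full_path, extracted))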

Wildcard or equivalent to read in excel file

I have multiple Excel files imported on a daily basis; example code for one of the files is here:
Booked <- read_excel("./Source_Data/CONFIDENTIAL - MI8455 Future Change 20180717.xlsx", skip = 1, sheet = "Appendix 1 - Info Data")
Each day this file changes; the name and structure are always the same, the only difference is the date at the end of the file name.
Is there any way to have R search for the specific name starting with "CONFIDENTIAL - MI8455 Future Change" and import the data accordingly?
To extract the path of the file you can use this pattern:
(?'path'\.\/Source_Data\/CONFIDENTIAL - MI8455 Future Change \d+\.xlsx)
OK, through a lot of trial, error and Google I found an answer, and hope it helps someone else who is new to R and has come across the same problem.
First I needed to identify the file; in the end I used the list.files command:
MI8455 <- list.files(path = "G:/MY/FilE/PATH/MI8455", pattern = "^MI8455_Rate_Change_Report_1.*\\.xlsx$")
If, like myself, your files are in other folders or subfolders of the working directory, the first part of the code specifies where list.files should look. The pattern argument lets you describe what format the name is in, and you can also specify the file type.
Next you can import using the read_excel function, but rather than specifying a file path you tell it to use the value that you created earlier: Customer_2017 <- read_excel(MI8455, skip = 5, sheet = "Case Listing - Eml")
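Applied to the file name in the original question, a minimal sketch along the same lines (assuming the date suffix is YYYYMMDD, so the lexicographically largest match is the newest file):
library(readxl)

# List every matching workbook; full.names = TRUE returns paths that
# read_excel can use directly.
files <- list.files(
  path = "./Source_Data",
  pattern = "^CONFIDENTIAL - MI8455 Future Change \\d{8}\\.xlsx$",
  full.names = TRUE
)

# Pick the newest file by its date suffix and import it as before.
Booked <- read_excel(max(files), skip = 1, sheet = "Appendix 1 - Info Data")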

PDI - Multiple file input based on date in filename

I'm working with a project using Kettle (PDI).
I have to input multiple .csv or .xls files and insert them into a DB.
The file names are AAMMDDBBBB, where AA is the code for the city and BBBB is the code for the shop. MMDD is a date in MM-DD format. For example LA0326F5CA.csv.
The regexp I use in the input file steps looks like LA.*\.csv or DT.*\.xls, which returns all the files to insert into the DB.
Can you indicate how to select just the files for yesterday (based on the MMDD in the file name)?
As you need some "complex" logic in your selection, you cannot filter based only on a regexp. I suggest you first read all the filenames, then filter them based on their "age", then read the files based on the selected filenames.
In detail:
Use the Get File Names step with the same regexp you currently use (LA.*\.csv or DT.*\.xls). You may be more restrictive at that stage with a regexp like LA\d\d\d\d....\.csv, to ensure MM and DD are numbers and the shop code is exactly 4 characters.
Filter based on the date. You could do this with a Java Filter, but it is an order of magnitude easier to use a JavaScript step to compute the "age" of your file and then a Filter rows step to keep only the files of yesterday.
To compute the age of the file from the extracted MM and DD, you can use (other methods are available):
var regexp = filename.match(/..(\d\d)(\d\d).*/);
if (regexp) {
    // JavaScript months are 0-based, so subtract 1 from the MM capture.
    // The year is hard-coded to the current year of the files (2018 here).
    var age = new Date() - new Date(2018, parseInt(regexp[1], 10) - 1, parseInt(regexp[2], 10));
    age = age / 1000 / 60 / 60 / 24;  // milliseconds -> days
}
If you are not familiar with JavaScript regexps: the match tests the filename against the regexp and keeps the values of the parentheses in an array. If the test succeeds (which you must explicitly check to avoid a run-time failure), use the captured values to build the corresponding date and subtract it from today's date to get the age. This age is in milliseconds, which is then converted to days.
Use the Text File Input and Excel Input steps with the option Accept file from previous step. Note that CSV Input does not have this option, but the more powerful Text File Input does.
Well, I replaced the Java Filter with a Modified Java Script Value step and it works fine now.
Another question: how can I increase the performance and speed of my current transformation (right now I have 2 transformations for 2 cities)? My insert/update step makes the transformation slow; it needs almost 1 hour and 30 minutes to process 500k rows with a lot of fields (300 MB), and that is not all of my data. If it runs faster and my company decides to use it, I will be doing this with about 10 TB of data per year, across many transformations and rows. I need suggestions about this.