How to split column in Power Query by the first space? - powerbi

I am using Power Query and have a column called LandArea; example data is "123.5 sq mi". It is of data type text. I want to remove the "sq mi" part so I just have the number value, 123.5. I tried the Replace function to replace "sq mi" with blank but that doesn't work because it looks at the entire text. So I tried to use Split where I split it on the space and it generated this formula below, and it did create a new column, but with null for all values. The original column still had "123.5 sq mi".
Table.SplitColumn(#"Reordered Columns1","LandArea",Splitter.SplitTextByDelimiter(" ", QuoteStyle.None),{"LandArea.1", "LandArea.2"})
When just splitting at the left-most delimiter:
Table.SplitColumn(#"Reordered Columns1","LandArea",Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.None, false),{"LandArea.1", "LandArea.2"})
I have also tried changing to QuoteStyle.Csv. Any idea how I can get this to work?

Use this to create a custom column:
= Table.AddColumn(
#"Reordered Columns1",
"NewColumn",
each Text.Start([LandArea],Text.PositionOf([LandArea]," "))
)
UPDATE: Every value appears to end in "sq mi", so you can strip that suffix and convert what is left to a number:
= Table.AddColumn(#"Changed Type", "Custom", each Number.FromText(Text.Replace([LandArea], " sq mi", "")), type number)
Hope it helps.

This is what I ended up using:
Table.AddColumn(#"Reordered Columns1", "LandArea2",
each Text.Start([LandArea], Text.PositionOf([LandArea], "sq")-1))
I avoided trying to find whitespace.
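The answers above return text. As a minimal sketch of the same idea that also converts the result to a number, the query below uses Text.BeforeDelimiter and Number.FromText. The sample table, the column name LandAreaNumber, and the "en-US" culture (so that "." is read as the decimal separator) are assumptions for illustration, not part of the original question.

```powerquery
let
    // Hypothetical sample data standing in for the real LandArea column
    Source = Table.FromRecords({
        [LandArea = "123.5 sq mi"],
        [LandArea = "42 sq mi"]
    }),
    // Keep everything before the first space, then parse it as a number
    Added = Table.AddColumn(
        Source,
        "LandAreaNumber",
        each Number.FromText(Text.BeforeDelimiter([LandArea], " "), "en-US"),
        type number
    )
in
    Added
```

If a value contains no space at all, Text.BeforeDelimiter returns the whole text, so Number.FromText would raise an error for rows that are not purely numeric.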

Related

Is there any way to do format matching within a column in Power BI (something similar to fuzzy matching)?

I have a column look like as below.
DK060
DK705
DK715
dk681
dk724
Dk716
Dk 685 (there is a space after Dk).
This is obviously due to human error. Is there any way that I can ensure the format is correct based on the specified format which is two uppercase DK followed by three digits?
Or Am I being too ambitious!!??
Go to the Power Query Editor, open the Advanced Editor, and paste these two steps:
#"Uppercase" = Table.TransformColumns(#"Source",{{"Column", Text.Upper, type text}}),
#"Replace Value" = Table.ReplaceValue(#"Uppercase"," ","",Replacer.ReplaceText,{"Column"})
Note: if needed, replace #"Source" in the Uppercase step with the name of your previous step.
With the sample data from the question, the expected result is DK060, DK705, DK715, DK681, DK724, DK716, and DK685.
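Cleaning the values does not by itself confirm that every entry follows the "DK plus three digits" format. The following is a rough sketch of one way to flag rows that still do not match after cleaning. The sample rows (including the deliberately invalid "DKX12"), the step names, and the IsValid column are assumptions made for illustration.

```powerquery
let
    // Hypothetical sample data, including one value that cannot be repaired
    Source = Table.FromColumns({{"DK060", "dk681", "Dk 685", "DKX12"}}, {"Column"}),
    // Remove spaces and convert to upper case in a single step
    Cleaned = Table.TransformColumns(
        Source,
        {{"Column", each Text.Upper(Text.Remove(_, " ")), type text}}
    ),
    // Valid means: exactly 5 characters, starts with "DK", last 3 are all digits
    Checked = Table.AddColumn(
        Cleaned,
        "IsValid",
        each
            let
                digits = Text.End([Column], 3)
            in
                Text.Length([Column]) = 5
                and Text.StartsWith([Column], "DK")
                and Text.Select(digits, {"0".."9"}) = digits,
        type logical
    )
in
    Checked
```

In this sketch the first three rows come out as valid and "DKX12" is flagged as invalid, so remaining problem rows can be filtered and reviewed by hand.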

PowerQuery: How to replace text with each column name for multiple columns

I'm trying to replace "x" in each column (excepts for the first 2 columns) with the column name in a table with an unknown number of columns but with at least 2 columns.
I found the code to change one column, but I want it to be dynamic:
#"Ersatt värde" = Table.ReplaceValue(Källa,"x", Table.ColumnNames(Källa){2},Replacer.ReplaceText,{Table.ColumnNames(Källa){2}})
Any ideas on how to solve it?
If I understand correctly, I think you can try either approach below:
#"Ersatt värde" =
let
columnsToTransform = List.Skip(Table.ColumnNames(Källa), 2),
accumulated = List.Accumulate(columnsToTransform, Källa, (tableState as table, columnName as text) =>
Table.ReplaceValue(tableState,"x", columnName, Replacer.ReplaceText, {columnName})
)
in accumulated
or:
#"Ersatt värde" =
let
columnsToTransform = List.Skip(Table.ColumnNames(Källa), 2),
transformations = List.Transform(columnsToTransform, (columnName) => {columnName, each
Replacer.ReplaceText(Text.From(_), "x", columnName)}),
transformed = Table.TransformColumns(Källa, transformations)
in transformed,
Both ways follow a similar approach:
Figure out which columns to do replacements in (i.e. all except the first 2 columns)
Loop over columns determined in previous step and actually do the replacement.
I've used Replacer.ReplaceText since that's what you'd used in your question, but I believe this will replace both partial matches and full matches.
If you only want full matches to be replaced, I think you can use Replacer.ReplaceValue instead.
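As a small illustration of the full-match variant, here is the first approach again with Replacer.ReplaceValue, run against a made-up table. The table Sample and its column names are hypothetical stand-ins for Källa.

```powerquery
let
    // Hypothetical stand-in for Källa: two fixed columns, then flag columns
    Sample = Table.FromRecords({
        [Id = 1, Name = "a", Red = "x", Blue = "xx"],
        [Id = 2, Name = "b", Red = "",  Blue = "x"]
    }),
    // All column names except the first two
    columnsToTransform = List.Skip(Table.ColumnNames(Sample), 2),
    // Replace cells that are exactly "x" with the name of their column
    replaced = List.Accumulate(
        columnsToTransform,
        Sample,
        (tableState as table, columnName as text) =>
            Table.ReplaceValue(tableState, "x", columnName, Replacer.ReplaceValue, {columnName})
    )
in
    replaced
```

With Replacer.ReplaceValue the "xx" cell is left alone because it is not an exact match; with Replacer.ReplaceText it would become "BlueBlue".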

Powerquery, does string contain an item in a list

I would like to filter on whether multiple text columns ([Name], [GenericName], or [SimpleGenericName]) contains a substring from a list. The text is also mixed case so I need to do a Text.Lower([Column]) in there as well.
I've tried the formula:
= Table.SelectRows(#"Sorted Rows", each List.Contains(MED_NAME_LIST, Text.Lower([Name])))
However, this does not work as the Column [Name] does not exactly match those items in the list (e.g. it won't pick up "Methylprednisolone Tab" if the list contains "methylprednisolone")
An example of a working filter, with some of the list written out, is:
= Table.SelectRows(#"Sorted Rows", each Text.Contains(Text.Lower([Name]), "methylprednisolone") or Text.Contains(Text.Lower([Name]), "hydroxychloroquine") or Text.Contains(Text.Lower([Name]), "remdesivir") or Text.Contains(Text.Lower([GenericName]), "methylprednisolone") or Text.Contains(Text.Lower([GenericName]), "hydroxychloroquine") or Text.Contains([GenericName], "remdesivir") or Text.Contains(Text.Lower([SimpleGenericName]), "methylprednisolone") or Text.Contains(Text.Lower([SimpleGenericName]), "hydroxychloroquine") or Text.Contains([SimpleGenericName], "remdesivir"))
I would like to make this cleaner than having to write all of this out, as I would also like to be able to expand the list from a referenced table to make this a dynamic search.
Thank you in advance
If I have a list of medicines and I need to filter my table to only keep rows where certain columns (we'll specify which ones exactly later) contain case-insensitive, partial matches for any of the items in that list, then one way to do this might be:
let
MED_NAME_LIST = {"MEthYlprednisolone", "hYdroxychloroquine", "rEMdesivir"},
initialTable = Table.FromRows({
{"Methylprednisolone Tab", "train", "car", "bike"},
{"no", "no", "no", "no"},
{"tram", "teleport", "hydroxychloroQuine Tab", "jet"},
{"no", "no", "no", "yes"},
{"REMdesivir Tab", "bus", "taxi", "concord"}
}, type table [Name = text, GenericName = text, SimpleGenericName = text, SomeOtherColumn = text]),
filtered = Table.SelectRows(initialTable, each List.ContainsAny(
{[Name], [GenericName], [SimpleGenericName]},
MED_NAME_LIST,
(rowValue as text, medicineFromList as text) as logical => Text.Contains(rowValue, medicineFromList, Comparer.OrdinalIgnoreCase)
))
in
filtered
In filtered, List.ContainsAny is used to determine if any of the specified columns (Name, GenericName, SimpleGenericName) contain a "match" for any of the values in MED_NAME_LIST.
The criteria for the "match" is that:
case sensitivity must be ignored (hence Comparer.OrdinalIgnoreCase is used)
the match must be partial (hence Text.Contains is used)
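The above code keeps the first, third, and fifth rows of the sample table (the ones containing "Methylprednisolone Tab", "hydroxychloroQuine Tab", and "REMdesivir Tab"), which I believe is the filtering behaviour you described.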
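The question also asks about driving the search from a referenced table. A minimal sketch of that, assuming a hypothetical table MedicineTable with a single column named Medicine and no null values in the searched columns, might look like this. It uses List.AnyTrue with Text.Contains instead of List.ContainsAny, purely as an alternative way of writing the same check.

```powerquery
let
    // Hypothetical lookup table; in practice this would be another query
    MedicineTable = Table.FromColumns({{"methylprednisolone", "remdesivir"}}, {"Medicine"}),
    // Referencing a column by name yields its values as a list
    MED_NAME_LIST = MedicineTable[Medicine],
    // Hypothetical data to filter
    Data = Table.FromRecords({
        [Name = "Methylprednisolone Tab", GenericName = "steroid"],
        [Name = "Aspirin",                GenericName = "salicylate"],
        [Name = "Veklury",                GenericName = "Remdesivir"]
    }),
    // Keep a row if any medicine name occurs in Name or GenericName, ignoring case
    Filtered = Table.SelectRows(
        Data,
        (row) =>
            List.AnyTrue(
                List.Transform(
                    MED_NAME_LIST,
                    (med) =>
                        Text.Contains(row[Name], med, Comparer.OrdinalIgnoreCase)
                        or Text.Contains(row[GenericName], med, Comparer.OrdinalIgnoreCase)
                )
            )
    )
in
    Filtered
```

Because MED_NAME_LIST is read from the table, adding a row to MedicineTable extends the search without touching the filter step.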

Power Query - Table.TransformColumns

I'm trying to add leading zeros to a column in Power Query called JobCodes. I know I can do this by adding a new column using Text.PadStart([JobCodes],5,"0"), but I don't want to add a new column and then go back and remove the column I don't need. I want to do this in one step using the Table.TransformColumns function. Is this possible?
Code:
Table.TransformColumns(#"Changed Type", each Text.PadStart([JobCodes],5,"0"))
Error:
Expression.Error: We cannot convert a value of type Function to type List.
Details:
Value=Function
Type=Type
Your syntax is just a bit off.
I think this is what you want:
= Table.TransformColumns(#"Changed Type",{{"JobCodes", each Text.PadStart(_, 5,"0")}})
The error occurs because the second argument must be a list of the columns you want to transform (notice the {{...}} above).
The easiest way to get the syntax right is to use the GUI to do a transformation and then edit the generated function a bit. For example, you could use Format > Add Prefix, which would give you the following step (assuming you choose the prefix 000):
= Table.TransformColumns(#"Changed Type", {{"JobCodes", each "000" & _, type text}})
Just take out the "000" & _ and put in the transformation you actually want.
Or, using Table.ReplaceValue:
= Table.ReplaceValue( Source, each [JobCodes] ,each Text.PadStart(Number.ToText([JobCodes]),5,"0") ,Replacer.ReplaceValue,{"JobCodes"})
The correct expression, which also converts non-text values with Text.From before padding, is:
= Table.TransformColumns(#"Changed Type",{{"JobCodes", each Text.PadStart(Text.From(_), 5, "0"), type text}})
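To try the in-place transformation in isolation, here is a self-contained sketch. The sample values and step names are made up; the numeric JobCodes column is an assumption chosen to show why Text.From is useful before Text.PadStart.

```powerquery
let
    // Hypothetical sample data: job codes stored as numbers
    Source = Table.FromColumns({{42, 1234, 12345}}, {"JobCodes"}),
    // Transform the column in place: convert to text, then left-pad to 5 characters
    Padded = Table.TransformColumns(
        Source,
        {{"JobCodes", each Text.PadStart(Text.From(_), 5, "0"), type text}}
    )
in
    Padded
    // JobCodes becomes "00042", "01234", "12345"
```

No helper column is added or removed; the list {"JobCodes", function, type} tells Table.TransformColumns which column to rewrite and what type to give the result.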

Text.Trim Syntax Meaning

I'm working in a Power BI query, trying to trim whitespace from text.
Looking at Microsoft's M reference, I came across the Text.Trim syntax:
Text.Trim(text as nullable text, optional trimChars as any) as nullable text
I couldn't figure out how to plug it into my query code correctly (where it would actually work) so I did some more searching and came across this:
#"Trimmed Text" = Table.TransformColumns(#"Removed Other Columns",{},Text.Trim),
...which doesn't look anything like Microsoft's syntax but works fine for me as I insert it into my code like this:
let
Source = Banding,
#"Removed Other Columns" = Table.SelectColumns(Source,{"Segment", "Granular Band"}),
#"Trimmed Text" = Table.TransformColumns(#"Removed Other Columns",{},Text.Trim),
#"Removed Duplicates" = Table.Distinct(#"Trimmed Text", {"Granular Band"})
in
#"Removed Duplicates"
My problem is that I don't understand what the line's syntax meaning is. I understood Microsoft's example as meaning basically, trim THIS; where THIS is the text I want trimmed. Pretty straightforward.
But I don't know what the syntax meaning of the line that actually works is. I understand that it says to transform table columns (Table.TransformColumns); but I don't know if the reference to the previous line (#"Removed Other Columns") serves as any real "input" for the Text.Trim or if {} is a reference to all columns in the table, or (more importantly to me) how I would reference specific columns. (I've tried a few approaches for specifying columns and failed every time.) I also don't understand why I don't need any arguments following Text.Trim (like in Microsoft's example).
If someone would translate what the line is "saying" in a manner I can understand, I'd sure appreciate it.
The generated code:
#"Trimmed Text" = Table.TransformColumns(#"Removed Other Columns",{},Text.Trim),
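means that you most probably selected all columns and then chose Transform > Format > Trim. The empty list {} names no specific columns, and the third argument, Text.Trim, is the default transformation that is applied to every column not listed in the second argument.
If you had selected one or more specific columns, the names of those columns and the required operations would appear between the {}, as in: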
= Table.TransformColumns(#"Removed Other Columns",{{"SomeText", Text.Trim}})
Any function passed to Table.TransformColumns is called with each column value as its first argument; for Text.Trim that is the text parameter (text as nullable text).
The optional trimChars argument is left at its default, so all leading and trailing whitespace is removed.
If you want to pass other arguments to Text.Trim within Table.TransformColumns, you need to write the function with the keyword "each" and use _ as the placeholder for the column value.
For example the next code removes leading and trailing spaces and semicolons:
= Table.TransformColumns(#"Removed Other Columns",{{"SomeText", each Text.Trim(_, {" ",";"})}})
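Putting the three forms side by side may make the syntax easier to read. This is only a sketch: the sample row is invented, and the column names Segment and Granular Band are borrowed from the question.

```powerquery
let
    // Hypothetical one-row table with untidy text in both columns
    Source = Table.FromRecords({
        [Segment = "  A  ", #"Granular Band" = " 1-5; "]
    }),
    // 1. No columns listed; Text.Trim is the default applied to every column
    TrimAll = Table.TransformColumns(Source, {}, Text.Trim),
    // 2. Only the listed column is trimmed; the others are left untouched
    TrimOne = Table.TransformColumns(Source, {{"Segment", Text.Trim, type text}}),
    // 3. A custom trim for one column, plus the default trim for the rest
    Mixed = Table.TransformColumns(
        Source,
        {{"Granular Band", each Text.Trim(_, {" ", ";"}), type text}},
        Text.Trim
    )
in
    Mixed
    // Segment becomes "A" and Granular Band becomes "1-5"
```

In each case the first argument (Source here, #"Removed Other Columns" in the question) is the table being transformed, which is why the previous step name appears there.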