I need a custom function that takes two parameters, Column1 and Column2, so:
For each Row, return the value of Column1 but only if exists a Value in the Column2 else return null
I have tried this:
let ColumnsFilter = (Tabla,C1,C2)=>
Table.AddColumn(Tabla, "Custom", each if [C2] <> null then [C1] else null)
in
ColumnsFilter
And calling the function:
#"Previous Step" = .....
#"P" = ColumnsFilter(#"Previous Step","Column1","Column2")
in
P
And is not working. clearly I am not using the syntax properly.
In summary I need a table as input and a table as output adding custom columns.
How can I write this?
(Please don't tell me to use the assisted of Power Query, I need to write similar functions manually)
Since you're passing column names as text and individual rows are a record type, you have to use Record.Field to pull the right column (field) from the current row (record).
let
ColumnsFilter = (Tabla as table, C1 as text, C2 as text) as table =>
Table.AddColumn(Tabla, "Custom",
each if Record.Field(_, C2) <> null then Record.Field(_, C1) else null
)
in
ColumnsFilter
Related
I have columns in bigquery like this:
expected output:
I am trying to merge columns into json using bigquery.
I am taking letter before underscore(common name ) as column then converting.
I am trying this query:
with selectdata as (
SELECT a_firstname, a_middlename,a_lastname FROM `account_id.Dataset.Table_name`
)
select TO_JSON_STRING(t) AS json_data FROM selectdata AS t;
How can I join columns with condition or with case to achieve this output in bigquery
Consider below approach
create temp function extract_keys(input string) returns array<string> language js as """
return Object.keys(JSON.parse(input));
""";
create temp function extract_values(input string) returns array<string> language js as """
return Object.values(JSON.parse(input));
""";
select * except(row_id) from (
select format('%t',t) row_id,
split(key, '_')[offset(0)] as col,
'{' || string_agg(format('"%s":"%s"', split(key, '_')[safe_offset(1)], value)) || '}' as value
from your_table t, unnest(extract_keys(to_json_string(t))) key with offset
join unnest(extract_values(to_json_string(t))) value with offset
using(offset)
group by row_id, col
)
pivot (any_value(value) for col in ('a','b','c'))
if applied to sample data in your question - output is
I´ve been struggling with this:
My table shows 3 records but when expanding there are like 100 columns. I used this code:
#"Expanded Data" = Table.ExpandTableColumn(#"Source", "Document", List.Union(List.Transform(#"Source"[Document]), each Table.ColumnNames(_))),
but it's not working. How can I expand simultaneously all columns? Also, inside those columns there are even more, for example I expand the first time end then those new columns have more records inside.
What could I do? Thanks in advance!
Try this ExpandAllRecords function - it recursively expands every Record-type column:
https://gist.github.com/Mike-Honey/0a252edf66c3c486b69b
This should work for Records Columns.
let
ExpandIt = (TableToExpand as table, optional ColumnName as text) =>
let
ListAllColumns = Table.ColumnNames(TableToExpand),
ColumnsTotal = Table.ColumnCount(TableToExpand),
CurrentColumnIndex = if (ColumnName = null) then 0 else List.PositionOf(ListAllColumns, ColumnName),
CurrentColumnName = ListAllColumns{CurrentColumnIndex},
CurrentColumnContent = Table.Column(TableToExpand, CurrentColumnName),
IsExpandable = if List.IsEmpty(List.Distinct(List.Select(CurrentColumnContent, each _ is record))) then false else true,
FieldsToExpand = if IsExpandable then Record.FieldNames(List.First(List.Select(CurrentColumnContent, each _ is record))) else {},
ColumnNewNames = List.Transform(FieldsToExpand, each CurrentColumnName &"."& _),
ExpandedTable = if IsExpandable then Table.ExpandRecordColumn(TableToExpand, CurrentColumnName, FieldsToExpand, ColumnNewNames) else TableToExpand,
NextColumnIndex = CurrentColumnIndex+1,
NextColumnName = ListAllColumns{NextColumnIndex},
OutputTable = if NextColumnIndex > ColumnsTotal-1 then ExpandedTable else #fx_ExpandIt(ExpandedTable, NextColumnName)
in
OutputTable
in
ExpandIt
This basically takes Table to Transform as the main argument,and then one by one checks if the Column Record is expandable (if column has "records" in it, it will expand it, otherwise move to next column and checks it again).
Then it returns the Output table once everything is expanded.
This function is calling the function from inside for each iteration.
I have a table of Project:
that I would like to filter by the FIELD, OPERATOR, and VALUE columns contained in the Project Group table:
The Power Query M to apply this filter would be:
let
Source = #"Project",
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Projectid", Int64.Type}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Projectid] >= 100000 and [Projectid] <= 500000)
in
#"Filtered Rows"
Results (need to remove the error row):
How do I convert the FIELD, OPERATOR, and VALUE columns into a function that can be used as a condition for the SelectRows function?
If you need to do comparisons, might be best to first change the types of the columns (in both tables) that are being compared. Preferably to type number.
The code below assumes that:
the OPERATOR column of Project Group table can only contain: > or < and that these values should be interpreted as >= and <= respectively.
the column in Project table (that needs to be compared) can change and its name will be in the FIELD column of the Project Group. It's assumed that the name matches exactly. If this is not the case, you might need to standardise things (or at least perform a case-insensitive search) to ensure values can be mapped to column names correctly.
Based on the assumptions above, here's one approach:
let
// Dummy table for example purposes
project = Table.FromColumns({
{0..10},
{5..15}
}, type table [projectId = number, name = number]),
// Dummy table for example purposes
projectGroup = Table.FromColumns({
{"projectId", "projectId"},
{">", "<"},
{5, 7}
}, type table [FIELD = text, OPERATOR = text, VALUE = number]),
// Should take in a row from "Project" table and return a boolean
// representing whether said row matches the criteria contained
// within "Project Group" table.
selectorFunc = (projectRow as record) as logical =>
let
shouldKeepProjectRow = Table.MatchesAllRows(projectGroup, (projectGroupRow as record) =>
let
fieldNameToCheck = projectGroupRow[FIELD],
valueFromProjectRow = Record.Field(projectRow, fieldNameToCheck),
compared = if projectGroupRow[OPERATOR] = ">" then
valueFromProjectRow >= projectGroupRow[VALUE]
else
valueFromProjectRow <= projectGroupRow[VALUE]
in compared
)
in shouldKeepProjectRow,
selectedRows = Table.SelectRows(project, selectorFunc)
in
selectedRows
The main function used is Table.MatchesAllRows (https://learn.microsoft.com/en-us/powerquery-m/table-matchesallrows).
Another approach could potentially be: Expression.Evaluate: https://learn.microsoft.com/en-us/powerquery-m/expression-evaluate. However, I've not used it, so I'm not sure whether there are any "gotchas"/implications to be aware of.
There is a scenario where I receive a string to the bigquery function and need to use it as a column name.
here is the function
CREATE OR REPLACE FUNCTION METADATA.GET_VALUE(column STRING, row_number int64) AS (
(SELECT column from WORK.temp WHERE rownumber = row_number)
);
When I call this function as select METADATA.GET_VALUE("TXCAMP10",149); I get the value as TXCAMP10 so we can say that it is processed as SELECT "TXCAMP10" from WORK.temp WHERE rownumber = 149 but I need it as SELECT TXCAMP10 from WORK.temp WHERE rownumber = 149 which will return some value from temp table lets suppose the value as A
so ultimately I need value A instead of column name i.e. TXCAMP10.
I tried using execute immediate like execute immediate("SELECT" || column || "from WORK.temp WHERE rownumber =" ||row_number) from this stack overflow post to resolve this issue but turns out I can't use it in a function.
How do I achieve required result?
I don't think you can achieve this result with the help of UDF in standard SQL in BigQuery.
But it is possible to do this with stored procedures in BigQuery and EXECUTE IMMEDIATE statement. Consider this code, which simulates the situation you have:
create or replace table d1.temp(
c1 int64,
c2 int64
);
insert into d1.temp values (1, 1), (2, 2);
create or replace procedure d1.GET_VALUE(column STRING, row_number int64, out result int64)
BEGIN
EXECUTE IMMEDIATE 'SELECT ' || column || ' from d1.temp where c2 = ?' into result using row_number;
END;
BEGIN
DECLARE result_c1 INT64;
call d1.GET_VALUE("c1", 1, result_c1);
select result_c1;
END;
After some research and trial-error methods, I used this workaround to solve this issue. It may not be the best solution when you have too many columns but it surely works.
CREATE OR REPLACE FUNCTION METADATA.GET_VALUE(column STRING, row_number int64) AS (
(SELECT case
when column_name = 'a' then a
when column_name = 'b' then b
when column_name = 'c' then c
when column_name = 'd' then d
when column_name = 'e' then e
end from WORK.temp WHERE rownumber = row_number)
);
And this gives the required results.
Point to note: the number of columns you use in the case statement should be of the same datatype else it won't work
I need to access the previous record of the DTH_REFER_PEDID column to make the IF comparison (DTH_REFER_PEDID-1 <> "A").
That is, I'm reading the index X, I need to compare with the index X-1
Addition_Stats = VAR Atendido_OV = PR_HIST_MOVIM_PEDID[OVITEM_Hist]
VAR linha_anterior2 = CALCULATE(values(PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL]);filter(PR_HIST_MOVIM_PEDID;EARLIER(PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID])))
Return
if(PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Month]<PR_HIST_MOVIM_PEDID[DAT_MAIOR_PLANE].[Month];"Atraso mês ant";
if(PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL] = "A" && PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Day]<=PR_HIST_MOVIM_PEDID[DAT_MAIOR_PLANE].[Day];"Atendido no Prazo";
if((PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL]="P"||PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL]="L") && PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Day]<= PR_HIST_MOVIM_PEDID[DAT_MAIOR_PLANE].[Day];"Planejado no prazo";
if(PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL]<>"A" && PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Day]>PR_HIST_MOVIM_PEDID[DAT_MAIOR_PLANE].[Day];"Em atraso";
if(PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL] = "A"
&& linha_anterior2 <>"A"
&& PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Day]>PR_HIST_MOVIM_PEDID[DAT_MAIOR_PLANE].[Day];"Atend fora Prazo"
;IF((PR_HIST_MOVIM_PEDID[OVITEM_Hist]=Atendido_OV)&&(PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID]>FIRSTDATE(PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Date]));"A retido";"NA")
)
)
)
)
)
//)
The error displayed is: A circular dependency has been detected: PR_HIST_MOVIM_PEDID [Addition_Stats].
How do I compare DTH_REFER_PEDID-1 <> "A"?
An easy way to work with previous or next records is:
Make sure your data is in a table with a primary key (=ID)
Make a query with all the fields as in your table and add one colum with ID+1. (or ID-1)
Make another query with the table and the query mentioned above and make a join between ID and ID+1 (or ID-1). Place all the fields of the table and the 1st query and you end up with all the values in 1 record. This way you can work with the previous or next values.