Power Query - Append Variable Dates to Existing Table as Headers


I am having an issue with my M code below. I have set up a simple example of what I am trying to do, and I know the issue occurs when I try to append variable dates as headers to the table:
let
    // Set up steps.
    // ...
    #"Appended Query" = Table.Combine({
        #"Removed Other Columns",
        Table.PromoteHeaders(
            Table.Transpose(
                Table.FromList(
                    List.Dates(Date.StartOfWeek(DateTime.Date(DateTime.LocalNow()), Day.Sunday), 52, #duration(7,0,0,0)),
                    Splitter.SplitByNothing(), {"Dates"}, null, ExtraValues.Error
                )
            ),
            [PromoteAllScalars=true]
        )
    })
in
    #"Appended Query"
Error:
Expression.Error: 2 arguments were passed to a function which expects 1.
Details:
Pattern=
Arguments=List
The table starts out like Table A. I use some simple query functions to get to this point. But I need to add the date columns and apply the calculation (Table A: Total Qty / 50) to every added date column, as in Table B.
Table A
Product | Table A: Total Qty
a       | 323000
b       | 898807
c       | 844945
d       | 36330
e       | 281009
f       | 611092
g       | 633217
Table B
Product | Table A: Total Qty | 6/4/2017 | 6/11/2017 | 6/18/2017 | 6/25/2017 | 7/2/2017 | 7/9/2017
a       | 323,000            | 6,460    | 6,460     | 6,460     | 6,460     | 6,460    | 6,460
b       | 898,807            | 17,976   | 17,976    | 17,976    | 17,976    | 17,976   | 17,976
c       | 844,945            | 16,899   | 16,899    | 16,899    | 16,899    | 16,899   | 16,899
d       | 36,330             | 727      | 727       | 727       | 727       | 727      | 727
e       | 281,009            | 5,620    | 5,620     | 5,620     | 5,620     | 5,620    | 5,620
f       | 611,092            | 12,222   | 12,222    | 12,222    | 12,222    | 12,222   | 12,222
g       | 633,217            | 12,664   | 12,664    | 12,664    | 12,664    | 12,664   | 12,664
Hope you can help! Thanks in advance!
Alex
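
For reference, the intended Table B can be sketched outside Power Query. The Python below is only an illustration of the target result, not a fix for the M error: the function names are hypothetical, and rounding Qty / 50 to whole numbers is an assumption based on the values shown in Table B.

```python
from datetime import date, timedelta

def week_start_sunday(d: date) -> date:
    """Return the Sunday on or before d (similar in spirit to Date.StartOfWeek with Day.Sunday)."""
    return d - timedelta(days=(d.weekday() + 1) % 7)

def add_weekly_columns(rows, start, weeks, divisor=50):
    """Append one column per week, each holding qty / divisor rounded to a whole number (assumed)."""
    labels = []
    for i in range(weeks):
        day = start + timedelta(weeks=i)
        labels.append(f"{day.month}/{day.day}/{day.year}")
    table = []
    for product, qty in rows:
        row = {"Product": product, "Table A: Total Qty": qty}
        for label in labels:
            row[label] = round(qty / divisor)
        table.append(row)
    return table

table_a = [("a", 323000), ("b", 898807), ("c", 844945), ("d", 36330),
           ("e", 281009), ("f", 611092), ("g", 633217)]

# A fixed date is used here instead of "today" so the headers match Table B.
start = week_start_sunday(date(2017, 6, 7))
table_b = add_weekly_columns(table_a, start, weeks=6)
```

In the real query the start date would come from the current date and the number of weeks would be 52, as in the M code above.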

Related

Struct Array in Bigquery with nested columns

I have two tables called source and base. The source table has a bunch of ids and all combinations of weekly dates. The base table has ids, their tagged devices, and the device start and end dates.
Example source table:
id    | com_date
acc_1 | 11/25/2022
acc_1 | 11/18/2022
acc_1 | 11/11/2022
acc_2 | 11/25/2022
acc_3 | 11/25/2022
acc_3 | 11/25/2022
Example base table:
id    | device_id | start_date | end_date
acc_1 | d1        | 11/24/2022 | 12/31/2999
acc_1 | d2        | 11/19/2022 | 12/31/2999
acc_1 | d3        | 11/12/2022 | 11/28/2022
acc_2 | d4        | 11/20/2022 | 11/26/2022
acc_3 | d5        | 11/17/2022 | 11/24/2022
acc_3 | d6        | 11/10/2022 | 12/31/2999
I would like my final table to look something like this with nested columns -
The count column should be the number of distinct devices applicable to that com_date, and each com_date should lie between start_date and end_date.

You might consider the query below.
(I tested it after changing the last com_date in source_table to 11/18/2022.)
SELECT s.id, s.com_date AS dates,
       COUNT(DISTINCT device_id) count,
       ARRAY_AGG(STRUCT(b.device_id, b.start_date AS to_date, b.end_date AS from_date)) d
  FROM source_table s JOIN base_table b
    ON s.id = b.id
   AND PARSE_DATE('%m/%d/%Y', com_date) BETWEEN PARSE_DATE('%m/%d/%Y', start_date) AND PARSE_DATE('%m/%d/%Y', end_date)
 GROUP BY 1, 2;
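
To see what the join and aggregation should produce, here is a minimal Python sketch of the same logic on the sample rows from the question (with the last com_date changed to 11/18/2022, as in the test above). The function name and the dictionary layout are hypothetical; it only mimics the inner join, the BETWEEN filter and the grouping, not BigQuery itself.

```python
from collections import defaultdict
from datetime import datetime

def parse(text):
    """Parse an mm/dd/yyyy string into a date."""
    return datetime.strptime(text, "%m/%d/%Y").date()

source = [("acc_1", "11/25/2022"), ("acc_1", "11/18/2022"), ("acc_1", "11/11/2022"),
          ("acc_2", "11/25/2022"), ("acc_3", "11/25/2022"), ("acc_3", "11/18/2022")]

base = [("acc_1", "d1", "11/24/2022", "12/31/2999"),
        ("acc_1", "d2", "11/19/2022", "12/31/2999"),
        ("acc_1", "d3", "11/12/2022", "11/28/2022"),
        ("acc_2", "d4", "11/20/2022", "11/26/2022"),
        ("acc_3", "d5", "11/17/2022", "11/24/2022"),
        ("acc_3", "d6", "11/10/2022", "12/31/2999")]

def devices_by_date(source_rows, base_rows):
    """Group matching devices per (id, com_date); unmatched pairs are dropped, like an inner join."""
    grouped = defaultdict(list)
    for sid, com_date in source_rows:
        for bid, device_id, start, end in base_rows:
            if sid == bid and parse(start) <= parse(com_date) <= parse(end):
                grouped[(sid, com_date)].append(
                    {"device_id": device_id, "start_date": start, "end_date": end})
    return {key: {"count": len({d["device_id"] for d in devices}), "d": devices}
            for key, devices in grouped.items()}

result = devices_by_date(source, base)
```

Note that (acc_1, 11/11/2022) has no matching device, so it does not appear in the result at all.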

Redshift generate rows as many as value in another column

df
customer_code contract_code product num_products
C0134 AB01245 toy_1 4
B8328 EF28421 doll_4 2
I would like to transform this table based on the integer value in column num_products and generate a unique id for each row:
Expected_df
unique_id customer_code contract_code product num_products
A1 C0134 AB01245 toy_1 1
A2 C0134 AB01245 toy_1 1
A3 C0134 AB01245 toy_1 1
A4 C0134 AB01245 toy_1 1
A5 B8328 EF28421 doll_4 1
A6 B8328 EF28421 doll_4 1
unique_id can be any random characters, as long as I can use count(distinct) on it later on.
I read that generate_series(1,10000) i is available in later versions of Postgres but not in Redshift.

You need to use a recursive CTE to generate the series of numbers, then join it with your data to produce the extra rows. I used row_number() to get the unique_id in the example below.
This should meet your needs, or at least give you a start:
create table df (customer_code varchar(16),
                 contract_code varchar(16),
                 product varchar(16),
                 num_products int);

insert into df values
('C0134', 'AB01245', 'toy_1', 4),
('B8328', 'EF28421', 'doll_4', 2);

with recursive nums (n) as
( select 1 as n
  union all
  select n+1 as n
  from nums
  where n < (select max(num_products) from df) )
select row_number() over() as unique_id, customer_code, contract_code, product, num_products
from df d
left join nums n
  on d.num_products >= n.n;
SQLfiddle at http://sqlfiddle.com/#!15/d829b/12
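
The same idea can be tried locally with Python's built-in sqlite3 module, which also supports recursive CTEs. This is only a sketch under a few assumptions: it runs on SQLite rather than Redshift, it emits 1 as num_products to match the expected output, and it builds the unique_id in Python with enumerate instead of row_number().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table df (customer_code text, contract_code text, product text, num_products int);
    insert into df values
        ('C0134', 'AB01245', 'toy_1', 4),
        ('B8328', 'EF28421', 'doll_4', 2);
""")

# The recursive CTE produces 1..max(num_products); the join repeats each row num_products times.
rows = conn.execute("""
    with recursive nums (n) as (
        select 1
        union all
        select n + 1 from nums
        where n < (select max(num_products) from df)
    )
    select d.customer_code, d.contract_code, d.product, 1 as num_products
    from df d
    join nums on nums.n <= d.num_products
    order by d.rowid, nums.n
""").fetchall()

# Prefix each expanded row with a unique id: A1, A2, ...
expanded = [(f"A{i}",) + row for i, row in enumerate(rows, start=1)]
```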

Saving DAX Measure to table (ABC/Pareto analysis)

I am currently working on an abc/pareto analysis concerning customer IDs.
What I want to calculate is something like this:
ID | Sales / ID  | Cum. Sales  | % of total | Category
G  | 15.000,00 € | 15.000,00 € | 21,45%     | A
D  | 5.700,00 €  | 20.700,00 € | 29,60%     | A
H  | 4.000,00 €  | 24.700,00 € | 35,32%     | A
Q  | 3.800,00 €  | 28.500,00 € | 40,75%     | A
O  | 3.650,00 €  | 32.150,00 € | 45,97%     | A
X  | 3.500,00 €  | 35.650,00 € | 50,97%     | B
I  | 3.350,00 €  | 39.000,00 € | 55,76%     | B
Ü  | 3.200,00 €  | 42.200,00 € | 60,34%     | B
Ö  | 3.050,00 €  | 45.250,00 € | 64,70%     | B
N  | 2.900,00 €  | 48.150,00 € | 68,84%     | B
J  | 2.750,00 €  | 50.900,00 € | 72,78%     | C
Ä  | 2.600,00 €  | 53.500,00 € | 76,49%     | C
Z  | 2.450,00 €  | 55.950,00 € | 80,00%     | C
Y  | 2.300,00 €  | 58.250,00 € | 83,29%     | C
L  | 2.150,00 €  | 60.400,00 € | 86,36%     | D
P  | 2.000,00 €  | 62.400,00 € | 89,22%     | D
W  | 1.765,00 €  | 64.165,00 € | 91,74%     | D
R  | 1.530,00 €  | 65.695,00 € | 93,93%     | D
F  | 1.295,00 €  | 66.990,00 € | 95,78%     | E
V  | 1.060,00 €  | 68.050,00 € | 97,30%     | E
B  | 825,00 €    | 68.875,00 € | 98,48%     | E
T  | 590,00 €    | 69.465,00 € | 99,32%     | E
M  | 355,00 €    | 69.820,00 € | 99,83%     | E
C  | 120,00 €    | 69.940,00 € | 100,00%    | E
This way I can say that "A-customers" make up 50% of my total profit.
I used this tutorial to create my measures:
https://www.youtube.com/watch?v=rlUBO5qoKow
total_sales = SUM(fact_table[sales])
cumulative sales =
VAR MYSALES = [total_sales]
RETURN
    SUMX(
        FILTER(
            SUMMARIZE(ALLSELECTED(fact_table); fact_table[CustomerID];
                "table_sales"; [total_sales]);
            [total_sales] >= MYSALES);
        [table_sales])
Since I am calculating the cumulative sales for more than 1,000 unique customer IDs, the calculation takes ages!
Is there a way to save this calculation in a new table so that I only have to calculate it once?
Or does anyone know a measure that does the same but is less computationally expensive?
Any help is much appreciated!

You could calculate it once as a calculated column, but then ALLSELECTED wouldn't act as you expect, since calculated columns cannot respond to report filters or slicers.
There are some inefficiencies in your measure though. It looks like you are calculating [total_sales] twice, once inside SUMMARIZE and again for the FILTER.
I haven't tested this measure, but it may be faster as follows:
cumulative sales =
VAR MYSALES = [total_sales]
RETURN
    SUMX (
        FILTER (
            SUMMARIZECOLUMNS (
                fact_table[CustomerID];
                ALLSELECTED ( fact_table );
                "table_sales"; [total_sales]
            );
            [table_sales] >= MYSALES
        );
        [table_sales]
    )
The important part is reusing [table_sales] in the FILTER but SUMMARIZECOLUMNS might be a bit better too.
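
To make the cost of the measure concrete, here is a small Python sketch (not DAX, and the function names are hypothetical). The first function follows the measure's definition: for each customer it scans all customers and sums those with sales at least as large, which is quadratic work. The second gets the same numbers from one descending sort and a running total, which is roughly what precomputing the result once would amount to; it assumes no two customers have identical sales.

```python
def cumulative_by_definition(sales):
    """For each ID, sum the sales of every ID whose sales are >= its own (quadratic)."""
    return {cid: sum(v for v in sales.values() if v >= own)
            for cid, own in sales.items()}

def cumulative_by_running_sum(sales):
    """Sort once in descending order and keep a running total (assumes no tied values)."""
    result, running = {}, 0
    for cid, value in sorted(sales.items(), key=lambda kv: kv[1], reverse=True):
        running += value
        result[cid] = running
    return result

# First five customers from the table in the question.
sales = {"G": 15000, "D": 5700, "H": 4000, "Q": 3800, "O": 3650}
by_definition = cumulative_by_definition(sales)
by_running_sum = cumulative_by_running_sum(sales)
```

With tied sales values the two differ: the definition gives tied customers the same cumulative value, while the running total does not.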

Conditional calculation based on another column

I have a cross-reference table and another table with the list of "Items".
I connect "PKG" to "Item", as "PKG" has distinct values.
Example:
**Cross table**      **Item table**
Bulk | PKG           Item | Value
A    | D             A    | 2
A    | E             B    | 1
B    | F             C    | 4
C    | G             D    | 5
                     E    | 8
                     F    | 3
                     G    | 1
After connecting the two tables above by PKG and Item, I get the following result:
Item | Value | Bulk | PKG
A    | 2     |      |
B    | 1     |      |
C    | 4     |      |
D    | 5     | A    | D
E    | 8     | A    | E
F    | 3     | B    | F
G    | 1     | C    | G
As you can see, nothing shows up for the first three items, since the tables are connected by PKG and those items are "Bulk" values.
I am trying to create a new column that uses the cross-reference table.
I want to create the following with a new column:
Item | Value | Bulk | PKG | NEW COLUMN
A    | 2     |      |     | 5
B    | 1     |      |     | 3
C    | 4     |      |     | 1
D    | 5     | A    | D   | 5.75
E    | 8     | A    | E   | 9.2
F    | 3     | B    | F   | 3.45
G    | 1     | C    | G   | 1.15
The new column is what I am trying to create.
I want the original values to show up for bulk as they appear for pkg. I then want the PKG items to be 15% higher than the original value.
How can I calculate this based on this setup?

Just write a conditional custom column in the query editor:
New Column = if [Bulk] = null then [Value] else 1.15 * [Value]
You can also do this as a DAX calculated column:
New Column = IF( ISBLANK( Table1[Bulk] ), Table1[Value], 1.15 * Table1[Value] )
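
As a quick check of that rule, the same conditional can be sketched in Python on the joined rows from the question. The function name is hypothetical, and rounding to two decimals is an assumption made only to avoid floating-point noise. Rows without a Bulk value simply keep their own Value under this rule.

```python
def new_column(value, bulk):
    """Return value unchanged when bulk is missing, otherwise value marked up by 15%."""
    if bulk is None:
        return value
    return round(1.15 * value, 2)  # rounding is an assumption, not part of the original rule

# (Item, Value, Bulk) rows from the joined table; None stands for a blank Bulk.
rows = [("A", 2, None), ("B", 1, None), ("C", 4, None),
        ("D", 5, "A"), ("E", 8, "A"), ("F", 3, "B"), ("G", 1, "C")]

result = {item: new_column(value, bulk) for item, value, bulk in rows}
```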

using subqueries in jpa criteria api

I'm studying the JPA Criteria API, and my database contains an Employee table.
I am trying to find all the employees who are paid the second-highest salary. I was able to write the JPQL successfully, as follows.
SELECT e FROM Employee e WHERE e.salary = (SELECT MAX(emp.salary) FROM Employee emp WHERE emp.salary < (SELECT MAX(employee.salary) FROM Employee employee) )
But now I am trying to convert it to the Criteria API and have tried the following.
CriteriaQuery<Employee> c = cb.createQuery(Employee.class);
Root<Employee> e1 = c.from(Employee.class);
c.select(e1);
Subquery<Number> sq = c.subquery(Number.class);
Root<Employee> e2 = sq.from(Employee.class);
sq.select(cb.max(e2.<Number> get("salary")));
Subquery<Number> sq1 = sq.subquery(Number.class);
Root<Employee> e3 = sq1.from(Employee.class);
sq1.select(cb.max(e3.<Number> get("salary")));
c.where(cb.lessThan(e2.<Number>get("salary"), e3.<Number>get("salary")));// error here
c.where(cb.equal(e1.get("salary"), sq));
I get an error that the parameters are not compatible with the lessThan method. I do not understand how to get this query to work. Is my approach right?
EDIT: Updating the question after Mikko's answer.
The JPQL above produces the following results, which are the employees with the second-highest salary.
Harish Taware salary 4000000.0
Nilesh Deshmukh salary 4000000.0
Deodatta Chousalkar salary 4000000.0
Deodatta Chousalkar salary 4000000.0
But the updated criteria query is as below:
CriteriaQuery<Employee> c = cb.createQuery(Employee.class);
Root<Employee> e1 = c.from(Employee.class);
c.select(e1);
Subquery<Long> sq = c.subquery(Long.class);
Root<Employee> e2 = sq.from(Employee.class);
sq.select(cb.max(e2.<Long> get("salary")));
Subquery<Long> sq1 = sq.subquery(Long.class);
Root<Employee> e3 = sq1.from(Employee.class);
sq1.select(cb.max(e3.<Long> get("salary")));
c.where(cb.lessThan(e2.<Long> get("salary"), e3.<Long> get("salary")));
c.where(cb.equal(e1.get("salary"), sq));
employees = em.createQuery(c).getResultList();
for (Employee employee : employees) {
System.out.println(employee.getName() + "salary"
+ employee.getSalary());
}
This returns the employee with the highest salary. The result is as below.
Pranil Gildasalary5555555.0
Please tell me where I am going wrong. An explanation would be deeply appreciated.

After some more trial and error, I was able to write the query that selects the employees with the second-highest salary. I suggest writing the JPQL query first and then writing the Criteria API query to match. This is what I worked out from the JPQL.
SELECT e FROM Employee e
WHERE e.salary = (SELECT MAX(emp.salary) FROM Employee emp
WHERE emp.salary < (SELECT MAX(employee.salary) FROM Employee employee) )
Now we can see that:
There are two subqueries, i.e. the subquery of the main query contains another subquery.
The identification variables e, emp and employee correspond to the main query, the subquery of the main query and the subquery of the subquery, respectively.
When the result of a subquery (the maximum salary) is compared with an employee salary, the identification variable of the enclosing query is used, e.g. WHERE e.salary = (SELECT MAX(emp.salary) FROM Employee emp)
Now let us convert this query to the Criteria API.
First, write the CriteriaQuery that corresponds to the outermost query, i.e. SELECT e FROM Employee e WHERE e.salary =
CriteriaQuery<Employee> c1 = cb.createQuery(Employee.class);
Root<Employee> e3 = c1.from(Employee.class);
c1.select(e3);
Let us leave the WHERE e.salary = for now and move on to the subquery.
This should be a subquery that selects the maximum salary of employees, i.e. SELECT MAX(emp.salary) FROM Employee emp WHERE emp.salary <
Again, let us leave the WHERE emp.salary < for now.
Subquery<Long> sq1 = c1.subquery(Long.class);
Root<Employee> e4 = sq1.from(Employee.class);
sq1.select(cb.max(e4.<Long> get("salary")));
Repeating this for the subquery of the above subquery:
Subquery<Long> sq2 = sq1.subquery(Long.class);
Root<Employee> e5 = sq2.from(Employee.class);
sq2.select(cb.max(e5.<Long> get("salary")));
Now we have written the subqueries, but the WHERE conditions still need to be applied. The where condition in the Criteria API corresponding to WHERE emp.salary < (SELECT MAX(employee.salary) FROM Employee employee) is as below.
sq1.where(cb.lessThan(e4.<Long> get("salary"), sq2));
Similarly, the where condition corresponding to WHERE e.salary = (SELECT MAX(emp.salary) FROM Employee emp ...) is as below.
c1.where(cb.equal(e3.<Long> get("salary"), sq1));
So the complete query that returns the employees with the second-highest salary can be written in the Criteria API as below.
CriteriaQuery<Employee> c1 = cb.createQuery(Employee.class);
Root<Employee> e3 = c1.from(Employee.class);
c1.select(e3);
Subquery<Long> sq1 = c1.subquery(Long.class);
Root<Employee> e4 = sq1.from(Employee.class);
sq1.select(cb.max(e4.<Long> get("salary")));
Subquery<Long> sq2 = sq1.subquery(Long.class);
Root<Employee> e5 = sq2.from(Employee.class);
sq2.select(cb.max(e5.<Long> get("salary")));
sq1.where(cb.lessThan(e4.<Long> get("salary"), sq2));
c1.where(cb.equal(e3.<Long> get("salary"), sq1));
employees = em.createQuery(c1).getResultList();
for (Employee employee : employees) {
System.out.println(employee.getName() + " " + employee.getSalary());
}
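
The nesting is easy to check in plain SQL. The sketch below uses Python's built-in sqlite3 module with a hypothetical employees table and made-up rows; it is not JPA, but the two statements mirror the structure discussed above. The second statement shows what remains if the inner salary restriction is lost, which is consistent with the updated query in the question returning the highest salary: each call to where on the same query replaces the previously set restriction, so only the last one took effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employees (name text, salary integer);
    insert into employees values
        ('top', 5555555),
        ('second_a', 4000000),
        ('second_b', 4000000),
        ('third', 3000000);
""")

# Outer query = e, first subquery = emp, nested subquery = employee (as in the JPQL).
second_highest = conn.execute("""
    select e.name, e.salary
    from employees e
    where e.salary = (
        select max(emp.salary) from employees emp
        where emp.salary < (select max(employee.salary) from employees employee)
    )
    order by e.name
""").fetchall()

# Without the inner restriction the subquery is just the overall maximum.
highest = conn.execute("""
    select e.name, e.salary
    from employees e
    where e.salary = (select max(emp.salary) from employees emp)
""").fetchall()
```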

As documented, it cannot work because Number is not Comparable:
<Y extends java.lang.Comparable<? super Y>> Predicate lessThan(Expression<? extends Y> x,
Expression<? extends Y> y)
For expressions of type Number there is the method CriteriaBuilder.lt, which takes such arguments:
c.where(cb.lt(e2.<Number>get("salary"), e3.<Number>get("salary")));
Another option is to change the type argument from Number to something more specific. If salary is a Long, the following should work:
Subquery<Long> sq = c.subquery(Long.class);
Root<Employee> e2 = sq.from(Employee.class);
sq.select(cb.max(e2.<Long> get("salary")));
Subquery<Long> sq1 = sq.subquery(Long.class);
Root<Employee> e3 = sq1.from(Employee.class);
sq1.select(cb.max(e3.<Long> get("salary")));
c.where(cb.lessThan(e2.<Long>get("salary"), e3.<Long>get("salary")));
c.where(cb.equal(e1.get("salary"), sq));
