SQLite with Python 2.7: full outer join with an attached database

I have 2 tables in 2 databases:
table call in history.db:
ROWID | ADDRESS | DATE
1 | +98765 | 1396771532
2 | +98765 | 1396771533
3 | +98765 | 1396771534
4 | +98765 | 1396771535
5 | +98765 | 1396771536
6 | +98765 | 1396771537
7 | +98765 | 1396771538
8 | +98765 | 1396771539
9 | +98765 | 1396771510
table info in voices.db:
ID | CALLID | PATH | CODE
1 | 2 | voice1.m4a | 12234
2 | 5 | voice2.m4a | 12234
3 | 1 | voice4.m4a | 89765
First, I did an attach:
conn = sqlite3.connect("history.db")
conn.row_factory = sqlite3.Row
cursor = conn.cursor()
cursor.execute("attach ? as voice", ("voices.db",))
conn.commit()
Then, I joined 2 tables:
cursor.execute("SELECT * FROM call c JOIN (SELECT PATH, CALLID, CODE FROM voice.info WHERE CODE = ?) f ON c.ROWID = f.CALLID ORDER BY c.DATE", ("12234",))
So, I got the following result:
ROWID | ADDRESS | DATE | PATH | CALLID | CODE
2 | +98765 | 1396771533 | voice1.m4a | 2 | 12234
5 | +98765 | 1396771536 | voice2.m4a | 5 | 12234
But, I need a full outer join to get something like:
ROWID | ADDRESS | DATE | PATH | CALLID | CODE
1 | +98765 | 1396771532 | NULL | NULL | NULL
2 | +98765 | 1396771533 | voice1.m4a | 2 | 12234
3 | +98765 | 1396771534 | NULL | NULL | NULL
4 | +98765 | 1396771535 | NULL | NULL | NULL
5 | +98765 | 1396771536 | voice2.m4a | 5 | 12234
6 | +98765 | 1396771537 | NULL | NULL | NULL
7 | +98765 | 1396771538 | NULL | NULL | NULL
8 | +98765 | 1396771539 | NULL | NULL | NULL
9 | +98765 | 1396771510 | NULL | NULL | NULL
I tried the workaround from another answer, but I got an error along the lines of "... UNION do not have same number result columns ...".
How can I get a full outer join?
I am using Python 2.7, and its SQLite reports: "RIGHT and FULL OUTER JOINs are not currently supported".

Your original query is more complicated than it needs to be; the subquery is unnecessary. Write it like this:
SELECT c.*, i.* FROM CALL c
JOIN voice.info i ON i.CODE = ? AND c.ROWID = i.CALLID
ORDER BY c.DATE;
Transforming it into a full outer join, as in the answer you linked, is now straightforward: take the LEFT JOIN in both directions and combine the two results with UNION:
SELECT c.*, i.* FROM CALL c
LEFT JOIN voice.info i ON c.ROWID = i.CALLID AND i.CODE = ?
UNION
SELECT c.*, i.* FROM voice.info i
LEFT JOIN CALL c ON c.ROWID = i.CALLID
WHERE i.CODE = ?;
The answer you linked uses UNION ALL, which keeps duplicates in the result set. I don't think you want that, so I prefer UNION, which removes duplicates (rows in which all the columns are equal).
Also, it is better to list the columns explicitly instead of using *; I didn't do that here for brevity.
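To see the UNION-based emulation run end to end, here is a minimal sketch using two in-memory databases in place of history.db and voices.db. The table layout follows the question, but the three sample calls, the "orphan.m4a" row with CALLID 99, and the helper names build_demo_connection and full_outer_join are made up for illustration. Columns are listed explicitly, and ORDER BY uses the column position because it applies to the whole compound query.

```python
import sqlite3


def build_demo_connection():
    """Create two in-memory databases standing in for history.db and voices.db."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS voice")
    conn.execute(
        "CREATE TABLE call (ROWID INTEGER PRIMARY KEY, ADDRESS TEXT, DATE INTEGER)"
    )
    conn.execute(
        "CREATE TABLE voice.info "
        "(ID INTEGER PRIMARY KEY, CALLID INTEGER, PATH TEXT, CODE TEXT)"
    )
    conn.executemany(
        "INSERT INTO call VALUES (?, ?, ?)",
        [
            (1, "+98765", 1396771532),
            (2, "+98765", 1396771533),
            (3, "+98765", 1396771534),
        ],
    )
    conn.executemany(
        "INSERT INTO voice.info VALUES (?, ?, ?, ?)",
        [
            (1, 2, "voice1.m4a", "12234"),
            (2, 99, "orphan.m4a", "12234"),  # hypothetical row with no matching call
            (3, 1, "voice4.m4a", "89765"),
        ],
    )
    return conn


def full_outer_join(conn, code):
    """Emulate a FULL OUTER JOIN with two LEFT JOINs combined by UNION."""
    sql = """
        SELECT c.ROWID, c.ADDRESS, c.DATE, i.PATH, i.CALLID, i.CODE
        FROM call c
        LEFT JOIN voice.info i ON c.ROWID = i.CALLID AND i.CODE = ?
        UNION
        SELECT c.ROWID, c.ADDRESS, c.DATE, i.PATH, i.CALLID, i.CODE
        FROM voice.info i
        LEFT JOIN call c ON c.ROWID = i.CALLID
        WHERE i.CODE = ?
        ORDER BY 3  -- third result column (DATE); SQLite sorts NULLs first
    """
    return conn.execute(sql, (code, code)).fetchall()


if __name__ == "__main__":
    for row in full_outer_join(build_demo_connection(), "12234"):
        print(row)
```

Every call appears once, with NULLs where no recording with the given code exists, and the recording without a matching call appears with NULLs in the call columns.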

Related

How to count direct reports with a DAX formula in Power BI?

Good day! I have a sample employee table like the one below. I need a DAX formula in Power BI to create a measure that counts the direct reports of each employee. For example, the direct report count of GL0001 is 2 (GL0001 is the line manager of GL0002 and GL0019, who both report to GL0001), the count for EMP-02023 is 3, and the count for GL0002 is 3. Please also help me create measures that count the employees with exactly one direct report and the employees with fewer than three direct reports.
| Employee ID | Line Manager ID | Layer (of Employee) | Layer (of Line Manager) |
|--------------|-------------------|----------------------|--------------------------|
| EMP-01980 | GL0003 | 4 | 3 |
| EMP-02023 | EMP-02015 | 6 | 5 |
| EMP-01636 | EMP-02015 | 6 | 5 |
| EMP-02138 | EMP-02162 | 6 | 5 |
| EMP-02145 | EMP-01980 | 5 | 4 |
| GL0023 | GL0022 | 5 | 4 |
| GL0001 | | 1 | 0 |
| GL0002 | GL0001 | 2 | 1 |
| GL0003 | GL0002 | 3 | 2 |
| GL0019 | GL0001 | 2 | 1 |
| GL0020 | GL0002 | 3 | 2 |
| GL0024 | GL0002 | 3 | 2 |
| EMP-01918 | EMP-00791 | 9 | 8 |
| EMP-01941 | EMP-00791 | 9 | 8 |
| EMP-02019 | EMP-02156 | 8 | 7 |
| EMP-02024 | EMP-02023 | 7 | 6 |
| EMP-02025 | EMP-02023 | 7 | 6 |
| EMP-03001 | EMP-02023 | 7 | 6 |
Your data doesn't include an Employee ID row for every Line Manager ID, which means the PATH calculation would not work.
I've assumed your data looks like this:
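| Employee ID | Line Manager ID |
|-------------|-----------------|
| 1000001 | |
| 1000002 | 1000001 |
| 1000003 | 1000002 |
| 1000004 | 1000003 |
| 1000005 | 1000004 |
| 1000006 | 1000005 |
| 1000007 | 1000006 |
| 1000008 | 1000007 |
| 1000009 | 1000006 |
| 1000010 | 1000003 |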
With calculated columns you can compute the path and the path length.
Path:
Path = PATH('Table'[Employee ID], 'Table'[Line Manager ID])
Path length:
Path Length = PATHLENGTH([Path])
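As a rough illustration of what the two calculated columns produce, the following Python sketch builds the same pipe-delimited path (top of the hierarchy first) and counts its items. It uses the employee and manager pairs assumed in this answer; treating 1000001 as the top of the hierarchy with no manager is an assumption, and the names MANAGER_OF, path and path_length are hypothetical helpers, not DAX.

```python
# Employee -> line manager pairs assumed in the answer.
# None marks the top of the hierarchy (an assumption).
MANAGER_OF = {
    1000001: None,
    1000002: 1000001,
    1000003: 1000002,
    1000004: 1000003,
    1000005: 1000004,
    1000006: 1000005,
    1000007: 1000006,
    1000008: 1000007,
    1000009: 1000006,
    1000010: 1000003,
}


def path(employee_id):
    """Return the chain of IDs from the top manager down to employee_id."""
    chain = []
    current = employee_id
    while current is not None:
        chain.append(current)
        # Raises KeyError if a manager has no row of its own, which mirrors
        # the problem pointed out for the original data.
        current = MANAGER_OF[current]
    return "|".join(str(item) for item in reversed(chain))


def path_length(path_text):
    """Count the IDs in a pipe-delimited path."""
    return len(path_text.split("|")) if path_text else 0


if __name__ == "__main__":
    for employee in sorted(MANAGER_OF):
        p = path(employee)
        print(employee, p, path_length(p))
```

The sketch does not guard against cycles in the hierarchy; it assumes every chain ends at an employee without a manager.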
Edit: In that case, you can count direct reports from the Line Manager ID column alone, using the calculated column below.
DAX calculated column:
CountDirectReport =
VAR EmpId = [Employee ID]
RETURN
    CALCULATE (
        COUNTROWS ( 'Table' ),
        FILTER ( 'Table', [Line Manager ID] = EmpId )
    )
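The counting logic can be checked outside Power BI with a short Python sketch over the question's sample rows. It counts how often each Employee ID appears as a Line Manager ID, then counts employees with exactly one and with fewer than three direct reports. The names ROWS, direct_report_counts and count_where are hypothetical, and reading "fewer than three" as including employees with zero reports is an assumption.

```python
from collections import Counter

# (Employee ID, Line Manager ID) pairs from the question's sample table.
ROWS = [
    ("EMP-01980", "GL0003"), ("EMP-02023", "EMP-02015"),
    ("EMP-01636", "EMP-02015"), ("EMP-02138", "EMP-02162"),
    ("EMP-02145", "EMP-01980"), ("GL0023", "GL0022"),
    ("GL0001", ""), ("GL0002", "GL0001"), ("GL0003", "GL0002"),
    ("GL0019", "GL0001"), ("GL0020", "GL0002"), ("GL0024", "GL0002"),
    ("EMP-01918", "EMP-00791"), ("EMP-01941", "EMP-00791"),
    ("EMP-02019", "EMP-02156"), ("EMP-02024", "EMP-02023"),
    ("EMP-02025", "EMP-02023"), ("EMP-03001", "EMP-02023"),
]


def direct_report_counts(rows):
    """Map each Employee ID to the number of rows naming it as line manager."""
    by_manager = Counter(manager for _, manager in rows if manager)
    # Counter returns 0 for employees who never appear as a manager.
    return {employee: by_manager[employee] for employee, _ in rows}


def count_where(counts, predicate):
    """Count the employees whose direct-report count satisfies predicate."""
    return sum(1 for n in counts.values() if predicate(n))


if __name__ == "__main__":
    counts = direct_report_counts(ROWS)
    print(counts["GL0001"], counts["EMP-02023"], counts["GL0002"])
    print("exactly one:", count_where(counts, lambda n: n == 1))
    print("fewer than three:", count_where(counts, lambda n: n < 3))
```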

Power BI: count distinct values in one column based on distinct values in another column

I have data with a group number and its set value. When the set value is the same across a batch, I don't want that batch to be counted; but if a batch has more than one set value, the DAX query should count it as 1.
My current data is like this
| group_no | values |
| ---------- | ---------------------- |
| H110201208 | 600 |
| H110201208 | 600 |
| H110201208 | 680 |
| H101201215 | 665 |
| H109201210 | 640 |
| H123201205 | 600 |
| H125201208 | 610 |
| H111201212 | 610 |
| H111201212 | 630 |
I want my output like this
| Group no | Grand Total |
| ---------- | ----------- |
| H101201215 | 1 |
| H109201210 | 1 |
| H110201208 | 3 |
| H111201212 | 2 |
| H123201205 | 1 |
| H125201208 | 1 |
I want to create another table like the one above in Power BI using DAX, so that I can plot graphs and percentages based on its output.
TABLE =
GROUPBY (
Groups, //SourceTable
Groups[ group_no ],
"GrandTotal", COUNTX ( CURRENTGROUP (), DISTINCTCOUNTNOBLANK ( Groups[ values ] ) )
)
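Before settling on a DAX expression, it helps to check which number the desired output actually contains. The Python sketch below groups the question's sample rows and reports both the row count and the distinct value count per group. The expected "Grand Total" of 3 for H110201208 matches the row count (600, 600, 680), while the distinct count would be 2. The names ROWS and summarize are hypothetical.

```python
from collections import defaultdict

# (group_no, values) rows from the question's sample data.
ROWS = [
    ("H110201208", 600),
    ("H110201208", 600),
    ("H110201208", 680),
    ("H101201215", 665),
    ("H109201210", 640),
    ("H123201205", 600),
    ("H125201208", 610),
    ("H111201212", 610),
    ("H111201212", 630),
]


def summarize(rows):
    """Return the row count and distinct value count for each group."""
    values_by_group = defaultdict(list)
    for group_no, value in rows:
        values_by_group[group_no].append(value)
    return {
        group_no: {"rows": len(values), "distinct": len(set(values))}
        for group_no, values in sorted(values_by_group.items())
    }


if __name__ == "__main__":
    for group_no, stats in summarize(ROWS).items():
        print(group_no, stats["rows"], stats["distinct"])
```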

PowerQuery - Fill missing data according to specific pattern

I am trying to clean data received from an Excel file and transform it using PowerQuery (in PowerBI) into a useable format.
Below a sample table, and what I am trying to do:
| Country | Type of location |
|--------- |------------------ |
| A | 1 |
| | 2 |
| | 3 |
| B | 1 |
| | 2 |
| | 3 |
| C | 1 |
| | 2 |
| | 3 |
As you can see, I have a list of location types for each country (always constant, always the same number per country, i.e. each country has 3 rows for 3 location types).
What I am trying to do is to see if there is a way to fill the empty cells in the "Country" column, with the appropriate Country name, which would give something like this:
| Country | Type of location |
|--------- |------------------ |
| A | 1 |
| A | 2 |
| A | 3 |
| B | 1 |
| B | 2 |
| B | 3 |
| C | 1 |
| C | 2 |
| C | 3 |
For now I thought about using a series of if/else if conditions, but as there are 100+ countries this doesn't seem like the right solution.
Is there any way to do this more efficiently?
As Murray mentions, the Table.FillDown function works well and is built into the GUI under the Transform tab in the query editor (the Fill button).
Note that it only fills down to replace nulls, so if you have empty strings instead of nulls in those rows, you'll need to do a replacement first. The button for that is just above the Fill button in the GUI. Alternatively, skip the GUI and use the M code that it generates:
= Table.ReplaceValue(#"Previous Step","",null,Replacer.ReplaceValue,{"Country"})
Yes, you can fill down, just as you can in Excel. I believe you will need to sort the data correctly first.
From the docs for Table.FillDown:
Table.FillDown(
Table.FromRecords({
[Place = 1, Name = "Bob"],
[Place = null, Name = "John"],
[Place = 2, Name = "Brad"],
[Place = 3, Name = "Mark"],
[Place = null, Name = "Tom"],
[Place = null, Name = "Adam"]
}),
{"Place"}
)
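The two steps described in these answers (turn empty strings into nulls, then fill down) can be sketched in plain Python to show the intended behavior. This is only an illustration of the logic, not Power Query code; the function name fill_down and the sample rows are made up.

```python
def fill_down(rows, column):
    """Replace empty or missing values in column with the last value seen above."""
    filled = []
    last_seen = None
    for row in rows:
        value = row.get(column)
        if value == "":
            value = None  # treat empty strings like nulls, as the replacement step does
        if value is None:
            value = last_seen
        else:
            last_seen = value
        new_row = dict(row)  # copy so the input rows are left unchanged
        new_row[column] = value
        filled.append(new_row)
    return filled


if __name__ == "__main__":
    sample = [
        {"Country": "A", "Type of location": 1},
        {"Country": "", "Type of location": 2},
        {"Country": None, "Type of location": 3},
        {"Country": "B", "Type of location": 1},
        {"Country": "", "Type of location": 2},
        {"Country": "", "Type of location": 3},
    ]
    for row in fill_down(sample, "Country"):
        print(row["Country"], row["Type of location"])
```

Because each blank takes the nearest non-blank value above it, the row order has to be correct before filling, which is why sorting first matters.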

Create a column to classify rows based on related tables (DAX, Power BI)

I have simplified the problem I need to solve. Let's suppose I have three tables. One contains data and specific codes that identify objects, let's say apples.
+-------------+------------+-----------+
| Data picked | Color code | Size code |
+-------------+------------+-----------+
| 1-8-2018 | 1 | 1 |
| 1-8-2018 | 1 | 3 |
| 1-8-2018 | 2 | 2 |
| 1-8-2018 | 2 | 3 |
| 1-8-2018 | 2 | 2 |
| 1-8-2018 | 3 | 3 |
| 1-8-2018 | 4 | 1 |
| 1-8-2018 | 4 | 1 |
| 1-8-2018 | 5 | 3 |
| 1-8-2018 | 6 | 1 |
| 1-8-2018 | 6 | 2 |
| 1-8-2018 | 6 | 2 |
+-------------+------------+-----------+
I also have two related helper tables that explain the codes (their relationships are inactive in the model due to ambiguity with other tables in the real case).
+-----------+--------+
| Size code | Size |
+-----------+--------+
| 1 | Small |
| 2 | Medium |
| 3 | Large |
+-----------+--------+
and
+------------+----------------+-------+
| Color code | Color specific | Color |
+------------+----------------+-------+
| 1 | Light green | Green |
| 2 | Green | Green |
| 3 | Semi green | Green |
| 4 | Red | Red |
| 5 | Dark | Red |
| 6 | Pink | Red |
+------------+----------------+-------+
Let's say that I want to create an extra column in the original table to determine which apples are class A and which are class B, given that medium green apples are class A and large red apples are class B; the others remain blank, as in the example below.
+-------------+------------+-----------+-------+
| Data picked | Color code | Size code | Class |
+-------------+------------+-----------+-------+
| 1-8-2018 | 1 | 1 | |
| 1-8-2018 | 1 | 3 | |
| 1-8-2018 | 2 | 2 | A |
| 1-8-2018 | 2 | 3 | |
| 1-8-2018 | 2 | 2 | A |
| 1-8-2018 | 3 | 3 | |
| 1-8-2018 | 4 | 1 | |
| 1-8-2018 | 4 | 1 | |
| 1-8-2018 | 5 | 3 | B |
| 1-8-2018 | 6 | 1 | |
| 1-8-2018 | 6 | 2 | |
| 1-8-2018 | 6 | 2 | |
+-------------+------------+-----------+-------+
What's the proper DAX to use, given that the relationships are initially inactive? Preferably it should be solvable without creating any additional columns in any table. I already tried code like:
CALCULATE (
"A" ;
FILTER ( 'Size Table' ; 'Size Table'[Size] = "Medium");
FILTER ( 'Color Table' ; 'Color Table'[Color] = "Green")
)
and many variations on the same principle.
Given that the relationships are inactive, I'd suggest using LOOKUPVALUE to match ID values on the other tables. You should be able to create a calculated column as follows:
Class =
VAR Size = LOOKUPVALUE('Size Table'[Size],
'Size Table'[Size code], 'Data Table'[Size code])
VAR Color = LOOKUPVALUE('Color Table'[Color],
'Color Table'[Color code], 'Data Table'[Color code])
RETURN SWITCH(TRUE(),
(Size = "Medium") && (Color = "Green"), "A",
(Size = "Large") && (Color = "Red"), "B", BLANK())
If your relationships are active, then you don't need the lookups:
Class = SWITCH(TRUE(),
(RELATED('Size Table'[Size]) = "Medium") &&
(RELATED('Color Table'[Color]) = "Green"),
"A",
(RELATED('Size Table'[Size]) = "Large") &&
(RELATED('Color Table'[Color]) = "Red"),
"B",
BLANK())
Or a bit more elegantly written (especially for more classes):
Class =
VAR SizeColor = RELATED('Size Table'[Size]) & " " & RELATED('Color Table'[Color])
RETURN SWITCH(TRUE(),
SizeColor = "Medium Green", "A",
SizeColor = "Large Red", "B",
BLANK())
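The lookup-then-switch logic can be tried out on the question's sample data with a small Python sketch. The dictionaries stand in for the size and color tables, and None plays the role of BLANK(). The names SIZE_BY_CODE, COLOR_BY_CODE, CLASS_BY_SIZE_COLOR and classify are hypothetical.

```python
# Stand-ins for 'Size Table' and 'Color Table' (code -> label).
SIZE_BY_CODE = {1: "Small", 2: "Medium", 3: "Large"}
COLOR_BY_CODE = {1: "Green", 2: "Green", 3: "Green", 4: "Red", 5: "Red", 6: "Red"}

# One entry per class; adding a class means adding a line here.
CLASS_BY_SIZE_COLOR = {
    ("Medium", "Green"): "A",
    ("Large", "Red"): "B",
}


def classify(color_code, size_code):
    """Look up size and color by code, then map the pair to a class or None."""
    key = (SIZE_BY_CODE.get(size_code), COLOR_BY_CODE.get(color_code))
    return CLASS_BY_SIZE_COLOR.get(key)


if __name__ == "__main__":
    # (Color code, Size code) pairs in the order of the question's data table.
    data = [(1, 1), (1, 3), (2, 2), (2, 3), (2, 2), (3, 3),
            (4, 1), (4, 1), (5, 3), (6, 1), (6, 2), (6, 2)]
    for color_code, size_code in data:
        print(color_code, size_code, classify(color_code, size_code))
```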

SAS: add one column from tableB to tableA

I have two tables that look like the ones below, and I want to add the column score from tableB to tableA to get tableC. How can I do this in SAS?
The only rule is to add a column named "score" to tableA whose value is the same as the "score" column in tableB (all the values in that column of tableB are identical).
+----+---+---+---+
| id | b | c | d |
+----+---+---+---+
| 1 | 5 | 7 | 2 |
| 2 | 6 | 8 | 3 |
| 3 | 7 | 8 | 1 |
| 4 | 5 | 7 | 2 |
| 5 | 6 | 8 | 3 |
| 6 | 7 | 8 | 1 |
+----+---+---+---+
tableA
+---+---+-------+
| e | f | score |
+---+---+-------+
| 3 | 7 | 11 |
| 4 | 6 | 11 |
| 5 | 5 | 11 |
+---+---+-------+
tableB
+----+---+---+---+-------+
| id | b | c | d | score |
+----+---+---+---+-------+
| 1 | 5 | 7 | 2 | 11 |
| 2 | 6 | 8 | 3 | 11 |
| 3 | 7 | 8 | 1 | 11 |
| 4 | 5 | 7 | 2 | 11 |
| 5 | 6 | 8 | 3 | 11 |
| 6 | 7 | 8 | 1 | 11 |
+----+---+---+---+-------+
tableC
If the "id" is present in both tables, you can use the following to create Table C:
PROC SQL;
CREATE TABLE tableC AS
SELECT a.*, b.score
FROM tableA a JOIN tableB b
ON a.id = b.id;
QUIT;
Please confirm whether this is what you need.
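As shown in the question, tableB has no "id" column, so a join on id may not apply. Under the question's stated rule that every score in tableB is the same, the desired result can be sketched in Python by taking that single value and attaching it to every row of tableA. This illustrates the logic only and is not SAS code; the function name add_constant_score is hypothetical.

```python
def add_constant_score(table_a, table_b):
    """Attach tableB's single distinct score to every row of tableA."""
    scores = {row["score"] for row in table_b}
    if len(scores) != 1:
        # The approach only makes sense if tableB holds one distinct score.
        raise ValueError("expected exactly one distinct score in tableB")
    (score,) = scores
    return [dict(row, score=score) for row in table_a]


if __name__ == "__main__":
    table_a = [
        {"id": 1, "b": 5, "c": 7, "d": 2},
        {"id": 2, "b": 6, "c": 8, "d": 3},
        {"id": 3, "b": 7, "c": 8, "d": 1},
        {"id": 4, "b": 5, "c": 7, "d": 2},
        {"id": 5, "b": 6, "c": 8, "d": 3},
        {"id": 6, "b": 7, "c": 8, "d": 1},
    ]
    table_b = [
        {"e": 3, "f": 7, "score": 11},
        {"e": 4, "f": 6, "score": 11},
        {"e": 5, "f": 5, "score": 11},
    ]
    for row in add_constant_score(table_a, table_b):
        print(row)
```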