Determining deciles given values from a table in BigQuery - google-cloud-platform

I have a table like this:
ID  value
1   value 1
2   value 2
3   value 3
5   value 4
6   value 5
I want to add a third column that assigns each ID to a decile, based on its "value":
ID  value    decile
1   value 1  1
2   value 2  2
3   value 3  7
5   value 4  1
6   value 5  5
I am coding in BigQuery.
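In case it helps, here is a minimal sketch of one way to do this in BigQuery, assuming the table is called my_table (a placeholder name) and that value is a numeric (or at least orderable) column to rank on; it uses the NTILE window function:
SELECT
  ID,
  value,
  -- NTILE(10) splits the ordered rows into 10 roughly equal buckets;
  -- bucket 1 holds the lowest values, bucket 10 the highest
  NTILE(10) OVER (ORDER BY value) AS decile
FROM my_table;
Note that with only five rows NTILE assigns at most one row per bucket, so the decile numbers in the sample output above would presumably come from ranking the full table or from a different ranking rule.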

Related

Create a mapping to display cumulative runs for each over (adding the first over's runs into the sum for the second over, and likewise for the third over)

Source table Cricket_Score:
Overs  Balls  Runs
1      1      1
1      2      2
1      3      4
1      4      0
1      5      1
1      6      2
2      1      3
2      2      1
2      3      1
2      4      4
2      5      6
2      6      0
3      1      2
3      2      1
3      3      1
3      4      6
3      5      0
3      6      4
I want an output like this:
Overs  Total_Runs
1      10
2      25
3      39
Description: For the first over (the first 6 balls) I want the sum of those 6 balls, which is 10. For the second over I want the sum of the first over plus the second over's 6 balls, which is 25 [10 + 15 = 25]. For the third over I want the sum of the first, second and third overs, which is 39 [10 + 15 + 14 = 39].
Note: 6 balls make one over.
How do I create a mapping for this scenario in Informatica, and which logic should I use?
I will assume your data is exactly as shown in your question. If the source is not like this, it will be a major issue; likewise if it is a table where the data is not sorted.
Solution -
Create an Expression transformation with the ports below, in this order (in_* = input port, v_* = variable port, out_* = output port):
in_balls
in_runs
in_overs
v_cumulative_runs = in_runs + iif(isnull(v_cumulative_runs), 0, v_cumulative_runs)
out_total_runs = v_cumulative_runs
out_overs = in_overs
Then use an Aggregator transformation with these ports:
in_total_runs
in_out_overs -- group by this port
out_total_runs = max(in_total_runs)
Link in_out_overs and out_total_runs to the target.
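For comparison only, here is a sketch of the same cumulative logic in plain SQL, assuming the source is also queryable as a table named Cricket_Score (this is not part of the Informatica mapping):
SELECT Overs,
       -- running total across overs, i.e. each over plus all earlier overs
       SUM(Over_Runs) OVER (ORDER BY Overs) AS Total_Runs
FROM (
  SELECT Overs, SUM(Runs) AS Over_Runs   -- runs scored within each over
  FROM Cricket_Score
  GROUP BY Overs
) AS per_over
ORDER BY Overs;
On the sample data this yields 10, 25 and 39 for overs 1, 2 and 3, matching the expected output.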

Power BI: How can I have two different side-by-side tables scroll at the same time in PBI?

I have two tables:
Table A
id  name    month_1  month_2  month_3  month_4  month_5  month_6
1   John    3        0        1        0        null     null
2   Mary    6        1        2        1        1        2
3   Angelo  1        5        null     null     null     null
4   Diane   3        2        0        1        null     null
Table B
id  name    LastYearTotal  CurrentYearTotal
1   John    2              4
2   Mary    6              13
3   Angelo  9              6
4   Diane   9              6
Tables A and B should sit side by side but not in the same table, with a separator between them. When I apply a filter, both tables should reflect it. In addition, there should be a single scrollbar for both tables so they move at the same time.
Thanks.

Counting observations with duplicate IDs

I have a dataset that I am converting from wide to long format.
Currently I have 1 observation per patient, and each patient can have up to 5 aneurysms, currently recorded in wide format.
I am trying to re-arrange this dataset so that I have one observation per aneurysm instead. I have done so successfully, but now I need to label the aneurysms in a new variable called aneurysmIdentifier.
Here is a glimpse at the data. You can see how, when a patient has 4 aneurysms, I have successfully created 4 corresponding observations, however these are duplicates created via the expand function.
I am stuck at the next point, which, as mentioned, is creating a new variable aneurysmIdentifier that reads 1 if there is only one copy of a given record_id, 1 and 2 if there are two copies, and so forth all the way to 1-2-3-4-5. This would give me a point of reference for what I call aneurysm 1, 2, 3, 4 and 5, so I can keep re-arranging the data to fit.
I have created a sketch that hopefully shows what I mean: it counts how many duplicates there are and then numbers them, up to the maximum of 5.
Can anyone push me in the right direction on how to achieve this?
Example of data:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str32 record_id float aneurysmNumber
"007128de18ce5cb1635b8f27c5435ff3" 1
"00abd7bdb6283dd0ac6b97271608a122" 1
"0142103f84693c6eda416dfc55f65de1" 1
"0153826d93a58d7e1837bb98a3c21ba8" 1
"01c729ac4601e36f245fd817d8977917" 2
"01c729ac4601e36f245fd817d8977917" 2
"01dd90093fbf201a1f357e22eaff6b6a" 1
"0208e14dcabc43dd2b57e2e8b117de4d" 1
"0210f575075e5def7ffa77530ce17ef0" 1
"022cc7a9397e81cf58cd9111f9d1db0d" 1
"02afd543116a22fc7430620727b20bb5" 1
"0303ef0bd5d256cca1c836e2b70415ac" 2
"0303ef0bd5d256cca1c836e2b70415ac" 2
"041b2b0cac589d6e3b65bb924803cf1a" 1
"0536317a2bbb936e85c3eb8294b076da" 1
"06161d4668f217937cac0ac033d8d199" 1
"065e151f8bcebb27fabf8b052fd70566" 4
"065e151f8bcebb27fabf8b052fd70566" 4
"065e151f8bcebb27fabf8b052fd70566" 4
"065e151f8bcebb27fabf8b052fd70566" 4
"07196414cd6bf89d94a33e149983d102" 1
"0721c38f8275dab504fc53aebcc005ce" 4
"0721c38f8275dab504fc53aebcc005ce" 4
"0721c38f8275dab504fc53aebcc005ce" 4
"0721c38f8275dab504fc53aebcc005ce" 4
"07bef516d53279a3f5e477d56d552a2b" 1
"08678829b7e0ee6a01b17974b4d19cfa" 1
"08bb6c65e63c499ea19ac24d5113dd94" 1
"08f036417500c332efd555c76c4654a0" 1
"090c54d021b4b21c7243cec01efbeb91" 1
"09166bb44e4c5cdb8f40d402f706816e" 1
"0930159addcdc35e7dc18812522d4377" 1
"096844af91d2e266767775b0bee9105e" 1
"09884af1bb9d59803de0c74d6df57c23" 1
"09e03748da35e9d799dc5d8ddf1909b5" 1
"0a4ce4a7941ff6d1f5c217bf5a9a3bf9" 1
"0a5db40dc58e97927b407c9210aab7ba" 2
"0a5db40dc58e97927b407c9210aab7ba" 2
"0a73c992955231650965ed87e3bd52f6" 1
"0a84ab77fff74c247a525dfde8ce988c" 3
"0a84ab77fff74c247a525dfde8ce988c" 3
"0a84ab77fff74c247a525dfde8ce988c" 3
"0af333ae400f75930125bb0585f0dcf5" 1
"0af73334d9d2166191f3385de48f15d2" 1
"0b341ac8f396a8cdb88b7c658f66f653" 2
"0b341ac8f396a8cdb88b7c658f66f653" 2
"0b35cf4beb830b361d7c164371f25149" 2
"0b35cf4beb830b361d7c164371f25149" 2
"0b3e110c9765e14a5c41fadcc3cfc300" .
"0b6681f0f441e69c26106ab344ac0733" 1
"0b8d8253a8415275dbc2619e039985bb" 3
"0b8d8253a8415275dbc2619e039985bb" 3
"0b8d8253a8415275dbc2619e039985bb" 3
"0b92c26375117bf42945c04d8d6573d4" 2
"0b92c26375117bf42945c04d8d6573d4" 2
"0ba961f437f43105c357403c920bdef1" 1
"0bb601fabe1fdfa794a5272408997a2f" 1
"0c75b36e91363d596dc46bd563c3f5ef" 1
"0d461328a3bae7164ce7d3a10f366812" 1
"0d4cc4eb459301a804cbef22914f44a3" 1
"0d4e29e11bb94e922112089f3fec61ef" 2
"0d4e29e11bb94e922112089f3fec61ef" 2
"0d513c74d667f55c8f4a9836c304149c" 1
"0da25de126bb3b3ee565eff8888004c2" 2
"0da25de126bb3b3ee565eff8888004c2" 2
"0db9ae1f2201577f431b7603d0819fa6" 1
"0dd8a681f6a5d4c888831a591e57a747" 1
"0e05d6958d878368b5fb831211fad6a1" 1
"0e3ff41e0e2b2cb5ec336fd0b04e5d44" 1
"0f61e560ab56b8fea1f2593d7d3b2718" 2
"0f61e560ab56b8fea1f2593d7d3b2718" 2
"0f69f1f998984d37f133185179d63c60" 1
"1037032886a93e66406a4c910d1ef747" 2
"1037032886a93e66406a4c910d1ef747" 2
"1044b81b354b420e85ae835ea07de2d6" 1
"10620fc488346291281212a404681386" 1
"1074389c469944edf026d193a55b1148" 1
"1090d5a678119b03cddab609289a4d3c" 1
"111eebb45cef2211a2a2ff0219095e6a" 1
"11ddcbc8de8ef56cbc578fc81b602ffc" 1
"11f22488513cf717c333786c789b0289" 2
"11f22488513cf717c333786c789b0289" 2
"121552b22cee2a1eb4360b4d2534cd39" 1
"1251d707c5dc9243dc45d04beb7c3493" 1
"125689659bb3821fa81698dd72462773" 1
"127ba572433921c5bb408fc62eb9b5d7" 1
"129bea3f73e84e37d77d55fadfeb49dd" 1
"12e8dc6fb87822be26d6678cee9644f5" 1
"12f05a65f771c9675c2c5e9cdbfc33d1" 2
"12f05a65f771c9675c2c5e9cdbfc33d1" 2
"13d2bc86f1a19ed2959cd7354bc92d1d" 1
"13db5ede38e2ae1da17884c9a18df202" 1
"13f946e50df8ad74d7cf9fa05b4ad05b" 1
"146c4b8be7996a9789873fe55a47ab41" 1
"147fadd87da13a0271225d944d2a5e98" 1
"14a1dcfa015343bbefaac9a3a45769e5" 2
"14a1dcfa015343bbefaac9a3a45769e5" 2
"14d1377f74a63ffa29db2d99e7f6a1ce" 1
"150017d944a87b4c61f90034380c0659" 1
"150f6ca1ea453260eabf3472d3ebcad1" 1
end
You can go
bysort record_id: gen aneurysm_id = _n
but the results will be arbitrary unless there is some other information, say a date variable, to provide a rationale for the ordering. Let's suppose that there is a date variable date that is numeric and in good order. Then
bysort record_id (date) : gen aneurysm_id = _n
would be a suitable modification. For date read also date-time if time of day is noted and notable.

Get row position reference and fill down X number of times until you encounter a new value

In Google Sheets, I have a column that is randomly populated with a value:
1 value
2
3
4 value
5 value
6
7
8
9 value
10
I need to populate all the blank, non-value cells in the column with a reference to the row number of the value above it.
Among many other things, I've tried using an IF statement:
=IF(A2="", ROW(A1), ROW(A2))
This would look for a blank value in column A, give me a row reference if the value IS blank, and give me the value of the cell itself if it is not blank. However, this does not work when I fill down.
I am looking for a formula that will look at the values in column A and give me a row ref for the most recent appearance of value in that column:
1 value
2 1
3 1
4 value
5 value
6 5
7 5
8 5
9 value
10 9
=ARRAYFORMULA(IF(A1:A<>"", A1:A,
IF(ROW(A1:A) <= MAX(IF(NOT(ISBLANK(A1:A)), ROW(A1:A))),
VLOOKUP(ROW(A1:A), FILTER(ROW(A1:A), LEN(A1:A)), 1), )))

Count the total of unique numbers occurring in a range of cells

Hello, this is my data sample:
coustmer_NO id
1 5
1 13
2 4
2 4
2 4
3 4
3 10
4 8
4 8
Using SQL, I would like to count, for each customer, how many different IDs they have.
The expected output is:
coustmer_NO total_id
1 2
2 1
3 2
4 1
I guess there is a typo in your data; the result should be:
coustmer_NO total_id
1 2
2 1
3 2
4 1
You can do the following:
SELECT coustmer_NO, count(distinct id) AS total_id FROM <table_name> GROUP BY coustmer_NO;
Try this query in MySQL:
select coustmer_NO, count(distinct id) as total_id from table_name group by coustmer_NO;