Split one string into multiple rows (SAS)

I'm trying to do the following in SAS.
I have a dataset like this:
| F2 | F3 | PERIOD |
___________________________________________________________
| Text1(1001) Text2(1002) Text3(1003) | | 10-02-03 |
| Text4 | 1004 | 10-02-08 |
| Text5(1005) Text6(1006) | | 10-02-12 |
| Text7 | 1007 | 10-03-01 |
What I would like to do is split the F2 column whenever it holds more than one value, so that the dataset looks like this:
| F2 | F3 | Period |
___________________________________________________________
| Text1 | 1001 | 10-02-03 |
| Text2 | 1002 | 10-02-03 |
| Text3 | 1003 | 10-02-03 |
| Text4 | 1004 | 10-02-08 |
| Text5 | 1005 | 10-02-12 |
| Text6 | 1006 | 10-02-12 |
| Text7 | 1007 | 10-03-01 |
So the values in parentheses end up in the F3 column, and the PERIOD column stays the same.
I hope somebody can help.

The first data step simulates the first two rows of your input; the second does the splitting you need.
Cheers, Francesco
data in;
length f2 $64
f3 $4
period $8;
f2="Text1(1001) Text2(1002) Text3(1003)";
period="10-02-03";
output;
f2="Text4";
f3="1004";
period="10-02-08";
output;
run;
data out;
length outF2 $16;
set in;
noOfBlanks = countc(strip(f2),' ');
if noOfBlanks then do i=1 to noOfBlanks+1;
outF2=scan(f2,i," ");
f3 = prxchange("s/.*\(|\)//",-1,outF2);
outF2 = prxchange("s/\(.*\)//",-1,outF2);
output;
end;
else do;
outF2=f2;
output;
end;
drop f2 noOfBlanks i;
rename outF2 = f2;
run;
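For readers who want to check the splitting logic outside SAS, here is a minimal Python sketch of the same idea: split F2 on blanks, then separate each token into its text and the number in parentheses. The function name `split_rows` and the tuple layout are illustrative assumptions, not part of the answer above.

```python
import re

# One token looks like "Text1(1001)": a label followed by a value in parentheses.
TOKEN = re.compile(r"^(?P<label>[^()]+)\((?P<value>[^()]*)\)$")

def split_rows(f2, f3, period):
    """Return a list of (f2, f3, period) rows, one per blank-separated token."""
    tokens = f2.split()
    if len(tokens) <= 1:
        # Single value: keep the row as it is, like the ELSE branch in the data step.
        return [(f2.strip(), f3, period)]
    rows = []
    for token in tokens:
        match = TOKEN.match(token)
        if match:
            rows.append((match["label"], match["value"], period))
        else:
            # Token without parentheses: keep the text, leave f3 empty.
            rows.append((token, "", period))
    return rows

# The two input rows simulated in the first data step.
rows = split_rows("Text1(1001) Text2(1002) Text3(1003)", "", "10-02-03")
rows += split_rows("Text4", "1004", "10-02-08")
for row in rows:
    print(row)
```

As in the data step, PERIOD is simply repeated on every row produced from the same input row.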

Related

Concatenating two columns from different tables if one of the columns is not empty

I have two tables which are connected by an ID column (not shown in the picture). Here is how the data looks:
| column1 | column2 |
| -------- | -------------- |
| Mike | 345 |
| Steve | 987 |
| Andy | 0 |
| Lucas | 0 |
--
| column3 | column4 |
| -------- | -------------- |
| Mike | 543 |
| Lucas | 0 |
| Andy | 678 |
| Steve | 0 |
I wish to create a calculated column which concatenates the results from the second table in the picture (column3, column4) only if the result in column2 is zero. If the result of column2 is not zero then it should have precedence in concatenation.
Also if both column2 and column4 are zero then there should be no concatenation.
I'm expecting something like this:
| Column3 | Column4 | Concat column|
|---- |------| -----|
| Mike | 543 | Mike 345 |
| Lucas | 0 | |
| Andy | 678 | Andy 678 |
| Steve | 0 | Steve 987 |
Try this.
ConcatColumn = IF(Table1[Column2]<>0, Table1[Column1] & " " & Table1[Column2], IF(RELATED(Table2[Column4])<>0, RELATED(Table2[Column3]) & " " & RELATED(Table2[Column4]), BLANK()))
Before using the calculated column above, you first have to establish a relationship between Table1 and Table2 on Column1 and Column3.
It is also assumed that Column2 and Column4 have the Whole Number data type.
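The precedence rule from the question (use column2 when it is non-zero, otherwise column4, otherwise nothing) can be checked outside DAX with a small Python sketch. The dictionaries and the function name `concat_column` are assumptions made for illustration; the values are the sample rows from the question, joined on the name.

```python
# Sample rows from the question, keyed by name (Column1 / Column3).
table1 = {"Mike": 345, "Steve": 987, "Andy": 0, "Lucas": 0}
table2 = {"Mike": 543, "Lucas": 0, "Andy": 678, "Steve": 0}

def concat_column(name):
    """Prefer the Table1 value; fall back to Table2; empty when both are zero."""
    primary = table1.get(name, 0)
    fallback = table2.get(name, 0)
    if primary != 0:
        return f"{name} {primary}"
    if fallback != 0:
        return f"{name} {fallback}"
    return ""

# Evaluate in the row order of the second table, as in the expected output.
result = {name: concat_column(name) for name in table2}
print(result)
```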

How to extract all values that contain part of a particular number and then delete them?

How do you extract all values containing part of a particular number and then delete them?
I have data where the IDs have different lengths, and I want to extract all the IDs that end with a particular number. For example, if the ID ends in "-00", "02" or "-01", I want to pull those rows so I can see the hit rate for them, and then delete that suffix from the ID. Is there a more efficient way to write this code?
I tried to use the substring function to slice the ID, but some IDs do not line up with the fixed position I specified.
Code:
Proc sql;
Create table work.data1 AS
SELECT Product, Amount_sold, Price_per_unit,
CASE WHEN Product Contains "Pen" and Length(ID) >= 9 Then SUBSTR(ID,1,9)
WHEN Product Contains "Book" and Length(ID) >= 11 Then SUBSTR(ID,1,11)
WHEN Product Contains "Folder" and Length(ID) >= 12 Then SUBSTR(ID,1,12)
...
END AS ID
FROM A;
Quit;
Have:
+------------------+-----------------+-------------+----------------+
| ID | Product | Amount_sold | Price_per_unit |
+------------------+-----------------+-------------+----------------+
| 123456789 | Pen | 30 | 2 |
| 63495837229-01 | Book | 20 | 5 |
| ABC134475472 02 | Folder | 29 | 7 |
| AB-1235674467-00 | Pencil | 26 | 1 |
| 69598346-02 | Correction pen | 15 | 1.50 |
| 6970457688 | Highlighter | 15 | 2 |
| 584028467 | Color pencil | 15 | 10 |
+------------------+-----------------+-------------+----------------+
Wanted the final result:
+------------------+-----------------+-------------+----------------+
| ID | Product | Amount_sold | Price_per_unit |
+------------------+-----------------+-------------+----------------+
| 123456789 | Pen | 30 | 2 |
| 63495837229 | Book | 20 | 5 |
| ABC134475472 | Folder | 29 | 7 |
| AB-1235674467 | Pencil | 26 | 1 |
| 69598346 | Correction pen | 15 | 1.50 |
| 6970457688 | Highlighter | 15 | 2 |
| 584028467 | Color pencil | 15 | 10 |
+------------------+-----------------+-------------+----------------+
Just test two things: that the string has an embedded space or hyphen, and that its last word (delimited by space or hyphen) is 00, 01 or 02. If both hold, chop off the last three characters.
data have;
infile cards dsd dlm='|' truncover ;
input id :$20. product :$20. amount_sold price_per_unit;
cards;
123456789 | Pen | 30 | 2 |
63495837229-01 | Book | 20 | 5 |
ABC134475472 02 | Folder | 29 | 7 |
AB-1235674467-00 | Pencil | 26 | 1 |
69598346-02 | Correction pen | 15 | 1.50 |
6970457688 | Highlighter | 15 | 2 |
584028467 | Color pencil | 15 | 10 |
;
data want;
set have ;
if indexc(trim(id),'- ') and scan(id,-1,'- ') in ('00' '01' '02') then
id = substrn(id,1,length(id)-3)
;
run;
Result
Obs  id             product         amount_sold  price_per_unit
1    123456789      Pen             30           2.0
2    63495837229    Book            20           5.0
3    ABC134475472   Folder          29           7.0
4    AB-1235674467  Pencil          26           1.0
5    69598346       Correction pen  15           1.5
6    6970457688     Highlighter     15           2.0
7    584028467      Color pencil    15           10.0
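The same test (a trailing 00, 01 or 02 preceded by a hyphen or a space) can also be written as one regular expression. The sketch below uses Python only to illustrate the rule on the IDs from the question; the names `SUFFIX` and `strip_suffix` are assumptions, not part of the SAS answer.

```python
import re

# A trailing 00, 01 or 02 that is separated from the ID by a hyphen or a blank.
SUFFIX = re.compile(r"[- ](00|01|02)$")

def strip_suffix(raw_id):
    """Drop the trailing suffix if present; otherwise return the ID unchanged."""
    return SUFFIX.sub("", raw_id.strip())

ids = ["123456789", "63495837229-01", "ABC134475472 02",
       "AB-1235674467-00", "69598346-02", "6970457688", "584028467"]
cleaned = [strip_suffix(i) for i in ids]
for before, after in zip(ids, cleaned):
    print(f"{before!r} -> {after!r}")
```

Anchoring the pattern at the end of the string is what keeps the hyphen inside AB-1235674467 intact.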
There may be other solutions, but you have to use some string functions. Here I used substr, reverse (to reverse the string) and indexc (to find the position of any of the given characters in the string):
data have;
input text $20.;
datalines;
12345678
AB-142353 00
AU-234343-02
132453 02
221344-09
;
run;
data want (drop=reverted pos);
set have;
if countw(text) gt 1
then do;
reverted=strip(reverse(text));
pos=indexc(reverted,'- ')+1;
new=strip(reverse(substr(reverted,pos)));
end;
else new=text;
run;

Derive attributes based on multiple events

I have data that I want to transpose so that I can see the status of a single id at any point in time.
I have been trying to follow @Joe's answer from Aggregating multiple observations depending on validity ranges, but I struggle with attributes that have multiple modalities.
This is the event-based data I have:
data have;
infile datalines delimiter="|";
input attrib :$30. multiple_attr :$1. id :$30. attrib_id :8. member_value :$100. type :$5. dt_event :datetime18.;
format dt_event datetime20.;
datalines;
TYPE|N|ABC123|111|MEDIUM|Start|01DEC2014:00:00:00
TYPE|N|ABC123|111|MEDIUM|End|18APR2021:00:00:00
TYPE|N|ABC123|111|BIG|Start|19APR2021:00:00:00
TYPE|N|ABC123|111|BIG|End|31DEC2030:00:00:00
POSITION|N|ABC123|222|TOP|Start|01DEC2014:00:00:00
POSITION|N|ABC123|222|TOP|End|31DEC2030:00:00:00
IS_ACTIVE|N|ABC123|333|YES|Start|01DEC2014:00:00:00
IS_ACTIVE|N|ABC123|333|YES|End|31DEC2030:00:00:00
LEVELS|Y|ABC123|1|ALONE|Start|01DEC2014:00:00:00
LEVELS|Y|ABC123|1|BOTH|Start|01DEC2014:00:00:00
LEVELS|Y|ABC123|1|BOTH|End|18APR2021:00:00:00
LEVELS|Y|ABC123|1|ALONE|End|31DEC2030:00:00:00
TYPE|N|DEF456|111|MEDIUM|Start|01DEC2014:00:00:00
TYPE|N|DEF456|111|MEDIUM|End|31DEC2030:00:00:00
POSITION|N|DEF456|222|MID|Start|01DEC2014:00:00:00
POSITION|N|DEF456|222|MID|End|31DEC2030:00:00:00
IS_ACTIVE|N|DEF456|333|YES|Start|01MAR2014:00:00:00
IS_ACTIVE|N|DEF456|333|YES|End|31DEC2030:00:00:00
LEVELS|Y|DEF456|1|ALONE|Start|01MAR2014:00:00:00
LEVELS|Y|DEF456|1|BOTH|Start|01MAR2014:00:00:00
LEVELS|Y|DEF456|1|BOTH|End|31MAR2018:00:00:00
LEVELS|Y|DEF456|1|BOTH|Start|20AUG2018:00:00:00
LEVELS|Y|DEF456|1|ALONE|End|31DEC2030:00:00:00
LEVELS|Y|DEF456|1|BOTH|End|31DEC2030:00:00:00
;
Using @Joe's method:
proc sort data=have;
by id attrib_id dt_event member_value;
run;
data want;
set have(rename=member_value=in_value);
by id attrib_id dt_event;
retain start_date end_date member_value orig_value;
format member_value new_value $100.;
* First row per attrib_id is easy, just start it off with a START;
if first.attrib_id then do;
start_date = dt_event;
member_value = in_value;
end;
else do; *Now is the harder part;
* For ENDs, we want to remove the current member_value from the concatenated value string, always, and then if it is the last row for that dt_event, we want to output a new record;
if type='End' then do;
*remove the current (in_)value;
if first.dt_event then orig_value = member_value;
do _i = 1 to countw(member_value,';');
if scan(orig_value,_i,';') ne in_value then do;
if orig_value > scan(orig_value,_i,';') then new_value = catx('; ',scan(orig_value,_i,';'),new_value);
else new_value = catx('; ',new_value,scan(orig_value,_i,';'));
end;
end;
orig_value = new_value;
if last.dt_event then do;
end_date = dt_event;
output;
start_date = dt_event + 86400;
member_value = new_value;
orig_value = ' ';
end;
end;
else do;
* For START, we want to be more careful about outputting, as this will output lots of unwanted rows if we do not take care;
end_date = dt_event - 86400;
if start_date < end_date and not missing(member_value) then output;
if member_value > in_value then member_value = catx('; ',in_value,member_value);
else member_value = catx('; ',member_value,in_value);
start_date = dt_event;
end_date = .;
end;
end;
format start_date end_date datetime20.;
keep id multiple_attr attrib_id member_value start_date end_date;
run;
I end up with:
+---------------+--------+-----------+--------------------+--------------------+-------------------+
| multiple_attr | id | attrib_id | start_date | end_date | member_value |
+---------------+--------+-----------+--------------------+--------------------+-------------------+
| Y | ABC123 | 1 | 01DEC2014:00:00:00 | 18APR2021:00:00:00 | ALONE; BOTH |
| Y | ABC123 | 1 | 19APR2021:00:00:00 | 31DEC2030:00:00:00 | BOTH; ALONE |
| N | ABC123 | 111 | 01DEC2014:00:00:00 | 18APR2021:00:00:00 | MEDIUM |
| N | ABC123 | 111 | 19APR2021:00:00:00 | 31DEC2030:00:00:00 | BIG |
| N | ABC123 | 222 | 01DEC2014:00:00:00 | 31DEC2030:00:00:00 | TOP |
| N | ABC123 | 333 | 01DEC2014:00:00:00 | 31DEC2030:00:00:00 | YES |
| Y | DEF456 | 1 | 01MAR2014:00:00:00 | 31MAR2018:00:00:00 | ALONE; BOTH |
| Y | DEF456 | 1 | 01APR2018:00:00:00 | 19AUG2018:00:00:00 | BOTH; ALONE |
| Y | DEF456 | 1 | 20AUG2018:00:00:00 | 31DEC2030:00:00:00 | BOTH; BOTH; ALONE |
| N | DEF456 | 111 | 01DEC2014:00:00:00 | 31DEC2030:00:00:00 | MEDIUM |
| N | DEF456 | 222 | 01DEC2014:00:00:00 | 31DEC2030:00:00:00 | MID |
| N | DEF456 | 333 | 01MAR2014:00:00:00 | 31DEC2030:00:00:00 | YES |
+---------------+--------+-----------+--------------------+--------------------+-------------------+
You can see that attributes with multiple modalities (where multiple_attr = "Y") are not handled properly.
The desired output looks like this:
+---------------+--------+-----------+--------------------+--------------------+--------------+
| multiple_attr | id | attrib_id | start_date | end_date | member_value |
+---------------+--------+-----------+--------------------+--------------------+--------------+
| Y | ABC123 | 1 | 01DEC2014:00:00:00 | 18APR2021:00:00:00 | ALONE; BOTH |
| Y | ABC123 | 1 | 19APR2021:00:00:00 | 31DEC2030:00:00:00 | ALONE |
| N | ABC123 | 111 | 01DEC2014:00:00:00 | 18APR2021:00:00:00 | MEDIUM |
| N | ABC123 | 111 | 19APR2021:00:00:00 | 31DEC2030:00:00:00 | BIG |
| N | ABC123 | 222 | 01DEC2014:00:00:00 | 31DEC2030:00:00:00 | TOP |
| N | ABC123 | 333 | 01DEC2014:00:00:00 | 31DEC2030:00:00:00 | YES |
| Y | DEF456 | 1 | 01MAR2014:00:00:00 | 31MAR2018:00:00:00 | ALONE; BOTH |
| Y | DEF456 | 1 | 01APR2018:00:00:00 | 19AUG2018:00:00:00 | ALONE |
| Y | DEF456 | 1 | 20AUG2018:00:00:00 | 31DEC2030:00:00:00 | ALONE; BOTH |
| N | DEF456 | 111 | 01DEC2014:00:00:00 | 31DEC2030:00:00:00 | MEDIUM |
| N | DEF456 | 222 | 01DEC2014:00:00:00 | 31DEC2030:00:00:00 | MID |
| N | DEF456 | 333 | 01MAR2014:00:00:00 | 31DEC2030:00:00:00 | YES |
+---------------+--------+-----------+--------------------+--------------------+--------------+
Is there a way to handle attributes with multiple modalities? I can't find a way to delete a member value once a modality of that attribute ends (i.e. switching from ALONE; BOTH to ALONE after BOTH has ended).
I'm not 100% sure I understand all of this, but I think this is at least one of the problems.
Looking at where you remove the values, you need to use strip() or similar because of the spaces. I removed the spaces from the catx() delimiter and added strip() here:
if strip(scan(orig_value,_i,';')) ne strip(in_value) then do;
if strip(orig_value) > strip(scan(orig_value,_i,';')) then new_value = catx(';',scan(orig_value,_i,';'),new_value);
else new_value = catx(';',new_value,scan(orig_value,_i,';'));
end;
Otherwise it compares words with leading spaces to words without them. In some cases SAS treats those words as identical, but in others it does not, which causes some of your issues here. When I run this, I get "ALONE" on the second line, for example.
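The effect of the leading blank is easy to reproduce in any language. The Python sketch below is only an illustration of the comparison problem, not a translation of the data step; the helper name `remove_member` is an assumption.

```python
def remove_member(members, ending, sep=";"):
    """Remove one occurrence of `ending` from a delimited string of members."""
    # Strip every part so that " BOTH" and "BOTH" compare as equal.
    parts = [p.strip() for p in members.split(sep) if p.strip()]
    if ending.strip() in parts:
        parts.remove(ending.strip())
    return sep.join(parts)

# Naive comparison: the second part is " BOTH" (with a leading blank),
# so it never equals "BOTH" and nothing is removed.
naive = [p for p in "ALONE; BOTH".split(";") if p != "BOTH"]

# Whitespace-aware comparison removes the ended member as intended.
fixed = remove_member("ALONE; BOTH", "BOTH")

print(naive)
print(fixed)
```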

Sequences in SAS Tables

I'm looking to add a sequence column to my SAS dataset, numbered by ID and transaction date. To illustrate, below is the table I'm referring to:
ID | TXN_DT |
01 | 01JAN2020 |
01 | 01JAN2020 |
01 | 02JAN2020 |
01 | 03JAN2020 |
02 | 01JAN2020 |
02 | 02JAN2020 |
02 | 02JAN2020 |
02 | 03JAN2020 |
02 | 03JAN2020 |
and I want to add a sequence like so:
ID | TXN_DT | SEQ |
01 | 01JAN2020 | 1 |
01 | 01JAN2020 | 1 |
01 | 02JAN2020 | 2 |
01 | 03JAN2020 | 3 |
02 | 01JAN2020 | 1 |
02 | 02JAN2020 | 2 |
02 | 02JAN2020 | 2 |
02 | 03JAN2020 | 3 |
02 | 03JAN2020 | 3 |
I'm trying to run the following code, but instead of copying the previous row's value it seems to pick up the value from two rows above.
data want;
set have;
by id;
if first.id then seq=1;
else seq+1;
if txn_dt=lag(txn_dt) then seq = lag(seq);
keep id seq txn_dt;
run;
Any help? Thanks in advance!
Try
if first.id then seq=0;
seq + (first.id or txn_dt ne lag(txn_dt));
Try using RETAIN together with the FIRST. automatic variables.
data want(drop=txn_dt_group);
set have;
by id txn_dt;
retain txn_dt_group seq;
if first.id then do;
txn_dt_group=txn_dt;
seq=1;
end;
if txn_dt ne txn_dt_group then do;
seq=seq+1;
txn_dt_group=txn_dt;
end;
run;
Output:
+-----------+----+-----+
| txn_dt | ID | seq |
+-----------+----+-----+
| 01JAN2020 | 1 | 1 |
| 01JAN2020 | 1 | 1 |
| 02JAN2020 | 1 | 2 |
| 03JAN2020 | 1 | 3 |
| 01JAN2020 | 2 | 1 |
| 02JAN2020 | 2 | 2 |
| 02JAN2020 | 2 | 2 |
| 03JAN2020 | 2 | 3 |
| 03JAN2020 | 2 | 3 |
+-----------+----+-----+
data want;
set have;
by id txn_dt;
if first.id then seq=1;
else if first.txn_dt then seq+1;
run;
I think that should do it.
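The BY-group logic in the answers above (reset the counter at the first row of each ID, increase it at the first row of each new date) can be sketched in Python for comparison. This assumes the rows are already sorted by ID and date, as BY processing requires; the function name `add_seq` is an illustrative assumption.

```python
from itertools import groupby

# Sample rows from the question: (id, txn_dt), sorted by id and date.
rows = [("01", "01JAN2020"), ("01", "01JAN2020"), ("01", "02JAN2020"),
        ("01", "03JAN2020"), ("02", "01JAN2020"), ("02", "02JAN2020"),
        ("02", "02JAN2020"), ("02", "03JAN2020"), ("02", "03JAN2020")]

def add_seq(sorted_rows):
    """Number the distinct dates within each id, starting again at 1 per id."""
    out = []
    for _id, id_group in groupby(sorted_rows, key=lambda r: r[0]):
        seq = 0  # corresponds to first.id: restart the counter
        for (_, txn_dt), dt_group in groupby(id_group, key=lambda r: r):
            seq += 1  # corresponds to first.txn_dt: a new date within the id
            out.extend((_id, txn_dt, seq) for _ in dt_group)
    return out

result = add_seq(rows)
for row in result:
    print(row)
```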
For completeness, here is a hash solution that does not depend on the order of your data.
data have;
input ID $ TXN_DT :date9.;
infile datalines dlm='|';
format TXN_DT date9.;
datalines;
01|01JAN2020
01|01JAN2020
01|02JAN2020
01|03JAN2020
02|01JAN2020
02|02JAN2020
02|02JAN2020
02|03JAN2020
02|03JAN2020
;
data want(drop=rc);
if _N_ = 1 then do;
dcl hash h1 ();
h1.definekey ('ID', 'TXN_DT');
h1.definedata ('SEQ');
h1.definedone ();
dcl hash h2 ();
h2.definekey ('ID');
h2.definedata ('SEQ');
h2.definedone ();
do until (lr);
set have end=lr;
if h2.find() = 0 then do;
if h1.check() ne 0 then seq + 1;
end;
else seq = 1;
h1.ref();
h2.replace();
end;
end;
set have;
rc = h1.find();
run;

Proc sql and macro variables

I am trying to run code that should work on tables created from different factors. As there can be more than one factor, I decided to define a macro variable with %let to list them:
%let list= factor1 factor2 ...;
What I would like to do is run code that creates these tables for the different factors. For each factor I computed the mean and the standard deviation with proc means, so the table created by proc means should contain the variables &list._mean and &list._stddev for each factor. This table is aliased as t2, and I need to join it to another table, t1. From t1 I select all the variables.
My main difficulties are, therefore, in the proc sql:
proc sql;
create table new_table as
select t1.*
, t2.&list._mean as mean
, t2.&list._stddev as stddev
from table1 as t1
left join table2 as t2
on t1.time=t2.time
order by t2.&list.;
quit;
This code returns an error, and I think that is because the reference expands to t2.factor1 factor2, i.e. the t2 prefix is applied only to the first factor, not to the second one.
What I would expect is the following:
proc sql;
create table new_table as
select t1.*
, t2.factor1_mean as mean
, t2.factor1_stddev as stddev
from table1 as t1
left join table2 as t2
on t1.time=t2.time
order by t2.factor1;
quit;
and another one for factor2.
UPDATE CODE:
%macro test_v1(
_dtb
,_input
,_output
,_time
,_factor
);
data &_input.;
set &_dtb..&_input.;
keep &_time. &_factor.;
run;
proc sort data = work.&_input.
out = &_input._1;
by &_factor. &_time.;
run;
%put ERROR: 2;
proc means data=&_input._1 nonobs mean stddev;
class &_time.;
var &_factor.;
output out=&_input._n (drop=_TYPE_) mean= stddev= /autoname ;
run;
%put ERROR: 3;
proc sql;
create table work.&_input._data as
select t1.*
,t2.&_factor._mean as mean
,t2.&_factor._stddev as stddev
from &_input. as t1
left join &_input._n as t2
on t1.&_time.=t2.&_time.
order by &_factor.;
quit;
%mend test_v1;
My question, then, is how I can pass multiple factors, defined as a list in a macro variable, both as table columns and as input to a macro (for example: %test(dataset, tablename, list)).
I suspect that trying to use PROC SQL is what is making the problem hard. If you stick to just using normal SAS syntax your space delimited list of variable names is easy to use.
So taking your code and tweaking it a little:
%macro test_v1
(_dtb /* Input libref */
,_input /* Input member name */
,_output /* Output dataset */
,_time /* Class/By variable(s) */
,_factor /* Analysis variable(s) */
);
proc sort data= &_dtb..&_input. out=_temp1;
by &_time. ;
run;
proc means data=_temp1 nonobs mean stddev;
by &_time.;
var &_factor.;
output out=_temp2 (drop=_TYPE_) mean= stddev= /autoname ;
run;
data &_output. ;
merge _temp1 _temp2 ;
by &_time.;
run;
%mend test_v1;
We can then test it using SASHELP.CLASS by using SEX as the "time" variable and HEIGHT and WEIGHT as the analysis variables.
%test_v1(_dtb=sashelp,_input=class,_output=want,_time=sex,_factor=height weight);
You can add a macro loop that scans the list of factors. It could look like this:
%macro test(list);
%do i=1 %to %sysfunc(countw(&list,%str( )));
%let factorname=%scan(&list,&i,%str( ));
/* if macro variable list equals factor1 factor2 then there would be
two iterations in loop, i=1 factorname=factor1 and i=2 factorname=factor2*/
/*your code here*/
%end;
%mend test;
UPDATE:
%macro test(_input, _output, factors_list);
%do i=1 %to %sysfunc(countw(&factors_list,%str( )));
%let tfactor=%scan(&factors_list,&i,%str( ));
proc sort data = work.&_input.
out = &_input._1;
by &factors_list. time;
run;
proc means data=&_input._1 nonobs mean stddev;
class time;
var &tfactor.;
output out=&_input._num (drop=_TYPE_) mean= stddev= /autoname ;
run;
proc sql;
create table &_output._&tfactor as
select t1.*
, t2.&tfactor._mean as mean
, t2.&tfactor._stddev as stddev
from &_input as t1
left join &_input._num as t2
on t1.time=t2.time
order by t1.&tfactor;
quit;
%end;
%mend test;
%test(have,newdata,factor1 factor2);
Have dataset:
+------+---------+---------+
| time | factor1 | factor2 |
+------+---------+---------+
| 1 | 12345 | 1234 |
| 2 | 123 | 12 |
| 3 | 1 | -1 |
| 4 | -12 | -123 |
| 5 | -1234 | -12345 |
| 6 | 9876 | 987 |
| 7 | 98 | 8 |
| 8 | 9 | 7 |
| 1 | 1234 | 123 |
| 2 | 12 | 1 |
| 3 | 12 | -12 |
| 4 | -123 | -1234 |
| 5 | -12345 | -123456 |
| 6 | 987 | 98 |
| 7 | 9 | -9 |
| 8 | 1234 | 1234 |
+------+---------+---------+
NEWDATA_FACTOR1:
+------+---------+---------+---------+--------------+
| time | factor1 | factor2 | mean | stddev |
+------+---------+---------+---------+--------------+
| 5 | -12345 | -123456 | -6789.5 | 7856.6634458 |
| 5 | -1234 | -12345 | -6789.5 | 7856.6634458 |
| 4 | -123 | -1234 | -67.5 | 78.488852712 |
| 4 | -12 | -123 | -67.5 | 78.488852712 |
| 3 | 1 | -1 | 6.5 | 7.7781745931 |
| 7 | 9 | -9 | 53.5 | 62.932503526 |
| 8 | 9 | 7 | 621.5 | 866.20580695 |
| 3 | 12 | -12 | 6.5 | 7.7781745931 |
| 2 | 12 | 1 | 67.5 | 78.488852712 |
| 7 | 98 | 8 | 53.5 | 62.932503526 |
| 2 | 123 | 12 | 67.5 | 78.488852712 |
| 6 | 987 | 98 | 5431.5 | 6285.472178 |
| 1 | 1234 | 123 | 6789.5 | 7856.6634458 |
| 8 | 1234 | 1234 | 621.5 | 866.20580695 |
| 6 | 9876 | 987 | 5431.5 | 6285.472178 |
| 1 | 12345 | 1234 | 6789.5 | 7856.6634458 |
+------+---------+---------+---------+--------------+
NEWDATA_FACTOR2:
+------+---------+---------+----------+--------------+
| time | factor1 | factor2 | mean | stddev |
+------+---------+---------+----------+--------------+
| 5 | -12345 | -123456 | -67900.5 | 78567.341564 |
| 5 | -1234 | -12345 | -67900.5 | 78567.341564 |
| 4 | -123 | -1234 | -678.5 | 785.5956339 |
| 4 | -12 | -123 | -678.5 | 785.5956339 |
| 3 | 12 | -12 | -6.5 | 7.7781745931 |
| 7 | 9 | -9 | -0.5 | 12.02081528 |
| 3 | 1 | -1 | -6.5 | 7.7781745931 |
| 2 | 12 | 1 | 6.5 | 7.7781745931 |
| 8 | 9 | 7 | 620.5 | 867.62002052 |
| 7 | 98 | 8 | -0.5 | 12.02081528 |
| 2 | 123 | 12 | 6.5 | 7.7781745931 |
| 6 | 987 | 98 | 542.5 | 628.61792847 |
| 1 | 1234 | 123 | 678.5 | 785.5956339 |
| 6 | 9876 | 987 | 542.5 | 628.61792847 |
| 1 | 12345 | 1234 | 678.5 | 785.5956339 |
| 8 | 1234 | 1234 | 620.5 | 867.62002052 |
+------+---------+---------+----------+--------------+
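The tables above attach to every row the mean and sample standard deviation of its time group. As a cross-check of those numbers outside SAS, here is a minimal Python sketch of the same "group statistics joined back to each row" step. It uses only the factor1 values for time 1 and 2 from the Have dataset; the function name `with_group_stats` is an assumption made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

# (time, factor1) pairs for time 1 and 2, copied from the Have dataset.
have = [(1, 12345), (2, 123), (1, 1234), (2, 12)]

def with_group_stats(rows):
    """Append the group mean and sample standard deviation to each row."""
    groups = defaultdict(list)
    for time, value in rows:
        groups[time].append(value)
    # One (mean, stddev) pair per time value, like the proc means output.
    stats = {t: (mean(v), stdev(v)) for t, v in groups.items()}
    # Left join back onto the original rows by time.
    return [(t, v, *stats[t]) for t, v in rows]

out = with_group_stats(have)
for row in out:
    print(row)
```

Running the same function once per factor corresponds to one iteration of the macro loop.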