I have the following code:
DATA _quotes_merged;
merge TICKERS_NBBO (IN=C)
_quotes_processed (IN=D)
;
by SYM_ROOT;
if C and D;
RUN;
In table C, there is a variable named SYM_ROOT which contains a set of unique symbols or stocks.
In table D, there is also a variable named SYM_ROOT, but there are lots of duplicate rows with the same symbols.
Now, I want to keep those rows in D only when a match in C is found, without deleting the duplicates. I also don't want any void values (that's why I am using if C and D instead of if C).
I was wondering whether my code above works as planned.
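One way to check this is to run the same merge on a tiny test case. The sketch below is only an illustration: the data set and variable names come from the question, but the ticker symbols and the BID column are made-up sample values. It relies on the fact that a match-merge needs both inputs sorted by the BY variable, and that in a one-to-many merge the IN= flag of the "one" side stays set for every row of the BY group.

```sas
/* Hypothetical sample data: one row per symbol */
data TICKERS_NBBO;
  input SYM_ROOT $;
  datalines;
AAPL
IBM
MSFT
;
run;

/* Hypothetical sample data: repeated symbols, plus GE which is not in the ticker list */
data _quotes_processed;
  input SYM_ROOT $ BID;
  datalines;
AAPL 10
AAPL 11
GE 5
IBM 20
IBM 21
IBM 22
;
run;

/* Both inputs must be sorted by the BY variable before MERGE */
proc sort data=TICKERS_NBBO; by SYM_ROOT; run;
proc sort data=_quotes_processed; by SYM_ROOT; run;

data _quotes_merged;
  merge TICKERS_NBBO (in=C)
        _quotes_processed (in=D);
  by SYM_ROOT;
  if C and D;  /* keep only symbols present in both inputs */
run;
```

With this sample data, the output should hold the two AAPL rows and the three IBM rows. GE is dropped because it appears only in the quotes, and MSFT is dropped because it appears only in the ticker list.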
Related
I want to merge two or more tables into one. For example, I have table1.csv and table2.csv; they come from different MySQL servers but have the same structure, like [A, B, C, datatime].
If two records differ in any of A, B, or C, they are treated as different records. If the values of A, B, and C are all the same, only the record with the latest datatime should be kept.
If I first use the program to select which records are useful locally, and then insert them into mysql together, will it be faster than inserting them one by one while selecting?
You can do it easily with a composite unique key over the three fields on the table you want to insert into.
This query adds the unique key, so the same row cannot be added again:
ALTER TABLE `table1` ADD UNIQUE `unique_index`(`a`, `b`, `c`);
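This query appends only the rows that differ; rows that would duplicate an existing (a, b, c) key are skipped. Note that the row already in table1 is the one that is kept, whichever datatime it has:
INSERT IGNORE INTO table1 SELECT * FROM table2;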
How to compare variables in more than two tables?
I even tried PROC COMPARE, but it does not compare more than two tables.
PROC COMPARE handles only two data sets at a time and mainly reports whether they have the same variables and values (which I assume is not the info you need).
You'll need to merge the tables together using MERGE. Your code will look something like this:
DATA NEWTABLENAME NEWTABLENAME2;
MERGE TABLE1 (IN=A) TABLE2 (IN=B);
BY VAR1;
IF A AND B THEN OUTPUT NEWTABLENAME;
ELSE IF A AND NOT B THEN OUTPUT NEWTABLENAME2;
RUN;
The BY statement tells SAS which variable you'd like the tables to merge on, as SAS isn't going to just smush some extra columns onto your existing table.
The SAS documentation covers PROC COMPARE and the MERGE statement (MERGE is a DATA step statement, not a procedure) in more detail.
I want to copy only 3 of the 7 columns from dataset 'A' into dataset 'B'.
dataset A has (p,q,r,s,t,u,v)
I want to copy p,q,t in a new dataset B.
This is a more efficient way to do it:
data B;
set A (keep=p q t);
run;
This is more efficient because the KEEP= data set option on the SET statement means only these columns are read in the first place. A KEEP statement outside the SET statement still reads all the columns and drops them when the output is written.
We can use the KEEP statement.
data B;
set A;
keep p q t;
run;
What do you mean by copying columns? Does dataset B already exist? If that's the case, you simply need to merge the two files and use a KEEP statement when reading them. If you need to create a new data set, it's even simpler:
data B;
set A;
keep p q t;
run;
Hope it helps. If you need the merge, please post and I will explain further.
Suppose I have a table A with VAR1 and VAR2. Suppose also, I have Table B with VAR1. Is there a way I can check to see if VAR1 from Table A is in Table B without merging?
You have to combine the two in some fashion, but not necessarily with a MERGE statement. Depending on the characteristics of the two tables, you can use:
Merge
SQL Join
Format lookup (load relevant parts of dataset B into a format)
Hash lookup (load relevant parts of dataset B into hash table)
Array lookup (load relevant parts of dataset B into temporary array)
And probably several other methods that I'm forgetting.
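As an illustration of the hash lookup option, here is a minimal sketch. The table and variable names (A, B, VAR1, VAR2) follow the question; the sample values, the output data set A_flagged, and the flag variable in_B are hypothetical names chosen for this example. It assumes VAR1 has the same type and length in both tables.

```sas
/* Hypothetical sample data */
data A;
  input VAR1 $ VAR2;
  datalines;
x 1
y 2
z 3
;
run;

data B;
  input VAR1 $;
  datalines;
x
z
;
run;

data A_flagged;
  /* Load the keys of B into memory once, on the first iteration */
  if _n_ = 1 then do;
    declare hash h(dataset: 'B');
    h.defineKey('VAR1');
    h.defineDone();
  end;
  set A;
  /* check() returns 0 when the current VAR1 is found in the hash */
  in_B = (h.check() = 0);
run;
```

Neither table needs to be sorted for this approach, which is one reason to prefer it over MERGE when B fits in memory. With the sample data above, in_B should be 1 for x and z and 0 for y.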
Is there a way in SAS to create disjoint data sets using the SET statement?
I have tried:
DATA OnlyFirst OnlySecond InBoth;
SET firstds(IN=A)
seconds(IN=B);
IF A AND NOT B THEN OUTPUT OnlyFirst;
IF B AND NOT A THEN OUTPUT OnlySecond;
IF A AND B THEN OUTPUT InBoth;
Run;
But this does not create disjoint sets.
That's not how the SET statement works. SET concatenates the data sets, so each observation comes from exactly one of them and A and B are never both true. You should be able to use a MERGE instead, provided you first make sure firstds and seconds are both sorted by a key variable (or variables) they share. You then need to reference that shared variable in a BY statement.
DATA OnlyFirst OnlySecond InBoth;
merge firstds(IN=A)
seconds(IN=B);
by <something shared variable>;
IF A AND NOT B THEN OUTPUT OnlyFirst;
IF B AND NOT A THEN OUTPUT OnlySecond;
IF A AND B THEN OUTPUT InBoth;
Run;
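To make this concrete, here is a small self-contained sketch. The data set names come from the question, but the key variable id and the sample values are assumptions made up for the illustration.

```sas
/* Hypothetical sample data; id is an assumed shared key */
data firstds;
  input id;
  datalines;
1
2
3
;
run;

data seconds;
  input id;
  datalines;
2
3
4
;
run;

/* MERGE with BY requires both inputs sorted by the key */
proc sort data=firstds; by id; run;
proc sort data=seconds; by id; run;

data OnlyFirst OnlySecond InBoth;
  merge firstds(in=A)
        seconds(in=B);
  by id;
  /* Exactly one branch runs per observation, so the three outputs are disjoint */
  if A and not B then output OnlyFirst;
  else if B and not A then output OnlySecond;
  else output InBoth;
run;
```

With this data, OnlyFirst should contain id 1, OnlySecond id 4, and InBoth ids 2 and 3.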