create a stored procedure within SAS - sas

I have 100 insert statements like these:
INSERT INTO table_A (col1,col2,col3) VALUES ('ab','jerry',123);
INSERT INTO table_A (col1,col2,col3) SELECT col1,col2,col3 FROM Test WHERE col1='ab';
INSERT INTO table_B (col1,col2,col3) SELECT loc1,loc2,loc3 FROM Test_v2 WHERE loc2='ab';
I'm running the queries every 2 months. The WHERE clauses are not changing, and the recipient table is being deleted every 2 months too, making it a clean slate.
I've been looking around the internet, but it does not seem possible to create the equivalent of a SQL stored procedure and be able to run it once in a while.
Or is it ...?
If it doesn't exist, I'm willing to rewrite it but I want to make sure that it does not exist before doing so.
TIA.

This depends on your setup. If you have a SAS Server (including a metadata server), you can create stored processes, which is a direct analogue. See this paper or the documentation.
If your main concern is repeatability, you should just use a macro. If, on the other hand, you're interested in scheduling, you have two major options.
First, a .sas program can be scheduled in batch mode very easily; see Batch processing under Windows or look for a similar article for your operating system of choice. This entails simply setting up a .bat program that will execute your .sas program, and then asking the Windows scheduler to run it however often you need.
Second, an Enterprise Guide process flow can be scheduled via a handy tool built into the program. Go to File -> Schedule, or right-click on a process flow and select Schedule. This will create a .vbs and register it with the Windows scheduler.
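Whichever route you choose, the repeatable part itself can live in a macro, as mentioned above. Here is a minimal sketch using the tables from your question; the DELETE statements stand in for your existing clean-slate step (drop them if the tables are already emptied elsewhere), and librefs need to be adjusted to your environment:

%macro refresh_tables;
    proc sql;
        /* clean slate, as described in the question */
        delete from table_A;
        delete from table_B;

        /* the ~100 INSERT statements go here */
        insert into table_A (col1, col2, col3)
            values ('ab', 'jerry', 123);
        insert into table_A (col1, col2, col3)
            select col1, col2, col3 from Test where col1 = 'ab';
        insert into table_B (col1, col2, col3)
            select loc1, loc2, loc3 from Test_v2 where loc2 = 'ab';
    quit;
%mend refresh_tables;

%refresh_tables

This same program can then be the one you schedule in batch or from Enterprise Guide.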

Related

Select stmt in source qualifier along with procedure call in Informatica

We have a situation where we are dealing with a relational source (Oracle). The system is developed in such a way that we have to first execute a package which enables data reads from Oracle; only then will the user be able to get results out of a select statement. I am trying to find a way to implement this in an Informatica mapping.
What we tried
1. In Pre-SQL we tried to execute the package, and in the SQL query we wrote the select statement - data does not get loaded into the target.
2. In Pre-SQL we wrote a block in which we execute the package, and just after that (within the same begin...end block) we wrote an insert statement on top of the select statement - this does insert data through the insert statement, but I am not in favor of this solution, as both source and target are dummies, which will confuse people in the future.
Is there any possibility to implement this somehow using the 1st option?
Please help and suggest.
Thanks
The Stored Procedure transformation is there for this purpose; configure it to execute at Source Pre-load.
Pre-SQL and the data read are not part of the same session. From what I understand, this needs to be done within the same session, as otherwise the read access is granted only for that session.
What you can do is create a stored procedure/package that will grant the read access and then return the data. Use it as a SQL Override on your Source Qualifier (SQ). This way the SQ will read the data as usual. The concept:
CREATE OR REPLACE PROCEDURE ReadMyData (p_rows OUT SYS_REFCURSOR) AS
BEGIN
    -- run the call that grants the read access (name is illustrative)
    execute immediate 'BEGIN GiveMeTheReadAccess; END;';
    -- then hand the rows back through a ref cursor
    OPEN p_rows FOR SELECT * FROM MyTable;
END;
And use ReadMyData on the Source Qualifier.

How does database management software like Navicat select a large amount of data from a table?

I am writing a C++ program on the Linux platform. This program is something like a linux-mini-navicat, which can connect to different databases (PostgreSQL, MySQL, MSSQL, Oracle) and execute SQL. The program also starts an interface server (Thrift) so clients can connect and execute SQL commands.
When I execute "select * from table" on a table with a lot of data, maybe a million or 10 million rows or more, my program is terminated by Linux before returning the data to the client, due to running out of memory.
I am curious how Navicat achieves this, and how I can achieve it in my program.
Hope I make my question clear.
Usually there is no need to retrieve (and hold in memory) all the data from a big table in one go. If you are displaying query results, you could fetch just enough rows to fill the table on the screen, and then fetch more as the user scrolls. If you are developing some analysis algorithm, you can still analyze the table data in chunks. See the documentation on scrollable cursors for your database engine.

MySQL check if table has correct schema

I am currently developing server software in C++ with a MySQL data backend. I am using the official MySQL/connector library from Oracle to work with MySQL. The connection itself is working and I'm not having any issues with that.
My problem is that the database and the table schemas tend to change every once in a while, because new tables and columns keep getting added. Also, existing columns may be changed for the same reason. To make sure I recognize outdated server software quickly, I wanted to add a warning when the database has changed.
My first idea was to hardcode how the database (and tables and such) should look and then check whether the current database matches the hardcoded data. But I have no clue how to achieve that.
In summary I want to be able to detect whether
A table has been added or removed
A column in a table has been altered
A column in a table has been added or removed
with as little C++ code as possible. Also it should be quite easy to maintain.
Additional information will be added when required.
I would suggest the following approach:
1) Fork and execute the mysql command-line client. Set up a pair of pipes to mysql's standard input and output.
2) At this point you should be able to execute simple commands by piping them to mysql via the standard input pipe, and read the output from the standard output pipe.
You will need to make careful notes as to the output format of each mysql command, so that you know when you have finished reading its output and can send the next command.
3) As the first order of business, execute:
show tables;
The output that comes back will list all tables in the database. Parsing the output into a list of table names is trivial. Then, for each table, execute:
show create table <tablename>;
The resulting output shows all fields in the table, its keys, and constraints. Pretty much all of this table's schema. Lather, rinse, repeat, for every table.
4) In this manner you can capture a basic schema of the entire database, for comparison purposes. If necessary, use the same approach to capture the triggers, and other objects. You'll likely need to do some minor massaging of the data, and exclude a few bits. "show create table", for example, will include the current AUTO_INCREMENT values, which you can ignore.
This general approach, of driving a mysql process via its standard input and output, is a bit wobbly, of course. With a little bit of work, you can use mysql's native client library to execute all of these commands and capture their results directly. This should be more reliable.

In Enterprise Guide, how do you reopen a previous data step's output to view it?

I'm using Enterprise Guide 4.3.
When you run a data step, the resulting output opens in a spreadsheet-like table.
Then, when you run a proc tabulate or similar, the spreadsheet-like view of the data disappears and the table comes up in SAS Report or HTML form etc.
You can then run further commands on that dataset that was created in the data step.
Q. How can you get that spreadsheet-like view of the dataset back? (assuming it's possible)
I know you can run the data step again and it will display it but that seems really inefficient, especially if the data step had lots of computations involved. The data is obviously 'sitting there' given you can still interact with it (with proc tabulate etc). I was really surprised to see that it drops off from the process flow view.
Apologies if I've named things poorly above; I'm an R user beginning to dabble in SAS.
If I understood you correctly, you run some code and the result comes up. Then you run some other piece of code from the same Code node, and the initial result gets removed from the process flow.
You can always find your dataset in the Server List. You can enable it by clicking View -> Server List.
There is also a trick that you can do. When you run your code and the dataset node is created in the process flow, you can do a simple query on it. Just do Right click -> Filter and query and make it do something simple that won't take too long.
Now, when you run your next piece of code, this node will not be replaced (at least this is what happens in EG 4.1).
If you mean viewing the resulting data set from a DATA STEP, choose View/Process Flow and double click on the data set you want to view. Also, within your program, log, data or result view, there should be tabs across the top that allow you to bring up the other items of the process flow.

Tell SAS not to add newly generated tables to the Process Flow

I have SAS code that creates a lot of intermediary tables for my calculations. The thing is, I don't really care about these tables after the job is done; I only care about the final results.
But every time I run this code, SAS adds all the generated tables to my process flow, turning it into a huge mess (I am talking here about 40+ intermediary tables).
Is there a way to tell SAS not to add some tables to the process flow? Or at least to tell it not to add any tables at all? I am using SAS Enterprise Guide 4.1
Thanks in advance
Under SAS 9.1.x and 9.2.x (for Windows), it's possible to suppress the display of datasets in SAS client environments by prefixing the dataset name with "_TO". So in your code and/or tasks, you could call all your intermediate datasets _TO<DataSetName>, and they won't clutter up your process flow. But they will still be there and can be referenced in code and tasks.
If you do this and you're using tasks, note that it might be tricky to work out how to use the output data from a task as the input for another, if you can't see the dataset to select it. If you have trouble with this, comment on this post and we can address that.
Note that this "_TO" prefix thing is an undocumented, "hidden" feature that is to be deprecated in 9.3 - see this blog for details.
If you set the option "Maximum Number of output data sets to add to the project" (under Results General) to zero, it will not add any datasets to the project, but they'll still be available to view from the Server -> Library view (they'll be added to the flow at the point you request them).
I know this question is a year and a half old now, but if you are working with intermediate tables that can be deleted after you get the final results, SAS EG has a built-in macro you can use for deleting these tables:
%_eg_conditional_dropds([table1], [table2], ... ,[table-n]);
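For example, with made-up WORK table names:

%_eg_conditional_dropds(work.stage1, work.stage2);

If you prefer not to rely on the EG-specific macro, a roughly equivalent cleanup in Base SAS is:

proc datasets lib=work nolist;
    delete stage1 stage2;
quit;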