SAS stored process server vs workspace server

SAS has a stored process server that runs stored processes and a workspace server that runs SAS code. But a stored process is nothing more than a set of SAS code statements, so why can't the workspace server run stored processes as well?
I am trying to understand why SAS developers came up with the concept of a separate server just for stored processes.

A stored process server reuses the SAS process between runs. It is a stateless server meant to run small pre-written programs and return results. The server maintains a pool of processes and allocates requests to that pool. This minimizes the time to run a job because there is no process startup/shutdown overhead.
A workspace server is a SAS process that is started for one user. Every user connection gets a new SAS process on the server. This server is meant to run more interactive work, where a user runs something, looks at the output, and then runs something else. Code does not have to be pre-written and stored on the server. In that scenario, startup time is not a limiting factor.
Also, a workspace server can provide additional access to the server. A programmer can use this server to access SAS data sets (via ADO in .NET or JDBC in Java) as well as files on the server.
So there are 2 use cases and these servers address them.

From a developer's perspective, the two biggest differences are:
Identity. The stored process server runs under the system account (&sysuserid) configured in the SAS General Servers group, sassrv by default. This will affect permissions (e.g. database access) at the OS level. Workspace sessions always run under the credentials of the client account (the logged-in user).
Sessions. The option to retain 'state' by keeping your session alive for a set time period (and accessing the same session again using a session id) is available only for the Stored Process server. However, avoid this pattern at all costs! Such a session ties up one of your MultiBridge ports and plays havoc with the load balancing. It's also a poor design choice.
Both stored process and workspace servers can be configured to provide pooled sessions (generic sessions kept alive to be re-used, avoiding startup cost for frequent requests).
To further address your points: a Stored Process is a metadata object that points to (or can also contain) raw SAS code. A stored process can run on either type of server (stored process or workspace). The choice depends on the functional needs described above, plus performance considerations arising from your pooling and load balancing configuration.

Related

Appropriate architecture for event logging in a game

I'm trying to modify a game engine so that it records events (like key presses) and stores them in a MySQL database on a remote server. The game engine is written in C++, and I currently have the following straightforward architecture, using mysql++ to directly INSERT records into appropriate databases:
Unfortunately, there's a very large overhead when connecting to the MySQL server, and the game stops for a significant amount of time. Pushing a batch of X seconds' worth of events to the server causes a significant delay in gameplay (60s worth of events can take 12s to synchronise). There are also apparently security concerns with leaving the MySQL port publicly accessible.
I was considering an alternative option, instead sending commands to the server, which can interact with the database in its own time:
Here the game would only send the necessary information (e.g. the table to update and the data to insert). I'm not sure whether the speed increase would be sufficient, or what system would be appropriate for managing the commands sent from the game.
Someone else suggested Log4j, but obviously I need a C++ solution. Is there an appropriate existing framework for accomplishing what I want?

Most applications gathering user-interface interaction data (in your case keystrokes) put it into a local file of some sort.
Then at an appropriate time (for example at the end of the game, or the beginning of another game), they POST that file, often in compressed form, to a publicly accessible web server. The software on the web server decompresses the data and loads it into the analytics system (the MySQL server in your case) for processing.
So, I suggest the following.
Stop making your MySQL server's port available to people you don't know and trust.
Get your game to gather keystrokes locally somehow.
Get it to upload that data in big bunches when your game is not in realtime mode.
Write a web service to receive and interpret these files.
That way you'll build a more secure analytics system and a more responsive game.
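The gather-locally-then-upload idea can be sketched roughly as follows. This is only an illustration in Python rather than C++, and the names `EventLog`, `drain_compressed`, and `decode_batch` are made up for the example. The actual HTTP POST is left out; any HTTP client could send the compressed bytes that `drain_compressed` returns.

```python
import gzip
import json
import time


class EventLog:
    """Collects gameplay events in memory so the game loop never touches the network."""

    def __init__(self):
        self._events = []

    def record(self, kind, payload):
        # Cheap in-memory append only; no I/O happens during gameplay.
        self._events.append({"t": time.time(), "kind": kind, "data": payload})

    def drain_compressed(self):
        """Return all buffered events as gzip-compressed JSON lines and clear the buffer."""
        lines = "\n".join(json.dumps(e) for e in self._events)
        self._events = []
        return gzip.compress(lines.encode("utf-8"))


def decode_batch(blob):
    """Web service side: decompress an uploaded batch back into a list of event dicts."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]
```

The game would call `record` during play and `drain_compressed` only at a quiet moment such as the end of a level. The web service would call something like `decode_batch` and then do the database inserts in its own time.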

Which SAS servers are involved in servicing requests from enterprise Guide and DI studio

Behind the scenes, SAS has the following servers:
1. Metadata Server
2. Workspace Server
3. Stored Process Server
4. OLAP Server
When we run a macro or a stored process in Enterprise Guide, does it use the Workspace Server, which internally uses the Metadata Server and the Stored Process Server?
When we run an ETL job on DI Studio, which servers service the request?

When you start EG or DI you initially connect to your metadata server. The metadata server knows who the users are, where the data resides, and how to connect to SAS workspace servers and SAS Stored Process servers.
When you hit the submit button in a project or job from EG or DI, the client connects to an Object Spawner (daemon) to launch a SAS workspace in which your SAS code is executed. The Stored Process server is not involved. The SAS metadata server is only involved in checking permissions and helping the client application find its object spawner.
There are a couple of cases where you can touch a Stored Process server. This typically happens when you ask to run a stored process or convert a job or program to a stored process. Unfortunately, SAS has made this a bit complicated by allowing a "stored process" to run on either a Stored Process server or a SAS Workspace server. This is a regrettable name choice, but something we all need to deal with when using this software stack.

Access .ldb file & multiple connection.

I have an API which opens an access database for read and write. The API opens the connection when it's constructed and closes the connection when it's destructed. When the db is opened an .ldb file is created and when it closes it's removed (or disappears).
There are multiple applications using the API to read and write to the access db. I want to know:
Is the .ldb file used to track multiple connections?
Does calling db.close() close all connections or just one instance?
Will there be any sync issues with the above approach?

db.Close() closes one connection. The .ldb is automatically removed when all connections are closed.
Keep in mind that while Jet databases (i.e. Access) do support multiple simultaneous users, they're not well suited to a very large concurrent user base; for one thing, they are easily corrupted when there are network issues. I'm actually dealing with that right now. If it comes to that, you will want to use a database server.
That said, I've used Jet databases in that way many times.
Not sure what you mean when you say "sync issues".

Yes, it is required when multiple users open the database in shared mode. The name seems to stand for "Lock Database". See more info in MSDN: Introduction to .ldb files in Access 2000.
Close() closes only one connection; the others are unaffected.
Yes, it's possible if you try to write records that another user has locked. However, the data will remain consistent; you will just receive an error about a write conflict.
MS Access is actually not the best solution for a multi-connection usage scenario.
You may want to take a look at SQL Server Compact, which is a light version of MS SQL Server. It runs in-process and supports multiple connections, multithreading, and most of the robust T-SQL features (excluding stored procs).

As an additional note to otherwise good answers, I would strongly recommend keeping a connection to a dummy table open for the lifetime of the client application.
Closing connections too often and allowing the lock file to be created/deleted every time is a huge performance bottleneck and, in some cases of rapid access to the database, can actually cause queries and inserts to fail.
You can read a bit more in this answer I gave a while ago.
When it comes to performance and reliability, you can get quite a lot out of Access databases providing that you keep some things in mind:
Keep a connection open to a dummy table for the duration of the life of the client (or at least use some timeout that would close the connection after like 20 seconds of inactivity if you don't want to keep it open all the time).
Engineer your client apps to properly close all connections (including the dummy one when it's time to do so), whatever happens (e.g. crash, user shutdown, etc.).
Leaving locks in place is not good, as it could mean that the client has left the database in an unknown state, and could increase the likelihood of corruption if other clients keep leaving stale locks.
Compact and repair the database regularly. Make it a nightly task.
This will ensure that the database is optimised, and that any stale data is removed and open locks properly closed.
Good, stable network connectivity is paramount to data integrity for a file-based database: avoid WiFi like the plague.
Have a way to kick out all clients from the database server itself.
For instance, have a table with a MaintenanceLock field that clients poll regularly. If the field is set, the client should disconnect after giving the user an opportunity to save their work.
Similarly, when a client app starts, check this field in the database to allow or disallow the client to connect to it.
Now you can kick out clients at any time without having to go to each user and ask them to close the app. It's also very useful for ensuring that no clients left open overnight are still connected to the database when you run Compact & Repair maintenance on it.
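The MaintenanceLock polling idea might look something like the sketch below. It is written in Python, with an in-memory SQLite database standing in for the Access file so that it runs anywhere. The table name `AppControl` and all function names are hypothetical; only the `MaintenanceLock` field comes from the description above.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for the shared database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE AppControl (MaintenanceLock INTEGER NOT NULL)")
    conn.execute("INSERT INTO AppControl VALUES (0)")
    conn.commit()
    return conn


def maintenance_requested(conn):
    """Client side: return True when the administrator has set the flag."""
    row = conn.execute("SELECT MaintenanceLock FROM AppControl").fetchone()
    return bool(row and row[0])


def set_maintenance(conn, locked):
    """Administrator side: raise or clear the flag."""
    conn.execute("UPDATE AppControl SET MaintenanceLock = ?", (1 if locked else 0,))
    conn.commit()
```

A client would call `maintenance_requested` once at startup (refusing to connect if it returns True) and then on a timer. When the flag appears, it prompts the user to save and closes its connections, including the dummy one.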

Cache data on Multiple Hosts in AppFabric

Let me first explain that I am very new to using AppFabric to improve the responsiveness of an application. I am trying to configure a server cluster with 2 nodes, using the XML provider on a network shared location.
My requirement is that the cached data should be created on both hosts, so that if one host is down the other host in the cluster can still serve the request and provide the cached data. As I said, I have 2 hosts in my cluster and one of them is defined as the lead host. When I save data in the cache I cannot see the data on both hosts (I'm not sure whether there is a specific command that shows the data on a specific host). So what I want to test is this: I'll stop one of the cache hosts and see whether I am still able to get the data from the second cache host.
thanks in advance
-Nitin

What you're talking about here is High Availability. To enable this, you'll need to be running Windows Server Enterprise Edition - if you're on Standard Edition then you just can't do it. You also really need a minimum of three hosts, so that if one goes down there are still two copies of your cached data to provide failover. If you can meet these requirements then the only extra step to create a highly-available cache is to set the Secondaries flag when you call new-cache e.g.
new-cache myHACache -Secondaries 1
There's no programmatic way to query what data is held on a specific host, because you only ever address the logical cache, not an individual physical host.

From our experience, using SQL authentication to the database does not work. It's clearly stated that only the Integrated Security option is supported. We also faced issues with the service running with "Integrated Security", since our SQL cluster was running under a domain account and AppFabric needs to run under "Network Service", and we couldn't successfully connect to the SQL cluster from the AppFabric service.
This was a painful experience for us, and I hope AppFabric caching improves the way it sends out "error messages and error codes", and also allows us to decide how we want to connect to SQL. It's kind of stupid having to undergo the pain of "has to run as Network Service" and "no SQL authentication".

Target IIS Worker Processes on Request

Ok, strange setup, strange question. We've got a Client and an Admin web application for our SaaS app, running on asp.net-2.0/iis-6. The Admin application can change options displayed on the Client application. When those options are saved in the Admin we call a Webservice on the Client, from the Admin, to flush our cache of the options for that specific account.
Recently we started giving our Client application >1 Worker Processes, thus causing the cache of options to only be cleared on 1 of the currently running Worker Processes.
So, I obviously have other avenues of fixing this problem (however input is appreciated), but my question is: is there any way to target/iterate through each Worker Processes via a web request?

I'm making some assumptions here for this answer....
I'm assuming the client app is using one of the .NET caching classes to store your application's options?
When you say 'flush' do you mean flush them back to a configuration file or db table?
Because the cache objects and data won't be shared between processes, you need a mechanism to signal to the code running on the other worker processes that it needs to re-read its options into its cache, or else force the process to restart (which is not exactly convenient and most likely undesirable).
If you don't have access to the client source to modify to either watch the options config file or DB table (say using a SqlCacheDependency) I think you're kinda stuck with this behaviour.
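One way to picture the "watch a file" signalling idea is the sketch below. It is Python, not ASP.NET, and it does not use .NET's Cache or SqlCacheDependency classes; `SignalFileCache` and `signal_flush` are invented names. It assumes every worker process can read a small shared file. Each process keeps its own cache and empties it whenever the token stored in that file changes.

```python
import uuid


class SignalFileCache:
    """Per-process cache that empties itself when a shared signal file changes."""

    def __init__(self, signal_path):
        self._signal_path = signal_path
        self._seen_token = self._read_token()
        self._items = {}

    def _read_token(self):
        # A missing file counts as "never flushed".
        try:
            with open(self._signal_path, "r", encoding="utf-8") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def get(self, key, loader):
        """Return the cached value, reloading through loader after a flush signal."""
        token = self._read_token()
        if token != self._seen_token:
            self._items.clear()
            self._seen_token = token
        if key not in self._items:
            self._items[key] = loader(key)
        return self._items[key]


def signal_flush(signal_path):
    """Admin side: write a fresh token so every process drops its cache on next access."""
    with open(signal_path, "w", encoding="utf-8") as f:
        f.write(uuid.uuid4().hex)
```

The trade-off is one small file read per lookup instead of one request that has to reach every worker process. In this sketch a process notices the flush lazily, the next time it is asked for a value.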

I have full access to admin and client. By cache, I mean .NET's Cache object. By flush, I mean removing the item from the Cache object.
I'm aware that the worker processes don't share the cache data. That's sort of my conundrum.
The system is the way it is to remove the need to hit sql every new-session that comes in. So I'm trying to find a solution that can just tell each worker process that the cache needs to be cleared w/o getting sql involved.
