I noted that Sybase SQL Anywhere supports them, but can't find any documentation on ASE also doing so.
If it doesn't, what would be my best option for designing a recursive query? In SQL Server 2008 I'd do it with a CTE, but if that's not available? A function perhaps?
Sybase ASE 12.5 (and also 15.0) doesn't support CTEs.
You can use an inner query (a derived table) to solve your problem.
A simple CTE like this:
WITH Sales_CTE (Folders_Id)
AS
-- Define the CTE query.
(
SELECT Folders_Id FROM Folders
)
SELECT Folders_Id FROM Sales_CTE
is the same as this:
SELECT aux.Folders_Id
FROM (SELECT Folders_Id FROM Folders) aux
For a little more info, check this!
Since 1984, the Standard, and Sybase, have allowed for full recursion. We generally perform recursion in a stored proc, so that depth is controlled, infinite loops are avoided, and execution is faster than uncompiled SQL, etc.
Stored procs have no limits on recursion, result set construction, etc. Of course, defining the content of the brackets as a View would make it faster again (it is, after all, a real View, not one that we have to materialise every time we need it).
The point being: if you are used to this method (recursion in the server, via a proc coded for recursion), as I am, there is no need for CTEs, with their new syntax, uncompiled speed, temp tables, work tables, and a cursor "walking" the hierarchy, all resulting in horrendous performance.
A recursive proc reads only the data, and nothing but the data, and reads only those rows that qualify at each level of the recursion. It does not use a cursor. It does not "walk", it builds the hierarchy.
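To make the shape of that concrete, here is a minimal sketch (not the author's actual proc): it assumes a Folders table with a Parent_Id column and a #tree temp table that the caller creates and seeds with the root rows at depth 0. Each call adds one whole level with a single set-based INSERT, then calls itself while rows keep appearing.

-- Assumed schema: Folders(Folders_Id, Parent_Id); #tree(Folders_Id, Depth) created by the caller.
-- (Create #tree in the session before creating the proc, or use deferred name resolution.)
CREATE PROCEDURE build_folder_tree
    @level int = 0                    -- depth of the rows added by the previous call
AS
BEGIN
    DECLARE @rows int

    -- Set-based step: add every folder whose parent sits at the current level
    INSERT INTO #tree (Folders_Id, Depth)
    SELECT f.Folders_Id, @level + 1
    FROM Folders f, #tree t
    WHERE f.Parent_Id = t.Folders_Id
      AND t.Depth = @level

    SELECT @rows = @@rowcount         -- capture before another statement resets it

    -- Recurse only while the hierarchy keeps producing rows, so depth is controlled
    IF @rows > 0
        EXEC build_folder_tree @level + 1
END

The caller seeds #tree with the root folders (depth 0) and executes the proc once.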
A second option is to use Dynamic SQL. Simply construct the SELECTs, one per level of the hierarchy, and keep adding the UNIONs until you run out of levels; then execute.
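A rough sketch of that idea, again with an assumed Folders(Folders_Id, Parent_Id) shape and only three levels written out:

DECLARE @sql varchar(2000)

SELECT @sql =
    'SELECT f1.Folders_Id, 1 AS Depth FROM Folders f1 WHERE f1.Parent_Id IS NULL' +
    ' UNION ALL ' +
    'SELECT f2.Folders_Id, 2 FROM Folders f1, Folders f2 ' +
    'WHERE f1.Parent_Id IS NULL AND f2.Parent_Id = f1.Folders_Id' +
    ' UNION ALL ' +
    'SELECT f3.Folders_Id, 3 FROM Folders f1, Folders f2, Folders f3 ' +
    'WHERE f1.Parent_Id IS NULL AND f2.Parent_Id = f1.Folders_Id AND f3.Parent_Id = f2.Folders_Id'

EXEC (@sql)    -- execute-immediate form, available in ASE 12.x and later

In practice you would build the string in a loop, appending one SELECT per level until a probe query at the next depth returns no rows.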
You can use a function to provide the facility of a CTE, but do not do so. A function is intended for a different, column-oriented purpose, and its code is subject to those constraints: it is scalar, good for constructing a column value. Stored procs and CTEs are row-oriented.
In Stata, is there a way to redirect the data that a command produces into a table instead of a graph?
Example: if someone created a normal probability plot of data with the pnorm var_name command, is there a way to redirect the data so that instead of appearing in a graph, it appears in a table?
To add to @Noobie's answer:
Different commands work in different ways. There's no better short summary.
What you can look out for includes
generate() options that produce new variables. (There is no absolute rule that the options have this name, but that or a similar name is the most common single variety.)
Options that allow saving results to new datasets.
Saved results, especially those visible after return list or ereturn list. These can be quite elaborate, e.g. saving of matrices of counts after tabulate.
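For instance, with the shipped auto dataset (the variable and matrix names below are purely for illustration), the saved results and the tabulate count matrix mentioned above look like this:

sysuse auto, clear
summarize mpg
return list                              // r(N), r(mean), r(sd), ...
tabulate rep78 foreign, matcell(counts)  // matcell() saves the cell counts
matrix list counts                       // the saved matrix of counts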
More broadly, Stata commands aren't functions! One characteristic of a function, as so named in many languages or programs, is that there is a result, with special cases where the result is void or null. There clearly are statistical programs which in broad terms hinge on calling functions which have results, and what you see displayed is often a side-effect of that. Stata commands don't work like that in the sense that the results of a program can be various. In the case of commands designed just to show something, the "result" may be a display. It's worth noting that Mata, which underlies and underpins Stata, is more recognisably a C-like language, with (e.g.) many matrix extensions, which is based on functions (and much else).
Yes and no. It really depends on the command you are using. You should look at the help files first.
For instance, pnorm does not allow that. You can create the data yourself using the formula for pnorm described in the help file, where the cumulative distribution at some point is plotted against the so-called plotting position.
Other Stata commands allow you to generate the points directly. This is the case for kdensity for instance.
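A minimal sketch of the kdensity case (auto dataset again; gridx and dens are made-up variable names):

sysuse auto, clear
kdensity mpg, generate(gridx dens) nograph   // store the evaluation points and density estimates
list gridx dens in 1/5                       // inspect the first few generated points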
I ran the following code, and an hour later, just as the code was finishing, a Sort Execute Error occurred. Is there something wrong with my code, or are my computer's processor and RAM insufficient?
proc sql;
create table today as
select a.account_number, a.client_type, a.device ,a.entry_date_est,
a.entry_time_est, a.duration_seconds, a.channel_name, b.esn, b.service_start_date,
b.service_end_date, b.product_name, b.billing_frequency_fee, b.plan_category,
b.plan_subtype, b.plan_type
from listen_nomiss a inner join service_nomiss b
on (a.account_number = b.account_number)
order by account_number;
quit;
That error is most commonly seen when you run out of utility space to perform the sort. A few suggestions for troubleshooting are available in this SAS KB post; the most useful suggestions:
options fullstimer msglevel=i; will give you a lot more information about what's going on behind the scenes, so you can troubleshoot what is causing the issue.
proc options option=utilloc; run; will tell you where the utility directory is in which your temporary files will be created for the sort. Verify that about three times the space needed for the final table is available there; because of how the sort is processed, it needs roughly 3x the dataset's size to sort properly.
options compress=yes; will save some (possibly a lot of) space if not already enabled.
proc options option=(memsize sortsize); run; will tell you how much memory is allocated to SAS, and at what size a sort is done in memory versus on disk. sortsize should be about 1/3 of memsize (given the requirement of 3x space to process the sort). If your final table is just over sortsize, you may be better off increasing sortsize if the default is too low (same for memsize).
You could also have some issues with permissions; some of the other suggestions in the KB article relate to verifying that you actually have permission to write to the utility directory, or that it exists at all. A combined sketch of the option checks above follows.
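Putting those checks together (values and results will depend on your site defaults):

/* Extra timing and informational notes in the log */
options fullstimer msglevel=i;

/* Where do the utility (sort) files go, and what are the memory/sort thresholds? */
proc options option=(utilloc memsize sortsize);
run;

/* Compress data sets created from here on to reduce disk usage */
options compress=yes;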
I've had a project in the past where resources were an issue as well.
A couple of ways around it when sorting were:
Don't forget that proc sort has a TAGSORT option, which makes it sort only the BY-statement variables first and attach everything else afterwards. Useful when many columns are not involved in the BY statement. (Sketches of all three approaches follow after this answer.)
Indexes: if you build an index on exactly the variables in your BY statement, you can use a BY statement without sorting; it will rely on the index.
Split it up: you can split the dataset into multiple chunks and sort each chunk separately. Then you run a data step that lists them all in the set statement. When you use a by statement there as well, SAS will interleave the records so that the result is also ordered according to the BY statement.
Note that these approaches come with a performance hit (maybe the third one only to a lesser extent), and indexes can give you headaches if you don't take them into account later on (or destroy them intentionally when you're done).
One note if/when you rewrite the whole join as a SAS merge: keep in mind that a SAS merge does not by itself mimic many-to-many joins (it does one-to-one, one-to-many and many-to-one). That's probably not the case here (it rarely is), but I mention it to be on the safe side.
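Rough sketches of those three approaches, with made-up data set names:

/* 1. TAGSORT: sort only the BY variables, re-attach the other columns afterwards */
proc sort data=big_table tagsort;
    by account_number;
run;

/* 2. Index: build an index so a later BY statement needs no sort at all */
proc datasets library=work nolist;
    modify big_table;
    index create account_number;
quit;

/* 3. Split and interleave: sort each chunk, then SET with BY weaves them together */
data combined;
    set chunk1 chunk2 chunk3;   /* each chunk already sorted by account_number */
    by account_number;
run;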
I am looking for a strategy to batch all my queries (with IN clause) to overcome the restrictions by databases on IN clause (See here).
I usually get lists of 100,000 to 305,000 items, so this has become very important to tackle.
I have tried two strategies so far.
Strategy 1:
Create an entity, and hence a table, with one column to hold such values (can we create temp tables on the fly with JPA 2.0 in a vendor-independent way?), and use the data from the temp table as a subquery to the original query before eventually cleaning up the temp table.
Advantage: Very performant queries. Really quick; I must admit that for the numbers I have mentioned, it was mostly under a minute.
Possible drawback: Use of temp table which is actually a permanent one in my case thus far.
Strategy 2:
Calculate the batch size for the given input list and for each batch execute the query and accumulate the result.
Advantage: No temp tables. Easy for any threads within the same transaction.
Disadvantage: A big disadvantage is the amount of time it takes to execute all the batches. For the numbers mentioned, this is at an unacceptable level at the moment: anything between 5 and 15 minutes!
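For reference, a simplified sketch of what the Strategy 2 loop looks like in my code (MyEntity stands in for the real mapped entity, which has a Long id field, and the 1,000-item batch size is just a placeholder):

import javax.persistence.EntityManager;   // JPA 2.0
import java.util.ArrayList;
import java.util.List;

public class BatchedInQuery {

    private static final int BATCH_SIZE = 1000;   // placeholder per-query IN-list cap

    /** Runs the query once per batch of ids and accumulates the results. */
    public static List<MyEntity> findByIds(EntityManager em, List<Long> ids) {
        List<MyEntity> result = new ArrayList<MyEntity>();
        for (int from = 0; from < ids.size(); from += BATCH_SIZE) {
            List<Long> batch = ids.subList(from, Math.min(from + BATCH_SIZE, ids.size()));
            result.addAll(
                em.createQuery("SELECT e FROM MyEntity e WHERE e.id IN :ids", MyEntity.class)
                  .setParameter("ids", batch)
                  .getResultList());
        }
        return result;
    }
}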
I would appreciate any feedback, suggestions or improvements from all you JPA gurus.
Thanks.
I only tested up to 50,000 integers but I have some pretty good performance data around splitting large lists using various methods, with CLR and a numbers table leading the pack at the higher end:
Splitting a list of integers : another roundup
Not sure if you are using integers or strings but the results should be roughly equivalent.
As an aside, I'll confess I have no idea what JPA 2.0 is, but I assume you can control the format of the lists that it sends to SQL Server.
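For reference, a numbers-table splitter in the general style discussed in that roundup looks roughly like this (dbo.Numbers is an assumed auxiliary table of sequential integers, and the function name is made up):

CREATE FUNCTION dbo.SplitInts (@List VARCHAR(MAX), @Delim CHAR(1))
RETURNS TABLE
AS
RETURN
(
    SELECT Item = CONVERT(INT, SUBSTRING(@List, n.Number,
           CHARINDEX(@Delim, @List + @Delim, n.Number) - n.Number))
    FROM dbo.Numbers AS n
    WHERE n.Number <= CONVERT(INT, LEN(@List))
      AND SUBSTRING(@Delim + @List, n.Number, 1) = @Delim
);

You then join to it, e.g. SELECT t.* FROM dbo.MyTable AS t JOIN dbo.SplitInts(@ids, ',') AS s ON t.id = s.Item; (MyTable and id standing in for your real names).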
When we fire a SQL query like
SELECT * FROM SOME_TABLE_NAME under ORACLE
What exactly happens internally? Is there a parser at work? Is it written in C/C++?
Can anybody please explain?
Thanks in advance to all.
Short answer is yes, of course there is a parser module inside Oracle that interprets the statement text. My understanding is that the bulk of Oracle's source code is in C.
For general reference:
Any SQL statement potentially goes through three steps when Oracle is asked to execute it. Often, control is returned to the client between each of these steps, although the details can depend on the specific client being used and the manner in which calls are made.
(1) Parse -- I believe the first action is actually to check whether Oracle has a cached copy of the exact statement text. If so, it can save the work of parsing your statement again. If not, it must of course parse the text, then determine an execution plan that Oracle thinks is optimal for the statement. So conceptually at least there are two entities at work in this phase -- the parser and the optimizer.
(2) Execute -- For a SELECT statement this step would generally run just enough of the execution plan to be ready to return some rows to the client. Depending on the details of the plan, that might mean running the whole thing, or might mean doing just a very small fraction of the work. For any other kind of statement, the execute phase is when all of the work is actually done.
(3) Fetch -- This is when rows are actually returned to the client. Generally the client has a predetermined fetch array size which sets the maximum number of rows that will be returned by a single fetch call. So there may be many fetches made for a single statement. Of course if the statement is one that cannot return rows, then there is no fetch step necessary.
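As a small illustration of the parse-and-cache step (assuming you have the privileges to query V$SQL), you can run a statement and then find its cached cursor along with its parse and execution counts:

SELECT * FROM SOME_TABLE_NAME;

-- The statement text now sits in the shared pool:
SELECT sql_text, parse_calls, executions
FROM   v$sql
WHERE  sql_text LIKE 'SELECT * FROM SOME_TABLE_NAME%';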
Manasi,
I think internally Oracle has its own parser, which does the parsing and tries to compile the query. I don't think it is related to C or C++, but I would need to confirm.
-Justin Samuel.
I am using the libodbc++ ODBC wrapper for C++, designed similar to JDBC. I have a prepared statement "INSERT INTO t1 (col1) VALUES (?)", where t1.col1 is defined as VARCHAR(500).
When I call statement->setString(1, s), the value of s is truncated to 255. I suspect the libodbc++ library, but as I am not very familiar with ODBC I'd like to be sure that the wrapper doesn't just expose a restriction of the underlying ODBC. The ODBC API reference is too complicated to be skimmed quickly and frankly I really don't want to do that, so pardon me for asking a basic question.
NOTE: an un-prepared and un-parameterized insert statement via the same library inserts a long value ok, so it isn't a problem of the MySql DB.
For long strings, use PreparedStatement::setAsciiStream() instead of PreparedStatement::setString() (a sketch follows at the end of this answer).
But when using a stream, I often encounter the error "HY104 Invalid precision value", which is annoying because I have no idea how to tackle it head-on. However, I work around it with the following steps:
1. Order the columns in the SQL statement so that non-stream columns go first.
2. If that doesn't work, split the statement into multiple ones, updating or querying a single column per statement.
But (again), in order to insert a row first and then update some columns in stream fashion, one may have to get the last insert id, which turns out to be another challenge that I have so far failed to tackle head-on...
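A sketch of the stream variant (the libodbc++ calls here are assumed from its JDBC-style API, so double-check them against your headers; the DSN and credentials are placeholders):

#include <odbc++/drivermanager.h>
#include <odbc++/connection.h>
#include <odbc++/preparedstatement.h>
#include <sstream>
#include <string>

int main() {
    // Placeholders: replace with your DSN, user and password
    odbc::Connection* con = odbc::DriverManager::getConnection("mydsn", "user", "pass");
    odbc::PreparedStatement* stmt =
        con->prepareStatement("INSERT INTO t1 (col1) VALUES (?)");

    std::string s(400, 'x');                      // longer than the 255 cut-off
    std::istringstream in(s);
    stmt->setAsciiStream(1, &in, (int)s.size());  // stream plus explicit length, instead of setString
    stmt->executeUpdate();

    delete stmt;
    delete con;
    return 0;
}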
I don't know libodbc++, but prepared statements available via the ODBC API can store more characters; I use them with a Delphi/Kylix ODBC wrapper.
Maybe there is some configuration in libodbc++ to set the value length limit? I have such a setting in my Delphi library. If you use a prepared statement, you can allocate a big chunk of memory, divide it into fields, and tell ODBC where the block for each column starts and how long it is via the SQLBindParameter() function.
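A rough sketch of that last idea using the raw ODBC C API (handle setup omitted; the 500 matches the VARCHAR(500) column in the question):

#include <sql.h>
#include <sqlext.h>
#include <cstring>

// The buffer and the length indicator must stay valid until the statement is
// executed, hence the statics in this simplified sketch.
void bind_long_string(SQLHSTMT hstmt, const char* value) {
    static char buffer[501];
    static SQLLEN length = SQL_NTS;       // null-terminated string

    std::strncpy(buffer, value, 500);
    buffer[500] = '\0';

    SQLBindParameter(hstmt,
                     1,                   // parameter number (the single '?')
                     SQL_PARAM_INPUT,
                     SQL_C_CHAR,          // C type of the buffer
                     SQL_VARCHAR,         // SQL type of the target column
                     500,                 // column size: the declared VARCHAR(500)
                     0,                   // decimal digits (not used for character data)
                     buffer,              // parameter value
                     sizeof(buffer),      // buffer length in bytes
                     &length);            // length/indicator pointer
}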