I have a query in standard SQL of bigquery:
insert into table1 (variable1) select x as variable1 from table2;
where variable1 is a float and x is an integer in table2 (I want to put an integer into a float), but I receive an error saying:
invalid schema update. Field x has changed type
The same problem occurs when putting an integer into a string. I don't want to use the CAST function. Is there any other way to get automatic type casting (like what we have in SQL Server)?
Suppose I have in SAS someTable with a column someColumn of type Character.
I can adjust length, format, informat and label in the following way:
ALTER TABLE WORK.someTable
MODIFY someColumn char(8) format=$CHAR6. informat=$CHAR6. label='abcdef'
But I doubt if this is the correct way for the following reasons:
It seems pointless that the syntax requires the type char, because the column type can't be changed with a MODIFY statement.
This code does not work if someColumn is of type Numeric or Date.
The syntax for changing length is inconsistent with the syntax for changing format/informat/label.
Actually, I expected the following code to work:
ALTER TABLE WORK.someTable
MODIFY someColumn length=8 format=$CHAR6. informat=$CHAR6. label='someLabel'
This code runs without errors but does not change the length.
Question:
What is the correct syntax to modify the length of a column using ALTER TABLE / MODIFY?
(For arbitrary column type like character/numeric/date.)
The syntax for defining the altered variable ("column") is the same as the syntax PROC SQL uses for defining a variable, which the documentation calls the "column-definition component":
column data-type <column-modifier(s)>
That is why you use the SQL syntax, char(n) or num, for specifying the type. Note that SAS datasets only have two data types: fixed length character strings and floating point numbers. SAS will automatically convert any other SQL data-type into the proper one of those.
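As a small illustration of that type mapping, the sketch below declares columns with several SQL data types and then asks PROC SQL to describe the result. The table name work.typeDemo and the column names are hypothetical; the description written to the log should show only character and numeric columns.

```sas
proc sql;
  /* Hypothetical table: four different SQL type names */
  create table work.typeDemo
    (a integer,
     b varchar(10),
     c date,
     d char(4));

  /* Writes the resulting column definitions to the SAS log */
  describe table work.typeDemo;
quit;
```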
The limitations on altering the type are spelled out in the documentation:
Changing Column Attributes
If a column is already in the table, then
you can change the following column attributes by using the MODIFY
clause: length, informat, format, and label. The values in a table are
either truncated or padded with blanks (if character data) as
necessary to meet the specified length attribute.
You cannot change a character column to numeric and vice versa. To
change a column’s data type, drop the column and then add it (and its
data) again, or use the DATA step.
Note: You cannot change the length of a numeric column with the ALTER
TABLE statement. Use the DATA step instead.
Note that to make such changes to a dataset SAS will have to create a whole new dataset. So you might as well just write a data step to create the new dataset and then you will have full control.
Also be careful if you change the length of character variable to make sure that the attached FORMAT is still correct.
In your example you are changing the variable to be 8 bytes long, but are attaching a format that will only display the first 6 bytes.
In general it is best to not attach formats to character variables to avoid the confusion that type of mismatch can cause. Unfortunately there is no way to remove the attached format using PROC SQL. The best you could do is to set the format to $., that is without an explicit width. If you want to completely remove the format you will need to use a FORMAT statement in PROC DATASETS or a data step.
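A minimal sketch of the DATA step route, assuming the WORK.someTable and someColumn names from the question and that the column is character. The LENGTH statement has to come before SET so that it defines the variable first; a side effect is that someColumn moves to the front of the variable order, and SAS may warn about possible truncation if the new length is shorter.

```sas
data work.someTable;
  length someColumn $8;        /* new length, declared before SET */
  set work.someTable;
  format someColumn;           /* no format name: removes the attached format */
  informat someColumn;         /* same for the informat */
  label someColumn = 'someLabel';
run;
```

If only the format needs to go and the data should not be rewritten, PROC DATASETS can change the attribute in place:

```sas
proc datasets library=work nolist;
  modify someTable;
  format someColumn;           /* removes the format without copying the data */
quit;
```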
We are working on automating a big data pipeline on GCP and are ingesting some CSV files. To keep the process from breaking at the BQ level because of schema problems, we ingested the first table with all columns converted to the STRING type.
Is there a graceful way in BQ to convert the schema of the table just ingested, so that we can change the STRING columns to their actual types such as INT64, FLOAT, etc.?
Is it a good approach?
There is no way of changing a data type without rewriting the whole table. You can run SQL like this:
CREATE TEMP FUNCTION myFunctionStringToFloat(x STRING)
AS (
  -- Replace this with any non-trivial logic you need to safely convert STRING to FLOAT64.
  -- SAFE_CAST returns NULL instead of raising an error when a value cannot be converted.
  SAFE_CAST(x AS FLOAT64)
);
CREATE OR REPLACE TABLE myTable
AS SELECT * EXCEPT(col1), myFunctionStringToFloat(col1) AS col1 FROM myTable;
You will be charged for scanning the table, though. The other way is to keep your CSV very clean and make sure the table load succeeds with a FLOAT column.
You can try the GCS/BQ transfer service and define your schema ahead. If there are failures you can get notifications.
PROC SQL;
SELECT end_dt-start_dt as EXPOSURE,
(CASE WHEN (EXPOSURE)<0 THEN 0 ELSE CAST(TRUNC((EXPOSURE)/30+0.99) as
INTEGER) END as bucket) FROM TABLE
This statement works fine in SQL but throws an error in proc sql at both 'as'.
CAST is not a valid SAS SQL function. Use the appropriate SAS SQL function, in this case likely INT(), to convert calculation to an integer value.
If you'd like to use your DB SQL you need to use SAS SQL Pass Through which will pass the code directly to your database, but then the entire query must be valid on that database.
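A rough sketch of explicit pass-through, to show the shape only. The ODBC engine, the DSN name myDsn, the alias mydb and the table name some_table are all placeholders, not taken from the question; substitute the SAS/ACCESS engine and connection options for your own database. Everything inside the inner parentheses is sent to the database unchanged, so it must be written in that database's dialect.

```sas
proc sql;
  /* Placeholder connection: engine, DSN and alias are assumptions */
  connect to odbc as mydb (dsn=myDsn);

  select *
    from connection to mydb
      ( /* database-side SQL: CAST is allowed here if the database supports it */
        select end_dt - start_dt as exposure,
               case when end_dt - start_dt < 0 then 0
                    else cast((end_dt - start_dt) / 30 + 0.99 as integer)
               end as bucket
        from some_table );

  disconnect from mydb;
quit;
```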
SAS has attributes for every field, such as Length, Format and Informat. They control how a value is stored, displayed and read from a data source.
Your PROC SQL would not require a type cast. Instead, attach a format to the column.
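PROC SQL;
SELECT end_dt - start_dt AS EXPOSURE,
       /* CALCULATED is required to reuse a column alias in the same SELECT. */
       /* INT() alone truncates to an integer; SAS TRUNC() takes two arguments and does something else. */
       CASE WHEN CALCULATED EXPOSURE < 0 THEN 0
            ELSE INT(CALCULATED EXPOSURE / 30 + 0.99)
       END AS bucket FORMAT=8.
FROM TABLE;
QUIT;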
I'm not sure of the syntax of the whole statement as I couldn't test it, although the overall idea holds true.
I am using the query below in SAS Enterprise Guide to count the distinct customers for each offer_id within a date range:
PROC SQL;
CREATE TABLE test1 as
select offer_id,
(Count(DISTINCT (case when date between '2016-11-13' and '2016-12-27' then customer_id else 0 end))) as CUSTID
from test
group by offer_id
;QUIT;
ERROR: Expression using IN has components that are of different data types
Note: Here, offer_id is a character variable whereas customer_id is a numeric variable.
Most likely the error is caused by comparing the numeric variable DATE to the character strings '2016-11-13'. If you want to specify a date literal in SAS you must specify the date in DATE9 format and append the letter D after the close quote.
date BETWEEN '13NOV2016'd AND '27DEC2016'd
Note that there is no reference to any external database in the posted code. But even if your source table was tdlib.tdtable instead of work.test you still need to use SAS syntax when writing SAS code. Let the Teradata engine figure out how to convert it for you.
You don't make it clear whether this is being run on SAS or Teradata (via pass through).
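I'm guessing SAS, in which case you are missing the d suffix after your dates, and the literal itself has to be written in DATE form (e.g. '13NOV2016'd; '2016-11-13'd is not a valid date literal). Without this, the dates are treated as text instead of numeric date values.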
The error statement is slightly misleading, as SAS is treating the between statement as an in statement.
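Putting the date-literal advice into the original query gives something like the sketch below. It assumes DATE is a numeric SAS date variable in WORK.TEST. One deliberate change from the question: the ELSE branch returns a missing value (.) rather than 0, because COUNT(DISTINCT ...) ignores missing values but would count 0 as one more customer.

```sas
proc sql;
  create table test1 as
  select offer_id,
         count(distinct
               case
                 when date between '13NOV2016'd and '27DEC2016'd
                   then customer_id
                 else .              /* missing values are not counted */
               end) as custid
  from test
  group by offer_id;
quit;
```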
I have a SAS dataset with a numeric variable ACCT_ID (among other fields). Its attributes in a PROC CONTENTS are:
#  Variable  Type  Len  Format  Informat  Label
1  ACCT_ID   Num   8    19.     19.       ACCT_ID
I know that this field doesn't have any non-integer values in it, so I want to store it as a BIGINT in Teradata, and I've specified this with the dbtype data set option like this:
data td.output(dbtype=(ACCT_ID="BIGINT", <etc etc>));
However, this gives the following error:
ERROR: Datatype mismatch for column: ACCT_ID.
There are no missing or non-integer values in that field, and the error persists even if I round ACCT_ID using round(acct_id, 1) to explicitly remove any floating point values that could exist.
Strangely enough, no error is given if I assign this to be a DECIMAL(18,0) in Teradata rather than a BIGINT. I guess that could be one workaround, but I'd like to understand how I can create integer fields in Teradata from SAS numeric variables like this given that SAS doesn't distinguish types between integer and floating point.
SAS does not support the BIGINT datatype. See http://support.sas.com/kb/34/729.html.
Teradata's BIGINT data type is not supported in SAS/ACCESS Interface
to Teradata. You cannot read or update a table containing a column
with the BIGINT data type in SAS/ACCESS Interface to Teradata.
Attempting to do so generates the following error message:
ERROR: At least one of the columns in this DBMS table has a datatype that is
not supported by this engine.
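Given that limitation, the DECIMAL(18,0) workaround mentioned in the question is probably the practical option. A minimal sketch, assuming the td libref and ACCT_ID column from the question and a hypothetical source dataset work.input:

```sas
/* DBTYPE= overrides the Teradata column type SAS would otherwise choose */
data td.output (dbtype=(ACCT_ID="DECIMAL(18,0)"));
  set work.input;   /* hypothetical source dataset */
run;
```

Since SAS stores numerics as 8-byte floating point, integers beyond roughly 15 to 16 significant digits may not be represented exactly, which is worth checking for long account numbers.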