I have a Redshift table with a column that occasionally contains empty values. The column is expected to hold only integer values, but empty strings appear in places. When I try to cast it using :: it throws this error -
[Code: 500310, SQL State: XX000] [Amazon](500310) Invalid operation: Invalid digit, Value 'B', Pos 0, Type: Integer
Details:
-----------------------------------------------
error: Invalid digit, Value 'B', Pos 0, Type: Integer
code: 1207
context: BEVEL_ON
query: 34112149
location: :0
process: query1_836_34112149 [pid=0]
-----------------------------------------------;
So to clarify: you have a text column that contains numeric characters most of the time, and you want to cast this to integer, right? It also sounds like you believe that the only non-numeric value is the empty string ''.
If this is the case then the solution is fairly simple - change the empty string to NULL before casting. The DECODE statement is my go-to for this:
DECODE(col_X, '', NULL, col_X)::INT
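NULLIF(col_X, '') is equivalent here, returning NULL when the two arguments match; a minimal sketch (my_table is a placeholder name):
SELECT NULLIF(col_X, '')::INT
FROM my_table;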
If a more varied set of strings is in the column, then using regexp_replace() to strip all the non-numeric characters would be needed:
text_to_int_alt(
    case
        when regexp_replace(convert(varchar, creative_id), '[^0-9]', '') <> ''
        then regexp_replace(convert(varchar, creative_id), '[^0-9]', '')
    end)
I have a table with a text field that holds formatted strings representing money.
For example, it will have values like these, but also "bad", invalid data as well:
$5.55
$100050.44
over 10,000
$550
my money
570.00
I want to convert this to a numeric field, keeping the actual numbers where they can be recovered and converting any that can't to null.
I was originally using the function below, which did convert clean numbers (numbers that didn't have any formatting). The issue was that it would not convert $5.55, for example, and set this to null.
CREATE OR REPLACE FUNCTION public.cast_text_to_numeric(
    v_input text)
    RETURNS numeric
    LANGUAGE plpgsql
    COST 100
    VOLATILE
AS $BODY$
declare
    v_output numeric default null;
begin
    begin
        -- attempt the cast; any failure is trapped by the handler below
        v_output := v_input::numeric;
    exception when others then
        return null;
    end;
    return v_output;
end;
$BODY$;
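Called directly, the behaviour described above looks like this (a quick sketch against the function as defined):
SELECT public.cast_text_to_numeric('570.00');  -- 570.00
SELECT public.cast_text_to_numeric('$5.55');   -- NULL: '$5.55' is not valid numeric input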
I then created a simple update statement which strips all characters other than word characters and the period:
update public.numbertesting set field_1=regexp_replace(field_1,'[^\w.]','','g')
and if I run this statement, it correctly converts the text data to numeric and maintains the number:
alter table public.numbertesting
alter column field_1 type numeric
using field_1::numeric
But I need to use the function in order to properly discard any bad data and set those values to null.
Even after I run the cleanup so that the text value is, say, 5.55,
my "cast_text_to_numeric" function STILL sets this to null. I don't understand why the function sets it to null when the statement above correctly converts it to a proper number.
How can I fix my cast_text_to_numeric function to properly convert values such as 5.55?
I'm OK with discarding (setting to NULL) any values that don't end up as numbers and a period. The regular expression will strip out all other characters... and if there happen to be two numbers in the text field, the script would combine them into one (spaces are removed), and I'm good with that.
In the example data above, after conversion, the end result in the numeric field would be:
5.55
100050.44
null
550
null
570.00
FYI, I am on Postgres 11 right now
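For reference, the regex cleanup and the safe cast can be combined in a single pass; a minimal sketch using the function and regular expression from above (table and column names as in the question):
SELECT public.cast_text_to_numeric(
           regexp_replace(field_1, '[^\w.]', '', 'g')) AS field_1_numeric
FROM public.numbertesting;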
I want to convert a readable timestamp to UNIX time.
For example: I want to convert 2018-08-24 18:42:16 to 1535136136000.
Here is my syntax:
TO_UNIXTIME('2018-08-24 06:42:16') new_year_ut
My error is:
SYNTAX_ERROR: line 1:77: Unexpected parameters (varchar(19)) for function to_unixtime. Expected: to_unixtime(timestamp) , to_unixtime(timestamp with time zone)
You need to wrap the varchar in a CAST to timestamp:
to_unixtime(CAST('2018-08-24 06:42:16' AS timestamp)) -- note: returns a double
If your timestamp value doesn't have a fractional second (or you are not interested in it), you can cast the result to BIGINT to get an integral value:
CAST(to_unixtime(CAST('2018-08-24 06:42:16' AS timestamp)) AS BIGINT)
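Since the desired output in the question (1535136136000) is in milliseconds while to_unixtime returns seconds, one possible sketch multiplies before the cast:
CAST(to_unixtime(CAST('2018-08-24 06:42:16' AS timestamp)) * 1000 AS BIGINT)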
If your readable timestamp value is a string in different format than the above, you would need to use date_parse or parse_datetime for the conversion. See https://trino.io/docs/current/functions/datetime.html for more information.
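For instance, a sketch for a hypothetical day-first input format:
to_unixtime(date_parse('24-08-2018 06:42:16', '%d-%m-%Y %H:%i:%s'))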
Note: when dealing with timestamp values, please keep in mind https://github.com/trinodb/trino/issues/37
I'm using Python to read values from SQL Server (pypyodbc) and insert them into PostgreSQL (psycopg2).
A value in the NAME field has come up that is causing errors:
Montaño
The value is existing in my MSSQL database just fine (SQL_Latin1_General_CP1_CI_AS encoding), and can be inserted into my PostgreSQL database just fine (UTF8) using PGAdmin and an insert statement.
The problem is that selecting it using Python causes the value to be converted to:
Monta\xf1o
(0xf1 is the Latin-1 code point for 'Latin small letter n with tilde')
...which is causing the following error to be thrown when trying to insert into PostgreSQL:
invalid byte sequence for encoding "UTF8": 0xf1 0x6f 0x20 0x20
Is there any way to avoid the conversion of the input string to the string that is causing the error above?
Under Python 2 you actually do want to perform a conversion, from a byte string to the unicode type. So, if your code looks something like
sql = """\
SELECT NAME FROM dbo.latin1test WHERE ID=1
"""
mssql_crsr.execute(sql)
row = mssql_crsr.fetchone()
name = row[0]
then you probably want to convert the latin1 byte string (retrieved from SQL Server) to unicode before using it as a parameter to the PostgreSQL INSERT, i.e., instead of
name = row[0]
you would do
name = unicode(row[0], 'latin1')
I'm having some difficulty with inserting some data using libpq. I have two custom data types:
create type size as (width real, height real);
create type rotated_rect as (angle real, center point, bounding_box box, size size);
and I would like to insert a record into a table which has a rotated_rect field, so for that field I'm putting together this string value with libpq:
paramv[3] = "(10.5,10.1,10.2,20,20,20,40,(5,5))";
However, it's giving me the error: invalid input syntax for type point: "10.1"
I've also tried:
paramv[3] = "(10.5,(10.1,10.2),20,20,20,40,(5,5))"; -> invalid input syntax for "(10.1"
paramv[3] = "(10.5,(10.1,10.2),(20,20,20,40),(5,5))"; -> as above
and the sql command I'm using is:
res = PQexecParams(conn, "insert into test (r,b,s,rr) values ($1::real,$2::box,$3::size,$4::rotated_rect)", 4, NULL, paramv, NULL, NULL,0);
How do I fix this?
This works (tested in Postgres 9.3):
SELECT '(10.5,"(10.1,10.2)","(20,20,20,40)","(5,5)")'::rotated_rect
Returns:
'(10.5,"(10.1,10.2)","(20,40),(20,20)","(5,5)")'
Note the different syntax for box. Try this form.
What got me was that escaped double quotes and parentheses need to be placed around each value that represents a field of the custom composite type which itself requires more than one value to create, so:
paramv[0] = "(10.5,\"(10.1,10.2)\",\"(20,20,20,40)\",\"(5,5)\")";
As this string is used as a parameter, the single quotes that would usually wrap the outer parentheses are not needed.
In a non-parameterised query, it would be implemented like so with the single quotes:
res = PQexec(conn, "insert into test (rr) values ('(10.5,\"(10.1,10.2)\",\"(20,20,20,40)\",\"(5,5)\")')");
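An alternative that sidesteps the quoting entirely is to build the composite value server-side with ROW constructors and casts; a sketch, assuming the same test table and types as above:
INSERT INTO test (rr)
VALUES (ROW(10.5,
            point(10.1, 10.2),
            box(point(20,20), point(20,40)),
            ROW(5,5)::size)::rotated_rect);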