store infinity in postgres json via django

I have a list of tuples like below -
[(float('inf'), 1.0), (270, 0.9002), (0, 0.0)]
I am looking for a simple serializer/deserializer that helps me store this tuple in a jsonb field in PostgreSQL.
I tried using JSONEncoder().encode(a_math_function), but it didn't help.
I am facing the following error while attempting to store the above list in the jsonb field -
django.db.utils.DataError: invalid input syntax for type json
LINE 1: ...", "a_math_function", "last_updated") VALUES (1, '[[Infinit...
DETAIL: Token "Infinity" is invalid.
Note: the field a_math_function is of type JSONField()

t=# select 'Infinity'::float;
float8
----------
Infinity
(1 row)
because
https://www.postgresql.org/docs/current/static/datatype-numeric.html#DATATYPE-FLOAT
In addition to ordinary numeric values, the floating-point types have
several special values:
Infinity
-Infinity
NaN
yet JSON has no such value (unless it's a string)
https://www.json.org/
value
string
number
object
array
true
false
null
thus:
t=# select '{"k":Infinity}'::json;
ERROR: invalid input syntax for type json
LINE 1: select '{"k":Infinity}'::json;
^
DETAIL: Token "Infinity" is invalid.
CONTEXT: JSON data, line 1: {"k":Infinity...
Time: 19.059 ms
so it's not a Django or Postgres limitation - Infinity is simply an invalid JSON token, yet 'Infinity' is a valid string. so
t=# select '{"k":"Infinity"}'::json;
json
------------------
{"k":"Infinity"}
(1 row)
works... But Infinity here is "just a word". Of course you can save it as a string rather than as a numeric value, check every string you read back against "Infinity", and, if it matches, have your program logic treat it as real infinity... But in short - you can't store it as a number, because the JSON specification does not support it... the same way you can't store, let's say, red #ff0000 as a colour in json - only as a string, to be caught and processed by your engine...
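A minimal Python sketch of that string-based workaround (the helper names below are made up for illustration, not part of Django): replace ±inf with sentinel strings before the list goes into the JSONField, and turn them back into floats after loading.

import math

INF_TOKEN = "Infinity"        # sentinel strings that are valid JSON
NEG_INF_TOKEN = "-Infinity"

def encode_value(value):
    # json/jsonb accepts strings, so swap non-finite floats for sentinels
    if isinstance(value, float) and math.isinf(value):
        return INF_TOKEN if value > 0 else NEG_INF_TOKEN
    return value

def decode_value(value):
    # restore the sentinels to real floats on the way out
    if value == INF_TOKEN:
        return math.inf
    if value == NEG_INF_TOKEN:
        return -math.inf
    return value

def encode_pairs(pairs):
    return [[encode_value(a), encode_value(b)] for a, b in pairs]

def decode_pairs(data):
    return [(decode_value(a), decode_value(b)) for a, b in data]

encoded = encode_pairs([(math.inf, 1.0), (270, 0.9002), (0, 0.0)])
print(encoded)                # [['Infinity', 1.0], [270, 0.9002], [0, 0.0]] - safe for jsonb
print(decode_pairs(encoded))  # [(inf, 1.0), (270, 0.9002), (0, 0.0)]
# in the model: obj.a_math_function = encoded, and decode_pairs(obj.a_math_function) after loading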
update:
postgres would cast float to text itself on to_json:
t=# select to_json(sub) from (select 'Infinity'::float) sub;
to_json
-----------------------
{"float8":"Infinity"}
(1 row)
update
https://www.postgresql.org/docs/current/static/datatype-json.html
When converting textual JSON input into jsonb, the primitive types
described by RFC 7159 are effectively mapped onto native PostgreSQL
types
...
number → numeric (NaN and infinity values are disallowed)

Related

Unable to cast redshift column to integer due to empty values

I have a Redshift table with a column that rarely has empty values. It is expected to contain only integer values, but in some places empty values exist. When I try to cast it using :: it throws an error -
[Code: 500310, SQL State: XX000] [Amazon](500310) Invalid operation: Invalid digit, Value 'B', Pos 0, Type: Integer
Details:
-----------------------------------------------
error: Invalid digit, Value 'B', Pos 0, Type: Integer
code: 1207
context: BEVEL_ON
query: 34112149
location: :0
process: query1_836_34112149 [pid=0]
-----------------------------------------------;
So to clarify: you have a text column that contains numeric characters most of the time, and you want to cast this to integer, right? It also sounds like you believe that the only non-numeric value is the empty string ''.
If this is the case then the solution is fairly simple - change the empty string to NULL before casting. The DECODE statement is my go-to for this:
DECODE(col_X, '', NULL, col_X)::INT
If a more varied set of strings are in the column then using regexp_replace() to strip all the non-numeric characters would be needed.
text_to_int_alt(
  case
    when regexp_replace(convert(varchar, creative_id), '[^0-9]', '') <> ''
      then regexp_replace(convert(varchar, creative_id), '[^0-9]', '')
  end)

How to convert text field with formatted currency to numeric field type in Postgres?

I have a table with a text field that holds formatted strings representing money.
For example, it will have values like these, but also some "bad" invalid data:
$5.55
$100050.44
over 10,000
$550
my money
570.00
I want to convert this to a numeric field, keeping the actual numbers that can be retained and converting any that can't to null.
I was originally using this function, which did convert clean numbers (numbers that didn't have any formatting). The issue was that it would not convert, for example, $5.55, and instead set it to null.
CREATE OR REPLACE FUNCTION public.cast_text_to_numeric(v_input text)
    RETURNS numeric
    LANGUAGE 'plpgsql'
    COST 100
    VOLATILE
AS $BODY$
declare
    v_output numeric default null;
begin
    begin
        v_output := v_input::numeric;
    exception when others then
        return null;
    end;
    return v_output;
end;
$BODY$;
I then created a simple update statement which removes all non-digit characters but keeps the period.
update public.numbertesting set field_1=regexp_replace(field_1,'[^\w.]','','g')
and if I run this statement, it correctly converts the text data to numeric and maintains the number:
alter table public.numbertesting
alter column field_1 type numeric
using field_1::numeric
But I need to use the function in order to properly discard any bad data and set those values to null.
Even after I run the cleanup so that the text value is, say, 5.55,
my "cast_text_to_numeric" function STILL sets this to null? I don't understand why it sets this to null, while the statement above correctly converts it to a proper number.
How can I fix my cast_text_to_numeric function to properly convert values such as 5.55?
I'm OK with discarding (setting to NULL) any values that don't end up as numbers with a period. The regular expression will strip out all other characters... and if there happen to be two numbers in the text field, the script would combine them into one (spaces are removed), and I'm good with that.
In the example of data above, after conversion, the end result in numeric field would be:
5.55
100050.44
null
550
null
570.00
FYI, I am on Postgres 11 right now
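Not the PL/pgSQL fix itself, but as an illustration of the intended strip-and-cast behaviour, here is a small Python sketch (the pattern keeps only digits and the decimal point, which is slightly stricter than the '[^\w.]' used above):

import re
from decimal import Decimal, InvalidOperation

def clean_and_cast(text):
    # keep only digits and the decimal point, then try the numeric cast
    cleaned = re.sub(r"[^0-9.]", "", text)
    try:
        return Decimal(cleaned) if cleaned else None
    except InvalidOperation:
        # anything that still isn't a clean number becomes null/None
        return None

for sample in ["$5.55", "$100050.44", "$550", "my money", "570.00"]:
    print(sample, "->", clean_and_cast(sample))
# $5.55 -> 5.55, $100050.44 -> 100050.44, $550 -> 550, my money -> None, 570.00 -> 570.00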

Presto SQL: TO_UNIXTIME

I want to convert a readable timestamp to UNIX time.
For example: I want to convert 2018-08-24 18:42:16 to 1535136136000.
Here is my syntax:
TO_UNIXTIME('2018-08-24 06:42:16') new_year_ut
My error is:
SYNTAX_ERROR: line 1:77: Unexpected parameters (varchar(19)) for function to_unixtime. Expected: to_unixtime(timestamp) , to_unixtime(timestamp with time zone)
You need to wrap the varchar in a CAST to timestamp:
to_unixtime(CAST('2018-08-24 06:42:16' AS timestamp)) -- note: returns a double
If your timestamp value doesn't have a fraction of a second (or you are not interested in it), you can cast to bigint to get an integral result:
CAST(to_unixtime(CAST('2018-08-24 06:42:16' AS timestamp)) AS BIGINT)
If your readable timestamp value is a string in a different format than the above, you will need to use date_parse or parse_datetime for the conversion. See https://trino.io/docs/current/functions/datetime.html for more information.
Note: when dealing with timestamp values, please keep in mind https://github.com/trinodb/trino/issues/37
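As a quick sanity check of the expected value, the same conversion can be reproduced in Python (assuming the input is meant as UTC); like to_unixtime, this yields seconds, so multiply by 1000 if you want the milliseconds value from the question:

from datetime import datetime, timezone

# parse the readable timestamp and treat it as UTC (an assumption here)
ts = datetime.strptime("2018-08-24 18:42:16", "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)

seconds = int(ts.timestamp())      # 1535136136, comparable to to_unixtime()
print(seconds, seconds * 1000)     # 1535136136 1535136136000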

How to avoid conversion to ASCII when reading

I'm using Python to read values from SQL Server (pypyodbc) and insert them into PostgreSQL (psycopg2)
A value in the NAME field has come up that is causing errors:
Montaño
The value exists in my MSSQL database just fine (SQL_Latin1_General_CP1_CI_AS encoding), and can be inserted into my PostgreSQL database just fine (UTF8) using pgAdmin and an insert statement.
The problem is that selecting it using Python causes the value to be converted to:
Monta\xf1o
(0xf1 is Latin-1 for 'Latin small letter n with tilde')
...which is causing the following error to be thrown when trying to insert into PostgreSQL:
invalid byte sequence for encoding "UTF8": 0xf1 0x6f 0x20 0x20
Is there any way to avoid the conversion of the input string to the string that is causing the error above?
Under Python 2 you actually do want to perform a conversion from a plain byte string to the unicode type. So, if your code looks something like
sql = """\
SELECT NAME FROM dbo.latin1test WHERE ID=1
"""
mssql_crsr.execute(sql)
row = mssql_crsr.fetchone()
name = row[0]
then you probably want to convert the basic latin1 string (retrieved from SQL Server) to the type unicode before using it as a parameter to the PostgreSQL INSERT, i.e., instead of
name = row[0]
you would do
name = unicode(row[0], 'latin1')
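Putting the pieces together, a rough Python 2 sketch of the whole transfer might look like this (the DSN, connection string, and target table name are placeholders; the essential step is the latin1 decode before the value is passed to psycopg2, which then sends it to PostgreSQL as UTF-8):

import pypyodbc
import psycopg2

mssql_conn = pypyodbc.connect("DSN=my_mssql_dsn")      # placeholder DSN
pg_conn = psycopg2.connect("dbname=mydb user=me")      # placeholder connection string

mssql_crsr = mssql_conn.cursor()
mssql_crsr.execute("SELECT NAME FROM dbo.latin1test WHERE ID=1")
row = mssql_crsr.fetchone()

# decode the latin1 byte string into a unicode object before inserting
name = unicode(row[0], 'latin1')

pg_crsr = pg_conn.cursor()
pg_crsr.execute("INSERT INTO latin1test (name) VALUES (%s)", (name,))
pg_conn.commit()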

How do I insert this custom data type with libpq?

I'm having some difficulty with inserting some data using libpq. I have two custom data types:
create type size as (width real, height real);
create type rotated_rect as (angle real, center point, bounding_box box, size size);
and I would like to insert a record into a table which has a rotated_rect field, so for that field, using libpq, I'm putting together the string value:
paramv[3] = "(10.5,10.1,10.2,20,20,20,40,(5,5))";
However, it's giving me the error: invalid input syntax for type point: "10.1"
I've also tried:
paramv[3] = "(10.5,(10.1,10.2),20,20,20,40,(5,5))"; -> invalid input syntax for "(10.1"
paramv[3] = "(10.5,(10.1,10.2),(20,20,20,40),(5,5))"; -> as above
and the sql command I'm using is:
res = PQexecParams(conn, "insert into test (r,b,s,rr) values ($1::real,$2::box,$3::size,$4::rotated_rect)", 4, NULL, paramv, NULL, NULL,0);
How do I fix this?
This works (tested in Postgres 9.3):
SELECT '(10.5,"(10.1,10.2)","(20,20,20,40)","(5,5)")'::rotated_rect
Returns:
'(10.5,"(10.1,10.2)","(20,40),(20,20)","(5,5)")'
Note the different syntax for box. Try this form.
What got me was that escaped double quotes and parentheses need to be used around the values representing a field of the custom composite type that requires more than one value to construct, so:
paramv[0] = "(10.5,\"(10.1,10.2)\",\"(20,20,20,40)\",\"(5,5)\")";
As this string is used as a parameter, the single quotes that would usually wrap the outer parentheses are not needed.
In a non-parameterised query, it would be implemented like so with the single quotes:
res = PQexec(conn, "insert into test (rr) values ('(10.5,\"(10.1,10.2)\",\"(20,20,20,40)\",\"(5,5)\")')");