Unable to cast redshift column to integer due to empty values - casting

I have a Redshift table with a column that occasionally contains empty values. It is expected to hold only integers, but empty values exist in some rows. When I try to cast it using :: it throws this error -
[Code: 500310, SQL State: XX000] [Amazon](500310) Invalid operation: Invalid digit, Value 'B', Pos 0, Type: Integer
Details:
-----------------------------------------------
error: Invalid digit, Value 'B', Pos 0, Type: Integer
code: 1207
context: BEVEL_ON
query: 34112149
location: :0
process: query1_836_34112149 [pid=0]
-----------------------------------------------;

So to clarify, you have a text column that contains numeric characters most of the time and you want to cast it to integer, right? It also sounds like you believe that the only non-numeric values are the empty string ''.
If this is the case then the solution is fairly simple - change the empty string to NULL before casting. The DECODE function is my go-to for this:
DECODE(col_X, '', NULL, col_X)::INT
If a more varied set of strings is in the column, you would need regexp_replace() to strip all the non-numeric characters before casting. The error above reports the invalid digit 'B', which suggests this may be the case here.
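To make the difference between the two strategies concrete, here is a minimal sketch in Python (not Redshift SQL) that models what each expression returns. The function names and the sample strings are hypothetical, chosen only for illustration.

```python
import re


def empty_to_none_then_int(value):
    """Model of DECODE(col, '', NULL, col)::INT: '' becomes None, anything else is cast."""
    if value is None or value == "":
        return None
    # Like the failing cast, this raises on text that is not a number.
    return int(value)


def digits_only_then_int(value):
    """Model of the regexp_replace approach: strip non-digits, None if nothing is left."""
    if value is None:
        return None
    digits = re.sub(r"[^0-9]", "", value)
    return int(digits) if digits else None
```

Note the trade-off: the second function never raises, but it silently turns a value such as '12a3' into 123 and drops minus signs, so it only makes sense if that loss is acceptable.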

text_to_int_alt(
case
when regexp_replace(convert(varchar, creative_id), '[^0-9]', '') <> '' then
regexp_replace(convert(varchar, creative_id), '[^0-9]', '')
end)

Related

How to convert text field with formatted currency to numeric field type in Postgres?

I have a table that has a text field which has formatted strings that represent money.
For example, it will have values like this, but also have "bad" invalid data as well
$5.55
$100050.44
over 10,000
$550
my money
570.00
I want to convert this to a numeric field, keeping the actual numbers wherever they can be recovered and converting any value that can't be to null.
I was originally using this function, which did convert clean numbers (numbers that didn't have any formatting). The issue was that it would not convert $5.55, for example, and set it to null instead.
CREATE OR REPLACE FUNCTION public.cast_text_to_numeric(
    v_input text)
    RETURNS numeric
    LANGUAGE 'plpgsql'
    COST 100
    VOLATILE
AS $BODY$
declare
    v_output numeric default null;
begin
    begin
        v_output := v_input::numeric;
    exception when others then
        return null;
    end;
    return v_output;
end;
$BODY$;
I then created a simple update statement which removes all non-word characters (anything other than letters, digits, and underscores) but keeps the period.
update public.numbertesting set field_1=regexp_replace(field_1,'[^\w.]','','g')
and if I run this statement, it correctly converts the text data to numeric and maintains the number:
alter table public.numbertesting
alter column field_1 type numeric
using field_1::numeric
But I need to use the function in order to properly discard any bad data and set those values to null.
Even after I run the cleanup so that the text value is, say, 5.55, my cast_text_to_numeric function STILL sets it to null. I don't understand why the function returns null when the ALTER TABLE statement above converts the same value to a proper number.
How can I fix my cast_text_to_numeric function to properly convert values such as 5.55 , etc?
I'm OK with discarding (setting to NULL) any values that don't end up as just numbers and a period. The regular expression will strip out the other characters, and if there happen to be two numbers in the text field, the script would combine them into one (spaces are removed), and I'm fine with that.
In the example of data above, after conversion, the end result in numeric field would be:
5.55
100050.44
null
550
null
570.00
FYI, I am on Postgres 11 right now
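As a way to pin down the intended behaviour, here is a minimal Python model of "clean with the same regular expression, then cast, and fall back to null". It does not explain why the PL/pgSQL function returns null; it only reproduces the expected results listed above. The Python function reuses the name cast_text_to_numeric purely for illustration.

```python
import re
from decimal import Decimal, InvalidOperation


def cast_text_to_numeric(text):
    """Strip non-word characters except '.', then cast; return None on failure."""
    if text is None:
        return None
    # Same pattern as the UPDATE statement: letters, digits, '_' and '.' survive.
    cleaned = re.sub(r"[^\w.]", "", text)
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    # Decimal also parses words such as 'NaN' or 'Infinity'; treat those as bad data.
    return value if value.is_finite() else None


samples = ["$5.55", "$100050.44", "over 10,000", "$550", "my money", "570.00"]
results = [cast_text_to_numeric(s) for s in samples]
```

Because letters survive the cleanup, 'over 10,000' becomes 'over10000' and fails the cast, which is why it ends up as null rather than 10000.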

store infinity in postgres json via django

I have a list of tuples like below -
[(float('inf'), 1.0), (270, 0.9002), (0, 0.0)]
I am looking for a simple serializer/deserializer that helps me store this tuple in a jsonb field in PostgreSQL.
I tried using JSONEncoder().encode(a_math_function), but that didn't help.
I am facing the following error while attempting to store the above list in a jsonb field -
django.db.utils.DataError: invalid input syntax for type json
LINE 1: ...", "a_math_function", "last_updated") VALUES (1, '[[Infinit...
DETAIL: Token "Infinity" is invalid.
Note: the field a_math_function is of type JSONField()
t=# select 'Infinity'::float;
float8
----------
Infinity
(1 row)
This works because, per the documentation:
https://www.postgresql.org/docs/current/static/datatype-numeric.html#DATATYPE-FLOAT
In addition to ordinary numeric values, the floating-point types have
several special values:
Infinity
-Infinity
NaN
yet JSON has no such value (unless it is a string):
https://www.json.org/
value
string
number
object
array
true
false
null
thus:
t=# select '{"k":Infinity}'::json;
ERROR: invalid input syntax for type json
LINE 1: select '{"k":Infinity}'::json;
^
DETAIL: Token "Infinity" is invalid.
CONTEXT: JSON data, line 1: {"k":Infinity...
Time: 19.059 ms
So this is not a Django or Postgres limitation: the bare token Infinity is simply invalid JSON, whereas the string 'Infinity' is valid. So
t=# select '{"k":"Infinity"}'::json;
json
------------------
{"k":"Infinity"}
(1 row)
works. But Infinity here is "just a word". Of course you can save it as a string instead of a numeric value, check every string against "Infinity", and, if it matches, have your program logic treat it as a real infinity. But in short, you can't store it as a number, because the JSON specification does not support it. In the same way, you can't store, say, red #ff0000 as a colour in JSON, only as a string to be caught and processed by your own code.
update:
Postgres itself converts the float to text in to_json:
t=# select to_json(sub) from (select 'Infinity'::float) sub;
to_json
-----------------------
{"float8":"Infinity"}
(1 row)
update
https://www.postgresql.org/docs/current/static/datatype-json.html
When converting textual JSON input into jsonb, the primitive types
described by RFC 7159 are effectively mapped onto native PostgreSQL
types
...
number | numeric | NaN and infinity values are disallowed
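The string workaround described in this answer could be sketched on the Python side roughly as follows. This is only an illustration under stated assumptions: the helper names (encode_number, to_jsonable, from_jsonable) are hypothetical, and the sketch uses the standard json module rather than any Django API. It assumes the stored data never contains the literal strings "Infinity", "-Infinity", or "NaN" with another meaning.

```python
import json
import math

# Sentinel strings used in place of the non-finite floats that JSON cannot hold.
_SPECIAL = {"Infinity": math.inf, "-Infinity": -math.inf, "NaN": math.nan}


def encode_number(x):
    """Replace a non-finite float with its sentinel string; pass other values through."""
    if isinstance(x, float) and not math.isfinite(x):
        if math.isnan(x):
            return "NaN"
        return "Infinity" if x > 0 else "-Infinity"
    return x


def to_jsonable(pairs):
    """Convert a list of tuples into nested lists that are valid JSON."""
    return [[encode_number(v) for v in pair] for pair in pairs]


def from_jsonable(rows):
    """Reverse to_jsonable: turn sentinel strings back into floats."""
    return [
        tuple(_SPECIAL.get(v, v) if isinstance(v, str) else v for v in row)
        for row in rows
    ]


points = [(float("inf"), 1.0), (270, 0.9002), (0, 0.0)]
# allow_nan=False makes json.dumps raise instead of emitting a bare Infinity token.
payload = json.dumps(to_jsonable(points), allow_nan=False)
restored = from_jsonable(json.loads(payload))
```

The value of to_jsonable(points) is what would be assigned to the JSONField, and from_jsonable would be applied after reading it back.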

Cellular number valid in pl sql oracle

I want to check whether a cellular number is correct and, if it is, return it in a standard format.
For example,
correct numbers:
0521234567
521234567 - only needs a 0 added at the start
052-1234567
(052)1234567
052-123-456-7
incorrect numbers:
052123
0871234567
How do I do this?
I tried to write:
SELECT REGEXP_REPLACE('0521234567', '^0?(5[0-9])(\-)?\d{7}$', '') FROM dual;
but it returns ''.
Thanks.
SELECT CASE WHEN REGEXP_LIKE('0521234567', '^0?(5[0-9])(\-)?\d{7}$')
THEN '0521234567'
ELSE NULL END
FROM dual;
If the string matches the format, this returns the string; otherwise it returns NULL.
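The pattern above only accepts an optional single hyphen after the prefix, so it does not cover forms such as (052)1234567 or 052-123-456-7, and it does not add the missing leading 0. One possible way to handle all the examples in the question is to remove the separators first and validate afterwards. The sketch below shows that idea in Python rather than PL/SQL; the function name normalize_cell_number is hypothetical, and the rule "prefix 5 followed by eight digits" is an assumption inferred from the examples.

```python
import re


def normalize_cell_number(raw):
    """Return the number as 05XXXXXXXX, or None if it does not look valid."""
    # Drop the separators seen in the examples: hyphens, parentheses, whitespace.
    compact = re.sub(r"[-()\s]", "", raw)
    # Optional leading 0, then 5 and exactly eight more digits.
    match = re.fullmatch(r"0?(5[0-9]{8})", compact)
    return ("0" + match.group(1)) if match else None
```

With the sample values from the question, every "correct" number normalizes to 0521234567, while 052123 and 0871234567 are rejected.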

InvalidOperation: Invalid literal for Decimal: u' '

When users allocate money to envelopes, they sometimes forget to enter amounts for some of the envelopes, which should then count as 0. Instead, this results in an InvalidOperation.
How can I fix this error? Or how can the system pick up only the amounts that are greater than 0?
Exception
Types: InvalidOperation
Value: Invalid literal for Decimal: u''
envelopes/views.py in allocate (application)
    t2_payee = 'Envelope Transfer'
    for val in request.POST:
        if val[0:4] == "env_":
            env = Envelope.objects.get(pk=int(val[4:]))
            amt = Decimal(request.POST[val])
<WSGIRequest
path:/envelopes/allocate/6313/,
GET:<QueryDict: {}>,
POST:<QueryDict: {u'allocation_date': [u'2013-03-03'], u'month': [u'03'],
u'source': [u'6313'], u'year': [u'2013'], u'env_6316': [u''],
u'csrfmiddlewaretoken': [u'3kKoVymvIpbyhCknE1c3WH6YFznTaEoj'],
u'env_6315': [u'1'], u'env_6314': [u'0']}>,
COOKIES:{'__utma': '136509540.132217190.1357543480.1362303551.1362307904.34',
'__utmb': '1
In your example the value of env_6316 is empty, and Decimal doesn't know how to convert an empty string to a number. You should check whether the value is empty and, if so, replace it with 0 before converting to Decimal.
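A minimal sketch of that check, which also keeps only the amounts greater than 0 as asked in the question. It is written as standalone Python: a plain dict stands in for request.POST, and the helper names parse_amount and positive_allocations are hypothetical.

```python
from decimal import Decimal


def parse_amount(raw):
    """Treat a missing or blank field as 0; otherwise convert it to Decimal."""
    text = (raw or "").strip()
    if not text:
        return Decimal("0")
    # Still raises decimal.InvalidOperation for text that is not a number.
    return Decimal(text)


def positive_allocations(post):
    """Map envelope id -> amount for every env_<id> field with an amount above 0."""
    allocations = {}
    for key, raw in post.items():
        if not key.startswith("env_"):
            continue
        amount = parse_amount(raw)
        if amount > 0:
            allocations[int(key[4:])] = amount
    return allocations


# Stand-in for request.POST, using the values from the error report.
post = {"allocation_date": "2013-03-03", "env_6316": "", "env_6315": "1", "env_6314": "0"}
allocations = positive_allocations(post)
```

In the view, the Envelope lookup would then only be needed for the ids that remain in the result.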
I encountered this error while running a SQL query and attempting to construct a Pandas dataframe with the data returned from the query. An alteration to the query solved the problem for me. I had also attempted to CAST the column values returned, but ultimately, appending ::FLOAT8 to the problematic field was the only solution for me.
Example query:
SELECT sum(dollars)::FLOAT8 FROM [table] WHERE ...
sum(dollars) was the field causing the issue for me. Its type in my table was numeric(10,6), and its size was 8.

Django - coercing to Unicode

I am having a unicode problem and, as happens every time I run into something like this, I'm completely lost.
One of my Django templates raises a TypeError:
Exception Value:
coercing to Unicode: need string or buffer, long found
The line causing trouble just builds a string (which I want to use in a MySQL query):
query = unicode('''(SELECT asset_name, asset_description, asset_id, etat_id, etat_name FROM Asset LEFT OUTER JOIN Etat ON etat_id_asset=asset_id WHERE asset_id_proj='''+proj+''' AND asset_id_type='''+t.type_id+''' ORDER BY asset_name, asset_description) UNION (SELECT asset_name, asset_description, asset_id, 'NULL', 'NULL' FROM Asset WHERE asset_id_proj='''+proj+''' AND asset_id_type='''+t.type_id+''' AND asset_id IN (SELECT etat_id_asset FROM Etat)); ''')
What can be wrong here?
I know you figured out a better way to accomplish this, but to answer the original question, in case you get that error again somewhere else in the project:
t.type_id appears to be a long integer. You cannot concatenate an integer with a string unless you first convert it to a string, which is simple:
myString = 'some string with type id ' + str(t.type_id) + ', and whatever else you want in the string.'