sqlalchemy double quoting list items - list

I'm having a quoting issue with a raw query when using a "WHERE IN" clause. SQLAlchemy is adding double quotes around the single quotes of each item in a list...
Query that I'm trying to execute:
sql_query = "SELECT col1, col2, col3 FROM preferences WHERE recipient IN :recipients"
preferences = sqlsession.execute(sql_query, dict(recipients=tuple(message.recipients)))
message.recipients is a list like so:
["recipient1","recipient2","recipient3"]
SQLAlchemy debug log
INFO:sqlalchemy.engine.base.Engine:SELECT col1, col2, col2 FROM preferences WHERE recipient IN %s
INFO:sqlalchemy.engine.base.Engine:(('recipient1', 'recipient2', 'recipient3'),)
Mariadb log
9 Query SELECT col1, col2, col3 FROM preferences WHERE recipient IN ("'recipient1'", "'recipient1'", "'recipient1'") <-- double quotes around single quotes
I ran strace to see where those quotes are added, and it's SQLAlchemy's fault.
Table schema:
CREATE TABLE `preferences` (
`recipient` varchar(255) COLLATE latin1_general_ci NOT NULL,
`col1` tinyint(1) NOT NULL DEFAULT '1',
`col2` tinyint(1) NOT NULL DEFAULT '1',
`col3` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`recipient`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci
Environment
CentOS 7
python-sqlalchemy.x86_64 0.9.7-3.el7 epel

Hi.
As proposed on the gmane.comp.python.sqlalchemy.user mailing list, you could use the autoload feature.
To summarise Michael Bayer's answer:
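t = sqlalchemy.Table(
    'preferences',  # your table name
    sqlalchemy.MetaData(),
    autoload=True,
    # reflection needs an Engine or Connection; for a Session, pass sqlsession.get_bind()
    autoload_with=sqlsession,
)
query = sqlalchemy.select([t.c.col1, t.c.col2, t.c.col3]) \
    .where(t.c.recipient.in_(message.recipients))
# a select() has no fetchall(); execute it first, then fetch the rows
preferences = sqlsession.execute(query).fetchall()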
In my case I would have had to "autoload" a bunch of tables, which was not really convenient because they had complex joins.
I ended up using something along these lines:
query = "SELECT col1, col2, col3 FROM preferences\nWHERE recipient IN (%s);" % (
    ', '.join(['%s'] * len(message.recipients))  # one %s placeholder per recipient
)
result = sqlsession.execute(query, (message.recipients,))  # notice the ","
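The idea is to build the query with one placeholder per item given to the IN expression; by doing so the driver still escapes every value for you. Note that %s is the placeholder style of drivers such as MySQLdb and psycopg2. A driver with a different parameter style (sqlite3, for example, uses ?) needs its own placeholder, so the query string above is not portable to every database backend as written.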
You can see the resulting query with:
>>> print result._saved_cursor._last_executed
SELECT col1, col2, col3 FROM preferences
WHERE recipient IN ('recipient1', 'recipient2', ...);
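The placeholder-building idea can be tried without a MariaDB server. The following is a minimal sketch, not the answer's exact code: it assumes the standard-library sqlite3 driver (which uses ? instead of %s) and a hypothetical two-column version of the preferences table; the function names are made up for illustration.

```python
import sqlite3


def build_in_query(n_items):
    """Return a SELECT with one '?' placeholder per item of the IN list."""
    if n_items < 1:
        raise ValueError("an IN list needs at least one item")
    placeholders = ", ".join(["?"] * n_items)
    return (
        "SELECT recipient, col1 FROM preferences "
        "WHERE recipient IN (%s) ORDER BY recipient" % placeholders
    )


def fetch_preferences(conn, recipients):
    """Run the query; the driver escapes each recipient value itself."""
    query = build_in_query(len(recipients))
    return conn.execute(query, tuple(recipients)).fetchall()


def demo():
    # Hypothetical in-memory table, reduced to two columns for the sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE preferences (recipient TEXT PRIMARY KEY, col1 INTEGER)"
    )
    conn.executemany(
        "INSERT INTO preferences VALUES (?, ?)",
        [("recipient1", 1), ("recipient2", 0), ("o'brien", 1)],
    )
    # The value containing a single quote needs no manual escaping.
    return fetch_preferences(conn, ["recipient1", "o'brien"])
```

Only the placeholders are interpolated into the SQL string; the values themselves always travel separately as parameters, which is what keeps the quoting correct.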

Related

Using IF in SELECT statement of Google Sheets Query

In order to provide relevant data to an accounting person for him to work further in the Bill Booking stage from a master sheet that I manage in Google Sheets, the query function needs to satisfy either of the two conditions.
The master sheet is here
https://docs.google.com/spreadsheets/d/1pY53-XaGnUQ3BPmLh90mLSqIwSo7S2_QOPbD6JBQHOA/edit#gid=0
The two conditions that determine what the accounting person needs to work on are
(if either of the mentioned two conditions is fulfilled, the data needs to be shown)
If Col8 = "Yes" and Col14 = "Done"
If Col8 = "No" and Col11 = "Done"
For this, I have tried the following
Ifna(QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1pY53-XaGnUQ3BPmLh90mLSqIwSo7S2_QOPbD6JBQHOA/edit#gid=0","Master!A3:N")," Select Col1, Col2, Col3, Col4, Col5, Col7 "&IF(OR(AND(COL8 = "Yes", COL14 = "Done"), AND(COL8 = "No", COL11 = "Done")))&" "))
Currently, this shows no data and no error. However, it should show him 3 rows to work on.
Kindly help me with this.
You need to put the AND and OR directly into the query like this:
=QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1pY53-XaGnUQ3BPmLh90mLSqIwSo7S2_QOPbD6JBQHOA/edit#gid=0","Master!A3:N")," Select Col1, Col2, Col3, Col4, Col5, Col7,Col8,Col11,Col14 where Col8='Yes' and Col14='Done' or Col8='No' and Col11='Done' ")
If you take the IFNA out of the original query, you can see that there is an error because the If statement delivers #N/A. The reason why the If statement delivers #N/A is that there is no second argument to the if statement.
If you reduce the If statement to
=OR(AND(COL8 = "Yes", COL14 = "Done"), AND(COL8 = "No", COL11 = "Done"))
it always delivers FALSE because COL8, COL11 and COL14 are valid cell references, but point to blank cells outside the range of the current sheet so it still doesn't work.
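The filter that the where clause expresses can be sketched outside Sheets to check the AND/OR grouping. This is only an illustration in Python with hypothetical sample rows; the column names mirror Col8, Col11 and Col14 from the question, and the parentheses make explicit the grouping that the query applies because AND binds tighter than OR.

```python
def needs_booking(row):
    """True when a row satisfies either of the two conditions from the question."""
    return (row["Col8"] == "Yes" and row["Col14"] == "Done") or (
        row["Col8"] == "No" and row["Col11"] == "Done"
    )


# Hypothetical sample rows; only the three columns used by the conditions.
ROWS = [
    {"Col8": "Yes", "Col11": "", "Col14": "Done"},  # matches the first condition
    {"Col8": "No", "Col11": "Done", "Col14": ""},   # matches the second condition
    {"Col8": "Yes", "Col11": "Done", "Col14": ""},  # matches neither
    {"Col8": "No", "Col11": "", "Col14": "Done"},   # matches neither
]

selected = [row for row in ROWS if needs_booking(row)]
```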

RelNode of a query in which FROM clause itself has a query

I want to fetch rows from a table ordered by the column id, without id being present in the result. I can achieve this using the following query.
SELECT COALESCE (col1, '**')
FROM (select col1, id FROM myDataSet.myTable WHERE col4 = 'some filter' ORDER BY id);
Now, I want to create a RelNode of the above query. As far as I know, in calcite, to perform table scan, there are only two methods scan(String tableName) and scan(Iterable<String> tableNames). Is there a way to scan(RelNode ) ? How to do this ?
The query
SELECT COALESCE(col1, '**') FROM myDataSet.myTable WHERE col4 = 'some filter' ORDER BY id
should also give you the desired result.
If you want to represent the query you have written more directly, you would start by constructing a RelNode for the query in the from clause, starting with a scan of myDataSet.myTable, adding the filter, and the order. Then you can project the specific set of columns you want.
Simply create a RelNode for the inner subquery and add another projection on top of it, like so:
builder.scan("myTable")
    .filter(builder.call(SqlStdOperatorTable.EQUALS, builder.field("col4"), builder.literal("some filter")))
    .project(builder.field("col1"), builder.field("id"))
    .sort(builder.field("id"))
    .project(builder.call(SqlStdOperatorTable.COALESCE, builder.field("col1"), builder.literal("**")))
    .build()

How to execute a dynamic SQL statement in a single Select statement?

I wonder how to evaluate the content of dynamic SQL using a single SELECT. The following is only an example, but I would like to call functions dynamically and manage everything with single SELECT statements. (I know that SELECT statements are meant for querying rather than modifying data... but in this deep quarantine I'm turning into a crazy developer.)
SELECT 'SELECT SETVAL(' || chr(39) || c.relname || chr(39)|| ' ,
(SELECT MAX(Id)+1 FROM ' || regexp_replace(c.relname, '_[a-zA-Z]+_[a-zA-Z]+(_[a-zA-Z0-9]+)?', '', 'g') ||' ), true );'
FROM pg_class c WHERE c.relkind = 'S';
The original output is:
SELECT SETVAL('viewitem_id_seq' , (SELECT MAX(Id)+1 FROM viewitem ), true );
SELECT SETVAL('userform_id_seq' , (SELECT MAX(Id)+1 FROM userform ), true );
This is the dynamic part:
(SELECT MAX(Id)+1 FROM ' || regexp_replace(c.relname, '_[a-zA-Z]+_[a-zA-Z]+(_[a-zA-Z0-9]+)?', '', 'g')
It is a string that produces SQL as its output. How can I evaluate this statement in the same line?
The desired output is:
SELECT SETVAL('viewitem_id_seq' , 25, true );
SELECT SETVAL('userform_id_seq' , 85, true );
thanks!
If those are serial or identity columns it would be better to use pg_get_serial_sequence() to get the link between a table's column and its sequence.
You can actually run dynamic SQL inside a SQL statement by using query_to_xml()
I use the following script if I need to synchronize the sequences for serial (or identity) columns with their actual values:
with sequences as (
-- this query is only to identify all sequences that belong to a column
-- it's essentially similar to your select * from pg_class where relkind = 'S'
-- but returns the sequence name, table and column name to which the
-- sequence belongs
select *
from (
select table_schema,
table_name,
column_name,
pg_get_serial_sequence(format('%I.%I', table_schema, table_name), column_name) as col_sequence
from information_schema.columns
where table_schema not in ('pg_catalog', 'information_schema')
) t
where col_sequence is not null
), maxvals as (
select table_schema, table_name, column_name, col_sequence,
--
-- this is the "magic" that runs the SELECT MAX() query
--
(xpath('/row/max/text()',
query_to_xml(format('select max(%I) from %I.%I', column_name, table_schema, table_name), true, true, ''))
)[1]::text::bigint as max_val
from sequences
)
select table_schema,
table_name,
column_name,
col_sequence,
coalesce(max_val, 0) as max_val,
setval(col_sequence, coalesce(max_val, 1)) --<< this uses the value from the dynamic query
from maxvals;
The dynamic part here is the call to query_to_xml()
First I use format() to properly deal with identifiers. It also makes writing the SQL easier as no concatenation is required. So for every table returned by the first CTE, something like this is executed:
query_to_xml('select max(id) from public.some_table', true, true, '')
This returns something like:
<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<max>42</max>
</row>
The value is then extracted from the XML value using xpath() and converted to a number, which is then used in the final SELECT to actually call setval().
The nesting with multiple CTEs is only used to make each part more readable.
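To see what the extraction step does with the XML, here is a small client-side sketch in Python. It is not part of the SQL solution: it only mimics xpath('/row/max/text()') followed by coalesce() using the standard-library xml.etree.ElementTree module, and the function name and the default value are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Same shape as the query_to_xml() output shown above.
SAMPLE = (
    '<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    "<max>42</max>"
    "</row>"
)

# With nulls => true, an empty table yields an element marked xsi:nil.
SAMPLE_NULL = (
    '<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<max xsi:nil="true"/>'
    "</row>"
)


def extract_max(xml_text, default=0):
    """Return the integer inside <max>, or default when it is missing or nil."""
    root = ET.fromstring(xml_text)
    node = root.find("max")
    if node is None or node.text is None:
        return default  # plays the role of coalesce(max_val, ...)
    return int(node.text)
```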
The same approach can also be used, for example, to find the row count for all tables.
Some background on how query_to_xml() works

Athena/Presto data discovery query to recommend JSON schema?

I have an Athena table (raw) with just one column (json).
I have the following query which outputs the frequencies of json keys:
SELECT key, count(*)
FROM (
SELECT map_keys(cast(json_parse(json) AS map(varchar, json))) AS keys
FROM raw
)
CROSS JOIN UNNEST (keys) AS t (key)
GROUP BY key
How can I extend this query so that it'll tell me whether a particular key has values with any non-numeric characters?
[failed attempts deleted after I found answer]
This works:
SELECT k, count(*) as isPresent, sum(isNumber) as isNumber,
count(*)-sum(isNumber) as notIsNumber from (
with dataset as (SELECT
cast(json_parse(json) AS map(varchar, varchar)) as kv FROM raw)
SELECT t.k, t.v,
IF(TRY(cast(t.v as double)) is null, 0, 1) as isNumber
from dataset cross join unnest(kv) as t(k, v)
) group by k
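For checking the logic on a small sample outside Athena, the same per-key profile can be approximated in Python. This is a rough sketch under assumptions: it reads one JSON object per string, treats booleans and nulls as non-numeric, and relies on Python's float(), whose parsing rules are not guaranteed to match Presto's cast to double in every edge case. The function names and the sample data are hypothetical.

```python
import json
from collections import Counter


def is_number(value):
    """Approximate TRY(cast(v as double)) is not null."""
    if value is None or isinstance(value, bool):
        return False
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def profile_keys(json_lines):
    """Map each key to (isPresent, isNumber, notIsNumber) counts."""
    present = Counter()
    numeric = Counter()
    for line in json_lines:
        for key, value in json.loads(line).items():
            present[key] += 1
            if is_number(value):
                numeric[key] += 1
    return {k: (present[k], numeric[k], present[k] - numeric[k]) for k in present}


# Hypothetical sample standing in for the raw table's json column.
SAMPLE = [
    '{"a": 1, "b": "x"}',
    '{"a": "2.5", "b": "7"}',
    '{"a": "n/a"}',
]

profile = profile_keys(SAMPLE)
```

A key whose notIsNumber count is zero has only numeric-looking values in the sample, which makes it a candidate for a numeric type in the schema.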

Show tables, describe tables equivalent in redshift

I'm new to AWS. Can anyone tell me what Redshift's equivalents of these MySQL commands are?
show tables -- redshift command
describe table_name -- redshift command
All the information can be found in the PG_TABLE_DEF table (see the Redshift documentation).
Listing all tables in a public schema (default) - show tables equivalent:
SELECT DISTINCT tablename
FROM pg_table_def
WHERE schemaname = 'public'
ORDER BY tablename;
Description of all the columns from a table called table_name - describe table equivalent:
SELECT *
FROM pg_table_def
WHERE tablename = 'table_name'
AND schemaname = 'public';
Update:
As pointed out in Kishan Pandey's answer, if you are looking for details of a schema other than public, you need to set search_path to my_schema (show search_path displays the current search path).
Listing tables in my_schema schema:
set search_path to my_schema;
select * from pg_table_def;
I had to select from the information schema to get details of my tables and columns; in case it helps anyone:
SELECT * FROM information_schema.tables
WHERE table_schema = 'myschema';
SELECT * FROM information_schema.columns
WHERE table_schema = 'myschema' AND table_name = 'mytable';
Or simply:
\dt to show tables
\d+ <table name> to describe a table
Edit: Works using the psql command line client
Tomasz Tybulewicz's answer is a good way to go.
SELECT * FROM pg_table_def WHERE tablename = 'YOUR_TABLE_NAME' AND schemaname = 'YOUR_SCHEMA_NAME';
If the schema name is not in the search path, that query returns an empty result.
First check the search path with the statement below.
SHOW SEARCH_PATH
If the schema name is not in the search path, you can reset the search path.
SET SEARCH_PATH to '$user', public, YOUR_SCHEMA_NAME
You can use - desc / to see the view/table definition in Redshift. I have been using Workbench/J as a SQL client for Redshift and it gives the definition in the Messages tab adjacent to Result tab.
In the following post, I documented queries to retrieve TABLE and COLUMN comments from Redshift.
https://sqlsylvia.wordpress.com/2017/04/29/redshift-comment-views-documenting-data/
Enjoy!
Table Comments
SELECT n.nspname AS schema_name
, pg_get_userbyid(c.relowner) AS table_owner
, c.relname AS table_name
, CASE WHEN c.relkind = 'v' THEN 'view' ELSE 'table' END
AS table_type
, d.description AS table_description
FROM pg_class As c
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace
LEFT JOIN pg_description As d
ON (d.objoid = c.oid AND d.objsubid = 0)
WHERE c.relkind IN('r', 'v') AND d.description > ''
ORDER BY n.nspname, c.relname ;
Column Comments
SELECT n.nspname AS schema_name
, pg_get_userbyid(c.relowner) AS table_owner
, c.relname AS table_name
, a.attname AS column_name
, d.description AS column_description
FROM pg_class AS c
INNER JOIN pg_attribute As a ON c.oid = a.attrelid
INNER JOIN pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace
LEFT JOIN pg_description As d
ON (d.objoid = c.oid AND d.objsubid = a.attnum)
WHERE c.relkind IN('r', 'v')
AND a.attname NOT
IN ('cmax', 'oid', 'cmin', 'deletexid', 'ctid', 'tableoid','xmax', 'xmin', 'insertxid')
ORDER BY n.nspname, c.relname, a.attname;
Shortcut
\d to show all tables
\d tablename to describe table
\? for more shortcuts for redshift
Redshift now supports SHOW TABLE:
show table analytics.dw_users
https://forums.aws.amazon.com/ann.jspa?annID=8641
You can simply use the command below to describe a table.
desc table-name
or
desc schema-name.table-name