Convert the timestamp when accessing a VDB (Teiid) - teiid

I am new to Teiid and VDBs. I need to extract VDB data through PostgreSQL, filtering rows against a converted timestamp.
Here is the sample query that I am trying to execute:
select col1, col2, col3 from (EXEC my_Views.get_mytable('2022-01-01 00:00:00.000', now())) as TAB where my_date >= to_char(to_timestamp('2022-06-15T08:27:00.599Z','YYYY-MM-DD"T"HH24:MI:SS'),'YYYY-MM-DD HH24:MI:SS')
The problem is that the conversion in the WHERE clause never happens.
I get a different error each time I try a different conversion approach in Postgres:
(1)
ERROR: TEIID31100 Parsing error: Encountered
"'2022-06-15T08:27:00.599Z','YYYY-MM-DD"T"HH24:MI:SS),
'[]YYYY[]-MM" at line 1, column 672.Was expecting: "and" | "between"
| "in" | "is" | "like" | "like_regex" | "not" | "or" | "order" |
"similar" ...org.teiid.jdbc.TeiidSQLException: TEIID31100 Parsing
error: Encountered
"'2022-06-15T08:27:00.599Z','YYYY-MM-DD"T"HH24:MI:SS),
'[]YYYY[]-MM" at line 1, column 672.Was expecting: "and" | "between"
| "in" | "is" | "like" | "like_regex" | "not" | "or" | "order" |
"similar" ...;Error while executing the query nil
(2)
ERROR: TEIID30068 The function 'date_trunc('second',
cast('2022-06-15T08:31:00.731Z' AS timestamp))' is an unknown form.
Check that the function name and number of arguments is
correct.org.teiid.jdbc.TeiidSQLException: TEIID30068 The function
'date_trunc('second', cast('2022-06-15T08:31:00.731Z' AS timestamp))'
is an unknown form. Check that the function name and number of
arguments is correct.;Error while executing the query nil
What I have tried:
select col1, col2, col3 from (EXEC my_Views.get_mytable('2022-01-01 00:00:00.000', now())) as TAB where my_date >= to_char(to_timestamp('2022-06-15T08:27:00.599Z','YYYY-MM-DD"T"HH24:MI:SS'),'YYYY-MM-DD HH24:MI:SS')
select col1, col2, col3 from (EXEC my_Views.get_mytable('2022-01-01 00:00:00.000', now())) as TAB where my_date >= date_trunc('second', '2022-06-15T08:27:00.599Z'::timestamp)
I believe fixing the timestamp conversion will resolve the problem.
Input TS: '2022-06-15T08:27:00.599Z' (YYYY-MM-DDTHH:mm:ss.SSSZ)
Expected TS: '2022-06-15 08:27:00' (YYYY-MM-DD HH:mm:ss)
I initially thought it was a Postgres issue, but later realized it is not. I have tried several conversion techniques in Postgres with no luck; those conversion queries work in DB Fiddle but not from Python against Postgres.
I hope this is enough to understand the issue. Please help me resolve it.
Thanks.

There is no specific Teiid function to convert to my desired timestamp format, so I used substr to capture the required part of the timestamp. It worked.
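For reference, a minimal sketch of that substring approach against the query from the question (the character positions assume the 'YYYY-MM-DDTHH:MI:SS.MSZ' shape of the input, and substring/replace/cast are assumed to be available in your VDB, so verify before relying on it):
-- Sketch: keep the first 19 characters of the ISO timestamp and swap the 'T' for a space,
-- giving 'YYYY-MM-DD HH:MI:SS', which is then cast to a timestamp for the comparison
select col1, col2, col3
from (EXEC my_Views.get_mytable('2022-01-01 00:00:00.000', now())) as TAB
where my_date >= cast(replace(substring('2022-06-15T08:27:00.599Z', 1, 19), 'T', ' ') as timestamp)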
Reference link teiid

Related

Django "Join" and "count" - Easy in psql, not so trivial in Django

User id 8a0615d2-b123-4714-b76e-a9607a518979 has many entries in the mylog table, each with an ip_id field. I'd like to see a weighted list of these ip_id values.
In SQL I use:
select distinct(ip_id), count(ip_id) from mylog
where user_id = '8a0615d2-b123-4714-b76e-a9607a518979'
group by ip_id
This gets me:
ip_id count
--------------------------------------+--------
84285515-0855-41f4-91fb-bcae6bf840a2 | 187
fc212052-71e3-4489-86ff-eb71b73c54d9 | 102
687ab635-1ec9-4c0a-acf1-3a20d0550b7f | 84
26d76a90-df12-4fb7-8f9e-a5f9af933706 | 18
389a4ae4-1822-40d2-a4cb-ab4880df6444 | 10
b5438f47-0f3a-428b-acc4-1eb9eae13c9e | 3
Now I am trying to get the same result in Django. It's surprisingly elusive.
Getting the user:
u = User.objects.get(id='8a0615d2-b123-4714-b76e-a9607a518979') #this works fine.
I tried:
logs = MyLog.objects.filter(Q(user=u) & Q(ip__isnull=False)).values('ip').annotate(total=Count('ip', distinct=True))
I am getting 6 rows in logs, which is fine, but the count is always 6, not the weight (count) of each unique ip as in the SQL result above.
What am I doing wrong?
You seem to be mistaken about what the keyword argument distinct does in the Count function. It simply means you want to count only the distinct values (which is not what you want here). In fact, the distinct(ip_id) part of your SQL query is also redundant, since you group by that column anyway.
Furthermore, you wrote .value('ip'), which is a typo and should be .values('ip').
So your ORM query should be:
logs = MyLog.objects.filter(Q(user=u) & Q(ip__isnull=False)).values('ip').annotate(total=Count('ip'))
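For reference, the corrected ORM call should translate into SQL along these lines (a sketch only; the real table and column names depend on your Django models and app label):
-- Approximate SQL produced by the corrected queryset
SELECT ip_id, COUNT(ip_id) AS total
FROM mylog
WHERE user_id = '8a0615d2-b123-4714-b76e-a9607a518979'
  AND ip_id IS NOT NULL
GROUP BY ip_id;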

Getting table information for Redshift `stl_load_errors` errors

I am using the Redshift COPY command to load data into a Redshift table from S3. When something goes wrong, I typically get an error ERROR: Load into table 'example' failed. Check 'stl_load_errors' system table for details. I can always look up stl_load_errors manually to get the details. Now I am trying to figure out how to do that automatically.
From the documentation it looks like the following query should give me all the details I need:
SELECT *
FROM stl_load_errors errors
INNER JOIN svv_table_info info
ON errors.tbl = info.table_id
AND info.schema = '<schema-name>'
AND info.table = '<table-name>'
However it always returns nothing. I also tried using stv_tbl_perm instead of svv_table_info, and still nothing.
After some troubleshooting, I see two things I don't understand:
I see multiple different IDs in stv_tbl_perm and svv_table_info for the same exact table. Why is that?
I see the tbl field on stl_load_errors referencing ids that do not exist in stv_tbl_perm or svv_table_info. Again, why?
It feels like I am not understanding something about the structure of these tables, but what that is completely escapes me.
This is because tbl and table_id have different types: the first is an integer, the second is an oid. When you cast the oid to an integer, the columns have the same values. You can check with this query:
SELECT table_id::integer, table_id
FROM SVV_TABLE_INFO
I get results when I execute:
SELECT errors.tbl, info.table_id::integer, info.table_id, *
FROM stl_load_errors errors
INNER JOIN svv_table_info info
ON errors.tbl = info.table_id
Please note that the inner join condition is ON errors.tbl = info.table_id.
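If the comparison does not match rows as written on your cluster, an explicit cast in the join condition is worth trying (a sketch against the same system views; the selected columns are the ones documented for stl_load_errors and svv_table_info):
-- Sketch: join load errors to table metadata with an explicit cast on table_id
SELECT info."schema", info."table", errors.filename,
       errors.line_number, btrim(errors.err_reason) AS err_reason
FROM stl_load_errors errors
JOIN svv_table_info info
  ON errors.tbl = info.table_id::integer
ORDER BY errors.starttime DESC
LIMIT 20;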
I finally got to the bottom of it, and it is surprisingly boring and probably not useful to many ...
I had an existing table. My code that created the table was wrapped in a transaction, and it dropped the table inside that transaction. The code that queried stl_load_errors ran outside the transaction. So the table_id outside and inside the transaction were different, because it was effectively a different table.
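An alternative that sidesteps the tbl/table_id mapping altogether is to filter stl_load_errors by the query id of the COPY that just ran, via PG_LAST_COPY_ID() (a sketch; it only works in the same session that issued the COPY):
-- Sketch: fetch errors for the most recent COPY in this session,
-- avoiding the join against table metadata entirely
SELECT query, filename, line_number, colname, err_code, btrim(err_reason) AS err_reason
FROM stl_load_errors
WHERE query = pg_last_copy_id()
ORDER BY starttime;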
You could try looking by filename. Doesn't really answer the question about joining the various tables, but I use a query like so to group up files that are part of the same manifest file and let me compare it to the maxerror setting:
select min(starttime) over (partition by substring(filename, 1, 53)) as starttime,
substring(filename, 1, 53) as filename, btrim(err_reason) as err_reason, count(*)
from stl_load_errors where filename like '%/some_s3_path/%'
group by starttime, filename, err_reason order by starttime desc;
This worked for me without any casting:
schemaz=# select i.database, e.err_code from stl_load_errors e join svv_table_info i on e.tbl=i.table_id limit 5
schemaz-# ;
database | err_code
-----------+----------
schemaz | 1204
schemaz | 1204
schemaz | 1204
schemaz | 1204
schemaz | 1204

Declare a variable in RedShift

SQL Server has the ability to declare a variable, then call that variable in a query like so:
DECLARE @StartDate date;
SET @StartDate = '2015-01-01';
SELECT *
FROM Orders
WHERE OrderDate >= @StartDate;
Does this functionality work in Amazon Redshift? From the documentation, it looks like DECLARE is used solely for cursors. SET looks like the command I am after, but when I attempt to use it, I get an error.
set session StartDate = '2015-01-01';
[Error Code: 500310, SQL State: 42704] [Amazon](500310) Invalid operation: unrecognized configuration parameter "startdate";
Is it possible to do this in RedShift?
Slavik Meltser's answer is great. As a variation on this theme, you can also use a WITH construct:
WITH tmp_variables AS (
    SELECT
        '2015-01-01'::DATE AS StartDate,
        'some string' AS some_value,
        5556::BIGINT AS some_id
)
SELECT *
FROM Orders
WHERE OrderDate >= (SELECT StartDate FROM tmp_variables);
Actually, you can simulate a variable using a temporary table: create one, set the data, and you are good to go.
Something like this:
CREATE TEMP TABLE tmp_variables AS SELECT
    '2015-01-01'::DATE AS StartDate,
    'some string' AS some_value,
    5556::BIGINT AS some_id;
SELECT *
FROM Orders
WHERE OrderDate >= (SELECT StartDate FROM tmp_variables);
The temp table will be deleted after the transaction finishes.
Temp tables are bound to a session (connection) and therefore cannot be shared across sessions.
No, Amazon Redshift does not have the concept of variables. Redshift presents itself as PostgreSQL, but is highly modified.
There was mention of User Defined Functions at the 2014 AWS re:Invent conference, which might meet some of your needs.
Update in 2016: Scalar User Defined Functions can perform computations but cannot act as stored variables.
Note that if you are using the psql client to query, psql variables can still be used as always with Redshift:
$ psql --host=my_cluster_name.clusterid.us-east-1.redshift.amazonaws.com \
--dbname=your_db --port=5432 --username=your_login -v dt_format=DD-MM-YYYY
# select current_date;
date
------------
2015-06-15
(1 row)
# select to_char(current_date,:'dt_format');
to_char
------------
15-06-2015
(1 row)
# \set
AUTOCOMMIT = 'on'
...
dt_format = 'DD-MM-YYYY'
...
# \set dt_format 'MM/DD/YYYY'
# select to_char(current_date,:'dt_format');
to_char
------------
06/15/2015
(1 row)
You can now use user defined functions (UDFs) to do what you want:
CREATE FUNCTION my_const()
RETURNS VARCHAR IMMUTABLE AS
$$ return 'my_string_constant' $$ language plpythonu;
Unfortunately, this does require certain access permissions on your redshift database.
Not an exact answer but in DBeaver, you can set up variables to use in your local queries in the IDE. Our team has found this helpful in testing before we put code into production.
From this answer: https://stackoverflow.com/a/58308439/220997
You should then be able to do:
@set date = '2019-10-09'
SELECT ${date}::DATE, ${date}::TIMESTAMP WITHOUT TIME ZONE
which produces:
| date | timestamp |
|------------|---------------------|
| 2019-10-09 | 2019-10-09 00:00:00 |
Again note: this only works in the DBeaver IDE. This SQL won't work when integrated into stored procedures or called from other tools.

Doctrine 2: There is no column with name '$columnName' on table '$table'

When I do:
vendor/bin/doctrine-module orm:schema-tool:update
Doctrine 2.4 gives me this error:
[Doctrine\DBAL\Schema\SchemaException]
There is no column with name 'resource_id' on table 'role_resource'.
My actual MySQL database schema has the column and the table, as evident from running this command (no errors thrown):
mysql> select resource_id from role_resource;
Empty set (0.00 sec)
Thus, the error must be somewhere in Doctrine's representation of the schema. I did a var_dump() of the $this object, and here is what I get (partial):
object(Doctrine\DBAL\Schema\Table)#546 (10) {
["_name" :protected] => string(13) "role_resource"
["_columns":protected] => array(0) { }
Note that indeed, the _columns key does not contain any columns, which is how Doctrine checks for column names.
In my case, the partial trace dump is as follows:
SchemaException.php#L85
Table.php#L252
Table.php#L161
Reading other posts with a similar problem seems to suggest that I may have an error in column case (upper vs. lower). While it is possible I have missed something, looking over my actual schema in the database and the annotations in my code suggests a match (all lowercase). Similarly, Doctrine 2's code does incorporate checks for such casing errors, so I am ruling out the casing possibility.
Another post I've seen suggests that there may be an error in my Annotations, i.e. wrong naming, syntax, or id placement. I don't know, I checked it and it seems fine. Here is what I have:
class Role implements HierarchicalRoleInterface
{
/**
* @var \Doctrine\Common\Collections\Collection
* @ORM\ManyToMany(targetEntity="ModuleName\Entity\Resource")
* @ORM\JoinTable(name="role_resource",
* joinColumns={@ORM\JoinColumn(name="role_id", referencedColumnName="id")},
* inverseJoinColumns={@ORM\JoinColumn(name="resource_id", referencedColumnName="id")}
* )
*/
protected $resource;
So at the moment I am stuck, and unable to use the ORM's schema-generation tools. This is a persistent error. I have scrapped my database and generated the schema anew using the ORM, but I still get stuck on this error whenever I try to do an update via the ORM, as I describe in this post. Where should I look next?
Update: traced it to this code:
$sql before this line ==
SELECT COLUMN_NAME AS Field,
COLUMN_TYPE AS Type,
IS_NULLABLE AS `Null`,
COLUMN_KEY AS `Key`,
COLUMN_DEFAULT AS `Default`,
EXTRA AS Extra,
COLUMN_COMMENT AS Comment,
CHARACTER_SET_NAME AS CharacterSet,
COLLATION_NAME AS CollactionName
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'loginauth' AND TABLE_NAME = 'role_resource'
which, when I run it from the MySQL prompt, returns (some columns were trimmed):
+-------------+---------+------+-----+--------------+----------------+
| Field | Type | Null | Key | CharacterSet | CollactionName |
+-------------+---------+------+-----+--------------+----------------+
| role_id | int(11) | NO | PRI | NULL | NULL |
| resource_id | int(11) | NO | PRI | NULL | NULL |
+-------------+---------+------+-----+--------------+----------------+
and $this->executeQuery($sql, $params, $types) returns the proper(?) statement, which runs fine at my prompt, but when ->fetchAll() is called, specifically this fetchAll(), it breaks down and returns an empty array. Can someone make sense of this?
MORE:
Essentially, from above links, $this->executeQuery($sql, $params, $types) returns:
object(Doctrine\DBAL\Driver\PDOStatement)#531 (1) {
["queryString"]=> string(332) "SELECT COLUMN_NAME AS Field, COLUMN_TYPE AS Type, IS_NULLABLE AS `Null`, COLUMN_KEY AS `Key`, COLUMN_DEFAULT AS `Default`, EXTRA AS Extra, COLUMN_COMMENT AS Comment, CHARACTER_SET_NAME AS CharacterSet, COLLATION_NAME AS CollactionName FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = 'loginauth' AND TABLE_NAME = 'role_resource'"
}
but then $this->executeQuery($sql, $params, $types)->fetchAll() (adding fetchAll()), returns this:
array(0) {
}
And that is so sad my friends :( because I don't know why it returns an empty array, when the statement in queryString above is so clearly valid and fruitful.
Check that the column names used in 'indexes' and 'uniqueConstraints' schema definitions actually exist:
For example using Annotations:
@ORM\Table(name="user_password_reset_keys", indexes={@ORM\Index(name="key_idx", columns={"key"})} )
I had renamed my column from 'key' to 'reset_key', and this column name mismatch caused the error:
/**
* @var string
*
* @ORM\Column(name="reset_key", type="string", length=255, nullable=false)
*/
private $resetKey;
It turns out that my DB permissions prevented my DB user from reading that particular table. Using GRANT SELECT ... fixed the issue. Also, the DBAL team traced it to a DB permissions peculiarity that returns NULL in MySQL and SQL Server.
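For illustration, a hypothetical form of that grant (the user and host below are placeholders, not the originals; the schema name is the one from the query above):
-- Hypothetical MySQL grant letting the Doctrine user read the table
GRANT SELECT ON loginauth.role_resource TO 'doctrine_user'@'localhost';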
Late answer, but maybe it helps someone.
I had the same problem. It was because I used Symfony's command line to create my entity and camel-cased some of its properties. Then, when Doctrine created the table, also via the command line, it changed the camel case to the underscore ("_") convention.
I was using an ORM designer tool called Skipper that exports my entities for me. My problem, for just one table out of 30, was that I was missing the name attribute in my annotations. Example:
@ORM\Column(name="isActive", ....
I added that attribute "name=" and it worked again!

Amazon RedShift: Unique Column not being honored

I use the following query to create my table.
create table t1 (url varchar(250) unique);
Then I insert about 500 URLs, twice. I expected that the second time, since the table already had the URLs, no new entries would show up, but instead the count doubles for:
select count(*) from t1;
What I want is that when I try to add a URL that is already in my table, it is skipped.
Have I declared something in my table declaration incorrectly?
I am using RedShift from AWS.
Sample
urlenrich=# insert into seed(url, source) select 'http://www.google.com', '1';
INSERT 0 1
urlenrich=# select * from seed;
url | wascrawled | source | date_crawled
-----------------------+------------+--------+--------------
http://www.google.com | 0 | 1 |
(1 row)
urlenrich=# insert into seed(url, source) select 'http://www.google.com', '1';
INSERT 0 1
urlenrich=# select * from seed;
url | wascrawled | source | date_crawled
-----------------------+------------+--------+--------------
http://www.google.com | 0 | 1 |
http://www.google.com | 0 | 1 |
(2 rows)
Output of \d seed
urlenrich=# \d seed
Table "public.seed"
Column | Type | Modifiers
--------------+-----------------------------+-----------
url | character varying(250) |
wascrawled | integer | default 0
source | integer | not null
date_crawled | timestamp without time zone |
Indexes:
"seed_url_key" UNIQUE, btree (url)
Figured out the problem
Amazon RedShift does not enforce constraints...
As explained here
http://docs.aws.amazon.com/redshift/latest/dg/t_Defining_constraints.html
They said they may get around to changing it at some point.
NEW 11/21/2013
RDS has added support for Postgres; if you need unique constraints and the like, a Postgres RDS instance is now the best way to go.
In Redshift, constraints are recommended but not enforced; they just help the query planner choose better ways to execute the query.
Usually, columnar databases do not manage indexes or constraints.
Although Amazon Redshift doesn't support unique constraints, there are some ways to delete duplicated records that can be helpful.
See the following link for the details.
copy data from Amazon s3 to Red Shift and avoid duplicate rows
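For illustration, a minimal dedup sketch along those lines (assuming the t1 table from the question, with url as its only column):
-- Sketch: rebuild the table from its distinct rows via a temp table
BEGIN;
CREATE TEMP TABLE t1_dedup AS SELECT DISTINCT url FROM t1;
DELETE FROM t1;
INSERT INTO t1 SELECT url FROM t1_dedup;
DROP TABLE t1_dedup;
COMMIT;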
Primary and unique key enforcement in distributed systems, never mind column store systems, is difficult. Both RedShift (ParAccel) and Vertica face the same problems.
The challenge with a column store is that the question being asked is "does this table row have a relevant entry in another table row?", but column stores are not designed for row operations.
In HP Vertica there is an explicit command to report on constraint violations.
In Redshift it appears that you have to roll your own.
SELECT COUNT(*) AS TotalRecords, COUNT(DISTINCT {your PK_Column}) AS UniqueRecords
FROM {Your table}
HAVING COUNT(*)> COUNT(DISTINCT {your PK_Column})
Obviously, if you have a multi-column PK you have to do something more heavyweight.
SELECT COUNT(*)
FROM (
    SELECT {PkColumns}
    FROM {Your Table}
    GROUP BY {PKColumns}
    HAVING COUNT(*) > 1
) AS DT
If the above returns a value greater than zero then you have a primary key violation.
For anyone who:
Needs to use redshift
Wants unique inserts in a single query
Doesn't care too much about query performance
Only really cares about inserting a single unique value at a time
Here's an easy way to get it done
INSERT INTO MY_TABLE (MY_COLUMNS)
SELECT MY_UNIQUE_VALUE
WHERE MY_UNIQUE_VALUE NOT IN (
    SELECT MY_UNIQUE_VALUE
    FROM MY_TABLE
    WHERE MY_UNIQUE_COLUMN = MY_UNIQUE_VALUE
)
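Applied to the seed table from the question, that pattern would look roughly like this (a sketch with a hard-coded value):
-- Sketch: insert the URL only if it is not already present in seed
INSERT INTO seed (url, source)
SELECT 'http://www.google.com', 1
WHERE 'http://www.google.com' NOT IN (SELECT url FROM seed);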