Column does not exist AWS Timestream Query error - amazon-web-services

I am trying to apply a WHERE clause to a dimension of my AWS Timestream records, but I get the error: Column does not exist
Here is my table schema:
The table schema
The table measure
First, I will show all the sample data I put in the table
SELECT username, time, manual_usage
FROM "meter-reading"."meter-metrics"
ORDER BY time DESC
LIMIT 4
The result:
Result
What I wanted to do is to query and filter the records by the Dimension ("username" specifically).
SELECT *
FROM "meter-reading"."meter-metrics"
WHERE measure_name = "OnceADay"
ORDER BY time DESC LIMIT 10
Then I got the Error: Column 'OnceADay' does not exist
I looked up the quotas and naming rules for dimension names to check my schema for errors:
https://docs.aws.amazon.com/timestream/latest/developerguide/ts-limits.html#limits.naming
https://docs.aws.amazon.com/timestream/latest/developerguide/ts-limits.html#limits.system_identifier
But as far as I can tell, my "username" dimension doesn't violate any of the above rules.
I also checked some queries on the AWS blog, where the author filters on a dimension in the WHERE clause without any problem:
https://aws.amazon.com/blogs/database/effective-queries-for-common-query-patterns-in-amazon-timestream/

I figured it out after trying the sample code. It turned out to be a silly mistake, I believe.
Using single quotes (') instead of double quotation marks (") around the value solved my problem.
SELECT *
FROM "meter-reading"."meter-metrics"
WHERE username = 'OnceADay'
ORDER BY time DESC LIMIT 10
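As far as I understand it, Timestream follows the usual SQL convention here: double quotes mark identifiers (database, table, and column names) and single quotes mark string literals, which would explain why "OnceADay" was reported as a missing column. A minimal sketch showing both kinds of quoting side by side, reusing the table and columns from the question:

```sql
-- Double quotes: identifiers (database, table, column names).
-- Single quotes: string literals (the value to compare against).
SELECT "username", time, manual_usage
FROM "meter-reading"."meter-metrics"
WHERE "username" = 'OnceADay'
ORDER BY time DESC
LIMIT 10
```

Quoting the column name is optional for a plain name like username; it is only needed for names such as "meter-reading" that contain special characters.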

Related

BigQuery MERGE statement billing more bytes than editor shows

I have a very large (3.5B records) table that I want to update/insert (upsert) using the MERGE statement in BigQuery. The source table is a staging table that contains only the new data, and I need to check if the record with a corresponding ID is in the target table, updating the row if so or inserting if not.
The target table is partitioned by an integer field called IdParent, and the matching is done on IdParent and another integer field called IdChild. My merge statement/script looks like this:
declare parentList array<int64>;
set parentList = array(select distinct IdParent from dataset.Staging);
merge into dataset.Target t
using dataset.Staging s
on
-- target is partitioned by IdParent, do this for partition pruning
t.IdParent in unnest(parentList)
and t.IdParent = s.IdParent
and t.IdChild = s.IdChild
when matched and t.IdParent in unnest(parentList) then
update
set t.Column1 = s.Column1,
t.Column2 = s.Column2,
...<more columns>
when not matched and IdParent in unnest(parentList) then
insert (<all the fields>)
values (<all the fields>)
;
So I:
Pull the IdParent list from the staging table to know which partitions to prune
Limit the partitions of the target table in the join predicate
Also limit the partitions of the target table in the matched/not matched conditions
The total size of dataset.Target is ~250GB. If I put this script in my BQ editor and remove all the IdParent in unnest(parentList) then it shows ~250GB to bill in the editor (as expected since there's no partition pruning). If I add the IdParent in unnest(parentList) back in so the script is exactly like you see it above i.e. attempting to partition prune, the editor shows ~97MB to bill. However, when I look at the query results, I see that it actually billed ~180GB:
The target table is also clustered on the two fields being matched, and I'm aware that the benefits of clustering are typically not shown in the editor's estimate. However, my understanding is that that should only make the bytes billed smaller... I can't think of any reason why this would happen.
Is this a BQ bug, or am I just missing something? BigQuery doesn't even say "the script is estimated to process XX MB", it says "This will process XX MB" and then it processes way more.
That's very interesting. What you did seems totally correct.
It seems the BQ query planner interprets your SQL correctly and recognizes the partition pruning when it produces the estimate, but fails to apply it when the query actually executes.
Try removing the in unnest(parentList) condition from both the when matched and when not matched clauses to see if the issue still happens, that is:
declare parentList array<int64>;
set parentList = array(select distinct IdParent from dataset.Staging);
merge into dataset.Target t
using dataset.Staging s
on
-- target is partitioned by IdParent, do this for partition pruning
t.IdParent in unnest(parentList)
and t.IdParent = s.IdParent
and t.IdChild = s.IdChild
when matched then
update
set t.Column1 = s.Column1,
t.Column2 = s.Column2,
...<more columns>
when not matched then
insert (<all the fields>)
values (<all the fields>)
;
If that doesn't resolve it, it would be a good idea to file a bug with BigQuery.

QuickSight could not generate any output column after applying transformation Error

I am running a query that works perfectly on AWS Athena. However, when I use Athena as a data source in QuickSight and try to run the query, it keeps giving me the error message "QuickSight could not generate any output column after applying transformation".
Here is my query:
WITH register as (
select created_at as register_time
, serial_number
, node_name
, node_visible_time_name
from table1
where type = 'register'),
bought as (
select created_at as bought_time
, node_name
, serial_number
from table1
where type= 'bought')
SELECT r.node_name
, r.serial_number
, r.register_time
, b.bought_time
, r.node_visible_time_name
FROM register r
LEFT JOIN bought b
ON r.serial_number = b.serial_number
AND r.node_name = b.node_name
AND b.bought_time between r.deploy_time and date(r.deploy_time + INTERVAL '1' DAY)
LIMIT 11;
I did some searching and found a similar question, "Quicksight custom query postgresql functions". In that case, INTERVAL '1' DAY was the problem. I've tried other alternatives but had no luck. Furthermore, running the query without it still produces the same error message.
No other lines seem to be transformed in any other way.
Re-creating the dataset and running the exact same query works.
I think queries that have previously been run on an existing dataset somehow transform the data. Please let me know if anyone knows why this is so.

SSAS Tabular/Analysis services Many-to-Many for Multiple currencies input - Multiple currencies output

I am trying to create on-the-fly currency conversion from many input currencies (InvoicesHeaders rows have different currencies, so each row has an amount and the currency code for that amount) to many output currencies (each affiliate wants to see figures in its own currency).
Therefore I end up with a many-to-many join between the Invoice table and the currency table. To join them, I created a concatenated field in SQL with the day and the currency code.
Then (reusing a tutorial from the internet) I created a calculation that does a lookup from the Invoice to the rate.
Amount adj:=SUMX(Invoices,Invoices[TotalInvoiceAmount]/LOOKUPVALUE(ExchangeRatesPerDay[Rate],ExchangeRatesPerDay[ToCurrencyConcatenatedday112],Invoices[CurrencyCodeConcatenateInvoiceDate112]))
However, when I try to use this measure in Excel (filtering on one currency at a time, of course), I get an error message saying multiple rows were passed but only one was expected.
From the error message, it looks like the lookup is returning multiple values, which is strange because in Excel I am filtering on one currency. Therefore, for each combination of day + currency code there should be only one row. I checked this in SQL using the following query:
with cte as (
SELECT [RateTypeName]
,[FromCurrency]
,[ToCurrency]
,[StartDate]
,[Rate]
,[EndDate]
,[ConversionFactor]
,[RateTypeDescription]
,[dday]
,[dday112]
,[ToCurrencyConcatenatedday112]
,[FromCurrencyConcatenatedday112]
, count(*) over (partition by [ToCurrencyConcatenatedday112],FromCurrency ) as co
FROM [stg].[ExchangeRatesPerDay]
)
select * from cte where co>1
And it doesn't return any records.
I would appreciate any ideas you may have.
Regards
Vincent
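One thing that might be worth checking: the query above partitions by both [ToCurrencyConcatenatedday112] and FromCurrency, while the LOOKUPVALUE in the measure matches on ToCurrencyConcatenatedday112 alone. I'm not certain whether the Excel filter restricts the rows that LOOKUPVALUE searches, so a check on just the lookup key may help narrow things down. A sketch against the same staging table:

```sql
-- Look for lookup keys that appear on more than one row,
-- ignoring FromCurrency (the measure's LOOKUPVALUE does not match on it).
SELECT [ToCurrencyConcatenatedday112],
       COUNT(*) AS co
FROM [stg].[ExchangeRatesPerDay]
GROUP BY [ToCurrencyConcatenatedday112]
HAVING COUNT(*) > 1;
```

If this returns rows, the same key maps to several rates (for example, one per source currency), which would be consistent with the "multiple rows" error.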
I don't understand why my answer was deleted. Anyway, I am posting it again. This website provides an answer: https://www.kasperonbi.com/currency-conversion-in-dax-for-power-bi-and-ssas/
I have been using this logic in production for over a month and it works great.

How to setup an AWS Athena query with multiple regex replacements?

I have been working on an AWS Athena query and have got far enough to retrieve my data. However, I need to identify some patterns in the data and rewrite them in a uniform way in order to group the "similar" values. So I'm trying to use REGEXP_REPLACE, but how can I apply multiple replacements to the same column and keep the result under the same column name?
Here's my query:
with q as (SELECT r.key,
r.otherid,
r.complexString,
minute(date_trunc('minute', from_iso8601_timestamp(r.time) AT TIME ZONE 'America/New_York')) AS minute,
hour(from_iso8601_timestamp(r.time) AT TIME ZONE 'America/New_York') AS hour,
day(from_iso8601_timestamp(r.time) AT TIME ZONE 'America/New_York') AS day
FROM requests0918 t
JOIN requests0918 t1 ON t.id = t1.id
WHERE t1.msg = 'response_written' AND t1.code = '200'
and t.otherid is not null
and t.key is not null
and t.path is not null
limit 10)
Select q.key, q.otherid, REGEXP_REPLACE(q.complexString, '\/accounts\/[0-9]+\/balances', '/accounts/.../balances' ) as path, q.minute, q.hour, q.day from q
So I'm successfully changing these strings into the normalized form, but I need to add more patterns and have all the replacements end up under the same column name. So I'm looking for a way to do that. I could add more layers of with q as {Query} to add more rules, but that sounds pretty wrong.
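One way to do this without extra CTE layers might be to nest the REGEXP_REPLACE calls, so that the output of one replacement becomes the input of the next. A sketch of the final SELECT, assuming q is the CTE defined above; the second pattern (/users/ followed by digits) is a made-up example and not taken from the data:

```sql
SELECT q.key,
       q.otherid,
       REGEXP_REPLACE(
           -- Inner call runs first: the original rule from the question.
           REGEXP_REPLACE(q.complexString,
                          '\/accounts\/[0-9]+\/balances',
                          '/accounts/.../balances'),
           -- Outer call runs on the inner result (hypothetical second rule).
           '\/users\/[0-9]+',
           '/users/...'
       ) AS path,
       q.minute,
       q.hour,
       q.day
FROM q
```

Each additional rule wraps one more REGEXP_REPLACE around the expression, and the result still lands in the single path column. The order of nesting matters if one rule's output could match another rule's pattern.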

Adding LIMIT fixes "Invalid digit, Value N" error in Amazon Redshift. Why?

I have a standard listings table on Redshift where all columns are varchar (because of how the data was loaded into the database).
This query (simplified) gives me an error:
with AL as (
select
L.price::int as price
from listings L
where L.price <> 'NULL'
and L.listing_type <> 'NULL'
)
select price from AL
where price < 800
and the error:
-----------------------------------------------
error: Invalid digit, Value 'N', Pos 0, Type: Integer
code: 1207
context: NULL
query: 2422868
location: :0
process: query0_24 [pid=0]
-----------------------------------------------
If I remove the where price < 800 condition, the query returns just fine... but I need the where condition to be there.
I've also checked that the values in the price field are valid numbers, and they all look good.
After playing around, this actually makes it work, and I can't quite explain why.
with AL as (
select
L.price::int as price
from listings L
where L.price <> 'NULL'
and L.listing_type <> 'NULL'
limit 10000000000
)
select price from AL
where price < 800
Note that the table has far fewer records than the number stated in the limit.
Can anyone (possibly from the Redshift engineering team) explain why this is the way it is? Possibly something to do with how the query plan is executed and parallelized?
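A plausible explanation, offered as an assumption and not something confirmed by the Redshift team: SQL does not promise that the filters inside the CTE run before the cast, so the planner may fold the CTE into the outer query and evaluate price::int < 800 on rows where price is still the string 'NULL' (hence Value 'N'). A LIMIT would then act as a barrier that forces the CTE to be evaluated first. A sketch of a workaround that does not depend on evaluation order, assuming valid prices are non-negative whole numbers with no surrounding whitespace:

```sql
with AL as (
    select
        -- Cast only values made up entirely of digits; anything else becomes NULL.
        case when L.price ~ '^[0-9]+$' then L.price::int end as price
    from listings L
    where L.price <> 'NULL'
      and L.listing_type <> 'NULL'
)
select price from AL
where price < 800
```

Rows whose price is not purely digits produce a NULL price, and NULL < 800 is not true, so they drop out of the result without ever being cast.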
I had a query that could be expressed simply as:
SELECT TOP 10 field1, field2
FROM table1
INNER JOIN table2
ON table1.field3::int = table2.field3
ORDER BY table1.field1 DESC
Removing the explicit cast to ::int solved a similar error for me.
Meanwhile, my local PostgreSQL requires the "::int" for the query to work.
For what it's worth, my local PostgreSQL version is
PostgreSQL 9.6.4 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit
Loading CSV data with NaN into AWS Redshift
I found this post while searching Google, but the above link had what I needed. I was importing a numeric column containing the value NaN, which is not supported by Redshift's numeric type.