I have a BigQuery table that has a timestamp column stored as a string, in the format 20090630 16:36:23:880. How can I convert it to a proper timestamp?
parse_datetime('%Y%m%d %H:%M:%E3S', '20090630 16:36:23.880')
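Note that %E3S expects a dot before the fractional seconds, while the source string has a colon (16:36:23:880), so that last colon has to become a dot first. A full-query sketch, assuming a hypothetical table my_dataset.my_table with the string column ts_string:

SELECT
  PARSE_DATETIME(
    '%Y%m%d %H:%M:%E3S',
    REGEXP_REPLACE(ts_string, r':(\d{3})$', r'.\1')  -- rewrite the final ':' before the milliseconds as '.'
  ) AS ts
FROM `my_dataset.my_table`;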
I am trying to load a CSV from GCS which contains timestamps in one of the columns.
When I upload it via the BQ interface, I get the following error:
Could not parse '2018-05-03 10:25:18.257000000' as DATETIME for field creation_date (position 6) starting at location 678732930 with message 'Invalid datetime string "2018-05-03 10:25:18.257000000"'
Is the issue here the trailing 0's? How would I fix the issue using Python?
Thanks in advance
Yes, you are correct: the issue is the trailing 0s. A DATETIME field only allows 6 digits in the sub-second value.
Name | Range
DATETIME | 0001-01-01 00:00:00 to 9999-12-31 23:59:59.999999
To remove the trailing 0s, you can use Pandas to convert the column to a proper DATETIME format so it can be loaded into BigQuery. For testing purposes, I used a CSV file that contains a dummy value in column 0 and a DATETIME with trailing 0s in column 1.
Test,2018-05-03 10:25:18.257000000
Test1,2018-05-03 10:22:18.123000000
Test2,2018-05-03 10:23:18.234000000
Using this block of code, Pandas will convert column 1 to the proper DATETIME format:
import pandas as pd

df = pd.read_csv("data.csv", header=None)  # define your CSV file here
datetime_column = df.iloc[:, 1]  # change to the location of your DATETIME column
# parse the strings; pandas' %f handles sub-second digits up to nanoseconds
df.iloc[:, 1] = pd.to_datetime(datetime_column, format='%Y-%m-%d %H:%M:%S.%f')
df.to_csv("data.csv", header=False, index=False)  # write the cleaned values back to data.csv
print(df)  # print output for testing
This results in:
Test,2018-05-03 10:25:18.257
Test1,2018-05-03 10:22:18.123
Test2,2018-05-03 10:23:18.234
You can now load the updated CSV file into BigQuery via the BQ interface.
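If you'd rather not rewrite the file, an alternative sketch: load the column as STRING and truncate the extra digits at query time (the table and column names here are hypothetical):

SELECT
  PARSE_DATETIME('%Y-%m-%d %H:%M:%E6S', SUBSTR(creation_date, 1, 26)) AS creation_date
FROM `my_dataset.raw_data`;

SUBSTR(creation_date, 1, 26) keeps the date, the time, and the first 6 sub-second digits, which is the maximum DATETIME precision.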
I'm having a problem converting this varchar into an AWS Athena datetime
"2012-06-10T11:33:25.202615+00:00"
I've tried something like date_parse(pickup, '%Y-%m-%dT%T')
I want to make a view like this using the timestamp already converted
CREATE OR REPLACE VIEW vw_ton AS
(
SELECT
id,
date_parse(pickup, timestamp) as pickup,
date_parse(dropoff, timestamp) as dropoff,
FROM "table"."ton"
)
You can use the parse_datetime() function:
presto> SELECT parse_datetime('2012-06-10T11:33:25.202615+00:00', 'YYYY-MM-dd''T''HH:mm:ss.SSSSSSZ');
            _col0
-----------------------------
 2012-06-10 11:33:25.202 UTC
(1 row)
(Verified on Presto 339. Note the Joda-style pattern: MM is months, while mm is minutes, so the month element must be uppercase.)
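Applied to the view from the question, a sketch using the same table and column names:

CREATE OR REPLACE VIEW vw_ton AS
SELECT
  id,
  parse_datetime(pickup, 'YYYY-MM-dd''T''HH:mm:ss.SSSSSSZ') AS pickup,
  parse_datetime(dropoff, 'YYYY-MM-dd''T''HH:mm:ss.SSSSSSZ') AS dropoff
FROM "table"."ton";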
I am trying to use Athena to query some data I have stored in an S3 bucket in Parquet format. I have a field called datetime which is defined as a date data type in my AWS Glue Data Catalog.
When I try running the following query in Athena, I get the error below:
SELECT DISTINCT datetime
FROM "craigslist"."pq_craigslist_rental_data_parquet"
WHERE datetime > '2018-09-14'
ORDER BY datetime DESC;
And the error:
Your query has the following error(s):
SYNTAX_ERROR: line 3:16: '>' cannot be applied to date, varchar(10)
What am I doing wrong here? How can I properly filter this data by date?
The string literal that you provide has to be cast to a date in order to compare it to a date.
where datetime = date('2019-11-27')
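Applied to the query from the question, that becomes:

SELECT DISTINCT datetime
FROM "craigslist"."pq_craigslist_rental_data_parquet"
WHERE datetime > date('2018-09-14')
ORDER BY datetime DESC;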
It's having an issue with the string literal used for the date filter. Use WHERE datetime > DATE '2018-09-14'
Either from_iso8601_date or date should work.
SELECT DISTINCT datetime
FROM "craigslist"."pq_craigslist_rental_data_parquet"
WHERE datetime > from_iso8601_date('2018-09-14')
ORDER BY datetime DESC;
Both return a proper date object, which you can verify with:
SELECT typeof(from_iso8601_date('2018-09-14'))
A bit late here, but I had the same issue, and the only workaround I have found is:
WHERE datetime > (select date '2018-09-14')
I want to convert the string 20160101000000 into datetime format using an expression. I have used the date function below:
TO_DATE(PERIOD_END_DATE,'MM/DD/YYYY HH24:MI:SS')
But my target file is not loading. My session and workflow succeed, and my target and source are both flat files.
I want to change the string 20160101000000 into MM/DD/YYYY HH24:MI:SS for loading data into my target table.
You need to give the exact format of the incoming string, so that the TO_DATE function can understand that format and convert it into a date.
TO_DATE(PERIOD_END_DATE,'YYYYMMDDHH24MISS')
So here your date looks like YYYYMMDDHH24MISS (20160101000000).
There is often confusion with the TO_DATE function: it is in fact for converting a string into a date, and its format argument describes the pattern of the incoming string. If you then want to output a date field in a specified format, you must use TO_CHAR.
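Putting the two together for the case in the question: if the target port is a string in MM/DD/YYYY HH24:MI:SS form, a sketch of the expression would be

TO_CHAR(TO_DATE(PERIOD_END_DATE, 'YYYYMMDDHH24MISS'), 'MM/DD/YYYY HH24:MI:SS')

If the target port is a date/time type instead, the inner TO_DATE alone is enough.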
I have three variables stored as number, string and string, as shown below.
load_id = 100
t_date = '2014-06-18'
p_date = '19-JUN-14 10.51.45.378196'
I would like to insert them into a SQL Server table using Python 2.7. The SQL Server table structure is as follows
load_id = float
t_date = date
p_date = timestamp
In Oracle, we tend to use TO_DATE or TO_TIMESTAMP to convert the string to DATE or TIMESTAMP field.
I would like to know how I can do similar conversion while inserting into an SQL Server table.
Thanks in advance.
Convert with:
import datetime
import calendar

# parse the string into a datetime object using its exact format
thedate = datetime.datetime.strptime(p_date, '%d-%b-%y %H.%M.%S.%f')
# convert to a Unix timestamp (seconds since the epoch, treating the value as UTC)
thetimestamp = calendar.timegm(thedate.utctimetuple())
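If the end goal is the insert itself, you can usually skip formatting strings for SQL Server entirely: parse the strings into Python date/datetime objects and bind them as query parameters, and the driver does the conversion. A minimal sketch, assuming pyodbc and a hypothetical table my_table with columns load_id, t_date, and p_date:

import datetime
import pyodbc  # assumed driver; any DB-API module works similarly

load_id = 100
t_date = datetime.datetime.strptime('2014-06-18', '%Y-%m-%d').date()
p_date = datetime.datetime.strptime('19-JUN-14 10.51.45.378196', '%d-%b-%y %H.%M.%S.%f')

conn = pyodbc.connect('DSN=my_dsn')  # hypothetical connection details
cursor = conn.cursor()
cursor.execute(
    "INSERT INTO my_table (load_id, t_date, p_date) VALUES (?, ?, ?)",
    (load_id, t_date, p_date))
conn.commit()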
https://community.toadworld.com/platforms/sql-server/b/weblog/archive/2012/04/18/convert-datetime-to-timestamp
DECLARE @DateTimeVariable DATETIME

SELECT @DateTimeVariable = GETDATE()

SELECT @DateTimeVariable AS DateTimeValue,
       CAST(@DateTimeVariable AS TIMESTAMP) AS DateTimeConvertedToTimestampCAST

SELECT CAST(CAST(@DateTimeVariable AS TIMESTAMP) AS DATETIME) AS TimestampToDatetime
Do the conversion with SQL instead of trying to get Python to match the SQL format. Neither format above matches yours exactly; however, the DATETIME type should be adequate.