Redshift add new column based on values from existing column - amazon-web-services

I have a Redshift table that I want to alter by adding a new column whose values are derived from an existing column in the same table.
Basically, I only want to add a column "year" that extracts the year from the column "snapshot_date".
Any ideas how to achieve that? I tried the following code, but it errors out.
ALTER TABLE test_schema.table_name ADD year AS ( extract(year from snapshot_date) );
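Redshift's ALTER TABLE does not take a computed-column definition like the one above. One common workaround is to add an ordinary column and fill it with an UPDATE. Here is a minimal sketch of that pattern, run against an in-memory SQLite database from Python so that it is executable; the sample rows are made up, and SQLite's strftime stands in for Redshift's EXTRACT. On Redshift the two statements would presumably be ALTER TABLE test_schema.table_name ADD COLUMN year INTEGER; followed by UPDATE test_schema.table_name SET year = EXTRACT(year FROM snapshot_date); (not tested here).

```python
import sqlite3


def add_year_column(conn):
    """Add a plain INTEGER column and backfill it from snapshot_date."""
    # Step 1: add an ordinary (non-computed) column.
    conn.execute("ALTER TABLE table_name ADD COLUMN year INTEGER")
    # Step 2: backfill it. On Redshift the expression would be
    # EXTRACT(year FROM snapshot_date); strftime is the SQLite stand-in.
    conn.execute(
        "UPDATE table_name "
        "SET year = CAST(strftime('%Y', snapshot_date) AS INTEGER)"
    )
    conn.commit()


# Hypothetical sample data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER, snapshot_date TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(1, "2019-03-31"), (2, "2021-12-01")],
)

add_year_column(conn)
years = conn.execute("SELECT id, year FROM table_name ORDER BY id").fetchall()
print(years)
```

Note that a column filled this way is not maintained automatically: rows loaded later need the UPDATE to be run again.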

Related

Referencing single value of another column in Power Query Editor

I would like to get a single value from "table2.MappedValue" for every record in table1 in Power Query Editor.
I have two tables with a many-to-one relationship; table2 is just a mapping table:
table1: ID | Values
table2: ID | MappedValue
When I try Table.Column(#"table2","MappedValue"), I get a list and not a single value.
I can do that from Table tools -> New Column, but I was wondering whether it is possible in Power Query Editor.
You can do this by merging queries. In the query editor, go to the Home tab, select table1, click Merge Queries, and merge with table2 using ID as the matching column. Then expand the new column by clicking the double-arrow icon in its header and selecting the column you want.
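Conceptually, the merge is a left join on ID followed by expanding one column, which is why each table1 row ends up with a single value rather than a list. A minimal Python sketch of that logic (this is not Power Query M, and the sample rows are made up):

```python
# Hypothetical sample rows; the column names follow the question.
table1 = [
    {"ID": 1, "Values": "a"},
    {"ID": 2, "Values": "b"},
    {"ID": 1, "Values": "c"},
]
table2 = [
    {"ID": 1, "MappedValue": "X"},
    {"ID": 2, "MappedValue": "Y"},
]

# Index the mapping table by its key (one row per ID).
mapped_by_id = {row["ID"]: row["MappedValue"] for row in table2}

# Left join: keep every table1 row and attach its single mapped value
# (None when table2 has no matching ID).
merged = [{**row, "MappedValue": mapped_by_id.get(row["ID"])} for row in table1]

for row in merged:
    print(row)
```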

I can't sort a table column by a calculated column

Trying to sort a column in my custom date table (a csv file) via a calculated column in the same table but am seeing an error. The calculated column does not reference the column I wish to sort by. Here's the DAX for the calculated column:
PeriodOffset =
Dates[Period] + Dates[FiscalYear] * 13
- CALCULATE ( VALUES ( Dates[Period] ), Dates[Date] = TODAY () )
- CALCULATE ( VALUES ( Dates[FiscalYear] ), Dates[Date] = TODAY () ) * 13
My date table has every date from 2003/4 to 2034/35, along with custom period numbers, calendar and fiscal years etc. The column I am trying to sort is called PeriodFiscalYear. Each value in that column has only one entry in the PeriodOffset column so it's not that.
The weird thing is, I have had this working in a previous report. In this instance, I was simply trying to recreate the functionality but it won't do it. Even stranger, if I create the PeriodFiscalYear column as a calculated column (currently it's hard-coded in the csv file), it works! So I have a sort-of workaround, I would just like to understand what is going on.
Thanks
I believe this has to do with the fact that data columns are sorted when the data is ingested into Power BI, whereas calculated columns are only computed later.
Therefore:
a data column can only be sorted by another data column (because calculated columns have not been computed yet)
a calculated column can be sorted by either a data column or a calculated column
Solution (either one):
A) make PeriodFiscalYear a calculated column
B) make PeriodOffset a data column (either in your CSV or in Power Query)
I actually figured this out. The problem was with my data model - I had a circular relationship in there as I was deriving the Period column in one table using my calendar table then linking them back in the relationship!
I created a linking table with the keys in both to make the relationship, then hid it.
Thanks

Add column with hardcoded values in PowerBI

It's probably extremely simple but I can't find an answer. I have created a new column and I would like to use the DAX syntax to fill the column with hardcoded values.
I can write this: Column = 10 and I will get a column of 10s but let's say my table has 3 rows and I would like to insert a column with [10, 17, 155]. How can I do that?
Try using the DATATABLE function:
Table = DATATABLE("Column Name",INTEGER,{{10},{17},{155}})
You can also add more columns with their own data if you want to; see the documentation:
https://learn.microsoft.com/en-us/dax/datatable-function
Assuming your table has a primary key column, say, ID, you could create a new table with just the column you want to manually input.
ID Value
---------
1 10
2 17
3 155
You can create this table either through the Enter Data button or by using the DAX DATATABLE function, as @Deltapimol suggests.
Once you have this table you can create a relationship to your existing table in the data model at which point you can either use this new table in your report to get the values you need or if you really need them in the existing table for some reason, you can pull them over using the RELATED function in a calculated column.
Table1 = GENERATESERIES(1, 3)
Table2 = DATATABLE(
"ID", INTEGER,
"Value", INTEGER,
{{1, 10},{2, 17},{3, 155}}
)
Now you can create a relationship from Table1 to Table2[ID] and then define a calculated column on Table1 as follows:
ValueFromTable2 = RELATED(Table2[Value])
If you don't want to create a relationship, then you could use the LOOKUPVALUE function instead in a calculated column on Table1.

Adding multiple record count to a table

My students table has the following columns
student id | student year | test result | semester
I would like to group the records to see how many re-tests each student took in a particular semester.
I am trying to alter the table to add a total_tests_taken column and then fill it with an update statement like:
ALTER TABLE students
ADD (total_tests_taken NUMBER);
UPDATE students
SET total_tests_taken = (select count(*) OVER ( PARTITION BY student_id, semester) FROM students)
but my SQL fails with: "ORA-01427: single-row subquery returns more than one row"
What am I doing wrong?
Do I need to create a temp table and then do it?
Thanks
You are getting the error because the subquery returns one row for every row in students, while SET expects a single value for each row being updated. What you are trying to do can be accomplished with an UPDATE with JOIN statement, if your DBMS supports it. You can check the answers to this question for the syntax:
How can I do an UPDATE statement with JOIN in SQL?
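An alternative that avoids the join is a correlated subquery, so that the inner query returns exactly one count for the row being updated. The sketch below runs against an in-memory SQLite database from Python so that it is executable; the sample rows and the underscored column names are assumptions. The same UPDATE shape is plain SQL and should carry over to Oracle with little change, though that is not tested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "student_id INTEGER, student_year INTEGER, "
    "test_result INTEGER, semester TEXT)"
)
# Hypothetical sample data: student 1 took two tests in semester S1.
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [
        (1, 2020, 40, "S1"),
        (1, 2020, 65, "S1"),
        (1, 2020, 70, "S2"),
        (2, 2020, 80, "S1"),
    ],
)
conn.execute("ALTER TABLE students ADD COLUMN total_tests_taken INTEGER")

# The subquery is correlated with the outer row, so it yields a single
# count for that row's (student_id, semester) pair.
conn.execute(
    """
    UPDATE students
    SET total_tests_taken = (
        SELECT COUNT(*)
        FROM students s2
        WHERE s2.student_id = students.student_id
          AND s2.semester = students.semester
    )
    """
)
conn.commit()

counts = conn.execute(
    "SELECT student_id, semester, total_tests_taken "
    "FROM students ORDER BY student_id, semester, test_result"
).fetchall()
print(counts)
```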

Alter column data type in Amazon Redshift

How to alter column data type in Amazon Redshift database?
I am not able to alter the column data type in Redshift; is there any way to modify the data type in Amazon Redshift?
As noted in the ALTER TABLE documentation, you can change length of VARCHAR columns using
ALTER TABLE table_name
{
ALTER COLUMN column_name TYPE new_data_type
}
For other column types, all I can think of is to add a new column with the correct data type, copy all data from the old column into the new one, and finally drop the old column.
Use code similar to this:
ALTER TABLE t1 ADD COLUMN new_column ___correct_column_type___;
UPDATE t1 SET new_column = column;
ALTER TABLE t1 DROP COLUMN column;
ALTER TABLE t1 RENAME COLUMN new_column TO column;
This changes the schema: the newly added column will be last in the table. That may be a problem with the COPY statement, so keep it in mind (you can specify a column order with COPY).
To avoid the schema change mentioned by Tomasz:
BEGIN TRANSACTION;
ALTER TABLE <TABLE_NAME> RENAME TO <TABLE_NAME>_OLD;
CREATE TABLE <TABLE_NAME> ( <NEW_COLUMN_DEFINITION> );
INSERT INTO <TABLE_NAME> (<COLUMNS>)
SELECT <COLUMNS>
FROM <TABLE_NAME>_OLD;
DROP TABLE <TABLE_NAME>_OLD;
END TRANSACTION;
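To make the template concrete, here is a minimal sketch of the same rename, recreate, and copy-back sequence. It runs against an in-memory SQLite database from Python so that it is executable; the table t1, its columns, and the sample rows are assumptions for illustration, and Redshift-specific concerns (grants, sort and distribution keys) are outside what this shows.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT below control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t1 (id INTEGER, version TEXT, name TEXT)")
conn.execute("INSERT INTO t1 VALUES (1, '3', 'a'), (2, '10', 'b')")

conn.execute("BEGIN")
conn.execute("ALTER TABLE t1 RENAME TO t1_old")
# Same column order as before; only the type of `version` changes.
conn.execute("CREATE TABLE t1 (id INTEGER, version INTEGER, name TEXT)")
# The INSERT lists column names (not definitions) and casts the old values.
conn.execute(
    "INSERT INTO t1 (id, version, name) "
    "SELECT id, CAST(version AS INTEGER), name FROM t1_old"
)
conn.execute("DROP TABLE t1_old")
conn.execute("COMMIT")

columns = [row[1] for row in conn.execute("PRAGMA table_info(t1)")]
rows = conn.execute("SELECT * FROM t1 ORDER BY id").fetchall()
print(columns, rows)
```

Because the new table is created with the columns in their original positions, the column order is preserved, which is the point of this variant.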
(Recent update) It's possible to alter the type of VARCHAR columns in Redshift:
ALTER COLUMN column_name TYPE new_data_type
Example:
CREATE TABLE t1 (c1 varchar(100));
ALTER TABLE t1 ALTER COLUMN c1 TYPE varchar(200);
Here is the documentation link
If you don't want to change the column order, another option is to copy the data into a temp table, drop the original table, recreate it with the desired column size, and then bulk-load the data back.
CREATE TEMP TABLE temp_table AS SELECT * FROM original_table;
DROP TABLE original_table;
CREATE TABLE original_table ...
INSERT INTO original_table SELECT * FROM temp_table;
The only problem with recreating the table is that you will need to grant permissions again, and if the table is very big it will take some time.
ALTER TABLE publisher_catalogs ADD COLUMN new_version integer;
update publisher_catalogs set new_version = CAST(version AS integer);
ALTER TABLE publisher_catalogs DROP COLUMN version RESTRICT;
ALTER TABLE publisher_catalogs RENAME COLUMN new_version TO version;
Redshift, being a columnar database, doesn't allow you to modify the data type directly. Below is one approach; note that it changes the column order.
Steps:
1. Alter the table to add the new column.
2. Update the new column with the old column's values.
3. Alter the table to drop the old column.
4. Alter the table to rename the new column to the old column's name.
If you don't want to alter the order of the columns, the solution would be to:
1. Create a temp table with the new column name.
2. Copy the data from the old table to the new table.
3. Drop the old table.
4. Rename the new table to the old table's name.
One important point: create the new table using the LIKE clause instead of a plain CREATE.
This method works for converting a (big)int column into a varchar:
-- Create a backup of the original table
create table original_table_backup as select * from original_table;
-- Drop the original table, and then recreate with new desired data types
drop table original_table;
create table original_table (
col1 bigint,
col2 varchar(20) -- changed from bigint
);
-- insert original entries back into the new table
insert into original_table select * from original_table_backup;
-- cleanup
drop table original_table_backup;
You can use the statements below:
ALTER TABLE <table name --etl_proj_atm.dim_card_type >
ALTER COLUMN <col name --card_type> type varchar(30)
UNLOAD and COPY combined with a table-rename strategy should be the most efficient way to do this operation if retaining the table structure (row order) is important.
Here is an example adding to this answer.
BEGIN TRANSACTION;
ALTER TABLE <TABLE_NAME> RENAME TO <TABLE_NAME>_OLD;
CREATE TABLE <TABLE_NAME> ( <NEW_COLUMN_DEFINITION> );
UNLOAD ('select * from <TABLE_NAME>_OLD') TO 's3://bucket/key/unload_' manifest;
COPY <TABLE_NAME> FROM 's3://bucket/key/unload_manifest' manifest;
END TRANSACTION;
For updating values within the same column in Redshift, this works fine:
UPDATE table_name
SET column_name = 'new_value' WHERE column_name = 'old_value';
You can combine multiple conditions in the WHERE clause with AND so that the match is unambiguous.
cheers!!