I am writing a program to sync tables between two databases.
I use this common code:
DbSyncScopeDescription myScope = new DbSyncScopeDescription("myscope");
DbSyncTableDescription tblDesc = SqlSyncDescriptionBuilder.GetDescriptionForTable("Table", onPremiseConn);
myScope.Tables.Add(tblDesc);
My program creates the tracking table with only the primary key (the id column).
Syncing deletes and inserts works.
Updates do not: I need all the columns to be updated, but they are not (for example, a telephone column).
I read that I need to add the columns I want to sync manually with this code:
Collection<string> includeColumns = new Collection<string>();
includeColumns.Add("telephone");
...
includeColumns.Add(Last column);
and then change the table description this way:
DbSyncTableDescription tblDesc = SqlSyncDescriptionBuilder.GetDescriptionForTable("Table", includeColumns, onPremiseConn);
Is there a way to add all the columns of the table automatically?
Something like:
Collection<string> includeColumns = GetAllColums("Table");
Thanks,
SqlSyncDescriptionBuilder.GetDescriptionForTable("Table", onPremiseConn) already includes all the columns of the table.
The tracking table only stores the PK, any filter columns, and some Sync Fx-specific columns.
Tracking is at row level, not column level.
During sync, the tracking table and its base table are joined to get the rows to be synced.
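To illustrate why a key-only tracking table is enough, here is a minimal sketch in Python with sqlite3. It does not use Sync Framework itself: the customer and customer_tracking tables and the is_changed flag are hypothetical stand-ins, not the schema Sync Fx actually provisions. It only shows that joining a key-only tracking table back to its base table returns every column of a changed row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, telephone TEXT);
    -- Hypothetical tracking table: primary key plus change metadata only.
    CREATE TABLE customer_tracking (id INTEGER PRIMARY KEY, is_changed INTEGER);
    INSERT INTO customer VALUES (1, 'Ann', '555-0100'), (2, 'Bob', '555-0101');
    INSERT INTO customer_tracking VALUES (1, 0), (2, 0);
""")

# Update a non-key column and flag the row in the tracking table.
conn.execute("UPDATE customer SET telephone = '555-0199' WHERE id = 2")
conn.execute("UPDATE customer_tracking SET is_changed = 1 WHERE id = 2")

# Joining tracking to base yields the full row, including telephone,
# even though the tracking table never stored that column.
changed_rows = conn.execute("""
    SELECT c.id, c.name, c.telephone
    FROM customer_tracking AS t
    JOIN customer AS c ON c.id = t.id
    WHERE t.is_changed = 1
""").fetchall()
print(changed_rows)
```

If updates are still not arriving, the tracking table's column list is therefore unlikely to be the cause.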
I want to create a new column based on conditions.
The table looks like this:
[Image: table with two authors from UK]
I want to create new columns author_uk and country_uk holding the UK values:
[Image: table with new column]
I have an API which reads from two main tables, Table A and Table B.
Table A has a column which acts as a foreign key to Table B entries.
Inside the API flow, I have a method which runs the logic below:
1. Raw SQL: join Table A with some other tables and fetch the entries which have an active status in Table A.
2. From the result of the previous query, take the values from the Table A column and fetch the related rows from Table B using Django models.
It looks like this:
from django import db  # assumes db.connection is the default Django connection

query = "SELECT * FROM A WHERE status = 1"  # very simplified query, just for example
cursor = db.connection.cursor()
cursor.execute(query)
results = cursor.fetchall()  # the DB-API method is fetchall, not fetchAll
list_of_values = get_values_for_table_B(results)
b_records = list(B.objects.filter(values__in=list_of_values))
A background process inserts or updates data in Table A and Table B. That process does everything through models, wrapped in:
with transaction.atomic():
    do_update_entries()
However, the update does not modify rows in place. It deletes the old row in Table A and its related rows in Table B, then adds new rows to both tables.
If I run the API and the background job separately, everything is fine. When both run simultaneously, for many API calls the second query (on Table B) returns no data, because the statements interleave like this:
1. The raw SQL query on Table A executes and reads the old data.
2. The background job runs in a single transaction: it deletes the old data and inserts new data, with different foreign key values relating it to Table B.
3. The models query on Table B executes, referring to values already deleted by the previous transaction, so it finds no records.
So, to read everything in a single transaction, I tried the options below:
with transaction.atomic():
    # Raw SQL for Table A
    # Models query for Table B
This did not work; I still get the same issue.
I also tried it the other way around:
transaction.set_autocommit(False)
# Raw SQL for Table A
# Models query for Table B
transaction.commit()
transaction.set_autocommit(True)
But this did not work either. How can I run both queries in a single transaction so that the background job's updates do not affect this read process?
I am trying to insert data into a new column I added. Athena does not have an update table command. Is there any way to do this without reloading the whole table?
I created a test table and then added the column doing this:
ALTER TABLE MikeTest ADD COLUMNS (monthNum int);
I want to populate the column with this SQL expression:
month(date_parse("date", '%m/%d/%Y'))
Amazon Athena reads its data from Amazon S3. It is not possible to 'update' a table because this would require re-writing the files in S3.
You could create a new table with the additional column:
CREATE TABLE new_table
WITH (
    external_location = 's3://my_athena_results/folder/',
    format = 'Parquet',
    write_compression = 'SNAPPY'
)
AS
SELECT
    *,
    month(date_parse("date", '%m/%d/%Y')) AS month
FROM old_table
This will copy the data to a new location in S3 while populating the new column.
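As a quick way to check what the derived column should contain, the expression can be mirrored for a single value in Python. This is only a sketch: month_num and the sample rows are hypothetical, and it assumes the date strings follow the same %m/%d/%Y pattern passed to date_parse.

```python
from datetime import datetime


def month_num(date_text: str) -> int:
    """Mirror month(date_parse(date_text, '%m/%d/%Y')) for one value."""
    return datetime.strptime(date_text, "%m/%d/%Y").month


# Hypothetical sample rows standing in for old_table.
rows = [{"date": "03/15/2021"}, {"date": "12/01/2020"}]

# Equivalent of SELECT *, <expression> AS month: keep every column, add one.
new_rows = [{**row, "month": month_num(row["date"])} for row in rows]
print(new_rows)
```

Spot-checking a few values this way before running the CTAS can catch a mismatched date format early, since the CTAS rewrites the whole data set.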
I have a large target table with columns (id, value). I want to update value='old' to value='new'.
The simplest way would be: UPDATE target SET value='new' WHERE value='old';
However, this deletes and re-creates the rows, which may not be recommended. So I tried to do a merge column update:
-- staging
CREATE TABLE stage (LIKE target INCLUDING DEFAULTS);
INSERT INTO stage (SELECT id, value FROM target WHERE value='old');
UPDATE stage SET value='new' WHERE value='old'; -- ??? how do you update value?
-- merge
begin transaction;
UPDATE target
SET value = stage.value FROM stage
WHERE target.id = stage.id and target.distkey = stage.distkey; -- collocated join?
end transaction;
DROP TABLE stage;
This can't be the best way of populating the stage table: I still have to do all these UPDATE delete/writes when I update this way. Is there a way to do it in the INSERT?
Is it necessary to force the collocated join when I use CREATE TABLE LIKE?
Are you updating all the rows in the table?
If yes, you can use CTAS (CREATE TABLE AS), which is the recommended method.
Assuming your table looks like this:
table1
id, col1, col2, value
You can use the following SQL to create a new table:
CREATE TABLE tmp_table AS
SELECT id, col1, col2, 'new_value' AS value
FROM table1;
After you verify the data in tmp_table:
DROP TABLE table1;
ALTER TABLE tmp_table RENAME TO table1;
If you are not updating all the rows, you can use a filter in the CTAS and then insert the rest of the rows into the new table. Let me know if you need more info in that case.
CREATE TABLE tmp_table AS
SELECT id, col1, col2, 'new_value' AS value
FROM table1
WHERE value = 'old';
INSERT INTO tmp_table SELECT * FROM table1 WHERE value != 'old';
The next step would be to DROP table1 and rename tmp_table to table1.
Update: Based on your comment, you can do the following. Let me know if this solves your case.
This method creates a new table to replace your existing table.
I have reused some of your code:
CREATE TABLE stage (LIKE target INCLUDING DEFAULTS);
INSERT INTO stage SELECT id, 'new' FROM target WHERE value='old';
The INSERT above adds the rows to be updated with 'new' already set, so there is no need to run an UPDATE afterwards.
Then bring in the unchanged rows:
INSERT INTO stage SELECT id, value FROM target WHERE value!='old';
At this point the target table, your original table, is still intact.
The stage table holds both sets of rows: the updated rows with the 'new' value and the rows you did not want to change.
To replace target with stage:
DROP TABLE target;
or, to keep it for further verification:
ALTER TABLE target RENAME TO target_old;
ALTER TABLE stage RENAME TO target;
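The staging-and-swap sequence can be tried end to end on a toy table. The sketch below uses Python's sqlite3 rather than Redshift, so CREATE TABLE ... (LIKE ...) is replaced with an explicit column list and distribution keys play no part; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO target VALUES (1, 'old'), (2, 'keep'), (3, 'old');

    -- sqlite has no CREATE TABLE (LIKE ...), so the columns are spelled out.
    CREATE TABLE stage (id INTEGER PRIMARY KEY, value TEXT);

    -- Rows to change are written with the new value directly; no UPDATE needed.
    INSERT INTO stage SELECT id, 'new' FROM target WHERE value = 'old';
    -- Unchanged rows are copied as they are.
    INSERT INTO stage SELECT id, value FROM target WHERE value != 'old';

    -- Swap, keeping the original around for verification.
    ALTER TABLE target RENAME TO target_old;
    ALTER TABLE stage RENAME TO target;
""")

result = conn.execute("SELECT id, value FROM target ORDER BY id").fetchall()
original_count = conn.execute("SELECT COUNT(*) FROM target_old").fetchone()[0]
print(result, original_count)
```

One thing to watch: a row whose value is NULL matches neither value = 'old' nor value != 'old', so it would be left out of stage. Comparing row counts between the old and new tables before dropping anything catches that case.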
From a Redshift developer:
This case doesn't require an upsert, or update+insert, and it is fine to just run the update:
UPDATE target SET value='new' WHERE value='old';
Another way would be to INSERT the rows you need and DELETE the other rows, but that's unnecessarily complicated.
If a Cloud Spanner table is created with nullable columns, is it possible to add a NOT NULL constraint on a column without recreating the table?
You can add a NOT NULL constraint to a non-key column. You must first ensure that all rows actually do have values for the column. Spanner will scan the data to verify before fully applying the NOT NULL constraint. More information about how to alter tables is here and here.
However, you cannot add such a constraint to a key column. That kind of change would require rewriting all the data in the table, because the nullness of the key affects how the data is encoded. The only option for making that change is to create a new table that's set up the way you want, make code changes to support using both tables temporarily, gradually move the data from the old table to the new table, and eventually change the code to use only the new table and drop the old table. If you then also wanted the original table name back, you'd have to do the whole thing again.
Unfortunately, there is no way to add a NOT NULL column directly.
The way to do it:
1. Add a nullable column:
ALTER TABLE table1 ADD COLUMN column1 STRING(255)
2. Set a non-null value in the column for every existing row (if the table is not empty):
UPDATE table1 SET column1 = "<GENERATED DATA>" WHERE TRUE
3. Add the constraint:
ALTER TABLE table1 ALTER COLUMN column1 STRING(255) NOT NULL
Thanks.
Creating a non-nullable column in Spanner on an existing table is typically a three-step process:
# add new column to table
ALTER TABLE <table_name> ADD COLUMN <column_name> <value_type>;
# create default values
UPDATE <table_name> SET <column_name>=<default_value> WHERE TRUE;
# add constraint
ALTER TABLE <table_name> ALTER COLUMN <column_name> <value_type> NOT NULL;
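When the same three steps have to be repeated for several columns, a small helper can generate the statements in order. The following is a sketch only: not_null_column_statements is a hypothetical helper, it just builds strings and does not talk to Spanner, and it does no quoting or escaping, so it should only be fed trusted identifiers and literals.

```python
def not_null_column_statements(table, column, value_type, default_literal):
    """Return the three statements, in order, as plain strings.

    Hypothetical helper: inputs are interpolated as given, without escaping.
    """
    return [
        # 1. Add the column as nullable.
        f"ALTER TABLE {table} ADD COLUMN {column} {value_type}",
        # 2. Backfill every existing row with a non-null value.
        f"UPDATE {table} SET {column} = {default_literal} WHERE TRUE",
        # 3. Tighten the column to NOT NULL.
        f"ALTER TABLE {table} ALTER COLUMN {column} {value_type} NOT NULL",
    ]


statements = not_null_column_statements("table1", "column1", "STRING(255)", "''")
for statement in statements:
    print(statement)
```

The first and third statements are schema changes and the second is DML, so they would normally be submitted separately, with the backfill finished before the final ALTER is applied.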