Microsoft Sync Framework 'Expected column' error - microsoft-sync-framework

I've been using the Microsoft Sync Framework without problems for the last few months; I had added new columns to the database before and the synchronization between my local database and my server database kept working. Recently, I added some new columns to a table and now I get an error while synchronizing that specific table:
Expected column 'foo' was not found on the DataTable to be applied to the destination table 'MyTable'.
Parameter name: dataTable
System.ArgumentException: Expected column 'foo' was not found on the DataTable to be applied to the destination table 'MyTable'.
Parameter name: dataTable
at Microsoft.Synchronization.Data.SqlServer.SqlChangeHandler.SetColumnOrder(DataTable dataTable, Int32& updateKeyOrdinal, Int32& createKeyOrdinal)
at Microsoft.Synchronization.Data.SqlServer.SqlChangeHandler.ApplyBulkChanges(DataTable dataTable)
at Microsoft.Synchronization.Data.RelationalSyncProvider.ApplyChangesInternal(DbSyncScopeMetadata scopeMetadata, IDbTransaction transaction, FailedDeleteDelegate_type failedDeleteDelegate, DataSet dataSet, ChangeApplicationType applyType)
at Microsoft.Synchronization.Data.RelationalSyncProvider.ApplyChanges(DbSyncScopeMetadata scopeMetadata, IDbTransaction applyTransaction, DataSet dataSet, DbSyncSession DbSyncSession, Boolean commitTransaction, FailedDeleteDelegate_type failedDeleteDelegate, String batchFileName, ChangeApplicationAction& action)
at Microsoft.Synchronization.Data.RelationalSyncProvider.SingleTransactionApplyChangesAdapter.Apply(DataSet dataSet, Boolean commitTransaction, FailedDeleteDelegate_type failedDeleteDelegate, String batchFileName, ChangeApplicationAction& action)
at Microsoft.Synchronization.Data.RelationalSyncProvider.ApplyChanges(DbSyncScopeMetadata scopeMetadata, DataSet dataSet, DbSyncSession dbSyncSession, Boolean commitTransaction)
at Microsoft.Synchronization.Data.RelationalSyncProvider.ProcessChangeBatch(ConflictResolutionPolicy resolutionPolicy, ChangeBatch sourceChanges, Object changeDataRetriever, SyncCallbacks syncCallbacks, SyncSessionStatistics sessionStatistics)
at Microsoft.Synchronization.KnowledgeProviderProxy.ProcessChangeBatch(CONFLICT_RESOLUTION_POLICY resolutionPolicy, ISyncChangeBatch pSourceChangeManager, Object pUnkDataRetriever, ISyncCallback pCallback, _SYNC_SESSION_STATISTICS& pSyncSessionStatistics)
at Microsoft.Synchronization.CoreInterop.ISyncSession.Start(CONFLICT_RESOLUTION_POLICY resolutionPolicy, _SYNC_SESSION_STATISTICS& pSyncSessionStatistics)
at Microsoft.Synchronization.KnowledgeSyncOrchestrator.DoOneWaySyncHelper(SyncIdFormatGroup sourceIdFormats, SyncIdFormatGroup destinationIdFormats, KnowledgeSyncProviderConfiguration destinationConfiguration, SyncCallbacks DestinationCallbacks, ISyncProvider sourceProxy, ISyncProvider destinationProxy, ChangeDataAdapter callbackChangeDataAdapter, SyncDataConverter conflictDataConverter, Int32& changesApplied, Int32& changesFailed)
at Microsoft.Synchronization.KnowledgeSyncOrchestrator.DoOneWayKnowledgeSync(SyncDataConverter sourceConverter, SyncDataConverter destinationConverter, SyncProvider sourceProvider, SyncProvider destinationProvider, Int32& changesApplied, Int32& changesFailed)
at Microsoft.Synchronization.KnowledgeSyncOrchestrator.Synchronize()
at Microsoft.Synchronization.SyncOrchestrator.Synchronize()
I'm trying to download the entire table from the server to my local database. This has always worked, but now I suddenly get this error. The column foo exists in the table MyTable in both my local database and my server database. What could be the reason behind this error?

I couldn't find any solution to this, so I ended up creating a new database, running the migrations, and importing all the data from the old database. The problem seems to lie somewhere in the migration history of the database.

When you provision a table, a corresponding UDF (TVF) is created to represent it. This may have become out of sync with the base table when you ran your migrations.
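As a hedged diagnostic (not from the original answer): you can compare the base table's columns with the columns of the sync-generated objects directly in SQL Server to spot the missing 'foo'. The connection string and the MyTable_BulkType name below are assumptions based on the framework's usual naming; check what provisioning actually created in your database.

import pyodbc

# Hypothetical connection string; adjust server, database and authentication.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes;"
)
cur = conn.cursor()

# Columns of the base table.
cur.execute("SELECT name FROM sys.columns WHERE object_id = OBJECT_ID('dbo.MyTable')")
base_cols = {row[0] for row in cur.fetchall()}

# Columns of the user-defined table type used by the bulk-apply procedures
# (the type name is an assumption; list sys.table_types to find the real one).
cur.execute(
    "SELECT c.name "
    "FROM sys.table_types tt "
    "JOIN sys.columns c ON c.object_id = tt.type_table_object_id "
    "WHERE tt.name = 'MyTable_BulkType'"
)
type_cols = {row[0] for row in cur.fetchall()}

print("In base table but missing from sync objects:", base_cols - type_cols)

If the new column really is missing from the sync objects, deprovisioning and re-provisioning the sync scope (or rebuilding the database, as described above) is a common way out.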

Related

Transaction Management with Raw SQL and Models in a single transaction (Django 1.11.49)

I have an API which reads from two main tables, Table A and Table B.
Table A has a column which acts as a foreign key to Table B entries.
Now, inside the API flow, I have a method which runs the logic below:
Raw SQL: join Table A with some other tables and fetch the entries which have an active status in Table A.
From the result of the previous query, take the values from the Table A column and fetch the related rows from Table B using Django models.
It looks like this:
query = "Select * from A where status = 1" #Very simplified query just for example
cursor = db.connection.cursor()
cursor.execute(query)
results = cursor.fetchAll()
list_of_values = get_values_for_table_B(results)
b_records = list(B.objects.filter(values__in=list_of_values))
Now there is a background process which will insert or update new data in Table A and Table B. That process does everything through the models, using:
with transaction.atomic():
    do_update_entries()
However, the update is not just updating the old row. It deletes the old row and the related rows in Table B, and then inserts new rows into both tables.
Now the problem is: if I run the API and the background job separately, everything is fine, but when both run simultaneously, many API calls find no data with the second query against Table B, because the statements execute in the following order:
1. The raw SQL query against Table A executes and reads the old data.
2. The background job runs in a single transaction, deletes the old data, and inserts new data with different foreign key values relating it to Table B.
3. The model query against Table B executes and refers to values already deleted by the previous transaction, hence no records.
So, to read everything in a single transaction, I have tried the options below:
with transaction.atomic():
    # Raw SQL for Table A
    # Models query for Table B
This didn't work and I am still getting the same issue.
I tried another approach:
transaction.set_autocommit(False)
# Raw SQL for Table A
# Models query for Table B
transaction.commit()
transaction.set_autocommit(True)
But this didn't work either. How can I run both queries in a single transaction, so that the background job's updates do not affect this read?
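One hedged direction, assuming a backend like PostgreSQL where Django's default isolation level is READ COMMITTED: inside transaction.atomic() each statement still sees rows committed by the background job in between, so the transaction needs a stricter isolation level to give both reads one consistent snapshot. A minimal sketch; the query, helper and model names are taken from the question:

from django.db import connection, transaction

def read_a_and_b():
    with transaction.atomic():
        with connection.cursor() as cursor:
            # Must be the first statement of this transaction for PostgreSQL
            # to accept it; both reads then share one snapshot.
            cursor.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ")
            cursor.execute("SELECT * FROM A WHERE status = 1")  # simplified, as above
            results = cursor.fetchall()
        list_of_values = get_values_for_table_B(results)  # helper from the question
        return list(B.objects.filter(values__in=list_of_values))

Note that this only keeps the two reads consistent with each other; it does not block the background job from committing its changes.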

Writing from a Spark DataFrame to a BigQuery table gives BigQueryException: Provided Schema does not match

My PySpark job computes a DataFrame that I want to insert into a BigQuery table (from a Dataproc cluster).
On the BigQuery side, the partition field is REQUIRED.
On the DataFrame side, the inferred partition field is not REQUIRED, which is why I define a schema marking this field as REQUIRED:
StructField("date_part", DateType(), False)
So I create a new DataFrame with this schema, and when I print its schema I see, as expected:
date_part: date (nullable = false)
But my PySpark job ends with:
Caused by: com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryException: Provided Schema does not match Table xyz$20211115. Field date_part has changed mode from REQUIRED to NULLABLE
Is there something I missed?
Update: I am using the Spark 3.0 image and the spark-bigquery-latest_2.12.jar connector.
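For reference, a minimal sketch of the kind of write involved, since the question does not show the full job; the project, dataset, bucket and extra column names are hypothetical:

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DateType

spark = SparkSession.builder.appName("bq-write-sketch").getOrCreate()

# Schema with the partition field declared non-nullable, as in the question.
schema = StructType([
    StructField("some_value", StringType(), True),  # hypothetical payload column
    StructField("date_part", DateType(), False),    # nullable=False, i.e. REQUIRED
])

df = spark.createDataFrame([], schema)  # the real job fills this with data

# Write through the spark-bigquery connector; the table reference and the
# temporary GCS bucket are placeholders.
(df.write.format("bigquery")
    .option("table", "my_project.my_dataset.xyz")
    .option("temporaryGcsBucket", "my-temp-bucket")
    .mode("append")
    .save())

Depending on the connector version and the intermediate format it uses, the nullability declared in the Spark schema may not be carried over to the BigQuery schema it sends, which would explain the REQUIRED vs NULLABLE mismatch in the error.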

Django error: cannot cast type date to time without time zone

Please help me with this error:
File "/Users/ecommerce/proshopvenv/lib/python3.8/site-packages/django/db/backends/utils.py",
line 84, in _execute
return self.cursor.execute(sql, params)
psycopg2.errors.CannotCoerce: cannot cast type date to time without time zone
LINE 1: ..." ALTER COLUMN "createdAt" TYPE time USING "createdAt"::time
The above exception was the direct cause of the following exception:
"a lot of code, and:"
django.db.utils.ProgrammingError: cannot cast type date to time without time zone
LINE 1: ..." ALTER COLUMN "createdAt" TYPE time USING "createdAt"::time
In models.py I have:
createdAt = models.DateTimeField(auto_now_add=True)
As I understand from the docs (https://docs.djangoproject.com/en/3.2/ref/models/fields/):
"The auto_now and auto_now_add options will always use the date in the default timezone at the moment of creation or update."
But the time zone doesn't get created, or the error lies somewhere else. I am creating a new database, so I can delete all tables.
Maybe there is a step in Django where it initializes the time zone?
I read the topic "postgreSQL alter column data type to timestamp without time zone" and many others, but I can't get 'migrate' to run without errors.
Yes, there is a problem modifying a date to a time, with or without time zone. The issue is that in Postgres a date has no time component, so any successful attempt to extract a time results in '00:00:00'. Try:
select current_date::timestamp::time;
Given that, and assuming your objective is to change the type, just drop and re-add the column:
alter table test_table drop createdat;
alter table test_table add createdat time default '00:00:00'::time;
Of course, I have never figured out what good a time is without the date. Which came first, '10:15:00' or '11:15:00'? Well, if 10:15 was today and 11:15 was yesterday, then 11:15 came first, unless you define it as hours since midnight. So perhaps a better option:
alter table test_table
alter column createdat
set data type timestamp
using createdat::timestamp;
See the demo. Yes, timestamp takes more space, but it is much more flexible.
First I want to say thanks to Belayer, but I don't have enough reputation to vote for his answer.
I needed to alter columns from type "date" to "timestamp with time zone", because my Django model configuration expected them to be like this: createdAt = models.DateTimeField(auto_now_add=True)
So I ran:
alter table "base_product" alter column "createdAt" type timestamp with time zone;
But there was a strange thing for me: all tables where there was type "date" got modified to "timestamp with time zone" (which was good, because there were a few other places where I needed that), but not the table "base_product" where I ran the query, and it also didn't want to be modified automatically any more.
So then I ran:
ALTER TABLE "base_product" ALTER COLUMN "createdAt" DROP DEFAULT, ALTER COLUMN "createdAt" TYPE timestamp with time zone USING timestamp with time zone 'epoch' + "createdAt", ALTER COLUMN "createdAt" SET DEFAULT now();
I took that from the Postgres docs: https://www.postgresql.org/docs/9.1/sql-altertable.html
In my case, what I did was delete the migrations to get back to a point where everything worked fine, then delete the field, apply a migration, and then add it back as a new column. And it works. Done that way, Django doesn't know that I altered a column; it thinks I created a new one. 😁
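A rough sketch of that drop-and-re-add approach (the Product model and its name field are hypothetical; only createdAt comes from the question). First remove the field from the model and run makemigrations/migrate once, so the old date column is dropped; then add it back as below and migrate again, so Django creates a brand-new column instead of trying to cast the old one:

from django.db import models

class Product(models.Model):
    name = models.CharField(max_length=200)  # other fields unchanged
    # Re-added after the intermediate migration that dropped the old column;
    # this arrives as a fresh timestamp-with-time-zone column.
    createdAt = models.DateTimeField(auto_now_add=True)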

Rename Column Name in Athena AWS

I have tried several ways to rename some column names in an Athena table, after reading the following article:
https://docs.aws.amazon.com/athena/latest/ug/alter-table-replace-columns.html
But I had no luck with it. I tried:
ALTER TABLE "users_data"."values_portions" REPLACE COLUMNS ('username/teradata' 'String', 'username_teradata' 'String')
and got the error:
no viable alternative at input 'alter table "users_data"."values_portions" replace' (service: amazonathena; status code: 400; error code: invalidrequestexception; request id: 23232ssdds.....; proxy: null)
You can refer to this document, which talks about renaming columns. The query that you are trying to run would replace all the columns in the existing table with the provided column list.
One strategy for renaming columns is to create a new table based on the same underlying data, but using new column names. The example in the linked page copies the orders_parquet table into a new table called orders_parquet_column_renamed, changing the column name o_totalprice to o_total_price, and then queries it in Athena.
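As a hedged illustration of that strategy applied to the table above (this is not the linked example; the output location, the format and the extra column are placeholders), a CTAS statement run through boto3 could look like this:

import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Create a new table from the old one, renaming the column in the SELECT list;
# every other column is selected unchanged.
ctas = """
CREATE TABLE users_data.values_portions_renamed
WITH (format = 'PARQUET') AS
SELECT "username/teradata" AS username_teradata,
       other_column  -- list the remaining columns here unchanged
FROM users_data.values_portions
"""

athena.start_query_execution(
    QueryString=ctas,
    QueryExecutionContext={"Database": "users_data"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-query-results/"},  # placeholder bucket
)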
Another way of changing the column name is by simply going to AWS Glue -> Select database -> select table -> edit schema -> double click on column name -> type in new name -> save.

AWS Amplify create Global Secondary Index after DynamoDB table creation

I have a large schema with ~70 tables, many of them connected to each other (194 @connection directives), like this:
type table1 @model {
  id: ID!
  name: String!
  ...
  table2: table2 @connection
}
type table2 @model {
  id: ID!
  ...
}
This works fine. Now the amount of data is steadily growing and I need to be able to query for results and sort them.
I've read several articles and found one advising me to add a @key directive to generate a GSI with 2 fields, so I can say "filter the results according to my filter property, sort them by the field "name", and return the first 10 entries, with the rest accessible via the nextToken parameter".
So I tried to add a GSI like this:
type table1 @model
  @key(name: "byName", fields: ["id", "name"], queryField: "idByName") {
  id: ID!
  name: String!
  ...
  table2: table2 @connection
}
When running
amplify push --minify
I receive the error:
Attempting to add a local secondary index to the table1Table table in the table1 stack. Local secondary indexes must be created when the table is created.
An error occured during the push operation: Attempting to add a local secondary index to the table1Table table in the table1 stack.
Local secondary indexes must be created when the table is created.
Why does it create an LSI instead of a GSI? Is there any way to add @key directives to tables after they have been created and filled? There are so many datasets from different tables linked with each other that just setting up a new schema would take ages.
The billing mode is PAY_PER_REQUEST, if that has any impact.
Any ideas how to proceed?
Thanks in advance!
Regards Christian
If you are using a new environment, delete the #current-cloud-backend folder first.
Then amplify init created the folder again but, alas, with only one file in it: amplify-meta.json.