Using SQL Replication, will records that are deleted in an article on the publisher be deleted on the subscriber too?

I'm considering the use of SQL Replication for a requirement my client has to move data from an OLTP database (publisher) to a reporting database (subscriber). However, every month data older than 2 years gets deleted from the OLTP database. If I use SQL Replication, will this deletion of records from the OLTP database mean that the corresponding records will then be deleted from the Reporting database as well? If so, is there a way to prevent this from happening? You see, my client needs the Reporting database to retain all the data older than 2 years even once it has been deleted from the OLTP database.

Yes, it turns out that for each article (database table) added to the publication, you can set whether DELETEs are synchronized.
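For transactional replication this is controlled per article; a minimal sketch, assuming a hypothetical publication named OLTP_to_Reporting and table Orders (the key setting is @del_cmd = N'NONE', which stops DELETEs from being propagated to subscribers):

-- Publication and table names are hypothetical.
EXEC sp_addarticle
    @publication   = N'OLTP_to_Reporting',
    @article       = N'Orders',
    @source_owner  = N'dbo',
    @source_object = N'Orders',
    @del_cmd       = N'NONE';  -- do not replicate DELETEs for this article

For an article that already exists, sp_changearticle can be used to change the del_cmd property instead.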

Related

AWS RDS - replicate data in one table to another

No. I am not talking about read replicas.
The scenario I am thinking of is this. Let's say you have an RDS table called user_profile. You want to record a history of the changes to each user profile in another table, let's say we call it user_profile_history. Is it possible in RDS to copy data in real time from the main user_profile table to its history table whenever the main table is updated?
The end result would be that the user_profile table contains only the latest user data, and all past snapshots of the profile are in the history table.
Both tables are in the same RDS database.
I have done my due diligence and did a bit of research, but all I could find was read replicas and replicating data to another region; I haven't found anything that covers this scenario. Yes, you could say that we can just implement the logic in the app itself, but what if we want to "pass the burden" to the RDS DB?
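For illustration, the kind of DB-side mechanism the question seems to be after is a trigger; a minimal sketch, assuming a MySQL-compatible RDS instance and hypothetical user_profile columns (name, email):

-- Copy the old row into the history table whenever a profile changes.
CREATE TRIGGER user_profile_history_trg
BEFORE UPDATE ON user_profile
FOR EACH ROW
    INSERT INTO user_profile_history (user_id, name, email, replaced_at)
    VALUES (OLD.user_id, OLD.name, OLD.email, NOW());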

Azure SQL DWH: delete and restore it when required

Is there an option to restore a deleted database in SQL DWH at a later time (more than a year)?
The documentation clearly indicates that when an Azure SQL Data Warehouse is dropped it keeps the final snapshot for seven days:
"When you drop a data warehouse, SQL Data Warehouse creates a final snapshot and saves it for seven days. You can restore the data warehouse to the final restore point created at deletion."
The same article also mentions that you can vote for this feature here:
https://feedback.azure.com/forums/307516-sql-data-warehouse/suggestions/35114410-user-defined-retention-periods-for-restore-points
Even if you could do this, you are basically leaving it up to someone else to be in charge of your warehouse backups. What you could do instead is take control:
1. Store your Azure SQL Data Warehouse schema in source code control (e.g. git, Azure DevOps, formerly VSTS). If it isn't there already, you can reverse engineer the schema using SQL Server Management Studio (SSMS) versions 17.x onwards, or even use the SSDT preview feature.
2. Export your data to Data Lake or Azure Blob Storage using CREATE EXTERNAL TABLE AS SELECT (CETAS); a sketch follows this list. This exports your data as flat files to storage, where it won't be deleted. Alternately, use Azure Data Factory to export the data and zip it up to save space.
3. When you need to recreate the warehouse, simply redeploy the schema from source code control and redeploy the data, e.g. via CTAS into staging tables, or use Azure Data Factory to re-import. If you saved your external tables in the schema you keep in source code control, they will just be there when you redeploy; INSERT back into the main tables from the external tables.
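A hedged sketch of the CETAS step, assuming an external data source and file format have already been created and an archive schema exists (all names hypothetical):

-- Export dbo.FactSales as flat files to blob storage.
CREATE EXTERNAL TABLE archive.FactSales
WITH (
    LOCATION    = '/archive/FactSales/',
    DATA_SOURCE = AzureBlobArchive,   -- hypothetical external data source
    FILE_FORMAT = ParquetFormat       -- hypothetical file format
)
AS
SELECT * FROM dbo.FactSales;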
In this way you are in charge of your warehouse schema and your data to be recreated at any point you require, whether it be a day, a month or years.
[Figure: a simple diagram of the proposed design]

Locking Behavior In Spanner vs MySQL

I'm exploring moving an application built on top of MySQL into Spanner and am not sure if I can replicate certain functionality from our MySQL db.
Basically, a simplified version of our MySQL schema would look like this:
users (id, name, balance)
user_transactions (id, user_id, external_id, amount)
user_locks (user_id, date)
When the application receives a transaction for a user, the app starts a MySQL transaction, updates the user_lock row for that user, checks if the user has sufficient balance for the transaction, creates a new transaction, and then updates the balance. It is possible for the application to receive transactions for the same user at the same time, and the lock forces them to be processed sequentially.
Is it possible to replicate this in Spanner? How would I do so? Basically, if the application receives two transactions at the same time, I want to ensure that they are given an order and that the changed data from the first transaction is visible to the second.
Cloud Spanner would do this by default, since it provides serializability, which means that all transactions appear to have occurred in serial order. You can read more about the transaction semantics here:
https://cloud.google.com/spanner/docs/transactions#rw_transaction_semantics
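As a hedged illustration, the MySQL flow above maps onto a single Spanner read-write transaction; the explicit user_locks table becomes unnecessary because Spanner locks the rows a read-write transaction reads (transaction boundaries are managed by the client library, and the @-parameters are supplied by the application):

-- Executed inside one Cloud Spanner read-write transaction.
SELECT balance FROM users WHERE id = @user_id;  -- locks the user row
-- The application verifies balance >= @amount, then:
INSERT INTO user_transactions (id, user_id, external_id, amount)
VALUES (@txn_id, @user_id, @external_id, @amount);
UPDATE users SET balance = balance - @amount WHERE id = @user_id;
-- A concurrent transaction for the same user waits on the lock and,
-- once this one commits, sees the updated balance.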

Making database schema changes using Microsoft Sync framework without losing any tracking table data

I am using Microsoft Sync Framework 4.0 to sync SQL Server database tables with a SQLite database on the iPad side.
Before making any database schema changes in the SQL Server database, we have to deprovision the database tables. After making the schema changes, we reprovision the tables.
Now in this process, the tracking tables (i.e. the syncing information) get deleted.
I want the tracking table information to be restored after reprovisioning.
How can this be done? Is it possible to make DB changes without deprovisioning?
E.g., the application is in version 2.0 and syncing is working fine. In the next version, 3.0, I want to make some DB changes, so in the deprovisioning/provisioning process the tracking info gets deleted and all the tracking information from the previous version is lost. I do not want to lose the tracking info. How can I restore this tracking information from the previous version?
I believe we will have to write custom code or a trigger to store the tracking information before deprovisioning. Could anyone suggest a suitable method or provide some useful links regarding this issue?
The provisioning process should automatically populate the tracking tables for you; you don't have to copy and reload them yourself.
Now, if you think the tracking table is where the framework stores what was previously synced, the answer is no.
The tracking table simply stores what was inserted/updated/deleted; it's used for change enumeration. The information on what was previously synced is stored in the scope_info table.
When you deprovision, you wipe out this sync metadata. When you sync again, it is as if the two replicas had never synced before, so you will encounter conflicts as the framework tries to apply rows that already exist on the destination.
You can find information here on how to "hack" the Sync Framework-created objects to effect some types of schema changes:
Modifying Sync Framework Scope Definition – Part 1 – Introduction
Modifying Sync Framework Scope Definition – Part 2 – Workarounds
Modifying Sync Framework Scope Definition – Part 3 – Workarounds – Adding/Removing Columns
Modifying Sync Framework Scope Definition – Part 4 – Workarounds – Adding a Table to an existing scope
Let's say I have one table "User" that I want to sync.
A tracking table "User_tracking" will be created, and some sync information will be present in it after syncing.
When I make any DB changes, this tracking table "User_tracking" will be deleted and the tracking info will be lost during the deprovisioning/provisioning process.
My workaround:
Before deprovisioning, I will write a script to copy all the "User_tracking" data into another temporary table, "User_tracking_1", so all the existing tracking info will be stored in "User_tracking_1". When I reprovision the table, a new tracking table "User_tracking" will be created.
After reprovisioning, I will copy the data from "User_tracking_1" back into "User_tracking" and then delete the contents of "User_tracking_1".
The "User_tracking" info will be restored.
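A minimal T-SQL sketch of this workaround (it assumes the reprovisioned tracking table has the same columns as before; if the schema change affects tracked columns, the INSERT will need an explicit column list):

-- 1. Before deprovisioning: preserve the tracking data.
SELECT * INTO User_tracking_1 FROM User_tracking;

-- 2. Deprovision, apply the schema changes, reprovision
--    (this recreates User_tracking empty).

-- 3. After reprovisioning: restore the preserved rows.
INSERT INTO User_tracking SELECT * FROM User_tracking_1;
DROP TABLE User_tracking_1;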
Is this the right approach?

How to monitor database updates from application?

I work with a SQL Server database via ODBC and C++. I want to detect modifications in some tables of the database: another application inserts or updates rows, and I have to detect all these modifications. Detection does not have to be triggered immediately; it is acceptable to poll the database tables periodically for modifications.
Below is the way I think this can be done. I need your opinions on whether this is the standard/right way of doing it, or whether better approaches exist.
What I've thought of is this: I add triggers in SQL Server which, on any modification, insert the identifiers of modified/added rows into a special table, which I will check periodically from my application. Suppose there are 3 tables: Customers, Products, Services. I will make three additional tables, Change_Customers, Change_Products, and Change_Services, and insert into them the identifiers of modified rows of the respective tables. Then I will read these Change_* tables from my application periodically and delete processed records.
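A minimal sketch of what I have in mind for one table, assuming Customers has a key column CustomerID (names per the plan above):

-- Change-log table for Customers.
CREATE TABLE Change_Customers (
    CustomerID int       NOT NULL,
    ChangedAt  datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
);
GO

-- Record the keys of affected rows on any modification.
CREATE TRIGGER trg_Customers_Change
ON Customers
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- "inserted" covers INSERTs/UPDATEs, "deleted" covers UPDATEs/DELETEs.
    INSERT INTO Change_Customers (CustomerID)
    SELECT CustomerID FROM inserted
    UNION
    SELECT CustomerID FROM deleted;
END;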
Now, if you agree that the above solution is right, I have another question: is it better to have a separate Change_* table for each table I wish to monitor, or one fat Changes table containing the changes from all the tables?
Query Notifications is the technology designed to do exactly what you're describing. You can leverage Query Notifications from managed clients via the well-known SqlDependency class, but there are native OLE DB and ODBC ways too. See Working with Query Notifications, the paragraphs about SSPROP_QP_NOTIFICATION_MSGTEXT (OLE DB) and SQL_SOPT_SS_QUERYNOTIFICATION_MSGTEXT (ODBC). See The Mysterious Notification for an explanation of how Query Notifications work.
This is the only polling-free solution that works with any kind of update. Triggers and polling for changes have severe scalability and performance issues. Change Data Capture and Change Tracking really cover a different topic (synchronizing datasets for occasionally connected devices, e.g. the Sync Framework).
Change Data Capture (CDC): http://msdn.microsoft.com/en-us/library/cc645937.aspx
First you will need to enable CDC on the database:

USE db_name;
GO
EXEC sys.sp_cdc_enable_db;
GO

Then enable CDC on each table with sys.sp_cdc_enable_table, after which you can query the changes.
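A hedged sketch of those remaining steps, using a hypothetical dbo.Customers table (the change function name is derived from the capture instance, here dbo_Customers):

-- Enable CDC on the table (creates capture instance dbo_Customers).
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Customers',
    @role_name     = NULL;
GO

-- Query all changes captured so far for that table.
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_Customers');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();
SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_Customers(@from_lsn, @to_lsn, N'all');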
If your version of SQL Server is 2005, you may use Notification Services.
If your SQL Server is 2008+, the preferable way is to use triggers, log changes to log tables, and periodically poll these tables from the application to see the changes.