LocalInsertRemoteInsert conflicts on initial sync - microsoft-sync-framework

I am trying to implement unidirectional sync from Server to Client, both running SQLExpress 2008 using Sync Framework 2.1. I'm having trouble during initial sync.
Two scopes are defined on the server:
Scope1 contains tables A, B, C and Scope2 contains tables B, C, D
Before the initial sync, the client database contains tables A, B, C and D (imported using a script), and each table is empty. Database has been provisioned using the server's scope information for Scope1 and Scope2, using default provisioning settings.
When syncing Scope1 for the first time, everything succeeds. When syncing Scope2, every record for shared tables (B, C) is being downloaded and reported as a LocalInsertRemoteInsert conflict.
Is there a way to avoid such behavior? Shouldn't Sync Framework know, after Scope1 is done syncing, that tables B and C are now up to date?

You have overlapping scopes: data contained in one scope is the same data that falls in the other scope.
Information about what was synced is stored at the scope level, so what was synced in the first scope is not known to the second scope, and vice versa.
When you sync Scope1, it downloads the data and updates its own sync knowledge of what was synced. When you sync Scope2, it has no idea what Scope1 has already synced, so it downloads the shared rows again.
I would put B and C in Scope1 and create Scope2 for A and Scope3 for D, so you only sync B and C once.

Related

Lifetime and scope of class and instance variables in django

While there are quite a few questions and answers on here already about the lifetime of different variables within Python, I am looking for how they translate into the Django environment in terms of application scope and endpoint scope. Here is a simple version of what I am making, and I want to ensure that it will behave the way I am expecting it to.
my_cache/models/GlobalCache.py:
# This class should be global to the entire application and only
# load when the server is started.
class GlobalCache(object):
    _cache = {}

    @classmethod
    def fetch(cls):
        return cls._cache

    @classmethod
    def flush(cls):
        cls._cache = {}

    @classmethod
    def load_cache(cls, files_to_load_from):
        for file in files_to_load_from:
            cls._cache[file] = None  # load the file and process its data into an entry
my_cache/models/InstanceCache.py:
from .GlobalCache import GlobalCache
# This class will contain a reference to the global cache and use it to look
# up entries.
class InstanceCache(object):
    def __init__(self, name=None):
        self._name = name
        self._cache = GlobalCache.fetch()

    def fetch_file_data(self, file_name):
        cache_entry = self._cache.get(file_name, None)
        if cache_entry is None:
            raise EntryNotFoundException()
        return ReadOnlyInterfaceObject(cache_entry)
The intent is to have GlobalCache keep a cls._cache value that persists as long as the server is running. Calling GlobalCache.flush() will drop its global reference to the data it was tracking, and calling GlobalCache.load_cache(files_to_load_from) will populate a fresh copy of the data from those files.
The InstanceCache object is then intended to hold a reference to the current version of the data and return read-only objects for the different data sets identified by their original file name.
From my testing this seems to work, though I have not really exercised the InstanceCache object per se. I can load the global cache, retrieve read-only objects from it, then flush the global cache and load it with new data. The original read-only objects still return the values they were originally loaded with, while new requests use the new data values.
What I want to confirm is that GlobalCache will exist as long as the server is running and only alter its data through direct calls to flush() and load_cache(), and that when I hit an endpoint and create an InstanceCache, it will keep a reference to the original data only as long as it exists. When execution of the endpoint is done, I would expect the InstanceCache to go out of scope, removing its reference to the global cache; if that was the last reference, the old data goes away and only the new/current data is kept. If it matters, I am running Python 2.7.6 and Django 1.5.12. Solutions that require an upgrade may be useful as well, but it is not an immediate option for me.
The answer here is a maybe, and it also depends a lot on which app server you are using to run django (if you are running multi-process).
So, generally speaking, yes, the GlobalCache will retain its cached contents for the lifetime of the process it is in after it has been initialized.
But InstanceCache, on the other hand, is only guaranteed to be garbage collected at some time after there are no more references to it. Garbage collection is a deep field, and there are often teams of people that work on the algorithms, so going into exact scenarios is probably outside the scope of an answer on SO. A popular implementation of Python is PyPy, and you can read more about the garbage collection used in PyPy here.
That said, please remember that most app servers are multi-process. Both uwsgi and gunicorn spin up child processes to serve requests. So even though GlobalCache is a singleton in its process, there may be several processes, each with its own GlobalCache. And, this GlobalCache will ultimately be garbage collected/cleaned up when the process exits. Both uwsgi and gunicorn will usually kill child processes after the child services some number of HTTP requests.

onRequestStart CFWheels

I am having some concurrency issues in cfwheels.
I have some code in events/onrequeststart.cfm that is being executed every time the user is requesting something.
Test case:
User A - request time: 10sec
User B - request time: 2sec
If user B issues a request while user A is already working on a request, user B's settings will bleed into user A's request, and user A will see results based on user B's request.
I tried using cflock on onrequeststart.cfm, but it doesn't seem to work. I don't have much experience with CFWheels, so I may be trying to do something that is logically wrong.
This is part of the code that gets confused.
<cfquery name="currentUser" datasource="#application.ds#">
select * from clientadmin where clientAdminid ='#session.clientadminid#'
</cfquery>
<cfquery name="currentClient" datasource="#application.ds#">
select * from clientBrands where clientbrandID ='#currentUser.ClientBrandID#'
</cfquery>
<cfset application.clientAdminSurveys = application.generalFunctions.clientSurveys(clientAdminID=session.clientAdminID, clientBrandID = currentUser.clientBrandID)>
<cfset application.AssociatedDoctors = application.generalFunctions.AssociatedDoctors(clientAdminID=session.clientAdminID, clientBrandID = currentUser.clientBrandID)>
So, I guess my question is, how to avoid this from happening?
1) The Application scope is "application wide" (shared by all users of the site) - you shouldn't be setting per-user settings there, ever, as you've discovered: user B overwrites user A. Use the session scope for per-user stuff. So in your last two lines you're setting application-scope variables using session-scope data!
2) As a side note, in Wheels you can use application.wheels.datasourcename to get the datasource name.
I would put that code into a function inside the controller (Controller.cfc) and run it using a filter.
see: http://cfwheels.org/docs/1-1/chapter/filters
This has worked for me without issues for similar tasks.
Also, I would drop any reference to the application scope, because that's likely where items are getting mixed up. The correct place to put these functions is in events/functions.cfm.
Of course, this is without seeing more of your code...
As mentioned by Neokoenig, you are using a shared scope to store user-specific data; you should store that in the session scope. If you need the data in the application scope, you should be using a lock while setting it, but it looks like you should be running this once in onSessionStart and not on every request. If you do need to run it on every request, you may want to continue using onRequestStart but write to user-specific session storage rather than the global application scope.
Just remember:
Application variables show the same data for all users. So if user 1 sets application.foo = 1 and user 2 then sets application.foo = 2, when user 1 tries to access application.foo, they will see user 2's value of 2. If this were in the session scope you would not have that issue: if user 1 sets SESSION.foo = 1 and user 2 sets SESSION.foo = 2, each user accessing SESSION.foo will only see the data set by that user (e.g. user 1 will output SESSION.foo and see the value 1).

JPA with JTA: how to persist many entities in one transaction

I have a list of objects. They are JPA "Location" entities.
List<Location> locations;
I have a stateless EJB which loops thru the list and persists each one.
public void createLocations() {
    List<Location> locations = getListOfJPAManagedLocationEntities(); // I'm leaving out the details of this because it has nothing to do with the issue
    for (Location location : locations) {
        em.persist(location);
    }
}
The code works fine. I do not have any problems.
However, the issue is: I want this to be an all-or-none transaction. Currently, each time through the for loop, the persist() method will insert a new row into the database. Suppose I have 100 Location objects and the 54th object has something wrong with it and an exception is thrown. There will be 53 records inserted into the database. What I want is: either all of them succeed or none of them are inserted.
I'm using the latest & greatest versions of Java EE 6, EJB 3.x, and JPA 2. My persistence.xml uses JTA.
<persistence-unit name="myPersistenceUnit" transaction-type="JTA">
And I like having JTA.
I do not want to stop using JTA.
90% of the time JTA does exactly what I want it to do. But in this case, it doesn't seem to.
My understanding of JTA must be inaccurate because I always thought the beginning and end of the EJB method marked the boundaries of the JTA transaction (assume only one method is in-play as I've shown above). By my logic, the transaction would not end until the for-loop is done and the method returns, and then at that point the records are persisted.
I'm using the jTDS driver for SQL Server 2008. Perhaps the database doesn't want to insert a record without immediately committing it. The entity id is defined like this:
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
I've checked the spec., and it is not proper to call the various "UserTransaction" or "getTransaction()" methods in a JTA environment.
So what can I do?
Thanks.
If you use JTA and container-managed transactions, the default behavior for a session EJB method call is to run in a transaction (it is as if the method were annotated with @TransactionAttribute(TransactionAttributeType.REQUIRED)). That means your code already runs in a transaction and will do what you expect: if an exception occurs at row 54, all previously inserted rows will be rolled back. You can go ahead and test it by throwing an exception yourself at some point in the loop. Note that if you throw a checked exception declared by your method, you can specify what the container should do when that exception occurs. You need to annotate the exception class with @ApplicationException(rollback=true).
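Here is a minimal sketch of that setup, assuming a container-managed stateless bean; Location is the entity from the question, while the getName() check and the exception name are made up purely for illustration:
import java.util.List;
import javax.ejb.ApplicationException;
import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

// A checked exception that still forces a rollback of the container-managed
// transaction; without rollback = true, application exceptions do not roll back.
@ApplicationException(rollback = true)
class InvalidLocationException extends Exception {
    InvalidLocationException(String message) {
        super(message);
    }
}

@Stateless
public class LocationBean {

    @PersistenceContext
    private EntityManager em;

    // No @TransactionAttribute needed: REQUIRED is the default for CMT beans,
    // so the whole loop runs inside a single JTA transaction.
    public void createLocations(List<Location> locations) throws InvalidLocationException {
        for (Location location : locations) {
            if (location.getName() == null) { // illustrative validation
                throw new InvalidLocationException("Location has no name");
            }
            em.persist(location); // rolled back together if the exception above escapes
        }
    }
}
When the exception escapes createLocations(), the container marks the transaction rollback-only, so none of the earlier persist() calls are committed.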
If there is a duplicate entry while looping, the loop itself will continue without problems, because persist() only queues the insert in the persistence context. When execution reaches an em.flush() placed after the loop, the pending inserts are sent to the database, the duplicate throws an exception, and the transaction is rolled back.
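A hedged sketch of where that flush() would sit, reusing the method from the question (getListOfJPAManagedLocationEntities() and Location are the question's own names, stubbed out here):
import java.util.List;
import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class LocationLoader {

    @PersistenceContext
    private EntityManager em;

    public void createLocations() {
        List<Location> locations = getListOfJPAManagedLocationEntities();
        for (Location location : locations) {
            em.persist(location); // only schedules the INSERT in the persistence context
        }
        // Push the pending INSERTs now, inside the same JTA transaction.
        // A constraint violation surfaces here as a PersistenceException,
        // which marks the transaction rollback-only, so nothing partial is committed.
        em.flush();
    }

    private List<Location> getListOfJPAManagedLocationEntities() {
        throw new UnsupportedOperationException("details omitted in the question");
    }
}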
I'm using JBoss. Set your datasource in your standalone.xml or domain.xml to have
<datasource jta="true" ...>
Seems obvious, but I obviously set it wrong a long time ago and forgot about it.

"Declarative" composite state with concurrent substates in UML

Given a System that contains two components, A and B: the System starts up A and B concurrently. Now A can go through states {A.Starting, A.Ready}, and B can be in states {B.Starting, B.DoingX, B.DoingY}. (Events to transition between A's and B's states are named accordingly: B.doingx => B goes to B.DoingX, etc.)
I want to model that
While A is in A.Starting, or B is in B.Starting, the System is "Starting"
The System is in state "DoingX" when A is in A.Ready and B is in B.DoingX
The System is in state "DoingY" when A is in A.Ready and B is in B.DoingY
If I'm not mistaken, the fork/join pseudo-states could be used here.
But do these model elements have the declarative semantics of the composed state mentioned above? Is there another way to model this?
(Note: the diagrams are from http://yuml.me)
Why don't you just pull these apart? Here's another idea of how you could model it (assuming I understood it correctly):
a state "Starting" that contains the states you refer to as A.Starting and B.Starting in parallel regions (you can use fork/joins here, or just rely on the default behavior of all regions being activated when the "Starting" state is entered)
another state "Doing" that contains a region with your "A.Ready" state and another parallel region that contains the two states "B.DoingX" and "B.DoingY".
If you really need to have an overall "DoingX" state, then you may have to create two states that correspond to A.Ready.
Anyways, on a broader perspective: I believe your point of view is a little bit off here, when you say that the "System is in state ...". Rather, the system modeled by such a state machine is in a set of states. So normally, I would be perfectly happy to say that "the system is currently in A.Ready and B.DoingX".
Maybe all you need is a change of terminology. What about this:
The system is in configuration "DoingX" when A.Ready and B.DoingX states are active ?
In response to the comment: Yes, this is standard, here's the corresponding part from the superstructure specification (version 2.4 beta):
In a hierarchical state machine more than one state can be active at the same time. [...] the current active “state” is actually represented by a set of trees of states starting with the top-most states of the root regions down to the innermost active substate. We refer to such a state tree as a state configuration.

Setting Connection Parameters via ADO for SQL Server

Is it possible to set a connection parameter on a connection to SQL Server and have that variable persist throughout the life of the connection? The parameter must be usable by subsequent queries.
We have some old Access reports that use a handful of VBScript functions in the SQL queries (let's call them GetStartDate and GetEndDate) that return global variables. Our application would set these before invoking the query and then the queries can return information between date ranges specified in our application.
We are looking at changing to a ReportViewer control running in local mode, but I don't see any convenient way to use these custom functions in straight T-SQL.
I have two concept solutions (not tested yet), but I would like to know if there is a better way. Below is some pseudo code.
Set all variables before running Recordset.OpenForward
Connection->Execute("SET #GetStartDate = ...");
Connection->Execute("SET #GetEndDate = ...");
// Repeat for all parameters
Will these variables persist to later calls of Recordset->OpenForward? Can anything reset the variables aside from another SET/SELECT #variable statement?
Create an ADOCommand "factory" that automatically adds parameters to each ADOCommand object I will use to execute SQL
// Command has previously been created
ADOParameter *Parameter1 = Command->CreateParameter("GetStartDate");
ADOParameter *Parameter2 = Command->CreateParameter("GetEndDate");
// Set values and attach etc...
What I would like to know is if there is something like:
Connection->SetParameter("GetStartDate", "20090101");
Connection->SetParameter("GetEndDate", "20100101");
And these will persist for the lifetime of the connection, and the SQL can do something like @GetStartDate to access them. This may be exactly solution #1, if the variables persist throughout the lifetime of the connection.
Since no one has ventured an answer, I'm guessing there isn't an elegant solution; that said:
Global cursors persist for the duration of the connection and can be accessed from any SQL batch or stored proc, so you could execute this once on the connection:
DECLARE KludgeKursor CURSOR GLOBAL STATIC FOR
SELECT StartDate = '2010-01-01', EndDate = '2010-04-30'
OPEN KludgeKursor
and in your stored procedures:
--get the values
DECLARE @StartDate datetime, @EndDate datetime
FETCH FIRST FROM GLOBAL KludgeKursor
INTO @StartDate, @EndDate
--go crazy
SELECT @StartDate, @EndDate
Each connection would only see its own values, so the same stored procs can be used for different connections/values. The global cursor is automatically deallocated when the connection ends.