We have a very weird issue in our Siebel 7.8 application.
In the Application_Start event we define a bunch of profile attributes, which determine whether the logged-in user will be allowed to perform certain operations or not. The code is something like this:
if (userHasSuperpowers) {
    TheApplication().SetProfileAttr("CanFly", "Y");
} else {
    // CanFly is not set, and GetProfileAttr("CanFly") returns ''
}
Everything works fine, except for one of these profile attributes. The conditions are not met, so we don't set its value. But when we check it using GetProfileAttr... it returns 'Y' instead of ''.
I've checked the code. A lot. I've put traces everywhere, and I'm 100% sure that when the last line of the Application_Start event executes, the attribute is still empty. However, in the first Applet_Load event after the login (in the HLS Salutation Applet (HLS Home) applet), its value has already changed to 'Y'. Why!!? I've looked everywhere, but I can't find anywhere else where we'd be doing a SetProfileAttr. So far, I've ruled out:
Every browser and server script for all our applets, application, BCs and business services.
All the runtime business services (the ones defined directly in the application instead of the SRF).
The Personalization Profile business component fields.
SmartScripts (not that they would matter in this particular scenario, I just mention them to acknowledge that you can set profile attributes there too).
Workflows: every step invoking the SIS OM PMT Service method Set Profile Attribute.
Siebel magically setting its value. The profile attribute name is custom made, in Spanish, and it contains our project name and a row_id. I really don't think Siebel is using the same name for its own profile attributes :).
But wait, there is more, I left the best part for last: the problem only happens in our development environment!
It's not an SRF issue: if we promote the same SRF to our testing or production environments, it works and returns the expected value.
It's not a data problem: still with the same SRF, I can use my local thick client, connecting to our development database with the same login and password, and it works fine too.
It's not a concurrency problem: we are testing with only one user logged in. And even if we had more, they wouldn't share sessions. And even if they did, the value wouldn't be always 'Y'.
It's not a temporary glitch, or something due to a wrong incremental compilation or a corrupted SRF: we have been experiencing this for at least 6 months (obviously, in that time frame, we've had dozens of different SRF files... all of them having the same problem, but only in development, and only if you use the server and not the dedicated client... seriously...).
Where else could I search the profile attribute being set? I've read that they can be persisted to the DB, but in order to do so, you have to define them as a field in a BC based on an S_PARTY extension table, right?
Is there any way to trace profile attribute changes somehow? Maybe by raising some log level?
How can I find out at least what's being executed after the Application_Start, before loading the first applet?
Any other ideas? I tried checking the SQL spool file too, but didn't find anything suspicious there either (i.e., any of the queries we use to check the conditions being run twice with different parameters).
Update: following Ranjith R's suggestions, I've also checked:
Other vanilla business services which could also be invoked from a workflow to set a profile attribute: User Registration > SetProfileAttr, SessionAccessService > SetProfileAttr and ISS Promotion Agreement Manager > SetProfileAttributes.
Runtime events setting profile attributes directly or using a business service (we don't have any runtime events apart from the vanilla ones).
Business services being called from DVMs (we only have vanilla data validation rules, and none of them apply to our buscomps).
Still no luck...
Ok... finally we found what's happening:
We access the URL to our server and get to the login page. This triggers a first Application_Start event, for the SADMIN user.
We set the profile attributes in that session. SADMIN is the Siebel administrator user, so yes, he hasSuperpowers and therefore we do TheApplication().SetProfileAttr("CanFly", "Y");.
The Application_Start event finishes.
We enter our username and password in the login screen to access Siebel. This triggers a second Application_Start event, this time for our user. This is the one I was monitoring with the trace files.
We set the profile attributes again in the new session. Our user doesn't hasSuperpowers, so we don't set any value for the CanFly attribute.
The Application_Start event finishes, and CanFly is still empty.
Siebel merges both sessions into one before loading the first screen!! Or at least, it transfers over the profile attributes we had set for SADMIN.
I'm sure it happens that way, for two reasons. First, we changed the profile attribute name to include the username too. And second, instead of storing just a "Y", we are now storing the current timestamp:
var time = (new Date()).getTime();
TheApplication().SetProfileAttr("CanFly_" + TheApplication().LoginName(), time);
We end up having CanFly_SADMIN, but no CanFly_USER, and the time value stored is the same we see in the log file for step 2... which is smaller than any of the values for the *_USER attributes.
So that's what's happening. I still don't know why Siebel behaves this way, but that would be a matter for another question. According to the Siebel Bookshelf:
The Start event is called when the client starts and again when the user interface is first displayed.
...but it doesn't say anything about it being called from two different sessions, with different users too, and then merging them together. It must be something misconfigured in our dev environment, considering it doesn't happen in the other ones.
Does Siebel 7.8 have runtime events? I can't recall. Runtime events have action sets, which can set/clear profile attributes.
There are still other vanilla business services which can set profile attributes; try searching in Tools, under business service methods, for *rofile*tt*.
The SIS OM service can also be invoked from DVMs or from runtime events directly, so that's also a possibility.
There is no logging system to see the values of profile attributes changing; testing is the only way out.
Related
I'm making a website right now and need to use django-tracking2 for analytics. Everything works but I would like to allow users to opt out and I haven't seen any options for that. I was thinking modifying the middleware portion may work but honestly, I don't know how to go about that yet since I haven't written middleware before.
I tried writing a script to check a cookie called no_track: if it wasn't set, I would set it to False for default tracking, and if they reject, it sets no_track to True. But I had no idea where to implement it (other than the middleware; when I tried that, the server told me to contact the administrator). I was thinking maybe I could use signals to prevent the user being tracked, but then that would slow down the webpage, since it would have to deal with preventing a new Visitor instance on each page (it would likely keep making new instances, since each request would look like a new user). Could I subclass the Visitor class and modify __init__ to do a check for the cookie and either let it save or not?
Thanks for any answers, if I find a solution I'll edit the post or post and accept the answer just in case someone else needs this.
I made a function in my tools file (which holds all the functions used throughout the project to make my life easier) to get and set a session key. Inside the VisitorTrackingMiddleware I used the function _should_track() and placed a check that looks for the session key (after _should_track() checks that sessions are installed and before all other checks), using the check_session() function from my tools file. If the key doesn't exist, the function creates it with a default of True (track the user until they accept or reject) and returns an HttpResponse (left over from trying the cookie method).
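For reference, here is a minimal sketch of that approach. The check_session() helper and the "track_visitor" session key are my own names; the _should_track() hook is the one mentioned above, but its exact signature may differ between django-tracking2 versions.

# tools.py -- session helper (names are illustrative, not part of django-tracking2)
def check_session(request):
    # Make sure the opt-out flag exists; default to tracking the user
    # until they explicitly accept or reject.
    if "track_visitor" not in request.session:
        request.session["track_visitor"] = True
    return request.session["track_visitor"]

# middleware.py -- use this subclass in place of the stock middleware in settings
from tracking.middleware import VisitorTrackingMiddleware
from myproject.tools import check_session  # hypothetical project path

class OptOutTrackingMiddleware(VisitorTrackingMiddleware):
    def _should_track(self, user, request, response):
        # Respect the user's choice before any of the stock checks run.
        if not check_session(request):
            return False
        return super(OptOutTrackingMiddleware, self)._should_track(
            user, request, response)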
When I used the cookie method, the Firefox console said the cookie would expire, so I just switched to sessions; another reason is that django-tracking2 runs on them.
It seems to work very well and it didn't have a very large impact on load times: every time a request is made, that function runs, my debug output tells me whether it's tracking me or not, and all the buttons work through AJAX. I want to run some tests to see if this does indeed work, and if so, maybe I'll submit a pull request to django-tracking2 in case someone else wants to use it.
A big advantage to this is that you can allow users to change their minds if they want, or you can re-prompt at user sign-up depending on whether they accepted or not. With the way check_session() is set up, I can use it in template tags and class methods as well.
I am trying to implement an Event Sourcing system using Kafka and have run into the following issue. During a new user sign-up I want to check if the username the user provided is already taken. However, consider the case where 2 users are trying to sign-up at the same time providing the same username.
In my understanding of how ES works the controller that processes the sign-up request will check if the request is valid, it will then send a new event (e.g. NewUser) to Kafka, and finally that event will be picked up by another controller which will persist it in a materialized view (e.g. Postgres DB). The problem is that the validation of the request is done against the materialized view but the actual persistence to it happens later. So because the 2 requests are being processed in parallel (by different service instances) they might both pass the validation, resulting in 2 NewUser messages. However, when the second controller tries to persist those 2 NewUser messages in the database saving the second event will fail because of the violation of the uniqueness constraint for the username.
Any ideas on how to address this?
Thanks.
UPDATE:
In particular, I would like to verify whether the following are accepted approaches to the problem:
use the username as the userId (restrictive)
send an event to a topic partitioned by username and, when validation is done, send an event to another topic
Initial validation against the materialized view won't be enough in most scenarios where you have constraints. There can always be some relevant events that haven't been materialized yet. There are two main concurrency control approaches to ensure that correct results are generated:
1. Pessimistic approach:
If you want to validate constraints before you publish an event, you need to lock relevant resources (entity, aggregate or data set). The locking means your services must not be able to publish events on these resources. After this point, to get the current state of your data:
You can wait until all events published before locking are materialized.
You can read current state from the database and apply events on it in a separate process.
2. Optimistic approach:
In this approach, you perform your validations after publishing events. To achieve this, you need to implement a feedback mechanism. The process which consumes events and performs validations should be able to publish validation results. You can perform the validations in-memory when possible. Otherwise, you can rely on your materialized data store.
Martin Kleppmann talks about a two-step solution for exactly the same problem here and in his book. In this solution, there are two topics: "claims" and "registrations". First, you publish a claim to take the username, then try to write it to the database, and finally publish the result to the registrations topic. At a conceptual level, it follows the same steps as the second approach you have mentioned. In the validation step, it avoids implementing validation logic and keeping secondary indexes in memory by relying on the database.
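As a rough illustration of that two-topic flow (a sketch only: the topic names, the kafka-python client and the insert_user callback are my assumptions, not something the talk prescribes):

import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    key_serializer=lambda k: k.encode("utf-8"),
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def claim_username(username, user_data):
    # Keying by username sends every claim for the same name to the same
    # partition, so one consumer sees them in order.
    producer.send("claims", key=username, value=user_data)

def run_registrar(insert_user):
    # insert_user(username, data) is assumed to attempt the DB insert and
    # return False on a uniqueness violation.
    consumer = KafkaConsumer(
        "claims",
        bootstrap_servers="localhost:9092",
        group_id="registrar",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    for message in consumer:
        username = message.key.decode("utf-8")
        accepted = insert_user(username, message.value)
        producer.send("registrations",
                      key=username,
                      value={"username": username, "accepted": accepted})

Claims for the same username land on the same partition, so whichever consumer owns that partition processes them in order and the database's uniqueness constraint picks the winner; everyone else learns the outcome from the registrations topic.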
During a new user sign-up I want to check if the username the user provided is already taken.
You may want to review Greg Young's essay on Set Validation.
In my understanding of how ES works the controller that processes the sign-up request will check if the request is valid, it will then send a new event (e.g. NewUser) to Kafka, and finally that event will be picked up by another controller which will persist it in a materialized view (e.g. Postgres DB).
That's a little bit different from the usual arrangement. (You may also want to review Greg's talk on polyglot data.)
Suppose we begin with two writers; that's fine, but if there is going to be a single point of truth, then you are going to need synchronization somewhere.
The usual arrangement is to use a form of optimistic concurrency; when processing a request, you reserve a copy of your original state, then you do your calculation, and finally you send the book of record a replace(originalState, newState).
So at this point, we have two writes racing toward the book of record
replace(red,green)
replace(red,blue)
At the book of record, the writes are processed in series.
[...,replace(red,blue)...,replace(red,green)]
So when the book of record processes replace(red,blue), it performs a check that yes, the state is currently red, and swaps in blue. Later, when the book of record tries to process replace(red,green), the book of record performs the check, which fails because the state is no longer red.
So one of the writes has succeeded, and the other fails; the latter can propagate the failure outwards, or retry, or...; precisely what happens depends on the specific mechanics in question. A retry should mean, of course, reloading the "original state", at which point the model would discover that some previous edit already claimed the username.
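To make those mechanics concrete, here is a toy sketch of the check (not the API of any real event store):

class ConflictError(Exception):
    pass

class BookOfRecord:
    def __init__(self, initial_state):
        self.state = initial_state

    def replace(self, original_state, new_state):
        # Writes are processed in series; the check and the swap happen as
        # one step, so only one of the racing writers can win.
        if self.state != original_state:
            raise ConflictError("state is no longer %r" % (original_state,))
        self.state = new_state

Here replace("red", "blue") succeeds and the later replace("red", "green") raises, so the losing writer reloads the state, discovers the change, and retries or reports the conflict.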
Any ideas on how to address this?
Single writer per stream makes the rest of the problem pretty simple, by eliminating the ambiguity introduced by having multiple in memory copies of the model.
Multiple writers using a synchronous write to the durable store is probably the most common design. It requires an event store that understands the idea of writing to a specific location in a stream -- aka "expected version".
You can perform an asynchronous write, and then start doing other work until you get an acknowledgement that the write succeeded (or not, or until you time out, or)....
There's no magic -- if you want uniqueness (or any other sort of invariant enforcement, for that matter), then everybody needs to agree on a single authority, and anybody else who wants to propose a change won't know if it has been accepted without getting word back from the authority, and needs to be prepared for a rejected proposal.
(Note: this shouldn't be a surprise -- if you were using a traditional design with current state stored in a RDBMS, then your authority would be a user table in the database, with a uniqueness constraint on the username column, and the race would be between the two insert statements trying to finish their transaction first....)
I'm trying to perform some actions in the pipeline "httpRequestBegin" only when necessary.
My processor is executed after Sitecore resolves the user (processor type="Sitecore.Pipelines.HttpRequest.UserResolver, Sitecore.Kernel"), as I'm resolving the user too if Sitecore is not able to resolve it first.
Later, I want to add some rendering in the pipeline "insertRenderings", only if the actions in the previous pipeline were executed (if I resolved the user, show a message), so I'm trying to save some "flag" in the first step, to check in the second.
My question is, where can I store that flag? I'm trying to find some kind of "per request" cache...
So far, I've tried:
The session: wrong, it's too early, the session doesn't exist yet.
Items (HttpContext.Current.Items): it doesn't work either, my item is not there in the second step.
So far I'm using the application cache (HttpContext.Current.Cache) with some unique key, but I don't like this solution.
Does anybody know a better approach to share this "flag"?
You could add a flag to the request header and then check its existence in the later pipelines, e.g.
// in HttpRequest pipeline
HttpContext.Current.Request.Headers.Add("CustomUserResolve", "true");
// in InsertRenderings pipeline
var customUserResolve = HttpContext.Current.Request.Headers["CustomUserResolve"];
if (Sitecore.MainUtil.GetBool(customUserResolve, false))
{
    // custom logic goes here
}
This feels a little dirty; I think adding to Request.QueryString or Request.Params would have been nicer, but those are read-only. However, if you only need this for a one-time deal (i.e. only the first time it is resolved) then it will work, since in the next request the headers are back to default without your custom header added.
HttpContext.Current.Cache or HttpRuntime.Cache could be the fastest solution here. Though this approach would not preserve data when the AppPool gets recycled.
If you add only a few keys to the cache and then maintain them, this solution might work for you. If each request puts an entry into the cache, it may eventually overflow the memory used by worker process in a long run.
As an alternative to this, you may try to use the Sitecore.Context.ClientData property. It uses the ClientDataStore, which employs a database (look for the clientDataStore section in the web.config file) to store data. These entries can survive an AppPool recycle.
Though if you use them a lot, it may become a bottleneck under the load when you need to write to and/or read from the entries.
If you do know that there could be a lot of entries created for sharing purposes, I'd create a scheduled task to clean up the data store from obsolete entries.
I know this is a very old question, but I just want to post the solution I worked out.
The code below will hold data on a per-HTTP-request basis:
HttpContext.Current.Items["ModuleInfo"] = "Custom Module Info";
We can store data in the HttpContext in one Sitecore pipeline and retrieve it in another...
https://www.codeproject.com/Articles/146455/When-Can-We-Use-HttpContext-Current-Items-to-Store
Would it be ok to get a CF app to check for a valid database before proceeding to process that request?
This is because there may be instances where the database server is down or being upgraded, hence an error occurs when a db-dependent request is made.
If there is no connection to the db server, the user can be safely redirected to a safe page.
Or can cfcatch work?
How can this check be done?
Thank you.
In your onRequestStart method of your Application.cfc file, or in an Application.cfm file, you can run a simple query to check that the database is available. Wrap the query in cftry/cfcatch. If the query fails, you can redirect the user in the cfcatch; if it succeeds, you can be reasonably sure that your database is "alive".
I've used such a check in one project. The code may look as follows (not sure if it will work in versions of ColdFusion lower than 8); consider this sample a chunk of a UDF written in CFScript:
// service factory object instance
factory = CreateObject("java","coldfusion.server.ServiceFactory");
// the datasource service
dsService = factory.DatasourceService;
// verify the dsn
return dsService.verifyDataSource(arguments.dsn);
Oh, I have even found a small note in the code I wrote on my old laptop a couple of years ago:
// [performance note] this server check takes 1-3ms at local PC (Kubuntu 7.10, CF8 + Apache2, Sempron 3500+, 1GB RAM)
While the time looks small, I have found out that doing this check on each request is not really useful for my application. Anyway, I have a habit of using try/catch extensively for error handling. But if your datasources change frequently, it may make more sense.
Adding an extra query to every request to make sure that the database is up is a patently bad idea. A better approach would be to build a "maintenance mode" switch into your application, that you would manually enable when you are doing planned maintenance (upgrades, etc).
If you want to have a "friendly" page displayed when an error (like database issues) occur, then use the onError() method in Application.cfc and/or the <cferror .../> tag in Application.cfm, as a global error handler.
If you are worried the db could vanish, I would implement a "SELECT 1 AS A" query in your OnRequestStart handler that runs only every N minutes. This can be accomplished by using the query caching feature. I'd start with performing the query every 30 min.
Is there a way to protect against concurrent modifications of the same database entry by two or more users?
It would be acceptable to show an error message to the user performing the second commit/save operation, but data should not be silently overwritten.
I think locking the entry is not an option, as a user might use the "Back" button or simply close his browser, leaving the lock forever.
This is how I do optimistic locking in Django:
updated = Entry.objects.filter(Q(id=e.id) & Q(version=e.version))\
              .update(updated_field=new_value, version=e.version+1)
if not updated:
    raise ConcurrentModificationException()
The code listed above can be implemented as a method in a custom Manager.
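A minimal sketch of such a manager, with the model and field names assumed from the snippet above:

from django.db import models
from django.db.models import Q

class ConcurrentModificationException(Exception):
    pass

class EntryManager(models.Manager):
    def update_version_checked(self, entry, **new_values):
        # A single UPDATE statement: the version check and the write are atomic.
        updated = self.filter(Q(id=entry.id) & Q(version=entry.version)) \
                      .update(version=entry.version + 1, **new_values)
        if not updated:
            raise ConcurrentModificationException()
        return updated

class Entry(models.Model):
    version = models.IntegerField(default=0)
    updated_field = models.TextField(blank=True)

    objects = EntryManager()

Calling Entry.objects.update_version_checked(e, updated_field=new_value) then behaves like the snippet above.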
I am making the following assumptions:
filter().update() will result in a single database query because filter is lazy
a database query is atomic
These assumptions are enough to ensure that no one else has updated the entry in the meantime. If multiple rows are updated this way, you should use transactions.
WARNING Django Doc:
Be aware that the update() method is converted directly to an SQL statement. It is a bulk operation for direct updates. It doesn't run any save() methods on your models, or emit the pre_save or post_save signals.
This question is a bit old and my answer a bit late, but from what I understand this has been fixed in Django 1.4 using:
select_for_update(nowait=True)
see the docs
Returns a queryset that will lock rows until the end of the transaction, generating a SELECT ... FOR UPDATE SQL statement on supported databases.
Usually, if another transaction has already acquired a lock on one of the selected rows, the query will block until the lock is released. If this is not the behavior you want, call select_for_update(nowait=True). This will make the call non-blocking. If a conflicting lock is already acquired by another transaction, DatabaseError will be raised when the queryset is evaluated.
Of course this will only work if the back-end supports the "select for update" feature, which for example SQLite doesn't. Unfortunately, nowait=True is not supported by MySQL; there you have to use nowait=False, which will block until the lock is released.
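Roughly how that looks in practice (a sketch assuming a Django version with transaction.atomic and the Entry model from the snippets above; handle_concurrent_edit is a hypothetical handler):

from django.db import DatabaseError, transaction

try:
    with transaction.atomic():
        entry = Entry.objects.select_for_update(nowait=True).get(pk=entry_id)
        entry.updated_field = new_value
        entry.save()
except DatabaseError:
    # Another transaction already holds the row lock.
    handle_concurrent_edit()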
Actually, transactions don't help you much here ... unless you want to have transactions running over multiple HTTP requests (which you most probably don't want).
What we usually use in those cases is "Optimistic Locking". The Django ORM doesn't support that as far as I know. But there has been some discussion about adding this feature.
So you are on your own. Basically, what you should do is add a "version" field to your model and pass it to the user as a hidden field. The normal cycle for an update is:
read the data and show it to the user
the user modifies the data
the user posts the data
the app saves it back to the database.
To implement optimistic locking, when you save the data, you check whether the version you got back from the user is the same as the one in the database; if it is, you update the database and increment the version. If it is not, it means that there has been a change since the data was loaded.
You can do that with a single SQL call with something like:
UPDATE ... WHERE version = 'version_from_user';
This call will update the database only if the version is still the same.
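One way to carry the version through that round trip in Django is a hidden form field (a sketch; the Entry model and field names are assumptions borrowed from the other answers):

from django import forms

class EntryForm(forms.ModelForm):
    class Meta:
        model = Entry
        fields = ["updated_field", "version"]
        widgets = {"version": forms.HiddenInput()}

On POST, the view reads the submitted version and uses it as 'version_from_user' in the conditional UPDATE shown above.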
Django 1.11 has three convenient options to handle this situation depending on your business logic requirements:
Something.objects.select_for_update() will block until the model becomes free
Something.objects.select_for_update(nowait=True) and catch DatabaseError if the model is currently locked for update
Something.objects.select_for_update(skip_locked=True) will not return the objects that are currently locked
In my application, which has both interactive and batch workflows on various models, I found these three options to solve most of my concurrent processing scenarios.
The "waiting" select_for_update is very convenient in sequential batch processes - I want them all to execute, but let them take their time. The nowait is used when an user wants to modify an object that is currently locked for update - I will just tell them it's being modified at this moment.
The skip_locked is useful for another type of update, when users can trigger a rescan of an object - and I don't care who triggers it, as long as it's triggered, so skip_locked allows me to silently skip the duplicated triggers.
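For instance, the rescan case might look roughly like this (Something is the model from the examples above; rescan() is a hypothetical task):

from django.db import transaction

def rescan_requested(ids):
    with transaction.atomic():
        # Rows locked by another worker are skipped, so duplicate triggers
        # don't process the same object twice.
        locked = Something.objects.select_for_update(skip_locked=True).filter(pk__in=ids)
        for obj in locked:
            rescan(obj)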
For future reference, check out https://github.com/RobCombs/django-locking. It does locking in a way that doesn't leave everlasting locks, by a mixture of javascript unlocking when the user leaves the page, and lock timeouts (e.g. in case the user's browser crashes). The documentation is pretty complete.
You should probably use the django transaction middleware at least, even regardless of this problem.
As to your actual problem of having multiple users editing the same data... yes, use locking. OR:
Check what version a user is updating against (do this securely, so users can't simply hack the system to say they were updating the latest copy!), and only update if that version is current. Otherwise, send the user back a new page with the original version they were editing, their submitted version, and the new version(s) written by others. Ask them to merge the changes into one, completely up-to-date version. You might try to auto-merge these using a toolset like diff+patch, but you'll need to have the manual merge method working for failure cases anyway, so start with that. Also, you'll need to preserve version history, and allow admins to revert changes, in case someone unintentionally or intentionally messes up the merge. But you should probably have that anyway.
There's very likely a django app/library that does most of this for you.
Another thing to look for is the word "atomic". An atomic operation means that your database change will either happen successfully, or fail obviously. A quick search shows this question asking about atomic operations in Django.
The idea above
updated = Entry.objects.filter(Q(id=e.id) & Q(version=e.version))\
              .update(updated_field=new_value, version=e.version+1)
if not updated:
    raise ConcurrentModificationException()
looks great and should work fine even without serializable transactions.
The problem is how to augment the default .save() behavior so as not to have to do manual plumbing to call the .update() method.
I looked at the Custom Manager idea.
My plan is to override the Manager _update method that is called by Model.save_base() to perform the update.
This is the current code in Django 1.3
def _update(self, values, **kwargs):
    return self.get_query_set()._update(values, **kwargs)
What needs to be done IMHO is something like:
def _update(self, values, **kwargs):
    #TODO Get version field value
    v = self.get_version_field_value(values[0])
    return self.get_query_set().filter(Q(version=v))._update(values, **kwargs)
A similar thing needs to happen on delete. However, delete is a bit more difficult, as Django implements quite some voodoo in this area through django.db.models.deletion.Collector.
It is weird that a modern tool like Django lacks guidance for optimistic concurrency control.
I will update this post when I solve the riddle. Hopefully the solution will be in a nice pythonic way that does not involve tons of coding, weird views, skipping essential pieces of Django, etc.
To be safe the database needs to support transactions.
If the fields is "free-form" e.g. text etc. and you need to allow several users to be able to edit the same fields (you can't have single user ownership to the data), you could store the original data in a variable.
When the user committs, check if the input data has changed from the original data (if not, you don't need to bother the DB by rewriting old data),
if the original data compared to the current data in the db is the same you can save, if it has changed you can show the user the difference and ask the user what to do.
If the fields is numbers e.g. account balance, number of items in a store etc., you can handle it more automatically if you calculate the difference between the original value (stored when the user started filling out the form) and the new value you can start a transaction read the current value and add the difference, then end transaction. If you can't have negative values, you should abort the transaction if the result is negative, and tell the user.
I don't know django, so I can't give you teh cod3s.. ;)
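For what it's worth, a rough Django sketch of that idea (the Account model, its balance field and the error handling are illustrative only):

from django.db import transaction

def apply_change(account_id, original_value, new_value):
    delta = new_value - original_value
    with transaction.atomic():
        # Lock the row, apply the difference, and refuse to go negative.
        account = Account.objects.select_for_update().get(pk=account_id)
        if account.balance + delta < 0:
            raise ValueError("this change would make the balance negative")
        account.balance += delta
        account.save()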
From here:
How to prevent overwriting an object someone else has modified
I'm assuming that the timestamp will be held as a hidden field in the form you're trying to save the details of.
def save(self, *args, **kwargs):
    if self.id:
        foo = Foo.objects.get(pk=self.id)
        if foo.timestamp > self.timestamp:
            raise Exception("trying to save outdated Foo")
    super(Foo, self).save(*args, **kwargs)