Hello, I have a simple mapping that uses a Router to decide whether a record has to be inserted or updated, and then an Update Strategy to flag the row.
The records were updating and inserting as expected until I had to modify the logic and made the required changes.
Now the records are no longer flagged as an insert or an update. These are the settings:
1) DD_UPDATE and DD_INSERT coded in the update strategy.
2) At session level, Treat source rows as is set to Data Driven.
3) The 2 targets are set to update as update and insert respectively.
I even ran the debugger to see what is happening: the insert and update records pass through the Update Strategy, but the row type is blank by the time the row reaches the target instance :( What could be the issue?
I finally found the issue. Both Update Strategy transformations were corrupted. Deleting and recreating them resolved the issue :) Thanks for your help!
Very basic setup: source-to-target - wanted to replicate the MERGE behavior.
Removed the update strategy, activated "update then insert" rule on target within the session. Doesn't work as described, always attempts to insert into the primary key column, even though the same key arrives, which should have triggered an "update" statement. Tried other target methods - always attempts to insert. Attached is the mapping pic.
[Image: basic merge attempt]
Finally figured this out. You have to make edits in 3 places:
a) mapping: remove the update strategy
b) session::target properties: set the "update then insert" method
c) session's own properties: change "treat source rows as"
In the third case you have to switch "treat source rows as" from insert to update,
which will then allow both updates and inserts.
Why it is set up like this is beyond me. But it works.
I'll make an attempt to clarify this a bit.
First of all, using an Update Strategy in the mapping requires the session's Treat source rows as property to be set to Data driven. This is the slowest possible option, as the row type is then set row by row within the mapping, but that's exactly what you need when using the Update Strategy transformation. So in order to mirror MERGE, you need to remove it.
You also need to tell the session not to expect this in the mapping anymore, so the property has to be set to one of the remaining values. There are two options:
1) Set Treat source rows as to Insert. This means all rows will be inserted each time. If there are no errors (e.g. caused by a unique index), the data will be duplicated. To mimic MERGE behavior, you'd need to add a unique index that prevents duplicate inserts and tell the target connector to insert else update. This way, if the insert fails, an update attempt is made.
2) Set Treat source rows as to Update. This tells PowerCenter to try an update for each and every input row. With update else insert, a failed update (i.e. no row to update) raises no error; instead, an insert attempt is made. Here there's no need for a unique index. That's one difference.
The other difference, although both solutions mirror the MERGE operation, may be observed in performance. In an environment where new data is very rare, the first approach will be slow: each time, an insert attempt is made only to fail and be followed by an update. Only rarely will the insert succeed at the first attempt. The second approach will be faster: updates will succeed most of the time, and only on rare occasions will one fail and result in an insert.
Of course, if updates are not often expected, it will be exactly the opposite.
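The trade-off between the two options can be sketched outside PowerCenter. The following is a minimal Python simulation using SQLite, not Informatica's actual implementation; the `target` table, the helper names and the sample rows are all hypothetical. It counts how many statements each strategy issues when most incoming keys already exist.

```python
import sqlite3


def make_target(rows):
    """Create a hypothetical in-memory target table with a primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")
    conn.executemany("INSERT INTO target VALUES (?, ?)", rows)
    return conn


def insert_else_update(conn, rows):
    """Try INSERT first; on a key violation fall back to UPDATE."""
    statements = 0
    for key, val in rows:
        statements += 1
        try:
            conn.execute("INSERT INTO target VALUES (?, ?)", (key, val))
        except sqlite3.IntegrityError:
            # The primary key rejected the insert, so update instead.
            statements += 1
            conn.execute("UPDATE target SET val = ? WHERE id = ?", (val, key))
    return statements


def update_else_insert(conn, rows):
    """Try UPDATE first; if no row was touched, fall back to INSERT."""
    statements = 0
    for key, val in rows:
        statements += 1
        cur = conn.execute("UPDATE target SET val = ? WHERE id = ?", (val, key))
        if cur.rowcount == 0:
            # Nothing to update: the key is new, so insert it.
            statements += 1
            conn.execute("INSERT INTO target VALUES (?, ?)", (key, val))
    return statements


existing = [(1, "a"), (2, "b"), (3, "c")]
incoming = [(1, "a2"), (2, "b2"), (3, "c2"), (4, "d")]  # mostly existing keys

conn_a = make_target(existing)
conn_b = make_target(existing)
cost_insert_first = insert_else_update(conn_a, incoming)
cost_update_first = update_else_insert(conn_b, incoming)
final_a = conn_a.execute("SELECT id, val FROM target ORDER BY id").fetchall()
final_b = conn_b.execute("SELECT id, val FROM target ORDER BY id").fetchall()
print(cost_insert_first, cost_update_first, final_a == final_b)
```

Both strategies end with the same table contents; only the number of attempted statements differs. Note that the insert-first variant depends on the key constraint to fail the insert, which matches the point about needing a unique index.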
This can be seen as a complex solution for a simple merge. But it also lets the developer influence the performance.
Hope this sheds some light!
I am getting the error below when I save a transformation in Pentaho Spoon:
Error saving transformation to repository!
Error updating batch
Cannot insert duplicate key row in object 'dbo.R_STEP_ATTRIBUTE' with unique index 'IDX_RSAT'. The duplicate key value is (2314, PARTITIONING_SCHEMA, 0).
Everything was working fine before I ran a job that creates multiple Excel files. While this job was running, a memory issue suddenly occurred and the job was aborted. After that I tried to save my file, but it was deleted in preparation for saving and then never saved. So I lost the job I had created.
Please help me understand the reason.
The last save of the directory did not end gracefully.
There is a small chance that you can repair it by erasing the db-caches file in the .kettle directory.
If that does not work, create a new repository and copy the current one into it, using the global repository export/import. Then erase the old repository and do the same from the newly rebuilt one.
The intermediary repository may be file-based rather than database-based.
If this is the first time you do this, plan for one to two hours.
There is an easy way to recover from this.
As AlainD says, the problem occurs when you save or delete a transformation and suddenly lose the connection or have a problem with Kettle.
When that occurs, you will find a lot of leftover step records in the table R_STEP_ATTRIBUTE. In the error shown, [ID_TRANSFORMATION] = 2314.
So, if you check the table R_TRANSFORMATION for [ID_TRANSFORMATION] = 2314, you may not find any transformation with that id.
After checking that, you can delete all the records related to that [ID_TRANSFORMATION], for example:
delete from R_STEP_ATTRIBUTE where ID_TRANSFORMATION=2314
We just solved this issue by executing the following SQL statement
DELETE
FROM R_STEP_ATTRIBUTE
WHERE ID_STEP NOT IN (SELECT ID_STEP FROM R_STEP)
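Before running a cleanup like this against a real repository, it may help to count the orphaned rows first and to run the delete inside a transaction. The following is a minimal sketch in Python against an in-memory SQLite database; the two tables are heavily simplified stand-ins for the Kettle repository tables (the column set and sample rows are assumptions), and it uses NOT EXISTS instead of NOT IN, which behaves the same here but is not affected by NULLs in the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical versions of the repository tables.
conn.executescript("""
    CREATE TABLE R_STEP (ID_STEP INTEGER PRIMARY KEY, ID_TRANSFORMATION INTEGER);
    CREATE TABLE R_STEP_ATTRIBUTE (
        ID_STEP_ATTRIBUTE INTEGER PRIMARY KEY,
        ID_TRANSFORMATION INTEGER,
        ID_STEP INTEGER,
        CODE TEXT
    );
    INSERT INTO R_STEP VALUES (10, 100);
    INSERT INTO R_STEP_ATTRIBUTE VALUES (1, 100, 10, 'PARTITIONING_SCHEMA');
    -- Leftovers from an interrupted save: no matching row in R_STEP.
    INSERT INTO R_STEP_ATTRIBUTE VALUES (2, 2314, 77, 'PARTITIONING_SCHEMA');
    INSERT INTO R_STEP_ATTRIBUTE VALUES (3, 2314, 78, 'PARTITIONING_METHOD');
""")

# An attribute row is orphaned when its step no longer exists.
ORPHAN_FILTER = (
    "NOT EXISTS (SELECT 1 FROM R_STEP s "
    "WHERE s.ID_STEP = R_STEP_ATTRIBUTE.ID_STEP)"
)

# Step 1: preview how many rows the delete would remove.
orphans_before = conn.execute(
    f"SELECT COUNT(*) FROM R_STEP_ATTRIBUTE WHERE {ORPHAN_FILTER}"
).fetchone()[0]

# Step 2: delete inside a transaction (committed on success, rolled back on error).
with conn:
    deleted = conn.execute(
        f"DELETE FROM R_STEP_ATTRIBUTE WHERE {ORPHAN_FILTER}"
    ).rowcount

remaining = conn.execute(
    "SELECT ID_STEP_ATTRIBUTE FROM R_STEP_ATTRIBUTE ORDER BY ID_STEP_ATTRIBUTE"
).fetchall()
print(orphans_before, deleted, remaining)
```

If the preview count is unexpectedly large, that is a signal to back up the repository before deleting anything.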
I have a fairly large production database system, based on a large hierarchy of nodes, each with 10+ associated models. If someone deletes a node fairly high in the tree, thousands of models can be deleted, and if that deletion was a mistake, restoring them can be very difficult. I'm looking for a way to give myself an easy 'undo' option.
I've tried using Django-reversion, but it seems that in order to get the functionality I want (easily reverting a large cascade delete) it needs to store a lot of information with each revision. While creating the initial revisions, the process was less than 10% done and was already using 8GB in my database, which is not going to work for me.
So, is there a standard solution for this problem? Or a way to customize Django-reversion to fit my use case?
What you're looking for is called a soft delete. Add a column named deleted with a default value of false to the table. Now, when you want to do a "delete", change the deleted column to true instead. Update all the code so it does not show rows marked as deleted (or rename the database table and replace it with a view that hides them). Change all the unique constraints to have a filter WHERE deleted = false, so that a hidden row cannot block the user from adding something similar to what they can no longer see in the system.
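As a rough illustration of the soft-delete pattern, here is a minimal sketch using Python's sqlite3 module rather than Django or the asker's actual database. The `node` table, the `live_node` view and the helper functions are hypothetical; the filtered unique constraint is expressed as a SQLite partial index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        deleted INTEGER NOT NULL DEFAULT 0   -- 0 = live, 1 = soft-deleted
    );
    -- Uniqueness is enforced only among live rows.
    CREATE UNIQUE INDEX node_name_live ON node (name) WHERE deleted = 0;
    -- Application code reads from the view, so deleted rows stay hidden.
    CREATE VIEW live_node AS SELECT id, name FROM node WHERE deleted = 0;
    INSERT INTO node (name) VALUES ('root'), ('reports');
""")


def soft_delete(conn, node_id):
    """Mark the row as deleted instead of removing it."""
    conn.execute("UPDATE node SET deleted = 1 WHERE id = ?", (node_id,))


def restore(conn, node_id):
    """Undo a soft delete by clearing the flag."""
    conn.execute("UPDATE node SET deleted = 0 WHERE id = ?", (node_id,))


def live_names(conn):
    return [r[0] for r in conn.execute("SELECT name FROM live_node ORDER BY name")]


soft_delete(conn, 2)
after_delete = live_names(conn)    # 'reports' is hidden

restore(conn, 2)
after_restore = live_names(conn)   # 'reports' is back

soft_delete(conn, 2)
# A new live row may reuse the name of a soft-deleted one.
conn.execute("INSERT INTO node (name) VALUES ('reports')")
after_reuse = live_names(conn)
total_rows = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]
print(after_delete, after_restore, after_reuse, total_rows)
```

One consequence of the filtered index in this sketch: restoring the old 'reports' row after its name has been reused would violate the index, so an undo feature needs a policy for that case.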
As for the cascades you have two options. Either do an ON UPDATE trigger that will update the child rows or add the deleted column to the FK and define it as ON UPDATE CASCADE.
You'll get the whole revert functionality at the cost of one extra column (and of not being able to delete rows to save space unless you do it manually).
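The second cascade option above (adding the deleted column to the foreign key with ON UPDATE CASCADE) can be sketched as follows. This is a minimal SQLite example driven from Python, with hypothetical `node` and `model` tables; other databases have their own syntax and requirements for composite foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE node (
        id INTEGER PRIMARY KEY,
        deleted INTEGER NOT NULL DEFAULT 0,
        UNIQUE (id, deleted)              -- target of the composite FK
    );
    CREATE TABLE model (
        id INTEGER PRIMARY KEY,
        node_id INTEGER NOT NULL,
        deleted INTEGER NOT NULL DEFAULT 0,
        -- The child's flag follows the parent's flag automatically.
        FOREIGN KEY (node_id, deleted)
            REFERENCES node (id, deleted) ON UPDATE CASCADE
    );
    INSERT INTO node (id) VALUES (1), (2);
    INSERT INTO model (node_id) VALUES (1), (1), (2);
""")


def model_flags(conn):
    return conn.execute("SELECT id, deleted FROM model ORDER BY id").fetchall()


# Soft-delete node 1: its two models are flagged by the cascade.
conn.execute("UPDATE node SET deleted = 1 WHERE id = 1")
flags_after_delete = model_flags(conn)

# Undo: clearing the flag on the node cascades back to the models.
conn.execute("UPDATE node SET deleted = 0 WHERE id = 1")
flags_after_restore = model_flags(conn)
print(flags_after_delete, flags_after_restore)
```

A limitation of this design choice: because the flag is part of the foreign key, a child row cannot be soft-deleted on its own while its parent is live. If that is needed, the trigger-based option is the more flexible one.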
I am getting the error below while inserting a record into a table via a Siebel Operation step.
The error refers to a field that is based on a picklist. Could anyone please suggest why I am getting this error?
SBL-DAT-00225: The value entered in field District of buscomp Contact_Address_LT does not match any value in the bounded pick list PickList Comm Resolution.
SBL-BPR-00100: This error is returned when the workflow/task is executing the Siebel Operation business service.
I am aware that this happens when the value is not defined in the picklist. But I have verified this, and the LOV does contain the value I am trying to insert.
This error is quite common and can happen for a couple of reasons.
As you mentioned, you have already checked that the value being inserted exists in the LOV defined for the picklist.
I recently faced this error and spent hours debugging it. Try the steps below to sort out your problem.
Check the following points:
1) Check the pick map for this field and see whether it contains any constraint field.
2) If it does, check whether those constraint fields are also being inserted in the same Siebel Operation step. Siebel does not process the input arguments in sequence, so if this is the case, do step 3 to resolve your issue.
3) Split the insert into 2 parts: first insert the record with the values that are used in the pick map constraint, then update the same record with the picklist field. This ensures that all the required fields are populated before the picklist is validated.
From 8.1.1.4 onward, the solution is to add a parameter to the OM's config file, e.g. fins.cfg:
[Task]
ProcessArgAsc = true
More details are available in My Oracle Support.
I have an update query that is based on the result of a select, typically returning more than 1000 rows.
If some of these rows are updated by other queries before this update can touch them could that cause a problem with the records? For example could they get out of sync with the original query?
If so would it be better to select and update individual rows rather than in batch?
If it makes a difference, the query is being run on Microsoft SQL Server 2008 R2
Thanks.
No.
A row cannot be updated by one statement while another transaction is still in the process of updating it; the second writer has to wait.
Databases use concurrency control and have ACID properties to prevent exactly this type of problem.
I would recommend reading up on isolation levels. The default in SQL Server is READ COMMITTED, which means that other transactions cannot read data that has been updated but not committed by a given transaction.
This means that data returned by your select/update statement will be an accurate reflection of the database at a moment in time.
If you were to change your database to READ UNCOMMITTED, then you could get into a situation where the data from your select/update is out of sync.
If you're selecting first and then updating, you can use a transaction:
BEGIN TRAN
-- your select WITHOUT LOCKING HINT
-- your update based upon select
COMMIT TRAN
However, if you're updating directly from a select, then, no need to worry about it. A single transaction is implied.
UPDATE mytable
SET value = mot.value
FROM myOtherTable mot
BUT... do NOT do the following: NOLOCK reads uncommitted data, so the update can end up being based on dirty reads
UPDATE mytable
SET value = mot.value
FROM myOtherTable mot WITH (NOLOCK)
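The single-statement form shown above omits a join condition between the two tables. The following minimal sketch shows the same idea with an explicit key, assuming a hypothetical `id` column on both tables. It runs on SQLite through Python purely to keep the example self-contained, uses a correlated subquery instead of T-SQL's UPDATE ... FROM, and does not reproduce SQL Server's locking or isolation behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE myOtherTable (id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO mytable VALUES (1, 'old-1'), (2, 'old-2'), (3, 'old-3');
    INSERT INTO myOtherTable VALUES (1, 'new-1'), (2, 'new-2');
""")

# One statement reads the source rows and writes the target rows,
# so there is no gap between a separate SELECT and UPDATE.
with conn:
    updated = conn.execute("""
        UPDATE mytable
        SET value = (SELECT mot.value
                     FROM myOtherTable mot
                     WHERE mot.id = mytable.id)
        WHERE EXISTS (SELECT 1
                      FROM myOtherTable mot
                      WHERE mot.id = mytable.id)
    """).rowcount

rows = conn.execute("SELECT id, value FROM mytable ORDER BY id").fetchall()
print(updated, rows)
```

The WHERE EXISTS clause keeps rows without a match (id 3 here) untouched instead of setting them to NULL.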