Error while saving transformation in pentaho spoon - kettle

I am getting below error while I save the transformation in pentaho spoon:
Error saving transformation to repository!
Error updating batch
Cannot insert duplicate key row in object 'dbo.R_STEP_ATTRIBUTE' with unique index 'IDX_RSAT'. The duplicate key value is (2314, PARTITIONING_SCHEMA, 0).
Everything was working fine before I ran a job that creates multiple excel files. While this job was running suddenly a memory issue occurred and the job was aborted. After that I tried to save my file but it is deleted for saving but not been saved. So I lost the job I created.
Please help me to know the reason.

The last save of the directory did not end gracefully.
There is a small chance that you can repair it by easing the db-caches file in the .kettle directory.
If it does not work, create a new repository and copy the current in the new. Try the global repository export/import. Then erase the old rep and do the same from the just rebuild repository.
The intermediary repository may be on files rather than on a database.
If it is the first time you do this, plan for a one-two hours.

There is a easy way to recover this.
As AlainD says, the problem occurs when you save or delete a transformations, and suddenly you lost the connection or had a problem with Kettle.
When that occurs, you will find a lot of step records into the table R_STEP_ATTRIBUTE. In the error shown is the [ID_TRANSFORMATION] = 2314.
So, if you check the table R_TRANSFORMATION with [ID_TRANSFORMATION] = 2314, maybe wont find any transformation with that id.
After check that, you can delete all the records related with that [ID_TRANSFORMATION], for example:
delete from R_STEP_ATTRIBUTE where ID_TRANSFORMATION=2314

We just solved this issue by executing the following SQL statement
DELETE
FROM R_STEP_ATTRIBUTE
WHERE ID_STEP NOT IN (SELECT ID_STEP FROM R_STEP)

Related

Waiting for a table to be completely deleted

I have a table that has to be refreshed daily from an external source. All the recommendations I read say to delete the whole table and re-create it instead of deleting all the items.
I tried the suggested method, but the deleteTable function returns successful even though the table is still in a state of "Table is being deleted", as seen from the DynamoDB console. Sometimes this takes more than a minute.
What is the proper way of deleting and re-creating a table? Should I just keep trying createTable until the already exists error goes away?
I am using Node.js.
(The table is a list of some 5,000+ bus stops. The source doesn't specify how often the data changes nor give any indicator that there are changes. I found a small number of changes once every few weeks.)
If you are using boto3 (Python), there is a waiter called TableNotExists:
Polls DynamoDB.Client.describe_table() every 20 seconds until a successful state is reached. An error is returned after 25 failed checks.
Or, you could just do that polling yourself.
I would suggest changing the table name each day, using the current date as part of the table name. Then you can create the new table and start populating it without having to wait for the delete of the previous day's table to complete.
If the response from the createTable method is a Table already exists exception, the exception also contains a retryDelay property that is a number.
I can't find documentation on retryDelay but it seems to be a time duration in seconds.
I use the Table already exists exception to check that the table is not completely deleted, and, if not, back off for a period specified in the retryDelay property. After a few iterations, the table can be successfully created.
Sometimes the value in retryDelay can be more than 20.
This approach has worked without issues for me every time.

Informatica Update Stragegy is not flagging records

Hello I have a simple mapping, that basically has a router to decided whether the record has to be inserted or updated and then use Update Strategy to flag the row.
The records were updating and inserting as expected, I had to make some modifications to the logic and did the required changes.
And now the records are no more getting flagged as an insert or an update. Below settings :
1) DD_UPDATE and DD_INSERT coded in the update strategy.
2) At session level, treat source as set to Data Driven.
3) The 2 targets set to update as update and insert respectively.
I even ran a debugger to see what is happening, the insert update records are passing through the update strategy, however the row type is set to blank when its passed to the target instance :( what could be the issue?
I finally found the issue. Both the Update Strategy's were corrupted. deleting and recreating the update strategy's resolved the issue :) Thanks for your help!

C++ and Sqlite DELETE query doesn`t actually delete the value from the database file

I`ve came accross this issue on SQlite and c++ and i can't find any answer on it.
Everything is working fine in SQlite and c++ all queries all outputs all functions but i have this question that can`t find any solution around it.
I create a database MyTest.db
I create a table test with an id and a name as fields
I enter 2 values to each id=1 name=Name1 and id=2 name=Name2
I delete the 2nd value
The data inside table now says that i have only the id=1 with name=Name1
When i open my Mytest.db with notepad.exe the values that i have deleted such as id=2 name=Name2 are still inside the database file though that it doesn`t come to the data results of this table but still exists though.
What i like to ask from anyone that knows about it is this :
Is there any other way that the value has to be deleted also from the database file or is it my mistake with the DELETE option of SQLITE (that i doubt it)
Its like the database file keeps collecting all the trash inside it without removing DELETED values from its tables...
Any help or suggestion is much appreciated
If you use "PRAGMA secure_delete=ON;" then SQLite overwrites deleted content with zeros. See https://www.sqlite.org/pragma.html#pragma_secure_delete
Even with secure_delete=OFF, the deleted space will be reused (and overwritten) to store new content the next time you INSERT. SQLite does not leak disk space, nor is it necessary to VACUUM in order to reclaim space. But normally, deleted content is not overwritten as that uses extra CPU cycles and disk I/O.
Basically all databases only mark rows active or inactive, they won't delete the actual data from the file immediately. It would be a huge waste of time and resources, since that part of the file can just be reused.
Since your queries show that the row isn't active in results, is this in some way an issue? You can always run a VACUUM on the database if you want to reclaim the space, but I would just let the database engine handle everything by itself. It won't "keep collecting all the trash inside it", don't worry.
If you see that the file size is growing and the space is not reused, then you can issue vacuums from time to time.
You can also test this by just inserting other rows after deleting old ones. The engine should reuse those parts of the file at some point.

Rails 4 ActiveRecord Sql Server - Unable to save binary into image column

We are working to upgrade our application to a more current version of Ruby & Rails. Our app integrates with a legacy database (SQL Server 2008 R2) that has a table with a column of image data type (we are unable to change this column to varbinary(max)). Previously we were able to save a binary into the image column. However now we are getting conversion errors.
We are working to upgrade to the following (among others):
Rails 4.2.1
ActiveRecord_SQLServer_Adapter (4.2.4)
tiny_tds (0.6.3.rc1)
freeTDS (v0.91.112)
When we now attempt to save into the image column, we get errors similar to:
TinyTds::Error: Unclosed quotation mark after the character string
Researching various issues within tiny_tds & activerecord_sqlserver_adapter, we decided to create a second table that matched the first but change the data type from image to varbinary(max). We can save a binary into the column.
The code causing the challenge is in a background job where we grab images from s3, store them locally and then push the image into the database. Again, we don't control the legacy database and thus can't change the data type (or confront the issue of why we are storing the image in the db in the first place).
...
#d = Doc.new
...
open("#{Rails.root}/cache/pictures/image.png", "wb") do |file|
file << open(r.image.url).read
end
#d.document = File.binread("#{Rails.root}/cache/pictures/image.png")
#d.save!
Given the upgrade has broken our saving images, we are trying to figure out how best to determine a fix. We could obviously roll back until we find a version that works. However we hope to find a fix. Anyone have any ideas?
Update:
We added the following configuration as we had triggers on the table being inserted: ActiveRecord::ConnectionAdapters::SQLServerAdapter.use_output_inserted = true
When we remove this configuration we get the following error:
TinyTds::Error: The target table 'doc' of the DML statement cannot have any enabled triggers if the statement contains an OUTPUT clause without INTO clause.
Note: We are unable to make any modifications to the triggers.
Per feedback on the ActiveRecord_SQLServer_Adapter site, we rolled back to 4.1.11 and we are now able to save into the image column.
We also had to add this snippet to overcome the issue with the triggers.

Sitecore media conversion tool eating storage space

I have a question regarding the media conversion tool for Sitecore.
With this module you can convert media items between a hard drive location and a Sitecore database, and vice versa. But each time I convert some items it keeps taking additional harddrive space.
So when I convert 3gb to the hard drive it adds an additional 3gb (which seems logic -> 6gb total) but then when I convert them back to the blob format it adds another 3gb (9gb total). Instead of overwriting the previous version in the database.
Is there a way to clean the previous blobs or something? Because now it is using too much hard drive space.
Thanks in advance.
Using "Clean Up Databases" should work, but if the size gets too large, as my client's blob table did, the clean up will fail due to either a SQL timeout or because SQL Server uses up all the available locks.
Another solution is to run a script to manually clean up the blobs table. We had this issue previously and Sitecore support was able to provide us with a script to do so:
DECLARE #UsableBlobs table(
ID uniqueidentifier
);
I-N-S-E-R-T INTO
#UsableBlobs
S-E-L-E-C-T convert(uniqueidentifier,[Value]) as EmpID from [Fields]
where [Value] != ''
and (FieldId='{40E50ED9-BA07-4702-992E-A912738D32DC}' or FieldId='{DBBE7D99-1388-4357-BB34-AD71EDF18ED3}')
D-E-L-E-T-E from [Blobs]
where [BlobId] not in (S-E-L-E-C-T * from #UsableBlobs)
This basically looks for blobs that are still in use and stores them in a temp table. It them compares the items in this table to the Blobs table and deletes the ones that aren't in the temp table.
In our case, even this was bombing out due to the SQL Server locks problem, so I updated the delete statement to be delete top (x) from [Blobs] where x is a number you feel is more appropriate. I started at 1000 and eventually went up to deleting 400,000 records at a time. (Yes, it was that large)
So try the built-in "Clean Up Databases" option first and failing that, try to run the script to manually clean the table.
Edit note: Sorry, had to change the "insert", "select" and "delete" commands to allow SO to save the entry.