Berkeley DB Environment issues - C++

So we're using Berkeley DB, and our API is built on the BDB C++ API. We recently added some new indexes to our database, which meant migrating all the old data so the new indexes cover the old records. Since then, whenever we start up the process that writes to the database, we get these warnings:
BDB2058 Warning: Ignoring DB_SET_LOCK_TIMEOUT when joining the environment.
BDB2059 Warning: Ignoring DB_SET_TXN_TIMEOUT when joining the environment.
If I'm understanding those correctly, we now run the risk of deadlocking since it's 'ignoring' the timeouts we set. I'm also seeing the process randomly hang when trying to write to the database, and right now the only way to get around it is to restart the process. Does anyone know what would cause these warnings, or how I might go about debugging the environment instantiation to find out? Any help or suggestions would be appreciated.

The timeouts are likely a persistent, global attribute of the environment itself, not an attribute of each process's handle on the dbenv.
You might try running db_recover on the environment to remove the __db.NNN region files.
Otherwise, you may have multiple processes sharing a dbenv, and the warnings indicate that later processes joining the environment are trying to change attributes that are already set.
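If that is the cause, here is a minimal sketch of the idea. It uses the Python bsddb3 bindings purely for illustration (the DbEnv::set_timeout / DbEnv::open calls in the C++ API are analogous), and the environment path and timeout values are assumptions: the point is that the timeouts only take effect in the process that creates or recovers the environment region, not in processes that merely join it.

    from bsddb3 import db

    env = db.DBEnv()

    # Set the timeouts (in microseconds) *before* opening the environment.
    env.set_timeout(1000000, db.DB_SET_LOCK_TIMEOUT)
    env.set_timeout(1000000, db.DB_SET_TXN_TIMEOUT)

    # DB_RECOVER re-creates the environment region (the __db.NNN files),
    # so this process creates the environment rather than joining a stale
    # one left behind by the migration, and the timeouts stick.
    env.open('/path/to/env',                      # hypothetical env home
             db.DB_CREATE | db.DB_RECOVER |
             db.DB_INIT_TXN | db.DB_INIT_LOCK |
             db.DB_INIT_LOG | db.DB_INIT_MPOOL)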

Related

MySQL crash on DROP FUNCTION

I have created a UDF through the CREATE FUNCTION command, and now when I try to drop it the server crashes. According to the docs, this is a known issue:
To upgrade the shared library associated with a UDF, issue a DROP FUNCTION statement, upgrade the shared library, and then issue a CREATE FUNCTION statement. If you upgrade the shared library first and then use DROP FUNCTION, the server may crash.
It does, indeed, crash, and afterwards any attempt to remove the function crashes, even if I completely remove the DLL from the plugin directory. During development I'm continually replacing the library that defines the UDF functions. I've already re-installed MySQL from scratch once today and would rather not do it again. Aside from being more careful, is there anything I can do to e.g. clean up the mysql.* tables manually so as to remove the function?
Edit: after some tinkering, the server seems to have settled into a pattern: it crashes until I remove the offending DLL, and after that it issues Error Code: 1305: FUNCTION [schema].[functionName] does not exist. If I attempt to drop the function as root, I get the same message but without the schema prefix.
SELECT * FROM mysql.func shows the function. If I remove the record by hand, I get the same 1305 error.
Much of the data in the system tables in the mysql schema is cached in memory on first touch. After that, modifying the tables by hand may not have the expected effect unless the server is restarted.
For the grant tables, a mechanism for flushing any cached data is provided -- FLUSH PRIVILEGES -- but for other tables, like func and the time zone tables, the only certain way to ensure that manual changes to the tables are all taken into account is to restart the server process.
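So the manual cleanup you already tried should work once the server has been restarted. As a sketch of the sequence (using the mysql-connector-python driver here purely for illustration; the credentials and the function name are placeholders):

    import mysql.connector  # hypothetical choice of driver

    cnx = mysql.connector.connect(user='root', password='...', host='127.0.0.1')
    cur = cnx.cursor()

    # Inspect the registration left behind by the crashed DROP FUNCTION.
    cur.execute("SELECT name, dl FROM mysql.func")
    print(cur.fetchall())

    # Remove the stale row by hand (functionName is a placeholder).
    cur.execute("DELETE FROM mysql.func WHERE name = %s", ("functionName",))
    cnx.commit()
    cur.close()
    cnx.close()

    # The change is not picked up until the server process is restarted;
    # after the restart, CREATE FUNCTION can register the rebuilt shared
    # library again.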

Restore items accidentally overwritten by importing a Package

Problem:
I imported a small package of about 15 items from one of our databases to another, and somehow the children of one of the items got overwritten in the process.
I may have incorrectly selected "overwrite" instead of "merge" but I'm not sure about that.
The worst thing is, I also published to the web DB after import, so the items are not in the web DB either.
Things I checked:
Checked the Recycle Bin, not there
Also checked the Archive, not there either
Even wrote a piece of code to find the items by ID in the DB, FAILED
My question:
Are the items overwritten by the Launch Wizard gone forever? Or could there still be a trace of them remaining in the DB?
There is no "rollback or uninstall a package" feature out of the box in Sitecore. This seems to be the only available info regarding the matter.
I've heard of some shared source modules which could be useful, but never tried them personally.
I think your best option is to restore the items from a database backup, or to revert content if you have a serialized copy on the file system.

Why can't sqlite3 work with NFS?

I switched to sqlite3 instead of MySQL because I had to run many jobs on a PBS system that doesn't have MySQL. Of course, on my own machine I don't use NFS, while the PBS system does. After spending lots of time switching to sqlite3, I went to run many jobs and ended up corrupting my database.
Of course the sqlite3 FAQ does mention NFS, but I didn't even think about it when I started.
I can copy the database at the beginning of each job, but merging the results afterwards will be a nightmare!
I would never recommend sqlite to any of my colleagues for this simple reason: "sqlite doesn't work (on the machines that matter)"
I have read rants about NFS locking not being up to par and this being NFS's fault.
I have tried a few workarounds, but as this post suggests, it is not possible.
Isn't there a workaround which sacrifices performance?
So what do I do? Try some other db software? Which one?
You are using the wrong tool. Saying "I would never recommend sqlite ..." based on this experience is a bit like saying "I would never recommend glass bottles" after they keep breaking when you use them to hammer in a nail.
You need to specify your problem more precisely. My attempt to read between the lines of your question gives me something like this:
You have many nodes that get work through some unspecified path, and produce output. The jobs do not interact because you say you can copy the database. The output from all the jobs can be merged after they are finished. How do you effectively produce the merged output?
Given that as the question, this is my advice:
Have each job produce its output in a structured file, unique to that job. After the jobs are finished, write a program to parse each file and insert it into an sqlite3 database (a sketch of such a merge step is below). This uses NFS in a way it can handle (a single process writing sequentially to a file) and uses sqlite3 in a way that is also sensible (a single process writing to a database on a local filesystem). It avoids NFS locking issues while the jobs are running, and should improve throughput because there is no contention on the sqlite3 database.
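A minimal sketch of that merge step, assuming each job wrote a CSV file named results_<jobid>.csv with columns job_id, key, value (the file layout, names and schema here are made up for illustration):

    import csv
    import glob
    import sqlite3

    # Open the merged database on a local filesystem, not on NFS.
    conn = sqlite3.connect("merged.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS results (
            job_id TEXT,
            key    TEXT,
            value  REAL
        )
    """)

    for path in glob.glob("results_*.csv"):
        with open(path, newline="") as f:
            rows = [(r["job_id"], r["key"], float(r["value"]))
                    for r in csv.DictReader(f)]
        # One transaction per job file keeps the merge fast and atomic.
        with conn:
            conn.executemany(
                "INSERT INTO results (job_id, key, value) VALUES (?, ?, ?)",
                rows,
            )

    conn.close()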

Process queue as folder with files. Possible problems?

I have an executable that needs to process records in the database when a command arrives telling it to do so. Right now I am issuing commands via a TCP exchange, but I don't really like that approach because
a) the queue is not persistent between sessions
b) the TCP port might get locked
The idea I have is to create a folder and place files in it whose names match the commands I want to issue
Like:
1.23045.-1.1
2.999.-1.1
Then, after a command has been processed, its file will be deleted or moved to an Errors folder.
Is this viable or are there some unavoidable problems with this approach?
P.S. The process will be used on a Linux system, so antivirus problems are out of the question.
Yes, a few.
First, there are all the problems associated with using a filesystem. Antivirus programs are one (though I cannot see why that wouldn't apply to Linux - no delete locks?). Disk space and file/directory count maximums are others. Then there are open file limits and permissions...
Second, race conditions. If there are multiple consumers, more than one of them might see and start processing the command before the first one has [re]moved it.
There are also the issues of converting commands to filenames and vice versa, and coming up with different names for a single command that needs to be issued multiple times. (Though these are programming issues, not design ones; they'll merely annoy.)
None of these may apply or be of great concern to you, in which case I say: Go ahead and do it. See what we've missed that Real Life will come up with.
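If you do go this route, one common way to sidestep the multiple-consumer race is to claim a command file with an atomic rename before processing it. A minimal sketch (the directory names and the process_command stub are hypothetical, and this assumes all three directories live on the same filesystem so the rename is atomic):

    import os

    QUEUE_DIR = "queue"        # incoming command files, e.g. "1.23045.-1.1"
    WORK_DIR = "in-progress"   # files claimed by this consumer
    ERROR_DIR = "errors"       # files whose commands failed

    def process_command(name):
        # Hypothetical: parse the dotted filename and act on the records.
        print("processing", name)

    def drain_queue():
        for name in sorted(os.listdir(QUEUE_DIR)):
            src = os.path.join(QUEUE_DIR, name)
            claimed = os.path.join(WORK_DIR, name)
            try:
                # rename() within one filesystem is atomic, so only one
                # consumer can successfully claim a given command file.
                os.rename(src, claimed)
            except FileNotFoundError:
                continue  # another consumer claimed it first
            try:
                process_command(name)
                os.remove(claimed)
            except Exception:
                os.rename(claimed, os.path.join(ERROR_DIR, name))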
I probably would use an MQ server for anything approaching "serious", though.

If using South, does it even matter what the underlying database is?

Today I took my first steps into PostgreSQL, since it's recommended by the Django team.
I came across several issues, which I patiently solved one by one.
1) Creating tables under PostgreSQL requires logging in as a different OS user, whose password you don't even know. Fine, I found the solution and created the database.
2) After running syncdb, you can't simply execute a plain INSERT statement like this:
INSERT INTO App_contacttype (contact_type, company_id) VALUES ('Buyer', 1),('Seller', 1);
Since Django creates the table with a quoted name, the table name is case sensitive, so the statement has to look like this:
INSERT INTO "App_contacttype" (contact_type, company_id) VALUES ('Buyer', 1),('Seller', 1);
But the problems never seem to end. Now the insert script suddenly fails with:
ERROR: value too long for type character varying(40)
SQL state: 22001
In MySQL this was no problem. I don't know; right now I'm getting a bit of cold feet, and maybe I should just stick with MySQL.
The only reason I was considering PostgreSQL was that some research suggested it has much better support than MySQL for changing schemas along the way.
However, considering that http://south.aeracode.org/ would take away all the pain of syncing schemas, would I even need to worry about schema changes at all, no matter what the underlying database is?
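For context on the "value too long" error: it usually just means the inserted value is longer than the max_length declared on the model field. PostgreSQL enforces the declared length strictly, while MySQL in its default non-strict mode silently truncates, which is why the same data "worked" there. A hypothetical model matching the App_contacttype table above (field names guessed from the INSERT statement):

    from django.db import models

    class ContactType(models.Model):
        # character varying(40) in PostgreSQL; any longer value raises
        # "value too long for type character varying(40)".
        contact_type = models.CharField(max_length=40)
        company = models.ForeignKey('Company', on_delete=models.CASCADE)

    # Raising max_length (and applying the change with a South migration)
    # or shortening the data both make the INSERT succeed; South generates
    # the ALTER TABLE for either backend.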