MySQL crash on DROP FUNCTION - c++

I have created a UDF through the CREATE FUNCTION command, and now when I try to drop it the server crashes. According to the docs, this is a known issue:
To upgrade the shared library associated with a UDF, issue a DROP FUNCTION statement, upgrade the shared library, and then issue a CREATE FUNCTION statement. If you upgrade the shared library first and then use DROP FUNCTION, the server may crash.
It does, indeed, crash, and afterwards any attempt to remove the function crashes, even if I completely remove the DLL from the plugin directory. During development I'm continually replacing the library that defines the UDF functions. I've already re-installed MySQL from scratch once today and would rather not do it again. Aside from being more careful, is there anything I can do to e.g. clean up the mysql.* tables manually so as to remove the function?
Edit: after some tinkering, the database seems to have settled into a pattern of crashing until I have removed the offending DLL, and after that issuing Error Code: 1305: FUNCTION [schema].[functionName] does not exist. If I attempt to drop the function as root, I get the same message but without the schema prefix.
SELECT * from mysql.func shows the function. If I remove the record by hand, I get the same 1305 error.

Much of the data in the system tables in the mysql schema is cached in memory on first touch. After that, modifying the tables by hand may not have the expected effect unless the server is restarted.
For the grant tables there is a mechanism for flushing any cached data -- FLUSH PRIVILEGES -- but for other tables, such as func and the time zone tables, the only certain way to ensure that manual changes are taken into account is to restart the server process.
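Concretely, the cleanup sketched above might look like this, run as root; the function and library names are placeholders, not taken from your setup:

DELETE FROM mysql.func WHERE name = 'my_udf';
-- restart mysqld here so the cached copy of mysql.func is discarded,
-- and copy the new DLL into the plugin directory before re-registering
CREATE FUNCTION my_udf RETURNS STRING SONAME 'my_udf.dll';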

Why are direct writes to Amazon S3 eliminated in EMR 5.x versions?

After reading this page:
http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive-differences.html
"Operational Differences and Considerations" -> "Direct writes to Amazon S3 eliminated" section.
I wonder - does this mean that writing to S3 from Hive in EMR 4.x versions will be faster than 5.x versions?
If so, isn't it kind of regression? why would AWS want to eliminate this optimization?
Writing to a Hive table which is located in S3 is a very common scenario.
Can someone clear up that issue?
This optimization was originally developed by Qubole and contributed to Apache Hive.
See here.
This feature is rather dangerous because it bypasses Hive's fault-tolerance mechanism and also forces developers to use otherwise unnecessary intermediate tables, which in turn degrades performance and increases cost.
A very common use case is merging incremental data into a partitioned target table, as described here. The query is an INSERT OVERWRITE of a table from itself; without an intermediate table (that is, in a single query) it is quite efficient. The query can be much more complex, with many tables joined. This is what happens with direct writes enabled in this use case (a sketch of such a merge query follows the list):
1. The partition folder is deleted before the query has finished. A mapper reading the same table that is being written fails with a FileNotFoundException, because the partition folder was deleted before the mapper executed.
2. If the target table is initially empty, the first run succeeds, because Hive knows there are no partitions and does not read the folder. The second run fails as in (1): the folder is deleted before the mappers finish.
3. The known workaround has a performance impact. Loading data incrementally is a very common use case, and the direct-write feature forces developers to use a temporary table in this scenario to avoid the FileNotFoundException and table corruption. As a result, the task is much slower and more costly than it would be with the feature disabled, writing the target table from itself.
4. After the first failure, a successful restart is impossible: the table is neither selectable nor writable, because the partition exists in the Hive metadata but its folder does not, which causes a FileNotFoundException even in other queries against the table that do not overwrite it.
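For illustration, here is a rough sketch of the kind of single-query incremental merge meant above; the table and column names (target, increment, id, value, dt) are made up for the example:

-- target is partitioned by dt; increment holds new and changed rows
INSERT OVERWRITE TABLE target PARTITION (dt)
SELECT COALESCE(i.id, t.id)       AS id,
       COALESCE(i.value, t.value) AS value,
       COALESCE(i.dt, t.dt)       AS dt
FROM target t
FULL OUTER JOIN increment i
  ON t.id = i.id AND t.dt = i.dt;
-- the SELECT reads target while the INSERT OVERWRITE rewrites it, which is
-- exactly the pattern that direct writes to S3 break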
The same is described in less detail on the Amazon page you are referring to: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive-differences.html
Another possible issue is described on the Qubole page; an existing fix using prefixes is mentioned, though it does not help with the use case above, because writing new files into a folder that is being read still creates a problem.
Also, mappers and reducers may fail and restart, and the whole session may fail and restart. Writing files directly, even with deletion of the old ones postponed, does not seem like a good idea, because it increases the chance of unrecoverable failure or data corruption.
To disable direct writes, set this configuration property:
set hive.allow.move.on.s3=true; --this disables direct write
You can use this feature for small tasks and when you are not reading the same table that is being written, though for small tasks it will not gain you much. The optimization is most effective when you are rewriting many partitions in a very big table and the move task at the end is extremely slow; then you may want to enable it, at the risk of data corruption.

Restore items accidentally overwritten by importing a Package

Problem:
I imported a small package of about 15 items from one of our DBs to another one, and somehow in the process the children of one of the items got overwritten.
I may have incorrectly selected "overwrite" instead of "merge" but I'm not sure about that.
The worst thing is, I also published to the web DB after import, so the items are not in the web DB either.
Things I checked:
Checked the Recycle Bin, not there
Also checked the Archive, not there either
Even wrote a piece of code to find the items by ID in the DB, FAILED
My question:
Are the items overwritten by the Launch Wizard gone forever? Or could there still be a trace of them remaining in the DB?
There is no "rollback or uninstall a package" feature out of the box in Sitecore. This seems to be the only available info regarding the matter.
I've heard of some shared source modules which could be useful, but never tried them personally.
I think your best option is to restore the items from a database backup, or to revert the content if you have a serialized copy on the file system.

Change stored macro SAS

In SAS, the SASMSTORE option lets me specify the location where the SASMACR catalog will exist. Some macros will reside in this catalog.
At some point I may need to change one of those macros, and that moment may occur while the macro, and therefore the catalog, is in use by another user. The catalog will then be locked and unavailable for modification.
How can I avoid such a situation?
If you're using a SAS Macro catalog as a public catalog that is shared among colleagues, a few options exist.
First, use SVN or a similar source control option so that you and your colleagues each have a local copy of the macro catalog. This is my preferred option. I'd do this, and also probably not use stored compiled macros - I'd just set it up as autocall macros, personally - because that makes it easy to resolve conflicts (you have a separate file for each macro). With stored compiled macro catalogs (SCMs) you won't be able to resolve conflicts, so you'll have to make sure everyone is very well behaved about always downloading the newest copy before making any changes, and that changes are discussed so you don't have two competing changes made at about the same time. If SCMs are important for your particular use case, you could version control the macro source files that create the SCM and rebuild the SCM yourself every time you refresh your local copy of the sources.
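For the autocall route, the setup is little more than an option statement pointing at the checked-out folder (the path here is just a placeholder):

/* each macro lives in its own .sas file in the checked-out folder;  */
/* calling %somemacro compiles it from somemacro.sas on first use    */
options mautosource sasautos=("/home/me/svn/macros" sasautos);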
Second, you could and should separate development from production here. Even if you have a shared library located on a shared network folder, you should have a development copy as well that is explicitly not locked by anyone except when developing a new macro for it (or updating a currently used macro). Then make your changes there, and on a consistent schedule push them out once they've been tested and verified (preferably in a test environment, so you have the classic three: dev, test, and prod environments). Something like this:
Changes in Dev are pushed to Test on Wednesdays. Anyone who's got something ready to go by Wednesday 3pm puts it in a folder (the macro source code, that is), and it's compiled into the test SCM automatically.
Test is then verified Thursday and Friday. Anything that is verified in Test by 3pm Friday is pushed to the Prod source code folder at that time, paying attention to any potential conflicts with other new code in Test (nothing is pushed to Prod if something currently in Test but not yet verified could conflict with it).
Production then is run at 3pm Friday. Everyone has to be out of the SCM by then.
I suggest not using Friday for prod if you have something that runs over the weekend, of course, as it risks you having to fix something over the weekend.
Create two folders, e.g. maclib1 and maclib2, and a dataset which stores the current library number.
When you want to rebuild your library, query the current number, increment (or reset to 1 if it's already 2), assign your macro library path to the corresponding folder, compile your macros, and then update the dataset with the new library number.
When it comes to assigning your library, query the current library number from the dataset, and assign the library path accordingly.
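A rough sketch of that rotation; the library names, paths and control dataset are placeholders, and it assumes ctrl.current_lib was created once with libnum=1:

/* ---- rebuild side: compile into the folder that is NOT in use ---- */
libname ctrl "/shared/maclib_control";
proc sql noprint;
   select libnum into :cur trimmed from ctrl.current_lib;
quit;
%let next = %eval(3 - &cur);            /* 1 -> 2, 2 -> 1 */
libname newmac "/shared/maclib&next";
options mstored sasmstore=newmac;
%macro hello / store;                   /* recompile your real macros here */
   %put NOTE: compiled into the stored macro library;
%mend hello;
data ctrl.current_lib;                  /* switch readers to the new folder */
   libnum = &next;
run;

/* ---- user side: attach to whichever folder is current ---- */
proc sql noprint;
   select libnum into :cur trimmed from ctrl.current_lib;
quit;
libname maclib "/shared/maclib&cur" access=readonly;
options mstored sasmstore=maclib;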

Berkeley DB Environment issues

So we're using Berkeley DB, and our API uses the BDB C++ API. We recently added some new indexes on our database. After adding them, we needed to migrate all the old data so the new indexes would cover the old records, and since then, whenever we start up the process that writes to the database, we get these warnings:
BDB2058 Warning: Ignoring DB_SET_LOCK_TIMEOUT when joining the environment.
BDB2059 Warning: Ignoring DB_SET_TXN_TIMEOUT when joining the environment.
If I'm understanding those correctly, we now run the risk of deadlocking, since it's 'ignoring' the timeouts we set. I'm also seeing the process randomly hang when trying to write to the database. The only way to get around it right now is to restart the process. My question is whether anyone knows what would cause these warnings, or how I might go about debugging the environment instantiation to find out. Any help or suggestions would be appreciated.
The timeouts are likely a persistent, global attribute of the shared environment (dbenv), not an attribute of each process's handle to it.
You might try running db_recover against the environment home to remove the __db.NNN region files.
Otherwise you may have multiple processes sharing a dbenv, and the warning indicates that processes joining later are trying to change attributes that are already set.
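To avoid tripping over that, here is a minimal sketch of how a process could be set up, assuming the timeouts live in the shared environment region (the home path and timeout values are placeholders, error handling trimmed):

// Build against the BDB C++ API (link with db_cxx).
#include <db_cxx.h>

int main() {
    DbEnv env(0);

    // Timeouts are in microseconds.
    const db_timeout_t lock_timeout = 5 * 1000000;  // example value
    const db_timeout_t txn_timeout  = 5 * 1000000;

    const u_int32_t flags = DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG |
                            DB_INIT_MPOOL | DB_INIT_TXN;
    env.open("/path/to/env/home", flags, 0);        // placeholder home dir

    // Calling set_timeout after open() updates the environment we are now
    // part of, instead of being discarded while joining an existing region.
    env.set_timeout(lock_timeout, DB_SET_LOCK_TIMEOUT);
    env.set_timeout(txn_timeout, DB_SET_TXN_TIMEOUT);

    // ... normal transactional work would go here ...

    env.close(0);
    return 0;
}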

COM: How to get all running objects

As you know, GetActiveObject can only get the COM object of the first opened application. How can I get all running objects? For example, if I run two Excel applications, how do I get both Excel objects in C++ code?
There is usually only one instance of Excel as Hans says. If there is only one instance it will refuse to open the same document twice.
But there may be more than one, typically if a second has been started explicitly. In that case it may open the same file (though you will get a warning about locking).
They may or may not both appear in the Running Object Table. Use the ROT viewer, or something like this, to determine whether that is the case:
http://social.msdn.microsoft.com/Forums/en-US/vsx/thread/ccccc9bd-f21a-4f74-a3f0-64a594fa1b16
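If both instances do show up there, a rough sketch of walking the Running Object Table from C++ looks like this; error handling is trimmed, and whether each workbook is registered in the ROT (and under what display name) depends on the Excel version and how the instances were started:

// Enumerate the ROT and print each entry's display name; Excel workbooks
// typically appear under their full document paths. rot->GetObject() would
// then return the registered object for whichever moniker you pick.
#include <windows.h>
#include <iostream>

int main() {
    CoInitialize(NULL);

    IRunningObjectTable *rot = NULL;
    if (SUCCEEDED(GetRunningObjectTable(0, &rot))) {
        IEnumMoniker *monikers = NULL;
        if (SUCCEEDED(rot->EnumRunning(&monikers))) {
            IMoniker *moniker = NULL;
            while (monikers->Next(1, &moniker, NULL) == S_OK) {
                IBindCtx *bindCtx = NULL;
                if (SUCCEEDED(CreateBindCtx(0, &bindCtx))) {
                    LPOLESTR name = NULL;
                    if (SUCCEEDED(moniker->GetDisplayName(bindCtx, NULL, &name))) {
                        std::wcout << name << std::endl;
                        CoTaskMemFree(name);
                    }
                    bindCtx->Release();
                }
                moniker->Release();
            }
            monikers->Release();
        }
        rot->Release();
    }

    CoUninitialize();
    return 0;
}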
Finally you might consider using Microsoft UI Automation:
http://msdn.microsoft.com/en-us/library/ms753388.aspx
http://msdn.microsoft.com/en-us/library/ms726294(VS.85).aspx