InitializeCriticalSection works in one project, but fails in another - c++

Using Visual Studio 2019 Professional on Windows 10 x64. I have several C++ DLL projects, some of which are multi-threaded. I'm using CRITICAL_SECTION objects for thread safety.
In DLL1:
CRITICAL_SECTION critDLL1;
InitializeCriticalSection(&critDLL1);
In DLL2:
CRITICAL_SECTION critDLL2;
InitializeCriticalSection(&critDLL2);
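For reference, both DLLs follow the usual pattern; here is a minimal sketch (the helper names are illustrative, not from the actual projects):

#include <windows.h>

static CRITICAL_SECTION critDLL1;

void InitLocks()                  // e.g. called once at DLL startup
{
    InitializeCriticalSection(&critDLL1);
}

void DoThreadSafeWork()
{
    EnterCriticalSection(&critDLL1);
    // ... touch shared state ...
    LeaveCriticalSection(&critDLL1);
}

void FreeLocks()                  // e.g. called once at DLL shutdown
{
    DeleteCriticalSection(&critDLL1);
}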
When I use critDLL1 with EnterCriticalSection or LeaveCriticalSection, everything is fine in both _DEBUG and NDEBUG builds. But when I use critDLL2, I get an access violation in 'ntdll.dll' in NDEBUG (though not in _DEBUG).
After popping up message boxes in NDEBUG mode, I was eventually able to track the problem down to the first use of EnterCriticalSection.
What might be causing the CRITICAL_SECTION to fail in one project but work in others? The MSDN page was not helpful.
UPDATE 1
After comparing project settings of DLL1 (working) and DLL2 (not working), I've accidentally got DLL2 working. I've confirmed this by reverting to an earlier version (which crashes) and then making the project changes (no crash!).
This is the setting:
Project Properties > C/C++ > Optimization > Whole Program Optimization
Set this to Yes (/GL) and my program crashes. Change that to No and it works fine. What does the /GL switch do and why might it cause this crash?
UPDATE 2
The excellent answer from @Acorn and comment from @RaymondChen provided the clues to track down and then resolve the issue. There were two problems (both programmer errors).
PROBLEM 1
The assumption of Whole Program Optimization (WPO) is that the MSVC compiler is compiling "the whole program". This is an incorrect assumption for my DLL project, which internally consumes a 3rd party library and is in turn consumed by an external application written in Delphi. This setting is Yes (/GL) by default but should be No. This feels like a bug in Visual Studio, but in any case, the programmer needs to be aware of it. I don't know all the details of what WPO is meant to do, but at least for DLLs meant to be consumed by other applications, the default should be changed.
PROBLEM 2
Serious programmer error. It was a call into a 3rd party library, which writes back a 128-byte ASCII result; that was the error:
// Before
// m_config::acSerial defined as "char acSerial[21]"
(void) m_pLib->GetPara(XPARA_PRODUCT_INFO, &m_config.acSerial[0]); // library writes 128 bytes into a 21-byte buffer
EnterCriticalSection(&crit); // Crash!
// After
#define SERIAL_LEN 20
// m_config::acSerial defined as "char acSerial[SERIAL_LEN+1]"
//...
char acSerial[128];
(void) m_pLib->GetPara(XPARA_PRODUCT_INFO, &acSerial[0]); // library may write up to 128 bytes here
strncpy(m_config.acSerial, acSerial, SERIAL_LEN); // copy at most SERIAL_LEN chars
m_config.acSerial[SERIAL_LEN] = '\0'; // strncpy does not terminate when truncating
EnterCriticalSection(&crit); // Works!
The error, now obvious, is that the 3rd party library did not copy the serial number of the device into the char* I provided... it copied 128 bytes into my char*, stomping over everything contiguous in memory after acSerial. This wasn't noticed before because m_pLib->GetPara(XPARA_PRODUCT_INFO, ...) was one of the first calls into the 3rd party library and the rest of the contiguous data was mostly NULL at that point.
The problem was never anything to do with the CRITICAL_SECTION. My thanks to Acorn and RaymondChen... sanity has been restored to this corner of the universe.

If your program crashes under WPO (an optimization that assumes that whatever you are compiling is the entire program), it means that either the assumption is incorrect, or that the optimizer ends up exploiting some undefined behavior that it previously didn't (without the optimization applied), even if the assumption is correct.
In general, avoid enabling optimizations unless you are really sure you know you meet their requirements.
For further analysis, please provide an MRE (minimal reproducible example).

Related

AccessViolationException reading memory allocated in C++ application from C++/CLI DLL

I have a C++ client to a C++/CLI DLL, which initializes a series of C# dlls.
This used to work. The code that is failing has not changed. The code that has changed is not called before the exception is thrown. My compile environment has changed, but recompiling on a machine with an environment similar to my old one still failed. (EDIT: as we see in the answer this is not entirely true, I was only recompiling the library in the old environment, not the library and client together. The client projects had been upgraded and couldn't easily go back.)
Someone besides me recompiled the library, and we started getting memory management issues: "The pointer passed in as a String must not be in the bottom 64K of the process's address space." I recompiled it, and all worked well with no code changes. (Alarm #1) Recently it was recompiled, and memory management issues with strings re-appeared, and this time they're not going away. The new error is "Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt."
I'm pretty sure the problem is not located where I see the exception; the code didn't change between the successful and failing builds, but we should review it to be complete. Ignore the names of things; I don't have much control over the design of what it's doing with these strings. And sorry for the confusion, but note that _bridge and bridge are different things. Lots of lines of code are missing because this question is already too long.
Defined in library:
struct Config
{
    std::string aye;
    std::string bee;
    std::string sea;
};
extern "C" __declspec(dllexport) BridgeBase_I* __stdcall Bridge_GetConfiguredDefaultsImplementationPointer(
const std::vector<Config> & newConfigs, /**< new configurations to apply **/
std::string configFolderPath, /**< folder to write config files in **/
std::string defaultConfigFolderPath, /**< folder to find default config files in **/
std::string & status /**< output status of config parse **/
);
In client function:
GatewayWrapper::Config bridge;
std::string configPath("./config");
std::string defaultPath("./config/default");
GatewayWrapper::Config gwtransport;
bridge.aye = "bridged.dll";
bridge.bee = "1.0";
bridge.sea = "";
configs.push_back(bridge);
_bridge = GatewayWrapper::Bridge_GetConfiguredDefaultsImplementationPointer(configs, configPath, defaultPath, status);
Note that the call to the library that is crashing is in the same scope as the vector declaration, struct declaration, string assignment, and vector push_back.
There are no threading calls in this section of code, but there are other threads running doing other things. There is no pointer math here, there are no heap allocations in the area except perhaps inside the standard library.
I can run the code up to the Bridge_GetConfiguredDefaultsImplementationPointer call in the debugger, and the contents of the configs vector look correct in the debugger.
Back in the library, in the first sub function, where the debugger don't shine, I've broken down the failing statement into several console prints.
System::String^ temp;
List<CConfig^>^ configs = gcnew List<CConfig^>((INT32)newConfigs.size());
for (int i = 0; i < newConfigs.size(); i++)
{
    std::cout << newConfigs[i].aye << std::flush;          // prints
    std::cout << newConfigs[i].aye.c_str() << std::flush;  // prints
    temp = gcnew System::String(newConfigs[i].aye.c_str());
    System::Console::WriteLine(temp);                      // prints
    std::cout << "Testing string creation" << std::endl;   // prints
    std::cout << newConfigs[i].bee << std::flush;          // crashes here
}
I get the same exception on access of bee if I move the newConfigs[i].bee out above the assignment of temp or comment out the list declaration/assignment.
Just for reference that the std::string in a struct in a vector should have arrived at its destination OK:
Is std::vector copying the objects with a push_back?
std::string in struct - Copy/assignment issues?
http://www.cplusplus.com/reference/vector/vector/operator=/
Assign one struct to another in C
Why this exception is not caught by my try/catch
https://stackoverflow.com/a/918891/2091951
Generic AccessViolationException related questions
How to handle AccessViolationException
Programs randomly getting System.AccessViolationException
https://connect.microsoft.com/VisualStudio/feedback/details/819552/visual-studio-debugger-throws-accessviolationexception
finding the cause of System.AccessViolationException
https://msdn.microsoft.com/en-us/library/ms164911.aspx
Catching access violation exceptions?
AccessViolationException when using C++ DLL from C#
Suggestions in the above questions
Change to .NET 3.5, change target platform - these solutions could have serious issues with a large multi-project solution.
HandleProcessCorruptedStateExceptions - does not work in C++, this decoration is for C#, catching this error could be a very bad idea anyway
Change legacyCorruptedStateExceptionsPolicy - this is about catching the error, not preventing it
Install .NET 4.5.2 - can't, already have 4.6.1. Installing 4.6.2 did not help. Recompiling on a different machine that didn't have 4.5 or 4.6 installed did not help. (Despite this used to compile and run on my machine before installing Visual Studio 2013, which strongly suggests the .NET library is an issue?)
VSDebug_DisableManagedReturnValue - I only see this mentioned in relation to a specific crash in the debugger, and the help from Microsoft says that other AccessViolationException issues are likely unrelated. (http://connect.microsoft.com/VisualStudio/feedbackdetail/view/819552/visual-studio-debugger-throws-accessviolationexception)
Change Comodo Firewall settings - I don't use this software
Change all the code to managed memory - Not an option. The overall design of calling C# from C++ through C++/CLI is resistant to change. I was specifically asked to design it this way to leverage existing C# code from existing C++ code.
Make sure memory is allocated - memory should be allocated on the stack in the C++ client. I've attempted to make the vector a non-reference parameter, to force a vector copy into explicitly library-controlled memory space; it did not help.
"Access violations in unmanaged code that bubble up to managed code are always wrapped in an AccessViolationException." - Fact, not a solution.
but it was the mismatch, not the specific version that was the problem
Yes, that's black letter law in VS. You unfortunately just missed the counter-measures that were built into VS2012 to turn this mistake into a diagnosable linker error. Previously (and in VS2010), the CRT would allocate its own heap with HeapAlloc(). Now (in VS2013), it uses the default process heap, the one returned by GetProcessHeap().
That is in itself enough to trigger an AVE when you run your app on Vista or higher: allocating memory from one heap and releasing it from another triggers an AVE at runtime, and a debugger break when you debug with the Debug Heap enabled.
This is not where it ends, another significant issue is that the std::string object layout is not the same between the versions. Something you can discover with a little test program:
#include <string>
#include <iostream>

int main()
{
    std::cout << sizeof(std::string) << std::endl;
    return 0;
}
VS2010 Debug : 32
VS2010 Release : 28
VS2013 Debug : 28
VS2013 Release : 24
I have a vague memory of Stephan T. Lavavej mentioning the std::string object size reduction, very much presented as a feature, but I can't find it again. The extra 4 bytes in the Debug build are caused by the iterator debugging feature; it can be disabled with _HAS_ITERATOR_DEBUGGING=0 in the Preprocessor Definitions. Not a feature you'd quickly want to throw away, but it makes mixing Debug and Release builds of the EXE and its DLLs quite lethal.
Needless to say, the different object sizes seriously bytes when the Config object is created in a DLL built with one version of the standard C++ library and used in another. Many mishaps, the most basic one is that the code will simply read the Config::bee member from the wrong offset. An AVE is (almost) guaranteed. Lots more misery when code allocates the small flavor of the Config object but writes the large flavor of std::string, that randomly corrupts the heap or the stack frame.
Don't mix.
I believe 2013 introduced a lot of changes in the internal data formats of STL containers, as part of a push to reduce memory usage and improve perf. I know vector became smaller, and string is basically a glorified vector<char>.
Microsoft acknowledges the incompatibility:
"To enable new optimizations and debugging checks, the Visual Studio
implementation of the C++ Standard Library intentionally breaks binary
compatibility from one version to the next. Therefore, when the C++
Standard Library is used, object files and static libraries that are
compiled by using different versions can't be mixed in one binary (EXE
or DLL), and C++ Standard Library objects can't be passed between
binaries that are compiled by using different versions."
If you're going to pass std::* objects between executables and/or DLLs, you absolutely must ensure that they are using the same version of the compiler. It would be well-advised to have your client and its DLLs negotiate in some way at startup, comparing any available versions (e.g. compiler version + flags, boost version, directx version, etc.) so that you catch errors like this quickly. Think of it as a cross-module assert.
If you want to confirm that this is the issue, you could pick a few of the data structures you're passing back and forth and check their sizes in the client vs. the DLLs. I suspect your Config class above would register differently in one of the fail cases.
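A minimal sketch of what such a check could look like (the exported helper and the error handling are hypothetical, not part of the real API):

// In the DLL - report the layout size this module was built with:
extern "C" __declspec(dllexport) size_t __stdcall Bridge_GetConfigSize()
{
    return sizeof(GatewayWrapper::Config);   // size as seen by the DLL's STL
}

// In the client, at startup, before any real calls:
// if (Bridge_GetConfigSize() != sizeof(GatewayWrapper::Config))
//     FatalError("DLL built against a different C++ Standard Library version");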
I'd also like to mention that it is probably a bad idea in the first place to use smart containers in DLL calls. Unless you can guarantee that the app and DLL won't try to free or reallocate the internal buffers of the other's containers, you could easily run into heap corruption issues, since the app and DLL each have their own internal C++ heap. I think that behavior is considered undefined at best. Even passing const& arguments could still result in reallocation in rare cases, since const doesn't prevent a compiler from diddling with mutable internals.
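If re-architecting is an option, the conventional way out is to flatten the boundary to plain C types and rebuild the C++ objects privately on each side. A sketch under that assumption (not the project's actual interface):

// BridgeBase_I is the interface type from the question; only C types cross the edge.
struct BridgeBase_I;

extern "C" __declspec(dllexport) BridgeBase_I* __stdcall
Bridge_GetConfiguredDefaultsC(
    const char* const* ayes,        // parallel arrays instead of vector<Config>
    const char* const* bees,
    const char* const* seas,
    size_t configCount,
    const char* configFolderPath,
    const char* defaultConfigFolderPath,
    char* statusBuf, size_t statusBufLen);  // caller-owned output buffer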
You seem to have memory corruption. Microsoft Application Verifier is invaluable in finding corruption. Use it to find your bug:
Install it to your dev machine.
Add your exe to it.
Only select Basics\Heaps.
Press Save. It doesn't matter if you keep application verifier open.
Run your program a few times.
If it crashes, debug it and this time, the crash will point to your problem, not just some random location in your program.
PS: It's a great idea to have Application Verifier enabled at all times for your development project.

Boost threads: in IOS, thread_info object is being destructed before the thread finishes executing

Our project uses a few Boost 1.48 libraries on several platforms, including Windows, Mac, Android, and iOS.
We are able to consistently get the iOS version of the project to crash (nontrivially but reliably), and from our investigation we see that ~thread_data_base is being called on the thread's thread_info while its thread is still running.
This seems to happen as a result of the smart pointer reaching a zero count, even though it is obviously still in scope in the thread_proxy function which creates it and runs the requested function in the thread.
This seems to happen in various cases - the call stack is not identical between crashes, though there are a few
variations which are common.
Just to be clear - this often requires running code which is creating hundreds of threads, though there are
never more than about 30 running simultaneously. I have "been lucky" and got it very very early in the
run also, but that's rare.
I created a version of the destructor which actually catches the code red-handed:
in libs/thread/src/pthread/thread.cpp:
thread_data_base::~thread_data_base()
{
    boost::detail::thread_data_base* const thread_info = detail::get_current_thread_data();
    void *void_thread_info = (void *) thread_info;
    void *void_this = (void *) this;
    // is somebody destructing the thread_data other than its own thread?
    // (remember that its own thread should no longer point to it anyway,
    // because of the call to detail::set_current_thread_data(0) in thread_proxy)
    if (void_thread_info) { // == void_this) {
        __builtin_trap();
    }
}
I should note that (as seen from the commented-out code) I had previously checked to see that void_thread_info == void_this because I
was only checking for the case where the thread's current thread_info was killing itself.
I have also seen cases where the value returned by get_current_thread_data is non-zero and
different from "this", which is really weird.
Also when I first wrote that version of the code, I wrote:
if (((void*)thread_info) == ((void*)this))
and at run-time I got some very weird exception that said something about a virtual function table or something like that - I don't remember. I decided that it was trying to call "==" for this object type and was unhappy with that, so I rewrote it as above, putting the conversions to void * on separate lines of code. That in itself is quite suspicious to me. I am not one to rush to blame compilers, but...
I should also note that when we did catch this happening in the trap, we saw the destructor for ~shared_count appear twice consecutively on the stack in Xcode source. Very doubleweird.
We tried to look at the disassembly, but couldn't make much out of it.
Again - it looks like this is always a result of the shared_count which seems to be owned by
the shared_ptr which owns the thread_info reaching zero too early.
Update: it seems that it is possible to get into situations which reach the above trap without the situation doing any harm. Since fixing the issue (see answer) I have seen it happen, but always after thread_info->run() has finished executing. I don't yet understand how... but it's working.
Some additional info:
I should note that the boost.sh from Pete Goodliffe (and modified by others) that is commonly used to compile Boost for iOS has the following note in the header:
: ${EXTRA_CPPFLAGS:="-DBOOST_AC_USE_PTHREADS -DBOOST_SP_USE_PTHREADS"}
# The EXTRA_CPPFLAGS definition works around a thread race issue in
# shared_ptr. I encountered this historically and have not verified that
# the fix is no longer required. Without using the posix thread primitives
# an invalid compare-and-swap ARM instruction (non-thread-safe) was used for the
# shared_ptr use count causing nasty and subtle bugs.
#
# Should perhaps also consider/use instead: -BOOST_SP_USE_PTHREADS
I use those flags, but to no avail.
I found the following which is very tantalizing - it looks like they had the same issue in std::thread:
http://llvm.org/bugs/show_bug.cgi?format=multiple&id=12730
That was suggestive of using an alternate implementation inside Boost for ARM processors which seems also to directly address this issue:
spinlock_gcc_arm.hpp
The version included with Boost 1.48 uses outdated ARM assembly.
I took the updated version from boost 1.52, but I'm having trouble compiling it.
I get the following error:
predicated instructions must be in IT block
I found a reference to what looks to be a similar use of this instruction here:
https://zeromq.jira.com/browse/LIBZMQ-414
I was able to use the same idea to get the 1.52 code to compile by modifying the code as follows (I inserted an appropriate IT instruction):
__asm__ __volatile__(
    "ldrex %0, [%2]; \n"
    "cmp %0, %1; \n"
    "it ne; \n"
    "strexne %0, %1, [%2]; \n"
    BOOST_SP_ARM_BARRIER :
    "=&r"( r ):             // outputs
    "r"( 1 ), "r"( &v_ ):   // inputs
    "memory", "cc" );
But in any case, there are ifdefs in this file which look for the ARM architecture, which is not defined that way in my environment. After I simply edited the file so that only ARMv7 code was left, the compiler complains about the definition of BOOST_SP_ARM_BARRIER:
In file included from ./boost/smart_ptr/detail/spinlock.hpp:35:
./boost/smart_ptr/detail/spinlock_gcc_arm.hpp:39:13: error: instruction requires a CPU feature not currently enabled
BOOST_SP_ARM_BARRIER :
^
./boost/smart_ptr/detail/spinlock_gcc_arm.hpp:13:32: note: expanded from macro 'BOOST_SP_ARM_BARRIER'
# define BOOST_SP_ARM_BARRIER "dmb"
Any ideas??
Figured this out. It turns out that the boost.sh script that I mention in the question chose the incorrect Boost flag to address this problem - instead of BOOST_SP_USE_PTHREADS (and the other flag there with it, BOOST_AC_USE_PTHREADS), what is needed on iOS is BOOST_SP_USE_SPINLOCK. This ends up giving pretty much the identical solution used in the std::thread issue referred to in the question.
If you are compiling for any modern iOS device which uses ARMv7, but using an older Boost (we are using 1.48), you need to copy the file spinlock_gcc_arm.hpp from a more recent Boost (like 1.52). That file is #ifdef'd for the different ARM architectures, but it is not clear to me that the defines it is looking for are defined in the iOS compile environment using the script. So you can either edit the file (violent but effective) or invest some time to figure out how to make this tidy and correct.
In any case, you may need to insert the extra assembly instruction that I did above in the question:
"it ne; \n"
I have not yet gone back to see if I can delete that now that I have my compile environment working properly.
However, we're not done yet. The code used in Boost for this option includes, as discussed, ARM assembly language instructions. The ARM chips support two instruction sets which can't be mixed in a given module (not sure of the exact scope, but evidently file-by-file is an acceptable granularity when compiling). The instructions used in Boost for this locking include non-Thumb instructions, but iOS by default uses the Thumb instruction set. The Boost code, aware of the instruction set issue, checks that you have ARM enabled but not Thumb; by default on iOS, however, Thumb is on.
Getting the compiler to generate non-Thumb ARM code depends on which compiler you are using on iOS - Apple's LLVM or LLVM GCC. GCC is deprecated, and Apple's LLVM is the default when you use Xcode.
For the default Clang + Apple LLVM 4.1, you need to compile using the -mno-thumb flag. Also, any files in your iOS app which use any part of Boost which uses smart pointers will also have to be compiled using -mno-thumb.
To compile Boost like this, I think you can just add -mno-thumb to EXTRA_CPPFLAGS in the script. (I modified the user-config.jam directly while experimenting and haven't yet gone back to clean up.)
For your app, in Xcode you need to select your target, then go into the Build Phases tab, and there select Compile sources. There you have the option of adding compile flags, so for each relevant file (which includes boost), add the -mno-thumb flag. You can do this directly in project.pbxproj also where each file has
settings = { COMPILER_FLAGS = ""; };
you just change this to
settings = { COMPILER_FLAGS = "-mno-thumb"; };
But there's a little more. You also have to modify the darwin.jam file in the tools/build/v2/tools directory. In Boost 1.48, there is code that says:
case arm :
{
    options = -arch armv6 ;
}
This has to be modified to
case arm :
{
    options = -arch armv7 ;
}
Finally, in the boost.sh script, in the function writeBjamUserConfig(), you should remove the references to -arch armv6.
If somebody knows how to do this a little more generally and cleanly, I'm sure we'd all benefit. For now, this is where I've gotten to, and I hope that this will help other iOS Boost threads users. I hope that the various variants on the boost.sh iOS script out there will be updated. I plan to add some more links to this answer later.
Update: For a great article which describes the issue on the processor level,
see here:
http://preshing.com/20121019/this-is-why-they-call-it-a-weakly-ordered-cpu
Enjoy!
I use boost.asio, boost.thread, boost.smart_ptr, etc. on the iOS platform. The app always crashes when run in release mode, aborting with signal SIGABRT. The crash call stack is:
__stack_chk_fail
boost::asio::detail::completion_handler
boost::asio::detail::task_io_service_operation::complete
boost::asio::detail::task_io_service::do_run_one
boost::asio::detail::task_io_service::run
boost::asio::io_service::run
![Creating an asio work object with a new thread and io_service][1]
When trying to solve the problem, I found the following articles:
[boost-thread-threads-not-starting-on-the-iphone-ipad-in-release-build][2]
[The issue of spin_lock and thumb on iOS][3]
Then I tried adding -mno-thumb to my project's compile flags, and the problem that occurred in release mode is gone.
However, a new bug appeared: EXC_ARM_DA_ALIGN, which crashed where I try to convert network data to host endianness.
As [this article][4] says, ARM instructions require that memory accesses be aligned.
Following the article [Exc_arm_da_align][5], I fixed it by using memcpy for the data conversion, instead of dereferencing the pointer directly (a sketch follows below).
[1]: http://i.stack.imgur.com/3ijF4.png
[2]: http://stackoverflow.com/questions/4201262/boost-thread-threads-not-starting-on-the-iphone-ipad-in-release-builds/4245821#4245821
[3]: http://groups.google.com/group/boost-list/browse_thread/thread/7dc1e80659182ab3
[4]: https://brewx.qualcomm.com/bws/content/gi/common/appseng/en/knowledgebase/docs/kb95.html
[5]: http://www.cnblogs.com/unionfind/archive/2013/02/25/2932262.html
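The memcpy fix amounts to something like this (a sketch; ReadU32 is an illustrative helper, and ntohl comes from <arpa/inet.h> on iOS/POSIX):

#include <cstring>
#include <cstdint>
#include <arpa/inet.h>   // ntohl

// Instead of dereferencing a possibly unaligned pointer:
//   uint32_t v = ntohl(*(const uint32_t*)p);   // can raise EXC_ARM_DA_ALIGN
// copy the bytes into an aligned local first:
uint32_t ReadU32(const unsigned char* p)
{
    uint32_t v;
    std::memcpy(&v, p, sizeof v);   // a byte copy is alignment-safe
    return ntohl(v);                // then swap to host endianness
}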

Unhandled Exception in Rad Studio Debugger Thread

I have a large application that recently started exhibiting rather strange behavior when running in a debugger. First, the basics:
OS: Windows 7 64-bit.
Application: Multithreaded VCL app with many dlls, bpls, and other components.
Compiler/IDE: Embarcadero RAD Studio 2010.
The observed symptom is this: While the debugger is attached to my application, certain tasks cause the application to crash. The particulars are furthermore perplexing: My application stops with a Windows message saying, "YourApplication has stopped working." And it helpfully offers to send a minidump to Microsoft.
It should be noted: the application doesn't crash when the debugger is not attached. Also, the debugger doesn't indicate any exceptions or other issues while the application is running.
Setting and stepping through breakpoints seems to affect the point at which the application crashes, but I suspect that is a symptom of debugging a thread other than the problematic one.
These crashes also occur on my colleagues' computers, with the same behavior I observe. This leads me to not suspect a failed installation of something on my computer in particular. My colleagues experiencing the issue are also running Windows 7 64-bit. I have no colleagues not experiencing the issue.
I've collected and analyzed a number of full dumps from the crashes. I discovered that the failure was actually happening in the same place each time. Here is the exception data from the dumps (it is always the same, except of course the ThreadId):
Exception Information
ThreadId: 0x000014C0
Code: 0x4000001F Unknown (4000001F)
Address: 0x773F2507
Flags: 0x00000000
NumberParameters: 0x00000001
0x00000000
Google reveals that Code 0x4000001F is actually STATUS_WX86_BREAKPOINT. Microsoft unhelpfully describes it as "An exception status code that is used by the Win32 x86 emulation subsystem."
Here are the stack details (which don't seem to vary):
0x773F2507: ntdll.dll+0x000A2507: RtlQueryCriticalSectionOwner + 0x000000E8
0x773F3DAB: ntdll.dll+0x000A3DAB: RtlQueryProcessLockInformation + 0x0000020D
0x773D2ED9: ntdll.dll+0x00082ED9: RtlUlonglongByteSwap + 0x00005C69
0x773F3553: ntdll.dll+0x000A3553: RtlpQueryProcessDebugInformationRemote + 0x00000044
0x74F73677: kernel32.dll+0x00013677: BaseThreadInitThunk + 0x00000012
0x77389F02: ntdll.dll+0x00039F02: RtlInitializeExceptionChain + 0x00000063
0x77389ED5: ntdll.dll+0x00039ED5: RtlInitializeExceptionChain + 0x00000036
It is worth noting that there appears to be a function epilog at 0x773F24ED, which rather suggests that the RtlQueryCriticalSectionOwner is a red herring. Likewise, a function epilog casts doubt on RtlQueryProcessLockInformation. The 0x5C69 offset casts doubt on the RtlUlonglongByteSwap. The other symbols look legit, though.
Specifically, RtlpQueryProcessDebugInformationRemote looks legitimate. Some people on the internet (http://www.cygwin.com/ml/cygwin-talk/2006-q2/msg00050.html) seem to think that it is created by the debugger to collect debug information. That theory seems sound to me, since it only seems to appear when the debugger is attached.
As always, when something breaks, something changed that broke it. In this case, that something is dynamically loading a new dll. I can cause the crash to stop happening by not dynamically loading the particular dll. I'm not convinced that the dll loading is related, but here are the details, just in case:
The dll source is C. Here are the compile options that are not set to the default:
Language Compliance: ANSI
Merge duplicate strings: True
Read-only strings: True
PCH usage: Do not use
Dynamic RTL: False
(The Project Options say False is default for Dynamic RTL, though it was set to True when I created the dll project.)
The dll is loaded with LoadLibrary and freed with FreeLibrary. All seems to be fine with the loading and unloading of the module. However, shortly after the library is unloaded (with FreeLibrary), the aforementioned thread crashes the program. For debugging, I removed all actual calls to the library (including, for more testing, DllMain). No combination of calls or non-calls, DllMain or no DllMain, or anything else seemed to alter the behavior of the crash in any way. Simply loading and unloading the dll invokes the crash later on, as the sketch below illustrates.
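Stripped of everything else, the repro as described is just this (a sketch; the DLL name is a placeholder):

#include <windows.h>

void Repro()
{
    // Per the description above: no calls into the DLL are needed -
    // a plain load/unload is enough to provoke the later crash, but
    // only while the debugger is attached.
    HMODULE h = LoadLibrary(L"thedll.dll");   // placeholder name
    if (h != NULL)
        FreeLibrary(h);
    // ... the debugger thread crashes some time after this point ...
}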
Furthermore, changing the dll to use the Dynamic RTL also causes the debugger thread crash to cease. This is undesirable because the compiled dll really should be usable without CodeGear Runtime available. Also, dll size is important. The C code contained in the dll does not make use of any libraries. (It includes no headers, even standard library headers. No malloc/free, no printf, no nothin'. It contains only functions that depend exclusively on their inputs and do not require dynamic allocation.) It is also undesirable because "fixing" a bug by changing stuff until it works without understanding why it works is really never a good plan. (It tends to lead to bug recurrences and strange coding practices. But really, at this point, if I can't find anything else, I may admit defeat on this count.)
And finally, my issue may be related to one of these issues:
System error after debugging multithreaded applications
Program and debugger quit without indication of problem
Any ideas or suggestions would be appreciated.
I solved the above mentioned problem by using a modified version of the PatchINT3 workaround, which was published in 2007 for BDS 2006:
procedure PatchINT3;
const
  INT3: Byte = $CC;
  NOP: Byte = $90;
var
  NTDLL: THandle;
  BytesWritten: DWORD;
  Address: PByte;
begin
  if Win32Platform <> VER_PLATFORM_WIN32_NT then
    Exit;
  NTDLL := GetModuleHandle('NTDLL.DLL');
  if NTDLL = 0 then
    Exit;
  Address := GetProcAddress(NTDLL, 'RtlQueryCriticalSectionOwner');
  if Address = nil then
    Exit;
  Inc(Address, $E8);
  try
    if Address^ <> INT3 then
      Exit;
    if WriteProcessMemory(GetCurrentProcess, Address, @NOP, 1, BytesWritten)
      and (BytesWritten = 1) then
      FlushInstructionCache(GetCurrentProcess, Address, 1);
  except
    // Do not panic if you see an EAccessViolation here, it is perfectly harmless!
    on EAccessViolation do
      ;
    else
      raise;
  end;
end;
Call this routine once after you have loaded the DLL in your thread. The patch fixes a user breakpoint in ntdll.dll version 6.1.7601.17725 and changes it to a NOP.
If there is no user breakpoint (INT3 (=$CC) opcode) at the expected address, the patch routine does nothing and exits.
Hope that helps,
Andreas
Footnote
The original source of PatchINT3 can be found here:
http://coding.derkeiler.com/Archive/Delphi/borland.public.delphi.non-technical/2007-01/msg04431.html
Footnote2
The same function in C++:
#include <vcl.h>       // Win32Platform, EAccessViolation (C++Builder/VCL)
#include <windows.h>

void PatchINT3()
{
    unsigned char INT3 = 0xCC;
    unsigned char NOP = 0x90;

    if (Win32Platform != VER_PLATFORM_WIN32_NT)
    {
        return;
    }
    HMODULE ntdll = GetModuleHandle(L"NTDLL.DLL");
    if (ntdll == NULL)
    {
        return;
    }
    unsigned char *address = (unsigned char*)GetProcAddress(ntdll,
        "RtlQueryCriticalSectionOwner");
    if (address == NULL)
    {
        return;
    }
    address += 0xE8;
    try
    {
        if (*address != INT3)
        {
            return;
        }
        unsigned long bytes_written = 0;
        if (WriteProcessMemory(GetCurrentProcess(), address, &NOP, 1,
                &bytes_written) && (bytes_written == 1))
        {
            FlushInstructionCache(GetCurrentProcess(), address, 1);
        }
    }
    catch (EAccessViolation &)
    {
        // Do not panic if you see an EAccessViolation
        // here, it is perfectly harmless!
    }
    catch (...)
    {
        throw;
    }
}
Just an idea...
Perhaps you need to close in on the crashing thread. The state that you are observing seems to be a bit after the actual error.
First, your stack trace seems incomplete to me. What is at the base of that thread's stack? What was the origin of that thread?
And, in the VS debugger there is the possibility to break on exceptions (Debug->Exceptions...->[Add]). Then all threads will freeze at the moment the exception occurs. I dunno about RAD, but the trick to do it programmatically seems to be WaitForDebugEvent().
I could be wrong, but I think there is a fair chance that the bug is in the debugger, not your code. In that case an ugly workaround is IMHO fully forgivable. Good luck!
I cannot answer this because I cannot see the code...
But...
1) In Borland C++, at least with C++ from BDS, there can be a provable problem with the realloc function in the multithreaded library. Is your C++ code using realloc?
2) The stack you are showing is more than likely being called as a result of your code actually hitting a "CALL BAD_ADDRESS", and that can happen as a result of a bug in your own code. In other words, in the DLL you load, there is likely a function that is doing something that is overwriting the executable code in your program with junk, and then when the now-junk section runs, it crashes.
Another way is if something in the C++ dll is modifying the stack below where it runs, and then your code is hitting that later.
3) Check out the CPU flags settings for your DLL. Borland libraries sometimes use conflicting CPU flags upon entry and you might need to save and restore before calling into the DLL. For example if you call a VST plugin made with C++ from Delphi and do not set the flags properly, you can get subsequent divide by zero errors from the VST plugin which was compiled with that exception turned off.
We had the same issue today. In our case, the crash happens if there is a breakpoint after the call to TOpenDialog->Execute() (which uses a dialog from shell32.dll, I think) (Windows 7 x64, C++Builder XE2).
After uninstalling iCloud (v2.1.0.39), the issue was resolved.
Unfortunately, we are still looking into a similar issue some of our customers are having with our release product under Windows Vista. After selecting the file using TOpenDialog, the application crashes in gdiplus.dll with an access violation; removing iCloud seems to also resolve that issue.

What is causing boost::to_lower to fail an is_singular assertion?

I am occasionally getting odd behavior from boost::to_lower when called on a std::wstring. In particular, I have seen the following assertion fail in a release build (but not in a debug build):
Assertion failed: !is_singular(), file C:\boost_1_40_0\boost/range/iterator_range.hpp, line 281
I have also seen what appear to be memory errors after calling boost::to_lower in contexts such as:
void test(const wchar_t* word) {
    std::wstring buf(word);
    boost::to_lower(buf);
    ...
}
Replacing the calls to boost::to_lower(wstr) with std::transform(wstr.begin(), wstr.end(), wstr.begin(), towlower) appears to fix the problem, but I'd like to know what's going wrong.
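Spelled out, that workaround is (a minimal sketch; note it lowers per code unit and, unlike boost::to_lower, takes no locale into account):

#include <algorithm>
#include <string>
#include <wctype.h>   // towlower

void lower_inplace(std::wstring& wstr)
{
    // Per-code-unit lowering; avoids boost::to_lower entirely.
    std::transform(wstr.begin(), wstr.end(), wstr.begin(), towlower);
}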
My best guess is that perhaps the problem has something to do with changing the case of unicode characters -- perhaps the encoding size of the downcased character is different from the encoding size of the source character?
Does anyone have any ideas what might be going on here? It might help if I knew what "is_singular()" means in the context of Boost, but after doing a few Google searches I wasn't able to find any documentation for it.
Relevant software versions: Boost 1.40.0; MS Visual Studio 2008.
After further debugging, I figured out what was going on.
The cause of my trouble was that one project in the solution was not defining NDEBUG (despite being in release mode), while all the other modules were. Boost allocates some extra fields in its data structures, which it uses to store debug information (such as whether a data structure has been initialized). If module A has debugging turned off, then it will create data structures that don't contain those fields. Then when module B, which has debugging turned on, gets its hands on that data structure, it will try to check those fields (which were not allocated), resulting in random memory errors.
Defining NDEBUG in all projects in the solution fixed the problem.
An iterator range should only be singular if it's been constructed with the default constructor (it stores singular iterators, i.e. it doesn't represent a range). As it's rather hard to believe that Boost's to_lower function manages to create a singular range, this suggests that the problem might lie elsewhere (a result of some undefined behavior, such as using uninitialized variables, which might be initialized to some known value in debug builds).
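For what it's worth, "singular" just refers to a default-constructed range; a small illustration, assuming Boost.Range:

#include <boost/range/iterator_range.hpp>
#include <string>

int main()
{
    // Default construction stores singular iterators - the range does not
    // denote any sequence, and using it trips the !is_singular() assertion
    // in debug-checked builds.
    boost::iterator_range<std::wstring::iterator> r;

    std::wstring s(L"abc");
    r = boost::make_iterator_range(s.begin(), s.end());   // now non-singular
    return 0;
}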
Read more on Heisenbugs.

Do you really need a main() in C++?

From what I can tell you can kick off all the action in a constructor when you create a global object. So do you really need a main() function in C++ or is it just legacy?
I can understand that it could be considered bad practice to do so. I'm just asking out of curiosity.
If you want to run your program on a hosted C++ implementation, you need a main function. That's just how things are defined. You can leave it empty if you want of course. On the technical side of things, the linker wants to resolve the main symbol that's used in the runtime library (which has no clue of your special intentions to omit it - it just still emits a call to it). If the Standard specified that main is optional, then of course implementations could come up with solutions, but that would need to happen in a parallel universe.
If you go with "execution starts in the constructor of my global object", beware that you set yourself up for many problems related to the order of construction of namespace-scope objects defined in different translation units (so what is the entry point? The answer is: you will have multiple entry points, and which entry point is executed first is unspecified!). In C++03 you aren't even guaranteed that cout is properly constructed (in C++0x you have a guarantee that it is, before any code tries to use it, as long as there is a preceding include of <iostream>).
You don't have those problems, and don't need to work around them (which can be very tricky), if you properly start executing things in ::main.
As mentioned in the comments, there are however several systems that hide main from the user by having him tell the name of a class which is instantiated within main. This works similar to the following example
class MyApp {
public:
    MyApp(std::vector<std::string> const& argv);
    int run() {
        /* code comes here */
        return 0;
    }
};
IMPLEMENT_APP(MyApp);
To the user of this system, it's completely hidden that there is a main function, but that macro would actually define such a main function as follows
#define IMPLEMENT_APP(AppClass) \
    int main(int argc, char **argv) { \
        AppClass m(std::vector<std::string>(argv, argv + argc)); \
        return m.run(); \
    }
This doesn't have the problem of unspecified order of construction mentioned above. The benefit of them is that they work with different forms of higher level entry points. For example, Windows GUI programs start up in a WinMain function - IMPLEMENT_APP could then define such a function instead on that platform.
Yes! You can do away with main.
Disclaimer: You asked if it were possible, not whether it should be done. This is a totally unsupported, bad idea. I've done this myself, for reasons that I won't get into, but I am not recommending it. My purpose wasn't getting rid of main, but it can do that as well.
The basic steps are as follows:
Find crt0.c in your compiler's CRT source directory.
Add crt0.c to your project (a copy, not the original).
Find and remove the call to main from crt0.c.
Getting it to compile and link can be difficult; How difficult depends on which compiler and which compiler version.
Added
I just did it with Visual Studio 2008, so here are the exact steps you have to take to get it to work with that compiler.
Create a new C++ Win32 Console Application (click next and check Empty Project).
Add new item.. C++ File, but name it crt0.c (not .cpp).
Copy contents of C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src\crt0.c and paste into crt0.c.
Find mainret = _tmain(__argc, _targv, _tenviron); and comment it out.
Right-click on crt0.c and select Properties.
Set C/C++ -> General -> Additional Include Directories = "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src".
Set C/C++ -> Preprocessor -> Preprocessor Definitions = _CRTBLD.
Click OK.
Right-click on the project name and select Properties.
Set C/C++ -> Code Generation -> Runtime Library = Multi-threaded Debug (/MTd) (*).
Click OK.
Add new item.. C++ File, name it whatever (app.cpp for this example).
Paste the code below into app.cpp and run it.
(*) You can't use the runtime DLL, you have to statically link to the runtime library.
#include <iostream>

class App
{
public:
    App()
    {
        std::cout << "Hello, World! I have no main!" << std::endl;
    }
};

static App theApp;
Added
I removed the superfluous exit call and the blurb about lifetime, as I think we're all capable of understanding the consequences of removing main.
Ultra Necro
I just came across this answer and read both it and John Dibling's objections below. It was apparent that I didn't explain what the above procedure does and why that does indeed remove main from the program entirely.
John asserts that "there is always a main" in the CRT. Those words are not strictly correct, but the spirit of the statement is. main is not a function provided by the CRT; you must add it yourself. The call to that function is in the CRT-provided entry point function.
The entry point of every C/C++ program is a function in a module named 'crt0'. I'm not sure if this is a convention or part of the language specification, but every C/C++ compiler I've come across (which is a lot) uses it. This function basically does three things:
Initialize the CRT
Call main
Tear down
In the example above, the call is _tmain but that is some macro magic to allow for the various forms that 'main' can have, some of which are VS specific in this case.
What the above procedure does is remove the module 'crt0' from the CRT and replace it with a new one. This is why you can't use the Runtime DLL: there is already a function in that DLL with the same entry point name as the one we are adding (2). When you statically link, the CRT is a collection of .lib files, and the linker allows you to override .lib modules entirely - in this case, a module with only one function.
Our new program contains the stock CRT, minus its CRT0 module, but with a CRT0 module of our own creation. In there we remove the call to main. So there is no main anywhere!
(2) You might think you could use the runtime DLL by renaming the entry point function in your crt0.c file, and changing the entry point in the linker settings. However, the compiler is unaware of the entry point change and the DLL contains an external reference to a 'main' function which you're not providing, so it would not compile.
Generally speaking, an application needs an entry point, and main is that entry point. The fact that initialization of globals might happen before main is pretty much irrelevant. If you're writing a console or GUI app you have to have a main for it to link, and it's only good practice to have that routine be responsible for the main execution of the app rather than use other features for bizarre unintended purposes.
Well, from the perspective of the C++ standard, yes, it's still required. But I suspect your question is of a different nature than that.
I think doing it the way you're thinking about would cause too many problems though.
For example, in many environments the return value from main is given as the status result from running the program as a whole. And that would be really hard to replicate from a constructor. Some bit of code could still call exit of course, but that seems like using a goto and would skip destruction of anything on the stack. You could try to fix things up by having a special exception you threw instead in order to generate an exit code other than 0.
But then you still run into the problem of the order of execution of global constructors not being defined. That means that in any particular constructor for a global object you won't be able to make any assumptions about whether or not any other global object yet exists.
You could try to solve the constructor order problem by just saying each constructor gets its own thread, and if you want to access any other global objects you have to wait on a condition variable until they say they're constructed. That's just asking for deadlocks though, and those deadlocks would be really hard to debug. You'd also have the issue of which thread exiting with the special 'return value from the program' exception would constitute the real return value of the program as a whole.
I think those two issues are killers if you want to get rid of main.
And I can't think of a language that doesn't have some basic equivalent to main. In Java, for example, there is an externally supplied class name whose static main function is called. In Python, there's the __main__ module. In Perl, there's the script you specify on the command line.
If you have more than one global object being constructed, there is no guarantee as to which constructor will run first.
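The standard illustration of that hazard, as a contrived sketch across two translation units:

// a.cpp
#include <string>
std::string greeting = "hello";          // constructed at startup, sometime

// b.cpp
#include <string>
extern std::string greeting;
// Unspecified whether 'greeting' is constructed yet when this runs:
std::string loud = greeting + "!";       // may read a not-yet-constructed string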
If you are building static or dynamic library code then you don't need to define main yourself, but you will still wind up running in some program that has it.
If you are coding for windows, do not do this.
Running your app entirely from within the constructor of a global object may work just fine for quite a while, but sooner or later you will make a call to the wrong function and end up with a program that terminates without warning.
Global object constructors run during the startup of the C runtime.
The C runtime startup code runs during the DllMain of the C runtime DLL.
During DllMain, you are holding the DLL loader lock.
Trying to load another DLL while already holding the DLL loader lock results in a swift death for your process.
Compiling your entire app into a single executable won't save you - many Win32 calls have the potential to quietly load system DLLs.
There are implementations where global objects are not possible, or where non-trivial constructors are not possible for such objects (especially in the mobile and embedded realms).