So, today I was running some code built with Address Sanitizer and stumbled upon a strange stack-use-after-scope bug.
I have this simplified example:
#include <functional>
class k
{
public: operator int(){return 5;}
};
const int& n(const int& a)
{
return a;
}
int main()
{
k l;
return std::bind(n, l)();
}
ASAN complains about the last code line:
==27575==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffeab375210 at pc 0x000000400a01 bp 0x7ffeab3750e0 sp 0x7ffeab3750d8
READ of size 4 at 0x7ffeab375210 thread T0
#0 0x400a00 (/root/tstb.exe+0x400a00)
#1 0x7f97ce699730 in __libc_start_main (/lib64/libc.so.6+0x20730)
#2 0x400a99 (/root/tstb.exe+0x400a99)
Address 0x7ffeab375210 is located in stack of thread T0 at offset 288 in frame
#0 0x40080f (/root/tstb.exe+0x40080f)
This frame has 6 object(s):
[32, 33) '<unknown>'
[96, 97) '<unknown>'
[160, 161) '<unknown>'
[224, 225) '<unknown>'
[288, 292) '<unknown>' <== Memory access at offset 288 is inside this variable
[352, 368) '<unknown>'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope (/root/tstb.exe+0x400a00)
Shadow bytes around the buggy address:
0x1000556669f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100055666a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100055666a10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
0x100055666a20: f1 f1 f8 f2 f2 f2 f2 f2 f2 f2 f8 f2 f2 f2 f2 f2
0x100055666a30: f2 f2 f8 f2 f2 f2 f2 f2 f2 f2 f8 f2 f2 f2 f2 f2
=>0x100055666a40: f2 f2[f8]f2 f2 f2 f2 f2 f2 f2 00 00 f2 f2 f3 f3
0x100055666a50: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100055666a60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100055666a70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100055666a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100055666a90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==27575==ABORTING
If I understand correctly, it says that we are accessing a stack variable after it has already gone out of scope.
Looking at the uninstrumented and unoptimized disassembly, I indeed see that it happens inside the instantiated __invoke_impl:
Dump of assembler code for function std::__invoke_impl<int const&, int const& (*&)(int const&), k&>(std::__invoke_other, int const& (*&)(int const&), k&):
0x0000000000400847 <+0>: push %rbp
0x0000000000400848 <+1>: mov %rsp,%rbp
0x000000000040084b <+4>: push %rbx
0x000000000040084c <+5>: sub $0x28,%rsp
0x0000000000400850 <+9>: mov %rdi,-0x28(%rbp)
0x0000000000400854 <+13>: mov %rsi,-0x30(%rbp)
0x0000000000400858 <+17>: mov -0x28(%rbp),%rax
0x000000000040085c <+21>: mov %rax,%rdi
0x000000000040085f <+24>: callq 0x4007a2 <std::forward<int const& (*&)(int const&)>(std::remove_reference<int const& (*&)(int const&)>::type&)>
0x0000000000400864 <+29>: mov (%rax),%rbx
0x0000000000400867 <+32>: mov -0x30(%rbp),%rax
0x000000000040086b <+36>: mov %rax,%rdi
0x000000000040086e <+39>: callq 0x4005c4 <std::forward<k&>(std::remove_reference<k&>::type&)>
0x0000000000400873 <+44>: mov %rax,%rdi
0x0000000000400876 <+47>: callq 0x40056a <k::operator int()>
0x000000000040087b <+52>: mov %eax,-0x14(%rbp)
0x000000000040087e <+55>: lea -0x14(%rbp),%rax
0x0000000000400882 <+59>: mov %rax,%rdi
0x0000000000400885 <+62>: callq *%rbx
=> 0x0000000000400887 <+64>: add $0x28,%rsp
0x000000000040088b <+68>: pop %rbx
0x000000000040088c <+69>: pop %rbp
0x000000000040088d <+70>: retq
End of assembler dump.
After calling k::operator int() it places the returned value on the stack and passes its address to n(), which immediately returns it, and then it is returned from __invoke_impl itself (and goes all the way up to main's return).
So, it looks like ASAN is right here and we really have a stack-use-after-scope access.
The question is: What is wrong with my code?
I have tried building it with gcc, clang and icc and they all produce similar assembler outputs.
std::bind essentially generates an implementation function object that calls the bound function with the desired arguments. In your case, this implementation function object is about equivalent to
struct Impl
{
const int &operator()() const
{
int tmp = k_;
return n(tmp);
}
private:
k k_;
Impl(/*unspecified*/);
};
Since n returns its argument as a const reference, the call operator of Impl will return a reference to a local variable, which is a dangling reference, which is then read from in main. Hence the stack use after scope error.
Your confusion may stem from the fact that return n(l); without the bind is expected to work fine here. However, in the latter case, the temporary int is created in the stack frame of main and lives until the end of the full expression that forms the operand of return, by which point its value has already been copied into the int return value.
In other words, while a temporary lives until the end of the full expression in which it was created, this is not the case for temporaries generated inside functions called within that full expression. These are considered part of a different full expression and are destroyed when that expression has been evaluated.
PS: For this reason, binding any function (object) of signature R(Args...) to a std::function<const R&(Args...)> results in a guaranteed return of a dangling reference when called – a construct that IMO the library should reject at compile time.
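The compile-time rejection wished for in the PS can be approximated with a small type check. This is only a sketch: may_dangle_v, by_value and by_ref are hypothetical names invented here, not standard library facilities, and the check only looks at whether the target returns a non-reference while the wrapper promises a reference.

```cpp
#include <type_traits>

// Hypothetical trait (not part of the standard library): true when a wrapper
// promising to return the reference type R would be fed by a callable F that
// returns a non-reference, so the returned reference would bind to a temporary.
template <class R, class F, class... Args>
constexpr bool may_dangle_v =
    std::is_reference_v<R> &&
    !std::is_reference_v<std::invoke_result_t<F, Args...>>;

// Declarations only; they are never called, just inspected.
int by_value(int x);
const int& by_ref(const int& x);

// Wrapping by_value as "const int&(int)" would dangle; wrapping by_ref would not.
static_assert(may_dangle_v<const int&, decltype(&by_value), int>,
              "reference return fed by a prvalue");
static_assert(!may_dangle_v<const int&, decltype(&by_ref), int>,
              "reference return fed by a reference");
```

A wrapper type could put such a static_assert in its converting constructor to refuse the dangerous combination.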
Ok this is a tough one if you don't know the specifics about std::bind.
When binding an argument to a callable with std::bind, a copy of the argument is made (source):
The arguments to bind are copied or moved, and are never passed by reference unless wrapped in std::ref or std::cref.
std::bind(n, l) returns a callable object of unspecified type that has a member object of type k, built as a copy of l. Please note that this callable object is a temporary (an rvalue). I'll give it a name: bindtmp.
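The copy semantics quoted above can be checked with a small sketch. The names read_value and copy_vs_ref are made up for this illustration; only std::bind and std::cref come from the standard library.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: returns the int it is given, by value.
int read_value(const int& x) { return x; }

// Returns copied * 10 + referenced so both results are visible in one number.
int copy_vs_ref() {
    int v = 1;
    auto copied = std::bind(read_value, v);                 // stores a copy of v
    auto referenced = std::bind(read_value, std::cref(v));  // stores a reference to v
    v = 2;
    assert(copied() == 1);      // the stored copy did not change
    assert(referenced() == 2);  // the reference sees the new value
    return copied() * 10 + referenced();
}
```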
When invoked, bindtmp() creates a temporary integer (inttemp, with value 5) in order to call bindtmp::ncopy with bindtmp::lcopy (those are the member objects constructed from ::n and main::l). ::n returns a const reference to inttemp inside the scope of bindtmp() in a return statement.
This is where things get tricky (source):
Whenever a reference is bound to a temporary or to a subobject thereof, the lifetime of the temporary is extended to match the lifetime of the reference, with the following exceptions:
- a temporary bound to a return value of a function in a return statement is not extended: it is destroyed immediately at the end of the return expression. Such function always returns a dangling reference.
- ...
This means, the temporary inttemp is destroyed after ::n has returned.
From this point, everything falls apart. bindtmp() returns a reference to an object whose lifetime has ended, main tries to read through it (an lvalue-to-rvalue conversion), and this is where undefined behaviour (access to a stack object after its lifetime has ended) happens.
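One way out of the dangling reference described above is to make the bound callable return by value, so the int is copied out before the temporary created inside the bind wrapper is destroyed. A minimal sketch, reusing the class from the question (renamed K here); by_value and call_bound are hypothetical names:

```cpp
#include <cassert>
#include <functional>

class K {
public:
    operator int() const { return 5; }
};

// Returns a copy instead of a reference to its argument.
int by_value(const int& a) { return a; }

int call_bound() {
    K l;
    // The conversion K -> int still creates a temporary inside the bind
    // wrapper, but only its value leaves the wrapper.
    const int result = std::bind(by_value, l)();
    assert(result == 5);
    return result;
}
```

A lambda such as [l] { return by_value(l); } should behave the same way, since its deduced return type is int rather than const int&.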
We have a problem where initializing the Google Breakpad exception handler errors out when the program is run under lldb, but not when run normally from the shell.
The system is macOS 13 (Ventura) and the IDE is Visual Studio Code.
The code below fails on a call to init():
namespace crashhandler {
static std::unique_ptr<google_breakpad::ExceptionHandler> pExceptionHandler;
namespace {
bool DumpCallback(const char* dump_dir, const char* minidump_id, void*, bool success) {
if (success)
printf("Application crashed. Breakpad Crash Handler created a dump at location %s/%s.dmp\n",
dump_dir, minidump_id);
else
printf("Application crashed. Breakpad Crash Handler failed to create a dump");
fflush(stdout);
return success;
}
} // namespace
void init(const std::string& reportPath) // <-- crash happens when calling this function
{
if (pExceptionHandler)
return;
pExceptionHandler.reset(
new google_breakpad::ExceptionHandler(reportPath, nullptr, DumpCallback, nullptr, true, nullptr));
}
} // namespace crashhandler
The debug console shows:
=================================================================
==5060==ERROR: AddressSanitizer: stack-buffer-underflow on address 0x00016fdfee00 at pc 0x000100ac9030 bp 0x00016ff11540 sp 0x00016ff10d08
READ of size 4608 at 0x00016fdfee00 thread T2
#0 0x100ac902c in wrap_write+0x15c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x1902c)
#1 0x100109f08 in google_breakpad::UntypedMDRVA::Copy(unsigned int, void const*, unsigned long)+0x54 (my_server:arm64+0x100109f08)
#2 0x10010ce14 in google_breakpad::MinidumpGenerator::WriteStackFromStartAddress(unsigned long long, MDMemoryDescriptor*)+0xf8 (my_server:arm64+0x10010ce14)
#3 0x10010d244 in google_breakpad::MinidumpGenerator::WriteThreadStream(unsigned int, MDRawThread*)+0x100 (my_server:arm64+0x10010d244)
#4 0x10010c04c in google_breakpad::MinidumpGenerator::WriteThreadListStream(MDRawDirectory*)+0xfc (my_server:arm64+0x10010c04c)
#5 0x10010bd20 in google_breakpad::MinidumpGenerator::Write(char const*)+0xc8 (my_server:arm64+0x10010bd20)
#6 0x10010adc0 in google_breakpad::ExceptionHandler::WriteMinidumpWithException(int, int, int, __darwin_ucontext64*, unsigned int, bool, bool)+0x160 (my_server:arm64+0x10010adc0)
#7 0x10010af1c in google_breakpad::ExceptionHandler::WaitForMessage(void*)+0x104 (my_server:arm64+0x10010af1c)
#8 0x1a330a068 in _pthread_start+0x90 (libsystem_pthread.dylib:arm64e+0x7068)
#9 0x1a3304e28 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1e28)
Address 0x00016fdfee00 is located in stack of thread T0 at offset 0 in frame
#0 0x1000034cc in main main.cpp:36
This frame has 10 object(s):
[32, 56) 'reportPath' (line 39) <== Memory access at offset 0 partially underflows this variable
[96, 120) 'ref.tmp' (line 40) <== Memory access at offset 0 partially underflows this variable
[160, 208) 'parser' (line 43) <== Memory access at offset 0 partially underflows this variable
[240, 264) 'configPath' (line 45) <== Memory access at offset 0 partially underflows this variable
[304, 320) 'ref.tmp12' (line 46) <== Memory access at offset 0 partially underflows this variable
[336, 360) 'agg.tmp' <== Memory access at offset 0 partially underflows this variable
[400, 416) 'ref.tmp30' (line 57) <== Memory access at offset 0 partially underflows this variable
[432, 456) 'agg.tmp42' <== Memory access at offset 0 partially underflows this variable
[496, 520) 'agg.tmp80' <== Memory access at offset 0 partially underflows this variable
[560, 568) 'ref.tmp86' (line 74) <== Memory access at offset 0 partially underflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-underflow (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x1902c) in wrap_write+0x15c
Shadow bytes around the buggy address:
0x00702dfdfd70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00702dfdfd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00702dfdfd90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00702dfdfda0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00702dfdfdb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x00702dfdfdc0:[f1]f1 f1 f1 00 00 00 f2 f2 f2 f2 f2 f8 f8 f8 f2
0x00702dfdfdd0: f2 f2 f2 f2 f8 f8 f8 f8 f8 f8 f2 f2 f2 f2 f8 f8
0x00702dfdfde0: f8 f2 f2 f2 f2 f2 f8 f8 f2 f2 00 00 00 f2 f2 f2
0x00702dfdfdf0: f2 f2 f8 f8 f2 f2 00 00 00 f2 f2 f2 f2 f2 00 00
0x00702dfdfe00: 00 f2 f2 f2 f2 f2 f8 f3 f3 f3 f3 f3 00 00 00 00
0x00702dfdfe10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Thread T2 created by T0 here:
#0 0x100ae8c5c in wrap_pthread_create+0x54 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x38c5c)
#1 0x10010a360 in google_breakpad::ExceptionHandler::Setup(bool)+0xd0 (my_server:arm64+0x10010a360)
#2 0x10010a1c4 in google_breakpad::ExceptionHandler::ExceptionHandler(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool (*)(void*), bool (*)(char const*, char const*, void*, bool), void*, bool, char const*)+0x110 (my_server:arm64+0x10010a1c4)
#3 0x1001132b4 in crashhandler::init(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)+0x58 (my_server:arm64+0x1001132b4)
#4 0x10000367c in main main.cpp:41
#5 0x1a2fdfe4c (<unknown module>)
==5060==ABORTING
To reiterate, if I run the program outside of the debugger, it proceeds normally.
What can cause this?
Breakpad is inserting a right into the process's "task exception port" - which is where you listen for crashes and the like either from within the process or externally - i.e. when you are a debugger. But in Mach the exception ports only have a single owner. So when you run under the debugger, Breakpad and the debugger fight for control of the exception port. For example, if you got your port right set up before the debugger attached, you will end up with a bad port right after the attach, because lldb now owns the port.
Debugging programs that use task exception port handlers is not well supported, because (a) it would be tricky to get that right and (b) there aren't enough programs that need to do this to motivate the effort (at least on the debugger side). Most people turn off their exception handling for their debug builds since their exception catcher and the debugger are pretty much doing the same job, and it's more convenient to trap in the debugger than the internal exception handler. And the core part of the exception handler is usually simple enough that you can do printf debugging if you really need to debug that part.
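Turning the handler off for debug builds, as suggested above, could look roughly like this. It is a sketch under assumptions: shouldInstallHandler and the environment variable MY_SERVER_DISABLE_CRASH_HANDLER are hypothetical names invented here, and Breakpad itself knows nothing about them.

```cpp
#include <cstdlib>

namespace crashhandler {

// Hypothetical variable name chosen for this sketch; Breakpad does not read it.
constexpr const char* kDisableEnvVar = "MY_SERVER_DISABLE_CRASH_HANDLER";

// Returns true when the in-process crash handler should be installed.
bool shouldInstallHandler() {
#ifndef NDEBUG
    // Debug build: leave crash trapping to the debugger.
    return false;
#else
    // Release build: install unless explicitly disabled via the environment,
    // e.g. from the debugger's launch configuration.
    return std::getenv(kDisableEnvVar) == nullptr;
#endif
}

}  // namespace crashhandler
```

init() from the question would then return early when shouldInstallHandler() is false, before constructing the ExceptionHandler.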
I was recently doing the first question in the Leetcode Biweekly Competition 78, and I received an unexpected runtime error which I couldn't understand, especially since I had written similar code before which worked fine. I'm quite new to programming and these competitions, please tell me what this Runtime Error means and how I could change my code to fix it.
class Solution {
public:
int divisorSubstrings(int num, int k) {
string b=to_string(num);
string a="";
int x;
int ans=0;
for(int i=0;i<=b.size()-k;++i){
for(int j=i;i<i+k;++j){
a+=b[j];
}
x=stoi(a);
if(num%x==0){
++ans;
}
}
return ans;
}
};
And the error:
=================================================================
==33==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffca3ed0900 at pc 0x000000343d81 bp 0x7ffca3ed0890 sp 0x7ffca3ed0888
READ of size 1 at 0x7ffca3ed0900 thread T0
#2 0x7fe89b85d0b2 (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
Address 0x7ffca3ed0900 is located in stack of thread T0 at offset 96 in frame
This frame has 4 object(s):
[32, 40) '__endptr.i'
[64, 96) 'b' <== Memory access at offset 96 overflows this variable
[128, 160) 'a'
[192, 193) 'ref.tmp'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
Shadow bytes around the buggy address:
0x1000147d20d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1000147d20e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1000147d20f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1000147d2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1000147d2110: 00 00 00 00 f1 f1 f1 f1 f8 f2 f2 f2 00 00 00 00
=>0x1000147d2120:[f2]f2 f2 f2 00 00 00 00 f2 f2 f2 f2 f8 f3 f3 f3
0x1000147d2130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1000147d2140: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
0x1000147d2150: 01 f2 04 f2 00 00 00 00 00 00 00 00 00 00 00 00
0x1000147d2160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1000147d2170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==33==ABORTING
If you need the question, it is at https://leetcode.com/contest/biweekly-contest-78/problems/find-the-k-beauty-of-a-number/
Thanks
Your problem is this line of code:
for(int j=i;i<i+k;++j){
You have two habits you should break. First, you don't use white space. That makes the error in this line much harder to read. Second, you use very short variable names. That ALSO makes the error in this line harder to read.
That for-loop loops forever. The problem is the center clause:
i < i + k
Notice how obvious it is when I add spaces? This problem will get worse as you get older and your eyes get older. The code begins to resemble a wall of unreadable text. Old farts like me won't be able to read your code.
So please, add a little white space. I would have written that line like this:
for (int j = i; j < i + k; ++j) {
Yes, it takes more horizontal space. Space is cheap. Bugs are expensive.
Note that I still think this code is going to go out of range of b's size, so you might still have issues.
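The remaining out-of-range risk mentioned above comes from the outer loop bound b.size() - k: size() is unsigned, so the subtraction wraps around instead of going negative when k is larger than the string. A small sketch (outer_bound_wraps is a hypothetical helper name) makes that visible:

```cpp
#include <cassert>
#include <string>

// True when the bound used in the question's outer loop, b.size() - k,
// has wrapped around to a huge value instead of becoming negative.
bool outer_bound_wraps(const std::string& b, int k) {
    assert(k >= 0);
    // k is converted to the unsigned size type before the subtraction.
    return b.size() - k > b.size();
}
```

Writing the condition as i + k <= size (with both sides signed) avoids the subtraction altogether.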
Here's the accepted solution modified from your snippet. Review the changes made.
class Solution {
public:
int divisorSubstrings(int num, int k) {
string b = to_string(num);
int ans = 0;
for(int i = 0; i <= b.size() - k; i++) {
string a = ""; // keep the declaration close to its usage; this also resets the substring on every iteration
for(int j = i; j < i + k; j++) a += b[j]; // fixed condition: j < i + k (the original i < i + k never became false)
int x = stoi(a);
if (!x) continue; // avoid dividing by zero
if( num % x == 0 ) ans++;
}
return ans;
}
};
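For comparison, here is a variant of the same idea that takes each window with substr and keeps the loop bound in signed arithmetic. It is a sketch, not the contest's reference solution; divisor_substrings is a free-function name chosen here, and it assumes num and k are positive as in the problem statement.

```cpp
#include <cassert>
#include <string>

// Counts the length-k digit windows of num whose value is a non-zero divisor of num.
int divisor_substrings(int num, int k) {
    assert(num > 0 && k > 0);
    const std::string digits = std::to_string(num);
    const int length = static_cast<int>(digits.size());
    int count = 0;
    // start + k <= length keeps every window inside the string, even when k > length.
    for (int start = 0; start + k <= length; ++start) {
        const int value = std::stoi(digits.substr(start, k));
        if (value != 0 && num % value == 0) {
            ++count;
        }
    }
    return count;
}
```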
When given base64 'text' to decode that contains newlines, the following throws an exception complaining that non-base64 characters are present, apparently referring to the newlines:
terminate called after throwing an instance of 'boost::archive::iterators::dataflow_exception'
what(): attempt to decode a value not in base64 char set
Does anyone know how to tell boost to gracefully handle newlines? I recognize I could remove them myself from the string, prior to decoding, but was hoping and guessing there was a more streamlined way.
typedef transform_width< binary_from_base64<remove_whitespace<char*>>, 8, 6 > base64_dec;
unsigned int size = s.size(); //where 's' is the string holding the base64 characters to include newlines at every 76th character
std::string decoded_token(base64_dec(s.c_str()), base64_dec(s.c_str() + size));
Okay, so the trouble is that newlines are not the problem. The filter_iterator is just fundamentally broken.
It will cause Undefined Behaviour as soon as the input sequence ends in a character that doesn't satisfy the filter predicate (in this case, a whitespace character):
Crashing Live On Compiler Explorer
#include <boost/archive/iterators/remove_whitespace.hpp>
#include <iomanip>
#include <iostream>
namespace bai = boost::archive::iterators;
int main() {
using It = bai::remove_whitespace<const char*>;
std::string const s = "oops "; // ends in whitespace, causes UB
std::string filtered(It(s.c_str()), It(s.c_str() + s.length()));
std::cout << std::quoted(filtered) << std::flush;
}
Prints, with ASan enabled (without it it just segfaults):
=================================================================
==1==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffecf132b0 at pc 0x00000040390d bp 0x7fffecf12fd0 sp 0x7fffecf12fc8
READ of size 1 at 0x7fffecf132b0 thread T0
#0 0x40390c in dereference_impl /opt/compiler-explorer/libs/boost_1_78_0/boost/archive/iterators/remove_whitespace.hpp:105
#1 0x40390c in dereference /opt/compiler-explorer/libs/boost_1_78_0/boost/archive/iterators/remove_whitespace.hpp:113
#2 0x40390c in dereference<boost::archive::iterators::filter_iterator<(anonymous namespace)::remove_whitespace_predicate<char>, char const*> > /opt/compiler-explorer/libs/boost_1_78_0/boost/iterator/iterator_facade.hpp:550
#3 0x40390c in operator* /opt/compiler-explorer/libs/boost_1_78_0/boost/iterator/iterator_facade.hpp:656
#4 0x40390c in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<boost::archive::iterators::remove_whitespace<char const*> >(boost::archive::iterators::remove_whitespace<char const*>, boost::archive::iterators::remove_whitespace<char const*>, std::input_iterator_tag) /opt/compiler-explorer/gcc-trunk-20220419/include/c++/12.0.1/bits/basic_string.tcc:204
#5 0x40390c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<boost::archive::iterators::remove_whitespace<char const*>, void>(boost::archive::iterators::remove_whitespace<char const*>, boost::archive::iterators::remove_whitespace<char const*>, std::allocator<char> const&) /opt/compiler-explorer/gcc-trunk-20220419/include/c++/12.0.1/bits/basic_string.h:756
#6 0x40390c in main /app/example.cpp:11
#7 0x7fb625b560b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x240b2)
#8 0x4041ed in _start (/app/output.s+0x4041ed)
Address 0x7fffecf132b0 is located in stack of thread T0 at offset 576 in frame
#0 0x40245f in main /app/example.cpp:7
This frame has 21 object(s):
[32, 33) '<unknown>'
[48, 49) '<unknown>'
[64, 65) '<unknown>'
[80, 81) '<unknown>'
[96, 97) '<unknown>'
[112, 113) '<unknown>'
[128, 136) 'start'
[160, 168) 'start'
[192, 200) '__guard'
[224, 232) 'start'
[256, 264) 'start'
[288, 296) '__capacity'
[320, 328) '__guard'
[352, 368) '<unknown>'
[384, 400) '<unknown>'
[416, 432) '<unknown>'
[448, 464) '<unknown>'
[480, 496) '<unknown>'
[512, 528) '<unknown>'
[544, 576) 's' (line 9) <== Memory access at offset 576 overflows this variable
[608, 640) 'filtered' (line 11)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /opt/compiler-explorer/libs/boost_1_78_0/boost/archive/iterators/remove_whitespace.hpp:105 in dereference_impl
Shadow bytes around the buggy address:
0x10007d9da600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
0x10007d9da610: f1 f1 f8 f2 01 f2 f8 f2 f8 f2 f8 f2 01 f2 00 f2
0x10007d9da620: f2 f2 00 f2 f2 f2 f8 f2 f2 f2 00 f2 f2 f2 00 f2
0x10007d9da630: f2 f2 00 f2 f2 f2 00 f2 f2 f2 00 00 f2 f2 00 00
0x10007d9da640: f2 f2 00 00 f2 f2 00 00 f2 f2 00 00 f2 f2 00 00
=>0x10007d9da650: f2 f2 00 00 00 00[f2]f2 f2 f2 00 00 00 00 f3 f3
0x10007d9da660: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007d9da670: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007d9da680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007d9da690: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007d9da6a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==1==ABORTING
You can consider yourself lucky that you noticed due to symptoms, instead of it eating your puppy or launching the nukes in production code.
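The failure mode described above comes down to skipping rejected characters without knowing where the input ends. A bounded skip, sketched here with a hypothetical helper skip_whitespace (this is not Boost code), stops at the end instead of reading past it:

```cpp
#include <cassert>
#include <cctype>

// Advances first past whitespace but never beyond last.
const char* skip_whitespace(const char* first, const char* last) {
    assert(first <= last);
    while (first != last && std::isspace(static_cast<unsigned char>(*first))) {
        ++first;
    }
    return first;
}
```

With the input "oops " and first pointing at the trailing space, the function returns last; an unbounded loop would keep dereferencing past the buffer.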
This should be reported. There is no test for filter_iterator (or even remove_whitespace for that matter) in isolation, and previous tickets seem to indicate a stance like "It Works For Me" (where "Me" is the Boost Serialization Library). See e.g. https://github.com/boostorg/serialization/issues/135.
Interestingly the analysis at that ticket makes zero sense, since there hasn't been a two-iterator constructor for filter_iterator since ... forever. I can only guess that Robert mistakenly looked at Boost Iterator instead of the filter_iterator from Boost Archive.
So my hunch was to recommend you use filter_iterator.hpp from Boost Iterator instead. Ironically it took several tries (and a trip to cppslack/github) to get right, but here it is: Live On Compiler Explorer
Fix Using Boost Iterator's filter_iterator
We should be able to fix using the working filter_iterator implementation:
using FiltIt =
boost::iterators::filter_iterator<IsGraph, std::string::const_iterator>;
using base64_dec = //
bai::transform_width< //
bai::binary_from_base64<FiltIt>, 8, 6>; //
Now, it's still tricky to get it right. Notably, the naive approach will just fail with UB again:
// CAUTION: this invokes UB:
std::string filtered(base64_dec(s.begin()), base64_dec(s.end()));
It's the curse of implicit conversions + default arguments. Instead we MUST explicitly construct the FiltIt separately:
FiltIt f(IsGraph{}, s.begin(), s.end()), // !!
l(f.predicate(), f.end(), f.end()); // !!
Now we can "just" use them in base64_dec:
std::string filtered(base64_dec{f}, base64_dec{l});
Note the uniform {} initializers to sidestep Most Vexing Parse
Live On Compiler Explorer
#include <boost/archive/iterators/binary_from_base64.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <boost/iterator/filter_iterator.hpp>
#include <cctype>
#include <iomanip>
#include <iostream>
#include <string>
namespace bai = boost::archive::iterators;
static std::string const s = "aGVsbG8g\nd29ybGQK";
int main() {
std::cout << std::unitbuf;
struct IsGraph {
// unsigned char prevents sign extension
bool operator()(unsigned char ch) const {
return std::isgraph(ch); // !std::isspace
}
};
using FiltIt =
boost::iterators::filter_iterator<IsGraph, std::string::const_iterator>;
using base64_dec = //
bai::transform_width< //
bai::binary_from_base64<FiltIt>, 8, 6>; //
//// CAUTION: this invokes UB:
//std::string filtered(base64_dec(s.begin()), base64_dec(s.end()));
FiltIt f(IsGraph{}, s.begin(), s.end()), // !!
l(f.predicate(), f.end(), f.end()); // !!
std::string filtered(base64_dec{f}, base64_dec{l});
std::cout << "OUT:" << std::quoted(filtered) << std::endl;
}
Prints (for the sample base64 above):
OUT:"hello world
"
Summary/TL;DR
Considering the lurking bugs and undocumented limitations, consider using a proper, simpler base64 implementation.
Beast has one in its implementation details, so that's also unsupported, but chances are pretty high that it is at least less brittle.
Alternatively, there must be a library that has proper tests and documentation.
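To give an idea of how small a whitespace-tolerant decoder can be, here is a hand-rolled sketch. It is not the Beast implementation and not a vetted library: decode_base64 and base64_value are names invented here, and the decoder is deliberately lenient (it stops at the first '=' and does not validate padding or input length).

```cpp
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

// Maps one base64 alphabet character to its 6-bit value, or -1 if it is not in the alphabet.
int base64_value(unsigned char ch) {
    if (ch >= 'A' && ch <= 'Z') return ch - 'A';
    if (ch >= 'a' && ch <= 'z') return ch - 'a' + 26;
    if (ch >= '0' && ch <= '9') return ch - '0' + 52;
    if (ch == '+') return 62;
    if (ch == '/') return 63;
    return -1;
}

// Decodes base64, skipping whitespace such as the newlines inserted every 76 characters.
// Returns std::nullopt if a character outside the base64 alphabet is found.
std::optional<std::string> decode_base64(std::string_view input) {
    std::string out;
    unsigned buffer = 0;  // holds the most recent bits; older high bits are simply discarded
    int bits = 0;         // number of not-yet-emitted bits in buffer
    for (unsigned char ch : input) {
        if (std::isspace(ch)) continue;  // tolerate newlines and spaces
        if (ch == '=') break;            // padding: nothing more to decode
        const int value = base64_value(ch);
        if (value < 0) return std::nullopt;
        buffer = (buffer << 6) | static_cast<unsigned>(value);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<char>((buffer >> bits) & 0xFFu));
        }
    }
    return out;
}
```

Called on the sample string from the listing above, "aGVsbG8g\nd29ybGQK", it yields "hello world" followed by a newline.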
I'm learning C++, and on LeetCode, converting a char[] to a string gives an AddressSanitizer: stack-buffer-overflow error.
string test1() /* Line 70 */
{
char test[] = "11";
return string(test);
}
string test2() /* Line 76 */
{
char test[] = {'1', '1'};
return string(test);
}
int main()
{
cout << test1() << endl;
cout << test2() << endl;
}
In the code above, test1 returns "11" and test2 gives the error below with ASAN on. Why does this happen? Aren't they just different ways to initialize a char array?
==87465==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffee2400c22 at pc 0x00010d837634 bp 0x7ffee2400ad0 sp 0x7ffee2400290
READ of size 3 at 0x7ffee2400c22 thread T0
pc_0x10d837633###func_wrap_strlen###file_<null>###line_3###obj_(libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x1a633)
pc_0x10d803a14###func_std::__1::char_traits<char>::length(char const*)###file___string###line_253###obj_(CCC:x86_64+0x100005a14)
pc_0x10d803950###func_std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string<std::nullptr_t>(char const*)###file_string###line_819###obj_(CCC:x86_64+0x100005950)
pc_0x10d80326c###func_std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string<std::nullptr_t>(char const*)###file_string###line_817###obj_(CCC:x86_64+0x10000526c)
pc_0x10d80338f###func_test2()###file_p67-add-binary.cpp###line_79###obj_(CCC:x86_64+0x10000538f)
pc_0x10d803569###func_main###file_p67-add-binary.cpp###line_85###obj_(CCC:x86_64+0x100005569)
pc_0x7fff6cf80cc8###func_start###file_<null>###line_2###obj_(libdyld.dylib:x86_64+0x1acc8)
Address 0x7ffee2400c22 is located in stack of thread T0 at offset 34 in frame
pc_0x10d80328f###func_test2()###file_p67-add-binary.cpp###line_77###obj_(CCC:x86_64+0x10000528f)
This frame has 1 object(s):
[32, 34) 'test' (line 78) <== Memory access at offset 34 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x1a633) in wrap_strlen+0x183
Shadow bytes around the buggy address:
0x1fffdc480130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1fffdc480140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1fffdc480150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1fffdc480160: f1 f1 f1 f1 f8 f2 f8 f3 00 00 00 00 00 00 00 00
0x1fffdc480170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x1fffdc480180: f1 f1 f1 f1[02]f3 f3 f3 00 00 00 00 00 00 00 00
0x1fffdc480190: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
0x1fffdc4801a0: f8 f8 f8 f2 f2 f2 f2 f2 00 00 00 f3 f3 f3 f3 f3
0x1fffdc4801b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1fffdc4801c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1fffdc4801d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
If you want your char * to be processed properly as a string, you must make sure it's null-terminated:
char test[] {'1', '1', '\0'};
String literals do that automatically. "11" is the same as {'1', '1', '\0'}.
Alternatively, you can pass the number of characters to read:
string str(test, sizeof test);
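Both fixes side by side, as a small sketch (the function names from_terminated and from_counted are made up for this example):

```cpp
#include <string>

// A string literal carries a hidden terminator: "11" occupies three chars.
static_assert(sizeof("11") == 3, "two digits plus the terminating null");

// Option 1: terminate the array explicitly so the const char* constructor can find the end.
std::string from_terminated() {
    const char text[] = {'1', '1', '\0'};
    return std::string(text);
}

// Option 2: leave the array unterminated and pass the length explicitly.
std::string from_counted() {
    const char text[] = {'1', '1'};
    return std::string(text, sizeof text);
}
```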
I read that with inline functions, wherever the function call is made, the call is replaced with the body of the function definition.
According to that explanation, there should not be any function call when inline is used.
If that is the case, why do I see three call instructions in the assembly code?
#include <iostream>
inline int add(int x, int y)
{
return x+ y;
}
int main()
{
add(8,9);
add(20,10);
add(100,233);
}
meow#vikkyhacks ~/Arena/c/temp $ g++ -c a.cpp
meow#vikkyhacks ~/Arena/c/temp $ objdump -M intel -d a.o
0000000000000000 <main>:
0: 55 push rbp
1: 48 89 e5 mov rbp,rsp
4: be 09 00 00 00 mov esi,0x9
9: bf 08 00 00 00 mov edi,0x8
e: e8 00 00 00 00 call 13 <main+0x13>
13: be 0a 00 00 00 mov esi,0xa
18: bf 14 00 00 00 mov edi,0x14
1d: e8 00 00 00 00 call 22 <main+0x22>
22: be e9 00 00 00 mov esi,0xe9
27: bf 64 00 00 00 mov edi,0x64
2c: e8 00 00 00 00 call 31 <main+0x31>
31: b8 00 00 00 00 mov eax,0x0
36: 5d pop rbp
37: c3 ret
NOTE
Complete dump of the object file is here
You did not optimize, so the calls are not inlined.
You produced an object file (not a .exe), so the calls are not resolved. What you see is a dummy call whose address will be filled in by the linker.
If you compile a full executable, you will see the correct addresses for the calls.
See page 28 of:
http://www.cs.princeton.edu/courses/archive/spr04/cos217/lectures/Assembler.pdf
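Besides compiling with optimization (for example -O2), GCC and Clang offer an attribute that requests inlining even in unoptimized builds. The sketch below assumes one of those compilers, since [[gnu::always_inline]] is a vendor extension rather than standard C++; sum_three is a hypothetical wrapper added so the results are actually used.

```cpp
// inline alone is only a hint (and a linkage rule); this attribute asks
// GCC/Clang to inline the function regardless of the optimization level.
[[gnu::always_inline]] inline int add(int x, int y)
{
    return x + y;
}

// Uses the three results from the question so the calls are not dead code.
int sum_three()
{
    return add(8, 9) + add(20, 10) + add(100, 233);
}
```

Disassembling the object file for this version should show no call instructions for add inside sum_three, though the exact output depends on the compiler and version.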