How to find global static initializations - c++

I just read this excellent article: http://neugierig.org/software/chromium/notes/2011/08/static-initializers.html
and then I tried: https://gcc.gnu.org/onlinedocs/gccint/Initialization.html
What it says about finding initializers does not work for me though. The .ctors section is not available, but I could find .init_array (see also Can't find .dtors and .ctors in binary). But how do I interpret the output? I mean, summing up the size of the pages can also be handled by the size command and its .bss column - or am I missing something?
Furthermore, nm does not report any *_GLOBAL__I_* symbols, only *_GLOBAL__N_* functions, and - more interesting - _GLOBAL__sub_I_somefile.cpp entries. The latter probably indicates files with global initialization. But can I somehow get a list of constructors that are being run? Ideally, a tool would give me a list of
Foo::Foo in file1.cpp:12
Bar::Bar in file2.cpp:45
...
(assuming I have debug symbols available). Is there such a tool? If not, how could one write it? Does the .init_array section contain pointers to code which could be translated via some DWARF magic to the above?

As you already observed, the implementation details of constructors/initialization functions are highly compiler (version) dependent. While I am not aware of a tool for this, what current GCC/clang versions do is simple enough to let a small script do the job: .init_array is just a list of entry points. objdump -s can be used to dump the list, and nm to look up the symbol names. Here's a Python script that does that. It should work for any binary generated by those compilers:
#!/usr/bin/env python
import os
import sys
# Load .init_array section
objdump_output = os.popen("objdump -s '%s' -j .init_array" % (sys.argv[1].replace("'", r"\'"),)).read()
is_64bit = "x86-64" in objdump_output
init_array = objdump_output[objdump_output.find("Contents of section .init_array:") + 33:]
initializers = []
for line in init_array.split("\n"):
    parts = line.split()
    if not parts:
        continue
    parts.pop(0) # Remove offset
    parts.pop(-1) # Remove ascii representation
    if is_64bit:
        # 64bit pointers are 8 bytes long
        parts = [ "".join(parts[i:i+2]) for i in range(0, len(parts), 2) ]
    # Fix endianness
    parts = [ "".join(reversed([ x[i:i+2] for i in range(0, len(x), 2) ])) for x in parts ]
    initializers += parts
# Load disassembly for c++ constructors
dis_output = os.popen("objdump -d '%s' | c++filt" % (sys.argv[1].replace("'", r"\'"), )).read()
def find_associated_constructor(disassembly, symbol):
    # Find associated __static_initialization function
    loc = disassembly.find("<%s>" % symbol)
    if loc < 0:
        return False
    loc = disassembly.find(" <", loc)
    if loc < 0:
        return False
    symbol = disassembly[loc+2:disassembly.find("\n", loc)][:-1]
    if symbol[:23] != "__static_initialization":
        return False
    address = disassembly[disassembly.rfind(" ", 0, loc)+1:loc]
    loc = disassembly.find("%s <%s>" % (address, symbol))
    if loc < 0:
        return False
    # Find all callq's in that function
    end_of_function = disassembly.find("\n\n", loc)
    symbols = []
    while loc < end_of_function:
        loc = disassembly.find("callq", loc)
        if loc < 0 or loc > end_of_function:
            break
        loc = disassembly.find("<", loc)
        symbols.append(disassembly[loc+1:disassembly.find("\n", loc)][:-1])
    return symbols
# Load symbol names, if available
nm_output = os.popen("nm '%s'" % (sys.argv[1].replace("'", r"\'"), )).read()
nm_symbols = {}
for line in nm_output.split("\n"):
    parts = line.split()
    if not parts:
        continue
    nm_symbols[parts[0]] = parts[-1]
# Output a list of initializers
print("Initializers:")
for initializer in initializers:
    symbol = nm_symbols[initializer] if initializer in nm_symbols else "???"
    constructor = find_associated_constructor(dis_output, symbol)
    if constructor:
        for function in constructor:
            print("%s %s -> %s" % (initializer, symbol, function))
    else:
        print("%s %s" % (initializer, symbol))
C++ static initializers are not called directly, but through two generated functions, _GLOBAL__sub_I_.. and __static_initialization... The script uses the disassembly of those functions to get the name of the actual constructor. You'll need the c++filt tool to demangle the names, or remove the call from the script to see the raw symbol names.
Shared libraries can have their own initializer lists, which would not be displayed by this script. The situation is slightly more complicated there: For non-static initializers, the .init_array gets an all-zero entry that is overwritten with the final address of the initializer when loading the library. So this script would output an address with all zeros.
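The pointer decoding that the script does on objdump's hex text can also be sketched with the struct module, working on the raw section bytes instead. This is only a sketch under assumptions: decode_init_array is a hypothetical helper name, the sample bytes are made up, and how the raw bytes of .init_array are obtained is left open.

```python
import struct

def decode_init_array(section_bytes, pointer_size=8, little_endian=True):
    """Return the entry-point addresses stored in raw .init_array bytes."""
    if pointer_size not in (4, 8):
        raise ValueError("pointer_size must be 4 or 8")
    if len(section_bytes) % pointer_size:
        raise ValueError("section length is not a multiple of the pointer size")
    count = len(section_bytes) // pointer_size
    code = "Q" if pointer_size == 8 else "I"
    fmt = ("<" if little_endian else ">") + str(count) + code
    return list(struct.unpack(fmt, section_bytes))

# Hypothetical 64-bit little-endian section holding two pointers.
raw = bytes.fromhex("d006400000000000" "3007400000000000")
addresses = decode_init_array(raw)
# Format as 16 hex digits so the values can be used as keys into nm_symbols.
print(["%016x" % a for a in addresses])
```

An all-zero entry, as described above for shared libraries, would simply decode to address 0.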

There are multiple things executed when loading an ELF object, not just .init_array. To get an overview, I suggest looking at the sources of glibc's dynamic loader, especially _dl_init() and call_init().

output_ptr Assertion Error in Main - Cairo-lang

I am going through the official Cairo language tutorial. When I get to the build_dict function for the 15-puzzle, I am a bit confused so would like to print out some things to see.
struct Location:
    member row : felt
    member col : felt
end
...
func finalize_state(dict : DictAccess*, idx) -> (
        dict : DictAccess*):
    if idx == 0:
        return (dict=dict)
    end
    assert dict.key = idx
    assert dict.prev_value = idx - 1
    assert dict.new_value = idx - 1
    return finalize_state(
        dict=dict + DictAccess.SIZE, idx=idx - 1)
end
##### I added the {} along with context in it for output #####
func main{output_ptr : felt*}():
    alloc_locals
    local loc_tuple : (Location, Location, Location, Location, Location) = (
        Location(row=0, col=2),
        Location(row=1, col=2),
        Location(row=1, col=3),
        Location(row=2, col=3),
        Location(row=3, col=3),
        )
    # Get the value of the frame pointer register (fp) so that
    # we can use the address of loc_tuple.
    let (__fp__, _) = get_fp_and_pc()
    # Since the tuple elements are next to each other we can use the
    # address of loc_tuple as a pointer to the 5 locations.
    verify_location_list(
        loc_list=cast(&loc_tuple, Location*), n_steps=4)
    ##### Here is what I added #####
    local locs : Location* = cast(&loc_tuple, Location*)
    tempvar loc = [locs]
    tempvar row = loc.row
    serialize_word(row)
    ################################
    return ()
end
I added the lines for printing the first row in loc_tuple. However, the Cairo compiler is giving me the following errors:
Traceback (most recent call last):
  File "/Users/yijiachen/cairo_venv/bin/cairo-compile", line 10, in <module>
    sys.exit(main())
  File "/Users/yijiachen/cairo_venv/lib/python3.8/site-packages/starkware/cairo/lang/compiler/cairo_compile.py", line 397, in main
    cairo_compile_common(
  File "/Users/yijiachen/cairo_venv/lib/python3.8/site-packages/starkware/cairo/lang/compiler/cairo_compile.py", line 121, in cairo_compile_common
    assembled_program = assemble_func(
  File "/Users/yijiachen/cairo_venv/lib/python3.8/site-packages/starkware/cairo/lang/compiler/cairo_compile.py", line 367, in cairo_assemble_program
    check_main_args(program)
  File "/Users/yijiachen/cairo_venv/lib/python3.8/site-packages/starkware/cairo/lang/compiler/cairo_compile.py", line 296, in check_main_args
    assert main_args == expected_builtin_ptrs, (
AssertionError: Expected main to contain the following arguments (in this order): []. Found: ['output_ptr'].
I have tried with various serialize_word statements and none seem to work. This issue never arose before with other serialize_word statements, including in earlier parts of the tutorial.
Declare %builtins output at the top of your code so that the compiler knows that your main function uses an implicit argument (output_ptr) and expects it.
The Cairo compiler is able to process implicit arguments only if you declare that you are going to use them.

Getting "IndexError: list index out of range". Is this a bug?

I can't seem to find the error in my Python 2.7.13 code. When I try to run it, the following shows up:
"IndexError: list index out of range"
for d in dopant[1:]:
    for s in xrange(1,3,2):
        for k in xrange(0,1):
            # creates folder
            try:
                os.makedirs("path")
            except OSError:
                if not os.path.isdir("path"):
                    raise
            # enters that folder
            os.chdir("path")
            file2 = open("atomicXYZ","a+")
            stdin=subprocess.PIPE, stdout = file2).stdin
            subprocess.Popen(['cat', '/path/file'], stdout = cmd1)
            file2.seek(0)
            # The following reads atomicXYZ and converts its contents to tuples
            result = []
            with file2 as fp:
                for i in fp.readlines():
                    tmp = i.split()
                    try:
                        result.append((float(tmp[0]), float(tmp[1]), float(tmp[2])))
                    except:pass
            # As a check, I access the last line of the tuple
            x,y,z = result[len(result)-1]
            os.chdir("..")
That is when the error shows up. This is surprising because atomicXYZ is NOT empty, as you can see here:
atomicXYZ
0.309595018 0.070879924 0.041045030
0.600985479 0.103996517 0.130482163
0.982347083 -0.008801119 -0.088718291
0.266923601 0.125720284 -0.038070136
0.520845390 0.163282973 0.061118496
0.812787033 0.194089924 0.131124240
0.398054509 0.270533816 -0.097226923
0.673016094 0.332428625 0.006571612
0.968473946 0.356972107 0.087712083
0.896549601 0.449435057 0.027658530
0.602586223 0.391867525 -0.070503370
0.732266134 0.576057624 -0.111890811
1.018201372 0.643127004 -0.009288985
0.914029765 0.703744085 -0.066356115
And what is even stranger is that when I split this code into two scripts, one for writing the atomic coordinates and one for reading them, it works.
What is it that I'm doing wrong?
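The full traceback is not shown, so which indexing failed is a guess. One thing that is certain is that result[len(result)-1] raises IndexError whenever result is empty, and the bare except: pass hides every line that failed to parse. A diagnostic sketch (parse_xyz is a hypothetical helper, written for Python 3, and the sample lines are made up) reports which lines were skipped instead of dropping them silently:

```python
def parse_xyz(lines):
    """Convert whitespace-separated lines into (x, y, z) float tuples.

    Returns the parsed tuples and the 1-based numbers of skipped lines.
    """
    coords = []
    skipped = []
    for number, line in enumerate(lines, 1):
        fields = line.split()
        try:
            coords.append((float(fields[0]), float(fields[1]), float(fields[2])))
        except (IndexError, ValueError):
            skipped.append(number)
    return coords, skipped

# Made-up input: one valid line, one header-like line, one empty line.
sample = ["0.309595018 0.070879924 0.041045030", "atomicXYZ", ""]
coords, skipped = parse_xyz(sample)
print(len(coords), skipped)
if coords:  # guard the access that fails on an empty list
    x, y, z = coords[-1]
```

If coords comes back empty right after the file was written, one possibility, which this excerpt cannot confirm, is that the file was read before the child process had finished writing it.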

How to convert a list of strings into a dict object with kwarg as the keys?

I have seen similar questions. This one is the most similar that I've found:
Python converting a list into a dict with a value of 1 for each key
The difference is that I need the dict keys to be unique and ordered keyword arguments.
I am trying to feed the list of links I've generated through a scraper into a request command. I understand the request.get() function only takes a URL string or kwarg parameters - hence my need to pair the list of links with keyword arguments that are ordered.
terms = input('type boolean HERE -->')
zipcity = input('type location HERE -->')
search = driver.find_element_by_id('keywordsearch')
search.click()
search.send_keys(terms)  # pass the variable, not the literal string 'terms'
location = driver.find_element_by_id('WHERE')
location.click()
location.send_keys(zipcity)  # likewise for zipcity
clickSearch = driver.find_element_by_css_selector('#buttonsearch-button')
clickSearch.click()
time.sleep(5)
cv = []
cvDict = {}
bbb = driver.find_elements_by_class_name('user-name')
for plink in bbb:
    cv.append(plink.find_element_by_css_selector('a').get_attribute('href'))
cvDict = {x: 1 for x in cv}
print(cvDict)
SOLVED: (for now). Somehow figured it out myself. That literally never happens. Lucky day I guess!
cvDict = {'one': cv[:1],
'tw': cv[:2],
'thr': cv[:3],
'fou': cv[:4],
'fiv': cv[:5],
'six': cv[:6],
'sev': cv[:7],
'eig': cv[:8],
'nin': cv[:9],
'ten': cv[:10],
'ele': cv[:11],
'twe': cv[:12],
'thi': cv[:13],
'fourteen': cv[:14],
'fifteen': cv[:15],
'sixteen': cv[:16],
'seventeen': cv[:17],
'eighteen': cv[:18],
'nineteen': cv[:19],
'twent': cv[:20],
}
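Note that in the workaround above each value is a slice: cv[:1] is a list holding the first link, cv[:2] a list holding the first two, and so on, so the values are growing lists, not single links. If the aim is one unique, ordered key per link, a shorter sketch could look like this (links_to_kwargs and the link prefix are hypothetical names, and the URLs are placeholders):

```python
def links_to_kwargs(links, prefix="link"):
    """Pair each link with a unique key such as link1, link2, ... in list order."""
    width = len(str(len(links)))  # zero-pad so keys also sort in order
    return {f"{prefix}{i:0{width}d}": url for i, url in enumerate(links, 1)}

# Placeholder links standing in for the scraped hrefs in cv.
cv = [
    "https://example.com/cv/a",
    "https://example.com/cv/b",
    "https://example.com/cv/c",
]
cv_dict = links_to_kwargs(cv)
for key, url in cv_dict.items():
    print(key, url)
```

Dictionaries keep insertion order in Python 3.7 and later, so the keys come back in the order of the list. If the goal is only to request each link in turn, iterating over cv directly may be enough.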

Sequence of dictionaries in python

I am trying to create a sequence of similar dictionaries to further store them in a tuple. I tried two approaches, using and not using a for loop
Without for loop
dic0 = {'modo': lambda x: x[0]}
dic1 = {'modo': lambda x: x[1]}
lst = []
lst.append(dic0)
lst.append(dic1)
tup = tuple(lst)
dic0 = tup[0]
dic1 = tup[1]
f0 = dic0['modo']
f1 = dic1['modo']
x = np.array([0,1])
print (f0(x) , f1(x)) # 0 , 1
With a for loop
lst = []
for j in range(0,2):
    dic = {}
    dic = {'modo': lambda x: x[j]}
    lst.insert(j,dic)
tup = tuple(lst)
dic0 = tup[0]
dic1 = tup[1]
f0 = dic0['modo']
f1 = dic1['modo']
x = np.array([0,1])
print (f0(x) , f1(x)) # 1 , 1
I really don't understand why I am getting different results. It seems that the last dictionary I insert overwrites the previous ones, but I don't know why (the append method does not work either).
Any help would be really welcome.
This is happening due to how scoping works in this case. Try putting j = 0 above the final print statement and you'll see what happens.
Also, you might try
from operator import itemgetter
lst = [{'modo': itemgetter(j)} for j in range(2)]
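The experiment suggested above can be sketched as follows. The code is wrapped in a hypothetical demo function and uses a plain list instead of a NumPy array so that it is self-contained; in the question j is a module-level variable, but the effect is the same.

```python
from operator import itemgetter

def demo():
    x = [10, 20]
    lst = []
    for j in range(2):
        lst.append({'modo': lambda x: x[j]})
    # Both lambdas look j up when they are called; the loop left it at 1.
    after_loop = (lst[0]['modo'](x), lst[1]['modo'](x))
    j = 0  # rebinding j changes what both lambdas return
    after_rebind = (lst[0]['modo'](x), lst[1]['modo'](x))
    # itemgetter(k) stores the index when it is created instead.
    fixed = [{'modo': itemgetter(k)} for k in range(2)]
    with_itemgetter = (fixed[0]['modo'](x), fixed[1]['modo'](x))
    return after_loop, after_rebind, with_itemgetter

print(demo())  # ((20, 20), (10, 10), (10, 20))
```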
You have accidentally created what is known as a closure. The lambda functions in your second (loop-based) example include a reference to a variable j. That variable is actually the loop variable used to iterate your loop. So the lambda expression actually produces code with a reference to "some variable named 'j' that I didn't define, but it's around here somewhere."
This is called "closing over" or "enclosing" the variable j, because even when the loop is finished, there will be this lambda function you wrote that references the variable j. And so it will never get garbage-collected until you release the references to the lambda function(s).
You get the same value (1, 1) printed because j stops iterating over the range(0,2) with j=1, and nothing changes that. So when your lambda functions ask for x[j], they're asking for the present value of j, then getting the present value of x[j]. In both functions, the present value of j is 1.
You could work around this by creating a make_lambda function that takes an index number as a parameter. Or you could do what @DavisYoshida suggested, and use someone else's code to create the appropriate closure for you.
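A minimal sketch of the make_lambda idea, assuming a plain list in place of the NumPy array from the question:

```python
def make_lambda(index):
    """Return a function that picks element `index` of its argument.

    Each call gets its own `index`, so later changes to the loop
    variable do not affect functions that were already created.
    """
    return lambda x: x[index]

lst = []
for j in range(2):
    lst.append({'modo': make_lambda(j)})
tup = tuple(lst)

x = [0, 1]
print(tup[0]['modo'](x), tup[1]['modo'](x))  # 0 1
```

The lambda still closes over a variable, but that variable is the parameter of one particular make_lambda call, not the shared loop variable j.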

How can I use let - in over multiple lines in OCaml?

I want to do the following:
let dist = Stack.pop stck and dir = Stack.pop stck in (
print_endline(dist);
print_endline(dir);
......
......
)
The above gives me the following error:
Error: This expression has type unit
This is not a function; it cannot be applied.
How can I use the variables dist and dir over multiple lines?
The error is not in the piece of code you show here. I guess you forgot a ; somewhere.
But there is a subtle error in your code.
In this line of code
let dist = Stack.pop stck and dir = Stack.pop stck in
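You expect to obtain the first element of the stack in dist and the second one in dir, but that may not be the case, as the order of evaluation of bindings joined by and is unspecified. Using a separate let ... in for each binding makes the order explicit.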
The basic syntax of your code is OK. Here's a simple example showing that it works fine:
$ ocaml
OCaml version 4.01.0
# let x = 5 and y = 4 in (
print_int x;
print_int y;
);;
54- : unit = ()
#
The reported errors have to do with other problems. We would need more context to see what's wrong. Possibly the errors are in the lines you elided. Or they could be caused by what comes next.