I have several calls to getenv in my code (and they are executed a lot of times), so I see the potential for an optimization. My question is: does getenv somehow cache the result internally, or does it query the environment variables on each call?
I have profiled the code and getenv is not a bottleneck, but I'd still like to change it if caching is more efficient.
As a side question: can an environment variable be changed for a program while it is running? I'm not doing that, so in my case caching the result would be safe; it's just curiosity.
Environment variables usually live in the memory of the given process, so there is nothing to cache there; they are readily available.
As for updates, any component of a running process can call putenv to update the environment, so you should not cache values for prolonged periods if you expect that to happen.
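As a small illustration (putenv is POSIX rather than standard C++, and the variable name MY_VAR is made up), a value cached before such an update goes stale:

#include <stdlib.h>   // getenv, putenv (putenv is POSIX)
#include <iostream>
#include <string>

int main() {
    // Cache the value once, up front.
    const char *raw = getenv("MY_VAR");
    std::string cached = raw ? raw : "(unset)";

    // Some other component later updates the environment in-process.
    static char update[] = "MY_VAR=new-value";
    putenv(update);

    // The cached copy no longer matches what getenv returns now.
    std::cout << "cached:  " << cached << '\n'
              << "current: " << getenv("MY_VAR") << '\n';
}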
I doubt it caches the results, since environment variables could change from call to call. You can implement such a cache yourself:
#include <map>
#include <iostream>
#include <string>
#include <stdexcept>
#include <cstdlib>

class EnvCache {
public:
    const std::string &get_env(const std::string &key) {
        auto it = cache_entries.find(key);
        if(it == cache_entries.end()) {
            const char *ptr = getenv(key.c_str());
            if(!ptr)
                throw std::runtime_error("Env var not found");
            it = cache_entries.insert({key, ptr}).first;
        }
        return it->second;
    }

    void clear() {
        cache_entries.clear();
    }

private:
    std::map<std::string, std::string> cache_entries;
};

int main() {
    EnvCache cache;
    std::cout << cache.get_env("PATH") << std::endl;
}
You could invalidate cache entries in case you modify environment variables. You could also map directly to const char*, but that's up to you.
A process inherits its environment from the process that created it. The environment is held in memory.
Indeed, in C and C++ you can define main with an extra parameter that contains the environment; see http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html#Program-Arguments
Additionally, you can use extern char **environ; to access the array containing the environment (it is null-terminated).
Therefore you do not need a cache. The environment variables are held in memory as an array.
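For illustration only (the three-argument form of main is a common extension rather than standard C++, and environ is POSIX), both views of that in-memory array look like this:

#include <cstdio>

extern char **environ;   // POSIX: null-terminated array of "NAME=value" strings

int main(int argc, char *argv[], char *envp[])
{
    // envp (the extra main parameter) and environ point at the same array.
    for (char **e = envp; *e != nullptr; ++e)
        std::printf("%s\n", *e);
    return 0;
}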
One error that I often see is a container being cleared whilst iterating through it. I have attempted to put together a small example program demonstrating this. One thing to note is that this can often happen many function calls deep, so it is quite hard to detect.
Note: This example deliberately shows some poorly designed code. I am trying to find a solution to detect the errors caused by writing code such as this, without having to meticulously examine an entire codebase (~500 C++ units).
#include <iostream>
#include <string>
#include <vector>
#include <cstdlib> // rand()

class Bomb;

std::vector<Bomb> bombs;

class Bomb
{
    std::string name;

public:
    Bomb(std::string name)
    {
        this->name = name;
    }

    void touch()
    {
        if(rand() % 100 > 30)
        {
            /* Simulate everything being exploded! */
            bombs.clear();

            /* An error: "this" is no longer valid */
            std::cout << "Crickey! The bomb was set off by " << name << std::endl;
        }
    }
};

int main()
{
    bombs.push_back(Bomb("Freddy"));
    bombs.push_back(Bomb("Charlie"));
    bombs.push_back(Bomb("Teddy"));
    bombs.push_back(Bomb("Trudy"));

    for(size_t i = 0; i < bombs.size(); i++)
    {
        bombs.at(i).touch();
    }

    return 0;
}
Can anyone suggest a way of guaranteeing this cannot happen?
The only way I can currently detect this kind of thing is replacing the global new and delete with mmap / mprotect and detecting use-after-free memory accesses. This and Valgrind, however, sometimes fail to pick it up if the vector does not need to reallocate (i.e. only some elements are removed, or the new size has not yet reached the reserved capacity). Ideally I don't want to have to clone much of the STL to make a version of std::vector that always reallocates on every insertion/deletion during debug/testing.
One way that almost works is having the std::vector contain std::weak_ptr instead; then using .lock() to create a temporary reference prevents deletion whilst execution is within the class's method. However, this cannot work with std::shared_ptr, because you do not need lock(), and the same goes for plain objects. Creating a container of weak pointers just for this would be wasteful.
Can anyone else think of a way to protect ourselves from this?
The easiest way is to run your unit tests with Clang's MemorySanitizer linked in.
Let some continuous-integration Linux box do it automatically on each push into the repo.
MemorySanitizer has use-after-destruction detection (the flag -fsanitize-memory-use-after-dtor plus the environment variable MSAN_OPTIONS=poison_in_dtor=1), so it will blow up the test that executes the offending code, which turns your continuous integration red.
If you have neither unit tests nor continuous integration in place, then you can also just manually debug your code with MemorySanitizer, but that is the hard way compared with the easy one. So better start to use continuous integration and write unit tests.
Note that there may be legitimate reasons for memory reads and writes after a destructor has run but before the memory has been freed. For example std::variant<std::string, double>: it lets us assign it a std::string and then a double, so its implementation may destroy the string and reuse the same storage for the double. Filtering such cases out is unfortunately manual work at the moment, but tools evolve.
In your particular example, the misery boils down to no fewer than two design flaws:
Your vector is a global variable. Limit the scope of all of your objects as much as possible, and issues like this are less likely to occur.
With the single responsibility principle in mind, I can hardly imagine how one could come up with a class that needs some method that either directly or indirectly (maybe through 100 layers of call stack) deletes objects that could happen to be this.
I am aware that your example is artificial and intentionally bad, so please don't get me wrong here: I'm sure that in your actual case it is not so obvious how sticking to some basic design rules can prevent you from doing this. But as I said, I strongly believe that good design will reduce the likelihood of such bugs coming up. In fact, I cannot remember ever facing such an issue myself, but maybe I am just not experienced enough :)
However, if this really keeps being an issue despite sticking to some design rules, then here is an idea for how to detect it:
Create a member int recursionDepth in your class and initialize it with 0.
At the beginning of each non-private method, increment it.
Use RAII to make sure that at the end of each method it is decremented again.
In the destructor, check that it is 0; otherwise the destructor is being called, directly or indirectly, by some method of this.
You may want to #ifdef all of this and enable it only in debug builds. This would essentially make it a debug assertion; some people like them :)
Note that this does not work in a multi-threaded environment.
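A minimal sketch of that idea (single-threaded, debug-only; the names recursionDepth and MethodGuard are made up for illustration):

#include <cassert>

class Widget {
public:
    void touch() {
        MethodGuard guard(recursionDepth);   // RAII: ++ on entry, -- on exit
        // ... method body that might indirectly destroy *this ...
    }

    ~Widget() {
        // If a method of *this is still on the call stack, this fires.
        assert(recursionDepth == 0 && "destructor called from inside a method of this object");
    }

private:
    struct MethodGuard {
        int &depth;
        explicit MethodGuard(int &d) : depth(d) { ++depth; }
        ~MethodGuard() { --depth; }
    };

    int recursionDepth = 0;
};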
In the end I went with a custom iterator: if the owning std::vector changes whilst an iterator is still in scope, it will log an error or abort (giving me a stack trace of the program). This example is a bit convoluted, but I have tried to simplify it as much as possible and removed unused functionality from the iterator.
This system has flagged up about 50 errors of this nature. Some may be repeats. However, Valgrind and ElectricFence at this point came up clean, which is disappointing (in total they flagged up around 10, which I have already fixed since the start of the code cleanup).
In this example I use clear(), which Valgrind does flag as an error. However, in the actual codebase it is random-access erases (i.e. vec.erase(vec.begin() + 9)) that I need to check, and Valgrind unfortunately misses quite a few.
main.cpp
#include "sstd_vector.h"
#include <iostream>
#include <string>
#include <memory>
class Bomb;
sstd::vector<std::shared_ptr<Bomb> > bombs;
class Bomb
{
std::string name;
public:
Bomb(std::string name)
{
this->name = name;
}
void touch()
{
if(rand() % 100 > 30)
{
/* Simulate everything being exploded! */
bombs.clear(); // Causes an ABORT
std::cout << "Crickey! The bomb was set off by " << name << std::endl;
}
}
};
int main()
{
bombs.push_back(std::make_shared<Bomb>("Freddy"));
bombs.push_back(std::make_shared<Bomb>("Charlie"));
bombs.push_back(std::make_shared<Bomb>("Teddy"));
bombs.push_back(std::make_shared<Bomb>("Trudy"));
/* The key part is the lifetime of the iterator. If the vector
* changes during the lifetime of the iterator, even if it did
* not reallocate, an error will be logged */
for(sstd::vector<std::shared_ptr<Bomb> >::iterator it = bombs.begin(); it != bombs.end(); it++)
{
it->get()->touch();
}
return 0;
}
sstd_vector.h
#include <vector>
#include <stdlib.h>

namespace sstd
{
    template <typename T>
    class vector
    {
        std::vector<T> data;
        size_t refs;

        void check_valid()
        {
            if(refs > 0)
            {
                /* Report an error or abort */
                abort();
            }
        }

    public:
        vector() : refs(0) { }

        ~vector()
        {
            check_valid();
        }

        vector& operator=(vector const& other)
        {
            check_valid();
            data = other.data;
            return *this;
        }

        void push_back(T val)
        {
            check_valid();
            data.push_back(val);
        }

        void clear()
        {
            check_valid();
            data.clear();
        }

        class iterator
        {
            friend class vector;
            typename std::vector<T>::iterator it;
            vector<T>* parent;

            iterator() { }
            iterator& operator=(iterator const&) { abort(); }

        public:
            iterator(iterator const& other)
            {
                it = other.it;
                parent = other.parent;
                parent->refs++;
            }

            ~iterator()
            {
                parent->refs--;
            }

            bool operator !=(iterator const& other)
            {
                if(it != other.it) return true;
                if(parent != other.parent) return true;
                return false;
            }

            iterator operator ++(int val)
            {
                iterator rtn = *this;
                it ++;
                return rtn;
            }

            T* operator ->()
            {
                return &(*it);
            }

            T& operator *()
            {
                return *it;
            }
        };

        iterator begin()
        {
            iterator rtn;
            rtn.it = data.begin();
            rtn.parent = this;
            refs++;
            return rtn;
        }

        iterator end()
        {
            iterator rtn;
            rtn.it = data.end();
            rtn.parent = this;
            refs++;
            return rtn;
        }
    };
}
The disadvantage of this system is that I must use an iterator rather than .at(idx) or [idx]. I personally don't mind this so much, and I can still use .begin() + idx if random access is needed.
It is a little bit slower (though nothing compared to Valgrind). When I am done, I can do a search/replace of sstd::vector with std::vector, and there should be no performance drop.
What is the best approach to make my program execute generated data as code? Say I wrote this (so-called) compiler for an x86_64 machine:
#include <iostream>
#include <vector>
#include <cstdlib>
#include <cstdint>
#include <utility> // std::forward

struct compiler
{
    void op() const { return; }

    template< typename ...ARGS >
    void op(std::uint8_t const _opcode, ARGS && ..._tail)
    {
        code_.push_back(_opcode);
        return op(std::forward< ARGS >(_tail)...);
    }

    void clear() { code_.clear(); }

    long double operator () () const
    {
        // ?
    }

private :
    std::vector< std::uint8_t > code_;
};

int main()
{
    compiler compiler_; // long double (*)();
    compiler_.op(0xD9, 0xEE); // FLDZ
    compiler_.op(0xC3);       // ret
    std::cout << compiler_() << std::endl;
    return EXIT_SUCCESS;
}
But I don't know how to implement operator () correctly. I suspect that I must put the contents of code_ into a contiguous chunk of memory, cast its address to long double (*)(), and call that. But there are some difficulties:
Should I use VirtualProtect(Ex) (+ FlushInstructionCache) on Windows? And something similar on Linux?
What container reliably places the bytes in memory in the proper manner (i.e. contiguously, one after another), and also allows getting a pointer to the memory chunk?
First, you will need to allocate the memory as executable [using VirtualAlloc with an "executable" protection flag on Windows, and mmap with PROT_EXEC among the protection flags on Linux]. It's probably a lot easier to allocate a large region of this kind of memory and then have an "allocation function" hand out pieces of it for your content. You could possibly use VirtualProtect and the corresponding function on Linux (mprotect), but I'd say that allocating the memory as executable in the first place is a better choice. I don't believe you need to flush the instruction cache if the memory is already allocated as executable - certainly not on x86 at least - and since your instructions are x86 instructions, I guess that's a fair limitation.
Second, you'll need to make something like a function pointer to your code. Something like this should do it:
typedef void (*funcptr)(void);
funcptr f = reinterpret_cast<funcptr>(&code_[0]);
should do the trick.
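For the Linux side, here is a rough sketch of what could go in place of the // ? above (shown as an out-of-class definition for readability; it assumes POSIX mmap/munmap, keeps error handling minimal, and assumes the generated code matches the long double (*)() signature as in the FLDZ/ret example):

#include <sys/mman.h>  // mmap, munmap (POSIX)
#include <cstring>     // memcpy
#include <stdexcept>

long double compiler::operator () () const
{
    // Allocate a writable + executable region and copy the generated code into it.
    // (On systems enforcing W^X you would map it writable first, then mprotect it to executable.)
    void *mem = ::mmap(nullptr, code_.size(),
                       PROT_READ | PROT_WRITE | PROT_EXEC,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        throw std::runtime_error("mmap failed");

    std::memcpy(mem, code_.data(), code_.size());

    // Treat the buffer as a function taking no arguments and returning long double.
    typedef long double (*jit_fn)();
    jit_fn fn = reinterpret_cast<jit_fn>(mem);
    long double result = fn();

    ::munmap(mem, code_.size());
    return result;
}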
The following code seems to always follow the true branch.
#include <map>
#include <iostream>

class TestClass {
    // implementation
};

int main() {
    std::map<int, TestClass*> TestMap;

    if (TestMap[203] == nullptr) {
        std::cout << "true";
    } else {
        std::cout << "false";
    }

    return 0;
}
Is it defined behaviour for an uninitialized pointer to point at nullptr, or an artifact of my compiler?
If not, how can I ensure portability of the following code? Currently, I'm using similar logic to return the correct singleton instance for a log file:
#include <string>
#include <map>
#include <fstream>

class Log {
public:
    static Log* get_instance(std::string path);

protected:
    Log(std::string path) : path(path), log(path) {};
    std::string path;
    std::ofstream log;   // a concrete stream type; std::ostream cannot be opened from a path

private:
    static std::map<std::string, Log*> instances;
};

std::map<std::string, Log*> Log::instances = std::map<std::string, Log*>();

Log* Log::get_instance(std::string path) {
    if (instances[path] == nullptr) {
        instances[path] = new Log(path);
    }
    return instances[path];
}
One solution would be to use something similar to this, where you use a special function to provide a default value when checking a map. However, my understanding is that this would cause the complexity of the lookup to be O(n) instead of O(1). This isn't too much of an issue in my scenario (there would only ever be a handful of logs), but a better solution would be to somehow force pointers of type Log* to reference nullptr by default, thus making the lookup check O(1) and portable at the same time. Is this possible, and if so, how would I do it?
The map always value-initializes its mapped values (in situations where they are not copy-initialized, of course), and value-initialization for built-in types means zero-initialization, so it is indeed defined behaviour. In particular, this applies to the value part of new entries created when you access a key with operator[] that didn't exist before the call.
Note, however, that an uninitialized pointer is not necessarily a null pointer; indeed, just reading its value already invokes undefined behaviour (and might cause a segmentation fault on certain platforms under certain circumstances). The point is that pointers in maps are not uninitialized. So if you write, for example,
void foo()
{
    TestClass* p;
    // ...
}
p will not be initialized to nullptr.
Note, however, that you might want to check for presence instead, to avoid accumulating unnecessary entries. You'd check for presence using the find member function:
std::map<int, TestClass*>::iterator it = TestMap.find(203);
if (it == TestMap.end())
{
    // there's no such element in the map
}
else
{
    TestClass* p = it->second;
    // ...
}
Yes, that's defined behaviour. If an element isn't yet in a map when you access it via operator[], it gets value-initialized, which for a pointer means it is null.
I am creating a scripting language that first parses the code
and then copies functions (to execute the code) into one buffer/memory region, following the parsed code.
Is there a way to copy a function's binary code into a buffer and then execute the whole buffer?
I need to execute all the functions at once to get better performance.
To best understand my question, I want to do something like this:
#include <vector>

using namespace std;

class RuntimeFunction; // The buffer for my runtime function

enum ByteCodeType {
    Return,
    None
};

class ByteCode {
    ByteCodeType type;
};

void ReturnRuntime() {
    return;
}

RuntimeFunction GetExecutableData(vector<ByteCode> function) {
    RuntimeFunction runtimeFunction = RuntimeFunction(sizeof(int)); // Returns int
    for (int i = 0; i < function.size(); i++) {
#define CurrentByteCode function[i]
        if (CurrentByteCode.type == Return) {
            runtimeFunction.Append(&ReturnRuntime);
        } // etc.
#undef CurrentByteCode
    }
    return runtimeFunction;
}

void* CallFunc(RuntimeFunction runtimeFunction, vector<void*> custom_parameters) {
    for (int i = custom_parameters.size() - 1; i >= 0; --i) { // Push parameters in reverse order
        __asm {
            push custom_parameters[i]
        }
    }
    __asm {
        call runtimeFunction.pHandle
    }
}
There are a number of ways of doing this, depending on how deep you want to get into generating code at runtime, but one relatively simple way of doing it is with threaded code and a threaded code interpreter.
Basically, threaded code consists of an array of function pointers, and the interpreter goes through the array calling each pointed-at function. The tricky part is that you generally have each function return the address of the array element containing a pointer to the next function to call, which allows you to implement things like branches and calls without any effort in the interpreter.
Usually you use something like:
typedef void *(*tc_func_t)(void *, runtime_state_t *);

void *interp(tc_func_t **entry, runtime_state_t *state) {
    tc_func_t *pc = *entry;
    // stop when a function returns null or we hit the terminating null entry
    while (pc && *pc) pc = (tc_func_t *)(*pc)(pc+1, state);
    return entry+1;
}
That's the entire interpreter. runtime_state_t is some kind of data structure containing some runtime state (usually one or more stacks). You call it by creating an array of tc_func_t function pointers, filling it in with function pointers (and possibly data), ending with a null pointer, and then calling interp with the address of a variable containing the start of the array. So you might have something like:
void *add(tc_func_t *pc, runtime_state_t *state) {
    int v1 = state->data.pop();
    int v2 = state->data.pop();
    state->data.push(v1 + v2);
    return pc; }

void *push_int(tc_func_t *pc, runtime_state_t *state) {
    state->data.push((int)*pc);
    return pc+1; }

void *print(tc_func_t *pc, runtime_state_t *state) {
    cout << state->data.pop();
    return pc; }

tc_func_t program[] = {
    (tc_func_t)push_int,
    (tc_func_t)2,
    (tc_func_t)push_int,
    (tc_func_t)2,
    (tc_func_t)add,
    (tc_func_t)print,
    0
};

void run_program() {
    runtime_state_t state;
    tc_func_t *entry = program;
    interp(&entry, &state);
}
Calling run_program runs the little program that adds 2+2 and prints the result.
Now you may be confused by the slightly odd calling setup for interp, with an extra level of indirection on the entry argument. That's so that you can use interp itself as a function in a threaded code array, followed by a pointer to another array, and it will do a threaded code call.
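As a rough sketch of that (my own illustration, reusing push_int, add, print and interp from above, and relying on the same liberal pointer casts as the program array):

// A "subroutine": push 2, push 2, add, then stop at the terminating null entry.
tc_func_t add_two_and_two[] = {
    (tc_func_t)push_int, (tc_func_t)2,
    (tc_func_t)push_int, (tc_func_t)2,
    (tc_func_t)add,
    0
};

// A caller: interp itself appears as a threaded-code function, followed by a
// pointer to the subroutine array. The nested interp runs the subroutine and
// returns the address of the slot after that pointer, so execution resumes
// at print.
tc_func_t caller[] = {
    (tc_func_t)interp, (tc_func_t)add_two_and_two,
    (tc_func_t)print,
    0
};

void run_caller() {
    runtime_state_t state;
    tc_func_t *entry = caller;
    interp(&entry, &state);   // prints 4
}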
edit
The biggest problem with threaded code like this is related to performance -- the threaded code interpreter is extremely unfriendly to branch predictors, so performance is pretty much capped at one threaded instruction call per branch-misprediction recovery time.
If you want more performance, you pretty much have to go to full-on runtime code generation. LLVM provides a good, machine-independent interface for doing that, along with pretty good optimizers for common platforms that will produce good code at runtime.
I want to write the data "somebytes" that I get from a function called NextUnit() to a file named "output.txt", but the code that I wrote does not work. When I open the file, it does not look like my "somebytes". Here is the code:
#include <stdio.h>
#include <string.h>

char* NextUnit()
{
    char Unit[256];
    strcpy(Unit,"somebytes");
    return &Unit[0];
}

int main()
{
    FILE *ourfile;
    ourfile=fopen("output.txt","wb");

    char* somedata;
    somedata=NextUnit();

    printf("%s\n",somedata);
    fwrite(somedata,1,strlen(somedata),ourfile);

    fclose(ourfile);
}
You are returning the address of a local from the function (i.e. a released stack address), which is then overwritten once you call the next function.
Either just return a string literal, which has static storage duration:
const char* NextUnit() { return "somebytes"; }
or copy it into newly allocated memory, which you will then also need to free later...
char* NextUnit()
{
    char* newstr = new char[256];
    strcpy(newstr,"somebytes");
    return newstr;
}

// some code later
char* tmpstr = NextUnit();
// do writing stuff

// free memory (allocated with new[], so use delete[])
delete [] tmpstr;
You've declared Unit[256] on the stack in a subprocedure. But when your NextUnit() returns, the variable that was scoped to it goes out of scope, and you are no longer pointing to valid memory.
Consider allocating the memory with new, and then releasing it in the caller, or having the caller pass down a pointer to preallocated memory.
you are returning the local address of a function. Ether just return
const char* NextUnit() { return "somebytes"; }
so it's constant, or copy it into a new memory stucture, which you will then need to also free later...
I don't have enough mojo to comment on the quoted answer, so I have to put this as a new answer.
His answer is trying to say the right thing, but it came out wrong.
Your code is returning the address of a local variable inside the NextUnit() function. Don't do that. It's bad. Do what he suggested.
If you are using C++, the following is a much better way to go about this:
#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main(int argc, char ** argv)
{
    ofstream outFile;
    outFile.open("output.txt");

    outFile << "someBytes";

    outFile.close();
    return 0;
}
And, once you are comfortable with that, the next thing to learn about is RAII.
I would rewrite it like this:
char *NextUnit(char *src)
{
    strcpy(src, "somebytes");
    return src;
}
This way you can decide what to do with the variable outside the function implementation:
char Unit[256];
char *somedata = NextUnit(Unit);
NextUnit returns the address of Unit, which is an array local to that function. That means it is allocated on the stack and "released" when the function returns, making the returned pointer invalid.
To solve this problem you can:
Dynamically allocate a new string each time NextUnit is called. Note that in that case you will have to delete the memory afterwards.
Create a global string. That's fine for a small "test" application, but use of global variables is generally discouraged.
Have main allocate a string (either dynamically or on the stack), pass it as a parameter to NextUnit, and have NextUnit copy into that string.
You have a few problems here. The main one, I think, is that NextUnit() allocates the buffer on the stack, so it goes out of scope when the function returns and you end up returning the address of dead storage.
You can fix this in a C-style solution by mallocing space for the buffer and returning the pointer that malloc returns.
I think a first step might be to rewrite the code to something more like the following:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

char* NextUnit()
{
    char *Unit = (char *)malloc( 256 );
    memset(Unit, 0, 256);   /* sizeof(Unit) would only be the size of the pointer */
    strcpy(Unit,"somebytes");
    return Unit;
}

int main()
{
    FILE *ourfile;
    ourfile=fopen("output.txt","wb");

    char* somedata;
    somedata=NextUnit();

    printf("%s\n",somedata);
    //fwrite(somedata,1,strlen(somedata),ourfile);
    fprintf(ourfile, "%s", somedata);

    free(somedata);
    fclose(ourfile);
}
"Unit" declared as a local variable inside NextUnit which is actually a "stack" variable meaning that it's lifetime is only as long as NextUnit hasn't returned.
So, while NextUnit hasn't returned yet, copying "somebytes" to it is ok, as is printing it out. As soon as NextUnit returns, Unit is released from the stack and the pointer somedata in main will not be pointing to something valid.
Here is quick fix. I still don't recommend writing programs this way this way, but it's the least changes.
#include <stdio.h>
#include <string.h>

char Unit[256];

char* NextUnit()
{
    strcpy(Unit,"somebytes");
    return &Unit[0];
}

int main()
{
    FILE *ourfile;
    ourfile=fopen("output.txt","wb");

    char* somedata;
    somedata=NextUnit();

    printf("%s\n",somedata);
    fwrite(somedata,1,strlen(somedata),ourfile);

    fclose(ourfile);
}
That works, but it's kind of pointless returning the address of Unit when it's actually global!
Declare Unit as static:
char* NextUnit()
{
    static char Unit[256];
    strcpy(Unit,"somebytes");
    return &Unit[0];
}
But if you use a C++ compiler, you should consider using std::string instead of char*. std::string is safer and will do all the allocation/deallocation work for you.
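For example, a minimal std::string version (reusing the names from the original code) avoids the lifetime problem entirely, because the string is returned by value:

#include <cstdio>
#include <string>

std::string NextUnit()
{
    return "somebytes";   // returned by value, so no dangling pointer
}

int main()
{
    std::string somedata = NextUnit();

    std::FILE *ourfile = std::fopen("output.txt", "wb");
    std::fwrite(somedata.data(), 1, somedata.size(), ourfile);
    std::fclose(ourfile);
}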