How to make my own "const data" unmodifiable?

How to make my own "const data" unmodifiable? - c++

I had a trouble with my own const attribute. I could not protect my const data completely because of functions. Functions can modify everything.
const reference is an awful thing because it requires an address; if it's an address of a const variable, most likely the variable data still can be changed or modified. My data is writeable. The parser cannot prevent (ignore) this kind of code because it's totally legal in C++.
scanf("%d", &const_var);
How to lock my data and make my const data unmodifiable (or unwritable)?

In this case...
scanf("%d", &foo); // Out of range!!!
...foo could indeed be modified, or pretty much anything else can happen. The compiler can't catch this because scanf uses old-fashion c varargs which allows any number of parameters, of any type to be passed, with no checking whatsoever. There is no attempt to ensure that the parameter passed is an address, let alone the address of an int, and certainly not the address of a non-const int.
For safer C++ code, avoid using printf/sscanf family of functions entirely.
Lint or some other source code checker could help here. I haven't tried it with this specific case.

C++ is not a safe language. In C++, you can only protect yourself from accidents, not perfidy. If a user wants to change something that they're not supposed to be able to change, it's as simple as this:
void SomeFunc(const Obj &dontChange)
{
Obj &change = const_cast<Obj &>(dontChange);
change.X = ...;
}
There is nothing you can do to protect yourself from a user who is determined to violate the contract between your code and theirs. You can only protect yourself from a user who accidentally tried to break the contract. You can't even protect private members of your class, as they can simply cast the type to something else and poke at the memory directly.
Your best bet is this: don't worry about it. If the user does the wrong thing with some variable, then any breakage that happens as a result is their fault.

You can't protect yourself from such situations directly. If the constant is put inside writable memory, one may always acquire address of this constant and modify it one or another way. Remember though, that standard says, that modifying constant values is considered to have an undefined behavior.
If you want the constant to be unmodifyable, you can make a workaround and use a macro:
#define foo 100
This is a trick, because it will exchange all occurrences of foo in the source into 100, however noone will be able to modify its value in the runtime (by definition, 100 is 100).
However, I would avoid using macros. You may use another workaround instead: define a function:
constexpr int FooValue()
{
return 100;
}
Noone will be able to modify it either, but this solution allows you to embed such "constant" in class or namespace. If you use compiler, which can handle C++11, using constexpr will allow you to input this function everywhere, where you might have inserted a constant, for example:
int array[FooValue()];

Related

Questions and Verifications on immutable [string] objects c++

I've been doing some reading on immutable strings in general and in c++, here, here, and I think I have a decent understanding of how things work. However I have built a few assumptions that I would just like to run by some people for verification. Some of the assumptions are more general than the title would suggest:
While a const string in c++ is the closest thing to an immutable string in STL, it is only locally immutable and therefore doesn't experience the benefit of being a smaller object. So it has all the trimmings of a standard string object but it can't access all of the member functions. This means that it doesn't create any optimization in the program over non-const? But rather just protects the object from modification? I understand that this is an important attribute but I'm simply looking to know what it means to use this
I'm assuming that an object's member functions exist only once in read-only memory, and how is probably implementation specific, so does a const object have a separate location in memory? Or are the member functions limited in another way? If there are only 'const string' objects and no non-const strings in a code base, does the compiler leave out the inaccessible functions?
I recall hearing that each string literal is stored only once in read-only memory in c++, however I don't find anything on this here. In other words, if I use some string literal multiple times in the same program, each instance references the same location in memory. I'm going to assume no, but would two string objects initialized by the same string literal point to the same string until one is modified?
I apologize if I have included too many disjunct thoughts in the same post, they are all related to me as string representation and just learning how to code better.

As far as I know, std::string cannot assume that the input string is a read-only constant string from your data segment. Therefore, point (3) does not apply. It will most likely allocate a buffer and copy the string in the buffer.
Note that C++ (like C) has a const qualifier for compilation time, it is a good idea to use it for two reasons: (a) it will help you find bugs, a statement such as a = 5; if a is declared const fails to compile; (b) the compile may be able to optimize the code more easily (it may otherwise not be able to figure out that the object is constant.)
However, C++ has a special cast to remove the const-ness of a variable. So our a variable can be cast and assigned a value as in const_cast<int&>(a) = 5;. An std::string can also get its const-ness removed. (Note that C does not have a special cast, but it offers the exact same behavior: * (int *) &a = 5)
Are all class members defined in the final binary?
No. std::string as most of the STL uses templates. Templates are compiled once per unit (your .o object files) and the link will reduce duplicates automatically. So if you look at the size of all the .o files and add them up, the final output will be a lot small.
That also means only the functions that are used in a unit are compiled and saved in the object file. Any other function "disappear". That being said, often function A calls function B, so B will be defined, even if you did not explicitly call it.
On the other hand, because these are templates, very often the functions get inlined. But that is a choice by the compiler, not the language or the STL (although you can use the inline keyword for fun; the compiler has the right to ignore it anyway).
Smaller objects... No, in C++ an object has a very specific size that cannot change. Otherwise the sizeof(something) would vary from place to place and C/C++ would go berserk!
Static strings that are saved in read-only data sections, however, can be optimized. If the linker/compiler are good enough, they will be able to merge the same string in a single location. These are just plan char * or wchar_t *, of course. The Microsoft compiler has been able to do that one for a while now.
Yet, the const on a string does not always force your string to be put in a read-only data section. That will generally depend on your command line option. C++ may have corrected that, but I think C still put everything in a read/write section unless you use the correct command line option. That's something you need to test to make sure (your compiler is likely to do it, but without testing you won't know.)
Finally, although std::string may not use it, C++ offers a quite interesting keyword called mutable. If you heard about it, you would know that a variable member can be marked as mutable and that means even const functions can modify that variable member. There are two main reason for using that keyword: (1) you are writing a multi-thread program and that class has to be multi-thread safe, in that case you mark the mutex as mutable, very practical; (2) you want to have a buffer used to cache a computed value which is costly, that buffer is only initialized when that value is requested to not waste time otherwise, that buffer is made mutable too.
Therefore the "immutable" concept is only really something that you can count on at a higher level. In practice, reality is often quite different. For example, an std::string c_str() function may reallocate the buffer to add the necessary '\0' terminator, yet that function is marked as being a const:
const CharT* c_str() const;
Actually, an implementation is free to allocate a completely different buffer, copy its existing data to that buffer and return that bare pointer. That means internally the std::string could be allocate many buffers to store large strings (instead of using realloc() which can be costly.)
Once thing, though... when you copy string A into string B (B = A;) the string data does not get copied. Instead A and B will share the same data buffer. Once you modify A or B, and only then, the data gets copied. This means calling a function which accepts a string by copy does not waste that much time:
int func(std::string a)
{
...
if(some_test)
{
// deep copy only happens here
a += "?";
}
}
std::string b;
func(b);
The characters of string b do not get copied at the time func() gets called. And if func() never modifies 'a', the string data remains the same all along. This is often referenced as a shallow copy or copy on write.

Practical use of pointer to const

I understand the use of pointer to constant for strlen implementation.
size_t strlen ( const char * str );
Can anyone suggest other reasons or provide some scenarios where 'Pointer to Const value' is useful in practice.

Think of it this way. You want me to look at the value of a variable but you don't want me to alter that variable in any way, so you pass it to me as a constant. When I use your function and see that a parameter is constant then I know there is a contract between you and I that says I should not change the value of that variable nor can I do it directly.
When you write code you don't always know who will use your functions. So it is good practice to protect your code. It also protects you from yourself, you will get compiler errors the moment you tr to change the value of that variable.
A side note: It is true that in C you can still change the value even though the parameter says const, but it would take another pointer which would alter the content of that variable in memory.
Try compiling this code and notice how the compiler protects you from making a mistake.
const char *cookies(const char *s)
{
return ('\0' == *s)? s: s + 1;
}
It won't let you compile, why? Because you are trying to change a const variable.
Another post with the same question here: const usage with pointers in C

Marking a pointer argument as const is a contract whereby you assert to the user you won't change the values pointed to. It's also a contract that you won't try to write to them, so that it is legal for them to give you a pointer to something that is read-only memory, and your function won't crash - as long as you fulfill this contract.
It's not just "useful", in some cases it is required, particularly when you are dealing with what amounts to write protected memory such as string literals or any value stored in the code section of your executable.
It's also valuable when you are using data that might be shared: If two threads want to call a function foo(char* x) each will need its own copy of the string, otherwise bad things would happen. If the function is foo(const char* x) then we know it is safe for both to share a single pointer to the input.
Consider, if you have a pointer to write-protected memory:
mprotect(ptr, sizeOfData, PROT_READ);
It is now not possible to call a function that attempts to write to ptr without a program exception (this is often done with things like caches when nobody owns a write-lock on the cache).
Thus you can only call const functions on ptr at this point.

Pointer to const of char is the type of the strings literals (those inside ""). This is the primary use for pointer to const value.

Valid reasons for References of locally accessible data?

So I was just minding my own business looking over some code I wrote, and I noticed something I had done. I made a reference of protected data float& px = A->p.x I did this purely to shorten an if statement's width and make it more readable. The function was stated as a friend of the class of course, so it could access the protected data. Furthermore, there were 4 similar references.
I am wondering if this is a valid reason to make a reference.
If not are there valid reasons to create a reference which is in no way a parameter to the function being worked in.
Example:
void generic_function(A_class* A)
{
float& x = A->x;
//purposeful and valid code using x?
}
I tried to word the question as best I could, I apologize if it doesn't communicate my purpose for asking.
Edit:
As a point of importance. I am concerned with optimization for real time application design.

Absolutely, this is a very valid way of using references. Any time you need a stand-in for another location in memory to shorten your code, a reference is a perfect choice. In your case, it also shortens the chain of dereferencing: if you use x several times, for example, in a loop, you would cut on the number of dereferences, because you would no longer need to read A to get to x inside it. The compiler may be smart enough to optimize out this dereference for you, so you shouldn't do it as a matter of micro-optimization.
Another valid reason to use a reference is to access a cached value inside a container. A common pattern for accessing cached values from std::map is to read the value into a reference, check it, and then assign to the reference thus bypassing a second look-up in the map. The timing of this could be significant, especially in a tight loop, so using a reference into a map can be considered an optimization technique.

I have used this several times to do things like:
// logFile is optional; if set, the function will log to that file
// If it's empty, it will log to stderr
void foo(const std::string& logFile)
{
std::ofstream file;
std::ostream& os = (logFile.empty()) ? std::cerr : file;
if (!logFile.empty())
{
file.open(logFile);
// validate that the file opened successfully, etc.
}
// Then, throughout the function, I can just use os rather than writing
// an if statement every time I log something
os << "Hello world!\n";
}
Edit: I made the code actually valid

A reference is only an alias for the referenced object and therefore there is no additional memory overhead or performance cost of using it inside a function like you do. If it results in a cleaner code I can't see a reason not to use it.

What use are const pointers (as opposed to pointers to const objects)?

I've often used pointers to const objects, like so...
const int *p;
That simply means that you can't change the integer that p is pointing at through p. But I've also seen reference to const pointers, declared like this...
int* const p;
As I understand it, that means that the pointer variable itself is constant -- you can change the integer it points at all day long, but you can't make it point at something else.
What possible use would that have?

When you're designing C programs for embedded systems, or special purpose programs that need to refer to the same memory (multi-processor applications sharing memory) then you need constant pointers.
For instance, I have a 32 bit MIPs processor that has a little LCD attached to it. I have to write my LCD data to a specific port in memory, which then gets sent to the LCD controller.
I could #define that number, but then I also have to cast it as a pointer, and the C compiler doesn't have as many options when I do that.
Further, I might need it to be volatile, which can also be cast, but it's easier and clearer to use the syntax provided - a const pointer to a volatile memory location.
For PC programs, an example would be: If you design DOS VGA games (there are tutorials online which are fun to go through to learn basic low level graphics) then you need to write to the VGA memory, which might be referenced as an offset from a const pointer.
-Adam

It allows you to protect the pointer from being changed. This means you can protect assumptions you make based on the pointer never changing or from unintentional modification, for example:
int* const p = &i;
...
p++; /* Compiler error, oops you meant */
(*p)++; /* Increment the number */

another example:
if you know where it was initialized, you can avoid future NULL checks.
The compiler guarantees you that the pointer never changed (to NULL)…

In any non-const C++ member function, the this pointer is of type C * const, where C is the class type -- you can change what it points to (i.e. its members), but you can't change it to point to a different instance of a C. For const member functions, this is of type const C * const. There are also (rarely encountered) volatile and const volatile member functions, for which this also has the volatile qualifier.

One use is in low-level (device driver or embedded) code where you need to reference a specific address that's mapped to an input/output device like a hardware pin. Some languages allow you to link variables at specific addresses (e.g. Ada has use at). In C the most idiomatic way to do this is to declare a constant pointer. Note that such usages should also have the volatile qualifier.
Other times it's just defensive coding. If you have a pointer that shouldn't change it's wise to declare it such that it cannot change. This will allow the compiler (and lint tools) to detect erroneous attempts to modify it.

I've always used them when I wanted to avoid unintended modification to the pointer (such as pointer arithmetic, or inside a function). You can also use them for Singleton patterns.
'this' is a hardcoded constant pointer.

Same as a "const int" ... if the compiler knows it's not going to change, it can be optimization assumptions based on that.
struct MyClass
{
char* const ptr;
MyClass(char* str) :ptr(str) {}
void SomeFunc(MyOtherClass moc)
{
for(int i=0; i < 100; ++i)
{
printf("%c", ptr[i]);
moc.SomeOtherFunc(this);
}
}
}
Now, the compiler could do quite a bit to optimize that loop --- provided it knows that SomeOtherFunc() does not change the value of ptr. With the const, the compiler knows that, and can make the assumptions. Without it, the compiler has to assume that SomeOtherFunc will change ptr.

I have seen some OLE code where you there was an object passed in from outside the code and to work with it, you had to access the specific memory that it passed in. So we used const pointers to make sure that functions always manipulated the values than came in through the OLE interface.

Several good reasons have been given as answers to this questions (memory-mapped devices and just plain old defensive coding), but I'd be willing to bet that most instances where you see this it's actually an error and that the intent was to have to item be a pointer-to-const.
I certainly have no data to back up this hunch, but I'd still make the bet.

Think of type* and const type* as types themselves. Then, you can see why you might want to have a const of those types.

always think of a pointer as an int. this means that
object* var;
actually can be thought of as
int var;
so, a const pointer simply means that:
const object* var;
becomes
const int var;
and hence u can't change the address that the pointer points too, and thats all. To prevent data change, u must make it a pointer to a const object.

Tracking automatic variable lifetime?

This may not be possible, but I figured I'd ask...
Is there any way anyone can think of to track whether or not an automatic variable has been deleted without modifying the class of the variable itself? For example, consider this code:
const char* pStringBuffer;
{
std::string sString( "foo" );
pStringBuffer = sString.c_str();
}
Obviously, after the block, pStringBuffer is a dangling pointer which may or may not be valid. What I would like is a way to have a wrapper class which contains pStringBuffer (with a casting operator for const char*), but asserts that the variable it's referencing is still valid. By changing the type of the referenced variable I can certainly do it (boost shared_ptr/weak_ptr, for example), but I would like to be able to do it without imposing restrictions on the referenced type.
Some thoughts:
I'll probably need to change the assignment syntax to include the referenced variable (which is fine)
I might be able to look at the stack pointer to detect if my wrapper class was allocated "later" than the referenced class, but this seems hacky and not standard (C++ doesn't define stack behavior). It could work, though.
Thoughts / brilliant solutions?

In general, it's simply not possible from within C++ as pointers are too 'raw'. Also, looking to see if you were allocated later than the referenced class wouldn't work, because if you change the string, then the c_str pointer may well change.
In this particular case, you could check to see if the string is still returning the same value for c_str. If it is, you are probably still valid and if it isn't then you have an invalid pointer.
As a debugging tool, I would advise using an advanced memory tracking system, like valgrind (available only for linux I'm afraid. Similar programs exist for windows but I believe they all cost money. This program is the only reason I have linux installed on my mac). At the cost of much slower execution of your program, valgrind detects if you ever read from an invalid pointer. While it isn't perfect, I've found it detects many bugs, in particular ones of this type.

One technique you may find useful is to replace the new/delete operators with your own implementations which mark the memory pages used (allocated by your operator new) as non-accessible when released (deallocated by your operator delete). You will need to ensure that the memory pages are never re-used however so there will be limitations regarding run-time length due to memory exhaustion.
If your application accesses memory pages once they've been deallocated, as in your example above, the OS will trap the attempted access and raise an error. It's not exactly tracking per se as the application will be halted immediately but it does provide feedback :-)
This technique is applicable in narrow scenarios and won't catch all types of memory abuses but it can be useful. Hope that helps.

You could make a wrapper class that works in the simple case you mentioned. Maybe something like this:
X<const char*> pStringBuffer;
{
std::string sString( "foo" );
Trick trick(pStringBuffer).set(sString.c_str());
} // trick goes out of scope; its destructor marks pStringBuffer as invalid
But it doesn't help more complex cases:
X<const char*> pStringBuffer;
{
std::string sString( "foo" );
{
Trick trick(pStringBuffer).set(sString.c_str());
} // trick goes out of scope; its destructor marks pStringBuffer as invalid
}
Here, the invalidation happens too soon.
Mostly you should just write code which is as safe as possible (see: smart pointers), but no safer (see: performance, low-level interfaces), and use tools (valgrind, Purify) to make sure nothing slips through the cracks.

Given "pStringBuffer" is the only part of your example existing after sString goes out of scope, you need some change to it, or a substitute, that will reflect this. One simple mechanism is a kind of scope guard, with scope matching sString, that affects pStringBuffer when it is destroyed. For example, it could set pStringBuffer to NULL.
To do this without changing the class of "the variable" can only be done in so many ways:
Introduce a distinct variable in the same scope as sString (to reduce verbosity, you might consider a macro to generate the two things together). Not nice.
Wrap with a template ala X sString: it's arguable whether this is "modifying the type of the variable"... the alternative perspective is that sString becomes a wrapper around the same variable. It also suffers in that the best you can do is have templated constructor pass arguments to wrapped constructors up to some finite N arguments.
Neither of these help much as they rely on the developer remembering to use them.
A much better approach is to make "const char* pStringBuffer" simply "std::string some_meaningful_name", and assign to it as necessary. Given reference counting, it's not too expensive 99.99% of the time.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to make my own "const data" unmodifiable? - c++

Related

Questions and Verifications on immutable [string] objects c++

Practical use of pointer to const

Valid reasons for References of locally accessible data?

What use are const pointers (as opposed to pointers to const objects)?

Tracking automatic variable lifetime?

Categories

Resources