I understand that if two variables, say a1 and a2, appear in an EQUIVALENCE(a1,a2) statement in Fortran, then they occupy the same memory space. Say this happens in a procedure where both a1 and a2 are local variables of that procedure.
This means that you can't have separate copies of a1 and a2 in memory, right? Because one of the values would be overwritten. You could keep a1 in the memory location and keep a2 in a register for the whole of the procedure and this would be fine, right?
My question is basically: Can you keep a1 in a register for the whole procedure?
I would say yes... unless you run out of registers and a1 has to be stored back to memory. Then you will overwrite a2 and lose its value, and both variables a1 and a2 will then actually refer to the value of a1.
a1 and a2 in an equivalence statement means that those two variables will occupy the same storage. Changing one will alter the other, even if they are variables of different types (e.g., a1 is an integer and a2 is a real). Fortran doesn't give you any way to specify that a variable should be in a register, and it seems extremely likely that an equivalence statement will inhibit the compiler from doing so automatically.
So "You could keep a1 in the memory location and keep a2 in a register for the whole of the procedure and this will be fine right?" is inapplicable.
I strongly recommend against the use of equivalence ... it is pernicious and is likely retained in the language only to support legacy code. If you need to transfer data across types, the modern Fortran method is the transfer intrinsic.
The only reason to use equivalence is to have two names for the same thing. As a Fortran programmer, you can't control registers or any such thing. Don't think about memory and registers. Think that you have one 'box' with two names. Whichever name you use, you are storing into or retrieving from the same box.
Related
I'm trying to port C++ code from a developer who uses global variables called
p0, p1, ..., p30
of integer type.
I wondered if I could not just use an array int p[31]; and access them as p[0], p[1],...
It seems plausible that there would be no performance hit if the indices were always passed as constants. Then I could just pass his data as extern int p[];.
Obviously I could use descriptive macros for the various indices to make the code clearer.
I know that this sounds like weird code, but the developer seems to have a "neurodiverse" personality, and we can't just tell him to mend his ways. Performance is very important in the module he is working on.
I don't see any danger in the replacement of variables with an array.
Modern compilers are very good at optimizing code.
You can normally assume that there will be no difference between using individual variables p0, …, p30 and a std::array<int, 31> (or an int[31]), if they are used in the same way and you use only constants to access the array.
A compiler is not required to keep a std::array or an int[] as such; it can completely or partially optimize it away as long as it complies with the as-if rule.
Variables (including arrays) only need to exist in memory if the compiler can't determine their contents at compile time and/or if the registers are not sufficient to do all manipulations related to those variables using only registers.
If they exist in memory, they need to be referenced by their address; for both a pN and a p[N] (if N is constant), the address where the value is in memory can be determined in the same way at compile time.
If you are unsure whether the generated code is the same, you can always compare the output generated by the compiler (this can e.g. be done on godbolt), or use the corresponding compiler flags if you don't want to submit code to a foreign service.
Suppose I have an array defined as follows:
volatile char v[2];
And I have two threads (denoted by A, B respectively) manipulating array v. If I ensure that A, B use different indices at any time, that is to say, if A is now manipulating v[i], then B is either doing nothing, or manipulating v[1-i]. I wonder is synchronization needed for this situation?
I have referred to this question, however I think it is limited in Java. The reason why I ask this question is that I have been struggling with a strange and rare bug in a large project for days, and up to now, the only reason I could come up with to explain the bug is that synchronization is needed for the above manipulation. (Since the bug is very rare, it is hard for me to prove whether my conjecture is true)
Edit: both reading and modifying are possible for v.
As far as the C++11 and C11 standards are concerned, your code is safe. C++11 §1.7 [intro.memory]/p2, irrelevant note omitted:
A memory location is either an object of scalar type or a maximal
sequence of adjacent bit-fields all having non-zero width. Two or more
threads of execution (1.10) can update and access separate memory
locations without interfering with each other.
char is an integral type, which means it's an arithmetic type, which means that volatile char is a scalar type, so v[0] and v[1] are separate memory locations.
C11 has a similar definition in §3.14.
Before C++11 and C11, the language itself has no concept of threads, so you are left to the mercy of the particular implementation you are using.
It might be a compiler bug or a hardware limitation.
Sometimes, when a variable smaller than 32/64 bits is accessed in memory, the processor will read 32 bits, set the appropriate 8 or 16 bits, then write back the whole register. That means it will read and write the adjacent memory as well, leading to a data race.
Solutions are:
- use byte-access instructions. They may not be available for your processor, or your compiler may not know to use them.
- pad your elements to avoid this kind of sharing. The compiler should do it automatically if your target platform does not support byte access. But in an array, this conflicts with the memory layout requirements.
- synchronize the whole structure
C++03/C++11 debate
In classic C++ it's up to you to avoid/mitigate this kind of behaviour. In C++11 this violates memory model requirements, as stated in other answers.
You need to handle synchronization only if you are accessing the same memory and modifying it. If you are only reading, you don't need to take care of synchronization.
Since, as you say, each thread will access different indices, you don't require synchronization here. But you need to make sure that the two threads never modify the same index at the same time.
In my program, I have the following arrays of double: a1, a2, ..., am; b1, b2, ..., bm; c1, c2, ..., cm; which are members of a class, all of length N, where m and N are known at run time. The reason I named them a, b, and, c is because they mean different things and that's how they are accessed outside the class. I wonder what's the best way to allocate memory for them. I was thinking:
1) Allocating everything in one big chunk. Something like.. double *ALL = new double[3*N*m] and then have a member function return a pointer to the requested part using pointer arithmetic.
2) Create 2D arrays A, B, and C of size m*N each.
3) Use std::vector? But since m is only known at run time, I would need a vector of vectors.
or does it not really matter what I use? I'm just wondering what's a good general practice.
If all three are linked in some way, if there is any relationship between a[i] and b[i], then they should all be stored together, ideally in a structure that names them with a meaningful and relevant name. This will be easier to understand for any future developer and ensures that the length of the array is always correct by default.
This is called design affordance, meaning that the structure of an object or interface lends itself to be used as intended by default. Just think how a programmer who had never seen the code before would interpret its purpose, the less ambiguity the better.
EDIT
Rereading I realize you might be asking about some kind of memory optimization (?) although it isn't clear. I'd still say use something like this, either an array of class pointers or structs depending on just how large N is.
This really depends significantly on how the data are used. If each array is used independently, then the straightforward approach is a number of named vectors or vectors of vectors.
If the arrays are used together where for example a[i] and b[i] are related and used together, separate arrays is not really a good approach because you'll keep accessing different areas of memory potentially causing a lot of cache misses. Instead you would want to aggregate the elements of a and b together into a struct or class and then have a single vector of those aggregates.
I don't see a big problem with allocating a big array and providing an appropriate interface to access the correct sets of elements. But please don't do this with new to manage your memory: use vector even in this case: std::vector<double> all(3*N*m); However, I'm not sure this buys you anything either, so one of my other options may express the intent more clearly.
Use option 3, a vector of vectors. That will free you from worrying about memory management.
Then hide it behind an interface so you can change it if you feel the need.
Also, far and near pointers ... can anyone elaborate a bit?
In C++, I have no clue how pointers work at the opcode level, or at the circuit level, but I know it's memory accessing other memory, or vice versa, etc.
But in Assembly you can use pointers as well.
Is there any notable difference here that's worth knowing, or is it the same concept? Is it applied differently at the mnemonics level of low-level, microprocessor-specific assembly?
Near and far pointers were only relevant for 16 bit operating systems. Ignore them unless you really really need them. You might still find the keywords in today's compilers for backwards compatibility but they won't actually do anything. In 16-bit terms, a near pointer is a 16-bit offset where the memory segment is already known by the context and a far pointer contains both the 16-bit segment and a 16-bit offset.
In assembler a pointer simply points to a memory location. The same is true in C++, but in C++ the memory might contain an object; depending on the type of the pointer, the address may change, even though it's the same object. Consider the following:
class A
{
public:
    int a;
};

class B
{
public:
    int b;
};

class C : public A, B
{
public:
    int c;
};

C obj;
A * pA = &obj;
B * pB = &obj;
pA and pB will not be equal! They will point to different parts of the C object; the compiler makes automatic adjustments when you cast the pointer from one type to another. Since it knows the internal layout of the class it can calculate the proper offsets and apply them.
Generally, a pointer is something that allows you to access something else, because it points to it.
In a computer, the "something" and the "something else" are memory contents. Since memory is accessed by specifying its memory address, a pointer (something) is a memory location that stores the memory address of something else.
In a programming language, high level or assembler, you give memory addresses a name, since a name is easier to remember than a memory address (which is usually given as a hex number). This name is either the name of a constant that, for the compiler (high level) or the assembler (machine level), is exactly the same as the hex number, or the name of a variable (a memory location) that stores the hex number.
So, there is no difference in the concept of a pointer for high level languages like C++ or low level languages like assembler.
Re near/far:
In the past, some platforms, notably 16-bit DOS and 16-bit Windows, used a concept of memory segments. Near pointers were pointers into an assumed default segment and were basically just an offset, whereas far pointers contained both a segment part and an offset part and could thus represent any address in memory.
In DOS, there were a bunch of different memory models you could choose from for C/C++ in particular, one where there was just one data segment and so all data pointers were implicitly near, several where there was only one code segment, one where both code and data pointers were far, and so on.
Programming with segments is a real PITA.
In addition to what everyone said regarding near/far: the difference is that in C++, pointers are typed - the compiler knows what they are pointing at, and does some behind-the-scenes address arithmetic for you. For example, if you have an int *p and access p[i], the compiler adds 4*i to the value of p and accesses memory at that address. That's because an integer (on most modern platforms) is 4 bytes long. Same with pointers to structures/classes - the compiler will quietly calculate the offset of the data item within the structure and adjust the memory address accordingly.
With assembly memory access, no such luck. On assembly level, technically speaking, there's almost no notion of variable datatype. Specifically, there's practically no difference between integers and pointers. When you work with arrays of something bigger than a byte, keeping track of array item length is your responsibility.
class foo { }
writeln(foo.classinfo.init.length); // = 8 bytes
class foo { char d; }
writeln(foo.classinfo.init.length); // = 9 bytes
Is d actually storing anything in those 8 bytes, and if so, what? It seems like a huge waste. If I'm just wrapping a few value types, then the class significantly bloats the program, specifically if I am using a lot of them. A char becomes 8 times larger while an int becomes 3 times as large.
A struct's minimum size is 1 byte.
In D, objects have a header containing 2 pointers (so it may be 8 or 16 bytes depending on your architecture).
The first pointer is the virtual method table. This is an array, generated by the compiler, filled with function pointers so that virtual dispatch is possible. All instances of the same class share the same virtual method table.
The second pointer is the monitor. It is used for synchronization. It is not certain that this field will stay here forever, because D emphasizes local storage and immutability, which make synchronization on many objects useless. As this field is older than those features, it is still here and can be used. However, it may disappear in the future.
Such a header on objects is very common; you'll find the same in Java or C# for instance. You can look here for more information: http://dlang.org/abi.html
D uses two machine words in each class instance for:
A pointer to the virtual function table. This contains the addresses of virtual methods. The first entry points towards the class's classinfo, which is also used by dynamic casts.
The monitor, which allows the synchronized(obj) syntax, documented here.
These fields are described in the D documentation here (scroll down to "Class Properties") and here (scroll down to "Classes").
I don't know the particulars of D, but in both Java and .net, every class object contains information about its type, and also holds information about whether it's the target of any monitor locks, whether it's eligible for finalization cleanup, and various other things. Having a standard means by which all objects store such information can make many things more convenient for both users and implementers of the language and/or framework. Incidentally, in 32-bit versions of .net, the overhead for each object is 8 bytes except that there is a 12-byte minimum object size. This minimum stems from the fact that when the garbage-collector moves objects around, it needs to temporarily store in the old location a reference to the new one as well as some sort of linked data structure that will permit it to examine arbitrarily-deep nested references without needing an arbitrarily-large stack.
Edit
If you want to use a class because you need to be able to persist references to data items, space is at a premium, and your usage patterns are such that you'll know when data items are still useful and when they become obsolete, you may be able to define an array of structures, and then pass around indices to the array elements. It's possible to write code to handle this very efficiently with essentially zero overhead, provided that the structure of your program allows you to ensure that every item that gets allocated is released exactly once and things are not used once they are released.
If you would not be able to readily determine when the last reference to an object is going to go out of scope, eight bytes would be a very reasonable level of overhead. I would expect that most frameworks would force objects to be aligned on 32-bit boundaries (so I'm surprised that adding a byte would push the size to nine rather than twelve). If a system is going to have a garbage collector that works better than a Commodore 64's(*), it would need an absolute minimum of a bit of overhead per object to indicate which things are used and which aren't. Further, unless one wants to have separate heaps for objects which can contain supplemental information and those which can't, one will need every object to either include space for a supplemental-information pointer, or include space for all the supplemental information (locking, abandonment notification requests, etc.). While it might be beneficial in some cases to have separate heaps for the two categories of objects, I doubt the benefits would very often justify the added complexity.
(*) The Commodore 64 garbage collector worked by allocating strings from the top of memory downward, while variables (which are not GC'ed) were allocated bottom-up. When memory got full, the system would scan all variables to find the reference to the string that was stored at the highest address. That string would then be moved to the very top of memory and all references to it would be updated. The system would then scan all variables to find the reference to the string at the highest address below the one it just moved and update all references to that. The process would repeat until it didn't find any more strings to move. This algorithm didn't require any extra data to be stored with strings in memory, but it was of course dog slow. The Commodore 128 garbage collector stored with each string in GC space a pointer to the variable that holds a reference and a length byte that could be used to find the next lower string in GC space; it could thus check each string in order to find out whether it was still used, relocating it to the top of memory if so. Much faster, but at the cost of three bytes' overhead per string.
You should look into the storage requirements for various types. Every instruction and storage allocation (i.e., variable/object, etc.) uses up a specific amount of space. In C#, an Int32-type integer object stores integer information to the tune of 4 bytes (32 bits). It might have other information, too, because it is an object, but your character data type probably only requires 1 byte of information. If you have constructs like for or while in your class, those things will take up space, too, because each of those things is telling your class to do something. The class itself requires a number of instructions to be created in memory, which would account for the 8 initial bytes.
Take an assembler language course. You'll learn all you ever wanted to know and then some about why your programs use however much memory or take up however much storage when compiled.