What is serial copy? And why is it implemented like this? - c++

What is serial copy? Is it different from deep-copy and shallow-copy?
According to the wiki entry under Duff's device, it is traditionally implemented as:
do { //count > 0 assumed
*to = *from++; //Note that the 'to' pointer is NOT incremented
} while(--count > 0);
And then it makes a note, saying
Note that to is not incremented because Duff was copying to a single memory-mapped output register.
I didn't really understand this note.
If the to pointer is not incremented, then what is the point of the loop? Why is it not simply implemented as:
*to = from[count-1]; //does it not do the same thing?
I suspect that it has something to do with the definition of serial copy.
How can we allocate memory for to so that the loop would make some difference?

The point of such a copy is that it is not made to normal memory, but to a serial register.
So, each time a write is made to the address of the register (to), the hardware associated with the register will do something like send the bits over a serial link, or push them onto a queue for some other hardware to deal with.
Typically you cannot even read from register addresses like this, so they are very unlike normal memory, and best thought of as an interface to a particular piece of hardware that just happens to be located at a memory address.

http://en.wikipedia.org/wiki/Memory-mapped_I/O#Example
Some platforms have special addresses where a read or write causes the system to perform some I/O. For example, to could be an address that controls the speaker when written. In that case, the loop could, e.g., play a sound, while *to = from[count-1]; would write only the final value and give no useful output.

The to pointer here is "special". On certain hardware you can access IO ports by writing to special memory regions. If you wanted to send a bit pattern over an IO port, and the pattern was already in memory, this is the sort of thing you'd do.
Typically, every write to to changes the output of the IO port. The loop iterates over the pattern and writes each element in turn to the "special" memory.
How you get access to such "special" memory is very platform and implementation specific. Sometimes it's just a question of always writing to a fixed address; normally some platform header then provides a #define or similar to make that address available to you at compile time. Sometimes there's a system call you need to make that tells you the address at which a particular device is mapped.
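To make the difference between the loop and a single assignment concrete, here is a minimal sketch that stands in for the hardware. MockRegister and serial_copy are hypothetical names introduced only for illustration; on a real platform to would be a volatile pointer to the device address rather than a mock object.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a memory-mapped output register: every
// assignment counts as one separate write seen by the "hardware".
struct MockRegister {
    std::vector<short> sent;  // everything the device received, in order

    MockRegister& operator=(short value) {
        sent.push_back(value);
        return *this;
    }
};

// The serial copy loop from the question: `to` never moves, `from` advances.
// With Register = volatile short this is the form used on real hardware.
template <typename Register>
void serial_copy(Register* to, const short* from, int count) {
    assert(count > 0);  // the original loop assumes this
    do {
        *to = *from++;
    } while (--count > 0);
}
```

Calling serial_copy on a MockRegister with three values records three writes in order, whereas *to = from[count-1] would record only the last one. A serial device would therefore transmit the whole pattern in the first case and a single value in the second.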

Related

Writing to a particular Memory location in Embedded Programming

Suppose that, for an embedded program, the hardware is designed in such a way that it performs a certain operation if the memory address 0x8729 is filled with 0xff.
Is there a way to access the memory address 0x8729 and write to it?
Try this:
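#include <stdint.h> // for uint8_t
volatile uint8_t * p_memory = (volatile uint8_t *) 0x8729; // volatile: every access must reach the hardware
const uint8_t value_from_memory = *p_memory; // Reading from memory.
*p_memory = 0xff; // Writing to memory.
The cast is required in C++, and in C it avoids a diagnostic about converting an integer to a pointer. The volatile qualifier stops the compiler from optimizing away or merging accesses to the hardware location.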
Explanation:
Declare a uint8_t pointer and assign to your memory address.
To write, dereference the pointer and assign your value.
To read, dereference the pointer and assign the value to your variable.
I'd say that without more information about the specific Linux implementation we won't know whether this is a problem with a "tiny" Linux implementation that doesn't support memory virtualization for userspace apps, or the normal behavior of a "standard" Linux implementation that hides physical memory addresses from userspace.
I am inclined to believe this is a userspace virtualization issue, in which case you could use the techniques mentioned in this posting. It describes how to use devmem to access physical memory from the command line, and how to use the device file /dev/mem and mmap from within a C program.
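The /dev/mem approach mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: it assumes a Linux system where /dev/mem exists and the process is allowed to open it (usually root only, and many kernels restrict which ranges can be mapped). The names split_address and write_physical_byte are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// mmap needs a page-aligned file offset, so split the physical address into
// the base of its page and the byte offset of the target inside that page.
struct PageSplit {
    std::uintptr_t base;
    std::size_t offset;
};

PageSplit split_address(std::uintptr_t phys, std::size_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);  // power of two
    const std::uintptr_t mask = static_cast<std::uintptr_t>(page_size) - 1;
    return {phys & ~mask, static_cast<std::size_t>(phys & mask)};
}

// Map the page containing `phys` through /dev/mem and write one byte to it.
// Returns false if the device cannot be opened or mapped.
bool write_physical_byte(std::uintptr_t phys, std::uint8_t value) {
    const std::size_t page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const PageSplit where = split_address(phys, page_size);

    const int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) return false;

    void* map = mmap(nullptr, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
                     fd, static_cast<off_t>(where.base));
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (map == MAP_FAILED) return false;

    volatile std::uint8_t* reg =
        static_cast<volatile std::uint8_t*>(map) + where.offset;
    *reg = value;  // volatile so the store is not optimized away

    munmap(map, page_size);
    return true;
}
```

For the address in the question, write_physical_byte(0x8729, 0xff) would map the page at 0x8000 (with 4 KiB pages) and store to offset 0x729 within it. On a bare-metal target with no operating system none of this is needed and the direct pointer write is enough.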

Read from a socket without the associated memcpy from kernel space to user space

In Linux, is there a way to read from a socket while avoiding the implicit memcpy of the data from kernel space to user space?
That is, instead of doing
ssize_t n = read(socket_fd, buffer, count);
which obviously requires the kernel to do a memcpy from the network buffer into my supplied buffer, I would do something like
ssize_t n = fancy_read(socket_fd, &buffer, count);
and on return have buffer pointing to non memcpy()'ed data received from the network.
Initially I thought the AF_PACKET socket family could help, but it cannot.
Nevertheless, it is technically possible, as nothing prevents you from implementing a kernel module that handles a system call returning a user-mapped pointer to kernel data (even if that is not very safe).
There are a couple of questions regarding the call you would like to have:
Memory management. How would you know the memory can still be accessed after fancy_read system call returned?
How would you tell the kernel to eventually free that memory? There would need to be some form of memory management in place, and if you would like the kernel to give you a safe pointer to non-memcpy'ed memory then a lot of changes would need to go into the kernel to enable this feature. Just imagine that none of that data could be freed until you say it can, so the kernel would need to keep track of all of these returned pointers.
These could be done in a lot of ways, so basically yes, this is possible but you need to take many things into consideration.

Read access privilege of last 5 elements

Can I set or disable the read (or write) access privilege of the last several elements of an ordinary array in C/C++? Since I cannot use other processes' memory, I suspect this could be possible, but how? I googled but couldn't find anything.
If I can, how?
Because I want to try something like this:
SetPrivilage(arr,LAST_5_ELEMENTS,false);
try
{
for(int i=0;;i++) //without bound checking. i know its evil. just trying if it is possible
{
arr[i]++; //array is 1-billion elements
}
}
catch(int catch_end_of_array)
{
printf("array-inc complete");
}
Memory:
|start of array |00|01|02|03|04|05|06|07|..|..|1B|start of protected page|xx|xx|xx|xx|xx|xx|xx|xx|xx|xx|xx|xx|xx|
Let's assume I learned how to protect a page. How could I then declare an array right next to the page, so that the array's end point is adjacent to it?
This cannot be done in a portable manner and depends on your operating system. I suspect that it's not really possible anywhere, since memory protection normally operates at a much coarser level (e.g. Linux has the mprotect syscall, but that can only protect entire pages, usually 4k blocks, not arbitrary ranges).
If you protect a page using an operating system interface, then you could position an array so that the array ends where the protection begins. You would have to designate the array by a pointer that you set (e.g., int *p) rather than declare it as an array (e.g., int p[40]), because most C implementations do not give you a way to specify the address of an array.
Because of the granularity of most systems’ memory-protection, you typically can only align one end of an array with the protection boundary. So this is not a generally useful mechanism for protecting array bounds. I have used it for testing purposes, by testing both ends separately:
Align an array so that its end abuts the start of the protected memory. Execute tests.
Align an array so that its start abuts the end of the protected memory. Execute tests.
Thus, if the routines being tested improperly access memory before or after the array, then one of the tests will fail.
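The first of the two test layouts above (array end abutting the start of protected memory) might look like the following sketch. It assumes a POSIX system with mmap, mprotect and MAP_ANONYMOUS; GuardedInts, guarded_alloc and guarded_free are hypothetical names, and the array is reached through a pointer, as the answer describes.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// `data` points at `count` ints whose last element ends exactly where an
// inaccessible guard page begins.
struct GuardedInts {
    int* data;
    std::size_t count;
    void* mapping;       // start of the whole mapped region, for munmap
    std::size_t length;  // size of the region, including the guard page
};

GuardedInts guarded_alloc(std::size_t count) {
    assert(count > 0);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t bytes = count * sizeof(int);
    const std::size_t data_pages = (bytes + page - 1) / page;  // round up
    const std::size_t length = (data_pages + 1) * page;        // + guard page

    void* raw = mmap(nullptr, length, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) return {nullptr, 0, nullptr, 0};

    // Make the final page inaccessible: any read or write there faults.
    char* guard = static_cast<char*>(raw) + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(raw, length);
        return {nullptr, 0, nullptr, 0};
    }

    // Place the array so that its end coincides with the guard page. The
    // page size is a multiple of sizeof(int), so alignment is preserved.
    int* data = reinterpret_cast<int*>(guard - bytes);
    return {data, count, raw, length};
}

void guarded_free(const GuardedInts& g) {
    if (g.mapping != nullptr) munmap(g.mapping, g.length);
}
```

With this layout, data[count - 1] is the last accessible int and an access to data[count] would fault immediately. The start of the array is generally not aligned to a page boundary, which is why the other end has to be tested with a separate layout.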
I am assuming your arr is a POD (plain old data) array. You could make it in C++ a class and overload the operator[] to do runtime index checking.
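A minimal sketch of the operator[] approach, assuming a hypothetical CheckedArray class backed by std::vector. It reproduces the loop from the question, but the end of the array is detected by an ordinary C++ exception instead of a memory fault.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

class CheckedArray {
public:
    explicit CheckedArray(std::size_t n) : data_(n, 0) {}

    // Runtime index check: throws instead of touching memory out of bounds.
    int& operator[](std::size_t i) {
        if (i >= data_.size()) throw std::out_of_range("CheckedArray index");
        return data_[i];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<int> data_;
};

// Increment every element without an explicit bound in the loop; the
// checked operator[] ends the loop by throwing. Returns the element count.
std::size_t increment_all(CheckedArray& arr) {
    std::size_t i = 0;
    try {
        for (;; ++i) arr[i]++;
    } catch (const std::out_of_range&) {
        // reached the end of the array
    }
    assert(i == arr.size());
    return i;
}
```

The check costs a comparison per access, so for a billion-element array an ordinary bounded loop is still likely to be the faster choice.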
You usually cannot do what you want, and if you could, it would be strongly implementation and operating system dependent.
On Linux, access permission to data is tied to the virtual memory mapping, which is controlled through the mmap(2), munmap(2) and mprotect(2) system calls. These calls work at page-level granularity (a page is usually 4 Kbytes, and 4 Kbytes aligned).
You could make naughty tricks like mmap-ing a large region, mprotect its last page, and do unportable pointer arithmetic to compute the arr pointer. This is disgusting, so don't do that. And catching SIGSEGV with dirty mmap-based tricks like this is not very portable and probably not very efficient. And a signal handler cannot throw C++ exceptions.

Windows: pointer uniqueness

I needed a quick unique ID in one of my classes to differentiate one process from another. I decided to use the address of the instance to do so. I ended up with something like this (quintptr is a Qt-defined integer type for storing addresses with the correct size for the platform):
Foo::Foo()
: _id(reinterpret_cast<quintptr>(this))
{
...
}
The idea is to compare the output of two different processes of the same exe. On Vista (my dev machine) there's no problem. But on XP, the value of _id is the same (!) in the two processes.
Can anyone explain why that is? And is it a good idea to use pointers like that (I thought so, but I'm not so sure anymore)?
Thanks.
Every process gets its own address space. On XP, those address spaces are all laid out the same way. Therefore it's very common to see what you saw: two objects that have the same address, but in two different address spaces.
It turns out that this contributes to security risks. Attackers were able to guess where vulnerable objects would be in memory, and exploit those. Vista randomizes address spaces (ASLR) which means that two processes are far more likely to put the same object at different addresses.
For your case, using pointers like that is not a smart idea. Just use the process ID.
The reason is that each process has its own address space, and if two processes do the same things they simply use the same virtual addresses; even heap allocations may be made at the same virtual addresses.
You could call GetCurrentProcessId() once and store the result somewhere so that further retrieval is very fast. The process id persists and is unique for the lifetime of the process.
Each process gets its own address space. Unless something like ASLR kicks in, the memory layouts of two processes stemming from the same executable are likely to be very similar, if not identical.
So your idea is not a good one. Using the process ID sounds like a saner approach here, but keep in mind that those can be recycled too.
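As a rough sketch of the suggestion to fetch the process ID once and reuse it: process_id below is a hypothetical helper name, GetCurrentProcessId() is the Windows call named above, and the getpid() branch is an assumption added so that the same code builds on POSIX systems.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

// Returns the ID of the current process. The value is queried on the first
// call and cached in a function-local static for later calls.
std::uint64_t process_id() {
    static const std::uint64_t cached =
#ifdef _WIN32
        static_cast<std::uint64_t>(GetCurrentProcessId());
#else
        static_cast<std::uint64_t>(getpid());
#endif
    return cached;
}
```

The constructor from the question could then initialize _id with process_id() instead of the this pointer. As noted above, the operating system may reuse an ID after a process exits, so it only distinguishes processes that are alive at the same time.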

Is 0x000001, 0x000002, etc. ever a valid memory address in application level programming?

Or are those addresses reserved for the operating system and things like that?
Thanks.
While it's unlikely that 0x00000001, etc. will be valid pointers (especially if you use odd numbers on many processors) using a pointer to store an integer value will be highly system dependent.
Are you really that strapped for space?
Edit:
You could make it portable like this:
char *base = malloc(NUM_MAGIC_VALUES);
#define MAGIC_VALUE_1 (base + 0)
#define MAGIC_VALUE_2 (base + 1)
...
Well, the OS is going to give each program its own virtual memory space, so when the application references address 0x0000001 or 0x0000002, it's actually referencing some other physical memory address. I would take a look at paging and virtual memory. So a program will never have access to memory the operating system is using. However, I would stay away from manually assigning a memory address to a pointer rather than using malloc(), because those addresses might be text or reserved space.
This depends on operating system layout. For User space applications running in general purpose operating systems, these are inaccessible addresses.
This problem is related to an architecture's virtual address space. Have a look at this: http://web.cs.wpi.edu/~cs3013/c07/lectures/Section09.1-Intel.pdf
Of course, you can do this:
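int* myPointer1 = reinterpret_cast<int*>(0x000001); // C++ needs an explicit cast from integer to pointer
int* myPointer2 = reinterpret_cast<int*>(0x000032);
But do not try to dereference these addresses, because that will end in an Access Violation.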
The OS gives you the memory. By the way, these addresses are just virtual: the OS hides the details and shows it to you as one big, continuous stripe. Maybe the 0x000000-0x211501 part is on a webserver and you read/write it over the network, and the remainder is on your hard disk. Physical memory is just an illusion from your current viewpoint.
You tagged your question C++. I believe that in C++ the pointer value 0 is reserved as the null pointer and is normally referred to as NULL. Other than that you cannot assume anything. If you want to ask about a particular implementation on a particular OS then that would be a different question.
It depends on the compiler/platform, but many older compilers actually have something like the string "(null)" at address 0x00000000. This is a debug feature because that string will show up if a NULL pointer is ever used by accident. On newer systems like Windows, a pointer to this area will most likely cause a processor exception.
I can pretty much guarantee that address 1 and 2 will either be in use or will raise a processor exception if they're ever used. You can store any value you like in a pointer. But if you try and dereference a pointer with a random value, you're definitely asking for problems.
How about a nice integer instead?
Although the standard requires that NULL is 0, a pointer that is NULL does not have to consist of all zero bits, although it will do in many implementations. That is also something you have to beware of if you memset a POD struct that contains some pointers, and then rely on the pointers holding "NULL" as their value.
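A small sketch of the portable alternative to memset, assuming a hypothetical Node struct. Value-initialization sets pointer members to the null pointer value whatever its bit pattern is, whereas memset only guarantees all-zero bits.

```cpp
struct Node {
    int value;
    Node* next;
};

// Value-initialization: `value` becomes 0 and `next` becomes a null pointer,
// even on an implementation where null is not represented by all-zero bits.
Node make_empty_node() {
    Node n{};
    return n;
}
```

On most current platforms both approaches produce the same bytes, so the difference only matters for code that has to be strictly portable.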
If you want to use the same space as a pointer you could use a union, but I guess what you really want is something that doubles up as a pointer and something else, and you know it is not a pointer to a real address if it contains low-numbered values. (With a union you still need to know which type you have).
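The union idea, together with the tag needed to know which member is active, could be sketched like this. Payload and PointerOrCode are hypothetical names; the point is that the "magic" values are stored as plain integers next to a flag instead of being disguised as low-numbered addresses.

```cpp
#include <cassert>

struct Payload { int value; };  // hypothetical pointee type

// Holds either a real pointer or a small integer code, never both.
class PointerOrCode {
public:
    static PointerOrCode from_pointer(Payload* p) {
        PointerOrCode r;
        r.is_code_ = false;
        r.u_.ptr = p;
        return r;
    }

    static PointerOrCode from_code(int code) {
        PointerOrCode r;
        r.is_code_ = true;
        r.u_.code = code;
        return r;
    }

    bool is_code() const { return is_code_; }

    int code() const {
        assert(is_code_);  // only valid when a code is stored
        return u_.code;
    }

    Payload* pointer() const {
        assert(!is_code_);  // only valid when a pointer is stored
        return u_.ptr;
    }

private:
    PointerOrCode() : is_code_(true) { u_.code = 0; }

    bool is_code_;  // the tag: records which union member is active
    union {
        Payload* ptr;
        int code;
    } u_;
};
```

The tag costs a few extra bytes per value, but nothing here depends on which addresses the platform considers valid.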
I'd be interested to know what the magic other value is really being used for. Is this some lazy-evaluation issue where the pointer gives an indication of how to load the data when it is not yet loaded and a genuine pointer when it is?
Yes, on some platforms address 0x00000001 and 0x00000002 are valid addresses. On other platforms they are not.
In the embedded systems world, the validity depends on what resides at those locations. Some platforms may put interrupt or reset vectors at those addresses. Other embedded platforms may place Position Independent executable code there.
There is no standard specification for the layout of addresses. One cannot assume anything. If you want your code to be portable then forget about accessing specific addresses and leave that to the OS.
Also, the structure of a pointer is platform dependent. So is the conversion of the value in a pointer to a physical address. Some systems may only decode a portion of the pointer, others use the entire pointer value. Some may use indirection (a.k.a. virtual addressing) to access real objects. Still no standardization here either.