Use of *(char *) in C++ - c++

I don't understand what using this syntax is for: *(char *). What does it do and can it be used with other data types like int?
void function(int a)
{
*(char*)(0x12345 + (0x3980 * a)) = 0xFF;
}

*(char *)hoge means: interpret hoge as a pointer to char, and access (read or write) the char that hoge points to.
It can be used with other data types like int.
One usage example: comparison function for qsort
int cmp(const void *x, const void *y) {
int a = *(int *)x;
int b = *(int *)y;
if (a > b) return 1;
if (a < b) return -1;
return 0;
}
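As a minimal sketch of how such a comparison function is used, the block below sorts an array of int with qsort. The names cmp_int and sort_ints are made up for this illustration, and the comparison is written as a subtraction of two boolean results, which gives the same -1/0/1 outcome as the if chain above.

```c
#include <stdlib.h>

/* Comparison callback in the style shown above: qsort hands over void
   pointers, so each one is cast back to the element type before it is
   dereferenced. */
static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x;
    int b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Hypothetical helper: sort n ints in ascending order. */
void sort_ints(int *values, size_t n)
{
    qsort(values, n, sizeof values[0], cmp_int);
}
```

Calling sort_ints on an array holding 3, -1, 2 leaves it ordered as -1, 2, 3.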

I don't know where you got your example from, and it doesn't make much sense to me. Anyway, when you put "*" in front of a cast like (char*), you are telling the compiler to cast the value computed in the parentheses that follow, (0x12345 + (0x3980 * a)), into a pointer to char, and then to change the value stored at that location in memory to 0xFF.
In other words, you picked an arbitrary location in memory, told the compiler to treat that location as if it contained a char ("*(char*)"), and stored the value 0xFF there.
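To make this concrete without touching an arbitrary address, here is a rough sketch that applies the same cast-and-dereference pattern to addresses the caller supplies. The helper names poke_byte and peek_int are invented for this example; on a hosted system they are only safe when the address refers to memory the program really owns.

```c
#include <stdint.h>

/* Hypothetical helper: store one byte at a numeric address. */
void poke_byte(uintptr_t address, unsigned char value)
{
    *(volatile unsigned char *)address = value;
}

/* Same pattern with a wider type: read an int from a numeric address.
   The address must really hold a suitably aligned int. */
int peek_int(uintptr_t address)
{
    return *(volatile int *)address;
}
```

For instance, poke_byte((uintptr_t)&buf[2], 0xFF) changes only the third element of an unsigned char array buf.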

The question has been answered already, but here is a "real world" example where this kind of syntax is used.
In (low-level) embedded software development you often have to interface with an MCU's hardware peripherals. These peripherals are controlled by registers which are mapped to fixed memory addresses.
When an MCU has several instances of the same peripheral (e.g. 3 ADCs), it will usually have 3 identical register sets mapped right after each other.
When interfacing, you want to work with the addresses directly but add an abstraction. A simple API for control may look like this:
.H file
/* Header file, defines addresses for specific chip*/
#define ADC_BASE_ADDRESS 0x00001000 /* Start address of first register of ADC0 */
#define SIZEOF_ADC_REGISTERS 0x00000020 /* Size of all ADC0 registers */
#define ADC_REG_CFG_OFFSET 0x00 /* ADC Config register offset */
#define ADC_REG_BLA_BLA_OFFSET 0x04 /* ADC Config register offset */
/* etc, etc, etc*/
#define ADC_CFG_ENABLE 0x01 /* Enable command */
.C file
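#include <stdint.h>
#include "chip.h"
void adc_enable(int adc){
/* volatile: the write is a hardware side effect and must not be optimized away */
*(volatile uint32_t *)(ADC_BASE_ADDRESS + ADC_REG_CFG_OFFSET + (adc * SIZEOF_ADC_REGISTERS)) = ADC_CFG_ENABLE;
}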
/* Calling code */
adc_enable(0);
adc_enable(3);
Do note, as mentioned this is typically done in C, not so much in C++.
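The address arithmetic in adc_enable can be checked on its own. The sketch below reuses the illustrative constants from the header above (they do not describe a real chip) and splits the computation into two hypothetical helpers, adc_cfg_address and adc_cfg_reg.

```c
#include <stdint.h>

/* Values copied from the header above; illustrative, not from a real chip. */
#define ADC_BASE_ADDRESS     0x00001000u
#define SIZEOF_ADC_REGISTERS 0x00000020u
#define ADC_REG_CFG_OFFSET   0x00u

/* Hypothetical helper: numeric address of the config register of ADC number adc. */
uintptr_t adc_cfg_address(unsigned adc)
{
    return ADC_BASE_ADDRESS + ADC_REG_CFG_OFFSET + adc * SIZEOF_ADC_REGISTERS;
}

/* Hypothetical accessor: the same register as a pointer to a volatile
   32-bit value. Only dereference it on hardware that has such a register. */
volatile uint32_t *adc_cfg_reg(unsigned adc)
{
    return (volatile uint32_t *)adc_cfg_address(adc);
}
```

With these values, ADC0, ADC1 and ADC2 get their config registers at 0x1000, 0x1020 and 0x1040.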

Related

I2C byte write function: how does this code work? I can't completely understand it

Can someone explain to me how these lines work
template <class T>
...const T& value)...
.
.
.
const uint8_t* p = (const uint8_t*)(const void*)&value;
on this code (i2c byte write for eeprom)
template <class T>
uint16_t writeObjectSimple(uint8_t i2cAddr, uint16_t addr, const T& value){
const uint8_t* p = (const uint8_t*)(const void*)&value;
uint16_t i;
for (i = 0; i < sizeof(value); i++){
Wire.beginTransmission(i2cAddr);
Wire.write((uint16_t)(addr >> 8)); // MSB
Wire.write((uint16_t)(addr & 0xFF));// LSB
Wire.write(*p++);
Wire.endTransmission();
addr++;
delay(5); //max time for writing in 24LC256
}
return i;
}
template <class T>
uint16_t readObjectSimple(uint8_t i2cAddr, uint16_t addr, T& value){
uint8_t* p = (uint8_t*)(void*)&value;
uint8_t objSize = sizeof(value);
uint16_t i;
for (i = 0; i < objSize; i++){
Wire.beginTransmission (i2cAddr);
Wire.write((uint16_t)(addr >> 8)); // MSB
Wire.write((uint16_t)(addr & 0xFF));// LSB
Wire.endTransmission();
Wire.requestFrom(i2cAddr, (uint8_t)1);
if(Wire.available()){
*p++ = Wire.read();
}
addr++;
}
return i;
}
I think these lines work like pointers?
I can't understand how the code stores each type of data correctly when I do this:
struct data{
uint16_t yr;
uint8_t mont;
uint8_t dy;
uint8_t hr;
uint8_t mn;
uint8_t ss;
};
.
.
.
data myString;
writeObjectSimple(0x50,0,myString);
And then recover the values correctly using
data myStringRead;
readObjectSimple(0x50,0,myStringRead)
Does the I2C byte write function detect some special character between the data types in order to store each one in the correct place?
Thanks.
First I have to state that this code has been written by a person not fully familiar with the differences between how C++ and C deal with pointer types. My impression is that this person has a strong C background and was simply trying to stop a C++ compiler from throwing warnings.
Let's break down what this line of code does
const uint8_t* p = (const uint8_t*)(const void*)&value;
The intent here is to take a buffer of an arbitrary type – which we don't even know here, because it's a template type – and treat it as if it were a buffer of unsigned 8 bit integers. The reason for that is, that later on the contents of this buffer are to be sent over a wire bit by bit (this is called "bit banging").
In C the way to do this would have been to write
const uint8_t* p = (const void*)&value;
This works because in C it is perfectly valid to assign a void*-typed pointer to a non-void pointer and vice versa. The important rule set by the C language, however, is that (technically) when you convert a void* pointer to a non-void type, the void* pointer must have been obtained by taking the address (& operator) of an object of the same type. In practice, however, implementations allow casting a void*-typed pointer to any type that is alignment-compatible with the original object, and on most (but not all!) architectures uint8_t buffers may be aligned to any address.
However in C++ this back-and-forth assignment of void* pointers is not allowed implicitly. C++ requires an explicit cast (which is also why you can often see C++ programmers writing in C code something like struct foo *p = (struct foo*)malloc(…)).
So what you'd write in C++ is
const uint8_t* p = (const uint8_t*)&value;
and that actually works and doesn't throw any warnings. However some static linter tools will frown upon it. So the first cast (you have to read casts from right to left) first discards the original typing by casting to void* to satisfy the linter, then the second cast casts to the target type to satisfy the compiler.
The proper C++ idiom however would have been to use a reinterpret_cast which most linters will also accept
const uint8_t* p = reinterpret_cast<const uint8_t*>(&value);
However, all this casting still invokes implementation-defined behavior, and when it comes to bit banging you will be hit by endianness issues (at the least).
Bit banging itself works by extracting each bit of a value one by one and tickling the wires that go in and out of a processor's port accordingly. The operators used here are >> to shift bits around and binary & to "select" particular bits.
So for example when you see a statement like
(v & (1<<x))
then what it does is check whether bit number x is set in the variable v. You can also select whole subsets of the bits in a variable by masking (= applying the binary & operator, not to be confused with the unary "address of" operator that yields a pointer).
Similarly you can use the | operator to "overlay" the bits of several variables onto each other. Combined with the shift operators you can use this to build the contents of a variable bit-by-bit (with the bits coming in from a port).
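The masking and shifting operations described above can be sketched as three small functions. The function names are made up for this example, and assemble_byte assumes the bits arrive most significant bit first, which is an assumption of the sketch rather than something the code in the question does.

```c
#include <stdint.h>

/* Nonzero result means bit number x of v is set. */
int bit_is_set(uint8_t v, unsigned x)
{
    return (v & (1u << x)) != 0;
}

/* Keep only the low four bits of v (masking a subset of the bits). */
uint8_t low_nibble(uint8_t v)
{
    return v & 0x0Fu;
}

/* Build a byte from eight incoming bits, most significant bit first.
   Each bits[i] is expected to be 0 or 1. */
uint8_t assemble_byte(const uint8_t bits[8])
{
    uint8_t value = 0;
    for (int i = 0; i < 8; i++) {
        value = (uint8_t)((value << 1) | (bits[i] & 1u));
    }
    return value;
}
```

For example, the bit sequence 1,0,1,0,0,1,0,1 assembles to 0xA5.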
The target device is an I2C EEPROM, so the general form for writing is to send the destination address followed by some data. To read data from the EEPROM, you write the source address and then switch to a read mode to clock out the data.
First of all, the line:
const uint8_t* p = (const uint8_t*)(const void*)&value;
is simply taking the templated type T and casting away its type, and converting it to a byte array (uint8_t*). This pointer is used to advance one byte at a time through the memory containing value.
In the writeObjectSimple method, it first writes the 16-bit destination address (in big-endian format) followed by a data byte (where p is a data pointer into value):
Wire.write(*p++);
This writes the current byte from value and moves the pointer along one byte. It repeats this for however many bytes are in the type of T. After writing each byte, the destination address is also incremented, and it repeats.
When you code:
data myString;
writeObjectSimple(0x50,0,myString);
the templated writeObjectSimple will be instantiated over the data type, and will write its contents (one byte at a time) starting at address 0, to the device with address 0x50. It uses sizeof(data) to know how many bytes to iterate over.
The read operation works very much the same way, but writes the source address and then requests a read (which is implicit in the LSB of the I2C address) and reads one byte at a time back from the device.
Does the I2C byte write function detect some special character between the data types in order to store each one in the correct place?
Not really, each transaction simply contains the address followed by the data.
[addr_hi] [addr_lo] [data]
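The reason no separators are needed can be shown without any I2C hardware. In the sketch below a plain array stands in for the EEPROM (an assumption of this example), and write_object and read_object are hypothetical C counterparts of the two templates: they copy sizeof(object) raw bytes out and back in the same order, so the struct members land where they started.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the EEPROM: a plain array (assumption, not real I2C). */
static uint8_t fake_eeprom[256];

/* Copy size bytes of any object into the fake EEPROM, one byte at a time. */
size_t write_object(uint16_t addr, const void *object, size_t size)
{
    const uint8_t *p = (const uint8_t *)object;
    size_t i;
    for (i = 0; i < size; i++) {
        fake_eeprom[addr++] = *p++;
    }
    return i;
}

/* Copy size bytes back from the fake EEPROM into the object. */
size_t read_object(uint16_t addr, void *object, size_t size)
{
    uint8_t *p = (uint8_t *)object;
    size_t i;
    for (i = 0; i < size; i++) {
        *p++ = fake_eeprom[addr++];
    }
    return i;
}
```

This only round-trips correctly when the reader uses the same struct layout (member order, padding, byte order) as the writer, which holds when the same program on the same board does both.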
Having explained all that, operating one byte at a time is a very inefficient way of achieving this. The device is a 24LC256, and the 24LC family of EEPROMs supports sequential writes (up to a page in size) in a single I2C transaction. So you can easily send the entire data structure in one transfer, and avoid having to retransmit the address (2 bytes for every byte of data). Have a look in the datasheet for the full details.
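One detail to handle with page writes is that a single transaction must not cross a page boundary. The sketch below only does the bookkeeping, with no I2C calls; it assumes the 64-byte page size given in the 24LC256 datasheet, and the helper names are invented for this example. The I2C library's own transmit buffer may impose a smaller limit per transaction.

```c
#include <stddef.h>
#include <stdint.h>

/* Page size of the 24LC256 per its datasheet; check your own part. */
#define EEPROM_PAGE_SIZE 64u

/* Hypothetical helper: how many bytes may go into the next write
   transaction without crossing a page boundary. */
size_t next_chunk_len(uint16_t addr, size_t remaining)
{
    size_t room = EEPROM_PAGE_SIZE - (addr % EEPROM_PAGE_SIZE);
    return remaining < room ? remaining : room;
}

/* Count the transactions needed to write len bytes starting at addr. */
size_t count_transactions(uint16_t addr, size_t len)
{
    size_t count = 0;
    while (len > 0) {
        size_t n = next_chunk_len(addr, len);
        addr = (uint16_t)(addr + n);
        len -= n;
        count++;
    }
    return count;
}
```

A 7-byte struct written at address 0 fits in one transaction, whereas 10 bytes starting at address 60 need two (4 bytes, then 6).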

How to get the size of the trailing padding of a struct or class?

sizeof can be used to get the size of a struct or class. offsetof can be used to get the byte offset of a field within a struct or class.
Similarly, is there a way to get the size of the trailing padding of a struct or class? I'm looking for a way that doesn't depend on the layout of the struct, e.g. requiring the last field to have a certain name.
For the background, I'm writing out a struct to disk but I don't want to write out the trailing padding, so the number of bytes I need to write is the sizeof minus the size of the trailing padding.
CLARIFICATION: The motivation for not writing the trailing padding is to save output bytes. I'm not trying to save the internal padding as I'm not sure about the performance impact of non-aligned access and I want the writing code to be low-maintenance such that it doesn't need to change if the struct definition changes.
The way a compiler can pad fields in a structure is not strictly defined in the standard, so it's a kind of free choice and implementation dependent.
If a data aggregate has to be interchanged, the only solution is to avoid any padding.
This is normally accomplished using a #pragma pack(1). This pragma instructs the compiler to pack all fields together on a 1 byte boundary. It will slow the access on some processors, but will make the structure compact and well defined on any system, and, of course, without any padding.
pragma pack or equivalent is the canonical way to do that. Apart from that I can only think of a macro, if the number of members is fixed or the maximum number is low, like
$ cat struct-macro.c && echo
#include <stdio.h>
#include <stddef.h> /* offsetof */
#define ASSEMBLE_STRUCT3(Sname, at, a, bt, b, ct, c) struct Sname {at a; bt b; ct c; }; \
int Sname##_trailingbytes() { return sizeof(struct Sname) - offsetof(Sname, c) - sizeof(ct); }
ASSEMBLE_STRUCT3(S, int, i1, int, i2, char, c)
int main()
{
printf("%d\n", S_trailingbytes());
}
$ g++ -Wall -o struct-macro struct-macro.c && ./struct-macro
3
$
I wonder if something fancy can be done with a variadic template class in C++. But I can't quite see how the class/structure can be defined and the offset function/constant be provided without a macro again, which would defeat the purpose.
The padding could be anywhere inside the struct, except at the very beginning. There is no standard way to disable padding, although some flavour of #pragma pack is a common non-standard extension.
What you actually should do if you want a robust, portable solution, is to write a serialization/de-serialization routine for your struct.
Something like this:
typedef struct
{
int x;
int y;
...
} mytype_t;
void mytype_serialize (uint8_t* restrict dest, const mytype_t* restrict src)
{
memcpy(dest, &src->x, sizeof(src->x)); dest += sizeof(src->x);
memcpy(dest, &src->y, sizeof(src->y)); dest += sizeof(src->y);
...
}
And similarly for the other way around.
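A possible shape for both directions, assuming a cut-down version of the struct with just the two int members x and y (the elided members are left out, and the restrict qualifiers are dropped for brevity). MYTYPE_SERIALIZED_SIZE and mytype_deserialize are names introduced for this sketch. Note that the bytes are still in the host's byte order, so this removes padding but does not by itself make the format portable between machines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical two-member version of the struct above. */
typedef struct {
    int x;
    int y;
} mytype_t;

/* Size of the serialized form: the members only, no padding. */
#define MYTYPE_SERIALIZED_SIZE (sizeof(int) + sizeof(int))

/* Copy the members one after another into dest. */
void mytype_serialize(uint8_t *dest, const mytype_t *src)
{
    memcpy(dest, &src->x, sizeof src->x); dest += sizeof src->x;
    memcpy(dest, &src->y, sizeof src->y);
}

/* Read the members back in the same order. */
void mytype_deserialize(mytype_t *dest, const uint8_t *src)
{
    memcpy(&dest->x, src, sizeof dest->x); src += sizeof dest->x;
    memcpy(&dest->y, src, sizeof dest->y);
}
```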
Please note that padding is there for a reason. If you get rid of it, you sacrifice execution speed in favour of memory size.
EDIT
The weird way to do it, just by skipping trailing padding:
size_t mytype_serialize (uint8_t* restrict dest, const mytype_t* restrict src)
{
size_t size = offsetof(mytype_t, y); // assuming y is last object
memcpy(dest, src, size);
memcpy(dest+size, &src->y, sizeof(src->y));
size += sizeof(src->y);
return size;
}
You need to know the size and do something meaningful with it, because otherwise you can't know the size of the stored data when you need to read it back.
This is a possibility:
#define BYTES_AFTER(st, last) (sizeof (st) - offsetof(st, last) - sizeof ((st*)0)->last)
As is this (C99) approach:
#define BYTES_AFTER(st, last) (sizeof (st) - offsetof(st, last) - sizeof (st){0}.last)
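A short usage sketch for the first macro, with a made-up struct sample and a hypothetical helper sample_payload_size that returns the number of bytes worth writing. The name of the last member still has to be passed in by hand, which is exactly the limitation the question hoped to avoid.

```c
#include <stddef.h>

/* The first macro from above, unchanged. */
#define BYTES_AFTER(st, last) (sizeof (st) - offsetof(st, last) - sizeof ((st*)0)->last)

/* Example struct (hypothetical): typically padded after c so that
   arrays of struct sample keep their int members aligned. */
struct sample {
    int i1;
    int i2;
    char c;
};

/* Bytes worth writing to disk: everything up to the end of the last member. */
size_t sample_payload_size(void)
{
    return sizeof(struct sample) - BYTES_AFTER(struct sample, c);
}
```

On a typical platform with 4-byte int this gives 9 bytes out of a 12-byte struct, but the exact numbers depend on the implementation.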
Another way is just declaring your structs packed via some non-standard #pragma or similar. This would also take care of padding in the middle.
Neither of those two is pretty, though. Sharing between different systems might not work because of different alignment requirements. And using non-standard extensions is, well, non-standard.
Just do the serialization yourself. Maybe something like that:
unsigned char buf[64];
mempcpy(mempcpy(mempcpy(buf,
&st.member_1, sizeof st.member_1),
&st.member_2, sizeof st.member_2),
&st.member_3, sizeof st.member_3);
mempcpy is a GNU extension, if it's not available, just define it yourself:
static inline void * mempcpy (void *dest, const void *src, size_t len) {
return (char*)memcpy(dest, src, len) + len;
}
IMO, it makes code like that easier to read.

Pointers Casting Endianness

#include "stdio.h"
typedef struct CustomStruct
{
short Element1[10];
}CustomStruct;
void F2(char* Y)
{
*Y=0x00;
Y++;
*Y=0x1F;
}
void F1(CustomStruct* X)
{
F2((char *)X);
printf("s = %x\n", (*X).Element1[0]);
}
int main(void)
{
CustomStruct s;
F1(&s);
return 0;
}
The above C code prints 0x1f00 when compiled and run on my PC.
But when I flash it to an embedded target (microcontroller) and debug it, I find that
(*X).Element1[0] = 0x001f.
1- Why are the results different on the PC and on the embedded target?
2- What can I modify in this code so that it prints 0x001f in the PC case,
without changing the core of the code (by adding a compiler option or something, maybe)?
shorts are typically two bytes and 16 bits. When you say:
short s;
((char*)&s)[0] = 0x00;
((char*)&s)[1] = 0x1f;
This sets the first of those two bytes to 0x00 and the second of those two bytes to 0x1f. The thing is that C++ doesn't specify what setting the first or second byte does to the value of the overall short, so different platforms can do different things. In particular, some platforms say that setting the first byte affects the 'most significant' bits of the short's 16 bits and setting the second byte affects the 'least significant' bits of the short's 16 bits. Other platforms say the opposite: that setting the first byte affects the least significant bits and setting the second byte affects the most significant bits. These two platform behaviors are referred to as big-endian and little-endian respectively.
The solution to getting consistent behavior independent of these differences is to not access the bytes of the short this way. Instead you should simply manipulate the value of the short using methods that the language does define, such as with bitwise and arithmetic operators.
short s;
s = (0x1f << 8) | (0x00 << 0); // set the most significant bits to 0x1f and the least significant bits to 0x00.
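As a small sketch of the same idea, make_u16 below composes a 16-bit value purely with shifts and ORs, and is_little_endian inspects the first byte of a known value to report which layout the current machine uses. Both function names are invented for this illustration.

```c
#include <stdint.h>

/* Compose a 16-bit value from two bytes using only arithmetic on values,
   so the result is the same on every platform. */
uint16_t make_u16(uint8_t high, uint8_t low)
{
    return (uint16_t)(((unsigned)high << 8) | low);
}

/* Report how this machine lays out a 16-bit value in memory:
   returns 1 if the first byte holds the least significant bits. */
int is_little_endian(void)
{
    uint16_t probe = 0x0001;
    return *(const unsigned char *)&probe == 0x01;
}
```

make_u16(0x1f, 0x00) yields 0x1f00 everywhere, while the result of is_little_endian differs between the PC and the target described in the question.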
The problem is that, for many reasons, I can only change the body of the function F2. I cannot change its prototype. Is there a way to find the size of what Y points to before it was cast, or something?
You cannot determine the original type and size using only the char*. You have to know the correct type and size through some other means. If F2 is never called except with CustomStruct then you can simply cast the char* back to CustomStruct like this:
void F2(char* Y)
{
CustomStruct *X = (CustomStruct*)Y;
X->Element1[0] = 0x1F00;
}
But remember, such casts are not safe in general; you should only cast a pointer back to what it was originally cast from.
The portable way is to change the definition of F2:
void F2(short * p)
{
*p = 0x1F;
}
void F1(CustomStruct* X)
{
F2(&X->Element1[0]);
printf("s = %x\n", (*X).Element1[0]);
}
When you reinterpret an object as an array of chars, you expose the implementation details of the representation, which is inherently non-portable and... implementation-dependent.
If you need to do I/O, i.e. interface with a fixed, specified, external wire format, use functions like htons and ntohs to convert and leave the platform specifics to your library.
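A minimal sketch of that approach for a 16-bit value, assuming a POSIX system where htons and ntohs come from arpa/inet.h. The helpers store_wire16 and load_wire16 are hypothetical names; they write and read the value in network (big-endian) order regardless of the host's byte order.

```c
#include <arpa/inet.h>  /* htons, ntohs (POSIX) */
#include <stdint.h>
#include <string.h>

/* Store value in wire (big-endian) order, whatever the host byte order is. */
void store_wire16(unsigned char *dest, uint16_t value)
{
    uint16_t wire = htons(value);
    memcpy(dest, &wire, sizeof wire);
}

/* Load a 16-bit value that was stored in wire order. */
uint16_t load_wire16(const unsigned char *src)
{
    uint16_t wire;
    memcpy(&wire, src, sizeof wire);
    return ntohs(wire);
}
```

Storing 0x001F this way always produces the byte 0x00 followed by 0x1F.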
It appears that the PC is little endian and the target is either big-endian, or has 16-bit char.
There isn't a great way to modify the C code on the PC, unless you replace your char * references with short * references, and perhaps use macros to abstract the differences between your microcontroller and your PC.
For example, you might make a macro PACK_BYTES(hi, lo) that packs two bytes into a short the same way, regardless of the machine's endianness. Your example becomes:
#include "stdio.h"
#define PACK_BYTES(hi,lo) (((short)((hi) & 0xFF)) << 8 | (0xFF & (lo)))
typedef struct CustomStruct
{
short Element1[10];
}CustomStruct;
void F2(short* Y)
{
*Y = PACK_BYTES(0x00, 0x1F);
}
void F1(CustomStruct* X)
{
F2(&(X->Element1[0]));
printf("s = %x\n", (*X).Element1[0]);
}
int main(void)
{
CustomStruct s;
F1(&s);
return 0;
}

Pointer arithmetic and portability

I'm writing an application and I had to do some pointer arithmetic. However, this application will be running on different architectures! I was not really sure if this would be problematic, but after reading this article, I thought that I must change it.
Here was my original code that I didn't like much:
class Frame{
/* ... */
protected:
const u_char* const m_pLayerHeader; // Where header of this layer starts
int m_iHeaderLength; // Length of the header of this layer
int m_iFrameLength; // Header + payloads length
};
/**
* Get the pointer to the payload of the current layer
* @return A pointer to the payload of the current layer
*/
const u_char* Frame::getPayload() const
{
// FIXME : Pointer arithmetic, portability!
return m_pLayerHeader + m_iHeaderLength;
}
Pretty bad isn't it! Adding an int value to a u_char pointer! But then I changed to this:
const u_char* Frame::getPayload() const
{
return &m_pLayerHeader[m_iHeaderLength];
}
I think now, the compiler is able to say how much to jump! Right? Is the operation [] on array considered as pointer arithmetic? Does it fix the portability problem?
p + i and &p[i] are synonyms when p is a pointer and i a value of integral type. So much that you can even write &i[p] and it's still valid (just as you can write i + p).
The portability issue in the example you link was coming from sizeof(int) varying across platforms. Your code is just fine, assuming m_iHeaderLength is the number of u_chars you want to skip.
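Both points can be checked with a few lines. In this sketch, same_address compares the three spellings for an unsigned char pointer, and int_step_in_bytes shows that arithmetic on an int pointer is scaled by sizeof(int), whatever that happens to be on the platform. The function names are made up for the example.

```c
#include <stddef.h>

/* All three spellings produce the same address. */
int same_address(const unsigned char *p, int i)
{
    return (p + i == &p[i]) && (&p[i] == &i[p]);
}

/* Distance in bytes between element 0 and element i of an int array.
   Pointer arithmetic scales by sizeof(int), whatever that is here. */
size_t int_step_in_bytes(const int *p, int i)
{
    return (size_t)((const char *)(p + i) - (const char *)p);
}
```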
In your code you are advancing the m_pLayerHeader by m_iHeaderLength u_chars. As long as whatever wrote the data you are pointing into has the same size for u_char, and m_iHeaderLength is the number of u_chars in the header area, you are safe.
But if m_iHeaderLength is really referring to bytes, and not u_chars, then you may have a problem if m_iHeaderLength is supposed to advance the pointer past other types than char.
Say you are sending data from a 32-bit system to a 16-bit system, and your header area is defined like this
struct Header {
int something;
int somethingElse;
};
Assume that is only part of the total message defined by the struct Frame.
On the 32-bit machine you write the data out to a port that the 16-bit machine will read from.
port->write(myPacket, sizeof(Frame));
On the 16-bit machine you have the same Header definition, and try to read the information.
port->read(packetBuffer, sizeof(Frame));
You are already in trouble, because the reader expects a header half the size of the one the sender wrote. The size of int on the 16-bit machine doing the reading is two, so the size of its header is four. But the header size was eight on the sending machine: two ints of four bytes each.
Now you attempt to advance your pointer
m_iHeaderLength = sizeof(Header);
...
packetBuffer += m_iHeaderLength;
packetBuffer will still be pointing into the data which was in the header in the frame sent from the originator.
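One common way around this is to define the wire format independently of either machine's int. The sketch below is only an illustration under assumed conventions: two 32-bit fields in little-endian order and a fixed header length of 8 bytes. WIRE_HEADER_LENGTH, read_le32 and parse_header are names introduced here, not part of the code in the question.

```c
#include <stdint.h>

/* Hypothetical wire layout: two 32-bit fields, little-endian, no padding. */
#define WIRE_HEADER_LENGTH 8u

struct Header {
    uint32_t something;
    uint32_t somethingElse;
};

/* Read a 32-bit little-endian value byte by byte. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the header and return a pointer to the payload that follows it. */
const unsigned char *parse_header(const unsigned char *packet, struct Header *out)
{
    out->something = read_le32(packet);
    out->somethingElse = read_le32(packet + 4);
    return packet + WIRE_HEADER_LENGTH;
}
```

Because the header length is a property of the format rather than of sizeof(Header), the payload pointer ends up in the same place on the 16-bit and the 32-bit machine.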
If there is a portability problem, then no, that wouldn't fix it. m_pLayerHeader + m_iHeaderLength and &m_pLayerHeader[m_iHeaderLength] are completely equivalent (in this case).

What's the shortest code to write directly to a memory address in C/C++?

I'm writing system-level code for an embedded system without memory protection (on an ARM Cortex-M1, compiling with gcc 4.3) and need to read/write directly to a memory-mapped register. So far, my code looks like this:
#define UART0 0x4000C000
#define UART0CTL (UART0 + 0x30)
volatile unsigned int *p;
p = UART0CTL;
*p &= ~1;
Is there any shorter way (shorter in code, I mean) that does not use a pointer? I'm looking for a way to write the actual assignment code as short as this (it would be okay if I had to use more #defines):
*(UART0CTL) &= ~1;
Anything I tried so far ended up with gcc complaining that it could not assign something to the lvalue...
#define UART0CTL ((volatile unsigned int *) (UART0 + 0x30))
:-P
Edited to add: Oh, in response to all the comments about how the question is tagged C++ as well as C, here's a C++ solution. :-P
inline unsigned volatile& uart0ctl() {
return *reinterpret_cast<unsigned volatile*>(UART0 + 0x30);
}
This can be stuck straight in a header file, just like the C-style macro, but you have to use function call syntax to invoke it.
I'd like to be a nitpick: are we talking C or C++ ?
If C, I defer to Chris' answer willingly (and I'd like the C++ tag to be removed).
If C++, I advise against the use of those nasty C-Casts and #define altogether.
The idiomatic C++ way is to use a global variable:
volatile unsigned int& UART0 = *((volatile unsigned int*)0x4000C000);
volatile unsigned int& UART0CTL = *(&UART0 + 0x0C);
I declare a typed global variable, which will obey scope rules (unlike macros).
It can be used easily (no need to use *()) and is thus even shorter!
UART0CTL &= ~1; // no need to dereference, it's already a reference
If you want it to be pointer, then it would be:
volatile unsigned int* const UART0 = reinterpret_cast<volatile unsigned int*>(0x4000C000); // Note the const to prevent rebinding
But what is the point of using a const pointer that cannot be null? This is semantically what references were created for.
You can go one further than Chris's answer if you want to make the hardware registers look like plain old variables:
#define UART0 0x4000C000
#define UART0CTL (*((volatile unsigned int *) (UART0 + 0x30)))
UART0CTL &= ~1;
It's a matter of taste which might be preferable. I've worked in situations where the team wanted the registers to look like variables, and I've worked on code where the added dereference was considered 'hiding too much' so the macro for a register would be left as a pointer that had to be dereferenced explicitly (as in Chris' answer).
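The "register looks like a variable" style can be tried out on a PC by pointing the macro at an ordinary variable instead of a hardware address. REG32 and clear_bit0 below are hypothetical names for this sketch; on the real target the address would be the register address, such as UART0 + 0x30.

```c
#include <stdint.h>

/* Hypothetical generic macro: treat a numeric address as a 32-bit register. */
#define REG32(address) (*(volatile uint32_t *)(address))

/* Clear bit 0 of the register at the given address. */
void clear_bit0(uintptr_t address)
{
    REG32(address) &= ~(uint32_t)1;
}
```

For example, with uint32_t fake_register = 0xFFFFFFFF, calling clear_bit0((uintptr_t)&fake_register) leaves 0xFFFFFFFE in it.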
#define UART0 ((volatile unsigned int*)0x4000C000)
#define UART0CTL (UART0 + 0x0C)
I like to specify the actual control bits in a struct, then assign that to the control address. Something like:
typedef struct {
unsigned other_bits : 31;
unsigned disable : 1;
} uart_ctl_t;
volatile uart_ctl_t *uart_ctl = (volatile uart_ctl_t *)0x4000C030;
uart_ctl->disable = 1;
(Apologies if the syntax isn't quite right, I haven't actually coded in C for quite awhile...)
Another option which I kinda like for embedded applications is to use the linker to define sections for your hardware devices and map your variables to those sections. This has the advantage that if you are targeting multiple devices, even from the same vendor such as TI, you will typically have to alter the linker files on a device-by-device basis anyway, i.e. different devices in the same family have different amounts of internal direct-mapped memory, and from board to board you might have different amounts of RAM as well as hardware at different locations. Here's an example from the GCC documentation:
Normally, the compiler places the objects it generates in sections
like data and bss. Sometimes, however, you need additional sections,
or you need certain particular variables to appear in special
sections, for example to map to special hardware. The section
attribute specifies that a variable (or function) lives in a
particular section. For example, this small program uses several
specific section names:
struct duart a __attribute__ ((section ("DUART_A"))) = { 0 };
struct duart b __attribute__ ((section ("DUART_B"))) = { 0 };
char stack[10000] __attribute__ ((section ("STACK"))) = { 0 };
int init_data __attribute__ ((section ("INITDATA")));
main()
{
/* Initialize stack pointer */
init_sp (stack + sizeof (stack));
/* Initialize initialized data */
memcpy (&init_data, &data, &edata - &data);
/* Turn on the serial ports */
init_duart (&a);
init_duart (&b);
}
Use the section attribute with global variables and not local variables, as shown in the example.
You may use the section attribute with initialized or uninitialized
global variables but the linker requires each object be defined once,
with the exception that uninitialized variables tentatively go in the
common (or bss) section and can be multiply “defined”. Using the
section attribute will change what section the variable goes into and
may cause the linker to issue an error if an uninitialized variable
has multiple definitions. You can force a variable to be initialized
with the -fno-common flag or the nocommon attribute.