questions about the testing of the xloper xll - c++

code as follows:
'''
int __stdcall xloper_type(const xloper *type){
if((type->xltype&~(xlbitXLFree|xlbitDLLFree))==xltypeRef)
return xltypeRef;
}
'''
Why use ~(xlbitXLFree|xlbitDLLFree) when testing the xltype of an xloper pointer? Thanks!

Below is the relevant section from xlcall.h which defines the XLOPER data type. I don't know how 'xloper' (lower case) is defined in the OP's code (whether XLOPER or XLOPER12), but for XLOPER the xltype member is a WORD (a 16-bit unsigned integer). Each bit corresponds to a different flag, so you can use bit-wise operations.
The xlbitXLFree and xlbitDLLFree bits indicate who (Excel or you) is responsible for cleaning up any memory allocated to the OPER (e.g. if it contains a string or an array). So typically, if you allocate any memory for the OPER in your xll, the xlbitDLLFree bit will be set to 1, and Excel will expect you to tidy up afterwards (passing the OPER to the xlAutoFree() function that should be declared somewhere in your xll code).
So
type->xltype&~(xlbitXLFree|xlbitDLLFree)
zeroes out the memory allocation bits in a bit-wise AND NOT() operation, leaving you with just the basic type. E.g. if you received a string from Excel into your code, its xltype would likely be xltypeStr | xlbitXLFree (Excel will tidy up), so 0x1002, which would not be equal to xltypeStr (0x0002), hence the need for the bit-wise operation before you check the OPER's underlying type.
A neater way to write this might be (depending on requirements)
if(type->xltype & xltypeRef) {}
as the result will be non-zero (true) if the bit matching xltypeRef (0x0008) is set in type->xltype.
/*
** XLOPER and XLOPER12 data types
**
** Used for xltype field of XLOPER and XLOPER12 structures
*/
#define xltypeNum 0x0001
#define xltypeStr 0x0002
#define xltypeBool 0x0004
#define xltypeRef 0x0008
#define xltypeErr 0x0010
#define xltypeFlow 0x0020
#define xltypeMulti 0x0040
#define xltypeMissing 0x0080
#define xltypeNil 0x0100
#define xltypeSRef 0x0400
#define xltypeInt 0x0800
#define xlbitXLFree 0x1000
#define xlbitDLLFree 0x4000
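The masking step described above can be sketched roughly as follows. The constants are copied from the header excerpt; the helper names xl_base_type and xl_is_ref are made up for illustration, and a plain unsigned value stands in for the xltype member of a real XLOPER.

```c
/* Constants as listed in the xlcall.h excerpt above. */
#define xltypeStr    0x0002
#define xltypeRef    0x0008
#define xlbitXLFree  0x1000
#define xlbitDLLFree 0x4000

/* Hypothetical helper: strip the two memory-ownership bits so that only
   the underlying type remains. */
static unsigned xl_base_type(unsigned xltype)
{
    return xltype & ~(unsigned)(xlbitXLFree | xlbitDLLFree);
}

/* Hypothetical helper: 1 when the value is a reference, whichever
   ownership bits are set. The mask is applied before the comparison;
   written without parentheses, == would be evaluated first because it
   binds more tightly than &. */
static int xl_is_ref(unsigned xltype)
{
    return xl_base_type(xltype) == xltypeRef;
}
```

Unlike a bare `xltype & xltypeRef` test, the comparison after masking only succeeds when xltypeRef is the sole type bit set, which may or may not be what is wanted.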

Related

Macro values defined using bit-shifts

I've been going through an old source project, trying to make it compile and run (it's an old game that's been uploaded to GitHub). I think a lot of the code was written with C-style/C-syntax in mind (a lot of typedef struct {...} and the likes) and I've been noticing that they define certain macros with the following style:
#define MyMacroOne (1<<0) //This equals 1
#define MyMacroTwo (1<<1) //This equals 2, etc.
So my question now is this - is there any reason why macros would be defined this way? Because, for example, 0x01 and 0x02 are the numerical result of the above. Or is it that the system will not read MyMacroOne = 0x01 but rather as a "shift object" with the value (1<<0)?
EDIT: Thanks for all of your inputs!
It makes it more intuitive and less error prone to define bit values, especially on multibit bitfields. For example, compare
#define POWER_ON (1u << 0)
#define LIGHT_ON (1u << 1)
#define MOTOR_ON (1u << 2)
#define SPEED_STOP (0u << 3)
#define SPEED_SLOW (1u << 3)
#define SPEED_FAST (2u << 3)
#define SPEED_FULL (3u << 3)
#define LOCK_ON (1u << 5)
and
#define POWER_ON 0x01
#define LIGHT_ON 0x02
#define MOTOR_ON 0x04
#define SPEED_STOP 0x00
#define SPEED_SLOW 0x08
#define SPEED_FAST 0x10
#define SPEED_FULL 0x18
#define LOCK_ON 0x20
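As a small sketch of why the shift form helps with multi-bit fields: the shift count can be named once and reused for the mask and for decoding. SPEED_SHIFT, SPEED_MASK, get_speed and set_speed are hypothetical names added for this illustration; the other constants follow the first list above.

```c
#define MOTOR_ON    (1u << 2)
#define SPEED_SHIFT 3                    /* hypothetical: position of the field */
#define SPEED_MASK  (3u << SPEED_SHIFT)  /* hypothetical: bits 3 and 4 */
#define SPEED_SLOW  (1u << SPEED_SHIFT)
#define SPEED_FAST  (2u << SPEED_SHIFT)
#define SPEED_FULL  (3u << SPEED_SHIFT)
#define LOCK_ON     (1u << 5)

/* Extract the two-bit speed setting (0..3) from a status word. */
static unsigned get_speed(unsigned status)
{
    return (status & SPEED_MASK) >> SPEED_SHIFT;
}

/* Replace the speed setting while leaving every other bit untouched. */
static unsigned set_speed(unsigned status, unsigned speed_bits)
{
    return (status & ~SPEED_MASK) | (speed_bits & SPEED_MASK);
}
```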
It is convenient for the humans
for example
#define PIN0 (1u<<0)
#define PIN5 (1u<<5)
#define PIN0MASK (~(1u<<0))
#define PIN5MASK (~(1u<<5))
And it is easy to see whether the bit position is correct. It does not make the code slower, as it is calculated at compile time.
You can always use constant integer expression shifts as a way to express (multiples of) powers of two, i.e. Multiple*(2 to the N-th power) = Multiple << N (with some caveats related to when you hit the guaranteed size limits of the integer types and UB sets in*) and pretty much rely on the compiler folding them.
An integer expression made of integer constants is defined as an integer constant expression. These can be used to specify array sizes, case labels and stuff like that and so every compiler has to be able to fold them into a single intermediate and it'd be stupid not to utilize this ability even where it isn't strictly required.
*E.g.: you can do 1U<<15, but at 16 you should switch to at least 1L<<16 because ints/unsigneds are only required to have at least 16 bits and leftshifting an integer by its width or into the place where its sign bit is is undefined (6.5.7p4):
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated
bits are filled with zeros. If E1 has an unsigned type, the value of
the result is E1 × 2^E2, reduced modulo one more than the maximum
value representable in the result type. If E1 has a signed type and
nonnegative value, and E1 × 2^E2 is representable in the result type,
then that is the resulting value; otherwise, the behavior is
undefined.
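One way to stay clear of these caveats, sketched under the assumption that a 32-bit mask is wanted, is to do the shift in a fixed-width unsigned type. The helper name bit32 is invented for this example.

```c
#include <stdint.h>

/* Hypothetical helper: build a single-bit mask in a 32-bit unsigned type.
   Shifting an unsigned value is well defined for counts below its width;
   counts of 32 or more are rejected here and yield 0. */
static uint32_t bit32(unsigned n)
{
    return (n < 32u) ? ((uint32_t)1 << n) : 0u;
}
```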
Macros are just replacement text: everywhere the macro name appears, it is replaced by its replacement text. This is convenient, especially if you want to name a constant that would otherwise be easy to get wrong.
To illustrate how this (1<<0) syntax is more practical, consider this example from the code-base of Git 2.25 (Q1 2020), which moves the definition of a set of bitmask constants from octal literal to (1U<<count) notation.
See commit 8679577 (17 Oct 2019) by Hariom Verma (harry-hov).
(Merged by Junio C Hamano -- gitster -- in commit 8f40d89, 10 Nov 2019)
builtin/blame.c: constants into bit shift format
Signed-off-by: Hariom Verma
We are looking at bitfield constants, and elsewhere in the Git source code, such cases are handled via bit shift operators rather than octal numbers, which also makes it easier to spot holes in the range.
If, say, 1<<5 was missing:
it is easier to spot it between 1<<4 and 1<<6
than it is to spot a missing 040 between a 020 and a 0100.
So instead of:
#define OUTPUT_ANNOTATE_COMPAT 001
#define OUTPUT_LONG_OBJECT_NAME 002
#define OUTPUT_RAW_TIMESTAMP 004
#define OUTPUT_PORCELAIN 010
You get:
#define OUTPUT_ANNOTATE_COMPAT (1U<<0)
#define OUTPUT_LONG_OBJECT_NAME (1U<<1)
#define OUTPUT_RAW_TIMESTAMP (1U<<2)
#define OUTPUT_PORCELAIN (1U<<3)

Use of *(char *) in C++

I don't understand what using this syntax is for: *(char *). What does it do and can it be used with other data types like int?
void function(int a)
{
*(char*)(0x12345 + (0x3980 * a)) = 0xFF;
}
*(char *)hoge means: interpret hoge as a pointer to char and access the data at the location hoge points to.
It can be used with other data types like int.
One usage example: comparison function for qsort
int cmp(const void *x, const void *y) {
int a = *(int *)x;
int b = *(int *)y;
if (a > b) return 1;
if (a < b) return -1;
return 0;
}
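A sketch of how such a comparison function is typically handed to qsort from the C standard library. The names cmp_int and sort_ints are chosen for this example only, and the casts are written const-correct.

```c
#include <stdlib.h>

/* Same idea as the comparison function above: the void pointers are cast
   to pointers to int and dereferenced to read the two elements. */
static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x;
    int b = *(const int *)y;
    return (a > b) - (a < b);  /* 1, -1 or 0 */
}

/* Hypothetical wrapper: sort an int array in ascending order. */
static void sort_ints(int *values, size_t count)
{
    qsort(values, count, sizeof values[0], cmp_int);
}
```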
I don't know where you got your example from, but it doesn't make sense to me for some reason. Anyway, when you write "*" before something like (char*), you're telling the compiler to cast the value computed in the parentheses that follow (0x12345 + (0x3980 * a)) to a pointer to char, and then change the value stored at that memory location to 0xFF.
In other words, you grabbed an arbitrary memory location, told the compiler to treat it as holding a char with "*(char*)", and stored the value 0xFF there.
The question has been answered already, but here is a "real world" example where this kind of syntax is used.
In (low-level) embedded software development you often have to interface with an MCU's hardware peripherals. These peripherals are controlled by registers which are mapped to fixed memory addresses.
When an MCU has several of the same peripheral (e.g. 3 ADCs) it'll usually have 3 identical register sets mapped right after each other.
When interfacing you want to work with the addresses directly but add an abstraction. A simple API for control may look like this:
.H file
/* Header file, defines addresses for specific chip*/
#define ADC_BASE_ADDRESS 0x00001000 /* Start address of first register of ADC0 */
#define SIZEOF_ADC_REGISTERS 0x00000020 /* Size of all ADC0 registers */
#define ADC_REG_CFG_OFFSET 0x00 /* ADC Config register offset */
#define ADC_REG_BLA_BLA_OFFSET 0x04 /* ADC Config register offset */
/* etc, etc, etc*/
#define ADC_CFG_ENABLE 0x01 /* Enable command */
.C file
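#include <stdint.h>
#include "chip.h"
void adc_enable(int adc){
/* volatile: the write targets a hardware register and must not be optimised away */
*(volatile uint32_t *)(ADC_BASE_ADDRESS + ADC_REG_CFG_OFFSET + (adc * SIZEOF_ADC_REGISTERS)) = ADC_CFG_ENABLE;
}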
/* Calling code */
adc_enable(0);
adc_enable(3);
Do note, as mentioned this is typically done in C, not so much in C++.
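The address arithmetic can be tried on a desktop machine by passing the base address in as a parameter, so that an ordinary array can stand in for the peripheral. This is only a sketch: adc_cfg_reg and adc_enable_at are hypothetical names, and the register layout is the one assumed in the header above.

```c
#include <stdint.h>

#define SIZEOF_ADC_REGISTERS 0x20u  /* bytes per ADC register block */
#define ADC_REG_CFG_OFFSET   0x00u  /* ADC config register offset */
#define ADC_CFG_ENABLE       0x01u  /* enable command */

/* Hypothetical helper: address of the CFG register of ADC number `adc`,
   given the base address of ADC0. The volatile qualifier tells the
   compiler that every access matters, as it does for real registers. */
static volatile uint32_t *adc_cfg_reg(uintptr_t base, unsigned adc)
{
    return (volatile uint32_t *)(base + ADC_REG_CFG_OFFSET
                                 + adc * SIZEOF_ADC_REGISTERS);
}

/* Hypothetical variant of adc_enable() that takes the base address. */
static void adc_enable_at(uintptr_t base, unsigned adc)
{
    *adc_cfg_reg(base, adc) = ADC_CFG_ENABLE;
}
```

On real hardware the base would be the fixed address from the chip header; in a host test it can be the address of a zero-initialised uint32_t array large enough for all register blocks.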

How can I get the SEXPTYPE of an SEXP value?

Suppose I have a function taking an SEXP type as a parameter:
SEXP myFun(SEXP param)
How can I find out the type of this parameter? Looking at the SEXP type in my debugger, I can see that I could call param->sxpinfo.type to get a numerical representation of the SEXPTYPE. From a quick glance, they seem to match with these:
no  SEXPTYPE    Description
0   NILSXP      NULL
1   SYMSXP      symbols
2   LISTSXP     pairlists
3   CLOSXP      closures
4   ENVSXP      environments
5   PROMSXP     promises
6   LANGSXP     language objects
7   SPECIALSXP  special functions
8   BUILTINSXP  builtin functions
9   CHARSXP     internal character strings
10  LGLSXP      logical vectors
13  INTSXP      integer vectors
14  REALSXP     numeric vectors
15  CPLXSXP     complex vectors
16  STRSXP      character vectors
17  DOTSXP      dot-dot-dot object
18  ANYSXP      make “any” args work
19  VECSXP      list (generic vector)
20  EXPRSXP     expression vector
21  BCODESXP    byte code
22  EXTPTRSXP   external pointer
23  WEAKREFSXP  weak reference
24  RAWSXP      raw vector
25  S4SXP       S4 classes not of simple type
(source: http://www.biosino.org/R/R-doc/R-ints/SEXPTYPEs.html#SEXPTYPEs)
But this seems hacky. What is the right way to check the type of a SEXP variable?
With the R API, the TYPEOF macro is used to get the runtime type. We can see some relevant bits from Rinternals.h (interestingly, not encoded as an enum, but as a series of macro defines; presumably for backwards compatibility with some very bad compiler on some very bad platform...)
typedef unsigned int SEXPTYPE;
#define NILSXP 0 /* nil = NULL */
#define SYMSXP 1 /* symbols */
#define LISTSXP 2 /* lists of dotted pairs */
#define CLOSXP 3 /* closures */
#define ENVSXP 4 /* environments */
#define PROMSXP 5 /* promises: [un]evaluated closure arguments */
#define LANGSXP 6 /* language constructs (special lists) */
#define SPECIALSXP 7 /* special forms */
#define BUILTINSXP 8 /* builtin non-special forms */
#define CHARSXP 9 /* "scalar" string type (internal only)*/
#define LGLSXP 10 /* logical vectors */
/* 11 and 12 were factors and ordered factors in the 1990s */
#define INTSXP 13 /* integer vectors */
#define REALSXP 14 /* real variables */
#define CPLXSXP 15 /* complex variables */
#define STRSXP 16 /* string vectors */
#define DOTSXP 17 /* dot-dot-dot object */
#define ANYSXP 18 /* make "any" args work.
Used in specifying types for symbol
registration to mean anything is okay */
#define VECSXP 19 /* generic vectors */
#define EXPRSXP 20 /* expressions vectors */
#define BCODESXP 21 /* byte code */
#define EXTPTRSXP 22 /* external pointer */
#define WEAKREFSXP 23 /* weak reference */
#define RAWSXP 24 /* raw bytes */
#define S4SXP 25 /* S4, non-vector */
/* used for detecting PROTECT issues in memory.c */
#define NEWSXP 30 /* fresh node creaed in new page */
#define FREESXP 31 /* node released by GC */
#define FUNSXP 99 /* Closure or Builtin or Special */
If USE_RINTERNALS is defined, we can see that R queries the SEXPTYPE with:
#define TYPEOF(x) ((x)->sxpinfo.type)
which is exactly as you proposed :) But in most cases (i.e. unless you know what you're doing), you shouldn't be using that #define, and so the definition comes from memory.c:
int (TYPEOF)(SEXP x) { return TYPEOF(CHK(x)); }
...which just makes a call to the TYPEOF macro, but uses CHK to ensure the SEXP it's looking at hasn't already been unprotected.
It is useful to browse the R sources (and definitely take a look at R.h and Rinternals.h) to get a better idea of what is actually exposed in the R API, and how it is used.
That said, the R API is a bit of an ugly beast, so we really do recommend using Rcpp, which provides a number of nice classes that wrap over SEXPs but provide compile-time types and a slew of useful functions for using and manipulating them. See @eddelbuettel's Rcpp page for an introduction, and the Rcpp Gallery for example Rcpp use.

Exporting Packed structures with bool

What is the best practice for exporting a packed structure containing booleans?
I ask this because I'm trying to find the best way to do that. Currently I do:
#ifndef __cplusplus
#if __STDC_VERSION__ >= 199901L
#include <stdbool.h> //size is 1.
#else
typedef enum {false, true} bool; //sizeof(int)
#endif
#endif
In the above, the size of a boolean can be 1 or sizeof(int).
So in a structure like:
#pragma pack(push, 1)
typedef struct
{
long unsigned int sock;
const char* address;
bool connected;
bool blockmode;
} Sock;
#pragma pack(pop)
the alignment is different if using C compared to C99 & C++. If I export it as an integer then languages where boolean is size 1 have alignment problems and need to pad the structure.
I was wondering if it would be best to typedef a bool as a char in the case of pre-C99 but it just doesn't feel right.
Any better ideas?
It depends on what you're looking for: preserve space but run a few extra instructions, or waste a few bytes but run faster.
If you're looking to be fast, but can "waste" a few bytes of space (i.e. a single value for each boolean flag, see sizeof bool discussion), your current approach is superior. That is because it can load and compare the boolean values directly without having to mask them out of a packed field (see next).
If you're looking to preserve space then you should look into C bitfields:
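struct Sock {
...
unsigned int connected:1; // unsigned: a plain int 1-bit field may be signed and then holds 0 and -1
unsigned int blockmode:1; // some compilers also accept char as the field type here
};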
or roll your own "flags" and set bits in integer values:
#define SOCKFLAGS_NONE 0
#define SOCKFLAGS_CONNECTED (1<<0)
#define SOCKFLAGS_BLOCKMODE (1<<1)
struct Sock {
...
int flags; // For 2 flags, you could also use char here.
};
Both examples lead to more or less the same code which masks bits and shifts values around (extra instructions) but is denser packed than simple bool values.
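The "roll your own flags" variant usually comes with three small operations: set, clear and test. A minimal sketch follows, with unsigned constants and invented helper names (flag_set, flag_clear, flag_test):

```c
#define SOCKFLAGS_NONE      0u
#define SOCKFLAGS_CONNECTED (1u << 0)
#define SOCKFLAGS_BLOCKMODE (1u << 1)

/* Hypothetical helpers operating on a flags word. */

/* Turn a flag on. */
static unsigned flag_set(unsigned flags, unsigned flag)
{
    return flags | flag;
}

/* Turn a flag off, leaving the others untouched. */
static unsigned flag_clear(unsigned flags, unsigned flag)
{
    return flags & ~flag;
}

/* 1 if the flag is on, 0 otherwise. */
static int flag_test(unsigned flags, unsigned flag)
{
    return (flags & flag) != 0;
}
```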
IMHO, using #pragma pack is more pain (in long term) than the gain (in short term).
It is compiler specific; non-standard and non-portable
I understand the embedded systems or protocols scenarios. With little extra effort, the code can be written pragma free.
I too want to pack my structure as much as possible and lay out the members in wider-first way as you did. However, I do not mind losing 2 bytes, if that allows my code to be standard-compliant and portable.
I would do the following three things:
Declare the flags as bool (you already did) and assign true/false
Put them as last members of the struct (you already did)
Use bitfield (as suggested by fellow stackers)
Combining these:
typedef struct Sock
{
long unsigned int sock;
const char* address;
bool connected : 1;
bool blockmode : 1;
} Sock;
In the pre-C99 case, it is risky to typedef char bool;. That will silently break code like:
bool x = (foo & 0x100);
which is supposed to set x to be true if that bit is set in foo. The enum has the same problem.
In my code I actually do typedef unsigned char bool; but then I am careful to write !! everywhere that an expression is converted to this bool. It's not ideal.
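The difference the !! makes can be seen in a short sketch. bool_t is a stand-in name for the pre-C99 typedef, the function names are invented, and an 8-bit unsigned char is assumed.

```c
typedef unsigned char bool_t;  /* stand-in for: typedef unsigned char bool; */

/* Truncating conversion: 0x100 does not fit in 8 bits, so the stored
   value is 0 even though the tested bit is set. */
static bool_t bit8_naive(unsigned foo)
{
    return (bool_t)(foo & 0x100u);
}

/* Normalising with !! first yields exactly 0 or 1, which always fits. */
static bool_t bit8_normalised(unsigned foo)
{
    return (bool_t)!!(foo & 0x100u);
}
```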
In my experience, using flags in an integral type leads to fewer issues than using bool in your structure, or bitfields, for C90.

Multiple Parameters in a single Parameter (functions) in C/C++

Ok this might sound a little vague from the title, but that's because I have no idea how to word it differently. I'll try to explain what I mean: very often in certain libraries, the 'init' function accepts some parameters, but that parameter then accepts multiple parameters (right..). An example, would be like this:
apiHeader.h
#define API_FULLSCREEN 0x10003003
#define API_NO_DELAY 0x10003004
#define API_BLAH_BLAH 0x10003005
main.c:
apiInit(0, 10, 10, 2, API_FULLSCREEN | API_NO_DELAY | API_BLAH_BLAH);
How does this work? I can't find the answer anywhere, most likely because I don't know how it's actually called so I have no clue what to search for. It would be very useful in my current project.
Thanks in advance!
The parameter is usually called "$FOO flags" and the values are or-ed. The point is that the parameter is a numeric type that is constructed as the bitwise or of multiple possible values.
In the processing functions, the values are usually tested with a bitwise and:
if ( (flags & API_FULLSCREEN) != 0 )
You have to be careful to assign values in a way that keeps the OR operation linear. In other words, don't set the same bit in two different or-able values, like you did in your header. For example,
#define API_FULLSCREEN 0x1
#define API_NO_DELAY 0x2
#define API_BLAH_BLAH 0x4
works and allows you to deconstruct all combinations of flags in your function, but
#define API_FULLSCREEN 0x1
#define API_NO_DELAY 0x2
#define API_BLAH_BLAH 0x3
does not because API_FULLSCREEN | API_NO_DELAY == API_BLAH_BLAH.
Viewing from a higher level, a flags int is a poor man's variable argument list. If you consider C++, you should encapsulate such detail in a class or at least a std::bitset.
This fifth parameter is usually a mask. It works by defining several consts (probably an enum) with values that are powers of two, or combinations of them. Then they are encoded into a single value using |, and decoded using &. Example:
#define COLOUR_RED 0x01
#define COLOUR_GREEN 0x02
#define COLOUR_BLUE 0x04
#define COLOUR_CYAN (COLOUR_BLUE | COLOUR_GREEN) // 0x06
// Encoding
SetColour(COLOUR_RED | COLOUR_BLUE); // Parameter is 0x05
// Decoding
void SetColour(int colour)
{
    if (colour & COLOUR_RED) { // If the mask contains COLOUR_RED
        // Do whatever
    }
    if (colour & COLOUR_BLUE) { // If the mask contains COLOUR_BLUE
        // Do whatever
    }
    // ..
}
What they are doing there is using binary OR to combine the flags together.
so what is actually happening is:
0x10003003 | 0x10003004 | 0x10003005 == 0x10003007
It's still one parameter, but the 3 flags combine into a single value for that parameter, which the function can then inspect.
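A quick check of that arithmetic, using the values from the question (the helper has_flag is a made-up name). It also shows a side effect of these particular values: because the constants share bits, a bitwise AND test cannot tell them apart.

```c
#define API_FULLSCREEN 0x10003003u
#define API_NO_DELAY   0x10003004u
#define API_BLAH_BLAH  0x10003005u

/* Hypothetical helper: 1 if any bit of `flag` is present in `flags`. */
static int has_flag(unsigned long flags, unsigned long flag)
{
    return (flags & flag) != 0;
}

/* With overlapping values, has_flag(API_FULLSCREEN, API_NO_DELAY) is 1
   even though API_NO_DELAY was never passed: the two constants have the
   bits of 0x10003000 in common. */
```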
What you are defining as multiple parameter is strictly a single parameter from the function signature point of view.
As for handling multiple Options based on a single parameter, as you can see there is the bitwise Or Operator which sets a single value for the parameter value. The body of the function then uses individual bits to determine the complete settings.
Usually, one bit is allocated for one option and they usually have two state(true/false) values.
The parameter is usually called "flags" and contains an or-ed combination of a set of allowed values.
int flags = API_FULLSCREEN | API_NO_DELAY;
The function can then take this integer parameter and extract the individual items like this:
int fullscreen_set = flags & API_FULLSCREEN;
int no_delay_set = flags & API_NO_DELAY;
int blah_blah_set = flags & API_BLAH_BLAH;
For this to work, one has to be careful in how one chooses the numeric values for the API_* parameters.
Bitwise OR
Bitwise OR works almost exactly the same way as bitwise AND. The only difference is that only one of the two bits needs to be a 1 for that position's bit in the result to be 1. (If both bits are a 1, the result will also have a 1 in that position.) The symbol is a pipe: |. Again, this is similar to the boolean logical OR operator, which is ||.
01001000 | 10111000 = 11111000
and consequently
72 | 184 = 248
So your function does not take multiple parameters; it is actually one parameter.
You can apply the bitwise OR operation to API_FULLSCREEN | API_NO_DELAY | API_BLAH_BLAH and pass the result to the function.
The example that you gave will not work as expected. What you do is use a particular bit for a particular option, and the OR combines them.
Example
#define OPT1 1
#define OPT2 2
#define OPT3 4
So bit 1 is for OPT1, bit 2 is for OPT2 etc.
So OPT1 | OPT3 sets bit 1 and 3 and gives a value of 5
In the function you can test if a particular option is required using the AND operator
So
void perform(int opts)
{
if (opts & OPT1)
{
// Do stuff for OPT1
}
...
The value of these parameters are defined in a way that they don't have any overlap. Something like this:
#define A 0x01
#define B 0x02
#define C 0x04
#define D 0x08
Given the above definitions, you can always determine which of the above values have been ORed together using the bitwise AND operator:
void foo(int param)
{
if(param & A)
{
// then you know that A has been included in the param
}
if(param & B)
{
// then you know that B has been included in the param
}
...
}
int main()
{
foo (A | C);
return 0;
}