I checked this question, and most answers say that I should pass it by value, even though passing by value clearly moves more data: by value you pass 8 bytes while by reference only 4, so on a 32-bit system sizeof(string_view) > sizeof(string_view*).
Is that still relevant in C++17/20? And can someone explain exactly why?
Indirection through a reference (as well as a pointer) has a cost. That cost can be more than the cost of copying a few bytes. As in most cases, you need to verify through measurement whether that is true for your use case / target system. Note that if the function is expanded inline, then there is unlikely to be any difference as you may end up with identical assembly in either case. Even if not, the difference may be extremely small and hard to measure.
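For illustration, here is a minimal sketch of the two signatures being compared (the function names are invented):
#include <cstddef>
#include <string_view>

// Both are valid; which one compiles to cheaper code depends on the ABI,
// inlining, and the call site -- measure for your own target system.
std::size_t length_by_value(std::string_view sv) { return sv.size(); }
std::size_t length_by_ref(const std::string_view& sv) { return sv.size(); }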
class A
{
public:
    void fun1();
    void fun2();
private:
    uint16 temp_var; // uint16: assumed to be a typedef for std::uint16_t
};
Is there any reason why I shouldn't just widen this variable from uint16 to a full uint64? Doing it this way (uint16), do I leave memory "holes" in the object? I'm told the processor is more efficient at dealing with full uint64s.
For clarification: temp_var is the only member variable, and nowhere am I using it with sizeof(temp_var) or as a counter that wraps back to zero.
Thank you all for your input, appreciate it.
If the question is, can the compiler do the substitution:
You asked for a uint16 so it gave you a uint16. It would be surprising to get something else.
For instance, imagine if a developer was counting on a behavior of integer overflow or underflow. In that case, if the compiler substituted a uint64 behind the scenes, then that would be problematically surprising for the developer.
Along the same lines, one would expect sizeof(temp_var) to equal sizeof(uint16).
There are probably further cases in which such a substitution could lead to unexpected behavior that wouldn't be anticipated by the developer.
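To make that concrete, here is a minimal sketch (assuming uint16 is a typedef for the standard std::uint16_t) of two behaviors that a silent substitution to a 64-bit type would change:
#include <cstdint>
#include <iostream>

int main()
{
    std::uint16_t v = 65535;        // largest 16-bit unsigned value
    ++v;                            // unsigned arithmetic wraps to 0
    std::cout << v << '\n';         // prints 0; a uint64 would print 65536
    std::cout << sizeof(v) << '\n'; // prints 2 on typical platforms, not 8
}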
If the question is, can you, the developer, pick something else:
Sure, you can if you want a variable of that size. So then how about some possibilities where you wouldn't...
If you rely on overflow/underflow behavior of a uint16 then of course you would want to stick to that.
Or perhaps this data is going to be passed along to some further location that only supports values in the range of a uint16, so leaving it that size may make logical sense to try to implicitly document what's valid and/or avoid compiler warnings.
Similarly you might want for sizeof(temp_var) to be 2 bytes instead of 8 for some logical reason relevant to other parts of the program.
I also expect there are some games one could play with the packing pragma, but I presume that isn't relevant to the intended question.
Depending on the goal of your program, logical consistency or clarity of code may be more important than maximum possible performance (especially at the micro level of being concerned about the size/type of a member variable). To phrase that another way, uint16 is still fast enough for many, many use cases.
In some situations though, there won't be any compelling reason to make the decision one way or the other. At that point, I would go with whatever seems to make the most sense as per personal sensibilities.
I'm still learning C++. I'm trying to understand how evaluation is carried out, in a rather step-by-step fashion. So using this simple example, an expression statement:
int x = 8 * 5 - 5;
This is what I believe happens. Please tell me how far off the mark I am:
The operands x, 8, 5, and 5 are "evaluated." Possibly, a temporary object is created to hold each value (I am not too sure about this).
8 * 5 evaluates to 40, which is stored in a temporary.
40 (temporary) - 5 evaluates to 35 (another temporary).
35 is copied into x.
All temporary objects are destroyed in the reverse order they were created in (the value is discarded).
Am I at least close to being right?
"Thank you, sir. Hm. What would happen if all the operands were named objects, rather than literals? Would it create temporaries on the fly, so to speak, rather than at compile time?"
As Sam mentioned, you are on the right track on a high level.
In your first example, the compiler would use CPU registers to store the temporaries (since they are not named objects). If they were named objects, then how 'optimized' the generated code will be depends on the optimization flags passed to the compiler and the complexity of the code. You can look at the disassembly to see what really happens. For example, if you write
int a = 5;
int b = 2;
int c = a * b;
the compiler will try to generate the most optimal code. Since in this case the two operands are constants known at compile time, it can take shortcuts: the whole product can be folded into a constant, and in general multiplications are sometimes replaced by cheaper bit operations (multiplying by 2 is the same as shifting left by 1).
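As a small illustration of that strength reduction (for non-negative values the two results are identical):
int a = 5;
int c1 = a * 2;  // a multiplication by a power of two...
int c2 = a << 1; // ...can be compiled as a left shift; c1 == c2 here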
Named variables have to live somewhere, either on the stack or on the heap, and the CPU will use the address of a named object to pass it around and operate on it. (If the values are small enough they will fit in registers and be operated on there; otherwise the CPU starts using memory: first the cache, then spilling out to RAM.)
You could google for 'abstract syntax tree' to get an idea of how readable C++ code is converted to machine code.
This is why it is important to learn about const correctness, aliasing, and pointers vs references: to give the compiler the best chance at generating optimal code for you (aside from the advantages a user gets from that).
I guess this situation comes up for every programmer: a place where we can use the comparison operator '=='. In my case the situation is like this, in a C++ program.
Code 1: this has been used in all files except the constructor
if (a == 10)
{
    // do something
}
But I can do the same as above in the following way:
I set a bool variable to true in the constructor itself, when variable a becomes 10, i.e.
constructor_name()
{
    bool variable_name = true; // when a == 10
}
Then I use the following code in all my files instead of code 1.
Code 3:
if (variable_name)
{
    // do the same as in code 1
}
Which is better for performance, code 1 or code 3? I hope I have illustrated my situation so that you can understand it. Please help me. Thanks in advance.
You shouldn't micro-optimize. You will hardly notice any difference in performance between your two versions (maybe you will save 1 CPU cycle), and it is not worth the time and effort, especially because nowadays CPUs are really fast.
Only optimize if you profile and find a bottleneck in your code.
Look at it this way: if you store the boolean variable in the class, it uses memory (1 byte) to maybe save 1 CPU cycle. Depending on how often you create the class, that can scale up (even though the amount would still be ridiculously small). You maybe saved 1 cycle, but you lost 1 byte.
If you wrote this in production code, I am sure that others would find it confusing (I would), and wonder why you put an isTen boolean in the class instead of just comparing the value using operator==.
Also, there may be a bug: if you change a to 10 outside of the constructor, then isTen would still be false, even though a is 10!
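Here is a minimal sketch of that staleness bug (the class and member names are invented for illustration):
class Widget
{
public:
    explicit Widget(int initial) : a(initial), isTen(initial == 10) {}
    void setA(int value) { a = value; } // bug: isTen is not updated here
    bool cachedIsTen() const { return isTen; }
private:
    int a;
    bool isTen;
};

// Widget w(5); // isTen is false
// w.setA(10);  // a is now 10, but cachedIsTen() still returns false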
I found that the following would make a difference:
Consider that variable a is an integer and takes 4 bytes (assuming 4 bytes for an int); then the compiler has to compare 4 bytes of memory, whereas a bool variable takes only 1 byte. I guess this makes a difference in performance.
I've recently heard that in some cases, programmers believe that you should never use literals in your code. I understand that in some cases, assigning a variable name to a given number can be helpful (especially in terms of maintenance if that number is used elsewhere). However, consider the following case studies:
Case Study 1: Use of Literals for "special" byte codes.
Say you have an if statement that checks for a specific value stored in (for the sake of argument) a uint16_t. Here are the two code samples:
Version 1:
// Descriptive comment as to why I'm using 0xBEEF goes here
if (my_var == 0xBEEF) {
//do something
}
Version 2:
const uint16_t kSuperDescriptiveVarName = 0xBEEF;
if (my_var == kSuperDescriptiveVarName) {
// do something
}
Which is the "preferred" method in terms of good coding practice? I can fully understand why you would prefer version 2 if kSuperDescriptiveVarName is used more than once. Also, does the compiler do any optimizations to make both versions effectively the same executable code? That is, are there any performance implications here?
Case Study 2: Use of sizeof
I fully understand that using sizeof versus a raw literal is preferred for portability and also readability concerns. Take the two code examples into account. The scenario is that you are computing the offset into a packet buffer (an array of uint8_t) where the first part of the packet is stored as my_packet_header, which let's say is a uint32_t.
Version 1:
const int offset = sizeof(my_packet_header);
Version 2:
const int offset = 4; // good comment telling reader where 4 came from
Clearly, version 1 is preferred, but what about for cases where you have multiple data fields to skip over? What if you have the following instead:
Version 1:
const int offset = sizeof(my_packet_header) + sizeof(data_field1) + sizeof(data_field2) + ... + sizeof(data_fieldn);
Version 2:
const int offset = 47;
Which is preferred in this case? Does it still make sense to show all the steps involved in computing the offset, or does the literal usage make sense here?
Thanks for the help in advance as I attempt to better my code practices.
Which is the "preferred" method in terms of good coding practice? I can fully understand why you would prefer version 2 if kSuperDescriptiveVarName is used more than once.
Sounds like you understand the main point... factoring values (and their comments) that are used in multiple places. Further, it can sometimes help to have a group of constants in one place - so their values can be inspected, verified, modified etc. without concern for where they're used in the code. Other times, there are many constants used in proximity and the comments needed to properly explain them would obfuscate the code in which they're used.
Countering that, having a const variable means all the programmers studying the code will wonder whether it's used anywhere else, keeping it in mind as they inspect the rest of the scope in which it's declared, etc. The fewer unnecessary things there are to remember, the surer the understanding of the important parts of the code will be.
Like so many things in programming, it's "an art" balancing the pros and cons of each approach, and best guided by experience and knowledge of the way the code's likely to be studied, maintained, and evolved.
Also, does the compiler do any optimizations to make both versions effectively the same executable code? That is, are there any performance implications here?
There are no performance implications in optimised code.
I fully understand that using sizeof versus a raw literal is preferred for portability and also readability concerns.
And other reasons too. A big factor in good programming is reducing the points of maintenance when changes are done. If you can modify the type of a variable and know that all the places using that variable will adjust accordingly, that's great - saves time and potential errors. Using sizeof helps with that.
Which is preferred [for calculating offsets in a struct]? Does it still make sense to show all the steps involved in computing the offset, or does the literal usage make sense here?
The offsetof macro (#include <cstddef>) is better for this... again reducing maintenance burden. With the this + that approach you illustrate, if the compiler decides to use any padding your offset will be wrong, and further you have to fix it every time you add or remove a field.
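As a rough sketch of that approach (the packet layout below is invented for illustration):
#include <cstddef> // offsetof
#include <cstdint>

struct Packet // hypothetical layout
{
    std::uint32_t header;
    std::uint16_t data_field1;
    std::uint16_t data_field2;
    std::uint8_t  payload[32];
};

// The compiler accounts for any padding automatically, and the offset
// stays correct when fields are added, removed, or retyped.
const std::size_t offset = offsetof(Packet, payload);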
Ignoring the offsetof issues and just considering your this + that example as an illustration of a more complex value to assign, again it's a balancing act. You'd definitely want some explanation/comment/documentation regarding intent here (are you working out the binary size of earlier fields? calculating the offset of the next field? deliberately skipping some fields that might not be needed for the intended use, or was that accidental?). Still, a named constant might be enough documentation, so it's likely unimportant which way you lean.
In every example you list, I would go with the name.
In your first example, you almost certainly used that special 0xBEEF number at least twice - once to write it and once to do your comparison. If you didn't write it, that number is still part of a contract with someone else (perhaps a file format definition).
In the last example, it is especially useful to show the computation that yielded the value. That way, if you encounter trouble down the line, you can easily see either that the number is trustworthy, or what you missed and fix it.
There are some cases where I prefer literals over named constants though. These are always cases where a name is no more meaningful than the number. For example, you have a game program that plays a dice game (perhaps Yahtzee), where there are specific rules for specific die rolls. You could define constants for One = 1, Two = 2, etc. But why bother?
Generally it is better to use a name instead of a value. After all, if you need to change it later, you can find it more easily. Also it is not always clear why this particular number is used, when you read the code, so having a meaningful name assigned to it, makes this immediately clear to a programmer.
Performance-wise there is no difference, because the optimizers should take care of it. And even if an extra instruction were generated, it is rather unlikely that this would cause you trouble. If your code were that tight, you probably shouldn't rely on an optimizer effect anyway.
I can fully understand why you would prefer version 2 if kSuperDescriptiveVarName is used more than once.
I think kSuperDescriptiveVarName will definitely be used more than once: once for the check and at least once for an assignment, maybe in a different part of your program.
There will be no difference in performance, since an optimization called Constant Propagation exists in almost all compilers. Just enable optimization for your compiler.
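A minimal sketch of what the optimizer sees (the function names are invented; both compare against the same constant, so an optimizing compiler can emit identical code for them):
#include <cstdint>

bool check_literal(std::uint16_t my_var)
{
    return my_var == 0xBEEF; // version 1
}

bool check_named(std::uint16_t my_var)
{
    const std::uint16_t kSuperDescriptiveVarName = 0xBEEF;
    return my_var == kSuperDescriptiveVarName; // version 2: the constant
                                               // propagates into the compare
}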
Should I bother using short int instead of int? Is there any useful difference? Any pitfalls?
short vs int
Don't bother with short unless there is a really good reason such as saving memory on a gazillion values, or conforming to a particular memory layout required by other code.
Using lots of different integer types just introduces complexity and possible wrap-around bugs.
On modern computers it might also introduce needless inefficiency.
const
Sprinkle const liberally wherever you can.
const constrains what might change, making it easier to understand the code: you know that this beastie is not gonna change, so it can be ignored and your thinking directed at more useful/relevant things.
Top-level const for formal arguments is, however, by convention omitted, possibly because the gain is not enough to outweigh the added verbosity.
Also, in a pure declaration of a function, a top-level const on an argument is simply ignored by the compiler. On the other hand, some other tools may not be smart enough to ignore them when comparing pure declarations to definitions; one person cited that in an earlier debate on the issue in the comp.lang.c++ Usenet group. So it depends to some extent on the toolchain, but happily I've never used tools that place any significance on those consts.
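For example (the function name is arbitrary), these declare one and the same function:
void process(int x);      // pure declaration: top-level const omitted

void process(const int x) // definition: the same function as above; the
{                         // const only constrains the function body
    // x cannot be reassigned here
}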
Cheers & hth.,
Absolutely not in function arguments. Few calling conventions are going to make any distinction between short and int. If you're making giant arrays you could use short if your data fits in short to save memory and increase cache effectiveness.
What Ben said. You will actually create less efficient code, since the upper bits have to be stripped out of registers whenever comparisons are done. Unless you need to save memory because you have tons of them, use the native integer size. That's what int is for.
EDIT: Didn't even see your sub-question about const. Using const on intrinsic types (int, float) is useless, but any pointers/references should absolutely be const whenever applicable. Same for class methods as well.
The question is technically malformed "Should I use short int?". The only good answer will be "I don't know, what are you trying to accomplish?".
But let's consider some scenarios:
You know the definite range of values that your variable can take.
The ranges for signed integers are:
signed char — -2⁷ – 2⁷-1
short — -2¹⁵ – 2¹⁵-1
int — -2¹⁵ – 2¹⁵-1
long — -2³¹ – 2³¹-1
long long — -2⁶³ – 2⁶³-1
We should note here that these are guaranteed minimum ranges; they can be larger in your particular implementation, and often are. You are also guaranteed that the previous range cannot be larger than the next, but they can be equal.
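A quick way to check the actual ranges on your implementation is std::numeric_limits; a minimal sketch:
#include <iostream>
#include <limits>

int main()
{
    std::cout << "short: " << std::numeric_limits<short>::min() << " to "
              << std::numeric_limits<short>::max() << '\n';
    std::cout << "int:   " << std::numeric_limits<int>::min() << " to "
              << std::numeric_limits<int>::max() << '\n';
    std::cout << "long:  " << std::numeric_limits<long>::min() << " to "
              << std::numeric_limits<long>::max() << '\n';
}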
You will quickly note that short and int actually have the same guaranteed range, which gives you very little incentive to use short. The only reason left to use it in this situation is to give other coders a hint that the values will not be too large, but that can be done via a comment.
It does, however, make sense to use signed char if you know that you can fit every potential value in the range -128 to 127.
You don't know the exact range of potential values.
In this case you are in a rather bad position to attempt to minimise memory usage, and should probably use at least int. Although it has the same minimum range as short, on many platforms it may be larger, and this will help you out.
But the bigger problem is that you are trying to write a piece of software that operates on values whose range you do not know. Perhaps something went wrong before you started coding (when the requirements were being written up).
You have an idea about the range, but realise that it can change in the future.
Ask yourself how close to the boundary are you. If we are talking about something that goes from -1000 to +1000 and can potentially change to -1500 – 1500, then by all means use short. The specific architecture may pad your value, which will mean you won't save any space, but you won't lose anything. However, if we are dealing with some quantity that is currently -14000 – 14000, and can grow unpredictably (perhaps it's some financial value), then don't just switch to int, go to long right away. You will lose some memory, but will save yourself a lot of headache catching these roll-over bugs.
short vs int - If your data will fit in a short, use a short. Save memory. Make it easier for the reader to know how much data your variable may hold.
use of const - Great programming practice. If your data should be a const then make it const. It is very helpful when someone reads your code.