What is the internal logic of the Python bitwise OR (|) operator?

What is the internal logic of the Python bitwise OR (|) operator? - python-2.7

I've just started the "advanced" stages of the Python 2.7 course on Codeacademy and went to the extra effort of trying to write a function to perform a manual bitwise OR (|) operation.
What I came up with is not the most readable solution (so not so Pythonic)...
def bitwiseOr(bin1, bin2):
"""Perform a bitwise OR swap of two string representations of a binary number"""
# if the second bit is larger (in length), swap them
if len(bin1) < len(bin2):
bin1, bin2 = bin2, bin1
# cast string into list using list comprehension
result = [x for x in bin1]
resultString = ""
# start at the end of the smallest length bit string,
# moving backwards by 1 char, and stop at the 2nd char
for i in range(len(bin2), 2, -1):
if bin2[i-1] == "1":
result[i] = bin2[i-1]
# convert the list back into a string
for item in result:
resultString += item
return resultString
print bin(0b1110 | 0b101)
print bitwiseOr("0b101", "0b1110")
The above two print calls both return the same result (albeit with the first call returning a binary number and the second returning a string representation of a binary number).
Readability aside - I'm interested to see how it's logically done, underneath the hood, by Python internally. Poking around in the CPython repository didn't yield much, despite the fact that I think I found the right file (C is very foreign to me).
What I mean by logically is, there are normally a few ways to solve any given problem, and I solved this one by passing the base 2 representations of each number as strings, creating a list based on the larger length binary, and comparing each character, preferencing the 1.
How does Python do it internally?

The OR operator is like a parallel electrical circuit with two routes, even if one of the routes is broken the current will still flow through. It only stops when both the routes are broken. But you have to be careful with an OR operator in python, although it looks simple the logic has to be decided really carefully or else you might have a very hard time debugging your code.

This question, to the untrained eye (someone without a computer science background), is about trying to understand the C implementation of Python's bitwise OR operation.
However after some more research I've realised the question, to the trained eye, might seem absurd, since it is actually the processor itself which understands this type of operation.
What this means is that the internal logic of the Python bitwise OR operation is actually entirely dependent on the instructions available to the CPU, that is, it is not dependent on a higher level language.
So in summary, the internal logic of the Python bitwise OR operator performs a comparison on the bits (the 1's and 0's) of the machine in question, and how it performs this operation is a direct comparison of said bits which, mind-blowingly, is not dissimilar to how Ancient Egyptians performed multiplication. Woah.

Related

How to convert Text to Binary (and Reverse) in C++?

Okay, this will be a very beginner question, Though I can´t seem to find a good resource on this topic.
What I want is simple. take a string (or char*) and convert it to a binary file that I can store somewhere on my system.
Then, at a later date, I want to be able to read that binary file and convert it back to a string (or char*).
Now...
Whenever I search for this I often get to the concept of Serialisation, which is basically what I want.
There´s a problem though, most often "Boost-Serialisation" is recommended. Which (IMO) is quite heavy for just converting simple text to binary and converting simple binary to text. (ok, I know it isn´t THAT easy, but you get the idea)
There has got to be an easier way to handle this. I hope you can help me find it. :D
Thank you very much in advance for your answers.

How to convert Text to Binary (and Reverse)
There's nothing to do. Text is already data, and the in-memory presentation of all data in any modern computer is always binary.
You need to know what you mean. If you just mean "write it to a file" (in any representation), then just do that:
std::string my_text;
std::ofstream ofs("myfile.bin", std::ios::binary);
ofs.write(my_text.data(), my_text.size());
If you need some specific representation (different character sets, encodings or even (archive) file formats) you might need to do that conversion.
Oh, lest I forget, to read-back:
std::ifstream ifs("myfile.bin", std::ios::binary);
std::string my_text(std::istreambuf_iterator<char>(ifs), {});

You just have to use bit manipulation tricks. C and C++ both have operators that allow you to run integer values through logic gates. So for example:
x = 3 & 1 ;
Will set x to 1 because when you do an AND operation you're taking every bit from the left side of the & and the corresponding bit from the right side of the & and putting them through an And operation.
You can also do bit shifting. Where you shift the bits over by some number. For example:
y = 1 << 2;
Will shift all the bits in the integer 1 over by two, and the new rightmost bits will be set to zero.
So the way to do this is: for every byte in the string, do an AND operation with 128 and then if the value from that is zero then that means the left-most bit is zero (and you print "0"), if not then the value is one (so you print "1"). Then shift it to the left by one and do the operation again. Do that eight times and you've converted one byte to binary.

You could use ofstream/ifstream. They work similar to cout/cin except it reads and writes files instead of the console. Maybe this link is helpful: https://www.cplusplus.com/doc/tutorial/files/

Any better alternatives for getting the digits of a number? (C++)

I know that you can get the digits of a number using modulus and division. The following is how I've done it in the past: (Psuedocode so as to make students reading this do some work for their homework assignment):
int pointer getDigits(int number)
initialize int pointer to array of some size
initialize int i to zero
while number is greater than zero
store result of number mod 10 in array at index i
divide number by 10 and store result in number
increment i
return int pointer
Anyway, I was wondering if there is a better, more efficient way to accomplish this task? If not, is there any alternative methods for this task, avoiding the use of strings? C-style or otherwise?
Thanks. I ask because I'm going to be wanting to do this in a personal project of mine, and I would like to do it as efficiently as possible.
Any help and/or insight is greatly appreciated.

The time it takes to extract the digits will be dwarfed by the time required to dynamically allocate the array. Consider returning the result in a struct:
struct extracted_digits
{
int number_of_digits;
char digits[12];
};
You'll want to pick a suitable value for the maximum number of digits (12 here, which is enough for a 32-bit integer). Alternatively, you could return a std::array<char, 12> and encode the terminal by using an invalid value (so, after the last value, store a 10 or something else that isn't a digit).
Depending on whether you want to handle negative values, you'll also have to decide how to report the unary minus (-).

Unless you want the representation of the number in a base that's a power of 2, that's about the only way to do it.

Smacks of premature optimisation. If profiling proves it matters, then be sure to compare your algo to itoa - internally it may use some CPU instructions that you don't have explicit access to from C++, and which your compiler's optimiser may not be clever enough to employ (e.g. AAM, which divs while saving the mod result). Experiment (and benchmark) coding the assembler yourself. You might dig around for assembly implementations of ITOA (which isn't identical to what you're asking for, but might suggest the optimal CPU instructions).

By "avoiding the use of strings", I'm going to assume you're doing this because a string-only representation is pretty inefficient if you want an integer value.
To that end, I'm going to suggest a slightly unorthodox approach which may be suitable. Don't store them in one form, store them in both. The code below is in C - it will work in C++ but you may want to consider using c++ equivalents - the idea behind it doesn't change however.
By "storing both forms", I mean you can have a structure like:
typedef struct {
int ival;
char sval[sizeof("-2147483648")]; // enough for 32-bits
int dirtyS;
} tIntStr;
and pass around this structure (or its address) rather than the integer itself.
By having macros or inline functions like:
inline void intstrSetI (tIntStr *is, int ival) {
is->ival = i;
is->dirtyS = 1;
}
inline char *intstrGetS (tIntStr *is) {
if (is->dirtyS) {
sprintf (is->sval, "%d", is->ival);
is->dirtyS = 0;
}
return is->sval;
}
Then, to set the value, you would use:
tIntStr is;
intstrSetI (&is, 42);
And whenever you wanted the string representation:
printf ("%s\n" intstrGetS(&is));
fprintf (logFile, "%s\n" intstrGetS(&is));
This has the advantage of calculating the string representation only when needed (the fprintf above would not have to recalculate the string representation and the printf only if it was dirty).
This is a similar trick I use in SQL with using precomputed columns and triggers. The idea there is that you only perform calculations when needed. So an extra column to hold the indexed lowercased last name along with an insert/update trigger to calculate it, is usually a lot more efficient than select lower(non_lowercased_last_name). That's because it amortises the cost of the calculation (done at write time) across all reads.
In that sense, there's little advantage if your code profile is set-int/use-string/set-int/use-string.... But, if it's set-int/use-string/use-string/use-string/use-string..., you'll get a performance boost.
Granted this has a cost, at the bare minimum extra storage required, but most performance issues boil down to a space/time trade-off.
And, if you really want to avoid strings, you can still use the same method (calculate only when needed), it's just that the calculation (and structure) will be different.
As an aside: you may well want to use the library functions to do this rather than handcrafting your own code. Library functions will normally be heavily optimised, possibly more so than your compiler can make from your code (although that's not guaranteed of course).
It's also likely that an itoa, if you have one, will probably outperform sprintf("%d") as well, given its limited use case. You should, however, measure, not guess! Not just in terms of the library functions, but also this entire solution (and the others).

It's fairly trivial to see that a base-100 solution could work as well, using the "digits" 00-99. In each iteration, you'd do a %100 to produce such a digit pair, thus halving the number of steps. The tradeoff is that your digit table is now 200 bytes instead of 10. Still, it easily fits in L1 cache (obviously, this only applies if you're converting a lot of numbers, but otherwise efficientcy is moot anyway). Also, you might end up with a leading zero, as in "0128".

Yes, there is a more efficient way, but not portable, though. Intel's FPU has a special BCD format numbers. So, all you have to do is just to call the correspondent assembler instruction that converts ST(0) to BCD format and stores the result in memory. The instruction name is FBSTP.

Mathematically speaking, the number of decimal digits of an integer is 1+int(log10(abs(a)+1))+(a<0);.
You will not use strings but go through floating points and the log functions. If your platform has whatever type of FP accelerator (every PC or similar has) that will not be a big deal ,and will beat whatever "sting based" algorithm (that is noting more than an iterative divide by ten and count)

possible sets of output-preserving code manipulations

This is a theoretical question, so expect that many details here are not computable in practice or even in theory.
Let's say I have a string s that I want to compress. The result should be a self-extracting binary (can be x86 asm but can also be some other hypothetical Turing-complete low level language) which outputs s.
Now, we can easily iterate through all possible such binaries/programs, ordered by size. Let B_s be the sub-list of these binaries who output s (of course B_s is uncomputable).
As every set of positive integers must have a minimum, there must be a smallest program b_min_s in B_s
From s, I can also construct a canonical program b_cano_s which just outputs s in a trivial way. I.e. the size of b_cano_s will be in O(#s) -- if we think of ELF with data segments, we will even have #b_cano_s ~ #s.
Is there a set A of possible operations on the binaries which:
1 . Will preserve the output.
2a . Given b_cano_s, we can arrive somehow by operations from A at b_min_s.
(2b . Given b_cano_s, we can arrive at all programs in B_s.)
for all possible strings s.
The conditions 1+2a are weaker than the conditions 1+2b. Maybe, if there is such a set A, we will automatically have both, though. (Is that so?)
Does such a set A exists? I was thinking about some obvious operations, like searching for some repeated strings and shorten this. Or some of the other common compression methods. However, that probably is not enough to arrive at all programs B_s and my intention says also not necessarily at b_min_s for the same reason.
If it exists, can we express it, i.e. is it computable?

You should link your related previous questions.
2a. As noted, you can not determine b_min_s, because that results in a paradox. As a result, I don't think you can prove the operations in A are sufficient to reduce to it.
2b. You can brute force B_s, but this is an infinite set, and the procedure is non-terminating. However, for each program in B_s, you can calculate a manipulation from b_cano_s to B_s. However, that does not imply these possible operations will be meaningful. It seems operations like "delete characters in this range", "insert character at this position" qualify.

C++ string comparison in one clock cycle

Is it possible to compare whole memory regions in a single processor cycle? More precisely is it possible to compare two strings in one processor cycle using some sort of MMX assembler instruction? Or is strcmp-implementation already based on that optimization?
EDIT:
Or is it possible to instruct C++ compiler to remove string duplicates, so that strings can be compared simply by their memory location? Instead of memcmp(a,b) compared by a==b (assuming that a and b are both native const char* strings).

Just use the standard C strcmp() or C++ std::string::operator==() for your string comparisons.
The implementations of them are reasonably good and are probably compiled to a very highly optimized assembly that even talented assembly programmers would find challenging to match.
So don't sweat the small stuff. I'd suggest looking at optimizing other parts of your code.

You can use the Boost Flyweight library to intern your immutable strings. String equality/inequality tests then become very fast since all it has to do at that point is compare pointers (pun not intended).

Not really. Your typical 1-byte compare instruction takes 1 cycle.
Your best bet would be to use the MMX 64-bit compare instructions( see this page for an example). However, those operate on registers, which have to be loaded from memory. The memory loads will significantly damage your time, because you'll be going out to L1 cache at best, which adds some 10x time slowdown*. If you are doing some heavy string processing, you can probably get some nifty speedup there, but again, it's going to hurt.
Other people suggest pre-computing strings. Maybe that'll work for your particular app, maybe it won't. Do you have to compare strings? Can you compare numbers?
Your edit suggests comparing pointers. That's a dangerous situation unless you can specifically guarantee that you won't be doing substring compares(ie, you are comparing some two byte strings: [0x40, 0x50] with [0x40, 0x42]. Those are not "equal", but a pointer compare would say they are).
Have you looked at the gcc strcmp() source? I would suggest that doing that would be the ideal starting place.
* Loosely speaking, if a cycle takes 1 unit, a L1 hit takes 10 units, an L2 hit takes 100 units, and an actual RAM hit takes really long.

It's not possible to perform general-purpose string operations in one cycle, but there are many optimizations you can apply with extra information.
If your problem domain allows the use of an aligned, fixed-size buffer for strings that fits in a machine register, you can perform single-cycle comparisons (not counting the load instructions).
If you always keep track of the lengths of your strings, you can compare lengths and use memcmp, which is faster than strcmp. If your application is multi-cultural, keep in mind that this only works for ordinal string comparison.
It appears you are using C++. If you only need equality comparisons with immutable strings, you can use a string interning solution (copy/paste link since I'm a new user) to guarantee that equal strings are stored at the same memory location, at which point you can simply compare pointers. See en.wikipedia.org/wiki/String_interning
Also, take a look at the Intel Optimization Reference Manual, Chapter 10 for details on the SSE 4.2's instructions for text processing. www.intel.com/products/processor/manuals/
Edit: If your problem domain allows the use of an enumeration, that is your single-cycle comparison solution. Don't fight it.

If you're optimizing for string comparisons, you may want to employ a string table (then you only need to compare the indexes of the two strings, which can be done in a single machine instruction).
If that's not feasible, you can also create a hashed string object that contains the string and a hash. Then most of the time you only have to compare the hashes if the strings aren't equal. If the hashes do match you'll have to do a full comparison though to make sure it wasn't a false positive.

It depends on how much preprocessing you do. C# and Java both have a process called interning strings which makes every string map to the same address if they have the same contents. Assuming a process like that, you could do a string equality comparison with one compare instruction.
Ordering is a bit harder.
EDIT: Obviously this answer is sidestepping the actual issue of attempting to do a string comparison within a single cycle. But it's the only way to do it unless you happen to have a sequence of instructions that can look at an unbounded amount of memory in constant time to determine the equivalent of a strcmp. Which is improbable, because if you had such an architecture the person who sold it to you would say "Hey, here's this awesome instruction that can do a string compare in a single cycle! How awesome is that?" and you wouldn't need to post a question on stackoverflow.
But that's just my reasoned opinion.

Or is it possible to instruct c++
compiler to remove string duplicates,
so that strings can be compared simply
by their memory location?
No. The compiler may remove duplicates internally, but I know of no compiler that guarantees or provides facilities for accessing such an optimisation (except possibly to turn it off). Certainly the C++ standard has nothing to say in this area.

Assuming you mean x86 ... Here is the Intel documentation.
But off the top of my head, no, I don't think you can compare more than the size of a register at a time.
Out of curiosity, why do you ask? I'm the last to invoke Knuth prematurely, but ... strcmp usually does a pretty good job.
Edit: Link now points to the modern documentation.

You can certainly compare more than one byte in a cycle. If we take the example of x86-64, you can compare up to 64-bits (8 bytes) in a single instruction (cmps), this isn't necessarily one cycle but will normally be in the low single digits (the exact speed depends on the specific processor version).
However, this doesn't mean you'll be able to all the work of comparing two arrays in memory much faster than strcmp :-
There's more than just the compare - you need to compare the two values, check if they are the same, and if so move to next chunk.
Most strcmp implementations will already be highly optimised, including checking if a and b point to the same address, and any suitable instruction-level optimisations.
Unless you're seeing alot of time spent in strcmp, I wouldn't worry about it - have you got a specific problem / use case you are trying to improve?

Even if both strings were cached, it wouldn't be possible to compare (arbitrarily long) strings in a single processor cycle. The implementation of strcmp in a modern compiler environment should be pretty much optimized, so you shouldn't bother to optimize too much.
EDIT (in reply to your EDIT):
You can't instruct the compiler to unify ALL duplicate strings - most compilers can do something like this, but it's best-effort only (and I don't know any compiler where it works across compilation units).
You might get better performance by adding the strings to a map and comparing iterators after that... the comparison itself might be one cycle (or not much more) then
If the set of strings to use is fixed, use enumerations - that's what they're there for.

Here's one solution that uses enum-like values instead of strings. It supports enum-value-inheritance and thus supports comparison similar to substring comparison. It also uses special character "¤" for naming, to avoid name collisions. You can take any class, function, or variable name and make it into enum-value (SomeClassA will become ¤SomeClassA).
struct MultiEnum
{
vector<MultiEnum*> enumList;
MultiEnum()
{
enumList.push_back(this);
}
MultiEnum(MultiEnum& base)
{
enumList.assign(base.enumList.begin(),base.enumList.end());
enumList.push_back(this);
}
MultiEnum(const MultiEnum* base1,const MultiEnum* base2)
{
enumList.assign(base1->enumList.begin(),base1->enumList.end());
enumList.assign(base2->enumList.begin(),base2->enumList.end());
}
bool operator !=(const MultiEnum& other)
{
return find(enumList.begin(),enumList.end(),&other)==enumList.end();
}
bool operator ==(const MultiEnum& other)
{
return find(enumList.begin(),enumList.end(),&other)!=enumList.end();
}
bool operator &(const MultiEnum& other)
{
return find(enumList.begin(),enumList.end(),&other)!=enumList.end();
}
MultiEnum operator|(const MultiEnum& other)
{
return MultiEnum(this,&other);
}
MultiEnum operator+(const MultiEnum& other)
{
return MultiEnum(this,&other);
}
};
MultiEnum
¤someString,
¤someString1(¤someString), // link to "someString" because it is a substring of "someString1"
¤someString2(¤someString);
void Test()
{
MultiEnum a = ¤someString1|¤someString2;
MultiEnum b = ¤someString1;
if(a!=¤someString2){}
if(b==¤someString2){}
if(b&¤someString2){}
if(b&¤someString){} // will result in true, because someString is substring of someString1
}
PS. I had definitely too much free time on my hands this morning, but reinventing the wheel is just too much fun sometimes... :)

How to handle arbitrarily large integers

I'm working on a programming language, and today I got the point where I could compile the factorial function(recursive), however due to the maximum size of an integer the largest I can get is factorial(12). What are some techniques for handling integers of an arbitrary maximum size. The language currently works by translating code to C++.

If you need larger than 32-bits you could consider using 64-bit integers (long long), or use or write an arbitrary precision math library, e.g. GNU MP.

If you want to roll your own arbitrary precision library, see Knuth's Seminumerical Algorithms, volume 2 of his magnum opus.

If you're building this into a language (for learning purposes I'd guess), I think I would probably write a little BCD library. Just store your BCD numbers inside byte arrays.
In fact, with today's gigantic storage abilities, you might just use a byte array where each byte just holds a digit (0-9). You then write your own routine to add, subtract multiply and divide your byte arrays.
(Divide is the hard one, but I bet you can find some code out there somewhere.)
I can give you some Java-like psuedocode but can't really do C++ from scratch at this point:
class BigAssNumber {
private byte[] value;
// This constructor can handle numbers where overflows have occurred.
public BigAssNumber(byte[] value) {
this.value=normalize(value);
}
// Adds two numbers and returns the sum. Originals not changed.
public BigAssNumber add(BigAssNumber other) {
// This needs to be a byte by byte copy in newly allocated space, not pointer copy!
byte[] dest = value.length > other.length ? value : other.value;
// Just add each pair of numbers, like in a pencil and paper addition problem.
for(int i=0; i<min(value.length, other.value.length); i++)
dest[i]=value[i]+other.value[i];
// constructor will fix overflows.
return new BigAssNumber(dest);
}
// Fix things that might have overflowed 0,17,22 will turn into 1,9,2
private byte[] normalize(byte [] value) {
if (most significant digit of value is not zero)
extend the byte array by a few zero bytes in the front (MSB) position.
// Simple cheap adjust. Could lose inner loop easily if It mattered.
for(int i=0;i<value.length;i++)
while(value[i] > 9) {
value[i] -=10;
value[i+1] +=1;
}
}
}
}
I use the fact that we have a lot of extra room in a byte to help deal with addition overflows in a generic way. Can work for subtraction too, and help on some multiplies.

There's no easy way to do it in C++. You'll have to use an external library such as GNU Multiprecision, or use a different language which natively supports arbitrarily large integers such as Python.

Other posters have given links to libraries that will do this for you, but it seem like you're trying to build this into your language. My first thought is: are you sure you need to do that? Most languages would use an add-on library as others have suggested.
Assuming you're writing a compiler and you do need this feature, you could implement integer arithmetic functions for arbitrarily large values in assembly.
For example, a simple (but non-optimal) implementation would represent the numbers as Binary Coded Decimal. The arithmetic functions could use the same algorithms as you'd use if you were doing the math with pencil and paper.
Also, consider using a specialized data type for these large integers. That way "normal" integers can use the standard 32 bit arithmetic.

My prefered approach would be to use my current int type for 32-bit ints(or maybe change it to internally to be a long long or some such, so long as it can continue to use the same algorithms), then when it overflows, have it change to storing as a bignum, whether of my own creation, or using an external library. However, I feel like I'd need to be checking for overflow on every single arithmetic operation, roughly 2x overhead on arithmetic ops. How could I solve that?

If I were implement my own language and want to support arbitrary length numbers, I will use a target language with the carry/borrow concept. But since there is no HLL that implements this without severe performance implications (like exceptions), I will certainly go implement it in assembly. It will probably take a single instruction (as in JC in x86) to check for overflow and handle it (as in ADC in x86), which is an acceptable compromise for a language implementing arbitrary precision. Then I will use a few functions written in assembly instead of regular operators, if you can utilize overloading for a more elegant output, even better. But I don't expect generated C++ to be maintainable (or meant to be maintained) as a target language.
Or, just use a library which has more bells and whistles than you need and use it for all your numbers.
As a hybrid approach, detect overflow in assembly and call the library function if overflow instead of rolling your own mini library.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js