class Solution {
public:
    vector<vector<int>> threeSum(vector<int>& nums) {
        vector<int> v;
        vector<vector<int>> ans;
        int n = nums.size();
        sort(nums.begin(), nums.end());
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                for (int k = j + 1; k < n; k++) {
                    if (nums[i] + nums[j] + nums[k] == 0 && i != j && i != k && j != k) {
                        v.push_back(nums[i]);
                        v.push_back(nums[j]);
                        v.push_back(nums[k]);
                        ans.push_back(v);
                    }
                }
            }
        }
        return ans;
    }
};
It is not showing an error, but it is displaying a wrong answer, as given below:
Input: [-1, 0, 1, 2, -1, 4]
Your output: [[-1, -1, 2], [-1, -1, 2, -1, 0, 1], [-1, -1, 2, -1, 0, 1, -1, 0, 1]]
Expected output: [[-1, -1, 2], [-1, 0, 1]]
I can understand the problem with pushing back more and more values into my vector v. OK.
But maybe somebody could give me a hint on how to tackle the problem with the duplicates?
Any help for me as a new user is highly welcome and appreciated.
Of course, we will help you here on SO.
Starting with a new language is never that easy, and there may be some things that are not immediately clear in the beginning. Additionally, I apologize for any rude comments that you may see, but you can be assured that the vast majority of the members of SO are very supportive.
I first want to give you some information on pages like Leetcode and Codeforces and the like, often also referred to as "competitive programming" pages. Sometimes people misunderstand this and think that you have only a limited time to submit the code. That is not the case. There are such timed competitions, but usually not on the mentioned pages. The bad thing is that the coding style used in those real competition events is also used on the online pages. And that is really bad, because this coding style is so horrible that no serious developer would survive one day in a real company that needs to earn money with software and is then liable for it.
So, these pages will never teach you or guide you how to write good C++ code. And even worse, if newbies start learning the language and see this bad code, then they learn bad habits.
But what is then the purpose of such pages?
The purpose is to find a good algorithm, mostly optimized for runtime execution speed and often also for low memory consumption.
So, they are aiming at a good design. The language or coding style does not matter to them. So, you can submit even completely obfuscated code or "code golf" solutions; as long as it is fast, it does not matter.
So, never start to code immediately as a first step. First, think for 3 days. Then, take some design tool, like for example a piece of paper, and sketch a design. Then refactor your design, and then refactor your design, and then refactor your design, and so on. This may take a week.
And next, search for an appropriate programming language that you know and that can handle your design.
And finally, start coding. Because you did a good design before, you can use long and meaningful variable names and write many many comments, so that other people (and you, after one month) can understand your code AND your design.
OK, maybe understood.
Now, let's analyze your code. You selected a brute force solution with a triple nested loop. That could work for a low number of elements, but will in most cases result in a so-called TLE (Time Limit Exceeded) error. Nearly all problems on those pages cannot be solved with brute force. Brute force solutions are always an indicator that you did not do the above design steps. And this leads to additional bugs.
Your code has two major semantic bugs.
You define at the beginning a std::vector with the name "v". Then, in the loop, after you found a triplet meeting the given condition, you push_back the results into this std::vector. This means you add 3 values to the std::vector "v", and now there are 3 elements in it. In the next loop run, after finding the next fit, you again push_back 3 additional values into your std::vector "v", and now there are 6 elements in it. In the next round 9 elements, and so on.
How to solve that?
You could use the std::vector's clear function to delete the old elements from the std::vector inside the innermost loop's if statement. But that is basically not that good and, additionally, time consuming. Better is to follow the general idiom of defining variables as late as possible, at the point where they are needed. So, if you defined your std::vector "v" inside the if statement, then the problem would be gone. But then you would additionally notice that it is used only there and nowhere else. And hence, you do not need it at all.
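A minimal sketch of both variants, using the names from your code (illustrative only):
// Variant 1: reuse "v", but clear the old triplet first
if (nums[i] + nums[j] + nums[k] == 0) {
    v.clear(); // drop the elements from the previous find
    v.push_back(nums[i]);
    v.push_back(nums[j]);
    v.push_back(nums[k]);
    ans.push_back(v);
}

// Variant 2: define the vector as late as possible, inside the if
if (nums[i] + nums[j] + nums[k] == 0) {
    vector<int> triplet{ nums[i], nums[j], nums[k] };
    ans.push_back(triplet);
}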
You may have seen that you can add values to a std::vector by using an initializer list. Something like:
std::vector<int> v {1,2,3};
With that know-how, you can delete your std::vector “v” and all related code and directly write:
ans.push_back( { nums[i], nums[j], nums[k] } );
Then you would save 3 unnecessary push_back (and one clear) operations, and, more important, you would not get result sets with more than 3 elements.
Next problem: duplicates. You try to prevent the storage of duplicates by writing && i!=j && i!=k && j!=k. But this will not work in general, because you compare indices and not values, and because the comparison itself is wrong. The Boolean expression is a tautology: it is always true. You initialize your variable j with i+1, and therefore "i" can never be equal to "j". So the condition i != j is always true. The same is valid for the other variables.
But how to prevent duplicate entries? You could do some logical comparisons, or first store all the triplets and later use std::unique (or other functions) to eliminate duplicates, or use a container that only stores unique elements, like a std::set. For the given design, which already has a time complexity of O(n^3) and is therefore extremely slow, adding a std::set will not make things noticeably worse. I checked that in a small benchmark. For real speed, the only solution is a completely different design. We will come to that later. Let us first fix the code, still using the brute force approach.
Please look at the below somewhat short and elegant solution:
vector<vector<int>> threeSum(vector<int>& nums) {
    std::set<vector<int>> ans;
    int n = nums.size();
    sort(nums.begin(), nums.end());
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            for (int k = j + 1; k < n; k++)
                if (nums[i] + nums[j] + nums[k] == 0)
                    ans.insert({ nums[i], nums[j], nums[k] });
    return { ans.begin(), ans.end() };
}
But, unfortunately, because of the poor design decision, it is 20000 times slower for big input than a better design. And, because the online test programs work with big input vectors, the program will not pass the runtime constraints.
How do we come to a better solution? We need to carefully analyze the requirements, and we can also use some existing know-how for similar kinds of problems.
And if you read some books or internet articles, then you often get the hint that the so-called "sliding window" is the proper approach to get a reasonable solution.
You will find useful information here. But you can of course also search here on SO for answers.
For this problem, we will use a typical 2-pointer approach, but modified for the specific requirements of this problem. Basically a start value and a moving and closing window...
The analysis of the requirements leads to the following idea.
If all evaluated numbers are > 0, then we can never have a sum of 0.
It would be easy to identify duplicate numbers if they were next to each other.
--> Sorting the input values will be very beneficial.
This will eliminate the test for half of the values with randomly distributed input vectors. See:
std::vector<int> nums { 5, -1, 4, -2, 3, -3, -1, 2, 1, -1 };
std::sort(nums.begin(), nums.end());
// Will result in
// -3, -2, -1, -1, -1, 1, 2, 3, 4, 5
And with that we see that if we shift our window to the right, then we can stop the evaluation as soon as the start of the window hits a positive number. Additionally, we can immediately identify duplicate numbers.
Then next: if we start at the beginning of the sorted vector, this value will most likely be very small. And if we start the window with one plus the start index of the current value, then we will have "very" negative numbers. And to get a 0 by summing 2 "very" negative numbers, we need a very positive number. And this is at the end of the std::vector.
Start with
startPointerIndex 0, value -3
Window start = startPointerIndex + 1 --> value -2
Window end = lastIndexInVector --> 5
And yes, we have already found a solution. Now we need to check for duplicates. If there were an additional 5 at the 2nd-to-last position, then we could skip it; it would not add a different solution. So, we can decrement the end window pointer in such a case. The same is valid if there were an additional -2 at the beginning of the window: then we would need to increment the start window pointer, to avoid a duplicate finding from that end.
The same is valid for the start pointer index. Example: startPointerIndex = 3 (counting indices from 0), the value will be -1. But the value before, at index 2, is also -1. So there is no need to evaluate that, because we evaluated it already.
The above methods will prevent the creation of duplicate entries.
But how to continue the search? If we cannot find a solution, then we will narrow down the window. This we will also do in a smart way: if the sum is too big, then obviously the right window value was too big, and we should better use the next smaller value for the next comparison.
The same on the starting side of the window: if the sum was too small, then we obviously need a bigger value. So, let us increment the start window pointer. And we do this (making the window smaller) until we find a solution or until the window is closed, meaning the start window pointer is no longer smaller than the end window pointer.
Now we have developed a reasonably good design and can start coding.
We additionally try to implement a good coding style, and refactor the code for some faster implementations.
Please see:
class Solution {
public:
    // Define some type aliases for easier typing and understanding later
    using DataType = int;
    using Triplet = std::vector<DataType>;
    using Triplets = std::vector<Triplet>;
    using TestData = std::vector<DataType>;

    // Function to identify all unique triplets (3 elements) in a given test input
    Triplets threeSum(TestData& testData) {

        // In order to save function overhead for repeatedly getting the size of the test data,
        // we will store the size of the input data in a const temporary variable
        const size_t numberOfTestDataElements{ testData.size() };

        // If the given input test vector is empty, we immediately return an empty result vector
        if (!numberOfTestDataElements) return {};

        // In later code we often need the last valid element of the input test data.
        // Since indices in C++ start with 0, the value will be size - 1.
        // With that we later avoid unnecessary subtractions in the loop
        const size_t numberOfTestDataElementsMinus1{ numberOfTestDataElements - 1u };

        // Here we will store all the found, valid and unique triplets
        Triplets result{};

        // In order to save the time for later memory reallocations and copying tons of data,
        // we reserve memory to hold all results only one time. This will speed up operations by 5 to 10%
        result.reserve(numberOfTestDataElementsMinus1);

        // Now sort the input test data to be able to find an end condition, if all elements are
        // greater than 0, and to identify duplicates more easily
        std::sort(testData.begin(), testData.end());

        // These variables will define the size of the sliding window
        size_t leftStartPositionOfSlidingWindow, rightEndPositionOfSlidingWindow;

        // Now, we will evaluate all values of the input test data from left to right.
        // As an optimization, we additionally define a 2nd running variable k,
        // to avoid later additions in the loop, where i+1 would need to be calculated.
        // This can be better done with a running variable that will just be incremented
        for (size_t i = 0, k = 1; i < numberOfTestDataElements; ++i, ++k) {

            // If the current value from the input test data is greater than 0,
            // a sum with the result of 0 will no longer be possible. We can stop now
            if (testData[i] > 0) break;

            // Prevent evaluation of duplicates based on the current input test data
            if (i and (testData[i] == testData[i - 1])) continue;

            // Open the window and determine start and end index.
            // The start index is always one past the currently evaluated index of the input test data.
            // The end index is always the last element
            leftStartPositionOfSlidingWindow = k;
            rightEndPositionOfSlidingWindow = numberOfTestDataElementsMinus1;

            // Now, as long as the window is not closed, meaning not fully narrowed, we will evaluate
            while (leftStartPositionOfSlidingWindow < rightEndPositionOfSlidingWindow) {

                // Calculate the sum of the currently addressed values
                const int sum = testData[i] + testData[leftStartPositionOfSlidingWindow] + testData[rightEndPositionOfSlidingWindow];

                // If the sum is too small, then the value on the left side of the sorted window is too small.
                // Therefore take the next value on the left side and try again. So, make the window smaller
                if (sum < 0) {
                    ++leftStartPositionOfSlidingWindow;
                }
                // Else, if the sum is too big, then the value on the right side of the window was too big.
                // Use the next smaller value, one to the left of the current closing position of the window.
                // So, make the window smaller
                else if (sum > 0) {
                    --rightEndPositionOfSlidingWindow;
                }
                else {
                    // According to the above conditions, we have now found a triplet fulfilling the requirements.
                    // So store this triplet as a result
                    result.push_back({ testData[i], testData[leftStartPositionOfSlidingWindow], testData[rightEndPositionOfSlidingWindow] });

                    // We now need to handle duplicates at the edges of the window, so at the left and right edge.
                    // For this, we remember the values at the current window edges
                    const DataType lastLeftValue = testData[leftStartPositionOfSlidingWindow];
                    const DataType lastRightValue = testData[rightEndPositionOfSlidingWindow];

                    // Check the left edge. As long as we have duplicates here, we will shift the opening position of the window to the right.
                    // Because of Boolean short-circuit evaluation we will first do the comparison for duplicates. This will give us 5% more speed
                    while (testData[leftStartPositionOfSlidingWindow] == lastLeftValue && leftStartPositionOfSlidingWindow < rightEndPositionOfSlidingWindow)
                        ++leftStartPositionOfSlidingWindow;

                    // Check the right edge. As long as we have duplicates here, we will shift the closing position of the window to the left.
                    // Because of Boolean short-circuit evaluation we will first do the comparison for duplicates. This will give us 5% more speed
                    while (testData[rightEndPositionOfSlidingWindow] == lastRightValue && leftStartPositionOfSlidingWindow < rightEndPositionOfSlidingWindow)
                        --rightEndPositionOfSlidingWindow;
                }
            }
        }
        return result;
    }
};
The above solution will outperform 99% of other solutions. I made many benchmarks to prove that.
It additionally contains tons of comments to explain what is going on there. And I have selected "speaking" and meaningful variable names for a better understanding.
I hope, that I could help you a little.
And finally: I dedicate this answer to Sam Varshavchik and PaulMcKenzie.
Problem 1: suppose you have an array of n floats and you want to calculate an array of n running averages over three elements. The middle part would be straightforward:
for (int i=0; i<n; i++)
    b[i] = (a[i-1] + a[i] + a[i+1])/3.;
But you need to have separate code to handle the cases i==0 and i==(n-1). This is often done with extra code before the loop, extra code after the loop, and adjusting the loop range, e.g.
b[0] = (a[0] + a[1])/2.;
for (int i=1; i<n-1; i++)
    b[i] = (a[i-1] + a[i] + a[i+1])/3.;
b[n-1] = (a[n-1] + a[n-2])/2.;
Even that is not enough, because the cases of n<3 need to be handled separately.
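For completeness, here is a sketch of how the whole of Problem 1 could look with the small-n cases handled up front (the function name and the choice to average whatever neighbors exist are my assumptions):
#include <vector>

// Sketch: running average over a 3-element window, with explicit edge handling.
// For n == 1 the only sensible "average" is the element itself; for n == 2 we
// average the two elements -- these choices are assumptions, not requirements.
std::vector<double> runningAverage3(const std::vector<double>& a) {
    const std::size_t n = a.size();
    std::vector<double> b(n);
    if (n == 0) return b;
    if (n == 1) { b[0] = a[0]; return b; }
    b[0] = (a[0] + a[1]) / 2.;                    // left edge
    for (std::size_t i = 1; i + 1 < n; ++i)
        b[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.; // interior
    b[n - 1] = (a[n - 2] + a[n - 1]) / 2.;        // right edge
    return b;
}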
Problem 2. You are reading a variable-length code from an array (say implementing a UTF-8 to UTF-32 converter). The code reads a byte, and accordingly may read one or more bytes to determine the output. However, before each such step, it also needs to check if the end of the input array has been reached, and if so, perhaps load more data into a buffer, or terminate with an error.
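To make this concrete, here is a minimal sketch of the bounds check I mean before consuming a multi-byte sequence (the helper names are made up, and the decode itself is elided):
#include <cstddef>
#include <cstdint>

// Sketch: how many bytes a UTF-8 sequence occupies, from its lead byte.
// Returns 0 for an invalid lead byte (e.g. a stray continuation byte).
static std::size_t utf8SequenceLength(std::uint8_t lead) {
    if (lead < 0x80) return 1;           // ASCII
    if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx
    return 0;                            // invalid lead byte
}

// The loop's edge handling: before consuming a sequence, check that it fits
// entirely in the remaining input; otherwise refill the buffer or fail.
bool decodeAll(const std::uint8_t* data, std::size_t size) {
    std::size_t i = 0;
    while (i < size) {
        const std::size_t len = utf8SequenceLength(data[i]);
        if (len == 0) return false;       // invalid input
        if (i + len > size) return false; // truncated sequence: refill or error out
        // ... decode data[i .. i+len-1] into UTF-32 here ...
        i += len;
    }
    return true;
}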
Both of these problems are cases of loops where the interior of the loop can be expressed neatly, but the edges need special handling. I find these sort of problems the most prone to error and to messy programming. So here's my question:
Are there any C++ idioms which generalize wrapping such loop patterns in a clean way?
Efficiently and elegantly handling boundary conditions is troublesome in any programming language -- C++ has no magic hammer for this. This is a common problem with applying convolution filters to signals / images -- what do you do at the image boundaries where your kernel goes outside the image support?
There are generally two things you are trying to avoid:
out of bounds array indexing (which you must avoid), and
special computation (which is inelegant and results in slower code due to extra branching).
There are usually three approaches:
Avoid the boundaries -- this is the simplest approach and is often sufficient, since the boundary cases make up a tiny slice of the problem and can be ignored.
Extend the bounds of your buffer -- add extra columns/rows of padding to the array so the same code used in the general case can be used at the edges. Of course this raises the problem of what values to place in the padding -- this often depends on the problem you are solving and is considered in the next approach.
Special computation at the boundary -- this is what you do in your example. Of course, how you do this is problem dependent and raises a similar issue to the previous approach -- what is the correct thing to do when my filter (in your case an averaging filter) extends beyond the array support? What should I consider the values to be outside the array support? Most image filter libraries provide some form of extrapolation option -- for example (a sketch of one such accessor follows this list):
assume a value zero or some other constant (define a[i] = 0 if i < 0 || i >= n),
replicate the boundary value (e.g. a[i] = a[0] if i < 0 and a[i] = a[n-1] if i >= n)
wrap the value (define a[i] = a[(i + n) % n] -- makes sense in some cases -- e.g., texture filters)
mirror the border (e.g. a[i] = a[abs(i+1)] if i < 0 and a[i] = a[2n - i - 1] if i >= n)
other special case (what you do)
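As a sketch of the border-replication option above (the helper name clampedAt is made up; std::clamp requires C++17):
#include <algorithm>
#include <vector>

// Sketch: replicate-border accessor, so the general 3-tap loop also works at the edges.
double clampedAt(const std::vector<double>& a, long i) {
    const long n = static_cast<long>(a.size());
    return a[std::clamp(i, 0L, n - 1)]; // i < 0 reads a[0]; i >= n reads a[n-1]
}

void average3(const std::vector<double>& a, std::vector<double>& b) {
    for (long i = 0; i < static_cast<long>(a.size()); ++i)
        b[i] = (clampedAt(a, i - 1) + clampedAt(a, i) + clampedAt(a, i + 1)) / 3.;
}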
When reasonable, it's best to separate the special case from the general case (like you do) to avoid inelegant and slow general cases. One could always wrap/hide the special case and general case in a function or operator (e.g., overload operator[]), but this only sugar-coats the problem, like any contrived C++ idiom would. In a multi-threaded environment (e.g. CUDA / SIMD) you can do some other tricks by preloading out-of-bounds values, but you are still stuck with the same problem.
This is why programmers use the phrase "edge case" when referring to any kind of special-case programming; it is often a time sink and a source of annoying errors. Some languages that efficiently support exception handling for out-of-bounds array indexing (e.g. Ada) can make for prettier code, but still cause the same pain.
Unfortunately, the answer is NO.
There are no C++ idioms which generalize wrapping such loop patterns in a clean way!
You can do it by writing something like this, but you still need to adjust the window size.
template <typename T, int N>
T subscript(T (&data)[N], int index) {
    if (index < 0 || index >= N) {
        return 0;
    }
    return data[index];
}

for (int i = 0; i < n; ++i) {
    b[i] = (subscript(a, i - 1) + subscript(a, i) + subscript(a, i + 1)) / 3.;
}
Most of the for loops I have read or written start from 0. To be fair, most of the code I have read is used for embedded systems and was in C/C++. In embedded systems, readability is in some cases not as important as code efficiency. Therefore, I am not sure which of the following would be a better choice:
Version 1
for(i = 0; i < allowedNumberOfIteration; i++)
{
    //something that may take from 1 iteration to allowedNumberOfIteration before it happens
    if(somethingHappened)
    {
        if(i + 1 > maxIteration)
        {
            maxIteration = i + 1;
        }
    }
}
Version 2
for(i = 1; i <= allowedNumberOfIteration; i++)
{
    //something that may take from 1 iteration to allowedNumberOfIteration before it happens
    if(somethingHappened)
    {
        if(i > maxIteration)
        {
            maxIteration = i;
        }
    }
}
Why the first version is better in my opinion:
1. Most loops start with 0. So, maybe experienced programmers find it better if it starts from 0.
Why the second version is better in my opinion:
To be fair, if there were an array in the function, starting from 0 would be great, because array indexes start from zero. But in this part of the code no arrays are used.
Besides, the second version looks simpler, because you do not have to think about the '+1'.
Things I do not know
1) Is there any performance difference?
2) Which version is better?
3) Are there any other aspect that should be considered in deciding the starting point?
4) Am I worrying too much?
1) No
2) Neither
3) Arrays in C and C++ are zero-based.
4) Yes.
Arrays of all forms in C++ are zero-based. I.e., their indexes start at zero and go up to the size of the array minus one. For example, an array of five elements will have the indexes 0 to 4 (inclusive).
That is why most loops in C++ start at zero.
As for your specific list of questions, for 1 there might be a performance difference: if you start a loop at 1, then you might need to subtract 1 in each iteration if you use the value as an array index. Or, if you increase the size of the arrays instead, then you use more memory.
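For illustration, a minimal sketch of that extra subtraction (the array and its size are made up):
int a[8] = {0};
const int n = 8;

// 0-based: the loop variable is directly usable as the array index
for (int i = 0; i < n; i++)
    a[i] = i;

// 1-based: every array access needs an extra "- 1"
for (int i = 1; i <= n; i++)
    a[i - 1] = i - 1;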
For 2 it really depends on what you're iterating over. Is it over array indexes, then the loop starting at zero is clearly better. But you might need to start a loop at any value, it really depends on what you're doing and the problem you're trying to solve.
For 3, what you need to consider is what you're using the loop for.
And 4, maybe a little. ;)
This argument comes from a small, 3-page note by the famous computer scientist Dijkstra (the one from Dijkstra's algorithm). In it, he lays out the reasons we might index starting at zero, and the story begins with trying to iterate over a sequence of natural numbers (meaning a sequence on the number line 0, 1, 2, 3, ...).
There are 4 possibilities to index 2, 3, ..., 12.
a.) 2 <= i < 13
b.) 1 < i <= 12
c.) 2 <= i <= 12
d.) 1 < i < 13
He mentions that a.) and b.) have the advantage that the difference of the two bounds equals the number of elements in the sequence. He also mentions if two sequences are adjacent, the upper bound of one equals the lower bound of the other. He says this doesn't help decide between a.) or b.) so he will start afresh.
He immediately removes b.) and d.) from the list since, if we were to start a natural sequence with zero, they would have bounds outside the natural numbers (-1), which is "ugly". He completes the observation by saying we prefer <= for the lower bound -- leaving us with a.) and c.).
For an empty set, he notes that b.) and c.) would have -1 as the upper bound, which is also "ugly".
All three of these observations lead to the convention to represent a sequence of natural numbers with a.), and that indeed is how most people write a for that goes over an array: for(int i = 0; i < size; ++i). We include the lower bound (0 <= i), and we exclude the upper bound (i < size).
If you were to use something like for(int i = 0; i <= iterations - 1; ++i) to do i iterations, you can see the ugliness he refers to in the case of the empty set. iterations - 1 would be -1 for zero iterations.
So by convention, we use a.), and due to indexing arrays at zero, we start a huge number of for loops with i = 0. Then, we reason by parsimony: might as well do different things the exact same way if there is no reason to do one of them differently.
Now, if we were to use a.) with 1-based indexing into an array instead of 0-based indexing, we would get for(int i = 1; i < size + 1; ++i). The + 1 is "ugly", so we prefer to start our range with i = 0.
In conclusion, you should do a for iterations times with for(int i = 0; i < iterations; ++i). Something like for(int i = 1; i <= iterations; ++i) is fairly understandable and works, but is there any good reason to add a different way to loop iterations times? Just use the same pattern as when indexing an array. In other words, use 0 <= i < size. Worse, the loop based on 1 <= i <= iterations doesn't have all the reasons Dijkstra came up with to support using 0 <= i < iterations as a convention.
You're not worrying too much. In fact, Dijkstra himself wondered the exact same question as has pretty much any serious programmer. Tuning your style like a craftsman who loves their trade is the ground a great programmer stands on. Pursuing parsimony and writing code the way others tend to write code (including yourself - the looping of an array!) are both sane, great things to pursue.
Due to this convention, when I see for(i = 1, I notice a departure from a convention. I am then more cautious around that code, thinking the logic within the for might depend on starting at 1 instead of 0. This is slight, but there's no reason to add that possibility when a convention is so widely used. If you happen to have a large for body, this complaint becomes less slight.
To understand why starting at one makes no sense, consider taking the argument to its natural conclusion - the argument of "but it makes sense to me!": You can start i at anything! If we free ourselves from convention, why not loop for(int i = 5; i <= iterations + 4; ++i)? Or for(int i = -5; i > -iterations - 5; --i)? Just do it the way a majority of programmers do in the majority of cases, and save being different for when there's a good reason - the difference signals to the programmer reading your code that the body of the for contains something unusual. With the standard way, we know the for is either indexing/ordering/doing arithmetic with a sequence starting at 0 or executing some logic iterations times in a row.
Note how prevalent this convention is too. In C++, every standard container iterates between [start, end), which corresponds to a.) above. There, they do it so that the end condition can be iter != end, but the fact that we already do the logic one way and that that one way has no immediate drawbacks flows naturally into the argument of "Why do it two different ways when we already do it this way in this context?" In his little paper, Dijkstra also notes a language called Mesa that can do a.), b.), c.), or d.) with particular syntax. He claims that there, a.) has won out in practice, and the others are associated with the cause of bugs. He then laments how FORTRAN indexes at 1 and how PASCAL took on c.) by convention.
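As a small illustration of that [start, end) convention (the container contents are made up):
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v{ 10, 20, 30 };
    // begin() points at the first element, end() one PAST the last element:
    // the half-open range [begin, end) corresponds to Dijkstra's option a.)
    for (auto it = v.begin(); it != v.end(); ++it)
        std::cout << *it << '\n';
}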
I am parsing files around 1MB in size, reading the first 300KB and searching for a number of particular signatures. My strategy is, for each byte, see if the byte is in a map/vector/whatever of bytes that I know might be at the start of a signature, and if so look for the full signature - for this example, assume those leading bytes are x37, x50, and x52. Processing a total of 90 files (9 files 10 times actually), the following code executes in 2.122 seconds:
byte * bp = &buffer[1];
const byte * endp = buffer + bytesRead - 30; // a little buffer for optimization - no signature is that long
//multimap<byte, vector<FileSignature> >::iterator lb, ub;
map<byte, vector<FileSignature> >::iterator findItr;
vector<FileSignature>::iterator intItr;

while (++bp != endp)
{
    if (*bp == 0x50 || *bp == 0x52 || *bp == 0x37) // Comparison line
    {
        findItr = mapSigs.find(*bp);
        for (intItr = findItr->second.begin(); intItr != findItr->second.end(); intItr++)
        {
            bool bMatch = true;
            for (UINT i = 1; i < intItr->mSignature.size(); ++i)
            {
                if (intItr->mSignature[i] != bp[i])
                {
                    bMatch = false;
                    break;
                }
            }
            if (bMatch)
            {
                CloseHandle(fileHandle);
                return true;
            }
        }
    }
}
However, my initial implementation finishes in a sluggish 84 seconds. The only difference is related to the line labeled "// Comparison line" above:
findItr = mapSigs.find(*bp);
if (findItr != mapSigs.end())
...
A very similar implementation using a vector containing the 3 values also results in extremely slow processing (190 seconds):
if (find(vecFirstChars.begin(), vecFirstChars.end(), *bp) != vecFirstChars.end())
{
    findItr = mapSigs.find(*bp);
    ...
But an implementation accessing the elements of the vector directly performs rather well (8.1 seconds). Not as good as the static comparisons, but still far far better than the other options:
if (vecFirstChars[0] == *bp || vecFirstChars[1] == *bp || vecFirstChars[2] == *bp)
{
    findItr = mapSigs.find(*bp);
    ...
The fastest implementation so far (inspired by Component 10 below) is the following, clocking in at about 2.0 seconds:
bool validSigs[256] = {0};
validSigs[0x37] = true;
validSigs[0x50] = true;
validSigs[0x52] = true;

while (++bp != endp)
{
    if (validSigs[*bp])
    {
        ...
Extending this to use a second validSigs table to check whether the 2nd char is valid as well brings the total run time down to 0.4 seconds.
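A sketch of what that two-table extension might look like (validSigs2 and its contents are my assumptions, not the actual code):
// Hypothetical second table: validSigs2[c] is true if c can be the
// SECOND byte of some signature, filled from the known signatures, e.g.:
bool validSigs2[256] = {0};
validSigs2[0x4B] = true; // assuming some signature continues with 0x4B

while (++bp != endp)
{
    // endp already leaves a safety margin, so reading bp[1] stays in bounds
    if (validSigs[bp[0]] && validSigs2[bp[1]])
    {
        // only now fall back to the full signature comparison
    }
}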
I feel the other implementations should perform better. Especially the map, which should scale as more signature prefixes are added, and whose searches are O(log(n)) vs O(n). What am I missing? My only shot-in-the-dark guess is that with the static comparisons and (to a lesser extent) the vector indexing, the values used for the comparison are cached in a register or other location that makes this significantly faster than reading from memory. If this is true, am I able to explicitly tell the compiler that particular values are going to be used often? Are there any other optimizations that I can take advantage of for this code that are not apparent?
I am compiling with Visual Studio 2008.
This is simple enough to come down to the number of instructions executed. The vector, map, or lookup table will reside entirely in the CPU level 1 data cache so memory access isn't taking up time. As for the lookup table, as long as most bytes don't match a signature prefix the branch predictor will stop flow control from taking up time. (But the other structures do incur flow control overhead.)
So quite simply, comparing against each value in the vector in turn requires 3 comparisons. The map is O(log N), but the coefficient (which is ignored by big-O notation) is large due to navigating a linked data structure. The lookup table is O(1) with a small coefficient because access to the structure can be completed by a single machine instruction, and then all that remains is one comparison against zero.
The best way to analyze performance is with a profiler tool such as valgrind/kcachegrind.
The "compare against constants" compares 3 memory addresses against 3 constants. This case is going to be extremely easy to do things like unroll or do bit optimization on, if the compiler feels like it. The only branches that the written ASM is going to have here are going to be highly predictable.
For the literal 3 element vector lookup, there is the additional cost of dereferencing the addresses of the vector values.
For the vector loop, the compiler has no idea how big the vector is at this point. So it has to write a generic loop. This loop has a branch in it, a branch that goes one way 2 times, then the other way. If the computer uses the heuristic "branches go the way they did last time", this results in lots of branch prediction failures.
To verify that theory, try making the branching more predictable -- search for each element in up to 100 different input bytes at a time, then search for the next one. That will make naive branch prediction work on the order of 98% of the time, instead of the 33% in your code. I.e., scan 100 (or whatever) characters for signature 0, then 100 (or whatever) characters for signature 1, until you run out of signatures. Then go on to the next block of 100 characters to scan for signatures. I chose 100 because I'm trying to avoid branch prediction failures, and I figure a few percent branch prediction failures isn't all that bad. :)
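A sketch of that blockwise idea (the signatures container, the block size, and the buffer handling are assumptions to keep it short):
#include <algorithm>

const ptrdiff_t kBlock = 100; // block size suggested above
for (const byte* blockStart = buffer; blockStart < endp; blockStart += kBlock)
{
    const byte* blockEnd = std::min(blockStart + kBlock, endp);
    // Scan the whole block for ONE signature prefix at a time, so the
    // "did this byte match?" branch is almost always false and predictable.
    for (size_t s = 0; s < signatures.size(); ++s)
        for (const byte* p = blockStart; p < blockEnd; ++p)
            if (*p == signatures[s].mSignature[0])
            {
                // candidate found -> do the full signature comparison as before
            }
}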
As for the map solution, well maps have a high constant overhead, so it being slow is pretty predictable. The main uses of a map are dealing with large n lookups, and the fact that they are really easy to code against.
Find the middle of the string or array with an unknown length. You may not traverse the list to find the length. You may not use anything to help you find the length - as it is "unknown" (i.e. no sizeof (C) or count (C#) etc...).
I had this question as an interview question. I'm just wondering what the answer is. I did ask if I could use sizeof; he said "no, the size of the string or array is unknown - you just need to get to the middle."
BTW, I'm not sure if this is actually possible to solve with no traversing. I almost felt as though he may have wanted to see how confident I am in my answer :S not sure...
His English was bad - also not sure if this contributed to misunderstandings. He directly told me that I do not need to traverse the list to get to the middle :S :S I'm assuming he meant no traversing at all..... :S
Have two counters, c1 and c2. Begin traversing the list, incrementing c1 every time and c2 every other time. By the time c1 gets to the end, c2 will be in the middle.
You haven't "traversed the list to find the length" and then divided it by two, you've just gone through once.
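A minimal sketch of that idea on a C string (for a linked list the stepping would use node pointers; here the "end" is the '\0' terminator):
#include <iostream>

// Sketch: slow/fast scan -- fast advances 2 steps per iteration, slow advances 1,
// so slow ends up at the middle when fast reaches the terminator.
const char* middleOf(const char* s) {
    const char* slow = s;
    const char* fast = s;
    while (*fast != '\0' && *(fast + 1) != '\0') {
        ++slow;
        fast += 2;
    }
    return slow;
}

int main() {
    std::cout << *middleOf("abcde") << '\n'; // prints 'c'
}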
The only other way I can think of would be to keep taking off the first and last item of the list until you are left with the one(s) in the middle.
You (or your interviewer) are very vague in what the data is (you mentioned "string" and "array"); there's no assumption that can be made, so it can be anything.
You mentioned that the length of the string is unknown, but from your wording it might seem like you (or the interviewer) actually meant to say unknowable.
a) If it's just unknown, then the question is, how can it be determined? In the case of strings, for example, you can consider the end to be '\0'. You can then apply some algorithms like the ones suggested by the other answers.
b) If it's unknowable, the riddle has no solution. The concept of middle has no meaning without a beginning and an end.
Bottom line, you cannot talk about a middle without a beginning and an end, or a length. Either this question was intentionally unanswerable, or you did not understand it properly. You must know more than just the beginning of the memory segment and maybe its type.
The following code will find the middle of an array WITHOUT traversing the list
int thearray[80];
auto start = reinterpret_cast<std::uintptr_t>(&thearray);
auto end = reinterpret_cast<std::uintptr_t>(&thearray + 1);
auto half = ((end - start) / sizeof(int)) / 2;
std::cout << half << std::endl;
EDITS:
This code assumes you are dealing with an actual array and not a pointer to the first element of one; thus code like:
int *pointer_to_first_element = (int *)malloc(someamount);
will not work, likewise with any other notation that decays the array reference into a pointer to the first element - basically any notation using the *.
You would just use the difference between the addresses of the first and last elements.
I think this problem is aimed to also test your skills in problem analysis and requirements gathering. As others have stated before, we will need at least another piece of data to solve this issue.
My approach is to make clear to the interviewer that we can solve the problem with one constraint on the function call: the caller must provide 2 pointers, one to the beginning and another to the end of the array. Given those 2 pointers, and using basic pointer arithmetic, I reach this solution; please let me know what you think about it.
int *findMiddleArray( int const *first, int const *last )
{
    if( first == NULL || last == NULL || first > last )
    {
        return NULL;
    }
    if( first == last )
    {
        return (int *)first;
    }
    size_t dataSize    = ( size_t )( first + 1 ) - ( size_t )first,
           addFirst    = ( size_t )first,
           addLast     = ( size_t )last,
           arrayLen    = ( addLast - addFirst ) / dataSize + 1,
           arrayMiddle = arrayLen % 2 > 0 ? arrayLen / 2 + 1 : arrayLen / 2;

    return ( int * )( ( arrayMiddle - 1 ) * dataSize + addFirst );
}
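A possible usage example for this function (the array contents are made up):
#include <iostream>

int main() {
    int data[5] = { 10, 20, 30, 40, 50 };
    // pass pointers to the first and the last element
    int *mid = findMiddleArray( data, data + 4 );
    if( mid != NULL ) std::cout << *mid << std::endl; // prints 30
}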
One way you can find the midpoint of an array (for an odd-length array):
Just use two index variables: the first starts traversing from index 0, and the other (in a nested loop) traverses from the last index of the array. Now just compare the indices; when they become the same, that is the midpoint of the array, i.e. if(i == j). Hope you got the point!
For an even-length array, you can check if(i == j-1) or if(i == j+1), as the two indices will never be equal. Try it with a dry run!