Double error correction - hamming-code

Is there a method to correct double errors using Hamming code?

No. But there are other coding schemes that do this.

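Actually yes. Codes built on the same idea as Hamming codes are capable of double error correction.
You just need to increase the minimum Hamming distance from 3 to 5, since a code with minimum distance d can correct up to floor((d-1)/2) errors.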
In his original paper, Hamming talked a lot about the (7,4) Hamming code, so
now whenever you google Hamming code you get that information and not much
more. The (7,4) code is a perfect code, so it is pretty useful.
Double error correction codes are going to be less efficient.
For a block length of n=7, I could only find two codewords at a Hamming
distance of at least 5 from each other, so that their decoding regions do not overlap.
For a block length of n=10, I could only achieve 8 codewords. That would make it a (10,3) code. That's a lot better than just repeating the message
5 times (which would be a (15,3) code), but there are much better ways
to do this.
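One way to reproduce counts like these is a greedy search over all n-bit words. The sketch below is only an illustration, not necessarily how the numbers above were found; the names `hamming_distance` and `greedy_code` are made up for this example.

```cpp
#include <bitset>
#include <cstdint>
#include <vector>

// Hamming distance between two words stored in the low bits of an integer.
int hamming_distance(std::uint32_t a, std::uint32_t b) {
    return static_cast<int>(std::bitset<32>(a ^ b).count());
}

// Greedy search (hypothetical helper): visit all n-bit words in increasing
// order and keep a word only if it is at least min_distance away from every
// word kept so far. Assumes a small n, since the loop visits all 2^n words.
std::vector<std::uint32_t> greedy_code(int n, int min_distance) {
    std::vector<std::uint32_t> code;
    const std::uint32_t limit = std::uint32_t{1} << n;
    for (std::uint32_t word = 0; word < limit; ++word) {
        bool far_enough = true;
        for (std::uint32_t kept : code) {
            if (hamming_distance(word, kept) < min_distance) {
                far_enough = false;
                break;
            }
        }
        if (far_enough) code.push_back(word);
    }
    return code;
}
```

Calling `greedy_code(7, 5).size()` and `greedy_code(10, 5).size()` gives the number of codewords the greedy search keeps for each block length. A greedy search is not guaranteed to find the largest possible code, so treat the result as a lower bound.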

Related

C++ armadillo not correctly solving poorly conditioned matrix

I have a relatively simple question regarding the linear solver built into Armadillo. I am a relative newcomer to C++ but have experience coding in other languages. I am solving a fluid flow problem by successive linearization, using the Armadillo function solve(A, b) to get the solution at each iteration.
The issue that I am running into is that my matrix is very ill-conditioned. The determinant is on the order of 10^-20 and the condition number is 75000. I know these are terrible conditions but it's what I've got. Does anyone know if it is possible to specify the precision in my A matrix and in the solve function to something beyond double (long double perhaps)? I know that there are double matrix classes in Armadillo but I haven't found any documentation for higher levels of precision.
To approach this from another angle, I wrote some code in Mathematica and the LinearSolve worked very well and the program converged to the correct answer. My reasoning is that Mathematica variables have higher precision which can handle the higher levels of rounding error.
If anyone has any insight on this, please let me know. I know there are other ways to approach a poorly conditioned matrix (like preconditioning and pivoting), but my work is more in the physics than in the actual numerical solution so I'm trying to steer clear of that.
EDIT: I just limited the precision in the Mathematica version to 15 decimal places and the program still converges. This leads me to believe it is NOT a variable precision question but rather an issue with the method.
As you said "your work is more in the physics": rather than trying to increase the accuracy, I would use the Moore-Penrose pseudoinverse, which in Armadillo can be obtained with the function pinv. You should then experiment a bit with the tolerance parameter to set it to a reasonable level.
The geometrical interpretation is as follows: bad condition numbers are due to the fact that the row/column vectors are nearly linearly dependent. In physics, such linear dependencies usually have an origin which at least needs to be interpreted. The pseudoinverse first projects the matrix onto a lower-dimensional space in which the vectors are "less linearly dependent" by dropping all singular vectors with singular values smaller than the tolerance parameter. The resulting matrix has a better condition number, so the inverse on that subspace can be constructed with fewer problems.
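Written out, the truncation described above looks roughly like this. The symbols are my own notation, not Armadillo's: \tau stands for the tolerance, \sigma_i for the singular values, and r for the number of singular values kept.

```latex
\begin{align*}
A &= U \Sigma V^{\mathsf{T}}, &
\Sigma &= \operatorname{diag}(\sigma_1, \dots, \sigma_n), \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0 \\
A^{+} &= V \Sigma^{+} U^{\mathsf{T}}, &
(\Sigma^{+})_{ii} &=
\begin{cases}
1/\sigma_i & \text{if } \sigma_i > \tau \\
0 & \text{otherwise}
\end{cases} \\
x &= A^{+} b, &
\kappa_{\text{eff}} &= \frac{\sigma_1}{\sigma_r}, \quad
r = \max\{\, i : \sigma_i > \tau \,\}
\end{align*}
```

Raising \tau drops more of the small singular values, which lowers the effective condition number \kappa_{\text{eff}} at the price of ignoring the directions that were dropped.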

Ideas Related to Subset Sum with 2,3 and more integers

I've been struggling with this problem just like everyone else, and I'm quite sure there have been more than enough posts explaining it. However, in order to understand it fully, I wanted to share my thoughts and get more efficient solutions to the Subset Sum problem from all the great people here.
I've searched the Internet and there are actually a lot of sources, but I really want to re-implement an algorithm or find my own in order to understand it fully.
The key thing I'm struggling with is efficiency, considering the set size will be large. (I do not have a limit, just conceptually large.) The phases I'm trying to implement ideas for are finding two numbers that sum to a given integer T, then three numbers, and eventually K numbers. Some ideas I've thought of:
For the two-integer part I'm thinking of basically sorting the array in O(n log n) and, for each element x in the array, searching for its complement T - x (for T = 0 that is its negative value, i.e. if the array element is 3, searching for -3). Maybe a hash table would be better, since it provides O(1) lookup of the element?
For three or more integers I've found an amazing blog post: http://www.skorks.com/2011/02/algorithms-a-dropbox-challenge-and-dynamic-programming/. However, even the author himself states that it is not applicable to large numbers.
So I was wondering what ideas could be applied to the subset sum problem for 2, 3, and more integers. I'm struggling with setting up a dynamic programming method that will be efficient for large inputs as well.
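The hash table idea for the two-integer case can be sketched as follows. This is a minimal illustration; the function name `find_pair_with_sum` is made up, and it assumes the two numbers must come from different positions in the array.

```cpp
#include <optional>
#include <unordered_set>
#include <utility>
#include <vector>

// Returns a pair (a, b) of elements from different positions with
// a + b == target, or std::nullopt if no such pair exists.
// Expected O(n) time thanks to the hash set lookups.
std::optional<std::pair<long long, long long>>
find_pair_with_sum(const std::vector<long long>& values, long long target) {
    std::unordered_set<long long> seen;  // elements visited so far
    for (long long x : values) {
        const long long complement = target - x;
        if (seen.count(complement) != 0) {
            return std::make_pair(complement, x);
        }
        seen.insert(x);
    }
    return std::nullopt;
}
```

Checking for the complement before inserting the current element keeps a single element from being paired with itself.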
That blog post you linked to looked pretty great, actually. This is, after all, an NP-complete problem...
But I bet you could speed it up even further. I haven't done any benchmarks, but I'm gonna guess that his use of a matrix is his single biggest time sink. First, it'll take a huge amount of memory for some really trivial inputs (For example: [-1000, 1000] will need 2001 columns! Good grief!), and then you're wasting a ton of cycles scanning through each row looking for "T"s, which are often gonna be pretty sparse.
So instead: Use a "set" data structure. That'll keep space and iteration time to a minimum,* but store values just as well: If it's in the set, it's a "T"; otherwise, it's an "F".
Hope that helps!
*: Of course, "minimum" doesn't necessarily = "small."
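Here is a rough sketch of the set-based idea, written in C++ rather than the language of the blog post. The function name `has_subset_sum` is made up, and it only answers whether some non-empty subset hits the target, not which one.

```cpp
#include <unordered_set>
#include <vector>

// Returns true if some non-empty subset of `values` sums to `target`.
// Only sums that are actually reachable are stored, instead of one
// matrix column per possible sum. The set can still grow exponentially
// in the worst case.
bool has_subset_sum(const std::vector<long long>& values, long long target) {
    std::unordered_set<long long> reachable;  // sums of non-empty subsets so far
    for (long long v : values) {
        // Collect the new sums first so the set is not modified while iterating.
        std::vector<long long> fresh{v};
        for (long long s : reachable) {
            fresh.push_back(s + v);
        }
        reachable.insert(fresh.begin(), fresh.end());
        if (reachable.count(target) != 0) {
            return true;
        }
    }
    return false;
}
```

For the input [-1000, 1000] the set only ever holds three sums (-1000, 1000 and 0), compared with the 2001 columns mentioned above.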

c++ numerical analysis Accurate data structure?

I wrote a cubic spline interpolation algorithm using the double type.
It seems to work, but there is a relative error of around 6% when very small values are calculated.
Is double data type enough for accurate scientific numerical analysis?
Double has plenty of precision for most applications. Of course it is finite, but it's always possible to squander any amount of precision by using a bad algorithm. In fact, that should be your first suspect. Look hard at your code and see if you're doing something that lets rounding errors accumulate quicker than necessary, or risky things like subtracting values that are very close to each other.
Scientific numerical analysis is difficult to get right, which is why I leave it to the professionals. Have you considered using a numeric library instead of writing your own? Eigen is my current favorite here: http://eigen.tuxfamily.org/index.php?title=Main_Page
I always have close at hand the latest copy of Numerical Recipes (nr.com) which does have an excellent chapter on interpolation. NR has a restrictive license but the writers know what they are doing and provide a succinct writeup on each numerical technique. Other libraries to look at include: ATLAS and GNU Scientific Library.
To answer your question, double should be more than enough for most scientific applications. I agree with the previous posters that it sounds like an algorithm problem. Have you considered posting the code for the algorithm you are using?
Whether double is enough for your needs depends on the type of numbers you are working with. As Henning suggests, it is probably best to take a look at the algorithms you are using and make sure they are numerically stable.
For starters, here's a good algorithm for addition: Kahan summation algorithm.
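A minimal sketch of Kahan summation, in case it helps (the function name `kahan_sum` is just for illustration). Note that aggressive floating-point flags such as `-ffast-math` may let the compiler optimize the correction away.

```cpp
#include <vector>

// Kahan (compensated) summation: carries a running correction term that
// captures the low-order bits lost in each addition.
double kahan_sum(const std::vector<double>& values) {
    double sum = 0.0;
    double compensation = 0.0;  // estimate of the rounding error accumulated so far
    for (double x : values) {
        const double y = x - compensation;  // apply the correction to the next term
        const double t = sum + y;           // low-order bits of y may be lost here
        compensation = (t - sum) - y;       // recover what was lost
        sum = t;
    }
    return sum;
}
```

Summing many small values of similar size, for example a million copies of 0.1, is a typical case where the compensated sum stays much closer to the exact result than a plain loop.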
Double precision will be mostly suitable for any problem but the cubic spline will not work well if the polynomial or function is quickly oscillating or repeating or of quite high dimension.
In this case it can be better to use Legendre Polynomials since they handle variants of exponentials.
By way of a simple example, if you use Euler's, the trapezoidal, or Simpson's rule on a 3rd order polynomial, you won't need a huge sample rate to get the integral (area under the curve). However, if you apply these to an exponential function, the sample rate may need to increase greatly to avoid losing a lot of precision. Legendre polynomials can cater for this case much more readily.

4 randomly pulled cards at least one would be ace

Please help me with a C++ program that I don't know how to write. The question is below.
There is a well mixed deck of 32 cards. Method of statistical tests to obtain the probability of an event that of the 4 randomly pulled charts at least one would be ace.
Compare the estimated probability with the true value and report the error (the true probability value is approximately equal to 0.432). Vary the number of experiments n.
What are the odds of not drawing an ace in one draw?
In four successive draws?
What are the odds that that doesn't happen?
From what I understand of your question, you have already calculated the odds of drawing the ace, but now need a program to prove it.
Shuffle your cards.
Draw 4 cards.
Check your hand for the presence of an ace.
Repeat these steps n times, where n is the number of tests you need to make. Your final, "proven" probability is a/n, where a is the number of times an ace came up.
Of course, given the nature of randomness, there's no way to guarantee that your result will match the mathematical answer for any finite n. The typical error shrinks roughly like 1/sqrt(n), so it only vanishes as n goes to infinity.
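The steps above can be sketched in C++ as follows. This makes two assumptions the question does not spell out: the 32-card deck contains 4 aces, and the 4 cards are drawn without replacement. The function names are made up for the example. Under those assumptions the exact value works out to roughly 0.43, in line with the figure quoted in the question.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Exact probability of at least one ace among `hand` cards drawn without
// replacement: 1 minus the chance that every draw misses the aces.
double exact_probability(int deck_size, int aces, int hand) {
    double none = 1.0;
    for (int i = 0; i < hand; ++i) {
        none *= static_cast<double>(deck_size - aces - i) / (deck_size - i);
    }
    return 1.0 - none;
}

// Monte Carlo estimate: shuffle, look at the first `hand` cards, count the
// trials in which at least one of them is an ace. Deck size and ace count
// default to the assumed 32 and 4.
double estimate_probability(int trials, unsigned seed,
                            int deck_size = 32, int aces = 4, int hand = 4) {
    std::mt19937 rng(seed);
    std::vector<int> deck(deck_size);
    std::iota(deck.begin(), deck.end(), 0);  // cards 0 .. aces-1 play the aces
    int hits = 0;
    for (int t = 0; t < trials; ++t) {
        std::shuffle(deck.begin(), deck.end(), rng);
        const bool has_ace = std::any_of(
            deck.begin(), deck.begin() + hand,
            [aces](int card) { return card < aces; });
        if (has_ace) ++hits;
    }
    return static_cast<double>(hits) / trials;
}
```

To study the error, call `estimate_probability` with increasing `trials` (say 100, 10000, 1000000) and print the absolute difference from `exact_probability(32, 4, 4)`.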
Unfortunately I need to 'answer' rather than comment as I would wish because my rep is not high enough to allow me to do so.
There is information missing which will make it impossible to be sure of providing a correctly functioning program.
Most importantly, coming to your problem from a mathematical/probability background:
I need to know for sure how many of the reduced deck of 32 cards are aces!
Unfortunately this sentence:
"Method of statistical tests to obtain the probability of an event that of the 4 randomly pulled charts at least one would be ace."
is mathematical gobbledygook!
You need to correctly quote the sentences given to you in your assignment.
Those sentences hold vital information on which depends what the c++ program is to simulate!

Matrix implementation benchmarks, should I whip myself?

I'm trying to find some matrix multiplication/inversion benchmarks online. My C++ implementation can currently invert a 100 x 100 matrix in 38 seconds, but compared to this benchmark I found, my implementation's performance really sucks. I don't know if it's a super-optimized something or if you really can easily invert a 200 x 200 matrix in about 0.11 seconds, so I'm looking for more benchmarks to compare the results. Have you got any good links?
UPDATE
I spotted a bug in my multiplication code that didn't affect the result but was wasting cycles. Now my inversion executes in 20 seconds. It's still a lot of time, and any idea is welcome.
Thank you folks
This sort of operation is extremely cache sensitive. You want to be doing most of your work on variables that are in your L1 & L2 cache. Check out section 6 of this doc:
http://people.redhat.com/drepper/cpumemory.pdf
He walks you through optimizing a matrix multiply in a cache-optimized way and gets some big perf improvements.
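As a small taste of what cache-friendliness means here, the sketch below uses one of the simplest tricks: storing the matrix in one contiguous row-major buffer and ordering the loops so the innermost one walks memory sequentially. This is not the code from the linked document, and the `Matrix` type and `multiply` function are made up for the example. It assumes both operands are square and the same size.

```cpp
#include <cstddef>
#include <vector>

// Square matrix stored row-major in one contiguous buffer.
struct Matrix {
    std::size_t n;
    std::vector<double> data;

    explicit Matrix(std::size_t size) : n(size), data(size * size, 0.0) {}

    double& at(std::size_t r, std::size_t c) { return data[r * n + c]; }
    double at(std::size_t r, std::size_t c) const { return data[r * n + c]; }
};

// C = A * B with the loops ordered i, k, j, so the innermost loop reads a
// row of B and writes a row of C sequentially instead of striding down a
// column. Operands are taken by const reference, so nothing is copied.
Matrix multiply(const Matrix& a, const Matrix& b) {
    const std::size_t n = a.n;
    Matrix c(n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a.at(i, k);
            for (std::size_t j = 0; j < n; ++j) {
                c.at(i, j) += aik * b.at(k, j);
            }
        }
    }
    return c;
}
```

Swapping the two inner loops back to the textbook i, j, k order gives the same result but strides through B column by column, which tends to be noticeably slower for large matrices.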
Check if you are passing huge matrix objects by value (this could be costly if it copies the whole matrix).
If possible, pass by reference.
The thing about matrices and C++ is that you want to avoid copying as much as possible.
So your main object should probably not contain the "matrix data" but rather contain metadata about the matrix and a pointer (wrapped in something smart) to the data portion. Thus when copying an object you only copy a small chunk of data, not the whole thing (see string implementations for an example).
Why do you need to implement your own matrix library in the first place? As you've already discovered, there are already extremely efficient libraries available doing the same thing. And as much as people like to think of C++ as a performance language, that's only true if you're really good at the language. It is extremely easy to write terribly slow code in C++.
"I don't know if it's a super-optimized something or if really you can easily invert a 200 x 200 matrix in about 0.11 seconds"
MATLAB does that without breaking a sweat either. Are you implementing the LAPACK routines for matrix inversion (e.g. LU decomposition)?
Have you tried profiling it?
Following this paper (pdf), the calculation for a 100x100 matrix with LU decomposition will need about 1348250 floating point operations. A Core 2 can do around 20 Gflops (processor metrics). So theoretically speaking you can do an inversion in well under 1 ms (1348250 / 20e9 is roughly 0.07 ms).
Without the code it is pretty difficult to say what causes the large gap. In my experience, micro-optimizations like loop unrolling, caching values, SSE, threading, etc. will only get you a speed-up that is at best a constant factor of your current time (which may be enough for you).
But if you want an order of magnitude speed increase you should take a look at your algorithm; perhaps your implementation of LU decomposition has a bug. Another place to look is the organization of your data: try a different layout that puts row/column elements together in memory.
The LINPACK benchmarks are based on solving linear algebra problems. They're available for different machines and languages. Maybe they can help you, too.
LINPACK C++ libraries available here, too.
I actually gained about 7 seconds using **double**s instead of **long double**s, but that's not a great deal since I lost half of my precision.