Algorithm to change an already sampled guitar string to sound like it was a bit damped by the hand - C++

I'm trying to simulate the damping of a hand on a guitar string, applied to an already recorded/sampled open guitar string sound. I've been trying to use a low-pass filter with a moving cutoff frequency, but that didn't make it sound like a damped string - it just sounded like the loss of the higher frequencies.
Could someone help me find good material on this, something a human could at least grasp a bit?
It's going to be implemented in C++. I have been searching and found almost everything about the Karplus-Strong string algorithm, but that's not what I want. I want the damping applied to a sample of an already recorded, actually played string.

This is probably not as simple as you think. It is not just a matter of finding the right filter: the sound will also decay faster, and the decay rate is likely different for different frequencies.
If you have a guitar at your disposal, you could measure the sound spectrum over time when you strike a string normally, and again while you damp it. You can measure the difference in the initial spectrum as well as the difference in decay rate.
You can apply this information to the sound you want to alter, but you'd need to convert the signal to frequency-vs-time first.
But this may be far too complicated for what you had in mind. A simpler approach could be to first increase the decay by multiplying the signal by e^(-w*t), with w as the extra decay rate. You could also split the signal into low-pass and high-pass components and apply different decay rates, with the high-frequency component getting the faster decay.
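For concreteness, here is a rough C++ sketch of that simpler approach (not a physical model): split the recorded sample into low/high bands with a one-pole low-pass and give each band its own extra decay e^(-w*t). The function name, cutoff, and decay rates are invented starting points to tune by ear, and a mono float buffer with a known sample rate is assumed.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Sketch: apply an artificial "hand damping" envelope to a recorded string
    // sample by splitting it into bands and decaying each band differently.
    void applyHandDamping(std::vector<float>& samples, float sampleRate,
                          float cutoffHz = 800.0f,   // band split point (assumed)
                          float lowDecay = 3.0f,     // extra decay for lows, 1/s
                          float highDecay = 25.0f)   // much faster decay for highs
    {
        const float pi = 3.14159265f;
        const float a = std::exp(-2.0f * pi * cutoffHz / sampleRate); // one-pole coeff
        float lpState = 0.0f;

        for (std::size_t n = 0; n < samples.size(); ++n) {
            const float t = static_cast<float>(n) / sampleRate;

            // Split into low and high frequency components
            lpState = (1.0f - a) * samples[n] + a * lpState;
            const float low  = lpState;
            const float high = samples[n] - low;

            // Apply a decaying exponential e^(-w*t) to each band
            samples[n] = low  * std::exp(-lowDecay  * t)
                       + high * std::exp(-highDecay * t);
        }
    }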

Related

Improved SGBM based on previous frames result

I was wondering if there is any good method to make SGBM processing faster by using information from the previous video frame.
I think it can be made faster by searching for correspondences only near the disparity found in the previous frame. The problem I see with this is when, from one frame to the next, a block passes from an object to the background or vice versa. If it is possible, I think this would be an interesting improvement to make, but I have looked for it and didn't find anything.
You have identified the problem yourself: what happens when the scene is in motion.
I managed to write an algorithm that takes into consideration the critical zones around objects' borders; it was a little more accurate but much slower than SGBM.
Maybe you can simply set the maximum and the minimum disparity to a reasonable range around what you found in the previous frame instead of "safe values" - a sketch of that idea follows below. In my experience with OpenCV, StereoBM is faster but not as good as SGBM, and SGBM is better optimized than anything you are likely to write yourself (again, in my experience).
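Here is a hypothetical C++/OpenCV sketch of that idea, assuming the previous frame's disparity map is available; the margin, the fallback range, and the function name are my own assumptions, and SGBM's fixed-point disparities (scaled by 16) are taken into account.

    #include <algorithm>
    #include <opencv2/opencv.hpp>

    // Sketch: build the next frame's SGBM matcher with its disparity search
    // window restricted to what the previous frame produced.
    cv::Ptr<cv::StereoSGBM> makeSgbmFromPreviousFrame(const cv::Mat& prevDisparity)
    {
        int minDisp = 0;
        int numDisp = 128;                        // fallback "safe" range

        if (!prevDisparity.empty()) {
            double lo = 0.0, hi = 0.0;
            cv::minMaxLoc(prevDisparity, &lo, &hi, nullptr, nullptr,
                          prevDisparity > 0);     // ignore invalid (<= 0) pixels
            int prevMin = static_cast<int>(lo / 16.0);   // SGBM output is 16 * disparity
            int prevMax = static_cast<int>(hi / 16.0);

            const int margin = 16;                // allow some motion between frames
            minDisp = std::max(0, prevMin - margin);
            numDisp = (prevMax + margin) - minDisp;
            numDisp = std::max(((numDisp + 15) / 16) * 16, 16); // must be a multiple of 16
        }

        return cv::StereoSGBM::create(minDisp, numDisp, /*blockSize=*/5);
    }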
Maybe you can get better (faster) results using the CUDA algorithm (SGBM processed on the GPU). My group and I are working on that.

How to organize data (writing your own profiler)

I was thinking about using reflection to generate a profiler. Let's say I am generating code without a problem; how do I properly measure or organize the results? I'm mostly concerned about CPU time, but memory suggestions are welcome.
There are lots of bad ways to write profilers.
I wrote what I thought was a pretty good one over 20 years ago.
That is, it made a decent demo, but when it came down to serious performance tuning I concluded there really is nothing that works better, and gives better results, than the dumb old manual method, and here's why.
Anyway, if you're writing a profiler, here's what I think it should do:
It should sample the stack at unpredictable times, and each stack sample should contain line number information, not just functions, in the code being tuned. It's not so important to have that in system functions you can't edit.
It should be able to sample during blocked time like I/O, sleeps, and locking, because those are just as likely to result in slowness as CPU operations.
It should have a hot-key that the user can use to enable sampling during the times they actually care about (not, for example, while the program is waiting for the user to do something).
Do not assume that measurement precision is necessary, requiring a large number of frequent samples. This is incredibly basic, and it is a major reversal of common wisdom. The reason is simple - it does no good to measure problems precisely if the price you pay is failing to find them.
That's what happens with profilers - speedups hide from them, so the user is content with finding maybe one or two small speedups while giant ones get away.
Giant speedups are the ones that take a large percentage of time, and the number of stack samples it takes to find them is inversely proportional to the time they take. If the program spends 30% of its time doing something avoidable, it takes (on average) 2/0.3 = 6.67 samples before it is seen twice, and that's enough to pinpoint it.
To answer your question, if the number of samples is small, it really doesn't matter how you store them. Print them to a file if you like - whatever.
It doesn't have to be fast, because you don't sample while you're saving a sample.
What does allow those speedups to be found is when the user actually looks at and understands individual samples. Profilers have all kinds of UI - hot spots, call counts, hot paths, call graphs, call trees, flame graphs, phony 3-digit "statistics", blah, blah.
Even if it's well done, that's only timing information.
It doesn't tell you why the time is spent, and that's what you need to know.
Make eye candy if you want, but let the user see the actual samples.
... and good luck.
ADDED: A sample looks something like this:
main:27, myFunc:16, otherFunc:9, ..., someFunc:132
That means main is at line 27, calling myFunc. myFunc is at line 16, calling otherFunc, and so on. At the end, it's in someFunc at line 132, not calling anything (or calling something you can't identify).
No need for line ranges.
(If you're tempted to worry about recursion - don't. If the same function shows up more than once in a sample, that's recursion. It doesn't affect anything.)
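If you want something concrete to start from, here is a small C++ sketch of grabbing one sample with glibc's backtrace (POSIX/glibc only - an assumption about your platform). It prints raw symbol names/addresses, outermost caller first; turning those into the function:line pairs shown above still needs debug info plus a tool such as addr2line, which is omitted here.

    #include <cstdio>
    #include <cstdlib>
    #include <execinfo.h>   // glibc-specific; other platforms need other APIs

    // Capture and print one stack sample in "caller, caller, ..., leaf" form.
    void takeStackSample()
    {
        void* frames[64];
        int depth = backtrace(frames, 64);
        char** symbols = backtrace_symbols(frames, depth);
        if (!symbols)
            return;

        for (int i = depth - 1; i >= 0; --i)           // main ... leaf
            std::printf("%s%s", symbols[i], i ? ", " : "\n");

        std::free(symbols);   // backtrace_symbols allocates with malloc
    }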
You don't need a lot of samples.
When I did it, sampling was not automatic at all.
I would just have the user press both shift keys simultaneously, and that would trigger a sample.
So the user would grab like 10 or 20 samples, but it is crucial that the user take the samples during the phase of the program's execution that annoys the user with its slowness,
like between the time some button is clicked and the time the UI responds.
Another way is to have a hot-key that runs sampling on a timer while it is pressed.
If the program is just a command-line app with no user input, it can just sample all the time while it executes.
The frequency of sampling does not have to be fast.
The goal is to get a moderate number of samples during the program phase that is subjectively slow.
If you take too many samples to look at, then when you look at them you need to select some at random.
The thing to do when examining a sample is to look at each line of code in the sample so you can fully understand why the program was spending that instant of time.
If it is doing something that might be avoided,
and if you see a similar thing on another sample, you've found a speedup.
How much of a speedup? This much (the math is here):
For example, if you look at three samples, and on two of them you see avoidable code, fixing it will give you a speedup - maybe less, maybe more, but on average 4x.
(That's what I mean by giant speedup. The way you get it is by studying individual samples, not by measuring anything.)
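For the curious, here is a sketch of the sort of calculation the linked math presumably does (my assumption): with a uniform prior on the avoidable fraction f, seeing it on 2 of 3 samples gives a Beta(3,2) posterior with density 12*f^2*(1-f), and the expected speedup factor is

    E[1/(1-f)] = integral from 0 to 1 of 12*f^2*(1-f)/(1-f) df
               = 12 * integral from 0 to 1 of f^2 df
               = 4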
There's a video here.

Spike Filtering in Real time C++

I'm trying to implement a spike filter on some torque values that I'm reading in from an SEA in real time. As of now, we're using a moving average to replace spike values that cross a certain threshold. (We're getting spikes because the actuator sometimes messes up and produces a sudden spike.)
I am trying to figure out a better, more accurate way to filter the spikes, so that it more accurately predicts what the torque would have been instead of the spike.
BTW, this is a c++ program.
Thanks!
If your torque isn't changing very fast, the easiest way to filter spikes is a so-called "slew rate limiter". The operation is trivial and can easily be implemented in any language. You need to store the last good value. When you get a reading, compare it with the last one: if it's larger, increment the stored value; if it's smaller, decrement it.
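A minimal C++ sketch of that idea, assuming readings arrive one at a time; maxStep (the largest change allowed per sample) is a tuning value you would set from how fast the real torque can change.

    #include <algorithm>

    // Slew-rate limiter: the output follows the input, but by at most maxStep
    // per sample, so a sudden spike only drags the output a little.
    class SlewRateLimiter {
    public:
        explicit SlewRateLimiter(double maxStep) : maxStep_(maxStep) {}

        double filter(double reading) {
            if (!initialized_) {               // accept the very first reading
                last_ = reading;
                initialized_ = true;
                return last_;
            }
            double step = std::clamp(reading - last_, -maxStep_, maxStep_);
            last_ += step;
            return last_;
        }

    private:
        double maxStep_;
        double last_ = 0.0;
        bool initialized_ = false;
    };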

Convert all doubles to integers for better performance, is it just a rumor?

I have a very complicated and sophisticated data-fitting program which uses the Levenberg-Marquardt algorithm to do fitting in double precision (the fitting class is templatized, but I instantiate it with doubles). The fitting process involves:
Calculating an error function (chi-square)
Solving a system of linear equations (I use LAPACK for that)
Calculating the derivatives of the function with respect to the parameters I want to fit to the data (usually 20+ parameters)
Calculating the function value repeatedly: the function is a complicated combination of sinusoidal and exponential functions with a few harmonics.
A colleague of mine has suggested that I use integers instead, claiming it will be at least 10 times faster. My questions are:
Is it true that I will get that kind of improvement?
Is it safe to convert everything to integers? And what are the drawbacks to this?
What advice would you have for this whole issue? What would you do?
The program is meant to calculate some parameters from the signal online, which means it must be as fast as possible, but I'm wondering whether it's worth starting a project to convert everything to integers.
The amount of improvement depends on your platform. For example, if your platform has a fast floating point coprocessor, performing arithmetic in floating point may be faster than integral arithmetic.
You may be able to get more performance gain by optimizing your algorithms rather than switching to integer arithmetic.
Another method for boosting performance is to reduce data cache misses, and also to reduce branches and loops.
I would measure the performance of the program to find out where the bottlenecks are, and then review the sections where most of the time is spent. For example, in my embedded system, micro-optimizations like the one you are suggesting saved 3 microseconds. That gain is not worth the effort of retesting the entire system. If it works, don't fix it. Concentrate on correctness and robustness first.
The bottom line here is that you have to test it and decide for yourself. Profile a release build using real data.
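As a starting point, here is a bare-bones C++ timing-harness sketch using std::chrono; the two loop bodies are placeholders to be replaced with your own chi-square or derivative code, and the numbers only mean anything in an optimized release build on realistic data.

    #include <chrono>
    #include <cstdio>

    // Average wall-clock milliseconds per call of fn over several repetitions.
    template <typename Fn>
    double millisecondsFor(Fn&& fn, int repetitions = 100)
    {
        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < repetitions; ++i)
            fn();
        auto stop = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(stop - start).count()
               / repetitions;
    }

    int main()
    {
        volatile double accD = 0.0;     // volatile so the work isn't optimized away
        volatile long long accI = 0;

        double msDouble = millisecondsFor([&] {
            for (int i = 1; i <= 1000000; ++i) accD = accD + 1.0 / i;
        });
        double msInt = millisecondsFor([&] {
            for (int i = 1; i <= 1000000; ++i) accI = accI + 1000000 / i;
        });

        std::printf("double: %.3f ms, integer: %.3f ms\n", msDouble, msInt);
    }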
1- Is it true that I will get that kind of improvement?
Maybe yes, maybe no. It depends on a number of factors, such as
How long it takes to convert from double to int
How big a word is on your machine
What platform/toolset you're using and what optimizations you have enabled
(Maybe) how big a cache line is on your platform
How fast your memory is
How fast your platform computes floating-point versus integer.
And who knows what else. In short, too many complex variables for anyone to be able to say for sure if you will or will not improve performance.
But I would be highly skeptical of your friend's claim of "at least 10 times faster".
2- Is it safe to convert everything to integers? And what are the drawbacks to this?
It depends on what you're converting and how. Obviously converting a value like 123.456 to an integer is decidedly unsafe.
Drawbacks include loss of precision, loss of accuracy, and the expense in terms of space and time to actually do the conversions. Another significant drawback is the fact that you have to write a substantial amount of code, and every line of code you write is a probable source of new bugs.
3- What advice would you have for this whole issue? What would you do?
I would step back & take a deep breath. Profile your code under real-world conditions. Identify the sources of the bottlenecks. Find out what the real problems are, and if there even are any.
Identify inefficiencies in your algorithms, and fix them.
Throw hardware at the problem.
Then you can endeavor to start micro-optimizing. This would be my last resort, especially if the optimization technique you are considering would require writing a lot of code.
First, this reeks of attempting to optimize unnecessarily.
Second, doubles are a minimum of 64 bits; ints on most systems are 32 bits. So you have a couple of choices: truncate the double (which reduces your precision to roughly that of a single), store it across two integers, or store it as an unsigned long long (which is at least 64 bits as well). For the first two options you face a performance hit, since you must convert the numbers back and forth between the doubles you operate on and the integers you store. For the third option you gain nothing (in terms of memory usage), as they are basically the same size - you'd just be converting them to integers for no reason.
So, to get to your questions:
1) Doubtful, but you can try it to see for yourself.
2) The problem isn't storage as the bits are just bits when they get into memory. The problem is the arithmetic. Since you stated you need double precision, attempting to do those operations on an integer type will not give you the results you are looking for.
3) Don't optimize until it has been proven that something needs a performance improvement. And always remember Amdahl's Law: the overall speedup is limited by the fraction of time you can actually improve - so make the common case fast and the rare case correct.
What I would do is:
First tune it in single-thread mode (by the random-pausing method) until you can't find any way to reduce cycles. The kinds of things I've found are:
a large fraction of time spent in library functions like sin, cos, exp, and log, where the arguments were often unchanged from call to call, so the results would be the same. The solution for that is called "memoizing": you figure out a place to store old values of arguments and results, and check there first before calling the function (a small sketch appears below).
In library functions like DGEMM (LAPACK matrix multiply) that one would assume are optimized to the teeth, a large fraction of time can actually be spent calling a routine to determine whether the matrices are upper or lower triangular, square, symmetric, or whatever, rather than actually doing the multiplication. If so, the answer is obvious - write a special routine just for your situation.
Don't say "but I don't have those problems". Of course - you probably have different problems - but the process of finding them is the same.
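As promised above, here is a small C++ sketch of the memoizing idea; the unbounded cache and exact floating-point key comparison are simplifications you would revisit for real fitting code (for example, one small fixed-size cache per call site).

    #include <cmath>
    #include <unordered_map>

    // Cache results of an expensive library call, keyed by its argument.
    double memoizedSin(double x)
    {
        static std::unordered_map<double, double> cache;
        auto it = cache.find(x);
        if (it != cache.end())
            return it->second;        // same argument seen before: reuse result
        double value = std::sin(x);
        cache.emplace(x, value);      // remember it for next time
        return value;
    }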
Once you've made it as fast as possible single-threaded, then figure out how to parallelize it. Multi-threading can have high overhead, so it's best not to couple the threads tightly.
Regarding your question about converting from doubles to integers, the other answers are right on the money. It only makes sense in very particular situations.

Speed of QHash lookups using QStrings as keys

I need to draw a dynamic overlay on a QImage. The component parts of the overlay are defined in XML and parsed out to a QHash<QString, QPicture> where the QString is the name (such as "crosshairs") and the QPicture is the resolution independent drawing. I then draw components of the overlay as they are needed at a position determined during runtime.
Example: I have 10 pictures in my QHash composing every possible element in a HUD. During a particular frame of video I need to draw 6 of them at different positions on the image. During the next frame something has changed and now I only need to draw 4 of them but 2 of those positions have changed.
Now to my question: If I am trying to do this quickly, should I redefine my QHash as QHash<int, QPicture> and enumerate the keys to counteract the overhead caused by string comparisons; or are the comparisons not going to make a very big impact on performance? I can easily make the conversion to integer keys as the XML parser and overlay composer are completely separate classes; but I would like to use a consistent data structure across the application.
Should I overcome my desire for consistency and re-usability in order to increase performance? Will it even matter very much if I do?
Gareth has the right answer of course. I'd like to extend it a tiny bit.
1. Go for consistency and reusability first. Try not to introduce huge performance bottlenecks either; it's hard to strike the balance.
2. Set realistic performance criteria. I'm guessing you are making something game-like; a reasonable criterion would be "sustaining 25 fps on my dev machine".
3. Is your application meeting the criteria? Yes? Enough optimization, go to 5.
4. Profile your application and optimize the parts that take the most time. Go back to 3.
5. Profit!
Back to your concrete question, if the number of elements in your hash table is less than or about a hundred, the key type probably won't matter at all.
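If you would rather check that on your own data than take it on faith, here is a small Qt sketch comparing the two key types; the table size, lookup count, and the "element_%1" key format are placeholders.

    #include <QDebug>
    #include <QElapsedTimer>
    #include <QHash>
    #include <QPicture>
    #include <QString>

    // Rough benchmark: QString vs int keys for a hash of ~10 HUD elements.
    void compareKeyTypes()
    {
        QHash<QString, QPicture> byName;
        QHash<int, QPicture> byId;
        for (int i = 0; i < 10; ++i) {
            byName.insert(QStringLiteral("element_%1").arg(i), QPicture());
            byId.insert(i, QPicture());
        }

        const int lookups = 1000000;
        volatile int sink = 0;              // keep the optimizer honest
        QElapsedTimer timer;

        timer.start();
        for (int i = 0; i < lookups; ++i)   // note: also times key construction
            sink += byName.value(QStringLiteral("element_%1").arg(i % 10)).isNull();
        qDebug() << "QString keys:" << timer.elapsed() << "ms";

        timer.start();
        for (int i = 0; i < lookups; ++i)
            sink += byId.value(i % 10).isNull();
        qDebug() << "int keys:" << timer.elapsed() << "ms";
    }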
The answer is that you should profile your app. Only if you find string comparisons to be a bottleneck should you implement an alternative strategy. Premature optimisation is likely to be a waste of time.
First, ensure the correctness of your program, i.e. make sure it passes all of its unit tests. (I'm assuming that correctness and performance are orthogonal - which is usually a reasonable assumption, unless you're programming a hard real-time application) Then, benchmark to find out whether the performance meets your requirements. Only if the benchmark shows that performance is too low should you optimise, and then, do so by following the guidance of your profiler. Any optimisations which you make can be checked for correctness by re-running the unit tests.