How to find the index of a value in a ChunkedArray from Apache Arrow?

The closest I can find seems to be the index function from the compute API:
https://arrow.apache.org/docs/python/api/compute.html
But I cannot find a working C++ code example for it in the Apache Arrow codebase.

Here's the documentation for that function in the cpp docs:
https://arrow.apache.org/docs/cpp/compute.html#aggregations
And here's a short example of how to call the function in C++:
8.0.0: https://github.com/apache/arrow/blob/apache-arrow-8.0.0/cpp/src/arrow/compute/kernels/aggregate_test.cc#L2234
7.0.0: https://github.com/apache/arrow/blob/apache-arrow-7.0.0/cpp/src/arrow/compute/kernels/aggregate_test.cc#L2206
[2022-05-23 Edit]
Here is an example that calls the Index function, using arrow 7.0.0:
https://github.com/drin/cookbooks/blob/mainline/arrow/compute-api/recipe.cpp#L18
The recipe.hpp file should show the required includes and types that are used (I tried to minimize to just what's necessary).
Also, here is corresponding code for usage, including making some test data and using the IndexOf function and viewing the result:
https://github.com/drin/cookbooks/blob/mainline/arrow/compute-api/index.cpp#L18
I wrote IndexOf to show how you can use the Index function yourself, so you can use it directly, or write a wrapper function in a similar style.
NOTE: I thought I needed to upgrade to 8.0.0 to use Scalar types, but I think 8.0.0 mostly introduced the documentation for Scalar rather than introducing code for it, as this works with arrow 7.0.0.
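To make the expected result concrete, here is a minimal sketch of what an index lookup over chunked data computes. It is not Arrow code: Chunk, Chunked and IndexOfValue are hypothetical stand-ins built on std::vector. It assumes, as I understand the documented behavior of the index function, that the result is the position of the first match counted across all chunks and that -1 means "not found".

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for a chunked array: a sequence of contiguous chunks.
using Chunk = std::vector<std::int64_t>;
using Chunked = std::vector<Chunk>;

// Returns the position of the first element equal to `value`, counted
// across all chunks, or -1 if no element matches.
std::int64_t IndexOfValue(const Chunked& chunks, std::int64_t value) {
    std::int64_t offset = 0;  // number of elements in the chunks already scanned
    for (const Chunk& chunk : chunks) {
        for (std::size_t i = 0; i < chunk.size(); ++i) {
            if (chunk[i] == value) {
                return offset + static_cast<std::int64_t>(i);
            }
        }
        offset += static_cast<std::int64_t>(chunk.size());
    }
    return -1;
}
```

With chunks {1, 2, 3}, {} and {4, 5}, looking up 4 gives 3, because the offset carries over from the earlier chunks.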

Related

Adding Custom c++ function in chromium and call them in browser

I am trying to write custom function in bootstrapper.cc under v8/src/init.
int helloworld() {
    return 0;
}
When I try to call it from the Chromium console, it is undefined.
Look around bootstrapper.cc to see how other built-in functions are installed. Examples you could look at include Array and DataView (or any other, really).
There is no way to simply define a C++ function of a given name and have that show up in JavaScript. Instead, you have to define a property on the global object; and the function itself needs to have the right calling convention, and process its parameters / prepare its return value appropriately so that it can be called from JavaScript. You can't just take or return an int.
If you find it inconvenient to work with C++, an alternative might be to develop a Chrome extension, which would allow you to use JavaScript for the implementation, and also remove the need to compile/maintain/update your own build (which is a lot of work!). There is no existing guide for how to extend V8 in the way you're asking, because that approach is so much work that we don't recommend doing it like this (though of course it is possible -- you just have to read enough of the existing C++ source to understand how it's done).
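To illustrate why a bare int helloworld() never becomes visible, here is a toy model of the two requirements described above: a uniform calling convention and installation as a property on a global object. None of this is V8's API; Value, NativeCallback, GlobalObject, InstallBuiltins and CallGlobal are all made up for the sketch.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins; none of these are V8 types.
using Value = double;  // every script value is a number in this model
using NativeCallback = std::function<Value(const std::vector<Value>&)>;
using GlobalObject = std::map<std::string, NativeCallback>;

// The plain C++ function from the question.
int helloworld() { return 0; }

// Adapter with the uniform signature the toy engine expects: it ignores
// its arguments and converts the int result into a Value.
Value HelloWorldCallback(const std::vector<Value>& /*args*/) {
    return static_cast<Value>(helloworld());
}

// Installing the callback as a property is what makes the name visible.
void InstallBuiltins(GlobalObject& global) {
    global["helloworld"] = HelloWorldCallback;
}

// What the toy engine does for a call expression name(args...).
// Returns false when the name was never installed ("undefined").
bool CallGlobal(const GlobalObject& global, const std::string& name,
                const std::vector<Value>& args, Value* result) {
    auto it = global.find(name);
    if (it == global.end()) return false;
    *result = it->second(args);
    return true;
}
```

Before InstallBuiltins runs, CallGlobal reports the name as missing, which mirrors the "undefined" seen in the console. The real installation code in bootstrapper.cc is considerably more involved.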

How does to!string(enum.member) work?

How does std.conv.to!string(enum.member) work? How is it possible that a function takes an enum member and returns its name? Does it use a compiler extension or something similar? It's a bit unusual to me since I come from the C/C++ world.
What it does is use compile time reflection on the enum type to get a list of members (the names as strings) and their values. It constructs a switch statement out of this information for a fast lookup to get the name from a value. to!SomeEnum("a_string") uses the same principle, just in the other direction.
The compile time reflection info is accessed with __traits(allMembers, TheEnumType), which returns a list of strings that can be looped over to build the switch statement. Then __traits(getMember, TheEnumType, memberName) is used to fetch each member's value.
More about traits can be found here: http://dlang.org/traits.html#allMembers
That allMembers one works on many types, not just classes as seen in the example, but also structs, enums, and more, even modules.
The phobos source code has some examples like EnumMembers in std.traits: https://github.com/D-Programming-Language/phobos/blob/master/std/traits.d#L3360
The phobos source is kinda hard to read, but on line 3399, at the bottom of that function, you can see it using __traits(allMembers) as its data source. std.conv.to is implemented in terms of many std.traits functions.
You can also check out the sample chapter tab to get the Reflection chapter out of my D cookbook which discusses this stuff too:
http://www.packtpub.com/discover-advantages-of-programming-in-d-cookbook/book
The final example in that chapter shows how to use several of the reflection capabilities to build a little function dispatcher based on strings. The following chapter (not available for free though) shows how to build a switch out of it for better efficiency too. It's actually pretty easy: just put the case statements inside a foreach over the compile time data and the D compiler will unroll then optimize the lookup table for you!
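Since the question comes from a C/C++ background: C++ has no equivalent of __traits(allMembers), but the generated code can be approximated by hand. The sketch below uses an X-macro so that a single member list drives both the enum and the switch, roughly the shape of what the D library builds at compile time. Color, COLOR_MEMBERS, ToString and FromString are hypothetical names chosen for the example.

```cpp
#include <string>

// Single list of members; the enum and both lookups are generated from it.
#define COLOR_MEMBERS(X) X(Red) X(Green) X(Blue)

enum class Color {
#define X(name) name,
    COLOR_MEMBERS(X)
#undef X
};

// Value -> name: one case label per member, like the generated switch.
std::string ToString(Color c) {
    switch (c) {
#define X(name) case Color::name: return #name;
        COLOR_MEMBERS(X)
#undef X
    }
    return "<unknown>";
}

// Name -> value: the same member list, used in the other direction.
bool FromString(const std::string& s, Color* out) {
#define X(name) if (s == #name) { *out = Color::name; return true; }
    COLOR_MEMBERS(X)
#undef X
    return false;
}
```

The difference is that in D the member list comes from the compiler, so nothing has to be kept in sync by hand.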

Changing an OpenCV function Standard Parameters

Is there a way to permanently change the standard parameters in an OpenCV function?
For example, how can I modify the MSER Feature Detector so that I can call
MserFeatureDetector detector;
instead of
MserFeatureDetector detector(10,50,1000);
I am not precisely well versed in the inner mechanisms of C++ libraries, but I imagine the actual program code has to be somewhere, right?
A bit of information on my actual problem:
I'm currently using MEXOpenCV to run OpenCV functions in MatLab, and some MEX-Functions lack (as far as I know) the option to pass input parameters and run with the defaults like this:
detector = cv.FeatureDetector('MSER'); % 'MSER' is the only parameter taken
I reckon changing the standard parameters directly in the OpenCV sources would be a way to do it.
Any other ideas on how to solve the actual problem are welcome too!
I solved the actual problem by setting the parameters with the 'set' method of the detector object, like this:
detector=cv.FeatureDetector('MSER'); detector.set('delta',10);
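On the C++ side, the defaults can usually be changed without touching the library by wrapping the class. The sketch below uses a stand-in Detector class rather than the real OpenCV type; the parameter names minArea and maxArea are assumptions for illustration, and only the values 10, 50 and 1000 come from the question.

```cpp
// Stand-in for a library detector class; this is not the OpenCV type.
class Detector {
public:
    Detector(int delta, int minArea, int maxArea)
        : delta_(delta), minArea_(minArea), maxArea_(maxArea) {}
    int delta() const { return delta_; }
    int minArea() const { return minArea_; }
    int maxArea() const { return maxArea_; }

private:
    int delta_;
    int minArea_;
    int maxArea_;
};

// Project-local subclass whose default constructor supplies the
// preferred values, so callers can simply write: MyDetector detector;
class MyDetector : public Detector {
public:
    MyDetector() : Detector(10, 50, 1000) {}
};
```

This keeps the library untouched, so nothing has to be rebuilt or re-patched after an upgrade.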

How to get the value of a variable that's deep in the source code (C++)? (eg. value of stage_sum in haar.cpp, OpenCV)

I'd like to get a value from a variable that's located deeply in the source code of the OpenCV library. Specifically, I'm trying to print out the value of stage_sum from the file haar.cpp. My starting point, facedetect.cpp, calls the method detectMultiScale, which then calls the function cvHaarDetectObjects, which calls cvHaarDetectObjectsForROC etc., until it finally reaches the function cvRunHaarClassifierCascadeSum, where stage_sum is calculated.
Is there a way I could get the value out to facedetect.cpp easily, without changing the declarations of all the preceding functions/methods, headers etc.? Simply trying to cout or printf the value directly in the source code hasn't given any results.
Thanks everyone for your help!
One option is simply to use a debugger.
However, if you want to do this programmatically (i.e. access the variable as part of your application code), then unless the variable is exposed in the library's public interface, there are two options available:
Modify the library's source code, and recompile it.
Resort to undefined-behaviour (fiddling around with the raw bytes that make up an object, etc.).
Just to point out the obvious, adding a std::cout or printf() call inside haar.cpp won't do the trick on its own. You need to recompile OpenCV for these changes to take effect and then reinstall the libraries on your system.
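If you do go the "modify and recompile" route, one way to get the value out without changing any function signatures is a hook that the application can set. This is only a sketch with stand-in functions: StageSumHook, RunCascadeStage and DetectObjects are hypothetical and merely mimic the shape of a deep call chain; the real haar.cpp code looks different.

```cpp
#include <functional>
#include <vector>

// Would live in the modified, recompiled library: a process-wide hook
// that stays empty until the application installs a callback.
std::function<void(double)>& StageSumHook() {
    static std::function<void(double)> hook;
    return hook;
}

// Stand-in for the deeply nested function that computes stage_sum.
int RunCascadeStage(const std::vector<double>& weights, double threshold) {
    double stage_sum = 0.0;
    for (double w : weights) stage_sum += w;
    // Report the value without changing any signature in the call chain.
    if (StageSumHook()) StageSumHook()(stage_sum);
    return stage_sum >= threshold ? 1 : -1;
}

// Stand-in for the intermediate layers; its signature is unchanged.
int DetectObjects(const std::vector<double>& weights) {
    return RunCascadeStage(weights, 1.0);
}
```

The application assigns a lambda to StageSumHook() before calling the detector and collects the values there. Note that a single global hook like this is not thread-safe as written.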

Is there a tool that enables me to insert one line of code into all functions and methods in a C++-source file?

It should turn this
int Yada (int yada)
{
    return yada;
}
into this
int Yada (int yada)
{
    SOME_HEIDEGGER_QUOTE;
    return yada;
}
but for all (or at least a big bunch of) syntactically legal C/C++ - function and method constructs.
Maybe you've heard of some Perl library that will allow me to perform these kinds of operations in a few lines of code.
My goal is to add a tracer to an old, but big C++ project in order to be able to debug it without a debugger.
Try Aspect C++ (www.aspectc.org). You can define an Aspect that will pick up every method execution.
In fact, the quickstart has pretty much exactly what you are after defined as an example:
http://www.aspectc.org/fileadmin/documentation/ac-quickref.pdf
If you build using GCC and the -pg flag, GCC will automatically issue a call to the mcount() function at the start of every function. In this function you can then inspect the return address to figure out where you were called from. This approach is used by the linux kernel function tracer (CONFIG_FUNCTION_TRACER). Note that this function should be written in assembler, and be careful to preserve all registers!
Also, note that this should be passed only in the build phase, not link, or GCC will add in the profiling libraries that normally implement mcount.
I would suggest using the gcc flag "-finstrument-functions". Basically, it automatically calls a specific function ("__cyg_profile_func_enter") upon entry to each function, and another function is called ("__cyg_profile_func_exit") upon exit of the function. Each function is passed a pointer to the function being entered/exited, and the function which called that one.
You can turn instrumenting off on a per-function or per-file basis... see the docs for details.
The feature goes back at least as far as version 3.0.4 (from February 2002).
This is intended to support profiling, but it does not appear to have side effects like -pg does (which compiles code suitable for profiling).
This could work quite well for your problem (tracing execution of a large program), but, unfortunately, it isn't as general purpose as it would have been if you could specify a macro. On the plus side, you don't need to worry about remembering to add your new code into the beginning of all new functions that are written.
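A minimal sketch of the two hooks, assuming GCC or Clang. The hook names and signatures are the ones GCC documents; the depth counter and the output format are my own choices. The no_instrument_function attribute keeps the hooks from instrumenting themselves.

```cpp
#include <cstdio>

// Current call depth, maintained by the two hooks below.
static int g_trace_depth = 0;

extern "C" {

// Called on entry to every instrumented function.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* this_fn, void* call_site) {
    std::fprintf(stderr, "%*senter %p (from %p)\n",
                 2 * g_trace_depth, "", this_fn, call_site);
    ++g_trace_depth;
}

// Called on exit from every instrumented function.
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* this_fn, void* call_site) {
    --g_trace_depth;
    std::fprintf(stderr, "%*sexit  %p (from %p)\n",
                 2 * g_trace_depth, "", this_fn, call_site);
}

}  // extern "C"
```

Compile the project with -finstrument-functions and link this file in. The hooks only receive raw addresses, so a tool such as addr2line is needed to turn them into function names.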
There is no such tool that I am aware of. In order to recognise the correct insertion point, the tool would have to include a complete C++ parser - regular expressions are not enough to accomplish this.
But as there are a number of FOSS C++ parsers out there, such a tool could certainly be written - a sort of intelligent sed for C++ code. The biggest problem would probably be designing the specification language for the insert/update/delete operation - regexes are obviously not the answer, though they should certainly be included in the language somehow.
People are always asking here for ideas for projects - how about this for one?
I use this regex,
"(?<=[\\s:~])(\\w+)\\s*\\([\\w\\s,<>\\[\\].=&':/*]*?\\)\\s*(const)?\\s*{"
to locate the functions and add extra lines of code.
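With that regex I also get the function name (group 1). Note that, as written, group 2 captures the optional const qualifier, so the argument list needs its own capture group if you want it too.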
Note: you must filter out names like, "while", "do", "for", "switch".
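Here is a rough sketch of the regex approach in C++. It does not use the regex above verbatim, because std::regex has no lookbehind; instead it uses a simplified pattern that captures the name (group 1) and a parenthesis-free argument list (group 2). InsertTrace and the keyword list are my own, and the pattern will miss or mis-match plenty of legal C++ (nested parentheses in arguments, comments, lambdas), so treat it as a starting point only.

```cpp
#include <cstddef>
#include <regex>
#include <set>
#include <string>

// Inserts `line` on its own line after the opening brace of everything
// that looks like a function definition in `src`.
std::string InsertTrace(const std::string& src, const std::string& line) {
    // name ( args ) [const] {   with no nested parentheses in args
    static const std::regex fn_open(
        R"re((\w+)\s*\(([^()]*)\)\s*(const)?\s*\{)re");
    // Control-flow keywords that look like function headers to the regex.
    static const std::set<std::string> keywords = {
        "if", "while", "for", "switch", "catch"};

    std::string out;
    std::size_t last = 0;  // end of the text already copied to `out`
    for (std::sregex_iterator it(src.begin(), src.end(), fn_open), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t stop =
            static_cast<std::size_t>(m.position(0) + m.length(0));
        out.append(src, last, stop - last);  // copy up to and including '{'
        if (keywords.count(m[1].str()) == 0) {
            out += "\n" + line;
        }
        last = stop;
    }
    out.append(src, last, std::string::npos);  // copy the remaining text
    return out;
}
```

Applied to the Yada function from the question with SOME_HEIDEGGER_QUOTE; as the line, this produces the desired output, while a block such as while (x) { ... } is left alone.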
This can be easily done with a program transformation system.
The DMS Software Reengineering Toolkit is a general purpose program transformation system, and can be used with many languages (C#, COBOL, Java, EcmaScript, Fortran, ..) as well as specifically with C++.
DMS parses source code (using a full language front end, in this case for C++), builds Abstract Syntax Trees, and allows you to apply source-to-source patterns to transform your code from one C++ program into another with whatever properties you wish. The transformation rule to accomplish exactly the task you specified would be:
domain CSharp.
insert_trace():function->function
"\visibility \returntype \fnname(int \parametername)
{ \body } "
->
"\visibility \returntype \fnname(int \parametername)
{ Heidigger(\CppString\(\fnname\),
\CppString\(\parametername\),
\parametername);
\body } "
The quote marks (") are not C++ quote marks; rather, they are "domain quotes", and indicate that the content inside the quote marks is C++ syntax (because we said, "domain CSharp"). The \foo notations are meta syntax.
This rule matches the AST representing the function, and rewrites that AST into the traced form. The resulting AST is then prettyprinted back into source form, which you can compile. You probably need other rules to handle other combinations of arguments; in fact, you'd probably generalize the argument processing to produce (where practical) a string value for each scalar argument.
It should be clear you can do a lot more than just logging with this, and a lot more than just aspect-oriented programming, since you can express arbitrary transformations and not just before-after actions.