Python array management C++ equivalent - c++

I know SO is not rent-a-coder, but I have a really simple Python example that I need help translating to C++:
grey_image_as_array = numpy.asarray( cv.GetMat( grey_image ) )
non_black_coords_array = numpy.where( grey_image_as_array > 3 )
# Convert from numpy.where()'s two separate lists to one list of (x, y) tuples:
non_black_coords_array = zip( non_black_coords_array[1], non_black_coords_array[0] )
The first line is rather simple, I guess: a linearly indexable array is created from the bytes returned by cv.GetMat, right?
What would be the C++ equivalent of Python's where and especially the zip function?

I don't know about OpenCV, so I can't tell you what cv.GetMat() does. Apparently, it returns something that can be used as or converted to a two-dimensional array. The C or C++ interface to OpenCV that you are using will probably have a similarly named function.
The following lines create an array of index pairs of the entries in grey_image_as_array that are bigger than 3. Each entry in non_black_coords_array is a pair of zero-based x-y coordinates into grey_image_as_array. Given such a coordinate pair x, y, you can access the corresponding entry in the two-dimensional C++ array grey_image_as_array with grey_image_as_array[y][x].
The Python code has to avoid explicit loops over the image to achieve good performance, so it needs to make do with the vectorised functions NumPy offers. The expression grey_image_as_array > 3 is a vectorised comparison and results in a Boolean array of the same shape as grey_image_as_array. Next, numpy.where() extracts the indices of the True entries in this Boolean array, but the result is not in the format described above, so we need zip() to restructure it.
In C++, there's no need to avoid explicit loops, and an equivalent of numpy.where() would be rather pointless -- you just write the loops and store the result in the format of your choice.
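As a rough illustration of "just write the loops", here is a minimal sketch. It assumes the image has already been copied into a nested std::vector of 8-bit grey values; the GreyImage alias and the non_black_coords name are made up for this example and are not part of OpenCV.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for the image: rows of 8-bit grey values.
// This is an assumption for the sketch, not an OpenCV type.
using GreyImage = std::vector<std::vector<unsigned char>>;

// Collect the (x, y) coordinates of every pixel brighter than `threshold`.
// Pixels are visited row by row, which is the order numpy.where() reports them in.
std::vector<std::pair<std::size_t, std::size_t>>
non_black_coords(const GreyImage& image, unsigned char threshold = 3) {
    std::vector<std::pair<std::size_t, std::size_t>> coords;
    for (std::size_t y = 0; y < image.size(); ++y) {
        for (std::size_t x = 0; x < image[y].size(); ++x) {
            if (image[y][x] > threshold) {
                coords.emplace_back(x, y);  // same (x, y) layout the zip() call produces
            }
        }
    }
    return coords;
}
```

With a real OpenCV matrix the two loops would stay the same; only the element access would change to whatever accessor that interface provides.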

Related

Duplicate values in Julia with Function

I need to write a function which takes as input
a = [12,39,48,36]
and produces as output
b=[4,4,4,13,13,13,16,16,16,12,12,12]
where the idea is to repeat each element two or three times (this should be variable), dividing it by 2 or 3.
I tried doing this:
c=[12,39,48,36]
a=size(c)
for i in a
repeat(c[i]/3,3)
end
You need to vectorize the division operator with a dot (.).
Additionally, I understand that you want the results to be Int. You can vectorize the cast to Int too:
repeat(Int.(a./3), inner=3)
Przemyslaw's answer, repeat(Int.(a./3), inner=3), is excellent and is how you should write your code for conciseness and clarity. Let me in this answer analyze your attempted solution and offer a revised solution which preserves your intent. (I find that this is often useful for educational purposes).
Your code is:
c = [12,39,48,36]
a = size(c)
for i in a
repeat(c[i]/3, 3)
end
The immediate fix is:
c = [12,39,48,36]
output = Int[]
for x in c
append!(output, fill(x/3, 3))
end
Here are the changes I made:
You need an array to actually store the output. The repeat function, which you use in your loop, would produce a result, but this result would be thrown away! Instead, we define an initially empty output = Int[] and then append! each repeated block.
Your for loop specification is iterating over a size tuple (4,), which generates just a single number 4. (Probably, you misunderstand the purpose of the size function: it is primarily useful for multidimensional arrays.) To fix it, you could do a = 1:length(c) instead of a = size(c). But you don't actually need the index i, you only require the elements x of c directly, so we can simplify the loop to just for x in c.
Finally, repeat is designed for arrays. It does not work for a single scalar (this is probably the error you are seeing); you can use the more appropriate fill(scalar, n) to get [scalar, ..., scalar].

Passing dataset from R to C++ (using .Call)

I need to speed up data processing in R through C++. I already have my C++ code and it basically reads from txt file what R should pass. Since I need R for my analysis, I want to integrate my C++ code in R.
What the C++ code needs is a (large) dataframe (for which I use std::vector< std::vector> >) and a set of parameters, so I am thinking about passing the parameters through the .Call interface and then dealing with the data in the following way:
R: write data in txt file with a given encoding
C++: read from txt, do what I need to do and write the result in a txt (which is still a dataset -> std::vector)
R: read the result from txt
This would save me from rewriting part of the code. The possible problem/bottleneck is the reading and writing. Do you believe it is a real problem?
Otherwise, as an alternative, is it reasonable to copy all my data in C++ structures through .Call interface?
Thank you.
You could start with the very simple DataFrame example in the RcppExamples package:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
List DataFrameExample(const DataFrame & DF) {
    // access each column by name
    IntegerVector a = DF["a"];
    CharacterVector b = DF["b"];
    DateVector c = DF["c"];

    // do something
    a[2] = 42;
    b[1] = "foo";
    c[0] = c[0] + 7; // move up a week

    // create a new data frame
    DataFrame NDF = DataFrame::create(Named("a")=a,
                                      Named("b")=b,
                                      Named("c")=c);

    // and return old and new in list
    return List::create(Named("origDataFrame") = DF,
                        Named("newDataFrame") = NDF);
}
You can assign vectors (from either Rcpp or the STL) and matrices (again, either from Rcpp, or if you prefer nested STL vectors). And then you also have Eigen and Armadillo via RcppEigen and RcppArmadillo. And on and on -- there are over 1350 packages on CRAN you could study. And a large set of ready-to-run examples is at the Rcpp Gallery.
Reading and writing large datasets back and forth is not an optimal solution for passing the data between R and your C++ code. Depending on how long your C++ code executes this might or might not be the worst bottleneck in your code, but this approach should be avoided.
You can look at the following solution to pass a data.frame (or data.table) object:
Passing a `data.table` to c++ functions using `Rcpp` and/or `RcppArmadillo`
As for passing additional parameters, the solution will depend on what kind of parameters we are talking about. If those are just numeric values, then you can pass them directly to C++ (see High performance functions with Rcpp: http://adv-r.had.co.nz/Rcpp.html).

How exactly do the hashlib hashers treat input?

The Python 2.7 documentation has this to say about the hashlib hashers:
hash.update(arg)
Update the hash object with the string arg. [...]
But I have seen people feed it objects that are not strings, e.g. buffers, numpy ndarrays.
Given Python's duck typing, I'm not surprised that it is possible to specify non-string arguments.
The question is: how do I know the hasher is doing the right thing with the argument?
I can't imagine the hasher naïvely doing a shallow iteration on the argument because that would probably fail miserably with ndarrays with more than one dimension - if you do a shallow iteration, you get an ndarray with n-1 dimensions.
update unpacks its argument using the s# format spec. This means that the argument can be a string, a Unicode object, or any object that supports the buffer interface.
You can't define a buffer interface in pure Python, but C libraries like numpy can and do, which allows their objects to be passed into hash.update.
Things like multidimensional arrays work fine: at the C level they're stored as a contiguous series of bytes.

Compare two bson_t in C / C++

I need to compare two bson_t documents. I found that two bson_t documents may hold their key-value pairs in a different order, for example {"key1": "val1", "key2" : "val2"} and {"key2": "val2", "key1" : "val1"}. In my project these count as the same document, but bson_compare() and bson_equal() report them as different. How can I solve this problem in C/C++?
By the way, how to sort these key-value pairs in C or C++?
Thanks
bson_compare and bson_equal check whether the content buffers of the two documents are byte-for-byte equal, not whether the documents hold the same values. They use memcmp internally to compare the two objects. Hence, two documents being logically equal (x == y) does not imply that memcmp(x, y) == 0.
Two methods:
(1) This is easy to do in Python: write a Python function and call it from the C++ program.
(2) Use bson_iter_t to iterate over each key-value pair in the bson_t and do the comparison recursively.
The second method seems more complex, but I decided to use it. I have already finished part of it.
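The idea behind method (2) can be sketched without libbson. The Document and Value types below are hypothetical stand-ins for a document that has already been walked with bson_iter_t; they are not libbson types. Values are limited to strings and nested documents, and keys are assumed to be unique within a document.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed document: an ordered list of key-value
// pairs whose values are either a string or a nested document.
struct Document;
struct Value {
    std::string text;                  // used when `nested` is null
    std::shared_ptr<Document> nested;  // set for sub-documents
};
struct Document {
    std::vector<std::pair<std::string, Value>> fields;
};

bool equal_ignoring_order(const Document& a, const Document& b);

// Compare two values, recursing into sub-documents.
bool equal_values(const Value& a, const Value& b) {
    if (a.nested || b.nested) {
        return a.nested && b.nested && equal_ignoring_order(*a.nested, *b.nested);
    }
    return a.text == b.text;
}

// True when both documents hold the same key-value pairs in any order.
bool equal_ignoring_order(const Document& a, const Document& b) {
    if (a.fields.size() != b.fields.size()) return false;

    // Index `a` by key so the lookup does not depend on field order.
    std::map<std::string, const Value*> by_key;
    for (const auto& field : a.fields) by_key[field.first] = &field.second;

    for (const auto& field : b.fields) {
        auto it = by_key.find(field.first);
        if (it == by_key.end() || !equal_values(*it->second, field.second)) return false;
        by_key.erase(it);  // each key of `a` may be matched only once
    }
    return true;
}
```

Because std::map keeps its keys sorted, iterating over by_key would also visit the pairs in key order, which is one way to get the sorted view asked about in the question.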

How do I iterate through a list in a TI-83 calculator program

I created a set of programs to calculate the area under a graph using various methods of approximation (midpoint, trapezoidal, Simpson) for my Calculus class.
Here is an example of one of my programs (midpoint):
Prompt A,B,N
(B-A)/N->D
Input "Y1=", Y1
0->X
0->E
For(X,A+D/2,B-D/2,D)
Y1(X)+E->E
End
Disp E*D
Instead of applying these approximation rules to a function (Y1), I would like to apply them to a list of data (L1). How do I iterate through a list? I would need to be able to get the last index in the list in order for a "For Loop" to be any good. I can't do anything like L1.length like I would do in Java.
You can obtain the length of the list using dim(. It can be found under 2nd->LIST->OPS->dim(. Just make sure that you use a list variable, otherwise dim( will complain about the type. You can then index into the list with a subscript.
e.g.,
{1, 2, 3, 4} -> L1
For (X, 1, dim(L1), 1)
Disp L1(X)
End
The for loop is the simplest way to iterate over a list in TI-Basic, as it is in many languages. Jeff Mercado already covered that, so I'll mention a few techniques that are powerful tools in specialized situations.
Mapping over lists
TI-Basic supports simple mapping operations over lists that have the same effect as a map function in any other language. This support extends to most basic arithmetic functions and a selection of other functions.
The syntax could not be simpler. If you want to add some number X to every element in some list L1 you type X+L1→L1.
seq(
Most for loops over a list in TI-Basic can be replaced by a cleverly constructed seq( command that will outperform the for loop in time and memory. The exceptions to this rule are loops that contain I/O or store variables.
The syntax for this command can be quite confusing, so I recommend reading over this documentation before using it. In case that link dies, here's the most relevant information.
Command Summary
Creates a list by evaluating a formula with one variable taking on a range of values, optionally skipping by a specified step.
Command Syntax
seq(formula, variable, start-value, end-value [, step])
Menu Location
While editing a program, press:
2nd LIST to enter the LIST menu
RIGHT to enter the OPS submenu
5 to choose seq(, or use arrows.
Calculator Compatibility
TI-83/84/+/SE
Token Size
1 byte
The documentation should do a good job explaining the syntax for seq(, so I'll just provide a sample use case.
If you want the square of every number between 1 and 100, you could do this:
For Loop
DelVar L1
100→dim(L1
For(A,1,100
A²→L1(A
End
or, this
seq
seq(A²,A,1,100→L1
The drawback of seq( is that you can't do any I/O or store any variables inside the expression.
Predefined list iteration function
Go to the LIST menu and check out all the operations under OPS and MATH. These predefined functions are always going to be faster than a for loop or even a seq( expression designed to do the same thing.