OpenOffice Calc dynamic cell selection

So I have what I think is a pretty simple problem in OpenOffice Calc that I want solved, but I can't find the answer anywhere.
If this is my table:
A | B | C
5 | 2 | ?
3 | 1 | ?
I want each cell in C to take the value of the B cell next to it and return the value of A{that number}.
So it should look like this:
A | B | C
5 | 2 | 3
3 | 1 | 5
EDIT:
I'll try to explain better. I want cell C1 to get the value of B1, call it X, and then return the value of the cell AX, where A is the column. So in this case, C1 gets the value of A2 and C2 gets the value of A1.

The answer: enter the formula =INDIRECT("A"&B1) in cell C1, then copy it down column C.

Capturing non-missing values row wise and storing it in new variables

My dataset contains multiple variables called avar_1 to bvar_10 referring to the history of an individual. For some reason, the history is not always complete and there are some "gaps" (e.g. avar_1 and avar_4 are non-missing, but avar_2 and avar_3 are missing). For each individual, I want to store the first non-missing value in a new variable called var1, the second non-missing value in var2, and so on, so that I have a history without missing values.
I've tried the following code
local x=1
foreach wave in a b {
    forval i=1/10 {
        capture drop var`x'
        generate var`x'=.
        capture replace var`x'=`wave'var`i' if !mi(`wave'`var'`i')
        if (!mi(var`x')) {
            local x=1+`x'
        }
    }
}
var1 is generated properly, but var2 contains only missing values and the following variables are not generated. However, I set trace on and saw that var2 is actually replaced for all variables from avar_1 to bvar_10.
My guess is that the local x is not updated correctly: its value changes for the whole dataset, but it should be different for each observation.
Is that the problem and if so, how can I avoid it?
A concise concrete data example is worth more than a long explanation. Your description seems consistent with an example like this:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str1 id float(avar_1 avar_2 avar_3 bvar_1 bvar_2)
"A" 1 . 6 8 10
"B" 2 4 . 9 .
"C" 3 5 7 . 11
end
* 4 is specific to this example.
rename (bvar_*) (avar_#), renumber(4)
reshape long avar_, i(id) j(which)
(note: j = 1 2 3 4 5)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                        3   ->      15
Number of variables                   6   ->       3
j variable (5 values)                     ->   which
xij variables:
             avar_1 avar_2 ... avar_5   ->   avar_
-----------------------------------------------------------------------------
drop if missing(avar_)
bysort id (which) : replace which = _n
list, sepby(id)
     +--------------------+
     | id   which   avar_ |
     |--------------------|
  1. |  A       1       1 |
  2. |  A       2       6 |
  3. |  A       3       8 |
  4. |  A       4      10 |
     |--------------------|
  5. |  B       1       2 |
  6. |  B       2       4 |
  7. |  B       3       9 |
     |--------------------|
  8. |  C       1       3 |
  9. |  C       2       5 |
 10. |  C       3       7 |
 11. |  C       4      11 |
     +--------------------+
Positive points:
Your data layout cries out for some structure given by a rename and especially by a reshape long. I don't give here code for a reshape wide as for the great majority of Stata purposes, you'd be better off with this layout.
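For readers who still need the wide layout asked for in the question (var1, var2, ...), here is a minimal sketch that continues from the long data. It assumes the example data above; the stub name var is taken from the question, and the final reshape wide is an addition, not part of the original answer.

```stata
* Example data as above
clear
input str1 id float(avar_1 avar_2 avar_3 bvar_1 bvar_2)
"A" 1 . 6 8 10
"B" 2 4 . 9 .
"C" 3 5 7 . 11
end

* Stack everything under one stub, go long, drop the gaps
rename (bvar_*) (avar_#), renumber(4)
reshape long avar_, i(id) j(which)
drop if missing(avar_)

* Renumber the surviving values 1, 2, ... within each id
bysort id (which) : replace which = _n

* Back to wide: var1 is the first non-missing value, var2 the second, ...
rename avar_ var
reshape wide var, i(id) j(which)
list
```

With these data, each id should end up with var1 to var4, and any missing values appear only at the end of a row (id B has three values, so its var4 is missing).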
Negative points:
!mi(var`x')
returns, when used with the if command as here, whether the first value of the variable is not missing. If foo were a variable in the dataset, the if command would evaluate !mi(foo) as !mi(foo[1]). That is not what you want here. See https://www.stata.com/support/faqs/programming/if-command-versus-if-qualifier/ for the full story.
I'd recommend more evocative variable names.
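To make the difference between the if command and the if qualifier concrete, here is a small sketch. The variable foo and its three values are invented for illustration.

```stata
clear
input float foo
.
2
3
end

* if command: the expression is evaluated once, and foo means foo[1]
if !missing(foo) {
    display "branch taken: foo[1] is not missing"
}
else {
    display "branch skipped: foo[1] is missing"
}

* if qualifier: the expression is evaluated for every observation
count if !missing(foo)
```

Because foo[1] is missing, the if command takes the else branch even though two of the three observations are not missing; count with the if qualifier reports those two.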

Splitting values in Stata and save them in a new variable

I have a numeric variable with values like the following:
1
2
12
21
2
I would like to split the values that have more than one digit and put the second half of the value in another variable.
So the second variable would have the values:
.
.
2
1
.
In principle I could just use a simple replace statement, but I am looking for code or a loop that recognizes the two-digit values, splits them automatically and saves the second part in the second variable. More observations will be added over time, and I cannot do this manually for more than 10k cases.
Here's one approach:
clear
input foo
1
2
12
21
2
end
generate foo1 = floor(foo/10)
generate foo2 = mod(foo, 10)
list
     +-------------------+
     | foo   foo1   foo2 |
     |-------------------|
  1. |   1      0      1 |
  2. |   2      0      2 |
  3. |  12      1      2 |
  4. |  21      2      1 |
  5. |   2      0      2 |
     +-------------------+
For more on these functions, see the Stata help for floor() and mod().
If zeros for the first part should be missing, then
replace foo1 = . if foo1 == 0
or, to do it in one step,
generate foo1 = floor(foo/10) if foo >= 10
The code also works for arguments with three or more digits: foo2 is then the last digit and foo1 is everything before it.
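The question asks for the second variable to be missing for single-digit values. A minimal sketch of that variant, assuming the same example data, puts the same if qualifier on the mod() call:

```stata
clear
input foo
1
2
12
21
2
end

* Split only values with two or more digits; single digits stay missing
generate foo1 = floor(foo/10) if foo >= 10
generate foo2 = mod(foo, 10) if foo >= 10
list
```

Here foo2 is missing, missing, 2, 1, missing, which matches the output requested in the question.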

Calculated columns: Select which columns affect calculation

I have the following table:
SUM1 and SUM2 are calculated columns.
Org | A1 | B1 | C1 | SUM1 | A2 | B2 | C2 | SUM2 |
----|----|----|----|------|----|----|----|------|
x | 1 | 2 | 6 | 9 | 3 | 3 | 9 | 15 |
y | 2 | 3 | 5 | 10 | 4 | 5 | 3 | 12 |
z | 3 | 4 | 7 | 14 | 2 | 1 | 5 | 8 |
I would like to have a scatter plot with SUM1 on the X-axis and SUM2 on the Y-axis. I want one dot for each Org.
Also, I would like to filter which of A1, B1 or C1 is involved in SUM1 calculation. The same regarding A2, B2 or C2 and SUM2.
The effect I want to get is to visualize how each of these variables affects the total calculation plot when I take them out.
Is this possible at all? Is there another suggested approach?
Any comments will be much appreciated.
Thanks.
Given that you "would like to filter which of A1, B1 or C1 is involved in SUM1 calculation", SUM1 and SUM2 cannot be calculated columns. For calculations dynamically responsive to filters/slicers, you need to write measures.
I could solve the issue by doing the following:
1. Unpivoting the original table.
2. Calculating one measure for SUM1 and another for SUM2, each filtering the corresponding values from the "Attribute" column.
3. Plotting the measures, one on each axis, and placing Org in "Legend" to get one dot for each Org.
4. Using a slicer "Attribute SUM1" to filter the values of the "Attribute" column of the unpivoted table (i.e. the columns of the original table) that affect the SUM1 measure, and then doing the same for SUM2.

How to find a cell based on row and col criteria?

I have a table like this:
| a | b | c |
x | 1 | 8 | 6 |
y | 5 | 4 | 2 |
z | 7 | 3 | 5 |
What I want to do is find a value based on the row and column titles; for example, if I have c and y, it should return 2. What function(s) should I use to do this in OpenOffice Calc?
later:
I tried =INDEX(B38:K67;MATCH("c";B37:K37;0);MATCH("y";A38:A67;0)), but it reports an invalid argument.
It turned out I wrote the arguments of INDEX in the wrong order. The formula =INDEX(B38:K67;MATCH("y";A38:A67;0);MATCH("c";B37:K37;0)) works properly. The second argument is the row number, not the column number.

Retrieve row number according to value of another variable in index

This problem is very simple in R, but I can't seem to get it to work in Stata.
I want to use the square brackets index, but with an expression in it that involves another variable, i.e. for a variable with unique values cumul I want:
replace country = country[cumul==20] in 12
cumul == 20 corresponds to row number 638 in the dataset, so the above should replace in line 12 the country variable with the value of that same variable in line 638. The above expression is clearly not the right way to do it: it just replaces the country variable in line 12 with a missing value.
Stata's row indexing does not work in that way. What you can do, however, is a simple two-line solution:
levelsof country if cumul==20
replace country = "`r(levels)'" in 12
If you want to be sure that cumul==20 uniquely identifies just a single value of country, add:
assert `:word count `r(levels)''==1
between the two lines.
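Another way to do the lookup is to find the observation number first and then use it as a subscript. This is only a sketch under assumptions: the toy data, the variable name obsno and the local names n and target are invented for illustration, and cumul == 20 is assumed to match exactly one observation.

```stata
clear
input str10 country cumul
"France"   5
"Germany" 10
"Spain"   20
"Italy"   40
end

* Record observation numbers, then find the one where cumul == 20
generate long obsno = _n
summarize obsno if cumul == 20, meanonly
local n = r(N)
local target = r(min)

* Stop if cumul == 20 does not identify exactly one observation
assert `n' == 1

* Copy the value from that observation into observation 1
replace country = country[`target'] in 1
list
```

In the question's dataset, the last replace would use in 12 rather than in 1.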
It's probably worth explaining why the construct in the question doesn't work as you wish, beyond "Stata is not R!".
Given a variable x: in a reference like x[1] the [1] is referred to as a subscript, despite nothing being written below the line. The subscript is the observation number, the number being always that in the dataset as currently held in memory.
Stata allows expressions within subscripts; they are evaluated observation by observation and the result is then used to look-up values in variables. Consider this sandbox:
clear
input float y
1
2
3
4
5
end
. gen foo = y[mod(_n, 2)]
(2 missing values generated)
. gen x = 3
. gen bar = y[y == x]
(4 missing values generated)
. list
     +-------------------+
     | y   foo   x   bar |
     |-------------------|
  1. | 1     1   3     . |
  2. | 2     .   3     . |
  3. | 3     1   3     1 |
  4. | 4     .   3     . |
  5. | 5     1   3     . |
     +-------------------+
mod(_n, 2) is the remainder on dividing the observation number _n by 2: that is 1 for odd observation numbers and 0 for even numbers. Observation 0 is not in the dataset (Stata starts indexing at 1). It's not an error to refer to values in that observation, but the result is returned as missing (numeric missing here, and empty strings "" if the variable is string). Hence foo is y[1], or 1, for odd observation numbers and missing for even numbers.
True or false expressions are evaluated as 1 if true and 0 if false. Thus y == x is true only in observation 3, and so bar is the value of y[1] there and missing everywhere else. Stata doesn't have the special (and useful) twist in R that it is the subscripts for which a true or false expression is true that are used to select zero or more values.
There are ways of using subscripts to get special effects. This example shows one. (It's much easier to get the same kind of result in Mata.)
. gen random = runiform()
. sort random
. gen obs = _n
. sort y
. gen randomsorted = random[obs]
. l
     +-----------------------------------------------+
     | y   foo   x   bar     random   obs   random~d |
     |-----------------------------------------------|
  1. | 1     1   3     .   .3488717     4   .0285569 |
  2. | 2     .   3     .   .2668857     3   .1366463 |
  3. | 3     1   3     1   .1366463     2   .2668857 |
  4. | 4     .   3     .   .0285569     1   .3488717 |
  5. | 5     1   3     .   .8689333     5   .8689333 |
     +-----------------------------------------------+
This answer doesn't cover matrices in Stata or Mata.