If-Else logic flow - if-statement

Can anyone explain to me why the distribution for this code:
InfectionHistory <- rep(1,100)
for(x in 1:99)
{
r <- runif(1)
print(r)
if(InfectionHistory[x]==1)
{
if(r < 0.04)
{
InfectionHistory[x+1] <- 2
}
else
{
InfectionHistory[x+1] <- InfectionHistory[x]
}
}
if(InfectionHistory[x]==2)
{
if(r < 0.11)
{
InfectionHistory[x+1] <- 1
}
else
{
InfectionHistory[x+1] <- InfectionHistory[x]
}
}
}
plot(InfectionHistory, xlab = "Day", ylab = "State", main = "Patient Status per Day", type = "o")
is different from this code:
InfectionHistory <- rep(1,100)
for(x in 1:99)
{
r <- runif(1)
print(r)
if((InfectionHistory[x]==1)&&(r < 0.04))
{
InfectionHistory[x+1] <- 2
}
if((InfectionHistory[x]==2)&&(r < 0.11))
{
InfectionHistory[x+1] <- 1
}
}
plot(InfectionHistory, xlab = "Day", ylab = "State", main = "Patient Status per Day", type = "o")
I feel like it has to do with the logic of if-else statements. The objective of the code is to simulate an infection model along the lines of a Markov chain.

The first example contains if else pairs nested within if statements.
The second example contains if statements with compound conditions.
Upon cursory review, it looks like each example should do the same thing. This code serves as an example that nested if statements can be rewritten as if statements with compound conditions.
For example, in example 1, when we get to this line of code:
if(r < 0.04)
{
InfectionHistory[x+1] <- 2
}
We already know that
InfectionHistory[x]==1
has returned true, or else we wouldn't be within the body of the if that checked this condition.
In the second example, this is simply rewritten as:
if((InfectionHistory[x]==1)&&(r < 0.04))
{
InfectionHistory[x+1] <- 2
}
EDIT: Looking at it again, the second example doesn't seem to have a case for handling:
else
{
InfectionHistory[x+1] <- InfectionHistory[x]
}
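This bit of code appears twice in the first example, paired with each nested if statement. Without it, InfectionHistory[x+1] keeps the value it was initialised with by rep(1,100), so in the second example a patient in state 2 falls back to state 1 on the next day whenever r >= 0.11, instead of staying in state 2. That is why the two distributions differ. The second example can easily be rewritten to handle this with a simple if / else if / else structure. For example: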
if((InfectionHistory[x]==1)&&(r < 0.04))
{
InfectionHistory[x+1] <- 2
}
else if((InfectionHistory[x]==2)&&(r < 0.11))
{
InfectionHistory[x+1] <- 1
}
else
{
InfectionHistory[x+1] <- InfectionHistory[x]
}
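The effect of the missing else can be checked with a small sketch. It is written in JavaScript rather than R, the function name simulate and the carryForward flag are made up for illustration, and the random draws are passed in as an array so that the run is repeatable:

```javascript
// Hypothetical helper: replays the two-state chain for a fixed list of draws.
// carryForward = true mimics the first example (with the else branches),
// carryForward = false mimics the second example (no else branch).
function simulate(draws, carryForward) {
  // Same initialisation as rep(1, 100): every day starts out as state 1.
  const history = new Array(draws.length + 1).fill(1);
  draws.forEach((r, x) => {
    if (history[x] === 1 && r < 0.04) {
      history[x + 1] = 2;            // 1 -> 2
    } else if (history[x] === 2 && r < 0.11) {
      history[x + 1] = 1;            // 2 -> 1
    } else if (carryForward) {
      history[x + 1] = history[x];   // stay in the current state
    }
    // Without carryForward, history[x + 1] silently keeps its initial 1.
  });
  return history;
}

const draws = [0.01, 0.5, 0.5, 0.05, 0.5];
console.log(simulate(draws, true));  // [1, 2, 2, 2, 1, 1]
console.log(simulate(draws, false)); // [1, 2, 1, 1, 1, 1]
```

With the same draws, the version without the carry-forward branch leaves state 2 after a single day, because the next slot still holds its initial 1.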

Making groups (combinations) of objects using their min/max values

First of all, this is my first question, you can tell me how to improve it and what tags to use.
What I am trying to do: I have a bunch of objects that each have a minimal and a maximal value. From those values you can deduce whether two objects overlap, and if they do, they can be put together in a group.
This question might need dynamic programming to solve.
example objects:
1 ( min: 0, max: 2 )
2 ( min: 1, max: 3 )
3 ( min: 2, max: 4 )
4 ( min: 3, max: 5 )
object 1 can be grouped with objects 2, 3
object 2 can be grouped with objects 1, 3, 4
object 3 can be grouped with objects 1, 2, 4
object 4 can be grouped with objects 2, 3
as you can see there are multiple ways to group those elements
[1, 2]
[3, 4]
or
[1]
[2, 3]
[4]
or
[1]
[2, 3, 4]
or
[1, 2, 3]
[4]
now there should be some sort of rule to deduce which of the solutions is the best solution
for example least amount of groups
[1, 2]
[3, 4]
or
[1]
[2, 3, 4]
or
[1, 2, 3]
[4]
or most objects in one group
[1]
[2, 3, 4]
or
[1, 2, 3]
[4]
or any other rule that uses another attribute of said objects to compare the solutions
what I have now:
$objects = [...objects...];
$numberOfObjects = count($objects);
$groups = [];
for ($i = 0; $i < $numberOfObjects; $i++) {
$MinA = $objects[$i]['min'];
$MaxA = $objects[$i]['max'];
$groups[$i] = [$i];
for ($j = $i + 1; $j < $numberOfObjects; $j++) {
$MinB = $objects[$j]['min'];
$MaxB = $objects[$j]['max'];
if (($MinA >= $MinB && $MinA <= $MaxB) || ($MaxA >= $MinB && $MaxA <= $MaxB) || ($MinB >= $MinA && $MinB <= $MaxA)) {
array_push($groups[$i], $j);
}
}
}
this basically creates an array with indexes of objects that can be grouped together
from this point, I don't know how to proceed: how to generate all the solutions, check how good each of them is, and then pick the best one
or maybe there is even better solution that doesn't use any of this?
PHP solutions are preferred, although this problem is not PHP-specific
When I was first looking at your algorithm, I was impressed by how efficient it is :)
Here it is rewritten in javascript, because I moved away from PHP a good while ago:
function setsOf(objects){
  // declare everything locally instead of leaking implicit globals
  const numberOfObjects = objects.length
  const groups = []
  for (let i = 0; i < numberOfObjects; i++) {
    const MinA = objects[i]['min']
    const MaxA = objects[i]['max']
    groups[i] = [i]
    for (let j = i + 1; j < numberOfObjects; j++) {
      const MinB = objects[j]['min']
      const MaxB = objects[j]['max']
      if ((MinA >= MinB && MinA <= MaxB) || (MaxA >= MinB && MaxA <= MaxB) ||
          (MinB >= MinA && MinB <= MaxA)) {
        groups[i].push(j)
      }
    }
  }
  return groups
}
if you happen to also think well in javascript, you might find this form more direct (it is identical, however):
function setsOf(objects){
  let groups = []
  objects.forEach((left,i) => {
    groups[i]=[i]
    // slice(i+1) yields the objects after position i without modifying the input
    objects.slice(i+1).forEach((right, j) => {
      if ((left.min >= right.min && left.min <= right.max) ||
          (left.max >= right.min && left.max <= right.max) ||
          (right.min >= left.min && right.min <= left.max))
        groups[i].push(j+i+1)
    })
  })
  return groups
}
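As an aside, for closed intervals the three-clause condition is equivalent to a single two-comparison test: two intervals overlap exactly when each one starts no later than the other ends. A minimal sketch, where overlaps and overlapGroups are names chosen here for illustration:

```javascript
// Two closed intervals overlap when each starts no later than the other ends.
function overlaps(a, b) {
  return a.min <= b.max && b.min <= a.max;
}

// Same output shape as setsOf: for each index i, the list [i, ...later overlapping indices].
function overlapGroups(objects) {
  return objects.map((left, i) => {
    const group = [i];
    for (let j = i + 1; j < objects.length; j++) {
      if (overlaps(left, objects[j])) group.push(j);
    }
    return group;
  });
}

const sample = [{min: 0, max: 2}, {min: 1, max: 3}, {min: 2, max: 4}, {min: 3, max: 5}];
console.log(JSON.stringify(overlapGroups(sample)));
// [[0,1,2],[1,2,3],[2,3],[3]]
```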
so if we run setsOf, we get:
a = setsOf([{min:0, max:2}, {min:1, max:3}, {min:2, max:4}, {min:3, max: 5}])
[Array(3), Array(3), Array(2), Array(1)]
JSON.stringify(a)
"[[0,1,2],[1,2,3],[2,3],[3]]"
Your algorithm does impressively catch the compound groups :) A weakness is that it captures groups containing more objects than necessary, without capturing all available objects. You seem to have very custom selection criteria. To me, it seems like the groups should either be every last intersecting subset, or only subsets where each element in the group provides unique coverage: [0,1], [0,2], [1,2], [1,3], [2,3], [0,1,3]
the algorithm for that is perhaps more involved. this was my approach, and it is nowhere near as terse and elegant as yours, but it works:
function intersectingGroups (mmvs) {
const min = []
const max = []
const muxo = [...mmvs]
mmvs.forEach(byMin => {
mmvs.forEach(byMax => {
if (byMin.min === byMax.min && byMin.max === byMax.max) {
console.log('rejecting identity', byMin, byMax)
return // identity
}
if (byMax.min > byMin.max) {
console.log('rejecting non-overlapping objects', byMin, byMax)
return // non-overlapping objects
}
if ((byMax.max <= byMin.max) || (byMin.min >= byMax.min)) {
console.log('rejecting non-expansive coverage or inversed order',
byMin, byMax)
return // non-expansive coverage or inversed order
}
const entity = {min: byMin.min, max: byMax.max,
compositeOf: [byMin, byMax]}
if(muxo.some(mv => mv.min === entity.min && mv.max === entity.max))
return // enforcing Set
muxo.push(entity)
console.log('adding', byMin, byMax, muxo)
})
})
if(muxo.length === mmvs.length) {
return muxo.filter(m => 'compositeOf' in m)
// solution
} else {
return intersectingGroups(muxo)
}
}
now there should be some sort of rule to deduce which of the solutions is the best solution
Yeah, so, usually for puzzles or for a specification you are fulfilling, that would be given as part of the problem. As it is, you want a general method that is adaptable. It's probably best to make an object that can be configured with the results and accepts rules, then load the rules you are interested in, and the results from the search, and see what rules match where. For example, using your algorithm and sample criteria:
least amount of groups
start with code like:
let reviewerFactory = {
getReviewer (specification) { // generate a reviewer
return {
matches: [], // place to load sets to
criteria: specification,
review (objects) { // review the sets already loaded
let group
let results = {}
this.matches.forEach(mset => {
group = [] // gather each object from the initial set for each match in the result set
mset.forEach(m => {
group.push(objects[m])
})
results[mset] = this.criteria.scoring(group) // score the match relative to the specification
})
return this.criteria.evaluation(results) // pick the best score
}
}
},
specifications: {}
}
now you can add specifications like this one for least amount of groups:
reviewerFactory.specifications['LEAST GROUPS'] = {
scoring: function (set) { return set.length },
evaluation: function (res) { return Object.keys(res).sort((a,b) => res[a] - res[b])[0] }
}
then you can use that in the evaluation of a set:
mySet = [{min:0, max:2}, {min:1, max:3}, {min:2, max:4}, {min:3, max: 5}]
rf = reviewerFactory.getReviewer(reviewerFactory.specifications['LEAST GROUPS'])
Object {matches: Array(0), criteria: Object, review: function}
rf.matches = setsOf(mySet)
[Array(3), Array(3), Array(2), Array(1)]
rf.review(mySet)
"3"
or, most objects:
reviewerFactory.specifications['MOST OBJECTS'] = {
scoring: function (set) { return set.length },
evaluation: function (res) { return Object.keys(res).sort((a,b) => res[a] - res[b]).reverse()[0] }
}
mySet = [{min:0, max:2}, {min:1, max:3}, {min:2, max:4}, {min:3, max: 5}]
reviewer = reviewerFactory.getReviewer(reviewerFactory.specifications['MOST OBJECTS'])
reviewer.matches = setsOf(mySet)
reviewer.review(mySet)
"1,2,3"
Of course this is arbitrary, but so are the criteria, by definition in the OP. Likewise, you would have to change the algorithms here to work with my intersectingGroups function because it doesn't return indices. But this is what you are looking for I believe.
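For the "least amount of groups" rule in particular, there is a well-known greedy technique that avoids enumerating every solution. It rests on an assumption that is mine and not stated in the question: a valid group is one whose intervals all share at least one common value. Under that assumption, sort the intervals by max and start a new group only when an interval begins after the shared point of the current group. A sketch, with fewestGroups as a hypothetical name:

```javascript
// Greedy sketch: fewest groups such that all intervals in a group share a common value.
// Assumes closed intervals given as { min, max }.
function fewestGroups(objects) {
  // Remember the original indices, then process intervals by ascending max.
  const order = objects
    .map((o, index) => ({ ...o, index }))
    .sort((a, b) => a.max - b.max);
  const groups = [];
  let point = null; // a value contained in every interval of the current group
  for (const o of order) {
    if (point === null || o.min > point) {
      // This interval does not contain the current point: open a new group.
      point = o.max;
      groups.push([]);
    }
    groups[groups.length - 1].push(o.index);
  }
  return groups;
}

const sample = [{min: 0, max: 2}, {min: 1, max: 3}, {min: 2, max: 4}, {min: 3, max: 5}];
console.log(JSON.stringify(fewestGroups(sample)));
// [[0,1,2],[3]]
```

With 0-based indices this is the [1, 2, 3] / [4] solution from the question. Other rules, such as "most objects in one group", would still need a scoring step like the reviewer above.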

R function for pattern matching

I am doing a text mining project that will analyze some speeches from the three remaining presidential candidates. I have completed POS tagging with OpenNLP and created a two column data frame with the results. I have added a variable, called pair. Here is a sample from the Clinton data frame:
V1 V2 pair
1 c( NN FALSE
2 "thank VBP FALSE
3 you PRP FALSE
4 so RB FALSE
5 much RB FALSE
6 . . FALSE
7 it PRP FALSE
8 is VBZ FALSE
9 wonderful JJ FALSE
10 to TO FALSE
11 be VB FALSE
12 here RB FALSE
13 and CC FALSE
14 see VB FALSE
15 so RB FALSE
16 many JJ FALSE
17 friends NNS FALSE
18 . . FALSE
19 ive JJ FALSE
20 spoken VBN FALSE
What I'm now trying to do is write a function that will iterate through the V2 POS column and evaluate it for specific pattern pairs. (These come from Turney's PMI article.) I'm not yet very knowledgeable when it comes to writing functions, so I'm certain I've done it wrong, but here is what I've got so far.
pairs <- function(x){
JJ <- "JJ" #adjectives
N <- "N[A-Z]" #any noun form
R <- "R[A-Z]" #any adverb form
V <- "V[A-Z]" #any verb form
for(i in 1:(length)(x) {
if(x == J && x+1 == N) { #i.e., if the first word = J and the next = N
pair[i] <- "JJ|NN" #insert this into the 'pair' variable
} else if (x == R && x+1 == J && x+2 != N) {
pair[i] <- "RB|JJ"
} else if (x == J && x+1 == J && x+2 != N) {
pair[i] <- "JJ|JJ"
} else if (x == N && x+1 == J && x+2 != N) {
pair[i] <- "NN|JJ"
} else if (x == R && x+1 == V) {
pair[i] <- "RB|VB"
} else {
pair[i] <- "FALSE"
}
}
}
# Run the function
cl.df.pairs <- pairs(cl.df$V2)
There are a number of (truly embarrassing) issues. First, when I try to run the function code, I get two Error: unexpected '}' in " }" errors at the end. I can't figure out why, because they match opening "{". I'm assuming it's because R is expecting something else to be there.
Also, and more importantly, this function won't exactly get me what I want, which is to extract the word pairs that match a pattern and then the pattern that they match. I honestly have no idea how to do that.
Then I need to figure out how to evaluate the semantic orientation of each word combo by comparing the phrases to the pos/neg lexical data sets that I have, but that's a whole other issue. I have the formula from the article, which I'm hoping will point me in the right direction.
I have looked all over and can't find a comparable function in any of the NLP packages, such as OpenNLP, RTextTools, etc. I HAVE looked at other SO questions/answers, like this one and this one, but they haven't worked for me when I've tried to adapt them. I'm fairly certain I'm missing something obvious here, so would appreciate any advice.
EDIT:
Here is the first 20 lines of the Sanders data frame.
head(sa.POS.df, 20)
V1 V2
1 the DT
2 american JJ
3 people NNS
4 are VBP
5 catching VBG
6 on RB
7 . .
8 they PRP
9 understand VBP
10 that IN
11 something NN
12 is VBZ
13 profoundly RB
14 wrong JJ
15 when WRB
16 , ,
17 in IN
18 our PRP$
19 country NN
20 today NN
And I've written the following function:
pairs <- function(x, y) {
require(gsubfn)
J <- "JJ" #adjectives
N <- "N[A-Z]" #any noun form
R <- "R[A-Z]" #any adverb form
V <- "V[A-Z]" #any verb form
for(i in 1:(length(x))) {
ngram <- c(x[[i]], x[[i+1]])
# the ngram consists of the word on line `i` and the word below line `i`
}
strapply(y[i], "(J)\n(N)", FUN = paste(ngram, sep = " "), simplify = TRUE)
ngrams.df = data.frame(ngrams=ngram)
return(ngrams.df)
}
So, what is SUPPOSED to happen is that when strapply matches the pattern (in this case, an adjective followed by a noun), it should paste the ngram. And all of the resulting ngrams should populate the ngrams.df.
So I've entered the following function call and get an error:
> sa.JN <- pairs(x=sa.POS.df$V1, y=sa.POS.df$V2)
Error in x[[i + 1]] : subscript out of bounds
I'm only just learning the intricacies of regular expressions, so I'm not quite sure how to get my function to pull the actual adjective and noun. Based on the data shown here, it should pull "american" and "people" and paste them into the data frame.
Okay, here we go. Using this data (shared nicely with dput()):
df = structure(list(V1 = structure(c(15L, 3L, 11L, 4L, 5L, 9L, 2L,
16L, 18L, 14L, 13L, 8L, 12L, 20L, 19L, 1L, 7L, 10L, 6L, 17L), .Label = c(",",
".", "american", "are", "catching", "country", "in", "is", "on",
"our", "people", "profoundly", "something", "that", "the", "they",
"today", "understand", "when", "wrong"), class = "factor"), V2 = structure(c(3L,
5L, 7L, 12L, 11L, 10L, 2L, 8L, 12L, 4L, 6L, 13L, 10L, 5L, 14L,
1L, 4L, 9L, 6L, 6L), .Label = c(",", ".", "DT", "IN", "JJ", "NN",
"NNS", "PRP", "PRP$", "RB", "VBG", "VBP", "VBZ", "WRB"), class = "factor")), .Names = c("V1",
"V2"), class = "data.frame", row.names = c("1", "2", "3", "4",
"5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18", "19", "20"))
I'll use the stringr package because of its consistent syntax, so I don't have to look up the argument order for grep. We'll first detect the adjectives, then the nouns, and figure out where they line up (offsetting by 1). Then we paste together the words that correspond to the matches.
library(stringr)
adj = str_detect(df$V2, "JJ")
noun = str_detect(df$V2, "NN")
pairs = which(c(FALSE, adj) & c(noun, FALSE))
ngram = paste(df$V1[pairs - 1], df$V1[pairs])
# [1] "american people"
Now we can put it in a function. I left the patterns as arguments (with adjective, noun as the defaults) for flexibility.
bigram = function(word, type, patt1 = "JJ", patt2 = "N[A-Z]") {
pairs = which(c(FALSE, str_detect(type, pattern = patt1)) &
c(str_detect(type, patt2), FALSE))
return(paste(word[pairs - 1], word[pairs]))
}
Demonstrating use on the original data
with(df, bigram(word = V1, type = V2))
# [1] "american people"
Let's cook up some data with more than one match to make sure it works:
df2 = data.frame(w = c("american", "people", "hate", "a", "big", "bad", "bank"),
t = c("JJ", "NNS", "VBP", "DT", "JJ", "JJ", "NN"))
df2
# w t
# 1 american JJ
# 2 people NNS
# 3 hate VBP
# 4 a DT
# 5 big JJ
# 6 bad JJ
# 7 bank NN
with(df2, bigram(word = w, type = t))
# [1] "american people" "bad bank"
And back to the original to test out a different pattern:
with(df, bigram(word = V1, type = V2, patt1 = "N[A-Z]", patt2 = "V[A-Z]"))
# [1] "people are" "something is"
I think the following is the code you wrote, but without throwing errors:
pairs <- function(x) {
  # x is the whole data frame; the POS tags are in column 2
  J <- "^JJ"      #adjectives
  N <- "^N[A-Z]"  #any noun form
  R <- "^R[A-Z]"  #any adverb form
  V <- "^V[A-Z]"  #any verb form
  # J, N, R and V are regular expressions, so test them with grepl() rather than ==
  has.tag <- function(pattern, tag) grepl(pattern, as.character(tag))
  pair <- rep("FALSE", nrow(x)) # one entry per row (length(x) would be the number of columns)
  for(i in 1:(nrow(x)-2)) {
    this.pos <- x[i,2]
    next.pos <- x[i+1,2]
    next.next.pos <- x[i+2,2]
    if(has.tag(J, this.pos) && has.tag(N, next.pos)) { #i.e., if the first word = J and the next = N
      pair[i] <- "JJ|NN" #insert this into the 'pair' variable
    } else if (has.tag(R, this.pos) && has.tag(J, next.pos) && !has.tag(N, next.next.pos)) {
      pair[i] <- "RB|JJ"
    } else if (has.tag(J, this.pos) && has.tag(J, next.pos) && !has.tag(N, next.next.pos)) {
      pair[i] <- "JJ|JJ"
    } else if (has.tag(N, this.pos) && has.tag(J, next.pos) && !has.tag(N, next.next.pos)) {
      pair[i] <- "NN|JJ"
    } else if (has.tag(R, this.pos) && has.tag(V, next.pos)) {
      pair[i] <- "RB|VB"
    } else {
      pair[i] <- "FALSE"
    }
  }
  ## then deal with the last two elements, for which you can't check what's up next
  return(pair)
}
not sure what you mean by this, though:
Also, and more importantly, this function won't exactly get me what I
want, which is to extract the word pairs that match a pattern and then
the pattern that they match. I honestly have no idea how to do that.
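If the quoted goal is to get each matching word pair together with the label of the pattern it matched, the idea can be sketched as follows. This is written in JavaScript rather than R; tagPairs, the rule objects and their field names are all made up for illustration, and the tag prefixes only approximate the patterns in the question:

```javascript
// Hypothetical rule table: a label plus regular expressions for the first tag,
// the second tag, and (optionally) a third tag that must NOT follow.
const rules = [
  { label: 'JJ|NN', first: /^JJ/, second: /^NN/ },
  { label: 'RB|JJ', first: /^RB/, second: /^JJ/, notThird: /^NN/ },
  { label: 'JJ|JJ', first: /^JJ/, second: /^JJ/, notThird: /^NN/ },
  { label: 'NN|JJ', first: /^NN/, second: /^JJ/, notThird: /^NN/ },
  { label: 'RB|VB', first: /^RB/, second: /^VB/ },
];

// Returns one { phrase, pattern } record per position where some rule matches.
function tagPairs(words, tags, ruleList) {
  const found = [];
  for (let i = 0; i + 1 < tags.length; i++) {
    const third = tags[i + 2] || ''; // empty string past the end of the data
    const rule = ruleList.find(r =>
      r.first.test(tags[i]) && r.second.test(tags[i + 1]) &&
      !(r.notThird && r.notThird.test(third)));
    if (rule) {
      found.push({ phrase: words[i] + ' ' + words[i + 1], pattern: rule.label });
    }
  }
  return found;
}

const words = ['the', 'american', 'people', 'are', 'profoundly', 'wrong', 'when'];
const tags = ['DT', 'JJ', 'NNS', 'VBP', 'RB', 'JJ', 'WRB'];
console.log(tagPairs(words, tags, rules));
// [ { phrase: 'american people', pattern: 'JJ|NN' },
//   { phrase: 'profoundly wrong', pattern: 'RB|JJ' } ]
```

Keeping the phrase and the pattern label in the same record means the later lookup against the positive/negative lexicons can work on the phrase while still knowing which rule produced it.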

C++ to VBA (Excel)

So, basically, in Excel, I have 4 columns of data (all with strings) that I want to process, and want to have the results in another column, like this (nevermind the square brackets, they just represent cells):
Line Column1 Column2 Column3 Column4 Result
1: [a] [b] [k] [YES] [NO]
2: [a] [c] [l] [YES] [NO]
3: [b] [e] [] [YES] [NO]
4: [c] [e] [f] [NO] [NO]
5: [d] [h] [b] [NO] [NO]
6: [d] [] [w] [NO] [NO]
7: [e] [] [] [YES] [NO]
8: [j] [m] [] [YES] [YES]
9: [j] [] [] [YES] [YES]
10: [] [] [] [YES] [YES]
The process that I want the data to go through is this:
Assume that CheckingLine is the Line for which I currently want to calculate the value of Result, and that CurrentLine is any Line (except CheckingLine) that I am using to calculate the value of Result, at a given moment.
1. If Column4[CheckingLine] is "NO", Result is "NO" (simple enough, no help needed);
Example: CheckingLine = 4 -> Column4[4] = "NO" -> Result = "NO";
2. Else, I want to make sure that all Lines that share a common value with CheckingLine (in any Column between 1 and 3) also have Column4 as "YES" (doing that would be simple enough even without VBA; in fact, I started by doing it in plain Excel and realised that it wasn't what I wanted). If that happens, Result is "YES";
Example: CheckingLine = 8 -> Only shared value is "j" -> CurrentLine = 9 -> Column4[9] = "YES" -> Result = "YES";
3. Here's the tricky part: If one of those lines has any value (again, in any Column between 1 and 3) that IS NOT shared with CheckingLine, I want to do the whole process (restart at 1.), but checking the CurrentLine instead.
Example: CheckingLine = 2, "a" is shared with Line 1, "c" is shared with Line 4 -> CurrentLine = 1 -> Column4[1] = "YES", but "b" and "k" are not shared with CheckingLine -> CheckingLine' = 1 -> "b" is shared with Line 5 -> Column4[5] = "NO" -> Result = "NO";
I have written the corresponding C++ code, which works. It could have been in any other language; C++ was just the one I was using at the moment. The code HAS NOT been optimized in any way, because its purpose was to be AS CLEAR about its functionality AS POSSIBLE. The table above is the actual result of running it:
#include <iostream>
#include <string>
#include <vector>
std::vector<std::string> column1, column2, column3, column4, contentVector;
unsigned int location, columnsSize;
void InsertInVector(std::string Content)
{
if(Content == "")
{
return;
}
for(unsigned int i = 0; i < contentVector.size(); i++)
{
if(contentVector[i] == Content)
{
return;
}
}
contentVector.push_back(Content);
}
std::string VerifyCurrentVector(unsigned int Start)
{
std::string result = "";
if(contentVector.size() == 0)
{
result = "YES";
}
else
{
unsigned int nextStart = contentVector.size();
for(unsigned int i = 0; i < columnsSize; i++)
{
if(i != location)
{
for(unsigned int j = Start; j < nextStart; j++)
{
if(column1[i] == contentVector[j])
{
InsertInVector(column2[i]);
InsertInVector(column3[i]);
}
else if(column2[i] == contentVector[j])
{
InsertInVector(column1[i]);
InsertInVector(column3[i]);
}
else if(column3[i] == contentVector[j])
{
InsertInVector(column1[i]);
InsertInVector(column2[i]);
}
}
}
}
if(nextStart == contentVector.size())
{
for(unsigned int i = 0; i < columnsSize; i++)
{
if(i != location)
{
for(unsigned int j = 0; j < nextStart; j++)
{
if(column1[i] == contentVector[j] || column2[i] ==
contentVector[j] || column3[i] == contentVector[j])
{
if(column4[i] == "NO")
{
result = "NO";
return result;
}
}
}
}
}
result = "YES";
}
else
{
result = VerifyCurrentVector(nextStart);
}
}
return result;
}
std::string VerifyCell(unsigned int Location)
{
std::string result = "";
location = Location - 1;
if(column4.size() < Location)
{
result = "Error";
}
else if(column4[location] == "NO")
{
result = "NO";
}
else
{
contentVector.clear();
InsertInVector(column1[location]);
InsertInVector(column2[location]);
InsertInVector(column3[location]);
result = VerifyCurrentVector(0);
}
return result;
}
void SetUpColumns(std::vector<std::string> &Column1, std::vector<std::string> &Column2,
std::vector<std::string> &Column3, std::vector<std::string> &Column4)
{
if(Column4.size() > Column1.size())
{
for(unsigned int i = Column1.size(); i < Column4.size(); i++)
{
Column1.push_back("");
}
}
if(Column4.size() > Column2.size())
{
for(unsigned int i = Column2.size(); i < Column4.size(); i++)
{
Column2.push_back("");
}
}
if(Column4.size() > Column3.size())
{
for(unsigned int i = Column3.size(); i < Column4.size(); i++)
{
Column3.push_back("");
}
}
column1 = Column1;
column2 = Column2;
column3 = Column3;
column4 = Column4;
columnsSize = Column4.size();
}
int main()
{
std::vector<std::string> Column1, Column2, Column3, Column4;
Column1.push_back("a");
Column1.push_back("a");
Column1.push_back("b");
Column1.push_back("c");
Column1.push_back("d");
Column1.push_back("d");
Column1.push_back("e");
Column1.push_back("j");
Column1.push_back("j");
Column2.push_back("b");
Column2.push_back("c");
Column2.push_back("e");
Column2.push_back("e");
Column2.push_back("h");
Column2.push_back("");
Column2.push_back("");
Column2.push_back("m");
Column3.push_back("k");
Column3.push_back("l");
Column3.push_back("");
Column3.push_back("f");
Column3.push_back("b");
Column3.push_back("w");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("NO");
Column4.push_back("NO");
Column4.push_back("NO");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("YES");
SetUpColumns(Column1, Column2, Column3, Column4);
std::cout << "Line\t" << "Column1\t" << "Column2\t" << "Column3\t" << "Column4\t" <<
std::endl;
for(unsigned int i = 0; i < Column4.size(); i++)
{
std::cout << i + 1 << ":\t" << "[" << column1[i] << "]\t[" << column2[i] <<
"]\t[" << column3[i] << "]\t[" << column4[i] << "]\t[" << VerifyCell(i + 1)
<< "]" << std::endl;
}
return 0;
}
So, after this lengthy explanation, what I want to know is this:
Is there any way to do this in Excel's VBA (or even better, in plain Excel without VBA)?
If not, how can I have my code (which I can easily translate to another C-like language and/or optimise) get the data from, and deliver the results to, Excel?
Is there any way to do this in Excel's VBA?
Yes, you can surely do this with VBA; it is a complete and powerful programming language.
(or even better, in plain Excel without VBA)?
Nope. The calculation seems too complicated to fit with Excel formulae without any VBA code.
If not, how can I have my code (which I can easily translate to another C-like language and/or optimise) get the data from, and deliver the results to, Excel?
You can access Excel from C++ in many ways. Using ATL is one of them. Another, easier way would be to import/export your Excel file in CSV format, which is easy to parse and write from C++.
Also consider C#, it has complete COM inter-operability to access office components.
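Whichever language ends up doing the work, the rule in the question amounts to this: starting from the checked line, follow shared values from line to line, and answer "NO" as soon as any reachable line has "NO" in Column4. A compact sketch of that idea in JavaScript, where verifyRows and the row shape are hypothetical and the data is the table from the question:

```javascript
// Hypothetical data shape: each row is { values: [col1, col2, col3], flag: 'YES' | 'NO' }.
function verifyRows(rows) {
  // Two rows are linked when they share at least one non-empty value.
  const shares = (a, b) => a.values.some(v => v !== '' && b.values.includes(v));
  return rows.map((row, start) => {
    if (row.flag === 'NO') return 'NO';
    // Breadth-first walk over every row reachable through shared values.
    const visited = new Set([start]);
    const queue = [start];
    while (queue.length > 0) {
      const current = queue.shift();
      for (let k = 0; k < rows.length; k++) {
        if (visited.has(k) || !shares(rows[current], rows[k])) continue;
        if (rows[k].flag === 'NO') return 'NO'; // one linked "NO" decides the result
        visited.add(k);
        queue.push(k);
      }
    }
    return 'YES';
  });
}

const table = [
  { values: ['a', 'b', 'k'], flag: 'YES' },
  { values: ['a', 'c', 'l'], flag: 'YES' },
  { values: ['b', 'e', ''], flag: 'YES' },
  { values: ['c', 'e', 'f'], flag: 'NO' },
  { values: ['d', 'h', 'b'], flag: 'NO' },
  { values: ['d', '', 'w'], flag: 'NO' },
  { values: ['e', '', ''], flag: 'YES' },
  { values: ['j', 'm', ''], flag: 'YES' },
  { values: ['j', '', ''], flag: 'YES' },
  { values: ['', '', ''], flag: 'YES' },
];
console.log(verifyRows(table).join(' '));
// NO NO NO NO NO NO NO YES YES YES
```

The same walk translates directly to VBA with a Collection or Scripting.Dictionary in place of the Set.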
OK, if you like to have "whipped the code in a rush" then you'll love VBA. Next time, please try to ask a more specific question. Based on your code and comments, @MikeAscended, you're a relatively good programmer with a grasp of functions/recursion, variables/parameters, conditions, loops, data structures, etc. Re: "I have only touched VBA once in my life and ran away from it": my intent is to get you started and give you the syntax here, not necessarily a working solution. I'm happy to answer any further specific questions you may have.
Strategy-wise,
I recommend plain VBA which is easy to use in Excel. Obviously your problem can be solved in many ways including formulas, however VBA is a powerful tool that any programmer will benefit from using.
Code-wise,
To start access the editor from Excel press [Alt-F11], or from Design Mode insert and double-click an ActiveX button. To run a macro press [Alt-F8], or in VBA click the green play button.
One last note: if you want those line numbers in column 1 in Excel, then your columns will become columns 2-6, or B-F. I'm assuming you'll use the row numbers in Excel so that Column 1 is A, but row 1 will still hold the titles, so your data starts on row 2.
sub processResults_Col5()
' Run This Script as Main()
dim rowCount as long, i as long 'rowCount = columnsSize
with sheets(1)
.Range("A1:D1") = Array("a", "b", "k", "YES")
' finish init here
' SetUpColumns not necessary in excel
if .cells(2,1).value <> "" then 'do not use .end(xldown) if data is missing
rowCount = .cells(1,1).end(xldown).row
for i = 1 to rowCount
.cells(i,5) = verifyCell(i + 1, rowCount)
next i
endif 'space will be added :p
end with
end sub
function verifyCell(rowLocation as long, size as long, optional wSh as excel.worksheet) as string
    ' the rest should be easy for you to figure out based on C-code
    if wSh is nothing then set wSh = activesheet 'let VBA capitalize stuff so you know you typed it correctly
    with wSh
        if size < rowLocation then
            verifyCell = "Error" 'the function name is the return value
            'msgbox "Error" ' you can uncomment this line to see error
        elseif .cells(rowLocation, 4).value = "NO" then
            verifyCell = "NO" 'return the result; the caller writes it to column 5
        else
            call InsertInVector(rowLocation) 'CheckingLine
            ' edit the current rowLocation with for loops
            verifyCell = VerifyCurrentVector(0) 'whatever you're doing here
        endif
    end with
end function
sub InsertInVector(rowLocation as long)
    ' stub: collect the values of this row, as in the C++ InsertInVector
end sub
function VerifyCurrentVector(start as long) as string 'a function, because it returns a value
    ' stub: port the C++ VerifyCurrentVector here
end function
Some tips:
Generally, Comment Your Code!
Generally, The first word/acronym of Variable and Object names should start in lowercase, then continue in camel-case. This helps distinguish them from library types.
In VBA always put [option explicit] in the beginning of every sheet/module, this requires you to [dim varName as Type] which will help debugging and make your code more explicit so it's easy to understand.
In VBA for numbers use type Long, learn early vs late-binding. If you're instantiating any object that requires a reference/library, always state it explicitly. This includes Excel.Worksheet, Excel.Workbook, etc. (eg. you may want your code in MS Access)
In Office, one of the first settings you're going to want to disable is the popup error window. Also use debug.print and the immediate window a few times.
Generally, as you know from C++, take your time and try to write correct code on the first try, as this will save you debugging time. Try not to rush, and keep coffee & healthy snacks on hand. Good luck and have fun :)

rankall : returning the correct data frame to rank hospitals on performance

This is a solution (not working well) to a Coursera problem. I'm trying to rank a data frame containing the names of hospitals based on their performance on 3 different conditions. (I found another solution to this question at How to subset a row from list based on condition.) I think I'm not subsetting right and I don't return the correct data frame at the end. I'm really new to programming and R. Thank you for your help.
rankall <- function(outcome, num = 'best'){
data <- read.csv('outcome-of-care-measures.csv', colClasses = 'character')
data[,11] <- as.numeric(data[,11])
data[,17] <- as.numeric(data[,17])
data[17] <- as.numeric(data[,23])
states <- sort(unique(data$State))
conditions <- data[c(11,17,23)]
if(!state %in% states){stop('invalid state')}
if(!outcome %in% conditions){stop('invalid outcome')}
for (i in 1:length(states)){
statedata <-data[data$State == state[i],]
if(outcome == 'heart attack'){column <- (statedata[,11]}
if(outcome == 'heart failure') {column <-(statedata[,17]}
if(outcome == 'pneumonia') {column <- statedata[,23]}
rankedhospitals <- c()
rankcondition <- rank(column, na.last = NA)
if (num == 'best'){num <- 1}
if(num == 'worst'){num <- nrow(rankcondition)}
rankedhospitals[i] <- statedata$Hospital.Name[order(column, statedata$Hospital.Name)[num]]
rankedhospitals <- cbind(rankedhospitals,states[num,2])
}
return (c('rankedhospitals', 'states'))
}
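The intended ranking logic, as far as it can be read from the code above, can be sketched as follows. This is JavaScript rather than R; rankAll, the { name, state, rate } record shape and the treatment of missing rates as NaN are assumptions made for the sketch, not part of the assignment:

```javascript
// rows: [{ name, state, rate }], where rate is NaN when the value is missing (assumed shape).
// num is 'best', 'worst', or a 1-based rank.
function rankAll(rows, num = 'best') {
  // Group the hospitals by state, dropping those with no usable rate.
  const byState = new Map();
  for (const r of rows) {
    if (Number.isNaN(r.rate)) continue;
    if (!byState.has(r.state)) byState.set(r.state, []);
    byState.get(r.state).push(r);
  }
  // One result per state, states in alphabetical order.
  return [...byState.keys()].sort().map(state => {
    // Lower rate ranks first; ties are broken alphabetically by hospital name.
    const ranked = byState.get(state).sort(
      (a, b) => a.rate - b.rate || a.name.localeCompare(b.name));
    const index = num === 'best' ? 0 : num === 'worst' ? ranked.length - 1 : num - 1;
    const hit = ranked[index];
    return { hospital: hit ? hit.name : null, state };
  });
}

const hospitals = [
  { name: 'B', state: 'TX', rate: 10 },
  { name: 'A', state: 'TX', rate: 10 },
  { name: 'C', state: 'TX', rate: 12 },
  { name: 'D', state: 'AL', rate: NaN },
  { name: 'E', state: 'AL', rate: 9 },
];
console.log(rankAll(hospitals, 'best'));
// [ { hospital: 'E', state: 'AL' }, { hospital: 'A', state: 'TX' } ]
```

The points that differ from the posted code are that the ranking is done once per state inside the loop, the result for every state is collected, and the collected table is what gets returned. A state whose hospitals all lack data is simply absent from this sketch's output.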

trying to append a list, but something breaks

I'm trying to create an empty list which will have as many elements as there are num.of.walkers. I then try to append, to each created element, a new sub-list (the length of each new sub-list corresponds to a value in a).
When I fiddle around in R everything goes smooth:
list.of.dist[[1]] <- vector("list", a[1])
list.of.dist[[2]] <- vector("list", a[2])
list.of.dist[[3]] <- vector("list", a[3])
list.of.dist[[4]] <- vector("list", a[4])
I then try to write a function. Here is my feeble attempt that results in an error. Can someone chip in on what I am doing wrong?
countNumberOfWalks <- function(walk.df) {
list.of.walkers <- sort(unique(walk.df$label))
num.of.walkers <- length(unique(walk.df$label))
#Pre-allocate objects for further manipulation
list.of.dist <- vector("list", num.of.walkers)
a <- c()
# Count the number of walks per walker.
for (i in list.of.walkers) {
a[i] <- nrow(walk.df[walk.df$label == i,])
}
a <- as.vector(a)
# Add a sublist (length = number of walks) for each walker.
for (i in i:num.of.walkers) {
list.of.dist[[i]] <- vector("list", a[i])
}
return(list.of.dist)
}
> num.of.walks.per.walker <- countNumberOfWalks(walk.df)
Error in vector("list", a[i]) : vector size cannot be NA
Assuming 'walk.df' is something like:
walk.df <- data.frame(label=sample(1:10,100,T),var2=1:100)
then:
countNumberOfWalks <- function(walk.df) {
  list.of.walkers <- sort(unique(walk.df$label))
  num.of.walkers <- length(unique(walk.df$label))
  list.of.dist <- vector("list", num.of.walkers)
  for (i in 1:num.of.walkers) {
    list.of.dist[[i]] <- vector("list",
      nrow(walk.df[walk.df$label == list.of.walkers[i],]))
  }
  return(list.of.dist)
}
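Will achieve what you're after. In the original, the second loop runs over i:num.of.walkers, so it starts from whatever value i was left with after the first loop rather than from 1, and a is indexed by label value rather than by position. Together these can make a[i] NA (for example when the labels are not exactly 1, 2, ..., n), which is most likely what triggers the error.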