converting a matrix of lists to a regular matrix - list

Take the following code:
foo <- list()
foo[[1]] <- list(a=1, b=2)
foo[[2]] <- list(a=11, b=22)
foo[[3]] <- list(a=111, b=222)
result <- do.call(rbind, foo)
result[,'a']
In this case, result[,'a'] shows a list. Is there a more elegant way such that result is a "regular" matrix of vectors? I imagine there are manual ways of going about this, but I was wondering if there was an obvious step that I was missing.

do.call on lists is very elegant, and fast. In fact do.call(rbind, my.list) once saved my ass when I needed to combine a huge list. It was by far the fastest solution.
To solve your problem, maybe something like:
do.call(rbind, lapply(foo, unlist))
> result.2 <- do.call(rbind, lapply(foo, unlist))
> result.2
       a   b
[1,]   1   2
[2,]  11  22
[3,] 111 222
> result.2[, 'a']
[1]   1  11 111
>

One possible solution is as follows (but I am interested in alternatives):
new.result <- matrix(unlist(result), ncol=ncol(result),
dimnames=list(NULL, colnames(result)))
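To see why result[,'a'] comes back as a list, it can help to inspect the type of the matrix that rbind builds. The following is only a small sketch using the foo data from the question; the names fixed and a_col are introduced here for illustration.

```r
foo <- list(list(a = 1, b = 2), list(a = 11, b = 22), list(a = 111, b = 222))

# rbind on lists builds a matrix whose cells are list elements
result <- do.call(rbind, foo)
typeof(result)            # "list"
is.list(result[, "a"])    # TRUE

# Flattening each inner list to a numeric vector first gives a numeric matrix
fixed <- do.call(rbind, lapply(foo, unlist))
typeof(fixed)             # "double"
a_col <- fixed[, "a"]     # plain numeric vector
```

Note that lapply(foo, unlist) only gives a clean numeric matrix when every inner list holds values of the same type; mixed types would be coerced to a common one.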


How to add column to data.table with values from list based on regex

I have the following data.table:
       id fShort
1  432-12   1245
2 3242-12 453543
3  324-32  45543
4  322-34  45343
5 2324-34  13543
DT <- data.table(
id=c("432-12", "3242-12", "324-32", "322-34", "2324-34"),
fShort=c("1245", "453543", "45543", "45343", "13543"))
and the following list:
filenames <- list("3242-124342345.png", "432-124343.png", "135-13434.jpeg")
I would like to create a new column "fComplete" that includes the complete filename from the list. For this the values of column "id" need to be matched with the filename-list. If the filename starts with the "id" string, the complete filename should be returned. I use the following regex
t <- grep("432-12","432-124343.png",value=T)
which returns the correct filename.
This is how the final table should look:
       id fShort          fComplete
1  432-12   1245     432-124343.png
2 3242-12 453543 3242-124342345.png
3  324-32  45543                 NA
4  322-34  45343                 NA
5 2324-34  13543                 NA
DT2 <- data.table(
id=c("432-12", "3242-12", "324-32", "322-34", "2324-34"),
fShort=c("1245", "453543", "45543", "45343", "13543"),
fComplete = c("432-124343.png", "3242-124342345.png", NA, NA, NA))
I tried using apply and data.table approaches but I always get warnings like
argument 'pattern' has length > 1 and only the first element will be used
What is a simple approach to accomplish this?
Here's a data.table solution:
DT[ , fComplete := lapply(id, function(x) {
m <- grep(x, filenames, value = TRUE)
if (!length(m)) NA else m})]
        id fShort          fComplete
1:  432-12   1245     432-124343.png
2: 3242-12 453543 3242-124342345.png
3:  324-32  45543                 NA
4:  322-34  45343                 NA
5: 2324-34  13543                 NA
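Since the question asks for filenames that start with the id, a variant that anchors the match at the beginning of the string may be closer to the intent. grep matches anywhere in the string and treats the id as a regular expression. The sketch below uses only base R; first_match is a hypothetical helper introduced here, and it assumes that the first matching filename is the one wanted when several match.

```r
ids <- c("432-12", "3242-12", "324-32", "322-34", "2324-34")
filenames <- c("3242-124342345.png", "432-124343.png", "135-13434.jpeg")

# Hypothetical helper: first filename beginning with `id`, or NA if none does
first_match <- function(id, files) {
  hits <- files[startsWith(files, id)]
  if (length(hits) == 0) NA_character_ else hits[1]
}

# vapply guarantees a plain character vector, one entry per id
fComplete <- vapply(ids, first_match, character(1),
                    files = filenames, USE.NAMES = FALSE)
fComplete
```

The resulting vector could then be assigned as a regular character column (for example with DT[, fComplete := ...]) instead of a list column.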
In my experience with similar functions, the regex functions sometimes return a list, so you have to account for that inside the apply call; I usually work through one example manually first.
Also, in my experience apply on its own does not always return something that fits into a data.frame; sometimes I had to use lapply and/or unlist and data.frame to reshape it.
Here is an answer. I am not familiar with data.tables and I was having issues with the filenames being in a list, but with some transformations this works. I worked it out by looking at what apply was returning and adding the [1] to get the piece I needed:
DT <- data.frame(
id=c("432-12", "3242-12", "324-32", "322-34", "2324-34"),
fShort=c("1245", "453543", "45543", "45343", "13543"))
filenames <- list("3242-124342345.png", "432-124343.png", "135-13434.jpeg")
filenames1 <- unlist(filenames)
x<-apply(DT[1],1,function(x) grep(x,filenames1)[1])
DT$filename <- filenames1[x]

how to combine vectors with different length within a list in R?

I have a problem when combining the following vectors included in the list:
x <- list(as.numeric(c(1,4)),as.numeric(c(3,19,11)))
names (x[[1]]) <- c("species.A","species.C")
names (x[[2]]) <- c("species.A","species.B","species.C")
which gives the following list:
> x
[[1]]
species.A species.C
        1         4
[[2]]
species.A species.B species.C
        3        19        11
combining them using the do.call function:
y<- do.call(cbind,x)
gives:
> y
          [,1] [,2]
species.A    1    3
species.B    4   19
species.C    1   11
while I would like to obtain this:
          [,1] [,2]
species.A    1    3
species.B   NA   19
species.C    4   11
You need to give R a bit more help, by first preparing the particular vectors, all of the same length, that you eventually want to cbind together. Otherwise (as you've seen) R uses its usual recycling rules to fill out the matrix.
Try something like this:
spp <- paste("species", c("A", "B", "C"), sep=".")
x2 <- lapply(x, FUN=function(X) X[spp])
mat <- do.call("cbind", x2)
row.names(mat) <- spp
mat
          [,1] [,2]
species.A    1    3
species.B   NA   19
species.C    4   11
EDIT: As Brian mentions in comments, this could be made a bit more compact (but at the expense of some readability). Which one you use is just a matter of taste:
mat <- do.call("cbind", lapply(x, "[", spp))
row.names(mat) <- spp
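If the set of species is not known in advance, it could be derived from the names in the list rather than typed out. This is only a sketch of that idea under the assumption that every vector in x is named; it relies on the fact that indexing a named vector by a name it does not contain returns NA. The name aligned is introduced here for illustration.

```r
x <- list(c(species.A = 1, species.C = 4),
          c(species.A = 3, species.B = 19, species.C = 11))

# Collect every name that occurs anywhere in the list
spp <- sort(unique(unlist(lapply(x, names))))

# Indexing by name yields NA for names a vector does not have
aligned <- lapply(x, function(v) unname(v[spp]))

mat <- do.call(cbind, aligned)
rownames(mat) <- spp
mat
```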
It looks like you're actually trying to do a merge. As such, merge will work. You just have to tell it to merge on the names, and to keep all rows.
do.call(merge, c(x, by=0, all=TRUE)) # by=0 and by="row.names" are the same
(This will create a data frame rather than a matrix, but for most purposes that shouldn't be an issue.)
merge(x = x[[1]], y = x[[2]], by = "row.names", all.y = TRUE)

R divide 2 list objects which each contain the same size xts objects

I have 2 lists whose components are xts objects (co and oc). I want to produce another list object that has the result of co / oc.
> length(co)
[1] 1064
> length(oc)
[1] 1064
> tail(co[[1]])
[,1]
2011-12-22 0.3018297
2011-12-23 0.2987450
2011-12-27 0.2699710
2011-12-28 0.2706428
2011-12-29 0.2098897
2011-12-30 0.2089051
> tail(oc[[1]])
[,1]
2011-12-22 0.6426411
2011-12-23 0.6462834
2011-12-27 0.6466680
2011-12-28 0.6741420
2011-12-29 0.6781371
2011-12-30 0.6650130
> co / oc
Error in co/oc : non-numeric argument to binary operator
If I specify an index of the lists the operation succeeds as follows:
> tail(co[[1]] / oc[[1]])
[,1]
2011-12-22 0.4696707
2011-12-23 0.4622507
2011-12-27 0.4174800
2011-12-28 0.4014627
2011-12-29 0.3095093
2011-12-30 0.3141369
I want to do this without writing a loop to iterate through each component of the two lists (1064 components in total).
Any help would be greatly appreciated. Thank you.
Something like this may work:
mapply("/",co,oc,SIMPLIFY = FALSE)
although there are probably countless ways of doing this that are all mostly equivalent.
Here's a minimal example using some sample data from the xts package:
library(xts)  # provides as.xts() and the sample_matrix data set
data(sample_matrix)
sample.xts <- as.xts(sample_matrix, descr='my new xts object')
v1 <- list(a = sample.xts[,1],b = sample.xts[,2])
v2 <- list(a = sample.xts[,3],b = sample.xts[,4])
mapply("/",v1,v2,SIMPLIFY = FALSE)
Update:
We can now use Map, which is essentially mapply(..., SIMPLIFY = FALSE):
Map("/",co,oc)
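For readers without xts installed, the same element-wise pattern can be tried on plain numeric vectors. This is a minimal sketch with made-up stand-in data; the co and oc below are small illustrative lists, not the asker's objects.

```r
# Stand-in data: two lists of equal length with matching element sizes
co <- list(a = c(1, 2, 3), b = c(10, 20))
oc <- list(a = c(2, 4, 6), b = c(5, 40))

# Divide the i-th element of co by the i-th element of oc
ratio <- Map("/", co, oc)

# mapply with SIMPLIFY = FALSE gives the same list
ratio2 <- mapply("/", co, oc, SIMPLIFY = FALSE)
identical(ratio, ratio2)
```

Both calls keep the names of the first list, so ratio$a and ratio$b line up with the inputs.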

converting a matrix to a list

Suppose I have a matrix foo as follows:
foo <- cbind(c(1,2,3), c(15,16,17))
> foo
     [,1] [,2]
[1,]    1   15
[2,]    2   16
[3,]    3   17
I'd like to turn it into a list that looks like
[[1]]
[1] 1 15
[[2]]
[1] 2 16
[[3]]
[1] 3 17
You can do it as follows:
lapply(apply(foo, 1, function(x) list(c(x[1], x[2]))), function(y) unlist(y))
I'm interested in an alternative method that isn't as complicated. Note, if you just do apply(foo, 1, function(x) list(c(x[1], x[2]))), it returns a list within a list, which I'm hoping to avoid.
Here's a cleaner solution:
as.list(data.frame(t(foo)))
That takes advantage of the fact that a data frame is really just a list of equal length vectors (while a matrix is really a vector that is displayed with columns and rows...you can see this by calling foo[5], for instance).
You could also do this, although it isn't much of an improvement:
lapply(1:nrow(foo), function(i) foo[i,])
library(plyr)
alply(foo, 1)
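Another base R option, offered here as a sketch rather than as one of the original answers, is to split the matrix by its row indices. It builds on the same point made above: a matrix is a vector stored column by column, so row(foo) can serve as the grouping variable. The name rows is introduced for illustration.

```r
foo <- cbind(c(1, 2, 3), c(15, 16, 17))

# A matrix is stored column by column, so the 5th element is row 2, column 2
foo[5]

# row(foo) gives the row index of every cell; split groups the cells by it
rows <- unname(split(foo, row(foo)))
rows
```

unname drops the "1", "2", "3" labels that split adds, so the result prints with [[1]], [[2]], [[3]] as in the question.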

Generating a vector of the number of items in each list item

I have a list containing 98 items. But each item contains 0, 1, 2, 3, 4 or 5 character strings.
I know how to get the length of the list and in fact someone has asked the question before and got voted down for presumably asking such an easy question.
But I want a vector that is 98 elements long with each element being an integer from 0 to 5 telling me how many character strings there are in each list item.
I was expecting the following to work but it did not.
lapply(name.of.list,length())
From my question you will see that I do not really know the nomenclature of lists and items. Feel free to straighten me out.
Farrel, I do not exactly follow, as 'item' is not an R type. Maybe you have a list of length 98 where each element is a vector of character strings?
In that case, consider this:
R> fl <- list(A=c("un", "deux"), B=c("one"), C=c("eins", "zwei", "drei"))
R> lapply(fl, function(x) length(x))
$A
[1] 2
$B
[1] 1
$C
[1] 3
R> do.call(rbind, lapply(fl, function(x) length(x)))
  [,1]
A    2
B    1
C    3
R>
So there is your vector, with one entry per list element, telling you how many strings each list element has. Note the final do.call(rbind, someList), which is needed because lapply gives us a list back.
If, on the other hand, you want to count the length of all the strings at each list position, replace the simple length(x) with a new function counting the characters:
R> lapply(fl, function(x) { sapply(x, function(y) nchar(y)) } )
$A
  un deux
   2    4
$B
one
  3
$C
eins zwei drei
   4    4    4
R>
If that is not what you want, maybe you could mock up some example input data?
Edit: In response to your comments, what you wanted is probably:
R> do.call(rbind, lapply(fl, length))
  [,1]
A    2
B    1
C    3
R>
Note that I pass in length, the name of a function, and not length(), which would be a call to the function. Because that is easy to mix up, I almost always simply wrap an anonymous function around it, as in my first answer.
And yes, this can also be done with just sapply or even some of the **ply functions:
R> sapply(fl, length)
A B C
2 1 3
R> laply(fl, length)  # laply() comes from the plyr package
[1] 2 1 3
R>
All this seems very complicated - there is a function specifically doing what you were asking for:
lengths #note the plural "s"
Using Dirk's sample data:
fl <- list(A=c("un", "deux"), B=c("one"), C=c("eins", "zwei", "drei"))
lengths(fl)
will return a named integer vector:
A B C
2 1 3
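Since the question mentions items holding zero strings, here is a small sketch with an empty element added to the sample data (the empty B is an assumption made for illustration). It also shows a vapply equivalent, which passes the function length itself rather than a call.

```r
# Sample data with one empty element, to cover the "0 strings" case
fl <- list(A = c("un", "deux"), B = character(0), C = c("eins", "zwei", "drei"))

n1 <- lengths(fl)                       # named integer vector
n2 <- vapply(fl, length, integer(1))    # same result, with an explicit return type

n1
identical(n1, n2)
```

lengths has been part of base R since version 3.2.0; on older versions the vapply form is an alternative.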
The code below accepts a list of strings and returns a vector with the number of characters in each string:
x = c("vectors", "matrices", "arrays", "factors", "dataframes", "formulas",
"shingles", "datesandtimes", "connections", "lists")
xl = as.list(x)
# count the characters of the string passed in (not of the global x)
fnx = function(s){length(unlist(strsplit(s, "")))}
lv = sapply(xl, fnx)