Subset a list by the first match of part of the column name

I have a list (L) of data frames containing several forms of an AB variable (AB_1, AB_1_1, ...). Can I get a subset of the list that keeps only the first column matching the AB form in each data frame?
The list (L) and the desired result (R) are as follows:
L1 = data.frame(AB_1 = c(1:4) , AB_1_1 = c(1:4) , C1 = c(1:4))
L2 = data.frame(AB_1_1 = c(1:4) , AB_2 = c(1:4), D = c(1:4) )
L=list(L1,L2)
R1 = data.frame(AB_1 = c(1:4) , C1 = c(1:4))
R2 = data.frame(AB_1_1 = c(1:4) , D = c(1:4))
R=list(R1,R2)

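It is not the best answer, but it is a solution:
First rename every column that starts with AB to AB, then remove the duplicated column names in each data frame of the list (L). Note that the column that is kept ends up named AB rather than keeping its original name (AB_1 or AB_1_1).
for (i in seq_along(L)) {
  # rename every column whose name starts with AB to plain 'AB'
  colnames(L[[i]])[grepl('^AB', colnames(L[[i]]))] <- 'AB'
  # keep only the first column for each name
  L[[i]] <- L[[i]][ , !duplicated(colnames(L[[i]]))]
}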


How to add a line break or newline for a space in specific columns using kableExtra in R Markdown?

I'm using R Markdown to generate a PDF. I want to add a line break (newline) in a column of a data frame wherever there is a space.
C1    C2     ID
-----------------
22.6  a-b    a
23.5  ba-cd  b
24    c-d    c
25.3  d-e    d
I want output like:
C1     ID
------------
22.6   a
a-b
23.5   b
ba-cd
My code:
df$C1<- with(df, paste0(C1,'\n', C2))
df$C2<-NULL
kbl(df,booktabs = T,longtable = T,align = c("c", "c", "c", "c")) %>%
kable_styling(latex_options = c("repeat_header"),bootstrap_options = "bordered",font_size = 7,full_width = F)%>%
column_spec(1, width = "8cm")%>%
column_spec(2:4, width = "2.25cm")%>%
row_spec(0, bold = T, color = "white", background = "#008752")
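Use the <br/> tag for the line break and pass escape = F so that kbl() does not escape it. Note that <br/> is an HTML tag; for PDF output, kableExtra's linebreak() function (applied to text containing "\n", again with escape = F) is intended for this case.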
library(knitr)
library(kableExtra)
df$C1<- with(df, paste0(C1,'<br/>', C2))
df$C2<-NULL
kbl(df, booktabs = T,longtable = T,align = c("c", "c"), escape = F) %>%
kable_styling(latex_options = c("repeat_header"),
bootstrap_options = "bordered",font_size = 7,full_width = F)%>%
column_spec(1, width = "8cm")

For loop for a list to prevent vertical display

ini_list = "[('G 02', 'UV', '2.73')]"
res = ini_list.strip('[]')
print(res)
('G 02', 'UV', '2.73')
result = res.strip('()')
print(result)
'G 02', 'UV', '2.73'
I have a list: 'G 02', 'UV', '2.73' and I would like to assign variables to this list
so that the outcome is as follows:
Element = G 02
Reason = UV
Time = 2.73
I have numerous lists containing these parameters, which I would like to use later for plotting, so I want to extract each parameter from the list and associate it with a specific variable.
I tried to do it with:
Results = res
for index, Parameters in enumerate(Results):
    element = Parameters[0]
    print(element)
hoping that I could extract each item from the list and assign it to a variable as described above. However, when I print element, the list prints vertically, one character per line, and I also cannot extract individual indexes.
'
G
0
2
'
,
'
U
V
'
,
'
2
.
7
3
'
How do I assign each parameter to a variable as described above, so that it prints like this:
element = G 02
reason = UV
time = 2.73
If ini_list is a string, such as ini_list = "[('G 02', 'UV', '2.73')]", then iterating over it yields single characters, which is why the output is vertical. You need the strip and split methods to break it into its parts, as follows:
ini_list = "[('G 02', 'UV', '2.73')]"
res = ini_list.strip('[]')
result = res.strip('()')
result1 = result.split(',')
result1=[x.strip(" ") for x in result1]
result1=[x.strip("''") for x in result1]
element = result1[0]
print("element:",element)
reason =result1[1]
print("reason:",reason)
time=result1[2]
print("time:",time)
output:
element: G 02
reason: UV
time: 2.73
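As an alternative to stripping and splitting by hand, the string can be parsed as a Python literal. This is a minimal sketch, assuming every input string is a valid Python literal of the form shown in the question (a list of 3-tuples of strings); the name records is introduced here only for illustration.

```python
import ast

ini_list = "[('G 02', 'UV', '2.73')]"

# ast.literal_eval parses a string that contains a Python literal
# (here: a list holding one tuple) without executing arbitrary code.
records = ast.literal_eval(ini_list)

# Each tuple unpacks directly into the three named variables.
for element, reason, time in records:
    print("element:", element)
    print("reason:", reason)
    print("time:", time)
```

Because the values stay strings, float(time) would still be needed before plotting the time as a number.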

I want a line graph with mean and SD

I want a graph like the one in the picture, but I don't understand how to make it. I have several values for the same date; how do I draw them as a line?
library(tidyverse)
library(data.table)
fish <- read_csv(file = "FishF.csv", col_types = cols(DD= col_factor()))
fish1<-fish[is.na(fish)] <- 0
fish
View(fish)
fd<-fish[,c(1,2,3)]
newfish <- melt(setDT(fish), id.vars = "DD",
measure.vars = patterns("avg","SE"),
value.name = c("avg","SE"))[ , variable := lvls_revalue(variable, c("C3", "IgM", "IgT", "KHV", "Lyso"))][]
n<- melt(setDT(fd), id.vars = "DD",
measure.vars = patterns("avg","SE"),
value.name = c("avg","SE"))[ , variable := lvls_revalue(variable, c("C3"))][]

Dropdown query has no error, but returns no results

I'm hoping to allow others to sort through the data using some dropdowns, but they shouldn't have to use all of them if they don't need to.
My query function:
=QUERY(CATALOG!A2:I259,"SELECT * WHERE 1=1 "&IF(A2="Any",""," AND B = '"&A2&"' ")&IF(B2="ANY",""," AND C = '"&B2&"' ")&IF(C2="Any",""," AND D = '"&C2&"' ")&IF(D2="Any",""," AND E = '"&D2&"' ")&IF(E2="Any",""," AND F = '"&E2&"' ")&IF(F2="Any",""," AND G = '"&F2&"' ")&IF(G2="Any",""," AND H = '"&G2&"' "),1)
Whenever I ran this, there wasn't an error but the query didn't give any items.
I initially only tested one of the dropdowns but received nothing. I plugged in a known product into the inputs and still received nothing.
Link to copy of the spreadsheet with dataset and function
https://docs.google.com/spreadsheets/d/1s3tOm_6g8n66HT9md3EAXY7XwbkpmPggYhxdF-zv5ok/edit?usp=sharing
Some of the search parameters seem to be numbers. In that case, do not wrap them in single quotes (that turns them into strings, which will not match numeric columns). See if this works:
=QUERY(CATALOG!A2:I259,"SELECT * WHERE 1=1 "&IF(A2="Any",""," AND B = "&A2&" ")&IF(B2="ANY",""," AND C = '"&B2&"' ")&IF(C2="Any",""," AND D = '"&C2&"' ")&IF(D2="Any",""," AND E = '"&D2&"' ")&IF(E2="Any",""," AND F = "&E2&" ")&IF(F2="Any",""," AND G = "&F2&" ")&IF(G2="Any",""," AND H = '"&G2&"' "),1)
Try a shorter version:
=QUERY(CATALOG!A2:I259,
"where 1=1 "&
IF(A2="Any",," and B = "&A2)&
IF(B2="Any",," and C = '"&B2&"'")&
IF(C2="Any",," and D = '"&C2&"'")&
IF(D2="Any",," and E = '"&D2&"'")&
IF(E2="Any",," and F = "&E2)&
IF(F2="Any",," and G = "&F2)&
IF(G2="Any",," and H = '"&G2&"'"), 1)

How to include variables values into regular expressions in R

I have 5 files which contain metabolites (details of different bacteria models). I'm writing a function to append a specified number of files. File names look like the following.
[1] "01_iAPECO1_1312_metabolites.csv" "02_iB21_1397_metabolites.csv"
[3] "03_iBWG_1329_metabolites.csv" "04_ic_1306_metabolites.csv"
[5] "05_iE2348C_1286_metabolites.csv"
Below is my function.
start = 3 # defines the starting position of the range
end = 5 # defines the ending position of the range
type = "metabolites" # two types of files - for metabolites and reactions
files <- NULL
if (type == "metabolites"){
files <- list.files(pattern = "*metabolites\\.csv$")
}else if(type == "reactions"){
files <- list.files(pattern = "*reactions\\.csv$")
}
#reading each file within the range and append them to create one file
for (i in start:end){
temp_df <- data.frame(ModelName = character(), Object = character(),stringsAsFactors = F)
#reading the current file
temp = rbind(one,temp_df)
}
#writing the appended file
write.csv(temp,"appended.csv",row.names = F,quote = F)
temp_df <- NULL
For example, if I specify the start=3 and end = 5, the code is supposed to read files 03, 04 and 05 and append them. Note: the two integers at the beginning of the file names are used to get the file referenced by the range. I'm unable to select the required file within the for loop using a regular expression. When I specify the number it picks up but I'm looking for a generalized version with i in it.
currentFile = grep("01.+",files)
Any help is appreciated.
For the test data shown below this returns a vector containing the file names of the files that start with 02, 03, 04 and 05 and end with "reactions.csv"
# create some test files
for(i in 1:5) cat(file = sprintf("%02djunkreactions.csv", i))
# test input
start <- 2
end <- 5
type <- "reactions"
list.files(pattern = paste(sprintf("^%02d.*%s[.]csv$", start:end, type), collapse = "|"))
giving:
[1] "02junkreactions.csv" "03junkreactions.csv" "04junkreactions.csv"
[4] "05junkreactions.csv"
Note: If start and end are both always a single digit, then a simplification is possible:
list.files(pattern = sprintf("^0[%d-%d].*%s[.]csv$", start, end, type))
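The pattern-building idea above (one anchored alternative per number in the range, joined with "|") can be checked outside R as well. The following is only an illustrative Python sketch of the same logic, not part of the R solution; it filters the file names listed in the question, and the names file_type and selected are introduced here for illustration.

```python
import re

files = [
    "01_iAPECO1_1312_metabolites.csv",
    "02_iB21_1397_metabolites.csv",
    "03_iBWG_1329_metabolites.csv",
    "04_ic_1306_metabolites.csv",
    "05_iE2348C_1286_metabolites.csv",
]
start, end, file_type = 3, 5, "metabolites"

# Build one anchored alternative per number, e.g. ^03.*metabolites\.csv$,
# and join the alternatives with "|" into a single regular expression.
pattern = "|".join(
    rf"^{i:02d}.*{re.escape(file_type)}\.csv$" for i in range(start, end + 1)
)

# Keep the file names that match any of the alternatives.
selected = [f for f in files if re.search(pattern, f)]
print(selected)
```

Zero-padding the number with a two-digit format is what keeps the range 3 to 5 from matching unrelated digits elsewhere in the file name.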
You can do this with a cross-join.
library(dplyr)
library(stringi)
start = 3
end = 5
type = "metabolites"
all_files = data_frame(file = list.files() )
desired_files = data_frame(
number = start:end,
regex = sprintf("^%02.f.*%s", number, type) )
all_files %>%
merge(desired_files) %>%
filter(stri_detect_regex(file, regex)) %>%
group_by(number) %>%
do(read.csv(.$file) ) %>%
write.csv("appended.csv", row.names = F, quote = F)
Are you looking for something like this?
files <- c("01_iAPECO1_1312_metabolites.csv", "02_iB21_1397_metabolites.csv","03_iBWG_1329_metabolites.csv", "04_ic_1306_metabolites.csv","05_iE2348C_1286_metabolites.csv")
for(i in 2:4) print(grep(sprintf("^(%02d){1}_",i),files,value=T))