Merging column headers in a LaTeX table created by kable in R Markdown

I have created a latex table as below using kable in Rmarkdown:
---
output: pdf_document
header-includes:
- \usepackage{xcolor}
---
```{r, message=FALSE, warning=FALSE, echo=FALSE}
library(kableExtra)
library(tidyr)
library(dplyr)
data(iris)
iris %>%
as_tibble %>%
gather(.,key = variable,value = value,-Species) %>%
group_by(Species,variable) %>%
summarise(value=mean(value)) %>%
ungroup %>%
spread(.,key = variable,value = value) %>%
mutate(`Percentage Change`=`Petal.Length`/`Petal.Width`*100) %>%
kable(.,format='latex',
align='c',linesep='',
booktabs=TRUE,escape=FALSE) %>%
add_header_above(.,c(' '=1,'Parts'=4,' '=1),
escape = FALSE) %>%
kable_styling(latex_options = c('striped','HOLD_position','scale_down'))
```
I would like the column headers "Species" and "Percentage Change" to be merged with the empty cells above them, so that Species is vertically centered across the two header rows and Percentage Change (Petal Length/ Petal Width) spans both rows instead of having an empty cell above it. The other columns should not gain an empty row below their headers.
Can this be done in kable, preferably? A LaTeX "hack" is also welcome.
Thanks!

I think the LaTeX 'hack' is the cleaner solution here. It can also be done in kable, but that would require changing the data frame (converting the column names to a row) so that collapse_rows can be used. Anyway, here is the LaTeX way:
The code in your question does not produce the column name shown in the PDF snapshot, so I first edited the code to get that table:
---
output:
pdf_document:
keep_tex: true
header-includes:
- \usepackage{xcolor}
---
```{r, message=FALSE, warning=FALSE, echo=FALSE}
library(kableExtra)
library(tidyr)
library(dplyr)
data(iris)
iris %>%
as_tibble %>%
gather(.,key = variable,value = value,-Species) %>%
group_by(Species,variable) %>%
summarise(value=mean(value)) %>%
ungroup %>%
spread(.,key = variable,value = value) %>%
mutate('Percentage Change\n(Petal length/ Petal width)'=`Petal.Length`/`Petal.Width`*100) %>%
kable(format='latex',align='c',linesep='',booktabs=TRUE,escape=FALSE,
col.names = linebreak(colnames(.),align = 'c')) %>%
add_header_above(.,c(' '=1,'Parts'=4,' '=1),escape = FALSE) %>%
collapse_rows(columns = c(1,6),valign = 'middle')%>%
kable_styling(latex_options = c('striped','HOLD_position','scale_down'))
```
This gives this:
Note two things in the code above:
keep_tex: true retains the generated .tex file so that it can be edited.
linebreak() is used so that the long name of the last column is not set on a single line.
Now we make small changes to the LaTeX output. In the code below, each commented-out line is the original code generated by kable; it is replaced by the line directly below it, marked "replaced line".
\begin{table}[H]
\centering
\resizebox{\linewidth}{!}{
\begin{tabular}{cccccc}
\toprule
% \multicolumn{1}{c}{ } & \multicolumn{4}{c}{Parts} & \multicolumn{1}{c}{ } \\
\multirow{2}{*}{Species} & \multicolumn{4}{c}{Parts} & \multirow{2}{*}{\makecell[c]{Percentage Change\\(Petal length/ Petal width)}} \\ % replaced line
\cmidrule(l{3pt}r{3pt}){2-5}
% Species & Petal.Length & Petal.Width & Sepal.Length & Sepal.Width & \makecell[c]{Percentage Change\\(Petal length/ Petal width)}\\
& Petal.Length & Petal.Width & Sepal.Length & Sepal.Width &\\ % replaced line
\midrule
\cellcolor{gray!6}{setosa} & \cellcolor{gray!6}{1.462} & \cellcolor{gray!6}{0.246} & \cellcolor{gray!6}{5.006} & \cellcolor{gray!6}{3.428} & \cellcolor{gray!6}{594.3089}\\
\cmidrule{1-6}
versicolor & 4.260 & 1.326 & 5.936 & 2.770 & 321.2670\\
\cmidrule{1-6}
\cellcolor{gray!6}{virginica} & \cellcolor{gray!6}{5.552} & \cellcolor{gray!6}{2.026} & \cellcolor{gray!6}{6.588} & \cellcolor{gray!6}{2.974} & \cellcolor{gray!6}{274.0375}\\
\bottomrule
\end{tabular}}
\end{table}
This gives the following output:
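Because knitting again regenerates the .tex file, the two hand edits have to be reapplied each time. As a rough sketch (not part of the original answer), the same replacement could be scripted in base R. The character vector below is a stand-in for the lines of the kept .tex file, and the file name `report.tex` in the comments is hypothetical.

```r
# Stand-in for readLines("report.tex"); the file name is hypothetical.
tex <- c(
  "\\toprule",
  "\\multicolumn{1}{c}{ } & \\multicolumn{4}{c}{Parts} & \\multicolumn{1}{c}{ } \\\\",
  "\\cmidrule(l{3pt}r{3pt}){2-5}",
  "Species & Petal.Length & Petal.Width & Sepal.Length & Sepal.Width & \\makecell[c]{Percentage Change\\\\(Petal length/ Petal width)}\\\\",
  "\\midrule"
)

# Find the two header rows by their literal prefixes (no regular expressions).
top    <- which(startsWith(tex, "\\multicolumn{1}{c}{ } &"))
bottom <- which(startsWith(tex, "Species &"))

# Replace them with the \multirow versions shown in the answer.
tex[top] <- paste0(
  "\\multirow{2}{*}{Species} & \\multicolumn{4}{c}{Parts} & ",
  "\\multirow{2}{*}{\\makecell[c]{Percentage Change\\\\(Petal length/ Petal width)}} \\\\"
)
tex[bottom] <- " & Petal.Length & Petal.Width & Sepal.Length & Sepal.Width &\\\\"

# In practice: writeLines(tex, "report.tex"), then compile the .tex file.
writeLines(tex)
```

Matching on fixed prefixes keeps the script simple, but it assumes kable keeps emitting the header rows in exactly this form.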


Functions dplyr with rlang::last_error() in purrr::map loop in r

I'm using a function to calculate the length of linestring per cell by ID and store in a list, convert each element of the list into a RasterLayer and turn that list into a RasterStack, average all layers and get a single raster.
#function
# build_length_raster <- function(one_df) {
intersect_list <- by(
one_df ,
one_df$sub_id,
function(subid_df) sf::st_intersection(grid2, subid_df) %>%
dplyr::mutate(length = as.numeric(sf::st_length(.))) %>%
sf::st_drop_geometry()
)
list_length_grid <- purrr::map(intersect_list, function(x)
x %>% dplyr::left_join(x=grid2, by="cell", copy=T) %>%
dplyr::mutate(length=length) %>%
dplyr::mutate_if(is.numeric,coalesce,0)
)
list_length_raster <- purrr::map(list_length_grid, function(x)
raster::rasterize(x, r, field="length", na.rm=F, background=0)
)
list_length_raster2 <- unlist(list_length_raster, recursive=F)
raster_stack <- raster::stack(list_length_raster2)
raster_mean <- raster::stackApply(
raster_stack,
indices = rep(1,nlayers(raster_stack)),
fun = "mean", na.rm = TRUE)
#}
The function includes a step in which, so that the grid resulting from st_intersection() has the same number of cells as the original grid, I use left_join() by the "cell" column. Then I use mutate() to replace the NAs with 0. When I run the function's steps on a single data frame from the list, everything works, but when I pass the function to map() to process the whole list, I get this error, which seems to come from the dplyr functions:
final_list <- purrr::map(mylist, build_length_raster)
> rlang::last_error()
<error/rlang_error>
Join columns must be present in data.
x Problem with `cell`.
Backtrace:
1. purrr::map(mylist, build_length_raster)
15. dplyr:::left_join.data.frame(., x = grid, by = "cell", copy = T)
16. dplyr:::join_mutate(...)
17. dplyr:::join_cols(...)
18. dplyr:::standardise_join_by(by, x_names = x_names, y_names = y_names)
19. dplyr:::check_join_vars(by$y, y_names)
Run `rlang::last_trace()` to see the full context.
Is there a way to solve this problem?
MYDATA example
library(tidyverse)
library(sf)
library(purrr)
library(raster)
#data example
id <- c("844", "844", "844", "844", "844","844", "844", "844", "844", "844",
"844", "844", "845", "845", "845", "845", "845","845", "845", "845",
"845","845", "845", "845")
sub_id <- c("2017_844_1", "2017_844_1", "2017_844_1", "2017_844_1", "2017_844_2",
"2017_844_2", "2017_844_2", "2017_844_2", "2017_844_3", "2017_844_3",
"2017_844_3", "2017_844_3", "2017_845_1", "2017_845_1", "2017_845_1",
"2017_845_1", "2017_845_2","2017_845_2", "2017_845_2", "2017_845_2",
"2017_845_3","2017_845_3", "2017_845_3", "2017_845_3")
lat <- c(-30.6456, -29.5648, -27.6667, -31.5587, -30.6934, -29.3147, -23.0538,
-26.5877, -26.6923, -23.40865, -23.1143, -23.28331, -31.6456, -24.5648,
-27.6867, -31.4587, -30.6784, -28.3447, -23.0466, -27.5877, -26.8524,
-23.8855, -24.1143, -23.5874)
long <- c(-50.4879, -49.8715, -51.8716, -50.4456, -50.9842, -51.9787, -41.2343,
-40.2859, -40.19599, -41.64302, -41.58042, -41.55057, -50.4576, -48.8715,
-51.4566, -51.4456, -50.4477, -50.9937, -41.4789, -41.3859, -40.2536,
-41.6502, -40.5442, -41.4057)
df <- tibble(id = as.factor(id), sub_id = as.factor(sub_id), lat, long)
#converting to sf
df.sf <- df %>%
sf::st_as_sf(coords = c("long", "lat"), crs = 4326)
#creating grid
xy <- sf::st_coordinates(df.sf)
grid = sf::st_make_grid(sf::st_bbox(df.sf),
cellsize = .1, square = FALSE) %>%
sf::st_as_sf()
#creating raster
r <- raster::raster(grid, res=0.1)
#return grid because raster function changes number of cells
grid2 <- rasterToPolygons(r, na.rm=F) %>%
st_as_sf() %>% mutate(cell=1:ncell(r))
#creating linestring to each sub_id
df.line <- df.sf %>%
dplyr::group_by(sub_id, id) %>%
dplyr::summarize() %>%
sf::st_cast("LINESTRING")
#creating ID list
mylist<- split(df.line, df.line$id)
#separating one dataframe of list to test function
one_df <- df.line[df.line$id=="844",]
one_df$id <- droplevels(one_df$id)
one_df$sub_id <- droplevels(one_df$sub_id)
The specific error occurs because intersect_list contains empty items, which have no columns to join by. If you modified the map call to use only the non-empty items of intersect_list, you would not get that error.
As you noted in the comments, removing the empty list entries with keep(intersect_list, ~ !is.null(.)) before mapping left_join onto the list items will fix the error.
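To see where such empty entries can come from, here is a small base-R sketch on toy data (the data frame and its values are made up for illustration). It assumes the empty items arise from unused factor levels in sub_id, which would also explain why the single test data frame, on which droplevels() was called, did not fail.

```r
# Toy data: the factor has an unused level "c", as can happen when a
# data frame is subset or split without calling droplevels().
df <- data.frame(
  sub_id = factor(c("a", "a", "b"), levels = c("a", "b", "c")),
  value  = c(1, 2, 3)
)

# by() returns one element per factor level; the unused level yields NULL.
res <- by(df, df$sub_id, function(d) d)
is.null(res[[3]])

# Copy the results into a plain list, then drop the NULL entries.
# On a plain list, purrr::keep(x, ~ !is.null(.)) does the same job.
plain <- lapply(seq_along(res), function(i) res[[i]])
kept  <- Filter(Negate(is.null), plain)
length(kept)
```

If unused levels are indeed the cause, calling droplevels() on sub_id at the start of the function might avoid the empty entries in the first place.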
However, I don't think this is the most elegant way to solve this problem. I might misunderstand what the goal is, but if it's to produce a raster from the total length of lines within each grid cell, I think a simpler approach without using purrr might work.
This is not exactly the same as your product, but I'm keeping it simpler for now to illustrate an alternative approach. Here is the sum of the lengths in each cell as a stars object (similar to a raster, but it plays better with the tidyverse and sf).
I'm starting off from your objects one_df and grid:
# Turn multiple lines into single MULTILINESTRING:
one_df %>%
st_union() ->
union_df
# Intersection of each grid cell with the MULTILINESTRING geometry:
grid %>%
st_intersection(union_df) ->
grid_lines
# Get lengths:
grid_lines %>%
mutate(length = st_length(x)) %>%
st_drop_geometry() ->
grid_lengths
# Join the calculated lengths back with the spatial grid,
# most of which will have NA for length
grid %>%
left_join(grid_lengths, by = "cell") ->
grid_with_lengths
# Rasterize the length field of the grid
grid_with_lengths %>%
dplyr::select(length) %>%
stars::st_rasterize() ->
length_stars
length_stars %>% mapview::mapview()

How can I get a table to print under a picture using a loop in a .rmd with word_document?

I am trying to create an .rmd file that takes all of the pictures from a field day, together with the notes that were taken, and creates a report. I am able to get the pictures to plot, but no matter what I try, the table with the notes does not print. Below is the loop I am using:
for(i in 1:nrow(subset_Inventory_data)) {
singlept <- subset_Inventory_data[i,]
picture <- pictureLookup[singlept$GlobalID == pictureLookup$REL_GLOBAL,]
#PRINT PICTURE
plot(image_read(paste(baseURL,picture$UID,sep = "")) %>%
# image_resize("400x400") %>%
image_rotate(degrees = 90)
)
#creating table underneath picture
Categories <- c("Latitude", "Longitude", "Road Width", "Conditon", "Lock Present","Additional Notes")
sum_table <- data.frame(Category = character(),
Information = character(),
stringsAsFactors = FALSE)
sum_table <- rbind(sum_table,Categories,
stringsAsFactors = FALSE)
colnames(sum_table) <- Categories
sum_table$Latitude <- sprintf("%f",singlept$LAT)
sum_table$Longitude <-sprintf("%f",singlept$LONG)
sum_table$`Road Width` <- paste(singlept$Gate_Width,"feet")
sum_table$Conditon <- singlept$Condition
sum_table$`Lock Present` <- singlept$GlobalID
sum_table$`Additional Notes` <- singlept$General_Notes
#TRIED FLEXTABLE
ft <- flextable(sum_table)
ft <- fontsize(ft, size = 12)
ft <- autofit(ft)
print(ft)
#TRIED KABLE
print(kable(sum_table,"latex"))
}

Table in Bookdown/Huskydown with several features (Citation, Caption, URL, PNG Figure, ...)

I would like to include a table in an R markdown document (Bookdown/Huskydown) which should meet the following requirements. Ideally, the table works with several output formats, e.g. LaTex/PDF and HTML.
Requirements:
Table width: fixed
Cell width: fixed
Vertical alignment: cell content aligned to the top
Text formatting: like bold or italics (best would be if md formatting supported, such that code is output agnostic) and allow for line breaks in longer texts
Citations: should be rendered
URLs: as clickable links both in HTML and LaTex/PDF
Figures: include
figures stored locally, either in
a markdown way ![](Rlogo.png) or
a knitr way knitr::include_graphics("Rlogo.png")
figures taken straight from the web
Caption for the table
Captions text formatting: caption should also allow for text formatting
Footnote: include footnotes in the table
Table numbering: tables should be numbered
Referencing the table: in the document is needed
Notes regarding different approaches
Fixed cell width: in markdown the number of "-"s in table header determine cell width
Linebreaks:
LaTex:\\linebreak
All others: <br/>
Referencing
LaTeX: add \label{foo} => \ref{foo} (\@ref(foo))
Markdown: add Table: (\#tab:md-table) Caption => \@ref(tab:md-table)
Comments on different approaches
Markdown: easy coding of tables in markdown
Kable & kableExtra: Versatile R markdown coding of the table, but vertical text alignment obscure and figures are not included in PDF
Pander: achieves the most, but no vertical alignment and footnotes
Huxtable: most promising, but figures are not included in PDF
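The output-dependent line break noted above could be wrapped in a small helper so that the table text itself stays format-agnostic. This is only a sketch: `table_break()` is a hypothetical name, and it relies on `knitr::is_latex_output()`, which returns FALSE when the code is run outside of knitting.

```r
library(knitr)

# Hypothetical helper: choose the line-break markup for the current output format.
table_break <- function() {
  if (knitr::is_latex_output()) "\\linebreak " else "<br/>"
}

# Build a cell text with a format-appropriate line break.
cell <- paste0("**Formatted** string", table_break(), "-- _Everyone_ needs H^2^O")
cell
```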
This is less an answer than a collection of MWEs for the table shown above:
```{r}
# create some random text
library(stringi)
some_text <- stri_rand_lipsum(1)
some_text <- substr(some_text, 1, 75)
# create dataframe with some stuff
figpath <- "figure/"
df <- data.frame(
Citation = c("@R-base", "@R-bookdown"),
Textfield = c("**Formatted** string<br/> -- _Everyone_ needs H^2^O", some_text),
URL = c("[R-url](https://www.r-project.org/)", "[bookdown](https://bookdown.org/)"),
fig_local_md = c(
paste0("![](", figpath, "Rlogo.png){ width=10% height=5% }"),
paste0("![](", figpath, "bookdownlogo.png){ height='36px' width='36px' }")
)#,
# not working:
# fig_local_knitr = c("knitr::include_graphics('figure/Rlogo.png')", "knitr::include_graphics('figure/bookdownlogo.png')")
)
# only include if output format is HTML, else pander throws error
if (knitr::is_html_output()) {
df$fig_web <- c("![](https://www.picgifs.com/glitter-gifs/a/arrows/picgifs-arrows-110130.gif)")
output_format <- "html"
}
if (knitr::is_latex_output()) {
output_format <- "latex"
}
```
markdown
Table: markdown table: markdown styling works in HTML (*italics*, **bold**), LaTex styling in PDF (\\textbf{bold})
| Image | Description |
| :--------------------------------------------------------- | :----------------------------------------------------------- |
| ![](figure/Rlogo.png){ width=10% height=5% } | **Image description** [@R-base] <br/>Lorem ipsum dolor sit amet, ... [R-url](https://www.r-project.org/) |
| ![](figure/bookdownlogo.png){ height='36px' width='36px' } | **Image description** [@R-bookdown] <br/>Lorem ipsum dolor sit amet, ... [bookdown](https://bookdown.org/) |
kable table
```{r kable-table, echo=FALSE, out.width='90%', fig.align = "center", results='asis'}
library(knitr)
kable(df,
caption = "kable table: markdown styling works in HTML (*italics*, **bold**), LaTex styling in PDF (\\textbf{bold})",
caption.short = "md styling works in HTML (*italics*, **bold**), LaTex styling in PDF (\\textbf{bold})"
)
```
kableExtra table
```{r kableExtra-table, echo=FALSE, out.width='90%', fig.align = "center", results='asis'}
library(kableExtra)
# http://haozhu233.github.io/kableExtra/awesome_table_in_pdf.pdf
kable(
df,
caption = "kableExtra table: markdown styling works in HTML (*italics*, **bold**), LaTex styling in PDF (\\textbf{bold})",
output_format, booktabs = T, # output_format = latex, html (specify above)
# align = "l",
valign = "top"
) %>%
kable_styling(full_width = F,
latex_options = c(#"striped",
"hold_position", # stop table floating
"repeat_header") # for long tables
) %>%
column_spec(1, bold = T, border_right = T, width = "30em") %>%
column_spec(2, width = "50em") %>%
column_spec(3, width = "5em") %>%
column_spec(4, width = "10em") %>%
column_spec(5, width = "10em") %>%
footnote(general = "Here is a general comments of the table. ",
number = c("Footnote 1; ", "Footnote 2; "),
alphabet = c("Footnote A; ", "Footnote B; "),
symbol = c("Footnote Symbol 1; ", "Footnote Symbol 2"),
general_title = "General: ", number_title = "Type I: ",
alphabet_title = "Type II: ", symbol_title = "Type III: ",
footnote_as_chunk = T, title_format = c("italic", "underline")
)
```
pander table
```{r pander-table, echo=FALSE, out.width='90%', fig.align = "center", results='asis'}
library(pander)
# https://cran.r-project.org/web/packages/pander/vignettes/pandoc_table.html
pander(
df,
caption = "pander table: markdown styling works in HTML and PDF (*italics*, **bold**), LaTex styling in PDF (\\textbf{bold})",
# style = "multiline", # simple
split.table = Inf, # default = 80 characters; Inf = turn off table splitting
split.cells = c(15, 50, 5, 5, 5), # default = 30
# split.cells = c("25%", "50%", "5%", "10%", "10%"), # no difference
justify = "left"
)
```
huxtable table
```{r huxtable-table, echo=FALSE, out.width='90%', fig.align = "center", results='asis'}
library(dplyr)
library(huxtable)
# https://hughjonesd.github.io/huxtable/
hux <- as_hux(df) %>%
# huxtable::add_rownames(colname = '') %>%
huxtable::add_colnames() %>%
set_top_border(1, everywhere, 1) %>%
set_bottom_border(1, everywhere, 1) %>%
set_bottom_border(final(), everywhere, 1) %>%
set_bold(1, everywhere, TRUE) %>% # bold headlines
set_italic(-1, 1, TRUE) %>% # italics in first column (except the first row)
set_valign("top") %>%
set_width(1) %>%
set_col_width(c(0.10,0.45,0.05,0.10,0.10)) %>%
set_wrap(TRUE) %>%
set_position('left') %>% # fix table alignment (default is center)
add_footnote("Sample Footnote") %>%
set_font_size(4)
table_caption <- 'huxtable table: markdown styling works in HTML (*italics*, **bold**), LaTex styling in PDF (\\textbf{bold})'
# Print table conditional on output type
if (knitr::is_html_output()) {
caption(hux) <- paste0('(#tab:huxtable-table-explicit) ', table_caption)
print_html(hux) # output table html friendly (requires in chunk options "results='asis'")
}
if (knitr::is_latex_output()) {
caption(hux) <- paste0('(\\#tab:huxtable-table-explicit) ', table_caption)
hux # if using chunk option "results='asis'" simply output the table with "hux", i.e. do not use print_latex(hux)
}
```
Referencing the tables
works differently for different table types
Adding a short caption for the LoT
Finally adding a short caption for the table of figures is not really working as desired
(ref:huxtable-table-caption) huxtable-table caption
(ref:huxtable-table-scaption) huxtable-table short caption
```{r huxtable-table, echo=FALSE, out.width='90%', fig.align = "center", fig.cap='(ref:huxtable-table-caption)', fig.scap='(ref:huxtable-table-scaption)', results='asis'}
...
```

How can I split a table so that it appears side by side in R markdown?

I'm writing a document with R Markdown and I'd like to include a table. The problem is that this table has only two columns and takes up a full page, which does not look good. So my question is: is there a way to split this table in two and place the two "sub-tables" side by side with a single caption?
I use the kable command and I tried this solution (How to split kable over multiple columns?) but I could not do the cbind() command.
Here's my code to create the table :
---
title:
author:
date: "`r format(Sys.time(), '%d %B, %Y')`"
output: pdf_document
indent: true
header-includes:
- \usepackage{indentfirst}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r, echo = FALSE}
kable(aerop2, format = "markdown")
```
where aerop2 is my data frame with a list of country names in column 1 and the number of airports in each of these countries in column 2.
I have a long two-column table which is a waste of space. I would like to split this table in two sub-tables and put these sub-tables side by side with a caption that includes both of them.
This doesn't give a lot of flexibility in spacing, but here's one way to do it. I'm using the mtcars dataset as an example because I don't have aerop2.
---
output: pdf_document
indent: true
header-includes:
- \usepackage{indentfirst}
- \usepackage{booktabs}
---
```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(echo = TRUE)
```
The data are in Table \ref{tab:tables}, which will float to the top of the page.
```{r echo = FALSE}
rows <- seq_len(nrow(mtcars) %/% 2)
kable(list(mtcars[rows,1:2],
matrix(numeric(), nrow=0, ncol=1),
mtcars[-rows, 1:2]),
caption = "This is the caption.",
label = "tables", format = "latex", booktabs = TRUE)
```
This gives:
Note that without that zero-row matrix, the two parts are closer together. To increase the spacing more, put extra copies of the zero-row matrix into the list.
The solution offered by 'user2554330' was very useful.
As I needed to split in more columns and eventually more sections, I further developed the idea.
I also needed to have the tables after the text, not floating to the top. I found a way using kableExtra::kable_styling(latex_options = "hold_position").
I am writing here to share the development and to ask minor questions.
1 - Why did you add the line - \usepackage{indentfirst}?
2 - What is the effect of label = "tables" as kable() input?
(The questions are related to LaTeX. I probably know too little to understand the explanation in the kable() documentation: "label - The table reference label"!)
---
title: "Test-split.print"
header-includes:
- \usepackage{booktabs}
output:
pdf_document: default
html_document:
df_print: paged
---
```{r setup, include=FALSE}
suppressPackageStartupMessages(library(tidyverse))
library(knitr)
library(kableExtra)
split.print <- function(x, cols = 2, sects = 1, spaces = 1, caption = "", label = ""){
if (cols < 1) stop("cols must be at least 1!")
if (sects < 1) stop("sects must be at least 1!")
rims <- nrow(x) %% sects
nris <- (rep(nrow(x) %/% sects, sects) + c(rep(1, rims), rep(0, sects-rims))) %>%
cumsum() %>%
c(0, .)
for(s in 1:sects){
xs <- x[(nris[s]+1):nris[s+1], ]
rimc <- nrow(xs) %% cols
nric <- (rep(nrow(xs) %/% cols, cols) + c(rep(1, rimc), rep(0, cols-rimc))) %>%
cumsum() %>%
c(0, .)
lst <- NULL
spc <- NULL
for(sp in 1:spaces) spc <- c(spc, list(matrix(numeric(), nrow=0, ncol=1)))
for(c in 1:cols){
lst <- c(lst, list(xs[(nric[c]+1):nric[c+1], ]))
if (cols > 1 && c < cols) lst <- c(lst, spc)
}
kable(lst,
caption = ifelse(sects == 1, caption, paste0(caption, " (", s, "/", sects, ")")),
label = "tables", format = "latex", booktabs = TRUE) %>%
kable_styling(latex_options = "hold_position") %>%
print()
}
}
```
```{r, results='asis'}
airquality %>%
select(1:3) %>%
split.print(cols = 3, sects = 2, caption = "multi page table")
```
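The row bookkeeping inside split.print can be checked in isolation. The sketch below pulls the same arithmetic into a hypothetical helper, `chunk_bounds()` (not part of the code above): n rows are divided into k consecutive chunks of nearly equal size, and the first n %% k chunks receive one extra row.

```r
# Hypothetical helper mirroring the index arithmetic used in split.print:
# returns the cumulative row boundaries 0, b1, b2, ..., n.
chunk_bounds <- function(n, k) {
  extra <- n %% k
  sizes <- rep(n %/% k, k) + c(rep(1, extra), rep(0, k - extra))
  c(0, cumsum(sizes))
}

b <- chunk_bounds(10, 3)

# Rows belonging to chunk i are (b[i] + 1):b[i + 1].
chunks <- lapply(1:3, function(i) (b[i] + 1):b[i + 1])
chunks
```

With 10 rows and 3 chunks the sizes come out as 4, 3 and 3, so the longer chunk ends up in the leftmost column.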

gsub in columns value in dataframe

I have a file with multiple columns. I am showing the two columns I am interested in:
Probe.Set.ID Entrez.Gene
A01157cds_s_at 50682
A03913cds_s_at 29366
A04674cds_s_at 24860 /// 100909612
A07543cds_s_at 24867
A09811cds_s_at 25662
---- ----
A16585cds_s_at 25616
I need to replace /// with "\t" (tab), and the output should look like this:
A01157cds_s_at;50682
A03913cds_s_at;29366
A04674cds_s_at;24860 100909612
Also, I need to exclude the rows with "----".
Here is a slightly different approach using dplyr:
data <- data.frame(Probe.Set.ID = c("A01157cds_s_at",
"A03913cds_s_at",
"A04674cds_s_at",
"A07543cds_s_at",
"A09811cds_s_at",
"----",
"A16585cds_s_at"),
Entrez.Gene = c("50682",
"29366",
"24860 /// 100909612",
"24867",
"25662",
"----",
"25616")
)
if(!require(dplyr)) install.packages("dplyr")
library(dplyr)
data %>%
filter(Entrez.Gene != "----") %>%
mutate(new_column = paste(Probe.Set.ID,
gsub("///", "\t", Entrez.Gene),
sep = ";"
)
) %>% select(new_column)
It looks like you will want to subset the data, paste the two columns together, and then use gsub to replace the '///'. Here is what I came up with, with dat being the data frame containing the two columns.
dat = dat[dat$Probe.Set.ID != "----",] # removes the rows with "----"
dat = paste0(dat$Probe.Set.ID, ";", dat$Entrez.Gene) # pastes the columns together and adds the ";"
dat = gsub("///","\t",dat) # replaces the "///" with a tab
Also, use cat() to view the actual tab rather than the escape sequence "\t". I got that from here: How to replace specific characters of a string with tab in R. Note that this produces a character vector rather than a data.frame. You can convert it back with data.frame(), but then you cannot use cat() to view it.
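A short base-R sketch of the difference between print() and cat(), using one value from the question. Matching " /// " together with its surrounding spaces (an adjustment to the pattern used above) leaves only the tab between the two IDs, and fixed = TRUE treats the pattern as a literal string.

```r
x <- "A04674cds_s_at;24860 /// 100909612"

# Replace the separator, including the spaces around it, with a tab.
y <- gsub(" /// ", "\t", x, fixed = TRUE)

print(y)      # shows the escape sequence \t inside quotes
cat(y, "\n")  # writes the actual tab character
```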
We can use dplyr and tidyr here.
library(dplyr)
library(tidyr)
df <- data.frame(
  col1 = c('A01157cds_s_at', 'A03913cds_s_at', 'A04674cds_s_at', 'A07543cds_s_at', '----'),
  col2 = c('50682', '29366', '24860 /// 100909612', '24867', '----'))
# Rows without '///' get NA in col2_second
df %>% filter(col1 != '----') %>%
  separate(col2, c('col2_first', 'col2_second'), '///', remove = TRUE) %>%
  unite(col1_new, c(col1, col2_first), sep = ';', remove = TRUE)
## col1_new col2_second
## 1 A01157cds_s_at;50682 <NA>
## 2 A03913cds_s_at;29366 <NA>
## 3 A04674cds_s_at;24860 100909612
## 4 A07543cds_s_at;24867 <NA>
filter removes the observations with col1 == '----'.
separate splits col2 into two columns, namely col2_first and col2_second
unite concatenates col1 and col2_first with ; as separator.