Why does my code chunk render my plot correctly when run by itself but I get an 'object not found' error when I try to knit in R Markdown?

I am relatively new at R Markdown and am having trouble when trying to knit to create a report. The error I am getting is:
Error in ggplot(data = bio1530_sci1420_summary_stats.xlsx) :
object 'bio1530_sci1420_summary_stats.xlsx' not found
Calls: ... withVisible -> eval_with_user_handlers -> eval -> eval -> ggplot
Execution halted
Here is my code thus far:
---
title: "NGRMarkdown"
author: "Rob McCandless"
date: "`r Sys.Date()`"
output: word_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r plot}
library(ggplot2)
library(ggrepel)
library(tidyverse)
library(here)

read_csv("bio1530_sci1420_summary_stats.xlsx")

# Scatterplot of mean course grade v. mean normalized gain on 1420 and 1530 data
# with regression lines and error bars
ggplot(data = bio1530_sci1420_summary_stats.xlsx) +
  geom_errorbar(aes(x = Course_grade, y = Norm_gain, ymin = Norm_gain - CI, ymax = Norm_gain + CI),
                color = "black", width = 0.2, position = position_dodge2(10.0)) +
  geom_point(mapping = aes(x = Course_grade, y = Norm_gain, shape = Course, color = Course), size = 3) +
  geom_smooth(method = lm, se = FALSE, col = "black", size = 1,
              mapping = aes(x = Course_grade, y = Norm_gain, linetype = Course)) +
  geom_label_repel(aes(Course_grade, y = Norm_gain, label = Alpha),
                   box.padding = 0.3, point.padding = 0.7, segment.color = "grey50") + # added point labels A-J
  ylab("Mean Normalized Gain (all instructor sections)") +
  xlab("Mean Course Grade (all instructor sections)") +
  scale_fill_discrete(labels = c("Bio 1530", "Sci 1420")) +
  labs(title = "Normalized Gain v. Course Grade by Course & Instructor",
       subtitle = "Mean and 95% CI of all sections per instructor (A-J)") +
  theme(plot.title = element_text(hjust = 0.5)) +
  theme(plot.subtitle = element_text(hjust = 0.5)) +
  annotate("text", x = 73.0, y = 0.09, label = "R2 = 0.68, p = 0.044") +
  annotate("text", x = 78.5, y = 0.22, label = "R2 = 0.46, p = 0.095")
```
And this is the plot that renders when I tell R to run this chunk only:
[Course grade v. normalized gain](https://i.stack.imgur.com/WO9S7.png)
So the code works and the dataframe the code refers to is valid, but it won't render when I try to knit in R Markdown.
I suspect it may have to do with the current and working directories not being the same, but I'm not certain of this and am not sure how to check it. I have confirmed that my working directory is:
getwd()
[1] "/Users/robmccandless/Library/Mobile Documents/com~apple~CloudDocs/R Projects/Normalized_Gain_Data"
and this is where the dataframe and RMD file are both located. Can anyone give me some idea of what I am doing wrong? Any assistance will be greatly appreciated.

The error is saying that your dataset object (i.e., the .xlsx file) is not found in your local environment. From the snippet above, the dataset is only read, never saved to an object, so there is nothing named `bio1530_sci1420_summary_stats.xlsx` for `ggplot()` to find. (Note also that `read_csv()` parses CSV text, not Excel files; for an .xlsx file use `readxl::read_excel()`.) One option is to try in your markdown:
df <- readxl::read_excel("bio1530_sci1420_summary_stats.xlsx")
ggplot(data=df)+
geom_errorbar(aes(x=Course_grade, y=Norm_gain, ymin=Norm_gain-CI, ymax=Norm_gain+CI), color="black", width=0.2, position=position_dodge2(10.0))+
geom_point(mapping=aes(x=Course_grade, y=Norm_gain, shape=Course, color=Course),size=3)+
geom_smooth(method=lm, se=FALSE, col='black', size=1, mapping=aes(x=Course_grade, y=Norm_gain, linetype=Course))+
geom_label_repel(aes(Course_grade, y=Norm_gain, label = Alpha), box.padding = 0.3, point.padding = 0.7, segment.color = 'grey50')+ #added point labels A-J
ylab('Mean Normalized Gain (all instructor sections)')+
xlab('Mean Course Grade (all instructor sections)')+
scale_fill_discrete(labels=c("Bio 1530", "Sci 1420"))+
labs(title="Normalized Gain v. Course Grade by Course & Instructor", subtitle="Mean and 95% CI of all sections per instructor (A-J)")+
theme(plot.title=element_text(hjust=0.5))+
theme(plot.subtitle=element_text(hjust=0.5))+
annotate("text", x=73.0, y=0.09, label="R2 = 0.68, p = 0.044")+
annotate("text", x=78.5, y=0.22, label="R2 = 0.46, p = 0.095")

Related

Predict zero-inflated negative binomial to raster with raster::predict

I am trying to predict a ZINB model to a raster. My code looks like this:
m1 <- pscl::zeroinfl(Herbround ~ LandcoverCode | LandcoverCode, data = data, dist = 'negbin')
r <- raster("data\\final.tif") # read in raster
names(r) <- "LandcoverCode" # match terminology in the model
#predict model to raster
predict(r, m1)
Error in `contrasts<-`(`*tmp*`, value = contrasts.arg[[nn]]) :
contrasts apply only to factors
In addition: Warning message:
In model.frame.default(delete.response(object$terms$full), newdata, :
variable 'LandcoverCode' is not a factor
I then get the error above. If I use an lm or glm model it works without issues and I know that my LandcoverCode is in fact a factor with multiple levels (6 to be exact). I have tried adding in:
predict(r, m1, type = "count")
But that doesn't work either. Any help with predicting a ZINB model to a raster would be greatly appreciated, thanks.

Rmarkdown officer / officedown / flextables problem

I have the following rmd script. I've spent a few days trying to get this to work but I am failing miserably. Basically I need help with three things. I am happy to post three separate questions if needed.
1. The multicolumn options/code are completely ignored. The corporatetable.docx is in landscape and has a typical corporate style. I need to have full-width landscape -> two-column landscape -> full-width landscape. If I could get the two-column landscape setup to work, the remaining style would be inherited from corporatetable.docx. If I could get help with only one of these, I would need this one.
2. When I run the rmd it generates a Word file, but none of the corporate styles are in there; it just uses my Word defaults. The difference is very clear: no landscape, single column, and blue instead of red. How do I correctly point `officedown::rdocx_document:` at my reference Word file? It's clearly not picking it up, and no warning or error is generated.
3. In the second chunk I am using flextable to show two pictures (passed through params) in the Word report and align them with some information. `myft` works, but it prints the (temporary/volatile) path instead of showing the pictures in the report. For reference, if I use `knitr::include_graphics(c(params$x1, params$x2))` it works fine.
I'm really stuck on these. Any help is welcome.
---
title: "Title"
subtitle:
params:
  x1: x1
  x1_name: x1_name
  x1_email: x1_email
  x2: x2
  x2_name: x2_name
  x2_email: x2_email
output:
  officedown::rdocx_document:
    reference_docx: corporatetemplate.docx
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(officedown)
library(officer)
library(flextable)
knitr::opts_chunk$set(out.width = '100%', dpi=300)
```
<!---BLOCK_MULTICOL_START--->
This text is on column 1. Please work
```{r somecodechunk, echo=FALSE, out.width="75px", include=TRUE, strip.white=TRUE}
library(flextable)
# this works but prefer to use flextable
# if(all(!is.null(params))) {
# knitr::include_graphics(c(params$x1,params$x2))} else {
# }
myft <- data.frame(
"pic1" = rep("",3),
"details1" = c(params$x1_name,"+X XXX XXX X",params$x1_email),
"pic2" = rep("",3),
"details2" = c(params$x2_name,"+X XXX XXX X",params$x2_email)
)
myft <- flextable(myft)
myft <- merge_at(myft, i = 1:3, j = 1 )
myft <- merge_at(myft, i = 1:3, j = 3 )
myft <- compose(myft,i = 1, j = 1, value = as_paragraph(as_image(params$x1), part = "body"))
myft <- compose(myft,i = 1, j = 3, value = as_paragraph(as_image(params$x2), part = "body"))
autofit(myft)
#Ok this does not work because the pics are not shown
```
`r run_columnbreak()`
This text is on column 2. Please work
This text is on column 2. Please work
`r run_linebreak()`
<!---BLOCK_MULTICOL_STOP{widths: [4,4], space: 0.2, sep: true}--->
\pagebreak
Back to full width with some text
\pagebreak

Rendering in Shiny: red error before reactive data loads

I am sure there is a function out there that will let me wait to render my echarts4r output until my pickers and reactive data have loaded, so as to avoid the ugly red error that appears before they do. Once the reactive data is loaded, I get a nice output. It would be even better to have a little loading symbol while things load, but it only takes about a second, so that isn't imperative.
Here are some example code snippets.
The UI for the graphic:
box(width = 12, title = "Total Escalated Project Costs by Calendar Year", solidHeader = TRUE, collapsible = TRUE, status = "primary",
    echarts4rOutput("initiative_cashflow")
)
And the server for the graphic:
output$Breakdown_by_Project <- renderEcharts4r({
  breakdown_by_project_data <- init_data() %>%
    group_by(Project) %>%
    summarise(total_project_cost = sum(`Escalated Project Cost`)) %>%
    select(Project, total_project_cost) %>%
    e_charts(Project) %>%
    e_pie(total_project_cost, radius = c("50%", "70%")) %>%
    e_legend(show = FALSE) %>%
    e_tooltip()
})
Any help would be appreciated!
Thanks!

Is there a way to have previous/next links at the bottom of shiny presentations?

I am working in RStudio and creating a markdown Shiny presentation (which I believe uses IOslides).
Currently the generated presentation doesn't have any navigational help, the user has to know they need to use left/right arrows to move to the next or previous slides. Even when deployed to server I don't see any arrows at the bottom of presentations.
I have searched through documentation and here to see if this is possible, but can't seem to find anything.
Is there some setting to include a Previous/Next type link at the bottom of every slide?
Process to create my presentation in R Studio:
New file > R Markdown > Shiny > Shiny presentation
The issue occurs even with the sample code when creating a new file - there are no navigation arrows
Published example (where there are no navigation arrows):
https://regolith.shinyapps.io/test
And the sample code (as generated by R studio):
---
title: "test"
author: ""
date: "24 January 2017"
output: ioslides_presentation
runtime: shiny
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```
## Shiny Presentation
This R Markdown presentation is made interactive using Shiny. The viewers of the presentation can change the assumptions underlying what's presented and see the results immediately.
To learn more, see [Interactive Documents](http://rmarkdown.rstudio.com/authoring_shiny.html).
## Interactive Plot
```{r eruptions}
inputPanel(
selectInput("n_breaks", label = "Number of bins:",
choices = c(10, 20, 35, 50), selected = 20),
sliderInput("bw_adjust", label = "Bandwidth adjustment:",
min = 0.2, max = 2, value = 1, step = 0.2)
)
renderPlot({
hist(faithful$eruptions, probability = TRUE, breaks = as.numeric(input$n_breaks),
xlab = "Duration (minutes)", main = "Geyser eruption duration")
dens <- density(faithful$eruptions, adjust = input$bw_adjust)
lines(dens, col = "blue")
})
```
## Bullets
- Bullet 1
- Bullet 2
- Bullet 3
## R Output
```{r cars}
summary(cars)
```

Reading Time Series from netCDF with python

I'm trying to create a time series from a netCDF file (accessed via a THREDDS server) with Python. The code I use seems correct, but the values of the variable I am reading come back 'masked'. I'm new to Python and I'm not familiar with the formats. Any idea how I can read the data?
This is the code I use:
import netCDF4
import pandas as pd
import datetime as dt
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
dayFile = datetime.now() - timedelta(days=1)
dayFile = dayFile.strftime("%Y%m%d")
url='http://nomads.ncep.noaa.gov:9090/dods/nam/nam%s/nam1hr_00z' %(dayFile)
# NetCDF4-Python can open OPeNDAP dataset just like a local NetCDF file
nc = netCDF4.Dataset(url)
varsInFile = nc.variables.keys()
lat = nc.variables['lat'][:]
lon = nc.variables['lon'][:]
time_var = nc.variables['time']
dtime = netCDF4.num2date(time_var[:],time_var.units)
first = netCDF4.num2date(time_var[0],time_var.units)
last = netCDF4.num2date(time_var[-1],time_var.units)
print(first.strftime('%Y-%b-%d %H:%M'))
print(last.strftime('%Y-%b-%d %H:%M'))
# determine what longitude convention is being used
print(lon.min(), lon.max())
# Specify desired station time series location
# note we add 360 because of the lon convention in this dataset
#lati = 36.605; loni = -121.85899 + 360. # west of Pacific Grove, CA
lati = 41.4; loni = -100.8 +360.0 # Georges Bank
# Function to find index to nearest point
def near(array,value):
idx=(abs(array-value)).argmin()
return idx
# Find nearest point to desired location (no interpolation)
ix = near(lon, loni)
iy = near(lat, lati)
print(ix, iy)
# Extract desired times.
# 1. Select -+some days around the current time:
start = netCDF4.num2date(time_var[0],time_var.units)
stop = netCDF4.num2date(time_var[-1],time_var.units)
istart = netCDF4.date2index(start,time_var,select='nearest')
istop = netCDF4.date2index(stop,time_var,select='nearest')
print(istart, istop)
# Get all time records of variable [vname] at indices [iy,ix]
vname = 'dswrfsfc'
var = nc.variables[vname]
hs = var[istart:istop,iy,ix]
tim = dtime[istart:istop]
# Create Pandas time series object
ts = pd.Series(hs,index=tim,name=vname)
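As an aside, the `near()` helper in the snippet is just an argmin over absolute differences; a minimal self-contained illustration (with made-up grid longitudes, not data from the NAM model):

```python
import numpy as np

def near(array, value):
    # index of the element of `array` closest to `value` (no interpolation)
    return int(np.abs(array - value).argmin())

lons = np.array([250.0, 255.0, 260.0, 265.0])
ix = near(lons, 259.2)
print(ix)  # 2, because 260.0 is the closest grid longitude to 259.2
```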
The var data are not read as I expected, apparently because data is masked:
>>> hs
masked_array(data = [-- -- -- ..., -- -- --],
mask = [ True True True ..., True True True],
fill_value = 9.999e+20)
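For reference, a masked netCDF value becomes NaN once the masked array is filled or wrapped in a pandas Series; a tiny synthetic illustration (the values and mask here are made up, not from the dataset above):

```python
import numpy as np

# A synthetic masked array mimicking what netCDF4 returns for missing data
hs = np.ma.masked_array([1.0, 2.0, 3.0],
                        mask=[False, True, False],
                        fill_value=9.999e20)
filled = hs.filled(np.nan)  # masked entries become NaN
print(filled)  # the masked middle entry prints as nan
```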
The variable name and the time index are correct, as is the rest of the script. The only thing that doesn't work is the variable data retrieved. This is the time series I get:
>>> ts
2016-10-25 00:00:00.000000 NaN
2016-10-25 01:00:00.000000 NaN
2016-10-25 02:00:00.000006 NaN
2016-10-25 03:00:00.000000 NaN
2016-10-25 04:00:00.000000 NaN
... ... ... ... ...
2016-10-26 10:00:00.000000 NaN
2016-10-26 11:00:00.000006 NaN
Name: dswrfsfc, dtype: float32
Any help will be appreciated!
Hmm, this code looks familiar. ;-)
You are getting NaNs because the NAM model you are trying to access now uses longitude in the range [-180, 180] instead of the range [0, 360]. So if you request loni = -100.8 instead of loni = -100.8 +360.0, I believe your code will return non-NaN values.
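A quick illustrative sketch (not part of the original answer) of converting between the two longitude conventions:

```python
# Hypothetical helpers for switching between the [0, 360) and
# [-180, 180) longitude conventions.
def to_pm180(lon):
    """Map a longitude to the [-180, 180) convention."""
    return ((lon + 180.0) % 360.0) - 180.0

def to_0360(lon):
    """Map a longitude to the [0, 360) convention."""
    return lon % 360.0

# The question used loni = -100.8 + 360.0 = 259.2;
# the model now expects the equivalent value in [-180, 180):
print(to_pm180(259.2))  # ≈ -100.8 (up to float rounding)
```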
It's worth noting, however, that the task of extracting time series from multidimensional gridded data is now much easier with xarray, because you can simply select a dataset closest to a lon,lat point and then plot any variable. The data only gets loaded when you need it, not when you extract the dataset object. So basically you now only need:
import xarray as xr
ds = xr.open_dataset(url) # NetCDF or OPeNDAP URL
lati = 41.4; loni = -100.8 # Georges Bank
# Extract a dataset closest to specified point
dsloc = ds.sel(lon=loni, lat=lati, method='nearest')
# select a variable to plot
dsloc['dswrfsfc'].plot()
Full notebook here: http://nbviewer.jupyter.org/gist/rsignell-usgs/d55b37c6253f27c53ef0731b610b81b4
I checked your approach with xarray. It works great to extract solar radiation data! I can add that the first point is not defined (NaN) because the model starts calculating there, so there is no accumulated radiation data yet (needed to calculate hourly global radiation); that is why it is masked.
Something everyone overlooked is that the output is not correct. It looks OK (at noon = sunshine, at midnight = 0, dark), but the day length is wrong! I checked it for 52° north, 5.6° east (November), and the day length is at least 2 hours too long. (The NOAA Panoply viewer for netCDF databases gives similar results.)