class: center, middle, inverse, title-slide

# An Open Science Framework for Research on Cyanobacteria in Lakes and Ponds
## US EPA, Region 7
### Jeff Hollister, Farnaz Nojavan, Betty Kreakie, Stephen Shivers, and Bryan Milstead
###
Lenexa, KS
---
class: center, middle, inverse

# Twitter?

![Yes!!!]()

### hashtag: \#cyanobacteria
### me: @jhollist

---
class: center, middle, inverse

# Who, what, why, and how?

---

# Who are we?

.left-column[
- Ecologists
- Computational focus
  - Enough to be dangerous
- 3 FTE
  - Myself
  - Betty Kreakie
  - Bryan Milstead
- 2 Post-docs
  - Farnaz Nojavan
  - Stephen Shivers
]

.right-column[
<img src="figure/comp_eco_crew.jpg" style="margin-left: 75px"></img>
]

---

# What do we do?

- Apply computational approaches to understand water quality impacts in lakes
- Open Science

![The workflow](figure/cyano_model_function.jpg)

---

# What is open science?

- Access to materials
- Reproducible/Repeatable
- The Web!
- A process, not a state

<img src="figure/open_sign_4344960203_821cb56ec9_o_resize.jpg" style="height: 80%; width: 80%; margin-left: 100px"></img>

---

# Why open science?

- Often required
  - Government/Funders/Journals
- Benefits researchers
  - [McKiernan et al. (2016) How open science helps researchers succeed]()
- Improves quality
  - [The classic example: Reinhart and Rogoff]()
- Benefits to society
  - ["Sharing of Data Leads to Progress on Alzheimer’s"]()

<img src="" style="width: 60%; height: 60%; margin-left: 20%;"></img>

---

# How are we open?

.left-column[
- R package development
  - Research compendia
  - Tooling for common problems
- Visualization
- Sharing and collaborating
- Publishing
- Apply to our research efforts
]

.right-column[
<img src = "" style="width: 110%; height: 110%; margin-top: 50px;"></img>
]

---
class: center, middle, inverse

# R Packages

---

# Why R Packages

- Useful structure
- Infrastructure for sharing
  - GitHub
  - CRAN
- We are an R shop!
<img src="" style="width: 80%; height: 80%; margin-left: 10%;"></img>

---

# Research Compendia

.left-column[
- Define
- Origins
  - [Gentleman and Lang (2004)]()
- Part of
  - Reproducible Research
  - Literate Programming (à la Donald Knuth)
- ROpenSci efforts
  - [rrrpkg]()
  - [ROpenSci unconf 2017 discussion]()
]

.right-column[
<img src="" style="margin-left: 75px"></img>
from Nüst, Konkol, et al (2017),
]

---

# Packages as Research Compendia

- R, Data, and Vignettes folders
- Other examples
  - [Carl Boettiger's template]()
  - [Ben Marwick]()
- Our examples
  -
  -
- GitHub and Zenodo (Archive)

![ghz](ghz.jpg)

---

# Packages to solve common problems

- `lakemorpho`
- `elevatr`
- `goatscape` (in development)

<img src="" style="width: 80%; height: 80%; margin-left: 10%;"></img>

---

# `lakemorpho`

.footnote[Package URL: <>]

.left-column[
- Lake morphometry metrics in R
- Version 1.0 - August 2014
- Version 1.1.0 - December 2016
- `sf` support to be added
- [National Lake Morphometry]()
- [Hollister and Milstead (2010)]()
- [Hollister *et al.* (2011)]()
- [Hollister and Stachelek (2017)]()
]

.right-column[
![lakemorpho](figure/lakemorpho.png)
]

---
class: center, middle
background-image: url('figure/lakemorpho_demo.png')
background-position: 50% 50%

# lakemorpho::demo

<!-- # [lakemorpho: Demo]() -->

---

# `elevatr`

.footnote[Package URL: <>]

.left-column[
- Access elevation data in R
  - Mapzen
  - AWS
  - USGS
- Version 0.1.1 - January 2017
- Version 0.1.3 - March 2017
- Will be paired with `lakemorpho`
- `sf` support to be added
]

.right-column[
![elevatr](figure/elevatr.png)
]

---
class: center, middle
background-image: url('figure/elevatr_demo.png')
background-position: 50% 50%

# elevatr::demo

<!-- # [elevatr: Demo]() -->

---

# `goatscape`

.left-column[
- New effort with Bryan Milstead
- What's in a name?
- Summarizes ancillary data for a user-defined landscape polygon
  - Census (via `censusapi`)
  - Landcover
  - Impervious
- Accepts arbitrary spatial data for the landscape
- Based on `sf` and tidy by design
- <>
]

.right-column[
<img src="figure/goatscape_logo.jpg" style="width: 90%; height: 90%;"></img>
]

---
class: middle, center, inverse

# Data Visualization

---

# Shiny: Cyanobacteria Monitoring Collaborative

.footnote[Project URL: <>]

.left-column[
- Started in 2013
- New England Region Cyanobacteria Monitoring Workgroup
- Three Projects
  - bloomWatch
  - cyanoScope
  - Monitoring
- Data Viz with Shiny
]

.right-column[
![cyano web](figure/cyano_web.jpg)
]

---
class: middle, center
background-image: url("figure/shiny.jpg")

# [Shiny: Demo]()

---
class: middle, center, inverse

# Sharing and Collaborating

---

# GitHub

- What is it?
- How do we use it?

<img src = "" style="margin-left: 5%; margin-top: -20px; width: 700px;"></img>

---
class: middle, center
background-image: url("figure/github_demo.jpg")

# [GitHub: Demo]()

---
class: middle, center, inverse

# Open Access

---

# Publishing

- Preprints
  - [Hollister *et al.* (2016) PeerJ Preprints]()
- Open first
  - [Milstead *et al.* (2013) PLoS One]()
  - [Hollister and Kreakie (2016) F1000Research]()
- Money where our mouth(s) is(are)
  - [Kreakie *et al.* (2015) LakeLines]()

<img src = "figure/oa_journals.jpg" style="margin-left: 15%; width: 500px;"></img>

---
class: middle, center, inverse

# Open Science Research

---

# Models and field research

- Random forest models of trophic state and chlorophyll *a*
- Re-thinking the Lake Trophic State Index
- Chlorophyll *a* and microcystin
- Temporal and spatial dynamics of cyanobacteria blooms
- New work
  - Lake photic zone temperature
  - Phytoplankton community analysis

<img src="" style="width: 70%; height: 70%; margin-left: 10%;"></img>

---

# Random forest models of trophic state and chlorophyll *a*

.left-column[
- National
- Data
  - National Lakes Assessment
  - Land cover
- `randomForest` package
- Variable selection
  - All variables (water quality and GIS)
    - 68.7% Total Accuracy
  - GIS only variables
    - 49% Total Accuracy
- But ...
]

.right-column[
<img src="figure/hollisterES15-00703R_fig11.jpg" style="width: 100%; height: 100%; margin-top: 50px;"></img>
]

---

# Random forest models of trophic state and chlorophyll *a*

- How is it open and reproducible?
  - [GitHub]()
  - [10.5281/zenodo.40271]()
  - [PeerJ Pre-print]()
  - [Ecosphere (OA)]()

![ecosphere](figure/ecosphere.jpg)

---

# Re-thinking the Lake Trophic State Index

.left-column[
- Led by Farnaz Nojavan
- Hierarchical model
  - Nitrogen and Phosphorus
  - POLR: Revised Trophic State Index
- Total Accuracy - 0.6
- Balanced Accuracy - 0.68 to 0.78
]

.right-column[
<img src="figure/dag-0.jpg" style="width: 100%; height: 100%; margin-top: 50px"></img>
]

---

# Re-thinking the Lake Trophic State Index

.left-column[
- Hierarchical model
  - Nitrogen and Phosphorus
  - POLR: Revised Trophic State Index
- Total Accuracy - 0.6
- Balanced Accuracy - 0.68 to 0.78
]

.right-column[
![models](figure/predcit_acc-0.jpg)
]

---

# Re-thinking the Lake Trophic State Index

- How is it open and reproducible?
  - [GitHub]()
  - [10.5281/zenodo.556175]()
  - OA (when published)

![ecol_model](figure/ecol_model.jpg)

---

# Chlorophyll *a* and microcystin

.left-column[
- National
- Diagnostic tool
- Probability
  - Exceeding microcystin advisory
  - Given chlorophyll *a* concentration
]

.right-column[
<img src="" style="width: 120%; height: 120%; margin-top: 50px;"></img>
]

---

# Chlorophyll *a* and microcystin

- The numbers!

![mcyst_table](figure/mcys_table.jpg)

---

# Chlorophyll *a* and microcystin

- How is it open?
  - [GitHub]()
  - [Zenodo]()
  - [F1000Research]()
    - Pre-print and peer-reviewed in one!
![f1000](figure/f1000.jpg)

---

# Temporal and spatial dynamics of cyanobacteria blooms

- Led by Stephen Shivers
- Rhode Island
- Field effort
  - 2 ponds
    - Yawgoo Pond (the nice wooded site)
    - Warwick Pond (gritty and (somewhat) urban site)
  - Twice weekly
  - Seven sampling locations in each

<img src="figure/yawg_warw.png" style="width: 83%; margin-top: -2%; margin-left: 10%;"></img>

---

# Temporal and spatial dynamics of cyanobacteria blooms

.left-column[
- Measurements
  - Chlorophyll *a*
  - Phycocyanin
  - Microcystin
  - Turbidity
  - Physical profiles
  - Secchi
  - Plankton
  - Nutrients
]

.right-column[
<img src="figure/chla.png" style="width: 120%; height: 120%; margin-top: 50px;"></img>
]

---

# Temporal and spatial dynamics of cyanobacteria blooms

- How will it be open?
  - [Private (for now) GitHub]()
  - Zenodo
  - Open Access publications
  - Data publication?

![cyano_space_time](figure/cyano_space_time.jpg)

---

# New work

- Hierarchical Bayes models of microcystin
- Lake photic zone temperature
- Phytoplankton community analysis

<img src="figure/Lila1.jpg" style="width: 42%; height: 42%; margin-left:30%"></img>

---

# Thanks!

.center[
## Jeff Hollister

US EPA <br>
Atlantic Ecology Division <br>
Narragansett, RI <br>

email: []() <br>
twitter: [@jhollist]() <br>
github: [jhollist]() <br>

Slides created via the R package [**xaringan**]()
]