The National Socio-Environmental Synthesis Center (SESYNC) congratulates Kelly Hondula, Quantitative Researcher and Computer Programmer, on being named a 2018 rOpenSci Fellow. This year, a diverse committee reviewed 64 applications from researchers in various disciplines to select four winners.
rOpenSci’s mission is to enable and support a thriving community of researchers who embrace open and reproducible research practices as part of their work. Since its inception, one of the ways rOpenSci has supported the community is by developing high-quality open source tools that lower barriers to working with scientific data. Equally important to its mission is building capacity and promoting researchers who engage in such practices within their disciplinary communities. The fellowship program is an opportunity for rOpenSci to give such individuals a bigger voice in their communities.
Ecologists, hydrologists, and soil scientists collaborating in watershed science use diverse methods to gather data about ecosystems. Each collaborator’s workflow draws on different disciplinary expertise to derive results, such as modifying lab procedures, developing calibration curves, or applying nuanced site-specific corrections. This results in many different strategies for organizing and managing data, depending on the relevant units and timescales of observations and on whether the data come from manually recorded observations, physical samples, or machine-generated output from sensors or lab equipment. These workflows present a challenge for reproducibility because they are necessarily idiosyncratic and highly contextualized. For example, sensor data often go through multiple levels of interpretation for quality control to identify periods where data are compromised by maintenance, power failures, biofouling, or other malfunctions. Integrating these time series data with sample-based data has been a particular challenge in watershed science and is typically handled with time-consuming, custom approaches that are difficult to document and reproduce.
With the rOpenSci fellowship, I’ll be creating tools and training material to help manage these diverse types of data with workflows in R structured around an information model called ODM2 (the Observations Data Model Version 2). This data model was developed in the hydrology and geoscience community, and was specifically designed to make earth science data more interoperable across networks of research sites. My goal is to make it easy for ecologists to adopt ODM2’s concepts and vocabulary for projects that study ecosystems through a combination of field monitoring, sample collection and analysis, laboratory studies, and computer modeling, even if a project is not part of a large network with cyberinfrastructure resources and dedicated data managers.
Datasets using this common framework would have greater re-use potential, and could be shared with a wider community in existing long-term data repositories. I’ll be developing a set of modules based on common procedures in aquatic ecology, with examples and guidelines for how to structure a reproducible workflow and “translate” it into the ODM2 data model. Each module will have a set of R functions, template data sheets, and vignettes with conceptual diagrams describing how to use the workflow. To complement these modules, I’ll be using packages like Shiny, leaflet, and dygraphs to create visualization tools that help interpret biogeochemistry and hydrology datasets using the ODM2 framework. My overall goal is for these tools to make it easier for ecologists, especially those comfortable using R for data analysis, to seamlessly integrate components of the ODM2 framework into the workflows they are already using.
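To make the sensor and sample integration problem described above concrete, here is a minimal sketch of the two steps involved: flagging sensor readings that fall inside a compromised period, then pairing each physical sample with the nearest reading that passed quality control. It is written in Python with only the standard library purely for illustration; the fellowship tools themselves target R. All names, timestamps, and values below are hypothetical, and the structure does not follow the ODM2 schema.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

# Hypothetical 15-minute sensor readings: (timestamp, value), sorted by time.
sensor = [
    (datetime(2018, 6, 1, 10, 0), 5.1),
    (datetime(2018, 6, 1, 10, 15), 5.3),
    (datetime(2018, 6, 1, 10, 30), 0.0),   # taken during a maintenance visit
    (datetime(2018, 6, 1, 10, 45), 5.6),
    (datetime(2018, 6, 1, 11, 0), 5.8),
]

# Hypothetical quality-control log: (start, end, reason) for compromised periods.
qc_periods = [
    (datetime(2018, 6, 1, 10, 25), datetime(2018, 6, 1, 10, 35), "maintenance"),
]


def flag_compromised(readings, periods):
    """Attach a QC flag to each reading: the reason it is compromised, or "ok"."""
    flagged = []
    for ts, value in readings:
        reason = next((r for start, end, r in periods if start <= ts <= end), None)
        flagged.append({"time": ts, "value": value, "qc": reason or "ok"})
    return flagged


def nearest_good_reading(flagged, sample_time, tolerance):
    """Return the closest "ok" reading within `tolerance` of a sample, else None.

    Assumes `flagged` is sorted by time.
    """
    good = [r for r in flagged if r["qc"] == "ok"]
    times = [r["time"] for r in good]
    i = bisect_left(times, sample_time)
    # Only the readings just before and just after the sample can be nearest.
    candidates = good[max(i - 1, 0):i + 1]
    if not candidates:
        return None
    best = min(candidates, key=lambda r: abs(r["time"] - sample_time))
    return best if abs(best["time"] - sample_time) <= tolerance else None


flagged = flag_compromised(sensor, qc_periods)

# A hypothetical grab sample collected at 10:32, during the flagged period.
match = nearest_good_reading(
    flagged, datetime(2018, 6, 1, 10, 32), timedelta(minutes=15)
)
print(match)
```

The sample collected at 10:32 is paired with the 10:45 reading rather than the flagged 10:30 one, and a sample with no usable reading inside the tolerance window gets no match at all. Recording the flag and the matching rule explicitly, instead of editing values by hand, is what makes this kind of step possible to document and reproduce.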
Kelly is an aquatic ecologist working on her PhD in the MEES (Marine, Estuarine, and Environmental Sciences) Program at the University of Maryland. Her dissertation is on linking hydrology and methane in wetlands. She also works at the National Socio-Environmental Synthesis Center, where she provides data science support for SESYNC teams and fellows.
Read the original announcement on rOpenSci's website.