The Summer Fellowship Program of the Environmental Data Initiative

March 25, 2020

Susanne Grossman-Clarke


The Environmental Data Initiative (EDI) helps researchers from field stations, individual laboratories, and research projects of all sizes archive and publish their environmental data. EDI’s very successful Summer Fellowship Program for Data Management Training is one component of our Outreach and Training program. For the third consecutive year, EDI is reviewing applications from undergraduate and graduate students interested in becoming EDI summer fellows. This year we are seeking nine fellows to be trained in the data publishing process and to support nine research sites in their efforts to manage their data. EDI’s aim is to ensure that these young professionals learn state-of-the-art data stewardship practices.

The fellowships begin with an in-person training workshop that covers the essentials of data preparation, archiving, and publishing in the EDI data repository. This workshop emphasizes community-building and fosters collaborations between participants and EDI mentors. Immediately following the workshop, the fellows spend two months at specific host research sites, where they gain hands-on experience managing the host site’s data with support from local scientists and staff. EDI Outreach Specialist Susanne Grossman-Clarke hosts bi-weekly Zoom calls with the fellows so that they can get support from EDI’s data curators when needed. For the host sites, this program strives to ensure that (1) long-term data are securely archived and published, (2) labs conform to the data management requirements established by their funding agencies, and (3) sites retain data publication expertise post-fellowship.

Because of the COVID-19 situation, we are preparing to hold the data publishing training workshop online. The host sites are considering working with their fellows remotely.

Many of the data sets archived by past fellows came from long-term studies, with data collected by field stations, research groups, students, and volunteers. Most of these data were very well documented, although some did not have complete provenance. Our fellows’ enthusiasm and their mentors’ commitment and support greatly facilitate the generation of rich metadata, which is paramount for archiving and publishing FAIR (Findable, Accessible, Interoperable, and Reusable) data. Two of the data packages published by EDI fellows in 2019 were highlighted in EDI’s “Featured Data Package” series: “Mohonk Preserve Amphibian and Water Quality Monitoring Dataset at 11 Vernal Pools from 1931-Present” and “Monthly litterfall, monthly tree band, and annual tree growth of a South Carolina coastal wetland forest”.

The fellowships have had several additional valuable outcomes. A number of our previous fellows have obtained data-management-related jobs, and some have presented their fellowship work at conferences. EDI was invited to hold a data publishing workshop at the Clemson Baruch Institute of Coastal Ecology and Forest Science so that station staff could be trained in data publishing best practices. Fellow Katherine Qi, of the Northeast US Shelf (NES) LTER, shared her work in teleconferences with NASA, Integrated Ocean Observing System, and Marine Biodiversity Observation Network scientists and data managers. She developed workflows to contribute datasets from plankton imaging systems to EDI. Kathy enjoyed being an EDI Fellow and said this about the experience:

“By going through the entire data pipeline from collection to analysis to publication, I was able to understand how scientists and information managers communicate to work together. This internship encapsulated both biological sciences and software development, allowing me to learn new, useful tools and practices that will aid me with future career advancement.”