Featured Data Contributions

We regularly highlight data packages that stand out for their published data and associated publications, or that are used in unique and exemplary ways, such as teaching and data sharing.

Data packages published by EDI's 2022 Summer Fellows

September 1, 2022

We awarded 14 fellowships for our fifth ecological data management training program (20 June - 19 August 2022). The Fellows received training in ecological data management during a three-day online data publishing workshop and gained hands-on experience through participation in data preparation and publishing with scientists and information managers from specific host research projects, spanning diverse ecosystems across the country, from Maine’s Mount Desert Island to Puerto Rico’s Luquillo Mountains to the Palmyra Atoll in the central Pacific. Our Fellows published a number of data packages.

Chemistry of stream water from the Luquillo Mountains, Puerto Rico

July 12, 2022

This data package represents one of the long-term, continuous data sets in the EDI data repository and contains weekly water quality data from several streams in the Luquillo Experimental Forest (LEF). To our knowledge this is the longest such record of tropical stream chemistry on Earth (McDowell et al. 2021). LEF, a protected area of tropical rainforest in the Luquillo Mountains in Puerto Rico's El Yunque National Forest, was designated as a UNESCO Biosphere Reserve in 1976.

Effects of Copper Sulfate Toxicity on Multiple Phytoplankton and Animals in Standardized Aquatic Microcosms

March 1, 2022

The data published in this featured data package were obtained from a study funded by the Food and Drug Administration (1983-1986) to determine if ecosystems could be used to regulate chemicals. The work on this topic was published in several peer-reviewed articles that have been added as journal citations to the data package landing page in the EDI data repository. At the time of publishing this work, the practice of data publishing was not yet established. Publishing the data now ensures that they are preserved and available for re-use in continuing or future studies.

The Global Nitrous Oxide Database

January 1, 2022

This featured data package contains data on nitrous oxide (N2O) emissions (Fig. 1) as well as covariate data such as climate data, soil moisture and temperature, soil inorganic nitrogen, crop yields and other greenhouse gas emissions (methane, carbon dioxide) from experimental agricultural sites whose data are deposited in the Global Nitrous Oxide Database.

Journey North – Hummingbird and Monarch Butterfly Observations by Volunteer Community Scientists Across Central and North America (1996-2020)

November 1, 2021

During the 2021 EDI summer fellowship, Luis Weber-Grullón, EDI Summer Fellow with Journey North, and his mentor Nancy Sheehan, Journey North’s Program Coordinator at the University of Wisconsin-Madison Arboretum, successfully published two data packages.

Calling Activity of Birds in the White Mountain National Forest: Audio Recordings (2016 and 2018)

September 1, 2021

The audio recordings of migratory songbirds published in this data package were gathered in the USDA Forest Service’s Hubbard Brook Experimental Forest, a 7,800-acre northern hardwood forest in the White Mountain National Forest in New Hampshire. The experimental forest was established in 1955 and is renowned as the first site where water samples indicating acid rain were collected. This discovery was instrumental in the development of the Clean Air Act. Collaborative, multidisciplinary research efforts on air, water, soils, plants, and animals are carried out here by the Hubbard Brook Ecosystem Study (HBES). HBES is a public-private partnership of the USDA Forest Service, the National Science Foundation’s Long Term Ecological Research (LTER) program, the Hubbard Brook Research Foundation, and scientists around the world. HBES was founded in 1963 and is one of the longest-running and most comprehensive ecosystem studies in the world.

Effects of Nutrient Enrichment on Grassland Biomass and Plant Diversity Across the Globe

July 1, 2021

Global grassland biomass production accounts for about a third of terrestrial productivity (Chapin et al. 2002), thereby serving as a sink of atmospheric CO2 and an energy source for terrestrial food webs. It is also vital for human food production. Grasslands’ productivity and diversity across the globe are affected by fossil fuel combustion and agricultural fertilization, which have increased deposition of nitrogen and phosphorus relative to pre-industrial levels. Concurrently, habitat loss and degradation remove grasslands’ consumers from food webs.

Growth and Survival of Seedlings of 14 Species of Lowland Rainforest Trees Planted in the La Guaria Annex of La Selva Biological Station, Costa Rica, in 1986

May 1, 2021

The Organization for Tropical Studies (OTS), based in Costa Rica, is currently undertaking a concerted effort to identify, retrieve, manage, and share important legacy data sets from tropical research projects going back to the 1960s. The salient aspects of this project include: 1) locating the relevant data and metadata artifacts (photos, maps, field notes, technical reports), 2) securing the data in the EDI repository, 3) publishing and sharing the data within the scientific community, and 4) creatively repurposing the data and projects for future research efforts.

Freshwater Insect Occurrences and Traits for the Contiguous United States, 2001 – 2018

March 1, 2021

This featured data package relates to a recent data paper in Global Ecology and Biogeography, describing a database of freshwater insect occurrences and traits for the contiguous United States (Twardochleb et al. 2021). The lead author, Laura Twardochleb, is a senior environmental scientist with the California Department of Water Resources.

Combined Effects of Climate & Land Use Changes on Water Quality of Lake Sunapee (NH, USA) for the Last 31 Years: Integrating Data Packages from Different Sources

January 1, 2021

The EDI data packages featured here were applied to the General Lake Model (GLM) v.2.1.8, coupled with Aquatic EcoDynamics (AED, Hipsey et al., 2019) to simulate water quality of Lake Sunapee (NH, USA) over the last 31 years. The GLM is a one‐dimensional hydrodynamic model, while AED is a lake ecosystem model. This study was carried out by Nicole Ward and recently published in the journal Water Resources Research (Ward et al. 2020a). Nicole is a PhD student at Virginia Tech’s Department of Biological Sciences with the Carey lab that specializes in freshwater ecosystem sciences.

Annual Point Count Breeding Bird Survey at Pepperwood Preserve in the California Coast Ranges 2007-2019

September 1, 2020

The Dwight Center for Conservation Science at Pepperwood is an ecological institute dedicated to educating, engaging, and inspiring our community through habitat preservation, science-based conservation, leading-edge research, and interdisciplinary educational programs. Our mission is to steward the life and landscapes of the 3,200-acre Pepperwood Preserve and to advance science-based conservation of ecosystems throughout our region and beyond.

Global Lake Area, Climate, and Population Dataset

July 1, 2020

This featured data package is a great example of re-using and combining datasets to further new science. Existing global datasets were harmonized to create the Global Lake area, Climate, and Population (GLCP) dataset. The harmonized data are presented in a recent article in the journal Scientific Data (Meyer et al. 2020), along with a description of the kind of research questions to which the GLCP can be applied.

Abundance and Biovolume of Taxonomically-Resolved Phytoplankton and Microzooplankton Imaged Continuously Underway with an Imaging FlowCytobot Along the NES-LTER Transect in Winter 2018

May 1, 2020

The NES-LTER (Northeast Shelf Long Term Ecological Research) project integrates long-term observations, process and experimental studies, and models to understand and predict how planktonic food webs are changing on the Northeast U.S. Shelf under climate change, and how those changes impact ecosystem productivity, including higher trophic levels. Seasonal surveys along a high-gradient cross-shelf transect are among the long-term observations of the NES-LTER. The transect extends about 150 km southward from Martha’s Vineyard (Figure 1).

The Influence of Legacy Phosphorus on Lake Water Quality in the Yahara Watershed, WI

April 1, 2020

This data package represents an example of published model code, including model input, output, and processing scripts. Generally, model-based datasets archived in the EDI repository and other data repositories may include the model code itself, input data, model parameter settings, and output data. Based on our experience with this process, we are currently developing best practices for publishing model software code and data. Those guidelines will be available for our community on our “Data package best practices” GitHub page soon.

Teaching Modules “Macrosystems EDDIE 1-3”

February 1, 2020

These data packages were developed as part of the Macrosystems EDDIE program by Cayelan Carey and Kaitlin Farrell (Department of Biological Sciences, Virginia Tech). The data packages demonstrate an educational use case of data (Farrell and Carey 2018) and the role of data repositories in facilitating science education by having data and metadata readily available for use by teachers and students. Educational use of data packages also furthers students’ data and repository literacy, an important skill for research careers. Macrosystems EDDIE is supported by NSF’s Macrosystems Biology program (NSF EF-1702506 and DEB-1926050).

Sampling Sites Where Ecological Indices Were Used to Assess the Impact of Different Environmental Stressors in Aquatic Environments in Argentina

December 1, 2019

This data package is a great example of re-using environmental data that were collected and published previously. EDI actively promotes and enables the re-use of environmental data.

Savannah River Site Corridor Experiment (SC, USA): Annual Plant Occurrence Dataset, 2000 – current

October 1, 2019

Results derived from this data package were recently published in the journal Science (Damschen et al. 2019).

Monthly Litterfall, Monthly Tree Band, and Annual Tree Growth of a South Carolina Coastal Wetland Forest

September 1, 2019

This data package has been published with the support of Vanessa Bailey, EDI’s Summer 2019 Fellow at the Baruch Institute of Coastal Ecology and Forest Science.

Mohonk Preserve Amphibian and Water Quality Monitoring Dataset at 11 Vernal Pools from 1931-Present

July 1, 2019

This data package has been published with the support of EDI’s Summer 2019 Fellow Alexis Garretson. The Mohonk Preserve’s Daniel Smiley Research Center near New Paltz, NY has been monitoring species occupancy, reproductive success, and water quality of 11 vernal pools on the Preserve since April 1931. The vernal pools are Ski Loop, Bonticou, Terrace, Long Woodland Pool, Long Woodland Swamp, Oakwood, Sleepy Hollow, Hermits, North Mud Pond, Canaan, and Talus. This project aims to document changes in the reproductive behavior and phenology of amphibians and to allow research access to these historical, longitudinal records. The dataset is a paired record of amphibian occurrence with environmental indicators spanning nearly 90 years of data collection.

Plot-level Field Data and Model Simulation Results from Summer 2017 Sampling of 2016 Short-interval Fires in Greater Yellowstone

May 1, 2019

This data package is archived to accompany the paper by Turner et al. (2019).

Progress on Converting Community Survey Data Packages into the ecocomDP Data Model

April 1, 2019

EDI is supporting data synthesis and cross-site research projects by harmonizing data packages that contain the same attributes but use various raw data formats and vocabularies. The packages are reformatted to a common data model. For more information, see EDI’s website. Figure 1 shows the general workflow for harmonizing data packages. Archived raw data (level 0 – L0) are converted to a common harmonized data model (level 1 – L1). The L1 data allow for straightforward data discovery and conversion into derived data products (level 2 – L2).
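The L0 to L1 to L2 workflow described above can be illustrated with a minimal Python sketch. It assumes a hypothetical raw (L0) survey table in wide format, with one column per taxon, and reshapes it into a long-format (L1) observation table. The field names used here (observation_id, package_id, location_id, datetime, taxon_id, variable_name, value, unit) are illustrative assumptions loosely modeled on a long-format observation table; the function names and sample identifiers are hypothetical and are not part of any EDI software.

```python
from typing import Dict, List


def harmonize_l0_to_l1(
    rows: List[Dict[str, str]],
    package_id: str,
    taxon_columns: List[str],
    variable_name: str = "abundance",
    unit: str = "number",
) -> List[Dict[str, object]]:
    """Reshape wide L0 rows (one column per taxon) into long L1 records.

    Assumes each raw row has 'site' and 'date' keys (hypothetical names).
    """
    observations: List[Dict[str, object]] = []
    for row in rows:
        for taxon in taxon_columns:
            raw_value = row.get(taxon)
            # Skip missing measurements instead of inventing zeros.
            if raw_value is None or raw_value == "":
                continue
            observations.append(
                {
                    "observation_id": f"obs_{len(observations) + 1}",
                    "package_id": package_id,
                    "location_id": row["site"],
                    "datetime": row["date"],
                    "taxon_id": taxon,
                    "variable_name": variable_name,
                    "value": float(raw_value),
                    "unit": unit,
                }
            )
    return observations


def summarize_l1_to_l2(observations: List[Dict[str, object]]) -> Dict[str, float]:
    """Derive a simple L2 product: the summed value per location."""
    totals: Dict[str, float] = {}
    for obs in observations:
        location = str(obs["location_id"])
        totals[location] = totals.get(location, 0.0) + float(obs["value"])
    return totals
```

Because every harmonized package shares the same L1 layout, a derived-product function such as summarize_l1_to_l2 can be written once and reused across packages, which is the practical benefit of the common data model.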

Virgin Islands National Park: Coral Reef: Population Dynamics: Scleractinian Corals

March 1, 2019

These data are evidence of the long-term dynamics of shallow coral reefs along the south coast of St. John from as early as 1987. The data describe coral reef community structure as percent cover based on the analysis of color photographs. All of these data originate from color images of photoquadrats recorded annually (usually in the summer). The data fall into three groups. The two groups that are contained in this data package are (1) Tektite & Yawzi and (2) Random sites. The juvenile coral density data are packaged separately.

Interagency Ecological Program: Fish Catch and Water Quality Data from the Sacramento River Floodplain and Tidal Slough, Collected by the Yolo Bypass Fish Monitoring Program, 1998-2018.

October 1, 2018

Largely supported by the Interagency Ecological Program (IEP), the California Department of Water Resources (DWR) has operated a fisheries monitoring program in the Yolo Bypass, a seasonal floodplain and tidal slough, since 1998. The objectives of the Yolo Bypass Fish Monitoring Program (YBFMP) are to: (1) collect baseline data on lower trophic levels (phytoplankton, zooplankton, and aquatic insects), juvenile fish and adult fish, hydrology, and water quality parameters; and (2) investigate the temporal and seasonal patterns in chlorophyll-a concentrations, including whether high concentrations are exported from the Bypass during agricultural and natural flow events and the possibility of manipulating bypass flows to benefit listed species like Delta Smelt (Hypomesus transpacificus) and Chinook salmon (Oncorhynchus tshawytscha). The YBFMP operates a rotary screw trap and a fyke trap, and conducts biweekly beach seine and lower trophic surveys in addition to maintaining water quality instrumentation in the bypass. The YBFMP serves to fill information gaps regarding environmental conditions in the bypass that trigger migrations and enhanced survival and growth of native fishes, as well as to provide data for IEP synthesis efforts. YBFMP staff also conduct analyses of our monitoring data to address pertinent management-related questions as identified by IEP.

Adirondack Long-Term Ecological Monitoring Program: Songbird surveys (1952 – 1964 and 1983 – 2008)

September 1, 2018

Songbirds are diverse, abundant, and relatively easy to detect, making them a useful taxonomic group for understanding the effects of forest change. Declines in neotropical migrants have been linked to changes in habitat quantity and quality across species’ ranges. Yet songbirds that nest and forage in different habitat types or at different heights in the forest canopy may not be affected equally by forest change or management. The study objectives were to (1) document long-term trends in relative abundance and diversity of breeding forest birds (songbirds) in forest stands with different harvest histories and management regimes and (2) identify bird species that can be used as indicators of habitat change or degradation. We detected breeding songbirds using point counts at Huntington Wildlife Forest (HWF) in the central Adirondack Mountains of New York during 1952-64 (Webb et al. 1977) and 1983-present. Relative abundance (RA, the number of individual birds/count) was measured in sites with differing management histories.

Effects of wetland restoration on hydrology and species diversity of restored wetlands within a Central Florida ranchland, 2003 – ongoing

August 1, 2018

This data package has been published with the support of Gabriel Kamener, EDI's 2018 Fellow with the “Hydrology Monitoring Program” at the Archbold Biological Station (FL).