Featured Data

Progress on Converting Community Survey Data Packages into the ecocomDP Data Model

April 1, 2019

Susanne Grossman-Clarke

Description

EDI supports data synthesis and cross-site research projects by harmonizing data packages that contain the same attributes but use different raw data formats and vocabularies. The packages are reformatted to a common data model. For more information, see EDI’s website. Figure 1 shows the general workflow for harmonizing data packages. Archived raw data (level 0 – L0) are converted to a common harmonized data model (level 1 – L1). The L1 data allow for straightforward data discovery and conversion into derived data products (level 2 – L2).
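
The L0 to L1 to L2 workflow can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two raw tables, their column names, the package identifiers, and the simplified long-format fields (`package_id`, `observation_id`, `datetime`, `location_id`, `taxon_id`, `variable_name`, `value`) are hypothetical and are not the actual ecocomDP schema or EDI's conversion code.

```python
# Hypothetical L0 records from two packages that record the same attributes
# under different column names (all names and values are illustrative).
raw_site_a = [
    {"Date": "2015-06-01", "Plot": "A1", "Species": "Poa pratensis", "Count": 12},
    {"Date": "2016-06-03", "Plot": "A1", "Species": "Poa pratensis", "Count": 9},
]
raw_site_b = [
    {"sample_date": "2015-07-10", "station": "B7", "taxon": "Daphnia pulex", "abundance": 40},
]

# Per-package mapping from raw column names to one common vocabulary.
COLUMN_MAPS = {
    "site_a": {"Date": "datetime", "Plot": "location_id",
               "Species": "taxon_id", "Count": "value"},
    "site_b": {"sample_date": "datetime", "station": "location_id",
               "taxon": "taxon_id", "abundance": "value"},
}


def to_l1(package_id, rows, column_map, variable_name):
    """Convert raw (L0) rows into harmonized long-format (L1) records."""
    harmonized = []
    for i, row in enumerate(rows, start=1):
        # Rename each raw column to its common-vocabulary equivalent.
        record = {common: row[raw] for raw, common in column_map.items()}
        record["package_id"] = package_id
        record["observation_id"] = f"{package_id}_{i}"
        record["variable_name"] = variable_name
        harmonized.append(record)
    return harmonized


def taxa_per_package(l1_rows):
    """Example L2 product: number of distinct taxa observed per package."""
    taxa = {}
    for record in l1_rows:
        taxa.setdefault(record["package_id"], set()).add(record["taxon_id"])
    return {package: len(names) for package, names in taxa.items()}


l1 = (to_l1("site_a", raw_site_a, COLUMN_MAPS["site_a"], "count")
      + to_l1("site_b", raw_site_b, COLUMN_MAPS["site_b"], "abundance"))
print(taxa_per_package(l1))
```

Once every package shares the same fields, a derived product such as `taxa_per_package` can be written once and applied to all packages, which is the practical benefit of the L1 step.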

This data harmonization framework, developed by EDI, is currently being applied to convert community survey data packages into the ecocomDP data model. To date, EDI has harmonized approximately 70 data packages. Summary metrics for those packages are given in the following table:

Packages’ Summary Characteristics       Mean         Min    Max          Median
Temporal coverage                       20           3      17           16
Temporal evenness (interval SD)         1.3          0      10.8
Geographic coverage (km², > 0)          1.9 x 10^6   1.4    1.3 x 10^8   158.6
Taxonomic coverage (without OTUs*)      142          1      1752         48
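
Summary statistics of this kind can be computed per package and then aggregated across packages. The sketch below is hedged: the source does not define the metrics, so it assumes that temporal coverage is the number of distinct sampling years and that temporal evenness is the population standard deviation of the intervals between consecutive sampling years. The package names and years are invented for illustration.

```python
from statistics import mean, median, pstdev

# Hypothetical sampling years for three harmonized packages.
sampling_years = {
    "pkg_1": [2001, 2002, 2003, 2004],
    "pkg_2": [1995, 1997, 2002, 2003],
    "pkg_3": [2010, 2012, 2014],
}


def interval_sd(years):
    """Standard deviation of the gaps between consecutive sampling years.

    Assumed definition of "temporal evenness": 0 means perfectly regular
    sampling, and larger values mean more irregular sampling.
    """
    ordered = sorted(set(years))
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # With fewer than two sampling years there are no intervals to compare.
    return pstdev(intervals) if intervals else 0.0


def summarize(values):
    """Mean, min, max, and median of one metric across all packages."""
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "median": median(values),
    }


coverage = [len(set(years)) for years in sampling_years.values()]
evenness = [interval_sd(years) for years in sampling_years.values()]

print("Temporal coverage:", summarize(coverage))
print("Temporal evenness:", summarize(evenness))
```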

In addition to allowing easier analysis of data packages in a common design pattern (or data model), harmonization makes the data packages easy to discover in the EDI repository as well as through Google’s data search. The raw data packages would not have been easy to query because they use different vocabularies and keywords. Figure 2 shows the search results returned by the EDI repository software for the keyword “ecocomDP”.
