Resources for Data Users

Data users search, access, integrate, and cite data from the EDI Repository to contextualize new research, synthesize data, and ask new questions of old data. Typically, the use of data from the EDI Repository results in either a derived data package that references its data sources in provenance metadata or a journal publication that cites them.

The EDI Data Repository holds a thematically diverse collection of data with over 7000 unique data packages ranging in temporal extent from days to decades and with a global spatial extent, though mostly concentrated in the United States. More than 99 percent of data packages are openly accessible and downloadable for reuse.

Finding data

Data packages in the EDI Data Repository are findable by the EDI, DataONE, and Google Dataset Search Engines. The EDI Search Engine supports the most detailed queries, DataONE offers search across several Earth and environmental data repositories, and Google provides search across all disciplinary boundaries. For more information see Finding Data.
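
Searches can also be scripted. The following is a minimal sketch, assuming the repository exposes a Solr-style search endpoint at the URL shown and accepts the query parameters `defType`, `q`, `fl`, and `rows`; the endpoint, parameter names, and search terms are assumptions for illustration and should be checked against the Finding Data page before use. The sketch only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode

# Assumed search endpoint; confirm the real URL in the repository's API documentation.
SEARCH_ENDPOINT = "https://pasta.lternet.edu/package/search/eml"


def build_search_url(terms, fields=("packageid", "title"), rows=10):
    """Return a search URL for the given free-text terms.

    The parameter names follow common Solr conventions and are assumptions here.
    """
    params = {
        "defType": "edismax",     # assumed query parser name
        "q": terms,               # free-text search terms
        "fl": ",".join(fields),   # fields to return for each match
        "rows": rows,             # maximum number of matches
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_search_url("lake chlorophyll")
print(url)
```

The resulting URL can be pasted into a browser or passed to any HTTP client, which keeps the query itself easy to inspect and record alongside an analysis.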

Analysis-ready data from EDI Thematic Standardization projects can be discovered through the EDI Search Engine and the supporting R packages.

Another way to find data of interest is through browsing articles that cite EDI data packages. For a list of articles, see EDI Data Package Citations.

Understanding data

Data package pages provide information-rich metadata on the original purpose, collection methods, measurement variables, and other important details. Data package pages also provide access to the EDI Data Explorer (DeX) and data download scripts, which offer a quick interactive view into a dataset and deepen understanding beyond the metadata alone. For more information see Data Exploration.

Together the data package metadata and exploration tools enable assessment of fitness for use. If any questions arise, reach out to the data package contact for clarification. Resolving uncertainties leads to improved metadata and quality of data published in the EDI Repository as a whole.

Although more than 99 percent of data packages are openly accessible and downloadable for reuse, contacting the original data author is still recommended.

Accessing data

Data packages and their contents can be accessed manually through the EDI Data Portal and programmatically via the REST API, thereby supporting both manual and automated scientific workflows. For more information see Accessing Data.
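
Programmatic access can be sketched as follows. This is a minimal illustration, assuming data packages are identified as scope.identifier.revision and that the REST API serves a resource listing and the metadata at the URL patterns shown; the base URL, the path layout, and the example identifier `edi.1.1` are assumptions and should be confirmed on the Accessing Data page. The helper names are hypothetical.

```python
from typing import NamedTuple
from urllib.request import urlopen

# Assumed base URL of the REST API; confirm in the API documentation.
BASE_URL = "https://pasta.lternet.edu/package"


class PackageId(NamedTuple):
    scope: str
    identifier: int
    revision: int


def parse_package_id(package_id: str) -> PackageId:
    """Split 'scope.identifier.revision' into its parts.

    Splitting from the right keeps scopes that contain hyphens intact.
    Raises ValueError if the string does not have three parts.
    """
    scope, identifier, revision = package_id.rsplit(".", 2)
    return PackageId(scope, int(identifier), int(revision))


def resource_urls(package_id: str) -> dict:
    """Build the (assumed) URLs for a package's resource listing and metadata."""
    p = parse_package_id(package_id)
    path = f"eml/{p.scope}/{p.identifier}/{p.revision}"
    return {
        "resource_map": f"{BASE_URL}/{path}",
        "metadata": f"{BASE_URL}/metadata/{path}",
    }


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Download a resource as text. Not called here, to avoid network access."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


urls = resource_urls("edi.1.1")  # illustrative identifier
print(urls["metadata"])
```

Recording the full identifier, including the revision, in a workflow pins the analysis to one immutable version of the data package.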

Citing a data package

Along with ensuring that proper attribution is given to the original data author, properly citing data in publications and other datasets increases findability, encourages data reuse, and enables reliable tracking of data citation metrics. For more information see Citing a Data Package.

Publishing a derived data package

Derived data incorporate one or more published data sources, from EDI or elsewhere, and include the original values or a derived form of them. Creating and publishing a derived data package to the EDI Data Repository follows the same steps as publishing a new data package, but also includes provenance metadata describing each data source. This metadata informs future users of how the derived data package was created and provides a basis for attribution back to the source data author. For more information see the Resources for Data Authors and Provenance Metadata pages.
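
To make the idea of provenance metadata concrete, the sketch below builds a simplified fragment in the style of an EML method step that points at one data source. It is an illustration under stated assumptions, not a schema-valid record: a real entry requires additional elements (for example, a contact), and the function name and all example values are hypothetical. See the Provenance Metadata page for the supported way to generate this metadata.

```python
import xml.etree.ElementTree as ET


def provenance_method_step(title, creator_surname, url, description):
    """Build a simplified methodStep element describing one data source.

    Element names follow EML conventions; the fragment is intentionally
    incomplete and is not validated against the EML schema.
    """
    step = ET.Element("methodStep")

    # Free-text explanation of how the source was used.
    desc = ET.SubElement(step, "description")
    ET.SubElement(desc, "para").text = description

    # Minimal description of the source data package.
    source = ET.SubElement(step, "dataSource")
    ET.SubElement(source, "title").text = title
    creator = ET.SubElement(source, "creator")
    name = ET.SubElement(creator, "individualName")
    ET.SubElement(name, "surName").text = creator_surname
    online = ET.SubElement(ET.SubElement(source, "distribution"), "online")
    ET.SubElement(online, "url").text = url
    return step


step = provenance_method_step(
    title="Example source data package",          # hypothetical values
    creator_surname="Doe",
    url="https://example.org/source-data-package",
    description="Source data aggregated to annual means.",
)
print(ET.tostring(step, encoding="unicode"))
```

One such method step per data source gives future users a machine-readable trail from the derived data package back to each original.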

Reporting issues

Reporting issues improves data quality for everyone. Report issues to the points of contact listed on the data package pages. If you cannot reach the listed contact, please contact EDI to report the issue.

Event notifications

Event notifications allow users to subscribe to events in the EDI Repository and receive a notification when an event occurs (e.g., the creation of a new data package). For more information see Event Notifications.
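
As a rough illustration of what a subscription might look like, the sketch below builds a small XML document naming a data package and a callback URL to be notified. The element names, the `type` attribute, and the idea of submitting this document to the REST API are assumptions for illustration; the function name and example values are hypothetical. Consult the Event Notifications page for the actual request format before relying on it.

```python
import xml.etree.ElementTree as ET


def subscription_document(package_id, callback_url):
    """Return an (assumed) subscription request body as an XML string.

    package_id   -- the data package to watch, e.g. 'edi.1.1' (illustrative)
    callback_url -- the address that should receive the notification
    """
    root = ET.Element("subscription", {"type": "eml"})  # assumed attribute
    ET.SubElement(root, "packageId").text = package_id
    ET.SubElement(root, "url").text = callback_url
    return ET.tostring(root, encoding="unicode")


body = subscription_document("edi.1.1", "https://example.org/hook")
print(body)
```

The callback URL would typically point at a small service that triggers a downstream step, such as re-running a workflow when a new revision appears.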