Updating a Data Package

Perform a data package update whenever data or metadata in a published data package need to be changed or added. Updates may be performed routinely or sporadically, and each one results in a new "revision". A revision keeps the data package's identifier but receives a new version number and a new DOI. All revisions of a data package are linked in the EDI Data Repository. Users who end up on the landing page of an older revision are notified that a newer version is available.

Note to Information Managers: Be aware of the EML <access> element. Only credentials already specified in an existing version of a data package can be used to publish an update (e.g. to publish edi.10.2, the credentials must be specified in edi.10.1). If you are unable to publish a revision for this reason, contact the EDI Curation Team.
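Before publishing, it can help to see which principals the previous revision's <access> element lists. The following is a minimal Python sketch, assuming the access rules sit in a top-level <access> element made of <allow> blocks with <principal> and <permission> children. The sample document, its authSystem value, and the user ID are hypothetical placeholders, and the function name is ours rather than part of any EDI tool.

```python
import xml.etree.ElementTree as ET


def list_access_rules(eml_text):
    """Return (principal, permission) pairs from the top-level <access> element."""
    root = ET.fromstring(eml_text)
    rules = []
    # Assumption: children of the namespaced EML root are unqualified,
    # so plain tag names are used in the search paths.
    for allow in root.findall("./access/allow"):
        principals = [p.text.strip() for p in allow.findall("principal") if p.text]
        permissions = [p.text.strip() for p in allow.findall("permission") if p.text]
        for principal in principals:
            for permission in permissions:
                rules.append((principal, permission))
    return rules


# Hypothetical EML fragment for illustration only.
SAMPLE_EML = """<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0"
    packageId="edi.10.1" system="edi">
  <access authSystem="example-auth-system" order="allowFirst" scope="document">
    <allow>
      <principal>uid=exampleuser,o=EDI,dc=edirepository,dc=org</principal>
      <permission>all</permission>
    </allow>
    <allow>
      <principal>public</principal>
      <permission>read</permission>
    </allow>
  </access>
  <dataset/>
</eml:eml>"""

for principal, permission in list_access_rules(SAMPLE_EML):
    print(f"{principal}: {permission}")
```

The sketch only lists what the metadata declares. It does not decide which permission level is sufficient to publish an update.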

Metadata to include

It is important to communicate changes and their significance in the metadata of an updated data package so that users can understand what has changed and why. This information belongs in the maintenance section of the EML metadata. Guidance on adding this information is provided below.
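As a rough illustration of what such a maintenance section can look like, the Python sketch below builds a <maintenance> element whose <description> holds one paragraph per revision. The helper name and the example notes are hypothetical, and the sketch assumes <description> accepts <para> children. It produces a standalone fragment only; where the fragment belongs inside <dataset> is governed by the EML schema and is not handled here.

```python
import xml.etree.ElementTree as ET


def build_maintenance(notes):
    """Build a <maintenance> fragment with one <para> per change note."""
    maintenance = ET.Element("maintenance")
    description = ET.SubElement(maintenance, "description")
    for note in notes:
        # One paragraph per revision keeps the change history readable.
        ET.SubElement(description, "para").text = note
    return maintenance


# Hypothetical change notes for illustration only.
fragment = build_maintenance([
    "Version 2: added the most recent year of observations.",
    "Version 3: corrected units in one data table.",
])
print(ET.tostring(fragment, encoding="unicode"))
```

ezEML and EMLassemblyline generate this section for you, so a hand-built fragment like this is mainly useful for checking what the finished EML should contain.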

Editing data and metadata

ezEML

Edit data and metadata using ezEML:

  1. Open the EML document for the original data package. If the package was created outside of ezEML, or you no longer have access to the original ezEML data package, select Fetch a Package from EDI from the Import/Export menu to retrieve and import an existing data package:

    • Select the scope of your data package (e.g. edi, knb-lter-ntl, etc.).
    • Select the package scope.identifier to start the import.
    • Note any errors that occurred during import (these are more likely if the package was originally made outside of ezEML).
    • Select the option to Get Associated Data Files if you plan to edit/reupload one or more tables.
  2. Describe the changes in the new revision. From the Maintenance tab, add a new paragraph to the Description text.

  3. Submit to EDI. Click Send to EDI and add a note stating that this is an update to an existing data package (e.g. "This submission is a revision to package edi.101.1").

  4. The EDI curation team will receive the submission and iterate through the review process before the update is published.

EML created with ezEML can also be downloaded directly and published to the EDI Repository. If you publish your own updates, remember to enter an incremented version number in the Data Package ID tab of ezEML.

EMLassemblyline

Edit data and metadata using EMLassemblyline:

  1. Get the metadata templates and make_eml() function call for the original data package. If these don't exist, use the eml2eal() and eml2eal_losses() functions to create them.
  2. Update the metadata templates and make_eml() function arguments to reflect the changes. Describe the changes made between revisions using the maintenance.description parameter of make_eml().
  3. Increment the data package version number in the package.id parameter of make_eml().
  4. Run make_eml().
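The version increment in step 3 follows the scope.identifier.revision pattern used in this guide (e.g. edi.10.1 becomes edi.10.2). Below is a minimal Python sketch of that arithmetic. The function name is hypothetical and is not part of EMLassemblyline; the resulting string is what you would pass as package.id.

```python
def next_package_id(package_id):
    """Return the package ID with its revision number incremented by one."""
    # Split from the right so that hyphenated scopes such as
    # "knb-lter-ntl" stay intact; raises ValueError on malformed IDs.
    scope, identifier, revision = package_id.rsplit(".", 2)
    return f"{scope}.{int(identifier)}.{int(revision) + 1}"


print(next_package_id("edi.10.1"))
print(next_package_id("knb-lter-ntl.1.9"))
```

Parsing the revision as an integer matters: treating it as text would turn revision 9 into something other than 10.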

Publishing edited data and metadata

EDI Data Portal

An updated data package can be uploaded via the EDI Data Portal similarly to a new data package, but with one key difference:

Use the Allow PASTA+ to skip… option if any of the data files are unchanged between versions. This allows the EDI Data Repository (a.k.a. PASTA) to skip re-uploading data it already holds, which can save time and repository space. Caution: make sure the checksum value documented in the metadata for each data file is accurate and up to date.
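One way to act on this caution is to recompute each file's checksum locally and compare it with the value recorded in the metadata. The Python sketch below assumes the documented value is a hexadecimal MD5 or SHA-1 digest; confirm which method your EML actually declares. All function names are hypothetical helpers, not part of any EDI tool.

```python
import hashlib


def checksum_of_chunks(chunks, algorithm="md5"):
    """Hash an iterable of byte chunks and return the hex digest."""
    digest = hashlib.new(algorithm)
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_in_chunks(path, chunk_size=1 << 20):
    """Yield a file's contents in fixed-size chunks to limit memory use."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def file_checksum(path, algorithm="md5"):
    """Return the hex digest of the file at `path`."""
    return checksum_of_chunks(read_in_chunks(path), algorithm)


def checksum_matches(path, documented, algorithm="md5"):
    """Compare a file's digest with the value documented in the metadata."""
    return file_checksum(path, algorithm) == documented.strip().lower()
```

For example, checksum_matches("my_table.csv", documented_value) returns False when the file has changed since the metadata was written, which signals that the file should be uploaded rather than skipped. The file name and variable here are placeholders.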

For more information on this option, watch this video.

EDIutils

An updated data package can be uploaded via the EDIutils R package using the update_data_package() function. To update with this function, all data files must be web-hosted and associated with static data links. The function also offers a useChecksum option.

Set the useChecksum argument to TRUE if any of the data objects are unchanged between versions. This allows the EDI Data Repository to skip re-uploading data it already holds, which can save time and repository space. Caution: make sure the checksum values documented in the metadata are accurate and up to date.

For a language-agnostic solution, see the REST API documentation for Updating a Data Package.

Submit via email

Email the EDI Curation Team either a description of the desired changes or the new or updated data together with the updated EML file. Be sure to mention the identifier of the data package being updated. The EDI Curation Team will create a proof for review before the update is published.