News
Request for Comment: Expanding the List of Supported Date and Time Formats for Congruence Checks
December 5, 2024
Colin Smith
Introduction
The EML Congruence Checker (ECC) and the ezEML Check Data Tables provide critical validation services to verify the accuracy and consistency of data packages before publication in the EDI Data Repository. Currently, these services evaluate date and time data based on a predefined list of supported formats. If a data package format is unrecognized, it does not undergo validation but is flagged with a warning.
This proposal suggests expanding this list to include additional common, unambiguous date and time formats, as summarized below. We aim to ensure greater flexibility, accuracy, and compatibility for date and time data, enhancing overall data quality in the repository.
Issue Statement
At present, our quality-checking services limit validation to a subset of the internationally recognized ISO 8601 standard for date and time formats. This subset was initially curated in 2017 by EDI's ECC working group to balance precision with wide usability.
The ECC and ezEML evaluation services operate as follows:
- Data Packages with Recognized Format: If a data package’s metadata specifies a recognized date and time format, ECC checks values for consistency, flagging any discrepancies between declared and actual formats as errors.
- Data Packages with Unrecognized Format: If the format is unrecognized, the checker bypasses validation for that column and issues a warning only. This practice presents a few challenges:
- Inaccurate or Incomplete Data: Unvalidated packages may contain non-standard, imprecise date and time values that degrade data quality.
- Limited Support for Popular Formats: Communities that rely on commonly used but unsupported formats face obstacles in submitting their data.
- Unnecessary Reformatting: Data curators may spend additional time reformatting already unambiguous dates to ISO 8601.
Proposed Solution
To address these limitations, we propose the following enhancements to ECC and ezEML congruence checks:
- Expanding Supported Formats: We will extend the supported format list to include widely used, unambiguous date and time formats, broadening the range of submissions compatible with the ECC. This should significantly reduce reformatting efforts and improve data quality.
- Maintaining Warnings for Unsupported Formats: Data packages that utilize formats outside this expanded list will continue to trigger warnings, allowing users to be aware of potential issues without blocking submission.
- Encouraging ISO 8601 Adoption: To promote best practices, we will add educational materials to relevant EDI resources (e.g., Data Package Best Practices) that highlight ISO 8601’s benefits for data interoperability.
Community Feedback Requested
Your feedback is vital to refining this proposal and aligning our services with community practices. Please share your thoughts, suggestions, and any other input. Thank you for helping us improve the EDI Data Repository's services for everyone.