Data Revisions

Sometimes data files need to be updated or replaced to address problems discovered after they were initially published. As of April 2019, all files in the NA-CORDEX archive should have two global attributes, "version" and "tracking_id", that can be used to determine whether two files with the same name have different contents. Major version numbers track changes in the primary data in the file. Minor version numbers track changes in metadata or ancillary data. The tracking_id is a unique identifier (UUID) assigned to each file before publication.
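As an illustration of how the two attributes can be used together, the following is a minimal sketch, not an official NA-CORDEX tool. It assumes the global attributes of each file have already been read into a dictionary and that "version" is stored as a string such as "1.1" or "2" (the exact format in the files is an assumption). The function names parse_version and compare_files are hypothetical.

```python
def parse_version(version):
    """Split a version such as "1.1" or "2" into (major, minor) integers.

    Assumption: the attribute looks like "major.minor" or just "major",
    optionally prefixed with "v". Other formats will raise ValueError.
    """
    text = str(version).strip().lstrip("vV")
    major, _, minor = text.partition(".")
    return int(major), int(minor or 0)


def compare_files(attrs_a, attrs_b):
    """Classify how two same-named files differ, given their global attributes."""
    # Identical tracking_ids mean both copies are the same published file.
    if attrs_a["tracking_id"] == attrs_b["tracking_id"]:
        return "same file"
    major_a, minor_a = parse_version(attrs_a["version"])
    major_b, minor_b = parse_version(attrs_b["version"])
    # A major version change signals a change in the primary data.
    if major_a != major_b:
        return "primary data differs"
    # A minor version change signals a metadata or ancillary data change.
    if minor_a != minor_b:
        return "metadata or ancillary data differs"
    return "different tracking_id, same version"


# Placeholder attribute values, for illustration only.
old_copy = {"version": "1.1", "tracking_id": "placeholder-id-a"}
new_copy = {"version": "2.0", "tracking_id": "placeholder-id-b"}
print(compare_files(old_copy, new_copy))
```

Comparing tracking_id first avoids parsing the version at all when the two copies are the same published file.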

CRCM5 Split

When the CRCM5 dataset was first published, there was only one set of simulations, from Katja Winger at UQAM. The files were published with the name of the RCM given as "CRCM5". Subsequently, Sébastien Biner at OURANOS ran a second and complementary set of simulations using CRCM5. However, the configuration of CRCM5 is not the same between the two sets of simulations: the lake fractions differ and one (CRCM5-OUR) is nudged. (See RCM Characteristics for details.) Therefore, we refer to the two models as CRCM5-UQAM and CRCM5-OUR, distinguishing between them by the modeling center. We have republished the files originally named CRCM5 as CRCM5-UQAM.

Reruns

Some simulations had to be re-run because of problems that were detected only after the data had been post-processed and published.

  • The CRCM5-UQAM simulation driven by the MPI-ESM-MR future run was originally continued from the MPI-ESM-LR historical run. It was re-run for both the historical and future periods.
  • Both the WRF and RegCM4 simulations driven by MPI-ESM-LR originally had mis-specified surface temperatures over the oceans.
  • The original WRF runs driven by ERA-Interim had no sea ice due to configuration issues.

In all cases, the original simulations were retracted and replaced with output from the re-runs.

If there is doubt regarding whether a file is from the original run or the re-run, check the global attribute "version" in the netCDF headers; re-runs are version 2 or higher.
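The check described above could be scripted along the following lines. This is a sketch under stated assumptions: it uses the third-party netCDF4 Python package to read the header, and it assumes the "version" attribute is a string or number such as "1.1" or "2" (the exact format is an assumption). The function names are hypothetical.

```python
def major_version(version):
    """Return the major part of a version such as "1.1", "2", or 2.0.

    Assumption: the major number is everything before the first ".".
    """
    text = str(version).strip().lstrip("vV")
    return int(text.split(".")[0])


def has_revised_data(version):
    """True if the version is 2 or higher, i.e. the primary data were replaced."""
    return major_version(version) >= 2


def read_version(path):
    """Read the global "version" attribute from a netCDF file header."""
    # Imported here so the helpers above work without netCDF4 installed.
    from netCDF4 import Dataset

    with Dataset(path) as ds:
        return ds.getncattr("version")


print(has_revised_data("1.1"))
print(has_revised_data("2.0"))
```

For a file on disk, has_revised_data(read_version(path)) combines the two steps. The same attribute can also be inspected by hand in the output of ncdump -h.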

Reprocessed Outputs

Some errors in the post-processing workflow were discovered only after the data had been published. In these cases, we corrected the errors and republished the data. The following problems affected the actual data in the files (not just metadata or ancillary data) and resulted in a version number of 2 or higher.

  • Problems with time coordinates affecting aggregation to daily and longer frequencies in the RegCM4/ERA-Int and CRCM5-UQAM/CanESM/RCP45 runs resulted in these files being reprocessed completely.
  • An apparent typo in the configuration files for the 25-km RegCM4 runs driven by GFDL and HadGEM resulted in their grid projections being shifted 0.5 degrees south relative to the MPI and ERA-Int runs, which caused errors downstream when they were regridded to the NAM-22i common grid. The fix to the latitude and longitude arrays was minor, but the common grid data had to be reprocessed.

Other Revisions

  • All priority 0 (temp and precip) files start out at version 1.1 because we added tracking_ids, version numbers, and DOIs to the files, which they did not have when originally published.
  • Latitude and longitude arrays in WRF and CanRCM4 were updated to run -180:180 instead of 0:360.
  • Corrections to latitude and longitude arrays (separate from the grid-shift problem described above) were applied in all RegCM4 runs.
  • Some metadata and ancillary data errors on assorted NAM-22i CRCM5-OUR runs were corrected.
  • The _FillValue and missing_value attributes in RegCM4 outputs were updated to set them consistently to 1e20f, rather than the differing default values generated by various processing scripts.
  • Miscellaneous inconsistencies in long_names, standard_names, and cell_methods attributes in WRF and CRCM5-OUR outputs, as well as in the variables prhmax, hurs, and sfcWind, were fixed.
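
For users who still hold older copies of WRF or CanRCM4 files, the change of longitude range from 0:360 to -180:180 mentioned in the list above can be reproduced with a simple wrap. The following is a minimal sketch, not the script used to produce the archive. The function name wrap_longitude is hypothetical, and mapping exactly 180 to -180 is a convention chosen here, not something stated for the archive.

```python
def wrap_longitude(lon_deg):
    """Map a longitude in degrees east onto the interval [-180, 180)."""
    # Shift by 180, wrap into [0, 360), then shift back.
    return (lon_deg + 180.0) % 360.0 - 180.0


# Example: 270 degrees east is the same meridian as 90 degrees west.
print(wrap_longitude(270.0))
```

The same expression works elementwise on NumPy arrays, so it can be applied to a whole longitude array at once.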