From 2cd17650c94bba2595aeb806a915e7ac8d613254 Mon Sep 17 00:00:00 2001 From: "ceggers@rsna.org" Date: Fri, 2 Aug 2024 15:19:46 -0500 Subject: [PATCH 001/751] accidentally transposed url's for controlledaccess between 2 sets --- datasets/rsna-intracranial-hemorrhage-detection.yaml | 2 +- datasets/rsna-pulmonary-embolism-detection.yaml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/rsna-intracranial-hemorrhage-detection.yaml b/datasets/rsna-intracranial-hemorrhage-detection.yaml index f337ada06..e2c382533 100644 --- a/datasets/rsna-intracranial-hemorrhage-detection.yaml +++ b/datasets/rsna-intracranial-hemorrhage-detection.yaml @@ -21,7 +21,7 @@ Resources: ARN: arn:aws:s3:::intracranial-hemorrhage Region: us-west-2 Type: S3 Bucket - ControlledAccess: https://mira.rsna.org/dataset/2 + ControlledAccess: https://mira.rsna.org/dataset/1 DataAtWork: Publications: - Title: "Construction of a Machine Learning Dataset through Collaboration: The RSNA 2019 Brain CT Hemorrhage Challenge" diff --git a/datasets/rsna-pulmonary-embolism-detection.yaml b/datasets/rsna-pulmonary-embolism-detection.yaml index 30d5ad9e5..b94749d7e 100644 --- a/datasets/rsna-pulmonary-embolism-detection.yaml +++ b/datasets/rsna-pulmonary-embolism-detection.yaml @@ -21,7 +21,7 @@ Resources: ARN: arn:aws:s3:::pulmonary-embolism-detection Region: us-west-2 Type: S3 Bucket - ControlledAccess: https://mira.rsna.org/dataset/1 + ControlledAccess: https://mira.rsna.org/dataset/2 DataAtWork: Publications: - Title: The RSNA Pulmonary Embolism CT Dataset From ecadaf51efac2594a284464a1bb76e7fd1524f34 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 30 Sep 2024 15:24:54 -0600 Subject: [PATCH 002/751] Update HSDS example link --- datasets/nrel-pds-ncdb.yaml | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/datasets/nrel-pds-ncdb.yaml b/datasets/nrel-pds-ncdb.yaml index 7f75c64d3..22a4b8ec1 100644 --- 
a/datasets/nrel-pds-ncdb.yaml +++ b/datasets/nrel-pds-ncdb.yaml @@ -6,7 +6,6 @@ Description: | The NCDB seeks to maintain the inherent relationship between the various parameters that are needed to model solar, wind, hydrology and load and provide data for multiple important climate scenarios. - Documentation: https://nsrdb.nrel.gov/ Contact: Manajit.Sengupta@nrel.gov ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' @@ -51,9 +50,9 @@ DataAtWork: - Title: NCDB Website URL: https://climate.nrel.gov AuthorName: NREL NCDB Team - - Title: HSDS Examples - URL: https://github.com/NREL/hsds-examples - AuthorName: Caleb Phillips, Caroline Draxl, John Readey, Jordan Perr-Sauer, Michael Rossol + - Title: NCDB HSDS Examples + URL: https://github.com/NREL/hsds-examples/blob/master/notebooks/10_NCDB_introduction.ipynb + AuthorName: Reid Olson Publications: - Title: Regridding uncertainty for statistical downscaling of solar radiation URL: https://ascmo.copernicus.org/articles/9/103/2023/ From dbafc1495506471ba49d75f14b0c6856284d72f2 Mon Sep 17 00:00:00 2001 From: "ceggers@rsna.org" Date: Tue, 29 Oct 2024 09:03:39 -0500 Subject: [PATCH 003/751] submitting RATIC dataset RSNA Abdominal Traumatic Injury CT imaging dataset for mira.rsna.org --- datasets/rsna-ratic.yaml | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 datasets/rsna-ratic.yaml diff --git a/datasets/rsna-ratic.yaml b/datasets/rsna-ratic.yaml new file mode 100644 index 000000000..2a78e5eb2 --- /dev/null +++ b/datasets/rsna-ratic.yaml @@ -0,0 +1,29 @@ +Name: RSNA Abdominal Traumatic Injury CT (RATIC) +Description: "Blunt force abdominal trauma is among the most common types of traumatic injury, with the most frequent cause being motor vehicle accidents. Abdominal trauma may result in damage and internal bleeding of the internal organs, including the liver, spleen, kidneys, and bowel. 
Detection and classification of injuries are key to effective treatment and favorable outcomes. A large proportion of patients with abdominal trauma require urgent surgery. Abdominal trauma often cannot be diagnosed clinically by physical exam, patient symptoms, or laboratory tests. Prompt diagnosis of abdominal trauma using medical imaging is thus critical to patient care. AI tools that assist and expedite diagnosis of abdominal trauma have the potential to substantially improve patient care and health outcomes in the emergency setting. To create the ground truth dataset, RSNA collected imaging data sourced from 23 sites in 14 countries on six continents, including more than 4,000 CT exams with various abdominal injuries and a roughly equal number of cases without injury." +Documentation: https://github.com/RSNA/AI-Challenge-Data/wiki/RSNA-Abdominal-Traumatic-Injury-CT +Contact: informatics@rsna.org +ManagedBy: 'Radiological Society of North America (https://www.rsna.org/)' +UpdateFrequency: The dataset may be updated with additional or corrected data on a need-to-update basis. +Tags: + - aws-pds + - radiology + - medical imaging + - medical image computing + - machine learning + - computer vision + - csv + - labeled + - computed tomography + - x-ray tomography +License: "You may access and use these de-identified imaging datasets and annotations (“the data”) for non-commercial purposes only, including academic research and education, as long as you agree to abide by the following provisions: Not to make any attempt to identify or contact any individual(s) who may be the subjects of the data. If you share or re-distribute the data in any form, include a citation to the “Brain CT Hemorrhage Dataset, Copyright RSNA, 2019” as follows: Flanders AF, et al. The RSNA Brain CT Hemorrhage Dataset [10.1148/ryai.2020190211]. Radiology: Artificial Intelligence 2020;2:3." 
+Resources: + - Description: Zip archive containing DCM and CSV files + ARN: arn:aws:s3:::abdominal-trauma-detection + Region: us-west-2 + Type: S3 Bucket + ControlledAccess: https://mira.rsna.org/dataset/5 +DataAtWork: + Publications: + - Title: The RSNA Abdominal Traumatic Injury CT (RATIC) Dataset + AuthorName: Rudie, Jeffrey D. + URL: https://doi.org/10.48550/arXiv.2405.19595 From 391325548bf6638eccf8eeb950ec90fd6f1cee9a Mon Sep 17 00:00:00 2001 From: Christian Ariza Date: Fri, 13 Dec 2024 12:36:09 -0600 Subject: [PATCH 004/751] first draft --- datasets/lwi-model-data.yml | 85 +++++++++++++++++++++++++++++++++++++ 1 file changed, 85 insertions(+) create mode 100644 datasets/lwi-model-data.yml diff --git a/datasets/lwi-model-data.yml b/datasets/lwi-model-data.yml new file mode 100644 index 000000000..6bf7468f8 --- /dev/null +++ b/datasets/lwi-model-data.yml @@ -0,0 +1,85 @@ +Description: > + Geographic (land cover, land elevation, etc.), meteorologic (pluvial, wind, etc.), + hydrologic (fluvial, tidal, etc.), hydrodynamic (water surface elevations, flow velocities), + and built environment (structures, levees, floodgates, culverts) data used as inputs to and + outputs from numerical modeling software in the prediction of flood risk in stochastic and + probabilistic frameworks.This data was collected from open sources, such as from the + National Oceanographic and Atmospheric Administration (NOAA) or the + United States Geological Survey (USGS), modified in format to suit the + needs of the modeling program and software, and then used to predict flooding + in Louisiana across a range of scenarios. The modeling software used to predict + flooding which utilizes and creates this data is freely available from the + United States Army Corps of Engineers: The Hydrologic Engineering Center’s + Hydrologic Modeling System (HEC-HMS) and River Analysis System (HEC-RAS). + All data is made public by the state of Louisiana for the benefit of its citizens. 
+ This flood prediction data will be able to be used by federal, state, and local + decision makers as well as private citizens to assess the flood risk they face and + make sound science-based decisions for response and adaptation. +Contact: endmc@thewaterinstitute.org +ManagedBy: The Water Institute +UpdateFrequency: yearly +Tags: + - forecast + - bathymetry + - climate + - coastal + - disaster response + - elevation + - floods + - geospatial + - hydrodynamic + - hydrology + - infrastructure + - land cover + - land use + - mapping + - meteorological + - model + - numerical + - open source software + - precipitation + - simulations + - sustainability + - water + - weather +License: https://creativecommons.org/licenses/by/4.0/ with attribution to Louisiana Watershed Council +Citation: +Resources: + - Description: Model Applications and Simulations + ARN: + Region: + Type: S3 Bucket + Explore: + - '[Discover on ENDMC] (https://lwi.endmc.org/), the ID of the resources is the same as the path in the bucket.' 
+ - '[Browse on AWS] (https://.s3.amazonaws.com/index.html)' #placeholder +DataAtWork: + Tutorials: + - Title: ENDMC Documentation + URL: https://lwi.endmc.org/help/help_index + AuthorName: The Water Institute + AuthorURL: https://thewaterinstitute.org/ + Tools & Applications: + - Title: LWI ENDMC Datan and Model Catalog + URL: https://lwi.endmc.org/ + AuthorName: The Water Institute + AuthorURL: https://thewaterinstitute.org/ + - title: HEC-HMS + URL: https://www.hec.usace.army.mil/software/hec-hms/ + AuthorName: United States Army Corps of Engineers + AuthorURL: https://www.usace.army.mil/ + - title: HEC-RAS + URL: https://www.hec.usace.army.mil/software/hec-ras/ + AuthorName: United States Army Corps of Engineers + AuthorURL: https://www.usace.army.mil/ + - title: go-consequences + URL: https://github.com/USACE/go-consequences + AuthorName: United States Army Corps of Engineers + AuthorURL: https://www.usace.army.mil/ + Publications: + - Title: + URL: + AuthorName: + AuthorURL: +DeprecatedNotice: +ADXCategories: + - Environmental Data \ No newline at end of file From b7ddf7326f2afa0fb089227216500976ce96dc01 Mon Sep 17 00:00:00 2001 From: Christian Ariza Date: Fri, 13 Dec 2024 15:35:38 -0600 Subject: [PATCH 005/751] rename --- datasets/{lwi-model-data.yml => lwi-model-data.yaml} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename datasets/{lwi-model-data.yml => lwi-model-data.yaml} (100%) diff --git a/datasets/lwi-model-data.yml b/datasets/lwi-model-data.yaml similarity index 100% rename from datasets/lwi-model-data.yml rename to datasets/lwi-model-data.yaml From c9b16357d15263d0e7e65c1b04c50715addb78b9 Mon Sep 17 00:00:00 2001 From: Christian Ariza Date: Tue, 7 Jan 2025 11:18:22 -0600 Subject: [PATCH 006/751] adding tool documentation links --- datasets/lwi-model-data.yaml | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/datasets/lwi-model-data.yaml b/datasets/lwi-model-data.yaml index 6bf7468f8..8294e71eb 100644 
--- a/datasets/lwi-model-data.yaml +++ b/datasets/lwi-model-data.yaml @@ -47,7 +47,7 @@ Citation: Resources: - Description: Model Applications and Simulations ARN: - Region: + Region: us-east-1 Type: S3 Bucket Explore: - '[Discover on ENDMC] (https://lwi.endmc.org/), the ID of the resources is the same as the path in the bucket.' @@ -67,19 +67,30 @@ DataAtWork: URL: https://www.hec.usace.army.mil/software/hec-hms/ AuthorName: United States Army Corps of Engineers AuthorURL: https://www.usace.army.mil/ + - title: HEC-HMS documentation + URL: https://www.hec.usace.army.mil/software/hec-hms/documentation.aspx + AuthorName: United States Army Corps of Engineers + AuthorURL: https://www.usace.army.mil/ - title: HEC-RAS URL: https://www.hec.usace.army.mil/software/hec-ras/ AuthorName: United States Army Corps of Engineers AuthorURL: https://www.usace.army.mil/ + - title: HEC-RAS documentation + URL: https://www.hec.usace.army.mil/software/hec-ras/documentation.aspx + AuthorName: United States Army Corps of Engineers + AuthorURL: https://www.usace.army.mil/\ + - title: HEC-FIA + URL: https://www.hec.usace.army.mil/software/hec-fia/ + AuthorName: United States Army Corps of Engineers + AuthorURL: https://www.usace.army.mil/ + - title: HEC-FIA documentation + URL: https://www.hec.usace.army.mil/software/hec-fia/documentation.aspx + AuthorName: United States Army Corps of Engineers + AuthorURL: https://www.usace.army.mil/ - title: go-consequences URL: https://github.com/USACE/go-consequences AuthorName: United States Army Corps of Engineers AuthorURL: https://www.usace.army.mil/ - Publications: - - Title: - URL: - AuthorName: - AuthorURL: DeprecatedNotice: ADXCategories: - Environmental Data \ No newline at end of file From 313f5f185907a28636daed873cd90b80209dbc8f Mon Sep 17 00:00:00 2001 From: Christian Ariza Date: Wed, 8 Jan 2025 13:24:11 -0600 Subject: [PATCH 007/751] technical editor suggestions --- datasets/lwi-model-data.yaml | 14 +++++++------- 1 file changed, 7 
insertions(+), 7 deletions(-) diff --git a/datasets/lwi-model-data.yaml b/datasets/lwi-model-data.yaml index 8294e71eb..5f344451b 100644 --- a/datasets/lwi-model-data.yaml +++ b/datasets/lwi-model-data.yaml @@ -2,17 +2,17 @@ Description: > Geographic (land cover, land elevation, etc.), meteorologic (pluvial, wind, etc.), hydrologic (fluvial, tidal, etc.), hydrodynamic (water surface elevations, flow velocities), and built environment (structures, levees, floodgates, culverts) data used as inputs to and - outputs from numerical modeling software in the prediction of flood risk in stochastic and - probabilistic frameworks.This data was collected from open sources, such as from the + outputs from numerical modeling software for the prediction of flood risk in stochastic and + probabilistic frameworks. This data was collected from open sources, such as from the National Oceanographic and Atmospheric Administration (NOAA) or the - United States Geological Survey (USGS), modified in format to suit the + United States Geological Survey (USGS). The format of these data is modified to suit the needs of the modeling program and software, and then used to predict flooding in Louisiana across a range of scenarios. The modeling software used to predict flooding which utilizes and creates this data is freely available from the - United States Army Corps of Engineers: The Hydrologic Engineering Center’s + United States Army Corps of Engineers Hydrologic Engineering Center’s Hydrologic Modeling System (HEC-HMS) and River Analysis System (HEC-RAS). - All data is made public by the state of Louisiana for the benefit of its citizens. - This flood prediction data will be able to be used by federal, state, and local + All data is made public by the State of Louisiana for the benefit of its citizens. 
+ This flood prediction data can be used by federal, state, and local decision makers as well as private citizens to assess the flood risk they face and make sound science-based decisions for response and adaptation. Contact: endmc@thewaterinstitute.org @@ -78,7 +78,7 @@ DataAtWork: - title: HEC-RAS documentation URL: https://www.hec.usace.army.mil/software/hec-ras/documentation.aspx AuthorName: United States Army Corps of Engineers - AuthorURL: https://www.usace.army.mil/\ + AuthorURL: https://www.usace.army.mil/ - title: HEC-FIA URL: https://www.hec.usace.army.mil/software/hec-fia/ AuthorName: United States Army Corps of Engineers From d4359ff4a6d8e69d1ad009da00a6f09b1002d624 Mon Sep 17 00:00:00 2001 From: Christian Ariza Date: Wed, 8 Jan 2025 13:38:05 -0600 Subject: [PATCH 008/751] adding the aws-pds tag --- datasets/lwi-model-data.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/lwi-model-data.yaml b/datasets/lwi-model-data.yaml index 5f344451b..8fb952fda 100644 --- a/datasets/lwi-model-data.yaml +++ b/datasets/lwi-model-data.yaml @@ -42,6 +42,7 @@ Tags: - sustainability - water - weather + - aws-pds License: https://creativecommons.org/licenses/by/4.0/ with attribution to Louisiana Watershed Council Citation: Resources: From dc89a4ea797bc56c36a178beda9b643ab8ad18f7 Mon Sep 17 00:00:00 2001 From: Christian Ariza Date: Wed, 8 Jan 2025 13:40:41 -0600 Subject: [PATCH 009/751] add name --- datasets/lwi-model-data.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/lwi-model-data.yaml b/datasets/lwi-model-data.yaml index 8fb952fda..860f62065 100644 --- a/datasets/lwi-model-data.yaml +++ b/datasets/lwi-model-data.yaml @@ -1,3 +1,4 @@ +Name: Louisiana Watershed Initiative (LWI) Model Data Description: > Geographic (land cover, land elevation, etc.), meteorologic (pluvial, wind, etc.), hydrologic (fluvial, tidal, etc.), hydrodynamic (water surface elevations, flow velocities), From d507dde407708684bb7d1f3ff510118bad7f21ab Mon Sep 17 00:00:00 
2001 From: Christian Ariza Date: Wed, 8 Jan 2025 13:51:11 -0600 Subject: [PATCH 010/751] removing non-existing tags --- datasets/lwi-model-data.yaml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/datasets/lwi-model-data.yaml b/datasets/lwi-model-data.yaml index 860f62065..afef4e663 100644 --- a/datasets/lwi-model-data.yaml +++ b/datasets/lwi-model-data.yaml @@ -28,7 +28,7 @@ Tags: - elevation - floods - geospatial - - hydrodynamic + - hydrologic model - hydrology - infrastructure - land cover @@ -36,7 +36,6 @@ Tags: - mapping - meteorological - model - - numerical - open source software - precipitation - simulations From a0198ffcb9ea4de5d6d07c5b589e8113a57abd9a Mon Sep 17 00:00:00 2001 From: Christian Ariza Date: Wed, 8 Jan 2025 14:54:25 -0600 Subject: [PATCH 011/751] addressing pykwalify concerns --- datasets/lwi-model-data.yaml | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/datasets/lwi-model-data.yaml b/datasets/lwi-model-data.yaml index afef4e663..0be8b8d23 100644 --- a/datasets/lwi-model-data.yaml +++ b/datasets/lwi-model-data.yaml @@ -19,6 +19,7 @@ Description: > Contact: endmc@thewaterinstitute.org ManagedBy: The Water Institute UpdateFrequency: yearly +Documentation: https://watershed.la.gov/modeling-program Tags: - forecast - bathymetry @@ -44,15 +45,15 @@ Tags: - weather - aws-pds License: https://creativecommons.org/licenses/by/4.0/ with attribution to Louisiana Watershed Council -Citation: +Citation: "Louisiana Watershed Initiative Model Data. Louisiana Watershed Council. 2025. https://lwi.endmc.org/" Resources: - Description: Model Applications and Simulations - ARN: + ARN: arn:aws:s3:::lwi-model-data #placeholder Region: us-east-1 Type: S3 Bucket Explore: - - '[Discover on ENDMC] (https://lwi.endmc.org/), the ID of the resources is the same as the path in the bucket.' 
- - '[Browse on AWS] (https://.s3.amazonaws.com/index.html)' #placeholder + - '[Browse the bucket](https://lwi-model-data.s3.amazonaws.com/index.html)' + - '[Discover the data and models. The ID of the resources can be used to explore the data on the AWS bucket.](https://lwi.endmc.org/)' DataAtWork: Tutorials: - Title: ENDMC Documentation @@ -60,38 +61,37 @@ DataAtWork: AuthorName: The Water Institute AuthorURL: https://thewaterinstitute.org/ Tools & Applications: - - Title: LWI ENDMC Datan and Model Catalog + - Title: LWI ENDMC Data and Model Catalog. URL: https://lwi.endmc.org/ AuthorName: The Water Institute AuthorURL: https://thewaterinstitute.org/ - - title: HEC-HMS + - Title: HEC-HMS URL: https://www.hec.usace.army.mil/software/hec-hms/ AuthorName: United States Army Corps of Engineers AuthorURL: https://www.usace.army.mil/ - - title: HEC-HMS documentation + - Title: HEC-HMS documentation URL: https://www.hec.usace.army.mil/software/hec-hms/documentation.aspx AuthorName: United States Army Corps of Engineers AuthorURL: https://www.usace.army.mil/ - - title: HEC-RAS + - Title: HEC-RAS URL: https://www.hec.usace.army.mil/software/hec-ras/ AuthorName: United States Army Corps of Engineers AuthorURL: https://www.usace.army.mil/ - - title: HEC-RAS documentation + - Title: HEC-RAS documentation URL: https://www.hec.usace.army.mil/software/hec-ras/documentation.aspx AuthorName: United States Army Corps of Engineers AuthorURL: https://www.usace.army.mil/ - - title: HEC-FIA + - Title: HEC-FIA URL: https://www.hec.usace.army.mil/software/hec-fia/ AuthorName: United States Army Corps of Engineers AuthorURL: https://www.usace.army.mil/ - - title: HEC-FIA documentation + - Title: HEC-FIA documentation URL: https://www.hec.usace.army.mil/software/hec-fia/documentation.aspx AuthorName: United States Army Corps of Engineers AuthorURL: https://www.usace.army.mil/ - - title: go-consequences + - Title: go-consequences URL: https://github.com/USACE/go-consequences AuthorName: 
United States Army Corps of Engineers AuthorURL: https://www.usace.army.mil/ -DeprecatedNotice: ADXCategories: - Environmental Data \ No newline at end of file From c9126fd29e448b1550af128225de17a6f8a85b23 Mon Sep 17 00:00:00 2001 From: rsignell <125569335+rsignell@users.noreply.github.com> Date: Mon, 3 Mar 2025 13:14:53 -0500 Subject: [PATCH 012/751] Create fvcom_gom3.yaml Adding the FVCOM GOM3 Ocean Hindcast for New England --- datasets/fvcom_gom3.yaml | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) create mode 100644 datasets/fvcom_gom3.yaml diff --git a/datasets/fvcom_gom3.yaml b/datasets/fvcom_gom3.yaml new file mode 100644 index 000000000..8172a914f --- /dev/null +++ b/datasets/fvcom_gom3.yaml @@ -0,0 +1,28 @@ +Name: FVCOM GOM3 +Description: The Finite Volume Community Ocean Model (FVCOM) was used to simulate ocean water levels, velocity, temperature and salinity over a multi-decadal period (1984-present) in the waters of the Northeast US including the Gulf of Maine. The model was configured and run by Dr. Changsheng Chen, Director of the Marine Ecosystems Dynamics Modeling Laboratory in the School for Marine Science & Technology at the University of Massachusetts Dartmouth. The triangular mesh has a varying horizontal resolution from several hundred meters inshore to several kilometers offshore. The model output was saved hourly from 2009-08-21 to 2022-06-17. 
+Documentation: https://www.umassd.edu/news/2018/charting-the-ocean-.html +Contact: rich@opensciencecomputing.com +ManagedBy: Open Science Computing, LLC +UpdateFrequency: None +Citation: +Tags: + - aws-pds + - oceans +License: CC0 +Resources: + - Description: A collection of NetCDF files, kerchunk generated JSON files, and an Intake catalog + ARN: arn:aws:s3:::fvcom-gom3 + Region: us-west-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: FVCOM Explorer Notebook + URL: https://github.com/opensciencecomputing/fvcom + NotebookURL: https://github.com/opensciencecomputing/fvcom/blob/main/FVCOM_explore.ipynb + AuthorName: Rich Signell + AuthorURL: https://about.me/rich.signell + Services: + Publications: + - Title: An Unstructured Grid, Finite-Volume, Three-Dimensional, Primitive Equations Ocean Model: Application to Coastal Ocean and Estuaries + URL: https://doi.org/10.1175/1520-0426(2003)020%3C0159:AUGFVT%3E2.0.CO;2 + AuthorName: Changsheng Chen, Hedong Liu, and Robert C. Beardsley From 9ca15a5f087859b968a12c75fcfd7fb4916a5c8e Mon Sep 17 00:00:00 2001 From: rsignell <125569335+rsignell@users.noreply.github.com> Date: Mon, 3 Mar 2025 13:40:46 -0500 Subject: [PATCH 013/751] Update fvcom_gom3.yaml --- datasets/fvcom_gom3.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/fvcom_gom3.yaml b/datasets/fvcom_gom3.yaml index 8172a914f..37c1550b3 100644 --- a/datasets/fvcom_gom3.yaml +++ b/datasets/fvcom_gom3.yaml @@ -23,6 +23,6 @@ DataAtWork: AuthorURL: https://about.me/rich.signell Services: Publications: - - Title: An Unstructured Grid, Finite-Volume, Three-Dimensional, Primitive Equations Ocean Model: Application to Coastal Ocean and Estuaries + - Title: An Unstructured Grid, Finite Volume, Three Dimensional, Primitive Equations Ocean Model with Application to Coastal Ocean and Estuaries URL: https://doi.org/10.1175/1520-0426(2003)020%3C0159:AUGFVT%3E2.0.CO;2 AuthorName: Changsheng Chen, Hedong Liu, and Robert C. 
Beardsley From 15ab2420ec72e51e0c384ede8e1935e4594378ed Mon Sep 17 00:00:00 2001 From: devapriyakumar <81041093+devapriyakumar@users.noreply.github.com> Date: Mon, 17 Mar 2025 18:12:33 +0530 Subject: [PATCH 014/751] Created AI3.yaml --- datasets/AI3.yaml | 57 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 57 insertions(+) create mode 100644 datasets/AI3.yaml diff --git a/datasets/AI3.yaml b/datasets/AI3.yaml new file mode 100644 index 000000000..1abe6f3f0 --- /dev/null +++ b/datasets/AI3.yaml @@ -0,0 +1,57 @@ +Name: AI3 +Description: > + The rapid advancement of computing technologies, particularly artificial intelligence (AI), has revolutionized various domains, including drug discovery. Curated datasets are crucial for developing reliable, generalizable, and accurate models for practical applications. Generating experimental data on a large scale is an expensive and arduous process. In domains such as medical diagnostics where real-life data is hard to obtain, synthetic data has been shown to be extremely valuable. We, teams from IIIT Hyderabad, Intel, AWS, and Insilico Medicine, have performed physics-based calculations (molecular dynamics simulations) on about 20,000 protein-ligand complexes. The dataset comprises molecular dynamics snapshots, binding affinities calculated using the MM-PBSA method, and individual energy components, including electrostatic and van der Waals interactions. 
+ +DatasetFileFormats: + - 3D coordinates of the protein-ligand complexes (pdb) in tar.gz files + - CSV files containing the energy data + +DatasetUsages: + - ML scoring function for predicting binding affinities of given protein-ligand complexes + - Classification models for predicting correct binding poses of ligands + - Identification of cryptic binding pockets + - Optimization of binding features by exploiting the individual components of the energy (experimental data has only the total binding affinity) + +DatasetNovelty: > + Existing AI/ML training datasets lack dynamic data and are inherently biased. Further, binding affinity data existing in the literature are obtained from different experimental protocols. Therefore, this dataset has been uniquely created (from the same computational protocols) followed by free energy calculations with molecular dynamics (MD) simulations. The dynamic data-enriched protein-ligand coordinates can be used to effectively train convolutional neural network-based regression models for more accurate binding affinity prediction. + +Documentation: https://github.com/devalab/AI3 +Contact: devalab@iiit.ac.in +ManagedBy: International Institute of Information Technology Hyderabad +UpdateFrequency: Not updated + +Tags: + - Pharmaceutical + - Simulations + - Health + - Life Sciences + - Machine Learning + - Protein + - Molecular Dynamics + +License: https://www.devalab.in/AI3.html + +Resources: + - Description: Coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. + ARN: To be added once bucket is created + Region: To be added once bucket is created + Type: S3 bucket + +DataAtWork: + Tutorials: + - Description: The dataset is easy to download and can be applied based on user requirements. Further information about the protocol for creation of the dataset can be obtained from https://github.com/devalab/AI3. 
+ ToolsApplications: + - Title: Dataset of protein-ligand complexes now available in the Registry of Open Data on AWS + URL: To be added once blog is published + AuthorName: U. Deva Priyakumar, Rakesh Srivastava, Prathit Chatterjee, Vladimir Aladinskiy, Ramanathan Sethuraman, Yusong Wang, Alex Iankoulski, Beryl Rabindran + AuthorURL: https://devalab.in/ + Publications: + - Title: "PLAS-5k: Dataset of Protein-Ligand Affinities from Molecular Dynamics for Machine Learning Applications" + URL: https://www.nature.com/articles/s41597-022-01631-9 + AuthorName: U. Deva Priyakumar + AuthorURL: https://devalab.in/ + - Title: "PLAS-20k: Extended Dataset of Protein-Ligand Affinities from MD Simulations for Machine Learning Applications" + URL: https://www.nature.com/articles/s41597-023-02872-y + AuthorName: U. Deva Priyakumar + AuthorURL: https://devalab.in + From d8fd5e32b9b0cfa6ba3bafe5af796226c86cb296 Mon Sep 17 00:00:00 2001 From: Allison Heath Date: Tue, 18 Mar 2025 14:54:49 -0400 Subject: [PATCH 015/751] :sparkles: Add RADIANT/CBTN data set --- datasets/radiant.yaml | 45 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) create mode 100644 datasets/radiant.yaml diff --git a/datasets/radiant.yaml b/datasets/radiant.yaml new file mode 100644 index 000000000..4a4a3dbec --- /dev/null +++ b/datasets/radiant.yaml @@ -0,0 +1,45 @@ +Name: RADIANT Public Data +Description: > + The Real-time Analysis and Discovery in Integrated And Networked Technologies (RADIANT) + initiative seeks to develop an extensible, federated framework for rapid exchange of + multimodal clinical and research data on behalf of accelerated discovery and patient impact. + Coordination and implementation of initial RADIANT deployments will leverage a network of + more than 35 partnered health care systems and participating patient families within the + Children’s Brain Tumor Network (CBTN) and the Pediatric Neuro-Oncology Consortium (PNOC). 
+ This data set is composed of public multi-modal data provisioned by RADIANT. The initial + bolus of data is from CBTN and consists of clinical data extracted/abstracted from + electronic medical records, omic data such as genomics, transcriptomics and proteomics and + radiology and pathology imaging data. Data are collected or generated as part of consent-based, + IRB-approved observational or interventional studies with the goal of making it available + globally to researchers across a broad number of disciplines. + +Documentation: https://cbtn.org/research-resources +Contact: research@cbtn.org +ManagedBy: "[The Center for Data-Driven Discovery in Biomedicine (D3b) at the Children's Hospital of Philadelphia](https://d3b.center/)" +UpdateFrequency: | + Data is updated on a regular basis by the RADIANT teams to make data available as + rapidly as possible. +Tags: + - aws-pds + - life sciences + - cancer + - genetic + - genomic + - transcriptomics + - medical imaging + - radiology + - Homo sapiens + - pediatric + - whole genome sequencing +License: "NIH Genomic Data Sharing Policy: https://grants.nih.gov/grants/guide/notice-files/not-od-14-124.html" +Resources: + + +DataAtWork: + Tools & Applications: + - Title: RADIANT Source Code + URL: https://github.com/radiant-network + AuthorName: RADIANT Team + AuthorURL: https://github.com/radiant-network + Publications: + From 90817400f0f3d74c232a3cbc98baa921b059edda Mon Sep 17 00:00:00 2001 From: Allison Heath Date: Mon, 24 Mar 2025 09:57:08 -0400 Subject: [PATCH 016/751] :memo: Update tools and publications --- datasets/radiant.yaml | 40 +++++++++++++++++++++++++++++++++++++++- 1 file changed, 39 insertions(+), 1 deletion(-) diff --git a/datasets/radiant.yaml b/datasets/radiant.yaml index 4a4a3dbec..1f4e1e1c3 100644 --- a/datasets/radiant.yaml +++ b/datasets/radiant.yaml @@ -41,5 +41,43 @@ DataAtWork: URL: https://github.com/radiant-network AuthorName: RADIANT Team AuthorURL: https://github.com/radiant-network + - Title: 
CAVATICA + URL: http://cavatica.org + AuthorName: Seven Bridges Genomics + AuthorURL: http://www.sevenbridges.com + - Title: PedcBioPortal + URL: https://pedcbioportal.kidsfirstdrc.org + AuthorName: cBioPortal + AuthorURL: https://www.cbioportal.org/ + - Title: Flywheel (CHOP D3b) + URL: https://chop.flywheel.io + AuthorName: Flywheel + AuthorURL: https://flywheel.io/ Publications: - + - Title: "The children's brain tumor network (CBTN) - Accelerating research in pediatric central nervous system tumors through collaboration and open science." + URL: https://pubmed.ncbi.nlm.nih.gov/36335802/ + AuthorName: Jena V Lilly, Jo Lynne Rokita, Jennifer L Mason, et al. + - Title: "The landscape of primary mismatch repair deficient gliomas in children, adolescents, and young adults: a multi-cohort study" + URL: https://pubmed.ncbi.nlm.nih.gov/39701117/ + AuthorName: Logine Negm, Jiil Chung, Liana Nobre, et al. + - Title: "Multiparametric MRI along with machine learning predicts prognosis and treatment response in pediatric low-grade glioma" + URL: https://pubmed.ncbi.nlm.nih.gov/39747214/ + AuthorName: Anahita Fathi Kazerooni, Adam Kraya, Komal S Rathi, Meen Chul Kim, et al. + - Title: "Multi-scale signaling and tumor evolution in high-grade gliomas" + URL: https://pubmed.ncbi.nlm.nih.gov/38981438/ + AuthorName: Jingxian Liu, Song Cao, Kathleen J Imback, et al. + - Title: "Germline analysis of an international cohort of pediatric diffuse midline glioma patients" + URL: https://pubmed.ncbi.nlm.nih.gov/40072012/ + AuthorName: Marion K Mateos, Pamela Ajuyah, Noemi Fuentes-Bolanos, et al. + - Title: "A road map for the treatment of pediatric diffuse midline glioma" + URL: https://pubmed.ncbi.nlm.nih.gov/38039965/ + AuthorName: Carl Koschmann, Wajd N Al-Holou, Marta M Alonso, et al. 
+ - Title: "Use of External Control Cohorts in Pediatric Brain Tumor Clinical Trials" + URL: https://pubmed.ncbi.nlm.nih.gov/38394473/ + AuthorName: Ashley S Margol, Annette M Molinaro, Arzu Onar-Thomas, et al. + - Title: "OpenPBTA: The Open Pediatric Brain Tumor Atlas" + URL: https://pubmed.ncbi.nlm.nih.gov/37492101/ + AuthorName: Joshua A Shapiro, Krutika S Gaonkar, Stephanie J Spielman, et al. + - Title: "Generation and multi-dimensional profiling of a childhood cancer cell line atlas defines new therapeutic opportunities" + URL: https://pubmed.ncbi.nlm.nih.gov/37001527/ + AuthorName: Claire Xin Sun, Paul Daniel, Gabrielle Bradshaw et al. \ No newline at end of file From f323a1cfafbc1126229776410687a9356e70cf8b Mon Sep 17 00:00:00 2001 From: Sandy Hider Date: Thu, 27 Mar 2025 18:38:06 -0400 Subject: [PATCH 017/751] Initial creation of ember.yaml for open-data-registry --- datasets/ember.yaml | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) create mode 100644 datasets/ember.yaml diff --git a/datasets/ember.yaml b/datasets/ember.yaml new file mode 100644 index 000000000..53fbd2237 --- /dev/null +++ b/datasets/ember.yaml @@ -0,0 +1,25 @@ +Name: EMBER Open Datasets +Description: This is data from the Ecosystem for Multi-modal Brain-behavior Experimentation and Research (EMBER). It contains time series behavioral and neuroscience data from animal and deidentified human subjects across multiple modalities. +Documentation: https://emberarchive.org/ +Contact: brock.wester@jhuapl.edu +ManagedBy: "[Johns Hopkins University Applied Physics Laboratory](https://www.jhuapl.edu)" +UpdateFrequency: New datasets are added as soon as it is available. Minor updates on existing datasets occur sporadically.
+Tags: + - neural data + - behavioral + - physiological + - biochemical + - kinematic + - spatial + - molecular +License: Creative Commons 4.0 International (CC BY 4.0) +Resources: + - Description: Time series neurophysiology and behavioral data from animal and human (deidentified) + ARN: arn:aws:s3:::ember-open-data + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Publications: + - Title: Mapping the landscape of social behavior + URL: https://pubmed.ncbi.nlm.nih.gov/40043703/ + AuthorName: Ugne Klibaite, Tianqing Li, Diego Aldarondo, Jumana F Akoad, Bence P Ölveczky, Timothy W Dunn From 54459a30acbe1369643091d4c7022ca21127570b Mon Sep 17 00:00:00 2001 From: Sandy Hider Date: Fri, 28 Mar 2025 10:58:14 -0400 Subject: [PATCH 018/751] update the tags to match tags.yaml --- datasets/ember.yaml | 35 ++++++++++++++++++++++++++++------- 1 file changed, 28 insertions(+), 7 deletions(-) diff --git a/datasets/ember.yaml b/datasets/ember.yaml index 53fbd2237..8b2501c0b 100644 --- a/datasets/ember.yaml +++ b/datasets/ember.yaml @@ -5,13 +5,34 @@ Contact: brock.wester@jhuapl.edu ManagedBy: "[Johns Hopkins University Applied Physics Laboratory](https://www.jhuapl.edu)" UpdateFrequency: New datasets are added as soon as it is available. Minor updates on existing datasets occur sporadically. 
Tags: - - neural data - - behavioral - - physiological - - biochemical - - kinematic - - spatial - - molecular + - neuroscience + - neurobiology + - neuroimaging + - neurophysiology + - electrophysiology + - machine learning + - magnetic resonance imaging + - json + - hdf5 + - zarr + - localization + - brain images + - life sciences + - signal processing + - speech processing + - activity recognition + - activity detection + - analytics + - bioinformatics + - brain models + - cloud computing + - computer vision + - deep learning + - GPS + - h5 + - Homo sapiens + - Mus musculus + - non-human primate License: Creative Commons 4.0 International (CC BY 4.0) Resources: - Description: Time series neurophysiology and behavioral data from animal and human (deidentified) From 45aa0bed4ca101cb0e685c322f76298b39f5f2a3 Mon Sep 17 00:00:00 2001 From: nanaboamah89 Date: Mon, 7 Apr 2025 10:46:36 +0000 Subject: [PATCH 019/751] added the sentinel 1 --- datasets/deafrica-sentinel-1-mosaic.yaml | 55 ++++++++++++++++++++++++ 1 file changed, 55 insertions(+) create mode 100644 datasets/deafrica-sentinel-1-mosaic.yaml diff --git a/datasets/deafrica-sentinel-1-mosaic.yaml b/datasets/deafrica-sentinel-1-mosaic.yaml new file mode 100644 index 000000000..b31b79216 --- /dev/null +++ b/datasets/deafrica-sentinel-1-mosaic.yaml @@ -0,0 +1,55 @@ +Name: DSentinel-1 Monthly Mosaic +Description: | + Synthetic Aperture Radar (SAR) sensor have the advantage of operating at wavelengths not impeded by cloud cover and can acquire data over a site during the day or night. The Sentinel-1 mission, part of the Copernicus joint initiative by the European Commission (EC) and the European Space Agency (ESA), provides reliable and repeated wide-area monitoring using its SAR instrument.Sentinel-1 Monthly Mosaics are analysis-ready product of individual Sentinel-1 acquisitions. 
Sentinel-1 monthly mosaics are generated from Radiometric Terrain Corrected (RTC) backscatter data, with variations from changing observation geometries mitigated. RTC images acquired within a calendar month are combined using a multitemporal compositing algorithm. This algorithm calculates a weighted average of valid pixels, assigning higher weights to pixels with higher local resolution (e.g., slopes facing away from the sensor). This local resolution weighting approach minimizes noise and improves spatial homogeneity in the composites. + +Documentation: https://docs.digitalearthafrica.org/en/latest/data_specs/Sentinel-1_Monthly_Mosaic_specs.html +Contact: helpdesk@digitalearthafrica.org +ManagedBy: "[Digital Earth Africa](https://www.digitalearthafrica.org/)" +UpdateFrequency: N/A. +Collabs: + ASDI: + Tags: + - satellite imagery +Tags: + - aws-pds + - agriculture + - earth observation + - satellite imagery + - geospatial + - natural resource + - disaster response + - deafrica + - stac + - cog + - synthetic aperture radar +License: | + Access to S1 Monthly Mosaic data is free, full and open for the broad Regional, National, European and International user community. View [Terms and Conditions](https://scihub.copernicus.eu/twiki/do/view/SciHubWebPortal/TermsConditions). 
+Resources: + - Description: S1 Monthly Mosaic tiles and metadata + ARN: arn:aws:s3:::deafrica-sentinel-1 + Region: af-south-1 + Type: S3 Bucket + RequesterPays: False + Explore: + - '[STAC V1.0.0 endpoint](https://explorer.digitalearth.africa/products/s1_monthly_mosaic)' +DataAtWork: + Tools & Applications: + - Title: "Digital Earth Africa Explorer" + URL: https://explorer.digitalearth.africa/products/s1_monthly_mosaic/extents + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa web services" + URL: https://ows.digitalearth.africa + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Map" + URL: https://maps.digitalearth.africa/ + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Sandbox" + URL: https://sandbox.digitalearth.africa/ + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Notebook Repo" + URL: https://github.com/digitalearthafrica/deafrica-sandbox-notebooks + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Geoportal" + URL: https://www.africageoportal.com/pages/digital-earth-africa + AuthorName: Digital Earth Africa Contributors + From 6c6db752834f61541aab87424068770b9f24275b Mon Sep 17 00:00:00 2001 From: nanaboamah89 Date: Tue, 8 Apr 2025 09:29:32 +0000 Subject: [PATCH 020/751] update text --- datasets/deafrica-sentinel-1-mosaic.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/deafrica-sentinel-1-mosaic.yaml b/datasets/deafrica-sentinel-1-mosaic.yaml index b31b79216..9644174a6 100644 --- a/datasets/deafrica-sentinel-1-mosaic.yaml +++ b/datasets/deafrica-sentinel-1-mosaic.yaml @@ -1,6 +1,6 @@ Name: DSentinel-1 Monthly Mosaic Description: | - Synthetic Aperture Radar (SAR) sensor have the advantage of operating at wavelengths not impeded by cloud cover and can acquire data over a site during the day or night. 
The Sentinel-1 mission, part of the Copernicus joint initiative by the European Commission (EC) and the European Space Agency (ESA), provides reliable and repeated wide-area monitoring using its SAR instrument.Sentinel-1 Monthly Mosaics are analysis-ready product of individual Sentinel-1 acquisitions. Sentinel-1 monthly mosaics are generated from Radiometric Terrain Corrected (RTC) backscatter data, with variations from changing observation geometries mitigated. RTC images acquired within a calendar month are combined using a multitemporal compositing algorithm. This algorithm calculates a weighted average of valid pixels, assigning higher weights to pixels with higher local resolution (e.g., slopes facing away from the sensor). This local resolution weighting approach minimizes noise and improves spatial homogeneity in the composites. + Synthetic Aperture Radar (SAR) sensor have the advantage of operating at wavelengths not impeded by cloud cover and can acquire data over a site during the day or night. The Sentinel-1 mission, part of the Copernicus joint initiative by the European Commission (EC) and the European Space Agency (ESA), provides reliable and repeated wide-area monitoring using its SAR instrument.Sentinel-1 Monthly Mosaics are analysis-ready product of individual Sentinel-1 acquisitions. Sentinel-1 monthly mosaics are generated from Radiometric Terrain Corrected (RTC) backscatter data, with variations from changing observation geometries mitigated. RTC images acquired within a calendar month are combined using a multitemporal compositing algorithm. This algorithm calculates a weighted average of valid pixels, assigning higher weights to pixels with higher local resolution (e.g., slopes facing away from the sensor). This local resolution weighting approach minimizes noise and improves spatial homogeneity in the composites. 
Sinergise(Planet Labs) processed and indexed the product on the DE Africa platform Documentation: https://docs.digitalearthafrica.org/en/latest/data_specs/Sentinel-1_Monthly_Mosaic_specs.html Contact: helpdesk@digitalearthafrica.org From bae5f57df08d4496ce369b509b24a901106cfe81 Mon Sep 17 00:00:00 2001 From: nanaboamah89 Date: Tue, 8 Apr 2025 09:48:42 +0000 Subject: [PATCH 021/751] add sentinel 2 collection 1 --- datasets/deafrica-sentinel-2-c1.yaml | 82 ++++++++++++++++++++++++++++ 1 file changed, 82 insertions(+) create mode 100644 datasets/deafrica-sentinel-2-c1.yaml diff --git a/datasets/deafrica-sentinel-2-c1.yaml b/datasets/deafrica-sentinel-2-c1.yaml new file mode 100644 index 000000000..351202f27 --- /dev/null +++ b/datasets/deafrica-sentinel-2-c1.yaml @@ -0,0 +1,82 @@ +Name: Digital Earth Africa Sentinel-2 Level-2A Surface Reflectance Collection 1 +Description: | + The Sentinel-2 mission is part of the European Union Copernicus programme for Earth observations. Sentinel-2 consists of twin satellites, Sentinel-2A (launched 23 June 2015) and Sentinel-2B (launched 7 March 2017). The two satellites have the same orbit, but 180° apart for optimal coverage and data delivery. Their combined data is used in the Digital Earth Africa Sentinel-2 product. + Together, they cover all Earth’s land surfaces, large islands, inland and coastal waters every 3-5 days. + Sentinel-2 data is tiered by level of pre-processing. Level-0, Level-1A and Level-1B data contain raw data from the satellites, with little to no pre-processing. Level-1C data is surface reflectance measured at the top of the atmosphere. This is processed using the Sen2Cor algorithm to give Level-2A, the bottom-of-atmosphere reflectance (Obregón et al, 2019). Level-2A data is the most ideal for research activities as it allows further analysis without applying additional atmospheric corrections. 
+ The Digital Earth Africa Sentinel-2 Level-2A Surface Reflectance Collection 1 dataset contains Level-2A data of the African continent. Digital Earth Africa Sentinel-2 Level-2A Surface Reflectance Collection 1 is the reprocessing of Digital Earth Africa Sentinel-2 product to correct some of the quality masking, etc.Digital Earth Africa does not host any lower-level Sentinel-2 data. + Note that this data is a subset of the Sentinel-2 COGs dataset. +Documentation: https://docs.digitalearthafrica.org/en/latest/data_specs/Sentinel-2_Level-2A_specs.html +Contact: helpdesk@digitalearthafrica.org +ManagedBy: "[Digital Earth Africa](https://www.digitalearthafrica.org/)" +UpdateFrequency: New Sentinel-2 scenes are added regularly, usually within few hours after they are available on Copernicus OpenHub. +Collabs: + ASDI: + Tags: + - satellite imagery +Tags: + - aws-pds + - agriculture + - earth observation + - satellite imagery + - geospatial + - natural resource + - disaster response + - deafrica + - stac + - cog +License: | + Access to Sentinel data is free, full and open for the broad Regional, National, European and International user community. View [Terms and Conditions](https://scihub.copernicus.eu/twiki/do/view/SciHubWebPortal/TermsConditions). +Resources: + - Description: Sentinel-2 scenes and metadata + ARN: arn:aws:s3:::deafrica-sentinel-2-l2a-c1/sentinel-2-c1-l2a + Region: af-south-1 + Type: S3 Bucket + RequesterPays: False + Explore: + - '[STAC V1.0.0 endpoint](https://explorer.digitalearth.africa/stac/collections/s2_l2a_c1)' + - Description: "[S3 Inventory](https://docs.aws.amazon.com/AmazonS3/latest/dev/storage-inventory.html#storage-inventory-contents)" + ARN: arn:aws:s3:::deafrica-sentinel-2-l2a-c1/sentinel-2-c1-l2a-inventory + Region: af-south-1 + Type: S3 Bucket + - Description: New scene notifications, can subscribe with [Lambda](https://aws.amazon.com/lambda/) or [SQS](https://aws.amazon.com/sqs/). Message contains entire STAC record for each new Item. 
+ ARN: arn:aws:sns:af-south-1:543785577597:deafrica-sentinel-2-scene-topic + Region: af-south-1 + Type: SNS Topic + - Description: Bucket creation event notification, can subscribe with [Lambda](https://aws.amazon.com/lambda/) or [SQS](https://aws.amazon.com/sqs/). Message sent by deafrica-sentinel-2 s3 bucket all object create events. + ARN: arn:aws:sns:af-south-1:543785577597:deafrica-sentinel-2-topic + Region: af-south-1 + Type: SNS Topic +DataAtWork: + Tutorials: + - Title: Use Sentinel-2 data in the Open Data Cube + URL: https://github.com/opendatacube/cube-in-a-box + AuthorName: Alex Leith + - Title: Digital Earth Africa Training + URL: http://learn.digitalearthafrica.org/ + AuthorName: Digital Earth Africa Contributors + - Title: Downloading and streaming data using STAC metadata + URL: https://docs.digitalearthafrica.org/en/latest/sandbox/notebooks/Frequently_used_code/Downloading_data_with_STAC.html + AuthorName: Digital Earth Africa Contributors + Tools & Applications: + - Title: "Digital Earth Africa Explorer" + URL: https://explorer.digitalearth.africa/products/s2_l2a_c1/extents + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa web services" + URL: https://ows.digitalearth.africa + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Map" + URL: https://maps.digitalearth.africa/ + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Sandbox" + URL: https://sandbox.digitalearth.africa/ + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Notebook Repo" + URL: https://github.com/digitalearthafrica/deafrica-sandbox-notebooks + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Geoportal" + URL: https://www.africageoportal.com/pages/digital-earth-africa + AuthorName: Digital Earth Africa Contributors + Publications: + - Title: "Introduction to DE Africa" + URL: https://youtu.be/Wkf7N6O9jJQ + AuthorName: Dr Fang Yuan From 
0985af88bb12f22d971680b892b76900b57c1793 Mon Sep 17 00:00:00 2001 From: nanaboamah89 Date: Tue, 8 Apr 2025 10:08:50 +0000 Subject: [PATCH 022/751] update header --- datasets/deafrica-sentinel-1-mosaic.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/deafrica-sentinel-1-mosaic.yaml b/datasets/deafrica-sentinel-1-mosaic.yaml index 9644174a6..b98b10ee7 100644 --- a/datasets/deafrica-sentinel-1-mosaic.yaml +++ b/datasets/deafrica-sentinel-1-mosaic.yaml @@ -1,6 +1,6 @@ -Name: DSentinel-1 Monthly Mosaic +Name: Sentinel-1 Monthly Mosaic Description: | - Synthetic Aperture Radar (SAR) sensor have the advantage of operating at wavelengths not impeded by cloud cover and can acquire data over a site during the day or night. The Sentinel-1 mission, part of the Copernicus joint initiative by the European Commission (EC) and the European Space Agency (ESA), provides reliable and repeated wide-area monitoring using its SAR instrument.Sentinel-1 Monthly Mosaics are analysis-ready product of individual Sentinel-1 acquisitions. Sentinel-1 monthly mosaics are generated from Radiometric Terrain Corrected (RTC) backscatter data, with variations from changing observation geometries mitigated. RTC images acquired within a calendar month are combined using a multitemporal compositing algorithm. This algorithm calculates a weighted average of valid pixels, assigning higher weights to pixels with higher local resolution (e.g., slopes facing away from the sensor). This local resolution weighting approach minimizes noise and improves spatial homogeneity in the composites. Sinergise(Planet Labs) processed and indexed the product on the DE Africa platform + Synthetic Aperture Radar (SAR) sensors have the advantage of operating at wavelengths not impeded by cloud cover and can acquire data over a site during the day or night.
The Sentinel-1 mission, part of the Copernicus joint initiative by the European Commission (EC) and the European Space Agency (ESA), provides reliable and repeated wide-area monitoring using its SAR instrument. Sentinel-1 Monthly Mosaics are analysis-ready products of individual Sentinel-1 acquisitions. Sentinel-1 monthly mosaics are generated from Radiometric Terrain Corrected (RTC) backscatter data, with variations from changing observation geometries mitigated. RTC images acquired within a calendar month are combined using a multitemporal compositing algorithm. This algorithm calculates a weighted average of valid pixels, assigning higher weights to pixels with higher local resolution (e.g., slopes facing away from the sensor). This local resolution weighting approach minimizes noise and improves spatial homogeneity in the composites. Sinergise (Planet Labs) processed and indexed the product on the DE Africa platform. Documentation: https://docs.digitalearthafrica.org/en/latest/data_specs/Sentinel-1_Monthly_Mosaic_specs.html Contact: helpdesk@digitalearthafrica.org From d933bfe2356cb7f03aaee2f8aade0e17ee898e44 Mon Sep 17 00:00:00 2001 From: nanaboamah89 Date: Tue, 8 Apr 2025 10:24:23 +0000 Subject: [PATCH 023/751] update text --- datasets/deafrica-sentinel-2-c1.yaml | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/datasets/deafrica-sentinel-2-c1.yaml b/datasets/deafrica-sentinel-2-c1.yaml index 351202f27..385e7d085 100644 --- a/datasets/deafrica-sentinel-2-c1.yaml +++ b/datasets/deafrica-sentinel-2-c1.yaml @@ -39,16 +39,16 @@ Resources: Region: af-south-1 Type: S3 Bucket - Description: New scene notifications, can subscribe with [Lambda](https://aws.amazon.com/lambda/) or [SQS](https://aws.amazon.com/sqs/). Message contains entire STAC record for each new Item.
- ARN: arn:aws:sns:af-south-1:543785577597:deafrica-sentinel-2-scene-topic + ARN: arn:aws:sns:af-south-1:543785577597:deafrica-sentinel-2-l2a-c1-scene-topic Region: af-south-1 Type: SNS Topic - - Description: Bucket creation event notification, can subscribe with [Lambda](https://aws.amazon.com/lambda/) or [SQS](https://aws.amazon.com/sqs/). Message sent by deafrica-sentinel-2 s3 bucket all object create events. + - Description: Bucket creation event notification, can subscribe with [Lambda](https://aws.amazon.com/lambda/) or [SQS](https://aws.amazon.com/sqs/). Message sent by deafrica-sentinel-2-l2a-c1 s3 bucket all object create events. ARN: arn:aws:sns:af-south-1:543785577597:deafrica-sentinel-2-topic Region: af-south-1 Type: SNS Topic DataAtWork: Tutorials: - - Title: Use Sentinel-2 data in the Open Data Cube + - Title: Use Sentinel-2-C1 data in the Open Data Cube URL: https://github.com/opendatacube/cube-in-a-box AuthorName: Alex Leith - Title: Digital Earth Africa Training @@ -80,3 +80,6 @@ DataAtWork: - Title: "Introduction to DE Africa" URL: https://youtu.be/Wkf7N6O9jJQ AuthorName: Dr Fang Yuan + - Title: "S2 Processing" + URL: https://sentiwiki.copernicus.eu/web/s2-processing#S2Processing-Collection-1ProcessingBaselineS2-Processing-Collection-Processing-Baseline + AuthorName: Sentiwiki From d1e14b6596ea68a7f484b941cb898f4639cb2432 Mon Sep 17 00:00:00 2001 From: nanaboamah89 Date: Tue, 8 Apr 2025 10:31:58 +0000 Subject: [PATCH 024/751] udated text --- datasets/deafrica-sentinel-2-c1.yaml | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/datasets/deafrica-sentinel-2-c1.yaml b/datasets/deafrica-sentinel-2-c1.yaml index 385e7d085..6de6ef0ab 100644 --- a/datasets/deafrica-sentinel-2-c1.yaml +++ b/datasets/deafrica-sentinel-2-c1.yaml @@ -3,7 +3,9 @@ Description: | The Sentinel-2 mission is part of the European Union Copernicus programme for Earth observations. 
Sentinel-2 consists of twin satellites, Sentinel-2A (launched 23 June 2015) and Sentinel-2B (launched 7 March 2017). The two satellites have the same orbit, but 180° apart for optimal coverage and data delivery. Their combined data is used in the Digital Earth Africa Sentinel-2 product. Together, they cover all Earth’s land surfaces, large islands, inland and coastal waters every 3-5 days. Sentinel-2 data is tiered by level of pre-processing. Level-0, Level-1A and Level-1B data contain raw data from the satellites, with little to no pre-processing. Level-1C data is surface reflectance measured at the top of the atmosphere. This is processed using the Sen2Cor algorithm to give Level-2A, the bottom-of-atmosphere reflectance (Obregón et al, 2019). Level-2A data is the most ideal for research activities as it allows further analysis without applying additional atmospheric corrections. - The Digital Earth Africa Sentinel-2 Level-2A Surface Reflectance Collection 1 dataset contains Level-2A data of the African continent. Digital Earth Africa Sentinel-2 Level-2A Surface Reflectance Collection 1 is the reprocessing of Digital Earth Africa Sentinel-2 product to correct some of the quality masking, etc.Digital Earth Africa does not host any lower-level Sentinel-2 data. + Digital Earth Africa Sentinel-2 Level-2A Surface Reflectance Collection 1 is the Sentinel-2 product processed for enhanced calibration and consistent time series between Sentinel-2A and Sentinel-2B. Digital Earth Africa does not host any lower-level Sentinel-2 data. + + Note that this data is a subset of the Sentinel-2 COGs dataset. 
Documentation: https://docs.digitalearthafrica.org/en/latest/data_specs/Sentinel-2_Level-2A_specs.html Contact: helpdesk@digitalearthafrica.org From 0de66f434214d1e0e6f3a4dfa118ac7d92f69f40 Mon Sep 17 00:00:00 2001 From: Keisuke Sehara Date: Mon, 14 Apr 2025 16:47:39 +0900 Subject: [PATCH 025/751] Create braidyn-bc_cued-lever-pull.yaml --- datasets/braidyn-bc_cued-lever-pull.yaml | 51 ++++++++++++++++++++++++ 1 file changed, 51 insertions(+) create mode 100644 datasets/braidyn-bc_cued-lever-pull.yaml diff --git a/datasets/braidyn-bc_cued-lever-pull.yaml b/datasets/braidyn-bc_cued-lever-pull.yaml new file mode 100644 index 000000000..34dc9fd3a --- /dev/null +++ b/datasets/braidyn-bc_cued-lever-pull.yaml @@ -0,0 +1,51 @@ +Name: "BraiDyn-BC: Cued lever-pull task dataset" +Description: | + The BraiDyn-BC (Brain Dynamics underlying emergence of Behavioral Change) Database offers an extensive, multimodal dataset that links + wide-field calcium imaging of the mouse neocortex to comprehensive behavioral measurements during a behavioral task. + As one of the contents in this database, we newly provide a dataset that includes 15 sessions spanning two weeks of motor skill learning, + in which 25 mice were trained to pull a lever to obtain water rewards. + Simultaneous high-speed videography captures body, facial, and eye movements, and environmental parameters are monitored. + The dataset also features resting-state cortical activity and sensory-evoked responses, enhancing its utility for both learning-related and + sensory-driven neural dynamics studies. + Data are formatted in accordance with the Neurodata Without Borders (NWB) standard, ensuring compatibility with existing analysis tools and + adherence to the FAIR principles. + This resource enables in-depth investigations into the neural mechanisms underlying behavior and learning. 
+ The platform encourages collaborative research, supporting the exploration of rapid within-session learning effects, long-term behavioral adaptations, and neural circuit dynamics. +Documentation: https://doi.org/10.1101/2025.02.03.631599 +Contact: "Ken Nakae (ken.nakae@gmail.com)" +ManagedBy: "[BraiDyn-BC Database Project](https://boatneck-weeder-7b7.notion.site/BraiDyn-BC-Database-303cf08c89f94d81bb2eaed4c3c50345)" +UpdateFrequency: NA +Tags: + - mouse + - behavior + - head fixation + - wide-field calcium imaging + - high-speed videography + - motor-skill learning + - operant conditioning + - sensory mapping + - behavioral tracking +License: Creative Commons Attribution 4.0 International (CC-BY 4.0) +Resources: + - Description: + ARN: + Region: + Type: + Explore: +DataAtWork: + Tutorials: + - Title: Detailed usage tutorials on Google Colab + URL: https://drive.google.com/drive/folders/1QciTJd3tXkEGhz6782czB2dEO3fafm8M + AuthorName: Keisuke Sehara + AuthorURL: https://orcid.org/0000-0003-4368-8143 + Tools & Applications: + - Title: A set of libraries used for generating the dataset + URL: https://github.com/BraiDyn-BC/bdbc-data-pipeline + AuthorName: Keisuke Sehara, Ryo Aoki, Shoya Sugimoto + Publications: + - Title: A multimodal dataset linking wide-field calcium imaging to behavior changes in mice during an operant lever-pull task + URL: https://doi.org/10.1101/2025.02.03.631599 + AuthorName: Kondo M, Sehara K, Harukuni R, Aoki R, Sugimoto S, Tanaka YR, Matsuzaki M, Nakae K + AuthorURL: +ADXCategories: + - Healthcare & Life Sciences Data From 6fb852736b9b47d6a19c84bd5f43978bbafa9258 Mon Sep 17 00:00:00 2001 From: Taylor Grafft Date: Tue, 15 Apr 2025 16:12:45 -0500 Subject: [PATCH 026/751] Adding brainlife and apex datasets --- datasets/apex.yaml | 48 ++++++++++++++++++++++++++++++++++ datasets/brainlife.yaml | 57 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 105 insertions(+) create mode 100644 datasets/apex.yaml create mode 100644 
datasets/brainlife.yaml diff --git a/datasets/apex.yaml b/datasets/apex.yaml new file mode 100644 index 000000000..b26dfe386 --- /dev/null +++ b/datasets/apex.yaml @@ -0,0 +1,48 @@ +Name: APEX +Description: > + The BRAIN Initiative Connectivity Across Scales (CONNECTS) program is working to create detailed maps of brain + wiring across different species and scales, using advanced imaging technologies. + APEX supports this effort by serving as a central hub that brings together and coordinates data and tools + from research focused on brain connectivity in humans and animals. Together, these efforts aim to improve our + understanding of how the brain is structured and functions. +Documentation: https://brainlife.io +Contact: brainlifeio@gmail.com +ManagedBy: "[Brainlife Team](https://brainlife.io/team/)" +UpdateFrequency: New datasets are added monthly +Tags: + - neuroscience + - neuroimaging + - microscopy + - zarr + - metadata + - machine learning + - infrastructure + - json + - imaging + - brain images + - brain models + - analysis ready data + - nifti +License: +Citation: +Resources: + - Description: All APEX datasets are available for download + ARN: + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Brainlife AWS Tutorials + URL: https://brainlife.io/docs/tutorial/aws-brainlife + AuthorName: Brainlife + AuthorURL: https://brainlife.io + Tools & Applications: + - Title: Brainlife Web App + URL: https://brainlife.io + AuthorName: Brainlife + AuthorURL: https://brainlife.io + - Title: Brainlife CLI (Command Line Interface) + URL: https://github.com/brainlife/cli + AuthorName: Brainlife + AuthorURL: https://github.com/brainlife/cli + Publications: diff --git a/datasets/brainlife.yaml b/datasets/brainlife.yaml new file mode 100644 index 000000000..cce5acc72 --- /dev/null +++ b/datasets/brainlife.yaml @@ -0,0 +1,57 @@ +Name: Brainlife +Description: > + brainlife.io provides a large collection of open-access neuroscience datasets, primarily 
focused on human brain imaging. + These datasets include diffusion MRI (dMRI), structural MRI (T1w, T2w), functional MRI (fMRI), and other neuroimaging modalities. + Many datasets are derived from well-known studies and organized in the BIDS (Brain Imaging Data Structure) format, making them + easy to integrate into standardized processing pipelines. + The data is curated to support research in connectomics, brain development, neurodegenerative diseases, and machine learning + applications in neuroscience. +Documentation: https://brainlife.io +Contact: brainlifeio@gmail.com +ManagedBy: "[Brainlife Team](https://brainlife.io/team/)" +UpdateFrequency: New datasets are added daily +Tags: + - neuroscience + - neuroimaging + - microscopy + - zarr + - metadata + - machine learning + - infrastructure + - json + - imaging + - brain images + - brain models + - analysis ready data + - nifti +License: +Citation: +Resources: + - Description: All brainlife datasets are available for download + ARN: + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Brainlife AWS Tutorials + URL: https://brainlife.io/docs/tutorial/aws-brainlife + AuthorName: Brainlife + AuthorURL: https://brainlife.io + Tools & Applications: + - Title: Brainlife Web App + URL: https://brainlife.io + AuthorName: Brainlife + AuthorURL: https://brainlife.io + - Title: Brainlife CLI (Command Line Interface) + URL: https://github.com/brainlife/cli + AuthorName: Brainlife + AuthorURL: https://github.com/brainlife/cli + Publications: + - Title: Brainlife--A decentrailized and open-source cloud platform to support neuroscience research (Nature.com) + URL: https://www.nature.com/articles/s41592-024-02237-2 + - Title: Brainlife--A decentralized and open-source cloud platform to support neuroscience research (NIH) + URL: https://pmc.ncbi.nlm.nih.gov/articles/PMC10274934/ + - Title: New cloud-based tool accelerates research on conditions such as dementia, sports concussion (IU News) + URL: 
https://pmc.ncbi.nlm.nih.gov/articles/PMC10274934/ + AuthorName: Kevin Fryling + AuthorURL: https://news.iu.edu/live/profiles/2003-kevin-fryling From 77fbd7a0617858f65322609551ca24e3d6f8366c Mon Sep 17 00:00:00 2001 From: Taylor Grafft Date: Tue, 15 Apr 2025 16:56:27 -0500 Subject: [PATCH 027/751] Adding ARN and updating contact --- datasets/apex.yaml | 4 ++-- datasets/brainlife.yaml | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/datasets/apex.yaml b/datasets/apex.yaml index b26dfe386..fff71886d 100644 --- a/datasets/apex.yaml +++ b/datasets/apex.yaml @@ -6,7 +6,7 @@ Description: > from research focused on brain connectivity in humans and animals. Together, these efforts aim to improve our understanding of how the brain is structured and functions. Documentation: https://brainlife.io -Contact: brainlifeio@gmail.com +Contact: brainlife.io@gmail.com ManagedBy: "[Brainlife Team](https://brainlife.io/team/)" UpdateFrequency: New datasets are added monthly Tags: @@ -27,7 +27,7 @@ License: Citation: Resources: - Description: All APEX datasets are available for download - ARN: + ARN: arn:aws:s3:::apex-brainlife Region: us-east-1 Type: S3 Bucket DataAtWork: diff --git a/datasets/brainlife.yaml b/datasets/brainlife.yaml index cce5acc72..b3a967086 100644 --- a/datasets/brainlife.yaml +++ b/datasets/brainlife.yaml @@ -7,7 +7,7 @@ Description: > The data is curated to support research in connectomics, brain development, neurodegenerative diseases, and machine learning applications in neuroscience. 
Documentation: https://brainlife.io -Contact: brainlifeio@gmail.com +Contact: brainlife.io@gmail.com ManagedBy: "[Brainlife Team](https://brainlife.io/team/)" UpdateFrequency: New datasets are added daily Tags: @@ -28,7 +28,7 @@ License: Citation: Resources: - Description: All brainlife datasets are available for download - ARN: + ARN: arn:aws:s3:::brainlife Region: us-east-1 Type: S3 Bucket DataAtWork: From d252b1742cf844cfd30beb00e1bce62bb5c1d467 Mon Sep 17 00:00:00 2001 From: Taylor Grafft Date: Wed, 16 Apr 2025 10:35:46 -0500 Subject: [PATCH 028/751] Adding License and updating Publications --- datasets/apex.yaml | 6 +++--- datasets/brainlife.yaml | 12 +++++++----- 2 files changed, 10 insertions(+), 8 deletions(-) diff --git a/datasets/apex.yaml b/datasets/apex.yaml index fff71886d..347c31f5c 100644 --- a/datasets/apex.yaml +++ b/datasets/apex.yaml @@ -1,4 +1,4 @@ -Name: APEX +Name: APEX-CONNECTS Description: > The BRAIN Initiative Connectivity Across Scales (CONNECTS) program is working to create detailed maps of brain wiring across different species and scales, using advanced imaging technologies. 
@@ -23,11 +23,11 @@ Tags: - brain models - analysis ready data - nifti -License: +License: '[CC BY](https://creativecommons.org/licenses/by/4.0)' Citation: Resources: - Description: All APEX datasets are available for download - ARN: arn:aws:s3:::apex-brainlife + ARN: arn:aws:s3:::apex-connects Region: us-east-1 Type: S3 Bucket DataAtWork: diff --git a/datasets/brainlife.yaml b/datasets/brainlife.yaml index b3a967086..75d7d0392 100644 --- a/datasets/brainlife.yaml +++ b/datasets/brainlife.yaml @@ -24,7 +24,7 @@ Tags: - brain models - analysis ready data - nifti -License: +License: '[CC BY](https://creativecommons.org/licenses/by/4.0)' Citation: Resources: - Description: All brainlife datasets are available for download @@ -47,11 +47,13 @@ DataAtWork: AuthorName: Brainlife AuthorURL: https://github.com/brainlife/cli Publications: - - Title: Brainlife--A decentrailized and open-source cloud platform to support neuroscience research (Nature.com) + - Title: Brainlife--A decentrailized and open-source cloud platform to support neuroscience research URL: https://www.nature.com/articles/s41592-024-02237-2 - - Title: Brainlife--A decentralized and open-source cloud platform to support neuroscience research (NIH) - URL: https://pmc.ncbi.nlm.nih.gov/articles/PMC10274934/ + AuthorName: Hayashi, S., Caron, B.A., Heinsfeld, A.S., Pestilli, F., et al. + - Title: Brainlife Paper - Human Connectome Young Adult - Full Dataset + URL: https://brainlife.io/pub/640a3f9dc538c16a826f9b1a + AuthorName: Caron, B.A., Pestilli, F., Heinsfeld, A.S., Hayashi, S., et al. 
- Title: New cloud-based tool accelerates research on conditions such as dementia, sports concussion (IU News) - URL: https://pmc.ncbi.nlm.nih.gov/articles/PMC10274934/ + URL: https://news.iu.edu/live/news/26119-new-cloud-based-tool-accelerates-research-on AuthorName: Kevin Fryling AuthorURL: https://news.iu.edu/live/profiles/2003-kevin-fryling From 4799547907dfb88dec62e4a338b1e38aa6e0e0f8 Mon Sep 17 00:00:00 2001 From: Taylor Grafft Date: Wed, 16 Apr 2025 11:47:48 -0500 Subject: [PATCH 029/751] Separating PRs --- datasets/brainlife.yaml | 59 ----------------------------------------- 1 file changed, 59 deletions(-) delete mode 100644 datasets/brainlife.yaml diff --git a/datasets/brainlife.yaml b/datasets/brainlife.yaml deleted file mode 100644 index 75d7d0392..000000000 --- a/datasets/brainlife.yaml +++ /dev/null @@ -1,59 +0,0 @@ -Name: Brainlife -Description: > - brainlife.io provides a large collection of open-access neuroscience datasets, primarily focused on human brain imaging. - These datasets include diffusion MRI (dMRI), structural MRI (T1w, T2w), functional MRI (fMRI), and other neuroimaging modalities. - Many datasets are derived from well-known studies and organized in the BIDS (Brain Imaging Data Structure) format, making them - easy to integrate into standardized processing pipelines. - The data is curated to support research in connectomics, brain development, neurodegenerative diseases, and machine learning - applications in neuroscience. 
-Documentation: https://brainlife.io -Contact: brainlife.io@gmail.com -ManagedBy: "[Brainlife Team](https://brainlife.io/team/)" -UpdateFrequency: New datasets are added daily -Tags: - - neuroscience - - neuroimaging - - microscopy - - zarr - - metadata - - machine learning - - infrastructure - - json - - imaging - - brain images - - brain models - - analysis ready data - - nifti -License: '[CC BY](https://creativecommons.org/licenses/by/4.0)' -Citation: -Resources: - - Description: All brainlife datasets are available for download - ARN: arn:aws:s3:::brainlife - Region: us-east-1 - Type: S3 Bucket -DataAtWork: - Tutorials: - - Title: Brainlife AWS Tutorials - URL: https://brainlife.io/docs/tutorial/aws-brainlife - AuthorName: Brainlife - AuthorURL: https://brainlife.io - Tools & Applications: - - Title: Brainlife Web App - URL: https://brainlife.io - AuthorName: Brainlife - AuthorURL: https://brainlife.io - - Title: Brainlife CLI (Command Line Interface) - URL: https://github.com/brainlife/cli - AuthorName: Brainlife - AuthorURL: https://github.com/brainlife/cli - Publications: - - Title: Brainlife--A decentrailized and open-source cloud platform to support neuroscience research - URL: https://www.nature.com/articles/s41592-024-02237-2 - AuthorName: Hayashi, S., Caron, B.A., Heinsfeld, A.S., Pestilli, F., et al. - - Title: Brainlife Paper - Human Connectome Young Adult - Full Dataset - URL: https://brainlife.io/pub/640a3f9dc538c16a826f9b1a - AuthorName: Caron, B.A., Pestilli, F., Heinsfeld, A.S., Hayashi, S., et al. 
- - Title: New cloud-based tool accelerates research on conditions such as dementia, sports concussion (IU News) - URL: https://news.iu.edu/live/news/26119-new-cloud-based-tool-accelerates-research-on - AuthorName: Kevin Fryling - AuthorURL: https://news.iu.edu/live/profiles/2003-kevin-fryling From 3fd7a924e9ec276de2c91f3633e6d5ddebb1ed15 Mon Sep 17 00:00:00 2001 From: Taylor Grafft Date: Wed, 16 Apr 2025 16:33:56 -0500 Subject: [PATCH 030/751] Updating region --- datasets/apex.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/apex.yaml b/datasets/apex.yaml index 347c31f5c..e4d0f1a3e 100644 --- a/datasets/apex.yaml +++ b/datasets/apex.yaml @@ -28,7 +28,7 @@ Citation: Resources: - Description: All APEX datasets are available for download ARN: arn:aws:s3:::apex-connects - Region: us-east-1 + Region: us-east-2 Type: S3 Bucket DataAtWork: Tutorials: From 841f2809fa81e623a77e6f1358ed30b12d6d3dae Mon Sep 17 00:00:00 2001 From: Pascal Notin Date: Mon, 21 Apr 2025 17:24:52 -0400 Subject: [PATCH 031/751] Create proteingym.yaml --- datasets/proteingym.yaml | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 datasets/proteingym.yaml diff --git a/datasets/proteingym.yaml b/datasets/proteingym.yaml new file mode 100644 index 000000000..f7bf80912 --- /dev/null +++ b/datasets/proteingym.yaml @@ -0,0 +1,40 @@ +Name: ProteinGym +Description: ProteinGym is a benchmark suite for assessing the performance of protein fitness prediction and design models. It comprises a large curated collection of 200+ high-throughput experimental assays (~3M mutated sequences), as well clinical annotations from experts about the pathogenicity of mutants in over 3k human genes. 
+Documentation: https://github.com/OATML-Markslab/ProteinGym/blob/main/README.md +Contact: pascal_notin@hms.harvard.edu +ManagedBy: Harvard Medical School; University of Oxford +UpdateFrequency: Quarterly +Tags: + - protein + - bioinformatics + - biology + - deep learning + - machine learning +License: MIT License +Resources: + - Description: + ARN: + Region: + Type: + Explore: +DataAtWork: + Tutorials: + - Title: Scoring ProteinGym assays with TranceptEVE + URL: https://github.com/OATML-Markslab/ProteinGym/blob/main/notebooks/TranceptEVE_example.ipynb + NotebookURL: https://github.com/OATML-Markslab/ProteinGym/blob/main/notebooks/TranceptEVE_example.ipynb + AuthorName: Daniel Ritter + AuthorURL: https://danieldritter.github.io/ + Services: + Tools & Applications: + - Title: ProteinGym website + URL: https://proteingym.org/ + AuthorName: Pascal Notin & Daniel Ritter + AuthorURL: + Publications: + - Title: ProteinGym: Large-Scale Benchmarks for Protein Fitness Prediction and Design + URL: https://papers.nips.cc/paper_files/paper/2023/hash/cac723e5ff29f65e3fcbb0739ae91bee-Abstract-Datasets_and_Benchmarks.html + AuthorName: Pascal Notin, Aaron Kollasch, Daniel Ritter, Lood van Niekerk, Steffanie Paul, Han Spinner, Nathan Rollins, Ada Shaw, Rose Orenbuch, Ruben Weitzman, Jonathan Frazer, Mafalda Dias, Dinko Franceschi, Yarin Gal, Debora Marks + AuthorURL: https://www.pascalnotin.com/ +DeprecatedNotice: +ADXCategories: + - From 0e2ba822faab52ae5334e37d01238a4aee35794d Mon Sep 17 00:00:00 2001 From: devapriyakumar <81041093+devapriyakumar@users.noreply.github.com> Date: Sat, 26 Apr 2025 16:59:51 +0530 Subject: [PATCH 032/751] Update AI3.yaml We have removed "www" from the license link as per changes made in the lab website --- datasets/AI3.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/AI3.yaml b/datasets/AI3.yaml index 1abe6f3f0..f318e1917 100644 --- a/datasets/AI3.yaml +++ b/datasets/AI3.yaml @@ -29,7 +29,7 @@ Tags: - Protein - 
Molecular Dynamics -License: https://www.devalab.in/AI3.html +License: https://devalab.in/AI3.html Resources: - Description: Coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. From c5814f4f1a7756b5d01d84a3f9fdbe355220f2aa Mon Sep 17 00:00:00 2001 From: Eyal Ben-Hur Date: Tue, 20 May 2025 20:51:12 +0300 Subject: [PATCH 033/751] Add files via upload Updated licence --- datasets/huj-herbarium.yaml | 50 +++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) create mode 100644 datasets/huj-herbarium.yaml diff --git a/datasets/huj-herbarium.yaml b/datasets/huj-herbarium.yaml new file mode 100644 index 000000000..3e736bc0d --- /dev/null +++ b/datasets/huj-herbarium.yaml @@ -0,0 +1,50 @@ +Name: National Herbarium of Israel +Description: + Our collection encompasses approximately one million vascular plant specimens from the Mediterranean and Middle East biodiversity hotspot, representing flora from Israel, Jordan, Hermon, Sinai, Egypt, the Caucasus, Arabia, North Africa, and throughout the Mediterranean basin. This scientifically significant repository includes published voucher specimens, original specimens used for "Flora Palaestina" illustrations, and critical references for the Israeli gene bank collections. + The ongoing digitization process captures high-resolution images of each specimen while systematically incorporating label information into our computerized catalog. This virtual herbarium will democratize access to these valuable botanical resources, enabling global researchers to examine specimens in exceptional detail from anywhere in the world. + Beyond preservation, this digital transformation unlocks new research possibilities through computational analysis of both visual specimen characteristics and associated metadata. 
The dataset will serve as a foundational resource for advancing botanical research, ecological modeling, taxonomic investigation, historical analysis, and numerous other scientific disciplines concerned with plant biodiversity in this ecologically and historically significant region. + +Documentation: +Contact: Eyal.Ben-Hur@mail.huji.ac.il +ManagedBy: National Natural History Collections, The Hebrew University of Jerusalem +UpdateFrequency: Monthly +Tags: + - biology + - life sciences + - biodiversity + - environmental + - climate + - digital preservation + - imaging + - image processing + - aws-pds + +License: CC-BY-SA 4.0 +Citation: Vascular plants - Herbarium of The National Natural History Collections was accessed on DATE from https://registry.opendata.aws/huj-herbarium. +Resources: + - Description: HUJ Herbarium Collection Images + ARN: + Region: Israel (Tel Aviv) il-central-1 + Type: S3 bucket + Explore: +DataAtWork: + Tutorials: + - Title: How to use AWS S3 bucket to explore our puclic images dataset + URL: https://bit.ly/HUJimages + NotebookURL: + AuthorName: Eyal Ben-Hur + AuthorURL: + Services: + Tools & Applications: + - Title: + URL: + AuthorName: + AuthorURL: + Publications: + - Title: + URL: + AuthorName: + AuthorURL: +DeprecatedNotice: +ADXCategories: + - \ No newline at end of file From cd2962f97f1be8a03def39586f32ebe8528e60ae Mon Sep 17 00:00:00 2001 From: Eyal Ben-Hur Date: Tue, 20 May 2025 22:53:52 +0300 Subject: [PATCH 034/751] Update huj-herbarium.yaml --- datasets/huj-herbarium.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/huj-herbarium.yaml b/datasets/huj-herbarium.yaml index 3e736bc0d..9958056c7 100644 --- a/datasets/huj-herbarium.yaml +++ b/datasets/huj-herbarium.yaml @@ -24,12 +24,12 @@ Citation: Vascular plants - Herbarium of The National Natural History Collection Resources: - Description: HUJ Herbarium Collection Images ARN: - Region: Israel (Tel Aviv) il-central-1 + Region: il-central-1 Type: S3 
bucket Explore: DataAtWork: Tutorials: - - Title: How to use AWS S3 bucket to explore our puclic images dataset + - Title: How to use AWS S3 bucket to explore our public images dataset URL: https://bit.ly/HUJimages NotebookURL: AuthorName: Eyal Ben-Hur @@ -47,4 +47,4 @@ DataAtWork: AuthorURL: DeprecatedNotice: ADXCategories: - - \ No newline at end of file + - From 0242169acba2a578531c7aa176bd52b09633fc3b Mon Sep 17 00:00:00 2001 From: xhagrg Date: Wed, 21 May 2025 15:46:52 -0500 Subject: [PATCH 035/751] Add new dataset for Surya-bench. --- datasets/surya-bench.yml | 20 + tags.yaml | 903 ++++++++++++++++++++------------------- 2 files changed, 472 insertions(+), 451 deletions(-) create mode 100644 datasets/surya-bench.yml diff --git a/datasets/surya-bench.yml b/datasets/surya-bench.yml new file mode 100644 index 000000000..74cd19d5d --- /dev/null +++ b/datasets/surya-bench.yml @@ -0,0 +1,20 @@ +Name: Surya Bench +Description: | + This dataset provides machine learning (ML)-ready solar data curated from NASA’s Solar Dynamics Observatory (SDO), covering observations from May 13, 2010, to July 31, 2024. It includes Level-1.5 processed data from: + - Atmospheric Imaging Assembly (AIA): + - Helioseismic and Magnetic Imager (HMI): + The dataset is designed to facilitate large-scale ML applications in heliophysics, such as solar activity forecasting, unsupervised representation learning, and scientific foundation model development. +Documentation: https://huggingface.co/datasets/nasa-impact/Surya-bench +Contact: sujit.roy@nasa.gov +ManagedBy: NASA IMPACT +UpdateFrequency: This is the final version of the Dataset. +Tags: + - machine learning + - NASA Solar Dynamics Observatory (SDO) +License: | + Creative Commons Attribution 4.0 International. 
+Resources: + - Description: Surya Bench + ARN: arn:aws:s3:::nasa-surya-bench + Region: us-west-2 + Type: S3 Bucket diff --git a/tags.yaml b/tags.yaml index 453bfa12d..c00b7e1e4 100644 --- a/tags.yaml +++ b/tags.yaml @@ -1,451 +1,452 @@ -- 1940 census -- 1950 census -- 2020 census -- acoustics -- activity detection -- activity recognition -- aerial imagery -- age -- agriculture -- air quality -- ai safety -- air temperature -- alchemical free energy calculations -- alphafold -- amazon.science -- amino acid -- analysis ready data -- analytics -- anomaly detection -- approximate monte carlo -- approximate monte carlo replicates -- archives -- array tomography -- art -- assembly -- astronomy -- atmosphere -- autism spectrum disorder -- automatic speech recognition -- autonomous racing -- autonomous vehicles -- auxiliary data -- aviation -- aws-pds -- bacteria -- bam -- bathymetry -- benchmark -- bias -- biodiversity -- bioinformatics -- biology -- biomolecular modeling -- biotech blueprint -- bitcoin -- blockchain -- brain images -- brain models -- breast cancer -- broadband -- broadcast ephemeris -- Caenorhabditis elegans -- calcium imaging -- cancer -- carbon -- cell biology -- cell imaging -- cell painting -- census -- ceos -- chemical biology -- chemistry -- cities -- civic -- classification -- climate -- climate model -- climate projections -- climate risk -- cloud computing -- CMIP5 -- CMIP6 -- coastal -- code completion -- cog -- collaborative filtering -- commerce -- complaints -- computational fluid dynamics -- computational pathology -- computed tomography -- computer forensics -- computer security -- computer vision -- conservation -- contamination -- Continuously Operating Reference Station (CORS) -- conversation data -- copper -- copyright monitoring -- coronavirus -- cover song identification -- COVID-19 -- cram -- cromwell -- cropland partitioning -- cryo electron tomography -- cryptocurrency -- CSI -- csv -- cultural preservation -- culture -- cyber 
security -- cyclone typhoon hurricane -- czi -- Danio rerio -- data assimilation -- datacenter -- deafrica -- decennial census -- deep learning -- demographic and housing characteristics file -- demographics -- demography -- denoising -- dhc -- dialog -- dicom -- differential privacy -- digital assets -- digital forensics -- digital pathology -- digital preservation -- disaster response -- disclosure avoidance -- distributional semantics -- drilling -- drifters -- Drosophila melanogaster -- earth observation -- earthquakes -- economics -- ecosystems -- education -- EEIO -- electricity -- electron microscopy -- electron tomography -- electrophysiology -- elevation -- email -- encyclopedic -- energy -- energy modeling -- environmental -- enzyme -- epigenomics -- ethereum -- ethnicity -- Eulerian -- evapotranspiration -- event camera -- events -- exploration -- extreme weather -- fast5 -- fasta -- fastq -- fewshot -- financial markets -- fisheries -- floods -- fluid dynamics -- fluorescence imaging -- foldingathome -- food security -- forecast -- free software -- gatk-sv -- gene expression -- GeneLab -- genetic -- genetic maps -- genome -- genome wide association study -- genomic -- genotyping -- geochemistry -- geology -- geopackage -- geophysics -- geoscience -- geospatial -- geothermal -- glioblastoma -- global -- global shutter camera -- GNSS -- golden retriever lifetime study -- governance -- government records -- government spending -- GPS -- grand-challenge.org -- graph -- green aviation -- ground water -- group quarters -- h5 -- hazard -- hazard indicator -- Hawkes Process -- hdf5 -- health -- high-throughput imaging -- hiring -- hispanic -- histopathology -- history -- Homo sapiens -- household type -- housing -- housing units -- HPC -- HYCOM -- hydrography -- hydrologic model -- hydrology -- ice -- image processing -- image-based profiling -- imaging -- IMU -- industrial -- industry -- information retrieval -- infrastructure -- internet -- interception loss 
-- intrusion detection -- ion channels -- irrigated cropland -- japanese -- json -- labeled -- Lagrangian -- land -- land cover -- land use -- last mile -- latino -- lidar -- life sciences -- light-sheet microscopy -- live song identification -- localization -- loftee -- logistics -- long read sequencing -- low-pressure turbine -- machine learning -- machine translation -- magnetic resonance imaging -- malware -- mammography -- mapping -- marine -- marine mammals -- marine navigation -- market data -- materials science -- media -- medical image computing -- medical imaging -- medicine -- MERS -- meta learning -- metadata -- metagenomics -- meteorological -- microbial genomics -- microbiome -- microcircuit modeling and simulation -- microdata -- microscopy -- mining -- mixed file dataset -- model -- molecular docking -- molecular dynamics -- molecule -- morphological reconstructions -- morris animal foundation -- movies -- msa -- multimedia -- Mus musculus -- museum -- music -- music features dataset -- music information retrieval -- music recognition -- nara -- NASA Center for Climate Simulation (NCCS) -- NASA SMD AI -- national archives catalog -- natural language processing -- natural resource -- near-surface air temperature -- near-surface relative humidity -- near-surface specific humidity -- netcdf -- network traffic -- neurobiology -- neuroimaging -- neurophysiology -- neuroscience -- nifti -- NOAA CORS Network (NCN) -- noisy measurements -- non-human primate -- nuclear magnetic resonance -- numerical particle -- object detection -- object tracking -- ocean circulation -- ocean currents -- ocean velocity -- ocean sea surface height -- ocean simulation -- oceans -- online shopping -- open source software -- opendap -- openfold -- optimization -- orbit -- organelle -- osm -- parquet -- pbi -- pediatric -- perception -- peril -- pharmaceutical -- physical -- physics -- planetary -- politics -- population -- population genetics -- post-processing -- precipitation 
-- privacy -- product comparison -- protein -- protein folding -- protein template -- race -- radar -- radiation -- radiology -- rainfed cropland -- ransomware -- Rattus norvegicus -- RDF -- recombination maps -- redistricting -- reference index -- regulatory -- relation-to-householder -- RINEX -- robotics -- routing -- RTK -- SARS -- SARS-CoV-2 -- satellite imagery -- scholarly communication -- schools -- scope 3 -- seafloor -- segmentation -- seismology -- sentiment -- sentinel-1 -- short read sequencing -- signal processing -- simulations -- simulation neuroscience -- single neuron models -- single year of age -- single-cell transcriptomics -- social media -- socioeconomic -- soil moisture -- solar -- source code -- space biology -- space weather -- SPARQL -- speaker identification -- speech processing -- speech recognition -- speech synthesis -- spend-based models -- sports -- sqlite -- stac -- statistics -- STRIDES -- structural biology -- structural birth defect -- structural variation -- subtitles -- supply chain -- surface water -- survey -- sustainability -- synthetic aperture radar -- synthetic data -- telecommunications -- temporal point process -- tertiary analysis -- text analysis -- tiles -- time series forecasting -- trading -- traffic -- transcriptomics -- transparency -- transportation -- turbulence -- txt -- ukraine -- urban -- us -- us-dc -- utilities -- variant annotation -- vcf -- vep -- video -- virus -- volumetric imaging -- voting age -- water -- weather -- web3 -- web archive -- whole exome sequencing -- whole genome sequencing -- wildlife -- word embeddings -- workload analysis -- x-ray -- x-ray crystallography -- x-ray microtomography -- x-ray tomography -- xml -- zarr +- 1940 census +- 1950 census +- 2020 census +- acoustics +- activity detection +- activity recognition +- aerial imagery +- age +- agriculture +- air quality +- ai safety +- air temperature +- alchemical free energy calculations +- alphafold +- amazon.science +- amino acid 
+- analysis ready data +- analytics +- anomaly detection +- approximate monte carlo +- approximate monte carlo replicates +- archives +- array tomography +- art +- assembly +- astronomy +- atmosphere +- autism spectrum disorder +- automatic speech recognition +- autonomous racing +- autonomous vehicles +- auxiliary data +- aviation +- aws-pds +- bacteria +- bam +- bathymetry +- benchmark +- bias +- biodiversity +- bioinformatics +- biology +- biomolecular modeling +- biotech blueprint +- bitcoin +- blockchain +- brain images +- brain models +- breast cancer +- broadband +- broadcast ephemeris +- Caenorhabditis elegans +- calcium imaging +- cancer +- carbon +- cell biology +- cell imaging +- cell painting +- census +- ceos +- chemical biology +- chemistry +- cities +- civic +- classification +- climate +- climate model +- climate projections +- climate risk +- cloud computing +- CMIP5 +- CMIP6 +- coastal +- code completion +- cog +- collaborative filtering +- commerce +- complaints +- computational fluid dynamics +- computational pathology +- computed tomography +- computer forensics +- computer security +- computer vision +- conservation +- contamination +- Continuously Operating Reference Station (CORS) +- conversation data +- copper +- copyright monitoring +- coronavirus +- cover song identification +- COVID-19 +- cram +- cromwell +- cropland partitioning +- cryo electron tomography +- cryptocurrency +- CSI +- csv +- cultural preservation +- culture +- cyber security +- cyclone typhoon hurricane +- czi +- Danio rerio +- data assimilation +- datacenter +- deafrica +- decennial census +- deep learning +- demographic and housing characteristics file +- demographics +- demography +- denoising +- dhc +- dialog +- dicom +- differential privacy +- digital assets +- digital forensics +- digital pathology +- digital preservation +- disaster response +- disclosure avoidance +- distributional semantics +- drilling +- drifters +- Drosophila melanogaster +- earth observation 
+- earthquakes +- economics +- ecosystems +- education +- EEIO +- electricity +- electron microscopy +- electron tomography +- electrophysiology +- elevation +- email +- encyclopedic +- energy +- energy modeling +- environmental +- enzyme +- epigenomics +- ethereum +- ethnicity +- Eulerian +- evapotranspiration +- event camera +- events +- exploration +- extreme weather +- fast5 +- fasta +- fastq +- fewshot +- financial markets +- fisheries +- floods +- fluid dynamics +- fluorescence imaging +- foldingathome +- food security +- forecast +- free software +- gatk-sv +- gene expression +- GeneLab +- genetic +- genetic maps +- genome +- genome wide association study +- genomic +- genotyping +- geochemistry +- geology +- geopackage +- geophysics +- geoscience +- geospatial +- geothermal +- glioblastoma +- global +- global shutter camera +- GNSS +- golden retriever lifetime study +- governance +- government records +- government spending +- GPS +- grand-challenge.org +- graph +- green aviation +- ground water +- group quarters +- h5 +- hazard +- hazard indicator +- Hawkes Process +- hdf5 +- health +- high-throughput imaging +- hiring +- hispanic +- histopathology +- history +- Homo sapiens +- household type +- housing +- housing units +- HPC +- HYCOM +- hydrography +- hydrologic model +- hydrology +- ice +- image processing +- image-based profiling +- imaging +- IMU +- industrial +- industry +- information retrieval +- infrastructure +- internet +- interception loss +- intrusion detection +- ion channels +- irrigated cropland +- japanese +- json +- labeled +- Lagrangian +- land +- land cover +- land use +- last mile +- latino +- lidar +- life sciences +- light-sheet microscopy +- live song identification +- localization +- loftee +- logistics +- long read sequencing +- low-pressure turbine +- machine learning +- machine translation +- magnetic resonance imaging +- malware +- mammography +- mapping +- marine +- marine mammals +- marine navigation +- market data +- 
materials science +- media +- medical image computing +- medical imaging +- medicine +- MERS +- meta learning +- metadata +- metagenomics +- meteorological +- microbial genomics +- microbiome +- microcircuit modeling and simulation +- microdata +- microscopy +- mining +- mixed file dataset +- model +- molecular docking +- molecular dynamics +- molecule +- morphological reconstructions +- morris animal foundation +- movies +- msa +- multimedia +- Mus musculus +- museum +- music +- music features dataset +- music information retrieval +- music recognition +- nara +- NASA Center for Climate Simulation (NCCS) +- NASA SMD AI +- NASA Solar Dynamics Observatory (SDO) +- national archives catalog +- natural language processing +- natural resource +- near-surface air temperature +- near-surface relative humidity +- near-surface specific humidity +- netcdf +- network traffic +- neurobiology +- neuroimaging +- neurophysiology +- neuroscience +- nifti +- NOAA CORS Network (NCN) +- noisy measurements +- non-human primate +- nuclear magnetic resonance +- numerical particle +- object detection +- object tracking +- ocean circulation +- ocean currents +- ocean velocity +- ocean sea surface height +- ocean simulation +- oceans +- online shopping +- open source software +- opendap +- openfold +- optimization +- orbit +- organelle +- osm +- parquet +- pbi +- pediatric +- perception +- peril +- pharmaceutical +- physical +- physics +- planetary +- politics +- population +- population genetics +- post-processing +- precipitation +- privacy +- product comparison +- protein +- protein folding +- protein template +- race +- radar +- radiation +- radiology +- rainfed cropland +- ransomware +- Rattus norvegicus +- RDF +- recombination maps +- redistricting +- reference index +- regulatory +- relation-to-householder +- RINEX +- robotics +- routing +- RTK +- SARS +- SARS-CoV-2 +- satellite imagery +- scholarly communication +- schools +- scope 3 +- seafloor +- segmentation +- seismology +- 
sentiment +- sentinel-1 +- short read sequencing +- signal processing +- simulations +- simulation neuroscience +- single neuron models +- single year of age +- single-cell transcriptomics +- social media +- socioeconomic +- soil moisture +- solar +- source code +- space biology +- space weather +- SPARQL +- speaker identification +- speech processing +- speech recognition +- speech synthesis +- spend-based models +- sports +- sqlite +- stac +- statistics +- STRIDES +- structural biology +- structural birth defect +- structural variation +- subtitles +- supply chain +- surface water +- survey +- sustainability +- synthetic aperture radar +- synthetic data +- telecommunications +- temporal point process +- tertiary analysis +- text analysis +- tiles +- time series forecasting +- trading +- traffic +- transcriptomics +- transparency +- transportation +- turbulence +- txt +- ukraine +- urban +- us +- us-dc +- utilities +- variant annotation +- vcf +- vep +- video +- virus +- volumetric imaging +- voting age +- water +- weather +- web3 +- web archive +- whole exome sequencing +- whole genome sequencing +- wildlife +- word embeddings +- workload analysis +- x-ray +- x-ray crystallography +- x-ray microtomography +- x-ray tomography +- xml +- zarr From 0b4abfc03c934c114ff0f7319a8e4d4ce1b14856 Mon Sep 17 00:00:00 2001 From: xhagrg Date: Wed, 21 May 2025 15:50:23 -0500 Subject: [PATCH 036/751] Update with citation. --- datasets/surya-bench.yml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/surya-bench.yml b/datasets/surya-bench.yml index 74cd19d5d..a0cb51dc0 100644 --- a/datasets/surya-bench.yml +++ b/datasets/surya-bench.yml @@ -13,6 +13,8 @@ Tags: - NASA Solar Dynamics Observatory (SDO) License: | Creative Commons Attribution 4.0 International. +Citation: > + Roy, S., Singh, T., Freitag, M., Schmude, J., Lal, R., Hegde, D., Ranjan, S., Lin, A., Gaur, V., Vos, E.E. and Ghosal, R., 2024. 
AI Foundation Model for Heliophysics: Applications, Design, and Implementation. arXiv preprint arXiv:2410.10841. Resources: - Description: Surya Bench ARN: arn:aws:s3:::nasa-surya-bench From d2af6acaa24967c8d28e95fbfd99077478f8ef2b Mon Sep 17 00:00:00 2001 From: xhagrg Date: Wed, 21 May 2025 15:52:36 -0500 Subject: [PATCH 037/751] Use proper name. --- datasets/{surya-bench.yml => surya-bench.yaml} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename datasets/{surya-bench.yml => surya-bench.yaml} (100%) diff --git a/datasets/surya-bench.yml b/datasets/surya-bench.yaml similarity index 100% rename from datasets/surya-bench.yml rename to datasets/surya-bench.yaml From 2ecad521a9107e750fa95438581bb94170c380b7 Mon Sep 17 00:00:00 2001 From: "ceggers@rsna.org" Date: Thu, 29 May 2025 10:23:31 -0500 Subject: [PATCH 038/751] Submitting Dataset From RSNA RSNA Lumbar Spine Degenerative Classification Dataset (RSNA-LSDD) --- ...e-degenerative-classification-dataset.yaml | 23 +++++++++++++++++++ 1 file changed, 23 insertions(+) create mode 100644 datasets/rsna-lumbar-spine-degenerative-classification-dataset.yaml diff --git a/datasets/rsna-lumbar-spine-degenerative-classification-dataset.yaml b/datasets/rsna-lumbar-spine-degenerative-classification-dataset.yaml new file mode 100644 index 000000000..0c3ae14e0 --- /dev/null +++ b/datasets/rsna-lumbar-spine-degenerative-classification-dataset.yaml @@ -0,0 +1,23 @@ +Name: RSNA Lumbar Spine Degenerative Classification Dataset (RSNA-LSDD) +Description: "The Radiological Society of North America Lumbar Spine Degenerative Classification dataset (RSNA-LSDD) is a collection of over 2,600 magnetic resonance imaging (MR) scans of the lumbar spine annotated by a cohort of about 60 volunteer radiologists recruited by the RSNA, the American Society for Spine Radiology and the American Society of Neuroradiology to identify the location and severity of five degenerative conditions across the five intervertebral disc levels (L1/L2, L2/L3, 
L3/L4, L4/L5, and L5/S1). The imaging data, comprising over 8,500 image series (Sagittal “T2”, Axial T2 and Sagittal T1), was provided by twelve institutions from across the globe. Initially compiled in 2024 for the RSNA Lumbar Spine Degenerative Classification AI Challenge hosted on Kaggle competition platform (https://www.kaggle.com/competitions/rsna-2024-lumbar-spine-degenerative-classification), it represents the largest publicly available collection of its kind. Additional information on the dataset and how to make use of it is provided in the Data Resource Publication listed below, as well as on the Kaggle competition website, which also provides access to models developed during the competition." +Documentation: https://github.com/RSNA/AI-Challenge-Data/wiki/RSNA-Lumbar-Spine-Degenerative-Classification-Dataset +Contact: informatics@rsna.org +ManagedBy: 'Radiological Society of North America (https://www.rsna.org/)' +UpdateFrequency: The dataset may be updated with additional or corrected data on a need-to-update basis. +Tags: + - aws-pds + - radiology + - medical imaging + - medical image computing + - machine learning + - computer vision + - csv + - labeled + - life sciences +License: "You may access and use these de-identified imaging datasets and annotations (“the data”) for non-commercial purposes only, including academic research and education, as long as you agree to abide by the following provisions: Not to make any attempt to identify or contact any individual(s) who may be the subjects of the data. If you share or re-distribute the data in any form, include a citation to the “Radiological Society of North America Screening Mammography Breast Cancer Detection (RSNA-SMBC) Dataset, November 2022” [https://doi.org/10.1148/dataset.smbc.2024]." 
+Resources: + - Description: Zip archive containing DCM and CSV files + ARN: arn:aws:s3:::lumbar-spine-degenerative-classification + Region: us-west-2 + Type: S3 Bucket + ControlledAccess: https://mira.rsna.org/dataset/6 \ No newline at end of file From e991c166ba289b93e206d81eae64f2e00e222c03 Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Mon, 9 Jun 2025 13:33:38 -0400 Subject: [PATCH 039/751] graf reforecast update --- datasets/graf-reforecast.yaml | 41 +++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) create mode 100644 datasets/graf-reforecast.yaml diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml new file mode 100644 index 000000000..79fa51194 --- /dev/null +++ b/datasets/graf-reforecast.yaml @@ -0,0 +1,41 @@ +Name: Graf Reforecast +Description: +Documentation: +Contact: +ManagedBy: "[The Weather Company](https://www.weathercompany.com/)" +UpdateFrequency: One time push only +Tags: + - atmosphere + - forecast + - geoscience + - geospatial + - model + - near-surface air temperature + - near-surface relative humidity + - zar + - weather +License: "[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" +Citation: +Resources: + - Description: + ARN: + Region: us-west-1 + Type: S3 Bucket + Explore: +DataAtWork: + Tutorials: + - Title: + URL: + NotebookURL: + AuthorName: + AuthorURL: + Services: + Tools & Applications: + - Title: + URL: + AuthorName: + AuthorURL: + Publications: + - Title: Global reforecasts from MPAS “GRAF” with mesh refinement over the US and Europe + URL: https://cesoc.net/wp-content/uploads/2024/08/GRAF-reforecast-Hamill-CESOC24.pdf + AuthorName: Thomas M. 
Hamill, Raghu Raj Prasanna Kumar, Karthik Kashinath, Carl Ponder, Mike Pritchard, Tao Ge, Akshay Subramanian, Jaideep Pathak, John Wong, Brett Wilt, Peter Neilley From c18248fc630998e49c91468573074ce3f4cd3355 Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Mon, 9 Jun 2025 13:35:15 -0400 Subject: [PATCH 040/751] Update graf-reforecast.yaml --- datasets/graf-reforecast.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index 79fa51194..5253d3c86 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -17,9 +17,9 @@ Tags: License: "[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" Citation: Resources: - - Description: - ARN: - Region: us-west-1 + - Description: GRAF Reforecast dataset + ARN: arn:aws:s3:::twc-graf-reforecast + Region: us-west-2 Type: S3 Bucket Explore: DataAtWork: From a6993e2c897384ca74ebdd6c7f64fe69c8723c15 Mon Sep 17 00:00:00 2001 From: Matt McCormick Date: Wed, 18 Sep 2024 17:07:30 -0400 Subject: [PATCH 041/751] Add OME-Zarr Open SciVis Datasets The OME-Zarr Open SciVis Datasets project provides the Open SciVis Dataset in a chunked, multi-scale format, encodes metadata in JSON according to the OME-Zarr specification, and hosts the datasets on AWS S3 through the AWS Open Data Program, aiming to serve as a web-based resource for the scientific visualization community to enhance reproducibility and facilitate testing and development of OME-Zarr tools. 
--- datasets/ome-zarr-open-scivis.yaml | 48 ++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) create mode 100644 datasets/ome-zarr-open-scivis.yaml diff --git a/datasets/ome-zarr-open-scivis.yaml b/datasets/ome-zarr-open-scivis.yaml new file mode 100644 index 000000000..d27470674 --- /dev/null +++ b/datasets/ome-zarr-open-scivis.yaml @@ -0,0 +1,48 @@ +Name: OME-Zarr Open SciVis Datasets +Description: This project provides the Open SciVis Datasets in a chunked, highly-compressed, multi-scale format, encodes metadata in JSON according to the OME-Zarr specification, and hosts the datasets on AWS S3 through the AWS Open Data Program, aiming to serve as a web-based resource for the scientific visualization community to enhance reproducibility and facilitate testing and development of OME-Zarr tools. +Documentation: https://github.com/InsightSoftwareConsortium/OMEZarrOpenSciVisDatasets +Contact: "Matt McCormick " +ManagedBy: "NumFOCUS" +UpdateFrequency: On a biannual basis we update the datasets and sync with OME-Zarr standards. 
+Tags: + - biology + - image processing + - imaging + - neuroimaging + - neuroscience + - life sciences + - magnetic resonance imaging + - computed tomography + - volumetric imaging + - zarr +License: CC-BY-4.0 unless otherwise specified +Resources: + - Description: OME-Zarr Open SciVis Datasets + ARN: arn:aws:s3:::ome-zarr-scivis + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Read and Visualize in Python + URL: https://github.com/InsightSoftwareConsortium/OMEZarrOpenSciVisDatasets?tab=readme-ov-file#usage + AuthorName: Matt McCormick + AuthorURL: https://github.com/thewtex + Tools & Applications: + - Title: A list of tools and libraries with OME-Zarr support + URL: https://ngff.openmicroscopy.org/tools/index.html + AuthorName: NGFF community + AuthorURL: https://github.com/ome/ngff + Publications: + - Title: "OME-NGFF: a next-generation file format for expanding bioimaging data-access strategies" + URL: https://www.nature.com/articles/s41592-021-01326-w + AuthorName: Josh Moore, Chris Allan, Sébastien Besson, Jean-Marie Burel, Erin Diel, David Gault, Kevin Kozlowski, Dominik Lindner, Melissa Linkert, Trevor Manz, Will Moore, Constantin Pape, Christian Tischer & Jason R. Swedlow + - Title: "OME-Zarr: a cloud-optimized bioimaging file format with international community support" + URL: https://link.springer.com/article/10.1007/s00418-023-02209-1 + AuthorName: Josh Moore, Daniela Basurto-Lozada, Sébastien Besson, John Bogovic, Jordão Bragantini, Eva M. Brown, Jean-Marie Burel, Xavier Casas Moreno, Gustavo de Medeiros, Erin E. Diel, David Gault, Satrajit S. Ghosh, Ilan Gold, Yaroslav O. Halchenko, Matthew Hartley, Dave Horsfall, Mark S. 
Keller, Mark Kittisopikul, Gabor Kovacs, Aybüke Küpcü Yoldaş, Koji Kyoda, Albane le Tournoulx de la Villegeorges, Tong Li, Prisca Liberali, Dominik Lindner, Melissa Linkert, Joel Lüthi, Jeremy Maitin-Shepard, Trevor Manz, Luca Marconato, Matthew McCormick, Merlin Lange, Khaled Mohamed, William Moore, Nils Norlin, Wei Ouyang, Bugra Özdemir, Giovanni Palla, Constantin Pape, Lucas Pelkmans, Tobias Pietzsch, Stephan Preibisch, Martin Prete, Norman Rzepka, Sameeul Samee, Nicholas Schaub, Hythem Sidky, Ahmet Can Solak, David R. Stirling, Jonathan Striebel, Christian Tischer, Daniel Toloudis, Isaac Virshup, Petr Walczysko, Alan M. Watson, Erin Weisbart, Frances Wong, Kevin A. Yamauchi, Omer Bayraktar, Beth A. Cimini, Nils Gehlenborg, Muzlifah Haniffa, Nathan Hotaling, Shuichi Onami, Loic A. Royer, Stephan Saalfeld, Oliver Stegle, Fabian J. Theis & Jason R. Swedlow + - Title: Open SciVis Datasets + URL: https://klacansky.com/open-scivis-datasets/ + AuthorName: Pavol Klacansky +DeprecatedNotice: +ADXCategories: + - Healthcare & Life Sciences Data + - Manufacturing Data From 0efd78d4a8e898caae588e59d27d1a2ffcc9aea1 Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Wed, 11 Jun 2025 14:58:06 -0400 Subject: [PATCH 042/751] Update graf-reforecast.yaml --- datasets/graf-reforecast.yaml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index 5253d3c86..6ce63ce27 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -1,5 +1,5 @@ Name: Graf Reforecast -Description: +Description: NVIDIA and The Weather Company (TWCo) have generated a data set of reforecasts from TWCo’s GRAF (Global high-Resolution Atmospheric Forecasting) model, a version of the National Center for Atmospheric Research (NCAR) Model for Predictions Across Scales (MPAS). 
GRAF is global, but the configuration for this reforecast had a mesh refinement to ~4 km over the US, Caribbean Basin, and Europe, and 15 km elsewhere. This model was designed to run much of the computation on graphics processing units, with this development assisted by NVIDIA. The intended 1836 reforecast cases (~5 years) will be generated with initial condition dates spanning more than 20 years, 2004-2024. The dates of the chosen initial conditions were mostly selected based on high-impact weather in the contiguous US (CONUS) and Caribbean. Sampling in this way, we hypothesize, will span a wider range of interesting, high-impact weather scenarios than had we performed five contiguous years of once-daily reforecasts, and we will span a wider variety of interesting precipitation events while still providing many samples in non-precipitating regions with more ordinary weather. GRAF reforecasts were mostly run to +27 h lead time: 3 h for spin-up plus a full diurnal cycle. Data were saved in Zarr format. Most fields were saved at 15-min intervals. Data are made publicly available to all through Amazon Web Services’ Open-Data Initiative. 
Documentation: Contact: ManagedBy: "[The Weather Company](https://www.weathercompany.com/)" @@ -21,7 +21,6 @@ Resources: ARN: arn:aws:s3:::twc-graf-reforecast Region: us-west-2 Type: S3 Bucket - Explore: DataAtWork: Tutorials: - Title: From 3eaaba5660e4d0885d12a20c33e724403c0719f0 Mon Sep 17 00:00:00 2001 From: Aodhan Sweeney <40372081+AodhanSweeney@users.noreply.github.com> Date: Fri, 13 Jun 2025 15:29:42 -0700 Subject: [PATCH 043/751] Adding planette's C3S archive initial yaml commit for planette's C3S seasonal forecast archive --- .../planette_c3s_seasonal_forecast_data.yaml | 75 +++++++++++++++++++ 1 file changed, 75 insertions(+) create mode 100644 datasets/planette_c3s_seasonal_forecast_data.yaml diff --git a/datasets/planette_c3s_seasonal_forecast_data.yaml b/datasets/planette_c3s_seasonal_forecast_data.yaml new file mode 100644 index 000000000..fb381cab8 --- /dev/null +++ b/datasets/planette_c3s_seasonal_forecast_data.yaml @@ -0,0 +1,75 @@ +Name: Planette's C3S Seasonal Forecast Data +Description: | + The C3S seasonal forecast dataset provides global, daily, probabilistic forecasts of the Earth system, + enabling users to assess the likelihood of future climate states. These forecasts are particularly + valuable for studying slowly evolving climate patterns such as El Niño, La Niña, and the North Atlantic + Oscillation (NAO), which can be predicted with greater skill than the chaotic atmosphere. This dataset + is derived from the Copernicus Climate Change Service (C3S) archive and includes SEAS5 hindcasts + (1981-2016) and forecasts (2017-present) at 1°x1° global resolution. More models from the C3S archive will + be updated as they are processed into cloud native format. + + The planette C3S archive stores this data in cloud native format for easy access and analysis. 
+Documentation: https://github.com/PlanetteAI/planette_c3s_archive/blob/main/README.md +Contact: aodhan.sweeney@planette.ai +ManagedBy: Planette.ai +UpdateFrequency: Monthly +Collabs: + ASDI: + Tags: + - climate + - weather + - forecasting + - seasonal + - subseasonal +Tags: + - aws-pds + - climate + - weather + - forecasting + - seasonal + - meteorology + - earth observation + - zarr + - xarray + - icechunk +License: | + Copernicus Licence (similar to CC-BY-4.0): You are free to share and adapt the material + for any purpose, even commercially, provided that you give appropriate credit. + https://cds.climate.copernicus.eu/api/v2/terms/static/licence-to-use-copernicus-products.pdf +Citation: | + Copernicus Climate Change Service (C3S) (2017): C3S seasonal forecast data. + Copernicus Climate Change Service, Climate Data Store (CDS). + https://cds.climate.copernicus.eu/cdsapp#!/dataset/seasonal-original-single-levels +Resources: + - Description: C3S Seasonal Forecast Hindcasts and Forecasts (Zarr format) + ARN: arn:aws:s3:::planettebaikal/forecast_models/seasonal/seas5/ + Region: us-east-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://planettebaikal.s3.amazonaws.com/index.html#forecast_models/seasonal/)' +DataAtWork: + Tutorials: + - Title: Accessing C3S Seasonal Forecast Data with Python + URL: https://github.com/PlanetteAI/planette_c3s_archive/blob/main/c3s_seasonal_forecast_tutorial.ipynb + AuthorName: "Aodhan Sweeney-Jaramillo" + AuthorURL: https://github.com/AodhanSweeney + Tools & Applications: + - Title: xarray + URL: https://docs.xarray.dev/ + AuthorName: xarray Developers + - Title: zarr-python + URL: https://zarr.dev/ + AuthorName: zarr Developers + - Title: icechunk + URL: https://github.com/earth-mover/icechunk + AuthorName: earth-mover + Publications: + - Title: "SEAS5: The new ECMWF seasonal forecast system" + URL: https://doi.org/10.5194/gmd-12-1087-2019 + AuthorName: Johnson, S. J., et al. 
+ - Title: "C3S Seasonal Forecasts Documentation" + URL: https://climate.copernicus.eu/seasonal-forecasts + AuthorName: Copernicus Climate Change Service +DeprecatedNotice: +ADXCategories: + - Environmental Data \ No newline at end of file From 4787cdc4c5ca4347096143c2d639f863f2841af0 Mon Sep 17 00:00:00 2001 From: Aodhan Sweeney <40372081+AodhanSweeney@users.noreply.github.com> Date: Fri, 13 Jun 2025 15:59:03 -0700 Subject: [PATCH 044/751] Changed Tags --- datasets/c3s-seasonal-forecast-data.yaml | 66 ++++++++++++++++++++++++ 1 file changed, 66 insertions(+) create mode 100644 datasets/c3s-seasonal-forecast-data.yaml diff --git a/datasets/c3s-seasonal-forecast-data.yaml b/datasets/c3s-seasonal-forecast-data.yaml new file mode 100644 index 000000000..268401e20 --- /dev/null +++ b/datasets/c3s-seasonal-forecast-data.yaml @@ -0,0 +1,66 @@ +Name: Planette C3S Seasonal Forecast Data +Description: | + The C3S seasonal forecast dataset provides global, daily, probabilistic forecasts of the Earth system, + enabling users to assess the likelihood of future climate states. These forecasts are particularly + valuable for studying slowly evolving climate patterns such as El Niño, La Niña, and the North Atlantic + Oscillation (NAO), which can be predicted with greater skill than the chaotic atmosphere. This dataset + is derived from the Copernicus Climate Change Service (C3S) archive and includes SEAS5 hindcasts + (1981-2016) and forecasts (2017-present) at 1°x1° global resolution. More models from the C3S archive will + be updated as they are processed into cloud native format. The planette C3S archive stores this data in + cloud native format for easy access and analysis. 
+Documentation: https://github.com/PlanetteAI/planette_c3s_archive/blob/main/README.md +Contact: aodhan.sweeney@planette.ai +ManagedBy: Planette.ai +UpdateFrequency: Monthly +Collabs: + ASDI: + Tags: + - climate + - weather + - forecast +Tags: + - aws-pds + - climate + - weather + - earth observation +License: | + Copernicus Licence (similar to CC-BY-4.0): You are free to share and adapt the material + for any purpose, even commercially, provided that you give appropriate credit. + https://cds.climate.copernicus.eu/api/v2/terms/static/licence-to-use-copernicus-products.pdf +Citation: | + Copernicus Climate Change Service (C3S) (2017): C3S seasonal forecast data. + Copernicus Climate Change Service, Climate Data Store (CDS). + https://cds.climate.copernicus.eu/cdsapp#!/dataset/seasonal-original-single-levels +Resources: + - Description: C3S Seasonal Forecast Hindcasts and Forecasts (Zarr format) + ARN: arn:aws:s3:::planettebaikal/forecast_models/seasonal/seas5/ + Region: us-east-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://planettebaikal.s3.amazonaws.com/index.html#forecast_models/seasonal/)' +DataAtWork: + Tutorials: + - Title: Accessing C3S Seasonal Forecast Data with Python + URL: https://github.com/PlanetteAI/planette_c3s_archive/blob/main/c3s_seasonal_forecast_tutorial.ipynb + AuthorName: "Aodhan Sweeney-Jaramillo" + AuthorURL: https://github.com/AodhanSweeney + Tools & Applications: + - Title: xarray + URL: https://docs.xarray.dev/ + AuthorName: xarray Developers + - Title: zarr-python + URL: https://zarr.dev/ + AuthorName: zarr Developers + - Title: icechunk + URL: https://github.com/earth-mover/icechunk + AuthorName: earth-mover + Publications: + - Title: "SEAS5: The new ECMWF seasonal forecast system" + URL: https://doi.org/10.5194/gmd-12-1087-2019 + AuthorName: Johnson, S. J., et al. 
+ - Title: "C3S Seasonal Forecasts Documentation" + URL: https://climate.copernicus.eu/seasonal-forecasts + AuthorName: Copernicus Climate Change Service +DeprecatedNotice: +ADXCategories: + - Environmental Data \ No newline at end of file From dc49f80e9f31bceabdea3807b54eda5535efe52d Mon Sep 17 00:00:00 2001 From: Aodhan Sweeney <40372081+AodhanSweeney@users.noreply.github.com> Date: Fri, 13 Jun 2025 16:09:12 -0700 Subject: [PATCH 045/751] Update planette_c3s_seasonal_forecast_data.yaml --- .../planette_c3s_seasonal_forecast_data.yaml | 19 +++++-------------- 1 file changed, 5 insertions(+), 14 deletions(-) diff --git a/datasets/planette_c3s_seasonal_forecast_data.yaml b/datasets/planette_c3s_seasonal_forecast_data.yaml index fb381cab8..f4ac99a5c 100644 --- a/datasets/planette_c3s_seasonal_forecast_data.yaml +++ b/datasets/planette_c3s_seasonal_forecast_data.yaml @@ -1,4 +1,4 @@ -Name: Planette's C3S Seasonal Forecast Data +Name: Planette C3S Seasonal Forecast Data Description: | The C3S seasonal forecast dataset provides global, daily, probabilistic forecasts of the Earth system, enabling users to assess the likelihood of future climate states. These forecasts are particularly @@ -6,9 +6,8 @@ Description: | Oscillation (NAO), which can be predicted with greater skill than the chaotic atmosphere. This dataset is derived from the Copernicus Climate Change Service (C3S) archive and includes SEAS5 hindcasts (1981-2016) and forecasts (2017-present) at 1°x1° global resolution. More models from the C3S archive will - be updated as they are processed into cloud native format. - - The planette C3S archive stores this data in cloud native format for easy access and analysis. + be updated as they are processed into cloud native format. The planette C3S archive stores this data in + cloud native format for easy access and analysis. 
Documentation: https://github.com/PlanetteAI/planette_c3s_archive/blob/main/README.md Contact: aodhan.sweeney@planette.ai ManagedBy: Planette.ai @@ -18,20 +17,12 @@ Collabs: Tags: - climate - weather - - forecasting - - seasonal - - subseasonal + - forecast Tags: - aws-pds - climate - weather - - forecasting - - seasonal - - meteorology - earth observation - - zarr - - xarray - - icechunk License: | Copernicus Licence (similar to CC-BY-4.0): You are free to share and adapt the material for any purpose, even commercially, provided that you give appropriate credit. @@ -72,4 +63,4 @@ DataAtWork: AuthorName: Copernicus Climate Change Service DeprecatedNotice: ADXCategories: - - Environmental Data \ No newline at end of file + - Environmental Data From 89edad1440d77ad93af22c8d05bf9e52922686cb Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 16 Jun 2025 11:36:49 -0800 Subject: [PATCH 046/751] Delete datasets/c3s-seasonal-forecast-data.yaml --- datasets/c3s-seasonal-forecast-data.yaml | 66 ------------------------ 1 file changed, 66 deletions(-) delete mode 100644 datasets/c3s-seasonal-forecast-data.yaml diff --git a/datasets/c3s-seasonal-forecast-data.yaml b/datasets/c3s-seasonal-forecast-data.yaml deleted file mode 100644 index 268401e20..000000000 --- a/datasets/c3s-seasonal-forecast-data.yaml +++ /dev/null @@ -1,66 +0,0 @@ -Name: Planette C3S Seasonal Forecast Data -Description: | - The C3S seasonal forecast dataset provides global, daily, probabilistic forecasts of the Earth system, - enabling users to assess the likelihood of future climate states. These forecasts are particularly - valuable for studying slowly evolving climate patterns such as El Niño, La Niña, and the North Atlantic - Oscillation (NAO), which can be predicted with greater skill than the chaotic atmosphere. 
This dataset - is derived from the Copernicus Climate Change Service (C3S) archive and includes SEAS5 hindcasts - (1981-2016) and forecasts (2017-present) at 1°x1° global resolution. More models from the C3S archive will - be updated as they are processed into cloud native format. The planette C3S archive stores this data in - cloud native format for easy access and analysis. -Documentation: https://github.com/PlanetteAI/planette_c3s_archive/blob/main/README.md -Contact: aodhan.sweeney@planette.ai -ManagedBy: Planette.ai -UpdateFrequency: Monthly -Collabs: - ASDI: - Tags: - - climate - - weather - - forecast -Tags: - - aws-pds - - climate - - weather - - earth observation -License: | - Copernicus Licence (similar to CC-BY-4.0): You are free to share and adapt the material - for any purpose, even commercially, provided that you give appropriate credit. - https://cds.climate.copernicus.eu/api/v2/terms/static/licence-to-use-copernicus-products.pdf -Citation: | - Copernicus Climate Change Service (C3S) (2017): C3S seasonal forecast data. - Copernicus Climate Change Service, Climate Data Store (CDS). 
- https://cds.climate.copernicus.eu/cdsapp#!/dataset/seasonal-original-single-levels -Resources: - - Description: C3S Seasonal Forecast Hindcasts and Forecasts (Zarr format) - ARN: arn:aws:s3:::planettebaikal/forecast_models/seasonal/seas5/ - Region: us-east-2 - Type: S3 Bucket - Explore: - - '[Browse Dataset](https://planettebaikal.s3.amazonaws.com/index.html#forecast_models/seasonal/)' -DataAtWork: - Tutorials: - - Title: Accessing C3S Seasonal Forecast Data with Python - URL: https://github.com/PlanetteAI/planette_c3s_archive/blob/main/c3s_seasonal_forecast_tutorial.ipynb - AuthorName: "Aodhan Sweeney-Jaramillo" - AuthorURL: https://github.com/AodhanSweeney - Tools & Applications: - - Title: xarray - URL: https://docs.xarray.dev/ - AuthorName: xarray Developers - - Title: zarr-python - URL: https://zarr.dev/ - AuthorName: zarr Developers - - Title: icechunk - URL: https://github.com/earth-mover/icechunk - AuthorName: earth-mover - Publications: - - Title: "SEAS5: The new ECMWF seasonal forecast system" - URL: https://doi.org/10.5194/gmd-12-1087-2019 - AuthorName: Johnson, S. J., et al. 
- - Title: "C3S Seasonal Forecasts Documentation" - URL: https://climate.copernicus.eu/seasonal-forecasts - AuthorName: Copernicus Climate Change Service -DeprecatedNotice: -ADXCategories: - - Environmental Data \ No newline at end of file From 1493fcde4463cf0015cf9341d7d97c18fa8a01e8 Mon Sep 17 00:00:00 2001 From: Zoheyr Doctor Date: Tue, 17 Jun 2025 11:54:06 -0500 Subject: [PATCH 047/751] Create mbers-open-data.yaml --- datasets/mbers-open-data.yaml | 43 +++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100644 datasets/mbers-open-data.yaml diff --git a/datasets/mbers-open-data.yaml b/datasets/mbers-open-data.yaml new file mode 100644 index 000000000..02e8c6978 --- /dev/null +++ b/datasets/mbers-open-data.yaml @@ -0,0 +1,43 @@ +Name: Marginal Build Emissions Rates (MBERs) for Electricity +Description: The Climate TRACE coalition has developed and maintains free global hourly Build Margin data, also known as MBERs, that are compliant with the Greenhouse Gas Protocol's Project Protocol electricity sector guidance, Guidelines for Grid-Connected Electricity Projects ("GHGP Guidelines"). +Documentation: https://github.com/WattTime/mbers-open-data +Contact: The annual and hourly MBERs data are created and maintained by the Climate TRACE coalition of nonprofits, universities, and tech companies. The largest contributors to the coalition's electricity sector work are WattTime, Transition Zero, Global Energy Monitor, Pixel Scientia Labs, Planet Labs, and Georgetown University. For questions or more information about MBER data, contact coalition@ClimateTRACE.org or visit https://climatetrace.org/contact. 
+ManagedBy: Climate TRACE +UpdateFrequency: Approximately quarterly +Tags: + - carbon + - climate + - csv + - electricity + - energy + - energy modeling + - environmental +License: +Citation: +Resources: + - Description: + ARN: + Region: + Type: + Explore: +DataAtWork: + Tutorials: + - Title: + URL: + NotebookURL: + AuthorName: + AuthorURL: + Services: + Tools & Applications: + - Title: + URL: + AuthorName: + AuthorURL: + Publications: + - Title: + URL: + AuthorName: + AuthorURL: +DeprecatedNotice: +ADXCategories: + - Environmental Data From 8899dab54a072076641a5681c3d25d8bcd1caf7e Mon Sep 17 00:00:00 2001 From: Brian Foo Date: Tue, 17 Jun 2025 16:07:29 -0400 Subject: [PATCH 048/751] Add entry for Library of Congress Sanborn Maps Dataset --- datasets/loc-sanborn.yml | 51 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) create mode 100644 datasets/loc-sanborn.yml diff --git a/datasets/loc-sanborn.yml b/datasets/loc-sanborn.yml new file mode 100644 index 000000000..c9306866b --- /dev/null +++ b/datasets/loc-sanborn.yml @@ -0,0 +1,51 @@ +Name: Sanborn Maps Data Package +Description: "The dataset contains metadata records for 50,600 maps from the [Sanborn Fire Insurance Maps collection](https://www.loc.gov/collections/sanborn-maps/) and their corresponding 440,048 JPEG images. The Sanborn collection at Library of Congress includes over fifty thousand editions of fire insurance maps comprising almost seven hundred thousand individual sheets. The Library of Congress holdings represent the largest extant collection of maps produced by the Sanborn Map Company." +Documentation: https://data.labs.loc.gov/sanborn/ +Contact: For curatorial questions about the content of the collection and formats, contact the Library of Congress Geography and Map Division at https://ask.loc.gov/map-geography. 
For technical questions about access, contact LC-Labs@loc.gov +ManagedBy: "[Library of Congress](https://www.loc.gov/)" +UpdateFrequency: As new and significant changes to the underlying digital collection occurs +Tags: + - archives + - cities + - computer vision + - conservation + - culture + - cultural preservation + - demographics + - digital assets + - geospatial + - history + - housing + - land use + - mapping + - urban +License: The content of the Library of Congress online Sanborn Maps Collection is in the public domain and is free to use and reuse. For more information, see https://www.loc.gov/collections/sanborn-maps/about-this-collection/rights-and-access/. +Resources: + - Description: Sanborn Maps data + ARN: arn:aws:s3:::loc-sanborn + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket by State](https://loc-sanborn.s3.amazonaws.com/maps-by-state/index.html)' + - '[README](https://loc-sanborn.s3.amazonaws.com/README.html)'' +DataAtWork: + Tutorials: + - Title: README data cover sheet + URL: https://loc-sanborn.s3.amazonaws.com/README.html + AuthorName: Library of Congress + - Title: Sanborn Map Data Python Tutorial (Jupyter notebook) + URL: https://libraryofcongress.github.io/data-exploration/Data%20Packages/sanborn.html + AuthorName: Library of Congress + AuthorURL: https://github.com/LibraryOfCongress + - Title: "Fire Insurance Maps at the Library of Congress: A Resource Guide" + URL: https://guides.loc.gov/fire-insurance-maps/introduction + AuthorName: Julie Stoner, Reference Librarian, Geography and Map Division, Library of Congress + Tools & Applications: + - Title: Sanborn Atlas Volume Finder + URL: https://loc.maps.arcgis.com/apps/instant/media/index.html?appid=0cb2c04324a0413081e1b793ea18f854 + AuthorName: Julie Stoner and Meagan Snow, Geography and Map Division, Library of Congress + AuthorURL: https://github.com/aarande + Publications: + - Title: Introduction to the Collection + URL: 
https://www.loc.gov/collections/sanborn-maps/articles-and-essays/introduction-to-the-collection/ + AuthorName: Walter W. Ristow From b6e4d02a93a75fb15ca9fd476c56c11ff6e9e34c Mon Sep 17 00:00:00 2001 From: Brian Foo Date: Tue, 17 Jun 2025 16:13:22 -0400 Subject: [PATCH 049/751] Rename loc-sanborn.yml to loc-sanborn.yaml --- datasets/{loc-sanborn.yml => loc-sanborn.yaml} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename datasets/{loc-sanborn.yml => loc-sanborn.yaml} (100%) diff --git a/datasets/loc-sanborn.yml b/datasets/loc-sanborn.yaml similarity index 100% rename from datasets/loc-sanborn.yml rename to datasets/loc-sanborn.yaml From cfc4ea932d5ee28fb61f18fb4942845f3834890f Mon Sep 17 00:00:00 2001 From: Brian Foo Date: Tue, 17 Jun 2025 16:19:00 -0400 Subject: [PATCH 050/751] Remove extra apostrophe --- datasets/loc-sanborn.yaml | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) diff --git a/datasets/loc-sanborn.yaml b/datasets/loc-sanborn.yaml index c9306866b..af1208def 100644 --- a/datasets/loc-sanborn.yaml +++ b/datasets/loc-sanborn.yaml @@ -1,7 +1,18 @@ +--- Name: Sanborn Maps Data Package -Description: "The dataset contains metadata records for 50,600 maps from the [Sanborn Fire Insurance Maps collection](https://www.loc.gov/collections/sanborn-maps/) and their corresponding 440,048 JPEG images. The Sanborn collection at Library of Congress includes over fifty thousand editions of fire insurance maps comprising almost seven hundred thousand individual sheets. The Library of Congress holdings represent the largest extant collection of maps produced by the Sanborn Map Company." +Description: The dataset contains metadata records for 50,600 maps from the + [Sanborn Fire Insurance Maps + collection](https://www.loc.gov/collections/sanborn-maps/) and their + corresponding 440,048 JPEG images. 
The Sanborn collection at Library of + Congress includes over fifty thousand editions of fire insurance maps + comprising almost seven hundred thousand individual sheets. The Library of + Congress holdings represent the largest extant collection of maps produced by + the Sanborn Map Company. Documentation: https://data.labs.loc.gov/sanborn/ -Contact: For curatorial questions about the content of the collection and formats, contact the Library of Congress Geography and Map Division at https://ask.loc.gov/map-geography. For technical questions about access, contact LC-Labs@loc.gov +Contact: For curatorial questions about the content of the collection and + formats, contact the Library of Congress Geography and Map Division at + https://ask.loc.gov/map-geography. For technical questions about access, + contact LC-Labs@loc.gov ManagedBy: "[Library of Congress](https://www.loc.gov/)" UpdateFrequency: As new and significant changes to the underlying digital collection occurs Tags: @@ -19,15 +30,19 @@ Tags: - land use - mapping - urban -License: The content of the Library of Congress online Sanborn Maps Collection is in the public domain and is free to use and reuse. For more information, see https://www.loc.gov/collections/sanborn-maps/about-this-collection/rights-and-access/. +License: The content of the Library of Congress online Sanborn Maps Collection + is in the public domain and is free to use and reuse. For more information, + see + https://www.loc.gov/collections/sanborn-maps/about-this-collection/rights-and-access/. 
Resources: - Description: Sanborn Maps data ARN: arn:aws:s3:::loc-sanborn Region: us-east-1 Type: S3 Bucket Explore: - - '[Browse Bucket by State](https://loc-sanborn.s3.amazonaws.com/maps-by-state/index.html)' - - '[README](https://loc-sanborn.s3.amazonaws.com/README.html)'' + - "[Browse Bucket by + State](https://loc-sanborn.s3.amazonaws.com/maps-by-state/index.html)" + - "[README](https://loc-sanborn.s3.amazonaws.com/README.html)" DataAtWork: Tutorials: - Title: README data cover sheet @@ -39,11 +54,13 @@ DataAtWork: AuthorURL: https://github.com/LibraryOfCongress - Title: "Fire Insurance Maps at the Library of Congress: A Resource Guide" URL: https://guides.loc.gov/fire-insurance-maps/introduction - AuthorName: Julie Stoner, Reference Librarian, Geography and Map Division, Library of Congress + AuthorName: Julie Stoner, Reference Librarian, Geography and Map Division, + Library of Congress Tools & Applications: - Title: Sanborn Atlas Volume Finder URL: https://loc.maps.arcgis.com/apps/instant/media/index.html?appid=0cb2c04324a0413081e1b793ea18f854 - AuthorName: Julie Stoner and Meagan Snow, Geography and Map Division, Library of Congress + AuthorName: Julie Stoner and Meagan Snow, Geography and Map Division, Library of + Congress AuthorURL: https://github.com/aarande Publications: - Title: Introduction to the Collection From dd0d0f97017b96cc13ecf9c29746321de1132ce5 Mon Sep 17 00:00:00 2001 From: Michael Uftring Date: Wed, 18 Jun 2025 12:28:57 -0400 Subject: [PATCH 051/751] update DRC full name --- datasets/sparc.yaml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/datasets/sparc.yaml b/datasets/sparc.yaml index ecd10cdb9..fb3b83c4f 100644 --- a/datasets/sparc.yaml +++ b/datasets/sparc.yaml @@ -10,7 +10,7 @@ Description: | of anatomical and functional connectivity of the nervous system. 
Documentation: https://docs.sparc.science Contact: joostw@seas.upenn.edu, support@sparc.science -ManagedBy: "[The SPARC Data Resource Center](https://sparc.science/about)" +ManagedBy: "[The SPARC Data and Resource Center](https://sparc.science/about)" UpdateFrequency: Continually adding new datasets and releasing versions of datasets Tags: - bioinformatics @@ -32,13 +32,13 @@ DataAtWork: Tutorials: - Title: Downloading large scale SPARC datasets URL: https://docs.sparc.science/recipes - AuthorName: "The SPARC Data Resource Center" + AuthorName: "The SPARC Data and Resource Center" - Title: Download public data, scaffolds and run computations URL: https://docs.sparc.science/docs/getting-started-with-the-sparc-python-client - AuthorName: "The SPARC Data Resource Center" + AuthorName: "The SPARC Data and Resource Center" - Title: Using sparc.client for data movement in SPARC URL: https://docs.sparc.science/docs/tutorial-using-sparcclient-for-data-movement-in-sparc - AuthorName: "The SPARC Data Resource Center" + AuthorName: "The SPARC Data and Resource Center" Tools & Applications: - Title: The SPARC Portal URL: https://sparc.science From 47e0dbf57b20ce0a02d2b01d54d69d53194e8be7 Mon Sep 17 00:00:00 2001 From: Zoheyr Doctor Date: Wed, 18 Jun 2025 13:39:30 -0500 Subject: [PATCH 052/751] Update mbers-open-data.yaml --- datasets/mbers-open-data.yaml | 36 ++++++++--------------------------- 1 file changed, 8 insertions(+), 28 deletions(-) diff --git a/datasets/mbers-open-data.yaml b/datasets/mbers-open-data.yaml index 02e8c6978..a86e0ce15 100644 --- a/datasets/mbers-open-data.yaml +++ b/datasets/mbers-open-data.yaml @@ -1,9 +1,9 @@ Name: Marginal Build Emissions Rates (MBERs) for Electricity Description: The Climate TRACE coalition has developed and maintains free global hourly Build Margin data, also known as MBERs, that are compliant with the Greenhouse Gas Protocol's Project Protocol electricity sector guidance, Guidelines for Grid-Connected Electricity Projects ("GHGP 
Guidelines"). -Documentation: https://github.com/WattTime/mbers-open-data +Documentation: https://github.com/WattTime/mbers-open-data/blob/main/MBER_Data_Summary_and_Methodology.pdf Contact: The annual and hourly MBERs data are created and maintained by the Climate TRACE coalition of nonprofits, universities, and tech companies. The largest contributors to the coalition's electricity sector work are WattTime, Transition Zero, Global Energy Monitor, Pixel Scientia Labs, Planet Labs, and Georgetown University. For questions or more information about MBER data, contact coalition@ClimateTRACE.org or visit https://climatetrace.org/contact. ManagedBy: Climate TRACE -UpdateFrequency: Approximately quarterly +UpdateFrequency: Annually Tags: - carbon - climate @@ -12,32 +12,12 @@ Tags: - energy - energy modeling - environmental -License: -Citation: -Resources: - - Description: - ARN: - Region: - Type: - Explore: +License: All data are free and provided without license restrictions. +Citation: Marginal Build Emissions Rates (MBERs) for Electricity. Climate TRACE. [DATE]. 
URL: https://www.gem.wiki/MBERs DataAtWork: - Tutorials: - - Title: - URL: - NotebookURL: - AuthorName: - AuthorURL: - Services: - Tools & Applications: - - Title: - URL: - AuthorName: - AuthorURL: - Publications: - - Title: - URL: - AuthorName: - AuthorURL: -DeprecatedNotice: + Tutorials: + - Title: MBER Orientation and Tutorial + URL: https://github.com/WattTime/mbers-open-data/blob/main/MBER_Orientation_and_Tutorial.pdf + AuthorName: Climate TRACE ADXCategories: - Environmental Data From 476d317912e3b42307239f3852a6f16a142ade43 Mon Sep 17 00:00:00 2001 From: Daofeng Li Date: Wed, 25 Jun 2025 18:10:40 -0500 Subject: [PATCH 053/751] Update a typo, add the directory listing link Update a typo, add the directory listing link --- datasets/roadmapepigenomics.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/roadmapepigenomics.yaml b/datasets/roadmapepigenomics.yaml index 9f0907537..f9dbe7511 100644 --- a/datasets/roadmapepigenomics.yaml +++ b/datasets/roadmapepigenomics.yaml @@ -1,6 +1,6 @@ Name: NIH Roadmap Epigenomics Description: | - The NIH Roadmap Epigenomics Mapping Consortium was launched with the goal of producing a public resource of human epigenomic data to catalyze basic biology and disease-oriented research. The project has generated high-quality, genome-wide maps of several key histone modifications, chromatin accessibility, DNA methylation and mRNA expression across 100s of human cell types and tissues. + The NIH Roadmap Epigenomics Mapping Consortium was launched with the goal of producing a public resource of human epigenomic data to catalyze basic biology and disease-oriented research. The project has generated high-quality, genome-wide maps of several key histone modifications, chromatin accessibility, DNA methylation and mRNA expression across 100s of human cell types and tissues. To see what data is available, please check the directory listing: https://roadmapepigenomics.s3.us-west-2.amazonaws.com/index.html. 
Contact: dli23@wustl.edu ManagedBy: NIH Roadmap Epigenomics Mapping Consortium, Ting Wang Lab at WashU (https://wang.wustl.edu/) Documentation: https://egg2.wustl.edu/roadmap/web_portal/ @@ -25,8 +25,8 @@ DataAtWork: URL: https://egg2.wustl.edu/roadmap/web_portal/ AuthorName: Anshul Kundaje Lab AuthorURL: https://kundajelab.github.io/ - - Title: Visualize TaRGET data with WashU Epigenome Browser - URL: https://epigenomegateway.wustl.edu/browser/ + - Title: Visualize Roadmap data with WashU Epigenome Browser + URL: https://epigenomegateway.wustl.edu/browser/?genome=hg19&hub=https://vizhub.wustl.edu/public/hg19/new/roadmap9_methylC.md AuthorName: WashU Epigenome Browser AuthorURL: https://epigenomegateway.wustl.edu/browser/ Publications: From c3e6f960b400d410a41232788d6944e5b4ab4251 Mon Sep 17 00:00:00 2001 From: Troy Raen Date: Tue, 25 Feb 2025 23:31:50 -0800 Subject: [PATCH 054/751] Add datasets/ztf.yaml --- datasets/ztf.yaml | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) create mode 100644 datasets/ztf.yaml diff --git a/datasets/ztf.yaml b/datasets/ztf.yaml new file mode 100644 index 000000000..349ef4ef1 --- /dev/null +++ b/datasets/ztf.yaml @@ -0,0 +1,36 @@ +Name: 'Zwicky Transient Facility (ZTF)' +Description: 'The Zwicky Transient Facility (ZTF) is a time-domain astronomy survey that uses the Palomar 48 inch Schmidt telescope and a custom-built wide-field camera to image the night sky in three photometric filters (g, r, and i). It is a fully-automated survey aimed at a systematic exploration of optical transient phenomena. It completes a scan of the observable northern sky approximately every three nights.' +Documentation: https://irsa.ipac.caltech.edu/Missions/ztf.html +Contact: https://irsa.ipac.caltech.edu/docs/help_desk.html +ManagedBy: "NASA/IPAC Infrared Science Archive ([IRSA](https://irsa.ipac.caltech.edu)) at Caltech" +UpdateFrequency: ZTF datasets may be updated approximately twice per year.
The data may also be presented in new ways as the products become available. +Tags: + - astronomy + - object detection + - parquet + - survey +License: https://irsa.ipac.caltech.edu/data_use_terms.html +Citation: "If you use the Objects Table, please cite the Digital Object Identifier (DOI): [10.26131/IRSA597](https://www.ipac.caltech.edu/doi/10.26131/IRSA597). If you use the Lightcurves, please cite the Digital Object Identifier (DOI): [10.26131/IRSA598](https://www.ipac.caltech.edu/doi/10.26131/IRSA598). In addition, please follow the [ZTF acknowledgement guidelines](https://irsa.ipac.caltech.edu/data/ZTF/docs/releases/ztf_release_notes_latest) and the [IRSA acknowledgement guidelines](https://irsa.ipac.caltech.edu/ack.html)." +Resources: + - Description: 'Objects Table is a catalog of PSF-fit photometry detections extracted from ZTF reference images. The reference images were generated by stacking single exposures acquired from all science programs in the survey, resulting in photometry up to 2.5 magnitudes deeper than single-exposure detections. Objects Table contains both point-like and extended objects. The survey covers ~25,000 square degrees of the northern hemisphere. This version of the catalog is in Apache Parquet format and partitioned following the Hierarchical Adaptive Tiling Scheme ([HATS](https://hats.readthedocs.io/)).' + ARN: arn:aws:s3:::BUCKET/PREFIX/ + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Lightcurves is a catalog of PSF-fit photometry detections extracted from single-exposure images at the locations of Objects Table detections. An object ID identifies related data in both catalogs. Photometry is in the native ZTF photometric system and the epoch-dependent zeropoints have already been applied. Note that Lightcurves detections may be missing, for example, in cases where the Objects Table detection is fainter or approximately equal to the single-exposure sensitivity limits. 
This version of the catalog is in Apache Parquet format and partitioned following the Hierarchical Adaptive Tiling Scheme ([HATS](https://hats.readthedocs.io/)).' + ARN: arn:aws:s3:::BUCKET/PREFIX/ + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False +DataAtWork: + Tutorials: + - Title: IRSA Notebook Tutorials + URL: https://irsa.ipac.caltech.edu/docs/notebooks/ + AuthorName: Caltech/IPAC-IRSA + AuthorURL: https://irsa.ipac.caltech.edu + - Title: Multi-Wavelength Light Curves Tutorial + URL: https://nasa-fornax.github.io/fornax-demo-notebooks/light_curves/light_curve_generator.html + AuthorName: NASA/IPAC Infrared Science Archive (IRSA) + AuthorURL: https://irsa.ipac.caltech.edu From 8f5c46a1c6100948edd7baef3a2f3242dfd5667d Mon Sep 17 00:00:00 2001 From: Troy Raen Date: Tue, 1 Apr 2025 16:00:31 -0700 Subject: [PATCH 055/751] Add tag aws-pds --- datasets/ztf.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/ztf.yaml b/datasets/ztf.yaml index 349ef4ef1..beaf0fe37 100644 --- a/datasets/ztf.yaml +++ b/datasets/ztf.yaml @@ -6,6 +6,7 @@ ManagedBy: "NASA/IPAC Infrared Science Archive ([IRSA](https://irsa.ipac.caltech UpdateFrequency: ZTF datasets may be updated approximately twice per year. The data may also be presented in new ways as the products become available. 
Tags: - astronomy + - aws-pds - object detection - parquet - survey From e39b701c00b528df3166b8181be01e84066eed12 Mon Sep 17 00:00:00 2001 From: Troy Raen Date: Wed, 25 Jun 2025 18:26:28 -0700 Subject: [PATCH 056/751] Add bucket info --- datasets/ztf.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/ztf.yaml b/datasets/ztf.yaml index beaf0fe37..f9259b104 100644 --- a/datasets/ztf.yaml +++ b/datasets/ztf.yaml @@ -14,13 +14,13 @@ License: https://irsa.ipac.caltech.edu/data_use_terms.html Citation: "If you use the Objects Table, please cite the Digital Object Identifier (DOI): [10.26131/IRSA597](https://www.ipac.caltech.edu/doi/10.26131/IRSA597). If you use the Lightcurves, please cite the Digital Object Identifier (DOI): [10.26131/IRSA598](https://www.ipac.caltech.edu/doi/10.26131/IRSA598). In addition, please follow the [ZTF acknowledgement guidelines](https://irsa.ipac.caltech.edu/data/ZTF/docs/releases/ztf_release_notes_latest) and the [IRSA acknowledgement guidelines](https://irsa.ipac.caltech.edu/ack.html)." Resources: - Description: 'Objects Table is a catalog of PSF-fit photometry detections extracted from ZTF reference images. The reference images were generated by stacking single exposures acquired from all science programs in the survey, resulting in photometry up to 2.5 magnitudes deeper than single-exposure detections. Objects Table contains both point-like and extended objects. The survey covers ~25,000 square degrees of the northern hemisphere. This version of the catalog is in Apache Parquet format and partitioned following the Hierarchical Adaptive Tiling Scheme ([HATS](https://hats.readthedocs.io/)).' 
- ARN: arn:aws:s3:::BUCKET/PREFIX/ + ARN: arn:aws:s3:::ipac-irsa-ztf/contributed/dr23/objects/hats Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - Description: 'Lightcurves is a catalog of PSF-fit photometry detections extracted from single-exposure images at the locations of Objects Table detections. An object ID identifies related data in both catalogs. Photometry is in the native ZTF photometric system and the epoch-dependent zeropoints have already been applied. Note that Lightcurves detections may be missing, for example, in cases where the Objects Table detection is fainter or approximately equal to the single-exposure sensitivity limits. This version of the catalog is in Apache Parquet format and partitioned following the Hierarchical Adaptive Tiling Scheme ([HATS](https://hats.readthedocs.io/)).' - ARN: arn:aws:s3:::BUCKET/PREFIX/ + ARN: arn:aws:s3:::ipac-irsa-ztf/contributed/dr23/lc/hats Region: us-east-1 Type: S3 Bucket RequesterPays: False @@ -33,5 +33,5 @@ DataAtWork: AuthorURL: https://irsa.ipac.caltech.edu - Title: Multi-Wavelength Light Curves Tutorial URL: https://nasa-fornax.github.io/fornax-demo-notebooks/light_curves/light_curve_generator.html - AuthorName: NASA/IPAC Infrared Science Archive (IRSA) + AuthorName: Caltech/IPAC-IRSA AuthorURL: https://irsa.ipac.caltech.edu From 8ff1a93e1a405bfb8142e5d33d51d8b0693a7408 Mon Sep 17 00:00:00 2001 From: Troy Raen Date: Thu, 26 Jun 2025 17:51:41 -0700 Subject: [PATCH 057/751] Patch tutorial author --- datasets/ztf.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/ztf.yaml b/datasets/ztf.yaml index f9259b104..5b7cc2546 100644 --- a/datasets/ztf.yaml +++ b/datasets/ztf.yaml @@ -33,5 +33,5 @@ DataAtWork: AuthorURL: https://irsa.ipac.caltech.edu - Title: Multi-Wavelength Light Curves Tutorial URL: https://nasa-fornax.github.io/fornax-demo-notebooks/light_curves/light_curve_generator.html - AuthorName: Caltech/IPAC-IRSA - AuthorURL: 
https://irsa.ipac.caltech.edu + AuthorName: Fornax Initiative + AuthorURL: https://pcos.gsfc.nasa.gov/Fornax/ From 58580bf8572ffbc37df7d0c74d429c975ee9bf94 Mon Sep 17 00:00:00 2001 From: vict0rsch Date: Fri, 27 Jun 2025 14:56:47 +0200 Subject: [PATCH 058/751] feat: add initial YAML structure --- datasets/lemat-rho.yaml | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) create mode 100644 datasets/lemat-rho.yaml diff --git a/datasets/lemat-rho.yaml b/datasets/lemat-rho.yaml new file mode 100644 index 000000000..77ef08cc8 --- /dev/null +++ b/datasets/lemat-rho.yaml @@ -0,0 +1,37 @@ +Name: +Description: +Documentation: +Contact: +ManagedBy: +UpdateFrequency: +Tags: + - +License: +Citation: +Resources: + - Description: + ARN: + Region: + Type: + Explore: +DataAtWork: + Tutorials: + - Title: + URL: + NotebookURL: + AuthorName: + AuthorURL: + Services: + Tools & Applications: + - Title: + URL: + AuthorName: + AuthorURL: + Publications: + - Title: + URL: + AuthorName: + AuthorURL: +DeprecatedNotice: +ADXCategories: + - \ No newline at end of file From bd37c98feb4224bf0ca679b972323e23eca23b1f Mon Sep 17 00:00:00 2001 From: Jordan Matelsky Date: Fri, 27 Jun 2025 09:03:49 -0400 Subject: [PATCH 059/751] Update bossdb.yaml license to include CC0 --- datasets/bossdb.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/bossdb.yaml b/datasets/bossdb.yaml index 71ef9d042..ffa93211b 100644 --- a/datasets/bossdb.yaml +++ b/datasets/bossdb.yaml @@ -18,7 +18,7 @@ Tags: - light-sheet microscopy - calcium imaging - volumetric imaging -License: Creative Commons 4.0 International (CC BY 4.0) +License: Creative Commons 4.0 International (CC BY 4.0); Creative Commons CC0 1.0 Universal (CC0-1.0) Resources: - Description: Large 3D volumes of neuroimaging data and image processing products such as segmentation and reconstructed meshes ARN: arn:aws:s3:::bossdb-open-data From afcb4c2b80c260984f0283c3dccaf905012f293c Mon Sep 17 00:00:00 2001 
From: martinsiron Date: Fri, 27 Jun 2025 17:37:19 +0200 Subject: [PATCH 060/751] more yaml data --- datasets/lemat-rho.yaml | 59 +++++++++++++++++++++++++++-------------- 1 file changed, 39 insertions(+), 20 deletions(-) diff --git a/datasets/lemat-rho.yaml b/datasets/lemat-rho.yaml index 77ef08cc8..e83f8fa2a 100644 --- a/datasets/lemat-rho.yaml +++ b/datasets/lemat-rho.yaml @@ -1,37 +1,56 @@ -Name: -Description: -Documentation: -Contact: -ManagedBy: -UpdateFrequency: +Name: LeMat-Rho +Description: Charge densities and other raw VASP files from density functional theory calculations of equilibrium materials in LeMat-Bulk and non-equilibrium materials from MAD dataset. +Documentation: https://github.com/LeMaterial/LeMat-Rho/tree/main/docs +Contact: contact@entalpic.ai +ManagedBy: "[LeMaterial](http://lematerial.org)" +UpdateFrequency: Continuously, as calculated Tags: - - -License: -Citation: -Resources: - - Description: + - chemistry + - materials science + - machine learning + - physics + - crystallography + - density functional theory +License: BY CC 4.0 +Citation: +Resources: Raw Data + - Description: Raw, gzipped VASP calculations for all materials calculated ARN: Region: - Type: + Type: S3 Bucket Explore: DataAtWork: - Tutorials: - - Title: + Tutorials: + - Title: URL: NotebookURL: AuthorName: AuthorURL: Services: Tools & Applications: - - Title: - URL: - AuthorName: - AuthorURL: + - Title: Pymatgen + URL: https://pymatgen.org + AuthorName: Materials Project + AuthorURL: https://materialsproject.org + - Title: Atomate2 + URL: https://materialsproject.github.io/atomate2 + AuthorName: Materials Project + AuthorURL: https://materialsproject.org + - Title: FireWorks + URL: https://materialsproject.github.io/fireworks + AuthorName: Materials Project + AuthorURL: https://materialsproject.org + - Title: MP-PyRho + URL: https://github.com/materialsproject/pyrho + AuthorName: MaterialsProject + AuthorURL: https://materialsproject.org Publications: - Title: URL: 
AuthorName: AuthorURL: -DeprecatedNotice: ADXCategories: - - \ No newline at end of file + - Education + - Public Sector & Government + - Technology + - Manufacturing From 92415eb531981bea9c9ee547c7f990b71467d765 Mon Sep 17 00:00:00 2001 From: martinsiron Date: Mon, 30 Jun 2025 10:10:54 +0200 Subject: [PATCH 061/751] update tutorial --- datasets/lemat-rho.yaml | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/datasets/lemat-rho.yaml b/datasets/lemat-rho.yaml index e83f8fa2a..edfec803a 100644 --- a/datasets/lemat-rho.yaml +++ b/datasets/lemat-rho.yaml @@ -1,7 +1,7 @@ Name: LeMat-Rho Description: Charge densities and other raw VASP files from density functional theory calculations of equilibrium materials in LeMat-Bulk and non-equilibrium materials from MAD dataset. Documentation: https://github.com/LeMaterial/LeMat-Rho/tree/main/docs -Contact: contact@entalpic.ai +Contact: info@entalpic.ai ManagedBy: "[LeMaterial](http://lematerial.org)" UpdateFrequency: Continuously, as calculated Tags: @@ -21,12 +21,10 @@ Resources: Raw Data Explore: DataAtWork: Tutorials: - - Title: - URL: - NotebookURL: - AuthorName: - AuthorURL: - Services: + - Title: Accessing Data in LeMat-Rho AWS OpenData Repository + URL: https://github.com/LeMaterial/LeMat-Rho/blob/feat/aws-upload/scripts/aws-open-data.ipynb + NotebookURL: https://github.com/LeMaterial/LeMat-Rho/blob/feat/aws-upload/scripts/aws-open-data.ipynb + AuthorName: Martin Siron, Mathilde Franckel, Jonathan Schmidt Tools & Applications: - Title: Pymatgen URL: https://pymatgen.org From cb3eb0385557bd92015f3177fb76ad45d7a8171c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juan=20Pablo=20Casta=C3=B1o?= <64097439+jpcastanoo@users.noreply.github.com> Date: Mon, 30 Jun 2025 09:21:50 -0400 Subject: [PATCH 062/751] Add a new dataset Add the Dendritic Consortium Multimodal Dataset YAML file --- datasets/dendritic-consortium.yaml.txt | 38 ++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) create mode 100644 
datasets/dendritic-consortium.yaml.txt diff --git a/datasets/dendritic-consortium.yaml.txt b/datasets/dendritic-consortium.yaml.txt new file mode 100644 index 000000000..a34f5d35c --- /dev/null +++ b/datasets/dendritic-consortium.yaml.txt @@ -0,0 +1,38 @@ +Name: Dendritic Consortium Multimodal Dataset +Description: The Dendritic Consortium provides a multimodal dataset integrating calcium and voltage imaging, electrophysiology, electron microscopy, proteomics, and computational models of Baz1a pyramidal neurons in the mouse primary visual cortex (V1), and endodermal neurons in Hydra vulgaris. +Documentation: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium +Contact: dendriticconsortium@gmail.com +ManagedBy: Dendritic Consortium +UpdateFrequency: Continuously updated as new experimental and computational data are generated. +Tags: + - brain images + - brain models + - electrophysiology + - electron microscopy + - imaging + - Mus musculus + - neuroscience + - neurobiology + - neuroimaging + - neurophysiology + - simulation neuroscience + - single neuron models +License: There are no restrictions on the use of this data. +Resources: + - Description: Multimodal dataset from Baz1a pyramidal neurons in mouse V1 and endodermal neurons in Hydra vulgaris. Includes TIFF, ABF, MAT, CSV, PNG, PY, HOC, and SWC files. 
+ ARN: arn:aws:s3:::dendritic-consortium + Region: us-east-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Tutorial: Download and Visualize Data from the Dendritic Consortium Dataset + URL: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium/tree/main/tutorials + AuthorName: Dendritic Consortium + AuthorURL: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium + Tools & Applications: + - Title: + URL: + AuthorName: + AuthorURL: +ADXCategories: + - Healthcare & Life Sciences Data \ No newline at end of file From 9a332da3b8b4967b187570eba0577952ac615a1f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juan=20Pablo=20Casta=C3=B1o?= <64097439+jpcastanoo@users.noreply.github.com> Date: Mon, 30 Jun 2025 09:24:03 -0400 Subject: [PATCH 063/751] Update yaml file --- ...{dendritic-consortium.yaml.txt => dendritic-consortium.yaml} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename datasets/{dendritic-consortium.yaml.txt => dendritic-consortium.yaml} (95%) diff --git a/datasets/dendritic-consortium.yaml.txt b/datasets/dendritic-consortium.yaml similarity index 95% rename from datasets/dendritic-consortium.yaml.txt rename to datasets/dendritic-consortium.yaml index a34f5d35c..73592aec5 100644 --- a/datasets/dendritic-consortium.yaml.txt +++ b/datasets/dendritic-consortium.yaml @@ -35,4 +35,4 @@ DataAtWork: AuthorName: AuthorURL: ADXCategories: - - Healthcare & Life Sciences Data \ No newline at end of file + - Healthcare & Life Sciences Data From 0c91b71049d397190ff90cebf42567e116304d7c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juan=20Pablo=20Casta=C3=B1o?= <64097439+jpcastanoo@users.noreply.github.com> Date: Mon, 30 Jun 2025 12:32:16 -0400 Subject: [PATCH 064/751] Modify dendritic-consortium.yaml --- datasets/dendritic-consortium.yaml | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/datasets/dendritic-consortium.yaml b/datasets/dendritic-consortium.yaml index 73592aec5..a38a3dfd7 100644 --- 
a/datasets/dendritic-consortium.yaml +++ b/datasets/dendritic-consortium.yaml @@ -29,10 +29,5 @@ DataAtWork: URL: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium/tree/main/tutorials AuthorName: Dendritic Consortium AuthorURL: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium - Tools & Applications: - - Title: - URL: - AuthorName: - AuthorURL: ADXCategories: - - Healthcare & Life Sciences Data + - Healthcare & Life Sciences Data \ No newline at end of file From 7217d396ccc952e7e397af878fd62d97b3ed5eea Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juan=20Pablo=20Casta=C3=B1o?= <64097439+jpcastanoo@users.noreply.github.com> Date: Mon, 30 Jun 2025 12:39:17 -0400 Subject: [PATCH 065/751] Modify dendritic-consortium.yaml --- datasets/dendritic-consortium.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/dendritic-consortium.yaml b/datasets/dendritic-consortium.yaml index a38a3dfd7..22867c9a1 100644 --- a/datasets/dendritic-consortium.yaml +++ b/datasets/dendritic-consortium.yaml @@ -19,13 +19,13 @@ Tags: - single neuron models License: There are no restrictions on the use of this data. Resources: - - Description: Multimodal dataset from Baz1a pyramidal neurons in mouse V1 and endodermal neurons in Hydra vulgaris. Includes TIFF, ABF, MAT, CSV, PNG, PY, HOC, and SWC files. + - Description: "Multimodal dataset from Baz1a pyramidal neurons in mouse V1 and endodermal neurons in Hydra vulgaris, including TIFF, ABF, MAT, CSV, PNG, PY, HOC, and SWC files." 
ARN: arn:aws:s3:::dendritic-consortium Region: us-east-2 Type: S3 Bucket DataAtWork: Tutorials: - - Title: Tutorial: Download and Visualize Data from the Dendritic Consortium Dataset + - Title: Download and Visualize Data from the Dendritic Consortium Dataset URL: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium/tree/main/tutorials AuthorName: Dendritic Consortium AuthorURL: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium From aeabe6deffd3413086af31c5908ab8ba1cc03228 Mon Sep 17 00:00:00 2001 From: Hyun Min Kang Date: Mon, 30 Jun 2025 15:38:14 -0400 Subject: [PATCH 066/751] Added CartoStore --- datasets/cartostore.yaml | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) create mode 100644 datasets/cartostore.yaml diff --git a/datasets/cartostore.yaml b/datasets/cartostore.yaml new file mode 100644 index 000000000..356070b57 --- /dev/null +++ b/datasets/cartostore.yaml @@ -0,0 +1,35 @@ +Name: CartoStore +Description: | + Cross-Platform Repository for High-resolution Spatial Transcriptomics Datasets. +Documentation: "[CartoStore Overview](https://github.com/seqscope/cartostore)" +Contact: hmkang@umich.edu +ManagedBy: "[Hyun Min Kang](https://scholar.google.com/citations?user=8e0jy0IAAAAJ&hl=en)" +UpdateFrequency: Monthly +Tags: + - spatial transcriptomics + - spatial omics + - genomics + - PMTiles + - geospatial +License: | + "[CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)" +Citation: | + CartoStore by Hyun Min Kang's lab at the University of Michigan School of Public Health. + Provided by Kang lab and accessed [DAY MONTH YEAR]. 
+Resources: + - Description: Parquet and Shapefiles + ARN: arn:aws:s3:::carostore + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: CartoStore Overview + URL: https://github.com/seqscope/cartostore/blob/main/README.md + AuthorName: Hyun Min Kang and Weiqiu Cheng + - Title: Cartloader Documentation + URL: https://seqscope.github.io/cartloader/ + AuthorName: Hyun Min Kang and Weiqiu Cheng + Example Datasets: + - Title : Example CartoStore Repository for Xenium Breast Cancer Dataset + URL: https://zenodo.org/records/15649152 + AuthorName: Hyun Min Kang and Weiqiu Cheng From cf18f9dafe51478022e9c80412e25dc8b7c52a82 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juan=20Pablo=20Casta=C3=B1o?= <64097439+jpcastanoo@users.noreply.github.com> Date: Tue, 1 Jul 2025 10:48:49 -0400 Subject: [PATCH 067/751] Update dendritic-consortium.yaml --- datasets/dendritic-consortium.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/dendritic-consortium.yaml b/datasets/dendritic-consortium.yaml index 22867c9a1..fd82b01f0 100644 --- a/datasets/dendritic-consortium.yaml +++ b/datasets/dendritic-consortium.yaml @@ -10,6 +10,7 @@ Tags: - electrophysiology - electron microscopy - imaging + - life sciences - Mus musculus - neuroscience - neurobiology @@ -30,4 +31,4 @@ DataAtWork: AuthorName: Dendritic Consortium AuthorURL: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium ADXCategories: - - Healthcare & Life Sciences Data \ No newline at end of file + - Healthcare & Life Sciences Data From 3d5e49bb9a6d1c9e57fbde13f7275c9d8b332def Mon Sep 17 00:00:00 2001 From: tim-essential Date: Tue, 1 Jul 2025 10:35:05 -0700 Subject: [PATCH 068/751] docs: adds metadata for Essential-Web v1.0 --- datasets/eai-essential-web-v1.yaml | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) create mode 100644 datasets/eai-essential-web-v1.yaml diff --git a/datasets/eai-essential-web-v1.yaml b/datasets/eai-essential-web-v1.yaml new file mode 
100644 index 000000000..89831258d --- /dev/null +++ b/datasets/eai-essential-web-v1.yaml @@ -0,0 +1,26 @@ +Name: Essential-Web v1.0: 24T tokens of organized web data +Description: A 24-trillion-token dataset in which every document is annotated with a twelve-category taxonomy covering topic, format, content complexity, and quality. +Documentation: https://huggingface.co/datasets/EssentialAI/essential-web-v1.0 +Contact: research@essential.ai +ManagedBy: '[EssentialAI](https://www.essential.ai)' +UpdateFrequency: Not updated +Tags: + - aws-pds + - machine learning + - natural language processing + - web data + - text +License: 'Essential-Web-v1.0 contributions are made available under the [ODC attribution license](https://opendatacommons.org/licenses/by/odc_by_1.0_public_text.txt); however, users should also abide by the [Common Crawl - Terms of Use](https://commoncrawl.org/terms-of-use). We do not alter the license of any of the underlying data.' +Resources: + - Description: Essential-Web v1.0: 24T tokens of organized web data + ARN: # TODO: fill in + Region: # TODO: fill in + Type: S3 Bucket + Explore: + - https://huggingface.co/datasets/EssentialAI/essential-web-v1.0 +DataAtWork: + Publications: + - Title: 'Essential-Web v1.0: 24T tokens of organized web data' + URL: https://arxiv.org/abs/2506.14111 + AuthorName: Andrew Hojel, Michael Pust, Tim Romanski, Yash Vanjani, Ritvik Kapila, Mohit Parmar et al. 
+ AuthorURL: https://arxiv.org/abs/2506.14111 From ae2c9329b06a8d4c196b0b03072e7159021626ab Mon Sep 17 00:00:00 2001 From: Troy Raen Date: Thu, 26 Jun 2025 03:39:50 -0700 Subject: [PATCH 069/751] Add SPHEREx dataset yaml --- datasets/spherex.yaml | 81 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 81 insertions(+) create mode 100644 datasets/spherex.yaml diff --git a/datasets/spherex.yaml b/datasets/spherex.yaml new file mode 100644 index 000000000..45313116e --- /dev/null +++ b/datasets/spherex.yaml @@ -0,0 +1,81 @@ +Name: 'SPHEREx: An All-Sky Spectral Survey' +Description: 'The Spectro-Photometer for the History of the Universe, Epoch of Reionization, and Ices Explorer (SPHEREx) is a NASA Astrophysics Medium-class Explorer (MIDEX) mission launched in March 2025. During its planned two-year mission, SPHEREx will perform the first ever all-sky spectral survey in the optical to near-infrared (0.75-5 microns). SPHEREx data will be used to probe inflation and the early universe, trace the history of galactic light production, and investigate the origin of planetary systems and biogenic ices, in addition to contributing to many other astrophysics research topics.' +Documentation: https://irsa.ipac.caltech.edu/Missions/spherex.html +Contact: https://irsa.ipac.caltech.edu/docs/help_desk.html +ManagedBy: "NASA/IPAC Infrared Science Archive ([IRSA](https://irsa.ipac.caltech.edu)) at Caltech" +UpdateFrequency: The SPHEREx mission releases small updates weekly and large updates annually. The data may also be presented in new ways as the products become available. +Tags: + - astronomy + - imaging + - object detection + - satellite imagery + - survey +License: https://irsa.ipac.caltech.edu/data_use_terms.html +Citation: "If you use SPHEREx data, please follow the citation instructions provided at [FILL IN] and [https://irsa.ipac.caltech.edu/ack.html](https://irsa.ipac.caltech.edu/ack.html)."
+Resources: + - Description: 'Linear Variable Filter (LVF) Images: Calibrated LVF images plus per-pixel status and processing flags, variance map, zodiacal model, exposure-averaged PSF, and wavelength WCS. Multi-extension FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr/level2 + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Absolute Gain Matrix: Pixel-to-pixel gain variations within a single spectral channel and relative gain differences across channels. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr/abs_gain_matrix + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Gain Factors: [DESCRIPTION]. YAML format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr/gain_factors + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Exposure-Averaged PSF: Wavelength-dependent point spread function (PSF) estimates on a fine positional grid across each detector. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr/average_psf + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Dark Current: [DESCRIPTION]. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr/dark + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Dichroic: [DESCRIPTION]. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr/dichroic + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Non-functional Pixel Map: Per-exposure, per-detector bad pixel flags. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr/nonfunc + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Non-Linearity Correction: Corrections applied to compensate for detector non-linearity due to gain degradation. FITS format.' 
+ ARN: arn:aws:s3:::nasa-irsa-spherex/qr/nonlinear_pars + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Readout Noise: [DESCRIPTION]. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr/readnoise_pars + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Spectral WCS Map: Detailed WCS map. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr/spectral_wcs + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False +DataAtWork: + Tutorials: + - Title: Notebook Tutorials + URL: https://irsa.ipac.caltech.edu/docs/notebooks/#accessing-euclid-data + AuthorName: Caltech/IPAC-IRSA + AuthorURL: https://irsa.ipac.caltech.edu From f00a84d68224c467ec9b8569cf6116e3f4b14a9b Mon Sep 17 00:00:00 2001 From: Troy Raen Date: Tue, 1 Jul 2025 01:12:08 -0700 Subject: [PATCH 070/751] Apply feedback from @vandesai1 --- datasets/spherex.yaml | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/datasets/spherex.yaml b/datasets/spherex.yaml index 45313116e..711431573 100644 --- a/datasets/spherex.yaml +++ b/datasets/spherex.yaml @@ -11,9 +11,9 @@ Tags: - satellite imagery - survey License: https://irsa.ipac.caltech.edu/data_use_terms.html -Citation: "If you use SPHEREx data, please follow the citation instructions provided at [FILL IN] and [https://irsa.ipac.caltech.edu/ack.html](https://irsa.ipac.caltech.edu/ack.html)." 
+Citation: 'If you use SPHEREx data from the IRSA archive, please cite the appropriate Digital Object Identifier: [10.26131/IRSA629](https://www.ipac.caltech.edu/doi/irsa/10.26131/IRSA629), include the following acknowledgement: "This publication makes use of data products from the Spectro-Photometer for the History of the Universe, Epoch of Reionization and Ices Explorer (SPHEREx), which is a joint project of the Jet Propulsion Laboratory and the California Institute of Technology, and is funded by the National Aeronautics and Space Administration.", and follow the [IRSA acknowledgement guidelines](https://irsa.ipac.caltech.edu/ack.html).' Resources: - - Description: 'Linear Variable Filter (LVF) Images: Calibrated LVF images plus per-pixel status and processing flags, variance map, zodiacal model, exposure-averaged PSF, and wavelength WCS. Multi-extension FITS format.' + - Description: 'Spectral Images: Calibrated Spectral Images plus per-pixel status and processing flags, variance map, zodiacal model, exposure-averaged PSF, and wavelength WCS. Multi-extension FITS format.' ARN: arn:aws:s3:::nasa-irsa-spherex/qr/level2 Region: us-east-1 Type: S3 Bucket @@ -25,31 +25,31 @@ Resources: Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Gain Factors: [DESCRIPTION]. YAML format.' - ARN: arn:aws:s3:::nasa-irsa-spherex/qr/gain_factors - Region: us-east-1 - Type: S3 Bucket - RequesterPays: False - AccountRequired: False - Description: 'Exposure-Averaged PSF: Wavelength-dependent point spread function (PSF) estimates on a fine positional grid across each detector. FITS format.' ARN: arn:aws:s3:::nasa-irsa-spherex/qr/average_psf Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Dark Current: [DESCRIPTION]. FITS format.' + - Description: 'Dark Current: Per pixel dark current. FITS format.' 
ARN: arn:aws:s3:::nasa-irsa-spherex/qr/dark Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Dichroic: [DESCRIPTION]. FITS format.' + - Description: 'Dichroic: Map of pixels affected by flux attenuation due to the dichroic filter. FITS format.' ARN: arn:aws:s3:::nasa-irsa-spherex/qr/dichroic Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Non-functional Pixel Map: Per-exposure, per-detector bad pixel flags. FITS format.' + - Description: 'Gain Factors: Gain factors for each detector. YAML format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr/gain_factors + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'Non-functional Pixel Map: Map of permanently non-functioning pixels. FITS format.' ARN: arn:aws:s3:::nasa-irsa-spherex/qr/nonfunc Region: us-east-1 Type: S3 Bucket @@ -61,13 +61,13 @@ Resources: Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Readout Noise: [DESCRIPTION]. FITS format.' + - Description: 'Readout Noise: Per-detector read noise maps. FITS format.' ARN: arn:aws:s3:::nasa-irsa-spherex/qr/readnoise_pars Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Spectral WCS Map: Detailed WCS map. FITS format.' + - Description: 'Spectral WCS Map: Detailed World Coordinate System (WCS) map. FITS format.' 
ARN: arn:aws:s3:::nasa-irsa-spherex/qr/spectral_wcs Region: us-east-1 Type: S3 Bucket @@ -76,6 +76,6 @@ Resources: DataAtWork: Tutorials: - Title: Notebook Tutorials - URL: https://irsa.ipac.caltech.edu/docs/notebooks/#accessing-euclid-data + URL: https://irsa.ipac.caltech.edu/docs/notebooks/#accessing-spherex-data AuthorName: Caltech/IPAC-IRSA AuthorURL: https://irsa.ipac.caltech.edu From d0f84cfe1de8da82f02654a64300ef0d2bc8e6da Mon Sep 17 00:00:00 2001 From: Troy Raen Date: Tue, 1 Jul 2025 12:55:02 -0700 Subject: [PATCH 071/751] Rename file so url will be registry.opendata.aws/spherex-qr/ --- datasets/{spherex.yaml => spherex-qr.yaml} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename datasets/{spherex.yaml => spherex-qr.yaml} (100%) diff --git a/datasets/spherex.yaml b/datasets/spherex-qr.yaml similarity index 100% rename from datasets/spherex.yaml rename to datasets/spherex-qr.yaml From 346d4122fc50a0f9442db9ea22edcc5144cb191f Mon Sep 17 00:00:00 2001 From: Troy Raen Date: Tue, 1 Jul 2025 19:44:16 -0700 Subject: [PATCH 072/751] Patch text to reflect QR only --- datasets/spherex-qr.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/spherex-qr.yaml b/datasets/spherex-qr.yaml index 711431573..38ded3649 100644 --- a/datasets/spherex-qr.yaml +++ b/datasets/spherex-qr.yaml @@ -1,9 +1,9 @@ -Name: 'SPHEREx: An All-Sky Spectral Survey' -Description: 'The Spectro-Photometer for the History of the Universe, Epoch of Reionization, and Ices Explorer (SPHEREx) is a NASA Astrophysics Medium-class Explorer (MIDEX) mission launched in March 2025. During its planned two-year mission, SPHEREx will perform the first ever all-sky spectral survey in the optical to near-infrared (0.75-5 microns). 
SPHEREx data will be used to probe inflation and the early universe, trace the history of galactic light production, and investigate the origin of planetary systems and biogenic ices, in addition to contributing to many other astrophyics research topics.' +Name: 'SPHEREx Quick Release (QR): An All-Sky Spectral Survey' +Description: 'The Spectro-Photometer for the History of the Universe, Epoch of Reionization, and Ices Explorer (SPHEREx) is a NASA Astrophysics Medium-class Explorer (MIDEX) mission launched in March 2025. During its planned two-year mission, SPHEREx will perform the first ever all-sky spectral survey in the optical to near-infrared (0.75-5 microns). SPHEREx Quick Release (QR) is the first data release. SPHEREx data will be used to probe inflation and the early universe, trace the history of galactic light production, and investigate the origin of planetary systems and biogenic ices, in addition to contributing to many other astrophysics research topics.' Documentation: https://irsa.ipac.caltech.edu/Missions/spherex.html Contact: https://irsa.ipac.caltech.edu/docs/help_desk.html ManagedBy: "NASA/IPAC Infrared Science Archive ([IRSA](https://irsa.ipac.caltech.edu)) at Caltech" -UpdateFrequency: The SPHEREx mission releases small updates weekly and large updates annually. The data may also be presented in new ways as the products become available. +UpdateFrequency: SPHEREx QR is updated weekly. The data may also be presented in new ways as the products become available. 
Tags: - astronomy - imaging From 8618b28d1e5a8584ec5cee23db345bb231185fa3 Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Wed, 2 Jul 2025 13:10:26 -0400 Subject: [PATCH 073/751] Update graf-reforecast.yaml --- datasets/graf-reforecast.yaml | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index 6ce63ce27..a3558b811 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -1,7 +1,7 @@ -Name: Graf Reforecast -Description: NVIDIA and The Weather Company (TWCo) have generated a data set of reforecasts from TWCo’s GRAF (Global high-Resolution Atmospheric Forecasting) model, a version of the National Center for Atmospheric Research (NCAR) Model for Predictions Across Scales (MPAS). GRAF is global, but the configuration for this reforecast had a mesh refinement to ~4 km over the US, Caribbean Basin, and Europe, and 15 km elsewhere. This model was designed to run much of the computation on graphical processing units, with this development assisted by NVIDIA. The intended 1836 reforecast cases (~5 years) will be generated with initial condition dates spanning more than 20 years, 2004-2024. These dates of the chosen initial conditions mostly selected based on high-impact weather in the contiguous US (CONUS) and Caribbean. Sampling in this way, we hypothesize, will span a wider range of interesting, high-impact weather scenarios than had we performed five contiguous years of once-daily reforecasts, and we will span a wider variety of interesting precipitation events while still providing many samples in non-precipitating regions with more ordinary weather. GRAF reforecasts were mostly run to +27 h lead time, 3-h for spin up plus a full diurnal cycle. Data were saved in zarr format. Most fields were at 15-min intervals. Data are made publicly available to all through Amazon Web Services’ Open-Data Initiative. 
+Name: GRAF Reforecast +Description: A zarr-formatted dataset of 1836 reforecast cases (~5 years) from The Weather Company GRAF (Global high-Resolution Atmospheric Forecasting) model, a version of the National Center for Atmospheric Research (NCAR) Model for Predictions Across Scales (MPAS). GRAF is global, but the configuration for this reforecast had a mesh refinement to ~4 km over the US, Caribbean Basin, and Europe, and 15 km elsewhere. This model was designed to run much of its computation on graphical processing units, with this development assisted by NVIDIA. The 1836 cases (~5 years) were generated from ECMWF reanalyses (ERA5) for initial condition dates spanning more than 20 years, 2004-2024. These dates of the chosen initial conditions mostly selected based on high-impact weather in the contiguous US (CONUS) and Caribbean. Sampling in this way spanned a wider range of interesting, high-impact weather scenarios than were there five contiguous years of data. GRAF reforecasts were mostly run to +27 h lead time, assuming a 3-h for spin up followed by a full diurnal cycle. Data were saved in zarr format on the native model vertical coordinate. Most fields were saved at 15-min intervals, though several precipitation variables were saved at 5-min cadence. 
Documentation: -Contact: +Contact: Tom Hamill (tom.hamill@weather.com) ManagedBy: "[The Weather Company](https://www.weathercompany.com/)" UpdateFrequency: One time push only Tags: @@ -12,8 +12,15 @@ Tags: - model - near-surface air temperature - near-surface relative humidity - - zar + - precipitation amount + - precipitation type + - wind speeds + - cloud amount + - visibility + - zarr - weather + - ERA5 + - MPAS License: "[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" Citation: Resources: @@ -21,6 +28,7 @@ Resources: ARN: arn:aws:s3:::twc-graf-reforecast Region: us-west-2 Type: S3 Bucket + Explore: DataAtWork: Tutorials: - Title: @@ -29,11 +37,6 @@ DataAtWork: AuthorName: AuthorURL: Services: - Tools & Applications: - - Title: - URL: - AuthorName: - AuthorURL: Publications: - Title: Global reforecasts from MPAS “GRAF” with mesh refinement over the US and Europe URL: https://cesoc.net/wp-content/uploads/2024/08/GRAF-reforecast-Hamill-CESOC24.pdf From 4209fff6e02cf8a286c1028b2b523778c4461157 Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Wed, 2 Jul 2025 17:44:12 -0400 Subject: [PATCH 074/751] Update graf-reforecast.yaml --- datasets/graf-reforecast.yaml | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index a3558b811..dc07f2b41 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -1,7 +1,7 @@ Name: GRAF Reforecast Description: A zarr-formatted dataset of 1836 reforecast cases (~5 years) from The Weather Company GRAF (Global high-Resolution Atmospheric Forecasting) model, a version of the National Center for Atmospheric Research (NCAR) Model for Predictions Across Scales (MPAS). GRAF is global, but the configuration for this reforecast had a mesh refinement to ~4 km over the US, Caribbean Basin, and Europe, and 15 km elsewhere. 
This model was designed to run much of its computation on graphical processing units, with this development assisted by NVIDIA. The 1836 cases (~5 years) were generated from ECMWF reanalyses (ERA5) for initial condition dates spanning more than 20 years, 2004-2024. These dates of the chosen initial conditions mostly selected based on high-impact weather in the contiguous US (CONUS) and Caribbean. Sampling in this way spanned a wider range of interesting, high-impact weather scenarios than were there five contiguous years of data. GRAF reforecasts were mostly run to +27 h lead time, assuming a 3-h for spin up followed by a full diurnal cycle. Data were saved in zarr format on the native model vertical coordinate. Most fields were saved at 15-min intervals, though several precipitation variables were saved at 5-min cadence. -Documentation: -Contact: Tom Hamill (tom.hamill@weather.com) +Documentation: "[Documentation](https://docs.google.com/forms/d/e/1FAIpQLSejRyG2CXrfcmrX7g_iFhc3RF-n3ZzmPQdVieSDwTzLNkR-_w/viewform)" +Contact: graf.reforecast@weather.com ManagedBy: "[The Weather Company](https://www.weathercompany.com/)" UpdateFrequency: One time push only Tags: @@ -22,21 +22,12 @@ Tags: - ERA5 - MPAS License: "[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" -Citation: Resources: - Description: GRAF Reforecast dataset ARN: arn:aws:s3:::twc-graf-reforecast Region: us-west-2 Type: S3 Bucket - Explore: DataAtWork: - Tutorials: - - Title: - URL: - NotebookURL: - AuthorName: - AuthorURL: - Services: Publications: - Title: Global reforecasts from MPAS “GRAF” with mesh refinement over the US and Europe URL: https://cesoc.net/wp-content/uploads/2024/08/GRAF-reforecast-Hamill-CESOC24.pdf From 3b37573a958f3f65249a97a866cdd5ecebdca9a1 Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Wed, 2 Jul 2025 17:47:01 -0400 Subject: [PATCH 075/751] Update graf-reforecast.yaml --- datasets/graf-reforecast.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff 
--git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index dc07f2b41..cde11dbaf 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -1,5 +1,5 @@ Name: GRAF Reforecast -Description: A zarr-formatted dataset of 1836 reforecast cases (~5 years) from The Weather Company GRAF (Global high-Resolution Atmospheric Forecasting) model, a version of the National Center for Atmospheric Research (NCAR) Model for Predictions Across Scales (MPAS). GRAF is global, but the configuration for this reforecast had a mesh refinement to ~4 km over the US, Caribbean Basin, and Europe, and 15 km elsewhere. This model was designed to run much of its computation on graphical processing units, with this development assisted by NVIDIA. The 1836 cases (~5 years) were generated from ECMWF reanalyses (ERA5) for initial condition dates spanning more than 20 years, 2004-2024. These dates of the chosen initial conditions mostly selected based on high-impact weather in the contiguous US (CONUS) and Caribbean. Sampling in this way spanned a wider range of interesting, high-impact weather scenarios than were there five contiguous years of data. GRAF reforecasts were mostly run to +27 h lead time, assuming a 3-h for spin up followed by a full diurnal cycle. Data were saved in zarr format on the native model vertical coordinate. Most fields were saved at 15-min intervals, though several precipitation variables were saved at 5-min cadence. +Description: "A zarr-formatted dataset of 1836 reforecast cases (~5 years) from The Weather Company GRAF (Global high-Resolution Atmospheric Forecasting) model, a version of the National Center for Atmospheric Research (NCAR) Model for Predictions Across Scales (MPAS). GRAF is global, but the configuration for this reforecast had a mesh refinement to ~4 km over the US, Caribbean Basin, and Europe, and 15 km elsewhere. 
This model was designed to run much of its computation on graphical processing units, with this development assisted by NVIDIA. The 1836 cases (~5 years) were generated from ECMWF reanalyses (ERA5) for initial condition dates spanning more than 20 years, 2004-2024. These dates of the chosen initial conditions mostly selected based on high-impact weather in the contiguous US (CONUS) and Caribbean. Sampling in this way spanned a wider range of interesting, high-impact weather scenarios than were there five contiguous years of data. GRAF reforecasts were mostly run to +27 h lead time, assuming a 3-h for spin up followed by a full diurnal cycle. Data were saved in zarr format on the native model vertical coordinate. Most fields were saved at 15-min intervals, though several precipitation variables were saved at 5-min cadence." Documentation: "[Documentation](https://docs.google.com/forms/d/e/1FAIpQLSejRyG2CXrfcmrX7g_iFhc3RF-n3ZzmPQdVieSDwTzLNkR-_w/viewform)" Contact: graf.reforecast@weather.com ManagedBy: "[The Weather Company](https://www.weathercompany.com/)" From 613c5a08a823036ba89b23e74ecd86b6b8f950d4 Mon Sep 17 00:00:00 2001 From: Jed Sundwall Date: Wed, 2 Jul 2025 15:38:23 -0700 Subject: [PATCH 076/751] Add Clay datasets from Source Cooperative Add three new datasets from Source Cooperative's Clay platform: - clay-v1-5-sentinel2: Sentinel-2 satellite imagery data - clay-v1-5-naip-2: NAIP aerial imagery data - clay-model-v0-embeddings: Machine learning model embeddings These datasets provide satellite imagery, aerial photography, and AI model embeddings for earth observation and computer vision applications. 
--- datasets/clay-model-v0-embeddings.yaml | 27 ++++++++++++++++++++++++ datasets/clay-v1-5-naip-2.yaml | 27 ++++++++++++++++++++++++ datasets/clay-v1-5-sentinel2.yaml | 29 ++++++++++++++++++++++++++ 3 files changed, 83 insertions(+) create mode 100644 datasets/clay-model-v0-embeddings.yaml create mode 100644 datasets/clay-v1-5-naip-2.yaml create mode 100644 datasets/clay-v1-5-sentinel2.yaml diff --git a/datasets/clay-model-v0-embeddings.yaml b/datasets/clay-model-v0-embeddings.yaml new file mode 100644 index 000000000..fa22f879e --- /dev/null +++ b/datasets/clay-model-v0-embeddings.yaml @@ -0,0 +1,27 @@ +Name: Clay Model v0 Embeddings +Description: Machine learning model embeddings dataset providing pre-computed feature representations for satellite and aerial imagery analysis. +Documentation: https://source.coop/repositories/clay/clay-model-v0-embeddings/description +Contact: contact@madewithclay.org +ManagedBy: "[Source Cooperative](https://source.coop/)" +UpdateFrequency: As new model versions become available +Tags: + - aws-pds + - machine learning + - embeddings + - computer vision + - satellite imagery + - aerial imagery + - feature extraction + - ai +License: Creative Commons Attribution 4.0 International License +Citation: "Clay Model v0 Embeddings. Source Cooperative. 
https://source.coop/repositories/clay/clay-model-v0-embeddings/description" +Resources: + - Description: Clay Model v0 Embeddings S3 Bucket + ARN: arn:aws:s3:::us-west-2.opendata.source.coop/clay/clay-model-v0-embeddings + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://source.coop/clay/clay-model-v0-embeddings/)' +ADXCategories: + - Machine Learning Data + - Computer Vision Data \ No newline at end of file diff --git a/datasets/clay-v1-5-naip-2.yaml b/datasets/clay-v1-5-naip-2.yaml new file mode 100644 index 000000000..75e9816cb --- /dev/null +++ b/datasets/clay-v1-5-naip-2.yaml @@ -0,0 +1,27 @@ +Name: Clay v1.5 NAIP-2 +Description: National Agriculture Imagery Program (NAIP) dataset providing high-resolution aerial imagery for agricultural monitoring, land use analysis, and natural resource management. +Documentation: https://source.coop/repositories/clay/clay-v1-5-naip-2/description +Contact: contact@madewithclay.org +ManagedBy: "[Source Cooperative](https://source.coop/)" +UpdateFrequency: As new NAIP data becomes available +Tags: + - aws-pds + - aerial imagery + - naip + - agriculture + - land use + - natural resources + - remote sensing + - environmental +License: Creative Commons Attribution 4.0 International License +Citation: "Clay v1.5 NAIP-2. Source Cooperative. 
https://source.coop/repositories/clay/clay-v1-5-naip-2/description" +Resources: + - Description: Clay v1.5 NAIP-2 S3 Bucket + ARN: arn:aws:s3:::us-west-2.opendata.source.coop/clay/clay-v1-5-naip-2 + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://source.coop/clay/clay-v1-5-naip-2/)' +ADXCategories: + - Earth Observation Data + - Aerial Imagery \ No newline at end of file diff --git a/datasets/clay-v1-5-sentinel2.yaml b/datasets/clay-v1-5-sentinel2.yaml new file mode 100644 index 000000000..6fea1200e --- /dev/null +++ b/datasets/clay-v1-5-sentinel2.yaml @@ -0,0 +1,29 @@ +Name: Clay v1.5 Sentinel-2 +Description: Sentinel-2 satellite imagery dataset providing high-resolution optical data for land monitoring, agriculture, and environmental applications. +Documentation: https://source.coop/repositories/clay/clay-v1-5-sentinel2/description +Contact: contact@madewithclay.org +ManagedBy: "[Source Cooperative](https://source.coop/)" +UpdateFrequency: As new Sentinel-2 data becomes available +Tags: + - aws-pds + - satellite imagery + - sentinel-2 + - earth observation + - remote sensing + - agriculture + - land monitoring + - environmental +License: Creative Commons Attribution 4.0 International License +Citation: "Clay v1.5 Sentinel-2. Source Cooperative. 
https://source.coop/repositories/clay/clay-v1-5-sentinel2/description" +Resources: + - Description: Clay v1.5 Sentinel-2 S3 Bucket + ARN: arn:aws:s3:::us-west-2.opendata.source.coop/clay/clay-v1-5-sentinel2 + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://source.coop/repositories/clay/clay-v1-5-sentinel2/description)' +ADXCategories: + - Earth Observation Data + - Satellite Imagery + + From 66b855cb6b30c258706f9f2b520f5de17db58d99 Mon Sep 17 00:00:00 2001 From: Jed Sundwall Date: Wed, 2 Jul 2025 16:06:50 -0700 Subject: [PATCH 077/751] Fix invalid tags in Clay datasets - remove aws-pds and invalid tags --- datasets/clay-model-v0-embeddings.yaml | 6 ++---- datasets/clay-v1-5-naip-2.yaml | 5 +---- datasets/clay-v1-5-sentinel2.yaml | 3 --- 3 files changed, 3 insertions(+), 11 deletions(-) diff --git a/datasets/clay-model-v0-embeddings.yaml b/datasets/clay-model-v0-embeddings.yaml index fa22f879e..81207fe4d 100644 --- a/datasets/clay-model-v0-embeddings.yaml +++ b/datasets/clay-model-v0-embeddings.yaml @@ -5,14 +5,12 @@ Contact: contact@madewithclay.org ManagedBy: "[Source Cooperative](https://source.coop/)" UpdateFrequency: As new model versions become available Tags: - - aws-pds - machine learning - - embeddings - computer vision - satellite imagery - aerial imagery - - feature extraction - - ai + - earth observation + - imaging License: Creative Commons Attribution 4.0 International License Citation: "Clay Model v0 Embeddings. Source Cooperative. 
https://source.coop/repositories/clay/clay-model-v0-embeddings/description" Resources: diff --git a/datasets/clay-v1-5-naip-2.yaml b/datasets/clay-v1-5-naip-2.yaml index 75e9816cb..9a211bfd3 100644 --- a/datasets/clay-v1-5-naip-2.yaml +++ b/datasets/clay-v1-5-naip-2.yaml @@ -5,13 +5,10 @@ Contact: contact@madewithclay.org ManagedBy: "[Source Cooperative](https://source.coop/)" UpdateFrequency: As new NAIP data becomes available Tags: - - aws-pds - aerial imagery - - naip - agriculture - land use - - natural resources - - remote sensing + - natural resource - environmental License: Creative Commons Attribution 4.0 International License Citation: "Clay v1.5 NAIP-2. Source Cooperative. https://source.coop/repositories/clay/clay-v1-5-naip-2/description" diff --git a/datasets/clay-v1-5-sentinel2.yaml b/datasets/clay-v1-5-sentinel2.yaml index 6fea1200e..f967a9d3b 100644 --- a/datasets/clay-v1-5-sentinel2.yaml +++ b/datasets/clay-v1-5-sentinel2.yaml @@ -5,11 +5,8 @@ Contact: contact@madewithclay.org ManagedBy: "[Source Cooperative](https://source.coop/)" UpdateFrequency: As new Sentinel-2 data becomes available Tags: - - aws-pds - satellite imagery - - sentinel-2 - earth observation - - remote sensing - agriculture - land monitoring - environmental From e6828fc55f777720a51ecc0bd4fa667f304598a8 Mon Sep 17 00:00:00 2001 From: Jed Sundwall Date: Wed, 2 Jul 2025 16:18:26 -0700 Subject: [PATCH 078/751] Fix invalid 'land monitoring' tag - replace with valid 'land use' tag --- datasets/clay-v1-5-sentinel2.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/clay-v1-5-sentinel2.yaml b/datasets/clay-v1-5-sentinel2.yaml index f967a9d3b..706f7a39f 100644 --- a/datasets/clay-v1-5-sentinel2.yaml +++ b/datasets/clay-v1-5-sentinel2.yaml @@ -8,7 +8,7 @@ Tags: - satellite imagery - earth observation - agriculture - - land monitoring + - land use - environmental License: Creative Commons Attribution 4.0 International License Citation: "Clay v1.5 Sentinel-2. 
Source Cooperative. https://source.coop/repositories/clay/clay-v1-5-sentinel2/description" From b48c735ac1c699706972871f7ac9194b44fbb271 Mon Sep 17 00:00:00 2001 From: Jed Sundwall Date: Wed, 2 Jul 2025 16:37:12 -0700 Subject: [PATCH 079/751] Fix invalid ADX categories - use only valid categories from adx_categories.yaml --- datasets/clay-model-v0-embeddings.yaml | 3 +-- datasets/clay-v1-5-naip-2.yaml | 3 +-- datasets/clay-v1-5-sentinel2.yaml | 3 +-- 3 files changed, 3 insertions(+), 6 deletions(-) diff --git a/datasets/clay-model-v0-embeddings.yaml b/datasets/clay-model-v0-embeddings.yaml index 81207fe4d..8bff0851f 100644 --- a/datasets/clay-model-v0-embeddings.yaml +++ b/datasets/clay-model-v0-embeddings.yaml @@ -21,5 +21,4 @@ Resources: Explore: - '[Browse Dataset](https://source.coop/clay/clay-model-v0-embeddings/)' ADXCategories: - - Machine Learning Data - - Computer Vision Data \ No newline at end of file + - Environmental Data \ No newline at end of file diff --git a/datasets/clay-v1-5-naip-2.yaml b/datasets/clay-v1-5-naip-2.yaml index 9a211bfd3..03bd8ae35 100644 --- a/datasets/clay-v1-5-naip-2.yaml +++ b/datasets/clay-v1-5-naip-2.yaml @@ -20,5 +20,4 @@ Resources: Explore: - '[Browse Dataset](https://source.coop/clay/clay-v1-5-naip-2/)' ADXCategories: - - Earth Observation Data - - Aerial Imagery \ No newline at end of file + - Environmental Data \ No newline at end of file diff --git a/datasets/clay-v1-5-sentinel2.yaml b/datasets/clay-v1-5-sentinel2.yaml index 706f7a39f..9efab89c4 100644 --- a/datasets/clay-v1-5-sentinel2.yaml +++ b/datasets/clay-v1-5-sentinel2.yaml @@ -20,7 +20,6 @@ Resources: Explore: - '[Browse Dataset](https://source.coop/repositories/clay/clay-v1-5-sentinel2/description)' ADXCategories: - - Earth Observation Data - - Satellite Imagery + - Environmental Data From b8cfbc2778072ad619c2e760bd2e2fcfc319c94b Mon Sep 17 00:00:00 2001 From: Zoheyr Doctor Date: Wed, 2 Jul 2025 18:47:13 -0500 Subject: [PATCH 080/751] Added quotes to citation 
--- datasets/mbers-open-data.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/mbers-open-data.yaml b/datasets/mbers-open-data.yaml index a86e0ce15..6d98c4c7c 100644 --- a/datasets/mbers-open-data.yaml +++ b/datasets/mbers-open-data.yaml @@ -13,7 +13,7 @@ Tags: - energy modeling - environmental License: All data are free and provided without license restrictions. -Citation: Marginal Build Emissions Rates (MBERs) for Electricity. Climate TRACE. [DATE]. URL: https://www.gem.wiki/MBERs +Citation: "Marginal Build Emissions Rates (MBERs) for Electricity. Climate TRACE. [DATE]. URL: https://www.gem.wiki/MBERs" DataAtWork: Tutorials: - Title: MBER Orientation and Tutorial From 4a861e882a6fb3c5453b6c10172d52183b00f707 Mon Sep 17 00:00:00 2001 From: nanaboamah89 Date: Thu, 3 Jul 2025 14:39:48 +0000 Subject: [PATCH 081/751] created yaml for lake water quality --- datasets/deafrica-clgm-lwq.yaml | 104 ++++++++++++++++++++++++++++++++ 1 file changed, 104 insertions(+) create mode 100644 datasets/deafrica-clgm-lwq.yaml diff --git a/datasets/deafrica-clgm-lwq.yaml b/datasets/deafrica-clgm-lwq.yaml new file mode 100644 index 000000000..f5e9a1d6a --- /dev/null +++ b/datasets/deafrica-clgm-lwq.yaml @@ -0,0 +1,104 @@ +Name: Digital Earth Africa - Copernicus Global Land Service - Lake Water Quality +Description: | + The Copernicus Global Land Service – Lake Water Quality products offer a comprehensive, satellite-derived monitoring system for assessing key water quality indicators in major large lakes, typically those greater than 50 hectares. These datasets are generated using optical satellite sensors, primarily Sentinel-2 MSI and Sentinel-3 OLCI, with earlier archives derived from Envisat MERIS. Spanning multiple spatial resolutions (100 m and 300 m) and temporal scales (10-day composites), they support both near-real-time and retrospective assessments of inland water quality. 
+ + Key parameters include surface reflectance, turbidity, total suspended matter (TSM), chlorophyll-a concentration, trophic state index, and floating cyanobacteria risk—all essential for monitoring eutrophication, ecological health, and harmful algal blooms (HABs). The datasets cover the period from 2002 to the present, providing long-term continuity for environmental monitoring and scientific research, with focused coverage in Europe and Africa. + + All products are delivered using standardized geospatial grids (EPSG:4326) and include quality flags, detailed metadata, and validation against in situ observations to ensure reliability. Continuous improvements across product versions—such as enhanced atmospheric correction and updated retrieval algorithms—have significantly improved accuracy and usability. In addition, comprehensive user manuals, technical documentation, and support materials are available, making the data highly accessible to researchers, policymakers, and environmental managers. + + Digital Earth Africa (DE Africa) hosts these datasets for the African region, providing free and open access to both the data and associated tools. + +Documentation: https://docs.digitalearthafrica.org/en/latest/data_specs/CGLM_Lake_Water_Quality_specs.html +Contact: helpdesk@digitalearthafrica.org +ManagedBy: "[Digital Earth Africa](https://www.digitalearthafrica.org/)" +UpdateFrequency: New scene-level data is added regularly, as the Lake Water Quality (LWQ) datasets are updated every 10 days (dekadal composites), with near-real-time versions typically available within 3 to 4 days after satellite acquisition. 
+Collabs: + ASDI: + Tags: + - satellite imagery +Tags: + - aws-pds + - agriculture + - disaster response + - earth observation + - geospatial + - natural resource + - satellite imagery + - water + - deafrica + - stac + - cog +License: | + DE Africa makes this data available under the Creative Commons Attribution 4.0 license https://creativecommons.org/licenses/by/4.0/. +Resources: + - Description: Lake Water Quality 2019-2024 (raster 100 m), 10-daily – version 1 + ARN: arn:aws:s3:::deafrica-input-datasets/cgls_lwq100_2019_2024 + Region: af-south-1 + Type: S3 Bucket + RequesterPays: False + Explore: + - '[STAC V1.0.0 endpoint](https://explorer.digitalearth.africa/stac/collections/cgls_lwq100_2019_2024)' + - Description: Lake Water Quality 2024 - present (raster 100 m), 10-daily – version 2 + ARN: arn:aws:s3:::deafrica-input-datasets/cgls_lwq100_2024_nrt + Region: af-south-1 + Type: S3 Bucket + RequesterPays: False + Explore: + - '[STAC V1.0.0 endpoint](https://explorer.digitalearth.africa/stac/collections/cgls_lwq100_2024_nrt)' + - Description: Lake Water Quality 2002-2012 (raster 300 m), 10-daily – version 1 + ARN: arn:aws:s3:::deafrica-services/cgls_lwq300_2002_2012 + Region: af-south-1 + Type: S3 Bucket + RequesterPays: False + Explore: + - '[STAC V1.0.0 endpoint](https://explorer.digitalearth.africa/stac/collections/cgls_lwq300_2002_2012)' + - Description: Lake Water Quality 2016-2024 (raster 300 m), 10-daily – version 1 + ARN: arn:aws:s3:::deafrica-services/cgls_lwq300_2016_2024 + Region: af-south-1 + Type: S3 Bucket + RequesterPays: False + Explore: + - '[STAC V1.0.0 endpoint](https://explorer.digitalearth.africa/stac/collections/cgls_lwq300_2016_2024)' + - Description: Lake Water Quality 2024 - present (raster 300 m), 10-daily – version 2 + ARN: arn:aws:s3:::deafrica-services/cgls_lwq300_2024_nrt + Region: af-south-1 + Type: S3 Bucket + RequesterPays: False + Explore: + - '[STAC V1.0.0
endpoint](https://explorer.digitalearth.africa/stac/collections/cgls_lwq300_2024_nrt)' +DataAtWork: + Tutorials: + - Title: Digital Earth Africa Training + URL: http://learn.digitalearthafrica.org/ + AuthorName: Digital Earth Africa Contributors + Tools & Applications: + - Title: "Digital Earth Africa Explorer (Lake Water Quality 2019-2024 (raster 100 m), 10-daily – version 1)" + URL: https://explorer.digitalearth.africa/products/cgls_lwq100_2019_2024 + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Explorer (Lake Water Quality 2024 - present (raster 100 m), 10-daily – version 2)" + URL: https://explorer.digitalearth.africa/products/cgls_lwq100_2024_nrt + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Explorer (Lake Water Quality 2002-2012 (raster 300 m), 10-daily – version 1)" + URL: https://explorer.digitalearth.africa/products/cgls_lwq300_2002_2012 + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Explorer (Lake Water Quality 2016-2024 (raster 300 m), 10-daily – version 1)" + URL: https://explorer.digitalearth.africa/products/cgls_lwq300_2016_2024 + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Explorer (Lake Water Quality 2024 - present (raster 300 m), global, 10-daily – version 2)" + URL: https://explorer.digitalearth.africa/products/cgls_lwq300_2024_nrt + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa web services" + URL: https://ows.digitalearth.africa + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Map" + URL: https://maps.digitalearth.africa/ + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Sandbox" + URL: https://sandbox.digitalearth.africa/ + AuthorName: Digital Earth Africa Contributors + - Title: "Digital Earth Africa Notebook Repo" + URL: https://github.com/digitalearthafrica/deafrica-sandbox-notebooks + AuthorName: Digital
Earth Africa Contributors + - Title: "Digital Earth Africa Geoportal" + URL: https://www.africageoportal.com/pages/digital-earth-africa + AuthorName: Digital Earth Africa Contributors From 60da5ece1ce9c834312d16887f7a468169037ff5 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 3 Jul 2025 10:11:57 -0800 Subject: [PATCH 082/751] adding asdi collab --- datasets/dep-ls-geomads.yaml | 4 ++++ datasets/vt-opendata.yaml | 4 ++++ 2 files changed, 8 insertions(+) diff --git a/datasets/dep-ls-geomads.yaml b/datasets/dep-ls-geomads.yaml index a7e9424f0..1f04bc0e1 100644 --- a/datasets/dep-ls-geomads.yaml +++ b/datasets/dep-ls-geomads.yaml @@ -8,6 +8,10 @@ Documentation: https://digitalearthpacific.org/#/applications Contact: dep@spc.int ManagedBy: "[Pacific Community (SPC)](https://www.spc.int/)" UpdateFrequency: Annually +Collabs: + ASDI: + Tags: + - satellite imagery Tags: - earth observation - geoscience diff --git a/datasets/vt-opendata.yaml b/datasets/vt-opendata.yaml index 2c06c0664..d9991d4e9 100644 --- a/datasets/vt-opendata.yaml +++ b/datasets/vt-opendata.yaml @@ -4,6 +4,10 @@ Documentation: https://vcgi.vermont.gov/data-and-programs/ Contact: If you have specific questions please contact - vcgi@vermont.gov ManagedBy: "[Vermont Center for Geographic Information](https://vcgi.vermont.gov)" UpdateFrequency: Vermont acquires statewide imagery approximately once every other year. Lidar is acquired approximately once every 5-8 years. High-resolution landcover is generated once every other year. 
+Collabs: + ASDI: + Tags: + - satellite imagery Tags: - earth observation - aerial imagery From cafe129a87c065c118ca922c73e57ffa0488cda8 Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Thu, 3 Jul 2025 16:57:34 -0400 Subject: [PATCH 083/751] update tags --- datasets/graf-reforecast.yaml | 4 ++-- tags.yaml | 7 +++++++ 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index cde11dbaf..a99a7f17c 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -17,10 +17,10 @@ Tags: - wind speeds - cloud amount - visibility - - zarr - - weather - ERA5 - MPAS + - zarr + - weather License: "[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" Resources: - Description: GRAF Reforecast dataset diff --git a/tags.yaml b/tags.yaml index 2e4f2b763..c96076861 100644 --- a/tags.yaml +++ b/tags.yaml @@ -452,3 +452,10 @@ - x-ray tomography - xml - zarr +- precipitation amount +- precipitation type +- wind speeds +- cloud amount +- visibility +- ERA5 +- MPAS From a4a5e1a0ff9fd9851b647b25d9ac911f5bc39573 Mon Sep 17 00:00:00 2001 From: Dries Verachtert Date: Fri, 4 Jul 2025 15:16:05 +0200 Subject: [PATCH 084/751] Update the contact info and 'managed by' fields for the Blue Brain Open Data --- datasets/bluebrain_opendata.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/bluebrain_opendata.yaml b/datasets/bluebrain_opendata.yaml index 4d69c3cdb..d65c7f3a0 100644 --- a/datasets/bluebrain_opendata.yaml +++ b/datasets/bluebrain_opendata.yaml @@ -1,8 +1,8 @@ Name: Blue Brain Open Data Description: The Blue Brain Open Data represents an extensive neuroscience dataset encompassing a diverse range of data types, including experimental, model, and simulation data, along with images and videos depicting reconstructed neurons and brain regions. 
Documentation: https://github.com/BlueBrain/OpenData -Contact: rodrigo.perin@epfl.ch -ManagedBy: "BBP/EPFL" +Contact: info@openbraininstitute.org +ManagedBy: "Open Brain Institute" UpdateFrequency: No updates Tags: - neuroscience From 2c76347003937076b911c6e4306b0de410267568 Mon Sep 17 00:00:00 2001 From: Zoheyr Doctor Date: Mon, 7 Jul 2025 11:57:36 -0500 Subject: [PATCH 085/751] Add Resources to mbers-open-data.yaml --- datasets/mbers-open-data.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/datasets/mbers-open-data.yaml b/datasets/mbers-open-data.yaml index 6d98c4c7c..558541e52 100644 --- a/datasets/mbers-open-data.yaml +++ b/datasets/mbers-open-data.yaml @@ -14,6 +14,11 @@ Tags: - environmental License: All data are free and provided without license restrictions. Citation: "Marginal Build Emissions Rates (MBERs) for Electricity. Climate TRACE. [DATE]. URL: https://www.gem.wiki/MBERs" +Resources: + - Description: Marginal Build Emissions Rates (MBERs) for Electricity CSV Data + ARN: arn:aws:s3:::mbers-open-data + Region: us-west-2 + Type: S3 Bucket DataAtWork: Tutorials: - Title: MBER Orientation and Tutorial From 02e983d9e319757147751a52615f7d2c6eb67889 Mon Sep 17 00:00:00 2001 From: Beatrice BM <126121645+beatrice-b-m@users.noreply.github.com> Date: Mon, 7 Jul 2025 13:33:12 -0400 Subject: [PATCH 086/751] Update EMBED controlled access link and tutorial notebook --- datasets/emory-breast-imaging-dataset-embed.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/emory-breast-imaging-dataset-embed.yaml b/datasets/emory-breast-imaging-dataset-embed.yaml index 157a8656b..a65cbc78b 100644 --- a/datasets/emory-breast-imaging-dataset-embed.yaml +++ b/datasets/emory-breast-imaging-dataset-embed.yaml @@ -27,11 +27,11 @@ Resources: ARN: arn:aws:s3:::embed-dataset-open Region: us-west-2 Type: S3 Bucket - ControlledAccess: https://forms.gle/HwGMM6vdv3w32TKF9 + ControlledAccess: https://forms.gle/6YVFKTz7ucEJKEWw8 DataAtWork: 
Tutorials: - - Title: Sample Notebook - URL: https://github.com/Emory-HITI/EMBED_Open_Data/blob/main/Sample_Notebook.ipynb + - Title: Screening Label Assignment Example + URL: https://github.com/Emory-HITI/EMBED_Open_Data/blob/43e76483284a87b07d33982fc673082b5e2d41c9/resources/notebooks/screening_label_assignment.ipynb AuthorName: Emory-HITI Publications: - Title: "The EMory BrEast imaging Dataset (EMBED): A Racially Diverse, Granular Dataset of 3.4M Screening and Diagnostic Mammograms" From 0b140b022ebc68ce60de1ba77f19b668737977ff Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Mon, 7 Jul 2025 15:07:20 -0400 Subject: [PATCH 087/751] Update graf-reforecast.yaml --- datasets/graf-reforecast.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index a99a7f17c..94a6b7b76 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -21,7 +21,7 @@ Tags: - MPAS - zarr - weather -License: "[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" +License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" Resources: - Description: GRAF Reforecast dataset ARN: arn:aws:s3:::twc-graf-reforecast From a680e8a33281d7351cf41d8cab303be438134040 Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Mon, 7 Jul 2025 15:11:27 -0400 Subject: [PATCH 088/751] updated tags --- datasets/graf-reforecast.yaml | 3 +-- tags.yaml | 2 -- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index 94a6b7b76..0ecd12a8d 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -12,8 +12,7 @@ Tags: - model - near-surface air temperature - near-surface relative humidity - - precipitation amount - - precipitation type + - precipitation - wind speeds - cloud amount - visibility diff --git a/tags.yaml b/tags.yaml index c96076861..5fa9b4bc0 100644 --- a/tags.yaml +++ b/tags.yaml @@ -452,8 
+452,6 @@ - x-ray tomography - xml - zarr -- precipitation amount -- precipitation type - wind speeds - cloud amount - visibility From 55772fb5ae3c8176c7c481c716ef95e83e1b7fbf Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Mon, 7 Jul 2025 15:35:59 -0400 Subject: [PATCH 089/751] Update graf-reforecast.yaml --- datasets/graf-reforecast.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index 0ecd12a8d..56a3f0852 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -26,6 +26,8 @@ Resources: ARN: arn:aws:s3:::twc-graf-reforecast Region: us-west-2 Type: S3 Bucket + Explore: + - '[Browse Bucket](https://s3-us-west-2.amazonaws.com/twc-graf-reforecast/index.html)' DataAtWork: Publications: - Title: Global reforecasts from MPAS “GRAF” with mesh refinement over the US and Europe From 9c7ac207b830c56487bcc4a38e118fc109dd7dfb Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Mon, 7 Jul 2025 15:58:27 -0400 Subject: [PATCH 090/751] updated collabs --- datasets/graf-reforecast.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index 56a3f0852..3b6c99d0b 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -4,6 +4,10 @@ Documentation: "[Documentation](https://docs.google.com/forms/d/e/1FAIpQLSejRyG2 Contact: graf.reforecast@weather.com ManagedBy: "[The Weather Company](https://www.weathercompany.com/)" UpdateFrequency: One time push only +Collabs: + ASDI: + Tags: + - weather Tags: - atmosphere - forecast From 6b72da41000760e338568f60cd14d15a7fbbf806 Mon Sep 17 00:00:00 2001 From: Matthew Berkeley <42berkeley@cua.edu> Date: Tue, 8 Jul 2025 17:03:50 +0200 Subject: [PATCH 091/751] Draft commit of busco-data.yaml --- datasets/busco-data.yaml | 49 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) create mode 100644 datasets/busco-data.yaml diff --git 
a/datasets/busco-data.yaml b/datasets/busco-data.yaml new file mode 100644 index 000000000..90b7f21e5 --- /dev/null +++ b/datasets/busco-data.yaml @@ -0,0 +1,49 @@ +Name: BUSCO Datasets +Description: Lineage datasets for use with BUSCO software package. Each dataset contains HMM profiles for clade specific, universal, single-copy marker genes. Datasets are available across archaea, bacteria, eukaryota and virus domains. The repository also includes necessary data files for phylogenetic placement of an input assembly. +Documentation: https://busco.ezlab.org/busco_userguide.html#lineage-datasets +Contact: https://gitlab.com/ezlab/busco/-/issues +ManagedBy: Computational Evolutionary Genomics Group, University of Geneva +UpdateFrequency: New datasets are released to correspond with updates in OrthoDB versions. Maintenance updates occur a few times a year if necessary to fix any bugs or update metadata. +Tags: + - assembly + - bacteria + - bioinformatics + - genomic + - metagenomics + - open source software + - protein + - virus +License: The BUSCO datasets are licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 
+Any use of these datasets for analyses in a publication or product must include the citation of the corresponding paper: https://doi.org/10.1093/molbev/msab199 +Citation: +Resources: + - Description: + ARN: + Region: + Type: + Explore: +DataAtWork: + Tutorials: + - Title: BUSCO - from QC to gene prediction and phylogenomics + URL: https://www.youtube.com/watch?v=9SjVY3BT8JU + AuthorName: Matthew Berkeley + AuthorURL: + Services: + Tools & Applications: + - Title: + URL: + AuthorName: + AuthorURL: + Publications: + - Title: "OrthoDB and BUSCO update: annotation of orthologs with wider sampling of genomes" + URL: https://academic.oup.com/nar/article/53/D1/D516/7899526?login=true + AuthorName: Fredrik Tegenfeldt, Dmitry Kuznetsov, Mosè Manni, Matthew Berkeley, Evgeny M Zdobnov, Evgenia V Kriventseva + - Title: "BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes" + URL: https://academic.oup.com/mbe/article/38/10/4647/6329644?login=true + AuthorName: Mosè Manni, Matthew R Berkeley, Mathieu Seppey, Felipe A Simão, Evgeny M Zdobnov + - Title: "BUSCO: assessing genomic data quality and beyond" + URL: https://currentprotocols.onlinelibrary.wiley.com/doi/full/10.1002/cpz1.323 + AuthorName: Mosè Manni, Matthew R. Berkeley, Mathieu Seppey, Evgeny M.
Zdobnov +DeprecatedNotice: +ADXCategories: + - From 7efc2d1e4250a6484702edb97222aca85c376e8f Mon Sep 17 00:00:00 2001 From: Qaish Kanchwala Date: Tue, 8 Jul 2025 12:33:10 -0400 Subject: [PATCH 092/751] Update graf-reforecast.yaml --- datasets/graf-reforecast.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/graf-reforecast.yaml b/datasets/graf-reforecast.yaml index 3b6c99d0b..ca21ce183 100644 --- a/datasets/graf-reforecast.yaml +++ b/datasets/graf-reforecast.yaml @@ -1,5 +1,5 @@ Name: GRAF Reforecast -Description: "A zarr-formatted dataset of 1836 reforecast cases (~5 years) from The Weather Company GRAF (Global high-Resolution Atmospheric Forecasting) model, a version of the National Center for Atmospheric Research (NCAR) Model for Predictions Across Scales (MPAS). GRAF is global, but the configuration for this reforecast had a mesh refinement to ~4 km over the US, Caribbean Basin, and Europe, and 15 km elsewhere. This model was designed to run much of its computation on graphical processing units, with this development assisted by NVIDIA. The 1836 cases (~5 years) were generated from ECMWF reanalyses (ERA5) for initial condition dates spanning more than 20 years, 2004-2024. These dates of the chosen initial conditions mostly selected based on high-impact weather in the contiguous US (CONUS) and Caribbean. Sampling in this way spanned a wider range of interesting, high-impact weather scenarios than were there five contiguous years of data. GRAF reforecasts were mostly run to +27 h lead time, assuming a 3-h for spin up followed by a full diurnal cycle. Data were saved in zarr format on the native model vertical coordinate. Most fields were saved at 15-min intervals, though several precipitation variables were saved at 5-min cadence." +Description: "A zarr-formatted dataset of 1836 reforecast cases (approx. 
5 years) from The Weather Company GRAF (Global high-Resolution Atmospheric Forecasting) model, a version of the National Center for Atmospheric Research (NCAR) Model for Prediction Across Scales (MPAS). GRAF is global, but the configuration for this reforecast had a mesh refinement to approx. 4 km over the US, Caribbean Basin, and Europe, and 15 km elsewhere. This model was designed to run much of its computation on graphics processing units, with this development assisted by NVIDIA. The 1836 cases (approx. 5 years) were generated from ECMWF reanalyses (ERA5) for initial condition dates spanning more than 20 years, 2004-2024. The initial-condition dates were mostly selected based on high-impact weather in the contiguous US (CONUS) and Caribbean. Sampling in this way spanned a wider range of interesting, high-impact weather scenarios than five contiguous years of data would have. GRAF reforecasts were mostly run to +27 h lead time, assuming 3 h of spin-up followed by a full diurnal cycle. Data were saved in zarr format on the native model vertical coordinate. Most fields were saved at 15-min intervals, though several precipitation variables were saved at 5-min cadence."
Documentation: "[Documentation](https://docs.google.com/forms/d/e/1FAIpQLSejRyG2CXrfcmrX7g_iFhc3RF-n3ZzmPQdVieSDwTzLNkR-_w/viewform)" Contact: graf.reforecast@weather.com ManagedBy: "[The Weather Company](https://www.weathercompany.com/)" From 04e19588d0ffca2765a7aaa5c0df6c2272dfaf67 Mon Sep 17 00:00:00 2001 From: "Shruti [C] Bhanderi" Date: Wed, 9 Jul 2025 13:26:07 +0000 Subject: [PATCH 093/751] bulk tagging ASDI - 124 files --- bulk_update_test.ipynb | 1439 +++++++++++++++++ datasets/3kricegenome.yaml | 4 + datasets/ag-loam.yaml | 4 + datasets/amazon-last-mile-challenges.yaml | 4 + ...n_animal_acoustic_tracking_delayed_qc.yaml | 4 + ...td_satellite_relay_tagging_delayed_qc.yaml | 4 + ...el_sea_level_anomaly_gridded_realtime.yaml | 4 + datasets/aodn_mooring_ctd_delayed_qc.yaml | 4 + ..._mooring_hourly_timeseries_delayed_qc.yaml | 4 + ...lite_altimetry_calibration_validation.yaml | 4 + ...t_velocity_hourly_averaged_delayed_qc.yaml | 4 + ..._capricornbunkergroup_wave_delayed_qc.yaml | 4 + ..._capricornbunkergroup_wind_delayed_qc.yaml | 4 + ...r_velocity_hourly_averaged_delayed_qc.yaml | 4 + ...dn_radar_coffsharbour_wave_delayed_qc.yaml | 4 + ...t_velocity_hourly_averaged_delayed_qc.yaml | 4 + ...e_velocity_hourly_averaged_delayed_qc.yaml | 4 + ...f_velocity_hourly_averaged_delayed_qc.yaml | 4 + ...f_velocity_hourly_averaged_delayed_qc.yaml | 4 + ...n_radar_rottnestshelf_wave_delayed_qc.yaml | 4 + ...n_radar_rottnestshelf_wind_delayed_qc.yaml | 4 + ...s_velocity_hourly_averaged_delayed_qc.yaml | 4 + ...r_southaustraliagulfs_wave_delayed_qc.yaml | 4 + ...t_velocity_hourly_averaged_delayed_qc.yaml | 4 + ...tellite_chlorophylla_carder_1day_aqua.yaml | 4 + ..._satellite_chlorophylla_gsm_1day_aqua.yaml | 4 + ...atellite_chlorophylla_gsm_1day_noaa20.yaml | 4 + ..._satellite_chlorophylla_gsm_1day_snpp.yaml | 4 + ..._satellite_chlorophylla_oc3_1day_aqua.yaml | 4 + ...atellite_chlorophylla_oc3_1day_noaa20.yaml | 4 + ..._satellite_chlorophylla_oc3_1day_snpp.yaml | 4 + 
..._satellite_chlorophylla_oci_1day_aqua.yaml | 4 + ...atellite_chlorophylla_oci_1day_noaa20.yaml | 4 + ..._satellite_chlorophylla_oci_1day_snpp.yaml | 4 + ...fuse_attenuation_coefficent_1day_aqua.yaml | 4 + ...se_attenuation_coefficent_1day_noaa20.yaml | 4 + ...fuse_attenuation_coefficent_1day_snpp.yaml | 4 + ...e_nanoplankton_fraction_oc3_1day_aqua.yaml | 4 + ...et_primary_productivity_gsm_1day_aqua.yaml | 4 + ...et_primary_productivity_oc3_1day_aqua.yaml | 4 + ...atellite_optical_water_type_1day_aqua.yaml | 4 + ...e_picoplankton_fraction_oc3_1day_aqua.yaml | 4 + datasets/aodn_slocum_glider_delayed_qc.yaml | 4 + datasets/aodn_vessel_co2_delayed_qc.yaml | 4 + .../aodn_vessel_fishsoop_realtime_qc.yaml | 4 + datasets/aodn_vessel_xbt_delayed_qc.yaml | 4 + datasets/aodn_vessel_xbt_realtime_nonqc.yaml | 4 + datasets/aodn_wave_buoy_realtime_nonqc.yaml | 4 + datasets/argoverse.yaml | 4 + datasets/asf-event-data.yaml | 4 + datasets/asset-data-igp-coal-plant.yaml | 4 + datasets/aurora_msds.yaml | 4 + datasets/bhl-open-data.yaml | 4 + datasets/black_marble_combustion.yaml | 68 +- datasets/blended-tropomi-gosat-methane.yaml | 4 + datasets/blue_et.yaml | 70 +- datasets/boreas.yaml | 4 + datasets/caladapt-wildfire-dataset.yaml | 4 + datasets/catalyst-cooperative-pudl.yaml | 4 + datasets/ccic.yaml | 4 + datasets/cesm-hr.yaml | 4 + datasets/citrus-farm.yaml | 4 + datasets/colorado-imagery.yaml | 74 +- datasets/cropland_partitioining.yaml | 94 +- datasets/cwa_opendata.yaml | 40 +- datasets/dep-coastlines.yaml | 4 + datasets/dep-mangroves.yaml | 4 + datasets/dep-s1-annual-mosaics.yaml | 4 + datasets/dep-s2-geomads.yaml | 4 + datasets/dmi-opendata.yaml | 4 + datasets/ecmwf-forecasts.yaml | 4 + datasets/epa-2022-modeling-platform.yaml | 196 +-- datasets/epa-edde-v1.yaml | 4 + datasets/epa-edde-v2.yaml | 4 + datasets/epa-equates-v1.yaml | 128 +- datasets/era5-for-wrf.yaml | 56 +- datasets/ford-multi-av-seasonal.yaml | 4 + datasets/geoglows-v2.yaml | 4 + datasets/glo-30-hand.yaml | 4 
+ datasets/global-drought-flood-catalogue.yaml | 4 + datasets/gmsdata.yaml | 4 + datasets/gnss-ro-opendata.yaml | 4 + datasets/green_et.yaml | 68 +- datasets/gulfwide-avian-monitoring.yaml | 90 +- datasets/hycom-gofs-3pt1-reanalysis.yaml | 4 + datasets/in-elevation.yaml | 80 +- datasets/in-imagery.yaml | 84 +- datasets/intelinair_corn_kernel_counting.yaml | 4 + ...nair_longitudinal_nutrient_deficiency.yaml | 4 + datasets/its-live-data.yaml | 4 + datasets/kyfromabove.yaml | 4 + datasets/ladi.yaml | 4 + datasets/mapping-africa.yaml | 4 + datasets/nifs-lhd.yaml | 4 + datasets/noaa-historicalcharts.yaml | 4 + datasets/noaa-nesdis-tcprimed-pds.yaml | 4 + datasets/noaa-nws-wam-ipe.yaml | 4 + datasets/noaa-space-weather.yaml | 66 +- datasets/nyc-tlc-trip-records-pds.yaml | 4 + datasets/nz-elevation.yaml | 4 + datasets/nz-imagery.yaml | 4 + datasets/obis.yaml | 4 + datasets/oceanomics.yaml | 4 + datasets/open-meteo.yaml | 4 + datasets/openaerialmap.yaml | 4 + datasets/openfoodfacts-images.yaml | 4 + datasets/os-climate-physrisk.yaml | 4 + .../palsar-2-scansar-flooding-in-rwanda.yaml | 4 + datasets/proj-datum-grids.yaml | 4 + datasets/racecar-dataset.yaml | 4 + datasets/real-changesets.yaml | 4 + datasets/satellogic-earthview.yaml | 4 + datasets/seefar.yaml | 4 + datasets/sofar-spotter-archive.yaml | 4 + datasets/speedtest-global-performance.yaml | 4 + datasets/ssl4eo-multi-product-data.yaml | 4 + datasets/stdpopsim_kern.yaml | 58 +- datasets/surface-pm2-5-v6gl02.yaml | 4 + datasets/targetepigenomics.yaml | 4 + datasets/usgs_aqr.yaml | 78 +- datasets/venus-l2a-cogs.yaml | 4 + datasets/wbg-cckp.yaml | 4 + datasets/whiffle-wins50.yaml | 4 + datasets/wis2-global-cache.yaml | 4 + datasets/wise-allsky.yaml | 4 + 125 files changed, 2530 insertions(+), 595 deletions(-) create mode 100644 bulk_update_test.ipynb diff --git a/bulk_update_test.ipynb b/bulk_update_test.ipynb new file mode 100644 index 000000000..2e0b5f09e --- /dev/null +++ b/bulk_update_test.ipynb @@ -0,0 +1,1439 @@ +{ 
+ "cells": [ + { + "cell_type": "code", + "id": "initial_id", + "metadata": { + "collapsed": true, + "ExecuteTime": { + "end_time": "2025-07-08T21:40:22.659792Z", + "start_time": "2025-07-08T21:40:22.465395Z" + } + }, + "source": [ + "\n", + "from pathlib import Path\n", + "# from github import Github\n", + "\n", + "import os\n", + "import pandas as pd\n" + ], + "outputs": [], + "execution_count": 1 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-07-08T21:40:27.284330Z", + "start_time": "2025-07-08T21:40:27.281147Z" + } + }, + "cell_type": "code", + "source": [ + "def clean_name(name):\n", + " \"\"\"Clean name by removing parentheses and normalizing\"\"\"\n", + " if not name:\n", + " return \"\"\n", + "\n", + " # Remove parentheses and their contents\n", + " cleaned = name.split('(')[0]\n", + "\n", + " # Remove special characters\n", + " cleaned = ''.join(char for char in cleaned if char.isalnum() or char.isspace())\n", + "\n", + " # Normalize whitespace and convert to lowercase\n", + " return ' '.join(cleaned.split()).lower().strip()" + ], + "id": "e0678a327eded893", + "outputs": [], + "execution_count": 2 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-07-09T05:16:07.395317Z", + "start_time": "2025-07-09T05:16:07.376411Z" + } + }, + "cell_type": "code", + "source": [ + "# #deprecated version\n", + "# from ruamel.yaml import YAML\n", + "# from ruamel.yaml.comments import CommentedMap\n", + "#\n", + "#\n", + "# class OrderedYAML(YAML):\n", + "# def __init__(self):\n", + "# super().__init__()\n", + "# self.preserve_quotes = True\n", + "# self.indent(sequence=2)\n", + "# self.width = 4096\n", + "# self.default_flow_style = False\n", + "# self.map_format = 'rt'\n", + "#\n", + "#\n", + "# def update_yaml_with_collabs(yaml_file_path, dataset_name, category):\n", + "# try:\n", + "# yaml = OrderedYAML()\n", + "#\n", + "# # Read file and preserve formatting\n", + "# with open(yaml_file_path, 'r') as file:\n", + "# original_content = 
file.read()\n", + "# data = yaml.load(original_content)\n", + "#\n", + "# if data.get('Name') == dataset_name:\n", + "# print(f\"Match found for: {dataset_name} with yaml file {str(yaml_file_path)}\")\n", + "#\n", + "# # # Create backup\n", + "# # backup_path = str(yaml_file_path) + '.bak'\n", + "# # with open(backup_path, 'w') as backup:\n", + "# # backup.write(original_content)\n", + "# if 'DeprecatedNotice' in data and isinstance(data['DeprecatedNotice'], str) and data['DeprecatedNotice'].strip():\n", + "# return 'deprecated' # Special return value for deprecated datasets\n", + "# try:\n", + "# # Create new CommentedMap to preserve order\n", + "# new_data = CommentedMap()\n", + "# modified = False\n", + "#\n", + "#\n", + "# # Copy existing data in order\n", + "# for key in data:\n", + "# # new_data[key] = data[key]\n", + "# if key == 'Tags':\n", + "# # Debug prints\n", + "# print(\"Found Tags key\")\n", + "# print(\"Current data keys:\", data.keys())\n", + "# print(\"Checking if Collabs exists:\", 'Collabs' in data)\n", + "#\n", + "# new_data[key] = data[key]\n", + "# # Then add Collabs right after Tags if it doesn't exist\n", + "# # Then add Collabs right after Tags\n", + "# if not any(k == 'Collabs' for k in data.keys()): # Alternative check\n", + "# print(\"Adding Collabs section\")\n", + "# new_data['Collabs'] = CommentedMap()\n", + "# new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", + "# modified = True\n", + "# else:\n", + "# new_data[key] = data[key]\n", + "#\n", + "# # If Collabs already exists, just update it\n", + "# if 'Collabs' in data:\n", + "# new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", + "# modified = True\n", + "#\n", + "# # If Collabs already exists, just update it\n", + "# # if 'Collabs' in new_data:\n", + "# # if 'ASDI' not in new_data['Collabs']:\n", + "# # new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", + "# # modified = True\n", + "# # elif new_data['Collabs']['ASDI'].get('Tags') != 
[category.strip()]:\n", + "# # new_data['Collabs']['ASDI']['Tags'] = [category.strip()]\n", + "# # modified = True\n", + "# # Write back preserving format\n", + "# # with open(yaml_file_path, 'w') as outfile:\n", + "# # yaml.dump(new_data, outfile)\n", + "#\n", + "# # Only write if changes were made\n", + "# # Only write if changes were made\n", + "# if modified:\n", + "# with open(yaml_file_path, '+w') as outfile:\n", + "# yaml.dump(new_data, outfile)\n", + "# print(f\"Successfully updated {yaml_file_path}\")\n", + "# return 'modified' # New return value to indicate modification\n", + "# else:\n", + "# print(f\"No changes needed for {yaml_file_path}\")\n", + "# return 'matched'\n", + "#\n", + "#\n", + "# # print(f\"Successfully updated {yaml_file_path}\")\n", + "# # return True\n", + "#\n", + "# except Exception as write_error:\n", + "# # Restore from backup\n", + "# with open(yaml_file_path, 'w') as file:\n", + "# file.write(original_content)\n", + "# print(f\"Error writing file, restored from backup: {write_error}\")\n", + "# raise\n", + "#\n", + "# # finally:\n", + "# # import os\n", + "# # if os.path.exists(backup_path):\n", + "# # os.remove(backup_path)\n", + "#\n", + "# # finally:\n", + "# # import os\n", + "# # if os.path.exists(backup_path):\n", + "# # os.remove(backup_path)\n", + "#\n", + "# except Exception as e:\n", + "# print(f\"Error processing {yaml_file_path}: {e}\")\n", + "#\n", + "# return False\n", + "#\n", + "#\n", + "# def update_yaml_with_collabs_cleaned(yaml_file_path, dataset_name, category):\n", + "# \"\"\"Update YAML file using cleaned name matching\"\"\"\n", + "# try:\n", + "# yaml = OrderedYAML()\n", + "#\n", + "# with open(yaml_file_path, 'r') as file:\n", + "# original_content = file.read()\n", + "# data = yaml.load(original_content)\n", + "#\n", + "# yaml_name = data.get('Name', '')\n", + "#\n", + "# # Clean both names for comparison\n", + "# clean_yaml_name = clean_name(yaml_name)\n", + "# clean_dataset_name = 
clean_name(dataset_name)\n", + "#\n", + "# if clean_yaml_name == clean_dataset_name:\n", + "# if 'DeprecatedNotice' in data and isinstance(data['DeprecatedNotice'], str) and data['DeprecatedNotice'].strip():\n", + "# return 'deprecated'\n", + "#\n", + "# print(f\"\\nCleaned name match found!\")\n", + "# print(f\"Original Excel name: {dataset_name}\")\n", + "# print(f\"Original YAML name: {yaml_name}\")\n", + "# print(f\"Cleaned names: {clean_yaml_name}\")\n", + "#\n", + "# # # Create backup\n", + "# # backup_path = str(yaml_file_path) + '.bak'\n", + "# # with open(backup_path, 'w') as backup:\n", + "# # backup.write(original_content)\n", + "#\n", + "# try:\n", + "# # Create new CommentedMap to preserve order\n", + "# new_data = CommentedMap()\n", + "# modified = False\n", + "#\n", + "# # Copy existing data in order\n", + "# for key in data:\n", + "# # print(key)\n", + "# # new_data[key] = data[key]\n", + "# if key == 'Tags':\n", + "# # First add the Tags\n", + "# new_data[key] = data[key]\n", + "# # print(key)\n", + "# # print(data)\n", + "# # Then add Collabs right after Tags if it doesn't exist\n", + "# if 'Collabs' not in data:\n", + "# print(key)\n", + "# new_data['Collabs'] = CommentedMap()\n", + "# new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", + "# modified = True\n", + "# else:\n", + "# new_data[key] = data[key]\n", + "#\n", + "#\n", + "# # If Collabs already exists, just update it\n", + "# if 'Collabs' in data:\n", + "# new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", + "# modified = True\n", + "#\n", + "# # If Collabs already exists, just update it\n", + "# # if 'Collabs' in new_data:\n", + "# # if 'ASDI' not in new_data['Collabs']:\n", + "# # new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", + "# # modified = True\n", + "# # elif new_data['Collabs']['ASDI'].get('Tags') != [category.strip()]:\n", + "# # new_data['Collabs']['ASDI']['Tags'] = [category.strip()]\n", + "# # modified = True\n", + "#\n", + "# # Write 
back preserving format\n", + "# # with open(yaml_file_path, 'w') as outfile:\n", + "# # yaml.dump(new_data, outfile)\n", + "#\n", + "# # Only write if changes were made\n", + "# # Only write if changes were made\n", + "# if modified:\n", + "# with open(yaml_file_path, 'w') as outfile:\n", + "# yaml.dump(new_data, outfile)\n", + "# print(f\"Successfully updated {yaml_file_path}\")\n", + "# return 'modified' # New return value to indicate modification\n", + "# else:\n", + "# print(f\"No changes needed for {yaml_file_path}\")\n", + "# return 'matched'\n", + "# return True\n", + "#\n", + "#\n", + "# # print(f\"Successfully updated {yaml_file_path}\")\n", + "# # return True\n", + "#\n", + "# except Exception as write_error:\n", + "# # Restore from backup\n", + "# with open(yaml_file_path, 'w') as file:\n", + "# file.write(original_content)\n", + "# print(f\"Error writing file, restored from backup: {write_error}\")\n", + "# raise\n", + "#\n", + "# # finally:\n", + "# # import os\n", + "# # if os.path.exists(backup_path):\n", + "# # os.remove(backup_path)\n", + "#\n", + "# except Exception as e:\n", + "# print(f\"Error processing {yaml_file_path}: {e}\")\n", + "#\n", + "# return False\n", + "#\n", + "#\n", + "# def process_yaml_files(dataset_folder, excel_file):\n", + "# \"\"\"Process all YAML files and match with Excel data\"\"\"\n", + "# total_yaml_files = 0\n", + "# processed_files = 0\n", + "# matches_found = 0\n", + "# matched_dataset_names = []\n", + "# failed_files = []\n", + "# git_files = [] # Track YAML files to be updated\n", + "# deprecated_matched = [] # Track deprecated but matched datasets\n", + "# # Read Excel file\n", + "# try:\n", + "# df = pd.read_excel(excel_file)\n", + "# # Ensure the required columns exist\n", + "# if 'Dataset' not in df.columns or 'Category' not in df.columns:\n", + "# raise ValueError(\"Excel file must contain 'Name' and 'Category' columns\")\n", + "#\n", + "# # Create dictionary of dataset names and categories\n", + "# 
dataset_dict = dict(zip(df['Dataset'], df['Category']))\n", + "# # print(dataset_dict)\n", + "#\n", + "# yaml_folder_path = Path(dataset_folder)\n", + "# if not yaml_folder_path.exists() or not yaml_folder_path.is_dir():\n", + "# raise ValueError(f\"Invalid dataset folder path: {dataset_folder}\")\n", + "#\n", + "# # Count total YAML files\n", + "# total_yaml_files = len([f for f in yaml_folder_path.glob('*.yaml')])\n", + "# print(\"\\n=== First Pass: Exact Matching ===\")\n", + "# # First pass - exact matching\n", + "# for yaml_file in yaml_folder_path.glob('*.yaml'):\n", + "# processed_files += 1\n", + "# # print(f\"\\nProcessing file {processed_files}/{total_yaml_files}: {yaml_file}\")\n", + "#\n", + "# try:\n", + "# for dataset_name, category in dataset_dict.items():\n", + "# match_result = update_yaml_with_collabs(yaml_file, dataset_name, category)\n", + "# if match_result == 'deprecated':\n", + "# # Dataset matched but is deprecated\n", + "# deprecated_matched.append({\n", + "# 'dataset': dataset_name,\n", + "# 'file': yaml_file.name\n", + "# })\n", + "# matched_dataset_names.append(dataset_name) # Count as matched but not updated\n", + "# print(f\"Dataset {dataset_name} matched but marked as deprecated - skipping update\")\n", + "# break\n", + "# elif match_result == 'modified': # New condition\n", + "# matches_found += 1\n", + "# matched_dataset_names.append(dataset_name)\n", + "# git_files.append(yaml_file.name)\n", + "# break\n", + "# elif match_result == 'matched': # New condition\n", + "# matches_found += 1\n", + "# matched_dataset_names.append(dataset_name)\n", + "# break\n", + "# except Exception as e:\n", + "# print(f\"Error processing YAML file {yaml_file}: {e}\")\n", + "# failed_files.append(str(yaml_file))\n", + "# # Create dictionary of unmatched datasets\n", + "# unmatched_dict = {k: v for k, v in dataset_dict.items()\n", + "# if k not in matched_dataset_names}\n", + "#\n", + "# print(f\"\\n=== First Pass Complete ===\")\n", + "# 
print(f\"Matches found: {matches_found}\")\n", + "# print(f\"Deprecated matches (not updated): {len(deprecated_matched)}\")\n", + "# print(f\"Unmatched datasets: {len(unmatched_dict)}\")\n", + "#\n", + "# if unmatched_dict:\n", + "# print(\"\\n=== Second Pass: Cleaned Name Matching ===\")\n", + "# second_pass_matches = 0\n", + "#\n", + "# # Reset file counter for second pass\n", + "# processed_files = 0\n", + "#\n", + "# # Second pass with cleaned names\n", + "# for yaml_file in yaml_folder_path.glob('*.yaml'):\n", + "# processed_files += 1\n", + "# # print(f\"\\nSecond pass processing {processed_files}/{total_yaml_files}: {yaml_file}\")\n", + "#\n", + "# try:\n", + "# for dataset_name, category in unmatched_dict.items():\n", + "# match_result = update_yaml_with_collabs_cleaned(yaml_file, dataset_name, category)\n", + "# if match_result == 'deprecated':\n", + "# deprecated_matched.append({\n", + "# 'dataset': dataset_name,\n", + "# 'file': yaml_file.name\n", + "# })\n", + "# matched_dataset_names.append(dataset_name)\n", + "# print(f\"Dataset {dataset_name} matched but marked as deprecated - skipping update\")\n", + "# break\n", + "# elif match_result == 'modified': # New condition\n", + "# second_pass_matches += 1\n", + "# matched_dataset_names.append(dataset_name)\n", + "# git_files.append(yaml_file.name)\n", + "# break\n", + "# elif match_result == 'matched': # New condition\n", + "# second_pass_matches += 1\n", + "# matched_dataset_names.append(dataset_name)\n", + "# break\n", + "#\n", + "# except Exception as e:\n", + "# print(f\"Error in second pass processing {yaml_file}: {e}\")\n", + "# if str(yaml_file) not in failed_files:\n", + "# failed_files.append(str(yaml_file))\n", + "#\n", + "# print(f\"\\n=== Second Pass Complete ===\")\n", + "# print(f\"Additional matches found: {second_pass_matches}\")\n", + "# matches_found += second_pass_matches\n", + "#\n", + "# # Final unmatched datasets\n", + "# final_unmatched = 
df[~df['Dataset'].isin(matched_dataset_names)]\n", + "#\n", + "# # Save unmatched datasets to Excel\n", + "# output_path = 'unmatched_datasets.xlsx'\n", + "# final_unmatched.to_excel(output_path, index=False)\n", + "#\n", + "# # Print final summary\n", + "# print(\"\\n=== Final Processing Summary ===\")\n", + "# print(f\"Total YAML files: {total_yaml_files}\")\n", + "# print(f\"Total matches found: {matches_found}\")\n", + "# print(f\"Failed files: {len(failed_files)}\")\n", + "# print(f\"Final unmatched datasets: {len(final_unmatched)}\")\n", + "# print(f\"Deprecated matches (not updated): {len(deprecated_matched)}\")\n", + "# print(f\"Unmatched datasets saved to: {output_path}\")\n", + "# print(f\"\\nTotal Git files to be updated: {len(git_files)}\")\n", + "# print(\"\\nDeprecated matched datasets:\")\n", + "# for dep in deprecated_matched:\n", + "# print(f\"Dataset: {dep['dataset']} - File: {dep['file']}\")\n", + "# # print(\"Git files:\")\n", + "# # for file in git_files:\n", + "# # print(file)\n", + "#\n", + "# return {\n", + "# 'total_files': total_yaml_files,\n", + "# 'total_matches': matches_found,\n", + "# 'failed_files': failed_files,\n", + "# 'matched_datasets': matched_dataset_names,\n", + "# 'unmatched_count': len(final_unmatched),\n", + "# 'deprecated_matched': deprecated_matched,\n", + "# 'git_files': git_files\n", + "# }\n", + "#\n", + "# except Exception as e:\n", + "# print(f\"Error in main processing: {e}\")\n", + "# return None # Final unmatched datasets\n", + "# final_unmatched = df[~df['Dataset'].isin(matched_dataset_names)]\n", + "#\n", + "# # Save unmatched datasets to Excel\n", + "# output_path = 'unmatched_datasets.xlsx'\n", + "# final_unmatched.to_excel(output_path, index=False)\n", + "#\n", + "# # Print final summary\n", + "# print(\"\\n=== Final Processing Summary ===\")\n", + "# print(f\"Total YAML files: {total_yaml_files}\")\n", + "# print(f\"Total matches found: {matches_found}\")\n", + "# print(f\"Failed files: 
{len(failed_files)}\")\n", + "# print(f\"Final unmatched datasets: {len(final_unmatched)}\")\n", + "# print(f\"Unmatched datasets saved to: {output_path}\")\n", + "#\n", + "# return {\n", + "# 'total_files': total_yaml_files,\n", + "# 'total_matches': matches_found,\n", + "# 'failed_files': failed_files,\n", + "# 'matched_datasets': matched_dataset_names,\n", + "# 'unmatched_count': len(final_unmatched),\n", + "# 'deprecated_matched': deprecated_matched,\n", + "# 'git_files': git_files\n", + "# }\n", + "#\n", + "# except Exception as e:\n", + "# print(f\"Error in main processing: {e}\")\n", + "# return None\n" + ], + "id": "b4897b0f1e3c3300", + "outputs": [], + "execution_count": 70 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-07-09T13:05:25.172175Z", + "start_time": "2025-07-09T13:05:25.160778Z" + } + }, + "cell_type": "code", + "source": [ + "#deprecated version\n", + "from ruamel.yaml import YAML\n", + "from ruamel.yaml.comments import CommentedMap\n", + "\n", + "class OrderedYAML(YAML):\n", + " def __init__(self):\n", + " super().__init__()\n", + " self.preserve_quotes = True\n", + " # self.indent(sequence=2)\n", + " # self.width = 4096\n", + " # self.default_flow_style = False\n", + " # self.map_format = 'rt'\n", + "def update_yaml_with_collabs(yaml_file_path, dataset_name, category):\n", + " \"\"\"Update YAML file using both exact and cleaned name matching\"\"\"\n", + " try:\n", + " with open(yaml_file_path, 'r') as file:\n", + " lines = file.readlines() # Read all lines while preserving format\n", + "\n", + " # Load YAML for name comparison\n", + " yaml = OrderedYAML()\n", + " data = yaml.load(''.join(lines))\n", + "\n", + " yaml_name = data.get('Name', '')\n", + "\n", + " # Try exact match first, then cleaned match\n", + " if yaml_name == dataset_name or clean_name(yaml_name) == clean_name(dataset_name):\n", + " if 'DeprecatedNotice' in data and isinstance(data['DeprecatedNotice'], str) and data['DeprecatedNotice'].strip():\n", + " return 
'deprecated'\n", + "\n", + " print(f\"\\nMatch found!\")\n", + " print(f\"Excel name: {dataset_name}\")\n", + " print(f\"YAML name: {yaml_name}\")\n", + "\n", + " try:\n", + " updated_lines = []\n", + " modified = False\n", + "\n", + " # Check if Collabs already exists\n", + " has_collabs = 'Collabs:' in ''.join(lines)\n", + "\n", + " # Process lines\n", + " i = 0\n", + " while i < len(lines):\n", + " line = lines[i]\n", + "\n", + " # If we find the top-level Tags key and Collabs doesn't exist, add Collabs before Tags\n", + " if line.startswith('Tags:') and not has_collabs:\n", + " # Add Collabs section first\n", + " updated_lines.append('Collabs:\\n')\n", + " updated_lines.append(' ASDI:\\n')\n", + " updated_lines.append(' Tags:\\n')\n", + " updated_lines.append(f' - {category.strip()}\\n')\n", + " modified = True\n", + "\n", + " # Then add Tags section\n", + " updated_lines.append(line)\n", + " else:\n", + " updated_lines.append(line)\n", + " i += 1\n", + "\n", + " # Only write if changes were made\n", + " if modified:\n", + " with open(yaml_file_path, 'w') as outfile:\n", + " outfile.writelines(updated_lines)\n", + " print(f\"Successfully updated {yaml_file_path}\")\n", + " return 'modified'\n", + " else:\n", + " print(f\"No changes needed for {yaml_file_path}\")\n", + " return 'matched'\n", + "\n", + " except Exception as write_error:\n", + " print(f\"Error writing file: {write_error}\")\n", + " raise\n", + "\n", + " except Exception as e:\n", + " print(f\"Error processing {yaml_file_path}: {e}\")\n", + "\n", + " return False\n", + "\n", + "\n", + "def process_yaml_files(dataset_folder, excel_file):\n", + " \"\"\"Process all YAML files and match with Excel data\"\"\"\n", + " total_yaml_files = 0\n", + " processed_files = 0\n", + " matches_found = 0\n", + " matched_dataset_names = []\n", + " failed_files = []\n", + " git_files = [] # Track YAML files to be updated\n", + " deprecated_matched = [] # Track deprecated but matched datasets\n", + " # Read Excel file\n", + " 
try:\n", + " df = pd.read_excel(excel_file)\n", + " # Ensure the required columns exist\n", + " if 'Dataset' not in df.columns or 'Category' not in df.columns:\n", + " raise ValueError(\"Excel file must contain 'Dataset' and 'Category' columns\")\n", + "\n", + " # Create dictionary of dataset names and categories\n", + " dataset_dict = dict(zip(df['Dataset'], df['Category']))\n", + " # print(dataset_dict)\n", + "\n", + " yaml_folder_path = Path(dataset_folder)\n", + " if not yaml_folder_path.exists() or not yaml_folder_path.is_dir():\n", + " raise ValueError(f\"Invalid dataset folder path: {dataset_folder}\")\n", + "\n", + " # Count total YAML files\n", + " total_yaml_files = len([f for f in yaml_folder_path.glob('*.yaml')])\n", + " print(\"\\n=== Matching starts ===\")\n", + " for yaml_file in yaml_folder_path.glob('*.yaml'):\n", + " processed_files += 1\n", + "\n", + " try:\n", + " for dataset_name, category in dataset_dict.items():\n", + " match_result = update_yaml_with_collabs(yaml_file, dataset_name, category)\n", + " if match_result == 'deprecated':\n", + " deprecated_matched.append({\n", + " 'dataset': dataset_name,\n", + " 'file': yaml_file.name\n", + " })\n", + " matched_dataset_names.append(dataset_name)\n", + " print(f\"Dataset {dataset_name} matched but marked as deprecated - skipping update\")\n", + " break\n", + " elif match_result == 'modified':\n", + " matches_found += 1\n", + " matched_dataset_names.append(dataset_name)\n", + " git_files.append(yaml_file.name)\n", + " break\n", + " elif match_result == 'matched':\n", + " matches_found += 1\n", + " matched_dataset_names.append(dataset_name)\n", + " break\n", + "\n", + "\n", + "\n", + " except Exception as e:\n", + " print(f\"Error processing {yaml_file}: {e}\")\n", + " failed_files.append(str(yaml_file))\n", + "\n", + "\n", + " # Final unmatched datasets\n", + " final_unmatched = df[~df['Dataset'].isin(matched_dataset_names)]\n", + "\n", + " # Save unmatched datasets to Excel\n", + " output_path = 
'unmatched_datasets.xlsx'\n", + " final_unmatched.to_excel(output_path, index=False)\n", + "\n", + " # Print final summary\n", + " print(\"\\n=== Final Processing Summary ===\")\n", + " print(f\"Total YAML files: {total_yaml_files}\")\n", + " print(f\"Total matches found: {matches_found}\")\n", + " print(f\"Failed files: {len(failed_files)}\")\n", + " print(f\"Final unmatched datasets: {len(final_unmatched)}\")\n", + " print(f\"Deprecated matches (not updated): {len(deprecated_matched)}\")\n", + " # print(f\"Unmatched datasets saved to: {output_path}\")\n", + " print(f\"\\nTotal Git files to be updated: {len(git_files)}\")\n", + " print(\"\\nDeprecated matched datasets:\")\n", + " for dep in deprecated_matched:\n", + " print(f\"Dataset: {dep['dataset']} - File: {dep['file']}\")\n", + " return {\n", + " 'total_files': total_yaml_files,\n", + " 'total_matches': matches_found,\n", + " 'failed_files': failed_files,\n", + " 'matched_datasets': matched_dataset_names,\n", + " 'unmatched_count': len(final_unmatched),\n", + " 'deprecated_matched': deprecated_matched,\n", + " 'git_files': git_files\n", + " }\n", + "\n", + " except Exception as e:\n", + " print(f\"Error in main processing: {e}\")\n", + " return None\n", + "\n" + ], + "id": "605ffb7dc2eb7cc", + "outputs": [], + "execution_count": 88 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-07-09T13:23:18.905105Z", + "start_time": "2025-07-09T13:12:32.821944Z" + } + }, + "cell_type": "code", + "source": [ + " # dataset_folder = \"open-data-registry/datasets/\" # Change this to your YAML files folder path\n", + " # /local/home/bshrutiw/open-data-registry-fork\n", + "dataset_folder= \"/local/home/bshrutiw/open-data-registry-fork/datasets\"\n", + "# folder_path = \"open-data-registry/datasets/\"\n", + "excel_file = \"ASDI_adds.xlsx\" # Change this to your Excel file path\n", + "\n", + "results=process_yaml_files(dataset_folder, excel_file)\n", + "if results:\n", + " print(\"\\nProcessing completed 
successfully!\")\n", + " print(f\"Total matches: {results['total_matches']}\")\n", + " print(f\"Total Git files to update: {len(results['git_files'])}\")\n" + ], + "id": "3a159612bff75e15", + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "=== Matching starts ===\n", + "\n", + "Match found!\n", + "Excel name: Corn Kernel Counting Dataset\n", + "YAML name: Corn Kernel Counting Dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/intelinair_corn_kernel_counting.yaml\n", + "\n", + "Match found!\n", + "Excel name: NOAA Historical Maps and Charts\n", + "YAML name: NOAA Historical Maps and Charts\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/noaa-historicalcharts.yaml\n", + "\n", + "Match found!\n", + "Excel name: Community Multiscale Air Quality (CMAQ) 2019 3D Gridded and Column Data\n", + "YAML name: Community Multiscale Air Quality (CMAQ) 2019 3D Gridded and Column data from the EPA's Air Quality Time Series (EQUATES) Project\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/epa-equates-v1.yaml\n", + "\n", + "Match found!\n", + "Excel name: Biodiversity Heritage Library Metadata and Page Images\n", + "YAML name: Biodiversity Heritage Library Metadata and Page Images\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/bhl-open-data.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration\n", + "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (GSM model)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml\n", + "\n", + "Match found!\n", + "Excel name: SSL4EO S12 Landsat Multi Product Dataset\n", + "YAML name: SSL4EO S12 Landsat Multi Product Dataset\n", + "Successfully updated 
/local/home/bshrutiw/open-data-registry-fork/datasets/ssl4eo-multi-product-data.yaml\n", + "\n", + "Match found!\n", + "Excel name: ECMWF real-time forecasts\n", + "YAML name: ECMWF real-time forecasts\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ecmwf-forecasts.yaml\n", + "\n", + "Match found!\n", + "Excel name: Animal Tracking - Acoustic Telemetry - Quality controlled detections\n", + "YAML name: Animal Tracking - Acoustic Telemetry - Quality controlled detections\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Nanoplankton fraction\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Nanoplankton fraction (OC3 model and Brewin et al 2012 algorithm)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: IWMI DIWASA Green ET for Africa\n", + "YAML name: IWMI DIWASA Green ET for Africa\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/green_et.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Optical Water Type (Moore)\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Optical Water Type (Moore et al 2009 algorithm)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_optical_water_type_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Capricorn bunker group site - Wind - Delayed mode\n", + "YAML name: Ocean Radar - Capricorn bunker group site - Wind - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: PROJ datum grids\n", + "YAML 
name: PROJ datum grids\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/proj-datum-grids.yaml\n", + "\n", + "Match found!\n", + "Excel name: Whiffle WINS50 Open Data on AWS\n", + "YAML name: Whiffle WINS50 Open Data on AWS\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/whiffle-wins50.yaml\n", + "\n", + "Match found!\n", + "Excel name: Earth Radio Occultation\n", + "YAML name: Earth Radio Occultation\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/gnss-ro-opendata.yaml\n", + "\n", + "Match found!\n", + "Excel name: Indiana Statewide Elevation Catalog\n", + "YAML name: Indiana Statewide Elevation Catalog\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/in-elevation.yaml\n", + "\n", + "Match found!\n", + "Excel name: Longitudinal Nutrient Deficiency\n", + "YAML name: Longitudinal Nutrient Deficiency\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/intelinair_longitudinal_nutrient_deficiency.yaml\n", + "\n", + "Match found!\n", + "Excel name: SeeFar V0\n", + "YAML name: SeeFar V0\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/seefar.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Northwest shelf site - Sea Water velocity - Delayed mode\n", + "YAML name: Ocean Radar - Northwest shelf site - Sea water velocity - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml\n", + "Dataset Earth Observation Data Cubes for Brazil matched but marked as deprecated - skipping update\n", + "\n", + "Match found!\n", + "Excel name: Marine Animal - Satellite Relay Tagging - Quality controlled profiles\n", + "YAML name: Marine Animal - Satellite Relay Tagging - Quality controlled profiles\n", + "Successfully updated 
/local/home/bshrutiw/open-data-registry-fork/datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: World Bank Climate Change Knowledge Portal (CCKP)\n", + "YAML name: World Bank Climate Change Knowledge Portal (CCKP)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/wbg-cckp.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Turquoise Coast Site - Sea Water Velocity - Delayed Mode\n", + "YAML name: Ocean Radar - Turquoise coast site - Sea water velocity - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Aurora Multi-Sensor Dataset\n", + "YAML name: Aurora Multi-Sensor Dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aurora_msds.yaml\n", + "\n", + "Match found!\n", + "Excel name: Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States\n", + "YAML name: Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/usgs_aqr.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellogic EarthView dataset\n", + "YAML name: Satellogic EarthView dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/satellogic-earthview.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Rottnest shelf site - Wind - Delayed mode\n", + "YAML name: Ocean Radar - Rottnest shelf site - Wind - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Digital Earth Pacific Mangroves Extent and Density\n", + "YAML name: Digital Earth Pacific Mangroves Extent and Density\n", + "Successfully updated 
/local/home/bshrutiw/open-data-registry-fork/datasets/dep-mangroves.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Coffs Harbour site - Wave - Delayed mode\n", + "YAML name: Ocean Radar - Coffs Harbour site - Wave - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: VENUS L2A Cloud-Optimized GeoTIFFs\n", + "YAML name: VENUS L2A Cloud-Optimized GeoTIFFs\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/venus-l2a-cogs.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a Concentration\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a concentration (Carder model)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: OceanOmics\n", + "YAML name: OceanOmics\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/oceanomics.yaml\n", + "\n", + "Match found!\n", + "Excel name: ASF SAR Data Products for Disaster Events\n", + "YAML name: ASF SAR Data Products for Disaster Events\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/asf-event-data.yaml\n", + "\n", + "Match found!\n", + "Excel name: The Genome Modeling System\n", + "YAML name: The Genome Modeling System\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/gmsdata.yaml\n", + "\n", + "Match found!\n", + "Excel name: 3000 Rice Genomes Project\n", + "YAML name: 3000 Rice Genomes Project\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/3kricegenome.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity\n", + "YAML name: Satellite - Ocean Colour - MODIS 
- 1 day - Net Primary Productivity (OC3 model and Eppley-VGPM algorithm)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: Central Weather Administration OpenData\n", + "YAML name: Central Weather Administration OpenData\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/cwa_opendata.yaml\n", + "\n", + "Match found!\n", + "Excel name: Toxicant Exposures and Responses by Genomic and Epigenomic Regulators of Transcription (TaRGET)\n", + "YAML name: Toxicant Exposures and Responses by Genomic and Epigenomic Regulators of Transcription (TaRGET)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/targetepigenomics.yaml\n", + "\n", + "Match found!\n", + "Excel name: Wildfire Projections to Support Climate Resilience\n", + "YAML name: Wildfire Projections to Support Climate Resilience\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/caladapt-wildfire-dataset.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ships of Opportunity - Biogeochemical sensors - Delayed mode\n", + "YAML name: Ships of Opportunity - Biogeochemical sensors - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_co2_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: 2021 Amazon Last Mile Routing Research Challenge Dataset\n", + "YAML name: 2021 Amazon Last Mile Routing Research Challenge Dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/amazon-last-mile-challenges.yaml\n", + "\n", + "Match found!\n", + "Excel name: Vermont Open Geospatial on AWS\n", + "YAML name: Vermont Open Geospatial on AWS\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/vt-opendata.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - 
MODIS - 1 day - Chlorophyll-a Concentration\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a concentration (GSM model)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: New Zealand Elevation\n", + "YAML name: New Zealand Elevation\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/nz-elevation.yaml\n", + "\n", + "Match found!\n", + "Excel name: IWMI DIWASA Blue ET for Africa\n", + "YAML name: IWMI DIWASA Blue ET for Africa\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/blue_et.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration\n", + "YAML name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration (GSM model)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (GSM model and Eppley-VGPM algorithm)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: Speedtest by Ookla Global Fixed and Mobile Network Performance Maps\n", + "YAML name: Speedtest by Ookla Global Fixed and Mobile Network Performance Maps\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/speedtest-global-performance.yaml\n", + "\n", + "Match found!\n", + "Excel name: stdpopsim Species Resources\n", + "YAML name: stdpopsim species resources\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/stdpopsim_kern.yaml\n", + "\n", + "Match 
found!\n", + "Excel name: Argoverse\n", + "YAML name: Argoverse\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/argoverse.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ships of Opportunity - Fisheries vessels - Real time\n", + "YAML name: Ships of Opportunity - Fisheries vessels - Real time\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_fishsoop_realtime_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Sentinel-1 Mean and Median Annual Mosaic\n", + "YAML name: Sentinel-1 Mean and Median Annual Mosaic\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/dep-s1-annual-mosaics.yaml\n", + "\n", + "Match found!\n", + "Excel name: Nighttime-Fire-Flare\n", + "YAML name: Nighttime-Fire-Flare\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/black_marble_combustion.yaml\n", + "\n", + "Match found!\n", + "Excel name: Gulfwide Avian Colony Monitoring Survey Photos\n", + "YAML name: Gulfwide Avian Colony Monitoring Survey Photos\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/gulfwide-avian-monitoring.yaml\n", + "\n", + "Match found!\n", + "Excel name: NIFS Large Helical Device (LHD) Experiment\n", + "YAML name: NIFS Large Helical Device (LHD) Experiment\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/nifs-lhd.yaml\n", + "\n", + "Match found!\n", + "Excel name: OAQPS 2022 Modeling Platform\n", + "YAML name: OAQPS 2022 Modeling Platform \n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/epa-2022-modeling-platform.yaml\n", + "\n", + "Match found!\n", + "Excel name: IWMI DIWASA Rainfed and Irrigated Cropland Map for Africa\n", + "YAML name: IWMI DIWASA Rainfed and Irrigated Cropland Map for Africa\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/cropland_partitioining.yaml\n", + "\n", + "Match 
found!\n", + "Excel name: Ocean Radar - Bonney coast site - Sea Water velocity - Delayed mode\n", + "YAML name: Ocean Radar - Bonney coast site - Sea water velocity - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Moorings - Hourly time-series product\n", + "YAML name: Moorings - Hourly time-series product\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: CitrusFarm Dataset\n", + "YAML name: CitrusFarm Dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/citrus-farm.yaml\n", + "\n", + "Match found!\n", + "Excel name: OceanCurrent - Gridded sea level anomaly - Near real time\n", + "YAML name: OceanCurrent - Gridded sea level anomaly - Near real time\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml\n", + "\n", + "Match found!\n", + "Excel name: real-changesets\n", + "YAML name: real-changesets\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/real-changesets.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - SNPP - 1 day - Diffuse attenuation coefficient\n", + "YAML name: Satellite - Ocean Colour - SNPP - 1 day - Diffuse attenuation coefficient (k490)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml\n", + "\n", + "Match found!\n", + "Excel name: Pacific Coastlines Change\n", + "YAML name: Pacific Coastlines Change\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/dep-coastlines.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Newcastle site - Sea Water velocity - 
Delayed mode\n", + "YAML name: Ocean Radar - Newcastle site - Sea water velocity - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: WIS2 Global Cache on AWS\n", + "YAML name: WIS2 Global Cache on AWS\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/wis2-global-cache.yaml\n", + "\n", + "Match found!\n", + "Excel name: HYbrid Coordinate Ocean Model Global Ocean Forecast System Reanalysis\n", + "YAML name: HYbrid Coordinate Ocean Model Global Ocean Forecast System Reanalysis\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/hycom-gofs-3pt1-reanalysis.yaml\n", + "\n", + "Match found!\n", + "Excel name: SatPM2.5\n", + "YAML name: SatPM2.5\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/surface-pm2-5-v6gl02.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration\n", + "YAML name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration (OC3 model)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration\n", + "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OCI model)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml\n", + "\n", + "Match found!\n", + "Excel name: National Mooring Network - CTD profiles\n", + "YAML name: National Mooring Network - CTD profiles\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_mooring_ctd_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Wave buoys observations 
- Real time\n", + "YAML name: Wave buoys observations - Real time\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_wave_buoy_realtime_nonqc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Coral coast site - Sea Water velocity - Delayed mode\n", + "YAML name: Ocean Radar - Coral coast site - Sea water velocity - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: AG-LOAM Dataset\n", + "YAML name: AG-LOAM Dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ag-loam.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration\n", + "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OC3 model)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml\n", + "\n", + "Match found!\n", + "Excel name: Tropical Cyclone Precipitation, Infrared, Microwave, and Environmental Dataset (TC PRIMED)\n", + "YAML name: Tropical Cyclone Precipitation, Infrared, Microwave, and Environmental Dataset (TC PRIMED)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/noaa-nesdis-tcprimed-pds.yaml\n", + "\n", + "Match found!\n", + "Excel name: OpenAerialMap on AWS\n", + "YAML name: OpenAerialMap on AWS\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/openaerialmap.yaml\n", + "\n", + "Match found!\n", + "Excel name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", + "YAML name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/epa-edde-v2.yaml\n", + "\n", + "Match found!\n", + "Excel name: Public Utility Data Liberation Project\n", + 
"YAML name: Public Utility Data Liberation Project\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/catalyst-cooperative-pudl.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ships of Opportunity - Expendable bathythermographs - Real time\n", + "YAML name: Ships of Opportunity - Expendable bathythermographs - Real time\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_xbt_realtime_nonqc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - South Australian gulfs site - Sea Water velocity - Delayed mode\n", + "YAML name: Ocean Radar - South Australian gulfs site - Sea water velocity - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Landsat Geometric Median and Absolute Deviations (GeoMAD) over the Pacific\n", + "YAML name: Landsat Geometric Median and Absolute Deviations (GeoMAD) over the Pacific.\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/dep-ls-geomads.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Rottnest shelf site - Wave - Delayed mode\n", + "YAML name: Ocean Radar - Rottnest shelf site - Wave - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Sentinel-2 Geometric Median and Absolute Deviations (GeoMAD)\n", + "YAML name: Sentinel-2 Geometric Median and Absolute Deviations (GeoMAD) over the Pacific\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/dep-s2-geomads.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Diffuse attenuation coefficient\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Diffuse attenuation coefficient 
(k490)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: High resolution, annual cropland and landcover maps for selected African countries\n", + "YAML name: High resolution, annual cropland and landcover maps for selected African countries\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/mapping-africa.yaml\n", + "\n", + "Match found!\n", + "Excel name: NOAA Whole Atmosphere Model-Ionosphere Plasmasphere Electrodynamics (WAM-IPE)\n", + "YAML name: NOAA Whole Atmosphere Model-Ionosphere Plasmasphere Electrodynamics (WAM-IPE) Forecast System (WFS)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/noaa-nws-wam-ipe.yaml\n", + "\n", + "Match found!\n", + "Excel name: Open Food Facts Images\n", + "YAML name: Open Food Facts Images\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/openfoodfacts-images.yaml\n", + "\n", + "Match found!\n", + "Excel name: State of Colorado Imagery\n", + "YAML name: State of Colorado Imagery\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/colorado-imagery.yaml\n", + "\n", + "Match found!\n", + "Excel name: Boreas Autonomous Driving Dataset\n", + "YAML name: Boreas Autonomous Driving Dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/boreas.yaml\n", + "\n", + "Match found!\n", + "Excel name: NOAA Space Weather Forecast and Observation Data\n", + "YAML name: NOAA Space Weather Forecast and Observation Data\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/noaa-space-weather.yaml\n", + "\n", + "Match found!\n", + "Excel name: Chalmers Cloud Ice Climatology (CCIC)\n", + "YAML name: Chalmers Cloud Ice Climatology\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ccic.yaml\n", + 
"\n", + "Match found!\n", + "Excel name: CESM-HR\n", + "YAML name: CESM-HR\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/cesm-hr.yaml\n", + "\n", + "Match found!\n", + "Excel name: New Zealand Imagery\n", + "YAML name: New Zealand Imagery\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/nz-imagery.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Altimetry calibration and validation\n", + "YAML name: Satellite - Altimetry calibration and validation\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_mooring_satellite_altimetry_calibration_validation.yaml\n", + "\n", + "Match found!\n", + "Excel name: IGP Coal Plant\n", + "YAML name: IGP Coal Plant\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/asset-data-igp-coal-plant.yaml\n", + "\n", + "Match found!\n", + "Excel name: ERA5-for-WRF Open Data on AWS\n", + "YAML name: ERA5-for-WRF Open Data on AWS\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/era5-for-wrf.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Coffs Harbour site - Sea Water velocity - Delayed mode\n", + "YAML name: Ocean Radar - Coffs Harbour site - Sea water velocity - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: All-Sky Data | Wide-field Infrared Survey Explorer (WISE)\n", + "YAML name: All-Sky Data | Wide-field Infrared Survey Explorer (WISE)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/wise-allsky.yaml\n", + "\n", + "Match found!\n", + "Excel name: Global 30m Height Above Nearest Drainage (HAND)\n", + "YAML name: Global 30m Height Above Nearest Drainage (HAND)\n", + "Successfully updated 
/local/home/bshrutiw/open-data-registry-fork/datasets/glo-30-hand.yaml\n", + "\n", + "Match found!\n", + "Excel name: Blended TROPOMI+GOSAT Satellite Data Product for Atmospheric Methane\n", + "YAML name: Blended TROPOMI+GOSAT Satellite Data Product for Atmospheric Methane\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/blended-tropomi-gosat-methane.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ships of Opportunity - Expendable bathythermographs - Delayed mode\n", + "YAML name: Ships of Opportunity - Expendable bathythermographs - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_xbt_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: OS-Climate Physrisk\n", + "YAML name: OS-Climate Physrisk\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/os-climate-physrisk.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Rottnest shelf site - Sea Water velocity - Delayed mode\n", + "YAML name: Ocean Radar - Rottnest shelf site - Sea water velocity - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: GEOGLOWS Hydrological Model Version 2\n", + "YAML name: GEOGLOWS Hydrological Model Version 2\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/geoglows-v2.yaml\n", + "\n", + "Match found!\n", + "Excel name: Sofar Spotter Archive\n", + "YAML name: Sofar Spotter Archive\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/sofar-spotter-archive.yaml\n", + "\n", + "Match found!\n", + "Excel name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", + "YAML name: EPA Dynamically Downscaled Ensemble (EDDE) Version 1\n", + "Successfully updated 
/local/home/bshrutiw/open-data-registry-fork/datasets/epa-edde-v1.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration\n", + "YAML name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration (OCI model)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a Concentration\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a concentration (OC3 model)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Picoplankton fraction\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Picoplankton fraction (OC3 model and Brewin et al 2012 algorithm)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: RACECAR Dataset\n", + "YAML name: RACECAR Dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/racecar-dataset.yaml\n", + "\n", + "Match found!\n", + "Excel name: PALSAR-2 ScanSAR Flooding in Rwanda (L2.1)\n", + "YAML name: PALSAR-2 ScanSAR Flooding in Rwanda (L2.1)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/palsar-2-scansar-flooding-in-rwanda.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Gliders - Delayed mode\n", + "YAML name: Ocean Gliders - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_slocum_glider_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Diffuse attenuation coefficient\n", + "YAML name: 
Satellite - Ocean Colour - NOAA20 - 1 day - Diffuse attenuation coefficient (k490)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a Concentration\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a concentration (OCI model)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: Inter-mission Time Series of Land Ice Velocity and Elevation (ITS_LIVE)\n", + "YAML name: Inter-mission Time Series of Land Ice Velocity and Elevation (ITS_LIVE)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/its-live-data.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - South Australian gulfs site - Wave - Delayed mode\n", + "YAML name: Ocean Radar - South Australian gulfs site - Wave - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Indiana Statewide Digital Aerial Imagery Catalog\n", + "YAML name: Indiana Statewide Digital Aerial Imagery Catalog\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/in-imagery.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ford Multi-AV Seasonal Dataset\n", + "YAML name: Ford Multi-AV Seasonal Dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ford-multi-av-seasonal.yaml\n", + "\n", + "Match found!\n", + "Excel name: Open-Meteo Weather API Database\n", + "YAML name: Open-Meteo Weather API Database\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/open-meteo.yaml\n", + "\n", + "Match found!\n", + "Excel name: Low Altitude 
Disaster Imagery (LADI) Dataset\n", + "YAML name: Low Altitude Disaster Imagery (LADI) Dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ladi.yaml\n", + "\n", + "Match found!\n", + "Excel name: A Global Drought and Flood Catalogue from 1950 to 2016\n", + "YAML name: A Global Drought and Flood Catalogue from 1950 to 2016\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/global-drought-flood-catalogue.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Biodiversity Information System (OBIS) species occurrence data\n", + "YAML name: Ocean Biodiversity Information System (OBIS) species occurrence data\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/obis.yaml\n", + "\n", + "Match found!\n", + "Excel name: Danish Meteorological Institute (DMI) Open Data Forecasts\n", + "YAML name: Danish Meteorological Institute (DMI) Open Data Forecasts\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/dmi-opendata.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Capricorn bunker group site - Wave - Delayed mode\n", + "YAML name: Ocean Radar - Capricorn bunker group site - Wave - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: KyFromAbove on AWS\n", + "YAML name: KyFromAbove on AWS\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/kyfromabove.yaml\n", + "\n", + "Match found!\n", + "Excel name: New York City Taxi and Limousine Commission (TLC) Trip Record Data\n", + "YAML name: New York City Taxi and Limousine Commission (TLC) Trip Record Data\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/nyc-tlc-trip-records-pds.yaml\n", + "\n", + "=== Final Processing Summary ===\n", + "Total YAML files: 748\n", + "Total matches 
found: 126\n", + "Failed files: 0\n", + "Final unmatched datasets: 33\n", + "Deprecated matches (not updated): 1\n", + "\n", + "Total Git files to be updated: 124\n", + "\n", + "Deprecated matched datasets:\n", + "Dataset: Earth Observation Data Cubes for Brazil - File: brazil-data-cubes.yaml\n", + "\n", + "Processing completed successfully!\n", + "Total matches across both passes: 126\n", + "Total Git files to update: 124\n" + ] + } + ], + "execution_count": 90 + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-07-09T12:42:55.269730Z", + "start_time": "2025-07-09T12:42:55.248312Z" + } + }, + "cell_type": "code", + "source": [ + "\n", + "\n" + ], + "id": "33969aa18854426", + "outputs": [ + { + "ename": "NameError", + "evalue": "name 'matched_dataset_names' is not defined", + "output_type": "error", + "traceback": [ + "\u001B[0;31m---------------------------------------------------------------------------\u001B[0m", + "\u001B[0;31mNameError\u001B[0m Traceback (most recent call last)", + "Cell \u001B[0;32mIn[80], line 5\u001B[0m\n\u001B[1;32m 2\u001B[0m excel_file \u001B[38;5;241m=\u001B[39m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mASDI_adds.xlsx\u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 3\u001B[0m df \u001B[38;5;241m=\u001B[39m pd\u001B[38;5;241m.\u001B[39mread_excel(excel_file)\n\u001B[0;32m----> 5\u001B[0m final_unmatched \u001B[38;5;241m=\u001B[39m df[\u001B[38;5;241m~\u001B[39mdf[\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mDataset\u001B[39m\u001B[38;5;124m'\u001B[39m]\u001B[38;5;241m.\u001B[39misin(\u001B[43mmatched_dataset_names\u001B[49m)]\n", + "\u001B[0;31mNameError\u001B[0m: name 'matched_dataset_names' is not defined" + ] + } + ], + "execution_count": 80 + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": "\n", + "id": "32c9fe7ccfeaa4d4" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": "", + "id": "afe1d28ac7b91d15" + }, + 
{ + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "print(\"Git files to update:\")\n", + "for file in results['git_files']:\n", + " print(file)" + ], + "id": "456c408a11b90883" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "# git add all the files\n", + "import os\n", + "\n", + "# Get the current working directory\n", + "print(os.getcwd())\n", + "\n", + "# Change the working directory\n", + "os.chdir(os.path.join(os.getcwd(), \"open-data-registry\"))\n", + "\n", + "# Verify the change\n", + "print(os.getcwd())" + ], + "id": "c060a91e00e23967" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "# Stage all of our changes so they are tracked by git\n", + "for file in results['git_files']:\n", + "\tos.system(\"git add {}\".format(file))\n", + "\tprint(\"git add {}\".format(file))\n", + "\n", + "\n", + "# Commit the staged changes to the current branch\n", + "os.system('git commit -m \\\"bulk add ASDI tags\\\"')\n", + "\n", + "print(\" \")\n", + "print(\"Done. 
Run the following command to push your changes:\")\n", + "##gh pr create --bulk_tag_ASDI \"adding ASDI tags in bulk\" --draft" + ], + "id": "cfc61ec2a240441c" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/datasets/3kricegenome.yaml b/datasets/3kricegenome.yaml index b953a5a05..45ac198cd 100644 --- a/datasets/3kricegenome.yaml +++ b/datasets/3kricegenome.yaml @@ -4,6 +4,10 @@ Documentation: https://github.com/awslabs/open-data-docs/tree/main/docs/3kricege Contact: http://iric.irri.org/contact-us ManagedBy: '[International Rice Research Institute](https://www.irri.org/)' UpdateFrequency: Not updated +Collabs: + ASDI: + Tags: + - agriculture Tags: - agriculture - food security diff --git a/datasets/ag-loam.yaml b/datasets/ag-loam.yaml index e4c2e6533..6c03ec929 100644 --- a/datasets/ag-loam.yaml +++ b/datasets/ag-loam.yaml @@ -9,6 +9,10 @@ Documentation: https://github.com/UCR-Robotics/AG-LOAM Contact: Hanzhe Teng (hteng007@ucr.edu), Konstantinos Karydis (kkarydis@ece.ucr.edu) ManagedBy: "[Autonomous Robots and Control Systems Lab](https://sites.google.com/view/arcs-lab)" UpdateFrequency: NA +Collabs: + ASDI: + Tags: + - agriculture Tags: - aws-pds - robotics diff --git a/datasets/amazon-last-mile-challenges.yaml b/datasets/amazon-last-mile-challenges.yaml index 27353960d..6d513d0d2 100644 --- a/datasets/amazon-last-mile-challenges.yaml +++ b/datasets/amazon-last-mile-challenges.yaml @@ -7,6 +7,10 @@ Contact: lastmile-research-challenge@amazon.com ManagedBy: "[Amazon](https://www.amazon.com/)" UpdateFrequency: None +Collabs: + ASDI: + Tags: + - infrastructure Tags: - transportation - 
machine learning diff --git a/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml b/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml index 3cc75d42e..61b3b66eb 100644 --- a/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml +++ b/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml @@ -23,6 +23,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - biodiversity Tags: - oceans - marine mammals diff --git a/datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml b/datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml index 04be616b7..b5ed8aa87 100644 --- a/datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml +++ b/datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml @@ -23,6 +23,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - biodiversity Tags: - oceans - marine mammals diff --git a/datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml b/datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml index 62d34c093..b1adeda49 100644 --- a/datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml +++ b/datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml @@ -37,6 +37,10 @@ Resources: anomaly - Near real time Region: ap-southeast-2 Type: S3 Bucket +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean velocity diff --git a/datasets/aodn_mooring_ctd_delayed_qc.yaml b/datasets/aodn_mooring_ctd_delayed_qc.yaml index a6200feea..015281534 100644 --- a/datasets/aodn_mooring_ctd_delayed_qc.yaml +++ b/datasets/aodn_mooring_ctd_delayed_qc.yaml @@ -14,6 +14,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - 
oceans Tags: - oceans - chemistry diff --git a/datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml b/datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml index d32f52eea..add684a9d 100644 --- a/datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml +++ b/datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml @@ -22,6 +22,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - chemistry diff --git a/datasets/aodn_mooring_satellite_altimetry_calibration_validation.yaml b/datasets/aodn_mooring_satellite_altimetry_calibration_validation.yaml index 0720d3e5c..a1a800537 100644 --- a/datasets/aodn_mooring_satellite_altimetry_calibration_validation.yaml +++ b/datasets/aodn_mooring_satellite_altimetry_calibration_validation.yaml @@ -40,6 +40,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - climate Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml index dab12aea3..27ae1ab7b 100644 --- a/datasets/aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml @@ -18,6 +18,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml b/datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml index ce79a9139..3de654e18 100644 --- a/datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml +++ 
b/datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml @@ -29,6 +29,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml b/datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml index cc0e67e30..d39a8dd70 100644 --- a/datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml +++ b/datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml @@ -29,6 +29,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml index c625e26df..621ea7d4b 100644 --- a/datasets/aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml @@ -26,6 +26,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml b/datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml index 91ca65cde..f9e0fc997 100644 --- a/datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml +++ b/datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml @@ -26,6 +26,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git 
a/datasets/aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml index 25d372fbb..fd1233744 100644 --- a/datasets/aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml @@ -21,6 +21,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml index 79f62c3f6..d456f15be 100644 --- a/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml @@ -15,6 +15,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml index 611f7d002..628c1fa2a 100644 --- a/datasets/aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml @@ -14,6 +14,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml index dee0b34df..aa2f7704f 100644 --- 
a/datasets/aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml @@ -21,6 +21,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml b/datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml index 2c57e130b..2d299cc64 100644 --- a/datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml +++ b/datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml @@ -21,6 +21,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml b/datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml index 020ffc1cc..ee0208c5b 100644 --- a/datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml +++ b/datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml @@ -21,6 +21,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml index e086d7f2b..7464e80c1 100644 --- a/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml @@ -28,6 +28,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + 
ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml b/datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml index 9e493cb21..f566a4293 100644 --- a/datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml +++ b/datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml @@ -28,6 +28,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml index 7ddf2f30e..ce05e2b5e 100644 --- a/datasets/aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml @@ -34,6 +34,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml b/datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml index f3b91fa06..a6a08924d 100644 --- a/datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml +++ b/datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml @@ -16,6 +16,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml b/datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml index ffeb65205..2a8577a24 100644 --- a/datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml +++ 
b/datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml @@ -11,6 +11,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml b/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml index 04bbeb008..e0d63c8e0 100644 --- a/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml +++ b/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml @@ -11,6 +11,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml b/datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml index 8463f6a98..0f020ba71 100644 --- a/datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml +++ b/datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml @@ -11,6 +11,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml b/datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml index c1f0c76e8..b37517d8d 100644 --- a/datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml +++ b/datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml @@ -12,6 +12,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml 
b/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml index 15561fee7..e5dd4458c 100644 --- a/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml +++ b/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml @@ -12,6 +12,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml b/datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml index 21e308746..492e30862 100644 --- a/datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml +++ b/datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml @@ -12,6 +12,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml b/datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml index 1dfa47ee8..189c29431 100644 --- a/datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml +++ b/datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml @@ -12,6 +12,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml b/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml index 4e4e0d548..f3296c66f 100644 --- a/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml +++ b/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml @@ -12,6 +12,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + 
Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml b/datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml index cbd80b556..5b83a8ba9 100644 --- a/datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml +++ b/datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml @@ -12,6 +12,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml index bcab1b679..e2e7c5403 100644 --- a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml +++ b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml @@ -11,6 +11,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml index 4fb0a6a63..040cb6945 100644 --- a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml +++ b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml @@ -12,6 +12,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml index 5f010d89f..d6d08bfd4 100644 --- 
a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml +++ b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml @@ -11,6 +11,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml b/datasets/aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml index 834877ef9..a85c49572 100644 --- a/datasets/aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml +++ b/datasets/aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml @@ -20,6 +20,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: FILL UP MANUALLY - CHECK DOCUMENTATION ManagedBy: FILL UP MANUALLY - CHECK DOCUMENTATION UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml b/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml index 95de13c6a..ebee62c0a 100644 --- a/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml +++ b/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml @@ -28,6 +28,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml b/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml index 8a22541eb..8e921cd11 100644 --- a/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml +++ b/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml @@ -32,6 +32,10 @@ Documentation: 
https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_optical_water_type_1day_aqua.yaml b/datasets/aodn_satellite_optical_water_type_1day_aqua.yaml index 2a6dc519e..e5bb1f38c 100644 --- a/datasets/aodn_satellite_optical_water_type_1day_aqua.yaml +++ b/datasets/aodn_satellite_optical_water_type_1day_aqua.yaml @@ -16,6 +16,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml b/datasets/aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml index 5d3711950..c68005b3e 100644 --- a/datasets/aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml +++ b/datasets/aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml @@ -20,6 +20,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: FILL UP MANUALLY - CHECK DOCUMENTATION ManagedBy: FILL UP MANUALLY - CHECK DOCUMENTATION UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_slocum_glider_delayed_qc.yaml b/datasets/aodn_slocum_glider_delayed_qc.yaml index 8fb0a445f..cd5c2df11 100644 --- a/datasets/aodn_slocum_glider_delayed_qc.yaml +++ b/datasets/aodn_slocum_glider_delayed_qc.yaml @@ -31,6 +31,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - ocean currents diff --git a/datasets/aodn_vessel_co2_delayed_qc.yaml b/datasets/aodn_vessel_co2_delayed_qc.yaml index 6a803d98d..b8a2210cc 100644 --- 
a/datasets/aodn_vessel_co2_delayed_qc.yaml +++ b/datasets/aodn_vessel_co2_delayed_qc.yaml @@ -19,6 +19,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - climate Tags: - oceans - chemistry diff --git a/datasets/aodn_vessel_fishsoop_realtime_qc.yaml b/datasets/aodn_vessel_fishsoop_realtime_qc.yaml index b9d6dad63..06c0e26d0 100644 --- a/datasets/aodn_vessel_fishsoop_realtime_qc.yaml +++ b/datasets/aodn_vessel_fishsoop_realtime_qc.yaml @@ -47,6 +47,10 @@ Resources: of Opportunity Sub-Facility - Real-time data Region: ap-southeast-2 Type: S3 Bucket +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans UpdateFrequency: As Needed diff --git a/datasets/aodn_vessel_xbt_delayed_qc.yaml b/datasets/aodn_vessel_xbt_delayed_qc.yaml index 83978f57c..763ab5b29 100644 --- a/datasets/aodn_vessel_xbt_delayed_qc.yaml +++ b/datasets/aodn_vessel_xbt_delayed_qc.yaml @@ -12,6 +12,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans License: http://creativecommons.org/licenses/by/4.0/ diff --git a/datasets/aodn_vessel_xbt_realtime_nonqc.yaml b/datasets/aodn_vessel_xbt_realtime_nonqc.yaml index dfd4a4c7b..fb3423a31 100644 --- a/datasets/aodn_vessel_xbt_realtime_nonqc.yaml +++ b/datasets/aodn_vessel_xbt_realtime_nonqc.yaml @@ -16,6 +16,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans License: http://creativecommons.org/licenses/by/4.0/ diff --git a/datasets/aodn_wave_buoy_realtime_nonqc.yaml b/datasets/aodn_wave_buoy_realtime_nonqc.yaml index a9b779924..7a6af4069 100644 --- a/datasets/aodn_wave_buoy_realtime_nonqc.yaml +++ 
b/datasets/aodn_wave_buoy_realtime_nonqc.yaml @@ -31,6 +31,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans License: http://creativecommons.org/licenses/by/4.0/ diff --git a/datasets/argoverse.yaml b/datasets/argoverse.yaml index 0a5c3b25c..1d399e525 100644 --- a/datasets/argoverse.yaml +++ b/datasets/argoverse.yaml @@ -20,6 +20,10 @@ Documentation: https://argoverse.github.io/user-guide/ Contact: https://github.com/argoverse/av2-api/issues ManagedBy: "[Argoverse](https://argoverse.org)" UpdateFrequency: Infrequently +Collabs: + ASDI: + Tags: + - infrastructure Tags: - aws-pds - autonomous vehicles diff --git a/datasets/asf-event-data.yaml b/datasets/asf-event-data.yaml index 77af4863a..11c74da9c 100644 --- a/datasets/asf-event-data.yaml +++ b/datasets/asf-event-data.yaml @@ -10,6 +10,10 @@ Contact: https://asf.alaska.edu/asf/contact-us/ ManagedBy: "[The Alaska Satellite Facility (ASF)](https://asf.alaska.edu/)" UpdateFrequency: > Irregular, in response to disaster events +Collabs: + ASDI: + Tags: + - disaster Tags: - aws-pds - disaster response diff --git a/datasets/asset-data-igp-coal-plant.yaml b/datasets/asset-data-igp-coal-plant.yaml index 8da1197bc..86492ebca 100644 --- a/datasets/asset-data-igp-coal-plant.yaml +++ b/datasets/asset-data-igp-coal-plant.yaml @@ -5,6 +5,10 @@ Contact: https://github.com/APAD2024/APAD-Asset-Data/issues ManagedBy: APAD UpdateFrequency: as needed +Collabs: + ASDI: + Tags: + - energy Tags: - air quality - energy diff --git a/datasets/aurora_msds.yaml b/datasets/aurora_msds.yaml index 3c1756d67..8fd798bb9 100644 --- a/datasets/aurora_msds.yaml +++ b/datasets/aurora_msds.yaml @@ -12,6 +12,10 @@ Documentation: | Contact: ams-dataset@aurora.tech ManagedBy: Aurora Operations, Inc. UpdateFrequency: This dataset is complete. 
+Collabs: + ASDI: + Tags: + - infrastructure Tags: - aws-pds - autonomous vehicles diff --git a/datasets/bhl-open-data.yaml b/datasets/bhl-open-data.yaml index 42a2a33de..9840e4d82 100644 --- a/datasets/bhl-open-data.yaml +++ b/datasets/bhl-open-data.yaml @@ -4,6 +4,10 @@ Documentation: Documentation can be found at - any issues with the copy of the data on the AWS is reported in the future. Other experiments will be shared in the future according to the availability of resources. +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - climate diff --git a/datasets/citrus-farm.yaml b/datasets/citrus-farm.yaml index e0207b465..f380a2f7c 100644 --- a/datasets/citrus-farm.yaml +++ b/datasets/citrus-farm.yaml @@ -9,6 +9,10 @@ Documentation: https://ucr-robotics.github.io/Citrus-Farm-Dataset/ Contact: Hanzhe Teng (hteng007@ucr.edu), Konstantinos Karydis (kkarydis@ece.ucr.edu) ManagedBy: "[Autonomous Robots and Control Systems Lab](https://sites.google.com/view/arcs-lab)" UpdateFrequency: NA +Collabs: + ASDI: + Tags: + - agriculture Tags: - aws-pds - robotics diff --git a/datasets/colorado-imagery.yaml b/datasets/colorado-imagery.yaml index 1d3e29bba..a5c92527d 100644 --- a/datasets/colorado-imagery.yaml +++ b/datasets/colorado-imagery.yaml @@ -1,35 +1,39 @@ -Name: State of Colorado Imagery -Description: The State of Colorado has gathered public historical imagery ranging from 2005 to 2021. -Documentation: https://docs.google.com/document/d/1YDHignUj9lQTMw2J-SqA96MTP8KmJYtk2ZKKC2ZYuPE/edit?usp=sharing -Contact: oit_gis@state.co.us -ManagedBy: State of Colorado Governor's Office of Information Technology (OIT) GIS team -UpdateFrequency: Periodically -Tags: - - aws-pds - - aerial imagery - - geospatial - - imaging - - mapping -License: https://creativecommons.org/publicdomain/zero/1.0/legalcode -Resources: - - Description: The State of Colorado historic public aerial imagery. Currently, NAIP is available from 2005 and 2009-2021. 
The National Agriculture Imagery Program is a project managed by the U.S. Department of Agriculture created to collect leaf-on imagery for the United States during peak growing seasons. The files are available as GeoTIFFs. From 2005-2017 they have a one meter resolution. After that, it is a 60cm resolution. DRAPP (Denver Regional Aerial Photgraphy Project) is available from 2010-2020. It is availble in 3, 6, and 12in resolutions (except 2012). - ARN: arn:aws:s3:::colorado-public-imagery - Region: us-west-2 - Type: S3 Bucket -DataAtWork: - Tutorials: - - Title: Colorado AWS Open Imagery Guide - URL: https://docs.google.com/document/d/15GjCSWSzst82FZMqBqdGV0rt6FKJzt03NlQYdWwsLGE/edit?usp=sharing - AuthorName: State of Colorado OIT-GIS - AuthorURL: https://geodata.colorado.gov/ - Tools & Applications: - - Title: Colorado Public Imagery Dowloader - URL: https://gis.colorado.gov/imagery/ - AuthorName: State of Colorado OIT-GIS - AuthorURL: https://geodata.colorado.gov/ - - Title: Colorado Public Imagery s3 Browser - URL: https://colorado-public-imagery.s3.amazonaws.com/index.html - AuthorName: State of Colorado OIT-GIS - AuthorURL: https://geodata.colorado.gov/ -ADXCategories: - - Public Sector Data +Name: State of Colorado Imagery +Description: The State of Colorado has gathered public historical imagery ranging from 2005 to 2021. +Documentation: https://docs.google.com/document/d/1YDHignUj9lQTMw2J-SqA96MTP8KmJYtk2ZKKC2ZYuPE/edit?usp=sharing +Contact: oit_gis@state.co.us +ManagedBy: State of Colorado Governor's Office of Information Technology (OIT) GIS team +UpdateFrequency: Periodically +Collabs: + ASDI: + Tags: + - satellite imagery +Tags: + - aws-pds + - aerial imagery + - geospatial + - imaging + - mapping +License: https://creativecommons.org/publicdomain/zero/1.0/legalcode +Resources: + - Description: The State of Colorado historic public aerial imagery. Currently, NAIP is available from 2005 and 2009-2021. 
The National Agriculture Imagery Program is a project managed by the U.S. Department of Agriculture created to collect leaf-on imagery for the United States during peak growing seasons. The files are available as GeoTIFFs. From 2005-2017 they have a one meter resolution. After that, it is a 60cm resolution. DRAPP (Denver Regional Aerial Photgraphy Project) is available from 2010-2020. It is availble in 3, 6, and 12in resolutions (except 2012). + ARN: arn:aws:s3:::colorado-public-imagery + Region: us-west-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Colorado AWS Open Imagery Guide + URL: https://docs.google.com/document/d/15GjCSWSzst82FZMqBqdGV0rt6FKJzt03NlQYdWwsLGE/edit?usp=sharing + AuthorName: State of Colorado OIT-GIS + AuthorURL: https://geodata.colorado.gov/ + Tools & Applications: + - Title: Colorado Public Imagery Dowloader + URL: https://gis.colorado.gov/imagery/ + AuthorName: State of Colorado OIT-GIS + AuthorURL: https://geodata.colorado.gov/ + - Title: Colorado Public Imagery s3 Browser + URL: https://colorado-public-imagery.s3.amazonaws.com/index.html + AuthorName: State of Colorado OIT-GIS + AuthorURL: https://geodata.colorado.gov/ +ADXCategories: + - Public Sector Data diff --git a/datasets/cropland_partitioining.yaml b/datasets/cropland_partitioining.yaml index 6ce1ecba3..3683e9b9e 100644 --- a/datasets/cropland_partitioining.yaml +++ b/datasets/cropland_partitioining.yaml @@ -1,45 +1,49 @@ -Name: IWMI DIWASA Rainfed and Irrigated Cropland Map for Africa -Description: A framework integrating the Budyko model has been developed to distinguish between rainfed and irrigated cropland areas across Africa. This expands on remote sensing land cover products available for agricultural water studies in Africa and thereby helps address the need for deeper insights into cropland patterns. Validation against an independent dataset revealed an overall accuracy of 73% with high precision and specificity scores. 
These results validate the framework’s effectiveness in identifying irrigated areas while minimizing errors in misclassifying rainfed areas as irrigated. -Documentation: https://github.com/iwmiwaplus/ODR/tree/master/Partitioned%20Croplands -Contact: iwmiwaplus@gmail.com -ManagedBy: "[IWMI](https://www.iwmi.org/)" -UpdateFrequency: None -Tags: - - cropland partitioning - - irrigated cropland - - rainfed cropland - - agriculture - - land use - - land cover -License: There are no restrictions on the use of this data. -Resources: - - Description: high-confidence cropland map (HCCM) - ARN: arn:aws:s3:::iwmi-datasets/Cropland_partition/HCCM/ - Region: af-south-1 - Type: S3 Bucket - Explore: - - '[Browse Bucket](https://iwmi-datasets.s3.af-south-1.amazonaws.com/Cropland_partition/index.html)' - - Description: Cropland partitioning all data - ARN: arn:aws:s3:::iwmi-datasets/Cropland_partition/ - Region: af-south-1 - Type: S3 Bucket - Explore: - - '[Browse Bucket](https://iwmi-datasets.s3.af-south-1.amazonaws.com/Cropland_partition/index.html)' -DataAtWork: - Tutorials: - - Title: Cropland percentage - URL: https://github.com/iwmiwaplus/ODR/tree/master/Partitioned%20Croplands/Tutorials - AuthorName: iwmiwaplus - AuthorURL: https://github.com/iwmiwaplus - Tools & Applications: - - Title: Water use in Awash basin - URL: https://github.com/iwmiwaplus/ODR/blob/master/Partitioned%20Croplands/Applications/Awash_cropland%20partitioning.pdf - AuthorName: A. Owusu, K. Akpoti, M. Leh, N. Velpuri - AuthorURL: https://github.com/iwmiwaplus - Publications: - - Title: A framework for disaggregating remote-sensing cropland into rainfed and irrigated classes at continental scale - URL: https://doi.org/10.1016/J.JAG.2023.103607 - AuthorName: Owusu, A., Kagone, S., Leh, M., Velpuri, N. M., Gumma, M. K., Ghansah, B., Thilina-Prabhath, P., Akpoti, K., Mekonnen, K., Tinonetsana, P., & Mohammed, I. 
- - Title: Rainfed and Irrigated Cropland Areas for Africa - URL: https://doi.org/10.5066/P9N4R7SF - AuthorName: Owusu, A., Kagone, S., Leh, M., and Velpuri, N.M. +Name: IWMI DIWASA Rainfed and Irrigated Cropland Map for Africa +Description: A framework integrating the Budyko model has been developed to distinguish between rainfed and irrigated cropland areas across Africa. This expands on remote sensing land cover products available for agricultural water studies in Africa and thereby helps address the need for deeper insights into cropland patterns. Validation against an independent dataset revealed an overall accuracy of 73% with high precision and specificity scores. These results validate the framework’s effectiveness in identifying irrigated areas while minimizing errors in misclassifying rainfed areas as irrigated. +Documentation: https://github.com/iwmiwaplus/ODR/tree/master/Partitioned%20Croplands +Contact: iwmiwaplus@gmail.com +ManagedBy: "[IWMI](https://www.iwmi.org/)" +UpdateFrequency: None +Collabs: + ASDI: + Tags: + - agriculture +Tags: + - cropland partitioning + - irrigated cropland + - rainfed cropland + - agriculture + - land use + - land cover +License: There are no restrictions on the use of this data. 
+Resources: + - Description: high-confidence cropland map (HCCM) + ARN: arn:aws:s3:::iwmi-datasets/Cropland_partition/HCCM/ + Region: af-south-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://iwmi-datasets.s3.af-south-1.amazonaws.com/Cropland_partition/index.html)' + - Description: Cropland partitioning all data + ARN: arn:aws:s3:::iwmi-datasets/Cropland_partition/ + Region: af-south-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://iwmi-datasets.s3.af-south-1.amazonaws.com/Cropland_partition/index.html)' +DataAtWork: + Tutorials: + - Title: Cropland percentage + URL: https://github.com/iwmiwaplus/ODR/tree/master/Partitioned%20Croplands/Tutorials + AuthorName: iwmiwaplus + AuthorURL: https://github.com/iwmiwaplus + Tools & Applications: + - Title: Water use in Awash basin + URL: https://github.com/iwmiwaplus/ODR/blob/master/Partitioned%20Croplands/Applications/Awash_cropland%20partitioning.pdf + AuthorName: A. Owusu, K. Akpoti, M. Leh, N. Velpuri + AuthorURL: https://github.com/iwmiwaplus + Publications: + - Title: A framework for disaggregating remote-sensing cropland into rainfed and irrigated classes at continental scale + URL: https://doi.org/10.1016/J.JAG.2023.103607 + AuthorName: Owusu, A., Kagone, S., Leh, M., Velpuri, N. M., Gumma, M. K., Ghansah, B., Thilina-Prabhath, P., Akpoti, K., Mekonnen, K., Tinonetsana, P., & Mohammed, I. + - Title: Rainfed and Irrigated Cropland Areas for Africa + URL: https://doi.org/10.5066/P9N4R7SF + AuthorName: Owusu, A., Kagone, S., Leh, M., and Velpuri, N.M. diff --git a/datasets/cwa_opendata.yaml b/datasets/cwa_opendata.yaml index 5cfd2550b..eabfc4faf 100644 --- a/datasets/cwa_opendata.yaml +++ b/datasets/cwa_opendata.yaml @@ -1,19 +1,23 @@ -Name: Central Weather Administration OpenData -Description: Various kinds of weather raw data and charts from Central Weather Administration. 
-Documentation: https://opendata.cwa.gov.tw/devManual/insrtuction -Contact: od@cwa.gov.tw -ManagedBy: "[Central Weather Administration](https://www.cwa.gov.tw/)" -UpdateFrequency: Data is updated as soon as newer one is available. -Tags: - - aws-pds - - climate - - earth observation - - earthquakes - - satellite imagery - - weather -License: http://data.gov.tw/license -Resources: - - Description: CWA data lake - ARN: arn:aws:s3:::cwaopendata - Region: ap-northeast-1 +Name: Central Weather Administration OpenData +Description: Various kinds of weather raw data and charts from Central Weather Administration. +Documentation: https://opendata.cwa.gov.tw/devManual/insrtuction +Contact: od@cwa.gov.tw +ManagedBy: "[Central Weather Administration](https://www.cwa.gov.tw/)" +UpdateFrequency: Data is updated as soon as newer one is available. +Collabs: + ASDI: + Tags: + - climate +Tags: + - aws-pds + - climate + - earth observation + - earthquakes + - satellite imagery + - weather +License: http://data.gov.tw/license +Resources: + - Description: CWA data lake + ARN: arn:aws:s3:::cwaopendata + Region: ap-northeast-1 Type: S3 Bucket \ No newline at end of file diff --git a/datasets/dep-coastlines.yaml b/datasets/dep-coastlines.yaml index 33fd94d62..e292875ec 100644 --- a/datasets/dep-coastlines.yaml +++ b/datasets/dep-coastlines.yaml @@ -9,6 +9,10 @@ Documentation: https://digitalearthpacific.org/#/applications Contact: dep@spc.int ManagedBy: "[Pacific Community (SPC)](https://www.spc.int/)" UpdateFrequency: Annually +Collabs: + ASDI: + Tags: + - oceans Tags: - earth observation - environmental diff --git a/datasets/dep-mangroves.yaml b/datasets/dep-mangroves.yaml index fb377b25f..59874cdfe 100644 --- a/datasets/dep-mangroves.yaml +++ b/datasets/dep-mangroves.yaml @@ -12,6 +12,10 @@ Documentation: https://digitalearthpacific.org/#/applications Contact: dep@spc.int ManagedBy: "[Pacific Community (SPC)](https://www.spc.int/)" UpdateFrequency: Annually +Collabs: + ASDI: + Tags: + 
- biodiversity Tags: - earth observation - environmental diff --git a/datasets/dep-s1-annual-mosaics.yaml b/datasets/dep-s1-annual-mosaics.yaml index f7cef8615..8df7f0eb3 100644 --- a/datasets/dep-s1-annual-mosaics.yaml +++ b/datasets/dep-s1-annual-mosaics.yaml @@ -7,6 +7,10 @@ Documentation: https://digitalearthpacific.org/#/applications Contact: dep@spc.int ManagedBy: "[Pacific Community (SPC)](https://www.spc.int/)" UpdateFrequency: Annually +Collabs: + ASDI: + Tags: + - climate Tags: - earth observation - environmental diff --git a/datasets/dep-s2-geomads.yaml b/datasets/dep-s2-geomads.yaml index 669e567e6..13c9b1e4c 100644 --- a/datasets/dep-s2-geomads.yaml +++ b/datasets/dep-s2-geomads.yaml @@ -13,6 +13,10 @@ Documentation: https://digitalearthpacific.org/#/applications Contact: dep@spc.int ManagedBy: "[Pacific Community (SPC)](https://www.spc.int/)" UpdateFrequency: Annually +Collabs: + ASDI: + Tags: + - climate Tags: - earth observation - geoscience diff --git a/datasets/dmi-opendata.yaml b/datasets/dmi-opendata.yaml index eab6d02a1..8bde94349 100644 --- a/datasets/dmi-opendata.yaml +++ b/datasets/dmi-opendata.yaml @@ -4,6 +4,10 @@ Documentation: https://opendatadocs.dmi.govcloud.dk/en/Data/Forecast_Data Contact: https://opendatadocs.dmi.govcloud.dk/en/API_Status_and_Contact ManagedBy: "[Danish Meteorological Institute](https://www.dmi.dk/)" UpdateFrequency: Every hour, 3 hours or 6 hours depending on model +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - air temperature diff --git a/datasets/ecmwf-forecasts.yaml b/datasets/ecmwf-forecasts.yaml index 638a6c084..94ace5f1e 100644 --- a/datasets/ecmwf-forecasts.yaml +++ b/datasets/ecmwf-forecasts.yaml @@ -5,6 +5,10 @@ Documentation: "[User Documentation](https://confluence.ecmwf.int/display/DAC/EC Contact: https://confluence.ecmwf.int/site/support ManagedBy: "[European Centre for Medium-Range Weather Forecasts](https://www.ecmwf.int/)" UpdateFrequency: "The data are released 1 hour after the [real-time 
dissemination schedule](https://confluence.ecmwf.int/display/DAC/Dissemination+schedule)." +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - air temperature diff --git a/datasets/epa-2022-modeling-platform.yaml b/datasets/epa-2022-modeling-platform.yaml index a876a858e..5385ba4e2 100644 --- a/datasets/epa-2022-modeling-platform.yaml +++ b/datasets/epa-2022-modeling-platform.yaml @@ -1,97 +1,101 @@ -Name: >- - OAQPS 2022 Modeling Platform -Description: >- - The data are part of the 2022 Modeling Platform used to support regulatory actions - and technical analyses conducted by the EPA's Office of Air Quality Planning and - Standards. Specifically, this data includes Weather Research and Forecasting Model (v4.4.2) - conducted at a 12-km resolution over the Continental United States (12US). MCIP-processed - files and wrfcamx-processed (12US1 domain) are also available as part of this dataset - to assist in the use of emissions processing and photochemical modeling. These files - may be used in downstream applications to generate emissions, photochemical - modeling, or dispersion modeling inputs. Additionally, lateral boundary condition files - generated using GEOS-CF at 36-km with results translated from GEOS-Chem species available - in GEOS-CF to CMAQ cb6/ae7. Simulations for boundary conditions covering the northern - hemisphere are also provided. 12US2 lateral boundary condition files are also generated based - on 36US3 CMAQ model run outputs. These simulations were conducted using CMAQ v5.4 and GEOS-Chem - v14.0.1. 2022v1 CMAQ-ready emissions are provided for a 36km grid over North America (36US3) - and two 12km grids (12SU1 and 12US2). In addition, 2022v1 CAMx-ready emissions are provided - for a 12km grid over North America (12US2). See the documentation for pictures of the grids. - The types of emissions data provided include point sources, nonpoint sources, mobiles sources, - fires, lightning NOx, and biogenic emissions. 
Input files for computing biogenic emissions, - lightning NOx emissions and bi-directional deposition inline within CMAQ are also provided. - Ozone column and photolysis rate input files for CAMx model run are also provided. The related - CMAQ and CAMx run scripts are also available now. One day sample outputs for CMAQ and CAMx - on the 12US2 domain are also now available. README text files are included at multiple - levels within the directory structure to explain files at that level. For more information - about the emissions, see the below documentation or the 2022v1 web page: - https://www.epa.gov/air-emissions-modeling/2022v1-emissions-modeling-platform -Documentation: >- - 2022 WRF Modeling TSD: - https://bit.ly/2022WRF - - 2022 Emissions Base Case: - https://bit.ly/2022Emissions - - 2022 v1 36US3 model performance: - https://bit.ly/36US3_2022 - -Contact: Misenis.Chris@epa.gov -ManagedBy: - U.S. Environmental Protection Agency (https://www.epa.gov) -UpdateFrequency: As needed -Tags: - - aws-pds - - air quality - - regulatory - - weather - - meteorological -License: >- - These datasets are products of the U.S. Government and are intended for public - access and use. Unless otherwise specified, all data produced by the U.S EPA - is, by default, in the public domain and are not subject to domestic copyright - protection under 17 U.S.C. § 105. More details on the U.S. Public Domain - license are available here: http://www.usa.gov/publicdomain/label/1.0/ -Citation: >- - WRF Modeling: - US EPA, 2024, "Meteorological Model Performance for Annual 2022 Simulation - WRF v4.4.2" - Emissions Modeling: - US EPA, 2024, "Documentation of 2022 Base Year Emissions Released August 2024" -Resources: - - Description: >- - The 2022 WRF output are stored as uncompressed netcdf/hdf5 formatted files in - the /WRF directory. The 2022 MCIP output are stored as uncompressed netcdf/hdf5 - formatted files in IOAPI format in the /MCIP directory. 
The wrfcamx files are stored - as uncompressed netcdf files in the /wrfcamx directory. Information on the model - projection and grid structure is contained in the header information of the - netcdf file. The netcdf files can be opened and manipulated using software programs - that can read and write netcdf formatted files (e.g. Fortran, R, Python). - The WRF files are daily files containing hourly data beginning at 00UTC through - 23UTC for each modeled day. For more information: https://www2.mmm.ucar.edu/wrf/users/ - The MCIP files are daily files with multiple files for each day. For more information - about what each MCIP file contains, please see the following GitHub entry: - https://github.com/USEPA/CMAQ/blob/main/PREP/mcip/README.md For more information about - what each wrfcamx file contains, please see the README file in the wrfcamx source - code file available from Ramboll at: - https://www.camx.com/getmedia/wrfcamx_v5.2.10Jan22.tgz - The 2022v1 emissions data are stored as uncompressed netcdf files in the /emis directory. - Year 2022 CMAQ-ready emissions are provided under the folder emis/2022hc_cb6_22m. - Year 2022 CAMx-ready emissions are provided under the folder emis/CAMx. - The 2022v1 12US2 boundary conditions are stored as uncompressed netecdf files in - bcon/12US2_CMAQ_BCON and bcon/ HEMI_CMAQ_12US2_CAMxBC. The 2022v1 12US2 and 36US3 - EPIC data are stored as uncompressed netecdf files in CMAQ_ancillary_inputs/EPIC. - The 2022v1 12US2 and 36US3 lightning data are stored as uncompressed netecdf - files in CMAQ_ancillary_inputs/Lightning_data. The 2022v1 12US2 and 36US3 ozone - column data as uncompressed txt files in CAMx_ancillary_inputs/ozone_col. The 2022v1 - 12US2 and 36US3 photolysis rate data are stored as uncompressed files in - CAMx_ancillary_inputs/photolysis_rate. The 2022v1 12US2 and 36US3 model run - scripts are stored in Model_jobs/. 
- ARN: 'arn:aws:s3:::epa-2022-modeling-platform' - Region: us-east-1 - Type: S3 Bucket - Explore: - - '[Browse Bucket](https://epa-2022-modeling-platform.s3.amazonaws.com/index.html)' - - Description: Notification for the 2022 Modeling Platform bucket - ARN: 'arn:aws:sns:us-east-1:127085394039:epa-2022-modeling-platform-object_created' - Region: us-east-1 +Name: >- + OAQPS 2022 Modeling Platform +Description: >- + The data are part of the 2022 Modeling Platform used to support regulatory actions + and technical analyses conducted by the EPA's Office of Air Quality Planning and + Standards. Specifically, this data includes Weather Research and Forecasting Model (v4.4.2) + conducted at a 12-km resolution over the Continental United States (12US). MCIP-processed + files and wrfcamx-processed (12US1 domain) are also available as part of this dataset + to assist in the use of emissions processing and photochemical modeling. These files + may be used in downstream applications to generate emissions, photochemical + modeling, or dispersion modeling inputs. Additionally, lateral boundary condition files + generated using GEOS-CF at 36-km with results translated from GEOS-Chem species available + in GEOS-CF to CMAQ cb6/ae7. Simulations for boundary conditions covering the northern + hemisphere are also provided. 12US2 lateral boundary condition files are also generated based + on 36US3 CMAQ model run outputs. These simulations were conducted using CMAQ v5.4 and GEOS-Chem + v14.0.1. 2022v1 CMAQ-ready emissions are provided for a 36km grid over North America (36US3) + and two 12km grids (12SU1 and 12US2). In addition, 2022v1 CAMx-ready emissions are provided + for a 12km grid over North America (12US2). See the documentation for pictures of the grids. + The types of emissions data provided include point sources, nonpoint sources, mobiles sources, + fires, lightning NOx, and biogenic emissions. 
Input files for computing biogenic emissions, + lightning NOx emissions and bi-directional deposition inline within CMAQ are also provided. + Ozone column and photolysis rate input files for CAMx model run are also provided. The related + CMAQ and CAMx run scripts are also available now. One day sample outputs for CMAQ and CAMx + on the 12US2 domain are also now available. README text files are included at multiple + levels within the directory structure to explain files at that level. For more information + about the emissions, see the below documentation or the 2022v1 web page: + https://www.epa.gov/air-emissions-modeling/2022v1-emissions-modeling-platform +Documentation: >- + 2022 WRF Modeling TSD: + https://bit.ly/2022WRF + + 2022 Emissions Base Case: + https://bit.ly/2022Emissions + + 2022 v1 36US3 model performance: + https://bit.ly/36US3_2022 + +Contact: Misenis.Chris@epa.gov +ManagedBy: + U.S. Environmental Protection Agency (https://www.epa.gov) +UpdateFrequency: As needed +Collabs: + ASDI: + Tags: + - climate +Tags: + - aws-pds + - air quality + - regulatory + - weather + - meteorological +License: >- + These datasets are products of the U.S. Government and are intended for public + access and use. Unless otherwise specified, all data produced by the U.S EPA + is, by default, in the public domain and are not subject to domestic copyright + protection under 17 U.S.C. § 105. More details on the U.S. Public Domain + license are available here: http://www.usa.gov/publicdomain/label/1.0/ +Citation: >- + WRF Modeling: + US EPA, 2024, "Meteorological Model Performance for Annual 2022 Simulation + WRF v4.4.2" + Emissions Modeling: + US EPA, 2024, "Documentation of 2022 Base Year Emissions Released August 2024" +Resources: + - Description: >- + The 2022 WRF output are stored as uncompressed netcdf/hdf5 formatted files in + the /WRF directory. The 2022 MCIP output are stored as uncompressed netcdf/hdf5 + formatted files in IOAPI format in the /MCIP directory. 
The wrfcamx files are stored + as uncompressed netcdf files in the /wrfcamx directory. Information on the model + projection and grid structure is contained in the header information of the + netcdf file. The netcdf files can be opened and manipulated using software programs + that can read and write netcdf formatted files (e.g. Fortran, R, Python). + The WRF files are daily files containing hourly data beginning at 00UTC through + 23UTC for each modeled day. For more information: https://www2.mmm.ucar.edu/wrf/users/ + The MCIP files are daily files with multiple files for each day. For more information + about what each MCIP file contains, please see the following GitHub entry: + https://github.com/USEPA/CMAQ/blob/main/PREP/mcip/README.md For more information about + what each wrfcamx file contains, please see the README file in the wrfcamx source + code file available from Ramboll at: + https://www.camx.com/getmedia/wrfcamx_v5.2.10Jan22.tgz + The 2022v1 emissions data are stored as uncompressed netcdf files in the /emis directory. + Year 2022 CMAQ-ready emissions are provided under the folder emis/2022hc_cb6_22m. + Year 2022 CAMx-ready emissions are provided under the folder emis/CAMx. + The 2022v1 12US2 boundary conditions are stored as uncompressed netecdf files in + bcon/12US2_CMAQ_BCON and bcon/ HEMI_CMAQ_12US2_CAMxBC. The 2022v1 12US2 and 36US3 + EPIC data are stored as uncompressed netecdf files in CMAQ_ancillary_inputs/EPIC. + The 2022v1 12US2 and 36US3 lightning data are stored as uncompressed netecdf + files in CMAQ_ancillary_inputs/Lightning_data. The 2022v1 12US2 and 36US3 ozone + column data as uncompressed txt files in CAMx_ancillary_inputs/ozone_col. The 2022v1 + 12US2 and 36US3 photolysis rate data are stored as uncompressed files in + CAMx_ancillary_inputs/photolysis_rate. The 2022v1 12US2 and 36US3 model run + scripts are stored in Model_jobs/. 
+ ARN: 'arn:aws:s3:::epa-2022-modeling-platform' + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://epa-2022-modeling-platform.s3.amazonaws.com/index.html)' + - Description: Notification for the 2022 Modeling Platform bucket + ARN: 'arn:aws:sns:us-east-1:127085394039:epa-2022-modeling-platform-object_created' + Region: us-east-1 Type: SNS Topic \ No newline at end of file diff --git a/datasets/epa-edde-v1.yaml b/datasets/epa-edde-v1.yaml index 8adb56808..236475d52 100644 --- a/datasets/epa-edde-v1.yaml +++ b/datasets/epa-edde-v1.yaml @@ -37,6 +37,10 @@ Contact: >- ManagedBy: >- U.S. Environmental Protection Agency (https://www.epa.gov) UpdateFrequency: Quarterly +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - weather diff --git a/datasets/epa-edde-v2.yaml b/datasets/epa-edde-v2.yaml index 859af3df3..9d30adba3 100644 --- a/datasets/epa-edde-v2.yaml +++ b/datasets/epa-edde-v2.yaml @@ -36,6 +36,10 @@ Contact: >- ManagedBy: >- U.S. Environmental Protection Agency (https://www.epa.gov) UpdateFrequency: Quarterly +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - weather diff --git a/datasets/epa-equates-v1.yaml b/datasets/epa-equates-v1.yaml index 56e8dec34..ef4160ab2 100644 --- a/datasets/epa-equates-v1.yaml +++ b/datasets/epa-equates-v1.yaml @@ -1,63 +1,67 @@ -Name: >- - Community Multiscale Air Quality (CMAQ) 2019 3D Gridded and Column data from - the EPA's Air Quality Time Series (EQUATES) Project -Description: >- - The data are part of EPA’s Air Quality Time Series (EQUATES) Project. The - data consist of hourly gridded pollutant concentrations estimates by the - Community Multiscale Air Quality (CMAQ) model version 5.3.2 - (https://doi.org/10.15139/S3/F2KJSK) for January 1 – December 31, 2019. Model - data is provided for two spatial domains : the Northern Hemisphere (108 km x - 108km horizontal grid spacing) and the Contiguous United States including - parts of Canada and Mexico (12km x 12km horizontal grid spacing). 
Two types - of hourly data are provided: three-dimensional air pollutant concentrations - and vertical column pollutant totals. Previous studies have used this type of - CMAQ 3D and vertical column air quality data to evaluate the modeling system, - created model-observed ‘fused’ surfaces, and to analyze spatial and temporal - changes in air quality in the upper atmosphere, e.g., - https://doi.org/10.1016/j.envint.2019.104909; - https://doi.org/10.5194/acp-17-12449-2017; - https://doi.org/10.1029/2006JD008085; - https://doi.org/10.5194/acp-15-9997-2015. -Documentation: >- - EQUATES data DOI: https://doi.org/10.15139/S3/F2KJSK. Please see the Data Use - Statement if you plan to use this data for your own research. Additional - information may be found on the EQUATES home page (www.epa.gov/cmaq/equates). - For questions or issues please use this User Support Forum: - https://forum.cmascenter.org/t/about-the-equates-category/2723 -Contact: CMAQ_Team@epa.gov -ManagedBy: >- - U.S. Environmental Protection Agency (https://www.epa.gov) -UpdateFrequency: Annual -Tags: - - aws-pds - - air quality - - atmosphere - - model -License: >- - These datasets are products of the U.S. Government and are intended for public - access and use. Unless otherwise specified, all data produced by the U.S EPA - is, by default, in the public domain and are not subject to domestic copyright - protection under 17 U.S.C. § 105. More details on the U.S. Public Domain - license are available here: http://www.usa.gov/publicdomain/label/1.0/ -Citation: >- - US EPA, 2021, "EQUATESv1.0: Emissions, WRF/MCIP, CMAQv5.3.2 Data -- 2002-2019 - US_12km and NHEMI_108km", https://doi.org/10.15139/S3/F2KJSK, UNC Dataverse, - V5 -Resources: - - Description: >- - The 2019 CMAQ output are stored as compressed netcdf/hdf5 formatted files - using I/O API data structures (https://www.cmascenter.org/ioapi/). 
- Information on the model projection and grid structure is contained in the - header information of the netcdf file. The netcdf files can be opened and - manipulated using I/O API utilities (e.g. M3XTRACT, M3WNDW) or other - software programs that can read and write netcdf formatted files (e.g. - Fortran, R, Python). - ARN: 'arn:aws:s3:::epa-equates-v1' - Region: us-east-1 - Type: S3 Bucket - Explore: - - '[Browse Bucket](https://epa-equates-v1.s3.amazonaws.com/index.html)' - - Description: Notifications for EQUATES bucket - ARN: 'arn:aws:sns:us-east-1:127085394039:epa-equates-v1-object_created' - Region: us-east-1 +Name: >- + Community Multiscale Air Quality (CMAQ) 2019 3D Gridded and Column data from + the EPA's Air Quality Time Series (EQUATES) Project +Description: >- + The data are part of EPA’s Air Quality Time Series (EQUATES) Project. The + data consist of hourly gridded pollutant concentrations estimates by the + Community Multiscale Air Quality (CMAQ) model version 5.3.2 + (https://doi.org/10.15139/S3/F2KJSK) for January 1 – December 31, 2019. Model + data is provided for two spatial domains : the Northern Hemisphere (108 km x + 108km horizontal grid spacing) and the Contiguous United States including + parts of Canada and Mexico (12km x 12km horizontal grid spacing). Two types + of hourly data are provided: three-dimensional air pollutant concentrations + and vertical column pollutant totals. Previous studies have used this type of + CMAQ 3D and vertical column air quality data to evaluate the modeling system, + created model-observed ‘fused’ surfaces, and to analyze spatial and temporal + changes in air quality in the upper atmosphere, e.g., + https://doi.org/10.1016/j.envint.2019.104909; + https://doi.org/10.5194/acp-17-12449-2017; + https://doi.org/10.1029/2006JD008085; + https://doi.org/10.5194/acp-15-9997-2015. +Documentation: >- + EQUATES data DOI: https://doi.org/10.15139/S3/F2KJSK. 
Please see the Data Use + Statement if you plan to use this data for your own research. Additional + information may be found on the EQUATES home page (www.epa.gov/cmaq/equates). + For questions or issues please use this User Support Forum: + https://forum.cmascenter.org/t/about-the-equates-category/2723 +Contact: CMAQ_Team@epa.gov +ManagedBy: >- + U.S. Environmental Protection Agency (https://www.epa.gov) +UpdateFrequency: Annual +Collabs: + ASDI: + Tags: + - climate +Tags: + - aws-pds + - air quality + - atmosphere + - model +License: >- + These datasets are products of the U.S. Government and are intended for public + access and use. Unless otherwise specified, all data produced by the U.S EPA + is, by default, in the public domain and are not subject to domestic copyright + protection under 17 U.S.C. § 105. More details on the U.S. Public Domain + license are available here: http://www.usa.gov/publicdomain/label/1.0/ +Citation: >- + US EPA, 2021, "EQUATESv1.0: Emissions, WRF/MCIP, CMAQv5.3.2 Data -- 2002-2019 + US_12km and NHEMI_108km", https://doi.org/10.15139/S3/F2KJSK, UNC Dataverse, + V5 +Resources: + - Description: >- + The 2019 CMAQ output are stored as compressed netcdf/hdf5 formatted files + using I/O API data structures (https://www.cmascenter.org/ioapi/). + Information on the model projection and grid structure is contained in the + header information of the netcdf file. The netcdf files can be opened and + manipulated using I/O API utilities (e.g. M3XTRACT, M3WNDW) or other + software programs that can read and write netcdf formatted files (e.g. + Fortran, R, Python). 
+ ARN: 'arn:aws:s3:::epa-equates-v1' + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://epa-equates-v1.s3.amazonaws.com/index.html)' + - Description: Notifications for EQUATES bucket + ARN: 'arn:aws:sns:us-east-1:127085394039:epa-equates-v1-object_created' + Region: us-east-1 Type: SNS Topic \ No newline at end of file diff --git a/datasets/era5-for-wrf.yaml b/datasets/era5-for-wrf.yaml index 6747b5bcc..11bfe3eb9 100644 --- a/datasets/era5-for-wrf.yaml +++ b/datasets/era5-for-wrf.yaml @@ -1,26 +1,30 @@ -Name: ERA5-for-WRF Open Data on AWS -Description: ERA5 reanalysis data on AWS, preprocessed for use with the Weather Research and Forecasting (WRF) model. -Documentation: https://github.com/moptis/era5-for-wrf/ -Contact: info@veer.eco -ManagedBy: "[Veer Renewables](http://www.veer.eco/)" -UpdateFrequency: Monthly. -Tags: - - aws-pds - - weather - - sustainability - - atmosphere - - electricity - - meteorological - - model -License: CC BY-SA 4.0 -Resources: - - Description: ERA5-for-WRF Data - ARN: arn:aws:s3:::era5-for-wrf - Region: us-east-1 - Type: S3 Bucket -DataAtWork: - Tutorials: - - Title: ERA5-for-WRF Tutorials - URL: https://github.com/moptis/era5-for-wrf/ - AuthorName: Veer Renewables - AuthorURL: https://veer.eco +Name: ERA5-for-WRF Open Data on AWS +Description: ERA5 reanalysis data on AWS, preprocessed for use with the Weather Research and Forecasting (WRF) model. +Documentation: https://github.com/moptis/era5-for-wrf/ +Contact: info@veer.eco +ManagedBy: "[Veer Renewables](http://www.veer.eco/)" +UpdateFrequency: Monthly. 
+Collabs: + ASDI: + Tags: + - climate +Tags: + - aws-pds + - weather + - sustainability + - atmosphere + - electricity + - meteorological + - model +License: CC BY-SA 4.0 +Resources: + - Description: ERA5-for-WRF Data + ARN: arn:aws:s3:::era5-for-wrf + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: ERA5-for-WRF Tutorials + URL: https://github.com/moptis/era5-for-wrf/ + AuthorName: Veer Renewables + AuthorURL: https://veer.eco diff --git a/datasets/ford-multi-av-seasonal.yaml b/datasets/ford-multi-av-seasonal.yaml index 8b5c95953..c067a9bc9 100644 --- a/datasets/ford-multi-av-seasonal.yaml +++ b/datasets/ford-multi-av-seasonal.yaml @@ -4,6 +4,10 @@ Contact: avdata.ford.com Documentation: avdata.ford.com ManagedBy: "[Ford Motor Company](https://avdata.ford.com)" UpdateFrequency: New data will be added until the entire dataset is released online. +Collabs: + ASDI: + Tags: + - infrastructure Tags: - autonomous vehicles - computer vision diff --git a/datasets/geoglows-v2.yaml b/datasets/geoglows-v2.yaml index ef7a793b9..fe536853c 100644 --- a/datasets/geoglows-v2.yaml +++ b/datasets/geoglows-v2.yaml @@ -25,6 +25,10 @@ Documentation: https://training.geoglows.org Contact: https://groups.google.com/g/geoglows ManagedBy: Riley Hales UpdateFrequency: Monthly +Collabs: + ASDI: + Tags: + - oceans Tags: - aws-pds - hydrology diff --git a/datasets/glo-30-hand.yaml b/datasets/glo-30-hand.yaml index b065417af..072b1356e 100644 --- a/datasets/glo-30-hand.yaml +++ b/datasets/glo-30-hand.yaml @@ -11,6 +11,10 @@ ManagedBy: "[The Alaska Satellite Facility (ASF)](https://asf.alaska.edu/)" UpdateFrequency: > None, except HAND may be updated if the[ Copernicus GLO-30 Public](https://registry.opendata.aws/copernicus-dem/) dataset is updated. 
+Collabs: + ASDI: + Tags: + - disaster Tags: - aws-pds - elevation diff --git a/datasets/global-drought-flood-catalogue.yaml b/datasets/global-drought-flood-catalogue.yaml index bdcf28ece..6837fd190 100644 --- a/datasets/global-drought-flood-catalogue.yaml +++ b/datasets/global-drought-flood-catalogue.yaml @@ -5,6 +5,10 @@ Documentation: https://prep-next.github.io/data/GDFC/index.html Contact: For any questions regrading dataset, email Professor Xiaogang He at hexg@nus.edu.sg. ManagedBy: "[PREP-NexT Lab](https://github.com/PREP-NexT)" UpdateFrequency: No future updates planned. +Collabs: + ASDI: + Tags: + - disaster Tags: - aws-pds - floods diff --git a/datasets/gmsdata.yaml b/datasets/gmsdata.yaml index e2cd328a9..c9bfca33d 100644 --- a/datasets/gmsdata.yaml +++ b/datasets/gmsdata.yaml @@ -4,6 +4,10 @@ Documentation: https://github.com/genome/gms/wiki Contact: https://github.com/genome/gms/issues ManagedBy: Genome Institute at the Washington University School of Medicine in St. Louis UpdateFrequency: Not updated +Collabs: + ASDI: + Tags: + - biodiversity Tags: - aws-pds - genetic diff --git a/datasets/gnss-ro-opendata.yaml b/datasets/gnss-ro-opendata.yaml index dded0e1f0..7971306c5 100644 --- a/datasets/gnss-ro-opendata.yaml +++ b/datasets/gnss-ro-opendata.yaml @@ -4,6 +4,10 @@ Documentation: "http://github.com/gnss-ro/aws-opendata" Contact: "Stephen Leroy (sleroy@aer.com)" ManagedBy: "Verisk Atmospheric and Environmental Research, Inc." UpdateFrequency: "The dataset is updated monthly for UCAR and ROM SAF contributions only. The update frequency for the JPL contribution is to be determined." 
+Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - atmosphere diff --git a/datasets/green_et.yaml b/datasets/green_et.yaml index cb55ed77b..a11576f9e 100644 --- a/datasets/green_et.yaml +++ b/datasets/green_et.yaml @@ -1,32 +1,36 @@ -Name: IWMI DIWASA Green ET for Africa -Description: Green evapotranspiration (Green ET) is the portion of ET derived from green water, which includes soil moisture and rainfall used by vegetation. It represents a key component of green water fluxes in water accounting. Green ET consists of evaporation from soil moisture in non-irrigated areas, transpiration from rainfed crops and natural vegetation, and interception losses from precipitation on vegetation. It plays a crucial role in rainfed agriculture, drought monitoring, and sustainable water management by tracking how rainfall supports plant growth. -Documentation: https://iwmi.africageoportal.com/pages/continent-africa -Contact: iwmiwaplus@gmail.com -ManagedBy: "[IWMI](https://www.iwmi.org/)" -UpdateFrequency: None -Tags: - - soil moisture - - rainfed cropland - - interception loss - - evapotranspiration - - water -License: "Creative commons open license" -Resources: - - Description: Monthly Green ET for Africa - ARN: arn:aws:s3:::iwmi-datasets/Water_accounting_plus/Africa/Rainfall_ET_M/ - Region: af-south-1 - Type: S3 Bucket - Explore: - - '[Browse Bucket](https://iwmi-datasets.s3.af-south-1.amazonaws.com/Cropland_partition/index.html)' - -DataAtWork: - Tutorials: - - Title: Analysis of IWMI’s Water Data Products through Digital Earth Africa - URL: https://learn.digitalearthafrica.org/courses/course-v1:IWMI+DIWASA1+2024_10/about - AuthorName: A.T. Haile, E.T. Negash, K. Mubea, M. 
Tadesse - AuthorURL: https://github.com/iwmiwaplus - Tools & Applications: - - Title: Multi-Scale Water Accounting in the Volta Basin - URL: https://public.tableau.com/app/profile/iwmi.wa/viz/Voltabasinvertical/Merged?publish=yes - AuthorName: iwmiwaplus - AuthorURL: https://public.tableau.com/app/profile/iwmi.wa +Name: IWMI DIWASA Green ET for Africa +Description: Green evapotranspiration (Green ET) is the portion of ET derived from green water, which includes soil moisture and rainfall used by vegetation. It represents a key component of green water fluxes in water accounting. Green ET consists of evaporation from soil moisture in non-irrigated areas, transpiration from rainfed crops and natural vegetation, and interception losses from precipitation on vegetation. It plays a crucial role in rainfed agriculture, drought monitoring, and sustainable water management by tracking how rainfall supports plant growth. +Documentation: https://iwmi.africageoportal.com/pages/continent-africa +Contact: iwmiwaplus@gmail.com +ManagedBy: "[IWMI](https://www.iwmi.org/)" +UpdateFrequency: None +Collabs: + ASDI: + Tags: + - agriculture +Tags: + - soil moisture + - rainfed cropland + - interception loss + - evapotranspiration + - water +License: "Creative commons open license" +Resources: + - Description: Monthly Green ET for Africa + ARN: arn:aws:s3:::iwmi-datasets/Water_accounting_plus/Africa/Rainfall_ET_M/ + Region: af-south-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://iwmi-datasets.s3.af-south-1.amazonaws.com/Cropland_partition/index.html)' + +DataAtWork: + Tutorials: + - Title: Analysis of IWMI’s Water Data Products through Digital Earth Africa + URL: https://learn.digitalearthafrica.org/courses/course-v1:IWMI+DIWASA1+2024_10/about + AuthorName: A.T. Haile, E.T. Negash, K. Mubea, M. 
Tadesse + AuthorURL: https://github.com/iwmiwaplus + Tools & Applications: + - Title: Multi-Scale Water Accounting in the Volta Basin + URL: https://public.tableau.com/app/profile/iwmi.wa/viz/Voltabasinvertical/Merged?publish=yes + AuthorName: iwmiwaplus + AuthorURL: https://public.tableau.com/app/profile/iwmi.wa diff --git a/datasets/gulfwide-avian-monitoring.yaml b/datasets/gulfwide-avian-monitoring.yaml index bd9a0fdf3..162ae942a 100755 --- a/datasets/gulfwide-avian-monitoring.yaml +++ b/datasets/gulfwide-avian-monitoring.yaml @@ -1,43 +1,47 @@ ---- -Name: Gulfwide Avian Colony Monitoring Survey Photos -Description: > - For this project, The Water Institute (the Institute) and - subcontractor Colibri Ecological Consulting, LLC (Colibri) utilized - established methods and protocols capable of assessing changes of colonial - waterbird populations and their important habitats within individual states - and the broader northern Gulf of Mexico region. - Data collection activities included: - Aerial Photographic Nest Surveys: Implementation of fixed-wing aircraft surveys intended to assess waterbird colonies and document associated nesting within select portions of the northern Gulf of Mexico. Additional detail is provided on the Survey Protocols page of this portal. - Nest Dotting Analyses: Review and analysis of aerial photographic nest surveys (2010-2013, 2015, 2018, and 2021) with the intention of documenting the breeding population size and associated nesting for each species at each colony. Additional detail is provided on the Dotting Protocols page of this portal. 
-Documentation: https://experience.arcgis.com/experience/010503b4c64b4ff6a7f3570220a53647 -Contact: avaiandataaws@thewaterinstitute.org -ManagedBy: "[CPRA](https://coastal.la.gov/) and [The Water - Institute](https://thewaterinstitute.org/)" -UpdateFrequency: ~2 years -Tags: - - biology - - conservation - - ecosystems - - object detection - - labeled - - environmental - - aws-pds -License: Creative Commons BY-SA -Resources: - - Description: > - High resolution(5184 x 3456) images are provided in jpg format - (compression quality level 98%). - - The avian-monitoring folder includes the high resolution photos, the dotting process screenshots, the dotting information (birds and nest counts by species), and thumbnails of the subset of the images referenced on those dataset. Files in this subfolder have been renamed and organized to have a common naming convension across the years. - - The top level `High Resolution Images` includes all the high resolution images (even the ones not referenced in the dotting process) with their original filenames. - ARN: arn:aws:s3:::twi-aviandata - Region: us-east-2 - Type: S3 Bucket - Explore: - - "[Explore - dataset](https://experience.arcgis.com/experience/010503b4c64b4ff6a7f35\ - 70220a53647/page/Data-Explorer/)" - - "[README](https://experience.arcgis.com/experience/010503b4c64b4ff6a7f3\ - 570220a53647/page/Project-Information/)" - - "[Data processing notebook](https://github.com/waterinstitute/avian_data_ingestor/blob/master/doc/Metadata%20for%20DottedImages.ipynb)" +--- +Name: Gulfwide Avian Colony Monitoring Survey Photos +Description: > + For this project, The Water Institute (the Institute) and + subcontractor Colibri Ecological Consulting, LLC (Colibri) utilized + established methods and protocols capable of assessing changes of colonial + waterbird populations and their important habitats within individual states + and the broader northern Gulf of Mexico region. 
+ Data collection activities included: + Aerial Photographic Nest Surveys: Implementation of fixed-wing aircraft surveys intended to assess waterbird colonies and document associated nesting within select portions of the northern Gulf of Mexico. Additional detail is provided on the Survey Protocols page of this portal. + Nest Dotting Analyses: Review and analysis of aerial photographic nest surveys (2010-2013, 2015, 2018, and 2021) with the intention of documenting the breeding population size and associated nesting for each species at each colony. Additional detail is provided on the Dotting Protocols page of this portal. +Documentation: https://experience.arcgis.com/experience/010503b4c64b4ff6a7f3570220a53647 +Contact: avaiandataaws@thewaterinstitute.org +ManagedBy: "[CPRA](https://coastal.la.gov/) and [The Water + Institute](https://thewaterinstitute.org/)" +UpdateFrequency: ~2 years +Collabs: + ASDI: + Tags: + - biodiversity +Tags: + - biology + - conservation + - ecosystems + - object detection + - labeled + - environmental + - aws-pds +License: Creative Commons BY-SA +Resources: + - Description: > + High resolution(5184 x 3456) images are provided in jpg format + (compression quality level 98%). + + The avian-monitoring folder includes the high resolution photos, the dotting process screenshots, the dotting information (birds and nest counts by species), and thumbnails of the subset of the images referenced on those dataset. Files in this subfolder have been renamed and organized to have a common naming convension across the years. + + The top level `High Resolution Images` includes all the high resolution images (even the ones not referenced in the dotting process) with their original filenames. 
+ ARN: arn:aws:s3:::twi-aviandata + Region: us-east-2 + Type: S3 Bucket + Explore: + - "[Explore + dataset](https://experience.arcgis.com/experience/010503b4c64b4ff6a7f35\ + 70220a53647/page/Data-Explorer/)" + - "[README](https://experience.arcgis.com/experience/010503b4c64b4ff6a7f3\ + 570220a53647/page/Project-Information/)" + - "[Data processing notebook](https://github.com/waterinstitute/avian_data_ingestor/blob/master/doc/Metadata%20for%20DottedImages.ipynb)" diff --git a/datasets/hycom-gofs-3pt1-reanalysis.yaml b/datasets/hycom-gofs-3pt1-reanalysis.yaml index cf1d6a0a9..b56a0522c 100644 --- a/datasets/hycom-gofs-3pt1-reanalysis.yaml +++ b/datasets/hycom-gofs-3pt1-reanalysis.yaml @@ -4,6 +4,10 @@ Documentation: https://www.hycom.org/dataserver/gofs-3pt1/reanalysis Contact: help@hycom.org ManagedBy: "[COAPS](https://www.coaps.fsu.edu/)" UpdateFrequency: "Static Dataset Covering 1994-01-01 to 2015-12-31" +Collabs: + ASDI: + Tags: + - oceans Tags: - aws-pds - global diff --git a/datasets/in-elevation.yaml b/datasets/in-elevation.yaml index 2b07759c8..f2a5169cb 100644 --- a/datasets/in-elevation.yaml +++ b/datasets/in-elevation.yaml @@ -1,38 +1,42 @@ -Name: Indiana Statewide Elevation Catalog -Description: | - The State of Indiana Geographic Information Office and IOT Office of Technology manage a series of digital LiDAR LAS files stored in AWS, dating back to the 2011-2013 collection and including the NRCS-funded 2016-2020 collection. These LiDAR datasets are available as uncompressed LAS files, for cloud storage and access. Each year's data is organized into a tile grid scheme covering the entire geography of Indiana, ensuring easy access and efficient processing. The tiles' naming reflects each tile's lower left coordinate, facilitating accurate data management and retrieval. The AWS storage solution ensures that these extensive datasets are readily accessible for analysis and application across various projects. 
-Documentation: https://elevation.gio.in.gov/ -Contact: sscholer@iot.in.gov -ManagedBy: Indiana Geographic Information Office -UpdateFrequency: The State of Indiana has another four-year cycle of collecting orthoimagery and Lidar starting in 2025 and continuing through 2028. The collections are designated by counties in three groups that cover Indiana, South to North. These areas are frequently referred to as Tiers in the other documentation. For example, tier 1 (Central 3rd) extends from Harrison County in the South to Elkhart and St. Joseph County in the North, while Tier 2 consists of the counties to the eastern side of the State, and Tier 3 is those counties to the western side of the State. -Tags: -- lidar -- aws-pds -- earth observation -- geospatial -- imaging -- mapping -- natural resource -- sustainability -- agriculture -License: "Access to Indiana Geographic Information Office Lidar is governed by Creative Commons 0 (CC0): https://creativecommons.org/publicdomain/zero/1.0/legalcode" -Resources: -- Description: State of Indiana Elevation archive. 
- ARN: arn:aws:s3:::giselevationingov - Region: us-east-2 - Type: S3 Bucket -DataAtWork: - Tutorials: - Tools & Applications: - - Title: ArcGIS Online Indiana Lidar Viewer - URL: https://indianamap-inmap.hub.arcgis.com/maps/ff98e3834d464619bd5c8974b0038a13/about - AuthorName: Indiana Geographic Information Office (IGIO) - - Title: IGIO Elevation Opendata S3 Browser - URL: https://giselevationingov.s3.amazonaws.com/index.html - AuthorName: Indiana Geographic Information Office (IGIO) - - Publications: - - Title: "Recording of 2025 - 2028 Indiana Imagery and Elevation Program Presentation" - URL: https://elevation.gio.in.gov/pages/resources - AuthorName: Indiana Geographic Information Office (IGIO) - +Name: Indiana Statewide Elevation Catalog +Description: | + The State of Indiana Geographic Information Office and IOT Office of Technology manage a series of digital LiDAR LAS files stored in AWS, dating back to the 2011-2013 collection and including the NRCS-funded 2016-2020 collection. These LiDAR datasets are available as uncompressed LAS files, for cloud storage and access. Each year's data is organized into a tile grid scheme covering the entire geography of Indiana, ensuring easy access and efficient processing. The tiles' naming reflects each tile's lower left coordinate, facilitating accurate data management and retrieval. The AWS storage solution ensures that these extensive datasets are readily accessible for analysis and application across various projects. +Documentation: https://elevation.gio.in.gov/ +Contact: sscholer@iot.in.gov +ManagedBy: Indiana Geographic Information Office +UpdateFrequency: The State of Indiana has another four-year cycle of collecting orthoimagery and Lidar starting in 2025 and continuing through 2028. The collections are designated by counties in three groups that cover Indiana, South to North. These areas are frequently referred to as Tiers in the other documentation. 
For example, tier 1 (Central 3rd) extends from Harrison County in the South to Elkhart and St. Joseph County in the North, while Tier 2 consists of the counties to the eastern side of the State, and Tier 3 is those counties to the western side of the State. +Collabs: + ASDI: + Tags: + - elevation +Tags: +- lidar +- aws-pds +- earth observation +- geospatial +- imaging +- mapping +- natural resource +- sustainability +- agriculture +License: "Access to Indiana Geographic Information Office Lidar is governed by Creative Commons 0 (CC0): https://creativecommons.org/publicdomain/zero/1.0/legalcode" +Resources: +- Description: State of Indiana Elevation archive. + ARN: arn:aws:s3:::giselevationingov + Region: us-east-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + Tools & Applications: + - Title: ArcGIS Online Indiana Lidar Viewer + URL: https://indianamap-inmap.hub.arcgis.com/maps/ff98e3834d464619bd5c8974b0038a13/about + AuthorName: Indiana Geographic Information Office (IGIO) + - Title: IGIO Elevation Opendata S3 Browser + URL: https://giselevationingov.s3.amazonaws.com/index.html + AuthorName: Indiana Geographic Information Office (IGIO) + + Publications: + - Title: "Recording of 2025 - 2028 Indiana Imagery and Elevation Program Presentation" + URL: https://elevation.gio.in.gov/pages/resources + AuthorName: Indiana Geographic Information Office (IGIO) + diff --git a/datasets/in-imagery.yaml b/datasets/in-imagery.yaml index 0c1ed1225..ce52165b2 100644 --- a/datasets/in-imagery.yaml +++ b/datasets/in-imagery.yaml @@ -1,40 +1,44 @@ -Name: Indiana Statewide Digital Aerial Imagery Catalog -Description: | - The State of Indiana Geographic Information Office and IOT Office of Technology manage a series of digital orthophotography dating back to 2005. Every year's worth of imagery is available as Cloud Optimized GeoTIFF (COG) files, original GeoTIFF, and other compressed deliverables such as ECW and MrSID. 
Additionally, each imagery year is organized into a tile grid scheme covering the entire geography of Indiana. All years of imagery are tiled from a 5,000 ft grid or sub tiles depending upon the resolution of the imagery. The naming of the tiles reflects the lower left coordinate from the image. -Documentation: https://imagery.gio.in.gov/ -Contact: sscholer@iot.in.gov -ManagedBy: Indiana Geographic Information Office -UpdateFrequency: The State of Indiana has had a 4-year cycle collecting imagery. The collections are designated by counties in three groups that cover Indiana, South to North. These areas are frequently referred to as Tiers in the other documentation. For example, tier 1 (Central 3rd) extends from Harrison County in the South to Elkhart and St. Joseph County in the North, while Tier 2 consists of the counties to the eastern side of the State, and Tier 3 is those counties to the western side of the State. -Tags: -- aerial imagery -- aws-pds -- earth observation -- geospatial -- imaging -- mapping -- cog -- natural resource -- sustainability -- agriculture -License: "Access to Indiana Geographic Information Office Orthoimagery is governed by Creative Commons 0 (CC0): https://creativecommons.org/publicdomain/zero/1.0/legalcode" -Resources: -- Description: State of Indiana digital orthophotography archive. 
- ARN: arn:aws:s3:::gisimageryingov - Region: us-east-2 - Type: S3 Bucket -DataAtWork: - Tutorials: - Tools & Applications: - - Title: ArcGIS Online Indiana Orthoimagery Viewer - URL: https://indianamap-inmap.hub.arcgis.com/datasets/61d4dc991c154af49ad7c1d675182a4f/explore - AuthorName: Indiana Geographic Information Office (IGIO) - - Title: IGIO Imagery Opendata S3 Browser - URL: https://gisimageryingov.s3.amazonaws.com/index.html - AuthorName: Indiana Geographic Information Office (IGIO) - - Publications: - - Title: "Recording of 2025 - 2028 Indiana Orthoimagery Program Presentation" - URL: https://imagery.gio.in.gov/pages/resources - AuthorName: Indiana Geographic Information Office (IGIO) - - +Name: Indiana Statewide Digital Aerial Imagery Catalog +Description: | + The State of Indiana Geographic Information Office and IOT Office of Technology manage a series of digital orthophotography dating back to 2005. Every year's worth of imagery is available as Cloud Optimized GeoTIFF (COG) files, original GeoTIFF, and other compressed deliverables such as ECW and MrSID. Additionally, each imagery year is organized into a tile grid scheme covering the entire geography of Indiana. All years of imagery are tiled from a 5,000 ft grid or sub tiles depending upon the resolution of the imagery. The naming of the tiles reflects the lower left coordinate from the image. +Documentation: https://imagery.gio.in.gov/ +Contact: sscholer@iot.in.gov +ManagedBy: Indiana Geographic Information Office +UpdateFrequency: The State of Indiana has had a 4-year cycle collecting imagery. The collections are designated by counties in three groups that cover Indiana, South to North. These areas are frequently referred to as Tiers in the other documentation. For example, tier 1 (Central 3rd) extends from Harrison County in the South to Elkhart and St. 
Joseph County in the North, while Tier 2 consists of the counties to the eastern side of the State, and Tier 3 is those counties to the western side of the State. +Collabs: + ASDI: + Tags: + - climate +Tags: +- aerial imagery +- aws-pds +- earth observation +- geospatial +- imaging +- mapping +- cog +- natural resource +- sustainability +- agriculture +License: "Access to Indiana Geographic Information Office Orthoimagery is governed by Creative Commons 0 (CC0): https://creativecommons.org/publicdomain/zero/1.0/legalcode" +Resources: +- Description: State of Indiana digital orthophotography archive. + ARN: arn:aws:s3:::gisimageryingov + Region: us-east-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + Tools & Applications: + - Title: ArcGIS Online Indiana Orthoimagery Viewer + URL: https://indianamap-inmap.hub.arcgis.com/datasets/61d4dc991c154af49ad7c1d675182a4f/explore + AuthorName: Indiana Geographic Information Office (IGIO) + - Title: IGIO Imagery Opendata S3 Browser + URL: https://gisimageryingov.s3.amazonaws.com/index.html + AuthorName: Indiana Geographic Information Office (IGIO) + + Publications: + - Title: "Recording of 2025 - 2028 Indiana Orthoimagery Program Presentation" + URL: https://imagery.gio.in.gov/pages/resources + AuthorName: Indiana Geographic Information Office (IGIO) + + diff --git a/datasets/intelinair_corn_kernel_counting.yaml b/datasets/intelinair_corn_kernel_counting.yaml index 697a72bec..320274cdb 100644 --- a/datasets/intelinair_corn_kernel_counting.yaml +++ b/datasets/intelinair_corn_kernel_counting.yaml @@ -4,6 +4,10 @@ Documentation: https://www.frontiersin.org/articles/10.3389/frobt.2021.627009/ab Contact: support@intelinair.com ManagedBy: Intelinair, Inc. 
UpdateFrequency: Periodically +Collabs: + ASDI: + Tags: + - agriculture Tags: - aws-pds - agriculture diff --git a/datasets/intelinair_longitudinal_nutrient_deficiency.yaml b/datasets/intelinair_longitudinal_nutrient_deficiency.yaml index 442a04c15..cde7b326a 100644 --- a/datasets/intelinair_longitudinal_nutrient_deficiency.yaml +++ b/datasets/intelinair_longitudinal_nutrient_deficiency.yaml @@ -4,6 +4,10 @@ Documentation: https://arxiv.org/abs/2012.09654 Contact: support@intelinair.com ManagedBy: Intelinair, Inc. UpdateFrequency: Periodically +Collabs: + ASDI: + Tags: + - agriculture Tags: - aws-pds - aerial imagery diff --git a/datasets/its-live-data.yaml b/datasets/its-live-data.yaml index 4a734198f..1e767143b 100644 --- a/datasets/its-live-data.yaml +++ b/datasets/its-live-data.yaml @@ -31,6 +31,10 @@ Contact: > ManagedBy: "[The Alaska Satellite Facility (ASF)](https://asf.alaska.edu/)" UpdateFrequency: Up to daily, as new satellite imagery is made available. +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - ice diff --git a/datasets/kyfromabove.yaml b/datasets/kyfromabove.yaml index 51644285a..4afb1545b 100644 --- a/datasets/kyfromabove.yaml +++ b/datasets/kyfromabove.yaml @@ -4,6 +4,10 @@ Documentation: https://github.com/awslabs/open-data-docs/tree/main/docs/kyfromab Contact: More information regarding the KyFromAbove program can be found at https://kyfromabove.ky.gov. If you have specific questions please contact - kyfromabove@ky.gov. ManagedBy: "[Kentucky Division of Geographic Information](https://kygeonet.ky.gov)" UpdateFrequency: KyFromAbove data is typically updated on an annual basis. Each year, a portion of the state is acquired with an overall update cycle of every three to four years. This update cadance is determined by both funding and the length of leaf-off conditions in a given year. This catalog currently includes imagery and LiDAR data from 2010 through 2024 for most products. 
+Collabs: + ASDI: + Tags: + - elevation Tags: - aws-pds - earth observation diff --git a/datasets/ladi.yaml b/datasets/ladi.yaml index 3f3492235..82f51684e 100644 --- a/datasets/ladi.yaml +++ b/datasets/ladi.yaml @@ -5,6 +5,10 @@ Contact: ladi-dataset-admin@mit.edu ManagedBy: "[MIT Lincoln Laboratory Humanitarian Assistance and Disaster Relief group](https://www.ll.mit.edu/r-d/biotechnology-and-human-systems/humanitarian-assistance-and-disaster-relief-systems)" UpdateFrequency: Periodically License: Creative Commons Attribution 4.0 International (CC BY 4.0) +Collabs: + ASDI: + Tags: + - disaster Tags: - aws-pds - aerial imagery diff --git a/datasets/mapping-africa.yaml b/datasets/mapping-africa.yaml index 0e0ff37cb..9dbfda5f5 100644 --- a/datasets/mapping-africa.yaml +++ b/datasets/mapping-africa.yaml @@ -9,6 +9,10 @@ Documentation: https://github.com/agroimpacts/mapping-africa Contact: mappingafrica@clarku.edu ManagedBy: "[The Agricultural Impacts Research Group](https://agroimpacts.info/)" UpdateFrequency: "New maps are added as developed" +Collabs: + ASDI: + Tags: + - agriculture Tags: - aws-pds - agriculture diff --git a/datasets/nifs-lhd.yaml b/datasets/nifs-lhd.yaml index 2168f4eed..6c1df1935 100644 --- a/datasets/nifs-lhd.yaml +++ b/datasets/nifs-lhd.yaml @@ -4,6 +4,10 @@ Documentation: https://www-lhd.nifs.ac.jp/pub/Repository_en.html Contact: For any questions regarding data delivery or any general questions regarding the LHD Experiment data repository, please send email to the Data Acquisition and Analysis group at Comp_DAE@nifs.ac.jp. ManagedBy: "[NIFS](https://www.nifs.ac.jp/)" UpdateFrequency: Archived data files are updated nightly when new or revised data are generated in LHD experiment. 
+Collabs: + ASDI: + Tags: + - energy Tags: - analytics - anomaly detection diff --git a/datasets/noaa-historicalcharts.yaml b/datasets/noaa-historicalcharts.yaml index 5a1109924..7d4cc7eb0 100644 --- a/datasets/noaa-historicalcharts.yaml +++ b/datasets/noaa-historicalcharts.yaml @@ -6,6 +6,10 @@ Contact: | For general questions or feedback about the data, please submit inquiries through the NOAA Office of Coast Survey (OCS) ASSIST Tool at https://www.nauticalcharts.noaa.gov/customer-service/assist/. ManagedBy: "[NOAA](http://www.noaa.gov/)" UpdateFrequency: Periodic manual updates when historic charts are added to the collection. +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - history diff --git a/datasets/noaa-nesdis-tcprimed-pds.yaml b/datasets/noaa-nesdis-tcprimed-pds.yaml index caf22df47..0fb9df2ac 100644 --- a/datasets/noaa-nesdis-tcprimed-pds.yaml +++ b/datasets/noaa-nesdis-tcprimed-pds.yaml @@ -4,6 +4,10 @@ Documentation: https://rammb-data.cira.colostate.edu/tcprimed/TCPRIMED_v01r01_do Contact: CIRA_tcprimed [at] colostate [dot] edu ManagedBy: "[CIRA](https://www.cira.colostate.edu/)" UpdateFrequency: Annually for the final version, several months after the conclusion of the Northern Hemisphere tropical cyclone season and daily for the preliminary version, several days after the dissipation of a tropical cyclone. +Collabs: + ASDI: + Tags: + - climate Tags: - atmosphere - aws-pds diff --git a/datasets/noaa-nws-wam-ipe.yaml b/datasets/noaa-nws-wam-ipe.yaml index 1f0215043..be2a3401f 100644 --- a/datasets/noaa-nws-wam-ipe.yaml +++ b/datasets/noaa-nws-wam-ipe.yaml @@ -23,6 +23,10 @@ Contact: |
We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NODD team by emailing nodd@noaa.gov ManagedBy: "[NOAA](http://www.noaa.gov/)" UpdateFrequency: The update frequencies of the WAM-IPE dataset range from 10 minutes to 6 hours depending on the CONOPS. +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - climate diff --git a/datasets/noaa-space-weather.yaml b/datasets/noaa-space-weather.yaml index c44ec2aca..ebdcf087c 100644 --- a/datasets/noaa-space-weather.yaml +++ b/datasets/noaa-space-weather.yaml @@ -1,31 +1,35 @@ -Name: NOAA Space Weather Forecast and Observation Data -Description: > - Space weather forecast and observation data is collected and disseminated by NOAA’s Space Weather Prediction Center (SWPC) in Boulder, CO. SWPC produces forecasts for multiple space weather phenomenon types and the resulting impacts to Earth and human activities. A variety of products are available that provide these forecast expectations, and their respective measurements, in formats that range from detailed technical forecast discussions to NOAA Scale values to simple bulletins that give information in laymen's terms. - Forecasting is the prediction of future events, based on analysis and modeling of the past and present conditions of the environment you are interested in. In Space Weather, persistence and recurrence of active regions on the sun over the 27-day solar rotational period play an important role in accurately forecasting the space environment. -Documentation: https://www.swpc.noaa.gov/products-and-data -Contact: | - For any questions regarding data delivery or any general questions regarding the NOAA Open Data Dissemination (NODD) Program, email the NODD Team at nodd@noaa.gov. -
We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NODD team by emailing nodd@noaa.gov -ManagedBy: "[NOAA](http://www.noaa.gov/)" -UpdateFrequency: The update frequencies of the space weather dataset range from one minute observations to daily and monthly updates of more slowly-varying indices -Tags: - - aws-pds - - climate - - meteorological - - solar - - weather -License: NOAA data disseminated through NODD are open to the public and can be used as desired. -
-
- NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. -Resources: - - Description: NOAA Space Weather Prediction Center Forecasts - ARN: arn:aws:s3:::noaa-swpc-pds - Region: us-east-1 - Type: S3 Bucket - Explore: - - '[Browse Bucket](https://noaa-swpc-pds.s3.amazonaws.com/index.html)' - - Description: New data notifications for NOAA Space Weather Prediction Center Forecasts, only Lambda and SQS protocols allowed - ARN: arn:aws:sns:us-east-1:123901341784:NewSWPCObject - Region: us-east-1 - Type: SNS Topic +Name: NOAA Space Weather Forecast and Observation Data +Description: > + Space weather forecast and observation data is collected and disseminated by NOAA’s Space Weather Prediction Center (SWPC) in Boulder, CO. SWPC produces forecasts for multiple space weather phenomenon types and the resulting impacts to Earth and human activities. A variety of products are available that provide these forecast expectations, and their respective measurements, in formats that range from detailed technical forecast discussions to NOAA Scale values to simple bulletins that give information in laymen's terms. + Forecasting is the prediction of future events, based on analysis and modeling of the past and present conditions of the environment you are interested in. In Space Weather, persistence and recurrence of active regions on the sun over the 27-day solar rotational period play an important role in accurately forecasting the space environment. 
+Documentation: https://www.swpc.noaa.gov/products-and-data +Contact: | + For any questions regarding data delivery or any general questions regarding the NOAA Open Data Dissemination (NODD) Program, email the NODD Team at nodd@noaa.gov. +
We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NODD team by emailing nodd@noaa.gov +ManagedBy: "[NOAA](http://www.noaa.gov/)" +UpdateFrequency: The update frequencies of the space weather dataset range from one minute observations to daily and monthly updates of more slowly-varying indices +Collabs: + ASDI: + Tags: + - climate +Tags: + - aws-pds + - climate + - meteorological + - solar + - weather +License: NOAA data disseminated through NODD are open to the public and can be used as desired. +
+
+ NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. +Resources: + - Description: NOAA Space Weather Prediction Center Forecasts + ARN: arn:aws:s3:::noaa-swpc-pds + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://noaa-swpc-pds.s3.amazonaws.com/index.html)' + - Description: New data notifications for NOAA Space Weather Prediction Center Forecasts, only Lambda and SQS protocols allowed + ARN: arn:aws:sns:us-east-1:123901341784:NewSWPCObject + Region: us-east-1 + Type: SNS Topic diff --git a/datasets/nyc-tlc-trip-records-pds.yaml b/datasets/nyc-tlc-trip-records-pds.yaml index 3cd725b3d..7de425541 100644 --- a/datasets/nyc-tlc-trip-records-pds.yaml +++ b/datasets/nyc-tlc-trip-records-pds.yaml @@ -4,6 +4,10 @@ Documentation: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page Contact: research@tlc.nyc.gov ManagedBy: City of New York Taxi and Limousine Commission UpdateFrequency: As soon as new data is available to be shared publicly. +Collabs: + ASDI: + Tags: + - infrastructure Tags: - aws-pds - cities diff --git a/datasets/nz-elevation.yaml b/datasets/nz-elevation.yaml index aa7af0cf4..556b2abfe 100644 --- a/datasets/nz-elevation.yaml +++ b/datasets/nz-elevation.yaml @@ -8,6 +8,10 @@ Contact: elevation@linz.govt.nz ManagedBy: "[Toitū Te Whenua Land Information New Zealand](https://www.linz.govt.nz)" UpdateFrequency: New elevation data will regularly be added, as part of being published to the LINZ Data Service and LINZ Basemaps. 
+Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - elevation diff --git a/datasets/nz-imagery.yaml b/datasets/nz-imagery.yaml index d7d69625e..c844fdad2 100644 --- a/datasets/nz-imagery.yaml +++ b/datasets/nz-imagery.yaml @@ -7,6 +7,10 @@ Contact: imagery@linz.govt.nz ManagedBy: "[Toitū Te Whenua Land Information New Zealand](https://www.linz.govt.nz)" UpdateFrequency: New imagery will regularly be added, as part of being published to the LINZ Data Service and LINZ Basemaps. +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - aerial imagery diff --git a/datasets/obis.yaml b/datasets/obis.yaml index beeda716d..1e09f3d2c 100644 --- a/datasets/obis.yaml +++ b/datasets/obis.yaml @@ -4,6 +4,10 @@ Documentation: Documentation for this dataset is available at https://github.com Contact: helpdesk@obis.org ManagedBy: The Ocean Biodiversity Information System (OBIS) UpdateFrequency: Weekly +Collabs: + ASDI: + Tags: + - biodiversity Tags: - biodiversity - coastal diff --git a/datasets/oceanomics.yaml b/datasets/oceanomics.yaml index 6bf315676..5f669ae44 100644 --- a/datasets/oceanomics.yaml +++ b/datasets/oceanomics.yaml @@ -4,6 +4,10 @@ Documentation: https://edna.minderoo.org Contact: oceanomics@minderoo.org ManagedBy: Minderoo Foundation OceanOmics (Dr Shannon Corrigan, Dr Philipp Bayer) UpdateFrequency: Data will be continually updated as it is generated. 
+Collabs: + ASDI: + Tags: + - oceans Tags: - biodiversity - bioinformatics diff --git a/datasets/open-meteo.yaml b/datasets/open-meteo.yaml index aba932244..ec73a37fd 100644 --- a/datasets/open-meteo.yaml +++ b/datasets/open-meteo.yaml @@ -13,6 +13,10 @@ Documentation: https://github.com/open-meteo/open-data Contact: info@open-meteo.com ManagedBy: "[Open-Meteo](https://www.open-meteo.com/)" UpdateFrequency: Hourly +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - agriculture diff --git a/datasets/openaerialmap.yaml b/datasets/openaerialmap.yaml index cdbc871cc..0f1be46ac 100644 --- a/datasets/openaerialmap.yaml +++ b/datasets/openaerialmap.yaml @@ -4,6 +4,10 @@ Documentation: https://docs.openaerialmap.org/ Contact: info@openaerialmap.org ManagedBy: "[Humanitarian OpenStreetMap Team](https://www.hotosm.org/)" UpdateFrequency: New imagery is added as soon as it is uploaded by community contributors. +Collabs: + ASDI: + Tags: + - disaster Tags: - satellite imagery - aerial imagery diff --git a/datasets/openfoodfacts-images.yaml b/datasets/openfoodfacts-images.yaml index 449c46280..fec58c6a9 100644 --- a/datasets/openfoodfacts-images.yaml +++ b/datasets/openfoodfacts-images.yaml @@ -6,6 +6,10 @@ Contact: contact@openfoodfacts.org ManagedBy: "[Open Food Facts](https://world.openfoodfacts.org)" UpdateFrequency: Monthly License: All data contained in this dataset is licenced under the [Creative Commons Attribution ShareAlike licence](https://creativecommons.org/licenses/by-sa/3.0/deed.en) +Collabs: + ASDI: + Tags: + - agriculture Tags: - aws-pds - machine learning diff --git a/datasets/os-climate-physrisk.yaml b/datasets/os-climate-physrisk.yaml index 4129135db..a4fd74e49 100644 --- a/datasets/os-climate-physrisk.yaml +++ b/datasets/os-climate-physrisk.yaml @@ -4,6 +4,10 @@ Documentation: https://physrisk.readthedocs.io/en/latest/ Contact: https://os-climate.org/contact-us/ ManagedBy: "[OS-Climate](https://os-climate.org/)" UpdateFrequency: Data is updated as new 
important public domain datasets become available or if corrections are published. +Collabs: + ASDI: + Tags: + - climate Tags: - climate risk - physical diff --git a/datasets/palsar-2-scansar-flooding-in-rwanda.yaml b/datasets/palsar-2-scansar-flooding-in-rwanda.yaml index 690faaa11..18faf5bb1 100644 --- a/datasets/palsar-2-scansar-flooding-in-rwanda.yaml +++ b/datasets/palsar-2-scansar-flooding-in-rwanda.yaml @@ -5,6 +5,10 @@ License: Data is available for free under the terms of use. Documentation: https://www.eorc.jaxa.jp/ALOS/en/dataset/alos_open_and_free_e.htm, https://www.eorc.jaxa.jp/ALOS/en/dataset/palsar2_l22_e.htm ManagedBy: "[JAXA](https://www.jaxa.jp/)" Contact: aproject@jaxa.jp +Collabs: + ASDI: + Tags: + - disaster Tags: - aws-pds - agriculture diff --git a/datasets/proj-datum-grids.yaml b/datasets/proj-datum-grids.yaml index fdc343ed9..672f09a64 100644 --- a/datasets/proj-datum-grids.yaml +++ b/datasets/proj-datum-grids.yaml @@ -4,6 +4,10 @@ Documentation: https://github.com/OSGeo/proj-datumgrid-geotiff Contact: proj@lists.osgeo.org ManagedBy: "[PROJ](https://proj.org)" UpdateFrequency: New grids are added when made available +Collabs: + ASDI: + Tags: + - infrastructure Tags: - aws-pds - geospatial diff --git a/datasets/racecar-dataset.yaml b/datasets/racecar-dataset.yaml index 92a067340..5d66f32d4 100644 --- a/datasets/racecar-dataset.yaml +++ b/datasets/racecar-dataset.yaml @@ -4,6 +4,10 @@ Documentation: https://github.com/linklab-uva/RACECAR_DATA Contact: Prof. Madhur Behl (madhur.behl@viginia.edu) ManagedBy: Amar Kulkarni (ark8su@virginia.edu) UpdateFrequency: This dataset was constructed during a single racing season (2021-22). Future seasons may potentially be added. 
+Collabs: + ASDI: + Tags: + - infrastructure Tags: - aws-pds - autonomous vehicles diff --git a/datasets/real-changesets.yaml b/datasets/real-changesets.yaml index f95ea51e2..832e32d6f 100644 --- a/datasets/real-changesets.yaml +++ b/datasets/real-changesets.yaml @@ -7,6 +7,10 @@ Documentation: https://github.com/osmus/osmcha-charter-project/blob/main/real-ch Contact: team@openstreetmap.us ManagedBy: OpenStreetMap US UpdateFrequency: Minutely +Collabs: + ASDI: + Tags: + - disaster Tags: - geospatial - osm diff --git a/datasets/satellogic-earthview.yaml b/datasets/satellogic-earthview.yaml index 8ea995f6a..253901501 100644 --- a/datasets/satellogic-earthview.yaml +++ b/datasets/satellogic-earthview.yaml @@ -4,6 +4,10 @@ Documentation: https://satellogic-earthview.s3.us-west-2.amazonaws.com/index.htm Contact: https://www.satellogic.com/ ManagedBy: "[Satellogic](https://www.satellogic.com)" UpdateFrequency: New data will be made available periodically, with annual updates expected in the future covering the same or other new regions. +Collabs: + ASDI: + Tags: + - satellite imagery Tags: - aws-pds - satellite imagery diff --git a/datasets/seefar.yaml b/datasets/seefar.yaml index 5407ba92a..626e9880d 100644 --- a/datasets/seefar.yaml +++ b/datasets/seefar.yaml @@ -4,6 +4,10 @@ Documentation: https://coastalcarbon.ai/seefar Contact: James Lowman ManagedBy: Coastal Carbon UpdateFrequency: Yearly +Collabs: + ASDI: + Tags: + - climate Tags: - geospatial - earth observation diff --git a/datasets/sofar-spotter-archive.yaml b/datasets/sofar-spotter-archive.yaml index a904b1757..2102c5622 100644 --- a/datasets/sofar-spotter-archive.yaml +++ b/datasets/sofar-spotter-archive.yaml @@ -5,6 +5,10 @@ Documentation: "[Spotter Technical Reference Manual](https://content.sofarocean. 
Contact: opendata@sofarocean.com ManagedBy: "[Sofar Ocean](https://www.sofarocean.com/company/contact-us)" UpdateFrequency: As available +Collabs: + ASDI: + Tags: + - oceans Tags: - aws-pds - climate diff --git a/datasets/speedtest-global-performance.yaml b/datasets/speedtest-global-performance.yaml index 68dab2050..ead22a350 100644 --- a/datasets/speedtest-global-performance.yaml +++ b/datasets/speedtest-global-performance.yaml @@ -5,6 +5,10 @@ Documentation: "[Performance Maps Overview](https://github.com/teamookla/ookla-o Contact: opendata@ookla.com ManagedBy: "[Ookla](https://www.ookla.com/ookla-for-good)" UpdateFrequency: Quarterly +Collabs: + ASDI: + Tags: + - infrastructure Tags: - analytics - aws-pds diff --git a/datasets/ssl4eo-multi-product-data.yaml b/datasets/ssl4eo-multi-product-data.yaml index 66218702f..4e9325cb0 100644 --- a/datasets/ssl4eo-multi-product-data.yaml +++ b/datasets/ssl4eo-multi-product-data.yaml @@ -4,6 +4,10 @@ Documentation: https://github.com/sunny1401/ssl4eo_multi_satellite_products Contact: https://github.com/sunny1401/ssl4eo_multi_satellite_products ManagedBy: Sankranti Joshi UpdateFrequency: Not updated +Collabs: + ASDI: + Tags: + - satellite imagery Tags: - satellite imagery License: https://creativecommons.org/licenses/by-nc-sa/4.0/ diff --git a/datasets/stdpopsim_kern.yaml b/datasets/stdpopsim_kern.yaml index 3edfbc3a9..4fe4f841a 100644 --- a/datasets/stdpopsim_kern.yaml +++ b/datasets/stdpopsim_kern.yaml @@ -1,27 +1,31 @@ -Name: stdpopsim species resources -Description: Contains all resources (genome specifications, recombination maps, etc.) required for - species specific simulation with the stdpopsim package. These resources are originally from a - variety of other consortium and published work but are consolidated here for ease of access and - use. If you are interested in adding a new species to the stdpopsim resource please raise an - issue on the stdpopsim GitHub page to have the necessary files added here. 
-Documentation: https://stdpopsim.readthedocs.io/en/latest/catalog.html -Contact: https://github.com/popsim-consortium/stdpopsim/issues -ManagedBy: Andrew Kern & Jerome Kelleher -UpdateFrequency: Data will be added as new species, genome assemblies, and genetic map data for already included species become available. -Tags: - - aws-pds - - genetic maps - - life sciences - - population genetics - - recombination maps - - simulations -License: Please see the individual datasets compiled here for licensing details and make sure to cite the original sources of any elements of this data that you use. -Resources: - - Description: https://stdpopsim.readthedocs.io/en/latest/ - ARN: arn:aws:s3:::stdpopsim - Region: us-west-2 - Type: S3 Bucket -DataAtWork: - Tutorials: - Tools & Applications: - Publications: +Name: stdpopsim species resources +Description: Contains all resources (genome specifications, recombination maps, etc.) required for + species specific simulation with the stdpopsim package. These resources are originally from a + variety of other consortium and published work but are consolidated here for ease of access and + use. If you are interested in adding a new species to the stdpopsim resource please raise an + issue on the stdpopsim GitHub page to have the necessary files added here. +Documentation: https://stdpopsim.readthedocs.io/en/latest/catalog.html +Contact: https://github.com/popsim-consortium/stdpopsim/issues +ManagedBy: Andrew Kern & Jerome Kelleher +UpdateFrequency: Data will be added as new species, genome assemblies, and genetic map data for already included species become available. +Collabs: + ASDI: + Tags: + - biodiversity +Tags: + - aws-pds + - genetic maps + - life sciences + - population genetics + - recombination maps + - simulations +License: Please see the individual datasets compiled here for licensing details and make sure to cite the original sources of any elements of this data that you use. 
+Resources: + - Description: https://stdpopsim.readthedocs.io/en/latest/ + ARN: arn:aws:s3:::stdpopsim + Region: us-west-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + Tools & Applications: + Publications: diff --git a/datasets/surface-pm2-5-v6gl02.yaml b/datasets/surface-pm2-5-v6gl02.yaml index c74ab92e8..5374b57da 100644 --- a/datasets/surface-pm2-5-v6gl02.yaml +++ b/datasets/surface-pm2-5-v6gl02.yaml @@ -4,6 +4,10 @@ Documentation: https://sites.wustl.edu/acag/datasets/surface-pm2-5/#V6.GL.02.03 Contact: randall.martin@wustl.edu ManagedBy: "https://sites.wustl.edu/acag/" UpdateFrequency: Yearly +Collabs: + ASDI: + Tags: + - climate Tags: - atmosphere - netcdf diff --git a/datasets/targetepigenomics.yaml b/datasets/targetepigenomics.yaml index d34f37d6d..e08f8f81b 100644 --- a/datasets/targetepigenomics.yaml +++ b/datasets/targetepigenomics.yaml @@ -5,6 +5,10 @@ Contact: targetdcc16@gmail.com ManagedBy: TaRGET II Data Coordination Center (TaRGET-DCC) Documentation: https://data.targetepigenomics.org/ UpdateFrequency: "TaRGET-DCC offers monthly data releases, although this dataset may not be updated at every release." +Collabs: + ASDI: + Tags: + - climate Tags: - biology - bioinformatics diff --git a/datasets/usgs_aqr.yaml b/datasets/usgs_aqr.yaml index 39b7a75e1..74182e289 100644 --- a/datasets/usgs_aqr.yaml +++ b/datasets/usgs_aqr.yaml @@ -1,38 +1,42 @@ -Name: Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States -Description: "Aquatic reflectance produced with the dark spectrum fitting (DSF) algorithm as implemented in the Atmospheric Correction for OLI “lite” (ACOLITE) software (version 20221114.0). Aquatic reflectance is defined here as unitless water-leaving radiance reflectance and represents the ratio of water-leaving radiance (units of watts per square meter per steradian per nanometer) to downwelling irradiance (units of watts per square meter per nanometer) multiplied by pi." 
-Documentation: https://www.sciencebase.gov/catalog/item/640f612dd34e254fd352e1ed -Contact: tvking@usgs.gov -ManagedBy: "[United States Geological Survey](https://www.usgs.gov)" -UpdateFrequency: New scenes are added daily. -Tags: - - aws-pds - - earth observation - - satellite imagery - - geospatial - - natural resource - - cog - - water -License: "Contains modified Copernicus Sentinel data, which is available under the Creative Commons CC BY-SA 3.0 IGO license. Please reference King et al., 2024 (doi 10.5066/P904243C) when referring to the aquatic reflectance, and include the statement 'Contains modified Copernicus Sentinel data [Year]' to acknowledge the data originator." -Citation: "King, T.V., Meyer, M.F., Hundt, S.A., Ball, G.P., Hafen, K.C., Avouris, D.M., Ducar, S.D., Wakefield, B.F., Stengel, V.S., and Vanhellemont, Q., 2024, Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States; U.S. Geological Survey Data Release, doi 10.5066/P904243C." -Resources: - - Description: Scenes and metadata - ARN: arn:aws:s3:::usgs-wma-sentinel-2-aqr-acolite-dsf/version_01 - Region: us-west-2 - Type: S3 Bucket - - Description: New scene notification - ARN: arn:aws:sns:us-west-2:242201296900:usgs-wma-sentinel-2-aqr-acolite-dsf-object_created - Region: us-west-2 - Type: SNS Topic -DataAtWork: - Tutorials: - - Title: "tutorial.zip" - URL: https://www.sciencebase.gov/catalog/item/640f612dd34e254fd352e1ed - AuthorName: S.D. Ducar - Tools & Applications: - - Title: GLOBUS Access Point - URL: https://app.globus.org/file-manager?origin_id=8fd8727f-c464-4e86-a5ed-c6db72848c02&origin_path=%2F - AuthorName: T.V. King, et al. 
- Publications: - - Title: Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States - URL: https://www.sciencebase.gov/catalog/item/640f612dd34e254fd352e1ed +Name: Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States +Description: "Aquatic reflectance produced with the dark spectrum fitting (DSF) algorithm as implemented in the Atmospheric Correction for OLI “lite” (ACOLITE) software (version 20221114.0). Aquatic reflectance is defined here as unitless water-leaving radiance reflectance and represents the ratio of water-leaving radiance (units of watts per square meter per steradian per nanometer) to downwelling irradiance (units of watts per square meter per nanometer) multiplied by pi." +Documentation: https://www.sciencebase.gov/catalog/item/640f612dd34e254fd352e1ed +Contact: tvking@usgs.gov +ManagedBy: "[United States Geological Survey](https://www.usgs.gov)" +UpdateFrequency: New scenes are added daily. +Collabs: + ASDI: + Tags: + - oceans +Tags: + - aws-pds + - earth observation + - satellite imagery + - geospatial + - natural resource + - cog + - water +License: "Contains modified Copernicus Sentinel data, which is available under the Creative Commons CC BY-SA 3.0 IGO license. Please reference King et al., 2024 (doi 10.5066/P904243C) when referring to the aquatic reflectance, and include the statement 'Contains modified Copernicus Sentinel data [Year]' to acknowledge the data originator." +Citation: "King, T.V., Meyer, M.F., Hundt, S.A., Ball, G.P., Hafen, K.C., Avouris, D.M., Ducar, S.D., Wakefield, B.F., Stengel, V.S., and Vanhellemont, Q., 2024, Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States; U.S. Geological Survey Data Release, doi 10.5066/P904243C." 
+Resources: + - Description: Scenes and metadata + ARN: arn:aws:s3:::usgs-wma-sentinel-2-aqr-acolite-dsf/version_01 + Region: us-west-2 + Type: S3 Bucket + - Description: New scene notification + ARN: arn:aws:sns:us-west-2:242201296900:usgs-wma-sentinel-2-aqr-acolite-dsf-object_created + Region: us-west-2 + Type: SNS Topic +DataAtWork: + Tutorials: + - Title: "tutorial.zip" + URL: https://www.sciencebase.gov/catalog/item/640f612dd34e254fd352e1ed + AuthorName: S.D. Ducar + Tools & Applications: + - Title: GLOBUS Access Point + URL: https://app.globus.org/file-manager?origin_id=8fd8727f-c464-4e86-a5ed-c6db72848c02&origin_path=%2F + AuthorName: T.V. King, et al. + Publications: + - Title: Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States + URL: https://www.sciencebase.gov/catalog/item/640f612dd34e254fd352e1ed AuthorName: T.V. King, et al. \ No newline at end of file diff --git a/datasets/venus-l2a-cogs.yaml b/datasets/venus-l2a-cogs.yaml index ab3a4736d..2a00c448f 100644 --- a/datasets/venus-l2a-cogs.yaml +++ b/datasets/venus-l2a-cogs.yaml @@ -13,6 +13,10 @@ Documentation: https://github.com/earthdaily/venus-on-aws/ Contact: Klaus Bachhuber - klaus.bachhuber@earthdaily.com ManagedBy: "[EarthDaily Analytics](https://earthdaily.com/)" UpdateFrequency: New Venus data are added regularly +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - agriculture diff --git a/datasets/wbg-cckp.yaml b/datasets/wbg-cckp.yaml index 69deae2ea..78698a791 100644 --- a/datasets/wbg-cckp.yaml +++ b/datasets/wbg-cckp.yaml @@ -4,6 +4,10 @@ Documentation: https://worldbank.github.io/climateknowledgeportal Contact: C. 
MacKenzie Dove cdove@worldbank.org; askclimate@worldbank.org ManagedBy: "[World Bank Group](https://www.worldbank.org/en/home)" UpdateFrequency: Semi-annually +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - climate diff --git a/datasets/whiffle-wins50.yaml b/datasets/whiffle-wins50.yaml index b0ef3611f..82590b172 100644 --- a/datasets/whiffle-wins50.yaml +++ b/datasets/whiffle-wins50.yaml @@ -4,6 +4,10 @@ Documentation: https://gitlab.com/whiffle-public/whiffle-open-data Contact: support@whiffle.nl ManagedBy: "[Whiffle](http://www.whiffle.nl/)" UpdateFrequency: No updates planned. +Collabs: + ASDI: + Tags: + - energy Tags: - aws-pds - weather diff --git a/datasets/wis2-global-cache.yaml b/datasets/wis2-global-cache.yaml index 481518c19..0b07181e7 100644 --- a/datasets/wis2-global-cache.yaml +++ b/datasets/wis2-global-cache.yaml @@ -4,6 +4,10 @@ Documentation: https://www.metoffice.gov.uk/services/data/external-data-channels Contact: gisc-exeter@metoffice.gov.uk ManagedBy: "[Met Office](https://www.metoffice.gov.uk/)" UpdateFrequency: New data added as soon as available from origin WIS2 Nodes. +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - atmosphere diff --git a/datasets/wise-allsky.yaml b/datasets/wise-allsky.yaml index a4d34a732..4daf1f965 100644 --- a/datasets/wise-allsky.yaml +++ b/datasets/wise-allsky.yaml @@ -4,6 +4,10 @@ Documentation: https://irsa.ipac.caltech.edu/Missions/wise.html Contact: https://irsa.ipac.caltech.edu/docs/help_desk.html ManagedBy: "NASA/IPAC Infrared Science Archive ([IRSA](https://irsa.ipac.caltech.edu)) at Caltech" UpdateFrequency: The All-Sky Data Release has been finalized and will not be updated. 
+Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - astronomy From e5b3e57a85d858160af1e3d729b6c218337ae539 Mon Sep 17 00:00:00 2001 From: "Shruti [C] Bhanderi" Date: Wed, 9 Jul 2025 13:58:56 +0000 Subject: [PATCH 094/751] bulk tagging ASDI - 25 files --- bulk_update_test.ipynb | 433 +++++++++++++++++- ...p_velocity_hourly_averaged_delayed_qc.yaml | 4 + ...y_daynighttime_multi_sensor_australia.yaml | 4 + ...nighttime_single_sensor_southernocean.yaml | 4 + ...y_daynighttime_multi_sensor_australia.yaml | 4 + ..._daynighttime_single_sensor_australia.yaml | 4 + ...sst_l4_gamssa_1day_multi_sensor_world.yaml | 4 + ...l4_ramssa_1day_multi_sensor_australia.yaml | 4 + ...n_vessel_air_sea_flux_product_delayed.yaml | 4 + datasets/aodn_vessel_sst_delayed_qc.yaml | 4 + datasets/aodn_vessel_trv_realtime_qc.yaml | 4 + .../cmip6-era5-hybrid-southeast-asia.yaml | 58 +-- datasets/coawst.yaml | 4 + .../ctrees-california-vhr-tree-height.yaml | 4 + datasets/dep-wofs.yaml | 4 + datasets/emearth.yaml | 4 + datasets/hycom-global-drifters.yaml | 4 + datasets/nsf-ncar-era5.yaml | 4 + ...lsar-2-scansar-flooding-in-bangladesh.yaml | 4 + datasets/pohang-canal-dataset.yaml | 4 + datasets/s1-orbits.yaml | 4 + datasets/sentinel-products-ca-mirror.yaml | 4 + 22 files changed, 520 insertions(+), 51 deletions(-) diff --git a/bulk_update_test.ipynb b/bulk_update_test.ipynb index 2e0b5f09e..75d177f72 100644 --- a/bulk_update_test.ipynb +++ b/bulk_update_test.ipynb @@ -1320,30 +1320,32 @@ { "metadata": { "ExecuteTime": { - "end_time": "2025-07-09T12:42:55.269730Z", - "start_time": "2025-07-09T12:42:55.248312Z" + "end_time": "2025-07-09T13:27:29.807756Z", + "start_time": "2025-07-09T13:27:29.805506Z" } }, "cell_type": "code", "source": [ - "\n", - "\n" + "#######\n", + "###Final Processing Summary ===\n", + "# Total YAML files: 748\n", + "# Total matches found: 126\n", + "# Failed files: 0\n", + "# Final unmatched datasets: 33\n", + "# Deprecated matches (not updated): 1\n", + 
"#\n", + "# Total Git files to be updated: 124\n", + "#\n", + "# Deprecated matched datasets:\n", + "# Dataset: Earth Observation Data Cubes for Brazil - File: brazil-data-cubes.yaml\n", + "#\n", + "# Processing completed successfully!\n", + "# Total matches across both passes: 126\n", + "# Total Git files to update: 124\n" ], "id": "33969aa18854426", - "outputs": [ - { - "ename": "NameError", - "evalue": "name 'matched_dataset_names' is not defined", - "output_type": "error", - "traceback": [ - "\u001B[0;31m---------------------------------------------------------------------------\u001B[0m", - "\u001B[0;31mNameError\u001B[0m Traceback (most recent call last)", - "Cell \u001B[0;32mIn[80], line 5\u001B[0m\n\u001B[1;32m 2\u001B[0m excel_file \u001B[38;5;241m=\u001B[39m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mASDI_adds.xlsx\u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 3\u001B[0m df \u001B[38;5;241m=\u001B[39m pd\u001B[38;5;241m.\u001B[39mread_excel(excel_file)\n\u001B[0;32m----> 5\u001B[0m final_unmatched \u001B[38;5;241m=\u001B[39m df[\u001B[38;5;241m~\u001B[39mdf[\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mDataset\u001B[39m\u001B[38;5;124m'\u001B[39m]\u001B[38;5;241m.\u001B[39misin(\u001B[43mmatched_dataset_names\u001B[49m)]\n", - "\u001B[0;31mNameError\u001B[0m: name 'matched_dataset_names' is not defined" - ] - } - ], - "execution_count": 80 + "outputs": [], + "execution_count": 91 }, { "metadata": {}, @@ -1351,27 +1353,410 @@ "outputs": [], "execution_count": null, "source": "\n", - "id": "32c9fe7ccfeaa4d4" + "id": "e569cc3d00979bfc" + }, + { + "metadata": { + "ExecuteTime": { + "end_time": "2025-07-09T13:56:04.589833Z", + "start_time": "2025-07-09T13:53:32.890359Z" + } + }, + "cell_type": "code", + "source": [ + " # dataset_folder = \"open-data-registry/datasets/\" # Change this to your YAML files folder path\n", + " # /local/home/bshrutiw/open-data-registry-fork\n", + "dataset_folder= \"/local/home/bshrutiw/open-data-registry-fork/datasets\"\n", + "# 
folder_path = \"open-data-registry/datasets/\"\n", + "excel_file = \"ASDI_cleaned_unmatched.xlsx\" # Change this to your Excel file path\n", + "\n", + "results=process_yaml_files(dataset_folder, excel_file)\n", + "if results:\n", + " print(\"\\nProcessing completed successfully!\")\n", + " print(f\"Total matches across both passes: {results['total_matches']}\")\n", + " print(f\"Total Git files to update: {len(results['git_files'])}\")\n", + "\n" + ], + "id": "32c9fe7ccfeaa4d4", + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "=== Matching starts ===\n", + "\n", + "Match found!\n", + "Excel name: Digital Earth Pacific Water Observatins from Space (WOfS)\n", + "YAML name: Digital Earth Pacific Water Observatins from Space (WOfS)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/dep-wofs.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OC3\n", + " model)\n", + "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (GSM model)\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml\n", + "\n", + "Match found!\n", + "Excel name: Animal Tracking - Acoustic Telemetry - Quality controlled detections\n", + "YAML name: Animal Tracking - Acoustic Telemetry - Quality controlled detections\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Sea surface temperature - Level 3 - Single sensor - 6 day - Day\n", + " and night time\n", + "YAML name: Satellite - Sea surface temperature - Level 3 - Single sensor - 6 day - Day and night time\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml\n", + "\n", 
+ "Match found!\n", + "Excel name: Satellite - Sea surface temperature - Level 4 - Multi sensor - Global Australian\n", + "YAML name: Satellite - Sea surface temperature - Level 4 - Multi sensor - Global Australian\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml\n", + "\n", + "Match found!\n", + "Excel name: Pohang Canal Dataset: A Multimodal Maritime Dataset for Autonomous Navigation in Restricted Waters\n", + "YAML name: Pohang Canal Dataset: A Multimodal Maritime Dataset for Autonomous Navigation in Restricted Waters\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/pohang-canal-dataset.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Sea surface temperature - Level 3 - Multi sensor - 1 day - Day and night time\n", + "YAML name: Satellite - Sea surface temperature - Level 3 - Multi sensor - 1 day - Day and night time\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml\n", + "\n", + "Match found!\n", + "Excel name: Sentinel Near Real-time Canada Mirror | Miroir Sentinel temps quasi réel du Canada\n", + "YAML name: Sentinel Near Real-time Canada Mirror | Miroir Sentinel temps quasi réel du Canada\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/sentinel-products-ca-mirror.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ensemble Meteorological Dataset for Planet Earth, EM-Earth\n", + "YAML name: Ensemble Meteorological Dataset for Planet Earth, EM-Earth\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/emearth.yaml\n", + "\n", + "Match found!\n", + "Excel name: Hybrid statistical-dynamic downscaling based on multi-model ensembles in Southeast Asia\n", + "YAML name: Hybrid statistical-dynamic downscaling based on multi-model ensembles in Southeast Asia\n", + 
"Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/cmip6-era5-hybrid-southeast-asia.yaml\n", + "\n", + "Match found!\n", + "Excel name: PALSAR-2 ScanSAR Tropical Cycolne Mocha (L2.1)\n", + "YAML name: PALSAR-2 ScanSAR Tropical Cycolne Mocha (L2.1)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/palsar-2-scansar-flooding-in-bangladesh.yaml\n", + "\n", + "Match found!\n", + "Excel name: Sub-Meter Canopy Tree Height of California in 2020 by CTrees.org\n", + "YAML name: Sub-Meter Canopy Tree Height of California in 2020 by CTrees.org\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ctrees-california-vhr-tree-height.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (GSM model\n", + " and Eppley-VGPM algorithm)\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (OC3 model and Eppley-VGPM algorithm)\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ships of Opportunity - Tropical research vessels - Real time\n", + "YAML name: Ships of Opportunity - Tropical research vessels - Real time\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_trv_realtime_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Sea surface temperature - Level 3 - Single sensor - 1 day - Day and night time - Southern Ocean\n", + "YAML name: Satellite - Sea surface temperature - Level 3 - Single sensor - 1 day - Day and night time - Southern Ocean\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.yaml\n", + "\n", + "Match found!\n", + "Excel name: USGS COAWST (Coupled Ocean Atmosphere Wave and Sediment Transport) 
Forecast Model Archive, US East and Gulf Coasts\n", + "YAML name: USGS COAWST (Coupled Ocean Atmosphere Wave and Sediment Transport) Forecast Model Archive, US East and Gulf Coasts\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/coawst.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ships of Opportunity - Air-sea fluxes - Meteorological and flux - Delayed mode\n", + "YAML name: Ships of Opportunity - Air-sea fluxes - Meteorological and flux - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_air_sea_flux_product_delayed.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (GSM model\n", + " and Eppley-VGPM algorithm)\n", + "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (GSM model and Eppley-VGPM algorithm)\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml\n", + "\n", + "Match found!\n", + "Excel name: HYCOM-OceanTrack Integrated HYCOM Eulerian Fields and Lagrangian Trajectories Dataset\n", + "YAML name: HYCOM-OceanTrack Integrated HYCOM Eulerian Fields and Lagrangian Trajectories Dataset\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/hycom-global-drifters.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ships of Opportunity - Sea surface temperature - 1-minute average data products\n", + "YAML name: Ships of Opportunity - Sea surface temperature - 1-minute average data products\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_sst_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Sentinel-1 Precise Orbit Determination (POD) Products\n", + "YAML name: Sentinel-1 Precise Orbit Determination (POD) Products\n", + "Successfully updated 
/local/home/bshrutiw/open-data-registry-fork/datasets/s1-orbits.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OC3\n", + " model)\n", + "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OCI model)\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Sea surface temperature - Level 4 - Multi sensor - Regional Australian\n", + "YAML name: Satellite - Sea surface temperature - Level 4 - Multi sensor - Regional Australian\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml\n", + "\n", + "Match found!\n", + "Excel name: National Mooring Network - CTD profiles\n", + "YAML name: National Mooring Network - CTD profiles\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_mooring_ctd_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OC3\n", + " model)\n", + "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OC3 model)\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml\n", + "\n", + "Match found!\n", + "Excel name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", + "YAML name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/epa-edde-v2.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - South Australian gulfs site - Sea water velocity - Delayed mode\n", + "YAML name: Ocean Radar - South Australian gulfs site - Sea water velocity - Delayed mode\n", + "No changes needed for 
/local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: Ocean Radar - Capricorn bunker group site - Sea water velocity - Delayed mode\n", + "YAML name: Ocean Radar - Capricorn bunker group site - Sea water velocity - Delayed mode\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_capricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml\n", + "\n", + "Match found!\n", + "Excel name: NSF NCAR Curated ECMWF Reanalysis 5 (ERA5)\n", + "YAML name: NSF NCAR Curated ECMWF Reanalysis 5 (ERA5)\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/nsf-ncar-era5.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Sea surface temperature - Level 3 - Multi sensor - 3 day - Day and night time\n", + "YAML name: Satellite - Sea surface temperature - Level 3 - Multi sensor - 3 day - Day and night time\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml\n", + "\n", + "Match found!\n", + "Excel name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", + "YAML name: EPA Dynamically Downscaled Ensemble (EDDE) Version 1\n", + "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/epa-edde-v1.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Sea surface temperature - Level 3 - Single sensor - Himawari-8 - 1 day - Night time\n", + "YAML name: Satellite - Sea surface temperature - Level 3 - Single sensor - Himawari-8 - 1 day - Night time\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml\n", + "\n", + "Match found!\n", + "Excel name: A region-wide, multi-year set of crop field boundary labels for Africa\n", + "YAML name: A region-wide, multi-year set of crop field 
boundary labels for Africa\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/africa-field-boundary-labels.yaml\n", + "\n", + "Match found!\n", + "Excel name: RCM CEOS Analysis Ready Data | Données prêtes à l'analyse du CEOS pour le MCR\n", + "YAML name: RCM CEOS Analysis Ready Data | Données prêtes à l'analyse du CEOS pour le MCR\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/rcm-ceos-ard.yaml\n", + "\n", + "Match found!\n", + "Excel name: Satellite - Sea surface temperature - Level 3 - Single sensor - 1 month - Day time\n", + "YAML name: Satellite - Sea surface temperature - Level 3 - Single sensor - 1 month - Day time\n", + "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml\n", + "\n", + "=== Final Processing Summary ===\n", + "Total YAML files: 748\n", + "Total matches found: 35\n", + "Failed files: 0\n", + "Final unmatched datasets: 2\n", + "Deprecated matches (not updated): 0\n", + "\n", + "Total Git files to be updated: 25\n", + "\n", + "Deprecated matched datasets:\n", + "\n", + "Processing completed successfully!\n", + "Total matches across both passes: 35\n", + "Total Git files to update: 25\n" + ] + } + ], + "execution_count": 93 }, { "metadata": {}, "cell_type": "code", "outputs": [], "execution_count": null, - "source": "", + "source": [ + "# === Final Processing Summary ===\n", + "# Total YAML files: 748\n", + "# Total matches found: 35\n", + "# Failed files: 0\n", + "# Final unmatched datasets: 2\n", + "# Deprecated matches (not updated): 0\n", + "#\n", + "# Total Git files to be updated: 25\n", + "#\n", + "# Deprecated matched datasets:\n", + "#\n", + "# Processing completed successfully!\n", + "# Total matches across both passes: 35\n", + "# Total Git files to update: 25" + ], "id": "afe1d28ac7b91d15" }, { - "metadata": {}, + "metadata": { + "ExecuteTime": { + "end_time": 
"2025-07-09T13:27:38.540829Z", + "start_time": "2025-07-09T13:27:38.537831Z" + } + }, "cell_type": "code", - "outputs": [], - "execution_count": null, "source": [ "print(\"Git files to update:\")\n", "for file in results['git_files']:\n", " print(file)" ], - "id": "456c408a11b90883" + "id": "456c408a11b90883", + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Git files to update:\n", + "intelinair_corn_kernel_counting.yaml\n", + "noaa-historicalcharts.yaml\n", + "epa-equates-v1.yaml\n", + "bhl-open-data.yaml\n", + "aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml\n", + "ssl4eo-multi-product-data.yaml\n", + "ecmwf-forecasts.yaml\n", + "aodn_animal_acoustic_tracking_delayed_qc.yaml\n", + "aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml\n", + "green_et.yaml\n", + "aodn_satellite_optical_water_type_1day_aqua.yaml\n", + "aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml\n", + "proj-datum-grids.yaml\n", + "whiffle-wins50.yaml\n", + "gnss-ro-opendata.yaml\n", + "in-elevation.yaml\n", + "intelinair_longitudinal_nutrient_deficiency.yaml\n", + "seefar.yaml\n", + "aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml\n", + "aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml\n", + "wbg-cckp.yaml\n", + "aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml\n", + "aurora_msds.yaml\n", + "usgs_aqr.yaml\n", + "satellogic-earthview.yaml\n", + "aodn_radar_rottnestshelf_wind_delayed_qc.yaml\n", + "dep-mangroves.yaml\n", + "aodn_radar_coffsharbour_wave_delayed_qc.yaml\n", + "venus-l2a-cogs.yaml\n", + "aodn_satellite_chlorophylla_carder_1day_aqua.yaml\n", + "oceanomics.yaml\n", + "asf-event-data.yaml\n", + "gmsdata.yaml\n", + "3kricegenome.yaml\n", + "aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml\n", + "cwa_opendata.yaml\n", + "targetepigenomics.yaml\n", + "caladapt-wildfire-dataset.yaml\n", + "aodn_vessel_co2_delayed_qc.yaml\n", + "amazon-last-mile-challenges.yaml\n", + 
"aodn_satellite_chlorophylla_gsm_1day_aqua.yaml\n", + "nz-elevation.yaml\n", + "blue_et.yaml\n", + "aodn_satellite_chlorophylla_gsm_1day_snpp.yaml\n", + "aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml\n", + "speedtest-global-performance.yaml\n", + "stdpopsim_kern.yaml\n", + "argoverse.yaml\n", + "aodn_vessel_fishsoop_realtime_qc.yaml\n", + "dep-s1-annual-mosaics.yaml\n", + "black_marble_combustion.yaml\n", + "gulfwide-avian-monitoring.yaml\n", + "nifs-lhd.yaml\n", + "epa-2022-modeling-platform.yaml\n", + "cropland_partitioining.yaml\n", + "aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml\n", + "aodn_mooring_hourly_timeseries_delayed_qc.yaml\n", + "citrus-farm.yaml\n", + "aodn_model_sea_level_anomaly_gridded_realtime.yaml\n", + "real-changesets.yaml\n", + "aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml\n", + "dep-coastlines.yaml\n", + "aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml\n", + "wis2-global-cache.yaml\n", + "hycom-gofs-3pt1-reanalysis.yaml\n", + "surface-pm2-5-v6gl02.yaml\n", + "aodn_satellite_chlorophylla_oc3_1day_snpp.yaml\n", + "aodn_satellite_chlorophylla_oci_1day_noaa20.yaml\n", + "aodn_mooring_ctd_delayed_qc.yaml\n", + "aodn_wave_buoy_realtime_nonqc.yaml\n", + "aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml\n", + "ag-loam.yaml\n", + "aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml\n", + "noaa-nesdis-tcprimed-pds.yaml\n", + "openaerialmap.yaml\n", + "epa-edde-v2.yaml\n", + "catalyst-cooperative-pudl.yaml\n", + "aodn_vessel_xbt_realtime_nonqc.yaml\n", + "aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml\n", + "aodn_radar_rottnestshelf_wave_delayed_qc.yaml\n", + "dep-s2-geomads.yaml\n", + "aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml\n", + "mapping-africa.yaml\n", + "noaa-nws-wam-ipe.yaml\n", + "openfoodfacts-images.yaml\n", + "colorado-imagery.yaml\n", + "boreas.yaml\n", + "noaa-space-weather.yaml\n", + "ccic.yaml\n", + "cesm-hr.yaml\n", + 
"nz-imagery.yaml\n", + "aodn_mooring_satellite_altimetry_calibration_validation.yaml\n", + "asset-data-igp-coal-plant.yaml\n", + "era5-for-wrf.yaml\n", + "aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml\n", + "wise-allsky.yaml\n", + "glo-30-hand.yaml\n", + "blended-tropomi-gosat-methane.yaml\n", + "aodn_vessel_xbt_delayed_qc.yaml\n", + "os-climate-physrisk.yaml\n", + "aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml\n", + "geoglows-v2.yaml\n", + "sofar-spotter-archive.yaml\n", + "epa-edde-v1.yaml\n", + "aodn_satellite_chlorophylla_oci_1day_snpp.yaml\n", + "aodn_satellite_chlorophylla_oc3_1day_aqua.yaml\n", + "aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml\n", + "racecar-dataset.yaml\n", + "palsar-2-scansar-flooding-in-rwanda.yaml\n", + "aodn_slocum_glider_delayed_qc.yaml\n", + "aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml\n", + "aodn_satellite_chlorophylla_oci_1day_aqua.yaml\n", + "its-live-data.yaml\n", + "aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml\n", + "in-imagery.yaml\n", + "ford-multi-av-seasonal.yaml\n", + "open-meteo.yaml\n", + "ladi.yaml\n", + "global-drought-flood-catalogue.yaml\n", + "obis.yaml\n", + "dmi-opendata.yaml\n", + "aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml\n", + "kyfromabove.yaml\n", + "nyc-tlc-trip-records-pds.yaml\n" + ] + } + ], + "execution_count": 92 }, { "metadata": {}, diff --git a/datasets/aodn_radar_capricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_capricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml index 8cfbfd3fd..00404ce4c 100644 --- a/datasets/aodn_radar_capricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_capricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml @@ -29,6 +29,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: 
- oceans - ocean currents diff --git a/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml b/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml index 5af507889..5ea6793c3 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml @@ -21,6 +21,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.yaml b/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.yaml index 148231ba2..bb1844d32 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.yaml @@ -18,6 +18,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml b/datasets/aodn_satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml index e33ef6710..90d19081c 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml @@ -21,6 +21,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git 
a/datasets/aodn_satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml b/datasets/aodn_satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml index 84276551c..e2eb7589d 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml @@ -19,6 +19,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml b/datasets/aodn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml index 239d8bb9d..ceb965b75 100644 --- a/datasets/aodn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml +++ b/datasets/aodn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml @@ -16,6 +16,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml b/datasets/aodn_satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml index 0946d2f66..af6b9fd7f 100644 --- a/datasets/aodn_satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml @@ -17,6 +17,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/aodn_vessel_air_sea_flux_product_delayed.yaml b/datasets/aodn_vessel_air_sea_flux_product_delayed.yaml index 27b48385e..43c5cb4bc 100644 --- 
a/datasets/aodn_vessel_air_sea_flux_product_delayed.yaml +++ b/datasets/aodn_vessel_air_sea_flux_product_delayed.yaml @@ -21,6 +21,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - air temperature diff --git a/datasets/aodn_vessel_sst_delayed_qc.yaml b/datasets/aodn_vessel_sst_delayed_qc.yaml index db64be654..26e3d3802 100644 --- a/datasets/aodn_vessel_sst_delayed_qc.yaml +++ b/datasets/aodn_vessel_sst_delayed_qc.yaml @@ -25,6 +25,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - air temperature diff --git a/datasets/aodn_vessel_trv_realtime_qc.yaml b/datasets/aodn_vessel_trv_realtime_qc.yaml index dc53ea084..ced99a379 100644 --- a/datasets/aodn_vessel_trv_realtime_qc.yaml +++ b/datasets/aodn_vessel_trv_realtime_qc.yaml @@ -26,6 +26,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - chemistry diff --git a/datasets/cmip6-era5-hybrid-southeast-asia.yaml b/datasets/cmip6-era5-hybrid-southeast-asia.yaml index 693df9a70..56a47ddc6 100644 --- a/datasets/cmip6-era5-hybrid-southeast-asia.yaml +++ b/datasets/cmip6-era5-hybrid-southeast-asia.yaml @@ -1,27 +1,31 @@ -Name: Hybrid statistical-dynamic downscaling based on multi-model ensembles in Southeast Asia -Description: | - GCMs under CMIP6 have been widely used to investigate climate change impacts and put forward associated adaptation and mitigation strategies. 
However, the relatively coarse spatial resolutions (usually 100~300km) preclude their direct applications at regional scales, which are exactly where the analysis (e.g., hydrological model simulation) is performed. To bridge this gap, a typical approach is to ‘refine’ the information from GCMs through regional climate downscaling experiments, which can be conducted statistically, dynamically, or a combination thereof. Statistical downscaling establishes relationships between large-scale climate indicators and small-scale climate variables in the reference (historical) period. Subsequently, these relationships are kept unchanged in the future and used to predict the future variables. On the other hand, dynamical downscaling operates based on the physical processes and the associated interactions in the climate systems and thus can produce a full set of regional climate simulations (e.g., temperature and precipitation fields) that are dynamically consistent. However, traditional dynamical downscaling contains significant biases that are transferred from GCMs and may be enhanced during the process of downscaling, thus degrading the downscaled results. One promising approach to remove these biases is the hybrid statistical-dynamical downscaling method, where GCMs are firstly bias-corrected, and subsequently used as lower and lateral boundary conditions to drive the regional climate models (RCMs). - - In this work, we apply a hybrid statistical-dynamical downscaling method, following the approach of Xu et al. 2021. The bias-corrected dataset is adjusted to resemble ERA5-based mean climate and interannual variance, and with a non-linear trend from the ensemble mean of the 14 CMIP6 models. The dataset spans a historical period of 1979–2014 and future scenarios (SSP585) of 2015–2100, with a temporal scale of six-hour. - - The main contributions of this dataset are twofold. 
First, we provide the open-source and high-resolution (12.5km: Southeast Asia; 2.5km:Southern Malay Peninsula; 500m: Singapore, as shown in the following Figures) datasets, including precipitation, wind, temperature, radiation, etc. Second, through our experiment, this bias-corrected and downscaled dataset is of exceptional quality compared to that of the existing dynamical scaling work (e.g., CORDEX) in southeast Asia in terms of its ability to reproduce regional climate extremes, spatial patterns, etc. This dataset will be useful for policy-makers and researchers to establish the necessary pathways for resilient planning in order to mitigate the dire impacts of climate change. -Documentation: https://sgcale.github.io/resource/data/ -Contact: For any questions regarding dataset, email Professor Xiaogang He at hexg@nus.edu.sg. -ManagedBy: "[PREP-NexT Lab](https://github.com/PREP-NexT)" -UpdateFrequency: Update when needed. -Tags: - - climate - - netcdf - - precipitation - - aws-pds -License: - "All the code in this repository is [MIT](https://choosealicense.com/licenses/mit/) licensed, but we request that you please provide attribution if reusing any of our digital content (graphics, logo, copy, etc.)." -Resources: - - Description: | - We are releasing a bias-corrected and downscaled dataset based on 14 Coupled Model Intercomparison Project 6 (CMIP6) global climate models (GCMs) and the European Centre for Medium-Range Weather Forecasts Reanalysis 5 (ERA5) dataset. More details please refer to [this link](https://sgcale.github.io/research/climate-downscaling/). 
- ARN: arn:aws:s3:::arn:aws:s3:::cmip6-wrf-southeastasia - Region: us-west-2 - Type: S3 Bucket - RequesterPays: false - Explore: - - "[Browse Bucket](https://cmip6-wrf-southeastasia.s3.us-west-2.amazonaws.com/index.html)" +Name: Hybrid statistical-dynamic downscaling based on multi-model ensembles in Southeast Asia +Description: | + GCMs under CMIP6 have been widely used to investigate climate change impacts and put forward associated adaptation and mitigation strategies. However, the relatively coarse spatial resolutions (usually 100~300km) preclude their direct applications at regional scales, which are exactly where the analysis (e.g., hydrological model simulation) is performed. To bridge this gap, a typical approach is to ‘refine’ the information from GCMs through regional climate downscaling experiments, which can be conducted statistically, dynamically, or a combination thereof. Statistical downscaling establishes relationships between large-scale climate indicators and small-scale climate variables in the reference (historical) period. Subsequently, these relationships are kept unchanged in the future and used to predict the future variables. On the other hand, dynamical downscaling operates based on the physical processes and the associated interactions in the climate systems and thus can produce a full set of regional climate simulations (e.g., temperature and precipitation fields) that are dynamically consistent. However, traditional dynamical downscaling contains significant biases that are transferred from GCMs and may be enhanced during the process of downscaling, thus degrading the downscaled results. One promising approach to remove these biases is the hybrid statistical-dynamical downscaling method, where GCMs are firstly bias-corrected, and subsequently used as lower and lateral boundary conditions to drive the regional climate models (RCMs). 
+ + In this work, we apply a hybrid statistical-dynamical downscaling method, following the approach of Xu et al. 2021. The bias-corrected dataset is adjusted to resemble ERA5-based mean climate and interannual variance, and with a non-linear trend from the ensemble mean of the 14 CMIP6 models. The dataset spans a historical period of 1979–2014 and future scenarios (SSP585) of 2015–2100, with a temporal scale of six-hour. + + The main contributions of this dataset are twofold. First, we provide the open-source and high-resolution (12.5km: Southeast Asia; 2.5km:Southern Malay Peninsula; 500m: Singapore, as shown in the following Figures) datasets, including precipitation, wind, temperature, radiation, etc. Second, through our experiment, this bias-corrected and downscaled dataset is of exceptional quality compared to that of the existing dynamical scaling work (e.g., CORDEX) in southeast Asia in terms of its ability to reproduce regional climate extremes, spatial patterns, etc. This dataset will be useful for policy-makers and researchers to establish the necessary pathways for resilient planning in order to mitigate the dire impacts of climate change. +Documentation: https://sgcale.github.io/resource/data/ +Contact: For any questions regarding dataset, email Professor Xiaogang He at hexg@nus.edu.sg. +ManagedBy: "[PREP-NexT Lab](https://github.com/PREP-NexT)" +UpdateFrequency: Update when needed. +Collabs: + ASDI: + Tags: + - climate +Tags: + - climate + - netcdf + - precipitation + - aws-pds +License: + "All the code in this repository is [MIT](https://choosealicense.com/licenses/mit/) licensed, but we request that you please provide attribution if reusing any of our digital content (graphics, logo, copy, etc.)." 
+Resources: + - Description: | + We are releasing a bias-corrected and downscaled dataset based on 14 Coupled Model Intercomparison Project 6 (CMIP6) global climate models (GCMs) and the European Centre for Medium-Range Weather Forecasts Reanalysis 5 (ERA5) dataset. More details please refer to [this link](https://sgcale.github.io/research/climate-downscaling/). + ARN: arn:aws:s3:::arn:aws:s3:::cmip6-wrf-southeastasia + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + Explore: + - "[Browse Bucket](https://cmip6-wrf-southeastasia.s3.us-west-2.amazonaws.com/index.html)" diff --git a/datasets/coawst.yaml b/datasets/coawst.yaml index de87d9a09..2820336d0 100644 --- a/datasets/coawst.yaml +++ b/datasets/coawst.yaml @@ -5,6 +5,10 @@ Contact: jbzambon@fathomscience.com ManagedBy: Fathom Science UpdateFrequency: None Citation: Warner, J.C., and Kalra, T.S., 2022, Collection of COAWST model forecast for the US East Coast and Gulf of Mexico, U.S. Geological Survey data release, https://doi.org/10.5066/P903KPBJ +Collabs: + ASDI: + Tags: + - oceans Tags: - aws-pds - oceans diff --git a/datasets/ctrees-california-vhr-tree-height.yaml b/datasets/ctrees-california-vhr-tree-height.yaml index d47f037ec..b2384cc86 100644 --- a/datasets/ctrees-california-vhr-tree-height.yaml +++ b/datasets/ctrees-california-vhr-tree-height.yaml @@ -5,6 +5,10 @@ Documentation: "[Project overview](https://ctrees.org/products/tree-level)" Contact: info@ctrees.org ManagedBy: "[CTrees](https://ctrees.org/)" UpdateFrequency: TBD +Collabs: + ASDI: + Tags: + - biodiversity Tags: - aws-pds - cog diff --git a/datasets/dep-wofs.yaml b/datasets/dep-wofs.yaml index 9c08e3ea1..369835ef9 100644 --- a/datasets/dep-wofs.yaml +++ b/datasets/dep-wofs.yaml @@ -10,6 +10,10 @@ Documentation: https://digitalearthpacific.org/#/applications Contact: dep@spc.int ManagedBy: "[Pacific Community (SPC)](https://www.spc.int/)" UpdateFrequency: Annually +Collabs: + ASDI: + Tags: + - oceans Tags: - earth observation - 
environmental diff --git a/datasets/emearth.yaml b/datasets/emearth.yaml index 50221efca..270a51294 100644 --- a/datasets/emearth.yaml +++ b/datasets/emearth.yaml @@ -4,6 +4,10 @@ Documentation: https://doi.org/10.20383/102.0547 Contact: shervan.gharari@usask.ca ManagedBy: "[Computational Hydrology at the University of Saskatchewan](https://uofs-comphyd.github.io/)" UpdateFrequency: N/A +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - atmosphere diff --git a/datasets/hycom-global-drifters.yaml b/datasets/hycom-global-drifters.yaml index 729bcd4b8..725e38e4b 100644 --- a/datasets/hycom-global-drifters.yaml +++ b/datasets/hycom-global-drifters.yaml @@ -4,6 +4,10 @@ Documentation: https://github.com/selipot/hycom-oceantrack Contact: https://github.com/selipot/hycom-oceantrack/issues ManagedBy: Shane Elipot UpdateFrequency: Not updated +Collabs: + ASDI: + Tags: + - oceans Tags: - aws-pds - drifters diff --git a/datasets/nsf-ncar-era5.yaml b/datasets/nsf-ncar-era5.yaml index f01db9a52..f213e2d14 100644 --- a/datasets/nsf-ncar-era5.yaml +++ b/datasets/nsf-ncar-era5.yaml @@ -4,6 +4,10 @@ Documentation: https://doi.org/10.5065/BH6N-5N20 Contact: rdahelp@ucar.edu ManagedBy: "[NSF National Center for Atmospheric Research](https://ncar.ucar.edu/)" UpdateFrequency: Monthly, with a 3-4 month lag from realtime +Collabs: + ASDI: + Tags: + - climate Tags: - climate - model diff --git a/datasets/palsar-2-scansar-flooding-in-bangladesh.yaml b/datasets/palsar-2-scansar-flooding-in-bangladesh.yaml index 31da24f06..b06a6c86a 100644 --- a/datasets/palsar-2-scansar-flooding-in-bangladesh.yaml +++ b/datasets/palsar-2-scansar-flooding-in-bangladesh.yaml @@ -5,6 +5,10 @@ License: Data is available for free under the terms of use. 
Documentation: https://www.eorc.jaxa.jp/ALOS/en/dataset/alos_open_and_free_e.htm, https://www.eorc.jaxa.jp/ALOS/en/dataset/palsar2_l22_e.htm ManagedBy: "[JAXA](https://www.jaxa.jp/)" Contact: aproject@jaxa.jp +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - agriculture diff --git a/datasets/pohang-canal-dataset.yaml b/datasets/pohang-canal-dataset.yaml index bb004a3e5..d8a4422fe 100644 --- a/datasets/pohang-canal-dataset.yaml +++ b/datasets/pohang-canal-dataset.yaml @@ -4,6 +4,10 @@ Documentation: https://sites.google.com/view/pohang-canal-dataset/home Contact: morin-lab@kaist.ac.kr ManagedBy: "[MORIN](http://morin.kaist.ac.kr)" UpdateFrequency: Not updated +Collabs: + ASDI: + Tags: + - oceans Tags: - aws-pds - autonomous vehicles diff --git a/datasets/s1-orbits.yaml b/datasets/s1-orbits.yaml index ce619203a..9fd6cedcf 100644 --- a/datasets/s1-orbits.yaml +++ b/datasets/s1-orbits.yaml @@ -14,6 +14,10 @@ Contact: https://asf.alaska.edu/asf/contact-us/ ManagedBy: "[The Alaska Satellite Facility (ASF)](https://asf.alaska.edu/)" UpdateFrequency: > Updated as new data becomes available on the [Copernicus Data Space Ecosystem](https://documentation.dataspace.copernicus.eu/Data/ComplementaryData/Additional.html#sentinel-1-orbits). Typically AUX_POEORB files are published daily and AUX_RESORB files are published every other hour. +Collabs: + ASDI: + Tags: + - disaster Tags: - auxiliary data - disaster response diff --git a/datasets/sentinel-products-ca-mirror.yaml b/datasets/sentinel-products-ca-mirror.yaml index 8cc477f10..76c5e9d7d 100644 --- a/datasets/sentinel-products-ca-mirror.yaml +++ b/datasets/sentinel-products-ca-mirror.yaml @@ -14,6 +14,10 @@ UpdateFrequency: "Sentinel-1 is an NRT dataset retrieved from ESA within 90 minu
Sentinel-1 est un ensemble de données NRT récupéré de l'ESA dans les 90 minutes suivant la liaison descendante du satellite. Sentinel-2 et Sentinel-3 non NRT sont également récupérés le plus rapidement possible en fonction de la couverture du Canada et de la disponibilité à la source." +Collabs: + ASDI: + Tags: + - satellite imagery Tags: - aws-pds - agriculture From 1c03985092988a85972cc71297484c065ec62e9e Mon Sep 17 00:00:00 2001 From: "Shruti [C] Bhanderi" Date: Wed, 9 Jul 2025 13:59:41 +0000 Subject: [PATCH 095/751] bulk tagging ASDI - missing commit files --- datasets/africa-field-boundary-labels.yaml | 4 ++++ .../aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml | 4 ++++ ...ite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml | 4 ++++ datasets/rcm-ceos-ard.yaml | 4 ++++ 4 files changed, 16 insertions(+) diff --git a/datasets/africa-field-boundary-labels.yaml b/datasets/africa-field-boundary-labels.yaml index 271fe1b24..80996af96 100644 --- a/datasets/africa-field-boundary-labels.yaml +++ b/datasets/africa-field-boundary-labels.yaml @@ -13,6 +13,10 @@ Documentation: Information on the primary dataset can be found [here](https://gi Contact: airg@clarku.edu ManagedBy: "[The Agricultural Impacts Research Group](https://agroimpacts.info/)" UpdateFrequency: "Updated versions of the dataset are added as they are developed" +Collabs: + ASDI: + Tags: + - agriculture Tags: - agriculture - machine learning diff --git a/datasets/aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml b/datasets/aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml index dad13e67a..0ff16356a 100644 --- a/datasets/aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml +++ b/datasets/aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml @@ -23,6 +23,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery 
diff --git a/datasets/aodn_satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml b/datasets/aodn_satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml index 39ddf0ec2..36d7d95a4 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml @@ -16,6 +16,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - satellite imagery diff --git a/datasets/rcm-ceos-ard.yaml b/datasets/rcm-ceos-ard.yaml index 6584d6946..1008189c3 100644 --- a/datasets/rcm-ceos-ard.yaml +++ b/datasets/rcm-ceos-ard.yaml @@ -14,6 +14,10 @@ UpdateFrequency: "The initial dataset will be Canada-wide, 30M Compact-Polarizat
L'ensemble de données initial couvrira l'ensemble du Canada, une couverture standard de 30 metres de polarisation compacte, tous les 12 jours, par fréquence de revisite de mission." +Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - agriculture From de6d96139eb1465741a6803619ce281004c0e6e3 Mon Sep 17 00:00:00 2001 From: "Shruti [C] Bhanderi" Date: Wed, 9 Jul 2025 14:08:24 +0000 Subject: [PATCH 096/751] manual add ASDI tag 1 file --- datasets/aodn_vessel_air_sea_flux_sst_meteo_realtime.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/aodn_vessel_air_sea_flux_sst_meteo_realtime.yaml b/datasets/aodn_vessel_air_sea_flux_sst_meteo_realtime.yaml index 0c94b837e..35879331c 100644 --- a/datasets/aodn_vessel_air_sea_flux_sst_meteo_realtime.yaml +++ b/datasets/aodn_vessel_air_sea_flux_sst_meteo_realtime.yaml @@ -21,6 +21,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea Contact: info@aodn.org.au ManagedBy: AODN UpdateFrequency: As Needed +Collabs: + ASDI: + Tags: + - oceans Tags: - oceans - air temperature From ca8cde86da3253f31b1c274bc1fbfe0f4e852fbd Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 9 Jul 2025 10:49:50 -0800 Subject: [PATCH 097/751] Update real-changesets.yaml --- datasets/real-changesets.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/real-changesets.yaml b/datasets/real-changesets.yaml index 832e32d6f..b72fc1aae 100644 --- a/datasets/real-changesets.yaml +++ b/datasets/real-changesets.yaml @@ -10,7 +10,7 @@ UpdateFrequency: Minutely Collabs: ASDI: Tags: - - disaster + - disaster response Tags: - geospatial - osm From a60aa73ee07acea8afe167a0809856fa0c281166 Mon Sep 17 00:00:00 2001 From: bshrutiw Date: Wed, 9 Jul 2025 11:56:20 -0700 Subject: [PATCH 098/751] Delete bulk_update_test.ipynb Removing the script for bulk tag from PR --- bulk_update_test.ipynb | 1824 ---------------------------------------- 1 file changed, 1824 
deletions(-) delete mode 100644 bulk_update_test.ipynb diff --git a/bulk_update_test.ipynb b/bulk_update_test.ipynb deleted file mode 100644 index 75d177f72..000000000 --- a/bulk_update_test.ipynb +++ /dev/null @@ -1,1824 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "id": "initial_id", - "metadata": { - "collapsed": true, - "ExecuteTime": { - "end_time": "2025-07-08T21:40:22.659792Z", - "start_time": "2025-07-08T21:40:22.465395Z" - } - }, - "source": [ - "\n", - "from pathlib import Path\n", - "# from github import Github\n", - "\n", - "import os\n", - "import pandas as pd\n" - ], - "outputs": [], - "execution_count": 1 - }, - { - "metadata": { - "ExecuteTime": { - "end_time": "2025-07-08T21:40:27.284330Z", - "start_time": "2025-07-08T21:40:27.281147Z" - } - }, - "cell_type": "code", - "source": [ - "def clean_name(name):\n", - " \"\"\"Clean name by removing parentheses and normalizing\"\"\"\n", - " if not name:\n", - " return \"\"\n", - "\n", - " # Remove parentheses and their contents\n", - " cleaned = name.split('(')[0]\n", - "\n", - " # Remove special characters\n", - " cleaned = ''.join(char for char in cleaned if char.isalnum() or char.isspace())\n", - "\n", - " # Normalize whitespace and convert to lowercase\n", - " return ' '.join(cleaned.split()).lower().strip()" - ], - "id": "e0678a327eded893", - "outputs": [], - "execution_count": 2 - }, - { - "metadata": { - "ExecuteTime": { - "end_time": "2025-07-09T05:16:07.395317Z", - "start_time": "2025-07-09T05:16:07.376411Z" - } - }, - "cell_type": "code", - "source": [ - "# #deprecated version\n", - "# from ruamel.yaml import YAML\n", - "# from ruamel.yaml.comments import CommentedMap\n", - "#\n", - "#\n", - "# class OrderedYAML(YAML):\n", - "# def __init__(self):\n", - "# super().__init__()\n", - "# self.preserve_quotes = True\n", - "# self.indent(sequence=2)\n", - "# self.width = 4096\n", - "# self.default_flow_style = False\n", - "# self.map_format = 'rt'\n", - "#\n", - "#\n", - "# def 
update_yaml_with_collabs(yaml_file_path, dataset_name, category):\n", - "# try:\n", - "# yaml = OrderedYAML()\n", - "#\n", - "# # Read file and preserve formatting\n", - "# with open(yaml_file_path, 'r') as file:\n", - "# original_content = file.read()\n", - "# data = yaml.load(original_content)\n", - "#\n", - "# if data.get('Name') == dataset_name:\n", - "# print(f\"Match found for: {dataset_name} with yaml file {str(yaml_file_path)}\")\n", - "#\n", - "# # # Create backup\n", - "# # backup_path = str(yaml_file_path) + '.bak'\n", - "# # with open(backup_path, 'w') as backup:\n", - "# # backup.write(original_content)\n", - "# if 'DeprecatedNotice' in data and isinstance(data['DeprecatedNotice'], str) and data['DeprecatedNotice'].strip():\n", - "# return 'deprecated' # Special return value for deprecated datasets\n", - "# try:\n", - "# # Create new CommentedMap to preserve order\n", - "# new_data = CommentedMap()\n", - "# modified = False\n", - "#\n", - "#\n", - "# # Copy existing data in order\n", - "# for key in data:\n", - "# # new_data[key] = data[key]\n", - "# if key == 'Tags':\n", - "# # Debug prints\n", - "# print(\"Found Tags key\")\n", - "# print(\"Current data keys:\", data.keys())\n", - "# print(\"Checking if Collabs exists:\", 'Collabs' in data)\n", - "#\n", - "# new_data[key] = data[key]\n", - "# # Then add Collabs right after Tags if it doesn't exist\n", - "# # Then add Collabs right after Tags\n", - "# if not any(k == 'Collabs' for k in data.keys()): # Alternative check\n", - "# print(\"Adding Collabs section\")\n", - "# new_data['Collabs'] = CommentedMap()\n", - "# new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", - "# modified = True\n", - "# else:\n", - "# new_data[key] = data[key]\n", - "#\n", - "# # If Collabs already exists, just update it\n", - "# if 'Collabs' in data:\n", - "# new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", - "# modified = True\n", - "#\n", - "# # If Collabs already exists, just update it\n", - "# # 
if 'Collabs' in new_data:\n", - "# # if 'ASDI' not in new_data['Collabs']:\n", - "# # new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", - "# # modified = True\n", - "# # elif new_data['Collabs']['ASDI'].get('Tags') != [category.strip()]:\n", - "# # new_data['Collabs']['ASDI']['Tags'] = [category.strip()]\n", - "# # modified = True\n", - "# # Write back preserving format\n", - "# # with open(yaml_file_path, 'w') as outfile:\n", - "# # yaml.dump(new_data, outfile)\n", - "#\n", - "# # Only write if changes were made\n", - "# # Only write if changes were made\n", - "# if modified:\n", - "# with open(yaml_file_path, '+w') as outfile:\n", - "# yaml.dump(new_data, outfile)\n", - "# print(f\"Successfully updated {yaml_file_path}\")\n", - "# return 'modified' # New return value to indicate modification\n", - "# else:\n", - "# print(f\"No changes needed for {yaml_file_path}\")\n", - "# return 'matched'\n", - "#\n", - "#\n", - "# # print(f\"Successfully updated {yaml_file_path}\")\n", - "# # return True\n", - "#\n", - "# except Exception as write_error:\n", - "# # Restore from backup\n", - "# with open(yaml_file_path, 'w') as file:\n", - "# file.write(original_content)\n", - "# print(f\"Error writing file, restored from backup: {write_error}\")\n", - "# raise\n", - "#\n", - "# # finally:\n", - "# # import os\n", - "# # if os.path.exists(backup_path):\n", - "# # os.remove(backup_path)\n", - "#\n", - "# # finally:\n", - "# # import os\n", - "# # if os.path.exists(backup_path):\n", - "# # os.remove(backup_path)\n", - "#\n", - "# except Exception as e:\n", - "# print(f\"Error processing {yaml_file_path}: {e}\")\n", - "#\n", - "# return False\n", - "#\n", - "#\n", - "# def update_yaml_with_collabs_cleaned(yaml_file_path, dataset_name, category):\n", - "# \"\"\"Update YAML file using cleaned name matching\"\"\"\n", - "# try:\n", - "# yaml = OrderedYAML()\n", - "#\n", - "# with open(yaml_file_path, 'r') as file:\n", - "# original_content = file.read()\n", - "# data = 
yaml.load(original_content)\n", - "#\n", - "# yaml_name = data.get('Name', '')\n", - "#\n", - "# # Clean both names for comparison\n", - "# clean_yaml_name = clean_name(yaml_name)\n", - "# clean_dataset_name = clean_name(dataset_name)\n", - "#\n", - "# if clean_yaml_name == clean_dataset_name:\n", - "# if 'DeprecatedNotice' in data and isinstance(data['DeprecatedNotice'], str) and data['DeprecatedNotice'].strip():\n", - "# return 'deprecated'\n", - "#\n", - "# print(f\"\\nCleaned name match found!\")\n", - "# print(f\"Original Excel name: {dataset_name}\")\n", - "# print(f\"Original YAML name: {yaml_name}\")\n", - "# print(f\"Cleaned names: {clean_yaml_name}\")\n", - "#\n", - "# # # Create backup\n", - "# # backup_path = str(yaml_file_path) + '.bak'\n", - "# # with open(backup_path, 'w') as backup:\n", - "# # backup.write(original_content)\n", - "#\n", - "# try:\n", - "# # Create new CommentedMap to preserve order\n", - "# new_data = CommentedMap()\n", - "# modified = False\n", - "#\n", - "# # Copy existing data in order\n", - "# for key in data:\n", - "# # print(key)\n", - "# # new_data[key] = data[key]\n", - "# if key == 'Tags':\n", - "# # First add the Tags\n", - "# new_data[key] = data[key]\n", - "# # print(key)\n", - "# # print(data)\n", - "# # Then add Collabs right after Tags if it doesn't exist\n", - "# if 'Collabs' not in data:\n", - "# print(key)\n", - "# new_data['Collabs'] = CommentedMap()\n", - "# new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", - "# modified = True\n", - "# else:\n", - "# new_data[key] = data[key]\n", - "#\n", - "#\n", - "# # If Collabs already exists, just update it\n", - "# if 'Collabs' in data:\n", - "# new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", - "# modified = True\n", - "#\n", - "# # If Collabs already exists, just update it\n", - "# # if 'Collabs' in new_data:\n", - "# # if 'ASDI' not in new_data['Collabs']:\n", - "# # new_data['Collabs']['ASDI'] = {'Tags': [category.strip()]}\n", - "# # 
modified = True\n", - "# # elif new_data['Collabs']['ASDI'].get('Tags') != [category.strip()]:\n", - "# # new_data['Collabs']['ASDI']['Tags'] = [category.strip()]\n", - "# # modified = True\n", - "#\n", - "# # Write back preserving format\n", - "# # with open(yaml_file_path, 'w') as outfile:\n", - "# # yaml.dump(new_data, outfile)\n", - "#\n", - "# # Only write if changes were made\n", - "# # Only write if changes were made\n", - "# if modified:\n", - "# with open(yaml_file_path, 'w') as outfile:\n", - "# yaml.dump(new_data, outfile)\n", - "# print(f\"Successfully updated {yaml_file_path}\")\n", - "# return 'modified' # New return value to indicate modification\n", - "# else:\n", - "# print(f\"No changes needed for {yaml_file_path}\")\n", - "# return 'matched'\n", - "# return True\n", - "#\n", - "#\n", - "# # print(f\"Successfully updated {yaml_file_path}\")\n", - "# # return True\n", - "#\n", - "# except Exception as write_error:\n", - "# # Restore from backup\n", - "# with open(yaml_file_path, 'w') as file:\n", - "# file.write(original_content)\n", - "# print(f\"Error writing file, restored from backup: {write_error}\")\n", - "# raise\n", - "#\n", - "# # finally:\n", - "# # import os\n", - "# # if os.path.exists(backup_path):\n", - "# # os.remove(backup_path)\n", - "#\n", - "# except Exception as e:\n", - "# print(f\"Error processing {yaml_file_path}: {e}\")\n", - "#\n", - "# return False\n", - "#\n", - "#\n", - "# def process_yaml_files(dataset_folder, excel_file):\n", - "# \"\"\"Process all YAML files and match with Excel data\"\"\"\n", - "# total_yaml_files = 0\n", - "# processed_files = 0\n", - "# matches_found = 0\n", - "# matched_dataset_names = []\n", - "# failed_files = []\n", - "# git_files = [] # Track YAML files to be updated\n", - "# deprecated_matched = [] # Track deprecated but matched datasets\n", - "# # Read Excel file\n", - "# try:\n", - "# df = pd.read_excel(excel_file)\n", - "# # Ensure the required columns exist\n", - "# if 'Dataset' not in 
df.columns or 'Category' not in df.columns:\n", - "# raise ValueError(\"Excel file must contain 'Name' and 'Category' columns\")\n", - "#\n", - "# # Create dictionary of dataset names and categories\n", - "# dataset_dict = dict(zip(df['Dataset'], df['Category']))\n", - "# # print(dataset_dict)\n", - "#\n", - "# yaml_folder_path = Path(dataset_folder)\n", - "# if not yaml_folder_path.exists() or not yaml_folder_path.is_dir():\n", - "# raise ValueError(f\"Invalid dataset folder path: {dataset_folder}\")\n", - "#\n", - "# # Count total YAML files\n", - "# total_yaml_files = len([f for f in yaml_folder_path.glob('*.yaml')])\n", - "# print(\"\\n=== First Pass: Exact Matching ===\")\n", - "# # First pass - exact matching\n", - "# for yaml_file in yaml_folder_path.glob('*.yaml'):\n", - "# processed_files += 1\n", - "# # print(f\"\\nProcessing file {processed_files}/{total_yaml_files}: {yaml_file}\")\n", - "#\n", - "# try:\n", - "# for dataset_name, category in dataset_dict.items():\n", - "# match_result = update_yaml_with_collabs(yaml_file, dataset_name, category)\n", - "# if match_result == 'deprecated':\n", - "# # Dataset matched but is deprecated\n", - "# deprecated_matched.append({\n", - "# 'dataset': dataset_name,\n", - "# 'file': yaml_file.name\n", - "# })\n", - "# matched_dataset_names.append(dataset_name) # Count as matched but not updated\n", - "# print(f\"Dataset {dataset_name} matched but marked as deprecated - skipping update\")\n", - "# break\n", - "# elif match_result == 'modified': # New condition\n", - "# matches_found += 1\n", - "# matched_dataset_names.append(dataset_name)\n", - "# git_files.append(yaml_file.name)\n", - "# break\n", - "# elif match_result == 'matched': # New condition\n", - "# matches_found += 1\n", - "# matched_dataset_names.append(dataset_name)\n", - "# break\n", - "# except Exception as e:\n", - "# print(f\"Error processing YAML file {yaml_file}: {e}\")\n", - "# failed_files.append(str(yaml_file))\n", - "# # Create dictionary of 
unmatched datasets\n", - "# unmatched_dict = {k: v for k, v in dataset_dict.items()\n", - "# if k not in matched_dataset_names}\n", - "#\n", - "# print(f\"\\n=== First Pass Complete ===\")\n", - "# print(f\"Matches found: {matches_found}\")\n", - "# print(f\"Deprecated matches (not updated): {len(deprecated_matched)}\")\n", - "# print(f\"Unmatched datasets: {len(unmatched_dict)}\")\n", - "#\n", - "# if unmatched_dict:\n", - "# print(\"\\n=== Second Pass: Cleaned Name Matching ===\")\n", - "# second_pass_matches = 0\n", - "#\n", - "# # Reset file counter for second pass\n", - "# processed_files = 0\n", - "#\n", - "# # Second pass with cleaned names\n", - "# for yaml_file in yaml_folder_path.glob('*.yaml'):\n", - "# processed_files += 1\n", - "# # print(f\"\\nSecond pass processing {processed_files}/{total_yaml_files}: {yaml_file}\")\n", - "#\n", - "# try:\n", - "# for dataset_name, category in unmatched_dict.items():\n", - "# match_result = update_yaml_with_collabs_cleaned(yaml_file, dataset_name, category)\n", - "# if match_result == 'deprecated':\n", - "# deprecated_matched.append({\n", - "# 'dataset': dataset_name,\n", - "# 'file': yaml_file.name\n", - "# })\n", - "# matched_dataset_names.append(dataset_name)\n", - "# print(f\"Dataset {dataset_name} matched but marked as deprecated - skipping update\")\n", - "# break\n", - "# elif match_result == 'modified': # New condition\n", - "# second_pass_matches += 1\n", - "# matched_dataset_names.append(dataset_name)\n", - "# git_files.append(yaml_file.name)\n", - "# break\n", - "# elif match_result == 'matched': # New condition\n", - "# second_pass_matches += 1\n", - "# matched_dataset_names.append(dataset_name)\n", - "# break\n", - "#\n", - "# except Exception as e:\n", - "# print(f\"Error in second pass processing {yaml_file}: {e}\")\n", - "# if str(yaml_file) not in failed_files:\n", - "# failed_files.append(str(yaml_file))\n", - "#\n", - "# print(f\"\\n=== Second Pass Complete ===\")\n", - "# print(f\"Additional 
matches found: {second_pass_matches}\")\n", - "# matches_found += second_pass_matches\n", - "#\n", - "# # Final unmatched datasets\n", - "# final_unmatched = df[~df['Dataset'].isin(matched_dataset_names)]\n", - "#\n", - "# # Save unmatched datasets to Excel\n", - "# output_path = 'unmatched_datasets.xlsx'\n", - "# final_unmatched.to_excel(output_path, index=False)\n", - "#\n", - "# # Print final summary\n", - "# print(\"\\n=== Final Processing Summary ===\")\n", - "# print(f\"Total YAML files: {total_yaml_files}\")\n", - "# print(f\"Total matches found: {matches_found}\")\n", - "# print(f\"Failed files: {len(failed_files)}\")\n", - "# print(f\"Final unmatched datasets: {len(final_unmatched)}\")\n", - "# print(f\"Deprecated matches (not updated): {len(deprecated_matched)}\")\n", - "# print(f\"Unmatched datasets saved to: {output_path}\")\n", - "# print(f\"\\nTotal Git files to be updated: {len(git_files)}\")\n", - "# print(\"\\nDeprecated matched datasets:\")\n", - "# for dep in deprecated_matched:\n", - "# print(f\"Dataset: {dep['dataset']} - File: {dep['file']}\")\n", - "# # print(\"Git files:\")\n", - "# # for file in git_files:\n", - "# # print(file)\n", - "#\n", - "# return {\n", - "# 'total_files': total_yaml_files,\n", - "# 'total_matches': matches_found,\n", - "# 'failed_files': failed_files,\n", - "# 'matched_datasets': matched_dataset_names,\n", - "# 'unmatched_count': len(final_unmatched),\n", - "# 'deprecated_matched': deprecated_matched,\n", - "# 'git_files': git_files\n", - "# }\n", - "#\n", - "# except Exception as e:\n", - "# print(f\"Error in main processing: {e}\")\n", - "# return None # Final unmatched datasets\n", - "# final_unmatched = df[~df['Dataset'].isin(matched_dataset_names)]\n", - "#\n", - "# # Save unmatched datasets to Excel\n", - "# output_path = 'unmatched_datasets.xlsx'\n", - "# final_unmatched.to_excel(output_path, index=False)\n", - "#\n", - "# # Print final summary\n", - "# print(\"\\n=== Final Processing Summary ===\")\n", - "# 
print(f\"Total YAML files: {total_yaml_files}\")\n", - "# print(f\"Total matches found: {matches_found}\")\n", - "# print(f\"Failed files: {len(failed_files)}\")\n", - "# print(f\"Final unmatched datasets: {len(final_unmatched)}\")\n", - "# print(f\"Unmatched datasets saved to: {output_path}\")\n", - "#\n", - "# return {\n", - "# 'total_files': total_yaml_files,\n", - "# 'total_matches': matches_found,\n", - "# 'failed_files': failed_files,\n", - "# 'matched_datasets': matched_dataset_names,\n", - "# 'unmatched_count': len(final_unmatched),\n", - "# 'deprecated_matched': deprecated_matched,\n", - "# 'git_files': git_files\n", - "# }\n", - "#\n", - "# except Exception as e:\n", - "# print(f\"Error in main processing: {e}\")\n", - "# return None\n" - ], - "id": "b4897b0f1e3c3300", - "outputs": [], - "execution_count": 70 - }, - { - "metadata": { - "ExecuteTime": { - "end_time": "2025-07-09T13:05:25.172175Z", - "start_time": "2025-07-09T13:05:25.160778Z" - } - }, - "cell_type": "code", - "source": [ - "#deprecated version\n", - "from ruamel.yaml import YAML\n", - "from ruamel.yaml.comments import CommentedMap\n", - "\n", - "class OrderedYAML(YAML):\n", - " def __init__(self):\n", - " super().__init__()\n", - " self.preserve_quotes = True\n", - " # self.indent(sequence=2)\n", - " # self.width = 4096\n", - " # self.default_flow_style = False\n", - " # self.map_format = 'rt'\n", - "def update_yaml_with_collabs(yaml_file_path, dataset_name, category):\n", - " \"\"\"Update YAML file using both exact and cleaned name matching\"\"\"\n", - " try:\n", - " with open(yaml_file_path, 'r') as file:\n", - " lines = file.readlines() # Read all lines while preserving format\n", - "\n", - " # Load YAML for name comparison\n", - " yaml = OrderedYAML()\n", - " data = yaml.load(''.join(lines))\n", - "\n", - " yaml_name = data.get('Name', '')\n", - "\n", - " # Try exact match first, then cleaned match\n", - " if yaml_name == dataset_name or clean_name(yaml_name) == 
clean_name(dataset_name):\n", - " if 'DeprecatedNotice' in data and isinstance(data['DeprecatedNotice'], str) and data['DeprecatedNotice'].strip():\n", - " return 'deprecated'\n", - "\n", - " print(f\"\\nMatch found!\")\n", - " print(f\"Excel name: {dataset_name}\")\n", - " print(f\"YAML name: {yaml_name}\")\n", - "\n", - " try:\n", - " updated_lines = []\n", - " modified = False\n", - "\n", - " # Check if Collabs already exists\n", - " has_collabs = 'Collabs:' in ''.join(lines)\n", - "\n", - " # Process lines\n", - " i = 0\n", - " while i < len(lines):\n", - " line = lines[i]\n", - "\n", - " # If we find Tags and Collabs doesn't exist, add Collabs before Tags\n", - " if line.strip().startswith('Tags:') and not has_collabs:\n", - " # Add Collabs section first\n", - " updated_lines.append('Collabs:\\n')\n", - " updated_lines.append(' ASDI:\\n')\n", - " updated_lines.append(f' Tags:\\n')\n", - " updated_lines.append(f' - {category.strip()}\\n')\n", - " modified = True\n", - "\n", - " # Then add Tags section\n", - " updated_lines.append(line)\n", - " else:\n", - " updated_lines.append(line)\n", - " i += 1\n", - "\n", - " # Only write if changes were made\n", - " if modified:\n", - " with open(yaml_file_path, 'w') as outfile:\n", - " outfile.writelines(updated_lines)\n", - " print(f\"Successfully updated {yaml_file_path}\")\n", - " return 'modified'\n", - " else:\n", - " print(f\"No changes needed for {yaml_file_path}\")\n", - " return 'matched'\n", - "\n", - " except Exception as write_error:\n", - " print(f\"Error writing file: {write_error}\")\n", - " raise\n", - "\n", - " except Exception as e:\n", - " print(f\"Error processing {yaml_file_path}: {e}\")\n", - "\n", - " return False\n", - "\n", - "\n", - "def process_yaml_files(dataset_folder, excel_file):\n", - " \"\"\"Process all YAML files and match with Excel data\"\"\"\n", - " total_yaml_files = 0\n", - " processed_files = 0\n", - " matches_found = 0\n", - " matched_dataset_names = []\n", - " failed_files = 
[]\n", - " git_files = [] # Track YAML files to be updated\n", - " deprecated_matched = [] # Track deprecated but matched datasets\n", - " # Read Excel file\n", - " try:\n", - " df = pd.read_excel(excel_file)\n", - " # Ensure the required columns exist\n", - " if 'Dataset' not in df.columns or 'Category' not in df.columns:\n", - " raise ValueError(\"Excel file must contain 'Name' and 'Category' columns\")\n", - "\n", - " # Create dictionary of dataset names and categories\n", - " dataset_dict = dict(zip(df['Dataset'], df['Category']))\n", - " # print(dataset_dict)\n", - "\n", - " yaml_folder_path = Path(dataset_folder)\n", - " if not yaml_folder_path.exists() or not yaml_folder_path.is_dir():\n", - " raise ValueError(f\"Invalid dataset folder path: {dataset_folder}\")\n", - "\n", - " # Count total YAML files\n", - " total_yaml_files = len([f for f in yaml_folder_path.glob('*.yaml')])\n", - " print(\"\\n=== Matching starts ===\")\n", - " for yaml_file in yaml_folder_path.glob('*.yaml'):\n", - " processed_files += 1\n", - "\n", - " try:\n", - " for dataset_name, category in dataset_dict.items():\n", - " match_result = update_yaml_with_collabs(yaml_file, dataset_name, category)\n", - " if match_result == 'deprecated':\n", - " deprecated_matched.append({\n", - " 'dataset': dataset_name,\n", - " 'file': yaml_file.name\n", - " })\n", - " matched_dataset_names.append(dataset_name)\n", - " print(f\"Dataset {dataset_name} matched but marked as deprecated - skipping update\")\n", - " break\n", - " elif match_result == 'modified':\n", - " matches_found += 1\n", - " matched_dataset_names.append(dataset_name)\n", - " git_files.append(yaml_file.name)\n", - " break\n", - " elif match_result == 'matched':\n", - " matches_found += 1\n", - " matched_dataset_names.append(dataset_name)\n", - " break\n", - "\n", - "\n", - "\n", - " except Exception as e:\n", - " print(f\"Error processing {yaml_file}: {e}\")\n", - " failed_files.append(str(yaml_file))\n", - "\n", - "\n", - " # Final 
unmatched datasets\n", - " final_unmatched = df[~df['Dataset'].isin(matched_dataset_names)]\n", - "\n", - " # Save unmatched datasets to Excel\n", - " output_path = 'unmatched_datasets.xlsx'\n", - " final_unmatched.to_excel(output_path, index=False)\n", - "\n", - " # Print final summary\n", - " print(\"\\n=== Final Processing Summary ===\")\n", - " print(f\"Total YAML files: {total_yaml_files}\")\n", - " print(f\"Total matches found: {matches_found}\")\n", - " print(f\"Failed files: {len(failed_files)}\")\n", - " print(f\"Final unmatched datasets: {len(final_unmatched)}\")\n", - " print(f\"Deprecated matches (not updated): {len(deprecated_matched)}\")\n", - " # print(f\"Unmatched datasets saved to: {output_path}\")\n", - " print(f\"\\nTotal Git files to be updated: {len(git_files)}\")\n", - " print(\"\\nDeprecated matched datasets:\")\n", - " for dep in deprecated_matched:\n", - " print(f\"Dataset: {dep['dataset']} - File: {dep['file']}\")\n", - " return {\n", - " 'total_files': total_yaml_files,\n", - " 'total_matches': matches_found,\n", - " 'failed_files': failed_files,\n", - " 'matched_datasets': matched_dataset_names,\n", - " 'unmatched_count': len(final_unmatched),\n", - " 'deprecated_matched': deprecated_matched,\n", - " 'git_files': git_files\n", - " }\n", - "\n", - " except Exception as e:\n", - " print(f\"Error in main processing: {e}\")\n", - " return None\n", - "\n" - ], - "id": "605ffb7dc2eb7cc", - "outputs": [], - "execution_count": 88 - }, - { - "metadata": { - "ExecuteTime": { - "end_time": "2025-07-09T13:23:18.905105Z", - "start_time": "2025-07-09T13:12:32.821944Z" - } - }, - "cell_type": "code", - "source": [ - " # dataset_folder = \"open-data-registry/datasets/\" # Change this to your YAML files folder path\n", - " # /local/home/bshrutiw/open-data-registry-fork\n", - "dataset_folder= \"/local/home/bshrutiw/open-data-registry-fork/datasets\"\n", - "# folder_path = \"open-data-registry/datasets/\"\n", - "excel_file = \"ASDI_adds.xlsx\" # Change 
this to your Excel file path\n", - "\n", - "results=process_yaml_files(dataset_folder, excel_file)\n", - "if results:\n", - " print(\"\\nProcessing completed successfully!\")\n", - " print(f\"Total matches across both passes: {results['total_matches']}\")\n", - " print(f\"Total Git files to update: {len(results['git_files'])}\")\n" - ], - "id": "3a159612bff75e15", - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "=== Matching starts ===\n", - "\n", - "Match found!\n", - "Excel name: Corn Kernel Counting Dataset\n", - "YAML name: Corn Kernel Counting Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/intelinair_corn_kernel_counting.yaml\n", - "\n", - "Match found!\n", - "Excel name: NOAA Historical Maps and Charts\n", - "YAML name: NOAA Historical Maps and Charts\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/noaa-historicalcharts.yaml\n", - "\n", - "Match found!\n", - "Excel name: Community Multiscale Air Quality (CMAQ) 2019 3D Gridded and Column Data\n", - "YAML name: Community Multiscale Air Quality (CMAQ) 2019 3D Gridded and Column data from the EPA's Air Quality Time Series (EQUATES) Project\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/epa-equates-v1.yaml\n", - "\n", - "Match found!\n", - "Excel name: Biodiversity Heritage Library Metadata and Page Images\n", - "YAML name: Biodiversity Heritage Library Metadata and Page Images\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/bhl-open-data.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration\n", - "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (GSM model)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml\n", - "\n", - "Match found!\n", - "Excel 
name: SSL4EO S12 Landsat Multi Product Dataset\n", - "YAML name: SSL4EO S12 Landsat Multi Product Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ssl4eo-multi-product-data.yaml\n", - "\n", - "Match found!\n", - "Excel name: ECMWF real-time forecasts\n", - "YAML name: ECMWF real-time forecasts\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ecmwf-forecasts.yaml\n", - "\n", - "Match found!\n", - "Excel name: Animal Tracking - Acoustic Telemetry - Quality controlled detections\n", - "YAML name: Animal Tracking - Acoustic Telemetry - Quality controlled detections\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Nanoplankton fraction\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Nanoplankton fraction (OC3 model and Brewin et al 2012 algorithm)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: IWMI DIWASA Green ET for Africa\n", - "YAML name: IWMI DIWASA Green ET for Africa\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/green_et.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Optical Water Type (Moore)\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Optical Water Type (Moore et al 2009 algorithm)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_optical_water_type_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Capricorn bunker group site - Wind - Delayed mode\n", - "YAML name: Ocean Radar - Capricorn bunker group site - Wind - Delayed mode\n", - "Successfully updated 
/local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: PROJ datum grids\n", - "YAML name: PROJ datum grids\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/proj-datum-grids.yaml\n", - "\n", - "Match found!\n", - "Excel name: Whiffle WINS50 Open Data on AWS\n", - "YAML name: Whiffle WINS50 Open Data on AWS\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/whiffle-wins50.yaml\n", - "\n", - "Match found!\n", - "Excel name: Earth Radio Occultation\n", - "YAML name: Earth Radio Occultation\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/gnss-ro-opendata.yaml\n", - "\n", - "Match found!\n", - "Excel name: Indiana Statewide Elevation Catalog\n", - "YAML name: Indiana Statewide Elevation Catalog\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/in-elevation.yaml\n", - "\n", - "Match found!\n", - "Excel name: Longitudinal Nutrient Deficiency\n", - "YAML name: Longitudinal Nutrient Deficiency\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/intelinair_longitudinal_nutrient_deficiency.yaml\n", - "\n", - "Match found!\n", - "Excel name: SeeFar V0\n", - "YAML name: SeeFar V0\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/seefar.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Northwest shelf site - Sea Water velocity - Delayed mode\n", - "YAML name: Ocean Radar - Northwest shelf site - Sea water velocity - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml\n", - "Dataset Earth Observation Data Cubes for Brazil matched but marked as deprecated - skipping update\n", - "\n", - "Match found!\n", - "Excel name: Marine Animal - Satellite Relay Tagging - Quality 
controlled profiles\n", - "YAML name: Marine Animal - Satellite Relay Tagging - Quality controlled profiles\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: World Bank Climate Change Knowledge Portal (CCKP)\n", - "YAML name: World Bank Climate Change Knowledge Portal (CCKP)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/wbg-cckp.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Turquoise Coast Site - Sea Water Velocity - Delayed Mode\n", - "YAML name: Ocean Radar - Turquoise coast site - Sea water velocity - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Aurora Multi-Sensor Dataset\n", - "YAML name: Aurora Multi-Sensor Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aurora_msds.yaml\n", - "\n", - "Match found!\n", - "Excel name: Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States\n", - "YAML name: Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/usgs_aqr.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellogic EarthView dataset\n", - "YAML name: Satellogic EarthView dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/satellogic-earthview.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Rottnest shelf site - Wind - Delayed mode\n", - "YAML name: Ocean Radar - Rottnest shelf site - Wind - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Digital Earth 
Pacific Mangroves Extent and Density\n", - "YAML name: Digital Earth Pacific Mangroves Extent and Density\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/dep-mangroves.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Coffs Harbour site - Wave - Delayed mode\n", - "YAML name: Ocean Radar - Coffs Harbour site - Wave - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: VENUS L2A Cloud-Optimized GeoTIFFs\n", - "YAML name: VENUS L2A Cloud-Optimized GeoTIFFs\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/venus-l2a-cogs.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a Concentration\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a concentration (Carder model)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: OceanOmics\n", - "YAML name: OceanOmics\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/oceanomics.yaml\n", - "\n", - "Match found!\n", - "Excel name: ASF SAR Data Products for Disaster Events\n", - "YAML name: ASF SAR Data Products for Disaster Events\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/asf-event-data.yaml\n", - "\n", - "Match found!\n", - "Excel name: The Genome Modeling System\n", - "YAML name: The Genome Modeling System\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/gmsdata.yaml\n", - "\n", - "Match found!\n", - "Excel name: 3000 Rice Genomes Project\n", - "YAML name: 3000 Rice Genomes Project\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/3kricegenome.yaml\n", - "\n", - "Match found!\n", 
- "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (OC3 model and Eppley-VGPM algorithm)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: Central Weather Administration OpenData\n", - "YAML name: Central Weather Administration OpenData\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/cwa_opendata.yaml\n", - "\n", - "Match found!\n", - "Excel name: Toxicant Exposures and Responses by Genomic and Epigenomic Regulators of Transcription (TaRGET)\n", - "YAML name: Toxicant Exposures and Responses by Genomic and Epigenomic Regulators of Transcription (TaRGET)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/targetepigenomics.yaml\n", - "\n", - "Match found!\n", - "Excel name: Wildfire Projections to Support Climate Resilience\n", - "YAML name: Wildfire Projections to Support Climate Resilience\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/caladapt-wildfire-dataset.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ships of Opportunity - Biogeochemical sensors - Delayed mode\n", - "YAML name: Ships of Opportunity - Biogeochemical sensors - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_co2_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: 2021 Amazon Last Mile Routing Research Challenge Dataset\n", - "YAML name: 2021 Amazon Last Mile Routing Research Challenge Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/amazon-last-mile-challenges.yaml\n", - "\n", - "Match found!\n", - "Excel name: Vermont Open Geospatial on AWS\n", - "YAML name: Vermont Open Geospatial on AWS\n", - "No changes needed for 
/local/home/bshrutiw/open-data-registry-fork/datasets/vt-opendata.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a Concentration\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a concentration (GSM model)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: New Zealand Elevation\n", - "YAML name: New Zealand Elevation\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/nz-elevation.yaml\n", - "\n", - "Match found!\n", - "Excel name: IWMI DIWASA Blue ET for Africa\n", - "YAML name: IWMI DIWASA Blue ET for Africa\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/blue_et.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration\n", - "YAML name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration (GSM model)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (GSM model and Eppley-VGPM algorithm)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: Speedtest by Ookla Global Fixed and Mobile Network Performance Maps\n", - "YAML name: Speedtest by Ookla Global Fixed and Mobile Network Performance Maps\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/speedtest-global-performance.yaml\n", - "\n", - "Match found!\n", - "Excel name: stdpopsim Species Resources\n", - "YAML name: stdpopsim 
species resources\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/stdpopsim_kern.yaml\n", - "\n", - "Match found!\n", - "Excel name: Argoverse\n", - "YAML name: Argoverse\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/argoverse.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ships of Opportunity - Fisheries vessels - Real time\n", - "YAML name: Ships of Opportunity - Fisheries vessels - Real time\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_fishsoop_realtime_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Sentinel-1 Mean and Median Annual Mosaic\n", - "YAML name: Sentinel-1 Mean and Median Annual Mosaic\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/dep-s1-annual-mosaics.yaml\n", - "\n", - "Match found!\n", - "Excel name: Nighttime-Fire-Flare\n", - "YAML name: Nighttime-Fire-Flare\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/black_marble_combustion.yaml\n", - "\n", - "Match found!\n", - "Excel name: Gulfwide Avian Colony Monitoring Survey Photos\n", - "YAML name: Gulfwide Avian Colony Monitoring Survey Photos\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/gulfwide-avian-monitoring.yaml\n", - "\n", - "Match found!\n", - "Excel name: NIFS Large Helical Device (LHD) Experiment\n", - "YAML name: NIFS Large Helical Device (LHD) Experiment\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/nifs-lhd.yaml\n", - "\n", - "Match found!\n", - "Excel name: OAQPS 2022 Modeling Platform\n", - "YAML name: OAQPS 2022 Modeling Platform \n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/epa-2022-modeling-platform.yaml\n", - "\n", - "Match found!\n", - "Excel name: IWMI DIWASA Rainfed and Irrigated Cropland Map for Africa\n", - "YAML name: IWMI DIWASA Rainfed and Irrigated Cropland Map for 
Africa\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/cropland_partitioining.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Bonney coast site - Sea Water velocity - Delayed mode\n", - "YAML name: Ocean Radar - Bonney coast site - Sea water velocity - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Moorings - Hourly time-series product\n", - "YAML name: Moorings - Hourly time-series product\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: CitrusFarm Dataset\n", - "YAML name: CitrusFarm Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/citrus-farm.yaml\n", - "\n", - "Match found!\n", - "Excel name: OceanCurrent - Gridded sea level anomaly - Near real time\n", - "YAML name: OceanCurrent - Gridded sea level anomaly - Near real time\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml\n", - "\n", - "Match found!\n", - "Excel name: real-changesets\n", - "YAML name: real-changesets\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/real-changesets.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - SNPP - 1 day - Diffuse attenuation coefficient\n", - "YAML name: Satellite - Ocean Colour - SNPP - 1 day - Diffuse attenuation coefficient (k490)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml\n", - "\n", - "Match found!\n", - "Excel name: Pacific Coastlines Change\n", - "YAML name: Pacific Coastlines Change\n", - "Successfully updated 
/local/home/bshrutiw/open-data-registry-fork/datasets/dep-coastlines.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Newcastle site - Sea Water velocity - Delayed mode\n", - "YAML name: Ocean Radar - Newcastle site - Sea water velocity - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: WIS2 Global Cache on AWS\n", - "YAML name: WIS2 Global Cache on AWS\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/wis2-global-cache.yaml\n", - "\n", - "Match found!\n", - "Excel name: HYbrid Coordinate Ocean Model Global Ocean Forecast System Reanalysis\n", - "YAML name: HYbrid Coordinate Ocean Model Global Ocean Forecast System Reanalysis\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/hycom-gofs-3pt1-reanalysis.yaml\n", - "\n", - "Match found!\n", - "Excel name: SatPM2.5\n", - "YAML name: SatPM2.5\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/surface-pm2-5-v6gl02.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration\n", - "YAML name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration (OC3 model)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration\n", - "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OCI model)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml\n", - "\n", - "Match found!\n", - "Excel name: National Mooring Network - CTD profiles\n", - "YAML name: National Mooring Network - CTD profiles\n", - 
"Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_mooring_ctd_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Wave buoys observations - Real time\n", - "YAML name: Wave buoys observations - Real time\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_wave_buoy_realtime_nonqc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Coral coast site - Sea Water velocity - Delayed mode\n", - "YAML name: Ocean Radar - Coral coast site - Sea water velocity - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: AG-LOAM Dataset\n", - "YAML name: AG-LOAM Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ag-loam.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration\n", - "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OC3 model)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml\n", - "\n", - "Match found!\n", - "Excel name: Tropical Cyclone Precipitation, Infrared, Microwave, and Environmental Dataset (TC PRIMED)\n", - "YAML name: Tropical Cyclone Precipitation, Infrared, Microwave, and Environmental Dataset (TC PRIMED)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/noaa-nesdis-tcprimed-pds.yaml\n", - "\n", - "Match found!\n", - "Excel name: OpenAerialMap on AWS\n", - "YAML name: OpenAerialMap on AWS\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/openaerialmap.yaml\n", - "\n", - "Match found!\n", - "Excel name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", - "YAML name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", - 
"Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/epa-edde-v2.yaml\n", - "\n", - "Match found!\n", - "Excel name: Public Utility Data Liberation Project\n", - "YAML name: Public Utility Data Liberation Project\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/catalyst-cooperative-pudl.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ships of Opportunity - Expendable bathythermographs - Real time\n", - "YAML name: Ships of Opportunity - Expendable bathythermographs - Real time\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_xbt_realtime_nonqc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - South Australian gulfs site - Sea Water velocity - Delayed mode\n", - "YAML name: Ocean Radar - South Australian gulfs site - Sea water velocity - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Landsat Geometric Median and Absolute Deviations (GeoMAD) over the Pacific\n", - "YAML name: Landsat Geometric Median and Absolute Deviations (GeoMAD) over the Pacific.\n", - "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/dep-ls-geomads.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Rottnest shelf site - Wave - Delayed mode\n", - "YAML name: Ocean Radar - Rottnest shelf site - Wave - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Sentinel-2 Geometric Median and Absolute Deviations (GeoMAD)\n", - "YAML name: Sentinel-2 Geometric Median and Absolute Deviations (GeoMAD) over the Pacific\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/dep-s2-geomads.yaml\n", - "\n", - "Match found!\n", - 
"Excel name: Satellite - Ocean Colour - MODIS - 1 day - Diffuse attenuation coefficient\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Diffuse attenuation coefficient (k490)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: High resolution, annual cropland and landcover maps for selected African countries\n", - "YAML name: High resolution, annual cropland and landcover maps for selected African countries\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/mapping-africa.yaml\n", - "\n", - "Match found!\n", - "Excel name: NOAA Whole Atmosphere Model-Ionosphere Plasmasphere Electrodynamics (WAM-IPE)\n", - "YAML name: NOAA Whole Atmosphere Model-Ionosphere Plasmasphere Electrodynamics (WAM-IPE) Forecast System (WFS)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/noaa-nws-wam-ipe.yaml\n", - "\n", - "Match found!\n", - "Excel name: Open Food Facts Images\n", - "YAML name: Open Food Facts Images\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/openfoodfacts-images.yaml\n", - "\n", - "Match found!\n", - "Excel name: State of Colorado Imagery\n", - "YAML name: State of Colorado Imagery\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/colorado-imagery.yaml\n", - "\n", - "Match found!\n", - "Excel name: Boreas Autonomous Driving Dataset\n", - "YAML name: Boreas Autonomous Driving Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/boreas.yaml\n", - "\n", - "Match found!\n", - "Excel name: NOAA Space Weather Forecast and Observation Data\n", - "YAML name: NOAA Space Weather Forecast and Observation Data\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/noaa-space-weather.yaml\n", - "\n", - "Match found!\n", - "Excel name: 
Chalmers Cloud Ice Climatology (CCIC)\n", - "YAML name: Chalmers Cloud Ice Climatology\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ccic.yaml\n", - "\n", - "Match found!\n", - "Excel name: CESM-HR\n", - "YAML name: CESM-HR\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/cesm-hr.yaml\n", - "\n", - "Match found!\n", - "Excel name: New Zealand Imagery\n", - "YAML name: New Zealand Imagery\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/nz-imagery.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Altimetry calibration and validation\n", - "YAML name: Satellite - Altimetry calibration and validation\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_mooring_satellite_altimetry_calibration_validation.yaml\n", - "\n", - "Match found!\n", - "Excel name: IGP Coal Plant\n", - "YAML name: IGP Coal Plant\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/asset-data-igp-coal-plant.yaml\n", - "\n", - "Match found!\n", - "Excel name: ERA5-for-WRF Open Data on AWS\n", - "YAML name: ERA5-for-WRF Open Data on AWS\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/era5-for-wrf.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Coffs Harbour site - Sea Water velocity - Delayed mode\n", - "YAML name: Ocean Radar - Coffs Harbour site - Sea water velocity - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: All-Sky Data | Wide-field Infrared Survey Explorer (WISE)\n", - "YAML name: All-Sky Data | Wide-field Infrared Survey Explorer (WISE)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/wise-allsky.yaml\n", - "\n", - "Match found!\n", - "Excel name: Global 30m Height Above 
Nearest Drainage (HAND)\n", - "YAML name: Global 30m Height Above Nearest Drainage (HAND)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/glo-30-hand.yaml\n", - "\n", - "Match found!\n", - "Excel name: Blended TROPOMI+GOSAT Satellite Data Product for Atmospheric Methane\n", - "YAML name: Blended TROPOMI+GOSAT Satellite Data Product for Atmospheric Methane\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/blended-tropomi-gosat-methane.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ships of Opportunity - Expendable bathythermographs - Delayed mode\n", - "YAML name: Ships of Opportunity - Expendable bathythermographs - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_xbt_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: OS-Climate Physrisk\n", - "YAML name: OS-Climate Physrisk\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/os-climate-physrisk.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Rottnest shelf site - Sea Water velocity - Delayed mode\n", - "YAML name: Ocean Radar - Rottnest shelf site - Sea water velocity - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: GEOGLOWS Hydrological Model Version 2\n", - "YAML name: GEOGLOWS Hydrological Model Version 2\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/geoglows-v2.yaml\n", - "\n", - "Match found!\n", - "Excel name: Sofar Spotter Archive\n", - "YAML name: Sofar Spotter Archive\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/sofar-spotter-archive.yaml\n", - "\n", - "Match found!\n", - "Excel name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", - "YAML name: EPA Dynamically Downscaled Ensemble 
(EDDE) Version 1\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/epa-edde-v1.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration\n", - "YAML name: Satellite - Ocean Colour - SNPP - 1 day - Chlorophyll-a concentration (OCI model)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a Concentration\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a concentration (OC3 model)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Picoplankton fraction\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Picoplankton fraction (OC3 model and Brewin et al 2012 algorithm)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: RACECAR Dataset\n", - "YAML name: RACECAR Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/racecar-dataset.yaml\n", - "\n", - "Match found!\n", - "Excel name: PALSAR-2 ScanSAR Flooding in Rwanda (L2.1)\n", - "YAML name: PALSAR-2 ScanSAR Flooding in Rwanda (L2.1)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/palsar-2-scansar-flooding-in-rwanda.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Gliders - Delayed mode\n", - "YAML name: Ocean Gliders - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_slocum_glider_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Diffuse 
attenuation coefficient\n", - "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Diffuse attenuation coefficient (k490)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a Concentration\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Chlorophyll-a concentration (OCI model)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: Inter-mission Time Series of Land Ice Velocity and Elevation (ITS_LIVE)\n", - "YAML name: Inter-mission Time Series of Land Ice Velocity and Elevation (ITS_LIVE)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/its-live-data.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - South Australian gulfs site - Wave - Delayed mode\n", - "YAML name: Ocean Radar - South Australian gulfs site - Wave - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Indiana Statewide Digital Aerial Imagery Catalog\n", - "YAML name: Indiana Statewide Digital Aerial Imagery Catalog\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/in-imagery.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ford Multi-AV Seasonal Dataset\n", - "YAML name: Ford Multi-AV Seasonal Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ford-multi-av-seasonal.yaml\n", - "\n", - "Match found!\n", - "Excel name: Open-Meteo Weather API Database\n", - "YAML name: Open-Meteo Weather API Database\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/open-meteo.yaml\n", - "\n", - 
"Match found!\n", - "Excel name: Low Altitude Disaster Imagery (LADI) Dataset\n", - "YAML name: Low Altitude Disaster Imagery (LADI) Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ladi.yaml\n", - "\n", - "Match found!\n", - "Excel name: A Global Drought and Flood Catalogue from 1950 to 2016\n", - "YAML name: A Global Drought and Flood Catalogue from 1950 to 2016\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/global-drought-flood-catalogue.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Biodiversity Information System (OBIS) species occurrence data\n", - "YAML name: Ocean Biodiversity Information System (OBIS) species occurrence data\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/obis.yaml\n", - "\n", - "Match found!\n", - "Excel name: Danish Meteorological Institute (DMI) Open Data Forecasts\n", - "YAML name: Danish Meteorological Institute (DMI) Open Data Forecasts\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/dmi-opendata.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Capricorn bunker group site - Wave - Delayed mode\n", - "YAML name: Ocean Radar - Capricorn bunker group site - Wave - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: KyFromAbove on AWS\n", - "YAML name: KyFromAbove on AWS\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/kyfromabove.yaml\n", - "\n", - "Match found!\n", - "Excel name: New York City Taxi and Limousine Commission (TLC) Trip Record Data\n", - "YAML name: New York City Taxi and Limousine Commission (TLC) Trip Record Data\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/nyc-tlc-trip-records-pds.yaml\n", - "\n", - "=== Final Processing Summary ===\n", - 
"Total YAML files: 748\n", - "Total matches found: 126\n", - "Failed files: 0\n", - "Final unmatched datasets: 33\n", - "Deprecated matches (not updated): 1\n", - "\n", - "Total Git files to be updated: 124\n", - "\n", - "Deprecated matched datasets:\n", - "Dataset: Earth Observation Data Cubes for Brazil - File: brazil-data-cubes.yaml\n", - "\n", - "Processing completed successfully!\n", - "Total matches across both passes: 126\n", - "Total Git files to update: 124\n" - ] - } - ], - "execution_count": 90 - }, - { - "metadata": { - "ExecuteTime": { - "end_time": "2025-07-09T13:27:29.807756Z", - "start_time": "2025-07-09T13:27:29.805506Z" - } - }, - "cell_type": "code", - "source": [ - "#######\n", - "###Final Processing Summary ===\n", - "# Total YeifjcbflkftbifjvrdlncAML files: 748\n", - "# Total matches found: 126\n", - "# Failed files: 0\n", - "# Final unmatched datasets: 33\n", - "# Deprecated matches (not updated): 1\n", - "#\n", - "# Total Git files to be updated: 124\n", - "#\n", - "# Deprecated matched datasets:\n", - "# Dataset: Earth Observation Data Cubes for Brazil - File: brazil-data-cubes.yaml\n", - "#\n", - "# Processing completed successfully!\n", - "# Total matches across both passes: 126\n", - "# Total Git files to update: 124\n" - ], - "id": "33969aa18854426", - "outputs": [], - "execution_count": 91 - }, - { - "metadata": {}, - "cell_type": "code", - "outputs": [], - "execution_count": null, - "source": "\n", - "id": "e569cc3d00979bfc" - }, - { - "metadata": { - "ExecuteTime": { - "end_time": "2025-07-09T13:56:04.589833Z", - "start_time": "2025-07-09T13:53:32.890359Z" - } - }, - "cell_type": "code", - "source": [ - " # dataset_folder = \"open-data-registry/datasets/\" # Change this to your YAML files folder path\n", - " # /local/home/bshrutiw/open-data-registry-fork\n", - "dataset_folder= \"/local/home/bshrutiw/open-data-registry-fork/datasets\"\n", - "# folder_path = \"open-data-registry/datasets/\"\n", - "excel_file = 
\"ASDI_cleaned_unmatched.xlsx\" # Change this to your Excel file path\n", - "\n", - "results=process_yaml_files(dataset_folder, excel_file)\n", - "if results:\n", - " print(\"\\nProcessing completed successfully!\")\n", - " print(f\"Total matches across both passes: {results['total_matches']}\")\n", - " print(f\"Total Git files to update: {len(results['git_files'])}\")\n", - "\n" - ], - "id": "32c9fe7ccfeaa4d4", - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "=== Matching starts ===\n", - "\n", - "Match found!\n", - "Excel name: Digital Earth Pacific Water Observatins from Space (WOfS)\n", - "YAML name: Digital Earth Pacific Water Observatins from Space (WOfS)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/dep-wofs.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OC3\n", - " model)\n", - "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (GSM model)\n", - "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml\n", - "\n", - "Match found!\n", - "Excel name: Animal Tracking - Acoustic Telemetry - Quality controlled detections\n", - "YAML name: Animal Tracking - Acoustic Telemetry - Quality controlled detections\n", - "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Sea surface temperature - Level 3 - Single sensor - 6 day - Day\n", - " and night time\n", - "YAML name: Satellite - Sea surface temperature - Level 3 - Single sensor - 6 day - Day and night time\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Sea surface 
temperature - Level 4 - Multi sensor - Global Australian\n", - "YAML name: Satellite - Sea surface temperature - Level 4 - Multi sensor - Global Australian\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml\n", - "\n", - "Match found!\n", - "Excel name: Pohang Canal Dataset: A Multimodal Maritime Dataset for Autonomous Navigation in Restricted Waters\n", - "YAML name: Pohang Canal Dataset: A Multimodal Maritime Dataset for Autonomous Navigation in Restricted Waters\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/pohang-canal-dataset.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Sea surface temperature - Level 3 - Multi sensor - 1 day - Day and night time\n", - "YAML name: Satellite - Sea surface temperature - Level 3 - Multi sensor - 1 day - Day and night time\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml\n", - "\n", - "Match found!\n", - "Excel name: Sentinel Near Real-time Canada Mirror | Miroir Sentinel temps quasi réel du Canada\n", - "YAML name: Sentinel Near Real-time Canada Mirror | Miroir Sentinel temps quasi réel du Canada\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/sentinel-products-ca-mirror.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ensemble Meteorological Dataset for Planet Earth, EM-Earth\n", - "YAML name: Ensemble Meteorological Dataset for Planet Earth, EM-Earth\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/emearth.yaml\n", - "\n", - "Match found!\n", - "Excel name: Hybrid statistical-dynamic downscaling based on multi-model ensembles in Southeast Asia\n", - "YAML name: Hybrid statistical-dynamic downscaling based on multi-model ensembles in Southeast Asia\n", - "Successfully updated 
/local/home/bshrutiw/open-data-registry-fork/datasets/cmip6-era5-hybrid-southeast-asia.yaml\n", - "\n", - "Match found!\n", - "Excel name: PALSAR-2 ScanSAR Tropical Cycolne Mocha (L2.1)\n", - "YAML name: PALSAR-2 ScanSAR Tropical Cycolne Mocha (L2.1)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/palsar-2-scansar-flooding-in-bangladesh.yaml\n", - "\n", - "Match found!\n", - "Excel name: Sub-Meter Canopy Tree Height of California in 2020 by CTrees.org\n", - "YAML name: Sub-Meter Canopy Tree Height of California in 2020 by CTrees.org\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/ctrees-california-vhr-tree-height.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (GSM model\n", - " and Eppley-VGPM algorithm)\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (OC3 model and Eppley-VGPM algorithm)\n", - "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ships of Opportunity - Tropical research vessels - Real time\n", - "YAML name: Ships of Opportunity - Tropical research vessels - Real time\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_trv_realtime_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Sea surface temperature - Level 3 - Single sensor - 1 day - Day and night time - Southern Ocean\n", - "YAML name: Satellite - Sea surface temperature - Level 3 - Single sensor - 1 day - Day and night time - Southern Ocean\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.yaml\n", - "\n", - "Match found!\n", - "Excel name: USGS COAWST (Coupled Ocean Atmosphere Wave and Sediment Transport) Forecast Model 
Archive, US East and Gulf Coasts\n", - "YAML name: USGS COAWST (Coupled Ocean Atmosphere Wave and Sediment Transport) Forecast Model Archive, US East and Gulf Coasts\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/coawst.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ships of Opportunity - Air-sea fluxes - Meteorological and flux - Delayed mode\n", - "YAML name: Ships of Opportunity - Air-sea fluxes - Meteorological and flux - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_air_sea_flux_product_delayed.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (GSM model\n", - " and Eppley-VGPM algorithm)\n", - "YAML name: Satellite - Ocean Colour - MODIS - 1 day - Net Primary Productivity (GSM model and Eppley-VGPM algorithm)\n", - "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml\n", - "\n", - "Match found!\n", - "Excel name: HYCOM-OceanTrack Integrated HYCOM Eulerian Fields and Lagrangian Trajectories Dataset\n", - "YAML name: HYCOM-OceanTrack Integrated HYCOM Eulerian Fields and Lagrangian Trajectories Dataset\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/hycom-global-drifters.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ships of Opportunity - Sea surface temperature - 1-minute average data products\n", - "YAML name: Ships of Opportunity - Sea surface temperature - 1-minute average data products\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_vessel_sst_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Sentinel-1 Precise Orbit Determination (POD) Products\n", - "YAML name: Sentinel-1 Precise Orbit Determination (POD) Products\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/s1-orbits.yaml\n", - 
"\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OC3\n", - " model)\n", - "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OCI model)\n", - "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Sea surface temperature - Level 4 - Multi sensor - Regional Australian\n", - "YAML name: Satellite - Sea surface temperature - Level 4 - Multi sensor - Regional Australian\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml\n", - "\n", - "Match found!\n", - "Excel name: National Mooring Network - CTD profiles\n", - "YAML name: National Mooring Network - CTD profiles\n", - "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_mooring_ctd_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OC3\n", - " model)\n", - "YAML name: Satellite - Ocean Colour - NOAA20 - 1 day - Chlorophyll-a concentration (OC3 model)\n", - "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml\n", - "\n", - "Match found!\n", - "Excel name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", - "YAML name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", - "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/epa-edde-v2.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - South Australian gulfs site - Sea water velocity - Delayed mode\n", - "YAML name: Ocean Radar - South Australian gulfs site - Sea water velocity - Delayed mode\n", - "No changes needed for 
/local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: Ocean Radar - Capricorn bunker group site - Sea water velocity - Delayed mode\n", - "YAML name: Ocean Radar - Capricorn bunker group site - Sea water velocity - Delayed mode\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_radar_capricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml\n", - "\n", - "Match found!\n", - "Excel name: NSF NCAR Curated ECMWF Reanalysis 5 (ERA5)\n", - "YAML name: NSF NCAR Curated ECMWF Reanalysis 5 (ERA5)\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/nsf-ncar-era5.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Sea surface temperature - Level 3 - Multi sensor - 3 day - Day and night time\n", - "YAML name: Satellite - Sea surface temperature - Level 3 - Multi sensor - 3 day - Day and night time\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml\n", - "\n", - "Match found!\n", - "Excel name: EPA Dynamically Downscaled Ensemble (EDDE) Version 2\n", - "YAML name: EPA Dynamically Downscaled Ensemble (EDDE) Version 1\n", - "No changes needed for /local/home/bshrutiw/open-data-registry-fork/datasets/epa-edde-v1.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Sea surface temperature - Level 3 - Single sensor - Himawari-8 - 1 day - Night time\n", - "YAML name: Satellite - Sea surface temperature - Level 3 - Single sensor - Himawari-8 - 1 day - Night time\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml\n", - "\n", - "Match found!\n", - "Excel name: A region-wide, multi-year set of crop field boundary labels for Africa\n", - "YAML name: A region-wide, multi-year set of crop field 
boundary labels for Africa\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/africa-field-boundary-labels.yaml\n", - "\n", - "Match found!\n", - "Excel name: RCM CEOS Analysis Ready Data | Données prêtes à l'analyse du CEOS pour le MCR\n", - "YAML name: RCM CEOS Analysis Ready Data | Données prêtes à l'analyse du CEOS pour le MCR\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/rcm-ceos-ard.yaml\n", - "\n", - "Match found!\n", - "Excel name: Satellite - Sea surface temperature - Level 3 - Single sensor - 1 month - Day time\n", - "YAML name: Satellite - Sea surface temperature - Level 3 - Single sensor - 1 month - Day time\n", - "Successfully updated /local/home/bshrutiw/open-data-registry-fork/datasets/aodn_satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml\n", - "\n", - "=== Final Processing Summary ===\n", - "Total YAML files: 748\n", - "Total matches found: 35\n", - "Failed files: 0\n", - "Final unmatched datasets: 2\n", - "Deprecated matches (not updated): 0\n", - "\n", - "Total Git files to be updated: 25\n", - "\n", - "Deprecated matched datasets:\n", - "\n", - "Processing completed successfully!\n", - "Total matches across both passes: 35\n", - "Total Git files to update: 25\n" - ] - } - ], - "execution_count": 93 - }, - { - "metadata": {}, - "cell_type": "code", - "outputs": [], - "execution_count": null, - "source": [ - "# === Final Processing Summary ===\n", - "# Total YAML files: 748\n", - "# Total matches found: 35\n", - "# Failed files: 0\n", - "# Final unmatched datasets: 2\n", - "# Deprecated matches (not updated): 0\n", - "#\n", - "# Total Git files to be updated: 25\n", - "#\n", - "# Deprecated matched datasets:\n", - "#\n", - "# Processing completed successfully!\n", - "# Total matches across both passes: 35\n", - "# Total Git files to update: 25" - ], - "id": "afe1d28ac7b91d15" - }, - { - "metadata": { - "ExecuteTime": { - "end_time": "2025-07-09T13:27:38.540829Z", - 
"start_time": "2025-07-09T13:27:38.537831Z" - } - }, - "cell_type": "code", - "source": [ - "print(\"Git files to update:\")\n", - "for file in results['git_files']:\n", - " print(file)" - ], - "id": "456c408a11b90883", - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Git files to update:\n", - "intelinair_corn_kernel_counting.yaml\n", - "noaa-historicalcharts.yaml\n", - "epa-equates-v1.yaml\n", - "bhl-open-data.yaml\n", - "aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml\n", - "ssl4eo-multi-product-data.yaml\n", - "ecmwf-forecasts.yaml\n", - "aodn_animal_acoustic_tracking_delayed_qc.yaml\n", - "aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml\n", - "green_et.yaml\n", - "aodn_satellite_optical_water_type_1day_aqua.yaml\n", - "aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml\n", - "proj-datum-grids.yaml\n", - "whiffle-wins50.yaml\n", - "gnss-ro-opendata.yaml\n", - "in-elevation.yaml\n", - "intelinair_longitudinal_nutrient_deficiency.yaml\n", - "seefar.yaml\n", - "aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml\n", - "aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml\n", - "wbg-cckp.yaml\n", - "aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml\n", - "aurora_msds.yaml\n", - "usgs_aqr.yaml\n", - "satellogic-earthview.yaml\n", - "aodn_radar_rottnestshelf_wind_delayed_qc.yaml\n", - "dep-mangroves.yaml\n", - "aodn_radar_coffsharbour_wave_delayed_qc.yaml\n", - "venus-l2a-cogs.yaml\n", - "aodn_satellite_chlorophylla_carder_1day_aqua.yaml\n", - "oceanomics.yaml\n", - "asf-event-data.yaml\n", - "gmsdata.yaml\n", - "3kricegenome.yaml\n", - "aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml\n", - "cwa_opendata.yaml\n", - "targetepigenomics.yaml\n", - "caladapt-wildfire-dataset.yaml\n", - "aodn_vessel_co2_delayed_qc.yaml\n", - "amazon-last-mile-challenges.yaml\n", - "aodn_satellite_chlorophylla_gsm_1day_aqua.yaml\n", - "nz-elevation.yaml\n", - "blue_et.yaml\n", - 
"aodn_satellite_chlorophylla_gsm_1day_snpp.yaml\n", - "aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml\n", - "speedtest-global-performance.yaml\n", - "stdpopsim_kern.yaml\n", - "argoverse.yaml\n", - "aodn_vessel_fishsoop_realtime_qc.yaml\n", - "dep-s1-annual-mosaics.yaml\n", - "black_marble_combustion.yaml\n", - "gulfwide-avian-monitoring.yaml\n", - "nifs-lhd.yaml\n", - "epa-2022-modeling-platform.yaml\n", - "cropland_partitioining.yaml\n", - "aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml\n", - "aodn_mooring_hourly_timeseries_delayed_qc.yaml\n", - "citrus-farm.yaml\n", - "aodn_model_sea_level_anomaly_gridded_realtime.yaml\n", - "real-changesets.yaml\n", - "aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml\n", - "dep-coastlines.yaml\n", - "aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml\n", - "wis2-global-cache.yaml\n", - "hycom-gofs-3pt1-reanalysis.yaml\n", - "surface-pm2-5-v6gl02.yaml\n", - "aodn_satellite_chlorophylla_oc3_1day_snpp.yaml\n", - "aodn_satellite_chlorophylla_oci_1day_noaa20.yaml\n", - "aodn_mooring_ctd_delayed_qc.yaml\n", - "aodn_wave_buoy_realtime_nonqc.yaml\n", - "aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml\n", - "ag-loam.yaml\n", - "aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml\n", - "noaa-nesdis-tcprimed-pds.yaml\n", - "openaerialmap.yaml\n", - "epa-edde-v2.yaml\n", - "catalyst-cooperative-pudl.yaml\n", - "aodn_vessel_xbt_realtime_nonqc.yaml\n", - "aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml\n", - "aodn_radar_rottnestshelf_wave_delayed_qc.yaml\n", - "dep-s2-geomads.yaml\n", - "aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml\n", - "mapping-africa.yaml\n", - "noaa-nws-wam-ipe.yaml\n", - "openfoodfacts-images.yaml\n", - "colorado-imagery.yaml\n", - "boreas.yaml\n", - "noaa-space-weather.yaml\n", - "ccic.yaml\n", - "cesm-hr.yaml\n", - "nz-imagery.yaml\n", - "aodn_mooring_satellite_altimetry_calibration_validation.yaml\n", - 
"asset-data-igp-coal-plant.yaml\n", - "era5-for-wrf.yaml\n", - "aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml\n", - "wise-allsky.yaml\n", - "glo-30-hand.yaml\n", - "blended-tropomi-gosat-methane.yaml\n", - "aodn_vessel_xbt_delayed_qc.yaml\n", - "os-climate-physrisk.yaml\n", - "aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml\n", - "geoglows-v2.yaml\n", - "sofar-spotter-archive.yaml\n", - "epa-edde-v1.yaml\n", - "aodn_satellite_chlorophylla_oci_1day_snpp.yaml\n", - "aodn_satellite_chlorophylla_oc3_1day_aqua.yaml\n", - "aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml\n", - "racecar-dataset.yaml\n", - "palsar-2-scansar-flooding-in-rwanda.yaml\n", - "aodn_slocum_glider_delayed_qc.yaml\n", - "aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml\n", - "aodn_satellite_chlorophylla_oci_1day_aqua.yaml\n", - "its-live-data.yaml\n", - "aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml\n", - "in-imagery.yaml\n", - "ford-multi-av-seasonal.yaml\n", - "open-meteo.yaml\n", - "ladi.yaml\n", - "global-drought-flood-catalogue.yaml\n", - "obis.yaml\n", - "dmi-opendata.yaml\n", - "aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml\n", - "kyfromabove.yaml\n", - "nyc-tlc-trip-records-pds.yaml\n" - ] - } - ], - "execution_count": 92 - }, - { - "metadata": {}, - "cell_type": "code", - "outputs": [], - "execution_count": null, - "source": [ - " #git add all the files\n", - "import os\n", - "\n", - "# Get the current working directory\n", - "print(os.getcwd())\n", - "\n", - "# Change the working directory\n", - "os.chdir(os.getcwd() + \"/open-data-registry\")\n", - "\n", - "# Verify the change\n", - "print(os.getcwd())" - ], - "id": "c060a91e00e23967" - }, - { - "metadata": {}, - "cell_type": "code", - "outputs": [], - "execution_count": null, - "source": [ - "# add all our changes to be tracked by git\n", - "for file in results['git_files']:\n", - "\tos.system(\"git add {}\".format(file))\n", - "\tprint(\"git add 
{}\".format(file))\n", - "\n", - "\n", - "# git commit the change to the branch\n", - "os.system('git commit -m \\\"bulk add ASDI tags\\\"')\n", - "\n", - "print(\" \")\n", - "print(\"Done. Run the following command to push your changes:\")\n", - "##gh pr create --bulk_tag_ASDI \"adding ASDI tags in bulk\" --draft" - ], - "id": "cfc61ec2a240441c" - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 2 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython2", - "version": "2.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} From 1066eb5b75028498d4a12b296f9a1c9e43125da8 Mon Sep 17 00:00:00 2001 From: bshrutiw Date: Wed, 9 Jul 2025 11:57:43 -0700 Subject: [PATCH 099/751] Update asf-event-data.yaml disaster response update --- datasets/asf-event-data.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/asf-event-data.yaml b/datasets/asf-event-data.yaml index 11c74da9c..163163126 100644 --- a/datasets/asf-event-data.yaml +++ b/datasets/asf-event-data.yaml @@ -13,7 +13,7 @@ UpdateFrequency: > Collabs: ASDI: Tags: - - disaster + - disaster response Tags: - aws-pds - disaster response From 3cea3fcdd18f5f00208c3667d5059d1a0c279b67 Mon Sep 17 00:00:00 2001 From: bshrutiw Date: Wed, 9 Jul 2025 11:58:36 -0700 Subject: [PATCH 100/751] Update glo-30-hand.yaml disaster response tag update --- datasets/glo-30-hand.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/glo-30-hand.yaml b/datasets/glo-30-hand.yaml index 072b1356e..95458341f 100644 --- a/datasets/glo-30-hand.yaml +++ b/datasets/glo-30-hand.yaml @@ -14,7 +14,7 @@ UpdateFrequency: > Collabs: ASDI: Tags: - - disaster + - disaster response Tags: - aws-pds - elevation From d4e7f3a5c3290877be4a879fb8816dc421b882c2 Mon Sep 17 00:00:00 
2001 From: bshrutiw Date: Wed, 9 Jul 2025 12:00:30 -0700 Subject: [PATCH 101/751] Update global-drought-flood-catalogue.yaml --- datasets/global-drought-flood-catalogue.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/global-drought-flood-catalogue.yaml b/datasets/global-drought-flood-catalogue.yaml index 6837fd190..2c7c53835 100644 --- a/datasets/global-drought-flood-catalogue.yaml +++ b/datasets/global-drought-flood-catalogue.yaml @@ -8,7 +8,7 @@ UpdateFrequency: No future updates planned. Collabs: ASDI: Tags: - - disaster + - disaster response Tags: - aws-pds - floods From da0714eb1e7d39dc04d54b51d4bb697f23c90215 Mon Sep 17 00:00:00 2001 From: bshrutiw Date: Wed, 9 Jul 2025 12:08:16 -0700 Subject: [PATCH 102/751] Update ladi.yaml --- datasets/ladi.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ladi.yaml b/datasets/ladi.yaml index 82f51684e..06771af0f 100644 --- a/datasets/ladi.yaml +++ b/datasets/ladi.yaml @@ -8,7 +8,7 @@ License: Creative Commons Attribution 4.0 International (CC BY 4.0) Collabs: ASDI: Tags: - - disaster + - disaster response Tags: - aws-pds - aerial imagery From eb9b70b0d3aef75518c03d9492be461bedd5603d Mon Sep 17 00:00:00 2001 From: bshrutiw Date: Wed, 9 Jul 2025 12:09:09 -0700 Subject: [PATCH 103/751] Update openaerialmap.yaml disaster response tag update --- datasets/openaerialmap.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/openaerialmap.yaml b/datasets/openaerialmap.yaml index 0f1be46ac..e09392148 100644 --- a/datasets/openaerialmap.yaml +++ b/datasets/openaerialmap.yaml @@ -7,7 +7,7 @@ UpdateFrequency: New imagery is added as soon as it is uploaded by community con Collabs: ASDI: Tags: - - disaster + - disaster response Tags: - satellite imagery - aerial imagery From 434e838637d2a4a5264bceb4243bb024887b5823 Mon Sep 17 00:00:00 2001 From: bshrutiw Date: Wed, 9 Jul 2025 12:09:46 -0700 Subject: [PATCH 104/751] Update 
palsar-2-scansar-flooding-in-rwanda.yaml disaster response tag update --- datasets/palsar-2-scansar-flooding-in-rwanda.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/palsar-2-scansar-flooding-in-rwanda.yaml b/datasets/palsar-2-scansar-flooding-in-rwanda.yaml index 18faf5bb1..6bb81e5e7 100644 --- a/datasets/palsar-2-scansar-flooding-in-rwanda.yaml +++ b/datasets/palsar-2-scansar-flooding-in-rwanda.yaml @@ -8,7 +8,7 @@ Contact: aproject@jaxa.jp Collabs: ASDI: Tags: - - disaster + - disaster response Tags: - aws-pds - agriculture From b3c523918485f53b3841e5a4122fbeba8b8dbaae Mon Sep 17 00:00:00 2001 From: bshrutiw Date: Wed, 9 Jul 2025 12:10:42 -0700 Subject: [PATCH 105/751] Update s1-orbits.yaml disaster response tag update --- datasets/s1-orbits.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/s1-orbits.yaml b/datasets/s1-orbits.yaml index 9fd6cedcf..3b968d4ff 100644 --- a/datasets/s1-orbits.yaml +++ b/datasets/s1-orbits.yaml @@ -17,7 +17,7 @@ UpdateFrequency: > Collabs: ASDI: Tags: - - disaster + - disaster response Tags: - auxiliary data - disaster response From 37f82fa724ca586b37aadf6e7469c9bfbecc39fe Mon Sep 17 00:00:00 2001 From: Danika MacDonell Date: Wed, 9 Jul 2025 23:17:28 -0600 Subject: [PATCH 106/751] Adds yaml describing GeoJSON files for Geo-TIDE Adds a yaml file entry for GeoJSON files used by the MCSC's Geospatial Trucking Industry Decarbonization Explorer (Geo-TIDE) --- datasets/geo_tide_geojsons.yaml | 55 +++++++++++++++++++++++++++++++++ 1 file changed, 55 insertions(+) create mode 100644 datasets/geo_tide_geojsons.yaml diff --git a/datasets/geo_tide_geojsons.yaml b/datasets/geo_tide_geojsons.yaml new file mode 100644 index 000000000..e7ddf8eff --- /dev/null +++ b/datasets/geo_tide_geojsons.yaml @@ -0,0 +1,55 @@ +Name: "GeoJSON Files for Geo-TIDE" +Description: "GeoJSON files for the MIT Climate & Sustainability Consortium's Geospatial Trucking Industry Decarbonization Explorer" 
+Documentation: https://github.com/mcsc-impact-climate/Geo-TIDE-datasets +Contact: mcsc@mit.edu +ManagedBy: MIT Climate & Sustainability Consortium +UpdateFrequency: Quarterly +Tags: + - Fleet Transition + - Decision Support + - Decarbonization + - Geospatial + - Trucking + - Alternative energy +License: Creative Commons Attribution 4.0 International +Citation: "Eamer, D., Borrero, M., Bashir, N., & MIT Climate & Sustainability Consortium. (2025). GeoJSON files for the MCSC's Trucking Industry Decarbonization Explorer (Geo-TIDE) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.15851359" +Resources: + - Description: GeoJSON Files for Geo-TIDE + ARN: arn:aws:s3:::mcsc-datahub-public/geojsons_simplified/ + Region: US West (Oregon) us-west-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Geo-TIDE access and getting-started exercises + URL: https://danikam16.wixsite.com/mysite/post/accessing-and-using-the-mcsc-s-interactive-geospatial-decision-support-tool-for-trucking-fleet-decar + AuthorName: Danika Eamer and Helena De Figueiredo Valente + AuthorURL: https://github.com/danikam + Services: Amazon S3 + - Title: Which logistics facilities should a return-to-base carrier target for fleet electrification and chargers? + URL: https://docs.google.com/presentation/d/e/2PACX-1vQZccVHZVT1QRNdhCRRI810UxGvCD3hJhxIE4CzBDbhNr9iecHV5lp2Rv87x6rik1wrCiXXUq0WfuBk/pub + AuthorName: Danika Eamer + AuthorURL: https://github.com/danikam + Services: Amazon S3 + - Title: Which routes should a dry-van carrier prioritize for investment in battery electric or hydrogen trucks? 
+ URL: https://danikam16.wixsite.com/mysite/post/user-case-studies-for-interactive-geospatial-trucking-fleet-decision-support + AuthorName: Danika Eamer and Helena De Figueiredo Valente + AuthorURL: https://github.com/danikam + Services: Amazon S3 + Tools & Applications: + - Title: MCSC Geospatial Trucking Industry Decarbonization Explorer (Geo-TIDE) + URL: https://climatedata.mit.edu/faf5/transportation + AuthorName: Danika Eamer, Brilant Kasami, Brooke Bao, and MIT Climate & Sustainability Consortium + AuthorURL: https://impactclimate.mit.edu + Publications: + - Title: Geospatial Trucking Industry Decarbonization Explorer (Geo-TIDE): Technical Guide and Methodology + URL: https://dspace.mit.edu/handle/1721.1/159069 + AuthorName: Eamer, D., Borrero, M., Bao, B., Kasami, B., and De Figueiredo Valente, H. + AuthorURL: https://impactclimate.mit.edu + - Title: Thought Experiment to Explore Potential Savings from Pooled Charging Infrastructure Investment + URL: https://dspace.mit.edu/handle/1721.1/153617 + AuthorName: Eamer, D. and Borrero, M. 
+ AuthorURL: https://impactclimate.mit.edu +ADXCategories: + - Resources Data + - Environmental Data + - Automotive Data From 1946f14b64215022a894c65fd11fd63898b92b2a Mon Sep 17 00:00:00 2001 From: Danika MacDonell Date: Wed, 9 Jul 2025 23:28:24 -0600 Subject: [PATCH 107/751] yaml syntax fix in geo_tide_geojsons.yaml --- datasets/geo_tide_geojsons.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/geo_tide_geojsons.yaml b/datasets/geo_tide_geojsons.yaml index e7ddf8eff..77b0ba9ea 100644 --- a/datasets/geo_tide_geojsons.yaml +++ b/datasets/geo_tide_geojsons.yaml @@ -41,7 +41,7 @@ DataAtWork: AuthorName: Danika Eamer, Brilant Kasami, Brooke Bao, and MIT Climate & Sustainability Consortium AuthorURL: https://impactclimate.mit.edu Publications: - - Title: Geospatial Trucking Industry Decarbonization Explorer (Geo-TIDE): Technical Guide and Methodology + - Title: "Geospatial Trucking Industry Decarbonization Explorer (Geo-TIDE): Technical Guide and Methodology" URL: https://dspace.mit.edu/handle/1721.1/159069 AuthorName: Eamer, D., Borrero, M., Bao, B., Kasami, B., and De Figueiredo Valente, H. 
AuthorURL: https://impactclimate.mit.edu From 3e423e1d3b740aaf74659f144b36b3918fe15574 Mon Sep 17 00:00:00 2001 From: Danika MacDonell Date: Wed, 9 Jul 2025 23:38:18 -0600 Subject: [PATCH 108/751] Updated tags to existing values in tags.yaml --- datasets/geo_tide_geojsons.yaml | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/datasets/geo_tide_geojsons.yaml b/datasets/geo_tide_geojsons.yaml index 77b0ba9ea..bed280795 100644 --- a/datasets/geo_tide_geojsons.yaml +++ b/datasets/geo_tide_geojsons.yaml @@ -5,12 +5,14 @@ Contact: mcsc@mit.edu ManagedBy: MIT Climate & Sustainability Consortium UpdateFrequency: Quarterly Tags: - - Fleet Transition - - Decision Support - - Decarbonization - - Geospatial - - Trucking - - Alternative energy + - electricity + - energy + - environmental + - geospatial + - supply chain + - sustainability + - transportation + - License: Creative Commons Attribution 4.0 International Citation: "Eamer, D., Borrero, M., Bashir, N., & MIT Climate & Sustainability Consortium. (2025). GeoJSON files for the MCSC's Trucking Industry Decarbonization Explorer (Geo-TIDE) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.15851359" Resources: From 6850353920a2424897ed2409e9387067fb85d510 Mon Sep 17 00:00:00 2001 From: Danika MacDonell Date: Wed, 9 Jul 2025 23:46:32 -0600 Subject: [PATCH 109/751] Fixed missing tag --- datasets/geo_tide_geojsons.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/geo_tide_geojsons.yaml b/datasets/geo_tide_geojsons.yaml index bed280795..d38217c26 100644 --- a/datasets/geo_tide_geojsons.yaml +++ b/datasets/geo_tide_geojsons.yaml @@ -12,7 +12,6 @@ Tags: - supply chain - sustainability - transportation - - License: Creative Commons Attribution 4.0 International Citation: "Eamer, D., Borrero, M., Bashir, N., & MIT Climate & Sustainability Consortium. (2025). GeoJSON files for the MCSC's Trucking Industry Decarbonization Explorer (Geo-TIDE) [Data set]. Zenodo. 
https://doi.org/10.5281/zenodo.15851359" Resources: From d620e60919407ba0e66ab623e87333cc39e1579d Mon Sep 17 00:00:00 2001 From: Danika MacDonell Date: Wed, 9 Jul 2025 23:54:15 -0600 Subject: [PATCH 110/751] Update geo_tide_geojsons.yaml Changed Amazon S3 service to list --- datasets/geo_tide_geojsons.yaml | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/datasets/geo_tide_geojsons.yaml b/datasets/geo_tide_geojsons.yaml index d38217c26..2c19f3abf 100644 --- a/datasets/geo_tide_geojsons.yaml +++ b/datasets/geo_tide_geojsons.yaml @@ -25,17 +25,20 @@ DataAtWork: URL: https://danikam16.wixsite.com/mysite/post/accessing-and-using-the-mcsc-s-interactive-geospatial-decision-support-tool-for-trucking-fleet-decar AuthorName: Danika Eamer and Helena De Figueiredo Valente AuthorURL: https://github.com/danikam - Services: Amazon S3 + Services: + - Amazon S3 - Title: Which logistics facilities should a return-to-base carrier target for fleet electrification and chargers? URL: https://docs.google.com/presentation/d/e/2PACX-1vQZccVHZVT1QRNdhCRRI810UxGvCD3hJhxIE4CzBDbhNr9iecHV5lp2Rv87x6rik1wrCiXXUq0WfuBk/pub AuthorName: Danika Eamer AuthorURL: https://github.com/danikam - Services: Amazon S3 + Services: + - Amazon S3 - Title: Which routes should a dry-van carrier prioritize for investment in battery electric or hydrogen trucks? 
URL: https://danikam16.wixsite.com/mysite/post/user-case-studies-for-interactive-geospatial-trucking-fleet-decision-support AuthorName: Danika Eamer and Helena De Figueiredo Valente AuthorURL: https://github.com/danikam - Services: Amazon S3 + Services: + - Amazon S3 Tools & Applications: - Title: MCSC Geospatial Trucking Industry Decarbonization Explorer (Geo-TIDE) URL: https://climatedata.mit.edu/faf5/transportation From 9374ea6497dbcdda352b49ba6a49710f49afe78b Mon Sep 17 00:00:00 2001 From: Brian Foo Date: Thu, 10 Jul 2025 10:32:27 -0400 Subject: [PATCH 111/751] Rename loc-sanborn to loc-sanborn-maps since bucket name was unavailable --- datasets/{loc-sanborn.yaml => loc-sanborn-maps.yaml} | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) rename datasets/{loc-sanborn.yaml => loc-sanborn-maps.yaml} (90%) diff --git a/datasets/loc-sanborn.yaml b/datasets/loc-sanborn-maps.yaml similarity index 90% rename from datasets/loc-sanborn.yaml rename to datasets/loc-sanborn-maps.yaml index af1208def..ffb578823 100644 --- a/datasets/loc-sanborn.yaml +++ b/datasets/loc-sanborn-maps.yaml @@ -36,17 +36,17 @@ License: The content of the Library of Congress online Sanborn Maps Collection https://www.loc.gov/collections/sanborn-maps/about-this-collection/rights-and-access/. 
Resources: - Description: Sanborn Maps data - ARN: arn:aws:s3:::loc-sanborn - Region: us-east-1 + ARN: arn:aws:s3:::loc-sanborn-maps + Region: us-west-2 Type: S3 Bucket Explore: - "[Browse Bucket by - State](https://loc-sanborn.s3.amazonaws.com/maps-by-state/index.html)" - - "[README](https://loc-sanborn.s3.amazonaws.com/README.html)" + State](https://loc-sanborn-maps.s3.amazonaws.com/maps-by-state/index.html)" + - "[README](https://loc-sanborn-maps.s3.amazonaws.com/README.html)" DataAtWork: Tutorials: - Title: README data cover sheet - URL: https://loc-sanborn.s3.amazonaws.com/README.html + URL: https://loc-sanborn-maps.s3.amazonaws.com/README.html AuthorName: Library of Congress - Title: Sanborn Map Data Python Tutorial (Jupyter notebook) URL: https://libraryofcongress.github.io/data-exploration/Data%20Packages/sanborn.html From 484dac9e7ba4c7bac1f4b30658ae3cc26b46f73f Mon Sep 17 00:00:00 2001 From: Matthew Berkeley <42berkeley@cua.edu> Date: Thu, 10 Jul 2025 16:52:25 +0200 Subject: [PATCH 112/751] make suggested changes --- datasets/busco-data.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/busco-data.yaml b/datasets/busco-data.yaml index 90b7f21e5..c2392d0dd 100644 --- a/datasets/busco-data.yaml +++ b/datasets/busco-data.yaml @@ -9,6 +9,7 @@ Tags: - bacteria - bioinformatics - genomic + - life sciences - metagenomics - open source software - protein @@ -35,7 +36,7 @@ DataAtWork: AuthorName: AuthorURL: Publications: - - Title: OrthoDB and BUSCO update: annotation of orthologs with wider sampling of genomes + - Title: OrthoDB and BUSCO update - annotation of orthologs with wider sampling of genomes URL: https://academic.oup.com/nar/article/53/D1/D516/7899526?login=true AuthorName: Fredrik Tegenfeldt, Dmitry Kuznetsov, Mosè Manni, Matthew Berkeley, Evgeny M Zdobnov, Evgenia V Kriventseva - Title: BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, 
Prokaryotic, and Viral Genomes From c823aadad0ef97395caa40332c20d99bbfde2b5a Mon Sep 17 00:00:00 2001 From: Nicole Stock Date: Thu, 10 Jul 2025 12:36:52 -0400 Subject: [PATCH 113/751] Update bossdb.yaml to include CC BY-NC-SA 4.0 --- datasets/bossdb.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/bossdb.yaml b/datasets/bossdb.yaml index ffa93211b..f3552d0fd 100644 --- a/datasets/bossdb.yaml +++ b/datasets/bossdb.yaml @@ -18,7 +18,7 @@ Tags: - light-sheet microscopy - calcium imaging - volumetric imaging -License: Creative Commons 4.0 International (CC BY 4.0); Creative Commons CC0 1.0 Universal (CC0-1.0) +License: Creative Commons 4.0 International (CC BY 4.0); Creative Commons CC0 1.0 Universal (CC0-1.0); Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) Resources: - Description: Large 3D volumes of neuroimaging data and image processing products such as segmentation and reconstructed meshes ARN: arn:aws:s3:::bossdb-open-data From a4a8b1fb2efcae3ff62f2c44909c39e700b65b82 Mon Sep 17 00:00:00 2001 From: Danika MacDonell Date: Thu, 10 Jul 2025 11:03:14 -0600 Subject: [PATCH 114/751] Update geo_tide_geojsons.yaml --- datasets/geo_tide_geojsons.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/geo_tide_geojsons.yaml b/datasets/geo_tide_geojsons.yaml index 2c19f3abf..32352d66e 100644 --- a/datasets/geo_tide_geojsons.yaml +++ b/datasets/geo_tide_geojsons.yaml @@ -17,7 +17,7 @@ Citation: "Eamer, D., Borrero, M., Bashir, N., & MIT Climate & Sustainability Co Resources: - Description: GeoJSON Files for Geo-TIDE ARN: arn:aws:s3:::mcsc-datahub-public/geojsons_simplified/ - Region: US West (Oregon) us-west-2 + Region: us-west-2 Type: S3 Bucket DataAtWork: Tutorials: From e5bf959488af599240ef3054f7aae891aae11e2c Mon Sep 17 00:00:00 2001 From: Danika MacDonell Date: Thu, 10 Jul 2025 11:08:03 -0600 Subject: [PATCH 115/751] Update geo_tide_geojsons.yaml --- 
datasets/geo_tide_geojsons.yaml | 2 -- 1 file changed, 2 deletions(-) diff --git a/datasets/geo_tide_geojsons.yaml b/datasets/geo_tide_geojsons.yaml index 32352d66e..9295a6471 100644 --- a/datasets/geo_tide_geojsons.yaml +++ b/datasets/geo_tide_geojsons.yaml @@ -55,5 +55,3 @@ DataAtWork: AuthorURL: https://impactclimate.mit.edu ADXCategories: - Resources Data - - Environmental Data - - Automotive Data From 45823fdf4f19becac8dc0a19c47a86b9edf43c8f Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 10 Jul 2025 09:33:48 -0800 Subject: [PATCH 116/751] Update mbers-open-data.yaml add pds tag --- datasets/mbers-open-data.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/mbers-open-data.yaml b/datasets/mbers-open-data.yaml index 558541e52..14ae6586a 100644 --- a/datasets/mbers-open-data.yaml +++ b/datasets/mbers-open-data.yaml @@ -4,7 +4,8 @@ Documentation: https://github.com/WattTime/mbers-open-data/blob/main/MBER_Data_S Contact: The annual and hourly MBERs data are created and maintained by the Climate TRACE coalition of nonprofits, universities, and tech companies. The largest contributors to the coalition's electricity sector work are WattTime, Transition Zero, Global Energy Monitor, Pixel Scientia Labs, Planet Labs, and Georgetown University. For questions or more information about MBER data, contact coalition@ClimateTRACE.org or visit https://climatetrace.org/contact. ManagedBy: Climate TRACE UpdateFrequency: Annually -Tags: +Tags: + - aws-pds - carbon - climate - csv From 0ef02ea6741c4f49ab82da1a9ef176ac921a6f4b Mon Sep 17 00:00:00 2001 From: crichica <148996603+crichica@users.noreply.github.com> Date: Thu, 10 Jul 2025 14:07:16 -0400 Subject: [PATCH 117/751] Update spherex-qr.yaml Adding pds tag. 
--- datasets/spherex-qr.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/spherex-qr.yaml b/datasets/spherex-qr.yaml index 38ded3649..c4238ef77 100644 --- a/datasets/spherex-qr.yaml +++ b/datasets/spherex-qr.yaml @@ -5,6 +5,7 @@ Contact: https://irsa.ipac.caltech.edu/docs/help_desk.html ManagedBy: "NASA/IPAC Infrared Science Archive ([IRSA](https://irsa.ipac.caltech.edu)) at Caltech" UpdateFrequency: SPHEREx QR is updated weekly. The data may also be presented in new ways as the products become available. Tags: + - aws-pds - astronomy - imaging - object detection From f2fa944274246e38e886d8a5e8b5b8e32e01c7ea Mon Sep 17 00:00:00 2001 From: kevinzhao81 <94228377+kevinzhao81@users.noreply.github.com> Date: Thu, 10 Jul 2025 15:54:14 -0400 Subject: [PATCH 118/751] Upload CarbonPDF dataset kaz81@pitt.edu Kaiwen --- datasets/carbonpdf.yaml | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) create mode 100644 datasets/carbonpdf.yaml diff --git a/datasets/carbonpdf.yaml b/datasets/carbonpdf.yaml new file mode 100644 index 000000000..0796c51a5 --- /dev/null +++ b/datasets/carbonpdf.yaml @@ -0,0 +1,22 @@ +Name: CarbonPDF +Description: A carbon question-answering (QA) dataset specifically designed to facilitate the extraction and analysis of data from real-world carbon reports of computing products. The dataset features annotated metadata, a variety of numerical reasoning tasks, and structured derivations to ensure accurate processing of fragmented and inconsistent information. +Documentation: https://github.com/pittcps/carbonpdf-dataset +Contact: kaz81@pitt.edu +ManagedBy: Pittcps lab +UpdateFrequency: Data for a new company is added once collected. 
+Tags: + - product carbon footprint + - question answering + - pdf + - sustainability +License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/) +Resources: + - Description: A component-level product carbon footprint dataset and a corresponding question-answering dataset based on it + ARN: arn:aws:s3:::carbonpdf + Region: us-east-1 + Type: S3 Bucket + Explore: https://github.com/pittcps/carbonpdf-dataset +DeprecatedNotice: +ADXCategories: + - Environmental Data + - Manufacturing Data \ No newline at end of file From c3bf46bab8611a482085134a7fe6dae1cceabac8 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 10 Jul 2025 13:13:48 -0800 Subject: [PATCH 119/751] Update geo_tide_geojsons.yaml added ASDI tag and pds tag --- datasets/geo_tide_geojsons.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/datasets/geo_tide_geojsons.yaml b/datasets/geo_tide_geojsons.yaml index 9295a6471..ac5d07f43 100644 --- a/datasets/geo_tide_geojsons.yaml +++ b/datasets/geo_tide_geojsons.yaml @@ -4,7 +4,12 @@ Documentation: https://github.com/mcsc-impact-climate/Geo-TIDE-datasets Contact: mcsc@mit.edu ManagedBy: MIT Climate & Sustainability Consortium UpdateFrequency: Quarterly +Collabs: + ASDI: + Tags: + - sustainability Tags: + - aws-pds - electricity - energy - environmental From aa95afac645131bb1aca085a6a72c2fc133c48f8 Mon Sep 17 00:00:00 2001 From: Danika MacDonell Date: Thu, 10 Jul 2025 16:37:07 -0600 Subject: [PATCH 120/751] Update geo_tide_geojsons.yaml --- datasets/geo_tide_geojsons.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/geo_tide_geojsons.yaml b/datasets/geo_tide_geojsons.yaml index ac5d07f43..845aaf306 100644 --- a/datasets/geo_tide_geojsons.yaml +++ b/datasets/geo_tide_geojsons.yaml @@ -21,7 +21,7 @@ License: Creative Commons Attribution 4.0 International Citation: "Eamer, D., Borrero, M., Bashir, N., & MIT Climate & Sustainability Consortium. (2025). 
GeoJSON files for the MCSC's Trucking Industry Decarbonization Explorer (Geo-TIDE) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.15851359" Resources: - Description: GeoJSON Files for Geo-TIDE - ARN: arn:aws:s3:::mcsc-datahub-public/geojsons_simplified/ + ARN: arn:aws:s3:::mcsc-geotide-geojson-files/geojson_files/ Region: us-west-2 Type: S3 Bucket DataAtWork: From 9dec204427962f86c40b4f37f470c920e81a64cd Mon Sep 17 00:00:00 2001 From: Danika MacDonell Date: Thu, 10 Jul 2025 17:21:10 -0600 Subject: [PATCH 121/751] Added "Browse Bucket" link --- datasets/geo_tide_geojsons.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/geo_tide_geojsons.yaml b/datasets/geo_tide_geojsons.yaml index 845aaf306..8ac1f3751 100644 --- a/datasets/geo_tide_geojsons.yaml +++ b/datasets/geo_tide_geojsons.yaml @@ -24,6 +24,8 @@ Resources: ARN: arn:aws:s3:::mcsc-geotide-geojson-files/geojson_files/ Region: us-west-2 Type: S3 Bucket + Explore: + - '[Browse Bucket](https://mcsc-geotide-geojson-files.s3.amazonaws.com/index.html)' DataAtWork: Tutorials: - Title: Geo-TIDE access and getting-started exercises From b73655ee99af00985890a8c220bec402303d2df0 Mon Sep 17 00:00:00 2001 From: Hyun Min Kang Date: Fri, 11 Jul 2025 12:12:55 -0400 Subject: [PATCH 122/751] Add tags for CartoStore --- datasets/cartostore.yaml | 6 +++--- tags.yaml | 2 ++ 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/datasets/cartostore.yaml b/datasets/cartostore.yaml index 356070b57..a3aa045c7 100644 --- a/datasets/cartostore.yaml +++ b/datasets/cartostore.yaml @@ -8,8 +8,8 @@ UpdateFrequency: Monthly Tags: - spatial transcriptomics - spatial omics - - genomics - - PMTiles + - genomic + - bioinformatics - geospatial License: | "[CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)" @@ -27,7 +27,7 @@ DataAtWork: URL: https://github.com/seqscope/cartostore/blob/main/README.md AuthorName: Hyun Min Kang and Weiqiu Cheng - Title: Cartloader Documentation - URL: 
https://seqscope.github.io/cartloader/ + URL: https://seqscope.github.io/cartloader AuthorName: Hyun Min Kang and Weiqiu Cheng Example Datasets: - Title : Example CartoStore Repository for Xenium Breast Cancer Dataset diff --git a/tags.yaml b/tags.yaml index 5fa9b4bc0..b0ea97f44 100644 --- a/tags.yaml +++ b/tags.yaml @@ -392,6 +392,8 @@ - space biology - space weather - SPARQL +- spatial omics +- spatial transcriptomics - speaker identification - speech processing - speech recognition From ef4c2949d5cb93b08a36208cfa0e89af1260b3b5 Mon Sep 17 00:00:00 2001 From: Hyun Min Kang Date: Fri, 11 Jul 2025 12:16:22 -0400 Subject: [PATCH 123/751] Removed invalid key --- datasets/cartostore.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/cartostore.yaml b/datasets/cartostore.yaml index a3aa045c7..5e2c7c5a8 100644 --- a/datasets/cartostore.yaml +++ b/datasets/cartostore.yaml @@ -29,7 +29,6 @@ DataAtWork: - Title: Cartloader Documentation URL: https://seqscope.github.io/cartloader AuthorName: Hyun Min Kang and Weiqiu Cheng - Example Datasets: - Title : Example CartoStore Repository for Xenium Breast Cancer Dataset URL: https://zenodo.org/records/15649152 AuthorName: Hyun Min Kang and Weiqiu Cheng From 79f7adadd799a3fcdc3edc33d8d3d6d77d08d388 Mon Sep 17 00:00:00 2001 From: Hyun Min Kang Date: Fri, 11 Jul 2025 12:34:19 -0400 Subject: [PATCH 124/751] Updated tags --- datasets/cartostore.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/cartostore.yaml b/datasets/cartostore.yaml index 5e2c7c5a8..b8087ad38 100644 --- a/datasets/cartostore.yaml +++ b/datasets/cartostore.yaml @@ -10,7 +10,7 @@ Tags: - spatial omics - genomic - bioinformatics - - geospatial + - life sciences License: | "[CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)" Citation: | From fdba1b949743aaa725e94eb260bf687d6b667a04 Mon Sep 17 00:00:00 2001 From: Taylor Grafft Date: Fri, 11 Jul 2025 13:35:20 -0500 Subject: [PATCH 125/751] Adding life sciences 
tag --- datasets/apex.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/apex.yaml b/datasets/apex.yaml index e4d0f1a3e..9ac7afeb5 100644 --- a/datasets/apex.yaml +++ b/datasets/apex.yaml @@ -13,6 +13,7 @@ Tags: - neuroscience - neuroimaging - microscopy + - life sciences - zarr - metadata - machine learning From 722dfdffadf6523460a5cb3b91490a4727dd929b Mon Sep 17 00:00:00 2001 From: kszura <43186787+kszura@users.noreply.github.com> Date: Fri, 11 Jul 2025 15:56:13 -0400 Subject: [PATCH 126/751] Update noaa-nws-naqfc-pds.yaml Updated description. --- datasets/noaa-nws-naqfc-pds.yaml | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/datasets/noaa-nws-naqfc-pds.yaml b/datasets/noaa-nws-naqfc-pds.yaml index 8a8dc9403..ebb215cfb 100644 --- a/datasets/noaa-nws-naqfc-pds.yaml +++ b/datasets/noaa-nws-naqfc-pds.yaml @@ -1,23 +1,23 @@ Name: NOAA National Air Quality Forecast Capability (NAQFC) Regional Model Guidance Description: | - The National Air Quality Forecasting Capability (NAQFC) dataset contains model-generated Air-Quality (AQ) forecast guidance from three different prediction systems. The first system is a coupled weather and atmospheric chemistry numerical forecast model, known as the Air Quality Model (AQM). It is used to produce forecast guidance for ozone (O3) and particulate matter with diameter equal to or less than 2.5 micrometers (PM2.5) using meteorological forecasts based on NCEP’s operational weather forecast models such as North American Mesoscale Models (NAM) and Global Forecast System (GFS), and atmospheric chemistry based on the EPA’s Community Multiscale Air Quality (CMAQ) model. In addition, the modeling system incorporates information related to chemical emissions, including anthropogenic emissions provided by the EPA and fire emissions from NOAA/NESDIS. 
The NCEP NAQFC AQM output fields in this archive include 72-hr forecast products of model raw and bias-correction predictions, extending back to 1 January 2020. All of the output was generated by the contemporaneous operational AQM, beginning with AQMv5 in 2020, with upgrades to AQMv6 on 20 July 2021, and AQMv7 on 14 May 2024. The history of AQM upgrades is documented [here](https://www.emc.ncep.noaa.gov/mmb/aq/AQChangelog.html) + The National Air Quality Forecasting Capability (NAQFC) dataset contains model-generated air quality (AQ) forecast guidance from three different prediction systems. The first system is a coupled weather and atmospheric chemistry numerical forecast model, known as the Air Quality Model (AQM). It is used to produce forecast guidance for ozone (O3) and particulate matter that is less than or equal to 2.5 micrometers in diameter (PM2.5). Prior to May 14, 2024, AQM predictions were derived using the EPA’s Community Multiscale Air Quality (CMAQ) model, driven by meteorological fields from NCEP’s operational weather forecast models, specifically the North American Mesoscale Model (NAM; prior to 20 July 2021) and the Global Forecast System (GFS; beginning 20 July 2021). Since May 14, 2024, AQM guidance has been produced by a unique application within the community-based Unified Forecast System (UFS). The core model components in this application are derived directly from the fully online-coupled UFS-based weather and CMAQ-based chemistry models. In addition, it incorporates information related to chemical and particle source emissions as it integrates forward in time, including anthropogenic chemical emissions provided by the EPA, fire emissions from NOAA/NESDIS, and airborne particles generated by human activities and those predicted to be generated by wind-driven erosion and biosphere at ground level. 
The NCEP NAQFC AQM output fields in this archive include model raw and bias-corrected predictions dating back to 1 January 2020, all generated by the contemporaneous operational AQM, beginning with AQMv5 in 2020, transitioning to AQMv6 on 20 July 2021, and to AQMv7 on 14 May 2024. The length of each forecast was 48 hours prior to the implementation of AQMv6, and has been 72 hours ever since. The history of AQM upgrades is documented [here](https://www.emc.ncep.noaa.gov/mmb/aq/AQChangelog.html)

- The second prediction is known as the Hybrid Single-Particle Lagrangian Integrated Trajectory model (HYSPLIT). It is a widely used atmospheric transport and dispersion model containing an internal dust-generation module. It provides forecast guidance for atmospheric dust concentration and, prior to 28 June 2022, it also provided the NAQFC forecast guidance for smoke. Since that date, the third prediction system, a regional numerical weather prediction (NWP) model known as the Rapid Refresh (RAP) model, has subsumed HYSPLIT for operational smoke guidance, simulating the emission, transport, and deposition of smoke particles that originate from biomass burning (fires) and anthropogenic sources. + The second prediction is known as the Hybrid Single-Particle Lagrangian Integrated Trajectory model (HYSPLIT). It is a widely used atmospheric transport and dispersion model containing an internal dust-generation module. It provides forecast guidance for atmospheric dust concentration and, prior to 28 June 2022, it also provided the NAQFC forecast guidance for smoke. Starting on that date, the third prediction system, a regional numerical weather prediction (NWP) model known as the Rapid Refresh (RAP) model, subsumed HYSPLIT for operational smoke guidance, simulating the emission, transport, and deposition of smoke particles that originate from biomass burning (fires) and anthropogenic sources.

- The output from each of these modeling systems is generated over three separate domains, one covering CONUS, one Alaska, and the other Hawaii. Currently, for this archive, the ozone, (PM2.5), and smoke output is available over all three domains, while dust products are available only over the CONUS domain. The predicted concentrations of all species in the lowest model layer (i.e., the layer in contact with the surface) are available, as are vertically integrated values of smoke and dust. The data is gridded horizontally within each domain, with a grid spacing of approximately 5 km over CONUS, 6 km over Alaska, and 2.5 km over Hawaii. Ozone concentrations are provided in parts per billion (PPB), while the concentrations of all other species are quantified in units of micrograms per cubic meter (ug/m3), except for the column-integrated smoke values which are expressed in units of mg/m2. + The output from each of these modeling systems is generated over three separate domains, one covering CONUS, another over Alaska, and the other over Hawaii. Currently, for this archive, the O3, PM2.5, and smoke output is available over all three domains, while dust products are available only over the CONUS domain. The predicted concentrations of all species in the lowest model layer (i.e., the layer in contact with the surface) are available, as are vertically integrated values of smoke and dust. The data is gridded horizontally within each domain, with a grid spacing of approximately 5 km over CONUS, 6 km over Alaska, and 2.5 km over Hawaii. O3 concentrations are provided in parts per billion (PPB), while the concentrations of all other species are quantified in units of micrograms per cubic meter (ug/m3), except for the column-integrated smoke values which are expressed in units of milligrams per square meter (mg/m2).

- Temporally, O3 and PM2.5 are available as maximum and/or averaged values over various time periods. Specifically, O3 is available in both 1-hour and 8-hour (backward calculated) averages, as well as preceding 1-hour and 8-hour maximum values. Similarly, PM2.5 is available in 1-hour and 24-hour average values and 24-hour maximum values. In addition, all O3 and PM2.5 fields are available with bias-corrected magnitudes, based on derived model biases relative to observations. + Temporally, O3 and PM2.5 are available as maximum and/or averaged values over various time periods, selected in part for consistency with the EPA’s National Ambient Air Quality Standards. Specifically, O3 is available in both 1-hour and 8-hour (backward calculated) averages, as well as preceding 1-hour and 8-hour maximum values. Similarly, PM2.5 is available in 1-hour and 24-hour average values and 24-hour maximum values. In addition, all O3 and PM2.5 fields are available with bias-corrected magnitudes, based on derived historical model biases relative to observations.

- The AQM produces hourly forecast guidance for O3 and PM2.5 out to 72 hours twice per day, starting at 0600 and 1200 UTC. Smoke guidance is available out to 51 hours from once-per-day RAP forecasts initialized at 0300 UTC, while dust guidance from HYSPLIT is available out to 48 hours from initialization times of 0600 and 1200 UTC. + The AQM produces hourly forecast guidance for O3 and PM2.5 up to 72 hours twice per day. Smoke guidance is available up to 51 hours from once-per-day RAP forecasts, while dust guidance from HYSPLIT is available up to 48 hours. Documentation: https://vlab.noaa.gov/web/osti-modeling/air-quality Contact: For questions regarding data content or quality, visit the NCEP AQM Products website. For any questions regarding data delivery or any general questions regarding the NOAA Open Data Dissemination (NODD) Program, email the NODD Team at nodd@noaa.gov.
We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NODD team by emailing nodd@noaa.gov ManagedBy: "[NOAA](http://www.noaa.gov/)" -UpdateFrequency: 2 times per day, 0600 and 1200 UTC for O3, PM2.5, and dust; once per day, 0300 UTC for smoke +UpdateFrequency: Two times per day, 0600 and 1200 UTC for O3, PM2.5, and dust; once per day, 0300 UTC for smoke Collabs: ASDI: Tags: From 1c077a7359fb06d0b7362d582b44f5b194cdd1ef Mon Sep 17 00:00:00 2001 From: Yuk Kei Wan <41866052+yuukiiwa@users.noreply.github.com> Date: Sun, 13 Jul 2025 08:18:21 +0800 Subject: [PATCH 127/751] Create yuukiiwa_application_placeholder.yaml --- .../yuukiiwa_application_placeholder.yaml | 30 +++++++++++++++++++ 1 file changed, 30 insertions(+) create mode 100644 datasets/yuukiiwa_application_placeholder.yaml diff --git a/datasets/yuukiiwa_application_placeholder.yaml b/datasets/yuukiiwa_application_placeholder.yaml new file mode 100644 index 000000000..79146bbb5 --- /dev/null +++ b/datasets/yuukiiwa_application_placeholder.yaml @@ -0,0 +1,30 @@ +Name: Update on August 31 +Description:Update on August 31, 2025 +Documentation: Update on August 31, 2025 +Contact: Update on August 31, 2025 +ManagedBy: "The Genome Institute of Singapore (https://www.a-star.edu.sg/gis) and UMass Chan Medical School's RNA Therapeutics Institute (https://www.umassmed.edu/rti/)" +UpdateFrequency: Datasets will be updated periodically as additional data are generated. 
+Tags: + - TBD +License: "[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" +Citation: Update on August 31, 2025 +Resources: + - Description: Update on August 31, 2025 + ARN: Update on August 31, 2025 + Region: ap-southeast-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](http://frag-struc.s3-website-ap-southeast-1.amazonaws.com/)' +DataAtWork: + Tutorials: + - Title: Update on August 31, 2025 + URL: Update on August 31, 2025 + AuthorName: Leonard Schärfen and Yuk Kei Wan + Tools & Applications: + - Title: Update on August 31, 2025 + URL: Update on August 31, 2025 + AuthorName: Leonard Schärfen and Yuk Kei Wan + Publications: + - Title: Update on August 31, 2025 + URL: In Preparation + AuthorName: Leonard Schärfen and Yuk Kei Wan From 5ad78d1074915dc4adb930f880dec5872abecad5 Mon Sep 17 00:00:00 2001 From: Siyuan-Shen Date: Sat, 12 Jul 2025 21:19:35 -0500 Subject: [PATCH 128/751] Update and rename surface-pm2-5-v6gl02.yaml to surface-pm2-5-v6gl.yaml --- ...rface-pm2-5-v6gl02.yaml => surface-pm2-5-v6gl.yaml} | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) rename datasets/{surface-pm2-5-v6gl02.yaml => surface-pm2-5-v6gl.yaml} (90%) diff --git a/datasets/surface-pm2-5-v6gl02.yaml b/datasets/surface-pm2-5-v6gl.yaml similarity index 90% rename from datasets/surface-pm2-5-v6gl02.yaml rename to datasets/surface-pm2-5-v6gl.yaml index 5374b57da..f41ab8080 100644 --- a/datasets/surface-pm2-5-v6gl02.yaml +++ b/datasets/surface-pm2-5-v6gl.yaml @@ -1,6 +1,6 @@ Name: SatPM2.5 Description: Fine particulate matter (PM2.5) concentrations are estimated using information from satellite-, simulation- and monitor-based sources. 
Aerosol optical depth from multiple satellites (MODIS, VIIRS, MISR, SeaWiFS, and VIIRS) and their respective retrievals (Dark Target, Deep Blue, MAIAC) is combined with simulation (GEOS-Chem) based upon their relative uncertainties as determined using ground-based sun photometer (AERONET) observations to produce geophysical estimates that explain most of the variance in ground-based PM2.5 measurements. A subsequent statistical fusion incorporates additional information from ground-based PM2.5 measurements. -Documentation: https://sites.wustl.edu/acag/datasets/surface-pm2-5/#V6.GL.02.03 +Documentation: https://sites.wustl.edu/acag/datasets/surface-pm2-5/#V6.GL.02.04 Contact: randall.martin@wustl.edu ManagedBy: "https://sites.wustl.edu/acag/" UpdateFrequency: Yearly @@ -16,15 +16,15 @@ Tags: - health License: Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/) Resources: - - Description: Satellite-Derived Fine Particulate Matter (PM2.5) concentrations from the Atmospheric Composition Analysis Group and Washington University in St. Louis, version GL06.02.03 - ARN: arn:aws:s3:::v6.gl.02.03 + - Description: Satellite-Derived Fine Particulate Matter (PM2.5) concentrations from the Atmospheric Composition Analysis Group and Washington University in St. 
Louis, version GL06.02.04 + ARN: arn:aws:s3:::v6.gl.02.04 Region: us-west-2 Type: S3 Bucket Explore: - - '[Browse Bucket](https://s3.us-west-2.amazonaws.com/v6.gl.02.03/index.html)' + - '[Browse Bucket](https://s3.us-west-2.amazonaws.com/v6.gl.02.04/index.html)' DataAtWork: Tutorials: - - Title: Importing and Plotting the V6.GL.02.03 dataset into Matlab + - Title: Importing and Plotting the V6.GL.02.04 dataset into Matlab URL: https://sites.wustl.edu/acag/importing-and-plotting-the-v6-gl-02-02-dataset-into-matlab/ AuthorName: Aaron van Donkelaar AuthorURL: https://sites.wustl.edu/acag/people/aaron-van-donkelaar/ From 00e6375547c7ac7fad8e204a838fbd36eb80d92e Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 14 Jul 2025 10:42:45 -0600 Subject: [PATCH 129/751] Update nrel-pds-dsgrid.yaml Update dsgrid dataset to include building profiles --- datasets/nrel-pds-dsgrid.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/nrel-pds-dsgrid.yaml b/datasets/nrel-pds-dsgrid.yaml index ef2663c04..b267999e6 100644 --- a/datasets/nrel-pds-dsgrid.yaml +++ b/datasets/nrel-pds-dsgrid.yaml @@ -42,6 +42,12 @@ Resources: Type: S3 Bucket Explore: - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=dsgrid-2018-efs%2F)' + - Description: '[Demand-Side Grid Model (dsgrid) Building Load Profiles](https://data.openei.org/submissions/8446)' + ARN: arn:aws:s3:::nrel-pds-dsgrid/building/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-dsgrid&prefix=building%2F)' DataAtWork: Tutorials: - Title: dsgrid Documentation From b296343dc7ca44d0e968f3a1783cdd4144513647 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 15 Jul 2025 10:37:22 -0800 Subject: [PATCH 130/751] Update eai-essential-web-v1.yaml --- datasets/eai-essential-web-v1.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) 
diff --git a/datasets/eai-essential-web-v1.yaml b/datasets/eai-essential-web-v1.yaml index 89831258d..5da49ef1e 100644 --- a/datasets/eai-essential-web-v1.yaml +++ b/datasets/eai-essential-web-v1.yaml @@ -1,4 +1,4 @@ -Name: Essential-Web v1.0: 24T tokens of organized web data +Name: "Essential-Web v1.0: 24T tokens of organized web data" Description: A 24-trillion-token dataset in which every document is annotated with a twelve-category taxonomy covering topic, format, content complexity, and quality. Documentation: https://huggingface.co/datasets/EssentialAI/essential-web-v1.0 Contact: research@essential.ai From 63e1bcda2649a3d82a20cc3121c8e2a80feb5682 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 15 Jul 2025 10:47:27 -0800 Subject: [PATCH 131/751] Update eai-essential-web-v1.yaml --- datasets/eai-essential-web-v1.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/eai-essential-web-v1.yaml b/datasets/eai-essential-web-v1.yaml index 5da49ef1e..4a43116e5 100644 --- a/datasets/eai-essential-web-v1.yaml +++ b/datasets/eai-essential-web-v1.yaml @@ -1,4 +1,4 @@ -Name: "Essential-Web v1.0: 24T tokens of organized web data" +Name: 'Essential-Web v1.0: 24T tokens of organized web data' Description: A 24-trillion-token dataset in which every document is annotated with a twelve-category taxonomy covering topic, format, content complexity, and quality. Documentation: https://huggingface.co/datasets/EssentialAI/essential-web-v1.0 Contact: research@essential.ai @@ -12,7 +12,7 @@ Tags: - text License: 'Essential-Web-v1.0 contributions are made available under the [ODC attribution license](https://opendatacommons.org/licenses/by/odc_by_1.0_public_text.txt); however, users should also abide by the [Common Crawl - Terms of Use](https://commoncrawl.org/terms-of-use). We do not alter the license of any of the underlying data.' 
Resources: - - Description: Essential-Web v1.0: 24T tokens of organized web data + - Description: 'Essential-Web v1.0: 24T tokens of organized web data' ARN: # TODO: fill in Region: # TODO: fill in Type: S3 Bucket From dc2568cf4718036004092211dbb512428dc92ce9 Mon Sep 17 00:00:00 2001 From: kszura <43186787+kszura@users.noreply.github.com> Date: Tue, 15 Jul 2025 17:42:10 -0400 Subject: [PATCH 132/751] Create noaa-nbm-parallel Created a YAML to generate RODA page for new NBM Parallel dataset hosted on NODD. --- datasets/noaa-nbm-parallel | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) create mode 100644 datasets/noaa-nbm-parallel diff --git a/datasets/noaa-nbm-parallel b/datasets/noaa-nbm-parallel new file mode 100644 index 000000000..b9410b4a5 --- /dev/null +++ b/datasets/noaa-nbm-parallel @@ -0,0 +1,36 @@ +Name: NOAA National Blend of Models (NBM) Parallel +Description: | + The National Blend of Models (NBM) is a nationally consistent and skillful suite of calibrated forecast guidance based on a blend of both NWS and non-NWS numerical weather prediction model data and post-processed model guidance. The goal of the NBM is to create a highly accurate, skillful and consistent starting point for the gridded forecast. This dataset contains data from the current parallel version of the NBM which is a test version, featuring many changes, that is a candidate to be implemented into operations following a careful vetting process. +Documentation: | + https://vlab.noaa.gov/web/mdl/nbm +Contact: | + For any questions regarding data delivery not associated with this platform or any general questions regarding the NOAA Open Data Dissemination (NODD) Program, email the NODD Team at nodd@noaa.gov. + We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. 
If you are interested in seeing your story highlighted, please share it with the NODD team by emailing nodd@noaa.gov +ManagedBy: "[NOAA](http://www.noaa.gov/)" +UpdateFrequency: | + Once per hour. +Collabs: + ASDI: + Tags: + - weather +Tags: + - aws-pds + - agriculture + - climate + - disaster response + - environmental + - meteorological + - weather +License: | + NOAA data disseminated through NODD are open to the public and can be used as desired.

NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. +Resources: + - Description: National Blend of Models (NBM) Parallel + ARN: arn:aws:s3:::noaa-nbm-para-pds + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://noaa-nbm-para-pds.s3.amazonaws.com/index.html)' + - Description: New data notifications for NBM Parallel, only Lambda and SQS protocols allowed + ARN: arn:aws:sns:us-east-1:123901341784:NewNBMParaObject + Region: us-east-1 + Type: SNS Topic From 0ee18fc94a87bd933680410349d0f1ee4f98e842 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 15 Jul 2025 13:51:23 -0800 Subject: [PATCH 133/751] Rename noaa-nbm-parallel to noaa-nbm-parallel.yaml --- datasets/{noaa-nbm-parallel => noaa-nbm-parallel.yaml} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename datasets/{noaa-nbm-parallel => noaa-nbm-parallel.yaml} (100%) diff --git a/datasets/noaa-nbm-parallel b/datasets/noaa-nbm-parallel.yaml similarity index 100% rename from datasets/noaa-nbm-parallel rename to datasets/noaa-nbm-parallel.yaml From 26ed0273e6c27a086fa1b7f4ea9de33398413650 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Wed, 16 Jul 2025 14:34:08 -0600 Subject: [PATCH 134/751] Update nrel-pds-wtk.yaml Add sup3rwind resource to wtk yaml --- datasets/nrel-pds-wtk.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/nrel-pds-wtk.yaml b/datasets/nrel-pds-wtk.yaml index 06bf752b7..8990cbbf2 100644 --- a/datasets/nrel-pds-wtk.yaml +++ b/datasets/nrel-pds-wtk.yaml @@ -230,6 +230,12 @@ Resources: Type: S3 Bucket Explore: - 
'[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-wtk&prefix=wtk-led%2F)' + - Description: Super-Resolution for Renewable Energy Resource Data with Wind from Reanalysis (Sup3rWind) + ARN: arn:aws:s3:::nrel-pds-wtk/sup3rwind/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-wtk&prefix=sup3rwind%2F)' DataAtWork: Tutorials: - Title: HSDS Examples From d833baec85aa535129404bc66fdfc6030994b949 Mon Sep 17 00:00:00 2001 From: Brian Foo Date: Thu, 17 Jul 2025 13:32:11 -0400 Subject: [PATCH 135/751] Update loc-sanborn-maps.yaml to include changes requested by AWS From 049b153e11450c3dfad4702f77bd5d3b80860284 Mon Sep 17 00:00:00 2001 From: kevinzhao81 <94228377+kevinzhao81@users.noreply.github.com> Date: Thu, 17 Jul 2025 13:33:15 -0400 Subject: [PATCH 136/751] Use tags in tags.yaml --- datasets/carbonpdf.yaml | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/datasets/carbonpdf.yaml b/datasets/carbonpdf.yaml index 0796c51a5..07faf26ef 100644 --- a/datasets/carbonpdf.yaml +++ b/datasets/carbonpdf.yaml @@ -5,10 +5,11 @@ Contact: kaz81@pitt.edu ManagedBy: Pittcps lab UpdateFrequency: Data for a new company is added once collected. 
Tags: - - product carbon footprint - - question answering - - pdf - - sustainability + - environmental + - product comparison + - csv + - information retrieval + - industry License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/) Resources: - Description: A component-level product carbon footprint dataset and a corresponding question-answering dataset based on it @@ -19,4 +20,4 @@ Resources: DeprecatedNotice: ADXCategories: - Environmental Data - - Manufacturing Data \ No newline at end of file + - Manufacturing Data From 6dd76f334f14e346ea663af985cecc066507e22c Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 17 Jul 2025 09:37:35 -0800 Subject: [PATCH 137/751] Update loc-sanborn-maps.yaml --- datasets/loc-sanborn-maps.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/loc-sanborn-maps.yaml b/datasets/loc-sanborn-maps.yaml index ffb578823..a7fce8f39 100644 --- a/datasets/loc-sanborn-maps.yaml +++ b/datasets/loc-sanborn-maps.yaml @@ -16,6 +16,7 @@ Contact: For curatorial questions about the content of the collection and ManagedBy: "[Library of Congress](https://www.loc.gov/)" UpdateFrequency: As new and significant changes to the underlying digital collection occurs Tags: + - aws-pds - archives - cities - computer vision From 1a8869c281adc800086e5cd30af95a6df705fa30 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 17 Jul 2025 09:42:05 -0800 Subject: [PATCH 138/751] Update carbonpdf.yaml --- datasets/carbonpdf.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/carbonpdf.yaml b/datasets/carbonpdf.yaml index 07faf26ef..44fe5901f 100644 --- a/datasets/carbonpdf.yaml +++ b/datasets/carbonpdf.yaml @@ -5,6 +5,7 @@ Contact: kaz81@pitt.edu ManagedBy: Pittcps lab UpdateFrequency: Data for a new company is added once collected. 
Tags: + - aws-pds - environmental - product comparison - csv @@ -16,8 +17,7 @@ Resources: ARN: arn:aws:s3:::carbonpdf Region: us-east-1 Type: S3 Bucket - Explore: https://github.com/pittcps/carbonpdf-dataset -DeprecatedNotice: + Explore: 'https://github.com/pittcps/carbonpdf-dataset' ADXCategories: - Environmental Data - Manufacturing Data From 06b7723c519bc57566768b1472cd53bf434e9e71 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 17 Jul 2025 10:13:17 -0800 Subject: [PATCH 139/751] Update carbonpdf.yaml --- datasets/carbonpdf.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/carbonpdf.yaml b/datasets/carbonpdf.yaml index 44fe5901f..59de97faa 100644 --- a/datasets/carbonpdf.yaml +++ b/datasets/carbonpdf.yaml @@ -17,7 +17,8 @@ Resources: ARN: arn:aws:s3:::carbonpdf Region: us-east-1 Type: S3 Bucket - Explore: 'https://github.com/pittcps/carbonpdf-dataset' + Explore: + - '[Explore](https://github.com/pittcps/carbonpdf-dataset)' ADXCategories: - Environmental Data - Manufacturing Data From 9139a31f59be328939effb940ea3620182a8b7c8 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 17 Jul 2025 10:22:21 -0800 Subject: [PATCH 140/751] Update carbonpdf.yaml --- datasets/carbonpdf.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/carbonpdf.yaml b/datasets/carbonpdf.yaml index 59de97faa..8e5d4b795 100644 --- a/datasets/carbonpdf.yaml +++ b/datasets/carbonpdf.yaml @@ -4,6 +4,10 @@ Documentation: https://github.com/pittcps/carbonpdf-dataset Contact: kaz81@pitt.edu ManagedBy: Pittcps lab UpdateFrequency: Data for a new company is added once collected. 
+Collabs: + ASDI: + Tags: + - climate Tags: - aws-pds - environmental From 0fe514819f21bd88e9a80316ac9f0a1e20ee55d4 Mon Sep 17 00:00:00 2001 From: francoloma Date: Fri, 18 Jul 2025 08:58:19 -1000 Subject: [PATCH 141/751] Update noaa-ncn.yaml Removed outdated notification about the deprecation of FTP services - this has happened already. --- datasets/noaa-ncn.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/noaa-ncn.yaml b/datasets/noaa-ncn.yaml index 47c68d147..c46db048d 100644 --- a/datasets/noaa-ncn.yaml +++ b/datasets/noaa-ncn.yaml @@ -6,7 +6,6 @@ Description: | - [NOAA-NCN on AWS](https://noaa-cors-pds.s3.amazonaws.com/index.html) - [NGS server: https://geodesy.noaa.gov/corsdata/](https://geodesy.noaa.gov/corsdata/) - [NGS's customized data request service (UFCORS)](https://geodesy.noaa.gov/UFCORS/) - - [NGS Anonymous ftp://geodesy.noaa.gov/cors/ - This service is going away on August 02, 2021!](ftp://geodesy.noaa.gov/cors/) - #### NCN Data and Products - **RINEX**: The GPS/GNSS data collected at NCN stations are made available to the public by NGS in Receiver INdependent EXchange (RINEX) format. Most data are available within 1 hour (60 minutes) from when they were recorded at the remote site, and a few sites have a delay of 24 hours (1440 minutes).
RINEX data can be found at: *rinex/`YYYY`/`DDD`/`ssss`/* - **Station logs**: From 56bd453970da03b5a0397338cb61f02f9565cafc Mon Sep 17 00:00:00 2001 From: Ev Date: Fri, 18 Jul 2025 15:15:17 -0400 Subject: [PATCH 142/751] Update aws-public-blockchain.yaml - Adding two datasets (Stellar, TON) --- datasets/aws-public-blockchain.yaml | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/datasets/aws-public-blockchain.yaml b/datasets/aws-public-blockchain.yaml index 462949ba2..e6475e3b0 100644 --- a/datasets/aws-public-blockchain.yaml +++ b/datasets/aws-public-blockchain.yaml @@ -1,11 +1,8 @@ Name: AWS Public Blockchain Data Description: > - The AWS Public Blockchain Data provides free access to blockchain datasets. Data is transformed into multiple - tables as compressed Parquet files, partitioned by date, to allow efficient access for most common analytics queries. +

The AWS Public Blockchain Data initiative provides free access to blockchain datasets through collaboration with data providers. The data is optimized for analytics by being transformed into compressed Parquet files, partitioned by date for efficient querying.

-

- - Datasets

+

Datasets

@@ -18,10 +15,15 @@ Description: > + +
Blockchain dataset Maintained by Path
Base SonarX s3://aws-public-blockchain/v1.1/sonarx/base/
Provenance SonarX s3://aws-public-blockchain/v1.1/sonarx/provenance/
XRP Ledger SonarX s3://aws-public-blockchain/v1.1/sonarx/xrp/
Stellar (XDR files) Stellar s3://aws-public-blockchain/v1.1/stellar/
The Open Network (TON) TON s3://aws-public-blockchain/v1.1/ton/

- For full datasets, with support and real-time updates, please visit
SonarX. + +

Become a Data Provider

+

We welcome additional blockchain data providers to join this initiative. If you're interested in contributing datasets to the AWS Public Blockchain Data program, please contact our team at aws-public-blockchain@amazon.com.

+ Documentation: https://github.com/aws-samples/digital-assets-examples/blob/main/analytics/ Contact: aws-blockchain-data@amazon.com @@ -36,11 +38,23 @@ Resources: ARN: arn:aws:s3:::aws-public-blockchain Region: us-east-2 Type: S3 Bucket + Explore: + - '[Browse Bucket](https://aws-public-blockchain.s3.us-east-2.amazonaws.com/index.html)' + DataAtWork: Publications: + - Title: Exploring Arbitrum Data: Analyze L2 Activity with AWS Public Blockchain Datasets + URL: https://repost.aws/articles/ARpnBONglsT2e6D-hZZmxVvA/exploring-arbitrum-data-analyze-l2-activity-with-aws-public-blockchain-datasets + AuthorName: Simon Goldberd, Everton Fraga + - Title: Unlocking XRP Ledger Data: Comprehensive Analysis with AWS Public Blockchain Datasets + URL: https://repost.aws/articles/ARg_zMIXlhTG2hSDFZDfF6hQ/unlocking-xrp-ledger-data-comprehensive-analysis-with-aws-public-blockchain-datasets + AuthorName: Simon Goldberd, Everton Fraga - Title: New datasets added to the AWS Public Blockchain Datasets — available for analytics and research URL: https://repost.aws/articles/AR3gztQGeSS8CfaKNNeyYwsQ AuthorName: Everton Fraga, Simon Goldberg + - Title: FEDS Notes - Primary and Secondary Markets for Stablecoins + URL: https://www.federalreserve.gov/econres/notes/feds-notes/primary-and-secondary-markets-for-stablecoins-20240223.html + AuthorName: Cy Watsky, Jeffrey Allen, Hamzah Daud, Jochen Demuth, Daniel Little, Megan Rodden, Amber Seira - Title: Access Bitcoin and Ethereum open datasets for cross-chain analytics URL: https://aws.amazon.com/blogs/database/access-bitcoin-and-ethereum-open-datasets-for-cross-chain-analytics/ AuthorName: Oliver Steffmann, Bhaskar Ravat, Sreeji Gopal, and Stefan Dicker From a06c32040fbb38ba36b2c9a61ead6ad0d67188e1 Mon Sep 17 00:00:00 2001 From: Ev Date: Fri, 18 Jul 2025 15:25:11 -0400 Subject: [PATCH 143/751] Update aws-public-blockchain.yaml Whitespace fix - yaml --- datasets/aws-public-blockchain.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff 
--git a/datasets/aws-public-blockchain.yaml b/datasets/aws-public-blockchain.yaml index e6475e3b0..3036fecde 100644 --- a/datasets/aws-public-blockchain.yaml +++ b/datasets/aws-public-blockchain.yaml @@ -2,7 +2,7 @@ Name: AWS Public Blockchain Data Description: >

 The AWS Public Blockchain Data initiative provides free access to blockchain datasets through collaboration with data providers. The data is optimized for analytics by being transformed into compressed Parquet files, partitioned by date for efficient querying.
- Datasets
+ Datasets

From 41797f239206206326d9094c0e90c958a64e8bb4 Mon Sep 17 00:00:00 2001 From: Ev Date: Fri, 18 Jul 2025 15:31:10 -0400 Subject: [PATCH 144/751] Update aws-public-blockchain.yaml Escaping characters --- datasets/aws-public-blockchain.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/aws-public-blockchain.yaml b/datasets/aws-public-blockchain.yaml index 3036fecde..50eb179b4 100644 --- a/datasets/aws-public-blockchain.yaml +++ b/datasets/aws-public-blockchain.yaml @@ -43,10 +43,10 @@ Resources: DataAtWork: Publications: - - Title: Exploring Arbitrum Data: Analyze L2 Activity with AWS Public Blockchain Datasets + - Title: "Exploring Arbitrum Data: Analyze L2 Activity with AWS Public Blockchain Datasets" URL: https://repost.aws/articles/ARpnBONglsT2e6D-hZZmxVvA/exploring-arbitrum-data-analyze-l2-activity-with-aws-public-blockchain-datasets AuthorName: Simon Goldberd, Everton Fraga - - Title: Unlocking XRP Ledger Data: Comprehensive Analysis with AWS Public Blockchain Datasets + - Title: "Unlocking XRP Ledger Data: Comprehensive Analysis with AWS Public Blockchain Datasets" URL: https://repost.aws/articles/ARg_zMIXlhTG2hSDFZDfF6hQ/unlocking-xrp-ledger-data-comprehensive-analysis-with-aws-public-blockchain-datasets AuthorName: Simon Goldberd, Everton Fraga - Title: New datasets added to the AWS Public Blockchain Datasets — available for analytics and research From 1a9bea553df335d340c01d49e81bd0a9ccb6dd68 Mon Sep 17 00:00:00 2001 From: Hyun Min Kang Date: Mon, 21 Jul 2025 19:22:09 -0400 Subject: [PATCH 145/751] Corrected a typo, updated the file types --- datasets/cartostore.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/cartostore.yaml b/datasets/cartostore.yaml index b8087ad38..892d0a729 100644 --- a/datasets/cartostore.yaml +++ b/datasets/cartostore.yaml @@ -17,8 +17,8 @@ Citation: | CartoStore by Hyun Min Kang's lab at the University of Michigan School of Public Health. 
Provided by Kang lab and accessed [DAY MONTH YEAR]. Resources: - - Description: Parquet and Shapefiles - ARN: arn:aws:s3:::carostore + - Description: PMTile, YAML, JSON, and TSV files + ARN: arn:aws:s3:::cartostore Region: us-east-1 Type: S3 Bucket DataAtWork: From 9e1282a65e9d0c81d3641830e12397a4a4af2228 Mon Sep 17 00:00:00 2001 From: Pascal Notin Date: Mon, 21 Jul 2025 23:25:56 -0400 Subject: [PATCH 146/751] Update proteingym.yaml --- datasets/proteingym.yaml | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/datasets/proteingym.yaml b/datasets/proteingym.yaml index f7bf80912..69b91742a 100644 --- a/datasets/proteingym.yaml +++ b/datasets/proteingym.yaml @@ -12,11 +12,22 @@ Tags: - machine learning License: MIT License Resources: - - Description: - ARN: - Region: - Type: - Explore: + - Description: "All substitution mutations from Deep Mutational Scanning (DMS) experiments." + ARN: arn:aws:s3:::proteingym/DMS_substitutions.parquet + Region: us-east-2 + Type: S3 Object + - Description: "All insertion-deletion (indel) mutations from DMS experiments." + ARN: arn:aws:s3:::proteingym/DMS_indels.parquet + Region: us-east-2 + Type: S3 Object + - Description: "All substitution mutations from clinical variant databases (e.g., ClinVar)." + ARN: arn:aws:s3:::proteingym/clinical_substitutions.parquet + Region: us-east-2 + Type: S3 Object + - Description: "All indel mutations from clinical variant databases." 
+ ARN: arn:aws:s3:::proteingym/clinical_indels.parquet + Region: us-east-2 + Type: S3 Object DataAtWork: Tutorials: - Title: Scoring ProteinGym assays with TranceptEVE From c9148a04a08419454b3bc21de03ca54767564b0e Mon Sep 17 00:00:00 2001 From: Pascal Notin Date: Mon, 21 Jul 2025 23:32:30 -0400 Subject: [PATCH 147/751] Update proteingym.yaml --- datasets/proteingym.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/proteingym.yaml b/datasets/proteingym.yaml index 69b91742a..1e2e5e2cc 100644 --- a/datasets/proteingym.yaml +++ b/datasets/proteingym.yaml @@ -44,7 +44,7 @@ DataAtWork: Publications: - Title: ProteinGym: Large-Scale Benchmarks for Protein Fitness Prediction and Design URL: https://papers.nips.cc/paper_files/paper/2023/hash/cac723e5ff29f65e3fcbb0739ae91bee-Abstract-Datasets_and_Benchmarks.html - AuthorName: Pascal Notin, Aaron Kollasch, Daniel Ritter, Lood van Niekerk, Steffanie Paul, Han Spinner, Nathan Rollins, Ada Shaw, Rose Orenbuch, Ruben Weitzman, Jonathan Frazer, Mafalda Dias, Dinko Franceschi, Yarin Gal, Debora Marks + AuthorName: "Pascal Notin, et al." AuthorURL: https://www.pascalnotin.com/ DeprecatedNotice: ADXCategories: From 30ac134c0bb767976ccd9bad3f645260881238f0 Mon Sep 17 00:00:00 2001 From: Pascal Notin Date: Mon, 21 Jul 2025 23:37:40 -0400 Subject: [PATCH 148/751] Update proteingym.yaml --- datasets/proteingym.yaml | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/datasets/proteingym.yaml b/datasets/proteingym.yaml index 1e2e5e2cc..16f0cca53 100644 --- a/datasets/proteingym.yaml +++ b/datasets/proteingym.yaml @@ -1,10 +1,14 @@ Name: ProteinGym -Description: ProteinGym is a benchmark suite for assessing the performance of protein fitness prediction and design models. It comprises a large curated collection of 200+ high-throughput experimental assays (~3M mutated sequences), as well clinical annotations from experts about the pathogenicity of mutants in over 3k human genes. 
+Description: | + ProteinGym is a benchmark suite for assessing the performance of protein fitness prediction and design models. + It comprises a large curated collection of 200+ high-throughput experimental assays (~3M mutated sequences), + as well as clinical annotations from experts about the pathogenicity of mutants in over 3k human genes. Documentation: https://github.com/OATML-Markslab/ProteinGym/blob/main/README.md Contact: pascal_notin@hms.harvard.edu -ManagedBy: Harvard Medical School; University of Oxford +ManagedBy: "Harvard Medical School; University of Oxford" UpdateFrequency: Quarterly Tags: + - aws-pds - protein - bioinformatics - biology @@ -32,20 +36,14 @@ DataAtWork: Tutorials: - Title: Scoring ProteinGym assays with TranceptEVE URL: https://github.com/OATML-Markslab/ProteinGym/blob/main/notebooks/TranceptEVE_example.ipynb - NotebookURL: https://github.com/OATML-Markslab/ProteinGym/blob/main/notebooks/TranceptEVE_example.ipynb AuthorName: Daniel Ritter AuthorURL: https://danieldritter.github.io/ - Services: Tools & Applications: - Title: ProteinGym website URL: https://proteingym.org/ AuthorName: Pascal Notin & Daniel Ritter - AuthorURL: Publications: - Title: ProteinGym: Large-Scale Benchmarks for Protein Fitness Prediction and Design URL: https://papers.nips.cc/paper_files/paper/2023/hash/cac723e5ff29f65e3fcbb0739ae91bee-Abstract-Datasets_and_Benchmarks.html AuthorName: "Pascal Notin, et al." 
AuthorURL: https://www.pascalnotin.com/ -DeprecatedNotice: -ADXCategories: - - From 13dc6555c69210e54c40c9f036b4447a8f62132c Mon Sep 17 00:00:00 2001 From: Pascal Notin Date: Mon, 21 Jul 2025 23:42:12 -0400 Subject: [PATCH 149/751] Update proteingym.yaml --- datasets/proteingym.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/proteingym.yaml b/datasets/proteingym.yaml index 16f0cca53..2ec8948e7 100644 --- a/datasets/proteingym.yaml +++ b/datasets/proteingym.yaml @@ -43,7 +43,7 @@ DataAtWork: URL: https://proteingym.org/ AuthorName: Pascal Notin & Daniel Ritter Publications: - - Title: ProteinGym: Large-Scale Benchmarks for Protein Fitness Prediction and Design + - Title: "ProteinGym: Large-Scale Benchmarks for Protein Fitness Prediction and Design" URL: https://papers.nips.cc/paper_files/paper/2023/hash/cac723e5ff29f65e3fcbb0739ae91bee-Abstract-Datasets_and_Benchmarks.html AuthorName: "Pascal Notin, et al." AuthorURL: https://www.pascalnotin.com/ From 5fc75fdb5e8855f0974208198d394163b5321e42 Mon Sep 17 00:00:00 2001 From: Pascal Notin Date: Tue, 22 Jul 2025 00:50:05 -0400 Subject: [PATCH 150/751] Update proteingym.yaml --- datasets/proteingym.yaml | 24 +++++------------------- 1 file changed, 5 insertions(+), 19 deletions(-) diff --git a/datasets/proteingym.yaml b/datasets/proteingym.yaml index 2ec8948e7..733dc7214 100644 --- a/datasets/proteingym.yaml +++ b/datasets/proteingym.yaml @@ -1,8 +1,6 @@ Name: ProteinGym -Description: | - ProteinGym is a benchmark suite for assessing the performance of protein fitness prediction and design models. - It comprises a large curated collection of 200+ high-throughput experimental assays (~3M mutated sequences), - as well as clinical annotations from experts about the pathogenicity of mutants in over 3k human genes. +Description: | + ProteinGym is a benchmark suite for assessing the performance of protein fitness prediction and design models. 
It comprises a large curated collection of 200+ high-throughput experimental assays (~3M mutated sequences), as well as clinical annotations from experts about the pathogenicity of mutants in over 3k human genes. Documentation: https://github.com/OATML-Markslab/ProteinGym/blob/main/README.md Contact: pascal_notin@hms.harvard.edu ManagedBy: "Harvard Medical School; University of Oxford" @@ -16,22 +14,10 @@ Tags: - machine learning License: MIT License Resources: - - Description: "All substitution mutations from Deep Mutational Scanning (DMS) experiments." - ARN: arn:aws:s3:::proteingym/DMS_substitutions.parquet + - Description: "ProteinGym dataset including all substitution/indel mutations from Deep Mutational Scanning (DMS) experiments (DMS_substitutions.parquet / DMS_indels.parquet), and all substitution/indel mutations from clinical variant databases (clinical_substitutions.parquet / clinical_indels.parquet)." + ARN: arn:aws:s3:::proteingym Region: us-east-2 - Type: S3 Object - - Description: "All insertion-deletion (indel) mutations from DMS experiments." - ARN: arn:aws:s3:::proteingym/DMS_indels.parquet - Region: us-east-2 - Type: S3 Object - - Description: "All substitution mutations from clinical variant databases (e.g., ClinVar)." - ARN: arn:aws:s3:::proteingym/clinical_substitutions.parquet - Region: us-east-2 - Type: S3 Object - - Description: "All indel mutations from clinical variant databases." 
- ARN: arn:aws:s3:::proteingym/clinical_indels.parquet - Region: us-east-2 - Type: S3 Object + Type: S3 Bucket DataAtWork: Tutorials: - Title: Scoring ProteinGym assays with TranceptEVE From a19da887ccae3e8d82b2e21272eae55ee1735c41 Mon Sep 17 00:00:00 2001 From: Pascal Notin Date: Tue, 22 Jul 2025 11:25:32 -0400 Subject: [PATCH 151/751] Update proteingym.yaml --- datasets/proteingym.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/proteingym.yaml b/datasets/proteingym.yaml index 733dc7214..6d7b8bdc8 100644 --- a/datasets/proteingym.yaml +++ b/datasets/proteingym.yaml @@ -10,6 +10,7 @@ Tags: - protein - bioinformatics - biology + - life sciences - deep learning - machine learning License: MIT License From 7ef183eb32cbd66a4a4b43018ee8ce1862aec280 Mon Sep 17 00:00:00 2001 From: Val Lorentz Date: Thu, 24 Jul 2025 09:16:21 +0200 Subject: [PATCH 152/751] software-heritage: Update links to documentation --- datasets/software-heritage.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/software-heritage.yaml b/datasets/software-heritage.yaml index 2fa120d41..27c17f867 100644 --- a/datasets/software-heritage.yaml +++ b/datasets/software-heritage.yaml @@ -14,7 +14,7 @@ Description: | information is also included, providing timestamps about when and where all archived source code artifacts have been observed in the wild. Author and committer information is anonymized. 
-Documentation: https://docs.softwareheritage.org/devel/swh-dataset/graph/athena.html +Documentation: https://docs.softwareheritage.org/devel/swh-export/graph/athena.html Contact: aws@softwareheritage.org ManagedBy: Software Heritage UpdateFrequency: Data is updated yearly @@ -48,11 +48,11 @@ Resources: DataAtWork: Tutorials: - Title: Using the Software Heritage Graph Dataset - URL: https://docs.softwareheritage.org/devel/swh-dataset/graph/index.html + URL: https://docs.softwareheritage.org/devel/swh-export/graph/ AuthorName: The Software Heritage team Tools & Applications: - Title: The SWH-Graph module - URL: https://docs.softwareheritage.org/devel/swh-graph/index.html + URL: https://docs.softwareheritage.org/devel/swh-graph/ AuthorName: The Software Heritage team Publications: - Title: The Software Heritage Graph Dataset From 4a309d6a984f110c7b8ccd737437fe34e94aab09 Mon Sep 17 00:00:00 2001 From: Matt McCormick Date: Fri, 25 Jul 2025 10:31:09 -0400 Subject: [PATCH 153/751] ome-zarr-open-scivis: fix open-scivis-datasets url Only available as http. --- datasets/ome-zarr-open-scivis.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ome-zarr-open-scivis.yaml b/datasets/ome-zarr-open-scivis.yaml index d27470674..c0a110131 100644 --- a/datasets/ome-zarr-open-scivis.yaml +++ b/datasets/ome-zarr-open-scivis.yaml @@ -40,7 +40,7 @@ DataAtWork: URL: https://link.springer.com/article/10.1007/s00418-023-02209-1 AuthorName: Josh Moore, Daniela Basurto-Lozada, Sébastien Besson, John Bogovic, Jordão Bragantini, Eva M. Brown, Jean-Marie Burel, Xavier Casas Moreno, Gustavo de Medeiros, Erin E. Diel, David Gault, Satrajit S. Ghosh, Ilan Gold, Yaroslav O. Halchenko, Matthew Hartley, Dave Horsfall, Mark S. 
Keller, Mark Kittisopikul, Gabor Kovacs, Aybüke Küpcü Yoldaş, Koji Kyoda, Albane le Tournoulx de la Villegeorges, Tong Li, Prisca Liberali, Dominik Lindner, Melissa Linkert, Joel Lüthi, Jeremy Maitin-Shepard, Trevor Manz, Luca Marconato, Matthew McCormick, Merlin Lange, Khaled Mohamed, William Moore, Nils Norlin, Wei Ouyang, Bugra Özdemir, Giovanni Palla, Constantin Pape, Lucas Pelkmans, Tobias Pietzsch, Stephan Preibisch, Martin Prete, Norman Rzepka, Sameeul Samee, Nicholas Schaub, Hythem Sidky, Ahmet Can Solak, David R. Stirling, Jonathan Striebel, Christian Tischer, Daniel Toloudis, Isaac Virshup, Petr Walczysko, Alan M. Watson, Erin Weisbart, Frances Wong, Kevin A. Yamauchi, Omer Bayraktar, Beth A. Cimini, Nils Gehlenborg, Muzlifah Haniffa, Nathan Hotaling, Shuichi Onami, Loic A. Royer, Stephan Saalfeld, Oliver Stegle, Fabian J. Theis & Jason R. Swedlow - Title: Open SciVis Datasets - URL: https://klacansky.com/open-scivis-datasets/ + URL: http://klacansky.com/open-scivis-datasets/ AuthorName: Pavol Klacansky DeprecatedNotice: ADXCategories: From b25c5c176edb6e5530bb2832663a422cc767a419 Mon Sep 17 00:00:00 2001 From: Thomas Dutkiewicz <106269091+ttdu@users.noreply.github.com> Date: Fri, 25 Jul 2025 10:39:26 -0400 Subject: [PATCH 154/751] correct ARN, fix typo --- datasets/tglc.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/tglc.yaml b/datasets/tglc.yaml index 4e423c713..2e4f095b2 100644 --- a/datasets/tglc.yaml +++ b/datasets/tglc.yaml @@ -1,5 +1,5 @@ -Name: TESS-GAIA Light Curve (TESS) +Name: TESS-GAIA Light Curve (TGLC) Description: | TESS-Gaia Light Curve (TGLC) is a PSF-based TESS full-frame image (FFI) light curve product. Using Gaia DR3 as priors, the team forward models the FFIs with the effective point spread function to remove contamination from nearby stars. 
The resulting light curves show a photometric precision closely tracking the pre-launch prediction of the noise level: TGLC's photometric precision consistently reaches ≲2% at 16th TESS magnitude even in crowded fields, demonstrating excellent decontamination and deblending power. Documentation: https://archive.stsci.edu/hlsp/tglc @@ -13,12 +13,12 @@ Tags: License: All HLSPs hosted at MAST are subject to a [CC By 4.0 license](https://creativecommons.org/licenses/by/4.0/). Resources: - Description: TGLC Files - ARN: arn:aws:s3:::stpubdata/hlsp/tglc + ARN: arn:aws:s3:::stpubdata/mast/hlsp/tglc Region: us-east-1 Type: S3 Bucket RequesterPays: False - Description: Notifications for new data - ARN: arn:aws:sns:us-east-1:879230861493:stpubdata/hlsp/tglc + ARN: arn:aws:sns:us-east-1:879230861493:stpubdata/mast/hlsp/tglc Region: us-east-1 Type: SNS Topic DataAtWork: From 875648ae88be33ddea6ffd500287fc84cdccdd7e Mon Sep 17 00:00:00 2001 From: Thomas Dutkiewicz <106269091+ttdu@users.noreply.github.com> Date: Fri, 25 Jul 2025 11:19:04 -0400 Subject: [PATCH 155/751] add gaia dr3 --- datasets/gaia-dr3.yaml | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) create mode 100644 datasets/gaia-dr3.yaml diff --git a/datasets/gaia-dr3.yaml b/datasets/gaia-dr3.yaml new file mode 100644 index 000000000..449b325d0 --- /dev/null +++ b/datasets/gaia-dr3.yaml @@ -0,0 +1,28 @@ + +Name: Gaia DR3 +Description: | + [Gaia DR3 data](https://www.cosmos.esa.int/web/gaia/dr3) were originally released by the European Space Agency in December 2020. This [HATS](https://hats.readthedocs.io/en/stable)-formatted catalog was produced by the LSST Interdisciplinary Network for Collaboration and Computing. 
+Documentation: https://docs.lsdb.io/en/latest/index.html +Contact: archive@stsci.edu +ManagedBy: "[Space Telescope Science Institute](http://www.stsci.edu/)" +Citation: Please see [the LSDB citation page](https://docs.lsdb.io/en/latest/citation.html) if using LSDB for an academic publication. Please also [cite the Gaia team](https://gea.esac.esa.int/archive/documentation/GDR3/Miscellaneous/sec_credit_and_citation_instructions/). +UpdateFrequency: Never +Tags: + - astronomy + - gaia +License: Attribution required. +Resources: + - Description: Gaia DR3 HATS-Formatted Files + ARN: arn:aws:s3:::stpubdata/gaia + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + - Description: Notifications for new data + ARN: arn:aws:sns:us-east-1:879230861493:stpubdata/gaia + Region: us-east-1 + Type: SNS Topic +DataAtWork: + Tutorials: + - Title: Dark Energy Survey / Gaia DR3 Crossmatch + URL: https://docs.lsdb.io/en/stable/tutorials/pre_executed/des-gaia.html + AuthorName: LSDB Collaboration \ No newline at end of file From a3bc77fb6b55726dbbdd965f6134409408967948 Mon Sep 17 00:00:00 2001 From: Thomas Dutkiewicz <106269091+ttdu@users.noreply.github.com> Date: Fri, 25 Jul 2025 11:29:46 -0400 Subject: [PATCH 156/751] remove invalid gaia tag --- datasets/gaia-dr3.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/gaia-dr3.yaml b/datasets/gaia-dr3.yaml index 449b325d0..01a3bb783 100644 --- a/datasets/gaia-dr3.yaml +++ b/datasets/gaia-dr3.yaml @@ -9,7 +9,6 @@ Citation: Please see [the LSDB citation page](https://docs.lsdb.io/en/latest/cit UpdateFrequency: Never Tags: - astronomy - - gaia License: Attribution required. Resources: - Description: Gaia DR3 HATS-Formatted Files From ac31d77a53c45c1f035262bc042f8b9c3a6e579b Mon Sep 17 00:00:00 2001 From: xhagrg Date: Fri, 25 Jul 2025 14:38:55 -0500 Subject: [PATCH 157/751] Update tags for surya-bench dataset. 
--- datasets/surya-bench.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/surya-bench.yaml b/datasets/surya-bench.yaml index f6beaffc4..60be9f97a 100644 --- a/datasets/surya-bench.yaml +++ b/datasets/surya-bench.yaml @@ -10,6 +10,8 @@ ManagedBy: NASA IMPACT UpdateFrequency: This is the final version of the Dataset. Tags: - machine learning + - solar + - aws-pds License: | Creative Commons Attribution 4.0 International. Citation: > From 74ca5ea039364ec075b851252450db2231ef281d Mon Sep 17 00:00:00 2001 From: devapriyakumar <81041093+devapriyakumar@users.noreply.github.com> Date: Sat, 26 Jul 2025 13:08:52 +0530 Subject: [PATCH 158/751] Update and rename AI3.yaml to ai3.yaml --- datasets/{AI3.yaml => ai3.yaml} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename datasets/{AI3.yaml => ai3.yaml} (82%) diff --git a/datasets/AI3.yaml b/datasets/ai3.yaml similarity index 82% rename from datasets/AI3.yaml rename to datasets/ai3.yaml index f318e1917..6bd077075 100644 --- a/datasets/AI3.yaml +++ b/datasets/ai3.yaml @@ -32,9 +32,9 @@ Tags: License: https://devalab.in/AI3.html Resources: - - Description: Coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. - ARN: To be added once bucket is created - Region: To be added once bucket is created + - Description: Coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. AWS S3 AI3 Publicly Available Dataset Size: Version1: Total Size: 10.4 GiB (Initial structure of the protein-ligand complex and the average binding affinities along with average energy components). Version2: Total Size: 1.2 TiB (Five trajectories of protein-ligand complex (200 snapshots in all) and the closest two water molecules for each of the protein-ligand complex, and the time series of the binding affinities along with average energy components). 
Version3: Total Size: 10.7 TiB (Five trajectories of completely solvated protein-ligand complex (200 snapshots in all), and the time series of binding affinities along with average energy components. + ARN: arn:aws:account::345594599240:account + Region: us-east-1 Type: S3 bucket DataAtWork: From b0138a5544d650fb71181b5980e8d58d972972b9 Mon Sep 17 00:00:00 2001 From: devapriyakumar <81041093+devapriyakumar@users.noreply.github.com> Date: Sat, 26 Jul 2025 13:18:30 +0530 Subject: [PATCH 159/751] Update ai3.yaml --- datasets/ai3.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ai3.yaml b/datasets/ai3.yaml index 6bd077075..64ad49558 100644 --- a/datasets/ai3.yaml +++ b/datasets/ai3.yaml @@ -32,7 +32,7 @@ Tags: License: https://devalab.in/AI3.html Resources: - - Description: Coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. AWS S3 AI3 Publicly Available Dataset Size: Version1: Total Size: 10.4 GiB (Initial structure of the protein-ligand complex and the average binding affinities along with average energy components). Version2: Total Size: 1.2 TiB (Five trajectories of protein-ligand complex (200 snapshots in all) and the closest two water molecules for each of the protein-ligand complex, and the time series of the binding affinities along with average energy components). Version3: Total Size: 10.7 TiB (Five trajectories of completely solvated protein-ligand complex (200 snapshots in all), and the time series of binding affinities along with average energy components. + - Description: Coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. AWS S3 AI3 Publicly Available Dataset Size is included in the following descriptions of Version 1, Version2 and Version 3. Version1 contains the total Size of 10.4 GiB (Initial structure of the protein-ligand complex and the average binding affinities along with average energy components). 
Version2 contains the total Size of 1.2 TiB (Five trajectories of protein-ligand complex (200 snapshots in all) and the closest two water molecules for each of the protein-ligand complex, and the time series of the binding affinities along with average energy components). Version3 contains the total Size of 10.7 TiB (Five trajectories of completely solvated protein-ligand complex (200 snapshots in all), and the time series of binding affinities along with average energy components). ARN: arn:aws:account::345594599240:account Region: us-east-1 Type: S3 bucket From 99988ab1a645552117d95493b19ee3ca531ef868 Mon Sep 17 00:00:00 2001 From: devapriyakumar <81041093+devapriyakumar@users.noreply.github.com> Date: Sat, 26 Jul 2025 13:28:38 +0530 Subject: [PATCH 160/751] Update ai3.yaml --- datasets/ai3.yaml | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/datasets/ai3.yaml b/datasets/ai3.yaml index 64ad49558..1dbcee405 100644 --- a/datasets/ai3.yaml +++ b/datasets/ai3.yaml @@ -21,13 +21,13 @@ ManagedBy: International Institute of Information Technology Hyderabad UpdateFrequency: Not updated Tags: - - Pharmaceutical - - Simulations - - Health - - Life Sciences - - Machine Learning - - Protein - - Molecular Dynamics + - pharmaceutical + - simulations + - health + - life sciences + - machine learning + - protein + - molecular dynamics License: https://devalab.in/AI3.html From 4d2806a90f40d248435e6793b08cd6530e87f9d9 Mon Sep 17 00:00:00 2001 From: devapriyakumar <81041093+devapriyakumar@users.noreply.github.com> Date: Sat, 26 Jul 2025 13:31:14 +0530 Subject: [PATCH 161/751] Update ai3.yaml --- datasets/ai3.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ai3.yaml b/datasets/ai3.yaml index 1dbcee405..f62058ca3 100644 --- a/datasets/ai3.yaml +++ b/datasets/ai3.yaml @@ -35,7 +35,7 @@ Resources: - Description: Coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. 
AWS S3 AI3 Publicly Available Dataset Size is included in the following descriptions of Version 1, Version2 and Version 3. Version1 contains the total Size of 10.4 GiB (Initial structure of the protein-ligand complex and the average binding affinities along with average energy components). Version2 contains the total Size of 1.2 TiB (Five trajectories of protein-ligand complex (200 snapshots in all) and the closest two water molecules for each of the protein-ligand complex, and the time series of the binding affinities along with average energy components). Version3 contains the total Size of 10.7 TiB (Five trajectories of completely solvated protein-ligand complex (200 snapshots in all), and the time series of binding affinities along with average energy components). ARN: arn:aws:account::345594599240:account Region: us-east-1 - Type: S3 bucket + Type: S3 Bucket DataAtWork: Tutorials: From 591b0af54d83151fc79f056d17538d024a5599eb Mon Sep 17 00:00:00 2001 From: Thomas Dutkiewicz <106269091+ttdu@users.noreply.github.com> Date: Mon, 28 Jul 2025 10:51:35 -0400 Subject: [PATCH 162/751] add more details about the catalog --- datasets/gaia-dr3.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/gaia-dr3.yaml b/datasets/gaia-dr3.yaml index 01a3bb783..1da9ef993 100644 --- a/datasets/gaia-dr3.yaml +++ b/datasets/gaia-dr3.yaml @@ -1,7 +1,7 @@ Name: Gaia DR3 Description: | - [Gaia DR3 data](https://www.cosmos.esa.int/web/gaia/dr3) were originally released by the European Space Agency in December 2020. This [HATS](https://hats.readthedocs.io/en/stable)-formatted catalog was produced by the LSST Interdisciplinary Network for Collaboration and Computing. + [Gaia DR3 data](https://www.cosmos.esa.int/web/gaia/dr3) were originally released by the European Space Agency in December 2020. This [HATS](https://hats.readthedocs.io/en/stable)-formatted catalog was produced by the LSST Interdisciplinary Network for Collaboration and Computing. 
The GAIA HATS Datasets are specifically designed for efficient spatial cross-matching with other HATS-format catalogs, whether within the same archive or across distributed archive data centers. This enables astronomers to perform complex analyses, such as identifying correlations or overlaps between datasets from different surveys. Users can leverage [LSDB (Large-Scale Database)](https://docs.lsdb.io/en/latest/), a scalable spatial analysis library, to execute precise, high-performance operations like cone searches or cross-matching. Documentation: https://docs.lsdb.io/en/latest/index.html Contact: archive@stsci.edu ManagedBy: "[Space Telescope Science Institute](http://www.stsci.edu/)" From 11fbb84518fdf047806c62c22c1fa6795f57762f Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 28 Jul 2025 09:13:53 -0800 Subject: [PATCH 163/751] Update tags.yaml added Heliophysics --- tags.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/tags.yaml b/tags.yaml index 5fa9b4bc0..8ef2a06d1 100644 --- a/tags.yaml +++ b/tags.yaml @@ -203,6 +203,7 @@ - Hawkes Process - hdf5 - health +- heliophysics - high-throughput imaging - hiring - hispanic From 60c79efc3c666608de6cfdba8bd8c24e08166ecb Mon Sep 17 00:00:00 2001 From: devapriyakumar <81041093+devapriyakumar@users.noreply.github.com> Date: Tue, 29 Jul 2025 17:01:19 +0530 Subject: [PATCH 164/751] Update ai3.yaml --- datasets/ai3.yaml | 17 ++--------------- 1 file changed, 2 insertions(+), 15 deletions(-) diff --git a/datasets/ai3.yaml b/datasets/ai3.yaml index f62058ca3..51e30dd38 100644 --- a/datasets/ai3.yaml +++ b/datasets/ai3.yaml @@ -1,19 +1,6 @@ Name: AI3 Description: > - The rapid advancement of computing technologies, particularly artificial intelligence (AI), has revolutionized various domains, including drug discovery. Curated datasets are crucial for developing reliable, generalizable, and accurate models for practical applications. 
Generating experimental data on a large scale is an expensive and arduous process. In domains such as medical diagnostics where real-life data is hard to obtain, synthetic data has been shown to be extremely valuable. We, teams from IIIT Hyderabad, Intel, AWS, and Insilico Medicine, have performed physics-based calculations (molecular dynamics simulations) on about 20,000 protein-ligand complexes. The dataset comprises molecular dynamics snapshots, binding affinities calculated using the MM-PBSA method, and individual energy components, including electrostatic and van der Waals interactions. - -DatasetFileFormats: - - 3D coordinates of the protein-ligand complexes (pdb) in tar.gz files - - CSV files containing the energy data - -DatasetUsages: - - ML scoring function for predicting binding affinities of given protein-ligand complexes - - Classification models for predicting correct binding poses of ligands - - Identification of cryptic binding pockets - - Optimization of binding features by exploiting the individual components of the energy (experimental data has only the total binding affinity) - -DatasetNovelty: > - Existing AI/ML training datasets lack dynamic data and are inherently biased. Further, binding affinity data existing in the literature are obtained from different experimental protocols. Therefore, this dataset has been uniquely created (from the same computational protocols) followed by free energy calculations with molecular dynamics (MD) simulations. The dynamic data-enriched protein-ligand coordinates can be used to effectively train convolutional neural network-based regression models for more accurate binding affinity prediction. + The rapid advancement of computing technologies, particularly artificial intelligence (AI), has revolutionized various domains, including drug discovery. Curated datasets are crucial for developing reliable, generalizable, and accurate models for practical applications. 
Generating experimental data on a large scale is an expensive and arduous process. In domains such as medical diagnostics where real-life data is hard to obtain, synthetic data has been shown to be extremely valuable. We, teams from IIIT Hyderabad, Intel, AWS, and Insilico Medicine, have performed physics-based calculations (molecular dynamics simulations) on about 20,000 protein-ligand complexes. The dataset comprises molecular dynamics snapshots, binding affinities calculated using the MM-PBSA method, and individual energy components, including electrostatic and van der Waals interactions. DatasetFileFormats essentially incorporate i. 3D coordinates of the protein-ligand complexes (pdb) in tar.gz files, and ii. CSV files containing the energy data. DatasetUsages are on i. ML scoring function for predicting binding affinities of given protein-ligand complexes, ii. Classification models for predicting correct binding poses of ligands, iii. identification of cryptic binding pockets, and iv. optimization of binding features by exploiting the individual components of the energy (experimental data has only the total binding affinity). Further, the novelty of the dataset highlights the fact that existing AI/ML training datasets lack dynamic data and are inherently biased. Further, binding affinity data existing in the literature are obtained from different experimental protocols. Therefore, this dataset has been uniquely created (from the same computational protocols) followed by free energy calculations with molecular dynamics (MD) simulations. The dynamic data-enriched protein-ligand coordinates can be used to effectively train convolutional neural network-based regression models for more accurate binding affinity prediction. Documentation: https://github.com/devalab/AI3 Contact: devalab@iiit.ac.in @@ -32,7 +19,7 @@ Tags: License: https://devalab.in/AI3.html Resources: - - Description: Coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. 
AWS S3 AI3 Publicly Available Dataset Size is included in the following descriptions of Version 1, Version2 and Version 3. Version1 contains the total Size of 10.4 GiB (Initial structure of the protein-ligand complex and the average binding affinities along with average energy components). Version2 contains the total Size of 1.2 TiB (Five trajectories of protein-ligand complex (200 snapshots in all) and the closest two water molecules for each of the protein-ligand complex, and the time series of the binding affinities along with average energy components). Version3 contains the total Size of 10.7 TiB (Five trajectories of completely solvated protein-ligand complex (200 snapshots in all), and the time series of binding affinities along with average energy components). + - Description: ai3data bucket includes coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. The subfolders of ai3data bucket consist of Version 1, Version2 and Version 3. Version1 contains the total Size of 10.4 GiB (Initial structure of the protein-ligand complex and the average binding affinities along with average energy components). Version2 contains the total Size of 1.2 TiB (Five trajectories of protein-ligand complex (200 snapshots in all) and the closest two water molecules for each of the protein-ligand complex, and the time series of the binding affinities along with average energy components). Version3 contains the total Size of 10.7 TiB (Five trajectories of completely solvated protein-ligand complex (200 snapshots in all), and the time series of binding affinities along with average energy components). 
ARN: arn:aws:account::345594599240:account Region: us-east-1 Type: S3 Bucket From ed2ebd282443811eb15967271cc9d786dd45d992 Mon Sep 17 00:00:00 2001 From: Pradeep Vanga <5126396+vanga@users.noreply.github.com> Date: Wed, 30 Jul 2025 12:23:38 +0530 Subject: [PATCH 165/751] Indian Supreme Court Judgments --- datasets/indian-supreme-court-judgments.yaml | 30 ++++++++++++++++++++ 1 file changed, 30 insertions(+) create mode 100644 datasets/indian-supreme-court-judgments.yaml diff --git a/datasets/indian-supreme-court-judgments.yaml b/datasets/indian-supreme-court-judgments.yaml new file mode 100644 index 000000000..747475948 --- /dev/null +++ b/datasets/indian-supreme-court-judgments.yaml @@ -0,0 +1,30 @@ +Name: Indian Supreme Court Judgments +Description: This dataset contains judgements from the Indian Supreme Court, downloaded from ecourts website. It contains judgments from 1950 to 2025, along with raw metadata (in json format) and structured metadata in parquet format. Judgments are available in both English and regional Indian languages in zip format for easier download. 
+Documentation: https://github.com/vanga/indian-supreme-court-judgments/blob/main/opendata/docs/dataset.md +Contact: contact@dattam.in +ManagedBy: "[Dattam Labs](https://dattam.in)" +UpdateFrequency: Bi-monthly +Tags: + - legal data + - supreme court + - india +License: CC-BY-4.0 +Resources: + - Description: S3 bucket containing the judgments + ARN: arn:aws:s3:::indian-supreme-court-judgments + Region: ap-south-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Using AWS Athena to query the metadata + URL: https://github.com/vanga/indian-supreme-court-judgments/blob/main/opendata/tutorials/ATHENA.md + AuthorName: Nihesh Rachakonda + AuthorURL: https://github.com/rnihesh + Services: + - Amazon Athena + - Title: Dataset Overview and Usage Examples + URL: https://github.com/vanga/indian-supreme-court-judgments/blob/main/opendata/tutorials/README.md + AuthorName: Nihesh Rachakonda + AuthorURL: https://github.com/rnihesh + Services: + - Amazon S3 From 12d8a5626bc03ded02092666a087671c0c53c453 Mon Sep 17 00:00:00 2001 From: Chris Stoner Date: Wed, 30 Jul 2025 12:38:34 -0800 Subject: [PATCH 166/751] fix search by removing table html --- datasets/aws-public-blockchain.yaml | 28 +++++++++++----------------- 1 file changed, 11 insertions(+), 17 deletions(-) diff --git a/datasets/aws-public-blockchain.yaml b/datasets/aws-public-blockchain.yaml index 50eb179b4..bfd706526 100644 --- a/datasets/aws-public-blockchain.yaml +++ b/datasets/aws-public-blockchain.yaml @@ -3,25 +3,19 @@ Description: >

The AWS Public Blockchain Data initiative provides free access to blockchain datasets through collaboration with data providers. The data is optimized for analytics by being transformed into compressed Parquet files, partitioned by date for efficient querying.

Datasets

-
Blockchain dataset Maintained by Path
- - - - - - - - - - - - - - -
Blockchain dataset Maintained by Path
Bitcoin AWS s3://aws-public-blockchain/v1.0/btc/
Ethereum AWS s3://aws-public-blockchain/v1.0/eth/
Arbitrum SonarX s3://aws-public-blockchain/v1.1/sonarx/arbitrum/
Aptos SonarX s3://aws-public-blockchain/v1.1/sonarx/aptos/
Base SonarX s3://aws-public-blockchain/v1.1/sonarx/base/
Provenance SonarX s3://aws-public-blockchain/v1.1/sonarx/provenance/
XRP Ledger SonarX s3://aws-public-blockchain/v1.1/sonarx/xrp/
Stellar (XDR files) Stellar s3://aws-public-blockchain/v1.1/stellar/
The Open Network (TON) TON s3://aws-public-blockchain/v1.1/ton/
+ Blockchain dataset - Maintained by - Path:
+ - Bitcoin - AWS - s3://aws-public-blockchain/v1.0/btc/
+ - Ethereum - AWS - s3://aws-public-blockchain/v1.0/eth/
+ - Arbitrum - SonarX - s3://aws-public-blockchain/v1.1/sonarx/arbitrum/
+ - Aptos - SonarX - s3://aws-public-blockchain/v1.1/sonarx/aptos/
+ - Base - SonarX - s3://aws-public-blockchain/v1.1/sonarx/base/
+ - Provenance - SonarX - s3://aws-public-blockchain/v1.1/sonarx/provenance/
+ - XRP Ledger - SonarX - s3://aws-public-blockchain/v1.1/sonarx/xrp/
+ - Stellar(XDR files) - Stellar - s3://aws-public-blockchain/v1.1/stellar/
+ - The Open Network (TON) - TON - s3://aws-public-blockchain/v1.1/ton/

-

Become a Data Provider

+

Become a Data Provider

We welcome additional blockchain data providers to join this initiative. If you're interested in contributing datasets to the AWS Public Blockchain Data program, please contact our team at aws-public-blockchain@amazon.com.

From ede3b6bc939fce86c691b5510098a21fc30f4b20 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 30 Jul 2025 12:50:49 -0800 Subject: [PATCH 167/751] Update aws-public-blockchain.yaml pushing build --- datasets/aws-public-blockchain.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/aws-public-blockchain.yaml b/datasets/aws-public-blockchain.yaml index bfd706526..f4e86221b 100644 --- a/datasets/aws-public-blockchain.yaml +++ b/datasets/aws-public-blockchain.yaml @@ -13,7 +13,7 @@ Description: > - XRP Ledger - SonarX - s3://aws-public-blockchain/v1.1/sonarx/xrp/
- Stellar(XDR files) - Stellar - s3://aws-public-blockchain/v1.1/stellar/
- The Open Network (TON) - TON - s3://aws-public-blockchain/v1.1/ton/
-
+

Become a Data Provider

We welcome additional blockchain data providers to join this initiative. If you're interested in contributing datasets to the AWS Public Blockchain Data program, please contact our team at aws-public-blockchain@amazon.com.

From eacb9d8da60c88cff067231b55189cfe411555fd Mon Sep 17 00:00:00 2001 From: xhagrg Date: Wed, 30 Jul 2025 18:05:24 -0500 Subject: [PATCH 168/751] Update used tags. --- datasets/surya-bench.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/surya-bench.yaml b/datasets/surya-bench.yaml index 60be9f97a..2a48b9460 100644 --- a/datasets/surya-bench.yaml +++ b/datasets/surya-bench.yaml @@ -12,6 +12,7 @@ Tags: - machine learning - solar - aws-pds + - heliophysics License: | Creative Commons Attribution 4.0 International. Citation: > From 9399cf794e460f5eee9bf023cea8d00f7c0d38b9 Mon Sep 17 00:00:00 2001 From: Yuk Kei Wan <41866052+yuukiiwa@users.noreply.github.com> Date: Thu, 31 Jul 2025 11:03:04 +0800 Subject: [PATCH 169/751] update license to CC BY 4 --- datasets/yuukiiwa_application_placeholder.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/yuukiiwa_application_placeholder.yaml b/datasets/yuukiiwa_application_placeholder.yaml index 79146bbb5..7934e04af 100644 --- a/datasets/yuukiiwa_application_placeholder.yaml +++ b/datasets/yuukiiwa_application_placeholder.yaml @@ -6,7 +6,7 @@ ManagedBy: "The Genome Institute of Singapore (https://www.a-star.edu.sg/gis) an UpdateFrequency: Datasets will be updated periodically as additional data are generated. 
Tags: - TBD -License: "[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" +License: "[CC BY 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" Citation: Update on August 31, 2025 Resources: - Description: Update on August 31, 2025 From d31361616f6155b17ce70481bb333f0ce4da5d64 Mon Sep 17 00:00:00 2001 From: Yuk Kei Wan <41866052+yuukiiwa@users.noreply.github.com> Date: Thu, 31 Jul 2025 11:05:42 +0800 Subject: [PATCH 170/751] update to CC BY 4.0 --- datasets/yuukiiwa_application_placeholder.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/yuukiiwa_application_placeholder.yaml b/datasets/yuukiiwa_application_placeholder.yaml index 7934e04af..9f9b79afb 100644 --- a/datasets/yuukiiwa_application_placeholder.yaml +++ b/datasets/yuukiiwa_application_placeholder.yaml @@ -6,7 +6,7 @@ ManagedBy: "The Genome Institute of Singapore (https://www.a-star.edu.sg/gis) an UpdateFrequency: Datasets will be updated periodically as additional data are generated. 
Tags: - TBD -License: "[CC BY 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" +License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" Citation: Update on August 31, 2025 Resources: - Description: Update on August 31, 2025 From 08f1fe8af143bce18a936ded9569f9186f1e9c89 Mon Sep 17 00:00:00 2001 From: Charlotte <146997821+charlottecrevier@users.noreply.github.com> Date: Thu, 31 Jul 2025 14:14:14 -0400 Subject: [PATCH 171/751] Add dataset canada dem --- datasets/canelevation-dem.yml | 49 +++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) create mode 100644 datasets/canelevation-dem.yml diff --git a/datasets/canelevation-dem.yml b/datasets/canelevation-dem.yml new file mode 100644 index 000000000..42b67e693 --- /dev/null +++ b/datasets/canelevation-dem.yml @@ -0,0 +1,49 @@ +Name: CanElevation - Canada Digital Elevation Models +Description: # TODO : Add description +Documentation: [Medium Resolution Digital Elevation Model - MRDEM](https://open.canada.ca/data/en/dataset/18752265-bda3-498c-a4ba-9dfe68cb98da) [High Resolution Digital Elevation Model Mosaic](https://open.canada.ca/data/en/dataset/0fe65119-e96e-4a57-8bfe-9d9245fba06b) +Contact: geoinfo@nrcan-rncan.gc.ca +ManagedBy: "[Natural Resources Canada](https://nrcan.gc.ca/)" +UpdateFrequency: "The dataset is updated as new DEM models becomes available. + +L'ensemble de données est mis à jour à mesure que des nouveaux modèles numérique d'élévation deviennent disponibles." 
+Tags: + - canada #TODO : Needs to be submitted as new tag in the tag.yml + - elevation + - geospatial + - stac + - land + - dsm #TODO : Needs to be submitted as new tag in the tag.yml + - dtm #TODO : Needs to be submitted as new tag in the tag.yml + - dem #TODO : Needs to be submitted as new tag in the tag.yml +License: "[Open Government License (OGL)](https://open.canada.ca/en/open-government-licence-canada)" +Resources: + - Description: Mosaic of High Resolution Digital Elevation Model (HRDEM) at 1m / Mosaïque de Modèle numérique d'élévation de haute résolution (MNEHR) à 1m + ARN: arn:aws:s3:::canelevation-dem/hrdem-mosaic-1m/ + Region: ca-central-1 + Type: S3 Bucket + Explore: https://datacube.services.geo.ca/stac/api/search?collections=hrdem-mosaic-1m + - Description: Mosaic of High Resolution Digital Elevation Model (HRDEM) at 2m / Mosaïque de Modèle numérique d'élévation de haute résolution (MNEHR) à 2m + ARN: arn:aws:s3:::canelevation-dem/hrdem-mosaic-2m/ + Region: ca-central-1 + Type: S3 Bucket + Explore: https://datacube.services.geo.ca/stac/api/search?collections=hrdem-mosaic-2m + - Description: Medium Resolution Digital Elevation Model (MRDEM). Modèle numérique d'élévation de moyenne résolution (MNEMR) + ARN: arn:aws:s3:::canelevation-dem/mrdem-30/ + Region: ca-central-1 + Type: S3 Bucket + Explore: https://datacube.services.geo.ca/stac/api/search?collections=mrdem-30 + - Description: Mosaic of High Resolution Digital Elevation Model (HRDEM) by LiDAR acquisition project. Mosaïque de Modèle numérique d'élévation de haute résolution (MNEHR) par project d'acquisition LiDAR. 
+ ARN: arn:aws:s3:::canelevation-dem/hrdem-lidar/ + Region: ca-central-1 + Type: S3 Bucket + Explore: https://datacube.services.geo.ca/stac/api/search?collections=hrdem-lidar +DataAtWork: + Tutorials: + - Title: Cloud-Optimized Geospatial Data Access + URL: https://nrcan.github.io/cloud-optimized-geospatial/ + AuthorName: NRCan + Publications: + - Title: # TODO : Add Heather's publication + URL: + AuthorName: + AuthorURL: \ No newline at end of file From 50f11ba96bde595494b75b7ab2a061d4ed52ffa7 Mon Sep 17 00:00:00 2001 From: Charlotte <146997821+charlottecrevier@users.noreply.github.com> Date: Thu, 31 Jul 2025 14:16:34 -0400 Subject: [PATCH 172/751] Update tags.yaml --- tags.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tags.yaml b/tags.yaml index 8ef2a06d1..5c56d7aef 100644 --- a/tags.yaml +++ b/tags.yaml @@ -51,6 +51,7 @@ - broadcast ephemeris - Caenorhabditis elegans - calcium imaging +- canada - cancer - carbon - cell biology @@ -110,6 +111,7 @@ - deafrica - decennial census - deep learning +- dem - demographic and housing characteristics file - demographics - demography @@ -128,6 +130,8 @@ - drilling - drifters - Drosophila melanogaster +- dsm +- dtm - earth observation - earthquakes - economics From 56db71d8816a85f9905ef79fe7144097dc064c0a Mon Sep 17 00:00:00 2001 From: Charlotte <146997821+charlottecrevier@users.noreply.github.com> Date: Thu, 31 Jul 2025 14:30:17 -0400 Subject: [PATCH 173/751] Updated description and tags --- datasets/canelevation-dem.yml | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/datasets/canelevation-dem.yml b/datasets/canelevation-dem.yml index 42b67e693..012d9f6b5 100644 --- a/datasets/canelevation-dem.yml +++ b/datasets/canelevation-dem.yml @@ -1,5 +1,5 @@ Name: CanElevation - Canada Digital Elevation Models -Description: # TODO : Add description +Description: The Canadian DEM represents the current coverage of elevation data available. 
This dataset includes a Digital Terrain Model (DTM), a Digital Surface Model (DSM) and other derived products. This dataset includes a 1m, 2m and 30m DEM. The 1m and 2 m products are a combination of DEM data generated from airborne LiDAR and optical digital images. The 30 m DEM integrates data from the Copernicus DEM acquired during the TanDEM-X Mission, with the DEM data derived from airborne lidar and provides a complete coverage for Canada. Documentation: [Medium Resolution Digital Elevation Model - MRDEM](https://open.canada.ca/data/en/dataset/18752265-bda3-498c-a4ba-9dfe68cb98da) [High Resolution Digital Elevation Model Mosaic](https://open.canada.ca/data/en/dataset/0fe65119-e96e-4a57-8bfe-9d9245fba06b) Contact: geoinfo@nrcan-rncan.gc.ca ManagedBy: "[Natural Resources Canada](https://nrcan.gc.ca/)" @@ -7,14 +7,14 @@ UpdateFrequency: "The dataset is updated as new DEM models becomes available. L'ensemble de données est mis à jour à mesure que des nouveaux modèles numérique d'élévation deviennent disponibles." 
Tags: - - canada #TODO : Needs to be submitted as new tag in the tag.yml + - canada - elevation - geospatial - stac - land - - dsm #TODO : Needs to be submitted as new tag in the tag.yml - - dtm #TODO : Needs to be submitted as new tag in the tag.yml - - dem #TODO : Needs to be submitted as new tag in the tag.yml + - dsm + - dtm + - dem License: "[Open Government License (OGL)](https://open.canada.ca/en/open-government-licence-canada)" Resources: - Description: Mosaic of High Resolution Digital Elevation Model (HRDEM) at 1m / Mosaïque de Modèle numérique d'élévation de haute résolution (MNEHR) à 1m @@ -43,7 +43,7 @@ DataAtWork: URL: https://nrcan.github.io/cloud-optimized-geospatial/ AuthorName: NRCan Publications: - - Title: # TODO : Add Heather's publication + - Title: URL: AuthorName: AuthorURL: \ No newline at end of file From ad27405601b3fe669bd093943e4cb16ab8b651ea Mon Sep 17 00:00:00 2001 From: Charlotte <146997821+charlottecrevier@users.noreply.github.com> Date: Fri, 1 Aug 2025 10:01:32 -0400 Subject: [PATCH 174/751] Added SNS and browse bucket placeholder --- datasets/canelevation-dem.yml | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-) diff --git a/datasets/canelevation-dem.yml b/datasets/canelevation-dem.yml index 012d9f6b5..02f0af26d 100644 --- a/datasets/canelevation-dem.yml +++ b/datasets/canelevation-dem.yml @@ -21,22 +21,38 @@ Resources: ARN: arn:aws:s3:::canelevation-dem/hrdem-mosaic-1m/ Region: ca-central-1 Type: S3 Bucket - Explore: https://datacube.services.geo.ca/stac/api/search?collections=hrdem-mosaic-1m + Explore: + - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=hrdem-mosaic-1m)' + - '[Browse Bucket](...)' - Description: Mosaic of High Resolution Digital Elevation Model (HRDEM) at 2m / Mosaïque de Modèle numérique d'élévation de haute résolution (MNEHR) à 2m ARN: arn:aws:s3:::canelevation-dem/hrdem-mosaic-2m/ Region: ca-central-1 Type: S3 Bucket - Explore: 
https://datacube.services.geo.ca/stac/api/search?collections=hrdem-mosaic-2m + Explore: + - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=hrdem-mosaic-2m)' + - '[Browse Bucket](...)' - Description: Medium Resolution Digital Elevation Model (MRDEM). Modèle numérique d'élévation de moyenne résolution (MNEMR) ARN: arn:aws:s3:::canelevation-dem/mrdem-30/ Region: ca-central-1 Type: S3 Bucket - Explore: https://datacube.services.geo.ca/stac/api/search?collections=mrdem-30 + Explore: + - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=mrdem-30)' + - '[Browse Bucket](...)' - Description: Mosaic of High Resolution Digital Elevation Model (HRDEM) by LiDAR acquisition project. Mosaïque de Modèle numérique d'élévation de haute résolution (MNEHR) par project d'acquisition LiDAR. ARN: arn:aws:s3:::canelevation-dem/hrdem-lidar/ Region: ca-central-1 Type: S3 Bucket - Explore: https://datacube.services.geo.ca/stac/api/search?collections=hrdem-lidar + Explore: + - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=hrdem-lidar)' + - '[Browse Bucket](...)' + - Description: Notifications for Medium Resolution Digital Elevation Model (MRDEM) + ARN: ... + Region: ca-central-1 + Type: SNS Topic + - Description: Notifications for High Resolution Digital Elevation Model (HRDEM) + ARN: ... 
+ Region: ca-central-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: Cloud-Optimized Geospatial Data Access @@ -46,4 +62,4 @@ DataAtWork: - Title: URL: AuthorName: - AuthorURL: \ No newline at end of file + AuthorURL: From 4694afd5dad636952882e6d2c5c548bc736a2ab7 Mon Sep 17 00:00:00 2001 From: devapriyakumar <81041093+devapriyakumar@users.noreply.github.com> Date: Sat, 2 Aug 2025 12:10:59 +0530 Subject: [PATCH 175/751] Update ai3.yaml --- datasets/ai3.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ai3.yaml b/datasets/ai3.yaml index 51e30dd38..70e395b5a 100644 --- a/datasets/ai3.yaml +++ b/datasets/ai3.yaml @@ -20,7 +20,7 @@ License: https://devalab.in/AI3.html Resources: - Description: ai3data bucket includes coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. The subfolders of ai3data bucket consist of Version 1, Version2 and Version 3. Version1 contains the total Size of 10.4 GiB (Initial structure of the protein-ligand complex and the average binding affinities along with average energy components). Version2 contains the total Size of 1.2 TiB (Five trajectories of protein-ligand complex (200 snapshots in all) and the closest two water molecules for each of the protein-ligand complex, and the time series of the binding affinities along with average energy components). Version3 contains the total Size of 10.7 TiB (Five trajectories of completely solvated protein-ligand complex (200 snapshots in all), and the time series of binding affinities along with average energy components). 
- ARN: arn:aws:account::345594599240:account + ARN: arn:aws:s3:::ai3data Region: us-east-1 Type: S3 Bucket From b4806377fa9e00bb88db3be98c560e7e1e0c78ac Mon Sep 17 00:00:00 2001 From: berylrab Date: Mon, 4 Aug 2025 10:20:26 -0400 Subject: [PATCH 176/751] Update ai3.yaml --- datasets/ai3.yaml | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/datasets/ai3.yaml b/datasets/ai3.yaml index 70e395b5a..1c5001bbb 100644 --- a/datasets/ai3.yaml +++ b/datasets/ai3.yaml @@ -26,12 +26,10 @@ Resources: DataAtWork: Tutorials: - - Description: The dataset is easy to download and can be applied based on user requirements. Further information about the protocol for creation of the dataset can be obtained from https://github.com/devalab/AI3. - ToolsApplications: - - Title: Dataset of protein-ligand complexes now available in the Registry of Open Data on AWS - URL: To be added once blog is published - AuthorName: U. Deva Priyakumar, Rakesh Srivastava, Prathit Chatterjee, Vladimir Aladinskiy, Ramanathan Sethuraman, Yusong Wang, Alex Iankoulski, Beryl Rabindran - AuthorURL: https://devalab.in/ + - Title: "AI3: Protein-Ligand Binding Affinity Dataset" + URL: https://github.com/devalab/AI3 + AuthorName: Deva Priyakumar Lab + AuthorURL: https://github.com/devalab Publications: - Title: "PLAS-5k: Dataset of Protein-Ligand Affinities from Molecular Dynamics for Machine Learning Applications" URL: https://www.nature.com/articles/s41597-022-01631-9 From d9757687113a6f12b7201712ec34e6b20c3616ff Mon Sep 17 00:00:00 2001 From: Ethan Grant Date: Mon, 4 Aug 2025 15:01:42 -0400 Subject: [PATCH 177/751] Update tags.yaml Co-Authored-By: KaseyW31 <115895250+kaseyw31@users.noreply.github.com> --- tags.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tags.yaml b/tags.yaml index 5fa9b4bc0..b8e73a995 100644 --- a/tags.yaml +++ b/tags.yaml @@ -197,11 +197,10 @@ - green aviation - ground water - group quarters -- h5 - hazard - hazard indicator - Hawkes 
Process -- hdf5 +- hdf - health - high-throughput imaging - hiring @@ -416,6 +415,7 @@ - temporal point process - tertiary analysis - text analysis +- tiff - tiles - time series forecasting - trading From b2f73cb5ff9ab483e142dd5b22cd45de483c2104 Mon Sep 17 00:00:00 2001 From: kszura <43186787+kszura@users.noreply.github.com> Date: Tue, 5 Aug 2025 09:39:56 -0400 Subject: [PATCH 178/751] Update noaa-historicalcharts.yaml Updated contact information. --- datasets/noaa-historicalcharts.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/noaa-historicalcharts.yaml b/datasets/noaa-historicalcharts.yaml index 7d4cc7eb0..7f40f9136 100644 --- a/datasets/noaa-historicalcharts.yaml +++ b/datasets/noaa-historicalcharts.yaml @@ -2,7 +2,7 @@ Name: NOAA Historical Maps and Charts Description: Historical Charts are not for Navigation. The collection primarily consists of historic charts and maps produced by NOAA's Coast Survey and its predecessors, especially the U.S. Coast and Geodetic Survey and the U.S. Lake Survey (previously under the Department of War). The collection also includes bathymetric maps, land sketches, Civil War battle maps, aeronautical charting from the 1930s to the 1950s, and other drawings and photographs. Documentation: https://historicalcharts.noaa.gov/about.php Contact: | - For any questions regarding data delivery not associated with this platform or any general questions regarding the NOAA Big Data Program, email noaa.bdp@noaa.gov.

+ For any general questions regarding the NOAA Open Data Dissemination (NODD) Program, email the NODD Team at nodd@noaa.gov. We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NODD team by emailing nodd@noaa.gov.

For general questions or feedback about the data, please submit inquiries through the NOAA Office of Coast Survey (OCS) ASSIST Tool at https://www.nauticalcharts.noaa.gov/customer-service/assist/. ManagedBy: "[NOAA](http://www.noaa.gov/)" UpdateFrequency: Periodic manual updates when historic charts are added to the collection. From d3604d7ea6b5512f93e1910f7823593519f5f57d Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 5 Aug 2025 08:23:53 -0800 Subject: [PATCH 179/751] Update noaa-historicalcharts.yaml --- datasets/noaa-historicalcharts.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/noaa-historicalcharts.yaml b/datasets/noaa-historicalcharts.yaml index 7f40f9136..f504f51db 100644 --- a/datasets/noaa-historicalcharts.yaml +++ b/datasets/noaa-historicalcharts.yaml @@ -25,3 +25,4 @@ Resources: Type: S3 Bucket Explore: - '[Browse Bucket](https://noaa-nos-historicalcharts-pds.s3.amazonaws.com/index.html)' + From 0b6ad09bd0801090374a6b7ae483be589ff180e4 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 5 Aug 2025 09:16:47 -0800 Subject: [PATCH 180/751] Update tags.yaml From ca4bc28d4ac52532603c7a13c82157026991f138 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 5 Aug 2025 09:21:40 -0800 Subject: [PATCH 181/751] Update tags.yaml cannot remove tags that are already in use --- tags.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tags.yaml b/tags.yaml index f2d58be28..8c2a82f2a 100644 --- a/tags.yaml +++ b/tags.yaml @@ -197,9 +197,11 @@ - green aviation - ground water - group quarters +- h5 - hazard - hazard indicator - Hawkes Process +- hdf5 - hdf - health - heliophysics From ef973693454dcc4c26218bc3c5eb060a2506b19f Mon Sep 17 00:00:00 2001 From: Hyun Min Kang Date: Wed, 6 Aug 2025 10:15:44 -0400 Subject: [PATCH 182/751] Updated the license to CC-BY-4.0 for cartostore.yaml --- datasets/cartostore.yaml | 2 +- 1 file 
changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/cartostore.yaml b/datasets/cartostore.yaml index 892d0a729..5bacbbd79 100644 --- a/datasets/cartostore.yaml +++ b/datasets/cartostore.yaml @@ -12,7 +12,7 @@ Tags: - bioinformatics - life sciences License: | - "[CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)" + "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" Citation: | CartoStore by Hyun Min Kang's lab at the University of Michigan School of Public Health. Provided by Kang lab and accessed [DAY MONTH YEAR]. From 014b11d4bf08c17fa62605f9716d0eb013990c20 Mon Sep 17 00:00:00 2001 From: berylrab Date: Wed, 6 Aug 2025 11:14:05 -0400 Subject: [PATCH 183/751] Update indian-supreme-court-judgments.yaml Removing tags not in the approved list --- datasets/indian-supreme-court-judgments.yaml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/datasets/indian-supreme-court-judgments.yaml b/datasets/indian-supreme-court-judgments.yaml index 747475948..d9f2d14d7 100644 --- a/datasets/indian-supreme-court-judgments.yaml +++ b/datasets/indian-supreme-court-judgments.yaml @@ -6,8 +6,7 @@ ManagedBy: "[Dattam Labs](https://dattam.in)" UpdateFrequency: Bi-monthly Tags: - legal data - - supreme court - - india + - aws-pds License: CC-BY-4.0 Resources: - Description: S3 bucket containing the judgments From 0b46c9c73b8e0ee26845d3ffc516d8f2ff58b5b8 Mon Sep 17 00:00:00 2001 From: Chris Stoner Date: Wed, 6 Aug 2025 14:55:09 -0800 Subject: [PATCH 184/751] NEXRAD new bucket --- datasets/noaa-nexrad.yaml | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/datasets/noaa-nexrad.yaml b/datasets/noaa-nexrad.yaml index 990578f4b..cfa9222f9 100644 --- a/datasets/noaa-nexrad.yaml +++ b/datasets/noaa-nexrad.yaml @@ -1,5 +1,12 @@ Name: NEXRAD on AWS -Description: Real-time and archival data from the Next Generation Weather Radar (NEXRAD) network. 
+Description: | + Real-time and archival data from the Next Generation Weather Radar (NEXRAD) network. +
+

Update

+ The NEXRAD Level II archive data is moving to a new bucket: unidata-nexrad-level2 + and SNS topic: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel2Archive. The old + bucket and SNS topic are now deprecated and will no longer be available starting September 1, 2025. +

Documentation: https://github.com/awslabs/open-data-docs/tree/main/docs/noaa/noaa-nexrad Contact: support-level2@unidata.ucar.edu ManagedBy: "[Unidata](https://www.unidata.ucar.edu/)" @@ -18,11 +25,11 @@ Tags: License: NOAA data disseminated through NODD are open to the public and can be used as desired.

NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. Resources: - Description: NEXRAD Level II archive data - ARN: arn:aws:s3:::noaa-nexrad-level2 + ARN: arn:aws:s3:::unidata-nexrad-level2 Region: us-east-1 Type: S3 Bucket Explore: - - '[Browse Bucket](https://noaa-nexrad-level2.s3.amazonaws.com/index.html)' + - '[Browse Bucket](https://unidata-nexrad-level2.s3.amazonaws.com/index.html)' - Description: NEXRAD Level II real-time data ARN: arn:aws:s3:::unidata-nexrad-level2-chunks Region: us-east-1 @@ -37,14 +44,22 @@ Resources: ARN: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel2ObjectFilterable Region: us-east-1 Type: SNS Topic - - Description: Notifications for the Level II archival bucket - ARN: arn:aws:sns:us-east-1:811054952067:NewNEXRADLevel2Archive + - Description: Notifications for the new Level II archival bucket + ARN: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel2Archive Region: us-east-1 Type: SNS Topic - Description: Notifications for the Level III bucket ARN: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel3Object Region: us-east-1 Type: SNS Topic + - Description: "*OLD NEXRAD Level II archive bucket* which is now Deprecated. It is recommended to move to the new bucket: unidata-nexrad-level2 and SNS topic: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel2Archive" + ARN: arn:aws:s3:::noaa-nexrad-level2 + Region: us-east-1 + Type: S3 Bucket + - Description: "Notifications for the *OLD Level II archival bucket* which is now Deprecated. 
It is recommended to move to the new bucket: unidata-nexrad-level2 and SNS topic: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel2Archive" + ARN: arn:aws:sns:us-east-1:811054952067:NewNEXRADLevel2Archive + Region: us-east-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: Using Python to Access NCEI Archived NEXRAD Level 2 Data (Jupyter notebook) From d5126a60e8aedc8ac6dc23a67ff7850d340a63a3 Mon Sep 17 00:00:00 2001 From: rsignell <125569335+rsignell@users.noreply.github.com> Date: Thu, 7 Aug 2025 06:45:11 -0400 Subject: [PATCH 185/751] Update fvcom_gom3.yaml --- datasets/fvcom_gom3.yaml | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/datasets/fvcom_gom3.yaml b/datasets/fvcom_gom3.yaml index 37c1550b3..4ee360b59 100644 --- a/datasets/fvcom_gom3.yaml +++ b/datasets/fvcom_gom3.yaml @@ -1,24 +1,24 @@ -Name: FVCOM GOM3 -Description: The Finite Volume Community Ocean Model (FVCOM) was used to simulate ocean water levels, velocity, temperature and salinity over a multi-decadal period (1984-present) in the waters of the Northeast US including the Gulf of Maine. The model was configured and run by the Dr. Changshen Chen, Director of the Marine Ecosystems Dynamics Modeling Laboratory in the School for Marine Science & Technology at the University of Massachusetts Dartmouth. The triangular mesh has a varying horizontal resolution from several hundred meters inshore to several kilometers offshore. The model output was saved at hourly from 2009-08-21 to 2022-06-17. -Documentation: https://www.umassd.edu/news/2018/charting-the-ocean-.html +Name: UMASSD-FVCOM-GOM3-Hindcast +Description: The Finite Volume Community Ocean Model (FVCOM) was used to simulate ocean water levels, velocity, temperature and salinity over a multi-decadal period (1984-present) in the waters of the Northeast US including the Gulf of Maine. The model was configured and run by the Dr. 
Changshen Chen, Director of the Marine Ecosystems Dynamics Modeling Laboratory in the School for Marine Science & Technology at the University of Massachusetts Dartmouth. The triangular mesh has a varying horizontal resolution from several hundred meters inshore to several kilometers offshore, and 45 terrain-following vertical layers. The model output was saved at hourly intervals from 2009-08-21 to 2022-06-17. +Documentation: https://en.wikipedia.org/wiki/Finite_Volume_Community_Ocean_Model Contact: rich@opensciencecomputing.com ManagedBy: Open Science Computing, LLC UpdateFrequency: None -Citation: +Citation: https://web.archive.org/web/20161229211546id_/http://fvcom.smast.umassd.edu/wp-content/uploads/2013/11/MITSG_12-25.pdf Tags: - aws-pds - oceans License: CC0 Resources: - - Description: A collection of NetCDF files, kerchunk generated JSON files, and an Intake catalog + - Description: A collection of NetCDF files, kerchunk-generated Parquet reference files, and an Intake catalog ARN: arn:aws:s3:::fvcom-gom3 - Region: us-west-2 + Region: us-east-1 Type: S3 Bucket DataAtWork: Tutorials: - Title: FVCOM Explorer Notebook URL: https://github.com/opensciencecomputing/fvcom - NotebookURL: https://github.com/opensciencecomputing/fvcom/blob/main/FVCOM_explore.ipynb + NotebookURL: https://github.com/opensciencecomputing/umassd-fvcom/blob/main/fvcom_gom3_explore.ipynb AuthorName: Rich Signell AuthorURL: https://about.me/rich.signell Services: From ea7f3d846ff4c54d3f5a95e156e6103dd21b5a09 Mon Sep 17 00:00:00 2001 From: Charlotte <146997821+charlottecrevier@users.noreply.github.com> Date: Thu, 7 Aug 2025 07:42:41 -0400 Subject: [PATCH 186/751] Update and rename canelevation-dem.yml to canelevation-dem.yaml Rename to canelevation-yaml, added sns ressources and french description --- ...levation-dem.yml => canelevation-dem.yaml} | 28 +++++++++++++++---- 1 file changed, 22 insertions(+), 6 deletions(-) rename datasets/{canelevation-dem.yml => canelevation-dem.yaml} (65%) 
diff --git a/datasets/canelevation-dem.yml b/datasets/canelevation-dem.yaml similarity index 65% rename from datasets/canelevation-dem.yml rename to datasets/canelevation-dem.yaml index 02f0af26d..6e3ae32d8 100644 --- a/datasets/canelevation-dem.yml +++ b/datasets/canelevation-dem.yaml @@ -1,10 +1,13 @@ Name: CanElevation - Canada Digital Elevation Models -Description: The Canadian DEM represents the current coverage of elevation data available. This dataset includes a Digital Terrain Model (DTM), a Digital Surface Model (DSM) and other derived products. This dataset includes a 1m, 2m and 30m DEM. The 1m and 2 m products are a combination of DEM data generated from airborne LiDAR and optical digital images. The 30 m DEM integrates data from the Copernicus DEM acquired during the TanDEM-X Mission, with the DEM data derived from airborne lidar and provides a complete coverage for Canada. +Description: The Canadian DEM represents the current coverage of elevation data available. This dataset includes a Digital Terrain Model (DTM), a Digital Surface Model (DSM) and other derived products. This dataset includes a 1m, 2m and 30m DEM. The 1m and 2 m products are a combination of DEM data generated from airborne LiDAR and optical digital images. The 30 m DEM integrates data from the Copernicus DEM acquired during the TanDEM-X Mission, with the DEM data derived from airborne lidar and provides a complete coverage for Canada. +
+
+Le modèle numérique d’élévation (MNE) canadien représente la couverture actuelle des données d’élévation disponibles. Ce jeu de données comprend un Modèle Numérique de Terrain (MNT), un Modèle Numérique de Surface (MNS) et d’autres produits dérivés. Ce jeu de données propose des MNE de résolution 1 m, 2 m et 30 m. Les produits 1 m et 2 m sont issus d’une combinaison de données MNE générées à partir de LiDAR aéroporté et d’images numériques optiques. Le MNE de 30 m intègre des données provenant du MNE Copernicus acquis lors de la mission TanDEM-X, ainsi que les données MNE issues du LiDAR aéroporté, ce qui permet d’assurer une couverture complète du Canada. Documentation: [Medium Resolution Digital Elevation Model - MRDEM](https://open.canada.ca/data/en/dataset/18752265-bda3-498c-a4ba-9dfe68cb98da) [High Resolution Digital Elevation Model Mosaic](https://open.canada.ca/data/en/dataset/0fe65119-e96e-4a57-8bfe-9d9245fba06b) Contact: geoinfo@nrcan-rncan.gc.ca ManagedBy: "[Natural Resources Canada](https://nrcan.gc.ca/)" UpdateFrequency: "The dataset is updated as new DEM models becomes available. - +
L'ensemble de données est mis à jour à mesure que des nouveaux modèles numérique d'élévation deviennent disponibles." Tags: - canada @@ -45,12 +48,24 @@ Resources: Explore: - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=hrdem-lidar)' - '[Browse Bucket](...)' - - Description: Notifications for Medium Resolution Digital Elevation Model (MRDEM) - ARN: ... + - Description: Notifications for Canada Digital Elevation Models. + ARN: arn:aws:sns:ca-central-1:675987781521:canelevation-dem-create-object + Region: ca-central-1 + Type: SNS Topic + - Description: Notifications for mosaic of High Resolution Digital Elevation Model (HRDEM) at 1m. + ARN: arn:aws:sns:ca-central-1:675987781521:canelevation-dem-hrdem-mosaic-1m-create-object + Region: ca-central-1 + Type: SNS Topic + - Description: Notifications for mosaic of High Resolution Digital Elevation Model (HRDEM) at 2m. + ARN: arn:aws:sns:ca-central-1:675987781521:canelevation-dem-hrdem-mosaic-2m-create-object + Region: ca-central-1 + Type: SNS Topic + - Description: Notifications for Medium Resolution Digital Elevation Model (MRDEM). + ARN: arn:aws:sns:ca-central-1:675987781521:canelevation-dem-mrdem-30-create-object Region: ca-central-1 Type: SNS Topic - - Description: Notifications for High Resolution Digital Elevation Model (HRDEM) - ARN: ... + - Description: Notifications for High Resolution Digital Elevation Model (HRDEM) by LiDAR acquisition project. 
+ ARN: arn:aws:sns:ca-central-1:675987781521:canelevation-dem-hrdem-lidar-create-object Region: ca-central-1 Type: SNS Topic DataAtWork: @@ -63,3 +78,4 @@ DataAtWork: URL: AuthorName: AuthorURL: + From 9cb8ad46b9e6b4a11ba0d9d7bb58a95fe7de4d3e Mon Sep 17 00:00:00 2001 From: Risto Vehmas Date: Wed, 6 Aug 2025 10:03:54 +0300 Subject: [PATCH 187/751] Add ICEYE SAR open dataset entry --- datasets/iceye-opendata.yaml | 40 ++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 datasets/iceye-opendata.yaml diff --git a/datasets/iceye-opendata.yaml b/datasets/iceye-opendata.yaml new file mode 100644 index 000000000..2d3de1ab7 --- /dev/null +++ b/datasets/iceye-opendata.yaml @@ -0,0 +1,40 @@ +Name: ICEYE Synthetic Aperture Radar (SAR) Open Dataset +Description: | + ICEYE operates the world’s largest constellation of synthetic aperture radar (SAR) satellites, delivering unmatched access to persistent, high-resolution Earth observation data regardless of time of day or weather conditions. The ICEYE Open Dataset makes a curated selection of SAR imagery publicly available to promote research, innovation, and education in the geospatial community. ICEYE’s constellation enables rapid revisit rates and flexible imaging modes, unlocking insights into natural disasters, climate monitoring, infrastructure, and more. + + Learn more at [www.iceye.com](https://www.iceye.com). +Documentation: Documentation is available at the [ICEYE website](https://www.iceye.com) and the [ICEYE Product Documentation](sar.iceye.com). +Contact: customer@iceye.com +ManagedBy: "[ICEYE](https://www.iceye.com)" +UpdateFrequency: New data is added frequently. 
+Collabs: + ASDI: + Tags: + - satellite imagery +Tags: + - aws-pds + - synthetic aperture radar + - stac + - earth observation + - satellite imagery + - image processing + - geospatial + - computer vision + - disaster response +License: | + The data is provided under the Creative Commons License [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/), which gives the user the right to share, copy, and redistribute the material in any medium or format, as well as adapt, remix, transform, and build upon the material for any purpose, even commercially, as long as appropriate credit is given to the original creator. +Resources: + - Description: ICEYE Open SAR Data + ARN: arn:aws:s3:::iceye-open-data-catalog + Region: us-west-2 + Type: S3 Bucket + RequesterPays: False + Explore: + - '[Browse bucket](http://iceye-open-data-catalog.s3-website-us-west-2.amazonaws.com)' + - '[STAC Browser](https://radiantearth.github.io/stac-browser/#/external/iceye-open-data-catalog.s3-us-west-2.amazonaws.com/catalog.json)' +DataAtWork: + Tutorials: + - Title: ICEYE Product Documentation + URL: sar.iceye.com + AuthorName: ICEYE + AuthorURL: https://www.iceye.com \ No newline at end of file From 49042e753ca4fc3b616b31ffaac0d314604e247e Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 7 Aug 2025 08:13:39 -0800 Subject: [PATCH 188/751] Update canelevation-dem.yaml formatting --- datasets/canelevation-dem.yaml | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/datasets/canelevation-dem.yaml b/datasets/canelevation-dem.yaml index 6e3ae32d8..96d39b5b3 100644 --- a/datasets/canelevation-dem.yaml +++ b/datasets/canelevation-dem.yaml @@ -1,14 +1,16 @@ Name: CanElevation - Canada Digital Elevation Models -Description: The Canadian DEM represents the current coverage of elevation data available. This dataset includes a Digital Terrain Model (DTM), a Digital Surface Model (DSM) and other derived products. 
This dataset includes a 1m, 2m and 30m DEM. The 1m and 2 m products are a combination of DEM data generated from airborne LiDAR and optical digital images. The 30 m DEM integrates data from the Copernicus DEM acquired during the TanDEM-X Mission, with the DEM data derived from airborne lidar and provides a complete coverage for Canada. -
-
-Le modèle numérique d’élévation (MNE) canadien représente la couverture actuelle des données d’élévation disponibles. Ce jeu de données comprend un Modèle Numérique de Terrain (MNT), un Modèle Numérique de Surface (MNS) et d’autres produits dérivés. Ce jeu de données propose des MNE de résolution 1 m, 2 m et 30 m. Les produits 1 m et 2 m sont issus d’une combinaison de données MNE générées à partir de LiDAR aéroporté et d’images numériques optiques. Le MNE de 30 m intègre des données provenant du MNE Copernicus acquis lors de la mission TanDEM-X, ainsi que les données MNE issues du LiDAR aéroporté, ce qui permet d’assurer une couverture complète du Canada. +Description: | + The Canadian DEM represents the current coverage of elevation data available. This dataset includes a Digital Terrain Model (DTM), a Digital Surface Model (DSM) and other derived products. This dataset includes a 1m, 2m and 30m DEM. The 1m and 2 m products are a combination of DEM data generated from airborne LiDAR and optical digital images. The 30 m DEM integrates data from the Copernicus DEM acquired during the TanDEM-X Mission, with the DEM data derived from airborne lidar and provides a complete coverage for Canada. +
+
+ Le modèle numérique d’élévation (MNE) canadien représente la couverture actuelle des données d’élévation disponibles. Ce jeu de données comprend un Modèle Numérique de Terrain (MNT), un Modèle Numérique de Surface (MNS) et d’autres produits dérivés. Ce jeu de données propose des MNE de résolution 1 m, 2 m et 30 m. Les produits 1 m et 2 m sont issus d’une combinaison de données MNE générées à partir de LiDAR aéroporté et d’images numériques optiques. Le MNE de 30 m intègre des données provenant du MNE Copernicus acquis lors de la mission TanDEM-X, ainsi que les données MNE issues du LiDAR aéroporté, ce qui permet d’assurer une couverture complète du Canada. Documentation: [Medium Resolution Digital Elevation Model - MRDEM](https://open.canada.ca/data/en/dataset/18752265-bda3-498c-a4ba-9dfe68cb98da) [High Resolution Digital Elevation Model Mosaic](https://open.canada.ca/data/en/dataset/0fe65119-e96e-4a57-8bfe-9d9245fba06b) Contact: geoinfo@nrcan-rncan.gc.ca ManagedBy: "[Natural Resources Canada](https://nrcan.gc.ca/)" -UpdateFrequency: "The dataset is updated as new DEM models becomes available. -
-L'ensemble de données est mis à jour à mesure que des nouveaux modèles numérique d'élévation deviennent disponibles." +UpdateFrequency: | + The dataset is updated as new DEM models becomes available. +
+ L'ensemble de données est mis à jour à mesure que des nouveaux modèles numérique d'élévation deviennent disponibles. Tags: - canada - elevation @@ -79,3 +81,4 @@ DataAtWork: AuthorName: AuthorURL: + From 5b24790035d7642bbd3e53a60ce7d0f8ba8e302f Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 7 Aug 2025 08:16:01 -0800 Subject: [PATCH 189/751] Update canelevation-dem.yaml --- datasets/canelevation-dem.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/canelevation-dem.yaml b/datasets/canelevation-dem.yaml index 96d39b5b3..834ee8cf3 100644 --- a/datasets/canelevation-dem.yaml +++ b/datasets/canelevation-dem.yaml @@ -12,6 +12,7 @@ UpdateFrequency: |
L'ensemble de données est mis à jour à mesure que des nouveaux modèles numérique d'élévation deviennent disponibles. Tags: + - aws-pds - canada - elevation - geospatial @@ -82,3 +83,4 @@ DataAtWork: AuthorURL: + From 50d9794dee990740e0da99907b0542cf65b7d9b2 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 7 Aug 2025 08:25:45 -0800 Subject: [PATCH 190/751] Update canelevation-dem.yaml --- datasets/canelevation-dem.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/canelevation-dem.yaml b/datasets/canelevation-dem.yaml index 834ee8cf3..cfe4870a2 100644 --- a/datasets/canelevation-dem.yaml +++ b/datasets/canelevation-dem.yaml @@ -1,4 +1,4 @@ -Name: CanElevation - Canada Digital Elevation Models +Name: "CanElevation - Canada Digital Elevation Models" Description: | The Canadian DEM represents the current coverage of elevation data available. This dataset includes a Digital Terrain Model (DTM), a Digital Surface Model (DSM) and other derived products. This dataset includes a 1m, 2m and 30m DEM. The 1m and 2 m products are a combination of DEM data generated from airborne LiDAR and optical digital images. The 30 m DEM integrates data from the Copernicus DEM acquired during the TanDEM-X Mission, with the DEM data derived from airborne lidar and provides a complete coverage for Canada.
@@ -84,3 +84,4 @@ DataAtWork: + From 97ba20a8a40dd7a3c2951202b78345b44d0cdd93 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 7 Aug 2025 08:30:20 -0800 Subject: [PATCH 191/751] Update canelevation-dem.yaml --- datasets/canelevation-dem.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/canelevation-dem.yaml b/datasets/canelevation-dem.yaml index cfe4870a2..456b7f073 100644 --- a/datasets/canelevation-dem.yaml +++ b/datasets/canelevation-dem.yaml @@ -4,7 +4,7 @@ Description: |

Le modèle numérique d’élévation (MNE) canadien représente la couverture actuelle des données d’élévation disponibles. Ce jeu de données comprend un Modèle Numérique de Terrain (MNT), un Modèle Numérique de Surface (MNS) et d’autres produits dérivés. Ce jeu de données propose des MNE de résolution 1 m, 2 m et 30 m. Les produits 1 m et 2 m sont issus d’une combinaison de données MNE générées à partir de LiDAR aéroporté et d’images numériques optiques. Le MNE de 30 m intègre des données provenant du MNE Copernicus acquis lors de la mission TanDEM-X, ainsi que les données MNE issues du LiDAR aéroporté, ce qui permet d’assurer une couverture complète du Canada. -Documentation: [Medium Resolution Digital Elevation Model - MRDEM](https://open.canada.ca/data/en/dataset/18752265-bda3-498c-a4ba-9dfe68cb98da) [High Resolution Digital Elevation Model Mosaic](https://open.canada.ca/data/en/dataset/0fe65119-e96e-4a57-8bfe-9d9245fba06b) +Documentation: "[Medium Resolution Digital Elevation Model - MRDEM](https://open.canada.ca/data/en/dataset/18752265-bda3-498c-a4ba-9dfe68cb98da) [High Resolution Digital Elevation Model Mosaic](https://open.canada.ca/data/en/dataset/0fe65119-e96e-4a57-8bfe-9d9245fba06b) " Contact: geoinfo@nrcan-rncan.gc.ca ManagedBy: "[Natural Resources Canada](https://nrcan.gc.ca/)" UpdateFrequency: | @@ -85,3 +85,4 @@ DataAtWork: + From 429e596d70ef58374a0427c5583d535ead373fb7 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 7 Aug 2025 09:13:52 -0800 Subject: [PATCH 192/751] Update iceye-opendata.yaml --- datasets/iceye-opendata.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/iceye-opendata.yaml b/datasets/iceye-opendata.yaml index 2d3de1ab7..6d80f297d 100644 --- a/datasets/iceye-opendata.yaml +++ b/datasets/iceye-opendata.yaml @@ -37,4 +37,5 @@ DataAtWork: - Title: ICEYE Product Documentation URL: sar.iceye.com AuthorName: ICEYE - AuthorURL: https://www.iceye.com \ No newline at end 
of file + AuthorURL: https://www.iceye.com + From fecb44e1774cf61b2a668b9cc3806d8670950e75 Mon Sep 17 00:00:00 2001 From: Risto Vehmas Date: Thu, 7 Aug 2025 20:38:25 +0300 Subject: [PATCH 193/751] adding SNS topic resource --- datasets/iceye-opendata.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/iceye-opendata.yaml b/datasets/iceye-opendata.yaml index 6d80f297d..20bcdbcee 100644 --- a/datasets/iceye-opendata.yaml +++ b/datasets/iceye-opendata.yaml @@ -32,6 +32,10 @@ Resources: Explore: - '[Browse bucket](http://iceye-open-data-catalog.s3-website-us-west-2.amazonaws.com)' - '[STAC Browser](https://radiantearth.github.io/stac-browser/#/external/iceye-open-data-catalog.s3-us-west-2.amazonaws.com/catalog.json)' + - Description: Notification for new ICEYE Open SAR Data + ARN: arn:aws:sns:us-west-2:058264311954:iceye-open-data-catalog-object_created + Region: us-west-2 + Type: SNS topic DataAtWork: Tutorials: - Title: ICEYE Product Documentation From d2c013c3fd6903e9f7020407d1795c55e1c8150f Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 7 Aug 2025 09:59:49 -0800 Subject: [PATCH 194/751] Update iceye-opendata.yaml From 16865eddcce95062ef78caba9f5f41b7e1f8cbcc Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 7 Aug 2025 10:03:19 -0800 Subject: [PATCH 195/751] Update iceye-opendata.yaml --- datasets/iceye-opendata.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/iceye-opendata.yaml b/datasets/iceye-opendata.yaml index 20bcdbcee..aee9631b9 100644 --- a/datasets/iceye-opendata.yaml +++ b/datasets/iceye-opendata.yaml @@ -35,7 +35,7 @@ Resources: - Description: Notification for new ICEYE Open SAR Data ARN: arn:aws:sns:us-west-2:058264311954:iceye-open-data-catalog-object_created Region: us-west-2 - Type: SNS topic + Type: SNS Topic DataAtWork: Tutorials: - Title: ICEYE Product Documentation From 64409175bec12e5d827f5c1fffb48d2e606e5d8c Mon 
Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 7 Aug 2025 13:33:25 -0800 Subject: [PATCH 196/751] Update sentinel-2-l2a-cogs.yaml Adding Methane notebook --- datasets/sentinel-2-l2a-cogs.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/sentinel-2-l2a-cogs.yaml b/datasets/sentinel-2-l2a-cogs.yaml index b9f117e0c..8065946a3 100644 --- a/datasets/sentinel-2-l2a-cogs.yaml +++ b/datasets/sentinel-2-l2a-cogs.yaml @@ -110,6 +110,12 @@ DataAtWork: AuthorName: Louise Liddell Services: - Amazon SageMaker Studio Lab + - Title: Monitoring of methane (CH4) emission point sources on AWS + URL: https://github.com/aws-samples/aws-opendata-samples/blob/main/notebooks/aws-methane-emissions-monitor/monitor_methane_ch4_emission_point_sources.ipynb + AuthorName: Janosch Woschitz, Karsten Schroer + Services: + - Amazon SageMaker + - Amazon S3 Publications: - Title: STAC and Sentinel-2 COGs (ESIP Summer Meeting 2020) URL: https://docs.google.com/presentation/d/14NsKFZ3UF2Swwx_9L7sPMX9ccFUK1ruQyZXWK9Cz4L4/edit?usp=sharing From 80dfefb8c3d3a4e25cc32302a21a85bd6cedb4d8 Mon Sep 17 00:00:00 2001 From: KaseyW31 Date: Fri, 8 Aug 2025 09:48:57 -0400 Subject: [PATCH 197/751] add NASA ESDIS YAMLs Co-authored-by: EthanGrants <105686621+EthanGrants@users.noreply.github.com> --- datasets/nasa-airibrad.yaml | 42 ++++ datasets/nasa-airicrad.yaml | 57 ++++++ datasets/nasa-astl1t.yaml | 47 +++++ datasets/nasa-atl03.yaml | 31 +++ datasets/nasa-atl08.yaml | 31 +++ datasets/nasa-gedi02a.yaml | 69 +++++++ datasets/nasa-gedil4aagbdensityv212056.yaml | 32 +++ datasets/nasa-gpm2adpr.yaml | 44 ++++ datasets/nasa-gpm3imergde.yaml | 111 ++++++++++ datasets/nasa-gpm3imergdf.yaml | 110 ++++++++++ datasets/nasa-gpm3imergdl.yaml | 111 ++++++++++ datasets/nasa-gpm3imerghh.yaml | 76 +++++++ datasets/nasa-gpm3imerghhe.yaml | 103 ++++++++++ datasets/nasa-gpm3imerghhl.yaml | 104 ++++++++++ datasets/nasa-gpm3imergm.yaml | 76 +++++++ 
datasets/nasa-gpmimerglandseamask.yaml | 39 ++++ datasets/nasa-gpmmergir.yaml | 81 ++++++++ datasets/nasa-hlsl30.yaml | 190 ++++++++++++++++++ datasets/nasa-hlss30.yaml | 190 ++++++++++++++++++ .../nasa-imergprecipcanadaalaska2097.yaml | 32 +++ datasets/nasa-m2i3npasm.yaml | 120 +++++++++++ datasets/nasa-m2i3nvaer.yaml | 122 +++++++++++ datasets/nasa-m2i3nvasm.yaml | 121 +++++++++++ datasets/nasa-m2t1nxslv.yaml | 125 ++++++++++++ datasets/nasa-mcd43a1.yaml | 56 ++++++ datasets/nasa-mcd43a3.yaml | 46 +++++ datasets/nasa-mcd43a4.yaml | 47 +++++ datasets/nasa-mi1b2e.yaml | 43 ++++ datasets/nasa-mod02hkm.yaml | 78 +++++++ datasets/nasa-mod09a1.yaml | 45 +++++ datasets/nasa-mod09ga.yaml | 46 +++++ datasets/nasa-mod09gq.yaml | 44 ++++ datasets/nasa-mod13q1.yaml | 51 +++++ datasets/nasa-mod16a2.yaml | 57 ++++++ datasets/nasa-modis-t-jpl-l2p-v2019-0.yaml | 40 ++++ datasets/nasa-mur-jpl-l4-glob-v41.yaml | 43 ++++ datasets/nasa-myd09ga.yaml | 53 +++++ datasets/nasa-myd09gq.yaml | 50 +++++ datasets/nasa-operal2cslc-s1-staticv1.yaml | 40 ++++ datasets/nasa-operal2cslc-s1v1.yaml | 64 ++++++ datasets/nasa-operal2rtc-s1-staticv1.yaml | 48 +++++ datasets/nasa-operal2rtc-s1v1.yaml | 73 +++++++ datasets/nasa-operal3disp-s1v1.yaml | 48 +++++ datasets/nasa-operal3dist-alert-hls_v1.yaml | 43 ++++ ...sa-operal3dist-alert-hlsprovisionalv0.yaml | 58 ++++++ datasets/nasa-operal3dist-alert-hlsv1.yaml | 43 ++++ datasets/nasa-operal3dswx-hlsv1.yaml | 72 +++++++ datasets/nasa-operal3dswx-s1v1.yaml | 48 +++++ datasets/nasa-sentinel-1adpgrdhigh.yaml | 43 ++++ datasets/nasa-sentinel-1aslc.yaml | 42 ++++ datasets/nasa-sentinel-1bdpgrdhigh.yaml | 42 ++++ datasets/nasa-sentinel-1bslc.yaml | 43 ++++ 52 files changed, 3470 insertions(+) create mode 100644 datasets/nasa-airibrad.yaml create mode 100644 datasets/nasa-airicrad.yaml create mode 100644 datasets/nasa-astl1t.yaml create mode 100644 datasets/nasa-atl03.yaml create mode 100644 datasets/nasa-atl08.yaml create mode 100644 
datasets/nasa-gedi02a.yaml create mode 100644 datasets/nasa-gedil4aagbdensityv212056.yaml create mode 100644 datasets/nasa-gpm2adpr.yaml create mode 100644 datasets/nasa-gpm3imergde.yaml create mode 100644 datasets/nasa-gpm3imergdf.yaml create mode 100644 datasets/nasa-gpm3imergdl.yaml create mode 100644 datasets/nasa-gpm3imerghh.yaml create mode 100644 datasets/nasa-gpm3imerghhe.yaml create mode 100644 datasets/nasa-gpm3imerghhl.yaml create mode 100644 datasets/nasa-gpm3imergm.yaml create mode 100644 datasets/nasa-gpmimerglandseamask.yaml create mode 100644 datasets/nasa-gpmmergir.yaml create mode 100644 datasets/nasa-hlsl30.yaml create mode 100644 datasets/nasa-hlss30.yaml create mode 100644 datasets/nasa-imergprecipcanadaalaska2097.yaml create mode 100644 datasets/nasa-m2i3npasm.yaml create mode 100644 datasets/nasa-m2i3nvaer.yaml create mode 100644 datasets/nasa-m2i3nvasm.yaml create mode 100644 datasets/nasa-m2t1nxslv.yaml create mode 100644 datasets/nasa-mcd43a1.yaml create mode 100644 datasets/nasa-mcd43a3.yaml create mode 100644 datasets/nasa-mcd43a4.yaml create mode 100644 datasets/nasa-mi1b2e.yaml create mode 100644 datasets/nasa-mod02hkm.yaml create mode 100644 datasets/nasa-mod09a1.yaml create mode 100644 datasets/nasa-mod09ga.yaml create mode 100644 datasets/nasa-mod09gq.yaml create mode 100644 datasets/nasa-mod13q1.yaml create mode 100644 datasets/nasa-mod16a2.yaml create mode 100644 datasets/nasa-modis-t-jpl-l2p-v2019-0.yaml create mode 100644 datasets/nasa-mur-jpl-l4-glob-v41.yaml create mode 100644 datasets/nasa-myd09ga.yaml create mode 100644 datasets/nasa-myd09gq.yaml create mode 100644 datasets/nasa-operal2cslc-s1-staticv1.yaml create mode 100644 datasets/nasa-operal2cslc-s1v1.yaml create mode 100644 datasets/nasa-operal2rtc-s1-staticv1.yaml create mode 100644 datasets/nasa-operal2rtc-s1v1.yaml create mode 100644 datasets/nasa-operal3disp-s1v1.yaml create mode 100644 datasets/nasa-operal3dist-alert-hls_v1.yaml create mode 100644 
datasets/nasa-operal3dist-alert-hlsprovisionalv0.yaml create mode 100644 datasets/nasa-operal3dist-alert-hlsv1.yaml create mode 100644 datasets/nasa-operal3dswx-hlsv1.yaml create mode 100644 datasets/nasa-operal3dswx-s1v1.yaml create mode 100644 datasets/nasa-sentinel-1adpgrdhigh.yaml create mode 100644 datasets/nasa-sentinel-1aslc.yaml create mode 100644 datasets/nasa-sentinel-1bdpgrdhigh.yaml create mode 100644 datasets/nasa-sentinel-1bslc.yaml diff --git a/datasets/nasa-airibrad.yaml b/datasets/nasa-airibrad.yaml new file mode 100644 index 000000000..5568ec4a5 --- /dev/null +++ b/datasets/nasa-airibrad.yaml @@ -0,0 +1,42 @@ +Name: AIRS/Aqua L1B Infrared (IR) geolocated and calibrated radiances V005 (AIRIBRAD) + at GES DISC +Description: |- + WARNING: On 2021/09/23 the EOS Aqua executed a Deep Space Maneuver (DSM). In the DSM, the spacecraft is turned such that the normal Earth field of regard is deep space. + + The thermal impact of the DSM caused a shift of the centroids of spectral response functions (SRF) of about 1% of the width of the SRF, equivalent to a frequency shift of 9 parts per million. This shift is reflected in the “spectral_freq” parameter (observed frequencies) in the L1b v5 files for each 6 minute granule. The magnitude of the effect on brightness temperatures (BT) depends on the spectral gradient of each channel. Maximum BT shifts are approximately +- 0.5 K, although many channels experience far smaller BT shifts. Approximately 1803 channels have BT shifts of less than 0.1 K and 575 channels are now shifted in BT by more than 0.1 K, while 231 of these channels have BT shifts greater than 0.2 K. + + Users of the L1b v5 product who are concerned that these shifts may impact their science investigations and applications are encouraged to switch to the AIRS L1c v6.7.4 product, which, among many other improvements, converts the spectra to a fixed frequency grid. END OF WARNING. 
+ + The Atmospheric Infrared Sounder (AIRS) is a grating spectrometer (R = 1200) aboard the second Earth Observing System (EOS) polar-orbiting platform, EOS Aqua. In combination with the Advanced Microwave Sounding Unit (AMSU) and the Humidity Sounder for Brazil (HSB), AIRS constitutes an innovative atmospheric sounding group of visible, infrared, and microwave sensors. The AIRS Infrared (IR) level 1B data set contains AIRS calibrated and geolocated radiances in milliWatts/m^2/cm^-1/steradian for 2378 infrared channels in the 3.74 to 15.4 micron region of the spectrum. The AIRS instrument is co-aligned with AMSU-A so that successive blocks of 3 x 3 AIRS footprints are contained within one AMSU-A footprint. The AIRIBRAD_005 products are stored in files (often referred to as "granules") that contain 6 minutes of data, 90 footprints across track by 135 lines along track. + Read our doc on how to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/YZEXEVN4JGGJ +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov' +ManagedBy: NASA +UpdateFrequency: From 2002-08-30 to Ongoing +Tags: + - aws-pds + - atmosphere + - datacenter + - earth observation + - global + - hdf + - ice + - land + - metadata + - opendap + - orbit +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'AIRS/Aqua L1B Infrared (IR) geolocated and calibrated radiances + V005 (AIRIBRAD) at GES DISC.' 
+ ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/Aqua_AIRS_Level1/AIRIBRAD.005/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. diff --git a/datasets/nasa-airicrad.yaml b/datasets/nasa-airicrad.yaml new file mode 100644 index 000000000..b5ebf35e2 --- /dev/null +++ b/datasets/nasa-airicrad.yaml @@ -0,0 +1,57 @@ +Name: AIRS/Aqua L1C Infrared (IR) resampled and corrected radiances V6.7 (AIRICRAD) + at GES DISC +Description: |- + The Atmospheric Infrared Sounder (AIRS) is a grating spectrometer (R = 1200) aboard the second Earth Observing System (EOS) polar-orbiting platform, EOS Aqua. In combination with the Advanced Microwave Sounding Unit (AMSU) and the Humidity Sounder for Brazil (HSB), AIRS constitutes an innovative atmospheric sounding group of visible, infrared, and microwave sensors. The AIRS Infrared (IR) level 1C data set contains AIRS infrared calibrated and geolocated radiances in W/m2/micron/ster. This data set is generated from AIRS level 1B data. The spectral coverage of L1C data is from 3.74 to 15.4 microns. The nominal spectral resolution lambda / delta lambda = 1200. The spectrum is sampled twice per spectral resolution element in a total of 2645 spectral channels. A day of AIRS data is divided into 240 granules (scenes) each of 6-minute duration. 
For the AIRS IR measurements, an individual granule contains 135 pixels across-track and 90 along-track pixels; there are total of 135 x 90 = 12,150 pixels per granule. AIRS employs a 49.5 degree crosstrack scanning with a 1.1 degree instantaneous field of view (IFOV) to provide twice daily coverage of essentially the entire globe in a 1:30 PM sun synchronous orbit with the 13.5 x 13.5 km2 spatial resolution at nadir. The L1C swath products are derived from the L1B swath products. The primary purpose of the level 1C is to generate the spectra of radiances without spectral gaps caused by the instrument design and bad spectral points. The AIRS L1C data can be used for comparative (with other IR measurements) studies and for weather-climate research. + + This is the latest version of this collection. The DOIs assigned to previous versions, which are no longer available, now direct + to this page. For this collection the switchover occurred on June 1, 2020. + Read our doc on how to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/VWD3DRC07UEN +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov. Home Page: https://disc.gsfc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 2002-08-30 to Ongoing +Tags: + - aws-pds + - atmosphere + - climate + - datacenter + - earth observation + - global + - metadata + - opendap + - orbit + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'AIRS/Aqua L1C Infrared (IR) resampled and corrected radiances V6.7 + (AIRICRAD) at GES DISC.' 
+ ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/Aqua_AIRS_Level1/AIRICRAD.6.7/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: AIRS version 6.6 and version 7 level-1C products + URL: https://doi.org/10.1117/12.2529400 + AuthorName: Evan M. Manning, L. Larrabee Strow, and Hartmut H. Aumann + - Title: AIRS Level-1C and applications to cross-calibration with MODIS and CrIS + URL: https://doi.org/10.1117/12.2061967 + AuthorName: Evan M. Manning, Hartmut H. Aumann, and Ali Behrangi + - Title: Validation of the Atmospheric Infrared Sounder radiative transfer algorithm + URL: https://doi.org/10.1029/2005JD006146 + AuthorName: Strow, L.L., Hannon, S.E., De Souza-Machado, S., Motteler, H.E., and + Tobin, D.C. + - Title: Radiometric Stability Validation of 17 Years of AIRS Data Using Sea Surface + Temperatures. + URL: https://doi.org/10.1029/2019GL085098 + AuthorName: Aumann, H.H., Broberg, S., Manning, E., and Pagano, T. + - Title: Updates to the absolute radiometric accuracy of the AIRS on Aqua + URL: https://doi.org/10.1117/12.2324605 + AuthorName: Pagano, T.S., Aumann, H.H., Broberg, S., Manning, E., Overoye, K., + and Weiler, M. + Tutorials: + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha.
diff --git a/datasets/nasa-astl1t.yaml b/datasets/nasa-astl1t.yaml new file mode 100644 index 000000000..482808fe0 --- /dev/null +++ b/datasets/nasa-astl1t.yaml @@ -0,0 +1,47 @@ +Name: ASTER Level 1T Precision Terrain Corrected Registered At-Sensor Radiance V004 +Description: |- + The Terra Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Level 1 Precision Terrain Corrected Registered At-Sensor Radiance (AST_L1T) data contains calibrated at-sensor radiance, which corresponds with the ASTER Level 1B ([AST_L1B](https://doi.org/10.5067/ASTER/AST_L1B.004)) that has been geometrically corrected and rotated to a north-up UTM projection. The AST_L1T is created from a single resampling of the corresponding ASTER L1A ([AST_L1A](https://doi.org/10.5067/ASTER/AST_L1A.004)) product. The bands available in the AST_L1T depend on the bands in the AST_L1A and can include up to three Visible and Near Infrared (VNIR) bands, six Shortwave Infrared (SWIR) bands, and five Thermal Infrared (TIR) bands. The AST_L1T dataset does not include the aft-looking VNIR band 3. The AST_L1T product has a spatial resolution of 15 meters (m) for the VNIR bands, 30 m for the SWIR bands, and 90 m for the TIR bands. + + The precision terrain correction process incorporates GLS2000 digital elevation data with derived ground control points (GCPs) to achieve topographic accuracy for all daytime scenes where correlation statistics reach a minimum threshold. Alternate levels of correction are possible (systematic terrain, systematic, or precision) for scenes acquired at night or that otherwise represent a reduced quality ground image (e.g., cloud cover). + + For daytime images, if the VNIR or SWIR telescope collected data and precision correction was attempted, each precision terrain corrected image will have an accompanying independent quality assessment. 
It will include the geometric correction available for distribution as both a text file and single band browse images with the valid GCPs overlaid. + + This multi-file product also includes georeferenced full resolution browse images. The number of browse images and the band combinations of the images depends on the bands available in the corresponding [AST_L1A](https://doi.org/10.5067/ASTER/AST_L1A.004) dataset. + + Known Issues + + * Since October 1, 2017, a correction addresses zero-filled scans in low-latitude, ascending orbit (nighttime) TIR data. Additional details are available in the ASTER L1T User Advisory. + * Data from the SWIR bands collected after April 2008 may show anomalous saturation and striping. See the ASTER SWIR User Advisory for further information. + + Improvements/Changes from Previous Versions + + * Enhanced Geolocation Accuracy: Version 4 uses Collection 2 Ground Control Points (GCPs) compared against Global Land Survey (GLS) 2000 standards to improve positional accuracy. + * Radiometric Calibration Update: Version 4 applies Radiometric Calibration Coefficient Version 5 (RCC V5) to improve the radiometric accuracy of the raw DNs, based on research by [Tsuchida and others (2020)](https://doi.org/10.3390/rs12030427), published in Remote Sensing. + Read our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/ASTER/AST_L1T.004 +Contact: 'User Services: lpdaac@usgs.gov. Home Page: https://www.earthdata.nasa.gov/centers/lp-daac/contact' +ManagedBy: NASA +UpdateFrequency: From 2000-03-04 to Ongoing (Varies) +Tags: + - aws-pds + - cog + - earth observation + - global + - land + - orbit + - cog +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'ASTER Level 1T Precision Terrain Corrected Registered At-Sensor + Radiance V004.' 
+ ARN: arn:aws:s3:::lp-prod-protected/AST_L1T.004 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/7f58ad3abeca7f0d17637ddb812642c0120a57ab/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC diff --git a/datasets/nasa-atl03.yaml b/datasets/nasa-atl03.yaml new file mode 100644 index 000000000..95cc3aa71 --- /dev/null +++ b/datasets/nasa-atl03.yaml @@ -0,0 +1,31 @@ +Name: ATLAS/ICESat-2 L2A Global Geolocated Photon Data V006 +Description: |- + This data set (ATL03) contains height above the WGS 84 ellipsoid (ITRF2014 reference frame), latitude, longitude, and time for all photons downlinked by the Advanced Topographic Laser Altimeter System (ATLAS) instrument on board the Ice, Cloud and land Elevation Satellite-2 (ICESat-2) observatory. The ATL03 product was designed to be a single source for all photon data and ancillary information needed by higher-level ATLAS/ICESat-2 products. As such, it also includes spacecraft and instrument parameters and ancillary data not explicitly required for ATL03. + Read our doc on how to get AWS Credentials to retrieve this data: https://data.nsidc.earthdatacloud.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/ATLAS/ATL03.006 +Contact: 'Email: nsidc@nsidc.org. Home Page: https://nsidc.org/daac' +ManagedBy: NASA +UpdateFrequency: From 2018-10-13 to Ongoing +Tags: + - aws-pds + - atmosphere + - datacenter + - earth observation + - global + - hdf + - ice + - land + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'ATLAS/ICESat-2 L2A Global Geolocated Photon Data V006.' 
+ ARN: arn:aws:s3:::nsidc-cumulus-prod-protected/ATLAS/ATL03/006 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.nsidc.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Accessing and working with ICESat-2 data in the cloud + URL: https://github.com/nsidc/NSIDC-Data-Tutorials/blob/main/notebooks/ICESat-2_Cloud_Access/ATL06-direct-access.ipynb + AuthorName: Andy Barrett, Jennifer Roebuck, Amy Steiker diff --git a/datasets/nasa-atl08.yaml b/datasets/nasa-atl08.yaml new file mode 100644 index 000000000..8cdcbb0a5 --- /dev/null +++ b/datasets/nasa-atl08.yaml @@ -0,0 +1,31 @@ +Name: ATLAS/ICESat-2 L3A Land and Vegetation Height V006 +Description: |- + This data set (ATL08) contains along-track heights above the WGS84 ellipsoid (ITRF2014 reference frame) for the ground and canopy surfaces. The canopy and ground surfaces are processed in fixed 100 m data segments, which typically contain more than 100 signal photons. The data were acquired by the Advanced Topographic Laser Altimeter System (ATLAS) instrument on board the Ice, Cloud and land Elevation Satellite-2 (ICESat-2) observatory. + Read our doc on how to get AWS Credentials to retrieve this data: https://data.nsidc.earthdatacloud.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/ATLAS/ATL08.006 +Contact: 'NASA NSIDC DAAC: nsidc@nsidc.org. Home Page: https://nsidc.org/daac' +ManagedBy: NASA +UpdateFrequency: From 2018-10-14 to Ongoing +Tags: + - aws-pds + - atmosphere + - datacenter + - earth observation + - global + - ice + - land + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'ATLAS/ICESat-2 L3A Land and Vegetation Height V006.' 
+ ARN: arn:aws:s3:::nsidc-cumulus-prod-protected/ATLAS/ATL08/006 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.nsidc.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: Accessing and working with ICESat-2 data in the cloud + URL: https://github.com/nsidc/NSIDC-Data-Tutorials/blob/main/notebooks/ICESat-2_Cloud_Access/ATL06-direct-access_rendered.ipynb + AuthorName: Andy Barrett, Jennifer Roebuck, and Amy Steiker. \ No newline at end of file diff --git a/datasets/nasa-gedi02a.yaml b/datasets/nasa-gedi02a.yaml new file mode 100644 index 000000000..3be6d5e48 --- /dev/null +++ b/datasets/nasa-gedi02a.yaml @@ -0,0 +1,69 @@ +Name: GEDI L2A Elevation and Height Metrics Data Global Footprint Level V002 +Description: |- + The Global Ecosystem Dynamics Investigation ([GEDI](https://gedi.umd.edu/)) mission aims to characterize ecosystem structure and dynamics to enable radically improved quantification and understanding of the Earth’s carbon cycle and biodiversity. The GEDI instrument produces high resolution laser ranging observations of the 3-dimensional structure of the Earth. GEDI is attached to the International Space Station (ISS) and collects data globally between 51.6° N and 51.6° S latitudes at the highest resolution and densest sampling of any light detection and ranging (lidar) instrument in orbit to date. Each GEDI Version 2 granule encompasses one-fourth of an ISS orbit and includes georeferenced metadata to allow for spatial querying and subsetting. + + The GEDI instrument was removed from the ISS and placed into storage on March 17, 2023. No data were acquired during the hibernation period from March 17, 2023, to April 24, 2024. GEDI has since been reinstalled on the ISS and resumed operations as of April 26, 2024. 
+ + The purpose of the GEDI Level 2A Geolocated Elevation and Height Metrics product (GEDI02_A) is to provide waveform interpretation and extracted products from each GEDI01_B received waveform, including ground elevation, canopy top height, and relative height (RH) metrics. The methodology for generating the GEDI02_A product datasets is adapted from the Land, Vegetation, and Ice Sensor (LVIS) algorithm. The GEDI02_A product is provided in HDF5 format and has a spatial resolution (average footprint) of 25 meters. + + The GEDI02_A data product contains 156 layers for each of the eight beams, including ground elevation, canopy top height, relative return energy metrics (e.g., canopy vertical structure), and many other interpreted products from the return waveforms. Additional information for the layers can be found in the GEDI Level 2A Dictionary. + + Known Issues + + * Data acquisition gaps: GEDI data acquisitions were suspended on December 19, 2019 (2019 Day 353) and resumed on January 8, 2020 (2020 Day 8). + * Incorrect Reference Ground Track (RGT) number in the filename for select GEDI files: GEDI Science Data Products for six orbits on August 7, 2020, and November 12, 2021, had the incorrect RGT number in the filename. There is no impact to the science data, but users should reference this [document](https://lpdaac.usgs.gov/documents/2236/GEDI_CORRECTED_RGT_FILENAMES.pptx) for the correct RGT numbers. + * Known Issues: Section 8 of the User Guide provides additional information on known issues. + + Improvements/Changes from Previous Versions + + * Metadata has been updated to include spatial coordinates. + * Granule size has been reduced from one full ISS orbit (~5.83 GB) to four segments per orbit (~1.48 GB). + * Filename has been updated to include segment number and version number. + * Improved geolocation for an orbital segment. + * Added elevation from the SRTM digital elevation model for comparison. 
+ * Modified the method to predict an optimum algorithm setting group per laser shot. + * Added additional land cover datasets related to phenology, urban infrastructure, and water persistence. + * Added selected_mode_flag dataset to root beam group using selected algorithm. + * Removed shots when the laser is not firing. + * Modified file name to include segment number and dataset version. + Read our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/GEDI/GEDI02_A.002 +Contact: 'User Services: lpdaac@usgs.gov' +ManagedBy: NASA +UpdateFrequency: From 2019-04-04 to 2023-03-16 (Varies) +Tags: + - aws-pds + - biodiversity + - carbon + - datacenter + - earth observation + - energy + - global + - hdf + - ice + - land + - land cover + - lidar + - metadata + - orbit + - urban + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GEDI L2A Elevation and Height Metrics Data Global Footprint Level + V002.' 
+ ARN: arn:aws:s3:::lp-prod-protected/GEDI02_A.002 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: 'How to: Find and Access GEDI Data' + URL: https://github.com/nasa/GEDI-Data-Resources/blob/main/python/tutorials/how-to-find-and-access-GEDI-data_earthaccess.ipynb + AuthorName: Land Processes Distributed Active Archive Center (LP DAAC) + AuthorURL: https://lpdaac.usgs.gov/ + - Title: Getting Started with GEDI L2A Version 2 Data in Python + URL: https://github.com/nasa/GEDI-Data-Resources/blob/main/python/tutorials/GEDI_L2A_V2_Tutorial.ipynb + AuthorName: Land Processes Distributed Active Archive Center (LP DAAC) + AuthorURL: https://lpdaac.usgs.gov/ diff --git a/datasets/nasa-gedil4aagbdensityv212056.yaml b/datasets/nasa-gedil4aagbdensityv212056.yaml new file mode 100644 index 000000000..e63541c8c --- /dev/null +++ b/datasets/nasa-gedil4aagbdensityv212056.yaml @@ -0,0 +1,32 @@ +Name: GEDI L4A Footprint Level Aboveground Biomass Density, Version 2.1 +Description: |- + This dataset contains Global Ecosystem Dynamics Investigation (GEDI) Level 4A (L4A) Version 2 predictions of the aboveground biomass density (AGBD; in Mg/ha) and estimates of the prediction standard error within each sampled geolocated laser footprint. In this version, the granules are in sub-orbits. The algorithm setting group selection used for GEDI02_A Version 2 has been modified for Evergreen Broadleaf Trees in South America to reduce false positive errors resulting from the selection of waveform modes above ground elevation as the lowest mode. The footprints are located within the global latitude band observed by the International Space Station (ISS), nominally 51.6 degrees N and S and reported for the period 2019-04-18 to 2024-11-27. No acquisitions occurred while the GEDI instrument was in storage on the International Space Station (ISS) from March 2023 to April 2024. 
The GEDI instrument consists of three lasers producing a total of eight beam ground transects, which instantaneously sample eight ~25 m footprints spaced approximately every 60 m along-track. The GEDI beam transects are spaced approximately 600 m apart on the Earth's surface in the cross-track direction, for an across-track width of ~4.2 km. Footprint AGBD was derived from parametric models that relate simulated GEDI Level 2A (L2A) waveform relative height (RH) metrics to field plot estimates of AGBD. Height metrics from simulated waveforms associated with field estimates of AGBD from multiple regions and plant functional types (PFTs) were compiled to generate a calibration dataset for models representing the combinations of world regions and PFTs (i.e., deciduous broadleaf trees, evergreen broadleaf trees, evergreen needleleaf trees, deciduous needleleaf trees, and the combination of grasslands, shrubs, and woodlands). For each of the eight beams, additional data are reported with the AGBD estimates, including the associated uncertainty metrics, quality flags, model inputs, and other information about the GEDI L2A waveform for this selected algorithm setting group. Model inputs include the scaled and transformed GEDI L2A RH metrics, footprint geolocation variables and land cover input data including PFTs and the world region identifiers. Additional model outputs include the AGBD predictions for each of the six GEDI L2A algorithm setting groups with AGBD in natural and transformed units and associated prediction uncertainty for each GEDI L2A algorithm setting group. Providing these ancillary data products will allow users to evaluate and select alternative algorithm setting groups. Also provided are outputs of parameters and variables from the L4A models used to generate AGBD predictions that are required as input to the GEDI04_B algorithm to generate 1-km gridded products. 
+ Read our doc on how to get AWS Credentials to retrieve this data: https://data.ornldaac.earthdata.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.3334/ORNLDAAC/2056 +Contact: 'ORNL DAAC User Services Office: uso@daac.ornl.gov.' +ManagedBy: NASA +UpdateFrequency: From 2019-04-17 to 2024-11-27 +Tags: + - aws-pds + - earth observation + - ecosystems + - global + - land + - land cover + - lidar + - opendap + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GEDI L4A Footprint Level Aboveground Biomass Density, Version 2.1.' + ARN: arn:aws:s3:::ornl-cumulus-prod-protected/gedi/GEDI_L4A_AGB_Density_V2_1/data + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.ornldaac.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: Searching and Downloading GEDI L4A Dataset + URL: https://github.com/ornldaac/gedi_tutorials/blob/main/notebooks/gedi_l4a_search_download.ipynb + AuthorName: Rupesh Shrestha diff --git a/datasets/nasa-gpm2adpr.yaml b/datasets/nasa-gpm2adpr.yaml new file mode 100644 index 000000000..6bd1cf114 --- /dev/null +++ b/datasets/nasa-gpm2adpr.yaml @@ -0,0 +1,44 @@ +Name: GPM DPR Precipitation Profile L2A 1.5 hours 5 km V07 (GPM_2ADPR) at GES DISC +Description: |- + Version 07 is the current version of the data set. Older versions will no longer be available and have been superseded by Version 07. + . + + 2ADPR provides single- and dual-frequency-derived precipitation estimates from the Ku and Ka radars of the Dual-Frequency Precipitation Radar (DPR) on the core GPM spacecraft. The output consists of three main classes of precipitation products: those derived from the Ku-band frequency over a wide swath (245 km), those derived from the Ka-band frequency over a narrow swath (125 km), and those derived from the dual-frequency data over the narrow swath. 
The Ka-band results are further divided into the standard and high-sensitivity estimates. In the standard sensitivity mode, the fields of view within the inner swath are matched to those of the Ku-band. Data from these matched-beam Ku- and Ka-band fields of view are used to derive the dual-frequency precipitation products. The retrievals are performed at each radar range bin along the slant path of the radar instrument field of view (IFOV). + Read our doc on how to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/GPM/DPR/GPM/2A/07 +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov. Home Page: https://disc.gsfc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 2014-03-08 to Ongoing +Tags: + - aws-pds + - atmosphere + - contamination + - datacenter + - earth observation + - global + - metadata + - opendap + - radar + - water + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GPM DPR Precipitation Profile L2A 1.5 hours 5 km V07 (GPM_2ADPR) + at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/GPM_L2/GPM_2ADPR.07/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: How to Read IMERG Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Read_IMERG_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha.
+ - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. + \ No newline at end of file diff --git a/datasets/nasa-gpm3imergde.yaml b/datasets/nasa-gpm3imergde.yaml new file mode 100644 index 000000000..feb4d53ca --- /dev/null +++ b/datasets/nasa-gpm3imergde.yaml @@ -0,0 +1,111 @@ +Name: GPM IMERG Early Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDE) + at GES DISC +Description: "Version 07 is the current version of the data set. Older versions will + no longer be available and have been superseded by Version 07.\n\nThe Integrated + Multi-satellitE Retrievals for GPM (IMERG) is a NASA product estimating global + surface precipitation rates at a high resolution of 0.1° every half-hour beginning + 2000. It is part of the joint NASA-JAXA Global Precipitation Measurement (GPM) + mission, using the GPM Core Observatory satellite as the standard to combine precipitation + observations from an international constellation of satellites using advanced techniques. + \ IMERG can be used for global-scale applications as well as over regions with sparse + or no reliable surface observations. The fine spatial and temporal resolution of + IMERG data allows them to be accumulated to the scale of the application for increased + skill. IMERG has three Runs with varying latencies in response to a range of application + needs: rapid-response applications (Early Run, 4-h latency), same/next-day applications + (Late Run, 14-h latency), and post-real-time research (Final Run, 3.5-month latency).
+ \ While IMERG strives for consistency and accuracy, satellite estimates of precipitation + are expected to have lower skill over frozen surfaces, complex terrain, and coastal + zones. As well, the changing GPM satellite constellation over time may introduce + artifacts that affect studies focusing on multi-year changes.\n\nThis dataset is + the GPM Level 3 IMERG *Early* Daily 10 x 10 km (GPM_3IMERGDE) derived from the + half-hourly GPM_3IMERGHHE. The derived result represents an early (expedited) estimate + of the daily mean precipitation rate in mm/day. The dataset is produced by first + computing the mean precipitation rate in (mm/hour) in every grid cell, and then + multiplying the result by 24. This minimizes the possible dry bias in versions + before \"07\", in the simple daily totals for cells where less than 48 half-hourly + observations are valid for the day. The latter under-sampling is very rare in the + combined microwave-infrared (and rain gauge in the final) dataset, variable \"precipitation\", + \ and appears in higher latitudes. Thus, in most cases users of global \"precipitation\" + data will not notice any difference. This correction, however, is noticeable in + the high-quality microwave retrieval, variable \"MWprecipitation\", where the occurrence + of less than 48 valid half-hourly samples per day is very common. The counts of + the valid half-hourly samples per day have always been provided as a separate variable, + and users of daily data were advised to pay close attention to that variable and + use it to calculate the correct precipitation daily rates. Starting with version + \"07\", this is done in production to minimize possible misinterpretations of the + data. The counts are still provided in the data, but they are only given to gauge + the significance of the daily rates, and reconstruct the simple totals if someone + wishes to do so. 
\n\nThe latency of the derived Early daily product is a couple + of minutes after the last granule of GPM_3IMERGHHE for the UTC data day is received + at GES DISC. Since the target latency of GPM_3IMERGHHE is 4 hours, the daily should + appear about 4 hours after the closure of the UTC day. For information on the original + data (GPM_3IMERGHHE), please see the Documentation (Related URL). \n\nThe daily + mean rate (mm/day) is derived by first computing the mean precipitation rate (mm/hour) + in a grid cell for the data day, and then multiplying the result by 24. Thus, for + every grid cell we have \n\nPdaily_mean = SUM{Pi * 1[Pi valid]} + / Pdaily_cnt * 24, i=[1,Nf]\n\nWhere:\nPdaily_cnt = SUM{1[Pi valid]}\n\nPi - + half-hourly input, in (mm/hr)\nNf - Number of half-hourly files per + day, Nf=48\n1[.] - Indicator function; 1 when Pi is valid, 0 otherwise\nPdaily_cnt + \ - Number of valid retrievals in a grid cell per day.\n\nGrid cells for which + Pdaily_cnt=0, are set to fill value in the Daily files.\nNote that Pi=0 is a valid + value.\n\nPdaily_cnt are provided in the data files as variables \"precipitation_cnt\" + and \"MWprecipitation_cnt\", for correspondingly the microwave-IR-gauge and microwave-only + retrievals. They are only given to gauge the significance of the daily rates, and + reconstruct the simple totals if someone wishes to do so. \n\nThere are various + ways the daily error could be estimated from the source half-hourly random error + (variable \"randomError\"). The daily error provided in the data files is calculated + in a fashion similar to the daily mean precipitation rate. First, the mean of the + squared half-hourly \"randomError\" for the day is computed, and the resulting + (mm^2/hr) is converted to (mm^2/day). 
Finally, the square root is taken to get the result + in (mm/day):\n\nPerr_daily = { SUM{ (Perr_i)^2 * 1[Perr_i valid] } / Ncnt_err + \ * 24}^0.5, i=[1,Nf]\nNcnt_err = SUM{ 1[Perr_i valid] }\n\nwhere:\nPerr_i\t\t- + half-hourly input, \"randomError\", (mm/hr)\nPerr_daily\t- Magnitude of the daily + error, (mm/day)\nNcnt_err\t\t- Number of valid half-hour error estimates\n\nAgain, + the sum of squared \"randomError\" can be reconstructed, and other estimates can + be derived using the available counts in the Daily files.\n\n\nRead our doc on how + to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/GPM/IMERGDE/DAY/07 +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov. Home Page: https://disc.gsfc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 1998-01-01 to Ongoing +Tags: + - aws-pds + - atmosphere + - climate + - coastal + - datacenter + - global + - hydrology + - land + - metadata + - opendap + - netcdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GPM IMERG Early Precipitation L3 1 day 0.1 degree x 0.1 degree V07 + (GPM_3IMERGDE) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/GPM_L3/GPM_3IMERGDE.07/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial + Neural Network Cloud Classification System + AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao + - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, + Combined-Sensor Precipitation Estimates at Fine Scale.' + AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, + K. P. Bowman, Y. Hong, E. F. Stocker, and D. B.
Wolff + - Title: Kalman Filter Based CMORPH + URL: https://doi.org/10.1175/JHM-D-11-022.1 + AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak + - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface + Using In-Situ Gauge Observations + AuthorName: Rudolf, B., and U. Schneider + Tutorials: + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. diff --git a/datasets/nasa-gpm3imergdf.yaml b/datasets/nasa-gpm3imergdf.yaml new file mode 100644 index 000000000..9d60d0f30 --- /dev/null +++ b/datasets/nasa-gpm3imergdf.yaml @@ -0,0 +1,110 @@ +Name: GPM IMERG Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDF) + at GES DISC +Description: "Version 07 is the current version of the data set. Older versions will + no longer be available and have been superseded by Version 07.\n\nThe Integrated + Multi-satellitE Retrievals for GPM (IMERG) is a NASA product estimating global + surface precipitation rates at a high resolution of 0.1° every half-hour beginning + 2000. It is part of the joint NASA-JAXA Global Precipitation Measurement (GPM) + mission, using the GPM Core Observatory satellite as the standard to combine precipitation + observations from an international constellation of satellites using advanced techniques. + \ IMERG can be used for global-scale applications as well as over regions with sparse + or no reliable surface observations.
The fine spatial and temporal resolution of + IMERG data allows them to be accumulated to the scale of the application for increased + skill. IMERG has three Runs with varying latencies in response to a range of application + needs: rapid-response applications (Early Run, 4-h latency), same/next-day applications + (Late Run, 14-h latency), and post-real-time research (Final Run, 3.5-month latency). + \ While IMERG strives for consistency and accuracy, satellite estimates of precipitation + are expected to have lower skill over frozen surfaces, complex terrain, and coastal + zones. As well, the changing GPM satellite constellation over time may introduce + artifacts that affect studies focusing on multi-year changes.\n\nThis dataset is + the GPM Level 3 IMERG *Final* Daily 10 x 10 km (GPM_3IMERGDF) derived from the + half-hourly GPM_3IMERGHH. The derived result represents the Final estimate of the + daily mean precipitation rate in mm/day. The dataset is produced by first computing + the mean precipitation rate in (mm/hour) in every grid cell, and then multiplying + the result by 24. This minimizes the possible dry bias in versions before \"07\", + in the simple daily totals for cells where less than 48 half-hourly observations + are valid for the day. The latter under-sampling is very rare in the combined microwave-infrared + and rain gauge dataset, variable \"precipitation\", and appears in higher latitudes. + Thus, in most cases users of global \"precipitation\" data will not notice any difference. + This correction, however, is noticeable in the high-quality microwave retrieval, + variable \"MWprecipitation\", where the occurrence of less than 48 valid half-hourly + samples per day is very common. The counts of the valid half-hourly samples per + day have always been provided as a separate variable, and users of daily data were + advised to pay close attention to that variable and use it to calculate the correct + precipitation daily rates. 
Starting with version \"07\", this is done in production + to minimize possible misinterpretations of the data. The counts are still provided + in the data, but they are only given to gauge the significance of the daily rates, + and reconstruct the simple totals if someone wishes to do so. \n\nThe latency of + the derived *Final* Daily product depends on the delivery of the IMERG *Final* Half-Hourly + product GPM_3IMERGHH. Since the latter are delivered in a batch, once per month for + the entire month, with up to 4 months latency, so will be the latency for the Final + Daily, plus about 24 hours. Thus, e.g. the Dailies for January can be expected + to appear no earlier than April 2. \n\nThe daily mean rate (mm/day) is derived by + first computing the mean precipitation rate (mm/hour) in a grid cell for the data + day, and then multiplying the result by 24. Thus, for every grid cell we have \n\nPdaily_mean + \ = SUM{Pi * 1[Pi valid]} / Pdaily_cnt * 24, i=[1,Nf]\n\nWhere:\nPdaily_cnt + = SUM{1[Pi valid]}\n\nPi - half-hourly input, in (mm/hr)\nNf - + Number of half-hourly files per day, Nf=48\n1[.] - Indicator function; + 1 when Pi is valid, 0 otherwise\nPdaily_cnt - Number of valid retrievals in + a grid cell per day.\n\nGrid cells for which Pdaily_cnt=0, are set to fill value + in the Daily files.\nNote that Pi=0 is a valid value.\n\nPdaily_cnt are provided + in the data files as variables \"precipitation_cnt\" and \"MWprecipitation_cnt\", + for correspondingly the microwave-IR-gauge and microwave-only retrievals. They are + only given to gauge the significance of the daily rates, and reconstruct the simple + totals if someone wishes to do so. \n\nThere are various ways the daily error could + be estimated from the source half-hourly random error (variable \"randomError\"). + The daily error provided in the data files is calculated in a fashion similar to + the daily mean precipitation rate.
First, the mean of the squared half-hourly \"randomError\" + \ for the day is computed, and the resulting (mm^2/hr) is converted to (mm^2/day). + Finally, square root is taken to get the result in (mm/day):\n\nPerr_daily = { SUM{ + (Perr_i)^2 * 1[Perr_i valid] } / Ncnt_err * 24}^0.5, i=[1,Nf]\nNcnt_err = SUM( + 1[Perr_i valid] )\n\nwhere:\nPerr_i\t\t- half-hourly input, \"randomError\", (mm/hr)\nPerr_daily\t- + Magnitude of the daily error, (mm/day)\nNcnt_err\t\t- Number of valid half-hour + error estimates\n\nAgain, the sum of squared \"randomError\" can be reconstructed, + and other estimates can be derived using the available counts in the Daily files.\n\nRead + our doc on how to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/GPM/IMERGDF/DAY/07 +Contact: 'Email: gsfc-dl-help-disc@mail.nasa.gov. Home Page: https://disc.gsfc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 1998-01-01 to Ongoing +Tags: + - aws-pds + - climate + - coastal + - datacenter + - global + - hydrology + - ice + - land + - metadata + - netcdf + - opendap +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GPM IMERG Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07 + (GPM_3IMERGDF) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/GPM_L3/GPM_3IMERGDF.07/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial + Neural Network Cloud Classification System + AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao + - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, + Combined-Sensor Precipitation Estimates at Fine Scale.' + AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E.
J. Nelkin, + K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff + - Title: Kalman Filter Based CMORPH + URL: https://doi.org/10.1175/JHM-D-11-022.1 + AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak + - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface + Using In-Situ Gauge Observations + AuthorName: Rudolf, B., and U. Schneider + Tutorials: + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. diff --git a/datasets/nasa-gpm3imergdl.yaml b/datasets/nasa-gpm3imergdl.yaml new file mode 100644 index 000000000..abd004a2a --- /dev/null +++ b/datasets/nasa-gpm3imergdl.yaml @@ -0,0 +1,111 @@ +Name: GPM IMERG Late Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDL) + at GES DISC +Description: "Version 07 is the current version of the data set. Older versions will + no longer be available and have been superseded by Version 07.\n\nThe Integrated + Multi-satellitE Retrievals for GPM (IMERG) is a NASA product estimating global + surface precipitation rates at a high resolution of 0.1° every half-hour beginning + 2000. It is part of the joint NASA-JAXA Global Precipitation Measurement (GPM) + mission, using the GPM Core Observatory satellite as the standard to combine precipitation + observations from an international constellation of satellites using advanced techniques. + \ IMERG can be used for global-scale applications as well as over regions with sparse + or no reliable surface observations.
The fine spatial and temporal resolution of + IMERG data allows them to be accumulated to the scale of the application for increased + skill. IMERG has three Runs with varying latencies in response to a range of application + needs: rapid-response applications (Early Run, 4-h latency), same/next-day applications + (Late Run, 14-h latency), and post-real-time research (Final Run, 3.5-month latency). + \ While IMERG strives for consistency and accuracy, satellite estimates of precipitation + are expected to have lower skill over frozen surfaces, complex terrain, and coastal + zones. As well, the changing GPM satellite constellation over time may introduce + artifacts that affect studies focusing on multi-year changes.\n\nThis dataset is + the GPM Level 3 IMERG Late Daily 10 x 10 km (GPM_3IMERGDL) derived from the half-hourly + GPM_3IMERGHHL. The derived result represents a Late expedited estimate of the daily + mean precipitation rate in mm/day. The dataset is produced by first computing the + mean precipitation rate in (mm/hour) in every grid cell, and then multiplying the + result by 24. This minimizes the possible dry bias in versions before \"07\", in + the simple daily totals for cells where less than 48 half-hourly observations are + valid for the day. The latter under-sampling is very rare in the combined microwave-infrared + (and rain gauge in the final) dataset, variable \"precipitation\", and appears + in higher latitudes. Thus, in most cases users of global \"precipitation\" data + will not notice any difference. This correction, however, is noticeable in the high-quality + microwave retrieval, variable \"MWprecipitation\", where the occurrence of less + than 48 valid half-hourly samples per day is very common. The counts of the valid + half-hourly samples per day have always been provided as a separate variable, and + users of daily data were advised to pay close attention to that variable and use + it to calculate the correct precipitation daily rates. 
Starting with version \"07\", + this is done in production to minimize possible misinterpretations of the data. + The counts are still provided in the data, but they are only given to gauge the + significance of the daily rates, and reconstruct the simple totals if someone wishes + to do so. \n\nThe latency of the derived Late daily product is a couple of minutes + after the last granule of GPM_3IMERGHHL for the UTC data day is received at GES + DISC. Since the target latency of GPM_3IMERGHHL is 14 hours, the daily should appear + no earlier than 14 hours after the closure of the UTC day. For information on the + original data (GPM_3IMERGHHL), please see the Documentation (Related URL). \n\nThe + daily mean rate (mm/day) is derived by first computing the mean precipitation rate + (mm/hour) in a grid cell for the data day, and then multiplying the result by 24. + \ Thus, for every grid cell we have \n\nPdaily_mean = SUM{Pi + * 1[Pi valid]} / Pdaily_cnt * 24, i=[1,Nf]\n\nWhere:\nPdaily_cnt = SUM{1[Pi valid]}\n\nPi + \ - half-hourly input, in (mm/hr)\nNf - Number of half-hourly + files per day, Nf=48\n1[.] - Indicator function; 1 when Pi is valid, + 0 otherwise\nPdaily_cnt - Number of valid retrievals in a grid cell per day.\n\nGrid + cells for which Pdaily_cnt=0, are set to fill value in the Daily files.\nNote that + Pi=0 is a valid value.\n\nPdaily_cnt are provided in the data files as variables + \"precipitation_cnt\" and \"MWprecipitation_cnt\", for correspondingly the microwave-IR-gauge + and microwave-only retrievals. They are only given to gauge the significance of + the daily rates, and reconstruct the simple totals if someone wishes to do so. \n\nThere + are various ways the daily error could be estimated from the source half-hourly + random error (variable \"randomError\"). The daily error provided in the data files + is calculated in a fashion similar to the daily mean precipitation rate. 
First, + the mean of the squared half-hourly \"randomError\" for the day is computed, and + the resulting (mm^2/hr) is converted to (mm^2/day). Finally, square root is taken + to get the result in (mm/day):\n\nPerr_daily = { SUM{ (Perr_i)^2 * 1[Perr_i valid] + } / Ncnt_err * 24}^0.5, i=[1,Nf]\nNcnt_err = SUM( 1[Perr_i valid] )\n\nwhere:\nPerr_i\t\t- + half-hourly input, \"randomError\", (mm/hr)\nPerr_daily\t- Magnitude of the daily + error, (mm/day)\nNcnt_err\t\t- Number of valid half-hour error estimates\n\nAgain, + the sum of squared \"randomError\" can be reconstructed, and other estimates can + be derived using the available counts in the Daily files.\n\n\nRead our doc on how + to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/GPM/IMERGDL/DAY/07 +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov. Home Page: https://disc.gsfc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 1998-01-01 to Ongoing +Tags: + - aws-pds + - atmosphere + - climate + - coastal + - datacenter + - global + - hydrology + - land + - metadata + - opendap + - netcdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GPM IMERG Late Precipitation L3 1 day 0.1 degree x 0.1 degree V07 + (GPM_3IMERGDL) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/GPM_L3/GPM_3IMERGDL.07/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial + Neural Network Cloud Classification System + AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao + - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, + Combined-Sensor Precipitation Estimates at Fine Scale.' + AuthorName: Huffman, G.
J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, + K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff + - Title: Kalman Filter Based CMORPH + URL: https://doi.org/10.1175/JHM-D-11-022.1 + AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak + - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface + Using In-Situ Gauge Observations + AuthorName: Rudolf, B., and U. Schneider + Tutorials: + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. diff --git a/datasets/nasa-gpm3imerghh.yaml b/datasets/nasa-gpm3imerghh.yaml new file mode 100644 index 000000000..ef6dbc01e --- /dev/null +++ b/datasets/nasa-gpm3imerghh.yaml @@ -0,0 +1,76 @@ +Name: GPM IMERG Final Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 (GPM_3IMERGHH) + at GES DISC +Description: |- + Version 07B is the current version of the IMERG data sets. Older versions will no longer be available and have been superseded by Version 07. + + The Integrated Multi-satellitE Retrievals for GPM (IMERG) is the unified U.S. algorithm that provides the multi-satellite precipitation product for the U.S. GPM team. + + The precipitation estimates from the various precipitation-relevant satellite passive microwave (PMW) sensors comprising the GPM constellation are computed using the 2021 version of the Goddard Profiling Algorithm (GPROF2021), then gridded, intercalibrated to the GPM Combined Ku Radar-Radiometer Algorithm (CORRA) product, and merged into half-hourly 0.1°x0.1° (roughly 10x10 km) fields.
Note that CORRA is adjusted to the monthly Global Precipitation Climatology Project (GPCP) Satellite-Gauge (SG) product over high-latitude ocean to correct known biases. + + The half-hourly intercalibrated merged PMW estimates are then input to both a Morphing-Kalman Filter (KF) Lagrangian time interpolation scheme based on work by the Climate Prediction Center (CPC) and the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) Dynamic Infrared–Rain Rate (PDIR) re-calibration scheme. In parallel, CPC assembles the zenith-angle-corrected, intercalibrated merged geo-IR fields and forwards them to PPS for input to the PERSIANN-CCS algorithm (supported by an asynchronous re-calibration cycle) which are then input to the KF morphing (quasi-Lagrangian time interpolation) scheme. + + The KF morphing (supported by an asynchronous KF weights updating cycle) uses the PMW and IR estimates to create half-hourly estimates. Motion vectors for the morphing are computed by maximizing the pattern correlation of successive hours within each of the precipitation (PRECTOT), total precipitable liquid water (TQL), and vertically integrated vapor (TQV) data fields provided by the Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2) and Goddard Earth Observing System model Version 5 (GEOS-5) Forward Processing (FP) for the post-real-time (Final) Run and the near-real-time (Early and Late) Runs, respectively. The vectors from PRECTOT are chosen if available, else from TQL, if available, else from TQV. The KF uses the morphed data as the “forecast” and the IR estimates as the “observations”, with weighting that depends on the time interval(s) away from the microwave overpass time. The IR becomes important after about ±90 minutes away from the overpass time. 
Variable averaging in the KF is accounted for in a routine (Scheme for Histogram Adjustment with Ranked Precipitation Estimates in the Neighborhood, or SHARPEN) that compares the local histogram of KF morphed precipitation to the local histogram of forward- and backward-morphed microwave data and the IR. + + The IMERG system is run twice in near-real time: + + "Early" multi-satellite product ~4 hr after observation time using only forward morphing and + "Late" multi-satellite product ~14 hr after observation time, using both forward and backward morphing + and once after the monthly gauge analysis is received: + + "Final", satellite-gauge product ~4 months after the observation month, using both forward and backward morphing and including monthly gauge analyses. + + In V07, the near-real-time Early and Late half-hourly estimates have a monthly climatological concluding calibration based on averaging the concluding calibrations computed in the Final, while in the post-real-time Final Run the multi-satellite half-hourly estimates are adjusted so that they sum to the Final Run monthly satellite-gauge combination. In all cases the output contains multiple fields that provide information on the input data, selected intermediate fields, and estimation quality. In general, the complete calibrated precipitation, precipitation, is the data field of choice for most users. + + Briefly describing the Final Run, the input precipitation estimates computed from the various satellite passive microwave sensors are intercalibrated to the CORRA product (because it is presumed to be the best snapshot TRMM/GPM estimate after adjustment to the monthly GPCP SG), then "forward/backward morphed" and combined with microwave precipitation-calibrated geo-IR fields, and adjusted with seasonal GPCP SG surface precipitation data to provide half-hourly and monthly precipitation estimates on a 0.1°x0.1° (roughly 10x10 km) grid over the globe. 
Precipitation phase is a diagnostic variable computed using analyses of surface temperature, humidity, and pressure. The current period of record is June 2000 to the present (delayed by about 4 months). + Read our doc on how to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/GPM/IMERG/3B-HH/07 +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov' +ManagedBy: NASA +UpdateFrequency: From 1998-01-01 to Ongoing +Tags: + - aws-pds + - atmosphere + - climate + - datacenter + - forecast + - global + - hdf + - hydrology + - land + - metadata + - opendap + - radar + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GPM IMERG Final Precipitation L3 Half Hourly 0.1 degree x 0.1 degree + V07 (GPM_3IMERGHH) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/GPM_L3/GPM_3IMERGHH.07/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial + Neural Network Cloud Classification System + AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao + - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, + Combined-Sensor Precipitation Estimates at Fine Scale.' + AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, + K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff + - Title: Kalman Filter Based CMORPH + URL: https://doi.org/10.1175/JHM-D-11-022.1 + AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak + - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface + Using In-Situ Gauge Observations + AuthorName: Rudolf, B., and U. 
Schneider + Tutorials: + - Title: How to Read IMERG Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Read_IMERG_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. diff --git a/datasets/nasa-gpm3imerghhe.yaml b/datasets/nasa-gpm3imerghhe.yaml new file mode 100644 index 000000000..e727bef5b --- /dev/null +++ b/datasets/nasa-gpm3imerghhe.yaml @@ -0,0 +1,103 @@ +Name: GPM IMERG Early Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 (GPM_3IMERGHHE) + at GES DISC +Description: "Version 07B is the current version of the IMERG data sets. Older versions + will no longer be available and have been superseded by Version 07.\n\nThe Integrated + Multi-satellitE Retrievals for GPM (IMERG) is the unified U.S. algorithm that provides + the multi-satellite precipitation product for the U.S.
GPM team.\n\nThe precipitation + estimates from the various precipitation-relevant satellite passive microwave (PMW) + sensors comprising the GPM constellation are computed using the 2021 version of + the Goddard Profiling Algorithm (GPROF2021), then gridded, intercalibrated to the + GPM Combined Ku Radar-Radiometer Algorithm (CORRA) product, and merged into half-hourly + 0.1°x0.1° (roughly 10x10 km) fields. Note that CORRA is adjusted to the monthly + Global Precipitation Climatology Project (GPCP) Satellite-Gauge (SG) product over + high-latitude ocean to correct known biases.\n\nThe half-hourly intercalibrated + merged PMW estimates are then input to both a Morphing-Kalman Filter (KF) Lagrangian + time interpolation scheme based on work by the Climate Prediction Center (CPC) and + the Precipitation Estimation from Remotely Sensed Information using Artificial Neural + Networks (PERSIANN) Dynamic Infrared–Rain Rate (PDIR) re-calibration scheme. In + parallel, CPC assembles the zenith-angle-corrected, intercalibrated merged geo-IR + fields and forwards them to PPS for input to the PERSIANN-CCS algorithm (supported + by an asynchronous re-calibration cycle) which are then input to the KF morphing + (quasi-Lagrangian time interpolation) scheme.\n\nThe KF morphing (supported by an + asynchronous KF weights updating cycle) uses the PMW and IR estimates to create + half-hourly estimates. Motion vectors for the morphing are computed by maximizing + the pattern correlation of successive hours within each of the precipitation (PRECTOT), + total precipitable liquid water (TQL), and vertically integrated vapor (TQV) data + fields provided by the Modern-Era Retrospective Analysis for Research and Applications, + Version 2 (MERRA-2) and Goddard Earth Observing System model Version 5 (GEOS-5) + Forward Processing (FP) for the post-real-time (Final) Run and the near-real-time + (Early and Late) Runs, respectively. 
The vectors from PRECTOT are chosen if available, + else from TQL, if available, else from TQV. The KF uses the morphed data as the + “forecast” and the IR estimates as the “observations”, with weighting that depends + on the time interval(s) away from the microwave overpass time. The IR becomes important + after about ±90 minutes away from the overpass time. Variable averaging in the KF + is accounted for in a routine (Scheme for Histogram Adjustment with Ranked Precipitation + Estimates in the Neighborhood, or SHARPEN) that compares the local histogram of + KF morphed precipitation to the local histogram of forward- and backward-morphed + microwave data and the IR.\n\nThe IMERG system is run twice in near-real time:\n\n\"Early\" + multi-satellite product ~4 hr after observation time using only forward morphing + and\n\"Late\" multi-satellite product ~14 hr after observation time, using both + forward and backward morphing\nand once after the monthly gauge analysis is received:\n\n\"Final\", + satellite-gauge product ~4 months after the observation month, using both forward + and backward morphing and including monthly gauge analyses.\n\nIn V07, the near-real-time + Early and Late half-hourly estimates have a monthly climatological concluding calibration + based on averaging the concluding calibrations computed in the Final, while in the + post-real-time Final Run the multi-satellite half-hourly estimates are adjusted + so that they sum to the Final Run monthly satellite-gauge combination. In all cases + the output contains multiple fields that provide information on the input data, + selected intermediate fields, and estimation quality. In general, the complete calibrated + precipitation, precipitation, is the data field of choice for most users.\n\nPrecipitation + phase is a diagnostic variable computed using analyses of surface temperature, humidity, + and pressure. 
\nRead our doc on how to get AWS Credentials to retrieve this data: + https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/GPM/IMERG/3B-HH-E/07 +Contact: 'Email: gsfc-dl-help-disc@mail.nasa.gov. Home Page: https://disc.gsfc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 1998-01-01 to Ongoing +Tags: + - aws-pds + - atmosphere + - climate + - datacenter + - forecast + - global + - hdf + - hydrology + - land + - metadata + - opendap + - radar + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GPM IMERG Early Precipitation L3 Half Hourly 0.1 degree x 0.1 degree + V07 (GPM_3IMERGHHE) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/GPM_L3/GPM_3IMERGHHE.07/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial + Neural Network Cloud Classification System + AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao + - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, + Combined-Sensor Precipitation Estimates at Fine Scale.' + AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, + K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff + - Title: Kalman Filter Based CMORPH + URL: https://doi.org/10.1175/JHM-D-11-022.1 + AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak + - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface + Using In-Situ Gauge Observations + AuthorName: Rudolf, B., and U. 
Schneider + Tutorials: + - Title: How to Read IMERG Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Read_IMERG_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. diff --git a/datasets/nasa-gpm3imerghhl.yaml b/datasets/nasa-gpm3imerghhl.yaml new file mode 100644 index 000000000..7d9435683 --- /dev/null +++ b/datasets/nasa-gpm3imerghhl.yaml @@ -0,0 +1,104 @@ +Name: GPM IMERG Late Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 (GPM_3IMERGHHL) + at GES DISC +Description: |- + Version 07B is the current version of the IMERG data sets. Older versions + will no longer be available and have been superseded by Version 07.\n\nThe Integrated + Multi-satellitE Retrievals for GPM (IMERG) is the unified U.S. algorithm that provides + the multi-satellite precipitation product for the U.S.
GPM team.\n\nThe precipitation + estimates from the various precipitation-relevant satellite passive microwave (PMW) + sensors comprising the GPM constellation are computed using the 2021 version of + the Goddard Profiling Algorithm (GPROF2021), then gridded, intercalibrated to the + GPM Combined Ku Radar-Radiometer Algorithm (CORRA) product, and merged into half-hourly + 0.1°x0.1° (roughly 10x10 km) fields. Note that CORRA is adjusted to the monthly + Global Precipitation Climatology Project (GPCP) Satellite-Gauge (SG) product over + high-latitude ocean to correct known biases.\n\nThe half-hourly intercalibrated + merged PMW estimates are then input to both a Morphing-Kalman Filter (KF) Lagrangian + time interpolation scheme based on work by the Climate Prediction Center (CPC) and + the Precipitation Estimation from Remotely Sensed Information using Artificial Neural + Networks (PERSIANN) Dynamic Infrared–Rain Rate (PDIR) re-calibration scheme. In + parallel, CPC assembles the zenith-angle-corrected, intercalibrated merged geo-IR + fields and forwards them to PPS for input to the PERSIANN-CCS algorithm (supported + by an asynchronous re-calibration cycle) which are then input to the KF morphing + (quasi-Lagrangian time interpolation) scheme.\n\nThe KF morphing (supported by an + asynchronous KF weights updating cycle) uses the PMW and IR estimates to create + half-hourly estimates. Motion vectors for the morphing are computed by maximizing + the pattern correlation of successive hours within each of the precipitation (PRECTOT), + total precipitable liquid water (TQL), and vertically integrated vapor (TQV) data + fields provided by the Modern-Era Retrospective Analysis for Research and Applications, + Version 2 (MERRA-2) and Goddard Earth Observing System model Version 5 (GEOS-5) + Forward Processing (FP) for the post-real-time (Final) Run and the near-real-time + (Early and Late) Runs, respectively. 
The vectors from PRECTOT are chosen if available, + else from TQL, if available, else from TQV. The KF uses the morphed data as the + “forecast” and the IR estimates as the “observations”, with weighting that depends + on the time interval(s) away from the microwave overpass time. The IR becomes important + after about ±90 minutes away from the overpass time. Variable averaging in the KF + is accounted for in a routine (Scheme for Histogram Adjustment with Ranked Precipitation + Estimates in the Neighborhood, or SHARPEN) that compares the local histogram of + KF morphed precipitation to the local histogram of forward- and backward-morphed + microwave data and the IR.\n\nThe IMERG system is run twice in near-real time:\n\n\"Early\" + multi-satellite product ~4 hr after observation time using only forward morphing + and\n\"Late\" multi-satellite product ~14 hr after observation time, using both + forward and backward morphing\nand once after the monthly gauge analysis is received:\n\n\"Final\", + satellite-gauge product ~4 months after the observation month, using both forward + and backward morphing and including monthly gauge analyses.\n\nIn V07, the near-real-time + Early and Late half-hourly estimates have a monthly climatological concluding calibration + based on averaging the concluding calibrations computed in the Final, while in the + post-real-time Final Run the multi-satellite half-hourly estimates are adjusted + so that they sum to the Final Run monthly satellite-gauge combination. In all cases + the output contains multiple fields that provide information on the input data, + selected intermediate fields, and estimation quality. In general, the complete calibrated + precipitation, precipitation, is the data field of choice for most users.\n\nPrecipitation + phase is a diagnostic variable computed using analyses of surface temperature, humidity, + and pressure. 
\n\n\n\nRead our doc on how to get AWS Credentials to retrieve this + data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/GPM/IMERG/3B-HH-L/07 +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov. Home Page: https://disc.gsfc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 1998-01-01 to Ongoing +Tags: + - aws-pds + - atmosphere + - climate + - datacenter + - forecast + - global + - hydrology + - land + - metadata + - opendap + - radar + - water + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GPM IMERG Late Precipitation L3 Half Hourly 0.1 degree x 0.1 degree + V07 (GPM_3IMERGHHL) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/GPM_L3/GPM_3IMERGHHL.07/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial + Neural Network Cloud Classification System + AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao + - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, + Combined-Sensor Precipitation Estimates at Fine Scale.' + AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, + K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff + - Title: Kalman Filter Based CMORPH + URL: https://doi.org/10.1175/JHM-D-11-022.1 + AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak + - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface + Using In-Situ Gauge Observations + AuthorName: Rudolf, B., and U. 
Schneider + Tutorials: + - Title: How to Read IMERG Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Read_IMERG_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. diff --git a/datasets/nasa-gpm3imergm.yaml b/datasets/nasa-gpm3imergm.yaml new file mode 100644 index 000000000..d8c94152e --- /dev/null +++ b/datasets/nasa-gpm3imergm.yaml @@ -0,0 +1,76 @@ +Name: GPM IMERG Final Precipitation L3 1 month 0.1 degree x 0.1 degree V07 (GPM_3IMERGM) + at GES DISC +Description: |- + Version 07B is the current version of the IMERG data sets. Older versions will no longer be available and have been superseded by Version 07. + + The Integrated Multi-satellitE Retrievals for GPM (IMERG) is the unified U.S. algorithm that provides the multi-satellite precipitation product for the U.S. GPM team. 
+ + The precipitation estimates from the various precipitation-relevant satellite passive microwave (PMW) sensors comprising the GPM constellation are computed using the 2021 version of the Goddard Profiling Algorithm (GPROF2021), then gridded, intercalibrated to the GPM Combined Ku Radar-Radiometer Algorithm (CORRA) product, and merged into half-hourly 0.1°x0.1° (roughly 10x10 km) fields. Note that CORRA is adjusted to the monthly Global Precipitation Climatology Project (GPCP) Satellite-Gauge (SG) product over high-latitude ocean to correct known biases. + + The half-hourly intercalibrated merged PMW estimates are then input to both a Morphing-Kalman Filter (KF) Lagrangian time interpolation scheme based on work by the Climate Prediction Center (CPC) and the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) Dynamic Infrared–Rain Rate (PDIR) re-calibration scheme. In parallel, CPC assembles the zenith-angle-corrected, intercalibrated merged geo-IR fields and forwards them to PPS for input to the PERSIANN-CCS algorithm (supported by an asynchronous re-calibration cycle) which are then input to the KF morphing (quasi-Lagrangian time interpolation) scheme. + + The KF morphing (supported by an asynchronous KF weights updating cycle) uses the PMW and IR estimates to create half-hourly estimates. Motion vectors for the morphing are computed by maximizing the pattern correlation of successive hours within each of the precipitation (PRECTOT), total precipitable liquid water (TQL), and vertically integrated vapor (TQV) data fields provided by the Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2) and Goddard Earth Observing System model Version 5 (GEOS-5) Forward Processing (FP) for the post-real-time (Final) Run and the near-real-time (Early and Late) Runs, respectively. The vectors from PRECTOT are chosen if available, else from TQL, if available, else from TQV. 
The KF uses the morphed data as the “forecast” and the IR estimates as the “observations”, with weighting that depends on the time interval(s) away from the microwave overpass time. The IR becomes important after about ±90 minutes away from the overpass time. Variable averaging in the KF is accounted for in a routine (Scheme for Histogram Adjustment with Ranked Precipitation Estimates in the Neighborhood, or SHARPEN) that compares the local histogram of KF morphed precipitation to the local histogram of forward- and backward-morphed microwave data and the IR. + + The IMERG system is run twice in near-real time: + + "Early" multi-satellite product ~4 hr after observation time using only forward morphing and + "Late" multi-satellite product ~14 hr after observation time, using both forward and backward morphing + and once after the monthly gauge analysis is received: + + "Final", satellite-gauge product ~4 months after the observation month, using both forward and backward morphing and including monthly gauge analyses. + + In V07, the near-real-time Early and Late half-hourly estimates have a monthly climatological concluding calibration based on averaging the concluding calibrations computed in the Final, while in the post-real-time Final Run the multi-satellite half-hourly estimates are adjusted so that they sum to the Final Run monthly satellite-gauge combination. In all cases the output contains multiple fields that provide information on the input data, selected intermediate fields, and estimation quality. In general, the complete calibrated precipitation, precipitation, is the data field of choice for most users. 
+ + Briefly describing the Final Run, the input precipitation estimates computed from the various satellite passive microwave sensors are intercalibrated to the CORRA product (because it is presumed to be the best snapshot TRMM/GPM estimate after adjustment to the monthly GPCP SG), then "forward/backward morphed" and combined with microwave precipitation-calibrated geo-IR fields, and adjusted with seasonal GPCP SG surface precipitation data to provide half-hourly and monthly precipitation estimates on a 0.1°x0.1° (roughly 10x10 km) grid over the globe. Precipitation phase is a diagnostic variable computed using analyses of surface temperature, humidity, and pressure. The current period of record is June 2000 to the present (delayed by about 4 months). + Read our doc on how to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/GPM/IMERG/3B-MONTH/07 +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov. Home Page: https://disc.gsfc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 1998-01-01 to Ongoing +Tags: + - aws-pds + - atmosphere + - climate + - datacenter + - forecast + - global + - hydrology + - land + - metadata + - opendap + - radar + - water + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GPM IMERG Final Precipitation L3 1 month 0.1 degree x 0.1 degree + V07 (GPM_3IMERGM) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/GPM_L3/GPM_3IMERGM.07/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial + Neural Network Cloud Classification System + AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. 
Gao, 2004 + - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, + Combined-Sensor Precipitation Estimates at Fine Scale' + AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, + K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff + - Title: Kalman Filter Based CMORPH + AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak + - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface + Using In-Situ Gauge Observations. Proc. of the 2nd Internat. Precip. Working + Group Workshop + AuthorName: Rudolf, B., and U. Schneider + Tutorials: + - Title: How to Read IMERG Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Read_IMERG_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. 
diff --git a/datasets/nasa-gpmimerglandseamask.yaml b/datasets/nasa-gpmimerglandseamask.yaml new file mode 100644 index 000000000..52f43f805 --- /dev/null +++ b/datasets/nasa-gpmimerglandseamask.yaml @@ -0,0 +1,39 @@ +Name: Land/Sea static mask relevant to IMERG precipitation 0.1x0.1 degree V2 (GPM_IMERG_LandSeaMask) + at GES DISC +Description: |- + Version 2 is the current version of the data set. Older versions will no longer be available and have been superseded by Version 2. + + This land sea mask originated from the NOAA group at SSEC in the 1980s. It was originally produced at 1/6 deg resolution, and then regridded for the purposes of GPCP, TMPA, and IMERG precipitation products. NASA code 610.2, Terrestrial Information Systems Laboratory, restructured this land sea mask to match the IMERG grid, and converted the file to CF-compliant netCDF4. Version 2 was created in May, 2019 to resolve detected inaccuracies in coastal regions. + + Users should be aware that this is a static mask, i.e. there is no seasonal or annual variability, and it is due for update. It is not recommended to be used outside of the aforementioned precipitation data. + Read our doc on how to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/6P5EM1HPR3VD +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov. Home Page: https://disc.gsfc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 1998-01-01 to Ongoing +Tags: + - aws-pds + - atmosphere + - coastal + - datacenter + - global + - land + - metadata + - opendap + - netcdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'Land/Sea static mask relevant to IMERG precipitation 0.1x0.1 degree + V2 (GPM_IMERG_LandSeaMask) at GES DISC.' 
+ ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/AUXILIARY/GPM_IMERG_LandSeaMask.2/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. diff --git a/datasets/nasa-gpmmergir.yaml b/datasets/nasa-gpmmergir.yaml new file mode 100644 index 000000000..049ecab57 --- /dev/null +++ b/datasets/nasa-gpmmergir.yaml @@ -0,0 +1,81 @@ +Name: NCEP/CPC L3 Half Hourly 4km Global (60S - 60N) Merged IR V1 (GPM_MERGIR) at + GES DISC +Description: "These data originate from NOAA/NCEP.\n\nThe NOAA Climate Prediction + Center/NCEP/NWS is making the data available originally in binary format, in a weekly + rotating archive. The NASA GES DISC acquires the binary files as they become + available, converts them into CF (Climate and Forecast) -convention compliant netCDF-4 + format, and stores the product in a permanent archive. The original record started + from February, 2000, but in June, 2025 it was extended back to January, 1998.\n\nThe + leading edge of data availability is delayed by about 24 hours from real-time to + abide by international data exchange agreements between NOAA and EUMETSAT (the METEOSAT + data providers).\n\nThe data contain globally-merged (60°S-60°N) 4-km pixel-resolution + IR brightness temperature data (equivalent blackbody temps), merged from the European, + Japanese, and U.S. 
geostationary satellites over the period of record (GOES-8/9/10/11/12/13/14/15/16/17/18/19, + METEOSAT-5/7/8/9/10/11, and GMS-5/MTSat-1R/2/Himawari-8/9).\n\nThe global geo-IR + are dynamically calibrated to GOES East, using a 35 day trailing inter-calibration + using time/space-matched IR Tb’s at the mid-point between sub-satellite positions. + \ In the event of duplicate data in a grid box, the value with the smaller zenith + angle is taken. The data have been corrected for \"zenith angle dependence\", in + which IR temperatures for locations far from satellite nadir are erroneously cold + due to a combination of geometric effects and radiometric path extinction effects + (Joyce et al. 2001). Finally, the data are re-navigated for parallax, which shifts + the geo-location of the GEO-IR footprints to approximately account for the cloud + tops that the IR “sees” being displaced away from their actual geographic location + when viewed along a slanted path. These corrections allow for the merging of the + IR data from the various GEO-satellites with greatly reduced discontinuities at + GEO-satellite data boundaries. In the event of duplicate data in a grid box, the + value with the smaller zenith angle is taken.\n\nThe NASA GES DISC is curating these + data in a self-documenting, CF-compliant, netCDF-4 format, which allows a broad + range of applications to access the data directly, without the need to cope with + the original binary data format. In addition to the direct download of netCDF-4 + data, the GES DISC provides data download in binary, ASCII, and netCDF-3 formats + using the OPeNDAP interface.\n\nSimilarities with the original\n-----------------------------\nAs + in the original binaries, every netCDF-4 file covers one hour, and contains two + half-hourly grids, at 4-km grid cell resolution. \n\nDifferences from the original\n-----------------------------\n1. 
+ The data in the netCDF-4 files are already converted to real (float) values of Brightness + Temperatures in Kelvin. There is no need to further scale these data. The netCDF-4 + format is machine-independent and users need not worry about the endian-ness of + their machines. \n\n2. To meet the requirements of collection spatial metadata, + the grid is re-ordered from the original and now goes from -180 (West) to 180 (East). + It is also starting from -60 (South).\n\nThe data and time units are reflected in + the corresponding \"units\" attributes, and grid dimensions are described by longitude + (\"lon\"), latitude (\"lat\") and \"time\" vectors. Thus, any CF-compliant tool + should automatically understand the setup in the data files and the starting time + for each half-hourly grid. Even without such tools, simple \"ncdump\" or \"h5dump\" + command line tools will easily disclose the netCDF-4 files configuration.\n\nAcknowledgements\n------------------\nThe + creation of the original data at NOAA/NCEP is supported by funding from the NOAA + Office of Global Programs for the Global Precipitation Climatology Project (GPCP) + and by NASA via the Tropical Rainfall Measuring Mission (TRMM). \n\nThe permanent + archive at GES DISC is supported by NASA's HQ Earth Science Data Systems (ESDS) + Program. \n\n\n\nRead our doc on how to get AWS Credentials to retrieve this data: + https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/P4HZB9N27EKU +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov. 
Home Page: https://disc.gsfc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 1998-01-01 to Ongoing +Tags: + - aws-pds + - atmosphere + - climate + - datacenter + - forecast + - global + - metadata + - opendap + - netcdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'NCEP/CPC L3 Half Hourly 4km Global (60S - 60N) Merged IR V1 (GPM_MERGIR) + at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/MERGED_IR/GPM_MERGIR.1/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha. diff --git a/datasets/nasa-hlsl30.yaml b/datasets/nasa-hlsl30.yaml new file mode 100644 index 000000000..63133b64a --- /dev/null +++ b/datasets/nasa-hlsl30.yaml @@ -0,0 +1,190 @@ +Name: HLS Landsat Operational Land Imager Surface Reflectance and TOA Brightness Daily + Global 30m v2.0 +Description: "The Harmonized Landsat Sentinel-2 (HLS) project provides consistent + surface reflectance (SR) and top of atmosphere (TOA) brightness data from a virtual + constellation of satellite sensors. The Operational Land Imager (OLI) is housed + aboard the joint NASA/USGS Landsat 8 and Landsat 9 satellites, while the Multi-Spectral + Instrument (MSI) is mounted aboard Europe’s Copernicus Sentinel-2A, Sentinel-2B, + and Sentinel-2C satellites. 
The combined measurement enables global observations + of the land every 2–3 days at 30-meter (m) spatial resolution. The HLS project uses + a set of algorithms to obtain seamless products from OLI and MSI that include atmospheric + correction, cloud and cloud-shadow masking, spatial co-registration and common gridding, + illumination and view angle normalization, and spectral bandpass adjustment.\n\nThe + HLSL30 product provides 30-m Nadir Bidirectional Reflectance Distribution Function + (BRDF)-Adjusted Reflectance (NBAR) and is derived from Landsat 8/9 OLI data products. + The [HLSS30](https://doi.org/10.5067/HLS/HLSS30.002) and HLSL30 products are gridded + to the same resolution and Military Grid Reference System ([MGRS](https://hls.gsfc.nasa.gov/products-description/tiling-system/)) + tiling system and thus are “stackable” for time series analysis.\n\nThe HLSL30 product + is provided in Cloud Optimized GeoTIFF (COG) format, and each band is distributed + as a separate file. There are 11 bands included in the HLSL30 product along with + one quality assessment (QA) band and four angle bands. See the User Guide for a + more detailed description of the individual bands provided in the HLSL30 product.\n\nKnown + Issues\n\n* Unrealistically high aerosol and low surface reflectance over bright + areas: The atmospheric correction over bright targets occasionally retrieves unrealistically + high aerosol and thus makes the surface reflectance too low. High aerosol retrievals, + both false high aerosol and realistically high aerosol, are masked when quality + bits 6 and 7 are both set to 1 (see Table 9 in the [User Guide](https://lpdaac.usgs.gov/documents/1698/HLS_User_Guide_V2.pdf)); + the corresponding spectral data should be discarded from analysis.\n\n* Issues over + high latitudes: For scenes greater than or equal to 80 degrees north, multiple overpasses + can be gridded into a single MGRS tile resulting in an L30 granule with data sensed + at two different times. 
In this same area, it is also possible that Landsat overpasses + that should be gridded into a single MGRS tile are actually written as separate + data files. Finally, for scenes with a latitude greater than or equal to 65 degrees + north, ascending Landsat scenes may have a slightly higher error in the BRDF correction + because the algorithm is calibrated using descending scenes.\n\n* Fmask omission + errors: There are known issues regarding the Fmask band of this data product that + impacts HLSL30 data prior to April of 2022. The HLS Fmask data band may have omission + errors in water detection for cases where water detection using spectral data alone + is difficult, and omission and commission errors in cloud shadow detection for areas + with great topographic relief. This issue does not impact other bands in the dataset.\n\n* + Inconsistent snow surface reflectance between Landsat and Sentinel-2: The HLS snow + surface reflectance can be highly inconsistent between Landsat and Sentinel-2. When + assessed on same-day acquisitions from Landsat and Sentinel-2, Landsat reflectance + is generally higher than Sentinel-2 reflectance in the visible bands.\n\n* Unrealistically + high snow surface reflectance in the visible bands: By design, the Land Surface + Reflectance Code (LaSRC) atmospheric correction does not attempt aerosol retrieval + over snow; instead, a default aerosol optical thickness (AOT) is used to drive the + snow surface reflectance. If the snow detection fails, the full LaSRC is used in + both AOT retrieval and surface reflectance derivation over snow, which produces + surface reflectance values as high as 1.6 in the visible bands. This is a common + problem for spring images at high latitudes.\n\n* Unrealistically low surface reflectance + surrounding snow/ice: Related to the above, the AOT retrieval over snow/ice is generally + too high. 
When this artificially high AOT is used to derive the surface reflectance + of the neighboring non-snow pixels, very low surface reflectance will result. These + pixels will appear very dark in the visible bands. If the surface reflectance value + of a pixel is below -0.2, a NO_DATA value of -9999 is used.\n\n* Unrealistically + low reflectance surrounding clouds: Like for snow, the HLS atmospheric correction + does not attempt aerosol retrieval over clouds and a default AOT is used instead. + But if the cloud detection fails, an artificially high AOT will be retrieved over + clouds. If the high AOT is used to derive the surface reflectance of the neighboring + cloud-free pixels, very low surface reflectance values will result. If the surface + reflectance value of a pixel is below -0.2, a NO_DATA value of -9999 is used. \n\n* + Unusually low reflectance around other bright land targets: While the HLS atmospheric + correction retrieves AOT over non-cloud, non-snow bright pixels, the retrieved AOT + over bright targets can be unrealistically high in some cases, similar to cloud + or snow. If this unrealistically high AOT is used to derive the surface reflectance + of the neighboring pixels, very low surface reflectance values can result as shown + in Figure 2. If the surface reflectance value of a pixel is below -0.2, a NO_DATA + value of -9999 is used. These types of bright targets are mostly man-made, such + as buildings, parking lots, and roads. \n\n* Dark plumes over water: The HLS atmospheric + correction does not attempt aerosol retrieval over water. For water pixels, the + AOT retrieved from the nearest land pixels is used to derive the surface reflectance, + but if the retrieval is incorrect, e.g. from a cloud pixel, this high AOT will create + dark stripes over water, as shown in Figure 3. This happens more often over large + water bodies, such as lakes and bays, than over narrow rivers. 
\n\n* Landsat WRS-2 + Path/Row boundary in L30 reflectance: HLS performs atmospheric correction on Landsat + Level 1 images in the original Worldwide Reference System 2 (WRS2) path/row before + the derived surface reflectance is reprojected into Military Grid Reference System + (MGRS) tiles. If a WRS-2 Landsat image is very cloudy, the AOT from a few remaining + clear pixels might be used for the atmospheric correction of the entire image. The + AOT that is used can be quite different from the value for the adjacent row in the + same path, which results in an artificial abrupt change from one row to the next, + as shown in Figure 4. This occurrence is very rare. \n \n* Landsat WRS2 path/row + boundary in cloud masks: The cloud mask algorithm Fmask creates mask labels by applying + thresholds to the histograms of some metrics for each path/row independently. If + two adjacent rows in the same path have distinct distributions within the metrics, + abrupt changes in masking patterns can appear across the row boundary, as shown + in Figure 5. This occurrence is very rare. \n\n* Fmask configuration was deficient + for 2-3 months in 2021: The HLS installation of Fmask failed to include auxiliary + digital elevation model (DEM) and European Space Agency (ESA) Global Surface Water + Occurrence data for a 2-3 month run in 2021. This impacted the masking results over + water and in mountainous regions. \n\n* The reflectance “scale_factor” and “offset” + for some L30 and S30 bands were not set: The HLS reflectance scaling factor is 0.0001 + and offset is 0. However, this information was not set in the Cloud Optimized GeoTIFF + (COG) files of some bands for a small number of granules. The lack of this information + creates a problem for automatic conversion of the reflectance data, requiring explicit + scaling in applications. The problem has been corrected, but the affected granules + have not been reprocessed. 
\n\n* Incomplete map projection information: For a time, + HLS imagery was produced with an incomplete coordinate reference system (CRS). The + metadata contains the Universal Transverse Mercator (UTM) zone and coordinates necessary + to geolocate pixels within the image but might not be in a standard form, especially + for granules produced early in the HLS mission. As a result, an error will occur + in certain image processing packages due to the incomplete CRS. The simplest solution + is to update to the latest version of Geospatial Data Abstraction Library (GDAL) + and/or rasterio, which use the available information without error. \n\n* False + northing of 10^7 for the L30 angle data: The L30 and S30 products do not use a false + northing for the UTM projection, and the angle data are supposed to follow the same + convention. However, the L30 angle data incorrectly uses a false northing of 10^7. + There is no problem with the angle data itself, but the false northing needs to + be set to 0 for it to be aligned with the reflectance.\n\n* L30 from Landsat L1GT + scenes: Landsat L1GT scenes were not intended for HLS due to their poor geolocation. + However, some scenes made it through screening for a short period of HLS production. + L1GT L30 scenes mainly consist of extensive cloud or snow that can be eliminated + using the Fmask quality bits layer. Users can also identify an L1GT-originated L30 + granule by examining the HLS cmr.xml metadata file.\n\n* The UTC dates in the L30/S30 + filenames may not be the local dates: UTC dates are used by ESA and the U.S. Geological + Survey (USGS) in naming their Level 1 images, and HLS processing retains this information + to name the L30 and S30 products. Landsat and Sentinel-2 overpass eastern Australia + and New Zealand around 10AM local solar time, but this area is in either UTC+10:00 + or +11:00 zone; therefore, the UTC time for some orbits is in fact near the end + of the preceding UTC day. 
For example, HLS.S30.T59HQS.2016117T221552.v2.0 was acquired + in the 22nd hour of day 117 of year 2016 in UTC, but the time was 10:15:52 of day + 118 locally. Approximately 100 minutes later HLS.S30.T56JML.2016117T235252.v2.0 + was acquired in the next orbit in eastern Australia. \n\n This issue also occurs + for Landsat. For example, HLS.L30.T59HQS.2016117T221209.v2.0 was acquired on the + same day as the first S30 example given above, but both on day 118 of 2016 locally. + Adding to the confusion for L30, in the same region, Landsat 8 and 9 can each overpass + once in one of the two adjacent WRS-2 Paths (91/92/93) over a two-day period on + a local calendar, but based on UTC time, the two overpasses can appear to be on + the same day. For example, in the following seemingly same-day pair, the second + L30 is actually for day 168 locally: \n HLS.L30.T55GCN.2023167T000407.v2.0 + \ \n HLS.L30.T55GCN.2023167T235747.v2.0 \n Bear in mind, the date peculiarity + for the data occurs when the overpass time is during the late hours of a UTC day. + \n\n* The atmospheric ancillary data from the wrong date was used for LaSRC: Related + to the above, for eastern Australia and New Zealand, L30 and S30 surface reflectance + on certain days was created using the atmospheric ancillary data from a date that + was one day too early. The exact geographic extent of the affected HLS products + and the impact on the surface reflectance quality are under investigation. Practice + caution when using data with overpass times during the late hours of a UTC day.\n\n* + Duplicates in L30: The Landsat 9 acquisitions from October 2021 to March 2023 in + Landsat Collection 2 were reprocessed by USGS in March 2023. This reprocessing updated + the overpass time by a fraction of a second for some scenes. Since HLS uses overpass + time as part of the L30 filename, the older L30 granules were not automatically + overwritten due to the different filenames. 
For example, the first L30 granule in + the following pair originated from an older version of L1TP of Landsat 9 with the + second granule originating from the reprocessed version. \nHLS.L30.T11SLC.2022166T182646.v2.0 + \ \nHLS.L30.T11SLC.2022166T182645.v2.0 \nThere are other causes of duplicate L30 + granules, but the overall number of duplicates is very small.\n\n* Poor Geolocation: + A large number of granules that were processed for May through July 2023 were created + with L1GT input scenes, which were deemed undesirable because of poor geolocation. + These granules were removed from the archive (see the full list of removed [granules](https://lpdaac.usgs.gov/documents/2161/L30_L1GT_granules_May_July_2023.csv)).\n\nImprovements/Changes + from Previous Versions\n\n* Aerosol QA bits from the USGS Land Surface Reflectance + Code (LaSRC) model output have been added into the Function of Mask (Fmask) data + layer. The added two bits indicate the aerosol levels: high, medium, low, and climatology + aerosol.\nRead our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/HLS/HLSL30.002 +Contact: 'User Services: lpdaac@usgs.gov' +ManagedBy: NASA +UpdateFrequency: From 2013-04-11 to Ongoing +Tags: + - aws-pds + - atmosphere + - cog + - datacenter + - earth observation + - geospatial + - global + - ice + - land + - metadata + - orbit + - satellite imagery + - stac + - surface water + - tiles + - water + - xml +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'HLS Landsat Operational Land Imager Surface Reflectance and TOA + Brightness Daily Global 30m v2.0.' 
+ ARN: arn:aws:s3:::lp-prod-protected/HLSL30.020 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Getting Started with Cloud-Native HLS Data in Python + URL: https://github.com/nasa/HLS-Data-Resources/blob/main/python/tutorials/HLS_Tutorial.ipynb + AuthorName: Mahsa Jami, Erik A. Bolch, Cole K. Krehbiel, Aaron M. Friesz, Brianna M. Lind diff --git a/datasets/nasa-hlss30.yaml b/datasets/nasa-hlss30.yaml new file mode 100644 index 000000000..e8f40ff26 --- /dev/null +++ b/datasets/nasa-hlss30.yaml @@ -0,0 +1,190 @@ +Name: HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance Daily Global 30m + v2.0 +Description: "The Harmonized Landsat Sentinel-2 (HLS) project provides consistent + surface reflectance data from the Operational Land Imager (OLI) aboard the joint + NASA/USGS Landsat 8 satellite and the Multi-Spectral Instrument (MSI) aboard Europe’s + Copernicus Sentinel-2A, Sentinel-2B, and Sentinel-2C satellites. The combined measurement + enables global observations of the land every 2–3 days at 30-meter (m) spatial resolution. + The HLS project uses a set of algorithms to obtain seamless products from OLI and + MSI that include atmospheric correction, cloud and cloud-shadow masking, spatial + co-registration and common gridding, illumination and view angle normalization, + and spectral bandpass adjustment. \n\nThe HLSS30 product provides 30-m Nadir Bidirectional + Reflectance Distribution Function (BRDF)-Adjusted Reflectance (NBAR) and is derived + from Sentinel-2A, Sentinel-2B, and Sentinel-2C MSI data products. 
The HLSS30 and + [HLSL30](https://doi.org/10.5067/HLS/HLSL30.002) products are gridded to the same + resolution and Military Grid Reference System ([MGRS](https://hls.gsfc.nasa.gov/products-description/tiling-system/)) + tiling system and thus are “stackable” for time series analysis.\n\nThe HLSS30 product + is provided in Cloud Optimized GeoTIFF (COG) format, and each band is distributed + as a separate COG. There are 13 bands included in the HLSS30 product along with + four angle bands and a quality assessment (QA) band. See the User Guide for a more + detailed description of the individual bands provided in the HLSS30 product.\n\nKnown + Issues\n\n* Unrealistically high aerosol and low surface reflectance over bright + areas: The atmospheric correction over bright targets occasionally retrieves unrealistically + high aerosol and thus makes the surface reflectance too low. High aerosol retrievals, + both false high aerosol and realistically high aerosol, are masked when quality + bits 6 and 7 are both set to 1 (see Table 9 in the [User Guide](https://lpdaac.usgs.gov/documents/1698/HLS_User_Guide_V2.pdf)); + the corresponding spectral data should be discarded from analysis.\n\n* Issues over + high latitudes: For scenes greater than or equal to 80 degrees north, multiple overpasses + can be gridded into a single MGRS tile resulting in an L30 granule with data sensed + at two different times. In this same area, it is also possible that Landsat overpasses + that should be gridded into a single MGRS tile are actually written as separate + data files. Finally, for scenes with a latitude greater than or equal to 65 degrees + north, ascending Landsat scenes may have a slightly higher error in the BRDF correction + because the algorithm is calibrated using descending scenes.\n\n* Fmask omission + errors: There are known issues regarding the Fmask band of this data product that + impacts HLSL30 data prior to April of 2022. 
The HLS Fmask data band may have omission + errors in water detection for cases where water detection using spectral data alone + is difficult, and omission and commission errors in cloud shadow detection for areas + with great topographic relief. This issue does not impact other bands in the dataset.\n\n* + Inconsistent snow surface reflectance between Landsat and Sentinel-2: The HLS snow + surface reflectance can be highly inconsistent between Landsat and Sentinel-2. When + assessed on same-day acquisitions from Landsat and Sentinel-2, Landsat reflectance + is generally higher than Sentinel-2 reflectance in the visible bands.\n\n* Unrealistically + high snow surface reflectance in the visible bands: By design, the Land Surface + Reflectance Code (LaSRC) atmospheric correction does not attempt aerosol retrieval + over snow; instead, a default aerosol optical thickness (AOT) is used to drive the + snow surface reflectance. If the snow detection fails, the full LaSRC is used in + both AOT retrieval and surface reflectance derivation over snow, which produces + surface reflectance values as high as 1.6 in the visible bands. This is a common + problem for spring images at high latitudes.\n\n* Unrealistically low surface reflectance + surrounding snow/ice: Related to the above, the AOT retrieval over snow/ice is generally + too high. When this artificially high AOT is used to derive the surface reflectance + of the neighboring non-snow pixels, very low surface reflectance will result. These + pixels will appear very dark in the visible bands. If the surface reflectance value + of a pixel is below -0.2, a NO_DATA value of -9999 is used. In Figure 1, the pixels + in front of the glaciers have surface reflectance values that are too low. \n\n* + Unrealistically low reflectance surrounding clouds: Like for snow, the HLS atmospheric + correction does not attempt aerosol retrieval over clouds and a default AOT is used + instead. 
But if the cloud detection fails, an artificially high AOT will be retrieved + over clouds. If the high AOT is used to derive the surface reflectance of the neighboring + cloud-free pixels, very low surface reflectance values will result. If the surface + reflectance value of a pixel is below -0.2, a NO_DATA value of -9999 is used. \n\n* + Unusually low reflectance around other bright land targets: While the HLS atmospheric + correction retrieves AOT over non-cloud, non-snow bright pixels, the retrieved AOT + over bright targets can be unrealistically high in some cases, similar to cloud + or snow. If this unrealistically high AOT is used to derive the surface reflectance + of the neighboring pixels, very low surface reflectance values can result as shown + in Figure 2. If the surface reflectance value of a pixel is below -0.2, a NO_DATA + value of -9999 is used. These types of bright targets are mostly man-made, such + as buildings, parking lots, and roads. \n\n* Dark plumes over water: The HLS atmospheric + correction does not attempt aerosol retrieval over water. For water pixels, the + AOT retrieved from the nearest land pixels is used to derive the surface reflectance, + but if the retrieval is incorrect, e.g. from a cloud pixel, this high AOT will create + dark stripes over water, as shown in Figure 3. This happens more often over large + water bodies, such as lakes and bays, than over narrow rivers. \n\n* Landsat WRS-2 + Path/Row boundary in L30 reflectance: HLS performs atmospheric correction on Landsat + Level 1 images in the original Worldwide Reference System 2 (WRS2) path/row before + the derived surface reflectance is reprojected into Military Grid Reference System + (MGRS) tiles. If a WRS-2 Landsat image is very cloudy, the AOT from a few remaining + clear pixels might be used for the atmospheric correction of the entire image. 
The + AOT that is used can be quite different from the value for the adjacent row in the + same path, which results in an artificial abrupt change from one row to the next, + as shown in Figure 4. This occurrence is very rare. \n \n* Landsat WRS2 path/row + boundary in cloud masks: The cloud mask algorithm Fmask creates mask labels by applying + thresholds to the histograms of some metrics for each path/row independently. If + two adjacent rows in the same path have distinct distributions within the metrics, + abrupt changes in masking patterns can appear across the row boundary, as shown + in Figure 5. This occurrence is very rare. \n\n* Fmask configuration was deficient + for 2-3 months in 2021: The HLS installation of Fmask failed to include auxiliary + digital elevation model (DEM) and European Space Agency (ESA) Global Surface Water + Occurrence data for a 2-3 month run in 2021. This impacted the masking results over + water and in mountainous regions. \n\n* The reflectance “scale_factor” and “offset” + for some L30 and S30 bands were not set: The HLS reflectance scaling factor is 0.0001 + and offset is 0. However, this information was not set in the Cloud Optimized GeoTIFF + (COG) files of some bands for a small number of granules. The lack of this information + creates a problem for automatic conversion of the reflectance data, requiring explicit + scaling in applications. The problem has been corrected, but the affected granules + have not been reprocessed. \n\n* Incomplete map projection information: For a time, + HLS imagery was produced with an incomplete coordinate reference system (CRS). The + metadata contains the Universal Transverse Mercator (UTM) zone and coordinates necessary + to geolocate pixels within the image but might not be in a standard form, especially + for granules produced early in the HLS mission. As a result, an error will occur + in certain image processing packages due to the incomplete CRS. 
The simplest solution + is to update to the latest version of Geospatial Data Abstraction Library (GDAL) + and/or rasterio, which use the available information without error. \n\n* False + northing of 10^7 for the L30 angle data: The L30 and S30 products do not use a false + northing for the UTM projection, and the angle data are supposed to follow the same + convention. However, the L30 angle data incorrectly uses a false northing of 10^7. + There is no problem with the angle data itself, but the false northing needs to + be set to 0 for it to be aligned with the reflectance.\n\n* L30 from Landsat L1GT + scenes: Landsat L1GT scenes were not intended for HLS due to their poor geolocation. + However, some scenes made it through screening for a short period of HLS production. + L1GT L30 scenes mainly consist of extensive cloud or snow that can be eliminated + using the Fmask quality bits layer. Users can also identify an L1GT-originated L30 + granule by examining the HLS cmr.xml metadata file.\n\n* The UTC dates in the L30/S30 + filenames may not be the local dates: UTC dates are used by ESA and the U.S. Geological + Survey (USGS) in naming their Level 1 images, and HLS processing retains this information + to name the L30 and S30 products. Landsat and Sentinel-2 overpass eastern Australia + and New Zealand around 10AM local solar time, but this area is in either UTC+10:00 + or +11:00 zone; therefore, the UTC time for some orbits is in fact near the end + of the preceding UTC day. For example, HLS.S30.T59HQS.2016117T221552.v2.0 was acquired + in the 22nd hour of day 117 of year 2016 in UTC, but the time was 10:15:52 of day + 118 locally. Approximately 100 minutes later HLS.S30.T56JML.2016117T235252.v2.0 + was acquired in the next orbit in eastern Australia. \n\n This issue also occurs + for Landsat. For example, HLS.L30.T59HQS.2016117T221209.v2.0 was acquired on the + same day as the first S30 example given above, but both on day 118 of 2016 locally. 
+ Adding to the confusion for L30, in the same region, Landsat 8 and 9 can each overpass + once in one of the two adjacent WRS-2 Paths (91/92/93) over a two-day period on + a local calendar, but based on UTC time, the two overpasses can appear to be on + the same day. For example, in the following seemingly same-day pair, the second + L30 is actually for day 168 locally: \n HLS.L30.T55GCN.2023167T000407.v2.0 + \ \n HLS.L30.T55GCN.2023167T235747.v2.0 \n Bear in mind, the date peculiarity + for the data occurs when the overpass time is during the late hours of a UTC day. + \ \n\n* The atmospheric ancillary data from the wrong date was used for LaSRC: Related + to the above, for eastern Australia and New Zealand, L30 and S30 surface reflectance + on certain days was created using the atmospheric ancillary data from a date that + was one day too early. The exact geographic extent of the affected HLS products + and the impact on the surface reflectance quality are under investigation. Practice + caution when using data with overpass times during the late hours of a UTC day.\n\n* + Duplicates in L30: The Landsat 9 acquisitions from October 2021 to March 2023 in + Landsat Collection 2 were reprocessed by USGS in March 2023. This reprocessing updated + the overpass time by a fraction of a second for some scenes. Since HLS uses overpass + time as part of the L30 filename, the older L30 granules were not automatically + overwritten due to the different filenames. For example, the first L30 granule in + the following pair originated from an older version of L1TP of Landsat 9 with the + second granule originating from the reprocessed version. 
\nHLS.L30.T11SLC.2022166T182646.v2.0 + \ \nHLS.L30.T11SLC.2022166T182645.v2.0 \nThere are other causes of duplicate L30 + granules, but the overall number of duplicates is very small.\n\n* Poor Geolocation: + A large number of granules that were processed for May through July 2023 were created + with L1GT input scenes which were deemed undesirable due to a poor geolocation issue. + These granules were removed from the archive (see the full list of removed [granules](https://lpdaac.usgs.gov/documents/2161/L30_L1GT_granules_May_July_2023.csv)).\n\nImprovements/Changes + from Previous Versions\n\n* Aerosol QA bits from the USGS Land Surface Reflectance + Code (LaSRC) model output have been added into the Function of Mask (Fmask) data + layer. The added two bits indicate the aerosol levels: high, medium, low, and climatology + aerosol.\nRead our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/HLS/HLSS30.002 +Contact: 'User Services: lpdaac@usgs.gov' +ManagedBy: NASA +UpdateFrequency: From 2015-11-28 to Ongoing +Tags: + - aws-pds + - cog + - datacenter + - earth observation + - geospatial + - global + - hdf + - ice + - land + - metadata + - orbit + - satellite imagery + - stac + - surface water + - tiles + - water + - xml +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance Daily + Global 30m v2.0.' + ARN: arn:aws:s3:::lp-prod-protected/HLSS30.020 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Getting Started with Cloud-Native HLS Data in Python + URL: https://github.com/nasa/HLS-Data-Resources/blob/main/python/tutorials/HLS_Tutorial.ipynb + AuthorName: Mahsa Jami, Erik A. Bolch, Cole K. Krehbiel, Aaron M.
Friesz, Brianna M. Lind diff --git a/datasets/nasa-imergprecipcanadaalaska2097.yaml b/datasets/nasa-imergprecipcanadaalaska2097.yaml new file mode 100644 index 000000000..ce0164685 --- /dev/null +++ b/datasets/nasa-imergprecipcanadaalaska2097.yaml @@ -0,0 +1,32 @@ +Name: 'ABoVE: Bias-Corrected IMERG Monthly Precipitation for Alaska and Canada, 2000-2020' +Description: |- + This dataset is a modification to the Integrated Multi-satellitE Retrievals for GPM (IMERG) Final Run microwave-only, daily precipitation Version 06 data. It provides bias-corrected IMERG monthly precipitation data for Alaska and Canada from June 2000 through December 2020 in Cloud-Optimized GeoTIFF (*.tif) format. Data are provided in the units of mm/day. NASA's IMERG data product is one of the most advanced satellite precipitation products with a 0.1-degree spatial resolution and near global coverage. This dataset bias-corrected IMERG's HQprecipitation precipitation estimates, which are based on passive microwave (PMW)-only retrievals, using a linear regression method. This method utilizes empirical measurements from rain gauge stations from the Global Historical Climatology Network (GHCN) and a digital elevation model. This bias correction approach improves estimates at elevations above 500 m a.s.l., which are typically underestimated. + Read our doc on how to get AWS Credentials to retrieve this data: https://data.ornldaac.earthdata.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.3334/ORNLDAAC/2097 +Contact: 'ORNL DAAC User Services Office: uso@daac.ornl.gov.' +ManagedBy: NASA +UpdateFrequency: From 2000-06-01 to 2020-12-31 +Tags: + - aws-pds + - atmosphere + - cog + - earth observation + - global + - land + - radar + - cog +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'ABoVE: Bias-Corrected IMERG Monthly Precipitation for Alaska and + Canada, 2000-2020.' 
+ ARN: arn:aws:s3:::ornl-cumulus-prod-protected/above/IMERG_Precip_Canada_Alaska/data + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.ornldaac.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: Accessing Data through ORNL DAAC Web Services + URL: https://github.com/ornldaac/web_services_data_access/blob/master/web_services_data_access.ipynb + AuthorName: ORNL DAAC diff --git a/datasets/nasa-m2i3npasm.yaml b/datasets/nasa-m2i3npasm.yaml new file mode 100644 index 000000000..d1f4b59c3 --- /dev/null +++ b/datasets/nasa-m2i3npasm.yaml @@ -0,0 +1,120 @@ +Name: 'MERRA-2 inst3_3d_asm_Np: 3d,3-Hourly,Instantaneous,Pressure-Level,Assimilation,Assimilated + Meteorological Fields 0.625 x 0.5 degree V5.12.4 (M2I3NPASM) at GES DISC' +Description: "M2I3NPASM (or inst3_3d_asm_Np) is an instantaneous 3-dimensional 3-hourly + data collection in Modern-Era Retrospective analysis for Research and Applications + version 2 (MERRA-2). This collection consists of assimilations of meteorological + parameters at 42 pressure levels, such as temperature, wind components, vertical + pressure velocity, water vapor, ozone mass mixing ratio, and layer height. The data + field is available every three hours starting from 00:00 UTC, e.g.: 00:00, 03:00, + … , 21:00 UTC. The information on the pressure levels can be found in the section + 4.2 of the MERRA-2 File Specification document. \n\nMERRA-2 is the latest version + of global atmospheric reanalysis for the satellite era produced by NASA Global Modeling + and Assimilation Office (GMAO) using the Goddard Earth Observing System Model (GEOS) + version 5.12.4. The dataset covers the period of 1980-present with the latency + of ~3 weeks after the end of a month. \n\nData Reprocessing: Please check “Records + of MERRA-2 Data Reprocessing and Service Changes” linked from the “Documentation” + tab on this page. 
Note that a reprocessed data filename is different from the original + file.\n\nMERRA-2 Mailing List: Sign up to receive information on reprocessing of + data, changing of tools and services, as well as data announcements from GMAO. Contact + the GES DISC Help Desk (gsfc-dl-help-disc@mail.nasa.gov) to be added to the list.\n\nQuestions: + If you have a question, please read \"MERRA-2 File Specification Document\", “MERRA-2 + Data Access – Quick Start Guide”, and FAQs linked from the ”Documentation” tab on + this page. If that does not answer your question, you may post your question to + the NASA Earthdata Forum (forum.earthdata.nasa.gov) or email the GES DISC Help Desk + (gsfc-dl-help-disc@mail.nasa.gov).\nRead our doc on how to get AWS Credentials to + retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/QBZ6MG944HW0 +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov' +ManagedBy: NASA +UpdateFrequency: From 1980-01-01 to Ongoing +Tags: + - aws-pds + - agriculture + - air temperature + - atmosphere + - biodiversity + - climate + - coastal + - datacenter + - ecosystems + - global + - hydrology + - ice + - land + - metadata + - netCDF + - opendap + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MERRA-2 inst3_3d_asm_Np: 3d,3-Hourly,Instantaneous,Pressure-Level,Assimilation,Assimilated + Meteorological Fields 0.625 x 0.5 degree V5.12.4 (M2I3NPASM) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/MERRA2/M2I3NPASM.5.12.4/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: The Modern-Era Retrospective Analysis for Research and Applications, + Version 2 (MERRA-2). + URL: https://doi.org/10.1175/JCLI-D-16-0758.1 + AuthorName: Gelaro, R., W. McCarty, M. J. Suárez, R. 
Todling, A. Molod, L. Takacs, + C. A. Randles, A. Darmenov, M. G. Bosilovich, R. Reichle, et al. + - Title: 'Development of the GEOS-5 atmospheric general circulation model: evolution + from MERRA to MERRA2.' + URL: https://doi.org/10.5194/gmd-8-1339-2015 + AuthorName: Molod, A., L. Takacs, M. Suarez, and J. Bacmeister + - Title: 'The MERRA-2 Aerosol Reanalysis, 1980 Onward. Part I: System Description + and Data Assimilation Evaluation.' + URL: https://doi.org/10.1175/JCLI-D-16-0609.1 + AuthorName: Randles, C. A., A. M. da Silva, V. Buchard, P. R. Colarco, A. Darmenov, + R. Govindaraju, A. Smirnov, B. Holben, R. Ferrare, J. Hair, Y.Shinozuka, and + C.J. Flynn + - Title: 'The MERRA-2 aerosol reanalysis, 1980 onward. Part II: Evaluation and + case studies.' + URL: https://doi.org/10.1175/JCLI-D-16-0613.1 + AuthorName: Buchard V., C. A. Randles, A. M. da Silva, A. Darmenov, P. R. Colarco, + R. Govindaraju, R. Ferrare, J. Hair, A. J. Beyersdorf, L. D. Ziemba, H. Yu + - Title: Land Surface Precipitation in MERRA-2. + URL: https://doi.org/10.1175/JCLI-D-16-0570.1 + AuthorName: Reichle, R.H., Q. Liu, R.D. Koster, C.S. Draper, S.P.P. Mahanama, + and G.S. Partyka + - Title: Assessment of MERRA-2 Land Surface Hydrology Estimates. + URL: https://doi.org/10.1175/JCLI-D-16-0720.1 + AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, + R. D. Koster, and G. J. M. De Lannoy + - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' + AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. + Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, + L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, + S. Pawson, O. Reale, S. D. Schubert, and M. Suarez + - Title: Data assimilation using incremental analysis updates + AuthorName: Bloom, S., L. Takacs, A. DaSilva, and D. 
Ledvina + - Title: Documentation and Validation of the Goddard Earth Observing System (GEOS) + Data Assimilation System - Version 4 + AuthorName: Bloom, S., A. da Silva, D. Dee, M. Bosilovich, J.-D. Chern, S. Pawson, + S. Schubert, M. Sienkiewicz, I. Stajner, W.-W. Tan, M.-L. Wu + - Title: Design and implementation of components in the Earth System Modeling + Framework + URL: https://doi.org/10.1177/1094342005056120 + AuthorName: Collins, N., G. Theurich, C. DeLuca, M. Suarez, A. Trayanov, V. + Balaji, P. Li, W. Yang, C. Hill, and A. da Silva + - Title: A catchment-based approach to modeling land surface processes in a GCM, + Part 1, Model Structure + AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar + - Title: 'Numerical aspects of the application of recursive filters to variational + statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian + covariances' + AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts + - Title: 'Numerical aspects of the application of recursive filters to variational + statistical analysis. Part II: Spatially inhomogeneous and anisotropic general + covariances' + AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts + - Title: Three-dimensional variational analysis with spatially inhomogeneous covariances + AuthorName: Wu, W.-S., R.J. Purser and D.F. Parrish + Tutorials: + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha.
diff --git a/datasets/nasa-m2i3nvaer.yaml b/datasets/nasa-m2i3nvaer.yaml new file mode 100644 index 000000000..1a73d8b3a --- /dev/null +++ b/datasets/nasa-m2i3nvaer.yaml @@ -0,0 +1,122 @@ +Name: 'MERRA-2 inst3_3d_aer_Nv: 3d,3-Hourly,Instantaneous,Model-Level,Assimilation,Aerosol + Mixing Ratio 0.625 x 0.5 degree V5.12.4 (M2I3NVAER) at GES DISC' +Description: "M2I3NVAER (or inst3_3d_aer_Nv) is an instantaneous 3-dimensional 3-hourly + data collection in Modern-Era Retrospective analysis for Research and Applications + version 2 (MERRA-2). This collection consists of assimilations of aerosol mixing + ratio parameters at 72 model layers, such as dust, sulphur dioxide, sea salt, black + carbon, and organic carbon. The data field is available every three hours starting + from 00:00 UTC, e.g.: 00:00, 03:00, … , 21:00 UTC. Section 4.2 of the MERRA-2 File + Specification document provides pressure values nominal for a 1000 hPa surface pressure + and refers to the top edge of the layer. The lev=1 is for the top layer, and lev=72 + is for the bottom (or surface) model layer. \n\nMERRA-2 is the latest version of + global atmospheric reanalysis for the satellite era produced by NASA Global Modeling + and Assimilation Office (GMAO) using the Goddard Earth Observing System Model (GEOS) + version 5.12.4. The dataset covers the period of 1980-present with the latency + of ~3 weeks after the end of a month. \n\nData Reprocessing: Please check “Records + of MERRA-2 Data Reprocessing and Service Changes” linked from the “Documentation” + tab on this page. Note that a reprocessed data filename is different from the original + file.\n\nMERRA-2 Mailing List: Sign up to receive information on reprocessing of + data, changing of tools and services, as well as data announcements from GMAO.
Contact + the GES DISC Help Desk (gsfc-dl-help-disc@mail.nasa.gov) to be added to the list.\n\nQuestions: + If you have a question, please read \"MERRA-2 File Specification Document\", “MERRA-2 + Data Access – Quick Start Guide”, and FAQs linked from the ”Documentation” tab on + this page. If that does not answer your question, you may post your question to + the NASA Earthdata Forum (forum.earthdata.nasa.gov) or email the GES DISC Help Desk + (gsfc-dl-help-disc@mail.nasa.gov).\nRead our doc on how to get AWS Credentials to + retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/LTVB4GPCOTK2 +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov' +ManagedBy: NASA +UpdateFrequency: From 1980-01-01 to Ongoing +Tags: + - aws-pds + - agriculture + - air quality + - atmosphere + - biodiversity + - carbon + - climate + - coastal + - datacenter + - ecosystems + - global + - hydrology + - ice + - land + - metadata + - netCDF + - opendap + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MERRA-2 inst3_3d_aer_Nv: 3d,3-Hourly,Instantaneous,Model-Level,Assimilation,Aerosol + Mixing Ratio 0.625 x 0.5 degree V5.12.4 (M2I3NVAER) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/MERRA2/M2I3NVAER.5.12.4/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: The Modern-Era Retrospective Analysis for Research and Applications, + Version 2 (MERRA-2). + URL: https://doi.org/10.1175/JCLI-D-16-0758.1 + AuthorName: Gelaro, R., W. McCarty, M. J. Suárez, R. Todling, A. Molod, L. Takacs, + C. A. Randles, A. Darmenov, M. G. Bosilovich, R. Reichle, et al. + - Title: 'Development of the GEOS-5 atmospheric general circulation model: evolution + from MERRA to MERRA2.' 
+ URL: https://doi.org/10.5194/gmd-8-1339-2015 + AuthorName: Molod, A., L. Takacs, M. Suarez, and J. Bacmeister + - Title: 'The MERRA-2 Aerosol Reanalysis, 1980 Onward. Part I: System Description + and Data Assimilation Evaluation.' + URL: https://doi.org/10.1175/JCLI-D-16-0609.1 + AuthorName: Randles, C. A., A. M. da Silva, V. Buchard, P. R. Colarco, A. Darmenov, + R. Govindaraju, A. Smirnov, B. Holben, R. Ferrare, J. Hair, Y.Shinozuka, and + C.J. Flynn + - Title: 'The MERRA-2 aerosol reanalysis, 1980 onward. Part II: Evaluation and + case studies.' + URL: https://doi.org/10.1175/JCLI-D-16-0613.1 + AuthorName: Buchard V., C. A. Randles, A. M. da Silva, A. Darmenov, P. R. Colarco, + R. Govindaraju, R. Ferrare, J. Hair, A. J. Beyersdorf, L. D. Ziemba, H. Yu + - Title: Land Surface Precipitation in MERRA-2. + URL: https://doi.org/10.1175/JCLI-D-16-0570.1 + AuthorName: Reichle, R.H., Q. Liu, R.D. Koster, C.S. Draper, S.P.P. Mahanama, + and G.S. Partyka + - Title: Assessment of MERRA-2 Land Surface Hydrology Estimates. + URL: https://doi.org/10.1175/JCLI-D-16-0720.1 + AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, + R. D. Koster, and G. J. M. De Lannoy + - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' + AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. + Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, + L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, + S. Pawson, O. Reale, S. D. Schubert, and M. Suarez + - Title: Data assimilation using incremental analysis updates + AuthorName: Bloom, S., L. Takacs, A. DaSilva, and D. Ledvina + - Title: Documentation and Validation of the Goddard Earth Observing System (GEOS) + Data Assimilation System - Version 4 + AuthorName: Bloom, S., A. da Silva, D. Dee, M. Bosilovich, J.-D. Chern, S. Pawson, + S. Schubert, M. Sienkiewicz, I. Stajner, W.-W. Tan, M.-L. 
Wu + - Title: Design and implementation of components in the Earth System Modeling + Framework + URL: https://doi.org/10.1177/1094342005056120 + AuthorName: Collins, N., G. Theurich, C. DeLuca, M. Suarez, A. Trayanov, V. + Balaji, P. Li, W. Yang, C. Hill, and A. da Silva + - Title: A catchment-based approach to modeling land surface processes in a GCM, + Part 1, Model Structure + AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar + - Title: 'Numerical aspects of the application of recursive filters to variational + statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian + covariances' + AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts + - Title: 'Numerical aspects of the application of recursive filters to variational + statistical analysis. Part II: Spatially inhomogeneous and anisotropic general + covariances' + AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts + - Title: Three-dimensional variational analysis with spatially inhomogeneous covariances + AuthorName: Wu, W.-S., R.J. Purser and D.F. Parrish + Tutorials: + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su, Joseph Wysk, Rupesh Shrestha.
diff --git a/datasets/nasa-m2i3nvasm.yaml b/datasets/nasa-m2i3nvasm.yaml new file mode 100644 index 000000000..d1b39f14a --- /dev/null +++ b/datasets/nasa-m2i3nvasm.yaml @@ -0,0 +1,121 @@ +Name: 'MERRA-2 inst3_3d_asm_Nv: 3d,3-Hourly,Instantaneous,Model-Level,Assimilation,Assimilated + Meteorological Fields 0.625 x 0.5 degree V5.12.4 (M2I3NVASM) at GES DISC' +Description: "M2I3NVASM (or inst3_3d_asm_Nv) is an instantaneous 3-dimensional 3-hourly + data collection in Modern-Era Retrospective analysis for Research and Applications + version 2 (MERRA-2). This collection consists of assimilations of meteorological + parameters at 72 model layers, such as temperature, wind components, vertical pressure + velocity, water vapor, and layer height. The data field is available every three + hours starting from 00:00 UTC, e.g.: 00:00, 03:00, … , 21:00 UTC. Section 4.2 of + the MERRA-2 File Specification document provides pressure values nominal for a 1000 + hPa surface pressure and refers to the top edge of the layer. The lev=1 is for the + top layer, and lev=72 is for the bottom (or surface) model layer. \n\nMERRA-2 is + the latest version of global atmospheric reanalysis for the satellite era produced + by NASA Global Modeling and Assimilation Office (GMAO) using the Goddard Earth Observing + System Model (GEOS) version 5.12.4. The dataset covers the period of 1980-present + with the latency of ~3 weeks after the end of a month. \n\nData Reprocessing: Please + check “Records of MERRA-2 Data Reprocessing and Service Changes” linked from the + “Documentation” tab on this page. Note that a reprocessed data filename is different + from the original file.\n\nMERRA-2 Mailing List: Sign up to receive information + on reprocessing of data, changing of tools and services, as well as data announcements + from GMAO.
Contact the GES DISC Help Desk (gsfc-dl-help-disc@mail.nasa.gov) to be + added to the list.\n\nQuestions: If you have a question, please read \"MERRA-2 File + Specification Document\", “MERRA-2 Data Access – Quick Start Guide”, and FAQs linked + from the ”Documentation” tab on this page. If that does not answer your question, + you may post your question to the NASA Earthdata Forum (forum.earthdata.nasa.gov) + or email the GES DISC Help Desk (gsfc-dl-help-disc@mail.nasa.gov).\nRead our doc + on how to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/WWQSXQ8IVFW8 +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov' +ManagedBy: NASA +UpdateFrequency: From 1980-01-01 to Ongoing +Tags: + - aws-pds + - agriculture + - air temperature + - atmosphere + - biodiversity + - climate + - coastal + - datacenter + - ecosystems + - global + - hydrology + - ice + - land + - metadata + - netCDF + - opendap + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MERRA-2 inst3_3d_asm_Nv: 3d,3-Hourly,Instantaneous,Model-Level,Assimilation,Assimilated + Meteorological Fields 0.625 x 0.5 degree V5.12.4 (M2I3NVASM) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/MERRA2/M2I3NVASM.5.12.4/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: The Modern-Era Retrospective Analysis for Research and Applications, + Version 2 (MERRA-2). + URL: https://doi.org/10.1175/JCLI-D-16-0758.1 + AuthorName: Gelaro, R., W. McCarty, M. J. Suárez, R. Todling, A. Molod, L. Takacs, + C. A. Randles, A. Darmenov, M. G. Bosilovich, R. Reichle, et al. + - Title: 'Development of the GEOS-5 atmospheric general circulation model: evolution + from MERRA to MERRA2.' 
+ URL: https://doi.org/10.5194/gmd-8-1339-2015 + AuthorName: Molod, A., L. Takacs, M. Suarez, and J. Bacmeister + - Title: 'The MERRA-2 Aerosol Reanalysis, 1980 Onward. Part I: System Description + and Data Assimilation Evaluation.' + URL: https://doi.org/10.1175/JCLI-D-16-0609.1 + AuthorName: Randles, C. A., A. M. da Silva, V. Buchard, P. R. Colarco, A. Darmenov, + R. Govindaraju, A. Smirnov, B. Holben, R. Ferrare, J. Hair, Y.Shinozuka, and + C.J. Flynn + - Title: 'The MERRA-2 aerosol reanalysis, 1980 onward. Part II: Evaluation and + case studies.' + URL: https://doi.org/10.1175/JCLI-D-16-0613.1 + AuthorName: Buchard V., C. A. Randles, A. M. da Silva, A. Darmenov, P. R. Colarco, + R. Govindaraju, R. Ferrare, J. Hair, A. J. Beyersdorf, L. D. Ziemba, H. Yu + - Title: Land Surface Precipitation in MERRA-2. + URL: https://doi.org/10.1175/JCLI-D-16-0570.1 + AuthorName: Reichle, R.H., Q. Liu, R.D. Koster, C.S. Draper, S.P.P. Mahanama, + and G.S. Partyka + - Title: Assessment of MERRA-2 Land Surface Hydrology Estimates. + URL: https://doi.org/10.1175/JCLI-D-16-0720.1 + AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, + R. D. Koster, and G. J. M. De Lannoy + - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' + AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. + Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, + L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, + S. Pawson, O. Reale, S. D. Schubert, and M. Suarez + - Title: Data assimilation using incremental analysis updates + AuthorName: Bloom, S., L. Takacs, A. DaSilva, and D. Ledvina + - Title: Documentation and Validation of the Goddard Earth Observing System (GEOS) + Data Assimilation System - Version 4 + AuthorName: Bloom, S., A. da Silva, D. Dee, M. Bosilovich, J.-D. Chern, S. Pawson, + S. Schubert, M. Sienkiewicz, I. Stajner, W.-W. Tan, M.-L. 
Wu + - Title: Design and implementation of components in the Earth System Modeling + Framework + URL: https://doi.org/10.1177/1094342005056120 + AuthorName: Collins, N., G. Theurich, C. DeLuca, M. Suarez, A. Trayanov, V. + Balaji, P. Li, W. Yang, C. Hill, and A. da Silva + - Title: A catchment-based approach to modeling land surface processes in a GCM, + Part 1, Model Structure + AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar + - Title: 'Numerical aspects of the application of recursive filters to variational + statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian + covariances' + AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts + - Title: 'Numerical aspects of the application of recursive filters to variational + statistical analysis. Part II: Spatially inhomogeneous and anisotropic general + covariances' + AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts + - Title: Three-dimensional variational analysis with spatially inhomogeneous covariances + AuthorName: Wu, W.-S., R.J. Purser and D.F. Parrish + Tutorials: + - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. 
Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su,Joseph Wysk, Rupesh Shre diff --git a/datasets/nasa-m2t1nxslv.yaml b/datasets/nasa-m2t1nxslv.yaml new file mode 100644 index 000000000..9b413ae22 --- /dev/null +++ b/datasets/nasa-m2t1nxslv.yaml @@ -0,0 +1,125 @@ +Name: 'MERRA-2 tavg1_2d_slv_Nx: 2d,1-Hourly,Time-Averaged,Single-Level,Assimilation,Single-Level + Diagnostics 0.625 x 0.5 degree V5.12.4 (M2T1NXSLV) at GES DISC' +Description: "M2T1NXSLV (or tavg1_2d_slv_Nx) is an hourly time-averaged 2-dimensional + data collection in Modern-Era Retrospective analysis for Research and Applications + version 2 (MERRA-2). This collection consists of meteorology diagnostics at popularly + used vertical levels, such as air temperature at 2-meter (or at 10-meter, 850hPa, + 500 hPa, 250hPa), wind components at 50-meter (or at 2-meter, 10-meter, 850 hPa, + 500hPa, 250 hPa), sea level pressure, surface pressure, and total precipitable water + vapor (or ice water, liquid water). The data field is time-stamped with the central + time of an hour starting from 00:30 UTC, e.g.: 00:30, 01:30, … , 23:30 UTC.\n\nMERRA-2 + is the latest version of global atmospheric reanalysis for the satellite era produced + by NASA Global Modeling and Assimilation Office (GMAO) using the Goddard Earth Observing + System Model (GEOS) version 5.12.4. The dataset covers the period of 1980-present + with the latency of ~3 weeks after the end of a month. \n\nData Reprocessing: Please + check “Records of MERRA-2 Data Reprocessing and Service Changes” linked from the + “Documentation” tab on this page. Note that a reprocessed data filename is different + from the original file.\n\nMERRA-2 Mailing List: Sign up to receive information + on reprocessing of data, changing of tools and services, as well as data announcements + from GMAO. 
Contact the GES DISC Help Desk (gsfc-dl-help-disc@mail.nasa.gov) to be + added to the list.\n\nQuestions: If you have a question, please read \"MERRA-2 File + Specification Document\", “MERRA-2 Data Access – Quick Start Guide”, and FAQs linked + from the ”Documentation” tab on this page. If that does not answer your question, + you may post your question to the NASA Earthdata Forum (forum.earthdata.nasa.gov) + or email the GES DISC Help Desk (gsfc-dl-help-disc@mail.nasa.gov).\nRead our doc + on how to get AWS Credentials to retrieve this data: https://data.gesdisc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/VJAFPLI1CSIV +Contact: 'GES DISC HELP DESK SUPPORT GROUP: gsfc-dl-help-disc@mail.nasa.gov. Home Page: https://disc.gsfc.nasa.gov/, GLOBAL MODELING AND ASSIMILATION OFFICE: data@gmao.gsfc.nasa.gov' +ManagedBy: NASA +UpdateFrequency: From 1980-01-01 to Ongoing +Tags: + - aws-pds + - agriculture + - air temperature + - atmosphere + - biodiversity + - climate + - coastal + - datacenter + - ecosystems + - global + - hydrology + - ice + - land + - metadata + - oceans + - opendap + - water + - netcdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MERRA-2 tavg1_2d_slv_Nx: 2d,1-Hourly,Time-Averaged,Single-Level,Assimilation,Single-Level + Diagnostics 0.625 x 0.5 degree V5.12.4 (M2T1NXSLV) at GES DISC.' + ARN: arn:aws:s3:::gesdisc-cumulus-prod-protected/MERRA2/M2T1NXSLV.5.12.4/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.gesdisc.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: The Modern-Era Retrospective Analysis for Research and Applications, + Version 2 (MERRA-2). + URL: https://doi.org/10.1175/JCLI-D-16-0758.1 + AuthorName: Gelaro, R., W. McCarty, M. J. Suárez, R. Todling, A. Molod, L. Takacs, + C. A. Randles, A. Darmenov, M. G. Bosilovich, R. Reichle, et al. 
+ - Title: 'Development of the GEOS-5 atmospheric general circulation model: evolution + from MERRA to MERRA2.' + URL: https://doi.org/10.5194/gmd-8-1339-2015 + AuthorName: Molod, A., L. Takacs, M. Suarez, and J. Bacmeister + - Title: 'The MERRA-2 Aerosol Reanalysis, 1980 Onward. Part I: System Description + and Data Assimilation Evaluation.' + URL: https://doi.org/10.1175/JCLI-D-16-0609.1 + AuthorName: Randles, C. A., A. M. da Silva, V. Buchard, P. R. Colarco, A. Darmenov, + R. Govindaraju, A. Smirnov, B. Holben, R. Ferrare, J. Hair, Y.Shinozuka, and + C.J. Flynn + - Title: 'The MERRA-2 aerosol reanalysis, 1980 onward. Part II: Evaluation and + case studies.' + URL: https://doi.org/10.1175/JCLI-D-16-0613.1 + AuthorName: Buchard V., C. A. Randles, A. M. da Silva, A. Darmenov, P. R. Colarco, + R. Govindaraju, R. Ferrare, J. Hair, A. J. Beyersdorf, L. D. Ziemba, H. Yu + - Title: Land Surface Precipitation in MERRA-2. + URL: https://doi.org/10.1175/JCLI-D-16-0570.1 + AuthorName: Reichle, R.H., Q. Liu, R.D. Koster, C.S. Draper, S.P.P. Mahanama, + and G.S. Partyka + - Title: Assessment of MERRA-2 Land Surface Hydrology Estimates. + URL: https://doi.org/10.1175/JCLI-D-16-0720.1 + AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, + R. D. Koster, and G. J. M. De Lannoy + - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' + AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. + Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, + L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, + S. Pawson, O. Reale, S. D. Schubert, and M. Suarez + - Title: Data assimilation using incremental analysis updates + AuthorName: Bloom, S., L. Takacs, A. DaSilva, and D. Ledvina + - Title: Documentation and Validation of the Goddard Earth Observing System (GEOS) + Data Assimilation System - Version 4 + AuthorName: Bloom, S., A. da Silva, D. Dee, M. Bosilovich, J.-D. Chern, S. 
Pawson, + S. Schubert, M. Sienkiewicz, I. Stajner, W.-W. Tan, M.-L. Wu + - Title: Design and implementation of components in the Earth System Modeling + Framework + URL: https://doi.org/10.1177/1094342005056120 + AuthorName: Collins, N., G. Theurich, C. DeLuca, M. Suarez, A. Trayanov, V. + Balaji, P. Li, W. Yang, C. Hill, and A. da Silva + - Title: A catchment-based approach to modeling land surface processes in a GCM, + Part 1, Model Structure + AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar + - Title: 'Numerical aspects of the application of recursive filters to variational + statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian + covariances' + AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts + - Title: 'Numerical aspects of the application of recursive filters to variational + statistical analysis. Part II: Spatially inhomogeneous and anisotropic general + covariances' + AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts + - Title: Three-dimensional variational analysis with spatially inhomogeneous covariances + AuthorName: Wu, W.-S., R.J. Purser and D.F. Parrish + Tutorials: + - Title: How to Read and Plot NetCDF MERRA-2 Data in Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Read_and_Plot_NetCDF_MERRA-2_data_in_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su,Joseph Wysk, Rupesh Shrestha. 
+ - Title: How to Access GES DISC Data Using Python + URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Access_GES_DISC_Data_Using_Python.ipynb + AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. Pagán, Andrey Savtchenko, Elliot Sherman, + Suhung Shen, Jian Su,Joseph Wysk, Rupesh Shrestha. \ No newline at end of file diff --git a/datasets/nasa-mcd43a1.yaml b/datasets/nasa-mcd43a1.yaml new file mode 100644 index 000000000..b63b87746 --- /dev/null +++ b/datasets/nasa-mcd43a1.yaml @@ -0,0 +1,56 @@ +Name: MODIS/Terra+Aqua BRDF/Albedo Model Parameters Daily L3 Global - 500m V061 +Description: "The Moderate Resolution Imaging Spectroradiometer (MODIS) MCD43A1 Version + 6.1 Bidirectional Reflectance Distribution Function and Albedo (BRDF/Albedo) Model + Parameters dataset is produced daily using 16 days of Terra and Aqua MODIS data + at 500 meter (m) resolution. Data are temporally weighted to the ninth day of the + retrieval period which is reflected in the Julian date in the file name. MCD43A1 + provides the three model weighting parameters (isotropic, volumetric, and geometric) + used to derive the Albedo ([MCD43A3](https://doi.org/10.5067/MODIS/MCD43A3.061)) + and Nadir BRDF-Adjusted Reflectance (NBAR) ([MCD43A4](https://doi.org/10.5067/MODIS/MCD43A4.061)) + products.\n\nUsers are urged to use the band specific quality flags to isolate the + highest quality full inversion results for their own science applications as described + in the [User Guide](https://www.umb.edu/spectralmass/modis-user-guide-v006-and-v0061/mcd43a1-brdfalbedo-model-parameters-product/).\n\nThe + MCD43A1 provides the three model weighting parameters for MODIS spectral bands 1 + through 7 as well as the visible, near infrared (NIR), and shortwave bands. 
Along + with the three-dimensional parameter layers for these bands are the simplified mandatory + quality layers for each of the 10 bands. Essential quality information provided + in the corresponding [MCD43A2](https://doi.org/10.5067/MODIS/MCD43A2.061) data file + should be consulted when using this product. \n\nKnown Issues\n\n* For complete + information about known issues please refer to the [MODIS/VIIRS Land Quality Assessment + website](https://landweb.modaps.eosdis.nasa.gov/knownissue?sensor=MODIS&sat=TerraAqua&as=61).\n\nImprovements/Changes + from Previous Versions\n\n* The Version 6.1 Level-1B (L1B) products have been improved + by undergoing various calibration changes that include: changes to the response-versus-scan + angle (RVS) approach that affects reflectance bands for Aqua and Terra MODIS, corrections + to adjust for the optical crosstalk in Terra MODIS infrared (IR) bands, and corrections + to the Terra MODIS forward look-up table (LUT) update for the period 2012 - 2017.\n* + A polarization correction has been applied to the L1B Reflective Solar Bands (RSB).\nRead + our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/MODIS/MCD43A1.061 +Contact: 'User Services: lpdaac@usgs.gov. Home Page: https://www.earthdata.nasa.gov/centers/lp-daac/contact' +ManagedBy: NASA +UpdateFrequency: From 2000-02-16 to Ongoing (Daily - < Weekly) +Tags: + - aws-pds + - earth observation + - geospatial + - global + - land + - opendap + - tiles + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Terra+Aqua BRDF/Albedo Model Parameters Daily L3 Global - + 500m V061.' 
+ ARN: arn:aws:s3:::lp-prod-protected/MCD43A1.061 + ARN: arn:aws:s3:::lp-prod-protected/MCD43A1.061 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/7f58ad3abeca7f0d17637ddb812642c0120a57ab/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC \ No newline at end of file diff --git a/datasets/nasa-mcd43a3.yaml b/datasets/nasa-mcd43a3.yaml new file mode 100644 index 000000000..6f0701e66 --- /dev/null +++ b/datasets/nasa-mcd43a3.yaml @@ -0,0 +1,46 @@ +Name: MODIS/Terra+Aqua BRDF/Albedo Albedo Daily L3 Global - 500m V061 +Description: |- + The Moderate Resolution Imaging Spectroradiometer (MODIS) MCD43A3 Version 6.1 Albedo Model dataset is produced daily using 16 days of Terra and Aqua MODIS data at 500 meter (m) resolution. Data are temporally weighted to the ninth day of the 16 day which is reflected in the Julian date in the file name. + + Users are urged to use the band specific quality flags to isolate the highest quality full inversion results for their own science applications as described in the [User Guide](https://www.umb.edu/spectralmass/modis-user-guide-v006-and-v0061/mcd43a3-albedo-product/). + + The MCD43A3 provides black-sky albedo (directional hemispherical reflectance) and white-sky albedo (bihemispherical reflectance) data at local solar noon for MODIS bands 1 through 7 and the visible, near infrared (NIR), and shortwave bands. Along with the albedo layers are the simplified mandatory quality layers for each of the 10 bands. Essential quality information provided in the corresponding [MCD43A2](https://doi.org/10.5067/MODIS/MCD43A2.061) data file should be consulted when using this product. 
+ + Known Issues + + * For complete information about known issues please refer to the [MODIS/VIIRS Land Quality Assessment website](https://landweb.modaps.eosdis.nasa.gov/knownissue?sensor=MODIS&sat=TerraAqua&as=61). + + Improvements/Changes from Previous Versions + + * The Version 6.1 Level-1B (L1B) products have been improved by undergoing various calibration changes that include: changes to the response-versus-scan angle (RVS) approach that affects reflectance bands for Aqua and Terra MODIS, corrections to adjust for the optical crosstalk in Terra MODIS infrared (IR) bands, and corrections to the Terra MODIS forward look-up table (LUT) update for the period 2012 - 2017. + * A polarization correction has been applied to the L1B Reflective Solar Bands (RSB). + Read our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/MODIS/MCD43A3.061 +Contact: 'User Services: lpdaac@usgs.gov. Home Page: https://www.earthdata.nasa.gov/centers/lp-daac' +ManagedBy: NASA +UpdateFrequency: From 2000-02-16 to Ongoing (Daily - < Weekly) +Tags: + - aws-pds + - earth observation + - geospatial + - global + - land + - opendap + - satellite imagery + - tiles + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Terra+Aqua BRDF/Albedo Albedo Daily L3 Global - 500m V061.' 
+ ARN: arn:aws:s3:::lp-prod-protected/MCD43A3.061 + ARN: arn:aws:s3:::lp-prod-protected/MCD43A3.061 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/7f58ad3abeca7f0d17637ddb812642c0120a57ab/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC \ No newline at end of file diff --git a/datasets/nasa-mcd43a4.yaml b/datasets/nasa-mcd43a4.yaml new file mode 100644 index 000000000..67afdaf70 --- /dev/null +++ b/datasets/nasa-mcd43a4.yaml @@ -0,0 +1,47 @@ +Name: MODIS/Terra+Aqua BRDF/Albedo Nadir BRDF-Adjusted Ref Daily L3 Global - 500m V061 +Description: |- + The Moderate Resolution Imaging Spectroradiometer (MODIS) MCD43A4 Version 6.1 Nadir Bidirectional Reflectance Distribution Function (BRDF)-Adjusted Reflectance (NBAR) dataset is produced daily using 16 days of Terra and Aqua MODIS data at 500 meter (m) resolution. The view angle effects are removed from the directional reflectances, resulting in a stable and consistent NBAR product. Data are temporally weighted to the ninth day which is reflected in the Julian date in the file name. + + Users are urged to use the band specific quality flags to isolate the highest quality full inversion results for their own science applications as described in the [User Guide](https://www.umb.edu/spectralmass/modis-user-guide-v006-and-v0061/mcd43a4-nbar-product/). + + The MCD43A4 provides NBAR and simplified mandatory quality layers for MODIS bands 1 through 7. Essential quality information provided in the corresponding [MCD43A2](https://doi.org/10.5067/MODIS/MCD43A2.061) data file should be consulted when using this product. 
+ + Known Issues + + * For complete information about known issues please refer to the [MODIS/VIIRS Land Quality Assessment website](https://landweb.modaps.eosdis.nasa.gov/knownissue?sensor=MODIS&sat=TerraAqua&as=61). + + Improvements/Changes from Previous Versions + + * The Version 6.1 Level-1B (L1B) products have been improved by undergoing various calibration changes that include: changes to the response-versus-scan angle (RVS) approach that affects reflectance bands for Aqua and Terra MODIS, corrections to adjust for the optical crosstalk in Terra MODIS infrared (IR) bands, and corrections to the Terra MODIS forward look-up table (LUT) update for the period 2012 - 2017. + * A polarization correction has been applied to the L1B Reflective Solar Bands (RSB). + Read our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/MODIS/MCD43A4.061 +Contact: 'User Services: lpdaac@usgs.gov. Home Page: https://www.earthdata.nasa.gov/centers/lp-daac/contact' +ManagedBy: NASA +UpdateFrequency: From 2000-02-16 to Ongoing (Daily - < Weekly) +Tags: + - aws-pds + - earth observation + - geospatial + - global + - land + - opendap + - satellite imagery + - tiles + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Terra+Aqua BRDF/Albedo Nadir BRDF-Adjusted Ref Daily L3 Global + - 500m V061.' 
+ ARN: arn:aws:s3:::lp-prod-protected/MCD43A4.061 + ARN: arn:aws:s3:::lp-prod-protected/MCD43A4.061 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/7f58ad3abeca7f0d17637ddb812642c0120a57ab/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC diff --git a/datasets/nasa-mi1b2e.yaml b/datasets/nasa-mi1b2e.yaml new file mode 100644 index 000000000..3df062f3c --- /dev/null +++ b/datasets/nasa-mi1b2e.yaml @@ -0,0 +1,43 @@ +Name: MISR Level 1B2 Ellipsoid Data V004 +Description: "MI1B2E_004 is the Multi-angle Imaging SpectroRadiometer (MISR) Level + 1B2 Ellipsoid Data Version 4 product. It contains Ellipsoid-projected Top-of-Atmosphere + (TOA) Radiance, resampled at the surface and topographically corrected, as well + as geometrically corrected by PGE22. Data collection for this product is ongoing.\r\n\r\nMISR + itself is an instrument designed to view Earth with cameras pointed in 9 different + directions. As the instrument flies overhead, each piece of Earth's surface below + is successively imaged by all 9 cameras, in each of 4 wavelengths (blue, green, + red, and near-infrared). The goal of MISR is to improve our understanding of the + effects of sunlight on Earth, as well as distinguish different types of clouds, + particles and surfaces.
Specifically, MISR monitors the monthly, seasonal, and long-term + trends in three areas: 1) amount and type of atmospheric particles (aerosols), including + those formed by natural sources and by human activities; 2) amounts, types, and + heights of clouds, and 3) distribution of land surface cover, including vegetation + canopy structure.\nRead our doc on how to get AWS Credentials to retrieve this data: + https://data.asdc.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/TERRA/MISR/MI1B2E_L1.004 +Contact: 'ASDC USER SERVICES: support-asdc@earthdata.nasa.gov. Home Page: https://asdc.larc.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 1999-12-18 to Ongoing +Tags: + - aws-pds + - atmosphere + - climate + - cyclone typhoon hurricane + - datacenter + - earth observation + - global + - land + - opendap + - orbit +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MISR Level 1B2 Ellipsoid Data V004. (Format: netCDF-4)' + ARN: arn:aws:s3:::asdc-prod-protected/MISR/MI1B2E_004 + ARN: arn:aws:s3:::asdc-prod-protected/MISR/MI1B2E_004 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.asdc.earthdata.nasa.gov/s3credentialsREADME +DataAtWork: + Publications: ~ + Tutorials: ~ diff --git a/datasets/nasa-mod02hkm.yaml b/datasets/nasa-mod02hkm.yaml new file mode 100644 index 000000000..f1505c114 --- /dev/null +++ b/datasets/nasa-mod02hkm.yaml @@ -0,0 +1,78 @@ +Name: MODIS/Terra Calibrated Radiances 5-Min L1B Swath 500m +Description: "The MODIS/Terra Calibrated Radiances 5Min L1B Swath 500m data set contains + calibrated and geolocated at-aperture radiances for 7 discrete bands located in + the 0.45 to 2.20 micron region of the electromagnetic spectrum. These data are generated + from the MODIS Level 1A scans of raw radiance and in the process converted to geophysical + units of W/(m^2 um sr). 
Additional data are provided including quality flags, error + estimates and calibration data.\r\n\r\nVisible, shortwave infrared, and near infrared + measurements are only made during the daytime (except band 26), while radiances + for the thermal infrared region (bands 20-25, 27-36) are measured continuously.\r\n\r\nChannels + 1 and 2 have 250 m resolution, channels 3 through 7 have 500 m resolution. However, + for the MODIS L1B 500 m product, the 250 m band radiance data and their associated + uncertainties have been aggregated to 500 m resolution. Thus the entire channel data + set has been co-registered to the same spatial scale in the 500 m product. Separate + L1B products are available for the 250 m resolution channels (MOD02QKM) and 1 km + resolution channels (MOD021KM). For the latter product, the 250 m and 500 m channel + data (bands 1 through 7) have been aggregated into equivalent 1 km pixel values.\r\n + \ \r\nSpatial resolution for pixels at nadir is 500 m, degrading to 2.4 + km in the along-scan direction at the scan extremes. However, thanks to the overlapping + of consecutive swaths and respectively pixels there, the resulting resolution at + the scan extremes is about 1 km. A 55 degree scanning pattern at the EOS orbit of + 705 km results in a 2330 km orbital swath width and provides global coverage every + one to two days. A single MODIS Level 1B 500 m granule will contain a scene built + from 203 scans sampled 2708 times in the cross-track direction, corresponding to + approximately 5 minutes worth of data; thus 288 granules will be produced per day. + Since an individual MODIS scan will contain 20 along-track spatial elements for + the 500 m channels, the scene will be composed of (2708 x 4060) pixels, resulting + in a spatial coverage of (2330 km x 2040 km). Due to the MODIS scan geometry, there + will be increasing scan overlap beyond about 20 degrees scan angle. \r\n\r\nTo + summarize, the MODIS L1B 500 m data product consists of:\r\n \r\n1.
Calibrated + radiances, uncertainties and number of samples for (2) 250 m reflected solar bands + aggregated to 500 m resolution\r\n \r\n2. Calibrated radiances and uncertainties + for (5) 500 m reflected solar bands\r\n \r\n3. Geolocation for 1 km pixels, + which must be interpolated to get 500 m pixel locations. For the relationship of + 1 km pixels to 500 m pixels, see the Geolocation ATBD https://modis.gsfc.nasa.gov/data/atbd/atbd_mod28_v3.pdf.\r\n + \ \r\n4. Calibration data for all channels (scale and offset) \r\n \r\n5. + Comprehensive set of file-level metadata summarizing the spatial, temporal and parameter + attributes of the data, as well as auxiliary information pertaining to instrument + status and data quality characterization. Users requiring all geolocation and solar/satellite + geometry fields at 1 km resolution can obtain the separate MODIS Level 1 Geolocation + product (MOD03) from LAADS https://ladsweb.modaps.eosdis.nasa.gov/ . \r\n \r\nThe + shortname for this product is MOD02HKM, and the product is stored in the Earth Observing System + Hierarchical Data Format (HDF-EOS). A typical MOD02HKM file size is approximately + 135 MB.\r\n \r\nEnvironmental information derived from MODIS L1B measurements + will offer a comprehensive and unprecedented look at terrestrial, atmospheric, and + ocean phenomenology for a wide and diverse community of users throughout the world.\r\n\r\nSee + the MODIS Characterization Support Team webpage for more C6 product information + at:\r\n\r\nhttps://mcst.gsfc.nasa.gov/l1b/product-information\r\n\r\n\r\nor visit + the Science Team homepage at:\r\nhttps://modis.gsfc.nasa.gov/data/dataprod/\nRead our + doc on how to get AWS Credentials to retrieve this data: https://data.laadsdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/MODIS/MOD02HKM.061 +Contact: 'MODAPS USER SUPPORT TEAM: MODAPSUSO@lists.nasa.gov.
Home Page: https://modaps.modaps.eosdis.nasa.gov/, MODIS Characterization Support Team (MCST): https://mcst.gsfc.nasa.gov/contact' +ManagedBy: NASA +UpdateFrequency: From 2000-02-24 to Ongoing +Tags: + - aws-pds + - atmosphere + - datacenter + - earth observation + - environmental + - global + - metadata + - opendap + - orbit + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Terra Calibrated Radiances 5-Min L1B Swath 500m.' + ARN: 'arn:aws:s3:::prod-lads/MOD02HKM' + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.laadsdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: MODIS Level 1B - Calibrated Radiances - Natural Color RGB - 500m + URL: https://fire.trainhub.eumetsat.int/docs/figure1_MODIS_L1B.html + AuthorName: EUMETSAT \ No newline at end of file diff --git a/datasets/nasa-mod09a1.yaml b/datasets/nasa-mod09a1.yaml new file mode 100644 index 000000000..fed483bf3 --- /dev/null +++ b/datasets/nasa-mod09a1.yaml @@ -0,0 +1,45 @@ +Name: MODIS/Terra Surface Reflectance 8-Day L3 Global 500m SIN Grid V061 +Description: "The Moderate Resolution Imaging Spectroradiometer (MODIS) Terra MOD09A1 + Version 6.1 product provides an estimate of the surface spectral reflectance of + Terra MODIS Bands 1 through 7 corrected for atmospheric conditions such as gasses, + aerosols, and Rayleigh scattering. Along with the seven 500 meter (m) reflectance + bands are two quality layers and four observation bands. For each pixel, a value + is selected from all the acquisitions within the 8-day composite period. The criteria + for the pixel choice include cloud and solar zenith. When several acquisitions meet + the criteria the pixel with the minimum channel 3 (blue) value is used. 
\n\nKnown + Issues\n* For complete information about known issues please refer to the [MODIS/VIIRS + Land Quality Assessment website](https://landweb.modaps.eosdis.nasa.gov/knownissue?sensor=MODIS&sat=Terra&as=61).\n\nImprovements/Changes + from Previous Versions\n* The Version 6.1 Level-1B (L1B) products have been improved + by undergoing various calibration changes that include: changes to the response-versus-scan + angle (RVS) approach that affects reflectance bands for Aqua and Terra MODIS, corrections + to adjust for the optical crosstalk in Terra MODIS infrared (IR) bands, and corrections + to the Terra MODIS forward look-up table (LUT) update for the period 2012 - 2017.\n* + A polarization correction has been applied to the L1B Reflective Solar Bands (RSB).\nRead + our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/MODIS/MOD09A1.061 +Contact: 'User Services: lpdaac@usgs.gov. Home Page: https://www.earthdata.nasa.gov/centers/lp-daac/contact' +ManagedBy: NASA +UpdateFrequency: From 2000-02-18 to Ongoing (Weekly - < Monthly) +Tags: + - aws-pds + - earth observation + - geospatial + - global + - land + - opendap + - satellite imagery + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Terra Surface Reflectance 8-Day L3 Global 500m SIN Grid V061.' 
+ ARN: arn:aws:s3:::lp-prod-protected/MOD09A1.061 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/7f58ad3abeca7f0d17637ddb812642c0120a57ab/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC \ No newline at end of file diff --git a/datasets/nasa-mod09ga.yaml b/datasets/nasa-mod09ga.yaml new file mode 100644 index 000000000..8b926f3a0 --- /dev/null +++ b/datasets/nasa-mod09ga.yaml @@ -0,0 +1,46 @@ +Name: MODIS/Terra Surface Reflectance Daily L2G Global 1km and 500m SIN Grid V061 +Description: "The MOD09GA Version 6.1 product provides an estimate of the surface + spectral reflectance of Terra Moderate Resolution Imaging Spectroradiometer (MODIS) + Bands 1 through 7, corrected for atmospheric conditions such as gasses, aerosols, + and Rayleigh scattering. Provided along with the 500 meter (m) surface reflectance, + observation, and quality bands are a set of ten 1 kilometer (km) observation bands + and geolocation flags. The reflectance layers from the MOD09GA are used as the source + data for many of the MODIS land products. 
\n\nKnown Issues\n* For complete information + about known issues please refer to the [MODIS/VIIRS Land Quality Assessment website](https://landweb.modaps.eosdis.nasa.gov/knownissue?sensor=MODIS&sat=Terra&as=61).\n\nImprovements/Changes + from Previous Versions\n* The Version 6.1 Level-1B (L1B) products have been improved + by undergoing various calibration changes that include: changes to the response-versus-scan + angle (RVS) approach that affects reflectance bands for Aqua and Terra MODIS, corrections + to adjust for the optical crosstalk in Terra MODIS infrared (IR) bands, and corrections + to the Terra MODIS forward look-up table (LUT) update for the period 2012 - 2017.\n* + A polarization correction has been applied to the L1B Reflective Solar Bands (RSB).\nRead + our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/MODIS/MOD09GA.061 +Contact: 'User Services: lpdaac@usgs.gov' +ManagedBy: NASA +UpdateFrequency: From 2000-02-24 to Ongoing +Tags: + - aws-pds + - datacenter + - earth observation + - geospatial + - global + - hdf + - ice + - land + - opendap + - satellite imagery +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Terra Surface Reflectance Daily L2G Global 1km and 500m SIN + Grid V061.' 
+ ARN: arn:aws:s3:::lp-prod-protected/MOD09GA.061 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/main/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC + AuthorURL: https://www.earthdata.nasa.gov/centers/lp-daac diff --git a/datasets/nasa-mod09gq.yaml b/datasets/nasa-mod09gq.yaml new file mode 100644 index 000000000..31b62f12c --- /dev/null +++ b/datasets/nasa-mod09gq.yaml @@ -0,0 +1,44 @@ +Name: MODIS/Terra Surface Reflectance Daily L2G Global 250m SIN Grid V061 +Description: "The MOD09GQ Version 6.1 product provides an estimate of the surface + spectral reflectance of Terra Moderate Resolution Imaging Spectroradiometer (MODIS) + 250 meter (m) bands 1 and 2, corrected for atmospheric conditions such as gasses, + aerosols, and Rayleigh scattering. Along with the 250 m surface reflectance bands + are the Quality Assurance (QA) layer and five observation layers. This product is + intended to be used in conjunction with the quality and viewing geometry information + of the 500 m product (MOD09GA). 
\n\nKnown Issues\n* For complete information about + known issues please refer to the [MODIS/VIIRS Land Quality Assessment website](https://landweb.modaps.eosdis.nasa.gov/knownissue?sensor=MODIS&sat=Terra&as=61).\n\nImprovements/Changes + from Previous Versions\n* The Version 6.1 Level-1B (L1B) products have been improved + by undergoing various calibration changes that include: changes to the response-versus-scan + angle (RVS) approach that affects reflectance bands for Aqua and Terra MODIS, corrections + to adjust for the optical crosstalk in Terra MODIS infrared (IR) bands, and corrections + to the Terra MODIS forward look-up table (LUT) update for the period 2012 - 2017.\n* + A polarization correction has been applied to the L1B Reflective Solar Bands (RSB).\nRead + our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/MODIS/MOD09GQ.061 +Contact: 'User Services: lpdaac@usgs.gov' +ManagedBy: NASA +UpdateFrequency: From 2000-02-24 to Ongoing (Daily - < Weekly) +Tags: + - aws-pds + - datacenter + - earth observation + - geospatial + - global + - hdf + - ice + - land + - opendap +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Terra Surface Reflectance Daily L2G Global 250m SIN Grid V061.' 
+ ARN: arn:aws:s3:::lp-prod-protected/MOD09GQ.061 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/main/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC + AuthorURL: https://www.earthdata.nasa.gov/centers/lp-daac diff --git a/datasets/nasa-mod13q1.yaml b/datasets/nasa-mod13q1.yaml new file mode 100644 index 000000000..00c7bdaed --- /dev/null +++ b/datasets/nasa-mod13q1.yaml @@ -0,0 +1,51 @@ +Name: MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V061 +Description: "The Terra Moderate Resolution Imaging Spectroradiometer (MODIS) Vegetation + Indices (MOD13Q1) Version 6.1 data are generated every 16 days at 250 meter (m) + spatial resolution as a Level 3 product. The MOD13Q1 product provides two primary + vegetation layers. The first is the Normalized Difference Vegetation Index (NDVI) + which is referred to as the continuity index to the existing National Oceanic and + Atmospheric Administration-Advanced Very High Resolution Radiometer (NOAA-AVHRR) + derived NDVI. The second vegetation layer is the Enhanced Vegetation Index (EVI), + which has improved sensitivity over high biomass regions. The algorithm chooses + the best available pixel value from all the acquisitions from the 16 day period. + The criteria used is low clouds, low view angle, and the highest NDVI/EVI value.\n\nAlong + with the vegetation layers and the two quality layers, the HDF file will have MODIS + reflectance bands 1 (red), 2 (near-infrared), 3 (blue), and 7 (mid-infrared), as + well as four observation layers. 
\n\nKnown Issues\n* For complete information about + known issues please refer to the [MODIS/VIIRS Land Quality Assessment website](https://landweb.modaps.eosdis.nasa.gov/knownissue?sensor=MODIS&sat=Terra&as=61).\n\nImprovements/Changes + from Previous Versions\n* The Version 6.1 Level-1B (L1B) products have been improved + by undergoing various calibration changes that include: changes to the response-versus-scan + angle (RVS) approach that affects reflectance bands for Aqua and Terra MODIS, corrections + to adjust for the optical crosstalk in Terra MODIS infrared (IR) bands, and corrections + to the Terra MODIS forward look-up table (LUT) update for the period 2012 - 2017.\n* + A polarization correction has been applied to the L1B Reflective Solar Bands (RSB).\nRead + our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/MODIS/MOD13Q1.061 +Contact: 'User Services: lpdaac@usgs.gov' +ManagedBy: NASA +UpdateFrequency: From 2000-02-18 to Ongoing (Weekly - < Monthly) +Tags: + - aws-pds + - datacenter + - earth observation + - geospatial + - global + - hdf + - ice + - land + - opendap + - satellite imagery +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V061.' 
+ ARN: arn:aws:s3:::lp-prod-protected/MOD13Q1.061 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/main/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC + AuthorURL: https://www.earthdata.nasa.gov/centers/lp-daac diff --git a/datasets/nasa-mod16a2.yaml b/datasets/nasa-mod16a2.yaml new file mode 100644 index 000000000..1ccc336b8 --- /dev/null +++ b/datasets/nasa-mod16a2.yaml @@ -0,0 +1,57 @@ +Name: MODIS/Terra Net Evapotranspiration 8-Day L4 Global 500m SIN Grid V061 +Description: "The MOD16A2 Version 6.1 Evapotranspiration/Latent Heat Flux product + is an 8-day composite dataset produced at 500 meter (m) pixel resolution. The algorithm + used for the MOD16 data product collection is based on the logic of the Penman-Monteith + equation, which includes inputs of daily meteorological reanalysis data along with + Moderate Resolution Imaging Spectroradiometer (MODIS) remotely sensed data products + such as vegetation property dynamics, albedo, and land cover. \n\nProvided in the + MOD16A2 product are layers for composited Evapotranspiration (ET), Latent Heat Flux + (LE), Potential ET (PET) and Potential LE (PLE) along with a quality control layer. + Two low resolution browse images, ET and LE, are also available for each MOD16A2 + granule.\n\nThe pixel values for the two Evapotranspiration layers (ET and PET) + are the sum of all eight days within the composite period and the pixel values for + the two Latent Heat layers (LE and PLE) are the average of all eight days within + the composite period. 
Note that the last acquisition period of each year is a 5 + or 6-day composite period, depending on the year.\n\nKnown Issues\n* Operational + and uncertainty issues are provided under Section 3 in the User Guide.\n* For complete + information about known issues please refer to the [MODIS/VIIRS Land Quality Assessment + website](https://landweb.modaps.eosdis.nasa.gov/knownissue?sensor=MODIS&sat=Terra&as=61).\n\nImprovements/Changes + from Previous Versions\n* The Version 6.1 Level-1B (L1B) products have been improved + by undergoing various calibration changes that include: changes to the response-versus-scan + angle (RVS) approach that affects reflectance bands for Aqua and Terra MODIS, corrections + to adjust for the optical crosstalk in Terra MODIS infrared (IR) bands, and corrections + to the Terra MODIS forward look-up table (LUT) update for the period 2012 - 2017.\n* + A polarization correction has been applied to the L1B Reflective Solar Bands (RSB).\n* + The product uses Climatology LAI/FPAR as back up to the operational LAI/FPAR.\nRead + our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/MODIS/MOD16A2.061 +Contact: 'User Services: lpdaac@usgs.gov. Home Page: https://www.earthdata.nasa.gov/centers/lp-daac/contact' +ManagedBy: NASA +UpdateFrequency: From 2021-01-01 to Ongoing (Weekly - < Monthly) +Tags: + - aws-pds + - atmosphere + - earth observation + - evapotranspiration + - geospatial + - global + - land + - land cover + - opendap + - water + - hdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Terra Net Evapotranspiration 8-Day L4 Global 500m SIN Grid + V061.' 
+ ARN: arn:aws:s3:::lp-prod-protected/MOD16A2.061 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Publications: ~ + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/7f58ad3abeca7f0d17637ddb812642c0120a57ab/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC \ No newline at end of file diff --git a/datasets/nasa-modis-t-jpl-l2p-v2019-0.yaml b/datasets/nasa-modis-t-jpl-l2p-v2019-0.yaml new file mode 100644 index 000000000..ad011caa1 --- /dev/null +++ b/datasets/nasa-modis-t-jpl-l2p-v2019-0.yaml @@ -0,0 +1,40 @@ +Name: GHRSST Level 2P Global Sea Surface Skin Temperature from the Moderate Resolution + Imaging Spectroradiometer (MODIS) on the NASA Terra satellite (GDS2) +Description: |- + NASA produces skin sea surface temperature (SST) products from the Infrared (IR) channels of the Moderate-resolution Imaging Spectroradiometer (MODIS) onboard the Terra satellite. Terra was launched by NASA on December 18, 1999, into a sun synchronous, polar orbit with a daylight descending node at 10:30 am, to study the global dynamics of the Earth atmosphere, land and oceans. The MODIS captures data in 36 spectral bands at a variety of spatial resolutions. Two SST products can be present in these files. The first is a skin SST produced for both day and night observations, derived from the long wave IR 11 and 12 micron wavelength channels, using a modified nonlinear SST algorithm intended to provide continuity with SST derived from heritage and current NASA sensors. At night, a second SST product is produced using the mid-infrared 3.95 and 4.05 micron channels which are unique to MODIS; the SST derived from these measurements is identified as SST4. The SST4 product has lower uncertainty, but due to sun glint can only be produced at night. 
MODIS L2P SST data have a 1 km spatial resolution at nadir and are stored in 288 five minute granules per day. Full global coverage is obtained every two days, with coverage poleward of 32.3 degree being complete each day. The production of MODIS L2P SST files is part of the Group for High Resolution Sea Surface Temperature (GHRSST) project, and is a joint collaboration between the NASA Jet Propulsion Laboratory (JPL), the NASA Ocean Biology Processing Group (OBPG), and the Rosenstiel School of Marine and Atmospheric Science (RSMAS). Researchers at RSMAS are responsible for SST algorithm development, error statistics and quality flagging, while the OBPG, as the NASA ground data system, is responsible for the production of daily MODIS ocean products. JPL acquires MODIS ocean granules from the OBPG and reformats them to the GHRSST L2P netCDF specification with complete metadata and ancillary variables, and distributes the data as the official Physical Oceanography Data Archive (PO.DAAC) for SST. The R2019.0 supersedes the previous R2014.0 datasets which can be found at https://doi.org/10.5067/GHMDT-2PJ02 + Read our doc on how to get AWS Credentials to retrieve this data: https://archive.podaac.earthdata.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/GHMDT-2PJ19 +Contact: 'Help Desk: podaac@podaac.jpl.nasa.gov. Home Page: https://podaac.jpl.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 2000-02-24 to Ongoing (Hourly - < Daily) +Tags: + - aws-pds + - atmosphere + - datacenter + - earth observation + - global + - land + - marine + - metadata + - oceans + - orbit + - netcdf +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GHRSST Level 2P Global Sea Surface Skin Temperature from the Moderate + Resolution Imaging Spectroradiometer (MODIS) on the NASA Terra satellite (GDS2).' 
+ ARN: arn:aws:s3:::podaac-ops-cumulus-protected/MODIS_T-JPL-L2P-v2019.0/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://archive.podaac.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: A decade of sea surface temperature from MODIS + URL: http://dx.doi.org/10.1016/j.rse.2015.04.023 + AuthorName: Kilpatrick, K.A., Podestá, G., Walsh, S., Williams, E., Halliwell, + V., Szczodrak, M., Brown, O.B., Minnett, P.J., & Evans, R. + Tutorials: + - Title: Direct S3 Access tutorial + URL: https://github.com/podaac/tutorials/blob/3d2ac9cb3626b656802638f864631a68beb11823/notebooks/s3/S3-Access.ipynb#L33 + AuthorName: PODAAC \ No newline at end of file diff --git a/datasets/nasa-mur-jpl-l4-glob-v41.yaml b/datasets/nasa-mur-jpl-l4-glob-v41.yaml new file mode 100644 index 000000000..b49733597 --- /dev/null +++ b/datasets/nasa-mur-jpl-l4-glob-v41.yaml @@ -0,0 +1,43 @@ +Name: GHRSST Level 4 MUR Global Foundation Sea Surface Temperature Analysis (v4.1) +Description: |- + A Group for High Resolution Sea Surface Temperature (GHRSST) Level 4 sea surface temperature analysis produced as a retrospective dataset (four day latency) and near-real-time dataset (one day latency) at the JPL Physical Oceanography DAAC using wavelets as basis functions in an optimal interpolation approach on a global 0.01 degree grid. The version 4 Multiscale Ultrahigh Resolution (MUR) L4 analysis is based upon nighttime GHRSST L2P skin and subskin SST observations from several instruments including the NASA Advanced Microwave Scanning Radiometer-EOS (AMSR-E), the JAXA Advanced Microwave Scanning Radiometer 2 on GCOM-W1, the Moderate Resolution Imaging Spectroradiometers (MODIS) on the NASA Aqua and Terra platforms, the US Navy microwave WindSat radiometer, the Advanced Very High Resolution Radiometer (AVHRR) on several NOAA satellites, and in situ SST observations from the NOAA iQuam project. 
The ice concentration data are from the archives at the EUMETSAT Ocean and Sea Ice Satellite Application Facility (OSI SAF) High Latitude Processing Center and are also used for an improved SST parameterization for the high-latitudes. The dataset also contains additional variables for some granules including a SST anomaly derived from a MUR climatology and the temporal distance to the nearest IR measurement for each pixel. This dataset is funded by the NASA MEaSUREs program ( http://earthdata.nasa.gov/our-community/community-data-system-programs/measures-projects ), and created by a team led by Dr. Toshio M. Chin from JPL. It adheres to the GHRSST Data Processing Specification (GDS) version 2 format specifications. Use the file global metadata "history:" attribute to determine if a granule is near-realtime or retrospective. + Read our doc on how to get AWS Credentials to retrieve this data: https://archive.podaac.earthdata.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/GHGMR-4FJ04 +Contact: 'Help Desk: podaac@podaac.jpl.nasa.gov' +ManagedBy: NASA +UpdateFrequency: From 2002-05-31 to Ongoing (Hourly - < Daily) +Tags: + - aws-pds + - datacenter + - earth observation + - global + - ice + - metadata + - oceans + - parquet + - us + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'GHRSST Level 4 MUR Global Foundation Sea Surface Temperature Analysis + (v4.1). (Format: netCDF-4)' + ARN: arn:aws:s3:::podaac-ops-cumulus-protected/MUR-JPL-L4-GLOB-v4.1/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://archive.podaac.earthdata.nasa.gov/s3credentialsREADME +DataAtWork: + Publications: + - Title: A multi-scale high-resolution analysis of global sea surface temperature + URL: https://doi.org/10.1016/j.rse.2017.07.029 + AuthorName: Chin, T.M, J. Vazquez-Cuervo, and E.M. 
Armstrong + Tutorials: + - Title: MUR Sea Surface Temperature Analysis of Washington State + URL: https://podaac.github.io/tutorials/notebooks/datasets/MUR_SST_Washington_Comparison.html + AuthorName: Zoë Walschots + NotebookURL: https://github.com/podaac/tutorials/blob/master/notebooks/datasets/MUR_SST_Washington_Comparison.ipynb + - Title: Using Sea Surface Temperature and Sea Surface Height Data for Hurricane + Helene + URL: https://podaac.github.io/tutorials/notebooks/DataStories/HurricaneHelene_SST_SSH_Notebook.html + AuthorName: Julie Sanchez + NotebookURL: https://github.com/podaac/tutorials/blob/4466294936f7c992a78bd2954a9f4909784bf0ba/notebooks/DataStories/HurricaneHelene_SST_SSH_Notebook.ipynb diff --git a/datasets/nasa-myd09ga.yaml b/datasets/nasa-myd09ga.yaml new file mode 100644 index 000000000..b27acb2dc --- /dev/null +++ b/datasets/nasa-myd09ga.yaml @@ -0,0 +1,53 @@ +Name: MODIS/Aqua Surface Reflectance Daily L2G Global 1km and 500m SIN Grid V061 +Description: "The MYD09GA Version 6.1 product provides an estimate of the surface + spectral reflectance of Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) + Bands 1 through 7, corrected for atmospheric conditions such as gasses, aerosols, + and Rayleigh scattering. Provided along with the 500 meter (m) surface reflectance, + observation, and quality bands are a set of ten 1 kilometer observation bands and + geolocation flags. The reflectance layers from the MYD09GA are used as the source + data for many of the MODIS land products. \n\nKnown Issues\n* Prior to the Aqua + MODIS launch, Band 6 exhibited several anomalous detectors. Band 6 performance degraded + seriously after launch and presently a majority of the Band 6 detectors are non-functional. + Science users should read and use the non-functional detector flags and decide for + themselves the optimum manner to handle non-functional detector \"gaps\" for their + products. 
For complete information please refer to the [MODIS Characterization Support + Team (MCST) website](https://mcst.gsfc.nasa.gov/time-dependent-list-non-functional-or-noisy-detector).\n* + For complete information about known issues please refer to the [MODIS/VIIRS Land + Quality Assessment website](https://landweb.modaps.eosdis.nasa.gov/knownissue?sensor=MODIS&sat=Aqua&as=61).\n\nImprovements/Changes + from Previous Version\n* The Version 6.1 Level-1B (L1B) products have been improved + by undergoing various calibration changes that include: changes to the response-versus-scan + angle (RVS) approach that affects reflectance bands for Aqua and Terra MODIS, corrections + to adjust for the optical crosstalk in Terra MODIS infrared (IR) bands, and corrections + to the Terra MODIS forward look-up table (LUT) update for the period 2012 - 2017.\n* + A polarization correction has been applied to the L1B Reflective Solar Bands (RSB).\nRead + our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/MODIS/MYD09GA.061 +Contact: 'User Services: lpdaac@usgs.gov' +ManagedBy: NASA +UpdateFrequency: From 2002-07-04 to Ongoing (Daily - < Weekly) +Tags: + - aws-pds + - datacenter + - earth observation + - geospatial + - global + - hdf + - ice + - land + - opendap + - satellite imagery +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Aqua Surface Reflectance Daily L2G Global 1km and 500m SIN + Grid V061.' 
+ ARN: arn:aws:s3:::lp-prod-protected/MYD09GA.061 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/main/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC + AuthorURL: https://www.earthdata.nasa.gov/centers/lp-daac diff --git a/datasets/nasa-myd09gq.yaml b/datasets/nasa-myd09gq.yaml new file mode 100644 index 000000000..7cf2aa2af --- /dev/null +++ b/datasets/nasa-myd09gq.yaml @@ -0,0 +1,50 @@ +Name: MODIS/Aqua Surface Reflectance Daily L2G Global 250m SIN Grid V061 +Description: "The MYD09GQ Version 6.1 product provides an estimate of the surface + spectral reflectance of Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) + 250 meter (m) bands 1 and 2, corrected for atmospheric conditions such as gasses, + aerosols, and Rayleigh scattering. Along with the 250 m bands are the Quality Assurance + (QA) layer and five observation layers. This product is intended to be used in conjunction + with the quality and viewing geometry information of the 500 m product (MYD09GA). + \n\nKnown Issues\n* Prior to the Aqua MODIS launch, Band 6 exhibited several anomalous + detectors. Band 6 performance degraded seriously after launch and presently a majority + of the Band 6 detectors are non-functional. Science users should read and use the + non-functional detector flags and decide for themselves the optimum manner to handle + non-functional detector \"gaps\" for their products. 
For complete information please + refer to the [MODIS Characterization Support Team (MCST) website](https://mcst.gsfc.nasa.gov/time-dependent-list-non-functional-or-noisy-detector).\n* + For complete information about known issues please refer to the [MODIS/VIIRS Land + Quality Assessment website](https://landweb.modaps.eosdis.nasa.gov/knownissue?sensor=MODIS&sat=Aqua&as=61).\n\nImprovements/Changes + from Previous Version\n* The Version 6.1 Level-1B (L1B) products have been improved + by undergoing various calibration changes that include: changes to the response-versus-scan + angle (RVS) approach that affects reflectance bands for Aqua and Terra MODIS, corrections + to adjust for the optical crosstalk in Terra MODIS infrared (IR) bands, and corrections + to the Terra MODIS forward look-up table (LUT) update for the period 2012 - 2017.\n* + A polarization correction has been applied to the L1B Reflective Solar Bands (RSB).\nRead + our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/MODIS/MYD09GQ.061 +Contact: 'User Services: lpdaac@usgs.gov' +ManagedBy: NASA +UpdateFrequency: From 2002-07-04 to Ongoing (Daily - < Weekly) +Tags: + - aws-pds + - datacenter + - earth observation + - geospatial + - global + - hdf + - ice + - land + - opendap +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'MODIS/Aqua Surface Reflectance Daily L2G Global 250m SIN Grid V061.' 
+ ARN: arn:aws:s3:::lp-prod-protected/MYD09GQ.061 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Download Files from S3 Using boto3 + URL: https://github.com/nasa/LPDAAC-Data-Resources/blob/main/python/how-tos/Earthdata_Cloud__Download_file_from_S3.ipynb + AuthorName: LPDAAC + AuthorURL: https://www.earthdata.nasa.gov/centers/lp-daac diff --git a/datasets/nasa-operal2cslc-s1-staticv1.yaml b/datasets/nasa-operal2cslc-s1-staticv1.yaml new file mode 100644 index 000000000..e2241b1f8 --- /dev/null +++ b/datasets/nasa-operal2cslc-s1-staticv1.yaml @@ -0,0 +1,40 @@ +Name: OPERA Coregistered Single-Look Complex from Sentinel-1 Static Layers validated + product (Version 1) +Description: |- + The Observational Products for End-Users from Remote Sensing Analysis (OPERA) Coregistered Single-Look Complex (CSLC) from Sentinel-1 (S1) Static Layers (CSLC-S1-STATIC) validated product contains static radar geometry layers associated with the OPERA Coregistered Single-Look Complex (CSLC) from Sentinel-1 (S1) validated product. Due to the S1 mission’s narrow orbital tube, radar-geometry layers vary slightly over time for each position on the ground, and therefore are considered static. These static layers are provided separately from the OPERA CSLC-S1 product, as they are produced only once or a limited number of times, to account for changes in the DEM, in the S1 orbit, or in the static layers generation algorithm. Each OPERA CSLC-S1-STATIC product is distributed as a Hierarchical Data Format version 5 (HDF5) file following the CF-1.8 convention containing both data raster layers and product metadata and corresponds to matching CSLC-S1 products with the same burst ID. OPERA CSLC-S1 products are available over North America which includes the USA and U.S. Territories, Canada within 200 km of the U.S. border, and all mainland countries from the southern U.S. 
border down to and including Panama. The CSLC-S1 products are available in the associated OPERA Coregistered Single-Look Complex from Sentinel-1 validated product (Version 1) dataset. + Read our doc on how to get AWS Credentials to retrieve this data: https://cumulus.asf.alaska.edu/s3credentialsREADME +Documentation: https://doi.org/10.5067/SNWG/OPERA_L2_CSLC-S1-STATIC_V1 +Contact: 'Email: uso@asf.alaska.edu. Home Page: https://www.asf.alaska.edu/' +ManagedBy: NASA +UpdateFrequency: From 2014-04-03 to Ongoing +Tags: + - aws-pds + - coastal + - earth observation + - hdf + - ice + - land + - metadata + - oceans + - orbit + - radar + - sentinel-1 + - synthetic aperture radar + - xml +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Coregistered Single-Look Complex from Sentinel-1 Static Layers + validated product (Version 1).' + ARN: arn:aws:s3:::asf-cumulus-prod-opera-products/OPERA_L2_CSLC-S1_STATIC/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://cumulus.asf.alaska.edu/s3credentials +DataAtWork: + Tutorials: + - Title: Generate inteferograms without the need to download OPERA CSLC-S1 products locally + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/CSLC/Discover/Create_Interferogram_by_Streaming_CSLC-S1.ipynb + AuthorName: M. Grace Bato and K. Devlin + - Title: Generate interferograms and map the lava flow emplacement using OPERA CSLC-S1 + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/CSLC/Volcano/Map_Deformation_LavaFlow_using_CSLC-S1.ipynb + AuthorName: M. 
Grace Bato diff --git a/datasets/nasa-operal2cslc-s1v1.yaml b/datasets/nasa-operal2cslc-s1v1.yaml new file mode 100644 index 000000000..c8fbf5166 --- /dev/null +++ b/datasets/nasa-operal2cslc-s1v1.yaml @@ -0,0 +1,64 @@ +Name: OPERA Coregistered Single-Look Complex from Sentinel-1 validated product (Version + 1) +Description: "The Observational Products for End-Users from Remote Sensing Analysis + (OPERA) Coregistered Single-Look Complex (CSLC) from Sentinel-1 validated product + consists of Single Look Complex (SLC) images which contain both amplitude and phase + information of the complex radar return. The amplitude is primarily determined by + ground surface properties (e.g., terrain slope, surface roughness, and physical + properties), and phase primarily represents the distance between the radar and ground + targets corrected for the geometrical distance between the two based on the knowledge + from Digital Elevation Model and platform’s position, i.e., the CSLC phase represents + residual geometrical distance between the sensor and target, the atmospheric propagation + delay and the target movements. The CSLC-S1 product is derived from Copernicus Sentinel-1A + and Sentinel-1B Interferometric Wide (IW) SLC data. \n\nThe CSLC images are precisely + aligned or “coregistered” to a pre-defined UTM/Polar stereographic map projection + systems and posted at 5x10 m spacing in east and north direction, respectively. + \ Each CSLC-S1 product corresponds to a single S1 burst and is distributed as a + Hierarchical Data Format version 5 (HDF5) file following the CF-1.8 convention containing + both data raster layers (e.g., geocoded complex backscatter, low-resolution correction + look-up tables) and product metadata. OPERA CSLC-S1 products are available over + North America which includes the USA and U.S. Territories, Canada within 200 km + of the U.S. border, and all mainland countries from the southern U.S. border down + to and including Panama. 
The OPERA CSLC-S1 product contains modified Copernicus + Sentinel data (2016-2025).\n\nDue to the S1 mission’s narrow orbital tube, radar-geometry + layers vary slightly over time for each position on the ground, and therefore are + considered static. These static layers are provided separately from the OPERA CSLC-S1 + product, as they are produced only once or a limited number of times. The static + layers are available in the associated OPERA Coregistered Single-Look Complex from + Sentinel-1 Static Layers validated product (Version 1).\nRead our doc on how to + get AWS Credentials to retrieve this data: https://cumulus.asf.alaska.edu/s3credentialsREADME" +Documentation: https://doi.org/10.5067/SNWG/OPERA_L2_CSLC-S1_V1 +Contact: 'Email: uso@asf.alaska.edu. Home Page: https://www.asf.alaska.edu/' +ManagedBy: NASA +UpdateFrequency: From 2014-06-15 to Ongoing +Tags: + - aws-pds + - coastal + - earth observation + - hdf + - ice + - land + - metadata + - oceans + - orbit + - radar + - sentinel-1 + - synthetic aperture radar + - xml +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Coregistered Single-Look Complex from Sentinel-1 validated + product (Version 1).' + ARN: arn:aws:s3:::asf-cumulus-prod-opera-products/OPERA_L2_CSLC-S1/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://cumulus.asf.alaska.edu/s3credentialsREADME +DataAtWork: + Tutorials: + - Title: Generate interferograms without the need to download OPERA CSLC-S1 products locally + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/CSLC/Discover/Create_Interferogram_by_Streaming_CSLC-S1.ipynb + AuthorName: M. Grace Bato and K. Devlin + - Title: Generate interferograms and map the lava flow emplacement using OPERA CSLC-S1 + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/CSLC/Volcano/Map_Deformation_LavaFlow_using_CSLC-S1.ipynb + AuthorName: M.
Grace Bato diff --git a/datasets/nasa-operal2rtc-s1-staticv1.yaml b/datasets/nasa-operal2rtc-s1-staticv1.yaml new file mode 100644 index 000000000..adb9da30f --- /dev/null +++ b/datasets/nasa-operal2rtc-s1-staticv1.yaml @@ -0,0 +1,48 @@ +Name: OPERA Radiometric Terrain Corrected SAR Backscatter from Sentinel-1 Static Layers + validated product (Version 1) +Description: |- + The Observational Products for End-Users from Remote Sensing Analysis (OPERA) Radiometric Terrain Corrected (RTC) SAR Backscatter from Sentinel-1 (S1) Static Layers (RTC-S1-STATIC) validated product contains static radar geometry layers associated with the OPERA Radiometric Terrain Corrected (RTC) SAR Backscatter from Sentinel-1 (S1) (RTC-S1) validated product. Due to the S1 mission’s narrow orbital tube, radar-geometry layers such as incidence angle, local incidence angle, number of looks, and RTC Area Normalization Factor (ANF) vary slightly over time for each position on the ground, and therefore are considered static. These static layers are provided separately from the OPERA RTC-S1 product, as they are produced only once or a limited number of times, to account for changes in the DEM, in the S1 orbit, or in the static-layers generation algorithm. Static layers are provided as single-band cloud-optimized GeoTIFF (COG) files, with map grid matching RTC-S1 products with the same burst ID. The standard OPERA RTC-S1 product is derived from the original Copernicus Sentinel-1 (S1) interferometric wide (IW) single-look complex (SLC) data, provided by the European Space Agency, with a temporal sampling coincident with the availability of Sentinel-1A and Sentinel-1B SLC data. The OPERA RTC-S1-STATIC and RTC-S1 products are provided at a near global scope (land masses excluding Antarctica). The RTC-S1 products are available in the associated OPERA Radiometric Terrain Corrected SAR Backscatter from Sentinel-1 validated product (Version 1) dataset. 
+ Read our doc on how to get AWS Credentials to retrieve this data: https://cumulus.asf.alaska.edu/s3credentialsREADME +Documentation: https://doi.org/10.5067/SNWG/OPERA_L2_RTC-S1-STATIC_V1 +Contact: 'Email: uso@asf.alaska.edu. Home Page: https://www.asf.alaska.edu/' +ManagedBy: NASA +UpdateFrequency: From 2014-04-03 to Ongoing +Tags: + - aws-pds + - coastal + - cog + - earth observation + - geoscience + - global + - ice + - land + - metadata + - oceans + - orbit + - radar + - sentinel-1 + - synthetic aperture radar + - tiff + - xml +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Radiometric Terrain Corrected SAR Backscatter from Sentinel-1 + Static Layers validated product (Version 1).' + ARN: arn:aws:s3:::asf-cumulus-prod-opera-products/OPERA_L2_RTC-S1_STATIC/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://cumulus.asf.alaska.edu/s3credentialsREADME +DataAtWork: + Publications: + - Title: An Area-Based Projection Algorithm for SAR Radiometric Terrain Correction + and Geocoding + URL: https://doi.org/10.1109/TGRS.2022.3147472 + AuthorName: Gustavo H. X. Shiroma, Marco Lavalle, and Sean M. Buckley + Tutorials: + - Title: Load, Mosaic, and Visualize OPERA RTC-S1 Data + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/RTC/notebooks/RTC_notebook.ipynb + AuthorName: K. Venkataramani + - Title: RTC Landslide Example + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/RTC/notebooks/RTC_landslide_example.ipynb + AuthorName: K. 
Venkataramani diff --git a/datasets/nasa-operal2rtc-s1v1.yaml b/datasets/nasa-operal2rtc-s1v1.yaml new file mode 100644 index 000000000..184022cd9 --- /dev/null +++ b/datasets/nasa-operal2rtc-s1v1.yaml @@ -0,0 +1,73 @@ +Name: OPERA Radiometric Terrain Corrected SAR Backscatter from Sentinel-1 validated + product (Version 1) +Description: "The Observational Products for End-Users from Remote Sensing Analysis + (OPERA) Radiometric Terrain Corrected (RTC) SAR Backscatter from Sentinel-1 (S1) + validated product consists of radar backscatter normalized with respect to the topography. + The product maps signals related to the physical properties of ground scattering + objects, such as surface roughness and soil moisture and/or vegetation. The OPERA + RTC-S1 product is derived from Copernicus Sentinel-1 Interferometric Wide (IW) Single + Look Complex (SLC) data with a near global scope and temporal sampling coincident + with the availability of S1 SLC data. \n\nEach OPERA RTC-S1 product corresponds + to a single S1 burst projected onto a pre-defined UTM/Polar stereographic map projection + system map grid with a 30-meter spacing. The Copernicus global 30 m (GLO-30) Digital + Elevation Model (DEM) is the reference DEM used to correct for the impacts of topography + and to geocode the product. The OPERA RTC-S1 product is normalized to the backscatter + coefficient gamma-naught, ɣ0, obtained from the original radar brightness beta-naught, + β0, through radiometric terrain correction. The RTC-S1 product is distributed + as cloud optimized GeoTIFFs with one GeoTIFF file per processed polarization. The + RTC-S1 product metadata is provided in the Hierarchical Data Format version 5 (HDF5) + format. 
The OPERA RTC-S1 product contains modified Copernicus Sentinel data (2022-2025).\n\nDue + to the S1 mission’s narrow orbital tube, radar-geometry layers such as incidence + angle, local incidence angle, number of looks, and RTC Area Normalization Factor + (ANF) vary slightly over time for each position on the ground, and therefore are + considered static. These static layers are provided separately from the OPERA RTC-S1 + product, as they are produced only once or a limited number of times, to account + for changes in the DEM, in the S1 orbit, or in the static-layers generation algorithm. + The static layers are available in the associated OPERA Radiometric Terrain Corrected + SAR Backscatter from Sentinel-1 Static Layers validated product (Version 1) dataset.\nRead + our doc on how to get AWS Credentials to retrieve this data: https://cumulus.asf.alaska.edu/s3credentialsREADME" +Documentation: https://doi.org/10.5067/SNWG/OPERA_L2_RTC-S1_V1 +Contact: 'Email: uso@asf.alaska.edu. Home Page: https://www.asf.alaska.edu/' +ManagedBy: NASA +UpdateFrequency: From 2020-12-31 to Ongoing +Tags: + - aws-pds + - coastal + - earth observation + - geoscience + - global + - hdf + - ice + - land + - metadata + - oceans + - orbit + - radar + - sentinel-1 + - soil moisture + - synthetic aperture radar + - tiff + - xml +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Radiometric Terrain Corrected SAR Backscatter from Sentinel-1 + validated product (Version 1).' + ARN: arn:aws:s3:::asf-cumulus-prod-opera-products/OPERA_L2_RTC-S1/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://cumulus.asf.alaska.edu/s3credentials +DataAtWork: + Publications: + - Title: An Area-Based Projection Algorithm for SAR Radiometric Terrain Correction + and Geocoding + URL: https://doi.org/10.1109/TGRS.2022.3147472 + AuthorName: Gustavo H. X. Shiroma, Marco Lavalle, and Sean M. 
Buckley + - Title: Thermal Denoising of Products Generated by the S-1 IPF + Tutorials: + - Title: Load, Mosaic, and Visualize OPERA RTC-S1 Data + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/RTC/notebooks/RTC_notebook.ipynb + AuthorName: K. Venkataramani + - Title: RTC Landslide Example + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/RTC/notebooks/RTC_landslide_example.ipynb + AuthorName: K. Venkataramani diff --git a/datasets/nasa-operal3disp-s1v1.yaml b/datasets/nasa-operal3disp-s1v1.yaml new file mode 100644 index 000000000..f2cc69b77 --- /dev/null +++ b/datasets/nasa-operal3disp-s1v1.yaml @@ -0,0 +1,48 @@ +Name: OPERA Surface Displacement from Sentinel-1 validated product (Version 1) +Description: "The Level-3 OPERA Sentinel-1 Surface Displacement (DISP) product is + generated through interferometric time-series analysis of Level-2 Coregistered Sentinel-1 + Single Look Complex (CSLC) datasets. Using a hybrid Persistent Scatterer (PS) and + Distributed Scatterer (DS) approach, this product quantifies Earth's surface displacement + in the radar line-of-sight. The DISP products enable the detection of anthropogenic + and natural surface changes, including subsidence, tectonic deformation, and landslides. + \n\nThe OPERA DISP suite comprises complementary datasets derived from Sentinel-1 + and NISAR inputs, designated as DISP-S1 and DISP-NI, respectively. Each product, + created per acquisition, adheres to a consistent structure, HDF5 file format, file-naming + convention, and a 30 m spatial posting. This collection specifically includes DISP-S1 + products, derived from Sentinel-1 data. \n\nDISP-S1 products provide spatial coverage + across North America, encompassing the United States, U.S. territories, Canada within 200 + km of the U.S. border, and mainland countries from the southern U.S. border + to Panama.
These products are generated from Sentinel-1 Interferometric Wide (IW) + swath mode acquisitions starting in mid-2016. \n\nThe OPERA DISP-S1 product contains + modified Copernicus Sentinel data (2016-2025).\nRead our doc on how to get AWS Credentials + to retrieve this data: https://cumulus.asf.alaska.edu/s3credentialsREADME" +Documentation: https://doi.org/10.5067/SNWG/OPL3DISPS1-V1 +Contact: 'Email: uso@asf.alaska.edu. Home Page: https://www.asf.alaska.edu/' +ManagedBy: NASA +UpdateFrequency: From 2016-07-01 to Ongoing +Tags: + - aws-pds + - earth observation + - land + - metadata + - netCDF + - orbit + - radar + - sentinel-1 + - synthetic aperture radar + - xml + - zarr +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Surface Displacement from Sentinel-1 validated product (Version + 1).' + ARN: arn:aws:s3:::asf-cumulus-prod-opera-products/OPERA_L3_DISP-S1/ + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://cumulus.asf.alaska.edu/s3credentials +DataAtWork: + Tutorials: + - Title: Inspect DISP-S1 Layers + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DISP/Timeseries/opera_disp/inspect_DISP-S1_layers.ipynb + AuthorName: M.
Grace Bato diff --git a/datasets/nasa-operal3dist-alert-hls_v1.yaml b/datasets/nasa-operal3dist-alert-hls_v1.yaml new file mode 100644 index 000000000..b87ed5373 --- /dev/null +++ b/datasets/nasa-operal3dist-alert-hls_v1.yaml @@ -0,0 +1,43 @@ +Name: OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 product + (Version 1) +Description: |- + The Observational Products for End-Users from Remote Sensing Analysis ([OPERA](https://www.jpl.nasa.gov/go/opera)) Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 (HLS) product Version 1 maps vegetation disturbance alerts that are derived from data collected by Landsat 8 and Landsat 9 Operational Land Imager (OLI) and Sentinel-2A, Sentinel-2B, and Sentinel-2C Multi-Spectral Instrument (MSI). A vegetation disturbance alert is detected at 30 meter (m) spatial resolution when there is an indicated decrease in vegetation cover within an HLS pixel. The Level-3 data product also provides additional information about more general disturbance trends and auxiliary generic disturbance information as determined from the variations of the reflectance through the HLS scenes. [HLS](https://lpdaac.usgs.gov/product_search/?collections=HLS&status=Operational&view=list) data represent the highest temporal frequency data available at medium spatial resolution. The combined observations will provide greater sensitivity to land changes, whether of large magnitude/short duration or small magnitude/long duration. + + The OPERA_L3_DIST-ALERT-HLS (or DIST-ALERT) data product is provided in Cloud Optimized GeoTIFF (COG) format, and each layer is distributed as a separate file. There are 19 layers contained within the DIST-ALERT product. The layers for both vegetation and generic disturbance include disturbance status, loss or anomaly, maximum loss anomaly, disturbance confidence layer, date of disturbance, count of observations with loss anomalies, days of ongoing anomalies, and day of last disturbance detection. 
Additional layers are vegetation cover percent, historical percent vegetation cover, and data mask. See the Product Specification Document (PSD) for a more detailed description of the individual layers provided in the DIST-ALERT product. + + The OPERA_L3_DIST-ALERT-HLS product contains modified Copernicus Sentinel data (2020-2025). + + Known Issues + * Additional usage constraints are provided under Section 5 of the Algorithm Theoretical Basis Document (ATBD). + Read our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ALERT-HLS_V1.001 +Contact: 'Email: lpdaac@usgs.gov. Home Page: https://lpdaac.usgs.gov/' +ManagedBy: NASA +UpdateFrequency: From 2022-01-01 to Ongoing (Daily - < Weekly) +Tags: + - aws-pds + - cog + - earth observation + - environmental + - global + - land + - land cover + - land use + - satellite imagery +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 + product (Version 1).' + ARN: arn:aws:s3:::lp-protected/OPERA_L3_DIST-ALERT-HLS_V1.001 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Getting Started with OPERA DIST-ALERT-HLS Products + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ALERT/Discover/Stream_and_Viz_DIST-ALERT-folium.ipynb + AuthorName: R. Dhillon and M. Grace Bato + - Title: Getting Started with OPERA DIST Product + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ALERT/Wildfire/Intro_To_DIST.ipynb + AuthorName: M. Grace Bato and R. 
Dhillon diff --git a/datasets/nasa-operal3dist-alert-hlsprovisionalv0.yaml b/datasets/nasa-operal3dist-alert-hlsprovisionalv0.yaml new file mode 100644 index 000000000..dc01fa1d7 --- /dev/null +++ b/datasets/nasa-operal3dist-alert-hlsprovisionalv0.yaml @@ -0,0 +1,58 @@ +Name: OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 provisional + product (Version 0) +Description: "The OPERA_L3_DIST-ALERT-HLS Version 0 data product was decommissioned + on April 25, 2025. Users are encouraged to use the [OPERA_L3_DIST-ALERT-HLS V1](https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ALERT-HLS_V1.001) + data product which was released on March 14, 2024, and has achieved stage 1 validation.\n\nThe + Observational Products for End-Users from Remote Sensing Analysis (OPERA) Land Surface + Disturbance Alert from Harmonized Landsat Sentinel-2 (HLS) provisional data product + Version 0 maps vegetation disturbance alerts from data collected by Landsat 8 and + Landsat 9 Operational Land Imager (OLI) and Sentinel-2A, Sentinel-2B, and Sentinel-2C + Multi-Spectral Instrument (MSI). Vegetation disturbance alert is detected at 30 + meter (m) spatial resolution when there is an indicated decrease in vegetation cover + within an HLS pixel. The product also provides auxiliary generic disturbance information + as determined from the variations of the reflectance through the HLS scenes to provide + information about more general disturbance trends. HLS data represent the highest + temporal frequency data available at medium spatial resolution. The combined observations + will provide greater sensitivity to land changes, whether of large magnitude/short + duration, or small magnitude/long duration. \n\nThe OPERA_L3_DIST-ALERT-HLS (or + DIST-ALERT) data product is provided in Cloud Optimized GeoTIFF (COG) format, and + each layer is distributed as a separate file. 
There are 19 layers contained within + the DIST-ALERT product: vegetation disturbance status, current vegetation cover + indicator, current vegetation anomaly value, historical vegetation cover indicator, + max vegetation anomaly value, vegetation disturbance confidence layer, date of initial + vegetation disturbance, number of detected vegetation loss anomalies, and vegetation + disturbance duration. See the Product Specification for a more detailed description + of the individual layers provided in the DIST-ALERT product. \n\nKnown Issues\n* + Additional usage constraints are provided under Section 5 of the Algorithm Theoretical + Basis Document (ATBD).\nRead our doc on how to get AWS Credentials to retrieve this + data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ALERT-HLS_PROVISIONAL_V0.000 +Contact: 'Email: lpdaac@usgs.gov. Home Page: https://lpdaac.usgs.gov/' +ManagedBy: NASA +UpdateFrequency: From 2022-01-01 to 2024-02-26 +Tags: + - aws-pds + - cog + - earth observation + - environmental + - global + - land + - land cover + - land use +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 + provisional product (Version 0).' + ARN: arn:aws:s3:::lp-protected/OPERA_DIST-ALERT-HLS_PROVISIONAL_V0.000 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Getting Started with OPERA DIST-ALERT-HLS Products + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ALERT/Discover/Stream_and_Viz_DIST-ALERT-folium.ipynb + AuthorName: R. Dhillon and M.
Grace Bato + - Title: Getting Started with OPERA DIST Product + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ALERT/Wildfire/Intro_To_DIST.ipynb + AuthorName: M. Grace Bato and R. Dhillon diff --git a/datasets/nasa-operal3dist-alert-hlsv1.yaml b/datasets/nasa-operal3dist-alert-hlsv1.yaml new file mode 100644 index 000000000..ee890928c --- /dev/null +++ b/datasets/nasa-operal3dist-alert-hlsv1.yaml @@ -0,0 +1,43 @@ +Name: OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 product + (Version 1) +Description: |- + The Observational Products for End-Users from Remote Sensing Analysis ([OPERA](https://www.jpl.nasa.gov/go/opera)) Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 (HLS) product Version 1 maps vegetation disturbance alerts that are derived from data collected by Landsat 8 and Landsat 9 Operational Land Imager (OLI) and Sentinel-2A, Sentinel-2B, and Sentinel-2C Multi-Spectral Instrument (MSI). A vegetation disturbance alert is detected at 30 meter (m) spatial resolution when there is an indicated decrease in vegetation cover within an HLS pixel. The Level-3 data product also provides additional information about more general disturbance trends and auxiliary generic disturbance information as determined from the variations of the reflectance through the HLS scenes. [HLS](https://lpdaac.usgs.gov/product_search/?collections=HLS&status=Operational&view=list) data represent the highest temporal frequency data available at medium spatial resolution. The combined observations will provide greater sensitivity to land changes, whether of large magnitude/short duration or small magnitude/long duration. + + The OPERA_L3_DIST-ALERT-HLS (or DIST-ALERT) data product is provided in Cloud Optimized GeoTIFF (COG) format, and each layer is distributed as a separate file. There are 19 layers contained within the DIST-ALERT product. 
The layers for both vegetation and generic disturbance include disturbance status, loss or anomaly, maximum loss anomaly, disturbance confidence layer, date of disturbance, count of observations with loss anomalies, days of ongoing anomalies, and day of last disturbance detection. Additional layers are vegetation cover percent, historical percent vegetation cover, and data mask. See the Product Specification Document (PSD) for a more detailed description of the individual layers provided in the DIST-ALERT product. + + The OPERA_L3_DIST-ALERT-HLS product contains modified Copernicus Sentinel data (2020-2025). + + Known Issues + * Additional usage constraints are provided under Section 5 of the Algorithm Theoretical Basis Document (ATBD). + Read our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME +Documentation: https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ALERT-HLS_V1.001 +Contact: 'User Services: lpdaac@usgs.gov' +ManagedBy: NASA +UpdateFrequency: From 2022-01-01 to Ongoing (Daily - < Weekly) +Tags: + - aws-pds + - cog + - earth observation + - environmental + - global + - land + - land cover + - land use + - satellite imagery +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 + product (Version 1).' + ARN: arn:aws:s3:::lp-protected/OPERA_L3_DIST-ALERT-HLS_V1.001 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Getting Started with OPERA DIST-ALERT-HLS Products + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ALERT/Discover/Stream_and_Viz_DIST-ALERT-folium.ipynb + AuthorName: R. Dhillon and M. 
Grace Bato + - Title: Getting Started with OPERA DIST Product + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ALERT/Wildfire/Intro_To_DIST.ipynb + AuthorName: M. Grace Bato and R. Dhillon diff --git a/datasets/nasa-operal3dswx-hlsv1.yaml b/datasets/nasa-operal3dswx-hlsv1.yaml new file mode 100644 index 000000000..3d9a80c57 --- /dev/null +++ b/datasets/nasa-operal3dswx-hlsv1.yaml @@ -0,0 +1,72 @@ +Name: OPERA Dynamic Surface Water Extent from Harmonized Landsat Sentinel-2 product + (Version 1) +Description: "This dataset contains Level-3 Dynamic OPERA surface water extent product + version 1. The data are validated surface water extent observations beginning April + 2023. Known issues and caveats on usage are described under Documentation. The input + dataset for generating each product is the Harmonized Landsat-8 and Sentinel-2A/B/C + (HLS) product version 2.0. HLS products provide surface reflectance (SR) data from + the Operational Land Imager (OLI) aboard the Landsat 8 satellite and the MultiSpectral + Instrument (MSI) aboard the Sentinel-2A/B/C satellite. The surface water extent + products are distributed over projected map coordinates using the Universal Transverse + Mercator (UTM) projection. Each UTM tile covers an area of 109.8 km × 109.8 km. + This area is divided into 3,660 rows and 3,660 columns at 30-m pixel spacing. Each + product is distributed as a set of 10 GeoTIFF (Geographic Tagged Image File Format) + files including water classification, associated confidence, land cover classification, + terrain shadow layer, cloud/cloud-shadow classification, Digital elevation model + (DEM), and Diagnostic layer.\n

\nThe digital elevation model (DEM) provided + as a layer of the DSWx-HLS product (band 10) was generated using the Copernicus + DEM 30-m and Copernicus DEM 90-m models provided by the European Space Agency. The + Copernicus DEM 30-m and Copernicus DEM 90-m were produced using Copernicus WorldDEM-30 + © DLR e.V. 2010-2014 and © Airbus Defence and Space GmbH 2014-2018 provided under + COPERNICUS by the European Union and ESA; all rights reserved. The organizations + in charge of the OPERA project, the Copernicus programme, and Airbus Defence and + Space GmbH by law or by delegation do not assume any legal responsibility or liability, + whether express or implied, arising from the use of this DEM.\n

\nThe OPERA + DSWx-HLS product contains modified Copernicus Sentinel data (2023-2025).\n

\nTo + access the calibration/validation database for OPERA Dynamic Surface Water Extent + Products, please contact podaac@podaac.jpl.nasa.gov \nRead our doc on how to get + AWS Credentials to retrieve this data: https://archive.podaac.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/OPDSW-PL3V1 +Contact: 'Help Desk: podaac@podaac.jpl.nasa.gov. Home Page: https://podaac.jpl.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 2023-04-04 to Ongoing (Daily - < Weekly) +Tags: + - aws-pds + - cog + - datacenter + - earth observation + - ice + - land + - land cover + - metadata + - surface water + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Dynamic Surface Water Extent from Harmonized Landsat Sentinel-2 + product (Version 1).' + ARN: arn:aws:s3:::podaac-ops-cumulus-protected/OPERA_L3_DSWX-HLS_V1 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://archive.podaac.earthdata.nasa.gov/s3credentials +DataAtWork: + Publications: + - Title: Improved Automated Detection of Subpixel-Scale Inundation—Revised Dynamic + Surface Water Extent (DSWE) Partial Surface Water Tests + URL: https://doi.org/10.3390/rs11040374 + AuthorName: Jones, John W + Tutorials: + - Title: Working with OPERA Dynamic Surface Water Extent (DSWx) Data + URL: https://podaac.github.io/tutorials/notebooks/datasets/OPERA_GIS_Cloud.html + AuthorName: Nicholas Tarpinian + NotebookURL: https://github.com/podaac/tutorials/blob/master/notebooks/datasets/OPERA_GIS_Cloud.ipynb + - Title: Access DSWx-HLS S3 + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DSWx/Discover/Access_DSWx-HLS_S3.ipynb + AuthorName: M. Grace Bato + - Title: Stream and Viz DSWx-HLS via Direct HTTPS + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DSWx/Discover/Stream_and_Viz_DSWx-HLS_viaDirectHTTPS.ipynb + AuthorName: M. 
Grace Bato + - Title: Getting Started with OPERA DSWx Product + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DSWx/Reservoir/Intro_to_DSWx.ipynb + AuthorName: K. Devlin and M. Grace Bato diff --git a/datasets/nasa-operal3dswx-s1v1.yaml b/datasets/nasa-operal3dswx-s1v1.yaml new file mode 100644 index 000000000..dfcb555a0 --- /dev/null +++ b/datasets/nasa-operal3dswx-s1v1.yaml @@ -0,0 +1,48 @@ +Name: OPERA Dynamic Surface Water Extent from Sentinel-1 (Version 1) +Description: "This dataset contains Level-3 Dynamic OPERA Surface Water Extent from + Sentinel-1 (DSWx-S1) product version 1. DSWx-S1 provides near-global geographical + mapping of surface water extent over land at a spatial resolution of 30 meters over + the Military Grid Reference System (MGRS) grid system, with a temporal revisit frequency + between 6-12 days. Using Sentinel-1 radar observations, DSWx-S1 maps open inland + water bodies greater than 3 hectares and 200 meters in width, irrespective of cloud + conditions and daylight illumination that often pose challenges to optical sensors. + Forward production of the DSWx-S1 data record began in Sept 2024. Each product + is distributed as a set of 3 GeoTIFF (Geographic Tagged Image File Format) files + including water classification and associated confidence layers.\n

\nThe + OPERA DSWx-S1 product contains modified Copernicus Sentinel data (2024-2025).\n

\nTo + access the calibration/validation database for OPERA Dynamic Surface Water Extent + Products, please contact podaac@podaac.jpl.nasa.gov \nRead our doc on how to get + AWS Credentials to retrieve this data: https://archive.podaac.earthdata.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/OPDSWS1-L3V1 +Contact: 'Help Desk: podaac@podaac.jpl.nasa.gov. Home Page: https://podaac.jpl.nasa.gov/' +ManagedBy: NASA +UpdateFrequency: From 2024-08-01 to Ongoing (Daily - < Weekly) +Tags: + - aws-pds + - cog + - datacenter + - earth observation + - global + - land + - orbit + - radar + - sentinel-1 + - surface water + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Dynamic Surface Water Extent from Sentinel-1 (Version 1).' + ARN: arn:aws:s3:::podaac-ops-cumulus-protected/OPERA_L3_DSWX-S1_V1 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://archive.podaac.earthdata.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Working with OPERA Dynamic Surface Water Extent (DSWx) Data + URL: https://podaac.github.io/tutorials/notebooks/datasets/OPERA_GIS_Cloud.html + AuthorName: Nicholas Tarpinian + NotebookURL: https://github.com/podaac/tutorials/blob/master/notebooks/datasets/OPERA_GIS_Cloud.ipynb + - Title: Generate Flood Maps without downloading OPERA DSWx-S1 products locally + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DSWx/Flood/Brazil_DSWx-S1_FloodProduct.ipynb + AuthorName: S.
Sangha diff --git a/datasets/nasa-sentinel-1adpgrdhigh.yaml b/datasets/nasa-sentinel-1adpgrdhigh.yaml new file mode 100644 index 000000000..11c815e5b --- /dev/null +++ b/datasets/nasa-sentinel-1adpgrdhigh.yaml @@ -0,0 +1,43 @@ +Name: SENTINEL-1A_DUAL_POL_GRD_HIGH_RES +Description: |- + Sentinel-1A Dual-pol ground projected high and full resolution images + Read our doc on how to get AWS Credentials to retrieve this data: https://sentinel1.asf.alaska.edu/s3credentialsREADME +Documentation: https://webdocs.asf.alaska.edu/Sentinel-1/Sentinel-1-User-Guide.pdf +Contact: 'Email: uso@asf.alaska.edu. Home Page: https://www.asf.alaska.edu/' +ManagedBy: NASA +UpdateFrequency: From 2014-04-03 to Ongoing +Tags: + - aws-pds + - agriculture + - coastal + - earth observation + - earthquakes + - ecosystems + - ice + - land + - land cover + - land use + - metadata + - oceans + - radar + - sentinel-1 + - stac + - surface water + - synthetic aperture radar + - tiff + - urban + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'SENTINEL-1A_DUAL_POL_GRD_HIGH_RES.' 
+ ARN: arn:aws:s3:::asf-ngap2w-p-s1-grd-7d1b4348 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://sentinel1.asf.alaska.edu/s3credentials +DataAtWork: + Tutorials: + - Title: Interferometric Synthetic Aperture Radar Tutorial + URL: https://github.com/live-eo/sentinel1-slc/blob/main/docs/tutorial_InSAR.md + AuthorName: LiveEO + AuthorURL: https://live-eo.com/ diff --git a/datasets/nasa-sentinel-1aslc.yaml b/datasets/nasa-sentinel-1aslc.yaml new file mode 100644 index 000000000..c005f9be3 --- /dev/null +++ b/datasets/nasa-sentinel-1aslc.yaml @@ -0,0 +1,42 @@ +Name: SENTINEL-1A_SLC +Description: |- + Sentinel-1A slant-range product + Read our doc on how to get AWS Credentials to retrieve this data: https://sentinel1.asf.alaska.edu/s3credentialsREADME +Documentation: https://webdocs.asf.alaska.edu/Sentinel-1/Sentinel-1-User-Guide.pdf +Contact: 'Email: uso@asf.alaska.edu. Home Page: https://www.asf.alaska.edu/' +ManagedBy: NASA +UpdateFrequency: From 2014-04-03 to Ongoing +Tags: + - aws-pds + - coastal + - earthquakes + - ecosystems + - ice + - land + - land cover + - land use + - metadata + - oceans + - orbit + - radar + - sentinel-1 + - stac + - surface water + - synthetic aperture radar + - tiff + - urban + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'SENTINEL-1A_SLC.' 
+ ARN: arn:aws:s3::asf-ngap2w-p-s1-slc-7b420b89 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://sentinel1.asf.alaska.edu/s3credentials +DataAtWork: + Tutorials: + - Title: Interferometric Synthetic Aperture Radar Tutorial + URL: https://github.com/live-eo/sentinel1-slc/blob/main/docs/tutorial_InSAR.md + AuthorName: LiveEO + AuthorURL: https://live-eo.com/ diff --git a/datasets/nasa-sentinel-1bdpgrdhigh.yaml b/datasets/nasa-sentinel-1bdpgrdhigh.yaml new file mode 100644 index 000000000..e6b3bd03b --- /dev/null +++ b/datasets/nasa-sentinel-1bdpgrdhigh.yaml @@ -0,0 +1,42 @@ +Name: SENTINEL-1B_DUAL_POL_GRD_HIGH_RES +Description: |- + Sentinel-1B Dual-pol ground projected high and full resolution images + Read our doc on how to get AWS Credentials to retrieve this data: https://sentinel1.asf.alaska.edu/s3credentialsREADME +Documentation: https://webdocs.asf.alaska.edu/Sentinel-1/Sentinel-1-User-Guide.pdf +Contact: 'Email: uso@asf.alaska.edu. Home Page: https://www.asf.alaska.edu/' +ManagedBy: NASA +UpdateFrequency: From 2016-04-25 to 2021-12-24 +Tags: + - aws-pds + - agriculture + - coastal + - earthquakes + - ecosystems + - ice + - land + - land cover + - land use + - metadata + - oceans + - radar + - sentinel-1 + - stac + - surface water + - synthetic aperture radar + - tiff + - urban + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'SENTINEL-1B_DUAL_POL_GRD_HIGH_RES.' 
+ ARN: arn:aws:s3:::asf-ngap2w-p-s1-grd-7d1b4348 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://sentinel1.asf.alaska.edu/s3credentials +DataAtWork: + Tutorials: + - Title: Interferometric Synthetic Aperture Radar Tutorial + URL: https://github.com/live-eo/sentinel1-slc/blob/main/docs/tutorial_InSAR.md + AuthorName: LiveEO + AuthorURL: https://live-eo.com/ diff --git a/datasets/nasa-sentinel-1bslc.yaml b/datasets/nasa-sentinel-1bslc.yaml new file mode 100644 index 000000000..b178c3ee0 --- /dev/null +++ b/datasets/nasa-sentinel-1bslc.yaml @@ -0,0 +1,43 @@ +Name: SENTINEL-1B_SLC +Description: |- + Sentinel-1B slant-range product + Read our doc on how to get AWS Credentials to retrieve this data: https://sentinel1.asf.alaska.edu/s3credentialsREADME +Documentation: https://webdocs.asf.alaska.edu/Sentinel-1/Sentinel-1-User-Guide.pdf +Contact: 'Email: uso@asf.alaska.edu. Home Page: https://www.asf.alaska.edu/' +ManagedBy: NASA +UpdateFrequency: From 2016-04-25 to 2021-12-24 +Tags: + - aws-pds + - agriculture + - coastal + - earthquakes + - ecosystems + - ice + - land + - land cover + - land use + - metadata + - oceans + - orbit + - radar + - sentinel-1 + - stac + - surface water + - synthetic aperture radar + - tiff + - urban + - water +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'SENTINEL-1B_SLC.' 
+ ARN: arn:aws:s3:::asf-ngap2w-p-s1-slc-7b420b89 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://sentinel1.asf.alaska.edu/s3credentials +DataAtWork: + Tutorials: + - Title: Interferometric Synthetic Aperture Radar Tutorial + URL: https://github.com/live-eo/sentinel1-slc/blob/main/docs/tutorial_InSAR.md + AuthorName: LiveEO + AuthorURL: https://live-eo.com/ From d420625e51262c3cdef39fcfec949ca082644e25 Mon Sep 17 00:00:00 2001 From: mgrover1 Date: Fri, 8 Aug 2025 09:04:04 -0600 Subject: [PATCH 198/751] ADD: Add MRMS Cookbook from Pythia --- datasets/noaa-mrms-pds.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/noaa-mrms-pds.yaml b/datasets/noaa-mrms-pds.yaml index 64bc5c953..adb4a58ab 100644 --- a/datasets/noaa-mrms-pds.yaml +++ b/datasets/noaa-mrms-pds.yaml @@ -48,6 +48,10 @@ Resources: Region: us-east-1 Type: SNS Topic DataAtWork: + Tools & Applications: + - Title: Collection of Jupyter Notebooks using Python for working with MRMS Data + URL: https://projectpythia.org/mrms-cookbook/ + AuthorName: Project Pythia Community Publications: - Title: "Multi-Radar Multi-Sensor (MRMS) Quantitative Precipitation Estimation: Initial Operating Capabilities" URL: https://journals.ametsoc.org/view/journals/bams/97/4/bams-d-14-00174.1.xml From 6895f5538f67b870a1375ad5fc7037d15994c700 Mon Sep 17 00:00:00 2001 From: KaseyW31 Date: Fri, 8 Aug 2025 11:17:57 -0400 Subject: [PATCH 199/751] remove duplicate ARNs --- datasets/nasa-mcd43a1.yaml | 1 - datasets/nasa-mcd43a3.yaml | 1 - datasets/nasa-mcd43a4.yaml | 1 - datasets/nasa-mi1b2e.yaml | 1 - 4 files changed, 4 deletions(-) diff --git a/datasets/nasa-mcd43a1.yaml b/datasets/nasa-mcd43a1.yaml index b63b87746..240dac694 100644 --- a/datasets/nasa-mcd43a1.yaml +++ b/datasets/nasa-mcd43a1.yaml @@ -43,7 +43,6 @@ Resources: - Description: 'MODIS/Terra+Aqua BRDF/Albedo Model Parameters Daily L3 Global - 500m V061.' 
ARN: arn:aws:s3:::lp-prod-protected/MCD43A1.061 - ARN: arn:aws:s3:::lp-prod-protected/MCD43A1.061 Region: us-west-2 Type: S3 Bucket RequesterPays: false diff --git a/datasets/nasa-mcd43a3.yaml b/datasets/nasa-mcd43a3.yaml index 6f0701e66..2f9465edf 100644 --- a/datasets/nasa-mcd43a3.yaml +++ b/datasets/nasa-mcd43a3.yaml @@ -32,7 +32,6 @@ Tags: License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' Resources: - Description: 'MODIS/Terra+Aqua BRDF/Albedo Albedo Daily L3 Global - 500m V061.' - ARN: arn:aws:s3:::lp-prod-protected/MCD43A3.061 ARN: arn:aws:s3:::lp-prod-protected/MCD43A3.061 Region: us-west-2 Type: S3 Bucket diff --git a/datasets/nasa-mcd43a4.yaml b/datasets/nasa-mcd43a4.yaml index 67afdaf70..373c35bbc 100644 --- a/datasets/nasa-mcd43a4.yaml +++ b/datasets/nasa-mcd43a4.yaml @@ -34,7 +34,6 @@ Resources: - Description: 'MODIS/Terra+Aqua BRDF/Albedo Nadir BRDF-Adjusted Ref Daily L3 Global - 500m V061.' ARN: arn:aws:s3:::lp-prod-protected/MCD43A4.061 - ARN: arn:aws:s3:::lp-prod-protected/MCD43A4.061 Region: us-west-2 Type: S3 Bucket RequesterPays: false diff --git a/datasets/nasa-mi1b2e.yaml b/datasets/nasa-mi1b2e.yaml index 3df062f3c..c09e05b93 100644 --- a/datasets/nasa-mi1b2e.yaml +++ b/datasets/nasa-mi1b2e.yaml @@ -32,7 +32,6 @@ Tags: License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' Resources: - Description: 'MISR Level 1B2 Ellipsoid Data V004. 
(Format: netCDF-4)' - ARN: arn:aws:s3:::asdc-prod-protected/MISR/MI1B2E_004 ARN: arn:aws:s3:::asdc-prod-protected/MISR/MI1B2E_004 Region: us-west-2 Type: S3 Bucket From caed2a5646603f52f1aff43c564409cc22587d50 Mon Sep 17 00:00:00 2001 From: KaseyW31 Date: Fri, 8 Aug 2025 11:18:22 -0400 Subject: [PATCH 200/751] add opera dist-ann dataset --- datasets/nasa-operal3dist-ann-hlsv1.yaml | 65 ++++++++++++++++++++++++ 1 file changed, 65 insertions(+) create mode 100644 datasets/nasa-operal3dist-ann-hlsv1.yaml diff --git a/datasets/nasa-operal3dist-ann-hlsv1.yaml b/datasets/nasa-operal3dist-ann-hlsv1.yaml new file mode 100644 index 000000000..d7e7ef843 --- /dev/null +++ b/datasets/nasa-operal3dist-ann-hlsv1.yaml @@ -0,0 +1,65 @@ +Name: OPERA Land Surface Disturbance Annual from Harmonized Landsat Sentinel-2 product + (Version 1) +Description: "The Observational Products for End-Users from Remote Sensing Analysis + ([OPERA](https://www.jpl.nasa.gov/go/opera)) Land Surface Disturbance Annual from + Harmonized Landsat Sentinel-2 (HLS) product Version 1 summarizes the [DIST-ALERT](https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ALERT-HLS_V1.001) + data product into an annual vegetation disturbance data product. Vegetation disturbance + is mapped when there is an indicated decrease in vegetation cover within an HLS + Version 2 pixel. The product also provides auxiliary generic disturbance information + as determined from the variations of the reflectance through the DIST-ALERT scenes + to provide information about more general disturbance trends. The DIST-ANN product + tracks changes at the annual scale, aggregating changes identified in the DIST-ALERT + product. Only confirmed disturbances from the associated year are reported together + with the date of initial disturbance. As confirmed disturbances are determined using + subsequent cloud-free observations to determine if the loss detections persist, + the required number of HLS scenes depends on visibility of the target. 
Due to this + dependency, summarizing the DIST-ALERT in the DIST-ANN product will have some latency + contingent on the algorithmic calibration and is detailed in the Algorithm Theoretical + Basis Document (ATBD).\n\nThe OPERA_L3_DIST-ANN-HLS (or DIST-ANN) data product is + provided in Cloud Optimized GeoTIFF (COG) format, and each layer is distributed + as a separate COG. There are 21 layers contained within the DIST-ANN product: vegetation + disturbance status, historical vegetation cover indicator, maximum vegetation cover + indicator, maximum vegetation anomaly value, vegetation disturbance confidence layer, + date of initial vegetation disturbance, number of detected vegetation loss anomalies, + vegetation disturbance duration, date of last observation assessed for vegetation + disturbance, and several generic disturbance layers. Each product layer is gridded + to the same resolution and tiling system as HLS V2: 30 meter (m) and Military Grid + Reference System (MGRS). See the Product Specification Document (PSD) for a more + detailed description of the individual layers provided in the DIST-ANN product. + \n\nThe OPERA_L3_DIST-ANN-HLS product contains modified Copernicus Sentinel data + (2020-2025).\n\nKnown Issues\n* Additional usage constraints are provided under + Section 5 of the Algorithm Theoretical Basis Document (ATBD).\nRead our doc on how + to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME" +Documentation: https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ANN-HLS_V1.001 +Contact: 'Email: lpdaac@usgs.gov. 
Home Page: https://lpdaac.usgs.gov/' +ManagedBy: NASA +UpdateFrequency: From 2022-01-01 to Ongoing (Annual) +Tags: + - aws-pds + - cog + - earth observation + - environmental + - global + - land + - land cover + - land use +License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' +Resources: + - Description: 'OPERA Land Surface Disturbance Annual from Harmonized Landsat Sentinel-2 + product (Version 1).' + ARN: arn:aws:s3:::lp-protected/OPERA_L3_DIST-ANN-HLS_V1.001 + Region: us-west-2 + Type: S3 Bucket + RequesterPays: false + ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials +DataAtWork: + Tutorials: + - Title: Visualization and Exploration of OPERA DIST-ANN-HLS Product Layers + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ANN/Intro_To_DIST_ANN.ipynb + AuthorName: C. Speed and M. Grace Bato + - Title: Visualizing and Analyzing the OPERA DIST-ANN-HLS Product to Explore Land-Use Change in Brazil + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ANN/Land_Use_Change_DIST_ANN.ipynb + AuthorName: C. Speed and M. Grace Bato + - Title: Visualizing and Analyzing the OPERA DIST-ANN-HLS Product to Visualize Wildfire Impact in Northern Quebec + URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ANN/Wildfires_DIST_ANN.ipynb + AuthorName: C. Speed and M. 
Grace Bato From 0472a7b0db0f21de177a0fa7a4250aa451f47d6e Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 8 Aug 2025 07:24:15 -0800 Subject: [PATCH 201/751] Update nasa-operal3dswx-s1v1.yaml --- datasets/nasa-operal3dswx-s1v1.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/nasa-operal3dswx-s1v1.yaml b/datasets/nasa-operal3dswx-s1v1.yaml index dfcb555a0..7aa2340e4 100644 --- a/datasets/nasa-operal3dswx-s1v1.yaml +++ b/datasets/nasa-operal3dswx-s1v1.yaml @@ -12,7 +12,7 @@ Description: "This dataset contains Level-3 Dynamic OPERA Surface Water Extent f OPERA DSWx-S1 product contains modified Copernicus Sentinel data (2024-2025).\n

\nTo access the calibration/validation database for OPERA Dynamic Surface Water Extent Products, please contact podaac@podaac.jpl.nasa.gov \nRead our doc on how to get - AWS Credentials to retrieve this data: https://archive.podaac.earthdata.nasa.gov/s3credentialsREADME + AWS Credentials to retrieve this data: https://archive.podaac.earthdata.nasa.gov/s3credentialsREADME" Documentation: https://doi.org/10.5067/OPDSWS1-L3V1 Contact: 'Help Desk: podaac@podaac.jpl.nasa.gov. Home Page: https://podaac.jpl.nasa.gov/' ManagedBy: NASA From e6b1ffd82b30ade840af6137db075df96d21ef9a Mon Sep 17 00:00:00 2001 From: lizadams Date: Fri, 8 Aug 2025 12:49:43 -0400 Subject: [PATCH 202/751] update --- datasets/cmas-data-warehouse.yaml | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/datasets/cmas-data-warehouse.yaml b/datasets/cmas-data-warehouse.yaml index 2cc8125a6..dcfb1d9d4 100644 --- a/datasets/cmas-data-warehouse.yaml +++ b/datasets/cmas-data-warehouse.yaml @@ -60,18 +60,25 @@ Resources: Type: S3 Bucket Explore: - '[Browse Bucket](https://cmas-equates.s3.amazonaws.com/index.html)' + - Description: CMAQ 2022 Modeling Platform + ARN: arn:aws:s3:::2022platform + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://epa-2022-modeling-platform.s3.amazonaws.com/index.html)' + - '[OAQPS 2022 Modeling Platform](https://registry.opendata.aws/epa-2022-modeling-platform/)' - Description: CMAQ 2021 Modeling Platform ARN: arn:aws:s3:::2021platform Region: us-east-1 Type: S3 Bucket Explore: - - '[Browse Bucket](https://2021platform.s3.amazonaws.com/readme.html)' + - '[Browse Bucket](https://2021platform.s3.amazonaws.com/index.html)' - Description: CMAQ 2019 Modeling Platform ARN: arn:aws:s3:::cmaq-2019-modeling-platform Region: us-east-1 Type: S3 Bucket Explore: - - '[Browse Bucket](https://cmaq-2019-modeling-platform.s3.amazonaws.com/readme.html)' + - '[Browse Bucket](https://cmaq-2019-modeling-platform.s3.amazonaws.com/index.html)' - 
Description: CMAQ 2018 Modeling Platform ARN: arn:aws:s3:::cmas-cmaq-modeling-platform-2018 Region: us-east-1 From 2d0ce64b748ff8b6559eca56df3083186134163a Mon Sep 17 00:00:00 2001 From: lizadams Date: Fri, 8 Aug 2025 12:51:40 -0400 Subject: [PATCH 203/751] update --- datasets/cmas-data-warehouse.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/cmas-data-warehouse.yaml b/datasets/cmas-data-warehouse.yaml index dcfb1d9d4..3367861ec 100644 --- a/datasets/cmas-data-warehouse.yaml +++ b/datasets/cmas-data-warehouse.yaml @@ -60,8 +60,8 @@ Resources: Type: S3 Bucket Explore: - '[Browse Bucket](https://cmas-equates.s3.amazonaws.com/index.html)' - - Description: CMAQ 2022 Modeling Platform - ARN: arn:aws:s3:::2022platform + - Description: EPA 2022 Modeling Platform + ARN: arn:aws:s3:::epa-2022-modeling-platform Region: us-east-1 Type: S3 Bucket Explore: From 6fb8b38105919de066a5a6d63592db60b7dfe741 Mon Sep 17 00:00:00 2001 From: KaseyW31 Date: Fri, 8 Aug 2025 14:32:35 -0400 Subject: [PATCH 204/751] modify tags, urls, other minor fixes --- datasets/nasa-gpm3imergde.yaml | 6 ++---- datasets/nasa-gpm3imergdf.yaml | 8 +++----- datasets/nasa-gpm3imergdl.yaml | 6 ++---- datasets/nasa-gpm3imerghh.yaml | 6 ++---- datasets/nasa-gpm3imerghhe.yaml | 6 ++---- datasets/nasa-gpm3imerghhl.yaml | 6 ++---- datasets/nasa-gpm3imergm.yaml | 12 +++++------- datasets/nasa-m2i3npasm.yaml | 9 ++++++++- datasets/nasa-m2i3nvaer.yaml | 9 ++++++++- datasets/nasa-m2i3nvasm.yaml | 9 ++++++++- datasets/nasa-m2t1nxslv.yaml | 7 +++++++ datasets/nasa-operal2rtc-s1v1.yaml | 2 ++ datasets/nasa-operal3disp-s1v1.yaml | 2 +- datasets/nasa-sentinel-1aslc.yaml | 2 +- 14 files changed, 53 insertions(+), 37 deletions(-) diff --git a/datasets/nasa-gpm3imergde.yaml b/datasets/nasa-gpm3imergde.yaml index feb4d53ca..c359dba0f 100644 --- a/datasets/nasa-gpm3imergde.yaml +++ b/datasets/nasa-gpm3imergde.yaml @@ -93,16 +93,14 @@ DataAtWork: Publications: - Title: Precipitation Estimation 
from Remotely Sensed Imagery Using an Artificial Neural Network Cloud Classification System + URL: https://doi.org/10.1175/JAM2173.1 AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao - - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, - Combined-Sensor Precipitation Estimates at Fine Scale.' - AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, - K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff - Title: Kalman Filter Based CMORPH URL: https://doi.org/10.1175/JHM-D-11-022.1 AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface Using In-Situ Gauge Observations + URL: https://www.researchgate.net/profile/Udo-Schneider-4/publication/253114707_Calculation_of_Gridded_Precipitation_Data_for_the_Global_Land-Surface_using_in-situ_Gauge_Observations/links/0deec53bbcb3a0e220000000/Calculation-of-Gridded-Precipitation-Data-for-the-Global-Land-Surface-using-in-situ-Gauge-Observations.pdf AuthorName: Rudolf, B., and U. Schneider Tutorials: - Title: How to Access GES DISC Data Using Python diff --git a/datasets/nasa-gpm3imergdf.yaml b/datasets/nasa-gpm3imergdf.yaml index 9d60d0f30..c38bf54e1 100644 --- a/datasets/nasa-gpm3imergdf.yaml +++ b/datasets/nasa-gpm3imergdf.yaml @@ -77,7 +77,7 @@ Tags: - ice - land - metadata - - netCDF + - netcdf - opendap License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' Resources: @@ -92,16 +92,14 @@ DataAtWork: Publications: - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial Neural Network Cloud Classification System + URL: https://doi.org/10.1175/JAM2173.1 AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao - - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, - Combined-Sensor Precipitation Estimates at Fine Scale.' - AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, - K. 
P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff - Title: Kalman Filter Based CMORPH URL: https://doi.org/10.1175/JHM-D-11-022.1 AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface Using In-Situ Gauge Observations + URL: https://www.researchgate.net/profile/Udo-Schneider-4/publication/253114707_Calculation_of_Gridded_Precipitation_Data_for_the_Global_Land-Surface_using_in-situ_Gauge_Observations/links/0deec53bbcb3a0e220000000/Calculation-of-Gridded-Precipitation-Data-for-the-Global-Land-Surface-using-in-situ-Gauge-Observations.pdf AuthorName: Rudolf, B., and U. Schneider Tutorials: - Title: How to Access GES DISC Data Using Python diff --git a/datasets/nasa-gpm3imergdl.yaml b/datasets/nasa-gpm3imergdl.yaml index abd004a2a..a0cbfec95 100644 --- a/datasets/nasa-gpm3imergdl.yaml +++ b/datasets/nasa-gpm3imergdl.yaml @@ -93,16 +93,14 @@ DataAtWork: Publications: - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial Neural Network Cloud Classification System + URL: https://doi.org/10.1175/JAM2173.1 AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao - - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, - Combined-Sensor Precipitation Estimates at Fine Scale.' - AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, - K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff - Title: Kalman Filter Based CMORPH URL: https://doi.org/10.1175/JHM-D-11-022.1 AuthorName: Joyce, R. J., P. Xie, and J. E. 
Janowiak - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface Using In-Situ Gauge Observations + URL: https://www.researchgate.net/profile/Udo-Schneider-4/publication/253114707_Calculation_of_Gridded_Precipitation_Data_for_the_Global_Land-Surface_using_in-situ_Gauge_Observations/links/0deec53bbcb3a0e220000000/Calculation-of-Gridded-Precipitation-Data-for-the-Global-Land-Surface-using-in-situ-Gauge-Observations.pdf AuthorName: Rudolf, B., and U. Schneider Tutorials: - Title: How to Access GES DISC Data Using Python diff --git a/datasets/nasa-gpm3imerghh.yaml b/datasets/nasa-gpm3imerghh.yaml index ef6dbc01e..c48629508 100644 --- a/datasets/nasa-gpm3imerghh.yaml +++ b/datasets/nasa-gpm3imerghh.yaml @@ -54,16 +54,14 @@ DataAtWork: Publications: - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial Neural Network Cloud Classification System + URL: https://doi.org/10.1175/JAM2173.1 AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao - - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, - Combined-Sensor Precipitation Estimates at Fine Scale.' - AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, - K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff - Title: Kalman Filter Based CMORPH URL: https://doi.org/10.1175/JHM-D-11-022.1 AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface Using In-Situ Gauge Observations + URL: https://www.researchgate.net/profile/Udo-Schneider-4/publication/253114707_Calculation_of_Gridded_Precipitation_Data_for_the_Global_Land-Surface_using_in-situ_Gauge_Observations/links/0deec53bbcb3a0e220000000/Calculation-of-Gridded-Precipitation-Data-for-the-Global-Land-Surface-using-in-situ-Gauge-Observations.pdf AuthorName: Rudolf, B., and U. 
Schneider Tutorials: - Title: How to Read IMERG Data Using Python diff --git a/datasets/nasa-gpm3imerghhe.yaml b/datasets/nasa-gpm3imerghhe.yaml index e727bef5b..dd3e638dd 100644 --- a/datasets/nasa-gpm3imerghhe.yaml +++ b/datasets/nasa-gpm3imerghhe.yaml @@ -81,16 +81,14 @@ DataAtWork: Publications: - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial Neural Network Cloud Classification System + URL: https://doi.org/10.1175/JAM2173.1 AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao - - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, - Combined-Sensor Precipitation Estimates at Fine Scale.' - AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, - K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff - Title: Kalman Filter Based CMORPH URL: https://doi.org/10.1175/JHM-D-11-022.1 AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface Using In-Situ Gauge Observations + URL: https://www.researchgate.net/profile/Udo-Schneider-4/publication/253114707_Calculation_of_Gridded_Precipitation_Data_for_the_Global_Land-Surface_using_in-situ_Gauge_Observations/links/0deec53bbcb3a0e220000000/Calculation-of-Gridded-Precipitation-Data-for-the-Global-Land-Surface-using-in-situ-Gauge-Observations.pdf AuthorName: Rudolf, B., and U. Schneider Tutorials: - Title: How to Read IMERG Data Using Python diff --git a/datasets/nasa-gpm3imerghhl.yaml b/datasets/nasa-gpm3imerghhl.yaml index 7d9435683..b524526a1 100644 --- a/datasets/nasa-gpm3imerghhl.yaml +++ b/datasets/nasa-gpm3imerghhl.yaml @@ -82,16 +82,14 @@ DataAtWork: Publications: - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial Neural Network Cloud Classification System + URL: https://doi.org/10.1175/JAM2173.1 AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. 
Gao - - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, - Combined-Sensor Precipitation Estimates at Fine Scale.' - AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, - K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff - Title: Kalman Filter Based CMORPH URL: https://doi.org/10.1175/JHM-D-11-022.1 AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface Using In-Situ Gauge Observations + URL: https://www.researchgate.net/profile/Udo-Schneider-4/publication/253114707_Calculation_of_Gridded_Precipitation_Data_for_the_Global_Land-Surface_using_in-situ_Gauge_Observations/links/0deec53bbcb3a0e220000000/Calculation-of-Gridded-Precipitation-Data-for-the-Global-Land-Surface-using-in-situ-Gauge-Observations.pdf AuthorName: Rudolf, B., and U. Schneider Tutorials: - Title: How to Read IMERG Data Using Python diff --git a/datasets/nasa-gpm3imergm.yaml b/datasets/nasa-gpm3imergm.yaml index d8c94152e..090e68364 100644 --- a/datasets/nasa-gpm3imergm.yaml +++ b/datasets/nasa-gpm3imergm.yaml @@ -54,16 +54,14 @@ DataAtWork: Publications: - Title: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial Neural Network Cloud Classification System - AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao, 2004 - - Title: 'The TRMM Multi-satellite Precipitation Analysis: Quasi-Global, Multi-Year, - Combined-Sensor Precipitation Estimates at Fine Scale' - AuthorName: Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, - K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff + URL: https://doi.org/10.1175/JAM2173.1 + AuthorName: Hong, Y., K. L. Hsu, S. Sorooshian, and X. Gao - Title: Kalman Filter Based CMORPH + URL: https://doi.org/10.1175/JHM-D-11-022.1 AuthorName: Joyce, R. J., P. Xie, and J. E. Janowiak - Title: Calculation of Gridded Precipitation Data for the Global Land-Surface - Using In-Situ Gauge Observations. 
Proc. of the 2nd Internat. Precip. Working - Group Workshop + Using In-Situ Gauge Observations + URL: https://www.researchgate.net/profile/Udo-Schneider-4/publication/253114707_Calculation_of_Gridded_Precipitation_Data_for_the_Global_Land-Surface_using_in-situ_Gauge_Observations/links/0deec53bbcb3a0e220000000/Calculation-of-Gridded-Precipitation-Data-for-the-Global-Land-Surface-using-in-situ-Gauge-Observations.pdf AuthorName: Rudolf, B., and U. Schneider Tutorials: - Title: How to Read IMERG Data Using Python diff --git a/datasets/nasa-m2i3npasm.yaml b/datasets/nasa-m2i3npasm.yaml index d1f4b59c3..16ed297d7 100644 --- a/datasets/nasa-m2i3npasm.yaml +++ b/datasets/nasa-m2i3npasm.yaml @@ -42,7 +42,7 @@ Tags: - ice - land - metadata - - netCDF + - netcdf - opendap - water License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' @@ -85,14 +85,17 @@ DataAtWork: AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, R. D. Koster, and G. J. M. De Lannoy - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' + URL: ~ AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, S. Pawson, O. Reale, S. D. Schubert, and M. Suarez - Title: Data assimilation using incremental analysis updates + URL: https://doi.org/10.1175/1520-0493(1996)124<1256:DAUIAU>2.0.CO;2 AuthorName: Bloom, S., L. Takacs, A. DaSilva, and D. Ledvina - Title: Documentation and Validation of the Goddard Earth Observing System (GEOS) Data Assimilation System - Version 4 + URL: https://ntrs.nasa.gov/citations/20050175690 AuthorName: Bloom, S., A. da Silva, D. Dee, M. Bosilovich, J.-D. Chern, S. Pawson, S. Schubert, M. Sienkiewicz, I. Stajner, W.-W. Tan, M.-L. 
Wu - Title: Design and implementation of components in the Earth System Modeling @@ -102,16 +105,20 @@ DataAtWork: Balaji, P. Li, W. Yang, C. Hill, and A. da Silva - Title: A catchment-based approach to modeling land surface processes in a GCM, Part 1, Model Structure + URL: ~ AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian covariances' + URL: https://doi.org/10.1175//1520-0493(2003)131<1524:NAOTAO>2.0.CO;2 AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part II: Spatially inhomogeneous and anisotropic general covariances' + URL: https://doi.org/10.1175//2543.1 AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts - Title: Three-dimensional variational analysis with spatially inhomogeneous covariances + URL: https://doi.org/10.1175/1520-0493(2002)130<2905:TDVAWS>2.0.CO;2 AuthorName: Wu, W.-S., R.J. Purser and D.F. Parrish Tutorials: - Title: How to Access GES DISC Data Using Python diff --git a/datasets/nasa-m2i3nvaer.yaml b/datasets/nasa-m2i3nvaer.yaml index 1a73d8b3a..fe861e154 100644 --- a/datasets/nasa-m2i3nvaer.yaml +++ b/datasets/nasa-m2i3nvaer.yaml @@ -44,7 +44,7 @@ Tags: - ice - land - metadata - - netCDF + - netcdf - opendap - water License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' @@ -87,14 +87,17 @@ DataAtWork: AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, R. D. Koster, and G. J. M. De Lannoy - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' + URL: ~ AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, L. Takacs, Y. Vikhliaev, S. Bloom, A. 
Collow, S. Firth, G. Labow, G. Partyka, S. Pawson, O. Reale, S. D. Schubert, and M. Suarez - Title: Data assimilation using incremental analysis updates + URL: https://doi.org/10.1175/1520-0493(1996)124<1256:DAUIAU>2.0.CO;2 AuthorName: Bloom, S., L. Takacs, A. DaSilva, and D. Ledvina - Title: Documentation and Validation of the Goddard Earth Observing System (GEOS) Data Assimilation System - Version 4 + URL: https://ntrs.nasa.gov/citations/20050175690 AuthorName: Bloom, S., A. da Silva, D. Dee, M. Bosilovich, J.-D. Chern, S. Pawson, S. Schubert, M. Sienkiewicz, I. Stajner, W.-W. Tan, M.-L. Wu - Title: Design and implementation of components in the Earth System Modeling @@ -104,16 +107,20 @@ DataAtWork: Balaji, P. Li, W. Yang, C. Hill, and A. da Silva - Title: A catchment-based approach to modeling land surface processes in a GCM, Part 1, Model Structure + URL: ~ AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian covariances' + URL: https://doi.org/10.1175//1520-0493(2003)131<1524:NAOTAO>2.0.CO;2 AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part II: Spatially inhomogeneous and anisotropic general covariances' + URL: https://doi.org/10.1175//2543.1 AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts - Title: Three-dimensional variational analysis with spatially inhomogeneous covariances + URL: https://doi.org/10.1175/1520-0493(2002)130<2905:TDVAWS>2.0.CO;2 AuthorName: Wu, W.-S., R.J. Purser and D.F. 
Parrish Tutorials: - Title: How to Access GES DISC Data Using Python diff --git a/datasets/nasa-m2i3nvasm.yaml b/datasets/nasa-m2i3nvasm.yaml index d1b39f14a..aa2316b35 100644 --- a/datasets/nasa-m2i3nvasm.yaml +++ b/datasets/nasa-m2i3nvasm.yaml @@ -43,7 +43,7 @@ Tags: - ice - land - metadata - - netCDF + - netcdf - opendap - water License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' @@ -86,14 +86,17 @@ DataAtWork: AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, R. D. Koster, and G. J. M. De Lannoy - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' + URL: ~ AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, S. Pawson, O. Reale, S. D. Schubert, and M. Suarez - Title: Data assimilation using incremental analysis updates + URL: https://doi.org/10.1175/1520-0493(1996)124<1256:DAUIAU>2.0.CO;2 AuthorName: Bloom, S., L. Takacs, A. DaSilva, and D. Ledvina - Title: Documentation and Validation of the Goddard Earth Observing System (GEOS) Data Assimilation System - Version 4 + URL: https://ntrs.nasa.gov/citations/20050175690 AuthorName: Bloom, S., A. da Silva, D. Dee, M. Bosilovich, J.-D. Chern, S. Pawson, S. Schubert, M. Sienkiewicz, I. Stajner, W.-W. Tan, M.-L. Wu - Title: Design and implementation of components in the Earth System Modeling @@ -103,16 +106,20 @@ DataAtWork: Balaji, P. Li, W. Yang, C. Hill, and A. da Silva - Title: A catchment-based approach to modeling land surface processes in a GCM, Part 1, Model Structure + URL: ~ AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. 
Part I: Spatially homogeneous and isotropic Gaussian covariances' + URL: https://doi.org/10.1175//1520-0493(2003)131<1524:NAOTAO>2.0.CO;2 AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part II: Spatially inhomogeneous and anisotropic general covariances' + URL: https://doi.org/10.1175//2543.1 AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts - Title: Three-dimensional variational analysis with spatially inhomogeneous covariances + URL: https://doi.org/10.1175/1520-0493(2002)130<2905:TDVAWS>2.0.CO;2 AuthorName: Wu, W.-S., R.J. Purser and D.F. Parrish Tutorials: - Title: How to Access GES DISC Data Using Python diff --git a/datasets/nasa-m2t1nxslv.yaml b/datasets/nasa-m2t1nxslv.yaml index 9b413ae22..90481a707 100644 --- a/datasets/nasa-m2t1nxslv.yaml +++ b/datasets/nasa-m2t1nxslv.yaml @@ -86,14 +86,17 @@ DataAtWork: AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, R. D. Koster, and G. J. M. De Lannoy - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' + URL: ~ AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, S. Pawson, O. Reale, S. D. Schubert, and M. Suarez - Title: Data assimilation using incremental analysis updates + URL: https://doi.org/10.1175/1520-0493(1996)124<1256:DAUIAU>2.0.CO;2 AuthorName: Bloom, S., L. Takacs, A. DaSilva, and D. Ledvina - Title: Documentation and Validation of the Goddard Earth Observing System (GEOS) Data Assimilation System - Version 4 + URL: https://ntrs.nasa.gov/citations/20050175690 AuthorName: Bloom, S., A. da Silva, D. Dee, M. Bosilovich, J.-D. Chern, S. Pawson, S. Schubert, M. Sienkiewicz, I. Stajner, W.-W. Tan, M.-L. 
Wu - Title: Design and implementation of components in the Earth System Modeling @@ -103,16 +106,20 @@ DataAtWork: Balaji, P. Li, W. Yang, C. Hill, and A. da Silva - Title: A catchment-based approach to modeling land surface processes in a GCM, Part 1, Model Structure + URL: ~ AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian covariances' + URL: https://doi.org/10.1175//1520-0493(2003)131<1524:NAOTAO>2.0.CO;2 AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part II: Spatially inhomogeneous and anisotropic general covariances' + URL: https://doi.org/10.1175//2543.1 AuthorName: Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts - Title: Three-dimensional variational analysis with spatially inhomogeneous covariances + URL: https://doi.org/10.1175/1520-0493(2002)130<2905:TDVAWS>2.0.CO;2 AuthorName: Wu, W.-S., R.J. Purser and D.F. Parrish Tutorials: - Title: How to Read and Plot NetCDF MERRA-2 Data in Python diff --git a/datasets/nasa-operal2rtc-s1v1.yaml b/datasets/nasa-operal2rtc-s1v1.yaml index 184022cd9..19207462f 100644 --- a/datasets/nasa-operal2rtc-s1v1.yaml +++ b/datasets/nasa-operal2rtc-s1v1.yaml @@ -64,6 +64,8 @@ DataAtWork: URL: https://doi.org/10.1109/TGRS.2022.3147472 AuthorName: Gustavo H. X. Shiroma, Marco Lavalle, and Sean M. 
Buckley - Title: Thermal Denoising of Products Generated by the S-1 IPF + URL: https://sentinels.copernicus.eu/documents/247904/2142675/Thermal-Denoising-of-Products-Generated-by-Sentinel-1-IPF.pdf + AuthorName: ~ Tutorials: - Title: Load, Mosaic, and Visualize OPERA RTC-S1 Data URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/RTC/notebooks/RTC_notebook.ipynb diff --git a/datasets/nasa-operal3disp-s1v1.yaml b/datasets/nasa-operal3disp-s1v1.yaml index f2cc69b77..152195956 100644 --- a/datasets/nasa-operal3disp-s1v1.yaml +++ b/datasets/nasa-operal3disp-s1v1.yaml @@ -25,7 +25,7 @@ Tags: - earth observation - land - metadata - - netCDF + - netcdf - orbit - radar - sentinel-1 diff --git a/datasets/nasa-sentinel-1aslc.yaml b/datasets/nasa-sentinel-1aslc.yaml index c005f9be3..80fe7436a 100644 --- a/datasets/nasa-sentinel-1aslc.yaml +++ b/datasets/nasa-sentinel-1aslc.yaml @@ -29,7 +29,7 @@ Tags: License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' Resources: - Description: 'SENTINEL-1A_SLC.' - ARN: arn:aws:s3::asf-ngap2w-p-s1-slc-7b420b89 + ARN: arn:aws:s3:::asf-ngap2w-p-s1-slc-7b420b89 Region: us-west-2 Type: S3 Bucket RequesterPays: false From d346112952473ecf91cbebc33a62abc6e44221e3 Mon Sep 17 00:00:00 2001 From: KaseyW31 Date: Fri, 8 Aug 2025 14:46:20 -0400 Subject: [PATCH 205/751] update OPERA_L3_DISP-S1_V1 bucket --- datasets/nasa-operal3disp-s1v1.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/nasa-operal3disp-s1v1.yaml b/datasets/nasa-operal3disp-s1v1.yaml index 152195956..36cbcb5e6 100644 --- a/datasets/nasa-operal3disp-s1v1.yaml +++ b/datasets/nasa-operal3disp-s1v1.yaml @@ -36,7 +36,7 @@ License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/ Resources: - Description: 'OPERA Surface Displacement from Sentinel-1 validated product (Version 1).' 
- ARN: arn:aws:s3:::asf-cumulus-prod-opera-product/OPERA_L3_DISP-S1/ + ARN: arn:aws:s3:::asf-cumulus-prod-opera-products/OPERA_L3_DISP-S1_V1/ Region: us-west-2 Type: S3 Bucket RequesterPays: false From e72c1567c4d707faf7d83645d19e4c936d92a3d6 Mon Sep 17 00:00:00 2001 From: KaseyW31 Date: Fri, 8 Aug 2025 14:58:45 -0400 Subject: [PATCH 206/751] shorten names --- datasets/nasa-m2t1nxslv.yaml | 3 +-- datasets/nasa-modis-t-jpl-l2p-v2019-0.yaml | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/datasets/nasa-m2t1nxslv.yaml b/datasets/nasa-m2t1nxslv.yaml index 90481a707..ee9c1b2e5 100644 --- a/datasets/nasa-m2t1nxslv.yaml +++ b/datasets/nasa-m2t1nxslv.yaml @@ -1,5 +1,4 @@ -Name: 'MERRA-2 tavg1_2d_slv_Nx: 2d,1-Hourly,Time-Averaged,Single-Level,Assimilation,Single-Level - Diagnostics 0.625 x 0.5 degree V5.12.4 (M2T1NXSLV) at GES DISC' +Name: 'MERRA-2 tavg1_2d_slv_Nx: 2d,1-Hourly,Time-Averaged,Single-Level,Assimilation,Single-Level Diagnostics 0.625 x 0.5 degree' Description: "M2T1NXSLV (or tavg1_2d_slv_Nx) is an hourly time-averaged 2-dimensional data collection in Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA-2). This collection consists of meteorology diagnostics at popularly diff --git a/datasets/nasa-modis-t-jpl-l2p-v2019-0.yaml b/datasets/nasa-modis-t-jpl-l2p-v2019-0.yaml index ad011caa1..4f431e7eb 100644 --- a/datasets/nasa-modis-t-jpl-l2p-v2019-0.yaml +++ b/datasets/nasa-modis-t-jpl-l2p-v2019-0.yaml @@ -1,5 +1,4 @@ -Name: GHRSST Level 2P Global Sea Surface Skin Temperature from the Moderate Resolution - Imaging Spectroradiometer (MODIS) on the NASA Terra satellite (GDS2) +Name: GHRSST Level 2P Global Sea Surface Skin Temperature from the MODIS on the NASA Terra satellite (GDS2) Description: |- NASA produces skin sea surface temperature (SST) products from the Infrared (IR) channels of the Moderate-resolution Imaging Spectroradiometer (MODIS) onboard the Terra satellite. 
Terra was launched by NASA on December 18, 1999, into a sun synchronous, polar orbit with a daylight descending node at 10:30 am, to study the global dynamics of the Earth atmosphere, land and oceans. The MODIS captures data in 36 spectral bands at a variety of spatial resolutions. Two SST products can be present in these files. The first is a skin SST produced for both day and night observations, derived from the long wave IR 11 and 12 micron wavelength channels, using a modified nonlinear SST algorithm intended to provide continuity with SST derived from heritage and current NASA sensors. At night, a second SST product is produced using the mid-infrared 3.95 and 4.05 micron channels which are unique to MODIS; the SST derived from these measurements is identified as SST4. The SST4 product has lower uncertainty, but due to sun glint can only be produced at night. MODIS L2P SST data have a 1 km spatial resolution at nadir and are stored in 288 five minute granules per day. Full global coverage is obtained every two days, with coverage poleward of 32.3 degree being complete each day. The production of MODIS L2P SST files is part of the Group for High Resolution Sea Surface Temperature (GHRSST) project, and is a joint collaboration between the NASA Jet Propulsion Laboratory (JPL), the NASA Ocean Biology Processing Group (OBPG), and the Rosenstiel School of Marine and Atmospheric Science (RSMAS). Researchers at RSMAS are responsible for SST algorithm development, error statistics and quality flagging, while the OBPG, as the NASA ground data system, is responsible for the production of daily MODIS ocean products. JPL acquires MODIS ocean granules from the OBPG and reformats them to the GHRSST L2P netCDF specification with complete metadata and ancillary variables, and distributes the data as the official Physical Oceanography Data Archive (PO.DAAC) for SST. 
The R2019.0 supersedes the previous R2014.0 datasets which can be found at https://doi.org/10.5067/GHMDT-2PJ02 Read our doc on how to get AWS Credentials to retrieve this data: https://archive.podaac.earthdata.nasa.gov/s3credentialsREADME From 3e0a7d301ed9685d81ce4e41da68ea69977cb999 Mon Sep 17 00:00:00 2001 From: KaseyW31 Date: Fri, 8 Aug 2025 15:05:17 -0400 Subject: [PATCH 207/751] change indentation of tutorials Co-authored-by: EthanGrants <105686621+EthanGrants@users.noreply.github.com> --- datasets/nasa-gpm3imergm.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/nasa-gpm3imergm.yaml b/datasets/nasa-gpm3imergm.yaml index 090e68364..074966231 100644 --- a/datasets/nasa-gpm3imergm.yaml +++ b/datasets/nasa-gpm3imergm.yaml @@ -63,7 +63,7 @@ DataAtWork: Using In-Situ Gauge Observations URL: https://www.researchgate.net/profile/Udo-Schneider-4/publication/253114707_Calculation_of_Gridded_Precipitation_Data_for_the_Global_Land-Surface_using_in-situ_Gauge_Observations/links/0deec53bbcb3a0e220000000/Calculation-of-Gridded-Precipitation-Data-for-the-Global-Land-Surface-using-in-situ-Gauge-Observations.pdf AuthorName: Rudolf, B., and U. Schneider -Tutorials: + Tutorials: - Title: How to Read IMERG Data Using Python URL: https://github.com/nasa/gesdisc-tutorials/blob/main/notebooks/How_to_Read_IMERG_Data_Using_Python.ipynb AuthorName: James Acker, Jerome Alfred, Helen Amos, Chris Battisto, Thomas Hearty, Alexis Hunzinger, Lena Iredell, Christoph Keller, Binita KC, Carlee Loeser, Ariana Louise, Kristan Morgan, Dieu My T. Nguyen, Dana Ostrenga, Xiaohua Pan, Kanan Patel, Brianna R. 
Pagán, Andrey Savtchenko, Elliot Sherman, From 0b7765f26aeb43ba9bfeed845058e94d69f5aa1e Mon Sep 17 00:00:00 2001 From: KaseyW31 Date: Fri, 8 Aug 2025 15:15:45 -0400 Subject: [PATCH 208/751] add urls and author name Co-authored-by: EthanGrants <105686621+EthanGrants@users.noreply.github.com> --- datasets/nasa-m2i3npasm.yaml | 4 ++-- datasets/nasa-m2i3nvaer.yaml | 4 ++-- datasets/nasa-m2i3nvasm.yaml | 4 ++-- datasets/nasa-m2t1nxslv.yaml | 4 ++-- datasets/nasa-operal2rtc-s1v1.yaml | 2 +- 5 files changed, 9 insertions(+), 9 deletions(-) diff --git a/datasets/nasa-m2i3npasm.yaml b/datasets/nasa-m2i3npasm.yaml index 16ed297d7..a504f7fcc 100644 --- a/datasets/nasa-m2i3npasm.yaml +++ b/datasets/nasa-m2i3npasm.yaml @@ -85,7 +85,7 @@ DataAtWork: AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, R. D. Koster, and G. J. M. De Lannoy - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' - URL: ~ + URL: https://ntrs.nasa.gov/api/citations/20160005045/downloads/20160005045.pdf AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, @@ -105,7 +105,7 @@ DataAtWork: Balaji, P. Li, W. Yang, C. Hill, and A. da Silva - Title: A catchment-based approach to modeling land surface processes in a GCM, Part 1, Model Structure - URL: ~ + URL: https://doi.org/10.1029/2000JD900327 AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian diff --git a/datasets/nasa-m2i3nvaer.yaml b/datasets/nasa-m2i3nvaer.yaml index fe861e154..0c1f7f796 100644 --- a/datasets/nasa-m2i3nvaer.yaml +++ b/datasets/nasa-m2i3nvaer.yaml @@ -87,7 +87,7 @@ DataAtWork: AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. 
P. P. Mahanama, R. D. Koster, and G. J. M. De Lannoy - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' - URL: ~ + URL: https://ntrs.nasa.gov/api/citations/20160005045/downloads/20160005045.pdf AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, @@ -107,7 +107,7 @@ DataAtWork: Balaji, P. Li, W. Yang, C. Hill, and A. da Silva - Title: A catchment-based approach to modeling land surface processes in a GCM, Part 1, Model Structure - URL: ~ + URL: https://doi.org/10.1029/2000JD900327 AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian diff --git a/datasets/nasa-m2i3nvasm.yaml b/datasets/nasa-m2i3nvasm.yaml index aa2316b35..598502838 100644 --- a/datasets/nasa-m2i3nvasm.yaml +++ b/datasets/nasa-m2i3nvasm.yaml @@ -86,7 +86,7 @@ DataAtWork: AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, R. D. Koster, and G. J. M. De Lannoy - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' - URL: ~ + URL: https://ntrs.nasa.gov/api/citations/20160005045/downloads/20160005045.pdf AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, @@ -106,7 +106,7 @@ DataAtWork: Balaji, P. Li, W. Yang, C. Hill, and A. da Silva - Title: A catchment-based approach to modeling land surface processes in a GCM, Part 1, Model Structure - URL: ~ + URL: https://doi.org/10.1029/2000JD900327 AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. 
Kumar - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian diff --git a/datasets/nasa-m2t1nxslv.yaml b/datasets/nasa-m2t1nxslv.yaml index ee9c1b2e5..913045bec 100644 --- a/datasets/nasa-m2t1nxslv.yaml +++ b/datasets/nasa-m2t1nxslv.yaml @@ -85,7 +85,7 @@ DataAtWork: AuthorName: Reichle, R. H., C. S. Draper, Q. Liu, M. Girotto, S. P. P. Mahanama, R. D. Koster, and G. J. M. De Lannoy - Title: '2015b: MERRA-2: Initial Evaluation of the Climate' - URL: ~ + URL: https://ntrs.nasa.gov/api/citations/20160005045/downloads/20160005045.pdf AuthorName: Bosilovich, M. G., S. Akella, L. Coy, R. Cullather, C. Draper, R. Gelaro, R. Kovach, Q.Liu, A. Molod, P. Norris, K. Wargan, W. Chao, R. Reichle, L. Takacs, Y. Vikhliaev, S. Bloom, A. Collow, S. Firth, G. Labow, G. Partyka, @@ -105,7 +105,7 @@ DataAtWork: Balaji, P. Li, W. Yang, C. Hill, and A. da Silva - Title: A catchment-based approach to modeling land surface processes in a GCM, Part 1, Model Structure - URL: ~ + URL: https://doi.org/10.1029/2000JD900327 AuthorName: Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar - Title: 'Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian diff --git a/datasets/nasa-operal2rtc-s1v1.yaml b/datasets/nasa-operal2rtc-s1v1.yaml index 19207462f..3c8617cb3 100644 --- a/datasets/nasa-operal2rtc-s1v1.yaml +++ b/datasets/nasa-operal2rtc-s1v1.yaml @@ -65,7 +65,7 @@ DataAtWork: AuthorName: Gustavo H. X. Shiroma, Marco Lavalle, and Sean M. 
Buckley - Title: Thermal Denoising of Products Generated by the S-1 IPF URL: https://sentinels.copernicus.eu/documents/247904/2142675/Thermal-Denoising-of-Products-Generated-by-Sentinel-1-IPF.pdf - AuthorName: ~ + AuthorName: Riccardo Piantanida Tutorials: - Title: Load, Mosaic, and Visualize OPERA RTC-S1 Data URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/RTC/notebooks/RTC_notebook.ipynb From 693d1b263a53c2a5b682a176140e62c766d6a164 Mon Sep 17 00:00:00 2001 From: Brian Helba Date: Fri, 17 Jan 2025 15:47:43 -0500 Subject: [PATCH 209/751] Add "isic-archive" dataset --- datasets/isic-archive.yaml | 61 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) create mode 100644 datasets/isic-archive.yaml diff --git a/datasets/isic-archive.yaml b/datasets/isic-archive.yaml new file mode 100644 index 000000000..b4eacb9e0 --- /dev/null +++ b/datasets/isic-archive.yaml @@ -0,0 +1,61 @@ +Name: International Skin Imaging Collaboration (ISIC) Archive +Description: A public-access archive of skin lesion images, supporting teaching, research, and the development and evaluation of diagnostic algorithms. +Documentation: https://www.isic-archive.com/ +Contact: support@isic-archive.com +ManagedBy: International Skin Imaging Collaboration (ISIC) +UpdateFrequency: Upon new data ingest from contributors. +Tags: + - biology + - cancer + - classification + - computational pathology + - dicom + - grand-challenge.org + - health + - Homo sapiens + - imaging + - life sciences + - machine learning + - medical image computing + - medical imaging + - medicine + - microscopy + - segmentation +License: Creative Commons licenses (CC-0, CC-BY, or CC-BY-NC) are defined per-image. 
+Resources: + - Description: Images of skin lesions and associated metadata + ARN: arn:aws:s3:::isic-archive + Region: us-east-1 + Type: S3 Bucket + - Description: Notifications of new data + ARN: arn:aws:sns:us-east-1:024848456264:isic-archive-object_created + Region: us-east-1 + Type: SNS Topic +DataAtWork: + Tutorials: + - Title: ISIC Archive Data Dictionary + URL: https://www.isic-archive.com/data-dictionary + NotebookURL: + AuthorName: International Skin Imaging Collaboration (ISIC) + Tools & Applications: + - Title: ISIC Archive Gallery + URL: https://gallery.isic-archive.com + AuthorName: International Skin Imaging Collaboration (ISIC) + - Title: isic-cli - The official command line tool for interacting with the ISIC Archive + URL: https://pypi.org/project/isic-cli + AuthorName: International Skin Imaging Collaboration (ISIC) + Publications: + - Title: "A patient-centric dataset of images and metadata for identifying melanomas using clinical context" + URL: https://doi.org/10.1038/s41597-021-00815-z + AuthorName: Rotemberg V, Kurtansky N, Betz-Stablein B, Caffery L, Chousakos E, Codella N, et al + - Title: "The SLICE-3D dataset: 400,000 skin lesion image crops extracted from 3D TBP for skin cancer detection" + URL: https://doi.org/10.1038/s41597-024-03743-w + AuthorName: Kurtansky N, D'Alessandro B, Gillis M, Betz-Stablein B, Cerminara S, Garcia R, et al + - Title: "International Skin Imaging Collaboration - Designated Diagnoses (ISIC-DX): Consensus terminology for lesion diagnostic labeling" + URL: https://doi.org/10.1111/jdv.20055 + AuthorName: Scope A, Liopyris K, Weber J, Barnhill R, Braun R, Curiel-Lewandrowski C, et al + - Title: "Human surface anatomy terminology for dermatology: a Delphi consensus from the International Skin Imaging Collaboration" + URL: https://doi.org/10.1111/jdv.16855 + AuthorName: Navarrete-Dechent C, Liopyris K, Molenda M, Braun R, Curiel-Lewandrowski C, Dusza S, et al +ADXCategories: + - Healthcare & Life Sciences Data From 
d27ee760f75b1766fdf36ebf7de3ebfaca711728 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 8 Aug 2025 14:26:54 -0800 Subject: [PATCH 210/751] Update cmas-data-warehouse.yaml From 7eed950b5849c84fac428ec7818a76ad2435509c Mon Sep 17 00:00:00 2001 From: EZ4Fanta <62417573+xfd997700@users.noreply.github.com> Date: Mon, 11 Aug 2025 15:42:01 +0800 Subject: [PATCH 211/751] BioLiP dataset yaml upload First-time upload of BioLiP dataset yaml --- datasets/biolip.yaml | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) create mode 100644 datasets/biolip.yaml diff --git a/datasets/biolip.yaml b/datasets/biolip.yaml new file mode 100644 index 000000000..ba1fb346c --- /dev/null +++ b/datasets/biolip.yaml @@ -0,0 +1,34 @@ +Name: BioLiP +Description: BioLiP is a semi-manually curated database for high-quality, biologically relevant ligand-protein binding interactions. The structure data are collected primarily from the Protein Data Bank (PDB), with biological insights mined from literature and other specific databases. BioLiP aims to construct the most comprehensive and accurate database for serving the needs of ligand-protein docking, virtual ligand screening and protein function annotation. +Documentation: https://zhanggroup.org/BioLiP +Contact: zhanglab@zhanggroup.org +ManagedBy: "[Zhang Lab](https://zhanggroup.org/)" +UpdateFrequency: No regular schedule; updated upon availability of major dataset revisions +Tags: + - protein + - structural biology + - molecular docking + - bioinformatics + - molecule + - life sciences + - chemistry + +License: No explicit license stated (publicly available for academic and research use). +Citation: "Chengxin Zhang, Xi Zhang, Peter L Freddolino, and Yang Zhang. BioLiP2: an updated structure database for biologically relevant ligand-protein interactions, Nucleic Acids Research, gkad630 (2023)." 
+Resources: + - Description: BioLiP dataset + ARN: arn:aws:s3:::biolip + Region: ap-southeast-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: BioLiP API usage + URL: https://zhanggroup.org/BioLiP/help.html + AuthorName: Zhang Lab + Publications: + - Title: "BioLiP2: an updated structure database for biologically relevant ligand-protein interactions" + URL: https://academic.oup.com/nar/article/52/D1/D404/7233921 + AuthorName: Chengxin Zhang, Xi Zhang, Peter L Freddolino, and Yang Zhang + - Title: "BioLiP: a semi-manually curated database for biologically relevant ligand-protein interactions" + URL: https://academic.oup.com/nar/article/41/D1/D1096/1074898 + AuthorName: Jianyi Yang, Ambrish Roy, and Yang Zhang From a1a89a0b8ab22491e88b9155c51ca45923dd21e0 Mon Sep 17 00:00:00 2001 From: Bl4ckH4wkGER <25514600+Bl4ckH4wkGER@users.noreply.github.com> Date: Mon, 11 Aug 2025 11:33:42 -0700 Subject: [PATCH 212/751] Add allen-hmba-releases.yaml to datasets folder --- datasets/allen-hmba-releases.yaml | 50 +++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) create mode 100644 datasets/allen-hmba-releases.yaml diff --git a/datasets/allen-hmba-releases.yaml b/datasets/allen-hmba-releases.yaml new file mode 100644 index 000000000..1c63d3b3a --- /dev/null +++ b/datasets/allen-hmba-releases.yaml @@ -0,0 +1,50 @@ +Name: Human and Mammalian Brain Atlas +Description: + Human and Mammalian Brain Atlas (HMBA) is a major atlas of the BRAIN Initiative Cell Atlas Network (BICAN) that proposes to establish a comprehensive, + highly granular cell atlas in complete adult human, macaque, and marmoset brains that links brain structure, function and cellular architecture. + Release artifacts have been made available in this OpenData bucket to enable utilization along with their paper publications by the neuroscience community. 
+Documentation: https://portal.brain-map.org/explore/hmba +Contact: awspds@alleninstitute.org +ManagedBy: "[Allen Institute](http://www.alleninstitute.org/)" +UpdateFrequency: Never +Tags: + - aws-pds + - biology + - gene expression + - neurobiology + - life sciences + - transcriptomics + - basal ganglia + - Mus musculus + - Macaca mulatta + - Callithrix jacchus + - Homo sapiens +License: http://www.alleninstitute.org/legal/terms-use/ +Citation: +Resources: + - Description: project data files in a public bucket + ARN: arn:aws:s3:::allen-hmba-releases + Region: us-west-2 + Type: S3 bucket + Explore: +DataAtWork: + Tutorials: + - Title: + URL: + NotebookURL: + AuthorName: + AuthorURL: + Services: + Tools & Applications: + - Title: + URL: + AuthorName: + AuthorURL: + Publications: + - Title: + URL: + AuthorName: + AuthorURL: +DeprecatedNotice: +ADXCategories: + - \ No newline at end of file From cbe25c086a41cf04123abd1591b1ff128ec65860 Mon Sep 17 00:00:00 2001 From: lbesnard Date: Tue, 12 Aug 2025 16:19:26 +1000 Subject: [PATCH 213/751] Fix: AODN - replace nbviewer with github in notebook url --- datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml | 4 ++-- .../aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml | 4 ++-- datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml | 4 ++-- datasets/aodn_mooring_ctd_delayed_qc.yaml | 4 ++-- datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml | 4 ++-- ...dn_mooring_satellite_altimetry_calibration_validation.yaml | 4 ++-- ...radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml | 4 ++-- ...ricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml | 4 ++-- datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml | 4 ++-- datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml | 4 ++-- ...adar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml | 4 ++-- datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml | 4 ++-- datasets/aodn_radar_coffsharbour_wind_delayed_qc.yaml | 4 ++-- 
..._radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml | 4 ++-- ...n_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml | 4 ++-- ...ar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml | 4 ++-- ...dar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml | 4 ++-- datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml | 4 ++-- datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml | 4 ++-- ...uthaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml | 4 ++-- datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml | 4 ++-- datasets/aodn_radar_southaustraliagulfs_wind_delayed_qc.yaml | 4 ++-- ...ar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml | 4 ++-- datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml | 4 ++-- datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml | 4 ++-- datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml | 4 ++-- datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml | 4 ++-- datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml | 4 ++-- datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml | 4 ++-- datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml | 4 ++-- datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml | 4 ++-- datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml | 4 ++-- datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml | 4 ++-- ...dn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml | 4 ++-- ..._satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml | 4 ++-- ...dn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml | 4 ++-- .../aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml | 4 ++-- ...e_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml | 4 ++-- ..._ghrsst_l3s_1day_daynighttime_single_sensor_australia.yaml | 4 ++-- ...sst_l3s_1day_daynighttime_single_sensor_southernocean.yaml | 4 ++-- ...ite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml | 4 ++-- ...e_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml | 4 ++-- 
..._ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml | 4 ++-- ...dn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml | 4 ++-- ...atellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml | 4 ++-- .../aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml | 4 ++-- ...aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml | 4 ++-- ...aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml | 4 ++-- datasets/aodn_satellite_optical_water_type_1day_aqua.yaml | 4 ++-- .../aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml | 4 ++-- datasets/aodn_slocum_glider_delayed_qc.yaml | 4 ++-- datasets/aodn_vessel_air_sea_flux_product_delayed.yaml | 4 ++-- datasets/aodn_vessel_air_sea_flux_sst_meteo_realtime.yaml | 4 ++-- datasets/aodn_vessel_co2_delayed_qc.yaml | 4 ++-- datasets/aodn_vessel_fishsoop_realtime_qc.yaml | 4 ++-- datasets/aodn_vessel_sst_delayed_qc.yaml | 4 ++-- datasets/aodn_vessel_trv_realtime_qc.yaml | 4 ++-- datasets/aodn_vessel_xbt_delayed_qc.yaml | 4 ++-- datasets/aodn_vessel_xbt_realtime_nonqc.yaml | 4 ++-- datasets/aodn_wave_buoy_realtime_nonqc.yaml | 4 ++-- 60 files changed, 120 insertions(+), 120 deletions(-) diff --git a/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml b/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml index 61b3b66eb..060c89501 100644 --- a/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml +++ b/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml @@ -42,12 +42,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - Animal Tracking Facility - Acoustic Tracking - Quality Controlled Detections (2007 - ongoing) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/animal_acoustic_tracking_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/animal_acoustic_tracking_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/animal_acoustic_tracking_delayed_qc.ipynb AuthorName: Laurent Besnard 
AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml b/datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml index b5ed8aa87..b0e740ceb 100644 --- a/datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml +++ b/datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml @@ -44,12 +44,12 @@ DataAtWork: Tutorials: - Title: Accessing Satellite Relay Tagging Program - Southern Ocean - MEOP Quality Controlled CTD Profiles - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/animal_ctd_satellite_relay_tagging_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/animal_ctd_satellite_relay_tagging_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/animal_ctd_satellite_relay_tagging_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml b/datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml index 
b1adeda49..74f5261d8 100644 --- a/datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml +++ b/datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml @@ -7,12 +7,12 @@ DataAtWork: AuthorURL: https://github.com/aodn/aodn_cloud_optimised NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/model_sea_level_anomaly_gridded_realtime.ipynb Title: Accessing IMOS - OceanCurrent - Gridded sea level anomaly - Near real time - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/model_sea_level_anomaly_gridded_realtime.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/model_sea_level_anomaly_gridded_realtime.ipynb - AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb Description: "Gridded (adjusted) sea level anomaly (GSLA), gridded sea level (GSL)\ \ and surface geostrophic velocity (UCUR,VCUR) for the Australasian region. 
GSLA\ \ is mapped using optimal interpolation of detided, de-meaned, inverse-barometer-adjusted\ diff --git a/datasets/aodn_mooring_ctd_delayed_qc.yaml b/datasets/aodn_mooring_ctd_delayed_qc.yaml index 015281534..f7dcf3f4f 100644 --- a/datasets/aodn_mooring_ctd_delayed_qc.yaml +++ b/datasets/aodn_mooring_ctd_delayed_qc.yaml @@ -31,12 +31,12 @@ Resources: DataAtWork: Tutorials: - Title: Accessing IMOS - Australian National Mooring Network (ANMN) - CTD Profiles - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/mooring_ctd_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/mooring_ctd_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/mooring_ctd_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml b/datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml index add684a9d..7af7dc113 100644 --- a/datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml +++ b/datasets/aodn_mooring_hourly_timeseries_delayed_qc.yaml @@ -40,12 +40,12 @@ Resources: DataAtWork: Tutorials: - Title: Accessing IMOS - Moorings - Hourly time-series product - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/mooring_hourly_timeseries_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/mooring_hourly_timeseries_delayed_qc.ipynb NotebookURL: 
https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/mooring_hourly_timeseries_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_mooring_satellite_altimetry_calibration_validation.yaml b/datasets/aodn_mooring_satellite_altimetry_calibration_validation.yaml index a1a800537..3d7db8d1a 100644 --- a/datasets/aodn_mooring_satellite_altimetry_calibration_validation.yaml +++ b/datasets/aodn_mooring_satellite_altimetry_calibration_validation.yaml @@ -58,12 +58,12 @@ Resources: DataAtWork: Tutorials: - Title: Accessing IMOS - SRS Satellite Altimetry Calibration and Validation Sub-Facility - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/mooring_satellite_altimetry_calibration_validation.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/mooring_satellite_altimetry_calibration_validation.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/mooring_satellite_altimetry_calibration_validation.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: 
https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml index 27ae1ab7b..ab71ffd86 100644 --- a/datasets/aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_bonneycoast_velocity_hourly_averaged_delayed_qc.yaml @@ -37,12 +37,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Bonney Coast HF ocean radar site (South Australia, Australia) - Delayed mode sea water velocity - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_BonneyCoast_velocity_hourly_averaged_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_BonneyCoast_velocity_hourly_averaged_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_BonneyCoast_velocity_hourly_averaged_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_capricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_capricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml index 00404ce4c..cccf5485d 100644 --- a/datasets/aodn_radar_capricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_capricornbunkergroup_velocity_hourly_averaged_delayed_qc.yaml @@ -49,12 +49,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Capricorn Bunker Group HF ocean 
radar site (Great Barrier Reef, Queensland, Australia) - Delayed mode sea water velocity - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CapricornBunkerGroup_velocity_hourly_averaged_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CapricornBunkerGroup_velocity_hourly_averaged_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CapricornBunkerGroup_velocity_hourly_averaged_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml b/datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml index 3de654e18..7ed719d1b 100644 --- a/datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml +++ b/datasets/aodn_radar_capricornbunkergroup_wave_delayed_qc.yaml @@ -48,12 +48,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Capricorn Bunker Group HF ocean radar site (Great Barrier Reef, Queensland, Australia) - Delayed mode wave - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CapricornBunkerGroup_wave_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CapricornBunkerGroup_wave_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CapricornBunkerGroup_wave_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: 
https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml b/datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml index d39a8dd70..4d97c7ecf 100644 --- a/datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml +++ b/datasets/aodn_radar_capricornbunkergroup_wind_delayed_qc.yaml @@ -49,12 +49,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Capricorn Bunker Group HF ocean radar site (Great Barrier Reef, Queensland, Australia) - Delayed mode wind - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CapricornBunkerGroup_wind_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CapricornBunkerGroup_wind_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CapricornBunkerGroup_wind_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml 
b/datasets/aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml index 621ea7d4b..c6b0779ae 100644 --- a/datasets/aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_coffsharbour_velocity_hourly_averaged_delayed_qc.yaml @@ -45,12 +45,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Coffs Harbour HF ocean radar site (New South Wales, Australia) - Delayed mode sea water velocity - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoffsHarbour_velocity_hourly_averaged_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoffsHarbour_velocity_hourly_averaged_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoffsHarbour_velocity_hourly_averaged_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml b/datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml index f9e0fc997..4155d3e90 100644 --- a/datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml +++ b/datasets/aodn_radar_coffsharbour_wave_delayed_qc.yaml @@ -44,12 +44,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Coffs Harbour HF ocean radar site (New South Wales, Australia) - Delayed mode wave - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoffsHarbour_wave_delayed_qc.ipynb + URL: 
https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoffsHarbour_wave_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoffsHarbour_wave_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_coffsharbour_wind_delayed_qc.yaml b/datasets/aodn_radar_coffsharbour_wind_delayed_qc.yaml index 65bd8351c..075ac6293 100644 --- a/datasets/aodn_radar_coffsharbour_wind_delayed_qc.yaml +++ b/datasets/aodn_radar_coffsharbour_wind_delayed_qc.yaml @@ -41,12 +41,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Coffs Harbour HF ocean radar site (New South Wales, Australia) - Delayed mode wind - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoffsHarbour_wind_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoffsHarbour_wind_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoffsHarbour_wind_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard 
AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml index fd1233744..027e8d906 100644 --- a/datasets/aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_coralcoast_velocity_hourly_averaged_delayed_qc.yaml @@ -40,12 +40,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Coral Coast HF ocean radar site (Western Australia, Australia) - Delayed mode sea water velocity - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoralCoast_velocity_hourly_averaged_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoralCoast_velocity_hourly_averaged_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_CoralCoast_velocity_hourly_averaged_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml index d456f15be..040d426dd 100644 --- a/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml @@ -34,12 +34,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Newcastle HF ocean radar site (New South Wales, Australia) - Delayed 
mode sea water velocity - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_Newcastle_velocity_hourly_averaged_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_Newcastle_velocity_hourly_averaged_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_Newcastle_velocity_hourly_averaged_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml index 628c1fa2a..b66b85a27 100644 --- a/datasets/aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_northwestshelf_velocity_hourly_averaged_delayed_qc.yaml @@ -33,12 +33,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Northwest Shelf HF ocean radar site (Western Australia, Australia) - Delayed mode sea water velocity - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_NorthWestShelf_velocity_hourly_averaged_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_NorthWestShelf_velocity_hourly_averaged_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_NorthWestShelf_velocity_hourly_averaged_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: 
https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml index aa2f7704f..723672d01 100644 --- a/datasets/aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_rottnestshelf_velocity_hourly_averaged_delayed_qc.yaml @@ -40,12 +40,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Rottnest Shelf HF ocean radar site (Western Australia, Australia) - Delayed mode sea water velocity - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_RottnestShelf_velocity_hourly_averaged_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_RottnestShelf_velocity_hourly_averaged_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_RottnestShelf_velocity_hourly_averaged_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git 
a/datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml b/datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml index 2d299cc64..7488e2c72 100644 --- a/datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml +++ b/datasets/aodn_radar_rottnestshelf_wave_delayed_qc.yaml @@ -39,12 +39,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Rottnest Shelf HF ocean radar site (Western Australia, Australia) - Delayed mode wave - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_RottnestShelf_wave_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_RottnestShelf_wave_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_RottnestShelf_wave_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml b/datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml index ee0208c5b..dc515048c 100644 --- a/datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml +++ b/datasets/aodn_radar_rottnestshelf_wind_delayed_qc.yaml @@ -40,12 +40,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Rottnest Shelf HF ocean radar site (Western Australia, Australia) - Delayed mode wind - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_RottnestShelf_wind_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_RottnestShelf_wind_delayed_qc.ipynb 
NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_RottnestShelf_wind_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml index 7464e80c1..931cb0297 100644 --- a/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_southaustraliagulfs_velocity_hourly_averaged_delayed_qc.yaml @@ -47,12 +47,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - South Australia Gulfs HF ocean radar site (South Australia, Australia) - Delayed mode sea water velocity - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_SouthAustraliaGulfs_velocity_hourly_averaged_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_SouthAustraliaGulfs_velocity_hourly_averaged_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_SouthAustraliaGulfs_velocity_hourly_averaged_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: 
https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml b/datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml index f566a4293..2bf2020b1 100644 --- a/datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml +++ b/datasets/aodn_radar_southaustraliagulfs_wave_delayed_qc.yaml @@ -46,12 +46,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - South Australia Gulfs HF ocean radar site (South Australia, Australia) - Delayed mode wave - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_SouthAustraliaGulfs_wave_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_SouthAustraliaGulfs_wave_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_SouthAustraliaGulfs_wave_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_southaustraliagulfs_wind_delayed_qc.yaml b/datasets/aodn_radar_southaustraliagulfs_wind_delayed_qc.yaml index e7ca87294..0ff3805f3 100644 --- a/datasets/aodn_radar_southaustraliagulfs_wind_delayed_qc.yaml +++ b/datasets/aodn_radar_southaustraliagulfs_wind_delayed_qc.yaml @@ -43,12 +43,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - South Australia Gulfs HF ocean radar site (South Australia, Australia) - 
Delayed mode wind - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_SouthAustraliaGulfs_wind_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_SouthAustraliaGulfs_wind_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_SouthAustraliaGulfs_wind_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml index ce05e2b5e..2eb96c0b5 100644 --- a/datasets/aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_turquoisecoast_velocity_hourly_averaged_delayed_qc.yaml @@ -49,12 +49,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - ACORN - Turquoise Coast HF ocean radar site (Western Australia, Australia) - Delayed mode sea water velocity - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_TurquoiseCoast_velocity_hourly_averaged_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_TurquoiseCoast_velocity_hourly_averaged_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/radar_TurquoiseCoast_velocity_hourly_averaged_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for 
any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml b/datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml index a6a08924d..18d16d2d4 100644 --- a/datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml +++ b/datasets/aodn_satellite_chlorophylla_carder_1day_aqua.yaml @@ -35,12 +35,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - MODIS - 01 day - Chlorophyll-a concentration (Carder model) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_carder_1day_aqua.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_carder_1day_aqua.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_carder_1day_aqua.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml b/datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml index 2a8577a24..2b76dec58 100644 --- a/datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml +++ b/datasets/aodn_satellite_chlorophylla_gsm_1day_aqua.yaml @@ 
-30,12 +30,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - MODIS - 01 day - Chlorophyll-a concentration (GSM model) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_gsm_1day_aqua.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_gsm_1day_aqua.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_gsm_1day_aqua.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml b/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml index e0d63c8e0..a1a854290 100644 --- a/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml +++ b/datasets/aodn_satellite_chlorophylla_gsm_1day_noaa20.yaml @@ -30,12 +30,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - Satellite Remote Sensing - NOAA20 - 01 day - Chlorophyll-a concentration (GSM model) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_gsm_1day_noaa20.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_gsm_1day_noaa20.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_gsm_1day_noaa20.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: 
https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml b/datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml index 0f020ba71..406690b80 100644 --- a/datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml +++ b/datasets/aodn_satellite_chlorophylla_gsm_1day_snpp.yaml @@ -30,12 +30,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - Satellite Remote Sensing - SNPP - 01 day - Chlorophyll-a concentration (GSM model) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_gsm_1day_snpp.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_gsm_1day_snpp.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_gsm_1day_snpp.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml b/datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml index b37517d8d..0ce96b25a 100644 --- a/datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml +++ b/datasets/aodn_satellite_chlorophylla_oc3_1day_aqua.yaml @@ -31,12 +31,12 @@ 
DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - MODIS - 01 day - Chlorophyll-a concentration (OC3 model) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oc3_1day_aqua.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oc3_1day_aqua.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oc3_1day_aqua.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml b/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml index e5dd4458c..a017b666f 100644 --- a/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml +++ b/datasets/aodn_satellite_chlorophylla_oc3_1day_noaa20.yaml @@ -31,12 +31,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - Satellite Remote Sensing - NOAA20 - 01 day - Chlorophyll-a concentration (OC3 model) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oc3_1day_noaa20.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oc3_1day_noaa20.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oc3_1day_noaa20.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: 
https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml b/datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml index 492e30862..bf5b513ee 100644 --- a/datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml +++ b/datasets/aodn_satellite_chlorophylla_oc3_1day_snpp.yaml @@ -31,12 +31,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - Satellite Remote Sensing - SNPP - 01 day - Chlorophyll-a concentration (OC3 model) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oc3_1day_snpp.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oc3_1day_snpp.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oc3_1day_snpp.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml b/datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml index 189c29431..d040491b1 100644 --- a/datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml +++ b/datasets/aodn_satellite_chlorophylla_oci_1day_aqua.yaml @@ -31,12 +31,12 @@ 
DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - MODIS - 01 day - Chlorophyll-a concentration (OCI model) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oci_1day_aqua.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oci_1day_aqua.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oci_1day_aqua.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml b/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml index f3296c66f..dd4117717 100644 --- a/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml +++ b/datasets/aodn_satellite_chlorophylla_oci_1day_noaa20.yaml @@ -31,12 +31,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - Satellite Remote Sensing - NOAA20 - 01 day - Chlorophyll-a concentration (OCI model) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oci_1day_noaa20.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oci_1day_noaa20.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oci_1day_noaa20.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: 
https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml b/datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml index 5b83a8ba9..ae98aa384 100644 --- a/datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml +++ b/datasets/aodn_satellite_chlorophylla_oci_1day_snpp.yaml @@ -31,12 +31,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - Satellite Remote Sensing - SNPP - 01 day - Chlorophyll-a concentration (OCI model) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oci_1day_snpp.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oci_1day_snpp.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_chlorophylla_oci_1day_snpp.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml index e2e7c5403..f8a6efe64 100644 --- a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml +++ 
b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_aqua.yaml @@ -29,12 +29,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - MODIS - 01 day - Diffuse attenuation coefficient (k490 ) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_diffuse_attenuation_coefficent_1day_aqua.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_diffuse_attenuation_coefficent_1day_aqua.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_diffuse_attenuation_coefficent_1day_aqua.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml index 040cb6945..347e2e0d3 100644 --- a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml +++ b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_noaa20.yaml @@ -31,12 +31,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - Satellite Remote Sensing - NOAA20 - 01 day - Diffuse attenuation coefficient (k490) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_diffuse_attenuation_coefficent_1day_noaa20.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_diffuse_attenuation_coefficent_1day_noaa20.ipynb NotebookURL: 
https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_diffuse_attenuation_coefficent_1day_noaa20.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml index d6d08bfd4..cf3e34ab9 100644 --- a/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml +++ b/datasets/aodn_satellite_diffuse_attenuation_coefficent_1day_snpp.yaml @@ -30,12 +30,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - Satellite Remote Sensing - SNPP - 01 day - Diffuse attenuation coefficient (k490) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_diffuse_attenuation_coefficent_1day_snpp.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_diffuse_attenuation_coefficent_1day_snpp.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_diffuse_attenuation_coefficent_1day_snpp.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard 
AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml b/datasets/aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml index 0ff16356a..83f66d92c 100644 --- a/datasets/aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml +++ b/datasets/aodn_satellite_ghrsst_l3c_1day_nighttime_himawari8.yaml @@ -40,12 +40,12 @@ Resources: DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - SST - L3C - Himawari-8 - 1 day - night time - Australia - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3c_1day_nighttime_himawari8.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3c_1day_nighttime_himawari8.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3c_1day_nighttime_himawari8.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml b/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml index 5ea6793c3..268b31e44 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.yaml @@ -39,12 +39,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - SST - L3S - Multi Sensor - 1 day - day and night time - Australia - URL: 
https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1day_daynighttime_multi_sensor_australia.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_australia.yaml b/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_australia.yaml index df88d49d6..087232afd 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_australia.yaml @@ -33,12 +33,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - SST - L3S - Single Sensor - 1 day - day and night time - Australia - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1day_daynighttime_single_sensor_australia.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1day_daynighttime_single_sensor_australia.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1day_daynighttime_single_sensor_australia.ipynb AuthorName: Laurent Besnard AuthorURL: 
https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.yaml b/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.yaml index bb1844d32..f81f1e8c0 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.yaml @@ -36,12 +36,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - SST - L3S - Single Sensor - 1 day - day and night time - Southern Ocean - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1day_daynighttime_single_sensor_southernocean.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff 
--git a/datasets/aodn_satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml b/datasets/aodn_satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml index 36d7d95a4..bfeaf5479 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.yaml @@ -34,12 +34,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - SST - L3S - Single Sensor - 1 month - day time - Australia - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_1month_daytime_single_sensor_australia.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml b/datasets/aodn_satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml index 90d19081c..0bf089d00 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.yaml @@ -39,12 +39,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - SST - L3S - Multi Sensor - 3 day - day and night time - Australia - URL: 
https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_3day_daynighttime_multi_sensor_australia.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml b/datasets/aodn_satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml index e2eb7589d..fdc735c5d 100644 --- a/datasets/aodn_satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.yaml @@ -37,12 +37,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - SST - L3S - Single Sensor - 6 day - day and night time - Australia - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l3s_6day_daynighttime_single_sensor_australia.ipynb AuthorName: Laurent Besnard AuthorURL: 
https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml b/datasets/aodn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml index ceb965b75..e22e952e3 100644 --- a/datasets/aodn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml +++ b/datasets/aodn_satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.yaml @@ -32,12 +32,12 @@ Resources: DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - SST - L4 - GAMSSA - World - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l4_gamssa_1day_multi_sensor_world.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml 
b/datasets/aodn_satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml index af6b9fd7f..957d24a05 100644 --- a/datasets/aodn_satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml +++ b/datasets/aodn_satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.yaml @@ -33,12 +33,12 @@ Resources: DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - SST - L4 - RAMSSA - Australia - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_ghrsst_l4_ramssa_1day_multi_sensor_australia.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml b/datasets/aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml index a85c49572..454c62d6d 100644 --- a/datasets/aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml +++ b/datasets/aodn_satellite_nanoplankton_fraction_oc3_1day_aqua.yaml @@ -39,12 +39,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - MODIS - 01 day - Nanoplankton fraction (OC3 model and Brewin et al 2012 algorithm) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_nanoplankton_fraction_oc3_1day_aqua.ipynb + URL: 
https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_nanoplankton_fraction_oc3_1day_aqua.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_nanoplankton_fraction_oc3_1day_aqua.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml b/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml index ebee62c0a..4508e5c89 100644 --- a/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml +++ b/datasets/aodn_satellite_net_primary_productivity_gsm_1day_aqua.yaml @@ -47,12 +47,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - MODIS - 01 day - Net Primary Productivity (GSM model and Eppley-VGPM algorithm) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_net_primary_productivity_gsm_1day_aqua.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_net_primary_productivity_gsm_1day_aqua.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_net_primary_productivity_gsm_1day_aqua.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: 
https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml b/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml index 8e921cd11..cccbebe11 100644 --- a/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml +++ b/datasets/aodn_satellite_net_primary_productivity_oc3_1day_aqua.yaml @@ -51,12 +51,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - MODIS - 01 day - Net Primary Productivity (OC3 model and Eppley-VGPM algorithm) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_net_primary_productivity_oc3_1day_aqua.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_net_primary_productivity_oc3_1day_aqua.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_net_primary_productivity_oc3_1day_aqua.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_optical_water_type_1day_aqua.yaml b/datasets/aodn_satellite_optical_water_type_1day_aqua.yaml index e5bb1f38c..668f35298 100644 --- a/datasets/aodn_satellite_optical_water_type_1day_aqua.yaml +++ b/datasets/aodn_satellite_optical_water_type_1day_aqua.yaml @@ -34,12 +34,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - MODIS - 01 day - Optical Water Type (Moore et al 2009 
algorithm) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_optical_water_type_1day_aqua.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_optical_water_type_1day_aqua.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_optical_water_type_1day_aqua.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml b/datasets/aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml index c68005b3e..f2b4de443 100644 --- a/datasets/aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml +++ b/datasets/aodn_satellite_picoplankton_fraction_oc3_1day_aqua.yaml @@ -39,12 +39,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SRS - MODIS - 01 day - Picoplankton fraction (OC3 model and Brewin et al 2012 algorithm) - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_picoplankton_fraction_oc3_1day_aqua.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_picoplankton_fraction_oc3_1day_aqua.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/satellite_picoplankton_fraction_oc3_1day_aqua.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: 
https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_slocum_glider_delayed_qc.yaml b/datasets/aodn_slocum_glider_delayed_qc.yaml index cd5c2df11..cb8e4e9d2 100644 --- a/datasets/aodn_slocum_glider_delayed_qc.yaml +++ b/datasets/aodn_slocum_glider_delayed_qc.yaml @@ -54,12 +54,12 @@ Resources: DataAtWork: Tutorials: - Title: Accessing Ocean Gliders - delayed mode glider deployments - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/slocum_glider_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/slocum_glider_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/slocum_glider_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_vessel_air_sea_flux_product_delayed.yaml b/datasets/aodn_vessel_air_sea_flux_product_delayed.yaml index 43c5cb4bc..ab6d85bf4 100644 --- a/datasets/aodn_vessel_air_sea_flux_product_delayed.yaml +++ b/datasets/aodn_vessel_air_sea_flux_product_delayed.yaml @@ -40,12 +40,12 @@ Resources: DataAtWork: Tutorials: - Title: 'Accessing IMOS-SOOP-Air Sea Flux: Flux product' - URL: 
https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_air_sea_flux_product_delayed.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_air_sea_flux_product_delayed.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_air_sea_flux_product_delayed.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_vessel_air_sea_flux_sst_meteo_realtime.yaml b/datasets/aodn_vessel_air_sea_flux_sst_meteo_realtime.yaml index 35879331c..d2cbcdbe3 100644 --- a/datasets/aodn_vessel_air_sea_flux_sst_meteo_realtime.yaml +++ b/datasets/aodn_vessel_air_sea_flux_sst_meteo_realtime.yaml @@ -43,12 +43,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SOOP-Air Sea Flux (ASF) sub-facility - Meteorological and SST Observations - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_air_sea_flux_sst_meteo_realtime.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_air_sea_flux_sst_meteo_realtime.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_air_sea_flux_sst_meteo_realtime.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: 
https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_vessel_co2_delayed_qc.yaml b/datasets/aodn_vessel_co2_delayed_qc.yaml index b8a2210cc..f89739374 100644 --- a/datasets/aodn_vessel_co2_delayed_qc.yaml +++ b/datasets/aodn_vessel_co2_delayed_qc.yaml @@ -39,12 +39,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SOOP Underway CO2 Measurements Research Group - delayed mode data - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_co2_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_co2_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_co2_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_vessel_fishsoop_realtime_qc.yaml b/datasets/aodn_vessel_fishsoop_realtime_qc.yaml index 06c0e26d0..f8d46faad 100644 --- a/datasets/aodn_vessel_fishsoop_realtime_qc.yaml +++ b/datasets/aodn_vessel_fishsoop_realtime_qc.yaml @@ -11,12 +11,12 @@ DataAtWork: NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_fishsoop_realtime_qc.ipynb Title: Accessing IMOS SOOP - Fisheries Vessels as Ships of Opportunity Sub-Facility - Real-time data - URL: 
https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_fishsoop_realtime_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_fishsoop_realtime_qc.ipynb - AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb Description: "Fisheries Vessels as Ships of Opportunities (FishSOOP) is an IMOS Sub-Facility\ \ working with fishers to collect real-time temperature and depth data by installing\ \ equipment on a network of commercial fishing vessels using a range of common fishing\ diff --git a/datasets/aodn_vessel_sst_delayed_qc.yaml b/datasets/aodn_vessel_sst_delayed_qc.yaml index 26e3d3802..ec94218ad 100644 --- a/datasets/aodn_vessel_sst_delayed_qc.yaml +++ b/datasets/aodn_vessel_sst_delayed_qc.yaml @@ -43,12 +43,12 @@ Resources: DataAtWork: Tutorials: - Title: Accessing IMOS - SOOP Sea Surface Temperature - delayed mode data - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_sst_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_sst_delayed_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_sst_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: 
https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_vessel_trv_realtime_qc.yaml b/datasets/aodn_vessel_trv_realtime_qc.yaml index ced99a379..539952e17 100644 --- a/datasets/aodn_vessel_trv_realtime_qc.yaml +++ b/datasets/aodn_vessel_trv_realtime_qc.yaml @@ -44,12 +44,12 @@ DataAtWork: Tutorials: - Title: 'Accessing Sensors on Tropical Research Vessels: Enhanced Measurements from Ships of Opportunity (SOOP)' - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_trv_realtime_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_trv_realtime_qc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_trv_realtime_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_vessel_xbt_delayed_qc.yaml b/datasets/aodn_vessel_xbt_delayed_qc.yaml index 763ab5b29..2c5d6753a 100644 --- a/datasets/aodn_vessel_xbt_delayed_qc.yaml +++ b/datasets/aodn_vessel_xbt_delayed_qc.yaml @@ -29,12 +29,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SOOP Expendable Bathythermographs (XBT) Research Group - XBT delayed mode data - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_xbt_delayed_qc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_xbt_delayed_qc.ipynb NotebookURL: 
https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_xbt_delayed_qc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_vessel_xbt_realtime_nonqc.yaml b/datasets/aodn_vessel_xbt_realtime_nonqc.yaml index fb3423a31..27515adee 100644 --- a/datasets/aodn_vessel_xbt_realtime_nonqc.yaml +++ b/datasets/aodn_vessel_xbt_realtime_nonqc.yaml @@ -33,12 +33,12 @@ DataAtWork: Tutorials: - Title: Accessing IMOS - SOOP Expendable Bathythermographs (XBT) Research Group - XBT real-time data - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_xbt_realtime_nonqc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_xbt_realtime_nonqc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/vessel_xbt_realtime_nonqc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised diff --git a/datasets/aodn_wave_buoy_realtime_nonqc.yaml b/datasets/aodn_wave_buoy_realtime_nonqc.yaml index 7a6af4069..6ca6272e9 100644 --- 
a/datasets/aodn_wave_buoy_realtime_nonqc.yaml +++ b/datasets/aodn_wave_buoy_realtime_nonqc.yaml @@ -47,12 +47,12 @@ Resources: DataAtWork: Tutorials: - Title: Accessing Wave buoys Observations - Australia - near real-time - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/wave_buoy_realtime_nonqc.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/wave_buoy_realtime_nonqc.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/wave_buoy_realtime_nonqc.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised - Title: Accessing and search for any AODN dataset - URL: https://nbviewer.org/github/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb + URL: https://github.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb NotebookURL: https://githubtocolab.com/aodn/aodn_cloud_optimised/blob/main/notebooks/GetAodnData.ipynb AuthorName: Laurent Besnard AuthorURL: https://github.com/aodn/aodn_cloud_optimised From 799b5745fe9bee41ad5ed9c3f337df227471a86e Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 12 Aug 2025 07:08:46 -0800 Subject: [PATCH 214/751] Update aodn_animal_acoustic_tracking_delayed_qc.yaml --- datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml b/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml index 060c89501..6e46358b2 100644 --- a/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml +++ b/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml @@ -28,9 +28,10 @@ Collabs: Tags: - biodiversity Tags: -- oceans -- marine mammals -- biology + - aws-psd + - oceans + - marine mammals + - biology License: http://creativecommons.org/licenses/by/4.0/ Resources: - Description: Cloud Optimised AODN dataset of IMOS - Animal Tracking Facility - 
Acoustic From b546b8fa1e8f4f0ffcab3d9f6d47cf343842e425 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 12 Aug 2025 07:16:57 -0800 Subject: [PATCH 215/751] Update aodn_animal_acoustic_tracking_delayed_qc.yaml --- datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml b/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml index 6e46358b2..07b78c759 100644 --- a/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml +++ b/datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml @@ -28,7 +28,7 @@ Collabs: Tags: - biodiversity Tags: - - aws-psd + - aws-pds - oceans - marine mammals - biology From 6328cb74e1f5f11a53df21cac5e69cdf06eb6a62 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Tue, 12 Aug 2025 09:30:01 -0600 Subject: [PATCH 216/751] Update marine-energy-data.yaml Add UMSLI data resource to marine-energy-data --- datasets/marine-energy-data.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/marine-energy-data.yaml b/datasets/marine-energy-data.yaml index 94fb299f0..76dd2d0e9 100644 --- a/datasets/marine-energy-data.yaml +++ b/datasets/marine-energy-data.yaml @@ -38,6 +38,12 @@ Resources: Type: S3 Bucket Explore: - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=marine-energy-data&prefix=pacwave%2F)' + - Description: "[Unobtrusive Multi-static Serial LiDAR Imager (UMSLI) Dataset](https://mhkdr.openei.org/submissions/507)" + ARN: arn:aws:s3:::marine-energy-data/umsli/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=marine-energy-data&prefix=umsli%2F)' DataAtWork: Tools & Applications: Publications: From b845f074d572f7b06421eda686b1f3903b67d904 Mon Sep 17 00:00:00 2001 From: Kasey Wei Date: Tue, 12 Aug 2025 11:44:12 -0400 Subject: [PATCH 217/751] shorten more 
names, remove extra spaces/newlines in names --- datasets/nasa-airibrad.yaml | 3 +-- datasets/nasa-airicrad.yaml | 3 +-- datasets/nasa-gpm3imergde.yaml | 3 +-- datasets/nasa-gpm3imergdf.yaml | 3 +-- datasets/nasa-gpm3imergdl.yaml | 3 +-- datasets/nasa-gpm3imerghh.yaml | 3 +-- datasets/nasa-gpm3imerghhe.yaml | 3 +-- datasets/nasa-gpm3imerghhl.yaml | 3 +-- datasets/nasa-gpm3imergm.yaml | 3 +-- datasets/nasa-gpmimerglandseamask.yaml | 3 +-- datasets/nasa-gpmmergir.yaml | 3 +-- datasets/nasa-hlsl30.yaml | 3 +-- datasets/nasa-hlss30.yaml | 3 +-- datasets/nasa-m2i3npasm.yaml | 3 +-- datasets/nasa-m2i3nvaer.yaml | 3 +-- datasets/nasa-m2i3nvasm.yaml | 3 +-- datasets/nasa-operal2cslc-s1-staticv1.yaml | 3 +-- datasets/nasa-operal2cslc-s1v1.yaml | 3 +-- datasets/nasa-operal2rtc-s1-staticv1.yaml | 3 +-- datasets/nasa-operal2rtc-s1v1.yaml | 3 +-- datasets/nasa-operal3dist-alert-hls_v1.yaml | 3 +-- datasets/nasa-operal3dist-alert-hlsprovisionalv0.yaml | 3 +-- datasets/nasa-operal3dist-alert-hlsv1.yaml | 3 +-- datasets/nasa-operal3dist-ann-hlsv1.yaml | 3 +-- datasets/nasa-operal3dswx-hlsv1.yaml | 3 +-- 25 files changed, 25 insertions(+), 50 deletions(-) diff --git a/datasets/nasa-airibrad.yaml b/datasets/nasa-airibrad.yaml index 5568ec4a5..74c3eae00 100644 --- a/datasets/nasa-airibrad.yaml +++ b/datasets/nasa-airibrad.yaml @@ -1,5 +1,4 @@ -Name: AIRS/Aqua L1B Infrared (IR) geolocated and calibrated radiances V005 (AIRIBRAD) - at GES DISC +Name: AIRS/Aqua L1B Infrared (IR) geolocated and calibrated radiances V005 (AIRIBRAD) at GES DISC Description: |- WARNING: On 2021/09/23 the EOS Aqua executed a Deep Space Maneuver (DSM). In the DSM, the spacecraft is turned such that the normal Earth field of regard is deep space. 
diff --git a/datasets/nasa-airicrad.yaml b/datasets/nasa-airicrad.yaml index b5ebf35e2..0a10a5e09 100644 --- a/datasets/nasa-airicrad.yaml +++ b/datasets/nasa-airicrad.yaml @@ -1,5 +1,4 @@ -Name: AIRS/Aqua L1C Infrared (IR) resampled and corrected radiances V6.7 (AIRICRAD) - at GES DISC +Name: AIRS/Aqua L1C Infrared (IR) resampled and corrected radiances V6.7 (AIRICRAD) at GES DISC Description: |- The Atmospheric Infrared Sounder (AIRS) is a grating spectrometer (R = 1200) aboard the second Earth Observing System (EOS) polar-orbiting platform, EOS Aqua. In combination with the Advanced Microwave Sounding Unit (AMSU) and the Humidity Sounder for Brazil (HSB), AIRS constitutes an innovative atmospheric sounding group of visible, infrared, and microwave sensors. The AIRS Infrared (IR) level 1C data set contains AIRS infrared calibrated and geolocated radiances in W/m2/micron/ster. This data set is generated from AIRS level 1B data. The spectral coverage of L1C data is from 3.74 to 15.4 mm. The nominal spectral resolution lambda / delta lambda = 1200. The spectrum is sampled twice per spectral resolution element in a total of 2645 spectral channels. A day of AIRS data is divided into 240 granules (scenes) each of 6-minute duration. For the AIRS IR measurements, an individual granule contains 135 pixels across-track and 90 along-track pixels; there are total of 135 x 90 = 12,150 pixels per granule. AIRS employs a 49.5 degree crosstrack scanning with a 1.1 degree instantaneous field of view (IFOV) to provide twice daily coverage of essentially the entire globe in a 1:30 PM sun synchronous orbit with the 13.5 x 13.5 km2 spatial resolution at nadir. The L1C swath products are derived from the L1B swath products. The primary purpose of the level 1C is to generate the spectra of radiances without spectral gaps caused by the instrument design and bad spectral points. 
The AIRS L1C data can be used for comparative (with other IR measurements) studies and for weather-climate research. diff --git a/datasets/nasa-gpm3imergde.yaml b/datasets/nasa-gpm3imergde.yaml index c359dba0f..7ee5ece1f 100644 --- a/datasets/nasa-gpm3imergde.yaml +++ b/datasets/nasa-gpm3imergde.yaml @@ -1,5 +1,4 @@ -Name: GPM IMERG Early Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDE) - at GES DISC +Name: GPM IMERG Early Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDE) at GES DISC Description: "Version 07 is the current version of the data set. Older versions will no longer be available and have been superseded by Version 07.\n\nThe Integrated Multi-satellitE Retrievals for GPM (IMERG) IMERG is a NASA product estimating global diff --git a/datasets/nasa-gpm3imergdf.yaml b/datasets/nasa-gpm3imergdf.yaml index c38bf54e1..415e36b1c 100644 --- a/datasets/nasa-gpm3imergdf.yaml +++ b/datasets/nasa-gpm3imergdf.yaml @@ -1,5 +1,4 @@ -Name: GPM IMERG Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDF) - at GES DISC +Name: GPM IMERG Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDF) at GES DISC Description: "Version 07 is the current version of the data set. Older versions will no longer be available and have been superseded by Version 07.\n\nThe Integrated Multi-satellitE Retrievals for GPM (IMERG) IMERG is a NASA product estimating global diff --git a/datasets/nasa-gpm3imergdl.yaml b/datasets/nasa-gpm3imergdl.yaml index a0cbfec95..82c8d32ce 100644 --- a/datasets/nasa-gpm3imergdl.yaml +++ b/datasets/nasa-gpm3imergdl.yaml @@ -1,5 +1,4 @@ -Name: GPM IMERG Late Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDL) - at GES DISC +Name: GPM IMERG Late Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDL) at GES DISC Description: "Version 07 is the current version of the data set. 
Older versions will no longer be available and have been superseded by Version 07.\n\nThe Integrated Multi-satellitE Retrievals for GPM (IMERG) IMERG is a NASA product estimating global diff --git a/datasets/nasa-gpm3imerghh.yaml b/datasets/nasa-gpm3imerghh.yaml index c48629508..8bdf2c380 100644 --- a/datasets/nasa-gpm3imerghh.yaml +++ b/datasets/nasa-gpm3imerghh.yaml @@ -1,5 +1,4 @@ -Name: GPM IMERG Final Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 (GPM_3IMERGHH) - at GES DISC +Name: GPM IMERG Final Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 (GPM_3IMERGHH) at GES DISC Description: |- Version 07B is the current version of the IMERG data sets. Older versions will no longer be available and have been superseded by Version 07. diff --git a/datasets/nasa-gpm3imerghhe.yaml b/datasets/nasa-gpm3imerghhe.yaml index dd3e638dd..2635fb6fe 100644 --- a/datasets/nasa-gpm3imerghhe.yaml +++ b/datasets/nasa-gpm3imerghhe.yaml @@ -1,5 +1,4 @@ -Name: GPM IMERG Early Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 (GPM_3IMERGHHE) - at GES DISC +Name: GPM IMERG Early Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 (GPM_3IMERGHHE) at GES DISC Description: "Version 07B is the current version of the IMERG data sets. Older versions will no longer be available and have been superseded by Version 07.\n\nThe Integrated Multi-satellitE Retrievals for GPM (IMERG) is the unified U.S. algorithm that provides diff --git a/datasets/nasa-gpm3imerghhl.yaml b/datasets/nasa-gpm3imerghhl.yaml index b524526a1..cbdefbdd6 100644 --- a/datasets/nasa-gpm3imerghhl.yaml +++ b/datasets/nasa-gpm3imerghhl.yaml @@ -1,5 +1,4 @@ -Name: GPM IMERG Late Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 (GPM_3IMERGHHL) - at GES DISC +Name: GPM IMERG Late Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 (GPM_3IMERGHHL) at GES DISC Description: |- Version 07B is the current version of the IMERG data sets. 
Older versions will no longer be available and have been superseded by Version 07.\n\nThe Integrated diff --git a/datasets/nasa-gpm3imergm.yaml b/datasets/nasa-gpm3imergm.yaml index 074966231..5a03706c8 100644 --- a/datasets/nasa-gpm3imergm.yaml +++ b/datasets/nasa-gpm3imergm.yaml @@ -1,5 +1,4 @@ -Name: GPM IMERG Final Precipitation L3 1 month 0.1 degree x 0.1 degree V07 (GPM_3IMERGM) - at GES DISC +Name: GPM IMERG Final Precipitation L3 1 month 0.1 degree x 0.1 degree V07 (GPM_3IMERGM) at GES DISC Description: |- Version 07B is the current version of the IMERG data sets. Older versions will no longer be available and have been superseded by Version 07. diff --git a/datasets/nasa-gpmimerglandseamask.yaml b/datasets/nasa-gpmimerglandseamask.yaml index 52f43f805..d7690a77e 100644 --- a/datasets/nasa-gpmimerglandseamask.yaml +++ b/datasets/nasa-gpmimerglandseamask.yaml @@ -1,5 +1,4 @@ -Name: Land/Sea static mask relevant to IMERG precipitation 0.1x0.1 degree V2 (GPM_IMERG_LandSeaMask) - at GES DISC +Name: Land/Sea static mask relevant to IMERG precipitation 0.1x0.1 degree V2 (GPM_IMERG_LandSeaMask) at GES DISC Description: |- Version 2 is the current version of the data set. Older versions will no longer be available and have been superseded by Version 2. diff --git a/datasets/nasa-gpmmergir.yaml b/datasets/nasa-gpmmergir.yaml index 049ecab57..098a21d8b 100644 --- a/datasets/nasa-gpmmergir.yaml +++ b/datasets/nasa-gpmmergir.yaml @@ -1,5 +1,4 @@ -Name: NCEP/CPC L3 Half Hourly 4km Global (60S - 60N) Merged IR V1 (GPM_MERGIR) at - GES DISC +Name: NCEP/CPC L3 Half Hourly 4km Global (60S - 60N) Merged IR V1 (GPM_MERGIR) at GES DISC Description: "These data originate from NOAA/NCEP.\n\nThe NOAA Climate Prediction Center/NCEP/NWS is making the data available originally in binary format, in a weekly rotating archive. 
The NASA GES DISC is acquiring the binary files as they become diff --git a/datasets/nasa-hlsl30.yaml b/datasets/nasa-hlsl30.yaml index 63133b64a..098773dd0 100644 --- a/datasets/nasa-hlsl30.yaml +++ b/datasets/nasa-hlsl30.yaml @@ -1,5 +1,4 @@ -Name: HLS Landsat Operational Land Imager Surface Reflectance and TOA Brightness Daily - Global 30m v2.0 +Name: HLS Landsat Operational Land Imager Surface Reflectance and TOA Brightness Daily Global 30m v2.0 Description: "The Harmonized Landsat Sentinel-2 (HLS) project provides consistent surface reflectance (SR) and top of atmosphere (TOA) brightness data from a virtual constellation of satellite sensors. The Operational Land Imager (OLI) is housed diff --git a/datasets/nasa-hlss30.yaml b/datasets/nasa-hlss30.yaml index e8f40ff26..5332e1d63 100644 --- a/datasets/nasa-hlss30.yaml +++ b/datasets/nasa-hlss30.yaml @@ -1,5 +1,4 @@ -Name: HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance Daily Global 30m - v2.0 +Name: HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance Daily Global 30m v2.0 Description: "The Harmonized Landsat Sentinel-2 (HLS) project provides consistent surface reflectance data from the Operational Land Imager (OLI) aboard the joint NASA/USGS Landsat 8 satellite and the Multi-Spectral Instrument (MSI) aboard Europe’s diff --git a/datasets/nasa-m2i3npasm.yaml b/datasets/nasa-m2i3npasm.yaml index a504f7fcc..e863ee10e 100644 --- a/datasets/nasa-m2i3npasm.yaml +++ b/datasets/nasa-m2i3npasm.yaml @@ -1,5 +1,4 @@ -Name: 'MERRA-2 inst3_3d_asm_Np: 3d,3-Hourly,Instantaneous,Pressure-Level,Assimilation,Assimilated - Meteorological Fields 0.625 x 0.5 degree V5.12.4 (M2I3NPASM) at GES DISC' +Name: 'MERRA-2 inst3_3d_asm_Np: 3d,3-Hourly,Instantaneous,Pressure-Level,Assimilation,Assimilated Meteorological Fields' Description: "M2I3NPASM (or inst3_3d_asm_Np) is an instantaneous 3-dimensional 3-hourly data collection in Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA-2). 
This collection consists of assimilations of meteorological diff --git a/datasets/nasa-m2i3nvaer.yaml b/datasets/nasa-m2i3nvaer.yaml index 0c1f7f796..bd9d7f6aa 100644 --- a/datasets/nasa-m2i3nvaer.yaml +++ b/datasets/nasa-m2i3nvaer.yaml @@ -1,5 +1,4 @@ -Name: 'MERRA-2 inst3_3d_aer_Nv: 3d,3-Hourly,Instantaneous,Model-Level,Assimilation,Aerosol - Mixing Ratio 0.625 x 0.5 degree V5.12.4 (M2I3NVAER) at GES DISC' +Name: 'MERRA-2 inst3_3d_aer_Nv: 3d,3-Hourly,Instantaneous,Model-Level,Assimilation,Aerosol Mixing Ratio 0.625 x 0.5 degree' Description: "M2I3NVAER (or inst3_3d_aer_Nv) is an instantaneous 3-dimensional 3-hourly data collection in Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA-2). This collection consists of assimilations of aerosol mixing diff --git a/datasets/nasa-m2i3nvasm.yaml b/datasets/nasa-m2i3nvasm.yaml index 598502838..c705a48d2 100644 --- a/datasets/nasa-m2i3nvasm.yaml +++ b/datasets/nasa-m2i3nvasm.yaml @@ -1,5 +1,4 @@ -Name: 'MERRA-2 inst3_3d_asm_Nv: 3d,3-Hourly,Instantaneous,Model-Level,Assimilation,Assimilated - Meteorological Fields 0.625 x 0.5 degree V5.12.4 (M2I3NVASM) at GES DISC' +Name: 'MERRA-2 inst3_3d_asm_Nv: 3d,3-Hourly,Instantaneous,Model-Level,Assimilation,Assimilated Meteorological Fields 0.625 x 0.5 degree' Description: "M2I3NVASM (or inst3_3d_asm_Nv) is an instantaneous 3-dimensional 3-hourly data collection in Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA-2). 
This collection consists of assimilations of meteorological diff --git a/datasets/nasa-operal2cslc-s1-staticv1.yaml b/datasets/nasa-operal2cslc-s1-staticv1.yaml index e2241b1f8..8aea817af 100644 --- a/datasets/nasa-operal2cslc-s1-staticv1.yaml +++ b/datasets/nasa-operal2cslc-s1-staticv1.yaml @@ -1,5 +1,4 @@ -Name: OPERA Coregistered Single-Look Complex from Sentinel-1 Static Layers validated - product (Version 1) +Name: OPERA Coregistered Single-Look Complex from Sentinel-1 Static Layers validated product (Version 1) Description: |- The Observational Products for End-Users from Remote Sensing Analysis (OPERA) Coregistered Single-Look Complex (CSLC) from Sentinel-1 (S1) Static Layers (CSLC-S1-STATIC) validated product contains static radar geometry layers associated with the OPERA Coregistered Single-Look Complex (CSLC) from Sentinel-1 (S1) validated product. Due to the S1 mission’s narrow orbital tube, radar-geometry layers vary slightly over time for each position on the ground, and therefore are considered static. These static layers are provided separately from the OPERA CSLC-S1 product, as they are produced only once or a limited number of times, to account for changes in the DEM, in the S1 orbit, or in the static layers generation algorithm. Each OPERA CSLC-S1-STATIC product is distributed as a Hierarchical Data Format version 5 (HDF5) file following the CF-1.8 convention containing both data raster layers and product metadata and corresponds to matching CSLC-S1 products with the same burst ID. OPERA CSLC-S1 products are available over North America which includes the USA and U.S. Territories, Canada within 200 km of the U.S. border, and all mainland countries from the southern U.S. border down to and including Panama. The CSLC-S1 products are available in the associated OPERA Coregistered Single-Look Complex from Sentinel-1 validated product (Version 1) dataset. 
Read our doc on how to get AWS Credentials to retrieve this data: https://cumulus.asf.alaska.edu/s3credentialsREADME diff --git a/datasets/nasa-operal2cslc-s1v1.yaml b/datasets/nasa-operal2cslc-s1v1.yaml index c8fbf5166..e2128a349 100644 --- a/datasets/nasa-operal2cslc-s1v1.yaml +++ b/datasets/nasa-operal2cslc-s1v1.yaml @@ -1,5 +1,4 @@ -Name: OPERA Coregistered Single-Look Complex from Sentinel-1 validated product (Version - 1) +Name: OPERA Coregistered Single-Look Complex from Sentinel-1 validated product (Version 1) Description: "The Observational Products for End-Users from Remote Sensing Analysis (OPERA) Coregistered Single-Look Complex (CSLC) from Sentinel-1 validated product consists of Single Look Complex (SLC) images which contain both amplitude and phase diff --git a/datasets/nasa-operal2rtc-s1-staticv1.yaml b/datasets/nasa-operal2rtc-s1-staticv1.yaml index adb9da30f..f784de02c 100644 --- a/datasets/nasa-operal2rtc-s1-staticv1.yaml +++ b/datasets/nasa-operal2rtc-s1-staticv1.yaml @@ -1,5 +1,4 @@ -Name: OPERA Radiometric Terrain Corrected SAR Backscatter from Sentinel-1 Static Layers - validated product (Version 1) +Name: OPERA Radiometric Terrain Corrected SAR Backscatter from Sentinel-1 Static Layers validated product (Version 1) Description: |- The Observational Products for End-Users from Remote Sensing Analysis (OPERA) Radiometric Terrain Corrected (RTC) SAR Backscatter from Sentinel-1 (S1) Static Layers (RTC-S1-STATIC) validated product contains static radar geometry layers associated with the OPERA Radiometric Terrain Corrected (RTC) SAR Backscatter from Sentinel-1 (S1) (RTC-S1) validated product. Due to the S1 mission’s narrow orbital tube, radar-geometry layers such as incidence angle, local incidence angle, number of looks, and RTC Area Normalization Factor (ANF) vary slightly over time for each position on the ground, and therefore are considered static. 
These static layers are provided separately from the OPERA RTC-S1 product, as they are produced only once or a limited number of times, to account for changes in the DEM, in the S1 orbit, or in the static-layers generation algorithm. Static layers are provided as single-band cloud-optimized GeoTIFF (COG) files, with map grid matching RTC-S1 products with the same burst ID. The standard OPERA RTC-S1 product is derived from the original Copernicus Sentinel-1 (S1) interferometric wide (IW) single-look complex (SLC) data, provided by the European Space Agency, with a temporal sampling coincident with the availability of Sentinel-1A and Sentinel-1B SLC data. The OPERA RTC-S1-STATIC and RTC-S1 products are provided at a near global scope (land masses excluding Antarctica). The RTC-S1 products are available in the associated OPERA Radiometric Terrain Corrected SAR Backscatter from Sentinel-1 validated product (Version 1) dataset. Read our doc on how to get AWS Credentials to retrieve this data: https://cumulus.asf.alaska.edu/s3credentialsREADME diff --git a/datasets/nasa-operal2rtc-s1v1.yaml b/datasets/nasa-operal2rtc-s1v1.yaml index 3c8617cb3..98ecbaa8d 100644 --- a/datasets/nasa-operal2rtc-s1v1.yaml +++ b/datasets/nasa-operal2rtc-s1v1.yaml @@ -1,5 +1,4 @@ -Name: OPERA Radiometric Terrain Corrected SAR Backscatter from Sentinel-1 validated - product (Version 1) +Name: OPERA Radiometric Terrain Corrected SAR Backscatter from Sentinel-1 validated product (Version 1) Description: "The Observational Products for End-Users from Remote Sensing Analysis (OPERA) Radiometric Terrain Corrected (RTC) SAR Backscatter from Sentinel-1 (S1) validated product consists of radar backscatter normalized with respect to the topography. 
diff --git a/datasets/nasa-operal3dist-alert-hls_v1.yaml b/datasets/nasa-operal3dist-alert-hls_v1.yaml index b87ed5373..f5c485c2b 100644 --- a/datasets/nasa-operal3dist-alert-hls_v1.yaml +++ b/datasets/nasa-operal3dist-alert-hls_v1.yaml @@ -1,5 +1,4 @@ -Name: OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 product - (Version 1) +Name: OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 product (Version 1) Description: |- The Observational Products for End-Users from Remote Sensing Analysis ([OPERA](https://www.jpl.nasa.gov/go/opera)) Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 (HLS) product Version 1 maps vegetation disturbance alerts that are derived from data collected by Landsat 8 and Landsat 9 Operational Land Imager (OLI) and Sentinel-2A, Sentinel-2B, and Sentinel-2C Multi-Spectral Instrument (MSI). A vegetation disturbance alert is detected at 30 meter (m) spatial resolution when there is an indicated decrease in vegetation cover within an HLS pixel. The Level-3 data product also provides additional information about more general disturbance trends and auxiliary generic disturbance information as determined from the variations of the reflectance through the HLS scenes. [HLS](https://lpdaac.usgs.gov/product_search/?collections=HLS&status=Operational&view=list) data represent the highest temporal frequency data available at medium spatial resolution. The combined observations will provide greater sensitivity to land changes, whether of large magnitude/short duration or small magnitude/long duration. 
diff --git a/datasets/nasa-operal3dist-alert-hlsprovisionalv0.yaml b/datasets/nasa-operal3dist-alert-hlsprovisionalv0.yaml index dc01fa1d7..dcf283261 100644 --- a/datasets/nasa-operal3dist-alert-hlsprovisionalv0.yaml +++ b/datasets/nasa-operal3dist-alert-hlsprovisionalv0.yaml @@ -1,5 +1,4 @@ -Name: OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 provisional - product (Version 0) +Name: OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 provisional product (Version 0) Description: "The OPERA_L3_DIST-ALERT-HLS Version 0 data product was decommissioned on April 25, 2025. Users are encouraged to use the [OPERA_L3_DIST-ALERT-HLS V1](https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ALERT-HLS_V1.001) data product which was released on March 14, 2024, and has achieved stage 1 validation.\n\nThe diff --git a/datasets/nasa-operal3dist-alert-hlsv1.yaml b/datasets/nasa-operal3dist-alert-hlsv1.yaml index ee890928c..0e3810bf2 100644 --- a/datasets/nasa-operal3dist-alert-hlsv1.yaml +++ b/datasets/nasa-operal3dist-alert-hlsv1.yaml @@ -1,5 +1,4 @@ -Name: OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 product - (Version 1) +Name: OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 product (Version 1) Description: |- The Observational Products for End-Users from Remote Sensing Analysis ([OPERA](https://www.jpl.nasa.gov/go/opera)) Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 (HLS) product Version 1 maps vegetation disturbance alerts that are derived from data collected by Landsat 8 and Landsat 9 Operational Land Imager (OLI) and Sentinel-2A, Sentinel-2B, and Sentinel-2C Multi-Spectral Instrument (MSI). A vegetation disturbance alert is detected at 30 meter (m) spatial resolution when there is an indicated decrease in vegetation cover within an HLS pixel. 
The Level-3 data product also provides additional information about more general disturbance trends and auxiliary generic disturbance information as determined from the variations of the reflectance through the HLS scenes. [HLS](https://lpdaac.usgs.gov/product_search/?collections=HLS&status=Operational&view=list) data represent the highest temporal frequency data available at medium spatial resolution. The combined observations will provide greater sensitivity to land changes, whether of large magnitude/short duration or small magnitude/long duration. diff --git a/datasets/nasa-operal3dist-ann-hlsv1.yaml b/datasets/nasa-operal3dist-ann-hlsv1.yaml index d7e7ef843..bad381dfc 100644 --- a/datasets/nasa-operal3dist-ann-hlsv1.yaml +++ b/datasets/nasa-operal3dist-ann-hlsv1.yaml @@ -1,5 +1,4 @@ -Name: OPERA Land Surface Disturbance Annual from Harmonized Landsat Sentinel-2 product - (Version 1) +Name: OPERA Land Surface Disturbance Annual from Harmonized Landsat Sentinel-2 product (Version 1) Description: "The Observational Products for End-Users from Remote Sensing Analysis ([OPERA](https://www.jpl.nasa.gov/go/opera)) Land Surface Disturbance Annual from Harmonized Landsat Sentinel-2 (HLS) product Version 1 summarizes the [DIST-ALERT](https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ALERT-HLS_V1.001) diff --git a/datasets/nasa-operal3dswx-hlsv1.yaml b/datasets/nasa-operal3dswx-hlsv1.yaml index 3d9a80c57..0c3818c04 100644 --- a/datasets/nasa-operal3dswx-hlsv1.yaml +++ b/datasets/nasa-operal3dswx-hlsv1.yaml @@ -1,5 +1,4 @@ -Name: OPERA Dynamic Surface Water Extent from Harmonized Landsat Sentinel-2 product - (Version 1) +Name: OPERA Dynamic Surface Water Extent from Harmonized Landsat Sentinel-2 product (Version 1) Description: "This dataset contains Level-3 Dynamic OPERA surface water extent product version 1. The data are validated surface water extent observations beginning April 2023. Known issues and caveats on usage are described under Documentation. 
The input From 5cacd599d67885144e2c68e642ef097edfb380c3 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 12 Aug 2025 08:11:37 -0800 Subject: [PATCH 218/751] Update nasa-gedil4aagbdensityv212056.yaml From 89c2b4559863f5bf07f1349cb8dcee712bdf42c9 Mon Sep 17 00:00:00 2001 From: nutellaBear <48599863+LalithShiyam@users.noreply.github.com> Date: Wed, 13 Aug 2025 12:50:51 +0200 Subject: [PATCH 219/751] Create enhance-pet-1-6k.yaml MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds AWS Open Data Registry entry for ENHANCE.PET 1.6k — a 1,597-case multi-center [18F]FDG-PET/CT dataset with 130 CT-derived segmentations across 7 anatomical groups. Includes licensing by origin (CC BY 4.0, CC BY-NC 4.0), metadata spreadsheets, and labels.json. Access via MOOSE CLI from AWS Open Data S3. --- datasets/enhance-pet-1-6k.yaml | 71 ++++++++++++++++++++++++++++++++++ 1 file changed, 71 insertions(+) create mode 100644 datasets/enhance-pet-1-6k.yaml diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml new file mode 100644 index 000000000..c20f1019c --- /dev/null +++ b/datasets/enhance-pet-1-6k.yaml @@ -0,0 +1,71 @@ +Name: ENHANCE.PET 1.6k: Whole-/Total-Body [18F]FDG-PET/CT with CT-Derived Segmentations +Description: > + Open, multi-center dataset of 1,597 whole-/total-body FDG-PET/CT studies with + 130 CT-derived, expert-verified anatomical segmentations per scan (~250 GB). + Provided as anonymized NIfTI (PET, CT, labels) with spreadsheet metadata. + Designed for segmentation benchmarking, multi-organ analysis, radiomics, and PET/CT AI research. 
+ +Documentation: + - https://github.com/ENHANCE-PET/MOOSE/blob/main/DATA_CARD.md +Contact: Lalith.shiyam@med.uni-muenchen.de +ManagedBy: ENHANCE.PET initiative (LMU Klinikum & partners) +UpdateFrequency: Ad hoc (new releases aligned with additional cohort availability) + +Tags: + - medical-imaging + - pet + - ct + - fdg + - segmentation + - nifti + - oncology + - open-data + - radiomics + - ai-tools + +License: > + Dataset licensing per originating site: + - AutoPET Challenge: CC BY-NC 4.0 (non-commercial use) + - University Hospital Leipzig: CC BY 4.0 + - Azienda Ospedaliero Universitaria Careggi: CC BY 4.0 + Software (MOOSE): Apache-2.0. + +Citation: > + Ferrara D. et al. (2025). Sharing a whole-/total-body [18F]FDG-PET/CT dataset with + CT-derived segmentations: an ENHANCE.PET initiative. https://doi.org/10.21203/rs.3.rs-7169062/v2 + +Resources: + - Description: ENHANCE.PET 1.6k dataset (hosted under AWS Open Data) + ARN: arn:aws:s3::: + Region: us-east-1 + Type: S3 Bucket + Explore: s3:/// + +DataAtWork: + Tutorials: + - Title: Dataset Organization & AWS Access (MOOSE CLI) + URL: https://github.com/ENHANCE-PET/MOOSE/blob/main/DATA_CARD.md + AuthorName: ENHANCE.PET Team + AuthorURL: https://enhance.pet/ + Services: [ S3 ] + + Tools & Applications: + - Title: MOOSE (Multi-organ objective segmentation tool) + URL: https://github.com/ENHANCE-PET/MOOSE + AuthorName: ENHANCE.PET (QIMP Team) + AuthorURL: https://enhance.pet/ + + Publications: + - Title: Sharing a whole-/total-body [18F]FDG-PET/CT dataset with CT-derived segmentations: an ENHANCE.PET initiative + URL: https://doi.org/10.21203/rs.3.rs-7169062/v2 + AuthorName: Ferrara, D.; Pires, M.; Gutschmayer, S.; Yu, J.; Abdelhafez, Y. G.; Abenavoli, E.; Badawi, R. D.; + Chaudhari, A. J.; Chen, M. S.; Cherry, S. R.; Frille, A.; Geist, B. K.; Grüenert, S.; Hacker, M.; + Hesse, S.; Kerkhoff, T.; Linder, P.; Pappisch, J.; Pusitz, S.; Raslan, O. A.; Rausch, I.; + Raychaudhuri, S. 
P.; Sabri, O.; Schmidt, F.; Sciagrà, R.; Spencer, B.; Wang, G.; Wirtz, H.; + Beyer, T.; Sundar, L. K. S. + AuthorURL: https://orcid.org/0000-0002-8711-8081 + +DeprecatedNotice: "" +ADXCategories: + - life-sciences + - machine-learning From 67a520669849cc0456883e947a655c243382d3e2 Mon Sep 17 00:00:00 2001 From: Charlotte <146997821+charlottecrevier@users.noreply.github.com> Date: Wed, 13 Aug 2025 10:04:55 -0400 Subject: [PATCH 220/751] Added publication MRDEM30 --- datasets/canelevation-dem.yaml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/datasets/canelevation-dem.yaml b/datasets/canelevation-dem.yaml index 456b7f073..2a7a08c42 100644 --- a/datasets/canelevation-dem.yaml +++ b/datasets/canelevation-dem.yaml @@ -77,10 +77,10 @@ DataAtWork: URL: https://nrcan.github.io/cloud-optimized-geospatial/ AuthorName: NRCan Publications: - - Title: - URL: - AuthorName: - AuthorURL: + - Title: Descriptor: Medium Resolution Digital Elevation Model From Natural Resources Canada’s CanElevation Series (MRDEM-30) + URL: https://doi.org/10.1109/IEEEDATA.2025.3576318 + AuthorName: H. McGrath et al. 
+ From bb6f9cf1f94c59f5233695abb983e85c4f4f36d2 Mon Sep 17 00:00:00 2001 From: nutellaBear <48599863+LalithShiyam@users.noreply.github.com> Date: Wed, 13 Aug 2025 16:48:27 +0200 Subject: [PATCH 221/751] Update enhance-pet-1-6k.yaml --- datasets/enhance-pet-1-6k.yaml | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml index c20f1019c..09f7a1c06 100644 --- a/datasets/enhance-pet-1-6k.yaml +++ b/datasets/enhance-pet-1-6k.yaml @@ -13,15 +13,11 @@ UpdateFrequency: Ad hoc (new releases aligned with additional cohort availabilit Tags: - medical-imaging - - pet - - ct - - fdg - segmentation - nifti - - oncology - - open-data - - radiomics - - ai-tools + - cancer + - radiology + - life sciences License: > Dataset licensing per originating site: From dba02db553929f0e4916e7581fc53347c42793c9 Mon Sep 17 00:00:00 2001 From: nutellaBear <48599863+LalithShiyam@users.noreply.github.com> Date: Wed, 13 Aug 2025 18:45:19 +0200 Subject: [PATCH 222/751] Update enhance-pet-1-6k.yaml --- datasets/enhance-pet-1-6k.yaml | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml index 09f7a1c06..62d96faa2 100644 --- a/datasets/enhance-pet-1-6k.yaml +++ b/datasets/enhance-pet-1-6k.yaml @@ -31,12 +31,23 @@ Citation: > CT-derived segmentations: an ENHANCE.PET initiative. 
https://doi.org/10.21203/rs.3.rs-7169062/v2 Resources: - - Description: ENHANCE.PET 1.6k dataset (hosted under AWS Open Data) - ARN: arn:aws:s3::: - Region: us-east-1 + - Description: ENHANCE.PET 1.6k public S3 bucket + ARN: arn:aws:s3:::enhance-pet-1-6k + Region: us-west-2 Type: S3 Bucket - Explore: s3:/// + Explore: https://enhance-pet-1-6k.s3.us-west-2.amazonaws.com/ + - Description: ENHANCE.PET 1.6k dataset ZIP (≈250 GB) + ARN: arn:aws:s3:::enhance-pet-1-6k/ENHANCE-PET-1_6k.zip + Region: us-west-2 + Type: S3 Object + Explore: https://enhance-pet-1-6k.s3.us-west-2.amazonaws.com/ENHANCE-PET-1_6k.zip + + - Description: SNS topic for ENHANCE.PET 1.6k S3 object creation events + ARN: arn:aws:sns:us-west-2:602670427264:enhance-pet-1-6k-object_created + Region: us-west-2 + Type: SNS Topic + DataAtWork: Tutorials: - Title: Dataset Organization & AWS Access (MOOSE CLI) From f3c54d8f323e90e4c707cb92738b2696c1882661 Mon Sep 17 00:00:00 2001 From: Chris Stoner Date: Wed, 13 Aug 2025 08:52:58 -0800 Subject: [PATCH 223/751] removed duplicate NASA dataset --- datasets/nasa-operal3dist-alert-hls_v1.yaml | 42 --------------------- datasets/nasa-operal3dist-alert-hlsv1.yaml | 2 +- 2 files changed, 1 insertion(+), 43 deletions(-) delete mode 100644 datasets/nasa-operal3dist-alert-hls_v1.yaml diff --git a/datasets/nasa-operal3dist-alert-hls_v1.yaml b/datasets/nasa-operal3dist-alert-hls_v1.yaml deleted file mode 100644 index f5c485c2b..000000000 --- a/datasets/nasa-operal3dist-alert-hls_v1.yaml +++ /dev/null @@ -1,42 +0,0 @@ -Name: OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 product (Version 1) -Description: |- - The Observational Products for End-Users from Remote Sensing Analysis ([OPERA](https://www.jpl.nasa.gov/go/opera)) Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 (HLS) product Version 1 maps vegetation disturbance alerts that are derived from data collected by Landsat 8 and Landsat 9 Operational Land Imager (OLI) and 
Sentinel-2A, Sentinel-2B, and Sentinel-2C Multi-Spectral Instrument (MSI). A vegetation disturbance alert is detected at 30 meter (m) spatial resolution when there is an indicated decrease in vegetation cover within an HLS pixel. The Level-3 data product also provides additional information about more general disturbance trends and auxiliary generic disturbance information as determined from the variations of the reflectance through the HLS scenes. [HLS](https://lpdaac.usgs.gov/product_search/?collections=HLS&status=Operational&view=list) data represent the highest temporal frequency data available at medium spatial resolution. The combined observations will provide greater sensitivity to land changes, whether of large magnitude/short duration or small magnitude/long duration. - - The OPERA_L3_DIST-ALERT-HLS (or DIST-ALERT) data product is provided in Cloud Optimized GeoTIFF (COG) format, and each layer is distributed as a separate file. There are 19 layers contained within the DIST-ALERT product. The layers for both vegetation and generic disturbance include disturbance status, loss or anomaly, maximum loss anomaly, disturbance confidence layer, date of disturbance, count of observations with loss anomalies, days of ongoing anomalies, and day of last disturbance detection. Additional layers are vegetation cover percent, historical percent vegetation cover, and data mask. See the Product Specification Document (PSD) for a more detailed description of the individual layers provided in the DIST-ALERT product. - - The OPERA_L3_DIST-ALERT-HLS product contains modified Copernicus Sentinel data (2020-2025). - - Known Issues - * Additional usage constraints are provided under Section 5 of the Algorithm Theoretical Basis Document (ATBD). 
- Read our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME -Documentation: https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ALERT-HLS_V1.001 -Contact: 'Email: lpdaac@usgs.gov. Home Page: https://lpdaac.usgs.gov/' -ManagedBy: NASA -UpdateFrequency: From 2022-01-01 to Ongoing (Daily - < Weekly) -Tags: - - aws-pds - - cog - - earth observation - - environmental - - global - - land - - land cover - - land use - - satellite imagery -License: '[Creative Commons BY 4.0](https://creativecommons.org/licenses/by/4.0/)' -Resources: - - Description: 'OPERA Land Surface Disturbance Alert from Harmonized Landsat Sentinel-2 - product (Version 1).' - ARN: arn:aws:s3:::lp-protected/OPERA_L3_DIST-ALERT-HLS_V1.001 - Region: us-west-2 - Type: S3 Bucket - RequesterPays: false - ControlledAccess: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials -DataAtWork: - Tutorials: - - Title: Getting Started with OPERA DIST-ALERT-HLS Products - URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ALERT/Discover/Stream_and_Viz_DIST-ALERT-folium.ipynb - AuthorName: R. Dhillon and M. Grace Bato - - Title: Getting Started with OPERA DIST Product - URL: https://github.com/OPERA-Cal-Val/OPERA_Applications/blob/main/DIST/DIST_ALERT/Wildfire/Intro_To_DIST.ipynb - AuthorName: M. Grace Bato and R. Dhillon diff --git a/datasets/nasa-operal3dist-alert-hlsv1.yaml b/datasets/nasa-operal3dist-alert-hlsv1.yaml index 0e3810bf2..f5c485c2b 100644 --- a/datasets/nasa-operal3dist-alert-hlsv1.yaml +++ b/datasets/nasa-operal3dist-alert-hlsv1.yaml @@ -10,7 +10,7 @@ Description: |- * Additional usage constraints are provided under Section 5 of the Algorithm Theoretical Basis Document (ATBD). 
Read our doc on how to get AWS Credentials to retrieve this data: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME Documentation: https://doi.org/10.5067/SNWG/OPERA_L3_DIST-ALERT-HLS_V1.001 -Contact: 'User Services: lpdaac@usgs.gov' +Contact: 'Email: lpdaac@usgs.gov. Home Page: https://lpdaac.usgs.gov/' ManagedBy: NASA UpdateFrequency: From 2022-01-01 to Ongoing (Daily - < Weekly) Tags: From 9a2e0fc76b78342ea8a475f98d4367335257f09f Mon Sep 17 00:00:00 2001 From: berylrab Date: Wed, 13 Aug 2025 14:37:06 -0400 Subject: [PATCH 224/751] Update enhance-pet-1-6k.yaml Removing hyphen from "medical imaging" --- datasets/enhance-pet-1-6k.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml index 62d96faa2..d276cb22e 100644 --- a/datasets/enhance-pet-1-6k.yaml +++ b/datasets/enhance-pet-1-6k.yaml @@ -12,7 +12,7 @@ ManagedBy: ENHANCE.PET initiative (LMU Klinikum & partners) UpdateFrequency: Ad hoc (new releases aligned with additional cohort availability) Tags: - - medical-imaging + - medical imaging - segmentation - nifti - cancer From 7a9cdf9742720a0ab258d2a7a0c06ec8ff31f670 Mon Sep 17 00:00:00 2001 From: berylrab Date: Wed, 13 Aug 2025 14:49:33 -0400 Subject: [PATCH 225/751] Update enhance-pet-1-6k.yaml --- datasets/enhance-pet-1-6k.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml index d276cb22e..29bccaff2 100644 --- a/datasets/enhance-pet-1-6k.yaml +++ b/datasets/enhance-pet-1-6k.yaml @@ -1,4 +1,4 @@ -Name: ENHANCE.PET 1.6k: Whole-/Total-Body [18F]FDG-PET/CT with CT-Derived Segmentations +Name: ENHANCE.PET 1.6k - Whole-/Total-Body [18F]FDG-PET/CT with CT-Derived Segmentations Description: > Open, multi-center dataset of 1,597 whole-/total-body FDG-PET/CT studies with 130 CT-derived, expert-verified anatomical segmentations per scan (~250 GB). 
@@ -28,7 +28,7 @@ License: > Citation: > Ferrara D. et al. (2025). Sharing a whole-/total-body [18F]FDG-PET/CT dataset with - CT-derived segmentations: an ENHANCE.PET initiative. https://doi.org/10.21203/rs.3.rs-7169062/v2 + CT-derived segmentations - an ENHANCE.PET initiative. https://doi.org/10.21203/rs.3.rs-7169062/v2 Resources: - Description: ENHANCE.PET 1.6k public S3 bucket @@ -63,7 +63,7 @@ DataAtWork: AuthorURL: https://enhance.pet/ Publications: - - Title: Sharing a whole-/total-body [18F]FDG-PET/CT dataset with CT-derived segmentations: an ENHANCE.PET initiative + - Title: Sharing a whole-/total-body [18F]FDG-PET/CT dataset with CT-derived segmentations - an ENHANCE.PET initiative URL: https://doi.org/10.21203/rs.3.rs-7169062/v2 AuthorName: Ferrara, D.; Pires, M.; Gutschmayer, S.; Yu, J.; Abdelhafez, Y. G.; Abenavoli, E.; Badawi, R. D.; Chaudhari, A. J.; Chen, M. S.; Cherry, S. R.; Frille, A.; Geist, B. K.; Grüenert, S.; Hacker, M.; From 4dbc2504fa5a3b9b73d460069811fb97779d98e6 Mon Sep 17 00:00:00 2001 From: kszura <43186787+kszura@users.noreply.github.com> Date: Wed, 13 Aug 2025 15:41:15 -0400 Subject: [PATCH 226/751] Update noaa-nws-hafs.yaml Added information under Update Frequency to note that additional forecast files for the 2022 hurricane season are now available. --- datasets/noaa-nws-hafs.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/noaa-nws-hafs.yaml b/datasets/noaa-nws-hafs.yaml index 3ed6b9f8c..d2e9b2f56 100644 --- a/datasets/noaa-nws-hafs.yaml +++ b/datasets/noaa-nws-hafs.yaml @@ -15,7 +15,7 @@ Contact: | For any questions regarding data delivery or any general questions regarding the NOAA Open Data Dissemination (NODD) Program, email the NODD Team at nodd@noaa.gov.
We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NODD team by emailing nodd@noaa.gov ManagedBy: "[NOAA](http://www.noaa.gov/)" -UpdateFrequency: Event Driven +UpdateFrequency: Event Driven.
As of August 2025, a few forecast cycles for Hurricane Fiona (07L) and Hurricane Nicole (17L) from the 2022 Atlantic hurricane season have been made available. These files can be found under the subdirectories of hfsa_retro and hfsb_retro. Additional forecast files from the 2022 hurricane season can be made available upon user request. Collabs: ASDI: Tags: From 066119b475c33909a8348cab9163c5bb67f7b9bf Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 13 Aug 2025 12:05:26 -0800 Subject: [PATCH 227/751] Update noaa-nws-hafs.yaml From fd3f41e109ab6fd70d98c944fafc884d9c92baca Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juan=20Pablo=20Casta=C3=B1o?= <64097439+jpcastanoo@users.noreply.github.com> Date: Thu, 14 Aug 2025 10:00:22 -0400 Subject: [PATCH 228/751] Update dendritic-consortium.yaml --- datasets/dendritic-consortium.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/dendritic-consortium.yaml b/datasets/dendritic-consortium.yaml index fd82b01f0..9166377c8 100644 --- a/datasets/dendritic-consortium.yaml +++ b/datasets/dendritic-consortium.yaml @@ -30,5 +30,11 @@ DataAtWork: URL: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium/tree/main/tutorials AuthorName: Dendritic Consortium AuthorURL: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium + Tools & Applications: + - Title: Dendritic Consortium Database + URL: https://dendritic-consortium.vercel.app/database + AuthorName: Dendritic Consortium + AuthorURL: https://dendritic-consortium.vercel.app ADXCategories: - Healthcare & Life Sciences Data + From 3cca4641fb012e0f5ba68cfa35bc20c250015c08 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 14 Aug 2025 11:10:53 -0400 Subject: [PATCH 229/751] Update enhance-pet-1-6k.yaml --- datasets/enhance-pet-1-6k.yaml | 6 ------ 1 file changed, 6 deletions(-) diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml index 29bccaff2..44160e068 100644 --- 
a/datasets/enhance-pet-1-6k.yaml +++ b/datasets/enhance-pet-1-6k.yaml @@ -37,12 +37,6 @@ Resources: Type: S3 Bucket Explore: https://enhance-pet-1-6k.s3.us-west-2.amazonaws.com/ - - Description: ENHANCE.PET 1.6k dataset ZIP (≈250 GB) - ARN: arn:aws:s3:::enhance-pet-1-6k/ENHANCE-PET-1_6k.zip - Region: us-west-2 - Type: S3 Object - Explore: https://enhance-pet-1-6k.s3.us-west-2.amazonaws.com/ENHANCE-PET-1_6k.zip - - Description: SNS topic for ENHANCE.PET 1.6k S3 object creation events ARN: arn:aws:sns:us-west-2:602670427264:enhance-pet-1-6k-object_created Region: us-west-2 From f8c4614a6dfbd952b6012fe7c5029d17bf34535a Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 14 Aug 2025 15:20:01 -0400 Subject: [PATCH 230/751] Update enhance-pet-1-6k.yaml --- datasets/enhance-pet-1-6k.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml index 44160e068..9f930e7a9 100644 --- a/datasets/enhance-pet-1-6k.yaml +++ b/datasets/enhance-pet-1-6k.yaml @@ -48,7 +48,7 @@ DataAtWork: URL: https://github.com/ENHANCE-PET/MOOSE/blob/main/DATA_CARD.md AuthorName: ENHANCE.PET Team AuthorURL: https://enhance.pet/ - Services: [ S3 ] + Services: [ s3 ] Tools & Applications: - Title: MOOSE (Multi-organ objective segmentation tool) From 28ad8f954ab935735826de98a05f88b456a439c3 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 14 Aug 2025 15:25:24 -0400 Subject: [PATCH 231/751] Update enhance-pet-1-6k.yaml --- datasets/enhance-pet-1-6k.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml index 9f930e7a9..a6b78b0c2 100644 --- a/datasets/enhance-pet-1-6k.yaml +++ b/datasets/enhance-pet-1-6k.yaml @@ -48,7 +48,6 @@ DataAtWork: URL: https://github.com/ENHANCE-PET/MOOSE/blob/main/DATA_CARD.md AuthorName: ENHANCE.PET Team AuthorURL: https://enhance.pet/ - Services: [ s3 ] Tools & Applications: - Title: MOOSE (Multi-organ objective segmentation tool) From 
0188335b4bc2e28ab7f1ddf335f404291c1b072b Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 14 Aug 2025 15:40:26 -0400 Subject: [PATCH 232/751] Update enhance-pet-1-6k.yaml --- datasets/enhance-pet-1-6k.yaml | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml index a6b78b0c2..7ba35b08a 100644 --- a/datasets/enhance-pet-1-6k.yaml +++ b/datasets/enhance-pet-1-6k.yaml @@ -65,7 +65,5 @@ DataAtWork: Beyer, T.; Sundar, L. K. S. AuthorURL: https://orcid.org/0000-0002-8711-8081 -DeprecatedNotice: "" ADXCategories: - - life-sciences - - machine-learning + - Healthcare & Life Sciences Data From d95efd3b4e2359830b27d8b025f5703b3f40f44b Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 14 Aug 2025 15:51:54 -0400 Subject: [PATCH 233/751] Update enhance-pet-1-6k.yaml --- datasets/enhance-pet-1-6k.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml index 7ba35b08a..797183c1a 100644 --- a/datasets/enhance-pet-1-6k.yaml +++ b/datasets/enhance-pet-1-6k.yaml @@ -5,8 +5,7 @@ Description: > Provided as anonymized NIfTI (PET, CT, labels) with spreadsheet metadata. Designed for segmentation benchmarking, multi-organ analysis, radiomics, and PET/CT AI research. 
-Documentation: - - https://github.com/ENHANCE-PET/MOOSE/blob/main/DATA_CARD.md +Documentation: https://github.com/ENHANCE-PET/MOOSE/blob/main/DATA_CARD.md Contact: Lalith.shiyam@med.uni-muenchen.de ManagedBy: ENHANCE.PET initiative (LMU Klinikum & partners) UpdateFrequency: Ad hoc (new releases aligned with additional cohort availability) @@ -35,7 +34,8 @@ Resources: ARN: arn:aws:s3:::enhance-pet-1-6k Region: us-west-2 Type: S3 Bucket - Explore: https://enhance-pet-1-6k.s3.us-west-2.amazonaws.com/ + Explore: + - https://enhance-pet-1-6k.s3.us-west-2.amazonaws.com/ - Description: SNS topic for ENHANCE.PET 1.6k S3 object creation events ARN: arn:aws:sns:us-west-2:602670427264:enhance-pet-1-6k-object_created From cc913a1f9cd35ac5720408f93d5e1d032884e5fd Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 14 Aug 2025 15:56:01 -0400 Subject: [PATCH 234/751] Update enhance-pet-1-6k.yaml --- datasets/enhance-pet-1-6k.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/enhance-pet-1-6k.yaml b/datasets/enhance-pet-1-6k.yaml index 797183c1a..18f6d5fb6 100644 --- a/datasets/enhance-pet-1-6k.yaml +++ b/datasets/enhance-pet-1-6k.yaml @@ -35,7 +35,7 @@ Resources: Region: us-west-2 Type: S3 Bucket Explore: - - https://enhance-pet-1-6k.s3.us-west-2.amazonaws.com/ + - '[Browse Bucket](https://enhance-pet-1-6k.s3.us-west-2.amazonaws.com/)' - Description: SNS topic for ENHANCE.PET 1.6k S3 object creation events ARN: arn:aws:sns:us-west-2:602670427264:enhance-pet-1-6k-object_created From 13847ba041d73b149a426e25c1b74ae62c4f6faf Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juan=20Pablo=20Casta=C3=B1o?= <64097439+jpcastanoo@users.noreply.github.com> Date: Fri, 15 Aug 2025 10:54:23 -0400 Subject: [PATCH 235/751] Update dendritic-consortium.yaml Adding SNS topic resource --- datasets/dendritic-consortium.yaml | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/datasets/dendritic-consortium.yaml b/datasets/dendritic-consortium.yaml index 
9166377c8..366e9391a 100644 --- a/datasets/dendritic-consortium.yaml +++ b/datasets/dendritic-consortium.yaml @@ -1,5 +1,5 @@ Name: Dendritic Consortium Multimodal Dataset -Description: The Dendritic Consortium provides a multimodal dataset integrating calcium and voltage imaging, electrophysiology, electron microscopy, proteomics, and computational models of Baz1a pyramidal neurons in the mouse primary visual cortex (V1), and endodermal neurons in Hydra vulgaris. +Description: The Dendritic Consortium provides a multimodal dataset integrating calcium and voltage imaging, electrophysiology, electron microscopy, proteomics, and computational models of Baz1a pyramidal neurons in the mouse primary visual cortex (V1). Documentation: https://github.com/jpcastanoo/aws-open-data-dendritic-consortium Contact: dendriticconsortium@gmail.com ManagedBy: Dendritic Consortium @@ -20,10 +20,14 @@ Tags: - single neuron models License: There are no restrictions on the use of this data. Resources: - - Description: "Multimodal dataset from Baz1a pyramidal neurons in mouse V1 and endodermal neurons in Hydra vulgaris, including TIFF, ABF, MAT, CSV, PNG, PY, HOC, and SWC files." + - Description: "Multimodal dataset from Baz1a pyramidal neurons in mouse V1, including TIFF, ABF, MAT, CSV, PNG, PY, HOC, and SWC files." 
ARN: arn:aws:s3:::dendritic-consortium - Region: us-east-2 + Region: us-west-2 Type: S3 Bucket + - Description: Notifications for new Dendritic Consortium data + ARN: arn:aws:sns:us-west-2:662855374544:dendritic-consortium-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Tutorials: - Title: Download and Visualize Data from the Dendritic Consortium Dataset @@ -37,4 +41,3 @@ DataAtWork: AuthorURL: https://dendritic-consortium.vercel.app ADXCategories: - Healthcare & Life Sciences Data - From 5a155b838ddb9de66bc625a405bbafee5043abfa Mon Sep 17 00:00:00 2001 From: Matthew Berkeley <42berkeley@cua.edu> Date: Fri, 15 Aug 2025 17:48:03 +0200 Subject: [PATCH 236/751] Add S3 Bucket --- datasets/busco-data.yaml | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/datasets/busco-data.yaml b/datasets/busco-data.yaml index c2392d0dd..2963f7a05 100644 --- a/datasets/busco-data.yaml +++ b/datasets/busco-data.yaml @@ -18,11 +18,10 @@ License: The BUSCO datasets are licensed under the Creative Commons Attribution- Any use of these datasets for analyses in a publication or product must include the citation of the corresponding paper: https://doi.org/10.1093/molbev/msab199 Citation: Resources: - - Description: - ARN: - Region: - Type: - Explore: + - Description: BUSCO datasets and companion files for use with BUSCO pipeline + ARN: arn:aws:s3:::busco-data + Region: us-east-1 + Type: S3 Bucket DataAtWork: Tutorials: - Title: BUSCO - from QC to gene prediction and phylogenomics From 233554638fb32b83735a15a92b042872847a29ea Mon Sep 17 00:00:00 2001 From: tim-essential Date: Sun, 17 Aug 2025 19:16:20 -0400 Subject: [PATCH 237/751] Update eai-essential-web-v1.yaml to include ARN and Region for S3 bucket --- datasets/eai-essential-web-v1.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/eai-essential-web-v1.yaml b/datasets/eai-essential-web-v1.yaml index 4a43116e5..3eb01efc5 100644 --- 
a/datasets/eai-essential-web-v1.yaml +++ b/datasets/eai-essential-web-v1.yaml @@ -13,8 +13,8 @@ Tags: License: 'Essential-Web-v1.0 contributions are made available under the [ODC attribution license](https://opendatacommons.org/licenses/by/odc_by_1.0_public_text.txt); however, users should also abide by the [Common Crawl - Terms of Use](https://commoncrawl.org/terms-of-use). We do not alter the license of any of the underlying data.' Resources: - Description: 'Essential-Web v1.0: 24T tokens of organized web data' - ARN: # TODO: fill in - Region: # TODO: fill in + ARN: arn:aws:s3:::essential-web-v1.0 + Region: us-west-2 Type: S3 Bucket Explore: - https://huggingface.co/datasets/EssentialAI/essential-web-v1.0 From 7de0ae3e38ea259ed6992ca385ed6fa4fda3f02d Mon Sep 17 00:00:00 2001 From: Matthew Berkeley <42berkeley@cua.edu> Date: Mon, 18 Aug 2025 12:37:11 +0200 Subject: [PATCH 238/751] Add SNS topic --- datasets/busco-data.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/busco-data.yaml b/datasets/busco-data.yaml index 2963f7a05..90b76f41c 100644 --- a/datasets/busco-data.yaml +++ b/datasets/busco-data.yaml @@ -22,6 +22,10 @@ Resources: ARN: arn:aws:s3:::busco-data Region: us-east-1 Type: S3 Bucket + - Description: Notifications for new BUSCO data + ARN: arn:aws:sns:us-east-1:622022425660:my-dataset-object_created + Region: us-east-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: BUSCO - from QC to gene prediction and phylogenomics From 7a31cfe85047b00fa0a7c1e59f2948cc2d4b19ea Mon Sep 17 00:00:00 2001 From: Charlotte <146997821+charlottecrevier@users.noreply.github.com> Date: Mon, 18 Aug 2025 15:11:51 -0400 Subject: [PATCH 239/751] Update canelevation-dem.yaml Added browse bucket link and Radiant Earth STAC Browser link to each ressources --- datasets/canelevation-dem.yaml | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/datasets/canelevation-dem.yaml b/datasets/canelevation-dem.yaml index 2a7a08c42..a10fd399e 100644 
--- a/datasets/canelevation-dem.yaml +++ b/datasets/canelevation-dem.yaml @@ -29,28 +29,32 @@ Resources: Type: S3 Bucket Explore: - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=hrdem-mosaic-1m)' - - '[Browse Bucket](...)' + - '[STAC Browser by Radiant Earth](https://radiantearth.github.io/stac-browser/#/external/datacube.services.geo.ca/stac/api/collections/hrdem-mosaic-1m)' + - '[Browse Bucket](https://canelevation-dem.s3.ca-central-1.amazonaws.com/index.html)' - Description: Mosaic of High Resolution Digital Elevation Model (HRDEM) at 2m / Mosaïque de Modèle numérique d'élévation de haute résolution (MNEHR) à 2m ARN: arn:aws:s3:::canelevation-dem/hrdem-mosaic-2m/ Region: ca-central-1 Type: S3 Bucket Explore: - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=hrdem-mosaic-2m)' - - '[Browse Bucket](...)' + - '[STAC Browser by Radiant Earth](https://radiantearth.github.io/stac-browser/#/external/datacube.services.geo.ca/stac/api/collections/hrdem-mosaic-2m)' + - '[Browse Bucket](https://canelevation-dem.s3.ca-central-1.amazonaws.com/index.html)' - Description: Medium Resolution Digital Elevation Model (MRDEM). Modèle numérique d'élévation de moyenne résolution (MNEMR) ARN: arn:aws:s3:::canelevation-dem/mrdem-30/ Region: ca-central-1 Type: S3 Bucket Explore: - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=mrdem-30)' - - '[Browse Bucket](...)' + - '[STAC Browser by Radiant Earth](https://radiantearth.github.io/stac-browser/#/external/datacube.services.geo.ca/stac/api/collections/mrdem-30)' + - '[Browse Bucket](https://canelevation-dem.s3.ca-central-1.amazonaws.com/index.html)' - Description: Mosaic of High Resolution Digital Elevation Model (HRDEM) by LiDAR acquisition project. Mosaïque de Modèle numérique d'élévation de haute résolution (MNEHR) par project d'acquisition LiDAR. 
ARN: arn:aws:s3:::canelevation-dem/hrdem-lidar/ Region: ca-central-1 Type: S3 Bucket Explore: - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=hrdem-lidar)' - - '[Browse Bucket](...)' + - '[STAC Browser by Radiant Earth](https://radiantearth.github.io/stac-browser/#/external/datacube.services.geo.ca/stac/api/collections/hrdem-lidar)' + - '[Browse Bucket](https://canelevation-dem.s3.ca-central-1.amazonaws.com/index.html)' - Description: Notifications for Canada Digital Elevation Models. ARN: arn:aws:sns:ca-central-1:675987781521:canelevation-dem-create-object Region: ca-central-1 @@ -86,3 +90,4 @@ DataAtWork: + From 090c19092dc05fa58586ec1df9f0878c3a47debc Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 18 Aug 2025 12:12:45 -0800 Subject: [PATCH 240/751] Update canelevation-dem.yaml --- datasets/canelevation-dem.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/canelevation-dem.yaml b/datasets/canelevation-dem.yaml index a10fd399e..cbb4fe7d2 100644 --- a/datasets/canelevation-dem.yaml +++ b/datasets/canelevation-dem.yaml @@ -81,7 +81,7 @@ DataAtWork: URL: https://nrcan.github.io/cloud-optimized-geospatial/ AuthorName: NRCan Publications: - - Title: Descriptor: Medium Resolution Digital Elevation Model From Natural Resources Canada’s CanElevation Series (MRDEM-30) + - Title: "Descriptor: Medium Resolution Digital Elevation Model From Natural Resources Canada’s CanElevation Series (MRDEM-30)" URL: https://doi.org/10.1109/IEEEDATA.2025.3576318 AuthorName: H. McGrath et al. 
@@ -91,3 +91,4 @@ DataAtWork: + From 61b07d4ef7e568c5c04c7325dbb2789cbf4f18e3 Mon Sep 17 00:00:00 2001 From: berylrab Date: Mon, 18 Aug 2025 16:27:59 -0400 Subject: [PATCH 241/751] Update busco-data.yaml --- datasets/busco-data.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/busco-data.yaml b/datasets/busco-data.yaml index 90b76f41c..2d4dc7be7 100644 --- a/datasets/busco-data.yaml +++ b/datasets/busco-data.yaml @@ -14,6 +14,7 @@ Tags: - open source software - protein - virus + - aws-pds License: The BUSCO datasets are licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. Any use of these datasets for analyses in a publication or product must include the citation of the corresponding paper: https://doi.org/10.1093/molbev/msab199 Citation: From 4a7e6ed62841f0e117c741516e0559b895e716ee Mon Sep 17 00:00:00 2001 From: berylrab Date: Mon, 18 Aug 2025 16:29:23 -0400 Subject: [PATCH 242/751] Update dendritic-consortium.yaml --- datasets/dendritic-consortium.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/dendritic-consortium.yaml b/datasets/dendritic-consortium.yaml index 366e9391a..2553462e0 100644 --- a/datasets/dendritic-consortium.yaml +++ b/datasets/dendritic-consortium.yaml @@ -18,6 +18,7 @@ Tags: - neurophysiology - simulation neuroscience - single neuron models + - aws-pds License: There are no restrictions on the use of this data. Resources: - Description: "Multimodal dataset from Baz1a pyramidal neurons in mouse V1, including TIFF, ABF, MAT, CSV, PNG, PY, HOC, and SWC files." 
@@ -41,3 +42,4 @@ DataAtWork: AuthorURL: https://dendritic-consortium.vercel.app ADXCategories: - Healthcare & Life Sciences Data + From 71fe2fe3ad13be2cf54a01068db59e55a334d3f2 Mon Sep 17 00:00:00 2001 From: berylrab Date: Mon, 18 Aug 2025 16:35:19 -0400 Subject: [PATCH 243/751] Update busco-data.yaml Modified syntax and added ADX category --- datasets/busco-data.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/busco-data.yaml b/datasets/busco-data.yaml index 2d4dc7be7..cdd491c1b 100644 --- a/datasets/busco-data.yaml +++ b/datasets/busco-data.yaml @@ -43,12 +43,12 @@ DataAtWork: - Title: OrthoDB and BUSCO update - annotation of orthologs with wider sampling of genomes URL: https://academic.oup.com/nar/article/53/D1/D516/7899526?login=true AuthorName: Fredrik Tegenfeldt, Dmitry Kuznetsov, Mosè Manni, Matthew Berkeley, Evgeny M Zdobnov, Evgenia V Kriventseva - - Title: BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes + - Title: BUSCO Update - Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes URL: https://academic.oup.com/mbe/article/38/10/4647/6329644?login=true AuthorName: Mosè Manni, Matthew R Berkeley, Mathieu Seppey, Felipe A Simão, Evgeny M Zdobnov - - Title: BUSCO: assessing genomic data quality and beyond + - Title: BUSCO - assessing genomic data quality and beyond URL: https://currentprotocols.onlinelibrary.wiley.com/doi/full/10.1002/cpz1.323 AuthorName: Mosè Manni, Matthew R. Berkeley, Mathieu Seppey, Evgeny M. 
Zdobnov DeprecatedNotice: ADXCategories: - - + - Healthcare & Life Sciences Data From ea3b43c78588eb5da7679a309f9c6715e5de4dc8 Mon Sep 17 00:00:00 2001 From: berylrab Date: Mon, 18 Aug 2025 16:44:04 -0400 Subject: [PATCH 244/751] Update busco-data.yaml --- datasets/busco-data.yaml | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/datasets/busco-data.yaml b/datasets/busco-data.yaml index cdd491c1b..dd3fa8f68 100644 --- a/datasets/busco-data.yaml +++ b/datasets/busco-data.yaml @@ -32,13 +32,8 @@ DataAtWork: - Title: BUSCO - from QC to gene prediction and phylogenomics URL: https://www.youtube.com/watch?v=9SjVY3BT8JU AuthorName: Matthew Berkeley - AuthorURL: + AuthorURL: https://github.com/berkelem Services: - Tools & Applications: - - Title: - URL: - AuthorName: - AuthorURL: Publications: - Title: OrthoDB and BUSCO update - annotation of orthologs with wider sampling of genomes URL: https://academic.oup.com/nar/article/53/D1/D516/7899526?login=true From 2c86f3e85baaa6337f30060133b80bb9fbaa693c Mon Sep 17 00:00:00 2001 From: Matthew Berkeley <42berkeley@cua.edu> Date: Mon, 18 Aug 2025 22:58:34 +0200 Subject: [PATCH 245/751] Fix formatting --- datasets/busco-data.yaml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/datasets/busco-data.yaml b/datasets/busco-data.yaml index dd3fa8f68..6129ce937 100644 --- a/datasets/busco-data.yaml +++ b/datasets/busco-data.yaml @@ -15,8 +15,7 @@ Tags: - protein - virus - aws-pds -License: The BUSCO datasets are licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 
-Any use of these datasets for analyses in a publication or product must include the citation of the corresponding paper: https://doi.org/10.1093/molbev/msab199 +License: The BUSCO datasets are licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. Any use of these datasets for analyses in a publication or product must include the citation of the corresponding paper - https://doi.org/10.1093/molbev/msab199 Citation: Resources: - Description: BUSCO datasets and companion files for use with BUSCO pipeline From 99700eed5f099a258ce83b3397e51f5ba08b6c8d Mon Sep 17 00:00:00 2001 From: berylrab Date: Mon, 18 Aug 2025 19:30:00 -0400 Subject: [PATCH 246/751] Update busco-data.yaml --- datasets/busco-data.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/busco-data.yaml b/datasets/busco-data.yaml index 6129ce937..23e3e8840 100644 --- a/datasets/busco-data.yaml +++ b/datasets/busco-data.yaml @@ -34,13 +34,13 @@ DataAtWork: AuthorURL: https://github.com/berkelem Services: Publications: - - Title: OrthoDB and BUSCO update - annotation of orthologs with wider sampling of genomes + - Title: OrthoDB and BUSCO update - annotation of orthologs with wider sampling of genomes. URL: https://academic.oup.com/nar/article/53/D1/D516/7899526?login=true AuthorName: Fredrik Tegenfeldt, Dmitry Kuznetsov, Mosè Manni, Matthew Berkeley, Evgeny M Zdobnov, Evgenia V Kriventseva - - Title: BUSCO Update - Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes + - Title: BUSCO Update - Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. 
URL: https://academic.oup.com/mbe/article/38/10/4647/6329644?login=true AuthorName: Mosè Manni, Matthew R Berkeley, Mathieu Seppey, Felipe A Simão, Evgeny M Zdobnov - - Title: BUSCO - assessing genomic data quality and beyond + - Title: BUSCO - assessing genomic data quality and beyond. URL: https://currentprotocols.onlinelibrary.wiley.com/doi/full/10.1002/cpz1.323 AuthorName: Mosè Manni, Matthew R. Berkeley, Mathieu Seppey, Evgeny M. Zdobnov DeprecatedNotice: From b9ed96843e8d015ff4174017cce81dc316ff6925 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 19 Aug 2025 14:45:14 -0800 Subject: [PATCH 247/751] Update marine-energy-data.yaml From 8ab6824dab4b03c0c3ad1de04b3724dcdaf6a669 Mon Sep 17 00:00:00 2001 From: lizadams Date: Thu, 21 Aug 2025 11:03:28 -0400 Subject: [PATCH 248/751] add bucket --- datasets/cmas-data-warehouse.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/cmas-data-warehouse.yaml b/datasets/cmas-data-warehouse.yaml index 3367861ec..da82ac981 100644 --- a/datasets/cmas-data-warehouse.yaml +++ b/datasets/cmas-data-warehouse.yaml @@ -60,6 +60,12 @@ Resources: Type: S3 Bucket Explore: - '[Browse Bucket](https://cmas-equates.s3.amazonaws.com/index.html)' + - Description: CMAQ 2023 12US4 CRACMM3 Modeling Platform + ARN: arn:aws:s3::::::cmaq-12us4-cracmm3-modeling-platform-2023 + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://cmaq-12us4-cracmm3-modeling-platform-2023.s3.amazonaws.com/index.html)' - Description: EPA 2022 Modeling Platform ARN: arn:aws:s3:::epa-2022-modeling-platform Region: us-east-1 From c13714ffa177d001b88af6b24dd065a8e04f5409 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Thu, 21 Aug 2025 09:38:40 -0700 Subject: [PATCH 249/751] Update nrel-pds-sup3rcc.yaml Add new resources to sup3rcc dataset --- datasets/nrel-pds-sup3rcc.yaml | 80 ++++++++++++++++++++++++++++++++-- 1 file changed, 76 
insertions(+), 4 deletions(-) diff --git a/datasets/nrel-pds-sup3rcc.yaml b/datasets/nrel-pds-sup3rcc.yaml index ddaee520b..774b17b72 100644 --- a/datasets/nrel-pds-sup3rcc.yaml +++ b/datasets/nrel-pds-sup3rcc.yaml @@ -38,18 +38,90 @@ Resources: Type: S3 Bucket Explore: - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc)' - - Description: 'Sup3rCC - CONUS - MRI ESM 2.0 - SSP585 - r1i1p1f1' + - Description: 'Sup3rCC Generative Models' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/models/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=models%2F)' + - Description: 'Sup3rCC - CONUS - EC-Earth3 - SSP585 - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_ecearth3_ssp585_r1i1p1f1%2F/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_ecearth3_ssp585_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - EC-Earth3-CC - Historical - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_ecearth3cc_historical_r1i1p1f1/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_ecearth3cc_historical_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - EC-Earth3-CC - SSP245 - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_ecearth3cc_ssp245_r1i1p1f1/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_ecearth3cc_ssp245_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - EC-Earth3-Veg - Historical - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_ecearth3veg_historical_r1i1p1f1/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_ecearth3veg_historical_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - EC-Earth3-Veg - SSP245 
- r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_ecearth3veg_ssp245_r1i1p1f1/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_ecearth3veg_ssp245_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - GFDL-CM4 - Historical - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_gfdlcm4_historical_r1i1p1f1/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_gfdlcm4_historical_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - GFDL-CM4 - SSP245 - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_gfdlcm4_ssp245_r1i1p1f1/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_gfdlcm4_ssp245_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - MPI-ESM1.2-HR - Historical - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_mpiesm12hr_historical_r1i1p1f1/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_mpiesm12hr_historical_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - MPI-ESM1.2-HR - SSP245 - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_mpiesm12hr_ssp245_r1i1p1f1/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_mpiesm12hr_ssp245_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - MRI-ESM2.0 - Historical - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_mriesm20_historical_r1i1p1f1/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_mriesm20_historical_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - MRI-ESM2.0 - SSP585 - r1i1p1f1' ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_mriesm20_ssp585_r1i1p1f1/ Region: us-west-2 
Type: S3 Bucket Explore: - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_mriesm20_ssp585_r1i1p1f1%2F)' - - Description: 'Sup3rCC Generative Models' - ARN: arn:aws:s3:::nrel-pds-sup3rcc/models/ + - Description: 'Sup3rCC - CONUS - TaiESM1 - Historical - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_taiesm1_historical_r1i1p1f1/ Region: us-west-2 Type: S3 Bucket Explore: - - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=models%2F)' + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_taiesm1_historical_r1i1p1f1%2F)' + - Description: 'Sup3rCC - CONUS - TaiESM1 - SSP245 - r1i1p1f1' + ARN: arn:aws:s3:::nrel-pds-sup3rcc/conus_taiesm1_ssp245_r1i1p1f1/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-sup3rcc&prefix=conus_taiesm1_ssp245_r1i1p1f1%2F)' DataAtWork: Tutorials: - Title: Using the Sup3rCC Data From 57be3781303b9287b4dc3dc7b12fe6e1b592ed48 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 21 Aug 2025 08:42:13 -0800 Subject: [PATCH 250/751] Update cmas-data-warehouse.yaml From b883189d692bdbae2d5b5746847bf9078c6f9e8b Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 21 Aug 2025 11:34:02 -0800 Subject: [PATCH 251/751] Update nrel-pds-sup3rcc.yaml From 357e912b69581735d6226b2e21b7cf589fb8184a Mon Sep 17 00:00:00 2001 From: ianhorn <=> Date: Thu, 21 Aug 2025 17:36:45 -0400 Subject: [PATCH 252/751] add stac api to resources --- datasets/kyfromabove.yaml | 24 +++++++++++++++++++++++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/datasets/kyfromabove.yaml b/datasets/kyfromabove.yaml index 51644285a..d7547a98a 100644 --- a/datasets/kyfromabove.yaml +++ b/datasets/kyfromabove.yaml @@ -11,6 +11,7 @@ Tags: - geospatial - lidar - elevation + - emergency response License: | Public 
Domain with Attribution Resources: @@ -20,6 +21,9 @@ Resources: Type: S3 Bucket RequesterPays: False Explore: + - '[KyFromAbove Stac-Browser](https://kygeonet.ky.gov/stac)' + - '[STAC V1.0.0 endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' + - 'KyFromAbove Explorer' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html)' - Description: KyFromAbove Topographic Contours, digital elevation models, point cloud, spot elevations and the KyTopo Map Series quadrangles can be found in this bucket. ARN: arn:aws:s3:::kyfromabove/elevation/ @@ -41,6 +45,8 @@ Resources: Type: S3 Bucket RequesterPays: False Explore: + - '[KyFromAbove Stac-Browser](https://kygeonet.ky.gov/stac)' + - '[STAC V1.0.0 endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#elevation/DEM/)' - Description: There are three data resources in this folder - 1) KyTopo Map Series quadrangles in a Cloud Optimized GeoTIFF (COG) format, 2) KyTopo Map Series quadrangles with all collar information in a non-georeferenced PNG format for printing on a standard ARCH-D sized sheet, and 3) the KyTopo Map Series quadrangles tile grid in a geopackage format. The COGs were created using GDAL with JPEG compression at a 90% quality setting and the default 512x512 tile setting. ARN: arn:aws:s3:::kyfromabove/elevation/KyTopoMapSeries/ @@ -55,6 +61,8 @@ Resources: Type: S3 Bucket RequesterPays: False Explore: + - '[KyFromAbove Stac-Browser](https://kygeonet.ky.gov/stac)' + - '[STAC V1.0.0 endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#elevation/PointCloud/)' - Description: The data in this bucket includes spot elevations for the entire Commonwealth of Kentucky generated from the KyFromAbove Phase 1 LiDAR-derived digital elevation model (DEM) in a geopackage format. 
ArcGIS was used to create this dataset. Spot elevations for Phase 2 and Phase 3 will be generated upon completion of each Phase. ARN: arn:aws:s3:::kyfromabove/elevation/SpotElevations/ @@ -69,13 +77,18 @@ Resources: Type: S3 Bucket RequesterPays: False Explore: + - '[KyFromAbove Stac-Browser](https://kygeonet.ky.gov/stac)' + - 'KyFromAbove Explorer [oblique-viewer](https://explore.kyfromabove.ky.gov/)' + - '[STAC V1.0.0 endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#imagery/)' - Description: KyFromAbove ortho imagery for the Commonwealth of Kentucky organized in a 5000x5000 foot grid. Each image tile has been converted to a Cloud Optimized GeoTiff format. Phase 1 and 2 data is organized by acquisition year and is currently available for use. Phase 3 is organized by year and season, as imagery is being acquired during the fall and spring leaf-off seasons as sun angle permits. Phase 3 ortho imagery will be available in early 2025. ARN: arn:aws:s3:::kyfromabove/imagery/orthos/ Region: us-west-2 Type: S3 Bucket RequesterPays: False - Explore: + Explore: + - '[KyFromAbove Stac-Browser](https://kygeonet.ky.gov/stac)' + - '[STAC V1.0.0 endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#imagery/orthos/)' - Description: KyFromAbove oblique imagery can be found in this folder. The four oblique views associated with each ortho image are provided in a 3-band (RGB) Cloud Optimized GeoTiff format using the default 512x512 tile setting. There are no oblique images available for Phase 1 and 2. Phase 3 data is available for the entire state. It is organized by year and season (where Season1 = Spring and Season2 = Fall) as imagery is being acquired during the fall and spring leaf-off seasons as sun angle and weather conditions permit. 
ARN: arn:aws:s3:::kyfromabove/imagery/obliques/ @@ -83,6 +96,7 @@ Resources: Type: S3 Bucket RequesterPays: False Explore: + - '[KyFromAbove Explorer](https://explore.kyfromabove.ky.gov/)' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#imagery/obliques/)' DataAtWork: Tutorials: @@ -98,6 +112,14 @@ DataAtWork: AuthorName: Ian Horn Services: - Amazon S3 + Tools & Applications: + - Title: Kentucky From Above SpatioTemporal Asset catalog + URL: https://kygeonet.ky.gov/stac + AuthorName: Ian Horn, Ky Div. of Geographic Information + - Title: KyFromAbove Explorer + URL: https://explore.kyfromabove.ky.gov/ + AuthorName: NV5 + AuthorURL: https://www.nv5.com/geospatial/ Publications: - Title: A New View of Kentucky's Cities URL: https://www.mydigitalpublication.com/publication/?m=16892&i=816848&view=articleBrowser&article_id=4737566&ver=html5 From d22be3c45133f7d749cd4c61aa2f36db43e0788c Mon Sep 17 00:00:00 2001 From: ianhorn <=> Date: Thu, 21 Aug 2025 17:43:05 -0400 Subject: [PATCH 253/751] typo --- datasets/kyfromabove.yaml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/datasets/kyfromabove.yaml b/datasets/kyfromabove.yaml index d7547a98a..cceef4cc0 100644 --- a/datasets/kyfromabove.yaml +++ b/datasets/kyfromabove.yaml @@ -23,7 +23,6 @@ Resources: Explore: - '[KyFromAbove Stac-Browser](https://kygeonet.ky.gov/stac)' - '[STAC V1.0.0 endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' - - 'KyFromAbove Explorer' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html)' - Description: KyFromAbove Topographic Contours, digital elevation models, point cloud, spot elevations and the KyTopo Map Series quadrangles can be found in this bucket. 
ARN: arn:aws:s3:::kyfromabove/elevation/ @@ -113,7 +112,7 @@ DataAtWork: Services: - Amazon S3 Tools & Applications: - - Title: Kentucky From Above SpatioTemporal Asset catalog + - Title: Kentucky From Above SpatioTemporal Asset Catalog URL: https://kygeonet.ky.gov/stac AuthorName: Ian Horn, Ky Div. of Geographic Information - Title: KyFromAbove Explorer From 3fb13362a800ad36c72f59f0728fc2e9c984b2ed Mon Sep 17 00:00:00 2001 From: ianhorn <=> Date: Thu, 21 Aug 2025 18:11:07 -0400 Subject: [PATCH 254/751] add AuthorURL to tool --- datasets/kyfromabove.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/kyfromabove.yaml b/datasets/kyfromabove.yaml index cceef4cc0..c63b56374 100644 --- a/datasets/kyfromabove.yaml +++ b/datasets/kyfromabove.yaml @@ -109,6 +109,7 @@ DataAtWork: URL: https://github.com/ianhorn/kyfromabove-on-aws-examples/blob/main/examples/clip_tiles_to_boundary.ipynb NotebookURL: https://github.com/ianhorn/kyfromabove-on-aws-examples/blob/main/examples/clip_tiles_to_boundary.ipynb AuthorName: Ian Horn + AuthorURL: https://www.linkedin.com/in/ian-horn-503b1022/ Services: - Amazon S3 Tools & Applications: From 78352257ebe684e0b5b9d2992ff1776db806a529 Mon Sep 17 00:00:00 2001 From: EZ4Fanta <62417573+xfd997700@users.noreply.github.com> Date: Fri, 22 Aug 2025 11:20:47 +0800 Subject: [PATCH 255/751] Add new resource --- datasets/biolip.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/biolip.yaml b/datasets/biolip.yaml index ba1fb346c..3e6c2e9b5 100644 --- a/datasets/biolip.yaml +++ b/datasets/biolip.yaml @@ -20,6 +20,10 @@ Resources: ARN: arn:aws:s3:::biolip Region: ap-southeast-1 Type: S3 Bucket + - Description: BioLiP interaction structures + ARN: arn:aws:s3:::biolip/weekly + Region: ap-southeast-1 + Type: S3 Bucket DataAtWork: Tutorials: - Title: BioLiP API usage From 0e8baa62a52e75efea804c4675a05ebf84fcfb7f Mon Sep 17 00:00:00 2001 From: bo1929 Date: Fri, 22 Aug 2025 09:31:34 -0700 Subject: [PATCH 256/751] Added dataset 
description for krepp. --- datasets/krepp-idx.yaml | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 datasets/krepp-idx.yaml diff --git a/datasets/krepp-idx.yaml b/datasets/krepp-idx.yaml new file mode 100644 index 000000000..021344ce1 --- /dev/null +++ b/datasets/krepp-idx.yaml @@ -0,0 +1,29 @@ +Name: Reference Indexes for krepp +Description: krepp is an alignment-free method for estimating distances and phylogenetic placement of individual reads to many thousands of reference genomes in a scalable manner using k-mers. This dataset includes k-mer-based indexes consisting of ultra-large reference genome sets that can be efficiently analyzed using krepp. +Documentation: https://github.com/bo1929/krepp/wiki/Available-reference-indexes +Contact: https://github.com/bo1929/krepp/issues +ManagedBy: Mirarab Lab at UC San Diego +UpdateFrequency: Quarterly or as new data becomes available +Tags: + - bioinformatics + - metagenomics + - microbiome + - reference index + - phylogenetics + - life sciences +License: GPL-3.0 license. Use of the data should be cited in the usual way, following https://github.com/bo1929/krepp/tree/master?tab=readme-ov-file#citation. +Resources: + - Description: This dataset contains genomic indexes for various reference datasets in binary format. Using krepp, you can perform distance estimation and phylogenetic placement with respect to these indexes. + ARN: arn:aws:s3:::krepp-idx + Region: us-west-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Tutorial for using krepp indexes for metagenomic sequence analysis. + URL: https://github.com/bo1929/krepp/wiki/Tutorial + AuthorName: Ali Osman Berk Sapci + AuthorURL: https://bo1929.github.io/ + Publications: + - Title: A k-mer-based maximum likelihood method for estimating distances of reads to genomes enables genome-wide phylogenetic placement. + URL: https://www.biorxiv.org/content/10.1101/2025.01.20.633730v2 + AuthorName: Sapci et al. 
(2024) From 316d693e096468177a3ee31c6c018122500da423 Mon Sep 17 00:00:00 2001 From: ianhorn <=> Date: Fri, 22 Aug 2025 12:57:40 -0400 Subject: [PATCH 257/751] bring update date with phase3 deliverables --- datasets/kyfromabove.yaml | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/datasets/kyfromabove.yaml b/datasets/kyfromabove.yaml index c63b56374..b6da6cef5 100644 --- a/datasets/kyfromabove.yaml +++ b/datasets/kyfromabove.yaml @@ -3,7 +3,7 @@ Description: The KyFromAbove initiative is focused on building and maintaining a Documentation: https://github.com/awslabs/open-data-docs/tree/main/docs/kyfromabove Contact: More information regarding the KyFromAbove program can be found at https://kyfromabove.ky.gov. If you have specific questions please contact - kyfromabove@ky.gov. ManagedBy: "[Kentucky Division of Geographic Information](https://kygeonet.ky.gov)" -UpdateFrequency: KyFromAbove data is typically updated on an annual basis. Each year, a portion of the state is acquired with an overall update cycle of every three to four years. This update cadance is determined by both funding and the length of leaf-off conditions in a given year. This catalog currently includes imagery and LiDAR data from 2010 through 2024 for most products. +UpdateFrequency: KyFromAbove data are typically updated on an annual basis. Each year, a portion of the state is acquired with an overall update cycle of every three to four years. This update cadance is determined by both funding and the length of leaf-off conditions in a given year. This catalog currently includes imagery and LiDAR data from 2010 through 2024 for most products. Tags: - aws-pds - earth observation @@ -15,7 +15,7 @@ Tags: License: | Public Domain with Attribution Resources: - - Description: Elevation and imagery data resources for the Commonwealth of Kentucky are organized in this bucket. 
Elevation data is available in Cloud Optimized GeoTIFF (COG) and Geopackage formats depending on the data type. Imagery data is also available in Cloud Optimized GeoTIFF (COG)format. A Cloud Optimized GeoTIFF (COG) is a GeoTIFF file optimized for hosting on a HTTP file server. COG has an internal organization that enables more efficient workflows on the cloud by supporting HTTP GET range requests, where just parts of a file are requested and returned. + - Description: Elevation and imagery data resources for the Commonwealth of Kentucky are organized in this bucket. Elevation data are available in Cloud Optimized GeoTIFF (COG) and Geopackage formats depending on the data type. Imagery data is also available in COG format. A Cloud Optimized GeoTIFF (COG) is a GeoTIFF file optimized for hosting on a HTTP file server. COG has an internal organization that enables more efficient workflows on the cloud by supporting HTTP GET range requests, where just parts of a file are requested and returned. ARN: arn:aws:s3:::kyfromabove Region: us-west-2 Type: S3 Bucket @@ -31,36 +31,40 @@ Resources: RequesterPays: False Explore: - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#elevation/)' - - Description: Topographic contours created from the KyFromAbove Phase 1 LiDAR-derived digital elevation model (DEM) in a geopackage and Esri file geodatabase format. There are four data resources in this folder - 1) KyTopo contours at a 10-foot interval primarily for Western and Central Kentucky, 2) KyTopo contours at a 20-foot interval primarily for Central and Eastern Kentucky, 3) KyTopo contours at a 40-foot interval for Eastern Kentucky, and 4) KyTopo contours at a 5-foot interval for the entire Commonwealth. + - Description: Topographic contours created from the KyFromAbove Phase 1 LiDAR-derived digital elevation model (DEM) in geopackage and Esri file geodatabase format. 
There are four data resources in this folder - 1) KyTopo contours at a 10-foot interval primarily for Western and Central Kentucky, 2) KyTopo contours at a 20-foot interval primarily for Central and Eastern Kentucky, 3) KyTopo contours at a 40-foot interval for Eastern Kentucky, and 4) KyTopo contours at a 5-foot interval for the entire Commonwealth. ARN: arn:aws:s3:::kyfromabove/elevation/Contours/ Region: us-west-2 Type: S3 Bucket RequesterPays: False Explore: - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#elevation/Contours/)' - - Description: LiDAR-derived digital elevation models (DEM) for the Commonwealth of Kentucky organized in a 5000x5000 foot grid. There are currently three data resources in this folder - 1) Phase 1 LiDAR-derived DEMs at a 5 foot resolution, 2) Phase 2 LiDAR-derived DEMs at a 2 foot resolution, and 3) Phase 3 LiDAR-derived DEMs at a 2 foot resolution. All data has been converted to a Cloud Optimized GeoTIFF (COG) format. Phase 2 is now complete however Phase 3 efforts are still underway. + - Description: LiDAR-derived digital elevation models (DEM) for the Commonwealth of Kentucky organized in a 5000x5000 foot grid. There are currently three data resources in this folder - 1) Phase 1 LiDAR-derived DEMs at a 5 foot resolution, 2) Phase 2 LiDAR-derived DEMs at a 2 foot resolution, and 3) Phase 3 LiDAR-derived DEMs at a 2 foot resolution. All data has been converted to a Cloud Optimized GeoTIFF (COG) format. Phase 3 efforts are still underway. 
ARN: arn:aws:s3:::kyfromabove/elevation/DEM/ Region: us-west-2 Type: S3 Bucket RequesterPays: False Explore: - - '[KyFromAbove Stac-Browser](https://kygeonet.ky.gov/stac)' + - '[KyFromAbove Stac-Browser - Phase 1](https://kygeonet.ky.gov/collections/dem-phase1)' + - '[KyFromAbove Stac-Browser - Phase 2](https://kygeonet.ky.gov/collections/dem-phase2)' + - '[KyFromAbove Stac-Browser - Phase 3](https://kygeonet.ky.gov/collections/dem-phase3)' - '[STAC V1.0.0 endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#elevation/DEM/)' - - Description: There are three data resources in this folder - 1) KyTopo Map Series quadrangles in a Cloud Optimized GeoTIFF (COG) format, 2) KyTopo Map Series quadrangles with all collar information in a non-georeferenced PNG format for printing on a standard ARCH-D sized sheet, and 3) the KyTopo Map Series quadrangles tile grid in a geopackage format. The COGs were created using GDAL with JPEG compression at a 90% quality setting and the default 512x512 tile setting. + - Description: There are three data resources in this folder - 1) KyTopo Map Series quadrangles in a COG format, 2) KyTopo Map Series quadrangles with all collar information in a non-georeferenced PNG format for printing on a standard ARCH-D sized sheet, and 3) the KyTopo Map Series quadrangles tile grid in a geopackage format. The COGs were created using GDAL with JPEG compression at a 90% quality setting and the default 512x512 tile setting. ARN: arn:aws:s3:::kyfromabove/elevation/KyTopoMapSeries/ Region: us-west-2 Type: S3 Bucket RequesterPays: False Explore: - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#elevation/KyTopoMapSeries/)' - - Description: LiDAR-derived Point Cloud tiles for the Commonwealth of Kentucky organized in a 5000x5000 foot grid. 
There is currently one data resource in this folder - 1) Phase 1 LiDAR-derived point clouds in LAZ format. Phase 2 is now complete however Phase 3 efforts are still underway It is our aim to provide Phase 2 and Phase 3 data in a COPC (LAZ format). + - Description: LiDAR-derived Point Cloud tiles for the Commonwealth of Kentucky organized in a 5000x5000 foot grid. There are currently two complete resources in this folder - 1) Phase 1 LiDAR-derived point clouds in LAZ format and 2) Phase 2 LiDAR-derived Point Clouds in COPC format. Phase 3 is partially available in COPC format while efforts are still ongoing. ARN: arn:aws:s3:::kyfromabove/elevation/PointCloud/ Region: us-west-2 Type: S3 Bucket RequesterPays: False Explore: - - '[KyFromAbove Stac-Browser](https://kygeonet.ky.gov/stac)' + - '[KyFromAbove Stac-Browser - Phase 1](https://kygeonet.ky.gov/collections/laz-phase1)' + - '[KyFromAbove Stac-Browser - Phase 2](https://kygeonet.ky.gov/collections/laz-phase2)' + - '[KyFromAbove Stac-Browser - Phase 3](https://kygeonet.ky.gov/collections/laz-phase3)' - '[STAC V1.0.0 endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#elevation/PointCloud/)' - Description: The data in this bucket includes spot elevations for the entire Commonwealth of Kentucky generated from the KyFromAbove Phase 1 LiDAR-derived digital elevation model (DEM) in a geopackage format. ArcGIS was used to create this dataset. Spot elevations for Phase 2 and Phase 3 will be generated upon completion of each Phase. @@ -70,17 +74,19 @@ Resources: RequesterPays: False Explore: - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#elevation/SpotElevations/)' - - Description: KyFromAbove aerial imagery, both nadir and oblique views, can be found in this bucket. Phase 1 and 2 data is currently available. Phase 3 oblique imagery is available and ortho imagery will be available in early 2025.
+ - Description: KyFromAbove aerial imagery, both nadir and oblique views, can be found in this bucket. Ortho imagery is available for Phases 1, 2, and 3. Oblique imagery is available for Phase 3 only. ARN: arn:aws:s3:::kyfromabove/imagery/ Region: us-west-2 Type: S3 Bucket RequesterPays: False Explore: - - '[KyFromAbove Stac-Browser](https://kygeonet.ky.gov/stac)' + - '[KyFromAbove Stac-Browser - Phase 1 Orthos](https://kygeonet.ky.gov/collections/orthos-phase1)' + - '[KyFromAbove Stac-Browser - Phase 2 Orthos](https://kygeonet.ky.gov/collections/orthos-phase2)' + - '[KyFromAbove Stac-Browser - Phase 3 Orthos](https://kygeonet.ky.gov/collections/orthos-phase3)' - 'KyFromAbove Explorer [oblique-viewer](https://explore.kyfromabove.ky.gov/)' - '[STAC V1.0.0 endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#imagery/)' - - Description: KyFromAbove ortho imagery for the Commonwealth of Kentucky organized in a 5000x5000 foot grid. Each image tile has been converted to a Cloud Optimized GeoTiff format. Phase 1 and 2 data is organized by acquisition year and is currently available for use. Phase 3 is organized by year and season, as imagery is being acquired during the fall and spring leaf-off seasons as sun angle permits. Phase 3 ortho imagery will be available in early 2025. + - Description: KyFromAbove ortho imagery for the Commonwealth of Kentucky organized in a 5000x5000 foot grid. Each image tile has been converted to a COG format. Phase 1 and 2 data are organized by acquisition year and are currently available for use. Phase 3 is organized by year and season, as imagery was acquired during the fall and spring leaf-off seasons as sun angle permitted.
ARN: arn:aws:s3:::kyfromabove/imagery/orthos/ Region: us-west-2 Type: S3 Bucket @@ -89,7 +95,7 @@ Resources: - '[KyFromAbove Stac-Browser](https://kygeonet.ky.gov/stac)' - '[STAC V1.0.0 endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#imagery/orthos/)' - - Description: KyFromAbove oblique imagery can be found in this folder. The four oblique views associated with each ortho image are provided in a 3-band (RGB) Cloud Optimized GeoTiff format using the default 512x512 tile setting. There are no oblique images available for Phase 1 and 2. Phase 3 data is available for the entire state. It is organized by year and season (where Season1 = Spring and Season2 = Fall) as imagery is being acquired during the fall and spring leaf-off seasons as sun angle and weather conditions permit. + - Description: KyFromAbove oblique imagery can be found in this folder. The four oblique views associated with each ortho image are provided in a 3-band (RGB) COG format using the default 512x512 tile setting. There are no oblique images available for Phases 1 and 2. Phase 3 data is available for the entire state. It is organized by year and season (where Season1 = Spring and Season2 = Fall) as imagery was acquired during the fall and spring leaf-off seasons as sun angle and weather conditions permitted. 
ARN: arn:aws:s3:::kyfromabove/imagery/obliques/ Region: us-west-2 Type: S3 Bucket From e62a67386f998f02ffcd754ebc4dd79ab4a72869 Mon Sep 17 00:00:00 2001 From: ianhorn <=> Date: Fri, 22 Aug 2025 16:20:56 -0400 Subject: [PATCH 258/751] add back accidently deleted section --- datasets/kyfromabove.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/kyfromabove.yaml b/datasets/kyfromabove.yaml index b6da6cef5..a6239b98d 100644 --- a/datasets/kyfromabove.yaml +++ b/datasets/kyfromabove.yaml @@ -4,6 +4,10 @@ Documentation: https://github.com/awslabs/open-data-docs/tree/main/docs/kyfromab Contact: More information regarding the KyFromAbove program can be found at https://kyfromabove.ky.gov. If you have specific questions please contact - kyfromabove@ky.gov. ManagedBy: "[Kentucky Division of Geographic Information](https://kygeonet.ky.gov)" UpdateFrequency: KyFromAbove data are typically updated on an annual basis. Each year, a portion of the state is acquired with an overall update cycle of every three to four years. This update cadance is determined by both funding and the length of leaf-off conditions in a given year. This catalog currently includes imagery and LiDAR data from 2010 through 2024 for most products. +Collabs: + ASDI: + Tags: + - elevation Tags: - aws-pds - earth observation From 39d28a3f1d38718508379630456aa82b69b070b2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ali=20=C5=9Eapc=C4=B1?= Date: Sun, 24 Aug 2025 14:43:57 -0700 Subject: [PATCH 259/751] Update krepp-idx.yaml --- datasets/krepp-idx.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/krepp-idx.yaml b/datasets/krepp-idx.yaml index 021344ce1..70f3106ff 100644 --- a/datasets/krepp-idx.yaml +++ b/datasets/krepp-idx.yaml @@ -8,8 +8,8 @@ Tags: - bioinformatics - metagenomics - microbiome - - reference index - - phylogenetics + - reference index + - phylogenetics - life sciences License: GPL-3.0 license. 
Use of the data should be cited in the usual way, following https://github.com/bo1929/krepp/tree/master?tab=readme-ov-file#citation. Resources: From 7efb96a4ae209ca4474a8a85b8f053ffde72ac43 Mon Sep 17 00:00:00 2001 From: Chris Stoner Date: Mon, 25 Aug 2025 10:59:35 -0800 Subject: [PATCH 260/751] added GFS notebook --- datasets/noaa-gfs-bdp-pds.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/datasets/noaa-gfs-bdp-pds.yaml b/datasets/noaa-gfs-bdp-pds.yaml index 002d759aa..142ddbe8a 100644 --- a/datasets/noaa-gfs-bdp-pds.yaml +++ b/datasets/noaa-gfs-bdp-pds.yaml @@ -74,6 +74,9 @@ Resources: Type: SNS Topic DataAtWork: Tutorials: + - Title: "NOAA Global Forecast System (GFS) quickstart notebook on AWS" + URL: https://github.com/aws-samples/aws-opendata-samples/blob/main/notebooks/noaa-gfs/noaa_gfs_quickstart.ipynb + AuthorName: Benoit de Chateauvieux Tools & Applications: Publications: - Title: GFS Warm Restart Files Additional Information From 789ab1be69da6ebcea1ebd180b830cc3e81543c7 Mon Sep 17 00:00:00 2001 From: ianhorn <=> Date: Mon, 25 Aug 2025 16:28:01 -0400 Subject: [PATCH 261/751] update tags to include existing tags only --- datasets/kyfromabove.yaml | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/datasets/kyfromabove.yaml b/datasets/kyfromabove.yaml index a6239b98d..3e2a54788 100644 --- a/datasets/kyfromabove.yaml +++ b/datasets/kyfromabove.yaml @@ -9,13 +9,17 @@ Collabs: Tags: - elevation Tags: - - aws-pds - - earth observation - aerial imagery + - cog + - dtm + - elevation + - geopackage - geospatial - lidar - - elevation - - emergency response + - mapping + - stac + - tiff + - tiles License: | Public Domain with Attribution Resources: From 84ecf269cf739055bfc1d0af8488e37ec62116b3 Mon Sep 17 00:00:00 2001 From: ianhorn <=> Date: Mon, 25 Aug 2025 16:31:18 -0400 Subject: [PATCH 262/751] update tags --- datasets/kyfromabove.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/kyfromabove.yaml 
b/datasets/kyfromabove.yaml index 3e2a54788..9920e7a7b 100644 --- a/datasets/kyfromabove.yaml +++ b/datasets/kyfromabove.yaml @@ -10,8 +10,10 @@ Collabs: - elevation Tags: - aerial imagery + - aws-pds - cog - dtm + - earth observation - elevation - geopackage - geospatial From c8ffc2a4e8a8f6d1882969d5110119fb57a922cf Mon Sep 17 00:00:00 2001 From: ianhorn <=> Date: Mon, 25 Aug 2025 16:36:13 -0400 Subject: [PATCH 263/751] add disaster response as tag --- datasets/kyfromabove.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/kyfromabove.yaml b/datasets/kyfromabove.yaml index 9920e7a7b..4fcca3276 100644 --- a/datasets/kyfromabove.yaml +++ b/datasets/kyfromabove.yaml @@ -13,6 +13,7 @@ Tags: - aws-pds - cog - dtm + - disaster response - earth observation - elevation - geopackage From 700c07c59a5799e6a84e3c9c650c27ed6f6a6c93 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 25 Aug 2025 13:52:03 -0800 Subject: [PATCH 264/751] Update kyfromabove.yaml From 4cbd0c1cebd2ea7dcdf6f3bd67e4b7c53d2c6789 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 25 Aug 2025 16:43:11 -0800 Subject: [PATCH 265/751] Update kyfromabove.yaml --- datasets/kyfromabove.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/kyfromabove.yaml b/datasets/kyfromabove.yaml index 4fcca3276..ab64b673b 100644 --- a/datasets/kyfromabove.yaml +++ b/datasets/kyfromabove.yaml @@ -94,7 +94,7 @@ Resources: - '[KyFromAbove Stac-Browser - Phase 1 Orthos](https://kygeonet.ky.gov/collections/orthos-phase1)' - '[KyFromAbove Stac-Browser - Phase 2 Orthos](https://kygeonet.ky.gov/collections/orthos-phase2)' - '[KyFromAbove Stac-Browser - Phase 3 Orthos](https://kygeonet.ky.gov/collections/orthos-phase3)' - - 'KyFromAbove Explorer [oblique-viewer](https://explore.kyfromabove.ky.gov/)' + - '[KyFromAbove Explorer oblique-viewer](https://explore.kyfromabove.ky.gov/)' - '[STAC V1.0.0 
endpoint](https://spved5ihrl.execute-api.us-west-2.amazonaws.com/)' - '[Browse Bucket](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html#imagery/)' - Description: KyFromAbove ortho imagery for the Commonwealth of Kentucky organized in a 5000x5000 foot grid. Each image tile has been converted to a COG format. Phase 1 and 2 data are organized by acquisition year and are currently available for use. Phase 3 is organized by year and season, as imagery was acquired during the fall and spring leaf-off seasons as sun angle permitted. From 2e10f4f17cfac2ed5ea59ee529f5fe9385d1175b Mon Sep 17 00:00:00 2001 From: Michael Chungyoun Date: Mon, 25 Aug 2025 21:52:03 -0400 Subject: [PATCH 266/751] Add entry for flab --- datasets/flab.yaml | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) create mode 100644 datasets/flab.yaml diff --git a/datasets/flab.yaml b/datasets/flab.yaml new file mode 100644 index 000000000..dfa3651db --- /dev/null +++ b/datasets/flab.yaml @@ -0,0 +1,35 @@ +Name: "FLAb: Fitness Landscapes for Antibodies" +Description: FLAb is the largest publicly available therapeutic antibody dataset designed to train and benchmark protein AI models. It provides open-access, high-quality developability data on diverse therapeutic properties, including expression, thermostability, immunogenicity, aggregation, polyreactivity, binding affinity, and pharmacokinetics.
+Documentation: https://github.com/Graylab/FLAb/blob/main/README.md +Contact: mchungy1@jhu.edu +ManagedBy: "[Jeffrey Gray Lab, Johns Hopkins University](https://graylab.jhu.edu/)" +UpdateFrequency: Any new public release of antibody developability data is deposited into FLAb +Tags: + - Protein language models + - Protein design + - Antibody engineering + - Therapeutic antibodies + - Developability + - Machine learning + - Clinical stage therapeutics + - Biophysics +License: https://creativecommons.org/licenses/by/4.0/ +Citation: "FLAb was accessed on [DATE] at registry.opendata.aws/flab" +Resources: + - Description: Antibody developability data in CSV format + ARN: arn:aws:s3:::graylab-flab + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: "FLAb tutorial: Benchmarking a protein language model for antibody expression prediction" + NotebookURL: https://github.com/Graylab/FLAb/blob/main/examples/FLAb_ZeroShotExample_IgLM_Expression.ipynb + AuthorName: Michael Chungyoun + AuthorURL: https://www.linkedin.com/in/mfc12/ + Publications: + - Title: "FLAb: Benchmarking deep learning methods for antibody fitness prediction" + URL: https://doi.org/10.1101/2024.01.13.575504 + AuthorName: Michael Chungyoun and Jeffrey J.
Gray + AuthorURL: https://www.linkedin.com/in/mfc12/ +ADXCategories: + - Healthcare & Life Sciences Data \ No newline at end of file From a5232fadc93ce816e88dee0e29c4c00e88ae7015 Mon Sep 17 00:00:00 2001 From: Ev Date: Tue, 26 Aug 2025 12:33:44 -0400 Subject: [PATCH 267/751] Update aws-public-blockchain.yaml Typo fixes --- datasets/aws-public-blockchain.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/aws-public-blockchain.yaml b/datasets/aws-public-blockchain.yaml index f4e86221b..62de65c01 100644 --- a/datasets/aws-public-blockchain.yaml +++ b/datasets/aws-public-blockchain.yaml @@ -39,10 +39,10 @@ DataAtWork: Publications: - Title: "Exploring Arbitrum Data: Analyze L2 Activity with AWS Public Blockchain Datasets" URL: https://repost.aws/articles/ARpnBONglsT2e6D-hZZmxVvA/exploring-arbitrum-data-analyze-l2-activity-with-aws-public-blockchain-datasets - AuthorName: Simon Goldberd, Everton Fraga + AuthorName: Simon Goldberg, Everton Fraga - Title: "Unlocking XRP Ledger Data: Comprehensive Analysis with AWS Public Blockchain Datasets" URL: https://repost.aws/articles/ARg_zMIXlhTG2hSDFZDfF6hQ/unlocking-xrp-ledger-data-comprehensive-analysis-with-aws-public-blockchain-datasets - AuthorName: Simon Goldberd, Everton Fraga + AuthorName: Simon Goldberg, Everton Fraga - Title: New datasets added to the AWS Public Blockchain Datasets — available for analytics and research URL: https://repost.aws/articles/AR3gztQGeSS8CfaKNNeyYwsQ AuthorName: Everton Fraga, Simon Goldberg From 2ad4bc37e84c9aeb38d423970c6168acc137bd77 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 26 Aug 2025 08:36:11 -0800 Subject: [PATCH 268/751] Update aws-public-blockchain.yaml From aca6cc83eb270a503c7b643ce12c153a9a4217ed Mon Sep 17 00:00:00 2001 From: berylrab Date: Tue, 26 Aug 2025 13:39:04 -0400 Subject: [PATCH 269/751] Update biolip.yaml --- datasets/biolip.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff 
--git a/datasets/biolip.yaml b/datasets/biolip.yaml index 3e6c2e9b5..09812d841 100644 --- a/datasets/biolip.yaml +++ b/datasets/biolip.yaml @@ -12,7 +12,6 @@ Tags: - molecule - life sciences - chemistry - License: No explicit license stated (publicly available for academic and research use). Citation: "Chengxin Zhang, Xi Zhang, Peter L Freddolino, and Yang Zhang. BioLiP2: an updated structure database for biologically relevent ligand-protein interactions, Nucleic Acids Research, gkad630 (2023)." Resources: @@ -36,3 +35,4 @@ DataAtWork: - Title: "BioLiP: a semi-manually curated database for biologically relevant ligand-protein interactions" URL: https://academic.oup.com/nar/article/41/D1/D1096/1074898 AuthorName: Jianyi Yang, Ambrish Roy, and Yang Zhang + From d683cdb98bf1f39ff4db4e7a95f49029756f3b52 Mon Sep 17 00:00:00 2001 From: Antoine McGrath Date: Tue, 26 Aug 2025 18:23:21 -0500 Subject: [PATCH 270/751] Update eot-web-archive.yaml to 2024 EOT 2024 addition --- datasets/eot-web-archive.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/eot-web-archive.yaml b/datasets/eot-web-archive.yaml index 278bc6d0c..8316946b7 100644 --- a/datasets/eot-web-archive.yaml +++ b/datasets/eot-web-archive.yaml @@ -2,8 +2,8 @@ Name: End of Term Web Archive Dataset Description: > The End of Term Web Archive (EOT) captures and saves U.S. Government websites at the end of presidential administrations. The EOT has - thus far preserved websites from administration changes in 2008, 2012, 2016, - and 2020. Data from these web crawls have been made openly available in + thus far preserved websites from administration changes in 2008, 2012, 2016, 2020 + and 2024. Data from these web crawls have been made openly available in several formats in this dataset. 
Documentation: https://eotarchive.org/data/ Contact: Mark Phillips , Sawood Alam From 52f2f2e8ff4786948434587b664cdec5248e2640 Mon Sep 17 00:00:00 2001 From: berylrab Date: Wed, 27 Aug 2025 09:02:08 -0400 Subject: [PATCH 271/751] Update huj-herbarium.yaml --- datasets/huj-herbarium.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/huj-herbarium.yaml b/datasets/huj-herbarium.yaml index 9958056c7..a8727004f 100644 --- a/datasets/huj-herbarium.yaml +++ b/datasets/huj-herbarium.yaml @@ -23,7 +23,7 @@ License: CC-BY-SA 4.0 Citation: Vascular plants - Herbarium of The National Natural History Collections was accessed on DATE from https://registry.opendata.aws/huj-herbarium. Resources: - Description: HUJ Herbarium Collection Images - ARN: + ARN: arn:aws:s3:::hujinnhc/specify_assets/ Region: il-central-1 Type: S3 bucket Explore: @@ -48,3 +48,4 @@ DataAtWork: DeprecatedNotice: ADXCategories: - + From 4c090a47f328e25d52951e9e6b47bcaf65bb5956 Mon Sep 17 00:00:00 2001 From: lizadams Date: Wed, 27 Aug 2025 12:21:54 -0400 Subject: [PATCH 272/751] update --- datasets/cmas-data-warehouse.yaml | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/datasets/cmas-data-warehouse.yaml b/datasets/cmas-data-warehouse.yaml index da82ac981..80deb06e1 100644 --- a/datasets/cmas-data-warehouse.yaml +++ b/datasets/cmas-data-warehouse.yaml @@ -60,6 +60,13 @@ Resources: Type: S3 Bucket Explore: - '[Browse Bucket](https://cmas-equates.s3.amazonaws.com/index.html)' + - Description: Community Multiscale Air Quality (CMAQ) 2019 3D Gridded and Column data from the EPA's Air Quality Time Series (EQUATES) Project + ARN: arn:aws:s3:::epa-equates-v1 + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://https://epa-equates-v1.s3.amazonaws.com/index.html)' + - '[EPA CMAQ 2019 3D Gridded and Column data from EQUATES Project](https://aws.amazon.com/marketplace/pp/prodview-kziefcewnxcxe?sr=0-5&ref_=beagle&applicationId=AWSMPContessa)' - Description: 
CMAQ 2023 12US4 CRACMM3 Modeling Platform ARN: arn:aws:s3::::::cmaq-12us4-cracmm3-modeling-platform-2023 Region: us-east-1 From eb7aa80df28082a0a95f59b6405968e5b492ab30 Mon Sep 17 00:00:00 2001 From: lizadams Date: Wed, 27 Aug 2025 12:25:57 -0400 Subject: [PATCH 273/751] update --- datasets/cmas-data-warehouse.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/cmas-data-warehouse.yaml b/datasets/cmas-data-warehouse.yaml index 80deb06e1..1f6dfa66d 100644 --- a/datasets/cmas-data-warehouse.yaml +++ b/datasets/cmas-data-warehouse.yaml @@ -65,7 +65,7 @@ Resources: Region: us-east-1 Type: S3 Bucket Explore: - - '[Browse Bucket](https://https://epa-equates-v1.s3.amazonaws.com/index.html)' + - '[Browse Bucket](https://epa-equates-v1.s3.amazonaws.com/index.html)' - '[EPA CMAQ 2019 3D Gridded and Column data from EQUATES Project](https://aws.amazon.com/marketplace/pp/prodview-kziefcewnxcxe?sr=0-5&ref_=beagle&applicationId=AWSMPContessa)' - Description: CMAQ 2023 12US4 CRACMM3 Modeling Platform ARN: arn:aws:s3::::::cmaq-12us4-cracmm3-modeling-platform-2023 From 7025e81002fad968894c8a2ee71332a4f0895cf7 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 27 Aug 2025 08:46:53 -0800 Subject: [PATCH 274/751] ok: Update cmas-data-warehouse.yaml From c8eff42cfe9900b3cb43f4fc6b1947c04866ed24 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 27 Aug 2025 09:31:42 -0800 Subject: [PATCH 275/751] ok: Update noaa-mrms-pds.yaml From d1fa554dd3b27f7e814858ace9f2f87a643efef5 Mon Sep 17 00:00:00 2001 From: Eyal Ben-Hur Date: Thu, 28 Aug 2025 01:05:35 +0300 Subject: [PATCH 276/751] Update huj-herbarium.yaml Capitalize S3 Bucket --- datasets/huj-herbarium.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/huj-herbarium.yaml b/datasets/huj-herbarium.yaml index a8727004f..aa0a6ec3d 100644 --- a/datasets/huj-herbarium.yaml +++ b/datasets/huj-herbarium.yaml 
@@ -25,7 +25,7 @@ Resources: - Description: HUJ Herbarium Collection Images ARN: arn:aws:s3:::hujinnhc/specify_assets/ Region: il-central-1 - Type: S3 bucket + Type: S3 Bucket Explore: DataAtWork: Tutorials: @@ -49,3 +49,4 @@ DeprecatedNotice: ADXCategories: - + From d2e30aaa52d1c2eefd62502ea5515af1c85f34ac Mon Sep 17 00:00:00 2001 From: Yuk Kei Wan <41866052+yuukiiwa@users.noreply.github.com> Date: Mon, 1 Sep 2025 00:47:05 +0800 Subject: [PATCH 277/751] make bucket public --- .../yuukiiwa_application_placeholder.yaml | 30 --------------- frag-struc.yaml | 37 +++++++++++++++++++ 2 files changed, 37 insertions(+), 30 deletions(-) delete mode 100644 datasets/yuukiiwa_application_placeholder.yaml create mode 100644 frag-struc.yaml diff --git a/datasets/yuukiiwa_application_placeholder.yaml b/datasets/yuukiiwa_application_placeholder.yaml deleted file mode 100644 index 9f9b79afb..000000000 --- a/datasets/yuukiiwa_application_placeholder.yaml +++ /dev/null @@ -1,30 +0,0 @@ -Name: Update on August 31 -Description:Update on August 31, 2025 -Documentation: Update on August 31, 2025 -Contact: Update on August 31, 2025 -ManagedBy: "The Genome Institute of Singapore (https://www.a-star.edu.sg/gis) and UMass Chan Medical School's RNA Therapeutics Institute (https://www.umassmed.edu/rti/)" -UpdateFrequency: Datasets will be updated periodically as additional data are generated. 
-Tags: - - TBD -License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" -Citation: Update on August 31, 2025 -Resources: - - Description: Update on August 31, 2025 - ARN: Update on August 31, 2025 - Region: ap-southeast-1 - Type: S3 Bucket - Explore: - - '[Browse Bucket](http://frag-struc.s3-website-ap-southeast-1.amazonaws.com/)' -DataAtWork: - Tutorials: - - Title: Update on August 31, 2025 - URL: Update on August 31, 2025 - AuthorName: Leonard Schärfen and Yuk Kei Wan - Tools & Applications: - - Title: Update on August 31, 2025 - URL: Update on August 31, 2025 - AuthorName: Leonard Schärfen and Yuk Kei Wan - Publications: - - Title: Update on August 31, 2025 - URL: In Preparation - AuthorName: Leonard Schärfen and Yuk Kei Wan diff --git a/frag-struc.yaml b/frag-struc.yaml new file mode 100644 index 000000000..2e9bc7648 --- /dev/null +++ b/frag-struc.yaml @@ -0,0 +1,37 @@ +Name: RNA structure by fragmentation frequency +Description: "The fragSTRUC project devises a software to extract RNA secondary structure information from Illumina datasets, based on divalent ions in standard RNA-seq library preparation fragmenting sequences at non-base-paired regions of RNA." +Documentation: https://github.com/yuukiiwa/RNA_structure_by_fragmentation_frequency +Contact: "[fragSTRUC team](https://github.com/yuukiiwa/RNA_structure_by_fragmentation_frequency)" +ManagedBy: "The Genome Institute of Singapore (https://www.a-star.edu.sg/gis) and UMass Chan Medical School's RNA Therapeutics Institute (https://www.umassmed.edu/rti/)" +UpdateFrequency: Datasets will be updated periodically as additional data is generated. +Tags: + - RNA structure + - genomic + - transcriptomics + - life sciences + - Illumina sequencing + - bulk RNA sequencing + - bioinformatics + - bigwig +License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" +Citation: "In addition, please cite Yuk Kei Wan and Leonard Schärfen Hidden structural information in RNA sequencing data." 
+Resources: + - Description: RNA structure by fragmentation frequency + ARN: arn:aws:s3:::frag-struc + Region: ap-southeast-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](http://frag-struc.s3-website-ap-southeast-1.amazonaws.com/)' +DataAtWork: + Tutorials: + - Title: Accessing the fragSTRUC dataset on AWS + URL: https://github.com/yuukiiwa/RNA_structure_by_fragmentation_frequency/blob/main/README.md + AuthorName: Yuk Kei Wan and Leonard Schärfen + Tools & Applications: + - Title: "fragSTRUC: RNA structure by fragmentation frequency" + URL: https://github.com/lschaerfen/fragstruc + AuthorName: Yuk Kei Wan and Leonard Schärfen + Publications: + - Title: Hidden structural information in RNA sequencing data. + URL: In Preparation + AuthorName: Yuk Kei Wan and Leonard Schärfen From dc0021704315c2099128527f1652c01612641339 Mon Sep 17 00:00:00 2001 From: Changqing Wang Date: Fri, 29 Aug 2025 15:16:56 +1000 Subject: [PATCH 278/751] add LongBench --- datasets/longbench.yaml | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) create mode 100644 datasets/longbench.yaml diff --git a/datasets/longbench.yaml b/datasets/longbench.yaml new file mode 100644 index 000000000..a45cf817d --- /dev/null +++ b/datasets/longbench.yaml @@ -0,0 +1,35 @@ +Name: LongBench - cross-platform reference dataset profiling cancer cell lines with bulk and single-cell approaches +Description: > + LongBench is a comprehensive benchmark dataset of the latest long-read transcriptomics technologies from Oxford Nanopore (ON) and Pacific Biosciences, alongside a comparison with next-generation sequencing from Illumina. We generated bulk and single-cell libraries from lung cancer cell lines which include different cancer subtypes to capture real biological variation. To further compare and assess sequencing platform performance, Sequins and SIRVs (Set 4) synthetic spike-ins have been included. 
+Documentation: https://github.com/mritchielab/LongBench.io +Contact: mritchie@wehi.edu.au +ManagedBy: Richie Lab, Walter and Eliza Hall Institute of Medical Research +UpdateFrequency: New data is added as soon as it is available. +Tags: + - benchmark + - long read sequencing + - single-cell transcriptomics + - short read sequencing + - bioinformatics + - fastq + - pod5 + - bam + - vcf + - cancer + - life sciences + +License: "[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" +Resources: + - Description: Bulk, single-cell, and single-nucleus RNA-seq data from the LongBench project, covering eight human lung cancer cell lines. Bulk sequencing (FASTQ) was performed on ONT PCR-cDNA, ONT direct RNA (including pod5 files for RNA modification analysis), PacBio Kinnex, and Illumina platforms. Single-cell and single-nucleus sequencing (FASTQ) was performed on ONT PCR-cDNA, PacBio Kinnex, and Illumina platforms. Aligned reads (BAM), variant calls (VCF), and processed gene expression data are also provided, along with reference genome annotations (GTF and FASTA). + ARN: arn:aws:s3:::longbench-data + Region: ap-southeast-2 + Type: S3 Bucket + +DataAtWork: + Tutorials: + - Title: Benchmarking long-read DE gene and transcript analysis with edgeR + URL: https://mritchielab.github.io/LongBench.io/bulk-de-benchmarking/ + AuthorName: Yupei You + +ADXCategories: + - Healthcare & Life Sciences Data From 9ed9a146adfba56c8f86367ba0315ea094b17176 Mon Sep 17 00:00:00 2001 From: berylrab Date: Tue, 2 Sep 2025 14:46:19 -0400 Subject: [PATCH 279/751] ok:Update frag-struc.yaml From 7a14f020ba05ff85ca1565da09fb719e4d953013 Mon Sep 17 00:00:00 2001 From: Yupei You Date: Wed, 3 Sep 2025 09:49:25 +1000 Subject: [PATCH 280/751] Add longbench, update license Corrected the phrasing for data availability and updated the license format. 
--- datasets/longbench.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/longbench.yaml b/datasets/longbench.yaml index a45cf817d..278c68206 100644 --- a/datasets/longbench.yaml +++ b/datasets/longbench.yaml @@ -4,7 +4,7 @@ Description: > Documentation: https://github.com/mritchielab/LongBench.io Contact: mritchie@wehi.edu.au ManagedBy: Richie Lab, Walter and Eliza Hall Institute of Medical Research -UpdateFrequency: New data is added as soon as it is available. +UpdateFrequency: New data will be added as soon as they are available. Tags: - benchmark - long read sequencing @@ -18,7 +18,7 @@ Tags: - cancer - life sciences -License: "[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" +License: CC BY-4.0 Resources: - Description: Bulk, single-cell, and single-nucleus RNA-seq data from the LongBench project, covering eight human lung cancer cell lines. Bulk sequencing (FASTQ) was performed on ONT PCR-cDNA, ONT direct RNA (including pod5 files for RNA modification analysis), PacBio Kinnex, and Illumina platforms. Single-cell and single-nucleus sequencing (FASTQ) was performed on ONT PCR-cDNA, PacBio Kinnex, and Illumina platforms. Aligned reads (BAM), variant calls (VCF), and processed gene expression data are also provided, along with reference genome annotations (GTF and FASTA). 
ARN: arn:aws:s3:::longbench-data From abbac42c08ceea14be2ca3553e48d5f10f478876 Mon Sep 17 00:00:00 2001 From: Mansour A <44963644+mansour2002@users.noreply.github.com> Date: Tue, 2 Sep 2025 22:06:09 -0700 Subject: [PATCH 281/751] Adding YAML file --- datasets/ucsf-rmac.yaml | 49 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) create mode 100644 datasets/ucsf-rmac.yaml diff --git a/datasets/ucsf-rmac.yaml b/datasets/ucsf-rmac.yaml new file mode 100644 index 000000000..0ad543427 --- /dev/null +++ b/datasets/ucsf-rmac.yaml @@ -0,0 +1,49 @@ +Name: UCSF Renal Mass CT Dataset +Description: This dataset provides a set of 831 3D Multiphase CT exams of renal masses, registered across phases with annotations identifying the masses +Documentation: https://github.com/LarsonLab/UCSF-RMaC +Contact: "[Peder Larson](peder.larson@ucsf.edu)]" +ManagedBy: "[UCSF Larson Advanced Imaging Lab](https://larsonlab.github.io/)" +UpdateFrequency: ad hoc +Tags: + - aws-pds + - cancer + - life sciences + - computed tomography + - medicine + - medical imaging + - radiology +License: https://creativecommons.org/licenses/by/4.0/ +Citation: +Resources: + - Description: Renal Mass CT Data on S3 + ARN: arn:aws:s3:::ucsf-rmac-dataset + Region: us-west-2 + Type: S3 Bucket + Explore: https://s3.console.aws.amazon.com/s3/buckets/ucsf-rmac-dataset +DataAtWork: + Tutorials: + - Title: Label Exploration Tutorial + URL: https://github.com/LarsonLab/UCSF-RMaC/blob/main/tutorials/labelexploration.ipynb + NotebookURL: + AuthorName: Sule Sahin + AuthorURL: https://github.com/sule-sahin + Services: S3 + - Title: Mask Overlays + URL: https://github.com/LarsonLab/UCSF-RMaC/blob/main/tutorials/maskoverlays.ipynb + NotebookURL: + AuthorName: Sule Sahin + AuthorURL: https://github.com/sule-sahin + Services: S3 + Tools & Applications: + - Title: UCSF Renal Mass CT Dataset + URL: https://github.com/LarsonLab/UCSF-RMaC + AuthorName: Peder Larson + AuthorURL: 
https://scholar.google.com/citations?user=LrQ7YekAAAAJ&hl=en + Publications: + - Title: + URL: + AuthorName: + AuthorURL: +DeprecatedNotice: +ADXCategories: Healthcare & Life Sciences Data + - \ No newline at end of file From bbea69c8c7effbdf04b3871c8c3fe21bbe0a8edd Mon Sep 17 00:00:00 2001 From: Peter Schmiedeskamp Date: Wed, 3 Sep 2025 06:37:31 -0700 Subject: [PATCH 282/751] ok: remove trailing whitespace From 1410fe25cffac0cffcae06baa562e48f2b2b339c Mon Sep 17 00:00:00 2001 From: Patrick Custer Date: Wed, 3 Sep 2025 13:45:36 -0400 Subject: [PATCH 283/751] Update panstarrs.yaml update panstarrs arn --- datasets/panstarrs.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/panstarrs.yaml b/datasets/panstarrs.yaml index 24107d68d..75e4c3513 100644 --- a/datasets/panstarrs.yaml +++ b/datasets/panstarrs.yaml @@ -12,7 +12,7 @@ Tags: License: STScI hereby grants the non-exclusive, royalty-free, non-transferable, worldwide right and license to use, reproduce, and publicly display in all media data from the PS1 surveys. 
Resources: - Description: PS1 DR1 and DR2 image files - ARN: arn:aws:s3:::stpubdata/ps1 + ARN: arn:aws:s3:::stpubdata/panstarrs/ps1 Region: us-east-1 Type: S3 Bucket RequesterPays: False From 4023773926903546afd50073ad0a75bcfc075804 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 3 Sep 2025 09:48:11 -0800 Subject: [PATCH 284/751] Update panstarrs.yaml From 87c7cf3947650c26ee439ac85d42cf523dfdc741 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 3 Sep 2025 10:31:44 -0800 Subject: [PATCH 285/751] ok: Update panstarrs.yaml From 28b8a5dd607660208cd20dad698d72bf737b5d47 Mon Sep 17 00:00:00 2001 From: State of Colorado OIT-GIS Date: Wed, 3 Sep 2025 15:38:32 -0600 Subject: [PATCH 286/751] Added Colorado elevation data --- registry.yml | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 registry.yml diff --git a/registry.yml b/registry.yml new file mode 100644 index 000000000..ae8062b36 --- /dev/null +++ b/registry.yml @@ -0,0 +1,29 @@ +Name: State of Colorado Elevation Data +Description: The State of Colorado has gathered public historical elevation data. 
+Documentation: https://docs.google.com/document/d/1HMO-d4cCrBvFa2F6-N3lhP6rkezlvBmSUFA5S8t_ekQ/edit?usp=sharing +Contact: oit_gis@state.co.us +ManagedBy: State of Colorado Governor's Office of Information Technology (OIT) GIS team +UpdateFrequency: Periodically +Tags: + - aws-pds + - geospatial + - imaging + - mapping +License: https://creativecommons.org/publicdomain/zero/1.0/legalcode +Resources: + - Description: + ARN: arn:aws:s3:::colorado-public-elevation-data + Region: us-west-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Colorado AWS Open Data Elevation Data Guide + URL: https://docs.google.com/document/d/1pAHZB6SgSE4QTawEbSnIIHpxVCTBg-IjQ6X9KJP28BM/edit?usp=sharing + AuthorName: State of Colorado OIT-GIS + AuthorURL: https://geodata.colorado.gov/ + - Title: Colorado Public Elevation Data s3 Browser + URL: https://colorado-public-elevation-data.s3.amazonaws.com/index.html + AuthorName: State of Colorado OIT-GIS + AuthorURL: https://geodata.colorado.gov/ +ADXCategories: + - Public Sector Data \ No newline at end of file From 48cb0335a7bda3fe0a283cb83c77517111321a86 Mon Sep 17 00:00:00 2001 From: State of Colorado OIT-GIS Date: Wed, 3 Sep 2025 15:39:11 -0600 Subject: [PATCH 287/751] Delete registry.yml --- registry.yml | 29 ----------------------------- 1 file changed, 29 deletions(-) delete mode 100644 registry.yml diff --git a/registry.yml b/registry.yml deleted file mode 100644 index ae8062b36..000000000 --- a/registry.yml +++ /dev/null @@ -1,29 +0,0 @@ -Name: State of Colorado Elevation Data -Description: The State of Colorado has gathered public historical elevation data. 
-Documentation: https://docs.google.com/document/d/1HMO-d4cCrBvFa2F6-N3lhP6rkezlvBmSUFA5S8t_ekQ/edit?usp=sharing -Contact: oit_gis@state.co.us -ManagedBy: State of Colorado Governor's Office of Information Technology (OIT) GIS team -UpdateFrequency: Periodically -Tags: - - aws-pds - - geospatial - - imaging - - mapping -License: https://creativecommons.org/publicdomain/zero/1.0/legalcode -Resources: - - Description: - ARN: arn:aws:s3:::colorado-public-elevation-data - Region: us-west-2 - Type: S3 Bucket -DataAtWork: - Tutorials: - - Title: Colorado AWS Open Data Elevation Data Guide - URL: https://docs.google.com/document/d/1pAHZB6SgSE4QTawEbSnIIHpxVCTBg-IjQ6X9KJP28BM/edit?usp=sharing - AuthorName: State of Colorado OIT-GIS - AuthorURL: https://geodata.colorado.gov/ - - Title: Colorado Public Elevation Data s3 Browser - URL: https://colorado-public-elevation-data.s3.amazonaws.com/index.html - AuthorName: State of Colorado OIT-GIS - AuthorURL: https://geodata.colorado.gov/ -ADXCategories: - - Public Sector Data \ No newline at end of file From e013d73658fdc7ece1a855118dc068d421bb938d Mon Sep 17 00:00:00 2001 From: State of Colorado OIT-GIS Date: Wed, 3 Sep 2025 15:40:22 -0600 Subject: [PATCH 288/751] Added Colorado Public Elevation Data --- datasets/colorado-public-elevation-data.yml | 29 +++++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 datasets/colorado-public-elevation-data.yml diff --git a/datasets/colorado-public-elevation-data.yml b/datasets/colorado-public-elevation-data.yml new file mode 100644 index 000000000..ae8062b36 --- /dev/null +++ b/datasets/colorado-public-elevation-data.yml @@ -0,0 +1,29 @@ +Name: State of Colorado Elevation Data +Description: The State of Colorado has gathered public historical elevation data. 
+Documentation: https://docs.google.com/document/d/1HMO-d4cCrBvFa2F6-N3lhP6rkezlvBmSUFA5S8t_ekQ/edit?usp=sharing +Contact: oit_gis@state.co.us +ManagedBy: State of Colorado Governor's Office of Information Technology (OIT) GIS team +UpdateFrequency: Periodically +Tags: + - aws-pds + - geospatial + - imaging + - mapping +License: https://creativecommons.org/publicdomain/zero/1.0/legalcode +Resources: + - Description: + ARN: arn:aws:s3:::colorado-public-elevation-data + Region: us-west-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Colorado AWS Open Data Elevation Data Guide + URL: https://docs.google.com/document/d/1pAHZB6SgSE4QTawEbSnIIHpxVCTBg-IjQ6X9KJP28BM/edit?usp=sharing + AuthorName: State of Colorado OIT-GIS + AuthorURL: https://geodata.colorado.gov/ + - Title: Colorado Public Elevation Data s3 Browser + URL: https://colorado-public-elevation-data.s3.amazonaws.com/index.html + AuthorName: State of Colorado OIT-GIS + AuthorURL: https://geodata.colorado.gov/ +ADXCategories: + - Public Sector Data \ No newline at end of file From e1213860bb05582835c099fab0c35d17874c2a8b Mon Sep 17 00:00:00 2001 From: State of Colorado OIT-GIS Date: Wed, 3 Sep 2025 15:41:07 -0600 Subject: [PATCH 289/751] Delete datasets/colorado-public-elevation-data.yml --- datasets/colorado-public-elevation-data.yml | 29 --------------------- 1 file changed, 29 deletions(-) delete mode 100644 datasets/colorado-public-elevation-data.yml diff --git a/datasets/colorado-public-elevation-data.yml b/datasets/colorado-public-elevation-data.yml deleted file mode 100644 index ae8062b36..000000000 --- a/datasets/colorado-public-elevation-data.yml +++ /dev/null @@ -1,29 +0,0 @@ -Name: State of Colorado Elevation Data -Description: The State of Colorado has gathered public historical elevation data. 
-Documentation: https://docs.google.com/document/d/1HMO-d4cCrBvFa2F6-N3lhP6rkezlvBmSUFA5S8t_ekQ/edit?usp=sharing -Contact: oit_gis@state.co.us -ManagedBy: State of Colorado Governor's Office of Information Technology (OIT) GIS team -UpdateFrequency: Periodically -Tags: - - aws-pds - - geospatial - - imaging - - mapping -License: https://creativecommons.org/publicdomain/zero/1.0/legalcode -Resources: - - Description: - ARN: arn:aws:s3:::colorado-public-elevation-data - Region: us-west-2 - Type: S3 Bucket -DataAtWork: - Tutorials: - - Title: Colorado AWS Open Data Elevation Data Guide - URL: https://docs.google.com/document/d/1pAHZB6SgSE4QTawEbSnIIHpxVCTBg-IjQ6X9KJP28BM/edit?usp=sharing - AuthorName: State of Colorado OIT-GIS - AuthorURL: https://geodata.colorado.gov/ - - Title: Colorado Public Elevation Data s3 Browser - URL: https://colorado-public-elevation-data.s3.amazonaws.com/index.html - AuthorName: State of Colorado OIT-GIS - AuthorURL: https://geodata.colorado.gov/ -ADXCategories: - - Public Sector Data \ No newline at end of file From 717b40ab7f5427bd3ec409bacfcdd45e2af9f7dd Mon Sep 17 00:00:00 2001 From: State of Colorado OIT-GIS Date: Wed, 3 Sep 2025 15:41:53 -0600 Subject: [PATCH 290/751] uploaded colorado elevation data --- datasets/colorado-elevation-data.yml | 29 ++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 datasets/colorado-elevation-data.yml diff --git a/datasets/colorado-elevation-data.yml b/datasets/colorado-elevation-data.yml new file mode 100644 index 000000000..ae8062b36 --- /dev/null +++ b/datasets/colorado-elevation-data.yml @@ -0,0 +1,29 @@ +Name: State of Colorado Elevation Data +Description: The State of Colorado has gathered public historical elevation data. 
+Documentation: https://docs.google.com/document/d/1HMO-d4cCrBvFa2F6-N3lhP6rkezlvBmSUFA5S8t_ekQ/edit?usp=sharing +Contact: oit_gis@state.co.us +ManagedBy: State of Colorado Governor's Office of Information Technology (OIT) GIS team +UpdateFrequency: Periodically +Tags: + - aws-pds + - geospatial + - imaging + - mapping +License: https://creativecommons.org/publicdomain/zero/1.0/legalcode +Resources: + - Description: + ARN: arn:aws:s3:::colorado-public-elevation-data + Region: us-west-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Colorado AWS Open Data Elevation Data Guide + URL: https://docs.google.com/document/d/1pAHZB6SgSE4QTawEbSnIIHpxVCTBg-IjQ6X9KJP28BM/edit?usp=sharing + AuthorName: State of Colorado OIT-GIS + AuthorURL: https://geodata.colorado.gov/ + - Title: Colorado Public Elevation Data s3 Browser + URL: https://colorado-public-elevation-data.s3.amazonaws.com/index.html + AuthorName: State of Colorado OIT-GIS + AuthorURL: https://geodata.colorado.gov/ +ADXCategories: + - Public Sector Data \ No newline at end of file From 3b321827d4b4be3fada93cc337969c105d8931d1 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 3 Sep 2025 14:01:04 -0800 Subject: [PATCH 291/751] ok: Update colorado-elevation-data.yml --- datasets/colorado-elevation-data.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/colorado-elevation-data.yml b/datasets/colorado-elevation-data.yml index ae8062b36..d8014e025 100644 --- a/datasets/colorado-elevation-data.yml +++ b/datasets/colorado-elevation-data.yml @@ -26,4 +26,4 @@ DataAtWork: AuthorName: State of Colorado OIT-GIS AuthorURL: https://geodata.colorado.gov/ ADXCategories: - - Public Sector Data \ No newline at end of file + - Public Sector Data From e81beb1866876c10cab2148712faacb14acc7c08 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 3 Sep 2025 14:03:10 -0800 Subject: [PATCH 292/751] ok: Rename 
colorado-elevation-data.yml to colorado-elevation-data.yaml --- .../{colorado-elevation-data.yml => colorado-elevation-data.yaml} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename datasets/{colorado-elevation-data.yml => colorado-elevation-data.yaml} (100%) diff --git a/datasets/colorado-elevation-data.yml b/datasets/colorado-elevation-data.yaml similarity index 100% rename from datasets/colorado-elevation-data.yml rename to datasets/colorado-elevation-data.yaml From 2b00f4ee797bb8d4d9ee94900522766ba6a4c638 Mon Sep 17 00:00:00 2001 From: State of Colorado OIT-GIS Date: Thu, 4 Sep 2025 07:25:37 -0600 Subject: [PATCH 293/751] Update colorado-elevation-data.yaml --- datasets/colorado-elevation-data.yaml | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/datasets/colorado-elevation-data.yaml b/datasets/colorado-elevation-data.yaml index d8014e025..450b3ad92 100644 --- a/datasets/colorado-elevation-data.yaml +++ b/datasets/colorado-elevation-data.yaml @@ -11,10 +11,14 @@ Tags: - mapping License: https://creativecommons.org/publicdomain/zero/1.0/legalcode Resources: - - Description: + - Description: Colorado Elevation Data (LiDAR) ARN: arn:aws:s3:::colorado-public-elevation-data Region: us-west-2 Type: S3 Bucket + - Description: Notifications for new Colorado Elevation data + ARN: arn:aws:sns:us-west-2:180294215083:colorado-public-elevation-data-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Tutorials: - Title: Colorado AWS Open Data Elevation Data Guide From 683949169bfae169c496f5d3c881cdc25cc3a586 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 4 Sep 2025 07:36:21 -0800 Subject: [PATCH 294/751] ok: Update colorado-elevation-data.yaml From 29ce9d6735eb3498413f7dcc76d4fa59a9f281b1 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 4 Sep 2025 08:45:09 -0800 Subject: [PATCH 295/751] ok: Update 
aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml adding aws-pds tag --- ...aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml b/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml index 040d426dd..457a06109 100644 --- a/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml +++ b/datasets/aodn_radar_newcastle_velocity_hourly_averaged_delayed_qc.yaml @@ -20,6 +20,7 @@ Collabs: Tags: - oceans Tags: +- aws-pds - oceans - ocean currents - ocean velocity From b0278ed0979121b6262494b0f2fdb931599b94bc Mon Sep 17 00:00:00 2001 From: Louis Erbkamm Date: Fri, 5 Sep 2025 16:55:35 +0200 Subject: [PATCH 296/751] Add Arnis project to terrain-tiles.yaml --- datasets/terrain-tiles.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/terrain-tiles.yaml b/datasets/terrain-tiles.yaml index 4ce3b3fc5..0e3a5bdf5 100644 --- a/datasets/terrain-tiles.yaml +++ b/datasets/terrain-tiles.yaml @@ -89,6 +89,10 @@ DataAtWork: URL: https://app.shadowmap.org/ AuthorName: Shadowmap Technologies GmbH AuthorURL: https://shadowmap.org + - Title: "Arnis: Generate any location from the real world in Minecraft with a high level of detail" + URL: https://github.com/louis-e/arnis + AuthorName: Louis Erbkamm + AuthorURL: https://louisdev.de/ Publications: - Title: "Landscape transformations produce favorable roosting conditions for turkey vultures and black vultures" URL: https://www.nature.com/articles/s41598-021-94045-3 From 7cafa57e5c3b67d92ce3d1e7032d8346ada4d30f Mon Sep 17 00:00:00 2001 From: David Turner Date: Fri, 5 Sep 2025 18:17:34 -0400 Subject: [PATCH 297/751] Update nasa-heasarc.yaml --- datasets/nasa-heasarc.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/datasets/nasa-heasarc.yaml b/datasets/nasa-heasarc.yaml index f1518f231..8823931ab 100644 --- a/datasets/nasa-heasarc.yaml +++ 
b/datasets/nasa-heasarc.yaml @@ -173,6 +173,11 @@ Resources: Region: us-east-1 Type: S3 Bucket +- Description: The [XRISM Mission](https://heasarc.gsfc.nasa.gov/docs/heasarc/missions/xrism.html) Data Archive. Total size > 1 TB. + ARN: arn:aws:s3:::nasa-heasarc/xrism/data/obs/ + Region: us-east-1 + Type: S3 Bucket + DataAtWork: Tutorials: - Title: HEASARC Cloud access page From 523e1cb368d6e608b07d43ecb8e86c2533fbc686 Mon Sep 17 00:00:00 2001 From: David Turner Date: Fri, 5 Sep 2025 18:25:23 -0400 Subject: [PATCH 298/751] Add SRG-eROSITA entry to the nasa-heasarc s3 description --- datasets/nasa-heasarc.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/datasets/nasa-heasarc.yaml b/datasets/nasa-heasarc.yaml index 8823931ab..2a8969f42 100644 --- a/datasets/nasa-heasarc.yaml +++ b/datasets/nasa-heasarc.yaml @@ -142,6 +142,11 @@ Resources: ARN: arn:aws:s3:::nasa-heasarc/sax/data/ Region: us-east-1 Type: S3 Bucket + + - Description: The [SRG Mission](https://heasarc.gsfc.nasa.gov/docs/heasarc/missions/srg.html) eROSITA Instrument Data Archive. More information available at [the eROSITA support site](https://heasarc.gsfc.nasa.gov/docs/srg/erosita/). Total size > 3 TB. + ARN: arn:aws:s3:::nasa-heasarc/srg/data/erosita/ + Region: us-east-1 + Type: S3 Bucket - Description: The [Suzaku Mission](https://heasarc.gsfc.nasa.gov/docs/astroe/astroe2.html) Data Archive. For more information, see the website of the [Suzaku/Astro-E2 Guest Observer Facility](https://heasarc.gsfc.nasa.gov/docs/suzaku/astroegof.html). Total size 5.4 TB. 
ARN: arn:aws:s3:::nasa-heasarc/suzaku/data/ From 60c94398f28dcfa0aa4e9a7e1a6eca3ccf249398 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 5 Sep 2025 14:41:29 -0800 Subject: [PATCH 299/751] Update nasa-heasarc.yaml --- datasets/nasa-heasarc.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/nasa-heasarc.yaml b/datasets/nasa-heasarc.yaml index 2a8969f42..ab3418f91 100644 --- a/datasets/nasa-heasarc.yaml +++ b/datasets/nasa-heasarc.yaml @@ -24,7 +24,6 @@ Tags: - imaging - satellite imagery - x-ray - License: See [the HEASARC data policy web site](https://heasarc.gsfc.nasa.gov/docs/heasarc/data_policy.html) Citation: See [the HEASARC data policy web site](https://heasarc.gsfc.nasa.gov/docs/heasarc/data_policy.html) Resources: From 2f9b62c15fa511714483dda59ec9bb5ad64a0b3f Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 5 Sep 2025 14:41:46 -0800 Subject: [PATCH 300/751] ok: Update nasa-heasarc.yaml From 015adf8971406cbae5ca228331b0f4d083413427 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 5 Sep 2025 14:48:15 -0800 Subject: [PATCH 301/751] ok: Update nasa-heasarc.yaml --- datasets/nasa-heasarc.yaml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/datasets/nasa-heasarc.yaml b/datasets/nasa-heasarc.yaml index ab3418f91..0b5575438 100644 --- a/datasets/nasa-heasarc.yaml +++ b/datasets/nasa-heasarc.yaml @@ -24,6 +24,7 @@ Tags: - imaging - satellite imagery - x-ray + License: See [the HEASARC data policy web site](https://heasarc.gsfc.nasa.gov/docs/heasarc/data_policy.html) Citation: See [the HEASARC data policy web site](https://heasarc.gsfc.nasa.gov/docs/heasarc/data_policy.html) Resources: @@ -142,7 +143,7 @@ Resources: Region: us-east-1 Type: S3 Bucket - - Description: The [SRG Mission](https://heasarc.gsfc.nasa.gov/docs/heasarc/missions/srg.html) eROSITA Instrument Data Archive. 
More information available at [the eROSITA support site](https://heasarc.gsfc.nasa.gov/docs/srg/erosita/). Total size > 3 TB. + - Description: "The [SRG Mission](https://heasarc.gsfc.nasa.gov/docs/heasarc/missions/srg.html) eROSITA Instrument Data Archive. More information available at [the eROSITA support site](https://heasarc.gsfc.nasa.gov/docs/srg/erosita/). Total size > 3 TB." ARN: arn:aws:s3:::nasa-heasarc/srg/data/erosita/ Region: us-east-1 Type: S3 Bucket @@ -177,7 +178,7 @@ Resources: Region: us-east-1 Type: S3 Bucket -- Description: The [XRISM Mission](https://heasarc.gsfc.nasa.gov/docs/heasarc/missions/xrism.html) Data Archive. Total size > 1 TB. +- Description: "The [XRISM Mission](https://heasarc.gsfc.nasa.gov/docs/heasarc/missions/xrism.html) Data Archive. Total size > 1 TB." ARN: arn:aws:s3:::nasa-heasarc/xrism/data/obs/ Region: us-east-1 Type: S3 Bucket From 92cab2a763432f284017a54be492cdc6f65c83af Mon Sep 17 00:00:00 2001 From: David Turner Date: Fri, 5 Sep 2025 18:54:13 -0400 Subject: [PATCH 302/751] Update nasa-heasarc.yaml I indented the XRISM entry one character too few --- datasets/nasa-heasarc.yaml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/datasets/nasa-heasarc.yaml b/datasets/nasa-heasarc.yaml index 0b5575438..83adbded9 100644 --- a/datasets/nasa-heasarc.yaml +++ b/datasets/nasa-heasarc.yaml @@ -178,10 +178,10 @@ Resources: Region: us-east-1 Type: S3 Bucket -- Description: "The [XRISM Mission](https://heasarc.gsfc.nasa.gov/docs/heasarc/missions/xrism.html) Data Archive. Total size > 1 TB." - ARN: arn:aws:s3:::nasa-heasarc/xrism/data/obs/ - Region: us-east-1 - Type: S3 Bucket + - Description: "The [XRISM Mission](https://heasarc.gsfc.nasa.gov/docs/heasarc/missions/xrism.html) Data Archive. Total size > 1 TB." 
+ ARN: arn:aws:s3:::nasa-heasarc/xrism/data/obs/ + Region: us-east-1 + Type: S3 Bucket DataAtWork: Tutorials: From 09f31256faf52a0f9eaaa3fee1c721505fad3815 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 5 Sep 2025 14:55:41 -0800 Subject: [PATCH 303/751] ok: Update nasa-heasarc.yaml From 0ae8647537b3bcf1308ee7405d9bcfc7c5dfc3ed Mon Sep 17 00:00:00 2001 From: Pablo Cingolani Date: Sat, 6 Sep 2025 05:55:33 -0400 Subject: [PATCH 304/751] Added snpeff.yaml --- datasets/snpeff.yaml | 56 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) create mode 100644 datasets/snpeff.yaml diff --git a/datasets/snpeff.yaml b/datasets/snpeff.yaml new file mode 100644 index 000000000..62b884635 --- /dev/null +++ b/datasets/snpeff.yaml @@ -0,0 +1,56 @@ +Name: SnpEff & SnpSift Genomic Variant Annotation Databases +Description: "SnpEff is a variant annotation and effect prediction tool that annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes). It supports over 38,000 genomes and provides comprehensive genomic databases for variant annotation. The databases include reference genomes, gene annotations, protein sequences, and regulatory elements from trusted sources like ENSEMBL, RefSeq, and UCSC. SnpSift complements SnpEff by providing tools to annotate genomic variants using databases, filter large genomic datasets, and manipulate annotated variants. Together, these tools provide a complete solution for genomic variant analysis, supporting research in human genetics, cancer genomics, pharmacogenomics, and model organism studies." 
+Contact: Pablo Cingolani +Documentation: https://pcingola.github.io/SnpEff/ +ManagedBy: "[Pablo Cingolani](http://www.linkedin.com/in/pablocingolani)" +UpdateFrequency: Monthly +Tags: + - life sciences + - genomic + - variant annotation + - bioinformatics + - genetic + - genome + - cancer + - protein + - vcf + - whole genome sequencing + - whole exome sequencing + - transcriptomics + - structural variation +License: "[MIT License](https://opensource.org/licenses/MIT)" +Resources: + - Description: "Pre-built genomic databases for many reference genomes including human (GRCh37, GRCh38), mouse, rat, and other model organisms. Each database contains gene annotations, transcript information, protein sequences, and regulatory elements required for variant annotation." + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: SnpEff Documentation + URL: https://pcingola.github.io/SnpEff/ + AuthorName: Pablo Cingolani + AuthorURL: http://www.linkedin.com/in/pablocingolani + - Title: SnpEff Introduction and Quick Start + URL: https://pcingola.github.io/SnpEff/snpeff/introduction/ + AuthorName: SnpEff Project + AuthorURL: https://pcingola.github.io/SnpEff/ + - Title: Building Custom SnpEff Databases + URL: https://pcingola.github.io/SnpEff/snpeff/build_db/ + AuthorName: SnpEff Project + AuthorURL: https://pcingola.github.io/SnpEff/ + Tools & Applications: + - Title: SnpEff + URL: https://github.com/pcingola/SnpEff + AuthorName: Pablo Cingolani + AuthorURL: http://www.linkedin.com/in/pablocingolani + - Title: SnpSift + URL: https://github.com/pcingola/SnpSift + AuthorName: Pablo Cingolani + AuthorURL: http://www.linkedin.com/in/pablocingolani + Publications: + - Title: "A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3" + URL: https://www.ncbi.nlm.nih.gov/pubmed/22728672 + AuthorName: Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, Land SJ, Lu X, 
Ruden DM + AuthorURL: https://www.ncbi.nlm.nih.gov/pubmed/?term=Cingolani%20P%5BAuthor%5D&cauthor=true&cauthor_uid=22728672 + - Title: "Using Drosophila melanogaster as a model for genotoxic chemical mutational studies with a new program, SnpSift" + URL: https://www.frontiersin.org/articles/10.3389/fgene.2012.00035/full + AuthorName: Cingolani P, Patel VM, Coon M, Nguyen T, Land SJ, Ruden DM, Lu X + AuthorURL: https://www.frontiersin.org/people/u/4691 \ No newline at end of file From 0af3c2e094ca147c379f451a378f94eddec0b519 Mon Sep 17 00:00:00 2001 From: Pablo Cingolani Date: Sat, 6 Sep 2025 05:58:59 -0400 Subject: [PATCH 305/751] Added tag "snpeff" --- datasets/snpeff.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/snpeff.yaml b/datasets/snpeff.yaml index 62b884635..d4a9a0704 100644 --- a/datasets/snpeff.yaml +++ b/datasets/snpeff.yaml @@ -5,6 +5,7 @@ Documentation: https://pcingola.github.io/SnpEff/ ManagedBy: "[Pablo Cingolani](http://www.linkedin.com/in/pablocingolani)" UpdateFrequency: Monthly Tags: + - snpeff - life sciences - genomic - variant annotation From a8a839580c019068d1eabca99f6cc9e1b234e28b Mon Sep 17 00:00:00 2001 From: Olivier Date: Sat, 6 Sep 2025 21:35:34 +0100 Subject: [PATCH 306/751] update IBL Nature publications --- datasets/ibl-autism.yaml | 41 ++++++++++++++++++++++++++++ datasets/ibl-behaviour.yaml | 10 +++---- datasets/ibl-brain-wide-map.yaml | 21 ++++++++++---- datasets/ibl-reproducible-ephys.yaml | 20 ++++++++------ 4 files changed, 73 insertions(+), 19 deletions(-) create mode 100644 datasets/ibl-autism.yaml diff --git a/datasets/ibl-autism.yaml b/datasets/ibl-autism.yaml new file mode 100644 index 000000000..f6e9fbe65 --- /dev/null +++ b/datasets/ibl-autism.yaml @@ -0,0 +1,41 @@ +Name: IBL Neuropixels Brainwide Map on AWS +Description: Electrophysiological recordings of mouse brain activity acquired during a decision making task in multiple autism mice models. 
+Documentation: https://docs.internationalbrainlab.org/notebooks_external/2025_data_release_autism_noel.html +Contact: info@internationalbrainlab.org +ManagedBy: "[International Brain Laboratory](https://www.internationalbrainlab.com)" +UpdateFrequency: TBD +Tags: + - aws-pds + - life sciences + - neuroscience + - neurophysiology + - open source software + - Mus musculus + - autism +License: CC-BY 4.0 +Resources: + - Description: Project data in public bucket + ARN: arn:aws:s3:::ibl-brain-wide-map-public + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Intermediate Datasets and Analysis Code + URL: https://osf.io/fap2s/ and https://osf.io/fap2s/wiki/home/ + AuthorName: Noel et al. + - Title: Download the public data via ONE + URL: https://docs.internationalbrainlab.org/notebooks_external/data_download.html + AuthorName: IBL Data Architecture Working Group + AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members + - Title: Find data associated with a release or publication + URL: https://docs.internationalbrainlab.org/notebooks_external/data_download.html#Find-data-associated-with-a-release-or-publication + AuthorName: IBL Data Architecture Working Group + AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members + - Title: Loading Data + URL: https://docs.internationalbrainlab.org/loading_examples.html + AuthorName: IBL Data Architecture Working Group + AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members + Publications: + - Title: A common computational and neural anomaly across mouse models of autism + URL: https://doi.org/10.1038/s41593-025-01965-8 + AuthorName: Noel et al. 
diff --git a/datasets/ibl-behaviour.yaml b/datasets/ibl-behaviour.yaml index c79e76aee..cb54d3bbc 100644 --- a/datasets/ibl-behaviour.yaml +++ b/datasets/ibl-behaviour.yaml @@ -1,6 +1,6 @@ Name: IBL Behavioral Data on AWS Description: Behavioral data of mice performing a decision-making task, associated with 2020 publication of the IBL. -Documentation: https://int-brain-lab.github.io/iblenv/notebooks_external/data_release_behavior.html +Documentation: https://docs.internationalbrainlab.org/notebooks_external/2021_data_release_behavior.html Contact: info@internationalbrainlab.org ManagedBy: "[International Brain Laboratory](https://www.internationalbrainlab.com)" UpdateFrequency: TBD @@ -29,19 +29,19 @@ DataAtWork: AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members - Title: Download the public data via ONE - URL: https://int-brain-lab.github.io/iblenv/notebooks_external/data_download.html + URL: https://docs.internationalbrainlab.org/notebooks_external/data_download.html AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members - Title: Find data associated with a release or publication - URL: https://int-brain-lab.github.io/iblenv/notebooks_external/data_download.html#Find-data-associated-with-a-release-or-publication + URL: https://docs.internationalbrainlab.org/notebooks_external/data_download.html#Find-data-associated-with-a-release-or-publication AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members - Title: Loading Data - URL: https://int-brain-lab.github.io/iblenv/loading_examples.html# + URL: https://docs.internationalbrainlab.org/loading_examples.html AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members Publications: - Title: Standardized and 
reproducible measurement of decision-making in mice URL: https://doi.org/10.7554/eLife.63711 AuthorName: International Brain Laboratory et al. - AuthorURL: www.internationalbrainlab.com \ No newline at end of file + AuthorURL: www.internationalbrainlab.com diff --git a/datasets/ibl-brain-wide-map.yaml b/datasets/ibl-brain-wide-map.yaml index 1ecc083d6..998456dcd 100644 --- a/datasets/ibl-brain-wide-map.yaml +++ b/datasets/ibl-brain-wide-map.yaml @@ -1,6 +1,6 @@ Name: IBL Neuropixels Brainwide Map on AWS -Description: Electrophysiological recordings of mouse brain activity acquired using Neuropixels probes and accompanying behavioral data. -Documentation: https://int-brain-lab.github.io/iblenv/notebooks_external/data_release_brainwidemap.html +Description: Electrophysiological recordings of mouse brain activity acquired during a decision making task. +Documentation: https://docs.internationalbrainlab.org/notebooks_external/2025_data_release_brainwidemap.html Contact: info@internationalbrainlab.org ManagedBy: "[International Brain Laboratory](https://www.internationalbrainlab.com)" UpdateFrequency: TBD @@ -40,15 +40,24 @@ DataAtWork: URL: https://colab.research.google.com/drive/1th3MRZGHMSaeAvGmKGJQ84rBk8eEI4Fu AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members - - Title: Download the public datasets - URL: https://int-brain-lab.github.io/iblenv/notebooks_external/data_download.html + - Title: Download the public data via ONE + URL: https://docs.internationalbrainlab.org/notebooks_external/data_download.html AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members - Title: Find data associated with a release or publication - URL: https://int-brain-lab.github.io/iblenv/notebooks_external/data_download.html#Find-data-associated-with-a-release-or-publication + URL: 
https://docs.internationalbrainlab.org/notebooks_external/data_download.html#Find-data-associated-with-a-release-or-publication AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members - Title: Loading Data - URL: https://int-brain-lab.github.io/iblenv/loading_examples.html# + URL: https://docs.internationalbrainlab.org/loading_examples.html AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members + Publications: + - Title: A brain-wide map of neural activity during complex behaviour + URL: https://doi.org/10.1038/s41586-025-09235-0 + AuthorName: International Brain Laboratory et al. + AuthorURL: www.internationalbrainlab.com + - Title: Brain-wide representations of prior information in mouse decision-making + URL: https://doi.org/10.1038/s41586-025-09226-1 + AuthorName: International Brain Laboratory et al. + AuthorURL: www.internationalbrainlab.com diff --git a/datasets/ibl-reproducible-ephys.yaml b/datasets/ibl-reproducible-ephys.yaml index 22150a6f0..9a16e1119 100644 --- a/datasets/ibl-reproducible-ephys.yaml +++ b/datasets/ibl-reproducible-ephys.yaml @@ -1,6 +1,6 @@ Name: IBL Neuropixels Reproducible Ephys Data on AWS Description: Electrophysiological recordings acquired using Neuropixels probes in different mice and labs, targeting the same brain locations (including posterior parietal cortex, hippocampus, and thalamus). 
-Documentation: https://int-brain-lab.github.io/iblenv/notebooks_external/data_release_repro_ephys.html +Documentation: https://docs.internationalbrainlab.org/notebooks_external/2024_data_release_repro_ephys.html Contact: info@internationalbrainlab.org ManagedBy: "[International Brain Laboratory](https://www.internationalbrainlab.com)" UpdateFrequency: TBD @@ -28,20 +28,24 @@ DataAtWork: AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members Tutorials: - - Title: Download the public datasets - URL: https://int-brain-lab.github.io/iblenv/notebooks_external/data_download.html + - Title: Compute the RIGOR metrics + URL: https://github.com/int-brain-lab/paper-reproducible-ephys/blob/2397f2cf5b92689f39e94ef7d8f76f0a7e2bd2a7/RIGOR_script.ipynb + AuthorName: IBL Data Architecture Working Group + AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members + - Title: Download the public data via ONE + URL: https://docs.internationalbrainlab.org/notebooks_external/data_download.html AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members - Title: Find data associated with a release or publication - URL: https://int-brain-lab.github.io/iblenv/notebooks_external/data_download.html#Find-data-associated-with-a-release-or-publication + URL: https://docs.internationalbrainlab.org/notebooks_external/data_download.html#Find-data-associated-with-a-release-or-publication AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members - Title: Loading Data - URL: https://int-brain-lab.github.io/iblenv/loading_examples.html# + URL: https://docs.internationalbrainlab.org/loading_examples.html AuthorName: IBL Data Architecture Working Group AuthorURL: https://github.com/orgs/int-brain-lab/teams/data-architecture-wg/members Publications: - - Title: 
Reproducibility of in-vivo electrophysiological measurements in mice - URL: https://doi.org/10.1101/2022.05.09.491042 + - Title: Reproducibility of in vivo electrophysiological measurements in mice + URL: https://doi.org/10.7554/eLife.100840.1 AuthorName: International Brain Laboratory et al. - AuthorURL: www.internationalbrainlab.com \ No newline at end of file + AuthorURL: www.internationalbrainlab.com From c28ce95fc30b095953f5a91735557b8eb6367317 Mon Sep 17 00:00:00 2001 From: blahner Date: Sun, 7 Sep 2025 15:53:03 -0700 Subject: [PATCH 307/751] add initial mosaic dataset yaml --- datasets/mosaic.yaml | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) create mode 100644 datasets/mosaic.yaml diff --git a/datasets/mosaic.yaml b/datasets/mosaic.yaml new file mode 100644 index 000000000..22b0ae817 --- /dev/null +++ b/datasets/mosaic.yaml @@ -0,0 +1,28 @@ +Name: Meta-Organized Stimuli And fMRI Imaging data for Computational modeling (MOSAIC) +Description: This extensible dataset, MOSAIC, aggregates individual functional magnetic resonance imaging (fMRI) datasets by leveraging a shared preprocessing pipeline and stimulus curation procedure. This dataset aggregation procedure achieves the scale necessary for neural network training and the diversity needed for generalizable results. +Documentation: https://github.com/blahner/mosaic-preprocessing +Contact: blahner@mit.edu +ManagedBy: Massachusetts Institute of Technology, Georgia Tech +UpdateFrequency: New data is uploaded as researchers preprocess their fMRI data according to MOSAIC format and submit. 
+Tags: + - fMRI + - Vision + - Image + - Video +License: CC BY 4.0 +Citation: +Resources: + - Description: HDF5 files containing preprocessed fMRI data + ARN: arn:aws:account::042585258830:account + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://042585258830:account.s3.amazonaws.com/index.html)' +DataAtWork: + Tutorials: + - Title: Load HDF5 file (Jupyter notebook) + URL: https://github.com/blahner/mosaic-preprocessing/blob/main/src/fmriDatasetPreparation/create_hdf5/load_hdf5.ipynb + NotebookURL: https://github.com/blahner/mosaic-preprocessing/blob/main/src/fmriDatasetPreparation/create_hdf5/load_hdf5.ipynb + AuthorName: Benjamin Lahner +ADXCategories: + - Healthcare & Life Sciences Data \ No newline at end of file From ade2ed6d745f1787c4adfd6148077409e297356f Mon Sep 17 00:00:00 2001 From: blahner Date: Sun, 7 Sep 2025 16:05:43 -0700 Subject: [PATCH 308/751] changed ARN and Explore url --- datasets/mosaic.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/mosaic.yaml b/datasets/mosaic.yaml index 22b0ae817..e09fbcd41 100644 --- a/datasets/mosaic.yaml +++ b/datasets/mosaic.yaml @@ -13,11 +13,11 @@ License: CC BY 4.0 Citation: Resources: - Description: HDF5 files containing preprocessed fMRI data - ARN: arn:aws:account::042585258830:account + ARN: arn:aws:s3:::mosaic-fmri Region: us-east-1 Type: S3 Bucket Explore: - - '[Browse Bucket](https://042585258830:account.s3.amazonaws.com/index.html)' + - '[Browse Bucket](https://mosaic-fmri.s3.amazonaws.com/index.html)' DataAtWork: Tutorials: - Title: Load HDF5 file (Jupyter notebook) From c472ce54410d416f4c25647e20196755a81206ab Mon Sep 17 00:00:00 2001 From: berylrab Date: Mon, 8 Sep 2025 16:13:01 -0400 Subject: [PATCH 309/751] ok: Update ibl-autism.yaml From d151b37f2f1ee9dbd53ca1425622f0a9f26d5a8a Mon Sep 17 00:00:00 2001 From: berylrab Date: Mon, 8 Sep 2025 16:18:35 -0400 Subject: [PATCH 310/751] ok:Update ibl-autism.yaml --- datasets/ibl-autism.yaml | 2 +- 
1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ibl-autism.yaml b/datasets/ibl-autism.yaml index f6e9fbe65..84435dbfe 100644 --- a/datasets/ibl-autism.yaml +++ b/datasets/ibl-autism.yaml @@ -11,7 +11,7 @@ Tags: - neurophysiology - open source software - Mus musculus - - autism + - autism spectrum disorder License: CC-BY 4.0 Resources: - Description: Project data in public bucket From 66d64401bb53fd6aaafe2dc0bef8a5d8c6361dcc Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Sep 2025 14:13:05 -0800 Subject: [PATCH 311/751] ok: Update mosaic.yaml --- datasets/mosaic.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/mosaic.yaml b/datasets/mosaic.yaml index e09fbcd41..7e64adc4e 100644 --- a/datasets/mosaic.yaml +++ b/datasets/mosaic.yaml @@ -5,6 +5,7 @@ Contact: blahner@mit.edu ManagedBy: Massachusetts Institute of Technology, Georgia Tech UpdateFrequency: New data is uploaded as researchers preprocess their fMRI data according to MOSAIC format and submit. 
Tags: + - aws-pds - fMRI - Vision - Image @@ -25,4 +26,4 @@ DataAtWork: NotebookURL: https://github.com/blahner/mosaic-preprocessing/blob/main/src/fmriDatasetPreparation/create_hdf5/load_hdf5.ipynb AuthorName: Benjamin Lahner ADXCategories: - - Healthcare & Life Sciences Data \ No newline at end of file + - Healthcare & Life Sciences Data From 6bcd8cce3afe0ab4315720a551cdf67d915d108c Mon Sep 17 00:00:00 2001 From: mhuynh-au <76926809+mhuynh-au@users.noreply.github.com> Date: Tue, 9 Sep 2025 13:55:36 +0800 Subject: [PATCH 312/751] Add new dataset: CSIRO ASKAP Radio Telescope --- datasets/askap.yaml | 44 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) create mode 100644 datasets/askap.yaml diff --git a/datasets/askap.yaml b/datasets/askap.yaml new file mode 100644 index 000000000..4764ebbf9 --- /dev/null +++ b/datasets/askap.yaml @@ -0,0 +1,44 @@ +Name: ASKAP Radio Telescope +Description: | + +ASKAP is the CSIRO’s newest radio telescope. It is situated at the Inyarrimanha Ilgari Bundara, the CSIRO Murchison Radio-astronomy Observatory on Wajarri Yamaji Country in the Murchison region of Western Australia, about 800 km north of Perth. + +ASKAP consists of 36 12m dishes, spread-out as far as 6km apart. It uses a new technology called Phased Array Feeds (PAFs), which allows it to see more of the sky at once. This novel technology allows ASKAP to achieve extremely high survey speed, making it one of the best instruments in the world for mapping the sky at radio wavelengths. + +Initial dataset available - The Rapid ASKAP Continuum Survey (RACS) + +RACS is the first large-area survey completed with ASKAP. This survey is revolutionary as the entire sky was observed in a matter of weeks, doing what previously took telescopes years to do. RACS initially covered the whole sky at 890 MHz (RACS-Low), and has since expanded to ASKAP’s other bands (1.4 and 1.7 GHz). 
RACS also covers the sky in multiple epochs, with a second epoch of RACS-Low and RACS-Mid obtained and processed. + +RACS provides astronomers with a unique opportunity to study the radio sky and radio populations, in particular supermassive blackholes (active galactic nuclei) and their role in galaxy evolution. The multi-epoch approach also allows a study of the transient sky and testing and verification of calibration methods. The large area allows for cosmological studies, such as a search for anisotropy in the galaxy population, or cosmic dipole. + +Documentation: https://www.atnf.csiro.au/facilities/askap-radio-telescope/ +Contact: atnf-datasup@csiro.au +ManagedBy: "[Australia Telescope National Facility, CSIRO](http://www.atnf.csiro.au/)" +Citation: Please see the [ATNF acknowledgement page](https://www.atnf.csiro.au/resources/publications/atnf-publication-acknowledgement-statements/) for full citation instructions. +UpdateFrequency: Roughly quarterly +Tags: + - astronomy + - archives +License: CC-BY-4.0. Attribution required for refereed scientific papers. +Resources: + - Description: The Rapid ASKAP Continuum Survey (RACS) Public Data Releases + ARN: arn:aws:s3:::askap/racs + Region: ap-southeast-2 + Type: S3 Bucket + RequesterPays: False +DataAtWork: + Tutorials: + - Title: CSIRO ASKAP Science Data Archive User Guide + URL: https://research.csiro.au/casda/casda-user-guide/ + AuthorName: CSIRO, ATNF + - Title: Rapid Askap Continuum Survey (RACS) Home Page + URL: https://research.csiro.au/racs/ + AuthorName: CSIRO, ATNF + Tools & Applications: + Publications: + - Title: ASKAP Publication List + URL: https://www.atnf.csiro.au/facilities/askap-radio-telescope/publications/ + AuthorName: various, list maintained by CSIRO, ATNF + - Title: ASKAP System Description paper + URL: https://doi.org/10.1017/pasa.2021.1 + AuthorName: Hotan, A. et al. 
\ No newline at end of file From aedab2db0e6493f307fefbd8d8e91dba586320d2 Mon Sep 17 00:00:00 2001 From: berylrab Date: Tue, 9 Sep 2025 12:27:36 -0400 Subject: [PATCH 313/751] ok:Update pasteur-logan.yaml Adding the preprint --- datasets/pasteur-logan.yaml | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/datasets/pasteur-logan.yaml b/datasets/pasteur-logan.yaml index 577bf8b97..9d738b29f 100644 --- a/datasets/pasteur-logan.yaml +++ b/datasets/pasteur-logan.yaml @@ -52,4 +52,7 @@ DataAtWork: URL: https://github.com/asl/f2sz AuthorName: Anton Korobeynikov AuthorURL: https://anton.korobeynikov.info/ - + Publications: + - Title: Logan - Planetary-Scale Genome Assembly Surveys Life’s Diversity + URL: https://www.biorxiv.org/content/10.1101/2024.07.30.605881v2.full + AuthorName: Chikhi R., Lemane T., Loll-Krippleber R., et al (2025) From a769cdba63611b838242bb28a2bc796da96f57ac Mon Sep 17 00:00:00 2001 From: blahner <40153591+blahner@users.noreply.github.com> Date: Tue, 9 Sep 2025 14:13:46 -0700 Subject: [PATCH 314/751] Update mosaic.yaml added valid tags --- datasets/mosaic.yaml | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/datasets/mosaic.yaml b/datasets/mosaic.yaml index 7e64adc4e..2fd24ffd2 100644 --- a/datasets/mosaic.yaml +++ b/datasets/mosaic.yaml @@ -6,10 +6,12 @@ ManagedBy: Massachusetts Institute of Technology, Georgia Tech UpdateFrequency: New data is uploaded as researchers preprocess their fMRI data according to MOSAIC format and submit. 
Tags: - aws-pds - - fMRI - - Vision - - Image - - Video + - brain images + - brain models + - hdf5 + - neuroimaging + - neuroscience + - machine learning License: CC BY 4.0 Citation: Resources: From aab30924bb4ed3d8d486b6af098bb9f3f8204f63 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 11 Sep 2025 14:40:59 -0400 Subject: [PATCH 315/751] Update apex.yaml --- datasets/apex.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/apex.yaml b/datasets/apex.yaml index 9ac7afeb5..bb09598ea 100644 --- a/datasets/apex.yaml +++ b/datasets/apex.yaml @@ -24,6 +24,7 @@ Tags: - brain models - analysis ready data - nifti + - aws-pds License: '[CC BY](https://creativecommons.org/licenses/by/4.0)' Citation: Resources: From 44b6c5f1c8787e51d28e5f9de8138622dbe620a2 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 11 Sep 2025 15:06:48 -0400 Subject: [PATCH 316/751] ok: Update apex.yaml Adding pds tag From d52b7b84282c063017b2ee72f896652a251dd5f3 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 11 Sep 2025 16:38:28 -0400 Subject: [PATCH 317/751] ok: Update huj-herbarium.yaml --- datasets/huj-herbarium.yaml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/datasets/huj-herbarium.yaml b/datasets/huj-herbarium.yaml index aa0a6ec3d..70e5dee50 100644 --- a/datasets/huj-herbarium.yaml +++ b/datasets/huj-herbarium.yaml @@ -3,7 +3,6 @@ Description: Our collection encompasses approximately one million vascular plant specimens from the Mediterranean and Middle East biodiversity hotspot, representing flora from Israel, Jordan, Hermon, Sinai, Egypt, the Caucasus, Arabia, North Africa, and throughout the Mediterranean basin. This scientifically significant repository includes published voucher specimens, original specimens used for "Flora Palaestina" illustrations, and critical references for the Israeli gene bank collections. 
The ongoing digitization process captures high-resolution images of each specimen while systematically incorporating label information into our computerized catalog. This virtual herbarium will democratize access to these valuable botanical resources, enabling global researchers to examine specimens in exceptional detail from anywhere in the world. Beyond preservation, this digital transformation unlocks new research possibilities through computational analysis of both visual specimen characteristics and associated metadata. The dataset will serve as a foundational resource for advancing botanical research, ecological modeling, taxonomic investigation, historical analysis, and numerous other scientific disciplines concerned with plant biodiversity in this ecologically and historically significant region. - Documentation: Contact: Eyal.Ben-Hur@mail.huji.ac.il ManagedBy: National Natural History Collections, The Hebrew University of Jerusalem @@ -18,7 +17,6 @@ Tags: - imaging - image processing - aws-pds - License: CC-BY-SA 4.0 Citation: Vascular plants - Herbarium of The National Natural History Collections was accessed on DATE from https://registry.opendata.aws/huj-herbarium. 
Resources: @@ -50,3 +48,4 @@ ADXCategories: - + From 9901ffa77365b5aa582ca9785ebc6fa73e01423e Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 11 Sep 2025 16:43:06 -0400 Subject: [PATCH 318/751] ok: Update huj-herbarium.yaml Adding ADX category --- datasets/huj-herbarium.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/huj-herbarium.yaml b/datasets/huj-herbarium.yaml index 70e5dee50..5ca590aec 100644 --- a/datasets/huj-herbarium.yaml +++ b/datasets/huj-herbarium.yaml @@ -45,7 +45,8 @@ DataAtWork: AuthorURL: DeprecatedNotice: ADXCategories: - - + - Healthcare & Life Sciences Data + From 8ef4315b01732b06231a514aba6405b478a74aee Mon Sep 17 00:00:00 2001 From: tim-essential Date: Thu, 11 Sep 2025 13:48:23 -0700 Subject: [PATCH 319/751] Add SNS Topic for Essential-Web v1.0 data notifications Added SNS Topic details for Essential-Web v1.0 notifications. --- datasets/eai-essential-web-v1.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/eai-essential-web-v1.yaml b/datasets/eai-essential-web-v1.yaml index 3eb01efc5..4ef74667e 100644 --- a/datasets/eai-essential-web-v1.yaml +++ b/datasets/eai-essential-web-v1.yaml @@ -18,6 +18,10 @@ Resources: Type: S3 Bucket Explore: - https://huggingface.co/datasets/EssentialAI/essential-web-v1.0 + - Description: Notifications for new Essential-Web v1.0 data + ARN: arn:aws:sns:us-west-2:021391128517:essential-web-v10-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Publications: - Title: 'Essential-Web v1.0: 24T tokens of organized web data' From 91afba4846481bbdbbaa01e3170c05d75507db00 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 11 Sep 2025 17:22:01 -0400 Subject: [PATCH 320/751] ok: Update eai-essential-web-v1.yaml Updating to merge From 2229d2a3968ef2188e6b1244677cceaba65480ca Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 11 Sep 2025 17:24:57 -0400 Subject: [PATCH 321/751] ok: Update eai-essential-web-v1.yaml Updating tags to reflect tags from 
https://github.com/awslabs/open-data-registry/blob/main/tags.yaml --- datasets/eai-essential-web-v1.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/eai-essential-web-v1.yaml b/datasets/eai-essential-web-v1.yaml index 4ef74667e..bdabd0358 100644 --- a/datasets/eai-essential-web-v1.yaml +++ b/datasets/eai-essential-web-v1.yaml @@ -8,8 +8,8 @@ Tags: - aws-pds - machine learning - natural language processing - - web data - - text + - web archive + - text analysis License: 'Essential-Web-v1.0 contributions are made available under the [ODC attribution license](https://opendatacommons.org/licenses/by/odc_by_1.0_public_text.txt); however, users should also abide by the [Common Crawl - Terms of Use](https://commoncrawl.org/terms-of-use). We do not alter the license of any of the underlying data.' Resources: - Description: 'Essential-Web v1.0: 24T tokens of organized web data' From 33938a0aa1c2b5ebc8518a480e95233b534c83bf Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 11 Sep 2025 17:30:10 -0400 Subject: [PATCH 322/751] ok: Update eai-essential-web-v1.yaml Removed link under Explore due to syntax error. The same link is provided under documentation. 
--- datasets/eai-essential-web-v1.yaml | 2 -- 1 file changed, 2 deletions(-) diff --git a/datasets/eai-essential-web-v1.yaml b/datasets/eai-essential-web-v1.yaml index bdabd0358..375bec992 100644 --- a/datasets/eai-essential-web-v1.yaml +++ b/datasets/eai-essential-web-v1.yaml @@ -16,8 +16,6 @@ Resources: ARN: arn:aws:s3:::essential-web-v1.0 Region: us-west-2 Type: S3 Bucket - Explore: - - https://huggingface.co/datasets/EssentialAI/essential-web-v1.0 - Description: Notifications for new Essential-Web v1.0 data ARN: arn:aws:sns:us-west-2:021391128517:essential-web-v10-object_created Region: us-west-2 From e727a101073b71851abedf48f21a6636a2edaa95 Mon Sep 17 00:00:00 2001 From: Kim Fisher Date: Thu, 11 Sep 2025 17:42:54 -0400 Subject: [PATCH 323/751] coral reef ic yaml --- ...ralreef-image-classification-training.yaml | 47 +++++++++++++++++++ tags.yaml | 1 + 2 files changed, 48 insertions(+) create mode 100644 datasets/coralreef-image-classification-training.yaml diff --git a/datasets/coralreef-image-classification-training.yaml b/datasets/coralreef-image-classification-training.yaml new file mode 100644 index 000000000..da2b01a8d --- /dev/null +++ b/datasets/coralreef-image-classification-training.yaml @@ -0,0 +1,47 @@ +Name: Community coral reef image classification training data +Description: "Community-sourced repository of coral reef image classification training data, including continually updated confirmed annotations from [MERMAID](https://datamermaid.org/)" +Documentation: https://github.com/data-mermaid/image-classification-open-data +Contact: contact@datamermaid.org +ManagedBy: "[MERMAID](https://datamermaid.org/)" +UpdateFrequency: Each partner organization updates on their own cadence. MERMAID updates once per day. 
+Tags: + - coastal + - conservation + - coral reef + - csv + - global + - machine learning + - marine + - parquet + - survey +License: "[Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/)" +Resources: + - Description: "The coral-reef-training AWS S3 bucket provides a single, open, well-structured, growing, community-sourced repository of coral reef image classification training data. Hosted at s3://coral-reef-training, this bucket supports global efforts in coral reef conservation through standardized, machine-learning-ready imagery and annotations. + +The bucket serves as the image storage backend for MERMAID’s image classification workflows and to distribute confirmed and scrubbed MERMAID coral reef image data, but it also provides a shared location where partners including CoralNet can contribute to and benefit from collective ML model development, each according to its own data structures and policies. Data in the bucket is free and open for public access; only contributing organizations have write access to their own data prefixes. + +By centralizing and standardizing coral reef image data, this initiative accelerates collaboration across scientific, conservation, and machine learning communities and facilitates the creation of a common, evolving image classification model for coral reefs worldwide." 
+ ARN: arn:aws:s3:::coral-reef-training + Region: us-east-1 + Type: S3 Bucket + Explore: + - "[Browse Bucket](https://coral-reef-training.s3.amazonaws.com/index.html)" +DataAtWork: + Tutorials: + - Title: MERMAID Image Classification Open Data Tutorial - Python version + URL: https://data-mermaid.github.io/image-classification-open-data/image-classification-open-data-tutorial_Python.html + AuthorName: Domazetoski V, Caldwell I + AuthorURL: https://github.com/ViktorDomazetoski, https://github.com/ircaldwell + - Title: MERMAID Image Classification Open Data Tutorial - R version + URL: https://data-mermaid.github.io/image-classification-open-data/image-classification-open-data-tutorial_R.html + AuthorName: Caldwell I + AuthorURL: https://github.com/ircaldwell + Tools & Applications: + - Title: MERMAID Collect + URL: https://app.datamermaid.org/ + AuthorName: MERMAID + AuthorURL: https://datamermaid.org/ + - Title: MERMAID Explore + URL: https://explore.datamermaid.org/ + AuthorName: MERMAID + AuthorURL: https://datamermaid.org/ diff --git a/tags.yaml b/tags.yaml index f32cf4b52..c7b74cdc8 100644 --- a/tags.yaml +++ b/tags.yaml @@ -90,6 +90,7 @@ - conversation data - copper - copyright monitoring +- coral reef - coronavirus - cover song identification - COVID-19 From 44b12a97901f3d7de3f66d6355169023e80dfd56 Mon Sep 17 00:00:00 2001 From: Ben Hitz Date: Thu, 11 Sep 2025 16:28:53 -0700 Subject: [PATCH 324/751] add igvf-consortium yaml --- datasets/igvf-consortium.yaml | 48 +++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) create mode 100644 datasets/igvf-consortium.yaml diff --git a/datasets/igvf-consortium.yaml b/datasets/igvf-consortium.yaml new file mode 100644 index 000000000..aa38ad8ec --- /dev/null +++ b/datasets/igvf-consortium.yaml @@ -0,0 +1,48 @@ +Name: The Impact of Variation on Function Consortium (IGVF) +Description: | + The IGVF (Impact of Genomic Variation on Function) Consortium aims to understand how genomic variation affects genome 
function, + which in turn impacts phenotype. The NHGRI is funding this collaborative program that brings together teams of investigators who + will use state-of-the-art experimental and computational approaches to model, predict, characterize and map genome function, how + genome function shapes phenotype, and how these processes are affected by genomic variation. These joint efforts will produce a + catalog of the impact of genomic variants on genome function and phenotypes. + The Data Corpus consists of single-cell Genomics experiments (both single modal, and multimodal, typically snRNA-seq and snATAC-seq), + Characterization experiments using Massively Parallel Reporter Assays (MPRAs) and CRISPR-screens along with a variety of protein mutatation + assays, and Predictive Models. + There are a huge variety of files in IGVF that are stored in the AWS OpenData Set so we recommend using the [metadata file]() or browsing the [IGVF Data Portal](https://data.igvf.org) +Contact: igvf-portal-help@lists.stanford.edu +ManagedBy: IGVF Data Administration and Coordination Center at Stanford University +Documentation: https://data.igvf.org/general-help +UpdateFrequency: Daily +Tags: + - aws-pds + - biology + - bioinformatics + - genetic + - genomic + - life sciences +License: E[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) You are free to share abnd adapt tgus data with proper attribution. 
+Resources: + - Description: Released and Archived IGVF Data Files + ARN: arn:aws:s3:::igvf-public + Region: us-west-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title:Load AnnData files from IGVF into scanpy and view the UMAPs + URL: https://github.com/IGVF-DACC/igvf-data-usage-examples/blob/master/igvf-scanpy.ipynb + AuthorName: Ben Hitz + AuthorURL: https://github.com/hitz + - Title: Ingesting IGVF Data into TileDB with S3 backend + URL: https://github.com/IGVF-DACC/igvf-data-usage-examples/blob/master/ingest_igvf_h5ad_data_to_anndata_and_tiledb.ipynb + AuthorName: Otto Jolanki + AuthorURL: https://github.com/ottojolanki + Tools & Applications: + - Title: The IGVF Catalog + URL: https://catalog.igvf.org + AuthorName: The IGVF Consortium + AuthorURL: www.igvf.org + Publications: + - Title: Deciphering the impact of genomic variation on function + URL: https://www-nature-com.stanford.idm.oclc.org/articles/s41586-024-07510-0 + AuthorName: Jesse M. Engreitz and The IGVF Consortium + AuthorURL: https://orcid.org/0000-0002-5754-1719 From e83a81549ca0b0c8d49032603398cfa45e586221 Mon Sep 17 00:00:00 2001 From: willmacs <103065262+willmacs@users.noreply.github.com> Date: Fri, 12 Sep 2025 09:52:13 -0400 Subject: [PATCH 325/751] Added Jupyter resource. 
Updating French --- datasets/rcm-ceos-ard.yaml | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/datasets/rcm-ceos-ard.yaml b/datasets/rcm-ceos-ard.yaml index 6584d6946..e269ee5e2 100644 --- a/datasets/rcm-ceos-ard.yaml +++ b/datasets/rcm-ceos-ard.yaml @@ -41,11 +41,15 @@ Resources: - '[EODMS STAC for RCM CEOS ARD](https://www.eodms-sgdot.nrcan-rncan.gc.ca/stac/collections/rcm-ard/items/)' DataAtWork: Publications: + - Title: Workflows for accessing and manipulating RCM ARD SpatioTemporal Asset Catalog (STAC) in JupyterLab Python Notebooks - Flux de travail pour accéder et manipuler le catalogue d'actifs spatio-temporels (STAC) RCM ARD dans les notebooks Python JupyterLab + URL: https://github.com/eodms-sgdot/rcm-ard-stac-examples + AuthorName: Canada Centre for Remote Sensing | Centre canadien de télédétection + AuthorURL: https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing - Title: Synthetic Aperture Radar (CEOS-ARD SAR) URL: https://ceos.org/ard/files/PFS/SAR/v1.1/CEOS-ARD_PFS_Synthetic_Aperture_Radar_v1.1.pdf - AuthorName: Committee on Earth Observation Satellites (CEOS) for developing the CEOS ARD Standards. Specific acknowledgement to François Charbonneau (NRCan) for contributions to the standard development through CEOS committee membership as well as application to Canadian RADARSAT data. - AuthorURL: - - Title: CEOS Analysis Ready Data + AuthorName: Committee on Earth Observation Satellites (CEOS) for developing the CEOS ARD Standards. Specific acknowledgement to François Charbonneau (NRCan) for contributions to the standard development through CEOS committee membership as well as application to Canadian RADARSAT data. - Comité sur les satellites d'observation de la Terre (CEOS) pour l'élaboration des normes CEOS ARD. 
Remerciements particuliers à François Charbonneau (RNCan) pour ses contributions au développement des normes par le biais de son appartenance au comité CEOS ainsi que pour l'application aux données canadiennes RADARSAT. + AuthorURL: https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing + - Title: CEOS Analysis Ready Data - Données prêtes à l'analyse du CEOS URL: https://ceos.org/ard/ AuthorName: Committee on Earth Observation Satellites (CEOS) AuthorURL: https://ceos.org/ From d6ca1478dff3fe7f7924f3791cf5796501a7eefc Mon Sep 17 00:00:00 2001 From: crichica <148996603+crichica@users.noreply.github.com> Date: Fri, 12 Sep 2025 10:09:51 -0400 Subject: [PATCH 326/751] ok: Update terrain-tiles.yaml ok: test build --- datasets/terrain-tiles.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/terrain-tiles.yaml b/datasets/terrain-tiles.yaml index 0e3a5bdf5..cc21053e5 100644 --- a/datasets/terrain-tiles.yaml +++ b/datasets/terrain-tiles.yaml @@ -99,4 +99,4 @@ DataAtWork: AuthorName: Jacob E. Hill, Kenneth F. Kellner, Bryan M. Kluever, Michael L. Avery, John S. Humphrey, Eric A. Tillman, Travis L. DeVault & Jerrold L. Belant - Title: Interactive Visualization of 3D Terrain Data Stored in the Cloud URL: https://ieeexplore.ieee.org/abstract/document/9298063 - AuthorName: Gregory Larrick, Yun Tian, Uri Rogers, Halim Acosta, and Fangyang Shen + AuthorName: Gregory Larrick, Yun Tian, Uri Rogers, Halim Acosta, and Fangyang Shen From b0e68101c16f100a5316f2428db1b04a615f3a03 Mon Sep 17 00:00:00 2001 From: Tjima <45975971+Tjima@users.noreply.github.com> Date: Fri, 12 Sep 2025 12:19:37 -0400 Subject: [PATCH 327/751] Update 1.1 noaa-nos-cora.yaml Updates to the registry page to add more information crucial to users. This change is proposed by the project team at NOAA NOS CO-OPS.
--- datasets/noaa-nos-cora.yaml | 87 ++++++++++++++++++++++++------------- 1 file changed, 57 insertions(+), 30 deletions(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index 766572d57..de7da2b12 100644 --- a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -1,32 +1,60 @@ -Name: NOAA's Coastal Ocean Reanalysis (CORA) Dataset +NOAA's Coastal Ocean Reanalysis (CORA) Dataset: 1979-2022 +Tags: Ocean, atmosphere, climate, flood risk planning, transportation, restoration, weather, waves, storm modeling, hydrodynamic modeling + Description: | - NOAA's Coastal Ocean Reanalysis (CORA) for the Gulf of Mexico and East Coast (GEC) is produced using verified hourly water levels from the Center of Operational Oceanographic Products & Services (CO-OPS), through hydrodynamic modeling from Advanced Circulation "[ADCIRC](https://adcirc.org/)" and Simulating WAves Nearshore "[SWAN](https://swanmodel.sourceforge.io/)" models. Data are assimilated, processed, corrected, and processed again before quality assurance and skill assessment with additional verified tide station-based observations. -
-
- Details for CORA Dataset -
-
- **Timeseries** - 1979 to 2022 -
- **Size** - Approx. 20.5TB -
- **Domain** - Lat 5.8 to 45.8 ; Long -98.0 to -53.8 -
- **Nodes** - 1813443 centroids, 3564104 elements -
- **Grid cells** - Currently apporximately 505 -
- **Spatial Resolution** - 500m, 1983 Contiguous USA Albers projection (EPSG:5070) -
-Documentation: https://tidesandcurrents.noaa.gov/ + NOAA's [Coastal Ocean Reanalysis (CORA)](https://tidesandcurrents.noaa.gov/cora.html) for the Gulf, East Coast/Atlantic, and Caribbean (GEC) is produced using verified hourly water levels from the National Ocean Service’s [Center of Operational Oceanographic Products & Services](https://tidesandcurrents.noaa.gov/) (CO-OPS). [ADvanced CIRCulation Model (ADCIRC)](https://www.erdc.usace.army.mil/Media/Fact-Sheets/Fact-Sheet-Article-View/Article/476698/advanced-circulation-model/) and [Simulating WAves Nearshore (SWAN)](https://www.tudelft.nl/en/ceg/about-faculty/departments/hydraulic-engineering/sections/environmental-fluid-mechanics/research/swan) models are coupled to model coastal water levels and nearshore waves. Hourly water level observations are used for data assimilation and validation to improve the accuracy of modeled water levels and wave datasets. + + Additional Details: | + Metadata associated with model domain and time span. + . Timeseries - 1979 to 2022 + . Size - Approx. 44.6 TB + . Domain - Lat 5.8 to 45.8 ; Long -98.0 to -53.8 + . Nodes - [CORA Metadata Library](https://www.fisheries.noaa.gov/inport/item/75048) + . Grid cells - [CORA Metadata Library](https://www.fisheries.noaa.gov/inport/item/75048) + . Spatial Resolution: + . Centroids: 300-400 meters + . Gridded: 500 meters + . Projection: 1983 Contiguous USA Albers projection (EPSG:5070) + . Update Frequency: Product dependent. At minimum, annually. + +Datasets: | +Water level and wave datasets resulting from the computation, assimilation, validation, and optimization reanalysis datasets. All products are available in NetCDF (.nc) format + . fort.63.nc - Water level elevation + . fort.73.nc - Atmospheric pressure at sea level + . fort.74.nc - Wind Velocity - 10 m elevation + . maxele.63.nc - Maximum water elevation + . swan_DIR.63.nc - Spectral mean wave direction + . swan_TMM10.63.nc - Spectral mean wave period + . 
swan_TPS.63.nc - Spectral peak wave period + . swan_HS.63.nc - Spectral zeroth moment wave height + . swan_HS_max.63.nc - Maximum spectral zeroth moment wave height + +Derived Products: | +Datasets resulting from the computation, modeling, or other processing using existing/collected data. All products are available in NetCDF (.nc) format + . CORA-V1.1-fort.63: Hourly water levels + . CORA-V1.1-swan_DIR.63: Hourly mean wave direction + . CORA-V1.1-swan_TPS.63: Hourly peak wave periods + . CORA-V1.1-swan_HS.63: Hourly significant wave heights + . CORA-V1.1-Grid: Hourly water levels interpolated from model nodes to uniform 500-meter resolution grid + +License: | +NOAA data disseminated through NODD are open to the public and can be used as desired. + +NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. + +Documentation: | + . [NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, Atlantic, and Caribbean](https://repository.library.noaa.gov/view/noaa/66833) + . [Assessment of water levels from 43 years of NOAA’s Coastal Ocean Reanalysis (CORA) for the Gulf of Mexico and East Coasts](https://www.frontiersin.org/journals/marine-science/articles/10.3389/fmars.2024.1381228/full?utm_source=Email_to_authors_&utm_medium=Email&utm_content=T1_11.5e1_author&utm_campaign=Email_publication&field=&journalName=Frontiers_in_Marine_Science&id=1381228) + +ManagedBy: | + . [NOAA’s National Ocean Service, The Center for Operational Oceanographic Products and Services (CO-OPS)](https://tidesandcurrents.noaa.gov/about_us.html) + . 
See all datasets managed by: [NOAA’s National Ocean Service, The Center for Operational Oceanographic Products and Services (CO-OPS)](https://registry.opendata.aws/collab/noaa/) + Contact: | For questions regarding data content or quality, email CO-OPS.UserServices@noaa.gov -
This data is made available to the public through the NOAA Open Data Dissemination (NODD) Program. For questions regarding this program, email nodd@noaa.gov. -
- We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NOAA NODD team at NODD@NOAA.GOV. -ManagedBy: "[NOAA’s National Ocean Service, The Center for Operational Oceanographic Products and Services (CO-OPS)](https://tidesandcurrents.noaa.gov/about_us.html)" -UpdateFrequency: Monthly, quarterly, and annually, depending on the dataset. + We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NOAA NODD team at nodd@noaa.gov. + Collabs: ASDI: Tags: @@ -54,8 +82,7 @@ Resources: Region: us-east-1 Type: SNS Topic DataAtWork: - Tutorials: - - Title: Notebooks for working with CORA Data - URL: https://github.com/NOAA-CO-OPS/CORA-Coastal-Ocean-ReAnalysis-CORA - AuthorName: John Ratcliff - AuthorURL: https://www.linkedin.com/in/johndratcliff/ + Usage Examples: + - Tutorials: Python-based Jupyter notebooks to access, analyze, visualize, and transform CORA datasets are available in the [CORA GitHub Repository](https://github.com/NOAA-CO-OPS/CORA-Coastal-Ocean-Reanalysis-CORA/tree/main?tab=readme-ov-file). 
+ - Maps: [CORA-GEC Maximum Water Level Elevation Nodes, CORA-GEC 500 Meter Grid](https://noaa.maps.arcgis.com/apps/mapviewer/index.html?webmap=92321613d16f400894b9b7330ae2fab4) + - [Use Cases](https://tidesandcurrents.noaa.gov/cora.html#usecase) From b99c85097cfdf5ffdd2e71eabd03a2a8b9505746 Mon Sep 17 00:00:00 2001 From: willmacs <103065262+willmacs@users.noreply.github.com> Date: Fri, 12 Sep 2025 14:29:06 -0400 Subject: [PATCH 328/751] Update rcm-ceos-ard.yaml --- datasets/rcm-ceos-ard.yaml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/datasets/rcm-ceos-ard.yaml b/datasets/rcm-ceos-ard.yaml index e269ee5e2..227a8e6ef 100644 --- a/datasets/rcm-ceos-ard.yaml +++ b/datasets/rcm-ceos-ard.yaml @@ -40,11 +40,12 @@ Resources: Explore: - '[EODMS STAC for RCM CEOS ARD](https://www.eodms-sgdot.nrcan-rncan.gc.ca/stac/collections/rcm-ard/items/)' DataAtWork: - Publications: + Tutorials: - Title: Workflows for accessing and manipulating RCM ARD SpatioTemporal Asset Catalog (STAC) in JupyterLab Python Notebooks - Flux de travail pour accéder et manipuler le catalogue d'actifs spatio-temporels (STAC) RCM ARD dans les notebooks Python JupyterLab URL: https://github.com/eodms-sgdot/rcm-ard-stac-examples AuthorName: Canada Centre for Remote Sensing | Centre canadien de télédétection - AuthorURL: https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing + AuthorURL: https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing + Publications: - Title: Synthetic Aperture Radar (CEOS-ARD SAR) URL: https://ceos.org/ard/files/PFS/SAR/v1.1/CEOS-ARD_PFS_Synthetic_Aperture_Radar_v1.1.pdf AuthorName: Committee on Earth Observation Satellites (CEOS) for developing the CEOS ARD Standards. 
Specific acknowledgement to François Charbonneau (NRCan) for contributions to the standard development through CEOS committee membership as well as application to Canadian RADARSAT data. - Comité sur les satellites d'observation de la Terre (CEOS) pour l'élaboration des normes CEOS ARD. Remerciements particuliers à François Charbonneau (RNCan) pour ses contributions au développement des normes par le biais de son appartenance au comité CEOS ainsi que pour l'application aux données canadiennes RADARSAT. From 0beba19dc0633011d94788fe893a13bc9cc90300 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Sep 2025 09:44:49 -0800 Subject: [PATCH 329/751] ok: Update rcm-ceos-ard.yaml --- datasets/rcm-ceos-ard.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/rcm-ceos-ard.yaml b/datasets/rcm-ceos-ard.yaml index 6add5a89f..3a77ae523 100644 --- a/datasets/rcm-ceos-ard.yaml +++ b/datasets/rcm-ceos-ard.yaml @@ -65,4 +65,4 @@ DataAtWork: - Title: Copernicus Global Digital Elevation Model URL: https://dataspace.copernicus.eu/explore-data/data-collections/copernicus-contributing-missions/collections-description/COP-DEM AuthorName: European Space Agency (ESA) - AuthorURL: https://www.esa.int/ \ No newline at end of file + AuthorURL: https://www.esa.int/ From 8b25a09de557b1d80f4bdac2d4ceb33e7b8c98b8 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Sep 2025 09:51:16 -0800 Subject: [PATCH 330/751] ok: Update noaa-nos-cora.yaml From 59811526bbacc506e31ebdc48a5ede86147af104 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Sep 2025 09:54:59 -0800 Subject: [PATCH 331/751] ok: Update noaa-nos-cora.yaml --- datasets/noaa-nos-cora.yaml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index de7da2b12..4c3ca60c9 100644 --- 
a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -17,8 +17,8 @@ Description: | . Projection: 1983 Contiguous USA Albers projection (EPSG:5070) . Update Frequency: Product dependent. At minimum, annually. -Datasets: | -Water level and wave datasets resulting from the computation, assimilation, validation, and optimization reanalysis datasets. All products are available in NetCDF (.nc) format + Datasets: | + Water level and wave datasets resulting from the computation, assimilation, validation, and optimization reanalysis datasets. All products are available in NetCDF (.nc) format . fort.63.nc - Water level elevation . fort.73.nc - Atmospheric pressure at sea level . fort.74.nc - Wind Velocity - 10 m elevation @@ -29,8 +29,8 @@ Water level and wave datasets resulting from the computation, assimilation, vali . swan_HS.63.nc - Spectral zeroth moment wave height . swan_HS_max.63.nc - Maximum spectral zeroth moment wave height -Derived Products: | -Datasets resulting from the computation, modeling, or other processing using existing/collected data. All products are available in NetCDF (.nc) format + Derived Products: | + Datasets resulting from the computation, modeling, or other processing using existing/collected data. All products are available in NetCDF (.nc) format . CORA-V1.1-fort.63: Hourly water levels . CORA-V1.1-swan_DIR.63: Hourly mean wave direction . CORA-V1.1-swan_TPS.63: Hourly peak wave periods From be0eca124b5384dcfe46c0d2fc79890a4a858756 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Sep 2025 09:59:13 -0800 Subject: [PATCH 332/751] ok: Update noaa-nos-cora.yaml --- datasets/noaa-nos-cora.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index 4c3ca60c9..5ea73d6fa 100644 --- a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -38,9 +38,9 @@ Description: | . 
CORA-V1.1-Grid: Hourly water levels interpolated from model nodes to uniform 500-meter resolution grid License: | -NOAA data disseminated through NODD are open to the public and can be used as desired. + NOAA data disseminated through NODD are open to the public and can be used as desired. -NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. + NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. Documentation: | . 
[NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, Atlantic, and Caribbean](https://repository.library.noaa.gov/view/noaa/66833) From c55283ed958f187d6e31c8de969289a88424e1a8 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Sep 2025 10:03:50 -0800 Subject: [PATCH 333/751] ok: Update noaa-nos-cora.yaml --- datasets/noaa-nos-cora.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index 5ea73d6fa..2dae2caa4 100644 --- a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -85,4 +85,4 @@ DataAtWork: Usage Examples: - Tutorials: Python-based Jupyter notebooks to access, analyze, visualize, and transform CORA datasets are available in the [CORA GitHub Repository](https://github.com/NOAA-CO-OPS/CORA-Coastal-Ocean-Reanalysis-CORA/tree/main?tab=readme-ov-file). - Maps: [CORA-GEC Maximum Water Level Elevation Nodes, CORA-GEC 500 Meter Grid](https://noaa.maps.arcgis.com/apps/mapviewer/index.html?webmap=92321613d16f400894b9b7330ae2fab4) - - [Use Cases](https://tidesandcurrents.noaa.gov/cora.html#usecase) + - Use cases: https://tidesandcurrents.noaa.gov/cora.html#usecase From 1cd5e8009c08fe1b38779511533a88530b6876f6 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Sep 2025 11:20:42 -0800 Subject: [PATCH 334/751] ok: Update mosaic.yaml From adfbe57a62ad98ebc555a59261c41e8e3d952dd7 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Sep 2025 14:51:53 -0800 Subject: [PATCH 335/751] ok: Update coralreef-image-classification-training.yaml --- datasets/coralreef-image-classification-training.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/coralreef-image-classification-training.yaml b/datasets/coralreef-image-classification-training.yaml index da2b01a8d..8716dd5a9 100644 --- 
a/datasets/coralreef-image-classification-training.yaml +++ b/datasets/coralreef-image-classification-training.yaml @@ -5,6 +5,7 @@ Contact: contact@datamermaid.org ManagedBy: "[MERMAID](https://datamermaid.org/)" UpdateFrequency: Each partner organization updates on their own cadence. MERMAID updates once per day. Tags: + - aws-pds - coastal - conservation - coral reef From 097c0480350c7e1ff167297d0f87241b6e102473 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Sep 2025 15:02:23 -0800 Subject: [PATCH 336/751] ok: Update coralreef-image-classification-training.yaml From 4643739d1af34e2327d008002302668360a0eb7d Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Sep 2025 15:29:12 -0800 Subject: [PATCH 337/751] ok: Update caladapt-wildfire-dataset.yaml --- datasets/caladapt-wildfire-dataset.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/caladapt-wildfire-dataset.yaml b/datasets/caladapt-wildfire-dataset.yaml index e3a835a69..fe40ddd45 100644 --- a/datasets/caladapt-wildfire-dataset.yaml +++ b/datasets/caladapt-wildfire-dataset.yaml @@ -12,6 +12,7 @@ Collabs: Tags: - climate Tags: + - aws-pds - climate - climate model - climate projections @@ -61,4 +62,4 @@ DataAtWork: AuthorName: "Cal-Adapt: Analytics Engine Team" AuthorURL: https://github.com/cal-adapt ADXCategories: - - Environmental Data \ No newline at end of file + - Environmental Data From 2c9f46c86f36462146dcc6a3cd53b011dedce17b Mon Sep 17 00:00:00 2001 From: berylrab Date: Tue, 16 Sep 2025 06:10:31 -0700 Subject: [PATCH 338/751] ok: Update ai3.yaml Added aws-pds tag --- datasets/ai3.yaml | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/datasets/ai3.yaml b/datasets/ai3.yaml index 1c5001bbb..832546387 100644 --- a/datasets/ai3.yaml +++ b/datasets/ai3.yaml @@ -1,12 +1,10 @@ Name: AI3 Description: > The rapid advancement of computing technologies, particularly
artificial intelligence (AI), has revolutionized various domains, including drug discovery. Curated datasets are crucial for developing reliable, generalizable, and accurate models for practical applications. Generating experimental data on a large scale is an expensive and arduous process. In domains such as medical diagnostics where real-life data is hard to obtain, synthetic data has been shown to be extremely valuable. We, teams from IIIT Hyderabad, Intel, AWS, and Insilico Medicine, have performed physics-based calculations (molecular dynamics simulations) on about 20,000 protein-ligand complexes. The dataset comprises molecular dynamics snapshots, binding affinities calculated using the MM-PBSA method, and individual energy components, including electrostatic and van der Waals interactions. DatasetFileFormats essentially incorporate i. 3D coordinates of the protein-ligand complexes (pdb) in tar.gz files, and ii. CSV files containing the energy data. DatasetUsages are on i. ML scoring function for predicting binding affinities of given protein-ligand complexes, ii. Classification models for predicting correct binding poses of ligands, iii. identification of cryptic binding pockets, and iv. optimization of binding features by exploiting the individual components of the energy (experimental data has only the total binding affinity). Further, the novelty of the dataset highlights the fact that existing AI/ML training datasets lack dynamic data and are inherently biased. Further, binding affinity data existing in the literature are obtained from different experimental protocols. Therefore, this dataset has been uniquely created (from the same computational protocols) followed by free energy calculations with molecular dynamics (MD) simulations. The dynamic data-enriched protein-ligand coordinates can be used to effectively train convolutional neural network-based regression models for more accurate binding affinity prediction. 
- Documentation: https://github.com/devalab/AI3 Contact: devalab@iiit.ac.in ManagedBy: International Institute of Information Technology Hyderabad UpdateFrequency: Not updated - Tags: - pharmaceutical - simulations @@ -15,15 +13,13 @@ Tags: - machine learning - protein - molecular dynamics - + - aws-pds License: https://devalab.in/AI3.html - Resources: - Description: ai3data bucket includes coordinates and the energetics of ~20,000 protein-ligand binding affinity datasets. The subfolders of ai3data bucket consist of Version 1, Version2 and Version 3. Version1 contains the total Size of 10.4 GiB (Initial structure of the protein-ligand complex and the average binding affinities along with average energy components). Version2 contains the total Size of 1.2 TiB (Five trajectories of protein-ligand complex (200 snapshots in all) and the closest two water molecules for each of the protein-ligand complex, and the time series of the binding affinities along with average energy components). Version3 contains the total Size of 10.7 TiB (Five trajectories of completely solvated protein-ligand complex (200 snapshots in all), and the time series of binding affinities along with average energy components). ARN: arn:aws:s3:::ai3data Region: us-east-1 Type: S3 Bucket - DataAtWork: Tutorials: - Title: "AI3: Protein-Ligand Binding Affinity Dataset" @@ -39,4 +35,3 @@ DataAtWork: URL: https://www.nature.com/articles/s41597-023-02872-y AuthorName: U. Deva Priyakumar AuthorURL: https://devalab.in - From 1905f4a93be8334b6f9cc0b67e47c95c287f3fa2 Mon Sep 17 00:00:00 2001 From: Tjima <45975971+Tjima@users.noreply.github.com> Date: Tue, 16 Sep 2025 09:53:15 -0400 Subject: [PATCH 339/751] Update noaa-nos-cora.yaml Thank you for looking into this. I just modified the DataAtWork section as specified.
--- datasets/noaa-nos-cora.yaml | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index 2dae2caa4..ffc1594bc 100644 --- a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -42,10 +42,6 @@ License: | NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. -Documentation: | - . [NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, Atlantic, and Caribbean](https://repository.library.noaa.gov/view/noaa/66833) - . [Assessment of water levels from 43 years of NOAA’s Coastal Ocean Reanalysis (CORA) for the Gulf of Mexico and East Coasts](https://www.frontiersin.org/journals/marine-science/articles/10.3389/fmars.2024.1381228/full?utm_source=Email_to_authors_&utm_medium=Email&utm_content=T1_11.5e1_author&utm_campaign=Email_publication&field=&journalName=Frontiers_in_Marine_Science&id=1381228) - ManagedBy: | . [NOAA’s National Ocean Service, The Center for Operational Oceanographic Products and Services (CO-OPS)](https://tidesandcurrents.noaa.gov/about_us.html) . 
See all datasets managed by: [NOAA’s National Ocean Service, The Center for Operational Oceanographic Products and Services (CO-OPS)](https://registry.opendata.aws/collab/noaa/) @@ -81,8 +77,25 @@ Resources: ARN: arn:aws:sns:us-east-1:709902155096:NewNOSCORAObject Region: us-east-1 Type: SNS Topic + DataAtWork: - Usage Examples: - - Tutorials: Python-based Jupyter notebooks to access, analyze, visualize, and transform CORA datasets are available in the [CORA GitHub Repository](https://github.com/NOAA-CO-OPS/CORA-Coastal-Ocean-Reanalysis-CORA/tree/main?tab=readme-ov-file). - - Maps: [CORA-GEC Maximum Water Level Elevation Nodes, CORA-GEC 500 Meter Grid](https://noaa.maps.arcgis.com/apps/mapviewer/index.html?webmap=92321613d16f400894b9b7330ae2fab4) - - Use cases: https://tidesandcurrents.noaa.gov/cora.html#usecase + Tutorials: + - Title: Using Python to Access Coastal Ocean Reanalysis (CORA) Data + URL: https://github.com/NOAA-CO-OPS/CORA-Coastal-Ocean-Reanalysis-CORA + AuthorName: NOAA's Center for Operational Oceanographic Products and Services + AuthorURL: https://tidesandcurrents.noaa.gov/ + + Tools & Applications: + - Title: Coastal Ocean Reanalysis Use cases + URL: https://tidesandcurrents.noaa.gov/cora.html#usecase + AuthorName: NOAA's Center for Operational Oceanographic Products and Services + AuthorURL: https://tidesandcurrents.noaa.gov/ + + Publications: + - Title: NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, Atlantic, and Caribbean (January 2025) + URL: https://doi.org/10.25923/5ypp-4e84 + AuthorName: Keeney, Analise; Dusek, Gregory; Callahan, John; Ratcliff, John; Jima, Tigist; Brooks, William; Marcy, Doug; Blanton, Brian; Tilson, Jeffrey; Asher, Taylor G.; Leuttich, Richard A.; Widlansky, Matthew J.; Rose, Linta; Morse, Cheryl; Haddad, Jana; & Waring, Blake + + - Title: Assessment of water levels from 43 years of NOAA’s Coastal Ocean Reanalysis (CORA) for the Gulf of Mexico and East Coasts + URL: 
https://doi.org/10.3389/fmars.2024.1381228 + AuthorName: Rose, Linta; Widlansky, Matthew J.; Feng, Xue; Thompson, Thompson; Asher, Taylor G.; Dusek, Gregory; Blanton, Blanton; Luettich, Richard A. Jr.; Callahan, John; Brooks, William; Keeney, Analise; Haddad, Jana; Sweet, William; Genz, Ayesha; Hovenga, Paige; Marra, John & Tilson, Jeffrey From 9a30460134a77ea7c042e9776b95e59d20a81590 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Sep 2025 06:36:15 -0800 Subject: [PATCH 340/751] ok: Update noaa-nos-cora.yaml From f057d768ef4e0d262c58c78ae0997926c3526849 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Sep 2025 06:42:47 -0800 Subject: [PATCH 341/751] ok: Update noaa-nos-cora.yaml --- datasets/noaa-nos-cora.yaml | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index ffc1594bc..c1671ddf2 100644 --- a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -67,35 +67,35 @@ Tags: - oceans License: NOAA data disseminated through NODD are open to the public and can be used as desired.

NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. Resources: - - Description: NOAA’s Coastal Ocean Reanalysis (CORA) Dataset NetCDF + - Description: "NOAA’s Coastal Ocean Reanalysis (CORA) Dataset NetCDF" ARN: arn:aws:s3:::noaa-nos-cora-pds Region: us-east-1 Type: S3 Bucket Explore: - '[Browse Bucket](https://noaa-nos-cora-pds.s3.amazonaws.com/index.html)' - - Description: NOAA’s Coastal Ocean Reanalysis (CORA) Dataset Notifications + - Description: "NOAA’s Coastal Ocean Reanalysis (CORA) Dataset Notifications" ARN: arn:aws:sns:us-east-1:709902155096:NewNOSCORAObject Region: us-east-1 Type: SNS Topic DataAtWork: Tutorials: - - Title: Using Python to Access Coastal Ocean Reanalysis (CORA) Data + - Title: "Using Python to Access Coastal Ocean Reanalysis (CORA) Data" URL: https://github.com/NOAA-CO-OPS/CORA-Coastal-Ocean-Reanalysis-CORA - AuthorName: NOAA's Center for Operational Oceanographic Products and Services + AuthorName: "NOAA's Center for Operational Oceanographic Products and Services" AuthorURL: https://tidesandcurrents.noaa.gov/ Tools & Applications: - Title: Coastal Ocean Reanalysis Use cases URL: https://tidesandcurrents.noaa.gov/cora.html#usecase - AuthorName: NOAA's Center for Operational Oceanographic Products and Services + AuthorName: "NOAA's Center for Operational Oceanographic Products and Services" AuthorURL: https://tidesandcurrents.noaa.gov/ Publications: - - Title: NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, Atlantic, and Caribbean (January 2025) + - Title: "NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, 
Atlantic, and Caribbean (January 2025)" URL: https://doi.org/10.25923/5ypp-4e84 AuthorName: Keeney, Analise; Dusek, Gregory; Callahan, John; Ratcliff, John; Jima, Tigist; Brooks, William; Marcy, Doug; Blanton, Brian; Tilson, Jeffrey; Asher, Taylor G.; Leuttich, Richard A.; Widlansky, Matthew J.; Rose, Linta; Morse, Cheryl; Haddad, Jana; & Waring, Blake - - Title: Assessment of water levels from 43 years of NOAA’s Coastal Ocean Reanalysis (CORA) for the Gulf of Mexico and East Coasts + - Title: "Assessment of water levels from 43 years of NOAA’s Coastal Ocean Reanalysis (CORA) for the Gulf of Mexico and East Coasts" URL: https://doi.org/10.3389/fmars.2024.1381228 AuthorName: Rose, Linta; Widlansky, Matthew J.; Feng, Xue; Thompson, Thompson; Asher, Taylor G.; Dusek, Gregory; Blanton, Blanton; Luettich, Richard A. Jr.; Callahan, John; Brooks, William; Keeney, Analise; Haddad, Jana; Sweet, William; Genz, Ayesha; Hovenga, Paige; Marra, John & Tilson, Jeffrey From 9d0bfdcbfa35d10712ce91d169da3405dc121628 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Sep 2025 07:46:14 -0800 Subject: [PATCH 342/751] ok: Update noaa-nos-cora.yaml --- datasets/noaa-nos-cora.yaml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index c1671ddf2..ab557deea 100644 --- a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -1,5 +1,4 @@ -NOAA's Coastal Ocean Reanalysis (CORA) Dataset: 1979-2022 -Tags: Ocean, atmosphere, climate, flood risk planning, transportation, restoration, weather, waves, storm modeling, hydrodynamic modeling +Name: "NOAA's Coastal Ocean Reanalysis (CORA) Dataset: 1979-2022" Description: | NOAA's [Coastal Ocean Reanalysis (CORA)](https://tidesandcurrents.noaa.gov/cora.html) for the Gulf, East Coast/Atlantic, and Caribbean (GEC) is produced using verified hourly water levels from the National Ocean Service’s [Center of Operational 
Oceanographic Products & Services](https://tidesandcurrents.noaa.gov/) (CO-OPS). [ADvanced CIRCulation Model (ADCIRC)](https://www.erdc.usace.army.mil/Media/Fact-Sheets/Fact-Sheet-Article-View/Article/476698/advanced-circulation-model/) and [Simulating WAves Nearshore (SWAN)](https://www.tudelft.nl/en/ceg/about-faculty/departments/hydraulic-engineering/sections/environmental-fluid-mechanics/research/swan) models are coupled to model coastal water levels and nearshore waves. Hourly water level observations are used for data assimilation and validation to improve the accuracy of modeled water levels and wave datasets. From dc0b5aee9eebe5c86432024b407c1ad0738fe22c Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Sep 2025 07:52:36 -0800 Subject: [PATCH 343/751] ok: Update noaa-nos-cora.yaml --- datasets/noaa-nos-cora.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index ab557deea..3e49ea15b 100644 --- a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -64,7 +64,7 @@ Tags: - agriculture - transportation - oceans -License: NOAA data disseminated through NODD are open to the public and can be used as desired.

NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. + Resources: - Description: "NOAA’s Coastal Ocean Reanalysis (CORA) Dataset NetCDF" ARN: arn:aws:s3:::noaa-nos-cora-pds From 1982cd4ea3ce33c0f6328a945971d3451cd56da1 Mon Sep 17 00:00:00 2001 From: Tjima <45975971+Tjima@users.noreply.github.com> Date: Tue, 16 Sep 2025 13:33:07 -0400 Subject: [PATCH 344/751] Update noaa-nos-cora.yaml Update Frequency and documentation updated. Thank you --- datasets/noaa-nos-cora.yaml | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index 3e49ea15b..bfff21d59 100644 --- a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -14,8 +14,11 @@ Description: | . Centroids: 300-400 meters . Gridded: 500 meters . Projection: 1983 Contiguous USA Albers projection (EPSG:5070) - . Update Frequency: Product dependent. At minimum, annually. - + + Documentation: + . ["NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, Atlantic, and Caribbean (January 2025)"](https://doi.org/10.25923/5ypp-4e84) + UpdateFrequency: + . Product dependent. At minimum, annually. Datasets: | Water level and wave datasets resulting from the computation, assimilation, validation, and optimization reanalysis datasets. All products are available in NetCDF (.nc) format . 
fort.63.nc - Water level elevation From e45ff8b11d516d362c15f044846735aababa6143 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Sep 2025 09:56:22 -0800 Subject: [PATCH 345/751] ok: Update noaa-nos-cora.yaml From 7366034b936c3da054d7c7663712857d8aeacd01 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Sep 2025 10:08:25 -0800 Subject: [PATCH 346/751] ok: Update noaa-nos-cora.yaml --- datasets/noaa-nos-cora.yaml | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index bfff21d59..61f8bfe86 100644 --- a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -14,11 +14,7 @@ Description: | . Centroids: 300-400 meters . Gridded: 500 meters . Projection: 1983 Contiguous USA Albers projection (EPSG:5070) - - Documentation: - . ["NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, Atlantic, and Caribbean (January 2025)"](https://doi.org/10.25923/5ypp-4e84) - UpdateFrequency: - . Product dependent. At minimum, annually. + Datasets: | Water level and wave datasets resulting from the computation, assimilation, validation, and optimization reanalysis datasets. All products are available in NetCDF (.nc) format . fort.63.nc - Water level elevation @@ -39,6 +35,12 @@ Description: | . CORA-V1.1-swan_HS.63: Hourly significant wave heights . CORA-V1.1-Grid: Hourly water levels interpolated from model nodes to uniform 500-meter resolution grid +Documentation: | + . ["NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, Atlantic, and Caribbean (January 2025)"](https://doi.org/10.25923/5ypp-4e84) + +UpdateFrequency: + . Product dependent. At minimum, annually. + License: | NOAA data disseminated through NODD are open to the public and can be used as desired. 
From d0095e37dcb32400c9f2e85736f9f995bea88012 Mon Sep 17 00:00:00 2001 From: Chris Stoner Date: Tue, 16 Sep 2025 11:37:24 -0800 Subject: [PATCH 347/751] NOAA CORA html fixes --- datasets/noaa-nos-cora.yaml | 71 +++++++++++++++++++------------------ 1 file changed, 36 insertions(+), 35 deletions(-) diff --git a/datasets/noaa-nos-cora.yaml b/datasets/noaa-nos-cora.yaml index 61f8bfe86..436e105d5 100644 --- a/datasets/noaa-nos-cora.yaml +++ b/datasets/noaa-nos-cora.yaml @@ -2,44 +2,46 @@ Name: "NOAA's Coastal Ocean Reanalysis (CORA) Dataset: 1979-2022" Description: | NOAA's [Coastal Ocean Reanalysis (CORA)](https://tidesandcurrents.noaa.gov/cora.html) for the Gulf, East Coast/Atlantic, and Caribbean (GEC) is produced using verified hourly water levels from the National Ocean Service’s [Center of Operational Oceanographic Products & Services](https://tidesandcurrents.noaa.gov/) (CO-OPS). [ADvanced CIRCulation Model (ADCIRC)](https://www.erdc.usace.army.mil/Media/Fact-Sheets/Fact-Sheet-Article-View/Article/476698/advanced-circulation-model/) and [Simulating WAves Nearshore (SWAN)](https://www.tudelft.nl/en/ceg/about-faculty/departments/hydraulic-engineering/sections/environmental-fluid-mechanics/research/swan) models are coupled to model coastal water levels and nearshore waves. Hourly water level observations are used for data assimilation and validation to improve the accuracy of modeled water levels and wave datasets. +

+ Additional Details:
+ Metadata associated with model domain and time span: + - Timeseries - 1979 to 2022 + - Size - Approx. 44.6 TB + - Domain - Lat 5.8 to 45.8 ; Long -98.0 to -53.8 + - Nodes - [CORA Metadata Library](https://www.fisheries.noaa.gov/inport/item/75048) + - Grid cells - [CORA Metadata Library](https://www.fisheries.noaa.gov/inport/item/75048) + - Spatial Resolution: + - Centroids: 300-400 meters + - Gridded: 500 meters + - Projection: 1983 Contiguous USA Albers projection (EPSG:5070) +

- Additional Details: | - Metadata associated with model domain and time span. - . Timeseries - 1979 to 2022 - . Size - Approx. 44.6 TB - . Domain - Lat 5.8 to 45.8 ; Long -98.0 to -53.8 - . Nodes - [CORA Metadata Library](https://www.fisheries.noaa.gov/inport/item/75048) - . Grid cells - [CORA Metadata Library](https://www.fisheries.noaa.gov/inport/item/75048) - . Spatial Resolution: - . Centroids: 300-400 meters - . Gridded: 500 meters - . Projection: 1983 Contiguous USA Albers projection (EPSG:5070) + Datasets:
+ Water level and wave datasets resulting from the computation, assimilation, validation, and optimization reanalysis datasets. All products are available in NetCDF (.nc) format: + - fort.63.nc - Water level elevation + - fort.73.nc - Atmospheric pressure at sea level + - fort.74.nc - Wind Velocity - 10 m elevation + - maxele.63.nc - Maximum water elevation + - swan_DIR.63.nc - Spectral mean wave direction + - swan_TMM10.63.nc - Spectral mean wave period + - swan_TPS.63.nc - Spectral peak wave period + - swan_HS.63.nc - Spectral zeroth moment wave height + - swan_HS_max.63.nc - Maximum spectral zeroth moment wave height +

- Datasets: | - Water level and wave datasets resulting from the computation, assimilation, validation, and optimization reanalysis datasets. All products are available in NetCDF (.nc) format - . fort.63.nc - Water level elevation - . fort.73.nc - Atmospheric pressure at sea level - . fort.74.nc - Wind Velocity - 10 m elevation - . maxele.63.nc - Maximum water elevation - . swan_DIR.63.nc - Spectral mean wave direction - . swan_TMM10.63.nc - Spectral mean wave period - . swan_TPS.63.nc - Spectral peak wave period - . swan_HS.63.nc - Spectral zeroth moment wave height - . swan_HS_max.63.nc - Maximum spectral zeroth moment wave height - - Derived Products: | - Datasets resulting from the computation, modeling, or other processing using existing/collected data. All products are available in NetCDF (.nc) format - . CORA-V1.1-fort.63: Hourly water levels - . CORA-V1.1-swan_DIR.63: Hourly mean wave direction - . CORA-V1.1-swan_TPS.63: Hourly peak wave periods - . CORA-V1.1-swan_HS.63: Hourly significant wave heights - . CORA-V1.1-Grid: Hourly water levels interpolated from model nodes to uniform 500-meter resolution grid + Derived Products:
+ Datasets resulting from the computation, modeling, or other processing using existing/collected data. All products are available in NetCDF (.nc) format: + - CORA-V1.1-fort.63: Hourly water levels + - CORA-V1.1-swan_DIR.63: Hourly mean wave direction + - CORA-V1.1-swan_TPS.63: Hourly peak wave periods + - CORA-V1.1-swan_HS.63: Hourly significant wave heights + - CORA-V1.1-Grid: Hourly water levels interpolated from model nodes to uniform 500-meter resolution grid +

Documentation: | - . ["NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, Atlantic, and Caribbean (January 2025)"](https://doi.org/10.25923/5ypp-4e84) + [NOAA Technical Report NOS CO-OPS 108: NOAA’s Coastal Ocean Reanalysis: Gulf of Mexico, Atlantic, and Caribbean (January 2025)](https://doi.org/10.25923/5ypp-4e84) -UpdateFrequency: - . Product dependent. At minimum, annually. +UpdateFrequency: Product dependent. At minimum, annually. License: | NOAA data disseminated through NODD are open to the public and can be used as desired. @@ -47,8 +49,7 @@ License: | NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. ManagedBy: | - . [NOAA’s National Ocean Service, The Center for Operational Oceanographic Products and Services (CO-OPS)](https://tidesandcurrents.noaa.gov/about_us.html) - . 
See all datasets managed by: [NOAA’s National Ocean Service, The Center for Operational Oceanographic Products and Services (CO-OPS)](https://registry.opendata.aws/collab/noaa/) + [NOAA’s National Ocean Service, The Center for Operational Oceanographic Products and Services (CO-OPS)](https://tidesandcurrents.noaa.gov/about_us.html) Contact: | For questions regarding data content or quality, email CO-OPS.UserServices@noaa.gov From 3859ea0112a8907392da3f7d19a20b3f12b7d0ce Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Sep 2025 11:38:24 -0800 Subject: [PATCH 348/751] ok: Update noaa-nos-cora.yaml From d5fd2e1a5b44c592fbd6c1c0447c9ba339fe27e6 Mon Sep 17 00:00:00 2001 From: blahner <40153591+blahner@users.noreply.github.com> Date: Tue, 16 Sep 2025 23:09:43 -0700 Subject: [PATCH 349/751] Update mosaic.yaml updated bucket name, browse bucket link, and region --- datasets/mosaic.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/mosaic.yaml b/datasets/mosaic.yaml index 2fd24ffd2..6bda21665 100644 --- a/datasets/mosaic.yaml +++ b/datasets/mosaic.yaml @@ -16,11 +16,11 @@ License: CC BY 4.0 Citation: Resources: - Description: HDF5 files containing preprocessed fMRI data - ARN: arn:aws:s3:::mosaic-fmri - Region: us-east-1 + ARN: arn:aws:s3:::mosaicfmri + Region: us-west-2 Type: S3 Bucket Explore: - - '[Browse Bucket](https://mosaic-fmri.s3.amazonaws.com/index.html)' + - '[Browse Bucket](https://mosaicfmri.s3.amazonaws.com/index.html)' DataAtWork: Tutorials: - Title: Load HDF5 file (Jupyter notebook) From 24cf30e1ccf39ec9715425eecaf1d34b88edc3e5 Mon Sep 17 00:00:00 2001 From: awahle Date: Wed, 17 Sep 2025 10:05:45 -0500 Subject: [PATCH 350/751] Added Clinical Ultrasound Image Repository --- datasets/clinical-ultrasound-image-data.yaml | 21 ++++++++++++++++++++ 1 file changed, 21 insertions(+) create mode 100644 datasets/clinical-ultrasound-image-data.yaml diff --git 
a/datasets/clinical-ultrasound-image-data.yaml b/datasets/clinical-ultrasound-image-data.yaml new file mode 100644 index 000000000..e251c435e --- /dev/null +++ b/datasets/clinical-ultrasound-image-data.yaml @@ -0,0 +1,21 @@ +Name: Clinical Ultrasound Image Repository +Description: Generic Clinical Ultrasound Data from Random Subjects acquired for Clinical Reasons, to be used for Developing Artificial Intelligence Applications. This dataset is complete with 2000 studies from 2000 subjects (one third each from abdominal, cardiac, and OB/GYN cases) +Documentation: https://clinical-ultrasound-image-repository.s3.amazonaws.com/index.html +Contact: shuver@nvidia.com +ManagedBy: [MONAI Development Team](https://github.com/Project-MONAI/MONAI) +UpdateFrequency: This is a static dataset; however, tutorials and resources will be updated as they are developed. +Tags: + - medicine + - ultrasound + - routine data + - clinical imaging + - diagnosis + - treatment +License: [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) +Resources: + - Description: Clinical Ultrasound Image Repository + ARN: arn:aws:s3:::clinical-ultrasound-image-repository + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://clinical-ultrasound-image-repository.s3.amazonaws.com/download.html)' From f48bb9f332811a1de489bd7fde00050eefddf8fd Mon Sep 17 00:00:00 2001 From: crichica <148996603+crichica@users.noreply.github.com> Date: Wed, 17 Sep 2025 13:10:40 -0400 Subject: [PATCH 351/751] ok: Update ai3.yaml --- datasets/ai3.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ai3.yaml b/datasets/ai3.yaml index 832546387..6d3d50d0d 100644 --- a/datasets/ai3.yaml +++ b/datasets/ai3.yaml @@ -1,4 +1,4 @@ -Name: AI3 +Name: AI3 Protein-Ligand Binding Affinity Dataset Description: > The rapid advancement of computing technologies, particularly artificial intelligence (AI), has revolutionized various domains, including drug discovery. 
Curated datasets are crucial for developing reliable, generalizable, and accurate models for practical applications. Generating experimental data on a large scale is an expensive and arduous process. In domains such as medical diagnostics where real-life data is hard to obtain, synthetic data has been shown to be extremely valuable. We, teams from IIIT Hyderabad, Intel, AWS, and Insilico Medicine, have performed physics-based calculations (molecular dynamics simulations) on about 20,000 protein-ligand complexes. The dataset comprises molecular dynamics snapshots, binding affinities calculated using the MM-PBSA method, and individual energy components, including electrostatic and van der Waals interactions. DatasetFileFormats essentially incorporate i. 3D coordinates of the protein-ligand complexes (pdb) in tar.gz files, and ii. CSV files containing the energy data. DatasetUsages are on i. ML scoring function for predicting binding affinities of given protein-ligand complexes, ii. Classification models for predicting correct binding poses of ligands, iii. identification of cryptic binding pockets, and iv. optimization of binding features by exploiting the individual components of the energy (experimental data has only the total binding affinity). Further, the novelty of the dataset highlights the fact that existing AI/ML training datasets lack dynamic data and are inherently biased. Further, binding affinity data existing in the literature are obtained from different experimental protocols. Therefore, this dataset has been uniquely created (from the same computational protocols) followed by free energy calculations with molecular dynamics (MD) simulations. The dynamic data-enriched protein-ligand coordinates can be used to effectively train convolutional neural network-based regression models for more accurate binding affinity prediction. 
Documentation: https://github.com/devalab/AI3 From 314a76eb846d99c0a11a7fe8d4b346b26d048d15 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 17 Sep 2025 10:07:38 -0800 Subject: [PATCH 352/751] Update mosaic.yaml From 18f0993b91ee3d00ded472cf12d660af1f4aa147 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 17 Sep 2025 10:07:56 -0800 Subject: [PATCH 353/751] ok: Update mosaic.yaml From fbde85699bc6af581bd43189e207e04509b759a4 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 17 Sep 2025 10:09:35 -0800 Subject: [PATCH 354/751] ok: Update mosaic.yaml From 8cd284ace9da975d07bd2ae68acdf3f0f3358cc3 Mon Sep 17 00:00:00 2001 From: berylrab Date: Wed, 17 Sep 2025 13:39:34 -0700 Subject: [PATCH 355/751] ok: Update clinical-ultrasound-image-data.yaml Added LS and pds tags --- datasets/clinical-ultrasound-image-data.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/clinical-ultrasound-image-data.yaml b/datasets/clinical-ultrasound-image-data.yaml index e251c435e..bfa147e11 100644 --- a/datasets/clinical-ultrasound-image-data.yaml +++ b/datasets/clinical-ultrasound-image-data.yaml @@ -11,6 +11,8 @@ Tags: - clinical imaging - diagnosis - treatment + - life sciences + - aws-pds License: [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) Resources: - Description: Clinical Ultrasound Image Repository From 61095546f537416f4857b7ecb81871c2e53b7999 Mon Sep 17 00:00:00 2001 From: berylrab Date: Wed, 17 Sep 2025 13:41:36 -0700 Subject: [PATCH 356/751] ok: Update clinical-ultrasound-image-data.yaml Removing tags not available on tags.yaml --- datasets/clinical-ultrasound-image-data.yaml | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/datasets/clinical-ultrasound-image-data.yaml b/datasets/clinical-ultrasound-image-data.yaml index bfa147e11..dc34d7f3a 100644 --- a/datasets/clinical-ultrasound-image-data.yaml 
+++ b/datasets/clinical-ultrasound-image-data.yaml @@ -6,11 +6,8 @@ ManagedBy: [MONAI Development Team](https://github.com/Project-MONAI/MONAI) UpdateFrequency: This is a static dataset; however, tutorials and resources will be updated as they are developed. Tags: - medicine - - ultrasound - - routine data - - clinical imaging - - diagnosis - - treatment + - medical imaging + - machine learning - life sciences - aws-pds License: [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) From 2bc4ec0928afa3b9e35118d703b5310cfe2bb4df Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 18 Sep 2025 10:42:07 -0400 Subject: [PATCH 357/751] ok: Update clinical-ultrasound-image-data.yaml kicking off build --- datasets/clinical-ultrasound-image-data.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/clinical-ultrasound-image-data.yaml b/datasets/clinical-ultrasound-image-data.yaml index dc34d7f3a..eb66ec775 100644 --- a/datasets/clinical-ultrasound-image-data.yaml +++ b/datasets/clinical-ultrasound-image-data.yaml @@ -10,6 +10,7 @@ Tags: - machine learning - life sciences - aws-pds + License: [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) Resources: - Description: Clinical Ultrasound Image Repository From eb361528b40f4be7de98c14c1178f4f5e27ee391 Mon Sep 17 00:00:00 2001 From: awahle Date: Thu, 18 Sep 2025 10:57:11 -0500 Subject: [PATCH 358/751] attempt to fix syntax errors around external links --- datasets/clinical-ultrasound-image-data.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/clinical-ultrasound-image-data.yaml b/datasets/clinical-ultrasound-image-data.yaml index eb66ec775..86fec44b0 100644 --- a/datasets/clinical-ultrasound-image-data.yaml +++ b/datasets/clinical-ultrasound-image-data.yaml @@ -2,7 +2,7 @@ Name: Clinical Ultrasound Image Repository Description: Generic Clinical Ultrasound Data from Random Subjects acquired for Clinical Reasons, to be used for Developing Artificial Intelligence 
Applications. This dataset is complete with 2000 studies from 2000 subjects (one third each from abdominal, cardiac, and OB/GYN cases) Documentation: https://clinical-ultrasound-image-repository.s3.amazonaws.com/index.html Contact: shuver@nvidia.com -ManagedBy: [MONAI Development Team](https://github.com/Project-MONAI/MONAI) +ManagedBy: "[MONAI Development Team](https://github.com/Project-MONAI/MONAI)" UpdateFrequency: This is a static dataset; however, tutorials and resources will be updated as they are developed. Tags: - medicine @@ -11,11 +11,11 @@ Tags: - life sciences - aws-pds -License: [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) +License: "[CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" Resources: - Description: Clinical Ultrasound Image Repository ARN: arn:aws:s3:::clinical-ultrasound-image-repository Region: us-west-2 Type: S3 Bucket Explore: - - '[Browse Bucket](https://clinical-ultrasound-image-repository.s3.amazonaws.com/download.html)' + - "[Browse Bucket](https://clinical-ultrasound-image-repository.s3.amazonaws.com/download.html)" From 6b6eb0c4e998237ea7ec01bd021193e7849af7c9 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 18 Sep 2025 13:24:14 -0400 Subject: [PATCH 359/751] ok: Update clinical-ultrasound-image-data.yaml removing space --- datasets/clinical-ultrasound-image-data.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/clinical-ultrasound-image-data.yaml b/datasets/clinical-ultrasound-image-data.yaml index 86fec44b0..4c92184b4 100644 --- a/datasets/clinical-ultrasound-image-data.yaml +++ b/datasets/clinical-ultrasound-image-data.yaml @@ -10,7 +10,6 @@ Tags: - machine learning - life sciences - aws-pds - License: "[CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)" Resources: - Description: Clinical Ultrasound Image Repository From 36c7191cfea096e92ca8a61eaf109f14cd7bbd96 Mon Sep 17 00:00:00 2001 From: kszura <43186787+kszura@users.noreply.github.com> Date: Thu, 18 Sep 2025 
13:45:07 -0400 Subject: [PATCH 360/751] Update noaa-ocs-hydrodata.yaml Added links for STAC catalog and STAC Browser developed by NOAA Office of Coast Survey --- datasets/noaa-ocs-hydrodata.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/noaa-ocs-hydrodata.yaml b/datasets/noaa-ocs-hydrodata.yaml index 2093bd341..a7cac737d 100644 --- a/datasets/noaa-ocs-hydrodata.yaml +++ b/datasets/noaa-ocs-hydrodata.yaml @@ -28,6 +28,8 @@ Resources: Type: S3 Bucket Explore: - '[Browse Bucket](https://noaa-ocs-hydrodata-pds.s3.amazonaws.com/index.html)' + - '[STAC Catalog](https://noaa-ocs-hydrodata-pds.s3.amazonaws.com/catalog.json)' + - '[STAC Browser](https://radiantearth.github.io/stac-browser/#/external/noaa-ocs-hydrodata-pds.s3.amazonaws.com/catalog.json?.language=en)' - Description: NOAA Office of Coast Survey Hydrographic Survey Data New Object Notification ARN: arn:aws:sns:us-east-1:709902155096:NewOCSHYDROObject Region: us-east-1 From c261fdb2e8b7e97ba1fb12d597877f6b63cde2c9 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 18 Sep 2025 09:58:00 -0800 Subject: [PATCH 361/751] ok: Update noaa-ocs-hydrodata.yaml From 78de456a7cd7f89a5eae18271ec34cb2a477baee Mon Sep 17 00:00:00 2001 From: Mansour A <44963644+mansour2002@users.noreply.github.com> Date: Fri, 19 Sep 2025 16:02:04 -0700 Subject: [PATCH 362/751] Added SNS topic ARN to ucsf-rmac.yaml --- datasets/ucsf-rmac.yaml | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/datasets/ucsf-rmac.yaml b/datasets/ucsf-rmac.yaml index 0ad543427..da68ccb87 100644 --- a/datasets/ucsf-rmac.yaml +++ b/datasets/ucsf-rmac.yaml @@ -20,6 +20,10 @@ Resources: Region: us-west-2 Type: S3 Bucket Explore: https://s3.console.aws.amazon.com/s3/buckets/ucsf-rmac-dataset + - Description: Notifications for new Renal Mass CT data + ARN: arn:aws:sns:us-west-1:905542596225:ucsf-dmi-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Tutorials: - Title: Label 
Exploration Tutorial @@ -46,4 +50,4 @@ DataAtWork: AuthorURL: DeprecatedNotice: ADXCategories: Healthcare & Life Sciences Data - - \ No newline at end of file + - From 047dd49f5368fead8849530716e28f666027c218 Mon Sep 17 00:00:00 2001 From: lizadams Date: Mon, 22 Sep 2025 15:29:56 -0400 Subject: [PATCH 363/751] add dataset --- datasets/cmas-data-warehouse.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/cmas-data-warehouse.yaml b/datasets/cmas-data-warehouse.yaml index 1f6dfa66d..fe3372d6a 100644 --- a/datasets/cmas-data-warehouse.yaml +++ b/datasets/cmas-data-warehouse.yaml @@ -73,6 +73,12 @@ Resources: Type: S3 Bucket Explore: - '[Browse Bucket](https://cmaq-12us4-cracmm3-modeling-platform-2023.s3.amazonaws.com/index.html)' + - Description: CMAQ Model Versions 5.5 CRACMM2 Input Data (2022r1) -- 12/22/2021 - 12/31/2022 12km CONUS + ARN: arn:aws:s3:::cmaq-12us1-cracmm2-modeling-platform-2022 + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://cmaq-12us1-cracmm2-modeling-platform-2022.s3.amazonaws.com/index.html)' - Description: EPA 2022 Modeling Platform ARN: arn:aws:s3:::epa-2022-modeling-platform Region: us-east-1 From 249a5025936758a54706c4d9a17b9fc69ac0029c Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 22 Sep 2025 12:57:16 -0800 Subject: [PATCH 364/751] ok: Update cmas-data-warehouse.yaml From 328aaf3e06ebc58cad94c1a7260fdc30110a6354 Mon Sep 17 00:00:00 2001 From: "Y.R.
Moon" <96802030+yr-moon@users.noreply.github.com> Date: Tue, 23 Sep 2025 17:41:43 +0900 Subject: [PATCH 365/751] add SpaceEye-T yaml --- datasets/st-open-data.yaml | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) create mode 100644 datasets/st-open-data.yaml diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml new file mode 100644 index 000000000..3cdf174b8 --- /dev/null +++ b/datasets/st-open-data.yaml @@ -0,0 +1,30 @@ +Name: SpaceEye-T VVHR EO Open Data +Description: | + SpaceEye-T satellite collects the highest resolution optical imagery among the commercial satellites, 25 cm resolution. The Open Data features various satellite images around the world for end users to experience the power of VVHR optical data. +Documentation: https://www.si-imaging.com/page/72?sca=SpaceEye-T +Contact: https://https//www.si-imaging.com +UpdateFrequency: The dataset is frequently updated. The frequent updates include the time-series data for regular monitoring, and the data for disaster management. SI Imaging wants to provide the user experience on what is possible with VVHR optical satellite data. If you have a suggestion for a new location, feedback on the dataset, or any questions, contact us.
+Tags: + - aws-pds + - satellite imagery + - earth observation + - vvhr + - disaster response + - agriculture monitoring + - geospatial + - image processing +License: Creative Commons Attribution-NonCommercial 4.0 International +ManagedBy: "[SI Imaging Services](https://https//www.si-imaging.com/)" +Resources: + - Description: SpaceEye-T Imagery Collection + ARN: arn:aws:s3:::st-vvhr-opendata + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Bucket](http://st-vvhr-opendata.s3-website.us-west-2.amazonaws.com/)' +DataAtWork: + Publications: + - Title: "SpaceEye-T Data Manual" + URL: https://www.si-imaging.com/page/72?sca=SpaceEye-T + AuthorName: SI-Imaging Services +ADXCategoriesAdxCategories: Resources Data \ No newline at end of file From 309dbf15385528b2f46863915e3ec063d2900a63 Mon Sep 17 00:00:00 2001 From: berylrab Date: Wed, 24 Sep 2025 11:28:55 -0400 Subject: [PATCH 366/751] ok: Update ucsf-rmac.yaml format for ADX --- datasets/ucsf-rmac.yaml | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/datasets/ucsf-rmac.yaml b/datasets/ucsf-rmac.yaml index da68ccb87..3c5370032 100644 --- a/datasets/ucsf-rmac.yaml +++ b/datasets/ucsf-rmac.yaml @@ -43,11 +43,6 @@ DataAtWork: URL: https://github.com/LarsonLab/UCSF-RMaC AuthorName: Peder Larson AuthorURL: https://scholar.google.com/citations?user=LrQ7YekAAAAJ&hl=en - Publications: - - Title: - URL: - AuthorName: - AuthorURL: DeprecatedNotice: -ADXCategories: Healthcare & Life Sciences Data - - +ADXCategories: + - Healthcare & Life Sciences Data From fb60209e95f1de933383bb9bcadd41b276a93352 Mon Sep 17 00:00:00 2001 From: berylrab Date: Wed, 24 Sep 2025 16:39:22 -0400 Subject: [PATCH 367/751] ok: Update ucsf-rmac.yaml formatting changes --- datasets/ucsf-rmac.yaml | 3 --- 1 file changed, 3 deletions(-) diff --git a/datasets/ucsf-rmac.yaml b/datasets/ucsf-rmac.yaml index 3c5370032..e8b59c393 100644 --- a/datasets/ucsf-rmac.yaml +++ b/datasets/ucsf-rmac.yaml @@ -19,7 +19,6 @@ Resources: ARN: 
arn:aws:s3:::ucsf-rmac-dataset Region: us-west-2 Type: S3 Bucket - Explore: https://s3.console.aws.amazon.com/s3/buckets/ucsf-rmac-dataset - Description: Notifications for new Renal Mass CT data ARN: arn:aws:sns:us-west-1:905542596225:ucsf-dmi-object_created Region: us-west-2 @@ -31,13 +30,11 @@ DataAtWork: NotebookURL: AuthorName: Sule Sahin AuthorURL: https://github.com/sule-sahin - Services: S3 - Title: Mask Overlays URL: https://github.com/LarsonLab/UCSF-RMaC/blob/main/tutorials/maskoverlays.ipynb NotebookURL: AuthorName: Sule Sahin AuthorURL: https://github.com/sule-sahin - Services: S3 Tools & Applications: - Title: UCSF Renal Mass CT Dataset URL: https://github.com/LarsonLab/UCSF-RMaC From c09db741aaced5a6fab5602b5c0bfe5ddd978ca1 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 24 Sep 2025 16:40:48 -0800 Subject: [PATCH 368/751] ok: Update st-open-data.yaml --- datasets/st-open-data.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index 3cdf174b8..16f88d16a 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -27,4 +27,4 @@ DataAtWork: - Title: "SpaceEye-T Data Manual" URL: https://www.si-imaging.com/page/72?sca=SpaceEye-T AuthorName: SI-Imaging Services -ADXCategoriesAdxCategories: Resources Data \ No newline at end of file +ADXCategoriesAdxCategories: Resources Data From 962dcda2cdc38114f55762c3fae189cc29d06eb7 Mon Sep 17 00:00:00 2001 From: bessx <1159066+bessx@users.noreply.github.com> Date: Wed, 24 Sep 2025 22:51:23 -0700 Subject: [PATCH 369/751] Add DeepDrug AI's DPEB dataset --- datasets/deepdrug-dpeb.yml | 43 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100644 datasets/deepdrug-dpeb.yml diff --git a/datasets/deepdrug-dpeb.yml b/datasets/deepdrug-dpeb.yml new file mode 100644 index 000000000..5711449cd --- /dev/null +++ b/datasets/deepdrug-dpeb.yml @@ -0,0 +1,43 @@ +Name: 
DeepDrug Protein Embeddings Bank (DPEB) +Description: DPEB is a multimodal database of human protein embeddings integrating four biologically complementary representations—AlphaFold2, BioEmbeddings, ESM-2, and ProtVec—designed for enhanced protein-protein interaction prediction and functional classification. +Documentation: https://github.com/deepdrugai/DPEB +Contact: https://github.com/deepdrugai/DPEB/issues +ManagedBy: "Louisiana State University" +UpdateFrequency: Initial release; maintained for at least 2 years with updates planned based on new embedding models and protein coverage. +Tags: + - bioinformatics + - protein + - structural biology + - machine learning + - life sciences +License: MIT +Citation: "Sajol MSI et al. DeepDrug Protein Embeddings Bank (DPEB) was accessed on [DATE] at https://registry.opendata.aws/dpeb" +Resources: + - Description: Multimodal human protein embeddings (AlphaFold2, BioEmbeddings, ESM-2, ProtVec) with JSONL-formatted metadata containing FASTA, UniProt IDs, and embeddings. + ARN: arn:aws:s3:::deepdrug-dpeb-human-protein-embeddings + Region: us-cst-1 + Type: S3 Bucket + Explore: + - "https://github.com/deepdrugai/DPEB" +DataAtWork: + Tutorials: + - Title: Aggregating and Clustering AlphaFold2 Embeddings from DPEB + URL: https://github.com/deepdrugai/DPEB/tree/main + NotebookURL: https://github.com/deepdrugai/DPEB/tree/main/tutorial + AuthorName: Md. 
Saiful Islam Sajol + AuthorURL: https://github.com/deepdrugai + Services: EC2 + + Tools & Applications: + - Title: DPEB Explorer Tool + URL: https://github.com/deepdrugai/DPEB + AuthorName: DeepDrug Lab + AuthorURL: https://github.com/deepdrugai + Publications: + - Title: A Multimodal Human Protein Embeddings Database: DeepDrug Protein Embeddings Bank (DPEB) + URL: https://doi.org/10.XXXX/nar.dpeb2025 + AuthorName: Sajol MSI, Rajasekaran M, Bess A, Alvin C, Mukhopadhyay S + AuthorURL: https://github.com/deepdrugai/DPEB +ADXCategories: + - Life Sciences + - Artificial Intelligence From 2933db99944ee30aced08977e9cea4624a51bfd9 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 25 Sep 2025 08:27:39 -0400 Subject: [PATCH 370/751] ok: Update deepdrug-dpeb.yml tags + ADX categories --- datasets/deepdrug-dpeb.yml | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/datasets/deepdrug-dpeb.yml b/datasets/deepdrug-dpeb.yml index 5711449cd..dd5c83aa2 100644 --- a/datasets/deepdrug-dpeb.yml +++ b/datasets/deepdrug-dpeb.yml @@ -10,6 +10,7 @@ Tags: - structural biology - machine learning - life sciences + - aws-pds License: MIT Citation: "Sajol MSI et al. DeepDrug Protein Embeddings Bank (DPEB) was accessed on [DATE] at https://registry.opendata.aws/dpeb" Resources: @@ -27,7 +28,6 @@ DataAtWork: AuthorName: Md. 
Saiful Islam Sajol AuthorURL: https://github.com/deepdrugai Services: EC2 - Tools & Applications: - Title: DPEB Explorer Tool URL: https://github.com/deepdrugai/DPEB @@ -39,5 +39,4 @@ DataAtWork: AuthorName: Sajol MSI, Rajasekaran M, Bess A, Alvin C, Mukhopadhyay S AuthorURL: https://github.com/deepdrugai/DPEB ADXCategories: - - Life Sciences - - Artificial Intelligence + - Healthcare & Life Sciences Data From 91f1387598898bc6a8907a6ccd2ef8b06aa6c708 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 25 Sep 2025 08:36:44 -0400 Subject: [PATCH 371/751] ok: Rename deepdrug-dpeb.yml to deepdrug-dpeb.yaml typo for yaml --- datasets/{deepdrug-dpeb.yml => deepdrug-dpeb.yaml} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename datasets/{deepdrug-dpeb.yml => deepdrug-dpeb.yaml} (100%) diff --git a/datasets/deepdrug-dpeb.yml b/datasets/deepdrug-dpeb.yaml similarity index 100% rename from datasets/deepdrug-dpeb.yml rename to datasets/deepdrug-dpeb.yaml From 7cd0e8cc2e497470128b2fa6d35e8a4cf20f079a Mon Sep 17 00:00:00 2001 From: Devin McCabe Date: Thu, 25 Sep 2025 09:53:42 -0400 Subject: [PATCH 372/751] add depmap-omics-ccle yaml --- datasets/depmap-omics-ccle.yaml | 64 +++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) create mode 100644 datasets/depmap-omics-ccle.yaml diff --git a/datasets/depmap-omics-ccle.yaml b/datasets/depmap-omics-ccle.yaml new file mode 100644 index 000000000..2c9efb544 --- /dev/null +++ b/datasets/depmap-omics-ccle.yaml @@ -0,0 +1,64 @@ +Name: The Cancer Dependency Map (DepMap) Cancer Cell Line Encyclopedia (CCLE) Dataset +Description: This dataset consists of whole genome sequencing (WGS), whole exome sequencing (WES), and RNA sequencing files generated from ~1000 cancer cell lines described in Ghandi et al., 2019. 
+Documentation: https://github.com/broadinstitute/depmap-omics-ccle +Contact: https://forum.depmap.org +ManagedBy: "[Cancer Data Science](https://cancerdatascience.org/), [Broad Institute](https://www.broadinstitute.org/)" +UpdateFrequency: occasionally (as additional sequencings are generated for publicly-releasible CCLE models) +Tags: + - aws-pds + - bam + - biology + - bioinformatics + - cancer + - genetic + - genomic + - Homo sapiens + - life sciences + - short read sequencing + - transcriptomics + - whole exome sequencing + - whole genome sequencing +License: https://grants.nih.gov/policy-and-compliance/policy-topics/sharing-policies/accessing-data/using-genomic-data +Citation: Ghandi, Huang, Jané-Valbuena et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019). https://doi.org/10.1038/s41586-019-1186-3 +Resources: + - Description: CRAM/BAM files (and their corresponding CRAI/BAI indexes) for RNA, WES, and WGS samples released by The Cancer Dependency Map (DepMap) as part of the Cancer Cell Line Encyclopedia (CCLE) project + ARN: arn:aws:s3:::depmap-omics-ccle + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: DepMap Omics CCLE data on the AWS Open Data Registry + URL: https://github.com/broadinstitute/depmap-omics-ccle + Tools & Applications: + - Title: The Cancer Dependency Map (DepMap) + URL: https://depmap.org + - Title: Cancer Cell Line Encyclopedia (CCLE) + URL: https://sites.broadinstitute.org/ccle + Publications: + - Title: Next-generation characterization of the Cancer Cell Line Encyclopedia + URL: https://www.nature.com/articles/s41586-019-1186-3 + AuthorName: Ghandi, Huang, Jané-Valbuena et al. + - Title: The present and future of the Cancer Dependency Map + URL: https://www.nature.com/articles/s41568-024-00763-x + AuthorName: Arafeh, Shibue, Dempster et al. 
+ AuthorURL: https://depmap.org + - Title: Partial gene suppression improves identification of cancer vulnerabilities when CRISPR-Cas9 knockout is pan-lethal + URL: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-023-03020-w + AuthorName: Krill-Burger, Dempster, Borah et al. + - Title: Genetic dependencies associated with transcription factor activities in human cancer cell lines + URL: https://www.sciencedirect.com/science/article/pii/S2211124724005035 + AuthorName: Thatikonda, Supper, Wachter et al. + - Title: Bridging the gap between cancer cell line models and tumours using gene expression data + URL: https://www.nature.com/articles/s41416-021-01359-0 + AuthorName: Noorbakhsh, Vazquez & McFarland + - Title: Integrated cross-study datasets of genetic dependencies in cancer + URL: https://www.nature.com/articles/s41467-021-21898-7 + AuthorName: Pacini, Dempster, Boyle et al. + - Title: Machine learning multi-omics analysis reveals cancer driver dysregulation in pan-cancer cell lines compared to primary tumors + URL: https://www.nature.com/articles/s42003-022-04075-4 + AuthorName: Sanders, Chandra, Zebarjadi et al. + - Title: "The Network Zoo: a multilingual package for the inference and analysis of gene regulatory networks" + URL: https://link.springer.com/article/10.1186/s13059-023-02877-1 + AuthorName: Ben Guebila, Wang, Lopes-Ramos et al. 
+ADXCategories: + - Healthcare & Life Sciences Data From b1ba0f2702a47f47aad5d7bb809fa054f9910ea5 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 25 Sep 2025 10:39:50 -0400 Subject: [PATCH 373/751] ok: Update deepdrug-dpeb.yaml syntax error --- datasets/deepdrug-dpeb.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/deepdrug-dpeb.yaml b/datasets/deepdrug-dpeb.yaml index dd5c83aa2..5d851c384 100644 --- a/datasets/deepdrug-dpeb.yaml +++ b/datasets/deepdrug-dpeb.yaml @@ -34,7 +34,7 @@ DataAtWork: AuthorName: DeepDrug Lab AuthorURL: https://github.com/deepdrugai Publications: - - Title: A Multimodal Human Protein Embeddings Database: DeepDrug Protein Embeddings Bank (DPEB) + - Title: A Multimodal Human Protein Embeddings Database - DeepDrug Protein Embeddings Bank (DPEB) URL: https://doi.org/10.XXXX/nar.dpeb2025 AuthorName: Sajol MSI, Rajasekaran M, Bess A, Alvin C, Mukhopadhyay S AuthorURL: https://github.com/deepdrugai/DPEB From d2fad0038e96de65189d82f4268c8381dd46dfb9 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 25 Sep 2025 10:47:39 -0400 Subject: [PATCH 374/751] ok: Update deepdrug-dpeb.yaml removing explore and editing typo on region --- datasets/deepdrug-dpeb.yaml | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/datasets/deepdrug-dpeb.yaml b/datasets/deepdrug-dpeb.yaml index 5d851c384..39138a50d 100644 --- a/datasets/deepdrug-dpeb.yaml +++ b/datasets/deepdrug-dpeb.yaml @@ -16,10 +16,8 @@ Citation: "Sajol MSI et al. DeepDrug Protein Embeddings Bank (DPEB) was accessed Resources: - Description: Multimodal human protein embeddings (AlphaFold2, BioEmbeddings, ESM-2, ProtVec) with JSONL-formatted metadata containing FASTA, UniProt IDs, and embeddings. 
ARN: arn:aws:s3:::deepdrug-dpeb-human-protein-embeddings - Region: us-cst-1 + Region: us-east-1 Type: S3 Bucket - Explore: - - "https://github.com/deepdrugai/DPEB" DataAtWork: Tutorials: - Title: Aggregating and Clustering AlphaFold2 Embeddings from DPEB From 55e1bdf89e8d06484ef4d950374bd378fe718831 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 25 Sep 2025 10:52:18 -0400 Subject: [PATCH 375/751] ok: Update deepdrug-dpeb.yaml tutorial format --- datasets/deepdrug-dpeb.yaml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/datasets/deepdrug-dpeb.yaml b/datasets/deepdrug-dpeb.yaml index 39138a50d..5cf583e4e 100644 --- a/datasets/deepdrug-dpeb.yaml +++ b/datasets/deepdrug-dpeb.yaml @@ -22,10 +22,11 @@ DataAtWork: Tutorials: - Title: Aggregating and Clustering AlphaFold2 Embeddings from DPEB URL: https://github.com/deepdrugai/DPEB/tree/main - NotebookURL: https://github.com/deepdrugai/DPEB/tree/main/tutorial + NotebookURL: https://github.com/deepdrugai/DPEB/blob/main/tutorial/tutorial_clustering.py AuthorName: Md. Saiful Islam Sajol AuthorURL: https://github.com/deepdrugai - Services: EC2 + Services: + - EC2 Tools & Applications: - Title: DPEB Explorer Tool URL: https://github.com/deepdrugai/DPEB From 8bd4e43b47d8031792b58124877a7b5190c463a0 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 25 Sep 2025 10:57:39 -0400 Subject: [PATCH 376/751] ok: Update deepdrug-dpeb.yaml Removing notebook due to format reqmts --- datasets/deepdrug-dpeb.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/deepdrug-dpeb.yaml b/datasets/deepdrug-dpeb.yaml index 5cf583e4e..983cd863f 100644 --- a/datasets/deepdrug-dpeb.yaml +++ b/datasets/deepdrug-dpeb.yaml @@ -22,7 +22,6 @@ DataAtWork: Tutorials: - Title: Aggregating and Clustering AlphaFold2 Embeddings from DPEB URL: https://github.com/deepdrugai/DPEB/tree/main - NotebookURL: https://github.com/deepdrugai/DPEB/blob/main/tutorial/tutorial_clustering.py AuthorName: Md. 
Saiful Islam Sajol AuthorURL: https://github.com/deepdrugai Services: From 1b200364765e22abeecac58a0555b6d58634cc98 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 25 Sep 2025 11:03:06 -0400 Subject: [PATCH 377/751] ok: Update deepdrug-dpeb.yaml removing ec2 --- datasets/deepdrug-dpeb.yaml | 2 -- 1 file changed, 2 deletions(-) diff --git a/datasets/deepdrug-dpeb.yaml b/datasets/deepdrug-dpeb.yaml index 983cd863f..99b2da9f3 100644 --- a/datasets/deepdrug-dpeb.yaml +++ b/datasets/deepdrug-dpeb.yaml @@ -24,8 +24,6 @@ DataAtWork: URL: https://github.com/deepdrugai/DPEB/tree/main AuthorName: Md. Saiful Islam Sajol AuthorURL: https://github.com/deepdrugai - Services: - - EC2 Tools & Applications: - Title: DPEB Explorer Tool URL: https://github.com/deepdrugai/DPEB From a787460825a46686ae99a829f07c7b29ab039be6 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 25 Sep 2025 07:03:25 -0800 Subject: [PATCH 378/751] ok: Update askap.yaml --- datasets/askap.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/askap.yaml b/datasets/askap.yaml index 4764ebbf9..2bc14d659 100644 --- a/datasets/askap.yaml +++ b/datasets/askap.yaml @@ -17,6 +17,7 @@ ManagedBy: "[Australia Telescope National Facility, CSIRO](http://www.atnf.csiro Citation: Please see the [ATNF acknowledgement page](https://www.atnf.csiro.au/resources/publications/atnf-publication-acknowledgement-statements/) for full citation instructions. UpdateFrequency: Roughly quarterly Tags: + - aws-pds - astronomy - archives License: CC-BY-4.0. Attribution required for refereed scientific papers. @@ -41,4 +42,4 @@ DataAtWork: AuthorName: various, list maintained by CSIRO, ATNF - Title: ASKAP System Description paper URL: https://doi.org/10.1017/pasa.2021.1 - AuthorName: Hotan, A. et al. \ No newline at end of file + AuthorName: Hotan, A. et al. 
From 2c561294258cb2e580ce9f8d53a2c33fbce702bd Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 25 Sep 2025 07:09:12 -0800 Subject: [PATCH 379/751] ok: Update askap.yaml --- datasets/askap.yaml | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/datasets/askap.yaml b/datasets/askap.yaml index 2bc14d659..c49384de4 100644 --- a/datasets/askap.yaml +++ b/datasets/askap.yaml @@ -1,15 +1,15 @@ Name: ASKAP Radio Telescope Description: | -ASKAP is the CSIRO’s newest radio telescope. It is situated at the Inyarrimanha Ilgari Bundara, the CSIRO Murchison Radio-astronomy Observatory on Wajarri Yamaji Country in the Murchison region of Western Australia, about 800 km north of Perth. + ASKAP is the CSIRO’s newest radio telescope. It is situated at the Inyarrimanha Ilgari Bundara, the CSIRO Murchison Radio-astronomy Observatory on Wajarri Yamaji Country in the Murchison region of Western Australia, about 800 km north of Perth. -ASKAP consists of 36 12m dishes, spread-out as far as 6km apart. It uses a new technology called Phased Array Feeds (PAFs), which allows it to see more of the sky at once. This novel technology allows ASKAP to achieve extremely high survey speed, making it one of the best instruments in the world for mapping the sky at radio wavelengths. + ASKAP consists of 36 12m dishes, spread-out as far as 6km apart. It uses a new technology called Phased Array Feeds (PAFs), which allows it to see more of the sky at once. This novel technology allows ASKAP to achieve extremely high survey speed, making it one of the best instruments in the world for mapping the sky at radio wavelengths. -Initial dataset available - The Rapid ASKAP Continuum Survey (RACS) + Initial dataset available - The Rapid ASKAP Continuum Survey (RACS) -RACS is the first large-area survey completed with ASKAP. 
This survey is revolutionary as the entire sky was observed in a matter of weeks, doing what previously took telescopes years to do. RACS initially covered the whole sky at 890 MHz (RACS-Low), and has since expanded to ASKAP’s other bands (1.4 and 1.7 GHz). RACS also covers the sky in multiple epochs, with a second epoch of RACS-Low and RACS-Mid obtained and processed. + RACS is the first large-area survey completed with ASKAP. This survey is revolutionary as the entire sky was observed in a matter of weeks, doing what previously took telescopes years to do. RACS initially covered the whole sky at 890 MHz (RACS-Low), and has since expanded to ASKAP’s other bands (1.4 and 1.7 GHz). RACS also covers the sky in multiple epochs, with a second epoch of RACS-Low and RACS-Mid obtained and processed. -RACS provides astronomers with a unique opportunity to study the radio sky and radio populations, in particular supermassive blackholes (active galactic nuclei) and their role in galaxy evolution. The multi-epoch approach also allows a study of the transient sky and testing and verification of calibration methods. The large area allows for cosmological studies, such as a search for anisotropy in the galaxy population, or cosmic dipole. + RACS provides astronomers with a unique opportunity to study the radio sky and radio populations, in particular supermassive blackholes (active galactic nuclei) and their role in galaxy evolution. The multi-epoch approach also allows a study of the transient sky and testing and verification of calibration methods. The large area allows for cosmological studies, such as a search for anisotropy in the galaxy population, or cosmic dipole. Documentation: https://www.atnf.csiro.au/facilities/askap-radio-telescope/ Contact: atnf-datasup@csiro.au From 64b015834024ba93f82dbad760a53513def6ed1a Mon Sep 17 00:00:00 2001 From: "Y.R. 
Moon" <96802030+yr-moon@users.noreply.github.com> Date: Fri, 26 Sep 2025 02:17:11 +0900 Subject: [PATCH 380/751] Update tags in st-open-data.yaml Removed 'vvhr' and 'agriculture monitoring' tags from the dataset. --- datasets/st-open-data.yaml | 2 -- 1 file changed, 2 deletions(-) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index 16f88d16a..0779a4297 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -8,9 +8,7 @@ Tags: - aws-pds - satellite imagery - earth observation - - vvhr - disaster response - - agriculture monitoring - geospatial - image processing License: Creative Commons Attribution-NonCommercial 4.0 International From b6989339a1540f189b908014302e1e3b2a1851b8 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 25 Sep 2025 09:28:20 -0800 Subject: [PATCH 381/751] ok: Update st-open-data.yaml From 1368a7e855e60238774f566ec179494a10321057 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 25 Sep 2025 09:28:47 -0800 Subject: [PATCH 382/751] ok: Update st-open-data.yaml From 58c780f39712b73d6cf732d02e138968a30fbac4 Mon Sep 17 00:00:00 2001 From: Johan Winnubst Date: Thu, 25 Sep 2025 10:41:59 -0700 Subject: [PATCH 383/751] added e11bio-prism datasset --- datasets/e11bio-prism.yaml | 61 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) create mode 100644 datasets/e11bio-prism.yaml diff --git a/datasets/e11bio-prism.yaml b/datasets/e11bio-prism.yaml new file mode 100644 index 000000000..66a30ad09 --- /dev/null +++ b/datasets/e11bio-prism.yaml @@ -0,0 +1,61 @@ +Name: E11bio PRISM +Description: | + This dataset was generated using E11.bio's PRISM technology (Protein Reconstruction and Identification through Multiplexing), + a platform that combines viral barcoding, expansion microscopy, and iterative immunolabeling for large-scale neuronal reconstruction. 
+ + Neurons in the mouse hippocampal CA3 were transduced with a library of adeno-associated viruses (AAVs) + encoding diverse “protein bits”—small epitope tags that act as combinatorial barcodes. + Tissue was then processed with an expansion microscopy protocol, physically enlarging the sample ~5× + to achieve an effective voxel size of ~35 × 35 × 80 nm. + Across multiple cycles of staining, imaging, and antibody stripping, the same expanded tissue was repeatedly labeled, + enabling iterative immunostaining for dozens of molecular targets. + + The dataset includes: + 1) Light microscopy data of multiplexed brain tissue + 2) Segmentations of cell morphology and protein expression in the tissue + 3) Files for faster visualization of the data (e.g. precomputed format) + 4) Additional supporting files (e.g. model predictions, manual annotations etc.) +Documentation: https://github.com/e11bio/e11-open-data +Contact: hello@e11.bio +ManagedBy: "[E11.bio](https://e11.bio)" +UpdateFrequency: As required +Tags: + - bioinformatics + - biology + - brain images + - cell imaging + - computer vision + - fluorescence imaging + - high-throughput imaging + - image processing + - imaging + - ion channels + - life sciences + - machine learning + - microscopy + - morphological reconstructions + - Mus musculus + - neurobiology + - neuroimaging + - neuroscience + - protein + - segmentation + - zarr +License: https://e11.bio/terms-of-use +Resources: + - Description: Data files in a public bucket + ARN: arn:aws:s3:::e11bio-prism + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: E11.Bio PRISM OpenData + URL: https://github.com/e11bio/e11-open-data + NotebookURL: + AuthorName: Arlo Sheridan & Johan Winnubst + AuthorURL: https://e11.bio/team + Tools & Applications: + - Title: Volara + URL: https://github.com/e11bio/volara + AuthorName: Arlo Sheridan & Will Patton + AuthorURL: https://e11.bio/team \ No newline at end of file From 
322035d96c9dfc69c46721b8a09b5b5067930978 Mon Sep 17 00:00:00 2001 From: "Y.R. Moon" <96802030+yr-moon@users.noreply.github.com> Date: Fri, 26 Sep 2025 02:43:02 +0900 Subject: [PATCH 384/751] Fix typo in ADXCategoriesAdxCategories field From 87bc1e915828d5c9a03d29583968caccfdc9fed5 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 25 Sep 2025 10:05:05 -0800 Subject: [PATCH 385/751] ok: Update st-open-data.yaml From 9c6ca0337fce93447b1a2c3f7e56c7f0253db276 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 25 Sep 2025 14:07:20 -0400 Subject: [PATCH 386/751] ok: Update e11bio-prism.yaml adding tag aws-pds --- datasets/e11bio-prism.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/e11bio-prism.yaml b/datasets/e11bio-prism.yaml index 66a30ad09..56a92e88d 100644 --- a/datasets/e11bio-prism.yaml +++ b/datasets/e11bio-prism.yaml @@ -41,6 +41,7 @@ Tags: - protein - segmentation - zarr + - aws-pds License: https://e11.bio/terms-of-use Resources: - Description: Data files in a public bucket @@ -58,4 +59,4 @@ DataAtWork: - Title: Volara URL: https://github.com/e11bio/volara AuthorName: Arlo Sheridan & Will Patton - AuthorURL: https://e11.bio/team \ No newline at end of file + AuthorURL: https://e11.bio/team From 588a9456f9578176c731684ef538090874ce0fc2 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 25 Sep 2025 10:08:08 -0800 Subject: [PATCH 387/751] ok: Update st-open-data.yaml --- datasets/st-open-data.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index 0779a4297..e19d397da 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -25,4 +25,4 @@ DataAtWork: - Title: "SpaceEye-T Data Manual" URL: https://www.si-imaging.com/page/72?sca=SpaceEye-T AuthorName: SI-Imaging Services -ADXCategoriesAdxCategories: Resources Data +ADXCategories: Resources Data From 
f641f43a7401c6cf67074c9d97d3a76774798915 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 25 Sep 2025 10:13:01 -0800 Subject: [PATCH 388/751] ok: Update st-open-data.yaml --- datasets/st-open-data.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index e19d397da..f9b587aa4 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -25,4 +25,5 @@ DataAtWork: - Title: "SpaceEye-T Data Manual" URL: https://www.si-imaging.com/page/72?sca=SpaceEye-T AuthorName: SI-Imaging Services -ADXCategories: Resources Data +ADXCategories: + - Resources Data From 63d57318d832fb4dc76f19ce8c0936c22cdae13b Mon Sep 17 00:00:00 2001 From: "Y.R. Moon" <96802030+yr-moon@users.noreply.github.com> Date: Fri, 26 Sep 2025 03:43:17 +0900 Subject: [PATCH 389/751] Update st-open-data.yaml with SNS Topic information Added SNS Topic details for SpaceEye-T VVHR EO Open data. --- datasets/st-open-data.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index f9b587aa4..1eb8c057c 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -20,6 +20,10 @@ Resources: Type: S3 Bucket Explore: - '[Browse Bucket](http://st-vvhr-opendata.s3-website.us-west-2.amazonaws.com/)' + - Description: Notifications for new SpaceEye-T VVHR EO Open data + ARN: arn:aws:sns:us-west-2:348881531141:st-vvhr-opendata-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Publications: - Title: "SpaceEye-T Data Manual" From 2e9fc480278793483fe84e460a951966a44b2906 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 25 Sep 2025 10:47:52 -0800 Subject: [PATCH 390/751] ok: Update st-open-data.yaml From f3c37d378f21de54b79e6385d6d6bd59cf7e7790 Mon Sep 17 00:00:00 2001 From: berylrab Date: Thu, 25 Sep 2025 17:23:58 -0400 Subject: [PATCH 391/751] ok: Update 
deepdrug-dpeb.yaml update bucket name and region --- datasets/deepdrug-dpeb.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/deepdrug-dpeb.yaml b/datasets/deepdrug-dpeb.yaml index 99b2da9f3..30779b6c3 100644 --- a/datasets/deepdrug-dpeb.yaml +++ b/datasets/deepdrug-dpeb.yaml @@ -15,8 +15,8 @@ License: MIT Citation: "Sajol MSI et al. DeepDrug Protein Embeddings Bank (DPEB) was accessed on [DATE] at https://registry.opendata.aws/dpeb" Resources: - Description: Multimodal human protein embeddings (AlphaFold2, BioEmbeddings, ESM-2, ProtVec) with JSONL-formatted metadata containing FASTA, UniProt IDs, and embeddings. - ARN: arn:aws:s3:::deepdrug-dpeb-human-protein-embeddings - Region: us-east-1 + ARN: arn:aws:s3:::deepdrug-dpeb + Region: us-west-2 Type: S3 Bucket DataAtWork: Tutorials: From a4fcbbe497ce207efa82205996eabc8bdd48e08e Mon Sep 17 00:00:00 2001 From: Ev Date: Fri, 26 Sep 2025 15:45:56 -0400 Subject: [PATCH 392/751] Update aws-public-blockchain.yaml Adding Cronos to the list --- datasets/aws-public-blockchain.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/aws-public-blockchain.yaml b/datasets/aws-public-blockchain.yaml index 62de65c01..78e32a22d 100644 --- a/datasets/aws-public-blockchain.yaml +++ b/datasets/aws-public-blockchain.yaml @@ -13,6 +13,7 @@ Description: > - XRP Ledger - SonarX - s3://aws-public-blockchain/v1.1/sonarx/xrp/
- Stellar(XDR files) - Stellar - s3://aws-public-blockchain/v1.1/stellar/
- The Open Network (TON) - TON - s3://aws-public-blockchain/v1.1/ton/
+ - Cronos - Cronos - s3://aws-public-blockchain/v1.1/cronos/

Become a Data Provider

From 636a2952045e1f81ce5ffb7081d85be3923c2977 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 26 Sep 2025 11:57:47 -0800 Subject: [PATCH 393/751] ok: Update aws-public-blockchain.yaml --- datasets/aws-public-blockchain.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/aws-public-blockchain.yaml b/datasets/aws-public-blockchain.yaml index 78e32a22d..ec2019e79 100644 --- a/datasets/aws-public-blockchain.yaml +++ b/datasets/aws-public-blockchain.yaml @@ -25,6 +25,7 @@ Contact: aws-blockchain-data@amazon.com ManagedBy: "[Amazon Web Services](https://aws.amazon.com/)" UpdateFrequency: New data is delivered daily to the current date folders Parquet files. Tags: + - aws-pds - blockchain - web3 License: https://github.com/aws-samples/digital-assets-examples/blob/main/LICENSE From dd553b5523c7d7cbc653d86bdd7c372bc178cb2c Mon Sep 17 00:00:00 2001 From: Chris Stoner Date: Fri, 26 Sep 2025 13:40:20 -0800 Subject: [PATCH 394/751] minor fixes for logos --- datasets/humancellatlas.yaml | 2 +- datasets/oceanomics.yaml | 2 +- datasets/proteingym.yaml | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/humancellatlas.yaml b/datasets/humancellatlas.yaml index d1d9ec5d7..f3f9b196c 100644 --- a/datasets/humancellatlas.yaml +++ b/datasets/humancellatlas.yaml @@ -12,7 +12,7 @@ Documentation: https://data.humancellatlas.org/ Contact: https://data.humancellatlas.org/contact -ManagedBy: UC Santa Cruz Genomics Institute, University of California, Santa Cruz (UCSC) +ManagedBy: UC Santa Cruz Genomics Institute, University of California, Santa Cruz, UCSC UpdateFrequency: Monthly diff --git a/datasets/oceanomics.yaml b/datasets/oceanomics.yaml index 5f669ae44..b78db3d08 100644 --- a/datasets/oceanomics.yaml +++ b/datasets/oceanomics.yaml @@ -2,7 +2,7 @@ Name: OceanOmics Description: "Minderoo Foundation OceanOmics aims to establish environmental DNA (eDNA) as a tool to measure, understand, and protect oceans. 
OceanOmics mainly generates two types of data: eDNA sequencing data (metabarcoding, metagenomics), and genome assembly data (marine vertebrates)." Documentation: https://edna.minderoo.org Contact: oceanomics@minderoo.org -ManagedBy: Minderoo Foundation OceanOmics (Dr Shannon Corrigan, Dr Philipp Bayer) +ManagedBy: Minderoo Foundation OceanOmics, Dr Shannon Corrigan, Dr Philipp Bayer UpdateFrequency: Data will be continually updated as it is generated. Collabs: ASDI: diff --git a/datasets/proteingym.yaml b/datasets/proteingym.yaml index 6d7b8bdc8..13aed7c2f 100644 --- a/datasets/proteingym.yaml +++ b/datasets/proteingym.yaml @@ -3,7 +3,7 @@ Description: | ProteinGym is a benchmark suite for assessing the performance of protein fitness prediction and design models. It comprises a large curated collection of 200+ high-throughput experimental assays (~3M mutated sequences), as well as clinical annotations from experts about the pathogenicity of mutants in over 3k human genes. Documentation: https://github.com/OATML-Markslab/ProteinGym/blob/main/README.md Contact: pascal_notin@hms.harvard.edu -ManagedBy: "Harvard Medical School; University of Oxford" +ManagedBy: "Harvard Medical School, University of Oxford" UpdateFrequency: Quarterly Tags: - aws-pds From 427eb67e17d0e39dd37d91eb81fb8e815fe4a779 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 26 Sep 2025 13:42:32 -0800 Subject: [PATCH 395/751] ok: Update humancellatlas.yaml --- datasets/humancellatlas.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/humancellatlas.yaml b/datasets/humancellatlas.yaml index f3f9b196c..30bd81938 100644 --- a/datasets/humancellatlas.yaml +++ b/datasets/humancellatlas.yaml @@ -95,4 +95,4 @@ DataAtWork: AuthorName: "Various authors" ADXCategories: - - Healthcare & Life Sciences Data \ No newline at end of file + - Healthcare & Life Sciences Data From 4126945fd1127d0ebbd08952a1bbc920ba1389ea Mon Sep 17 00:00:00 2001 
From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 26 Sep 2025 14:17:26 -0800 Subject: [PATCH 396/751] Update colorado-elevation-data.yaml --- datasets/colorado-elevation-data.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/colorado-elevation-data.yaml b/datasets/colorado-elevation-data.yaml index 450b3ad92..66328bb34 100644 --- a/datasets/colorado-elevation-data.yaml +++ b/datasets/colorado-elevation-data.yaml @@ -2,7 +2,7 @@ Name: State of Colorado Elevation Data Description: The State of Colorado has gathered public historical elevation data. Documentation: https://docs.google.com/document/d/1HMO-d4cCrBvFa2F6-N3lhP6rkezlvBmSUFA5S8t_ekQ/edit?usp=sharing Contact: oit_gis@state.co.us -ManagedBy: State of Colorado Governor's Office of Information Technology (OIT) GIS team +ManagedBy: State of Colorado Governor's Office of Information Technology OIT GIS team UpdateFrequency: Periodically Tags: - aws-pds From 570b9298b410e544a6c06ff54588753de5b385ef Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 26 Sep 2025 14:18:52 -0800 Subject: [PATCH 397/751] Update colorado-imagery.yaml --- datasets/colorado-imagery.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/colorado-imagery.yaml b/datasets/colorado-imagery.yaml index a5c92527d..60f7238ca 100644 --- a/datasets/colorado-imagery.yaml +++ b/datasets/colorado-imagery.yaml @@ -2,7 +2,7 @@ Name: State of Colorado Imagery Description: The State of Colorado has gathered public historical imagery ranging from 2005 to 2021. 
Documentation: https://docs.google.com/document/d/1YDHignUj9lQTMw2J-SqA96MTP8KmJYtk2ZKKC2ZYuPE/edit?usp=sharing Contact: oit_gis@state.co.us -ManagedBy: State of Colorado Governor's Office of Information Technology (OIT) GIS team +ManagedBy: State of Colorado Governor's Office of Information Technology OIT GIS team UpdateFrequency: Periodically Collabs: ASDI: From e075a43d59273bb149d44d7e9d2a24e0d0b507ef Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 26 Sep 2025 14:19:28 -0800 Subject: [PATCH 398/751] ok: Update colorado-imagery.yaml From 6334d43d9bf70b6b59027eeb3a537cbe2a84ceb3 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 26 Sep 2025 14:25:00 -0800 Subject: [PATCH 399/751] ok: Update colorado-elevation-data.yaml From 886422784b67c74987e3fab2df8d0e32c18e0c1e Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 26 Sep 2025 14:28:38 -0800 Subject: [PATCH 400/751] ok: Update colorado-elevation-data.yaml From 014a039a0eed8bc4b1f0dc9ffd6b7e97eaadb3ca Mon Sep 17 00:00:00 2001 From: Chris Stoner Date: Fri, 26 Sep 2025 14:56:42 -0800 Subject: [PATCH 401/751] logo fix --- datasets/colorado-elevation-data.yaml | 2 +- datasets/colorado-imagery.yaml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/colorado-elevation-data.yaml b/datasets/colorado-elevation-data.yaml index 66328bb34..8cba77ce4 100644 --- a/datasets/colorado-elevation-data.yaml +++ b/datasets/colorado-elevation-data.yaml @@ -2,7 +2,7 @@ Name: State of Colorado Elevation Data Description: The State of Colorado has gathered public historical elevation data. 
Documentation: https://docs.google.com/document/d/1HMO-d4cCrBvFa2F6-N3lhP6rkezlvBmSUFA5S8t_ekQ/edit?usp=sharing Contact: oit_gis@state.co.us -ManagedBy: State of Colorado Governor's Office of Information Technology OIT GIS team +ManagedBy: State of Colorado Governors Office of Information Technology OIT GIS team UpdateFrequency: Periodically Tags: - aws-pds diff --git a/datasets/colorado-imagery.yaml b/datasets/colorado-imagery.yaml index 60f7238ca..2a1afd1ad 100644 --- a/datasets/colorado-imagery.yaml +++ b/datasets/colorado-imagery.yaml @@ -2,7 +2,7 @@ Name: State of Colorado Imagery Description: The State of Colorado has gathered public historical imagery ranging from 2005 to 2021. Documentation: https://docs.google.com/document/d/1YDHignUj9lQTMw2J-SqA96MTP8KmJYtk2ZKKC2ZYuPE/edit?usp=sharing Contact: oit_gis@state.co.us -ManagedBy: State of Colorado Governor's Office of Information Technology OIT GIS team +ManagedBy: State of Colorado Governors Office of Information Technology OIT GIS team UpdateFrequency: Periodically Collabs: ASDI: From c0a31be94a459cc8c65e24c3bba5882fa417a269 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 26 Sep 2025 14:58:14 -0800 Subject: [PATCH 402/751] ok: Update colorado-elevation-data.yaml From 5727d8ad8c8d1ced2f3699737b94c6c0afe88f92 Mon Sep 17 00:00:00 2001 From: Alfonso Ladino Date: Sat, 27 Sep 2025 20:31:28 -0500 Subject: [PATCH 403/751] Update ideam-radares.yaml --- datasets/ideam-radares.yaml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/datasets/ideam-radares.yaml b/datasets/ideam-radares.yaml index f66486136..17caf7584 100644 --- a/datasets/ideam-radares.yaml +++ b/datasets/ideam-radares.yaml @@ -29,9 +29,10 @@ DataAtWork: - Title: Read and plot Sigmet files available on AWS using Xradar URL: https://docs.openradarscience.org/projects/xradar/en/stable/notebooks/Read-plot-Sigmet-data-from-AWS.html AuthorName: Alfonso Ladino - - Title: Taller de datos 
científicos con Python y R - AtmosCol 2023 - URL: https://projectpythia.org/AtmosCol-2023/notebooks/2.acceso-datos/2.2.Radares.html + - Title: Ciencia de Datos Hidrometeorológicos con Python + URL: https://projectpythia.org/AtmosCol-2023/radares AuthorName: Alfonso Ladino, Nicole Rivera, Max Grover - Title: Specific Differential Phase (KDP) retrieval methods comparison URL: https://projectpythia.org/radar-cookbook/notebooks/example-workflows/kdp-comparison.html AuthorName: Alfonso Ladino, Max Grover + From 31a6fe3543fca6e29ae7394db9295606ede26bcf Mon Sep 17 00:00:00 2001 From: Allan Frank Date: Mon, 29 Sep 2025 14:04:14 +0200 Subject: [PATCH 404/751] Adding Danish Meteorological Institute (DMI) Reanalysis dataset v0.5 --- datasets/dmi-danra-05.yaml | 49 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) create mode 100644 datasets/dmi-danra-05.yaml diff --git a/datasets/dmi-danra-05.yaml b/datasets/dmi-danra-05.yaml new file mode 100644 index 000000000..82d49f890 --- /dev/null +++ b/datasets/dmi-danra-05.yaml @@ -0,0 +1,49 @@ +Name: Danish Meteorological Institute (DMI) Reanalysis dataset v0.5 +Description: DANRA is a high-resolution meteorological reanalysis dataset for Denmark and Northwestern Europe covering the period September 1990 to December 2023 +Documentation: https://dmidk.github.io/danradocs/intro.html +Contact: https://www.dmi.dk/kontakt +ManagedBy: "[Danish Meteorological Institute](https://www.dmi.dk/)" +UpdateFrequency: Not updated +Collabs: + ASDI: + Tags: + - climate + - weather +Tags: + - air temperature + - atmosphere + - geospatial + - global + - land + - meteorological + - near-surface air temperature + - near-surface relative humidity + - near-surface specific humidity + - model + - water + - weather + - zarr +License: DMI Reanalysis dataset v0.5 is distributed under the [Creative Commons License CC BY 4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) +Resources: + - Description: DMI's Open Data Forecasts + 
ARN: arn:aws:s3:::dmi-opendata + Region: eu-north-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Looking at distributions + URL: https://dmidk.github.io/danradocs/notebooks/distributions.html + NotebookURL: https://dmidk.github.io/danradocs/_sources/notebooks/distributions.ipynb + AuthorName: Danish Meteorological Institute + AuthorURL: https://www.dmi.dk/ + Services: + - Amazon S3 + - Title: DANRA figures + URL: https://dmidk.github.io/danradocs/notebooks/paper-figures.html + NotebookURL: https://dmidk.github.io/danradocs/_sources/notebooks/paper-figures.ipynb + AuthorName: Danish Meteorological Institute + AuthorURL: https://www.dmi.dk/ + Services: + - Amazon S3 +ADXCategories: + - Environmental Data From 9318a820e6a65efe23d8d5d9b3321ad1392856c2 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 29 Sep 2025 08:53:37 -0800 Subject: [PATCH 405/751] ok: Update ideam-radares.yaml --- datasets/ideam-radares.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/ideam-radares.yaml b/datasets/ideam-radares.yaml index 17caf7584..72030295f 100644 --- a/datasets/ideam-radares.yaml +++ b/datasets/ideam-radares.yaml @@ -35,4 +35,3 @@ DataAtWork: - Title: Specific Differential Phase (KDP) retrieval methods comparison URL: https://projectpythia.org/radar-cookbook/notebooks/example-workflows/kdp-comparison.html AuthorName: Alfonso Ladino, Max Grover - From 35f1c6caecfb15837f359ae0b386d31e75d2ad05 Mon Sep 17 00:00:00 2001 From: Alfonso Ladino Date: Mon, 29 Sep 2025 12:09:24 -0500 Subject: [PATCH 406/751] Update ideam-radares.yaml --- datasets/ideam-radares.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/ideam-radares.yaml b/datasets/ideam-radares.yaml index 72030295f..911c9ed06 100644 --- a/datasets/ideam-radares.yaml +++ b/datasets/ideam-radares.yaml @@ -33,5 +33,6 @@ DataAtWork: URL: https://projectpythia.org/AtmosCol-2023/radares AuthorName: Alfonso Ladino, Nicole Rivera, Max Grover - 
Title: Specific Differential Phase (KDP) retrieval methods comparison - URL: https://projectpythia.org/radar-cookbook/notebooks/example-workflows/kdp-comparison.html + URL: https://projectpythia.org/radar-cookbook/notebooks/example-workflows/kdp-comparison/ AuthorName: Alfonso Ladino, Max Grover + From 9fba5c01efd118e9fde0ac6889254cfcfb65bccc Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 29 Sep 2025 09:16:44 -0800 Subject: [PATCH 407/751] ok: Update dmi-danra-05.yaml --- datasets/dmi-danra-05.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/dmi-danra-05.yaml b/datasets/dmi-danra-05.yaml index 82d49f890..4aba72247 100644 --- a/datasets/dmi-danra-05.yaml +++ b/datasets/dmi-danra-05.yaml @@ -10,6 +10,7 @@ Collabs: - climate - weather Tags: + - aws-pds - air temperature - atmosphere - geospatial From 09723847cba51992b3f6308bbf65337e8a93739e Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 29 Sep 2025 09:38:30 -0800 Subject: [PATCH 408/751] ok: Update ideam-radares.yaml --- datasets/ideam-radares.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/ideam-radares.yaml b/datasets/ideam-radares.yaml index 911c9ed06..98913330e 100644 --- a/datasets/ideam-radares.yaml +++ b/datasets/ideam-radares.yaml @@ -35,4 +35,3 @@ DataAtWork: - Title: Specific Differential Phase (KDP) retrieval methods comparison URL: https://projectpythia.org/radar-cookbook/notebooks/example-workflows/kdp-comparison/ AuthorName: Alfonso Ladino, Max Grover - From dbdd1ea40c72141fe5c68432805d4eddb08b3e11 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 29 Sep 2025 15:40:47 -0800 Subject: [PATCH 409/751] Update ecmwf-era5.yaml era5 link --- datasets/ecmwf-era5.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/ecmwf-era5.yaml b/datasets/ecmwf-era5.yaml index 2f40a542a..4793f608b 100644 --- a/datasets/ecmwf-era5.yaml +++ 
b/datasets/ecmwf-era5.yaml @@ -1,5 +1,6 @@ Deprecated: True -DeprecatedNotice: The provider of this dataset will no longer maintain this dataset. We are open to talking with anyone else who might be willing to provide this dataset to the community. Contact opendata@amazon.com. +DeprecatedNotice: | +

The provider of this dataset will no longer maintain it, but has instead worked with NSF NCAR to rehost the dataset here: https://registry.opendata.aws/nsf-ncar-era5/

Name: ECMWF ERA5 Reanalysis Description: | ERA5 is the fifth generation of ECMWF atmospheric reanalyses of the global climate, and the first reanalysis produced as an operational service. It utilizes the best available observation data from satellites and in-situ stations, which are assimilated and processed using ECMWF's Integrated Forecast System (IFS) Cycle 41r2. From d2007088f7713e63160adbf3d5bf93a04f88fbb8 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 29 Sep 2025 15:41:33 -0800 Subject: [PATCH 410/751] ok: Update ecmwf-era5.yaml From 8babb4b8524d95576d4562797895dcff8de4b7b2 Mon Sep 17 00:00:00 2001 From: "Y.R. Moon" <96802030+yr-moon@users.noreply.github.com> Date: Tue, 30 Sep 2025 11:26:00 +0900 Subject: [PATCH 411/751] Fix contact URL in st-open-data.yaml The previous commit included an incorrect contact URL. This commit updates it to the correct one. --- datasets/st-open-data.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index 1eb8c057c..5f9e508c4 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -2,7 +2,7 @@ Name: SpaceEye-T VVHR EO Open Data Description: | SpaceEye-T satellite collects the highest resolution optical imagery among the commercial satellites, 25 cm resolution. The Open Data features various satellite images around the world for end users to experience the power of VVHR optical data. Documentation: https://www.si-imaging.com/page/72?sca=SpaceEye-T -Contact: https://https//www.si-imaging.com +Contact: https://www.si-imaging.com UpdateFrequency: The dataset is frequently updated. The frequent updates include the time-series data for regular monitoring, and the data for disaster management. SI Imaging wants to provide the user expierence on what is possible with VVHR optical satellite data. If you have a suggestion for a new location, feedback on the dataset, or any questions, contact us. 
Tags: - aws-pds @@ -12,7 +12,7 @@ Tags: - geospatial - image processing License: Creative Commons Attribution-NonCommercial 4.0 International -ManagedBy: "[SI Imaging Services](https://https//www.si-imaging.com/)" +ManagedBy: "[SI Imaging Services](https://www.si-imaging.com/)" Resources: - Description: SpaceEye-T Imagery Collection ARN: arn:aws:s3:::st-vvhr-opendata From 18cb250f193479dc266937cdb67bf4e26d2b882e Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 30 Sep 2025 08:27:30 -0800 Subject: [PATCH 412/751] Update st-open-data.yaml From fe14c7bdf582f00aa3676b7a93f5cf0d90306dae Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 30 Sep 2025 08:27:43 -0800 Subject: [PATCH 413/751] ok: Update st-open-data.yaml From 7f38ecebb8fa2f78f3b8ad8a8e8c0bccc7471c45 Mon Sep 17 00:00:00 2001 From: Devin McCabe Date: Tue, 30 Sep 2025 16:22:04 -0400 Subject: [PATCH 414/751] add depmap-omics-ccle SNS resource --- datasets/depmap-omics-ccle.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/depmap-omics-ccle.yaml b/datasets/depmap-omics-ccle.yaml index 2c9efb544..c2d0d9bbf 100644 --- a/datasets/depmap-omics-ccle.yaml +++ b/datasets/depmap-omics-ccle.yaml @@ -25,6 +25,10 @@ Resources: ARN: arn:aws:s3:::depmap-omics-ccle Region: us-east-1 Type: S3 Bucket + - Description: Notifications for new depmap-omics-ccle data + ARN: arn:aws:sns:us-east-1:019511184952:depmap-omics-ccle-object_created + Region: us-east-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: DepMap Omics CCLE data on the AWS Open Data Registry From 6eb79bafbf49d0d590160fb1a6b668b81e8df958 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 30 Sep 2025 16:39:14 -0400 Subject: [PATCH 415/751] ok: Update depmap-omics-ccle.yaml ok From 95e24d9e1f86c2cbcaa283b624cac399d75fcad8 Mon Sep 17 00:00:00 2001 From: Allan Frank Date: Thu, 2 Oct 2025 13:15:29 +0200 Subject: [PATCH 416/751] Corrected S3 bucket ARN and 
description --- datasets/dmi-danra-05.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/dmi-danra-05.yaml b/datasets/dmi-danra-05.yaml index 4aba72247..04b7c1ed4 100644 --- a/datasets/dmi-danra-05.yaml +++ b/datasets/dmi-danra-05.yaml @@ -26,8 +26,8 @@ Tags: - zarr License: DMI Reanalysis dataset v0.5 is distributed under the [Creative Commons License CC BY 4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) Resources: - - Description: DMI's Open Data Forecasts - ARN: arn:aws:s3:::dmi-opendata + - Description: DMI Reanalysis dataset v0.5 + ARN: arn:aws:s3:::dmi-danra-05 Region: eu-north-1 Type: S3 Bucket DataAtWork: From a461b2c77046d5dcdb5f81a79dd57a06583c5011 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 2 Oct 2025 04:57:29 -0800 Subject: [PATCH 417/751] ok: Update dmi-danra-05.yaml From e7aa83dded160806cbc8f7822112846a50f5bb1c Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 2 Oct 2025 05:04:53 -0800 Subject: [PATCH 418/751] ok: Update dmi-danra-05.yaml From 9ad5fe68f57a17605dd052acada3a019d72e9e9c Mon Sep 17 00:00:00 2001 From: "Youngran (Rachel) Moon" <96802030+yr-moon@users.noreply.github.com> Date: Fri, 3 Oct 2025 15:18:52 +0900 Subject: [PATCH 419/751] Revise license and citation details in YAML file Updated license information and added citation guidelines. --- datasets/st-open-data.yaml | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index 5f9e508c4..213b5862b 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -11,7 +11,12 @@ Tags: - disaster response - geospatial - image processing -License: Creative Commons Attribution-NonCommercial 4.0 International +License: | + Creative Commons Attribution-NonCommercial 4.0 International. 
+ Please visit [our Terms of Use webpage](https://si-imaging.com/page/73) to review the following documents: + - ST-1 Product Terms of Use + - ST-1 License and Attribution Guide for Open Data +Citation: When publicly post of ST products Open Data unedited, "SpaceEye-T © [Year] Satrec Initiative (Licensed under CC BY-NC 4.0)". When publicly post derivative data created by using ST products Open Data, "SpaceEye-T-derived data © [Year] Satrec Initiative (Originally licensed under CC BY-NC 4.0)". ManagedBy: "[SI Imaging Services](https://www.si-imaging.com/)" Resources: - Description: SpaceEye-T Imagery Collection From b25832f94d1793bbb4f97716e2ecbae360b95783 Mon Sep 17 00:00:00 2001 From: "Youngran (Rachel) Moon" <96802030+yr-moon@users.noreply.github.com> Date: Fri, 3 Oct 2025 15:26:43 +0900 Subject: [PATCH 420/751] Improve citation formatting in st-open-data.yaml Reformat citation guidelines for clarity and readability. --- datasets/st-open-data.yaml | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index 213b5862b..4b0f71f83 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -16,7 +16,13 @@ License: | Please visit [our Terms of Use webpage](https://si-imaging.com/page/73) to review the following documents: - ST-1 Product Terms of Use - ST-1 License and Attribution Guide for Open Data -Citation: When publicly post of ST products Open Data unedited, "SpaceEye-T © [Year] Satrec Initiative (Licensed under CC BY-NC 4.0)". When publicly post derivative data created by using ST products Open Data, "SpaceEye-T-derived data © [Year] Satrec Initiative (Originally licensed under CC BY-NC 4.0)". +Citation: | + When publicly posting ST products Open Data unedited, use: + "SpaceEye-T © [Year] Satrec Initiative (Licensed under CC BY-NC 4.0)". 
+ + When publicly posting derivative data created by using ST products Open Data, use: + "SpaceEye-T-derived data © [Year] Satrec Initiative (Originally licensed under CC BY-NC 4.0)". + ManagedBy: "[SI Imaging Services](https://www.si-imaging.com/)" Resources: - Description: SpaceEye-T Imagery Collection From ed7ef264e038687b8b3f5804e90abf99e083f075 Mon Sep 17 00:00:00 2001 From: "Youngran (Rachel) Moon" <96802030+yr-moon@users.noreply.github.com> Date: Fri, 3 Oct 2025 15:33:37 +0900 Subject: [PATCH 421/751] Update License and Citation in st-open-data.yaml --- datasets/st-open-data.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index 4b0f71f83..6d725ec00 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -13,7 +13,8 @@ Tags: - image processing License: | Creative Commons Attribution-NonCommercial 4.0 International. - Please visit [our Terms of Use webpage](https://si-imaging.com/page/73) to review the following documents: + Please visit: [our Terms of Use webpage](https://si-imaging.com/page/73) + to review the following documents: - ST-1 Product Terms of Use - ST-1 License and Attribution Guide for Open Data Citation: | From 682795839cb948f91515aeaf010588f48e5ac1b6 Mon Sep 17 00:00:00 2001 From: Devin McCabe Date: Fri, 3 Oct 2025 09:21:27 -0400 Subject: [PATCH 422/751] add missing author names --- datasets/depmap-omics-ccle.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/datasets/depmap-omics-ccle.yaml b/datasets/depmap-omics-ccle.yaml index c2d0d9bbf..f50630598 100644 --- a/datasets/depmap-omics-ccle.yaml +++ b/datasets/depmap-omics-ccle.yaml @@ -33,11 +33,14 @@ DataAtWork: Tutorials: - Title: DepMap Omics CCLE data on the AWS Open Data Registry URL: https://github.com/broadinstitute/depmap-omics-ccle + AuthorName: Devin McCabe Tools & Applications: - Title: The Cancer Dependency Map (DepMap) URL: https://depmap.org + AuthorName: Arafeh, Shibue, 
Dempster et al. - Title: Cancer Cell Line Encyclopedia (CCLE) URL: https://sites.broadinstitute.org/ccle + AuthorName: Ghandi, Huang, Jané-Valbuena et al. Publications: - Title: Next-generation characterization of the Cancer Cell Line Encyclopedia URL: https://www.nature.com/articles/s41586-019-1186-3 From ded29c9528541a05eb2778bb584b1240e6fc6202 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 3 Oct 2025 09:12:42 -0800 Subject: [PATCH 423/751] ok: Update fvcom_gom3.yaml From a66116534cb9f0e6e76c149d30f6732bb959b462 Mon Sep 17 00:00:00 2001 From: kszura <43186787+kszura@users.noreply.github.com> Date: Fri, 3 Oct 2025 15:24:51 -0400 Subject: [PATCH 424/751] Update noaa-wod.yaml Updated contact language. It contained outdated information. --- datasets/noaa-wod.yaml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/datasets/noaa-wod.yaml b/datasets/noaa-wod.yaml index 650e9b6cb..61ca56f1b 100644 --- a/datasets/noaa-wod.yaml +++ b/datasets/noaa-wod.yaml @@ -3,8 +3,8 @@ Description: > The World Ocean Database (WOD) is the largest uniformly formatted, quality-controlled, publicly available historical subsurface ocean profile database. From Captain Cook's second voyage in 1772 to today's automated Argo floats, global aggregation of ocean variable information including temperature, salinity, oxygen, nutrients, and others vs. depth allow for study and understanding of the changing physical, chemical, and to some extent biological state of the World's Oceans. Browse the bucket via the AWS S3 explorer: https://noaa-wod-pds.s3.amazonaws.com/index.html Documentation: https://www.nodc.noaa.gov/OC5/WOD/pr_wod.html Contact: | - For any questions regarding data delivery not associated with this platform or any general questions regarding the NOAA Big Data Program, email noaa.bdp@noaa.gov. -
We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NOAA BDP team here: noaa.bdp@noaa.gov + For any questions regarding data delivery not associated with this platform or any general questions regarding the NOAA Open Data Dissemination (NODD) Program, email the NODD Team at nodd@noaa.gov. + We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NODD team by emailing nodd@noaa.gov ManagedBy: "[NOAA](http://www.noaa.gov/)" UpdateFrequency: Data is update on a quarterly basis Collabs: @@ -34,3 +34,4 @@ DataAtWork: - Title: The World Ocean Database User's Manual URL: https://data.nodc.noaa.gov/woa/WOD/DOC/wodreadme.pdf AuthorName: Hernan E. Garcia, Tim P. Boyer, Ricardo A. Locarnini, Olga K. Baranova, Melissa M. Zweng + From 0a7fa955c2eda741783e25174a1ad61bc4f169d6 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 3 Oct 2025 13:13:44 -0800 Subject: [PATCH 425/751] ok: Update noaa-wod.yaml --- datasets/noaa-wod.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/noaa-wod.yaml b/datasets/noaa-wod.yaml index 61ca56f1b..af8013170 100644 --- a/datasets/noaa-wod.yaml +++ b/datasets/noaa-wod.yaml @@ -34,4 +34,4 @@ DataAtWork: - Title: The World Ocean Database User's Manual URL: https://data.nodc.noaa.gov/woa/WOD/DOC/wodreadme.pdf AuthorName: Hernan E. Garcia, Tim P. Boyer, Ricardo A. Locarnini, Olga K. Baranova, Melissa M. 
Zweng - + From 06c0a042db1f6703950194705d0037c98223a584 Mon Sep 17 00:00:00 2001 From: Peter Schmiedeskamp Date: Mon, 6 Oct 2025 08:29:11 -0700 Subject: [PATCH 426/751] ok: trailing whitespace From a2ccf44cf367f5923a05501e5bf15e8b7448a583 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 7 Oct 2025 11:52:36 -0800 Subject: [PATCH 427/751] Update noaa-nexrad.yaml - remove deprecated bucket and SNS topic --- datasets/noaa-nexrad.yaml | 8 -------- 1 file changed, 8 deletions(-) diff --git a/datasets/noaa-nexrad.yaml b/datasets/noaa-nexrad.yaml index cfa9222f9..7e5373c5c 100644 --- a/datasets/noaa-nexrad.yaml +++ b/datasets/noaa-nexrad.yaml @@ -52,14 +52,6 @@ Resources: ARN: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel3Object Region: us-east-1 Type: SNS Topic - - Description: "*OLD NEXRAD Level II archive bucket* which is now Deprecated. It is recommended to move to the new bucket: unidata-nexrad-level2 and SNS topic: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel2Archive" - ARN: arn:aws:s3:::noaa-nexrad-level2 - Region: us-east-1 - Type: S3 Bucket - - Description: "Notifications for the *OLD Level II archival bucket* which is now Deprecated. 
It is recommended to move to the new bucket: unidata-nexrad-level2 and SNS topic: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel2Archive" - ARN: arn:aws:sns:us-east-1:811054952067:NewNEXRADLevel2Archive - Region: us-east-1 - Type: SNS Topic DataAtWork: Tutorials: - Title: Using Python to Access NCEI Archived NEXRAD Level 2 Data (Jupyter notebook) From 5af88997d756a21d248bf214307ef053b8c4889c Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 7 Oct 2025 12:10:53 -0800 Subject: [PATCH 428/751] ok: Update noaa-nexrad.yaml From 2e99847ca8a4310fd7f61698926adf215a4dd7b4 Mon Sep 17 00:00:00 2001 From: "Youngran (Rachel) Moon" <96802030+yr-moon@users.noreply.github.com> Date: Thu, 9 Oct 2025 13:30:18 +0900 Subject: [PATCH 429/751] Revise license and citation details in st-open-data.yaml Updated license information and citation guidelines in the YAML file. --- datasets/st-open-data.yaml | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index 6d725ec00..dc0a0a9f9 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -12,18 +12,14 @@ Tags: - geospatial - image processing License: | - Creative Commons Attribution-NonCommercial 4.0 International. - Please visit: [our Terms of Use webpage](https://si-imaging.com/page/73) - to review the following documents: - - ST-1 Product Terms of Use - - ST-1 License and Attribution Guide for Open Data + Creative Commons Attribution 4.0 International (CC BY 4.0). + For more information, See the document "ST-1 Product Terms of Use" at [our Terms of Use webpage](https://si-imaging.com/page/73) Citation: | When publicly posting ST products Open Data unedited, use: "SpaceEye-T © [Year] Satrec Initiative (Licensed under CC BY-NC 4.0)". 
When publicly posting derivative data created by using ST products Open Data, use: "SpaceEye-T-derived data © [Year] Satrec Initiative (Originally licensed under CC BY-NC 4.0)". - ManagedBy: "[SI Imaging Services](https://www.si-imaging.com/)" Resources: - Description: SpaceEye-T Imagery Collection From ea935238dbd6efa8e39519ce157fb67a7cbd6671 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 9 Oct 2025 06:10:56 -0800 Subject: [PATCH 430/751] ok: Update st-open-data.yaml From 401dd34502f752e3c311ffb45af39417c8e0ba45 Mon Sep 17 00:00:00 2001 From: "Youngran (Rachel) Moon" <96802030+yr-moon@users.noreply.github.com> Date: Fri, 10 Oct 2025 10:36:08 +0900 Subject: [PATCH 431/751] Fix citation wording in st-open-data.yaml Correct citation instructions for ST products Open Data. "CC BY-NC 4.0" to "CC BY 4.0" --- datasets/st-open-data.yaml | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/datasets/st-open-data.yaml b/datasets/st-open-data.yaml index dc0a0a9f9..1ba885ac1 100644 --- a/datasets/st-open-data.yaml +++ b/datasets/st-open-data.yaml @@ -15,11 +15,8 @@ License: | Creative Commons Attribution 4.0 International (CC BY 4.0). For more information, See the document "ST-1 Product Terms of Use" at [our Terms of Use webpage](https://si-imaging.com/page/73) Citation: | - When publicly posting ST products Open Data unedited, use: - "SpaceEye-T © [Year] Satrec Initiative (Licensed under CC BY-NC 4.0)". - - When publicly posting derivative data created by using ST products Open Data, use: - "SpaceEye-T-derived data © [Year] Satrec Initiative (Originally licensed under CC BY-NC 4.0)". + When publicly post of ST products Open Data unedited, "SpaceEye-T © [Year] Satrec Initiative (Licensed under CC BY 4.0)". + When publicly post derivative data created by using ST products Open Data, "SpaceEye-T-derived data © [Year] Satrec Initiative (Originally licensed under CC BY 4.0)". 
ManagedBy: "[SI Imaging Services](https://www.si-imaging.com/)" Resources: - Description: SpaceEye-T Imagery Collection From 625cdcc084aa8bbc934ca6d66a76e77d8f6cca1e Mon Sep 17 00:00:00 2001 From: mhuynh-au <76926809+mhuynh-au@users.noreply.github.com> Date: Fri, 10 Oct 2025 13:43:22 +0800 Subject: [PATCH 432/751] Update askap.yaml Adding SNS Topic Resource --- datasets/askap.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/askap.yaml b/datasets/askap.yaml index c49384de4..f38b882d2 100644 --- a/datasets/askap.yaml +++ b/datasets/askap.yaml @@ -27,6 +27,10 @@ Resources: Region: ap-southeast-2 Type: S3 Bucket RequesterPays: False + - Description: Notifications for new data + ARN: arn:aws:sns:ap-southeast-2:336305517014:racs-low1-object_created + Region: sp-southeast-2 + Type: SNS Topic DataAtWork: Tutorials: - Title: CSIRO ASKAP Science Data Archive User Guide From 1be3f1857bc67eae9f39f4bef1ac1eb772422ada Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 10 Oct 2025 09:08:29 -0800 Subject: [PATCH 433/751] ok: Update st-open-data.yaml From d5708cead35f477c61d5a2c369afaf2a9a169dfb Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 10 Oct 2025 09:16:40 -0800 Subject: [PATCH 434/751] ok: Update askap.yaml --- datasets/askap.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/askap.yaml b/datasets/askap.yaml index f38b882d2..53859b925 100644 --- a/datasets/askap.yaml +++ b/datasets/askap.yaml @@ -27,7 +27,7 @@ Resources: Region: ap-southeast-2 Type: S3 Bucket RequesterPays: False - - Description: Notifications for new data + - Description: Notifications for new Rapid ASKAP Continuum Survey (RACS) data ARN: arn:aws:sns:ap-southeast-2:336305517014:racs-low1-object_created Region: sp-southeast-2 Type: SNS Topic From 8b68dd4e38534322e84ab37e5bbb2636c620c110 Mon Sep 17 00:00:00 2001 From: blahner Date: Sun, 12 Oct 2025 14:10:42 -0700 Subject:
[PATCH 435/751] added more mosaic tutorials --- datasets/mosaic.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/datasets/mosaic.yaml b/datasets/mosaic.yaml index 6bda21665..9b7751849 100644 --- a/datasets/mosaic.yaml +++ b/datasets/mosaic.yaml @@ -23,6 +23,20 @@ Resources: - '[Browse Bucket](https://mosaicfmri.s3.amazonaws.com/index.html)' DataAtWork: Tutorials: + - Title: Preprocess fMRI datasets with MOSAIC shared pipeline + URL: https://github.com/blahner/mosaic-preprocessing + AuthorName: Benjamin Lahner + - Title: MOSAIC Python package (mosaic-dataset) + URL: https://pypi.org/project/mosaic-dataset/ + AuthorName: Mayukh Deb + - Title: Download MOSAIC data, visualize fMRI responses, load and run brain-optimized models (Jupyter notebook) + URL: https://github.com/murtylab/mosaic-dataset/blob/master/examples/mosaic-starter.ipynb + NotebookURL: https://github.com/murtylab/mosaic-dataset/blob/master/examples/mosaic-starter.ipynb + AuthorName: Mayukh Deb + - Title: Run a synthetic localizer experiment using MOSAIC's brain-optimized models (Jupyter notebook) + URL: https://github.com/murtylab/mosaic-dataset/blob/master/examples/mosaic_synthetic_localizer.ipynb + NotebookURL: https://github.com/murtylab/mosaic-dataset/blob/master/examples/mosaic_synthetic_localizer.ipynb + AuthorName: Benjamin Lahner - Title: Load HDF5 file (Jupyter notebook) URL: https://github.com/blahner/mosaic-preprocessing/blob/main/src/fmriDatasetPreparation/create_hdf5/load_hdf5.ipynb NotebookURL: https://github.com/blahner/mosaic-preprocessing/blob/main/src/fmriDatasetPreparation/create_hdf5/load_hdf5.ipynb From 4fa42c0ff33ba58b6b1b9a5e4973a385b2aa670c Mon Sep 17 00:00:00 2001 From: blahner Date: Sun, 12 Oct 2025 14:28:54 -0700 Subject: [PATCH 436/751] changed documentation url from preprocessing code to project page url --- datasets/mosaic.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/mosaic.yaml b/datasets/mosaic.yaml index 
9b7751849..8cfdfb6f0 100644 --- a/datasets/mosaic.yaml +++ b/datasets/mosaic.yaml @@ -1,6 +1,6 @@ Name: Meta-Organized Stimuli And fMRI Imaging data for Computational modeling (MOSAIC) Description: This extensible dataset, MOSAIC, aggregates individual functional magnetic resonance imaging (fMRI) datasets by leveraging a shared preprocessing pipeline and stimulus curation procedure. This dataset aggregation procedure achieves the scale necessary for neural network training and the diversity needed for generalizable results. -Documentation: https://github.com/blahner/mosaic-preprocessing +Documentation: https://blahner.github.io/MOSAICfmri/ Contact: blahner@mit.edu ManagedBy: Massachusetts Institute of Technology, Georgia Tech UpdateFrequency: New data is uploaded as researchers preprocess their fMRI data according to MOSAIC format and submit. From a16b3f6328fdb2a6a5648eb3ed884c7fee432883 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 13 Oct 2025 11:00:49 -0800 Subject: [PATCH 437/751] ok: Update mosaic.yaml From 22528428bfee92d1ac2db34e4f1248479d340061 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 14 Oct 2025 12:24:53 -0800 Subject: [PATCH 438/751] ok: Update open-ceda.yaml PR https://github.com/awslabs/open-data-registry/pull/2899 --- datasets/open-ceda.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/open-ceda.yaml b/datasets/open-ceda.yaml index 0add6f10d..4445b922f 100644 --- a/datasets/open-ceda.yaml +++ b/datasets/open-ceda.yaml @@ -1,8 +1,8 @@ Name: Open CEDA by Watershed Description: | CEDA is a multi-regional Environmentally-Extended Input-Output (EEIO) model developed to support a wide range of environmental systems analyses—including corporate carbon accounting and sustainable spend analysis. 
CEDA provides unparalleled global coverage and granularity, representing 95% of the world's GDP across 148 countries and 400 sectors, enabling robust and geographically comprehensive Scope 3 greenhouse gas (GHG) measurement. - Open CEDA is the publicly avaialable version of CEDA, now easy to download and available for free for all use cases. For more information please visit our website at openceda.org - CEDA 2024, the latest version of CEDA, uses 2022 as its base year, ensuring that emissions factors and economic data reflect the most recent global economic landscape available. To maintain accuracy and relevance, CEDA is updated annually with the latest data releases. + Open CEDA is the publicly avaialable version of CEDA, now easy to download and available for free for all use cases. For more information please visit our website at openceda.org. + This data registry entry contains CEDA 2025 and CEDA 2024 in two separate files. CEDA 2025, the latest version of CEDA, uses 2023 as its base year, ensuring that emissions factors and economic data reflect the most recent global economic landscape available. To maintain accuracy and relevance, CEDA is updated annually with the latest data releases. At its core, CEDA connects economic exchanges to GHG emissions by quantifying the life-cycle emissions of products and services. This is achieved through the integration of input-output tables, which represent the full supply-chain network of the global economy, with GHG emissions data. As a result, CEDA provides users with a powerful tool to assess the environmental impacts embedded in corporate value chains. 
Documentation: https://openceda.org/ Contact: ceda-support@watershed.com From c27ffaa90be491d79f11f1e36faf26eb93d4ece9 Mon Sep 17 00:00:00 2001 From: Keisuke Sehara Date: Wed, 15 Oct 2025 11:51:49 +0900 Subject: [PATCH 439/751] Update braidyn-bc_cued-lever-pull.yaml Entered the details of the S3 bucket --- datasets/braidyn-bc_cued-lever-pull.yaml | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/datasets/braidyn-bc_cued-lever-pull.yaml b/datasets/braidyn-bc_cued-lever-pull.yaml index 34dc9fd3a..09afee4de 100644 --- a/datasets/braidyn-bc_cued-lever-pull.yaml +++ b/datasets/braidyn-bc_cued-lever-pull.yaml @@ -27,11 +27,10 @@ Tags: - behavioral tracking License: Creative Commons Attribution 4.0 International (CC-BY 4.0) Resources: - - Description: - ARN: - Region: - Type: - Explore: + - Description: "BraiDyn-BC: Cued lever-pull task dataset" + ARN: arn:aws:s3:::braidyn-bc-buckets + Region: ap-northeast-1 + Type: S3 bucket DataAtWork: Tutorials: - Title: Detailed usage tutorials on Google Colab From ec7573b7498f5711620389025da9b04dc971f830 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Adri=C3=A0=20Amell=20Tosas?= Date: Wed, 15 Oct 2025 23:12:29 +0200 Subject: [PATCH 440/751] RoA entry --- datasets/roa.yaml | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) create mode 100644 datasets/roa.yaml diff --git a/datasets/roa.yaml b/datasets/roa.yaml new file mode 100644 index 000000000..1f5414d74 --- /dev/null +++ b/datasets/roa.yaml @@ -0,0 +1,46 @@ +Name: Rain over Africa +Description: The Rain over Africa (RoA) dataset consists of spaceborn estimates of precipitation of Rain over Africa using only geostationary imagery and obtained through a convolutional and quantile regression neural network. The dataset also contains some uncertainty estimates. 
+Documentation: https://github.com/SEE-GEO/roa +Contact: https://github.com/SEE-GEO/roa +ManagedBy: "[Geoscience and Remote Sensing at Chalmers University of Technology](https://www.chalmers.se/en/departments/see/research/geo)" +UpdateFrequency: At most, yearly +Tags: + - agriculture + - analysis ready data + - atmosphere + - aws-pds + - climate + - deep learning + - earth observation + - geophysics + - geoscience + - hydrology + - machine learning + - precipitation + - satellite imagery + - weather + - zarr +License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" +Citation: "Please refer to https://github.com/SEE-GEO/roa#5-how-to-cite for instructions on how to cite the RoA data." +Resources: + - Description: RoA expected rain rate and quantiles at levels 16% and 84% in Zarr format + ARN: arn:aws:s3:::roa + Region: us-west-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Reading RoA data + URL: https://github.com/SEE-GEO/roa?tab=readme-ov-file#22-reading-roa-data + AuthorName: Adrià Amell + Services: Amazon S3 + - Title: How to use the data + URL: https://github.com/SEE-GEO/roa?tab=readme-ov-file#3-how-to-use-the-data + AuthorName: Adrià Amell + Services: Amazon S3 + Publications: + - Title: Probabilistic near real-time retrievals of Rain over Africa using deep learning + URL: https://doi.org/10.1029/2025JD044595 + AuthorName: Adrià Amell, Lilian Hee, Simon Pfreundschuh, and Patrick Eriksson +DeprecatedNotice: +ADXCategories: + - Environmental Data \ No newline at end of file From bdd1ca4304ec8370bbd9bd4ffb489f6246c2f9a6 Mon Sep 17 00:00:00 2001 From: adriaat Date: Thu, 16 Oct 2025 17:54:39 +0200 Subject: [PATCH 441/751] Update bucket name --- datasets/roa.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/roa.yaml b/datasets/roa.yaml index 1f5414d74..682c99a48 100644 --- a/datasets/roa.yaml +++ b/datasets/roa.yaml @@ -24,7 +24,7 @@ License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" Citation: 
"Please refer to https://github.com/SEE-GEO/roa#5-how-to-cite for instructions on how to cite the RoA data." Resources: - Description: RoA expected rain rate and quantiles at levels 16% and 84% in Zarr format - ARN: arn:aws:s3:::roa + ARN: arn:aws:s3:::rainoverafrica Region: us-west-2 Type: S3 Bucket DataAtWork: From 32a3e2b7f4fd660d256e7835f2f0456d2f73693b Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 16 Oct 2025 10:34:49 -0800 Subject: [PATCH 442/751] ok: Update roa.yaml --- datasets/roa.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/roa.yaml b/datasets/roa.yaml index 682c99a48..2f942a7ee 100644 --- a/datasets/roa.yaml +++ b/datasets/roa.yaml @@ -5,6 +5,7 @@ Contact: https://github.com/SEE-GEO/roa ManagedBy: "[Geoscience and Remote Sensing at Chalmers University of Technology](https://www.chalmers.se/en/departments/see/research/geo)" UpdateFrequency: At most, yearly Tags: + - aws-pds - agriculture - analysis ready data - atmosphere @@ -43,4 +44,4 @@ DataAtWork: AuthorName: Adrià Amell, Lilian Hee, Simon Pfreundschuh, and Patrick Eriksson DeprecatedNotice: ADXCategories: - - Environmental Data \ No newline at end of file + - Environmental Data From edfa107f6933ebb7bb48aa86324d7a9b282a7c13 Mon Sep 17 00:00:00 2001 From: Siyuan-Shen Date: Thu, 16 Oct 2025 16:42:16 -0500 Subject: [PATCH 443/751] Update the contact email Updated contact email for dataset support. --- datasets/surface-pm2-5-v6gl.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/surface-pm2-5-v6gl.yaml b/datasets/surface-pm2-5-v6gl.yaml index f41ab8080..a73763702 100644 --- a/datasets/surface-pm2-5-v6gl.yaml +++ b/datasets/surface-pm2-5-v6gl.yaml @@ -1,7 +1,7 @@ Name: SatPM2.5 Description: Fine particulate matter (PM2.5) concentrations are estimated using information from satellite-, simulation- and monitor-based sources. 
Aerosol optical depth from multiple satellites (MODIS, VIIRS, MISR, SeaWiFS, and VIIRS) and their respective retrievals (Dark Target, Deep Blue, MAIAC) is combined with simulation (GEOS-Chem) based upon their relative uncertainties as determined using ground-based sun photometer (AERONET) observations to produce geophysical estimates that explain most of the variance in ground-based PM2.5 measurements. A subsequent statistical fusion incorporates additional information from ground-based PM2.5 measurements. Documentation: https://sites.wustl.edu/acag/datasets/surface-pm2-5/#V6.GL.02.04 -Contact: randall.martin@wustl.edu +Contact: support@satpm.org ManagedBy: "https://sites.wustl.edu/acag/" UpdateFrequency: Yearly Collabs: From 701107b8554edda51aae1a74d4f0214dcd8b6290 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 16 Oct 2025 14:06:07 -0800 Subject: [PATCH 444/751] ok: Update surface-pm2-5-v6gl.yaml --- datasets/surface-pm2-5-v6gl.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/surface-pm2-5-v6gl.yaml b/datasets/surface-pm2-5-v6gl.yaml index a73763702..eecf20d65 100644 --- a/datasets/surface-pm2-5-v6gl.yaml +++ b/datasets/surface-pm2-5-v6gl.yaml @@ -9,6 +9,7 @@ Collabs: Tags: - climate Tags: + - aws-pds - atmosphere - netcdf - environmental From 44cf6edc514ff85c899540175467b9f12e21a251 Mon Sep 17 00:00:00 2001 From: mhuynh-au <76926809+mhuynh-au@users.noreply.github.com> Date: Fri, 17 Oct 2025 15:54:10 +0800 Subject: [PATCH 445/751] Update askap.yaml updated bucket ARN and SNS Topic ARN --- datasets/askap.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/askap.yaml b/datasets/askap.yaml index 53859b925..6bc3e65a7 100644 --- a/datasets/askap.yaml +++ b/datasets/askap.yaml @@ -23,12 +23,12 @@ Tags: License: CC-BY-4.0. Attribution required for refereed scientific papers. 
Resources: - Description: The Rapid ASKAP Continuum Survey (RACS) Public Data Releases - ARN: arn:aws:s3:::askap/racs + ARN: arn:aws:s3:::askap-odp/racs-low1/ Region: ap-southeast-2 Type: S3 Bucket RequesterPays: False - - Description: Notifications for new Rapid ASKAP Continuum Survey (RACS) data - ARN: arn:aws:sns:ap-southeast-2:336305517014:racs-low1-object_created + - Description: Notifications for ASKAP data + ARN: arn:aws:sns:ap-southeast-2:336305517014:askap-odp-object_created Region: sp-southeast-2 Type: SNS Topic DataAtWork: From 143d734693093ee06bfc235476544923c5e17622 Mon Sep 17 00:00:00 2001 From: mhuynh-au <76926809+mhuynh-au@users.noreply.github.com> Date: Fri, 17 Oct 2025 16:03:15 +0800 Subject: [PATCH 446/751] Update askap.yaml fixed region typo --- datasets/askap.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/askap.yaml b/datasets/askap.yaml index 6bc3e65a7..5a082ff16 100644 --- a/datasets/askap.yaml +++ b/datasets/askap.yaml @@ -29,7 +29,7 @@ Resources: RequesterPays: False - Description: Notifications for ASKAP data ARN: arn:aws:sns:ap-southeast-2:336305517014:askap-odp-object_created - Region: sp-southeast-2 + Region: ap-southeast-2 Type: SNS Topic DataAtWork: Tutorials: From d90e73b48b608566810c5a8614b85e48c69f7071 Mon Sep 17 00:00:00 2001 From: mhuynh-au <76926809+mhuynh-au@users.noreply.github.com> Date: Fri, 17 Oct 2025 16:11:15 +0800 Subject: [PATCH 447/751] Update askap.yaml minor change to description of SNS topic --- datasets/askap.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/askap.yaml b/datasets/askap.yaml index 5a082ff16..252758039 100644 --- a/datasets/askap.yaml +++ b/datasets/askap.yaml @@ -27,7 +27,7 @@ Resources: Region: ap-southeast-2 Type: S3 Bucket RequesterPays: False - - Description: Notifications for ASKAP data + - Description: Notifications for new ASKAP data ARN: arn:aws:sns:ap-southeast-2:336305517014:askap-odp-object_created Region: ap-southeast-2 Type: 
SNS Topic From a38285285a49ff89bd22d9df97726d15f0fbf3b1 Mon Sep 17 00:00:00 2001 From: adriaat Date: Fri, 17 Oct 2025 15:01:31 +0200 Subject: [PATCH 448/751] Update quantiles shared --- datasets/roa.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/roa.yaml b/datasets/roa.yaml index 2f942a7ee..bd39daf9b 100644 --- a/datasets/roa.yaml +++ b/datasets/roa.yaml @@ -24,7 +24,7 @@ Tags: License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" Citation: "Please refer to https://github.com/SEE-GEO/roa#5-how-to-cite for instructions on how to cite the RoA data." Resources: - - Description: RoA expected rain rate and quantiles at levels 16% and 84% in Zarr format + - Description: RoA expected rain rate and quantiles at levels 5%, 16%, 25%, 50%, 75%, 84%, and 95% in Zarr format ARN: arn:aws:s3:::rainoverafrica Region: us-west-2 Type: S3 Bucket From 0b6b7e2f4f97d0fa10485970b5f19524c1396108 Mon Sep 17 00:00:00 2001 From: Christopher Tabone Date: Fri, 17 Oct 2025 17:58:38 +0000 Subject: [PATCH 449/751] Add Alliance of Genome Resources dataset --- datasets/alliance-genome-resources.yaml | 86 +++++++++++++++++++++++++ 1 file changed, 86 insertions(+) create mode 100644 datasets/alliance-genome-resources.yaml diff --git a/datasets/alliance-genome-resources.yaml b/datasets/alliance-genome-resources.yaml new file mode 100644 index 000000000..1a9f45b51 --- /dev/null +++ b/datasets/alliance-genome-resources.yaml @@ -0,0 +1,86 @@ +Name: Alliance of Genome Resources +Description: The Alliance of Genome Resources is a consortium that integrates genomic, genetic, and molecular data from leading model organism databases including Drosophila melanogaster, Caenorhabditis elegans, Danio rerio (zebrafish), Mus musculus (mouse), Rattus norvegicus (rat), Saccharomyces cerevisiae (yeast), Xenopus laevis and Xenopus tropicalis (frogs), and human reference data. 
The Alliance provides comprehensive datasets including gene annotations, disease associations, expression data (bulk and single-cell RNA-Seq), protein and genetic interactions, orthology relationships, variants and alleles, and complete genome sequences with annotations. Data is organized into Alliance-wide integrated datasets and organism-specific collections, supporting comparative genomics, disease modeling, and functional genomics research. +Documentation: https://github.com/alliance-genome/agr_open_data +Contact: help@alliancegenome.org +ManagedBy: Alliance of Genome Resources Consortium +UpdateFrequency: Quarterly releases (every ~3 months) +Tags: + - aws-pds + - genomic + - bioinformatics + - biology + - gene expression + - life sciences + - genetic + - genome + - Drosophila melanogaster + - Caenorhabditis elegans + - Danio rerio + - Mus musculus + - Rattus norvegicus + - Homo sapiens + - disease + - transcriptomics + - protein + - vcf + - fasta +License: Most Alliance data is available under CC0 1.0 Universal (Public Domain Dedication). Some datasets may use CC-BY 4.0 (attribution required). Full details at https://www.alliancegenome.org/terms-of-use +Citation: Alliance of Genome Resources Consortium. Alliance of Genome Resources Portal - unified model organism research platform. Nucleic Acids Research (2023). https://doi.org/10.1093/nar/gkac1003 +Resources: + - Description: Alliance-wide integrated datasets including disease associations, gene expression, molecular and genetic interactions, orthology relationships, and gene descriptions across all Alliance organisms. Data organized by release version (8.3.0, 8.2.0, etc.) and data type. Includes combined data files and organism-specific collections for FB (FlyBase/Drosophila), MGI (Mouse), RGD (Rat), SGD (Yeast), WB (Worm), XBXL/XBXT (Xenopus), ZFIN (Zebrafish), and HUMAN reference data. Files are available in TSV, JSON, and specialized formats (PSI-MI TAB for interactions, VCF for variants). 
+ ARN: arn:aws:s3:::mod-datadumps + Region: us-east-1 + Type: S3 Bucket + - Description: FlyBase-specific data for Drosophila melanogaster and related species, including gene annotations, GO annotations, expression data (bulk RNA-Seq, single-cell RNA-Seq), disease associations, phenotypes, interactions, orthologs, genome sequences (FASTA), and genome annotations (GFF3/GTF). Data organized by release (current/, FB2025_04/, etc.) with precomputed analysis files and complete Chado XML database dumps. Publicly accessible via HTTPS for direct download without AWS credentials. + ARN: arn:aws:s3:::s3ftp.flybase.org + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse via HTTPS](https://s3ftp.flybase.org/releases/current/)' +DataAtWork: + Tutorials: + - Title: Alliance of Genome Resources AWS Data Access Tutorials + URL: https://github.com/alliance-genome/agr_open_data/blob/main/TUTORIAL.md + AuthorName: Alliance of Genome Resources Consortium + AuthorURL: https://www.alliancegenome.org + Services: + - S3 + Tools & Applications: + - Title: Alliance of Genome Resources Portal + URL: https://www.alliancegenome.org + AuthorName: Alliance of Genome Resources Consortium + AuthorURL: https://www.alliancegenome.org + - Title: FlyBase - Drosophila Database + URL: https://flybase.org + AuthorName: FlyBase Consortium + AuthorURL: https://flybase.org + - Title: WormBase - C. 
elegans Database + URL: https://www.wormbase.org + AuthorName: WormBase Consortium + AuthorURL: https://www.wormbase.org + - Title: ZFIN - Zebrafish Database + URL: https://zfin.org + AuthorName: ZFIN + AuthorURL: https://zfin.org + - Title: MGI - Mouse Genome Database + URL: http://www.informatics.jax.org + AuthorName: MGI + AuthorURL: http://www.informatics.jax.org + - Title: RGD - Rat Genome Database + URL: https://rgd.mcw.edu + AuthorName: RGD + AuthorURL: https://rgd.mcw.edu + - Title: SGD - Saccharomyces Genome Database + URL: https://www.yeastgenome.org + AuthorName: SGD + AuthorURL: https://www.yeastgenome.org + - Title: Xenbase - Xenopus Database + URL: http://www.xenbase.org + AuthorName: Xenbase + AuthorURL: http://www.xenbase.org + Publications: + - Title: Alliance of Genome Resources Portal - unified model organism research platform + URL: https://doi.org/10.1093/nar/gkac1003 + AuthorName: Alliance of Genome Resources Consortium +ADXCategories: + - Healthcare & Life Sciences Data From b021f3b0cd81def8b5191881b91ce04d4445ab99 Mon Sep 17 00:00:00 2001 From: adriaat Date: Sat, 18 Oct 2025 00:07:07 +0200 Subject: [PATCH 450/751] Adding SNS Topic resource --- datasets/roa.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/roa.yaml b/datasets/roa.yaml index bd39daf9b..0f5320729 100644 --- a/datasets/roa.yaml +++ b/datasets/roa.yaml @@ -28,6 +28,10 @@ Resources: ARN: arn:aws:s3:::rainoverafrica Region: us-west-2 Type: S3 Bucket + - Description: Description: Notifications for new Rain over Africa data + ARN: arn:aws:sns:us-west-2:261854712492:rainoverafrica-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Tutorials: - Title: Reading RoA data From 3bbccea5f3f9a90e4594412ed3e6e5173416d359 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 17 Oct 2025 14:10:28 -0800 Subject: [PATCH 451/751] ok: Update roa.yaml From a76cd0d6d39a61b9f612db2a834100dc8f19bd72 Mon Sep 17 00:00:00 2001 From: 
cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 17 Oct 2025 14:17:30 -0800 Subject: [PATCH 452/751] ok: Update roa.yaml --- datasets/roa.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/roa.yaml b/datasets/roa.yaml index 0f5320729..9143b6924 100644 --- a/datasets/roa.yaml +++ b/datasets/roa.yaml @@ -28,7 +28,7 @@ Resources: ARN: arn:aws:s3:::rainoverafrica Region: us-west-2 Type: S3 Bucket - - Description: Description: Notifications for new Rain over Africa data + - Description: Notifications for new Rain over Africa data ARN: arn:aws:sns:us-west-2:261854712492:rainoverafrica-object_created Region: us-west-2 Type: SNS Topic From 79b0cb41ab0a2de194d305909a9639ce52626966 Mon Sep 17 00:00:00 2001 From: Ziang Liu Date: Fri, 17 Oct 2025 19:45:16 -0400 Subject: [PATCH 453/751] Add OpenRoboCare Dataset --- datasets/open-robo-care.yaml | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) create mode 100644 datasets/open-robo-care.yaml diff --git a/datasets/open-robo-care.yaml b/datasets/open-robo-care.yaml new file mode 100644 index 000000000..d6fd9da7f --- /dev/null +++ b/datasets/open-robo-care.yaml @@ -0,0 +1,28 @@ +Name: OpenRoboCare Multi-Modal Expert Demonstration Dataset for Robot-Assisted Caregiving +Description: A comprehensive multi-modal dataset capturing real-world caregiving routines from 21 occupational therapists performing 15 daily caregiving tasks. The dataset includes synchronized RGB-D video, tactile sensing, eye-gaze tracking, pose annotations, and action labels across 315 sessions totaling 19.8 hours of expert demonstrations. Data modalities include anonymized RGB images, depth maps, 44-sensor tactile readings, 2D/3D pose tracking, temporal action annotations, and first/third-person videos, enabling research in robot learning from demonstration, multimodal perception, and safe human-robot interaction for caregiving applications. 
+Documentation: https://emprise.cs.cornell.edu/robo-care/docs +Contact: https://emprise.cs.cornell.edu/robo-care/ +ManagedBy: "[EmPRISE Lab at Cornell University](https://emprise.cs.cornell.edu/)" +UpdateFrequency: Static dataset - no regular updates planned +Tags: + - computer vision + - robotics + - machine learning + - health +License: "BSD-3-Clause license - Academic and non-commercial use permitted. See documentation for full terms." +Citation: "Liang, X., Liu, Z., Lin, K., Gu, E., Ye, R., Nguyen, T., Hsu, C., Wu, Z., Yang, X., Cheung, C.S.Y., Soh, H., Dimitropoulou, K., & Bhattacharjee, T. (2025). OpenRoboCare: A Multimodal Multi-Task Expert Demonstration Dataset for Robot Caregiving. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)." +Resources: + - Description: Full Dataset + ARN: arn:aws:s3:::openrobocare + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: OpenRoboCare Dataset Viewer + URL: https://emprise.cs.cornell.edu/robo-care-viewer/ + AuthorName: Cornell University EmPRISE Lab + AuthorURL: https://emprise.cs.cornell.edu/ + Publications: + - Title: "OpenRoboCare: A Multimodal Multi-Task Expert Demonstration Dataset for Robot Caregiving" + URL: https://emprise.cs.cornell.edu/robo-care/ + AuthorName: Liang X, Liu Z, Lin K, et al. \ No newline at end of file From e2859c1c67abd1f895bf5dae133c671b667eebe4 Mon Sep 17 00:00:00 2001 From: Adam Tyson Date: Mon, 20 Oct 2025 10:23:47 +0100 Subject: [PATCH 454/751] Create brainglobe.yaml --- datasets/brainglobe.yaml | 45 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) create mode 100644 datasets/brainglobe.yaml diff --git a/datasets/brainglobe.yaml b/datasets/brainglobe.yaml new file mode 100644 index 000000000..19df95b0f --- /dev/null +++ b/datasets/brainglobe.yaml @@ -0,0 +1,45 @@ +Name: BrainGlobe Atlases +Description: BrainGlobe provides an archive and standardised interface to anatomical atlases from multiple species. 
This dataset includes these atlases, and other data (e.g. sample neuroanatomy data) to allow the greatest use of the atlases. +Documentation: https://brainglobe.info/documentation/brainglobe-atlasapi/usage/atlas-details.html +Contact: hello@brainglobe.info +ManagedBy: [BrainGlobe](https://brainglobe.info/) +UpdateFrequency: When new atlases are packaged +Tags: + - biology + - digital preservation + - Homo sapiens + - image processing + - imaging + - light-sheet microscopy + - magnetic resonance imaging + - medical imaging + - microscopy + - Mus musculus + - neurobiology + - neuroimaging + - neuroscience + - Rattus norvegicus + - volumetric imaging + - zarr + +License: Creative Commons CC0 1.0 Universal +Citation: Claudi et al., (2020). BrainGlobe Atlas API: a common interface for neuroanatomical atlases. Journal of Open Source Software, 5(54), 2668, https://doi.org/10.21105/joss.02668 +Resources: + - Description: Atlases and sample data in a public bucket + ARN: + Region: + Type: + Explore: +DataAtWork: + Tutorials: + - Title: + URL: https://brainglobe.info/aws_examples/explore_atlas.html + NotebookURL: + AuthorName: Alessandro Felder + AuthorURL: https://github.com/alessandrofelder + Services: + Publications: + - Title: BrainGlobe Atlas API: a common interface for neuroanatomical atlases + URL: https://doi.org/10.21105/joss.02668 + AuthorName: Federico Claudi, Luigi Petrucco, Adam L. Tyson et al. 
+ AuthorURL: https://brainglobe.info/ From 0e851d0774f380a3ce7fd43228311180f363fc1b Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 20 Oct 2025 15:50:39 -0400 Subject: [PATCH 455/751] ok: Update huj-herbarium.yaml removing blank fields --- datasets/huj-herbarium.yaml | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/datasets/huj-herbarium.yaml b/datasets/huj-herbarium.yaml index 5ca590aec..37f9ac309 100644 --- a/datasets/huj-herbarium.yaml +++ b/datasets/huj-herbarium.yaml @@ -32,17 +32,6 @@ DataAtWork: NotebookURL: AuthorName: Eyal Ben-Hur AuthorURL: - Services: - Tools & Applications: - - Title: - URL: - AuthorName: - AuthorURL: - Publications: - - Title: - URL: - AuthorName: - AuthorURL: DeprecatedNotice: ADXCategories: - Healthcare & Life Sciences Data @@ -50,3 +39,4 @@ ADXCategories: + From 3819a4e4a27cce86bc041e69b8a8f0a58221a401 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 21 Oct 2025 14:41:40 -0400 Subject: [PATCH 456/751] ok: Update huj-herbarium.yaml Adding documentation --- datasets/huj-herbarium.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/huj-herbarium.yaml b/datasets/huj-herbarium.yaml index 37f9ac309..59b891572 100644 --- a/datasets/huj-herbarium.yaml +++ b/datasets/huj-herbarium.yaml @@ -3,7 +3,7 @@ Description: Our collection encompasses approximately one million vascular plant specimens from the Mediterranean and Middle East biodiversity hotspot, representing flora from Israel, Jordan, Hermon, Sinai, Egypt, the Caucasus, Arabia, North Africa, and throughout the Mediterranean basin. This scientifically significant repository includes published voucher specimens, original specimens used for "Flora Palaestina" illustrations, and critical references for the Israeli gene bank collections. The ongoing digitization process captures high-resolution images of each specimen while systematically incorporating label information into our computerized catalog. 
This virtual herbarium will democratize access to these valuable botanical resources, enabling global researchers to examine specimens in exceptional detail from anywhere in the world. Beyond preservation, this digital transformation unlocks new research possibilities through computational analysis of both visual specimen characteristics and associated metadata. The dataset will serve as a foundational resource for advancing botanical research, ecological modeling, taxonomic investigation, historical analysis, and numerous other scientific disciplines concerned with plant biodiversity in this ecologically and historically significant region. -Documentation: +Documentation: https://bit.ly/HUJVirtualHerbarium Contact: Eyal.Ben-Hur@mail.huji.ac.il ManagedBy: National Natural History Collections, The Hebrew University of Jerusalem UpdateFrequency: Monthly @@ -40,3 +40,4 @@ ADXCategories: + From b394322b4f6b4db6ebf7313f2fd7186356014313 Mon Sep 17 00:00:00 2001 From: willmacs <103065262+willmacs@users.noreply.github.com> Date: Tue, 21 Oct 2025 15:43:23 -0400 Subject: [PATCH 457/751] Link to Radiant Earth's STAC browser, instead --- datasets/rcm-ceos-ard.yaml | 2 +- datasets/sentinel-products-ca-mirror.yaml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/rcm-ceos-ard.yaml b/datasets/rcm-ceos-ard.yaml index 3a77ae523..8f11d8a70 100644 --- a/datasets/rcm-ceos-ard.yaml +++ b/datasets/rcm-ceos-ard.yaml @@ -42,7 +42,7 @@ Resources: Region: ca-central-1 Type: S3 Bucket Explore: - - '[EODMS STAC for RCM CEOS ARD](https://www.eodms-sgdot.nrcan-rncan.gc.ca/stac/collections/rcm-ard/items/)' + - '[STAC for RCM CEOS ARD products](https://radiantearth.github.io/stac-browser/#/external/www.eodms-sgdot.nrcan-rncan.gc.ca/stac/collections/rcm-ard?.language=en)' DataAtWork: Tutorials: - Title: Workflows for accessing and manipulating RCM ARD SpatioTemporal Asset Catalog (STAC) in JupyterLab Python Notebooks - Flux de travail pour accéder et manipuler le 
catalogue d'actifs spatio-temporels (STAC) RCM ARD dans les notebooks Python JupyterLab diff --git a/datasets/sentinel-products-ca-mirror.yaml b/datasets/sentinel-products-ca-mirror.yaml index 76c5e9d7d..a14da8007 100644 --- a/datasets/sentinel-products-ca-mirror.yaml +++ b/datasets/sentinel-products-ca-mirror.yaml @@ -35,5 +35,5 @@ Resources: Region: ca-central-1 Type: S3 Bucket Explore: - - '[EODMS STAC for Sentinel products](https://www.eodms-sgdot.nrcan-rncan.gc.ca/stac/)' + - '[STAC for Sentinel products](https://radiantearth.github.io/stac-browser/#/external/www.eodms-sgdot.nrcan-rncan.gc.ca/stac/collections/sentinel-1)' From d957dd3b4f1c564c768ef5e5f106a5923c63ea11 Mon Sep 17 00:00:00 2001 From: Vidit Agrawal <91577322+viditagr@users.noreply.github.com> Date: Wed, 22 Oct 2025 13:20:24 -0500 Subject: [PATCH 458/751] Add CHAMMI-75 dataset with starting description --- datasets/chammi-75.yaml | 52 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) create mode 100644 datasets/chammi-75.yaml diff --git a/datasets/chammi-75.yaml b/datasets/chammi-75.yaml new file mode 100644 index 000000000..de9db6070 --- /dev/null +++ b/datasets/chammi-75.yaml @@ -0,0 +1,52 @@ +Name: CHAMMI-75 +Description: | + Quantifying cell morphology using images and machine learning models has proven to be a powerful tool to study the response of cells to treatments. + However, the models used to quantify cellular morphology are typically trained with a single microscopy imaging type and under controlled experimental conditions. + This results in specialized models that cannot be reused across biological studies because the technical specifications do not match (e.g., different number of channels), + or because the target experimental conditions are out of distribution. We have created CHAMMI-75, a large-scale dataset containing 2.8 million multi-channel, + high-resolution images curated from 75 diverse, publicly available biological studies. 
This dataset is useful to investigate and develop channel-adaptive models, + which could process microscopy images of varying technical specifications and regardless of the number of channels. By breaking the limitations of existing models, + CHAMMI-75 is an invaluable resource for creating the next generation of foundation models for image-based biological research. +Documentation: +Contact: Contact via email Juan Caicedo, juan.caicedo@wisc.edu +ManagedBy: Morgridge Institute for Research +UpdateFrequency: Every 2 years +Tags: + - aws-pds + - microscopy + - machine learning + - biology + - life sciences + - imaging + - high-throughput imaging + - cell imaging + - fluorescence imaging +License: CC BY 4.0 License +Citation: +Resources: + - Description: Images, training set and evaluation set available in an S3 bucket + ARN: + Region: + Type: + Explore: +DataAtWork: + Tutorials: + - Title: + URL: + NotebookURL: + AuthorName: + AuthorURL: + Services: + Tools & Applications: + - Title: + URL: + AuthorName: + AuthorURL: + Publications: + - Title: + URL: + AuthorName: + AuthorURL: +DeprecatedNotice: +ADXCategories: + - Healthcare & Life Sciences Data From ad67d46c0077a9d00045b927cbf2f510595af128 Mon Sep 17 00:00:00 2001 From: Vidit Agrawal <91577322+viditagr@users.noreply.github.com> Date: Wed, 22 Oct 2025 13:29:27 -0500 Subject: [PATCH 459/751] update docs link --- datasets/chammi-75.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/chammi-75.yaml b/datasets/chammi-75.yaml index de9db6070..e6a89dc83 100644 --- a/datasets/chammi-75.yaml +++ b/datasets/chammi-75.yaml @@ -7,7 +7,7 @@ Description: | high-resolution images curated from 75 diverse, publicly available biological studies. This dataset is useful to investigate and develop channel-adaptive models, which could process microscopy images of varying technical specifications and regardless of the number of channels. 
By breaking the limitations of existing models, CHAMMI-75 is an invaluable resource for creating the next generation of foundation models for image-based biological research. -Documentation: +Documentation: https://github.com/CaicedoLab/CHAMMI-75 Contact: Contact via email Juan Caicedo, juan.caicedo@wisc.edu ManagedBy: Morgridge Institute for Research UpdateFrequency: Every 2 years From f11bcb434ecb01e4eef592c6c9f1142dc9df9e04 Mon Sep 17 00:00:00 2001 From: Vidit Agrawal <91577322+viditagr@users.noreply.github.com> Date: Wed, 22 Oct 2025 13:46:26 -0500 Subject: [PATCH 460/751] Add tutorial entry for CHAMMI-75 dataset --- datasets/chammi-75.yaml | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/datasets/chammi-75.yaml b/datasets/chammi-75.yaml index e6a89dc83..ae4b0a223 100644 --- a/datasets/chammi-75.yaml +++ b/datasets/chammi-75.yaml @@ -31,12 +31,12 @@ Resources: Explore: DataAtWork: Tutorials: - - Title: - URL: - NotebookURL: - AuthorName: + - Title: Get To Know A Dataset: CHAMMI-75 + URL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/get-to-know-a-dataset-template.ipynb + NotebookURL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/get-to-know-a-dataset-template.ipynb + AuthorName: Vidit Agrawal, Juan Caicedo AuthorURL: - Services: + Services: Getting to know a dataset Tools & Applications: - Title: URL: From f86254db518b0136807ba81987b9c9a2aca9502c Mon Sep 17 00:00:00 2001 From: Vidit Agrawal <91577322+viditagr@users.noreply.github.com> Date: Wed, 22 Oct 2025 13:47:25 -0500 Subject: [PATCH 461/751] Remove 'aws-pds' tag from chammi-75.yaml Removed 'aws-pds' tag from the dataset. 
--- datasets/chammi-75.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/chammi-75.yaml b/datasets/chammi-75.yaml index ae4b0a223..9c9f0039f 100644 --- a/datasets/chammi-75.yaml +++ b/datasets/chammi-75.yaml @@ -12,7 +12,6 @@ Contact: Contact via email Juan Caicedo, juan.caicedo@wisc.edu ManagedBy: Morgridge Institute for Research UpdateFrequency: Every 2 years Tags: - - aws-pds - microscopy - machine learning - biology From bace312b94b3fac71e0b993cfa6a4cb0581b3507 Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Wed, 22 Oct 2025 15:10:27 -0500 Subject: [PATCH 462/751] Update documentation URL to be https in dandiarchive.yaml --- datasets/dandiarchive.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/dandiarchive.yaml b/datasets/dandiarchive.yaml index 516561178..8a340f5b4 100644 --- a/datasets/dandiarchive.yaml +++ b/datasets/dandiarchive.yaml @@ -8,7 +8,7 @@ Description: > [BIDS - Brain Imaging Data Structure](https://bids.neuroimaging.io/), and [NIDM - Neuro Imaging Data Model](http://nidm.nidash.org/). Development of DANDI is supported by the National Institute of Mental Health. -Documentation: http://dandiarchive.org +Documentation: https://dandiarchive.org Contact: '[DANDI Archive Help Desk](https://github.com/dandi/helpdesk/issues/new/choose)' ManagedBy: '[DANDI Archive](https://about.dandiarchive.org/team)' UpdateFrequency: New datasets deposited every month From 36804abe0815bd18ac5ceba6d5831d6b754473c3 Mon Sep 17 00:00:00 2001 From: Ben Massey <127201321+mo-ben-massey@users.noreply.github.com> Date: Thu, 23 Oct 2025 10:39:43 +0100 Subject: [PATCH 463/751] Update wis2global cache Added the bucket ARN, updated its region and the Explore URL. Also added the SNS topic definition. 
--- datasets/wis2-global-cache.yaml | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/datasets/wis2-global-cache.yaml b/datasets/wis2-global-cache.yaml index 0b07181e7..afa398a85 100644 --- a/datasets/wis2-global-cache.yaml +++ b/datasets/wis2-global-cache.yaml @@ -23,11 +23,15 @@ Tags: License: There are no restrictions on the use of this data. Attribution of original source is requested. Resources: - Description: Core data as defined in the [WMO Unified Data Policy (Resolution 1 (Cg-19))](https://library.wmo.int/idurl/4/58009) and the [initial Catalogue of Core Data](http://library.wmo.int/doc_num.php?explnum_id=11001#page=139). Data covers a global extent. Data is provided in WMO approved formats - GRIB, BUFR and NetCDF. Users should subscribe to receive notification messages about newly available data. Please refer to the documentation. - ARN: arn:aws:s3:::wis2-global-cache - Region: eu-west-2 + ARN: arn:aws:s3:::wis2globalcache + Region: us-east-1 Type: S3 Bucket Explore: - - '[Browse Bucket](https://wis2-global-cache.s3.amazonaws.com/index.html)' + - '[Browse Bucket](https://wis2globalcache.s3.us-east-1.amazonaws.com/index.html)' + - Description: Notifications for new wis2globalcache data + ARN: arn:aws:sns:us-east-1:211648629506:wis2globalcache-object_created + Region: us-east-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: WIS 2.0 video for 19th World Meterological Congress From bc37bee4947d9068ef28c24cd7f643a66ea6307f Mon Sep 17 00:00:00 2001 From: crichica <148996603+crichica@users.noreply.github.com> Date: Thu, 23 Oct 2025 10:42:27 -0400 Subject: [PATCH 464/751] ok: Update wis2-global-cache.yaml --- datasets/wis2-global-cache.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/wis2-global-cache.yaml b/datasets/wis2-global-cache.yaml index afa398a85..99e23ea97 100644 --- a/datasets/wis2-global-cache.yaml +++ b/datasets/wis2-global-cache.yaml @@ -46,3 +46,4 @@ DataAtWork: AuthorName: World Meteorological 
Organisation ADXCategories: - Environmental Data + From bf9ad6cac73ab245aa8600dde8ef64d6f1cc5c51 Mon Sep 17 00:00:00 2001 From: Allison Heath Date: Thu, 23 Oct 2025 10:51:56 -0400 Subject: [PATCH 465/751] :sparkles: initial ARN for CBTN --- datasets/radiant.yaml | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/datasets/radiant.yaml b/datasets/radiant.yaml index 1f4e1e1c3..fb5cd6b9d 100644 --- a/datasets/radiant.yaml +++ b/datasets/radiant.yaml @@ -33,7 +33,11 @@ Tags: - whole genome sequencing License: "NIH Genomic Data Sharing Policy: https://grants.nih.gov/grants/guide/notice-files/not-od-14-124.html" Resources: - +- Description: "Children's Brain Tumor Network" + ARN: arn:aws:s3:::opendata-chop-study-us-east-1-prd-sd-bhjxbdqk + Region: us-east-1 + Type: S3 Bucket + ControlledAccess: https://cbtn.org/ DataAtWork: Tools & Applications: From 816f99512f0f19cbad9af4ebe818bf6362a1bf22 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 24 Oct 2025 15:09:35 -0400 Subject: [PATCH 466/751] ok: Update radiant.yaml --- datasets/radiant.yaml | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/datasets/radiant.yaml b/datasets/radiant.yaml index fb5cd6b9d..d2e351598 100644 --- a/datasets/radiant.yaml +++ b/datasets/radiant.yaml @@ -12,7 +12,6 @@ Description: > radiology and pathology imaging data. Data are collected or generated as part of consent-based, IRB-approved observational or interventional studies with the goal of making it available globally to researchers across a broad number of disciplines. 
- Documentation: https://cbtn.org/research-resources Contact: research@cbtn.org ManagedBy: "[The Center for Data-Driven Discovery in Biomedicine (D3b) at the Children's Hospital of Philadelphia](https://d3b.center/)" @@ -38,7 +37,6 @@ Resources: Region: us-east-1 Type: S3 Bucket ControlledAccess: https://cbtn.org/ - DataAtWork: Tools & Applications: - Title: RADIANT Source Code @@ -84,4 +82,4 @@ DataAtWork: AuthorName: Joshua A Shapiro, Krutika S Gaonkar, Stephanie J Spielman, et al. - Title: "Generation and multi-dimensional profiling of a childhood cancer cell line atlas defines new therapeutic opportunities" URL: https://pubmed.ncbi.nlm.nih.gov/37001527/ - AuthorName: Claire Xin Sun, Paul Daniel, Gabrielle Bradshaw et al. \ No newline at end of file + AuthorName: Claire Xin Sun, Paul Daniel, Gabrielle Bradshaw et al. From 1a0254507cb1d17dee734922115319ceaa5d6931 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 24 Oct 2025 15:28:30 -0400 Subject: [PATCH 467/751] ok: Update dandiarchive.yaml --- datasets/dandiarchive.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/dandiarchive.yaml b/datasets/dandiarchive.yaml index 8a340f5b4..70dbb1296 100644 --- a/datasets/dandiarchive.yaml +++ b/datasets/dandiarchive.yaml @@ -43,4 +43,3 @@ DataAtWork: URL: https://hub.dandiarchive.org/ AuthorName: DANDI Project AuthorURL: https://dandiarchive.org/ - Publications: From 348d21b0327a680ec00cb1ebab0cf67f26c48963 Mon Sep 17 00:00:00 2001 From: Troy Raen Date: Thu, 23 Oct 2025 18:36:21 -0700 Subject: [PATCH 468/751] Add QR2 to spherex-qr --- datasets/spherex-qr.yaml | 56 ++++++++++++++++++++++++---------------- 1 file changed, 34 insertions(+), 22 deletions(-) diff --git a/datasets/spherex-qr.yaml b/datasets/spherex-qr.yaml index c4238ef77..66e695c0d 100644 --- a/datasets/spherex-qr.yaml +++ b/datasets/spherex-qr.yaml @@ -1,5 +1,5 @@ Name: 'SPHEREx Quick Release (QR): An All-Sky Spectral Survey' -Description: 'The Spectro-Photometer for the History of the 
Universe, Epoch of Reionization, and Ices Explorer (SPHEREx) is a NASA Astrophysics Medium-class Explorer (MIDEX) mission launched in March 2025. During its planned two-year mission, SPHEREx will perform the first ever all-sky spectral survey in the optical to near-infrared (0.75-5 microns). SPHEREx Quick Release (QR) is the first data release. SPHEREx data will be used to probe inflation and the early universe, trace the history of galactic light production, and investigate the origin of planetary systems and biogenic ices, in addition to contributing to many other astrophysics research topics.' +Description: 'The Spectro-Photometer for the History of the Universe, Epoch of Reionization, and Ices Explorer (SPHEREx) is a NASA Astrophysics Medium-class Explorer (MIDEX) mission launched in March 2025. During its planned two-year mission, SPHEREx will perform the first ever all-sky spectral survey in the optical to near-infrared (0.75-5 microns). SPHEREx data will be used to probe inflation and the early universe, trace the history of galactic light production, and investigate the origin of planetary systems and biogenic ices, in addition to contributing to many other astrophysics research topics. IRSA began releasing SPHEREx QR2 data on a weekly basis in October 2025. QR2 features substantially improved calibrations and supersedes QR1.' 
Documentation: https://irsa.ipac.caltech.edu/Missions/spherex.html Contact: https://irsa.ipac.caltech.edu/docs/help_desk.html ManagedBy: "NASA/IPAC Infrared Science Archive ([IRSA](https://irsa.ipac.caltech.edu)) at Caltech" @@ -12,64 +12,76 @@ Tags: - satellite imagery - survey License: https://irsa.ipac.caltech.edu/data_use_terms.html -Citation: 'If you use SPHEREx data from the IRSA archive, please cite the appropriate Digital Object Identifier: [10.26131/IRSA629](https://www.ipac.caltech.edu/doi/irsa/10.26131/IRSA629), include the following acknowledgement: "This publication makes use of data products from the Spectro-Photometer for the History of the Universe, Epoch of Reionization and Ices Explorer (SPHEREx), which is a joint project of the Jet Propulsion Laboratory and the California Institute of Technology, and is funded by the National Aeronautics and Space Administration.", and follow the [IRSA acknowledgement guidelines](https://irsa.ipac.caltech.edu/ack.html).' +Citation: 'If you use SPHEREx data from the IRSA archive, please cite the appropriate Digital Object Identifier: [10.26131/IRSA629](https://www.ipac.caltech.edu/doi/irsa/10.26131/IRSA629) for QR1 or [10.26131/IRSA652](https://www.ipac.caltech.edu/doi/irsa/10.26131/IRSA652) for QR2, include the following acknowledgement: "This publication makes use of data products from the Spectro-Photometer for the History of the Universe, Epoch of Reionization and Ices Explorer (SPHEREx), which is a joint project of the Jet Propulsion Laboratory and the California Institute of Technology, and is funded by the National Aeronautics and Space Administration.", and follow the [IRSA acknowledgement guidelines](https://irsa.ipac.caltech.edu/ack.html).' Resources: - - Description: 'Spectral Images: Calibrated Spectral Images plus per-pixel status and processing flags, variance map, zodiacal model, exposure-averaged PSF, and wavelength WCS. Multi-extension FITS format.' 
- ARN: arn:aws:s3:::nasa-irsa-spherex/qr/level2 + - Description: 'QR2 Spectral Images: Calibrated Spectral Images plus per-pixel status and processing flags, variance map, zodiacal model, exposure-averaged PSF, and wavelength WCS. Multi-extension FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/level2 Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Absolute Gain Matrix: Pixel-to-pixel gain variations within a single spectral channel and relative gain differences across channels. FITS format.' - ARN: arn:aws:s3:::nasa-irsa-spherex/qr/abs_gain_matrix + - Description: 'QR2 Absolute Gain Matrix: Pixel-to-pixel gain variations within a single spectral channel and relative gain differences across channels. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/abs_gain_matrix Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Exposure-Averaged PSF: Wavelength-dependent point spread function (PSF) estimates on a fine positional grid across each detector. FITS format.' - ARN: arn:aws:s3:::nasa-irsa-spherex/qr/average_psf + - Description: 'QR2 Exposure-Averaged PSF: Wavelength-dependent point spread function (PSF) estimates on a fine positional grid across each detector. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/average_psf Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Dark Current: Per pixel dark current. FITS format.' - ARN: arn:aws:s3:::nasa-irsa-spherex/qr/dark + - Description: 'QR2 Dark Current: Per pixel dark current. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/dark Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Dichroic: Map of pixels affected by flux attenuation due to the dichroic filter. FITS format.' 
- ARN: arn:aws:s3:::nasa-irsa-spherex/qr/dichroic + - Description: 'QR2 Dichroic: Map of pixels affected by flux attenuation due to the dichroic filter. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/dichroic Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Gain Factors: Gain factors for each detector. YAML format.' - ARN: arn:aws:s3:::nasa-irsa-spherex/qr/gain_factors + - Description: 'QR2 Gain Factors: Gain factors for each detector. YAML format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/gain_factors Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Non-functional Pixel Map: Map of permanently non-functioning pixels. FITS format.' - ARN: arn:aws:s3:::nasa-irsa-spherex/qr/nonfunc + - Description: 'QR2 Non-functional Pixel Map: Map of permanently non-functioning pixels. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/nonfunc Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Non-Linearity Correction: Corrections applied to compensate for detector non-linearity due to gain degradation. FITS format.' - ARN: arn:aws:s3:::nasa-irsa-spherex/qr/nonlinear_pars + - Description: 'QR2 Non-Linearity Correction: Corrections applied to compensate for detector non-linearity due to gain degradation. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/nonlinear_pars Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Readout Noise: Per-detector read noise maps. FITS format.' - ARN: arn:aws:s3:::nasa-irsa-spherex/qr/readnoise_pars + - Description: 'QR2 Readout Noise: Per-detector read noise maps. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/readnoise_pars Region: us-east-1 Type: S3 Bucket RequesterPays: False AccountRequired: False - - Description: 'Spectral WCS Map: Detailed World Coordinate System (WCS) map. FITS format.' 
- ARN: arn:aws:s3:::nasa-irsa-spherex/qr/spectral_wcs + - Description: 'QR2 Solid Angle Pixel Map: Per-detector measure of the solid angle per pixel in units of squared arcsec. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/solid_angle_pixel_map + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'QR2 Spectral WCS Map: Detailed World Coordinate System (WCS) map. FITS format.' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr2/spectral_wcs + Region: us-east-1 + Type: S3 Bucket + RequesterPays: False + AccountRequired: False + - Description: 'SPHEREx Quick Release 1 (QR1): IRSA began releasing SPHEREx QR1 data on a weekly basis in July 2025. QR1 is superseded by QR2 and only available through January 2026. The data products and their organization in the bucket are described in the [SPHEREx Archive at IRSA User Guide](https://caltech-ipac.github.io/spherex-archive-documentation/).' + ARN: arn:aws:s3:::nasa-irsa-spherex/qr Region: us-east-1 Type: S3 Bucket RequesterPays: False From f2f80dda63e0c670e606d31f4761942fc61d9ce5 Mon Sep 17 00:00:00 2001 From: Vidit Agrawal <91577322+viditagr@users.noreply.github.com> Date: Sun, 26 Oct 2025 17:30:00 -0500 Subject: [PATCH 469/751] Add evaluation benchmarks entry to chammi-75.yaml Added a new entry for running evaluation benchmarks with details. --- datasets/chammi-75.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/chammi-75.yaml b/datasets/chammi-75.yaml index 9c9f0039f..07aed1243 100644 --- a/datasets/chammi-75.yaml +++ b/datasets/chammi-75.yaml @@ -36,6 +36,12 @@ DataAtWork: AuthorName: Vidit Agrawal, Juan Caicedo AuthorURL: Services: Getting to know a dataset + - Title: Running evaluation benchmarks + URL: + NotebookURL: + AuthorName: Vidit Agrawal, Juan Caicedo + Author URL: + Services: It will enable researchers to run state of the art benchmarks in the exploration of single cell self-supervised learning foundation models. 
Tools & Applications: - Title: URL: From e728854fb1a552b97e3abe9897523a3c18794edf Mon Sep 17 00:00:00 2001 From: Vidit Agrawal <91577322+viditagr@users.noreply.github.com> Date: Sun, 26 Oct 2025 17:35:24 -0500 Subject: [PATCH 470/751] Update tutorial URLs in chammi-75.yaml --- datasets/chammi-75.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/chammi-75.yaml b/datasets/chammi-75.yaml index 07aed1243..ee49fa9fd 100644 --- a/datasets/chammi-75.yaml +++ b/datasets/chammi-75.yaml @@ -31,14 +31,14 @@ Resources: DataAtWork: Tutorials: - Title: Get To Know A Dataset: CHAMMI-75 - URL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/get-to-know-a-dataset-template.ipynb + URL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/ NotebookURL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/get-to-know-a-dataset-template.ipynb AuthorName: Vidit Agrawal, Juan Caicedo AuthorURL: Services: Getting to know a dataset - Title: Running evaluation benchmarks - URL: - NotebookURL: + URL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/ + NotebookURL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/running-benchmarks.ipynb AuthorName: Vidit Agrawal, Juan Caicedo Author URL: Services: It will enable researchers to run state of the art benchmarks in the exploration of single cell self-supervised learning foundation models. 
From 8902e85446337485c4d17ae9e878de2639a87ac6 Mon Sep 17 00:00:00 2001 From: Vidit Agrawal <91577322+viditagr@users.noreply.github.com> Date: Sun, 26 Oct 2025 17:35:43 -0500 Subject: [PATCH 471/751] Update chammi-75.yaml --- datasets/chammi-75.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/chammi-75.yaml b/datasets/chammi-75.yaml index ee49fa9fd..243c31cc5 100644 --- a/datasets/chammi-75.yaml +++ b/datasets/chammi-75.yaml @@ -36,7 +36,7 @@ DataAtWork: AuthorName: Vidit Agrawal, Juan Caicedo AuthorURL: Services: Getting to know a dataset - - Title: Running evaluation benchmarks + - Title: Running CHAMMI-75 Evaluation Benchmarks URL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/ NotebookURL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/running-benchmarks.ipynb AuthorName: Vidit Agrawal, Juan Caicedo From 9d583328b8d1beae4dc621f4c56bef61447fb1e2 Mon Sep 17 00:00:00 2001 From: Vidit Agrawal <91577322+viditagr@users.noreply.github.com> Date: Sun, 26 Oct 2025 18:09:27 -0500 Subject: [PATCH 472/751] Update CHAMMI-75 entry with source code details Added source code reference for CHAMMI-75. --- datasets/chammi-75.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/chammi-75.yaml b/datasets/chammi-75.yaml index 243c31cc5..52aece41f 100644 --- a/datasets/chammi-75.yaml +++ b/datasets/chammi-75.yaml @@ -43,9 +43,9 @@ DataAtWork: Author URL: Services: It will enable researchers to run state of the art benchmarks in the exploration of single cell self-supervised learning foundation models. 
Tools & Applications: - - Title: - URL: - AuthorName: + - Title: CHAMMI-75 Source Code + URL: https://github.com/CaicedoLab/CHAMMI-75 + AuthorName: Vidit Agrawal AuthorURL: Publications: - Title: From 28654fa8aabb70cf180ba8fc40aa8ef2045bbaf8 Mon Sep 17 00:00:00 2001 From: adriaat Date: Mon, 27 Oct 2025 10:39:36 +0100 Subject: [PATCH 473/751] item as a list --- datasets/roa.yaml | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/datasets/roa.yaml b/datasets/roa.yaml index 9143b6924..7381162c0 100644 --- a/datasets/roa.yaml +++ b/datasets/roa.yaml @@ -37,11 +37,13 @@ DataAtWork: - Title: Reading RoA data URL: https://github.com/SEE-GEO/roa?tab=readme-ov-file#22-reading-roa-data AuthorName: Adrià Amell - Services: Amazon S3 + Services: + - Amazon S3 - Title: How to use the data URL: https://github.com/SEE-GEO/roa?tab=readme-ov-file#3-how-to-use-the-data AuthorName: Adrià Amell - Services: Amazon S3 + Services: + - Amazon S3 Publications: - Title: Probabilistic near real-time retrievals of Rain over Africa using deep learning URL: https://doi.org/10.1029/2025JD044595 From fba1523fe5af864b0dd1b5bbad6eb281b1e702ea Mon Sep 17 00:00:00 2001 From: Stefano Campanella <15182642+stefanocampanella@users.noreply.github.com> Date: Wed, 22 Oct 2025 15:11:35 +0200 Subject: [PATCH 474/751] Add ARCO-OCEAN dataset to the open-data-registry --- datasets/ogs-arco-ocean.yaml | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) create mode 100644 datasets/ogs-arco-ocean.yaml diff --git a/datasets/ogs-arco-ocean.yaml b/datasets/ogs-arco-ocean.yaml new file mode 100644 index 000000000..3c7e618a8 --- /dev/null +++ b/datasets/ogs-arco-ocean.yaml @@ -0,0 +1,28 @@ +Name: ARCO-OCEAN +Description: | + ARCO-OCEAN is an analysis-ready cloud-optimized dataset providing physical properties of the ocean, waves, and sea ice for a period of about 28 years between the 1st of January 1993 and the 30th of June 2021. 
The dataset includes also atmospheric and hydrological variables that would be needed as boundary conditions and used to drive a numerical simulation. The dataset is the result of collecting, processing, merging and optimizing for the cloud different data sources, all retrospective analyses (reanalyses) or hindcasts of different Earth system components. The dataset has been designed with machine learning in mind, and takes inspiration from similar datasets derived from ERA5. +Documentation: "[ARCO-OCEAN](https://github.com/inogs/arco-ocean)" +Contact: scampanella@ogs.it +ManagedBy: "[OGS](https://www.ogs.it/en/dynamics-ecosystems-and-computational-oceanography)" +UpdateFrequency: Variable (as needed). +Tags: + - analysis ready data + - atmosphere + - climate + - hydrology + - ice + - machine learning + - oceans + - physics + - zarr +License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" +Citation: "Campanella, S., Salon S., Querin, S., Bortolussi, L., and Stock, J.: ARCO-OCEAN: A dataset of physical properties of the ocean, waves, and sea ice, with hydrological and atmospheric forcing, optimized for machine learning, accessed on DD-MM-YYYY." 
+DataAtWork: + Tutorials: + - Title: Computing the Oceanic El Nino Index (ONI) with Xarray and ARCO-OCEAN + URL: https://github.com/inogs/arco-ocean/blob/main/tutorials/oni.ipynb + NotebookURL: https://github.com/inogs/arco-ocean/blob/main/tutorials/oni.ipynb + AuthorName: OGS + AuthorURL: https://github.com/inogs/ +ADXCategories: + - Environmental Data \ No newline at end of file From 1586d1514d45dd22dd02bf3787079b199f83664a Mon Sep 17 00:00:00 2001 From: Stefano Campanella <15182642+stefanocampanella@users.noreply.github.com> Date: Wed, 22 Oct 2025 15:21:51 +0200 Subject: [PATCH 475/751] Fix typo in the citation field of ARCO-OCEAN dataset configuration --- datasets/ogs-arco-ocean.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ogs-arco-ocean.yaml b/datasets/ogs-arco-ocean.yaml index 3c7e618a8..9202564d1 100644 --- a/datasets/ogs-arco-ocean.yaml +++ b/datasets/ogs-arco-ocean.yaml @@ -16,7 +16,7 @@ Tags: - physics - zarr License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" -Citation: "Campanella, S., Salon S., Querin, S., Bortolussi, L., and Stock, J.: ARCO-OCEAN: A dataset of physical properties of the ocean, waves, and sea ice, with hydrological and atmospheric forcing, optimized for machine learning, accessed on DD-MM-YYYY." +Citation: "Campanella, S., Salon, S., Querin, S., Bortolussi, L., and Stock, J.: ARCO-OCEAN: A dataset of physical properties of the ocean, waves, and sea ice, with hydrological and atmospheric forcing, optimized for machine learning, accessed on DD-MM-YYYY." 
DataAtWork: Tutorials: - Title: Computing the Oceanic El Nino Index (ONI) with Xarray and ARCO-OCEAN From 22812bd921acba59e5d0e9f913cef8ec01e824b3 Mon Sep 17 00:00:00 2001 From: willmacs <103065262+willmacs@users.noreply.github.com> Date: Mon, 27 Oct 2025 10:43:29 -0400 Subject: [PATCH 476/751] Update rcm-ceos-ard.yaml reciprocal link on geo.ca --- datasets/rcm-ceos-ard.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/rcm-ceos-ard.yaml b/datasets/rcm-ceos-ard.yaml index 8f11d8a70..d498ef278 100644 --- a/datasets/rcm-ceos-ard.yaml +++ b/datasets/rcm-ceos-ard.yaml @@ -66,3 +66,7 @@ DataAtWork: URL: https://dataspace.copernicus.eu/explore-data/data-collections/copernicus-contributing-missions/collections-description/COP-DEM AuthorName: European Space Agency (ESA) AuthorURL: https://www.esa.int/ + - Title: RCM CEOS ARD Dataset on GEO.ca | Ensemble de données RCM CEOS ARD sur GEO.ca + URL: https://app.geo.ca/en-ca/map-browser/record/eodms-rcm-ard + AuthorName: Canada Centre for Remote Sensing | Centre canadien de télédétection + AuthorURL: https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing \ No newline at end of file From 05ba9831e10ce21fc22958a88ffbedfc07a6a4f4 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 27 Oct 2025 07:27:22 -0800 Subject: [PATCH 477/751] ok: Update askap.yaml From 25ccbf99d7f6a046599bd5095a76e02951d76b92 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 27 Oct 2025 07:30:05 -0800 Subject: [PATCH 478/751] ok: Update askap.yaml From 8953b8b669af03a3c2b7548c1b713a20e81de6e2 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 27 Oct 2025 07:55:59 -0800 Subject: [PATCH 479/751] ok: Update roa.yaml From 37727e46ba7a3aa7faa2282eb94ed88e1904f890 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 27 Oct 2025 
14:28:54 -0800 Subject: [PATCH 480/751] ok: Update rcm-ceos-ard.yaml --- datasets/rcm-ceos-ard.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/rcm-ceos-ard.yaml b/datasets/rcm-ceos-ard.yaml index d498ef278..fb15cdd7b 100644 --- a/datasets/rcm-ceos-ard.yaml +++ b/datasets/rcm-ceos-ard.yaml @@ -69,4 +69,4 @@ DataAtWork: - Title: RCM CEOS ARD Dataset on GEO.ca | Ensemble de données RCM CEOS ARD sur GEO.ca URL: https://app.geo.ca/en-ca/map-browser/record/eodms-rcm-ard AuthorName: Canada Centre for Remote Sensing | Centre canadien de télédétection - AuthorURL: https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing \ No newline at end of file + AuthorURL: https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing From e134975ab8d33e7ae6a782bd10084169b6f54749 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 27 Oct 2025 15:00:00 -0800 Subject: [PATCH 481/751] ok: Update spherex-qr.yaml From 20c872b1501fe149913e16f68ee622e7a3594dad Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 28 Oct 2025 08:27:52 -0800 Subject: [PATCH 482/751] ok: Update spherex-qr.yaml From 39442445a586e39603d58abb7a737cd56fab79bf Mon Sep 17 00:00:00 2001 From: kszura <43186787+kszura@users.noreply.github.com> Date: Tue, 28 Oct 2025 12:34:34 -0400 Subject: [PATCH 483/751] Update noaa-nws-naqfc-pds.yaml Added SNS Topic --- datasets/noaa-nws-naqfc-pds.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/noaa-nws-naqfc-pds.yaml b/datasets/noaa-nws-naqfc-pds.yaml index ebb215cfb..080740bec 100644 --- a/datasets/noaa-nws-naqfc-pds.yaml +++ b/datasets/noaa-nws-naqfc-pds.yaml @@ -38,6 +38,10 @@ Resources: Type: S3 Bucket Explore: - '[Browse Bucket](https://noaa-nws-naqfc-pds.s3.amazonaws.com/index.html)' + - Description: New data notifications for NAQFC, only Lambda and 
SQS protocols allowed + ARN: arn:aws:sns:us-east-1:709902155096:NewNWSAirQualityObject + Region: us-east-1 + Type: SNS Topic DataAtWork: Tutorials: Tools & Applications: From 08810bff102df752f4d12140532855a59325713d Mon Sep 17 00:00:00 2001 From: Bl4ckH4wkGER <25514600+Bl4ckH4wkGER@users.noreply.github.com> Date: Tue, 28 Oct 2025 10:31:39 -0700 Subject: [PATCH 484/751] Update allen-hmba-releases.yaml Update tags, add citation, application, and publication. --- datasets/allen-hmba-releases.yaml | 40 ++++++++++++++----------------- 1 file changed, 18 insertions(+), 22 deletions(-) diff --git a/datasets/allen-hmba-releases.yaml b/datasets/allen-hmba-releases.yaml index 1c63d3b3a..d45699fce 100644 --- a/datasets/allen-hmba-releases.yaml +++ b/datasets/allen-hmba-releases.yaml @@ -13,12 +13,10 @@ Tags: - gene expression - neurobiology - life sciences - - transcriptomics - - basal ganglia + - single cell transcriptomics - Mus musculus - - Macaca mulatta - - Callithrix jacchus - Homo sapiens + - non-human primate License: http://www.alleninstitute.org/legal/terms-use/ Citation: Resources: @@ -26,25 +24,23 @@ Resources: ARN: arn:aws:s3:::allen-hmba-releases Region: us-west-2 Type: S3 bucket - Explore: DataAtWork: Tutorials: - - Title: - URL: - NotebookURL: - AuthorName: - AuthorURL: - Services: + - Title: Human-Mammalian Brain - Basal Ganglia - Data + URL: https://alleninstitute.github.io/abc_atlas_access/descriptions/HMBA-BG_dataset.html + AuthorName: Allen Institute for Brain Science + AuthorURL: www.alleninstitute.org + - Title: Human-Mammalian Brain - CCF Book + URL: https://alleninstitute.github.io/CCF-MAP/ + AuthorName: Allen Institute for Brain Science + AuthorURL: www.alleninstitute.org Tools & Applications: - - Title: - URL: - AuthorName: - AuthorURL: + - Title: HMBA Basal Ganglia resources in Brain Knowledge Platform's Data Catalog + URL: https://knowledge.brain-map.org/data/POZ2HCPBT60DSDJ8UA7 + AuthorName: Allen Institute for Brain Science + AuthorURL: 
www.alleninstitute.org Publications: - - Title: - URL: - AuthorName: - AuthorURL: -DeprecatedNotice: -ADXCategories: - - \ No newline at end of file + - Title: Cross-species consensus atlas of primate basal ganglia + URL: Preprint in preparation. + AuthorName: Johansen N.J., Fu Y., Schmitz M., et al. + AuthorURL: https://alleninstitute.org/person/nelson-johansen/ From 73e11960da9f9029e6e43d9cf616ae609cfa15c5 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 28 Oct 2025 10:20:41 -0800 Subject: [PATCH 485/751] ok: Update noaa-nws-naqfc-pds.yaml From c6a67e0d833371f282af41aa9a1528494f5178c8 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 28 Oct 2025 10:39:19 -0800 Subject: [PATCH 486/751] ok: Update noaa-nws-naqfc-pds.yaml --- datasets/noaa-nws-naqfc-pds.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/noaa-nws-naqfc-pds.yaml b/datasets/noaa-nws-naqfc-pds.yaml index 080740bec..dd22a24d2 100644 --- a/datasets/noaa-nws-naqfc-pds.yaml +++ b/datasets/noaa-nws-naqfc-pds.yaml @@ -38,7 +38,7 @@ Resources: Type: S3 Bucket Explore: - '[Browse Bucket](https://noaa-nws-naqfc-pds.s3.amazonaws.com/index.html)' - - Description: New data notifications for NAQFC, only Lambda and SQS protocols allowed + - Description: New data notifications for NAQFC, only Lambda and SQS protocols allowed ARN: arn:aws:sns:us-east-1:709902155096:NewNWSAirQualityObject Region: us-east-1 Type: SNS Topic From 9619ba20c4779645f92d10f89fbfb437ff3d6d81 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 28 Oct 2025 11:02:06 -0800 Subject: [PATCH 487/751] ok: Update noaa-nws-naqfc-pds.yaml From 25cb85e0a3fc9bb94fd6203e64e3c5474c406c20 Mon Sep 17 00:00:00 2001 From: Vidit Agrawal <91577322+viditagr@users.noreply.github.com> Date: Tue, 28 Oct 2025 18:14:45 -0500 Subject: [PATCH 488/751] Update chammi.yaml with new source and publication info Added CHAMMI 
Benchmarking source code and publication details. --- datasets/{chammi-75.yaml => chammi.yaml} | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) rename datasets/{chammi-75.yaml => chammi.yaml} (86%) diff --git a/datasets/chammi-75.yaml b/datasets/chammi.yaml similarity index 86% rename from datasets/chammi-75.yaml rename to datasets/chammi.yaml index 52aece41f..fc2e61a41 100644 --- a/datasets/chammi-75.yaml +++ b/datasets/chammi.yaml @@ -47,10 +47,14 @@ DataAtWork: URL: https://github.com/CaicedoLab/CHAMMI-75 AuthorName: Vidit Agrawal AuthorURL: + - Title: CHAMMI Benchmarking Source Code + URL: https://github.com/chaudatascience/channel_adaptive_models + AuthorName: Chau Pham + AuthorURL: Publications: - - Title: - URL: - AuthorName: + - Title: CHAMMI: A benchmark for channel-adaptive models in microscopy imaging + URL: https://neurips.cc/virtual/2023/poster/73620 + AuthorName: Zitong Sam Chen, Chau Pham, Siqi Wang, Michael Doron, Nikita Moshkov, Bryan Plummer, Juan C. Caicedo AuthorURL: DeprecatedNotice: ADXCategories: From 90b4b4d4f8b3a4bf7e9f5dc1795643c1ccf1888a Mon Sep 17 00:00:00 2001 From: Reuben Jacob Mathew <47947817+rjmat97@users.noreply.github.com> Date: Wed, 29 Oct 2025 19:03:19 +0530 Subject: [PATCH 489/751] Create ont_basemod_data.yaml for benchmarking datasets Added ONT Methylation Benchmarking Datasets YAML file with details about the datasets, documentation, and resources. 
--- datasets/ont_basemod_data.yaml | 42 ++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) create mode 100644 datasets/ont_basemod_data.yaml diff --git a/datasets/ont_basemod_data.yaml b/datasets/ont_basemod_data.yaml new file mode 100644 index 000000000..e2fc7c441 --- /dev/null +++ b/datasets/ont_basemod_data.yaml @@ -0,0 +1,42 @@ +Name: ONT Methylation Benchmarking Datasets +Description: ONT Methylation Benchmarking Datasets are generated to benchmark existing methylation-calling tools on the Oxford Nanopore sequencing platform using their recent R10.4.1 flowcell chemistry. It spans a diverse range of species, including bacteria (E. coli, H. pylori J99, H. pylori 26695, A. variabilis, T. denticola), plants (Rice, Arabidopsis), and mammals (mouse, human).In addition, the dataset includes EMSeq data for E. coli, plant, and mouse samples, which can serve as ground truth for methylation studies. It also provides unmethylated whole-genome amplified (WGA) DNA for H. pylori 26695 and a dam- dcm- double mutant (DM) of E. coli that lacks canonical 5mC and 6mA methylation. These variants, together with their wild-type counterparts, offer value for both training and benchmarking DNA methylation calling models. +Documentation: https://github.com/SowpatiLab/ont-basemod-benchmark-data/blob/main/documentation.md +Contact: + - onkar.ccmb@csir.res.in + - tej.ccmb@csir.res.in +ManagedBy: "[CSIR-Centre for Cellular and Molecular Biology](https://www.ccmb.res.in/)" +UpdateFrequency: Datasets will be updated periodically as additional data is generated. +Tags: + - ONT + - nanopore + - long read sequencing + - methylation + - bioinformatics + - epigenetics + - benchmarking + - pod5 + - bam + - bed +License: "[MIT License](https://opensource.org/license/mit)" +Citation: "Please cite Kulkarni et al. Comprehensive benchmarking of tools for nanopore-based detection of DNA methylation. bioRxiv (2024). 
doi: https://doi.org/10.1101/2024.11.09.622763 when referencing the ONT methylation benchmarking datasets in publications." +Resources: + - Description: ONT Methylation Benchmarking Datasets + ARN: arn:aws:s3:::ont-basemod-benchmark-data + Region: ap-south-1 + Type: S3 Bucket + Explore: "[Browse Bucket](https://ont-basemod-benchmark-data.s3.amazonaws.com/index.html)" + - Description: Notifications for object created + ARN: arn:aws:sns:ap-south-1:767415906609:ont-basemod-benchmark-data-object_created + Region: ap-south-1 + Type: SNS topic +DataAtWork: + Tutorials: + - Title: Methylation calling using ONT methylation benchmarking dataset + URL: https://github.com/SowpatiLab/ont-basemod-benchmark-data/blob/main/tutorial.md + AuthorName: Onkar Kulkarni + Services: EC2 + + Publications: + - Title: Comprehensive benchmarking of tools for nanopore-based detection of DNA methylation + URL: https://www.biorxiv.org/content/10.1101/2024.11.09.622763v1 + AuthorName: Kulkarni et al. From 4235b159a3319b82ba6d2f2cb1ef51b09532cc53 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Wed, 29 Oct 2025 11:39:29 -0400 Subject: [PATCH 490/751] ok: Update braidyn-bc_cued-lever-pull.yaml Changing tags to reflect what is available on tags.yaml --- datasets/braidyn-bc_cued-lever-pull.yaml | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/datasets/braidyn-bc_cued-lever-pull.yaml b/datasets/braidyn-bc_cued-lever-pull.yaml index 09afee4de..a96239b1e 100644 --- a/datasets/braidyn-bc_cued-lever-pull.yaml +++ b/datasets/braidyn-bc_cued-lever-pull.yaml @@ -16,15 +16,13 @@ Contact: "Ken Nakae (ken.nakae@gmail.com)" ManagedBy: "[BraiDyn-BC Database Project](https://boatneck-weeder-7b7.notion.site/BraiDyn-BC-Database-303cf08c89f94d81bb2eaed4c3c50345)" UpdateFrequency: NA Tags: - - mouse - - behavior - - head fixation - - wide-field calcium imaging - - high-speed videography - - motor-skill learning - - operant conditioning - - sensory mapping - - behavioral tracking 
+ - Mus musculus + - neuroscience + - calcium imaging + - video + - imaging + - life sciences + - aws-pds License: Creative Commons Attribution 4.0 International (CC-BY 4.0) Resources: - Description: "BraiDyn-BC: Cued lever-pull task dataset" From 0fda2cbf1503369f60f4c30b776440ce2c2dcb3e Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Wed, 29 Oct 2025 11:51:03 -0400 Subject: [PATCH 491/751] ok: Update braidyn-bc_cued-lever-pull.yaml format description --- datasets/braidyn-bc_cued-lever-pull.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/braidyn-bc_cued-lever-pull.yaml b/datasets/braidyn-bc_cued-lever-pull.yaml index a96239b1e..adc647917 100644 --- a/datasets/braidyn-bc_cued-lever-pull.yaml +++ b/datasets/braidyn-bc_cued-lever-pull.yaml @@ -25,7 +25,7 @@ Tags: - aws-pds License: Creative Commons Attribution 4.0 International (CC-BY 4.0) Resources: - - Description: "BraiDyn-BC: Cued lever-pull task dataset" + - Description: BraiDyn-BC: Cued lever-pull task dataset ARN: arn:aws:s3:::braidyn-bc-buckets Region: ap-northeast-1 Type: S3 bucket From 99763deafa82c80b9a3c2d9e4f052c2ad8cae638 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Wed, 29 Oct 2025 11:56:40 -0400 Subject: [PATCH 492/751] ok: Update braidyn-bc_cued-lever-pull.yaml format description --- datasets/braidyn-bc_cued-lever-pull.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/braidyn-bc_cued-lever-pull.yaml b/datasets/braidyn-bc_cued-lever-pull.yaml index adc647917..c2b5f5c8a 100644 --- a/datasets/braidyn-bc_cued-lever-pull.yaml +++ b/datasets/braidyn-bc_cued-lever-pull.yaml @@ -25,7 +25,7 @@ Tags: - aws-pds License: Creative Commons Attribution 4.0 International (CC-BY 4.0) Resources: - - Description: BraiDyn-BC: Cued lever-pull task dataset + - Description: BraiDyn-BC - Cued lever-pull task dataset ARN: arn:aws:s3:::braidyn-bc-buckets Region: ap-northeast-1 Type: S3 bucket From e725d2dc5951af5c188832af4990f45238e4c800 Mon Sep 17 
00:00:00 2001 From: Beryl Rabindran Date: Wed, 29 Oct 2025 12:04:08 -0400 Subject: [PATCH 493/751] ok: Update braidyn-bc_cued-lever-pull.yaml update S3 bucket --- datasets/braidyn-bc_cued-lever-pull.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/braidyn-bc_cued-lever-pull.yaml b/datasets/braidyn-bc_cued-lever-pull.yaml index c2b5f5c8a..7b7373d65 100644 --- a/datasets/braidyn-bc_cued-lever-pull.yaml +++ b/datasets/braidyn-bc_cued-lever-pull.yaml @@ -28,7 +28,7 @@ Resources: - Description: BraiDyn-BC - Cued lever-pull task dataset ARN: arn:aws:s3:::braidyn-bc-buckets Region: ap-northeast-1 - Type: S3 bucket + Type: S3 Bucket DataAtWork: Tutorials: - Title: Detailed usage tutorials on Google Colab From 85c47d029c2c72d80a84c6d96fb6f2a4fdb5927d Mon Sep 17 00:00:00 2001 From: gopanairepa <117306436+gopanairepa@users.noreply.github.com> Date: Wed, 29 Oct 2025 15:42:22 -0400 Subject: [PATCH 494/751] Adding EPAs Hourly Weather Research and Forecasting (WRF) model dataset --- .../epa-hourly-prognostic-meteorology.yaml | 55 +++++++++++++++++++ 1 file changed, 55 insertions(+) create mode 100644 datasets/epa-hourly-prognostic-meteorology.yaml diff --git a/datasets/epa-hourly-prognostic-meteorology.yaml b/datasets/epa-hourly-prognostic-meteorology.yaml new file mode 100644 index 000000000..18534be05 --- /dev/null +++ b/datasets/epa-hourly-prognostic-meteorology.yaml @@ -0,0 +1,55 @@ +Name: >- + EPA Hourly Prognostic Meteorological Data +Description: >- + The data are hourly outputs from the Weather Research and Forecasting (WRF) model + generated by the EPA's Office of Air Quality Planning and Standards, Air Quality + Assessment Division, Air Quality Modeling Group. These data were generated at a 12-km + resolution over the Continental United States (12US), beginning for the year 2021 and + continuing annually through 2023. 
These files are intended for use in a broad range of + air quality applications, but specifically may be used in dispersion modeling applications + that would benefit from the use of the Mesoscale Model Interface (MMIF) tool + (https://www.epa.gov/scram/air-quality-dispersion-modeling-related-model-support-programs#mmif) + which translates prognostic meteorological data into formats suitable for use with AERMOD, + CALPUFF, or SCICHEM. The individual files are less than 1GB in size, which allows for + the use of the MMIF tool in a Windows environment. These data are anticipated to be updated + annually so the 3 most-recent years are available for use. Additionally, model-observation + paired files are included to aid in the performance evaluation that is necessary for use + of these data in regulatory applications per Appendix W to 40 CFR Part 51. +Documentation: >- + 2022 WRF Modeling TSD: + https://bit.ly/2022WRF +Contact: Misenis.Chris@epa.gov +ManagedBy: U.S. Environmental Protection Agency (https://www.epa.gov) +UpdateFrequency: Annually +Tags: + - aws-pds + - environmental + - air quality + - regulatory + - weather + - meteorological +License: >- + These datasets are products of the U.S. Government and are intended for public + access and use. Unless otherwise specified, all data produced by the U.S EPA + is, by default, in the public domain and are not subject to domestic copyright + protection under 17 U.S.C. § 105. More details on the U.S. Public Domain + license are available here: http://www.usa.gov/publicdomain/label/1.0/ +Citation: >- + WRF Modeling: + US EPA, 2024, "Meteorological Model Performance for Annual 2022 Simulation + WRF v4.4.2" +Resources: + - Description: >- + The WRF output are stored as uncompressed netcdf/hdf5 formatted files in + directories corresponding to the specific years of interest. The model-obs + paired files are stored as comma-delimited files in the year-specific + directories. 
+ ARN: 'arn:aws:s3:::epa-hourly-prognostic-meteorology' + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://epa-hourly-prognostic-meteorology.s3.amazonaws.com/index.html)' + - Description: Notification for the EPA Hourly Prognostic Meteorological Data bucket + ARN: 'arn:aws:sns:us-east-1:127085394039:epa-hourly-prognostic-meteorology-object_created' + Region: us-east-1 + Type: SNS Topic \ No newline at end of file From 556ccce0072029060d3bf7f131749aa1f4388f90 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 29 Oct 2025 12:05:23 -0800 Subject: [PATCH 495/751] ok: Update epa-hourly-prognostic-meteorology.yaml --- datasets/epa-hourly-prognostic-meteorology.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/epa-hourly-prognostic-meteorology.yaml b/datasets/epa-hourly-prognostic-meteorology.yaml index 18534be05..accdd1a7e 100644 --- a/datasets/epa-hourly-prognostic-meteorology.yaml +++ b/datasets/epa-hourly-prognostic-meteorology.yaml @@ -52,4 +52,4 @@ Resources: - Description: Notification for the EPA Hourly Prognostic Meteorological Data bucket ARN: 'arn:aws:sns:us-east-1:127085394039:epa-hourly-prognostic-meteorology-object_created' Region: us-east-1 - Type: SNS Topic \ No newline at end of file + Type: SNS Topic From 3af85ddda716836cd32f9c25fb0c5a112503c2b9 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 09:23:38 -0400 Subject: [PATCH 496/751] ok: Update allen-hmba-releases.yaml --- datasets/allen-hmba-releases.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/allen-hmba-releases.yaml b/datasets/allen-hmba-releases.yaml index d45699fce..67f606184 100644 --- a/datasets/allen-hmba-releases.yaml +++ b/datasets/allen-hmba-releases.yaml @@ -20,7 +20,7 @@ Tags: License: http://www.alleninstitute.org/legal/terms-use/ Citation: Resources: - - Description: project data files in a public bucket + - Description: Project data 
files in a public bucket ARN: arn:aws:s3:::allen-hmba-releases Region: us-west-2 Type: S3 bucket @@ -44,3 +44,4 @@ DataAtWork: URL: Preprint in preparation. AuthorName: Johansen N.J., Fu Y., Schmitz M., et al. AuthorURL: https://alleninstitute.org/person/nelson-johansen/ + From eb9ec11def8f108a8e69f9543adfa1914bcbb5b6 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 09:28:14 -0400 Subject: [PATCH 497/751] ok: Update ont_basemod_data.yaml Updating tags --- datasets/ont_basemod_data.yaml | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/datasets/ont_basemod_data.yaml b/datasets/ont_basemod_data.yaml index e2fc7c441..4657a3550 100644 --- a/datasets/ont_basemod_data.yaml +++ b/datasets/ont_basemod_data.yaml @@ -7,16 +7,14 @@ Contact: ManagedBy: "[CSIR-Centre for Cellular and Molecular Biology](https://www.ccmb.res.in/)" UpdateFrequency: Datasets will be updated periodically as additional data is generated. Tags: - - ONT + - aws-pds + - life sciences - nanopore - long read sequencing - - methylation - bioinformatics - - epigenetics + - epigenomics - benchmarking - - pod5 - bam - - bed License: "[MIT License](https://opensource.org/license/mit)" Citation: "Please cite Kulkarni et al. Comprehensive benchmarking of tools for nanopore-based detection of DNA methylation. bioRxiv (2024). doi: https://doi.org/10.1101/2024.11.09.622763 when referencing the ONT methylation benchmarking datasets in publications." 
Resources: From 4776fc546062e8995456d20f3e52011825a48ab2 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 09:30:42 -0400 Subject: [PATCH 498/751] ok: Update allen-hmba-releases.yaml updated tag --- datasets/allen-hmba-releases.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/allen-hmba-releases.yaml b/datasets/allen-hmba-releases.yaml index 67f606184..b9547631b 100644 --- a/datasets/allen-hmba-releases.yaml +++ b/datasets/allen-hmba-releases.yaml @@ -13,7 +13,7 @@ Tags: - gene expression - neurobiology - life sciences - - single cell transcriptomics + - single-cell transcriptomics - Mus musculus - Homo sapiens - non-human primate @@ -45,3 +45,4 @@ DataAtWork: AuthorName: Johansen N.J., Fu Y., Schmitz M., et al. AuthorURL: https://alleninstitute.org/person/nelson-johansen/ + From d9baa64c927431d49a7dcf358deeb788d6dff203 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 09:33:23 -0400 Subject: [PATCH 499/751] ok: Update ont_basemod_data.yaml tags --- datasets/ont_basemod_data.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ont_basemod_data.yaml b/datasets/ont_basemod_data.yaml index 4657a3550..397d2351b 100644 --- a/datasets/ont_basemod_data.yaml +++ b/datasets/ont_basemod_data.yaml @@ -9,7 +9,7 @@ UpdateFrequency: Datasets will be updated periodically as additional data is gen Tags: - aws-pds - life sciences - - nanopore + - genomics - long read sequencing - bioinformatics - epigenomics From 95db693c3e91a2c829c70ceaae88d214387a438c Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 09:34:49 -0400 Subject: [PATCH 500/751] ok: Update allen-hmba-releases.yaml --- datasets/allen-hmba-releases.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/allen-hmba-releases.yaml b/datasets/allen-hmba-releases.yaml index b9547631b..7e07d440e 100644 --- a/datasets/allen-hmba-releases.yaml +++ b/datasets/allen-hmba-releases.yaml @@ 
-23,7 +23,7 @@ Resources: - Description: Project data files in a public bucket ARN: arn:aws:s3:::allen-hmba-releases Region: us-west-2 - Type: S3 bucket + Type: S3 Bucket DataAtWork: Tutorials: - Title: Human-Mammalian Brain - Basal Ganglia - Data @@ -46,3 +46,4 @@ DataAtWork: AuthorURL: https://alleninstitute.org/person/nelson-johansen/ + From 9687ba7e729eda68535c619406b1ead24d237d32 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 09:38:07 -0400 Subject: [PATCH 501/751] ok: Update ont_basemod_data.yaml --- datasets/ont_basemod_data.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ont_basemod_data.yaml b/datasets/ont_basemod_data.yaml index 397d2351b..44ade9867 100644 --- a/datasets/ont_basemod_data.yaml +++ b/datasets/ont_basemod_data.yaml @@ -9,7 +9,7 @@ UpdateFrequency: Datasets will be updated periodically as additional data is gen Tags: - aws-pds - life sciences - - genomics + - genomic - long read sequencing - bioinformatics - epigenomics From 4db9f81782d543c5f747b75eb8ee75df9c77448c Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 09:40:14 -0400 Subject: [PATCH 502/751] ok: Update allen-hmba-releases.yaml Removing publication info as it expects a link. Please add the link when it is ready. --- datasets/allen-hmba-releases.yaml | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/datasets/allen-hmba-releases.yaml b/datasets/allen-hmba-releases.yaml index 7e07d440e..b7275588d 100644 --- a/datasets/allen-hmba-releases.yaml +++ b/datasets/allen-hmba-releases.yaml @@ -39,11 +39,8 @@ DataAtWork: URL: https://knowledge.brain-map.org/data/POZ2HCPBT60DSDJ8UA7 AuthorName: Allen Institute for Brain Science AuthorURL: www.alleninstitute.org - Publications: - - Title: Cross-species consensus atlas of primate basal ganglia - URL: Preprint in preparation. - AuthorName: Johansen N.J., Fu Y., Schmitz M., et al. 
- AuthorURL: https://alleninstitute.org/person/nelson-johansen/ + + From 4afb4a92916b1600335e52436ae222a5edb66383 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 09:42:51 -0400 Subject: [PATCH 503/751] ok: Update ont_basemod_data.yaml --- datasets/ont_basemod_data.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ont_basemod_data.yaml b/datasets/ont_basemod_data.yaml index 44ade9867..5a64ceaed 100644 --- a/datasets/ont_basemod_data.yaml +++ b/datasets/ont_basemod_data.yaml @@ -13,7 +13,7 @@ Tags: - long read sequencing - bioinformatics - epigenomics - - benchmarking + - benchmark - bam License: "[MIT License](https://opensource.org/license/mit)" Citation: "Please cite Kulkarni et al. Comprehensive benchmarking of tools for nanopore-based detection of DNA methylation. bioRxiv (2024). doi: https://doi.org/10.1101/2024.11.09.622763 when referencing the ONT methylation benchmarking datasets in publications." From 7c80064ef8bf171e345eebb0a2932a41f5df3d6f Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 09:54:18 -0400 Subject: [PATCH 504/751] ok: Update ont_basemod_data.yaml --- datasets/ont_basemod_data.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/ont_basemod_data.yaml b/datasets/ont_basemod_data.yaml index 5a64ceaed..dcba6a21e 100644 --- a/datasets/ont_basemod_data.yaml +++ b/datasets/ont_basemod_data.yaml @@ -33,7 +33,6 @@ DataAtWork: URL: https://github.com/SowpatiLab/ont-basemod-benchmark-data/blob/main/tutorial.md AuthorName: Onkar Kulkarni Services: EC2 - Publications: - Title: Comprehensive benchmarking of tools for nanopore-based detection of DNA methylation URL: https://www.biorxiv.org/content/10.1101/2024.11.09.622763v1 From 225d4e35593f04abe1bb0c30425b12605f361b6c Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 10:35:12 -0400 Subject: [PATCH 505/751] ok: Update ont_basemod_data.yaml --- datasets/ont_basemod_data.yaml | 2 +- 1 file changed, 1 
insertion(+), 1 deletion(-) diff --git a/datasets/ont_basemod_data.yaml b/datasets/ont_basemod_data.yaml index dcba6a21e..93bd037ca 100644 --- a/datasets/ont_basemod_data.yaml +++ b/datasets/ont_basemod_data.yaml @@ -26,7 +26,7 @@ Resources: - Description: Notifications for object created ARN: arn:aws:sns:ap-south-1:767415906609:ont-basemod-benchmark-data-object_created Region: ap-south-1 - Type: SNS topic + Type: SNS Topic DataAtWork: Tutorials: - Title: Methylation calling using ONT methylation benchmarking dataset From c6c55a7ac7d7cb7e589ddfe767675ebebc35fa21 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 10:51:12 -0400 Subject: [PATCH 506/751] ok: Update ont_basemod_data.yaml --- datasets/ont_basemod_data.yaml | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/datasets/ont_basemod_data.yaml b/datasets/ont_basemod_data.yaml index 93bd037ca..518b3c188 100644 --- a/datasets/ont_basemod_data.yaml +++ b/datasets/ont_basemod_data.yaml @@ -2,8 +2,7 @@ Name: ONT Methylation Benchmarking Datasets Description: ONT Methylation Benchmarking Datasets are generated to benchmark existing methylation-calling tools on the Oxford Nanopore sequencing platform using their recent R10.4.1 flowcell chemistry. It spans a diverse range of species, including bacteria (E. coli, H. pylori J99, H. pylori 26695, A. variabilis, T. denticola), plants (Rice, Arabidopsis), and mammals (mouse, human).In addition, the dataset includes EMSeq data for E. coli, plant, and mouse samples, which can serve as ground truth for methylation studies. It also provides unmethylated whole-genome amplified (WGA) DNA for H. pylori 26695 and a dam- dcm- double mutant (DM) of E. coli that lacks canonical 5mC and 6mA methylation. These variants, together with their wild-type counterparts, offer value for both training and benchmarking DNA methylation calling models. 
Documentation: https://github.com/SowpatiLab/ont-basemod-benchmark-data/blob/main/documentation.md Contact: - - onkar.ccmb@csir.res.in - - tej.ccmb@csir.res.in + - "onkar.ccmb@csir.res.in; tej.ccmb@csir.res.in" ManagedBy: "[CSIR-Centre for Cellular and Molecular Biology](https://www.ccmb.res.in/)" UpdateFrequency: Datasets will be updated periodically as additional data is generated. Tags: @@ -22,7 +21,8 @@ Resources: ARN: arn:aws:s3:::ont-basemod-benchmark-data Region: ap-south-1 Type: S3 Bucket - Explore: "[Browse Bucket](https://ont-basemod-benchmark-data.s3.amazonaws.com/index.html)" + Explore: + - "[Browse Bucket](https://ont-basemod-benchmark-data.s3.amazonaws.com/index.html)" - Description: Notifications for object created ARN: arn:aws:sns:ap-south-1:767415906609:ont-basemod-benchmark-data-object_created Region: ap-south-1 @@ -32,7 +32,8 @@ DataAtWork: - Title: Methylation calling using ONT methylation benchmarking dataset URL: https://github.com/SowpatiLab/ont-basemod-benchmark-data/blob/main/tutorial.md AuthorName: Onkar Kulkarni - Services: EC2 + Services: + - EC2 Publications: - Title: Comprehensive benchmarking of tools for nanopore-based detection of DNA methylation URL: https://www.biorxiv.org/content/10.1101/2024.11.09.622763v1 From b00c0d2ef34d724ae337d9eb9b1bf6110aa4af55 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 10:57:10 -0400 Subject: [PATCH 507/751] ok: Update ont_basemod_data.yaml --- datasets/ont_basemod_data.yaml | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/datasets/ont_basemod_data.yaml b/datasets/ont_basemod_data.yaml index 518b3c188..f242873ae 100644 --- a/datasets/ont_basemod_data.yaml +++ b/datasets/ont_basemod_data.yaml @@ -22,7 +22,7 @@ Resources: Region: ap-south-1 Type: S3 Bucket Explore: - - "[Browse Bucket](https://ont-basemod-benchmark-data.s3.amazonaws.com/index.html)" + - '[Browse Bucket](https://ont-basemod-benchmark-data.s3.amazonaws.com/index.html)' - Description: 
Notifications for object created ARN: arn:aws:sns:ap-south-1:767415906609:ont-basemod-benchmark-data-object_created Region: ap-south-1 @@ -32,8 +32,6 @@ DataAtWork: - Title: Methylation calling using ONT methylation benchmarking dataset URL: https://github.com/SowpatiLab/ont-basemod-benchmark-data/blob/main/tutorial.md AuthorName: Onkar Kulkarni - Services: - - EC2 Publications: - Title: Comprehensive benchmarking of tools for nanopore-based detection of DNA methylation URL: https://www.biorxiv.org/content/10.1101/2024.11.09.622763v1 From 974e6f87b4da5e695d64f3568cbdc1c11fa14a30 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 11:02:47 -0400 Subject: [PATCH 508/751] ok: Update ont_basemod_data.yaml --- datasets/ont_basemod_data.yaml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/datasets/ont_basemod_data.yaml b/datasets/ont_basemod_data.yaml index f242873ae..bd50e2a5c 100644 --- a/datasets/ont_basemod_data.yaml +++ b/datasets/ont_basemod_data.yaml @@ -1,8 +1,7 @@ Name: ONT Methylation Benchmarking Datasets Description: ONT Methylation Benchmarking Datasets are generated to benchmark existing methylation-calling tools on the Oxford Nanopore sequencing platform using their recent R10.4.1 flowcell chemistry. It spans a diverse range of species, including bacteria (E. coli, H. pylori J99, H. pylori 26695, A. variabilis, T. denticola), plants (Rice, Arabidopsis), and mammals (mouse, human).In addition, the dataset includes EMSeq data for E. coli, plant, and mouse samples, which can serve as ground truth for methylation studies. It also provides unmethylated whole-genome amplified (WGA) DNA for H. pylori 26695 and a dam- dcm- double mutant (DM) of E. coli that lacks canonical 5mC and 6mA methylation. These variants, together with their wild-type counterparts, offer value for both training and benchmarking DNA methylation calling models. 
Documentation: https://github.com/SowpatiLab/ont-basemod-benchmark-data/blob/main/documentation.md -Contact: - - "onkar.ccmb@csir.res.in; tej.ccmb@csir.res.in" +Contact: "onkar.ccmb@csir.res.in; tej.ccmb@csir.res.in" ManagedBy: "[CSIR-Centre for Cellular and Molecular Biology](https://www.ccmb.res.in/)" UpdateFrequency: Datasets will be updated periodically as additional data is generated. Tags: From 3e9b1a6315ec1b7b7c4ea471aa0ed7629170f51f Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 11:37:13 -0400 Subject: [PATCH 509/751] ok: Update nrel-pds-ncdb.yaml --- datasets/nrel-pds-ncdb.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/nrel-pds-ncdb.yaml b/datasets/nrel-pds-ncdb.yaml index 22a4b8ec1..69a678349 100644 --- a/datasets/nrel-pds-ncdb.yaml +++ b/datasets/nrel-pds-ncdb.yaml @@ -45,7 +45,6 @@ Resources: Explore: - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=nrel-pds-hsds&prefix=nrel%2Fncdb%2F)' DataAtWork: - Tutorials: Tools & Applications: - Title: NCDB Website URL: https://climate.nrel.gov From d0620cb40886e5a886d50e052d833bf92cca860c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ali=20=C5=9Eapc=C4=B1?= Date: Thu, 30 Oct 2025 09:04:54 -0700 Subject: [PATCH 510/751] Update and rename krepp-idx.yaml to kreppref.yaml --- datasets/{krepp-idx.yaml => kreppref.yaml} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename datasets/{krepp-idx.yaml => kreppref.yaml} (98%) diff --git a/datasets/krepp-idx.yaml b/datasets/kreppref.yaml similarity index 98% rename from datasets/krepp-idx.yaml rename to datasets/kreppref.yaml index 70f3106ff..1a8f53f88 100644 --- a/datasets/krepp-idx.yaml +++ b/datasets/kreppref.yaml @@ -14,7 +14,7 @@ Tags: License: GPL-3.0 license. Use of the data should be cited in the usual way, following https://github.com/bo1929/krepp/tree/master?tab=readme-ov-file#citation. Resources: - Description: This dataset contains genomic indexes for various reference datasets in binary format. 
Using krepp, you can perform distance estimation and phylogenetic placement with respect to these indexes. - ARN: arn:aws:s3:::krepp-idx + ARN: arn:aws:s3:::kreppref Region: us-west-1 Type: S3 Bucket DataAtWork: From 18f6b3b46ee695649eff9f39e9e23d0c755ac6e3 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 30 Oct 2025 12:35:16 -0400 Subject: [PATCH 511/751] ok: Update kreppref.yaml Modified tags --- datasets/kreppref.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/kreppref.yaml b/datasets/kreppref.yaml index 1a8f53f88..d0d60298d 100644 --- a/datasets/kreppref.yaml +++ b/datasets/kreppref.yaml @@ -9,7 +9,7 @@ Tags: - metagenomics - microbiome - reference index - - phylogenetics + - aws-pds - life sciences License: GPL-3.0 license. Use of the data should be cited in the usual way, following https://github.com/bo1929/krepp/tree/master?tab=readme-ov-file#citation. Resources: From 7daa039f8104dbf3c197d6616058cdbabe061e80 Mon Sep 17 00:00:00 2001 From: Stefano Campanella <15182642+stefanocampanella@users.noreply.github.com> Date: Fri, 31 Oct 2025 13:04:03 +0100 Subject: [PATCH 512/751] Add resources section to ARCO-OCEAN dataset configuration --- datasets/ogs-arco-ocean.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/datasets/ogs-arco-ocean.yaml b/datasets/ogs-arco-ocean.yaml index 9202564d1..c1739f584 100644 --- a/datasets/ogs-arco-ocean.yaml +++ b/datasets/ogs-arco-ocean.yaml @@ -17,6 +17,11 @@ Tags: - zarr License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" Citation: "Campanella, S., Salon, S., Querin, S., Bortolussi, L., and Stock, J.: ARCO-OCEAN: A dataset of physical properties of the ocean, waves, and sea ice, with hydrological and atmospheric forcing, optimized for machine learning, accessed on DD-MM-YYYY." 
+Resources: + - Description: Zarr analysis ready data + ARN: arn:aws:s3:::ogs-arco-ocean + Region: eu-south-1 + Type: S3 Bucket DataAtWork: Tutorials: - Title: Computing the Oceanic El Nino Index (ONI) with Xarray and ARCO-OCEAN From 14cfecb28738977ba836acc69cc2a33eab3bd58a Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 31 Oct 2025 09:33:03 -0400 Subject: [PATCH 513/751] ok: Update longbench.yaml --- datasets/longbench.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/longbench.yaml b/datasets/longbench.yaml index 278c68206..eb7f5a290 100644 --- a/datasets/longbench.yaml +++ b/datasets/longbench.yaml @@ -17,7 +17,7 @@ Tags: - vcf - cancer - life sciences - + - aws-pds License: CC BY-4.0 Resources: - Description: Bulk, single-cell, and single-nucleus RNA-seq data from the LongBench project, covering eight human lung cancer cell lines. Bulk sequencing (FASTQ) was performed on ONT PCR-cDNA, ONT direct RNA (including pod5 files for RNA modification analysis), PacBio Kinnex, and Illumina platforms. Single-cell and single-nucleus sequencing (FASTQ) was performed on ONT PCR-cDNA, PacBio Kinnex, and Illumina platforms. Aligned reads (BAM), variant calls (VCF), and processed gene expression data are also provided, along with reference genome annotations (GTF and FASTA). 
From c1402a6b2431ad3b2cd5d4912ce7efbc04c51f45 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 31 Oct 2025 14:22:33 -0400 Subject: [PATCH 514/751] ok: Update longbench.yaml --- datasets/longbench.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/longbench.yaml b/datasets/longbench.yaml index eb7f5a290..7f653856c 100644 --- a/datasets/longbench.yaml +++ b/datasets/longbench.yaml @@ -12,7 +12,6 @@ Tags: - short read sequencing - bioinformatics - fastq - - pod5 - bam - vcf - cancer From 9fb0744c33070566325f032720670434e3195745 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 31 Oct 2025 12:46:30 -0800 Subject: [PATCH 515/751] ok: Update ogs-arco-ocean.yaml --- datasets/ogs-arco-ocean.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/ogs-arco-ocean.yaml b/datasets/ogs-arco-ocean.yaml index c1739f584..0e08b3f37 100644 --- a/datasets/ogs-arco-ocean.yaml +++ b/datasets/ogs-arco-ocean.yaml @@ -6,6 +6,7 @@ Contact: scampanella@ogs.it ManagedBy: "[OGS](https://www.ogs.it/en/dynamics-ecosystems-and-computational-oceanography)" UpdateFrequency: Variable (as needed). 
Tags: + - aws-pds - analysis ready data - atmosphere - climate @@ -30,4 +31,4 @@ DataAtWork: AuthorName: OGS AuthorURL: https://github.com/inogs/ ADXCategories: - - Environmental Data \ No newline at end of file + - Environmental Data From 6f3da7626ddd85aca5dd58be1a6066ba5c89fae1 Mon Sep 17 00:00:00 2001 From: Andrew Johnston Date: Fri, 31 Oct 2025 13:56:43 -0800 Subject: [PATCH 516/751] Update temporal span for nasa-operal2rtc-s1v1 --- datasets/nasa-operal2rtc-s1v1.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/nasa-operal2rtc-s1v1.yaml b/datasets/nasa-operal2rtc-s1v1.yaml index 98ecbaa8d..1ea9c9299 100644 --- a/datasets/nasa-operal2rtc-s1v1.yaml +++ b/datasets/nasa-operal2rtc-s1v1.yaml @@ -15,7 +15,7 @@ Description: "The Observational Products for End-Users from Remote Sensing Analy β0, through radiometric terrain correction. The RTC-S1 product is distributed as cloud optimized GeoTIFFs with one GeoTIFF file per processed polarization. The RTC-S1 product metadata is provided in the Hierarchical Data Format version 5 (HDF5) - format. The OPERA RTC-S1 product contains modified Copernicus Sentinel data (2022-2025).\n\nDue + format. The OPERA RTC-S1 product contains modified Copernicus Sentinel data (2016-2025).\n\nDue to the S1 mission’s narrow orbital tube, radar-geometry layers such as incidence angle, local incidence angle, number of looks, and RTC Area Normalization Factor (ANF) vary slightly over time for each position on the ground, and therefore are @@ -28,7 +28,7 @@ Description: "The Observational Products for End-Users from Remote Sensing Analy Documentation: https://doi.org/10.5067/SNWG/OPERA_L2_RTC-S1_V1 Contact: 'Email: uso@asf.alaska.edu. 
Home Page: https://www.asf.alaska.edu/' ManagedBy: NASA -UpdateFrequency: From 2020-12-31 to Ongoing +UpdateFrequency: From 2016-04-14 to Ongoing Tags: - aws-pds - coastal From 8a32cba1ed3e5005bdb793af85e4029a05ff1354 Mon Sep 17 00:00:00 2001 From: Darcy Weedman <96114727+darcyweedman@users.noreply.github.com> Date: Sun, 2 Nov 2025 16:11:49 +1000 Subject: [PATCH 517/751] Update sentinel-2.yaml Include Geopera's pera portal which uses sentinel-2 for index visualisation and land analysis --- datasets/sentinel-2.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/sentinel-2.yaml b/datasets/sentinel-2.yaml index dd4f528e0..526b43f70 100644 --- a/datasets/sentinel-2.yaml +++ b/datasets/sentinel-2.yaml @@ -163,6 +163,10 @@ DataAtWork: URL: https://map.onesoil.ai/ AuthorName: OneSoil AuthorURL: https://onesoil.ai/ + - Title: Pera Portal + URL: https://portal.geopera.com/ + AuthorName: Geopera + AuthorURL: https://geopera.com/ Publications: - Title: Using Remote Sensing Images and Cloud Services on AWS to Improve Land Use and Cover Monitoring URL: https://ieeexplore.ieee.org/abstract/document/9165649 From 40003c589744e8256a62f908f43d2df36c17e33e Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 3 Nov 2025 09:01:11 -0900 Subject: [PATCH 518/751] ok: Update nasa-operal2rtc-s1v1.yaml From 9562a912345be72cc4ba757c36817637827c5acb Mon Sep 17 00:00:00 2001 From: Stefano Campanella <15182642+stefanocampanella@users.noreply.github.com> Date: Tue, 4 Nov 2025 16:21:18 +0100 Subject: [PATCH 519/751] Add SNS Topic for notifications to ARCO-OCEAN dataset configuration --- datasets/ogs-arco-ocean.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/ogs-arco-ocean.yaml b/datasets/ogs-arco-ocean.yaml index 0e08b3f37..2dd61e280 100644 --- a/datasets/ogs-arco-ocean.yaml +++ b/datasets/ogs-arco-ocean.yaml @@ -23,6 +23,10 @@ Resources: ARN: arn:aws:s3:::ogs-arco-ocean Region: eu-south-1 Type: S3 Bucket + - Description: 
Notifications for new ARCO-OCEAN data + ARN: arn:aws:sns:eu-south-1:985149164500:ogs-arco-ocean-object_created + Region: eu-south-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: Computing the Oceanic El Nino Index (ONI) with Xarray and ARCO-OCEAN From 1ffda0a6d76400d5e4bbfba00842f5acdddcf3b4 Mon Sep 17 00:00:00 2001 From: Matt McCormick Date: Tue, 4 Nov 2025 11:10:44 -0500 Subject: [PATCH 520/751] ome-zarr-open-scivis: add citation and update contact email --- datasets/ome-zarr-open-scivis.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/ome-zarr-open-scivis.yaml b/datasets/ome-zarr-open-scivis.yaml index c0a110131..79eb9a5a5 100644 --- a/datasets/ome-zarr-open-scivis.yaml +++ b/datasets/ome-zarr-open-scivis.yaml @@ -1,7 +1,7 @@ Name: OME-Zarr Open SciVis Datasets Description: This project provides the Open SciVis Datasets in a chunked, highly-compressed, multi-scale format, encodes metadata in JSON according to the OME-Zarr specification, and hosts the datasets on AWS S3 through the AWS Open Data Program, aiming to serve as a web-based resource for the scientific visualization community to enhance reproducibility and facilitate testing and development of OME-Zarr tools. Documentation: https://github.com/InsightSoftwareConsortium/OMEZarrOpenSciVisDatasets -Contact: "Matt McCormick " +Contact: "Matt McCormick " ManagedBy: "NumFOCUS" UpdateFrequency: On a biannual basis we update the datasets and sync with OME-Zarr standards. Tags: @@ -16,6 +16,7 @@ Tags: - volumetric imaging - zarr License: CC-BY-4.0 unless otherwise specified +Citation: McCormick, M. (2025). OME-Zarr Open SciVis Datasets (v2025.10.31). Zenodo. 
https://doi.org/10.5281/zenodo.17495294 Resources: - Description: OME-Zarr Open SciVis Datasets ARN: arn:aws:s3:::ome-zarr-scivis From ca027ab6d295ef4d75fe54a4adc381c3b0e952fc Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 4 Nov 2025 07:41:59 -0900 Subject: [PATCH 521/751] ok: Update ogs-arco-ocean.yaml From 0b70a514988bc530becbd85be94f1eaf1c8ccc5f Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 4 Nov 2025 13:01:01 -0500 Subject: [PATCH 522/751] ok: Update ome-zarr-open-scivis.yaml --- datasets/ome-zarr-open-scivis.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/ome-zarr-open-scivis.yaml b/datasets/ome-zarr-open-scivis.yaml index 79eb9a5a5..db5961416 100644 --- a/datasets/ome-zarr-open-scivis.yaml +++ b/datasets/ome-zarr-open-scivis.yaml @@ -15,6 +15,7 @@ Tags: - computed tomography - volumetric imaging - zarr + - aws-pds License: CC-BY-4.0 unless otherwise specified Citation: McCormick, M. (2025). OME-Zarr Open SciVis Datasets (v2025.10.31). Zenodo. https://doi.org/10.5281/zenodo.17495294 Resources: From 06c1dd899e604b7209579651bbaf3db51d8ddc3e Mon Sep 17 00:00:00 2001 From: Daofeng Li Date: Wed, 5 Nov 2025 08:44:22 -0600 Subject: [PATCH 523/751] Add HPRC v2 epigenome Add HPRC v2 epigenome datasets --- datasets/hprc-epigenome.yaml | 42 ++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) create mode 100644 datasets/hprc-epigenome.yaml diff --git a/datasets/hprc-epigenome.yaml b/datasets/hprc-epigenome.yaml new file mode 100644 index 000000000..b8900f75a --- /dev/null +++ b/datasets/hprc-epigenome.yaml @@ -0,0 +1,42 @@ +Name: Epigenomes of the Human Pangenome Reference Consortium (HPRC) Release 2 +Description: | + The Human Pangenome Reference Consortium (HPRC) Release 2 represents a landmark achievement in genomics, providing high-quality phased genome assemblies from over 200 individuals with comprehensive functional genomics data. 
The HPRC Epigenome Browser provides researchers a way to explore all epigenomics data generated by release 2. The HPRC Epigenome Browser (HPRCEB) is a modern, interactive web portal that democratizes access to HPRC Release 2 epigenomics data through an intuitive interface supporting genome selection, data visualization, and bulk download capabilities. The portal integrates genome assemblies, DNA methylation profiles, gene expression data, and chromatin accessibility measurements across diverse populations, enabling researchers to efficiently identify and retrieve datasets matching their specific research needs. +Contact: dli23@wustl.edu +ManagedBy: Ting Wang Lab (https://wang.wustl.edu/) +Documentation: https://epigenome.humanpangenome.org/?tab=tutorials +UpdateFrequency: Annual. The repository will be updated with each new batch of data as it is generated and released under the next HPRC yearly cycle. +Tags: + - biology + - bioinformatics + - genetic + - genomic + - life sciences + - PacBio + - ONT + - Fiber-seq +License: External data users may freely download, analyze, and publish results based on any HPRC data provided here without restrictions. +Resources: + - Description: HPRC Epigenome Browser + ARN: arn:aws:s3:::hprc-epigenome + Region: us-west-2 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Finding and Downloading HPRC Epigenome Data files + URL: https://epigenome.humanpangenome.org/ + AuthorName: HPRCEB + AuthorURL: https://epigenome.humanpangenome.org/ + Publications: + - Title: | + A draft human pangenome reference + URL: https://doi.org/10.1038/s41586-023-05896-x + AuthorName: Liao, WW., Asri, M., Ebler, J. et al. + - Title: | + Modbed track: Visualization of modified bases in single-molecule sequencing + URL: https://www.sciencedirect.com/science/article/pii/S2666979X23002999?via%3Dihub + AuthorName: Daofeng Li, Xiaoyu Zhuo, Jessica K. 
Harrison, Shane Liu, Ting Wang + - Title: | + WashU Epigenome Browser update 2025 + URL: https://doi.org/10.1093/nar/gkaf387 + AuthorName: Chanrung Seng, Shane Liu, Wenjin Zhang, Xiaoyu Zhuo, Daofeng Li, Ting Wang + From d983378981d7c41f140e8d42f48c526c61130034 Mon Sep 17 00:00:00 2001 From: Daofeng Li Date: Wed, 5 Nov 2025 11:19:39 -0600 Subject: [PATCH 524/751] add the get-to-know dataset information add the get-to-know dataset information --- datasets/hprc-epigenome.yaml | 27 +++++++++++++-------------- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/datasets/hprc-epigenome.yaml b/datasets/hprc-epigenome.yaml index b8900f75a..d8f9784f8 100644 --- a/datasets/hprc-epigenome.yaml +++ b/datasets/hprc-epigenome.yaml @@ -22,21 +22,20 @@ Resources: Type: S3 Bucket DataAtWork: Tutorials: - - Title: Finding and Downloading HPRC Epigenome Data files - URL: https://epigenome.humanpangenome.org/ - AuthorName: HPRCEB + - Title: | + Get To Know A Dataset: HPRC Epigenome + URL: https://github.com/twlab/open-data-examples/blob/main/get-to-know-hprc-epigenome.ipynb + AuthorName: HPRC Epigenome Browser AuthorURL: https://epigenome.humanpangenome.org/ Publications: + - Title: A draft human pangenome reference + URL: https://doi.org/10.1038/s41586-023-05896-x + AuthorName: Liao, WW., Asri, M., Ebler, J. et al. - Title: | - A draft human pangenome reference - URL: https://doi.org/10.1038/s41586-023-05896-x - AuthorName: Liao, WW., Asri, M., Ebler, J. et al. - - Title: | - Modbed track: Visualization of modified bases in single-molecule sequencing - URL: https://www.sciencedirect.com/science/article/pii/S2666979X23002999?via%3Dihub - AuthorName: Daofeng Li, Xiaoyu Zhuo, Jessica K. 
Harrison, Shane Liu, Ting Wang - - Title: | - WashU Epigenome Browser update 2025 - URL: https://doi.org/10.1093/nar/gkaf387 - AuthorName: Chanrung Seng, Shane Liu, Wenjin Zhang, Xiaoyu Zhuo, Daofeng Li, Ting Wang + Modbed track: Visualization of modified bases in single-molecule sequencing + URL: https://www.sciencedirect.com/science/article/pii/S2666979X23002999?via%3Dihub + AuthorName: Daofeng Li, Xiaoyu Zhuo, Jessica K. Harrison, Shane Liu, Ting Wang + - Title: WashU Epigenome Browser update 2025 + URL: https://doi.org/10.1093/nar/gkaf387 + AuthorName: Chanrung Seng, Shane Liu, Wenjin Zhang, Xiaoyu Zhuo, Daofeng Li, Ting Wang From b0e67d57e0a30a3a11e72d70bdfe3ee9096d8ed1 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 6 Nov 2025 09:46:20 -0500 Subject: [PATCH 525/751] ok: Update dandiarchive.yaml From 58b0b1b87bb36a0ac58627c0742d60a5887cf7e7 Mon Sep 17 00:00:00 2001 From: Ziang Liu Date: Thu, 6 Nov 2025 14:54:07 -0500 Subject: [PATCH 526/751] Update s3 bucket and dataset tutorial --- datasets/open-robo-care.yaml | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/datasets/open-robo-care.yaml b/datasets/open-robo-care.yaml index d6fd9da7f..dd39e6994 100644 --- a/datasets/open-robo-care.yaml +++ b/datasets/open-robo-care.yaml @@ -13,11 +13,15 @@ License: "BSD-3-Clause license - Academic and non-commercial use permitted. See Citation: "Liang, X., Liu, Z., Lin, K., Gu, E., Ye, R., Nguyen, T., Hsu, C., Wu, Z., Yang, X., Cheung, C.S.Y., Soh, H., Dimitropoulou, K., & Bhattacharjee, T. (2025). OpenRoboCare: A Multimodal Multi-Task Expert Demonstration Dataset for Robot Caregiving. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)." 
Resources: - Description: Full Dataset - ARN: arn:aws:s3:::openrobocare - Region: us-east-1 + ARN: arn:aws:s3:::open-robo-care + Region: us-west-2 Type: S3 Bucket DataAtWork: Tutorials: + - Title: Get To Know A Dataset: OpenRoboCare + URL: https://github.com/empriselab/open-data-examples/blob/main/open-robo-care/get-to-know-a-dataset.ipynb + AuthorName: Cornell University EmPRISE Lab + AuthorURL: https://emprise.cs.cornell.edu/ - Title: OpenRoboCare Dataset Viewer URL: https://emprise.cs.cornell.edu/robo-care-viewer/ AuthorName: Cornell University EmPRISE Lab @@ -25,4 +29,4 @@ DataAtWork: Publications: - Title: "OpenRoboCare: A Multimodal Multi-Task Expert Demonstration Dataset for Robot Caregiving" URL: https://emprise.cs.cornell.edu/robo-care/ - AuthorName: Liang X, Liu Z, Lin K, et al. \ No newline at end of file + AuthorName: Liang X, Liu Z, Lin K, et al. From f4c91dfc5e56a9f2d707851a05e7ef27602586aa Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 6 Nov 2025 16:04:32 -0500 Subject: [PATCH 527/751] ok: Update open-robo-care.yaml --- datasets/open-robo-care.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/open-robo-care.yaml b/datasets/open-robo-care.yaml index dd39e6994..be8099445 100644 --- a/datasets/open-robo-care.yaml +++ b/datasets/open-robo-care.yaml @@ -9,6 +9,8 @@ Tags: - robotics - machine learning - health + - aws-pds + - life sciences License: "BSD-3-Clause license - Academic and non-commercial use permitted. See documentation for full terms." Citation: "Liang, X., Liu, Z., Lin, K., Gu, E., Ye, R., Nguyen, T., Hsu, C., Wu, Z., Yang, X., Cheung, C.S.Y., Soh, H., Dimitropoulou, K., & Bhattacharjee, T. (2025). OpenRoboCare: A Multimodal Multi-Task Expert Demonstration Dataset for Robot Caregiving. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)." 
Resources: From 0a2a75537ab63e5a10f7dbfa3fbd4323f7d50007 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Fri, 7 Nov 2025 09:30:57 -0900 Subject: [PATCH 528/751] ok: Update sentinel-2.yaml --- datasets/sentinel-2.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/sentinel-2.yaml b/datasets/sentinel-2.yaml index 526b43f70..5743b956b 100644 --- a/datasets/sentinel-2.yaml +++ b/datasets/sentinel-2.yaml @@ -174,4 +174,3 @@ DataAtWork: - Title: "Coral-spawn slicks: Reflectance spectra and detection using optical satellite data" URL: https://www.sciencedirect.com/science/article/pii/S0034425720304284 AuthorName: Hiroya Yamano, Asahi Sakuma, Saki Harii - From d179b450f9dcfb9b761eeef9b3c7f74dce64ea01 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 10 Nov 2025 10:08:10 -0500 Subject: [PATCH 529/751] ok: Update open-robo-care.yaml From 733d81fce3edf6024669350728805cab07b87ca2 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 10 Nov 2025 10:08:52 -0500 Subject: [PATCH 530/751] ok: Update open-robo-care.yaml --- datasets/open-robo-care.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/open-robo-care.yaml b/datasets/open-robo-care.yaml index be8099445..12dfd0c6a 100644 --- a/datasets/open-robo-care.yaml +++ b/datasets/open-robo-care.yaml @@ -20,7 +20,7 @@ Resources: Type: S3 Bucket DataAtWork: Tutorials: - - Title: Get To Know A Dataset: OpenRoboCare + - Title: "Get To Know A Dataset: OpenRoboCare" URL: https://github.com/empriselab/open-data-examples/blob/main/open-robo-care/get-to-know-a-dataset.ipynb AuthorName: Cornell University EmPRISE Lab AuthorURL: https://emprise.cs.cornell.edu/ From 756951e16caac3eaf3b3291fca8b4ff7edb65c53 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 10 Nov 2025 11:13:49 -0500 Subject: [PATCH 531/751] ok: Update open-robo-care.yaml --- datasets/open-robo-care.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git 
a/datasets/open-robo-care.yaml b/datasets/open-robo-care.yaml index 12dfd0c6a..eef0e4b96 100644 --- a/datasets/open-robo-care.yaml +++ b/datasets/open-robo-care.yaml @@ -1,5 +1,5 @@ Name: OpenRoboCare Multi-Modal Expert Demonstration Dataset for Robot-Assisted Caregiving -Description: A comprehensive multi-modal dataset capturing real-world caregiving routines from 21 occupational therapists performing 15 daily caregiving tasks. The dataset includes synchronized RGB-D video, tactile sensing, eye-gaze tracking, pose annotations, and action labels across 315 sessions totaling 19.8 hours of expert demonstrations. Data modalities include anonymized RGB images, depth maps, 44-sensor tactile readings, 2D/3D pose tracking, temporal action annotations, and first/third-person videos, enabling research in robot learning from demonstration, multimodal perception, and safe human-robot interaction for caregiving applications. +Description: A comprehensive multimodal dataset capturing real-world caregiving routines from 21 occupational therapists performing 15 daily caregiving tasks. The dataset includes synchronized RGB-D video, tactile sensing, eye-gaze tracking, pose annotations, and action labels across 315 sessions totaling 19.8 hours of expert demonstrations. Data modalities include anonymized RGB images, depth maps, 44-sensor tactile readings, 2D/3D pose tracking, temporal action annotations, and first/third-person videos, enabling research in robot learning from demonstration, multimodal perception, and safe human-robot interaction for caregiving applications. 
Documentation: https://emprise.cs.cornell.edu/robo-care/docs Contact: https://emprise.cs.cornell.edu/robo-care/ ManagedBy: "[EmPRISE Lab at Cornell University](https://emprise.cs.cornell.edu/)" From 8352116c4bc93a1bb6c887740a603b55d2f22b85 Mon Sep 17 00:00:00 2001 From: Greg Lindahl Date: Wed, 12 Nov 2025 06:06:06 +0000 Subject: [PATCH 532/751] chore: Update Common Crawl description with new page count --- datasets/commoncrawl.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/commoncrawl.yaml b/datasets/commoncrawl.yaml index 25791b984..8d6a4254a 100644 --- a/datasets/commoncrawl.yaml +++ b/datasets/commoncrawl.yaml @@ -1,5 +1,5 @@ Name: Common Crawl -Description: A corpus of web crawl data composed of over 50 billion web pages. +Description: A corpus of web crawl data composed of over 300 billion web pages. Documentation: https://commoncrawl.org/get-started Contact: https://commoncrawl.org/contact-us ManagedBy: "[Common Crawl](https://commoncrawl.org/)" From 368a883d71a10e3392d97cc56e7c21fdf7cd9a7f Mon Sep 17 00:00:00 2001 From: kanagawa-pointcloud Date: Wed, 12 Nov 2025 16:07:54 +0900 Subject: [PATCH 533/751] Add a yaml file for kanagawa-pointcloud datasets --- datasets/kanagawa_pointcloud.yaml | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) create mode 100644 datasets/kanagawa_pointcloud.yaml diff --git a/datasets/kanagawa_pointcloud.yaml b/datasets/kanagawa_pointcloud.yaml new file mode 100644 index 000000000..e21478b30 --- /dev/null +++ b/datasets/kanagawa_pointcloud.yaml @@ -0,0 +1,31 @@ +Name: Kanagawa, 3D Point Cloud Data +Description: | + This dataset comprises high-precision 3D point cloud data that encompasses the entire Kanagawa prefecture in Japan. + The data is produced through aerial laser survey, airborne laser bathymetry and mobile mapping systems, the culmination of many years of dedicated effort. 
+ This data will be visualized and analyzed for use in infrastructure maintenance, disaster prevention measures and autonomous vehicle driving. +Documentation: https://github.com/aigidjp/opendata_kanagawa_pointcloud/blob/main/README.md +Contact: kanagawa-pointcloud@aigid.jp +ManagedBy: "[AIGID](https://aigid.jp/)" +UpdateFrequency: Currently not scheduled +Tags: + - aws-pds + - disaster response + - elevation + - geospatial + - japanese + - land + - lidar + - mapping +License: "Creative Commons Attribution 4.0 International (CC-BY 4.0)" +Resources: + - Description: Point Cloud Data of Kanagawa Prefecture, Japan + ARN: arn:aws:s3:::kanagawa-pointcloud + Region: ap-northeast-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Tutorial of handling LAS format point cloud data + URL: https://github.com/aigidjp/opendata_kanagawa_pointcloud/blob/main/tutorials/README.md + AuthorName: AIGID + Tools & Applications: + Publications: From 575a4c2ee94c3bdb20fbd4554de3fe5fe883544e Mon Sep 17 00:00:00 2001 From: Peter Schmiedeskamp Date: Wed, 12 Nov 2025 06:29:06 -0800 Subject: [PATCH 534/751] ok: ready to merge From 30b6b2d4bf1b8ad259809dc3efb8ff127bf6862e Mon Sep 17 00:00:00 2001 From: piyushrpt Date: Wed, 12 Nov 2025 10:07:26 -0800 Subject: [PATCH 535/751] Update ASTER L1T entry --- datasets/aster-l1t.yaml | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/datasets/aster-l1t.yaml b/datasets/aster-l1t.yaml index da2b1ce98..8fd5d687d 100644 --- a/datasets/aster-l1t.yaml +++ b/datasets/aster-l1t.yaml @@ -19,8 +19,8 @@ Description: | [here](https://github.com/awslabs/open-data-docs/tree/main/docs/aster-l1t). 
Documentation: https://github.com/awslabs/open-data-docs/tree/main/docs/aster-l1t -Contact: opendata@descarteslabs.com -ManagedBy: "[Descartes Labs](https://descarteslabs.com/)" +Contact: support@earthdaily.com +ManagedBy: "[EarthDaily Analytics](https://earthdaily.com/)" UpdateFrequency: Daily Collabs: ASDI: @@ -55,10 +55,10 @@ DataAtWork: AuthorName: Cole Krehbiel AuthorURL: https://lpdaac.usgs.gov/ Tools & Applications: - - Title: Descartes Labs Platform - URL: https://descarteslabs.com/platform - AuthorName: Descartes Labs Inc. - AuthorURL: https://descarteslabs.com + - Title: EarthDaily EarthOne Platform + URL: https://docs.earthone.earthdaily.com/ + AuthorName: EarthDaily Analytics + AuthorURL: https://earthdaily.com - Title: Latitude-Longitude to Path-Row conversion URL: "https://landsat.usgs.gov/landsat_acq#convertPathRow" AuthorName: USGS From 95b66aed662ca0938f9e7eeee2026c9d167ee3f8 Mon Sep 17 00:00:00 2001 From: piyushrpt Date: Wed, 12 Nov 2025 10:54:08 -0800 Subject: [PATCH 536/751] Update LPDAAC links --- datasets/aster-l1t.yaml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/datasets/aster-l1t.yaml b/datasets/aster-l1t.yaml index 8fd5d687d..4190851c4 100644 --- a/datasets/aster-l1t.yaml +++ b/datasets/aster-l1t.yaml @@ -20,7 +20,7 @@ Description: | Documentation: https://github.com/awslabs/open-data-docs/tree/main/docs/aster-l1t Contact: support@earthdaily.com -ManagedBy: "[EarthDaily Analytics](https://earthdaily.com/)" +ManagedBy: "[EarthDaily](https://earthdaily.com/)" UpdateFrequency: Daily Collabs: ASDI: @@ -51,13 +51,13 @@ Resources: DataAtWork: Tutorials: - Title: Working with ASTER L1T Visible and Near Infrared (VNIR) Data in R - URL: https://lpdaac.usgs.gov/documents/128/ASTER_L1T_Tutorial.html + URL: https://git.earthdata.nasa.gov/projects/LPDUR/repos/aster-l1t/browse AuthorName: Cole Krehbiel - AuthorURL: https://lpdaac.usgs.gov/ + AuthorURL: https://www.earthdata.nasa.gov/centers/lp-daac Tools & Applications: - 
Title: EarthDaily EarthOne Platform URL: https://docs.earthone.earthdaily.com/ - AuthorName: EarthDaily Analytics + AuthorName: EarthDaily AuthorURL: https://earthdaily.com - Title: Latitude-Longitude to Path-Row conversion URL: "https://landsat.usgs.gov/landsat_acq#convertPathRow" From e1786b3c8c4acc3ec283a172ffecf33b7c77c226 Mon Sep 17 00:00:00 2001 From: David Raymond Date: Thu, 13 Nov 2025 14:36:38 -0500 Subject: [PATCH 537/751] Update pdb-3d-structural-biology-data.yaml --- datasets/pdb-3d-structural-biology-data.yaml | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/datasets/pdb-3d-structural-biology-data.yaml b/datasets/pdb-3d-structural-biology-data.yaml index 8597c1023..b1e0460c2 100644 --- a/datasets/pdb-3d-structural-biology-data.yaml +++ b/datasets/pdb-3d-structural-biology-data.yaml @@ -11,7 +11,9 @@ Description: > Electron Microscopy Data Bank (wwPDB-designated Archive Keeper: EMDB) Biological Magnetic Resonance Bank (wwPDB-designated Archive Keeper: BMRB) -Documentation: https://www.wwpdb.org/documentation/file-format +Documentation: + - https://www.wwpdb.org/documentation/file-format + - https://www.rcsb.org/docs/programmatic-access/file-download-services Contact: https://www.wwpdb.org/about/contact ManagedBy: "[Worldwide Protein Data Bank Partnership](wwpdb.org)" UpdateFrequency: | @@ -61,3 +63,11 @@ DataAtWork: - Title: "Protein Data Bank: the single global archive for 3D macromolecular structure data" URL: https://doi.org/10.1093/nar/gky949 AuthorName: wwPDB consortium + Tutorials: + - Title: "Get to Know a Dataset: Protein Data Bank 3D Structural Biology Data" + URL: https://github.com/rcsb/AWS-Open_Data_Registry/blob/master/PDB_3D_Dataset_Tour.ipynb + AuthorName: RCSB PDB + AuthorURL: https://rcsb.org/ + - Title: "PDB 101" + URL: https://pdb101.rcsb.org/ + From a4f116caffa957831a0fa3e3526595e713d93e2b Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 13 Nov 2025 14:54:09 -0500 Subject: [PATCH 538/751] ok: 
Update pdb-3d-structural-biology-data.yaml --- datasets/pdb-3d-structural-biology-data.yaml | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/datasets/pdb-3d-structural-biology-data.yaml b/datasets/pdb-3d-structural-biology-data.yaml index b1e0460c2..1be0ccb5e 100644 --- a/datasets/pdb-3d-structural-biology-data.yaml +++ b/datasets/pdb-3d-structural-biology-data.yaml @@ -56,13 +56,6 @@ Resources: Explore: - '[Browse Bucket](https://pdbsnapshots.s3.us-west-2.amazonaws.com/index.html)' DataAtWork: - Publications: - - Title: "Announcing the worldwide Protein Data Bank" - URL: https://doi.org/10.1038/nsb1203-980 - AuthorName: Berman, H., Henrick, K. & Nakamura, H. - - Title: "Protein Data Bank: the single global archive for 3D macromolecular structure data" - URL: https://doi.org/10.1093/nar/gky949 - AuthorName: wwPDB consortium Tutorials: - Title: "Get to Know a Dataset: Protein Data Bank 3D Structural Biology Data" URL: https://github.com/rcsb/AWS-Open_Data_Registry/blob/master/PDB_3D_Dataset_Tour.ipynb @@ -70,4 +63,10 @@ DataAtWork: AuthorURL: https://rcsb.org/ - Title: "PDB 101" URL: https://pdb101.rcsb.org/ - + Publications: + - Title: "Announcing the worldwide Protein Data Bank" + URL: https://doi.org/10.1038/nsb1203-980 + AuthorName: Berman, H., Henrick, K. & Nakamura, H. 
+ - Title: "Protein Data Bank: the single global archive for 3D macromolecular structure data" + URL: https://doi.org/10.1093/nar/gky949 + AuthorName: wwPDB consortium From 0afd6c2157fabb94316bff78287514a0cb163e38 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 13 Nov 2025 15:06:14 -0500 Subject: [PATCH 539/751] ok: Update pdb-3d-structural-biology-data.yaml Moving the second doc link to tutorials --- datasets/pdb-3d-structural-biology-data.yaml | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/datasets/pdb-3d-structural-biology-data.yaml b/datasets/pdb-3d-structural-biology-data.yaml index 1be0ccb5e..16d3ca9af 100644 --- a/datasets/pdb-3d-structural-biology-data.yaml +++ b/datasets/pdb-3d-structural-biology-data.yaml @@ -11,9 +11,7 @@ Description: > Electron Microscopy Data Bank (wwPDB-designated Archive Keeper: EMDB) Biological Magnetic Resonance Bank (wwPDB-designated Archive Keeper: BMRB) -Documentation: - - https://www.wwpdb.org/documentation/file-format - - https://www.rcsb.org/docs/programmatic-access/file-download-services +Documentation: "https://www.wwpdb.org/documentation/file-format" Contact: https://www.wwpdb.org/about/contact ManagedBy: "[Worldwide Protein Data Bank Partnership](wwpdb.org)" UpdateFrequency: | @@ -63,6 +61,12 @@ DataAtWork: AuthorURL: https://rcsb.org/ - Title: "PDB 101" URL: https://pdb101.rcsb.org/ + AuthorName: RCSB PDB + AuthorURL: https://rcsb.org/ + - Title: "File Download Services" + URL: https://www.rcsb.org/docs/programmatic-access/file-download-services + AuthorName: RCSB PDB + AuthorURL: https://rcsb.org/ Publications: - Title: "Announcing the worldwide Protein Data Bank" URL: https://doi.org/10.1038/nsb1203-980 From 0ad15e7e0b75ddeda755d1b47c393efda0f7483e Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 13 Nov 2025 11:08:28 -0900 Subject: [PATCH 540/751] ok: Update aster-l1t.yaml --- datasets/aster-l1t.yaml | 1 - 1 file changed, 1 deletion(-) 
diff --git a/datasets/aster-l1t.yaml b/datasets/aster-l1t.yaml index 4190851c4..351660714 100644 --- a/datasets/aster-l1t.yaml +++ b/datasets/aster-l1t.yaml @@ -79,4 +79,3 @@ DataAtWork: - Title: ASTER L1T Product Specification URL: https://lpdaac.usgs.gov/documents/1401/ASTER_L1T_Product_Specification_v1.pdf AuthorName: USGS EROS Data Center - From 4b3c6494f60565d74dc83e9326c9414561d9446d Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 13 Nov 2025 15:10:35 -0500 Subject: [PATCH 541/751] ok: Update ember.yaml --- datasets/ember.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/ember.yaml b/datasets/ember.yaml index 8b2501c0b..3b7a1715c 100644 --- a/datasets/ember.yaml +++ b/datasets/ember.yaml @@ -33,6 +33,7 @@ Tags: - Homo sapiens - Mus musculus - non-human primate + - aws-pds License: Creative Commons 4.0 International (CC BY 4.0) Resources: - Description: Time series neurophysiology and behavioral data from animal and human (deidentified) From 4d62e38eff6790bfc0517581b1d9809ec9182aa1 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 13 Nov 2025 11:11:27 -0900 Subject: [PATCH 542/751] ok: Update aster-l1t.yaml From 24339fb7c1d8a7978f1720e40eff35360dc7d44e Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 13 Nov 2025 15:53:33 -0500 Subject: [PATCH 543/751] ok: Update ember.yaml --- datasets/ember.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/ember.yaml b/datasets/ember.yaml index 3b7a1715c..a32ecd8c8 100644 --- a/datasets/ember.yaml +++ b/datasets/ember.yaml @@ -44,4 +44,4 @@ DataAtWork: Publications: - Title: Mapping the landscape of social behavior URL: https://pubmed.ncbi.nlm.nih.gov/40043703/ - AuthorName: Ugne Klibaite, Tianqing Li, Diego Aldarondo, Jumana F Akoad, Bence P Ölveczky, Timothy W Dunn + AuthorName: Ugne Klibaite, Tianqing Li, Diego Aldarondo, Jumana F Akoad, Bence P Ölveczky, Timothy W Dunn. 
From 871c598bc5ee8beab155da2f91573ff818ed76ad Mon Sep 17 00:00:00 2001 From: "ceggers@rsna.org" Date: Fri, 14 Nov 2025 13:24:48 -0600 Subject: [PATCH 544/751] Create rsna-intracranial-aneurysm-detection-dataset.yaml --- ...tracranial-aneurysm-detection-dataset.yaml | 23 +++++++++++++++++++ 1 file changed, 23 insertions(+) create mode 100644 datasets/rsna-intracranial-aneurysm-detection-dataset.yaml diff --git a/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml b/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml new file mode 100644 index 000000000..3b74910b3 --- /dev/null +++ b/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml @@ -0,0 +1,23 @@ +Name: RSNA Intracranial Aneurysm Detection Dataset (RSNA-ICA) +Description: "The Radiological Society of North America Intracranial Aneurysm Detection (RSNA-ICA) dataset is a collection of over 4,000 CT brain scans annotated by a cohort of over 40 volunteer radiologists from RSNA and the American Society of Neuroradiology to show the presence and location of intracranial aneurysms. It also includes a set of about 200 imaging studies that are annotated with AI-generated segmentations highlighting abnormalities. The imaging data was provided by 18 institutions. Initially compiled in 2025 for the RSNA Intracranial Aneurysm Detection AI Challenge hosted on Kaggle competition platform (https://www.kaggle.com/competitions/rsna-intracranial-aneurysm-detection), it represents the largest publicly available collection of its kind. Additional information on the dataset and how to make use of it is provided in a forthcoming Data Resource Publication listed below, as well as on the Kaggle competition website, which also provides access to models developed during the competition." 
+Documentation: https://github.com/RSNA/AI-Challenge-Data/wiki/RSNA-Intracranial-Aneurysm-Detection-Dataset +Contact: informatics@rsna.org +ManagedBy: 'Radiological Society of North America (https://www.rsna.org/)' +UpdateFrequency: The dataset may be updated with additional or corrected data on a need-to-update basis. +Tags: + - aws-pds + - radiology + - medical imaging + - medical image computing + - machine learning + - computer vision + - csv + - labeled + - life sciences +License: "You may access and use these de-identified imaging datasets and annotations (“the data”) for non-commercial purposes only, including academic research and education, as long as you agree to abide by the following provisions: Not to make any attempt to identify or contact any individual(s) who may be the subjects of the data. If you share or re-distribute the data in any form, include a citation to the “Radiological Society of North America Intracranial Aneurysm Detection (RSNA-ICA) Dataset, July 2025” [https://doi.org/10.1148/dataset.ica.2025]." 
+Resources: + - Description: Zip archive containing DCM and CSV files + ARN: arn:aws:s3:::intracranial-aneurysm-detection + Region: us-west-2 + Type: S3 Bucket + ControlledAccess: https://mira.rsna.org/dataset/7 \ No newline at end of file From a98bfb642a444c59657b1186b00b88142bab68b5 Mon Sep 17 00:00:00 2001 From: "ceggers@rsna.org" Date: Fri, 14 Nov 2025 14:23:47 -0600 Subject: [PATCH 545/751] Add publication details to RSNA datasets --- datasets/rsna-intracranial-aneurysm-detection-dataset.yaml | 7 ++++++- ...a-lumbar-spine-degenerative-classification-dataset.yaml | 7 ++++++- 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml b/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml index 3b74910b3..c0a82d297 100644 --- a/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml +++ b/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml @@ -20,4 +20,9 @@ Resources: ARN: arn:aws:s3:::intracranial-aneurysm-detection Region: us-west-2 Type: S3 Bucket - ControlledAccess: https://mira.rsna.org/dataset/7 \ No newline at end of file + ControlledAccess: https://mira.rsna.org/dataset/7 +DataAtWork: + Publications: + - Title: The RSNA Intercranial Aneurysm Detection Dataset + URL: https://pubs.rsna.org/doi/full/10.1148/ryai.2021200254 + AuthorName: Authors, Various \ No newline at end of file diff --git a/datasets/rsna-lumbar-spine-degenerative-classification-dataset.yaml b/datasets/rsna-lumbar-spine-degenerative-classification-dataset.yaml index 0c3ae14e0..7bdafb478 100644 --- a/datasets/rsna-lumbar-spine-degenerative-classification-dataset.yaml +++ b/datasets/rsna-lumbar-spine-degenerative-classification-dataset.yaml @@ -20,4 +20,9 @@ Resources: ARN: arn:aws:s3:::lumbar-spine-degenerative-classification Region: us-west-2 Type: S3 Bucket - ControlledAccess: https://mira.rsna.org/dataset/6 \ No newline at end of file + ControlledAccess: https://mira.rsna.org/dataset/6 +DataAtWork: + 
Publications: + - Title: The RSNA Lumbar Spine Degenerative Classification Dataset + URL: https://pubs.rsna.org/doi/full/10.1148/ryai.2021200254 + AuthorName: Authors, Various From cb41e24bd0aabcfff4f16306bfbeb92f3a5fa72c Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 14 Nov 2025 15:28:59 -0500 Subject: [PATCH 546/751] ok: Update rsna-ratic.yaml --- datasets/rsna-ratic.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/rsna-ratic.yaml b/datasets/rsna-ratic.yaml index 2a78e5eb2..ce59c398c 100644 --- a/datasets/rsna-ratic.yaml +++ b/datasets/rsna-ratic.yaml @@ -15,6 +15,7 @@ Tags: - labeled - computed tomography - x-ray tomography + - life sciences License: "You may access and use these de-identified imaging datasets and annotations (“the data”) for non-commercial purposes only, including academic research and education, as long as you agree to abide by the following provisions: Not to make any attempt to identify or contact any individual(s) who may be the subjects of the data. If you share or re-distribute the data in any form, include a citation to the “Brain CT Hemorrhage Dataset, Copyright RSNA, 2019” as follows: Flanders AF, et al. The RSNA Brain CT Hemorrhage Dataset [10.1148/ryai.2020190211]. Radiology: Artificial Intelligence 2020;2:3." 
Resources: - Description: Zip archive containing DCM and CSV files From 826633cd4df16fe4554a1a9c4dc94801730aeabb Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 14 Nov 2025 15:29:21 -0500 Subject: [PATCH 547/751] ok: Update rsna-pulmonary-embolism-detection.yaml From 81195ac0316a0a4fc4a6a5c2dcf50e5f2e19c15a Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 14 Nov 2025 15:30:19 -0500 Subject: [PATCH 548/751] ok: Update rsna-lumbar-spine-degenerative-classification-dataset.yaml From 694f137ace237e7cedde30b1903648b9cd1028a4 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 14 Nov 2025 15:30:37 -0500 Subject: [PATCH 549/751] ok: Update rsna-intracranial-hemorrhage-detection.yaml From 20fc62422f174324efc46a592f7e5647f7610141 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 14 Nov 2025 15:31:06 -0500 Subject: [PATCH 550/751] ok: Update rsna-intracranial-aneurysm-detection-dataset.yaml --- datasets/rsna-intracranial-aneurysm-detection-dataset.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml b/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml index c0a82d297..1628878f8 100644 --- a/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml +++ b/datasets/rsna-intracranial-aneurysm-detection-dataset.yaml @@ -25,4 +25,5 @@ DataAtWork: Publications: - Title: The RSNA Intercranial Aneurysm Detection Dataset URL: https://pubs.rsna.org/doi/full/10.1148/ryai.2021200254 - AuthorName: Authors, Various \ No newline at end of file + AuthorName: Authors, Various + From 8fae71b5cf213b240c25c709f359b15062236c8a Mon Sep 17 00:00:00 2001 From: gopanairepa <117306436+gopanairepa@users.noreply.github.com> Date: Mon, 17 Nov 2025 08:21:01 -0500 Subject: [PATCH 551/751] Updating two EPA datasets to reflect recent reorgs --- datasets/epa-2022-modeling-platform.yaml | 7 ++++--- datasets/epa-hourly-prognostic-meteorology.yaml | 8 ++++---- 2 files changed, 8 insertions(+), 
7 deletions(-) diff --git a/datasets/epa-2022-modeling-platform.yaml b/datasets/epa-2022-modeling-platform.yaml index 5385ba4e2..c4ab087a0 100644 --- a/datasets/epa-2022-modeling-platform.yaml +++ b/datasets/epa-2022-modeling-platform.yaml @@ -1,9 +1,9 @@ Name: >- - OAQPS 2022 Modeling Platform + OSAP 2022 Modeling Platform Description: >- The data are part of the 2022 Modeling Platform used to support regulatory actions - and technical analyses conducted by the EPA's Office of Air Quality Planning and - Standards. Specifically, this data includes Weather Research and Forecasting Model (v4.4.2) + and technical analyses conducted by the EPA's Office of State Air Partnerships (OSAP). + Specifically, this data includes Weather Research and Forecasting Model (v4.4.2) conducted at a 12-km resolution over the Continental United States (12US). MCIP-processed files and wrfcamx-processed (12US1 domain) are also available as part of this dataset to assist in the use of emissions processing and photochemical modeling. These files @@ -49,6 +49,7 @@ Tags: - regulatory - weather - meteorological + - environmental License: >- These datasets are products of the U.S. Government and are intended for public access and use. Unless otherwise specified, all data produced by the U.S EPA diff --git a/datasets/epa-hourly-prognostic-meteorology.yaml b/datasets/epa-hourly-prognostic-meteorology.yaml index accdd1a7e..9a5f10843 100644 --- a/datasets/epa-hourly-prognostic-meteorology.yaml +++ b/datasets/epa-hourly-prognostic-meteorology.yaml @@ -2,8 +2,8 @@ Name: >- EPA Hourly Prognostic Meteorological Data Description: >- The data are hourly outputs from the Weather Research and Forecasting (WRF) model - generated by the EPA's Office of Air Quality Planning and Standards, Air Quality - Assessment Division, Air Quality Modeling Group. 
These data were generated at a 12-km + generated by the EPA's Office of State Air Partnerships (OSAP), Air Quality + Assessment Division, Air Quality Modeling Branch. These data were generated at a 12-km resolution over the Continental United States (12US), beginning for the year 2021 and continuing annually through 2023. These files are intended for use in a broad range of air quality applications, but specifically may be used in dispersion modeling applications @@ -23,11 +23,11 @@ ManagedBy: U.S. Environmental Protection Agency (https://www.epa.gov) UpdateFrequency: Annually Tags: - aws-pds - - environmental - air quality - regulatory - weather - meteorological + - environmental License: >- These datasets are products of the U.S. Government and are intended for public access and use. Unless otherwise specified, all data produced by the U.S EPA @@ -52,4 +52,4 @@ Resources: - Description: Notification for the EPA Hourly Prognostic Meteorological Data bucket ARN: 'arn:aws:sns:us-east-1:127085394039:epa-hourly-prognostic-meteorology-object_created' Region: us-east-1 - Type: SNS Topic + Type: SNS Topic \ No newline at end of file From 95680aab25fd852c2e07b8852c43b0a7dcfe5da9 Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Mon, 17 Nov 2025 08:33:19 -0500 Subject: [PATCH 552/751] New Dataset: CCRS MODIS albedo --- datasets/CCRSMODISAlbedo.yml | 101 +++++++++++++++++++++++++++++++++++ 1 file changed, 101 insertions(+) create mode 100644 datasets/CCRSMODISAlbedo.yml diff --git a/datasets/CCRSMODISAlbedo.yml b/datasets/CCRSMODISAlbedo.yml new file mode 100644 index 000000000..4910b48c6 --- /dev/null +++ b/datasets/CCRSMODISAlbedo.yml @@ -0,0 +1,101 @@ +Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS +Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using 
MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km × 4,800 km (22,800 pixel x 19,200 lines). +Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo/ +Contact: alexander.trichtchenko@nrcan-rncan.gc.ca +ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation +UpdateFrequency: Semi-annually, intil the end of MODIS operations +Tags: + - aws-pds + - analysis ready data + - broadband + - Canada + - COG + - earth observation + - satellite imagery +License: There are no restrictions on the use of this data. Creative Commons Licence. Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ +Citation: Trishchenko, Alexander P. 2025. CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals. 
+Resources: + - Description: + ARN: + Region: + Type: + RequesterPays (Optional): + AccountRequired (Optional): + ControlledAccess (Optional): + Explore (Optional): +DataAtWork: + Tutorials: + - Title:Get To Know A Dataset - MCCRS MODIS Albedo at 250-m resolution and 10-day intervals + URL: https://github.com/****/get-to-know-a-dataset-MYDATASET.ipynb + Services: + AuthorName: Alexander Trichtchenko + AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko + NotebookURL (Optional):https://github.com/****/get-to-know-a-dataset-MYDATASET.ipynb + Publications: + - Title: Boreal lichen woodlands: a possible negative feedback to climate change in eastern North America + URL: https://doi.org/10.1016/j.agrformet.2010.12.013 + AuthorName: Bernier, P.Y., Desjardins, R.L., Karimi-Zindashty, Y., Worth, D., Beaudoin, A., Luo, Y., Wang, S. + - Title: Detection of North American land cover change between 2005 and 2010 with 250m MODIS data + URL: https://www.researchgate.net/publication/286156544_Detection_of_North_American_land_cover_change_between_2005_and_2010_with_250m_MODIS_Data + AuthorName: Colditz, R.R., Pouliot, D., Llamas, R.M., Homer, C., Latifovic, R., Ressl, R.A., Tovar, C.M., Hernández, A.V., Richardson, K. + - Title: Annual mapping of large Forest disturbances across Canada’s forests using 250 m MODIS imagery from 2000 to 2011 + URL: https://doi.org/10.1139/cjfr-2014-0229 + AuthorName: Guindon, L., Bernier, P.Y., Beaudoin, A., Pouliot, D., Villemaire, P., Hall, R.J., Latifovic, R., St-Amant, R. + - Title: Perennial snow and ice variations (2000-2008) in the Arctic circumpolar land area from satellite observations + URL:https://doi.org/10.1029/2010JF001664 + AuthorName: Fontana F.M.A., Trishchenko A.P., Luo Y., Khlopenkov K.V., Nussbaumer S.U., Wunderle S. 
+ - Title: Influence of two management practices in the Canadian Prairies on radiative forcing + URL: https://doi.org/10.1016/j.scitotenv.2020.142701 + AuthorName: Liu, J., Worth, D.E., Desjardins, R.L., Haak, D., McConkey, B., Cerkowniak, D. + - Title: Implementation and Evaluation of Concurrent Gradient Search Method for Reprojection of MODIS Level 1B Imagery + URL: https://doi.org/10.1109/TGRS.2008.916633 + AuthorName: Khlopenkov, K.V., and Trishchenko, A.P. + - Title: Developing clear-sky, cloud and cloud shadow mask for producing clear-sky composites at 250-meter spatial resolution for the seven MODIS land bands over Canada and North America + URL: https://doi.org/10.1016/j.rse.2008.06.010 + AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. + - Title: Surface bidirectional reflectance and albedo properties derived by a land cover based approach from the MODIS observations. + URL: https://doi.org/10.1029/2004JD004741 + AuthorName: Luo, Y., Trishchenko, Alexander P., Latifovic, R., Li, Z. + - Title: An approach for developing surface albedo product from seven MODIS land bands at 250m spatial resolution over Canada and the Arctic circumpolar region + URL: https://lpvs.gsfc.nasa.gov/LPV_meetings/Beijing09/Luo_MODIS_Albedo_Product.pdf + AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. 
+ - Title: A raster version of the circumpolar Arctic vegetation map (CAVM) + URL: https://doi.org/10.1016/J.RSE.2019.111297 + AuthorName: Raynolds, M.K., Walker, D.A., Balser, A., Bay, C., Campbell, M., Cherosov, M.M., Daniëls, F.J.A., Eidesen, P.B., Ermokhina, K.A., Frost, G.V., Jedrzejek, B., Jorgenson, M.T., Kennedy, B.E., Kholod, S.S., Lavrinenko, I.A., Lavrinenko, O.V., Magnússon, B., Matveyeva, N.V., Metúsalemsson, S., Nilsen, L., Olthof, I., Pospelov, I.N., Pospelova, E.B., Pouliot, D., Razzhivin, V., Schaepman-Strub, G., Šibík, J., Telyatnikov, M.Y., Troeva, E. 
+ - Title: Cumulative changes in minimum snow/ice extent over Canada and Northern USA for 2000–2023 + URL: https://doi.org/10.1080/07038992.2024.2371359 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Annual minimum snow/ice extent variations over Greenland since 2000: ice sheet, peripheral areas, and relation to ice mass balance + URL: https://doi.org/10.1175/BAMS-D-22-0244.1 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Landfast ice properties over the Beaufort Sea region in 2000–2019 from MODIS and Canadian Ice Service data + URL: https://doi.org/10.1139/cjes-2021-0011 + AuthorName: Trishchenko, A.P., Kostylev, V.E., Luo, Y., Ungureanu, C., Whalen, D., Li, J. + - Title: Landfast ice mapping using MODIS clear-sky composites: application for the Banks Island coastline in Beaufort Sea and comparison with Canadian Ice Service data + URL: https://doi.org/10.1080/07038992.2021.1909466 + AuthorName: Trishchenko, A.P., Luo, Y. + - Title: Minimum snow/ice extent over the Northern circumpolar landmass in 2000–19: how much snow survives the summer melt? + URL: https://doi.org/10.1175/BAMS-D-20-0177.1 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Variations of annual minimum snow and ice extent over Canada and neighbouring landmass derived from MODIS 250-m imagery for 2000–2014 + URL: https://doi.org/10.1080/07038992.2016.1166043 + AuthorName: Trishchenko, A.P., Leblanc, S.G., Wang, S., Li, J., Ungureanu, C., Luo, Y., Khlopenkov, K.V., Fontana, F., 2016 + - Title: A method for downscaling MODIS land channels to 250-m spatial resolution using adaptive regression and normalization + URL: https://doi.org/10.1117/12.689157 + AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V. 
+ - Title: Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1–B7 + URL: https://doi.org/10.1080/01431160802348119 + AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V., Park, W.M., Wang, S. 
+ - Title: Clear-Sky Composites over Canada from Visible Infrared Imaging Radiometer Suite: Continuing MODIS Time Series into the Future + URL: https://doi.org/10.1080/07038992.2019.1601006 + AuthorName: Trishchenko, A.P. + - Title: MODIS Surface Albedo and Surface Reflectance Dataset. Format Description. + URL: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo/ + AuthorName: Trishchenko, Alexander P., Ungureanu, Calin + - Title: Warm season snow/ice probability maps from modis and viirs sensors over Canada + URL: https://doi.org/10.1109/IGARSS.2018.8519558 + AuthorName: Trishchenko, Alexander P., Ungureanu, Calin + - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada + URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf + AuthorName: Trishchenko, Alexander P. + \ No newline at end of file From 69b9683d9c77a264506c740d593d70793c9b34e1 Mon Sep 17 00:00:00 2001 From: Jed Sundwall Date: Mon, 17 Nov 2025 07:38:22 -0800 Subject: [PATCH 553/751] Add Google Satellite Embedding V1 dataset details --- datasets/aef-source.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) create mode 100644 datasets/aef-source.md diff --git a/datasets/aef-source.md b/datasets/aef-source.md new file mode 100644 index 000000000..354cef8c4 --- /dev/null +++ b/datasets/aef-source.md @@ -0,0 +1,23 @@ +Name: Google Satellite Embedding V1 +Description: COG (Cloud-Optimized GeoTIFF) files that together contain the AlphaEarth Foundations annual Satellite Embedding dataset. It contains the annual embeddings for the years from 2018 to 2024, inclusive. 
+Documentation: https://source.coop/tge-labs/aef +Contact: https://cloudnativegeo.org/join +ManagedBy: "[Source Cooperative](https://source.coop/)" +UpdateFrequency: As new data versions become available +Tags: + - machine learning + - satellite imagery + - aerial imagery + - earth observation + - imaging +License: CC-BY 4.0 +Citation: "The AlphaEarth Foundations Satellite Embedding dataset is produced by Google and Google DeepMind." +Resources: + - Description: Google Satellite Embedding V1 + ARN: arn:aws:s3:::us-west-2.opendata.source.coop/tge-labs/aef + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://source.coop/tge-labs/aef/)' +ADXCategories: + - Environmental Data From cbefe77dd0169a903f1bed4fb93b28369afb06be Mon Sep 17 00:00:00 2001 From: kszura <43186787+kszura@users.noreply.github.com> Date: Mon, 17 Nov 2025 14:49:41 -0500 Subject: [PATCH 554/751] Add NOAA nClimGrid and Livneh climate data metadata This file contains metadata for NOAA nClimGrid and Livneh gridded historical climate observation thresholds, including descriptions, documentation links, contact information, and licensing details. --- datasets/noaa-cris-hist.yaml | 40 ++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 datasets/noaa-cris-hist.yaml diff --git a/datasets/noaa-cris-hist.yaml b/datasets/noaa-cris-hist.yaml new file mode 100644 index 000000000..c7ba0353d --- /dev/null +++ b/datasets/noaa-cris-hist.yaml @@ -0,0 +1,40 @@ +Name: NOAA nClimGrid and Livneh Gridded Historical Climate Observation Thresholds +Description: | + Livneh and nClimGrid are gridded observed historical climatology data that were used in the LOCA2 and STAR-ESDM downscaling process of global climate models as part of the 5th National Climate Assessment. The original Livneh and nClimGrid daily temperature and precipitation observations have been converted to a series of decision-relevant thresholds as part of the [(U.S. 
Climate Resilience Information System (CRIS))](https://cris.climate.gov/pages/about). These thresholds, such as days with extreme heat or precipitation, have been calculated to match the future projections from LOCA2 and STAR, also available in CRIS. +Documentation: | + For information, please consult https://cris.climate.gov/pages/about-the-data. +Contact: | + For any questions regarding the NOAA Open Data Dissemination (NODD) Program, email the NODD Team at nodd@noaa.gov. + We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NODD team by emailing nodd@noaa.gov +ManagedBy: "[NOAA](http://www.noaa.gov/)" +UpdateFrequency: | + None +Collabs: + ASDI: + Tags: + - climate +Tags: + - aws-pds + - agriculture + - climate + - environmental + - meteorological + - weather +License: | + NOAA data disseminated through NODD are open to the public and can be used as desired.

NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. +Resources: + - Description: nClimGrid and Livneh Gridded Historical Observation Thresholds + ARN: arn:aws:s3:::noaa-cris-hist-pds + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://noaa-cris-hist-pds.s3.amazonaws.com/index.html)' + - Description: New data notifications for nClimGride and Livneh Gridded Historical Observation Thresholds, only Lambda and SQS protocols allowed + ARN: arn:aws:sns:us-west-2:123901341784:NewCRIS-HISTObject + Region: us-east-1 + Type: SNS Topic +DataAtWork: + Tutorials: + - Title: U.S. CRIS Resources + URL: https://cris.climate.gov/pages/developers + AuthorName: U.S. 
CRIS From ccc4a4e2cb5b4796322bfc43b6dd5987e949ec25 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 17 Nov 2025 16:20:08 -0900 Subject: [PATCH 555/751] ok: Update noaa-cris-hist.yaml From dd1da98998cb07003cbe1ea58b8c2ba495a17b5b Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 17 Nov 2025 16:30:16 -0900 Subject: [PATCH 556/751] ok: Update epa-2022-modeling-platform.yaml --- datasets/epa-2022-modeling-platform.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/epa-2022-modeling-platform.yaml b/datasets/epa-2022-modeling-platform.yaml index c4ab087a0..19df61a51 100644 --- a/datasets/epa-2022-modeling-platform.yaml +++ b/datasets/epa-2022-modeling-platform.yaml @@ -99,4 +99,4 @@ Resources: - Description: Notification for the 2022 Modeling Platform bucket ARN: 'arn:aws:sns:us-east-1:127085394039:epa-2022-modeling-platform-object_created' Region: us-east-1 - Type: SNS Topic \ No newline at end of file + Type: SNS Topic From 9ddc2fa1e48e59fdc54c36adeec8a73892cd16eb Mon Sep 17 00:00:00 2001 From: Vidit Agrawal <91577322+viditagr@users.noreply.github.com> Date: Tue, 18 Nov 2025 03:13:56 +0100 Subject: [PATCH 557/751] Revise dataset resources in chammi.yaml Updated dataset information for CHAMMI-75. 
--- datasets/chammi.yaml | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/datasets/chammi.yaml b/datasets/chammi.yaml index fc2e61a41..4f48f7b97 100644 --- a/datasets/chammi.yaml +++ b/datasets/chammi.yaml @@ -23,11 +23,10 @@ Tags: License: CC BY 4.0 License Citation: Resources: - - Description: Images, training set and evaluation set available in an S3 bucket - ARN: - Region: - Type: - Explore: + - Description: CHAMMI-75 Dataset: Images, training set and evaluation set available in an S3 bucket + ARN: arn:aws:s3:::chammi-data + Region: us-west-2 + Type: S3 Bucket DataAtWork: Tutorials: - Title: Get To Know A Dataset: CHAMMI-75 From f6a06790d9caaf5f1b50d11e7bc939457f413d15 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 17 Nov 2025 17:40:12 -0900 Subject: [PATCH 558/751] ok: Update aef-source.md --- datasets/aef-source.md | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/aef-source.md b/datasets/aef-source.md index 354cef8c4..fde49e2b4 100644 --- a/datasets/aef-source.md +++ b/datasets/aef-source.md @@ -5,6 +5,7 @@ Contact: https://cloudnativegeo.org/join ManagedBy: "[Source Cooperative](https://source.coop/)" UpdateFrequency: As new data versions become available Tags: + - aws-pds - machine learning - satellite imagery - aerial imagery From 8d524001edbfff76d992dc50b8e9aedd7144c97d Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 17 Nov 2025 17:42:04 -0900 Subject: [PATCH 559/751] ok: Rename aef-source.md to aef-source.yaml --- datasets/{aef-source.md => aef-source.yaml} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename datasets/{aef-source.md => aef-source.yaml} (100%) diff --git a/datasets/aef-source.md b/datasets/aef-source.yaml similarity index 100% rename from datasets/aef-source.md rename to datasets/aef-source.yaml From 9e7024c7f4d37d735064b96efca1d2482e8538d1 Mon Sep 17 00:00:00 2001 From: mwielocha Date: Tue, 18 Nov 2025 
14:30:50 +0100 Subject: [PATCH 560/751] NUVIEW State Open Data submission --- datasets/nuview-state.yaml | 44 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) create mode 100644 datasets/nuview-state.yaml diff --git a/datasets/nuview-state.yaml b/datasets/nuview-state.yaml new file mode 100644 index 000000000..56b8ab033 --- /dev/null +++ b/datasets/nuview-state.yaml @@ -0,0 +1,44 @@ +Name: NUVIEW - Multi-State Geospatial Data +Description: | + NUVIEW hosts and manages a unified collection of geospatial datasets from multiple U.S. states and agencies + (LiDAR, orthophoto imagery, DEM/DSM, and derivative products). Data are organized in a + single S3 bucket with a logical sub-folder hierarchy: `/state_or_agency_product_type/acqusition_project_name/...`. All assets + are cloud-optimized (COG GeoTIFFs, COPC (Cloud Optimized Point Cloud) LAZ point clouds, etc.) and available under open licenses. +Documentation: Documentation is available for this data at the [s22s/nuview-state-opendata GitHub repository](https://github.com/s22s/nuview-state-opendata) maintained by NUVIEW. +Contact: support@nuview.space +ManagedBy: "[NUVIEW](https://nuview.space/)" +UpdateFrequency: Project-based updates. +Tags: + - geospatial + - satellite imagery + - natural resource + - sustainability + - disaster response + - digital elevation model + - lidar +License: CC0 "Public Domain" (or state/agency-specific open data licenses) +Resources: + - Description: Imagery + ARN: arn:aws:s3:::nuview-state-opendata + Region: us-west-2 + Type: S3 Bucket + RequesterPays: True + - Description: New data notifications + ARN: arn:aws:sns:us-west-2:{TBA}:nuview-state-opendata-events + Region: us-west-2 + Type: SNS Topic +DataAtWork: + Tutorials: + - Title: Get to Know a Dataset - NUVIEW State Open Data + URL: https://github.com/s22s/nuview-state-opendata/ + NotebookURL: https://github.com/s22s/nuview-state-opendata/blob/main/get-to-know-a-dataset.ipynb + AuthorName: NUVIEW, Inc. 
+ AuthorURL: https://nuview.space + Tools & Applications: + - Title: NUVIEW Geospatial Platform for Alaska + URL: https://alaska.nuview.space/ + AuthorName: NUVIEW, Inc. + AuthorURL: https://nuview.space +ADXCategories: + - Environmental Data + - Public Sector Data From 8eac6d3438f1ca3a2f0d94cb17889ff75a1bbd9e Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Tue, 18 Nov 2025 10:31:19 -0700 Subject: [PATCH 561/751] Add Sup3rUHI dataset to oedi-data-lake.yaml --- datasets/oedi-data-lake.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/oedi-data-lake.yaml b/datasets/oedi-data-lake.yaml index 308ecb917..1dc3dff11 100644 --- a/datasets/oedi-data-lake.yaml +++ b/datasets/oedi-data-lake.yaml @@ -120,6 +120,12 @@ Resources: Type: S3 Bucket Explore: - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=buildings-bench)' + - Description: "[Super-Resolution for Renewable Resource Data and Urban Heat Islands (Sup3rUHI)](https://data.openei.org/submissions/6220)" + ARN: arn:aws:s3:::oedi-data-lake/sup3ruhi/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=sup3ruhi%2F&limit=50)' DataAtWork: Tools & Applications: - Title: "Tracking the Sun Tool" From ab05aac4b8c96bfc7781c3db9ea07086e4a7033c Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 18 Nov 2025 14:51:47 -0900 Subject: [PATCH 562/751] ok: Update oedi-data-lake.yaml From 2adb8532d68366b44a06cd36845f1a3c1451e1ac Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Wed, 19 Nov 2025 08:47:44 -0500 Subject: [PATCH 563/751] add French to some sections add French to some sections --- datasets/CCRSMODISAlbedo.yml | 38 +++++++++++++++++++----------------- 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/datasets/CCRSMODISAlbedo.yml b/datasets/CCRSMODISAlbedo.yml index 4910b48c6..62951a961 100644 --- 
a/datasets/CCRSMODISAlbedo.yml +++ b/datasets/CCRSMODISAlbedo.yml @@ -1,9 +1,11 @@ -Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS -Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km × 4,800 km (22,800 pixel x 19,200 lines). -Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo/ +Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS +Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). + Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. 
La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). +Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo Contact: alexander.trichtchenko@nrcan-rncan.gc.ca ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation -UpdateFrequency: Semi-annually, intil the end of MODIS operations +UpdateFrequency: Semi-annually, until the end of MODIS operations + Deux fois par an, jusqu'à la fin des opérations MODIS Tags: - aws-pds - analysis ready data @@ -12,7 +14,7 @@ Tags: - COG - earth observation - satellite imagery -License: There are no restrictions on the use of this data. Creative Commons Licence. Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ +License: Creative Commons Licence. Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ Citation: Trishchenko, Alexander P. 2025. CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals. 
Resources: - Description: @@ -25,12 +27,12 @@ Resources: Explore (Optional): DataAtWork: Tutorials: - - Title:Get To Know A Dataset - MCCRS MODIS Albedo at 250-m resolution and 10-day intervals + - Title: Get To Know A Dataset - MCCRS MODIS Albedo at 250-m resolution and 10-day intervals URL: https://github.com/****/get-to-know-a-dataset-MYDATASET.ipynb Services: AuthorName: Alexander Trichtchenko AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko - NotebookURL (Optional):https://github.com/****/get-to-know-a-dataset-MYDATASET.ipynb + NotebookURL (Optional): https://github.com/****/get-to-know-a-dataset-MYDATASET.ipynb Publications: - Title: Boreal lichen woodlands: a possible negative feedback to climate change in eastern North America URL: https://doi.org/10.1016/j.agrformet.2010.12.013 @@ -38,11 +40,11 @@ DataAtWork: - Title: Detection of North American land cover change between 2005 and 2010 with 250m MODIS data URL: https://www.researchgate.net/publication/286156544_Detection_of_North_American_land_cover_change_between_2005_and_2010_with_250m_MODIS_Data AuthorName: Colditz, R.R., Pouliot, D., Llamas, R.M., Homer, C., Latifovic, R., Ressl, R.A., Tovar, C.M., Hern�ndez, A.V., Richardson, K. - - Title: Annual mapping of large Forest disturbances across Canada�s forests using 250 m MODIS imagery from 2000 to 2011 + - Title: Annual mapping of large Forest disturbances across Canada's forests using 250 m MODIS imagery from 2000 to 2011 URL: https://doi.org/10.1139/cjfr-2014-0229 AuthorName: Guindon, L., Bernier, P.Y., Beaudoin, A., Pouliot, D., Villemaire, P., Hall, R.J., Latifovic, R., St-Amant, R. - Title: Perennial snow and ice variations (2000-2008) in the Arctic circumpolar land area from satellite observations - URL:https://doi.org/10.1029/2010JF001664 + URL: https://doi.org/10.1029/2010JF001664 AuthorName: Fontana F.M.A., Trishchenko A.P., Luo Y., Khlopenkov K.V., Nussbaumer S.U., Wunderle S. 
- Title: Influence of two management practices in the Canadian Prairies on radiative forcing URL: https://doi.org/10.1016/j.scitotenv.2020.142701 @@ -62,31 +64,31 @@ DataAtWork: - Title: A raster version of the circumpolar Arctic vegetation map (CAVM) URL: https://doi.org/10.1016/J.RSE.2019.111297 AuthorName: Raynolds, M.K., Walker, D.A., Balser, A., Bay, C., Campbell, M., Cherosov, M.M., Dani�ls, F.J.A., Eidesen, P.B., Ermokhina, K.A., Frost, G.V., Jedrzejek, B., Jorgenson, M.T., Kennedy, B.E., Kholod, S.S., Lavrinenko, I.A., Lavrinenko, O.V., Magn�sson, B., Matveyeva, N.V., Met�salemsson, S., Nilsen, L., Olthof, I., Pospelov, I.N., Pospelova, E.B., Pouliot, D., Razzhivin, V., Schaepman-Strub, G., ?Sib�k, J., Telyatnikov, M.Y., Troeva, E. - - Title: Cumulative changes in minimum snow/ice extent over Canada and Northern USA for 2000�2023 + - Title: Cumulative changes in minimum snow/ice extent over Canada and Northern USA for 2000-2023 URL: https://doi.org/10.1080/07038992.2024.2371359 AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Annual minimum snow/ice extent variations over Greenland since 2000: ice sheet, peripheral areas, and relation to ice mass balance + - Title: Annual minimum snow/ice extent variations over Greenland since 2000:ice sheet, peripheral areas, and relation to ice mass balance URL: https://doi.org/10.1175/BAMS-D-22-0244.1 AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Landfast ice properties over the Beaufort Sea region in 2000�2019 from MODIS and Canadian Ice Service data + - Title: Landfast ice properties over the Beaufort Sea region in 2000-2019 from MODIS and Canadian Ice Service data URL: https://doi.org/10.1139/cjes-2021-0011 AuthorName: Trishchenko, A.P., Kostylev, V.E., Luo, Y., Ungureanu, C., Whalen, D., Li, J. 
- - Title: Landfast ice mapping using MODIS clear-sky composites: application for the Banks Island coastline in Beaufort Sea and comparison with Canadian Ice Service data + - Title: Landfast ice mapping using MODIS clear-sky composites:application for the Banks Island coastline in Beaufort Sea and comparison with Canadian Ice Service data URL: https://doi.org/10.1080/07038992.2021.1909466 AuthorName: Trishchenko, A.P., Luo, Y. - - Title: Minimum snow/ice extent over the Northern circumpolar landmass in 2000�19: how much snow survives the summer melt? + - Title: Minimum snow/ice extent over the Northern circumpolar landmass in 2000-19:how much snow survives the summer melt? URL: https://doi.org/10.1175/BAMS-D-20-0177.1 AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Variations of annual minimum snow and ice extent over Canada and neighbouring landmass derived from MODIS 250-m imagery for 2000�2014 + - Title: Variations of annual minimum snow and ice extent over Canada and neighbouring landmass derived from MODIS 250-m imagery for 2000-2014 URL: https://doi.org/10.1080/07038992.2016.1166043 AuthorName: Trishchenko, A.P., Leblanc, S.G., Wang, S., Li, J., Ungureanu, C., Luo, Y., Khlopenkov, K.V., Fontana, F., 2016 - Title: A method for downscaling MODIS land channels to 250-m spatial resolution using adaptive regression and normalization URL: https://doi.org/10.1117/12.689157 AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V. - - Title: Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1�B7 + - Title: Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1-B7 URL: https://doi.org/10.1080/01431160802348119 AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V., Park, W.M., Wang, S. 
- - Title: Clear-Sky Composites over Canada from Visible Infrared Imaging Radiometer Suite: Continuing MODIS Time Series into the Future + - Title: Clear-Sky Composites over Canada from Visible Infrared Imaging Radiometer Suite:Continuing MODIS Time Series into the Future URL: https://doi.org/10.1080/07038992.2019.1601006 AuthorName: Trishchenko, A.P. - Title: MODIS Surface Albedo and Surface Reflectance Dataset. Format Description. @@ -97,5 +99,5 @@ DataAtWork: AuthorName: Trishchenko, Alexander P., Ungureanu, Calin - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf - AuthorName: Trishchenko, Alexander P. + AuthorName: Trishchenko, Alexander P. \ No newline at end of file From 1719ef7375cf6be312561f44cbdea911bdd6565f Mon Sep 17 00:00:00 2001 From: Aodhan Sweeney <40372081+AodhanSweeney@users.noreply.github.com> Date: Wed, 19 Nov 2025 17:49:05 -0800 Subject: [PATCH 564/751] Update S3 ARN and dataset browse links --- datasets/planette_c3s_seasonal_forecast_data.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/planette_c3s_seasonal_forecast_data.yaml b/datasets/planette_c3s_seasonal_forecast_data.yaml index f4ac99a5c..ba68e14d0 100644 --- a/datasets/planette_c3s_seasonal_forecast_data.yaml +++ b/datasets/planette_c3s_seasonal_forecast_data.yaml @@ -33,11 +33,11 @@ Citation: | https://cds.climate.copernicus.eu/cdsapp#!/dataset/seasonal-original-single-levels Resources: - Description: C3S Seasonal Forecast Hindcasts and Forecasts (Zarr format) - ARN: arn:aws:s3:::planettebaikal/forecast_models/seasonal/seas5/ + ARN: arn:aws:s3:::planette-c3s-seasonal-forecasts/seas5/ Region: us-east-2 Type: S3 Bucket Explore: - - '[Browse Dataset](https://planettebaikal.s3.amazonaws.com/index.html#forecast_models/seasonal/)' + - '[Browse Dataset](https://planette-c3s-seasonal-forecasts.s3.amazonaws.com/index.html#seas5)' DataAtWork: 
Tutorials: - Title: Accessing C3S Seasonal Forecast Data with Python From 28f512f8b6617735140b6e98ad74f487735a204b Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 19 Nov 2025 17:15:38 -0900 Subject: [PATCH 565/751] ok: Update planette_c3s_seasonal_forecast_data.yaml --- datasets/planette_c3s_seasonal_forecast_data.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/planette_c3s_seasonal_forecast_data.yaml b/datasets/planette_c3s_seasonal_forecast_data.yaml index ba68e14d0..66002ca30 100644 --- a/datasets/planette_c3s_seasonal_forecast_data.yaml +++ b/datasets/planette_c3s_seasonal_forecast_data.yaml @@ -63,4 +63,4 @@ DataAtWork: AuthorName: Copernicus Climate Change Service DeprecatedNotice: ADXCategories: - - Environmental Data + - Environmental Data From 02e9a25abc8f0f592f45134a34eb84dd9e8134fb Mon Sep 17 00:00:00 2001 From: Allan Frank Date: Thu, 20 Nov 2025 12:37:24 +0100 Subject: [PATCH 566/751] Updating DMI opendata documentation links --- datasets/dmi-opendata.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/dmi-opendata.yaml b/datasets/dmi-opendata.yaml index 8bde94349..c70d9b2ae 100644 --- a/datasets/dmi-opendata.yaml +++ b/datasets/dmi-opendata.yaml @@ -1,7 +1,7 @@ Name: Danish Meteorological Institute (DMI) Open Data Forecasts Description: DMI forecast data consist of various models where each model contains different set of parameters relating to a specific domain like ocean (WAM), storm flooding (DKSS) or weather (HARMONIE) -Documentation: https://opendatadocs.dmi.govcloud.dk/en/Data/Forecast_Data -Contact: https://opendatadocs.dmi.govcloud.dk/en/API_Status_and_Contact +Documentation: https://www.dmi.dk/friedata/dokumentation/forecast-data +Contact: https://www.dmi.dk/friedata/dokumentation/api-status-contact ManagedBy: "[Danish Meteorological Institute](https://www.dmi.dk/)" UpdateFrequency: Every hour, 3 hours or 6 hours depending on model 
Collabs: From 7a980402d0701aa7c7cf993666f77462596da4bf Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Thu, 20 Nov 2025 11:10:49 -0700 Subject: [PATCH 567/751] Add Buildings Sector Scenarios dataset --- datasets/oedi-data-lake.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/oedi-data-lake.yaml b/datasets/oedi-data-lake.yaml index 1dc3dff11..f7081ec98 100644 --- a/datasets/oedi-data-lake.yaml +++ b/datasets/oedi-data-lake.yaml @@ -126,6 +126,12 @@ Resources: Type: S3 Bucket Explore: - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=sup3ruhi%2F&limit=50)' + - Description: "[Buildings Sector Scenarios (BSS)](https://data.openei.org/submissions/8558)" + ARN: arn:aws:s3:::oedi-data-lake/building-sector-scenarios/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=building-sector-scenarios%2F&limit=50)' DataAtWork: Tools & Applications: - Title: "Tracking the Sun Tool" From 5f9093944a6d1d5ef6554685913df1a63a8b0ab4 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 20 Nov 2025 15:44:09 -0500 Subject: [PATCH 568/751] ok: Update snpeff.yaml --- datasets/snpeff.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/snpeff.yaml b/datasets/snpeff.yaml index d4a9a0704..20a8e0cbc 100644 --- a/datasets/snpeff.yaml +++ b/datasets/snpeff.yaml @@ -19,6 +19,7 @@ Tags: - whole exome sequencing - transcriptomics - structural variation + - aws-pds License: "[MIT License](https://opensource.org/licenses/MIT)" Resources: - Description: "Pre-built genomic databases for many reference genomes including human (GRCh37, GRCh38), mouse, rat, and other model organisms. Each database contains gene annotations, transcript information, protein sequences, and regulatory elements required for variant annotation." 
@@ -54,4 +55,4 @@ DataAtWork: - Title: "Using Drosophila melanogaster as a model for genotoxic chemical mutational studies with a new program, SnpSift" URL: https://www.frontiersin.org/articles/10.3389/fgene.2012.00035/full AuthorName: Cingolani P, Patel VM, Coon M, Nguyen T, Land SJ, Ruden DM, Lu X - AuthorURL: https://www.frontiersin.org/people/u/4691 \ No newline at end of file + AuthorURL: https://www.frontiersin.org/people/u/4691 From aba551e51d1482f68b15f4acd48295eec8f81b68 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 20 Nov 2025 15:49:32 -0500 Subject: [PATCH 569/751] ok: Update snpeff.yaml --- datasets/snpeff.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/snpeff.yaml b/datasets/snpeff.yaml index 20a8e0cbc..5e4e328c2 100644 --- a/datasets/snpeff.yaml +++ b/datasets/snpeff.yaml @@ -5,7 +5,6 @@ Documentation: https://pcingola.github.io/SnpEff/ ManagedBy: "[Pablo Cingolani](http://www.linkedin.com/in/pablocingolani)" UpdateFrequency: Monthly Tags: - - snpeff - life sciences - genomic - variant annotation From d1492edc6f0651940c80c3e3861972be7f39d7d4 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 20 Nov 2025 11:51:43 -0900 Subject: [PATCH 570/751] ok: Update dmi-opendata.yaml From eae6852d03ceb2a9a91503add2fdfe3421d775d0 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 20 Nov 2025 13:42:06 -0900 Subject: [PATCH 571/751] ok: Update oedi-data-lake.yaml From fb63b430c0b61768f336f3b631314ef084ff9f65 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 20 Nov 2025 13:43:37 -0900 Subject: [PATCH 572/751] ok: Update oedi-data-lake.yaml From a62fd05bd42e918a12cef2524e3f6fb665269135 Mon Sep 17 00:00:00 2001 From: Arun George Zachariah Date: Thu, 20 Nov 2025 18:58:08 -0600 Subject: [PATCH 573/751] Creating ASL 1000 DataCard --- datasets/asl_1000 | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 
insertions(+) create mode 100644 datasets/asl_1000 diff --git a/datasets/asl_1000 b/datasets/asl_1000 new file mode 100644 index 000000000..fec22a3fd --- /dev/null +++ b/datasets/asl_1000 @@ -0,0 +1,37 @@ +Name: ASL 1000 +Description: | +Overview +This dataset provides a high-fidelity collection of American Sign Language (ASL) videos annotated with 2D landmarks for hands, pose, and face. The data is designed to train advanced research and development in ASL recognition, translation, gesture analysis, and computer animation. + +The annotations for this dataset were generated using an automated data pipeline to pre-annotate keyframes from the source videos. As a final, critical step, all automated annotations were subsequently reviewed and meticulously corrected by human labellers to ensure the highest level of accuracy and reliability, making it suitable for training production-grade machine learning models. + +Annotation Methodology: +The annotations for this dataset were generated through a comprehensive process, combining automated extraction with human review to ensure the highest quality and accuracy. The specific steps in this process are described below. +Keyframe Extraction: Raw source videos were processed to extract the most meaningful frames. This step utilized motion analysis (optical flow) and active region detection to identify frames with significant signing activity, which were then refined based on image sharpness. +Automated Landmark Extraction: Each extracted keyframe was processed by an automated pipeline using Google's MediaPipe to generate a baseline set of annotations: +Pose Landmarks: 33 full-body pose landmarks were extracted, with a focus on the upper body and scores for visibility and presence. +Hand Landmarks: 21 high-accuracy landmarks were detected for both the left and right hand, including confidence scores. 
+Face Landmarks: A detailed face mesh of 468+ landmarks was extracted where applicable, including 52 blend shape coefficients and 3D transformation matrices. +Format Conversion and Ingestion: The extracted landmark data was converted into the SuperAnnotate JSON format and ingested into a human annotation workflow. +Human Verification and Correction: A team of trained human labellers reviewed every keyframe and all associated landmarks. They corrected any errors from the automated detection, improved landmark precision, and ensured temporal consistency. + +Dataset Contents and Format: +The dataset is structured to provide maximum flexibility, from raw media to fully processed annotations. The dataset includes: +Raw Videos: The original source videos +Extracted Keyframes: The raw, individual image frames, extracted by the pipeline's motion analysis step. +Annotation Files: JSON files for body, face, and hand landmarks. + +Potential Applications: +This dataset is ideal for a variety of tasks, including: +ASL Recognition and Translation: Training models to understand and translate signed language. +Gesture and Behavior Analysis: Studying the nuances of human motion and communication. +Avatar and Animation: Driving realistic 3D avatars and animations using the pose, hand, and facial expression data. + +Contact: trustworthyaiprojects@nvidia.com +ManagedBy: See all datasets managed by NVIDIA Corporation +UpdateFrequency: New data is added as soon as it is available. 
+License: Please see the NVIDIA Dataset License +DataAtWork: + Tutorials: + - Title: NVIDIA Trustworthy AI GitHub + URL: https://github.com/NVIDIA/Trustworthy-AI From f2e8d6e1edb99e69389b21e84398ace3d08b6453 Mon Sep 17 00:00:00 2001 From: japan-pointcloud Date: Fri, 21 Nov 2025 18:30:08 +0900 Subject: [PATCH 574/751] Create japan_pointcloud.yaml --- datasets/japan_pointcloud.yaml | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) create mode 100644 datasets/japan_pointcloud.yaml diff --git a/datasets/japan_pointcloud.yaml b/datasets/japan_pointcloud.yaml new file mode 100644 index 000000000..f5413f929 --- /dev/null +++ b/datasets/japan_pointcloud.yaml @@ -0,0 +1,31 @@ +Name: Japan Prefectures, 3D Point Cloud Data +Description: | + This dataset comprises high-precision 3D point cloud data that covers all prefectures throughout Japan. + The data is produced through aerial laser surveys, airborne laser bathymetry, and mobile mapping systems, representing the culmination of many years of dedicated effort. + This data will be visualized and analyzed for use in infrastructure maintenance, disaster prevention measures, and autonomous vehicle driving. 
+Documentation: https://github.com/aigidjp/opendata_japan_pointcloud/blob/main/README.md +Contact: japan-pointcloud@aigid.jp +ManagedBy: "[AIGID](https://aigid.jp/)" +UpdateFrequency: Currently not scheduled +Tags: + - aws-pds + - disaster response + - elevation + - geospatial + - japanese + - land + - lidar + - mapping +License: "Creative Commons Attribution 4.0 International (CC-BY 4.0)" +Resources: + - Description: Point Cloud Data for Prefectures Across Japan + ARN: arn:aws:s3:::japan-pointcloud + Region: ap-northeast-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Tutorial of handling LAS format point cloud data + URL: https://github.com/aigidjp/opendata_japan_pointcloud/blob/main/tutorials/README.md + AuthorName: AIGID + Tools & Applications: + Publications: From 37fd62af1ab791b0cd1d8f30ab4e1affcd66b0fe Mon Sep 17 00:00:00 2001 From: Pablo Cingolani Date: Fri, 21 Nov 2025 08:04:34 -0500 Subject: [PATCH 575/751] Updated ARN --- datasets/snpeff.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/datasets/snpeff.yaml b/datasets/snpeff.yaml index 5e4e328c2..72ea83d80 100644 --- a/datasets/snpeff.yaml +++ b/datasets/snpeff.yaml @@ -3,6 +3,11 @@ Description: "SnpEff is a variant annotation and effect prediction tool that ann Contact: Pablo Cingolani Documentation: https://pcingola.github.io/SnpEff/ ManagedBy: "[Pablo Cingolani](http://www.linkedin.com/in/pablocingolani)" +Resources: + - Description: + ARN: arn:aws:s3:::snpeff-public + Region: us-east-2 + Type: S3 Bucket UpdateFrequency: Monthly Tags: - life sciences From 534c3f3741b545e212a684c04634f43df21f8ba7 Mon Sep 17 00:00:00 2001 From: Pablo Cingolani Date: Fri, 21 Nov 2025 08:06:56 -0500 Subject: [PATCH 576/751] .
--- datasets/snpeff.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/snpeff.yaml b/datasets/snpeff.yaml index 72ea83d80..a8aedc170 100644 --- a/datasets/snpeff.yaml +++ b/datasets/snpeff.yaml @@ -3,8 +3,8 @@ Description: "SnpEff is a variant annotation and effect prediction tool that ann Contact: Pablo Cingolani Documentation: https://pcingola.github.io/SnpEff/ ManagedBy: "[Pablo Cingolani](http://www.linkedin.com/in/pablocingolani)" -Resources: - - Description: +Resources: + - Description: SnpEff databases for genomic variant annotation ARN: arn:aws:s3:::snpeff-public Region: us-east-2 Type: S3 Bucket From 322e9a79727aba76987388709ea5cdfc871751a7 Mon Sep 17 00:00:00 2001 From: Pablo Cingolani Date: Fri, 21 Nov 2025 08:07:38 -0500 Subject: [PATCH 577/751] . --- datasets/snpeff.yaml | 3 --- 1 file changed, 3 deletions(-) diff --git a/datasets/snpeff.yaml b/datasets/snpeff.yaml index a8aedc170..74e1a78ef 100644 --- a/datasets/snpeff.yaml +++ b/datasets/snpeff.yaml @@ -25,9 +25,6 @@ Tags: - structural variation - aws-pds License: "[MIT License](https://opensource.org/licenses/MIT)" -Resources: - - Description: "Pre-built genomic databases for many reference genomes including human (GRCh37, GRCh38), mouse, rat, and other model organisms. Each database contains gene annotations, transcript information, protein sequences, and regulatory elements required for variant annotation." 
- Type: S3 Bucket DataAtWork: Tutorials: - Title: SnpEff Documentation From 6f07e814a6bda2b935bde016319cdda5250a9e97 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 21 Nov 2025 15:38:07 -0500 Subject: [PATCH 578/751] ok: Update snpeff.yaml From 5a189c2bfd23c78d368ae31ffca5ebadae6308ef Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 21 Nov 2025 15:42:41 -0500 Subject: [PATCH 579/751] ok: Update chammi.yaml --- datasets/chammi.yaml | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/datasets/chammi.yaml b/datasets/chammi.yaml index 4f48f7b97..eb6052790 100644 --- a/datasets/chammi.yaml +++ b/datasets/chammi.yaml @@ -8,7 +8,7 @@ Description: | which could process microscopy images of varying technical specifications and regardless of the number of channels. By breaking the limitations of existing models, CHAMMI-75 is an invaluable resource for creating the next generation of foundation models for image-based biological research. Documentation: https://github.com/CaicedoLab/CHAMMI-75 -Contact: Contact via email Juan Caicedo, juan.caicedo@wisc.edu +Contact: Juan Caicedo, juan.caicedo@wisc.edu ManagedBy: Morgridge Institute for Research UpdateFrequency: Every 2 years Tags: @@ -20,41 +20,34 @@ Tags: - high-throughput imaging - cell imaging - fluorescence imaging + - aws-pds License: CC BY 4.0 License Citation: Resources: - - Description: CHAMMI-75 Dataset: Images, training set and evaluation set available in an S3 bucket + - Description: "CHAMMI-75 Dataset: Images, training set and evaluation set available in an S3 bucket" ARN: arn:aws:s3:::chammi-data Region: us-west-2 Type: S3 Bucket DataAtWork: Tutorials: - - Title: Get To Know A Dataset: CHAMMI-75 + - Title: "Get To Know A Dataset: CHAMMI-75" URL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/ NotebookURL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/get-to-know-a-dataset-template.ipynb AuthorName: Vidit Agrawal, Juan Caicedo - 
AuthorURL: - Services: Getting to know a dataset - Title: Running CHAMMI-75 Evaluation Benchmarks URL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/ NotebookURL: https://github.com/CaicedoLab/CHAMMI-75/blob/main/aws-tutorials/running-benchmarks.ipynb AuthorName: Vidit Agrawal, Juan Caicedo - Author URL: - Services: It will enable researchers to run state of the art benchmarks in the exploration of single cell self-supervised learning foundation models. Tools & Applications: - Title: CHAMMI-75 Source Code URL: https://github.com/CaicedoLab/CHAMMI-75 AuthorName: Vidit Agrawal - AuthorURL: - Title: CHAMMI Benchmarking Source Code URL: https://github.com/chaudatascience/channel_adaptive_models AuthorName: Chau Pham - AuthorURL: Publications: - Title: CHAMMI: A benchmark for channel-adaptive models in microscopy imaging URL: https://neurips.cc/virtual/2023/poster/73620 AuthorName: Zitong Sam Chen, Chau Pham, Siqi Wang, Michael Doron, Nikita Moshkov, Bryan Plummer, Juan C. Caicedo - AuthorURL: -DeprecatedNotice: ADXCategories: - Healthcare & Life Sciences Data From 9280b11963e425fe87795b96fd3f0c4ac8c4d6a8 Mon Sep 17 00:00:00 2001 From: Bryan Nielsen Date: Fri, 21 Nov 2025 12:54:53 -0800 Subject: [PATCH 580/751] Salk data for Aging Mouse Brain Epigeneti project --- .../salk-aging-mouse-brain-epigeneti.yaml | 24 +++++++++++++++++++ 1 file changed, 24 insertions(+) create mode 100644 datasets/salk-aging-mouse-brain-epigeneti.yaml diff --git a/datasets/salk-aging-mouse-brain-epigeneti.yaml b/datasets/salk-aging-mouse-brain-epigeneti.yaml new file mode 100644 index 000000000..710798a20 --- /dev/null +++ b/datasets/salk-aging-mouse-brain-epigeneti.yaml @@ -0,0 +1,24 @@ +Name: Aging Mouse Brain Epigeneti +Description: "Aging is a major risk factor for neurodegenerative diseases, yet underlying epigenetic mechanisms remain unclear. 
Here, we generated a comprehensive single-nucleus cell atlas of brain aging across multiple brain regions, comprising 132,551 single-cell methylomes and 72,666 joint chromatin conformation-methylome nuclei. Integration with companion transcriptomic and chromatin accessibility data yielded a cross-modality taxonomy of 36 major cell types." +Contact: ecker@salk.edu +Documentation: https://doi.org/10.1101/2025.04.21.648266 +ManagedBy: "[Salk Institute](http://www.salk.edu)" +UpdateFrequency: Never +Tags: + - life sciences + - genetic + - genomic + - whole genome sequencing + - whole exome sequencing + - transcriptomics + - fastq + - bam + - cram + - STRIDES +License: "[NCBI Policy](https://www.ncbi.nlm.nih.gov/home/about/policies/) and [NIH Genomic Data Sharing Policy ](https://osp.od.nih.gov/scientific-sharing/genomic-data-sharing/)" +DataAtWork: + Publications: + - Title: Cell-type-specific transposable element demethylation and TAD remodeling in the aging mouse brain + URL: https://doi.org/10.1101/2025.04.21.648266 + AuthorName: Zeng, Q., Wei, T., Klein, A., Bartlett, A., Liu, H., Nery, J.R., Castanon, R., Osteen, J., Johnson, N.D., Wang, W., Ding, W., Chen, H., Altshul, J., Kenworthy, M., Valadon, C., Owens, W., Wu, Z., Amaral, M.L., Song, Báez-Becerra, T.a.t.i.a.n.a., Cho, S., Chen, C., Willier, J., Cao, S., Rink, J., Lee, J., Barcoma, A., Arzavala, J., Emerson, N., Lu, Y.R., Ren, B., Behrens, M.a.r.g.a.r.i.t.a., Ecker, J.R. 
+ AuthorURL: https://www.salk.edu/scientist/joseph-ecker/ From 7f95869c4f83a5b03d1c78bf1ef19ac128298e5e Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 24 Nov 2025 13:13:04 -0700 Subject: [PATCH 581/751] Add inspire irradiance dataset, fix buildings sector scenario link --- datasets/oedi-data-lake.yaml | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/datasets/oedi-data-lake.yaml b/datasets/oedi-data-lake.yaml index f7081ec98..79a09295b 100644 --- a/datasets/oedi-data-lake.yaml +++ b/datasets/oedi-data-lake.yaml @@ -131,7 +131,13 @@ Resources: Region: us-west-2 Type: S3 Bucket Explore: - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=building-sector-scenarios%2F&limit=50)' + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=buildings-sector-scenarios%2F&limit=50)' + - Description: "[U.S. Agrivoltaic Irradiance Database](https://data.openei.org/submissions/8568)" + ARN: arn:aws:s3:::oedi-data-lake/inspire/agrivoltaics_irradiance/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=inspire%2Fagrivoltaics_irradiance%2F&limit=50)' DataAtWork: Tools & Applications: - Title: "Tracking the Sun Tool" From 5bb4060aab245ac6eb73a5ae569c8aa7396cc078 Mon Sep 17 00:00:00 2001 From: Jean-Sebastien Moreau Date: Tue, 25 Nov 2025 09:52:48 -0500 Subject: [PATCH 582/751] Update canelevation-pointcloud.yaml MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replace URLs following the conversion of the tutorials to MkDocs format.
--- datasets/canelevation-pointcloud.yaml | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/datasets/canelevation-pointcloud.yaml b/datasets/canelevation-pointcloud.yaml index 7ccdd3acd..f6765c5f3 100644 --- a/datasets/canelevation-pointcloud.yaml +++ b/datasets/canelevation-pointcloud.yaml @@ -49,13 +49,13 @@ Resources: DataAtWork: Tutorials: - Title: How to generate a digital elevation model (DEM) from a lidar point cloud in COPC LAZ format | Comment générer un modèle numérique d'élévation (MNE) à partir d'un nuage de point lidar en format COPC LAZ - URL: https://github.com/NRCan/CanElevation/blob/main/pointclouds_nuagespoints/DEM_from_COPC_lidar_EN.ipynb - NotebookURL: https://github.com/NRCan/CanElevation/blob/main/pointclouds_nuagespoints/DEM_from_COPC_lidar_EN.ipynb + URL: https://nrcan.github.io/CanElevation/pointclouds/dem-from-copc-lidar/ + NotebookURL: https://nrcan.github.io/CanElevation/pointclouds/DEM_from_COPC_lidar_EN.ipynb AuthorName: NRCan - Title: Identify projects and lidar tiles covering a region of interest | Déterminer les projets et les tuiles lidars couvrant une région d'intérêt - URL: https://github.com/NRCan/CanElevation/blob/main/pointclouds_nuagespoints/Get_Projects_Tiles_by_AOI_EN.ipynb - NotebookURL: https://github.com/NRCan/CanElevation/blob/main/pointclouds_nuagespoints/Get_Projects_Tiles_by_AOI_EN.ipynb + URL: https://nrcan.github.io/CanElevation/pointclouds/projects-tiles-by-aoi/ + NotebookURL: https://nrcan.github.io/CanElevation/pointclouds/Get_Projects_Tiles_by_AOI_EN.ipynb AuthorName: NRCan - Title: Using the LiDAR Point Clouds - CanElevation Series product in QGIS | Utilisation du produit Nuages de points lidar - Série CanÉlévation dans QGIS - URL: https://github.com/jsmoreau/CanElevation/blob/main/pointclouds_nuagespoints/QGIS_interactive_EN.md + URL: https://nrcan.github.io/CanElevation/pointclouds/qgis-interactive/ AuthorName: NRCan From 6b59f359ed2ac2177fd58fd316926d5db54c0967 Mon Sep 17 00:00:00 
2001 From: Jean-Sebastien Moreau Date: Tue, 25 Nov 2025 09:56:40 -0500 Subject: [PATCH 583/751] Update canelevation-pointcloud.yaml Add a link to browse the bucket --- datasets/canelevation-pointcloud.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/canelevation-pointcloud.yaml b/datasets/canelevation-pointcloud.yaml index f6765c5f3..0eb3d890e 100644 --- a/datasets/canelevation-pointcloud.yaml +++ b/datasets/canelevation-pointcloud.yaml @@ -45,6 +45,7 @@ Resources: Type: S3 Bucket Explore: - '[LiDAR Data on Open Canada](https://open.canada.ca/data/en/dataset/7069387e-9986-4297-9f55-0288e9676947)' + - '[Browse Bucket](https://canelevation-lidar-point-clouds.s3.ca-central-1.amazonaws.com/pointclouds_nuagespoints/index.html#pointclouds_nuagespoints/)' DataAtWork: Tutorials: From a352df367da8d1ea326336e36dcbc238b49adc22 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 25 Nov 2025 09:59:39 -0500 Subject: [PATCH 584/751] ok: Update frag-struc.yaml --- frag-struc.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/frag-struc.yaml b/frag-struc.yaml index 2e9bc7648..b42271e94 100644 --- a/frag-struc.yaml +++ b/frag-struc.yaml @@ -13,6 +13,7 @@ Tags: - bulk RNA sequencing - bioinformatics - bigwig + - aws-pds License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" Citation: "In addition, please cite Yuk Kei Wan and Leonard Schärfen Hidden structural information in RNA sequencing data."
Resources: From 5b9e87c056cac9706d3d17fb26feae93c14631db Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 25 Nov 2025 13:54:57 -0500 Subject: [PATCH 585/751] ok: Update chammi.yaml --- datasets/chammi.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/chammi.yaml b/datasets/chammi.yaml index eb6052790..e2f91a729 100644 --- a/datasets/chammi.yaml +++ b/datasets/chammi.yaml @@ -46,8 +46,8 @@ DataAtWork: URL: https://github.com/chaudatascience/channel_adaptive_models AuthorName: Chau Pham Publications: - - Title: CHAMMI: A benchmark for channel-adaptive models in microscopy imaging + - Title: "CHAMMI: A benchmark for channel-adaptive models in microscopy imaging" URL: https://neurips.cc/virtual/2023/poster/73620 AuthorName: Zitong Sam Chen, Chau Pham, Siqi Wang, Michael Doron, Nikita Moshkov, Bryan Plummer, Juan C. Caicedo ADXCategories: - - Healthcare & Life Sciences Data + - Healthcare & Life Sciences Data From 09a7ef56a35d4d8f4a00134225732576e6ef8bb9 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 25 Nov 2025 09:55:43 -0900 Subject: [PATCH 586/751] ok: Update canelevation-pointcloud.yaml From d02eeeac2e59aba29489bec2c7f2a671ec636b23 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 25 Nov 2025 10:08:44 -0900 Subject: [PATCH 587/751] ok: Update canelevation-pointcloud.yaml From 9ccaef1513b1a0f3d78799d27a674a1fa084dcb2 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 25 Nov 2025 10:39:15 -0900 Subject: [PATCH 588/751] ok: Update oedi-data-lake.yaml From c6a9f899d9bd1a9461b5a5e587b29e14512558c0 Mon Sep 17 00:00:00 2001 From: Charlotte <146997821+charlottecrevier@users.noreply.github.com> Date: Tue, 25 Nov 2025 14:57:18 -0500 Subject: [PATCH 589/751] Update canelevation-dem.yaml Add resource description for hrdem-arcticdem subfolder --- datasets/canelevation-dem.yaml | 9 +++++++++ 1 file changed, 9
insertions(+) diff --git a/datasets/canelevation-dem.yaml b/datasets/canelevation-dem.yaml index cbb4fe7d2..d0b669e83 100644 --- a/datasets/canelevation-dem.yaml +++ b/datasets/canelevation-dem.yaml @@ -55,6 +55,14 @@ Resources: - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=hrdem-lidar)' - '[STAC Browser by Radiant Earth](https://radiantearth.github.io/stac-browser/#/external/datacube.services.geo.ca/stac/api/collections/hrdem-lidar)' - '[Browse Bucket](https://canelevation-dem.s3.ca-central-1.amazonaws.com/index.html)' + - Description: High-Resolution Digital Elevation Model (HRDEM) generated from optical stereo imagery for Northern Canada / Modèle numérique d'élévation haute résolution (MNEHR) généré à partir de couple stéréo d'imagerie optique pour le nord du Canada + ARN: arn:aws:s3:::canelevation-dem/hrdem-arcticdem/ + Region: ca-central-1 + Type: S3 Bucket + Explore: + - '[STAC catalog](https://datacube.services.geo.ca/stac/api/search?collections=hrdem-arcticdem)' + - '[STAC Browser by Radiant Earth](https://radiantearth.github.io/stac-browser/#/external/datacube.services.geo.ca/stac/api/collections/hrdem-arcticdem)' + - '[Browse Bucket](https://canelevation-dem.s3.ca-central-1.amazonaws.com/index.html)' - Description: Notifications for Canada Digital Elevation Models. 
ARN: arn:aws:sns:ca-central-1:675987781521:canelevation-dem-create-object Region: ca-central-1 @@ -92,3 +100,4 @@ DataAtWork: + From 14e59adb30c43d384e5e2988fb3a338bc8b15fd8 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 25 Nov 2025 11:29:32 -0900 Subject: [PATCH 590/751] ok: Update canelevation-dem.yaml --- datasets/canelevation-dem.yaml | 9 --------- 1 file changed, 9 deletions(-) diff --git a/datasets/canelevation-dem.yaml b/datasets/canelevation-dem.yaml index d0b669e83..cf77c68c4 100644 --- a/datasets/canelevation-dem.yaml +++ b/datasets/canelevation-dem.yaml @@ -92,12 +92,3 @@ DataAtWork: - Title: "Descriptor: Medium Resolution Digital Elevation Model From Natural Resources Canada’s CanElevation Series (MRDEM-30)" URL: https://doi.org/10.1109/IEEEDATA.2025.3576318 AuthorName: H. McGrath et al. - - - - - - - - - From 10e4d89417c1870cfd2b4e5cc69270601de9146e Mon Sep 17 00:00:00 2001 From: State of Colorado OIT-GIS Date: Tue, 25 Nov 2025 15:26:03 -0700 Subject: [PATCH 591/751] Update colorado-imagery.yaml Added SNS topic --- datasets/colorado-imagery.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/colorado-imagery.yaml b/datasets/colorado-imagery.yaml index a5c92527d..169280adb 100644 --- a/datasets/colorado-imagery.yaml +++ b/datasets/colorado-imagery.yaml @@ -20,6 +20,10 @@ Resources: ARN: arn:aws:s3:::colorado-public-imagery Region: us-west-2 Type: S3 Bucket + - Description: Notifications for real-time data updates + ARN: arn:aws:sns:us-west-2:180294215083:colorado-public-imagery-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Tutorials: - Title: Colorado AWS Open Imagery Guide From a9b334127e847840c92579c726b599b35001a029 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 25 Nov 2025 13:42:19 -0900 Subject: [PATCH 592/751] ok: Update colorado-imagery.yaml From 99854237338c6aada85bffb5d269439b5edf5260 Mon Sep 17 00:00:00 2001 From: Alden 
Keefe Sampson Date: Tue, 25 Nov 2025 23:36:53 -0500 Subject: [PATCH 593/751] Add dynamical.org NOAA HRRR, NOAA GFS, and ECMWF IFS ENS datasets --- datasets/dynamical-ecmwf-ifs-ens.yaml | 46 +++++++++++++++++++++++++++ datasets/dynamical-noaa-gfs.yaml | 46 +++++++++++++++++++++++++++ datasets/dynamical-noaa-hrrr.yaml | 45 ++++++++++++++++++++++++++ 3 files changed, 137 insertions(+) create mode 100644 datasets/dynamical-ecmwf-ifs-ens.yaml create mode 100644 datasets/dynamical-noaa-gfs.yaml create mode 100644 datasets/dynamical-noaa-hrrr.yaml diff --git a/datasets/dynamical-ecmwf-ifs-ens.yaml b/datasets/dynamical-ecmwf-ifs-ens.yaml new file mode 100644 index 000000000..793c975a5 --- /dev/null +++ b/datasets/dynamical-ecmwf-ifs-ens.yaml @@ -0,0 +1,46 @@ +Name: ECMWF IFS ENS +Description: | + +

+ The Integrated Forecasting System (IFS) is a global forecast model developed + by ECMWF. ENS is an ensemble configuration of IFS, containing 51 ensemble members. + IFS consists of a numerical model of the Earth system, which includes + an atmospheric model at its heart, coupled with models of other Earth system + components such as the ocean. The data assimilation system combines + the latest weather observations with a recent forecast to obtain the best + possible estimate of the current state of the Earth system. +

+ + These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. + +Documentation: https://dynamical.org/catalog/ifs-ens/' +Contact: feedback@dynamical.org +ManagedBy: "[dynamical.org](https://dynamical.org)" +UpdateFrequency: ECMWF IFS ENS Forecast, 15 day, 0.25 degree: Forecasts initialized every 24 hours +Tags: + - weather + - atmosphere + - meteorological + - climate + - forecast + - zarr +License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" +Resources: + - Description: ECMWF IFS ENS Icechunk Zarr data + ARN: arn:aws:s3:::dynamical-ecmwf-ifs-ens + Region: us-west-2 + Type: S3 Bucket + Explore: + - "[Browse Bucket](https://dynamical-ecmwf-ifs-ens.s3.amazonaws.com/index.html)" +DataAtWork: + Tutorials: + - Title: ECMWF IFS ENS Forecast, 15 day, 0.25 degree python quickstart notebook + NotebookURL: https://github.com/dynamical-org/notebooks/blob/main/ecmwf-ifs-ens-forecast-15-day-0-25-degree-icechunk.ipynb + AuthorName: dynamical.org + AuthorURL: https://dynamical.org +ADXCategories: + - Environmental Data \ No newline at end of file diff --git a/datasets/dynamical-noaa-gfs.yaml b/datasets/dynamical-noaa-gfs.yaml new file mode 100644 index 000000000..81745a7a7 --- /dev/null +++ b/datasets/dynamical-noaa-gfs.yaml @@ -0,0 +1,46 @@ +Name: NOAA GFS +Description: | + +

+ The Global Forecast System (GFS) is a National Oceanic and Atmospheric + Administration (NOAA) National Centers for Environmental Prediction + (NCEP) weather forecast model that generates data for dozens of + atmospheric and land-soil variables, including temperatures, winds, + precipitation, soil moisture, and atmospheric ozone concentration. The + system couples four separate models (atmosphere, ocean model, land/soil + model, and sea ice) that work together to depict weather conditions. +

+ + These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. +
    + +
  • NOAA GFS forecast - Weather forecasts from the Global Forecast System (GFS) operated by NOAA NWS NCEP.
  • + +
+Documentation: https://dynamical.org/catalog/gfs/' +Contact: feedback@dynamical.org +ManagedBy: "[dynamical.org](https://dynamical.org)" +UpdateFrequency: NOAA GFS forecast: Forecasts initialized every 6 hours +Tags: + - weather + - atmosphere + - meteorological + - climate + - forecast + - zarr +License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" +Resources: + - Description: NOAA GFS Icechunk Zarr data + ARN: arn:aws:s3:::dynamical-noaa-gfs + Region: us-west-2 + Type: S3 Bucket + Explore: + - "[Browse Bucket](https://dynamical-noaa-gfs.s3.amazonaws.com/index.html)" +DataAtWork: + Tutorials: + - Title: NOAA GFS forecast python quickstart notebook + NotebookURL: https://github.com/dynamical-org/notebooks/blob/main/noaa-gfs-forecast-icechunk.ipynb + AuthorName: dynamical.org + AuthorURL: https://dynamical.org +ADXCategories: + - Environmental Data \ No newline at end of file diff --git a/datasets/dynamical-noaa-hrrr.yaml b/datasets/dynamical-noaa-hrrr.yaml new file mode 100644 index 000000000..9712c012a --- /dev/null +++ b/datasets/dynamical-noaa-hrrr.yaml @@ -0,0 +1,45 @@ +Name: NOAA HRRR +Description: | + +

+ The High-Resolution Rapid Refresh (HRRR) is a NOAA real-time 3-km resolution, + hourly updated, cloud-resolving, convection-allowing atmospheric model, + initialized by 3km grids with 3km radar assimilation. Radar data is + assimilated in the HRRR every 15 min over a 1-h period adding further + detail to that provided by the hourly data assimilation from the 13km + radar-enhanced Rapid Refresh. +

+ + These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. +
    + +
  • NOAA HRRR forecast, 48 hour - Weather forecasts from the High Resolution Rapid Refresh (HRRR) model operated by NOAA NWS NCEP.
  • + +
+Documentation: https://dynamical.org/catalog/hrrr/' +Contact: feedback@dynamical.org +ManagedBy: "[dynamical.org](https://dynamical.org)" +UpdateFrequency: NOAA HRRR forecast, 48 hour: Forecasts initialized every 6 hours. +Tags: + - weather + - atmosphere + - meteorological + - climate + - forecast + - zarr +License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" +Resources: + - Description: NOAA HRRR Icechunk Zarr data + ARN: arn:aws:s3:::dynamical-noaa-hrrr + Region: us-west-2 + Type: S3 Bucket + Explore: + - "[Browse Bucket](https://dynamical-noaa-hrrr.s3.amazonaws.com/index.html)" +DataAtWork: + Tutorials: + - Title: NOAA HRRR forecast, 48 hour python quickstart notebook + NotebookURL: https://github.com/dynamical-org/notebooks/blob/main/noaa-hrrr-forecast-48-hour-icechunk.ipynb + AuthorName: dynamical.org + AuthorURL: https://dynamical.org +ADXCategories: + - Environmental Data \ No newline at end of file From ccc1ef4cf2cf59319cca2add05404bdfe59df448 Mon Sep 17 00:00:00 2001 From: Aodhan Sweeney <40372081+AodhanSweeney@users.noreply.github.com> Date: Mon, 1 Dec 2025 06:21:46 -0800 Subject: [PATCH 594/751] adding planette_era5 archive This is an addition of the planette_era5 archive to this fork of the aws open data registry to mitigate issues I was having with allowing upstream maintainers to edit the PR when using the planette account --- datasets/planette_era5_reanalysis.yaml | 69 ++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) create mode 100644 datasets/planette_era5_reanalysis.yaml diff --git a/datasets/planette_era5_reanalysis.yaml b/datasets/planette_era5_reanalysis.yaml new file mode 100644 index 000000000..8f18ba304 --- /dev/null +++ b/datasets/planette_era5_reanalysis.yaml @@ -0,0 +1,69 @@ +Name: Planette ERA5 Archive +Description: | + The ERA5 archive provides a comprehensive record of global weather and climate from 1940 to present, + with multiple temporal aggregations for flexible analysis. 
This dataset is derived from + the ECMWF/Copernicus ERA5 reanalysis and includes daily means, 7-day rolling means, + and monthly/seasonal aggregations at 0.25°×0.25° global resolution. The Planette ERA5 archive stores + this data in cloud-native format (Zarr with icechunk) for efficient access and analysis. + + The dataset includes essential atmospheric variables at both surface and pressure levels, enabling a + wide range of climate analyses, from daily weather patterns to long-term climate trends. Daily means + are computed by averaging hourly ERA5 data, while longer temporal aggregations are derived from these + daily means. +Documentation: https://github.com/PlanetteAI/planette_era5_archive/blob/main/README.md +Contact: aodhan.sweeney@planette.ai +ManagedBy: Planette.ai +UpdateFrequency: Monthly +Collabs: + ASDI: + Tags: + - climate + - weather + - forecast +Tags: + - aws-pds + - climate + - weather + - earth observation +License: | + Copernicus Licence (similar to CC-BY-4.0): You are free to share and adapt the material + for any purpose, even commercially, provided that you give appropriate credit. + https://cds.climate.copernicus.eu/api/v2/terms/static/licence-to-use-copernicus-products.pdf +Citation: | + Hersbach, H., Bell, B., Berrisford, P., et al. (2020): The ERA5 global reanalysis. + Quarterly Journal of the Royal Meteorological Society, 146(730), 1999-2049. 
+ https://doi.org/10.1002/qj.3803 + +Resources: + - Description: ERA5 Reanalysis Data with Multiple Temporal Aggregations (Zarr format) + ARN: arn:aws:s3:::planette-era5/era5/ + Region: us-east-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://planette-era5.s3.amazonaws.com/index.html#era5/)' +DataAtWork: + Tutorials: + - Title: Accessing ERA5 Data with Python + URL: https://github.com/PlanetteAI/planette_era5_archive/blob/main/planette_era5_tutorial.ipynb + AuthorName: "Aodhan Sweeney-Jaramillo" + AuthorURL: https://github.com/AodhanSweeney + Tools & Applications: + - Title: xarray + URL: https://docs.xarray.dev/ + AuthorName: xarray Developers + - Title: zarr-python + URL: https://zarr.dev/ + AuthorName: zarr Developers + - Title: icechunk + URL: https://github.com/earth-mover/icechunk + AuthorName: earth-mover + Publications: + - Title: "The ERA5 global reanalysis" + URL: https://doi.org/10.1002/qj.3803 + AuthorName: Hersbach, H., et al. + - Title: "ERA5 Documentation" + URL: https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation + AuthorName: ECMWF +DeprecatedNotice: +ADXCategories: + - Environmental Data \ No newline at end of file From fef1ed32c1c623eb9226928d67e270d0150b7edf Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 1 Dec 2025 12:22:33 -0500 Subject: [PATCH 595/751] ok: Update hprc-epigenome.yaml --- datasets/hprc-epigenome.yaml | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/datasets/hprc-epigenome.yaml b/datasets/hprc-epigenome.yaml index d8f9784f8..2212f7333 100644 --- a/datasets/hprc-epigenome.yaml +++ b/datasets/hprc-epigenome.yaml @@ -10,10 +10,9 @@ Tags: - bioinformatics - genetic - genomic + - epigenomics - life sciences - - PacBio - - ONT - - Fiber-seq + - aws-pds License: External data users may freely download, analyze, and publish results based on any HPRC data provided here without restrictions. 
Resources: - Description: HPRC Epigenome Browser From 2ae466f21f48468c12c8b430cb84c5085bc17e7a Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 08:29:19 -0900 Subject: [PATCH 596/751] ok: Update surya-bench.yaml From 3c50fa94ebd9de40f5976615fd6d437bef18e5ff Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 08:42:19 -0900 Subject: [PATCH 597/751] ok: Update planette_era5_reanalysis.yaml --- datasets/planette_era5_reanalysis.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/planette_era5_reanalysis.yaml b/datasets/planette_era5_reanalysis.yaml index 8f18ba304..589445847 100644 --- a/datasets/planette_era5_reanalysis.yaml +++ b/datasets/planette_era5_reanalysis.yaml @@ -66,4 +66,4 @@ DataAtWork: AuthorName: ECMWF DeprecatedNotice: ADXCategories: - - Environmental Data \ No newline at end of file + - Environmental Data From 16c8c7aaa049a978b5cae480e3ba155bacf91ee0 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 1 Dec 2025 13:49:40 -0500 Subject: [PATCH 598/751] ok: Update hprc-epigenome.yaml --- datasets/hprc-epigenome.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/hprc-epigenome.yaml b/datasets/hprc-epigenome.yaml index 2212f7333..40d4be708 100644 --- a/datasets/hprc-epigenome.yaml +++ b/datasets/hprc-epigenome.yaml @@ -22,7 +22,7 @@ Resources: DataAtWork: Tutorials: - Title: | - Get To Know A Dataset: HPRC Epigenome + "Get To Know A Dataset: HPRC Epigenome" URL: https://github.com/twlab/open-data-examples/blob/main/get-to-know-hprc-epigenome.ipynb AuthorName: HPRC Epigenome Browser AuthorURL: https://epigenome.humanpangenome.org/ @@ -31,7 +31,7 @@ DataAtWork: URL: https://doi.org/10.1038/s41586-023-05896-x AuthorName: Liao, WW., Asri, M., Ebler, J. et al. 
- Title: | - Modbed track: Visualization of modified bases in single-molecule sequencing + "Modbed track: Visualization of modified bases in single-molecule sequencing" URL: https://www.sciencedirect.com/science/article/pii/S2666979X23002999?via%3Dihub AuthorName: Daofeng Li, Xiaoyu Zhuo, Jessica K. Harrison, Shane Liu, Ting Wang - Title: WashU Epigenome Browser update 2025 From cb8eae9c6890422660d29e878ae2915c79ebb117 Mon Sep 17 00:00:00 2001 From: Arun George Zachariah Date: Mon, 1 Dec 2025 17:34:15 -0600 Subject: [PATCH 599/751] Adding resource details. --- datasets/{asl_1000 => asl_1000.yaml} | 5 +++++ 1 file changed, 5 insertions(+) rename datasets/{asl_1000 => asl_1000.yaml} (96%) diff --git a/datasets/asl_1000 b/datasets/asl_1000.yaml similarity index 96% rename from datasets/asl_1000 rename to datasets/asl_1000.yaml index fec22a3fd..7638cec06 100644 --- a/datasets/asl_1000 +++ b/datasets/asl_1000.yaml @@ -35,3 +35,8 @@ DataAtWork: Tutorials: - Title: NVIDIA Trustworthy AI GitHub URL: https://github.com/NVIDIA/Trustworthy-AI +Resources: + - Description: ASL 1000 Data + ARN: arn:aws:s3:::trustworthyaiproduct + Region: us-east-2 + Type: S3 Bucket \ No newline at end of file From cadf9dc63ec3dfb81a4db4d7b173224d0b0d0deb Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 14:48:19 -0900 Subject: [PATCH 600/751] ok: Update dynamical-ecmwf-ifs-ens.yaml --- datasets/dynamical-ecmwf-ifs-ens.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/dynamical-ecmwf-ifs-ens.yaml b/datasets/dynamical-ecmwf-ifs-ens.yaml index 793c975a5..ba5ca8265 100644 --- a/datasets/dynamical-ecmwf-ifs-ens.yaml +++ b/datasets/dynamical-ecmwf-ifs-ens.yaml @@ -43,4 +43,4 @@ DataAtWork: AuthorName: dynamical.org AuthorURL: https://dynamical.org ADXCategories: - - Environmental Data \ No newline at end of file + - Environmental Data From 4d0fe3385aa3554e03a1f9d4402f6750cf7c0dbc Mon Sep 17 00:00:00 2001 From: cstner 
<66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 14:51:45 -0900 Subject: [PATCH 601/751] ok: Update dynamical-noaa-hrrr.yaml --- datasets/dynamical-noaa-hrrr.yaml | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/datasets/dynamical-noaa-hrrr.yaml b/datasets/dynamical-noaa-hrrr.yaml index 9712c012a..5b348a246 100644 --- a/datasets/dynamical-noaa-hrrr.yaml +++ b/datasets/dynamical-noaa-hrrr.yaml @@ -1,6 +1,6 @@ Name: NOAA HRRR Description: | - +

The High-Resolution Rapid Refresh (HRRR) is a NOAA real-time 3-km resolution, hourly updated, cloud-resolving, convection-allowing atmospheric model, @@ -9,13 +9,13 @@ Description: | detail to that provided by the hourly data assimilation from the 13km radar-enhanced Rapid Refresh.

- - These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. -
    - -
  • NOAA HRRR forecast, 48 hour - Weather forecasts from the High Resolution Rapid Refresh (HRRR) model operated by NOAA NWS NCEP.
  • - -
+ + These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. +
    + +
  • NOAA HRRR forecast, 48 hour - Weather forecasts from the High Resolution Rapid Refresh (HRRR) model operated by NOAA NWS NCEP.
  • + +
Documentation: https://dynamical.org/catalog/hrrr/' Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" @@ -42,4 +42,4 @@ DataAtWork: AuthorName: dynamical.org AuthorURL: https://dynamical.org ADXCategories: - - Environmental Data \ No newline at end of file + - Environmental Data From f512dfbd06bdb99f4306313c567ff83bc4ff61ed Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 14:55:37 -0900 Subject: [PATCH 602/751] ok: Update dynamical-ecmwf-ifs-ens.yaml --- datasets/dynamical-ecmwf-ifs-ens.yaml | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/datasets/dynamical-ecmwf-ifs-ens.yaml b/datasets/dynamical-ecmwf-ifs-ens.yaml index ba5ca8265..543aacdb8 100644 --- a/datasets/dynamical-ecmwf-ifs-ens.yaml +++ b/datasets/dynamical-ecmwf-ifs-ens.yaml @@ -1,6 +1,6 @@ Name: ECMWF IFS ENS Description: | - +

The Integrated Forecasting System (IFS) is a global forecast model developed by ECMWF. ENS is an ensemble configuration of IFS, containing 51 ensemble members. @@ -10,13 +10,13 @@ Description: | the latest weather observations with a recent forecast to obtain the best possible estimate of the current state of the Earth system.

- - These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. - Documentation: https://dynamical.org/catalog/ifs-ens/' Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" From d6bc08127b668e3ee82766bd2253ea7985c945fb Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 14:56:21 -0900 Subject: [PATCH 603/751] ok: Update dynamical-noaa-gfs.yaml --- datasets/dynamical-noaa-gfs.yaml | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/datasets/dynamical-noaa-gfs.yaml b/datasets/dynamical-noaa-gfs.yaml index 81745a7a7..78a3a6442 100644 --- a/datasets/dynamical-noaa-gfs.yaml +++ b/datasets/dynamical-noaa-gfs.yaml @@ -1,6 +1,6 @@ Name: NOAA GFS Description: | - +

The Global Forecast System (GFS) is a National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Prediction @@ -10,18 +10,19 @@ Description: | system couples four separate models (atmosphere, ocean model, land/soil model, and sea ice) that work together to depict weather conditions.

- - These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. -
    + + These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. +
      -
    • NOAA GFS forecast - Weather forecasts from the Global Forecast System (GFS) operated by NOAA NWS NCEP.
    • +
    • NOAA GFS forecast - Weather forecasts from the Global Forecast System (GFS) operated by NOAA NWS NCEP.
    • -
    +
Documentation: https://dynamical.org/catalog/gfs/' Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" UpdateFrequency: NOAA GFS forecast: Forecasts initialized every 6 hours Tags: + - aws-pds - weather - atmosphere - meteorological @@ -43,4 +44,4 @@ DataAtWork: AuthorName: dynamical.org AuthorURL: https://dynamical.org ADXCategories: - - Environmental Data \ No newline at end of file + - Environmental Data From 090fff62e5ba4c4fd7c59e624ba771a00462887b Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 14:57:06 -0900 Subject: [PATCH 604/751] ok: Update dynamical-noaa-hrrr.yaml --- datasets/dynamical-noaa-hrrr.yaml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/datasets/dynamical-noaa-hrrr.yaml b/datasets/dynamical-noaa-hrrr.yaml index 5b348a246..e76f8daf2 100644 --- a/datasets/dynamical-noaa-hrrr.yaml +++ b/datasets/dynamical-noaa-hrrr.yaml @@ -16,11 +16,12 @@ Description: |
  • NOAA HRRR forecast, 48 hour - Weather forecasts from the High Resolution Rapid Refresh (HRRR) model operated by NOAA NWS NCEP.
  • -Documentation: https://dynamical.org/catalog/hrrr/' +Documentation: https://dynamical.org/catalog/hrrr/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" -UpdateFrequency: NOAA HRRR forecast, 48 hour: Forecasts initialized every 6 hours. +UpdateFrequency: "NOAA HRRR forecast, 48 hour: Forecasts initialized every 6 hours." Tags: + - aws-pds - weather - atmosphere - meteorological From 27ac4d0803ea3285279c58cb12659df4623440a2 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 14:57:21 -0900 Subject: [PATCH 605/751] ok: Update dynamical-noaa-gfs.yaml --- datasets/dynamical-noaa-gfs.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/dynamical-noaa-gfs.yaml b/datasets/dynamical-noaa-gfs.yaml index 78a3a6442..78bc8d48c 100644 --- a/datasets/dynamical-noaa-gfs.yaml +++ b/datasets/dynamical-noaa-gfs.yaml @@ -17,7 +17,7 @@ Description: |
  • NOAA GFS forecast - Weather forecasts from the Global Forecast System (GFS) operated by NOAA NWS NCEP.
  • -Documentation: https://dynamical.org/catalog/gfs/' +Documentation: https://dynamical.org/catalog/gfs/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" UpdateFrequency: NOAA GFS forecast: Forecasts initialized every 6 hours From a2074a04fb0be0197ec335c3f46533428467cf26 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 14:57:49 -0900 Subject: [PATCH 606/751] ok: Update dynamical-ecmwf-ifs-ens.yaml --- datasets/dynamical-ecmwf-ifs-ens.yaml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/datasets/dynamical-ecmwf-ifs-ens.yaml b/datasets/dynamical-ecmwf-ifs-ens.yaml index 543aacdb8..c7ac7429d 100644 --- a/datasets/dynamical-ecmwf-ifs-ens.yaml +++ b/datasets/dynamical-ecmwf-ifs-ens.yaml @@ -17,11 +17,12 @@ Description: |
  • ECMWF IFS ENS Forecast, 15 day, 0.25 degree - Ensemble weather forecasts from the ECMWF Integrated Forecasting System (IFS).
  • -Documentation: https://dynamical.org/catalog/ifs-ens/' +Documentation: https://dynamical.org/catalog/ifs-ens/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" -UpdateFrequency: ECMWF IFS ENS Forecast, 15 day, 0.25 degree: Forecasts initialized every 24 hours +UpdateFrequency: "ECMWF IFS ENS Forecast, 15 day, 0.25 degree: Forecasts initialized every 24 hours" Tags: + - aws-pds - weather - atmosphere - meteorological From 4c1584c6fb89df67b7fe5fe7472ed1224f2e80fa Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 14:58:24 -0900 Subject: [PATCH 607/751] ok: Update dynamical-noaa-gfs.yaml --- datasets/dynamical-noaa-gfs.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/dynamical-noaa-gfs.yaml b/datasets/dynamical-noaa-gfs.yaml index 78bc8d48c..f32be073e 100644 --- a/datasets/dynamical-noaa-gfs.yaml +++ b/datasets/dynamical-noaa-gfs.yaml @@ -20,7 +20,7 @@ Description: | Documentation: https://dynamical.org/catalog/gfs/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" -UpdateFrequency: NOAA GFS forecast: Forecasts initialized every 6 hours +UpdateFrequency: "NOAA GFS forecast: Forecasts initialized every 6 hours" Tags: - aws-pds - weather From 656952f6248978c1cf622852c0edab66aa16acbd Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 15:03:38 -0900 Subject: [PATCH 608/751] ok: Update dynamical-noaa-hrrr.yaml --- datasets/dynamical-noaa-hrrr.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/dynamical-noaa-hrrr.yaml b/datasets/dynamical-noaa-hrrr.yaml index e76f8daf2..fa7a8dd3c 100644 --- a/datasets/dynamical-noaa-hrrr.yaml +++ b/datasets/dynamical-noaa-hrrr.yaml @@ -38,7 +38,7 @@ Resources: - "[Browse Bucket](https://dynamical-noaa-hrrr.s3.amazonaws.com/index.html)" DataAtWork: Tutorials: - - Title: NOAA HRRR forecast, 48 hour python 
quickstart notebook + - Title: NOAA HRRR forecast, 48 hour python quickstart notebook NotebookURL: https://github.com/dynamical-org/notebooks/blob/main/noaa-hrrr-forecast-48-hour-icechunk.ipynb AuthorName: dynamical.org AuthorURL: https://dynamical.org From 3291cb5288b931c633a23b842f111f9ed6810546 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 15:23:36 -0900 Subject: [PATCH 609/751] ok: Update dynamical-noaa-gfs.yaml --- datasets/dynamical-noaa-gfs.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/dynamical-noaa-gfs.yaml b/datasets/dynamical-noaa-gfs.yaml index f32be073e..7fae6d862 100644 --- a/datasets/dynamical-noaa-gfs.yaml +++ b/datasets/dynamical-noaa-gfs.yaml @@ -39,7 +39,7 @@ Resources: - "[Browse Bucket](https://dynamical-noaa-gfs.s3.amazonaws.com/index.html)" DataAtWork: Tutorials: - - Title: NOAA GFS forecast python quickstart notebook + - Title: NOAA GFS forecast python quickstart notebook NotebookURL: https://github.com/dynamical-org/notebooks/blob/main/noaa-gfs-forecast-icechunk.ipynb AuthorName: dynamical.org AuthorURL: https://dynamical.org From 90b797336a81cbaa462c8b375688ea6ae2e29858 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 1 Dec 2025 15:23:54 -0900 Subject: [PATCH 610/751] ok: Update dynamical-ecmwf-ifs-ens.yaml --- datasets/dynamical-ecmwf-ifs-ens.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/dynamical-ecmwf-ifs-ens.yaml b/datasets/dynamical-ecmwf-ifs-ens.yaml index c7ac7429d..107d4da16 100644 --- a/datasets/dynamical-ecmwf-ifs-ens.yaml +++ b/datasets/dynamical-ecmwf-ifs-ens.yaml @@ -39,7 +39,7 @@ Resources: - "[Browse Bucket](https://dynamical-ecmwf-ifs-ens.s3.amazonaws.com/index.html)" DataAtWork: Tutorials: - - Title: ECMWF IFS ENS Forecast, 15 day, 0.25 degree python quickstart notebook + - Title: ECMWF IFS ENS Forecast, 15 day, 0.25 degree python quickstart notebook 
NotebookURL: https://github.com/dynamical-org/notebooks/blob/main/ecmwf-ifs-ens-forecast-15-day-0-25-degree-icechunk.ipynb AuthorName: dynamical.org AuthorURL: https://dynamical.org From 43a745f6c436c916baa51d49106ef99883a83b8f Mon Sep 17 00:00:00 2001 From: Arun George Zachariah Date: Mon, 1 Dec 2025 18:29:23 -0600 Subject: [PATCH 611/751] Updating Notebook URL. --- datasets/asl_1000.yaml | 24 +----------------------- 1 file changed, 1 insertion(+), 23 deletions(-) diff --git a/datasets/asl_1000.yaml b/datasets/asl_1000.yaml index 7638cec06..4fc42d061 100644 --- a/datasets/asl_1000.yaml +++ b/datasets/asl_1000.yaml @@ -5,28 +5,6 @@ This dataset provides a high-fidelity collection of American Sign Language (ASL) The annotations for this dataset were generated using an automated data pipeline to pre-annotate keyframes from the source videos. As a final, critical step, all automated annotations were subsequently reviewed and meticulously corrected by human labellers to ensure the highest level of accuracy and reliability, making it suitable for training production-grade machine learning models. -Annotation Methodology: -The annotations for this dataset were generated through a comprehensive process, combining automated extraction with human review to ensure the highest quality and accuracy. The specific steps in this process are described below. -Keyframe Extraction: Raw source videos were processed to extract the most meaningful frames. This step utilized motion analysis (optical flow) and active region detection to identify frames with significant signing activity, which were then refined based on image sharpness. -Automated Landmark Extraction: Each extracted keyframe was processed by an automated pipeline using Google's MediaPipe to generate a baseline set of annotations: -Pose Landmarks: 33 full-body pose landmarks were extracted, with a focus on the upper body and scores for visibility and presence. 
-Hand Landmarks: 21 high-accuracy landmarks were detected for both the left and right hand, including confidence scores. -Face Landmarks: A detailed face mesh of 468+ landmarks was extracted where applicable, including 52 blend shape coefficients and 3D transformation matrices. -Format Conversion and Ingestion: The extracted landmark data was converted into the SuperAnnotate JSON format and ingested into a human annotation workflow. -Human Verification and Correction: A team of trained human labellers reviewed every keyframe and all associated landmarks. They corrected any errors from the automated detection, improved landmark precision, and ensured temporal consistency. - -Dataset Contents and Format: -The dataset is structured to provide maximum flexibility, from raw media to fully processed annotations. The dataset includes: -Raw Videos: The original source videos -Extracted Keyframes: The raw, individual image frames, extracted by the pipeline's motion analysis step. -Annotation Files: JSON files for body, face, and hand landmarks. - -Potential Applications: -This dataset is ideal for a variety of tasks, including: -ASL Recognition and Translation: Training models to understand and translate signed language. -Gesture and Behavior Analysis: Studying the nuances of human motion and communication. -Avatar and Animation: Driving realistic 3D avatars and animations using the pose, hand, and facial expression data. - Contact: trustworthyaiprojects@nvidia.com ManagedBy: See all datasets managed by NVIDIA Corporation UpdateFrequency: New data is added as soon as it is available. 
@@ -34,7 +12,7 @@ License: Please see the NVIDIA Dataset License DataAtWork: Tutorials: - Title: NVIDIA Trustworthy AI GitHub - URL: https://github.com/NVIDIA/Trustworthy-AI + URL: https://github.com/NVIDIA/Trustworthy-AI/blob/main/ASL%20Developer%20Community/notebooks/asl_data_pipeline.ipynb Resources: - Description: ASL 1000 Data ARN: arn:aws:s3:::trustworthyaiproduct From e867aef169fc46e20879464bba0ba4c0f1f3715d Mon Sep 17 00:00:00 2001 From: Arun George Zachariah Date: Mon, 1 Dec 2025 18:31:17 -0600 Subject: [PATCH 612/751] Updating the tutorial repository. --- datasets/asl_1000.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/asl_1000.yaml b/datasets/asl_1000.yaml index 4fc42d061..094fbfc6e 100644 --- a/datasets/asl_1000.yaml +++ b/datasets/asl_1000.yaml @@ -11,8 +11,8 @@ UpdateFrequency: New data is added as soon as it is available. License: Please see the NVIDIA Dataset License DataAtWork: Tutorials: - - Title: NVIDIA Trustworthy AI GitHub - URL: https://github.com/NVIDIA/Trustworthy-AI/blob/main/ASL%20Developer%20Community/notebooks/asl_data_pipeline.ipynb + - Title: ASL Data Pipeline + URL: https://github.com/NVIDIA/Trustworthy-AI/tree/main/ASL%20Developer%20Community Resources: - Description: ASL 1000 Data ARN: arn:aws:s3:::trustworthyaiproduct From e261939a494caa5312e705ec8e893bb6520ae602 Mon Sep 17 00:00:00 2001 From: Arun George Zachariah Date: Mon, 1 Dec 2025 18:40:55 -0600 Subject: [PATCH 613/751] Adding webpage to request access. 
--- datasets/asl_1000.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/asl_1000.yaml b/datasets/asl_1000.yaml index 094fbfc6e..a3b5a5110 100644 --- a/datasets/asl_1000.yaml +++ b/datasets/asl_1000.yaml @@ -17,4 +17,5 @@ Resources: - Description: ASL 1000 Data ARN: arn:aws:s3:::trustworthyaiproduct Region: us-east-2 - Type: S3 Bucket \ No newline at end of file + Type: S3 Bucket + ControlledAccess: https://www.nvidia.com/en-us/gated-resources/trustworthy-ai-american-sign-language/ \ No newline at end of file From a53c0f7a72f38b69be64e00ca11ef7fd708fecb0 Mon Sep 17 00:00:00 2001 From: Alden Keefe Sampson Date: Mon, 1 Dec 2025 23:45:01 -0500 Subject: [PATCH 614/751] Add `URL` for tutorials, SNS topic, and clean up description whitespace --- datasets/dynamical-ecmwf-ifs-ens.yaml | 22 +++++++++++++--------- datasets/dynamical-noaa-gfs.yaml | 22 +++++++++++++--------- datasets/dynamical-noaa-hrrr.yaml | 24 ++++++++++++++---------- 3 files changed, 40 insertions(+), 28 deletions(-) diff --git a/datasets/dynamical-ecmwf-ifs-ens.yaml b/datasets/dynamical-ecmwf-ifs-ens.yaml index 107d4da16..b5cc6de33 100644 --- a/datasets/dynamical-ecmwf-ifs-ens.yaml +++ b/datasets/dynamical-ecmwf-ifs-ens.yaml @@ -1,7 +1,7 @@ Name: ECMWF IFS ENS Description: | - -

    +

    +

    The Integrated Forecasting System (IFS) is a global forecast model developed by ECMWF. ENS is an ensemble configuration of IFS, containing 51 ensemble members. IFS consists of a numerical model of the Earth system, which includes @@ -10,13 +10,12 @@ Description: | the latest weather observations with a recent forecast to obtain the best possible estimate of the current state of the Earth system.

    - - These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. -
      - +

      These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org.

      +

      When Icechunk 2.0 is released, these datasets will be updated correspondingly, and updated client libraries will be required for access. To be notified of dataset updates, subscribe to the mailing list.

      + +
    +
    Documentation: https://dynamical.org/catalog/ifs-ens/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" @@ -37,11 +36,16 @@ Resources: Type: S3 Bucket Explore: - "[Browse Bucket](https://dynamical-ecmwf-ifs-ens.s3.amazonaws.com/index.html)" + - Description: Notifications for dataset updates + ARN: arn:aws:sns:us-west-2:761136292730:dynamical-ecmwf-ifs-ens-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Tutorials: - Title: ECMWF IFS ENS Forecast, 15 day, 0.25 degree python quickstart notebook + URL: https://github.com/dynamical-org/notebooks/blob/main/ecmwf-ifs-ens-forecast-15-day-0-25-degree-icechunk.ipynb NotebookURL: https://github.com/dynamical-org/notebooks/blob/main/ecmwf-ifs-ens-forecast-15-day-0-25-degree-icechunk.ipynb AuthorName: dynamical.org AuthorURL: https://dynamical.org ADXCategories: - - Environmental Data + - Environmental Data \ No newline at end of file diff --git a/datasets/dynamical-noaa-gfs.yaml b/datasets/dynamical-noaa-gfs.yaml index 7fae6d862..86ffbb689 100644 --- a/datasets/dynamical-noaa-gfs.yaml +++ b/datasets/dynamical-noaa-gfs.yaml @@ -1,7 +1,7 @@ Name: NOAA GFS Description: | - -

    +

    +

    The Global Forecast System (GFS) is a National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Prediction (NCEP) weather forecast model that generates data for dozens of @@ -10,13 +10,12 @@ Description: | system couples four separate models (atmosphere, ocean model, land/soil model, and sea ice) that work together to depict weather conditions.

    - - These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. -
      - +

      These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org.

      +

      When Icechunk 2.0 is released, these datasets will be updated correspondingly, and updated client libraries will be required for access. To be notified of dataset updates, subscribe to the mailing list.

      +
      • NOAA GFS forecast - Weather forecasts from the Global Forecast System (GFS) operated by NOAA NWS NCEP.
      • - -
      +
    +
    Documentation: https://dynamical.org/catalog/gfs/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" @@ -37,11 +36,16 @@ Resources: Type: S3 Bucket Explore: - "[Browse Bucket](https://dynamical-noaa-gfs.s3.amazonaws.com/index.html)" + - Description: Notifications for dataset updates + ARN: arn:aws:sns:us-west-2:761136292730:dynamical-noaa-gfs-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Tutorials: - Title: NOAA GFS forecast python quickstart notebook + URL: https://github.com/dynamical-org/notebooks/blob/main/noaa-gfs-forecast-icechunk.ipynb NotebookURL: https://github.com/dynamical-org/notebooks/blob/main/noaa-gfs-forecast-icechunk.ipynb AuthorName: dynamical.org AuthorURL: https://dynamical.org ADXCategories: - - Environmental Data + - Environmental Data \ No newline at end of file diff --git a/datasets/dynamical-noaa-hrrr.yaml b/datasets/dynamical-noaa-hrrr.yaml index fa7a8dd3c..f5ecfe387 100644 --- a/datasets/dynamical-noaa-hrrr.yaml +++ b/datasets/dynamical-noaa-hrrr.yaml @@ -1,7 +1,7 @@ Name: NOAA HRRR Description: | - -

    +

    +

    The High-Resolution Rapid Refresh (HRRR) is a NOAA real-time 3-km resolution, hourly updated, cloud-resolving, convection-allowing atmospheric model, initialized by 3km grids with 3km radar assimilation. Radar data is @@ -9,17 +9,16 @@ Description: | detail to that provided by the hourly data assimilation from the 13km radar-enhanced Rapid Refresh.

    - - These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org. -
      - +

      These datasets have been translated to cloud-optimized Icechunk Zarr format by dynamical.org.

      +

      When Icechunk 2.0 is released, these datasets will be updated correspondingly, and updated client libraries will be required for access. To be notified of dataset updates, subscribe to the mailing list.

      + +
    +
    Documentation: https://dynamical.org/catalog/hrrr/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" -UpdateFrequency: "NOAA HRRR forecast, 48 hour: Forecasts initialized every 6 hours." +UpdateFrequency: "NOAA HRRR forecast, 48 hour: Forecasts initialized every 6 hours" Tags: - aws-pds - weather @@ -36,11 +35,16 @@ Resources: Type: S3 Bucket Explore: - "[Browse Bucket](https://dynamical-noaa-hrrr.s3.amazonaws.com/index.html)" + - Description: Notifications for dataset updates + ARN: arn:aws:sns:us-west-2:761136292730:dynamical-noaa-hrrr-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Tutorials: - Title: NOAA HRRR forecast, 48 hour python quickstart notebook + URL: https://github.com/dynamical-org/notebooks/blob/main/noaa-hrrr-forecast-48-hour-icechunk.ipynb NotebookURL: https://github.com/dynamical-org/notebooks/blob/main/noaa-hrrr-forecast-48-hour-icechunk.ipynb AuthorName: dynamical.org AuthorURL: https://dynamical.org ADXCategories: - - Environmental Data + - Environmental Data \ No newline at end of file From 271e502f6ba39e0b947a3348f15e2863f323ece1 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 2 Dec 2025 09:09:18 -0500 Subject: [PATCH 615/751] ok: Update asl_1000.yaml --- datasets/asl_1000.yaml | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/datasets/asl_1000.yaml b/datasets/asl_1000.yaml index a3b5a5110..b16a9c367 100644 --- a/datasets/asl_1000.yaml +++ b/datasets/asl_1000.yaml @@ -8,6 +8,9 @@ The annotations for this dataset were generated using an automated data pipeline Contact: trustworthyaiprojects@nvidia.com ManagedBy: See all datasets managed by NVIDIA Corporation UpdateFrequency: New data is added as soon as it is available. 
+Tags: + - aws-pds + - video License: Please see the NVIDIA Dataset License DataAtWork: Tutorials: @@ -18,4 +21,4 @@ Resources: ARN: arn:aws:s3:::trustworthyaiproduct Region: us-east-2 Type: S3 Bucket - ControlledAccess: https://www.nvidia.com/en-us/gated-resources/trustworthy-ai-american-sign-language/ \ No newline at end of file + ControlledAccess: https://www.nvidia.com/en-us/gated-resources/trustworthy-ai-american-sign-language/ From 8774ff7577dbfa4f8c4c400ada4c403b0e64f3b7 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 2 Dec 2025 09:20:42 -0500 Subject: [PATCH 616/751] ok: Update asl_1000.yaml --- datasets/asl_1000.yaml | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/datasets/asl_1000.yaml b/datasets/asl_1000.yaml index b16a9c367..b6de6733d 100644 --- a/datasets/asl_1000.yaml +++ b/datasets/asl_1000.yaml @@ -1,16 +1,12 @@ Name: ASL 1000 -Description: | -Overview -This dataset provides a high-fidelity collection of American Sign Language (ASL) videos annotated with 2D landmarks for hands, pose, and face. The data is designed to train advanced research and development in ASL recognition, translation, gesture analysis, and computer animation. - -The annotations for this dataset were generated using an automated data pipeline to pre-annotate keyframes from the source videos. As a final, critical step, all automated annotations were subsequently reviewed and meticulously corrected by human labellers to ensure the highest level of accuracy and reliability, making it suitable for training production-grade machine learning models. - +Description: This dataset provides a high-fidelity collection of American Sign Language (ASL) videos annotated with 2D landmarks for hands, pose, and face. The data is designed to train advanced research and development in ASL recognition, translation, gesture analysis, and computer animation. 
The annotations for this dataset were generated using an automated data pipeline to pre-annotate keyframes from the source videos. As a final, critical step, all automated annotations were subsequently reviewed and meticulously corrected by human labellers to ensure the highest level of accuracy and reliability, making it suitable for training production-grade machine learning models. Contact: trustworthyaiprojects@nvidia.com -ManagedBy: See all datasets managed by NVIDIA Corporation +ManagedBy: "[NVIDIA Corporation](https://www.nvidia.com/en-us/)" UpdateFrequency: New data is added as soon as it is available. Tags: - aws-pds - video + - machine learning License: Please see the NVIDIA Dataset License DataAtWork: Tutorials: From 4d70be6b19fb51f14cebf3a873e950b438b45127 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 2 Dec 2025 09:25:45 -0500 Subject: [PATCH 617/751] ok: Update asl_1000.yaml --- datasets/asl_1000.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/datasets/asl_1000.yaml b/datasets/asl_1000.yaml index b6de6733d..39fea743f 100644 --- a/datasets/asl_1000.yaml +++ b/datasets/asl_1000.yaml @@ -1,5 +1,6 @@ Name: ASL 1000 Description: This dataset provides a high-fidelity collection of American Sign Language (ASL) videos annotated with 2D landmarks for hands, pose, and face. The data is designed to train advanced research and development in ASL recognition, translation, gesture analysis, and computer animation. The annotations for this dataset were generated using an automated data pipeline to pre-annotate keyframes from the source videos. As a final, critical step, all automated annotations were subsequently reviewed and meticulously corrected by human labellers to ensure the highest level of accuracy and reliability, making it suitable for training production-grade machine learning models. 
+Documentation: "URL or description of documentation" Contact: trustworthyaiprojects@nvidia.com ManagedBy: "[NVIDIA Corporation](https://www.nvidia.com/en-us/)" UpdateFrequency: New data is added as soon as it is available. @@ -12,6 +13,7 @@ DataAtWork: Tutorials: - Title: ASL Data Pipeline URL: https://github.com/NVIDIA/Trustworthy-AI/tree/main/ASL%20Developer%20Community + AuthorName: "Name of the author" Resources: - Description: ASL 1000 Data ARN: arn:aws:s3:::trustworthyaiproduct From 4e0e17058eca224d7761a4cf750e74068d532998 Mon Sep 17 00:00:00 2001 From: Arun George Zachariah Date: Tue, 2 Dec 2025 11:22:50 -0600 Subject: [PATCH 618/751] Updating the license. --- datasets/asl_1000.yaml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/datasets/asl_1000.yaml b/datasets/asl_1000.yaml index 39fea743f..eff017d7c 100644 --- a/datasets/asl_1000.yaml +++ b/datasets/asl_1000.yaml @@ -1,6 +1,6 @@ Name: ASL 1000 Description: This dataset provides a high-fidelity collection of American Sign Language (ASL) videos annotated with 2D landmarks for hands, pose, and face. The data is designed to train advanced research and development in ASL recognition, translation, gesture analysis, and computer animation. The annotations for this dataset were generated using an automated data pipeline to pre-annotate keyframes from the source videos. As a final, critical step, all automated annotations were subsequently reviewed and meticulously corrected by human labellers to ensure the highest level of accuracy and reliability, making it suitable for training production-grade machine learning models. -Documentation: "URL or description of documentation" +Documentation: "https://github.com/NVIDIA/Trustworthy-AI/tree/main/ASL%20Developer%20Community" Contact: trustworthyaiprojects@nvidia.com ManagedBy: "[NVIDIA Corporation](https://www.nvidia.com/en-us/)" UpdateFrequency: New data is added as soon as it is available. 
@@ -8,12 +8,12 @@ Tags: - aws-pds - video - machine learning -License: Please see the NVIDIA Dataset License +License: "Please see the [NVIDIA Dataset License](https://github.com/NVIDIA/Trustworthy-AI/blob/main/ASL%20Developer%20Community/NVIDIA%20Data%20License%20for%20ASL%20Project.pdf)" DataAtWork: Tutorials: - Title: ASL Data Pipeline - URL: https://github.com/NVIDIA/Trustworthy-AI/tree/main/ASL%20Developer%20Community - AuthorName: "Name of the author" + URL: https://github.com/NVIDIA/Trustworthy-AI/blob/main/ASL%20Developer%20Community/notebooks/asl_data_pipeline.ipynb + AuthorName: "NVIDIA" Resources: - Description: ASL 1000 Data ARN: arn:aws:s3:::trustworthyaiproduct From 53deb7e543e75e418b5a00692a61f62c05d56005 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 2 Dec 2025 08:26:27 -0900 Subject: [PATCH 619/751] ok: Update dynamical-ecmwf-ifs-ens.yaml --- datasets/dynamical-ecmwf-ifs-ens.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/dynamical-ecmwf-ifs-ens.yaml b/datasets/dynamical-ecmwf-ifs-ens.yaml index b5cc6de33..ec43a1ac0 100644 --- a/datasets/dynamical-ecmwf-ifs-ens.yaml +++ b/datasets/dynamical-ecmwf-ifs-ens.yaml @@ -48,4 +48,4 @@ DataAtWork: AuthorName: dynamical.org AuthorURL: https://dynamical.org ADXCategories: - - Environmental Data \ No newline at end of file + - Environmental Data From 0f057e556d0795ea1d7b8d74c321ab71bc29bca0 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 2 Dec 2025 12:27:30 -0500 Subject: [PATCH 620/751] ok: Update asl_1000.yaml --- datasets/asl_1000.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/asl_1000.yaml b/datasets/asl_1000.yaml index eff017d7c..9d631a72f 100644 --- a/datasets/asl_1000.yaml +++ b/datasets/asl_1000.yaml @@ -3,7 +3,7 @@ Description: This dataset provides a high-fidelity collection of American Sign L Documentation: 
"https://github.com/NVIDIA/Trustworthy-AI/tree/main/ASL%20Developer%20Community" Contact: trustworthyaiprojects@nvidia.com ManagedBy: "[NVIDIA Corporation](https://www.nvidia.com/en-us/)" -UpdateFrequency: New data is added as soon as it is available. +UpdateFrequency: New data added as soon as it is available. Tags: - aws-pds - video From ea61a2d17c60f15f58a34c9bcb47a5d4257fabfd Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 2 Dec 2025 08:33:38 -0900 Subject: [PATCH 621/751] ok: Update dynamical-ecmwf-ifs-ens.yaml From 1c75fe6eb279670fca645361b215a7451a0fdec1 Mon Sep 17 00:00:00 2001 From: Alden Keefe Sampson Date: Tue, 2 Dec 2025 14:23:58 -0500 Subject: [PATCH 622/751] Fix links to dynamical.org model documentation pages --- datasets/dynamical-ecmwf-ifs-ens.yaml | 4 ++-- datasets/dynamical-noaa-gfs.yaml | 2 +- datasets/dynamical-noaa-hrrr.yaml | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/datasets/dynamical-ecmwf-ifs-ens.yaml b/datasets/dynamical-ecmwf-ifs-ens.yaml index ec43a1ac0..f3d51b3ed 100644 --- a/datasets/dynamical-ecmwf-ifs-ens.yaml +++ b/datasets/dynamical-ecmwf-ifs-ens.yaml @@ -16,7 +16,7 @@ Description: |
  • ECMWF IFS ENS Forecast, 15 day, 0.25 degree - Ensemble weather forecasts from the ECMWF Integrated Forecasting System (IFS).
  • -Documentation: https://dynamical.org/catalog/ifs-ens/ +Documentation: https://dynamical.org/catalog/models/ecmwf-ifs-ens/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" UpdateFrequency: "ECMWF IFS ENS Forecast, 15 day, 0.25 degree: Forecasts initialized every 24 hours" @@ -48,4 +48,4 @@ DataAtWork: AuthorName: dynamical.org AuthorURL: https://dynamical.org ADXCategories: - - Environmental Data + - Environmental Data \ No newline at end of file diff --git a/datasets/dynamical-noaa-gfs.yaml b/datasets/dynamical-noaa-gfs.yaml index 86ffbb689..572b9b8fc 100644 --- a/datasets/dynamical-noaa-gfs.yaml +++ b/datasets/dynamical-noaa-gfs.yaml @@ -16,7 +16,7 @@ Description: |
  • NOAA GFS forecast - Weather forecasts from the Global Forecast System (GFS) operated by NOAA NWS NCEP.
  • -Documentation: https://dynamical.org/catalog/gfs/ +Documentation: https://dynamical.org/catalog/models/noaa-gfs/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" UpdateFrequency: "NOAA GFS forecast: Forecasts initialized every 6 hours" diff --git a/datasets/dynamical-noaa-hrrr.yaml b/datasets/dynamical-noaa-hrrr.yaml index f5ecfe387..4e3d34eea 100644 --- a/datasets/dynamical-noaa-hrrr.yaml +++ b/datasets/dynamical-noaa-hrrr.yaml @@ -15,7 +15,7 @@ Description: |
  • NOAA HRRR forecast, 48 hour - Weather forecasts from the High Resolution Rapid Refresh (HRRR) model operated by NOAA NWS NCEP.
  • -Documentation: https://dynamical.org/catalog/hrrr/ +Documentation: https://dynamical.org/catalog/models/noaa-hrrr/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" UpdateFrequency: "NOAA HRRR forecast, 48 hour: Forecasts initialized every 6 hours" From 5159b99ed076544e06111a3920002bf9a5b161d8 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 2 Dec 2025 10:40:45 -0900 Subject: [PATCH 623/751] ok: Update dynamical-ecmwf-ifs-ens.yaml --- datasets/dynamical-ecmwf-ifs-ens.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/dynamical-ecmwf-ifs-ens.yaml b/datasets/dynamical-ecmwf-ifs-ens.yaml index f3d51b3ed..ca696efc1 100644 --- a/datasets/dynamical-ecmwf-ifs-ens.yaml +++ b/datasets/dynamical-ecmwf-ifs-ens.yaml @@ -48,4 +48,4 @@ DataAtWork: AuthorName: dynamical.org AuthorURL: https://dynamical.org ADXCategories: - - Environmental Data \ No newline at end of file + - Environmental Data From 7e0072167d8ede3ccb277252dd76f1702c31be15 Mon Sep 17 00:00:00 2001 From: ricardo Date: Wed, 3 Dec 2025 10:21:04 +0000 Subject: [PATCH 624/751] add open targets dataset yaml --- datasets/open-targets-platform.yaml | 69 +++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) create mode 100644 datasets/open-targets-platform.yaml diff --git a/datasets/open-targets-platform.yaml b/datasets/open-targets-platform.yaml new file mode 100644 index 000000000..540ff9bc6 --- /dev/null +++ b/datasets/open-targets-platform.yaml @@ -0,0 +1,69 @@ +Name: OpenTargets - Platform +Description: The Open Targets Platform is a comprehensive data integration tool that supports systematic identification and prioritisation of potential therapeutic drug targets. By integrating publicly available datasets including data generated by the Open Targets experimental and informatics research programmes, the Platform provides data and services to assist in the task of therapeutic hypothesis building. 
+Documentation: https://platform-docs.opentargets.org/ +Contact: outreach@opentargets.org +ManagedBy: Open Targets +UpdateFrequency: The data is released every three months. +Tags: + - drug targets + - drug discovery + - therapeutics + - targets + - diseases + - drugs + - gentropy + - variants + - credible sets +License: https://creativecommons.org/publicdomain/zero/1.0/ +Resources: + - Description: OpenTargets Release Data + ARN: + Region: + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Autoimmune colocalisations + URL: https://github.com/opentargets/notebooks + NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/autoimmune_colocalisations.ipynb + AuthorName: Open Targets Team + AuthorURL: https://github.com/opentargets + - Title: Autoimmune credible sets + URL: https://github.com/opentargets/notebooks + NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/autoimmune_credible_set.ipynb + AuthorName: Open Targets Team + AuthorURL: https://github.com/opentargets + - Title: ChEMBL Evidence Data Download + URL: https://github.com/opentargets/notebooks + NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/chembl_evidence_download.ipynb + AuthorName: Open Targets Team + AuthorURL: https://github.com/opentargets + - Title: Exploration of Open Targets datasets + URL: https://github.com/opentargets/notebooks + NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/exploring_ot_datasets.ipynb + AuthorName: Open Targets Team + AuthorURL: https://github.com/opentargets + - Title: Open Targets informatics tools + URL: https://www.ebi.ac.uk/training/online/courses/open-targets-quick-tour/ + AuthorName: Helena Cornu + AuthorURL: https://www.ebi.ac.uk/people/person/helena-cornu/ + - Title: Getting started with the Open Targets Platform GraphQL API + URL: https://www.ebi.ac.uk/training/events/getting-started-open-targets-platform-graphql-api/ + AuthorName: Helena Cornu + AuthorURL: 
https://www.ebi.ac.uk/people/person/helena-cornu/ + Tools & Applications: + - Title: Open Targets Platform + URL: https://platform.opentargets.org/ + AuthorName: Open Targets + AuthorURL: https://github.com/opentargets/ot-ui-apps + - Title: Open Targets Platform API + URL: https://api.platform.opentargets.org/ + AuthorName: Open Targets + AuthorURL: https://github.com/opentargets/platform-api + Publications: + - Title: Open Targets Platform: facilitating therapeutic hypotheses building in drug discovery + URL: https://academic.oup.com/nar/article/53/D1/D1467/7917960?login=true + AuthorName: Annalisa Buniello + AuthorURL: https://orcid.org/0000-0002-4623-8642 +DeprecatedNotice: +ADXCategories: + - Healthcare & Life Sciences Data From 6d7b9e10442a323327db1cfa14c8719df169c118 Mon Sep 17 00:00:00 2001 From: ricardo Date: Wed, 3 Dec 2025 10:33:28 +0000 Subject: [PATCH 625/751] update pub url to doi --- datasets/open-targets-platform.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/open-targets-platform.yaml b/datasets/open-targets-platform.yaml index 540ff9bc6..41316f32e 100644 --- a/datasets/open-targets-platform.yaml +++ b/datasets/open-targets-platform.yaml @@ -61,7 +61,7 @@ DataAtWork: AuthorURL: https://github.com/opentargets/platform-api Publications: - Title: Open Targets Platform: facilitating therapeutic hypotheses building in drug discovery - URL: https://academic.oup.com/nar/article/53/D1/D1467/7917960?login=true + URL: https://doi.org/10.1093/nar/gkae1128 AuthorName: Annalisa Buniello AuthorURL: https://orcid.org/0000-0002-4623-8642 DeprecatedNotice: From 81337f17bba4a180c11bc305aca0ff3ff3b979f8 Mon Sep 17 00:00:00 2001 From: ricardo Date: Wed, 3 Dec 2025 10:37:16 +0000 Subject: [PATCH 626/751] update pub title --- datasets/open-targets-platform.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/open-targets-platform.yaml b/datasets/open-targets-platform.yaml index 41316f32e..27ff34cbf 100644 --- 
a/datasets/open-targets-platform.yaml +++ b/datasets/open-targets-platform.yaml @@ -60,7 +60,7 @@ DataAtWork: AuthorName: Open Targets AuthorURL: https://github.com/opentargets/platform-api Publications: - - Title: Open Targets Platform: facilitating therapeutic hypotheses building in drug discovery + - Title: "Open Targets Platform: facilitating therapeutic hypotheses building in drug discovery" URL: https://doi.org/10.1093/nar/gkae1128 AuthorName: Annalisa Buniello AuthorURL: https://orcid.org/0000-0002-4623-8642 From 8d12c2a7a4eca8bb779624e11aab9f47a72d1aae Mon Sep 17 00:00:00 2001 From: Ben Hitz Date: Wed, 3 Dec 2025 11:07:36 -0800 Subject: [PATCH 627/751] Update igvf-consortium.yaml add SNS notification for new data --- datasets/igvf-consortium.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/igvf-consortium.yaml b/datasets/igvf-consortium.yaml index aa38ad8ec..6dd643803 100644 --- a/datasets/igvf-consortium.yaml +++ b/datasets/igvf-consortium.yaml @@ -26,6 +26,10 @@ Resources: ARN: arn:aws:s3:::igvf-public Region: us-west-2 Type: S3 Bucket + - Description: Notifications for new IGVF data + ARN: arn:aws:sns:us-west-2:407227577691:igvf-public-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Tutorials: - Title:Load AnnData files from IGVF into scanpy and view the UMAPs From 53139dcd6b2ef0d641311afb8c2a7883c083e5f4 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Wed, 3 Dec 2025 17:31:43 -0500 Subject: [PATCH 628/751] ok: Update igvf-consortium.yaml --- datasets/igvf-consortium.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/igvf-consortium.yaml b/datasets/igvf-consortium.yaml index 6dd643803..291198c27 100644 --- a/datasets/igvf-consortium.yaml +++ b/datasets/igvf-consortium.yaml @@ -20,7 +20,7 @@ Tags: - genetic - genomic - life sciences -License: E[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) You are free to share abnd adapt tgus data with proper attribution. 
+License: E[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) You are free to share and adapt tgus data with proper attribution. Resources: - Description: Released and Archived IGVF Data Files ARN: arn:aws:s3:::igvf-public From d069d54ea23b575949bd389c75d58959a758baaa Mon Sep 17 00:00:00 2001 From: Chris Stoner Date: Wed, 3 Dec 2025 16:36:06 -0900 Subject: [PATCH 629/751] search fix - cannot use div tags in description body --- datasets/dynamical-ecmwf-ifs-ens.yaml | 2 -- datasets/dynamical-noaa-gfs.yaml | 2 -- datasets/dynamical-noaa-hrrr.yaml | 2 -- 3 files changed, 6 deletions(-) diff --git a/datasets/dynamical-ecmwf-ifs-ens.yaml b/datasets/dynamical-ecmwf-ifs-ens.yaml index ca696efc1..1c8dd1604 100644 --- a/datasets/dynamical-ecmwf-ifs-ens.yaml +++ b/datasets/dynamical-ecmwf-ifs-ens.yaml @@ -1,6 +1,5 @@ Name: ECMWF IFS ENS Description: | -

    The Integrated Forecasting System (IFS) is a global forecast model developed by ECMWF. ENS is an ensemble configuration of IFS, containing 51 ensemble members. @@ -15,7 +14,6 @@ Description: |

    -
    Documentation: https://dynamical.org/catalog/models/ecmwf-ifs-ens/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" diff --git a/datasets/dynamical-noaa-gfs.yaml b/datasets/dynamical-noaa-gfs.yaml index 572b9b8fc..c5977956d 100644 --- a/datasets/dynamical-noaa-gfs.yaml +++ b/datasets/dynamical-noaa-gfs.yaml @@ -1,6 +1,5 @@ Name: NOAA GFS Description: | -

    The Global Forecast System (GFS) is a National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Prediction @@ -15,7 +14,6 @@ Description: |

    • NOAA GFS forecast - Weather forecasts from the Global Forecast System (GFS) operated by NOAA NWS NCEP.
    -
    Documentation: https://dynamical.org/catalog/models/noaa-gfs/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" diff --git a/datasets/dynamical-noaa-hrrr.yaml b/datasets/dynamical-noaa-hrrr.yaml index 4e3d34eea..683cab422 100644 --- a/datasets/dynamical-noaa-hrrr.yaml +++ b/datasets/dynamical-noaa-hrrr.yaml @@ -1,6 +1,5 @@ Name: NOAA HRRR Description: | -

    The High-Resolution Rapid Refresh (HRRR) is a NOAA real-time 3-km resolution, hourly updated, cloud-resolving, convection-allowing atmospheric model, @@ -14,7 +13,6 @@ Description: |

    -
    Documentation: https://dynamical.org/catalog/models/noaa-hrrr/ Contact: feedback@dynamical.org ManagedBy: "[dynamical.org](https://dynamical.org)" From be7a86aa037d1053b20b69ae41ad86e11f5a2c8f Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 3 Dec 2025 16:37:30 -0900 Subject: [PATCH 630/751] ok: Update dynamical-ecmwf-ifs-ens.yaml From 63b1cb718eaca485e36da7fbeaf1dfb40654fc81 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Thu, 4 Dec 2025 11:45:50 -0700 Subject: [PATCH 631/751] Add US Tidal dataset information to marine-energy-data.yaml --- datasets/marine-energy-data.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/marine-energy-data.yaml b/datasets/marine-energy-data.yaml index 76dd2d0e9..805de63c2 100644 --- a/datasets/marine-energy-data.yaml +++ b/datasets/marine-energy-data.yaml @@ -44,6 +44,12 @@ Resources: Type: S3 Bucket Explore: - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=marine-energy-data&prefix=umsli%2F)' + - Description: "[High Resolution Tidal Hindcast (US Tidal) Dataset](https://mhkdr.openei.org/submissions/632)" + ARN: arn:aws:s3:::marine-energy-data/us-tidal/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=marine-energy-data&prefix=us-tidal%2F)' DataAtWork: Tools & Applications: Publications: From f11d8ff2265a111d416387ca32397cc067606c94 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 4 Dec 2025 10:01:25 -0900 Subject: [PATCH 632/751] ok: Update marine-energy-data.yaml From e61cf7be538c4f11f4f3b383d695bff1c6969879 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 4 Dec 2025 16:16:52 -0500 Subject: [PATCH 633/751] ok: Update igvf-consortium.yaml --- datasets/igvf-consortium.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/igvf-consortium.yaml 
b/datasets/igvf-consortium.yaml index 291198c27..e590dd4f2 100644 --- a/datasets/igvf-consortium.yaml +++ b/datasets/igvf-consortium.yaml @@ -20,7 +20,7 @@ Tags: - genetic - genomic - life sciences -License: E[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) You are free to share and adapt tgus data with proper attribution. +License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" You are free to share and adapt this data with proper attribution. Resources: - Description: Released and Archived IGVF Data Files ARN: arn:aws:s3:::igvf-public @@ -32,7 +32,7 @@ Resources: Type: SNS Topic DataAtWork: Tutorials: - - Title:Load AnnData files from IGVF into scanpy and view the UMAPs + - Title: Load AnnData files from IGVF into scanpy and view the UMAPs URL: https://github.com/IGVF-DACC/igvf-data-usage-examples/blob/master/igvf-scanpy.ipynb AuthorName: Ben Hitz AuthorURL: https://github.com/hitz From 378de48c2e73c4bff3f67783b17cffd8f2483dee Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 4 Dec 2025 16:24:58 -0500 Subject: [PATCH 634/751] ok: Update igvf-consortium.yaml --- datasets/igvf-consortium.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/igvf-consortium.yaml b/datasets/igvf-consortium.yaml index e590dd4f2..8b1fed257 100644 --- a/datasets/igvf-consortium.yaml +++ b/datasets/igvf-consortium.yaml @@ -20,7 +20,7 @@ Tags: - genetic - genomic - life sciences -License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" You are free to share and adapt this data with proper attribution. 
+License: '[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)' You are free to share and adapt this data with proper attribution Resources: - Description: Released and Archived IGVF Data Files ARN: arn:aws:s3:::igvf-public From d3e02e8d087defe742a11a2c315a9762378bad57 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 4 Dec 2025 16:33:07 -0500 Subject: [PATCH 635/751] ok: Update igvf-consortium.yaml --- datasets/igvf-consortium.yaml | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/datasets/igvf-consortium.yaml b/datasets/igvf-consortium.yaml index 8b1fed257..2b5e40cbc 100644 --- a/datasets/igvf-consortium.yaml +++ b/datasets/igvf-consortium.yaml @@ -20,7 +20,7 @@ Tags: - genetic - genomic - life sciences -License: '[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)' You are free to share and adapt this data with proper attribution +License: CC BY 4.0 - https://creativecommons.org/licenses/by/4.0/ - You are free to share and adapt this data with proper attribution Resources: - Description: Released and Archived IGVF Data Files ARN: arn:aws:s3:::igvf-public @@ -41,10 +41,10 @@ DataAtWork: AuthorName: Otto Jolanki AuthorURL: https://github.com/ottojolanki Tools & Applications: - - Title: The IGVF Catalog - URL: https://catalog.igvf.org - AuthorName: The IGVF Consortium - AuthorURL: www.igvf.org + - Title: The IGVF Catalog + URL: https://catalog.igvf.org + AuthorName: The IGVF Consortium + AuthorURL: www.igvf.org Publications: - Title: Deciphering the impact of genomic variation on function URL: https://www-nature-com.stanford.idm.oclc.org/articles/s41586-024-07510-0 From 0d39fb04fb6fe03d403f4bf7442d93eb6c653d47 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 4 Dec 2025 16:54:27 -0500 Subject: [PATCH 636/751] ok: Update flab.yaml --- datasets/flab.yaml | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/datasets/flab.yaml b/datasets/flab.yaml index dfa3651db..e42a71bae 
100644 --- a/datasets/flab.yaml +++ b/datasets/flab.yaml @@ -1,18 +1,15 @@ -Name: FLAb: Fitness Landscapes for Antibodies +Name: "FLAb: Fitness Landscapes for Antibodies" Description: FLAb is the largest publicly available therapeutic antibody dataset designed to train and benchmark protein AI models. It provides open-access, high-quality developability data on diverse therapeutic properties, including expression, thermostability, immunogenicity, aggregation, polyreactivity, binding affinity, and pharmacokinetics. Documentation: https://github.com/Graylab/FLAb/blob/main/README.md Contact: mchungy1@jhu.edu ManagedBy: "[Jeffrey Gray Lab, Johns Hopkins University](https://graylab.jhu.edu/)" UpdateFrequency: Any new public release of antibody developabilty data is deposited into FLAb Tags: - - Protein language models - - Protein design - - Antibody engineering - - Therapeueutic antibodies - - Developability - - Machine learning - - Clinical stage therapeutics - - Biophysics + - protein + - protein template + - machine learning + - life sciences + - aws-pds License: https://creativecommons.org/licenses/by/4.0/ Citation: "FLAb was accessed on [DATE] at registry.opendata.aws/flab" Resources: @@ -22,14 +19,14 @@ Resources: Type: S3 Bucket DataAtWork: Tutorials: - - Title: FLAb tutorial: Benchmarking a protein language model for antibody expression prediction + - Title: "FLAb tutorial: Benchmarking a protein language model for antibody expression prediction" NotebookURL: https://github.com/Graylab/FLAb/blob/main/examples/FLAb_ZeroShotExample_IgLM_Expression.ipynb AuthorName: Michael Chungyoun AuthorURL: https://www.linkedin.com/in/mfc12/ Publications: - - Title: FLAb: Benchmarking deep learning methods for antibody fitness prediction + - Title: "FLAb: Benchmarking deep learning methods for antibody fitness prediction" URL: https://doi.org/10.1101/2024.01.13.575504 AuthorName: Michael Chungyoun and Jeffrey J. 
Gray AuthorURL: https://www.linkedin.com/in/mfc12/ ADXCategories: - - Healthcare & Life Sciences Data \ No newline at end of file + - Healthcare & Life Sciences Data From 6f510c293581791c982efa5f87aaea2fcd1feeca Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 4 Dec 2025 17:28:38 -0500 Subject: [PATCH 637/751] ok: Update flab.yaml --- datasets/flab.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/flab.yaml b/datasets/flab.yaml index e42a71bae..ef6a70ac5 100644 --- a/datasets/flab.yaml +++ b/datasets/flab.yaml @@ -20,7 +20,7 @@ Resources: DataAtWork: Tutorials: - Title: "FLAb tutorial: Benchmarking a protein language model for antibody expression prediction" - NotebookURL: https://github.com/Graylab/FLAb/blob/main/examples/FLAb_ZeroShotExample_IgLM_Expression.ipynb + URL: https://github.com/Graylab/FLAb/blob/main/examples/FLAb_ZeroShotExample_IgLM_Expression.ipynb AuthorName: Michael Chungyoun AuthorURL: https://www.linkedin.com/in/mfc12/ Publications: From 7ff51931a24e88090af7a753a8a43c0224cc2557 Mon Sep 17 00:00:00 2001 From: Christopher Tabone Date: Fri, 5 Dec 2025 17:00:08 +0000 Subject: [PATCH 638/751] Update bucket name from mod-datadumps to alliance-genome-downloads Change Alliance S3 bucket ARN to reflect the new public bucket name and add browse bucket link for easier data exploration. --- datasets/alliance-genome-resources.yaml | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/datasets/alliance-genome-resources.yaml b/datasets/alliance-genome-resources.yaml index 1a9f45b51..3ed166b94 100644 --- a/datasets/alliance-genome-resources.yaml +++ b/datasets/alliance-genome-resources.yaml @@ -27,10 +27,12 @@ Tags: License: Most Alliance data is available under CC0 1.0 Universal (Public Domain Dedication). Some datasets may use CC-BY 4.0 (attribution required). Full details at https://www.alliancegenome.org/terms-of-use Citation: Alliance of Genome Resources Consortium. 
Alliance of Genome Resources Portal - unified model organism research platform. Nucleic Acids Research (2023). https://doi.org/10.1093/nar/gkac1003 Resources: - - Description: Alliance-wide integrated datasets including disease associations, gene expression, molecular and genetic interactions, orthology relationships, and gene descriptions across all Alliance organisms. Data organized by release version (8.3.0, 8.2.0, etc.) and data type. Includes combined data files and organism-specific collections for FB (FlyBase/Drosophila), MGI (Mouse), RGD (Rat), SGD (Yeast), WB (Worm), XBXL/XBXT (Xenopus), ZFIN (Zebrafish), and HUMAN reference data. Files are available in TSV, JSON, and specialized formats (PSI-MI TAB for interactions, VCF for variants). - ARN: arn:aws:s3:::mod-datadumps + - Description: Alliance-wide integrated datasets including disease associations, gene expression, molecular and genetic interactions, orthology relationships, gene descriptions, and variants across all Alliance organisms. Data is organized by release version (8.3.0/, 8.2.0/, etc.), then by data type, with organism-specific collections for FB (FlyBase/Drosophila), MGI (Mouse), RGD (Rat), SGD (Yeast), WB (Worm), XBXL/XBXT (Xenopus), ZFIN (Zebrafish), and HUMAN reference data. Available in TSV, JSON, and VCF formats. + ARN: arn:aws:s3:::alliance-genome-downloads Region: us-east-1 Type: S3 Bucket + Explore: + - '[Browse Bucket](https://alliance-genome-downloads.s3.amazonaws.com/)' - Description: FlyBase-specific data for Drosophila melanogaster and related species, including gene annotations, GO annotations, expression data (bulk RNA-Seq, single-cell RNA-Seq), disease associations, phenotypes, interactions, orthologs, genome sequences (FASTA), and genome annotations (GFF3/GTF). Data organized by release (current/, FB2025_04/, etc.) with precomputed analysis files and complete Chado XML database dumps. Publicly accessible via HTTPS for direct download without AWS credentials. 
ARN: arn:aws:s3:::s3ftp.flybase.org Region: us-east-1 From f575817676fcee68f52620fc46d4e731bdb4ca50 Mon Sep 17 00:00:00 2001 From: Greg Lindahl Date: Sun, 7 Dec 2025 18:17:04 +0000 Subject: [PATCH 639/751] feat: Add new tutorial details for Common Crawl dataset --- datasets/commoncrawl.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/commoncrawl.yaml b/datasets/commoncrawl.yaml index 8d6a4254a..e816e9c10 100644 --- a/datasets/commoncrawl.yaml +++ b/datasets/commoncrawl.yaml @@ -18,6 +18,12 @@ Resources: Type: S3 Bucket AccountRequired: True DataAtWork: + Tutorials: + - Title: Get To Know A Dataset - Common Crawl + URL: https://github.com/commoncrawl/whirlwind-python-notebook + NotebookURL: https://github.com/commoncrawl/whirlwind-python-notebook/blob/main/aws-ccf-dataset.ipynb + AuthorName: Common Crawl Foundation + AuthorURL: https://commoncrawl.org/ Tutorials: - Title: Analysing Petabytes of Websites URL: http://tech.marksblogg.com/petabytes-of-website-data-spark-emr.html From 7d3a9b212ea999b5f4d1119a5d193faa1840ed4f Mon Sep 17 00:00:00 2001 From: Peter Schmiedeskamp Date: Mon, 8 Dec 2025 05:22:56 -0800 Subject: [PATCH 640/751] ok: ready to merge From c6df1d09ed6da3b495e1397b38e8929368136d93 Mon Sep 17 00:00:00 2001 From: Peter Schmiedeskamp Date: Mon, 8 Dec 2025 05:33:21 -0800 Subject: [PATCH 641/751] Remove duplicate Tutorials section Removed duplicate Tutorials section from commoncrawl.yaml --- datasets/commoncrawl.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/commoncrawl.yaml b/datasets/commoncrawl.yaml index e816e9c10..f33198846 100644 --- a/datasets/commoncrawl.yaml +++ b/datasets/commoncrawl.yaml @@ -24,7 +24,6 @@ DataAtWork: NotebookURL: https://github.com/commoncrawl/whirlwind-python-notebook/blob/main/aws-ccf-dataset.ipynb AuthorName: Common Crawl Foundation AuthorURL: https://commoncrawl.org/ - Tutorials: - Title: Analysing Petabytes of Websites URL: 
http://tech.marksblogg.com/petabytes-of-website-data-spark-emr.html AuthorName: Mark Litwintschik From 453c86ae3f6c35effbefdcd52f342b33fd42e6cb Mon Sep 17 00:00:00 2001 From: Peter Schmiedeskamp Date: Mon, 8 Dec 2025 05:34:23 -0800 Subject: [PATCH 642/751] ok: to merge From 0d4538fa29f9f9853dd928a7845b8820980a9f9c Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 09:14:25 -0900 Subject: [PATCH 643/751] ok: Update kanagawa_pointcloud.yaml From 178cc1ed4caf13f61a2e08a274959c0af6e2e432 Mon Sep 17 00:00:00 2001 From: xhagrg Date: Mon, 8 Dec 2025 14:01:50 -0600 Subject: [PATCH 644/751] Add SNS topic and browse. --- datasets/surya-bench.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/datasets/surya-bench.yaml b/datasets/surya-bench.yaml index 2a48b9460..2fa33eb4c 100644 --- a/datasets/surya-bench.yaml +++ b/datasets/surya-bench.yaml @@ -22,3 +22,9 @@ Resources: ARN: arn:aws:s3:::nasa-surya-bench Region: us-west-2 Type: S3 Bucket + - Description: Notifications for Surya bench data + ARN: arn:aws:sns:us-west-2:614929158106:nasa-surya-bench-object_created + Region: us-west-2 + Type: SNS Topic + Explore: + - '[Browse Bucket](http://nasa-surya-bench.s3.amazonaws.com/index.html)' From 1babebff0acb75b27cc86e1cd2723dba63379437 Mon Sep 17 00:00:00 2001 From: xhagrg Date: Mon, 8 Dec 2025 14:02:17 -0600 Subject: [PATCH 645/751] Update browse url. 
--- datasets/surya-bench.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/surya-bench.yaml b/datasets/surya-bench.yaml index 2fa33eb4c..37f1d6bdc 100644 --- a/datasets/surya-bench.yaml +++ b/datasets/surya-bench.yaml @@ -22,9 +22,9 @@ Resources: ARN: arn:aws:s3:::nasa-surya-bench Region: us-west-2 Type: S3 Bucket + Explore: + - '[Browse Bucket](http://nasa-surya-bench.s3.amazonaws.com/index.html)' - Description: Notifications for Surya bench data ARN: arn:aws:sns:us-west-2:614929158106:nasa-surya-bench-object_created Region: us-west-2 Type: SNS Topic - Explore: - - '[Browse Bucket](http://nasa-surya-bench.s3.amazonaws.com/index.html)' From bee02c0895384b5b776e4e84e81a7c08e67b8202 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 11:23:46 -0900 Subject: [PATCH 646/751] ok: Update surya-bench.yaml From 84272c1743607e8314c65da794055dc6baaa9d8b Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:22:45 -0700 Subject: [PATCH 647/751] Update ManagedBy field in gdr-data-lake.yaml --- datasets/gdr-data-lake.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/gdr-data-lake.yaml b/datasets/gdr-data-lake.yaml index 47cbca42d..9c0b4c335 100644 --- a/datasets/gdr-data-lake.yaml +++ b/datasets/gdr-data-lake.yaml @@ -10,7 +10,7 @@ Description: | Documentation: https://github.com/openEDI/documentation/ Contact: https://github.com/openEDI/documentation/issues -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: As needed Collabs: ASDI: From c9a042887b8bb300cd5030d3b492db3e6fee457d Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:24:13 -0700 Subject: [PATCH 648/751] Update ManagedBy field in oedi-data-lake.yaml --- 
datasets/oedi-data-lake.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/oedi-data-lake.yaml b/datasets/oedi-data-lake.yaml index 79a09295b..c35320844 100644 --- a/datasets/oedi-data-lake.yaml +++ b/datasets/oedi-data-lake.yaml @@ -8,7 +8,7 @@ Description: | analysis and advance innovation. Documentation: https://github.com/openEDI/documentation/ Contact: https://github.com/openEDI/documentation/issues -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: As needed Collabs: ASDI: From 66274b505bb4dfd477e2f1a451a8d5e95ff2b00d Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:25:18 -0700 Subject: [PATCH 649/751] Update ManagedBy field in marine-energy-data.yaml --- datasets/marine-energy-data.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/marine-energy-data.yaml b/datasets/marine-energy-data.yaml index 805de63c2..9e43b77c1 100644 --- a/datasets/marine-energy-data.yaml +++ b/datasets/marine-energy-data.yaml @@ -7,7 +7,7 @@ Description: | This data lake is a sister-data lake to the Department of Energy’s Open Energy Data Initiative (OEDI) data lake. Documentation: https://github.com/openEDI/documentation/ Contact: https://github.com/openEDI/documentation/issues -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: As needed Collabs: ASDI: From 9812e3ad80a92966eda1f5912fd85113916b0f54 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:27:12 -0700 Subject: [PATCH 650/751] Update ManagedBy field Updated managed by organization from 'National Renewable Energy Laboratory' to 'National Laboratory of the Rockies'. 
--- datasets/nrel-pds-building-stock.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/nrel-pds-building-stock.yaml b/datasets/nrel-pds-building-stock.yaml index ecd709cda..284c11fd5 100644 --- a/datasets/nrel-pds-building-stock.yaml +++ b/datasets/nrel-pds-building-stock.yaml @@ -8,7 +8,7 @@ Description: | and demand flexibility upgrades applied. Documentation: https://www.nrel.gov/buildings/end-use-load-profiles.html Contact: ComStock@nrel.gov and ResStock@nrel.gov -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: Twice per year Collabs: ASDI: @@ -54,3 +54,4 @@ DataAtWork: URL: https://www.nrel.gov/docs/fy22osti/80889.pdf AuthorName: E. Wilson, A. Parker, A. Fontanini, et al. + From 6f542c8c06b5004de9f7994a5f589d20b43ef1bc Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:28:12 -0700 Subject: [PATCH 651/751] Update ManagedBy field in dsgrid.yaml Update NREL to NLR --- datasets/nrel-pds-dsgrid.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/nrel-pds-dsgrid.yaml b/datasets/nrel-pds-dsgrid.yaml index b267999e6..bf8468c51 100644 --- a/datasets/nrel-pds-dsgrid.yaml +++ b/datasets/nrel-pds-dsgrid.yaml @@ -7,7 +7,7 @@ Description: | production cost models. 
Documentation: https://www.nrel.gov/analysis/dsgrid.html Contact: elaine.hale@nrel.gov -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: Annually Collabs: ASDI: From 03dde623b706444f99c032453aee3bc014008ff3 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:29:10 -0700 Subject: [PATCH 652/751] Update ManagedBy field in nrel-pds-ncdb.yaml Updated NREL to NLR --- datasets/nrel-pds-ncdb.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/nrel-pds-ncdb.yaml b/datasets/nrel-pds-ncdb.yaml index 69a678349..12f7da8ae 100644 --- a/datasets/nrel-pds-ncdb.yaml +++ b/datasets/nrel-pds-ncdb.yaml @@ -8,7 +8,7 @@ Description: | important climate scenarios. Documentation: https://nsrdb.nrel.gov/ Contact: Manajit.Sengupta@nrel.gov -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: As needed Collabs: ASDI: From 2c5644201da44efca9d25bdbb2a8fb0c1273bb5a Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:31:34 -0700 Subject: [PATCH 653/751] Update managed by field in nrel-pds-nsrdb.yaml Updated NREL to NLR --- datasets/nrel-pds-nsrdb.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/nrel-pds-nsrdb.yaml b/datasets/nrel-pds-nsrdb.yaml index 69eae4656..9023d0769 100644 --- a/datasets/nrel-pds-nsrdb.yaml +++ b/datasets/nrel-pds-nsrdb.yaml @@ -9,7 +9,7 @@ Description: | spatial scales to accurately represent regional solar radiation climates. 
Documentation: https://nsrdb.nrel.gov/ Contact: nsrdb@nrel.gov -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: Annually Collabs: ASDI: @@ -124,3 +124,4 @@ DataAtWork: - Title: Physics-guided machine learning for improved accuracy of the National Solar Radiation Database URL: https://www.sciencedirect.com/science/article/abs/pii/S0038092X22000044 AuthorName: Grant Buster, Mike Bannister, Aron Habte, Dylan Hettinger, Galen Maclaurin, Michael Rossol, Manajit Sengupta, Yu Xie + From b54568f2e23fa0fbdc20f0a1a7b3e4824992b688 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:32:53 -0700 Subject: [PATCH 654/751] Update ManagedBy field in nrel-pds-porotomo Updated NREL to NLR --- datasets/nrel-pds-porotomo.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/nrel-pds-porotomo.yaml b/datasets/nrel-pds-porotomo.yaml index 1f38d6bcd..9f1d4d959 100644 --- a/datasets/nrel-pds-porotomo.yaml +++ b/datasets/nrel-pds-porotomo.yaml @@ -7,7 +7,7 @@ Description: | Energy (EERE), U.S. Department of Energy. 
Documentation: https://github.com/openEDI/documentation/blob/master/PoroTomo/PoroTomo.md Contact: Thomas Coleman (thomas.coleman@silixa.com) -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: As needed Collabs: ASDI: From df75d9b3dddf6749a422de5c41eb6cc196a31872 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:33:53 -0700 Subject: [PATCH 655/751] Update ManagedBy field in nrel-pds-sup3rcc.yaml Updated NREL to NLR --- datasets/nrel-pds-sup3rcc.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/nrel-pds-sup3rcc.yaml b/datasets/nrel-pds-sup3rcc.yaml index 774b17b72..533939ad4 100644 --- a/datasets/nrel-pds-sup3rcc.yaml +++ b/datasets/nrel-pds-sup3rcc.yaml @@ -18,7 +18,7 @@ Description: | quantify this uncertainty. Documentation: https://github.com/NREL/sup3r Contact: Grant Buster (grant.buster@nrel.gov) -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: Annual Collabs: ASDI: From 1135313af81be4f5ab5a202587e40d90aad68730 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:34:56 -0700 Subject: [PATCH 656/751] Update ManagedBy field in windai.yaml Updated NREL to NLR --- datasets/nrel-pds-windai.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/nrel-pds-windai.yaml b/datasets/nrel-pds-windai.yaml index e5ad7eefc..5583f3e6b 100644 --- a/datasets/nrel-pds-windai.yaml +++ b/datasets/nrel-pds-windai.yaml @@ -11,7 +11,7 @@ Description: | documentation that show how to access the data for ML modeling. 
Documentation: https://github.com/NREL/windAI_bench Contact: Ryan King (ryan.king@nrel.gov) -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: Annually Collabs: ASDI: From 5af50c2e4131b519b49358efd4606ecc74ae7890 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:36:19 -0700 Subject: [PATCH 657/751] Update ManagedBy field in nrel-pds-wtk.yaml Updated NREL to NLR --- datasets/nrel-pds-wtk.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/nrel-pds-wtk.yaml b/datasets/nrel-pds-wtk.yaml index 8990cbbf2..fe03449fc 100644 --- a/datasets/nrel-pds-wtk.yaml +++ b/datasets/nrel-pds-wtk.yaml @@ -7,7 +7,7 @@ Description: | integration studies. Documentation: https://www.nrel.gov/grid/wind-toolkit.html Contact: wind-toolkit@nrel.gov -ManagedBy: '[National Renewable Energy Laboratory](https://www.nrel.gov/)' +ManagedBy: '[National Laboratory of the Rockies](https://www.nrel.gov/)' UpdateFrequency: As Needed Collabs: ASDI: @@ -270,3 +270,4 @@ DataAtWork: - Title: 'WTK-LED: The WIND Toolkit Long-Term Ensemble Dataset' URL: https://www.osti.gov/servlets/purl/2473210/ AuthorName: Caroline Draxl, Jiali Wang, Lindsay Sheridan, et al. 
+ From 9e7b92ce3b0ded7f5d56013ff3cf413de72763b3 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:19:07 -0900 Subject: [PATCH 658/751] ok: Update nrel-pds-wtk.yaml --- datasets/nrel-pds-wtk.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/nrel-pds-wtk.yaml b/datasets/nrel-pds-wtk.yaml index fe03449fc..5af640d77 100644 --- a/datasets/nrel-pds-wtk.yaml +++ b/datasets/nrel-pds-wtk.yaml @@ -270,4 +270,3 @@ DataAtWork: - Title: 'WTK-LED: The WIND Toolkit Long-Term Ensemble Dataset' URL: https://www.osti.gov/servlets/purl/2473210/ AuthorName: Caroline Draxl, Jiali Wang, Lindsay Sheridan, et al. - From 6718175a519846d05d5f9f7921d97a807c12148c Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:26:29 -0900 Subject: [PATCH 659/751] ok: Update nrel-pds-windai.yaml From 8ef79622f18698b5869e736f96379352cc4435d6 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:29:49 -0900 Subject: [PATCH 660/751] ok: Update nrel-pds-sup3rcc.yaml From b57d8182e6974311d579d9c0481ee8e0a5b87a60 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:33:36 -0900 Subject: [PATCH 661/751] ok: Update nrel-pds-porotomo.yaml From 478095935a00ce4764f24852aff0dd4c7522ec93 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:37:24 -0900 Subject: [PATCH 662/751] ok: Update nrel-pds-nsrdb.yaml --- datasets/nrel-pds-nsrdb.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/nrel-pds-nsrdb.yaml b/datasets/nrel-pds-nsrdb.yaml index 9023d0769..027fca976 100644 --- a/datasets/nrel-pds-nsrdb.yaml +++ b/datasets/nrel-pds-nsrdb.yaml @@ -124,4 +124,3 @@ DataAtWork: - Title: Physics-guided machine learning for improved accuracy of the National Solar Radiation Database URL: 
https://www.sciencedirect.com/science/article/abs/pii/S0038092X22000044 AuthorName: Grant Buster, Mike Bannister, Aron Habte, Dylan Hettinger, Galen Maclaurin, Michael Rossol, Manajit Sengupta, Yu Xie - From c4356791cb6ba8a3d28c5416d961e4e0be29f633 Mon Sep 17 00:00:00 2001 From: kanagawa-pointcloud Date: Tue, 9 Dec 2025 10:52:17 +0900 Subject: [PATCH 663/751] Update kanagawa_pointcloud.yaml --- datasets/kanagawa_pointcloud.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/kanagawa_pointcloud.yaml b/datasets/kanagawa_pointcloud.yaml index e21478b30..b5a1e7fd7 100644 --- a/datasets/kanagawa_pointcloud.yaml +++ b/datasets/kanagawa_pointcloud.yaml @@ -22,6 +22,10 @@ Resources: ARN: arn:aws:s3:::kanagawa-pointcloud Region: ap-northeast-1 Type: S3 Bucket +- Description: Notifications for new kanagawa-pointcloud data + ARN: arn:aws:sns:ap-northeast-1:895319340027:kanagawa-pointcloud_created + Region: ap-northeast-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: Tutorial of handling LAS format point cloud data From 148faff5146617d5d2289df1fc0eb5a4fa2e4e06 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 16:57:10 -0900 Subject: [PATCH 664/751] ok: Update nrel-pds-ncdb.yaml From ea5b9b3101911dfcee29aa69c1b007caa46f8c4c Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 17:02:57 -0900 Subject: [PATCH 665/751] ok: Update nrel-pds-dsgrid.yaml From fcb282a3ff20109093f4e3df30988324be4ae299 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 17:08:38 -0900 Subject: [PATCH 666/751] ok: Update nrel-pds-building-stock.yaml --- datasets/nrel-pds-building-stock.yaml | 2 -- 1 file changed, 2 deletions(-) diff --git a/datasets/nrel-pds-building-stock.yaml b/datasets/nrel-pds-building-stock.yaml index 284c11fd5..38fffdda3 100644 --- a/datasets/nrel-pds-building-stock.yaml +++ 
b/datasets/nrel-pds-building-stock.yaml @@ -53,5 +53,3 @@ DataAtWork: - Title: "End-Use Load Profiles for the U.S. Building Stock" URL: https://www.nrel.gov/docs/fy22osti/80889.pdf AuthorName: E. Wilson, A. Parker, A. Fontanini, et al. - - From f70f19428c6a0c78571cca67e7054398dfc72d20 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 17:14:47 -0900 Subject: [PATCH 667/751] ok: Update marine-energy-data.yaml From ccf9559a2c54e662ee2e7d33835e5cd55e192ed4 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 17:18:03 -0900 Subject: [PATCH 668/751] ok: Update oedi-data-lake.yaml From b2cde85c9e998acd70c03be6b48560c1ce9913c7 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 8 Dec 2025 17:21:57 -0900 Subject: [PATCH 669/751] ok: Update gdr-data-lake.yaml From a50dd95d3713f5fbfab509b2f2d85caa94469881 Mon Sep 17 00:00:00 2001 From: japan-pointcloud Date: Tue, 9 Dec 2025 13:03:06 +0900 Subject: [PATCH 670/751] Update japan_pointcloud.yaml Adding SNS topic resource --- datasets/japan_pointcloud.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/japan_pointcloud.yaml b/datasets/japan_pointcloud.yaml index f5413f929..add1552b8 100644 --- a/datasets/japan_pointcloud.yaml +++ b/datasets/japan_pointcloud.yaml @@ -22,6 +22,10 @@ Resources: ARN: arn:aws:s3:::japan-pointcloud Region: ap-northeast-1 Type: S3 Bucket +- Description: Notifications for new japan-pointcloud data + ARN: arn:aws:sns:ap-northeast-1:250546175908:japan-pointcloud-object_created + Region: ap-northeast-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: Tutorial of handling LAS format point cloud data From 2ea62857f9878b47b812728b0badf1aadd85d040 Mon Sep 17 00:00:00 2001 From: mwielocha Date: Tue, 9 Dec 2025 14:16:04 +0100 Subject: [PATCH 671/751] updated SNS queue arn --- datasets/nuview-state.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) 
diff --git a/datasets/nuview-state.yaml b/datasets/nuview-state.yaml index 56b8ab033..5ce94f32e 100644 --- a/datasets/nuview-state.yaml +++ b/datasets/nuview-state.yaml @@ -24,7 +24,7 @@ Resources: Type: S3 Bucket RequesterPays: True - Description: New data notifications - ARN: arn:aws:sns:us-west-2:{TBA}:nuview-state-opendata-events + ARN: arn:aws:sns:us-west-2:830737158982:nuview-state-opendata-object_created Region: us-west-2 Type: SNS Topic DataAtWork: From 627eb057f23d602461da7c92867f0c007457b16a Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 9 Dec 2025 09:35:41 -0500 Subject: [PATCH 672/751] ok: Update alliance-genome-resources.yaml --- datasets/alliance-genome-resources.yaml | 2 -- 1 file changed, 2 deletions(-) diff --git a/datasets/alliance-genome-resources.yaml b/datasets/alliance-genome-resources.yaml index 3ed166b94..106286562 100644 --- a/datasets/alliance-genome-resources.yaml +++ b/datasets/alliance-genome-resources.yaml @@ -45,8 +45,6 @@ DataAtWork: URL: https://github.com/alliance-genome/agr_open_data/blob/main/TUTORIAL.md AuthorName: Alliance of Genome Resources Consortium AuthorURL: https://www.alliancegenome.org - Services: - - S3 Tools & Applications: - Title: Alliance of Genome Resources Portal URL: https://www.alliancegenome.org From 0b1215aff5a31d30e0166f17c8bfd39cd2d7b120 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 9 Dec 2025 09:41:22 -0500 Subject: [PATCH 673/751] ok: Update alliance-genome-resources.yaml --- datasets/alliance-genome-resources.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/alliance-genome-resources.yaml b/datasets/alliance-genome-resources.yaml index 106286562..3bf8e1ffa 100644 --- a/datasets/alliance-genome-resources.yaml +++ b/datasets/alliance-genome-resources.yaml @@ -19,7 +19,6 @@ Tags: - Mus musculus - Rattus norvegicus - Homo sapiens - - disease - transcriptomics - protein - vcf From 81e7c795854a4d87af3942346fac62fe920d4c14 Mon Sep 17 00:00:00 2001 From: cstner 
<66844762+cstner@users.noreply.github.com> Date: Tue, 9 Dec 2025 06:28:09 -0900 Subject: [PATCH 674/751] ok: Update japan_pointcloud.yaml --- datasets/japan_pointcloud.yaml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/datasets/japan_pointcloud.yaml b/datasets/japan_pointcloud.yaml index add1552b8..5cc02b8ff 100644 --- a/datasets/japan_pointcloud.yaml +++ b/datasets/japan_pointcloud.yaml @@ -22,10 +22,10 @@ Resources: ARN: arn:aws:s3:::japan-pointcloud Region: ap-northeast-1 Type: S3 Bucket -- Description: Notifications for new japan-pointcloud data - ARN: arn:aws:sns:ap-northeast-1:250546175908:japan-pointcloud-object_created - Region: ap-northeast-1 - Type: SNS Topic + - Description: Notifications for new japan-pointcloud data + ARN: arn:aws:sns:ap-northeast-1:250546175908:japan-pointcloud-object_created + Region: ap-northeast-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: Tutorial of handling LAS format point cloud data From 6a2fecdbf076f3cd8f0944bc6154dab6293ea922 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 9 Dec 2025 06:35:04 -0900 Subject: [PATCH 675/751] ok: Update kanagawa_pointcloud.yaml --- datasets/kanagawa_pointcloud.yaml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/datasets/kanagawa_pointcloud.yaml b/datasets/kanagawa_pointcloud.yaml index b5a1e7fd7..0ef88f532 100644 --- a/datasets/kanagawa_pointcloud.yaml +++ b/datasets/kanagawa_pointcloud.yaml @@ -22,10 +22,10 @@ Resources: ARN: arn:aws:s3:::kanagawa-pointcloud Region: ap-northeast-1 Type: S3 Bucket -- Description: Notifications for new kanagawa-pointcloud data - ARN: arn:aws:sns:ap-northeast-1:895319340027:kanagawa-pointcloud_created - Region: ap-northeast-1 - Type: SNS Topic + - Description: Notifications for new kanagawa-pointcloud data + ARN: arn:aws:sns:ap-northeast-1:895319340027:kanagawa-pointcloud_created + Region: ap-northeast-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: 
Tutorial of handling LAS format point cloud data From cd649893c1d6fda9c7be69384c77093372482d6e Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 9 Dec 2025 10:49:31 -0900 Subject: [PATCH 676/751] ok: Update nuview-state.yaml --- datasets/nuview-state.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/nuview-state.yaml b/datasets/nuview-state.yaml index 5ce94f32e..a2d9bc08d 100644 --- a/datasets/nuview-state.yaml +++ b/datasets/nuview-state.yaml @@ -9,6 +9,7 @@ Contact: support@nuview.space ManagedBy: "[NUVIEW](https://nuview.space/)" UpdateFrequency: Project-based updates. Tags: + - aws-pds - geospatial - satellite imagery - natural resource From f46b962fc41915b95884321c6f8aea86a55faf66 Mon Sep 17 00:00:00 2001 From: mwielocha Date: Tue, 9 Dec 2025 20:56:31 +0100 Subject: [PATCH 677/751] removed RequesterPays --- datasets/nuview-state.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/nuview-state.yaml b/datasets/nuview-state.yaml index 5ce94f32e..8012da67b 100644 --- a/datasets/nuview-state.yaml +++ b/datasets/nuview-state.yaml @@ -22,7 +22,6 @@ Resources: ARN: arn:aws:s3:::nuview-state-opendata Region: us-west-2 Type: S3 Bucket - RequesterPays: True - Description: New data notifications ARN: arn:aws:sns:us-west-2:830737158982:nuview-state-opendata-object_created Region: us-west-2 From 82f240f6d0c945ccab171c4686a8d389a044b9dd Mon Sep 17 00:00:00 2001 From: Adam Tyson Date: Wed, 10 Dec 2025 15:24:37 +0100 Subject: [PATCH 678/751] Update metadata --- datasets/brainglobe.yaml | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/datasets/brainglobe.yaml b/datasets/brainglobe.yaml index 19df95b0f..06cea67a0 100644 --- a/datasets/brainglobe.yaml +++ b/datasets/brainglobe.yaml @@ -26,18 +26,15 @@ License: Creative Commons CC0 1.0 Universal Citation: Claudi et al., (2020). BrainGlobe Atlas API: a common interface for neuroanatomical atlases. 
Journal of Open Source Software, 5(54), 2668, https://doi.org/10.21105/joss.02668 Resources: - Description: Atlases and sample data in a public bucket - ARN: - Region: - Type: - Explore: + ARN: arn:aws:s3:::brainglobe + Region: us-west-2 + Type: S3 Bucket DataAtWork: Tutorials: - - Title: + - Title: Interacting with cloud atlas data through Python URL: https://brainglobe.info/aws_examples/explore_atlas.html - NotebookURL: AuthorName: Alessandro Felder AuthorURL: https://github.com/alessandrofelder - Services: Publications: - Title: BrainGlobe Atlas API: a common interface for neuroanatomical atlases URL: https://doi.org/10.21105/joss.02668 From e6b7638a96ce9ee05bb7482e0f3fe426157eed86 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 11 Dec 2025 06:57:23 -0900 Subject: [PATCH 679/751] ok: Update nuview-state.yaml From 9ef4fc62e093418cfe0c232828fa7a44e9dc0495 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 11 Dec 2025 07:01:23 -0900 Subject: [PATCH 680/751] ok: Update nuview-state.yaml --- datasets/nuview-state.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/nuview-state.yaml b/datasets/nuview-state.yaml index b5135dbc4..04d30d235 100644 --- a/datasets/nuview-state.yaml +++ b/datasets/nuview-state.yaml @@ -15,7 +15,7 @@ Tags: - natural resource - sustainability - disaster response - - digital elevation model + - dem - lidar License: CC0 "Public Domain" (or state/agency-specific open data licenses) Resources: From 9bf57628722bb24165f8b8f40780a4ae7171099b Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 11 Dec 2025 12:23:17 -0500 Subject: [PATCH 681/751] ok: Update brainglobe.yaml --- datasets/brainglobe.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/brainglobe.yaml b/datasets/brainglobe.yaml index 06cea67a0..b41b5b6e1 100644 --- a/datasets/brainglobe.yaml +++ b/datasets/brainglobe.yaml @@ -6,6 +6,7 @@ 
ManagedBy: [BrainGlobe](https://brainglobe.info/) UpdateFrequency: When new atlases are packaged Tags: - biology + - life sciences - digital preservation - Homo sapiens - image processing @@ -21,7 +22,7 @@ Tags: - Rattus norvegicus - volumetric imaging - zarr - + - aws-pds License: Creative Commons CC0 1.0 Universal Citation: Claudi et al., (2020). BrainGlobe Atlas API: a common interface for neuroanatomical atlases. Journal of Open Source Software, 5(54), 2668, https://doi.org/10.21105/joss.02668 Resources: From 957450499a066d2c5a03e1e00b9bcc8a9231b53e Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 11 Dec 2025 14:31:24 -0500 Subject: [PATCH 682/751] ok: Update brainglobe.yaml --- datasets/brainglobe.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/brainglobe.yaml b/datasets/brainglobe.yaml index b41b5b6e1..33c718482 100644 --- a/datasets/brainglobe.yaml +++ b/datasets/brainglobe.yaml @@ -2,7 +2,7 @@ Name: BrainGlobe Atlases Description: BrainGlobe provides an archive and standardised interface to anatomical atlases from multiple species. This dataset includes these atlases, and other data (e.g. sample neuroanatomy data) to allow the greatest use of the atlases. Documentation: https://brainglobe.info/documentation/brainglobe-atlasapi/usage/atlas-details.html Contact: hello@brainglobe.info -ManagedBy: [BrainGlobe](https://brainglobe.info/) +ManagedBy: "[BrainGlobe](https://brainglobe.info/)" UpdateFrequency: When new atlases are packaged Tags: - biology @@ -24,7 +24,7 @@ Tags: - zarr - aws-pds License: Creative Commons CC0 1.0 Universal -Citation: Claudi et al., (2020). BrainGlobe Atlas API: a common interface for neuroanatomical atlases. Journal of Open Source Software, 5(54), 2668, https://doi.org/10.21105/joss.02668 +Citation: "Claudi et al., (2020). BrainGlobe Atlas API: a common interface for neuroanatomical atlases. 
Journal of Open Source Software, 5(54), 2668, https://doi.org/10.21105/joss.02668" Resources: - Description: Atlases and sample data in a public bucket ARN: arn:aws:s3:::brainglobe @@ -37,7 +37,7 @@ DataAtWork: AuthorName: Alessandro Felder AuthorURL: https://github.com/alessandrofelder Publications: - - Title: BrainGlobe Atlas API: a common interface for neuroanatomical atlases + - Title: "BrainGlobe Atlas API: a common interface for neuroanatomical atlases" URL: https://doi.org/10.21105/joss.02668 AuthorName: Federico Claudi, Luigi Petrucco, Adam L. Tyson et al. AuthorURL: https://brainglobe.info/ From a4541b23b31ca234af33017ae1159e2595b11d00 Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Fri, 12 Dec 2025 08:28:09 -0500 Subject: [PATCH 683/751] Add files via upload update the blank fields left at the beginning --- CCRSMODISAlbedo.yml | 99 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 99 insertions(+) create mode 100644 CCRSMODISAlbedo.yml diff --git a/CCRSMODISAlbedo.yml b/CCRSMODISAlbedo.yml new file mode 100644 index 000000000..adb682709 --- /dev/null +++ b/CCRSMODISAlbedo.yml @@ -0,0 +1,99 @@ +Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS +Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). 
+ Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). +Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo +Contact: alexander.trichtchenko@nrcan-rncan.gc.ca +ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation +UpdateFrequency: Semi-annually, until the end of MODIS operations + Deux fois par an, jusqu'à la fin des opérations MODIS +Tags: + - aws-pds + - analysis ready data + - broadband + - Canada + - COG + - earth observation + - satellite imagery +License: Creative Commons Licence. Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ +Citation: Trishchenko, Alexander P. 2025. CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals. 
+Resources: + - Description: Cloud Optimized GeoTIFF (COG) images + ARN: arn:aws:s3::: ccrs-modis-albedo + Region: ca-central-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Get To Know A Dataset - MCCRS MODIS Albedo at 250-m resolution and 10-day intervals + URL: https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb + Services: + AuthorName: Alexander Trichtchenko + AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko + NotebookURL (Optional): https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb + Publications: + - Title: Boreal lichen woodlands: a possible negative feedback to climate change in eastern North America + URL: https://doi.org/10.1016/j.agrformet.2010.12.013 + AuthorName: Bernier, P.Y., Desjardins, R.L., Karimi-Zindashty, Y., Worth, D., Beaudoin, A., Luo, Y., Wang, S. + - Title: Detection of North American land cover change between 2005 and 2010 with 250m MODIS data + URL: https://www.researchgate.net/publication/286156544_Detection_of_North_American_land_cover_change_between_2005_and_2010_with_250m_MODIS_Data + AuthorName: Colditz, R.R., Pouliot, D., Llamas, R.M., Homer, C., Latifovic, R., Ressl, R.A., Tovar, C.M., Hernández, A.V., Richardson, K. + - Title: Annual mapping of large Forest disturbances across Canada's forests using 250 m MODIS imagery from 2000 to 2011 + URL: https://doi.org/10.1139/cjfr-2014-0229 + AuthorName: Guindon, L., Bernier, P.Y., Beaudoin, A., Pouliot, D., Villemaire, P., Hall, R.J., Latifovic, R., St-Amant, R. + - Title: Perennial snow and ice variations (2000-2008) in the Arctic circumpolar land area from satellite observations + URL: https://doi.org/10.1029/2010JF001664 + AuthorName: Fontana F.M.A., Trishchenko A.P., Luo Y., Khlopenkov K.V., Nussbaumer S.U., Wunderle S.
+ - Title: Influence of two management practices in the Canadian Prairies on radiative forcing + URL: https://doi.org/10.1016/j.scitotenv.2020.142701 + AuthorName: Liu, J., Worth, D.E., Desjardins, R.L., Haak, D., McConkey, B., Cerkowniak, D. + - Title: Implementation and Evaluation of Concurrent Gradient Search Method for Reprojection of MODIS Level 1B Imagery + URL: https://doi.org/10.1109/TGRS.2008.916633 + AuthorName: Khlopenkov, K.V., and Trishchenko, A.P. + - Title: Developing clear-sky, cloud and cloud shadow mask for producing clear-sky composites at 250-meter spatial resolution for the seven MODIS land bands over Canada and North America + URL: https://doi.org/10.1016/j.rse.2008.06.010 + AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. + - Title: Surface bidirectional reflectance and albedo properties derived by a land cover based approach from the MODIS observations. + URL: https://doi.org/10.1029/2004JD004741 + AuthorName: Luo, Y., Trishchenko, Alexander P., Latifovic, R., Li, Z. + - Title: An approach for developing surface albedo product from seven MODIS land bands at 250m spatial resolution over Canada and the Arctic circumpolar region + URL: https://lpvs.gsfc.nasa.gov/LPV_meetings/Beijing09/Luo_MODIS_Albedo_Product.pdf + AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. + - Title: A raster version of the circumpolar Arctic vegetation map (CAVM) + URL: https://doi.org/10.1016/J.RSE.2019.111297 + AuthorName: Raynolds, M.K., Walker, D.A., Balser, A., Bay, C., Campbell, M., Cherosov, M.M., Daniëls, F.J.A., Eidesen, P.B., Ermokhina, K.A., Frost, G.V., Jedrzejek, B., Jorgenson, M.T., Kennedy, B.E., Kholod, S.S., Lavrinenko, I.A., Lavrinenko, O.V., Magnússon, B., Matveyeva, N.V., Metúsalemsson, S., Nilsen, L., Olthof, I., Pospelov, I.N., Pospelova, E.B., Pouliot, D., Razzhivin, V., Schaepman-Strub, G., Šibík, J., Telyatnikov, M.Y., Troeva, E.
+ - Title: Cumulative changes in minimum snow/ice extent over Canada and Northern USA for 2000-2023 + URL: https://doi.org/10.1080/07038992.2024.2371359 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Annual minimum snow/ice extent variations over Greenland since 2000:ice sheet, peripheral areas, and relation to ice mass balance + URL: https://doi.org/10.1175/BAMS-D-22-0244.1 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Landfast ice properties over the Beaufort Sea region in 2000-2019 from MODIS and Canadian Ice Service data + URL: https://doi.org/10.1139/cjes-2021-0011 + AuthorName: Trishchenko, A.P., Kostylev, V.E., Luo, Y., Ungureanu, C., Whalen, D., Li, J. + - Title: Landfast ice mapping using MODIS clear-sky composites:application for the Banks Island coastline in Beaufort Sea and comparison with Canadian Ice Service data + URL: https://doi.org/10.1080/07038992.2021.1909466 + AuthorName: Trishchenko, A.P., Luo, Y. + - Title: Minimum snow/ice extent over the Northern circumpolar landmass in 2000-19:how much snow survives the summer melt? + URL: https://doi.org/10.1175/BAMS-D-20-0177.1 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Variations of annual minimum snow and ice extent over Canada and neighbouring landmass derived from MODIS 250-m imagery for 2000-2014 + URL: https://doi.org/10.1080/07038992.2016.1166043 + AuthorName: Trishchenko, A.P., Leblanc, S.G., Wang, S., Li, J., Ungureanu, C., Luo, Y., Khlopenkov, K.V., Fontana, F., 2016 + - Title: A method for downscaling MODIS land channels to 250-m spatial resolution using adaptive regression and normalization + URL: https://doi.org/10.1117/12.689157 + AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V. + - Title: Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1-B7 + URL: https://doi.org/10.1080/01431160802348119 + AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V., Park, W.M., Wang, S. 
+ - Title: Clear-Sky Composites over Canada from Visible Infrared Imaging Radiometer Suite:Continuing MODIS Time Series into the Future + URL: https://doi.org/10.1080/07038992.2019.1601006 + AuthorName: Trishchenko, A.P. + - Title: MODIS Surface Albedo and Surface Reflectance Dataset. Format Description. + URL: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo/ + AuthorName: Trishchenko, Alexander P., Ungureanu, Calin + - Title: Warm season snow/ice probability maps from modis and viirs sensors over Canada + URL: https://doi.org/10.1109/IGARSS.2018.8519558 + AuthorName: Trishchenko, Alexander P., Ungureanu, Calin + - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada + URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf + AuthorName: Trishchenko, Alexander P. + \ No newline at end of file From 812cfc06c199d1e7afea6ea61f061494c63f5d4a Mon Sep 17 00:00:00 2001 From: Heng Li Date: Sat, 13 Dec 2025 19:45:59 -0500 Subject: [PATCH 684/751] Add Open Human Genome Library (OpenHGL) The AWS Open Data Sponsorship Program has approved my application. Following instructions in the Onboarding Handbook for Data Providers, I am submitting a pull request to record my dataset. --- datasets/openhgl.yaml | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) create mode 100644 datasets/openhgl.yaml diff --git a/datasets/openhgl.yaml b/datasets/openhgl.yaml new file mode 100644 index 000000000..d5177bb12 --- /dev/null +++ b/datasets/openhgl.yaml @@ -0,0 +1,37 @@ +Name: Open Human Genome Library +Description: > + The Open Human Genome Library (OpenHGL) is a collection of high-quality + *de novo* human assemblies that are publicly available in genomic databases + (e.g. NCBI and CNCB) or from individual research papers. It provides + consistent naming and uniform formats across datasets, supporting efficient + subsequence retrieval and approximate string search. 
+Documentation: https://lh3.github.io/OpenHGL/ +Contact: https://github.com/lh3/OpenHGL/issues +ManagedBy: Heng Li lab at Dana-Farber Cancer Institute and Harvard Medical School +UpdateFrequency: As new data or new analysis becomes available +Tags: + - aws-pds + - bioinformatics + - genomic + - biology + - life sciences +License: Creative Commons Zero (CC0) +Resources: + - Description: > + This bucket contains genomic sequences in the AGC format and the + corresponding FM-index in the ropebwt3 format. + ARN: arn:aws:s3:::openhgl + Region: us-east-1 + Type: S3 Bucket +DataAtWork: + Tutorials: + - Title: Using OpenHGL data + URL: https://lh3.github.io/OpenHGL/ + AuthorName: Heng Li + Publications: + - Title: "AGC: compact representation of assembled genomes with fast queries and updates" + URL: https://pubmed.ncbi.nlm.nih.gov/36864624/ + AuthorName: Sebastian Deorowicz, Agnieszka Danek, Heng Li + - Title: BWT construction and search at the terabase scale + URL: https://doi.org/10.1093/bioinformatics/btae717 + AuthorName: Heng Li From e4557d89b3014f89220a95d741bd53c7b8505beb Mon Sep 17 00:00:00 2001 From: Adam Tyson Date: Mon, 15 Dec 2025 12:53:30 +0000 Subject: [PATCH 685/751] Add SNS topic --- datasets/brainglobe.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/brainglobe.yaml b/datasets/brainglobe.yaml index 33c718482..1ba88b20d 100644 --- a/datasets/brainglobe.yaml +++ b/datasets/brainglobe.yaml @@ -30,6 +30,10 @@ Resources: ARN: arn:aws:s3:::brainglobe Region: us-west-2 Type: S3 Bucket + - Description: Notifications for new BrainGlobe data + ARN: arn:aws:sns:us-west-2:865567910455:brainglobe-object_created + Region: us-west-2 + Type: SNS Topic DataAtWork: Tutorials: - Title: Interacting with cloud atlas data through Python From aa52d30bd8a2445da56d204432475e2fcf579f00 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 15 Dec 2025 09:59:26 -0500 Subject: [PATCH 686/751] ok: Update brainglobe.yaml --- datasets/brainglobe.yaml | 2 ++ 1 file 
changed, 2 insertions(+) diff --git a/datasets/brainglobe.yaml b/datasets/brainglobe.yaml index 1ba88b20d..0233c64af 100644 --- a/datasets/brainglobe.yaml +++ b/datasets/brainglobe.yaml @@ -45,3 +45,5 @@ DataAtWork: URL: https://doi.org/10.21105/joss.02668 AuthorName: Federico Claudi, Luigi Petrucco, Adam L. Tyson et al. AuthorURL: https://brainglobe.info/ + ADXCategories: + - Healthcare & Life Sciences Data From 5b43d512049f1a91ced8a4fe749c3180d5c52d58 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 15 Dec 2025 10:02:46 -0500 Subject: [PATCH 687/751] ok: Update brainglobe.yaml --- datasets/brainglobe.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/brainglobe.yaml b/datasets/brainglobe.yaml index 0233c64af..bc712d254 100644 --- a/datasets/brainglobe.yaml +++ b/datasets/brainglobe.yaml @@ -45,5 +45,5 @@ DataAtWork: URL: https://doi.org/10.21105/joss.02668 AuthorName: Federico Claudi, Luigi Petrucco, Adam L. Tyson et al. AuthorURL: https://brainglobe.info/ - ADXCategories: +ADXCategories: - Healthcare & Life Sciences Data From 3cfd830859e6d3eacab3e20a25842d488382da72 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 15 Dec 2025 11:35:38 -0500 Subject: [PATCH 688/751] ok: Update uniprot.yaml https://github.com/awslabs/open-data-registry/pull/2981 --- datasets/uniprot.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/uniprot.yaml b/datasets/uniprot.yaml index a02c3421c..cd24f0a9c 100644 --- a/datasets/uniprot.yaml +++ b/datasets/uniprot.yaml @@ -18,6 +18,10 @@ Tags: - SPARQL License: http://creativecommons.org/licenses/by/4.0/ Resources: + - Description: UniProt 2025_04 + ARN: arn:aws:s3:::aws-open-data-uniprot-rdf/2025-04/ + Region: eu-west-3 + Type: S3 Bucket - Description: UniProt 2025_03 ARN: arn:aws:s3:::aws-open-data-uniprot-rdf/2025-03/ Region: eu-west-3 From 2733cd20724b10c150d9b414384ee95bfac25e4b Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 
Dec 2025 10:16:34 -0900 Subject: [PATCH 689/751] ok: Update CCRSMODISAlbedo.yaml --- CCRSMODISAlbedo.yml | 1 - 1 file changed, 1 deletion(-) diff --git a/CCRSMODISAlbedo.yml b/CCRSMODISAlbedo.yml index adb682709..9075e760d 100644 --- a/CCRSMODISAlbedo.yml +++ b/CCRSMODISAlbedo.yml @@ -96,4 +96,3 @@ DataAtWork: - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf AuthorName: Trishchenko, Alexander P. - \ No newline at end of file From 70f485f2a1cfed80e6378fce9b9cbff8b6535505 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Dec 2025 10:18:08 -0900 Subject: [PATCH 690/751] ok: Update CCRSMODISAlbedo.yml --- datasets/CCRSMODISAlbedo.yml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/CCRSMODISAlbedo.yml b/datasets/CCRSMODISAlbedo.yml index 62951a961..ace7ddade 100644 --- a/datasets/CCRSMODISAlbedo.yml +++ b/datasets/CCRSMODISAlbedo.yml @@ -100,4 +100,3 @@ DataAtWork: - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf AuthorName: Trishchenko, Alexander P. 
- \ No newline at end of file From d77e12f72a74f7953ba90813a9316d907ccf9a18 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 15 Dec 2025 10:18:53 -0900 Subject: [PATCH 691/751] ok: Update CCRSMODISAlbedo.yml --- datasets/CCRSMODISAlbedo.yml | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/datasets/CCRSMODISAlbedo.yml b/datasets/CCRSMODISAlbedo.yml index ace7ddade..c6e278de8 100644 --- a/datasets/CCRSMODISAlbedo.yml +++ b/datasets/CCRSMODISAlbedo.yml @@ -1,11 +1,13 @@ Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS -Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). +Description: | + Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. 
Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo Contact: alexander.trichtchenko@nrcan-rncan.gc.ca ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation -UpdateFrequency: Semi-annually, until the end of MODIS operations - Deux fois par an, jusqu'à la fin des opérations MODIS +UpdateFrequency: | + Semi-annually, until the end of MODIS operations + Deux fois par an, jusqu'à la fin des opérations MODIS Tags: - aws-pds - analysis ready data @@ -100,3 +102,4 @@ DataAtWork: - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf AuthorName: Trishchenko, Alexander P. 
+ From e96b3fe3e4ececdbc06c3270e9193c88a01f513d Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Tue, 16 Dec 2025 10:11:53 -0500 Subject: [PATCH 692/751] Delete CCRSMODISAlbedo.yml duplicated, one is already in the folder of datasets --- CCRSMODISAlbedo.yml | 98 --------------------------------------------- 1 file changed, 98 deletions(-) delete mode 100644 CCRSMODISAlbedo.yml diff --git a/CCRSMODISAlbedo.yml b/CCRSMODISAlbedo.yml deleted file mode 100644 index 9075e760d..000000000 --- a/CCRSMODISAlbedo.yml +++ /dev/null @@ -1,98 +0,0 @@ -Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS -Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). - Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). 
-Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo -Contact: alexander.trichtchenko@nrcan-rncan.gc.ca -ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation -UpdateFrequency: Semi-annually, until the end of MODIS operations - Deux fois par an, jusqu'à la fin des opérations MODIS -Tags: - - aws-pds - - analysis ready data - - broadband - - Canada - - COG - - earth observation - - satellite imagery -License: Creative Commons Licence. Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ -Citation: Trishchenko, Alexander P. 2025. CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals. -Resources: - - Description: Cloud Optimized GeoTIFF (COG) images - ARN: arn:aws:s3::: ccrs-modis-albedo - Region: ca-central-1 - Type: S3 Bucket -DataAtWork: - Tutorials: - - Title: Get To Know A Dataset - MCCRS MODIS Albedo at 250-m resolution and 10-day intervals - URL: https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb - Services: - AuthorName: Alexander Trichtchenko - AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko - NotebookURL (Optional): https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb - Publications: - - Title: Boreal lichen woodlands: a possible negative feedback to climate change in eastern North America - URL: https://doi.org/10.1016/j.agrformet.2010.12.013 - AuthorName: Bernier, P.Y., Desjardins, R.L., Karimi-Zindashty, Y., Worth, D., Beaudoin, A., Luo, Y., Wang, S. 
- - Title: Detection of North American land cover change between 2005 and 2010 with 250m MODIS data - URL: https://www.researchgate.net/publication/286156544_Detection_of_North_American_land_cover_change_between_2005_and_2010_with_250m_MODIS_Data - AuthorName: Colditz, R.R., Pouliot, D., Llamas, R.M., Homer, C., Latifovic, R., Ressl, R.A., Tovar, C.M., Hernández, A.V., Richardson, K. - - Title: Annual mapping of large Forest disturbances across Canada's forests using 250 m MODIS imagery from 2000 to 2011 - URL: https://doi.org/10.1139/cjfr-2014-0229 - AuthorName: Guindon, L., Bernier, P.Y., Beaudoin, A., Pouliot, D., Villemaire, P., Hall, R.J., Latifovic, R., St-Amant, R. - - Title: Perennial snow and ice variations (2000-2008) in the Arctic circumpolar land area from satellite observations - URL: https://doi.org/10.1029/2010JF001664 - AuthorName: Fontana F.M.A., Trishchenko A.P., Luo Y., Khlopenkov K.V., Nussbaumer S.U., Wunderle S. - - Title: Influence of two management practices in the Canadian Prairies on radiative forcing - URL: https://doi.org/10.1016/j.scitotenv.2020.142701 - AuthorName: Liu, J., Worth, D.E., Desjardins, R.L., Haak, D., McConkey, B., Cerkowniak, D. - - Title: Implementation and Evaluation of Concurrent Gradient Search Method for Reprojection of MODIS Level 1B Imagery - URL: https://doi.org/10.1109/TGRS.2008.916633 - AuthorName: Khlopenkov, K.V., and Trishchenko, A.P. - - Title: Developing clear-sky, cloud and cloud shadow mask for producing clear-sky composites at 250-meter spatial resolution for the seven MODIS land bands over Canada and North America - URL: https://doi.org/10.1016/j.rse.2008.06.010 - AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. - - Title: Surface bidirectional reflectance and albedo properties derived by a land cover based approach from the MODIS observations. - URL: https://doi.org/10.1029/2004JD004741 - AuthorName: Luo, Y., Trishchenko, Alexander P., Latifovic, R., Li, Z.
- - Title: An approach for developing surface albedo product from seven MODIS land bands at 250m spatial resolution over Canada and the Arctic circumpolar region - URL: https://lpvs.gsfc.nasa.gov/LPV_meetings/Beijing09/Luo_MODIS_Albedo_Product.pdf - AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. - - Title: A raster version of the circumpolar Arctic vegetation map (CAVM) - URL: https://doi.org/10.1016/J.RSE.2019.111297 - AuthorName: Raynolds, M.K., Walker, D.A., Balser, A., Bay, C., Campbell, M., Cherosov, M.M., Daniëls, F.J.A., Eidesen, P.B., Ermokhina, K.A., Frost, G.V., Jedrzejek, B., Jorgenson, M.T., Kennedy, B.E., Kholod, S.S., Lavrinenko, I.A., Lavrinenko, O.V., Magnússon, B., Matveyeva, N.V., Metúsalemsson, S., Nilsen, L., Olthof, I., Pospelov, I.N., Pospelova, E.B., Pouliot, D., Razzhivin, V., Schaepman-Strub, G., Šibík, J., Telyatnikov, M.Y., Troeva, E. - - Title: Cumulative changes in minimum snow/ice extent over Canada and Northern USA for 2000-2023 - URL: https://doi.org/10.1080/07038992.2024.2371359 - AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Annual minimum snow/ice extent variations over Greenland since 2000:ice sheet, peripheral areas, and relation to ice mass balance - URL: https://doi.org/10.1175/BAMS-D-22-0244.1 - AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Landfast ice properties over the Beaufort Sea region in 2000-2019 from MODIS and Canadian Ice Service data - URL: https://doi.org/10.1139/cjes-2021-0011 - AuthorName: Trishchenko, A.P., Kostylev, V.E., Luo, Y., Ungureanu, C., Whalen, D., Li, J. - - Title: Landfast ice mapping using MODIS clear-sky composites:application for the Banks Island coastline in Beaufort Sea and comparison with Canadian Ice Service data - URL: https://doi.org/10.1080/07038992.2021.1909466 - AuthorName: Trishchenko, A.P., Luo, Y. - - Title: Minimum snow/ice extent over the Northern circumpolar landmass in 2000-19:how much snow survives the summer melt?
- URL: https://doi.org/10.1175/BAMS-D-20-0177.1 - AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Variations of annual minimum snow and ice extent over Canada and neighbouring landmass derived from MODIS 250-m imagery for 2000-2014 - URL: https://doi.org/10.1080/07038992.2016.1166043 - AuthorName: Trishchenko, A.P., Leblanc, S.G., Wang, S., Li, J., Ungureanu, C., Luo, Y., Khlopenkov, K.V., Fontana, F., 2016 - - Title: A method for downscaling MODIS land channels to 250-m spatial resolution using adaptive regression and normalization - URL: https://doi.org/10.1117/12.689157 - AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V. - - Title: Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1-B7 - URL: https://doi.org/10.1080/01431160802348119 - AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V., Park, W.M., Wang, S. - - Title: Clear-Sky Composites over Canada from Visible Infrared Imaging Radiometer Suite:Continuing MODIS Time Series into the Future - URL: https://doi.org/10.1080/07038992.2019.1601006 - AuthorName: Trishchenko, A.P. - - Title: MODIS Surface Albedo and Surface Reflectance Dataset. Format Description. - URL: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo/ - AuthorName: Trishchenko, Alexander P., Ungureanu, Calin - - Title: Warm season snow/ice probability maps from modis and viirs sensors over Canada - URL: https://doi.org/10.1109/IGARSS.2018.8519558 - AuthorName: Trishchenko, Alexander P., Ungureanu, Calin - - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada - URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf - AuthorName: Trishchenko, Alexander P. 
From 9b866ea66d48ede535825153ff4cf733ac80a970 Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Tue, 16 Dec 2025 10:13:25 -0500 Subject: [PATCH 693/751] Add files via upload add the resources of SNS topic and S3 bucket --- datasets/CCRSMODISAlbedo.yml | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/datasets/CCRSMODISAlbedo.yml b/datasets/CCRSMODISAlbedo.yml index c6e278de8..00216a92a 100644 --- a/datasets/CCRSMODISAlbedo.yml +++ b/datasets/CCRSMODISAlbedo.yml @@ -1,13 +1,11 @@ Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS -Description: | - Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). +Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. 
Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo Contact: alexander.trichtchenko@nrcan-rncan.gc.ca ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation -UpdateFrequency: | - Semi-annually, until the end of MODIS operations - Deux fois par an, jusqu'à la fin des opérations MODIS +UpdateFrequency: Semi-annually, until the end of MODIS operations + Deux fois par an, jusqu'à la fin des opérations MODIS Tags: - aws-pds - analysis ready data @@ -19,22 +17,22 @@ Tags: License: Creative Commons Licence. Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ Citation: Trishchenko, Alexander P. 2025. CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals. 
Resources: - - Description: - ARN: - Region: - Type: - RequesterPays (Optional): - AccountRequired (Optional): - ControlledAccess (Optional): - Explore (Optional): + - Description: CCRS MODIS Albedo, Cloud Optimized GeoTIFF (COG) images + ARN: arn:aws:s3:::ccrs-modis-albedo + Region: ca-central-1 + Type: S3 Bucket + - Description: Notifications for new CCRS MODIS Albedo data + ARN: arn:aws:sns:ca-central-1:675987781521:ccrs-modis-albedo-object_created + Region: ca-central-1 + Type: SNS Topic DataAtWork: Tutorials: - - Title: Get To Know A Dataset - MCCRS MODIS Albedo at 250-m resolution and 10-day intervals - URL: https://github.com/****/get-to-know-a-dataset-MYDATASET.ipynb + - Title: Get To Know A Dataset - CCRS MODIS Albedo at 250-m resolution and 10-day intervals + URL: https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb Services: AuthorName: Alexander Trichtchenko AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko - NotebookURL (Optional): https://github.com/****/get-to-know-a-dataset-MYDATASET.ipynb + NotebookURL (Optional): https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb Publications: - Title: Boreal lichen woodlands: a possible negative feedback to climate change in eastern North America URL: https://doi.org/10.1016/j.agrformet.2010.12.013 @@ -102,4 +100,4 @@ DataAtWork: - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf AuthorName: Trishchenko, Alexander P. 
- + \ No newline at end of file From ff24287020b890898ae756135d20a7dabf7d2d33 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Dec 2025 06:43:20 -0900 Subject: [PATCH 694/751] ok: Update CCRSMODISAlbedo.yml --- datasets/CCRSMODISAlbedo.yml | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/datasets/CCRSMODISAlbedo.yml b/datasets/CCRSMODISAlbedo.yml index 00216a92a..03e378133 100644 --- a/datasets/CCRSMODISAlbedo.yml +++ b/datasets/CCRSMODISAlbedo.yml @@ -1,11 +1,13 @@ Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS -Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). +Description: | + Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. 
Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo Contact: alexander.trichtchenko@nrcan-rncan.gc.ca ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation -UpdateFrequency: Semi-annually, until the end of MODIS operations - Deux fois par an, jusqu'à la fin des opérations MODIS +UpdateFrequency: | + Semi-annually, until the end of MODIS operations + Deux fois par an, jusqu'à la fin des opérations MODIS Tags: - aws-pds - analysis ready data @@ -100,4 +102,3 @@ DataAtWork: - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf AuthorName: Trishchenko, Alexander P. 
- \ No newline at end of file From 1c0dc681891a6d1e9ef55c6640a365f80667a5e6 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Dec 2025 06:46:00 -0900 Subject: [PATCH 695/751] ok: Update and rename CCRSMODISAlbedo.yml to CCRSMODISAlbedo.yaml --- datasets/{CCRSMODISAlbedo.yml => CCRSMODISAlbedo.yaml} | 1 + 1 file changed, 1 insertion(+) rename datasets/{CCRSMODISAlbedo.yml => CCRSMODISAlbedo.yaml} (98%) diff --git a/datasets/CCRSMODISAlbedo.yml b/datasets/CCRSMODISAlbedo.yaml similarity index 98% rename from datasets/CCRSMODISAlbedo.yml rename to datasets/CCRSMODISAlbedo.yaml index 03e378133..b1f94fd69 100644 --- a/datasets/CCRSMODISAlbedo.yml +++ b/datasets/CCRSMODISAlbedo.yaml @@ -102,3 +102,4 @@ DataAtWork: - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf AuthorName: Trishchenko, Alexander P. + From 0a3154d670d6c058ac30619d7fd47b5841f141dc Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Dec 2025 06:49:11 -0900 Subject: [PATCH 696/751] ok: Update and rename CCRSMODISAlbedo.yaml to ccrsmodisalbedo.yaml must be lowercase --- datasets/{CCRSMODISAlbedo.yaml => ccrsmodisalbedo.yaml} | 1 + 1 file changed, 1 insertion(+) rename datasets/{CCRSMODISAlbedo.yaml => ccrsmodisalbedo.yaml} (98%) diff --git a/datasets/CCRSMODISAlbedo.yaml b/datasets/ccrsmodisalbedo.yaml similarity index 98% rename from datasets/CCRSMODISAlbedo.yaml rename to datasets/ccrsmodisalbedo.yaml index b1f94fd69..4f43a1ec1 100644 --- a/datasets/CCRSMODISAlbedo.yaml +++ b/datasets/ccrsmodisalbedo.yaml @@ -103,3 +103,4 @@ DataAtWork: URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf AuthorName: Trishchenko, Alexander P. 
+ From 9da031943b0c51fa6450263a6b1675fbc222306c Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Tue, 16 Dec 2025 06:57:26 -0900 Subject: [PATCH 697/751] ok: Update ccrsmodisalbedo.yaml --- datasets/ccrsmodisalbedo.yaml | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/datasets/ccrsmodisalbedo.yaml b/datasets/ccrsmodisalbedo.yaml index 4f43a1ec1..97b16a337 100644 --- a/datasets/ccrsmodisalbedo.yaml +++ b/datasets/ccrsmodisalbedo.yaml @@ -36,13 +36,13 @@ DataAtWork: AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko NotebookURL (Optional): https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb Publications: - - Title: Boreal lichen woodlands: a possible negative feedback to climate change in eastern North America + - Title: "Boreal lichen woodlands: a possible negative feedback to climate change in eastern North America" URL: https://doi.org/10.1016/j.agrformet.2010.12.013 AuthorName: Bernier, P.Y., Desjardins, R.L., Karimi-Zindashty, Y., Worth, D., Beaudoin, A., Luo, Y., Wang, S. - Title: Detection of North American land cover change between 2005 and 2010 with 250m MODIS data URL: https://www.researchgate.net/publication/286156544_Detection_of_North_American_land_cover_change_between_2005_and_2010_with_250m_MODIS_Data AuthorName: Colditz, R.R., Pouliot, D., Llamas, R.M., Homer, C., Latifovic, R., Ressl, R.A., Tovar, C.M., Hernández, A.V., Richardson, K. - - Title: Annual mapping of large Forest disturbances across Canada's forests using 250 m MODIS imagery from 2000 to 2011 + - Title: "Annual mapping of large Forest disturbances across Canada's forests using 250 m MODIS imagery from 2000 to 2011" URL: https://doi.org/10.1139/cjfr-2014-0229 AuthorName: Guindon, L., Bernier, P.Y., Beaudoin, A., Pouliot, D., Villemaire, P., Hall, R.J., Latifovic, R., St-Amant, R.
- Title: Perennial snow and ice variations (2000-2008) in the Arctic circumpolar land area from satellite observations @@ -54,7 +54,7 @@ DataAtWork: - Title: Implementation and Evaluation of Concurrent Gradient Search Method for Reprojection of MODIS Level 1B Imagery URL: https://doi.org/10.1109/TGRS.2008.916633 AuthorName: Khlopenkov, K.V., and Trishchenko, A.P. - - Title: Developing clear-sky, cloud and cloud shadow mask for producing clear-sky composites at 250-meter spatial resolution for the seven MODIS land bands over Canada and North America + - Title: "Developing clear-sky, cloud and cloud shadow mask for producing clear-sky composites at 250-meter spatial resolution for the seven MODIS land bands over Canada and North America" URL: https://doi.org/10.1016/j.rse.2008.06.010 AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. - Title: Surface bidirectional reflectance and albedo properties derived by a land cover based approach from the MODIS observations. @@ -63,13 +63,13 @@ DataAtWork: - Title: An approach for developing surface albedo product from seven MODIS land bands at 250m spatial resolution over Canada and the Arctic circumpolar region URL: https://lpvs.gsfc.nasa.gov/LPV_meetings/Beijing09/Luo_MODIS_Albedo_Product.pdf AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. - - Title: A raster version of the circumpolar Arctic vegetation map (CAVM) + - Title: "A raster version of the circumpolar Arctic vegetation map (CAVM)" URL: https://doi.org/10.1016/J.RSE.2019.111297 AuthorName: Raynolds, M.K., Walker, D.A., Balser, A., Bay, C., Campbell, M., Cherosov, M.M., Daniëls, F.J.A., Eidesen, P.B., Ermokhina, K.A., Frost, G.V., Jedrzejek, B., Jorgenson, M.T., Kennedy, B.E., Kholod, S.S., Lavrinenko, I.A., Lavrinenko, O.V., Magnússon, B., Matveyeva, N.V., Metúsalemsson, S., Nilsen, L., Olthof, I., Pospelov, I.N., Pospelova, E.B., Pouliot, D., Razzhivin, V., Schaepman-Strub, G., Šibík, J., Telyatnikov, M.Y., Troeva, E.
- Title: Cumulative changes in minimum snow/ice extent over Canada and Northern USA for 2000-2023 URL: https://doi.org/10.1080/07038992.2024.2371359 AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Annual minimum snow/ice extent variations over Greenland since 2000:ice sheet, peripheral areas, and relation to ice mass balance + - Title: "Annual minimum snow/ice extent variations over Greenland since 2000:ice sheet, peripheral areas, and relation to ice mass balance" URL: https://doi.org/10.1175/BAMS-D-22-0244.1 AuthorName: Trishchenko, A.P., Ungureanu, C. - Title: Landfast ice properties over the Beaufort Sea region in 2000-2019 from MODIS and Canadian Ice Service data @@ -78,29 +78,30 @@ DataAtWork: - Title: Landfast ice mapping using MODIS clear-sky composites:application for the Banks Island coastline in Beaufort Sea and comparison with Canadian Ice Service data URL: https://doi.org/10.1080/07038992.2021.1909466 AuthorName: Trishchenko, A.P., Luo, Y. - - Title: Minimum snow/ice extent over the Northern circumpolar landmass in 2000-19:how much snow survives the summer melt? + - Title: "Minimum snow/ice extent over the Northern circumpolar landmass in 2000-19:how much snow survives the summer melt?" URL: https://doi.org/10.1175/BAMS-D-20-0177.1 AuthorName: Trishchenko, A.P., Ungureanu, C. - Title: Variations of annual minimum snow and ice extent over Canada and neighbouring landmass derived from MODIS 250-m imagery for 2000-2014 URL: https://doi.org/10.1080/07038992.2016.1166043 AuthorName: Trishchenko, A.P., Leblanc, S.G., Wang, S., Li, J., Ungureanu, C., Luo, Y., Khlopenkov, K.V., Fontana, F., 2016 - - Title: A method for downscaling MODIS land channels to 250-m spatial resolution using adaptive regression and normalization + - Title: "A method for downscaling MODIS land channels to 250-m spatial resolution using adaptive regression and normalization" URL: https://doi.org/10.1117/12.689157 AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V. 
- - Title: Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1-B7 + - Title: "Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1-B7" URL: https://doi.org/10.1080/01431160802348119 AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V., Park, W.M., Wang, S. - Title: Clear-Sky Composites over Canada from Visible Infrared Imaging Radiometer Suite:Continuing MODIS Time Series into the Future URL: https://doi.org/10.1080/07038992.2019.1601006 AuthorName: Trishchenko, A.P. - - Title: MODIS Surface Albedo and Surface Reflectance Dataset. Format Description. + - Title: "MODIS Surface Albedo and Surface Reflectance Dataset. Format Description." URL: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo/ AuthorName: Trishchenko, Alexander P., Ungureanu, Calin - - Title: Warm season snow/ice probability maps from modis and viirs sensors over Canada + - Title: "Warm season snow/ice probability maps from modis and viirs sensors over Canada" URL: https://doi.org/10.1109/IGARSS.2018.8519558 AuthorName: Trishchenko, Alexander P., Ungureanu, Calin - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada - URL:https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf + URL: https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf AuthorName: Trishchenko, Alexander P. 
+ From 55f1124329406a785b6471518f3181733892acd3 Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Tue, 16 Dec 2025 11:40:42 -0500 Subject: [PATCH 698/751] Add files via upload update tags based on request --- datasets/CCRSMODISAlbedo.yml | 102 +++++++++++++++++++++++++++++++++++ 1 file changed, 102 insertions(+) create mode 100644 datasets/CCRSMODISAlbedo.yml diff --git a/datasets/CCRSMODISAlbedo.yml b/datasets/CCRSMODISAlbedo.yml new file mode 100644 index 000000000..bed7a997b --- /dev/null +++ b/datasets/CCRSMODISAlbedo.yml @@ -0,0 +1,102 @@ +Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS +Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). + Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). 
+Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo +Contact: alexander.trichtchenko@nrcan-rncan.gc.ca +ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation +UpdateFrequency: Semi-annually, until the end of MODIS operations + Deux fois par an, jusqu'à la fin des opérations MODIS +Tags: + - aws-pds + - analysis ready data + - broadband + - cog + - earth observation + - satellite imagery +License: Creative Commons Licence. Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ +Citation: Trishchenko, Alexander P. 2025. CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals. +Resources: + - Description: CCRS MODIS Albedo, Cloud Optimized GeoTIFF (COG) images + ARN: arn:aws:s3:::ccrs-modis-albedo + Region: ca-central-1 + Type: S3 Bucket + - Description: Notifications for new CCRS MODIS Albedo data + ARN: arn:aws:sns:ca-central-1:675987781521:ccrs-modis-albedo-object_created + Region: ca-central-1 + Type: SNS Topic +DataAtWork: + Tutorials: + - Title: Get To Know A Dataset - CCRS MODIS Albedo at 250-m resolution and 10-day intervals + URL: https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb + Services: + AuthorName: Alexander Trichtchenko + AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko + NotebookURL (Optional): https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb + Publications: + - Title: Boreal lichen woodlands:a possible negative feedback to climate change in eastern 
North America + URL: https://doi.org/10.1016/j.agrformet.2010.12.013 + AuthorName: Bernier, P.Y., Desjardins, R.L., Karimi-Zindashty, Y., Worth, D., Beaudoin, A., Luo, Y., Wang, S. + - Title: Detection of North American land cover change between 2005 and 2010 with 250m MODIS data + URL: https://www.researchgate.net/publication/286156544_Detection_of_North_American_land_cover_change_between_2005_and_2010_with_250m_MODIS_Data + AuthorName: Colditz, R.R., Pouliot, D., Llamas, R.M., Homer, C., Latifovic, R., Ressl, R.A., Tovar, C.M., Hernández, A.V., Richardson, K. + - Title: Annual mapping of large Forest disturbances across Canada's forests using 250 m MODIS imagery from 2000 to 2011 + URL: https://doi.org/10.1139/cjfr-2014-0229 + AuthorName: Guindon, L., Bernier, P.Y., Beaudoin, A., Pouliot, D., Villemaire, P., Hall, R.J., Latifovic, R., St-Amant, R. + - Title: Perennial snow and ice variations (2000-2008) in the Arctic circumpolar land area from satellite observations + URL: https://doi.org/10.1029/2010JF001664 + AuthorName: Fontana F.M.A., Trishchenko A.P., Luo Y., Khlopenkov K.V., Nussbaumer S.U., Wunderle S. + - Title: Influence of two management practices in the Canadian Prairies on radiative forcing + URL: https://doi.org/10.1016/j.scitotenv.2020.142701 + AuthorName: Liu, J., Worth, D.E., Desjardins, R.L., Haak, D., McConkey, B., Cerkowniak, D. + - Title: Implementation and Evaluation of Concurrent Gradient Search Method for Reprojection of MODIS Level 1B Imagery + URL: https://doi.org/10.1109/TGRS.2008.916633 + AuthorName: Khlopenkov, K.V., and Trishchenko, A.P. + - Title: Developing clear-sky, cloud and cloud shadow mask for producing clear-sky composites at 250-meter spatial resolution for the seven MODIS land bands over Canada and North America + URL: https://doi.org/10.1016/j.rse.2008.06.010 + AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V.
+ - Title: Surface bidirectional reflectance and albedo properties derived by a land cover based approach from the MODIS observations. + URL: https://doi.org/10.1029/2004JD004741 + AuthorName: Luo, Y., Trishchenko, Alexander P., Latifovic, R., Li, Z. + - Title: An approach for developing surface albedo product from seven MODIS land bands at 250m spatial resolution over Canada and the Arctic circumpolar region + URL: https://lpvs.gsfc.nasa.gov/LPV_meetings/Beijing09/Luo_MODIS_Albedo_Product.pdf + AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. + - Title: A raster version of the circumpolar Arctic vegetation map (CAVM) + URL: https://doi.org/10.1016/J.RSE.2019.111297 + AuthorName: Raynolds, M.K., Walker, D.A., Balser, A., Bay, C., Campbell, M., Cherosov, M.M., Daniëls, F.J.A., Eidesen, P.B., Ermokhina, K.A., Frost, G.V., Jedrzejek, B., Jorgenson, M.T., Kennedy, B.E., Kholod, S.S., Lavrinenko, I.A., Lavrinenko, O.V., Magnússon, B., Matveyeva, N.V., Metúsalemsson, S., Nilsen, L., Olthof, I., Pospelov, I.N., Pospelova, E.B., Pouliot, D., Razzhivin, V., Schaepman-Strub, G., Šibík, J., Telyatnikov, M.Y., Troeva, E. + - Title: Cumulative changes in minimum snow/ice extent over Canada and Northern USA for 2000-2023 + URL: https://doi.org/10.1080/07038992.2024.2371359 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Annual minimum snow/ice extent variations over Greenland since 2000:ice sheet, peripheral areas, and relation to ice mass balance + URL: https://doi.org/10.1175/BAMS-D-22-0244.1 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Landfast ice properties over the Beaufort Sea region in 2000-2019 from MODIS and Canadian Ice Service data + URL: https://doi.org/10.1139/cjes-2021-0011 + AuthorName: Trishchenko, A.P., Kostylev, V.E., Luo, Y., Ungureanu, C., Whalen, D., Li, J.
+ - Title: Landfast ice mapping using MODIS clear-sky composites:application for the Banks Island coastline in Beaufort Sea and comparison with Canadian Ice Service data + URL: https://doi.org/10.1080/07038992.2021.1909466 + AuthorName: Trishchenko, A.P., Luo, Y. + - Title: Minimum snow/ice extent over the Northern circumpolar landmass in 2000-19:how much snow survives the summer melt? + URL: https://doi.org/10.1175/BAMS-D-20-0177.1 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Variations of annual minimum snow and ice extent over Canada and neighbouring landmass derived from MODIS 250-m imagery for 2000-2014 + URL: https://doi.org/10.1080/07038992.2016.1166043 + AuthorName: Trishchenko, A.P., Leblanc, S.G., Wang, S., Li, J., Ungureanu, C., Luo, Y., Khlopenkov, K.V., Fontana, F., 2016 + - Title: A method for downscaling MODIS land channels to 250-m spatial resolution using adaptive regression and normalization + URL: https://doi.org/10.1117/12.689157 + AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V. + - Title: Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1-B7 + URL: https://doi.org/10.1080/01431160802348119 + AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V., Park, W.M., Wang, S. + - Title: Clear-Sky Composites over Canada from Visible Infrared Imaging Radiometer Suite:Continuing MODIS Time Series into the Future + URL: https://doi.org/10.1080/07038992.2019.1601006 + AuthorName: Trishchenko, A.P. + - Title: MODIS Surface Albedo and Surface Reflectance Dataset. Format Description. 
+ URL: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo/ + AuthorName: Trishchenko, Alexander P., Ungureanu, Calin + - Title: Warm season snow/ice probability maps from modis and viirs sensors over Canada + URL: https://doi.org/10.1109/IGARSS.2018.8519558 + AuthorName: Trishchenko, Alexander P., Ungureanu, Calin + - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada + URL: https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf + AuthorName: Trishchenko, Alexander P. + \ No newline at end of file From 6844febac16bbd1aec72f17f98edb4fd02b0a54f Mon Sep 17 00:00:00 2001 From: Arun George Zachariah Date: Tue, 16 Dec 2025 13:52:37 -0600 Subject: [PATCH 699/751] Updating access URL --- datasets/asl_1000.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/asl_1000.yaml b/datasets/asl_1000.yaml index 9d631a72f..6a77ccb89 100644 --- a/datasets/asl_1000.yaml +++ b/datasets/asl_1000.yaml @@ -19,4 +19,4 @@ Resources: ARN: arn:aws:s3:::trustworthyaiproduct Region: us-east-2 Type: S3 Bucket - ControlledAccess: https://www.nvidia.com/en-us/gated-resources/trustworthy-ai-american-sign-language/ + ControlledAccess: https://www.nvidia.com/en-us/gated-resources/trustworthy-ai-american-sign-language/dataset/ From 25d05bc5f06ac3a41d8760d2574111755b74945d Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Wed, 17 Dec 2025 17:43:38 -0500 Subject: [PATCH 700/751] Delete datasets/CCRSMODISAlbedo.yml need delete this file because file name is not lower case, need reupload. 
--- datasets/CCRSMODISAlbedo.yml | 102 ----------------------------------- 1 file changed, 102 deletions(-) delete mode 100644 datasets/CCRSMODISAlbedo.yml diff --git a/datasets/CCRSMODISAlbedo.yml b/datasets/CCRSMODISAlbedo.yml deleted file mode 100644 index bed7a997b..000000000 --- a/datasets/CCRSMODISAlbedo.yml +++ /dev/null @@ -1,102 +0,0 @@ -Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS -Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). - Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). 
-Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo -Contact: alexander.trichtchenko@nrcan-rncan.gc.ca -ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation -UpdateFrequency: Semi-annually, until the end of MODIS operations - Deux fois par an, jusqu'à la fin des opérations MODIS -Tags: - - aws-pds - - analysis ready data - - broadband - - cog - - earth observation - - satellite imagery -License: Creative Commons Licence. Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ -Citation: Trishchenko, Alexander P. 2025. CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals. -Resources: - - Description: CCRS MODIS Albedo, Cloud Optimized GeoTIFF (COG) images - ARN: arn:aws:s3:::ccrs-modis-albedo - Region: ca-central-1 - Type: S3 Bucket - - Description: Notifications for new CCRS MODIS Albedo data - ARN: arn:aws:sns:ca-central-1:675987781521:ccrs-modis-albedo-object_created - Region: ca-central-1 - Type: SNS Topic -DataAtWork: - Tutorials: - - Title: Get To Know A Dataset - CCRS MODIS Albedo at 250-m resolution and 10-day intervals - URL: https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb - Services: - AuthorName: Alexander Trichtchenko - AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko - NotebookURL (Optional): https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb - Publications: - - Title: Boreal lichen woodlands:a possible negative feedback to climate change in eastern 
North America - URL: https://doi.org/10.1016/j.agrformet.2010.12.013 - AuthorName: Bernier, P.Y., Desjardins, R.L., Karimi-Zindashty, Y., Worth, D., Beaudoin, A., Luo, Y., Wang, S. - - Title: Detection of North American land cover change between 2005 and 2010 with 250m MODIS data - URL: https://www.researchgate.net/publication/286156544_Detection_of_North_American_land_cover_change_between_2005_and_2010_with_250m_MODIS_Data - AuthorName: Colditz, R.R., Pouliot, D., Llamas, R.M., Homer, C., Latifovic, R., Ressl, R.A., Tovar, C.M., Hern�ndez, A.V., Richardson, K. - - Title: Annual mapping of large Forest disturbances across Canada's forests using 250 m MODIS imagery from 2000 to 2011 - URL: https://doi.org/10.1139/cjfr-2014-0229 - AuthorName: Guindon, L., Bernier, P.Y., Beaudoin, A., Pouliot, D., Villemaire, P., Hall, R.J., Latifovic, R., St-Amant, R. - - Title: Perennial snow and ice variations (2000-2008) in the Arctic circumpolar land area from satellite observations - URL: https://doi.org/10.1029/2010JF001664 - AuthorName: Fontana F.M.A., Trishchenko A.P., Luo Y., Khlopenkov K.V., Nussbaumer S.U., Wunderle S. - - Title: Influence of two management practices in the Canadian Prairies on radiative forcing - URL: https://doi.org/10.1016/j.scitotenv.2020.142701 - AuthorName: Liu, J., Worth, D.E., Desjardins, R.L., Haak, D., McConkey, B., Cerkowniak, D. - - Title: Implementation and Evaluation of Concurrent Gradient Search Method for Reprojection of MODIS Level 1B Imagery - URL: https://doi.org/10.1109/TGRS.2008.916633 - AuthorName: Khlopenkov, K.V., and Trishchenko, A.P. - - Title: Developing clear-sky, cloud and cloud shadow mask for producing clear-sky composites at 250-meter spatial resolution for the seven MODIS land bands over Canada and North America - URL: https://doi.org/10.1016/j.rse.2008.06.010 - AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. 
- - Title: Surface bidirectional reflectance and albedo properties derived by a land cover based approach from the MODIS observations. - URL: https://doi.org/10.1029/2004JD004741 - AuthorName: Luo, Y., Trishchenko, Alexander P., Latifovic, R., Li, Z. - - Title: An approach for developing surface albedo product from seven MODIS land bands at 250m spatial resolution over Canada and the Arctic circumpolar region - URL: https://lpvs.gsfc.nasa.gov/LPV_meetings/Beijing09/Luo_MODIS_Albedo_Product.pdf - AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. - - Title: A raster version of the circumpolar Arctic vegetation map (CAVM) - URL: https://doi.org/10.1016/J.RSE.2019.111297 - AuthorName: Raynolds, M.K., Walker, D.A., Balser, A., Bay, C., Campbell, M., Cherosov, M.M., Dani�ls, F.J.A., Eidesen, P.B., Ermokhina, K.A., Frost, G.V., Jedrzejek, B., Jorgenson, M.T., Kennedy, B.E., Kholod, S.S., Lavrinenko, I.A., Lavrinenko, O.V., Magn�sson, B., Matveyeva, N.V., Met�salemsson, S., Nilsen, L., Olthof, I., Pospelov, I.N., Pospelova, E.B., Pouliot, D., Razzhivin, V., Schaepman-Strub, G., ?Sib�k, J., Telyatnikov, M.Y., Troeva, E. - - Title: Cumulative changes in minimum snow/ice extent over Canada and Northern USA for 2000-2023 - URL: https://doi.org/10.1080/07038992.2024.2371359 - AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Annual minimum snow/ice extent variations over Greenland since 2000:ice sheet, peripheral areas, and relation to ice mass balance - URL: https://doi.org/10.1175/BAMS-D-22-0244.1 - AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Landfast ice properties over the Beaufort Sea region in 2000-2019 from MODIS and Canadian Ice Service data - URL: https://doi.org/10.1139/cjes-2021-0011 - AuthorName: Trishchenko, A.P., Kostylev, V.E., Luo, Y., Ungureanu, C., Whalen, D., Li, J. 
- - Title: Landfast ice mapping using MODIS clear-sky composites:application for the Banks Island coastline in Beaufort Sea and comparison with Canadian Ice Service data - URL: https://doi.org/10.1080/07038992.2021.1909466 - AuthorName: Trishchenko, A.P., Luo, Y. - - Title: Minimum snow/ice extent over the Northern circumpolar landmass in 2000-19:how much snow survives the summer melt? - URL: https://doi.org/10.1175/BAMS-D-20-0177.1 - AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Variations of annual minimum snow and ice extent over Canada and neighbouring landmass derived from MODIS 250-m imagery for 2000-2014 - URL: https://doi.org/10.1080/07038992.2016.1166043 - AuthorName: Trishchenko, A.P., Leblanc, S.G., Wang, S., Li, J., Ungureanu, C., Luo, Y., Khlopenkov, K.V., Fontana, F., 2016 - - Title: A method for downscaling MODIS land channels to 250-m spatial resolution using adaptive regression and normalization - URL: https://doi.org/10.1117/12.689157 - AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V. - - Title: Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1-B7 - URL: https://doi.org/10.1080/01431160802348119 - AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V., Park, W.M., Wang, S. - - Title: Clear-Sky Composites over Canada from Visible Infrared Imaging Radiometer Suite:Continuing MODIS Time Series into the Future - URL: https://doi.org/10.1080/07038992.2019.1601006 - AuthorName: Trishchenko, A.P. - - Title: MODIS Surface Albedo and Surface Reflectance Dataset. Format Description. 
- URL: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo/ - AuthorName: Trishchenko, Alexander P., Ungureanu, Calin - - Title: Warm season snow/ice probability maps from modis and viirs sensors over Canada - URL: https://doi.org/10.1109/IGARSS.2018.8519558 - AuthorName: Trishchenko, Alexander P., Ungureanu, Calin - - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada - URL: https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf - AuthorName: Trishchenko, Alexander P. - \ No newline at end of file From 939180e0d8f61281a2f967f180face73dcab38d4 Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Wed, 17 Dec 2025 17:45:12 -0500 Subject: [PATCH 701/751] Add files via upload reupload this file due to change the file name to be a lower case. --- datasets/ccrsmodisalbedo.yml | 102 +++++++++++++++++++++++++++++++++++ 1 file changed, 102 insertions(+) create mode 100644 datasets/ccrsmodisalbedo.yml diff --git a/datasets/ccrsmodisalbedo.yml b/datasets/ccrsmodisalbedo.yml new file mode 100644 index 000000000..bed7a997b --- /dev/null +++ b/datasets/ccrsmodisalbedo.yml @@ -0,0 +1,102 @@ +Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS +Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). 
+ Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). +Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo +Contact: alexander.trichtchenko@nrcan-rncan.gc.ca +ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation +UpdateFrequency: Semi-annually, until the end of MODIS operations + Deux fois par an, jusqu'à la fin des opérations MODIS +Tags: + - aws-pds + - analysis ready data + - broadband + - cog + - earth observation + - satellite imagery +License: Creative Commons Licence. Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ +Citation: Trishchenko, Alexander P. 2025. CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals. 
+Resources: + - Description: CCRS MODIS Albedo, Cloud Optimized GeoTIFF (COG) images + ARN: arn:aws:s3:::ccrs-modis-albedo + Region: ca-central-1 + Type: S3 Bucket + - Description: Notifications for new CCRS MODIS Albedo data + ARN: arn:aws:sns:ca-central-1:675987781521:ccrs-modis-albedo-object_created + Region: ca-central-1 + Type: SNS Topic +DataAtWork: + Tutorials: + - Title: Get To Know A Dataset - CCRS MODIS Albedo at 250-m resolution and 10-day intervals + URL: https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb + Services: + AuthorName: Alexander Trichtchenko + AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko + NotebookURL (Optional): https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb + Publications: + - Title: Boreal lichen woodlands:a possible negative feedback to climate change in eastern North America + URL: https://doi.org/10.1016/j.agrformet.2010.12.013 + AuthorName: Bernier, P.Y., Desjardins, R.L., Karimi-Zindashty, Y., Worth, D., Beaudoin, A., Luo, Y., Wang, S. + - Title: Detection of North American land cover change between 2005 and 2010 with 250m MODIS data + URL: https://www.researchgate.net/publication/286156544_Detection_of_North_American_land_cover_change_between_2005_and_2010_with_250m_MODIS_Data + AuthorName: Colditz, R.R., Pouliot, D., Llamas, R.M., Homer, C., Latifovic, R., Ressl, R.A., Tovar, C.M., Hern�ndez, A.V., Richardson, K. + - Title: Annual mapping of large Forest disturbances across Canada's forests using 250 m MODIS imagery from 2000 to 2011 + URL: https://doi.org/10.1139/cjfr-2014-0229 + AuthorName: Guindon, L., Bernier, P.Y., Beaudoin, A., Pouliot, D., Villemaire, P., Hall, R.J., Latifovic, R., St-Amant, R. 
+ - Title: Perennial snow and ice variations (2000-2008) in the Arctic circumpolar land area from satellite observations + URL: https://doi.org/10.1029/2010JF001664 + AuthorName: Fontana F.M.A., Trishchenko A.P., Luo Y., Khlopenkov K.V., Nussbaumer S.U., Wunderle S. + - Title: Influence of two management practices in the Canadian Prairies on radiative forcing + URL: https://doi.org/10.1016/j.scitotenv.2020.142701 + AuthorName: Liu, J., Worth, D.E., Desjardins, R.L., Haak, D., McConkey, B., Cerkowniak, D. + - Title: Implementation and Evaluation of Concurrent Gradient Search Method for Reprojection of MODIS Level 1B Imagery + URL: https://doi.org/10.1109/TGRS.2008.916633 + AuthorName: Khlopenkov, K.V., and Trishchenko, A.P. + - Title: Developing clear-sky, cloud and cloud shadow mask for producing clear-sky composites at 250-meter spatial resolution for the seven MODIS land bands over Canada and North America + URL: https://doi.org/10.1016/j.rse.2008.06.010 + AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. + - Title: Surface bidirectional reflectance and albedo properties derived by a land cover based approach from the MODIS observations. + URL: https://doi.org/10.1029/2004JD004741 + AuthorName: Luo, Y., Trishchenko, Alexander P., Latifovic, R., Li, Z. + - Title: An approach for developing surface albedo product from seven MODIS land bands at 250m spatial resolution over Canada and the Arctic circumpolar region + URL: https://lpvs.gsfc.nasa.gov/LPV_meetings/Beijing09/Luo_MODIS_Albedo_Product.pdf + AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. 
+ - Title: A raster version of the circumpolar Arctic vegetation map (CAVM) + URL: https://doi.org/10.1016/J.RSE.2019.111297 + AuthorName: Raynolds, M.K., Walker, D.A., Balser, A., Bay, C., Campbell, M., Cherosov, M.M., Dani�ls, F.J.A., Eidesen, P.B., Ermokhina, K.A., Frost, G.V., Jedrzejek, B., Jorgenson, M.T., Kennedy, B.E., Kholod, S.S., Lavrinenko, I.A., Lavrinenko, O.V., Magn�sson, B., Matveyeva, N.V., Met�salemsson, S., Nilsen, L., Olthof, I., Pospelov, I.N., Pospelova, E.B., Pouliot, D., Razzhivin, V., Schaepman-Strub, G., ?Sib�k, J., Telyatnikov, M.Y., Troeva, E. + - Title: Cumulative changes in minimum snow/ice extent over Canada and Northern USA for 2000-2023 + URL: https://doi.org/10.1080/07038992.2024.2371359 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Annual minimum snow/ice extent variations over Greenland since 2000:ice sheet, peripheral areas, and relation to ice mass balance + URL: https://doi.org/10.1175/BAMS-D-22-0244.1 + AuthorName: Trishchenko, A.P., Ungureanu, C. + - Title: Landfast ice properties over the Beaufort Sea region in 2000-2019 from MODIS and Canadian Ice Service data + URL: https://doi.org/10.1139/cjes-2021-0011 + AuthorName: Trishchenko, A.P., Kostylev, V.E., Luo, Y., Ungureanu, C., Whalen, D., Li, J. + - Title: Landfast ice mapping using MODIS clear-sky composites:application for the Banks Island coastline in Beaufort Sea and comparison with Canadian Ice Service data + URL: https://doi.org/10.1080/07038992.2021.1909466 + AuthorName: Trishchenko, A.P., Luo, Y. + - Title: Minimum snow/ice extent over the Northern circumpolar landmass in 2000-19:how much snow survives the summer melt? + URL: https://doi.org/10.1175/BAMS-D-20-0177.1 + AuthorName: Trishchenko, A.P., Ungureanu, C. 
+ - Title: Variations of annual minimum snow and ice extent over Canada and neighbouring landmass derived from MODIS 250-m imagery for 2000-2014 + URL: https://doi.org/10.1080/07038992.2016.1166043 + AuthorName: Trishchenko, A.P., Leblanc, S.G., Wang, S., Li, J., Ungureanu, C., Luo, Y., Khlopenkov, K.V., Fontana, F., 2016 + - Title: A method for downscaling MODIS land channels to 250-m spatial resolution using adaptive regression and normalization + URL: https://doi.org/10.1117/12.689157 + AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V. + - Title: Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1-B7 + URL: https://doi.org/10.1080/01431160802348119 + AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V., Park, W.M., Wang, S. + - Title: Clear-Sky Composites over Canada from Visible Infrared Imaging Radiometer Suite:Continuing MODIS Time Series into the Future + URL: https://doi.org/10.1080/07038992.2019.1601006 + AuthorName: Trishchenko, A.P. + - Title: MODIS Surface Albedo and Surface Reflectance Dataset. Format Description. + URL: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo/ + AuthorName: Trishchenko, Alexander P., Ungureanu, Calin + - Title: Warm season snow/ice probability maps from modis and viirs sensors over Canada + URL: https://doi.org/10.1109/IGARSS.2018.8519558 + AuthorName: Trishchenko, Alexander P., Ungureanu, Calin + - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada + URL: https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf + AuthorName: Trishchenko, Alexander P. + \ No newline at end of file From 49a44739f3162e8f0c14f48ba1624e1b95022ce2 Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Wed, 17 Dec 2025 17:47:10 -0500 Subject: [PATCH 702/751] Delete datasets/ccrsmodisalbedo.yml redudant, delete it. 
--- datasets/ccrsmodisalbedo.yml | 102 ----------------------------------- 1 file changed, 102 deletions(-) delete mode 100644 datasets/ccrsmodisalbedo.yml diff --git a/datasets/ccrsmodisalbedo.yml b/datasets/ccrsmodisalbedo.yml deleted file mode 100644 index bed7a997b..000000000 --- a/datasets/ccrsmodisalbedo.yml +++ /dev/null @@ -1,102 +0,0 @@ -Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS -Description: Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). - Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). 
-Documentation: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo -Contact: alexander.trichtchenko@nrcan-rncan.gc.ca -ManagedBy: Canada Centre for Remote Sensing (CCRS), Canada Centre for Mapping and Earth Observation (CCMEO), Department of Natural Resources Canada (NRCan) https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-mapping-earth-observation -UpdateFrequency: Semi-annually, until the end of MODIS operations - Deux fois par an, jusqu'à la fin des opérations MODIS -Tags: - - aws-pds - - analysis ready data - - broadband - - cog - - earth observation - - satellite imagery -License: Creative Commons Licence. Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ -Citation: Trishchenko, Alexander P. 2025. CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals. -Resources: - - Description: CCRS MODIS Albedo, Cloud Optimized GeoTIFF (COG) images - ARN: arn:aws:s3:::ccrs-modis-albedo - Region: ca-central-1 - Type: S3 Bucket - - Description: Notifications for new CCRS MODIS Albedo data - ARN: arn:aws:sns:ca-central-1:675987781521:ccrs-modis-albedo-object_created - Region: ca-central-1 - Type: SNS Topic -DataAtWork: - Tutorials: - - Title: Get To Know A Dataset - CCRS MODIS Albedo at 250-m resolution and 10-day intervals - URL: https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb - Services: - AuthorName: Alexander Trichtchenko - AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko - NotebookURL (Optional): https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb - Publications: - - Title: Boreal lichen woodlands:a possible negative feedback to climate change in eastern 
North America - URL: https://doi.org/10.1016/j.agrformet.2010.12.013 - AuthorName: Bernier, P.Y., Desjardins, R.L., Karimi-Zindashty, Y., Worth, D., Beaudoin, A., Luo, Y., Wang, S. - - Title: Detection of North American land cover change between 2005 and 2010 with 250m MODIS data - URL: https://www.researchgate.net/publication/286156544_Detection_of_North_American_land_cover_change_between_2005_and_2010_with_250m_MODIS_Data - AuthorName: Colditz, R.R., Pouliot, D., Llamas, R.M., Homer, C., Latifovic, R., Ressl, R.A., Tovar, C.M., Hern�ndez, A.V., Richardson, K. - - Title: Annual mapping of large Forest disturbances across Canada's forests using 250 m MODIS imagery from 2000 to 2011 - URL: https://doi.org/10.1139/cjfr-2014-0229 - AuthorName: Guindon, L., Bernier, P.Y., Beaudoin, A., Pouliot, D., Villemaire, P., Hall, R.J., Latifovic, R., St-Amant, R. - - Title: Perennial snow and ice variations (2000-2008) in the Arctic circumpolar land area from satellite observations - URL: https://doi.org/10.1029/2010JF001664 - AuthorName: Fontana F.M.A., Trishchenko A.P., Luo Y., Khlopenkov K.V., Nussbaumer S.U., Wunderle S. - - Title: Influence of two management practices in the Canadian Prairies on radiative forcing - URL: https://doi.org/10.1016/j.scitotenv.2020.142701 - AuthorName: Liu, J., Worth, D.E., Desjardins, R.L., Haak, D., McConkey, B., Cerkowniak, D. - - Title: Implementation and Evaluation of Concurrent Gradient Search Method for Reprojection of MODIS Level 1B Imagery - URL: https://doi.org/10.1109/TGRS.2008.916633 - AuthorName: Khlopenkov, K.V., and Trishchenko, A.P. - - Title: Developing clear-sky, cloud and cloud shadow mask for producing clear-sky composites at 250-meter spatial resolution for the seven MODIS land bands over Canada and North America - URL: https://doi.org/10.1016/j.rse.2008.06.010 - AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. 
- - Title: Surface bidirectional reflectance and albedo properties derived by a land cover based approach from the MODIS observations. - URL: https://doi.org/10.1029/2004JD004741 - AuthorName: Luo, Y., Trishchenko, Alexander P., Latifovic, R., Li, Z. - - Title: An approach for developing surface albedo product from seven MODIS land bands at 250m spatial resolution over Canada and the Arctic circumpolar region - URL: https://lpvs.gsfc.nasa.gov/LPV_meetings/Beijing09/Luo_MODIS_Albedo_Product.pdf - AuthorName: Luo, Y., Trishchenko, A.P., Khlopenkov, K.V. - - Title: A raster version of the circumpolar Arctic vegetation map (CAVM) - URL: https://doi.org/10.1016/J.RSE.2019.111297 - AuthorName: Raynolds, M.K., Walker, D.A., Balser, A., Bay, C., Campbell, M., Cherosov, M.M., Dani�ls, F.J.A., Eidesen, P.B., Ermokhina, K.A., Frost, G.V., Jedrzejek, B., Jorgenson, M.T., Kennedy, B.E., Kholod, S.S., Lavrinenko, I.A., Lavrinenko, O.V., Magn�sson, B., Matveyeva, N.V., Met�salemsson, S., Nilsen, L., Olthof, I., Pospelov, I.N., Pospelova, E.B., Pouliot, D., Razzhivin, V., Schaepman-Strub, G., ?Sib�k, J., Telyatnikov, M.Y., Troeva, E. - - Title: Cumulative changes in minimum snow/ice extent over Canada and Northern USA for 2000-2023 - URL: https://doi.org/10.1080/07038992.2024.2371359 - AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Annual minimum snow/ice extent variations over Greenland since 2000:ice sheet, peripheral areas, and relation to ice mass balance - URL: https://doi.org/10.1175/BAMS-D-22-0244.1 - AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Landfast ice properties over the Beaufort Sea region in 2000-2019 from MODIS and Canadian Ice Service data - URL: https://doi.org/10.1139/cjes-2021-0011 - AuthorName: Trishchenko, A.P., Kostylev, V.E., Luo, Y., Ungureanu, C., Whalen, D., Li, J. 
- - Title: Landfast ice mapping using MODIS clear-sky composites:application for the Banks Island coastline in Beaufort Sea and comparison with Canadian Ice Service data - URL: https://doi.org/10.1080/07038992.2021.1909466 - AuthorName: Trishchenko, A.P., Luo, Y. - - Title: Minimum snow/ice extent over the Northern circumpolar landmass in 2000-19:how much snow survives the summer melt? - URL: https://doi.org/10.1175/BAMS-D-20-0177.1 - AuthorName: Trishchenko, A.P., Ungureanu, C. - - Title: Variations of annual minimum snow and ice extent over Canada and neighbouring landmass derived from MODIS 250-m imagery for 2000-2014 - URL: https://doi.org/10.1080/07038992.2016.1166043 - AuthorName: Trishchenko, A.P., Leblanc, S.G., Wang, S., Li, J., Ungureanu, C., Luo, Y., Khlopenkov, K.V., Fontana, F., 2016 - - Title: A method for downscaling MODIS land channels to 250-m spatial resolution using adaptive regression and normalization - URL: https://doi.org/10.1117/12.689157 - AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V. - - Title: Arctic circumpolar mosaic at 250m spatial resolution for IPY by fusion of MODIS/TERRA land bands B1-B7 - URL: https://doi.org/10.1080/01431160802348119 - AuthorName: Trishchenko, A.P., Luo, Y., Khlopenkov, K.V., Park, W.M., Wang, S. - - Title: Clear-Sky Composites over Canada from Visible Infrared Imaging Radiometer Suite:Continuing MODIS Time Series into the Future - URL: https://doi.org/10.1080/07038992.2019.1601006 - AuthorName: Trishchenko, A.P. - - Title: MODIS Surface Albedo and Surface Reflectance Dataset. Format Description. 
- URL: https://data.eodms-sgdot.nrcan-rncan.gc.ca/public/CCRS/Trishchenko_MODIS_Albedo/ - AuthorName: Trishchenko, Alexander P., Ungureanu, Calin - - Title: Warm season snow/ice probability maps from modis and viirs sensors over Canada - URL: https://doi.org/10.1109/IGARSS.2018.8519558 - AuthorName: Trishchenko, Alexander P., Ungureanu, Calin - - Title: Probability of the annual minimum snow and ice (MSI) presence over Canada - URL: https://open.canada.ca/data/en/dataset/808b84a1-6356-4103-a8e9-db46d5c20fcf - AuthorName: Trishchenko, Alexander P. - \ No newline at end of file From f2e45d9305ae5c8dbebcb1049891d68b685074f2 Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Wed, 17 Dec 2025 18:25:32 -0500 Subject: [PATCH 703/751] Update ccrsmodisalbedo.yaml Update tags, remove Canana because the tag list only include 'canada', and change COG to 'cog' because the tag list only has 'cog' --- datasets/ccrsmodisalbedo.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/datasets/ccrsmodisalbedo.yaml b/datasets/ccrsmodisalbedo.yaml index 97b16a337..cc9c4d0e8 100644 --- a/datasets/ccrsmodisalbedo.yaml +++ b/datasets/ccrsmodisalbedo.yaml @@ -11,9 +11,8 @@ UpdateFrequency: | Tags: - aws-pds - analysis ready data - - broadband - - Canada - - COG + - broadband + - cog - earth observation - satellite imagery License: Creative Commons Licence. 
Creative Commons BY 4.0 https://creativecommons.org/licenses/by/4.0/ @@ -105,3 +104,4 @@ DataAtWork: + From 0fdf6394eb6d2eb44a47b2cbfd0888b9027f4060 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 17 Dec 2025 18:11:35 -0900 Subject: [PATCH 704/751] ok: Update ccrsmodisalbedo.yaml --- datasets/ccrsmodisalbedo.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/ccrsmodisalbedo.yaml b/datasets/ccrsmodisalbedo.yaml index cc9c4d0e8..2252b03d4 100644 --- a/datasets/ccrsmodisalbedo.yaml +++ b/datasets/ccrsmodisalbedo.yaml @@ -105,3 +105,4 @@ DataAtWork: + From 4a0089efb26d58449993f055194a7e7a23508759 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 18 Dec 2025 06:35:53 -0900 Subject: [PATCH 705/751] ok: Update ccrsmodisalbedo.yaml --- datasets/ccrsmodisalbedo.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/ccrsmodisalbedo.yaml b/datasets/ccrsmodisalbedo.yaml index 2252b03d4..4d6945eed 100644 --- a/datasets/ccrsmodisalbedo.yaml +++ b/datasets/ccrsmodisalbedo.yaml @@ -30,10 +30,9 @@ DataAtWork: Tutorials: - Title: Get To Know A Dataset - CCRS MODIS Albedo at 250-m resolution and 10-day intervals URL: https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb - Services: AuthorName: Alexander Trichtchenko AuthorURL: https://profils-profiles.science.gc.ca/en/profile/alexander-p-trishchenko - NotebookURL (Optional): https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb + NotebookURL: https://github.com/OpsCCRS/AWS-Open-Data-Registry-Preparation/blob/main/CCRSMODISAlbedo/get-to-know-a-dataset-CCRSMODISAlbedo.ipynb Publications: - Title: "Boreal lichen woodlands: a possible negative feedback to climate change in eastern North America" URL: https://doi.org/10.1016/j.agrformet.2010.12.013 @@ -106,3 
+105,4 @@ DataAtWork: + From 0600466c8a2d460b2afe9985b10cd0348ce73301 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 18 Dec 2025 11:21:09 -0500 Subject: [PATCH 706/751] ok: Update asl_1000.yaml From 35d708a3edd46552f066d798aa64f554a884ebda Mon Sep 17 00:00:00 2001 From: CCRS ST_OPS Date: Thu, 18 Dec 2025 12:28:09 -0500 Subject: [PATCH 707/751] Update ccrsmodisalbedo.yaml To make the name shorter based on request --- datasets/ccrsmodisalbedo.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/datasets/ccrsmodisalbedo.yaml b/datasets/ccrsmodisalbedo.yaml index 4d6945eed..746104ff0 100644 --- a/datasets/ccrsmodisalbedo.yaml +++ b/datasets/ccrsmodisalbedo.yaml @@ -1,4 +1,4 @@ -Name: CCRS MODIS albedo over Canada at 250-m resolution and 10-day intervals on AWS | Albédo CCRS MODIS au-dessus du Canada à une résolution de 250 m et à intervalles de 10 jours sur AWS +Name: CCRS MODIS albedo over Canada | Albédo CCRS MODIS au-dessus du Canada Description: | Times series of 10-day spectral and broadband albedo products derived at 250-m spatial resolution over Canadian territory and neighboring areas produced at the Canada Centre for Remote Sensing (CCRS) since February 2000 using MODIS L1B C6.1 swath imagery as input. The imagery for all spectral bands was downscaled and re-projected into the Lambert Conformal Conic (LCC) projection at 250-m spatial resolution. The area size is 5,700 km x 4,800 km (22,800 pixel x 19,200 lines). Séries temporelles de produits d’albédo spectral et à large bande générés à des intervalles de 10 jours avec une résolution spatiale de 250 m, couvrant le territoire canadien et les régions voisines. Ces produits sont élaborés par le Centre canadien de télédétection (CCT) depuis février 2000 à partir des images MODIS L1B C6.1. Les images de toutes les bandes spectrales ont été rééchantillonnées et reprojetées en projection conforme de Lambert (LCC) à une résolution spatiale de 250 m. 
La zone couverte est d’environ 5 700 km par 4 800 km (22 800 pixels par 19 200 lignes). @@ -106,3 +106,4 @@ DataAtWork: + From 16b97f3086a97b914772d1af337f243aaa0cb0e3 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 18 Dec 2025 10:07:52 -0900 Subject: [PATCH 708/751] ok: Update ccrsmodisalbedo.yaml --- datasets/ccrsmodisalbedo.yaml | 2 -- 1 file changed, 2 deletions(-) diff --git a/datasets/ccrsmodisalbedo.yaml b/datasets/ccrsmodisalbedo.yaml index 746104ff0..c2dbdfd9b 100644 --- a/datasets/ccrsmodisalbedo.yaml +++ b/datasets/ccrsmodisalbedo.yaml @@ -105,5 +105,3 @@ DataAtWork: - - From d55dea49029ed25359904d28174a6b8935c7c5de Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 18 Dec 2025 10:14:21 -0900 Subject: [PATCH 709/751] ok: Update ccrsmodisalbedo.yaml From 74e704ca5b9847d59c442c9e8693d11e6668310d Mon Sep 17 00:00:00 2001 From: Heng Li Date: Fri, 19 Dec 2025 18:26:54 -0500 Subject: [PATCH 710/751] Add SNS topic resource --- datasets/openhgl.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/datasets/openhgl.yaml b/datasets/openhgl.yaml index d5177bb12..64f8ecde4 100644 --- a/datasets/openhgl.yaml +++ b/datasets/openhgl.yaml @@ -23,6 +23,10 @@ Resources: ARN: arn:aws:s3:::openhgl Region: us-east-1 Type: S3 Bucket + - Description: Notifications for OpenHGL updates + ARN: arn:aws:sns:us-east-1:104240442756:openhgl-object_created + Region: us-east-1 + Type: SNS Topic DataAtWork: Tutorials: - Title: Using OpenHGL data From d62a4b4053f818294a813379e9397445dd7806fa Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 22 Dec 2025 09:27:52 -0500 Subject: [PATCH 711/751] ok: Update openhgl.yaml --- datasets/openhgl.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/openhgl.yaml b/datasets/openhgl.yaml index 64f8ecde4..d5e3453ab 100644 --- a/datasets/openhgl.yaml +++ b/datasets/openhgl.yaml @@ -8,7 +8,7 @@ Description: > Documentation: 
https://lh3.github.io/OpenHGL/ Contact: https://github.com/lh3/OpenHGL/issues ManagedBy: Heng Li lab at Dana-Farber Cancer Institute and Harvard Medical School -UpdateFrequency: As new data or new analysis becomes available +UpdateFrequency: As new data or new analyses become available Tags: - aws-pds - bioinformatics From 7c3290b189f82a24b829a78f10dd2e26c33d8a88 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Mon, 22 Dec 2025 09:58:20 -0500 Subject: [PATCH 712/751] ok: Update openhgl.yaml From 48b26607513269864df02ed5a85f5fdceb751f7a Mon Sep 17 00:00:00 2001 From: Bryan Nielsen Date: Mon, 22 Dec 2025 13:07:25 -0800 Subject: [PATCH 713/751] Add S3 bucket and SNS topic for salk-aging-mouse-brain-epigeneti --- datasets/salk-aging-mouse-brain-epigeneti.yaml | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/datasets/salk-aging-mouse-brain-epigeneti.yaml b/datasets/salk-aging-mouse-brain-epigeneti.yaml index 710798a20..e5e013575 100644 --- a/datasets/salk-aging-mouse-brain-epigeneti.yaml +++ b/datasets/salk-aging-mouse-brain-epigeneti.yaml @@ -16,6 +16,18 @@ Tags: - cram - STRIDES License: "[NCBI Policy](https://www.ncbi.nlm.nih.gov/home/about/policies/) and [NIH Genomic Data Sharing Policy ](https://osp.od.nih.gov/scientific-sharing/genomic-data-sharing/)" + +Resources: + - Description: Aging mouse brain epigenomics dataset + ARN: arn:aws:s3:::salk-aging-mouse-brain-epigenetics + Region: us-west-2 + Type: S3 Bucket + +Notifications: + - Description: Dataset update notifications + ARN: arn:aws:sns:us-west-2:855738613743:salk-aging-mouse-brain-epigenetics-updates + Type: SNS Topic + DataAtWork: Publications: - Title: Cell-type-specific transposable element demethylation and TAD remodeling in the aging mouse brain From e2496cc79a62fe1f97e3899977a50894344a26a6 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 23 Dec 2025 10:44:53 -0500 Subject: [PATCH 714/751] ok: Update salk-aging-mouse-brain-epigeneti.yaml --- 
datasets/salk-aging-mouse-brain-epigeneti.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datasets/salk-aging-mouse-brain-epigeneti.yaml b/datasets/salk-aging-mouse-brain-epigeneti.yaml index e5e013575..ca9914c15 100644 --- a/datasets/salk-aging-mouse-brain-epigeneti.yaml +++ b/datasets/salk-aging-mouse-brain-epigeneti.yaml @@ -1,4 +1,4 @@ -Name: Aging Mouse Brain Epigeneti +Name: Aging Mouse Brain Epigenetic Description: "Aging is a major risk factor for neurodegenerative diseases, yet underlying epigenetic mechanisms remain unclear. Here, we generated a comprehensive single-nucleus cell atlas of brain aging across multiple brain regions, comprising 132,551 single-cell methylomes and 72,666 joint chromatin conformation-methylome nuclei. Integration with companion transcriptomic and chromatin accessibility data yielded a cross-modality taxonomy of 36 major cell types." Contact: ecker@salk.edu Documentation: https://doi.org/10.1101/2025.04.21.648266 @@ -14,7 +14,7 @@ Tags: - fastq - bam - cram - - STRIDES + - aws-pds License: "[NCBI Policy](https://www.ncbi.nlm.nih.gov/home/about/policies/) and [NIH Genomic Data Sharing Policy ](https://osp.od.nih.gov/scientific-sharing/genomic-data-sharing/)" Resources: From b88a1cade45f638a2d7458211bee07ae4280343a Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 23 Dec 2025 10:46:13 -0500 Subject: [PATCH 715/751] ok: Update salk-aging-mouse-brain-epigeneti.yaml --- datasets/salk-aging-mouse-brain-epigeneti.yaml | 4 ---- 1 file changed, 4 deletions(-) diff --git a/datasets/salk-aging-mouse-brain-epigeneti.yaml b/datasets/salk-aging-mouse-brain-epigeneti.yaml index ca9914c15..05d746076 100644 --- a/datasets/salk-aging-mouse-brain-epigeneti.yaml +++ b/datasets/salk-aging-mouse-brain-epigeneti.yaml @@ -16,18 +16,14 @@ Tags: - cram - aws-pds License: "[NCBI Policy](https://www.ncbi.nlm.nih.gov/home/about/policies/) and [NIH Genomic Data Sharing Policy 
](https://osp.od.nih.gov/scientific-sharing/genomic-data-sharing/)" - Resources: - Description: Aging mouse brain epigenomics dataset ARN: arn:aws:s3:::salk-aging-mouse-brain-epigenetics Region: us-west-2 Type: S3 Bucket - -Notifications: - Description: Dataset update notifications ARN: arn:aws:sns:us-west-2:855738613743:salk-aging-mouse-brain-epigenetics-updates Type: SNS Topic - DataAtWork: Publications: - Title: Cell-type-specific transposable element demethylation and TAD remodeling in the aging mouse brain From 04fbc29b30fb28e5a7bfb66c11eb07583ec59185 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Tue, 23 Dec 2025 10:50:57 -0500 Subject: [PATCH 716/751] ok: Update salk-aging-mouse-brain-epigeneti.yaml --- datasets/salk-aging-mouse-brain-epigeneti.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/salk-aging-mouse-brain-epigeneti.yaml b/datasets/salk-aging-mouse-brain-epigeneti.yaml index 05d746076..efb5ef760 100644 --- a/datasets/salk-aging-mouse-brain-epigeneti.yaml +++ b/datasets/salk-aging-mouse-brain-epigeneti.yaml @@ -23,6 +23,7 @@ Resources: Type: S3 Bucket - Description: Dataset update notifications ARN: arn:aws:sns:us-west-2:855738613743:salk-aging-mouse-brain-epigenetics-updates + Region: us-west-2 Type: SNS Topic DataAtWork: Publications: From de005780f49424a8fca8f5d6b26af332af545a63 Mon Sep 17 00:00:00 2001 From: yuukiiwa Date: Wed, 31 Dec 2025 06:59:10 +0800 Subject: [PATCH 717/751] move frag-struc.yaml to the datasets directory --- frag-struc.yaml => datasets/frag-struc.yaml | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename frag-struc.yaml => datasets/frag-struc.yaml (100%) diff --git a/frag-struc.yaml b/datasets/frag-struc.yaml similarity index 100% rename from frag-struc.yaml rename to datasets/frag-struc.yaml From b43d3c12efcaba8b2a83a556d534dab9534e2866 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 2 Jan 2026 09:16:34 -0500 Subject: [PATCH 718/751] Update frag-struc.yaml --- datasets/frag-struc.yaml | 2 
++ 1 file changed, 2 insertions(+) diff --git a/datasets/frag-struc.yaml b/datasets/frag-struc.yaml index b42271e94..37b7b4c14 100644 --- a/datasets/frag-struc.yaml +++ b/datasets/frag-struc.yaml @@ -36,3 +36,5 @@ DataAtWork: - Title: Hidden structural information in RNA sequencing data. URL: In Preparation AuthorName: Yuk Kei Wan and Leonard Schärfen +ADXCategories: + - Healthcare & Life Sciences Data From 116ff88da49ad959fd35518de9d62aaa259e4989 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 2 Jan 2026 09:17:02 -0500 Subject: [PATCH 719/751] ok: Update frag-struc.yaml --- datasets/frag-struc.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/frag-struc.yaml b/datasets/frag-struc.yaml index 37b7b4c14..a431ebbb7 100644 --- a/datasets/frag-struc.yaml +++ b/datasets/frag-struc.yaml @@ -34,7 +34,7 @@ DataAtWork: AuthorName: Yuk Kei Wan and Leonard Schärfen Publications: - Title: Hidden structural information in RNA sequencing data. - URL: In Preparation + URL: In Preparation AuthorName: Yuk Kei Wan and Leonard Schärfen ADXCategories: - Healthcare & Life Sciences Data From 9ed2ba2611f40960330f726a40f51818937dfb0b Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 2 Jan 2026 09:26:41 -0500 Subject: [PATCH 720/751] ok: Update frag-struc.yaml --- datasets/frag-struc.yaml | 4 ---- 1 file changed, 4 deletions(-) diff --git a/datasets/frag-struc.yaml b/datasets/frag-struc.yaml index a431ebbb7..8652c6187 100644 --- a/datasets/frag-struc.yaml +++ b/datasets/frag-struc.yaml @@ -5,14 +5,10 @@ Contact: "[fragSTRUC team](https://github.com/yuukiiwa/RNA_structure_by_fragment ManagedBy: "The Genome Institute of Singapore (https://www.a-star.edu.sg/gis) and UMass Chan Medical School's RNA Therapeutics Institute (https://www.umassmed.edu/rti/)" UpdateFrequency: Datasets will be updated periodically as additional data is generated. 
Tags: - - RNA structure - genomic - transcriptomics - life sciences - - Illumina sequencing - - bulk RNA sequencing - bioinformatics - - bigwig - aws-pds License: "[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)" Citation: "In addition, please cite Yuk Kei Wan and Leonard Schärfen Hidden structural information in RNA sequencing data." From 17dc64701e4c7b41dd09428aadb9a97f79483f34 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 2 Jan 2026 09:30:57 -0500 Subject: [PATCH 721/751] ok: Update frag-struc.yaml Removing publication listing as the schema needs a link --- datasets/frag-struc.yaml | 4 ---- 1 file changed, 4 deletions(-) diff --git a/datasets/frag-struc.yaml b/datasets/frag-struc.yaml index 8652c6187..b0e84118a 100644 --- a/datasets/frag-struc.yaml +++ b/datasets/frag-struc.yaml @@ -28,9 +28,5 @@ DataAtWork: - Title: "fragSTRUC: RNA structure by fragmentation frequency" URL: https://github.com/lschaerfen/fragstruc AuthorName: Yuk Kei Wan and Leonard Schärfen - Publications: - - Title: Hidden structural information in RNA sequencing data. 
- URL: In Preparation - AuthorName: Yuk Kei Wan and Leonard Schärfen ADXCategories: - Healthcare & Life Sciences Data From 7046b8a1e7168a347e14a4ceb7cadcfe8857d155 Mon Sep 17 00:00:00 2001 From: ricardo Date: Wed, 7 Jan 2026 11:36:43 +0000 Subject: [PATCH 722/751] add notebook and update arn --- datasets/open-targets-platform.yaml | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/datasets/open-targets-platform.yaml b/datasets/open-targets-platform.yaml index 27ff34cbf..d63a711b9 100644 --- a/datasets/open-targets-platform.yaml +++ b/datasets/open-targets-platform.yaml @@ -17,11 +17,16 @@ Tags: License: https://creativecommons.org/publicdomain/zero/1.0/ Resources: - Description: OpenTargets Release Data - ARN: - Region: + ARN: arn:aws:s3:::open-targets-public-data-releases + Region: eu-west-1 Type: S3 Bucket DataAtWork: Tutorials: + - Title: Platform datasets on AWS + URL: https://platform-docs.opentargets.org/data-access/platform-datasets-on-aws + NotebookURL: https://colab.research.google.com/github/opentargets/notebooks/blob/main/notebooks/reading_data_from_aws.ipynb + AuthorName: Daniel Suveges + AuthorURL: https://www.ebi.ac.uk/people/person/daniel-suveges/ - Title: Autoimmune colocalisations URL: https://github.com/opentargets/notebooks NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/autoimmune_colocalisations.ipynb From b24e357d50430759e23b6a19369e576a274c370d Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Wed, 7 Jan 2026 10:15:37 -0700 Subject: [PATCH 723/751] Add crescent_dunes dataset and update links --- datasets/oedi-data-lake.yaml | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/datasets/oedi-data-lake.yaml b/datasets/oedi-data-lake.yaml index c35320844..c483cc47b 100644 --- a/datasets/oedi-data-lake.yaml +++ b/datasets/oedi-data-lake.yaml @@ -125,19 +125,25 @@ Resources: Region: us-west-2 Type: S3 Bucket Explore: - - 
'[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=sup3ruhi%2F&limit=50)' + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=sup3ruhi%2F)' - Description: "[Buildings Sector Scenarios (BSS)](https://data.openei.org/submissions/8558)" ARN: arn:aws:s3:::oedi-data-lake/building-sector-scenarios/ Region: us-west-2 Type: S3 Bucket Explore: - - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=buildings-sector-scenarios%2F&limit=50)' + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=buildings-sector-scenarios%2F)' - Description: "[U.S. Agrivoltaic Irradiance Database](https://data.openei.org/submissions/8568)" ARN: arn:aws:s3:::oedi-data-lake/inspire/agrivoltaics_irradiance/ Region: us-west-2 Type: S3 Bucket Explore: - - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=inspire%2Fagrivoltaics_irradiance%2F&limit=50)' + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=inspire%2Fagrivoltaics_irradiance%2F)' + - Description: "[Wind and Structural Loads on Heliostats](https://data.openei.org/submissions/8601)" + ARN: arn:aws:s3:::oedi-data-lake/crescent_dunes/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=crescent_dunes%2F)' DataAtWork: Tools & Applications: - Title: "Tracking the Sun Tool" From e02fb0ab2d060888dac939ca573a8f36b12f329b Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Wed, 7 Jan 2026 15:34:46 -0500 Subject: [PATCH 724/751] ok: Update tags.yaml Adding "drug discovery" as we now have a few of these datasets --- tags.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/tags.yaml b/tags.yaml index c7b74cdc8..4b0bf6f67 100644 --- a/tags.yaml +++ b/tags.yaml @@ -131,6 +131,7 @@ - drilling - drifters - Drosophila melanogaster +- drug discovery - dsm - dtm - earth observation From 
9640d6a300d902edd0124fede5a87d9fc4b5ef4f Mon Sep 17 00:00:00 2001 From: kszura <43186787+kszura@users.noreply.github.com> Date: Wed, 7 Jan 2026 17:17:31 -0500 Subject: [PATCH 725/751] Add NOAA S-104 Water Level Data specification Added NOAA S-104 Water Level Data specification with details about data sources, usage, and resources. --- datasets/noaa-s104.yaml | 54 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) create mode 100644 datasets/noaa-s104.yaml diff --git a/datasets/noaa-s104.yaml b/datasets/noaa-s104.yaml new file mode 100644 index 000000000..fcc866f4e --- /dev/null +++ b/datasets/noaa-s104.yaml @@ -0,0 +1,54 @@ +Name: "NOAA S-104 Water Level Data" +Description: | + S-104 is a data and metadata encoding specification that is part of the [S-100 Universal Hydrographic Data Model](https://iho.int/en/s-100-universal-hydrographic-data-model), an international standard for hydrographic data. This collection of data contains water level forecast guidance from [NOAA's Global Surge and Tide Operational Forecast System 2-D (STOFS-2D-Global)](https://polar.ncep.noaa.gov/estofs/), an operational hydrodynamic nowcast and forecast modeling system for global water level conditions. These datasets are encoded as HDF-5 files conforming to the S-104 specification, and are geospatially subset into individual tiles conforming to the NOAA/OCS Nautical Product Tiling Scheme, with filenames indicating the corresponding NOAA Electronic Navigational Chart (ENC) Cell Identifier. A set of prototype S-104 tiles has been created for the Charleston, SC area for a select model run cycle. Each individual S-104 (HDF-5) file contains all forecast projections from a single model run for that geographic area. A single S-104 file will contain multiple gridded arrays each containing a forecast valid at a distinct time in the future, out to the forecast horizon of STOFS-2D-Global, which is 180 hours or 7.5 days. 
The water level forecast guidance includes the combined effects of storm surge (sub-tidal) and tides (astronomical tide predictions). +Documentation: | + https://noaa-s104-pds.s3.amazonaws.com/README.html +UpdateFrequency: | + Static +License: | + NOAA data disseminated through NODD are open to the public and can be used as desired. + NOAA makes data openly available to ensure maximum use of our data, and to spur and encourage exploration and innovation throughout the industry. NOAA requests attribution for the use or dissemination of unaltered NOAA data. However, it is not permissible to state or imply endorsement by or affiliation with NOAA. If you modify NOAA data, you may not state or imply that it is original, unaltered NOAA data. +ManagedBy: | + "[NOAA](http://www.noaa.gov/)" +Contact: | + For any data delivery issues, please contact the NOAA Open Data Dissemination Team at: nodd@noaa.gov. For general questions or feedback about the data, please submit inquiries through the NOAA Office of Coast Survey (OCS) ASSIST Tool at https://www.nauticalcharts.noaa.gov/customer-service/assist/. + We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NOAA NODD team at nodd@noaa.gov. 
+Collabs: +Tags: + - aws-pds + - s-100 + - s-104 + - water level + - surface navigation + - marine navigation + - harbor navigation + - hydrography + - ocean + - coastal + - forecast guidance + - hydrodynamic model + - STOFS-2D-Global + - surge and tide operational forecast system + +Resources: + - Description: "NOAA S-104 Water Level for Surface Navigation Datasets" + ARN: arn:aws:s3:::noaa-s104-pds + Region: us-east-1 + Type: S3 Bucket + Explore: + - '[Browse Bucket](https://noaa-s104-pds.s3.amazonaws.com/index.html)' + - Description: "NOAA S-104 Water Level for Surface Navigation Datasets" + ARN: arn:aws:sns:us-east-1:123901341784:NewS104Object + Region: us-east-1 + Type: SNS Topic + +DataAtWork: + Tutorials: + - Title: "NOAA Precision Marine Navigation Program: Developing Next-Gen Data Svcs for the Maritime Community" + URL: https://www.youtube.com/watch?v=laC0Du6-x3k + AuthorName: NOAA + - Title: "NOAA nowCOAST" + URL: https://nowcoast.noaa.gov/ + AuthorName: NOAA + + From 9eb11d671ab65b78ae292824cb21455355065cca Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 7 Jan 2026 13:38:08 -0900 Subject: [PATCH 726/751] ok: Update noaa-s104.yaml --- datasets/noaa-s104.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/noaa-s104.yaml b/datasets/noaa-s104.yaml index fcc866f4e..1e16f5c8e 100644 --- a/datasets/noaa-s104.yaml +++ b/datasets/noaa-s104.yaml @@ -51,4 +51,3 @@ DataAtWork: URL: https://nowcoast.noaa.gov/ AuthorName: NOAA - From 5aa23a7cec061ea331ada2ba55bc7ce9d7137030 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 7 Jan 2026 13:42:52 -0900 Subject: [PATCH 727/751] ok: Update noaa-s104.yaml --- datasets/noaa-s104.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/noaa-s104.yaml b/datasets/noaa-s104.yaml index 1e16f5c8e..646c837cf 100644 --- a/datasets/noaa-s104.yaml +++ b/datasets/noaa-s104.yaml @@ -47,7 +47,7 @@ DataAtWork: - Title: 
"NOAA Precision Marine Navigation Program: Developing Next-Gen Data Svcs for the Maritime Community" URL: https://www.youtube.com/watch?v=laC0Du6-x3k AuthorName: NOAA - - Title: "NOAA nowCOAST" + - Title: "NOAA nowCOAST" URL: https://nowcoast.noaa.gov/ AuthorName: NOAA From 9f3371e959e03a570d3dad2f65ebac6590ab6832 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 7 Jan 2026 13:50:47 -0900 Subject: [PATCH 728/751] ok: Update noaa-s104.yaml --- datasets/noaa-s104.yaml | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-) diff --git a/datasets/noaa-s104.yaml b/datasets/noaa-s104.yaml index 646c837cf..9b7b36e68 100644 --- a/datasets/noaa-s104.yaml +++ b/datasets/noaa-s104.yaml @@ -16,19 +16,10 @@ Contact: | Collabs: Tags: - aws-pds - - s-100 - - s-104 - - water level - - surface navigation - marine navigation - - harbor navigation - hydrography - - ocean + - oceans - coastal - - forecast guidance - - hydrodynamic model - - STOFS-2D-Global - - surge and tide operational forecast system Resources: - Description: "NOAA S-104 Water Level for Surface Navigation Datasets" From e363204e90c6fd0a7ac37036cabfda60a3939c17 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Wed, 7 Jan 2026 14:07:57 -0900 Subject: [PATCH 729/751] ok: Update noaa-s104.yaml --- datasets/noaa-s104.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/datasets/noaa-s104.yaml b/datasets/noaa-s104.yaml index 9b7b36e68..3abff1d02 100644 --- a/datasets/noaa-s104.yaml +++ b/datasets/noaa-s104.yaml @@ -14,6 +14,9 @@ Contact: | For any data delivery issues, please contact the NOAA Open Data Dissemination Team at: nodd@noaa.gov. For general questions or feedback about the data, please submit inquiries through the NOAA Office of Coast Survey (OCS) ASSIST Tool at https://www.nauticalcharts.noaa.gov/customer-service/assist/. 
We also seek to identify case studies on how NOAA data is being used and will be featuring those stories in joint publications and in upcoming events. If you are interested in seeing your story highlighted, please share it with the NOAA NODD team at nodd@noaa.gov. Collabs: + ASDI: + Tags: + - weather Tags: - aws-pds - marine navigation From fd43f071882895c2366eb8e22902bf7eb7b2de40 Mon Sep 17 00:00:00 2001 From: ricardo Date: Thu, 8 Jan 2026 11:30:47 +0000 Subject: [PATCH 730/751] delete new ot file found duplicated file --- datasets/open-targets-platform.yaml | 74 ----------------------------- 1 file changed, 74 deletions(-) delete mode 100644 datasets/open-targets-platform.yaml diff --git a/datasets/open-targets-platform.yaml b/datasets/open-targets-platform.yaml deleted file mode 100644 index d63a711b9..000000000 --- a/datasets/open-targets-platform.yaml +++ /dev/null @@ -1,74 +0,0 @@ -Name: OpenTargets - Platform -Description: The Open Targets Platform is a comprehensive data integration tool that supports systematic identification and prioritisation of potential therapeutic drug targets. By integrating publicly available datasets including data generated by the Open Targets experimental and informatics research programmes, the Platform provides data and services to assist in the task of therapeutic hypothesis building. -Documentation: https://platform-docs.opentargets.org/ -Contact: outreach@opentargets.org -ManagedBy: Open Targets -UpdateFrequency: The data is release every three months. 
-Tags: - - drug targets - - drug discovery - - therapeutics - - targets - - diseases - - drugs - - gentropy - - variants - - credible sets -License: https://creativecommons.org/publicdomain/zero/1.0/ -Resources: - - Description: OpenTargets Release Data - ARN: arn:aws:s3:::open-targets-public-data-releases - Region: eu-west-1 - Type: S3 Bucket -DataAtWork: - Tutorials: - - Title: Platform datasets on AWS - URL: https://platform-docs.opentargets.org/data-access/platform-datasets-on-aws - NotebookURL: https://colab.research.google.com/github/opentargets/notebooks/blob/main/notebooks/reading_data_from_aws.ipynb - AuthorName: Daniel Suveges - AuthorURL: https://www.ebi.ac.uk/people/person/daniel-suveges/ - - Title: Autoimmune colocalisations - URL: https://github.com/opentargets/notebooks - NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/autoimmune_colocalisations.ipynb - AuthorName: Open Targets Team - AuthorURL: https://github.com/opentargets - - Title: Autoimmune credible sets - URL: https://github.com/opentargets/notebooks - NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/autoimmune_credible_set.ipynb - AuthorName: Open Targets Team - AuthorURL: https://github.com/opentargets - - Title: ChEMBL Evidence Data Download - URL: https://github.com/opentargets/notebooks - NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/chembl_evidence_download.ipynb - AuthorName: Open Targets Team - AuthorURL: https://github.com/opentargets - - Title: Exploration of Open Targets datasets - URL: https://github.com/opentargets/notebooks - NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/exploring_ot_datasets.ipynb - AuthorName: Open Targets Team - AuthorURL: https://github.com/opentargets - - Title: Open Targets informatics tools - URL: https://www.ebi.ac.uk/training/online/courses/open-targets-quick-tour/ - AuthorName: Helena Cornu - AuthorURL: 
https://www.ebi.ac.uk/people/person/helena-cornu/ - - Title: Getting started with the Open Targets Platform GraphQL API - URL: https://www.ebi.ac.uk/training/events/getting-started-open-targets-platform-graphql-api/ - AuthorName: Helena Cornu - AuthorURL: https://www.ebi.ac.uk/people/person/helena-cornu/ - Tools & Applications: - - Title: Open Targets Platform - URL: https://platform.opentargets.org/ - AuthorName: Open Targets - AuthorURL: https://github.com/opentargets/ot-ui-apps - - Title: Open Targets Platform API - URL: https://api.platform.opentargets.org/ - AuthorName: Open Targets - AuthorURL: https://github.com/opentargets/platform-api - Publications: - - Title: "Open Targets Platform: facilitating therapeutic hypotheses building in drug discovery" - URL: https://doi.org/10.1093/nar/gkae1128 - AuthorName: Annalisa Buniello - AuthorURL: https://orcid.org/0000-0002-4623-8642 -DeprecatedNotice: -ADXCategories: - - Healthcare & Life Sciences Data From 964ba05a912f42ecdfdde9e52c21ecec29b589d2 Mon Sep 17 00:00:00 2001 From: ricardo Date: Thu, 8 Jan 2026 11:32:26 +0000 Subject: [PATCH 731/751] add aditional information --- datasets/opentargets.yaml | 115 ++++++++++++++++++++++++-------------- 1 file changed, 74 insertions(+), 41 deletions(-) diff --git a/datasets/opentargets.yaml b/datasets/opentargets.yaml index 501416684..d2b79d12a 100644 --- a/datasets/opentargets.yaml +++ b/datasets/opentargets.yaml @@ -1,45 +1,78 @@ -Deprecated: True -DeprecatedNotice: Amazon is no longer hosting this Data Lakehouse Ready dataset -Name: Open Targets - Data Lakehouse Ready -Description: | - This a Parquet representation of the Open Targets Platform's [latest export](https://www.targetvalidation.org/downloads/data). The Open Targets Platform integrates evidence from genetics, genomics, transcriptomics, drugs, animal models and scientific literature to score and rank target-disease associations for drug target identification. 
The Open Targets Platform (https://www.targetvalidation.org) is a freely available resource for the integration of genetics, genomics, and chemical data to aid systematic drug target identification and prioritisation. - This dataset is 'Lakehouse Ready'. Meaning, you can query this data in-place straight out of the Registry of Open Data S3 bucket. [Deploy this dataset's corresponding CloudFormation template](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/quickcreate?templateUrl=https%3A%2F%2Faws-roda-hcls-datalake.s3.amazonaws.com%2FOpenTargets.latest.RodaTemplate.json&stackName=OpenTargets-Latest-RODA) to create the AWS Glue catalog entries into your account in about 30 seconds. That one step will enable you to write SQL with AWS Athena, build dashboards and charts with Amazon Quicksight, perform HPC with AWS EMR, or join into your AWS Redshift clusters. More detail in (the documentation)[https://github.com/aws-samples/data-lake-as-code/blob/roda/README.md. -Documentation: https://github.com/aws-samples/data-lake-as-code/blob/roda/docs/roda_install.md -Contact: https://github.com/aws-samples/data-lake-as-code/issues -ManagedBy: "[Amazon Web Services](https://aws.amazon.com/)" -UpdateFrequency: Within two weeks of new Open Targets releases -Tags: - - chemistry - - genetic - - genomic - - molecule - - life sciences - - biotech blueprint - - parquet -License: https://github.com/aws-samples/data-lake-as-code/blob/roda/docs/roda_attributions.txt -Resources: - - Description: Latest Open Targets release. Updates within two weeks of new Open Targets version. Information on Open Targets releases can be found [here](https://www.targetvalidation.org/downloads/data). - ARN: arn:aws:s3:::aws-roda-hcls-datalake/opentargets_latest/ - Region: us-east-1 +Name: Open Targets +Description: The Open Targets Platform is a comprehensive data integration tool that supports systematic identification and prioritisation of potential therapeutic drug targets. 
By integrating publicly available datasets including data generated by the Open Targets experimental and informatics research programmes, the Platform provides data and services to assist in the task of therapeutic hypothesis building. +Documentation: https://platform-docs.opentargets.org/ +Contact: outreach@opentargets.org +ManagedBy: Open Targets +UpdateFrequency: The data is released every three months. +Tags: + - drug targets + - drug discovery + - therapeutics + - targets + - diseases + - drugs + - gentropy + - variants + - credible sets +License: https://creativecommons.org/publicdomain/zero/1.0/ +Resources: + - Description: Open Targets Data + ARN: arn:aws:s3:::open-targets-public-data-releases + Region: eu-west-1 Type: S3 Bucket - - Description: Open Targets v20.06. Does not update. - ARN: arn:aws:s3:::aws-roda-hcls-datalake/opentargets_20_06/ - Region: us-east-1 - Type: S3 Bucket - - Description: Open Targets v19.11. Does not update - ARN: arn:aws:s3:::aws-roda-hcls-datalake/opentargets_1911/ - Region: us-east-1 - Type: S3 Bucket + - Description: Notifications for new Open Targets data releases + ARN: arn:aws:sns:eu-west-1:674693859687:open-targets-public-data-releases-object_created + Region: eu-west-1 + Type: SNS Topic DataAtWork: Tutorials: - - Title: Data Lake as Code Deployment Guide - URL: https://github.com/aws-samples/data-lake-as-code/blob/roda/docs/roda_install.md - AuthorName: AWS Biotech Blueprints Team - Services: - - Amazon Athena - - AWS Glue - - AWS Lake Formation + - Title: Platform datasets on AWS + URL: https://platform-docs.opentargets.org/data-access/platform-datasets-on-aws + NotebookURL: https://colab.research.google.com/github/opentargets/notebooks/blob/main/notebooks/reading_data_from_aws.ipynb + AuthorName: Daniel Suveges + AuthorURL: https://www.ebi.ac.uk/people/person/daniel-suveges/ + - Title: Autoimmune colocalisations + URL: https://github.com/opentargets/notebooks + NotebookURL: 
https://github.com/opentargets/notebooks/blob/main/notebooks/autoimmune_colocalisations.ipynb + AuthorName: Open Targets Team + AuthorURL: https://github.com/opentargets + - Title: Autoimmune credible sets + URL: https://github.com/opentargets/notebooks + NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/autoimmune_credible_set.ipynb + AuthorName: Open Targets Team + AuthorURL: https://github.com/opentargets + - Title: ChEMBL Evidence Data Download + URL: https://github.com/opentargets/notebooks + NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/chembl_evidence_download.ipynb + AuthorName: Open Targets Team + AuthorURL: https://github.com/opentargets + - Title: Exploration of Open Targets datasets + URL: https://github.com/opentargets/notebooks + NotebookURL: https://github.com/opentargets/notebooks/blob/main/notebooks/exploring_ot_datasets.ipynb + AuthorName: Open Targets Team + AuthorURL: https://github.com/opentargets + - Title: Open Targets informatics tools + URL: https://www.ebi.ac.uk/training/online/courses/open-targets-quick-tour/ + AuthorName: Helena Cornu + AuthorURL: https://www.ebi.ac.uk/people/person/helena-cornu/ + - Title: Getting started with the Open Targets Platform GraphQL API + URL: https://www.ebi.ac.uk/training/events/getting-started-open-targets-platform-graphql-api/ + AuthorName: Helena Cornu + AuthorURL: https://www.ebi.ac.uk/people/person/helena-cornu/ + Tools & Applications: + - Title: Open Targets Platform Web Application + URL: https://platform.opentargets.org/ + AuthorName: Open Targets + AuthorURL: https://github.com/opentargets/ot-ui-apps + - Title: Open Targets Platform API + URL: https://api.platform.opentargets.org/ + AuthorName: Open Targets + AuthorURL: https://github.com/opentargets/platform-api Publications: - - Title: Data Lake as Code, Featuring ChEMBL and Open Targets - URL: https://aws.amazon.com/blogs/startups/a-data-lake-as-code-featuring-chembl-and-opentargets/ - 
AuthorName: Paul Underwood + - Title: "Open Targets Platform: facilitating therapeutic hypotheses building in drug discovery" + URL: https://doi.org/10.1093/nar/gkae1128 + AuthorName: Annalisa Buniello + AuthorURL: https://orcid.org/0000-0002-4623-8642 +DeprecatedNotice: +ADXCategories: + - Healthcare & Life Sciences Data From 6fc437795f379695ab28246041b7862dcb15e3f8 Mon Sep 17 00:00:00 2001 From: Adrienne Lowney <150190866+alowney@users.noreply.github.com> Date: Thu, 8 Jan 2026 09:26:33 -0700 Subject: [PATCH 732/751] Add updated Alaska and West Coast datasets Updated dataset information and added new resources for the WPTO US Wave dataset. --- datasets/wpto-pds-us-wave.yaml | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/datasets/wpto-pds-us-wave.yaml b/datasets/wpto-pds-us-wave.yaml index e72bfa821..27dc8a9ff 100644 --- a/datasets/wpto-pds-us-wave.yaml +++ b/datasets/wpto-pds-us-wave.yaml @@ -81,6 +81,18 @@ Resources: Type: S3 Bucket Explore: - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=wpto-pds-us-wave&prefix=v1.0.0%2FGulf_of_Mexico_and_Puerto_Rico%2F)' + - Description: Updated version of 42 Year Wave Hindcast (1979-2020) for Alaska at 3-hour temporal resolution and down to 200m spatial resolution in [HDF5](https://portal.hdfgroup.org/display/HDF5/HDF5) format. Updates resolve issues with NaNs. + ARN: arn:aws:s3:::wpto-pds-us-wave/v1.0.1/Alaska/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=wpto-pds-us-wave&prefix=v1.0.1%2FAlaska%2F)' + - Description: Updated version of 42 Year Wave Hindcast (1979-2020) for the West Coast of the United States at 3-hour temporal resolution and down to 200m spatial resolution in [HDF5](https://portal.hdfgroup.org/display/HDF5/HDF5) format. Updates resolve issues with NaNs. 
+ ARN: arn:aws:s3:::wpto-pds-us-wave/v1.0.1/West_Coast/ + Region: us-west-2 + Type: S3 Bucket + Explore: + - '[Browse Dataset](https://data.openei.org/s3_viewer?bucket=wpto-pds-us-wave&prefix=v1.0.1%2FWest_Coast%2F)' DataAtWork: Tutorials: Tools & Applications: @@ -109,3 +121,4 @@ DataAtWork: - Title: Development and validation of a regional-scale high-resolution unstructured model for wave energy resource characterization along the US East Coast URL: https://doi.org/10.1016/j.renene.2019.01.020 AuthorName: Allahdadi, M.N., Gunawan, J. Lai, R. He, V.S. Neary + From a55fe591c729094128ac35e189412496f7f00541 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 8 Jan 2026 11:58:23 -0500 Subject: [PATCH 733/751] ok: Create anvilproject.yaml In place of https://github.com/awslabs/open-data-registry/pull/2919/files --- datasets/anvilproject.yaml | 207 +++++++++++++++++++++++++++++++++++++ 1 file changed, 207 insertions(+) create mode 100644 datasets/anvilproject.yaml diff --git a/datasets/anvilproject.yaml b/datasets/anvilproject.yaml new file mode 100644 index 000000000..dfae48ea1 --- /dev/null +++ b/datasets/anvilproject.yaml @@ -0,0 +1,207 @@ +Name: NHGRI AnVIL Project + +Description: "The NHGRI Analysis, Visualization, and Informatics Lab-space + (AnVIL) Project (https://anvilproject.org/) is the National Human Genome + Research Institute's cloud-based platform for genomic data sharing and + analysis. AnVIL hosts widely used human genome reference datasets generated + through NHGRI-funded research. AnVIL on Open Data on AWS provides public + access to open-access datasets available through AnVIL. 
The project is a + collaborative effort involving NHGRI, the Broad Institute, Johns Hopkins + University, the University of California Santa Cruz, Vanderbilt University + Medical Center, Brigham and Women's Hospital, the Carnegie Institution for + Science, the City University of New York, the Fred Hutchinson Cancer Research + Center, Harvard University, Oregon Health & Science University, Massachusetts + General Hospital, Moffitt Cancer Center, Penn State University, and Washington + University." + +Documentation: "https://explore.anvilproject.org/datasets" + +Contact: "https://anvilproject.org/help" + +ManagedBy: "The AnVIL Project, and UC Santa Cruz Genomics Institute, University of California, Santa Cruz (UCSC)" + +UpdateFrequency: Quarterly + +Tags: + - life sciences + - biology + - genome + - genomic + - gene expression + - Homo sapiens + +License: "https://anvilproject.org/faq/data-security" + +Citation: "Schatz MC, Philippakis AA, Afgan E, Banks E, Carey VJ, Carroll RJ, et al. [Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL)](https://www.cell.com/cell-genomics/fulltext/S2666-979X(21)00106-3). Cell Genomics. 2022;2. doi:10.1016/j.xgen.2021.100085" + +Resources: + - Description: "An S3 bucket containing all publicly accessible data files in the AnVIL Project. 
The bucket layout and access procedures are documented at https://github.com/DataBiosphere/azul/blob/develop/docs/mirror.rst and metadata can be viewed at https://explore.anvilproject.org/datasets or accessed programmatically at https://service.explore.anvilproject.org/" + ARN: arn:aws:s3:::humancellatlas + Region: us-east-1 + Type: S3 Bucket + Explore: + - "[Data Browser UI](https://explore.anvilproject.org/datasets)" + - "[Azul REST Web Service](https://service.explore.anvilproject.org/)" + - Description: "Notifications for new NHGRI AnVIL data" + ARN: arn:aws:sns:us-east-1:160936121715:anvilproject-object_created + Region: us-east-1 + Type: SNS Topic + +DataAtWork: + Publications: + - Title: "Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL)" + URL: "https://doi.org/10.1016/j.xgen.2021.100085" + AuthorName: "Michael C. Schatz, Anthony A. Philippakis, Enis Afgan, Eric + Banks, Vincent J. Carey, Robert J. Carroll, Alessandro Culotti, Kyle + Ellrott, Jeremy Goecks, Robert L. Grossman, Ira M. Hall, Kasper D. + Hansen, Jonathan Lawson, Jeffrey T. Leek, Anne O’Donnell Luria, Stephen + Mosher, Martin Morgan, Anton Nekrutenko, Brian D. O’Connor, Kevin + Osborn, Benedict Paten, Candace Patterson, Frederick J. Tan, Casey + Overby Taylor, Jennifer Vessio, Levi Waldron, Ting Wang, Kristin + Wuichet, AnVIL Team" + - Title: "Beyond the Human Genome Project: The Age of Complete Human Genome + Sequences and Pangenome References" + URL: "https://doi.org/10.1146/annurev-genom-021623-081639" + AuthorName: "Dylan J. Taylor, Jordan M. Eizenga, Qiuhui Li, Arun Das, + Katharine M. Jenike, Eimear E. Kenny, Karen H. Miga, Jean Monlong, Rajiv + C. McCoy, Benedict Paten, and Michael C. Schatz" + - Title: "CNPI: Rapid Analyses of Human Copy Number Data" + URL: "https://doi.org/10.1016/j.jmb.2025.169313" + AuthorName: "Jack Ustanik, Tychele N. 
Turner" + - Title: "The Galaxy platform for accessible, reproducible, and + collaborative data analyses: 2024 update" + URL: "https://doi.org/10.1093/nar/gkae410" + AuthorName: "The Galaxy Community" + - Title: "The complete sequence of a human Y chromosome" + URL: "https://doi.org/10.1038/s41586-023-06457-y" + AuthorName: "Arang Rhie, Sergey Nurk, Monika Cechova, Savannah J. Hoyt, + Dylan J. Taylor, Nicolas Altemose, Paul W. Hook, Sergey Koren, Mikko + Rautiainen, Ivan A. Alexandrov, Jamie Allen, Mobin Asri, Andrey V. + Bzikadze, Nae-Chyun Chen, Chen-Shan Chin, Mark Diekhans, Paul Flicek, + Giulio Formenti, Arkarachai Fungtammasan, Carlos Garcia Giron, Erik + Garrison, Ariel Gershman, Jennifer L. Gerton, Patrick G. S. Grady, + Andrea Guarracino, Leanne Haggerty, Reza Halabian, Nancy F. Hansen, + Robert Harris, Gabrielle A. Hartley, William T. Harvey, Marina Haukness, + Jakob Heinz, Thibaut Hourlier, Robert M. Hubley, Sarah E. Hunt, Stephen + Hwang, Miten Jain, Rupesh K. Kesharwani, Alexandra P. Lewis, Heng Li, + Glennis A. Logsdon, Julian K. Lucas, Wojciech Makalowski, Christopher + Markovic, Fergal J. Martin, Ann M. Mc Cartney, Rajiv C. McCoy, Jennifer + McDaniel, Brandy M. McNulty, Paul Medvedev, Alla Mikheenko, Katherine M. + Munson, Terence D. Murphy, Hugh E. Olsen, Nathan D. Olson, Luis F. + Paulin, David Porubsky, Tamara Potapova, Fedor Ryabov, Steven L. + Salzberg, Michael E. G. Sauria, Fritz J. Sedlazeck, Kishwar Shafin, + Valery A. Shepelev, Alaina Shumate, Jessica M. Storer, Likhitha + Surapaneni, Angela M. Taravella Oill, Françoise Thibaud-Nissen, Winston + Timp, Marta Tomaszkiewicz, Mitchell R. Vollger, Brian P. Walenz, Allison + C. Watwood, Matthias H. Weissensteiner, Aaron M. Wenger, Melissa A. + Wilson, Samantha Zarate, Yiming Zhu, Justin + M. Zook, Evan E. Eichler, Rachel J. O’Neill, Michael C. Schatz, Karen H. + Miga, Kateryna D. Makova, Adam M. 
Phillippy" + - Title: "Approaching complete genomes, transcriptomes and epi-omes with + accurate long-read sequencing" + URL: "https://doi.org/10.1038/s41592-022-01716-8" + AuthorName: "Sam Kovaka, Shujun Ou, Katharine M. Jenike, Michael C. + Schatz" + - Title: "The complete sequence and comparative analysis of ape sex + chromosomes" + URL: "https://doi.org/10.1038/s41586-024-07473-2" + AuthorName: "Kateryna D. Makova, Brandon D. Pickett, Robert S. Harris, + Gabrielle A. Hartley, Monika Cechova, Karol Pal, Sergey Nurk, DongAhn + Yoo, Qiuhui Li, Prajna Hebbar, Barbara C. McGrath, Francesca Antonacci, + Margaux Aubel, Arjun Biddanda, Matthew Borchers, Erich Bornberg-Bauer, + Gerard G. Bouffard, Shelise Y. Brooks, Lucia Carbone, Laura Carrel, + Andrew Carroll, Pi-Chuan Chang, Chen-Shan Chin, Daniel E. Cook, Sarah J. + C. Craig, Luciana de Gennaro, Mark Diekhans, Amalia Dutra, Gage H. + Garcia, Patrick G. S. Grady, Richard E. Green, Diana Haddad, Pille + Hallast, William T. Harvey, Glenn Hickey, David A. Hillis, Savannah J. + Hoyt, Hyeonsoo Jeong, Kaivan Kamali, Sergei L. Kosakovsky Pond, Troy + M. LaPolice, Charles Lee, Alexandra P. Lewis, Yong-Hwee E. Loh, + Patrick Masterson, Kelly M. McGarvey, Rajiv C. McCoy, Paul Medvedev, + Karen H. Miga, Katherine M. Munson, Evgenia Pak, Benedict Paten, + Brendan J. Pinto, Tamara Potapova, Arang Rhie, Joana L. Rocha, Fedor + Ryabov, Oliver A. Ryder, Samuel Sacco, Kishwar Shafin, Valery A. + Shepelev, Viviane Slon, Steven J. Solar, Jessica M. Storer, Peter H. + Sudmant, Sweetalana, Alex Sweeten, Michael G. Tassia, Françoise + Thibaud-Nissen, Mario Ventura, Melissa A. Wilson, Alice C. Young, + Huiqing Zeng, Xinru Zhang, Zachary A. Szpiech, Christian D. Huber, + Jennifer L. Gerton, Soojin V. Yi, Michael C. Schatz, Ivan A. + Alexandrov, Sergey Koren, Rachel J. O’Neill, Evan E. Eichler, Adam M. 
+ Phillippy" + - Title: "Scalable Nanopore sequencing of human genomes provides a + comprehensive view of haplotype-resolved variation and methylation" + URL: "https://doi.org/10.1038/s41592-023-01993-x" + AuthorName: "Mikhail Kolmogorov, Kimberley J. Billingsley, Mira Mastoras, + Melissa Meredith, Jean Monlong, Ryan Lorig-Roach, Mobin Asri, Pilar + Alvarez Jerez, Laksh Malik, Ramita Dewan, Xylena Reed, Rylee M. Genner, + Kensuke Daida, Sairam Behera, Kishwar Shafin, Trevor Pesout, Jeshuwin + Prabakaran, Paolo Carnevali, Jianzhi Yang, Arang Rhie, Sonja W. Scholz, + Bryan J. Traynor, Karen H. Miga, Miten Jain, Winston Timp, Adam M. + Phillippy, Mark Chaisson, Fritz J. Sedlazeck, Cornelis Blauwendraat, + Benedict Paten" + - Title: "The Human Pangenome Project: a global resource to map genomic + diversity" + URL: "https://doi.org/10.1038/s41586-022-04601-8" + AuthorName: "Ting Wang, Lucinda Antonacci-Fulton, Kerstin Howe, Heather A. + Lawson, Julian K. Lucas, Adam M. Phillippy, Alice B. Popejoy, Mobin + Asri, Caryn Carson, Mark J. P. Chaisson, Xian Chang, Robert Cook-Deegan, + Adam L. Felsenfeld, Robert S. Fulton, Erik P. Garrison, Nanibaa’ A. + Garrison, Tina A. Graves-Lindsay, Hanlee Ji, Eimear E. Kenny, Barbara A. + Koenig, Daofeng Li, Tobias Marschall, Joshua F. McMichael, Adam M. + Novak, Deepak Purushotham, Valerie A. Schneider, Baergen I. Schultz, + Michael W. Smith, Heidi J. Sofia, Tsachy Weissman, Paul Flicek, Heng Li, + Karen H. Miga, Benedict Paten, Erich D. Jarvis, Ira M. Hall, Evan E. + Eichler, David Haussler, the Human Pangenome Reference Consortium" + - Title: "Deciphering the impact of genomic variation on function" + URL: "https://doi.org/10.1038/s41586-024-07510-0" + AuthorName: "IGVF Consortium" + - Title: "A complete reference genome improves analysis of human genetic variation" + URL: "https://doi.org/10.1126/science.abl3533" + AuthorName: "Sergey Aganezov, Stephanie M. Yan, Daniela C. 
Soto, Melanie + Kirsche, Samantha Zarate, Pavel Avdeyev, Dylan J. Taylor, Kishwar + Shafin, Alaina Shumate, Chunlin Xiao, Justin Wagner, Jennifer McDaniel, + Nathan D. Olson, Michael E. G. Sauria, Mitchell R. Vollger, Arang Rhie, + Melissa Meredith, Skylar Martin, Joyce Lee, Sergey Koren, Jeffrey A. + Rosenfeld, Benedict Paten, Ryan Layer, Chen-Shan Chin, Fritz J. + Sedlazeck, Nancy F. Hansen, Danny E. Miller, Adam M. Phillippy, Karen H. + Miga, Rajiv C. McCoy, Megan Y. Dennis, Justin M. Zook, Michael C. + Schatz" + - Title: "Jasmine and Iris: population-scale structural variant comparison + and analysis" + URL: "https://doi.org/10.1038/s41592-022-01753-3" + AuthorName: "Melanie Kirsche, Gautam Prabhu, Rachel Sherman, Bohan Ni, + Alexis Battle, Sergey Aganezov, Michael C. Schatz" + - Title: "A draft human pangenome reference" + URL: "https://www.nature.com/articles/s41586-023-05896-x" + AuthorName: "Wen-Wei Liao, Mobin Asri, Jana Ebler, Daniel Doerr, Marina + Haukness, Glenn Hickey, Shuangjia Lu, Julian K. Lucas, Jean Monlong, + Haley J. Abel, Silvia Buonaiuto, Xian H. Chang, Haoyu Cheng, Justin Chu, + Vincenza Colonna, Jordan M. Eizenga, Xiaowen Feng, Christian Fischer, + Robert S. Fulton, Shilpa Garg, Cristian Groza, Andrea Guarracino, + William T. Harvey, Simon Heumos, Kerstin Howe, Miten Jain, Tsung-Yu Lu, + Charles Markello, Fergal J. Martin, Matthew W. Mitchell, Katherine M. + Munson, Moses Njagi Mwaniki, Adam M. Novak, Hugh E. Olsen, Trevor + Pesout, David Porubsky, Pjotr Prins, Jonas A. Sibbesen, Jouni Sirén, + Chad Tomlinson, Flavia Villani, Mitchell R. Vollger, Lucinda L. + Antonacci-Fulton, Gunjan Baid, Carl A. Baker, Anastasiya Belyaeva, + Konstantinos Billis, Andrew Carroll, Pi-Chuan Chang, Sarah Cody, Daniel + E. Cook, Robert M. Cook-Deegan, Omar E. Cornejo, Mark Diekhans, Peter + Ebert, Susan Fairley, Olivier Fedrigo, Adam L. Felsenfeld, Giulio + Formenti, Adam Frankish, Yan Gao, Nanibaa’ A. Garrison, Carlos Garcia + Giron, Richard E. 
Green, Leanne Haggerty, Kendra Hoekzema, Thibaut + Hourlier, Hanlee P. Ji, Eimear E. Kenny, Barbara A. Koenig, Alexey + Kolesnikov, Jan O. Korbel, Jennifer Kordosky, Sergey Koren, HoJoon Lee, + Alexandra P. Lewis, Hugo Magalhães, Santiago Marco-Sola, Pierre Marijon, + Ann McCartney, Jennifer McDaniel, Jacquelyn Mountcastle, Maria + Nattestad, Sergey Nurk, Nathan D. Olson, Alice B. Popejoy, Daniela Puiu, + Mikko Rautiainen, Allison A. Regier, Arang Rhie, Samuel Sacco, Ashley D. + Sanders, Valerie A. Schneider, Baergen I. Schultz, Kishwar Shafin, + Michael W. Smith, Heidi J. Sofia, Ahmad N. Abou Tayoun, Françoise + Thibaud-Nissen, Francesca Floriana Tricomi, Justin Wagner, Brian Walenz, + Jonathan M. D. Wood, Aleksey V. Zimin, Guillaume Bourque, Mark J. P. + Chaisson, Paul Flicek, Adam M. Phillippy, Justin M. Zook, Evan E. + Eichler, David Haussler, Ting Wang, Erich D. Jarvis, Karen H. Miga, Erik + Garrison, Tobias Marschall, Ira M. Hall, Heng Li, Benedict Paten" + +ADXCategories: + - Healthcare & Life Sciences Data + From 1b829960962701e9d49549759605403b0cb5d886 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Thu, 8 Jan 2026 12:34:51 -0500 Subject: [PATCH 734/751] ok: Update anvilproject.yaml --- datasets/anvilproject.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/anvilproject.yaml b/datasets/anvilproject.yaml index dfae48ea1..56cccb605 100644 --- a/datasets/anvilproject.yaml +++ b/datasets/anvilproject.yaml @@ -36,7 +36,7 @@ Citation: "Schatz MC, Philippakis AA, Afgan E, Banks E, Carey VJ, Carroll RJ, et Resources: - Description: "An S3 bucket containing all publicly accessible data files in the AnVIL Project. 
The bucket layout and access procedures are documented at https://github.com/DataBiosphere/azul/blob/develop/docs/mirror.rst and metadata can be viewed at https://explore.anvilproject.org/datasets or accessed programmatically at https://service.explore.anvilproject.org/" - ARN: arn:aws:s3:::humancellatlas + ARN: arn:aws:s3:::anvilproject Region: us-east-1 Type: S3 Bucket Explore: From b49e5ed84d182a90f2d09c19ffe2f8b88841885a Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Thu, 8 Jan 2026 09:32:17 -0900 Subject: [PATCH 735/751] ok: Update wpto-pds-us-wave.yaml --- datasets/wpto-pds-us-wave.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/wpto-pds-us-wave.yaml b/datasets/wpto-pds-us-wave.yaml index 27dc8a9ff..591f1382a 100644 --- a/datasets/wpto-pds-us-wave.yaml +++ b/datasets/wpto-pds-us-wave.yaml @@ -121,4 +121,3 @@ DataAtWork: - Title: Development and validation of a regional-scale high-resolution unstructured model for wave energy resource characterization along the US East Coast URL: https://doi.org/10.1016/j.renene.2019.01.020 AuthorName: Allahdadi, M.N., Gunawan, J. Lai, R. He, V.S. Neary - From a500e98e2831735c92da2e0dc0f6f01096522c8a Mon Sep 17 00:00:00 2001 From: ricardo Date: Fri, 9 Jan 2026 11:16:56 +0000 Subject: [PATCH 736/751] update tags --- datasets/opentargets.yaml | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/datasets/opentargets.yaml b/datasets/opentargets.yaml index d2b79d12a..0772eaa0d 100644 --- a/datasets/opentargets.yaml +++ b/datasets/opentargets.yaml @@ -5,15 +5,13 @@ Contact: outreach@opentargets.org ManagedBy: Open Targets UpdateFrequency: The data is released every three months. 
Tags: - - drug targets - - drug discovery - - therapeutics - - targets - - diseases - - drugs - - gentropy - - variants - - credible sets + - target discovery + - genetics + - drug development + - disease + - GWAS + - tractability + - safety License: https://creativecommons.org/publicdomain/zero/1.0/ Resources: - Description: Open Targets Data From 4510e29368bf405a7e7dabd97bee8980594ead6b Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 9 Jan 2026 10:54:04 -0500 Subject: [PATCH 737/751] ok: Update opentargets.yaml --- datasets/opentargets.yaml | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/datasets/opentargets.yaml b/datasets/opentargets.yaml index 0772eaa0d..00bb11463 100644 --- a/datasets/opentargets.yaml +++ b/datasets/opentargets.yaml @@ -5,13 +5,14 @@ Contact: outreach@opentargets.org ManagedBy: Open Targets UpdateFrequency: The data is released every three months. Tags: - - target discovery + - drug discovery - genetics - - drug development - - disease - - GWAS - - tractability - - safety + - life sciences + - genomic + - aws-pds + - bioinformatics + - biology + - protein License: https://creativecommons.org/publicdomain/zero/1.0/ Resources: - Description: Open Targets Data From 2232002d8d2fd4ee8a89e472750d8114def834c3 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 9 Jan 2026 11:17:16 -0500 Subject: [PATCH 738/751] ok: Update opentargets.yaml --- datasets/opentargets.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/opentargets.yaml b/datasets/opentargets.yaml index 00bb11463..c2f2541fd 100644 --- a/datasets/opentargets.yaml +++ b/datasets/opentargets.yaml @@ -72,6 +72,5 @@ DataAtWork: URL: https://doi.org/10.1093/nar/gkae1128 AuthorName: Annalisa Buniello AuthorURL: https://orcid.org/0000-0002-4623-8642 -DeprecatedNotice: ADXCategories: - Healthcare & Life Sciences Data From e6c14c0f402527c9bb16aa5cdc704535e9ae3178 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 9 Jan 2026 11:30:52 
-0500 Subject: [PATCH 739/751] ok: Update opentargets.yaml --- datasets/opentargets.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/datasets/opentargets.yaml b/datasets/opentargets.yaml index c2f2541fd..84f5bc607 100644 --- a/datasets/opentargets.yaml +++ b/datasets/opentargets.yaml @@ -5,7 +5,6 @@ Contact: outreach@opentargets.org ManagedBy: Open Targets UpdateFrequency: The data is released every three months. Tags: - - drug discovery - genetics - life sciences - genomic From f9810162982e78438efa5eb1bbc1198674977df9 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 9 Jan 2026 11:35:12 -0500 Subject: [PATCH 740/751] ok: Update opentargets.yaml --- datasets/opentargets.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/opentargets.yaml b/datasets/opentargets.yaml index 84f5bc607..2e240fe08 100644 --- a/datasets/opentargets.yaml +++ b/datasets/opentargets.yaml @@ -12,6 +12,7 @@ Tags: - bioinformatics - biology - protein + - drug discovery License: https://creativecommons.org/publicdomain/zero/1.0/ Resources: - Description: Open Targets Data From e390cf77a0531ed317659fbfb067c6861799df8e Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 9 Jan 2026 11:35:52 -0500 Subject: [PATCH 741/751] ok: Update opentargets.yaml --- datasets/opentargets.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/opentargets.yaml b/datasets/opentargets.yaml index 2e240fe08..4e5ec456b 100644 --- a/datasets/opentargets.yaml +++ b/datasets/opentargets.yaml @@ -5,7 +5,7 @@ Contact: outreach@opentargets.org ManagedBy: Open Targets UpdateFrequency: The data is released every three months. 
Tags: - - genetics + - genetic - life sciences - genomic - aws-pds From af5d5bb6f77463fe69fa4cab597b84c683d02f60 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 9 Jan 2026 14:01:12 -0500 Subject: [PATCH 742/751] Create smaht.yaml --- datasets/smaht.yaml | 89 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) create mode 100644 datasets/smaht.yaml diff --git a/datasets/smaht.yaml b/datasets/smaht.yaml new file mode 100644 index 000000000..a6aaa9f6e --- /dev/null +++ b/datasets/smaht.yaml @@ -0,0 +1,89 @@ +Name: Somatic Mosaicism across Humnan Tissues (SMaHT) +Description: | + The Somatic Mosaicism across Human Tissues (SMaHT) project is an NIH Common + Fund consortium (2023-) aimed to comprehensively characterize somatic variation + ("mosaicism") in normal human tissues. While most genetic studies have relied + on blood-derived DNA, SMaHT captures the full spectrum of DNA variation across + cell types, tissues, and organs from phenotypically normal individuals to + better understand the role of somatic mosaicism in human development, aging, + and disease progression. + + Researchers in the consortium develop and apply experimental and computational + methods, paired with the state-of-the-art sequencing technologies, to accurately + detect even rare mutations (frequency < 1%) in subpopulations of cells. In + addition to generating the production data across ~20 tissue types from 150 + post-mortem donors, SMaHT also produces datasets from cell line and tissue + homogenate samples, to benchmark and develop new technologies and computational + tools for mosaic variant detection. + + The resulting data include high-coverage whole-genome and transcriptome data + using both short-read and long-read sequencing technologies from multiple platforms + (e.g., Illumina, PacBio, Oxford Nanopore Technologies, Ultima Genomics). SMaHT will + also generate comprehensive genome-wide catalogs of somatic variants. 
We anticipate + that this resource will be valuable not only for researchers studying somatic + mosaicism, but also for the broader scientific community interested in large-scale + WGS data from normal human tissues. More about the SMaHT project: + program announcement, https://commonfund.nih.gov/smaht, and https://smaht.org/. + More about the data portal: https://data.smaht.org/ and types of data generated: + https://data.smaht.org/about/consortium/data +Documentation: https://data.smaht.org/docs +Contact: smhelp@hms-dbmi.atlassian.net +ManagedBy: SMaHT Data Analysis Center (DAC) +UpdateFrequency: Bi-annually +Tags: + - biology + - bioinformatics + - genetic + - genomic + - imaging + - life sciences + - whole genome sequencing + - bam +License: NIH Genomic Data Sharing Policy: https://gdc.cancer.gov/access-data/data-access-policies +Citation: The SMaHT datasets were generated as part of the NIH Common Fund consortium initiative, + Somatic Mosaicism across Human Tissues (SMaHT). The SMaHT datasets are submitted under dbGaP + studies (http://www.ncbi.nlm.nih.gov/gap), with the study accession numbers, phs004193 for the + SMaHT Benchmarking data and phs004194 for the SMaHT Production data. The datasets were provided + by the SMaHT Data Analysis Center (DAC) [1UM1DA058230] on behalf of the SMaHT network. More + information about the SMaHT Network is available online at https://smaht.org/, about the SMaHT + Data Portal at https://data.smaht.org/ , and types of data generated by the Network at + https://data.smaht.org/about/consortium/data +Resources: + - Description: | + SMaHT Open-Access Data - Publicly available data files without restriction, including + aligned reads from WGS and RNA-Seq, as well as variants identified from cell line + samples that are commercially available without restriction. Somatic (non-inherited) + variants from donor tissue samples are also open-access data. 
+ ARN: arn:aws:s3:::smaht-open-data-public + Region: us-east-1 + Type: S3 Bucket + - Description: | + SMaHT Controlled Access Data - Controlled-access data files, including aligned reads + from WGS and RNA-Seq, as well as germline (inherited) from donor tissue samples. + Access to these data is managed through dbGaP. + ARN: arn:aws:s3:::smaht-open-data-protected + Region: us-east-1 + Type: S3 Bucket + ControlledAccess: https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs004194 + - Description: > + Amazon SNS topic that publishes notifications when public access data is added for this dataset. + ARN: arn:aws:sns:us-east-1:874962955096:smaht-open-data-public-object_created + Region: us-east-1 + Type: SNS Topic + - Description: | + Amazon SNS topic that publishes notifications when new controlled access data is added for this dataset. + ARN: arn:aws:sns:us-east-1:874962955096:smaht-open-data-protected-object_created + Region: us-east-1 + Type: SNS Topic +DataAtWork: + Tools & Applications: + - Title: Somatic Mosaicism across Human Tissues Data Portal + URL: https://data.smaht.org/ + AuthorName: SMaHT Data Analysis Center (DAC) + Publications: + - Title: The Somatic Mosaicism across Human Tissues Network + URL: https://www.nature.com/articles/s41586-025-09096-7 + AuthorName: Coorens T, Oh J, Choi Y, Lim N, Zhao B, Voshall A et al. 
+ADXCategories: + - Healthcare & Life Sciences Data + From 36708528425e9f836525a2ec9da79c250c4c9907 Mon Sep 17 00:00:00 2001 From: willmacs <103065262+willmacs@users.noreply.github.com> Date: Fri, 9 Jan 2026 14:05:59 -0500 Subject: [PATCH 743/751] Pointing at some new services --- datasets/radarsat-1.yaml | 29 +++++++++++++---------------- datasets/rcm-ceos-ard.yaml | 2 +- 2 files changed, 14 insertions(+), 17 deletions(-) diff --git a/datasets/radarsat-1.yaml b/datasets/radarsat-1.yaml index f262cc19f..395400db2 100644 --- a/datasets/radarsat-1.yaml +++ b/datasets/radarsat-1.yaml @@ -1,13 +1,19 @@ Name: RADARSAT-1 Description: "Developed and operated by the Canadian Space Agency, it is Canada's first commercial Earth observation satellite +
    +
    + Développé et exploité par l'Agence spatiale canadienne, il s'agit du premier satellite commercial d'observation de la Terre au Canada." Documentation: https://www.asc-csa.gc.ca/eng/satellites/radarsat1/what-is-radarsat1.asp Contact: https://www.eodms-sgdot.nrcan-rncan.gc.ca ManagedBy: "[Natural Resources Canada](https://nrcan.gc.ca/)" -UpdateFrequency: "Products are added on an adhoc basis driven by prioritized foreign repatriation efforts and new processing orders from the raw archive. +UpdateFrequency: "NRCan opened a new [RADARSAT-1 Processing service](https://github.com/eodms-sgdot/radarsat-notebooks/blob/main/examples/radarsat1_l1_processing.ipynb) to the public on December 18, 2025. The processor allows users to produce image products from the original raw data archive free of charge, with the primary goal of making previously unprocessed data accessible for the first time. + +
    +
    -Les produits sont ajoutés de manière ponctuelle en fonction des efforts prioritaires de rapatriement à l'étranger et des nouvelles commandes de traitement à partir des archives brutes." +RNCan a ouvert un nouveau [service de traitement RADARSAT-1](https://github.com/eodms-sgdot/radarsat-notebooks/blob/main/examples/radarsat1_l1_processing.ipynb) au public le 18 décembre 2025. Le processeur permet aux utilisateurs de produire des produits d'imagerie à partir des archives de données brutes originales gratuitement, avec l'objectif principal de rendre accessible pour la première fois les données jamais traitées auparavant." Collabs: ASDI: Tags: @@ -31,20 +37,11 @@ Resources: Type: S3 Bucket DataAtWork: Tutorials: - - Title: PCI Geomatics Webinar | Cloud enabling earth observation archives - URL: https://www.youtube.com/watch?v=SvejDH5-Hic - AuthorName: CATALYST - AuthorURL: https://catalyst.earth - Services: - Tools & Applications: - - Title: EODMS RAPI Client Python Script - URL: https://github.com/nrcan-eodms-sgdot-rncan/eodms-rapi-orderdownload - AuthorName: Earth Observation Data Management System - Natural Resources Canada - AuthorURL: https://www.eodms-sgdot.nrcan-rncan.gc.ca/index-en.html - - Title: RADARSTAC - URL: https://www.radarstac.com - AuthorName: Sparkgeo Consulting Inc. 
- AuthorURL: https://sparkgeo.com/ + - Title: RADARSAT-1 Processing Service | Service de traitement RADARSAT-1 + URL: https://github.com/eodms-sgdot/radarsat-notebooks/blob/main/examples/radarsat1_l1_processing.ipynb + AuthorName: Canada Centre for Remote Sensing | Centre canadien de télédétection + AuthorURL: https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing + Tools & Applications: - Title: QGIS URL: https://qgis.org AuthorName: German QGIS user group diff --git a/datasets/rcm-ceos-ard.yaml b/datasets/rcm-ceos-ard.yaml index fb15cdd7b..e4f58a89c 100644 --- a/datasets/rcm-ceos-ard.yaml +++ b/datasets/rcm-ceos-ard.yaml @@ -46,7 +46,7 @@ Resources: DataAtWork: Tutorials: - Title: Workflows for accessing and manipulating RCM ARD SpatioTemporal Asset Catalog (STAC) in JupyterLab Python Notebooks - Flux de travail pour accéder et manipuler le catalogue d'actifs spatio-temporels (STAC) RCM ARD dans les notebooks Python JupyterLab - URL: https://github.com/eodms-sgdot/rcm-ard-stac-examples + URL: https://github.com/eodms-sgdot/radarsat-notebooks AuthorName: Canada Centre for Remote Sensing | Centre canadien de télédétection AuthorURL: https://natural-resources.canada.ca/science-data/science-research/research-centres/canada-centre-remote-sensing Publications: From 956a392a92148024d34db8a184c4e73ee31012f0 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 9 Jan 2026 14:24:46 -0500 Subject: [PATCH 744/751] ok: Update smaht.yaml --- datasets/smaht.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/datasets/smaht.yaml b/datasets/smaht.yaml index a6aaa9f6e..13926cfda 100644 --- a/datasets/smaht.yaml +++ b/datasets/smaht.yaml @@ -39,6 +39,7 @@ Tags: - life sciences - whole genome sequencing - bam + - aws-pds License: NIH Genomic Data Sharing Policy: https://gdc.cancer.gov/access-data/data-access-policies Citation: The SMaHT datasets were generated as part of the NIH Common Fund consortium initiative, 
Somatic Mosaicism across Human Tissues (SMaHT). The SMaHT datasets are submitted under dbGaP From 5dcb0d57c448385ee92348bfd2adfb24e26ff313 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 9 Jan 2026 14:34:54 -0500 Subject: [PATCH 745/751] ok: Update smaht.yaml --- datasets/smaht.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/smaht.yaml b/datasets/smaht.yaml index 13926cfda..0fdbbe968 100644 --- a/datasets/smaht.yaml +++ b/datasets/smaht.yaml @@ -40,7 +40,7 @@ Tags: - whole genome sequencing - bam - aws-pds -License: NIH Genomic Data Sharing Policy: https://gdc.cancer.gov/access-data/data-access-policies +License: NIH Genomic Data Sharing Policy - https://gdc.cancer.gov/access-data/data-access-policies Citation: The SMaHT datasets were generated as part of the NIH Common Fund consortium initiative, Somatic Mosaicism across Human Tissues (SMaHT). The SMaHT datasets are submitted under dbGaP studies (http://www.ncbi.nlm.nih.gov/gap), with the study accession numbers, phs004193 for the From b0a28e7a6ac6d5ea9a2ab9f42b8be995b9036918 Mon Sep 17 00:00:00 2001 From: Beryl Rabindran Date: Fri, 9 Jan 2026 16:20:03 -0500 Subject: [PATCH 746/751] ok: Update smaht.yaml --- datasets/smaht.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/smaht.yaml b/datasets/smaht.yaml index 0fdbbe968..e80aa7f83 100644 --- a/datasets/smaht.yaml +++ b/datasets/smaht.yaml @@ -1,4 +1,4 @@ -Name: Somatic Mosaicism across Humnan Tissues (SMaHT) +Name: Somatic Mosaicism across Human Tissues (SMaHT) Description: | The Somatic Mosaicism across Human Tissues (SMaHT) project is an NIH Common Fund consortium (2023-) aimed to comprehensively characterize somatic variation From f2f514890f94289ff3504a6e319b8139f4a8ceb7 Mon Sep 17 00:00:00 2001 From: kszura <43186787+kszura@users.noreply.github.com> Date: Mon, 12 Jan 2026 12:24:10 -0500 Subject: [PATCH 747/751] Update noaa-s104.yaml Added water tag --- datasets/noaa-s104.yaml | 1 + 1 file 
changed, 1 insertion(+) diff --git a/datasets/noaa-s104.yaml b/datasets/noaa-s104.yaml index 3abff1d02..4f226d2b1 100644 --- a/datasets/noaa-s104.yaml +++ b/datasets/noaa-s104.yaml @@ -23,6 +23,7 @@ Tags: - hydrography - oceans - coastal + - water Resources: - Description: "NOAA S-104 Water Level for Surface Navigation Datasets" From 5ccb0dee47468db1e99caede8de2ed89a0473c09 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 12 Jan 2026 08:57:29 -0900 Subject: [PATCH 748/751] ok: Update noaa-s104.yaml From 843331b79f36ffac78b1695d465003f84bacf0b4 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 12 Jan 2026 09:34:24 -0900 Subject: [PATCH 749/751] ok: Update rcm-ceos-ard.yaml From 84e2015b2c07896cb384665c3348fabac617f2e0 Mon Sep 17 00:00:00 2001 From: cstner <66844762+cstner@users.noreply.github.com> Date: Mon, 12 Jan 2026 10:19:32 -0900 Subject: [PATCH 750/751] ok: Update oedi-data-lake.yaml From b7ede6e8c6f4f147f72b0c6db1195b2a1ac670ea Mon Sep 17 00:00:00 2001 From: martinsiron Date: Tue, 13 Jan 2026 10:27:58 +0100 Subject: [PATCH 751/751] adding users and editing data entry --- datasets/lemat-rho.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/lemat-rho.yaml b/datasets/lemat-rho.yaml index edfec803a..022bd4375 100644 --- a/datasets/lemat-rho.yaml +++ b/datasets/lemat-rho.yaml @@ -24,7 +24,7 @@ DataAtWork: - Title: Accessing Data in LeMat-Rho AWS OpenData Repository URL: https://github.com/LeMaterial/LeMat-Rho/blob/feat/aws-upload/scripts/aws-open-data.ipynb NotebookURL: https://github.com/LeMaterial/LeMat-Rho/blob/feat/aws-upload/scripts/aws-open-data.ipynb - AuthorName: Martin Siron, Mathilde Franckel, Jonathan Schmidt + AuthorName: Martin Siron, Mathilde Franckel, Jonathan Schmidt, Richard Tran, Daniel Speckhard, Georgia Channing, Guilherme Penedo Tools & Applications: - Title: Pymatgen URL: https://pymatgen.org