{ "cells": [ { "cell_type": "markdown", "id": "0cdbad31", "metadata": {}, "source": [ "# Access Sentinel-2 Analysis Ready Data from Digital Earth Africa\n", "\n", "[Launch in Binder](https://mybinder.org/v2/gh/opendatacube/odc-stac/develop?labpath=notebooks%2Fstac-load-S2-deafrica.ipynb)\n", "\n", "Product description: https://explorer.digitalearth.africa/products/s2_l2a" ] }, { "cell_type": "markdown", "id": "fea2d058", "metadata": {}, "source": [ "## Import Required Packages" ] }, { "cell_type": "code", "execution_count": 1, "id": "23624e97", "metadata": { "execution": { "iopub.execute_input": "2025-11-23T22:58:29.498067Z", "iopub.status.busy": "2025-11-23T22:58:29.497751Z", "iopub.status.idle": "2025-11-23T22:58:31.392087Z", "shell.execute_reply": "2025-11-23T22:58:31.391270Z" } }, "outputs": [], "source": [ "from pystac_client import Client\n", "\n", "from odc.stac import configure_rio, stac_load" ] }, { "cell_type": "markdown", "id": "54f02751", "metadata": {}, "source": [ "## Set Collection Configuration\n", "\n", "The configuration dictionary is determined from the product's definition, available at https://explorer.digitalearth.africa/products/s2_l2a#definition-doc\n", "\n", "All assets except SCL have the same configuration. SCL uses `uint8` rather than `uint16`.\n", "\n", "In the configuration, we also supply the aliases for each band. This means we can load data by a descriptive name (e.g. `red`) rather than by band number (e.g. `B04`)." 
] }, { "cell_type": "code", "execution_count": 2, "id": "ed8b4430", "metadata": { "execution": { "iopub.execute_input": "2025-11-23T22:58:31.394427Z", "iopub.status.busy": "2025-11-23T22:58:31.394075Z", "iopub.status.idle": "2025-11-23T22:58:31.397710Z", "shell.execute_reply": "2025-11-23T22:58:31.397034Z" }, "lines_to_next_cell": 2 }, "outputs": [], "source": [ "config = {\n", " \"s2_l2a\": {\n", " \"assets\": {\n", " \"*\": {\n", " \"data_type\": \"uint16\",\n", " \"nodata\": 0,\n", " \"unit\": \"1\",\n", " },\n", " \"SCL\": {\n", " \"data_type\": \"uint8\",\n", " \"nodata\": 0,\n", " \"unit\": \"1\",\n", " },\n", " },\n", " \"aliases\": {\n", " \"coastal_aerosol\": \"B01\",\n", " \"blue\": \"B02\",\n", " \"green\": \"B03\",\n", " \"red\": \"B04\",\n", " \"red_edge_1\": \"B05\",\n", " \"red_edge_2\": \"B06\",\n", " \"red_edge_3\": \"B07\",\n", " \"nir\": \"B08\",\n", " \"nir_narrow\": \"B08A\",\n", " \"water_vapour\": \"B09\",\n", " \"swir_1\": \"B11\",\n", " \"swir_2\": \"B12\",\n", " \"mask\": \"SCL\",\n", " \"aerosol_optical_thickness\": \"AOT\",\n", " \"scene_average_water_vapour\": \"WVP\",\n", " },\n", " }\n", "}" ] }, { "cell_type": "markdown", "id": "5e825aee", "metadata": {}, "source": [ "## Set AWS Configuration\n", "\n", "Digital Earth Africa data is stored on Amazon S3 in the `af-south-1` region (Cape Town, South Africa). To load the data, we must configure rasterio with the appropriate AWS S3 endpoint. This can be done with the `odc.stac.configure_rio` function ([documentation](https://odc-stac.readthedocs.io/en/latest/_api/odc.stac.configure_rio.html#odc.stac.configure_rio)).\n", "\n", "The configuration below must be used when loading any Digital Earth Africa data through the STAC API." 
] }, { "cell_type": "code", "execution_count": 3, "id": "209b1f81", "metadata": { "execution": { "iopub.execute_input": "2025-11-23T22:58:31.399288Z", "iopub.status.busy": "2025-11-23T22:58:31.399132Z", "iopub.status.idle": "2025-11-23T22:58:31.401417Z", "shell.execute_reply": "2025-11-23T22:58:31.400901Z" }, "lines_to_next_cell": 2 }, "outputs": [], "source": [ "configure_rio(\n", " cloud_defaults=True,\n", " aws={\"aws_unsigned\": True},\n", " AWS_S3_ENDPOINT=\"s3.af-south-1.amazonaws.com\",\n", ")" ] }, { "cell_type": "markdown", "id": "053aa34d", "metadata": {}, "source": [ "## Connect to the Digital Earth Africa STAC Catalog" ] }, { "cell_type": "code", "execution_count": 4, "id": "0e9ee469", "metadata": { "execution": { "iopub.execute_input": "2025-11-23T22:58:31.403140Z", "iopub.status.busy": "2025-11-23T22:58:31.402993Z", "iopub.status.idle": "2025-11-23T22:58:32.892170Z", "shell.execute_reply": "2025-11-23T22:58:32.891286Z" }, "lines_to_next_cell": 2 }, "outputs": [], "source": [ "# Open the STAC catalog\n", "catalog = Client.open(\"https://explorer.digitalearth.africa/stac\")" ] }, { "cell_type": "markdown", "id": "eb614d8a", "metadata": {}, "source": [ "## Find STAC Items to Load\n", "\n", "### Define query parameters" ] }, { "cell_type": "code", "execution_count": 5, "id": "9b7a6208", "metadata": { "execution": { "iopub.execute_input": "2025-11-23T22:58:32.894546Z", "iopub.status.busy": "2025-11-23T22:58:32.894371Z", "iopub.status.idle": "2025-11-23T22:58:32.897257Z", "shell.execute_reply": "2025-11-23T22:58:32.896596Z" }, "lines_to_next_cell": 2 }, "outputs": [], "source": [ "# Set a bounding box\n", "# [xmin, ymin, xmax, ymax], i.e. [min lon, min lat, max lon, max lat] in degrees\n", "bbox = [37.76, 12.49, 37.77, 12.50]\n", "\n", "# Set a start and end date\n", "start_date = \"2020-09-01\"\n", "end_date = \"2020-12-01\"\n", "\n", "# Set the STAC collections\n", "collections = [\"s2_l2a\"]" ] }, { "cell_type": "markdown", "id": "557e370a", "metadata": {}, "source": [ "### 
Construct query and get items from catalog" ] }, { "cell_type": "code", "execution_count": 6, "id": "94aaae3d", "metadata": { "execution": { "iopub.execute_input": "2025-11-23T22:58:32.898805Z", "iopub.status.busy": "2025-11-23T22:58:32.898657Z", "iopub.status.idle": "2025-11-23T22:58:50.920732Z", "shell.execute_reply": "2025-11-23T22:58:50.919925Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Found: 34 datasets\n" ] } ], "source": [ "# Build a query with the set parameters\n", "query = catalog.search(\n", " bbox=bbox, collections=collections, datetime=f\"{start_date}/{end_date}\"\n", ")\n", "\n", "# Search the STAC catalog for all items matching the query\n", "items = list(query.items())\n", "print(f\"Found: {len(items):d} datasets\")" ] }, { "cell_type": "markdown", "id": "96742f43", "metadata": {}, "source": [ "## Load the Data\n", "\n", "In this step, we specify the desired coordinate system, resolution (here 20 m), and bands to load. We also pass the bounding box to the `stac_load` function so that only the requested area is loaded. Since the band aliases are contained in the `config` dictionary, bands can be loaded using these aliases (e.g. `\"red\"` instead of `\"B04\"` below).\n", "\n", "The data will be lazy-loaded with dask, meaning that it won't be loaded into memory until necessary, such as when it is displayed." ] }, { "cell_type": "code", "execution_count": 7, "id": "7a8f6b71", "metadata": { "execution": { "iopub.execute_input": "2025-11-23T22:58:50.922521Z", "iopub.status.busy": "2025-11-23T22:58:50.922352Z", "iopub.status.idle": "2025-11-23T22:58:51.237501Z", "shell.execute_reply": "2025-11-23T22:58:51.236710Z" }, "lines_to_next_cell": 2 }, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 421kB\n",
"Dimensions: (y: 63, x: 49, time: 17)\n",
"Coordinates:\n",
" * y (y) float64 504B 1.582e+06 1.582e+06 ... 1.581e+06 1.581e+06\n",
" * x (x) float64 392B 3.643e+06 3.643e+06 ... 3.644e+06 3.644e+06\n",
" spatial_ref int32 4B 6933\n",
" * time (time) datetime64[ns] 136B 2020-09-03T08:06:43 ... 2020-11-2...\n",
"Data variables:\n",
" red (time, y, x) uint16 105kB dask.array<chunksize=(1, 63, 49), meta=np.ndarray>\n",
" green (time, y, x) uint16 105kB dask.array<chunksize=(1, 63, 49), meta=np.ndarray>\n",
" blue (time, y, x) uint16 105kB dask.array<chunksize=(1, 63, 49), meta=np.ndarray>\n",
" nir (time, y, x) uint16 105kB dask.array<chunksize=(1, 63, 49), meta=np.ndarray>