!pip install pandas matplotlib zarr fsspec s3fs intake intake_xarray intake_parquet ipython jinja2
Tutorial: Accessing Data with Intake & S3#
This notebook demonstrates how to remotely load data from the archive, stored in an S3 bucket, using intake.
import intake
import matplotlib.pyplot as plt
import pandas as pd
import yaml
import PIL.Image
from IPython.display import Image
Accessing the Shot Index#
Before we can load data from the archive, we need to find the URL where the data is located.
To do this we will use an intake catalog. intake catalogs are a way of abstracting away from the user how data is loaded. This means that you don’t need to know the details of where the data is stored or how to read it.
To open the catalog we can use intake.open_catalog and give it the path to where our catalog is hosted.
The output shows that the catalog contains two sources: index and level1.
index is a source that reads metadata about different objects in the archive. It provides an index of the different data objects stored in the archive.
level1 is a source that reads level1 data, which come directly from the tokamak. In the future, derived sources will be added at other product levels.
catalog = intake.open_catalog('https://mastapp.site/intake/catalog.yml')
list(catalog)
['index', 'level1']
Let’s look at the index. The index also contains different product levels. For now, we are only interested in level1 products.
list(catalog.index)
['level1']
Let’s use the index source to read in metadata about all the different shots.
Below we read the shot metadata (stored as JSON) directly into a pandas dataframe. The output is a table of metadata, including the URL of each shot in the archive.
df = pd.DataFrame(catalog.index.level1.shots.read())
df = df[['shot_id', 'campaign', 'url']]
df
 | shot_id | campaign | url |
---|---|---|---|
0 | 11695 | M5 | s3://mast/level1/shots/11695.zarr |
1 | 11696 | M5 | s3://mast/level1/shots/11696.zarr |
2 | 11697 | M5 | s3://mast/level1/shots/11697.zarr |
3 | 11698 | M5 | s3://mast/level1/shots/11698.zarr |
4 | 11699 | M5 | s3://mast/level1/shots/11699.zarr |
... | ... | ... | ... |
15916 | 30467 | M9 | s3://mast/level1/shots/30467.zarr |
15917 | 30468 | M9 | s3://mast/level1/shots/30468.zarr |
15918 | 30469 | M9 | s3://mast/level1/shots/30469.zarr |
15919 | 30470 | M9 | s3://mast/level1/shots/30470.zarr |
15920 | 30471 | M9 | s3://mast/level1/shots/30471.zarr |
15921 rows × 3 columns
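Because the index is an ordinary pandas dataframe, the usual filtering tools apply to it. The sketch below does not download the real index. It builds a small stand-in dataframe with the same three columns, using a few rows copied from the table above, and shows how one might select a campaign or look up the URL of a single shot. The variable names are illustrative only.

```python
import pandas as pd

# A tiny stand-in for the shot index, with rows copied from the table above.
index = pd.DataFrame(
    {
        "shot_id": [11695, 11696, 30470, 30471],
        "campaign": ["M5", "M5", "M9", "M9"],
        "url": [
            "s3://mast/level1/shots/11695.zarr",
            "s3://mast/level1/shots/11696.zarr",
            "s3://mast/level1/shots/30470.zarr",
            "s3://mast/level1/shots/30471.zarr",
        ],
    }
)

# Select every shot from one campaign.
m9_shots = index.loc[index.campaign == "M9"]

# Look up the URL of a single shot by its ID.
url_30471 = index.loc[index.shot_id == 30471, "url"].iloc[0]

print(m9_shots.shot_id.tolist())
print(url_30471)
```

The same boolean-mask pattern should work on the full index returned by the catalog, since it has the same column names.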
Using the URLs of the shots we can load data from the archive.
In the next cell we use the URL of shot 30420 to remotely open data from the amc diagnostic.
intake returns an xarray.Dataset object containing all the data for this diagnostic.
shot = df.loc[df.shot_id == 30420].iloc[0]
dataset = catalog.level1.shots(url=shot.url, group='amc').to_dask()
dataset
<xarray.Dataset>
Dimensions:            (time: 30000)
Coordinates:
  * time               (time) float32 -2.0 -2.0 -2.0 -1.999 ... 3.999 4.0 4.0
Data variables: (12/46)
    efps_current       (time) float32 dask.array<chunksize=(30000,), meta=np.ndarray>
    error_field_02     (time) float32 dask.array<chunksize=(30000,), meta=np.ndarray>
    error_field_05     (time) float32 dask.array<chunksize=(30000,), meta=np.ndarray>
    p2il_coil_current  (time) float32 dask.array<chunksize=(30000,), meta=np.ndarray>
    p2il_feed_current  (time) float32 dask.array<chunksize=(30000,), meta=np.ndarray>
    p2iu_coil_current  (time) float32 dask.array<chunksize=(30000,), meta=np.ndarray>
    ...                 ...
    p6u_current        (time) float32 dask.array<chunksize=(30000,), meta=np.ndarray>
    plasma_current     (time) float32 dask.array<chunksize=(30000,), meta=np.ndarray>
    sol_current        (time) float32 dask.array<chunksize=(30000,), meta=np.ndarray>
    status             float32 ...
    tf_current         (time) float32 dask.array<chunksize=(30000,), meta=np.ndarray>
    version            float32 ...
Attributes:
    description:  Plasma Current and PF/TF Coil Currents
    file_name:    amc0304.20
    format:       IDA3
    mds_name:     None
    name:         amc
    quality:      Not Checked
    shot_id:      30420
    signal_type:  Analysed
    source:       amc
    uda_name:     AMC
    uuid:         01aad0c4-2a84-59e2-8b1b-168b4bd66aa3
    version:      0
Data Analysis with Remote Data#
We’re going to perform a simple plotting task. We will:
Get the URLs for 10 shots in a given range
Load the plasma current data for each shot as an xarray.DataArray
Slice every shot between 0 seconds and 0.3 seconds.
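# Keep only the shots with IDs between 30410 and 30420 (inclusive).
df = df.loc[(df.shot_id <= 30420) & (df.shot_id >= 30410)]
plasma_shots = []
for index, row in df.iterrows():
    # Lazily open the amc group for this shot.
    dataset = catalog.level1.shots(url=row['url'], group='amc')
    dataset = dataset.to_dask()
    # Keep only the plasma current and crop it to 0-0.3 s.
    dataset = dataset['plasma_current']
    dataset = dataset.sel(time=slice(0, .3))
    plasma_shots.append(dataset)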
In the code above, we load each item as an xarray DataArray, which keeps the data, its time coordinate, and the signal metadata all together.
plasma_shots[0]
<xarray.DataArray 'plasma_current' (time: 1500)>
dask.array<getitem, shape=(1500,), dtype=float32, chunksize=(1500,), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) float32 0.0001998 0.0003998 0.0005996 ... 0.2996 0.2998 0.3
Attributes: (12/18)
    description:  Plasma Current
    dims:         ['time']
    file_name:    None
    format:       None
    label:        Plasma Current
    mds_name:     \TOP.ANALYSED.AMC.PLASMA:CURRENT
    ...           ...
    source:       amc
    time_index:   0
    uda_name:     AMC_PLASMA CURRENT
    units:        kA
    uuid:         04b71d20-1e39-538d-9626-a6ef7926a84e
    version:      0
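The cropping step relies on xarray's label-based selection: sel with a slice keeps the samples whose time coordinate falls inside the requested window, and the attributes travel with the result. The following is a minimal sketch on synthetic data, assuming only numpy and xarray are installed. The signal, its name, and its sampling are made up for illustration and are not taken from the archive.

```python
import numpy as np
import xarray as xr

# Synthetic signal on a time axis spanning -2 s to 4 s, like the amc data.
time = np.linspace(-2.0, 4.0, 3001)
signal = xr.DataArray(
    np.sin(time),
    coords={"time": time},
    dims=["time"],
    name="synthetic_current",  # hypothetical name, not an archive signal
    attrs={"units": "kA"},
)

# Label-based slice: keeps samples whose time coordinate lies in [0, 0.3].
window = signal.sel(time=slice(0, 0.3))

print(window.sizes["time"], float(window.time.min()), float(window.time.max()))
print(window.attrs)
```

Selecting by coordinate value rather than by integer position means the same call can be applied to signals with different sampling rates.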
Finally, we can plot the 10 shots we loaded and cropped.
for current in plasma_shots:
    plt.plot(current.time, current.data, label=current.attrs['shot_id'])
plt.xlabel('time')
plt.ylabel(f"current ({current.attrs['units']})")
plt.legend()
<matplotlib.legend.Legend at 0x337c0cd50>
Larger Data - Loading RBB Image Data#
In this example we show how to load image data remotely. Image data are simply grouped by source, such as the rbb data. Here we load all the image data from an rbb group and create a GIF of the contents.
dataset = catalog.level1.shots(url=shot.url, group='rbb')
dataset = dataset.read()
dataset
<xarray.Dataset>
Dimensions:  (time: 286, height: 448, width: 640)
Coordinates:
  * time     (time) float64 1.6e-05 0.002016 0.004016 ... 0.308 0.309 0.31
Dimensions without coordinates: height, width
Data variables:
    data     (time, height, width) uint8 0 0 2 0 0 0 1 2 0 ... 0 2 2 2 0 4 0 2 0
Attributes: (12/48)
    CLASS:           IMAGE
    IMAGE_SUBCLASS:  IMAGE_INDEXED
    IMAGE_VERSION:   1.2
    board_temp:      0.0
    bottom:          680
    camera:          
    ...              ...
    units:           pixels
    uuid:            10ed506a-3ac4-5e62-8a6b-25a7abfc3171
    vbin:            0
    version:         -1
    view:            photron HM10 + Dalpha filter
    width:           640
imgs = [PIL.Image.fromarray(img) for img in dataset.data.values]
imgs[0].save("array.gif", save_all=True, append_images=imgs[1:], duration=50, loop=0)
Image(open('array.gif','rb').read())
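The GIF step can be tried without any network access. The sketch below builds a small synthetic stack of uint8 frames with the same (time, height, width) layout as the rbb data, writes it as an animated GIF with Pillow, and re-opens the file to count the frames. The frame contents, sizes, and output path are made up for illustration. Note that it imports Pillow's Image class directly, which would shadow IPython.display.Image if both were imported in the same session.

```python
import os
import tempfile

import numpy as np
from PIL import Image

# Synthetic stack of frames with shape (time, height, width), dtype uint8.
n_frames, height, width = 5, 32, 48
frames = np.zeros((n_frames, height, width), dtype=np.uint8)
for i in range(n_frames):
    # Move a bright square across the image so every frame is different.
    frames[i, 8:16, i * 8 : i * 8 + 8] = 255

# Convert each 2D array to a greyscale PIL image and write an animated GIF.
images = [Image.fromarray(frame) for frame in frames]
gif_path = os.path.join(tempfile.mkdtemp(), "synthetic.gif")
images[0].save(gif_path, save_all=True, append_images=images[1:], duration=50, loop=0)

# Re-open the file to confirm how many frames were written.
with Image.open(gif_path) as gif:
    frame_count = gif.n_frames
print(frame_count)
```

Here duration is the display time of each frame in milliseconds and loop=0 makes the animation repeat indefinitely.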