! pip install pandas gql requests_toolbelt

GraphQL Examples

In this notebook we explore searching for metadata with the GraphQL API. The GraphQL API provides a way to programmatically extract a JSON representation of the metadata. Its advantage over REST is that the client specifies exactly which fields to return, which avoids both under-fetching and over-fetching data.

First we load the Python dependencies used in this notebook and set the variable API_URL to the location of the GraphQL API.

import pandas as pd
from gql import gql, Client
from gql.transport.requests import RequestsHTTPTransport

# This is the location of the GraphQL API
API_URL = "https://mastapp.site/graphql"

Set up a gql client. This is used to query the GraphQL API.

# Select your transport with a defined url endpoint
transport = RequestsHTTPTransport(url=API_URL)
# Create a GraphQL client using the defined transport
client = Client(transport=transport, fetch_schema_from_transport=True)

Querying Shots with GraphQL

With GraphQL you can query exactly what you want, rather than having to receive the whole table from the database. This is useful when the table has many columns but you are interested in only a subset of them.

The GraphQL endpoint is located at /graphql. You can find the documentation and an interactive query explorer at the URL below:

print(f"GraphQL API Endpoint {API_URL}")
GraphQL API Endpoint https://mastapp.site/graphql

Unlike the REST API, which uses HTTP GET requests to return data, the GraphQL API expects us to send our query with an HTTP POST request.
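To make the POST request concrete, here is a minimal sketch of the JSON body that a GraphQL-over-HTTP client typically sends: a `query` string plus an optional `variables` object. The helper name `build_graphql_payload` is hypothetical, and nothing is sent over the network here. The gql client used in this notebook takes care of this step for you.

```python
import json
from typing import Optional


def build_graphql_payload(query: str, variables: Optional[dict] = None) -> str:
    """Serialise a GraphQL query into the JSON body of an HTTP POST request."""
    body = {"query": query}
    if variables:
        # Variables travel alongside the query text, not inside it.
        body["variables"] = variables
    return json.dumps(body)


# A query asking only for the shot ID of every shot.
payload = build_graphql_payload("query { all_shots { shots { shot_id } } }")
print(payload)
```

With the `requests` library, a body like this could be sent with `requests.post(API_URL, data=payload, headers={"Content-Type": "application/json"})`.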

Here is a simple example of getting some shot data from the GraphQL API. We need to explicitly state what information we want the API to return. Here we are asking for:

  • the shot ID

  • the timestamp at which the shot was taken

  • the preshot description

  • the divertor configuration

# Write our GraphQL query.
query = gql("""
query {
    all_shots  {
        shots {
            shot_id
            timestamp
            preshot_description
            divertor_config
        }
    }
}
""")

# Query the API and get a JSON response
result = client.execute(query)
shots = result['all_shots']['shots']
df = pd.DataFrame(shots)
df.head()
   shot_id  timestamp            preshot_description        divertor_config
0  11695    2004-12-13T11:54:00  \n0.1T TF SHOT\n           conventional
1  11696    2004-12-13T12:07:00  \nSTANDARD 0.3T TF SHOT\n  conventional
2  11697    2004-12-13T12:19:00  \nRAISE TO 0.5T\n          conventional
3  11698    2004-12-13T12:31:00  \nRAISE TO .56T\n          conventional
4  11699    2004-12-13T12:45:00  \nRAISE TO .58T\n          conventional

Searching & Filtering Data

We can also supply query parameters to GraphQL, such as limiting the number of returned values or filtering by value. Here we limit the result to the first 3 shots and select only shots from the M9 campaign.

# Write our GraphQL query.
query = gql("""
query ($campaign: String!) {
    all_shots (limit: 3, where: {campaign: {eq: $campaign}}) {
        shots {
            shot_id
            timestamp
            preshot_description
            divertor_config
        }
    }
}
""")
# Query the API and get a JSON response
result = client.execute(query, variable_values={"campaign": "M9"})
df = pd.DataFrame(result['all_shots']['shots'])
df.head()
   shot_id  timestamp            preshot_description                                divertor_config
0  28390    2012-03-06T14:47:00  \nBC5, 300 ms, 3 V. D2 plenum 1536 mbar. For r...  conventional
1  28391    2012-03-06T14:52:00  \nBC5, 300 ms, 5 V. D2 plenum 1536 mbar. For r...  conventional
2  28392    2012-03-06T15:03:00  \nHL11, 300 ms, 2 V. He plenum 1047.\n             conventional
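The filtered query above can be wrapped in a small reusable helper. The sketch below is one possible way to do it, not part of the API: `build_shots_request` and `SHOTS_QUERY_TEMPLATE` are hypothetical names. The campaign is passed as a GraphQL variable, as in the example above, while the limit is validated as an integer and written into the query text, because the variable type the API expects for `limit` is not shown in this notebook.

```python
from typing import Dict, Tuple

# Same shape as the campaign-filtered query above; %d is filled with the limit.
SHOTS_QUERY_TEMPLATE = """
query ($campaign: String!) {
    all_shots (limit: %d, where: {campaign: {eq: $campaign}}) {
        shots {
            shot_id
            timestamp
        }
    }
}
"""


def build_shots_request(campaign: str, limit: int = 3) -> Tuple[str, Dict[str, str]]:
    """Return (query_text, variable_values) for a campaign-filtered shot query."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    # The user-supplied campaign stays out of the query text and is sent as a variable.
    return SHOTS_QUERY_TEMPLATE % limit, {"campaign": campaign}


query_text, variables = build_shots_request("M9", limit=5)
print(query_text)
print(variables)
```

The two return values can then be handed to the client from this notebook: `client.execute(gql(query_text), variable_values=variables)`.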

Nested Queries

One feature that makes GraphQL more powerful than REST is that you can perform nested queries to gather related subsets of the data in a single request. For example, here we query for all signals with “AMC” in the name, together with information about the shot associated with each signal.

# Write our GraphQL query.
query = gql("""
query ($signal: String!) {
    all_signals (limit: 3, where: {name: {contains: $signal}}) {
        signals {
          name
          url
          shot {
            shot_id
            timestamp
            divertor_config
          }
        }
    }
}
""")
# Query the API and get a JSON response
client.execute(query, variable_values={"signal": "AMC"})
{'all_signals': {'signals': []}}
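A nested response contains one dictionary per signal, each with an inner `shot` dictionary. Before loading such records into a table it can help to flatten them. The sketch below shows one way to do that in plain Python. `flatten_signal` is a hypothetical helper, and the example record is illustrative only: its `name` and `url` are placeholders rather than real API output.

```python
def flatten_signal(record: dict) -> dict:
    """Merge the nested 'shot' object into the top-level signal record."""
    flat = {key: value for key, value in record.items() if key != "shot"}
    # Prefix the nested keys so they cannot clash with the signal's own keys.
    for key, value in (record.get("shot") or {}).items():
        flat[f"shot.{key}"] = value
    return flat


# Illustrative record only: name and url are placeholders, not real API output.
example = {
    "name": "EXAMPLE_SIGNAL",
    "url": "https://example.invalid/signal",
    "shot": {
        "shot_id": 28390,
        "timestamp": "2012-03-06T14:47:00",
        "divertor_config": "conventional",
    },
}

row = flatten_signal(example)
print(row)
```

A list of such rows can be passed straight to `pd.DataFrame`. pandas also provides `pd.json_normalize`, which flattens nested dictionaries into dot-separated column names in a similar way.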

Pagination in GraphQL

Results from the GraphQL API are paginated. To access further pages, request the page_meta field in your query and pass its next_cursor value as the cursor argument of the next query. Here’s an example of iterating over paginated entries:

def do_query(cursor: str | None = None):
    query = gql("""
    query ($cursor: String) {
        all_shots (limit: 3, where: {campaign: {contains: "M9"}}, cursor: $cursor) {
            shots {
                shot_id
                timestamp
                preshot_description
                divertor_config
            }
            page_meta {
              next_cursor
              total_items
              total_pages
            }
        }
    }
    """)
    return client.execute(query, variable_values={'cursor': cursor})


def iterate_responses():
    cursor = None
    while True:
        response = do_query(cursor)
        yield response
        cursor = response['all_shots']['page_meta']['next_cursor']
        if cursor is None:
            return

responses = iterate_responses()
print(next(responses))
print(next(responses))
print(next(responses))
{'all_shots': {'shots': [{'shot_id': 28390, 'timestamp': '2012-03-06T14:47:00', 'preshot_description': '\nBC5, 300 ms, 3 V. D2 plenum 1536 mbar. For reference.\n', 'divertor_config': 'conventional'}, {'shot_id': 28391, 'timestamp': '2012-03-06T14:52:00', 'preshot_description': '\nBC5, 300 ms, 5 V. D2 plenum 1536 mbar. For reference.\n', 'divertor_config': 'conventional'}, {'shot_id': 28392, 'timestamp': '2012-03-06T15:03:00', 'preshot_description': '\nHL11, 300 ms, 2 V. He plenum 1047.\n', 'divertor_config': 'conventional'}], 'page_meta': {'next_cursor': 'Mw==', 'total_items': 1677, 'total_pages': 559}}}
{'all_shots': {'shots': [{'shot_id': 28393, 'timestamp': '2012-03-06T15:09:00', 'preshot_description': '\nHL11, 300 ms, 3 V. He plenum 1047.\n', 'divertor_config': 'conventional'}, {'shot_id': 28394, 'timestamp': '2012-03-06T15:13:00', 'preshot_description': '\nHL11, 300 ms, 5 V. He plenum 1047.\n', 'divertor_config': 'conventional'}, {'shot_id': 28395, 'timestamp': '2012-03-06T15:21:00', 'preshot_description': '\nHL11, 300 ms, 6 V. He plenum 1047.\n', 'divertor_config': 'conventional'}], 'page_meta': {'next_cursor': 'Ng==', 'total_items': 1677, 'total_pages': 559}}}
{'all_shots': {'shots': [{'shot_id': 28396, 'timestamp': '2012-03-06T15:31:00', 'preshot_description': '\nHL11, 300 ms, 3 V. CH4 plenum 1032.\n', 'divertor_config': 'conventional'}, {'shot_id': 28397, 'timestamp': '2012-03-06T15:38:00', 'preshot_description': '\nHL11, 300 ms, 5 V. CH4 plenum 1032.\n', 'divertor_config': 'conventional'}, {'shot_id': 28398, 'timestamp': '2012-03-06T15:43:00', 'preshot_description': '\nHL11, 300 ms, 4 V. CH4 plenum 1032.\n', 'divertor_config': 'conventional'}], 'page_meta': {'next_cursor': 'OQ==', 'total_items': 1677, 'total_pages': 559}}}
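Often you want the individual shots rather than whole page responses. The sketch below shows one way to iterate over shots across pages while capping the number of requests. `iter_shots` is a hypothetical helper, and `fake_fetch_page` is a stand-in for `do_query` so the example runs without network access. Unlike the real API, the stand-in ends after three pages.

```python
from typing import Callable, Iterator, Optional


def iter_shots(
    fetch_page: Callable[[Optional[str]], dict], max_pages: int = 10
) -> Iterator[dict]:
    """Yield individual shots across pages, stopping after max_pages requests."""
    cursor = None
    for _ in range(max_pages):
        response = fetch_page(cursor)
        yield from response["all_shots"]["shots"]
        cursor = response["all_shots"]["page_meta"]["next_cursor"]
        if cursor is None:
            # No further pages are available.
            break


def fake_fetch_page(cursor: Optional[str]) -> dict:
    """Offline stand-in for do_query: three pages, the last with no next cursor."""
    pages = {
        None: ([28390, 28391, 28392], "Mw=="),
        "Mw==": ([28393, 28394, 28395], "Ng=="),
        "Ng==": ([28396, 28397, 28398], None),
    }
    shot_ids, next_cursor = pages[cursor]
    return {
        "all_shots": {
            "shots": [{"shot_id": shot_id} for shot_id in shot_ids],
            "page_meta": {"next_cursor": next_cursor},
        }
    }


shot_ids = [shot["shot_id"] for shot in iter_shots(fake_fetch_page)]
print(shot_ids)
```

Against the real API you would call `iter_shots(do_query)` instead. The cap matters: the page metadata above reports 1677 items in 559 pages at 3 items per page, so walking the full result takes 559 requests unless a larger `limit` is used.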