Custom Parsers#

When the output files from a process cannot be handled by any of the built-in parsers, you can create a custom parser to extract the information of interest.

File Parsers#

File parsers are used for tracking. They take the path of an identified candidate file as an argument, parse the data from that file, and return a key-value mapping of the data of interest. To create a custom file parser, apply the `multiparser.parsing.file.file_parser` decorator to your function. The function should take an argument `file_path`, which is the path of the file to parse, and accept an arbitrary number of additional keyword arguments (`**_`) to be compatible with the decorator. It should return either:

- Two dictionaries, the first containing relevant metadata (usually left empty) and the second containing the parsed information: `{...}, {...}`.
- A single dictionary containing the metadata, and a list of dictionaries (for cases where multiple lines are read in a single parse and should be kept separate): `{...}, [{...}, ...]`.
from typing import Any
import multiparser.parsing.file as mp_file_parse

@mp_file_parse.file_parser
def parse_user_file_type(file_path: str, **_) -> tuple[dict[str, Any], dict[str, Any]]:
    with open(file_path) as in_f:
        file_lines = in_f.readlines()

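    data = {}

    for line in file_lines:
        # Skip blank lines and lines that are not "key: value" pairs
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        # Normalise the key, e.g. "Run Time" -> "run_time"
        data[key.strip().lower().replace(" ", "_")] = value.strip()

    # No metadata to report, so the first dictionary is left empty
    return {}, data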
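
The second return form, a metadata dictionary plus a list of dictionaries, can be sketched as follows. This is a minimal illustration rather than part of the library: the function name `parse_records_file` and the comma-separated `key=value` file layout are assumptions made for the example, and the `file_parser` decorator is left off so the sketch runs without multiparser installed. In real use the decorator would be applied as in the example above.

```python
import os
import tempfile
from typing import Any


# Hypothetical parser: in real use, decorate with the file_parser decorator.
def parse_records_file(file_path: str, **_) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Parse a file holding one record per line as comma-separated key=value pairs."""
    records: list[dict[str, Any]] = []

    with open(file_path) as in_f:
        for line in in_f:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            record: dict[str, Any] = {}
            for pair in line.split(","):
                if "=" not in pair:
                    continue  # ignore fragments that are not key=value
                key, value = pair.split("=", 1)
                record[key.strip()] = value.strip()
            if record:
                records.append(record)

    # Empty metadata, and one dictionary per line so records stay separate
    return {}, records


# Exercise the parser on a temporary file with an assumed layout
with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "records.log")
    with open(example_path, "w") as out_f:
        out_f.write("step=1, loss=0.5\nstep=2, loss=0.25\n")
    metadata, records = parse_records_file(example_path)
    print(records)
```

Returning a list keeps each line as its own record. Merging the lines into a single dictionary would let later values for `step` and `loss` overwrite earlier ones.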

To use the parser within a FileMonitor session:

import multiparser

with multiparser.FileMonitor(timeout=10) as monitor:
    monitor.track(path_glob_exprs="custom_file.log", parser_func=parse_user_file_type)
    monitor.run()

If your parser function needs additional keyword arguments, add them to its definition and pass their values to `track` through the `parser_kwargs` argument.
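
As a rough sketch of a parser with an extra keyword argument, the function below adds a `delimiter` parameter. The name `parse_delimited_file`, the `delimiter` parameter, and the assumption that `parser_kwargs` takes a dictionary of keyword arguments are illustrative and not confirmed by the text above. The decorator is omitted so the sketch runs on its own.

```python
import os
import tempfile
from typing import Any


# Hypothetical parser: in real use, decorate with the file_parser decorator.
def parse_delimited_file(
    file_path: str, delimiter: str = ":", **_
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Parse 'key<delimiter>value' lines, with the delimiter configurable."""
    data: dict[str, Any] = {}

    with open(file_path) as in_f:
        for line in in_f:
            if delimiter not in line:
                continue  # skip lines that do not contain the delimiter
            key, value = line.split(delimiter, 1)
            data[key.strip().lower().replace(" ", "_")] = value.strip()

    return {}, data


# Assumed registration (not executed here), if parser_kwargs accepts a dict:
# monitor.track(
#     path_glob_exprs="custom_file.log",
#     parser_func=parse_delimited_file,
#     parser_kwargs={"delimiter": "="},
# )

# Call the function directly to show the effect of the extra argument
with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "custom_file.log")
    with open(example_path, "w") as out_f:
        out_f.write("Run Time = 12\nStatus = done\n")
    _, parsed = parse_delimited_file(example_path, delimiter="=")
    print(parsed)
```

Giving the extra argument a default value keeps the parser usable when no `parser_kwargs` are supplied.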

Log Parsers#

When the custom parser will be used for file "tailing", that is, reading only the latest content appended to a file, define the function with the `multiparser.parsing.tail.log_parser` decorator. The function should take an argument `file_content`, which is a string containing the most recently read content, and accept an arbitrary number of additional keyword arguments (`**_`) to be compatible with the decorator. It should return either:

- Two dictionaries, the first containing relevant metadata (usually left empty) and the second containing the parsed information: `{...}, {...}`.
- A single dictionary containing the metadata, and a list of dictionaries (for cases where multiple lines are read in a single parse and should be kept separate): `{...}, [{...}, ...]`.
from typing import Any
import multiparser.parsing.tail as mp_tail_parse

@mp_tail_parse.log_parser
def parse_user_file_type(file_content: str, **_) -> tuple[dict[str, Any], dict[str, Any]]:
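    # splitlines() avoids the trailing empty string that split("\n") produces
    file_lines = file_content.splitlines()

    data = {}

    for line in file_lines:
        # Skip blank lines and lines that are not "key: value" pairs
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        data[key.strip().lower().replace(" ", "_")] = value.strip()

    return {}, data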
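
When a tailed chunk contains several lines with the same key, a single dictionary keeps only the last value. The sketch below returns the list form instead, with one dictionary per line. The function name `parse_log_chunk` and the sample content are assumptions for illustration, and the `log_parser` decorator is left off so the code runs without the library installed.

```python
from typing import Any


# Hypothetical parser: in real use, decorate with the log_parser decorator.
def parse_log_chunk(file_content: str, **_) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Turn each 'key: value' line of the latest chunk into its own record."""
    records: list[dict[str, Any]] = []

    for line in file_content.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(":", 1)
        records.append({key.strip().lower().replace(" ", "_"): value.strip()})

    return {}, records


# A chunk as it might arrive between two reads of a growing log file
chunk = "Iteration: 1\nIteration: 2\n\nnot a key-value line\n"
metadata, records = parse_log_chunk(chunk)
print(records)
```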

To use the parser within a FileMonitor session:

import multiparser

with multiparser.FileMonitor(timeout=10) as monitor:
    monitor.tail(path_glob_exprs="custom_file.log", parser_func=parse_user_file_type)
    monitor.run()

If your parser function needs additional keyword arguments, add them to its definition and pass their values to `tail` through the `parser_kwargs` argument.