
303 Predictor Validation Measurements in JSON

The predictor can output validation measurements in JSON rather than human-readable text.

  • Validation measurements in JSON format

Prerequisites

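This notebook assumes brainome is installed as described in the notebook brainome_101_Quick_Start.

The data sets are titanic_train.csv, used to generate the predictor, and titanic_validate.csv, used for the validation measurements. Both are hosted at https://download.brainome.ai/data/public/.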

!python3 -m pip install brainome --quiet
!brainome --version

import urllib.request as request
response2 = request.urlretrieve('https://download.brainome.ai/data/public/titanic_validate.csv', 'titanic_validate.csv')
%ls -lh titanic_validate.csv
brainome v1.8-120-prod
-rw-r--r-- 1 runner docker 5.8K Mar 12 21:09 titanic_validate.csv

Generate a predictor

The predictor filename is predictor_303.py

!brainome https://download.brainome.ai/data/public/titanic_train.csv -y -o predictor_303.py -modelonly -q
print('\nCreated predictor_303.py')
!ls -lh predictor_303.py

Created predictor_303.py
-rw-r--r-- 1 runner docker 35K Mar 12 21:10 predictor_303.py

Validation measurements in JSON format

The same measurements produced in the previous exercises can be generated in JSON format for further system integration. Pass -validate together with -json to make the predictor output its validation measurements as JSON.
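For integration outside a notebook, the same command line can be driven from a Python script. The following is a minimal sketch, not part of brainome itself: the helper name run_validation is hypothetical, and it assumes the predictor writes only the JSON document to standard output (as the shell redirect used in this notebook suggests), with any warnings going to standard error.

```python
import json
import subprocess
import sys
from pathlib import Path


def run_validation(predictor, data_file, python=sys.executable):
    """Run `<predictor> -validate <data_file> -json` and return the parsed JSON.

    Hypothetical helper: it only wraps the command line shown in this notebook.
    Assumes stdout carries nothing but the JSON document.
    """
    command = [python, str(predictor), "-validate", str(data_file), "-json"]
    # check=True raises CalledProcessError if the predictor exits with an error.
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


# Only run when the files from this notebook are present in the working directory.
if Path("predictor_303.py").exists() and Path("titanic_validate.csv").exists():
    measurements = run_validation("predictor_303.py", "titanic_validate.csv")
    print(measurements["accuracy"]["model_accuracy"])
```

Capturing standard output directly avoids the intermediate file, which can be convenient when the measurements feed straight into another system.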

!python3 predictor_303.py -validate titanic_validate.csv -json > validation_measurements_303.json
import json
with open('validation_measurements_303.json', 'r') as measurement_file:
    validation_measurements = json.load(measurement_file)
    print(json.dumps(validation_measurements, indent=4))
{
    "instance_count": 80,
    "classifier_type": "RF",
    "classes": 2,
    "number_correct": 65,
    "accuracy": {
        "best_guess": 0.6125,
        "improvement": 0.2,
        "model_accuracy": 0.8125
    },
    "model_capacity": 41,
    "generalization_ratio": 1.5218042472451754,
    "model_efficiency": 0.48,
    "shannon_entropy_of_labels": 0.9631672450918831,
    "class_balance": [
        0.6125,
        0.3875
    ],
    "confusion_matrix": [
        [
            44,
            5
        ],
        [
            10,
            21
        ]
    ],
    "multiclass_stats": {
        "0": {
            "TP": 44,
            "FN": 5,
            "TN": 21,
            "FP": 10,
            "TPR": 0.8979591836734694,
            "TNR": 0.6774193548387096,
            "PPV": 0.8148148148148148,
            "NPV": 0.8076923076923077,
            "F1": 0.8543689320388349,
            "TS": 0.7457627118644068
        },
        "1": {
            "TP": 21,
            "FN": 10,
            "TN": 44,
            "FP": 5,
            "TPR": 0.6774193548387096,
            "TNR": 0.8979591836734694,
            "PPV": 0.8076923076923077,
            "NPV": 0.8148148148148148,
            "F1": 0.7368421052631579,
            "TS": 0.5833333333333334
        }
    }
}
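
The fields in this output are related to one another, and recomputing a few of them from the confusion matrix is a useful sanity check before wiring the JSON into another system. The sketch below uses the standard definitions of the per-class statistics (TPR, PPV, F1, and threat score TS) and assumes that rows of confusion_matrix are actual classes and columns are predicted classes, which is inferred from the numbers above rather than from brainome documentation. The function name recompute is hypothetical, and the sample dictionary copies only the relevant fields from the output.

```python
import math

# Subset of the measurements printed above.
sample = {
    "instance_count": 80,
    "number_correct": 65,
    "confusion_matrix": [[44, 5], [10, 21]],
}


def recompute(m):
    """Derive summary and per-class statistics from the confusion matrix.

    Assumption: rows are actual classes, columns are predicted classes.
    """
    matrix = m["confusion_matrix"]
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(size))
    balance = [sum(row) / total for row in matrix]
    per_class = {}
    for k in range(size):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                         # actual k, predicted other
        fp = sum(matrix[r][k] for r in range(size)) - tp  # predicted k, actual other
        tn = total - tp - fn - fp
        per_class[k] = {
            "TP": tp, "FN": fn, "TN": tn, "FP": fp,
            "TPR": tp / (tp + fn) if tp + fn else 0.0,
            "PPV": tp / (tp + fp) if tp + fp else 0.0,
            "F1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
            "TS": tp / (tp + fn + fp) if tp + fp + fn else 0.0,
        }
    return {
        "total": total,
        "correct": correct,
        "accuracy": correct / total,
        "best_guess": max(balance),  # accuracy of always predicting the majority class
        "entropy": -sum(p * math.log2(p) for p in balance if p > 0),
        "per_class": per_class,
    }


result = recompute(sample)
print(result["accuracy"], result["best_guess"], result["entropy"])
for label, stats in result["per_class"].items():
    print(label, stats)
```

With the numbers above, the recomputed totals agree with instance_count and number_correct, and the accuracy improvement reported by the predictor is the difference between model_accuracy and best_guess.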

Next Steps

TODO next…