Documentation Index

Fetch the complete documentation index at: https://docs.complyhat.ai/llms.txt

Use this file to discover all available pages before exploring further.

ComplyHat’s bias engine runs four fairness metrics deterministically against tabular data you supply. Each test returns a pass or fail ruling against a configurable threshold, per protected class, with data-quality assessments that tell you whether the result is statistically meaningful. When any test fails, the model’s compliance_status is automatically updated to non_compliant; on a clean run it moves to needs_review.

The four test types

All four tests operate on a tabular dataset with an outcome column and one or more protected-class columns. For the statistical details and academic sources behind each metric, see methodology.
| Test type | What it measures | Threshold | Ground truth required? |
| --- | --- | --- | --- |
| disparate_impact | Favorable rate ratio between subgroups (Four-Fifths Rule) | Fail if any ratio < 0.80 | No |
| statistical_parity | Absolute difference in favorable rates across subgroups | Fail if difference > 0.10 | No |
| equal_opportunity | True positive rate ratio across subgroups | Fail if min/max TPR < 0.80 | Yes |
| predictive_parity | Positive predictive value difference across subgroups | Fail if max − min PPV > 0.10 | Yes |
If you include equal_opportunity or predictive_parity in test_types, you must also supply ground_truth_column. The engine returns a 422 error if you omit it.
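The two tests that need no ground truth reduce to a few lines of arithmetic. The sketch below is an illustrative reimplementation of the thresholds in the table above, not the engine's code, and all function names are mine:

```python
from collections import defaultdict

def favorable_rates(rows, protected_class, outcome_column, favorable_outcome):
    """Favorable-outcome rate per subgroup of one protected class."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for row in rows:
        group = row[protected_class]
        totals[group] += 1
        if row[outcome_column] == favorable_outcome:
            favorable[group] += 1
    return {g: favorable[g] / totals[g] for g in totals}

def disparate_impact_fails(rates, threshold=0.80):
    """Four-Fifths Rule: fail if any subgroup's favorable rate is below
    `threshold` times the highest subgroup's rate."""
    reference = max(rates.values())
    return any(r / reference < threshold for r in rates.values())

def statistical_parity_fails(rates, threshold=0.10):
    """Fail if the spread between the highest and lowest favorable rates
    exceeds the threshold."""
    return max(rates.values()) - min(rates.values()) > threshold
```

Equal opportunity and predictive parity work the same way, except the rates are computed from true positives and predicted positives against the ground-truth column.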

Run a bias test

Call bias_tests with mode: "run". Supply the model_id, your dataset inline in data.rows, the column names, and the test types you want. The data object requires source: "inline".
{
  "tool": "bias_tests",
  "arguments": {
    "mode": "run",
    "model_id": "mdl_01j9z...",
    "framework": "nyc-ll144",
    "test_types": ["disparate_impact", "statistical_parity"],
    "protected_classes": ["gender", "race"],
    "outcome_column": "hired",
    "favorable_outcome": "1",
    "data": {
      "source": "inline",
      "rows": [
        { "gender": "F", "race": "Black", "hired": "1", "score": 0.82 },
        { "gender": "M", "race": "White", "hired": "1", "score": 0.91 },
        { "gender": "F", "race": "Hispanic", "hired": "0", "score": 0.61 }
      ]
    }
  }
}
For tests that require ground truth labels, add the ground_truth_column field:
{
  "tool": "bias_tests",
  "arguments": {
    "mode": "run",
    "model_id": "mdl_01j9z...",
    "framework": "eu-ai-act",
    "test_types": ["disparate_impact", "statistical_parity", "equal_opportunity", "predictive_parity"],
    "protected_classes": ["gender", "age_group"],
    "outcome_column": "prediction",
    "favorable_outcome": "approved",
    "ground_truth_column": "actual_outcome",
    "data": {
      "source": "inline",
      "rows": []
    }
  }
}
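A client-side pre-flight check can catch a missing ground_truth_column before the request round-trips into a 422. A minimal sketch; the constant and function names are mine, and only the 422 rule itself comes from this page:

```python
GROUND_TRUTH_TESTS = {"equal_opportunity", "predictive_parity"}

def check_ground_truth(arguments):
    """Mirror the server-side 422: tests that compare predictions against
    actual outcomes require a ground_truth_column in the arguments."""
    needed = GROUND_TRUTH_TESTS & set(arguments["test_types"])
    if needed and not arguments.get("ground_truth_column"):
        raise ValueError(
            f"ground_truth_column is required for: {sorted(needed)}"
        )
```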
The response contains a test_id, an overall_result (pass or fail), per-(test_type, protected_class) results with details, and a data_quality assessment for each protected class. Check data_quality[*].adequate and data_quality[*].warnings before treating a pass as conclusive.
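Given those response fields, a conservative caller might gate on data quality like this. The response shape here is assumed from the description above, not a verified schema:

```python
def pass_is_conclusive(response):
    """Treat a pass as conclusive only if every protected class had
    adequate, warning-free data; otherwise it needs human review."""
    if response["overall_result"] != "pass":
        return False
    return all(
        dq["adequate"] and not dq["warnings"]
        for dq in response["data_quality"]
    )
```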

List and retrieve test results

List all bias tests for a model with mode: "list":
{
  "tool": "bias_tests",
  "arguments": {
    "mode": "list",
    "model_id": "mdl_01j9z..."
  }
}
Retrieve a specific result by its test_id with mode: "get":
{
  "tool": "bias_tests",
  "arguments": {
    "mode": "get",
    "test_id": "bt_01m5..."
  }
}

Schedule recurring tests

Regulators specify both the test types and the cadence they expect. You can encode both into a recurring schedule so tests run without manual intervention. Create a schedule with mode: "create_schedule". Provide the dataset_id to run against, a test_config object describing the test parameters, the cadence (monthly, quarterly, or annually), and the next_run_at timestamp for the first run.
{
  "tool": "bias_tests",
  "arguments": {
    "mode": "create_schedule",
    "model_id": "mdl_01j9z...",
    "dataset_id": "ds_01k3...",
    "cadence": "quarterly",
    "next_run_at": "2026-07-01T00:00:00Z",
    "test_config": {
      "framework": "sr-11-7",
      "test_types": ["disparate_impact", "statistical_parity"],
      "protected_classes": ["gender", "race", "age_group"],
      "outcome_column": "prediction",
      "favorable_outcome": "approved"
    }
  }
}
List all active schedules for a model with mode: "list_schedules":
{
  "tool": "bias_tests",
  "arguments": {
    "mode": "list_schedules",
    "model_id": "mdl_01j9z..."
  }
}

Framework-specific requirements

Different frameworks require different test types and cadences. Configure your schedules accordingly:
| Framework | Required tests | Cadence |
| --- | --- | --- |
| sr-11-7 | disparate_impact, statistical_parity | Quarterly |
| eu-ai-act | disparate_impact, statistical_parity, equal_opportunity, predictive_parity | Quarterly |
| nyc-ll144 | disparate_impact, statistical_parity | Annual (per AEDT use case) |
| naic-model-bulletin | disparate_impact | Annual |
| cms-0057-f | disparate_impact, equal_opportunity | Quarterly |
NYC Local Law 144 requires the annual bias audit to be conducted by an independent auditor. ComplyHat produces the technical artefacts — the independence requirement is a legal and operational matter your organization must arrange separately.
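The framework table above can be encoded as a lookup for validating a schedule's test_config before creating it. The mapping mirrors the table; the helper itself is illustrative:

```python
FRAMEWORK_REQUIRED_TESTS = {
    "sr-11-7": {"disparate_impact", "statistical_parity"},
    "eu-ai-act": {"disparate_impact", "statistical_parity",
                  "equal_opportunity", "predictive_parity"},
    "nyc-ll144": {"disparate_impact", "statistical_parity"},
    "naic-model-bulletin": {"disparate_impact"},
    "cms-0057-f": {"disparate_impact", "equal_opportunity"},
}

def missing_required_tests(framework, test_types):
    """Required test types that the given test_types list does not cover."""
    return FRAMEWORK_REQUIRED_TESTS[framework] - set(test_types)
```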

Next steps: Review the statistical methods behind each test in methodology, or see all bias_tests modes in the tool reference.