# elementary.column_value_anomalies
Monitors individual values of a column and detects anomalous rows by comparing each value against the historical distribution of that column.
Unlike `column_anomalies`, which computes aggregate metrics (such as min, max, and average) per time bucket and detects anomalies in the resulting metric time series, `column_value_anomalies` operates directly on the raw column values, so no additional aggregation functions are needed.
## How it works
- Training: The test collects all values of the column within the training period and computes a baseline distribution (mean and standard deviation).
- Detection: For each row in the detection period, the test computes a z-score for the column value against the historical baseline.
- Anomaly flagging: Rows where the z-score exceeds the configured `anomaly_sensitivity` threshold are flagged as anomalous. If any anomalous rows are found, the test fails.
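The training and detection steps above can be sketched in a few lines. This is an illustrative simplification, not Elementary's actual implementation (which runs in SQL against the warehouse), and the function name is made up:

```python
from statistics import mean, stdev

def flag_anomalous_rows(training_values, detection_values, anomaly_sensitivity=3.0):
    """Return the detection-period values whose |z-score| exceeds the threshold.

    The baseline mean and standard deviation are computed from the
    training-period values only.
    """
    baseline_mean = mean(training_values)
    baseline_std = stdev(training_values)
    anomalies = []
    for value in detection_values:
        z = (value - baseline_mean) / baseline_std
        if abs(z) > anomaly_sensitivity:
            anomalies.append(value)
    return anomalies

# A stable baseline around 100, then a spike in the detection period.
training = [98, 101, 99, 100, 102, 97, 100, 103, 99, 101]
detection = [100, 102, 250]  # 250 is far outside the baseline distribution
print(flag_anomalous_rows(training, detection))  # -> [250]
```

The sketch uses the absolute z-score, i.e. the `both` direction; with `anomaly_direction: spike` or `drop`, only positive or only negative deviations would be flagged.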
This test is designed for numeric columns. It detects individual row-level outliers, making it ideal for catching unexpected spikes or drops in values such as transaction amounts, prices, scores, or measurements.
## When to use
| Use case | Recommended test |
|---|---|
| Detect anomalies in aggregate statistics (avg, min, max) of a column over time | `column_anomalies` |
| Detect individual rows with anomalous column values | `column_value_anomalies` |
## Test configuration
A `timestamp_column` is required for this test, as it uses historical data to build the baseline distribution.
```yml
columns:
  - name: column name
    data_tests:
      - elementary.column_value_anomalies:
          arguments:
            timestamp_column: column name
            where_expression: sql expression
            anomaly_sensitivity: int
            anomaly_direction: [both | spike | drop]
            detection_period:
              period: [hour | day | week | month]
              count: int
            training_period:
              period: [hour | day | week | month]
              count: int
            seasonality: day_of_week
            detection_delay:
              period: [hour | day | week | month]
              count: int
```
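For illustration, a filled-in configuration monitoring a transaction amount column might look like the following (the model, column, and timestamp names are hypothetical):

```yml
columns:
  - name: amount
    data_tests:
      - elementary.column_value_anomalies:
          arguments:
            timestamp_column: created_at
            anomaly_sensitivity: 3
            anomaly_direction: spike
            training_period:
              period: day
              count: 30
            detection_period:
              period: day
              count: 2
```

This configuration builds the baseline from the last 30 days of `amount` values and flags rows from the last 2 days whose values spike more than 3 standard deviations above the baseline mean.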

