Vega-lite-linter uses Clingo as its Answer Set Programming solver, so you need to install Clingo first.
For Linux users:
apt-get install -y gringo
For macOS users:
brew install clingo
Or using Conda:
conda install -c potassco clingo
More information for downloading Clingo can be found here.
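To confirm that the installation worked, a short check like the one below may help. It is only a sketch: it assumes the solver is exposed as an executable named `clingo` (or `gringo`) on your `PATH`, which this README does not state explicitly, and `find_solver` is a hypothetical helper name, not part of vega-lite-linter.

```python
import shutil


def find_solver(candidates=("clingo", "gringo")):
    """Map each candidate executable name to its resolved path, or None if absent."""
    # Assumption: the solver is found through PATH; adjust the names if your
    # installation exposes the binary differently.
    return {name: shutil.which(name) for name in candidates}


if __name__ == "__main__":
    for name, path in find_solver().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If neither name resolves, re-run one of the install commands above or check that your Conda environment is activated.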
Vega-lite-linter is built on Python 3 and can be installed by:
pip install vega-lite-linter
After successfully installing Clingo and vega-lite-linter, you can use the sample code below to get started.
More detailed examples can be found in Examples.
from vega_lite_linter import Lint
vega_json = {
    "data": {
        "url": "data/cars.json"
    },
    "mark": "bar",
    "encoding": {
        "x": {
            "field": "Horsepower",
            "type": "quantitative"
        },
        "y": {
            "field": "Miles_per_Gallon",
            "type": "quantitative"
        },
        "size": {
            "field": "Cylinders",
            "type": "ordinal"
        }
    }
}
# initialize
lint = Lint(vega_json)
# show rules that the input vega-lite json violated
violate_rules = lint.lint()
# show fixing recommendation by vega-lite-linter
fix = lint.fix()
Vega-lite-linter provides simple APIs for visualization developers to detect and fix issues in the visualizations they build.
First, initialize a Lint instance with the target visualization specification:

lint = Lint(vegalite_json)
After initialization, the two functions listed below can be called on the instance object.
lint(): Detecting Issues
lint() detects any issues in the given visualization specification. Each detected issue is presented as a Rule object containing:

fix(): Fixing Issues
fix() runs the algorithm to help revise the visualization specification into a correct one.
The result of fix() contains:

Action object contains:
The related Vega-Lite properties are listed as follows.
Vega-lite-linter detects some data-related errors by deriving data properties from the raw data, such as each data field's type and the min/max values of numerical fields. Currently, it can derive these properties from inline data specified with the values property, or from the built-in datasets of Vega and Vega-Lite.
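As a rough illustration of what deriving data properties from inline data can look like, the sketch below summarizes one field of a specification that uses the values property. It is not the library's implementation: `summarize_field` and the sample rows are hypothetical, and the real linter may classify fields differently.

```python
from numbers import Number


def summarize_field(rows, field):
    """Derive a rough kind, cardinality and (for numbers) min/max of one data field."""
    values = [row[field] for row in rows if row.get(field) is not None]
    # bool is a subclass of int in Python, so exclude it explicitly.
    numeric = bool(values) and all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    summary = {
        "field": field,
        "kind": "number" if numeric else "string",
        "cardinality": len(set(values)),
    }
    if numeric:
        summary["min"] = min(values)
        summary["max"] = max(values)
    return summary


# Hypothetical specification with inline data under "values".
spec = {
    "data": {"values": [
        {"Origin": "USA", "Profit": 12.5},
        {"Origin": "Japan", "Profit": -3.0},
        {"Origin": "USA", "Profit": 7.0},
    ]},
    "mark": "point",
    "encoding": {"size": {"field": "Profit", "type": "quantitative"}},
}

profit = summarize_field(spec["data"]["values"], "Profit")
origin = summarize_field(spec["data"]["values"], "Origin")
```

A derived minimum below zero, as for `Profit` here, is the kind of property a rule such as size_negative needs to know about.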
| Property | Value | 
|---|---|
| mark | Required. The mark type of the visualization. Can be one of the following values: area, bar, line, point, and tick. | 
| Property | Value | 
|---|---|
| channel | Required. The encoding channel type, which is specified as the key of each encoding. Can be one of the following values: x, y, color, size. | 
| field | The data field encoded by the channel. | 
| type | The type of measurement. Can be one of the following values: quantitative, temporal, ordinal, or nominal. | 
| bin | Binning discretizes numeric values into a set of bins. Can be one of the following values: true, false, or { maxbins: maximum number of bins (e.g., 10) }. | 
| aggregate | Aggregates summary statistics on the data field. Can be one of the following values: count, mean, median, min, max, stdev, sum, etc. | 
| stack | The type of stacking offset if the field should be stacked. Can be one of the following values: true, zero, normalize, center or false. | 
| scale | Functions that transform a domain of data values. | 
The scale property includes:
                                    
| Property | Value | 
|---|---|
| type | The type of scale transformation. Currently, the algorithm detects errors related to log type. | 
| zero | If true, ensure that a zero baseline value is included in the scale domain. | 
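The properties in the tables above can be combined in a single specification. The sketch below writes one as a Python dictionary, in the same style as the getting-started sample. The field names assume the cars dataset used earlier, and the combination is meant to be plausible rather than guaranteed to pass every rule.

```python
import json

# Binned histogram of Horsepower, stacked by Origin.
spec = {
    "data": {"url": "data/cars.json"},
    "mark": "bar",
    "encoding": {
        # bin: discretize a quantitative field into at most 10 bins
        "x": {"field": "Horsepower", "type": "quantitative", "bin": {"maxbins": 10}},
        # aggregate: count needs no field; stack and scale.zero apply to this channel
        "y": {
            "aggregate": "count",
            "type": "quantitative",
            "stack": "zero",
            "scale": {"zero": True},
        },
        # a discrete color channel to stack by
        "color": {"field": "Origin", "type": "nominal"},
    },
}

print(json.dumps(spec, indent=2))
```

A dictionary like this can be passed to `Lint(...)` in the same way as `vega_json` in the sample code.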
More details about Vega-Lite properties can be found here.
Rules in vega-lite-linter are adapted and refined from Draco. The rules are grouped into four categories.
| Rule | Meaning | 
|---|---|
| enc_type_valid_1 | Verify the consistency of data field and type 'quantitative'. | 
| enc_type_valid_2 | Verify the consistency of data field and type 'temporal'. | 
| bin_q_o | Only use bin on quantitative or ordinal data. | 
| zero_q | Only use a zero baseline in the scale with quantitative data. | 
| log_discrete | Only use log scale with non-discrete data. | 
| log_zero | A log scale cannot have a zero baseline in the scale domain. | 
| log_non_positive | Use log scale on data that are all positive. | 
| bin_and_aggregate | Using both bin and aggregate on the same data at the same time is illegal. | 
| aggregate_o_valid | Ordinal data only supports min, max, and median aggregation. | 
| aggregate_t_valid | Temporal data only supports min and max aggregation. | 
| aggregate_nominal | Nominal data cannot be aggregated. | 
| count_q_without_field_1 | Use count aggregation or declare a data field of an encoding, instead of doing both of them. | 
| count_q_without_field_2 | The encoding with count aggregation has to be 'quantitative' type. | 
| size_nominal | Channel size implies order in the data, so it is not suitable for nominal data. | 
| size_negative | Channel size is not suitable for data with negative values. | 
| encoding_no_field_and_not_count | Declare the data field or use count aggregation in each encoding. | 
| color_with_cardinality_gt_twenty | Use at most 20 categorical colors in the visualization. | 
| stack_without_x_y | Use stack on x or y channels. | 
| stack_discrete | Use stack on continuous data. | 
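To make a few of the rules above concrete, here is a plain-Python paraphrase of how they read for a single encoding. This is only a sketch: vega-lite-linter itself evaluates its rules with Clingo, and `check_encoding` is a hypothetical helper that covers a handful of rules from the table, not the full set.

```python
def check_encoding(channel, enc):
    """Return the names of a few single-encoding rules that `enc` appears to break."""
    violated = []
    enc_type = enc.get("type")
    scale = enc.get("scale") or {}

    # bin_q_o: bin only on quantitative or ordinal data
    if enc.get("bin") and enc_type not in ("quantitative", "ordinal"):
        violated.append("bin_q_o")
    # bin_and_aggregate: not both on the same encoding
    if enc.get("bin") and enc.get("aggregate"):
        violated.append("bin_and_aggregate")
    # aggregate_nominal: nominal data cannot be aggregated
    if enc.get("aggregate") and enc_type == "nominal":
        violated.append("aggregate_nominal")
    # log_zero: a log scale cannot include a zero baseline
    if scale.get("type") == "log" and scale.get("zero"):
        violated.append("log_zero")
    # size_nominal: size implies order
    if channel == "size" and enc_type == "nominal":
        violated.append("size_nominal")
    # encoding_no_field_and_not_count: need a field or a count aggregation
    if "field" not in enc and enc.get("aggregate") != "count":
        violated.append("encoding_no_field_and_not_count")
    return violated


size_result = check_encoding("size", {"field": "Origin", "type": "nominal"})
```

Rules that depend on the data itself, such as log_non_positive or size_negative, would also need the derived data properties and are left out here.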
| Rule | Meaning | 
|---|---|
| repeat_channel | Use each channel only once. | 
| no_encodings | Use at least one encoding. Otherwise, the visualization doesn't show anything. | 
| same_field_x_and_y | Use different fields for x axis and y axis. | 
| count_twice | Use count aggregation once in the visualization. | 
| stack_without_summative_agg | Only use summative aggregation (count, sum, distinct, valid, missing) with stack in the encoding. | 
| stack_without_discrete_color_1 | Only use stack with a color channel encoding discrete data in the visualization. | 
| stack_without_discrete_color_2 | Only use stack with a color channel encoding discrete data in the visualization. | 
| stack_without_discrete_color_3 | Only use stack with a color channel encoding discrete data in the visualization. | 
| stack_with_non_positional_non_agg | When using stack in the visualization, apply aggregation in non-positional continuous channels (color, size). | 
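Unlike the previous group, these rules look at several encodings together. A minimal sketch of three of them in plain Python follows. `check_across_encodings` is a hypothetical name, and the logic is a paraphrase of the table rather than the library's Answer Set Programming rules.

```python
def check_across_encodings(spec):
    """Return the names of a few cross-encoding rules that `spec` appears to break."""
    encoding = spec.get("encoding") or {}
    violated = []

    # no_encodings: at least one encoding is needed
    if not encoding:
        violated.append("no_encodings")

    # same_field_x_and_y: x and y should encode different fields
    x_field = encoding.get("x", {}).get("field")
    y_field = encoding.get("y", {}).get("field")
    if x_field is not None and x_field == y_field:
        violated.append("same_field_x_and_y")

    # count_twice: count aggregation at most once
    counts = sum(1 for enc in encoding.values() if enc.get("aggregate") == "count")
    if counts > 1:
        violated.append("count_twice")
    return violated


same_field = check_across_encodings({
    "mark": "point",
    "encoding": {
        "x": {"field": "Horsepower", "type": "quantitative"},
        "y": {"field": "Horsepower", "type": "quantitative"},
    },
})
```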
| Rule | Meaning | 
|---|---|
| point_tick_bar_without_x_or_y | Use x or y channel for mark 'point', 'tick', and 'bar'. | 
| line_area_without_x_y | Use x and y channels for mark 'line' and 'area'. | 
| bar_tick_continuous_x_y | Use at most one continuous field in the x and y channels for mark 'bar' and 'tick'. | 
| bar_tick_area_line_without_continuous_x_y | Mark 'bar', 'tick', 'line', 'area' require some continuous variable on x or y. | 
| bar_area_without_zero_1 | Mark 'bar' and 'area' require the scale of the x-axis to start at zero, when the x-axis encodes quantitative data. | 
| bar_area_without_zero_2 | Mark 'bar' and 'area' require the scale of the y-axis to start at zero, when the y-axis encodes quantitative data. | 
| size_without_point | The size channel works best with the mark 'point'. | 
| stack_without_bar_area | Only use stacking for the mark 'bar' and 'area'. | 
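The mark-related rules can be read as checks on the mark type plus the set of channels in use. The sketch below paraphrases four of them in plain Python. It assumes `mark` is given as a string, as in the getting-started sample, and `check_mark` is a hypothetical helper rather than part of the library.

```python
def check_mark(spec):
    """Return the names of a few mark-related rules that `spec` appears to break."""
    mark = spec.get("mark")  # assumption: a plain string such as "bar"
    encoding = spec.get("encoding") or {}
    has_x, has_y = "x" in encoding, "y" in encoding
    violated = []

    if mark in ("point", "tick", "bar") and not (has_x or has_y):
        violated.append("point_tick_bar_without_x_or_y")
    if mark in ("line", "area") and not (has_x and has_y):
        violated.append("line_area_without_x_y")
    if "size" in encoding and mark != "point":
        violated.append("size_without_point")
    if mark not in ("bar", "area") and any(e.get("stack") for e in encoding.values()):
        violated.append("stack_without_bar_area")
    return violated


# The getting-started sample: a bar chart that also uses the size channel.
sample = {
    "data": {"url": "data/cars.json"},
    "mark": "bar",
    "encoding": {
        "x": {"field": "Horsepower", "type": "quantitative"},
        "y": {"field": "Miles_per_Gallon", "type": "quantitative"},
        "size": {"field": "Cylinders", "type": "ordinal"},
    },
}
sample_result = check_mark(sample)
```

This sketch flags only size_without_point for the sample. Rules that need to know which fields are continuous, such as bar_tick_continuous_x_y, are not covered.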
| Rule | Meaning | 
|---|---|
| invalid_mark | Use valid mark type, including 'area', 'bar', 'line', 'point', 'tick'. | 
| invalid_channel | Use valid channels, including x, y, color, size. | 
| invalid_type | Use valid types, including quantitative, nominal, ordinal, temporal. | 
| invalid_agg | Use valid aggregation, including count, mean, median, min, max, stdev, sum, etc. | 
| invalid_bin | Use non-negative number for bin amounts (maxbins). | 
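The checks in this last group amount to comparing property values against the allowed sets listed above. A minimal sketch follows, with `check_values` as a hypothetical helper. invalid_agg is omitted because the table's list of aggregations ends in "etc." and is therefore not complete enough to validate against.

```python
VALID_MARKS = {"area", "bar", "line", "point", "tick"}
VALID_CHANNELS = {"x", "y", "color", "size"}
VALID_TYPES = {"quantitative", "nominal", "ordinal", "temporal"}


def check_values(spec):
    """Return the names of the invalid_* rules that `spec` appears to break."""
    violated = []

    def flag(rule):
        # Report each rule name once, in the order it was first seen.
        if rule not in violated:
            violated.append(rule)

    if spec.get("mark") not in VALID_MARKS:
        flag("invalid_mark")
    for channel, enc in (spec.get("encoding") or {}).items():
        if channel not in VALID_CHANNELS:
            flag("invalid_channel")
        if "type" in enc and enc["type"] not in VALID_TYPES:
            flag("invalid_type")
        bin_spec = enc.get("bin")
        # Assumption: the bin amount is given as a number under "maxbins".
        if isinstance(bin_spec, dict) and bin_spec.get("maxbins", 0) < 0:
            flag("invalid_bin")
    return violated


bad = check_values({
    "mark": "pie",
    "encoding": {"theta": {"field": "a", "type": "numeric", "bin": {"maxbins": -5}}},
})
```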
Vega-lite-linter was developed by the iDVx Lab together with AntV. Based on this technology, AntV and iDVx Lab also developed ChartLinter in JavaScript to support visualization charts beyond Vega-Lite.
If you have any questions, please feel free to open an issue or contact idvx.lab [at] gmail.com.
The software is available under the MIT License.