🚩 What is redflag?
Overview
Redflag is a Python library that applies “safety by design” to machine
learning. It helps researchers and practitioners in this field ensure their
models are safe and reliable by alerting them to potential pitfalls. These
pitfalls could lead to overconfidence in the model or wildly spurious
predictions. Redflag offers accessible ways to integrate safety checks into
existing workflows by providing scikit-learn transformers, pandas accessors,
and standalone functions. These components help identify issues and improve
the quality and safety of predictive models.
Safety by design
Safety by design means ‘designing out’ hazardous situations from complex machines or processes before they can do harm. The concept, also known as prevention through design, has been applied to civil engineering and industrial design for decades. Recently it has also been applied to software engineering and, more recently still, to machine learning [@van-gelder-etal-2021]. Redflag helps machine learning researchers and practitioners design safety into their workflows.
To read more about the motivation for this package, check out the draft paper submitted to JOSS.
What’s in redflag
Redflag offers three ways for users to insert safety checks into their machine learning workflows:
scikit-learn transformers, which fit directly into the pipelines that most data scientists are already using, e.g. redflag.ImbalanceDetector().fit_transform(X, y).
pandas accessors on Series and DataFrames, which can be called like a method on existing pandas objects, e.g. df['target'].redflag.is_imbalanced().
Standalone functions, with which users can compose their own checks and tests, e.g. redflag.is_imbalanced(y).
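To make the idea of a composable standalone check concrete, here is a minimal sketch in plain Python. It is not redflag's implementation: the function names, the minority-fraction criterion, and the 0.2 threshold are all assumptions chosen for illustration, and redflag's own is_imbalanced may use a different definition of imbalance.

```python
from collections import Counter


def minority_fraction(y):
    """Return the share of samples belonging to the rarest class in y."""
    counts = Counter(y)
    return min(counts.values()) / len(y)


def is_imbalanced_sketch(y, threshold=0.2):
    """Hypothetical check: flag y if its rarest class falls below threshold.

    The threshold is an arbitrary illustrative value, not a redflag default.
    """
    return minority_fraction(y) < threshold


# Nine samples of class 0 and one of class 1: the rarest class is 10% of y.
y = [0] * 9 + [1]
print(is_imbalanced_sketch(y))  # True
```

Because the check is an ordinary function returning a boolean, it can be dropped into an assertion, a unit test, or a larger validation routine.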
There are two kinds of scikit-learn transformer:
Detectors check every dataset they encounter. For example, redflag.ClippingDetector checks for clipped data during both model fitting and prediction.
Comparators learn some parameters in the model fitting step, then check subsequent data against those parameters. For example, redflag.DistributionComparator learns the empirical univariate distributions of the training features, then checks that the features in subsequent datasets are tolerably close to these baselines.
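The difference between the two kinds can be sketched with a pair of toy classes. These are hypothetical and dependency-free: they only mimic the fit/transform interface, the class names are invented for this example, and the checks are deliberately crude (a repeated extreme value as a hint of clipping, and a simple range test standing in for a real distribution comparison). They are not how redflag implements its detectors or comparators.

```python
import warnings


class ClipDetectorSketch:
    """Hypothetical detector: runs the same check on every dataset it sees."""

    def _check(self, X):
        # X is a list of rows; zip(*X) iterates over its columns.
        for j, column in enumerate(zip(*X)):
            lo, hi = min(column), max(column)
            # Crude heuristic: a repeated extreme value hints at clipping.
            if column.count(lo) > 1 or column.count(hi) > 1:
                warnings.warn(f"Feature {j} may be clipped.")

    def fit(self, X, y=None):
        self._check(X)  # checked during fitting...
        return self

    def transform(self, X):
        self._check(X)  # ...and again during prediction
        return X


class RangeComparatorSketch:
    """Hypothetical comparator: learns from training data, checks later data."""

    def fit(self, X, y=None):
        # Learn a per-feature baseline (here, just the min and max).
        self.ranges_ = [(min(c), max(c)) for c in zip(*X)]
        return self

    def transform(self, X):
        # Compare subsequent data against the learned baseline.
        for j, (column, (lo, hi)) in enumerate(zip(zip(*X), self.ranges_)):
            if min(column) < lo or max(column) > hi:
                warnings.warn(f"Feature {j} falls outside the training range.")
        return X
```

The detector holds no learned state, so it behaves identically at fit and transform time. The comparator is silent during fit and only has something to say once it sees data that departs from what it learned.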
Although the scikit-learn components are implemented as transformers,
subclassing sklearn.base.BaseEstimator and sklearn.base.TransformerMixin, they
do not transform the data. They only raise warnings (or, optionally,
exceptions) when a check fails. Redflag does not attempt to fix any problems
it encounters.
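The pass-through behaviour described above can be illustrated with a small sketch. Everything here is an assumption made for the example: the class name, the constant-feature check, and the raise_error flag are hypothetical and are not part of redflag's API. A real component would subclass the scikit-learn base classes; this version avoids that dependency to stay self-contained.

```python
import warnings


class ConstantFeatureCheckSketch:
    """Hypothetical pass-through check that warns, or optionally raises."""

    def __init__(self, raise_error=False):
        # raise_error is an illustrative flag, not a redflag parameter.
        self.raise_error = raise_error

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # X is a list of rows; zip(*X) iterates over its columns.
        for j, column in enumerate(zip(*X)):
            if len(set(column)) == 1:
                message = f"Feature {j} is constant."
                if self.raise_error:
                    raise ValueError(message)
                warnings.warn(message)
        # The data is returned untouched: the check reports, it does not fix.
        return X
```

Returning the input unchanged is what lets a check like this sit anywhere in a pipeline without altering downstream results. Escalating to an exception suits settings, such as automated training jobs, where a failed check should stop the run.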