Preprocessing

prep_causarray_data() is the first step in the causarray pipeline. It validates and formats the count matrix Y, the treatment matrix A, and the optional covariate matrices X and X_A into the shapes expected by fit_gcate() and LFC(). When intercept=True (the default), an intercept column is added to X. A standardised log-library-size covariate is appended to X_A for use in the propensity model.

causarray.utils.prep_causarray_data(Y, A, X=None, X_A=None, intercept=True)

Prepares the input data for the causarray model.

Parameters:

Y (array-like): The response matrix.
A (array-like): The treatment matrix.
X (array-like, optional): The covariate matrix. Defaults to None.
X_A (array-like, optional): The covariate matrix for the treatment. Defaults to None.
intercept (bool, optional): Whether to include an intercept in the covariate matrix. Defaults to True.

Returns:

Y (array): The processed response matrix.
A (array): The processed treatment matrix.
X (array): The processed covariate matrix.
X_A (array): The processed covariate matrix with the log library size.
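
The two automatic additions described above (the intercept column in X and the standardised log library size in X_A) can be illustrated with a minimal NumPy sketch. This is not causarray's implementation: prep_sketch is a hypothetical stand-in, and several details are assumptions made for illustration only, namely that library size is the row sum of Y, that the natural log and the population standard deviation are used, that every sample has a non-zero total count, and that X_A falls back to X when it is not supplied.

```python
import numpy as np


def prep_sketch(Y, A, X=None, X_A=None, intercept=True):
    """Hypothetical illustration of the documented steps; not causarray's code."""
    Y = np.asarray(Y, dtype=float)
    A = np.asarray(A, dtype=float)
    if A.ndim == 1:
        A = A[:, None]  # treat a single treatment vector as one column
    n = Y.shape[0]
    if A.shape[0] != n:
        raise ValueError("Y and A must have the same number of rows")

    # Covariates: start from an empty (n, 0) block when none are given.
    if X is None:
        X = np.empty((n, 0))
    else:
        X = np.asarray(X, dtype=float).reshape(n, -1)
    if intercept:
        X = np.hstack([np.ones((n, 1)), X])  # prepend a column of ones

    # Assumption: X_A defaults to a copy of X when it is not supplied.
    if X_A is None:
        X_A = X.copy()
    else:
        X_A = np.asarray(X_A, dtype=float).reshape(n, -1)

    # Assumption: library size is the total count per sample (row sum of Y),
    # log-transformed and standardised to mean 0 and unit variance.
    log_lib = np.log(Y.sum(axis=1))
    log_lib = (log_lib - log_lib.mean()) / log_lib.std()
    X_A = np.hstack([X_A, log_lib[:, None]])

    return Y, A, X, X_A


# Toy input: 4 samples, 3 genes, one binary treatment.
Y_toy = np.array([[1, 2, 3], [4, 5, 6], [10, 20, 30], [7, 0, 1]])
A_toy = np.array([0, 1, 0, 1])
Y_out, A_out, X_out, X_A_out = prep_sketch(Y_toy, A_toy)
print(X_out.shape, X_A_out.shape)  # (4, 1) (4, 2)
```

With no covariates supplied, X holds only the intercept column, and X_A holds that column plus the log-library-size covariate. The real function is called with the same arguments and returns the four arrays in the order listed under Returns.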
