Preprocessing

prep_causarray_data() is the first step in the causarray pipeline. It validates and formats the count matrix Y, the treatment matrix A, and the optional covariate matrices X and X_A into the shapes expected by fit_gcate() and LFC(). By default (intercept=True), an intercept column is added to X. A standardised log-library-size covariate is appended to X_A for the propensity model.
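The steps described above can be pictured with a small NumPy sketch. This is not causarray's implementation: the function name sketch_prep, the helper _as_2d, the choice of row sums of Y as the library size, and the fallback of X_A to a copy of X when X_A is not supplied are all assumptions made for illustration.

```python
import numpy as np


def _as_2d(M, n):
    """Hypothetical helper: coerce M to a float array with n rows and 2 dims."""
    M = np.asarray(M, dtype=float)
    if M.ndim == 1:
        M = M[:, None]
    if M.ndim != 2 or M.shape[0] != n:
        raise ValueError(f"expected {n} rows, got shape {M.shape}")
    return M


def sketch_prep(Y, A, X=None, X_A=None, intercept=True):
    """Illustrative stand-in for the preprocessing step (not the library code)."""
    Y = np.asarray(Y, dtype=float)
    if Y.ndim != 2:
        raise ValueError("Y must be a 2-D matrix (observations x features)")
    n = Y.shape[0]
    A = _as_2d(A, n)

    # Covariates: start from an empty block when X is not given.
    X = np.empty((n, 0)) if X is None else _as_2d(X, n)
    if intercept:
        X = np.hstack([np.ones((n, 1)), X])

    # Assumption: X_A falls back to a copy of X when it is not supplied.
    X_A = X.copy() if X_A is None else _as_2d(X_A, n)

    # Assumption: library size is the per-row total count of Y.
    log_lib = np.log(Y.sum(axis=1))
    sd = log_lib.std()
    log_lib = (log_lib - log_lib.mean()) / (sd if sd > 0 else 1.0)

    # Append the standardised log library size as the last column of X_A.
    X_A = np.hstack([X_A, log_lib[:, None]])
    return Y, A, X, X_A


Y_demo = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [2, 2, 2]])
A_demo = np.array([0, 1, 0, 1])
Y_p, A_p, X_p, XA_p = sketch_prep(Y_demo, A_demo)
print(X_p.shape, XA_p.shape)
```

In this sketch the one-dimensional treatment vector becomes a single-column matrix, X holds only the intercept column, and X_A gains one extra column whose values have mean zero and unit standard deviation.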

causarray.utils.prep_causarray_data(Y, A, X=None, X_A=None, intercept=True)

Prepares the input data for the causarray model.

Parameters:
Y : array-like

The response matrix.

A : array-like

The treatment matrix.

X : array-like, optional

The covariate matrix. Defaults to None.

X_A : array-like, optional

The covariate matrix for the treatment. Defaults to None.

intercept : bool, optional

Whether to include an intercept in the covariate matrix. Defaults to True.

Returns:
Y : array

The processed response matrix.

A : array

The processed treatment matrix.

X : array

The processed covariate matrix.

X_A : array

The processed covariate matrix with the log library size.