Preprocessing
prep_causarray_data() is the first step in the causarray pipeline.
It validates the count matrix Y, the treatment matrix A, and the
optional covariate matrices X and X_A, and formats them into the
shapes expected by fit_gcate() and LFC(). With the default
intercept=True, an intercept column is added to X. A standardised
log-library-size covariate is appended to X_A for the propensity model.
- causarray.utils.prep_causarray_data(Y, A, X=None, X_A=None, intercept=True)
Prepares the input data for the causarray model.
- Parameters:
- Y : array-like
The response (count) matrix.
- A : array-like
The treatment matrix.
- X : array-like, optional
The covariate matrix. Defaults to None.
- X_A : array-like, optional
The covariate matrix for the treatment (propensity) model. Defaults to None.
- intercept : bool, optional
Whether to add an intercept column to the covariate matrix X. Defaults to True.
- Returns:
- Y : array
The processed response matrix.
- A : array
The processed treatment matrix.
- X : array
The processed covariate matrix.
- X_A : array
The processed treatment covariate matrix, with the standardised log library size appended.
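The two transformations described in the overview (adding an intercept column to X and appending a standardised log-library-size column to X_A) can be illustrated with a small NumPy sketch. This is not causarray's implementation. The function name `prep_sketch` is hypothetical, and several details are assumptions made only for illustration: library size is taken as the row sum of Y, `log1p` is used to guard against zero counts, a 1-D treatment vector is reshaped to a single column, and X_A falls back to X when it is not supplied.

```python
import numpy as np


def prep_sketch(Y, A, X=None, X_A=None, intercept=True):
    """Illustrative stand-in for the preprocessing step (hypothetical helper)."""
    Y = np.asarray(Y, dtype=float)
    A = np.asarray(A, dtype=float)
    if Y.ndim != 2:
        raise ValueError("Y must be 2-D (samples x features)")
    n = Y.shape[0]

    # Assumption: a 1-D treatment vector is treated as a single column.
    if A.ndim == 1:
        A = A.reshape(-1, 1)
    if A.shape[0] != n:
        raise ValueError("A and Y must have the same number of rows")

    # Assumption: with no covariates supplied, X starts as an empty n x 0 matrix.
    X = np.empty((n, 0)) if X is None else np.asarray(X, dtype=float).reshape(n, -1)
    if intercept:
        X = np.column_stack([np.ones(n), X])

    # Assumption: X_A falls back to a copy of X when it is not supplied.
    X_A = X.copy() if X_A is None else np.asarray(X_A, dtype=float).reshape(n, -1)

    # Assumption: library size is the total count per row of Y.
    log_lib = np.log1p(Y.sum(axis=1))
    sd = log_lib.std()
    log_lib = (log_lib - log_lib.mean()) / (sd if sd > 0 else 1.0)
    X_A = np.column_stack([X_A, log_lib])
    return Y, A, X, X_A


# Toy data: 6 samples, 4 features, one binary treatment.
rng = np.random.default_rng(0)
Y = rng.poisson(5.0, size=(6, 4))
A = rng.integers(0, 2, size=6)

Y_p, A_p, X_p, X_A_p = prep_sketch(Y, A)
print(X_p.shape, X_A_p.shape)  # (6, 1) (6, 2)
```

With no covariates supplied, X contains only the intercept column and X_A contains the intercept plus the library-size column. The real function may differ in how it handles defaults and validation, so its returned shapes should be checked against your own data.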