abagen.normalize_expression

abagen.normalize_expression(expression, norm='srs', structures=None, ignore_warn=False)

Performs normalization on expression data

Parameters
  • expression (list of (S, G) pandas.DataFrame) – Microarray expression data to be normalized, where S is samples (or regions) and G is genes

  • norm (str, optional) – Method used to normalize expression data. See Notes for more information on options. Default: ‘srs’ (an alias of ‘scaled_robust_sigmoid’)

  • structures (list of (S,) pandas.DataFrame, optional) – Structural designations of the S samples (or regions) in expression. The index of each provided data frame should be identical to that of the corresponding data frame in expression, and each must contain at least a ‘structure’ column. If provided, normalization is performed separately for each distinct structural class. Default: None

  • ignore_warn (bool, optional) – Whether to suppress potential warnings raised by normalization. Default: False
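The per-structure behaviour described for structures can be pictured as a grouped, column-wise transform. The following is a minimal pandas sketch of that idea and not abagen's implementation: the toy data, the structure labels, and the z-score used as a stand-in normalization are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical toy data for one donor: 6 samples x 3 genes
expression = pd.DataFrame(
    rng.normal(size=(6, 3)),
    index=pd.Index(range(1, 7), name='sample'),
    columns=['gene_a', 'gene_b', 'gene_c'],
)

# Structural designation of each sample: same index as `expression`,
# with a 'structure' column (the labels here are made up)
structures = pd.DataFrame(
    {'structure': ['class_1'] * 3 + ['class_2'] * 3},
    index=expression.index,
)

# Normalize every gene separately within each structural class.
# A z-score (ddof=1) stands in for whichever `norm` option is chosen.
normalized = expression.groupby(structures['structure']).transform(
    lambda col: (col - col.mean()) / col.std(ddof=1)
)
```

Because the transform is applied within each group, a gene's values in one structural class never influence the scaling of its values in another.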

Returns

normalized – Data from expression normalized separately for each gene

Return type

list of (S, G) pandas.DataFrame

Notes

The following methods can be used for normalizing gene expression values for each donor (adapted from [PC2]):

  1. norm='center'

Removes the mean of the data in each column. Aliased to ‘demean’.

  2. norm='zscore'

Applies a basic z-score (subtract the mean, divide by the standard deviation) to each column; the standard deviation is computed with one delta degree of freedom (ddof=1).

  3. norm='minmax'

Scales the data in each column to the unit interval (i.e., range 0-1).

  4. norm='sigmoid'

Applies a sigmoidal transform function to normalize the data in each column. Aliased to ‘sig’.

  5. norm='scaled_sigmoid'

Combines ‘sigmoid’ and ‘minmax’. Aliased to ‘scaled_sig’.

  6. norm='scaled_sigmoid_quantiles'

Caps input data at the 5th and 95th percentiles before performing the ‘scaled_sigmoid’ transform. Aliased to ‘scaled_sig_qnt’.

  7. norm='robust_sigmoid'

Uses a robust sigmoid function ([PC1]) to normalize the data in each column. Aliased to ‘rs’ and ‘rsig’.

  8. norm='scaled_robust_sigmoid'

Combines ‘robust_sigmoid’ and ‘minmax’. Aliased to ‘srs’ and ‘scaled_rsig’.

  9. norm='mixed_sigmoid'

Uses the ‘scaled_sigmoid’ transform for columns where the IQR is 0; otherwise, uses the ‘scaled_robust_sigmoid’ transform. Aliased to ‘mixed_sig’.

  10. norm='batch'

Uses a linear model to remove donor effects from the data. Differs from the other methods in that all donors are fit simultaneously to the same model and the data are residualized based on the estimated betas. The linear model includes an intercept, which is not removed during residualization.
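To make the options above more concrete, here is a rough NumPy/pandas sketch of most of them (‘scaled_sigmoid_quantiles’ is omitted). It is an illustration under stated assumptions, not abagen's code. The function names simply mirror the norm values and are hypothetical helpers. The logistic form of the sigmoid, and the use of the median and IQR / 1.35 in the robust variant, are assumed from the usual formulation in the cited references. The reference-coded donor design matrix in batch is likewise an assumption about how "donor effects" might be modelled.

```python
import numpy as np
import pandas as pd


def center(df):
    return df - df.mean(axis=0)


def zscore(df):
    # ddof=1 gives the sample standard deviation
    return center(df) / df.std(axis=0, ddof=1)


def minmax(df):
    lo, hi = df.min(axis=0), df.max(axis=0)
    return (df - lo) / (hi - lo)


def sigmoid(df):
    # Assumed form: logistic function of the z-scored data
    return 1 / (1 + np.exp(-zscore(df)))


def robust_sigmoid(df):
    # Assumed form: median and IQR / 1.35 replace the mean and std
    iqr = df.quantile(0.75) - df.quantile(0.25)
    return 1 / (1 + np.exp(-(df - df.median(axis=0)) / (iqr / 1.35)))


def scaled_sigmoid(df):
    return minmax(sigmoid(df))


def scaled_robust_sigmoid(df):
    return minmax(robust_sigmoid(df))


def mixed_sigmoid(df):
    # Fall back to 'scaled_sigmoid' for columns whose IQR is 0
    iqr = df.quantile(0.75) - df.quantile(0.25)
    cols = {}
    for col in df.columns:
        fn = scaled_sigmoid if iqr[col] == 0 else scaled_robust_sigmoid
        cols[col] = fn(df[[col]])[col]
    return pd.DataFrame(cols, index=df.index)


def batch(expression):
    """Residualize donor effects from a list of (S, G) data frames."""
    sizes = [len(e) for e in expression]
    donor = np.repeat(np.arange(len(expression)), sizes)
    # Intercept plus reference-coded donor dummies (first donor = baseline)
    dummies = (donor[:, None] == np.arange(1, len(expression))).astype(float)
    X = np.column_stack([np.ones(len(donor)), dummies])
    Y = np.vstack([e.to_numpy(dtype=float) for e in expression])
    betas, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Subtract only the donor terms; the intercept (betas[0]) stays in
    resid = Y - dummies @ betas[1:]
    pieces = np.split(resid, np.cumsum(sizes)[:-1])
    return [pd.DataFrame(p, index=e.index, columns=e.columns)
            for p, e in zip(pieces, expression)]
```

All functions except batch act on a single (S, G) data frame and normalize each column (gene) independently. batch takes the full list of donors, which reflects the distinction drawn in the last item above. The sketch assumes every donor's data frame has the same gene columns in the same order.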

References

PC1

Fulcher, B. D., & Fornito, A. (2016). A transcriptional signature of hub connectivity in the mouse connectome. Proceedings of the National Academy of Sciences, 113(5), 1435-1440.

PC2

Fulcher, B. D., Little, M. A., & Jones, N. S. (2013). Highly comparative time-series analysis: the empirical structure of time series and their methods. Journal of the Royal Society Interface, 10(83), 20130048.