abagen.io.read_pacall

abagen.io.read_pacall(fname, copy=False, parquet=True)

Loads the PACall.csv file found at fname.

PA files contain a present/absent flag indicating whether the corresponding probe’s expression is above background noise. It is set to 1 when both of the following conditions are met:

  1. The mean signal of the probe’s expression is significantly different from the corresponding background, as assessed by a 2-sided t-test where p < 0.01, and

  2. The difference between the background-subtracted signal and the background is significant (> 2.6 * background standard deviation).
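As an illustration only, the two conditions above can be combined into a single flag as sketched below. The function name pa_call and its arguments are hypothetical; the sketch assumes the t-test p-value and the signal/background difference have already been computed elsewhere, and it is not the code used to generate the PACall.csv files.

```python
def pa_call(p_value, difference, background_sd, alpha=0.01, k=2.6):
    """Return 1 if both present/absent conditions hold, else 0.

    p_value       : p-value of the 2-sided t-test (probe vs. background)
    difference    : signal/background difference from condition 2
    background_sd : standard deviation of the background
    """
    significant = p_value < alpha                  # condition 1
    above_noise = difference > k * background_sd   # condition 2
    return int(significant and above_noise)


# Three hypothetical probes (made-up numbers).
flags = [
    pa_call(p_value=0.001, difference=30.0, background_sd=10.0),  # both hold
    pa_call(p_value=0.200, difference=30.0, background_sd=10.0),  # t-test fails
    pa_call(p_value=0.001, difference=20.0, background_sd=10.0),  # below 2.6 * SD
]
print(flags)
```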

This information can be used to discard “noisy” probes that might not be contributing high-quality expression information.
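One possible way to discard noisy probes is to keep only those flagged as present in at least some fraction of samples. The sketch below uses a small made-up (P, S) table in place of a real PACall.csv file; the probe IDs, sample IDs, and the 0.5 threshold are assumptions chosen for illustration.

```python
import pandas as pd

# Toy stand-in for the (P, S) table: rows are probes, columns are samples.
pacall = pd.DataFrame(
    [[1, 1, 1, 0],
     [0, 0, 1, 0],
     [1, 0, 1, 1]],
    index=pd.Index([1001, 1002, 1003], name="probe_id"),
    columns=[1, 2, 3, 4],
)

threshold = 0.5  # hypothetical cut-off, not a recommended value

# Fraction of samples in which each probe exceeded background noise.
present_fraction = pacall.mean(axis=1)

# Keep only the probes that meet the threshold.
filtered = pacall.loc[present_fraction >= threshold]
print(filtered)
```

Here probe 1002 is present in only one of four samples, so it is dropped, while probes 1001 and 1003 are retained.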

Parameters:
  • fname (str) – Path to PACall.csv file

  • copy (bool, optional) – Whether to return a copy if fname is a pre-loaded pandas.DataFrame. Default: False

  • parquet (bool, optional) – Whether to load data from a parquet file instead of CSV. If a parquet file does not already exist, one will be created for faster loading in the future. Only available if the fastparquet and python-snappy modules are installed. Default: True

Returns:

pacall – DataFrame containing a binary indicator of whether expression information for each probe exceeded background noise in a given sample, where P is probes and S is samples. The row index is the unique probe ID assigned during processing, which can be used to match data to the information obtained with read_probes(). The column index is the unique sample ID (integer, beginning at 1), which can be used to match data to the information obtained with read_annotation().

Return type:

(P, S) pandas.DataFrame
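Because the row index holds probe IDs, the returned table can be aligned with probe-level information by index. The sketch below uses toy DataFrames in place of the real outputs of read_pacall() and read_probes(); the gene_symbol column, the IDs, and the values are hypothetical.

```python
import pandas as pd

# Toy stand-in for the (P, S) present/absent table.
pacall = pd.DataFrame(
    [[1, 0],
     [1, 1],
     [0, 0]],
    index=pd.Index([1001, 1002, 1003], name="probe_id"),
    columns=[1, 2],
)

# Toy stand-in for probe information, indexed by the same probe IDs
# but stored in a different row order.
probes = pd.DataFrame(
    {"gene_symbol": ["A", "B", "C"]},
    index=pd.Index([1003, 1001, 1002], name="probe_id"),
)

# Restrict both tables to the probe IDs they share, then attach the
# number of samples in which each probe was flagged as present.
shared = pacall.index.intersection(probes.index)
annotated = probes.loc[shared].assign(
    n_present=pacall.loc[shared].sum(axis=1)
)
print(annotated)
```

The same index-based approach should apply to the columns, which hold the sample IDs used to match against sample annotations.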