7. Sample aggregation options¶

The primary goal of abagen.get_expression_data() is to allow users to aggregate the ~3,700 disparate tissue samples from the Allen Human Brain Atlas into regions of interest defined by an atlas or parcellation file. However, there exist several options for exactly how to aggregate samples within each region of the specified atlas.

These options are controlled via two parameters to abagen.get_expression_data(): region_agg and agg_metric. We discuss both parameters and the different options available to each below.

7.1. The region_agg parameter¶

This parameter determines how samples are aggregated together to generate the expression values for a region. It can take two values: ‘donors’ or ‘samples’.

If set to ‘donors’, expression values for all samples assigned to a region are aggregated independently for each donor and then aggregated across donors. If set to ‘samples’, expression values for all samples for all donors assigned to a region are aggregated simultaneously.

7.2. The agg_metric parameter¶

This parameter determines the actual metric used for aggregating samples into regional expression values. It can be set to any callable function (as long as that function accepts the keyword axis argument), but generally either ‘mean’ (the default) or ‘median’ will suffice.