abagen: A toolbox for the Allen Brain Atlas genetics data¶
This package provides a Python interface for fetching and working with the Allen Human Brain Atlas (AHBA) microarray expression data.
Overview¶
In 2013, the Allen Institute for Brain Science released the Allen Human Brain Atlas, a dataset containing microarray expression data collected from six human brains (Hawrylycz et al., 2012). This dataset has offered an unprecedented opportunity to examine the genetic underpinnings of the human brain, and has already yielded novel insight into, e.g., adolescent brain development and functional brain organization.
However, to be used effectively in most analyses, the AHBA microarray expression data often needs to be (1) collapsed into regions of interest (e.g., parcels or networks), and (2) combined across donors. While this may seem trivial, there are a number of analytic choices in these steps that can dramatically influence the resulting data and any downstream analyses. Arnatkevičiūte et al. (2019) provided a thorough treatment of this issue, demonstrating how the techniques and code used to prepare the raw AHBA data have varied widely across published reports. We extended this work in a recent preprint (Markello et al., 2021) to quantify how different processing choices can impact statistical analyses of the AHBA.
The current Python package, abagen, aims to provide reproducible workflows for processing and preparing the AHBA microarray expression data for analysis.
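To make the two steps described above more concrete (collapsing samples into regions, then combining across donors), here is a minimal sketch on made-up numbers. It is not abagen's implementation: the sample values, the donor and region labels, and the helper names `collapse_to_regions` and `combine_across_donors` are all hypothetical, and simple averaging stands in for the many processing options a real workflow might use. It only illustrates why the order of operations is an analytic choice that can change the result.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data for ONE gene: (donor, region label, expression value).
samples = [
    ("donor1", 1, 2.0),
    ("donor1", 1, 4.0),
    ("donor1", 2, 6.0),
    ("donor2", 1, 5.0),
    ("donor2", 2, 7.0),
    ("donor2", 2, 9.0),
]


def collapse_to_regions(samples):
    """Step 1: average the samples within each (donor, region) pair."""
    grouped = defaultdict(list)
    for donor, region, value in samples:
        grouped[(donor, region)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


def combine_across_donors(per_donor):
    """Step 2: average the per-donor regional values for each region."""
    grouped = defaultdict(list)
    for (_donor, region), value in per_donor.items():
        grouped[region].append(value)
    return {region: mean(values) for region, values in grouped.items()}


per_donor = collapse_to_regions(samples)
combined = combine_across_donors(per_donor)

# Alternative choice: pool every sample in a region, ignoring which donor
# it came from. Donors with more samples in a region then weigh more.
pooled_values = defaultdict(list)
for _donor, region, value in samples:
    pooled_values[region].append(value)
pooled = {region: mean(values) for region, values in pooled_values.items()}

print("donor-wise then combined:", combined)
print("pooled across all samples:", pooled)
```

In this toy case the two strategies disagree for both regions, because the donors contribute unequal numbers of samples to each region.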
Installation requirements¶
Currently, abagen works with Python 3.6+ and requires a few dependencies:
- nibabel
- numpy (>=1.14.0)
- pandas (>=0.25.0), and
- scipy
There are some additional (optional) dependencies you can install to speed up some functions:
- fastparquet, and
- python-snappy
These latter packages are primarily used to facilitate loading the (rather large!) microarray expression dataframes provided by the Allen Institute.
For detailed information on how to install abagen, including these dependencies, refer to our installation instructions.
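If you are unsure whether the optional packages are present in your environment, a quick check along the following lines may help. This is only a sketch using the standard library: the helper name `missing_optional_dependencies` is hypothetical, and the mapping assumes the usual import names for these distributions (`fastparquet` and `snappy`).

```python
from importlib.util import find_spec

# Distribution name -> importable module name (assumed usual import names).
OPTIONAL_MODULES = {
    "fastparquet": "fastparquet",
    "python-snappy": "snappy",
}


def missing_optional_dependencies(modules=OPTIONAL_MODULES):
    """Return the distribution names whose module cannot be located."""
    return [dist for dist, module in modules.items() if find_spec(module) is None]


missing = missing_optional_dependencies()
if missing:
    print("Optional packages not installed:", ", ".join(missing))
else:
    print("All optional packages are available.")
```

The check only looks up whether each module can be found; it does not import the packages or verify their versions.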
Quickstart¶
At its core, using abagen is as simple as:
>>> import abagen
>>> expression = abagen.get_expression_data('myatlas.nii.gz')
where 'myatlas.nii.gz' points to a brain parcellation file.
This function can also be called from the command line with:
$ abagen --output-file expression.csv myatlas.nii.gz
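The Python call and the command-line invocation above can be tied together in a short sketch. It assumes that `abagen.get_expression_data` returns a pandas DataFrame (regions by genes), so that the result can be written out with `to_csv`; the wrapper name `save_expression` and both file paths are hypothetical. abagen is imported inside the function so that the definition itself loads even where the package is not installed.

```python
def save_expression(atlas_path, output_path):
    """Compute regional expression for a parcellation and save it as CSV.

    Hypothetical convenience wrapper, assuming the returned object is a
    pandas DataFrame. Calling it requires abagen and access to the AHBA data.
    """
    import abagen  # imported lazily; only needed when the function is called

    expression = abagen.get_expression_data(atlas_path)
    expression.to_csv(output_path)
    return expression


# Example call (commented out, since it needs abagen and the AHBA data):
# expression = save_expression('myatlas.nii.gz', 'expression.csv')
```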
For more detailed instructions on how to use abagen, please refer to our user guide!
Development and getting involved¶
If you’ve found a bug, are experiencing a problem, or have a question about
using the package, please head on over to our GitHub issues and make a new
issue with some information about it! Someone will try to get back to you
as quickly as possible, though please note that the primary developer for
abagen (@rmarkello) is a graduate student, so responses may take some time!
If you’re interested in getting involved in the project: welcome ✨!
We’re thrilled to welcome new contributors. You should start by reading our
code of conduct; all activity on abagen should adhere to the CoC. After
that, take a look at our contributing guidelines so you’re familiar with the
processes we (generally) try to follow when making changes to the repository!
Once you’re ready to jump in, head on over to our issues to see if there’s
anything you might like to work on.
Citing abagen¶
For up-to-date instructions on how to cite abagen, please refer to our documentation.
License Information¶
This codebase is licensed under the 3-clause BSD license. The full license
can be found in the LICENSE file in the abagen distribution.
Reannotated gene information located at abagen/data/reannotated.csv.gz and
individualized donor parcellations for the Desikan-Killiany atlas located at
abagen/data/native_dk are taken from Arnatkevičiūte et al., 2018 and are
separately licensed under the CC BY 4.0 license; these data can also be found on
figshare.
Corrected MNI coordinates used to match AHBA tissue samples to MNI space,
located at abagen/data/corrected_mni_coordinates.csv, are taken from the
alleninf package, provided under the 3-clause BSD license.
All microarray expression data is copyrighted under non-commercial reuse policies by the Allen Institute for Brain Science (© 2010 Allen Institute for Brain Science. Allen Human Brain Atlas. Available from: Allen Human Brain Atlas).
All trademarks referenced herein are property of their respective holders.
Contents¶
- Installation and setup
- What’s new
- Command-line usage
- User guide
- Getting involved
- Citing abagen
- Reference API
  - abagen.allen - Primary workflows
  - abagen.datasets - Fetching AHBA datasets
  - abagen.images - Image processing functions
  - abagen.correct - Post-processing corrections
  - abagen.matching - Functions for matching samples
  - abagen.reporting - Functions for generating reports
  - abagen.io - Loading AHBA data files
  - abagen.mouse - Working with the Allen Mouse Brain Atlas