2. Defining a parcellation
2.1. Acceptable parcellations
To process the microarray expression data from the AHBA you’ll need a
parcellation (or “atlas”). Here, we define a parcellation as either (1) a NIFTI
image in MNI space, or (2) a tuple of GIFTI images in fsaverage space (at
fsaverage5 resolution!). In both cases, parcels in the atlas should be denoted
by unique integer IDs (distinct across hemispheres). The primary workflows in
abagen are designed to readily accept any parcellation / atlas in this format;
however, if you want to use a different format please refer to Non-standard
parcellations.
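The requirement that parcel IDs be distinct across hemispheres is easy to check before handing a surface atlas to abagen. The following is a minimal sketch that uses plain Python lists as stand-ins for the per-hemisphere label arrays; the helper name `shared_parcel_ids` and the convention that 0 marks unlabeled vertices are assumptions made for this example and are not part of abagen.

```python
def shared_parcel_ids(lh_labels, rh_labels, background=0):
    """Return the sorted parcel IDs that appear in both hemispheres.

    `lh_labels` / `rh_labels` are iterables of integer labels (one per
    vertex). `background` is assumed to mark unlabeled vertices and is
    ignored. An empty result means the IDs are distinct across hemispheres.
    """
    lh_ids = set(lh_labels) - {background}
    rh_ids = set(rh_labels) - {background}
    return sorted(lh_ids & rh_ids)


# Toy data: the second pair reuses IDs 1 and 2 in both hemispheres
ok = shared_parcel_ids([0, 1, 1, 2], [0, 3, 4, 4])
clash = shared_parcel_ids([0, 1, 1, 2], [0, 1, 2, 2])
print(ok, clash)
```

Any IDs reported by such a check would need to be renumbered in one hemisphere before the atlas meets the definition above.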
For demonstration purposes, abagen has a copy of the Desikan-Killiany atlas
that you can use. By default, the fetcher loads the volumetric atlas:
>>> import abagen
>>> atlas = abagen.fetch_desikan_killiany()
The returned object atlas is a dictionary with two keys: image, which is a
filepath to a NIFTI image containing the atlas data, and info, which is a
filepath to a CSV file containing extra information about the parcellation:
>>> print(atlas['image'])
/.../data/atlas-desikankilliany.nii.gz
>>> print(atlas['info'])
/.../data/atlas-desikankilliany.csv
You can load the surface version of the atlas by providing the surface
parameter:
>>> atlas = abagen.fetch_desikan_killiany(surface=True)
>>> print(atlas['image'])
('/.../data/atlas-desikankilliany-lh.label.gii.gz', '/.../data/atlas-desikankilliany-rh.label.gii.gz')
2.2. Providing additional parcellation info
While only the image (i.e., NIFTI or GIFTIs) is required for processing the
microarray data, the CSV file with information on the parcellation scheme can
also be very useful. In particular, abagen can use the CSV file to constrain
the matching of tissue samples to anatomical regions in the atlas image.
Note
If you are using a surface atlas and your GIFTI files have valid label
tables, then abagen will automatically create a pandas.DataFrame with all
the relevant information described below. However, you can always provide a
separate CSV file if you are unsure, and this will override any label tables
present in the GIFTI files.
If you want to supply your own CSV file with information about an atlas you must ensure it has (at least) the following columns:
id: an integer ID corresponding to the labels in the atlas image
hemisphere: a left/right/bilateral hemispheric designation (i.e., ‘L’, ‘R’, or ‘B’)
structure: a broad structural class designation (i.e., one of ‘cortex’, ‘subcortex/brainstem’, ‘cerebellum’, ‘white matter’, or ‘other’)
For example, a valid CSV might look like this:
>>> import pandas as pd
>>> atlas_info = pd.read_csv(atlas['info'])
>>> print(atlas_info)
    id                    label  hemisphere            structure
0    1                 bankssts           L               cortex
1    2  caudalanteriorcingulate           L               cortex
2    3      caudalmiddlefrontal           L               cortex
..  ..                      ...         ...                  ...
80  81              hippocampus           R  subcortex/brainstem
81  82                 amygdala           R  subcortex/brainstem
82  83                brainstem           B  subcortex/brainstem

[83 rows x 4 columns]
Notice that extra columns (e.g., label) are okay as long as the three
required columns are present! If you want to confirm your file is formatted
correctly you can use abagen.images.check_atlas():
Note
If you are following this tutorial on your first pass through abagen, do
not run abagen.images.check_atlas() below: it converts the pair
atlas['image'], atlas['info'] into a single abagen.AtlasTree object. If you
run it accidentally, pass atlas directly in the following steps instead of
atlas['image'], atlas['info']. For more, see the section on Non-standard
parcellations.
>>> from abagen import images
>>> atlas = abagen.fetch_desikan_killiany()
>>> # atlas = images.check_atlas(atlas['image'], atlas['info'])
If something is amiss with the file this function will raise an error and try to give some information about what you should check for.
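Independent of abagen, the three required columns can be pre-checked with the standard library alone. The sketch below is only an illustration: the helper name `find_info_problems` is hypothetical, the accepted values are copied from the column descriptions above, and abagen's own validation may be stricter or more lenient than this.

```python
import csv
import io

REQUIRED = ('id', 'hemisphere', 'structure')
HEMISPHERES = {'L', 'R', 'B'}
STRUCTURES = {'cortex', 'subcortex/brainstem', 'cerebellum',
              'white matter', 'other'}


def find_info_problems(handle):
    """Return a list of human-readable problems found in an atlas info CSV."""
    reader = csv.DictReader(handle)
    missing = [col for col in REQUIRED if col not in (reader.fieldnames or [])]
    if missing:
        return ['missing column(s): ' + ', '.join(missing)]

    problems = []
    # start=2 because line 1 of the file is the header row
    for lineno, row in enumerate(reader, start=2):
        if not (row['id'] or '').strip().isdigit():
            problems.append(f"line {lineno}: id {row['id']!r} is not an integer")
        if row['hemisphere'] not in HEMISPHERES:
            problems.append(f"line {lineno}: bad hemisphere {row['hemisphere']!r}")
        if row['structure'] not in STRUCTURES:
            problems.append(f"line {lineno}: bad structure {row['structure']!r}")
    return problems


# In-memory CSV text stands in for a real file opened with open(path, newline='')
good = io.StringIO('id,label,hemisphere,structure\n'
                   '1,bankssts,L,cortex\n'
                   '83,brainstem,B,subcortex/brainstem\n')
bad = io.StringIO('id,label,hemisphere,structure\n'
                  '1,bankssts,left,cortex\n')
good_problems = find_info_problems(good)
bad_problems = find_info_problems(bad)
print(good_problems)
print(bad_problems)
```

An empty list means the file has the expected columns and values; extra columns such as label are ignored by this check.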
Important
You might be asking: “why should I provide this extra information for my
parcellation?” Providing this CSV file will ensure that microarray samples
designated as belonging to a given hemisphere/structure by the AHBA ontology
are not matched to regions in the atlas image with different
hemispheric/structural designations. That is, if the AHBA ontology specifies
that a tissue sample comes from the left hemisphere subcortex, it will only
ever be matched to regions in atlas belonging to the left hemisphere
subcortex.
While this seems trivial, it is very important because numerous tissue
samples lie on the boundaries between hemispheres and structural classes
(e.g., cortex/subcortex). In many instances, these samples won’t fall
directly within a region of the atlas, at which point abagen will attempt to
match them to nearby regions. Without the hemisphere/structure information
provided by this CSV file there is a high likelihood of misassigning
samples, leading to biased or skewed expression data.
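To make the effect of this constraint concrete, here is a toy sketch of nearest-region matching with and without hemisphere/structure filtering. It is not abagen's matching algorithm: the centroid coordinates are made up, the function name `nearest_region` is hypothetical, and a real atlas would be matched against voxels or vertices rather than two centroids.

```python
import math

# Hypothetical region centroids (made-up coordinates, in mm)
regions = [
    {'id': 1, 'hemisphere': 'L', 'structure': 'cortex', 'xyz': (-4.0, 20.0, 30.0)},
    {'id': 2, 'hemisphere': 'R', 'structure': 'cortex', 'xyz': (3.0, 20.0, 30.0)},
]
# A sample annotated as left hemisphere whose coordinate lies just across the midline
sample = {'hemisphere': 'L', 'structure': 'cortex', 'xyz': (1.0, 20.0, 30.0)}


def nearest_region(sample, regions, constrain=True):
    """Return the ID of the closest region centroid, or None if no candidates."""
    candidates = [
        r for r in regions
        if not constrain
        or (r['hemisphere'] == sample['hemisphere']
            and r['structure'] == sample['structure'])
    ]
    if not candidates:
        return None
    closest = min(candidates, key=lambda r: math.dist(r['xyz'], sample['xyz']))
    return closest['id']


unconstrained = nearest_region(sample, regions, constrain=False)
constrained = nearest_region(sample, regions, constrain=True)
print(unconstrained, constrained)
```

Without the constraint the sample is assigned to the right-hemisphere region simply because it is closer; with the constraint it is assigned to the left-hemisphere region that agrees with its annotation.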
2.3. Individualized parcellations
Rather than providing a single parcellation image that will be used for all
donors, you can provide a parcellation image for each donor in the space of
their “raw” (or native) T1w image. abagen ships with versions of the
Desikan-Killiany parcellation defined in donor-native space:
>>> atlas = abagen.fetch_desikan_killiany(native=True)
>>> print(atlas['image'].keys())
dict_keys(['9861', '10021', '12876', '14380', '15496', '15697'])
>>> print(atlas['image']['9861'])
/.../data/native_dk/9861/atlas-desikankilliany.nii.gz
Note here that atlas['image'] is a dictionary, where the keys are donor IDs
and the corresponding values are paths to the parcellation for each donor. The
primary workflows in abagen that accept a single atlas (i.e.,
abagen.get_expression_data() and abagen.get_samples_in_mask()) will also
accept a dictionary of this format.
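If you are assembling such a dictionary for your own donor-specific volumetric atlases, something like the following sketch may help. The directory layout (one sub-directory per donor ID, each holding a file named `atlas.nii.gz`) and the helper name `collect_donor_atlases` are assumptions for illustration only; adapt them to however your files are organized.

```python
import tempfile
from pathlib import Path


def collect_donor_atlases(root, filename='atlas.nii.gz'):
    """Map donor ID (sub-directory name) -> path of that donor's atlas image."""
    atlases = {}
    for donor_dir in sorted(Path(root).iterdir()):
        image = donor_dir / filename
        # Skip stray files and donors without an atlas image
        if donor_dir.is_dir() and image.exists():
            atlases[donor_dir.name] = str(image)
    return atlases


# Build a throwaway directory tree so the example runs anywhere
with tempfile.TemporaryDirectory() as root:
    for donor in ('9861', '10021'):
        (Path(root) / donor).mkdir()
        (Path(root) / donor / 'atlas.nii.gz').touch()
    donor_atlases = collect_donor_atlases(root)
    print(sorted(donor_atlases))
```

The resulting dictionary has the same shape as atlas['image'] above: donor IDs as string keys and one path per donor as values.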
We also provide donor-specific surface atlases (derived from the FreeSurfer
outputs that can be fetched with abagen.datasets.fetch_freesurfer()). These
atlases are also shipped with abagen and can be loaded with:
>>> atlas = abagen.fetch_desikan_killiany(native=True, surface=True)
>>> print(atlas['image'].keys())
dict_keys(['9861', '10021', '12876', '14380', '15496', '15697'])
>>> print(atlas['image']['9861'])
('/.../9861/atlas-desikankilliany-lh.label.gii.gz', '/.../9861/atlas-desikankilliany-rh.label.gii.gz')
Note that if you are using your own donor-specific surface atlases they must,
by default, be based on the geometry of the FreeSurfer surfaces provided with
abagen.datasets.fetch_freesurfer(). If you wish to use surface atlases based
on different geometry please refer to Non-standard parcellations, below.
Finally, when in doubt we recommend simply using a standard-space, group-level
atlas; however, we are actively investigating whether native-space atlases
provide any measurable benefits to the abagen workflows.
Note
The donor-native volumetric versions of the DK parcellation shipped with
abagen were generated by Arnatkevičiūte et al., 2018, NeuroImage, and are
provided under the CC BY 4.0 license. The donor-native surface versions of
the DK parcellation were generated by Romero-Garcia et al., 2017,
NeuroImage, and are also provided under the CC BY 4.0 license.
2.4. Non-standard parcellations
If you’d like to use a non-standard atlas in the primary abagen workflows,
that may be possible, with some caveats. The constraining factor is the
coordinates of the tissue samples from the AHBA: they are available in (1) the
native space of each donor’s MRI, or (2) MNI152 space, and we strongly
encourage you to use one of these options (rather than, e.g., attempting to
register the coordinates to a new space). If you provide a group-level atlas
the toolbox will default to using the MNI152 coordinates; if you provide
donor-specific atlases then the toolbox will use the native coordinates. Thus,
by default, abagen prefers that you use one of the atlas conformations
described above.
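The rule for choosing between the two coordinate sets can be summarized in a few lines. This is a schematic restatement of the paragraph above, not abagen's internal code; the function name `coordinate_space` is hypothetical, and the sketch assumes that donor-specific atlases are always passed as a dictionary keyed by donor ID.

```python
def coordinate_space(atlas):
    """Return which AHBA sample coordinates the rule above implies."""
    # A dict keyed by donor ID is treated as a set of donor-specific atlases;
    # anything else (a path, or a tuple of hemisphere paths) as group-level.
    return 'native' if isinstance(atlas, dict) else 'MNI152'


# Placeholder paths, elided in the same way as elsewhere in this guide
group_volume = '/.../atlas-desikankilliany.nii.gz'
donor_volumes = {'9861': '/.../9861/atlas-desikankilliany.nii.gz'}
print(coordinate_space(group_volume), coordinate_space(donor_volumes))
```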
However, if you have an atlas in a different space or resolution you can
(potentially) use it in the primary abagen workflows. To do this you will
need to create an abagen.AtlasTree object. All atlases provided are
internally coerced to AtlasTree instances, which are then used to assign
microarray tissue samples to parcels in the atlas.
Take, for example, a surface atlas in fsaverage6 resolution (by default, surface atlases are assumed to be fsaverage5 resolution). In this case, you simply need to supply the relevant geometry files for the atlas and specify the space of the atlas:
>>> from abagen import images
>>> atlas = ('/.../fsaverage6-lh.label.gii', '/.../fsaverage6-rh.label.gii')
>>> surf = ('/.../fsaverage6-lh.surf.gii', '/.../fsaverage6-rh.surf.gii')
>>> atlas = images.check_atlas(atlas, geometry=surf, space='fsaverage6')
The same procedure can be used for an atlas using fsLR geometry:
>>> from abagen import images
>>> atlas = ('/.../fslr32k-lh.label.gii', '/.../fslr32k-rh.label.gii')
>>> surf = ('/.../fslr32k-lh.surf.gii', '/.../fslr32k-rh.surf.gii')
>>> atlas = images.check_atlas(atlas, geometry=surf, space='fslr')