Using Cloud-Based Resources for Neuroimaging Research: A Practical Approach

By Deanna M. Barch, Sheena M. Posey Norris, and Maryann E. Martone
July 19, 2021 | Commentary

 

Introduction 

The scale and scope of human and animal neuroscience research have been increasing exponentially over the past decade. This growth has manifested both as increases in the number of participants in many studies and as increases in the volume and types of data collected from each individual [1,2,3]. Many of these efforts have been enabled by the ability to use “cloud-based” tools for storage and computation. By cloud-based tools, the authors of this manuscript mean storage, computational resources, and programs that are available to a wide array of users on demand via the internet through a particular provider’s cloud-based servers. Initially, the use of such resources required extensive expertise held by relatively few researchers and few institutions. While the use of these tools still requires a level of knowledge and expertise that is not necessarily widespread, the tools have become much more accessible, and a growing number of investigators are interested in harnessing their power in support of their research. However, many barriers to use, and consequences of misuse, still exist as these tools are used to support human and animal neuroscience research. As with many new technologies, investigators have a tendency to use the cloud like they use their local computer and storage resources. Misuse can lead to inefficiencies, extra costs, and sometimes unwitting security or privacy violations because different policies and costs apply to the use of the cloud. For example, rather than working with data directly in the cloud, researchers may continue to download copies of data to their local drives, not realizing that there are costs associated with downloading files from commercial clouds.

To better understand both the strengths of and barriers to appropriate use of such technology, the National Academies of Sciences, Engineering, and Medicine’s Forum on Neuroscience and Nervous System Disorders hosted a workshop on September 24, 2019, entitled “Neuroscience Data in the Cloud” [4]. This workshop explored the burgeoning use of cloud technology to advance neuroscience research and approaches to addressing current barriers [5].

Although the workshop highlighted great strengths in the use of cloud-based tools and the progress that has been made to date, numerous barriers and challenges remain for many researchers seeking to move into this space. Based on discussions at the workshop, it seemed clear that there would be value in generating an informational resource to help investigators and administrators in the field, at different levels of experience, understand, access, and successfully use cloud-based tools in support of neuroscience research, using human neuroimaging as an example. Human neuroimaging was chosen because it already has extensive cloud-based infrastructure and numerous tools, but the resource is meant to be useful for neuroscientific data of all kinds.

 

Developing a Guide for Neuroscience Data in the Cloud

To explore how such resources might be organized, a collaborative working group composed of interested individuals from the workshop came together. The Action Collaborative on Neuroscience Data in the Cloud (see Acknowledgments for a list of members) included a diverse group of individuals with a wide range of expertise in cloud-based tools as well as legal and ethical issues surrounding the use of cloud-based technology. (The Collaborative is an ad hoc activity convened under the auspices of the Forum on Neuroscience and Nervous System Disorders at the National Academies of Sciences, Engineering, and Medicine [the National Academies]. The work it produces does not necessarily represent the views of any one organization, the Forum, or the National Academies, and is not subjected to the review procedures of, nor is it a report or product of, the National Academies.) Members of the action collaborative produced a guide (https://training.incf.org/cloud-based-computer-matrix) that investigators could use to decide whether or not to use the cloud for their research and to learn how to use the cloud effectively.

 

Use Case Scenario and Evaluation Matrix

To provide useful examples for the field, the guide offers a use case scenario of an early-stage investigator with limited expertise using cloud-based technology resources and the types of informed choices the investigator would have to make across many different dimensions.

The guide includes an evaluation matrix (see Table 1) comprising the different types of concerns and issues that an investigator would address. These include dimensions relevant to the size and scope of the study (e.g., number of participants, amount of data per participant, length of study), considerations related to the type of data being collected (e.g., privacy and data sharing), the expertise and financial resources available to the investigator through the home institution, the number of institutions involved in the project, and requirements or desires in regard to data sharing and longevity of the data.

For each dimension in the matrix, a description, range of values or levels (e.g., the researcher’s skill level in cloud-based computing, level of privacy or security needed), and definitions of those levels are provided. Not all of the dimensions are technical. Examples include the issues involved in gaining institutional approval for cloud-based studies and the impact of involving multiple institutions in a cloud-based study (see Box 1). The matrix also includes information about options and choices for each of the considerations, as well as resources for gathering more information or training, things to avoid, relevant articles, tools, and user stories. Because the guide provides different value sets for each dimension, researchers will be able to consider their own use cases and evaluate them against the matrix. These value sets are not intended to encourage an investigator either to use or not use cloud-based resources. Instead, through this process, the goal is for researchers to gain a better understanding of how different levels of these value sets impact the use of the cloud for neuroscientific data and the resources available to them for effective and responsible cloud use. While the evaluation matrix provides an overview of those decision points, detailed information on the suggested next steps is provided in the full guide.
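As a rough illustration of how a use case might be checked against such a matrix, the sketch below models each dimension as a record with a description and a set of level definitions. It is not taken from the guide: the dimension keys, level labels, and level definitions are hypothetical placeholders, and only the two dimension descriptions echo examples mentioned above. In keeping with the intent of the matrix, the function reports what each chosen level means rather than recommending for or against cloud use.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dimension:
    """One row of a hypothetical evaluation matrix."""
    name: str
    description: str
    levels: Dict[str, str]  # level label -> definition (labels are illustrative)
    resources: List[str] = field(default_factory=list)


# Hypothetical entries; the real guide defines its own dimensions and levels.
MATRIX = [
    Dimension(
        name="cloud_skill_level",
        description="Researcher's skill level in cloud-based computing",
        levels={
            "novice": "No prior experience running analyses in the cloud",
            "experienced": "Has deployed storage and compute in the cloud",
        },
    ),
    Dimension(
        name="privacy_level",
        description="Level of privacy or security needed",
        levels={
            "open": "Data can be shared without access restrictions",
            "restricted": "Data require controlled access",
        },
    ),
]


def evaluate(matrix, use_case):
    """Map each dimension to the definition of the level chosen in use_case.

    Dimensions missing from use_case are flagged rather than guessed.
    """
    report = {}
    for dim in matrix:
        level = use_case.get(dim.name)
        if level is None:
            report[dim.name] = "not yet assessed"
        elif level not in dim.levels:
            raise ValueError(f"unknown level {level!r} for {dim.name!r}")
        else:
            report[dim.name] = dim.levels[level]
    return report


if __name__ == "__main__":
    # An early-stage investigator who has only assessed one dimension so far.
    for name, meaning in evaluate(MATRIX, {"cloud_skill_level": "novice"}).items():
        print(f"{name}: {meaning}")
```

Flagging unassessed dimensions keeps the gaps in a use case visible, which mirrors the idea that an investigator works through every dimension before deciding how to proceed.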

Future Directions of a Living Resource

Cloud-based technologies and approaches are constantly evolving; therefore, this resource is designed to be a living document that members of the field with experience in this domain update and modify as the space of cloud-based tools shifts and grows and as the concerns and considerations change. To support this, a living version of the evaluation matrix is available at the International Neuroinformatics Coordinating Facility (INCF.org) [6], where the documents will be available online and comments can be provided (https://training.incf.org/cloud-based-computer-matrix). The hope is that members of the community interested in this work will be willing to contribute to the comprehensiveness of this guide by sharing additional resources, best practices, and user stories. A working group has been established at INCF to handle updates and moderate discussions to help ensure that this guidance can be as helpful as possible to investigators who wish to engage with cloud-based tools, or who are already operating in this space but wish to gain greater knowledge or share the knowledge that they have gained.



References

  1. Harms, M. P., L. H. Somerville, B. M. Ances, J. Andersson, D. M. Barch, M. Bastiani, S. Y. Bookheimer, T. B. Brown, R. L. Buckner, G. C. Burgess, T. S. Coalson, M. A. Chappell, M. Dapretto, G. Douaud, B. Fischl, M. F. Glasser, D. N. Greve, C. Hodge, K. W. Jamison, S. Jbabdi, S. Kandala, X. Li, R. W. Mair, S. Mangia, D. Marcus, D. Mascali, S. Moeller, T. E. Nichols, E. C. Robinson, D. H. Salat, S. M. Smith, S. N. Sotiropoulos, M. Terpstra, K. M. Thomas, M. Dylan Tisdall, K. Ugurbil, A. van der Kouwe, R. P. Woods, L. Zöllei, D. C. Van Essen, and E. Yacoub. 2018. Extending the Human Connectome Project across ages: Imaging protocols for the Lifespan Development and Aging projects. NeuroImage 183: 972-984. https://doi.org/10.1016/j.neuroimage.2018.09.060.
  2. Miller, K. L., F. Alfaro-Almagro, N. K. Bangerter, D. L. Thomas, E. Yacoub, J. Xu, A. J. Bartsch, S. Jbabdi, S. N. Sotiropoulos, J. L. R. Andersson, L. Griffanti, G. Douaud, T. W. Okell, P. Weale, I. Dragonu, S. Garratt, S. Hudson, R. Collins, M. Jenkinson, P. M. Matthews, and S. M. Smith. 2016. Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nature Neuroscience 19(11): 1523-1536. https://doi.org/10.1038/nn.4393.
  3. Casey, B. J., T. Cannonier, M. I. Conley, A. O. Cohen, D. M. Barch, M. M. Heitzeg, M. E. Soules, T. Teslovich, D. V. Dellarco, H. Garavan, C. A. Orr, T. D. Wager, M. T. Banich, N. K. Speer, M. T. Sutherland, M. C. Riedel, A. S. Dick, J. M. Bjork, K. M. Thomas, B. Chaarani, M. H. Mejia, D. J. Hagler Jr., M. Daniela Cornejo, C. S. Sicat, M. P. Harms, N. U. F. Dosenbach, M. Rosenberg, E. Earl, H. Bartsch, R. Watts, J. R. Polimeni, J. M. Kuperman, D. A. Fair, A. M. Dale, and ABCD Imaging Acquisition Workgroup. 2018. The Adolescent Brain Cognitive Development (ABCD) study: Imaging acquisition across 21 sites. Developmental Cognitive Neuroscience 32: 43-54. https://doi.org/10.1016/j.dcn.2018.03.001.
  4. National Academies of Sciences, Engineering, and Medicine (NASEM). 2019. Neuroscience Data in the Cloud: A Workshop. Available at: https://www.nationalacademies.org/event/09-24-2019/neuroscience-data-in-the-cloud-a-workshop (accessed May 18, 2021).
  5. NASEM. 2020. Neuroscience Data in the Cloud: Opportunities and Challenges: Proceedings of a Workshop. Washington, DC: The National Academies Press. https://doi.org/10.17226/25653.
  6. Abrams, M. B., J. G. Bjaalie, S. Das, G. F. Egan, S. S. Ghosh, W. J. Goscinski, J. S. Grethe, J. H. Kotaleski, E. T. W. Ho, D. N. Kennedy, L. J. Lanyon, T. B. Leergaard, H. S. Mayberg, L. Milanesi, R. Mouček, J. B. Poline, P. K. Roy, S. C. Strother, T. B. Tang, P. Tiesinga, T. Wachtler, D. K. Wójcik, and M. E. Martone. 2021. A Standards Organization for Open and FAIR Neuroscience: the International Neuroinformatics Coordinating Facility. Neuroinformatics. https://doi.org/10.1007/s12021-020-09509-0.

DOI

https://doi.org/10.31478/202107b

Suggested Citation

Barch, D. M., S. M. Posey Norris, and M. E. Martone. 2021. Using cloud-based resources for neuroimaging research: A practical approach. NAM Perspectives. Commentary, National Academy of Medicine, Washington, DC. https://doi.org/10.31478/202107b.

Author Information

Deanna M. Barch, PhD, is professor and chair of the Department of Psychological & Brain Sciences, and the Gregory B. Couch Professor of Psychiatry at Washington University in St. Louis. Sheena M. Posey Norris, MS, is a program officer on the Board on Health Sciences Policy at the National Academies of Sciences, Engineering, and Medicine. Maryann E. Martone, PhD, is professor emerita at the University of California, San Diego, and Chair of the Governing Board at the International Neuroinformatics Coordinating Facility (INCF).

The authors are members and/or staff of the Action Collaborative on Neuroscience Data in the Cloud, an ad hoc activity convened under the auspices of the Forum on Neuroscience and Nervous System Disorders at the National Academies of Sciences, Engineering, and Medicine.

Acknowledgments

The authors wish to thank the members of the Action Collaborative on Neuroscience Data in the Cloud for their tremendous contributions to this practical guide. The members are Jonathan Cohen, Princeton University; Nita Farahany, Duke University; Gregory Farber, National Institute of Mental Health; Magali Haas, Cohen Veterans Bioscience; Sean Horgan, Verily Life Sciences; David Kennedy, University of Massachusetts Medical Center; Tara Madhyastha, Amazon Web Services; and Russell Poldrack, Stanford University.

In addition, we wish to thank Clare Stroud at the National Academies for her support and guidance during the work of the Action Collaborative.

Conflict-of-Interest Disclosures

Maryann Martone is a founder and has equity interest in SciCrunch, a tech start-up out of the University of California, San Diego, that develops tools to support rigor and reproducibility used in scientific publishing.

Correspondence

Questions or comments should be directed to Deanna Barch at dbarch@wustl.edu.

Disclaimer

The views expressed in this paper are those of the authors and not necessarily of the authors’ organizations, the National Academy of Medicine (NAM), the National Academies of Sciences, Engineering, and Medicine (the National Academies), or the Action Collaborative on Neuroscience Data in the Cloud. The paper is intended to help inform and stimulate discussion. It is not a report of the NAM or the National Academies. Copyright by the National Academy of Sciences. All rights reserved.

