INSPIRE: Teaming Citizen Science with Machine Learning to Deepen LIGO's View of the Cosmos

This project (INSPIRE 15-47880) has developed a citizen science system, Gravity Spy, to support the Advanced Laser Interferometer Gravitational-Wave Observatory (aLIGO), the most complicated experiment ever undertaken in gravitational physics. LIGO has opened up the window of gravitational wave observations on the Universe. However, the high detector sensitivity needed for astrophysical discoveries makes aLIGO very susceptible to non-cosmic artifacts and noise that must be identified and separated from cosmic signals. Teaching computers to identify and morphologically classify these artifacts in detector data is exceedingly difficult. Human eyesight is a proven tool for classification, but the aLIGO data streams from approximately 30,000 sensors and monitors easily overwhelm a single human. This research will address these problems by coupling human classification with a machine learning model that learns from the citizen scientists and also guides how information is provided to participants. A novel feature of this system will be its reliance on volunteers to discover new glitch classes, not just use existing ones. The project includes research on the human-centered computing aspects of this sociocomputational system, and thus can inspire future citizen science projects that do not merely exploit the labor of volunteers but engage them as partners in scientific discovery. The project will therefore have substantial educational benefits for the volunteers, who will gain a good understanding of how science works and will share in the excitement of opening up a new window on the universe. The project is joint with Vassiliki Kalogera (Northwestern University), Joshua Smith (Cal State Fullerton), Shane Larson (Northwestern University) and Laura Trouille (Adler Planetarium), with involvement at Syracuse by Kevin Crowston and Carsten Østerlund.

Publications from this grant are listed below.

Gravity Spy: Integrating Advanced LIGO Detector Characterization, Machine Learning, and Citizen Science

Publication Type:

Journal Article

Source:

Classical and Quantum Gravity, Volume 34, p.064003 (2017)

Teaching Citizen Scientists to Categorize Glitches using Machine-Learning-Guided Training

Publication Type:

Journal Article

Source:

Computers in Human Behavior, Volume 105, p.106198 (2020)

Abstract:

<p>Training users in online communities is important for making high performing contributors. However, several conundrums exist in choosing the most effective approaches to training users. For example, if it takes time to learn to do the task correctly, then the initial contributions may not be of high enough quality to be useful. We conducted an online field experiment where we recruited users (N = 386) in a web-based citizen-science project to evaluate two training approaches. In one training regime, users received one-time training and were asked to learn and apply twenty classes to the data. In the other approach, users were gradually exposed to classes of data that were selected by trained machine learning algorithms as being members of particular classes. The results of our analysis revealed that the gradual training produced “high performing contributors”. In our comparison of the treatment and control groups we found users who experienced gradual training performed significantly better on the task (an average accuracy of 90% vs. 54%), contributed more work (an average of 228 vs. 121 classifications), and were retained in the project for a longer period of time (an average of 2.5 vs. 2 sessions). The results suggest online production communities seeking to train newcomers would benefit from training regimes that gradually introduce them to the work of the project using real tasks.</p>

Classifying the unknown: Discovering novel gravitational-wave detector glitches using similarity learning

Publication Type:

Journal Article

Source:

Physical Review D, Volume 99, Issue 8, p.082002 (2019)

Abstract:

<p>The observation of gravitational waves from compact binary coalescences by LIGO and Virgo has begun a new era in astronomy. A critical challenge in making detections is determining whether loud transient features in the data are caused by gravitational waves or by instrumental or environmental sources. The citizen-science project Gravity Spy has been demonstrated as an efficient infrastructure for classifying known types of noise transients (glitches) through a combination of data analysis performed by both citizen volunteers and machine learning. We present the next iteration of this project, using similarity indices to empower citizen scientists to create large data sets of unknown transients, which can then be used to facilitate supervised machine-learning characterization. This new evolution aims to alleviate a persistent challenge that plagues both citizen-science and instrumental detector work: the ability to build large samples of relatively rare events. Using two families of transient noise that appeared unexpectedly during LIGO's second observing run, we demonstrate the impact that the similarity indices could have had on finding these new glitch types in the Gravity Spy program.</p>

Folksonomies to support coordination and coordination of folksonomies

Publication Type:

Journal Article

Source:

Computer Supported Cooperative Work, Volume 27, Issue 3–6, p.647–678 (2018)

URL:

https://rdcu.be/NZ7E

Abstract:

<p>Members of highly-distributed groups in online production communities face challenges in achieving coordinated action. Existing CSCW research highlights the importance of shared language and artifacts when coordinating actions in such settings. To better understand how such shared language and artifacts are not only a guide for, but also a result of, collaborative work, we examine the development of folksonomies (i.e., volunteer-generated classification schemes) to support coordinated action. Drawing on structuration theory, we conceptualize a folksonomy as an interpretive schema forming a structure of signification. Our study is set in the context of an online citizen-science project, Gravity Spy, in which volunteers label "glitches" (noise events recorded by a scientific instrument) to identify and name novel classes of glitches. Through a multi-method study combining virtual and trace ethnography, we analyze folksonomies and the work of labelling as mutually constitutive, giving folksonomies a dual role: an emergent folksonomy supports the volunteers in labelling images at the same time that the individual work of labelling images supports the development of a folksonomy. However, our analysis suggests that the lack of supporting norms and authoritative resources (structures of legitimation and domination) undermines the power of the folksonomy and so the ability of volunteers to coordinate their decisions about naming novel glitch classes. These results have implications for design. If we hope to support the development of emergent folksonomies, online production communities need to 1) facilitate tag gardening, a process of consolidating overlapping terms of artifacts; 2) demarcate a clear home for discourses around folksonomy disagreements; 3) highlight clearly when decisions have been reached; and 4) inform others about those decisions.</p>

Knowledge Tracing to Model Learning in Online Citizen Science Projects

Publication Type:

Journal Article

Source:

IEEE Transactions on Learning Technologies, Volume 13, p.123-134 (2020)

Abstract:

<p>We present the design of a citizen science system that uses machine learning to guide the presentation of image classification tasks to newcomers to help them more quickly learn how to do the task while still contributing to the work of the project. A Bayesian model for tracking volunteer learning for training with tasks with uncertain outcomes is presented and fit to data from 12,986 volunteer contributors. The model can be used both to estimate the ability of volunteers and to decide the classification of an image. A simulation of the model applied to volunteer promotion and image retirement suggests that the model requires fewer classifications than the current system.</p>

Shifting forms of presence: Volunteer learning in online citizen science

Publication Type:

Journal Article

Source:

Proceedings of the ACM on Human-Computer Interaction, Issue CSCW, p.36 (2020)

Abstract:

<p>Open collaboration platforms involve people in many tasks, from editing articles to analyzing datasets. To facilitate mastery of these practices, communities offer a number of learning resources, ranging from project-defined FAQs to individually-oriented search tools and communal discussion boards. However, it is not clear which project resources best support participant learning, overall and at different stages of engagement with the project. We draw on Sørensen’s framework of forms of presence to distinguish three forms of engagement with learning resources: authoritative, agent-centered and communal. We analyzed trace data from the Gravity Spy citizen-science project using a mixed-effects logistic regression with volunteer performance as an outcome variable. The findings suggest that engagement with authoritative resources (e.g., those constructed by project organizers) facilitates performance initially. However, as tasks become more difficult, volunteers seek and benefit from engagement with their own agent-centered resources and community-generated resources. These findings suggest a broader scope for the design of learning resources for online communities.</p>

Linguistic adoption in online citizen science: A structurational perspective

Publication Type:

Conference Proceedings

Source:

International Conference on Information Systems, Munich, Germany (2019)

URL:

https://aisel.aisnet.org/icis2019/crowds_social/crowds_social/28/

Abstract:

<p>For peer-production projects to be successful, members must develop a specific and common language that enables them to cooperate. We address the question of what factors affect the development of shared language in open peer production communities. Answering this question is important because we want the communities to be productive even when self-managed, which requires understanding how shared language emerges. We examine this question using a structurational lens in the setting of a citizen science project. Examining the use of words in the Gravity Spy citizen science project, we find that many words are reused and that most novel words that are introduced are not picked up, showing reproduction of structure. However, some novel words are used by others, showing an evolution of the structure. Participants with roles closer to the science are more likely to have their words reused, showing the mutually reinforcing nature of structures of signification, legitimation and domination.</p>

Blending machine and human learning processes

Publication Type:

Conference Proceedings

Source:

Hawai'i International Conference on System Sciences (2017)

URL:

http://hdl.handle.net/10125/41159

Abstract:

<p>Citizen science projects rely on contributions from volunteers to achieve their scientific goals and so face a dilemma: providing volunteers with explicit training might increase the quality of contributions, but at the cost of losing the work done by newcomers during the training period, which for many is the only work they will contribute to the project. Based on research in cognitive science on how humans learn to classify images, we have designed an approach that uses machine learning to guide the presentation of tasks to newcomers, helping them more quickly learn how to do the image classification task while still contributing to the work of the project. A Bayesian model for tracking this learning is presented.</p>

The Genie in the Bottle: Different Stakeholders, Different Interpretations of Machine Learning

Publication Type:

Conference Proceedings

Source:

Hawai'i International Conference on System Sciences, Wailea, HI (2020)

Abstract:

<p>Machine learning (ML) constitutes an algorithmic phenomenon with some distinctive characteristics (e.g., being trained, probabilistic). Our understanding of such systems is limited when it comes to how these unique characteristics play out in organizational settings and what challenges different groups of users will face in working with them. We explore how people developing or using an ML system come to understand its capabilities and challenges. We draw on the social construction of technology tradition to frame our analysis of interviews and discussion board posts involving designers and users of an ML-supported citizen-science crowdsourcing project named Gravity Spy. Our findings reveal some of the challenges facing different relevant social groups. We find that the type of understandings achieved by groups having less interaction with the technology is shaped by outside influences rather than the specifics of the system and its role in the project. Notably, some users mistake human input for ML input. This initial understanding of how different participants understand and engage with ML points to challenges that need to be overcome to help participants deal with the opaque position ML often holds in a work system.</p>
