Focusing attention to improve the performance of citizen science systems: Beautiful images and perceptive observers

This SOCS project (NSF 12-11071) examined strategies for dealing with the flood of digital data that confronts researchers. New techniques, tools and strategies for dealing with massive data sets, whether they consist of vast numbers of base-pair sequences or terabytes of data from all-sky astronomical surveys, present an opportunity to establish a 'fourth paradigm' of scientific discovery, but the task is not easy. In many areas of research, the relentless growth of data sets has led to the adoption of increasingly automated and unsupervised methods of classification. Often this has degraded classification quality, with machine learning and computer vision unable to replicate the successes of human pattern recognition. The growth of citizen science on the web has provided a temporary solution to this problem. In particular, the highly successful Galaxy Zoo (Lintott et al. 2008, 2011) and the Zooniverse projects that have grown from it (Smith et al. 2011, Fischer et al. 2011, Davis et al. 2011), which this project took as its starting point, have demonstrated that it is possible to recruit hundreds of thousands of volunteers to make an authentic contribution to results, boosting human analysis through the collective wisdom of a crowd of classifiers.
However, human classifiers alone will not be able to cope with the expected flood of data from future scientific instruments. The project aimed to develop a next-generation socio-computational citizen science platform that combines the efforts of human classifiers with those of computational systems to maximize the efficiency with which human attention can be used. We recognize that doing so requires a thorough understanding of human motivation and learning in this context, and knowledge of how the proposed system will affect them.
The project was a partnership between computer and social scientists, addressing research problems in both automated data analysis and social science through systems implementation, alongside field research and experiments with project participants. This project was conducted in collaboration with the Adler Planetarium (Arfon Smith, PI). Carsten Østerlund served as replacement PI at Syracuse University.

Some key publications from the project are listed below.