Classification/Characterization Subgroup

Members

Members who collaborated to generate this roadmap:

Keivan Stassun, Vanderbilt University

Mahmoud Parvizi, Vanderbilt University

Martin Paegert, Vanderbilt University

Primary subgroup contact:

Keivan Stassun: keivan.stassun@vanderbilt.edu

Subgroup MAF engineer:

to be appointed

Subgroup Primary members

Andrew Becker
Josh Bloom
Mark Huber
Zeljko Ivezic
Darko Jevremovic
Ashish Mahabal
Adam Miller
Gautham Narayan
Hakeem Oluseyi
Frederic Piron
Peter Plavchan
Umaa Rebbapragada
Stephen Ridgway
Abi Saha
Rob Seaman
Tom Vestrand
Przemek Wozniak
Rafael Martínez-Galarza

Subgroup Secondary members

Lluis Galbany
Chris Smith
Alexandre Roman
Samaya Nissanke
Laura Chomiuk
Knox Long
Andrej Prsa
Paula Szkody
Lucianne Walkowicz
Virginia Trimble
Marcio Catelan
Arne Henden
Edward Schmidt
Alistair Walker
Peter Brown
Ryan Chornock
Melissa Graham
Cosimo Inserra
Tom Matheson
Danny Milisavljevic
David Reiss
Stephen Smartt
Niel Brandt
Suvi Gezari
Josh Pepper
Keivan Stassun
Rachel Street
Chris D'Andrea

Roadmap Outline

Science Drivers
Current Work
Key Questions

Science Drivers

Virtually everything in the sky will be "variable" at the expected photometric precision of LSST. This presents an unprecedented opportunity to develop methods for classification of astronomical objects partly or entirely on the basis of time-series photometric variability.

For example, LSST provides a powerful new capability for monitoring periodic variable stars, such as RR Lyrae stars, which can be used to map the Galactic halo and intergalactic space to distances exceeding 400 kpc. Exploiting this capability for time-domain science requires rapid data reduction and classification, both to flag interesting objects for spectroscopic and other follow-up with separate facilities and to support ensemble population studies based on the LSST light curve data alone. LSST data processing must therefore enable a fast and efficient response to transient sources (i.e., automated identification of variable stars and astrophysically interesting binaries) with a robust and accurate preliminary classification, and must also provide methods for in-depth classification of large ensembles of LSST sources throughout the mission lifetime.

Challenges:

Collection Cadence and Number of Observations
Purity vs. Completeness
Extinction and Crowding
Disambiguating various classes of periodic variables
Extending classification techniques to quasi- and non-periodic variables
Classification, or at least identification, of unexpected classes of variables/transients

Opportunities:

LSST extends the accessible time–volume space by a factor of a thousand over current surveys, so the most interesting science may well be the discovery of new classes of objects.

Current Work

The EB Factory Project:

"LSST will observe about 2 billion stars yielding 28 million EBs. Due to the less than optimal coverage in time, 28%, or 7.8 million of these, will be detectable. Clearly these numbers indicate that a manual approach to light curve classification for analysis of EBs cannot continue into the LSST era. The EB Factory is an end-to-end computational pipeline that allows automatic processing of massive amounts of light curve data—from period finding to object classification to determination of the stellar physical properties."
- Paegert et al. 2014, The Astronomical Journal, 148, 31
- Abstract: We describe a new neural-net-based light curve classifier and provide it with documentation as a ready-to-use tool for the community. While optimized for identification and classification of eclipsing binary stars, the classifier is general purpose, and has been developed for speed in the context of upcoming massive surveys such as the Large Synoptic Survey Telescope. A challenge for classifiers in the context of neural-net training and massive data sets is to minimize the number of parameters required to describe each light curve. We show that a simple and fast geometric representation that encodes the overall light curve shape, together with a chi-square parameter to capture higher-order morphology information results in efficient yet robust light curve classification, especially for eclipsing binaries. Testing the classifier on the ASAS light curve database, we achieve a retrieval rate of 98% and a false-positive rate of 2% for eclipsing binaries. We achieve similarly high retrieval rates for most other periodic variable-star classes, including RR Lyrae, Mira, and delta Scuti. However, the classifier currently has difficulty discriminating between different sub-classes of eclipsing binaries, and suffers a relatively low (∼60%) retrieval rate for multi-mode delta Cepheid stars. We find that it is imperative to train the classifier's neural network with exemplars that include the full range of light curve quality to which the classifier will be expected to perform; the classifier performs well on noisy light curves only when trained with noisy exemplars. The classifier source code, ancillary programs, a trained neural net, and a guide for use, are provided.
"An automated classification pipeline with parameters adaptable to multiple time series photometric surveys would be immediately applicable to the Kepler data set and would be well suited to the LSST requirement that data processing enable a fast and efficient response to transient sources."
- Parvizi et al. 2014, The Astronomical Journal, 148, 125
- Abstract: Large repositories of high precision light curve data, such as the Kepler data set, provide the opportunity to identify astrophysically important eclipsing binary (EB) systems in large quantities. However, the rate of classical "by eye" human analysis restricts complete and efficient mining of EBs from these data using classical techniques. To prepare for mining EBs from the upcoming K2 mission as well as other current missions, we developed an automated end-to-end computational pipeline — the Eclipsing Binary Factory (EBF) — that automatically identifies EBs and classifies them into morphological types. The EBF has been previously tested on ground-based light curves. To assess the performance of the EBF in the context of space-based data, we apply the EBF to the full set of light curves in the Kepler "Q3" Data Release. We compare the EBs identified from this automated approach against the human generated Kepler EB Catalog of ∼2600 EBs. When we require EB classification with ≥90% confidence, we find that the EBF correctly identifies and classifies eclipsing contact (EC), eclipsing semi-detached (ESD), and eclipsing detached (ED) systems with a false positive rate of only 4%, 4%, and 8%, while complete to 64%, 46%, and 32% respectively. When classification confidence is relaxed, the EBF identifies and classifies ECs, ESDs, and EDs with a slightly higher false positive rate of 6%, 16%, and 8%, while much more complete to 86%, 74%, and 62% respectively. Through our processing of the entire Kepler "Q3" dataset, we also identify 68 new candidate EBs that may have been missed by the human generated Kepler EB Catalog. We discuss the EBF's potential application to light curve classification for periodic variable stars more generally for current and upcoming surveys like K2 and the Transiting Exoplanet Survey Satellite.
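
The "period finding" stage that opens the pipeline described above can be illustrated with a minimal sketch. This is not the EB Factory's own algorithm, which the text above does not specify; it assumes a simple phase dispersion minimization search over a grid of trial periods. The function names (phase_fold, pdm_statistic, best_period), the bin count, and the synthetic light curve are all hypothetical choices made for illustration.

```python
import numpy as np


def phase_fold(times, period, epoch=0.0):
    """Return the phase in [0, 1) of each observation for a trial period."""
    return ((np.asarray(times, dtype=float) - epoch) / period) % 1.0


def pdm_statistic(times, mags, period, n_bins=10):
    """Ratio of pooled within-phase-bin variance to total variance.

    Values well below 1 mean the folded light curve is coherent at this period.
    """
    mags = np.asarray(mags, dtype=float)
    phases = phase_fold(times, period)
    bin_index = np.minimum((phases * n_bins).astype(int), n_bins - 1)
    total_var = np.var(mags, ddof=1)

    pooled_sum, dof = 0.0, 0
    for b in range(n_bins):
        in_bin = mags[bin_index == b]
        if in_bin.size > 1:  # bins with a single point carry no variance information
            pooled_sum += np.sum((in_bin - in_bin.mean()) ** 2)
            dof += in_bin.size - 1
    if dof == 0:
        raise ValueError("too few observations per phase bin")
    return (pooled_sum / dof) / total_var


def best_period(times, mags, trial_periods, n_bins=10):
    """Return the trial period with the smallest statistic, plus all statistics."""
    trial_periods = np.asarray(trial_periods, dtype=float)
    stats = np.array([pdm_statistic(times, mags, p, n_bins) for p in trial_periods])
    return trial_periods[np.argmin(stats)], stats


# Synthetic, irregularly sampled sinusoid (assumed values, not survey data).
rng = np.random.default_rng(0)
obs_times = np.sort(rng.uniform(0.0, 100.0, 200))          # days
obs_mags = 0.3 * np.sin(2 * np.pi * obs_times / 0.5) + rng.normal(0.0, 0.02, 200)

period, _ = best_period(obs_times, obs_mags, np.linspace(0.3, 0.9, 6001))
print(f"recovered period: {period:.4f} d")
```

The trial grid here deliberately excludes integer multiples of the true period, which fold just as coherently; a production search would need to handle such aliases explicitly, and its behavior under sparse, irregular cadence is exactly what the first key question below asks about.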

Key Questions

What are the minimal requirements on the cadence and number of available observations for successful classification?
How can current light curve classification methods be extended to quasi- and non-periodic light curves?
How do we incorporate the discovery of new classes of variable objects with automated classifiers trained via supervised learning… the Miscellaneous (MISC) class?
What are the requirements on purity and completeness, given the constraints imposed by extinction and crowding?
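
The trade-off between purity and completeness raised in these questions can be made concrete with a small sketch. It assumes that a classifier assigns each object a confidence for a given class and that a selection is made by thresholding that confidence. Purity is taken here as the fraction of selected objects that truly belong to the class, and completeness as the fraction of true class members that are selected; the roadmap itself does not fix these definitions. The function name purity_completeness and the toy labels and confidences are hypothetical.

```python
import numpy as np


def purity_completeness(is_true_member, confidence, threshold):
    """Purity and completeness of the sample selected at a confidence threshold."""
    is_true_member = np.asarray(is_true_member, dtype=bool)
    selected = np.asarray(confidence, dtype=float) >= threshold

    true_positives = np.sum(selected & is_true_member)
    n_selected = np.sum(selected)
    n_members = np.sum(is_true_member)

    # Undefined ratios (nothing selected, or no true members) are returned as NaN.
    purity = true_positives / n_selected if n_selected else float("nan")
    completeness = true_positives / n_members if n_members else float("nan")
    return purity, completeness


# Toy example (assumed values): 5 true class members and 3 contaminants.
truth = [1, 1, 1, 1, 0, 0, 1, 0]
conf = [0.95, 0.92, 0.80, 0.60, 0.85, 0.70, 0.40, 0.30]

for threshold in (0.9, 0.5):
    purity, completeness = purity_completeness(truth, conf, threshold)
    print(f"threshold {threshold}: purity {purity:.2f}, completeness {completeness:.2f}")
```

In this toy sample, relaxing the threshold from 0.9 to 0.5 doubles completeness while lowering purity, which is the same qualitative behavior the EBF abstract reports when its classification confidence requirement is relaxed.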
