Workshop on Modeling and Machine Learning in Astronomy

Workshop Date: Thursday, September 20, 9:30 - 13:00 (Asia/Calcutta) - Room No. 201
Data sets in Astronomy have been growing with the advent of many sky surveys. The variety and complexity of the data sets at different wavelengths, cadences etc. mean that modeling, computational intelligence methods and machine learning need to be exploited to understand data-rich astronomy. From PB-sized archives to the recent discovery of Gravitational Waves, the importance of data-driven discovery in Astronomy has multiplied. This has given rise to the relatively new field of Astroinformatics: an interdisciplinary area of research where astronomers, mathematicians and computer scientists collaborate to solve problems in astronomy by applying techniques developed in data science. Classical problems in astronomy now involve the accumulation of large volumes of complex data with different formats and characteristics, and can no longer be addressed using classical techniques. As a result, machine learning algorithms and data analytic techniques have exploded in importance, often without a mature understanding of the pitfalls in such studies.
The workshop aims to capture the baseline, set the tempo for future research in India and abroad, and prepare a scholastic primer that would serve as a standard document for future research. We expect to discuss new developments in efficient models for complex computer experiments and in data analytic techniques that can be used in astronomical data analysis in the short term, as well as in various related branches of the physical, statistical and computational sciences. The workshop aims to evolve and critique a set of fundamentally correct rules of thumb and experiments, backed by solid mathematical theory, and to give the marriage of astronomy and Machine Learning stability and far-reaching impact in the context of specific science problems of interest to the audience.
Given the horizontal nature of ICACCI, we hope to disseminate methods that are applicable to Astroinformatics but are not currently used, and also to make CS practitioners aware of the interesting problems that complex astronomy data sets provide.
Topics of interest include, but are not limited to:
  • Exoplanets (discovery, machine classification etc.)
  • Classification of transients (Galactic and extragalactic)
  • Multi-messenger astronomy aided by Machine learning
  • Deep learning in astronomy
  • MCMC on big data
  • Statistical Machine Learning
  • Bayesian Methods in Astronomy
  • Meta-heuristic and Evolutionary Clustering methods and applications in Astronomy

Authors should submit their papers online. We use the EDAS system for managing submissions and the review process. Unregistered authors should first create an account on EDAS to log on. Detailed usage instructions for EDAS can be found here. Manuscripts should be submitted in PDF format. Please compare all author names in EDAS with the author list in your paper. They must be identical and in the same order. Further guidelines for submission are posted at:

All papers that conform to submission guidelines will be peer reviewed and evaluated based on originality, technical and/or research content/depth, correctness, relevance to conference, contributions, and readability. Acceptance of papers will be communicated to authors by email. Accepted and presented papers will be published in the conference proceedings and submitted to IEEE Xplore as well as other Abstracting and Indexing (A&I) databases. To be published in the Proceedings, an author of an accepted paper is required to register for the conference at the full rate. All accepted papers MUST be presented at the conference by one of the authors, or, if none of the authors are able to attend, by a qualified surrogate.
Important Dates
Papers Due: June 30, 2018
Acceptance Notification: July 30, 2018
Final Paper Deadline: August 20, 2018

Talk 1: Title: Bayesian methods in Astrophysics, Dr. Tarun Deep Saini, Assistant Professor and Academic Coordinator, Joint Astronomy Programme at Indian Institute of Science

It is commonly believed that Bayesian statistics differs from frequentist statistics only in the narrow sense that it allows for prior information in statistical inference; and in those cases where the priors are assumed flat, the two are, in fact, thought to be indistinguishable. However, through specific examples, I will show that the conceptual turnaround that Bayes' theorem enables allows for neater solutions to several problems of statistical inference.

Talk 2: Title: From Machine Learning to Deep Learning in Astronomy, Dr. Ajit Kembhavi, Professor Emeritus at the Inter-University Centre for Astronomy and Astrophysics

Machine Learning has long been used in astronomy to address classification problems. It provides very useful techniques for quickly and reliably addressing problems such as distinguishing between stars and nearly unresolved galaxies, and between unresolved quasars and stars on the basis of their colours. These techniques have limitations because they depend on the use of derived parameters for the training. Such an approach might not be possible, for example, in classification problems involving objects with complex morphology, which cannot be easily parameterized. In such circumstances Deep Learning techniques, which have been used in applications like face recognition, prove to be very useful, since they directly use images or spectral data without the need for parameterization. I will consider convolutional neural networks (CNNs) and describe their use with a few examples, including galaxy morphology and time series analysis. As applications, I will consider the finding of galaxies with bars and rings and the classification of stellar spectra, and show that very good accuracy can be obtained.

Talk 3: Title: Theoretical validation of potential habitability of recently discovered exoplanets via modeling and Machine learning, Dr. Margarita Safonova, Birla Institute of Fundamental Research

Seven Earth-sized planets, known as the TRAPPIST-1 system, were discovered with great fanfare in the last week of February 2017. Three of these planets are in the habitable zone of their star, making them potentially habitable planets (PHPs) a mere 40 light years away. The discovery, just a year before, of Proxima b, the closest potentially habitable planet to us, and the realization that Earth-type planets in circumstellar habitable zones are a common occurrence provide the impetus to the existing pursuit of life outside the Solar System. The search for life has essentially two goals: looking for planets with Earth-like conditions (Earth similarity) and looking for the possibility of life in some form (habitability). An index was recently developed, the Cobb-Douglas Habitability Score (CDHS), based on the Cobb-Douglas habitability production function (CD-HPF), which computes the habitability score by using measured and estimated planetary parameters. As an initial set, the radius, density, escape velocity and surface temperature of a planet were used. The proposed metric, with exponents accounting for metric elasticity, is endowed with analytical properties that ensure global optima and can be scaled to accommodate a finite number of input parameters. We show here that the model is elastic, and that the conditions on elasticity to ensure global maxima can scale as the number of predictor parameters increases. The K-NN (K-Nearest Neighbor) classification algorithm, embellished with probabilistic herding and thresholding restriction, utilizes CDHS scores and labels exoplanets into appropriate classes via feature-learning methods, yielding granular clusters of habitability. The algorithm works on top of a decision theoretical model using the power of convex optimization and machine learning. The goal is to characterize the recently discovered exoplanets into an "Earth League" and several other classes based on their CDHS values.
A second approach, based on a novel feature-learning and tree-building method, classifies the same planets without computing their CDHS and produces a similar outcome. For this, we use XGBoosted trees. The convergence of the outcomes of the two different approaches indicates the strength of the proposed solution scheme and the likelihood of the potential habitability of the recently announced discoveries. We plan to discuss new activation functions for such classification purposes.
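As a rough illustration of the Cobb-Douglas form mentioned in this abstract, the sketch below multiplies the four input parameters, each raised to an elasticity exponent. It is only the generic Cobb-Douglas production form, not necessarily the authors' exact formulation: the function name, the exponent values and the use of Earth units (Earth = 1.0 for every input) are assumptions made for this example.

```python
def cobb_douglas_score(radius, density, escape_velocity, surface_temp,
                       alpha=0.4, beta=0.1, gamma=0.3, delta=0.2):
    """Generic Cobb-Douglas product of planetary parameters.

    Inputs are assumed to be in Earth units (Earth = 1.0 for each).
    The exponents are hypothetical elasticities chosen for illustration;
    they are positive and sum to 1 here, but are not taken from the paper.
    """
    inputs = (radius, density, escape_velocity, surface_temp)
    if any(x <= 0 for x in inputs):
        raise ValueError("all planetary parameters must be positive")
    return (radius ** alpha) * (density ** beta) \
        * (escape_velocity ** gamma) * (surface_temp ** delta)


# With every input equal to 1.0 (Earth, in Earth units) the score is 1.0.
earth_score = cobb_douglas_score(1.0, 1.0, 1.0, 1.0)

# A made-up planet slightly different from Earth (values are hypothetical).
candidate_score = cobb_douglas_score(1.1, 0.9, 1.05, 0.95)
```

In a classification step such as the K-NN approach described above, scores like these could serve as the feature on which planets are compared; the class boundaries themselves are not specified in this abstract.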

Talk 4: Title: Machine Learning in the context of New Surveys and the Internet of Things (IoT), Dr. Ninan Sajeeth Philip, Associate Professor and Head, Department of Physics, St. Thomas College. Area of specialization: Machine Learning

Abstract: Machine Learning became popular as a tool to substitute the logical deductions made by a human expert on a given problem by adjusting a finite nonlinear combination of variables through a process called training. In a large subset of problems, this can be done using a small set of derived features describing the logic used by the expert to come to the conclusions. When problems are described in terms of derived features, the reliability of the predictions depends on how well those features are capable of describing the problem in all possible situations. For example, when doing a star-galaxy classification, one can use the ellipticity of a galaxy to separate it from the Gaussian PSF of stars. The logic works well when the galaxy is well resolved. However, if the galaxy is far off, it might look similar to a star and our feature-based classification fails. There is yet another class of problems where the domain expert has to dynamically select a subset of features from a large pool of available features to do the classification. This is typically the case in language, speech or image classification, where the set of features used for the classification is selected depending on each situation (context). For example, the set of features describing the image of a dog need not be the same as that describing a human or some other animal in the same image. Defining the context manually leads to an ill-posed problem. The talk shall introduce the basic concepts of IoT, Natural Language Processing (NLP) and big data, and how they transform modern machine learning from a model-driven to a data-driven science.

Talk 5: Title: Classifying sources in multiwavelength astronomy using machine learning, Dr. Jayant Murthy, Director, Indian Institute of Astrophysics

Astronomical data sets have been steadily increasing in sensitivity and size to the point where it is no longer practical to manually extract "interesting" objects from the chaff. In this scenario, an automated analysis procedure is necessary not only to process the data into scientifically usable data sets, including a source catalog, but to go further and classify the sources in the catalog. Even more information about the sources and, hence, a better classification will be obtained by combining multiple catalogs, particularly from multiple spacecraft with a broad spectral coverage. The complexity of the data and the number of sources necessitate a machine learning (ML) approach to the problem. We will specifically address the case of star-quasar separation, where color-color plots have been used in the past. We have found that these would not yield a unique separation and have used a template-based method where we fit all the available photometric data at once. We are now experimenting with a neural-network-based approach and are finding good results. I will focus here on presenting the scientific problem.

Workshop speakers
Dr. Snehanshu Saha holds a Master's degree in Mathematical Sciences from Clemson University and a Ph.D. from the Department of Mathematics at the University of Texas at Arlington, awarded in 2008. He was the recipient of the prestigious Dean's Fellowship during his PhD. After working briefly at his alma mater, Snehanshu moved to the University of Texas El Paso as a regular full-time faculty member in the Department of Mathematical Sciences. He has been a Professor of Computer Science and Engineering at PES University since 2011 and heads the Center for Applied Mathematical Modeling and Simulation. He is also a visiting Professor at the Department of Statistics, University of Georgia, USA. He has published 60 peer-reviewed articles in international journals. Dr. Saha is an IEEE Senior Member, ACM Senior Member, Vice Chair of the International Astrostatistics Association and Chair-Elect of the IEEE Computer Society Bangalore Chapter. Snehanshu’s current and future research interests lie in Data Science, Astronomy, Healthcare and Machine Learning.
Dr. Jayant Murthy obtained his PhD in Physics from the Johns Hopkins University in 1987 on a project to understand the gas that is very near the Sun. He worked for 2 years as a National Research Council Fellow at NASA's Goddard Space Flight Center, after which he returned to Johns Hopkins as a Research Scientist, where he worked on a number of spacecraft (Voyager, Hubble Space Telescope, FUSE etc.). He joined the Indian Institute of Astrophysics in 1999, where he is now a Senior Professor. Murthy has published over 100 papers in the scientific literature. He has mentored several students and young scientists and is currently heading the balloon and space payload group at IIA. He takes an active interest in Machine Learning driven astronomy and has authored high-impact peer-reviewed publications in the area.
Dr. Margarita Safonova was born in Russia. She received her M.Sc. in Physics from Moscow State University in 1991, and a Ph.D. in Astrophysics from the University of Delhi, Dept. of Physics & Astrophysics, in 2002. She worked in Cambridge, UK, and Tehran, Iran, and from 2006 till 2016 worked at the Indian Institute of Astrophysics, Bangalore. She is currently associated with the M.P. Birla Institute of Fundamental Research, Bangalore. Her research interests are applications of gravitational lensing in astrophysics and cosmology; UV astronomy from space and near space; exoplanets and habitability; and machine learning methods for understanding astrophysical data.
Dr. Tarun Deep Saini is an Assistant Professor and Academic Coordinator, Joint Astronomy Programme at the Indian Institute of Science. He holds a PhD from IUCAA, University of Pune, awarded in 2001, and subsequently held a visiting position at the Institute of Astronomy, Cambridge, UK between 2001 and 2004. Tarun is an accomplished and highly cited researcher in areas of Cosmology including Dark energy, Gravitational lensing, Structure formation in the Universe and Large scale structure in the Universe. He has been using computational methods like Markov Chain Monte Carlo simulations with moderate success. Lately, he has developed an interest in statistical learning and its applications in Astronomy.
Dr. Ninan Sajeeth Philip is an Associate Professor and Head, Department of Physics, St. Thomas College. His area of specialization is Machine Learning. He works on a variety of topics ranging from quasar hunts and Bayesian inference in Astronomy to early detection of diabetes.
Dr. Ajit Kembhavi (born 16 August 1950) is an Indian astrophysicist. He is presently a Professor Emeritus at the Inter-University Centre for Astronomy and Astrophysics (IUCAA) at Pune, India, of which he was also a founder member. He also serves as a Vice President of the International Astronomical Union. Kembhavi's areas of expertise largely include gravitation theory, extragalactic astronomy and astronomical database management. His contributions include:
  • Gravity: Elucidating the nature of cosmological singularities under conformal transformations (done as a part of his PhD thesis)
  • Quasars: Estimation of the contribution of quasars to the X-ray background
  • X-Ray Binaries: Study of the tidal capture formation of globular cluster X-ray binaries, and their subsequent evolution into millisecond and other pulsars
  • Pulsars: Pulsar magnetic field decay from a comparison between observed data and simulations
  • Galaxies: Quantitative study of galaxy morphology, scaling relations etc.
  • Warm Absorbers: Comprehensive high-resolution X-ray spectral study of warm absorbers in Seyfert galaxies observed with ESA's X-ray observatory XMM-Newton, theoretical modelling using the photo-ionisation code CLOUDY, understanding the effect of the ionizing continuum shape on the properties of warm absorbers, and stability curve analysis of the warm absorbers.
Accepted Papers:
A Comparative Analysis of the Cobb-Douglas Habitability Score (CDHS) with the Earth Similarity Index (ESI)
Suryoday Basak (University Of Texas Arlington, USA); Surbhi Agrawal and Kakoli Bora (PES University, India); Jayant Murthy (Indian Institute of Astrophysics, India)
A New Activation Function for Artificial Neural Net Based Habitability Classification
Snehanshu Saha (PES University, India); Archana Mathur (Indian Statistical Institute, India); Kakoli Bora (PES University, India); Suryoday Basak (University Of Texas Arlington, USA); Surbhi Agrawal (PES University, India)
Website on Astroinformatics: