Note - the data for all the interactive sessions is available for download.

iPTF Summer School 2014

One of the fundamental tasks of time-domain astronomy is determining if a source is periodic, and if so, measuring the period. Period measurements are a vital first step for more detailed scientific study, which may include source classification (e.g., RR Lyrae, W Uma), lightcurve modeling (binaries), or luminosity estimation (Cepheids).

Binary stars in particular have lightcurves which may show a wide variety of shapes, depending on the nature of the stars and the system geometry.

In this workbook we will develop a basic toolset for the generic problem of finding periodic sources. *** By Eric Bellm (c) 2014 Aug 22 (ebellm@caltech)

In []:

```
# imports
import numpy as np
import matplotlib.pyplot as plt
from astroML.time_series import \
lomb_scargle, lomb_scargle_bootstrap
%matplotlib inline
```

In []:

```
file_R = 'variableA_R.csv'
file_g = 'variableA_g.csv' # This file is actually empty, and is here for use with other sources
def load_data(datafile):
"""Function to load homework data"""
return np.recfromcsv(datafile)
data_R = load_data(file_R)
data_g = load_data(file_g)
```

In []:

```
# looking at the data
print data_R['epoch'][0:10]
print data_R['mag'][0:10]
print data_R['magerr'][0:10]
```

Complete this function for plotting the lightcurve

In []:

```
# define plot function
def plot_data(data,color='red'):
plt.errorbar(# COMPLETE THIS LINE
color=color, fmt = '_', capsize=0)
plt.xlabel('Date (MJD)')
plt.ylabel('Magnitude')
# convenience function to rescale axes when plotting two filters
def set_mag_limits(data_list):
mags = []
for data in data_list:
mags.extend(data['mag'])
plt.ylim(np.max(mags)+0.3,np.min(mags)-0.3)
```

In []:

```
# run plot function
plot_data(data_R)
# uncomment below to plot g-band data
# plot_data(data_g,color='green')
# set_mag_limits([data_R,data_g])
```

The Lomb-Scarge Periodogram provides a method for searching for periodicities in time-series data. It is comparable to the discrete Fourier Transform, but may be applied to irregularly sampled data. The periodogram gives as output the relative significance of a least-squares sinusoidal fit to the data as a function of frequency.

Much of this presentation follows Ch. 10 of Ivezic et al.

We use the "generalized" LS version implemented in astroML rather than the "standard" version implemented in scipy: the generalized version accounts better for cases of poor sampling.

In []:

```
# documentation for the lomb_scargle function
help(lomb_scargle)
```

Complete this function to generate the frequency bins at which to evaluate the LS periodogram.

The minimum frequency should be 2 pi / (t_max - t_min).

Choosing the maximum frequency is more complicated (see Ivezic et al. 10.3.2). Let's use pi / dt, where dt = median(time between observations).

Generate a linear grid of frequencies with `np.linspace`

using bins of 0.1 dt.

In []:

```
# define frequency function
def frequency_grid(times):
freq_min = # COMPLETE
freq_max = # COMPLETE
n_bins = # COMPLETE
return np.linspace(freq_min, freq_max, n_bins)
```

In []:

```
# run frequency function
omegas = frequency_grid(data_R['epoch'])
```

In []:

```
# provided alternate frequency function
def alt_frequency_grid(Pmin, Pmax):
"""Generate an angular frequency grid between Pmin and Pmax (assumed to be in days)"""
freq_min = 2*np.pi / Pmin
freq_max = 2*np.pi / Pmax
return np.linspace(freq_min, freq_max, 5000)
```

Calculate the LS periodiogram and plot the power.

In []:

```
# calculate and plot LS periodogram
P_LS = lomb_scargle( # COMPLETE
plt.plot(omegas, P_LS)
plt.xlabel('$\omega$')
plt.ylabel('$P_{LS}$')
```

In []:

```
# provided: define function to find best period
def LS_peak_to_period(omegas, P_LS):
"""find the highest peak in the LS periodogram and return the corresponding period."""
max_freq = omegas[np.argmax(P_LS)]
return 2*np.pi/max_freq
```

In []:

```
# run function to find best period
best_period = LS_peak_to_period(omegas, P_LS)
print "Best period: {} days".format(best_period) # in days
```

Complete this function that returns the phase of an observation (in the range 0-1) given its period. For simplicity set the zero of the phase to be the time of the initial observation.

Hint: Consider the python modulus operator, `%`

.

Add a keyword that allows your function to have an optional user-settable time of zero phase, for use with multiple bands

In []:

```
# define function to phase lightcurves
def phase(time, period, t0 = None):
""" Given an input array of times and a period, return the corresponding phase."""
if t0 is None:
t0 = time[0]
return # COMPLETE
```

Plot the phased lightcurve at the best-fit period.

In []:

```
# define function to plot phased lc
def plot_phased_lc(data, period, t0=None, color='red'):
phases = phase(data['epoch'], period, t0=t0)
plt.errorbar( # COMPLETE THIS LINE
color=color, fmt = '_', capsize=0)
plt.xlabel('Phase')
plt.ylabel('Magnitude')
```

In []:

```
# run function to plot phased lc
plot_phased_lc(data_R, best_period)
# uncomment below to plot g-band data
#plot_phased_lc(data_g, best_period, color='green')
#set_mag_limits([data_R,data_g])
```

Calculate the chance probability of finding a LS peak higher than the observed value in random data observed at the specified intervals: use `lomb_scargle_bootstrap`

and `np.percentile`

to find the 95 and 99 percent significance levels and plot them over the LS power.

In []:

```
D = lomb_scargle_bootstrap(# COMPLETE
sig99, sig95 = np.percentile(# COMPLETE
plt.plot(omegas, P_LS)
plt.plot([omegas[0],omegas[-1]], sig99*np.ones(2),'--')
plt.plot([omegas[0],omegas[-1]], sig95*np.ones(2),'--')
plt.xlabel('$\omega$')
plt.ylabel('$P_{LS}$')
```

Now try finding the periods of these sources with more complex light curves:

- variableB
- variableC (use
`alt_frequency_grid`

) - variableD (very hard with LS! use
`alt_frequency_grid`

)

*Note on alternate algorithms*

Lomb-Scargle is equivalent to fitting a sinusoid to the phased data, but many kinds of variable stars do not have phased lightcurves that are well-represented by a sinusoid. Other algorithms, such as those that attempt to minimize the dispersion within phase bins over a grid of trial phases, may be more successful in such cases. See Graham et al (2013) for a review.

AstroML also includes code (see also `astroML.time_series.multiterm_periodogram`

) for including multiple Fourier components in the fit, which can better fit arbitrary lightcurves.

*Other effects to consider*

Real data may have *aliases*--frequency components that appear because of the sampling of the data, such as once per night. Bootstrap significance tests (and experience) can help rule these out.

Many eclipsing binaries have primary and secondary eclipses, often with comparable depths. The period found by LS (which fits a single sinusoid) will thus often be only half of the true period. Plotting the phased lightcurve at double the LS period is often the easiest way to determine the true period.

Scargle, J. 1982, ApJ 263, 835

Zechmeister, M., and Kürster, M. 2009, A&A 496, 577

Graham, M. et al. 2013, MNRAS 434, 3423

Statistics, Data Mining, and Machine Learning in Astronomy (Ivezic, Connolly, VanderPlas, & Gray)