The Kepler archive stores target-specific light curve files, derived from each TPF, in binary FITS format. The format of the FITS file is defined in Fraquelli & Thompson (2011). Time stamps, quality flags, predicted target motion relative to the detector pixels, and pixel bitmap information stored within the TPFs are copied into the light curve FITS tables. Kepler light curve files contain a number of flux columns. Two columns contain simple aperture photometry (SAP) flux with 1-σ statistical uncertainties, while two more contain a further-processed version of SAP with artifact mitigation included, called Pre-search Data Conditioning (PDCSAP) flux (Smith et al. 2012), and its uncertainties. The sky background value, summed across the optimal aperture, and its 1-σ uncertainty are calculated directly from the TPF and added to the light curve files. The last set of columns in the light curve FITS files contains the time-stamped, moment-derived centroid positions of the target, calculated from the calibrated TPF images. The Data Processing Handbook (Fanelli et al. 2011) provides details on how the centroid positions were calculated. The centroids are provided in detector pixel row and column coordinates. The centroid positions can be compared directly against the motion predicted from a set of reference stars per CCD channel. The purpose of comparing the measured flux centroid with the centroid predicted from the motion of reference stars is to identify times when these quantities are uncorrelated. Uncorrelated centroid structure can potentially identify events in the light curve that are caused by fractional changes in contamination from sources close to, or unresolved from, the target.
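A moment-derived centroid is the flux-weighted first moment of the pixel image, in detector row and column coordinates. The following is a minimal sketch of that calculation, assuming NumPy; the function name and interface are illustrative assumptions, not the pipeline implementation or any PyKE API:

```python
import numpy as np

def moment_centroid(image, row0=0, col0=0):
    """Flux-weighted (first-moment) centroid of a single pixel image.

    image      : 2-D array of calibrated pixel fluxes
    row0, col0 : detector coordinates of the image's [0, 0] pixel
    Returns the centroid in detector (row, column) coordinates.
    """
    rows, cols = np.indices(image.shape)
    total = image.sum()
    crow = (rows * image).sum() / total + row0
    ccol = (cols * image).sum() / total + col0
    return crow, ccol

# A point source concentrated in pixel (1, 2) of a 3x5 stamp whose
# corner pixel sits at detector row 100, column 200:
img = np.zeros((3, 5))
img[1, 2] = 100.0
crow, ccol = moment_centroid(img, row0=100, col0=200)
# crow ≈ 101.0, ccol ≈ 202.0 (detector row, column)
```

Repeating this per cadence over the calibrated TPF images yields the time-stamped centroid columns described above.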
4.1 Simple Aperture Photometry (SAP)
The SAP light curve is a pixel-summation time-series of all calibrated flux falling within the optimal aperture, as stored and defined in the TPF. The 1-σ errors are calculated from standard Gaussian error propagation of the TPF errors through the sum. Data archive users need to be aware that a SAP light curve can be contaminated by astrophysics from neighboring sources. One can inspect the concurrent TPF to identify contamination. Archive users must expect, a priori, that SAP photometry is contaminated by motion and focus systematics. To continue using SAP data for scientific exploitation, users must decide whether the artifacts will impact their results and conclusions. There is “low-hanging fruit” that has dominated Kepler astrophysics activity in the early phases of the mission, because the SAP data have proved adequate for specific science goals without artifact mitigation. For example, asteroseismology of solar-like oscillations, δ Scuti and γ Doradus pulsations has been hugely successful because signals with frequencies > 1 d⁻¹ are mostly unaffected by the majority of artifacts (Balona & Dziembowski 2011; Uytterhoeven et al. 2011; Balona et al. 2011). High-frequency artifacts that could prove problematic to these programs can be filtered out of the time-series using the quality flags provided. Data analysis of cataclysmic variables, RR Lyr stars and Cepheids is just as successful. While many of the astrophysical periods of interest in these pulsators can be longer than a few days and similar to the thermal resettling times of the spacecraft after a pointing maneuver, the large amplitude of target variability dominates over the systematics, which can consequently be neglected (e.g. Still et al. 2010; Benko et al. 2010; Szabo et al. 2011).
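The aperture summation and its Gaussian error propagation can be sketched as follows. This is a minimal illustration assuming NumPy arrays shaped like TPF data; the function name is a hypothetical stand-in, not the pipeline code:

```python
import numpy as np

def simple_aperture_photometry(flux, flux_err, aperture):
    """Sum calibrated pixel flux over the optimal aperture, per cadence.

    flux, flux_err : arrays of shape (n_cadences, n_rows, n_cols)
        holding calibrated pixel fluxes and their 1-sigma uncertainties
    aperture : boolean mask of shape (n_rows, n_cols) defining the
        optimal aperture
    Returns (sap_flux, sap_flux_err), each of shape (n_cadences,).
    """
    pix = flux[:, aperture]        # (n_cadences, n_aperture_pixels)
    err = flux_err[:, aperture]
    sap_flux = pix.sum(axis=1)
    # Standard Gaussian propagation through a sum: variances add
    sap_flux_err = np.sqrt((err ** 2).sum(axis=1))
    return sap_flux, sap_flux_err
```

Re-running this sum over a different (e.g. larger) pixel mask is the basis of the light curve re-extraction option discussed below.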
There are many astrophysical targets that are less likely to benefit from direct employment of SAP data. These include any science relying on more subtle light curve structure and periods longer than a few days, in which case systematics are more likely to be significant. Investigations of magnetic activity, gyrochronology, binary stars and long-period variables must scrutinize the SAP data with great care before proceeding, and most will likely benefit from one of three available artifact mitigation methods: using the archived PDCSAP photometry, re-extracting the SAP light curve over a larger set of pixels, or performing a custom correction on the archived SAP data using cotrending basis vectors. These methods and their precise application are inherently subjective. The highest-quality Kepler research will in most cases result from the experience and understanding gained by applying all three of these methods to the archived data.
4.2 Pre-search Data Conditioning Simple Aperture Photometry (PDCSAP)
The PDCSAP data included within the archived light curve files are produced by a pipeline module that remains under continuing development at the time of writing (Smith et al. 2012; Stumpe et al. 2012). Systematic artifacts are characterized by quantifying the features most common to hundreds of strategically selected quiet targets on each detector channel. For each channel and each operational quarter, this characterization is stored as 16 best-fit vectors called “Cotrending Basis Vectors” (CBVs). The archived basis vectors represent the most common trends found over each channel. The CBVs are ranked by the relative amplitude they contribute to systematic trends across a channel. An example of the eight most dominant CBVs for CCD channel 50 over quarter 5 is provided in Figure 3.
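The pipeline's characterization is considerably more elaborate, but the core idea of extracting ranked common trends from an ensemble of quiet-target light curves can be illustrated with a singular value decomposition. This is a sketch only, under stated assumptions: the function name is hypothetical and the bare SVD is a simplification, not the pipeline algorithm:

```python
import numpy as np

def common_trends(light_curves, n_vectors=8):
    """Illustrative extraction of ranked common trends from an
    ensemble of quiet-target light curves.

    light_curves : (n_targets, n_cadences) array of normalized,
        median-subtracted flux time-series from one detector channel
    Returns (n_vectors, n_cadences) orthonormal time-domain modes,
    ranked by singular value, i.e. by their contribution to the
    systematic variability shared across the channel.
    """
    # Rows of vt are orthonormal time-domain modes, strongest first
    u, s, vt = np.linalg.svd(light_curves, full_matrices=False)
    return vt[:n_vectors]
```

A trend shared by every quiet target on a channel (for example, a common thermal drift) emerges as the leading mode of such a decomposition.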
Figure 3: An example of eight cotrending basis vectors with the highest principal values, or contribution to systematic variability, from channel 50 over operational quarter 5. Basis vectors run from left-to-right, top-to-bottom, in order of significance. Each basis vector is normalized and median-centered about zero. Basis vectors can be linearly fit to a light curve and subtracted to mitigate systematic effects. The fit coefficients can be positive or negative.
Within the PDC pipeline module, systematics are removed from the SAP time-series by fitting and subtracting the CBVs. The results are stored in the archived files and labeled PDCSAP data. The correction is unique to each target: a weighting coefficient for each basis vector is determined by fitting the basis vectors to the SAP data. The CBV weighting, and what constitutes the “best” astrophysical solution, remains a subjective problem, and the process is both repeatable and tunable by archive users. The pipeline tuning is configured to provide the most effective conservation of astrophysics across the Kepler targets on each detector channel each quarter, treated as a statistical sample. The pipeline algorithm therefore provides a significant improvement in the quality of artifact-mitigated photometry. However, the PDC algorithm is not tuned to individual targets or specific classes of target. As the PDC pipeline continues to mature, the number of individual problematic cases in the archive will shrink. Nevertheless, for any individual target, we recommend direct comparison of the three available artifact mitigation methods in order to understand whether the archived data provide a solution optimized to the user's scientific requirements. A manual re-extraction of a target light curve from a TPF will produce a SAP time-series, but not a PDCSAP time-series. If artifact mitigation is required subsequent to light curve extraction, the only viable option is to fit the CBVs manually.
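A manual CBV correction of this kind can be sketched as an unweighted linear least-squares fit, assuming NumPy. This is a deliberately simple illustration (the pipeline's fit is more elaborate, and the function name is hypothetical):

```python
import numpy as np

def cbv_correct(sap_flux, cbvs):
    """Fit cotrending basis vectors to a SAP light curve and subtract
    the fitted trend.

    sap_flux : (n_cadences,) SAP flux time-series
    cbvs     : (n_vectors, n_cadences) basis vectors, each normalized
               and median-centered about zero
    Returns (corrected_flux, coeffs). Fit coefficients may be positive
    or negative.
    """
    design = cbvs.T                      # (n_cadences, n_vectors)
    # Fit the median-subtracted flux, since the CBVs are centered on zero
    coeffs, *_ = np.linalg.lstsq(design, sap_flux - np.median(sap_flux),
                                 rcond=None)
    trend = design @ coeffs
    return sap_flux - trend, coeffs
```

Tuning in this scheme amounts to choosing how many basis vectors to include: too few leaves systematics in place, while too many risks fitting out genuine astrophysical variability.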