Data Pipeline

BANZAI pipeline

LCO's network of telescopes is used for a diverse set of scientific goals, and managing the data raises challenges that are not present in a single-purpose survey or a traditional common-user facility. The large number of instruments and the volume of data they generate mean that LCO, as the data originator, is in the best position to understand the data and to reduce them optimally. On the other hand, the wide variety of scientific programs that use the network, and their diverse data-reduction needs, make it almost impossible to build a generalized pipeline that is optimal for all potential science needs.

The aims of LCO's data pipeline are (1) to do the best we can for the bulk of potential users, and (2) to create pipeline products that are of the most general use. In addition, the pipeline emphasizes recording of the processing steps performed, the parameters used, and the software versions employed. These records are of vital importance for documenting the provenance of the reduced data.

The data pipeline, named BANZAI, evolved from the set of image-processing algorithms devised by the 2014 Global Supernova Project team. It began processing raw frames from all of LCOGT's instruments in April 2016. BANZAI is written in Python, maintained in-house by LCOGT scientists, and stored in a GitHub repository. It runs automatically and requires no user input. Raw imager frames are processed as soon as they are received in LCOGT's cloud (Amazon S3) archive, and the reduced frames are typically available in the archive in less than 10 minutes.

During processing, the following calibrations are performed:

  • Bad-pixel masking
  • Bias subtraction
  • Dark subtraction
  • Flat-field correction
  • Source extraction (using SEP, the Python and C library for Source Extraction and Photometry)
  • Astrometric calibration (using astrometry.net)
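
The first four steps in the list above can be illustrated with a minimal NumPy sketch. This is not BANZAI's code: the function name, its arguments, the choice to normalize the flat to unit median, and the choice to mark bad pixels as NaN are all assumptions made for illustration.

```python
import numpy as np


def calibrate_frame(raw, bias, dark_rate, flat, bpm, exptime):
    """Hypothetical sketch of bias, dark, and flat-field correction.

    raw, bias, flat : 2-D arrays of the same shape, in counts
    dark_rate       : 2-D array, dark current in counts per second
    bpm             : 2-D integer array, nonzero marks a bad pixel
    exptime         : exposure time in seconds
    """
    good = bpm == 0

    # Work in floating point so unsigned raw counts cannot wrap around.
    raw = np.asarray(raw, dtype=float)

    # Assumption: scale the flat to unit median over the good pixels so
    # that the correction preserves the overall count level.
    norm_flat = np.asarray(flat, dtype=float) / np.median(flat[good])

    # Bad pixels stay NaN; good pixels get the three corrections in order.
    calibrated = np.full(raw.shape, np.nan)
    debiased = raw[good] - bias[good]               # bias subtraction
    dedarked = debiased - dark_rate[good] * exptime  # dark subtraction
    calibrated[good] = dedarked / norm_flat[good]    # flat-field correction
    return calibrated
```

Restricting the arithmetic to good pixels avoids dividing by zero where the flat has no signal, which is one common reason a pixel ends up in the mask.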

The final data products are multi-extension FITS files with four extensions:

  • SCI: the array of pixel (science) data.
  • BPM: the bad pixel mask.
  • ERR: the array of cumulative pixel uncertainties.
  • CAT: the catalog of sources detected by SEP, stored as a FITS binary table. For each source, the catalog lists the pixel position (X, Y), the semi-major and semi-minor axes (A, B), the position angle (THETA), and the flux and its error (FLUX, FLUXERR).

Data Archive