NRES pipeline

Data Flow

Raw data are transferred from the spectrographs to LCO headquarters in near real time. Each new data file arrival triggers the pipeline. New calibration data are copied to local disk storage and relevant metadata are saved to a searchable database for later use. New science spectra are processed immediately, using the best-available calibration data, based on a suitable database search. Some intermediate data products are saved to local disk, and their metadata also go into the database. At the end of processing, the pipeline bundles the raw data, selected extraction-level data products, and summary data products into a compressed tar file, which is sent to the LCO Archive.

Separate processes run on a daily basis to construct the various master-calibration files and other derived calibration data. The inputs for these processes are extracted from the database, and their products are sent to the Archive in case they are needed by users. There is also a facility to reprocess "old" data through the pipeline, if necessary.

To better understand the pipeline processing, it may be beneficial to read the webpage describing the NRES instrument architecture, including the calibration system.

Data Types

The NRES spectrograph and its associated calibration system and site software produce four kinds of calibration images and one kind of science image. The calibration images are (1) BIAS images, (2) DARK images, (3) images, called LAMPFLATs, in which two input fibers (the reference fiber and one star fiber) are illuminated by the tungsten-halogen (TH) lamp, and (4) images, called DOUBLEs, in which two input fibers (the reference fiber and one star fiber) are illuminated by a ThAr lamp. In the science images, the reference fiber is illuminated by a ThAr lamp (either slave or master) and one of the star fibers by starlight from a telescope. The science data are more complicated than the calibration images because they contain additional data from telescopes, autoguiders, and the exposure meter.

Calibration data files are normally acquired in groups of the same type, during daylight hours. These files are processed (external to the pipeline) on a daily basis and averaged to create "Supercalibration" files: SuperBIASes, SuperDARKs, and SuperFLATs. The SuperFLATs, which include valid flats for all fibers, are used to construct files (called TRACE files) that describe the positions and cross-dispersion shapes of the spectrum orders on the CCD. Multiple DOUBLE files are used to construct files (called TRIPLEs) that describe the fiber-to-fiber offsets between ThAr spectra along the dispersion axes. The set of calibration files is completed by standard stellar spectra (called ZERO files) that are compared to observed spectra to estimate radial velocities of stars. The ZERO spectra are made from averages of spectra of an observed star, which is chosen because it's of similar spectral type to the target star.

Extraction and Wavelength Calibration

For each science image, the pipeline selects the "best" calibration files from a database search, according to simple rules. The calibration steps that follow are:

  • Bias and dark subtraction.
  • Determining the positions of the orders, using the TRACE data, and then fitting and subtracting a model of the between-order background light.
  • Performing an iterative preliminary extraction and order cross-dispersion centroid computation, using the TRACE data to define "extraction boxes" that are nominally centered on the order positions. If the initial computed centroid displacements are too large, a parametric adjustment of the TRACE data is applied to move the boxes.
  • Computing an optimally-extracted spectrum using cross-dispersion profiles from the TRACE file and a noise model.
  • Examining the residuals around the extracted spectrum to identify evidence of radiation events. Where evidence is found, the fitting weights for nearby data points are set to zero, and the fit is re-computed.
  • Saving three versions of the resulting extracted spectrum to local disk. The versions are:
    • a raw extracted ("EXTR") spectrum. Each order looks more or less like the blaze function of the spectrograph.
    • a raw extracted spectrum with a constant multiple of the SuperFLAT spectrum subtracted from it. This ("BLAZ") spectrum should have a near-zero mean value, and goes to zero at the edges of the blaze function. It has desirable noise properties for use in radial velocity estimation.
    • a ("SPEC") spectrum that is the ratio of raw extracted spectrum and the SuperFLAT, i.e. something like the true stellar spectrum with the instrumental response removed. It is noisy at the edges, and prone to systematics arising from differences between star and flat illumination of the spectrograph optics.
  • Adjusting parameters in a model of vacuum wavelength vs (x-coordinate, order index, fiber index) to give an optimum match between the positions of emission lines observed in the reference spectrum and those implied by the wavelength model and the ThAr line catalog by Redman (2013). The pipeline saves the entire wavelength solution, along with all of the model parameters, to local disk.
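The relationship among the three saved spectra (EXTR, BLAZ, SPEC) can be sketched for a single order as follows. This is only an illustration: the function name is hypothetical, and the choice of scale factor (the ratio of summed fluxes, which forces the BLAZ mean to zero) is an assumption, not necessarily how the pipeline computes its per-order constant.

```python
def blaze_and_flat_products(extr, flat):
    """Return (blaz, spec, amp) for one order.

    extr, flat: equal-length sequences of floats (raw extracted spectrum
    and SuperFLAT spectrum for the same order).
    amp is an ASSUMED scale factor: sum(extr) / sum(flat), chosen so that
    the BLAZ spectrum has zero mean.
    """
    if len(extr) != len(flat):
        raise ValueError("extr and flat must have the same length")
    flat_sum = sum(flat)
    if flat_sum == 0:
        raise ValueError("flat has zero total flux")

    amp = sum(extr) / flat_sum
    # BLAZ: raw spectrum minus a constant multiple of the flat.
    blaz = [e - amp * f for e, f in zip(extr, flat)]
    # SPEC: ratio of raw spectrum to flat; undefined where the flat is
    # not positive, so those pixels are marked NaN.
    spec = [e / f if f > 0 else float("nan") for e, f in zip(extr, flat)]
    return blaz, spec, amp
```

Because the flat is faint at the order edges, the SPEC ratio amplifies noise there, while the BLAZ difference does not; this matches the noise behavior described for the two products above.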

Radial Velocity Estimation

The procedure for determining radial velocities continues to be improved as the pipeline evolves. It is possible to run the RV estimation code independently, after the rest of the pipeline has executed.

Stellar radial velocities are estimated by comparing the BLAZ extracted spectrum with a ZERO file (a standard stellar spectrum). To ensure that a spectrum from a particular science target is always compared with the same ZERO spectrum, each target star is linked with a particular ZERO file in the database. First, the pipeline determines an approximate redshift by cross-correlating the BLAZ and ZERO spectra, but only for the echelle order containing the Mg b lines (roughly 516 nm). Based on this preliminary estimate, the pipeline then interpolates the entire ZERO spectrum to the provisional redshifted wavelength scale and breaks each order into a number of "blocks", i.e. contiguous wavelength segments. The pipeline then performs a fit to estimate the residual redshift of each block and its formal error. Last, the pipeline constructs several estimates of the "mean" redshift, taking differently-weighted averages or medians of the individual block redshifts.

Outputs of the radial velocity analysis are written to a FITS extension file with an empty main data segment. The first extension table contains the cross-correlation function and various cross-correlation-related statistics. The second extension table contains the computed residual redshifts per order and block, and useful statistics related to them.
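As an illustration of the last step, the sketch below combines per-block residual redshifts into a weighted mean and a median. The inverse-variance weighting is an assumption (one common choice); the text does not specify which weightings the pipeline actually uses, and the function name is hypothetical.

```python
import math
import statistics

C_KM_S = 299792.458  # speed of light in km/s


def combine_block_redshifts(z_blocks, z_errs):
    """Combine per-block redshifts (dimensionless, velocity/c).

    Blocks with non-finite values or non-positive formal errors are
    skipped. Returns a dict with the inverse-variance weighted mean,
    its formal error, the median, and the weighted mean in km/s.
    """
    good = [
        (z, e)
        for z, e in zip(z_blocks, z_errs)
        if math.isfinite(z) and math.isfinite(e) and e > 0
    ]
    if not good:
        raise ValueError("no usable blocks")

    weights = [1.0 / e**2 for _, e in good]
    wsum = sum(weights)
    wmean = sum(w * z for w, (z, _) in zip(weights, good)) / wsum
    return {
        "weighted_mean": wmean,
        "weighted_mean_err": 1.0 / math.sqrt(wsum),
        "median": statistics.median(z for z, _ in good),
        "weighted_mean_km_s": wmean * C_KM_S,
    }
```

A median is less sensitive than a weighted mean to a few badly fitted blocks, which is one reason to report several estimates side by side.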

Data Products [Pipeline version 0.8 - after May 10th 2018]

Stellar spectrum output data files produced by the NRES reduction pipeline
will soon have a new format.  We hope this will be more accurate, more
complete, and easier to use.

Data from the archive arrive in a form that depends on how you acquire them.
Please refer to the archive documentation to understand differences between
downloading your data via a web browser, or with a wget script, or one file
at a time with a Python script, or perhaps other methods.  In any case the
underlying data items are gzipped tar files, each corresponding to an 
individual spectrum exposure.  These have names like
   lscnrs01-fl09-20180307-0036-e91.tar.gz
   lscnrs01-fl09-20180307-0037-e91.tar.gz
where
   lsc = the observing site: lsc=CTIO, elp=McDonald, cpt=SAAO, tlv=Wise
   nrs0x = which spectrograph; NRES-0 to NRES-4
   20180307 = The OBS-DATE date on which the observation was started
      (This date rolls over at a different UT time for each site, such
       that OBS-DATE is fixed during each night of observation.)
   0036 = The nightly sequence number
   e91 = A code for the reduction level.  e91 = reduced with Banzai pipeline
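The naming scheme above can be decoded programmatically. The following is a minimal sketch; the function name, the returned dictionary keys, and the exact regular expression are illustrative assumptions based only on the example names shown here.

```python
import re
from pathlib import Path

# Site codes as listed above.
SITES = {"lsc": "CTIO", "elp": "McDonald", "cpt": "SAAO", "tlv": "Wise"}

# Pattern inferred from names like lscnrs01-fl09-20180307-0036-e91.tar.gz
_NAME_RE = re.compile(
    r"^(?P<site>[a-z]{3})(?P<spectrograph>nrs\d{2})"
    r"-(?P<camera>[a-z]{2}\d{2})"
    r"-(?P<obs_date>\d{8})"
    r"-(?P<sequence>\d{4})"
    r"-(?P<level>e\d{2})"
)


def parse_nres_name(name):
    """Split an NRES file or directory name into its fields."""
    match = _NAME_RE.match(Path(name).name)
    if match is None:
        raise ValueError(f"not an NRES product name: {name!r}")
    fields = match.groupdict()
    fields["site_name"] = SITES.get(fields["site"], "unknown")
    fields["sequence"] = int(fields["sequence"])  # nightly sequence number
    return fields
```

For example, `parse_nres_name("lscnrs01-fl09-20180307-0036-e91.tar.gz")` gives site `lsc` (CTIO), spectrograph `nrs01`, date `20180307`, and sequence number 36.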

Unpacking (tar -xzf file.tar.gz) each of these tar files yields a directory
named after the observation, e.g. lscnrs01-fl09-20180307-0031-e91.  Each such
directory will henceforth contain only 2 files:  the fpacked main data file
(e.g. lscnrs01-fl09-20180307-0031-e91.fits.fz)
and the diagnostic plots PDF file
(e.g. lscnrs01-fl09-20180307-0031-e91.pdf)

Running funpack on the fpacked main data file yields the uncompressed FITS file
(e.g. lscnrs01-fl09-20180307-0031-e91.fits)
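For scripted use, the tar step can also be done from Python with the standard library, as in the sketch below. The function name and return value are hypothetical; the funpack step is not reproduced here and would still be run separately on the resulting .fits.fz file.

```python
import tarfile
from pathlib import Path


def unpack_nres_tarball(tar_path, dest_dir="."):
    """Extract one NRES .tar.gz and return the paths of its data files.

    Returns the sorted paths of the extracted .fits.fz and .pdf members.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path, "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members (absolute
            # paths, links pointing outside dest, and so on).
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return sorted(
        str(dest / n) for n in names if n.endswith((".fits.fz", ".pdf"))
    )
```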

The main data file is a multi-extension FITS
file containing 10 extensions.  The first of these is essentially only a
header;  the last is a binary table extension.  All the rest are IMAGE
extensions.  Their contents are as follows:

Extn #,name     Data contents             Header keywords

0: --            None            Everything relating to the observation as a
                                 whole: instrument config, dates, scheduling
                                 info, names of calib frames used, weather info,
                                 instrument setup info, spectrograph environment
                                 readings.  Data unique to this star spectrum:  
                                 Telescope RA, Dec, Object name, total fluxes of
                                 star and ThAr, telescope lat, long.
                                 NORD, per-order scaling factor AMPFLxxx applied
                                 to the flat field calibration before sub-
                                 tracting from the raw extracted spectrum.
                                 Wavelength solution parameters (SINALP, FL,
                                 Y0, Z0), polynomial correction coeffs
                                 C0 to C14, between-fiber correction coeffs
                                 FIBC0 to FIBC9, name of ThAr line catalog file.
                                 Cross-correl peak width (km/s) and height,
                                 RCC = redshift of peak relative to template
                                 RVCC = redshift (km/s) rel to barycenter
                                 Robust average, median, formal errors for
                                 per-block redshifts.

1: SPECRAW   Raw extracted star  Nothing of interest.
             spectrum (NX,NORD)  
             (float)            

2: SPECFLAT  Flat-fielded        Nothing of interest.
             extracted star
             spectrum (NX,NORD)
             (float).  This
             spectrum is the
             most nearly free
             of instrumental
             signatures, but
             it is noisy near its
             boundaries.

3: SPECBLAZE Blaze-subtracted    Nothing of interest.
             extracted star      
             spectrum (NX,NORD)  
             (float).  This
             spectrum has simple
             noise properties,
             and is the one used
             in radial velocity
             estimation.

4: THARRAW   Raw extracted ThAr  Nothing of interest.
             spectrum (NX,NORD)
             from calibration 
             fiber (float)

5: THARFLAT  Flat-fielded        Nothing of interest.
             extracted ThAr
             spectrum (NX,NORD)
             (float)

6: WAVESPEC  Wavelength solu-    Nothing of interest.
             tion [nm] vs
             (NX,NORD) for the
             starlight-
             carrying fiber
             (double)

7: WAVETHAR  Wavelength solu-    Nothing of interest.
             tion [nm] vs        
             (NX,NORD) for the   
             standard ThAr       
             fiber (fiber 1)
             (double)

8: SPECXCOR  Correlation fn      lag (km/s) vs pixel index, in FITS-standard
             with template spec  CRVAL1, CDELT1, CTYPE1, CRPIX1 format.
             vs lag (km/s)       
             (float)             
                                 

9: RVBLOCKFIT Redshift and       descriptive info about table columns.
             related parameters  
             per (order,block)   
             from least-squares  
             fit, relative to
             RCC redshift
             (double, mostly)
             Data columns are:
             ZBLOCK              Array of (NORDER,NBLOCK) values of
                                 measured redshift relative to cross-correlation
                                 redshift.  Units are dimensionless (velocity/c)
             ERRZBLOCK           Formal error on ZBLOCK
             SCALE               Estimated scale factor connecting intensity
                                 in current spectrum to that in template
                                 spectrum, by order and block. (dimensionless)
             ERRSCALE            Formal error on SCALE
             LX1COEF             Estimated zero-point shift between intensity
                                 in current spectrum and in template spectrum,
                                 by order and block.  Units are ADU.
             ERRLX1              Formal error on LX1COEF
             PLDP                Estimated Photon-Limited Doppler Precision
                                 by order and block (km/s).
             BLKINDX             Block index.  Range [0 - NBLOCK-1]
             ORDINDX             Order index.  Range [0 - NORD-1]
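The lag axis of the SPECXCOR extension can be rebuilt from its CRVAL1, CDELT1, and CRPIX1 header keywords using the standard FITS linear convention, in which pixels are counted from 1. The sketch below is illustrative: the function names are hypothetical, and reading the header values from the file is left to whatever FITS library you use.

```python
def lag_axis(n_pix, crval1, cdelt1, crpix1=1.0):
    """Lag (km/s) at each pixel of a linearly sampled axis.

    FITS convention: the coordinate equals CRVAL1 at the (1-based)
    pixel CRPIX1 and changes by CDELT1 per pixel.
    """
    return [crval1 + (i + 1 - crpix1) * cdelt1 for i in range(n_pix)]


def peak_lag(xcor, crval1, cdelt1, crpix1=1.0):
    """Lag (km/s) of the highest sample of the correlation function.

    This takes the nearest-pixel maximum only; no sub-pixel peak
    fitting is attempted.
    """
    lags = lag_axis(len(xcor), crval1, cdelt1, crpix1)
    i_max = max(range(len(xcor)), key=xcor.__getitem__)
    return lags[i_max]
```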

Data Products [before May 10th 2018]

At the end of the reduction procedure, the pipeline bundles various output data into a gzipped tar file and writes it to the LCO science archive. An example tar file name is:


lscnrs01-fl09-20170830-0044-e91.tar.gz

The naming convention for the tar files is {Site ID}{NRES ID}-{Camera ID}-{DAY-OBS}-{Image number}-e91.tar.gz, where the "e91" indicates that processing is complete. An example of the data products contained within the tar files is:


README (A list of the files in the tarball)
arc_lsc_nres01_fl09_20170716.fits.fz (The ThAr arc spectrum (i.e. TRIPLE file) used in the reduction)
flat_lsc_nres01_fl09_20170716.fits.fz (The flat field used in the reduction)
lscnrs01-fl09-20170717-0047-e91-blaze.fits.fz (The extracted spectrum with the blaze function subtracted)
lscnrs01-fl09-20170717-0047-e91-noflat.fits.fz (The extracted spectrum with no flat field applied)
lscnrs01-fl09-20170717-0047-e91-rv.fits.fz (The radial velocity solution)
lscnrs01-fl09-20170717-0047-e91-wave.fits.fz (The wavelengths for each pixel in the extracted spectra)
lscnrs01-fl09-20170717-0047-e91.fits.fz (The reduced spectrum)
lscnrs01-fl09-20170717-0047-e91.pdf (A set of quality control plots. See below.)
trace_lsc_nres01_fl09_20170716.fits.fz (The extraction region used in the reduction (i.e. TRACE file))

For every science spectrum that's processed, the pipeline creates a set of diagnostic plots. Some aim to show aspects of the target star spectrum; others contain diagnostics of the accuracy of the TRACE file used for extracting 1-dimensional spectra, and of the wavelength solution. Consult the NRES diagnostic plots page for a thorough description.

Pipeline Version Change Log

Version 0.8   May 10th 2018 

Update of data output format. Newly observed data will be processed with the new pipeline. Old data will be reprocessed at a later time.