Data handling


The very high throughput of the telescope and the relatively large format of the detectors will result in a huge output of raw data. For typical integration times of a few minutes per exposure, about 20 GB of raw data will be produced in each night of operation. Such a volume of data requires very efficient and automated data reduction pipelines, at least partly implemented at the telescope.
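
As a rough cross-check of this figure, the short estimate below derives the nightly volume from an assumed mosaic format, pixel depth and number of exposures; none of these parameters are quoted in this section, so they are purely illustrative.

    # Order-of-magnitude estimate of the nightly raw-data volume.
    # Detector format, pixel depth and exposure count are illustrative
    # assumptions, not figures taken from this document.
    N_PIX_X = 8192              # assumed mosaic width (pixels)
    N_PIX_Y = 8192              # assumed mosaic height (pixels)
    BYTES_PER_PIXEL = 2         # 16-bit raw data
    EXPOSURES_PER_NIGHT = 150   # few-minute exposures over a full night

    frame_gb = N_PIX_X * N_PIX_Y * BYTES_PER_PIXEL / 1e9
    night_gb = frame_gb * EXPOSURES_PER_NIGHT
    print(f"single frame: {frame_gb:.2f} GB, nightly volume: {night_gb:.0f} GB")
    # -> single frame: 0.13 GB, nightly volume: 20 GB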

It is beyond the scope of this feasibility study to define the hardware and software facilities required to handle this huge data flow. However, a few general guidelines have been envisaged, based on the instrument characteristics as outlined in this document. A more detailed definition of the architecture and its actual implementation is deferred to the construction phase of the instrument.

  • The large data flow requires adequate network bandwidth at the telescope. A 100 Megabit Ethernet connection should be taken as the baseline. Under these conditions, the main limitation becomes disk access speed, so a distributed architecture of several disks operating in parallel should be designed, with a total storage capacity of about 100 GB.

  • About 50% of the raw data produced each night are likely to come from calibration frames (biases, flats, standard stars). To reduce the data volume and to allow a check of the instrument performance, on-line procedures should be available to combine all these frames into a few master frames (e.g. super-biases, super-flats and so on); a minimal sketch of such a combination step is given after this list.

  • The typical output of an observing run will consist of several fields (often slightly overlapping) taken with several filters using a dithering (i.e. shift-and-add) technique. The reduction of these data to a single image or to a clean mosaic requires large disk space and a detailed knowledge of the geometric field distortion induced by the optics in the external regions of the array. It is very unlikely that these requirements can be met by individual observers at their home institutions.

  • The latter two items suggest that a data reduction pipeline should be implemented at the telescope for on-line processing of the data. The final products of this procedure should be summed images and weight maps of the observed fields, one per filter (a minimal weighted co-addition sketch is given after this list). Calibration zero-points should be provided as well.

  • Such pipelines are technically feasible, especially when the expected increase in computing performance is taken into account. A system of this kind is under development at CfA for the MEGACAM project, and similar projects also exist in Europe (e.g. Terapix, EIS). The Rome Observatory has developed an automated pipeline for the reduction of SUSI2 images, which has already been used successfully during the first SUSI2 observations.

  • The existing Image Simulator should be improved to fully account for the expected image quality (e.g. including ghosts and a PSF that varies across the chip) and to produce images in the actual detector format; a simple sketch of a spatially varying PSF is given after this list. The upgraded Image Simulator can then be used to develop and test the data reduction pipeline.

  • The extraction of scientific information from the data (e.g. source detection, multicolor photometry) may need optimization depending on the scientific targets. For instance, star/galaxy separation, unbiased multicolor photometry or the search for low surface brightness galaxies may be more conveniently carried out with different software packages. It is therefore suggested that these procedures be left to the user's responsibility, also given the availability of standard software (e.g. SExtractor).

  • A full archive of the raw data on magneto-optical devices must be created, which can later be accessed by the users and (possibly) by more refined data reduction procedures.
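
As referenced in the item on calibration frames above, the following sketch shows one way the on-line combination into master frames could work, using a pixel-by-pixel median stack of the night's bias frames; the file names and the use of numpy and astropy are illustrative assumptions, not part of the instrument baseline.

    import glob
    import numpy as np
    from astropy.io import fits

    # Illustrative: combine all bias frames of the night into one super-bias.
    # File naming and output paths are hypothetical.
    bias_files = sorted(glob.glob("raw/bias_*.fits"))
    bias_stack = np.stack([fits.getdata(f).astype(np.float32) for f in bias_files])

    # A pixel-by-pixel median rejects cosmic rays and other outliers.
    super_bias = np.median(bias_stack, axis=0)
    fits.writeto("calib/super_bias.fits", super_bias, overwrite=True)

    # The same scheme applies to flats: subtract the super-bias from each
    # raw flat, normalize, and median-combine into a super-flat.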
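
The pipeline products mentioned above (summed images and weight maps) can be illustrated with a minimal weighted co-addition of dithered exposures. The sketch assumes the frames have already been calibrated, distortion-corrected and resampled onto a common pixel grid, and ignores the bookkeeping of a real pipeline; file names are hypothetical.

    import numpy as np
    from astropy.io import fits

    def coadd(image_files, weight_files):
        """Weighted sum of registered exposures, returning the co-added
        image and its weight map (inputs assumed on a common grid)."""
        num, wsum = None, None
        for img_f, wht_f in zip(image_files, weight_files):
            img = fits.getdata(img_f).astype(np.float64)
            wht = fits.getdata(wht_f).astype(np.float64)  # e.g. inverse variance
            num = img * wht if num is None else num + img * wht
            wsum = wht if wsum is None else wsum + wht
        coadded = np.where(wsum > 0, num / np.maximum(wsum, 1e-30), 0.0)
        return coadded, wsum

    # One co-added image and one weight map per field and filter.
    stack, weight = coadd(["f1_R.fits", "f2_R.fits"], ["w1_R.fits", "w2_R.fits"])
    fits.writeto("field_R_coadd.fits", stack, overwrite=True)
    fits.writeto("field_R_weight.fits", weight, overwrite=True)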
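
For the Image Simulator upgrade mentioned above, the sketch below shows one simple way to model a PSF whose width grows towards the edge of the chip; the Gaussian form, the linear radial dependence and the parameter values are illustrative assumptions only, not measured properties of the optics.

    import numpy as np

    def psf_sigma(x, y, nx, ny, sigma0=1.5, k=0.5):
        """Gaussian PSF width (pixels) growing linearly with the
        normalized distance from the chip center (assumed model)."""
        r = np.hypot(x - nx / 2.0, y - ny / 2.0)
        r_max = np.hypot(nx / 2.0, ny / 2.0)
        return sigma0 * (1.0 + k * r / r_max)

    def add_star(image, x0, y0, flux):
        """Add a point source using the local, position-dependent PSF."""
        ny, nx = image.shape
        sigma = psf_sigma(x0, y0, nx, ny)
        yy, xx = np.mgrid[0:ny, 0:nx]
        g = np.exp(-((xx - x0) ** 2 + (yy - y0) ** 2) / (2.0 * sigma ** 2))
        image += flux * g / (2.0 * np.pi * sigma ** 2)

    # Example: a small simulated frame with sharper stars at the center
    # and broader ones towards the corners.
    frame = np.zeros((512, 512), dtype=np.float64)
    for x0, y0 in [(256, 256), (60, 60), (500, 30)]:
        add_star(frame, x0, y0, flux=1000.0)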