ESPRESSO Data Tutorial

This tutorial demonstrates how to create and use RVData standard files for the Echelle SPectrograph for Rocky Exoplanets and Stable Spectroscopic Observations (ESPRESSO) instrument at all data levels:

Level 2 (L2): Extracted, wavelength-calibrated echelle spectra
Level 3 (L3): Stitched 1D spectrum on a common wavelength grid
Level 4 (L4): Radial velocity measurements

Prerequisites

Install the rvdata package:

pip install rv-data-standard

Setup and Data Download

First, we’ll import the necessary modules and download sample ESPRESSO data files.

[4]:

import os
import requests
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from astropy.io import fits

# RVData imports
from rvdata.core.models.level2 import RV2
from rvdata.core.models.level3 import RV3
from rvdata.core.models.level4 import RV4

def download_file(url, filename):
    """Download a file if it doesn't already exist."""
    if not os.path.exists(filename):
        print(f"Downloading {filename}...")
        response = requests.get(url)
        response.raise_for_status()
        with open(filename, "wb") as f:
            f.write(response.content)
        print(f"Downloaded {filename}")
    else:
        print(f"{filename} already exists, skipping download.")

# ESPRESSO sample data URLs (hosted on project server)

file_urls = {
     "raw": "http://grinnell.as.arizona.edu/~rvdata/espresso/ESPRE.2017-12-03T02-09-40.348.fits",
        "S2D_BLAZE_A": "http://grinnell.as.arizona.edu/~rvdata/espresso/r.ESPRE.2017-12-03T02-09-40.348_S2D_BLAZE_A.fits",
        "S2D_BLAZE_B": "http://grinnell.as.arizona.edu/~rvdata/espresso/r.ESPRE.2017-12-03T02-09-40.348_S2D_BLAZE_B.fits",
        "BLAZE_A": "http://grinnell.as.arizona.edu/~rvdata/espresso/r.ESPRE.2017-12-03T10-43-59.835_BLAZE_A.fits",
        "BLAZE_B": "http://grinnell.as.arizona.edu/~rvdata/espresso/r.ESPRE.2017-12-03T10-43-59.835_BLAZE_B.fits",
        "S1D_A": "http://grinnell.as.arizona.edu/~rvdata/espresso/r.ESPRE.2017-12-03T02-09-40.348_S1D_A.fits",
        "S1D_B": "http://grinnell.as.arizona.edu/~rvdata/espresso/r.ESPRE.2017-12-03T02-09-40.348_S1D_B.fits",
        "S1D_TELL_CORR_A": "http://grinnell.as.arizona.edu/~rvdata/espresso/r.ESPRE.2017-12-03T02-09-40.348_S1D_TELL_CORR_A.fits",
        "DRIFT_MATRIX_B": "http://grinnell.as.arizona.edu/~rvdata/espresso/r.ESPRE.2017-12-03T02-09-40.348_DRIFT_MATRIX_B.fits",
        "CCF_A": "http://grinnell.as.arizona.edu/~rvdata/espresso/r.ESPRE.2017-12-03T02-09-40.348_CCF_A.fits",
        "CCF_TELL_CORR_A": "http://grinnell.as.arizona.edu/~rvdata/espresso/r.ESPRE.2017-12-03T02-09-40.348_CCF_TELL_CORR_A.fits"
}

# Download the files
raw_file = "ESPRE.2017-12-03T02:09:40.348.fits"
s2d_blaze_A = "r.ESPRE.2017-12-03T02:09:40.348_S2D_BLAZE_A.fits"
s2d_blaze_B = "r.ESPRE.2017-12-03T02:09:40.348_S2D_BLAZE_B.fits"
blaze_A = "r.ESPRE.2017-12-03T10:43:59.835_BLAZE_A.fits"
blaze_B = "r.ESPRE.2017-12-03T10:43:59.835_BLAZE_B.fits"
s1d_A = "r.ESPRE.2017-12-03T02:09:40.348_S1D_A.fits"
s1d_B = "r.ESPRE.2017-12-03T02:09:40.348_S1D_B.fits"
s1d_tel_corr_A = "r.ESPRE.2017-12-03T02:09:40.348_S1D_TELL_CORR_A.fits"
drift_matrix = "r.ESPRE.2017-12-03T02:09:40.348_DRIFT_MATRIX_B.fits"
CCF_A = "r.ESPRE.2017-12-03T02:09:40.348_CCF_A.fits"
CCF_TELL_CORRA = "r.ESPRE.2017-12-03T02:09:40.348_CCF_TELL_CORR_A.fits"


download_file(file_urls["raw"], raw_file)
download_file(file_urls["S2D_BLAZE_A"], s2d_blaze_A)
download_file(file_urls["S2D_BLAZE_B"], s2d_blaze_B)
download_file(file_urls["BLAZE_A"], blaze_A)
download_file(file_urls["BLAZE_B"], blaze_B)
download_file(file_urls["S1D_A"], s1d_A)
download_file(file_urls["S1D_B"], s1d_B)
download_file(file_urls["S1D_TELL_CORR_A"], s1d_tel_corr_A)
download_file(file_urls["DRIFT_MATRIX_B"], drift_matrix)
download_file(file_urls["CCF_A"], CCF_A)
download_file(file_urls["CCF_TELL_CORR_A"], CCF_TELL_CORRA)

ESPRE.2017-12-03T02:09:40.348.fits already exists, skipping download.
r.ESPRE.2017-12-03T02:09:40.348_S2D_BLAZE_A.fits already exists, skipping download.
r.ESPRE.2017-12-03T02:09:40.348_S2D_BLAZE_B.fits already exists, skipping download.
r.ESPRE.2017-12-03T10:43:59.835_BLAZE_A.fits already exists, skipping download.
r.ESPRE.2017-12-03T10:43:59.835_BLAZE_B.fits already exists, skipping download.
r.ESPRE.2017-12-03T02:09:40.348_S1D_A.fits already exists, skipping download.
r.ESPRE.2017-12-03T02:09:40.348_S1D_B.fits already exists, skipping download.
r.ESPRE.2017-12-03T02:09:40.348_S1D_TELL_CORR_A.fits already exists, skipping download.
r.ESPRE.2017-12-03T02:09:40.348_DRIFT_MATRIX_B.fits already exists, skipping download.
r.ESPRE.2017-12-03T02:09:40.348_CCF_A.fits already exists, skipping download.
r.ESPRE.2017-12-03T02:09:40.348_CCF_TELL_CORR_A.fits already exists, skipping download.

Level 2: Extracted Echelle Spectra

Level 2 data contains wavelength-calibrated, extracted echelle spectra organized by trace (fiber). Each trace contains flux, wavelength, variance, and blaze function arrays.

Creating L2 from Native ESPRESSO Files

To create an RVData-standard L2 file from native ESPRESSO data, you need:

raw file: Raw data with headers
S2D files: Extracted spectra for both fibers from the ESPRESSO pipeline
Blaze files Blaze spectra for both fibers
DRIFT file The measured drift

[5]:

# Create RVData-standard L2 from native ESPRESSO files
espr_l2 = RV2.from_fits(raw_file, instrument='ESPRESSO')

# Save to FITS file
l2_standard_file = "ESPR_L2_standard.fits"
espr_l2.to_fits(out_filedir=None, out_filename=l2_standard_file)
print(f"Created {l2_standard_file}")

SCIENCE FP HD 10700
File is valid for conversion!
No PS1 data found, PUPILIMAGE extension will not be generated
No TELLURIC file found, TRACEi_TELLURIC_x extensions will not be generated.
No SKYSUB file found, TRACEi_SKYSUB_x extensions will not be generated.
Created ESPR_L2_standard.fits

/Users/emilio/Desktop/EPRV/Translators/RVData/rvdata/core/models/base.py:474: UserWarning: Filename 'ESPR_L2_standard.fits' does not follow the EPRV naming convention. Suggested filename: 'espresso_SL2_20171203T020940.fits'
  warnings.warn(

Using L2 Data

Reading the L2 File

You can read L2 files using either astropy’s fits.open() or the RVData RV2.from_fits() method.

[6]:

# Open using astropy
l2 = fits.open(l2_standard_file)

# Examine the primary header - same keywords regardless of instrument!
hdr = l2[0].header
print(f"Telescope: {hdr['TELESCOP']}")
print(f"Instrument: {hdr['INSTRUME']}")
print(f"Object: {hdr['OBJECT']}")
print(f"Number of traces: {hdr['NUMTRACE']}")
print("\nTrace contents:")
for i in range(1, hdr['NUMTRACE'] + 1):
    print(f"  TRACE{i}: {hdr[f'TRACE{i}']}")

Telescope: ESO-VLT-U1
Instrument: ESPRESSO
Object: HD 10700
Number of traces: 4

Trace contents:
  TRACE1: SCI
  TRACE2: SCI
  TRACE3: CAL
  TRACE4: CAL

Examining L2 Extensions

The EXT_DESCRIPT extension lists all FITS extensions in the file.

[7]:

# List all extensions
ext_descript = pd.DataFrame(l2['EXT_DESCRIPT'].data)
print(ext_descript[['Name', 'Description']].to_string())

                 Name                                                               Description
0             PRIMARY  The primary header containing the keywords following the standard format
1   INSTRUMENT_HEADER                                            The original instrument header
2             RECEIPT                             The list of operations performed on this file
3          DRP_CONFIG                     Pipeline details to go from the raw file to this file
4        EXT_DESCRIPT                Contains the description of all the extensions in the file
5         ORDER_TABLE                       Table capturing the wavelength extent of each order
6         TRACE1_FLUX                                              The flux measured in trace 1
7         TRACE1_WAVE                          The wavelength associated to the flux in trace 1
8          TRACE1_VAR                                                  The variance for trace 1
9        TRACE1_BLAZE                                                     The blaze for trace 1
10       BARYCORR_KMS                                        The barycentric correction in km/s
11         BARYCORR_Z                                    The barycentric correction in redshift
12            BJD_TDB                                                  Photon weighted midpoint
13           EXPMETER                                                        The exposure meter
14       GUIDINGIMAGE                                            Images from the guiding camera
15        TRACE2_FLUX                                              The flux measured in trace 2
16         TRACE2_VAR                                                  The variance for trace 2
17        TRACE2_WAVE                          The wavelength associated to the flux in trace 2
18     TRACE1_QUALITY                                        The data quality array for trace 1
19     TRACE2_QUALITY                                        The data quality array for trace 2
20       TRACE2_BLAZE                                                     The blaze for trace 2
21        TRACE3_FLUX                                              The flux measured in trace 3
22        TRACE4_FLUX                                              The flux measured in trace 4
23         TRACE3_VAR                                                  The variance for trace 3
24         TRACE4_VAR                                                  The variance for trace 4
25        TRACE3_WAVE                           The wavelength associated to the flux in trace3
26        TRACE4_WAVE                          The wavelength associated to the flux in trace 4
27     TRACE3_QUALITY                                        The data quality array for trace 3
28     TRACE4_QUALITY                                        The data quality array for trace 4
29       TRACE3_BLAZE                                                     The blaze for trace 3
30       TRACE4_BLAZE                                                     The blaze for trace 4
31       TRACE3_DRIFT                                                     The drift for trace 3
32       TRACE4_DRIFT                                                     The drift for trace 4

Examining the Order Table

The ORDER_TABLE extension describes the wavelength coverage of each echelle order.

[8]:

order_table = pd.DataFrame(l2['ORDER_TABLE'].data)
print(f"Number of orders: {len(order_table)}")
print(f"\nWavelength coverage: {order_table['WAVE_START'].min():.1f} - {order_table['WAVE_END'].max():.1f} Angstroms")
print("\nFirst 5 orders:")
print(order_table.head())

Number of orders: 85

Wavelength coverage: 377.1 - 790.6 Angstroms

First 5 orders:
   ETC/Int_Order HR/UHR_Orders  MR_Order  Central_wav[nm]  FSR_range[nm]  \
0            161           1,2         1           380.04           2.36
1            160           3,4         2           382.42           2.39
2            159           5,6         3           384.83           2.42
3            158           7,8         4           387.26           2.45
4            157          9,10         5           389.73           2.48

   FSRmin[nm]  FSRmax[nm]  start_wav[nm]  endwav[nm]  TS_range[nm]  \
0      378.87      381.23         377.15      383.07          5.93
1      381.23      383.62         379.50      385.47          5.96
2      383.62      386.04         381.89      387.89          6.00
3      386.04      388.49         384.31      390.34          6.03
4      388.49      390.97         386.76      392.83          6.07

   ECHELLE_ORDER  ORDER_INDEX  WAVE_START  WAVE_END
0            161            0      377.15    383.07
1            160            1      379.50    385.47
2            159            2      381.89    387.89
3            158            3      384.31    390.34
4            157            4      386.76    392.83

Plotting L2 Spectra

ESPRESSO has 4 traces:

TRACE 1 and 2: Science fibers, slices 1 & 2
TRACE 3 and 4: Calibration fiber, slices 1 & 2

Let’s plot one order from all traces.

[9]:

# Plot a single order from the three science traces
order = 30  # Choose an order to plot

fig, axes = plt.subplots(4, 1, figsize=(12, 8), sharex=True)

for i, (ax, trace_num) in enumerate(zip(axes, [1,2, 3, 4])):
    wave = l2[f'TRACE{trace_num}_WAVE'].data[order]
    flux = l2[f'TRACE{trace_num}_FLUX'].data[order]
    blaze = l2[f'TRACE{trace_num}_BLAZE'].data[order]

    # Scale blaze for visualization
    blaze_scaled = blaze * (np.nanmax(flux) / np.nanmax(blaze))

    ax.plot(wave, flux, 'b-', lw=0.5, label='Flux')
    ax.plot(wave, blaze_scaled, 'orange', lw=1, label='Blaze (scaled)')
    ax.set_ylabel(f'TRACE{trace_num}\nCounts')
    ax.legend(loc='upper right')
    ax.set_ylim(0, np.nanmax(flux) * 1.1)

axes[-1].set_xlabel('Wavelength (Angstroms)')
fig.suptitle(f'ESPRESSO L2 Spectra - Order {order}', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

../_images/tutorials_ESPRESSO_Tutorial_12_0.png

Level 3: Stitched 1D Spectrum

Level 3 data contains a stitched 1D spectrum on a common wavelength grid with constant velocity spacing. The stitching process:

Divides out the blaze function
Resamples each order onto a common wavelength grid

ESPRESSO level 3 files are obtained from S1D files produced by the ESPRESSO DRS.

[10]:

# Create L3 from the standard L2 file
espr_l3 = RV3.from_fits(raw_file, instrument="ESPRESSO")

# Save to FITS file
l3_standard_file = "ESPR_L3_standard.fits"
espr_l3.to_fits(out_filedir=None, out_filename=l3_standard_file)
print(f"Created {l3_standard_file}")

SCIENCE FP HD 10700
File r.ESPRE.2017-12-03T02:09:40.348_S1D_SKYSUB_A.fits does not exist.
Created ESPR_L3_standard.fits

/Users/emilio/Desktop/EPRV/Translators/RVData/rvdata/core/models/base.py:474: UserWarning: Filename 'ESPR_L3_standard.fits' does not follow the EPRV naming convention. Suggested filename: 'espresso_SL3_20171203T020940.fits'
  warnings.warn(

Using L3 Data

Reading the L3 File

[11]:

# Open the L3 file
l3 = fits.open(l3_standard_file)

# List extensions
print("L3 Extensions:")
for hdu in l3:
    print(f"  {hdu.name}")

L3 Extensions:
  PRIMARY
  INSTRUMENT_HEADER
  RECEIPT
  DRP_CONFIG
  EXT_DESCRIPT
  ORDER_TABLE
  STITCHED_CORR_SCI_FLUX
  STITCHED_CORR_SCI_WAVE
  STITCHED_CORR_SCI_VAR
  STITCHED_TELLCORR_SCI_WAVE
  STITCHED_TELLCORR_SCI_FLUX
  STITCHED_TELLCORR_SCI_VAR

Understanding L3 Extensions

For ESPRESSO with a single science fiber, the stitched spectrum is stored in STITCHED_CORR_SCI_* extensions:

STITCHED_CORR_SCI_WAVE/FLUX/VAR

[12]:

# Check which STITCHED extensions are present
stitched_exts = [hdu.name for hdu in l3 if 'STITCHED' in hdu.name]
print("Stitched spectrum extensions:")
for ext in stitched_exts:
    print(f"  {ext}")

Stitched spectrum extensions:
  STITCHED_CORR_SCI_FLUX
  STITCHED_CORR_SCI_WAVE
  STITCHED_CORR_SCI_VAR
  STITCHED_TELLCORR_SCI_WAVE
  STITCHED_TELLCORR_SCI_FLUX
  STITCHED_TELLCORR_SCI_VAR

Plotting the Stitched Spectrum

[13]:

# Plot the stitched spectrum for one trace
# Check which trace extensions exist
trace_num = 1  # Try TRACE2 first
wave_ext = f'STITCHED_CORR_SCI_WAVE'
flux_ext = f'STITCHED_CORR_SCI_FLUX'

if wave_ext in [hdu.name for hdu in l3]:
    wave_l3 = l3[wave_ext].data
    flux_l3 = l3[flux_ext].data

    fig, ax = plt.subplots(figsize=(14, 5))
    ax.plot(wave_l3, flux_l3, 'b-', lw=0.3)
    ax.set_xlabel('Wavelength (Angstroms)', fontsize=12)
    ax.set_ylabel('Flux (normalized)', fontsize=12)
    ax.set_title(f'ESPRESSO L3 Stitched Spectrum - TRACE{trace_num}', fontsize=14, fontweight='bold')

    # Zoom inset
    ax.set_xlim(wave_l3[np.isfinite(wave_l3)].min(), wave_l3[np.isfinite(wave_l3)].max())
    plt.tight_layout()
    plt.show()

    print(f"\nWavelength range: {np.nanmin(wave_l3):.1f} - {np.nanmax(wave_l3):.1f} Angstroms")
    print(f"Number of pixels: {len(wave_l3)}")
else:
    print(f"Extension {wave_ext} not found. Available: {stitched_exts}")

../_images/tutorials_ESPRESSO_Tutorial_20_0.png


Wavelength range: 3772.0 - 7900.0 Angstroms
Number of pixels: 443262

Zoomed View of Spectral Features

[14]:

# Zoom in on H-alpha region
if wave_ext in [hdu.name for hdu in l3]:
    fig, ax = plt.subplots(figsize=(12, 4))

    # H-alpha region
    mask = (wave_l3 > 6550) & (wave_l3 < 6580)
    ax.plot(wave_l3[mask], flux_l3[mask], 'b-', lw=0.5)
    ax.axvline(6564.6, color='red', ls='--', alpha=0.7, label='H-alpha (6564.6 A)')
    ax.set_xlabel('Wavelength (Angstroms)')
    ax.set_ylabel('Flux')
    ax.set_title('H-alpha Region', fontsize=12)
    ax.legend()
    plt.tight_layout()
    plt.show()

../_images/tutorials_ESPRESSO_Tutorial_22_0.png

Level 4: Radial Velocity Measurements

Level 4 data contains radial velocity (RV) measurements derived from the spectra. These can include:

Per-order RVs
Combined RV with uncertainty
Activity indicators

Creating L4 from ESPRESSO CCF data

L4 is typically created from native pipeline outputs that contain RV measurements. For ESPRESSO, the native CCF file includes CCF-derived RVs.

[15]:

# Create L4 from native ESPRESSO CCF file (which contains RV measurements)
espr_l4 = RV4.from_fits(raw_file, instrument="ESPRESSO")

# Save to FITS file
l4_standard_file = "espr_L4_standard.fits"
espr_l4.to_fits(out_filedir=None, out_filename=l4_standard_file)
print(f"Created {l4_standard_file}")

SCIENCE FP HD 10700
File r.ESPRE.2017-12-03T02:09:40.348_CCF_SKYSUB_A.fits not found. Skipping...
Created espr_L4_standard.fits

/Users/emilio/Desktop/EPRV/Translators/RVData/rvdata/core/models/base.py:474: UserWarning: Filename 'espr_L4_standard.fits' does not follow the EPRV naming convention. Suggested filename: 'espresso_SL4_20171203T020940.fits'
  warnings.warn(

Using L4 Data

Reading the L4 File

[16]:

# Open the L4 file
l4 = fits.open(l4_standard_file)

# Examine primary header for RV info
hdr4 = l4[0].header
print(f"Object: {hdr4['OBJECT']}")
print(f"Observation time (BJD): {hdr4.get('BJDTDB', 'N/A')}")

# List extensions
print("\nL4 Extensions:")
for hdu in l4:
    print(f"  {hdu.name}")

Object: HD 10700
Observation time (BJD): 2458090.59436763

L4 Extensions:
  PRIMARY
  INSTRUMENT_HEADER
  RECEIPT
  DRP_CONFIG
  EXT_DESCRIPT
  RV1
  CCF1
  DIAGNOSTICS1
  RV_TEL_CORR
  CCF_TEL_CORR

Examining the RV1 Extension

The RV1 extension contains per-order radial velocity measurements with standardized column names.

[17]:

# Examine the RV1 extension
rv1 = pd.DataFrame(l4['RV1'].data)
print("RV1 columns:")
print(rv1.columns.tolist())
print(f"\nNumber of orders: {len(rv1)}")
print("\nFirst 5 rows:")
print(rv1.head())

RV1 columns:
['BJD_TDB', 'RV', 'RV_ERR', 'BERV', 'WAVE_START', 'WAVE_END']

Number of orders: 1

First 5 rows:
        BJD_TDB         RV    RV_ERR       BERV  WAVE_START  WAVE_END
0  2.458091e+06 -15.082474  0.000205 -22.110426         NaN       NaN

Summary

This tutorial demonstrated how to:

Create L2 from native ESPRESSO files using RV2.from_fits()
Use L2 data: access headers, examine extensions, plot spectra
Create L3 from native ESPRESSO files using RV3.from_fits()
Use L3 data: access stitched spectra, examine spectral features
Create L4 from native ESPRESSO files using RV4.from_fits()
Use L4 data: access RV measurements and per-order RVs

The standardized data format allows consistent access patterns across all EPRV instruments!

[18]:

# Clean up - close FITS files
l2.close()
l3.close()
l4.close()