MAROONX Data Tutorial

This tutorial demonstrates how to create and use RVData standard files for the MAROONX instrument at all data levels:

  • Level 2 (L2): Extracted, wavelength-calibrated echelle spectra

  • Level 3 (L3): Stitched 1D spectrum on a common wavelength grid

  • Level 4 (L4): Radial velocity measurements

Prerequisites

Install the rvdata package:

pip install rv-data-standard

Setup and Data Download

First, we’ll import the necessary modules and download sample MAROONX data files.

[ ]:
import os
import requests
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from astropy.io import fits
# RVData imports
from rvdata.core.models.level3 import RV3
from rvdata.instruments.maroonx.level2 import MAROONXRV2
from rvdata.instruments.maroonx.level4 import MAROONXRV4

[ ]:
def download_file(url, filename):
    """Download a file if it doesn't already exist."""
    if not os.path.exists(filename):
        print(f"Downloading {filename}...")
        response = requests.get(url)
        response.raise_for_status()
        with open(filename, "wb") as f:
            f.write(response.content)
        print(f"Downloaded {filename}")
    else:
        print(f"{filename} already exists, skipping download.")

# MAROONX sample data URLs (hosted on project server)
# MAROONX HDF5 dataproduct and a MAROONX flat file are needed to create L2
# MAROONX standard Level 2 FITS file and a MAROONX RV SERVAL pickle file are needed to create L4


pathname = "http://grinnell.as.arizona.edu/~rvdata/maroonx/"
file_urls = {
    "MAROONX": {
        "MX_OP": pathname + "20240923T092056Z_SOOOE_x_0040.hd5",
        "FLAT": pathname + "20240916T19_masterflat__FFFFF_x__blaze.hd5",
        "SERVAL_BLUE": pathname + "tauCet_activity_results_blue.pkl",
        "SERVAL_RED": pathname + "tauCet_activity_results_red.pkl"
    }
}

# Download the files
file_mx = "20240923T092056Z_SOOOE_x_0040.hd5"
flat_file = "20240916T19_masterflat__FFFFF_x__blaze.hd5"
rv_blue = "tauCet_activity_results_blue.pkl"
rv_red = "tauCet_activity_results_red.pkl"

download_file(file_urls["MAROONX"]['MX_OP'], file_mx)
download_file(file_urls["MAROONX"]['FLAT'], flat_file)
download_file(file_urls["MAROONX"]['SERVAL_BLUE'], rv_blue)
download_file(file_urls["MAROONX"]['SERVAL_RED'], rv_red)

Level 2: Extracted Echelle Spectra

Level 2 data contains wavelength-calibrated, extracted echelle spectra organized by trace (fiber). Each trace contains flux, wavelength, variance, and blaze function arrays.

Creating L2 from MAROONX HDF5 Files

To create an RVData-standard L2 file from MAROONX data product, you need:

  • HDF5 file: Extracted object spectra from the MAROONX pipeline

  • FLAT file: Extracted flat spectra from the MAROONX pipeline

[ ]:
# Create RVData-standard L2 from MAROONX HDF5 files
timestamp = os.path.basename(file_mx).split("_")[0].replace("Z", "")
mx_l2 = MAROONXRV2()
mx_l2.createL2(file_mx, flat_file)

# Create separate fits files for Blue and red camera
l2_standard_blue, l2_standard_red = mx_l2.write_camera_fits(file_mx,
                                                        out_dir='.')

print(f"Created {l2_standard_blue}")
print(f"Created {l2_standard_red}")

Using L2 Data

Reading the L2 File

You can read L2 files using either astropy’s fits.open() or the RVData RV2.from_fits() method.

[ ]:
# Open red camera fits file using astropy
l2 = fits.open(l2_standard_red)

# Examine the primary header - same keywords regardless of instrument!
hdr = l2[0].header
print(f"Telescope: {hdr['TELESCOP']}")
print(f"Instrument: {hdr['INSTRUME']}")
print(f"Object: {hdr['OBJECT']}")
print(f"Number of traces: {hdr['NUMTRACE']}")
print("\nTrace contents:")
for i in range(1, hdr['NUMTRACE'] + 1):
    print(f"  TRACE{i}: {hdr[f'TRACE{i}']}")

Examining L2 Extensions

The EXT_DESCRIPT extension lists all FITS extensions in the file.

[ ]:
# List all extensions
ext_descript = pd.DataFrame(l2['EXT_DESCRIPT'].data)
print(ext_descript[['Name', 'Description']].to_string())

Examining the Order Table

The ORDER_TABLE extension describes the wavelength coverage of each echelle order.

[ ]:
order_table = pd.DataFrame(l2['ORDER_TABLE'].data)
print(f"Number of orders: {len(order_table)}")
print(f"\nWavelength coverage: {order_table['WAVE_START'].min():.1f} - {order_table['WAVE_END'].max():.1f} Angstroms")
print("\nFirst 5 orders:")
print(order_table.head())
print("\nLast 5 orders:")
print(order_table[-5:])

Plotting L2 Spectra

MAROONX has 6 traces per camera and a virtual trace:

  • TRACE1: Sky fiber (SKY)

  • TRACE2, TRACE3, TRACE4: Science fibers (SCI)

  • TRACE5: Simultaneous calibration fiber (CAL)

  • TRACE6: Flux weighted combination of the 3 science fibers (VIRTUAL)

Let’s plot one order from the science traces.

[ ]:
# Plot a single order from the three science traces
order_index = 20  # Choose an order to plot
order = order_table.loc[order_index, "ECHELLE_ORDER"]
fig, axes = plt.subplots(4, 1, figsize=(12, 8), sharex=True)

for i, (ax, trace_num) in enumerate(zip(axes, [2, 3, 4, 6])):
    wave = l2[f'TRACE{trace_num}_WAVE'].data[order_index]
    flux = l2[f'TRACE{trace_num}_FLUX'].data[order_index]
    blaze = l2[f'TRACE{trace_num}_BLAZE'].data[order_index]

    # Scale blaze for visualization
    blaze_scaled = blaze * (np.nanmax(flux) / np.nanmax(blaze))

    ax.plot(wave, flux, 'b-', lw=0.5, label='Flux')
    ax.plot(wave, blaze_scaled, 'orange', lw=1, label='Blaze (scaled)')
    ax.set_ylabel(f'TRACE{trace_num}\nCounts')
    ax.legend(loc='upper right')
    ax.set_ylim(0, np.nanmax(flux) * 1.1)

axes[-1].set_xlabel('Wavelength (Angstroms)')
fig.suptitle(f'MAROONX L2 Spectra - Order {order}', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

Level 3: Stitched 1D Spectrum

Level 3 data contains a stitched 1D spectrum on a common wavelength grid with constant velocity spacing. The stitching process:

  1. Divides out the blaze function

  2. Resamples each order onto a common wavelength grid

  3. Combines overlapping regions using inverse-variance weighting

Creating L3 from L2

L3 is created from an RVData-standard L2 file.

[ ]:
# Create L3 from the standard L2 file
mx_l3 = RV3.from_fits(l2_standard_red, instrument="MAROONX")

# Save to FITS file
l3_standard = f'maroonxred_SL3_{timestamp}.fits'
mx_l3.to_fits(l3_standard)
print(f"Created {l3_standard}")

Using L3 Data

Reading the L3 File

[ ]:
# Open the L3 file
l3 = fits.open(l3_standard)

# List extensions
print("L3 Extensions:")
for hdu in l3:
    print(f"  {hdu.name}")

Understanding L3 Extensions

For MAROONX with multiple science fibers, the stitched spectra are stored in STITCHED_CORR_TRACE{n}_* extensions:

  • STITCHED_CORR_TRACE2_WAVE/FLUX/VAR: Science fiber 2

  • STITCHED_CORR_TRACE3_WAVE/FLUX/VAR: Science fiber 3

  • STITCHED_CORR_TRACE4_WAVE/FLUX/VAR: Science fiber 4

  • STITCHED_CORR_TRACE6_WAVE/FLUX/VAR: Virtual fiber 6

[ ]:
# Check which STITCHED extensions are present
stitched_exts = [hdu.name for hdu in l3 if 'STITCHED' in hdu.name]
print("Stitched spectrum extensions:")
for ext in stitched_exts:
    print(f"  {ext}")

Plotting the Stitched Spectrum

[ ]:
# Plot the stitched spectrum for one trace
# Check which trace extensions exist
trace_num = 3  # Try TRACE3 first
wave_ext = f'STITCHED_CORR_TRACE{trace_num}_WAVE'
flux_ext = f'STITCHED_CORR_TRACE{trace_num}_FLUX'

if wave_ext in [hdu.name for hdu in l3]:
    wave_l3 = l3[wave_ext].data
    flux_l3 = l3[flux_ext].data

    fig, ax = plt.subplots(figsize=(14, 5))
    ax.plot(wave_l3, flux_l3, 'b-', lw=0.3)
    ax.set_xlabel('Wavelength (Angstroms)', fontsize=12)
    ax.set_ylabel('Flux (normalized)', fontsize=12)
    ax.set_title(f'MAROONX L3 Stitched Spectrum - TRACE{trace_num}', fontsize=14, fontweight='bold')

    # Zoom inset
    ax.set_xlim(wave_l3[np.isfinite(wave_l3)].min(), wave_l3[np.isfinite(wave_l3)].max())
    plt.tight_layout()
    plt.show()

    print(f"\nWavelength range: {np.nanmin(wave_l3):.1f} - {np.nanmax(wave_l3):.1f} Angstroms")
    print(f"Number of pixels: {len(wave_l3)}")
else:
    print(f"Extension {wave_ext} not found. Available: {stitched_exts}")

Zoomed View of Spectral Features

[ ]:
# Zoom in on H-alpha region
if wave_ext in [hdu.name for hdu in l3]:
    fig, ax = plt.subplots(figsize=(12, 4))

    # H-alpha region
    mask = (wave_l3 > 6550) & (wave_l3 < 6580)
    ax.plot(wave_l3[mask], flux_l3[mask], 'b-', lw=0.5)
    ax.axvline(6562.8, color='red', ls='--', alpha=0.7, label='H-alpha (6562.8 A)')
    ax.set_xlabel('Wavelength (Angstroms)')
    ax.set_ylabel('Flux')
    ax.set_title('H-alpha Region', fontsize=12)
    ax.legend()
    plt.tight_layout()
    plt.show()

Level 4: Radial Velocity Measurements

Level 4 data contains radial velocity (RV) measurements derived from the spectra. These can include:

  • Per-order RVs

  • Combined RV with uncertainty

  • Activity indicators

Creating L4 from MAROONX STANDARD L2 and MAROONX SERVAL pickle file

L4 is created from RVData standard L2 file. The RV data product is separately available in pkl file and contains SERVAL derived RVs

[ ]:
# Create L4 from native MAROONX L2 file (which contains RV measurements)
mx_l4 = MAROONXRV4()
mx_l4.createL4(fits.open(l2_standard_blue), rv_blue, channel="BLUE")

l4_standard = f'maroonxblue_SL4_{timestamp}.fits'

# Save to FITS file
mx_l4.to_fits(l4_standard)

print(f"Created {l4_standard}")

Using L4 Data

Reading the L4 File

[ ]:
# Open the L4 file
l4 = fits.open(l4_standard)

# Examine primary header for RV info
hdr4 = l4[0].header
print(f"Object: {hdr4['OBJECT']}")
print(f"Observation time (BJD): {hdr4.get('BJD_TDB', 'N/A')}")

# List extensions
print("\nL4 Extensions:")
for hdu in l4:
    print(f"  {hdu.name}")

Examining the RV1 Extension

The RV1 extension contains per-order radial velocity measurements with standardized column names.

[ ]:
# Examine the RV1 extension
rv1 = pd.DataFrame(l4['RV1'].data)
print("RV1 columns:")
print(rv1.columns.tolist())
print(f"\nNumber of orders: {len(rv1)}")
print("\nFirst 5 rows:")
print(rv1.head())

Per-Order RVs

The RV1 extension contains one row per echelle order. The RV column holds the combined RV measurement for each order, and ORDER_INDEX identifies which order each row corresponds to. Additional columns include BERV (barycentric earth radial velocity), WAVE_START/WAVE_END (wavelength range), and WEIGHT (RV weight).

[ ]:
# Plot per-order RVs from the RV1 extension
rv1 = pd.DataFrame(np.array(l4['RV1'].data, dtype=l4['RV1'].data.dtype.newbyteorder('=')))
fig, ax = plt.subplots(figsize=(10, 5))
valid = np.isfinite(rv1['RV'])
ax.errorbar(rv1.loc[valid, 'ORDER_INDEX'], rv1.loc[valid, 'RV'], yerr=rv1.loc[valid, 'RV_ERR'],
            fmt='.', mfc='blue', ms=8, mec='gray',ecolor='gray', alpha=0.5)
median_rv = np.nanmedian(rv1['RV'])
ax.axhline(median_rv, color='red', ls='--', label=f'Median: {median_rv:.2f} m/s')
ax.set_xlabel('Order Index')
ax.set_ylabel('RV (m/s)')
ax.set_title('Per-Order Radial Velocities (RV1)')
ax.legend()
plt.tight_layout()
plt.show()

Summary

This tutorial demonstrated how to:

  1. Create L2 from MAROONX HDF5 and flat files using MAROONXRV2.createL2()

  2. Use L2 data: access headers, examine extensions, plot spectra

  3. Create L3 from standard L2 using RV3.from_fits()

  4. Use L3 data: access stitched spectra, examine spectral features

  5. Create L4 from standard MAROONX L2 and SERVAL pickle file using MAROONXRV4.createL4()

  6. Use L4 data: access RV measurements and per-order RVs

The standardized data format allows consistent access patterns across all EPRV instruments!

[ ]:
# Clean up - close FITS files
l2.close()
l3.close()
l4.close()