MAROONX Data Tutorial

This tutorial demonstrates how to create and use RVData standard files for the MAROONX instrument at all data levels:

Level 2 (L2): Extracted, wavelength-calibrated echelle spectra
Level 3 (L3): Stitched 1D spectrum on a common wavelength grid
Level 4 (L4): Radial velocity measurements

Prerequisites

Install the rvdata package:

pip install rv-data-standard

Setup and Data Download

First, we’ll import the necessary modules and download sample MAROONX data files.

[ ]:

import os
import requests
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from astropy.io import fits
# RVData imports
from rvdata.core.models.level3 import RV3
from rvdata.instruments.maroonx.level2 import MAROONXRV2
from rvdata.instruments.maroonx.level4 import MAROONXRV4

[ ]:

def download_file(url, filename):
    """Download a file if it doesn't already exist."""
    if not os.path.exists(filename):
        print(f"Downloading {filename}...")
        response = requests.get(url)
        response.raise_for_status()
        with open(filename, "wb") as f:
            f.write(response.content)
        print(f"Downloaded {filename}")
    else:
        print(f"{filename} already exists, skipping download.")

# MAROONX sample data URLs (hosted on project server)
# MAROONX HDF5 dataproduct and a MAROONX flat file are needed to create L2
# MAROONX standard Level 2 FITS file and a MAROONX RV SERVAL pickle file are needed to create L4


pathname = "http://grinnell.as.arizona.edu/~rvdata/maroonx/"
file_urls = {
    "MAROONX": {
        "MX_OP": pathname + "20240923T092056Z_SOOOE_x_0040.hd5",
        "FLAT": pathname + "20240916T19_masterflat__FFFFF_x__blaze.hd5",
        "SERVAL_BLUE": pathname + "tauCet_activity_results_blue.pkl",
        "SERVAL_RED": pathname + "tauCet_activity_results_red.pkl"
    }
}

# Download the files
file_mx = "20240923T092056Z_SOOOE_x_0040.hd5"
flat_file = "20240916T19_masterflat__FFFFF_x__blaze.hd5"
rv_blue = "tauCet_activity_results_blue.pkl"
rv_red = "tauCet_activity_results_red.pkl"

download_file(file_urls["MAROONX"]['MX_OP'], file_mx)
download_file(file_urls["MAROONX"]['FLAT'], flat_file)
download_file(file_urls["MAROONX"]['SERVAL_BLUE'], rv_blue)
download_file(file_urls["MAROONX"]['SERVAL_RED'], rv_red)

Level 2: Extracted Echelle Spectra

Level 2 data contains wavelength-calibrated, extracted echelle spectra organized by trace (fiber). Each trace contains flux, wavelength, variance, and blaze function arrays.

Creating L2 from MAROONX HDF5 Files

To create an RVData-standard L2 file from MAROONX data product, you need:

HDF5 file: Extracted object spectra from the MAROONX pipeline
FLAT file: Extracted flat spectra from the MAROONX pipeline

[ ]:

# Create RVData-standard L2 from MAROONX HDF5 files
timestamp = os.path.basename(file_mx).split("_")[0].replace("Z", "")
mx_l2 = MAROONXRV2()
mx_l2.createL2(file_mx, flat_file)

# Create separate fits files for Blue and red camera
l2_standard_blue, l2_standard_red = mx_l2.write_camera_fits(file_mx,
                                                        out_dir='.')

print(f"Created {l2_standard_blue}")
print(f"Created {l2_standard_red}")

Using L2 Data

Reading the L2 File

You can read L2 files using either astropy’s fits.open() or the RVData RV2.from_fits() method.

[ ]:

# Open red camera fits file using astropy
l2 = fits.open(l2_standard_red)

# Examine the primary header - same keywords regardless of instrument!
hdr = l2[0].header
print(f"Telescope: {hdr['TELESCOP']}")
print(f"Instrument: {hdr['INSTRUME']}")
print(f"Object: {hdr['OBJECT']}")
print(f"Number of traces: {hdr['NUMTRACE']}")
print("\nTrace contents:")
for i in range(1, hdr['NUMTRACE'] + 1):
    print(f"  TRACE{i}: {hdr[f'TRACE{i}']}")

Examining L2 Extensions

The EXT_DESCRIPT extension lists all FITS extensions in the file.

[ ]:

# List all extensions
ext_descript = pd.DataFrame(l2['EXT_DESCRIPT'].data)
print(ext_descript[['Name', 'Description']].to_string())

Examining the Order Table

The ORDER_TABLE extension describes the wavelength coverage of each echelle order.

[ ]:

order_table = pd.DataFrame(l2['ORDER_TABLE'].data)
print(f"Number of orders: {len(order_table)}")
print(f"\nWavelength coverage: {order_table['WAVE_START'].min():.1f} - {order_table['WAVE_END'].max():.1f} Angstroms")
print("\nFirst 5 orders:")
print(order_table.head())
print("\nLast 5 orders:")
print(order_table[-5:])

Plotting L2 Spectra

MAROONX has 6 traces per camera and a virtual trace:

TRACE1: Sky fiber (SKY)
TRACE2, TRACE3, TRACE4: Science fibers (SCI)
TRACE5: Simultaneous calibration fiber (CAL)
TRACE6: Flux weighted combination of the 3 science fibers (VIRTUAL)

Let’s plot one order from the science traces.

[ ]:

# Plot a single order from the three science traces
order_index = 20  # Choose an order to plot
order = order_table.loc[order_index, "ECHELLE_ORDER"]
fig, axes = plt.subplots(4, 1, figsize=(12, 8), sharex=True)

for i, (ax, trace_num) in enumerate(zip(axes, [2, 3, 4, 6])):
    wave = l2[f'TRACE{trace_num}_WAVE'].data[order_index]
    flux = l2[f'TRACE{trace_num}_FLUX'].data[order_index]
    blaze = l2[f'TRACE{trace_num}_BLAZE'].data[order_index]

    # Scale blaze for visualization
    blaze_scaled = blaze * (np.nanmax(flux) / np.nanmax(blaze))

    ax.plot(wave, flux, 'b-', lw=0.5, label='Flux')
    ax.plot(wave, blaze_scaled, 'orange', lw=1, label='Blaze (scaled)')
    ax.set_ylabel(f'TRACE{trace_num}\nCounts')
    ax.legend(loc='upper right')
    ax.set_ylim(0, np.nanmax(flux) * 1.1)

axes[-1].set_xlabel('Wavelength (Angstroms)')
fig.suptitle(f'MAROONX L2 Spectra - Order {order}', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

Level 3: Stitched 1D Spectrum

Level 3 data contains a stitched 1D spectrum on a common wavelength grid with constant velocity spacing. The stitching process:

Divides out the blaze function
Resamples each order onto a common wavelength grid
Combines overlapping regions using inverse-variance weighting

Creating L3 from L2

L3 is created from an RVData-standard L2 file.

[ ]:

# Create L3 from the standard L2 file
mx_l3 = RV3.from_fits(l2_standard_red, instrument="MAROONX")

# Save to FITS file
l3_standard = f'maroonxred_SL3_{timestamp}.fits'
mx_l3.to_fits(l3_standard)
print(f"Created {l3_standard}")

Using L3 Data

Reading the L3 File

[ ]:

# Open the L3 file
l3 = fits.open(l3_standard)

# List extensions
print("L3 Extensions:")
for hdu in l3:
    print(f"  {hdu.name}")

Understanding L3 Extensions

For MAROONX with multiple science fibers, the stitched spectra are stored in STITCHED_CORR_TRACE{n}_* extensions:

STITCHED_CORR_TRACE2_WAVE/FLUX/VAR: Science fiber 2
STITCHED_CORR_TRACE3_WAVE/FLUX/VAR: Science fiber 3
STITCHED_CORR_TRACE4_WAVE/FLUX/VAR: Science fiber 4
STITCHED_CORR_TRACE6_WAVE/FLUX/VAR: Virtual fiber 6

[ ]:

# Check which STITCHED extensions are present
stitched_exts = [hdu.name for hdu in l3 if 'STITCHED' in hdu.name]
print("Stitched spectrum extensions:")
for ext in stitched_exts:
    print(f"  {ext}")

Plotting the Stitched Spectrum

[ ]:

# Plot the stitched spectrum for one trace
# Check which trace extensions exist
trace_num = 3  # Try TRACE3 first
wave_ext = f'STITCHED_CORR_TRACE{trace_num}_WAVE'
flux_ext = f'STITCHED_CORR_TRACE{trace_num}_FLUX'

if wave_ext in [hdu.name for hdu in l3]:
    wave_l3 = l3[wave_ext].data
    flux_l3 = l3[flux_ext].data

    fig, ax = plt.subplots(figsize=(14, 5))
    ax.plot(wave_l3, flux_l3, 'b-', lw=0.3)
    ax.set_xlabel('Wavelength (Angstroms)', fontsize=12)
    ax.set_ylabel('Flux (normalized)', fontsize=12)
    ax.set_title(f'MAROONX L3 Stitched Spectrum - TRACE{trace_num}', fontsize=14, fontweight='bold')

    # Zoom inset
    ax.set_xlim(wave_l3[np.isfinite(wave_l3)].min(), wave_l3[np.isfinite(wave_l3)].max())
    plt.tight_layout()
    plt.show()

    print(f"\nWavelength range: {np.nanmin(wave_l3):.1f} - {np.nanmax(wave_l3):.1f} Angstroms")
    print(f"Number of pixels: {len(wave_l3)}")
else:
    print(f"Extension {wave_ext} not found. Available: {stitched_exts}")

Zoomed View of Spectral Features

[ ]:

# Zoom in on H-alpha region
if wave_ext in [hdu.name for hdu in l3]:
    fig, ax = plt.subplots(figsize=(12, 4))

    # H-alpha region
    mask = (wave_l3 > 6550) & (wave_l3 < 6580)
    ax.plot(wave_l3[mask], flux_l3[mask], 'b-', lw=0.5)
    ax.axvline(6562.8, color='red', ls='--', alpha=0.7, label='H-alpha (6562.8 A)')
    ax.set_xlabel('Wavelength (Angstroms)')
    ax.set_ylabel('Flux')
    ax.set_title('H-alpha Region', fontsize=12)
    ax.legend()
    plt.tight_layout()
    plt.show()

Level 4: Radial Velocity Measurements

Level 4 data contains radial velocity (RV) measurements derived from the spectra. These can include:

Per-order RVs
Combined RV with uncertainty
Activity indicators

Creating L4 from MAROONX STANDARD L2 and MAROONX SERVAL pickle file

L4 is created from RVData standard L2 file. The RV data product is separately available in pkl file and contains SERVAL derived RVs

[ ]:

# Create L4 from native MAROONX L2 file (which contains RV measurements)
mx_l4 = MAROONXRV4()
mx_l4.createL4(fits.open(l2_standard_blue), rv_blue, channel="BLUE")

l4_standard = f'maroonxblue_SL4_{timestamp}.fits'

# Save to FITS file
mx_l4.to_fits(l4_standard)

print(f"Created {l4_standard}")

Using L4 Data

Reading the L4 File

[ ]:

# Open the L4 file
l4 = fits.open(l4_standard)

# Examine primary header for RV info
hdr4 = l4[0].header
print(f"Object: {hdr4['OBJECT']}")
print(f"Observation time (BJD): {hdr4.get('BJD_TDB', 'N/A')}")

# List extensions
print("\nL4 Extensions:")
for hdu in l4:
    print(f"  {hdu.name}")

Examining the RV1 Extension

The RV1 extension contains per-order radial velocity measurements with standardized column names.

[ ]:

# Examine the RV1 extension
rv1 = pd.DataFrame(l4['RV1'].data)
print("RV1 columns:")
print(rv1.columns.tolist())
print(f"\nNumber of orders: {len(rv1)}")
print("\nFirst 5 rows:")
print(rv1.head())

Per-Order RVs

The RV1 extension contains one row per echelle order. The RV column holds the combined RV measurement for each order, and ORDER_INDEX identifies which order each row corresponds to. Additional columns include BERV (barycentric earth radial velocity), WAVE_START/WAVE_END (wavelength range), and WEIGHT (RV weight).

[ ]:

# Plot per-order RVs from the RV1 extension
rv1 = pd.DataFrame(np.array(l4['RV1'].data, dtype=l4['RV1'].data.dtype.newbyteorder('=')))
fig, ax = plt.subplots(figsize=(10, 5))
valid = np.isfinite(rv1['RV'])
ax.errorbar(rv1.loc[valid, 'ORDER_INDEX'], rv1.loc[valid, 'RV'], yerr=rv1.loc[valid, 'RV_ERR'],
            fmt='.', mfc='blue', ms=8, mec='gray',ecolor='gray', alpha=0.5)
median_rv = np.nanmedian(rv1['RV'])
ax.axhline(median_rv, color='red', ls='--', label=f'Median: {median_rv:.2f} m/s')
ax.set_xlabel('Order Index')
ax.set_ylabel('RV (m/s)')
ax.set_title('Per-Order Radial Velocities (RV1)')
ax.legend()
plt.tight_layout()
plt.show()

Summary

This tutorial demonstrated how to:

Create L2 from MAROONX HDF5 and flat files using MAROONXRV2.createL2()
Use L2 data: access headers, examine extensions, plot spectra
Create L3 from standard L2 using RV3.from_fits()
Use L3 data: access stitched spectra, examine spectral features
Create L4 from standard MAROONX L2 and SERVAL pickle file using MAROONXRV4.createL4()
Use L4 data: access RV measurements and per-order RVs

The standardized data format allows consistent access patterns across all EPRV instruments!

[ ]:

# Clean up - close FITS files
l2.close()
l3.close()
l4.close()