KPF Data Tutorial
This tutorial demonstrates how to create and use RVData standard files for the Keck Planet Finder (KPF) instrument at all data levels:
Level 2 (L2): Extracted, wavelength-calibrated echelle spectra
Level 3 (L3): Stitched 1D spectrum on a common wavelength grid
Level 4 (L4): Radial velocity measurements
Prerequisites
Install the rvdata package:
pip install rv-data-standard
Setup and Data Download
First, we’ll import the necessary modules and download sample KPF data files.
[ ]:
import os
import requests
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from astropy.io import fits
# RVData imports
from rvdata.core.models.level2 import RV2
from rvdata.core.models.level3 import RV3
from rvdata.core.models.level4 import RV4
def download_file(url, filename):
    """Download a file if it doesn't already exist."""
    if not os.path.exists(filename):
        print(f"Downloading {filename}...")
        response = requests.get(url)
        response.raise_for_status()
        with open(filename, "wb") as f:
            f.write(response.content)
        print(f"Downloaded {filename}")
    else:
        print(f"{filename} already exists, skipping download.")
# KPF sample data URLs (hosted on project server)
# L0 (raw) and L1 (extracted) files are needed to create L2
# Native L2 files are needed to create L4
file_urls = {
    "l0": "http://grinnell.as.arizona.edu/~rvdata/kpf/KP.20250208.17485.59.fits",
    "l1": "http://grinnell.as.arizona.edu/~rvdata/kpf/KP.20250208.17485.59_L1.fits",
    "native_l2": "http://grinnell.as.arizona.edu/~rvdata/kpf/KP.20241022.41656.30_L2.fits",
}
# Download the files
l0_file = "KP.20250208.17485.59.fits"
l1_file = "KP.20250208.17485.59_L1.fits"
native_l2_file = "KP.20241022.41656.30_L2.fits"
download_file(file_urls["l0"], l0_file)
download_file(file_urls["l1"], l1_file)
download_file(file_urls["native_l2"], native_l2_file)
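A failed download can leave an HTML error page or a truncated file on disk under a `.fits` name. The sketch below is one possible sanity check, not part of the rvdata package: it relies only on the fact that a standard FITS file begins with an 80-byte card whose keyword is `SIMPLE`. The helper name `looks_like_fits` and the throwaway demo files are made up for illustration.

```python
import os
import tempfile


def looks_like_fits(path):
    """Return True if the file starts with the mandatory FITS 'SIMPLE  =' card."""
    # Every standard FITS file begins with an 80-byte card whose keyword is SIMPLE.
    with open(path, "rb") as f:
        first_card = f.read(80)
    return len(first_card) == 80 and first_card.startswith(b"SIMPLE  =")


# Demonstration on two throwaway files (not real KPF data).
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.fits")
    bad = os.path.join(tmp, "bad.fits")
    with open(good, "wb") as f:
        f.write(b"SIMPLE  =                    T".ljust(80) + b" " * 2800)
    with open(bad, "wb") as f:
        f.write(b"<html>404 Not Found</html>")
    good_ok = looks_like_fits(good)
    bad_ok = looks_like_fits(bad)

print(good_ok, bad_ok)
```

In the tutorial, the same check could be run on `l0_file`, `l1_file`, and `native_l2_file` before continuing. It only inspects the first card, so it does not guarantee that the rest of the file is intact.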
Level 2: Extracted Echelle Spectra
Level 2 data contains wavelength-calibrated, extracted echelle spectra organized by trace (fiber). Each trace contains flux, wavelength, variance, and blaze function arrays.
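Because each trace carries both flux and variance, a per-pixel signal-to-noise estimate is simply flux divided by the square root of the variance. The sketch below uses small synthetic arrays rather than real KPF data, and assumes the arrays are shaped `(n_orders, n_pixels)` as the indexing later in this tutorial suggests.

```python
import numpy as np

# Synthetic stand-ins for one trace: shape (n_orders, n_pixels).
rng = np.random.default_rng(0)
flux = rng.uniform(500.0, 1500.0, size=(3, 8))
var = flux.copy()      # photon-noise-like variance, for illustration only
var[0, 0] = 0.0        # a bad pixel with no usable variance
flux[1, 2] = np.nan    # a masked flux value

# S/N = flux / sqrt(variance), computed only where the variance is positive.
snr = np.full_like(flux, np.nan)
good = np.isfinite(flux) & np.isfinite(var) & (var > 0)
snr[good] = flux[good] / np.sqrt(var[good])

# One summary number per order, ignoring flagged pixels.
median_snr = np.nanmedian(snr, axis=1)
print(median_snr)
```

Masking before dividing avoids divide-by-zero warnings and keeps flagged pixels as NaN, so they drop out of the per-order median.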
Creating L2 from Native KPF Files
To create an RVData-standard L2 file from native KPF data, you need:
L0 file: Raw data with headers and telemetry
L1 file: Extracted spectra from the KPF pipeline
[ ]:
# Create RVData-standard L2 from native KPF files
kpf_l2 = RV2.from_fits(l1_file, l0file=l0_file, instrument="KPF")
# Save to FITS file
l2_standard_file = "kpf_L2_standard.fits"
kpf_l2.to_fits(l2_standard_file)
print(f"Created {l2_standard_file}")
Using L2 Data
Reading the L2 File
You can read L2 files using either astropy’s fits.open() or the RVData RV2.from_fits() method.
[ ]:
# Open using astropy
l2 = fits.open(l2_standard_file)
# Examine the primary header - same keywords regardless of instrument!
hdr = l2[0].header
print(f"Telescope: {hdr['TELESCOP']}")
print(f"Instrument: {hdr['INSTRUME']}")
print(f"Object: {hdr['OBJECT']}")
print(f"Number of traces: {hdr['NUMTRACE']}")
print("\nTrace contents:")
for i in range(1, hdr['NUMTRACE'] + 1):
    print(f"  TRACE{i}: {hdr[f'TRACE{i}']}")
Examining L2 Extensions
The EXT_DESCRIPT extension lists all FITS extensions in the file.
[ ]:
# List all extensions
ext_descript = pd.DataFrame(l2['EXT_DESCRIPT'].data)
print(ext_descript[['Name', 'Description']].to_string())
Examining the Order Table
The ORDER_TABLE extension describes the wavelength coverage of each echelle order.
[ ]:
order_table = pd.DataFrame(l2['ORDER_TABLE'].data)
print(f"Number of orders: {len(order_table)}")
print(f"\nWavelength coverage: {order_table['WAVE_START'].min():.1f} - {order_table['WAVE_END'].max():.1f} Angstroms")
print("\nFirst 5 orders:")
print(order_table.head())
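The start and end wavelengths in the order table can be used to choose which order to plot for a given spectral line, rather than picking an order index by hand. The sketch below uses a made-up four-order table, not KPF values. It also assumes that a row's position in the table matches the order index used to slice the trace arrays, which should be checked against the real file.

```python
import numpy as np

# Hypothetical order table: these numbers are made up, not KPF values.
wave_start = np.array([6400.0, 6480.0, 6560.0, 6640.0])
wave_end = np.array([6500.0, 6580.0, 6660.0, 6740.0])


def orders_covering(wavelength, starts, ends):
    """Return the indices of all orders whose range contains `wavelength`."""
    return np.flatnonzero((starts <= wavelength) & (wavelength <= ends))


def best_order(wavelength, starts, ends):
    """Among covering orders, pick the one where the line sits closest to the centre."""
    candidates = orders_covering(wavelength, starts, ends)
    if candidates.size == 0:
        return None
    centres = 0.5 * (starts[candidates] + ends[candidates])
    return int(candidates[np.argmin(np.abs(centres - wavelength))])


h_alpha = 6562.8
print(orders_covering(h_alpha, wave_start, wave_end))
print(best_order(h_alpha, wave_start, wave_end))
```

Preferring the order in which the line falls nearest the centre is a heuristic: the blaze function usually peaks near the middle of an order, so the signal tends to be strongest there.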
Plotting L2 Spectra
KPF has 5 traces:
TRACE1: Calibration fiber (etalon)
TRACE2, TRACE3, TRACE4: Science fibers
TRACE5: Sky fiber
Let’s plot one order from the science traces.
[ ]:
# Plot a single order from the three science traces
order = 30 # Choose an order to plot
fig, axes = plt.subplots(3, 1, figsize=(12, 8), sharex=True)
for ax, trace_num in zip(axes, [2, 3, 4]):
    wave = l2[f'TRACE{trace_num}_WAVE'].data[order]
    flux = l2[f'TRACE{trace_num}_FLUX'].data[order]
    blaze = l2[f'TRACE{trace_num}_BLAZE'].data[order]
    # Scale blaze for visualization
    blaze_scaled = blaze * (np.nanmax(flux) / np.nanmax(blaze))
    ax.plot(wave, flux, 'b-', lw=0.5, label='Flux')
    ax.plot(wave, blaze_scaled, 'orange', lw=1, label='Blaze (scaled)')
    ax.set_ylabel(f'TRACE{trace_num}\nCounts')
    ax.legend(loc='upper right')
    ax.set_ylim(0, np.nanmax(flux) * 1.1)
axes[-1].set_xlabel('Wavelength (Angstroms)')
fig.suptitle(f'KPF L2 Spectra - Order {order}', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
Level 3: Stitched 1D Spectrum
Level 3 data contains a stitched 1D spectrum on a common wavelength grid with constant velocity spacing. The stitching process:
Divides out the blaze function
Resamples each order onto a common wavelength grid
Combines overlapping regions using inverse-variance weighting
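The three steps above can be sketched in a few lines of NumPy. This is an illustration of the general technique under simplifying assumptions, not the rvdata implementation: it uses linear interpolation for the resampling (including for the variance, which is not statistically rigorous), assumes each order's wavelengths are strictly increasing and its blaze is nonzero, and runs on two synthetic orders of a flat spectrum. The function names `constant_velocity_grid` and `stitch_orders` are made up for this example.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def constant_velocity_grid(wave_min, wave_max, dv_kms):
    """Wavelength grid whose adjacent pixels are separated by a fixed velocity step."""
    step = np.log1p(dv_kms / C_KMS)  # uniform step in ln(wavelength)
    n = int(np.floor(np.log(wave_max / wave_min) / step)) + 1
    return wave_min * np.exp(step * np.arange(n))


def stitch_orders(waves, fluxes, variances, blazes, grid):
    """Blaze-correct each order, resample to `grid`, and inverse-variance combine."""
    weighted_sum = np.zeros_like(grid)
    weight_sum = np.zeros_like(grid)
    for wave, flux, var, blaze in zip(waves, fluxes, variances, blazes):
        # Step 1: divide out the blaze (variance scales with blaze squared).
        flux_corr = flux / blaze
        var_corr = var / blaze**2
        # Step 2: linear interpolation onto the common grid; NaN outside the order.
        f = np.interp(grid, wave, flux_corr, left=np.nan, right=np.nan)
        v = np.interp(grid, wave, var_corr, left=np.nan, right=np.nan)
        # Step 3: accumulate inverse-variance weights where this order contributes.
        ok = np.isfinite(f) & np.isfinite(v) & (v > 0)
        weighted_sum[ok] += f[ok] / v[ok]
        weight_sum[ok] += 1.0 / v[ok]
    covered = weight_sum > 0
    flux_out = np.full_like(grid, np.nan)
    var_out = np.full_like(grid, np.nan)
    flux_out[covered] = weighted_sum[covered] / weight_sum[covered]
    var_out[covered] = 1.0 / weight_sum[covered]
    return flux_out, var_out


# Two synthetic, overlapping orders of a flat spectrum (true flux = 1 everywhere).
wave_a = np.linspace(5000.0, 5060.0, 300)
wave_b = np.linspace(5040.0, 5100.0, 300)
blaze_a = 1000.0 * (1.0 - ((wave_a - 5030.0) / 60.0) ** 2)
blaze_b = 800.0 * (1.0 - ((wave_b - 5070.0) / 60.0) ** 2)
flux_a, flux_b = blaze_a.copy(), blaze_b.copy()  # observed flux = 1 * blaze
var_a, var_b = flux_a.copy(), flux_b.copy()      # photon-noise-like variance

grid = constant_velocity_grid(5000.0, 5100.0, dv_kms=1.0)
flux_1d, var_1d = stitch_orders(
    [wave_a, wave_b], [flux_a, flux_b], [var_a, var_b], [blaze_a, blaze_b], grid
)
print(len(grid), np.nanmin(flux_1d), np.nanmax(flux_1d))
```

In the overlap between the two orders, the combined variance is smaller than either order's variance alone, which is the point of inverse-variance weighting. A production pipeline would typically use a flux-conserving resampler instead of `np.interp`.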
Creating L3 from L2
L3 is created from an RVData-standard L2 file.
[ ]:
# Create L3 from the standard L2 file
kpf_l3 = RV3.from_fits(l2_standard_file, instrument="KPF")
# Save to FITS file
l3_standard_file = "kpf_L3_standard.fits"
kpf_l3.to_fits(l3_standard_file)
print(f"Created {l3_standard_file}")
Using L3 Data
Reading the L3 File
[ ]:
# Open the L3 file
l3 = fits.open(l3_standard_file)
# List extensions
print("L3 Extensions:")
for hdu in l3:
    print(f"  {hdu.name}")
Understanding L3 Extensions
For KPF with multiple science fibers, the stitched spectra are stored in STITCHED_CORR_TRACE{n}_* extensions:
STITCHED_CORR_TRACE2_WAVE/FLUX/VAR: Science fiber 1
STITCHED_CORR_TRACE3_WAVE/FLUX/VAR: Science fiber 2
STITCHED_CORR_TRACE4_WAVE/FLUX/VAR: Science fiber 3
[ ]:
# Check which STITCHED extensions are present
stitched_exts = [hdu.name for hdu in l3 if 'STITCHED' in hdu.name]
print("Stitched spectrum extensions:")
for ext in stitched_exts:
    print(f"  {ext}")
Plotting the Stitched Spectrum
[ ]:
# Plot the stitched spectrum for one trace
# Check which trace extensions exist
trace_num = 2 # Try TRACE2 first
wave_ext = f'STITCHED_CORR_TRACE{trace_num}_WAVE'
flux_ext = f'STITCHED_CORR_TRACE{trace_num}_FLUX'
if wave_ext in [hdu.name for hdu in l3]:
    wave_l3 = l3[wave_ext].data
    flux_l3 = l3[flux_ext].data
    fig, ax = plt.subplots(figsize=(14, 5))
    ax.plot(wave_l3, flux_l3, 'b-', lw=0.3)
    ax.set_xlabel('Wavelength (Angstroms)', fontsize=12)
    ax.set_ylabel('Flux (normalized)', fontsize=12)
    ax.set_title(f'KPF L3 Stitched Spectrum - TRACE{trace_num}', fontsize=14, fontweight='bold')
    # Limit the x-axis to the finite wavelength range
    ax.set_xlim(wave_l3[np.isfinite(wave_l3)].min(), wave_l3[np.isfinite(wave_l3)].max())
    plt.tight_layout()
    plt.show()
    print(f"\nWavelength range: {np.nanmin(wave_l3):.1f} - {np.nanmax(wave_l3):.1f} Angstroms")
    print(f"Number of pixels: {len(wave_l3)}")
else:
    print(f"Extension {wave_ext} not found. Available: {stitched_exts}")
Zoomed View of Spectral Features
[ ]:
# Zoom in on H-alpha region
if wave_ext in [hdu.name for hdu in l3]:
    fig, ax = plt.subplots(figsize=(12, 4))
    # H-alpha region
    mask = (wave_l3 > 6550) & (wave_l3 < 6580)
    ax.plot(wave_l3[mask], flux_l3[mask], 'b-', lw=0.5)
    ax.axvline(6562.8, color='red', ls='--', alpha=0.7, label='H-alpha (6562.8 A)')
    ax.set_xlabel('Wavelength (Angstroms)')
    ax.set_ylabel('Flux')
    ax.set_title('H-alpha Region', fontsize=12)
    ax.legend()
    plt.tight_layout()
    plt.show()
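A window like the H-alpha region plotted above can also give a rough velocity estimate from the offset between the observed line centre and the rest wavelength, using v = c (λ_obs / λ_rest − 1). The sketch below is illustrative only. It runs on a synthetic Gaussian absorption line with an injected 20 km/s shift, assumes a continuum-normalised spectrum, uses the non-relativistic Doppler formula, and applies no barycentric correction. The helper `line_centroid` is made up for this example. Real H-alpha profiles are broad and activity-sensitive, so this is not how precise RVs are measured.

```python
import numpy as np

C_KMS = 299792.458     # speed of light in km/s
H_ALPHA_REST = 6562.8  # rest wavelength in Angstroms, matching the plot marker


def line_centroid(wave, flux, continuum=1.0):
    """Depth-weighted centroid of an absorption line in a continuum-normalised window."""
    depth = np.clip(continuum - flux, 0.0, None)
    return np.sum(wave * depth) / np.sum(depth)


# Synthetic spectrum: a Gaussian absorption line shifted by a known velocity.
v_true = 20.0  # km/s, injected for the demonstration
wave = np.arange(6550.0, 6580.0, 0.02)
centre_true = H_ALPHA_REST * (1.0 + v_true / C_KMS)
flux = 1.0 - 0.6 * np.exp(-0.5 * ((wave - centre_true) / 0.8) ** 2)

# Same window as the zoomed plot, then convert the centroid offset to a velocity.
mask = (wave > 6550) & (wave < 6580)
centre_obs = line_centroid(wave[mask], flux[mask])
v_est = C_KMS * (centre_obs / H_ALPHA_REST - 1.0)
print(f"Estimated shift: {v_est:.2f} km/s")
```

On real data, noise in the continuum biases a depth-weighted centroid, so a narrower window around the line core or a profile fit would usually be more robust.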
Level 4: Radial Velocity Measurements
Level 4 data contains radial velocity (RV) measurements derived from the spectra. These can include:
Per-order RVs
Combined RV with uncertainty
Activity indicators
Creating L4 from Native KPF L2
L4 is typically created from native pipeline outputs that contain RV measurements. For KPF, the native L2 file includes CCF-derived RVs.
[ ]:
# Create L4 from native KPF L2 file (which contains RV measurements)
kpf_l4 = RV4.from_fits(native_l2_file, instrument="KPF")
# Save to FITS file
l4_standard_file = "kpf_L4_standard.fits"
kpf_l4.to_fits(l4_standard_file)
print(f"Created {l4_standard_file}")
Using L4 Data
Reading the L4 File
[ ]:
# Open the L4 file
l4 = fits.open(l4_standard_file)
# Examine primary header for RV info
hdr4 = l4[0].header
print(f"Object: {hdr4['OBJECT']}")
print(f"Observation time (BJD): {hdr4.get('BJD_TDB', 'N/A')}")
# List extensions
print("\nL4 Extensions:")
for hdu in l4:
    print(f"  {hdu.name}")
Examining the RV1 Extension
The RV1 extension contains per-order radial velocity measurements with standardized column names.
[ ]:
# Examine the RV1 extension
rv1 = pd.DataFrame(l4['RV1'].data)
print("RV1 columns:")
print(rv1.columns.tolist())
print(f"\nNumber of orders: {len(rv1)}")
print("\nFirst 5 rows:")
print(rv1.head())
Per-Order RVs
The RV1 extension contains one row per echelle order. The RV column holds the RV measurement for each order, and ORDER_INDEX identifies which order each row corresponds to. Additional columns include BERV (barycentric Earth radial velocity), WAVE_START/WAVE_END (wavelength range), and WEIGHT (RV weight).
[ ]:
# Plot per-order RVs from the RV1 extension
rv1 = pd.DataFrame(l4['RV1'].data)
fig, ax = plt.subplots(figsize=(10, 5))
valid = np.isfinite(rv1['RV'])
ax.scatter(rv1.loc[valid, 'ORDER_INDEX'], rv1.loc[valid, 'RV'], s=20, alpha=0.7)
median_rv = np.nanmedian(rv1['RV'])
ax.axhline(median_rv, color='red', ls='--', label=f'Median: {median_rv:.2f} m/s')
ax.set_xlabel('Order Index')
ax.set_ylabel('RV (m/s)')
ax.set_title('Per-Order Radial Velocities (RV1)')
ax.legend()
plt.tight_layout()
plt.show()
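The median shown in the plot ignores the per-order weights. One common alternative is a weighted mean over orders, sketched below on made-up numbers standing in for the RV and WEIGHT columns. How the weights in a KPF file are defined and normalised is not covered here, so treat this as an assumption-laden illustration, not a reproduction of the pipeline's combined RV.

```python
import numpy as np

# Made-up per-order values standing in for the RV and WEIGHT columns.
rv = np.array([10.0, 12.0, np.nan, 11.0, 14.0])
weight = np.array([1.0, 3.0, 2.0, 0.0, 1.0])


def weighted_mean_rv(rv, weight):
    """Weighted mean over orders with a finite RV and a positive weight."""
    use = np.isfinite(rv) & np.isfinite(weight) & (weight > 0)
    if not np.any(use):
        return np.nan
    return np.sum(weight[use] * rv[use]) / np.sum(weight[use])


combined = weighted_mean_rv(rv, weight)
print(combined)
```

With a real file, the inputs would be `rv1['RV'].to_numpy()` and `rv1['WEIGHT'].to_numpy()`. Orders with a NaN RV or a zero weight are excluded, so they do not affect the result.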
Summary
This tutorial demonstrated how to:
Create L2 from native KPF L0+L1 files using RV2.from_fits()
Use L2 data: access headers, examine extensions, plot spectra
Create L3 from standard L2 using RV3.from_fits()
Use L3 data: access stitched spectra, examine spectral features
Create L4 from native KPF L2 using RV4.from_fits()
Use L4 data: access RV measurements and per-order RVs
The standardized data format allows consistent access patterns across all EPRV instruments!
[ ]:
# Clean up - close FITS files
l2.close()
l3.close()
l4.close()
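As an alternative to closing each file by hand, astropy's `fits.open()` can be used as a context manager, which closes the file when the block exits even if an error occurs. The sketch below writes a tiny stand-in file so it runs on its own. The file name, header value, and extension contents are made up and do not represent a real RVData product.

```python
import os
import tempfile

import numpy as np
from astropy.io import fits

with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny stand-in file; the extension name is only illustrative.
    path = os.path.join(tmp, "demo.fits")
    primary = fits.PrimaryHDU()
    primary.header["OBJECT"] = "demo-star"
    image = fits.ImageHDU(data=np.arange(6.0).reshape(2, 3), name="TRACE2_FLUX")
    fits.HDUList([primary, image]).writeto(path)

    # The file is closed automatically when the block exits, even on an error.
    with fits.open(path) as hdul:
        obj = hdul[0].header["OBJECT"]
        names = [hdu.name for hdu in hdul]
        # Copy anything needed later: data not yet read cannot be loaded after close.
        flux = hdul["TRACE2_FLUX"].data.copy()

print(obj, names, flux.shape)
```

The same pattern applies to `l2_standard_file`, `l3_standard_file`, and `l4_standard_file`. The trade-off is that arrays needed after the block must be read or copied inside it.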