{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# KPF Data Tutorial\n", "\n", "This tutorial demonstrates how to create and use RVData standard files for the Keck Planet Finder (KPF) instrument at all data levels:\n", "\n", "- **Level 2 (L2)**: Extracted, wavelength-calibrated echelle spectra\n", "- **Level 3 (L3)**: Stitched 1D spectrum on a common wavelength grid\n", "- **Level 4 (L4)**: Radial velocity measurements\n", "\n", "## Prerequisites\n", "\n", "Install the rvdata package:\n", "```bash\n", "pip install rv-data-standard\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup and Data Download\n", "\n", "First, we'll import the necessary modules and download sample KPF data files." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:21:34.633277Z", "iopub.status.busy": "2026-02-11T05:21:34.633176Z", "iopub.status.idle": "2026-02-11T05:21:55.907082Z", "shell.execute_reply": "2026-02-11T05:21:55.906650Z" } }, "outputs": [], "source": [ "import os\n", "import requests\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "from astropy.io import fits\n", "\n", "# RVData imports\n", "from rvdata.core.models.level2 import RV2\n", "from rvdata.core.models.level3 import RV3\n", "from rvdata.core.models.level4 import RV4\n", "\n", "def download_file(url, filename):\n", " \"\"\"Download a file if it doesn't already exist.\"\"\"\n", " if not os.path.exists(filename):\n", " print(f\"Downloading {filename}...\")\n", " response = requests.get(url)\n", " response.raise_for_status()\n", " with open(filename, \"wb\") as f:\n", " f.write(response.content)\n", " print(f\"Downloaded {filename}\")\n", " else:\n", " print(f\"{filename} already exists, skipping download.\")\n", "\n", "# KPF sample data URLs (hosted on project server)\n", "# L0 (raw) and L1 (extracted) files are needed to create L2\n", "# Native L2 files are needed to create L4\n", "file_urls = {\n", " \"l0\": \"http://grinnell.as.arizona.edu/~rvdata/kpf/KP.20250208.17485.59.fits\",\n", " \"l1\": \"http://grinnell.as.arizona.edu/~rvdata/kpf/KP.20250208.17485.59_L1.fits\",\n", " \"native_l2\": \"http://grinnell.as.arizona.edu/~rvdata/kpf/KP.20241022.41656.30_L2.fits\",\n", "}\n", "\n", "# Download the files\n", "l0_file = \"KP.20250208.17485.59.fits\"\n", "l1_file = \"KP.20250208.17485.59_L1.fits\"\n", "native_l2_file = \"KP.20241022.41656.30_L2.fits\"\n", "\n", "download_file(file_urls[\"l0\"], l0_file)\n", "download_file(file_urls[\"l1\"], l1_file)\n", "download_file(file_urls[\"native_l2\"], native_l2_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "# Level 2: Extracted Echelle Spectra\n", "\n", "Level 2 data contains wavelength-calibrated, extracted echelle spectra organized by trace (fiber). Each trace contains flux, wavelength, variance, and blaze function arrays.\n", "\n", "## Creating L2 from Native KPF Files\n", "\n", "To create an RVData-standard L2 file from native KPF data, you need:\n", "- **L0 file**: Raw data with headers and telemetry\n", "- **L1 file**: Extracted spectra from the KPF pipeline" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:21:55.909208Z", "iopub.status.busy": "2026-02-11T05:21:55.908942Z", "iopub.status.idle": "2026-02-11T05:21:57.052919Z", "shell.execute_reply": "2026-02-11T05:21:57.052383Z" } }, "outputs": [], "source": [ "# Create RVData-standard L2 from native KPF files\n", "kpf_l2 = RV2.from_fits(l1_file, l0file=l0_file, instrument=\"KPF\")\n", "\n", "# Save to FITS file\n", "l2_standard_file = \"kpf_L2_standard.fits\"\n", "kpf_l2.to_fits(l2_standard_file)\n", "print(f\"Created {l2_standard_file}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using L2 Data\n", "\n", "### Reading the L2 File\n", "\n", "You can read L2 files using either astropy's `fits.open()` or the RVData `RV2.from_fits()` method." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:21:57.054390Z", "iopub.status.busy": "2026-02-11T05:21:57.054288Z", "iopub.status.idle": "2026-02-11T05:21:57.057301Z", "shell.execute_reply": "2026-02-11T05:21:57.057125Z" } }, "outputs": [], "source": [ "# Open using astropy\n", "l2 = fits.open(l2_standard_file)\n", "\n", "# Examine the primary header - same keywords regardless of instrument!\n", "hdr = l2[0].header\n", "print(f\"Telescope: {hdr['TELESCOP']}\")\n", "print(f\"Instrument: {hdr['INSTRUME']}\")\n", "print(f\"Object: {hdr['OBJECT']}\")\n", "print(f\"Number of traces: {hdr['NUMTRACE']}\")\n", "print(\"\\nTrace contents:\")\n", "for i in range(1, hdr['NUMTRACE'] + 1):\n", " print(f\" TRACE{i}: {hdr[f'TRACE{i}']}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Examining L2 Extensions\n", "\n", "The `EXT_DESCRIPT` extension lists all FITS extensions in the file." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:21:57.058667Z", "iopub.status.busy": "2026-02-11T05:21:57.058585Z", "iopub.status.idle": "2026-02-11T05:21:57.069504Z", "shell.execute_reply": "2026-02-11T05:21:57.069303Z" } }, "outputs": [], "source": [ "# List all extensions\n", "ext_descript = pd.DataFrame(l2['EXT_DESCRIPT'].data)\n", "print(ext_descript[['Name', 'Description']].to_string())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Examining the Order Table\n", "\n", "The `ORDER_TABLE` extension describes the wavelength coverage of each echelle order." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:21:57.070782Z", "iopub.status.busy": "2026-02-11T05:21:57.070690Z", "iopub.status.idle": "2026-02-11T05:21:57.074689Z", "shell.execute_reply": "2026-02-11T05:21:57.074486Z" } }, "outputs": [], "source": [ "order_table = pd.DataFrame(l2['ORDER_TABLE'].data)\n", "print(f\"Number of orders: {len(order_table)}\")\n", "print(f\"\\nWavelength coverage: {order_table['WAVE_START'].min():.1f} - {order_table['WAVE_END'].max():.1f} Angstroms\")\n", "print(\"\\nFirst 5 orders:\")\n", "print(order_table.head())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting L2 Spectra\n", "\n", "KPF has 5 traces:\n", "- TRACE1: Calibration fiber (etalon)\n", "- TRACE2, TRACE3, TRACE4: Science fibers\n", "- TRACE5: Sky fiber\n", "\n", "Let's plot one order from the science traces." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:21:57.075986Z", "iopub.status.busy": "2026-02-11T05:21:57.075904Z", "iopub.status.idle": "2026-02-11T05:21:57.396971Z", "shell.execute_reply": "2026-02-11T05:21:57.393762Z" } }, "outputs": [], "source": [ "# Plot a single order from the three science traces\n", "order = 30 # Choose an order to plot\n", "\n", "fig, axes = plt.subplots(3, 1, figsize=(12, 8), sharex=True)\n", "\n", "for i, (ax, trace_num) in enumerate(zip(axes, [2, 3, 4])):\n", " wave = l2[f'TRACE{trace_num}_WAVE'].data[order]\n", " flux = l2[f'TRACE{trace_num}_FLUX'].data[order]\n", " blaze = l2[f'TRACE{trace_num}_BLAZE'].data[order]\n", " \n", " # Scale blaze for visualization\n", " blaze_scaled = blaze * (np.nanmax(flux) / np.nanmax(blaze))\n", " \n", " ax.plot(wave, flux, 'b-', lw=0.5, label='Flux')\n", " ax.plot(wave, blaze_scaled, 'orange', lw=1, label='Blaze (scaled)')\n", " ax.set_ylabel(f'TRACE{trace_num}\\nCounts')\n", " ax.legend(loc='upper right')\n", " ax.set_ylim(0, np.nanmax(flux) * 1.1)\n", "\n", "axes[-1].set_xlabel('Wavelength (Angstroms)')\n", "fig.suptitle(f'KPF L2 Spectra - Order {order}', fontsize=14, fontweight='bold')\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "# Level 3: Stitched 1D Spectrum\n", "\n", "Level 3 data contains a stitched 1D spectrum on a common wavelength grid with constant velocity spacing. The stitching process:\n", "1. Divides out the blaze function\n", "2. Resamples each order onto a common wavelength grid\n", "3. Combines overlapping regions using inverse-variance weighting\n", "\n", "## Creating L3 from L2\n", "\n", "L3 is created from an RVData-standard L2 file." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:21:57.400984Z", "iopub.status.busy": "2026-02-11T05:21:57.400802Z", "iopub.status.idle": "2026-02-11T05:22:00.563685Z", "shell.execute_reply": "2026-02-11T05:22:00.563335Z" } }, "outputs": [], "source": [ "# Create L3 from the standard L2 file\n", "kpf_l3 = RV3.from_fits(l2_standard_file, instrument=\"KPF\")\n", "\n", "# Save to FITS file\n", "l3_standard_file = \"kpf_L3_standard.fits\"\n", "kpf_l3.to_fits(l3_standard_file)\n", "print(f\"Created {l3_standard_file}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using L3 Data\n", "\n", "### Reading the L3 File" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:22:00.565421Z", "iopub.status.busy": "2026-02-11T05:22:00.565295Z", "iopub.status.idle": "2026-02-11T05:22:00.572235Z", "shell.execute_reply": "2026-02-11T05:22:00.572033Z" } }, "outputs": [], "source": [ "# Open the L3 file\n", "l3 = fits.open(l3_standard_file)\n", "\n", "# List extensions\n", "print(\"L3 Extensions:\")\n", "for hdu in l3:\n", " print(f\" {hdu.name}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Understanding L3 Extensions\n", "\n", "For KPF with multiple science fibers, the stitched spectra are stored in `STITCHED_CORR_TRACE{n}_*` extensions:\n", "- `STITCHED_CORR_TRACE2_WAVE/FLUX/VAR`: Science fiber 1\n", "- `STITCHED_CORR_TRACE3_WAVE/FLUX/VAR`: Science fiber 2 \n", "- `STITCHED_CORR_TRACE4_WAVE/FLUX/VAR`: Science fiber 3" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:22:00.573442Z", "iopub.status.busy": "2026-02-11T05:22:00.573358Z", "iopub.status.idle": "2026-02-11T05:22:00.575327Z", "shell.execute_reply": "2026-02-11T05:22:00.575108Z" } }, "outputs": [], "source": [ "# Check which STITCHED extensions are present\n", "stitched_exts = [hdu.name for hdu in l3 if 'STITCHED' in hdu.name]\n", "print(\"Stitched spectrum extensions:\")\n", "for ext in stitched_exts:\n", " print(f\" {ext}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting the Stitched Spectrum" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:22:00.576640Z", "iopub.status.busy": "2026-02-11T05:22:00.576561Z", "iopub.status.idle": "2026-02-11T05:22:00.743617Z", "shell.execute_reply": "2026-02-11T05:22:00.742646Z" } }, "outputs": [], "source": [ "# Plot the stitched spectrum for one trace\n", "# Check which trace extensions exist\n", "trace_num = 2 # Try TRACE2 first\n", "wave_ext = f'STITCHED_CORR_TRACE{trace_num}_WAVE'\n", "flux_ext = f'STITCHED_CORR_TRACE{trace_num}_FLUX'\n", "\n", "if wave_ext in [hdu.name for hdu in l3]:\n", " wave_l3 = l3[wave_ext].data\n", " flux_l3 = l3[flux_ext].data\n", " \n", " fig, ax = plt.subplots(figsize=(14, 5))\n", " ax.plot(wave_l3, flux_l3, 'b-', lw=0.3)\n", " ax.set_xlabel('Wavelength (Angstroms)', fontsize=12)\n", " ax.set_ylabel('Flux (normalized)', fontsize=12)\n", " ax.set_title(f'KPF L3 Stitched Spectrum - TRACE{trace_num}', fontsize=14, fontweight='bold')\n", " \n", " # Zoom inset\n", " ax.set_xlim(wave_l3[np.isfinite(wave_l3)].min(), wave_l3[np.isfinite(wave_l3)].max())\n", " plt.tight_layout()\n", " plt.show()\n", " \n", " print(f\"\\nWavelength range: {np.nanmin(wave_l3):.1f} - {np.nanmax(wave_l3):.1f} Angstroms\")\n", " print(f\"Number of pixels: {len(wave_l3)}\")\n", "else:\n", " print(f\"Extension {wave_ext} not found. Available: {stitched_exts}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Zoomed View of Spectral Features" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:22:00.750903Z", "iopub.status.busy": "2026-02-11T05:22:00.750652Z", "iopub.status.idle": "2026-02-11T05:22:00.876451Z", "shell.execute_reply": "2026-02-11T05:22:00.876196Z" } }, "outputs": [], "source": [ "# Zoom in on H-alpha region\n", "if wave_ext in [hdu.name for hdu in l3]:\n", " fig, ax = plt.subplots(figsize=(12, 4))\n", " \n", " # H-alpha region\n", " mask = (wave_l3 > 6550) & (wave_l3 < 6580)\n", " ax.plot(wave_l3[mask], flux_l3[mask], 'b-', lw=0.5)\n", " ax.axvline(6562.8, color='red', ls='--', alpha=0.7, label='H-alpha (6562.8 A)')\n", " ax.set_xlabel('Wavelength (Angstroms)')\n", " ax.set_ylabel('Flux')\n", " ax.set_title('H-alpha Region', fontsize=12)\n", " ax.legend()\n", " plt.tight_layout()\n", " plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "# Level 4: Radial Velocity Measurements\n", "\n", "Level 4 data contains radial velocity (RV) measurements derived from the spectra. These can include:\n", "- Per-order RVs\n", "- Combined RV with uncertainty\n", "- Activity indicators\n", "\n", "## Creating L4 from Native KPF L2\n", "\n", "L4 is typically created from native pipeline outputs that contain RV measurements. For KPF, the native L2 file includes CCF-derived RVs." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:22:00.877885Z", "iopub.status.busy": "2026-02-11T05:22:00.877800Z", "iopub.status.idle": "2026-02-11T05:22:01.036284Z", "shell.execute_reply": "2026-02-11T05:22:01.035888Z" } }, "outputs": [], "source": [ "# Create L4 from native KPF L2 file (which contains RV measurements)\n", "kpf_l4 = RV4.from_fits(native_l2_file, instrument=\"KPF\")\n", "\n", "# Save to FITS file\n", "l4_standard_file = \"kpf_L4_standard.fits\"\n", "kpf_l4.to_fits(l4_standard_file)\n", "print(f\"Created {l4_standard_file}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using L4 Data\n", "\n", "### Reading the L4 File" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:22:01.037819Z", "iopub.status.busy": "2026-02-11T05:22:01.037713Z", "iopub.status.idle": "2026-02-11T05:22:01.043681Z", "shell.execute_reply": "2026-02-11T05:22:01.043406Z" } }, "outputs": [], "source": [ "# Open the L4 file\n", "l4 = fits.open(l4_standard_file)\n", "\n", "# Examine primary header for RV info\n", "hdr4 = l4[0].header\n", "print(f\"Object: {hdr4['OBJECT']}\")\n", "print(f\"Observation time (BJD): {hdr4.get('BJD_TDB', 'N/A')}\")\n", "\n", "# List extensions\n", "print(\"\\nL4 Extensions:\")\n", "for hdu in l4:\n", " print(f\" {hdu.name}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Examining the RV1 Extension\n", "\n", "The `RV1` extension contains per-order radial velocity measurements with standardized column names." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:22:01.044998Z", "iopub.status.busy": "2026-02-11T05:22:01.044908Z", "iopub.status.idle": "2026-02-11T05:22:01.054282Z", "shell.execute_reply": "2026-02-11T05:22:01.054075Z" } }, "outputs": [], "source": [ "# Examine the RV1 extension\n", "rv1 = pd.DataFrame(l4['RV1'].data)\n", "print(\"RV1 columns:\")\n", "print(rv1.columns.tolist())\n", "print(f\"\\nNumber of orders: {len(rv1)}\")\n", "print(\"\\nFirst 5 rows:\")\n", "print(rv1.head())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Per-Order RVs\n", "\n", "The `RV1` extension contains one row per echelle order. The `RV` column holds the combined RV measurement for each order, and `ORDER_INDEX` identifies which order each row corresponds to. Additional columns include `BERV` (barycentric earth radial velocity), `WAVE_START`/`WAVE_END` (wavelength range), and `WEIGHT` (RV weight)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:22:01.055491Z", "iopub.status.busy": "2026-02-11T05:22:01.055405Z", "iopub.status.idle": "2026-02-11T05:22:01.220368Z", "shell.execute_reply": "2026-02-11T05:22:01.218874Z" } }, "outputs": [], "source": [ "# Plot per-order RVs from the RV1 extension\n", "rv1 = pd.DataFrame(l4['RV1'].data)\n", "\n", "fig, ax = plt.subplots(figsize=(10, 5))\n", "valid = np.isfinite(rv1['RV'])\n", "ax.scatter(rv1.loc[valid, 'ORDER_INDEX'], rv1.loc[valid, 'RV'], s=20, alpha=0.7)\n", "median_rv = np.nanmedian(rv1['RV'])\n", "ax.axhline(median_rv, color='red', ls='--', label=f'Median: {median_rv:.2f} m/s')\n", "ax.set_xlabel('Order Index')\n", "ax.set_ylabel('RV (m/s)')\n", "ax.set_title('Per-Order Radial Velocities (RV1)')\n", "ax.legend()\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "# Summary\n", "\n", "This tutorial demonstrated how to:\n", "\n", "1. **Create L2** from native KPF L0+L1 files using `RV2.from_fits()`\n", "2. **Use L2** data: access headers, examine extensions, plot spectra\n", "3. **Create L3** from standard L2 using `RV3.from_fits()`\n", "4. **Use L3** data: access stitched spectra, examine spectral features\n", "5. **Create L4** from native KPF L2 using `RV4.from_fits()`\n", "6. **Use L4** data: access RV measurements and per-order RVs\n", "\n", "The standardized data format allows consistent access patterns across all EPRV instruments!" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2026-02-11T05:22:01.223273Z", "iopub.status.busy": "2026-02-11T05:22:01.223086Z", "iopub.status.idle": "2026-02-11T05:22:01.226535Z", "shell.execute_reply": "2026-02-11T05:22:01.225796Z" } }, "outputs": [], "source": [ "# Clean up - close FITS files\n", "l2.close()\n", "l3.close()\n", "l4.close()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.8" } }, "nbformat": 4, "nbformat_minor": 4 }