{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MAROONX Data Tutorial\n",
    "\n",
    "This tutorial demonstrates how to create and use RVData standard files for the MAROONX instrument at all data levels:\n",
    "\n",
    "- **Level 2 (L2)**: Extracted, wavelength-calibrated echelle spectra\n",
    "- **Level 3 (L3)**: Stitched 1D spectrum on a common wavelength grid\n",
    "- **Level 4 (L4)**: Radial velocity measurements\n",
    "\n",
    "## Prerequisites\n",
    "\n",
    "Install the rvdata package:\n",
    "```bash\n",
    "pip install rv-data-standard\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup and Data Download\n",
    "\n",
    "First, we'll import the necessary modules and download sample MAROONX data files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import requests\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "from astropy.io import fits\n",
    "# RVData imports\n",
    "from rvdata.core.models.level3 import RV3\n",
    "from rvdata.instruments.maroonx.level2 import MAROONXRV2\n",
    "from rvdata.instruments.maroonx.level4 import MAROONXRV4\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def download_file(url, filename):\n",
    "    \"\"\"Download a file if it doesn't already exist.\"\"\"\n",
    "    if not os.path.exists(filename):\n",
    "        print(f\"Downloading {filename}...\")\n",
    "        response = requests.get(url)\n",
    "        response.raise_for_status()\n",
    "        with open(filename, \"wb\") as f:\n",
    "            f.write(response.content)\n",
    "        print(f\"Downloaded {filename}\")\n",
    "    else:\n",
    "        print(f\"{filename} already exists, skipping download.\")\n",
    "\n",
    "# MAROONX sample data URLs (hosted on project server)\n",
    "# MAROONX HDF5 dataproduct and a MAROONX flat file are needed to create L2\n",
    "# MAROONX standard Level 2 FITS file and a MAROONX RV SERVAL pickle file are needed to create L4\n",
    "\n",
    "\n",
    "pathname = \"http://grinnell.as.arizona.edu/~rvdata/maroonx/\"\n",
    "file_urls = {\n",
    "    \"MAROONX\": {\n",
    "        \"MX_OP\": pathname + \"20240923T092056Z_SOOOE_x_0040.hd5\",\n",
    "        \"FLAT\": pathname + \"20240916T19_masterflat__FFFFF_x__blaze.hd5\",\n",
    "        \"SERVAL_BLUE\": pathname + \"tauCet_activity_results_blue.pkl\",\n",
    "        \"SERVAL_RED\": pathname + \"tauCet_activity_results_red.pkl\"\n",
    "    }\n",
    "}\n",
    "\n",
    "# Download the files\n",
    "file_mx = \"20240923T092056Z_SOOOE_x_0040.hd5\"\n",
    "flat_file = \"20240916T19_masterflat__FFFFF_x__blaze.hd5\"\n",
    "rv_blue = \"tauCet_activity_results_blue.pkl\"\n",
    "rv_red = \"tauCet_activity_results_red.pkl\"\n",
    "\n",
    "download_file(file_urls[\"MAROONX\"]['MX_OP'], file_mx)\n",
    "download_file(file_urls[\"MAROONX\"]['FLAT'], flat_file)\n",
    "download_file(file_urls[\"MAROONX\"]['SERVAL_BLUE'], rv_blue)\n",
    "download_file(file_urls[\"MAROONX\"]['SERVAL_RED'], rv_red)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "# Level 2: Extracted Echelle Spectra\n",
    "\n",
    "Level 2 data contains wavelength-calibrated, extracted echelle spectra organized by trace (fiber). Each trace contains flux, wavelength, variance, and blaze function arrays.\n",
    "\n",
    "## Creating L2 from MAROONX HDF5 Files\n",
    "\n",
    "To create an RVData-standard L2 file from MAROONX data product, you need:\n",
    "- **HDF5 file**: Extracted object spectra from the MAROONX pipeline\n",
    "- **FLAT file**: Extracted flat spectra from the MAROONX pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create RVData-standard L2 from MAROONX HDF5 files\n",
    "timestamp = os.path.basename(file_mx).split(\"_\")[0].replace(\"Z\", \"\")\n",
    "mx_l2 = MAROONXRV2()\n",
    "mx_l2.createL2(file_mx, flat_file)\n",
    "\n",
    "# Create separate fits files for Blue and red camera\n",
    "l2_standard_blue, l2_standard_red = mx_l2.write_camera_fits(file_mx,\n",
    "                                                        out_dir='.')\n",
    "\n",
    "print(f\"Created {l2_standard_blue}\")\n",
    "print(f\"Created {l2_standard_red}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using L2 Data\n",
    "\n",
    "### Reading the L2 File\n",
    "\n",
    "You can read L2 files using either astropy's `fits.open()` or the RVData `RV2.from_fits()` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Open red camera fits file using astropy\n",
    "l2 = fits.open(l2_standard_red)\n",
    "\n",
    "# Examine the primary header - same keywords regardless of instrument!\n",
    "hdr = l2[0].header\n",
    "print(f\"Telescope: {hdr['TELESCOP']}\")\n",
    "print(f\"Instrument: {hdr['INSTRUME']}\")\n",
    "print(f\"Object: {hdr['OBJECT']}\")\n",
    "print(f\"Number of traces: {hdr['NUMTRACE']}\")\n",
    "print(\"\\nTrace contents:\")\n",
    "for i in range(1, hdr['NUMTRACE'] + 1):\n",
    "    print(f\"  TRACE{i}: {hdr[f'TRACE{i}']}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Examining L2 Extensions\n",
    "\n",
    "The `EXT_DESCRIPT` extension lists all FITS extensions in the file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# List all extensions\n",
    "ext_descript = pd.DataFrame(l2['EXT_DESCRIPT'].data)\n",
    "print(ext_descript[['Name', 'Description']].to_string())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Examining the Order Table\n",
    "\n",
    "The `ORDER_TABLE` extension describes the wavelength coverage of each echelle order."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "order_table = pd.DataFrame(l2['ORDER_TABLE'].data)\n",
    "print(f\"Number of orders: {len(order_table)}\")\n",
    "print(f\"\\nWavelength coverage: {order_table['WAVE_START'].min():.1f} - {order_table['WAVE_END'].max():.1f} Angstroms\")\n",
    "print(\"\\nFirst 5 orders:\")\n",
    "print(order_table.head())\n",
    "print(\"\\nLast 5 orders:\")\n",
    "print(order_table[-5:])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plotting L2 Spectra\n",
    "\n",
    "MAROONX has 6 traces per camera and a virtual trace:\n",
    "- TRACE1: Sky fiber (SKY)\n",
    "- TRACE2, TRACE3, TRACE4: Science fibers (SCI)\n",
    "- TRACE5: Simultaneous calibration fiber (CAL)\n",
    "- TRACE6: Flux weighted combination of the 3 science fibers (VIRTUAL)\n",
    "\n",
    "Let's plot one order from the science traces."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot a single order from the three science traces\n",
    "order_index = 20  # Choose an order to plot\n",
    "order = order_table.loc[order_index, \"ECHELLE_ORDER\"]\n",
    "fig, axes = plt.subplots(4, 1, figsize=(12, 8), sharex=True)\n",
    "\n",
    "for i, (ax, trace_num) in enumerate(zip(axes, [2, 3, 4, 6])):\n",
    "    wave = l2[f'TRACE{trace_num}_WAVE'].data[order_index]\n",
    "    flux = l2[f'TRACE{trace_num}_FLUX'].data[order_index]\n",
    "    blaze = l2[f'TRACE{trace_num}_BLAZE'].data[order_index]\n",
    "    \n",
    "    # Scale blaze for visualization\n",
    "    blaze_scaled = blaze * (np.nanmax(flux) / np.nanmax(blaze))\n",
    "    \n",
    "    ax.plot(wave, flux, 'b-', lw=0.5, label='Flux')\n",
    "    ax.plot(wave, blaze_scaled, 'orange', lw=1, label='Blaze (scaled)')\n",
    "    ax.set_ylabel(f'TRACE{trace_num}\\nCounts')\n",
    "    ax.legend(loc='upper right')\n",
    "    ax.set_ylim(0, np.nanmax(flux) * 1.1)\n",
    "\n",
    "axes[-1].set_xlabel('Wavelength (Angstroms)')\n",
    "fig.suptitle(f'MAROONX L2 Spectra - Order {order}', fontsize=14, fontweight='bold')\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "# Level 3: Stitched 1D Spectrum\n",
    "\n",
    "Level 3 data contains a stitched 1D spectrum on a common wavelength grid with constant velocity spacing. The stitching process:\n",
    "1. Divides out the blaze function\n",
    "2. Resamples each order onto a common wavelength grid\n",
    "3. Combines overlapping regions using inverse-variance weighting\n",
    "\n",
    "## Creating L3 from L2\n",
    "\n",
    "L3 is created from an RVData-standard L2 file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create L3 from the standard L2 file\n",
    "mx_l3 = RV3.from_fits(l2_standard_red, instrument=\"MAROONX\")\n",
    "\n",
    "# Save to FITS file\n",
    "l3_standard = f'maroonxred_SL3_{timestamp}.fits'\n",
    "mx_l3.to_fits(l3_standard)\n",
    "print(f\"Created {l3_standard}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using L3 Data\n",
    "\n",
    "### Reading the L3 File"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Open the L3 file\n",
    "l3 = fits.open(l3_standard)\n",
    "\n",
    "# List extensions\n",
    "print(\"L3 Extensions:\")\n",
    "for hdu in l3:\n",
    "    print(f\"  {hdu.name}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Understanding L3 Extensions\n",
    "\n",
    "For MAROONX with multiple science fibers, the stitched spectra are stored in `STITCHED_CORR_TRACE{n}_*` extensions:\n",
    "- `STITCHED_CORR_TRACE2_WAVE/FLUX/VAR`: Science fiber 2\n",
    "- `STITCHED_CORR_TRACE3_WAVE/FLUX/VAR`: Science fiber 3  \n",
    "- `STITCHED_CORR_TRACE4_WAVE/FLUX/VAR`: Science fiber 4\n",
    "- `STITCHED_CORR_TRACE6_WAVE/FLUX/VAR`: Virtual fiber 6"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Check which STITCHED extensions are present\n",
    "stitched_exts = [hdu.name for hdu in l3 if 'STITCHED' in hdu.name]\n",
    "print(\"Stitched spectrum extensions:\")\n",
    "for ext in stitched_exts:\n",
    "    print(f\"  {ext}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plotting the Stitched Spectrum"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot the stitched spectrum for one trace\n",
    "# Check which trace extensions exist\n",
    "trace_num = 3  # Try TRACE3 first\n",
    "wave_ext = f'STITCHED_CORR_TRACE{trace_num}_WAVE'\n",
    "flux_ext = f'STITCHED_CORR_TRACE{trace_num}_FLUX'\n",
    "\n",
    "if wave_ext in [hdu.name for hdu in l3]:\n",
    "    wave_l3 = l3[wave_ext].data\n",
    "    flux_l3 = l3[flux_ext].data\n",
    "    \n",
    "    fig, ax = plt.subplots(figsize=(14, 5))\n",
    "    ax.plot(wave_l3, flux_l3, 'b-', lw=0.3)\n",
    "    ax.set_xlabel('Wavelength (Angstroms)', fontsize=12)\n",
    "    ax.set_ylabel('Flux (normalized)', fontsize=12)\n",
    "    ax.set_title(f'MAROONX L3 Stitched Spectrum - TRACE{trace_num}', fontsize=14, fontweight='bold')\n",
    "    \n",
    "    # Zoom inset\n",
    "    ax.set_xlim(wave_l3[np.isfinite(wave_l3)].min(), wave_l3[np.isfinite(wave_l3)].max())\n",
    "    plt.tight_layout()\n",
    "    plt.show()\n",
    "    \n",
    "    print(f\"\\nWavelength range: {np.nanmin(wave_l3):.1f} - {np.nanmax(wave_l3):.1f} Angstroms\")\n",
    "    print(f\"Number of pixels: {len(wave_l3)}\")\n",
    "else:\n",
    "    print(f\"Extension {wave_ext} not found. Available: {stitched_exts}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Zoomed View of Spectral Features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Zoom in on H-alpha region\n",
    "if wave_ext in [hdu.name for hdu in l3]:\n",
    "    fig, ax = plt.subplots(figsize=(12, 4))\n",
    "    \n",
    "    # H-alpha region\n",
    "    mask = (wave_l3 > 6550) & (wave_l3 < 6580)\n",
    "    ax.plot(wave_l3[mask], flux_l3[mask], 'b-', lw=0.5)\n",
    "    ax.axvline(6562.8, color='red', ls='--', alpha=0.7, label='H-alpha (6562.8 A)')\n",
    "    ax.set_xlabel('Wavelength (Angstroms)')\n",
    "    ax.set_ylabel('Flux')\n",
    "    ax.set_title('H-alpha Region', fontsize=12)\n",
    "    ax.legend()\n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "# Level 4: Radial Velocity Measurements\n",
    "\n",
    "Level 4 data contains radial velocity (RV) measurements derived from the spectra. These can include:\n",
    "- Per-order RVs\n",
    "- Combined RV with uncertainty\n",
    "- Activity indicators\n",
    "\n",
    "## Creating L4 from MAROONX STANDARD L2 and MAROONX SERVAL pickle file\n",
    "\n",
    "L4 is created from RVData standard L2 file. The RV data product is separately available in pkl file and contains SERVAL derived RVs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create L4 from native MAROONX L2 file (which contains RV measurements)\n",
    "mx_l4 = MAROONXRV4()\n",
    "mx_l4.createL4(fits.open(l2_standard_blue), rv_blue, channel=\"BLUE\")\n",
    "\n",
    "l4_standard = f'maroonxblue_SL4_{timestamp}.fits'\n",
    "\n",
    "# Save to FITS file\n",
    "mx_l4.to_fits(l4_standard)\n",
    "\n",
    "print(f\"Created {l4_standard}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using L4 Data\n",
    "\n",
    "### Reading the L4 File"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Open the L4 file\n",
    "l4 = fits.open(l4_standard)\n",
    "\n",
    "# Examine primary header for RV info\n",
    "hdr4 = l4[0].header\n",
    "print(f\"Object: {hdr4['OBJECT']}\")\n",
    "print(f\"Observation time (BJD): {hdr4.get('BJD_TDB', 'N/A')}\")\n",
    "\n",
    "# List extensions\n",
    "print(\"\\nL4 Extensions:\")\n",
    "for hdu in l4:\n",
    "    print(f\"  {hdu.name}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Examining the RV1 Extension\n",
    "\n",
    "The `RV1` extension contains per-order radial velocity measurements with standardized column names."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Examine the RV1 extension\n",
    "rv1 = pd.DataFrame(l4['RV1'].data)\n",
    "print(\"RV1 columns:\")\n",
    "print(rv1.columns.tolist())\n",
    "print(f\"\\nNumber of orders: {len(rv1)}\")\n",
    "print(\"\\nFirst 5 rows:\")\n",
    "print(rv1.head())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Per-Order RVs\n",
    "\n",
    "The `RV1` extension contains one row per echelle order. The `RV` column holds the combined RV measurement for each order, and `ORDER_INDEX` identifies which order each row corresponds to. Additional columns include `BERV` (barycentric earth radial velocity), `WAVE_START`/`WAVE_END` (wavelength range), and `WEIGHT` (RV weight)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot per-order RVs from the RV1 extension\n",
    "rv1 = pd.DataFrame(np.array(l4['RV1'].data, dtype=l4['RV1'].data.dtype.newbyteorder('=')))\n",
    "fig, ax = plt.subplots(figsize=(10, 5))\n",
    "valid = np.isfinite(rv1['RV'])\n",
    "ax.errorbar(rv1.loc[valid, 'ORDER_INDEX'], rv1.loc[valid, 'RV'], yerr=rv1.loc[valid, 'RV_ERR'],\n",
    "            fmt='.', mfc='blue', ms=8, mec='gray',ecolor='gray', alpha=0.5)\n",
    "median_rv = np.nanmedian(rv1['RV'])\n",
    "ax.axhline(median_rv, color='red', ls='--', label=f'Median: {median_rv:.2f} m/s')\n",
    "ax.set_xlabel('Order Index')\n",
    "ax.set_ylabel('RV (m/s)')\n",
    "ax.set_title('Per-Order Radial Velocities (RV1)')\n",
    "ax.legend()\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "# Summary\n",
    "\n",
    "This tutorial demonstrated how to:\n",
    "\n",
    "1. **Create L2** from MAROONX HDF5 and flat files using `MAROONXRV2.createL2()`\n",
    "2. **Use L2** data: access headers, examine extensions, plot spectra\n",
    "3. **Create L3** from standard L2 using `RV3.from_fits()`\n",
    "4. **Use L3** data: access stitched spectra, examine spectral features\n",
    "5. **Create L4** from standard MAROONX L2 and SERVAL pickle file using `MAROONXRV4.createL4()`\n",
    "6. **Use L4** data: access RV measurements and per-order RVs\n",
    "\n",
    "The standardized data format allows consistent access patterns across all EPRV instruments!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Clean up - close FITS files\n",
    "l2.close()\n",
    "l3.close()\n",
    "l4.close()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "eprv_form",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}