NEID Data Tutorial

Written by Leonardo Paredes and Danny Krolikowski

This tutorial demonstrates how to create and use RVData standard files for the NEID instrument at all data levels:

  • Level 2 (L2): Extracted, wavelength-calibrated echelle spectra

  • Level 3 (L3): Stitched 1D spectrum on a common wavelength grid

  • Level 4 (L4): Radial velocities and other derived measurements

Prerequisites

Install the rvdata package:

pip install rv-data-standard

Setup and Data Download

First, we’ll import the necessary modules and download sample NEID data files.

[1]:
import os
import requests
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits
from astropy.table import Table

# RVData imports
from rvdata.core.models.level2 import RV2
from rvdata.core.models.level3 import RV3
from rvdata.core.models.level4 import RV4
[2]:
def download_file(url, filename):
    """Download a file if it doesn't already exist."""
    if not os.path.exists(filename):
        print(f"Downloading {filename}...")
        response = requests.get(url)
        response.raise_for_status()
        with open(filename, "wb") as f:
            f.write(response.content)
        print(f"Downloaded {filename}")
    else:
        print(f"{filename} already exists, skipping download.")

# NEID sample data URLs (hosted on project server)
# For NEID: standard L2, L3, and L4 are all created just from a native format NEID L2 file
file_urls = {
    "native_l2": "http://grinnell.as.arizona.edu/~rvdata/neid/neidL2_20231010T020006.fits",
}

# Download the files
native_l2_file = "neidL2_20231010T020006.fits"

download_file(file_urls["native_l2"], native_l2_file)
Downloading neidL2_20231010T020006.fits...
Downloaded neidL2_20231010T020006.fits

Level 2: Extracted Echelle Spectra

Level 2 data contains wavelength-calibrated, extracted echelle spectra organized by trace, which in NEID’s case are different fibers.

For NEID’s HR mode, the traces are:

  • Trace 1: Science Fiber

  • Trace 2: Sky Fiber

  • Trace 3: Calibration Fiber

For NEID’s HE mode, there is no Trace 3: Calibration Fiber

Each trace has a corresponding flux, wavelength, variance, and blaze function arrays.

Creating L2 from Native NEID Files

For a RVData-standard L2 file only the native L2 NEID data file is needed neidL2_YYYYMMDDTHHMMSS.fits

[3]:
# Create RVData-standard L2 from native NEID files
neid_l2 = RV2.from_fits(native_l2_file, instrument="NEID")

# Save to FITS file
# to_fits will automatically write the standardized file name format
l2_standard_file = neid_l2.to_fits()
print(f"Created {l2_standard_file}")
/Users/krolikowski/codes/RVData/rvdata/instruments/neid/level2.py:172: RuntimeWarning: All-NaN slice encountered
  "WAVE_START": np.nanmin(self.data["TRACE1_WAVE"], axis=1),
/Users/krolikowski/codes/RVData/rvdata/instruments/neid/level2.py:173: RuntimeWarning: All-NaN slice encountered
  "WAVE_END": np.nanmax(self.data["TRACE1_WAVE"], axis=1),
Created neid_SL2_20231010T020006.fits

Note that the standard file name format is similar: neid_SL2_YYYYMMDDTHHMMSS.fits

Using L2 Data

Reading the L2 File

You can read L2 files using either astropy’s fits.open() or the RVData RV2.from_fits() method.

Let’s read back in the standardized L2 file and look at some primary header entries describing the observation

[4]:
# Open using astropy
sl2 = fits.open(l2_standard_file)

# Examine the primary header, which has the same keywords regardless of instrument!
hdr = sl2[0].header
print(f"Telescope: {hdr['TELESCOP']}")
print(f"Instrument: {hdr['INSTRUME']}")
print(f"Object: {hdr['OBJECT']}")
print(f"Number of traces: {hdr['NUMTRACE']}")
print("\nTrace contents:")
for i in range(1, hdr['NUMTRACE'] + 1):
    print(f"  TRACE{i}: {hdr[f'TRACE{i}']}")
print(f"\nScience Spectrum S/N at {hdr['EXSNRW1']} Angstrom: {hdr['EXSNR1']:.2f}")

Telescope: WIYN 3.5m
Instrument: NEID
Object: HD 185144
Number of traces: 3

Trace contents:
  TRACE1: Gaia DR2 2261614264930275072
  TRACE2: Sky
  TRACE3: Etalon

Science Spectrum S/N at 5500.0 Angstrom: 363.02

Examining L2 Extensions

The EXT_DESCRIPT extension lists all FITS extensions in the file.

[5]:
# Print out the extension description, which we convert from FITS data extension to an Astropy table for convenenience
ext_descript = Table(sl2["EXT_DESCRIPT"].data)
ext_descript.pprint_all()
       Name                                              Description
----------------- -----------------------------------------------------------------------------------------
          PRIMARY                                                       EPRV Standard FITS HEADER (no data)
INSTRUMENT_HEADER                                                Inherited NEID instrument header (no data)
          RECEIPT                                 Table of operations that have been performed on this file
       DRP_CONFIG                              Pipeline details (settings etc) to go from native data to L2
     EXT_DESCRIPT                                               Table describing contents of each extension
      ORDER_TABLE Table capturing the wavelength extent of each echelle order in Trace 1: NEID HR SCI fiber
      TRACE1_FLUX                                                        Flux in Trace 1: NEID HR SCI fiber
      TRACE1_WAVE                                         Wavelength solution in Trace 1: NEID HR SCI fiber
       TRACE1_VAR                                               Flux variance in Trace 1: NEID HR SCI fiber
     TRACE1_BLAZE                                                       Blaze in Trace 1: NEID HR SCI fiber
     BARYCORR_KMS                                         Barycentric correction velocity per order in km/s
       BARYCORR_Z                                 Barycentric correction velocity per order in redshift (z)
          BJD_TDB                     Photon weighted midpoint per order as barycentric dynamical time (JD)
      TRACE2_FLUX                                                        Flux in Trace 2: NEID HR SKY fiber
      TRACE2_WAVE                                         Wavelength solution in Trace 2: NEID HR SKY fiber
       TRACE2_VAR                                               Flux variance in Trace 2: NEID HR SKY fiber
     TRACE2_BLAZE                                                       Blaze in Trace 2: NEID HR SKY fiber
      TRACE3_FLUX                                                        Flux in Trace 3: NEID HR CAL fiber
      TRACE3_WAVE                                         Wavelength solution in Trace 3: NEID HR CAL fiber
       TRACE3_VAR                                               Flux variance in Trace 3: NEID HR CAL fiber
     TRACE3_BLAZE                                                       Blaze in Trace 3: NEID HR CAL fiber
     TRACE1_DRIFT                  Instrument drift velocity relative to start of observing session in km/s
         EXPMETER                                   Chromatic exposure meter for Trace 1: NEID HR SCI fiber
  TRACE1_TELLURIC    Telluric model for Trace 1: NEID HR SCI fiber (line and continuum absorption combined)

Examining the Order Table

The ORDER_TABLE extension describes the wavelength coverage of each echelle order.

[6]:
# We will again convert the order table FITS data format to an Astropy table
order_table = Table(sl2["ORDER_TABLE"].data)

print(f"Number of orders: {len(order_table)}")
print(f"Total wavelength coverage: {np.nanmin(order_table['WAVE_START']):.1f} - {np.nanmax(order_table['WAVE_END']):.1f} Angstroms")
print("\nFirst 5 orders:")
print(order_table[:5])
print("\nLast 5 orders:")
print(order_table[-5:])
Number of orders: 122
Total wavelength coverage: 3570.9 - 11251.2 Angstroms

First 5 orders:
ECHELLE_ORDER ORDER_INDEX     WAVE_START         WAVE_END
------------- ----------- ----------------- -----------------
          173           0               nan               nan
          172           1               nan               nan
          171           2               nan               nan
          170           3  3570.93576756872 3640.056733452913
          169           4 3591.935420310261 3661.512783703846

Last 5 orders:
ECHELLE_ORDER ORDER_INDEX     WAVE_START          WAVE_END
------------- ----------- ------------------ ------------------
           56         117 10821.217523481686 11045.807518650605
           55         118 11038.908724195322 11251.197788845078
           54         119                nan                nan
           53         120                nan                nan
           52         121                nan                nan

The ORDER_TABLE extension includes both the zero-indexed order (as would be used for python indexing) and the corresponding physical echelle order.

Note that for NEID the first 3 and last 3 orders have empty wavelength solutions. This is because those orders are defined but not extracted.

You can also see that consecutive orders overlap. For example, Order 3 extends to 3640 A but Order 4 starts at 3591 A.

Plotting L2 Spectra

As a reminder, NEID has 3 traces in High Resolution mode (HR):

  • TRACE1: Science fiber

  • TRACE2: Sky fiber

  • TRACE3: Calibration fiber (typically the etalon)

Let’s plot part of the science trace spectrum!

Finding a wavelength in the order table

First, let’s create a function that can use the ORDER_TABLE extension to look up where in the spectrum a specific wavelength is.

This is important, because a specific wavelength can be in multiple orders due to overlap.

[7]:
def find_wavelength_in_orders(wavelength, order_wave_start, order_wave_end):
    """A simple utility function to find a wavelength given the start
       and end points of the spectral orders.

    Parameters
    ----------
    wavelength : float
        The wavelength to search for in the given spectral orders.
    order_wave_start : array-like
        The column of the standard order table with the starting wavelengths per order.
    order_wave_end : array-like
        The column of the standard order table with the ending wavelengths per order.

    Returns
    -------
    orders : array-like or None
        A list of the order(s) that contain the given wavelength
    """

    # Compare the desired wavelength to the order table to find which orders contain that wavelength
    orders = np.where(((wavelength - order_wave_start) > 0) & ((wavelength - order_wave_end) < 0))[0]

    # Check that the wavelength is in an order
    if len(orders) == 0:
        print(f"The wavelength {wavelength} does not fall in a spectral order.")
        return None

    return orders

In particular, let’s search for the order(s) that contain the H-alpha spectral line, a strong absorption line that is a key activity diagnostic

[8]:
# Set the rest wavelength (vacuum) of the H-alpha line in Angstrom
halpha_wavelength = 6564.6

# We can use the science target's catalogue systemic velocity from the primary header to doppler shift H-alpha's wavelength
halpha_wavelength_shifted = halpha_wavelength * (1 + float(sl2["PRIMARY"].header["CRV1"]) / 3e5)

print(f"Science target {sl2['PRIMARY'].header['OBJECT']} "
      + f"has systemic velocity {sl2['PRIMARY'].header['CRV1']} km/s "
      + f"which shifts H-alpha to {halpha_wavelength_shifted:.2f} Angstrom.")

# Let's get the orders that contain H-alpha
orders_halpha = find_wavelength_in_orders(halpha_wavelength_shifted,
                                          order_table["WAVE_START"],
                                          order_table["WAVE_END"])

print(f"\nZero-indexed spectral orders that contain H-alpha at "
      + f"{halpha_wavelength_shifted:.2f} Angstrom: {orders_halpha}")
Science target HD 185144 has systemic velocity 26.5764 km/s which shifts H-alpha to 6565.18 Angstrom.

Zero-indexed spectral orders that contain H-alpha at 6565.18 Angstrom: [79 80]

So we can see that NEID has two spectral orders that contain the H-alpha spectral line, zero-indexed orders 79 and 80.

Let’s plot both of them! We can see how they overlap, and hopefully see a strong absorption line at the expected wavelength.

[9]:
# Let's make a plot!
fig, ax = plt.subplots(1, 1, figsize=(12, 5), sharex=True)

# Set the science trace ID number, 1, and extract the relevant data arrays to variables
trace_num = 1
wave = sl2[f'TRACE{trace_num}_WAVE'].data
flux = sl2[f'TRACE{trace_num}_FLUX'].data
blaze = sl2[f'TRACE{trace_num}_BLAZE'].data

# Let's also scale the blaze to match the flux maxima per order for visualization
blaze_scaled = blaze * (np.nanmax(flux, axis=1) / np.nanmax(blaze, axis=1))[:,np.newaxis]

# Go through and plot both of the H-alpha orders:
for order in orders_halpha:
    ax.plot(wave[order], flux[order], color='tab:blue', linestyle='-', lw=1.0, label=f'Order {order} Flux')
    ax.plot(wave[order], blaze_scaled[order], color='tab:orange', linestyle='-', lw=2.0, label=f'Order {order} Scaled Blaze')

# Let's also plot a vertical line at the expected wavelength
ax.axvline(x=halpha_wavelength_shifted, c='#323232', lw=0.75, ls='--', label='Expected H-alpha Wavelength', zorder=1)

# Set up all the labels and such
ax.set_xlabel('Wavelength (\u212B)')
ax.set_ylabel(f'TRACE{trace_num} Science Fiber Flux (e$^-$)')
ax.set_title(f"NEID L2 Science Fiber Spectrum of {sl2['PRIMARY'].header['OBJECT']} in Orders {orders_halpha}", fontsize=14, fontweight='bold')

ax.legend(loc='upper right')

ax.set_ylim(0, np.nanmax(flux) * 1.1)

plt.tight_layout()
plt.show()
/var/folders/dn/k4d1yrcx5z34cybs1b5c_pph0000gn/T/ipykernel_36259/4156444267.py:11: RuntimeWarning: invalid value encountered in divide
  blaze_scaled = blaze * (np.nanmax(flux, axis=1) / np.nanmax(blaze, axis=1))[:,np.newaxis]
../_images/tutorials_NEID_Tutorial_18_1.png

Hey! That looks like a very nice spectrum with a deep absorption line at the expected H-alpha wavelength!

This example is simple, but hopefully shows the ease with which you can find and inspect spectral regions of interest using the standardized data products.

It looks like Order 80 has the H-alpha line closer to the middle of the order, where its signal-to-noise is better. Let’s plot just order 80 and zoom in on the H-alpha line

[10]:
# Let's make a plot!
fig, ax = plt.subplots(1, 1, figsize=(12, 5), sharex=True)

# We can set the order here
order = 80

# Set the science trace ID number, 1, and extract the relevant data arrays to variables
trace_num = 1
wave = sl2[f'TRACE{trace_num}_WAVE'].data[order]
flux = sl2[f'TRACE{trace_num}_FLUX'].data[order]
blaze = sl2[f'TRACE{trace_num}_BLAZE'].data[order]

# Let's also scale the blaze to match the flux maxima per order for visualization
blaze_scaled = blaze * (np.nanmax(flux) / np.nanmax(blaze))

# We can zoom in and only plot a spectral region close to H-alpha, let's say +/- 10 Angstrom
plot_indices = np.where((np.abs(wave - halpha_wavelength_shifted) < 10))[0]

# Let's also divide out the scaled blaze function for this order, so we have a relatively flat spectral continuum
ax.plot(wave[plot_indices], (flux / blaze_scaled)[plot_indices], color='tab:pink', linestyle='-', lw=1.0, label=f'De-blazed Flux')

# Let's also plot a vertical line at the expected wavelength
ax.axvline(x=halpha_wavelength_shifted, c='#323232', lw=0.75, ls='--', label='Expected H-alpha Wavelength', zorder=1)

# Set up all the labels and such
ax.set_xlabel('Wavelength (\u212B)')
ax.set_ylabel(f'Normalized TRACE{trace_num} Science Fiber Flux')
ax.set_title(f"NEID L2 Science Fiber Spectrum of {sl2['PRIMARY'].header['OBJECT']} in Order {order}", fontsize=14, fontweight='bold')

ax.legend()

plt.tight_layout()
plt.show()
../_images/tutorials_NEID_Tutorial_20_0.png

Here we can see in detail the H-alpha line, exactly where expected as shown by the vertical dashed line.

The line is expectedly very deep and very broad. There are also a slew of other lines in the region.

However, not all of those lines are from the star itself! Some of those are telluric absorption lines. They are the signature of Earth’s atmosphere absorbing some of the incoming star light. It sure would be nice to remove those…

NEID’s standard L2 telluric absorption model and correction

Luckily, the NEID pipeline provides a model of telluric absorption across the spectrum. This is provided in the TRACE1_TELLURIC standard data format extension. It has the same data shape as the TRACE flux arrays.

Let’s take a look at the telluric absorption model in this H-alpha region!

[11]:
# Let's make a plot!
fig, ax = plt.subplots(1, 1, figsize=(12, 5), sharex=True)

# We can set the order here
order = 80

# Set the science trace ID number, 1, and extract the relevant data arrays to variables
trace_num = 1
wave = sl2[f'TRACE{trace_num}_WAVE'].data[order]
flux = sl2[f'TRACE{trace_num}_FLUX'].data[order]
blaze = sl2[f'TRACE{trace_num}_BLAZE'].data[order]
blaze_scaled = blaze * (np.nanmax(flux) / np.nanmax(blaze))

# Let's also set this order's telluric model to a variable
telluric_model = sl2[f'TRACE{trace_num}_TELLURIC'].data[order]

# We can zoom in and only plot a spectral region close to H-alpha, let's say +/- 10 Angstrom
plot_indices = np.where((np.abs(wave - halpha_wavelength_shifted) < 10))[0]

# Let's plot the spectrum with the normalized blaze removed and the telluric model on top
ax.plot(wave[plot_indices], (flux / blaze_scaled)[plot_indices], color='tab:pink', linestyle='-', lw=1.0, label=f'De-blazed Flux')
ax.plot(wave[plot_indices], telluric_model[plot_indices], color='tab:brown', linestyle='-', lw=1.0, label=f'Telluric Absorption Model')

# Set up all the labels and such
ax.set_xlabel('Wavelength (\u212B)')
ax.set_ylabel(f'Normalized TRACE{trace_num} Science Fiber Flux')
ax.set_title(f"NEID L2 Science Fiber Spectrum of {sl2['PRIMARY'].header['OBJECT']} in Order {order}", fontsize=14, fontweight='bold')

ax.legend()

plt.tight_layout()
plt.show()
../_images/tutorials_NEID_Tutorial_22_0.png

Here, the telluric absorption model is plotted in brown. You can see that plenty of the lines in this spectral region are actually telluric contamination, including multiple that directly overlap the H-alpha line!

Some important notes to keep in mind about the NEID telluric absorption model:

  • It is generated from line by line radiative transfer modeling for an observation’s airmass and best fit water vapor column. This model is then convolved with a model of NEID’s instrumental line profile. It is not a completely accurate model (e.g., it does not capture asymmetries in the instrumental line profile), but it performs fairly well!

  • It combines both line absorption and continuum absorption. That is why the overall level of the telluric model is below 1.

  • The NEID pipeline only provides a telluric model for zero-indexed orders 55-110 because the instrumental line profile has not been characterized outside of that range yet. The value of the provided telluric model in those orders is just set to 1.

Let’s apply the correction by dividing the observed flux by the model to see how it does!

[12]:
# Let's make a plot!
fig, ax = plt.subplots(1, 1, figsize=(12, 5), sharex=True)

# We can set the order here
order = 80

# Set the science trace ID number, 1, and extract the relevant data arrays to variables
trace_num = 1
wave = sl2[f'TRACE{trace_num}_WAVE'].data[order]
flux = sl2[f'TRACE{trace_num}_FLUX'].data[order]
blaze = sl2[f'TRACE{trace_num}_BLAZE'].data[order]
blaze_scaled = blaze * (np.nanmax(flux) / np.nanmax(blaze))

# Let's also set this order's telluric model to a variable, and normalize it so it only contains the line absorption
telluric_model = sl2[f'TRACE{trace_num}_TELLURIC'].data[order]
telluric_model_scaled = telluric_model / np.nanmax(telluric_model)

# We can zoom in and only plot a spectral region close around H-alpha, let's say +/- 15 Angstrom
plot_indices = np.where((np.abs(wave - halpha_wavelength_shifted) < 10))[0]

# We can plot the deblazed spectrum and the deblazed, telluric-corrected spectrum
ax.plot(wave[plot_indices], (flux / blaze_scaled)[plot_indices], color='tab:pink', linestyle='-', lw=1.0, label=f'De-blazed Flux')
ax.plot(wave[plot_indices], (flux / blaze_scaled / telluric_model_scaled)[plot_indices], color='tab:red', linestyle='-', lw=1.0, label=f'Telluric-corrected De-blazed Flux')

# Set up all the labels and such
ax.set_xlabel('Wavelength (\u212B)')
ax.set_ylabel(f'Normalized TRACE{trace_num} Science Fiber Flux')
ax.set_title(f"NEID L2 Science Fiber Spectrum of {sl2['PRIMARY'].header['OBJECT']} in Order {order}", fontsize=14, fontweight='bold')

ax.legend()

plt.tight_layout()
plt.show()
../_images/tutorials_NEID_Tutorial_24_0.png

Here, the telluric-corrected spectrum is plotted in red and the un-corrected spectrum is in pink. We’ve normalized the telluric model before removing it so that only the line absorption is removed and the two spectra are on top of each other.

The correction performs very well! The telluric lines that overlap the broad H-alpha line are cleanly removed.

In general, most of the lines are removed without much residual noise. Some lines, like the one just shy of 6574 Angstrom, have a correction residual. This is likely due to imperfections in the radiative transfer model.

Regardless, this is a handy and standardized way to remove telluric contamination from your NEID spectra!


Level 3: Stitched 1D Spectrum

Level 3 data contains a stitched 1D spectrum on a common wavelength grid with constant velocity spacing. The stitching process:

  1. Divides out the blaze function

  2. Resamples each order onto a common wavelength grid

  3. Combines overlapping regions using inverse-variance weighting

Creating L3 from L2

L3 is created from an RVData-standard L2 file.

[13]:
# Create L3 from the standard L2 file
neid_l2 = RV2.from_fits(l2_standard_file)
neid_l3 = RV3()
neid_l3.convert_level2_to_level3(neid_l2)

# Save to FITS file
l3_standard_file = neid_l3.to_fits()
print(f"Created {l3_standard_file}")
Created neid_SL3_20231010T020006.fits

Using L3 Data

Reading the L3 File

[14]:
# Open the standardized L3 file
sl3 = fits.open(l3_standard_file)

# List extensions using the extension description table
Table(sl3['EXT_DESCRIPT'].data).pprint_all()
         Name                                         Description
---------------------- --------------------------------------------------------------------------
               PRIMARY                                        EPRV Standard FITS HEADER (no data)
     INSTRUMENT_HEADER                                      Inherited instrument header (no data)
               RECEIPT                  Table of operations that have been performed on this file
            DRP_CONFIG               Pipeline details (settings etc) to go from native data to L2
          EXT_DESCRIPT                                Table describing contents of each extension
           ORDER_TABLE             Table capturing the wavelength extent of each order in Trace 1
STITCHED_CORR_SCI_FLUX Order stitched and blaze-corrected flux co-added across all science traces
STITCHED_CORR_SCI_WAVE        Order stitched barycentric- and drift-corrected wavelength solution
 STITCHED_CORR_SCI_VAR                         Order stitched variance for STITCHED_CORR_SCI_FLUX

Understanding L3 Extensions

For NEID with one science fiber, the stitched spectrum is stored in STITCHED_CORR_SCI_* extensions:

  • STITCHED_CORR_SCI_WAVE/FLUX/VAR: Science fiber

[15]:
# Check which STITCHED extensions are present
stitched_exts = [hdu.name for hdu in sl3 if 'STITCHED' in hdu.name]
print("Stitched spectrum extensions:")
for ext in stitched_exts:
    print(f"  {ext}")
Stitched spectrum extensions:
  STITCHED_CORR_SCI_FLUX
  STITCHED_CORR_SCI_WAVE
  STITCHED_CORR_SCI_VAR

Plotting the Stitched Spectrum

[16]:
# Plot the stitched spectrum for one trace
wave_ext = "STITCHED_CORR_SCI_WAVE"
flux_ext = "STITCHED_CORR_SCI_FLUX"
wave_l3 = sl3[wave_ext].data
flux_l3 = sl3[flux_ext].data

fig, ax = plt.subplots(figsize=(14, 5))
ax.plot(wave_l3, flux_l3, color='tab:blue', linestyle='-', lw=0.3, alpha=0.8)
ax.set_xlabel('Wavelength (\u212B)', fontsize=12)
ax.set_ylabel('Flux (e$^{-}$)', fontsize=12)
ax.set_title(f'NEID L3 Stitched Spectrum - Science Trace', fontsize=14, fontweight='bold')

# Zoom inset
ax.set_xlim(wave_l3[np.isfinite(wave_l3)].min(), wave_l3[np.isfinite(wave_l3)].max())
plt.tight_layout()
plt.show()

print(f"\nWavelength range: {np.nanmin(wave_l3):.1f} - {np.nanmax(wave_l3):.1f} Angstroms")
print(f"Number of pixels: {len(wave_l3)}")

../_images/tutorials_NEID_Tutorial_33_0.png

Wavelength range: 3670.0 - 10200.0 Angstroms
Number of pixels: 493473

Zoomed View of Spectral Features

Let’s again zoom in on the H-alpha spectral line. Here we plot the H-alpha line wavelength (vertical dashed line) without correcting for the object’s systemic radial velocity

[17]:
# Zoom in on H-alpha region
if wave_ext in [hdu.name for hdu in sl3]:
    fig, ax = plt.subplots(figsize=(12, 4))

    # H-alpha region
    mask = (wave_l3 > 6550) & (wave_l3 < 6580)
    ax.plot(wave_l3[mask], flux_l3[mask], color='tab:blue', linestyle='-', lw=0.7)
    ax.axvline(6564.6, color='red', ls='--', alpha=0.7, label='H\u03B1 (6564.6 \u212B)')
    ax.set_xlabel('Wavelength (\u212B)')
    ax.set_ylabel('Flux (e$^{-}$)')
    ax.set_title('H\u03B1 Region', fontsize=12)
    ax.legend()
    plt.tight_layout()
    plt.show()
../_images/tutorials_NEID_Tutorial_35_0.png

Level 4: Radial Velocity Measurements and Other Derived Products

Level 4 data contains radial velocity (RV) and ancillary measurements derived from the spectra. These include:

  • Combined RV with uncertainty

  • Per-order RVs

  • Per-order cross correlation functions from which RVs are measured

  • Activity indicators

Creating L4 from Native NEID L2

L4 is created from native pipeline outputs that contain RV measurements. For NEID, the native L2 file includes CCF-derived RVs.

[18]:
# Create L4 from native NEID L2 file (which contains RV measurements)
neid_l4 = RV4.from_fits(native_l2_file, instrument="NEID")

# Save to FITS file
l4_standard_file = neid_l4.to_fits()
print(f"Created {l4_standard_file}")
Created neid_SL4_20231010T020006.fits

Using L4 Data

Reading the L4 File

The standard L4 file has additional information added to the primary header regarding the radial velocity measurement.

[19]:
# Let's read back in the standard L4 file
sl4 = fits.open(l4_standard_file)

# Examine primary header for RV info
hdr4 = sl4[0].header
print(f"Object: {hdr4['OBJECT']}")
print(f"Observation time (BJD): {hdr4['BJDTDB']}")
print(f"Radial Velocity: {hdr4['RV']*1000:.5f} m/s")
print(f"Radial Velocity Error (m/s): {hdr4['RVERR']*1000:.5f} m/s")
print(f"Radial Velocity Measurement Method: {hdr4['RVMETHOD']}")
Object: HD 185144
Observation time (BJD): 2460227.586673037
Radial Velocity: 26731.33691 m/s
Radial Velocity Error (m/s): 0.25687 m/s
Radial Velocity Measurement Method: CCF

The L4 primary header includes an RV value and uncertainty for the observation, along with the method used for RV measurement. This RV header value is the one recommended for use by the instrument team for a “roll-up” RV of a given observation. The header also includes the corresponding time of the RV.

In the case of NEID, the RVs are measured with CCFs. The primary header RV value are the native CCFRVMOD values, which are made by coadding individual order CCFs that are weighted based on the target SED and instrument throughput.

L4 also includes an extension description table, like L2

[20]:
# Write out the extension description table
Table(sl4['EXT_DESCRIPT'].data).pprint_all()
       Name                               Description
----------------- ------------------------------------------------------------
          PRIMARY                          EPRV Standard FITS HEADER (no data)
INSTRUMENT_HEADER                   Inherited NEID instrument header (no data)
          RECEIPT    Table of operations that have been performed on this file
       DRP_CONFIG Pipeline details (settings etc) to go from native data to L2
     EXT_DESCRIPT                  Table describing contents of each extension
              RV1 Order-wise RV measurement table for NEID Science fiber trace
             CCF1                 Order-wise CCFs for NEID Science fiber trace
     DIAGNOSTICS1   Table of activity diagnostics for NEID science fiber trace

The other extensions include more granular RV information, such as the order-wise RVs and CCFs, as well as ancillary measurements like activity indicators.

Examining the Order-wise RVs in the RV1 Extension

The RV1 extension contains a table of the NEID order-wise RVs. The 1 refers to the spectral trace from which the RVs were calculated, which for NEID is the Trace 1 science fiber.

[21]:
# Let's set the order-wise RV table as an astropy Table object
order_rv_table = Table(sl4['RV1'].data)

# Here let's print just a part of the table (rows 9 through 14)
order_rv_table[9:15].pprint_all()
     BJD_TDB              RV        RV_ERR        BERV           WAVE_START          WAVE_END      PIXEL_START PIXEL_END ORDER_INDEX ECHELLE_ORDER       WEIGHT
----------------- ----------------- ------ ----------------- ------------------ ------------------ ----------- --------- ----------- ------------- ------------------
2460227.586733954               nan    nan 1.056088165714715                nan                nan         nan       nan           9           164                0.0
2460227.586733954               nan    nan 1.056088165714715                nan                nan         nan       nan          10           163  1.198329993108328
2460227.586733954 26.71503980582002    nan 1.056088165714715 3775.8065117012347 3800.0659493553376      3007.0    6082.0          11           162 0.9273202655362364
2460227.586733954 26.79718058753021    nan 1.056088165714715  3799.923601807259 3823.0455982184685      3083.0    5995.0          12           161 0.9710176047748902
2460227.586733954 26.71517774062335    nan 1.056088165714715  3822.534314642403    3846.7250185901      2953.0    5965.0          13           160  2.004500568990354
2460227.586733954 26.78489620068482    nan 1.056088165714715 3846.2304999240614 3870.8149092792005      2914.0    5951.0          14           159  1.253390746999374

Here we print just a part of the RV1 table, showing zero-indexed orders 9 through 14. Each row is a NEID spectral order and has columns including that order’s flux-weighted BJD, CCF-measured RV, wavelength extent that contributes to the RV measurement, and a weight that is applied to that order’s CCF when creating the roll-up RV in the primary header

You can see that orders 9 and 10 have no RVs: these are orders that do not have RVs measured. Various orders across NEID’s bandpass that have extracted spectra do not have RVs. For example, many of the reddest orders do not have RVs because they are excluded in the CCF mask from having too much telluric contamination.

These order-wise RV values are stored in header entries in the native NEID file, so this table makes it way easier to access the order-wise measurements!

Note that we do not calculate per-order RV errors.

We can also plot the order-wise RVs

[22]:
fig, ax = plt.subplots(1, 1, figsize=(10,4.5), sharex=True)

ax.plot(order_rv_table["ORDER_INDEX"], order_rv_table["RV"], "o", color="tab:cyan", ms=6, mec='#323232', mew=1.5)

# Labels
ax.set_xlabel("Spectral Order (zero-indexed)")
ax.set_ylabel("Order-wise Radial Velocity (km/s)")
ax.set_title(f"CCF-based Order RVs of {sl4['PRIMARY'].header['OBJECT']}", fontsize=14, fontweight='bold')

plt.tight_layout()
plt.show()
../_images/tutorials_NEID_Tutorial_45_0.png

Looking at order-wise RVs is a useful way to diagnose any potential issues with your roll-up RV value.

This particular observation is well-behaved, but you can look for orders that have particularly bad outlier RVs or trends as a function of spectral order/wavelength.

For example with NEID, you can look for differences in the RVs derived from orders that use the laser frequency comb for wavelength calibration and those that use the Thorium-Argon lamp. The LFC-using orders are 58-110 (zero-indexed) for observations before September 2024 and orders 40-110 for observations after September 2024.

Individual order CCFs

The standard L4 file also includes the per-order computed CCFs in the CCF1 extension.

The CCF1 header includes information about the CCFs:

[23]:
sl4['CCF1'].header
[23]:
XTENSION= 'IMAGE   '           / Image extension
BITPIX  =                  -64 / array data type
NAXIS   =                    2 / number of array dimensions
NAXIS1  =                  804
NAXIS2  =                  122
PCOUNT  =                    0 / number of parameters
GCOUNT  =                    1 / number of groups
VELSTART=             -73.4236
VELSTEP =                 0.25
CCFMASK = 'G8_espresso.txt'
VELNSTEP=                  804
EXTNAME = 'CCF1    '           / extension name

The file name for the spectral line mask used to calculate the CCF is given by the CCFMASK entry.

The data for this extension is just the CCF values. You can reconstruct the velocity array that the CCF was computed over using the header entries VELSTART, VELSTEP, and VELNSTEP. The CCF data array has shape (number of orders, VELNSTEP).

Here let’s reconstruct the CCF velocity array and then plot the CCF for one order:

[24]:
# We'll set the velocity starting value to its own variable
ccf_vel_start = sl4["CCF1"].header["VELSTART"]

# And compute the ending velocity using the start, step, and number of steps header entries
ccf_vel_end = sl4["CCF1"].header["VELSTART"] + sl4["CCF1"].header["VELSTEP"] * sl4["CCF1"].header["VELNSTEP"]

# And create the velocity array from these values
ccf_vel_arr = np.arange(ccf_vel_start, ccf_vel_end, sl4["CCF1"].header["VELSTEP"])
[25]:
# Now let's plot the CCF for an order
order = 50

fig, ax = plt.subplots(1, 1, figsize=(10,4.5), sharex=True)

ax.plot(ccf_vel_arr, sl4["CCF1"].data[order], "-", color="tab:green", lw=2.0)

# And a vertical line at that order's RV value
ax.axvline(x=order_rv_table["RV"][order], c='tab:gray', lw=1.5, ls='-', label=f'Order RV: {order_rv_table["RV"][order]:.4f} km/s')

ax.set_xlabel('Velocity (km/s)')
ax.set_ylabel('CCF Flux')
ax.set_title(f"CCF for Order {50} ({order_rv_table["WAVE_START"][order]:.1f} to {order_rv_table["WAVE_END"][order]:.1f} \u212B) of {sl4['PRIMARY'].header['OBJECT']}", fontsize=16)

ax.legend()

plt.tight_layout()
plt.show()
../_images/tutorials_NEID_Tutorial_50_0.png

The minimum of the CCF and the order RV value from the RV1 table directly coincide!

This is another avenue for inspecting the quality of your RVs. You could identify orders that might have outlier RVs in the RV1 table and then inspect their CCFs directly.

Activity diagnostic measurements

The standard L4 file also contains stellar activity diagnostic measurements that are output by the NEID data pipeline. These are stored in the DIAGNOSTICS1 extension. Let’s take a look!

[26]:
Table(sl4['DIAGNOSTICS1'].data).pprint_all()
   metric_name           value              uncertainty
----------------- -------------------- ----------------------
           CaIIHK  0.17541366583790996  0.0002495382532907082
  CaIIHK_tellcorr  0.17541366583790996  0.0002495382532907082
              HeI   1.0221921751942795  0.0004536309111715093
     HeI_tellcorr    1.032817634465483  0.0004584953742246589
              NaI  0.24106207913372552 0.00013738443185951513
     NaI_tellcorr  0.24685779738344446 0.00014074525848053042
         Halpha06  0.23151522477468936 0.00017433092730222092
Halpha06_tellcorr  0.23355674137989674  0.0001759410175867868
         Halpha16   0.4206121690911927 0.00014536501949436982
Halpha16_tellcorr   0.4259436216245271  0.0001473136305586283
              CaI   0.8731594140309632 0.00044222389677976394
     CaI_tellcorr   0.8682649369327627   0.000439739147699434
           CaIRT1   0.5031829193001701 0.00027932577851676627
  CaIRT1_tellcorr  0.49972510948400634  0.0002774161203393202
           CaIRT2  0.37333398980601035  0.0002603627216708849
  CaIRT2_tellcorr     0.37086322923294   0.000258646491023813
           CaIRT3   0.3679023487206318  0.0002591296853264343
  CaIRT3_tellcorr   0.3678302063706018  0.0002590787427299671
           NaINIR   0.6059970853306027 0.00029244657936825137
  NaINIR_tellcorr   0.5599961139096107  0.0002742005444497258
          PaDelta   0.9647779190373837   0.001844107344272932
 PaDelta_tellcorr   0.9647779190373837   0.001844107344272932
            Mn539    0.658072980500413  0.0005278987370161466
   Mn539_tellcorr   0.6583556304914492  0.0005281253437974287
         CCF_FWHM    6.407017599256683                    nan
          CCF_BIS -0.05113938026855891   0.000444689941390326

The values included are a mix of spectral line indices and CCF-derived activity diagnostics.

The last two rows are the CCF-based measurements: the FWHM of the best-fit Gaussian used to measure the RV and the bisector inverse slope. Note that the FWHM only has a corresponding uncertainty for data that are reduced with the NEID DRP v1.5 and after.

The rest of the rows are spectral line indices. Each one has two values: with and without the NEID telluric correction applied. The NEID team recommends using the version with the telluric correction applied (having suffix _tellcorr). This will reduce contamination when doing time series analysis of activity indicators (such as yearly aliasing). Some of the indices, like Ca II HK, have the same value for the corrected and un-corrected versions. This is because they are in a part of the spectrum that does not have a telluric model defined (outside of orders 55-110).

The units for the FWHM and BIS are km/s, and the spectral indices are unitless (since they are relative flux values).

Detailed information about the stellar activity diagnostics measurement can be found in the documentation for the NEID data reduction pipeline here: https://neid.ipac.caltech.edu/docs/NEID-DRP/algorithms.html#stellar-activity-info


NEID Tutorial Summary

This tutorial demonstrated how to:

  1. Create L2 from native NEID L2 file using RV2.from_fits()

  2. Use L2 data: access headers, examine extensions, find wavelengths in the spectrum, plot spectra, and apply the telluric model

  3. Create L3 from standard L2 using neid_l2 = RV2.from_fits() and RV3().convert_level2_to_level3(neid_l2)

  4. Use L3 data: access stitched spectra, examine spectral features

  5. Create L4 from native NEID L2 using RV4.from_fits()

  6. Use L4 data: access RV measurements, look at per-order RVs and CCFs, and examine stellar activity diagnostic information

The standardized data format allows consistent access patterns across all EPRV instruments!

[27]:
# Clean up - close FITS files
sl2.close()
sl3.close()
sl4.close()