Models

SpectrumModel

The SpectrumModel is the main implementation of the Starfish methods for a single-order spectrum. It works by interfacing with both Starfish.emulator.Emulator, Starfish.spectrum.Spectrum, and the methods in Starfish.transforms. The spectral emulator provides an interface to spectral model libraries with a covariance matrix for each interpolated spectrum. The transforms provide the physics behind alterations to the light. For a given set of parameters, a transformed spectrum and covariance matrix are provided by

>>> from Starfish.models import SpectrumModel
>>> model = SpectrumModel(...)
>>> flux, cov = model()

It is also possible to optimize our parameters using the interfaces provided in SpectrumModel.get_param_vector(), SpectrumModel.set_param_vector(), and SpectrumModel.log_likelihood(). A very minimal example might be

>>> from Starfish.models import SpectrumModel
>>> from scipy.optimize import minimize
>>> model = SpectrumModel(...)
>>> def nll(P):
        model.set_param_vector(P)
        lnprob = model.log_likelihood()
        return -lnprob
>>> P0 = model.get_param_vector()
>>> soln = minimize(nll, P0, method='Nelder-Mead')

For a more thorough example, see the Examples.

Parametrization

This model uses a method of specifying parameters very similar to Dan Foreman-Mackey’s George library. There exists an underlying dictionary of the model parameters, which define what transformations will be made. For example, if vz exists in a model’s parameter dictionary, then doppler shifting will occur when calling the model.

It is possible to have a parameter that transforms the spectrum, but is not fittable. We call these frozen parameters. For instance, if my 3 model library parameters are \(T_{eff}\), \(\log g\), and \([Fe/H]\) (or T, logg, Z in the code), but I don’t want to fit $log g$, I can freeze it:

>>> from Starfish.models import SpectrumModel
>>> model = SpectrumModel(...)
>>> model.freeze('logg')

When using this framework, you can see what transformations will occur by looking at SpectrumModel.params and what values are fittable by SpectrumModel.get_param_dict() (or the other getters for the parameters).

>>> model.params
{'T': 6020, 'logg': 4.2, 'Z': 0.0, 'vsini': 84.0, 'log_scale': -10.23}
>>> model.get_param_dict()
{'T': 6020, 'Z': 0.0, 'vsini': 84.0, 'log_scale': -10.23}

To undo this, simply thaw the frozen parameters

>>> model.thaw('logg')
>>> model.params == model.get_param_dict()
True

The underlying parameter dictionary is a special flat dictionary. Consider the nested dictionary

>>> cov = {
...     'log_amp': 0,
...     'log_ls': 6,
... }
>>> model['global_cov'] = cov
>>> model.params
{
    'T': 6020,
    'logg': 4.2,
    'Z': 0.0,
    'vsini': 84.0,
    'log_scale': -10.23,
    'global_cov:log_amp': 0,
    'global_cov:log_ls': 6
}

These values can be referenced normally or using its flat key

>>> model['global_cov:log_amp'] == model['global_cov']['log_amp']
True

API/Reference

class Starfish.models.SpectrumModel(emulator: Union[str, Starfish.emulator.emulator.Emulator], data: Union[str, Starfish.spectrum.Spectrum], grid_params: Sequence[float], max_deque_len: int = 100, name: str = 'SpectrumModel', **params)[source]

A single-order spectrum model.

Parameters
  • emulator (Starfish.emulators.Emulator) – The emulator to use for this model.

  • data (Starfish.spectrum.Spectrum) – The data to use for this model

  • grid_params (array-like) – The parameters that are used with the associated emulator

  • max_deque_len (int, optional) – The maximum number of residuals to retain in a deque of residuals. Default is 100

  • name (str, optional) – A name for the model. Default is ‘SpectrumModel’

Keyword Arguments

params (dict) – Any remaining keyword arguments will be interpreted as parameters.

Here is a table describing the avialable parameters and their related functions

Parameter

Function

vsini

rotational_broaden()

vz

doppler_shift()

Av

extinct()

Rv

extinct()

log_scale

rescale()

Note

If log_scale is not specified, the model will use renorm() to automatically scale the spectrum to the data using the ratio of integrated fluxes.

The global_cov keyword arguments must be a dictionary definining the hyperparameters for the global covariance kernel, kernels.global_covariance_matrix()

Global Parameter

Description

log_amp

The natural logarithm of the amplitude of the Matern kernel

log_ls

The natural logarithm of the lengthscale of the Matern kernel

The local_cov keryword argument must be a list of dictionaries defining hyperparameters for many Gaussian kernels, , kernels.local_covariance_matrix()

Local Parameter

Description

log_amp

The natural logarithm of the amplitude of the kernel

mu

The location of the local kernel

log_sigma

The natural logarithm of the standard deviation of the kernel

params

The dictionary of parameters that are used for doing the modeling.

Type

dict

frozen

A list of strings corresponding to frozen parameters

Type

list

residuals

A deque containing residuals from calling SpectrumModel.log_likelihood()

Type

deque

__call__()[source]

Performs the transformations according to the parameters available in self.params

Returns

flux, cov – The transformed flux and covariance matrix from the model

Return type

tuple

freeze(names)[source]

Freeze the given parameter such that get_param_dict() and get_param_vector() no longer include this parameter, however it will still be used when calling the model.

Parameters

name (str or array-like) – The parameter to freeze. If 'all', will freeze all parameters. If 'global_cov' will freeze all global covariance parameters. If 'local_cov' will freeze all local covariance parameters.

Raises

ValueError – If the given parameter does not exist

See also

thaw()

get_param_dict(flat: bool = False) → dict[source]

Gets the dictionary of thawed parameters.

Parameters

flat (bool, optional) – If True, returns the parameters completely flat. For example, ['local']['0']['mu'] would have the key 'local:0:mu'. Default is False

Returns

Return type

dict

See also

set_param_dict()

get_param_vector()[source]

Get a numpy array of the thawed parameters

Returns

Return type

numpy.ndarray

property grid_params

The parameters used for the spectral emulator.

Setter

Sets the values in the order of Emulator.param_names

Type

numpy.ndarray

property labels

The thawed parameter names

Type

tuple of str

load(filename)[source]

Load a saved model state from a TOML file

Parameters

filename (str or path-like) – The saved state to load

log_likelihood(priors: Optional[dict] = None) → float[source]

Returns the log probability of a multivariate normal distribution

Parameters

priors (dict, optional) – If provided, will use these priors in the MLE. Should contain keys that match the model’s keys and values that have a logpdf method that takes one value (like scipy.stats distributions). Default is None.

Warning

No checks will be done on the priors for speed.

Returns

Return type

float

plot(axes=None, plot_kwargs=None, resid_kwargs=None)[source]

Plot the model.

This will create two subplots, one which shows the current model against the data, and another which shows the current residuals with 3:math:sigma contours from the diagonal of the covariance matrix. Note this requires matplotlib to be installed, which is not installed by default with Starfish.

Parameters
  • axes (iterable of matplotlib.Axes, optional) – If provided, will use the first two axes to plot, otherwise will create new axes, by default None

  • plot_kwargs (dict, optional) – If provided, will use these kwargs for the comparison plot, by default None

  • resid_kwargs (dict, optional) – If provided, will use these kwargs for the residuals plot, by default None

Returns

The returned axes, for the user to edit as they please

Return type

list of matplotlib.Axes

save(filename, metadata=None)[source]

Saves the model as a set of parameters into a TOML file

Parameters
  • filename (str or path-like) – The TOML filename to save to.

  • metadata (dict, optional) – If provided, will save the provided dictionary under a ‘metadata’ key. This will not be read in when loading models but provides a way of providing information in the actual TOML files. Default is None.

set_param_dict(params)[source]

Sets the parameters with a dictionary. Note that this should not be used to add new parameters

Parameters

params (dict) – The new parameters. If a key is present in self.frozen it will not be changed

See also

get_param_dict()

set_param_vector(params)[source]

Sets the parameters based on the current thawed state. The values will be inserted according to the order of SpectrumModel.labels.

Parameters

params (array_like) – The parameters to set in the model

Raises

ValueError – If the params do not match the length of the current thawed parameters.

thaw(names)[source]

Thaws the given parameter. Opposite of freezing

Parameters

name (str or array-like) – The parameter to thaw. If 'all', will thaw all parameters. If 'global_cov' will thaw all global covariance parameters. If 'local_cov' will thaw all local covariance parameters.

Raises

ValueError – If the given parameter does not exist.

See also

freeze()

train(priors: Optional[dict] = None, **kwargs)[source]

Given a SpectrumModel and a dictionary of priors, will perform maximum-likelihood estimation (MLE). This will use scipy.optimize.minimize to find the maximum a-posteriori (MAP) estimate of the current model state. Note that this alters the state of the model. This means that you can run this method multiple times until the optimization succeeds. By default, we use the “Nelder-Mead” method in minimize to avoid approximating any derivatives.

Parameters
  • priors (dict, optional) – Priors to pass to log_likelihood()

  • **kwargs (dict, optional) – These keyword arguments will be passed to scipy.optimize.minimize

Returns

soln – The output of the minimization.

Return type

scipy.optimize.minimize_result

Raises
  • ValueError – If the priors are poorly specified

  • RuntimeError – If any priors evaluate to non-finite values

See also

log_likelihood()

Utils

There are some utilities that help with interfacing with the various Models

Starfish.models.find_residual_peaks(model, num_residuals=100, threshold=4.0, buffer=2, wl_range=(0, inf))[source]

Find the peaks of the most recent residual and return their properties to aid in setting up local kernels

Parameters
  • model (Model) – The model to determine peaks from. Need only have a residuals array.

  • num_residuals (int, optional) – The number of residuals to average together for determining peaks. By default 100.

  • threshold (float, optional) – The sigma clipping threshold, by default 4.0

  • buffer (float, optional) – The minimum distance between peaks, in Angstrom, by default 2.0

  • wl_range (2-tuple) – The (min, max) wavelengths to consider. Default is (0, np.inf)

Returns

means – The means of the found peaks, with the same units as model.data.wave

Return type

list

Starfish.models.optimize_residual_peaks(model, mus, threshold=0.1, sigma0=50, num_residuals=100)[source]

Optimize the local covariance parameters based on fitting the residual input means as Gaussians around the residuals

Parameters
  • model (Model) – The model to determine peaks from. Need only have a residuals array.

  • mus (array-like) – The means to instantiate Gaussians at and optimize.

  • threshold (float, optional) – This is the threshold for restricting kernels; i.e. if a fit amplitude is less than threshold standard deviations then it will be thrown away. Default is 0.1

  • sigma0 (float, optional) – The initial standard deviation (in Angstrom) of each Gaussian. Default is 50 Angstrom.

  • num_residuals (int, optional) – The number of residuals to average together for determining peaks. By default 100.

Returns

A dictionary of optimized parameters ready to be plugged into model[“local_cov”]

Return type

dict

Warning

I have had inconsistent results with this optimization, be mindful of your outputs and consider hand-tuning after optimizing.

Starfish.models.covariance_debugger()[source]

Special debugging information for the covariance matrix decomposition.