Models¶

SpectrumModel¶

The SpectrumModel is the main implementation of the Starfish methods for a single-order spectrum. It works by interfacing with both Starfish.emulator.Emulator, Starfish.spectrum.Spectrum, and the methods in Starfish.transforms. The spectral emulator provides an interface to spectral model libraries with a covariance matrix for each interpolated spectrum. The transforms provide the physics behind alterations to the light. For a given set of parameters, a transformed spectrum and covariance matrix are provided by

>>> from Starfish.models import SpectrumModel
>>> model = SpectrumModel(...)
>>> flux, cov = model()

It is also possible to optimize our parameters using the interfaces provided in SpectrumModel.get_param_vector(), SpectrumModel.set_param_vector(), and SpectrumModel.log_likelihood(). A very minimal example might be

>>> from Starfish.models import SpectrumModel
>>> from scipy.optimize import minimize
>>> model = SpectrumModel(...)
>>> def nll(P):
        model.set_param_vector(P)
        lnprob = model.log_likelihood()
        return -lnprob
>>> P0 = model.get_param_vector()
>>> soln = minimize(nll, P0, method='Nelder-Mead')

For a more thorough example, see the Examples.

Parametrization¶

This model uses a method of specifying parameters very similar to Dan Foreman-Mackey’s George library. There exists an underlying dictionary of the model parameters, which define what transformations will be made. For example, if vz exists in a model’s parameter dictionary, then doppler shifting will occur when calling the model.

It is possible to have a parameter that transforms the spectrum, but is not fittable. We call these frozen parameters. For instance, if my 3 model library parameters are $T_{eff}$, $\log g$, and $[Fe/H]$ (or T, logg, Z in the code), but I don’t want to fit $log g$, I can freeze it:

>>> from Starfish.models import SpectrumModel
>>> model = SpectrumModel(...)
>>> model.freeze('logg')

When using this framework, you can see what transformations will occur by looking at SpectrumModel.params and what values are fittable by SpectrumModel.get_param_dict() (or the other getters for the parameters).

>>> model.params
{'T': 6020, 'logg': 4.2, 'Z': 0.0, 'vsini': 84.0, 'log_scale': -10.23}
>>> model.get_param_dict()
{'T': 6020, 'Z': 0.0, 'vsini': 84.0, 'log_scale': -10.23}

To undo this, simply thaw the frozen parameters

>>> model.thaw('logg')
>>> model.params == model.get_param_dict()
True

The underlying parameter dictionary is a special flat dictionary. Consider the nested dictionary

>>> cov = {
...     'log_amp': 0,
...     'log_ls': 6,
... }
>>> model['global_cov'] = cov
>>> model.params
{
    'T': 6020,
    'logg': 4.2,
    'Z': 0.0,
    'vsini': 84.0,
    'log_scale': -10.23,
    'global_cov:log_amp': 0,
    'global_cov:log_ls': 6
}

These values can be referenced normally or using its flat key

>>> model['global_cov:log_amp'] == model['global_cov']['log_amp']
True

API/Reference¶

class Starfish.models.SpectrumModel(emulator: Union[str, Starfish.emulator.emulator.Emulator], data: Union[str, Starfish.spectrum.Spectrum], grid_params: Sequence[float], max_deque_len: int = 100, name: str = 'SpectrumModel', **params)[source]¶

A single-order spectrum model.

Parameters

emulator (Starfish.emulators.Emulator) – The emulator to use for this model.
data (Starfish.spectrum.Spectrum) – The data to use for this model
grid_params (array-like) – The parameters that are used with the associated emulator
max_deque_len (int, optional) – The maximum number of residuals to retain in a deque of residuals. Default is 100
name (str, optional) – A name for the model. Default is ‘SpectrumModel’

Keyword Arguments

params (dict) – Any remaining keyword arguments will be interpreted as parameters.

Here is a table describing the avialable parameters and their related functions

Parameter	Function
vsini	`rotational_broaden()`
vz	`doppler_shift()`
Av	`extinct()`
Rv	`extinct()`
log_scale	`rescale()`

Note

If log_scale is not specified, the model will use renorm() to automatically scale the spectrum to the data using the ratio of integrated fluxes.

The global_cov keyword arguments must be a dictionary definining the hyperparameters for the global covariance kernel, kernels.global_covariance_matrix()

Global Parameter	Description
log_amp	The natural logarithm of the amplitude of the Matern kernel
log_ls	The natural logarithm of the lengthscale of the Matern kernel

The local_cov keryword argument must be a list of dictionaries defining hyperparameters for many Gaussian kernels, , kernels.local_covariance_matrix()

Local Parameter	Description
log_amp	The natural logarithm of the amplitude of the kernel
mu	The location of the local kernel
log_sigma	The natural logarithm of the standard deviation of the kernel

params¶

The dictionary of parameters that are used for doing the modeling.

Type: dict

frozen¶

A list of strings corresponding to frozen parameters

Type: list

residuals¶

A deque containing residuals from calling SpectrumModel.log_likelihood()

Type: deque

__call__()[source]¶

Performs the transformations according to the parameters available in self.params

Returns: flux, cov – The transformed flux and covariance matrix from the model
Return type: tuple

freeze(names)[source]¶

Freeze the given parameter such that get_param_dict() and get_param_vector() no longer include this parameter, however it will still be used when calling the model.

Parameters: name (str or array-like) – The parameter to freeze. If 'all', will freeze all parameters. If 'global_cov' will freeze all global covariance parameters. If 'local_cov' will freeze all local covariance parameters.
Raises: ValueError – If the given parameter does not exist

See also

thaw()

get_param_dict(flat: bool = False) → dict[source]¶

Gets the dictionary of thawed parameters.

Parameters: flat (bool, optional) – If True, returns the parameters completely flat. For example, ['local']['0']['mu'] would have the key 'local:0:mu'. Default is False
Returns
Return type: dict

See also

set_param_dict()

get_param_vector()[source]¶

Get a numpy array of the thawed parameters

Returns
Return type: numpy.ndarray

See also

set_param_vector()

property grid_params¶

The parameters used for the spectral emulator.

Setter: Sets the values in the order of Emulator.param_names
Type: numpy.ndarray

property labels¶

The thawed parameter names

Type: tuple of str

load(filename)[source]¶

Load a saved model state from a TOML file

Parameters: filename (str or path-like) – The saved state to load

log_likelihood(priors: Optional[dict] = None) → float[source]¶

Returns the log probability of a multivariate normal distribution

Parameters: priors (dict, optional) – If provided, will use these priors in the MLE. Should contain keys that match the model’s keys and values that have a logpdf method that takes one value (like scipy.stats distributions). Default is None.

Warning

No checks will be done on the priors for speed.

Returns
Return type: float

plot(axes=None, plot_kwargs=None, resid_kwargs=None)[source]¶

Plot the model.

This will create two subplots, one which shows the current model against the data, and another which shows the current residuals with 3:math:sigma contours from the diagonal of the covariance matrix. Note this requires matplotlib to be installed, which is not installed by default with Starfish.

Parameters

axes (iterable of matplotlib.Axes, optional) – If provided, will use the first two axes to plot, otherwise will create new axes, by default None
plot_kwargs (dict, optional) – If provided, will use these kwargs for the comparison plot, by default None
resid_kwargs (dict, optional) – If provided, will use these kwargs for the residuals plot, by default None

Returns

The returned axes, for the user to edit as they please

Return type

list of matplotlib.Axes

save(filename, metadata=None)[source]¶

Saves the model as a set of parameters into a TOML file

Parameters

filename (str or path-like) – The TOML filename to save to.
metadata (dict, optional) – If provided, will save the provided dictionary under a ‘metadata’ key. This will not be read in when loading models but provides a way of providing information in the actual TOML files. Default is None.

set_param_dict(params)[source]¶

Sets the parameters with a dictionary. Note that this should not be used to add new parameters

Parameters: params (dict) – The new parameters. If a key is present in self.frozen it will not be changed

See also

get_param_dict()

set_param_vector(params)[source]¶

Sets the parameters based on the current thawed state. The values will be inserted according to the order of SpectrumModel.labels.

Parameters: params (array_like) – The parameters to set in the model
Raises: ValueError – If the params do not match the length of the current thawed parameters.

See also

get_param_vector()

thaw(names)[source]¶

Thaws the given parameter. Opposite of freezing

Parameters: name (str or array-like) – The parameter to thaw. If 'all', will thaw all parameters. If 'global_cov' will thaw all global covariance parameters. If 'local_cov' will thaw all local covariance parameters.
Raises: ValueError – If the given parameter does not exist.

See also

freeze()

train(priors: Optional[dict] = None, **kwargs)[source]¶

Given a SpectrumModel and a dictionary of priors, will perform maximum-likelihood estimation (MLE). This will use scipy.optimize.minimize to find the maximum a-posteriori (MAP) estimate of the current model state. Note that this alters the state of the model. This means that you can run this method multiple times until the optimization succeeds. By default, we use the “Nelder-Mead” method in minimize to avoid approximating any derivatives.

Parameters

priors (dict, optional) – Priors to pass to log_likelihood()
**kwargs (dict, optional) – These keyword arguments will be passed to scipy.optimize.minimize

Returns

soln – The output of the minimization.

Return type

scipy.optimize.minimize_result

Raises

ValueError – If the priors are poorly specified
RuntimeError – If any priors evaluate to non-finite values

See also

log_likelihood()

Utils¶

There are some utilities that help with interfacing with the various Models

Starfish.models.find_residual_peaks(model, num_residuals=100, threshold=4.0, buffer=2, wl_range=(0, inf))[source]¶

Find the peaks of the most recent residual and return their properties to aid in setting up local kernels

Parameters

model (Model) – The model to determine peaks from. Need only have a residuals array.
num_residuals (int, optional) – The number of residuals to average together for determining peaks. By default 100.
threshold (float, optional) – The sigma clipping threshold, by default 4.0
buffer (float, optional) – The minimum distance between peaks, in Angstrom, by default 2.0
wl_range (2-tuple) – The (min, max) wavelengths to consider. Default is (0, np.inf)

Returns

means – The means of the found peaks, with the same units as model.data.wave

Return type

list

Starfish.models.optimize_residual_peaks(model, mus, threshold=0.1, sigma0=50, num_residuals=100)[source]¶

Optimize the local covariance parameters based on fitting the residual input means as Gaussians around the residuals

Parameters

model (Model) – The model to determine peaks from. Need only have a residuals array.
mus (array-like) – The means to instantiate Gaussians at and optimize.
threshold (float, optional) – This is the threshold for restricting kernels; i.e. if a fit amplitude is less than threshold standard deviations then it will be thrown away. Default is 0.1
sigma0 (float, optional) – The initial standard deviation (in Angstrom) of each Gaussian. Default is 50 Angstrom.
num_residuals (int, optional) – The number of residuals to average together for determining peaks. By default 100.

Returns

A dictionary of optimized parameters ready to be plugged into model[“local_cov”]

Return type

dict

Warning

I have had inconsistent results with this optimization, be mindful of your outputs and consider hand-tuning after optimizing.

Starfish.models.covariance_debugger()[source]¶: Special debugging information for the covariance matrix decomposition.