Models¶
SpectrumModel¶
The SpectrumModel
is the main implementation of the Starfish methods for a single-order spectrum. It works by interfacing with both Starfish.emulator.Emulator
, Starfish.spectrum.Spectrum
, and the methods in Starfish.transforms
. The spectral emulator provides an interface to spectral model libraries with a covariance matrix for each interpolated spectrum. The transforms provide the physics behind alterations to the light. For a given set of parameters, a transformed spectrum and covariance matrix are provided by
>>> from Starfish.models import SpectrumModel
>>> model = SpectrumModel(...)
>>> flux, cov = model()
It is also possible to optimize our parameters using the interfaces provided in SpectrumModel.get_param_vector()
, SpectrumModel.set_param_vector()
, and SpectrumModel.log_likelihood()
. A very minimal example might be
>>> from Starfish.models import SpectrumModel
>>> from scipy.optimize import minimize
>>> model = SpectrumModel(...)
>>> def nll(P):
model.set_param_vector(P)
lnprob = model.log_likelihood()
return -lnprob
>>> P0 = model.get_param_vector()
>>> soln = minimize(nll, P0, method='Nelder-Mead')
For a more thorough example, see the Examples.
Parametrization¶
This model uses a method of specifying parameters very similar to Dan Foreman-Mackey’s George library. There exists an underlying dictionary of the model parameters, which define what transformations will be made. For example, if vz
exists in a model’s parameter dictionary, then doppler shifting will occur when calling the model.
It is possible to have a parameter that transforms the spectrum, but is not fittable. We call these frozen parameters. For instance, if my 3 model library parameters are \(T_{eff}\), \(\log g\), and \([Fe/H]\) (or T
, logg
, Z
in the code), but I don’t want to fit $log g$, I can freeze it:
>>> from Starfish.models import SpectrumModel
>>> model = SpectrumModel(...)
>>> model.freeze('logg')
When using this framework, you can see what transformations will occur by looking at SpectrumModel.params
and what values are fittable by SpectrumModel.get_param_dict()
(or the other getters for the parameters).
>>> model.params
{'T': 6020, 'logg': 4.2, 'Z': 0.0, 'vsini': 84.0, 'log_scale': -10.23}
>>> model.get_param_dict()
{'T': 6020, 'Z': 0.0, 'vsini': 84.0, 'log_scale': -10.23}
To undo this, simply thaw the frozen parameters
>>> model.thaw('logg')
>>> model.params == model.get_param_dict()
True
The underlying parameter dictionary is a special flat dictionary. Consider the nested dictionary
>>> cov = {
... 'log_amp': 0,
... 'log_ls': 6,
... }
>>> model['global_cov'] = cov
>>> model.params
{
'T': 6020,
'logg': 4.2,
'Z': 0.0,
'vsini': 84.0,
'log_scale': -10.23,
'global_cov:log_amp': 0,
'global_cov:log_ls': 6
}
These values can be referenced normally or using its flat key
>>> model['global_cov:log_amp'] == model['global_cov']['log_amp']
True
API/Reference¶
-
class
Starfish.models.
SpectrumModel
(emulator: Union[str, Starfish.emulator.emulator.Emulator], data: Union[str, Starfish.spectrum.Spectrum], grid_params: Sequence[float], max_deque_len: int = 100, name: str = 'SpectrumModel', **params)[source]¶ A single-order spectrum model.
- Parameters
emulator (
Starfish.emulators.Emulator
) – The emulator to use for this model.data (
Starfish.spectrum.Spectrum
) – The data to use for this modelgrid_params (array-like) – The parameters that are used with the associated emulator
max_deque_len (int, optional) – The maximum number of residuals to retain in a deque of residuals. Default is 100
name (str, optional) – A name for the model. Default is ‘SpectrumModel’
- Keyword Arguments
params (dict) – Any remaining keyword arguments will be interpreted as parameters.
Here is a table describing the avialable parameters and their related functions
Parameter
Function
vsini
vz
Av
Rv
log_scale
Note
If
log_scale
is not specified, the model will userenorm()
to automatically scale the spectrum to the data using the ratio of integrated fluxes.The
global_cov
keyword arguments must be a dictionary definining the hyperparameters for the global covariance kernel,kernels.global_covariance_matrix()
Global Parameter
Description
log_amp
The natural logarithm of the amplitude of the Matern kernel
log_ls
The natural logarithm of the lengthscale of the Matern kernel
The
local_cov
keryword argument must be a list of dictionaries defining hyperparameters for many Gaussian kernels, ,kernels.local_covariance_matrix()
Local Parameter
Description
log_amp
The natural logarithm of the amplitude of the kernel
mu
The location of the local kernel
log_sigma
The natural logarithm of the standard deviation of the kernel
-
params
¶ The dictionary of parameters that are used for doing the modeling.
- Type
dict
-
frozen
¶ A list of strings corresponding to frozen parameters
- Type
list
-
residuals
¶ A deque containing residuals from calling
SpectrumModel.log_likelihood()
- Type
deque
-
__call__
()[source]¶ Performs the transformations according to the parameters available in
self.params
- Returns
flux, cov – The transformed flux and covariance matrix from the model
- Return type
tuple
-
freeze
(names)[source]¶ Freeze the given parameter such that
get_param_dict()
andget_param_vector()
no longer include this parameter, however it will still be used when calling the model.- Parameters
name (str or array-like) – The parameter to freeze. If
'all'
, will freeze all parameters. If'global_cov'
will freeze all global covariance parameters. If'local_cov'
will freeze all local covariance parameters.- Raises
ValueError – If the given parameter does not exist
See also
-
get_param_dict
(flat: bool = False) → dict[source]¶ Gets the dictionary of thawed parameters.
- Parameters
flat (bool, optional) – If True, returns the parameters completely flat. For example,
['local']['0']['mu']
would have the key'local:0:mu'
. Default is False- Returns
- Return type
dict
See also
-
get_param_vector
()[source]¶ Get a numpy array of the thawed parameters
- Returns
- Return type
numpy.ndarray
See also
-
property
grid_params
¶ The parameters used for the spectral emulator.
- Setter
Sets the values in the order of
Emulator.param_names
- Type
numpy.ndarray
-
property
labels
¶ The thawed parameter names
- Type
tuple of str
-
load
(filename)[source]¶ Load a saved model state from a TOML file
- Parameters
filename (str or path-like) – The saved state to load
-
log_likelihood
(priors: Optional[dict] = None) → float[source]¶ Returns the log probability of a multivariate normal distribution
- Parameters
priors (dict, optional) – If provided, will use these priors in the MLE. Should contain keys that match the model’s keys and values that have a logpdf method that takes one value (like
scipy.stats
distributions). Default is None.
Warning
No checks will be done on the
priors
for speed.- Returns
- Return type
float
-
plot
(axes=None, plot_kwargs=None, resid_kwargs=None)[source]¶ Plot the model.
This will create two subplots, one which shows the current model against the data, and another which shows the current residuals with 3:math:sigma contours from the diagonal of the covariance matrix. Note this requires matplotlib to be installed, which is not installed by default with Starfish.
- Parameters
axes (iterable of matplotlib.Axes, optional) – If provided, will use the first two axes to plot, otherwise will create new axes, by default None
plot_kwargs (dict, optional) – If provided, will use these kwargs for the comparison plot, by default None
resid_kwargs (dict, optional) – If provided, will use these kwargs for the residuals plot, by default None
- Returns
The returned axes, for the user to edit as they please
- Return type
list of matplotlib.Axes
-
save
(filename, metadata=None)[source]¶ Saves the model as a set of parameters into a TOML file
- Parameters
filename (str or path-like) – The TOML filename to save to.
metadata (dict, optional) – If provided, will save the provided dictionary under a ‘metadata’ key. This will not be read in when loading models but provides a way of providing information in the actual TOML files. Default is None.
-
set_param_dict
(params)[source]¶ Sets the parameters with a dictionary. Note that this should not be used to add new parameters
- Parameters
params (dict) – The new parameters. If a key is present in
self.frozen
it will not be changed
See also
-
set_param_vector
(params)[source]¶ Sets the parameters based on the current thawed state. The values will be inserted according to the order of
SpectrumModel.labels
.- Parameters
params (array_like) – The parameters to set in the model
- Raises
ValueError – If the params do not match the length of the current thawed parameters.
See also
-
thaw
(names)[source]¶ Thaws the given parameter. Opposite of freezing
- Parameters
name (str or array-like) – The parameter to thaw. If
'all'
, will thaw all parameters. If'global_cov'
will thaw all global covariance parameters. If'local_cov'
will thaw all local covariance parameters.- Raises
ValueError – If the given parameter does not exist.
See also
-
train
(priors: Optional[dict] = None, **kwargs)[source]¶ Given a
SpectrumModel
and a dictionary of priors, will perform maximum-likelihood estimation (MLE). This will usescipy.optimize.minimize
to find the maximum a-posteriori (MAP) estimate of the current model state. Note that this alters the state of the model. This means that you can run this method multiple times until the optimization succeeds. By default, we use the “Nelder-Mead” method in minimize to avoid approximating any derivatives.- Parameters
priors (dict, optional) – Priors to pass to
log_likelihood()
**kwargs (dict, optional) – These keyword arguments will be passed to scipy.optimize.minimize
- Returns
soln – The output of the minimization.
- Return type
scipy.optimize.minimize_result
- Raises
ValueError – If the priors are poorly specified
RuntimeError – If any priors evaluate to non-finite values
See also
Utils¶
There are some utilities that help with interfacing with the various Models
-
Starfish.models.
find_residual_peaks
(model, num_residuals=100, threshold=4.0, buffer=2, wl_range=(0, inf))[source]¶ Find the peaks of the most recent residual and return their properties to aid in setting up local kernels
- Parameters
model (Model) – The model to determine peaks from. Need only have a residuals array.
num_residuals (int, optional) – The number of residuals to average together for determining peaks. By default 100.
threshold (float, optional) – The sigma clipping threshold, by default 4.0
buffer (float, optional) – The minimum distance between peaks, in Angstrom, by default 2.0
wl_range (2-tuple) – The (min, max) wavelengths to consider. Default is (0, np.inf)
- Returns
means – The means of the found peaks, with the same units as model.data.wave
- Return type
list
-
Starfish.models.
optimize_residual_peaks
(model, mus, threshold=0.1, sigma0=50, num_residuals=100)[source]¶ Optimize the local covariance parameters based on fitting the residual input means as Gaussians around the residuals
- Parameters
model (Model) – The model to determine peaks from. Need only have a residuals array.
mus (array-like) – The means to instantiate Gaussians at and optimize.
threshold (float, optional) – This is the threshold for restricting kernels; i.e. if a fit amplitude is less than threshold standard deviations then it will be thrown away. Default is 0.1
sigma0 (float, optional) – The initial standard deviation (in Angstrom) of each Gaussian. Default is 50 Angstrom.
num_residuals (int, optional) – The number of residuals to average together for determining peaks. By default 100.
- Returns
A dictionary of optimized parameters ready to be plugged into model[“local_cov”]
- Return type
dict
Warning
I have had inconsistent results with this optimization, be mindful of your outputs and consider hand-tuning after optimizing.