Spectral fitting example (Crab)
The binned data are simulations of the Crab Nebula and albedo photon background produced using the COSI SMEX mass model. The detector response needs to be unzipped before running the notebook.
This notebook fits the spectrum of a Crab simulated using MEGAlib and combined with background.
3ML is a high-level interface that allows multiple datasets from different instruments to be used coherently to fit the parameters of source model. A source model typically consists of a list of sources with parametrized spectral shapes, sky locations and, for extended sources, shape. Polarization is also possible. A “coherent” analysis, in this context, means that the source model parameters are fitted using all available datasets simultanously, rather than performing individual fits and finding a well-suited common model a posteriori.
In order for a dataset to be included in 3ML, each instrument needs to provide a “plugin”. Each plugin is responsible for reading the data, convolving the source model (provided by 3ML) with the instrument response, and returning a likelihood. In our case, we’ll compute a binned Poisson likelihood:
where \(d_i\) are the counts on each bin and \(\lambda_i\) are the expected counts given a source model with parameters \(\mathbf{x}\).
In this example, we will fit a single point source with a known location. We’ll assume the background is known and fixed up to a scaling factor. Finally, we will fit a Band function:
where \(K\) (normalization), \(\alpha\) & \(\beta\) (spectral indeces), and \(x_p\) (peak energy) are the free parameters, while \(E_{piv}\) is the pivot energy which is fixed (and arbitrary).
Considering these assumptions:
where \(B*b_i\) are the estimated counts due to background in each bin with \(B\) the amplitude and \(b_i\) the shape of the background, and \(s_i\) are the corresponding expected counts from the source, the goal is then to find the values of \(\mathbf{x} = [K, \alpha, \beta, x_p]\) and \(B\) that maximize \(\mathcal{L}\). These are the best estimations of the parameters.
The final module needs to also fit the time-dependent background, handle multiple point-like and extended sources, as well as all the spectral models supported by 3ML. Eventually, it will also fit the polarization angle. However, this simple example already contains all the necessary pieces to do a fit.
[1]:
from cosipy import COSILike, test_data, BinnedData
from cosipy.spacecraftfile import SpacecraftFile
from cosipy.response.FullDetectorResponse import FullDetectorResponse
from cosipy.util import fetch_wasabi_file
from scoords import SpacecraftFrame
from astropy.time import Time
import astropy.units as u
from astropy.coordinates import SkyCoord, Galactic
import numpy as np
import matplotlib.pyplot as plt
from threeML import Band, PointSource, Model, JointLikelihood, DataList
from astromodels import Parameter
from pathlib import Path
import os
12:03:40 WARNING The naima package is not available. Models that depend on it will not be functions.py:48 available
WARNING The GSL library or the pygsl wrapper cannot be loaded. Models that depend on it functions.py:69 will not be available.
WARNING We have set the min_value of K to 1e-99 because there was a postive transform parameter.py:704
WARNING We have set the min_value of K to 1e-99 because there was a postive transform parameter.py:704
WARNING We have set the min_value of K to 1e-99 because there was a postive transform parameter.py:704
WARNING We have set the min_value of K to 1e-99 because there was a postive transform parameter.py:704
WARNING We have set the min_value of F to 1e-99 because there was a postive transform parameter.py:704
WARNING We have set the min_value of K to 1e-99 because there was a postive transform parameter.py:704
12:03:40 INFO Starting 3ML! __init__.py:35
WARNING WARNINGs here are NOT errors __init__.py:36
WARNING but are inform you about optional packages that can be installed __init__.py:37
WARNING to disable these messages, turn off start_warning in your config file __init__.py:40
WARNING ROOT minimizer not available minimization.py:1345
WARNING Multinest minimizer not available minimization.py:1357
WARNING PyGMO is not available minimization.py:1369
WARNING The cthreeML package is not installed. You will not be able to use plugins which __init__.py:94 require the C/C++ interface (currently HAWC)
WARNING Could not import plugin HAWCLike.py. Do you have the relative instrument __init__.py:144 software installed and configured?
WARNING Could not import plugin FermiLATLike.py. Do you have the relative instrument __init__.py:144 software installed and configured?
12:03:41 WARNING No fermitools installed lat_transient_builder.py:44
12:03:41 WARNING Env. variable OMP_NUM_THREADS is not set. Please set it to 1 for optimal __init__.py:387 performances in 3ML
WARNING Env. variable MKL_NUM_THREADS is not set. Please set it to 1 for optimal __init__.py:387 performances in 3ML
WARNING Env. variable NUMEXPR_NUM_THREADS is not set. Please set it to 1 for optimal __init__.py:387 performances in 3ML
Download and read in binned data
Define the path to the directory containing the data, detector response, orientation file, and yaml files if they have already been downloaded, or the directory to download the files into
[2]:
data_path = Path("/path/to/files")
Download the orientation file (684.38 MB)
[3]:
fetch_wasabi_file('COSI-SMEX/DC2/Data/Orientation/20280301_3_month.ori', output=str(data_path / '20280301_3_month.ori'))
Download the binned Crab+background data (99.16 MB)
[5]:
fetch_wasabi_file('COSI-SMEX/cosipy_tutorials/crab_spectral_fit_galactic_frame/crab_bkg_binned_data.hdf5', output=str(data_path / 'crab_bkg_binned_data.hdf5'))
Download the binned Crab data (13.16 MB)
[7]:
fetch_wasabi_file('COSI-SMEX/cosipy_tutorials/crab_spectral_fit_galactic_frame/crab_binned_data.hdf5', output=str(data_path / 'crab_binned_data.hdf5'))
Download the binned background data (89.10 MB)
[9]:
fetch_wasabi_file('COSI-SMEX/cosipy_tutorials/crab_spectral_fit_galactic_frame/bkg_binned_data.hdf5', output=str(data_path / 'bkg_binned_data.hdf5'))
Download the response file (839.62 MB). This needs to be unzipped before running the rest of the notebook
[10]:
fetch_wasabi_file('COSI-SMEX/DC2/Responses/SMEXv12.Continuum.HEALPixO3_10bins_log_flat.binnedimaging.imagingresponse.nonsparse_nside8.area.good_chunks_unzip.h5.zip', output=str(data_path / 'SMEXv12.Continuum.HEALPixO3_10bins_log_flat.binnedimaging.imagingresponse.nonsparse_nside8.area.good_chunks_unzip.h5.zip'))
Read in the spacecraft orientation file
[4]:
sc_orientation = SpacecraftFile.parse_from_file(data_path / "20280301_3_month.ori")
Create BinnedData objects for the Crab only, Crab+background, and background only. The Crab only simulation is not used for the spectral fit, but can be used to compare the fitted spectrum to the source simulation
[5]:
crab = BinnedData(data_path / "crab.yaml")
crab_bkg = BinnedData(data_path / "crab.yaml")
bkg = BinnedData(data_path / "background.yaml")
Load binned .hdf5 files
[6]:
crab.load_binned_data_from_hdf5(binned_data=data_path / "crab_binned_data.hdf5")
crab_bkg.load_binned_data_from_hdf5(binned_data=data_path / "crab_bkg_binned_data.hdf5")
bkg.load_binned_data_from_hdf5(binned_data=data_path / "bkg_binned_data.hdf5")
Define the path to the detector response
[7]:
dr = str(data_path / "SMEXv12.Continuum.HEALPixO3_10bins_log_flat.binnedimaging.imagingresponse.nonsparse_nside8.area.good_chunks_unzip.h5") # path to detector response
Perform spectral fit
Set background parameter, which is used to fit the amplitude of the background, and instantiate the COSI 3ML plugin
[8]:
bkg_par = Parameter("background_cosi", # background parameter
1, # initial value of parameter
min_value=0, # minimum value of parameter
max_value=5, # maximum value of parameter
delta=0.05, # initial step used by fitting engine
desc="Background parameter for cosi")
cosi = COSILike("cosi", # COSI 3ML plugin
dr = dr, # detector response
data = crab_bkg.binned_data.project('Em', 'Phi', 'PsiChi'), # data (source+background)
bkg = bkg.binned_data.project('Em', 'Phi', 'PsiChi'), # background model
sc_orientation = sc_orientation, # spacecraft orientation
nuisance_param = bkg_par) # background parameter
Define a point source at the known location with a Band function spectrum and add it to the model. The initial values of the Band function parameters are set to the true values used to simulate the source
[9]:
l = 184.56
b = -5.78
alpha = -1.99
beta = -2.32
E0 = 531. * (alpha - beta) * u.keV
xp = E0 * (alpha + 2) / (alpha - beta)
piv = 500. * u.keV
K = 3.07e-5 / u.cm / u.cm / u.s / u.keV
spectrum = Band()
spectrum.alpha.min_value = -2.14
spectrum.alpha.max_value = 3.0
spectrum.beta.min_value = -5.0
spectrum.beta.max_value = -2.15
spectrum.xp.min_value = 1.0
spectrum.alpha.value = alpha
spectrum.beta.value = beta
spectrum.xp.value = xp.value
spectrum.K.value = K.value
spectrum.piv.value = piv.value
spectrum.xp.unit = xp.unit
spectrum.K.unit = K.unit
spectrum.piv.unit = piv.unit
spectrum.alpha.delta = 0.01
spectrum.beta.delta = 0.01
source = PointSource("source", # Name of source (arbitrary, but needs to be unique)
l = l, # Longitude (deg)
b = b, # Latitude (deg)
spectral_shape = spectrum) # Spectral model
# Optional: free the position parameters
#source.position.l.free = True
#source.position.b.free = True
model = Model(source) # Model with single source. If we had multiple sources, we would do Model(source1, source2, ...)
# Optional: if you want to call get_log_like manually, then you also need to set the model manually
# 3ML does this internally during the fit though
cosi.set_model(model)
12:04:35 WARNING The current value of the parameter beta (-2.0) was above the new maximum -2.15. parameter.py:794
... Calculating point source responses ...
WARNING ErfaWarning: ERFA function "utctai" yielded 7979956 of "dubious year (Note 3)"
--> done (source name : source)
--> all done
Gather all plugins and combine with the model in a JointLikelihood object, then perform maximum likelihood fit
[10]:
plugins = DataList(cosi) # If we had multiple instruments, we would do e.g. DataList(cosi, lat, hawc, ...)
like = JointLikelihood(model, plugins, verbose = False)
like.fit()
12:05:05 INFO set the minimizer to minuit joint_likelihood.py:1042
Adding 1e-12 to each bin of the expectation to avoid log-likelihood = -inf.
WARNING IntegrationWarning: The occurrence of roundoff error is detected, which prevents
the requested tolerance from being achieved. The error may be
underestimated.
12:05:26 WARNING get_number_of_data_points not implemented, values for statistical plugin_prototype.py:128 measurements such as AIC or BIC are unreliable
12:05:26 WARNING The current value of the parameter beta (-2.0) was above the new maximum -2.15. parameter.py:794
Best fit values:
result | unit | |
---|---|---|
parameter | ||
source.spectrum.main.Band.K | (2.857 +/- 0.023) x 10^-5 | 1 / (cm2 keV s) |
source.spectrum.main.Band.alpha | -1.9886 +/- 0.0004 | |
source.spectrum.main.Band.xp | 4.47 -0.17 +0.18 | keV |
source.spectrum.main.Band.beta | -2.1964 +/- 0.0016 | |
background_cosi | (9.9193 +/- 0.0020) x 10^-1 |
Correlation matrix:
1.00 | 0.42 | -0.61 | -0.12 | 0.05 |
0.42 | 1.00 | 0.45 | 0.08 | -0.03 |
-0.61 | 0.45 | 1.00 | 0.01 | -0.02 |
-0.12 | 0.08 | 0.01 | 1.00 | -0.52 |
0.05 | -0.03 | -0.02 | -0.52 | 1.00 |
Values of -log(likelihood) at the minimum:
-log(likelihood) | |
---|---|
cosi | -2.612135e+08 |
total | -2.612135e+08 |
Values of statistical measures:
statistical measures | |
---|---|
AIC | -5.224270e+08 |
BIC | -5.224270e+08 |
[10]:
( value negative_error positive_error \
source.spectrum.main.Band.K 0.000029 -2.272835e-07 2.278935e-07
source.spectrum.main.Band.alpha -1.988617 -3.560944e-04 3.585775e-04
source.spectrum.main.Band.xp 4.473464 -1.691279e-01 1.784522e-01
source.spectrum.main.Band.beta -2.196416 -1.564268e-03 1.565798e-03
background_cosi 0.991932 -1.915462e-04 1.937451e-04
error unit
source.spectrum.main.Band.K 2.275885e-07 1 / (cm2 keV s)
source.spectrum.main.Band.alpha 3.573359e-04
source.spectrum.main.Band.xp 1.737900e-01 keV
source.spectrum.main.Band.beta 1.565033e-03
background_cosi 1.926456e-04 ,
-log(likelihood)
cosi -2.612135e+08
total -2.612135e+08)
Error propagation and plotting (Band function)
Define Band function spectrum injected into MEGAlib
[11]:
alpha_inj = -1.99
beta_inj = -2.32
E0_inj = 531. * (alpha_inj - beta_inj) * u.keV
xp_inj = E0_inj * (alpha_inj + 2) / (alpha_inj - beta_inj)
piv_inj = 100. * u.keV
K_inj = 7.56e-4 / u.cm / u.cm / u.s / u.keV
spectrum_inj = Band()
spectrum_inj.alpha.min_value = -2.14
spectrum_inj.alpha.max_value = 3.0
spectrum_inj.beta.min_value = -5.0
spectrum_inj.beta.max_value = -2.15
spectrum_inj.xp.min_value = 1.0
spectrum_inj.alpha.value = alpha_inj
spectrum_inj.beta.value = beta_inj
spectrum_inj.xp.value = xp_inj.value
spectrum_inj.K.value = K_inj.value
spectrum_inj.piv.value = piv_inj.value
spectrum_inj.xp.unit = xp_inj.unit
spectrum_inj.K.unit = K_inj.unit
spectrum_inj.piv.unit = piv_inj.unit
12:06:33 WARNING The current value of the parameter beta (-2.0) was above the new maximum -2.15. parameter.py:794
The summary of the results above tell you the optimal values of the parameters, as well as the errors. Propogate the errors to the “evaluate_at” method of the spectrum
[12]:
results = like.results
print(results.display())
parameters = {par.name:results.get_variates(par.path)
for par in results.optimized_model["source"].parameters.values()
if par.free}
results_err = results.propagate(results.optimized_model["source"].spectrum.main.shape.evaluate_at, **parameters)
print(results.optimized_model["source"])
Best fit values:
result | unit | |
---|---|---|
parameter | ||
source.spectrum.main.Band.K | (2.857 +/- 0.023) x 10^-5 | 1 / (cm2 keV s) |
source.spectrum.main.Band.alpha | -1.9886 +/- 0.0004 | |
source.spectrum.main.Band.xp | 4.47 -0.17 +0.18 | keV |
source.spectrum.main.Band.beta | -2.1964 +/- 0.0016 | |
background_cosi | (9.9193 +/- 0.0020) x 10^-1 |
Correlation matrix:
1.00 | 0.42 | -0.61 | -0.12 | 0.05 |
0.42 | 1.00 | 0.45 | 0.08 | -0.03 |
-0.61 | 0.45 | 1.00 | 0.01 | -0.02 |
-0.12 | 0.08 | 0.01 | 1.00 | -0.52 |
0.05 | -0.03 | -0.02 | -0.52 | 1.00 |
Values of -log(likelihood) at the minimum:
-log(likelihood) | |
---|---|
cosi | -2.612135e+08 |
total | -2.612135e+08 |
Values of statistical measures:
statistical measures | |
---|---|
AIC | -5.224270e+08 |
BIC | -5.224270e+08 |
None
12:06:34 WARNING The current value of the parameter beta (-2.0) was above the new maximum -2.15. parameter.py:794
WARNING The current value of the parameter beta (-2.0) was above the new maximum -2.15. parameter.py:794
WARNING The current value of the parameter beta (-2.0) was above the new maximum -2.15. parameter.py:794
* source (point source):
* position:
* l:
* value: 184.56
* desc: Galactic longitude
* min_value: 0.0
* max_value: 360.0
* unit: deg
* is_normalization: false
* b:
* value: -5.78
* desc: Galactic latitude
* min_value: -90.0
* max_value: 90.0
* unit: deg
* is_normalization: false
* equinox: J2000
* spectrum:
* main:
* Band:
* K:
* value: 2.8565585663971596e-05
* desc: Differential flux at the pivot energy
* min_value: 1.0e-99
* max_value: null
* unit: keV-1 s-1 cm-2
* is_normalization: true
* alpha:
* value: -1.9886166208617622
* desc: low-energy photon index
* min_value: -2.14
* max_value: 3.0
* unit: ''
* is_normalization: false
* xp:
* value: 4.473463779563324
* desc: peak in the x * x * N (nuFnu if x is a energy)
* min_value: 1.0
* max_value: null
* unit: keV
* is_normalization: false
* beta:
* value: -2.196416422107725
* desc: high-energy photon index
* min_value: -5.0
* max_value: -2.15
* unit: ''
* is_normalization: false
* piv:
* value: 500.0
* desc: pivot energy
* min_value: null
* max_value: null
* unit: keV
* is_normalization: false
* polarization: {}
Evaluate the flux and errors at a range of energies for the fitted and injected spectra, and the simulated source flux
[13]:
energy = np.geomspace(100*u.keV,10*u.MeV).to_value(u.keV)
flux_lo = np.zeros_like(energy)
flux_median = np.zeros_like(energy)
flux_hi = np.zeros_like(energy)
flux_inj = np.zeros_like(energy)
for i, e in enumerate(energy):
flux = results_err(e)
flux_median[i] = flux.median
flux_lo[i], flux_hi[i] = flux.equal_tail_interval(cl=0.68)
flux_inj[i] = spectrum_inj.evaluate_at(e)
binned_energy_edges = crab.binned_data.axes['Em'].edges.value
binned_energy = np.array([])
bin_sizes = np.array([])
for i in range(len(binned_energy_edges)-1):
binned_energy = np.append(binned_energy, (binned_energy_edges[i+1] + binned_energy_edges[i]) / 2)
bin_sizes = np.append(bin_sizes, binned_energy_edges[i+1] - binned_energy_edges[i])
expectation = cosi._expected_counts['source']
Plot the fitted and injected spectra
[14]:
fig,ax = plt.subplots()
ax.plot(energy, energy*energy*flux_median, label = "Best fit")
ax.fill_between(energy, energy*energy*flux_lo, energy*energy*flux_hi, alpha = .5, label = "Best fit (errors)")
ax.plot(energy, energy*energy*flux_inj, color = 'black', ls = ":", label = "Injected")
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlabel("Energy (keV)")
ax.set_ylabel(r"$E^2 \frac{dN}{dE}$ (keV cm$^{-2}$ s$^{-1}$)")
ax.legend()
[14]:
<matplotlib.legend.Legend at 0x289dee910>
Plot the fitted spectrum convolved with the response, as well as the simulated source counts
[15]:
fig,ax = plt.subplots()
ax.stairs(expectation.project('Em').todense().contents, binned_energy_edges, color='purple', label = "Best fit convolved with response")
ax.errorbar(binned_energy, expectation.project('Em').todense().contents, yerr=np.sqrt(expectation.project('Em').todense().contents), color='purple', linewidth=0, elinewidth=1)
ax.stairs(crab.binned_data.project('Em').todense().contents, binned_energy_edges, color = 'black', ls = ":", label = "Source counts")
ax.errorbar(binned_energy, crab.binned_data.project('Em').todense().contents, yerr=np.sqrt(crab.binned_data.project('Em').todense().contents), color='black', linewidth=0, elinewidth=1)
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlabel("Energy (keV)")
ax.set_ylabel("Counts")
ax.legend()
[15]:
<matplotlib.legend.Legend at 0x2afdc8160>
Plot the fitted spectrum convolved with the response plus the fitted background, as well as the simulated source+background counts
[16]:
fig,ax = plt.subplots()
ax.stairs(expectation.project('Em').todense().contents+(bkg_par.value * bkg.binned_data.project('Em').todense().contents), binned_energy_edges, color='purple', label = "Best fit convolved with response plus background")
ax.errorbar(binned_energy, expectation.project('Em').todense().contents+(bkg_par.value * bkg.binned_data.project('Em').todense().contents), yerr=np.sqrt(expectation.project('Em').todense().contents+(bkg_par.value * bkg.binned_data.project('Em').todense().contents)), color='purple', linewidth=0, elinewidth=1)
ax.stairs(crab_bkg.binned_data.project('Em').todense().contents, binned_energy_edges, color = 'black', ls = ":", label = "Total counts")
ax.errorbar(binned_energy, crab_bkg.binned_data.project('Em').todense().contents, yerr=np.sqrt(crab_bkg.binned_data.project('Em').todense().contents), color='black', linewidth=0, elinewidth=1)
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlabel("Energy (keV)")
ax.set_ylabel("Counts")
ax.legend()
[16]:
<matplotlib.legend.Legend at 0x2b007c6d0>