Near-final updates #31
HighEnergyObsCoreExt.tex
Outdated
| \section{High Energy Astrophysics Data} | ||
| \gls{HEA} data include observations obtained using photon detectors covering X-ray (from $\sim$0.1 eV to $\sim$100 keV) through gamma-ray (from 100 MeV up to $\gtrsim$ PeV) energies, as well as cosmic-ray and astrophysical neutrino ($\gtrsim$ TeV) detectors, or other messenger related to \gls{HEA} phenomena. The domain is now sufficiently mature to provide open data that are science-ready and work with open analysis tools ({\em e.g.\/} CIAO \citep{2006SPIE.6270E..1VF} or Gammapy \citep{gammapy:2023}). The science output of the \gls{HEA} domain already includes high-level products such as images, cubes, spectra, and time series such as light curves and time-resolved spectra. Additional data products include fitted sky models with a spatial, spectral and/or temporal component(s), along with their confidence intervals or confidence limits, and covariance matrices. Finally, multiple \gls{HEA} instruments produce source catalogs and surveys covering up to the full the sky, which include maps of photon or particle flux, exposure, sensitivity, and aperture-photometry likelihood profiles. | ||
| \gls{HEA} data include observations obtained using photon detectors covering X-ray (from $\sim$0.1 keV to $\sim$120 keV) through gamma-ray (from 120 keV up to $\gtrsim$ PeV) energies, as well as cosmic-ray and astrophysical neutrino ($\gtrsim$ TeV) detectors, or other messengers related to \gls{HEA} phenomena. The domain is now sufficiently mature to provide open data that are science-ready and work with open analysis tools ({\em e.g.\/}, CIAO \citep{2006SPIE.6270E..1VF} or Gammapy \citep{gammapy:2023}). The science output of the \gls{HEA} domain already includes high-level products such as images, cubes, spectra, and time series such as light curves and time-resolved spectra. Additional data products include fitted sky models with a spatial, spectral, and/or temporal component(s), along with their confidence intervals or confidence limits, and covariance matrices. Finally, multiple \gls{HEA} instruments produce source catalogs and surveys covering up to the full sky, which include maps of photon or particle flux, exposure, sensitivity, and aperture-photometry likelihood profiles. |
| \gls{HEA} data include observations obtained using photon detectors covering X-ray (from $\sim$0.1 keV to $\sim$120 keV) through gamma-ray (from 120 keV up to $\gtrsim$ PeV) energies, as well as cosmic-ray and astrophysical neutrino ($\gtrsim$ TeV) detectors, or other messengers related to \gls{HEA} phenomena. The domain is now sufficiently mature to provide open data that are science-ready and work with open analysis tools ({\em e.g.\/}, CIAO \citep{2006SPIE.6270E..1VF} or Gammapy \citep{gammapy:2023}). The science output of the \gls{HEA} domain already includes advanced products such as images, cubes, spectra, and time series such as light curves and time-resolved spectra. Additional data products include fitted sky models with a spatial, spectral, and/or temporal component(s), along with their confidence intervals or confidence limits, and covariance matrices. Finally, multiple \gls{HEA} instruments produce source catalogs and surveys covering up to the full sky, which include maps of photon or particle flux, exposure, sensitivity, and aperture-photometry likelihood profiles. |
As proposed by Mireille, let's try to use the term advanced products instead of high-level products (an event list is already a high-level product).
HighEnergyObsCoreExt.tex
Outdated
| %An {\bf event-bundle} might for example consist of an {\bf event-list} and the associated {\bf response-functions} (see below) used to calibrate the dataset; alternatively an {\bf event-bundle} may include the {\bf event-list} and associated data products necessary for the user to create the {\bf response-functions} (for those X-ray cases where detailed knowledge of the scientific use case — for example, the user’s selection of events — may be required to compute the responses).\\ | ||
| %particle-detection | ||
| In addition to {\em dataproduct\_type} terms that focus on event data, we note that existing ObsCore definitions do not adequately span the breadth of {\bf advanced data products} (with {\em calib\_level} $\ge$ 2) that may be generated from astronomical observations by users or observatories removing instrumental effects. The computational complexity of analyzing \gls{HEA} data robustly in the extreme Poisson regime ({\em e.g.\/}, Bayesian X-ray aperture photometry applied simultaneously to multiple overlapping detections and observations) means that data providers may choose to provide such analysis products directly to the end user. For example, the Chandra Source Catalog includes 38 types of advanced data products (for a total of $\sim$90 million files) and $\sim$50\% of these data product types are not well represented by a {\em dataproduct\_type} value that allows for meaningful data discovery. Users will certainly want to discover these data products independently from the associated observation data (and many of these data products combine data from multiple observations). We therefore propose the following additional {\em dataproduct\_type} (or {\em dataproduct\_subtype}) terms for these advanced data products, and note that these terms will certainly be useful independent of waveband ({\em i.e.\/}, they can be equally applicable to UV/optical, IR, and radio datasets): | ||
| In addition to {\em dataproduct\_type\/} terms that focus on event data, we note that existing ObsCore definitions do not adequately span the breadth of ``advanced data products'' (typically with {\em calib\_level\/} $\ge$ 3) that may be generated from astronomical observations by users or observatories. The computational complexity of analyzing \gls{HEA} data robustly in the extreme Poisson regime ({\em e.g.\/}, Bayesian X-ray aperture photometry applied simultaneously to multiple overlapping detections and observations) means that data providers may choose to provide such analysis products directly to the end user. For example, the Chandra Source Catalog includes 38 types of advanced data products (for a total of $\sim\!90$ million files) and $\sim\!50$\% of these data product types are not well represented by a {\em dataproduct\_type\/} value that allows for meaningful data discovery. Users will certainly want to discover these data products independently from the associated progenitor observation data (and many of these data products combine data from multiple observations). We therefore propose the following additional {\em dataproduct\_type\/} (or {\em dataproduct\_subtype\/}) terms for these advanced data products, and note that these terms will certainly be useful independent of waveband ({\em i.e.\/}, they can be equally applicable to UV/optical, IR, and radio datasets): |
| In addition to {\em dataproduct\_type\/} terms that focus on event data, we note that existing ObsCore definitions do not adequately span the breadth of ``advanced data products'' (typically with {\em calib\_level\/} $\ge$ 3) that may be generated from astronomical observations by users or observatories. The computational complexity of analyzing \gls{HEA} data robustly in the extreme Poisson regime ({\em e.g.\/}, Bayesian X-ray aperture photometry applied simultaneously to multiple overlapping detections and observations, or frequentist fitting of an electron-population model to multi-wavelength data spanning from X-rays to PeV gamma rays) means that data providers may choose to provide such analysis products directly to the end user. For example, the Chandra Source Catalog includes 38 types of advanced data products (for a total of $\sim\!90$ million files) and $\sim\!50$\% of these data product types are not well represented by a {\em dataproduct\_type\/} value that allows for meaningful data discovery. Users will certainly want to discover these data products independently from the associated progenitor observation data (and many of these data products combine data from multiple observations). We therefore propose the following additional {\em dataproduct\_type\/} (or {\em dataproduct\_subtype\/}) terms for these advanced data products, and note that these terms will certainly be useful independent of waveband ({\em i.e.\/}, they can be equally applicable to UV/optical, IR, and radio datasets): |
I have added a VHE example, so that readers can see that the need for these products comes from the whole HEA group.
HighEnergyObsCoreExt.tex
Outdated
| {\bf pdf}: a dataset that represents the probability density function of a quantity, for example the Bayesian marginal probability density function for a random variable. The probability density function provides a robust estimation of the variable and allows arbitrary confidence intervals to be computed directly from the distribution. | ||
| {\bf pdf}: a dataset that records the probability density function of a quantity, for example the Bayesian marginal probability density function for a random variable. The probability density function provides a robust estimation of the variable and allows arbitrary confidence intervals to be computed directly from the distribution. |
| {\bf pdf}: a dataset that records the probability density function of a quantity, for example the Bayesian marginal probability density function for a random variable or the DeltaTS associated with a quantity from a frequentist analysis. The probability density function provides a robust estimation of the variable and allows arbitrary confidence intervals to be computed directly from the distribution. |
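To illustrate the claim that arbitrary confidence intervals can be read directly off a {\bf pdf} data product, here is a minimal sketch. It assumes the product is a density tabulated on a one-dimensional grid; the function names (`cumulative`, `quantile`, `central_interval`) are hypothetical and not part of any proposed standard or existing library.

```python
def cumulative(xs, pdf):
    """Trapezoidal CDF on the grid of the tabulated PDF, normalised to 1."""
    cdf = [0.0]
    for i in range(1, len(xs)):
        cdf.append(cdf[-1] + 0.5 * (pdf[i] + pdf[i - 1]) * (xs[i] - xs[i - 1]))
    total = cdf[-1]
    if total <= 0.0:
        raise ValueError("tabulated PDF must have positive total probability")
    return [c / total for c in cdf]


def quantile(xs, cdf, p):
    """Linearly interpolate the grid value at which the CDF reaches p."""
    if p <= cdf[0]:
        return xs[0]
    for i in range(1, len(xs)):
        if cdf[i] >= p:
            # cdf[i - 1] < p <= cdf[i] here, so the denominator is positive.
            frac = (p - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
            return xs[i - 1] + frac * (xs[i] - xs[i - 1])
    return xs[-1]


def central_interval(xs, pdf, level=0.68):
    """Equal-tailed interval containing `level` of the probability."""
    cdf = cumulative(xs, pdf)
    tail = 0.5 * (1.0 - level)
    return quantile(xs, cdf, tail), quantile(xs, cdf, 1.0 - tail)
```

Because the full distribution is stored, the same file can serve a 68% interval, a 90% interval, or an upper limit without rerunning the analysis; an equal-tailed interval is only one possible convention, chosen here for simplicity.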
HighEnergyObsCoreExt.tex
Outdated
| The {\bf measurements} data product type is quite useful for many different types of advanced data products (that may be derived from multiple observations). But users of those products often may not be interested the progenitor datasets, especially if many advanced data products are extracted from a single or a few progenitors ({\em e.g.\/}, measurements associated with sources detected in a single observation field). We propose to delete the caveat associated with {\bf dataproduct\_type} = ``measurements'' in the ObsCore IVOA Recommendation (\S4.1.1) that requires the derived data products be exposed ``together with the progenitor observation dataset''.\\ | ||
| Note that these terms will be repeated in the section \ref{sec:voc}, as mentioned in the introduction of this sub-section. | ||
| The {\bf measurements} {\em dataproduct\_type\/} is quite useful for many different types of advanced data products (that may be derived from multiple observations). But users of those products often may not be interested in the progenitor datasets, especially if many advanced data products are extracted from a single or a few progenitors ({\em e.g.\/}, measurements associated with sources detected in a single observation field). We propose to delete the caveat associated with {\em dataproduct\_type\/} = ``measurements'' in the ObsCore IVOA Recommendation (\S~4.1.1) that requires the derived data products be exposed ``{\bf together} with the progenitor observation dataset''. |
| The {\bf measurements} {\em dataproduct\_type\/} is quite useful for many different types of advanced data products (that may be derived from multiple observations). But users of those products often may not be interested in the progenitor datasets, especially if many advanced data products are extracted from a single or a few progenitors ({\em e.g.\/}, measurements associated with sources detected in a single observation field). We propose to delete the caveat associated with {\em dataproduct\_type\/} = ``measurements'' in the ObsCore IVOA Recommendation (\S~4.1.1) that requires the derived data products be exposed ``{\bf together} with the progenitor observation dataset''. The progenitor observation dataset(s) can still be recovered through the Provenance information. |
HighEnergyObsCoreExt.tex
Outdated
| \subsection{{\em dataproduct\_subtype}} | ||
| The optional attribute {\em dataproduct\_subtype} may be used by the data provider to specify additional information about the nature of the data product. For some datasets, this attribute aims to be combined with {\em dataproduct\_type\/} to more precisely define the content of the dataset ({\em e.g.\/}, {\em dataproduct\_type\/} = {\bf image}${}+{}${\em dataproduct\_subtype\/} = {\bf exposuremap}). Even if the vocabulary of such data product (sub-)types is free, we recommand for \gls{HEA} data providers to use standardized vocabulary: {\em exposure} for the exposure map in any astrophysical dimension (ie, not with the instrumental ones), {\em psfkernel} for the PSF map in any astrophysical dimension, {\em edispkernel} for the energy dispersion matrix map in any astrophysical dimension, {\em significance} for the signifiance map in any astrophysical dimension, {\em probability} for a probability map in any astrophysical dimension. | ||
| The optional attribute {\em dataproduct\_subtype\/} may be used by the data provider to specify more precisely the scientific nature of a data product. Although no vocabulary is defined for {\em dataproduct\_subtype\/}, we recommend that data providers formulate and use a standardized vocabulary for this attribute for data products that are commonly used in \gls{HEA}\null. We have proposed several terms in \S~5 for commonly used \gls{HEA} {\bf response-function} types ({\em e.g.\/}, {\bf aeff}, {\bf edisp}, {\bf psf}), but additional terms could be standardized for other common data products. For example, standardizing using {\bf exposuremap} for an exposure map would enable queries such as ({\em dataproduct\_type\/} = {\bf image}) AND ({\em dataproduct\_subtype\/} = {\bf exposuremap}) to work across multiple facilities. Other possible terms could include {\bf significancemap} for a significance map, and {\bf probabilitymap} for a probability map. |
| The optional attribute {\em dataproduct\_subtype} may be used by the data provider to specify more precisely the scientific nature of a data product. Although no vocabulary is defined for {\em dataproduct\_subtype\/}, we recommend that data providers formulate and use a standardized vocabulary for this attribute for data products that are commonly used in \gls{HEA}\null. We have proposed several terms in \S~5 for commonly used \gls{HEA} {\bf response-function} types ({\em e.g.\/}, {\bf aeff}, {\bf edisp}, {\bf psf}, {\bf bkgrate}), but additional terms could be standardized for other common data products. For example, standardizing using {\bf exposuremap} for an exposure map would enable queries such as ({\em dataproduct\_type\/} = {\bf image}) AND ({\em dataproduct\_subtype\/} = {\bf exposuremap}) to work across multiple facilities. Other possible terms could include {\bf significancemap} for a significance map, {\bf probabilitymap} for a probability map, and {\bf exclusionmap} for the exclusion maps used to adjust TeV background models. |
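The cross-facility query described in this paragraph can be sketched as follows. This is only an illustration under stated assumptions: a real service would be queried with ADQL over TAP, whereas here an ObsCore-like table is mocked in an in-memory SQLite database. The column names follow ObsCore, but the table name `obscore`, the facility names, and all identifiers in the sample rows are invented for the example.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory stand-in for an ObsCore table (invented rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE obscore ("
        "obs_publisher_did TEXT, facility_name TEXT, "
        "dataproduct_type TEXT, dataproduct_subtype TEXT)"
    )
    conn.executemany(
        "INSERT INTO obscore VALUES (?, ?, ?, ?)",
        [
            ("ivo://example/a/expmap-1", "facility_A", "image", "exposuremap"),
            ("ivo://example/a/sigmap-2", "facility_A", "image", "significancemap"),
            ("ivo://example/b/expmap-7", "facility_B", "image", "exposuremap"),
            ("ivo://example/b/psf-3", "facility_B", "response-function", "psf"),
        ],
    )
    return conn


def find_products(conn, product_type, product_subtype):
    """Return identifiers matching both the type and the subtype."""
    rows = conn.execute(
        "SELECT obs_publisher_did FROM obscore "
        "WHERE dataproduct_type = ? AND dataproduct_subtype = ? "
        "ORDER BY obs_publisher_did",
        (product_type, product_subtype),
    )
    return [row[0] for row in rows]
```

With a shared term such as {\bf exposuremap}, the single call `find_products(conn, "image", "exposuremap")` returns the exposure maps of both mock facilities; if each provider used its own free-text subtype, the same query would need one clause per facility.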
Hello Ian,

Thank you @iannevans for this rich PR, which contains a lot of great editing corrections and text improvements. I have merged the PR of @loumir, so the PDF generation should be fixed now. Let's see whether it works well once your PR is merged.

Thanks for the update of the glossary. As you have noticed, glossary handling is not yet part of the standard CI of the IVOA LaTeX files. I had to add it 'manually', as you have seen. So, thanks for the update of the gls files. To make it appear in the document:
1/ make
2/ makeglossaries HighEnergyObsCoreExt
3/ make

Here are my comments on the LaTeX files after a careful last reading: Core document:
Use Cases (I have focused my reading on the TeV use cases)
Hi Bruno,

I will incorporate your suggestions and resubmit this as a new PR. I suggest that the current PR be rejected, since it has not yet been merged.

A few comments. Your inline suggestions seem OK, but I noticed a couple of grammar errors in those same paragraphs in what I originally updated, so I will correct those too.

Re: the examples of responses: I think that "({\em e.g.\/}, {\bf aeff}, {\bf edisp}, {\bf psf})" is sufficient (it's 3 out of 6) without adding bkgrate, especially since we reference section 5. Alternatively, we could drop the "e.g.," and just add all 6 in parentheses, but using "e.g.," means that we can update the list in section 5 subsequently without having to update this section also.

I'll update the text re draws to add the Lambda CDM example.

I'm not worried about region. There is no existing dataproduct_type that describes a data product that incorporates region definitions (MOC is an encoding, not a data product).

I think you might be misinterpreting the phot tree. phot is "photometry", not "photon". So phot.energy... are UCDs that describe energy flux (e.g., erg cm^-2 s^-1) and associated quantities. So M T^-3 is the correct dimensionality for energy flux. This highlighted an error in section 5.1.6, where energy flux (and associated radiance, flux density) is listed as W m^-2 s^-1. That should be J m^-2 s^-1. I'll fix that too.

I can add in particle flux and associated quantities similarly as phot.flux.particle... and this is a natural and consistent extension. phys.particle.flux is not the appropriate place for these. The phys UCD tree describes physical properties, but what we are discussing here are photometric quantities. That highlights another thing that we are missing: we need to add phys.particle.photon!!!

You would then define a counts flux (counts/cm^2/s) using "phot.count" (I know this one is inconsistent, but that's what happens when a standard accretes updates instead of being thought through at one time), a photon flux (photons/cm^2/s) would be "phot.flux.particle; phys.particle.photon", and an energy flux (erg/cm^2/s) would be "phot.flux.energy". I'll also correct the descriptions of the counts... UCDs to say just counts and not counts or particles.

I also don't want to add dimensionality to the existing (basic) "phot.flux" and associated derived quantities, because that has been "in the wild" for so long that there are probably a lot of uses out there that would not match any dimensionality that we suggest.

I'll update the use cases. I'm going to leave \makeglossaries, \printglossaries commented out for now until ivoatex gets updated, because the file won't pass the checks run by GitHub otherwise.

Thanks,
Hello, About UCDs:
Thank you very much again :) |
Mass. Using SI units, energy flux is J m^-2 s^-1 = N m m^-2 s^-1 = kg m s^-2 m m^-2 s^-1 = kg s^-3 so dimensionality = M T^-3. |
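The unit reduction above can be checked mechanically. The following is a small sketch, assuming only mass (M), length (L), and time (T) dimensions are needed; the `UNITS` table and the `dimension` helper are hypothetical names written for this illustration, not part of any UCD or units library.

```python
# Dimension exponents (M = mass, L = length, T = time) for a few SI units.
UNITS = {
    "kg": {"M": 1},
    "m": {"L": 1},
    "s": {"T": 1},
    "N": {"M": 1, "L": 1, "T": -2},  # newton = kg m s^-2
    "J": {"M": 1, "L": 2, "T": -2},  # joule  = N m
    "W": {"M": 1, "L": 2, "T": -3},  # watt   = J s^-1
}


def dimension(factors):
    """Combine (unit, power) pairs into a dict of non-zero dimension exponents."""
    total = {}
    for unit, power in factors:
        for base, exponent in UNITS[unit].items():
            total[base] = total.get(base, 0) + exponent * power
    return {base: exp for base, exp in total.items() if exp != 0}


energy_flux = dimension([("J", 1), ("m", -2), ("s", -1)])   # J m^-2 s^-1
wrong_listing = dimension([("W", 1), ("m", -2), ("s", -1)])  # W m^-2 s^-1
```

Here `energy_flux` reduces to M T^-3, matching the derivation, while `wrong_listing` carries an extra inverse time (M T^-4), which is why W m^-2 s^-1 cannot be the unit of an energy flux.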
A few fixes in response to feedback from Bruno
These updates are mostly those from issue #28, but include some changes from issue #29 and feedback from Pierre. There are additional changes because the TeX file that I downloaded from the repo included more changes than the PDF Preview against which I wrote issue #28. I have reviewed and updated those additional changes too, for consistency. I made some significant changes to the UCDs section that I thought about long and hard, and I will explain these and my reasoning in a separate note (probably tomorrow).
One issue I hit is that I had to comment out \makeglossaries ... \printglossaries because the document will not build for me with these in place. TeX fails with
(./HighEnergyObsCoreExt.gls
! Extra }, or forgotten \endgroup.
@endpbox ...finalstrut @arstrutbox \par \egroup
\ST@dimen =\ht \ST@pbox \a...
l.5 ...[]{page}\glsnumberformat{13}}}\glsgroupskip
However, I can't find an extra "}" in the .gls file, so this appears to be a problem that will require more investigation to resolve.