@rowlesmr
Collaborator

PD_CALIB_XCOORD and PD_CALIB_XCOORD_OVERALL were designed as a thing to exist alongside PD_CALIBRATION - where PD_CALIBRATION gave a human-readable equation to calibrate detector channel or position or whatever to 2theta or energy or d-spacing or whatever, PD_CALIB_XCOORD would give a machine-readable equivalent; enumerating this position = that d-spacing, or something. Except that it couldn't really do that. I'm pretty sure that it now can.

The way that it is supposed to be used is:

data_tof_calibration_diffractogram
#
# This diffractogram was collected in order to provide
#  a way to determine d-spacing from time-of-flight
#  as the sample has a well-known unit cell
#
         _pd_diffractogram.id                        TOF_STD
         _pd_calib_xcoord_overall.id                 tof-calibration # the identity of the calibration
         _pd_calib_xcoord_overall.diffractogram_id   TOF_STD # the source of the calibration

         # this is the mapping of TOF to d-spacing derived from the calibration sample
         loop_
         _pd_calib_xcoord.id
         _pd_calib_xcoord.nominal_time_of_flight
         _pd_calib_xcoord.actual_d_spacing
          1   1110.301000     1.489225
          2   1114.742200     1.495170
          3   1119.201170     1.501138
          4   1123.677980     1.507131
          ...

         # Here is the analysed diffraction pattern that was used to get the calibration
         loop_
         _pd_data.id
         _pd_meas.time_of_flight
         _pd_proc.d_spacing
         _pd_proc.intensity_total
         _pd_calc.intensity_total
          i   1110.301000     1.489225     0.600083   0.553025
          ii  1114.742200     1.495170     0.635318   0.571286
          iii 1119.201170     1.501138     0.646909   0.593895
          iv  1123.677980     1.507131     0.655807   0.620014
          ...


data_tof_unknown_sample_diffractogram
         _pd_diffractogram.id          SAMPLE_1
         _pd_calib_xcoord_overall.id   tof-calibration # this is supposed to tell you to look ^ for 
                                                       # the _pd_calib_xcoord.* details on how TOF
                                                       # should be turned into d-spacing.
         loop_
         _pd_data.id
         _pd_meas.time_of_flight
         _pd_proc.intensity_total
          1   1137.216120     0.746836
          2   1141.764990     0.728226
          3   1146.332050     0.734770
          4   1150.917370     0.767760
          ...

Is this the correct way to link the calibration loop to the data to which it was (or should be) applied? Or do I need to make a separate data item that holds a _pd_calib_xcoord_overall.id value?

*There is a fall-back in the category description that if the points in the diffractogram to be calibrated don't exist in the calibration loop, then you need to interpolate.
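
As a rough illustration of that fall-back, here is a minimal Python sketch that maps a time-of-flight to a d-spacing from the four calibration rows shown above. The function name `tof_to_d_spacing` and the choice of linear interpolation are assumptions made for illustration; the category description only says that interpolation is needed, not which scheme to use.

```python
from bisect import bisect_left

# (nominal TOF, actual d-spacing) pairs copied from the calibration loop above.
CALIBRATION = [
    (1110.301000, 1.489225),
    (1114.742200, 1.495170),
    (1119.201170, 1.501138),
    (1123.677980, 1.507131),
]


def tof_to_d_spacing(tof, table=CALIBRATION):
    """Map a time-of-flight to a d-spacing using a calibration table.

    Exact matches are returned directly. Other values are linearly
    interpolated between the two bracketing points (an assumed scheme).
    Values outside the tabulated range raise ValueError instead of
    being extrapolated.
    """
    tofs = [t for t, _ in table]  # table must be sorted by TOF
    if not tofs[0] <= tof <= tofs[-1]:
        raise ValueError(f"TOF {tof} is outside the calibrated range")
    i = bisect_left(tofs, tof)
    if tofs[i] == tof:
        return table[i][1]
    (t0, d0), (t1, d1) = table[i - 1], table[i]
    return d0 + (d1 - d0) * (tof - t0) / (t1 - t0)
```

With only the four rows listed, the SAMPLE_1 points (TOF 1137 and above) fall outside the table and would raise `ValueError`; the full calibration loop, elided above, would need to cover them.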

@jamesrhester
Contributor

The first issue in this example is that _pd_calib_xcoord.diffractogram_id is the diffractogram that is calibrated, not the one that was used for calibration, yet the first data block has an implicit value for this of TOF_STD, not SAMPLE_1. Confusingly, for the overall category, PCXO.diffractogram_id is the one used for calibration, so that is OK.

The second issue is that the overall category has no explicit link to the non-overall category. So if these two loops appeared in different blocks, we don't know that they are talking about the same thing.

How to fix this?

If we want the overall information to be linked with the point-by-point information, then we link to PCXO.id in PCX. This probably means that we want PCXO to be a Set category so we're not repeating PCXO.id on every line. We drop PCXO.detector_id as a key, and therefore require that each detector has a different PCXO.id associated with it. We now do not need a diffractogram_id in both PCX and PCXO. We leave the one in PCXO, and then link to PCXO.id in pd_diffractogram. How does that sound?

@rowlesmr
Collaborator Author

Are we looking at the same version?

PD_CALIB_XCOORD (pcx)

  • Loop category
  • data names:
    • _pcx.actual_*
      • the calibrated values that are mapped to the nominal values in the same loop entry
    • _pcx.detector_id
      • the detector to which the nominal and/or actual values apply
    • _pcx.id: KEY
      • an arbitrary ID
    • _pcx.nominal_*
      • the nominal values that are mapped to the calibrated values in the same loop entry
    • _pcx.overall_id: KEY
      • a link to _pcxo.id

PD_CALIB_XCOORD_OVERALL (pcxo)

  • Set category
  • data names
    • _pcxo.diffractogram_id
      • the diffractogram from which the calibration is taken, if a diffractogram was used
    • _pcxo.id: KEY
      • an arbitrary ID -> linked to from PCX
    • _pcxo.phase_id
      • the phase used in the calibration, if a phase was used.
    • _pcxo.special_details
      • any special stuff

.

I think this structure is essentially what you said?

.

I think the key thing is that we need to specify the detector_id in both the calibration and unknown data to act as the link.

Here's a more fully fleshed version, with an instrument:

  • Instrument "super_TOF" has three detectors, A, B, and C.
  • A diffractogram was collected on super_TOF and it was used to create a calibration.
  • Later on, another diffractogram was collected on super_TOF. Can we find the calibration?
    • Does a calibration exist? We look for a PCXO entry, and find one (or many)
    • we look at the diffractogram_id associated with that PCXO_id
    • We look at the instr_id associated with that diffractogram_id
    • It's a match!
    • we look in the PCX table for overall_id values that match the PCXO_id for the matching instr_id
    • we take all of those
    • we match up detector_ids between the unknown and calibration
    • and do calibration things

I think that works.

data_tof_instr
    _pd_instr.id          super_TOF
    _pd_instr.geometry    complicated
    _pd_instr.location    "Somewhere in the world"

    loop_
    _pd_instr_detector.id
    _pd_instr.dist_spec_detc
    _pd_instr_detector.instr_id
        A   1000   super_TOF
        B   1500   super_TOF
        C   2000   super_TOF


data_tof_calibration_diffractogram
#
# This diffractogram was collected in order to provide
#  a way to determine d-spacing from time-of-flight
#  as the sample has a well-known unit cell
#
    _pd_diffractogram.id                        TOF_STD
    _pd_diffractogram.instr_id                  super_TOF
    _pd_calib_xcoord_overall.id                 tof-calibration # the identity of the calibration
    _pd_calib_xcoord_overall.diffractogram_id   TOF_STD # the source of the calibration

    # this is the mapping of TOF to d-spacing derived from the calibration sample
    loop_
    _pd_calib_xcoord.id
    _pd_calib_xcoord.detector_id
    _pd_calib_xcoord.nominal_time_of_flight
    _pd_calib_xcoord.actual_d_spacing
    #_pd_calib_xcoord.overall_id is a key data name, and autopopulated from the above value of _pcxo.id
        1   A   1110.301000     1.489225
        2   A   1114.742200     1.495170
        3   B   1119.201170     1.501138
        4   B   1123.677980     1.507131
        #...

    # Here is the analysed diffraction pattern that was used to get the calibration
    loop_
    _pd_data.id
    _pd_meas.detector_id
    _pd_meas.time_of_flight
    _pd_proc.d_spacing
    _pd_proc.intensity_total
    _pd_calc.intensity_total
        i   A   1110.301000     1.489225     0.600083   0.553025
        ii  A   1114.742200     1.495170     0.635318   0.571286
        iii B   1119.201170     1.501138     0.646909   0.593895
        iv  B   1123.677980     1.507131     0.655807   0.620014
        #...


data_tof_unknown_sample_diffractogram
    _pd_diffractogram.id          SAMPLE_1
    _pd_diffractogram.instr_id    super_TOF

    loop_
    _pd_data.id
    _pd_meas.detector_id
    _pd_meas.time_of_flight
    _pd_proc.intensity_total
        1   C   1137.216120     0.746836
        2   A   1141.764990     0.728226
        3   A   1146.332050     0.734770
        4   C   1150.917370     0.767760
        #...
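
The lookup steps listed before the example can be sketched in Python. This is only an illustration under stated assumptions: the dictionaries and the function `find_calibration_points` are hypothetical in-memory stand-ins for the CIF categories, filled with the values from the example blocks, and are not part of any existing CIF library.

```python
# Hypothetical in-memory tables mirroring the example blocks above.

# _pd_diffractogram.id -> _pd_diffractogram.instr_id
diffractograms = {
    "TOF_STD": "super_TOF",
    "SAMPLE_1": "super_TOF",
}

# _pd_calib_xcoord_overall.id -> _pd_calib_xcoord_overall.diffractogram_id
calib_overall = {
    "tof-calibration": "TOF_STD",
}

# Rows of the _pd_calib_xcoord loop (overall_id filled in explicitly).
calib_points = [
    {"id": "1", "overall_id": "tof-calibration", "detector_id": "A",
     "nominal_tof": 1110.301000, "actual_d": 1.489225},
    {"id": "2", "overall_id": "tof-calibration", "detector_id": "A",
     "nominal_tof": 1114.742200, "actual_d": 1.495170},
    {"id": "3", "overall_id": "tof-calibration", "detector_id": "B",
     "nominal_tof": 1119.201170, "actual_d": 1.501138},
    {"id": "4", "overall_id": "tof-calibration", "detector_id": "B",
     "nominal_tof": 1123.677980, "actual_d": 1.507131},
]


def find_calibration_points(diffractogram_id, detector_ids):
    """Return calibration rows usable for a diffractogram, grouped by detector."""
    # The instrument on which the unknown diffractogram was collected.
    instr_id = diffractograms[diffractogram_id]
    # PCXO entries whose source diffractogram came from the same instrument.
    matching = {
        pcxo_id
        for pcxo_id, source in calib_overall.items()
        if diffractograms.get(source) == instr_id
    }
    # Keep PCX rows for those PCXO ids, matched up on detector_id.
    by_detector = {det: [] for det in detector_ids}
    for row in calib_points:
        if row["overall_id"] in matching and row["detector_id"] in by_detector:
            by_detector[row["detector_id"]].append(row)
    return by_detector


points = find_calibration_points("SAMPLE_1", {"A", "C"})
```

For the rows shown, `points["A"]` holds the two detector-A calibration rows and `points["C"]` is empty, because the truncated example loop lists no rows for detector C.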

@rowlesmr rowlesmr marked this pull request as draft June 23, 2025 05:39
@jamesrhester
Contributor

Sorry, my mistake, I hadn't realised the example was in together with the massive rearrangement of PCX/O! Will review carefully now.

@rowlesmr
Collaborator Author

Also, in light of our previous discussion on PEAK keys and data items, I probably need to rethink the keys here.

id and overall_id is wrong. I think perhaps id and instr_id? Can't use diffractogram_id, as that is a reference to the diffractogram that may have been used to create the calibration. Also, I think that using instr_id also requires that PD_INSTR_DETECTOR has instr_id.

Contributor

@jamesrhester jamesrhester left a comment

Looks good. Just waiting for resolution of the pd_instr, pd_instr_detector linkage to understand whether or not that will affect the contents of pd_calib_xcoord_overall.

@rowlesmr
Collaborator Author

Looks good. Just waiting for resolution of the pd_instr, pd_instr_detector linkage to understand whether or not that will affect the contents of pd_calib_xcoord_overall.

my current proposal is in #204; pd_instr_detector gets pd_instr_detector.instr_id to link to the instrument the detectors are in. Question on whether it should be a key or not.

@rowlesmr
Collaborator Author

my current proposal is in #204; pd_instr_detector gets pd_instr_detector.instr_id to link to the instrument the detectors are in. Question on whether it should be a key or not.

It should not be a key; the detectors can exist independent of the instrument on which they're fixed for a particular measurement.

@rowlesmr rowlesmr marked this pull request as ready for review July 8, 2025 13:58
@vaitkus
Collaborator

vaitkus commented Jul 8, 2025

Data names _pd_calib_xcoord_overall.id and _pd_calib_xcoord.overall_id describe two distinct data items, however, the names only differ in the placement of the dot . symbol. I would say that this is generally not a good idea, as people tend to sometimes automatically replace the dot with an underscore (at least mentally, although some of our software currently does that automatically). Maybe something like _pd_calib_xcoord_overall.overall_id would work as an alternative for _pd_calib_xcoord_overall.id? Although that is quite verbose.

@rowlesmr
Collaborator Author

rowlesmr commented Jul 9, 2025

This is also the case in _pd_peak.overall_id and _pd_peak.overall_id.

@rowlesmr rowlesmr merged commit fc52708 into COMCIFS:master Jul 9, 2025
3 checks passed
@vaitkus
Collaborator

vaitkus commented Jul 9, 2025

@rowlesmr wrote:

This is also the case in _pd_peak.overall_id and _pd_peak.overall_id.

they-are-the-same

In all seriousness, the _pd_peak.overall_id and _pd_peak_overall.id seem to have also been added in this iteration of the dictionary (version 2.5.0 ) so it might also still be possible to rename them. I will raise a separate issue to see if anybody else sees this as a problem.

@rowlesmr
Collaborator Author

rowlesmr commented Jul 9, 2025

Ok, I didn't realise that.

That is a fair concern.

@jamesrhester
Contributor

Definitely a problem if two data names look the same when . is replaced by _. Avoiding this behaviour is a legacy issue to try to help software that has "solved" the new dotted data name "problem" by just changing the names using the above substitution.
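
A small Python sketch of how such clashes could be detected: it flattens each data name by replacing `.` with `_` and reports any groups that end up identical. The function name `find_flattened_collisions` is hypothetical, and the name list is just the handful of data names mentioned in this thread, not the full dictionary.

```python
from collections import defaultdict


def find_flattened_collisions(data_names):
    """Group data names that become identical once '.' is replaced by '_'."""
    groups = defaultdict(set)
    for name in data_names:
        # CIF data names are case-insensitive, so compare in lower case.
        groups[name.lower().replace(".", "_")].add(name)
    # Keep only flattened names shared by more than one distinct data name.
    return {flat: sorted(names) for flat, names in groups.items() if len(names) > 1}


names = [
    "_pd_calib_xcoord_overall.id",
    "_pd_calib_xcoord.overall_id",
    "_pd_peak_overall.id",
    "_pd_peak.overall_id",
    "_pd_diffractogram.id",
]
collisions = find_flattened_collisions(names)
```

Here `collisions` has two entries, one for the `_pd_calib_xcoord` pair and one for the `_pd_peak` pair; `_pd_diffractogram.id` does not clash with anything in this list.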
