Skip to content

Conversation

@bhokansonfasig
Copy link

@bhokansonfasig bhokansonfasig commented Nov 1, 2019

This pull request addresses a bug in the ARA_bicone6in_output.txt and ARA_bicone6in_output_updated2016.txt files. For some reason phases with value <=-100 degrees end up losing their negative sign (e.g. -99 is okay, but -101 becomes 101 instead). It's not clear to me why this happens, but it is evident by plotting the phases for a given angle across frequencies (see plot below).

I fixed the phases in ARA_bicone6in_output_updated2016.txt and created a new file ARA_bicone6in_output_updated2019.txt with these values. The fix was implemented based on the knowledge that the error only occurs for phases <=-100 degrees by checking for the minimum difference between the inverted phase and the previous two phase points.

The plots below illustrate the difference in phases between the "old file" (2016) and the "new file" (2019) for a couple of different angles:
vpol_phases_30
vpol_phases_90

I also checked on the phase values of ARA_dipoletest1_output_updated2016.txt, but found that this file does not show the same error. In fact, for most angles the phase never drops below -100 degrees anyway, except for the single case of theta=180. The plots below show no difference after applying my script to fix phase angles:
hpol_phases_30
hpol_phases_90
hpol_phases_180

Since there was no change, ARA_dipoletest1_output_updated2016.txt was left alone.

In regards to the physics effect of this change, the new corrected phases result in less "ringing" in the waveform. This is to be expected, since large spikes in phase angle (as in the old files) will result in ringing. The plots below illustrate the effect, taking a sample Askaryan signal and passing it through just the directional gain, and then the entire Vpol signal chain:
input_signal
vpol_partial
vpol_final

In the end, I expect the amplitudes of the pulses on the antennas to change slightly, and this may result in a small increase in the effective volume.

@bhokansonfasig
Copy link
Author

The following is a dump of the python script used to fix the phase values. The most relevant function being the fix_phases function:

"""Script for fixing bad phases in ARA antenna directional gain files"""

import sys
import os.path
import numpy as np


def read_directionality_data(filename):
    """Gather antenna directionality data from a data file. Returns the data as
    a dictionary with keys (freq, theta, phi) and values (phase).
    Also returns a set of the frequencies appearing in the keys, a set of
    thetas appearing in the keys, and a set of phis appearing in the keys."""
    data = {}
    freqs = set()
    thetas = set()
    phis = set()
    freq = 0
    with open(filename) as f:
        for line in f:
            words = line.split()
            if line.startswith('freq'):
                freq = 1
                if words[-1]=="Hz":
                    pass
                elif words[-1]=="kHz":
                    freq *= 1e3
                elif words[-1]=="MHz":
                    freq *= 1e6
                elif words[-1]=="GHz":
                    freq *= 1e9
                else:
                    raise ValueError("Cannot parse line: '"+line+"'")
                freq *= float(words[-2])
                freqs.add(freq)
            elif line.startswith('SWR'):
                swr = float(words[-1])
            elif len(words)==5 and words[0]!="Theta":
                theta = words[0]
                thetas.add(theta)
                phi = words[1]
                phis.add(phi)
                # db_gain = words[2]
                # gain = words[3]
                phase = words[4]
                data[(freq, theta, phi)] = phase
    return data, freqs, thetas, phis


def fix_phases(phases):
    """Returns a list of corrected phases"""
    fixed = []
    for i in range(len(phases)):
        phase = float(phases[i])
        # The phases which need to be flipped are always 100 degrees or above
        if phase<100 or i<=1:
            fixed.append(str(phase))
            continue
        prev = float(fixed[i-1])
        prev2 = float(fixed[i-2])
        # If the phase should be flipped, its flipped form will be closer to
        # the previous phase (as a function of frequency) than its non-flipped
        # form.
        # Also its flipped form's slope to the previous point should be closer
        # to the slope from the previous two points than its non-flipped form
        # moved down or up by 360 degrees.
        if (np.abs(prev-phase)>np.abs(prev+phase) and
                np.abs((prev2-prev)-(prev-phase+360)) >
                np.abs((prev2-prev)-(prev+phase)) and
                np.abs((prev2-prev)-(prev-phase-360)) >
                np.abs((prev2-prev)-(prev+phase))):
            fixed.append(str(-phase))
        else:
            fixed.append(str(phase))
    return fixed



def write_fixed_phases(bad_phase_file, fixed_phase_file):
    bad_data, freqs, thetas, phis = read_directionality_data(bad_phase_file)
    fixed_data = {}
    for theta in thetas:
        for phi in phis:
            # Get the phases as a funciton of frequency
            bad_phases = [bad_data[(f, theta, phi)] for f in sorted(freqs)]
            # Fix the phases
            fixed_phases = fix_phases(bad_phases)
            # Write the gains and corrected phases to the fixed_data list
            for f, phase in zip(sorted(freqs), fixed_phases):
                fixed_data[(f, theta, phi)] = phase

    # Now rewrite the bad_phase_file to the fixed_phase_file with the old
    # structure but using the new data
    infile = open(bad_phase_file, 'r')
    outfile = open(fixed_phase_file, 'w')
    for line in infile:
        words = line.split()
        if line.startswith('freq'):
            freq = 1
            if words[-1]=="Hz":
                pass
            elif words[-1]=="kHz":
                freq *= 1e3
            elif words[-1]=="MHz":
                freq *= 1e6
            elif words[-1]=="GHz":
                freq *= 1e9
            else:
                raise ValueError("Cannot parse line: '"+line+"'")
            freq *= float(words[-2])
            outfile.write(line)
        elif len(words)==5 and words[0]!="Theta":
            theta = words[0]
            phi = words[1]
            db_gain = words[2]
            gain = words[3]
            bad_phase = words[4]
            fixed_phase = fixed_data[(freq, theta, phi)]
            outfile.write(line.replace(bad_phase, fixed_phase))
        else:
            outfile.write(line)
    infile.close()
    outfile.close()




if __name__ == '__main__':
    # Check command-line arguments
    if len(sys.argv)!=3:
        print("Usage:", sys.argv[0], "INPUT OUTPUT", file=sys.stderr)
        print("Corrects problems with antenna phases which should be -100 or",
              "lower in INPUT file and writes them to OUTPUT, keeping all",
              "other data intact", file=sys.stderr)
        sys.exit(1)
    input_file = sys.argv[1]
    output_file = sys.argv[2]
    if not os.path.exists(input_file):
        print("Couldn't find input file", input_file, file=sys.stderr)
        sys.exit(2)
    elif os.path.exists(output_file):
        print("Output file", output_file, "already exists. Stopping to prevent",
              "accidental overwriting of the file", file=sys.stderr)
        sys.exit(3)

    # Fix the phases in the file
    write_fixed_phases(input_file, output_file)

@uzairlatif90
Copy link

Hello Ben! This looks fine to me honestly though Jorge or Brian would be better placed to check it since they have worked with AraSim more than I have.

@clark2668
Copy link
Collaborator

clark2668 commented Nov 2, 2019

This looks good! Thanks Ben! Three quick questions:

  1. the problem basically looks like the phase file "wraps" at -100 instead of -180. And you essentially fixed that. Is that a fair characterization?
  2. I see you added a new file _updated2019. But I think which file is loaded is hard coded into AraSim. Did you change that also? Or maybe it should be loaded as an extra mode (yuck).
  3. Could we get a veff comparison between the old and new file at a few energies? This probably isn't quite as urgent.

@bhokansonfasig
Copy link
Author

  1. the problem basically looks like the phase file "wraps" at -100 instead of -180. And you essentially fixed that. Is that a fair characterization?

I don't think it's quite right to characterize it as wrapping at -100, because the slope is also inverted from what it should be. Maybe another example would make it a bit clearer:

If there was a sequence of phases in the file that should have been -98, -99, -100, -101, -102, the way they ended up being stored was instead -98, -99, 100, 101, 102. It seems like there was some sort of string truncation or something that dropped the negative sign. If it were a wrapping issue I would expect the file contents to be something more like -98, -99, 180, 179, 178.

  1. I see you added a new file _updated2019. But I think which file is loaded is hard coded into AraSim. Did you change that also? Or maybe it should be loaded as an extra mode (yuck).

I didn't change the file names in the code yet. In fact, when I just went to look it seems like the file that's currently loaded is still ARA_bicone6in_output.txt. I'm not sure what the history of the _updated2016 files is, but I would have expected those to have become the new "defaults" after they were added. The fact that they weren't makes me wonder if there was some reason for not changing those file names. I can change which file is loaded to the new version though if you think that's best.

  1. Could we get a veff comparison between the old and new file at a few energies? This probably isn't quite as urgent.

I agree that's probably a good idea, but I'd prefer if someone with more AraSim experience could make this comparison. Any volunteers?

@clark2668
Copy link
Collaborator

  1. oh yeah, I see your point. that is very strange. It must have been string truncation or something as you say, I can't imagine a different reason...

  2. It's possible that the new file would have been loaded in an alternative mode? We definitely want this fixed long term, but if the change wasn't hardcoded in yet, we probably don't need to do so just yet. What do you think?

  3. Sure, fair enough. Not sure who we can tap though, we are pretty resource constrained going into proposal season. It's really just a question of running jobs, but I wanna do it systematically enough. Does this affect the comparison you and Jorge are doing at all?

@bhokansonfasig
Copy link
Author

bhokansonfasig commented Nov 5, 2019

It's possible someone has a version of AraSim with another antenna mode that uses the _updated2016 files, but I don't see any reference to them in this version. I agree that since the original files are still hardcoded, maybe it would be best not to change that.

Actually, expecting that everyone will understand to change it themselves is probably unreasonable, and I think it's reasonably clear that the newer files are "more correct". Still there should be some test that the newer files don't break anything. Maybe after some testing of the simulation (e.g. the effective volume tests) we can convince ourselves that it's okay to change which files are hardcoded. If we don't make the change, it's possible it will just never happen.

I can understand that probably there is little to no manpower left for actually running through the changes it makes to effective volume. Unfortunately I think the overhead for learning to run and interpret the AraSim outputs is probably too high on my end. Maybe the effective volume change can be left to a less busy time. (If one ever comes!)

For the comparison Jorge and I are running this shouldn't be a problem. I think as long as we are using the same file then that element of the comparison is known to be the same and (ideally) won't affect the outcome.

Concluding thoughts since the above ended up being a bit more like a stream of consciousness than I expected it to:
My recommendation would be to merge this now, without changes to the hardcoded filenames used, and then open an issue which requests a check on the effective volume and a change to the hardcoded filenames if the change in effective volumes is reasonable.

@uzairlatif90
Copy link

I personally think that people before might have done the same i.e. left the file for later check and then never really got to it and then it was forgotten. It might be me being naive but I do not see how putting in the "more correct" file name might break anything in the simulation. I think we should do the merge and then change the hardcoded filenames now so that no one else in the future has to go through the hassle of going through this exercise. The error in the code (100 instead of 180) to me seems clearly like a transcribing error. I would vote for changing the hardcoded filenames too. I would leave it to Brian to make the decision on this.

@clark2668
Copy link
Collaborator

clark2668 commented Nov 6, 2019

Thanks for this discussion. I don't think I'm worried about it breaking the simulation. I'm worried about it changing the veff in a way that is hard to predict or understand looking back in retrospect. But also, it's strictly more correct, so maybe that's life. And it probably goes in the direction we want, if Ben's prediction is right.

In any case, I don't wanna be the final arbiter. Let me see if I can find the bandwidth in the next few days to throw the simulation jobs and do a veff comparison. Then we can say "Ben found a problem, we assessed the effect of the change, and we're pushing it." Does that sound reasonable?

@bhokansonfasig
Copy link
Author

That sounds good to me. Thanks for taking on this comparison on top of what I know is a busy schedule for you. I'll go ahead and make the filename changes in this branch so that you can simply submit jobs from the two different branches.

@clark2668
Copy link
Collaborator

Great, thanks Ben. This is super helpful.

Copy link
Member

@toej93 toej93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine. Brian just submitted sims to the Michigan State cluster, as Ohio's one is swamped. We'll see how this changes the veff and the approve the merge if nothing substantial changes.

@toej93
Copy link
Member

toej93 commented Nov 11, 2019

This is what we found by comparing the two implementations.

veff_phases

Seems that the fixing doesn't change the veffs (at least noticeably). Should we merge the pull request?
@clark2668 @bhokansonfasig

@bhokansonfasig
Copy link
Author

That looks good. Do you think you could also plot a ratio or relative difference between the two lines? It might make any differences easier to see

@clark2668
Copy link
Collaborator

Agreed. This effect is so small, I'm suddenly doubtful that I used the change correctly!

@bhokansonfasig, the new file will be loaded with ANTENNA_MODE=1, right?

@bhokansonfasig
Copy link
Author

As far as I can see it should. You could add a quick debugging line in ReadVgain of Detector.cc that prints out the filename just to be sure though

@clark2668
Copy link
Collaborator

@toej93, can we get the ratio plot made?
I'll try to double check the gain file stuff today.

@clark2668
Copy link
Collaborator

I just checked, I think somehow the git checkout didn't work, and those might me the same. I've downloaded new clean copies and will throw new jobs today. Hang tight...

@toej93
Copy link
Member

toej93 commented Dec 10, 2019

These plots correspond to the new files that Brian just produced.
veff_comparison
ratio

@clark2668
Copy link
Collaborator

I can't work out if the difference is statistically significant... And I'm a little surprised that fixing the bug makes it worse, not better. @toej93 , stupid question, but are you sure the labels are right?

@toej93
Copy link
Member

toej93 commented Dec 11, 2019

I just double checked, and the labels are fine.

@toej93
Copy link
Member

toej93 commented Jul 14, 2020

We never finished reviewing this. We need to come back to it.

@clark2668
Copy link
Collaborator

What do we need to do to resolve this? Do we just need more statistics? I can also perform the test "from scratch" again if necessary.

@toej93
Copy link
Member

toej93 commented Jul 31, 2020

What do we need to do to resolve this? Do we just need more statistics? I can also perform the test "from scratch" again if necessary.

More statistics, or re-simulate from scratch.

@clark2668
Copy link
Collaborator

Okay, let me see if I can do both.

@paramitadasgupta
Copy link

corrected_error_bar_veff
error_propagated
We plot the veff as a function of energy with the new (after the phases were fixed) and old (before the phases were fixed) MC files that Brian created. The ratio (veff_new-veff_old)/veff_old is also plotted as a function of energy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants