IndexError: index 79 is out of bounds for axis 0 with size 79 #20

@bazhaoyu

I am currently running killMS on a server with only 32 CPU cores and 128 GB of memory, using the following command:

kMS.py --MSName $msfile --FieldID 0 --SolverType KAFCA --PolMode Scalar --BaseImageName image_DI_Clustered.DeeperDeconv --dt 5 --NCPU 30 --OutSolsName DD0 --NChanSols 5 --InCol CORRECTED_DATA --TChunk 0.2 --BeamModel FITS --FITSParAngleIncDeg 0.5 --FITSFile=$BEAMfits --CenterNorm 1 --FITSFeed xy --FITSFeedSwap 1 --ApplyPJones 1 --FlipVisibilityHands 1 --NChanBeamPerMS 2

When using TChunk=0.2, I encountered the following error:
slurmstepd: error: Detected 7 oom-kill event(s) in StepId=86546498.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.

To reduce memory usage, I then tried TChunk=0.1. This setting worked until the final time chunk [4.30, 4.39] (the total observation time for this target is 4.38 hours), where the run failed with the following error:

 - 23:44:39 - ClassVisServer               | Reading next data chunk in [ 4.30,  4.39] hours (column CORRECTED_DATA)
 - 23:44:39 - ClassMS                      |    Reading rows [2653500 -> 2723040]
 - 23:44:42 - ClassMS                      | Reading uvw_dt column
 - 23:44:43 - ClassMS                      | Data has only two polarisation, adapting shape
 - 23:44:45 - ClassMS                      | Flagging the zeros-weighted visibilities
 - 23:44:51 - ClassMS                      |   Increase in flag fraction: 0.022015
 - 23:44:54 - ClassVisServer               | Channels are equidistant, can go fast
 - 23:44:54 - ClassVisServer               | Flagging baselines with w > 5.539633 km
 - 23:44:54 - ClassVisServer               |   w-Flagged   0.0% of the data
 - 23:45:04 - ClassVisServer               | Estimating Beam directions at the center of the individual facets areas
 - 23:45:04 - ClassFITSBeam                | Using station-independent E Jones for the array
 - 23:45:04 - ClassFITSBeam                | polarization basis specified by FITSFeed parameter: xx xy yx yy
 - 23:45:04 - ClassFITSBeam                | swapping feeds as per FITSFeedSwap setting
 - 23:26:32 - ClassFITSBeam                | All stations: beam patterns /meerkat_pb/meerkat_pb_jones_cube_97channels_yy_re.fits /meerkat_pb/meerkat_pb_jones_cube_97channels_yy_im.fits already in memory
 - 23:26:32 - ClassFITSBeam                | All stations: beam patterns /meerkat_pb/meerkat_pb_jones_cube_97channels_yx_re.fits /meerkat_pb/meerkat_pb_jones_cube_97channels_yx_im.fits already in memory
 - 23:26:32 - ClassFITSBeam                | All stations: beam patterns /meerkat_pb/meerkat_pb_jones_cube_97channels_xy_re.fits /meerkat_pb/meerkat_pb_jones_cube_97channels_xy_im.fits already in memory
 - 23:26:32 - ClassFITSBeam                | All stations: beam patterns /meerkat_pb/meerkat_pb_jones_cube_97channels_xx_re.fits /meerkat_pb/meerkat_pb_jones_cube_97channels_xx_im.fits already in memory
 - 23:45:04 - ClassFITSBeam                | computing beam sample times for 69540 timeslots
 - 23:45:04 - ClassFITSBeam                |   DtBeamMin=5.00 min results in 1 samples
 - 23:45:04 - ClassFITSBeam                |   FITSParAngleIncrement=0.50 deg results in 1 samples
 - 23:45:04 - ClassVisServer               | Update FITS beam in 190 dirs, 2 times, 2 freqs ...
 - 23:45:04 - ClassVisServer               |        .... done Update beam
 - 23:45:04 - ClassJonesDomains            | Building VisToJones time mapping...
 - 23:45:04 - ClassJonesDomains            | Building VisToJones freq mapping...
 - 23:45:36 - ClassWirtingerSolver         | DT=306.321903, dt=300.000000, nt=2.000000

Traceback (most recent call last):
  File "/public/home/danhu/.local/ddc-env/bin/kMS.py", line 8, in <module>
    sys.exit(kms_main())
  File "/public/home/danhu/.local/ddc-env/lib/python3.10/site-packages/killMS/__main__.py", line 3, in kms_main
    kMS.driver()
  File "/public/home/danhu/.local/ddc-env/lib/python3.10/site-packages/killMS/kMS.py", line 1335, in driver
    main(OP=OP,MSName=MSName)
  File "/public/home/danhu/.local/ddc-env/lib/python3.10/site-packages/killMS/kMS.py", line 758, in main
    Solver.doNextTimeSolve_Parallel(Parallel=True)
  File "/public/home/danhu/.local/ddc-env/lib/python3.10/site-packages/killMS/Wirtinger/ClassWirtingerSolver.py", line 898, in doNextTimeSolve_Parallel
    Res=self.setNextData()
  File "/public/home/danhu/.local/ddc-env/lib/python3.10/site-packages/killMS/Wirtinger/ClassWirtingerSolver.py", line 532, in setNextData
    self.AppendGToSolArray()
  File "/public/home/danhu/.local/ddc-env/lib/python3.10/site-packages/killMS/Wirtinger/ClassWirtingerSolver.py", line 1252, in AppendGToSolArray
    self.SolsArray_t0[self.iCurrentSol]=t0
IndexError: index 79 is out of bounds for axis 0 with size 79
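
The last line of the traceback is NumPy's standard message for an integer index one past the end of an array. An array of size 79 has valid indices 0 to 78, so index 79 means an 80th write into 79 preallocated slots. Below is a minimal sketch that reproduces the message independently of killMS. The names `n_slots` and `sols_t0` are hypothetical, and how killMS actually sizes `SolsArray_t0` is not visible in the log above.

```python
import numpy as np

# Hypothetical stand-in for a preallocated solutions array (not killMS code).
n_slots = 79
sols_t0 = np.zeros(n_slots)   # valid indices are 0 .. n_slots - 1

# Filling every preallocated slot works.
for i in range(n_slots):
    sols_t0[i] = float(i)

# One more write than there are slots raises the IndexError.
try:
    sols_t0[n_slots] = 0.0
    message = ""
except IndexError as exc:
    message = str(exc)

print(message)
```

If this is the mechanism, the number of solution intervals actually produced over the run exceeded the number estimated when the array was allocated, which would be consistent with the failure appearing only in the final time chunk.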

I would like to ask:

  1. Are there any conflicting or improper parameter settings in the command above?
  2. How is the index 79 determined?
  3. Additionally, when this error occurred, the task did not terminate automatically. Instead, it became unresponsive and remained stuck on the server.

Any insights or suggestions would be greatly appreciated.
