Skip to content

Conversation

@tobyallwood
Copy link
Collaborator

@tobyallwood tobyallwood commented Aug 6, 2025

An update to issue #1171 so now errors are logged along with the line number that caused the exception. This should make it easier to locate issues within the try: section.


Before submitting a Pull Request please check the following.

  • Existing tests pass.
  • Documentation has been updated and builds. Remember to update as required...
    • docs/configuration.md
    • docs/usage.md
    • docs/data_dictionary.md
    • docs/advanced.md and new pages it should link to.
  • Pre-commit checks pass.
  • New functions/methods have typehints and docstrings.
  • New functions/methods have tests which check the intended behaviour is correct.

Optional

topostats/default_config.yaml

If adding options to topostats/default_config.yaml please ensure.

  • There is a comment adjacent to the option explaining what it is and the valid values.
  • A check is made in topostats/validation.py to ensure entries are valid.
  • Add the option to the relevant sub-parser in topostats/entry_point.py.

Copy link
Collaborator

@ns-rse ns-rse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tobyallwood

Thanks for taking on this issue.

I think a more generic solution would be to set the formatting for the LOGGER to include the line number.

As noted we should have this already as we setup the formatting in topostats/logs/logs.py as noted inline.

It would be worth investigating whether this works as it should do and if doesn't working out why and fixing it is a more generic solution that would work wherever LOGGER.error() is called and avoids having to add the same functionality at all instances where we might want to have the line number repeated.

Long term I thought we had an outstanding issue to migrate to using loguru but I can't seem to find it (I think @SylviaWhittle and I discussed this as its used in AFMReader) so I've just created #1194 so we don't miss it in the future. Loguru too includes the ability to include line numbers in output (by including logger.add(..., backtrace = True) I think). But don't worry about that now, stay focused on this issue 😉

# pylint: disable=fixme
# FIXME : The calculate_stats() and calculate_aspect_ratio() raise this error when linting, could consider putting
# variables into dictionar, see example of breaking code out to staticmethod extremes() and returning a
# variables into dictionary, see example of breaking code out to staticmethod extremes() and returning a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

_, _, e_traceback = sys.exc_info()
e_line_no = e_traceback.tb_lineno
LOGGER.error(
f"[{filename}] : An error occurred whilst calculating grain statistics on line {e_line_no}: {e}\nReturning empty dataframe."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logger is setup in topostats/logs/logs/py and we do set LOG_ERROR_FORMATTER to include %(lineno)s so I'm not sure whether you need to import sys and get a traceback to include the line-number. It might just be sufficient to LOGGER.info > LOGGER.error and the line number would be reported.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ns-rse aah ok I see, would the line number be the line of the logging message or the line where the error occured?

Copy link
Collaborator Author

@tobyallwood tobyallwood Aug 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ns-rse the reason I included the line number is that the info log line number automatically included is the line of the LOGGER.info() output, but not the line that the actual error occurs on and because the try catch statement includes 40+ lines it can still be difficult to pinpoint exactly where the error appears.

For example, if I change

grainstats_df["basename"] = basename.parent
grainstats_df["class_name"] = grainstats_df["class_number"].map(class_names)

to

grainstats_df["basename"] = basename.parent
grainstats_df["class_name"] = grainstats_df["class_number"].map(class_names)
grainstats_df = None

(lines 382 - 384)
in order to force an error, this is the error message when sys linenos are included:

[Mon, 18 Aug 2025 15:30:37] [INFO ] [topostats] [processing.py] [391] [minicircle] : An error occurred whilst calculating grain statistics on line 385: object of type 'NoneType' has no len()

[391] is the LOGGER.info() line, but the error occurs on line 385:
Line 385:
LOGGER.info(f"[{filename}] : Calculated grainstats for {len(grainstats_df)} grains.")

Would it still be fine to remove the sys lineno and just have the error message? I feel like it might not be clear enough but if you think that's still fine I'm all good with it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I'd be inclined to pepper the code with more LOGGER.error() or LOGGER.debug() messages so that we can get finer grained information on where errors are occurring. It might be too much to expect people to re-run code with a more verbose log-level though so perhaps using LOGGER.info() might be more appropriate though.

Its a bit of a mess really, the broad try: ... except: ... was thrown in as there was a desire to get things working quickly. We should really have taken the time and done it properly with finer grained try: ... except: ... around each step to give more informative error messages that we are trying to deal with now.

My lesson...don't rush things, do them properly first time!

I'm back at work next week (currently killing time on a ferry from Islay to the mainland!), although am off to a conference the second week of September.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah ok that makes sense. I'm in tomorrow so I'll get that done then, thanks

tobyallwood and others added 8 commits August 27, 2025 10:16
New version of the [`action/checkout v5.0.0`](https://github.com/actions/checkout/releases) so updating workflows.
Bumps the actions group with 1 update in the / directory: [actions/download-artifact](https://github.com/actions/download-artifact).


Updates `actions/download-artifact` from 4 to 5
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@v4...v5)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Missed this in reviewing the code. We have already extracted the version from `TOPOSTATS_DETAILS` to `TOPOSTATS_VERSION` so may as well import it and do away with having to index the version from `TOPOSTATS_DETAILS`.
@ns-rse
Copy link
Collaborator

ns-rse commented Sep 1, 2025

Thanks for your on-going work on this @tobyallwood getting finer grained logging statements in topostats/processing.py is a good start but there is unfortunately more places where failure can occur that we would want to pin-point.

The main bulk of the processing is done in topostats/grainstats.py and it is the various methods/functions within here that the actual errors can occur that we want to pin point.

If you're able to pepper this with try: ... except: ... statements that would be wonderful. From memory one of the challenges is knowing what exceptions can occur though otherwise we fall foul of having a broad except. My approach here would be to see if people have files that failed Grainstats and try and work out where these are happening and what exceptions are raised, otherwise disable this linting check (for both Pylint and Ruff).

@ns-rse
Copy link
Collaborator

ns-rse commented Sep 17, 2025

Forgot to ask earlier, how are things going with peppering topostats/grainstats.py with lots of try: ... except: ... statements?

@tobyallwood
Copy link
Collaborator Author

Forgot to ask earlier, how are things going with peppering topostats/grainstats.py with lots of try: ... except: ... statements?

Not much progress on it as I've just been focussing on this refactor really.. it's in my mind to do but wanted to get the tests sorted first before I switch back to it

@ns-rse
Copy link
Collaborator

ns-rse commented Sep 17, 2025

🆒 thanks for the update @tobyallwood

I only asked as I've hit the problem of the broad try: ... except: ... masking the underlying problem so am digging into that now as I work through getting things working with classes.

@tobyallwood
Copy link
Collaborator Author

🆒 thanks for the update @tobyallwood

I only asked as I've hit the problem of the broad try: ... except: ... masking the underlying problem so am digging into that now as I work through getting things working with classes.

Hi @ns-rse , any chance you could share the config/ logs for the errors you were running into here? Thanks

@ns-rse
Copy link
Collaborator

ns-rse commented Oct 8, 2025

Configuration will be topostats/default_config.yaml as I don't tinker.

Log files are long gone and I've moved on from refactoring the Grains section. I will have removed the try: ... except: to work out where the problem was arising and fixed it as it would have been an aspect of my refactoring, but its a pain to have to remove that try: ... except: to dig in and is why we need finer grained error reporting.

Sorry that's not very helpful.

@tobyallwood
Copy link
Collaborator Author

I'm back looking at this at the moment and I'm wondering what the best way to go about this is. I haven't been able to get logs of any times grainstats has failed except from the issue Yeti raised the other day (which has already been solved in the newest dev version of TopoStats) meaning I'm not sure what errors could appear (or how to figure this out!)

@ns-rse
Copy link
Collaborator

ns-rse commented Oct 24, 2025

I've no great insight here I'm afraid, you could trawl the issues labelled with "bug" (this link includes all issues and not just those that remain open). Aside from that asking people who have had images fail in the past if they have a record of them. Possibly @derollins might have a few hidden with skeletons in a cupboard somewhere. 🤷

@ns-rse
Copy link
Collaborator

ns-rse commented Oct 24, 2025

One thing you could do in the mean time is resolve the conflicts between this branch and main (which arise because work has been merged into main since this branch was forked).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants