Drastically improve efficiency of topography generation in tripole region #49

angus-g · 2025-04-01T08:33:00Z

In the tripole, we use the make_topo_gen routine. This was allocating two arrays with 20,000,000 elements to hold topography for the patch. The target grid is divided into blocks of 100x25 points, and for every point in every block, both of the large arrays were being completely zeroed. This meant that performance was almost completely dominated by memset.

Isolating to just 10 of these blocks (but including NetCDF input and output), I saw 98.2% of the CPU time spent in memset, taking over 2 minutes to process. With this patch, it takes about 5 seconds to process the same blocks, where most of that time is spent in KD tree generation and NetCDF input/output. An entire invocation of gen_topo from GEBCO to a 1/10 full-globe grid went down from around 2 hours to 6 minutes.
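As a rough illustration of why the zeroing dominated (this is not the actual diff from this PR), here is a minimal sketch that reuses one large scratch array and only touches the entries each target point needs. The module, the function name `process_block`, and the per-point count `n` are hypothetical; only the array size and block shape come from the description above.

```fortran
module scratch_reuse
  implicit none
contains
  ! Sum hypothetical per-point contributions using one reusable scratch array.
  function process_block(scratch, npoints) result(total)
    real, intent(inout) :: scratch(:)
    integer, intent(in) :: npoints
    real :: total
    integer :: ip, n, i

    total = 0.0
    do ip = 1, npoints
      ! Costly pattern: "scratch = 0.0" here would touch every element of
      ! the large array once per target point.
      n = 1 + mod(ip, 50)      ! hypothetical count of source points for this target
      do i = 1, n
        scratch(i) = real(i)   ! overwrite only the entries this point needs
      end do
      total = total + sum(scratch(1:n))   ! read back only those n entries
    end do
  end function process_block
end module scratch_reuse

program demo
  use scratch_reuse
  implicit none
  real, allocatable :: scratch(:)

  allocate(scratch(20000000))               ! scratch size quoted above
  print *, process_block(scratch, 100*25)   ! one 100x25 block
end program demo
```

Because every read is limited to `scratch(1:n)`, stale values beyond `n` are never seen, so the array never needs to be cleared as a whole.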

The other performance win would be to use a selection algorithm (e.g. from the Fortran stdlib) rather than a sorting algorithm for lines like

            call quicksort(t_s_all(im)%topo, frst, lst)
            topo_all_med_out(im, jm) = t_s_all(im)%topo((npts(im, jm)+1)/2)

I haven't introduced that because of the extra dependency, but it shaves off a further minute or so, bringing the runtime down to 4:45 under the same conditions as the tests above. All told, with compiler optimisations, this PR, and using selection instead of sort, the 1/10 topography generation is down from 2 hours to 200 seconds.
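To show what selection buys without pulling in a dependency, here is a minimal hand-written quickselect sketch. It is not the stdlib routine and not code from this repository; `median_select` and `kth_smallest` are hypothetical names. It finds the element at index `(n+1)/2`, the same index used in the lines above, without fully sorting the array.

```fortran
module median_select
  implicit none
contains
  ! Return the k-th smallest element of a (1-based k), partially
  ! reordering a in place (Hoare-style quickselect).
  function kth_smallest(a, k) result(val)
    real, intent(inout) :: a(:)
    integer, intent(in) :: k
    real :: val, pivot, tmp
    integer :: lo, hi, i, j

    lo = 1
    hi = size(a)
    do while (lo < hi)
      pivot = a((lo + hi)/2)
      i = lo
      j = hi
      do while (i <= j)
        do while (a(i) < pivot)
          i = i + 1
        end do
        do while (a(j) > pivot)
          j = j - 1
        end do
        if (i <= j) then
          tmp = a(i)
          a(i) = a(j)
          a(j) = tmp
          i = i + 1
          j = j - 1
        end if
      end do
      ! Now a(lo:j) <= pivot <= a(i:hi); keep only the side containing k.
      if (k <= j) then
        hi = j
      else if (k >= i) then
        lo = i
      else
        exit   ! j < k < i, so a(k) already equals the pivot
      end if
    end do
    val = a(k)
  end function kth_smallest
end module median_select

program demo
  use median_select
  implicit none
  real :: topo(7), med
  integer :: n

  topo = [5.0, -3.0, 12.0, 0.5, 7.0, -8.0, 2.0]
  n = size(topo)
  med = kth_smallest(topo, (n + 1)/2)   ! median of the seven values is 2.0
  print *, med
end program demo
```

Selection runs in linear time on average, versus O(n log n) for a full sort, which is where the extra saving comes from when only the median is needed.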

aekiss · 2025-04-01T21:59:01Z

Awesome, thanks @angus-g! It looks like this will produce bitwise identical results to the original, right?

angus-g · 2025-04-02T00:35:51Z

It should do, yes.

micaeljtoliveira · 2025-04-02T22:55:54Z

@angus-g Great work! I suspected the code could be made faster, but never imagined it would be something this simple.

micaeljtoliveira · 2025-04-02T23:00:18Z

@angus-g Regarding the use of the Fortran stdlib, I think this would be fine. The more projects use stdlib the better ;) So feel free to create a PR with those changes.

angus-g changed the title ~~Drastically improve efficiency of tripole generation~~ Drastically improve efficiency of topography generation in tripole region Apr 1, 2025

micaeljtoliveira self-requested a review April 2, 2025 22:56

micaeljtoliveira approved these changes Apr 2, 2025

micaeljtoliveira merged commit a29d07a into COSIMA:main Apr 2, 2025
4 checks passed
