[CELEBORN-2248] Implement lazy loading for columnar shuffle classes and skew shuffle method using static holder pattern #3581
+53
−33
What changes were proposed in this pull request?
This PR converts the static initialization of columnar shuffle class constructors
and skew shuffle method to lazy initialization using the initialization-on-demand
holder idiom (static inner class pattern) in SparkUtils.java.
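For readers unfamiliar with the idiom, a minimal sketch of it follows. It is not taken from SparkUtils.java: `LazyHolderDemo`, `ExpensiveHolder`, and the event list are hypothetical names used only to make the initialization order observable. The sketch relies on the JVM initializing a nested class only when one of its static members is first accessed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; none of these names come from SparkUtils.java.
final class LazyHolderDemo {
    // Records initialization order so the laziness is observable.
    static final List<String> EVENTS = new ArrayList<>();

    static {
        EVENTS.add("outer class initialized");
    }

    // The JVM initializes this nested class only on first access to VALUE.
    private static final class ExpensiveHolder {
        static final String VALUE = load();

        private static String load() {
            EVENTS.add("holder initialized");
            return "expensive resource";
        }
    }

    static String get() {
        // The first call triggers holder initialization; later calls reuse VALUE.
        return ExpensiveHolder.VALUE;
    }

    static void touch() {
        // Uses the outer class without referencing the holder.
        EVENTS.add("touch called");
    }

    public static void main(String[] args) {
        touch();
        System.out.println(EVENTS); // holder not yet initialized
        get();
        get();
        System.out.println(EVENTS); // "holder initialized" appears exactly once
    }
}
```

Because class initialization is guarded by the JVM, the holder needs no explicit locking or volatile field to be safe under concurrent first access.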
Specifically, the following changes were made:
- Introduced ColumnarHashBasedShuffleWriterConstructorHolder static inner class to lazily initialize the constructor for ColumnarHashBasedShuffleWriter
- Introduced ColumnarShuffleReaderConstructorHolder static inner class to lazily initialize the constructor for CelebornColumnarShuffleReader
- Introduced CelebornSkewShuffleMethodHolder static inner class to lazily initialize the isCelebornSkewedShuffle method reference
- Modified createColumnarHashBasedShuffleWriter(), createColumnarShuffleReader(), and isCelebornSkewShuffleOrChildShuffle() methods to use the holder pattern for lazy initialization
- Added JavaDoc comments explaining the lazy loading mechanism
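As a rough illustration of how a constructor holder of this kind could look, the sketch below resolves a constructor reflectively inside a static nested class. It is an assumption-laden stand-in, not the code from this PR: `ReflectiveHolderSketch`, `ConstructorHolder`, and `create` are hypothetical names, `java.lang.StringBuilder` substitutes for the columnar shuffle classes so the example runs without Celeborn on the classpath, and returning null for a missing class is an illustrative choice rather than the PR's documented behavior.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch; class and method names are illustrative, not Celeborn's.
final class ReflectiveHolderSketch {
    // Stand-in for an optional class; StringBuilder always exists in the JDK.
    private static final String TARGET_CLASS = "java.lang.StringBuilder";

    // Initialized on first access to INSTANCE, not when the outer class loads.
    private static final class ConstructorHolder {
        static final Constructor<?> INSTANCE = lookup();

        private static Constructor<?> lookup() {
            try {
                return Class.forName(TARGET_CLASS).getConstructor(String.class);
            } catch (ReflectiveOperationException e) {
                // Assumption: a missing optional class is reported as null.
                return null;
            }
        }
    }

    static Object create(String initial) {
        // The reflective lookup runs here, on the first call only.
        Constructor<?> ctor = ConstructorHolder.INSTANCE;
        if (ctor == null) {
            throw new IllegalStateException("optional class not on the classpath");
        }
        try {
            return ctor.newInstance(initial);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("failed to instantiate " + TARGET_CLASS, e);
        }
    }
}
```

With this shape, code paths that never call `create` never trigger `Class.forName`, so the target class is not loaded for them.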
Why are the changes needed?
The current implementation statically initializes columnar shuffle class constructors
and the skew shuffle method at SparkUtils class loading time, which means these
classes/methods are loaded regardless of whether they are actually used.
This lazy loading approach ensures that:
celeborn.columnarShuffle.enabled is true and the create methods are called)
The static holder pattern (initialization-on-demand holder idiom) provides several
advantages:
Does this PR resolve a correctness bug?
No, this is a performance optimization.
Does this PR introduce any user-facing change?
No. This change only affects when certain classes are loaded internally.
The functionality and API remain unchanged.
How was this patch tested?