adding PSC-Flink Source rate limiter #106
nickpan47
left a comment
Thanks for the PR. Left a few comments. PTAL.
...flink/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicSource.java
...flink/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicSource.java (outdated)
psc-flink/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscRateLimitMap.java
psc-flink/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscRateLimitMap.java
psc-flink/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscRateLimitMap.java
...flink/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicSource.java (outdated)
...src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicTableFactory.java (outdated)
...flink/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicSource.java (outdated)
nickpan47
left a comment
Added a few discussions to the PR. PTAL. Thanks!
psc-flink/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscRateLimitMap.java
nickpan47
left a comment
Thanks for the refactor. The logic is much cleaner now. Overall lgtm. PTAL of the minor comments.
...flink/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicSource.java (outdated)
...nk/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscTableCommonUtils.java (outdated)
...nk/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscTableCommonUtils.java (outdated)
...test/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicTableFactoryTest.java (outdated)
nickpan47
left a comment
Overall lgtm. Some minor suggestions to improve unit tests. Thanks!
...test/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicTableFactoryTest.java
...test/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicTableFactoryTest.java
...test/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicTableFactoryTest.java (outdated)
nickpan47
left a comment
Overall good. One suggestion to make sure we cover the different conditions w.r.t. topic partition vs scan.parallelism w/ mocks.
...test/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicTableFactoryTest.java (outdated)
...test/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicTableFactoryTest.java (outdated)
...nk/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscTableCommonUtils.java
...nk/src/main/java/com/pinterest/flink/streaming/connectors/psc/table/PscTableCommonUtils.java (outdated)
...test/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicTableFactoryTest.java (outdated)
nickpan47
left a comment
Final one nit on usage on WhiteBox. Otherwise, lgtm. Thanks!
...test/java/com/pinterest/flink/streaming/connectors/psc/table/PscDynamicTableFactoryTest.java (outdated)
nickpan47
left a comment
lgtm. Thanks for the revisions!
* adding rate limiter
* fix scale rate limit ordering
* introduce parallelism scan by source
* fix readability
* code cleanup + add tests
* fix tests
* push partition class
* refactor unit tests
* adding assertions + renaming
* remove whitebox in favor of setProviderTest
* using setProvider test
---------
Co-authored-by: Kevin Browne <kbrowne@kbrowne-JG645XV.dyn.pinadmin.com>
Summary
Implements configurable rate limiting and explicit source parallelism control for PscDynamicTableSource to provide better control over record consumption and operator parallelism.
Motivation
Rate Limiting - Throttle record consumption at the source level.
Explicit Source Parallelism Control - Allow users to configure scan.parallelism.
Changes
New Files
* PscRateLimitMap.java - RichMapFunction that applies per-record throttling using Guava's RateLimiter. Automatically divides the configured global rate across parallel subtasks. Emits metrics for monitoring:
  * throttleOccurred - Counter for number of throttle events
  * maxThrottleDelayMs - Gauge tracking maximum delay observed since job start
  * currentThrottleDelayMs - Gauge tracking real-time throttle delay
* PscRateLimitMapTest.java - Unit tests validating rate limiter creation, rate division by parallelism, minimum rate validation, and edge cases
* PscTableCommonUtilsTest.java - Comprehensive unit tests for shouldApplyRescale() decision logic covering 16 scenarios with mocked partition counts
Changes in Existing Files
Configuration:
* scan.rate-limit.records-per-second (double, optional) - Global rate limit for all subtasks
* scan.parallelism (integer, optional) - Explicit source operator parallelism
Rescale Decision Logic:
* shouldApplyRescale() to prioritize scan.parallelism over global default parallelism
* PartitionCountProvider interface for dependency injection in tests
* setProviderForTest() and resetProvider() methods with @VisibleForTesting annotation
* ConfigOption.key() references
Core Implementation:
PscDynamicSource.java
* scanParallelism field and updated all constructors
* rescale() BEFORE rate limiting
* rescale enabled: use intended parallelism (scan.parallelism or global default)
* rescale disabled: use source parallelism (partition count)
* isRateLimitingEnabled() utility method for reusable condition checking
* getIntendedParallelism() helper method to encapsulate parallelism logic
* copy(), equals(), and hashCode() to include new fields
PscDynamicTableFactory.java
* scan.parallelism and scan.rate-limit.records-per-second configurations
* scanParallelism to shouldApplyRescale() and source constructor
* isRateLimitingEnabled() for consistent condition checks
UpsertPscDynamicTableFactory.java
* PscDynamicTableFactory for upsert sources
Testing:
* createExpectedScanSource() to include scanParallelism parameter
* produceTransformationFromSource() to reduce repetitive test code
* addRescaleConfig(), addRateLimitConfig(), addScanParallelismConfig() using ConfigOption.key()
* @AfterEach hook to reset partition count provider after each test
* testRescaleCreatesPartitionTransformation() - Verifies PartitionTransformation creation
* testRescaleAndRateLimitChain() - Verifies complete operator chain
* testSkipsRescaleWhenNotNeeded() - Verifies rescale is skipped when not needed
* testRescaleAndRateLimitWithDifferentParallelism() - Verifies correct parallelism when scan.parallelism differs from global default
Implementation Details
The rate limiter is applied as a map transformation after the source operator:
* PscRateLimitMap.open() divides the rate by parallelism
* RateLimiter instance (e.g., 250 records/sec with parallelism of 4)
* tryAcquire() + acquire() pattern
Rate limiting is disabled by default. Users must explicitly enable it.
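The steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the code in PscRateLimitMap: SimpleLimiter is a hypothetical stand-in for Guava's RateLimiter, SubtaskThrottle and its fields are invented names, and the reading of the tryAcquire() + acquire() pattern (try a non-blocking permit first, fall back to a blocking one and record the delay) is an assumption about what the PR does.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative stand-in for the throttling described above; not the PR's PscRateLimitMap. */
final class ThrottleSketch {

    /** Hypothetical, simplified limiter that hands out evenly spaced permits. */
    static final class SimpleLimiter {
        private final long intervalNanos;
        private long nextFreeNanos = System.nanoTime();

        SimpleLimiter(double permitsPerSecond) {
            if (permitsPerSecond <= 0) {
                throw new IllegalArgumentException("rate must be positive");
            }
            this.intervalNanos = (long) (TimeUnit.SECONDS.toNanos(1) / permitsPerSecond);
        }

        /** Takes a permit only if one is available right now. */
        synchronized boolean tryAcquire() {
            long now = System.nanoTime();
            if (now - nextFreeNanos < 0) {
                return false;
            }
            nextFreeNanos = now + intervalNanos;
            return true;
        }

        /** Reserves the next permit, sleeps until it is due, and returns the wait in ms. */
        long acquire() {
            long waitNanos;
            synchronized (this) {
                long now = System.nanoTime();
                waitNanos = Math.max(0L, nextFreeNanos - now);
                long base = waitNanos > 0 ? nextFreeNanos : now;
                nextFreeNanos = base + intervalNanos;
            }
            try {
                TimeUnit.NANOSECONDS.sleep(waitNanos);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while throttling", e);
            }
            return TimeUnit.NANOSECONDS.toMillis(waitNanos);
        }
    }

    /** One instance per parallel subtask, mirroring what open() is described as doing. */
    static final class SubtaskThrottle {
        final double subtaskRate;
        private final SimpleLimiter limiter;
        long throttleOccurred;        // counter: number of throttle events
        long maxThrottleDelayMs;      // gauge: largest delay seen so far
        long currentThrottleDelayMs;  // gauge: delay applied to the latest record

        SubtaskThrottle(double globalRate, int parallelism) {
            if (parallelism <= 0) {
                throw new IllegalArgumentException("parallelism must be positive");
            }
            // The global rate is shared evenly, e.g. 1000 / 4 = 250 records/sec.
            this.subtaskRate = globalRate / parallelism;
            this.limiter = new SimpleLimiter(subtaskRate);
        }

        /** Passes the record through unchanged, blocking only when over the rate. */
        <T> T map(T record) {
            if (limiter.tryAcquire()) {
                currentThrottleDelayMs = 0;
            } else {
                throttleOccurred++;
                currentThrottleDelayMs = limiter.acquire();
                maxThrottleDelayMs = Math.max(maxThrottleDelayMs, currentThrottleDelayMs);
            }
            return record;
        }
    }

    public static void main(String[] args) {
        SubtaskThrottle throttle = new SubtaskThrottle(1000.0, 4);
        for (int i = 0; i < 5; i++) {
            throttle.map("record-" + i);
        }
        System.out.println("per-subtask rate: " + throttle.subtaskRate);
        System.out.println("throttle events: " + throttle.throttleOccurred);
    }
}
```

Trying the non-blocking call first keeps the un-throttled path cheap and gives a natural place to count throttle events; whether the PR structures it exactly this way is not shown in the description.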
Tests:
* scanParallelism parameter
Implementation Details
Operator Ordering (Critical Fix)
The implementation ensures correct rate limiting by applying operators in this order:
Usage Examples
Basic Rate Limiting
With 10 topic partitions, each source subtask processes at 500 records/sec.
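The arithmetic behind that figure, and the way the divisor changes once rescale is in play, can be sketched as below. Everything here is illustrative: the class and method names are hypothetical, the global limit of 5000 records/sec is an assumed value chosen to match the 500 records/sec figure, and the rules (scan.parallelism wins over the global default when positive; rescale applies only when the intended parallelism exceeds the partition count) are inferred from the change list and test names rather than copied from the PR.

```java
/** Illustrative only: rules are inferred from the PR description, not taken from its code. */
final class ParallelismSketch {

    /** scan.parallelism wins when set to a positive value; otherwise use the global default. */
    static int intendedParallelism(Integer scanParallelism, int globalParallelism) {
        return (scanParallelism != null && scanParallelism > 0)
                ? scanParallelism
                : globalParallelism;
    }

    /** Assumed rule: rescale only when more subtasks are wanted than partitions exist. */
    static boolean shouldApplyRescale(boolean rescaleEnabled, int intended, int partitionCount) {
        return rescaleEnabled && partitionCount > 0 && intended > partitionCount;
    }

    /**
     * With rescale applied, the limiter runs at the intended parallelism;
     * without it, at the source parallelism (the partition count).
     */
    static double perSubtaskRate(double globalRate, boolean rescaleApplied,
                                 int intended, int partitionCount) {
        int divisor = rescaleApplied ? intended : partitionCount;
        if (divisor <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        return globalRate / divisor;
    }

    public static void main(String[] args) {
        double globalRate = 5000.0; // assumed value, not stated in the PR text
        int partitions = 10;

        // No rescale: 10 source subtasks share the limit.
        System.out.println(perSubtaskRate(globalRate, false, 4, partitions));

        // Rescale with scan.parallelism = 20: 20 limiter subtasks share the limit.
        int intended = intendedParallelism(20, 4);
        boolean rescale = shouldApplyRescale(true, intended, partitions);
        System.out.println(perSubtaskRate(globalRate, rescale, intended, partitions));
    }
}
```

How a zero or negative scan.parallelism should be handled is a guess in this sketch (it falls back to the global default); the PR's tests cover those cases, but their expected inputs are not visible here.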
Rate Limiting with Explicit Source Parallelism
Behavior:
Rate Limiting Without Rescale
Behavior:
Test Coverage
Unit Tests - PscRateLimitMapTest.java
Unit Tests - PscTableCommonUtilsTest.java (16 test cases)
* testShouldNotRescaleWhenDisabled()
* testShouldRescaleWhenScanParallelismExceedsPartitionCount()
* testShouldNotRescaleWhenScanParallelismLessThanPartitionCount()
* testShouldNotRescaleWhenScanParallelismEqualsPartitionCount()
* testShouldRescaleWhenGlobalParallelismExceedsPartitionCount()
* testShouldNotRescaleWhenGlobalParallelismLessThanPartitionCount()
* testShouldNotRescaleWhenScanParallelismIsZero()
* testShouldNotRescaleWhenScanParallelismIsNegative()
* testShouldNotRescaleWhenNoParallelismConfigured()
* testShouldNotRescaleWhenPartitionCountCannotBeDetermined()
* testShouldNotRescaleWhenPartitionCountIsZero()
* testScanParallelismTakesPrecedenceOverGlobalParallelism()
* testShouldRescaleWithMultipleTopics()
* testShouldRescaleWithHighParallelismAndLowPartitionCount()
* testShouldNotRescaleWithHighPartitionCountAndLowParallelism()
* testProviderResetRestoresDefaultBehavior()
psc % mvn -pl psc-flink test -Dtest=PscTableCommonUtilsTest,PscDynamicTableFactoryTest,PscRateLimitMapTest -Dgpg.skip=true -Djacoco.skip=true
WARNING: package sun.misc not in java.base
WARNING: A terminally deprecated method in sun.misc.Unsafe has been called
WARNING: sun.misc.Unsafe::staticFieldBase has been called by com.google.inject.internal.aop.HiddenClassDefiner (file:/opt/homebrew/Cellar/maven/3.9.11/libexec/lib/guice-5.1.0-classes.jar)
WARNING: Please consider reporting this to the maintainers of class com.google.inject.internal.aop.HiddenClassDefiner
WARNING: sun.misc.Unsafe::staticFieldBase will be removed in a future release
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for com.pinterest.psc:psc-examples:jar:4.1.3-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ com.pinterest.psc:psc-java-oss:4.1.3-SNAPSHOT, /Users/kbrowne/code/psc/pom.xml, line 155, column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for com.pinterest.psc:psc-integration-test:jar:4.1.3-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ com.pinterest.psc:psc-java-oss:4.1.3-SNAPSHOT, /Users/kbrowne/code/psc/pom.xml, line 155, column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for com.pinterest.psc:psc-flink:jar:4.1.3-SNAPSHOT
[WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: org.apache.flink:flink-table-planner_${scala.binary.version}:jar -> duplicate declaration of version ${flink.version} @ line 327, column 21
[WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: org.apache.flink:flink-json:jar -> duplicate declaration of version ${flink.version} @ line 417, column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for com.pinterest.psc:psc-logging:jar:4.1.3-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ com.pinterest.psc:psc-java-oss:4.1.3-SNAPSHOT, /Users/kbrowne/code/psc/pom.xml, line 155, column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for com.pinterest.psc:psc-common:jar:4.1.3-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ com.pinterest.psc:psc-java-oss:4.1.3-SNAPSHOT, /Users/kbrowne/code/psc/pom.xml, line 155, column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for com.pinterest.psc:psc-flink-logging:jar:4.1.3-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ com.pinterest.psc:psc-java-oss:4.1.3-SNAPSHOT, /Users/kbrowne/code/psc/pom.xml, line 155, column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for com.pinterest.psc:psc-java-oss:pom:4.1.3-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ line 155, column 21
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[INFO] Inspecting build with total of 1 modules...
[INFO] Installing Nexus Staging features:
[INFO] ... total of 1 executions of maven-deploy-plugin replaced with nexus-staging-maven-plugin
[INFO]
[INFO] --------------------< com.pinterest.psc:psc-flink >---------------------
[INFO] Building psc-flink 4.1.3-SNAPSHOT
[INFO] from pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[WARNING] 1 problem was encountered while building the effective model for org.javassist:javassist:jar:3.18.2-GA during dependency collection step for project (use -X to see details)
[WARNING] 2 problems were encountered while building the effective model for org.apache.yetus:audience-annotations:jar:0.5.0 during dependency collection step for project (use -X to see details)
[WARNING] 1 problem was encountered while building the effective model for org.javassist:javassist:jar:3.18.1-GA during dependency collection step for project (use -X to see details)
[INFO]
[INFO] --- jacoco:0.8.5:prepare-agent (prepare-unit-tests) @ psc-flink ---
[INFO] Skipping JaCoCo execution because property jacoco.skip is set.
[INFO] argLine set to empty
[INFO]
[INFO] --- resources:3.3.1:resources (default-resources) @ psc-flink ---
[WARNING] Using platform encoding (UTF-8 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] Copying 1 resource from src/main/resources to target/classes
[INFO]
[INFO] --- compiler:3.8.1:compile (default-compile) @ psc-flink ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- resources:3.3.1:testResources (default-testResources) @ psc-flink ---
[WARNING] Using platform encoding (UTF-8 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] Copying 99 resources from src/test/resources to target/test-classes
[INFO]
[INFO] --- compiler:3.8.1:testCompile (default-testCompile) @ psc-flink ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- surefire:3.0.0-M5:test (default-test) @ psc-flink ---
[INFO]
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running com.pinterest.flink.streaming.connectors.psc.table.PscRateLimitMapTest
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.318 s - in com.pinterest.flink.streaming.connectors.psc.table.PscRateLimitMapTest
[INFO] Running com.pinterest.flink.streaming.connectors.psc.table.PscTableCommonUtilsTest
[INFO] Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.19 s - in com.pinterest.flink.streaming.connectors.psc.table.PscTableCommonUtilsTest
[INFO] Running com.pinterest.flink.streaming.connectors.psc.table.PscDynamicTableFactoryTest
[INFO] Tests run: 70, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.401 s - in com.pinterest.flink.streaming.connectors.psc.table.PscDynamicTableFactoryTest
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 95, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.665 s
[INFO] Finished at: 2025-11-24T13:28:44-05:00
[INFO] ------------------------------------------------------------------------