From 464aac379feafd2b2311a9798d57f56aef38cd73 Mon Sep 17 00:00:00 2001 From: Ryan Kuo Date: Thu, 11 Dec 2025 15:51:48 -0500 Subject: [PATCH 1/5] new Replicator Metrics doc --- src/current/_includes/molt/fetch-metrics.md | 4 +- .../molt/fetch-schema-table-filtering.md | 4 +- .../_includes/molt/replicator-metrics.md | 37 --- .../_includes/v23.1/sidebar-data/migrate.json | 6 + .../_includes/v23.2/sidebar-data/migrate.json | 6 + .../_includes/v24.1/sidebar-data/migrate.json | 6 + .../_includes/v24.2/sidebar-data/migrate.json | 6 + .../_includes/v24.3/sidebar-data/migrate.json | 6 + .../_includes/v25.1/sidebar-data/migrate.json | 6 + .../_includes/v25.2/sidebar-data/migrate.json | 6 + .../_includes/v25.3/sidebar-data/migrate.json | 6 + .../_includes/v25.4/sidebar-data/migrate.json | 6 + .../_includes/v26.1/sidebar-data/migrate.json | 6 + src/current/molt/migrate-failback.md | 14 +- src/current/molt/migrate-load-replicate.md | 24 +- src/current/molt/molt-replicator.md | 13 +- src/current/molt/replicator-flags.md | 2 +- src/current/molt/replicator-metrics.md | 289 ++++++++++++++++++ 18 files changed, 391 insertions(+), 56 deletions(-) delete mode 100644 src/current/_includes/molt/replicator-metrics.md create mode 100644 src/current/molt/replicator-metrics.md diff --git a/src/current/_includes/molt/fetch-metrics.md b/src/current/_includes/molt/fetch-metrics.md index b4c37bb4cd9..471319ad533 100644 --- a/src/current/_includes/molt/fetch-metrics.md +++ b/src/current/_includes/molt/fetch-metrics.md @@ -17,7 +17,7 @@ Cockroach Labs recommends monitoring the following metrics during data load: You can also use the [sample Grafana dashboard](https://molt.cockroachdb.com/molt/cli/grafana_dashboard.json) to view the preceding metrics. {% if page.name != "migrate-bulk-load.md" %} -{{site.data.alerts.callout_info}} -Metrics from the `replicator` process are enabled by setting the `--metricsAddr` [replication flag](#replication-flags), and are served at `http://{host}:{port}/_/varz`.
To view Oracle-specific metrics from `replicator`, import [this Grafana dashboard](https://replicator.cockroachdb.com/replicator_oracle_grafana_dashboard.json).
+{{site.data.alerts.callout_success}} +For details on Replicator metrics, refer to [Replicator Metrics]({% link molt/replicator-metrics.md %}). {{site.data.alerts.end}} {% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/fetch-schema-table-filtering.md b/src/current/_includes/molt/fetch-schema-table-filtering.md index a949c447be3..108fbfeab4a 100644 --- a/src/current/_includes/molt/fetch-schema-table-filtering.md +++ b/src/current/_includes/molt/fetch-schema-table-filtering.md @@ -65,7 +65,7 @@ api.configureSource("molt.migration_schema", { }); ~~~ -Pass the userscript to MOLT Replicator with the `--userscript` [flag](#replication-flags): +Pass the userscript to MOLT Replicator with the `--userscript` [flag](#replicator-flags): ~~~ --userscript table_filter.ts @@ -109,7 +109,7 @@ api.configureSource("molt.public", { }); ~~~ -Pass the userscript to MOLT Replicator with the `--userscript` [flag](#replication-flags): +Pass the userscript to MOLT Replicator with the `--userscript` [flag](#replicator-flags): ~~~ --userscript table_filter.ts diff --git a/src/current/_includes/molt/replicator-metrics.md b/src/current/_includes/molt/replicator-metrics.md deleted file mode 100644 index 01bf1d9ef00..00000000000 --- a/src/current/_includes/molt/replicator-metrics.md +++ /dev/null @@ -1,37 +0,0 @@ -### Replicator metrics - -MOLT Replicator can export [Prometheus](https://prometheus.io/) metrics by setting the [`--metricsAddr`]({% link molt/replicator-flags.md %}#metrics-addr) flag to a port (for example, `--metricsAddr :30005`). Metrics are not enabled by default. When enabled, metrics are available at the path `/_/varz`. For example: `http://localhost:30005/_/varz`. 
- -Cockroach Labs recommends monitoring the following metrics during replication: - -{% if page.name == "migrate-failback.md" %} -| Metric Name | Description | -|---------------------------------------|-----------------------------------------------------------------------------------------------------------------------------| -| `commit_to_stage_lag_seconds` | Time between when a mutation is written to the source CockroachDB cluster and when it is written to the staging database. | -| `source_commit_to_apply_lag_seconds` | End-to-end lag from when a mutation is written to the source CockroachDB cluster to when it is applied to the target database. | -| `stage_mutations_total` | Number of mutations staged for application to the target database. | -| `apply_conflicts_total` | Number of rows that experienced a compare-and-set (CAS) conflict. | -| `apply_deletes_total` | Number of rows deleted. | -| `apply_duration_seconds` | Length of time it took to successfully apply mutations. | -| `apply_errors_total` | Number of times an error was encountered while applying mutations. | -| `apply_resolves_total` | Number of rows that experienced a compare-and-set (CAS) conflict and which were resolved. | -| `apply_upserts_total` | Number of rows upserted. | -| `target_apply_queue_depth` | Number of batches in the target apply queue. Indicates how backed up the applier flow is between receiving changefeed data and applying it to the target database. | -| `target_apply_queue_utilization_percent` | Utilization percentage (0.0-100.0) of the target apply queue capacity. Use this to understand how close the queue is to capacity and to set alerting thresholds for backpressure conditions. | -| `core_parallelism_utilization_percent` | Current utilization percentage of the applier flow parallelism capacity. Shows what percentage of the configured parallelism is actively being used. 
| -{% else %} -| Metric Name | Description | -|---------------------------------------|-----------------------------------------------------------------------------------------------------------------------------| -| `commit_to_stage_lag_seconds` | Time between when a mutation is written to the source database and when it is written to the staging database. | -| `source_commit_to_apply_lag_seconds` | End-to-end lag from when a mutation is written to the source database to when it is applied to the target CockroachDB. | -| `apply_conflicts_total` | Number of rows that experienced a compare-and-set (CAS) conflict. | -| `apply_deletes_total` | Number of rows deleted. | -| `apply_duration_seconds` | Length of time it took to successfully apply mutations. | -| `apply_errors_total` | Number of times an error was encountered while applying mutations. | -| `apply_resolves_total` | Number of rows that experienced a compare-and-set (CAS) conflict and which were resolved. | -| `apply_upserts_total` | Number of rows upserted. | -{% endif %} - -You can use the [Replicator Grafana dashboard](https://replicator.cockroachdb.com/replicator_grafana_dashboard.json) to visualize the metrics.
For Oracle-specific metrics, import the [Oracle Grafana dashboard](https://replicator.cockroachdb.com/replicator_oracle_grafana_dashboard.json).
- -To check MOLT Replicator health when metrics are enabled, run `curl http://localhost:30005/_/healthz` (replacing the port with your [`--metricsAddr`]({% link molt/replicator-flags.md %}#metrics-addr) value). This returns a status code of `200` if Replicator is running. \ No newline at end of file diff --git a/src/current/_includes/v23.1/sidebar-data/migrate.json b/src/current/_includes/v23.1/sidebar-data/migrate.json index 6f563f0046f..81d046ba2d9 100644 --- a/src/current/_includes/v23.1/sidebar-data/migrate.json +++ b/src/current/_includes/v23.1/sidebar-data/migrate.json @@ -72,6 +72,12 @@ "urls": [ "/molt/replicator-flags.html" ] + }, + { + "title": "Metrics", + "urls": [ + "/molt/replicator-metrics.html" + ] } ] }, diff --git a/src/current/_includes/v23.2/sidebar-data/migrate.json b/src/current/_includes/v23.2/sidebar-data/migrate.json index 6f563f0046f..81d046ba2d9 100644 --- a/src/current/_includes/v23.2/sidebar-data/migrate.json +++ b/src/current/_includes/v23.2/sidebar-data/migrate.json @@ -72,6 +72,12 @@ "urls": [ "/molt/replicator-flags.html" ] + }, + { + "title": "Metrics", + "urls": [ + "/molt/replicator-metrics.html" + ] } ] }, diff --git a/src/current/_includes/v24.1/sidebar-data/migrate.json b/src/current/_includes/v24.1/sidebar-data/migrate.json index 6f563f0046f..81d046ba2d9 100644 --- a/src/current/_includes/v24.1/sidebar-data/migrate.json +++ b/src/current/_includes/v24.1/sidebar-data/migrate.json @@ -72,6 +72,12 @@ "urls": [ "/molt/replicator-flags.html" ] + }, + { + "title": "Metrics", + "urls": [ + "/molt/replicator-metrics.html" + ] } ] }, diff --git a/src/current/_includes/v24.2/sidebar-data/migrate.json b/src/current/_includes/v24.2/sidebar-data/migrate.json index 6f563f0046f..81d046ba2d9 100644 --- a/src/current/_includes/v24.2/sidebar-data/migrate.json +++ b/src/current/_includes/v24.2/sidebar-data/migrate.json @@ -72,6 +72,12 @@ "urls": [ "/molt/replicator-flags.html" ] + }, + { + "title": "Metrics", + "urls": [ + 
"/molt/replicator-metrics.html" + ] } ] }, diff --git a/src/current/_includes/v24.3/sidebar-data/migrate.json b/src/current/_includes/v24.3/sidebar-data/migrate.json index 6f563f0046f..81d046ba2d9 100644 --- a/src/current/_includes/v24.3/sidebar-data/migrate.json +++ b/src/current/_includes/v24.3/sidebar-data/migrate.json @@ -72,6 +72,12 @@ "urls": [ "/molt/replicator-flags.html" ] + }, + { + "title": "Metrics", + "urls": [ + "/molt/replicator-metrics.html" + ] } ] }, diff --git a/src/current/_includes/v25.1/sidebar-data/migrate.json b/src/current/_includes/v25.1/sidebar-data/migrate.json index 31a4a778d57..e6ba00a899c 100644 --- a/src/current/_includes/v25.1/sidebar-data/migrate.json +++ b/src/current/_includes/v25.1/sidebar-data/migrate.json @@ -72,6 +72,12 @@ "urls": [ "/molt/replicator-flags.html" ] + }, + { + "title": "Metrics", + "urls": [ + "/molt/replicator-metrics.html" + ] } ] }, diff --git a/src/current/_includes/v25.2/sidebar-data/migrate.json b/src/current/_includes/v25.2/sidebar-data/migrate.json index 47aec25ebfc..7693e764268 100644 --- a/src/current/_includes/v25.2/sidebar-data/migrate.json +++ b/src/current/_includes/v25.2/sidebar-data/migrate.json @@ -72,6 +72,12 @@ "urls": [ "/molt/replicator-flags.html" ] + }, + { + "title": "Metrics", + "urls": [ + "/molt/replicator-metrics.html" + ] } ] }, diff --git a/src/current/_includes/v25.3/sidebar-data/migrate.json b/src/current/_includes/v25.3/sidebar-data/migrate.json index 31a4a778d57..e6ba00a899c 100644 --- a/src/current/_includes/v25.3/sidebar-data/migrate.json +++ b/src/current/_includes/v25.3/sidebar-data/migrate.json @@ -72,6 +72,12 @@ "urls": [ "/molt/replicator-flags.html" ] + }, + { + "title": "Metrics", + "urls": [ + "/molt/replicator-metrics.html" + ] } ] }, diff --git a/src/current/_includes/v25.4/sidebar-data/migrate.json b/src/current/_includes/v25.4/sidebar-data/migrate.json index 31a4a778d57..e6ba00a899c 100644 --- a/src/current/_includes/v25.4/sidebar-data/migrate.json +++ 
b/src/current/_includes/v25.4/sidebar-data/migrate.json @@ -72,6 +72,12 @@ "urls": [ "/molt/replicator-flags.html" ] + }, + { + "title": "Metrics", + "urls": [ + "/molt/replicator-metrics.html" + ] } ] }, diff --git a/src/current/_includes/v26.1/sidebar-data/migrate.json b/src/current/_includes/v26.1/sidebar-data/migrate.json index 31a4a778d57..e6ba00a899c 100644 --- a/src/current/_includes/v26.1/sidebar-data/migrate.json +++ b/src/current/_includes/v26.1/sidebar-data/migrate.json @@ -72,6 +72,12 @@ "urls": [ "/molt/replicator-flags.html" ] + }, + { + "title": "Metrics", + "urls": [ + "/molt/replicator-metrics.html" + ] } ] }, diff --git a/src/current/molt/migrate-failback.md b/src/current/molt/migrate-failback.md index 1dece259c99..a14675fa93d 100644 --- a/src/current/molt/migrate-failback.md +++ b/src/current/molt/migrate-failback.md @@ -95,7 +95,7 @@ When you run `replicator`, you can configure the following options for replicati - [Connection strings](#connection-strings): Specify URL‑encoded source and target connections. - [TLS certificate and key](#tls-certificate-and-key): Configure secure TLS connections. -- [Replication flags](#replication-flags): Specify required and optional flags to configure replicator behavior. +- [Replicator flags](#replicator-flags): Specify required and optional flags to configure replicator behavior.
- [Tuning parameters](#tuning-parameters): Optimize failback performance and resource usage.
@@ -177,7 +177,7 @@ WITH ...; For additional details on the webhook sink URI, refer to [Webhook sink]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-sink). -### Replication flags +### Replicator flags {% include molt/replicator-flags-usage.md %} @@ -187,7 +187,15 @@ For additional details on the webhook sink URI, refer to [Webhook sink]({% link {% include molt/optimize-replicator-performance.md %} -{% include molt/replicator-metrics.md %} +### Replicator metrics + +MOLT Replicator metrics are not enabled by default. Enable Replicator metrics by specifying the [`--metricsAddr`]({% link molt/replicator-flags.md %}#metrics-addr) flag with a port (or `host:port`) when you start Replicator. This exposes Replicator metrics at `http://{host}:{port}/_/varz`. For example, the following flag exposes metrics on port `30005`: + +~~~ +--metricsAddr :30005 +~~~ + +For guidelines on using and interpreting replication metrics, refer to [Replicator Metrics]({% link molt/replicator-metrics.md %}?filters=cockroachdb). ## Stop forward replication diff --git a/src/current/molt/migrate-load-replicate.md b/src/current/molt/migrate-load-replicate.md index 39388831a41..f1de67c8175 100644 --- a/src/current/molt/migrate-load-replicate.md +++ b/src/current/molt/migrate-load-replicate.md @@ -75,7 +75,7 @@ Use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the source and When you run `replicator`, you can configure the following options for replication: - [Replication connection strings](#replication-connection-strings): Specify URL-encoded source and target database connections. -- [Replication flags](#replication-flags): Specify required and optional flags to configure replicator behavior. +- [Replicator flags](#replicator-flags): Specify required and optional flags to configure replicator behavior.
- [Tuning parameters](#tuning-parameters): Optimize replication performance and resource usage.
@@ -127,7 +127,7 @@ For Oracle Multitenant databases, also specify `--sourcePDBConn` with the PDB co Follow best practices for securing connection strings. Refer to [Secure connections](#secure-connections). {{site.data.alerts.end}} -### Replication flags +### Replicator flags {% include molt/replicator-flags-usage.md %} @@ -137,7 +137,25 @@ Follow best practices for securing connection strings. Refer to [Secure connecti {% include molt/optimize-replicator-performance.md %} -{% include molt/replicator-metrics.md %} +### Replicator metrics + +MOLT Replicator metrics are not enabled by default. Enable Replicator metrics by specifying the [`--metricsAddr`]({% link molt/replicator-flags.md %}#metrics-addr) flag with a port (or `host:port`) when you start Replicator. This exposes Replicator metrics at `http://{host}:{port}/_/varz`. For example, the following flag exposes metrics on port `30005`: + +~~~ +--metricsAddr :30005 +~~~ + +
+For guidelines on using and interpreting replication metrics, refer to [Replicator Metrics]({% link molt/replicator-metrics.md %}?filters=postgres). +
+ +
+For guidelines on using and interpreting replication metrics, refer to [Replicator Metrics]({% link molt/replicator-metrics.md %}?filters=mysql). +
+ +
+For guidelines on using and interpreting replication metrics, refer to [Replicator Metrics]({% link molt/replicator-metrics.md %}?filters=oracle). +
## Start Replicator diff --git a/src/current/molt/molt-replicator.md b/src/current/molt/molt-replicator.md index 3d3107590ea..aa6d1dbe039 100644 --- a/src/current/molt/molt-replicator.md +++ b/src/current/molt/molt-replicator.md @@ -646,16 +646,13 @@ Explicitly set a default `10s` [`webhook_client_timeout`]({% link {{ site.curren ### Metrics -MOLT Replicator can export [Prometheus](https://prometheus.io/) metrics by setting the [`--metricsAddr`]({% link molt/replicator-flags.md %}#metrics-addr) flag to a port (for example, `--metricsAddr :30005`). Metrics are not enabled by default. When enabled, metrics are available at the path `/_/varz`. For example: `http://localhost:30005/_/varz`. +MOLT Replicator metrics are not enabled by default. Enable Replicator metrics by specifying the [`--metricsAddr`]({% link molt/replicator-flags.md %}#metrics-addr) flag with a port (or `host:port`) when you start Replicator. This exposes Replicator metrics at `http://{host}:{port}/_/varz`. For example, the following flag exposes metrics on port `30005`: -For a list of recommended metrics to monitor during replication, refer to: - -- [Forward replication metrics]({% link molt/migrate-load-replicate.md %}#replicator-metrics) (PostgreSQL, MySQL, and Oracle sources) -- [Failback replication metrics]({% link molt/migrate-failback.md %}#replicator-metrics) (CockroachDB source) - -You can use the [Replicator Grafana dashboard](https://replicator.cockroachdb.com/replicator_grafana_dashboard.json) to visualize the metrics. For Oracle-specific metrics, import the [Oracle Grafana dashboard](https://replicator.cockroachdb.com/replicator_oracle_grafana_dashboard.json). +~~~ +--metricsAddr :30005 +~~~ -To check MOLT Replicator health when metrics are enabled, run `curl http://localhost:30005/_/healthz` (replacing the port with your [`--metricsAddr`]({% link molt/replicator-flags.md %}#metrics-addr) value). This returns a status code of `200` if Replicator is running. 
+For guidelines on using and interpreting replication metrics, refer to [Replicator Metrics]({% link molt/replicator-metrics.md %}). ### Logging diff --git a/src/current/molt/replicator-flags.md b/src/current/molt/replicator-flags.md index 575b7d95702..a8c8e114c66 100644 --- a/src/current/molt/replicator-flags.md +++ b/src/current/molt/replicator-flags.md @@ -37,7 +37,7 @@ This page lists all available flags for the [MOLT Replicator commands]({% link m | `--logDestination` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | Write logs to a file. If not specified, write logs to `stdout`. | | `--logFormat` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | Choose log output format: `"fluent"`, `"text"`.

**Default:** `"text"` | | `--maxRetries` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | Maximum number of times to retry a failed mutation on the target (for example, due to contention or a temporary unique constraint violation) before treating it as a hard failure.

**Default:** `10` | -| `--metricsAddr` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | A `host:port` on which to serve metrics and diagnostics. The metrics endpoint is `http://{host}:{port}/_/varz`. | +| `--metricsAddr` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | A `:port` or `host:port` on which to serve metrics and diagnostics. The metrics endpoint is `http://{host}:{port}/_/varz`. | | `--ndjsonBufferSize` | `start` | `INT` | The maximum amount of data to buffer while reading a single line of `ndjson` input; increase when source cluster has large blob values.

**Default:** `65536` | | `--oracle-application-users` | `oraclelogminer` | `STRING` | List of Oracle usernames responsible for DML transactions in the PDB schema. Enables replication from the latest-possible starting point. Usernames are case-sensitive and must match the internal Oracle usernames (e.g., `PDB_USER`). | | `-o`, `--out` | `make-jwt` | `STRING` | A file to write the token to. | diff --git a/src/current/molt/replicator-metrics.md b/src/current/molt/replicator-metrics.md new file mode 100644 index 00000000000..dfdffaf373a --- /dev/null +++ b/src/current/molt/replicator-metrics.md @@ -0,0 +1,289 @@ +--- +title: Replicator Metrics +summary: Learn how to monitor stages of the MOLT Replicator pipeline. +toc: true +docs_area: migrate +--- + +[MOLT Replicator]({% link molt/molt-replicator.md %}) exposes Prometheus metrics at each stage of the [replication pipeline](#replication-pipeline). When using Replicator to perform [forward replication]({% link molt/migrate-load-replicate.md %}#start-replicator) or [failback]({% link molt/migrate-failback.md %}), you should monitor the health of each pipeline stage to quickly detect issues. + +This page describes and provides usage guidelines for Replicator metrics, according to the replication source: + +- PostgreSQL +- MySQL +- Oracle +- CockroachDB (during [failback]({% link molt/migrate-failback.md %})) + +
+ + + + +
+ +## Replication pipeline + +[MOLT Replicator]({% link molt/molt-replicator.md %}) replicates data as a pipeline of change events that travel from the source database to the target database where changes are applied. The Replicator pipeline consists of four stages: + +- [**Source read**](#source-read): Connects Replicator to the source database and captures changes via logical replication (PostgreSQL, MySQL), LogMiner (Oracle), or [changefeed messages]({% link {{ site.current_cloud_version }}/changefeed-messages.md %}) (CockroachDB). + +- [**Staging**](#staging): Buffers mutations for ordered processing and crash recovery. + +
+- [**Core sequencer**](#core-sequencer): Processes staged mutations, maintains ordering guarantees, and coordinates transaction application. +
+ +
+- **Core sequencer**: Processes staged mutations, maintains ordering guarantees, and coordinates transaction application. +
+ +- [**Target apply**](#target-apply): Applies mutations to the target database. + +## Set up metrics + +Enable Replicator metrics by specifying the [`--metricsAddr`]({% link molt/replicator-flags.md %}#metrics-addr) flag with a port (or `host:port`) when you start Replicator. This exposes Replicator metrics at `http://{host}:{port}/_/varz`. For example, the following command exposes metrics on port `30005`: + +{% include_cached copy-clipboard.html %} +~~~ shell +replicator start \ +--targetConn $TARGET \ +--stagingConn $STAGING \ +--metricsAddr :30005 +... +~~~ + +To collect Replicator metrics, set up [Prometheus](https://prometheus.io/) to scrape the [Replicator metrics endpoint](#metrics-endpoints). To [visualize Replicator metrics](#visualize-metrics), use [Grafana](https://grafana.com/) to create dashboards. + +## Metrics endpoints + +The following endpoints are available when you [enable Replicator metrics](#set-up-metrics): + +| Endpoint | Description | +|-----------------|----------------------------------------------------------------------------| +| `/_/varz` | Prometheus metrics endpoint. | +| `/_/diag` | Structured diagnostic information (JSON). | +| `/_/healthz` | Health check endpoint. | +| `/debug/pprof/` | Go pprof handlers for profiling. | + +For example, to view the current snapshot of Replicator metrics on port `30005`, open `http://localhost:30005/_/varz` in a browser. To track metrics over time and create visualizations, use Prometheus and Grafana as described in [Set up metrics](#set-up-metrics). + +To check Replicator health: + +{% include_cached copy-clipboard.html %} +~~~ shell +curl http://localhost:30005/_/healthz +~~~ + +~~~ +OK +~~~ + +### Visualize metrics + +Use the [Replicator Grafana dashboard](https://replicator.cockroachdb.com/replicator_grafana_dashboard.json) to visualize metrics. 
For Oracle sources, also import the [Oracle Grafana dashboard](https://replicator.cockroachdb.com/replicator_oracle_grafana_dashboard.json) to visualize [Oracle source metrics](#oracle-source). + +## Overall replication metrics + +### High-level performance metrics + +Monitor the following metrics to track the overall health of the [replication pipeline](#replication-pipeline): + +
+- `core_source_lag_seconds` + - Description: Age of the most recently received checkpoint. This represents the time from source commit to `COMMIT` event processing. + - Interpretation: If consistently increasing, Replicator is falling behind in reading source changes, and cannot keep pace with database changes. +
+
+- `core_source_lag_seconds` + - Description: Age of the most recently received checkpoint. This represents the time elapsed since the latest received resolved timestamp. + - Interpretation: If consistently increasing, Replicator is falling behind in reading source changes, and cannot keep pace with database changes. +
+
+- `target_apply_mutation_age_seconds` + - Description: End-to-end replication lag per mutation from source commit to target apply. Measures the difference between current wall time and the mutation's [MVCC timestamp]({% link {{ site.current_cloud_version }}/architecture/storage-layer.md %}#mvcc). + - Interpretation: Higher values mean that older mutations are being applied, and indicate end-to-end pipeline delays. Compare across tables to find bottlenecks. +
+- `target_apply_queue_utilization_percent` + - Description: Percentage of target apply queue capacity utilization. + - Interpretation: Values approaching 100 percent indicate severe backpressure throughout the pipeline, and potential data processing delays. + +
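The "consistently increasing" guidance for `core_source_lag_seconds` can be checked mechanically once samples are collected. The following is a minimal sketch, assuming you already have a time-ordered list of samples (for example, exported from Prometheus). The function name and the `min_samples` window are hypothetical and are not part of Replicator; treating "consistently increasing" as "strictly increasing across the window" is one simple definition among several.

```python
from typing import Sequence


def is_consistently_increasing(samples: Sequence[float], min_samples: int = 5) -> bool:
    """Return True if every sample is strictly greater than the one before it.

    `samples` is assumed to be ordered oldest to newest. Windows shorter than
    `min_samples` (a hypothetical default) are treated as inconclusive.
    """
    if len(samples) < min_samples:
        return False
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))


# Example: lag samples (in seconds) taken at a fixed interval.
rising = [1.0, 2.5, 4.0, 6.5, 9.0]
steady = [1.0, 2.5, 2.0, 2.2, 1.9]
print(is_consistently_increasing(rising), is_consistently_increasing(steady))
```

A stricter or looser rule (for example, a positive slope over a longer window) may suit noisy workloads better.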
+### Replication lag + +Monitor the following metric to track end-to-end replication lag: + +- `target_apply_transaction_lag_seconds` + - Description: Age of the transaction applied to the target table, measuring time from source commit to target apply. + - Interpretation: Consistently high values indicate bottlenecks in the pipeline. Compare with `core_source_lag_seconds` to determine whether the delay is in source read or target apply. +
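The comparison between `target_apply_transaction_lag_seconds` and `core_source_lag_seconds` can be sketched as simple arithmetic. This is an illustrative sketch only: the function name and the returned keys are hypothetical, and because the two metrics are sampled at slightly different moments, the split is approximate.

```python
from typing import Dict


def split_lag(source_lag_seconds: float, end_to_end_lag_seconds: float) -> Dict[str, float]:
    """Split end-to-end lag into a source-read part and a downstream part.

    source_lag_seconds:      a sample of core_source_lag_seconds
    end_to_end_lag_seconds:  a sample of target_apply_transaction_lag_seconds
    The downstream part covers everything after the source read, through
    target apply. It is clamped at zero because the samples are not taken
    at exactly the same instant.
    """
    downstream = max(end_to_end_lag_seconds - source_lag_seconds, 0.0)
    return {
        "source_read_seconds": source_lag_seconds,
        "downstream_seconds": downstream,
    }


print(split_lag(2.0, 10.0))
```

If most of the lag falls in the downstream part, the delay is more likely after the source read than in it.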
+ +
+### Progress tracking + +Monitor the following metrics to track checkpoint progress: + +- `target_applied_timestamp_seconds` + - Description: Wall time (Unix timestamp) of the most recently applied resolved timestamp. + - Interpretation: Use to verify continuous progress. A value that stops advancing indicates that apply has stalled. +- `target_pending_timestamp_seconds` + - Description: Wall time (Unix timestamp) of the most recently received resolved timestamp. + - Interpretation: A gap between this metric and `target_applied_timestamp_seconds` indicates an apply backlog, meaning that the pipeline cannot keep up with incoming changes. +
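The gap between the two progress metrics can be computed from a single scrape of `/_/varz`. The following is a minimal sketch that parses Prometheus text exposition output by hand. The `SAMPLE` text, including the `schema` label and its value, is made up for illustration and is an assumption about label names, and both function names are hypothetical. In practice, a Prometheus query over the two series would serve the same purpose.

```python
from typing import Dict, Tuple

# Hypothetical scrape output. The label name and values are assumptions.
SAMPLE = """\
# TYPE target_applied_timestamp_seconds gauge
target_applied_timestamp_seconds{schema="demo.public"} 1700000000
# TYPE target_pending_timestamp_seconds gauge
target_pending_timestamp_seconds{schema="demo.public"} 1700000012.5
"""


def parse_samples(text: str) -> Dict[Tuple[str, str], float]:
    """Map (metric name, raw label string) to the sample value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            head, _, tail = line.rpartition("}")
            name, _, labels = head.partition("{")
        else:
            name, _, tail = line.partition(" ")
            labels = ""
        fields = tail.split()
        if fields:
            # The first field is the value; an optional timestamp may follow.
            samples[(name, labels)] = float(fields[0])
    return samples


def apply_backlog_seconds(text: str) -> Dict[str, float]:
    """Return pending minus applied timestamp for each label set."""
    samples = parse_samples(text)
    backlog = {}
    for (name, labels), pending in samples.items():
        if name != "target_pending_timestamp_seconds":
            continue
        applied = samples.get(("target_applied_timestamp_seconds", labels))
        if applied is not None:
            backlog[labels] = pending - applied
    return backlog


print(apply_backlog_seconds(SAMPLE))
```

A backlog that keeps growing across scrapes matches the "cannot keep up" condition described above. A stable, small gap is expected during normal operation.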
+ +## Replication pipeline metrics + +### Source read + +[Source read](#replication-pipeline) metrics track the health of connections to source databases and the volume of incoming changes. + +{{site.data.alerts.callout_info}} +For checkpoint terminology, refer to the [MOLT Replicator documentation]({% link molt/molt-replicator.md %}#terminology). +{{site.data.alerts.end}} + +
+#### CockroachDB source + +- `checkpoint_committed_age_seconds` + - Description: Age of the committed checkpoint. + - Interpretation: Increasing values indicate checkpoint commits are falling behind, which affects crash recovery capability. +- `checkpoint_proposed_age_seconds` + - Description: Age of the proposed checkpoint. + - Interpretation: A gap between this metric and `checkpoint_committed_age_seconds` indicates checkpoint commit lag. +- `checkpoint_commit_duration_seconds` + - Description: Amount of time taken to save the committed checkpoint to the staging database. + - Interpretation: High values indicate staging database bottlenecks due to write contention or performance issues. +- `checkpoint_proposed_going_backwards_errors_total` + - Description: Number of times an error condition occurred in which the changefeed was restarted. + - Interpretation: Indicates a source changefeed restart or time regression. Requires immediate investigation of source changefeed stability. +
+ +
+#### Oracle source + +{{site.data.alerts.callout_success}} +To visualize the following metrics, import the [Oracle Grafana dashboard](https://replicator.cockroachdb.com/replicator_oracle_grafana_dashboard.json). +{{site.data.alerts.end}} + +- `oraclelogminer_scn_interval_size` + - Description: Size of the interval from the start SCN to the current Oracle SCN. + - Interpretation: Values larger than the [`--scnWindowSize`]({% link molt/replicator-flags.md %}#scn) flag value indicate replication lag, or that replication is idle. +- `oraclelogminer_time_per_window_seconds` + - Description: Amount of time taken to fully process an SCN interval. + - Interpretation: Large values indicate an Oracle slowdown, a blocked replication loop, or slow processing. +- `oraclelogminer_query_redo_logs_duration_seconds` + - Description: Amount of time taken to query redo logs from LogMiner. + - Interpretation: High values indicate Oracle is under load or the SCN interval is too large. +- `oraclelogminer_num_inflight_transactions_in_memory` + - Description: Current number of in-flight transactions in memory. + - Interpretation: High counts indicate long-running transactions on the source. Monitor memory usage. +- `oraclelogminer_num_async_checkpoints_in_queue` + - Description: Number of checkpoints queued for processing against the staging database. + - Interpretation: Values close to the `--checkpointQueueBufferSize` flag value indicate checkpoint processing cannot keep up with incoming checkpoints. +- `oraclelogminer_upsert_checkpoints_duration` + - Description: Amount of time taken to upsert a checkpoint batch into the staging database. + - Interpretation: High values indicate the staging database is under heavy load or the batch size is too large. +- `oraclelogminer_delete_checkpoints_duration` + - Description: Amount of time taken to delete old checkpoints from the staging database. + - Interpretation: High values indicate staging database load or long-running transactions preventing checkpoint deletion. +
+ +
+#### MySQL source + +- `mylogical_dial_success_total` + - Description: Number of times Replicator successfully started logical replication. + - Interpretation: Multiple successes may indicate reconnects. Monitor for connection stability. +- `mylogical_dial_failure_total` + - Description: Number of times Replicator failed to start logical replication. + - Interpretation: Nonzero values indicate connection issues. Check network connectivity and source database health. +- `mutations_total` + - Description: Total number of mutations processed, labeled by source and mutation type (insert/update/delete). + - Interpretation: Use to monitor replication throughput and identify traffic patterns. +
+ +
+#### PostgreSQL source + +- `pglogical_dial_success_total` + - Description: Number of times Replicator successfully started logical replication (executed `START_REPLICATION` command). + - Interpretation: Multiple successes may indicate reconnects. Monitor for connection stability. +- `pglogical_dial_failure_total` + - Description: Number of times Replicator failed to start logical replication (failure to execute `START_REPLICATION` command). + - Interpretation: Nonzero values indicate connection issues. Check network connectivity and source database health. +- `mutations_total` + - Description: Total number of mutations processed, labeled by source and mutation type (insert/update/delete). + - Interpretation: Use to monitor replication throughput and identify traffic patterns. +
+ +### Staging + +[Staging](#replication-pipeline) metrics track the health of the staging layer where mutations are buffered for ordered processing. + +{{site.data.alerts.callout_info}} +For checkpoint terminology, refer to the [MOLT Replicator documentation]({% link molt/molt-replicator.md %}#terminology). +{{site.data.alerts.end}} + +- `stage_commit_lag_seconds` + - Description: Time between writing a mutation to the source and writing it to staging. + - Interpretation: High values indicate delays in getting data into the staging layer. +- `stage_mutations_total` + - Description: Number of mutations staged for each table. + - Interpretation: Use to monitor staging throughput per table. +- `stage_duration_seconds` + - Description: Amount of time taken to successfully stage mutations. + - Interpretation: High values indicate write performance issues on the staging database. + 
+### Core sequencer + +[Core sequencer](#replication-pipeline) metrics track mutation processing, ordering, and transaction coordination. + +- `core_sweep_duration_seconds` + - Description: Duration of each schema sweep operation, which looks for and applies staged mutations. + - Interpretation: Long durations indicate that large backlogs, slow staging reads, or slow target writes are affecting throughput. +- `core_sweep_mutations_applied_total` + - Description: Total count of mutations read from staging and successfully applied to the target database during a sweep. + - Interpretation: Use to monitor processing throughput. A flat line indicates no mutations are being applied. +- `core_sweep_success_timestamp_seconds` + - Description: Wall time (Unix timestamp) at which a sweep attempt last succeeded. + - Interpretation: Stale values indicate the sweep has stopped. +- `core_parallelism_utilization_percent` + - Description: Percentage of the configured parallelism that is actively being used for concurrent transaction processing. + - Interpretation: High utilization indicates bottlenecks in mutation processing. +
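The staleness check described for `core_sweep_success_timestamp_seconds` can be sketched as follows. This is a minimal illustration, not part of Replicator: the function names and the 300-second threshold are hypothetical, and the metric value is assumed to have already been read from the metrics endpoint as a Unix timestamp in seconds.

```python
import time
from typing import Optional


def sweep_staleness_seconds(last_success_ts: float, now: Optional[float] = None) -> float:
    """Return how many seconds have passed since the last successful sweep.

    last_success_ts is the value of core_sweep_success_timestamp_seconds.
    """
    if now is None:
        now = time.time()
    # Clamp at zero so small clock skew does not produce negative staleness.
    return max(0.0, now - last_success_ts)


def is_sweep_stalled(
    last_success_ts: float,
    threshold_seconds: float = 300.0,  # hypothetical threshold, tune per workload
    now: Optional[float] = None,
) -> bool:
    """Report whether the last successful sweep is older than the threshold."""
    return sweep_staleness_seconds(last_success_ts, now) > threshold_seconds
```

A reasonable threshold depends on how often the source produces changes, so treat the default above as a placeholder.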
+ +### Target apply + +[Target apply](#replication-pipeline) metrics track mutation application to the target database. + +- `target_apply_queue_size` + - Description: Number of transactions waiting in the target apply queue. + - Interpretation: High values indicate target apply cannot keep up with incoming transactions. +- `target_apply_queue_utilization_percent` + - Description: Percentage of apply queue capacity utilization. + - Interpretation: Values above 90 percent indicate severe backpressure. Increase [`--targetApplyQueueSize`]({% link molt/replicator-flags.md %}#target-apply-queue-size) or investigate target database performance. +- `apply_duration_seconds` + - Description: Amount of time taken to successfully apply mutations to a table. + - Interpretation: High values indicate target database performance issues or contention. +- `apply_upserts_total` + - Description: Number of rows upserted to the target. + - Interpretation: Use to monitor write throughput. Should grow steadily during active replication. +- `apply_deletes_total` + - Description: Number of rows deleted from the target. + - Interpretation: Use to monitor delete throughput. Compare with delete operations on the source database. +- `apply_errors_total` + - Description: Number of times an error was encountered while applying mutations. + - Interpretation: Growing error count indicates target database issues or constraint violations. +- `apply_conflicts_total` + - Description: Number of rows that experienced a compare-and-set (CAS) conflict. + - Interpretation: High counts indicate concurrent modifications or stale data conflicts. May require conflict resolution tuning. +- `apply_resolves_total` + - Description: Number of rows that experienced a compare-and-set (CAS) conflict and were successfully resolved. + - Interpretation: Compare with `apply_conflicts_total` to verify conflict resolution is working. Should be close to or equal to conflicts. 
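As a rough sketch of how the target apply guidance above might be checked, the following parses Prometheus text-format samples and compares `apply_resolves_total` with `apply_conflicts_total`. The sample values, the `table` label, and the helper names are assumptions made for illustration. The parser is deliberately simple and assumes sample lines carry no trailing timestamp.

```python
from collections import defaultdict

# Hypothetical scrape output; real label names and values will differ.
SAMPLE = """\
# TYPE apply_conflicts_total counter
apply_conflicts_total{table="orders"} 40
apply_conflicts_total{table="users"} 10
apply_resolves_total{table="orders"} 38
apply_resolves_total{table="users"} 10
target_apply_queue_utilization_percent 93.5
"""


def sum_by_metric(text):
    """Sum sample values per metric name, ignoring labels and comment lines."""
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Sample lines look like `name{labels} value` or `name value`.
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]
        totals[name] += float(value)
    return dict(totals)


def summarize(totals):
    """Derive the conflict resolve ratio and a backpressure flag."""
    conflicts = totals.get("apply_conflicts_total", 0.0)
    resolves = totals.get("apply_resolves_total", 0.0)
    utilization = totals.get("target_apply_queue_utilization_percent", 0.0)
    return {
        # With no conflicts there is nothing left unresolved.
        "resolve_ratio": resolves / conflicts if conflicts else 1.0,
        # Values above 90 percent are described above as severe backpressure.
        "severe_backpressure": utilization > 90.0,
    }


summary = summarize(sum_by_metric(SAMPLE))
```

A resolve ratio well below 1.0 suggests that some conflicts are not being resolved. Summing is appropriate for the counters here; a gauge such as the queue utilization is only meaningful when a single series is present, as in this sample.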
+ +## See also + +- [MOLT Replicator]({% link molt/molt-replicator.md %}) +- [Replicator Flags]({% link molt/replicator-flags.md %}) +- [Load and Replicate]({% link molt/migrate-load-replicate.md %}) +- [Migration Failback]({% link molt/migrate-failback.md %}) From 1370004519308358732ba6e648a0a0db70277f50 Mon Sep 17 00:00:00 2001 From: Ryan Kuo Date: Mon, 15 Dec 2025 11:24:04 -0500 Subject: [PATCH 2/5] filter staging metrics for non-crdb sources --- src/current/molt/replicator-metrics.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/src/current/molt/replicator-metrics.md b/src/current/molt/replicator-metrics.md index dfdffaf373a..8f6d84e6f4f 100644 --- a/src/current/molt/replicator-metrics.md +++ b/src/current/molt/replicator-metrics.md @@ -5,7 +5,7 @@ toc: true docs_area: migrate --- -[MOLT Replicator]({% link molt/molt-replicator.md %}) exposes Prometheus metrics at each stage of the [replication pipeline](#replication-pipeline). When using Replicator to perform [forward replication]({% link molt/migrate-load-replicate.md %}#start-replicator) or [failback]({% link molt/migrate-failback.md %}), you should monitor the health of each pipeline stage to quickly detect issues. +[MOLT Replicator]({% link molt/molt-replicator.md %}) exposes Prometheus metrics at each stage of the [replication pipeline](#replication-pipeline). When using Replicator to perform [forward replication]({% link molt/migrate-load-replicate.md %}#start-replicator) or [failback]({% link molt/migrate-failback.md %}), you should monitor the health of each relevant pipeline stage to quickly detect issues. 
This page describes and provides usage guidelines for Replicator metrics, according to the replication source: @@ -27,7 +27,13 @@ This page describes and provides usage guidelines for Replicator metrics, accord - [**Source read**](#source-read): Connects Replicator to the source database and captures changes via logical replication (PostgreSQL, MySQL), LogMiner (Oracle), or [changefeed messages]({% link {{ site.current_cloud_version }}/changefeed-messages.md %}) (CockroachDB). +
+- **Staging**: Buffers mutations for ordered processing and crash recovery. +
+ +
- [**Staging**](#staging): Buffers mutations for ordered processing and crash recovery. +
- [**Core sequencer**](#core-sequencer): Processes staged mutations, maintains ordering guarantees, and coordinates transaction application. @@ -215,6 +221,7 @@ To visualize the following metrics, import the [Oracle Grafana dashboard](https: - Interpretation: Use to monitor replication throughput and identify traffic patterns.
+
### Staging [Staging](#replication-pipeline) metrics track the health of the staging layer where mutations are buffered for ordered processing. @@ -232,6 +239,7 @@ For checkpoint terminology, refer to the [MOLT Replicator documentation]({% link - `stage_duration_seconds` - Description: Amount of time taken to successfully stage mutations. - Interpretation: High values indicate write performance issues on the staging database. +
### Core sequencer From adf79cce6379f6b0ed838a2d0d9872dea90d9053 Mon Sep 17 00:00:00 2001 From: Ryan Kuo <8740013+taroface@users.noreply.github.com> Date: Tue, 16 Dec 2025 13:33:44 -0500 Subject: [PATCH 3/5] Apply suggestions from code review Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com> --- src/current/molt/replicator-flags.md | 2 +- src/current/molt/replicator-metrics.md | 8 +++++++- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/src/current/molt/replicator-flags.md b/src/current/molt/replicator-flags.md index a8c8e114c66..5532e0f8502 100644 --- a/src/current/molt/replicator-flags.md +++ b/src/current/molt/replicator-flags.md @@ -1,7 +1,7 @@ --- title: Replicator Flags summary: Flag reference for MOLT Replicator -toc: true +toc: false docs_area: migrate --- diff --git a/src/current/molt/replicator-metrics.md b/src/current/molt/replicator-metrics.md index 8f6d84e6f4f..0b95e4c2ff9 100644 --- a/src/current/molt/replicator-metrics.md +++ b/src/current/molt/replicator-metrics.md @@ -86,7 +86,13 @@ OK ### Visualize metrics +
+Use the [Replicator Grafana dashboard](https://replicator.cockroachdb.com/replicator_grafana_dashboard.json) to visualize metrics. +
+ +
Use the [Replicator Grafana dashboard](https://replicator.cockroachdb.com/replicator_grafana_dashboard.json) to visualize metrics. For Oracle sources, also import the [Oracle Grafana dashboard](https://replicator.cockroachdb.com/replicator_oracle_grafana_dashboard.json) to visualize [Oracle source metrics](#oracle-source). +
## Overall replication metrics @@ -254,7 +260,7 @@ For checkpoint terminology, refer to the [MOLT Replicator documentation]({% link - Interpretation: Use to monitor processing throughput. A flat line indicates no mutations are being applied. - `core_sweep_success_timestamp_seconds` - Description: Wall time (Unix timestamp) at which a sweep attempt last succeeded. - - Interpretation: Stale values indicate the sweep has stopped. + - Interpretation: If this value stops updating and becomes stale, it indicates that the sweep has stopped. - `core_parallelism_utilization_percent` - Description: Percentage of the configured parallelism that is actively being used for concurrent transaction processing. - Interpretation: High utilization indicates bottlenecks in mutation processing. From f560dbcb92a843f3bdce6826a2824b35b79cf62b Mon Sep 17 00:00:00 2001 From: Ryan Kuo Date: Wed, 17 Dec 2025 18:22:43 -0500 Subject: [PATCH 4/5] address review comments --- src/current/molt/molt-fetch.md | 6 +++--- src/current/molt/replicator-metrics.md | 11 +++++------ 2 files changed, 8 insertions(+), 9 deletions(-) diff --git a/src/current/molt/molt-fetch.md b/src/current/molt/molt-fetch.md index 99a407fd058..fe4318bf145 100644 --- a/src/current/molt/molt-fetch.md +++ b/src/current/molt/molt-fetch.md @@ -122,8 +122,8 @@ MOLT Fetch loads the exported data into the target CockroachDB database. The pro | `--metrics-listen-addr` | Address of the Prometheus metrics endpoint, which has the path `{address}/metrics`. For details on important metrics to monitor, refer to [Monitoring](#monitoring).

**Default:** `'127.0.0.1:3030'` | | `--mode` | Configure the MOLT Fetch behavior: `data-load`, `export-only`, or `import-only`. For details, refer to [Fetch mode](#fetch-mode).

**Default:** `data-load` | | `--non-interactive` | Run the fetch task without interactive prompts. This is recommended **only** when running `molt fetch` in an automated process (i.e., a job or continuous integration). | -| `--pglogical-replication-slot-name` | Name of a replication slot that will be created before taking a snapshot of data. Must match the slot name specified with `--slotName` in the [MOLT Replicator command]({% link molt/molt-replicator.md %}#replication-checkpoints). For details, refer to [Load before replication](#load-before-replication). | -| `--pglogical-publication-and-slot-drop-and-recreate` | Drop the publication and replication slot if they exist, then recreate them. Creates a publication named `molt_fetch` and the replication slot specified with `--pglogical-replication-slot-name`. For details, refer to [Load before replication](#load-before-replication).

**Default:** `false` | +| `--pglogical-replication-slot-name` | Name of a PostgreSQL replication slot that will be created before taking a snapshot of data. Must match the slot name specified with `--slotName` in the [MOLT Replicator command]({% link molt/molt-replicator.md %}#replication-checkpoints). For details, refer to [Load before replication](#load-before-replication). | +| `--pglogical-publication-and-slot-drop-and-recreate` | Drop the PostgreSQL publication and replication slot if they exist, then recreate them. Creates a publication named `molt_fetch` and the replication slot specified with `--pglogical-replication-slot-name`. For details, refer to [Load before replication](#load-before-replication).

**Default:** `false` | | `--pprof-listen-addr` | Address of the pprof endpoint.

**Default:** `'127.0.0.1:3031'` | | `--row-batch-size` | Number of rows per shard to export at a time. For details on sharding, refer to [Table sharding](#table-sharding). See also [Best practices](#best-practices).

**Default:** `100000` | | `--schema-filter` | Move schemas that match a specified [regular expression](https://wikipedia.org/wiki/Regular_expression). Not used with MySQL sources. For Oracle sources, this filter is case-insensitive.

**Default:** `'.*'` | @@ -135,7 +135,7 @@ MOLT Fetch loads the exported data into the target CockroachDB database. The pro | `--transformations-file` | Path to a JSON file that defines transformations to be performed on the target schema during the fetch task. Refer to [Transformations](#transformations). | | `--type-map-file` | Path to a JSON file that contains explicit type mappings for automatic schema creation, when enabled with `--table-handling drop-on-target-and-recreate`. For details on the JSON format and valid type mappings, see [type mapping](#type-mapping). | | `--use-console-writer` | Use the console writer, which has cleaner log output but introduces more latency.

**Default:** `false` (log as structured JSON) | -| `--use-copy` | Use [`COPY FROM`](#data-load-mode) to move data. This makes tables queryable during data load, but is slower than using `IMPORT INTO`. For details, refer to [Data load mode](#data-load-mode). | +| `--use-copy` | Use [`COPY FROM`](#data-load-mode) to move data. This makes tables queryable during data load, but is slower than using `IMPORT INTO`. For details, refer to [Data load mode](#data-load-mode). | | `--use-implicit-auth` | Use [implicit authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}) for [cloud storage](#bucket-path) URIs. | | `--use-stats-based-sharding` | Enable statistics-based sharding for PostgreSQL sources. This allows sharding of tables with primary keys of any data type and can create more evenly distributed shards compared to the default numerical range sharding. Requires PostgreSQL 11+ and access to `pg_stats`. For details, refer to [Table sharding](#table-sharding). | diff --git a/src/current/molt/replicator-metrics.md b/src/current/molt/replicator-metrics.md index 0b95e4c2ff9..c4d311091dc 100644 --- a/src/current/molt/replicator-metrics.md +++ b/src/current/molt/replicator-metrics.md @@ -148,10 +148,6 @@ Monitor the following metrics to track checkpoint progress: [Source read](#replication-pipeline) metrics track the health of connections to source databases and the volume of incoming changes. -{{site.data.alerts.callout_info}} -For checkpoint terminology, refer to the [MOLT Replicator documentation]({% link molt/molt-replicator.md %}#terminology). -{{site.data.alerts.end}} -
#### CockroachDB source @@ -197,6 +193,9 @@ To visualize the following metrics, import the [Oracle Grafana dashboard](https: - `oraclelogminer_delete_checkpoints_duration` - Description: Amount of time taken to delete old checkpoints from the staging database. - Interpretation: High values indicate staging database load or long-running transactions preventing checkpoint deletion. +- `mutation_total` + - Description: Total number of mutations processed, labeled by source and mutation type (insert/update/delete). + - Interpretation: Use to monitor replication throughput and identify traffic patterns.
@@ -208,7 +207,7 @@ To visualize the following metrics, import the [Oracle Grafana dashboard](https: - `mylogical_dial_failure_total` - Description: Number of times Replicator failed to start logical replication. - Interpretation: Nonzero values indicate connection issues. Check network connectivity and source database health. -- `mutations_total` +- `mutation_total` - Description: Total number of mutations processed, labeled by source and mutation type (insert/update/delete). - Interpretation: Use to monitor replication throughput and identify traffic patterns.
@@ -222,7 +221,7 @@ To visualize the following metrics, import the [Oracle Grafana dashboard](https: - `pglogical_dial_failure_total` - Description: Number of times Replicator failed to start logical replication (failure to execute `START_REPLICATION` command). - Interpretation: Nonzero values indicate connection issues. Check network connectivity and source database health. -- `mutations_total` +- `mutation_total` - Description: Total number of mutations processed, labeled by source and mutation type (insert/update/delete). - Interpretation: Use to monitor replication throughput and identify traffic patterns.
From df6e8e522e3daa0e5584ab063a7fe990511d9dda Mon Sep 17 00:00:00 2001 From: Ryan Kuo Date: Thu, 18 Dec 2025 18:38:02 -0500 Subject: [PATCH 5/5] update target_apply_queue_utilization_percent entry; add --scnWindowSize to flags table --- src/current/molt/replicator-flags.md | 155 +++++++++++++------------ src/current/molt/replicator-metrics.md | 15 ++- 2 files changed, 88 insertions(+), 82 deletions(-) diff --git a/src/current/molt/replicator-flags.md b/src/current/molt/replicator-flags.md index 5532e0f8502..ad6ec3b8664 100644 --- a/src/current/molt/replicator-flags.md +++ b/src/current/molt/replicator-flags.md @@ -7,80 +7,81 @@ docs_area: migrate This page lists all available flags for the [MOLT Replicator commands]({% link molt/molt-replicator.md %}#commands): `start`, `pglogical`, `mylogical`, `oraclelogminer`, and `make-jwt`. -| Flag | Commands | Type | Description | -|---------------------------------------------------------------------------------------------|-----------------------------------------------------|------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `-a`, `--allow` | `make-jwt` | `STRING` | One or more `database.schema` identifiers. Can be repeated for multiple schemas. | -| `--applyTimeout` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | The maximum amount of time to wait for an update to be applied.

**Default:** `30s` | -| `--assumeIdempotent` | `start` | `BOOL` | Disable the extra staging table queries that debounce non-idempotent redelivery in changefeeds. | -| `--backfillFromSCN` | `oraclelogminer` | `INT` | The SCN of the earliest active transaction at the time of the initial snapshot. Ensures no transactions are skipped when starting replication from Oracle. | -| `--bestEffortOnly` | `start` | `BOOL` | Eventually-consistent mode; useful for high-throughput, skew-tolerant schemas with [foreign keys]({% link {{ site.current_cloud_version }}/foreign-key.md %}). | -| `--bestEffortWindow` | `start` | `DURATION` | Use an eventually-consistent mode for initial backfill or when replication is behind; `0` to disable.

**Default:** `1h0m0s` | -| `--bindAddr` | `start` | `STRING` | The network address to bind to.

**Default:** `":26258"` | -| `--claim` | `make-jwt` | `BOOL` | If `true`, print a minimal JWT claim instead of signing. | -| `--collapseMutations` | `start`, `pglogical`, `mylogical` | `BOOL` | Combine multiple mutations on the same primary key within each batch into a single mutation.

**Default:** `true` | -| `--defaultGTIDSet` | `mylogical` | `STRING` | **Required** the first time `replicator` is run. The default GTID set, in the format `source_uuid:min(interval_start)-max(interval_end)`, which provides a replication marker for streaming changes. | -| `--disableAuthentication` | `start` | `BOOL` | Disable authentication of incoming Replicator requests; not recommended for production. | -| `--discard` | `start` | `BOOL` | **Dangerous:** Discard all incoming HTTP requests; useful for changefeed throughput testing. Not intended for production. | -| `--discardDelay` | `start` | `DURATION` | Adds additional delay in discard mode; useful for gauging the impact of changefeed round-trip time (RTT). | -| `--dlqTableName` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `IDENT` | The name of a table in the target schema for storing dead-letter entries.

**Default:** `replicator_dlq` | -| `--enableCheckpointStream` | `start` | `BOOL` | Enable checkpoint streaming (use an internal changefeed from the staging table for real-time updates), rather than checkpoint polling (query the staging table for periodic updates), for failback replication.

**Default:** `false` (use checkpoint polling) | -| `--enableParallelApplies` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `BOOL` | Enable parallel application of independent table groups during replication. By default, applies are synchronous. When enabled, this increases throughput at the cost of higher target pool usage and memory usage.

**Default:** `false` | -| `--fetchMetadata` | `mylogical` | `BOOL` | Fetch column metadata explicitly, for older versions of MySQL that do not support `binlog_row_metadata`. | -| `--flushPeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Flush queued mutations after this duration.

**Default:** `1s` | -| `--flushSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | Ideal batch size to determine when to flush mutations.

**Default:** `1000` | -| `--gracePeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Allow background processes to exit.

**Default:** `30s` | -| `--healthCheckTimeout` | `start` | `DURATION` | The timeout for the health check endpoint.

**Default:** `5s` | -| `--httpResponseTimeout` | `start` | `DURATION` | The maximum amount of time to allow an HTTP handler to execute.

**Default:** `2m0s` | -| `--immediate` | `start` | `BOOL` | Bypass staging tables and write directly to target; recommended only for KV-style workloads with no [foreign keys]({% link {{ site.current_cloud_version }}/foreign-key.md %}). | -| `-k`, `--key` | `make-jwt` | `STRING` | The path to a PEM-encoded private key to sign the token with. | -| `--limitLookahead` | `start` | `INT` | Limit number of checkpoints to be considered when computing the resolving range; may cause replication to stall completely if older mutations cannot be applied. | -| `--logDestination` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | Write logs to a file. If not specified, write logs to `stdout`. | -| `--logFormat` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | Choose log output format: `"fluent"`, `"text"`.

**Default:** `"text"` | -| `--maxRetries` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | Maximum number of times to retry a failed mutation on the target (for example, due to contention or a temporary unique constraint violation) before treating it as a hard failure.

**Default:** `10` | -| `--metricsAddr` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | A `:port` or `host:port` on which to serve metrics and diagnostics. The metrics endpoint is `http://{host}:{port}/_/varz`. | -| `--ndjsonBufferSize` | `start` | `INT` | The maximum amount of data to buffer while reading a single line of `ndjson` input; increase when source cluster has large blob values.

**Default:** `65536` | -| `--oracle-application-users` | `oraclelogminer` | `STRING` | List of Oracle usernames responsible for DML transactions in the PDB schema. Enables replication from the latest-possible starting point. Usernames are case-sensitive and must match the internal Oracle usernames (e.g., `PDB_USER`). | -| `-o`, `--out` | `make-jwt` | `STRING` | A file to write the token to. | -| `--parallelism` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The number of concurrent database transactions to use.

**Default:** `16` | -| `--publicationName` | `pglogical` | `STRING` | The publication within the source database to replicate. | -| `--quiescentPeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How often to retry deferred mutations.

**Default:** `10s` | -| `--replicationProcessID` | `mylogical` | `UINT32` | The replication process ID to report to the source database.

**Default:** `10` | -| `--retireOffset` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How long to delay removal of applied mutations.

**Default:** `24h0m0s` | -| `--retryInitialBackoff` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Initial delay before the first retry attempt when applying a mutation to the target database fails due to a retryable error, such as contention or a temporary unique constraint violation.

**Default:** `25ms` | -| `--retryMaxBackoff` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Maximum delay between retry attempts when applying mutations to the target database fails due to retryable errors.

**Default:** `2s` | -| `--retryMultiplier` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | Multiplier that controls how quickly the backoff interval increases between successive retries of failed applies to the target database.

**Default:** `2` | -| `--scanSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The number of rows to retrieve from the staging database used to store metadata for replication.

**Default:** `10000` | -| `--schemaRefresh` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled.

**Default:** `1m0s` | -| `--scn` | `oraclelogminer` | `INT` | **Required** the first time `replicator` is run. The snapshot System Change Number (SCN) from the initial data load, which provides a replication marker for streaming changes. | -| `--slotName` | `pglogical` | `STRING` | **Required.** PostgreSQL replication slot name. Must match the slot name specified with `--pglogical-replication-slot-name` in the [MOLT Fetch command]({% link molt/molt-fetch.md %}#load-before-replication).

**Default:** `"replicator"` | -| `--sourceConn` | `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | The source database's connection string. When replicating from Oracle, this is the connection string of the Oracle container database (CDB). | -| `--sourcePDBConn` | `oraclelogminer` | `STRING` | Connection string for the Oracle pluggable database (PDB). Only required when using an [Oracle multitenant configuration](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html). [`--sourceConn`](#source-conn) **must** be included. | -| `--sourceSchema` | `oraclelogminer` | `STRING` | **Required.** Source schema name on Oracle where tables will be replicated from. | -| `--stageDisableCreateTableReaderIndex` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `BOOL` | Disable the creation of partial covering indexes to improve read performance on staging tables. Set to `true` if creating indexes on existing tables would cause a significant operational impact.

**Default:** `false` | -| `--stageMarkAppliedLimit` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | Limit the number of mutations to be marked applied in a single statement.

**Default:** `100000` | -| `--stageSanityCheckPeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How often to validate staging table apply order (`-1` to disable).

**Default:** `10m0s` | -| `--stageSanityCheckWindow` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How far back to look when validating staging table apply order.

**Default:** `1h0m0s` | -| `--stageUnappliedPeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How often to report the number of unapplied mutations in staging tables (`-1` to disable).

**Default:** `1m0s` | -| `--stagingConn` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | The staging database's connection string. | -| `--stagingCreateSchema` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `BOOL` | Automatically create the staging schema if it does not exist. | -| `--stagingIdleTime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Maximum lifetime of an idle connection.

**Default:** `1m0s` | -| `--stagingJitterTime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | The time over which to jitter database pool disconnections.

**Default:** `15s` | -| `--stagingMaxLifetime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | The maximum lifetime of a database connection.

**Default:** `5m0s` | -| `--stagingMaxPoolSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The maximum number of staging database connections.

**Default:** `128` | -| `--stagingSchema` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | Name of the CockroachDB schema that stores replication metadata. **Required** each time `replicator` is rerun after being interrupted, as the schema contains a checkpoint table that enables replication to resume from the correct transaction.

**Default:** `_replicator.public` | -| `--standbyTimeout` | `pglogical` | `DURATION` | How often to report WAL progress to the source server.

**Default:** `5s` | -| `--targetApplyQueueSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | Size of the apply queue that buffers mutations before they are written to the target database. Larger values can improve throughput, but increase memory usage. This flag applies only to CockroachDB and PostgreSQL (`pglogical`) sources, and replaces the deprecated `--copierChannel` and `--stageCopierChannelSize` flags. | -| `--targetConn` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | The target database's connection string. | -| `--targetIdleTime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Maximum lifetime of an idle connection.

**Default:** `1m0s` | -| `--targetJitterTime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | The time over which to jitter database pool disconnections.

**Default:** `15s` | -| `--targetMaxLifetime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | The maximum lifetime of a database connection.

**Default:** `5m0s` | -| `--targetMaxPoolSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The maximum number of target database connections.

**Default:** `128` | -| `--targetSchema` | `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | **Required.** The SQL database schema in the target cluster to update. CockroachDB schema names must be fully qualified in the format `database.schema`. | -| `--targetStatementCacheSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The maximum number of prepared statements to retain.

**Default:** `128` | -| `--taskGracePeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How long to allow for task cleanup when recovering from errors.

**Default:** `1m0s` | -| `--timestampLimit` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The maximum number of source timestamps to coalesce into a target transaction.

**Default:** `1000` | -| `--tlsCertificate` | `start` | `STRING` | A path to a PEM-encoded TLS certificate chain. | -| `--tlsPrivateKey` | `start` | `STRING` | A path to a PEM-encoded TLS private key. | -| `--tlsSelfSigned` | `start` | `BOOL` | If true, generate a self-signed TLS certificate valid for `localhost`. | -| `--userscript` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | The path to a TypeScript configuration script. For example, `--userscript 'script.ts'`. | -| `-v`, `--verbose` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `COUNT` | Increase logging verbosity. Use `-v` for `debug` logging or `-vv` for `trace` logging. | +| Flag | Commands | Type | Description | +|---------------------------------------------------------------------------------------------|-----------------------------------------------------|------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `-a`, `--allow` | `make-jwt` | `STRING` | One or more `database.schema` identifiers. Can be repeated for multiple schemas. | +| `--applyTimeout` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | The maximum amount of time to wait for an update to be applied.

**Default:** `30s` | +| `--assumeIdempotent` | `start` | `BOOL` | Disable the extra staging table queries that debounce non-idempotent redelivery in changefeeds. | +| `--backfillFromSCN` | `oraclelogminer` | `INT` | The SCN of the earliest active transaction at the time of the initial snapshot. Ensures no transactions are skipped when starting replication from Oracle. | +| `--bestEffortOnly` | `start` | `BOOL` | Eventually-consistent mode; useful for high-throughput, skew-tolerant schemas with [foreign keys]({% link {{ site.current_cloud_version }}/foreign-key.md %}). | +| `--bestEffortWindow` | `start` | `DURATION` | Use an eventually-consistent mode for initial backfill or when replication is behind; `0` to disable.

**Default:** `1h0m0s` | +| `--bindAddr` | `start` | `STRING` | The network address to bind to.

**Default:** `":26258"` | +| `--claim` | `make-jwt` | `BOOL` | If `true`, print a minimal JWT claim instead of signing. | +| `--collapseMutations` | `start`, `pglogical`, `mylogical` | `BOOL` | Combine multiple mutations on the same primary key within each batch into a single mutation.

**Default:** `true` | +| `--defaultGTIDSet` | `mylogical` | `STRING` | **Required** the first time `replicator` is run. The default GTID set, in the format `source_uuid:min(interval_start)-max(interval_end)`, which provides a replication marker for streaming changes. | +| `--disableAuthentication` | `start` | `BOOL` | Disable authentication of incoming Replicator requests; not recommended for production. | +| `--discard` | `start` | `BOOL` | **Dangerous:** Discard all incoming HTTP requests; useful for changefeed throughput testing. Not intended for production. | +| `--discardDelay` | `start` | `DURATION` | Adds additional delay in discard mode; useful for gauging the impact of changefeed round-trip time (RTT). | +| `--dlqTableName` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `IDENT` | The name of a table in the target schema for storing dead-letter entries.

**Default:** `replicator_dlq` | +| `--enableCheckpointStream` | `start` | `BOOL` | Enable checkpoint streaming (use an internal changefeed from the staging table for real-time updates), rather than checkpoint polling (query the staging table for periodic updates), for failback replication.

**Default:** `false` (use checkpoint polling) | +| `--enableParallelApplies` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `BOOL` | Enable parallel application of independent table groups during replication. By default, applies are synchronous. When enabled, this increases throughput at the cost of higher target pool usage and memory usage.

**Default:** `false` | +| `--fetchMetadata` | `mylogical` | `BOOL` | Fetch column metadata explicitly, for older versions of MySQL that do not support `binlog_row_metadata`. | +| `--flushPeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Flush queued mutations after this duration.

**Default:** `1s` | +| `--flushSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | Ideal batch size to determine when to flush mutations.

**Default:** `1000` | +| `--gracePeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Allow background processes to exit.

**Default:** `30s` | +| `--healthCheckTimeout` | `start` | `DURATION` | The timeout for the health check endpoint.

**Default:** `5s` | +| `--httpResponseTimeout` | `start` | `DURATION` | The maximum amount of time to allow an HTTP handler to execute.

**Default:** `2m0s` | +| `--immediate` | `start` | `BOOL` | Bypass staging tables and write directly to target; recommended only for KV-style workloads with no [foreign keys]({% link {{ site.current_cloud_version }}/foreign-key.md %}). | +| `-k`, `--key` | `make-jwt` | `STRING` | The path to a PEM-encoded private key to sign the token with. | +| `--limitLookahead` | `start` | `INT` | Limit number of checkpoints to be considered when computing the resolving range; may cause replication to stall completely if older mutations cannot be applied. | +| `--logDestination` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | Write logs to a file. If not specified, write logs to `stdout`. | +| `--logFormat` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | Choose log output format: `"fluent"`, `"text"`.

**Default:** `"text"` | +| `--maxRetries` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | Maximum number of times to retry a failed mutation on the target (for example, due to contention or a temporary unique constraint violation) before treating it as a hard failure.

**Default:** `10` | +| `--metricsAddr` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | A `:port` or `host:port` on which to serve metrics and diagnostics. The metrics endpoint is `http://{host}:{port}/_/varz`. | +| `--ndjsonBufferSize` | `start` | `INT` | The maximum amount of data to buffer while reading a single line of `ndjson` input; increase when source cluster has large blob values.

**Default:** `65536` | +| `--oracle-application-users` | `oraclelogminer` | `STRING` | List of Oracle usernames responsible for DML transactions in the PDB schema. Enables replication from the latest-possible starting point. Usernames are case-sensitive and must match the internal Oracle usernames (e.g., `PDB_USER`). | +| `-o`, `--out` | `make-jwt` | `STRING` | A file to write the token to. | +| `--parallelism` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The number of concurrent database transactions to use.

**Default:** `16` | +| `--publicationName` | `pglogical` | `STRING` | The publication within the source database to replicate. | +| `--quiescentPeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How often to retry deferred mutations.

**Default:** `10s` | +| `--replicationProcessID` | `mylogical` | `UINT32` | The replication process ID to report to the source database.

**Default:** `10` | +| `--retireOffset` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How long to delay removal of applied mutations.

**Default:** `24h0m0s` | +| `--retryInitialBackoff` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Initial delay before the first retry attempt when applying a mutation to the target database fails due to a retryable error, such as contention or a temporary unique constraint violation.

**Default:** `25ms` | +| `--retryMaxBackoff` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Maximum delay between retry attempts when applying mutations to the target database fails due to retryable errors.

**Default:** `2s` | +| `--retryMultiplier` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | Multiplier that controls how quickly the backoff interval increases between successive retries of failed applies to the target database.

**Default:** `2` | +| `--scanSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The number of rows to retrieve from the staging database used to store metadata for replication.

**Default:** `10000` | +| `--schemaRefresh` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled.

**Default:** `1m0s` | +| `--scn` | `oraclelogminer` | `INT` | **Required** the first time `replicator` is run. The snapshot System Change Number (SCN) from the initial data load, which provides a replication marker for streaming changes. | +| `--scnWindowSize` | `oraclelogminer` | `INT` | The maximum size of SCN bounds per pull iteration from LogMiner. This helps prevent timeout errors when processing large SCN ranges. Set to `0` or a negative value to disable the cap.

**Default:** `3250` | +| `--slotName` | `pglogical` | `STRING` | **Required.** PostgreSQL replication slot name. Must match the slot name specified with `--pglogical-replication-slot-name` in the [MOLT Fetch command]({% link molt/molt-fetch.md %}#load-before-replication).

**Default:** `"replicator"` | +| `--sourceConn` | `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | The source database's connection string. When replicating from Oracle, this is the connection string of the Oracle container database (CDB). | +| `--sourcePDBConn` | `oraclelogminer` | `STRING` | Connection string for the Oracle pluggable database (PDB). Only required when using an [Oracle multitenant configuration](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html). [`--sourceConn`](#source-conn) **must** be included. | +| `--sourceSchema` | `oraclelogminer` | `STRING` | **Required.** Source schema name on Oracle where tables will be replicated from. | +| `--stageDisableCreateTableReaderIndex` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `BOOL` | Disable the creation of partial covering indexes to improve read performance on staging tables. Set to `true` if creating indexes on existing tables would cause a significant operational impact.

**Default:** `false` | +| `--stageMarkAppliedLimit` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | Limit the number of mutations to be marked applied in a single statement.

**Default:** `100000` | +| `--stageSanityCheckPeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How often to validate staging table apply order (`-1` to disable).

**Default:** `10m0s` | +| `--stageSanityCheckWindow` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How far back to look when validating staging table apply order.

**Default:** `1h0m0s` | +| `--stageUnappliedPeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How often to report the number of unapplied mutations in staging tables (`-1` to disable).

**Default:** `1m0s` | +| `--stagingConn` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | The staging database's connection string. | +| `--stagingCreateSchema` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `BOOL` | Automatically create the staging schema if it does not exist. | +| `--stagingIdleTime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Maximum lifetime of an idle connection.

**Default:** `1m0s` | +| `--stagingJitterTime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | The time over which to jitter database pool disconnections.

**Default:** `15s` | +| `--stagingMaxLifetime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | The maximum lifetime of a database connection.

**Default:** `5m0s` | +| `--stagingMaxPoolSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The maximum number of staging database connections.

**Default:** `128` | +| `--stagingSchema` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | Name of the CockroachDB schema that stores replication metadata. **Required** each time `replicator` is rerun after being interrupted, as the schema contains a checkpoint table that enables replication to resume from the correct transaction.

**Default:** `_replicator.public` | +| `--standbyTimeout` | `pglogical` | `DURATION` | How often to report WAL progress to the source server.

**Default:** `5s` | +| `--targetApplyQueueSize` | `pglogical`, `oraclelogminer` | `INT` | Size of the apply queue that buffers mutations before they are written to the target database. Larger values can improve throughput, but increase memory usage. This flag applies only to PostgreSQL and Oracle sources, and replaces the deprecated `--copierChannel` and `--stageCopierChannelSize` flags. | +| `--targetConn` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | The target database's connection string. | +| `--targetIdleTime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | Maximum lifetime of an idle connection.

**Default:** `1m0s` | +| `--targetJitterTime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | The time over which to jitter database pool disconnections.

**Default:** `15s` | +| `--targetMaxLifetime` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | The maximum lifetime of a database connection.

**Default:** `5m0s` | +| `--targetMaxPoolSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The maximum number of target database connections.

**Default:** `128` | +| `--targetSchema` | `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | **Required.** The SQL database schema in the target cluster to update. CockroachDB schema names must be fully qualified in the format `database.schema`. | +| `--targetStatementCacheSize` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The maximum number of prepared statements to retain.

**Default:** `128` | +| `--taskGracePeriod` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `DURATION` | How long to allow for task cleanup when recovering from errors.

**Default:** `1m0s` | +| `--timestampLimit` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `INT` | The maximum number of source timestamps to coalesce into a target transaction.

**Default:** `1000` | +| `--tlsCertificate` | `start` | `STRING` | A path to a PEM-encoded TLS certificate chain. | +| `--tlsPrivateKey` | `start` | `STRING` | A path to a PEM-encoded TLS private key. | +| `--tlsSelfSigned` | `start` | `BOOL` | If true, generate a self-signed TLS certificate valid for `localhost`. | +| `--userscript` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `STRING` | The path to a TypeScript configuration script. For example, `--userscript 'script.ts'`. | +| `-v`, `--verbose` | `start`, `pglogical`, `mylogical`, `oraclelogminer` | `COUNT` | Increase logging verbosity. Use `-v` for `debug` logging or `-vv` for `trace` logging. | diff --git a/src/current/molt/replicator-metrics.md b/src/current/molt/replicator-metrics.md index c4d311091dc..1a2e50845ea 100644 --- a/src/current/molt/replicator-metrics.md +++ b/src/current/molt/replicator-metrics.md @@ -115,9 +115,17 @@ Monitor the following metrics to track the overall health of the [replication pi - Description: End-to-end replication lag per mutation from source commit to target apply. Measures the difference between current wall time and the mutation's [MVCC timestamp]({% link {{ site.current_cloud_version }}/architecture/storage-layer.md %}#mvcc). - Interpretation: Higher values mean that older mutations are being applied, and indicate end-to-end pipeline delays. Compare across tables to find bottlenecks. +
- `target_apply_queue_utilization_percent` - Description: Percentage of target apply queue capacity utilization. - - Interpretation: Values approaching 100 percent indicate severe backpressure throughout the pipeline, and potential data processing delays. + - Interpretation: Values above 90 percent indicate severe backpressure throughout the pipeline, and potential data processing delays. Increase [`--targetApplyQueueSize`]({% link molt/replicator-flags.md %}#target-apply-queue-size) or investigate target database performance. +
+
+- `target_apply_queue_utilization_percent` + - Description: Percentage of target apply queue capacity utilization. + - Interpretation: Values above 90 percent indicate severe backpressure throughout the pipeline, and potential data processing delays. Investigate target database performance. +
+
### Replication lag @@ -174,7 +182,7 @@ To visualize the following metrics, import the [Oracle Grafana dashboard](https: - `oraclelogminer_scn_interval_size` - Description: Size of the interval from the start SCN to the current Oracle SCN. - - Interpretation: Values larger than the [`--scnWindowSize`]({% link molt/replicator-flags.md %}#scn) flag value indicate replication lag, or that replication is idle. + - Interpretation: Values larger than the [`--scnWindowSize`]({% link molt/replicator-flags.md %}#scn-window-size) flag value indicate replication lag, or that replication is idle. - `oraclelogminer_time_per_window_seconds` - Description: Amount of time taken to fully process an SCN interval. - Interpretation: Large values indicate Oracle slowdown, blocked replication loop, or slow processing. @@ -272,9 +280,6 @@ For checkpoint terminology, refer to the [MOLT Replicator documentation]({% link - `target_apply_queue_size` - Description: Number of transactions waiting in the target apply queue. - Interpretation: High values indicate target apply cannot keep up with incoming transactions. -- `target_apply_queue_utilization_percent` - - Description: Percentage of apply queue capacity utilization. - - Interpretation: Values above 90 percent indicate severe backpressure. Increase [`--targetApplyQueueSize`]({% link molt/replicator-flags.md %}#target-apply-queue-size) or investigate target database performance. - `apply_duration_seconds` - Description: Amount of time taken to successfully apply mutations to a table. - Interpretation: High values indicate target database performance issues or contention.