@ismailsimsek
Contributor

resolves #10842

@ismailsimsek
Contributor Author

@bryanck I copied over the code as is.

I'm planning to refactor the upsert mode (delta writer) code and add a few improvements to it, potentially changing existing behavior.
Should we merge this first and add the changes in a separate PR, or combine them? What do you think?

@bryanck
Contributor

bryanck commented Jan 24, 2025

@bryanck I copied over the code as is.

I'm planning to refactor the upsert mode (delta writer) code and add a few improvements to it, potentially changing existing behavior. Should we merge this first and add the changes in a separate PR, or combine them? What do you think?

There are a couple of discussions on why we didn't originally add the delta writer functionality. I think we will need to resolve those discussions before we add this.

@github-actions

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Feb 24, 2025
@github-actions

github-actions bot commented Mar 3, 2025

This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Mar 3, 2025
@juanluhidalgo

I consider this functionality a MUST. What's preventing it from being merged? In the end, people are forced to move to the Tabular version if they want support for upserts.

@olarcherc24

@ismailsimsek @bryanck can we please revive this PR?

@ajantha-bhat ajantha-bhat reopened this Apr 10, 2025
@ismailsimsek ismailsimsek marked this pull request as ready for review April 10, 2025 21:02
@github-actions github-actions bot removed the stale label Apr 11, 2025
@nemesis910

It would be great to have this functionality, even if it needs to be used with some care.
The current alternative is to use the Tabular version, which has the same caveats and introduces further limitations in the environment because it relies on the old Iceberg 1.5.2 version.

@ismailsimsek ismailsimsek force-pushed the kafka-delete-classes branch 2 times, most recently from 5333248 to 9ad8d26 Compare April 14, 2025 07:28
@github-actions github-actions bot added the docs label Apr 14, 2025
@ismailsimsek ismailsimsek force-pushed the kafka-delete-classes branch 4 times, most recently from df1418f to cbad542 Compare April 14, 2025 09:12
@ismailsimsek
Contributor Author

The failure doesn't seem related.

@ajantha-bhat @bryanck @jbonofre could you please review?

@bryanck
Contributor

bryanck commented Apr 14, 2025

@danielcweeks Do you feel our stance on this has evolved, or should we hold off on adding this until there is more clarity on the future of equality deletes?

@bryanck
Contributor

bryanck commented Apr 14, 2025

I feel we should close this PR until we discuss this with the community. If the community feels we can move forward, I can handle porting over my code.

@olarcherc24

@bryanck while I can fully relate to your concerns, I strongly advocate for moving forward with this PR, performance considerations notwithstanding. I agree with @ajantha-bhat and their comment here that the performance limitations of equality deletes should be addressed separately. For our team, having an append-only connector is virtually useless and we would rather deal with the related performance issues.

@nemesis910

I'm following up on this topic—are there any updates or decisions so far?
Has this discussion already been escalated to the community? If not, @bryanck could you please advise on the best way to do so?

Thanks in advance for your help!

@Override
public void write(Record row) throws IOException {
  Operation op;
  if (row instanceof RecordWrapper) {

If CDC is enabled and upsertMode is enabled, the CDC insert will not be changed to an UPDATE, which I think might be wrong.
I raised a similar PR against the old Tabular code last week: databricks/iceberg-kafka-connect#332
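
A minimal sketch of the behavior this comment seems to expect, under stated assumptions: the `Operation` enum, the `resolve` helper, and the `upsertMode` flag below are hypothetical stand-ins, not the connector's actual classes. The idea is that, when upsert mode is on, a CDC insert is promoted to an update, so a writer would delete any existing row with the same key before writing the new one.

```java
// Hypothetical sketch; these names do not come from the connector code.
final class UpsertOperationSketch {

  /** Row-level operations a delta writer might distinguish (assumed names). */
  enum Operation { INSERT, UPDATE, DELETE }

  /**
   * Resolves the operation to apply for one record.
   * In upsert mode an INSERT is promoted to UPDATE; all other
   * operations pass through unchanged.
   */
  static Operation resolve(Operation cdcOp, boolean upsertMode) {
    if (upsertMode && cdcOp == Operation.INSERT) {
      return Operation.UPDATE;
    }
    return cdcOp;
  }

  public static void main(String[] args) {
    System.out.println(resolve(Operation.INSERT, true));
    System.out.println(resolve(Operation.INSERT, false));
    System.out.println(resolve(Operation.DELETE, true));
  }
}
```

Keeping this decision in one small function would make it easy to apply the same rule on both the CDC and non-CDC paths, though whether that matches the intended semantics is for the PR author to confirm.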


@BadCandy BadCandy left a comment


The iceberg.tables.cdc-field property is only used to determine whether to use DeltaWriter, and setting this property does not configure the op field in the Record.
It seems necessary to modify the code to set the op field in RecordWrapper, referring to the existing implementation in IcebergWriter.
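
One possible shape for that change, as a sketch only: read the value of the configured CDC field from the record and map it to an operation before the record is wrapped. The `fromCdcValue` and `operationFor` helpers, the `Operation` enum, the use of a plain `Map` as the record, and the first-letter codes (`I`, `U`, `D`) are all assumptions for illustration; they are not taken from `IcebergWriter` or `RecordWrapper`.

```java
import java.util.Map;

// Hypothetical sketch; names and CDC codes are assumptions, not connector code.
final class CdcOperationSketch {

  enum Operation { INSERT, UPDATE, DELETE }

  /**
   * Maps a raw CDC field value to an operation using its first letter.
   * Missing, empty, or unrecognized values fall back to INSERT.
   */
  static Operation fromCdcValue(Object cdcValue) {
    if (cdcValue == null) {
      return Operation.INSERT;
    }
    String text = cdcValue.toString().trim();
    if (text.isEmpty()) {
      return Operation.INSERT;
    }
    switch (Character.toUpperCase(text.charAt(0))) {
      case 'U':
        return Operation.UPDATE;
      case 'D':
        return Operation.DELETE;
      default:
        return Operation.INSERT;
    }
  }

  /** Looks up the configured CDC field in a record represented as a map. */
  static Operation operationFor(Map<String, Object> row, String cdcField) {
    return fromCdcValue(row.get(cdcField));
  }

  public static void main(String[] args) {
    Map<String, Object> row = java.util.Collections.singletonMap("_cdc_op", "d");
    System.out.println(operationFor(row, "_cdc_op"));
  }
}
```

The fallback to INSERT for unknown values is a design choice in this sketch; a real implementation might prefer to fail fast on unexpected codes.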

@ismailsimsek ismailsimsek force-pushed the kafka-delete-classes branch from cbad542 to e06fe78 Compare May 9, 2025 14:32
@github-actions

github-actions bot commented Jun 9, 2025

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jun 9, 2025
@olarcherc24

Still relevant

@Fokko Fokko added not-stale and removed stale labels Jun 10, 2025
@cbuckle1

Can this be merged?

@pvary
Contributor

pvary commented Oct 14, 2025

Can this be merged?

There are conflicts with the main branch, so we can't merge this.

For the record: There is an ongoing effort to move away from the FileAppenderFactory and start using the FileWriterFactory (#14328), which will conflict with the current PR.

@cbuckle1

Can this be merged?

There are conflicts with the main branch, so we can't merge this.

For the record: There is an ongoing effort to move away from the FileAppenderFactory and start using the FileWriterFactory (#14328), which will conflict with the current PR.

Will the FileWriterFactory allow for upserts?

@pvary
Contributor

pvary commented Oct 14, 2025

@cbuckle1: It only changes the interface used to write the files. We still need a PR like this. It is mostly just a change where FileAppenderFactory<Record> appenderFactory is replaced by FileWriterFactory<Record> writerFactory.

@albertocarref

Hey @pvary, I just saw that the PR you mentioned is already merged. Are we good to go now?

@bryanck
Contributor

bryanck commented Oct 22, 2025

We should resolve concerns around relying on equality deletes before going down that road, or open a new PR for a solution that does not rely on equality deletes.

Here is one thread on the mailing list from a few months ago: https://lists.apache.org/thread/96dhf3sj5pc4ql0l8yk8sxgtr78bchrd.

@cbuckle1

@bryanck - for us, we are using this for streaming data, so equality deletes are needed. Based on another comment from April: #12070 (comment) it looks like others are in the same boat.

@cbuckle1

cbuckle1 commented Dec 1, 2025

@bryanck - in the linked thread, you mentioned that the Flink sink is looking at a similar issue. Do you have the GitHub issue for us to track?

@ismailsimsek
Contributor Author

ismailsimsek commented Dec 1, 2025

For whoever is interested in implementing it: the umbrella issue is #11122.
The main Flink PR is #14197. Note that it has follow-up PRs and references to the Spark implementation as well (follow-up changes to check).

@t3hw

t3hw commented Dec 7, 2025

I can give it a shot. I'll base it on your PR and the Flink sink implementation.

Development

Successfully merging this pull request may close these issues.

Kafka Connect: Add delta writer support