Skip to content

Conversation

@folkertdev
Copy link
Collaborator

@folkertdev folkertdev commented Feb 7, 2024

Besides cleanup, this also fixes bugs in slice::from_raw_parts(_mut) calls in BD::pixel_copys introduced during bitdepth deduplication.

can be reviewed commit-by-commit.

these PRs get pretty large and have many unrelated changes. Would stacked PRs easier to review?

I'll leave some specific notes/questions in inline comments.

Comment on lines -399 to +342
current_block_84 = 17075014677070940716;
st_y = false;
} else {
current_block_84 = 17728966195399430138;
st_y = true;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue #700 mentions "cleaning up the state machine control flow". Is this sufficient?

It's not quite the goto that C inserts but I'm assuming that this results in a direct branch in practice.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as the current_block_* ugliness is gone and it's semantically equivalent to the C, it should be good. Let me check more closely.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ee538a8 does exactly that. None of the other commits touch the control flow.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So ee538a8 looks like good itself; it translates the state machines as they were transpiled quite nicely. The original gotos in the C source are different, though, and I think some of them got translated into extra, large ifs and stuff like that. Ideally I think we want to match the C as reasonably as possible, though your way is a very clean solution, too. So for this PR, it's definitely good, a significant improvement. We should look into matching the C gotos structure as much as possible, though. It doesn't have to match completely, but following its general structure is good, as it helps a lot with any backports.

let f: *mut Rav1dFrameContext = (*tc).f as *mut Rav1dFrameContext;
let bitdepth_min_8 = if 16 == 8 {
0 as c_int
let f: *mut Rav1dFrameContext = tc.f as *mut Rav1dFrameContext;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't really want to add more to this PR but I think at this point this could be a proper reference. #728 gets its hands on a &Rav1dFrameContext

@kkysen
Copy link
Collaborator

kkysen commented Feb 8, 2024

these PRs get pretty large and have many unrelated changes. Would stacked PRs easier to review?

Yeah, stacked PRs work great! I've been using them a lot. As long as the stack doesn't grow super long or branch a lot (a linear stack is usually easier for me).

Copy link
Collaborator

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed most of it, but I'm still checking some more things, like the goto state machine.

Comment on lines -399 to +342
current_block_84 = 17075014677070940716;
st_y = false;
} else {
current_block_84 = 17728966195399430138;
st_y = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as the current_block_* ugliness is gone and it's semantically equivalent to the C, it should be good. Let me check more closely.

@kkysen kkysen changed the title src/cdef_apply.rs cleanup src/cdef_apply.rs: cleanup Feb 8, 2024
Comment on lines -34 to +48
slice::from_raw_parts_mut(
*dst.offset(y_stride as isize) as *mut BD::Pixel,
(-2 * y_stride) as usize,
),
slice::from_raw_parts(
*src.offset((7 as isize * y_stride) as isize) as *const BD::Pixel,
(-2 * y_stride) as usize,
),
(-2 * y_stride) as usize,
slice::from_raw_parts_mut(dst[0].offset(y_stride), len),
slice::from_raw_parts(src[0].offset(7 * y_stride), len),
Copy link
Collaborator

@kkysen kkysen Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these (other similar ones, too) were mistakenly mistranslated before, as src and dst are offset first and then dereferened, vs. dereferenced/[0]ed first, and then offset. Your way looks correct so all is good here. I'll add to the PR description that this bug is fixed as well. And thanks for catching and fixing this.

Copy link
Collaborator

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This basically all LGTM now besides dst should be an array in fn backup2lines and one other suggestion (though not too important).

folkertdev and others added 2 commits February 12, 2024 09:38
Co-authored-by: Khyber Sen <kkysen@gmail.com>
@kkysen kkysen merged commit 97371dc into main Feb 12, 2024
@kkysen kkysen deleted the cdef-apply-cleanup branch February 12, 2024 22:38
kkysen added a commit that referenced this pull request Feb 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants