-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
Impose a more sensible ordering for animation graph evaluation. #15530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Impose a more sensible ordering for animation graph evaluation. #15530
Conversation
This commit changes the animation graph evaluation to be operate in a
more sensible order and updates the semantics of blend nodes to conform
to [the animation composition RFC]. Prior to this patch, a node graph
like this:
┌─────┐
│ │
│ 1 │
│ │
└──┬──┘
│
┌───────┴───────┐
│ │
▼ ▼
┌─────┐ ┌─────┐
│ │ │ │
│ 2 │ │ 3 │
│ │ │ │
└──┬──┘ └──┬──┘
│ │
┌───┴───┐ ┌───┴───┐
│ │ │ │
▼ ▼ ▼ ▼
┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐
│ │ │ │ │ │ │ │
│ 4 │ │ 6 │ │ 5 │ │ 7 │
│ │ │ │ │ │ │ │
└─────┘ └─────┘ └─────┘ └─────┘
Would be evaluated as (((4 ⊕ 5) ⊕ 6) ⊕ 7), with the blend (lerp/slerp)
operation notated as ⊕. As quaternion multiplication isn't commutative,
this is very counterintuitive and will especially lead to trouble with
the forthcoming additive blending feature (bevyengine#15198).
This patch fixes the issue by changing the evaluation order to
postorder, with children of a node evaluated in ascending order by node
index.
To do so, this patch revamps `Keyframes` to be based on an *evaluation
stack* and a *blend register*. During target evaluation, the graph
evaluator traverses the graph in postorder. When encountering a clip
node, the evaluator pushes the possibly-interpolated value onto the
evaluation stack. When encountering a blend node, the evaluator pops
values off the stack into the blend register, accumulating weights as
appropriate. When the graph is completely evaluated, the top element on
the stack is *committed* to the property of the component.
A new system, the *graph threading* system, is added in order to cache
the sorted postorder traversal to avoid the overhead of sorting children
at animation evaluation time. Mask evaluation has been moved to this
system so that the graph only has to be traversed at most once per
frame. Unlike the `ActiveAnimation` list, the *threaded graph* is cached
from frame to frame and only has to be regenerated when the animation
graph asset changes.
This patch currently regresses the `animate_target` performance in
`many_foxes` by around 50%, resulting in an FPS loss of about 2-3 FPS.
I'd argue that this is an acceptable price to pay for a much more
intuitive system. In the future, we can mitigate the regression with a
fast path that avoids consulting the graph if only one animation is
playing. However, in the interest of keeping this patch simple, I didn't
do so here.
[the animation composition RFC]: https://github.com/bevyengine/rfcs/blob/main/rfcs/51-animation-composition.md
93cb652 to
0824add
Compare
|
This patch makes additive blending (#15198) easy to add, because there will longer be any need for complicated weight calculation. All that will be needed is to add an |
| /// symbol.) | ||
| /// | ||
| /// | Step | Node | Operation | Stack (after operation) | Blend Register | | ||
| /// | ---- | -----| ---------- | ----------------------- | -------------- | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| /// | ---- | -----| ---------- | ----------------------- | -------------- | | |
| /// | ---- | ---- | ---------- | ----------------------- | -------------- | |
| pub(crate) struct ThreadedAnimationGraph { | ||
| /// A cached postorder traversal of the graph. | ||
| /// | ||
| /// The node indices here are stored in postorder. Siblings are stored in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you document this below, but I think it would be useful to say upfront, right at the top of this doc comment, that it's stored in post-order with children being iterated in reverse (4, 3, 6, 5, 2, 1 instead of 5, 6, 2, 3, 4, 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does say "siblings are stored in descending order".
| /// This is kept up to date as the associated [`AnimationGraph`] instance is | ||
| /// added, modified, or removed. | ||
| #[derive(Default, Reflect)] | ||
| pub(crate) struct ThreadedAnimationGraph { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be useful to expose this publicly to users, so that if they store animation graphs in somewhere other than Assets<AnimationGraph>, they can still use this (with some more involved manual setup)?
| /// The weight in the blend register. | ||
| /// | ||
| /// This will be `None` if the blend register is empty. In that case, | ||
| /// [`Self::blend_register_blend_weight`] will be empty. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this meant to be
| /// [`Self::blend_register_blend_weight`] will be empty. | |
| /// [`Self::blend_register_morph_target_weights`] will be empty. |
? Since you've already stated
This will be `None` if the blend register is empty.
"this" being blend_register_blend_weight
|
@pcwalton do you think this should be in 0.15 to get the breakage over with for animation users? |
|
#15434 is merging; I'm going to optimistically add this to the milestone and we'll see what happens :) |
bushrat011899
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read through the PR and couldn't see anything that stood out to me as wrong or in need of changing. However, I'm not familiar enough with animation systems to feel comfortable approving this PR sorry!
This is an updated version of #15530. Review comments were addressed. This commit changes the animation graph evaluation to be operate in a more sensible order and updates the semantics of blend nodes to conform to [the animation composition RFC]. Prior to this patch, a node graph like this: ``` ┌─────┐ │ │ │ 1 │ │ │ └──┬──┘ │ ┌───────┴───────┐ │ │ ▼ ▼ ┌─────┐ ┌─────┐ │ │ │ │ │ 2 │ │ 3 │ │ │ │ │ └──┬──┘ └──┬──┘ │ │ ┌───┴───┐ ┌───┴───┐ │ │ │ │ ▼ ▼ ▼ ▼ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │ │ │ │ │ │ │ │ │ 4 │ │ 6 │ │ 5 │ │ 7 │ │ │ │ │ │ │ │ │ └─────┘ └─────┘ └─────┘ └─────┘ ``` Would be evaluated as (((4 ⊕ 5) ⊕ 6) ⊕ 7), with the blend (lerp/slerp) operation notated as ⊕. As quaternion multiplication isn't commutative, this is very counterintuitive and will especially lead to trouble with the forthcoming additive blending feature (#15198). This patch fixes the issue by changing the evaluation order to postorder, with children of a node evaluated in ascending order by node index. To do so, this patch revamps `AnimationCurve` to be based on an *evaluation stack* and a *blend register*. During target evaluation, the graph evaluator traverses the graph in postorder. When encountering a clip node, the evaluator pushes the possibly-interpolated value onto the evaluation stack. When encountering a blend node, the evaluator pops values off the stack into the blend register, accumulating weights as appropriate. When the graph is completely evaluated, the top element on the stack is *committed* to the property of the component. A new system, the *graph threading* system, is added in order to cache the sorted postorder traversal to avoid the overhead of sorting children at animation evaluation time. Mask evaluation has been moved to this system so that the graph only has to be traversed at most once per frame. Unlike the `ActiveAnimation` list, the *threaded graph* is cached from frame to frame and only has to be regenerated when the animation graph asset changes. This patch currently regresses the `animate_target` performance in `many_foxes` by around 50%, resulting in an FPS loss of about 2-3 FPS. I'd argue that this is an acceptable price to pay for a much more intuitive system. In the future, we can mitigate the regression with a fast path that avoids consulting the graph if only one animation is playing. However, in the interest of keeping this patch simple, I didn't do so here. [the animation composition RFC]: https://github.com/bevyengine/rfcs/blob/main/rfcs/51-animation-composition.md # Objective - Describe the objective or issue this PR addresses. - If you're fixing a specific issue, say "Fixes #X". ## Solution - Describe the solution used to achieve the objective above. ## Testing - Did you test these changes? If so, how? - Are there any parts that need more testing? - How can other people (reviewers) test your changes? Is there anything specific they need to know? - If relevant, what platforms did you test these changes on, and are there any important ones you can't test? --- ## Showcase > This section is optional. If this PR does not include a visual change or does not add a new feature, you can delete this section. - Help others understand the result of this PR by showcasing your awesome work! - If this PR adds a new feature or public API, consider adding a brief pseudo-code snippet of it in action - If this PR includes a visual change, consider adding a screenshot, GIF, or video - If you want, you could even include a before/after comparison! - If the Migration Guide adequately covers the changes, you can delete this section While a showcase should aim to be brief and digestible, you can use a toggleable section to save space on longer showcases: <details> <summary>Click to view showcase</summary> ```rust println!("My super cool code."); ``` </details> ## Migration Guide > This section is optional. If there are no breaking changes, you can delete this section. - If this PR is a breaking change (relative to the last release of Bevy), describe how a user might need to migrate their code to support these changes - Simply adding new functionality is not a breaking change. - Fixing behavior that was definitely a bug, rather than a questionable design choice is not a breaking change. --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
…engine#15589) This is an updated version of bevyengine#15530. Review comments were addressed. This commit changes the animation graph evaluation to be operate in a more sensible order and updates the semantics of blend nodes to conform to [the animation composition RFC]. Prior to this patch, a node graph like this: ``` ┌─────┐ │ │ │ 1 │ │ │ └──┬──┘ │ ┌───────┴───────┐ │ │ ▼ ▼ ┌─────┐ ┌─────┐ │ │ │ │ │ 2 │ │ 3 │ │ │ │ │ └──┬──┘ └──┬──┘ │ │ ┌───┴───┐ ┌───┴───┐ │ │ │ │ ▼ ▼ ▼ ▼ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │ │ │ │ │ │ │ │ │ 4 │ │ 6 │ │ 5 │ │ 7 │ │ │ │ │ │ │ │ │ └─────┘ └─────┘ └─────┘ └─────┘ ``` Would be evaluated as (((4 ⊕ 5) ⊕ 6) ⊕ 7), with the blend (lerp/slerp) operation notated as ⊕. As quaternion multiplication isn't commutative, this is very counterintuitive and will especially lead to trouble with the forthcoming additive blending feature (bevyengine#15198). This patch fixes the issue by changing the evaluation order to postorder, with children of a node evaluated in ascending order by node index. To do so, this patch revamps `AnimationCurve` to be based on an *evaluation stack* and a *blend register*. During target evaluation, the graph evaluator traverses the graph in postorder. When encountering a clip node, the evaluator pushes the possibly-interpolated value onto the evaluation stack. When encountering a blend node, the evaluator pops values off the stack into the blend register, accumulating weights as appropriate. When the graph is completely evaluated, the top element on the stack is *committed* to the property of the component. A new system, the *graph threading* system, is added in order to cache the sorted postorder traversal to avoid the overhead of sorting children at animation evaluation time. Mask evaluation has been moved to this system so that the graph only has to be traversed at most once per frame. Unlike the `ActiveAnimation` list, the *threaded graph* is cached from frame to frame and only has to be regenerated when the animation graph asset changes. This patch currently regresses the `animate_target` performance in `many_foxes` by around 50%, resulting in an FPS loss of about 2-3 FPS. I'd argue that this is an acceptable price to pay for a much more intuitive system. In the future, we can mitigate the regression with a fast path that avoids consulting the graph if only one animation is playing. However, in the interest of keeping this patch simple, I didn't do so here. [the animation composition RFC]: https://github.com/bevyengine/rfcs/blob/main/rfcs/51-animation-composition.md # Objective - Describe the objective or issue this PR addresses. - If you're fixing a specific issue, say "Fixes #X". ## Solution - Describe the solution used to achieve the objective above. ## Testing - Did you test these changes? If so, how? - Are there any parts that need more testing? - How can other people (reviewers) test your changes? Is there anything specific they need to know? - If relevant, what platforms did you test these changes on, and are there any important ones you can't test? --- ## Showcase > This section is optional. If this PR does not include a visual change or does not add a new feature, you can delete this section. - Help others understand the result of this PR by showcasing your awesome work! - If this PR adds a new feature or public API, consider adding a brief pseudo-code snippet of it in action - If this PR includes a visual change, consider adding a screenshot, GIF, or video - If you want, you could even include a before/after comparison! - If the Migration Guide adequately covers the changes, you can delete this section While a showcase should aim to be brief and digestible, you can use a toggleable section to save space on longer showcases: <details> <summary>Click to view showcase</summary> ```rust println!("My super cool code."); ``` </details> ## Migration Guide > This section is optional. If there are no breaking changes, you can delete this section. - If this PR is a breaking change (relative to the last release of Bevy), describe how a user might need to migrate their code to support these changes - Simply adding new functionality is not a breaking change. - Fixing behavior that was definitely a bug, rather than a questionable design choice is not a breaking change. --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
…engine#15589) This is an updated version of bevyengine#15530. Review comments were addressed. This commit changes the animation graph evaluation to be operate in a more sensible order and updates the semantics of blend nodes to conform to [the animation composition RFC]. Prior to this patch, a node graph like this: ``` ┌─────┐ │ │ │ 1 │ │ │ └──┬──┘ │ ┌───────┴───────┐ │ │ ▼ ▼ ┌─────┐ ┌─────┐ │ │ │ │ │ 2 │ │ 3 │ │ │ │ │ └──┬──┘ └──┬──┘ │ │ ┌───┴───┐ ┌───┴───┐ │ │ │ │ ▼ ▼ ▼ ▼ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │ │ │ │ │ │ │ │ │ 4 │ │ 6 │ │ 5 │ │ 7 │ │ │ │ │ │ │ │ │ └─────┘ └─────┘ └─────┘ └─────┘ ``` Would be evaluated as (((4 ⊕ 5) ⊕ 6) ⊕ 7), with the blend (lerp/slerp) operation notated as ⊕. As quaternion multiplication isn't commutative, this is very counterintuitive and will especially lead to trouble with the forthcoming additive blending feature (bevyengine#15198). This patch fixes the issue by changing the evaluation order to postorder, with children of a node evaluated in ascending order by node index. To do so, this patch revamps `AnimationCurve` to be based on an *evaluation stack* and a *blend register*. During target evaluation, the graph evaluator traverses the graph in postorder. When encountering a clip node, the evaluator pushes the possibly-interpolated value onto the evaluation stack. When encountering a blend node, the evaluator pops values off the stack into the blend register, accumulating weights as appropriate. When the graph is completely evaluated, the top element on the stack is *committed* to the property of the component. A new system, the *graph threading* system, is added in order to cache the sorted postorder traversal to avoid the overhead of sorting children at animation evaluation time. Mask evaluation has been moved to this system so that the graph only has to be traversed at most once per frame. Unlike the `ActiveAnimation` list, the *threaded graph* is cached from frame to frame and only has to be regenerated when the animation graph asset changes. This patch currently regresses the `animate_target` performance in `many_foxes` by around 50%, resulting in an FPS loss of about 2-3 FPS. I'd argue that this is an acceptable price to pay for a much more intuitive system. In the future, we can mitigate the regression with a fast path that avoids consulting the graph if only one animation is playing. However, in the interest of keeping this patch simple, I didn't do so here. [the animation composition RFC]: https://github.com/bevyengine/rfcs/blob/main/rfcs/51-animation-composition.md # Objective - Describe the objective or issue this PR addresses. - If you're fixing a specific issue, say "Fixes #X". ## Solution - Describe the solution used to achieve the objective above. ## Testing - Did you test these changes? If so, how? - Are there any parts that need more testing? - How can other people (reviewers) test your changes? Is there anything specific they need to know? - If relevant, what platforms did you test these changes on, and are there any important ones you can't test? --- ## Showcase > This section is optional. If this PR does not include a visual change or does not add a new feature, you can delete this section. - Help others understand the result of this PR by showcasing your awesome work! - If this PR adds a new feature or public API, consider adding a brief pseudo-code snippet of it in action - If this PR includes a visual change, consider adding a screenshot, GIF, or video - If you want, you could even include a before/after comparison! - If the Migration Guide adequately covers the changes, you can delete this section While a showcase should aim to be brief and digestible, you can use a toggleable section to save space on longer showcases: <details> <summary>Click to view showcase</summary> ```rust println!("My super cool code."); ``` </details> ## Migration Guide > This section is optional. If there are no breaking changes, you can delete this section. - If this PR is a breaking change (relative to the last release of Bevy), describe how a user might need to migrate their code to support these changes - Simply adding new functionality is not a breaking change. - Fixing behavior that was definitely a bug, rather than a questionable design choice is not a breaking change. --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
This commit changes the animation graph evaluation to be operate in a more sensible order and updates the semantics of blend nodes to conform to the animation composition RFC. Prior to this patch, a node graph like this:
Would be evaluated as (((4 ⊕ 5) ⊕ 6) ⊕ 7), with the blend (lerp/slerp) operation notated as ⊕. As quaternion multiplication isn't commutative, this is very counterintuitive and will especially lead to trouble with the forthcoming additive blending feature (#15198).
This patch fixes the issue by changing the evaluation order to postorder, with children of a node evaluated in ascending order by node index. It also makes blend nodes normalize the weights of their children, as suggested by the animation composition RFC.
To do so, this patch revamps
Keyframesto be based on an evaluation stack and a blend register. During target evaluation, the graph evaluator traverses the graph in postorder. When encountering a clip node, the evaluator pushes the possibly-interpolated value onto the evaluation stack. When encountering a blend node, the evaluator pops values off the stack into the blend register, accumulating weights as appropriate. When the graph is completely evaluated, the top element on the stack is committed to the property of the component.A new system, the graph threading system, is added in order to cache the sorted postorder traversal to avoid the overhead of sorting children at animation evaluation time. Mask evaluation has been moved to this system so that the graph only has to be traversed at most once per frame. Unlike the
ActiveAnimationlist, the threaded graph is cached from frame to frame and only has to be regenerated when the animation graph asset changes.This patch currently regresses the
animate_targetperformance inmany_foxesby around 35%, resulting in an FPS loss of about 2-3 FPS. I'd argue that this is an acceptable price to pay for a much more intuitive system. In the future, we can mitigate the regression with a fast path that avoids consulting the graph if only one animation is playing. However, in the interest of keeping this patch simple, I didn't do so here.Note that this PR should land on top of #15434, which has been waiting for a while and would heavily conflict with this patch. Consequently I'm marking it as a draft.
Migration guide