@Xreki Xreki commented Jan 5, 2026

PR Category

Feature Enhancement

Description

Implements the final unit-test generation step of the auto-debugger tool in fixed-start mode. Suppose bisection determines that the [0, N-1] subgraph meets the precision requirement while the [0, N] subgraph does not, so the suspect operator is at position N. The generated unit test builds three paddle.nn.Layer subclasses:

  • PrologueLayer, containing subgraph [0, N-1]
  • SuspectLayer, containing subgraph [N-1, N]
  • TestModel, composed of the two submodules PrologueLayer and SuspectLayer

The unit test provides two test modes:

  • test_separated, which runs PrologueLayer and SuspectLayer separately
    • On the reference device
      • Run PrologueLayer and save its outputs as the inputs to SuspectLayer
      • Run SuspectLayer and save its outputs as the final test results
    • On the target device
      • Load the PrologueLayer outputs computed on the reference device as the inputs to SuspectLayer
      • Run SuspectLayer and keep its outputs as the final test results
      • Load the SuspectLayer outputs computed on the reference device and compare the two
  • test_combined, which runs the TestModel composed of PrologueLayer and SuspectLayer
    • On the reference device
      • Run TestModel and save its outputs
    • On the target device
      • Run TestModel and keep its outputs as the final test results
      • Load the TestModel outputs computed on the reference device and compare the two
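The key idea of the separated mode — replaying the prologue's reference outputs into the suspect block so that any divergence in the final comparison is attributable to the suspect operator alone — can be sketched with toy numpy "layers" (purely illustrative stand-ins, no paddle involved):

```python
import numpy as np

# Toy stand-ins for PrologueLayer / SuspectLayer; purely illustrative.
def prologue(x):
    return np.maximum(x, 0.0)   # a relu-like prologue stage

def suspect(x):
    return x * 2.0              # the stage under suspicion

rng = np.random.default_rng(123)
data = rng.standard_normal((4, 4)).astype("float32")

# "Reference device": run both stages and keep the intermediate.
ref_mid = prologue(data)
ref_out = suspect(ref_mid)

# "Target device", separated mode: reuse ref_mid instead of recomputing it,
# so a mismatch in the final compare implicates only `suspect`.
tgt_out = suspect(ref_mid)
np.testing.assert_allclose(tgt_out, ref_out, rtol=1e-6, atol=0)
print("separated-mode check passed")
```

In the real generated test, ref_mid corresponds to the .pdout file saved by paddle.save on the reference device and reloaded on the target device.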

Note: this PR depends on PaddlePaddle/Athena#20.

An example of the generated unit test:

import os
import sys
import argparse
import unittest
import random
import numpy as np
import paddle


def init_integer_tensor(shape, dtype, device, min_val, max_val):
    array = np.random.randint(
        low=min_val, high=max_val + 1, size=shape, dtype="int64"
    )
    return paddle.to_tensor(array).to(dtype).to(device)


def init_float_tensor(shape, dtype, device, min_val, max_val, mean=None, std=None):
    if mean is not None and std is not None:
        array = np.random.normal(0, 1, shape) * std * 0.2 + mean
        array = np.clip(array, min_val, max_val)
    else:
        array = np.random.uniform(low=min_val, high=max_val, size=shape)
    return paddle.to_tensor(array).to(dtype).to(device)


class PrologueLayer(paddle.nn.Layer):
    def forward(self, parameter_0, parameter_1, parameter_2, parameter_3, parameter_4, data_0):
        conv2d_0 = paddle._C_ops.conv2d(data_0, parameter_0, [2, 2], [3, 3], 'EXPLICIT', [1, 1], 1, 'NCHW')
        del data_0, parameter_0
        (batch_norm_0, batch_norm_1, batch_norm_2, batch_norm_3, batch_norm_4, batch_norm_5) = (lambda x, f: f(x))(paddle._C_ops.batch_norm(conv2d_0, parameter_1, parameter_2, parameter_3, parameter_4, True, float('0.9'), float('1e-05'), 'NCHW', False, False), lambda out: out if isinstance(out, (list, tuple)) else (out, None, None, None, None, None))
        del conv2d_0, parameter_1, parameter_2, parameter_3, parameter_4
        relu_0 = paddle._C_ops.relu(batch_norm_0)
        del batch_norm_0
        full_int_array_0 = [3, 3]
        pool2d_0 = paddle._C_ops.pool2d(relu_0, full_int_array_0, [2, 2], [1, 1], False, True, 'NCHW', 'max', False, False, 'EXPLICIT')
        del full_int_array_0, relu_0
        return pool2d_0


class SuspectLayer(paddle.nn.Layer):
    def forward(self, parameter_5, pool2d_0):
        conv2d_1 = paddle._C_ops.conv2d(pool2d_0, parameter_5, [1, 1], [1, 1], 'EXPLICIT', [1, 1], 1, 'NCHW')
        del parameter_5
        return conv2d_1


class TestModel(paddle.nn.Layer):
    def __init__(self):
        super().__init__()
        self.prologue_layer = PrologueLayer()
        self.suspect_layer = SuspectLayer()

    def forward(self, parameter_0, parameter_1, parameter_2, parameter_3, parameter_4, parameter_5, data_0):
        pool2d_0 = self.prologue_layer(parameter_0, parameter_1, parameter_2, parameter_3, parameter_4, data_0)
        conv2d_1 = self.suspect_layer(parameter_5, pool2d_0)
        return (conv2d_1,)


def get_input_dict(device):
    input_dict = {
        'parameter_5': init_float_tensor(shape=[64, 64, 3, 3], dtype='float32', device=device, min_val=-0.636645, max_val=0.441977, mean=-0.00306299, std=0.0509207),
        'parameter_4': init_float_tensor(shape=[64], dtype='float32', device=device, min_val=0.0, max_val=0.5),
        'parameter_3': init_float_tensor(shape=[64], dtype='float32', device=device, min_val=0.0, max_val=0.5),
        'parameter_2': init_float_tensor(shape=[64], dtype='float32', device=device, min_val=0.0, max_val=0.5),
        'parameter_1': init_float_tensor(shape=[64], dtype='float32', device=device, min_val=0.0, max_val=0.5),
        'parameter_0': init_float_tensor(shape=[64, 3, 7, 7], dtype='float32', device=device, min_val=-0.687591, max_val=0.768039, mean=-0.000483499, std=0.128885),
        'data_0': init_float_tensor(shape=[64, 3, 224, 224], dtype='float32', device=device, min_val=-2.1179, max_val=2.64, mean=-0.0534006, std=1.34101),
    }
    return input_dict


def tolerance_generator(tolerance, dtype):
    if dtype == paddle.float16:
        return 10 ** (tolerance * 3 / 5), 10**tolerance
    elif dtype == paddle.bfloat16:
        return 10 ** (tolerance * 1.796 / 5), 10**tolerance
    elif dtype == paddle.float32:
        return 10 ** (tolerance * 5.886 / 5), 10**tolerance
    elif dtype == paddle.float64:
        return 10 ** (tolerance * 7 / 5), 10 ** (tolerance * 7 / 5)
    else:
        assert False, f"Unsupported {dtype=}."


class Resnet18Test(unittest.TestCase):
    def setUp(self):
        self.device = TEST_ARGS.device
        self.is_reference = TEST_ARGS.is_reference
        self.reference_dir = TEST_ARGS.reference_dir
        self.tolerance = 0

        paddle.seed(123)
        random.seed(123)
        np.random.seed(123)

        self.input_dict = get_input_dict(self.device)
        self.test_model = TestModel()

    def _flatten_outputs_to_list(self, outs):
        flattened_outs = outs
        if isinstance(outs, paddle.Tensor):
            flattened_outs = [outs]
        else:
            flattened_outs = [
                x
                for out in outs
                for x in (out if isinstance(out, (tuple, list)) else (out,))
            ]
        return flattened_outs

    def run_prologue_layer(self):
        prologue_inputs = [
            self.input_dict['parameter_0'],
            self.input_dict['parameter_1'],
            self.input_dict['parameter_2'],
            self.input_dict['parameter_3'],
            self.input_dict['parameter_4'],
            self.input_dict['data_0'],
        ]
        prologue_outputs = self.test_model.prologue_layer(*prologue_inputs)
        return self._flatten_outputs_to_list(prologue_outputs)

    def run_suspect_layer(self, prologue_outputs):
        suspect_inputs = [
            self.input_dict['parameter_5'],
            prologue_outputs[0],
        ]
        suspect_outputs = self.test_model.suspect_layer(*suspect_inputs)
        return self._flatten_outputs_to_list(suspect_outputs)

    def run_test_model(self):
        test_outputs = self.test_model(**self.input_dict)
        return self._flatten_outputs_to_list(test_outputs)

    def check_dtypes(self, reference_outputs, target_outputs):
        def _get_output_dtypes(outs):
            return [
                str(tensor.dtype).replace("paddle.", "")
                if isinstance(tensor, paddle.Tensor)
                else None
                for tensor in outs
            ]

        reference_dtypes = _get_output_dtypes(reference_outputs)
        target_dtypes = _get_output_dtypes(target_outputs)
        dtype_match = all(
            reference == target for reference, target in zip(reference_dtypes, target_dtypes)
        )
        self.assertTrue(dtype_match, f"Output dtypes do not match ({reference_dtypes=} vs {target_dtypes=}).")

    def check_shapes(self, reference_outputs, target_outputs):
        def _get_output_shapes(outs):
            return [
                tensor.shape if isinstance(tensor, paddle.Tensor) else None
                for tensor in outs
            ]

        reference_shapes = _get_output_shapes(reference_outputs)
        target_shapes = _get_output_shapes(target_outputs)
        shape_match = all(
            reference == target for reference, target in zip(reference_shapes, target_shapes)
        )
        self.assertTrue(shape_match, f"Output shapes do not match ({reference_shapes=} vs {target_shapes=}).")

    def check_results(self, reference_outputs, target_outputs):
        def _convert_to_numpy(out):
            if out.dtype in [paddle.float16, paddle.bfloat16]:
                return out.cast("float32").numpy()
            else:
                return out.numpy()

        assert len(reference_outputs) == len(target_outputs), f"The number of outputs is not equal ({len(reference_outputs)=} vs {len(target_outputs)})."
        self.check_dtypes(reference_outputs, target_outputs)
        self.check_shapes(reference_outputs, target_outputs)

        for reference, target in zip(reference_outputs, target_outputs):
            atol, rtol = tolerance_generator(self.tolerance, reference.dtype)
            # Pass tolerances by keyword: assert_allclose's positional order is
            # (actual, desired, rtol, atol), so positional atol/rtol would be swapped.
            np.testing.assert_allclose(_convert_to_numpy(reference), _convert_to_numpy(target), rtol=rtol, atol=atol)

    def test_separated(self):
        prologue_output_path = os.path.join(self.reference_dir, "ResNet18_separate_prologue.pdout")
        if self.is_reference:
            prologue_outputs = self.run_prologue_layer()
            print(f"Save prologue output tensors to {prologue_output_path}.")
            paddle.save(prologue_outputs, prologue_output_path)
        else:
            print(f"Load prologue output tensors from {prologue_output_path}")
            prologue_outputs = paddle.load(prologue_output_path)
        
        test_output_path = os.path.join(self.reference_dir, "ResNet18_separate_reference.pdout")
        test_outputs = self.run_suspect_layer(prologue_outputs)
        if self.is_reference:
            print(f"Save test output tensors to {test_output_path}.")
            paddle.save(test_outputs, test_output_path)
        else:
            print(f"Load test output tensors on reference device from {test_output_path}.")
            test_reference_outputs = paddle.load(test_output_path)
            self.check_results(test_reference_outputs, test_outputs)

    def test_combined(self):
        test_output_path = os.path.join(self.reference_dir, "ResNet18_combined_reference.pdout")
        test_outputs = self.run_test_model()
        if self.is_reference:
            print(f"Save test output tensors to {test_output_path}.")
            paddle.save(test_outputs, test_output_path)
        else:
            print(f"Load test output tensors on reference device from {test_output_path}.")
            test_reference_outputs = paddle.load(test_output_path)
            self.check_results(test_reference_outputs, test_outputs)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--is-reference", action="store_true", default=False)
    parser.add_argument("--device", type=str, required=True)
    parser.add_argument("--reference-dir", type=str, required=True)
    args, remaining = parser.parse_known_args()

    TEST_ARGS = args

    unittest.main(argv=[sys.argv[0]] + remaining)
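The dtype-dependent tolerance scheme used by tolerance_generator can be checked in isolation. The sketch below mirrors its formula with plain-string dtype keys (a simplification for illustration; the generated test branches on paddle dtypes):

```python
# Standalone mirror of tolerance_generator from the generated test.
# `tolerance` is a log10 exponent; each dtype scales the atol exponent
# differently, while rtol stays at 10**tolerance (except float64, where
# atol and rtol share the scaled exponent).
def tolerance_generator(tolerance, dtype):
    atol_scale = {"float16": 3, "bfloat16": 1.796, "float32": 5.886}
    if dtype in atol_scale:
        return 10 ** (tolerance * atol_scale[dtype] / 5), 10 ** tolerance
    if dtype == "float64":
        value = 10 ** (tolerance * 7 / 5)
        return value, value
    raise ValueError(f"Unsupported {dtype=}.")

atol, rtol = tolerance_generator(-5, "float32")
print(atol, rtol)  # atol = 10**-5.886, rtol = 1e-05
```

Note that setUp sets self.tolerance = 0, under which this scheme yields atol = rtol = 1.0 for every dtype; tightening the comparison means lowering that exponent.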

paddle-bot bot commented Jan 5, 2026

Thanks for your contribution!

@Xreki Xreki force-pushed the generate_prologue_unittest branch 4 times, most recently from 089eb51 to 2cc4bb0 on January 5, 2026 09:00
@Xreki Xreki force-pushed the generate_prologue_unittest branch from 2cc4bb0 to cbdc80a on January 5, 2026 09:02
@Xreki Xreki force-pushed the generate_prologue_unittest branch from d17c422 to ef1eaa7 on January 6, 2026 01:25
@Xreki Xreki force-pushed the generate_prologue_unittest branch from f6d64cf to 1129bd8 on January 6, 2026 05:44
@Xreki Xreki requested a review from lixinqi January 6, 2026 06:11
@lixinqi lixinqi merged commit 4b7afbc into PaddlePaddle:develop Jan 6, 2026
3 checks passed
@Xreki Xreki deleted the generate_prologue_unittest branch January 6, 2026 07:01
Honglei-Qiu pushed a commit to Honglei-Qiu/GraphNet that referenced this pull request Jan 9, 2026
…addlePaddle#515)

* Implement PrologueUnittestGenerator.

* Generate correct prologue unittest.

* Fix the parse of returned names.

* Add unittest generating step.

* Fix several error.

* Add test_separated and test_combined.
